Stochastic Optimization System for Bank Reverse Stress Testing

Abstract: The recent evolution of prudential regulation establishes a new requirement for banks and supervisors to perform reverse stress test exercises in their risk assessment processes, aimed at detecting default or near-default scenarios. We propose a reverse stress test methodology based on a stochastic simulation optimization system. This methodology enables users to derive the critical combination of risk factors that, by triggering a preset key capital indicator threshold, causes the bank's default, thus detecting the set of assumptions that defines the reverse stress test scenario. This article gives a theoretical presentation of the approach, providing a general description of the stochastic framework and, for illustrative purposes, an application of the proposed methodology to the Italian banking sector, which shows the possible advantages of the approach in a simplified framework that highlights the basic functioning of the model. In the paper, we also show how to take into account some relevant risk factor interactions and second round effects such as liquidity–solvency interlinkage and modeling of Pillar 2 risks including interest rate risk, sovereign risk, and reputational risk. The reverse stress test technique presented is a practical and manageable risk assessment approach, suitable for both micro- and macro-prudential analysis.


Introduction
The recent evolution in prudential regulation has introduced further prescriptions regarding how institution-wide stress testing exercises should be carried out for ICAAP (Internal Capital Adequacy Assessment Process), ILAAP (Internal Liquidity Adequacy Assessment Process), and recovery plan purposes (BCBS 2018; ECB 2018a, 2018b; EBA 2018a, 2018b). These increasingly stringent requirements present risk managers in financial institutions and supervisors with new relevant issues on both the methodological and operational sides. Specifically, bank stress tests should:
• Consider many adverse scenarios, rather than just one, in order to assess the effects of different combinations of risk factors and degrees of severity, also covering losses related to rare but plausible events and addressing the vulnerabilities of the banks.
• Adopt a high degree of severity, in order to limit the risk of generating a false sense of security after passing a too-lenient stress test exercise.
• Consider the impact of risks that are difficult to quantify (e.g., reputational and strategic risks).
• Be able to effectively capture the effects of phenomena related to tail events in the medium-long term such as non-linearity, second round and feedback effects, and interaction between risk factors, particularly in relation to the solvency-liquidity interlinkage.
• Perform no less than an annual reverse stress test analysis 1, also assessing the probability that the events and risk factors assumed in the reverse stress test may occur 2; in this regard, the European Banking Authority (EBA) has also introduced the concept of «plausibility of scenario» 3.
Under traditional stress tests, we explicitly select a scenario and then assess its impact on capital. A reverse stress test works in the opposite direction: we first specify the magnitude of the impact on capital or, better, an event associated with an impact on capital such as a regulatory breach, and then we find the scenarios that generate this loss/default event. The aim of the reverse stress test is to contribute to understanding the bank's vulnerabilities and the degree of sustainability of its business model, identifying the conditions of default or near default, and the associated critical level of risk drivers. The output of this kind of stress test is the detection of the reverse stress test scenario, which also represents the starting point of the recovery plan scenario.
Its usefulness lies in the fact that, by comparing the reverse stress test scenario with the bank's stress test exercise, it is possible to challenge the assumptions and the degree of severity of the latter as well as to assess the plausibility that the event of default or near-default associated with the reverse stress test scenario may occur. Hence, reverse stress tests are attractive from a risk perspective, but implementing one, though simple in concept, is a nontrivial exercise.
As a matter of fact, assessing the probability of occurrence of the reverse stress test scenario is very different from assessing the default probability of the bank and its overall level of resiliency. In fact, the former is the probability that a specific set of assumptions that cause the bank's default may occur, whereas the latter is the probability associated with the entire set of scenarios in which the bank may default, therefore covering a much larger set of assumptions and adverse events. In other words, there are many ways in which a bank may default, and therefore many default scenarios. To obtain a general and univocal measure of the bank's overall risk, we have to consider all the potential scenarios in which the event of default (or of regulatory breach) may occur (and not just one, i.e., the reverse scenario) to assess the bank's probability of default (or the probability of regulatory breach).
In previous research (Montesi and Papiro 2013, 2018), we described how to assess a bank's financial fragility through a stochastic simulation model by determining the frequency with which a bank may breach a regulatory capital requirement threshold (e.g., the Common Equity Tier 1 (CET1) ratio) in the future. This kind of exercise (i.e., assessing the probability of breach) is somewhat simpler than detecting the reverse stress test scenario, because we do not necessarily have to discover which particular combination of adverse events determines a regulatory breach. We can estimate and quantify the degree of fragility of a bank even without identifying the exact adverse event and risk factor magnitude that causes the default 4. Detecting the exact level and combination of risk factors that lead to a default (breach) point is a much more complex task. Since the number of scenarios and risk factor combinations that may trigger a bank's default threshold can be (theoretically and in the real world) very high, there is not just one reverse scenario, but many; this raises the thorny question of how to select one reverse stress test scenario from all those that may cause a bank's default.
Therefore, in addition to the need to model functional relations between risky variables and solvency indicators in adverse conditions, as in the traditional deterministic stress test or even in the more advanced stochastic stress test, reverse stress testing also involves a procedure or criterion to determine a specific set of conditions of risk drivers (e.g., a drop in Gross Domestic Product (GDP), an interest rate shift, a stock market crash, etc.) that defines a particular default scenario: the reverse stress test scenario. Despite the regulatory demand for this kind of stress test exercise, there are still very few papers on the topic, and as Grundke (2011) observes: "Unfortunately, despite the intensity with which the necessity of reverse stress tests is discussed by bank supervisors, there is not much scientific literature on how to carry out a quantitative reverse stress test in practice. In general, only case studies for simply structured portfolios with one or two risk factors can be found" (Grundke 2011, p. 74). In 2020, the situation has not substantially changed, and studies generally consider reverse stress tests by focusing on one specific asset class (loans, bonds, etc.) or type of risk (credit, market, etc.), generally assuming a linear setting involving elliptical distributions for risk factors. This kind of assumption makes it much easier to implement algorithms capable of detecting reverse stress test scenarios; however, the real context in which banks operate is considerably more complex and requires a different modeling framework 5. It is necessary to take into account all relevant risks (systemic and idiosyncratic), their complex interactions, non-linear phenomena, and feedback and second round effects, which are particularly important in tail stressed scenarios and in a multiperiod analysis context.
Little research has been published on more general reverse stress test frameworks covering all the main bank risk factors 6 , and in our opinion, more dedicated research work is called for on this topic in order to satisfactorily address the issues just mentioned. The scope of this article is to present a general reverse stress test methodology based on an optimization system applied within a stochastic simulation framework for stress testing. The model provides a quantitative procedure that allows us to derive the combination of risk factors that, by triggering a key indicator threshold (CET1 ratio), causes, with a desired degree of approximation, the bank's default and thus defines the reverse stress test scenario. Furthermore, the reverse stress test process provides a useful set of results that aids in understanding the bank's vulnerabilities and sources of risk.
Since the correct identification of the reverse scenario necessarily requires consideration of the interaction among all the bank's relevant risk sources, in addition to taking into account the traditional Pillar 1 risk factors (credit, market, and operational risks), the methodological framework also shows a possible modeling of some relevant Pillar 2 risks and their feedback and second round effects, particularly taking into account some of the main features of the liquidity-solvency interlinkage. Therefore, we considered interest rate risk and reputational risk, and placed a particular focus on the modeling of sovereign risk. Although all models presented were kept at a very simple and essential level to highlight the basic functioning of the proposed methodology without obscuring the big picture of the framework with too many modeling details, in our opinion, they capture the core elements at stake, thus providing some indications regarding a potentially viable approach for modeling risk factors typically considered as hard to quantify such as reputational risk.

4 As Taleb noted: "It is far easier to figure out if something is fragile than to predict the occurrence of an event that may harm it. [...] Sensitivity to harm from volatility is tractable, more so than forecasting the event that would cause the harm". Taleb (2012, pp. 4-5). For example, within the stochastic simulation framework of our model (Montesi and Papiro 2018), once we have selected the bank's risk factor variables (credit risk, market risk, operational risk, etc.) and set for each variable a probability distribution function through a Monte Carlo simulation process, we are able to estimate the probability of breach disregarding which particular scenarios (set of risk factor combinations) will determine that event.
5 For this kind of simple reverse stress test approach, see Flood and Korenko (2015), Breuer et al. (2009), Kopeliovich et al. (2015), and Glasserman et al. (2015).
6 Along that research avenue are the works of Grundke and Pliszka (2018) and McNeil and Smith (2012).

J. Risk Financial Manag. 2020, 13, 174
The methodology proposed is quite different from previous research works, since it adopts a holistic approach aimed at stressing the entire bank, considering all the relevant risks affecting capital and the most important interactions and second round effects. Another very important advantage lies in the computational effectiveness of the heuristic technique adopted to solve the reverse stress test problem, a stochastic optimization system based on simulated annealing, which is capable of detecting all the reverse scenarios that trigger the bank's failure. Moreover, we also provide a selection criterion based on proximity to the starting point of the analysis, in order to select the most likely reverse stress scenario among all the possible solutions detected.
The proposed methodology can be applied by bank risk managers and supervisors in all risk assessment processes that require a reverse stress test: Risk Appetite Framework (RAF), ICAAP, Recovery & Resolution Plan, and Supervisory Review and Evaluation Process (SREP).
The rest of the paper is organized as follows. In Section 2, we set out some preliminary aspects that help us to identify the main issues regarding reverse stress testing. Then, in Section 3, we provide a very brief overview of the stochastic modeling framework for reverse stress testing; Section 4 describes the optimization system applied to identify the reverse scenario. In Section 5, we present the assumptions and modeling of a case study based on the Italian banking sector, intended to offer a simplified practical example of how the methodology works; Section 6 reports the main results of the exercise performed. We performed two kinds of analysis: (a) a stress test exercise aimed at assessing the probability of breaching several regulatory thresholds in stressed conditions (OCR, Overall Capital Requirement, and TSCR, Total SREP Capital Requirement); and (b) a reverse stress test exercise aimed at detecting a specific set of assumptions of the primary risk drivers that can trigger a regulatory breach in relation to the two different thresholds (OCR and TSCR), showing how it is possible to perform this kind of exercise with real figures. Section 7 provides some concluding considerations; the appendices report further details of the assumptions and data used for the case study.

Definition and Logic of Reverse Stress Testing Analysis
Before describing the proposed methodology, it is useful to briefly deal with some preliminary issues that help us to better define the scope and complexity of this topic. From a technical point of view, reverse stress test analysis is aimed at finding a solution to an inverse problem (i.e., detecting those scenarios on the edge between the condition of viability and default), or in other words, the exact conditions in a small set of risk drivers that trigger the bank's default, which from a regulatory point of view can be identified by the breach of a minimum regulatory threshold such as the Total SREP Capital Requirement (TSCR) or by considering higher thresholds such as the Overall Capital Requirement (OCR) 7.
The solution to this problem can only be simple if we think in terms of an extremely reduced and simplified modeling framework, for example, one based on a synthetic profitability indicator such as Return On Equity (ROE) or Return On Assets (ROA). If we assume that we can represent a bank's business model through only four fundamental variables, namely business growth (financial assets), risk absorption (RWA), profitability (ROE), and a regulatory capital constraint (CET1 ratio), we can obtain a closed-form solution that gives the critical level of profitability/loss that triggers the default by causing the bank to breach the regulatory capital constraint. The following formula describes the reverse stress test breaking point (i.e., the level of ROE that causes a CET1 ratio below the pre-set regulatory threshold 8):

[Equation (1)]

where g is the rate of growth of financial assets; RW is the average risk weight of assets (RWA/TA); α is the capital regulatory adjustments (CET1Adj) expressed as a percentage share of the Equity Book Value; and CET1 is the regulatory minimum threshold of the common equity capital ratio that sets the default condition. We can also express the breaking point of (1) in terms of ROA, through the well-known relationship between ROE and ROA, as 9:

[Equation (2)]

This kind of simple basic modeling, in which all the complexity of the interactions between the bank's risk factors and accounting/regulatory variables is removed by collapsing everything into a single explanatory variable (ROE or ROA), allows us to highlight the quantitative relations between profitability, capital, asset growth, and risk: a high level of breaking point losses indicates a low risk of breaching the regulatory capital, while a low level of breaking point losses implies a high risk of breach. However, it tells us nothing more than that. If we intend to minimally analyze and understand the risk factors and the context that can cause the bank's default, which is, after all, the aim of reverse stress testing, we need to introduce several risk drivers into the model. Abandoning the too-simplistic context and introducing more explanatory variables into the modeling framework makes finding a solution to the reverse stress test problem a much more difficult and challenging task.

7 Extending the analysis to multiple capital requirements (CET1, Tier 1, total capital, leverage ratio) and further indicators with respect to capital such as liquidity does not change the nature of the issue.
8 The following steps describe the derivation of (1). By expressing regulatory capital absorption in terms of total assets (TA), we can define the regulatory capital constraint in terms of the minimum required Equity Book Value (EBV) as:
$$EBV_t = CET1 \cdot RW \cdot TA_t + CET1Adj_t \quad \text{(a)}$$
where CET1Adj represents all the regulatory capital adjustments (such as intangibles, prudential filters, capital deductions, etc.); a positive sign of CET1Adj_t means a net capital deduction (the reverse in the case of a negative sign). By expressing CET1Adj as a share α of Equity Book Value (EBV), we have:
$$CET1Adj_t = \alpha \cdot EBV_t \quad \text{(b)}$$
Thence, (a) can be expressed as:
$$EBV_t = \frac{CET1 \cdot RW \cdot TA_t}{1 - \alpha} \quad \text{(c)}$$
In the absence of extraordinary capital transactions, keeping a capital target profile in line with the regulatory constraint can be done by adjusting the dividend/capital retention policy to satisfy the following condition:
$$Net\ Income_t \cdot (1 - \delta_t) = \Delta EBV_t \quad \text{(d)}$$
where δ represents the payout ratio; and ∆EBV_t is the change in equity book value required to maintain the capital position in line with the regulatory constraint. By developing some simple passages, we can express (d) as:
[Equation (e)]
where ROE_t = Net Income_t / EBV_{t−1}. The left side of the expression represents the share of net income retained by the bank, while the right side represents the capital needed to respect the regulatory constraint as a function of the growth of assets, the regulatory capital threshold, the risk weight, and the capital adjustments. By simplifying (e) and placing δ_t = 0 we can derive (1).
9 We can express the breaking point condition in terms of ROA by considering the well-known relation that links ROE to ROA through the leverage:
$$ROE = ROA \cdot \frac{TA}{EBV} \quad \text{(vi)}$$
In fact, considering that the leverage is the ratio between Total Assets (TA) and EBV, we can also express it as (see previous footnote):
$$\frac{TA}{EBV} = \frac{1 - \alpha}{CET1 \cdot RW} \quad \text{(vii)}$$
which is the leverage implied in the breaking point condition. Considering (vi) and (vii), we then have:
$$ROE = ROA \cdot \frac{1 - \alpha}{CET1 \cdot RW} \quad \text{(viii)}$$
By substituting (viii) in (1), we can easily obtain (2).
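As a hedged illustration of the single-variable logic described above, the sketch below solves the capital constraint for the ROE at which the end-of-period CET1 ratio exactly equals the threshold. It is not a restatement of formula (1): it assumes no payout, capital adjustments equal to a share α of equity, and a given opening balance sheet. The function names, arguments, and numeric inputs are all hypothetical.

```python
def breaking_point_roe(ta_prev, ebv_prev, g, rw, alpha, cet1_min):
    """ROE at which the end-of-period CET1 ratio equals cet1_min (payout assumed zero)."""
    ta_end = ta_prev * (1.0 + g)                            # assets grow at rate g
    required_ebv = cet1_min * rw * ta_end / (1.0 - alpha)   # minimum equity book value
    return required_ebv / ebv_prev - 1.0


def cet1_ratio(ta_prev, ebv_prev, g, rw, alpha, roe):
    """End-of-period CET1 ratio implied by a given ROE under the same assumptions."""
    ta_end = ta_prev * (1.0 + g)
    ebv_end = ebv_prev * (1.0 + roe)                        # all net income is retained
    return ebv_end * (1.0 - alpha) / (rw * ta_end)


# Illustrative inputs only: total assets 1000, equity 60, 2% growth,
# 40% average risk weight, 10% adjustments, 7% threshold.
roe_star = breaking_point_roe(1000.0, 60.0, 0.02, 0.40, 0.10, 0.07)
print(f"breaking-point ROE: {roe_star:.4f}")
```

With these inputs the breaking point is a large loss, which is consistent with the reading that a high level of breaking point losses signals a low risk of breach.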
In fact, the greater the number of risk drivers considered, the greater the number of possible solutions (and not just one, as in the simple example above), since there can be numerous possible combinations of risk factor assumptions that can determine a breaking point. Moreover, the more articulated our model is, the more computationally complex finding the solutions becomes.
As a general rule, limiting the number of risk factors to the most relevant ones and containing the level of analyticity of the model helps to reduce the complexity of the reverse stress test solution. In order to make the reverse problem computationally tractable and its solution meaningful and adequate for its purpose, we must necessarily make some choices regarding the limitation, selection, and modeling of the risk drivers considered, acknowledging that these choices are unavoidably subjective and will affect the solutions. For instance, assuming in the model that credit risk depends on GDP (or only on it), for example, through a satellite model that links the GDP growth rate to PDs (probabilities of default) and LGDs (loss given default), already pre-sets a certain path of solutions in terms of scenario conditions while excluding others. This widespread assumption, also adopted in the supervisory stress test, is much less obvious than it may seem at first glance 10, considering that quite often (as in the 2007 crisis) the original source of risk stems from the instability of the financial sector, while the downturn of the real economy arises only afterward as a consequence of a financial crisis 11. In this regard, Hyman Minsky's theoretical contributions are quite enlightening 12.
Even once we have determined a quantitative technique for solving the reverse stress test, given the fact that this problem (except in the simplistic world outlined above) always involves multiple solutions, we must deal with the issue of how to select one reverse stress test scenario from all the possible breaking points that lie on the edge of the default area.
If we simulate all the possible future economic dynamics of the bank, we can plot on a graph all the values of a key synthetic risk indicator such as the CET1 ratio associated with each forecast scenario simulated. Once we have set a relevant threshold such as that established by prudential regulation and supervisors, we can divide the points into those lying above the threshold and those falling below it in the breach area. In Figure 1, we show a graphical representation of all the CET1 ratios generated within a simulation of 5000 different forecast scenarios 13. The points below the thresholds (10% OCR and 7% TSCR) in the pink breach area represent the scenarios in which there is a breach of the relevant threshold. If we group and represent all the points through a distribution function, we can determine the frequency of breaching the relevant threshold within the forecast simulation: 21.88% for OCR and 2.54% for TSCR. To the extent that the simulation effectively replicates the variability of all the bank's main risk drivers and the corresponding capital ratio outcomes, these frequencies can be considered as an assessment of the bank's probability of breach. Assessing such a probability of breach corresponds to assessing the bank's financial fragility and its overall degree of risk. As mentioned in the introduction, we have already described how to perform such an assessment in a previous paper (Montesi and Papiro 2018); in this article, we focus on determining the points that lie on the edge of the relevant threshold. In fact, taking a closer look at Figure 1, we can see that while most of the points lie either above or below the threshold, some of them lie just on the bar of the threshold (light pink for OCR and dark pink for TSCR).
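The frequency-of-breach calculation described above can be sketched as follows. This is only a minimal illustration: the normal distribution used to generate CET1 ratios is a placeholder assumption standing in for the full forecast model, so the resulting frequencies are not the 21.88% and 2.54% reported for the case study. The function names and distribution parameters are hypothetical; only the 10% and 7% thresholds and the 5000 scenarios come from the text.

```python
import random


def simulate_cet1_ratios(n_scenarios, seed=0):
    """Placeholder generator: draws CET1 ratios from an assumed normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.115, 0.02) for _ in range(n_scenarios)]


def breach_frequency(ratios, threshold):
    """Share of simulated scenarios whose CET1 ratio falls below the threshold."""
    return sum(r < threshold for r in ratios) / len(ratios)


ratios = simulate_cet1_ratios(5000)
freq_ocr = breach_frequency(ratios, 0.10)    # OCR threshold from the text
freq_tscr = breach_frequency(ratios, 0.07)   # TSCR threshold from the text
print(f"OCR breach frequency: {freq_ocr:.2%}, TSCR breach frequency: {freq_tscr:.2%}")
```

Because the TSCR threshold is lower than the OCR threshold, every TSCR breach is also an OCR breach, so the TSCR frequency can never exceed the OCR one.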
These points correspond to all the scenarios in which the impact of the simulated risk factors exactly triggers the regulatory breach: a slightly less severe impact would not cause any breach, while more severe impacts determine capital ratios progressively lower than the threshold. As we can see, there are many points, not just one, that lie on the bar. Of course, the width of the bar (and therefore the number of scenarios lying on it) changes according to the desired degree of accuracy of the threshold's breach. The aim of reverse stress testing is to determine the forecast assumptions of the key risk factors underlying those points, which we can call reverse breaking points. The purpose of this paper is to present a technique to determine the reverse breaking points and then suggest a criterion for selecting one breaking point among all those determined and the corresponding set of risk factor forecast assumptions, which define the reverse stress test scenario.
To do so, we must apply a selection criterion, which again involves some degree of subjectivity; for example, the most likely scenario, the one with the minimum approximation error relative to the breaking condition, an average of the conditions derived from all the potential solutions, the scenario that is least distant from the current economic conditions, etc. The selection criterion is not a minor issue. In this regard, an appropriate way of representing all the outcomes of the reverse stress test solutions, showing the distribution of the critical levels of the specific bank's main risk factors and its most likely vulnerabilities, can strongly help us in finding an effective selection criterion.
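To make the idea of the "bar" concrete, the following sketch keeps only the scenarios whose CET1 ratio lies within a chosen tolerance of the threshold. The scenario records, driver names, and tolerance values are hypothetical and serve only to show how the width of the bar changes the number of reverse breaking points retained.

```python
def reverse_breaking_points(scenarios, threshold, tolerance):
    """Keep the scenarios whose CET1 ratio lies within +/- tolerance of the threshold."""
    return [s for s in scenarios if abs(s["cet1"] - threshold) <= tolerance]


# Hypothetical simulated scenarios: risk driver assumptions plus the resulting CET1 ratio.
scenarios = [
    {"gdp_growth": -0.040, "rate_shift": 0.010, "cet1": 0.0702},
    {"gdp_growth": -0.010, "rate_shift": 0.030, "cet1": 0.0695},
    {"gdp_growth": 0.010, "rate_shift": 0.000, "cet1": 0.1210},
    {"gdp_growth": -0.060, "rate_shift": 0.020, "cet1": 0.0510},
]

on_the_bar = reverse_breaking_points(scenarios, threshold=0.07, tolerance=0.001)
wider_bar = reverse_breaking_points(scenarios, threshold=0.07, tolerance=0.02)
print(len(on_the_bar), len(wider_bar))
```

A tighter tolerance gives a more accurate breaking condition but fewer candidate scenarios, which is the trade-off noted in the text.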
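One of the criteria listed above, distance from current economic conditions, can be sketched as a nearest-point search. This is an assumption-laden illustration rather than the selection rule of the model: the Euclidean metric, the scaling factors used to make drivers comparable, and all driver names and values are hypothetical.

```python
import math


def select_nearest(solutions, baseline, scales):
    """Pick the breaking point closest to the baseline in scaled Euclidean distance."""
    def distance(solution):
        return math.sqrt(sum(
            ((solution[k] - baseline[k]) / scales[k]) ** 2 for k in baseline
        ))
    return min(solutions, key=distance)


# Hypothetical starting conditions and per-driver scales (e.g., historical volatilities).
baseline = {"gdp_growth": 0.01, "rate_shift": 0.0}
scales = {"gdp_growth": 0.02, "rate_shift": 0.01}

# Hypothetical reverse breaking points, each one triggering the threshold.
solutions = [
    {"gdp_growth": -0.03, "rate_shift": 0.005},
    {"gdp_growth": -0.01, "rate_shift": 0.030},
    {"gdp_growth": -0.05, "rate_shift": 0.000},
]

reverse_scenario = select_nearest(solutions, baseline, scales)
print(reverse_scenario)
```

The choice of scales is itself subjective: a different normalization of the drivers can change which breaking point is selected.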
Therefore, reverse stress testing involves two types of problems: (1) a computational issue related to the technique used to derive the reverse solutions (the reverse breaking points); and (2) the choice of a criterion to select the reverse stress test scenario from all the solutions obtained. The first issue can be resolved through quantitative methods, while the latter cannot be addressed in purely quantitative terms, but ultimately requires subjective decisional criteria.
In the next section, we present a reverse stress testing model based on a stochastic optimization system as a quantitative means of determining all the possible breaking point solutions on the edge of the default area (i.e., the set of risk driver assumptions related to each breaking point). We then show a possible way to determine a reverse stress test scenario from the set of solutions derived. The model attempts to consider all the bank's most relevant risk factors and their main interactions, although their modeling has been kept at the simplest possible level. The aim is to supply a practicable general methodology for addressing the reverse stress testing issues outlined above, providing a meaningful representation of selected adverse scenarios that can help us to understand the combination of risk factor assumptions that may threaten the bank's viability.
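As a rough sketch of how a simulated annealing search can locate breaking points, the example below perturbs two risk drivers until a toy CET1 function lands on the threshold. Everything except the use of simulated annealing itself is an assumption made for illustration: the linear `toy_cet1` function stands in for the full forecast model, and the objective, step size, cooling schedule, and driver names are hypothetical.

```python
import math
import random


def toy_cet1(gdp_growth, rate_shift):
    """Hypothetical linear stand-in for the multi-period forecast model."""
    return 0.12 + 0.9 * gdp_growth - 0.6 * rate_shift


def anneal(threshold, start, seed=0, steps=20000, t0=0.01, cooling=0.9995):
    """Search for driver values whose CET1 ratio is as close as possible to the threshold."""
    rng = random.Random(seed)
    current = start
    cost = abs(toy_cet1(*current) - threshold)
    best, best_cost = current, cost
    temp = t0
    for _ in range(steps):
        # Propose a small random move in both drivers.
        cand = (current[0] + rng.gauss(0, 0.005), current[1] + rng.gauss(0, 0.005))
        cand_cost = abs(toy_cet1(*cand) - threshold)
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if cand_cost < cost or rng.random() < math.exp((cost - cand_cost) / temp):
            current, cost = cand, cand_cost
            if cost < best_cost:
                best, best_cost = current, cost
        temp *= cooling  # geometric cooling
    return best, best_cost


# Different seeds generally end at different points on the edge of the breach area.
solutions = [anneal(0.07, (0.01, 0.0), seed=s) for s in range(5)]
for point, cost in solutions:
    print(point, cost)
```

Restarting the search with different seeds is one simple way to collect several breaking points; in practice bounds on the drivers would also be needed to keep the solutions plausible.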

Stochastic Reverse Stress Testing: Modeling and Framework
The model framework described refers to the empirical exercise performed, which will be more specifically described below. This particular type of modeling is partially affected by the limited data set available for the case study; it is of course possible to conceive a more generalized and extensive version of the framework that is better suited to managing more structured contexts for which a wider set of data is available. Whatever the particular modeling features adopted, the general model framework consists of two layers (see Figure 2):

• An upper layer that includes all the systemic risk factors (e.g., GDP, interest rates, stock market, etc.) representing the drivers on which the stochastic optimization system is based and which defines the set of assumptions of the reverse scenario. The small set of macro variables adopted in our exercise is just a basic example; the type and number of variables can of course be changed and extended to other variables that may be useful for the modeling of the second layer of the framework (however, keep in mind that the higher the number of explanatory variables used in the optimization system, the higher the number of possible solutions to the reverse stress test there will be).
• A lower layer made up of all the mathematical relationships that define a multi-period forecast model that projects the bank's income statement, balance sheet, and regulatory capital figures. The forecast model variables simulate the impact of all systemic and idiosyncratic risk factors on the bank's economics, expressed through probability distribution functions. The forecast model follows the same stochastic framework presented in our previous work, to which we refer (Montesi and Papiro 2018). A sound multi-period stochastic forecasting model must meet the following requirements: (1) a dividend/capital retention policy that reflects the regulatory capital constraints and stress test aims; (2) the balancing of total assets and total liabilities in a multi-period context, so that the financial surplus/deficit generated in each period is always properly matched to a corresponding (liquidity/debt) balance sheet item 14; and (3) the setting of rules and constraints to ensure a good level of intrinsic consistency and correctly manage potential conditions of non-linearity.

In fact, the most important requirement of a stochastic model is that it must prevent the generation of inconsistent scenarios. In traditional deterministic forecasting models, the consistency of results can be controlled by observing the entire simulation development and set of outputs. However, in stochastic simulation, which is characterized by the automatic generation of a very large number of random scenarios, this kind of consistency check cannot be performed, and we must necessarily prevent inconsistencies ex-ante within the model itself, rather than correcting them ex-post. In practical terms, this entails introducing rules, mechanisms, and constraints into the model that always ensure consistency, even in stressed scenarios 15.
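Requirement (2) above can be illustrated with a minimal balancing rule: whatever funding surplus or deficit a scenario generates is assigned to a liquidity item or a debt item so that the balance sheet always closes. This is a simplified sketch under assumed names and figures, not the balancing mechanism of the forecast model itself.

```python
def balance_sheet_plug(other_assets, other_liabilities, equity):
    """Return (liquidity, new_debt) so that total assets equal liabilities plus equity."""
    gap = other_liabilities + equity - other_assets  # funding left after non-liquid assets
    liquidity = max(gap, 0.0)    # a surplus is parked in a liquidity item
    new_debt = max(-gap, 0.0)    # a deficit is covered by additional debt
    return liquidity, new_debt


# Hypothetical surplus scenario: funding exceeds non-liquid assets by 30.
liq_surplus, debt_surplus = balance_sheet_plug(900.0, 850.0, 80.0)

# Hypothetical deficit scenario: non-liquid assets exceed funding by 20.
liq_deficit, debt_deficit = balance_sheet_plug(950.0, 850.0, 80.0)

print(liq_surplus, debt_surplus, liq_deficit, debt_deficit)
```

Applying such a rule inside every simulated period is one way to enforce consistency ex-ante, since no random scenario can then produce an unbalanced balance sheet.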
The necessary link between the upper and lower layers is given by specific satellite models that translate the dynamic of the macro variables (systemic risk) into each of the bank's relevant micro variables that affect the Profit & Loss and balance sheet (PD, interest income and expenses, etc.). In our exercise, we considered a limited and very simple set of satellite models; the models can of course be freely extended and made more sophisticated without changing the essence of the modeling framework and the process by which reverse solutions are derived. Idiosyncratic risks, on the other hand, are simulated directly within the lower layer of the framework. These variables are not subject to the optimization system for determining the reverse stress test scenario. For the purpose of this exercise, we treated operational risk, reputational risk, and a component of market risk as idiosyncratic risks; since the bank considered in the exercise represents a sort of proxy of the countrywide banking sector, we did not consider any idiosyncratic credit risk factor, but more idiosyncratic risk components could be included.

By using a multivariate stochastic forecast model such as the one just described, we can develop a stress test analysis aimed at assessing a bank's capital adequacy and its financial fragility, estimating the probability of breaching key regulatory capital/liquidity indicators (by determining the frequency of scenarios in which there is a breach of a pre-set default threshold of capital ratios over the entire number of random scenarios simulated). However, this alone does not allow us to identify a specific set of conditions that can act as a watershed between the viability area and the default area, given that there are many scenarios that can trigger the default, including those that determine points within the default area as well as scenarios that lie at its edge (i.e., the breaking points that constitute the solutions to the reverse stress test).
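A satellite model of the kind described above can be sketched as a simple mapping from a macro driver to a micro variable. The logistic link between GDP growth and PD, its coefficients, the LGD value, and the exposure figure below are all hypothetical choices made for illustration; they are not the satellite models used in the exercise.

```python
import math


def pd_from_gdp(gdp_growth, intercept=-3.5, sensitivity=-25.0):
    """Assumed logistic link: lower GDP growth maps to a higher default probability."""
    return 1.0 / (1.0 + math.exp(-(intercept + sensitivity * gdp_growth)))


def loan_loss_provisions(gdp_growth, exposure, lgd=0.45):
    """Expected loss on the loan book implied by the macro scenario."""
    return pd_from_gdp(gdp_growth) * lgd * exposure


baseline = loan_loss_provisions(0.01, exposure=1000.0)    # mild growth
adverse = loan_loss_provisions(-0.03, exposure=1000.0)    # recession
print(f"baseline provisions: {baseline:.2f}, adverse provisions: {adverse:.2f}")
```

Once such a link is fixed, the optimization only needs to search over the macro driver, while provisions and the other micro variables follow from it.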
Within a stochastic simulation aimed solely at assessing a bank's fragility by determining its probability of breach, we may also fail to detect the specific source of risk. In fact, for that purpose, it would suffice to estimate, for example, the distribution function of loan loss provisions (and all other risky variables), regardless of whether they arise from an idiosyncratic or systemic event. For the purpose of determining the reverse scenario, it is also crucial to determine the source of risk, how that level of provisions arises, and the specific value of the driver that causes it. In order to identify the particular sets of assumptions related to the breaking points, we must assume a specific modeling layer that specifies how these risk drivers relate to all the other relevant forecast variables. More specifically, for the purpose of reverse stress testing, from a modeling point of view, it is important:
A. To use satellite models that link the dynamics of a few drivers (typically macroeconomic variables such as GDP) to the evolution of the bank's micro variables, ultimately affecting P&L, balance sheet, and regulatory figures (e.g., provisions for loan losses, net trading income, other comprehensive income reserve, etc.). Satellite models help to reduce the number of explanatory variables and, together with the creation of a forecasting model that properly connects the related input variables with each other 18, derive a consistent and synthetic set of conditions that solve the reverse stress test exercise and define its scenario.
B. To appropriately consider the most relevant interactions between risk factors.
We may also consider introducing into the framework a further modeling component, in which we can model the feedback given by the bank management's reactions to adverse events that have occurred, in order to bring the model closer to the real behavior of banks under stressed conditions. For instance, it does not make much sense to contemplate in the model that a bank keeps on granting the same amount of new loans within a prolonged severely adverse business cycle; or that in times of trouble, it does not make any cost reductions if they are needed and feasible. As we have explained, the modeling of these kinds of automatisms in adverse scenarios is particularly relevant for stochastic simulation models in a multi-period context. The idea is to make recourse to appropriate control variables and logic functions, through which to model the activation of an economic rational reaction of the bank's management in response to particular preset conditions such as the incurring of net losses or the triggering of a capital threshold 16 . In principle, this kind of feedback mechanism should be limited to those variables that are, at least partially, under the direct control of management, namely costs and investments. Capital issues and sales of business units at unreasonable prices in adverse market conditions to cover capital shortfall should not be considered.
Of course, modeling this kind of feedback function is not an easy task and unavoidably involves some kind of arbitrariness. Nevertheless there are some basic options that, although not exhaustive, can be generally adopted without presenting too many shortcomings such as reducing administrative costs in the case of net losses (e.g., eliminating the personnel bonus, reducing advisory and consultancy expenses, blocking staff turnover and salary increase, delaying IT projects, cutting marketing expenses, etc.); risk assets deleveraging in relation to the capital capacity (e.g., partial renewal of matured exposures, temporary stop to new business, switching from greater to lesser capital absorbing exposures, sales of assets, etc.); reduction of government bond risk during a sovereign crisis (e.g., reducing the portfolio's duration or size acting on a partial renewal or different composition of matured securities); no dividends or interest paid on additional Tier 1 capital instruments in case of net losses; etc. Imagining these kinds of features as a feed-back component of the model allows us to conceive it as a switch that can activate, deactivate, or possibly even modulate management reactions, according to the scope of the analysis, and also to quantify (by differential analysis) the magnitude of the bank's reaction capacity 17 .
In Section 5, we report the assumptions and modeling features of the main risk factors adopted in the reverse stress test exercise, while the assumptions related to minor variables are provided in Appendices A and B.
By using a multivariate stochastic forecast model such as the one just described, we can develop a stress test analysis aimed at assessing a bank's capital adequacy and its financial fragility, estimating the probability of breaching key regulatory capital/liquidity indicators (by determining the frequency of scenarios in which there is a breach of a pre-set default threshold of capital ratios over the entire number of random scenarios simulated). However, this alone does not allow us to identify a specific set of conditions that can act as a watershed between the viability area and the default area, given that there are many scenarios that can trigger the default including those that determine points within the default area as well as scenarios that lie at its edge (i.e., the breaking points that constitute the solutions to the reverse stress test). Within a stochastic simulation aimed solely at assessing a bank's fragility by determining its probability of breach, we may also fail to detect the specific source of risk. In fact, for that purpose, it would suffice to estimate, for example, the distribution function of loan loss provisions (and all other risky variables), regardless of whether they arise from an idiosyncratic or systemic event. For the purpose of determining the reverse scenario, it is also crucial to determine the source of risk, how that level of provision arises and the specific value of the driver that causes them. In order to identify the particular sets of assumptions related to the breaking points, we must assume a specific modeling layer that specifies how these risk drivers relate to all the other relevant forecast variables. 
More specifically, for the purpose of reverse stress testing, from a modeling point of view, it is important:
A. To use satellite models that link the dynamics of a few drivers (typically macroeconomic variables such as GDP) to the evolution of the bank's micro variables, ultimately affecting P&L, balance sheet, and regulatory figures (e.g., provisions for loan losses, net trading income, other comprehensive income reserve, etc.). Satellite models help to reduce the number of explicative variables and, together with the creation of a forecasting model that properly connects the related input variables with each other 18 , derive a consistent and synthetic set of conditions that solve the reverse stress test exercise and define its scenario.
B. To appropriately consider the most relevant interactions between risk factors. Keeping in mind that the purpose of the reverse stress test is to determine the right triggering combination of risk factors, it is crucial that the model is able to take into account and measure the impact of feed-back and second round effects among risk factors, which in a multi-period context can generate non-linearity phenomena and are particularly relevant in extreme tail scenarios typically associated with reverse analysis; how they are modeled-in affects the ranking and impact of risk factors and thus the solutions of the model. More specifically, we refer to the need to consider relevant Pillar 2 risks (such as interest rate, sovereign and reputational risks), and the modeling of dynamics such as systemic-idiosyncratic interactions and liquidity-solvency interlinkage 19 .
17 For example, by structuring the feed-back relationships according to the characteristics associated with a particular risk appetite framework (RAF) policy (such as risk limits and early warning thresholds activating escalation process and subsequent remediation actions), it is possible to quantify the mitigation effects related to that specific RAF policy, simply as the difference between the results (capital ratios) associated with two different analyses, one performed with the feed-back components on and the other with the feed-back components off.
18 For example, operating cost as a function of business volume, LGD as a function of PD, etc.
Moreover, to solve the reverse stress test problem, we also need to adopt a specific computational technique aimed at reducing the calculations and elaboration time necessary to reach a solution; this technique is presented in Section 4, while the description of the selection criterion is provided in Section 6 within the empirical part of the paper, since the meaning of this methodological part is clearer when put directly in relation to a specific case study.
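The probability of breach discussed earlier in this section is simply a scenario frequency, and a minimal sketch may help fix ideas. The function `toy_cet1` below is a purely hypothetical stand-in for the full forecast model (a normal draw with illustrative parameters), and `breach_probability` is a name introduced here for illustration; only the 9.54% threshold is taken from the exercise.

```python
import random

def breach_probability(simulate_cet1, threshold, n_scenarios, rng):
    """Share of simulated scenarios whose CET1 ratio falls below the threshold."""
    breaches = sum(1 for _ in range(n_scenarios) if simulate_cet1(rng) < threshold)
    return breaches / n_scenarios

def toy_cet1(rng):
    # Hypothetical stand-in for the forecast model: the final CET1 ratio is
    # drawn from a normal distribution (mean 12%, standard deviation 1.5%).
    return rng.gauss(0.12, 0.015)

rng = random.Random(0)
p_breach = breach_probability(toy_cet1, threshold=0.0954, n_scenarios=20_000, rng=rng)
print(f"estimated probability of breach: {p_breach:.3f}")
```

As the text notes, this frequency says how fragile the bank is, but not which combination of drivers sits on the edge of the default area; that is what the optimization step is for.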

Simulated Annealing Driven by Multi-Start Strategy: A Heuristic Solution
Reverse stress test problem solving can be compared to a scenario optimization issue. There are several quantitative techniques that can be used to resolve problems of parameter optimization such as scenario optimization (Monte Carlo method), simultaneous perturbation, stochastic gradient descent, random search, swarm optimization, and genetic algorithms 20 . The choice of the specific technique to employ depends largely on the specific context and issues to be addressed; unfortunately, there is no optimal algorithm to be adopted for all conditions. Moreover, in the context of complex and non-linear systems such as the multivariate and multiperiod forecast model outlined above, the choice of the most appropriate technique is not an easy one. In fact, these kinds of models are characterized by computational complexity related to the multiple dimensions of the search domain (i.e., we have more than one variable to optimize), the existence of several relevant non-linear conditions that affect the results 21 , and in particular, the presence of other stochastic variables linked to idiosyncratic risk (those related to the lower layer of the model). In addition, the multiperiod context of the exercise generates a further element of complexity due to the significant time dependence relationship among variables. This kind of complexity is quite common in modeling economic and financial phenomena, albeit with a wide range of differences. With specific regard to optimization problems, Gilli and Schumann (2012) highlighted that the use of classical optimization techniques is not necessarily a good solution in optimization problems in financial modeling.
More specifically, the two authors argue that in all cases in which there is a high level of computational complexity, it is necessary to make recourse to alternative solutions: "In such cases, an often observed practice is to 'convexify' the problem by making additional assumptions, simplifying the model (for instance by dropping constraints), or imposing prior knowledge (e.g., limiting the parameter domain). This makes it possible to employ classical optimization methods, but the results obtained are not necessarily good solutions to the original problem. An alternative approach, ( . . . ), is the use of optimization heuristics like Simulated Annealing or Genetic Algorithms. These methods are powerful enough to handle objective functions and side constraints essentially without restrictions on their functional form." (Gilli and Schumann 2012, p. 155).
For the reverse stress test exercise presented in this article, the optimization system adopted was Simulated Annealing (SA), an iterative heuristic aimed at approximating a global optimization in a large search space. This heuristic is closely related to the Markov Chain Monte Carlo method (MCMC); more specifically, it is an evolution of the Metropolis-Hastings method 22 . We combined a variant of SA, similar to that proposed by Painton and Diwekar (1995), with a multi-start method to improve the search process. In our opinion, one of the main advantages of this technique, under certain conditions of configuration of the cooling schedule, lies in the fact that it enables us to reach an optimal solution, reducing the amount of sampling necessary. The SA technique, compared to a simpler (but blind) random search method, allows us to better guide the search by setting a calibration of the optimization process. Indeed, as evidenced by Painton and Diwekar (1995), "The goal of stochastic annealing is to minimize the expectation of the objective or cost function by balancing the trade-offs between accuracy and computational efficiency" (Painton and Diwekar 1995, p. 494).
19 For example, a sovereign debt crisis can impact the bank's cost of funding more or less severely depending on the maturity of its liabilities during the forecasting period; a systemic crisis (GDP/stock market drop) affecting the bank's capital profile can worsen its rating and thus increase its cost of funding; similarly, reputational events can impact the bank's profitability, etc.
20 For a review of this literature, see Gendreau and Potvin (2010).
21 Consider, for instance, the dynamics related to the increase in the cost of funding arising from the bank's rating downgrade, or potential impairment on the stock of government bonds in the HCT portfolio in the case of severe downgrade of the country's rating, or the mechanism of CET1 deductions related to an increase of DTA above the regulatory threshold.
The multi-start technique permits better coverage of the search domain and more accurate mapping of the points on the edge, tracking the best candidate solution at each iteration. By starting from a set of preselected inputs of the domain (the primary risk drivers), it allows us to move more quickly toward the breaking condition, detecting the scenario that presents the highest probability of lying on the edge of the default area. In this regard, SA with multi-start is a very flexible technique, which allows us to adjust the system to the particular purpose of the analysis by properly setting the range of potential values of the variables to be optimized and the steps of the search process. This calibration and parametrization of the reverse stress test is a very important aspect, since it affects the quality of the solutions obtained by means of the optimization.
The basic idea behind the SA technique is the following: at each step, the algorithm randomly (by means of a stochastic sampling method) selects a set of risk driver values as a solution close to the current set, then it measures the quality of the solution by verifying how close the model's output is to the optimal condition (at a pre-set level of accuracy), and then decides whether to stay with the current solution or to move to some neighboring values on the basis of a probabilistic assessment determined through an acceptance function. During the search, the process leads the system to move toward more severe stressed states and is repeated until the system reaches a breaking point. To adapt the SA to a stochastic simulation framework (i.e., a forecast model that includes further stochastic variables in addition to those to be optimized), we performed n trials for each step 23 , thus obtaining a plurality of results that allowed us to consider a more accurate average value to establish whether the point should be added to the set of breaking point solutions or whether the iterative process must continue, restarting from the rejected point. The probability of accepting a worse move decreases as the iterations increase; accepting such moves in the early iterations is what allows the algorithm to leave local minimum states.
To put it simply, the working of the SA optimization process can be described as follows:
(a) Before starting, it is necessary to set a domain sampling grid, which serves as the starting set of values to feed the SA (Multistart) algorithm iterations, in order to keep the optimization process from getting stuck in a local minimum by helping to explore further solutions in a wider space; in other words, it serves to avoid the risk that the wrong choice of a starting point affects the optimization process and thus leads to the selection of a solution that is not the best one.
(b) Then, starting from an initial point within the sampling grid, the SA optimization process explores the input variables domain, randomly selecting at each iteration a new solution among all the potential solutions in the neighborhood.
(c) Subsequently, we evaluate the new solution outcome. Considering the stochastic nature of the idiosyncratic risk factors, it is recommended that the outcome be determined as an average of several outcomes, generated with the same input point (potential solution) by running several trials for each new potential solution (in our application we performed ten trials). If the outcome obtained is closer to the target to be achieved than the former outcome, then the solution is adopted as the new best optimum; otherwise we decide, through an acceptance function based on the Boltzmann function (having as its inputs the number of iterations and the distance from the target), whether the solution should be definitively discarded or could still be used as the starting point for the selection of a new random point.
More formally, SA, as applied to our context, can be described as follows. First, we define with $S$ the polytope in $\mathbb{R}^j$ (with $j$ = number of systemic risk factors × number of forecast periods) made up of the intersection of the planes outlined by the value range of the several systemic risk factor drivers subject to optimization, with $\overline{CET1}$ the value of the capital ratio threshold, and with $f(x, h)$ the function that returns the value of the CET1 ratio at the risk horizon $h$ conditional on realizations of the systemic risk factors, where $x \in S$. Simply speaking, $f(x, h)$ represents the final result of an articulated forecasting model as outlined in Figure 2, with all its variables and relations simulating the dynamics of P&L, balance sheet, and regulatory capital items, according to the accounting and regulatory rules that set their relationships 24 .
The optimization process will aim to find the set of solutions $S^*$ according to which:
$$S^* = \arg\min_{x \in S} \; \mathbb{E}\left[\, \left| f(x, h) - \overline{CET1} \right| \,\right]$$
where $\mathbb{E}\left[\, \left| f(x, h) - \overline{CET1} \right| \,\right]$ is the expected value of the difference in absolute value between the CET1 ratio conditional on realizations of the systemic risk factors at the time horizon $h$ and the preset threshold CET1 ratio. Every run of the algorithm can be described in the following way:

• Step 0: Let $x_0 \in S$ be a given starting point.
• Step 1: Sample a point $y_k$ from the neighborhood.
• Step 2: Accept the new point using:
$$x_{k+1} = \begin{cases} y_k & \text{if } p \le A(x_k, y_k, t_k) \\ x_k & \text{otherwise} \end{cases}$$
where $A$ is the acceptance function; $p$ is a uniformly random number in [0,1]; and $t_k$ is a global time-varying parameter called the temperature at the k-th iteration.
• Step 3: Update $S^*$ with the information collected up to the k-th iteration.
• Step 4: Set $t_k = T(t_{k-1})$, where $T$ is called the cooling schedule function.
• Step 5: Check the stopping criterion or continue iteration.
Once we have selected the forecast year in which to search for the breaking point solution and the goal of the optimization process (i.e., the capital ratio threshold), in order to apply the SA algorithm, it is necessary to define the method to select the next candidate solution neighborhood, the function A of acceptance, and the cooling schedule T. We applied the selection method proposed by Muller (1959), according to which we selected a random point in a hypersphere with radius r, which represents the neighborhood.
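As an illustration of the neighborhood selection just described, the following minimal sketch draws a random point inside a hypersphere of radius r around the current point. The direction comes from a normalized Gaussian vector, which is the idea usually attributed to Muller (1959); scaling the distance by U^(1/d) so that points are uniform inside the ball, rather than only on its surface, is an assumption made here. The function name and the example values are hypothetical.

```python
import math
import random

def sample_in_hypersphere(center, radius, rng):
    """Draw one point uniformly from the ball of given radius around center."""
    d = len(center)
    # A Gaussian vector divided by its norm is uniform on the unit sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    # Assumption: radial scaling U**(1/d) gives a uniform draw inside the ball.
    distance = radius * rng.random() ** (1.0 / d)
    return [c + distance * v / norm for c, v in zip(center, g)]

rng = random.Random(0)
current = [-0.01, 1.5]          # hypothetical current point (two risk drivers)
candidate = sample_in_hypersphere(current, 0.25, rng)
print(candidate)
```

In practice the drivers have very different scales, so the radius would probably be applied in normalized coordinates.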
The cooling schedule function T can be defined as:
$$T(t_{k-1}) = \gamma \, t_{k-1}$$
where γ is a constant smaller than one 25 . Empirically, we noted that this function allows us to increase the convergence toward the solution point with an exponential decrease of the cooling schedule, thus reducing the sample size 26 . The acceptance function A is of the Boltzmann type and takes as its arguments x, the current point, and y, the previous local optimizer point, together with the temperature. The described SA algorithm is just one possible technique for finding a solution to the reverse stress test problem. In our opinion, it is quite an efficient methodology, well suited to addressing complex problems in which random variables are involved; other heuristics may also be adopted if they prove to provide better solutions and reduce the calculation time and errors. In any case, it is important to stress that these choices can have a significant impact on the quality of the results, and unfortunately, there is no general way to find the best solutions for all possible problems.
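The steps of the algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the exercise: `toy_cet1` is a hypothetical linear stand-in for the forecast model f(x, h) with two drivers, the spread range is hypothetical, the cooling is assumed geometric (t multiplied by a constant gamma below one), and the acceptance rule is assumed to be the usual Boltzmann form exp(-Δ/t). The ten trials per point, the (0, −2%) GDP range, and the 9.54% threshold come from the text.

```python
import math
import random

THRESHOLD = 0.0954                       # CET1 threshold used in the exercise
BOUNDS = [(-0.02, 0.0), (0.0, 5.0)]      # GDP change (from the text); spread range is hypothetical

def toy_cet1(x, rng):
    # Hypothetical stand-in for f(x, h): linear in the drivers plus idiosyncratic noise.
    gdp_change, spread = x
    return 0.12 + gdp_change - 0.005 * spread + rng.gauss(0.0, 0.001)

def distance(x, rng, n_trials=10):
    # Average several stochastic trials, then measure the gap from the threshold.
    avg = sum(toy_cet1(x, rng) for _ in range(n_trials)) / n_trials
    return abs(avg - THRESHOLD)

def neighbor(x, radius, bounds, rng):
    # Random point in a ball around x (in range-normalized units), clipped to the domain.
    g = [rng.gauss(0.0, 1.0) for _ in x]
    norm = math.sqrt(sum(v * v for v in g))
    step = radius * rng.random() ** (1.0 / len(x))
    return [min(max(xi + step * (hi - lo) * gi / norm, lo), hi)
            for xi, gi, (lo, hi) in zip(x, g, bounds)]

def anneal(x0, bounds, rng, t0=0.01, gamma=0.9, radius=0.1, tol=5e-4, max_iter=500):
    x, gap = list(x0), distance(x0, rng)
    t = t0
    for _ in range(max_iter):
        if gap <= tol:                   # breaking point reached at the required accuracy
            break
        y = neighbor(x, radius, bounds, rng)
        gap_y = distance(y, rng)
        # Always accept improvements; accept worse points with Boltzmann probability.
        if gap_y <= gap or rng.random() <= math.exp(-(gap_y - gap) / t):
            x, gap = y, gap_y
        t *= gamma                       # assumed geometric cooling schedule
    return (x, gap) if gap <= tol else None

def multi_start(bounds, rng, points_per_axis=3):
    # Launch one annealing run from every node of a regular sampling grid.
    (lo0, hi0), (lo1, hi1) = bounds
    n = points_per_axis - 1
    grid = [(lo0 + i * (hi0 - lo0) / n, lo1 + j * (hi1 - lo1) / n)
            for i in range(points_per_axis) for j in range(points_per_axis)]
    runs = (anneal(start, bounds, rng) for start in grid)
    return [run for run in runs if run is not None]

rng = random.Random(42)
solutions = multi_start(BOUNDS, rng)
for x, gap in solutions:
    print(f"GDP change {x[0]:+.4f}, spread {x[1]:.2f}, gap {gap:.5f}")
```

Each run that converges contributes one point to the set of breaking point solutions; different starting nodes tend to land on different parts of the edge of the default area, which is the purpose of the multi-start grid.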

The Reverse Breaking Point Frontier
In the lower layer of the model, we manage the idiosyncratic risk factors, which, although relevant in contributing to determine the default conditions, do not take part in the optimization as drivers of the reverse stress test. These idiosyncratic risk factors are also managed as stochastic variables in the model and therefore the contribution of their impact on the bank's capital varies in the different scenarios simulated, especially where non-linearity conditions occur. The variability of the idiosyncratic risk impact causes a margin of error in the precise determination of the reverse breaking points arising from the optimization system. As mentioned above, to address the issue, we ran many trials for each step of the optimization process in order to reject the results whose average did not meet the required level of accuracy. Of course, increasing the number of trials improves the reliability of the average but also entails an increase in elaboration time; there is a natural trade-off between elaboration time and margin of error that has to be balanced. To this end, for the initial calibration set-up of the model, we can perform tests to assess the optimal number of trials for each step of the process that would offer a good balance between accuracy and elaboration time. In other words, keeping in mind that the aim of the reverse stress test should be to select the most plausible scenario in which the bank defaults, we should ensure that solutions with a high probability of exactly triggering the threshold are considered, while solutions that, for example, require a very high level of (very unlikely) idiosyncratic risk losses are excluded.
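The trade-off between the number of trials and the margin of error can be made concrete with the standard error of a sample mean. The expression below is a generic statistical result, not a formula from the exercise; it assumes that the n trials run at a fixed point x are independent and that the simulated CET1 ratio has standard deviation σ at that point (both symbols are introduced here for illustration).

```latex
\bar{f}_n(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x, h),
\qquad
\operatorname{SE}\!\left(\bar{f}_n(x)\right) = \frac{\sigma}{\sqrt{n}}
```

Under these assumptions, halving the margin of error requires four times as many trials, so moving from 10 to 1,000 trials per point reduces the error by a factor of ten at one hundred times the computing cost.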
Therefore, in order to calibrate the model, we can generate a high number of stochastic trials (e.g., 10,000) that allow us to detect all the values whose average has a high probability to coincide with a breaking point solution. By performing this procedure for all the breaking points on the default edge, we are able to appraise the quality of the solutions and disregard those that have an average value too distant from the preset threshold. This is shown in Figure 3, in which we report the average values of a sample of 25 breaking points on the edge of the distressed area (relative to the breach of the 9.54% threshold in the third forecast year), determined by simulating 10,000 trials for each breaking point, showing that the range of deviations from the CET1 threshold was quite small.
[Figure 3: Deviation from CET1 Ratio Threshold, by Reverse Break-Even Point (1 to 25)]

Reverse Stress Testing Exercise: The Italian Bank Case Study
We performed a reverse stress test exercise applying the methodology described, based on an aggregated sample of the four largest Italian banks: Intesa Sanpaolo, Unicredit, Banco BPM, and UBI Banca, representing in terms of total assets slightly more than 50% of the Italian banking industry. To create the banks' sample, we added up all the banks' financial statement items to create a sort of aggregated balance sheet that we called ITB (Italian Bank), and which can be considered as representing either a typical Italian bank or a rough proxy of the Italian banking industry.
Since this reverse stress test exercise has been developed exclusively for illustrative purposes, the specific modeling and set of assumptions applied were kept as simple as possible to facilitate the understanding of the basic characteristics of the approach without obfuscating the description of the "big picture" with unnecessary complications. Therefore, the specific set of assumptions adopted for this exercise must be considered strictly as an example of the application of the methodology proposed, and absolutely not as the only or best way to implement the approach. Depending on the information available and the purposes of the analysis, more accurate assumptions and more sophisticated modeling can be adopted. Nevertheless, in our opinion, the results obtained, albeit based on a very simple modeling, can be considered as a sufficiently descriptive analysis of the Italian banking sector's resiliency. Of course, the simplified approach based on a one-bank analysis performed through the aggregation of several banks' financial statements presents some limits in deriving stress testing results for macro-prudential purposes. In fact, in simulating the impact of adverse scenarios for one aggregated bank, we implicitly apply a sort of capital compensation among the banks characterized by capital shortfall and those by excess capital, which in a real context could not occur (i.e., the capital buffer of one bank could not be used for covering a capital shortfall of another bank). This shortcoming may alter the results in all the cases in which there are relevant differences within the sample in terms of capital position and sensitivity to risk factors. The exercise time horizon was 2019-2021, considering 2018 financial statement data as the starting point of the analysis (for further details see Appendix A, in which the templates report the set of forecast model variables used for the analysis).
In order to show the results of the optimization process in isolation from other dynamics, in the exercise, we adopted a static balance sheet assumption, according to which assets and liabilities that mature within the time horizon of the exercise were replaced with similar financial instruments in terms of type, currency, and credit quality at the date of maturity; no cure rate was assumed on non-performing exposures. However, this assumption is not necessary and can be easily removed.
Below, we describe the modeling assumptions of the most relevant risk factors, while the complete set of assumptions and modeling features adopted in the reverse stress test exercise is provided in Appendix B.

Credit Risk
This risk factor has been modeled-in through the accounting item "Net Adjustment for Impairment on Loans" (see Table A2 in Appendix A). We adopted the expected loss approach through which yearly loan loss provisions are estimated as a function of three components: probability of default (PD), loss given default (LGD), and exposure at default (EAD). PDs are determined as a function of GDP through a very simple satellite model that broadly replicates the dynamics envisaged by the EBA path generators used for the EU-wide stress test exercises. More specifically, PD values have been simulated through the following satellite model: where ∆GDP is the rate of change of the Italian GDP. We considered a starting point of PD equal to 1.6%, which represents the default rate of the Italian banking system in 2018 (BI 2019, p. 173). Therefore, the credit stress is driven only by systemic risk, managed through the GDP rate of change.
As explained above, the ∆GDP is the risk driver and is handled as a stochastic variable within the optimization system aimed at detecting the reverse stress test scenarios. Considering the range of possible variation of GDP assumed in the reverse stress test simulation (0, −2%), in Figure 4, we report the distribution of all the yearly PD values used for calculating Net Adjustment for Impairment on Loans in the three years of the analysis 27 . Of course, the most severe PD points simulated were sparser, while lower PD points were denser.

The LGDs are determined as a function of PDs, according to the strong empirical relationship between the two variables (Altman et al. 2005; and Altman and Hotchkiss 2006, in particular, pp. 326-27). We adopted the linear relationship of LGD under stress estimated by Standard & Poor's through the following model (Schmieder et al. 2014, p. 63):
$$LGD(\text{under stress}) = LGD(\text{historical}) + 2.1535 \cdot PD \qquad (7)$$
where LGD(historical) is the value of LGD at the starting point (2018), which in our exercise is assumed equal to the coverage ratio on the overall NPLs of the four banks in the sample, equal to 59.86%. Of course, we are well aware that, in addition to GDP, further risk factors, both systemic and idiosyncratic, affect the credit risk impact, such as portfolio concentration, geo-sectorial diversification, credit risk mitigation (e.g., real estate collateral on mortgages), etc. Moreover, breaking down credit risk parameters by business segment (e.g., retail, corporate, residential mortgage, etc.) can also improve the model's accuracy. These additional credit risk factors can be easily introduced in the modeling framework proposed through specific variables; in this case study, we only provided a high-level example of the model's mechanics.
For the calculation of Net Adjustment for Impairment on Loans, we adopted a two-stage model: one stage for performing loans and the other for non-performing loans 28 . For the sake of simplicity, we considered all non-performing loans in one single Non-Performing Loans (NPL) category, without differentiating between past due and unlikely to pay. In keeping with the static balance sheet assumptions, exposures at default (EADs) were kept constant and we assumed no cure rate on defaulted exposures. The defaulted credit flow in each period was determined as the product of the expected default rate (PD) at the end of the period times the value of performing loans at the beginning of the period:
$$\text{Defaulted Credit Flow}_t = PD_t \cdot \text{Gross Performing Loans}_{t-1}$$
The NPL stock is determined as:
$$NPL_t = NPL_{t-1} + \text{Defaulted Credit Flow}_t$$
The Net Adjustment for Impairment on Loans is determined as:
$$\text{Net Adjustments for Impairment on Loans}_t = \text{Defaulted Credit Flow}_t \cdot LGD_t + NPL_{t-1} \cdot (LGD_t - LGD_{t-1})$$
where the first addend represents the impairments on new defaulted loans, and the second addend represents the impairments on old defaulted loans due to a change in the coverage of NPLs, which occurs any time that $LGD_t \neq LGD_{t-1}$. The loan losses reserve at the end of the period is then given by the reserve at the beginning of the period, plus the Net Adjustment for Impairment on Loans:
$$\text{Reserve for Loan Losses}_t = \text{Reserve for Loan Losses}_{t-1} + \text{Net Adjustments for Impairment on Loans}_t$$
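A short worked example may make the recursion easier to follow. The sketch below applies the stressed-LGD relationship of Eq. (7) and the impairment formula to a three-year PD path. The PD path and the starting balances are hypothetical, as are the function names; the sketch also assumes that each year's defaulted flow moves from performing loans to the NPL stock, so that total gross loans stay constant and no cure or write-off takes place.

```python
LGD_HISTORICAL = 0.5986   # starting coverage ratio on NPLs (from the exercise)
LGD_SLOPE = 2.1535        # slope of the stressed-LGD relationship, Eq. (7)

def stressed_lgd(pd):
    """LGD under stress as a linear function of PD, Eq. (7)."""
    return LGD_HISTORICAL + LGD_SLOPE * pd

def project_credit(pd_path, performing, npl, reserve):
    """Roll the two-stage impairment model forward, one row per forecast year."""
    lgd_prev = LGD_HISTORICAL
    rows = []
    for pd in pd_path:
        lgd = stressed_lgd(pd)
        flow = pd * performing                      # new defaults on opening performing loans
        # Impairment on new defaults plus re-coverage of the opening NPL stock.
        adjustment = flow * lgd + npl * (lgd - lgd_prev)
        npl += flow                                 # assumption: no cure, no write-off
        performing -= flow                          # assumption: total gross loans constant
        reserve += adjustment
        rows.append({"flow": flow, "lgd": lgd, "adjustment": adjustment,
                     "performing": performing, "npl": npl, "reserve": reserve})
        lgd_prev = lgd
    return rows

# Hypothetical inputs: PD path of 2%, 3%, 4% and balances in arbitrary units.
rows = project_credit([0.02, 0.03, 0.04], performing=1000.0, npl=100.0, reserve=59.86)
for year, row in enumerate(rows, start=1):
    print(year, round(row["adjustment"], 2), round(row["reserve"], 2))
```

The second term of the adjustment is what makes a rising PD doubly costly: it raises the flow of new defaults and, through the LGD, the coverage required on the existing NPL stock.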

Market Risk
This risk factor has been modeled-in through the accounting item "Net Gains/Losses on Financial Assets" (see Table A2 in Appendix A), which includes mark-to-market gains/losses, realized and unrealized gains/losses on securities in the Held For Trading (HFT) portfolio. The variable is expressed as a gain/loss rate on the financial assets held for trading and is determined as a function of the Euro stock market index and its volatility through a simple satellite statistical regression model. The stock index and its volatility are handled as stochastic variables within the stochastic optimization system aimed at detecting the reverse stress test scenarios.
28 The model adopted is a reduced version of the model developed in Montesi et al. (2019) to which we refer.
The return on the HFT portfolio is determined through a statistical model estimated by regressing the net trading income of the four banks on the EURO STOXX 50 Index (SX5E) and its volatility over the last 13 years, plus a random error that represents the idiosyncratic risk factor. The satellite model is given by the following expression: where ∆SX5E is the change in the EURO STOXX 50 index; VolSX5E is its volatility compared to the previous year (360d); and ε_t represents a random error, normally distributed with zero mean and standard deviation equal to 0.003680, which coincides with the standard deviation of the residuals of the estimate.

Operational Risk
This risk factor is modeled-in through the accounting item "Other Non-Operating Income/Losses" (see Table A2 in Appendix A) and is modeled directly, making use of the corresponding regulatory requirement records reported by the sample banks (considered as maximum losses due to operational risk events). Operating losses are handled as a stochastic variable and modeled through a Beta distribution function (shaped so as to resemble an exponential function), defined by a minimum equal to zero, a maximum equal to the sum of the four banks' 2018 operating risk capital requirement (€4.594 million, which coincides with the maximum loss), and shape parameters α = 1; β = 5. Figure 5 depicts the probability distribution adopted 29.
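As an illustration of the distributional assumption above, the following Python sketch draws operational losses from a Beta(1, 5) distribution rescaled to the interval between zero and the maximum loss. The function name, the seed, and the sample size are hypothetical choices; the maximum is entered with the figure and unit reported in the text.

```python
import random

MAX_OPERATIONAL_LOSS = 4.594  # maximum loss as reported in the text (EUR million)
ALPHA, BETA = 1.0, 5.0        # shape parameters stated in the text


def sample_operational_losses(n, seed=0):
    """Draw n losses from Beta(ALPHA, BETA) rescaled to [0, MAX_OPERATIONAL_LOSS]."""
    rng = random.Random(seed)
    return [MAX_OPERATIONAL_LOSS * rng.betavariate(ALPHA, BETA) for _ in range(n)]


losses = sample_operational_losses(100_000)
mean_loss = sum(losses) / len(losses)
# The theoretical mean of Beta(1, 5) is 1/6, so the sample mean should be
# close to one sixth of the maximum loss.
print(mean_loss, MAX_OPERATIONAL_LOSS / 6)
```

With α = 1 and β = 5 the density is highest at zero and decays monotonically, which is why the text describes the shape as resembling an exponential function.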

Sovereign Risk
This risk driver is modeled-in through the BTP-Bund spread and handled as a stochastic optimization variable. The spread increase is uniformly applied to all maturities (parallel shift assumption). The spread dynamic determines, through a duration-convexity model, the impact on the value of government bonds held, with effects to Profit & Loss and Accumulated Other Comprehensive Income-AOCI according to the accounting treatment related to the specific portfolio. This risk also generates relevant second round effects and interactions with other risk factors, arising from the impact that the sovereign rating associated with the spread level has on the bank's rating, and subsequently on its cost of funding on newly issued liabilities, and from the impact of the tax effects (DTA)-generated by the change in the value of government bonds held in the portfolio FVTOCI-on regulatory capital deductions and RWA.
More specifically, to model sovereign risk, we considered the 10-year BTP-Bund spread; the starting value is that of the end of December 2018, equal to 250 bps.
29 In less simplistic applications in which the static balance sheet assumption is relaxed, operational loss could be parametrically modeled, for example, as a rate of loss with respect to total assets or gross income, figures typically considered by regulators as operational risk drivers. This kind of modeling can, however, generate counterintuitive results, since these drivers (especially gross income) tend to decrease in adverse scenarios, thus reducing (instead of increasing) the operational risk impacts in tail scenarios.

With regard to the first-round effect of the spread related to the impact on government bonds held, we considered the holdings of securities in all of the sample banks' accounting portfolios: FVTOCI-Fair Value To Other Comprehensive Income; FVTPL-Fair Value To Profit and Loss; HTC-Held to Collect ("Financial Assets at Amortized Cost"). The impact of the spread on the government bonds follows the accounting treatment of each portfolio; namely:
• Impact on the FVTPL portfolio flows to P&L through the accounting item "Net Gains/Losses on Financial Assets/Liabilities at FVTPL".
• Impact on the FVTOCI portfolio flows directly to equity through the accounting item "Accumulated Other Comprehensive Income" (OCI Reserve), net of the tax effects, which are considered as DTA within the regulatory capital deduction and RWA. For the sake of simplicity, we considered all the impacts as changes in the OCI Reserve value, without distinguishing between the fair value repricing effect (to equity) and credit risk impairment (to P&L) in the case of a significant downgrade, as we did for the HTC portfolio explained in the bullet below.
• Securities held in the HTC portfolio have an impact that flows to P&L through the accounting item "Net Adjustment for Impairment on Loans" only if the country rating falls to BB+ or below (two or more notches below the current rating BBB), on the assumption that this signals a significant increase in credit risk; impairment is made by applying the five-year PD reported in the S&P Sovereign Cumulative Average Default Rates (S&P 2018, p. 83) and a 40% LGD (adopting the same assumption as the EBA EU-wide stress test 2018 for Italian sovereign risk). The simplified modeling assumption adopted disregards impairment for ratings above junk bonds, since it would be negligible 30.
We determined the duration and convexity for each maturity bucket on the basis of benchmark securities of Italian government bonds, as shown in the table below; the first two short-term buckets were assigned a zero duration and convexity. We then assigned duration and convexity to exposures in each portfolio on the basis of the four banks' records of Italian bond holdings published in the EBA 2018 EU-wide Transparency Exercise 31 (see Table 1). We then determined the impact of the spread changes on the value of the government bond holdings by applying the following Duration-Convexity approximation to each portfolio bucket:

$$\Delta FV_t^Y = EXP_t^Y \cdot \left( -MD^Y \cdot \Delta i_t + \tfrac{1}{2} \cdot MC^Y \cdot \Delta i_t^2 \right)$$

where $\Delta FV_t^Y$ is the fair value change in Italian government bond holdings at time t for the bucket Y; $EXP_t^Y$ is the exposure in Italian government bonds at time t for the bucket Y; $\Delta i_t$ is the yearly change in the sovereign spread; $MD^Y$ is the modified duration for the bucket Y; and $MC^Y$ is the modified convexity for the bucket Y.
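A minimal Python sketch of the bucket-level calculation is given below. It uses the textbook duration-convexity approximation, in which the fair value change equals the exposure times (−MD·Δi + ½·MC·Δi²); the bucket names, exposures, durations, and convexities are hypothetical placeholders and do not reproduce the figures of Table 1.

```python
def fair_value_change(exposure, spread_change, mod_duration, mod_convexity):
    """Textbook duration-convexity approximation of the fair value change."""
    return exposure * (
        -mod_duration * spread_change + 0.5 * mod_convexity * spread_change ** 2
    )


# Hypothetical buckets: (exposure, modified duration, modified convexity).
# The short-term bucket carries zero duration and convexity, as in the text.
buckets = {
    "0-3M": (50.0, 0.0, 0.0),
    "5Y": (100.0, 4.5, 25.0),
    "10Y": (80.0, 8.0, 80.0),
}

spread_change = 0.01  # parallel shift of +100 bps applied to all maturities

impacts = {
    name: fair_value_change(exposure, spread_change, md, mc)
    for name, (exposure, md, mc) in buckets.items()
}
total_impact = sum(impacts.values())
print(impacts, total_impact)
```

The convexity term partially offsets the duration loss when the spread widens, so longer buckets lose slightly less than a pure duration estimate would suggest.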
In keeping with the static balance sheet assumption, the government bond holdings that mature within the forecast time period were assumed as reinvested for the same amount and duration, but all allocated in the HTC portfolio, because in a context of sovereign crisis, it would be rational for the bank to allocate them in a portfolio that would be less exposed to market turbulence; of course, these assumptions involve a lesser impact from the spread 32 . Reinvestment in securities yields an interest income calculated with the updated interest rates on government bonds associated with the specific stressed scenario.
In relation to the second-round effect of the spread, we assumed that a sharp increase in the 10-year BTP-Bund spread was associated with a downgrade of the rating of the country (Italy), in keeping with the relationship shown in Figure 6, which associates spread values with rating classes. The theoretical relationship between the BTP-Bund spread and Italy's rating was estimated on the basis of the average percentage increase in the cost of funding associated with each rating class, considering a sample that covers all ratings classes of all issuers (financial and non-financial in the U.S. and EU) between 1995-2018 (see Table 2); the spread increases associated with each of the lower ratings reported in Table 2 were then applied to the current spread and rating class on the Italian debt, in order to rescale that dynamic to the Italian sovereign risk case reported in Figure 6.
Subsequently, a downgrade in the sovereign rating involves a corresponding downgrade in the bank's rating; this empirically evident relationship is shown in the graph below (Figure 7) and is based on the fact that rating agencies generally cap the rating of any issuer to the country rating.
The relation that links the banks' rating to the country rating in each forecast year is the following:

$$\text{Bank Rating} = 1.1808 \cdot \text{Country Rating}^{0.9107}$$

This statistical relationship has been assessed considering the S&P, Moody's and Fitch ratings for Italy, and the average rating of the four banks in the sample within the period 1997-2019 33. In Figure 8, we show the records and estimated relationship. The available observations on the association between the country rating and the average rating of the four banks were used to develop a relationship for all the rating classes; for low rating classes (below B+), the country and bank ratings tended to match (i.e., the country rating cap is active), while for high rating classes (above BB−), the country ratings tended to be higher than the corresponding bank ratings.
33 We transformed the alphanumerical rating classes into an ordinal ranking scale, ranging from 1 (corresponding to D rating class) to 22 (corresponding to AAA rating class).
The bank's cost of funding changes according to its rating, the latter being determined as described above. Therefore, a downgrade will cause an increase in the interest rate paid on newly issued liabilities (according to the maturities of the liabilities issued existing at 2018) 34; it does not affect the cost of deposits (which are instead exposed to reputational risk). The increase in interest rate applied to these liabilities is given by the difference between the cost of funding associated with a bank's new rating and its starting rating BBB (determined as the RWA-weighted average of the ratings of the four banks considered in December 2018). The cost of funding associated with each class of rating is determined according to Table 2, derived as the average default spread market values on all classes of ratings of all issuers (financial and non-financial in the U.S. and EU) between 1995-2018. The following picture (Figure 9) shows a workflow that highlights the interactions between risk factors in the model. The modeling, albeit simplified, considers all the most relevant factors and dynamics. The impact on RWA arises because of the prudential rule that requires DTA up to a regulatory threshold not to be deducted from CET1, but included in the RWA with a 250% risk weight.
34 This connection links sovereign risk with interest rate risk and also, to some extent, with liquidity risk, because the impact is determined according to the maturity structure of the liabilities and therefore to the bank's funding need.
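The chain from country rating to bank rating to cost of funding can be sketched in Python as follows. The 22-class scale is an assumed S&P-style ordering consistent with the ordinal range 1 (D) to 22 (AAA); rounding the fitted score to the nearest class is an assumption of this sketch; and the spreads in `HYPOTHETICAL_SPREADS` are placeholders, not the values of Table 2.

```python
# Assumed 22-class scale, ordered from worst (1 = D) to best (22 = AAA).
RATING_SCALE = [
    "D", "C", "CC", "CCC-", "CCC", "CCC+", "B-", "B", "B+", "BB-", "BB",
    "BB+", "BBB-", "BBB", "BBB+", "A-", "A", "A+", "AA-", "AA", "AA+", "AAA",
]


def to_ordinal(rating):
    """Map an alphanumerical rating class to its ordinal rank (1..22)."""
    return RATING_SCALE.index(rating) + 1


def bank_rating_score(country_ordinal):
    """Fitted relation from the text: 1.1808 * Country Rating ** 0.9107."""
    return 1.1808 * country_ordinal ** 0.9107


def bank_rating_class(country_rating):
    """Assumption: round the fitted score to the nearest class on the scale."""
    score = bank_rating_score(to_ordinal(country_rating))
    ordinal = min(len(RATING_SCALE), max(1, round(score)))
    return RATING_SCALE[ordinal - 1]


def funding_cost_increase(new_rating, start_rating, spread_by_rating):
    """Spread of the new rating minus spread of the starting rating."""
    return spread_by_rating[new_rating] - spread_by_rating[start_rating]


# Placeholder default spreads per rating class (NOT the paper's Table 2).
HYPOTHETICAL_SPREADS = {"BBB": 0.016, "BBB-": 0.020, "BB+": 0.025, "BB": 0.030}

new_bank_rating = bank_rating_class("BB+")  # country downgraded to BB+
increase = funding_cost_increase(new_bank_rating, "BBB", HYPOTHETICAL_SPREADS)
print(new_bank_rating, increase)
```

The increase obtained in this way would apply only to newly issued liabilities, since the text excludes deposits from this channel.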

Figure 9. Sovereign risk: First and second round effects.

A further potential interaction among risk factors could be modeled-in by expressing the bank's rating in each forecasting year as a function not only of the country rating, but also of the bank's capital profile, for example, by setting up a simple relationship that links the bank's shadow rating to its CET1 ratio (or any other more complex scoring). In this way, the cost of funding (interest rate risk) would be connected not only with the sovereign risk, but also to all other risks that affect the bank's capital position (e.g., credit risk). In this case, the bank's rating would be defined as:

$$\text{Bank Rating} = \min\left( 1.1808 \cdot \text{Country Rating}^{0.9107};\ \text{scoring rating} \right)$$

with the country rating acting as a ceiling on the shadow rating attributed to the bank's creditworthiness.

Reputational Risk
We limited the modeling of this risk factor to the most relevant and basic impact in relation to the business model of our banks' sample. In practice, we assumed (see Figure 10) that reputational events may cause a customer drop out (mainly from the retail segment) that involves a subsequent decrease of deposits (liabilities with low cost funding) to be replaced by new debt (liabilities with high cost funding) as an immediately available source of funding, a decrease in net commissions arising from a shrinking customer base, and an increase in expenses due to legal costs, etc. Considering the context of severely adverse conditions, we assumed that a reputational adverse event occurs in any case in the first forecast year, with impacts (in terms of decrease in deposits and net commissions, and increase in expenses) that can range between zero (in the case of perfect remediation management of the reputational event) and a maximum level set according to benchmark parameters observed in the reference market 35. The impact then lasts for the entire forecasting period of the analysis 36. In this regard, we assumed that the reputational event determines an increase in the cost of funding equal to Euribor + bank spread applied to a refinancing up to a maximum of 21% deposits runoff, a decrease in commissions as great as 10%, and an increase in expenses up to 10% 37.
This risk factor was handled by means of three specific stochastic variables related to the decrease in deposits, the decrease in "Net Commissions", and the increase in "Operating Expenses", all modeled through a Beta function defined by the minimum and maximum values and with shape parameters: α = 1; β = 5. In Figure 11, we report the parameters of the probability distribution functions involved in the modeling of reputational risk.
35 We determined the benchmark impact by considering the cases related to four Italian banks hit by severe reputational crisis: Veneto Banca, Banca Carige, Banca MPS, and Banca Popolare di Vicenza.
36 In Montesi and Papiro (2018), we proposed modeling reputational risk within the stochastic simulation framework for the purpose of assessing a bank's probability of default (fragility assessment), considering both the probability that a reputational event may occur and the variability of its adverse impact. In that work, we suggested introducing a reputational event risk stochastic variable (simulated, for example, by means of a binomial type of distribution) through which, for each period, the probability of occurrence of a reputational event can be established. Then, in scenarios in which reputational events occur, a series of stochastic variables linked to their possible economic impact (such as reduction of commission factor, reduction of deposits factor, increased spread on deposits factor, increase in administrative expenses factor, etc.) is in turn activated. Thus, values are generated that determine the magnitude of the economic impacts of reputational events in every scenario in which they occur. By contrast, in the case of a reverse stress test, in which we always assume adverse conditions, reputational risk can depend only on the magnitude of the losses and not on the probability of occurrence of a reputational event.
37 Consider that a context of negative interest rates does not affect the mechanism envisaged by the model; in fact, in relation to P&L and capital impacts, what is relevant is the difference between the retail cost of funding (customer deposits drop out) and the wholesale cost of funding (which replaces the deposits drop out), the latter typically having a higher cost for the bank than the former (even in a negative rates context). Of course, a negative interest rates context, and even more so extraordinary supervisor's monetary operations supporting the banking sector (such as TLTRO), reduces the relevance of the reputational risk impact related to the cost of funding.
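The three reputational variables can be sketched in Python as Beta(1, 5) draws rescaled between zero and the maximum impacts quoted in the text (21% deposit runoff, 10% commission decrease, 10% expense increase). The function names, the deposit volume, and the two funding rates are hypothetical; the extra funding cost is computed, as a simplifying assumption, from the gap between the wholesale and the retail cost of funding.

```python
import random

ALPHA, BETA = 1.0, 5.0          # shape parameters stated in the text
MAX_DEPOSIT_RUNOFF = 0.21       # maximum share of deposits lost
MAX_COMMISSION_DROP = 0.10      # maximum decrease in net commissions
MAX_EXPENSE_INCREASE = 0.10     # maximum increase in operating expenses


def draw_reputational_shock(rng):
    """Draw the three impacts, each a Beta(1, 5) rescaled to [0, maximum]."""
    return {
        "deposit_runoff": MAX_DEPOSIT_RUNOFF * rng.betavariate(ALPHA, BETA),
        "commission_drop": MAX_COMMISSION_DROP * rng.betavariate(ALPHA, BETA),
        "expense_increase": MAX_EXPENSE_INCREASE * rng.betavariate(ALPHA, BETA),
    }


def extra_funding_cost(deposits, runoff, wholesale_rate, deposit_rate):
    """Cost of replacing lost deposits with more expensive wholesale debt."""
    return deposits * runoff * (wholesale_rate - deposit_rate)


rng = random.Random(1)
shock = draw_reputational_shock(rng)
# Hypothetical balance and rates, for illustration only.
cost = extra_funding_cost(
    deposits=1000.0,
    runoff=shock["deposit_runoff"],
    wholesale_rate=0.020,
    deposit_rate=0.005,
)
print(shock, cost)
```

Because only the rate differential enters the cost, the mechanism is unchanged when the level of rates is negative, as long as wholesale funding remains more expensive than deposits.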

Interest Rate Risk
This risk factor was modeled-in by calculating the impact of interest rate changes on Net Interest Income-NII and considering interest rates (Euribor Swap Rate 6M) as a stochastic variable within the stochastic optimization system aimed at detecting the reverse stress test scenarios; the assumptions adopted are based on the available data at the time of the analysis.
(1) The bulk of interest expenses is affected by:
(a) The changes in interest rates on the variable rate share of the pre-existing liabilities issued (assumed to be 78.5%).
(b) The changes in interest rates on all debt due to banks (interbank short-term financial debt), a source of funding that tends to increase in stressed conditions.
(c) The changes in interest paid on all newly issued liabilities (both fixed and variable rate issues) during the forecast period, which were assumed to be made at the new market conditions (reflecting the dynamics of both the interest rates and the bank's default spread, which in turn depends on the sovereign spread).
(d) The change in interest paid on part of the deposits (we assumed that 10% of deposits were sensitive to Euribor changes).
(e) The increase in interest paid due to the decrease in deposits (low cost of funding) related to reputational events and the subsequent switch to debt due to banks (higher cost of funding).
Of the impacts affecting interest expenses, (a), (b), and (d) are purely driven by interest rate risk, while (c) is also driven by sovereign risk, and (e) is driven by reputational risk. Of course, the fixed rate share of pre-existing liabilities that does not expire within the forecast period is not affected by interest rate risk.
(2) The bulk of interest income is affected by:
(a) The changes in interest rates on the variable rate share of pre-existing credit exposures (assumed to be 40% of the total loans portfolio).
(b) The changes in interest rates on all loans to banks (interbank short-term financial exposures).
(c) The decrease in interest received due to the new flows of non-performing exposures (we assumed that interest income was generated only on the value of exposures net of accumulated provisions).
(d) The increase in interest received on government bonds due to the re-investment of securities expired within the forecast period in bonds whose interest rates include the increase in the sovereign spread (assuming that the re-investment of government bonds is always made with new securities issues that embody the current market conditions); the reinvestments were determined on the basis of the maturity buckets of the government bonds portfolios held by the four banks.
Of all the impacts affecting interest income, only (a) and (b) were related to interest rate risk, while (c) was driven by credit risk and (d) by sovereign risk (as a matter of fact, this represents a positive mitigation side effect of sovereign risk and not a negative impact). Of course, the fixed rate share of pre-existing exposures was not affected by interest rate risk.
The description of the interest rate risk effects highlights the interlinkage between the risk factors and the second round effects envisaged by the model.
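The components driven purely by interest rate risk, namely items (a), (b), and (d) on the expense side and items (a) and (b) on the income side, can be illustrated with the following Python sketch. The three sensitivity shares are those stated in the text; the function name and all balance sheet amounts are hypothetical.

```python
VARIABLE_SHARE_LIABILITIES = 0.785  # variable rate share of liabilities issued
RATE_SENSITIVE_DEPOSITS = 0.10      # share of deposits sensitive to Euribor
VARIABLE_SHARE_LOANS = 0.40         # variable rate share of the loans portfolio


def nii_change_from_rates(rate_change, loans, loans_to_banks,
                          liabilities_issued, due_to_banks, deposits):
    """Change in NII from a rate move, pure interest rate components only."""
    income_change = rate_change * (
        VARIABLE_SHARE_LOANS * loans  # income item (a)
        + loans_to_banks              # income item (b)
    )
    expense_change = rate_change * (
        VARIABLE_SHARE_LIABILITIES * liabilities_issued  # expense item (a)
        + due_to_banks                                   # expense item (b)
        + RATE_SENSITIVE_DEPOSITS * deposits             # expense item (d)
    )
    return income_change - expense_change


# Hypothetical balance sheet, for illustration only.
delta_nii = nii_change_from_rates(
    rate_change=0.01,  # +100 bps on the Euribor swap rate
    loans=1000.0,
    loans_to_banks=100.0,
    liabilities_issued=400.0,
    due_to_banks=150.0,
    deposits=800.0,
)
print(delta_nii)
```

The sovereign, credit, and reputational components, items (c) and (e) for expenses and (c) and (d) for income, are deliberately left out of this sketch, since they depend on the other risk drivers.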

Reverse Stress Test Results and Scenario Selection
The stress test exercise performed was developed exclusively as an exemplification for illustrative purposes and does not represent to any extent a valuation on the capital adequacy of the banks considered. Stress test and reverse stress test results should not be considered as the banks' expected or most likely figures, but rather be considered as potential outcomes related to the extremely severe adverse scenarios assumed.
The risk assessment exercise performed included two kinds of analysis:
(a) A stress test exercise, aimed at assessing the probability of breaching regulatory thresholds in stressed conditions; in this regard, we considered two different thresholds: (i) 9.54%, corresponding to the aggregate OCR-Overall Capital Requirement of the four banks of the sample; and (ii) 6.5%, corresponding to the aggregate TSCR-Total SREP Capital Requirement.
(b) A reverse stress test exercise, aimed at detecting a specific set of assumptions of the primary risk drivers that can trigger a regulatory breach in each of the three years of the forecast time period; also in this case, we considered the two thresholds: (i) 9.54% OCR and (ii) 6.5% TSCR.
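As a simple illustration of the stress test output, the probability of breach can be read as the share of simulated scenarios in which the CET1 ratio falls below a threshold. The Python sketch below uses the two thresholds quoted in the text; the function name and the list of simulated CET1 ratios are hypothetical, and treating a breach as a ratio strictly below the threshold is an assumption.

```python
OCR_THRESHOLD = 0.0954   # aggregate Overall Capital Requirement
TSCR_THRESHOLD = 0.065   # aggregate Total SREP Capital Requirement


def breach_probability(cet1_ratios, threshold):
    """Share of scenarios whose CET1 ratio is strictly below the threshold."""
    breaches = sum(1 for ratio in cet1_ratios if ratio < threshold)
    return breaches / len(cet1_ratios)


# Hypothetical simulated CET1 ratios, one per scenario.
simulated_cet1 = [0.12, 0.10, 0.09, 0.06, 0.11]

p_ocr = breach_probability(simulated_cet1, OCR_THRESHOLD)
p_tscr = breach_probability(simulated_cet1, TSCR_THRESHOLD)
print(p_ocr, p_tscr)
```

Since the TSCR lies below the OCR, every scenario that breaches the TSCR also breaches the OCR, so the first probability can never be smaller than the second.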
Breaching the first threshold (OCR) is a common result in stress test simulations and does not indicate a particularly critical risk condition, since the capital conservation buffers (CCB) are specifically intended to absorb capital losses under stressed conditions (even though hitting this threshold implies relevant constraints for the bank, such as the submission of a plan for restoring the capital buffer and the application of the Maximum Distributable Amount (MDA) rules for dividend distribution). Breaching the second threshold (TSCR) implies a more serious risk condition, since it involves the infringement of viability conditions for the bank.
All of the reverse stress test analyses were performed in relation to the transitional CET1 ratio (in any case, the only material difference compared to the fully-phased ratio concerns the transitional effect of the FTA of the IFRS 9 accounting principle). For the sake of simplicity, we did not consider or report other capital constraints (Tier 1, total capital, and leverage ratio); however, applying multiple constraints would not affect the methodology.
As already mentioned, we considered the following risk factors as primary drivers of the reverse stress test exercise: Italian Real GDP rate of change; 10-year BTP-Bund spread; Euribor Swap Rate 6 months; SX5E Index rate of change; and SX5E Volatility. Table 3 shows the range of possible values assumed for each risk driver for the reverse stress test optimization process. The first column reports the records at 31 December 2018, which is the starting point of the analysis.

Table 3, SX5E Volatility (***): 12.61% (record at 31 December 2018); 25.00% (+12.39%); 45.00% (+32.39%).
(*) Italy Real GDP, 2018 YoY% (Source: Eurostat).
(**) The range min/max was obtained on the basis of the adverse scenario assumptions for the swap rate in the 2018 EU-wide stress test exercise (see ESRB 2018).
(***) In setting the range of assumptions, we considered that in the last 25 years, the maximum yearly rate of change of the SX5E index was about 45% and the maximum level of volatility recorded was about 37%.
Of course, the range of variability set for each risk factor variable is crucial in determining the output, since the magnitude of the extreme values determines the possibility of breaking the preset threshold; therefore, in correctly interpreting the results of the analysis, we always need to refer to the specific assumptions adopted. In this regard, since detecting whether or not the bank may potentially breach the threshold in a reasonable range of adverse scenarios is in itself a valuable piece of information, we suggest starting the analysis by first setting a proper and plausible level of severity in the assumptions (i.e., neither too mild nor too extreme). In this way, we can detect whether there are adverse but plausible scenarios in which the bank breaches the threshold. Of course, it may be that with that level of severity, the breach may never occur, and thus the reverse scenario cannot be identified. In that case, we must increase the severity of the assumptions beyond the plausibility level and possibly leave the range of values for each variable unconstrained 38 , so that some breaking points can be reached in any case.

The Probability of Breach: A Measure to Assess the Degree of Bank Fragility
The stress test performed was preliminary (even if not strictly necessary) and complementary to the reverse stress test. With this analysis, by exploring through a stochastic simulation the dynamics related to all the plausible adverse scenarios, we were able to calculate the entire distribution functions of the capital ratios recorded in all the scenarios and in each year. Then, by setting a particular capital ratio threshold, we can easily determine the bank's probability of breach as the number of scenarios in which the bank breaches the preset threshold within a given time frame, divided by the total number of scenarios simulated (i.e., the share of points falling in the breach area in Figure 1). This kind of analysis provides an effective measure of the bank's degree of fragility.
Tables 4 and 5 report the marginal and cumulated probabilities 39 of breach related to the two different thresholds for each forecast year. These probabilities were assessed through the stochastic simulation methodology described in our above-mentioned previous paper (Montesi and Papiro 2018), assuming for all the systemic risk factor stochastic variables a symmetric Beta distribution function with parameters (4; 4) defined by the minimum and maximum values reported in Table 3.
The probability of breaching the TSCR threshold was null in the first two years and extremely low in the third year. By contrast, as expected under the severe stressed conditions assumed, the cumulated probability of breaching the higher OCR threshold (i.e., eroding the capital conservation buffers) was extremely high in the third year, substantial in the two-year time period, and negligible in the one-year time period.
The probability of breach results indicate that a reverse stress test scenario can be determined only in the third year (2021) for the TSCR threshold (since there were no breaches in the first two years), and for all three forecast years for the OCR threshold.
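As an illustration of how a probability of breach can be read off simulated capital ratio paths, the following is a minimal Python sketch; it is not the simulation model of Montesi and Papiro (2018). The function names `scaled_beta` and `breach_probabilities` are hypothetical, and we assume here that the per-year probability is the share of scenarios below the threshold in that year, while the cumulated probability is the share of scenarios that have breached at least once up to that year.

```python
import numpy as np

def scaled_beta(rng, lo, hi, size, a=4.0, b=4.0):
    """Draw from a symmetric Beta(a, b) rescaled to the interval [lo, hi]."""
    return lo + (hi - lo) * rng.beta(a, b, size=size)

def breach_probabilities(cet1_paths, threshold):
    """cet1_paths has shape (scenarios, years); returns per-year and cumulated frequencies."""
    breached = np.asarray(cet1_paths, dtype=float) < threshold
    per_year = breached.mean(axis=0)
    # A scenario counts as breached from the first year it falls below the threshold.
    ever_breached = np.logical_or.accumulate(breached, axis=1)
    return per_year, ever_breached.mean(axis=0)

# Risk driver draws, using the SX5E Volatility range of Table 3 (25%-45%).
rng = np.random.default_rng(0)
vol_draws = scaled_beta(rng, 0.25, 0.45, size=1000)

# Four hypothetical CET1 paths over three years, tested against the 9.54% OCR.
paths = np.array([
    [0.10, 0.09, 0.08],
    [0.12, 0.11, 0.10],
    [0.09, 0.10, 0.09],
    [0.12, 0.12, 0.12],
])
per_year, cumulated = breach_probabilities(paths, 0.0954)
```

In a full exercise the paths would come from the forecasting model run on the sampled risk drivers; here they are typed in by hand only to keep the example self-contained.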

Reverse Stress Test Results
Once we have detected whether, and how many times, the bank may breach a certain threshold in a given time horizon, by performing the reverse stress test we can then find the subset of scenarios that exactly trigger the threshold (within a preset approximation level 40 ), that is, the breaking points on the edge of the distressed area in Figure 1; then, by adopting a selection criterion, we can derive the assumptions of the primary risk drivers that specify the reverse stress test scenario.
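To give a rough idea of what searching for breaking points within a preset approximation level involves, the following Python sketch runs a basic simulated annealing search from several random starting points and keeps only the solutions whose capital ratio lies within a tolerance of the threshold. It is a sketch under stated assumptions, not the SA configuration described in Section 4: the linear model `toy_cet1`, its coefficients, the bounds, the cooling schedule, and the objective (absolute gap between the capital ratio and the threshold) are all hypothetical.

```python
import math
import random

# Hypothetical ranges for two risk drivers: yearly GDP change and BTP-Bund spread (bps).
BOUNDS = [(-0.03, 0.0), (250.0, 600.0)]

def toy_cet1(x):
    """Hypothetical linear stand-in for the bank forecasting model."""
    gdp, spread = x
    return 0.12 + 1.5 * gdp - 0.0001 * (spread - 250.0)

def anneal(objective, bounds, rng, n_iter=4000, t0=0.01, cooling=0.998):
    """One simulated annealing run from a random start; returns (best_x, best_value)."""
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    fx = objective(x)
    best, fbest = list(x), fx
    temp = t0
    for _ in range(n_iter):
        # Propose a neighbour by perturbing every coordinate, clipped to its range.
        cand = [min(hi, max(lo, xi + rng.gauss(0.0, 0.05 * (hi - lo))))
                for xi, (lo, hi) in zip(x, bounds)]
        fc = objective(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
        temp *= cooling
    return best, fbest

def find_breaking_points(model, threshold, bounds, n_starts=20, tol=1e-4, seed=0):
    """Multi-start search: keep the runs that end within tol of the threshold."""
    rng = random.Random(seed)

    def gap(x):
        return abs(model(x) - threshold)

    points = []
    for _ in range(n_starts):
        x, g = anneal(gap, bounds, rng)
        if g <= tol:
            points.append(tuple(x))
    return points

points = find_breaking_points(toy_cet1, 0.065, BOUNDS)
```

Each retained point is one combination of risk driver values lying on the edge of the breach area of the toy model; different starting points generally end on different parts of that edge, which is why a selection criterion is needed afterwards.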
To detect the breaking points, we applied the SA optimization system described in Section 4 to those five systemic risk drivers. On the basis of the solutions obtained through the optimization system, in order to derive one set of assumptions that defines the reverse stress test scenario, we need to adopt a criterion to select a scenario among all those related to the breaking points. As already stated, there is no univocal criterion for this purpose; any choice unavoidably involves some degree of subjectivity. Therefore, it is very important to adopt a representative schema of the risk factor solutions that can help us to understand which one can most reasonably be considered the critical combination of assumptions that jeopardizes the bank's viability.
In this regard, the temporal dimension of the analysis and the large number of breaking points (solutions) obtained add further complexity to the issue. In fact, as explained, it is often necessary to consider several periods and several risk factor drivers to reach default conditions; that plurality of underlying assumptions (solutions) must be reduced and represented in a sensible way in order to allow us to process it and derive a final solution. Simple statistical metrics such as the average can help us in synthesizing the multidimensional complexity of the information related to the breaking points. Once we have addressed this step, we need to apply a criterion to select the reverse stress test scenario from all the breaking point solutions. In this paper, we propose two potential criteria associated with two different ways of representing the reverse stress test scenario; of course, other criteria may also be adopted.
One criterion is based on a very simple statistical measure, the simple average of all the breaking point values. This criterion has the advantage of being immediately understandable; it is well suited when the solutions found have a low dispersion and the breach is triggered in a short period (one year). In these circumstances, the rough approximation made by averaging the assumptions can be a reasonable criterion; in Section 6.2.1, we apply this criterion, reporting some simple statistics related to the reverse stress test performed (percentiles, mean absolute deviation, etc.) that can help us to understand the range of the solutions.
Another criterion that can be adopted is based on proximity to the starting point of the analysis. In other words, we select the scenario that involves, for all the risk factor variables considered, the shortest overall distance from the current conditions, regarding that combination of risk factor assumptions as the scenario that may occur before others in triggering the bank's threshold. Minimizing that distance in a generalized context is not an easy task and raises some issues; we describe how to apply this criterion in Section 6.2.2.
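The summary statistics used by the average criterion (average, mean absolute deviation, and the 5% and 95% percentiles of the breaking point values) can be computed along the following lines. This is a minimal sketch; the function name `summarize_breaking_points` and the sample values are hypothetical and are not taken from Tables 6-13.

```python
import numpy as np

def summarize_breaking_points(points):
    """points has shape (m scenarios, n risk drivers); statistics are per risk driver."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    # Mean absolute deviation around the average of each risk driver.
    mad = np.abs(pts - mean).mean(axis=0)
    p5, p95 = np.percentile(pts, [5, 95], axis=0)
    return {"mean": mean, "mad": mad, "p5": p5, "p95": p95}

# Five hypothetical breaking points for two risk drivers.
sample = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
stats = summarize_breaking_points(sample)
```

A small mean absolute deviation relative to the average signals that the breaking points are concentrated, which is the situation in which the average is a reasonable representative scenario.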

The Average Reverse Stress Test Scenario
Tables 6-13 report, for each CET1 ratio threshold (6.54% TSCR and 9.54% OCR) and for each forecast year:

• The reverse stress test set of results for all the systemic risk drivers, indicating the average values, the mean absolute deviation, and the 95% and 5% percentiles of the set of breaking point solutions determined through the SA optimization system. For a more immediate representation, we did not report the results for all the three forecast years, but only the average values over the three forecast years (in any case, the results in the three years were quite similar) (Tables 6, 8, 10 and 12).

• The reverse stress test shadow ratings associated with Italy and the ITB, and the three-year cumulated losses for each risk factor (distinguishing between Pillar 1 and Pillar 2), indicating the average values, the mean absolute deviation, and the 95% and 5% percentiles to provide a measure of the dispersion of the different risk factor impacts (Tables 7, 9, 11 and 13).

The sovereign risk records include losses due to the impact on the government bond values and the second round effect of the related DTA on capital deductions; for the sake of simplicity, we did not include the impact on the cost of funding arising from the downgrade of the country's and banks' ratings, which is instead included in the interest rate risk. Reputational risk reports all the impacts related to this factor, including the related increase in the cost of funding.
The cumulative net total loss reported to the right of the tables (Tables 7, 9, 11 and 13) corresponds to the cumulated net income and, of course, is lower than the sum of the impact of several risk factors, since it takes into account the bulk of revenues generated by the bank (net of the operating costs).
In relation to the reverse stress test scenario assumptions, we note that the dispersion of the values around the mean is quite low, making in this case the choice of the average as the selection criterion less relevant. The same narrow dispersion can be observed for the risk factor impacts.
The most relevant risk factor, as expected, was credit risk, followed by sovereign risk, notwithstanding the fact that we did not include the induced increase in the cost of funding in its impact. The third most relevant risk factor was interest rate risk. In any case, the total impact of all Pillar 2 risks, albeit very relevant, was always lower than the credit risk impact alone. Not surprisingly, the market risk impact (leaving aside sovereign risk) was quite small, since for Italian banks this risk factor is not one of the most relevant. In fact, for the four banks considered, market trading never generated losses between 2010 and 2018, and the EBA EU-wide stress test 2018 also reported no losses from trading income within the three years of the adverse scenario, but rather a cumulative income of about €2.3 billion 41 .
The 6.5% TSCR threshold reverse stress test involved, in two years, a downgrade with respect to the current ratings of two notches for the country rating and three notches for ITB bank; while in three years, it involved a five notch downgrade for both the country and ITB. A five notch downgrade also occurred for both the country and ITB in 2019 (one year) in the case of the 9.54% OCR threshold reverse stress test.
The aggregated sample of the four banks at 31 December 2018 reported a CET1 capital of about €97 billion, and existing business generated about €15 billion in revenue (net of operating costs, before provision losses and trading income/losses) for a total theoretical buffer of available financial resources of €112 billion under stable conditions. In order to reduce the capital profile below the 6.54% TSCR threshold, there would have to be a cumulative negative impact of about €−65 billion (€−47 billion in terms of CET1 capital), which would require extremely severe, but not implausible, adverse conditions; consider that in the 2011-2013 period (the peak of the crisis), the four banks accumulated an overall net loss of about €40 billion (including cumulated loan loss provisions of about €55 billion) and with a maximum impact on AOCI of about €−7 billion (although the latter was only partially computed in own funds because of the Basel 3 phase-in).
As a general comment on the reverse stress test exercise, we can note that the 9.54% OCR threshold is almost certainly breached in the severe adverse scenarios simulated. Consider that the 3-year probability of breach for that threshold assessed through the stress test was 97%, and that in the EBA EU-wide stress test for 2018 that threshold was also almost triggered, with a transitional CET1 ratio reaching 9.57% in the third year of the adverse scenario 42 . Nevertheless, overall ITB seems to hold up well enough against the 6.5% TSCR threshold. In fact, breaching that threshold requires a huge cumulative loss that implies, in particular, an extremely severe GDP drop and spread increase. In any case, we should keep in mind that in the worst 3-year period of the recent crisis (2011-2013, considering the combination of both GDP and spread variables), the average real GDP drop was 1.3% (cumulative growth −3.9%) and the spread reached a peak of 550 bps, records close to those highlighted by the reverse stress test analysis. A GDP drop close to that necessary to breach the 6.5% threshold can be found in the period 2007-2009, during which Italy's real GDP suffered a yearly average drop over the three years of −1.7% and a cumulative drop of −5.1% 43 .
In the 1-year breach reverse stress test (Table 13) focused on 2019, the optimization process found only one breaking point solution, indicating a very low likelihood of breaching the threshold in a time period of just one year on the basis of the range of assumptions adopted.

Reverse Stress Test Scenario Selection: The Criterion of Proximity
Once we have determined the m breaking point scenarios, we may also consider, as a valid criterion to select the reverse stress test scenario, the one that is the closest to current market conditions. This is because that breaking point may be considered to be in some way associated with the combination of systemic risk factors that is the most likely to cause a reverse stress test scenario (although not in a strict statistical sense) 44 , or at least the combination of systemic risk factors that may trigger the breach before others. We can call this reverse stress test scenario selection criterion the criterion of proximity 45 . In this section, we present a procedure that applies the criterion of proximity.
Considering that each risk factor may have a different unit of measurement, the first necessary step is to normalize the data matrix related to the risk factors through a feature scaling technique. Here, we can employ the "Min-Max Scaling" technique, which performs a linear rescaling of the solution values to the range between zero and one:

$$z_{ik} = \frac{x_{ik} - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \qquad (16)$$

where $z_{ik}$ is the normalized value of the i-th risk factor (with i = 1:n) in the k-th reverse stress test scenario (with k = 1:m); $x_{ik}$ is the value of the i-th risk factor in the k-th reverse stress test scenario; $x_i^{\min}$ and $x_i^{\max}$ are the minimum and maximum value of the i-th risk factor, respectively 46 . The minimum value corresponds to the starting point of the analysis, typically the latest available data (in our case study, these are the records as of 31 December 2018), in relation to which we want to minimize the distance in the reverse scenario. Therefore, for each risk factor, the smaller the values expressed by Equation (16), the closer they are to the starting point economic conditions 47 .

We can adopt the Euclidean distance as a reference metric for determining the risk factor combination that minimizes the distance from the origin or starting point (Gower 1982, 1985). Considering that for each reverse stress test analysis we have n risk factors and m scenarios, in terms of Euclidean space, this implies an n-dimensional space with m points representing all the reverse breaking scenarios simulated; thus the problem of finding the nearest scenario to the starting point can be reduced to a simple minimization process, through which we select the combination of risk factor values that minimizes the Euclidean distance of the m points from the origin; formally:

$$\min_{k=1,\dots,m} d_k, \qquad d_k = \sqrt{\sum_{i=1}^{n} z_{ik}^{2}} \qquad (17)$$

Intuitively, the minimization process of finding the set of risk factor assumptions closest to the starting point economic conditions identifies the scenario that can also be considered as the most likely to occur among those selected through the optimization process. The process described has been generalized as an n-dimensional space, although in most cases, it can be limited to just the few risk factors that determine the greatest part of the impact in the reverse stress test. For the sake of simplicity, we provide a graphical representation of the selection process that considers only the two main risk factors in the reverse stress test performed, GDP and the BTP-Bund spread, which cover more than 80% of the impact on the CET1 ratio.

42 Weighted average value recorded by the four banks in the exercise (Source "EBA: 2018 EU-Wide Stress Test").
43 Consider that the EBA EU-wide stress test originally scheduled for 2020 envisaged for Italy a cumulative three year drop in GDP of −3.7%, with an average yearly drop of −1.2%.
44 In this regard, see also what the (UK) Financial Services Authority (2011) indicates: "Firms should start by considering a wide number of scenarios that might potentially threaten their business model, despite credible management actions-subject to those scenarios being plausible-and narrow them down to the ones most likely to cause the business model of the firm to fail". (Response n. 22, p. 7). "These should be the most likely scenarios, given that business model failure is a prerequisite for a scenario to be considered, and not necessarily based on an assessment of the absolute probability of the scenario occurring". (Response n. 23, p. 8).
45 It follows the same logic adopted in other works; in this regard, see Breuer et al. (2009), Flood and Korenko (2015), Grundke and Pliszka (2018).
46 An alternative technique could be rank normalization, through which each risk factor value is replaced by its rank in that sample. Another common normalization method involves subtracting from each element of the data matrix the average of the data series and dividing it by the corresponding standard deviation, that is: $z_{ik} = (x_{ik} - \bar{x}_i) / \sigma_{x_i}$, where $\bar{x}_i$ represents the mean and $\sigma_{x_i}$ is the standard deviation of the i-th risk factor within the m scenarios generated. This kind of normalization, unlike the Min-Max Scaling technique, is capable of correctly managing anomalous values but not of returning normalized values in the same scale as in Min-Max Scaling. Some authors suggest using the mean absolute deviation instead of the standard deviation, in order to express the denominator in the same unit of measurement of the numerator (see Kaufman and Rousseeuw 2005).
In Figure 12, we plotted all the combinations of GDP and spread values associated with the 113 breaking points found through the optimization system for the 6.5% TSCR threshold reverse stress test 48 . The red dot indicates the combination of GDP and spread changes that minimizes the Euclidean distance from the origin (starting point market conditions) among all the breaking points. Consider that in order to make records easily visualizable, we reduced the scale of the axes in the graph and did not set the origin at zero, since the breaking points were quite concentrated in a narrow area. This may generate a sort of biased perspective in the graph (i.e., the breaking points may look at first glance closer to the origin than they effectively are).
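The proximity criterion in two dimensions can be sketched in Python as follows: each risk factor is rescaled so that the starting point maps to zero and the most severe value of its range maps to one, and the breaking point with the smallest Euclidean distance from the origin is selected. This is an illustrative sketch; the function name `closest_breaking_point`, the arguments `start` and `extreme` (playing the role of the minimum and maximum of each risk factor), and all the numbers are hypothetical and are not the 113 breaking points of Figure 12.

```python
import numpy as np

def closest_breaking_point(points, start, extreme):
    """points has shape (m scenarios, n risk factors); returns (index, distances)."""
    x = np.asarray(points, dtype=float)
    start = np.asarray(start, dtype=float)
    extreme = np.asarray(extreme, dtype=float)
    # Min-Max scaling: 0 at the starting point, 1 at the most severe value.
    z = (x - start) / (extreme - start)
    # Euclidean distance of every scenario from the origin (starting conditions).
    d = np.sqrt((z ** 2).sum(axis=1))
    return int(np.argmin(d)), d

# Hypothetical breaking points: (GDP change in %, BTP-Bund spread in bps).
breaking_points = [[-2.0, 450.0], [-1.0, 600.0], [-3.0, 300.0]]
k, distances = closest_breaking_point(
    breaking_points, start=[0.0, 250.0], extreme=[-4.0, 650.0]
)
```

Because the scaling divides by `extreme - start`, a risk factor whose adverse direction is downward (such as GDP) is handled in the same way as one whose adverse direction is upward (such as the spread).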
In determining the closest scenario, we can also consider introducing risk factor weights into the minimization process. To this end, all we need to do is add to the minimization formula a parameter given by the risk factor weight, so as to calculate the weighted Euclidean distance. Hence, Equation (17) becomes:

$$\min_{k=1,\dots,m} d_k^{w}, \qquad d_k^{w} = \sqrt{\sum_{i=1}^{n} w_i \, z_{ik}^{2}}$$

where $w_i$ is the weight attributed to the i-th risk factor. The weights could be assigned in relation to the risk factor's contribution to the target variable (in our case, the CET1 ratio) in determining the reverse breaking points, or any other subjective criterion considered as eligible 49 .

47 If we had variables expressed in terms of rate of change such as GDP growth rate, the minimum would correspond by definition to a zero rate (i.e., in practice we do not move from current conditions).
48 For a more immediate comprehension of the data, we reported non-normalized values; these are average values recorded within the time period of the analysis.
49 Consider that, as Kaufman and Rousseeuw (2005) note, the weight attribution corresponds to a rescaling of the coordinates through the factors $\sqrt{w_1}, \dots, \sqrt{w_n}$.
A special case of weighted Euclidean distance is given by the Mahalanobis distance. The advantage of this kind of metric is that it takes into account the correlation between the variables and the standard deviations of the risk factors; this feature makes the Mahalanobis distance particularly useful in all cases in which risk factor correlation has not been adequately treated within the forecast model.
For the k-th scenario, the weighted Euclidean distance can be represented in matrix terms as:

$$d_k^{w} = \sqrt{z_k^{\top} W z_k}$$

where $z_k = (z_{1k}, \dots, z_{nk})^{\top}$ represents the vector that defines the coordinates with respect to the origin for each of the n risk factors in the k-th of the m scenarios, while W represents an n × n diagonal matrix that includes the weights for the n risk factors. By replacing the W matrix with the inverse of the covariance matrix, $S^{-1}$, we obtain the Mahalanobis distance:

$$d_k^{M} = \sqrt{z_k^{\top} S^{-1} z_k}$$

A low value of the Mahalanobis distance implies a scenario that is close to the origin; therefore, even in this case, the most plausible reverse scenario is the one that minimizes the Mahalanobis distance. We should expect that explanatory variables, which are highly correlated in stressed scenarios such as those related to reverse stress test breaking points, have low Mahalanobis distances. In addition, a characteristic of the Mahalanobis distance is that it is scale-invariant, being insensitive to the variance of the underlying variables, which means that it can be calculated directly on the native variables 50 .
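The relation between the weighted Euclidean distance and the Mahalanobis distance can be checked numerically with a short Python sketch. The function names and the numbers are hypothetical, and the covariance matrix is simply passed in as an argument, since how it should be estimated (for example, from the breaking points or from historical data) is an assumption left open here.

```python
import numpy as np

def weighted_distance(z, weights):
    """Weighted Euclidean distance from the origin; z has shape (m scenarios, n factors)."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(weights, dtype=float)
    return np.sqrt((w * z ** 2).sum(axis=1))

def mahalanobis_from_origin(z, cov):
    """Mahalanobis distance from the origin, sqrt(z' S^-1 z), for every scenario."""
    z = np.asarray(z, dtype=float)
    s_inv = np.linalg.inv(np.asarray(cov, dtype=float))
    # Quadratic form z_k' S^-1 z_k evaluated row by row.
    return np.sqrt(np.einsum("ki,ij,kj->k", z, s_inv, z))

z = [[3.0, 4.0], [1.0, 0.0]]
cov_diag = np.diag([4.0, 16.0])              # uncorrelated risk factors
cov_corr = [[4.0, 2.0], [2.0, 16.0]]         # positively correlated risk factors

d_weighted = weighted_distance(z, [0.25, 0.0625])   # weights = 1 / variance
d_diag = mahalanobis_from_origin(z, cov_diag)
d_corr = mahalanobis_from_origin(z, cov_corr)
```

With a diagonal covariance matrix the Mahalanobis distance coincides with the weighted Euclidean distance with weights equal to the inverse variances; with a non-zero correlation the off-diagonal terms of $S^{-1}$ also enter the distance.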
Ultimately, to select the most plausible reverse stress test scenario, we suggest adopting the proximity criterion, which in our opinion is a sensible approach for this kind of exercise.

Optimization System for Calibration of Thresholds of Early Warning Risk Indicator
The methodology described for reverse stress testing can also be used for calibrating the thresholds of key risk indicators (KRI) for early warning purposes within the risk appetite framework. This can be done by selecting the KRI to be optimized and by properly setting the relevant threshold to be triggered in the reverse analysis process.
For example, assume that sovereign risk is a relevant risk factor for an Italian bank, to be strictly monitored within its risk appetite through a KRI given by the BTP-Bund spread. In order to set a threshold for the indicator that provides effective early warning signals, capable of promptly activating the escalation process and thus preventing potentially dangerous situations, we can proceed in the following way. Once we have set the key capital indicator threshold whose breach we want to prevent (e.g., the OCR CET1 ratio of 9.54%) and added a tolerance buffer (e.g., 0.46%) aimed at creating an early warning area, we can run a reverse stress test, as described, on the 10% CET1 ratio threshold (9.54% + 0.46%), which identifies the breaking points on the edge of that preset early warning area. For each breaking point, we can obtain the associated spread value; then, by applying the preferred selection criterion (e.g., the average spread in case the breaking points are quite close, the spread solution that is closest to the market value at the starting point, the most likely spread solution, etc.), we can easily get the spread value that indicates (with the desired tolerance level) the potential triggering of the CET1 ratio threshold. In other words, thanks to the results of the simulation process, we can expect that when the spread reaches the value determined through reverse analysis, we enter an area of high risk of breaching the capital ratio in the near future.
This technique can be used to easily calibrate all the KRIs of the risk appetite framework according to the desired tolerance level.
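The calibration steps described above can be sketched in Python as follows. The function names, the rule labels, and the spread values are hypothetical; the breaking point spreads would in practice come from a reverse stress test run on the early warning CET1 level.

```python
def early_warning_level(capital_threshold, tolerance_buffer):
    """CET1 level that defines the edge of the early warning area."""
    return capital_threshold + tolerance_buffer

def calibrate_kri_threshold(breaking_values, start_value, rule="average"):
    """Derive a KRI threshold from the KRI values observed at the breaking points."""
    if not breaking_values:
        raise ValueError("no breaking points available")
    if rule == "average":
        # Suitable when the breaking points are close to each other.
        return sum(breaking_values) / len(breaking_values)
    if rule == "closest":
        # Value nearest to the market level at the starting point.
        return min(breaking_values, key=lambda v: abs(v - start_value))
    raise ValueError(f"unknown rule: {rule}")

# 9.54% OCR plus a 0.46% tolerance buffer gives the 10% early warning level.
target = early_warning_level(0.0954, 0.0046)

# Hypothetical BTP-Bund spreads (bps) at the breaking points of the 10% threshold.
spreads = [400.0, 420.0, 440.0]
kri_average = calibrate_kri_threshold(spreads, start_value=250.0, rule="average")
kri_closest = calibrate_kri_threshold(spreads, start_value=250.0, rule="closest")
```

The "closest" rule yields the more conservative (earlier) warning level when the adverse direction of the indicator is upward, as in the case of the spread.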

Conclusions
Prudential banking regulations and supervisors require banks to perform reverse stress tests within their risk assessment framework. This kind of exercise can be a useful device to understand the sources of risk and the triggering levels of some primary systemic risk drivers. To effectively assess a bank's overall risk (its degree of financial fragility), we should do something different and somewhat simpler: estimate its probability of breach. Notwithstanding the urgency for reverse stress testing, research and best practices offer little methodological support to appropriately address the issue. Reverse stress testing is a complex problem with multiple solutions, as there are many ways in which a bank can breach the relevant threshold of its key risk indicator. Moreover, whereas in assessing the probability of breach we simply need to detect all the risk factor values that can cause the key risk indicator to fall below the relevant threshold into the breach area, in the reverse stress test we need to identify only those solutions that trigger the threshold and lie on the edge of the breach area. In order to resolve the reverse stress test problem, we need to find an efficient quantitative technique to determine all those combinations of risk factors that can trigger the threshold, plus a criterion to select from those solutions the one that we can consider the reverse stress test scenario.
In this article, we presented a stochastic simulation optimization system to perform bank reverse stress tests aimed at detecting the reverse scenario (i.e., the set of assumptions of key systemic risk drivers that triggers the bank's breach, defined as capital ratios below the regulatory minimum threshold). The quantitative technique employed was based on a stochastic optimization system: simulated annealing with a multi-start strategy configuration.
The methodology identifies the conditions of the primary risk factors that determine all the breaking points on the edge of the distress area, providing a meaningful set of critical assumptions that aids in understanding the bank's vulnerabilities and its sources of risk, thus enabling the user to define the reverse stress test scenario.
Regarding the selection criteria to derive a scenario from the set of all possible solutions of the reverse stress test, we suggest several possibilities and also provide a procedure based on the criterion of proximity to the starting point of the analysis, determined by minimizing the Euclidean distance (or the Mahalanobis distance), which in our opinion may be the most reasonable generalized criterion.
The proposed framework is quite flexible and allows the user to easily introduce additional risk factors, more refined satellite models, and a greater break-down of variables, providing a practical and effective solution to a very challenging computational problem. The same methodology can also be applied to calibrate early warning thresholds for key risk indicators by properly setting the breaching conditions that the optimization system has to resolve.
Since in reverse stress testing we must simulate default or near-default scenarios, we must take into account in the model all the relevant risk factors (including those that are difficult to quantify) and their feedback and second round effects. Therefore, we also presented a possible way to model some relevant Pillar 2 risks and their interlinkages such as sovereign, interest rate, and reputational risks.
We also provided a practical example of the methodology applied to the case of a 'bank', represented by an aggregated financial statement of the main four Italian banks, showing how the main risk factors can be managed within the proposed framework and the reverse stress test results interpreted.
The methodological approach presented, in our opinion, is well suited to be applied by banks' risk managers and supervisors in all enterprise-wide bank risk assessment processes that require a reverse stress test exercise: RAF, ICAAP, ILAAP, Recovery Plan, and SREP.

Net Commission Income
Determined as a percentage of total assets (0.94%), set on the basis of the 2018 ratio between the two items; and then exposed to reputational risk and thus to a stochastic decrease factor.

Administrative Expenses
Determined as a percentage of total assets (1.42%), set on the basis of the 2018 ratio between the two items; and then exposed to reputational risk and thus to a stochastic increase factor.
For the residual forecast variables, we applied the following assumptions:

Dividends and Similar Income
We assumed an income expressed as a percentage of total financial assets, equal to the record reported in 2018 and kept constant over the entire forecast period.

Net Gains on Hedging Activities
We assumed the value recorded in 2018 constant over the entire forecast period.

Net Gains/Losses on Contractual Changes
We assumed the value recorded in 2018 constant over the entire forecast period.

Net Earned Premiums
We assumed a value equal to the average of the last five years, constant over the entire forecast period.

Net Gain (Loss) from Insurance Activities
We assumed a value equal to the average of the last five years, constant over the entire forecast period.

Net Adjustments to the Value of Tangible & Intangible Assets
We assumed a value of 0.

Other Operating Income (Expense)
We assumed a value equal to the average of the last five years, constant over the entire forecast period.

Income (Loss) on Equity Investments
We assumed a value equal to the average of the last five years, constant over the entire forecast period.

Regulatory Capital and Risk Weighted Assets
Regulatory capital was determined in each forecast year by taking into account the impacts of all risk factors on equity through P&L or directly on capital (e.g., impact of sovereign risk on AOCI).
Additional Tier 1 and Tier 2 capital instruments were held constant throughout the entire forecast period, without considering any effect arising from new issues, maturity of existing instruments, or regulatory amortization.
Deferred Tax Assets (DTAs) were determined according to the fiscal impact arising from the scenario and taken into account for the capital deductions according to the treatment set by the prudential regulation (deduction from CET1 for the amount exceeding the threshold and inclusion in RWA for the amount below the threshold).
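The threshold treatment of DTAs described above can be sketched as follows. The text does not state the threshold level or the risk weight applied below it, so the defaults used here (a threshold of 10% of CET1 and a 250% risk weight) are assumptions based on our reading of the CRR treatment of DTAs arising from temporary differences. The function name and the input figures are hypothetical, and the sketch ignores any aggregate threshold that may apply jointly with other items.

```python
def dta_treatment(dta, cet1_before_deduction,
                  threshold_pct=0.10, risk_weight=2.5):
    """Split DTAs into a CET1 deduction and an RWA add-on.

    threshold_pct and risk_weight are assumed values, not taken
    from the text.
    """
    dta = max(dta, 0.0)
    threshold = threshold_pct * cet1_before_deduction
    # Amount exceeding the threshold is deducted from CET1.
    deduction = max(dta - threshold, 0.0)
    # Amount below the threshold is risk-weighted and added to RWA.
    rwa_addon = min(dta, threshold) * risk_weight
    return deduction, rwa_addon


# Hypothetical figures: CET1 of 100 before deductions.
above = dta_treatment(dta=15.0, cet1_before_deduction=100.0)
below = dta_treatment(dta=6.0, cet1_before_deduction=100.0)
print("DTAs above threshold:", above)
print("DTAs below threshold:", below)
```

Because scenario losses generate new DTAs while also eroding CET1, the threshold shrinks exactly when DTAs grow, which makes the deduction increasingly binding in adverse scenarios.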
We considered the phasing-out of the positive prudential filter granted for the first-time adoption (FTA) of the IFRS 9 accounting standard, according to which banks can temporarily benefit from the prudential sterilization of the negative impact of the FTA. More specifically, of the four banks considered in the sample, Intesa Sanpaolo, Gruppo BPM, and Gruppo UBI decided to take advantage of the transitional prudential filter, while Unicredit did not. Therefore, on the basis of the 2018 financial statement and Pillar 3 information, we determined the evolution of the aggregated prudential filter over the forecast period, adopting the following regulatory phasing-out timetable: 95% in 2018, 85% in 2019, 70% in 2020, and 50% in 2021.
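As a minimal sketch of the phasing-out mechanics, the filter in each year can be computed as the timetable percentage applied to the aggregated FTA impact. The percentages are those given in the text; the function name and the FTA impact of 100 are hypothetical, and the simple product form ignores any tax or other adjustments that the actual calculation may include.

```python
# Regulatory phasing-out timetable as reported in the text.
PHASE_OUT = {2018: 0.95, 2019: 0.85, 2020: 0.70, 2021: 0.50}


def prudential_filter(fta_impact, year):
    """Return the CET1 add-back for the given year.

    fta_impact: aggregated negative FTA impact of the banks that
                opted for the filter (hypothetical input).
    """
    if year not in PHASE_OUT:
        # The text gives no percentages for other years.
        raise ValueError(f"no phase-out percentage available for {year}")
    return PHASE_OUT[year] * fta_impact


# Hypothetical aggregated FTA impact of 100 (any currency unit).
filters = {year: prudential_filter(100.0, year) for year in PHASE_OUT}
print(filters)
```

The declining add-back means that, even with no new losses, CET1 decreases year by year as the filter is phased out.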
With regard to capital requirements, we determined the credit risk RWA by applying a constant risk weight to total net exposure: for performing loans, the weight was set equal to the value recorded in 2018 (49.42%), while for non-performing loans we applied a 100% risk weight. RWA related to the other risks (market and operational) were kept constant and equal to their 2018 values.
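The RWA calculation described above reduces to simple arithmetic, sketched below. The 49.42% and 100% risk weights are taken from the text; the function name and all exposure amounts are hypothetical.

```python
PERFORMING_RISK_WEIGHT = 0.4942  # 2018 value reported in the text
NPL_RISK_WEIGHT = 1.00           # 100% risk weight on non-performing loans


def total_rwa(performing_net, npl_net, other_rwa):
    """Return total RWA for one forecast year.

    performing_net: net exposure on performing loans
    npl_net:        net exposure on non-performing loans
    other_rwa:      market and operational RWA, held at 2018 levels
    """
    credit_rwa = (PERFORMING_RISK_WEIGHT * performing_net
                  + NPL_RISK_WEIGHT * npl_net)
    return credit_rwa + other_rwa


# Hypothetical exposures (any currency unit).
rwa = total_rwa(performing_net=1000.0, npl_net=50.0, other_rwa=100.0)
print(round(rwa, 2))
```

Since non-performing loans carry a higher weight than performing loans, a scenario that migrates exposure into default raises credit RWA to the extent that the net non-performing exposure (after provisions) remains on the balance sheet.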