## 1. Introduction

The global financial crisis (GFC) heightened the need for greater accountability in evaluating structured securities and thus required authorities to implement policies aimed at increasing the level of transparency in the asset-backed securities (ABS) framework. In fact, ABS represent a monetary policy instrument which has been largely used by the European Central Bank (ECB) after the financial crisis. On this ground, in 2010 the ECB issued the ABS Loan-Level Initiative, which defines the minimum information requirement at loan level for the acceptance of ABS instruments as collateral in the Eurosystem's credit operations. This new regulation is based on a specific template^1 and provides market participants with more timely and standardised information about the underlying loans and the corresponding performance.

After the GFC, a large amount of ABS issued by banks has been used as collateral in repurchase agreement operations (repos) via the ABS Loan-Level Initiative in order to receive liquidity. A repo is a contract where a cash holder agrees to purchase an asset and re-sell it at a predetermined price at a future date or on the occurrence of a particular contingency. One of the main advantages of the repo is the guarantee offered to the lender, since the credit risk is covered by the collateral in the case of the borrower's default.

To collect, validate and make available the loan-level data for ABS, in 2012 the Eurosystem designated the European DataWarehouse (ED) as the European securitisation repository for ABS data. As stated on its website, the main purpose of the ED is to provide transparency and confidence in the ABS market.

The ED was founded by 17 market participants (large corporations, organizations and banks) and started to operate in the market in January 2013. To be eligible for repurchase agreement transactions with the ECB, securitisations have to meet solvency requirements: for instance, if the default rates in the pool of underlying assets reach a given level, the ABS is withdrawn as collateral. This repository clearly enables new research on ABS by providing more detailed information at loan level.

In this paper, we consider credit scoring for ABS of small and medium enterprises (SMEs) by using a database of loan-level data provided by the ED. The aim of our analysis is to compare the riskiness of securitised loans with the average of bank lending in the SME market in terms of probability of default.

We consider the SME market since it plays an important role in the European economy. In fact, SMEs constitute 99% of the total number of companies, are responsible for 67% of jobs and generate about 85% of new jobs in the Euro area (Hopkin et al. 2014). SMEs are largely reliant on bank-related lending (i.e., credit lines, bank loans and leasing) and, despite their positive growth, they still suffer from credit tightening since lending remains below the pre-crisis level, in contrast to large corporates. Furthermore, SMEs do not have easy access to alternative channels such as securitisation (Dietsch et al. 2016). In this respect, the ECB intended to provide credit to the Eurozone's economy in favour of the lending channel by using the excess liquidity of the banking system^2 due to the Asset-Backed Securities Purchase Programme (ABSPP) to ease the borrowing conditions for households and firms. Consequently, securitisation represents an interesting credit channel for SMEs to be investigated in a risk portfolio framework. In particular, SMEs play an even more important role in Italy than in the rest of the European Union: the share of SME value added is 67% compared to an EU average of 57%, and the share of SME employment is 79%. Therefore, the ABS of Italian SMEs represent an interesting case to be investigated since Italy is the third largest economy in the Eurozone.

In this regard, we collect the exposures of Italian SMEs and define as defaulted those loans that are in arrears for more than 90 days. We define the 90-day threshold according to Article 178 of Regulation (EU) No 575/2013 (European Parliament 2013), which specifies the definition of the default of an obligor used for the IRB Approach^3. We exploit the informational content of the variables included in the ECB template and compute a score for each company to measure its probability of default. Then, we analyse a sample of 106,257 SME borrowers and estimate the probability of default (PD) at individual level through a logistic regression based on the information included in the dataset. The estimated PD allows us to compare the average PD in the securitised portfolio with the average PD in bank lending for SMEs.

The variables included in the analysis, which will be presented in Section 3, are: (i) interest rate index; (ii) business type; (iii) Basel segment; (iv) seniority; (v) interest rate type; (vi) NACE industry code; (vii) number of collateral securing the loan; (viii) weighted average life; (ix) maturity date; (x) payment ratio; (xi) loan to value and (xii) geographic region. Using the recovery rates provided by banks, we estimate the loss distribution of a global portfolio composed of 20,000 loans at different cut-off dates using the CREDITRISK^{+}™ model proposed by Credit Suisse First Boston (CSFB 1997).

Our findings show that the default rates for securitised loans are lower than those of average bank lending for Italian SMEs' exposures, in accordance with the studies conducted on the Italian market by CRIF Ratings^4 (Caprara et al. 2015).

The remainder of the paper is structured as follows. Section 2 provides a literature review on SMEs and default estimation, while Section 3 illustrates the empirical analysis and our findings. Finally, Section 4 concludes the paper.

## 3. Empirical Analysis

In this section, we analyze at loan level an SME ABS portfolio issued by an Italian bank during 2011 and 2012. We carry out the analysis by following the loans included in the sample at different pool cut-off dates, from 2014 to 2016, at dates close to or coinciding with semester ends. However, it is not possible to track all loans across the various periods due to the revolving nature of the operations, which allows the SPV to purchase other loans during the life of the operation.

We examine those variables included in the ECB template that may lead to the definition of a system for measuring the risk of a single counterpart. In particular we select: (i) interest rate index (field AS84 of the ECB SME template); (ii) business type (AS18); (iii) Basel segment (AS22); (iv) seniority (AS26); (v) interest rate type (AS83); (vi) NACE industry code (AS42); (vii) number of collateral securing the loan (CS28); (viii) weighted average life (AS61); (ix) maturity date (AS51); (x) payment ratio; (xi) loan to value (LTV) and (xii) geographic region (AS17). We compute the payment ratio as the ratio between the installment and the outstanding amount, and the loan to value as the ratio between the outstanding loan amount and the collateral value.

Interest rate index includes: (1) 1 month LIBOR; (2) 1 month EURIBOR; (3) 3 month LIBOR; (4) 3 month EURIBOR; (5) 6 month LIBOR; (6) 6 month EURIBOR; (7) 12 month LIBOR; (8) 12 month EURIBOR; (9) BoE Base Rate; (10) ECB Base Rate; (11) Standard Variable Rate; (12) Other. Business type assumes: (1) Public Company; (2) Limited Company; (3) Partnership; (4) Individual; (5) Other. Basel segment is restricted to (1) Corporate and (2) SME treated as Corporate. Seniority can be: (1) Senior Secured; (2) Senior Unsecured; (3) Junior; (4) Junior Unsecured; (5) Other. Interest rate type is divided into: (1) Floating rate loan (for life); (2) Floating rate loan linked to Libor, Euribor, BoE reverting to the Bank's SVR, ECB reverting to Bank's SVR; (3) Fixed rate loan (for life); (4) Fixed with future periodic resets; (5) Fixed rate loan with compulsory future switch to floating; (6) Capped; (7) Discount; (8) Switch Optionality; (9) Borrower Swapped; (10) Other. NACE industry code corresponds to the European statistical classification of economic activities. Number of collateral securing the loan represents the total number of collateral pieces securing the loan. Weighted average life is the weighted average life of the loan (taking into account the amortization type and maturity date) at the cut-off date. Maturity date represents the year and month of loan maturity. Finally, the geographic region describes where the obligor is located based on the Nomenclature of Territorial Units for Statistics (NUTS). Given the NUTS code, we group the different locations into North, Center and South of Italy^5.
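The two derived ratios above can be sketched as follows; the function and argument names are illustrative and do not correspond to actual ECB template field codes:

```python
# Hypothetical helpers for the two derived variables described in the text.

def payment_ratio(installment, outstanding_amount):
    """Payment ratio: installment over outstanding amount."""
    return installment / outstanding_amount

def loan_to_value(outstanding_amount, collateral_value):
    """LTV: outstanding loan amount over collateral value."""
    return outstanding_amount / collateral_value

print(payment_ratio(1_500.0, 100_000.0))   # 0.015
print(loan_to_value(80_000.0, 100_000.0))  # 0.8
```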

The final panel dataset used for the counterparties' analysis contains 106,257 observations. Table 1 shows the number of non-defaulted and defaulted loans for each pool cut-off date.

In the process of computing a riskiness score for each borrower, we consider the default date so as to take into account only the loans that are either not defaulted or defaulted between two pool cut-off dates (prior to the pool cut-off date in the case of 2014H1). In the considered sample, the observed defaulted loans are equal to $2.84\%$ of the total number of exposures (Table 1).

We analyze a total of 159,641 guarantees related to 117,326 loans. For the score and the associated default probability, we group the individual loan information together to associate it with a total of 106,257 borrowers over five pool cut-off dates (Table 2). In order to move from the level of individual loans to the level of individual companies, we calculate the average across all loans from the same counterparty for numerical variables; otherwise, we retain the most common value for the borrower.

We analyze the variables included in the ECB template individually through a univariate selection analysis, which allows us to measure the impact of each variable on a loan's riskiness. We group each variable's observations through a binning process in order to: (i) reduce the impact of outliers in the regression; (ii) better understand the impact of the variable on credit risk through the study of the Weight of Evidence (WOE); (iii) study the variable according to a strategic purpose.

Operators suggest taking the WOE as a reference to test the model's predictivity (Siddiqi 2017). The WOE is a measure of separation between goods (non-defaulted) and bads (defaulted), which compares the proportions of solvent and insolvent borrowers in each group of the same variable. Specifically, the Weight of Evidence value for group $i$ is computed as:

$$WOE_i = \ln\left(\frac{DistrGood_i}{DistrBad_i}\right) \qquad (1)$$

and could be written as:

$$WOE_i = \ln\left(DistrGood_i\right) - \ln\left(DistrBad_i\right) \qquad (2)$$

The value of the WOE will be zero if the odds $DistrGood/DistrBad$ is equal to one. If the $DistrBad$ in a group is greater than the $DistrGood$, the odds ratio will be less than one and the WOE will be a negative number; if the $DistrGood$ is greater than the $DistrBad$ in a group, the WOE value will be a positive number.
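A minimal sketch of the WOE computation for a single group, assuming $DistrGood$ and $DistrBad$ are the group's shares of total goods and bads:

```python
from math import log

def woe(goods_in_group, total_goods, bads_in_group, total_bads):
    """Weight of Evidence of a group: ln(DistrGood / DistrBad)."""
    distr_good = goods_in_group / total_goods
    distr_bad = bads_in_group / total_bads
    return log(distr_good / distr_bad)

# A group holding 30% of all goods but only 10% of all bads is "safe":
print(round(woe(300, 1000, 10, 100), 3))  # 1.099  (ln(0.3/0.1) = ln 3)
```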

To create a predictive and robust model we use the Monotonous Adjacent Pooling Algorithm (MAPA) proposed by Thomas et al. (2002). This technique is a pooling routine used to reduce the impact of statistical noise: an interval containing all observed values is split into smaller sub-intervals, bins or groups, each of which is assigned the central value characterizing that interval (Mironchyk and Tchistiakov 2017). Pooling algorithms are useful for coarse classing when individual characteristics are represented in the model. There are three types of pooling algorithm: (i) non-adjacent, for categorical variables; (ii) adjacent, for numeric, ordinal and discrete characteristics; and (iii) monotone adjacent, when a monotonic relationship is assumed with respect to the target variable. While non-adjacent algorithms do not require any assumptions about the ordering of classes, adjacent pooling algorithms require that only contiguous attributes be grouped together, which applies to ordinal, discrete and continuous characteristics (Anderson 2007). In this context, MAPA is a supervised algorithm that allows us to divide each numerical variable into different classes according to a monotone WOE trend, either increasing or decreasing depending on the variable considered. For categorical variables we maintain the original classification, as presented in the ECB template. The starting point for the MAPA application is the calculation of the cumulative default rate (bad rate) for each score level:

$$C_v = \frac{\sum_{i=V_k+1}^{v} B_i}{\sum_{i=V_k+1}^{v}\left(G_i + B_i\right)}$$

where $G$ and $B$ are the good (non-defaulted) and bad (defaulted) counts, $V$ is a vector containing the series of score breaks being determined, $v$ is a score above the last score break, and $i$ and $k$ are indices for each score and score break, respectively. We calculate cumulative bad rates for all scores above the last breakpoint and identify the score with the highest cumulative bad rate; this score is assigned to the vector as shown in Equation (4):

$$V_{k+1} = \underset{v}{\arg\max}\; C_v \qquad (4)$$

with $C$ representing the cumulative bad rate. This iterative process terminates when the maximum cumulative bad rate is the one associated with the highest possible score.

To test the model's predictivity together with the WOE we use a further measure: the Information Value (IV). The Information Value is widely used in credit scoring (Hand and Henley 1997; Zeng 2013) and indicates the predictive power of a variable with respect to a response variable, such as borrower default. Its formulation is expressed by the formula:

$$IV = \sum_{i=1}^{n}\left(DistrGood_i - DistrBad_i\right)\times \ln\left(\frac{DistrGood_i}{DistrBad_i}\right)$$

where $Distr$ refers to the proportion of $Goods$ or $Bads$ in the respective group, expressed as relative proportions of the total number of Goods and Bads, and can be rewritten by inserting the WOE as follows:

$$IV = \sum_{i=1}^{n}\left(\frac{N_i}{N} - \frac{P_i}{P}\right)\times WOE_i$$

with $N$ representing the non-defaulted loans (Negative to default status) and $P$ the defaulted loans (Positive to default); the WOE is calculated on the $i$-th characteristic and $n$ corresponds to the total number of characteristics analyzed, as shown in Equations (1) and (2). As stated in Siddiqi (2017), there is no precise rule for discriminating variables through the information value. It is common practice among operators to follow an approximate rule based on these thresholds: (i) an IV smaller than $0.02$ indicates an unpredictive variable; (ii) from $0.02$ to $0.1$ the power is weak; (iii) from $0.1$ to $0.3$ average; (iv) above $0.3$ strong.
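The binning and screening steps above can be sketched as follows; `mapa_breaks` is a simplified rendering of the MAPA idea (place each break at the score with the highest cumulative bad rate above the previous break) and the toy counts are invented for illustration:

```python
from math import log

def mapa_breaks(goods, bads):
    """Simplified MAPA sketch: `goods[i]` and `bads[i]` are counts at
    ordered score level i. Each pass scans the levels above the last
    break, computes the cumulative bad rate, and places the next break
    where that rate peaks, so the bin bad rates end up monotone."""
    breaks, start, n = [], 0, len(goods)
    while start < n:
        best_rate, best_idx, g, b = -1.0, start, 0, 0
        for i in range(start, n):
            g, b = g + goods[i], b + bads[i]
            rate = b / (g + b)
            if rate > best_rate:
                best_rate, best_idx = rate, i
        breaks.append(best_idx)
        start = best_idx + 1
    return breaks

def information_value(groups):
    """IV = sum over groups of (DistrGood - DistrBad) * WOE."""
    total_g = sum(g for g, _ in groups)
    total_b = sum(b for _, b in groups)
    iv = 0.0
    for g, b in groups:
        dg, db = g / total_g, b / total_b
        iv += (dg - db) * log(dg / db)
    return iv

# Toy counts: levels 1 and 2 get pooled because their bad rate is not monotone.
print(mapa_breaks([10, 20, 5, 80], [10, 10, 6, 2]))         # [0, 2, 3]
print(round(information_value([(300, 10), (700, 90)]), 3))  # 0.27
```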

Table 3 shows the information value of each variable within the dataset at the first pool cut-off date.

According to Siddiqi (2017), logistic regression is a common technique used to develop scorecards in most financial industry applications where the predicted variable is binary. Logistic regression uses a set of predictor characteristics to predict the likelihood of a defined outcome, such as the borrower's default in our study. The equation for the logit transformation is described as:

$$logit\left(p_i\right) = \ln\left(\frac{p_i}{1-p_i}\right) = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}$$

where $p_i$ represents the posterior probability of the "event" given the input variables for the $i$-th borrower; $x$ are the input variables; $\beta_0$ corresponds to the intercept of the regression line; $\beta_j$ are the parameters and $k$ is the total number of parameters. The result $logit\left(p_i\right)$ in the equation represents a logarithmic transformation of the output, i.e., $\log\left(p[event]/p[nonevent]\right)$, necessary to linearize the posterior probability and constrain the estimated probabilities in the model between 0 and 1. The parameters $\beta_1 \cdots \beta_k$ measure the rate of change in the model as the value of the independent variable changes by one unit. Independent variables must either be standardized, to be made as independent as possible of the input unit, or replaced by the WOE of the class created for the variable. The final formulation becomes:

$$logit\left(p_i\right) = \beta_0 + \sum_{j=1}^{k}\beta_j\, WOE_{ij}$$
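A minimal sketch of fitting the scorecard regression on WOE-coded inputs; plain gradient descent is used here only for self-containment and is not necessarily the estimation routine used in the paper, and the toy sample is invented:

```python
from math import exp

def sigmoid(z):
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Gradient-descent logistic regression on WOE-coded inputs."""
    n_feat = len(X[0])
    beta0, beta = 0.0, [0.0] * n_feat
    for _ in range(epochs):
        g0, g = 0.0, [0.0] * n_feat
        for xi, yi in zip(X, y):
            p = sigmoid(beta0 + sum(b * x for b, x in zip(beta, xi)))
            err = p - yi  # gradient of the log-loss w.r.t. the linear score
            g0 += err
            for j in range(n_feat):
                g[j] += err * xi[j]
        beta0 -= lr * g0 / len(y)
        for j in range(n_feat):
            beta[j] -= lr * g[j] / len(y)
    return beta0, beta

# Toy WOE-coded sample: positive WOE means safer, so defaults (y = 1)
# cluster at negative WOE values.
X = [[1.2], [0.8], [0.5], [-0.4], [-0.9], [-1.3]]
y = [0, 0, 0, 1, 1, 1]
beta0, beta = fit_logistic(X, y)
p_default = sigmoid(beta0 + beta[0] * (-1.0))
print(p_default > 0.5)  # True: a strongly negative WOE implies a high PD
```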

The regression is performed on cross-sectional data for each pool cut-off date. We measure the impact of the variables on credit risk through the WOE. Considering the LTV, when the ratio between the outstanding loan amount and the collateral value increases, the default rate increases as well, while the WOE decreases. This indicates that an increment in the LTV is a sign of a deterioration in the creditworthiness of the borrower. The relation is reported in Table 4.

We report in Equation (9) the regression obtained for the first pool cut-off date; for the sake of space we include only the first regression. The output of the regressions for the other pool cut-off dates is reported in Appendix A. It should be noted that not all the variables included in the sample are significant. The LTV, even though predictive according to the information value criteria, has not been included in the regression due to a high number of missing values:

Table 5 reports the coefficients of the considered variables along with their significance level, marked by *** at the 1% level and by ** at the 5% level.

Figure 1a indicates the default probability associated with each score level for the first pool cut-off date; in Appendix A we report the relationship for the other pool cut-off dates. We choose a score scale ranging from 500 (worst counterparties) to 800 points (best counterparties). As the score decreases, the associated default probability increases.

Validation statistics have the double purpose of measuring: (i) the power of the model, i.e., its ability to identify the dependence between the variables and the outputs produced, and (ii) the divergence from the real results. We use the Kolmogorov-Smirnov (KS) curve and the Receiver Operating Characteristic (ROC) curve to measure the model's predictive capacity.

**Kolmogorov-Smirnov (KS)** The KS coefficient, according to Mays and Lynas (2004), is the most widely used statistic within the United States for measuring the predictive power of rating systems. The Kolmogorov-Smirnov curve plots the cumulative distributions of non-defaulted and defaulted loans against the score, showing the percentage of non-defaulted and defaulted borrowers below a given score threshold and identifying the point of greatest divergence between the two. According to Mays and Lynas (2004), KS values should be in the range 20%–70%. The goodness of the model should be highly questioned when values are below the lower bound; values above the upper bound should also be considered with caution because they are 'probably too good to be true'. The Kolmogorov-Smirnov statistic for the two cumulative distribution functions $F_{good}\left(x\right)$ and $F_{bad}\left(x\right)$ is:

$$KS = \sup_x \left| F_{good}\left(x\right) - F_{bad}\left(x\right) \right|$$

where $\sup_x$ is the supremum of the set of distances. The results on the dataset are shown in Figure 2 and lie within the thresholds for the first pool cut-off date: at a score of 623 points the KS value is $23.8\%$. The statistics for the other pool cut-off dates are reported in Appendix A.
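The KS computation can be sketched as follows, evaluating the two empirical CDFs at every observed score; the score lists are hypothetical:

```python
def ks_statistic(scores_good, scores_bad):
    """KS = maximum distance between the empirical CDFs of the
    non-defaulted (good) and defaulted (bad) score distributions."""
    thresholds = sorted(set(scores_good) | set(scores_bad))
    ks = 0.0
    for t in thresholds:
        f_good = sum(s <= t for s in scores_good) / len(scores_good)
        f_bad = sum(s <= t for s in scores_bad) / len(scores_bad)
        ks = max(ks, abs(f_good - f_bad))
    return ks

goods = [700, 720, 680, 750, 710]  # hypothetical non-defaulted scores
bads = [600, 640, 690, 610]        # hypothetical defaulted scores
print(ks_statistic(goods, bads))   # 0.8
```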

**Lorenz curve and Gini coefficient** In credit scoring, the Lorenz curve is used to analyze the model's ability to distinguish between "good" (non-defaulted) and "bad" (defaulted) borrowers, showing the cumulative percentages of defaulted and non-defaulted on the axes of the graph (Müller and Rönz 2000). When a model has no predictive capacity, there is perfect equality. The Gini coefficient, widely used in Europe (Řezáč and Řezáč 2011), is derived from the Lorenz curve and measures the area between the curve and the diagonal.

**Gini coefficient** The Gini coefficient is computed as:

$$Gini = 1 - \sum_{i=1}^{n}\left(cpX_i - cpX_{i-1}\right)\left(cpY_i + cpY_{i-1}\right)$$

where $cpY$ is the cumulative percentage of defaulters and $cpX$ is the cumulative percentage of non-defaulters. The result is a coefficient that measures the separation between the curve and the diagonal. The Gini coefficient is a statistic used to understand how well the model can distinguish between "good" and "bad".

This measure has the following limitations: (i) it can be increased by widening the range of indeterminates, i.e., those who are neither "good" nor "bad", and (ii) it is sensitive to the definition of the categories of variables, both in terms of number and type. Operators' experience, according to Anderson (2007), suggests that the Gini coefficient should range between 30% and 50% for a satisfactory model.

**Receiver Operating Characteristic (ROC)** As reported by Satchel and Xia (2008), among the methodologies for assessing discriminatory power described in the literature, the most popular one is the ROC curve and its summary index known as the area under the ROC curve (AUROC). The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true positive rate is also known as sensitivity, while the false positive rate equals 1 minus the specificity, where specificity represents the ability to identify true negatives. The two rates are therefore:

$$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN} = 1 - specificity$$

The curve is concave only when the ratio $p_i^{+}/p_i^{-}$ has a monotonic relationship with the event being studied. When the curve goes below the diagonal, the model is making a mistake in the prediction for both false positives and false negatives, but a reversal of the sign could correct it. The AUROC is very similar to the Gini coefficient, except that it represents the area under the ROC curve, as opposed to measuring the part of the curve above the diagonal. The commonly used formula for the AUROC, as reported in (Anderson 2007, p. 207), is:

$$AUROC = P\left(S_{TP} < S_{TN}\right) + \frac{1}{2}\, P\left(S_{TP} = S_{TN}\right)$$

and shows that the area below the curve is equal to the probability that the score of a true positive (defaulted, $S_{TP}$) is less than that of a true negative (non-defaulted, $S_{TN}$), plus $50\%$ of the probability that the two scores are equal. An AUROC value of 50% implies that the model is doing nothing more than making a random guess.
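The pairwise formulation of the AUROC can be sketched directly, together with the standard relation Gini = 2·AUROC − 1; the score lists are hypothetical:

```python
def auroc(scores_bad, scores_good):
    """AUROC = P(S_TP < S_TN) + 0.5 * P(S_TP = S_TN), comparing every
    defaulted (true positive) score with every non-defaulted one."""
    pairs = 0.0
    for sb in scores_bad:
        for sg in scores_good:
            if sb < sg:
                pairs += 1.0
            elif sb == sg:
                pairs += 0.5
    return pairs / (len(scores_bad) * len(scores_good))

bads = [600, 640, 690, 610]        # hypothetical defaulted scores
goods = [700, 720, 680, 750, 710]  # hypothetical non-defaulted scores
a = auroc(bads, goods)
print(a, 2 * a - 1)  # AUROC 0.95 and the corresponding Gini 0.9
```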

Table 6 shows the values of the statistics for the analyzed pool cut-off dates.

Once the predictive ability of the model is tested, it is possible to calculate the probability of default for classes of counterparties. In this respect, we create a master scale to associate a default probability with each score. As stated in Siddiqi (2017), a common approach is to have discrete scores scaled logarithmically. In our analysis, we set the target score to 500 with the odds doubling every 50 points, which is common practice (Refaat 2011). The rating classes are defined through cut-offs with given class widths, and the relationship between the logarithm and the exponential function makes it possible to create the ranges for each rating class: the vector of default probabilities by counterparty is linearized by taking natural logarithms, the result is divided into 10 equal classes, and the logarithms at the cut-off of each class are converted back with the exponential function to identify the cut-off associated with each scoring class. With this procedure we calculate an average default probability for each range created (Figure 1b).
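The logarithmic score scaling (target score 500, odds doubling every 50 points) can be sketched as follows; the odds assumed at the target score are an illustrative assumption, not a figure from the paper:

```python
from math import log

# Standard scorecard scaling: score = offset + factor * ln(odds).
PDO = 50.0            # points to double the odds, as in the text
TARGET_SCORE = 500.0  # target score, as in the text
TARGET_ODDS = 1.0     # odds (good:bad) assumed at the target score

factor = PDO / log(2)
offset = TARGET_SCORE - factor * log(TARGET_ODDS)

def score_from_pd(pd):
    """Map a default probability to the score scale via its odds."""
    odds = (1.0 - pd) / pd
    return offset + factor * log(odds)

print(round(score_from_pd(0.5)))    # 500: odds 1:1 sits at the target score
print(round(score_from_pd(2 / 3)))  # 450: odds halve, score drops by PDO
print(round(score_from_pd(1 / 3)))  # 550: odds double, score rises by PDO
```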

We validate the results obtained in the logistic regression with an out-of-sample analysis. The validation directly follows Siddiqi (2017), which illustrates the standard procedure adopted in credit scoring. The industry norm is to use a random 70% (or 80%) of the development sample for building the model, while the remaining sample is kept for validation. When the scorecard is developed on a small sample, as in our case, it is preferable to use the whole sample and validate the model on randomly selected samples covering 50–80% of the observations. Accordingly, we adopt the second approach by selecting an out-of-sample of 50% of the total observations. We proceed as in-sample to analyze the statistics of separation and divergence for the out-of-sample; the statistics are reported in Table 7. We observe that the statistics do not differ substantially between the out-of-sample and the whole sample.

We carry out the analysis of the portfolio composition for all the pool cut-off dates analyzed. The revolving nature of the ABS may cause the composition of the portfolio under study to vary, even significantly. In general, the classes that include most of the counterparties are the central ones, as can be seen in Figure 3b: the counterparties included in the ABS have an intermediate rating. For the sake of completeness, we report in Table 8 the actual default frequency in the sample for each rating class.

To estimate the recovery rate of a defaulted exposure it is necessary to have information regarding the market value of the collateral, the administrative costs incurred in the credit recovery process and the cumulative recoveries. Since those data are not available in the dataset, we derive the recovery rates directly from the data provided by the banks in template field AS37, "Bank internal Loss Given Default (LGD) estimate", which estimates the LGD of the exposure under normal economic conditions. The RR of the loan is calculated as: $RR(\%)=100\%-LGD(\%)$.

The average recovery rate across all the collateral related to one loan, as calculated by the bank, differs depending on the level of protection offered, as evidenced by Figure 4.

As can be seen in Figure 4, the originator estimates a lower recovery rate for unsecured exposures than for secured loans. The average RR for secured exposures is $80.3\%$, while for unsecured exposures the bank expects on average to recover $66.8\%$ of the amount granted.

Figure 5 and Table 9 show the recovery rate calculated by the bank by rating level: the average recovery rate tends to decrease as the counterparty's rating deteriorates, although not monotonically.

To investigate the portfolio loss distribution we implement the CREDITRISK^{+}™ model on a representative sample of approximately 20,000 counterparties, of which 10,000 refer to loans terminated (repaid or defaulted) before the first pool cut-off date, while the remaining 10,000 are active at the latest pool cut-off date and are used to provide a forecast of the future loss profile of the portfolio.

CREDITRISK^{+}™ can be applied to different types of credit exposure, including corporate and retail loans, derivatives and traded bonds. In our analysis we implement it on a portfolio of SME credit exposures. It is based on a portfolio approach to modelling credit risk that makes no assumption about the causes of default; this approach is similar to the one used in market risk, where no assumptions are made about the causes of market price movements.

CREDITRISK^{+}™ considers default rates as continuous random variables and incorporates the volatility of default rates to capture the uncertainty in their level. The data used in the model are: (i) credit exposures; (ii) borrower default rates; (iii) borrower default rate volatilities and (iv) recovery rates. To reduce the computational burden, the exposures are adjusted by the anticipated recovery rates in order to calculate the loss in case of a default event. We consider the recovery rates provided by the ED and include them in the database. The exposures, net of recovery rates, are divided into bands with similar exposures. The model assumes that each exposure has a definite known default probability over a specific time horizon; thus, let $p_A$ denote the annual probability of default of obligor $A$.

We introduce the probability generating function (PGF) defined in terms of an auxiliary variable $z$. An individual borrower either defaults or does not default; therefore, the probability generating function for a single borrower is^6:

$$F_A\left(z\right) = 1 - p_A + p_A z = 1 + p_A\left(z - 1\right)$$

CREDITRISK^{+}™ assumes that default events are independent; hence, the probability generating function for the whole portfolio is the product of the individual PGFs, as shown in Equation (17):

$$F\left(z\right) = \prod_A F_A\left(z\right) = \prod_A \left(1 + p_A\left(z - 1\right)\right) \qquad (17)$$

and could be written as:

$$\log F\left(z\right) = \sum_A \log\left(1 + p_A\left(z - 1\right)\right) \qquad (18)$$

The Credit Risk Plus model (CSFB 1997) assumes that borrowers' default probabilities are uniformly small, so that powers of those probabilities can be ignored and the logarithm can be replaced using the expression^7

$$\log\left(1 + p_A\left(z - 1\right)\right) \approx p_A\left(z - 1\right)$$

and, in the limit, Equation (18) becomes

$$F\left(z\right) = e^{\sum_A p_A\left(z - 1\right)} = e^{\mu\left(z - 1\right)}$$

where

$$\mu = \sum_A p_A$$

represents the expected number of default events in one year from the whole portfolio. $F\left(z\right)$ is expanded in its Taylor series in order to identify the distribution corresponding to this PGF:

$$F\left(z\right) = e^{\mu\left(z - 1\right)} = e^{-\mu}\sum_{n=0}^{\infty}\frac{\mu^n z^n}{n!} \qquad (22)$$

thus, considering small individual default probabilities, from Equation (22) the probability of realising $n$ default events in the portfolio in one year is given by:

$$P\left(n\ \mathrm{defaults}\right) = \frac{e^{-\mu}\mu^n}{n!}$$

where we obtain the Poisson distribution for the number of defaults. The distribution has only one parameter, the expected number of defaults $\mu$; it does not depend on the number of exposures in the portfolio or on the individual probabilities of default, provided that they are uniformly small. Real portfolio losses differ from the Poisson distribution: historical evidence shows that the standard deviation of default event frequencies is much larger than $\sqrt{\mu}$, the standard deviation of the Poisson distribution with mean $\mu$. We can express the expected loss in terms of the probability of default events as

$$\epsilon_j = \mu_j\, v_j \quad\Longleftrightarrow\quad \mu_j = \frac{\epsilon_j}{v_j}$$

where $v_j$ is the common exposure in exposure band $j$ (expressed in multiples of a unit of exposure $L$), $\epsilon_j$ is the expected loss in the band and $\mu_j$ is the expected number of defaults in the band. We can then derive the distribution of default losses, with $G\left(z\right)$ as the PGF for losses expressed in multiples of the unit of exposure $L$:

$$G\left(z\right) = \prod_j e^{\mu_j\left(z^{v_j} - 1\right)} = \exp\left(\sum_j \mu_j z^{v_j} - \mu\right)$$

The inputs that we include are therefore the average of the estimated probability of default calculated through the logistic regression and the relative volatility calculated across the pool cut-off dates. The exposure included in the model is calculated net of the recovery rates estimated by the bank. As stated previously, since the data needed to compute recovery rates directly are not available, we test the model with the bank's own recovery rate estimates. The mean and volatility values of the default probabilities are shown in Table 10. For the sake of completeness, we also report the mean and standard deviation of the default frequencies.
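The loss distribution implied by $G(z)$ can be computed with the standard CreditRisk+ recurrence for its power-series coefficients, $A_n = \frac{1}{n}\sum_{j: v_j \le n} \epsilon_j A_{n - v_j}$ with $A_0 = e^{-\mu}$; the toy band below (unit exposure, two expected defaults) is chosen so the recursion reproduces a Poisson distribution and is not the paper's portfolio:

```python
from math import exp

def creditriskplus_losses(bands, n_max):
    """CreditRisk+ loss distribution via the PGF recurrence.
    `bands` is a list of (v_j, mu_j): common exposure in units of L
    and expected number of defaults for each exposure band.
    Returns P(loss = n * L) for n = 0 .. n_max."""
    mu = sum(m for _, m in bands)
    probs = [exp(-mu)]  # A_0 = P(zero losses)
    for n in range(1, n_max + 1):
        a = 0.0
        for v, m in bands:
            eps = m * v  # expected loss of the band, in units of L
            if v <= n:
                a += eps * probs[n - v]
        probs.append(a / n)
    return probs

# Single band with v = 1 and mu = 2: the recursion must match Poisson(2).
p = creditriskplus_losses([(1, 2.0)], 5)
print(round(p[0], 4), round(p[1], 4))  # 0.1353 0.2707
```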

The model’s estimate on the historical data of the loans terminated in the first available pool cut-off date provides an indication of the expected loss of 2,661,592 Euro against a total exposure of

$48.92$ million Euro with a standard deviation of 670,422 Euro (

Table 11).

The real loss of the analysed portfolio, calculated on all terminated loans, is 2.10 million Euro, lower than the expected loss computed by the model but within the $EL-\sigma$ threshold. The expected loss estimated by the model is $5.44\%$ of the capital exposed to risk, which represents the outstanding amount net of recovery rates.

The analysis shows that the portfolio before the first pool cut-off date lost a total of $4.29\%$ of its value against an estimated loss of $5.44\%$. Even though the model, with the input data used, overestimates the expected loss, the realised loss is in the $EL-\sigma$ range; due to the small number of counterparties and the lack of homogeneity of the data, an estimation error is possible. With a view to analyzing future performance, only loans active at the last pool cut-off date are kept in the portfolio, and the estimates of PD and volatility are used as an approximation of the probability of future default. In the sample of 10,000 counterparties current at the last pool cut-off date, the capital exposed to the risk of loss is 247 million Euro, with an expected loss of $5.7$ million corresponding to $2.31\%$ of the total (Table 12). This means that after the last available report the portfolio would lose an additional $2.3\%$ of the capital exposed to risk before the withdrawal.

The average loss in the sample is $2.14\%$, while the estimated future loss over the pool cut-off dates is a further $2.31\%$. Figure 6a shows the loss distribution for terminated loans and Figure 6b illustrates the loss distribution for active exposures.

In accordance with the studies conducted by CRIF^8, an Italian company specialized in credit bureau and business information services, the default rates of Italian SMEs are around 6%, above those calculated in the analyzed sample. Assuming that recovery rates are similar to those of companies not included in the portfolio of securitized exposures, we can conclude that the loss profiles for securitized portfolios are less severe than for exposures retained on the bank's balance sheet and not securitized.