# SPI-Based Drought Classification in Italy: Influence of Different Probability Distribution Functions

## Abstract

## 1. Introduction

## 2. SPI Definition and Background

- compute the cumulative precipitation amounts for each $j$-th month, with $j=1,\dots,12$, using the $k$-th time scale;
- fit the Gamma distribution to the sample of each $j$-th calendar month with the Maximum Likelihood Estimation (MLE) method;
- estimate the cumulative probability associated with each precipitation value;
- compute the SPI values by transforming the cumulative probability obtained from the Gamma distribution through the inverse of the standard normal distribution function.
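
The four steps above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: the function names are hypothetical, the series is assumed to start in January, the Gamma parameters are obtained with Thom's closed-form approximation of the MLE rather than a full numerical MLE, and zero totals are handled with a mixed distribution (probability of zero plus Gamma for the non-zero part), which the list above does not specify.

```python
import math
from statistics import NormalDist, fmean


def rolling_totals(monthly, k):
    """Cumulative precipitation over k consecutive months (step 1)."""
    return [sum(monthly[i - k + 1:i + 1]) for i in range(k - 1, len(monthly))]


def fit_gamma_thom(sample):
    """Gamma shape and scale from Thom's approximation of the MLE (step 2).

    Assumption: all values are > 0 and not all identical.
    """
    mean = fmean(sample)
    a = math.log(mean) - fmean([math.log(x) for x in sample])
    shape = (1.0 + math.sqrt(1.0 + 4.0 * a / 3.0)) / (4.0 * a)
    return shape, mean / shape


def gamma_cdf(x, shape, scale, tol=1e-12, max_terms=10_000):
    """Gamma CDF via the power series of the regularised incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, max_terms):
        term *= z / (shape + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def spi_for_calendar_month(totals):
    """SPI for k-month totals that all end in the same calendar month."""
    wet = [x for x in totals if x > 0]
    p_dry = 1.0 - len(wet) / len(totals)  # assumed handling of zero totals
    shape, scale = fit_gamma_thom(wet)
    std_normal = NormalDist()
    spi = []
    for x in totals:
        prob = p_dry + (1.0 - p_dry) * gamma_cdf(x, shape, scale)  # step 3
        prob = min(max(prob, 1e-6), 1.0 - 1e-6)  # keep inv_cdf finite
        spi.append(std_normal.inv_cdf(prob))  # step 4
    return spi


def spi_series(monthly, k):
    """SPI-k for a monthly series assumed to start in January."""
    totals = rolling_totals(monthly, k)
    out = [None] * len(totals)
    for j in range(12):
        # totals[i] ends at index i + k - 1 of the original monthly series
        idx = [i for i in range(len(totals)) if (i + k - 1) % 12 == j]
        if not idx:
            continue
        values = spi_for_calendar_month([totals[i] for i in idx])
        for i, value in zip(idx, values):
            out[i] = value
    return out
```

Calling `spi_series(monthly, 3)` on a list of monthly totals would return one SPI3 value per complete 3-month window. Replacing `fit_gamma_thom` and `gamma_cdf` with another distribution's fit and CDF is all that would be needed to test a different probability model.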

## 3. Case Study and Data

#### 3.1. The Italian Climate Based on Köppen-Geiger

#### 3.2. Daily Rainfall Data

#### 3.3. Monthly Rainfall Main Statistics

## 4. Methodology

#### 4.1. Tested Probability Distributions

#### 4.2. Quantifying the Differences between the SPI Estimation Approaches

#### 4.3. Test of Normality: The Shapiro-Wilk Test

- Shapiro-Wilk statistic ($\omega$) lower than 0.96;
- p-value associated with $\omega$ lower than 0.10;
- absolute value of the SPI median greater than 0.05.
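
The three criteria above can be checked with a small helper. This is a minimal sketch with hypothetical names; it assumes that a series is flagged as non-normal only when all three conditions hold at once, which the list itself does not state, and it takes the Shapiro-Wilk statistic and p-value as inputs (they could come, for instance, from `scipy.stats.shapiro`).

```python
from statistics import median


def is_non_normal(spi_values, w_stat, p_value,
                  w_min=0.96, p_min=0.10, median_max=0.05):
    """Return (flag, checks) for the three non-normality criteria.

    Assumption: the series is flagged only if all three criteria are met.
    """
    checks = {
        "low_w": w_stat < w_min,                       # omega lower than 0.96
        "low_p": p_value < p_min,                      # p-value lower than 0.10
        "shifted_median": abs(median(spi_values)) > median_max,
    }
    return all(checks.values()), checks
```

Returning the individual checks alongside the combined flag makes it easy to switch to a different combination rule if the intended one differs from the assumption made here.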

#### 4.4. Critical Drought Intensity-Duration-Frequency (IDF) Curves

- SPI estimation for each time scale to distinguish dry and wet periods and to identify drought events;
- determination of the critical drought severity for all the identified drought events;
- frequency analysis of critical drought severity, identifying the best-fit probability distribution function for each sample. Here we used the first two Extreme Value distributions, namely the Gumbel and the Fréchet, whose parameters are determined with the MSEN method, which is also used to select the better-fitting of the two;
- calculation of the severity and/or intensity of the critical drought of a given duration for a fixed return period from the best-fit probability distribution (identified in the previous step);
- expression of drought intensity, for each return period, through a linear regression line: $I=\alpha D+\beta$, where $D$ is the duration (which varies between 1 month and the maximum observed), while $\alpha$ and $\beta$ are coefficients determined by curve fitting.
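
The last two steps can be illustrated with the sketch below. It is not the procedure used in the study: the function names are hypothetical, the Gumbel quantile is written for a non-exceedance probability of $1-1/T$ with location and scale parameters assumed to be already estimated, and the line $I=\alpha D+\beta$ is fitted by ordinary least squares as one possible form of curve fitting.

```python
import math


def gumbel_quantile(location, scale, return_period):
    """Gumbel quantile with non-exceedance probability 1 - 1/T."""
    non_exceedance = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(non_exceedance))


def fit_line(durations, intensities):
    """Ordinary least squares estimate of alpha and beta in I = alpha*D + beta."""
    n = len(durations)
    mean_d = sum(durations) / n
    mean_i = sum(intensities) / n
    sxx = sum((d - mean_d) ** 2 for d in durations)
    sxy = sum((d - mean_d) * (i - mean_i)
              for d, i in zip(durations, intensities))
    alpha = sxy / sxx
    beta = mean_i - alpha * mean_d
    return alpha, beta
```

For a given return period, one would evaluate the quantile for each duration from the fitted distribution of that duration, convert it to an intensity, and pass the resulting pairs to `fit_line` to obtain $\alpha$ and $\beta$.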

## 5. Results

#### 5.1. Fitting Performance

#### 5.2. Comparison of the SPI Values Estimated with the Two Different Approaches

#### 5.3. Drought Characteristics Comparison

#### 5.4. Normality Test

#### 5.5. Critical Drought IDF Curves

## 6. Discussion and Conclusions

- the Lognormal distribution was the model most often selected as best fit for almost all the month and time-scale combinations (Table 4), followed by the Weibull (for the 1-month scale) and the Gamma. The Normal distribution, as expected, was the best fit for only a small percentage of stations, confirming its poor ability to model positively skewed samples;
- the Pearson correlation coefficient ($\rho$), evaluated on the entire SPI signal, showed an almost perfect agreement between the SPI signals estimated by the two tested approaches for the 12-month time scale. Median $\rho$ values decrease as the SPI time scale shortens, down to 0.97 for SPI1 (Figure 2a);
- the Mean Bias Error (MBE) showed opposite behaviour for shorter and longer SPI time scales: for time scales up to 3 months, the BFA yielded more severe SPI values than the SA, while the opposite was observed for the 6- and 12-month scales;
- the Relative Error (RE), which highlights the differences between the two approaches in detecting both wet and dry periods, followed the behaviour of the MBE: the BFA tends to detect more extreme conditions than the SA for shorter scales (i.e., SPI1 and SPI3), and vice versa for SPI6 and SPI12;
- the patterns that emerged from the analysis of the entire SPI signal are reflected in the analysis of drought events (i.e., SPI ≤ −1). Generally, the SA under-estimates all the drought characteristics (i.e., number of events and number of drought months (Figure 4), duration, severity and interarrival time (Table 5)) for short time scales (up to 3 months), while it over-estimates them for longer time scales. Here the BFA is taken as the benchmark, since it is built to provide the best model of the empirical samples;
- the use of the BFA did not solve the SPI non-normality issue: the percentage of non-normal SPI series is higher than with the SA for almost all the months and time scales (Table 6);
- despite the differences in drought characteristics between the two approaches, the analyzed drought events fall in the same return period classes (Figure 6) for all the time scales.

**Figure 1.** Location of the 332 stations selected from the SCIA dataset. Each station is depicted with a different marker based on the Köppen-Geiger (KG) classification [37,38]. In particular, 53.8% of the stations belong to Csa-KG, 19.2% to Cfb-KG, 15% to Cfa-KG, 4.8% to Csb-KG, 3.3% to ET-KG, 3% to Dfc-KG, and 0.9% to Dfb-KG. All the information related to the stations is provided in Table S1 of the Supplementary Material.

**Figure 2.** Box plots representing the Pearson correlation coefficient (**a**) and the Mean Bias Error (**b**) between the SPI values estimated with the SA and the BFA for each time scale.

**Figure 3.** Box plots representing the Relative Error (RE) of the number of classes detected by the SPI estimated with the BFA and SA. The horizontal grey zero line represents the perfect match between the two approaches. Both the wet (blue shades) and dry (red shades) categories are depicted. The SPI classes (from 1 to 8) are ordered from the extremely wet to the extreme drought category, according to Table 1.

**Figure 4.** Variability of the differences ${\Delta}_{m}$ (**a**) and ${\Delta}_{e}$ (**b**) for each time scale.

**Figure 5.** SPI6 critical drought IDF curves for the CAPESTRANO-Idro station. Each shaded color represents a different return period (Rp) class, while each line is a critical drought IDF curve for a fixed return period.

**Figure 6.** Box plots representing the percentage of drought events detected by the two approaches lying in each return period (Rp) class for each SPI time scale.

**Table 1.** Event classification based on SPI values and their probability of occurrence [14].

| SPI Values | SPI Classes | Probability (%) |
|---|---|---|
| $\ge 2.0$ | Extremely wet | 2.3 |
| 1.5 to 2.0 | Severely wet | 4.4 |
| 1.0 to 1.5 | Moderately wet | 9.2 |
| 0 to 1.0 | Mildly wet | 34.1 |
| −1.0 to 0 | Mildly drought | 34.1 |
| −1.5 to −1.0 | Moderate drought | 9.2 |
| −2.0 to −1.5 | Severe drought | 4.4 |
| $\le -2.0$ | Extreme drought | 2.3 |
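
The classification in Table 1 can be expressed as a simple lookup, and the listed probabilities can be cross-checked against the standard normal distribution. The sketch below uses hypothetical function names; the table does not say which class owns each boundary value, so the code assumes that the more extreme class includes the boundary (for example, SPI = −1.0 counts as moderate drought).

```python
from statistics import NormalDist


def classify_spi(spi):
    """Map an SPI value to its Table 1 class (boundary handling is assumed)."""
    if spi >= 2.0:
        return "Extremely wet"
    if spi >= 1.5:
        return "Severely wet"
    if spi >= 1.0:
        return "Moderately wet"
    if spi >= 0.0:
        return "Mildly wet"
    if spi > -1.0:
        return "Mildly drought"
    if spi > -1.5:
        return "Moderate drought"
    if spi > -2.0:
        return "Severe drought"
    return "Extreme drought"


def class_probability(lower, upper):
    """Probability (%) that a standard normal value falls in [lower, upper)."""
    std_normal = NormalDist()
    return 100.0 * (std_normal.cdf(upper) - std_normal.cdf(lower))
```

For instance, `class_probability(1.5, 2.0)` evaluates the share of a standard normal variable between 1.5 and 2.0, which corresponds to the "Severely wet" row.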

**Table 2.** Mean values of the main statistics evaluated on the 332 samples for each $k$-th time scale (1, 3, 6 and 12 months) and for each $j$-th month. Apart from the probability dry (P_{dry}), these statistics are evaluated considering the non-zero monthly rainfall.

| | 1-month | | | | | | 3-months | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Month | P_{25} | P_{50} | P_{75} | SD | Skew | P_{dry} | P_{25} | P_{50} | P_{75} | SD | Skew | P_{dry} |
| 1 | 41.33 | 77.35 | 121.12 | 60.56 | 0.92 | 1.3% | 219.89 | 299.01 | 387.44 | 125.34 | 0.48 | 0% |
| 2 | 35.14 | 66.98 | 110.76 | 60.12 | 1.15 | 1.0% | 190.24 | 256.97 | 332.51 | 115.73 | 0.65 | 0% |
| 3 | 38.51 | 71.00 | 113.26 | 54.54 | 0.86 | 1.3% | 161.94 | 229.09 | 309.63 | 116.03 | 0.77 | 0% |
| 4 | 46.19 | 72.74 | 106.42 | 48.40 | 0.96 | 0.6% | 168.06 | 226.85 | 297.86 | 97.84 | 0.62 | 0% |
| 5 | 35.48 | 58.67 | 89.13 | 43.12 | 1.15 | 1.4% | 164.54 | 217.10 | 277.25 | 85.09 | 0.54 | 0% |
| 6 | 32.41 | 49.19 | 72.62 | 35.60 | 1.52 | 6.3% | 149.18 | 193.92 | 246.78 | 74.29 | 0.58 | 0% |
| 7 | 22.04 | 35.92 | 56.22 | 30.22 | 1.52 | 19.1% | 116.45 | 152.24 | 198.02 | 63.72 | 0.89 | 0% |
| 8 | 24.42 | 43.28 | 71.44 | 39.45 | 1.37 | 12.5% | 101.72 | 135.83 | 180.40 | 63.91 | 1.05 | 2% |
| 9 | 32.93 | 62.99 | 106.11 | 58.97 | 1.25 | 2.5% | 106.64 | 153.13 | 209.85 | 82.32 | 0.96 | 1% |
| 10 | 50.38 | 96.28 | 159.98 | 87.21 | 1.19 | 0.9% | 153.35 | 222.95 | 305.76 | 117.73 | 0.87 | 0% |
| 11 | 64.44 | 108.52 | 164.74 | 80.37 | 1.04 | 0.5% | 212.44 | 295.89 | 389.32 | 134.99 | 0.68 | 0% |
| 12 | 57.33 | 92.25 | 141.43 | 68.85 | 1.05 | 0.4% | 235.40 | 319.01 | 426.46 | 144.01 | 0.70 | 0% |

| | 6-months | | | | | | 12-months | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Month | P_{25} | P_{50} | P_{75} | SD | Skew | P_{dry} | P_{25} | P_{50} | P_{75} | SD | Skew | P_{dry} |
| 1 | 427.29 | 528.10 | 651.62 | 175.05 | 0.57 | 0% | 791.59 | 930.03 | 1090.20 | 217.87 | 0.48 | 0% |
| 2 | 455.31 | 560.20 | 676.80 | 173.94 | 0.57 | 0% | 796.53 | 929.64 | 1080.20 | 210.84 | 0.43 | 0% |
| 3 | 457.76 | 564.55 | 679.86 | 174.98 | 0.46 | 0% | 801.08 | 927.79 | 1075.85 | 207.39 | 0.43 | 0% |
| 4 | 431.31 | 533.26 | 643.87 | 160.70 | 0.51 | 0% | 801.33 | 926.70 | 1073.89 | 206.62 | 0.46 | 0% |
| 5 | 387.32 | 478.80 | 581.76 | 150.44 | 0.52 | 0% | 799.72 | 928.24 | 1073.12 | 208.99 | 0.48 | 0% |
| 6 | 345.38 | 431.96 | 531.22 | 142.70 | 0.50 | 0% | 799.64 | 930.59 | 1072.62 | 211.08 | 0.47 | 0% |
| 7 | 313.15 | 387.86 | 475.85 | 122.77 | 0.51 | 0% | 800.96 | 930.18 | 1072.33 | 211.03 | 0.50 | 0% |
| 8 | 294.92 | 362.40 | 435.90 | 106.68 | 0.52 | 0% | 804.86 | 931.36 | 1068.92 | 207.63 | 0.51 | 0% |
| 9 | 290.58 | 357.29 | 433.19 | 108.07 | 0.54 | 0% | 799.60 | 924.47 | 1069.48 | 211.10 | 0.55 | 0% |
| 10 | 302.38 | 384.35 | 481.85 | 136.38 | 0.70 | 0% | 793.61 | 923.40 | 1071.99 | 213.58 | 0.60 | 0% |
| 11 | 347.39 | 438.48 | 546.24 | 152.75 | 0.65 | 0% | 794.80 | 925.10 | 1078.94 | 212.20 | 0.49 | 0% |
| 12 | 383.97 | 487.93 | 605.93 | 170.45 | 0.58 | 0% | 794.46 | 930.71 | 1088.44 | 215.76 | 0.50 | 0% |

**Table 3.** Cumulative distribution functions of the four tested probability distributions.

| Probability Distribution | Cumulative Distribution Function | |
|---|---|---|
| Gamma ($\mathcal{G}$) | $F_{\mathcal{G}}(x)=\frac{1}{\Gamma(\gamma)}\int_{0}^{x}\beta^{-\gamma}t^{\gamma-1}\exp\left(-\frac{t}{\beta}\right)\mathrm{d}t$ | (1) |
| Lognormal ($\mathcal{L}\mathcal{N}$) | $F_{\mathcal{L}\mathcal{N}}(x)=\frac{1}{2}\,\mathrm{erfc}\left(-\frac{\ln(x)-\mu}{\sigma\sqrt{2}}\right)$ | (2) |
| Weibull ($\mathcal{W}$) | $F_{\mathcal{W}}(x)=1-\exp\left(-\left(\frac{x}{\beta}\right)^{\gamma}\right)$ | (3) |
| Normal ($\mathcal{N}$) | $F_{\mathcal{N}}(x)=\frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}\exp\left(-\frac{(t-\mu)^{2}}{2\sigma^{2}}\right)\mathrm{d}t$ | (4) |

**Table 4.** Percentage of stations where the four tested distributions are selected as best fit model, for each $k$-th time scale and each $j$-th month.

| | | Month | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Time scale (months) | Distribution | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| 1 | $\mathcal{G}$ | 17% | 30% | 17% | 21% | 27% | 18% | 21% | 26% | 18% | 18% | 20% | 23% |
| | $\mathcal{L}\mathcal{N}$ | 25% | 25% | 27% | 43% | 37% | 51% | 43% | 28% | 36% | 33% | 46% | 48% |
| | $\mathcal{W}$ | 34% | 33% | 25% | 23% | 27% | 26% | 27% | 36% | 34% | 26% | 20% | 23% |
| | $\mathcal{N}$ | 23% | 11% | 32% | 13% | 9% | 5% | 8% | 11% | 12% | 23% | 13% | 6% |
| 3 | $\mathcal{G}$ | 23% | 22% | 23% | 24% | 22% | 22% | 25% | 19% | 22% | 22% | 23% | 23% |
| | $\mathcal{L}\mathcal{N}$ | 39% | 53% | 51% | 42% | 37% | 41% | 49% | 45% | 41% | 45% | 44% | 40% |
| | $\mathcal{W}$ | 24% | 15% | 15% | 24% | 29% | 26% | 21% | 28% | 25% | 19% | 22% | 27% |
| | $\mathcal{N}$ | 14% | 10% | 11% | 10% | 12% | 11% | 5% | 8% | 12% | 14% | 11% | 11% |
| 6 | $\mathcal{G}$ | 24% | 16% | 19% | 15% | 15% | 22% | 20% | 19% | 22% | 22% | 17% | 21% |
| | $\mathcal{L}\mathcal{N}$ | 51% | 60% | 55% | 53% | 56% | 51% | 50% | 51% | 47% | 47% | 58% | 48% |
| | $\mathcal{W}$ | 17% | 15% | 16% | 20% | 20% | 19% | 22% | 21% | 22% | 18% | 17% | 26% |
| | $\mathcal{N}$ | 7% | 10% | 9% | 12% | 9% | 8% | 8% | 9% | 9% | 13% | 9% | 5% |
| 12 | $\mathcal{G}$ | 20% | 17% | 14% | 13% | 15% | 16% | 12% | 10% | 14% | 14% | 15% | 15% |
| | $\mathcal{L}\mathcal{N}$ | 52% | 56% | 57% | 57% | 61% | 62% | 61% | 64% | 61% | 60% | 55% | 55% |
| | $\mathcal{W}$ | 18% | 19% | 18% | 21% | 17% | 14% | 15% | 15% | 15% | 17% | 19% | 20% |
| | $\mathcal{N}$ | 10% | 8% | 11% | 9% | 6% | 8% | 11% | 11% | 10% | 8% | 10% | 9% |

**Table 5.** Main statistics of drought characteristics, i.e., duration (D), severity (S), and interarrival time (T) for the standard and the best fit approaches.

| | | Standard Approach | | | | Best Fit Approach | | | |
|---|---|---|---|---|---|---|---|---|---|
| Characteristic | Statistic | SPI1 | SPI3 | SPI6 | SPI12 | SPI1 | SPI3 | SPI6 | SPI12 |
| D (month) | min | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| | mean | 1.20 | 1.96 | 2.77 | 4.21 | 1.21 | 1.90 | 2.59 | 3.66 |
| | max | 7.00 | 14.00 | 26.00 | 51.00 | 8.00 | 14.00 | 23.00 | 46.00 |
| | sd | 0.49 | 1.31 | 2.34 | 4.69 | 0.51 | 1.27 | 2.14 | 4.14 |
| S (mm) | min | −1.00 | −1.00 | −1.00 | −1.00 | −1.00 | −1.00 | −1.00 | −1.00 |
| | mean | −1.89 | −3.06 | −4.27 | −6.29 | −2.08 | −3.13 | −4.07 | −5.43 |
| | max | −11.59 | −26.31 | −51.66 | −77.97 | −14.15 | −27.05 | −44.71 | −92.12 |
| | sd | 0.96 | 2.47 | 4.30 | 8.22 | 1.31 | 2.62 | 4.13 | 7.32 |
| T (month) | min | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | mean | 8.26 | 11.36 | 15.52 | 22.31 | 7.94 | 11.42 | 16.26 | 24.17 |
| | max | 73.00 | 147.00 | 238.00 | 412.00 | 74.00 | 225.00 | 313.00 | 375.00 |
| | sd | 6.57 | 9.72 | 16.15 | 31.04 | 6.35 | 9.94 | 17.30 | 34.36 |

**Table 6.** Percentage (%) of non-normal SPI series detected by the Shapiro-Wilk test for both approaches and for each $k$-th time scale and each month.

| | | Month | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Approach | Time scale | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| SA | SPI1 | 2.4 | 1.2 | 3.6 | 4.8 | 4.8 | 11.4 | 22.0 | 18.7 | 3.6 | 4.2 | 4.5 | 3.9 |
| | SPI3 | 2.7 | 2.1 | 1.8 | 4.5 | 1.8 | 4.5 | 1.2 | 2.1 | 0.6 | 4.5 | 3.0 | 2.7 |
| | SPI6 | 1.8 | 2.1 | 1.2 | 2.1 | 2.1 | 3.6 | 5.1 | 3.6 | 1.8 | 2.7 | 3.0 | 1.8 |
| | SPI12 | 1.2 | 2.4 | 1.5 | 2.1 | 3.9 | 4.2 | 3.3 | 4.2 | 2.4 | 2.1 | 1.8 | 2.7 |
| BFA | SPI1 | 9.3 | 9.6 | 16.3 | 9.0 | 11.1 | 20.8 | 21.1 | 28.9 | 12.0 | 15.4 | 15.4 | 12.3 |
| | SPI3 | 6.0 | 3.9 | 8.1 | 10.5 | 6.6 | 9.0 | 6.9 | 5.4 | 7.5 | 8.4 | 10.8 | 13.0 |
| | SPI6 | 5.4 | 4.8 | 5.4 | 5.4 | 4.5 | 5.7 | 4.8 | 5.7 | 4.2 | 4.2 | 5.7 | 3.3 |
| | SPI12 | 3.3 | 3.6 | 4.2 | 4.5 | 5.7 | 4.5 | 4.5 | 5.1 | 4.2 | 6.0 | 4.8 | 3.9 |

**Table 7.** Delta values evaluated for the two coefficients $\alpha$ and $\beta$ of the regression lines.

| | Δα | | | | Δβ | | | |
|---|---|---|---|---|---|---|---|---|
| Return period | SPI1 | SPI3 | SPI6 | SPI12 | SPI1 | SPI3 | SPI6 | SPI12 |
| 2 | −0.95 | 1.12 | 0.43 | −0.97 | 0.47 | 0.42 | −0.94 | −3.03 |
| 5 | 3.72 | 3.57 | 1.77 | 0.48 | 3.73 | 2.09 | 0.16 | −1.96 |
| 10 | 7.72 | 5.75 | 3.16 | 1.96 | 6.48 | 3.39 | 0.96 | −1.31 |
| 25 | 13.53 | 8.76 | 5.06 | 3.76 | 10.56 | 5.23 | 2.00 | −0.52 |
| 50 | 17.85 | 11.09 | 6.48 | 5.11 | 13.87 | 6.75 | 2.80 | 0.11 |
| 100 | 21.84 | 13.35 | 7.83 | 6.39 | 17.27 | 8.44 | 3.66 | 0.83 |

