A Study on the Appropriateness of the Drought Index Estimation Method Using Damage Data from Gyeongsangnamdo, South Korea

Drought is one of the disasters that causes the most extensive and severe damage. Therefore, drought prevention must be performed for administrative districts at the national level rather than at the individual level. This study proposes a drought index estimation method for Gyeongsangnamdo, South Korea, and evaluates its appropriateness through a comparison with damage data over several years. The standardized precipitation index (SPI) by duration was used as the drought index, and it was estimated for 13 rainfall stations located inside and outside Gyeongsangnamdo using the Thiessen method and cluster analysis. To determine its appropriateness, the SPI of Gyeongsangnamdo by duration obtained with each method for the years when drought damage occurred was compared with an SPI value of −2.0, which is the criterion for extreme drought. The mean absolute deviation (MAD), mean squared error (MSE), and root mean square error (RMSE) were used as performance indicators. The results showed that the SPI by duration based on cluster analysis was more appropriate for the multi-year damage data than that based on the Thiessen method.


Introduction
Drought is a natural disaster characterized by a lack of precipitation (i.e., rain, snow, or sleet) for a protracted period (i.e., more than 3 to 12 months), resulting in water shortages that greatly affect a wide range of socioeconomic sectors, including agriculture, daily life, and industry. In recent years, accelerating climate change has altered the intensity, spatial extent, frequency, duration, and timing of weather and climate extremes, worsening drought conditions across vast portions of the world; these conditions vary in frequency, duration, and severity by climatic zone [1][2][3]. South Korea in particular has faced continuous severe droughts, which are normally concentrated in spring and autumn, and the resulting damage varies with regional characteristics. In addition, approximately 60% of the annual average rainfall occurs in summer, indicating the necessity for proper management of the supply and distribution of water resources, especially during drought. To prevent drought impacts, drought risk assessment and drought pattern identification need to be implemented so that prevention and response systems can be established for each administrative district rather than through point-based planning.
Various drought indices have been developed for drought evaluation, including the standardized precipitation index (SPI), standardized precipitation evapotranspiration index (SPEI), reconnaissance drought index (RDI), Palmer drought severity index (PDSI), and effective drought index (EDI) [4][5][6][7][8]. Among them, SPI can evaluate drought using only precipitation, is less complex to calculate, and is more comparable across regions with

SPI
SPI was developed by McKee et al. [4,72] to analyze the magnitude of drought, on the premise that drought is caused by a lack of water supply when precipitation decreases while demand increases. The SPI by duration was analyzed by estimating hourly or monthly cumulative precipitation time series. The cumulative probability of each accumulated rainfall value was then estimated from the rainfall time series by duration for each month and mapped onto the standard normal distribution. The drought indices for 3, 6, 9, and 12 months were estimated for each time scale based on the periodic distribution of precipitation at the corresponding observation point using the gamma probability density function.
Atmosphere 2021, 12, 998
Moreover, the SPI was used to identify the spatiotemporal variability of drought. Table 1 shows the drought classification. The SPI parameters were estimated using the maximum likelihood method. The cumulative probability of rainfall events for the time interval of the target point was analyzed using the parameters of the gamma probability density function in Equation (1).
where α is the shape parameter, β is the scale parameter, and x is the rainfall at each observation point accumulated over the designated time scale, which is the variable of the gamma probability density function. The estimates of α and β can be calculated using Equations (2) and (3), respectively, where $F = \ln(\bar{x}) - \frac{1}{n}\sum \ln(x)$ and n is the number of precipitation observations. The obtained parameters are applied to the cumulative probability distribution function defined in Equation (4).
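For reference, the definitions above match the standard gamma-based SPI formulation. Assuming that Equations (1) to (4) follow that formulation (the equation numbering, the hat notation for estimates, and the symbol G(x) for the cumulative probability are assumptions made here), they can be written as:

```latex
\begin{align}
g(x) &= \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, \qquad x > 0 \tag{1}\\
\hat{\alpha} &= \frac{1}{4F}\left(1 + \sqrt{1 + \frac{4F}{3}}\right) \tag{2}\\
\hat{\beta} &= \frac{\bar{x}}{\hat{\alpha}} \tag{3}\\
G(x) &= \int_{0}^{x} g(t)\,dt
      = \frac{1}{\hat{\beta}^{\hat{\alpha}}\,\Gamma(\hat{\alpha})}
        \int_{0}^{x} t^{\hat{\alpha}-1} e^{-t/\hat{\beta}}\,dt \tag{4}
\end{align}
% F = \ln(\bar{x}) - \frac{1}{n}\sum \ln(x), with \bar{x} the mean of the n observations
```

Equations (2) and (3) in this form are Thom's closed-form approximation to the maximum likelihood estimates, which is the variant commonly paired with the statistic F.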
The gamma distribution is not defined for x = 0, but there are cases where the precipitation is zero. Thus, the cumulative probability is given by $H(x) = q + (1 - q)G(x)$, where G(x) is the cumulative probability obtained from Equation (4) and q is the probability that the precipitation is zero. The probability of no rainfall, q, can be expressed as q = m/n, where m is the number of observations with no rainfall and n is the total number of observations. The cumulative probability H(x) is then converted to a random variable Z of the standard normal distribution, with a mean of zero and a variance of 1, through an approximation that uses the constants $C_0 = 2.515517$, $C_1 = 0.802853$, $C_2 = 0.010328$, $d_1 = 1.432788$, $d_2 = 0.189269$, and $d_3 = 0.001308$. Here, x is the precipitation and H(x) is the cumulative probability of the observed precipitation value.
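The chain from accumulated precipitation to SPI can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes Thom's approximation for the gamma parameters, a series expansion for the cumulative gamma probability, and the widely used rational approximation to the standard normal quantile that the constants above belong to. All function names are hypothetical.

```python
import math

# Constants quoted in the text for the normal approximation.
C0, C1, C2 = 2.515517, 0.802853, 0.010328
D1, D2, D3 = 1.432788, 0.189269, 0.001308


def fit_gamma(values):
    """Estimate (alpha, beta) from the non-zero values (Thom's approximation, assumed)."""
    positive = [v for v in values if v > 0]
    mean = sum(positive) / len(positive)
    f = math.log(mean) - sum(math.log(v) for v in positive) / len(positive)
    alpha = (1 + math.sqrt(1 + 4 * f / 3)) / (4 * f)  # undefined if all values are equal
    return alpha, mean / alpha


def gamma_cdf(x, alpha, beta):
    """Cumulative gamma probability G(x) from the incomplete gamma series."""
    if x <= 0:
        return 0.0
    z = x / beta
    term = total = 1.0 / alpha
    k = 1
    while term > 1e-16 * total and k < 100000:
        term *= z / (alpha + k)
        total += term
        k += 1
    return total * math.exp(alpha * math.log(z) - z - math.lgamma(alpha))


def spi_from_probability(h):
    """Convert a cumulative probability H(x) in (0, 1) to a standard normal value Z."""
    if h <= 0.5:
        t, sign = math.sqrt(math.log(1.0 / h ** 2)), -1.0
    else:
        t, sign = math.sqrt(math.log(1.0 / (1.0 - h) ** 2)), 1.0
    ratio = (C0 + C1 * t + C2 * t * t) / (1 + D1 * t + D2 * t * t + D3 * t ** 3)
    return sign * (t - ratio)


def spi(values):
    """SPI for each accumulated precipitation value in one calendar-month series."""
    q = sum(1 for v in values if v == 0) / len(values)  # probability of zero precipitation
    alpha, beta = fit_gamma(values)
    result = []
    for v in values:
        h = q + (1 - q) * gamma_cdf(v, alpha, beta)
        h = min(max(h, 1e-12), 1 - 1e-12)  # guard added here to keep the logarithm finite
        result.append(spi_from_probability(h))
    return result
```

In practice the function `spi` would be applied separately to each calendar month and each duration, since the gamma parameters are fitted per month.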

Thiessen Method
The Thiessen method divides a plane into Thiessen polygons, a structure that is also referred to as a Voronoi diagram. It is closely related to the Delaunay triangulation, a triangular network in which the circumscribed circle of each triangle connecting three nodes contains no other points, because the polygons are formed from the perpendicular bisectors of the sides of these triangles, which allows spatial calculations on a set of irregular points [73]. Figure 1a shows the Delaunay triangles JLO, OLU, ULV, VLW, and WLJ, which are connected with each other and whose circumscribed circles contain no other points arranged on the plane [74]. A Thiessen polygon converts irregularly arranged points into a structure based on a fixed principle: the polygon assigned to an arbitrary point L contains all locations that are closer to L than to points J, O, U, V, and W, and it is bounded by the perpendicular bisectors of segments LJ, LO, LU, LV, and LW. Points J, O, U, V, and W are referred to as the Thiessen neighbors of point L. The construction of Thiessen polygons is shown in Figure 1b.
The Thiessen polygon method considers the influence of irregularly located rainfall stations on the target area during flood estimation in a watershed. The areal average rainfall is obtained by weighting the rainfall observed at each station by the area of that station's polygon. The ratio of each polygon area to the total area, $a_i = A_i/A$, becomes the weight, as shown in Equation (11):
\[ P_m = \frac{P_1 A_1 + P_2 A_2 + \cdots + P_n A_n}{A} = \sum_{i=1}^{n} a_i P_i \tag{11} \]
where $P_m$ is the average rainfall of the watershed, $P_1, \ldots, P_n$ are the rainfall values observed at the n stations in the watershed, $A_1, \ldots, A_n$ are the areas assigned to each observation point, and $A = A_1 + A_2 + \cdots + A_n$ is the total area.
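The area-weighted average in Equation (11) can be illustrated with a short Python sketch. The station names, polygon areas, and rainfall values below are made up for illustration; constructing the polygons themselves is outside the scope of this sketch.

```python
def thiessen_weights(areas):
    """Return the weight a_i = A_i / A for each station from its polygon area."""
    total = sum(areas.values())
    return {station: area / total for station, area in areas.items()}


def areal_average(rainfall, areas):
    """Area-weighted average rainfall P_m over the stations in `areas`."""
    weights = thiessen_weights(areas)
    return sum(weights[station] * rainfall[station] for station in areas)


# Hypothetical polygon areas (km^2) and observed rainfall (mm) for three stations.
areas = {"A": 200.0, "B": 300.0, "C": 500.0}
rainfall = {"A": 10.0, "B": 20.0, "C": 30.0}
average = areal_average(rainfall, areas)  # 0.2 * 10 + 0.3 * 20 + 0.5 * 30
```

When the set of available stations changes over time, the polygons and therefore the areas have to be recomputed for each period before the weights are derived.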

Cluster Analysis
The k-means method of cluster analysis is an algorithm proposed by MacQueen for classifying experimental results or sample data according to certain properties [75,76]. Homogeneous patterns are classified into k clusters, and the average of each cluster is used as its central value.
In this method, objects that are close to each other are grouped by measuring the degree of similarity or dissimilarity between them when k variables are measured for n data. The i-th observed value of y among the n data is denoted as the vector $y_i$. It is assumed that each $y_i$ (i = 1, 2, ..., n) belongs to only one of the g groups (the k clusters of the k-means method), and the averages of the groups are expressed as $\mu_1, \mu_2, \ldots, \mu_g$, as shown in Equation (12). The set of observed values that belong to the i-th group is denoted as $C_i$, and the classification $C_1, C_2, \ldots, C_g$ that minimizes the within-group sum of squared distances is shown in Equation (13).
Clusters are created based on the proximity of each data point once the n data are initially divided into k clusters. The center of each cluster is updated repeatedly as data within its range are separated from and combined with it. The cluster analysis procedure is shown in Table 2.

Table 2. Cluster analysis procedure.
Step      Contents
Step 1    Initial k clusters are selected from n data.
Step 2    Data are assigned to the nearest of the k clusters.
Step 3    k clusters are created arbitrarily, and the initial values are estimated for the average of each cluster, i.e., µ1, µ2, ..., µg.
Step 4    The average of the n data in the k clusters is calculated.
Step 5    Steps 3 and 4 are repeated until there is no significant change in the average.
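The iterative procedure described above can be sketched as a plain Python implementation of the usual assign-and-update loop. This is an illustrative sketch only: the initial centers are passed in explicitly so that the result is reproducible, whereas the procedure in the text chooses them arbitrarily, and the function names and tolerance are assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centers, max_iter=100, tol=1e-9):
    """Assign points to the nearest center and update centers until they stop moving."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center becomes the average of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center unchanged
        shift = max(squared_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:  # no significant change in the averages
            break
    return labels, centers


# Hypothetical one-dimensional data with three well-separated groups.
points = [(0.0,), (0.2,), (5.0,), (5.2,), (10.0,), (10.4,)]
labels, centers = kmeans(points, [(0.0,), (5.0,), (10.0,)])
```

Because k-means converges to a local optimum, different initial centers can give different partitions; running it from several starting points is a common precaution.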

Drought Damage Status and Target Area Selection
In recent years, increased temperature, reduced rainfall, and increased evapotranspiration have been observed due to the influence of climate change, resulting in drought damage worldwide, including in the United States, Australia, Europe, and Africa [77]. In South Korea, drought damage occurred frequently before the 1980s, when economic development was beginning. After this period of economic development, significant drought damage has still occurred every five or ten years despite the construction of dams, reservoirs, and water supply facilities.
Before 2000, the status of damage in South Korea was investigated, and reports were prepared for administrative districts where large-scale drought damage occurred. Reports on administrative districts have been published each year since 2010 because of the constant occurrence of various disasters, including drought. The status of drought damage that occurred from 1965 to 2019 is recorded in the "drought record survey report (1995 and 2001)" and "abnormal weather report (2010 to 2019)" for administrative districts.
Drought damage occurred between 1 and 16 times in each administrative district in South Korea, with an average of 4 times, except in Daejeon and Ulsan, where no damage was reported. It occurred most frequently in Gyeongsangnamdo (16 times).
In this study, Gyeongsangnamdo, where drought damage occurred most frequently among the 17 administrative districts in South Korea, was selected as the target area (Figure 3). Moreover, precipitation is used as the analysis factor because drought damage develops when the water supply is insufficient to meet a large demand. Precipitation varies depending on regional characteristics, and the rainfall stations that affect the target area are located both inside and outside the area. Thirteen rainfall stations affect the target area; ten of them are located inside the area and three are outside.
Figure 4 shows that, of the 27 times drought damage occurred in South Korea, 16 were experienced in Gyeongsangnamdo. The blue bars indicate the years in which drought damage occurred, and the red line is the cumulative number of drought damage occurrences. Drought in Gyeongsangnamdo lasted for two to three years, with the additional occurrence of short-term droughts that lasted for a year, resulting in more serious drought damage compared to other regions. A total of 50% of the drought damage occurred before 1980; since then, it has occurred periodically every five or ten years. Although the number of drought damage occurrences has decreased through various water resource policies, the severity of drought was found to have increased significantly.

SPI Analysis
In this study, SPI, a method that analyzes drought using rainfall among various meteorological factors, was selected to identify the spatiotemporal scale and drought situation in Gyeongsangnamdo because it is easy to calculate and is the most widely used index. There are 13 rainfall stations for Gyeongsangnamdo, at which 57 types of weather data are observed, including rainfall, temperature, wind speed, and humidity. Rainfall stations were installed according to the importance and applicability of the region, and the observation of weather data begins on different dates. Table 3 shows the station index, station name, and observation date for the 13 rainfall stations.
The observation dates of each rainfall station must be unified for drought analysis in Gyeongsangnamdo using the SPI. Four stations began observations between 1965 and 1970, seven stations between 1970 and 1980, and two stations after 1980. Therefore, in this study, SPI was analyzed for the rainfall period from 1973 to 2019 so that at least 30 years of rainfall observation data could be compared with more than ten years of drought damage. Among the rainfall stations, Changwon and Jangju were analyzed based on their own observation dates.
The duration of SPI was divided into 3, 6, 9, and 12 months, considering that droughts in South Korea occur in spring and autumn, on either side of the summer monsoon season. The SPI analysis period of each rainfall station for Gyeongsangnamdo was therefore 47 years, from January 1973 to December 2019, and the analysis was conducted for each of these four durations. Figure 5 shows the results of the SPI analysis by duration for each rainfall station. The SPI ranges for the 13 rainfall stations are as follows: SPI3 ranged from −7.08 to 4.09, SPI6 ranged from −3.94 to 3.64, SPI9 ranged from −3.42 to 3.84, and SPI12 ranged from −2.87 to 3.75.
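The SPI by duration is computed from precipitation accumulated over the preceding 3, 6, 9, or 12 months. A minimal Python sketch of that accumulation step is given below; the function name and the choice to mark incomplete windows with `None` are assumptions made for illustration.

```python
def accumulate(monthly, months):
    """Running total over the last `months` values; None until a full window exists."""
    totals = []
    for i in range(len(monthly)):
        if i + 1 < months:
            totals.append(None)  # not enough preceding months yet
        else:
            totals.append(sum(monthly[i + 1 - months:i + 1]))
    return totals


# Hypothetical monthly precipitation (mm) and its 3-month accumulation.
monthly = [1, 2, 3, 4, 5]
three_month = accumulate(monthly, 3)  # [None, None, 6, 9, 12]
```

The accumulated series for each duration is then grouped by calendar month before the gamma distribution is fitted.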
Averaged over the stations, SPI3 ranged from −3.15 to 2.91, and both the minimum (−7.08) and the maximum (4.09) were observed at the Geochang station (no. 284). SPI6 ranged from −2.65 to 2.69 on average; the minimum was observed at the Sancheong station (no. 289), where SPI6 ranged from −3.94 to 3.49, and the maximum at the Tongyeong station (no. 162), where it ranged from −3.00 to 3.64. SPI9 ranged from −2.49 to 2.59 on average; the minimum was observed at the Ulsan station (no. 152), where SPI9 ranged from −3.42 to 2.98, and the maximum at the Geoje station (no. 294), where it ranged from −2.36 to 3.84. SPI12 ranged from −2.26 to 2.79 on average; the minimum was observed at the Geochang station (no. 284), where SPI12 ranged from −2.87 to 3.45, and the maximum at the Geoje station (no. 294), where it ranged from −2.38 to 3.75. The SPI analysis results for each rainfall station showed that the range between the minimum and maximum values and the magnitude of the drought index decreased as the duration increased.

Drought Index Analysis Using the Thiessen Method
In this study, the Thiessen method was used to estimate the areal average rainfall at each station by creating a Thiessen polygon based on the 13 rainfall stations affecting Gyeongsangnamdo shown in Figure 6 and calculating the area ratio of the Thiessen polygon as a weight. In various studies, Thiessen polygons have been applied as a method to analyze the meteorological factors of a watershed or administrative district [10,58-70].
The rainfall stations for Gyeongsangnamdo were analyzed starting from January 1973, but observations at the Changwon (no. 155) and Geochang (no. 284) stations started in January 1988. Because the observation dates differed, the area ratio of each Thiessen polygon was calculated as the weight for 11 stations before 1988 and for 13 stations after 1988. Table 4 lists the weights of the rainfall stations according to the analysis period. The Changwon (no. 155) and Geochang (no. 284) stations represented approximately 20% of the total weight, and the weights after 1988 were reduced by 0.6 to 3.8% compared to those before 1988.
The SPI by duration of Gyeongsangnamdo was calculated by applying the area weight of the Thiessen polygon for each rainfall station, as shown in Figure 7. SPI3 ranged from −3.09 to 2.76, with quartiles of −0.53 for Q1, 0.07 for Q2, and 0.68 for Q3. SPI6 ranged from −2.49 to 2.25, with quartiles of −0.61 for Q1, 0.13 for Q2, and 0.66 for Q3. SPI9 ranged from −2.15 to 2.07, with quartiles of −0.61 for Q1, 0.15 for Q2, and 0.65 for Q3. SPI12 ranged from −2.36 to 2.59, with quartiles of −0.54 for Q1, 0.19 for Q2, and 0.62 for Q3. The "extremely dry" condition, which is an SPI of −2 or less, was found to occur ten times for SPI3, six times for SPI6, once for SPI9, and three times for SPI12.

Analysis of the Drought Index Using Cluster Analysis
The SPI by duration was estimated for the 13 rainfall stations affecting Gyeongsangnamdo using cluster analysis. The cluster analysis method used in this study is the k-means method, an unsupervised learning method proposed by MacQueen that divides the given data into k clusters [76]. For the k-means method, the analyst must set the number of clusters in advance. Among the various methods for setting it, this study determined the number graphically using a function in R, a statistical software program.
The monthly SPI of Gyeongsangnamdo for the 13 stations was analyzed with the number of clusters set from two to six. Figure 8 shows the D index and Best.partition results used to set the appropriate number of clusters for the SPI by duration. For the D index, the number of clusters at which the slope of the curve decreased sharply was selected as the appropriate number, and this change in slope was largest at three. For Best.partition, the number of clusters with the highest value was selected, and the value was the highest (more than 40%) at three. Therefore, in this study, the number of clusters for the k-means method was set to three.
The SPI by duration of Gyeongsangnamdo based on cluster analysis was analyzed using three clusters, as shown in Figure 9. Across the three clusters, the minimum value of SPI3 ranged from −3.30 to −2.96 and the maximum value ranged from 1.80 to 2.96. The minimum value of SPI6 ranged from −2.94 to −2.37 and the maximum value ranged from 1.69 to 2.57. The minimum value of SPI9 ranged from −2.37 to −2.02 and the maximum value ranged from 1.82 to 2.82. The minimum value of SPI12 ranged from −2.50 to −1.95 and the maximum value ranged from 1.72 to 2.97. Over all durations, the minimum value ranged from −3.30 to −1.95 and the maximum value ranged from 1.69 to 2.97, so the spread across the cluster results was 1.35 for the minimum value and 1.28 for the maximum value.
As for the SPI of Gyeongsangnamdo by duration, the minimum value among the three clusters analyzed was set as the representative SPI by duration (Figure 10). The SPI3 of Gyeongsangnamdo ranged from −3.30 to 1.80, with quartiles of −0.78 for Q1, −0.18 for Q2, and 0.44 for Q3. SPI6 ranged from −2.94 to 1.69, with quartiles of −0.88 for Q1, −0.18 for Q2, and 0.40 for Q3. SPI9 ranged from −2.37 to 1.82, with quartiles of −0.95 for Q1, −0.20 for Q2, and 0.39 for Q3. SPI12 ranged from −2.50 to 1.72, with quartiles of −0.97 for Q1, −0.28 for Q2, and 0.38 for Q3. In addition, the "extremely dry" condition, for which SPI is −2 or less, was found to occur 20 times for SPI3, 19 times for SPI6, 14 times for SPI9, and 10 times for SPI12.
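Taking the minimum across the three clusters and counting "extremely dry" months can be sketched in Python as below. The sketch assumes that each cluster is summarized by one monthly SPI series of equal length; the function names and the sample values are hypothetical.

```python
def representative_spi(cluster_series):
    """Month-by-month minimum across the cluster SPI series."""
    return [min(values) for values in zip(*cluster_series)]


def count_extremely_dry(series, threshold=-2.0):
    """Number of months whose SPI is at or below the extreme-drought threshold."""
    return sum(1 for value in series if value <= threshold)


# Hypothetical SPI series for three clusters over three months.
clusters = [
    [-0.5, -2.1, 0.3],
    [-1.0, -1.8, 0.9],
    [0.2, -2.4, -0.1],
]
representative = representative_spi(clusters)  # [-1.0, -2.4, -0.1]
dry_months = count_extremely_dry(representative)  # 1
```

Choosing the minimum makes the representative series lean toward the driest cluster, which is consistent with the larger number of "extremely dry" months reported for cluster analysis than for the Thiessen method.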

Examination of Drought Damage and the Appropriateness of the Drought Index
In this study, SPI by duration was estimated for Gyeongsangnamdo using the Thiessen method and cluster analysis. The drought damage data were compiled from reports that record the damage status per year, so the start and end times of each drought are not clear. Therefore, the minimum SPI value by year was calculated for the quantitative evaluation of SPI by duration against the drought damage data. Figure 11a shows the minimum value of SPI by year for the durations of 3, 6, 9, and 12 months. SPI by year and duration ranged from −3.09 to 1.06 for the Thiessen method and from −3.30 to 1.30 for cluster analysis, showing that cluster analysis produced a wider range, with a lower minimum and a higher maximum, than the Thiessen method.
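Reducing a monthly SPI series to one minimum value per year can be sketched in Python as follows. The sketch assumes that the series starts in January of `start_year` and has no missing months; the function name and sample values are hypothetical.

```python
def annual_minimum(monthly_spi, start_year):
    """Map each year to the lowest monthly SPI observed in that year."""
    minima = {}
    for index, value in enumerate(monthly_spi):
        year = start_year + index // 12  # assumes the series starts in January
        minima[year] = min(value, minima.get(year, value))
    return minima


# Hypothetical two-year series with one dry month in each year.
series = [0.1] * 12 + [-0.5] * 12
series[3] = -1.2
series[15] = -2.3
by_year = annual_minimum(series, 1973)  # {1973: -1.2, 1974: -2.3}
```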
In Gyeongsangnamdo, drought damage occurred 12 times between 1973 and 2019 (i.e., in 1973, 1975, 1976, 1977, 1981, 1982, 1994, 1995, 2000, 2013, 2016, and 2017). However, there is no report showing the severity, range, and duration of drought in these years. Moreover, the maximum duration of drought in South Korea does not exceed 12 months since the monsoon season occurs during the summer. Therefore, SPI by duration was compared for 12 months or less. Figure 11b shows the SPI by duration when drought damage occurred in Gyeongsangnamdo. The SPI by duration based on the Thiessen method ranged from −2.41 to 0.07 in terms of years.
The severity of drought is divided into seven categories according to the SPI. An SPI of −1.00 indicates the start of drought, and values of −2.00 or less indicate extreme drought. It is not possible to determine accurately which drought categories cause drought damage, but an SPI value of −2.00 is evaluated as drought in various papers. Therefore, in this study, the SPI by duration analyzed using the Thiessen method and cluster analysis was compared with an SPI value of −2.00, which is the criterion for extremely dry conditions, to determine its appropriateness. The mean absolute deviation (MAD), mean squared error (MSE), and root mean square error (RMSE), which are performance indicators used in time-series analysis, were used to determine the appropriateness. The closer these indicators are to zero, the higher the appropriateness. Table 5 shows the results of analyzing drought damage by year and the appropriateness of the SPI by duration analyzed using the Thiessen method and cluster analysis based on MAD, MSE, and RMSE. For the Thiessen method, the indicator values of the SPI by duration for drought damage by year were higher than 0.5 and lower than 1.0, and the accuracy of each performance indicator was high for SPI3 and SPI6. For cluster analysis, the indicator values were higher than 0.3 and lower than 0.8, and the accuracy of each performance indicator was high for SPI3 and SPI9. The results show that cluster analysis exhibited higher accuracy than the Thiessen method, indicating that the cluster analysis method estimates SPI by duration for drought damage more accurately.
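The comparison against the −2.00 criterion can be sketched as follows. This section does not state the indicator formulas, so the sketch assumes the usual definitions, with each deviation taken as the SPI of a damage year minus the reference value. The function name `appropriateness` and the sample values are hypothetical and are not taken from Table 5.

```python
import math

EXTREME_DROUGHT_SPI = -2.0  # reference criterion for extremely dry conditions


def appropriateness(spi_values, reference=EXTREME_DROUGHT_SPI):
    """Return (MAD, MSE, RMSE) of SPI values measured against a fixed reference.

    Assumed definitions: MAD is the mean of |SPI - reference|, MSE is the
    mean of (SPI - reference)^2, and RMSE is the square root of MSE.
    """
    if not spi_values:
        raise ValueError("spi_values must not be empty")
    deviations = [s - reference for s in spi_values]
    n = len(deviations)
    mad = sum(abs(d) for d in deviations) / n
    mse = sum(d * d for d in deviations) / n
    return mad, mse, math.sqrt(mse)


# Hypothetical SPI values for three damage years (not data from the study).
example_spi = [-2.0, -1.0, -3.0]
mad, mse, rmse = appropriateness(example_spi)
print(f"MAD={mad:.3f} MSE={mse:.3f} RMSE={rmse:.3f}")
```

Under these definitions, an SPI series that sits exactly at −2.00 in every damage year gives zero for all three indicators, which is why values closer to zero are read as higher appropriateness.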

Discussion
Drought is a disaster that needs to be prevented at the national level, and in many countries disaster management plans are proposed by administrative district. However, in most studies, the drought index was analyzed at observation stations and estimated by the Thiessen method, in which the extreme values for each station tend to be underestimated [58][59][60][61][62][63][64][65]. To address this problem, we proposed a method using cluster analysis to account for these underestimated extreme values of the drought index. Moreover, its appropriateness was presented through a comparison with past drought damage.
The SPI by duration for 13 rainfall stations in Gyeongsangnamdo, an administrative district in South Korea where drought damage occurred most frequently, was analyzed, and the representative drought index was calculated using the Thiessen method and cluster analysis. In addition, past drought damage and SPI by duration were compared to examine the appropriateness of the two analysis methods, and cluster analysis gave the more accurate result. The difference between the two methods in MAD, MSE, and RMSE for drought damage ranged from 0.1 to 0.3, which does not indicate that estimation based on the Thiessen method is incorrect. Nevertheless, drought damage can be expected to decrease if methodologies with higher appropriateness are developed for disaster prevention.
In previous drought analyses, SPI was analyzed and evaluated using the arithmetic mean, Thiessen, and inverse distance weighting methods, which consider the influence of rainfall stations on administrative districts, watersheds, and spatial ranges [58][59][60][61][62][63][64][65][66][67][68][69][70]. These methods were compared based on the analysis results and the ranges of the drought index. Moreover, the appropriateness was evaluated while adjusting the drought index estimation method or the spatial range for rainfall stations [62]. In another study, 144 rainfall stations in Portugal were divided into three regions through cluster analysis, and the period of drought was proposed using SPI [71]. The analysis of the spatial range for rainfall stations is important for drought evaluation, and research on various methods is required. Most previous studies have limitations in terms of accuracy because they presented only the development of indices for evaluating drought or comparisons among them, indicating the need for a qualitative analysis and appropriateness verification using past drought damage data.
In this study, SPI by duration based on the Thiessen method and cluster analysis was analyzed, and its accuracy was calculated for Gyeongsangnamdo, Korea. However, evaluating drought using SPI, which considers only rainfall, has limitations, since drought is a disaster that occurs due to various causes and complex relationships. Moreover, whether past drought damage occurred under extreme drought conditions cannot be accurately verified. Despite these limitations, more reliable disaster prevention will be possible if a more accurate methodology is used to analyze the influence of drought.
In future research, it will be necessary to propose methods that quantitatively link drought damage to the drought index through quantitative analysis of drought damage. In addition, most current drought analyses propose various durations, but it will be possible to identify the durations relevant to disaster prevention if drought damage is linked to the drought index by duration.

Conclusions
Drought is a slow-onset natural hazard whose effects accumulate slowly over a certain period and may persist for years after the termination of the event. Therefore, determining the exact time of occurrence of drought is difficult. Nevertheless, various disaster management studies have been conducted to prevent its damaging effects. In this study, we proposed a drought analysis method using SPI in Gyeongsangnamdo, where drought damage frequently occurred in South Korea. SPI by duration was analyzed for 13 rainfall stations located inside and outside Gyeongsangnamdo. The representative SPI of Gyeongsangnamdo was estimated by applying the SPI of each station by duration based on the Thiessen method and cluster analysis. For the Thiessen method, the SPI by duration was estimated by applying the area weight of the Thiessen polygon for each rainfall station, resulting in the range −3.09 to 2.76. For cluster analysis, the stations were divided into three clusters using the k-means method, and the minimum value was calculated as the SPI by duration. SPI by duration based on cluster analysis ranged from −3.30 to 1.82.
The minimum value per year was calculated as the representative SPI to compare the SPI by duration estimated using the Thiessen method and cluster analysis with the past damage data of Gyeongsangnamdo. Moreover, appropriateness was compared based on the years in which drought damage occurred, since the past drought damage data only present the damage status per year without accurate drought start points. The reference SPI for the years with drought damage was set to −2.00, which is the criterion for extremely dry conditions, and the accuracy of each analysis method was analyzed using the MAD, MSE, and RMSE.
The appropriateness indicators of SPI by duration for the past drought damage were higher than 0.5 and lower than 1.0 for the Thiessen method. Meanwhile, for cluster analysis, they were higher than 0.3 and lower than 0.8, indicating that cluster analysis exhibited higher accuracy. Therefore, it is possible to predict drought damage more accurately if cluster analysis is utilized during the analysis of the drought index for rainfall stations in administrative districts.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to [national policy research result].