Analysis of Drought Characteristics in Northern Shaanxi Based on Copula Function

: Precipitation is low and drought occurs frequently in Northern Shaanxi. To study the characteristics and occurrence and development of drought events in Northern Shaanxi is beneﬁcial to the prevention and control of drought disasters. Based on the monthly rainfall data of 10 meteorological stations in Northern Shaanxi from 1960 to 2019, the characteristic variables of drought events at each meteorological station in Northern Shaanxi were extracted by using run theory and copula function. The joint probability distribution and recurrence period were obtained by combining the duration and intensity of drought, and the relationship between drought characteristics and crop drought affected area was studied. The results show that (1) from 1960 to 2019, drought events mainly occurred in Northern Shaanxi with long duration and low severity, short duration and high severity, or short duration and low severity, among which the frequency of drought events that occurred in Yuyang and Baota districts was higher. The frequency of light drought and extreme drought was more in the south and less in the north, while the frequency of moderate drought and severe drought was more in the north and less in the south. (2) The optimal edge distribution of drought intensity and drought duration in Northern Shaanxi is generalized Pareto distribution, and the optimal ﬁtting function is Frank copula function. The greater the duration and intensity of drought, the greater the cumulative probability and return period. (3) The actual recurrence interval and the theoretical recurrence interval of drought events in Northern Shaanxi were close, and the error was only 0.1–0.3a. The results of the joint return period can accurately reﬂect the actual situation, and this study can provide effective guidance for the prevention and management of agricultural dryland in Northern Shaanxi.


Introduction
There are many definitions of drought. In general, drought is a natural phenomenon that water resources in a certain region are in short supply due to the uneven spatial and temporal distribution of water resources. Drought disasters occur in a wide range and high frequency, and the damage and losses caused by them will seriously hinder the socioeconomic, agricultural development and ecological environment construction of the country and region [1][2][3]. In recent years, global warming has had a great impact on the environment and climate. The Fourth Assessment Report of the Intergovernmental Panel on Climate Change in 2007 pointed out that global warming will continuously aggravate extreme hydrological events such as drought and flood disasters, and the risk of drought disasters will also continue to increase [4]. The abnormal SST in the tropical Indo-Pacific Ocean led to severe drought in Northeast Asia in the summer of 2014 [5], and severe drought continued to occur in southwest China in the summer, autumn and winter of 2009, leading to a serious shortage of water resources, causing significant economic losses and seriously affecting people's lives [6]. The frequency and the regularity of drought in the region has very important practical significance.
Drought index is the basis of quantitative study on the occurrence and development of drought. Many scholars use drought indicators such as Palmer Drought Index (PDSI) [7,8], Standardized Precipitation Index (SPI) [9,10], Precipitation Anomaly (PA) [11], Standardized Evapotranspiration Index (SPEI) [12,13] and Comprehensive Meteorological Drought Index (CI) [14] to analyze the characteristics and rules of drought occurrence. At present, the research on drought events mainly focuses on the single factor analysis of drought duration and drought intensity, but the complexity and the huge degree of drought disasters cannot be fully described. Therefore, it is necessary to analyze multiple characteristic variables of drought events, such as drought duration, drought intensity, drought area, and intensity peak, etc. Each characteristic variable of drought has a certain correlation. Only by selecting multiple drought variables for simultaneous analysis can drought events be better described [15][16][17][18]. There are many methods for the identification of drought characteristic variables, among which the most widely used method is the theory of running distance. Precipitation data is a typical time series, and the run theory can be used to analyze the discrete stochastic process well [19][20][21]. Herbst et al. [22] were the first to use the run theory and monthly precipitation data to identify drought characteristics. In the process of identification of the drought event with the run theory, precise control of truncation level is needed, otherwise it will lead to inaccurate identification of drought events [23]. Therefore, it is necessary to set multiple accurate interception levels to improve the accuracy of drought event identification.
Copula function is used for joint analysis of several drought characteristic variables with correlation. Copula function has been widely used in the study of drought disaster [24], flood disaster [25], rainstorm disaster [26] and other meteorological disasters. Its theoretical basis was first proposed by Sklar [27], which can well describe the correlation between two variables. Bahram Saghafian et al. [15] use copula function to analyze the characteristics of groundwater drought, monitor groundwater drought through standardized groundwater index (SGI), and determine the probability and return period of multivariable groundwater drought through copula function. Foo Hui-mean et al. [28] extracted the three characteristic variables of drought severity, duration and peak severity, and analyzed the occurrence frequency of drought events in the peninsula of Malaysia using the joint distribution of three-dimensional copula model. Copula function can carry out joint distribution of drought variables and describe the complexity and variability of drought events more accurately. However, in the existing studies, the copula function has been studied more in hydrological drought and meteorological drought, but less in the field of agricultural drought. Meanwhile, China is a large agricultural country, and the frequent occurrence of drought disasters has a great impact on the agricultural development of China. Therefore, the accurate analysis of agricultural drought in the region is very important.
Shaanxi north-south across half wet and dry climate zone, is a typical ecologically fragile area, in addition, central Shaanxi Qinling as a natural ecological barrier, south and north climate in China have a significant impact, under the background of global climate change, environmental, Qinling mountains north and south of climate change have become hot topics in the study of many scholars [29][30][31]. In the northern part of Shaanxi, there are mainly two municipal districts, namely Yulin and Yan'an. Northern Shaanxi is a typical dry farming area, located in the middle latitude inland, the northern and northwestern parts of the region belong to the semiarid monsoon climate type, the central and southern parts belong to the warm temperate semiarid monsoon climate type. The terrain of northern Shaanxi is high in the northwest and low in the southeast. The northern part is aeolian landform and beach land, the central part is loess hilly and gully region, and the southern part is the gully and gully region of the Weibei Plateau. The temperature and rainfall decrease from southeast to northwest. Therefore, the climate characteristics in this region have obvious spatial differences, leading to the frequent occurrence of drought events, which seriously hinders the development of agriculture and economy in Northern Shaanxi. Run theory and the joint distribution function of copula to are used analyze the law of drought frequency and drought return period in Northern Shaanxi, which provide reference for the formulation of drought resistance strategies in this area.

Study Area and Data
Shaanxi Province is located in the middle latitudes of Northwest China and the northeast side of Qinghai Tibet Plateau, with a long and narrow area of 870 km from north to South and 360 km from east to west, covering a total area of 20.58 × 10 4 km 2 . The landform of the whole province can be divided into three parts: the Loess Plateau in Northern Shaanxi (the sandy area along the great wall and the hilly and gully area in Southern Shaanxi), the Guanzhong Plain (the Weibei channel and Weibei plateau) and the Qinba Mountain landform in southern Shaanxi [32]. The study area is located in the Loess Plateau of Northern Shaanxi, namely Yulin and Yan'an. It belongs to the temperate continental monsoon climate, the average annual precipitation is about 300-700 mm, and the seasonal variation is obvious, and the precipitation is mainly concentrated in summer. Drought events occur frequently [33,34].
In this study, daily data, including precipitation, maximum temperature, minimum temperature and sunshine hours, from 10 meteorological stations in the northern part of Shaanxi Province were used. Figure 1 shows the spatial distribution of 10 meteorological stations in Northern Shaanxi. The time scale was 1960-2019, and the data was provided by the China Meteorological Data Network. The information of agricultural drought disaster is from China Meteorological Disaster Ceremony-Shaanxi Province.

Standardized Precipitation Index
Standardized Precipitation Index (SPI) is a good way to monitor drought severity over time series in different regions [35]. The main step of SPI is to first make statistics of the precipitation in the study period, get the Γ probability distribution, and then carry out normal standardization [36]. The calculation formula is as follows: where x-precipitation in time period, mm; G(x)-cumulative probability corresponding to x; S-positive and negative coefficient of probability density; S = 1 when G(x) > 0.5; S = −1 when G(x) ≤ 0.5. c 0 = 2.515517, c 1 = 0.802853, c 2 = 0.010328, d 1 = 1.432788, d 2 = 0.189269, d 3 = 0.001308. According to GB/T 20481-2017 meteorological drought grade, SPI corresponds to 5 grades, as shown in Table 1. Table 1. SPI drought classification standard.

Drought Feature Recognition
Run length theory is a method to analyze time series, which was initially applied to drought research by Herbst [22]. In this study, run length theory was used to extract two characteristic variables of drought severity and drought duration, The basic operation is as follows [20] (Figure 2): 1 Set three truncation levels R 0 , R 1 and R 2 . In this study, the threshold of drought index was determined by trial-and-error method. R 0 = 0, R 1 = −0.3, R 2 = −0.5. By setting the truncation level to identify drought events, the results were consistent with the actual drought situation of China Meteorological Disasters (Shaanxi volume); 2 Identify potential drought events. If SPI of a month is less than or equal to R 1 , the month is regarded as a potential drought month, and a continuous potential drought month is regarded as a potential drought event. As shown in Figure 2, there are five potential drought events a, b, c 1 , c 2 and d; 3 Determine drought events and their duration and intensity. If the potential drought event lasts for one month, when its SPI is greater than R 2 , this potential drought event will be eliminated (a in Figure 2). When its SPI is less than or equal to R 2 , it is determined as a drought event (b in Figure 2). A potential dry event is defined as a drought event if it lasts for more than a month (d in Figure 2). If the time interval between two drought events occurs is 1 and the SPI for the month is less than or equal to R 0 , the two drought events are combined into one drought event (c 1 and c 2 in Figure 2).

Construct the Marginal Distribution Function of Characteristic Variables
In constructing the drought duration and intensity of the joint distribution function, the first thing to determine is their marginal distribution functions and examine the correlation between them. In this study, the MVCAT toolbox in MATLAB R2020a software was used, and the univariate distribution functions in the toolbox include Beta distribution, exponential distribution, extreme value distribution, Gamma distribution, generalized extreme value distribution, generalized Pareto distribution, anti-Gaussian distribution, logical distribution, logarithmic distribution and Weibull distribution and so on [37]. At the same time, the Pearson rank correlation coefficient, Kendall rank correlation coefficient and Spearman rank correlation coefficient were calculated, and the correlation between the drought duration and intensity of drought was inspected.

Construction of Copula Joint Distribution Function
Copula function can combine the marginal distribution functions with the interval of [0, 1], the most common of which is the symmetric copula-the Archimedean copula. It has simple structure, few parameters and is easy to solve. It is widely used in the calculation of hydrological multivariate [38,39]. According to the definition of copula function, if F D (d) is the edge distribution function of drought duration and F S (s) is the edge distribution function of drought intensity, then the joint distribution function of drought duration and intensity is: There are more than 20 copula functions in the toolbox which can be used for fitting. In this paper, three commonly used symmetric copula functions were selected, namely, Clayton copula, Gumbel copula and Frank copula [18,20] (Table 2). Finally, the selected functions were tested for goodness of fit to determine the optimal copula function, and the test methods were NSE and RMSE. Table 2. Archimedean copula function types.

Copula Type Copula Formula Parameter Range
Clayton θ ∈ R Note: u 1 , u 2 are marginal distribution functions of drought duration and drought intensity, θ is copula function.

Calculation of Return Periods
The univariate recurrence interval of drought duration and drought intensity were as follows [40]: where, T(d) is the recurrence interval of drought duration, T(s) is the recurrence interval of drought severity, n is number of years in the study period, and N is the number of drought events in the study period. When D ≥ d or S ≥ s, the joint return period of characteristic variables is:

Spatial Distribution of Drought Frequency
For the selection of drought index we refer to the previous literature; their research shows that SPI-3 index can better monitor the drought in Northern Shaanxi. The SPI with 3 months time scale can better reflect the seasonal precipitation. The monthly precipitation data were collected and the SPI sequence of 10 meteorological stations in Northern Shaanxi during 1960-2019 was calculated. Then, drought duration, drought intensity, drought frequency and drought grade of drought events were extracted by using the theory of running distance. ArcGIS was used to make a spatial distribution of the frequency of drought events in Northern Shaanxi in recent 60 years, and Figure 3 was obtained. From 1961 to 2019, the average frequency of mild drought in the southern part of the study area was 114, and the average frequency of mild drought in the northern part was 102. The average frequency of moderate drought in the southern part of the study area was 62, and the average frequency of moderate drought in the northern part was 67. The average frequency of severe drought in the southern part of the study area was 26 times, and the average frequency of severe drought in the northern part was 29 times. The average frequency of extreme drought in the southern part of the study area was 18 times, and the average frequency of extreme drought in the northern part was 17 times. The probability of drought events with long duration and low intensity, short duration and high intensity or short duration and low intensity in the whole Northern Shaanxi is higher, which is more than 75%. In the northern part of Northern Shaanxi, the frequency of light drought and extreme drought is higher than that of medium drought and severe drought. In the south of Northern Shaanxi, the frequency of moderate and severe drought is higher than that of light and extreme drought. Therefore, the spatial differences of precipitation in Northern Shaanxi were greater.

Correlation Analysis of Characteristics of Drought Variables
Two characteristic variables, drought intensity and drought duration, were extracted from the run-length theory. Then the Kendall rank, Spearman's rank-order, and Pearson product-moment correlation coefficients of the two variables were calculated, the closer the coefficient value is to 1, the stronger the correlation between the variables. The results are shown in Table 3. According to the table, Kendall's rank correlation coefficient ranged from 0.7452 to 0.8083, Spearman's rank correlation coefficient and Pearson Product-Moment were both close to 0.9, and the highest value could reach 0.9339. This indicates that drought intensity and drought duration have a high correlation. Therefore, with the use of copulas connect function in Northern Shaanxi, this paper's set up of the drought duration and intensity of joint distribution function is feasible.

Determine the Appropriate Marginal Distribution Function
Determining the appropriate edge distribution function is the premise of constructing the joint distribution function of copula. Through the fitting of drought variables, the optimal distribution function was obtained and its parameter values were calculated. Table 4 shows the marginal distribution function and parameter values corresponding to drought variables at each meteorological station. According to the table, the marginal distribution of Dd in Dingbian was generalized extreme value distribution, the marginal distribution of Ds in Wuqi was inverse Gaussian distribution, and the marginal distribution of Ds in Yanchang was inverse Gaussian distribution. The distribution functions of drought characteristic variables at other stations were generalized Pareto distribution. Therefore, the marginal distribution function of drought variables in Northern Shaanxi is determined as the generalized Pareto distribution. The results in the table were calculated by the MVCAT toolbox in MATLAB software, and the goodness of fit of the distribution function of characteristic variables passed the significance test of 0.05. Figure 4 shows the edge distribution function fitting graph and quantile-quantile (Q-Q) graph of two drought variables in Yulin City, and it can be seen from the graph that the fitting effect is good.  Note: k ∈ R is the position parameter, σ > 0 is the scale parameter, θ ∈ R is the shape parameter.

Determine the Appropriate Copula
The goodness of fit of the joint distribution function was tested to determine the optimal copula function. Table 5 shows the calculation results of RMSE and NSE. It can be seen that, except Shenmu Station, RMSE value of Frank copula function at other stations in the study area is the smallest and NSE value is closest to 1, indicating that the fitting effect of Frank copula function in the study area is the best. Therefore, Frank copula function was determined to combine the marginal distribution of drought characteristic variables in Northern Shaanxi.

Drought Risk Probability Assessment
Taking Yulin and Baota stations as examples, the joint probability distribution of drought intensity and drought duration was analyzed in this paper. Figure 5 shows the combined cumulative probability of drought duration and drought intensity, in which (a) shows Yulin Station and (b) Baota Station. As can be seen by the picture, the combined cumulative probability of drought increased with the increase of drought duration and drought intensity. When the drought lasted no more than 4 months or the drought intensity no more than 4, the contour lines were dense and the gradient between the contour lines was small, with the increase of drought duration and intensity, the joint cumulative probability value increases rapidly. When the drought lasts longer than 4 months or the drought intensity exceeds 4, the gradient that the combined cumulative probability increases with drought duration and intensity becomes smaller. This indicates that drought events with long duration and low intensity, short duration and high intensity or short duration and low intensity often occur at these two stations.

Joint Return Period
The joint return periods of Yulin and Baota stations were calculated, as shown in Figure 6. With the increase of drought intensity and drought duration, the joint return period also increased. Most of the combined return periods of drought events at the two sites did not exceed 25 years. This indicates that the frequency of high-intensity and long-duration drought events in these two stations is relatively low. For example, under the condition of return period of 5 years, the drought duration of Yulin station is 5 months, and the drought intensity is 6.1. The drought duration of Baota Station drought event was 5 months, and the drought intensity was 5.2. This indicates that the drought events with the same return period are more likely to occur at Baota Station than at Yulin Station. In general, the joint return periods of the two sites increased with the increase of drought duration and intensity, and the increasing trend was slow at first and then steep.

Application of Joint Return Period
The percentages of cultivated land area and disaster-affected area in Northern Shaanxi were obtained by using China Crop Industry Information Network and Shaanxi Statistical Yearbook, and then the annual agricultural disaster-affected area in Northern Shaanxi was calculated, as shown in Figure 7. As can be seen from the figure, from 1995 to 2010, the drought-affected agricultural area in Northern Shaanxi increased first, then decreased and then increased, and the peak appeared in 2000, 2005, 2007 and 2009. According to the theory of running distance, 21 droughts occurred in Northern Shaanxi during 1995-2010, among which 11 drought events lasted no more than 4 years and had no more than 4 drought intensity. The return period of these drought events was basically 1-2 years. Based on the 10-day data set of agrometeorological disasters in Northern Shaanxi from 1995 to 2010, the actual drought duration and actual return period in the study area were obtained. Based on the average SPI value in Northern Shaanxi, the theoretical drought duration and theoretical return period were calculated by using the run theory and copula function, as shown in Table 6. The results show that the theoretical value of drought duration is basically consistent with the actual value under the same drought grade. The theoretical return period of the same drought event was compared with the actual return period, and the results were basically the same. The error between the theoretical return period and the actual return period was about 0.1-0.3a. The above results indicate that the drought characteristics of drought events extracted by the running distance theory are close to the actual values, and the return period can describe the actual agricultural drought situation more accurately.

Discussion
This paper studies the spatial distribution of drought frequency, the joint probability distribution of drought duration and drought intensity, the return period and the application of the return period in agrometeorological disasters in Northern Shaanxi. The results showed that the probability of drought events with long duration low intensity, short duration high intensity or short duration low intensity in Northern Shaanxi was higher, and Yuyang District and Baota District had high risk of drought, which was consistent with the results obtained by Wang Xiaofeng et al. [23] in the analysis of drought characteristics in Northern Shaanxi. The results show that it is reliable to extract the drought characteristics by using the theory of running distance and to analyze the regional drought characteristics by using the copula function to carry out joint probability distribution.
In the research process, how to select the drought index and how to set the truncation level of the drought index are very important. However, the setting of the threshold of truncation level in the theory of run length is uncertain, and the threshold of truncation level will change with the difference of drought index and region. The drought index selected in this paper is the standardized precipitation index, which has the advantages of simple calculation, wide application and strong applicability. However, at present, the identification of characteristic variables of drought events mostly adopts a single threshold of truncation level, and the accuracy of the identification of drought events is low [41][42][43]. In this study, three truncation levels were set based on previous experience and trialand-error method, combined with actual disaster conditions, to improve the accuracy of drought event identification and drought feature extraction. However, drought index selection is single, the subsequent remains to be improved.
At present, most of the research studies hydrological and meteorological drought using copula function, and seldom combine the research results with crop data [44,45]. The annual drought disaster in China leads to severe grain production, so it is urgent to further study regional agricultural drought [46]. Based on the analysis of drought frequency, joint probability distribution and return period in Northern Shaanxi, this paper further combined with crop affected area, and found that the analysis results could accurately reflect the grade and return period of agricultural drought events. However, the meteorological disasters that have a greater impact on crop yield are also flood disasters. The combination of drought and flood disasters and the rapid transfer of drought and flood increase crop losses. This direction is the key of the follow-up study.
In this paper, the research results of agricultural drought disaster prevention and control of the north of Shaanxi, assessment and management provides a good reference basis. The method can also be applied to the drought research in other regions of the world. For the regions with similar climatic characteristics and topography, our research results can provide reference for the drought research in this area.

Conclusions
Through the calculation of monthly precipitation data, the standardized precipitation index series was obtained, and the characteristic variables of drought events were extracted from SPI3 series by using the run-length theory. Then the copula function was used to carry out the joint probability distribution of drought intensity and drought duration, and the joint cumulative probability and return period were obtained. By comparing the results with the affected area of crops, the accuracy and applicability of the results were determined, and finally the following conclusions were drawn: (1) From 1961 to 2019, the average frequency of mild drought in the southern part of the study area was 114, and the average frequency of mild drought in the northern part was 102. The average frequency of moderate drought in the southern part of the study area was 62, and the average frequency of moderate drought in the northern part was 67. The average frequency of severe drought in the southern part of the study area was 26 times, and the average frequency of severe drought in the northern part was 29 times. The average frequency of extreme drought in the southern part of the study area was 18 times, and the average frequency of extreme drought in the northern part was 17 times. The probability of drought events with long duration and low intensity, short duration and high intensity or short duration and low intensity in the whole Northern Shaanxi is higher, which is more than 75%.
(2) The optimal edge distribution of drought duration and drought intensity in Northern Shaanxi is generalized Pareto distribution, and the optimal fitting function is Frank copula function. Finally, the combined probability and return interval of drought characteristic variables are obtained. When the drought duration is less than 4 months or the drought intensity is less than 4, the joint cumulative probability increases rapidly and the joint return period increases slowly with the increase of drought duration and intensity. When the duration of drought is more than 4 months or the intensity of drought is more than 4, the increasing trend of joint cumulative probability becomes slower and the increasing trend of joint return period becomes steeper.
(3) The model fitting results were compared with the actual crop affected area, and the actual eigenvalues of drought events were close to the theoretical eigenvalues. However, the actual return period and the error between the theoretical return period is only 0.1-0.3 a. The results of the joint return period can accurately reflect the actual situation, and this study can provide effective guidance for the prevention and management of agricultural dryland in Northern Shaanxi.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.