Future Runoff Variation and Flood Disaster Prediction of the Yellow River Basin Based on CA-Markov and SWAT

The purpose of this paper is to simulate the future runoff change of the Yellow River Basin under the combined effect of land use and climate change based on Cellular automata (CA)-Markov and Soil & Water Assessment Tool (SWAT). The changes in the average runoff, high extreme runoff and intra-annual runoff distribution in the middle of the 21st century are analyzed. The following conclusions are obtained: (1) Compared with the base period (1970–1990), the average runoff of Tangnaihai, Toudaoguai, Sanmenxia and Lijin hydrological stations in the future period (2040–2060) all shows an increasing trend, and the probability of flood disaster also tends to increase; (2) Land use/cover change (LUCC) under the status quo continuation scenario will increase the possibility of future flood disasters; (3) The spring runoff proportion of the four hydrological stations in the future period shows a decreasing trend, which increases the risk of drought in spring. The winter runoff proportion tends to increase; (4) The monthly runoff proportion of the four hydrological stations in the future period tends to decrease in April, May, June, July and October. The monthly runoff proportion tends to increase in January, February, August, September and December.


Introduction
The Chinese government promoted the "ecological protection and high quality development of the Yellow River Basin" as a national development strategy in 2019, and pointed out that there are some problems to be solved in the basin, such as water shortage. With its rapid increase in population and the rapid expansion of the economy, water shortage is becoming more and more serious [1], which restricts its high quality and rapid development. Runoff is an important part of water resources in the Yellow River Basin. Therefore, the analysis and simulation of runoff change is very important for the management and effective utilization of water resources in the Yellow River Basin.
Many scholars have simulated future runoff variation in different rivers, such as the Yellow River [2][3][4], Hoeya River [5], Altmühl River [6], Beijiang River [7], and the upper reaches of the Grande River [8]. For example, Li et al. [9] simulated the future runoff variation in the upper reaches of the Yellow River under two climate scenarios (A2 and B2) using seven CMIP3 models. The results showed that the runoff will tend to decrease in the future, and that the probability and severity of both flood and drought disasters will increase. Li et al. [10] simulated the runoff variation in the source region of the Yellow River from 2010 to 2020 under two climate scenarios (A2 and B2), and showed that the runoff will decrease gradually in the future. Wei et al. [11] imported the climatic data from the BCC-CSM1.1 model into the Variable Infiltration Capacity (VIC) hydrological model and estimated the runoff change in the upper reaches of the Yellow River from 2011 to [...]
The Yellow River Basin lies at approximately 95°53′–119°05′ E, 32°10′–41°50′ N (Figure 1). The river originates in the Bayankala Mountains and flows into the Bohai Sea. It spans 5464 km and covers 79.5 × 10⁴ km² (including the inner river area), accounting for 8% of China's land area. Most of the basin belongs to arid and semi-arid regions. Precipitation decreases from southeast to northwest, with an average of 476 mm. Temperature decreases gradually from south to north and from east to west, with an annual average between −4 and 14 °C. Evaporation varies across the basin and increases from southeast to northwest. The water resources in the basin are deficient and the ecological environment is fragile under the influence of climate change and human activities. In addition, the basin is one of the most important bases for the production of grain and agricultural products in China, accounting for about 13.4% of China's total amount of agricultural products.
The total area of cultivated land in the basin is 20.2346 million hm², accounting for about 16.6% [19]. The shortage of water resources in the basin poses a great threat to China's food security.

Data Sources
(1) The monthly runoff data of the Tangnaihai, Toudaoguai, Sanmenxia and Lijin stations from 1967 to 1990 are obtained from the Yellow River Water Conservancy Commission (http://yrcc.gov.cn/, accessed on 1 January 2021). Tangnaihai hydrological station is a runoff monitoring station in the source area of the Yellow River. Toudaoguai hydrological station is the dividing point between the upper and middle reaches of the Yellow River. Sanmenxia hydrological station is an important national hydrological station, which is responsible for the flood control monitoring of the lower Yellow River. Lijin hydrological station is the last hydrological station in the Yellow River Basin.
(2) The Digital Elevation Model (DEM) data are downloaded from the Resource and Environmental Science Data Center of the Chinese Academy of Sciences (http://www.resdc.cn/, accessed on 1 January 2021), with a resolution of 1 km × 1 km.
(3) The land use data are downloaded from the Resource and Environmental Science Data Center of the Chinese Academy of Sciences (http://www.resdc.cn/, accessed on 1 January 2021), including 1980, 1995, 2005 and 2015, with a resolution of 1 km × 1 km.
(4) The soil data are derived from the World Soil Database with a resolution of 1 km × 1 km (http://westdc.westgis.ac.cn/, accessed on 1 January 2021).

(5) The 134 meteorological station data in and around the Yellow River are obtained [...]
The spatial resolution of global climate model data is very low, so it is difficult to represent climate change accurately at the watershed scale. Therefore, a downscaling method is needed to improve accuracy in watershed-scale studies. The Delta method can correct the data bias of multiple weather stations at the same time, and its operation process is relatively simple, so it has been widely used by scholars [22][23][24]. Therefore, based on the observation data of 134 meteorological stations in and around the Yellow River Basin from 1985 to 2005, the Delta method is used to process the precipitation, maximum temperature and minimum temperature data of five CMIP5 models from 2040 to 2060: MIROC-ESM-CHEM, NorESM1-M, IPSL-CM5A-LR, GFDL-ESM2M and HadGEM2-ES. The specific calculation formulas are as follows:
$$P_{Fut} = P_{His} \times \frac{[P_{Gfut}]_{Mon}}{[P_{Ghis}]_{Mon}}$$
$$Tmax_{Fut} = Tmax_{His} + \left([Tmax_{Gfut}]_{Mon} - [Tmax_{Ghis}]_{Mon}\right)$$
$$Tmin_{Fut} = Tmin_{His} + \left([Tmin_{Gfut}]_{Mon} - [Tmin_{Ghis}]_{Mon}\right)$$
In the formulas, $P_{Fut}$, $Tmax_{Fut}$ and $Tmin_{Fut}$ represent the future precipitation, maximum temperature and minimum temperature data of each meteorological station after downscaling, respectively; $P_{His}$, $Tmax_{His}$ and $Tmin_{His}$ represent the observed precipitation, maximum temperature and minimum temperature data at each meteorological station in the historical period, respectively; $[P_{Gfut}]_{Mon}$, $[Tmax_{Gfut}]_{Mon}$ and $[Tmin_{Gfut}]_{Mon}$ represent the monthly precipitation, maximum temperature and minimum temperature data of the CMIP5 model in the future period, respectively; $[P_{Ghis}]_{Mon}$, $[Tmax_{Ghis}]_{Mon}$ and $[Tmin_{Ghis}]_{Mon}$ represent the monthly precipitation, maximum temperature and minimum temperature data of the CMIP5 model in the historical period, respectively.
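The Delta downscaling step described above can be illustrated with a minimal Python sketch. It assumes the usual form of the Delta method (a multiplicative change factor for precipitation and an additive one for temperature); the function name `delta_downscale`, the array layout (years × 12 months) and the handling of zero model precipitation are illustrative assumptions, not details taken from this paper.

```python
import numpy as np

def delta_downscale(obs_hist, gcm_hist_mon, gcm_fut_mon, kind):
    """Apply a monthly Delta change factor to station observations.

    obs_hist     : array (n_years, 12), observed station values (Jan..Dec)
    gcm_hist_mon : array (12,), model monthly means in the historical period
    gcm_fut_mon  : array (12,), model monthly means in the future period
    kind         : "precipitation" (ratio) or "temperature" (difference)
    """
    obs_hist = np.asarray(obs_hist, dtype=float)
    gcm_hist_mon = np.asarray(gcm_hist_mon, dtype=float)
    gcm_fut_mon = np.asarray(gcm_fut_mon, dtype=float)

    if kind == "precipitation":
        # Assumption: where the model shows no historical precipitation,
        # keep the observation unchanged (ratio = 1) to avoid dividing by zero.
        ratio = np.divide(gcm_fut_mon, gcm_hist_mon,
                          out=np.ones_like(gcm_fut_mon),
                          where=gcm_hist_mon > 0)
        return obs_hist * ratio
    if kind == "temperature":
        return obs_hist + (gcm_fut_mon - gcm_hist_mon)
    raise ValueError("kind must be 'precipitation' or 'temperature'")
```

In this sketch the same function would be called once per station and per climate model, with `kind="temperature"` used for both maximum and minimum temperature.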

CA-Markov Model
The CA-Markov model effectively combines the advantages of the CA model and Markov model, greatly improves the accuracy of future land use simulation results, and has been widely used [25,26]. Therefore, the CA-Markov model is chosen as the method to simulate the land use of the Yellow River Basin in 2050 in this paper.
The CA model can be expressed by the following formula:
$$S_{(t+1)} = f\left(S_{(t)}, N\right)$$
where $S$ represents the set of all cells, $t$ and $t + 1$ represent different times, $f$ represents the transition rule between cell states, and $N$ represents the neighborhood of cells.
By analyzing the transition probability of random events from the initial period to another period, the Markov model can predict the situation in a future period on the basis of the initial period. It can be expressed by the following formula:
$$S_{t+1} = P_{ij} \times S_{t}$$
where $S_{t}$ and $S_{t+1}$ represent the land type matrix at time $t$ and $t + 1$, respectively, and $P_{ij}$ represents the transfer probability matrix.
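The Markov part of the model can be sketched in a few lines of Python: a transition probability matrix is estimated by cross-tabulating two classified maps, and class areas are then projected forward. This is only an illustration of the idea, not the IDRISI implementation; the function names, the integer class coding (0 to n_classes − 1) and the row-vector convention (entry i, j is the probability of class i becoming class j) are assumptions.

```python
import numpy as np

def transition_matrix(map_t0, map_t1, n_classes):
    """Estimate P[i, j] = probability that a cell of class i at t0 is class j at t1."""
    m0 = np.asarray(map_t0).ravel()
    m1 = np.asarray(map_t1).ravel()
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (m0, m1), 1)          # cross-tabulate the two maps
    row_sums = counts.sum(axis=1, keepdims=True)
    P = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    # Assumption: a class absent at t0 is treated as persistent.
    empty = row_sums.ravel() == 0
    P[empty, empty] = 1.0
    return P

def project_areas(areas_t, P, steps=1):
    """Project class areas forward by applying the transition matrix `steps` times."""
    areas = np.asarray(areas_t, dtype=float)
    for _ in range(steps):
        areas = areas @ P
    return areas
```

The cellular automaton then decides where these projected areas are allocated in space, using the neighborhood and suitability maps; that spatial step is not covered by this sketch.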

Kappa Coefficient
The Kappa coefficient is often used to verify the accuracy of the predicted land use map. It is generally believed that the accuracy of the prediction result is better if the Kappa value is greater than 0.75 [26,27].
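Assume that the total number of pixels is $N$ and that $S$ is the number of correctly simulated grids. If the real number of pixels in each class is $a_1, a_2, \ldots, a_c$ and the simulated number in each class is $b_1, b_2, \ldots, b_c$, the Kappa coefficient ($K$) can be expressed by the following formulas:
$$P_o = \frac{S}{N}$$
$$P_c = \frac{a_1 b_1 + a_2 b_2 + \cdots + a_c b_c}{N^2}$$
$$K = \frac{P_o - P_c}{1 - P_c}$$
where $P_o$ is the observed proportion of agreement and $P_c$ is the proportion of agreement expected by chance.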
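As a worked illustration, the Kappa coefficient can be computed from two classified maps using the quantities defined above (N, S, a and b). The sketch below assumes the standard Cohen's Kappa form and integer class codes starting at 0; the function name is hypothetical.

```python
import numpy as np

def kappa_coefficient(actual, simulated, n_classes):
    """Kappa between an actual and a simulated class map (integer codes 0..n_classes-1)."""
    actual = np.asarray(actual).ravel()
    simulated = np.asarray(simulated).ravel()
    n = actual.size                                   # total number of pixels, N
    s = np.sum(actual == simulated)                   # correctly simulated pixels, S
    a = np.bincount(actual, minlength=n_classes)      # real count per class
    b = np.bincount(simulated, minlength=n_classes)   # simulated count per class
    p_o = s / n                                       # observed agreement
    p_c = np.sum(a * b) / n**2                        # agreement expected by chance
    return (p_o - p_c) / (1 - p_c)
```

For example, with four pixels of which three agree, real counts (2, 2) and simulated counts (3, 1), the observed agreement is 0.75, the chance agreement is 0.5, and the Kappa value is 0.5.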

SWAT Model
The SWAT model is a distributed hydrological model. Several studies have shown that it can be applied in different climate regions all over the world to simulate not only streamflow and runoff but also nutrient loadings, water quality, pollution, etc. [28][29][30][31][32][33][34][35].
In this paper, the Yellow River Basin is first divided into 33 sub-basins according to the DEM data (Figure 2). The Hydrological Response Units (HRUs) in each sub-basin are then delineated according to the land use, slope and soil type data imported into the SWAT model. Slope data are calculated from the DEM data. The basic information of the soil type data is displayed in Figure 3 and Table 2. The number of HRUs affects the running speed of the SWAT model, so generating too many HRUs is undesirable if both accuracy and speed are to be taken into account. The threshold values of land use, soil and slope are all set to 15%, which means that land use, soil and slope classes whose share falls below 15% are merged into other types. Finally, 138 HRUs are generated.
Then, the soil attribute database and meteorological attribute database are established according to the soil data and meteorological data, respectively. Finally, the SWAT model of the Yellow River Basin is established. In the simulation, 1967–1969 is the warm-up period, 1970–1980 is the calibration period, and 1981–1990 is the validation period. The model runs at a monthly time step.

SWAT-CUP Software
SWAT-CUP is a software package developed by the Swiss Federal Institute of Aquatic Science and Technology for calibrating the SWAT model. In SWAT-CUP, five algorithms can be selected to calibrate the parameters, namely the SUFI2, GLUE, ParaSol, MCMC and PSO algorithms. The SUFI2 algorithm has the fastest operation process and high accuracy [36]. Therefore, the SUFI2 algorithm was selected to describe the uncertainty of the parameters based on a uniform distribution assumption. This algorithm expresses the result as a 95 percent prediction uncertainty band, called 95PPU [37]. SUFI2 initially assumes large uncertainty in the parameters, so that the 95PPU covers all the observed data. This uncertainty is reduced in subsequent rounds until the difference between the upper and lower bounds of the 95PPU (the 97.5% and 2.5% levels) is minimized and the 95PPU includes 80–100% of the observations [38]. The SUFI2 algorithm uses a Latin hypercube sampling approach [39] in which n parameters are combined over a sufficient number of simulations (500–1000 runs), and the simulations are then assessed using an objective function [38]. SWAT-CUP provides several objective functions for model calibration [40].
Referring to the existing studies [41,42], nine parameters, namely CN2.mgt, ALPHA_BF.gw, GW_DELAY.gw, GWQMN.gw, GW_REVAP.gw, CH_K2.rte, HRU_SLP.hru, SOL_AWC().sol and REVAPMN.gw, were selected for calibration by the SUFI2 algorithm. The optimal values of the nine parameters in the SWAT model are shown in Table 3. Three indicators were selected to evaluate the accuracy of the simulation results in the calibration period and validation period: relative error (RE), determination coefficient (R²) and Nash-Sutcliffe coefficient (NS), which are calculated as follows:
$$RE = \frac{\bar{Q} - \bar{P}}{\bar{P}} \times 100\%$$
$$R^2 = \frac{\left[\sum_i \left(Q_i - \bar{Q}\right)\left(P_i - \bar{P}\right)\right]^2}{\sum_i \left(Q_i - \bar{Q}\right)^2 \sum_i \left(P_i - \bar{P}\right)^2}$$
$$NS = 1 - \frac{\sum_i \left(Q_i - P_i\right)^2}{\sum_i \left(P_i - \bar{P}\right)^2}$$
where $Q_i$ represents the runoff simulation value at time $i$, $\bar{Q}$ represents the average value of all runoff simulation values, $P_i$ represents the runoff observation value at time $i$, and $\bar{P}$ represents the average value of all the observed values.
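The three evaluation indicators can be computed as in the following Python sketch. It assumes the commonly used definitions (RE as the relative difference between total simulated and total observed runoff, R² as the squared Pearson correlation, and the standard Nash-Sutcliffe efficiency); the sign convention of RE and the function name are assumptions.

```python
import numpy as np

def evaluate_runoff(sim, obs):
    """Return (RE in %, R2, NS) for simulated vs. observed monthly runoff series."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Relative error: positive when the model overestimates total runoff.
    re = (sim.sum() - obs.sum()) / obs.sum() * 100.0
    # Determination coefficient: squared linear correlation.
    cov = np.sum((sim - sim.mean()) * (obs - obs.mean()))
    r2 = cov**2 / (np.sum((sim - sim.mean())**2) * np.sum((obs - obs.mean())**2))
    # Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean.
    ns = 1.0 - np.sum((sim - obs)**2) / np.sum((obs - obs.mean())**2)
    return re, r2, ns
```

A simulation that is shifted by a constant offset illustrates why the three indicators are used together: R² stays at 1 because the pattern is reproduced, while RE and NS both reveal the bias.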

Future Land Use Simulation of the Yellow River Basin
In this study, the land cover map of the Yellow River Basin in 2050 is simulated with IDRISI 17.2 software, which is developed by Clark University. The software includes more than 300 practical and professional modules, such as remote sensing image processing, GIS analysis, decision analysis, spatial analysis, land use change analysis, global change monitoring, suitability assessment mapping, geostatistical analysis, and cellular automata land dynamic change trend prediction. The main steps are as follows:
(1) Taking [...]
(2) The transition suitability maps are generated for the six land cover classes by the Multi-Objective Decision Wizard module (Figure 7). The rules are as follows:
① Built land: elevation <500 m is the most suitable, elevation of 500–1500 m is suitable, and elevation >1500 m indicates poor suitability; slope between 0° and 5° is the most suitable, slope between 5° and 15° is suitable, and slope >15° indicates poor suitability; distance to road <500 m is the most suitable, distance to road between 500 m and 1000 m is suitable, and distance to road >1 km indicates poor suitability.
② Farmland: elevation <500 m is the most suitable, elevation of 500–1500 m is suitable, and elevation >1000 m indicates poor suitability; slope within 0–5° is the most suitable, slope within 5–15° indicates poor suitability, and slope >15° is unsuitable; distance to highway <1 km is the most suitable, distance to road between 1 km and 3 km is suitable, and distance to road >3 km indicates weak suitability.
③ The conversion of water to other land types is prohibited.
④ Elevation is the main constraint factor of forestland and grassland. Unused land can be converted to other land types at will.
Finally, the transfer suitability maps of the six land types are generated (Figure 8);
[...] (Figure 10), and the resulting Kappa coefficient is 0.82, indicating that the CA-Markov model can simulate the land cover change in the Yellow River Basin well;
(5) Taking the land use data of 2015 as the initial data, the probability transfer matrix from 1980 to 2015 (Table 5) and the suitability maps (Figure 8) are input into the CA-Markov model. The 5 × 5 neighborhood filter is selected as the neighborhood definition. The number of cycles is set to 35. The land use data of the Yellow River Basin in 2050 can then be simulated (Figure 11).
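To make the suitability rules concrete, the built-land thresholds listed above can be written as a small classifier. This is only a sketch of the threshold logic: how IDRISI weights and combines the individual factors into one suitability map is not described here, so the function returns one class per factor. The function names and the treatment of values lying exactly on a boundary are assumptions.

```python
def classify(value, most_suitable_below, suitable_up_to):
    """Map a factor value to one of three suitability classes."""
    if value < most_suitable_below:
        return "most suitable"
    if value <= suitable_up_to:      # assumption: boundary values count as suitable
        return "suitable"
    return "poor"

def built_land_suitability(elevation_m, slope_deg, road_dist_m):
    """Per-factor suitability of a cell for built land, using the thresholds in the text."""
    return {
        "elevation": classify(elevation_m, 500, 1500),     # <500 m, 500-1500 m, >1500 m
        "slope": classify(slope_deg, 5, 15),               # <5 deg, 5-15 deg, >15 deg
        "road_distance": classify(road_dist_m, 500, 1000), # <500 m, 500-1000 m, >1 km
    }
```

A cell at 300 m elevation with a 3° slope and 200 m from a road would be rated most suitable on all three factors.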

Future Runoff Simulation of the Yellow River Basin
In this section, we focus on the average of the simulated flow results based on the five global climate models. The SWAT model is used to simulate the runoff of the Yellow River Basin in the middle of the 21st century (2040–2060) under the combined action of Representative Concentration Pathway (RCP) scenarios and land use change (LUC) scenarios (RCP-LUC). The differences between the runoff of the Yellow River Basin in the future period (2040–2060) and the base period (1970–1990) are analyzed from four aspects: average runoff, seasonal runoff proportion, monthly runoff proportion and high extreme runoff (Q95). To obtain Q95, the monthly runoff values are ranked from low to high, and the value ranked at 95% is the high extreme runoff. Q95 therefore corresponds to the 95th percentile of the monthly flow distribution, which can represent the occurrence of flood disasters in a basin [47][48][49].
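The Q95 indicator can be computed directly as a percentile of the monthly runoff series. The sketch below uses NumPy's default linear interpolation between ranked values, which is an assumption; a strict rank-based pick would give slightly different numbers for short series. The function names are hypothetical.

```python
import numpy as np

def q95(monthly_runoff):
    """95th percentile of a monthly runoff series (high extreme runoff)."""
    return np.percentile(np.asarray(monthly_runoff, dtype=float), 95)

def q95_change(base_runoff, future_runoff):
    """Relative change (%) of Q95 in the future period compared with the base period."""
    base = q95(base_runoff)
    return (q95(future_runoff) - base) / base * 100.0
```

Applied to the base-period and future-period series of one station, a positive `q95_change` would indicate larger high flows and hence a higher flood probability under that scenario.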

Average Runoff Change
The differences between the average runoff of the four hydrological stations in the middle of the 21st century (2040–2060) and the base period (1970–1990) are displayed in Figure 12. The average runoff of Tangnaihai, Toudaoguai, Sanmenxia and Lijin stations under the four RCP-LUC scenarios tends to increase compared with the base period. Comparing the different RCP-LUC scenarios, the order of average runoff increment of Tangnaihai and Toudaoguai hydrological stations is RCP4.5-LUC > RCP8.5-LUC > RCP6.0-LUC > RCP2.6-LUC; the order of average runoff increment of Sanmenxia and Lijin hydrological stations is RCP8.5-LUC > RCP4.5-LUC > RCP2.6-LUC > RCP6.0-LUC.
Compared with the future runoff simulation results under RCP scenarios in the existing studies [11,12], the average runoff in the future under the RCP-LUC scenario is greater, which indicates that the land use data input into the SWAT model have a great impact on future runoff simulation results.

Seasonal Runoff Proportion Change
Table 7 displays the differences between the proportion of seasonal runoff in the future period (2040–2060) and the base period (1970–1990). Compared with the base period, the runoff proportion of Tangnaihai and Sanmenxia hydrological stations decreases in spring and increases in summer, autumn and winter. The runoff proportion of Toudaoguai hydrological station decreases in spring and summer and increases in autumn and winter. The runoff proportion of Lijin station decreases in spring and autumn and increases in summer and winter. To sum up, under the RCP-LUC scenario, the spring runoff proportion of the four hydrological stations shows a decreasing trend, and the winter runoff proportion tends to increase. The trends of the summer and autumn runoff proportions differ among the four hydrological stations.

Monthly Runoff Proportion Change
The differences between the monthly runoff proportion of the four hydrological stations in the future period (2040–2060) and the base period (1970–1990) are shown in Table 8. Compared with the base period, the monthly runoff proportion of all four hydrological stations in the Yellow River Basin changes in the middle of the 21st century. The monthly runoff proportion of Tangnaihai, Toudaoguai, Sanmenxia and Lijin hydrological stations in the middle of the 21st century (2040–2060) tends to decrease in April, May, June, July and October, and tends to increase in January, February, August, September and December.
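The monthly and seasonal runoff proportions discussed above can be derived from a table of monthly runoff as in the following sketch. It assumes a (years × 12) array ordered January to December, meteorological seasons (spring = March to May, and so on) and equal weighting of months regardless of their length; none of these choices is stated in the text, and the function names are hypothetical.

```python
import numpy as np

# Assumed season definition (month numbers, 1 = January).
SEASONS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}

def monthly_proportion(runoff):
    """Share (%) of each calendar month in the mean annual runoff."""
    monthly_mean = np.asarray(runoff, dtype=float).mean(axis=0)
    return monthly_mean / monthly_mean.sum() * 100.0

def seasonal_proportion(runoff):
    """Share (%) of each season, summed from the monthly shares."""
    prop = monthly_proportion(runoff)
    return {name: float(sum(prop[m - 1] for m in months))
            for name, months in SEASONS.items()}

def proportion_change(base_runoff, future_runoff):
    """Change in monthly share (percentage points), future minus base period."""
    return monthly_proportion(future_runoff) - monthly_proportion(base_runoff)
```

A negative entry returned by `proportion_change` for a given month would correspond to a decreasing runoff proportion for that month, as reported for April to July and October.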

Q95 Extreme Runoff Change
The differences between the Q95 extreme runoff of the four hydrological stations in the middle of the 21st century (2040–2060) and the base period (1970–1990) are shown in Figure 13. Compared with the base period, the Q95 extreme runoff of Tangnaihai, Toudaoguai, Sanmenxia and Lijin stations all increases markedly, indicating that the probability of flood disasters in the Yellow River Basin will increase in the future period.
In summary, Tangnaihai and Lijin hydrological stations have the highest probability of flood disasters under the RCP4.5-LUC scenario, while Sanmenxia and Lijin hydrological stations have the highest probability of flood disasters under the RCP8.5-LUC scenario.
Compared with the future runoff simulation results under RCP scenarios in the existing studies [11,12], the Q95 extreme runoff in the future under the RCP-LUC scenario is greater, which indicates that existing studies have underestimated the risk of flood disasters in the future.

Uncertainty of SWAT and CA-Markov Model on Runoff Simulation
Although the SWAT and CA-Markov models have a rigorous theoretical basis and a complex structure, some uncertainty remains when mathematical formulas are used to describe the processes of runoff generation and land change. In follow-up studies, to reduce the effect of model uncertainty on the future runoff simulation results, multiple models can be used to simulate the future runoff change (such as the VIC and SWAT models) and the future land use maps (such as the CA-Markov and FLUS models) of the Yellow River Basin.

Uncertainty of Parameter Calibration on Runoff Simulation
In this paper, the model is calibrated against the monthly runoff data of Tangnaihai, Toudaoguai, Sanmenxia and Lijin in the Yellow River Basin using the SUFI2 algorithm in SWAT-CUP software. However, some discrepancy remains between the simulation results and the actual observation data. During the validation period (1981–1990), the relative errors between the observed and simulated monthly flow at Tangnaihai, Toudaoguai, Sanmenxia and Lijin hydrological stations are −4.54%, 15.67%, 13.02% and 15.72%, respectively. These errors will lead to some uncertainty in the future runoff simulation results. In follow-up studies, we need to collect more data, such as urban water use, industrial water use, agricultural water use and water conservancy dam construction, so as to establish a more accurate SWAT model.

Uncertainty of Global Climate Models on Runoff Simulation
For the study of climate change on river runoff, the main uncertainty is the global climate model data [50]. These uncertainties are highly correlated with the corresponding structure, parameters and spatial resolution of the global climate model [51]. The differences between the average runoff and Q95 extreme runoff of the five global climate models in the middle of the 21st century and the base period are shown in Table 9.
The variation ranges of the average runoff and Q95 extreme runoff of Tangnaihai, Toudaoguai, Sanmenxia and Lijin hydrological stations are very large, which shows that different climate models lead to great differences in the future runoff simulation results of the Yellow River Basin. However, using data from multiple global climate models can reduce the uncertainty of future runoff simulation [50,51]. In this study, only five sets of global climate model data are used as the basic data. In follow-up studies, we will use more climate model data to estimate future runoff.

Uncertainty of Land Use Simulation on Runoff Simulation
The CROSSTAB module of IDRISI 17.2 software was used to calculate the Kappa coefficient, and the result is 0.82. Although this exceeds the 0.75 threshold, the simulated map does not fully agree with the actual map, which will lead to some uncertainty in the future land use simulation results. In follow-up studies, we will establish a more accurate CA-Markov model to simulate land use data of the Yellow River Basin. In addition, it is necessary to simulate the future land use under the scenarios of ecological protection, status quo continuation and rapid urbanization, and to analyze the impact of different land use change scenarios on future runoff.

Conclusions
This paper simulated the future runoff change in the Yellow River Basin under the combined effect of land use and climate change (RCP-LUC) based on CA-Markov and SWAT, and analyzed the changes in the average runoff, Q95 extreme runoff and intra-annual runoff distribution under the RCP-LUC scenario. The conclusions are as follows: (1) Compared with the base period (1970–1990), the average runoff of the Yellow River Basin shows an increasing trend in the future period (2040–2060), and the probability of flood disasters also tends to increase.
(2) Compared with the future runoff simulation results under the RCP scenario in the existing studies [11,12], it is found that the average runoff and Q95 extreme runoff in the future under the RCP-LUC scenario are greater, which indicates that existing studies have underestimated the risk of flood disasters in the future.
(3) Compared with the base period, the spring runoff proportion of the four hydrological stations in the Yellow River Basin in the future period (2040-2060) shows a decreasing trend, which increases the risk of drought in spring. The winter runoff proportion tends to increase, and the variation trend of runoff proportion in summer and autumn at four