Article

Partial Least Squares Regression for Determining the Control Factors for Runoff and Suspended Sediment Yield during Rainfall Events

1
State Key Laboratory of Soil Erosion and Dryland Farming on the Loess Plateau, Northwest A & F University, Yangling 712100, China
2
Institute of Soil and Water Conservation of Chinese Academy of Sciences and Ministry of Water Resources, Yangling 712100, China
3
College of Resources and Environment, Huazhong Agricultural University, Wuhan 430070, China
*
Author to whom correspondence should be addressed.
Water 2015, 7(7), 3925-3942; https://doi.org/10.3390/w7073925
Submission received: 20 May 2015 / Revised: 6 July 2015 / Accepted: 6 July 2015 / Published: 14 July 2015

Abstract

Multivariate statistics are commonly used to identify the factors that control the dynamics of runoff or sediment yields during hydrological processes. However, one issue with the use of conventional statistical methods to address relationships between variables and runoff or sediment yield is multicollinearity. The main objectives of this study were to apply a method for effectively identifying runoff and sediment control factors during hydrological processes and apply that method to a case study. The method combines the clustering approach and partial least squares regression (PLSR) models. The case study was conducted in a mountainous watershed in the Three Gorges Area. A total of 29 flood events in three hydrological years in areas with different land uses were obtained. In total, fourteen related variables were separated from hydrographs using the classical hydrograph separation method. Twenty-nine rainfall events were classified into two rainfall regimes (heavy Rainfall Regime I and moderate Rainfall Regime II) based on rainfall characteristics and K-means clustering. Four separate PLSR models were constructed to identify the main variables that control runoff and sediment yield for the two rainfall regimes. For Rainfall Regime I, the dominant first-order factors affecting the changes in sediment yield in our study were all of the four rainfall-related variables, flood peak discharge, maximum flood suspended sediment concentration, runoff, and the percentages of forest and farmland. For Rainfall Regime II, antecedent condition-related variables have more effects on both runoff and sediment yield than in Rainfall Regime I. The results suggest that the different control factors of the two rainfall regimes are determined by the rainfall characteristics and thus different runoff mechanisms.

1. Introduction

Soil erosion affects soil productivity and sustainable agriculture. Erosion by water strips the fertile topsoil on site, degrades water quality, and clogs streams, rivers, and reservoirs by transporting sediments off site [1]. Suspended sediment yields represent the sum of the erosion produced by all active sources within a watershed [2]. Analysis of the relationships between sediment transport, rainfall, and runoff characteristics can facilitate the elucidation of the factors and processes determining sediment responses [3].
Many recent studies have evaluated factors that control hydrological and sediment responses at an inter-event scale. Oeurng et al. [4] use a Pearson correlation matrix and factorial analysis to assess the relationships between precipitation, discharge, and suspended sediment transport to explain the hydrological and sedimentological responses in a catchment of France. López-Tarazón et al. [5] use multiple regression equations derived from Pearson correlation analysis to describe the relationships between rainfall, runoff, and sediment transport in a mountainous catchment. Wine et al. [6] apply stepwise regression to determine control factors of runoff in watersheds of the Southern Great Plains. When analyzing the relationship between rainfall, runoff, and sediment yield, multivariate statistics are commonly used to relate control factors to the dynamics of discharge and sediment yield [2,7,8,9].
These studies enable soil and water conservationists to understand the complexities of hydrological processes. However, these statistical approaches present particular analytical challenges despite their great potential [10]. Many control factors are highly correlated, which can result in redundancy. Thus, the application of these statistical approaches is somewhat limited if inappropriate approaches or unrepresentative variables are selected. Canonical correlation requires that the ratio of the number of predictors to the sample size be at least 0.025–0.05 [10]. Ordinary regression is hindered by limitations imposed by sample size (the number of observations). Classical multiple regression requires a large sample size relative to the number of predictors [11]. The limitations of traditional multivariate regression approaches in handling multi-collinear and noisy data can be overcome by applying techniques based on multivariate statistical projection [12]. Therefore, the influence of changes in each of the hydrologic variables on runoff and sediment yield must be investigated to enable more effective and accurate watershed management and prediction of the hydrological consequences of rainfall.
Knowledge of sediment yield from small watersheds is critical to understanding the linkage between soil erosion processes on hill slopes and suspended sediment transport in large rivers [13]. In this study, data on sediment yield for a small watershed in the Three Gorges Area (TGA) were collected. The Three Gorges Project (TGP) on the Yangtze River in China is the world’s largest hydropower complex project. Following the construction of the Three Gorges Dam (1994–2009), millions of farmers resettled in the surrounding mountain areas and cultivated marginal lands, which are largely on steep slopes with soil of poor structure. The TGA, which refers to the riparian parts along the Yangtze valley between Yichang and Chongqing (Figure 1), contains a substantial amount of arable land on steep slopes known to be susceptible to soil erosion. Sediment yield in this area is estimated to be approximately 700 t·km−2·year−1 [14]. Soil erosion is a serious issue in this area because of long-term anthropogenic pressure, including overuse and inappropriate development [15].
Figure 1. Location of the study watershed in the TGA of China.
The objectives of this paper are: (1) to quantify the contribution of control variables to runoff and sediment yield using partial least squares regression (PLSR); and (2) to investigate the effects of land use change on runoff and suspended sediment at an inter-event scale on a mountainous watershed in the TGA.

2. Study Area and Methods

2.1. Study Area

Integrated small watershed management (ISWM) for soil conservation has developed rapidly in the TGA since the 1990s. ISWM has been conducted in more than 5000 small watersheds with a total area of 96,000 km2, and the central government has invested 15.2 billion RMB (about 2.5 billion USD) in this project [16]. As a part of the ISWM program, the Wangjiaqiao watershed was selected as a monitoring site. A national gauging station was constructed at the outlet of the watershed in 1989. Gauging records of discharge and sediment have provided useful information for decision makers and planners since the 1990s.
The Wangjiaqiao watershed lies in Zigui County of Hubei Province, China (31°5′ N–31°9′ N, 110°40′ E–110°43′ E). The watershed is approximately 50 km northwest of the Three Gorges Dam and covers an area of 1670 ha (Figure 1). Elevations within the watershed range from 184 to 1180 m; slopes range from 2° to 58°, with an average of 23°. Two main soil great groups occur in the study watershed, namely, purple soil derived from purple sandy shale and paddy soil developed from the purple soil. According to the Soil Taxonomy of the USDA, the purple soil and paddy soil are classified as Entisols and Aquepts, respectively. The climate is subtropical, with mean temperatures between 11 and 18 °C. Annual precipitation averages 1016 mm, of which 70% occurs between May and September. Previous studies of this watershed are reported in Fang et al. [17] and Fang et al. [18].

2.2. Field Surveys and Land Use

Field surveys were conducted in 1995, 2000, and 2005. The watershed topographic map (scale 1:10,000) was used in combination with 1995 and 1999 aerial photographs and 2005 SPOT5 imagery. The land use types were delineated on the photographs and verified in the field. In this watershed, land use is mainly a function of elevation and topography. The remnant forest patches exist primarily on steep, inaccessible peaks and slopes. Little natural vegetation is observed, and most areas are covered by secondary vegetation under human influence. The main agricultural crops are rice (Oryza sativa L.), maize (Zea mays L.), and wheat (Triticum aestivum L.). The streams in the Wangjiaqiao watershed have a trellis drainage pattern, and the length of the main channel is approximately 6500 m. Due to the implementation of ISWM for soil conservation in the TGA in the 1990s, land use was altered between 1995 and 2005 in the Wangjiaqiao watershed. Table 1 provides the areas of various types of land use and the corresponding percentages. In 1995, forest covered 48.7% of the study area, whereas farmland covered 43.1%. The other land use types were relatively minor and consisted of shrub land (3.2%), rural residential land (4.3%), and water bodies (0.7%). During the 1995–2005 period, some steep lands with slope gradients of more than 25° were converted to forest as part of this program. As a result, forest increased to 56.9% in 2000 and to 66.3% in 2005, whereas farmland decreased to 34.6% and 24.1%, respectively (Figure 2).
Table 1. Changes in different land use categories as a percentage of the total watershed area (16.7 km2).

| Land Use | 1995 (% of area) | 2000 (% of area) | 2005 (% of area) | Change 1995–2000 (%) | Change 2000–2005 (%) | Change 1995–2005 (%) |
|---|---|---|---|---|---|---|
| Forest | 48.7 | 56.9 | 66.3 | 8.2 | 9.4 | 17.6 |
| Farmland | 43.1 | 34.6 | 24.1 | −8.5 | −10.5 | −19 |
| Shrub land | 3.2 | 3.1 | 3.0 | −0.1 | −0.1 | −0.2 |
| Rural residential land | 4.3 | 4.6 | 5.3 | 0.3 | 0.7 | 1 |
| Water body | 0.7 | 0.8 | 1.3 | 0.1 | 0.5 | 0.6 |
Figure 2. Land use change in the study watershed for 1995, 2000, and 2005.

2.3. Field Monitoring

A set of instruments consisting of a continuously recording rain gauge, water-level stage recorder, and silt samplers (bottle type) were used to record rainfall, stream flow, and sediment flow, respectively. The water stage was measured every 15 min and then transformed into discharge via the calibrated rating curve obtained through periodic flow measurements. Suspended sediment concentrations (SSCs) were determined by the gravimetric method. Water samples were vacuum filtered through a 0.45 μm filter, and the residue was oven dried at 105 °C for 24 h. The weight of each dried residue and the sample volume were used to determine the SSC (g·m−3). The suspended sediment yield (TL) was then calculated from the SSC and water discharge (Q) data. Watershed runoff and rainfall data have been collected since 1989.

2.4. Data Processes

In this study, runoff was separated into storm flow and base flow using the classical hydrograph separation method [19]. Despite its arbitrary nature (similar to all other methods), this hydrograph separation method was used in this study to characterize the response of the catchment to a rainstorm, and no interpretation in terms of runoff processes was derived from the separation [20]. Equipment malfunctions prevented complete monitoring of all storms on several occasions. The land use pattern of the study watershed varied continuously during the 1990–2005 period. Hydrograph separation was conducted on 29 events occurring in 1995, 2000, and 2004; for these events, we had reasonably complete records of sediment concentrations and had conducted reconnaissance field surveys. The within-event rainfall data for 2005 were missing, so data from 2004 were used instead. Floods were identified when the increase in stream discharge exceeded 1.5 times the base flow recorded at the beginning of the rainfall event [21]. For each rainfall-runoff event, the characteristics of individual storms were evaluated based on their erosive characteristics. The flood events were subsequently characterized using three groups of variables (Table 2).
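The flood identification criterion can be sketched in a few lines. This is a minimal illustration, assuming the criterion is read as "peak discharge minus initial base flow exceeds 1.5 times the initial base flow"; the function name, the argument names, and the sample hydrographs are hypothetical and not taken from the study.

```python
def is_flood(discharge_m3s, base_flow_m3s, factor=1.5):
    """Return True when the rise in discharge above the initial base flow
    exceeds `factor` times that base flow (one reading of the criterion)."""
    if base_flow_m3s <= 0:
        raise ValueError("base flow must be positive")
    rise = max(discharge_m3s) - base_flow_m3s
    return rise > factor * base_flow_m3s


# Hypothetical 15 min discharge readings (m3/s) for two rainfall events.
event_a = [0.030, 0.050, 0.120, 0.060]   # rise of 0.090 > 0.045
event_b = [0.030, 0.040, 0.050, 0.035]   # rise of 0.020 < 0.045
flood_a = is_flood(event_a, base_flow_m3s=0.030)
flood_b = is_flood(event_b, base_flow_m3s=0.030)
```

Under this reading, only the first hypothetical event would be retained for hydrograph separation.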
Table 2. Flood variables and associated abbreviations used in the statistical analysis of the relationship between rainfall, runoff, and suspended sediment transport.

| Variables | Variable | Abbreviation | Unit |
|---|---|---|---|
| Rainfall related variables | Total precipitation | P | mm |
| | Duration | D | min |
| | Maximum 30 min rainfall depth | I30 | mm |
| | Mean rainfall intensity | Im | mm·h−1 |
| | Runoff duration | Dr | mm |
| | Antecedent precipitation 1 day before | P1D | mm |
| | Antecedent precipitation 5 days before | P5D | mm |
| Runoff related variables | Antecedent precipitation 10 days before | P10D | mm |
| | Runoff | R | m3 |
| | Base flow | BF | m3·s−1 |
| | Total discharge | TQ | m3·s−1 |
| | Flood peak discharge | Qmax | m3·s−1 |
| | Duration of runoff | Dq | s |
| Suspended sediment-related variables | Maximum flood suspended sediment concentration | SSCmax | g·m−3 |
| | Total suspended sediment load | TL | t |
Some variables were calculated using the following equations:

$$R = \mathrm{TQ} - \mathrm{BF} \times D_q$$

where R, TQ, BF, and Dq are the runoff, total discharge, base flow, and duration of discharge, respectively.

$$\mathrm{TL}_i = \mathrm{SSC}_i \times Q_i$$

$$\mathrm{TL} = \sum_{i=1}^{n} \mathrm{TL}_i$$

where TLi, SSCi, and Qi are the suspended sediment yield, suspended sediment concentration, and discharge, respectively, during period i, and n is the number of periods in the event.
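The two calculations can be illustrated with a short sketch. It assumes that the runoff equation is read as R = TQ − BF × Dq (total discharge volume minus the base flow volume accumulated over the flood duration) and that Qi is the water volume discharged during period i, so that SSCi × Qi is a mass in grams. Both readings, the function names, and the sample numbers are assumptions made for illustration.

```python
def storm_runoff_m3(total_discharge_m3, base_flow_m3s, duration_s):
    """R = TQ - BF * Dq: subtract the base flow volume from the total volume."""
    return total_discharge_m3 - base_flow_m3s * duration_s


def total_sediment_load_kg(ssc_g_per_m3, volume_m3):
    """TL = sum_i SSC_i * Q_i, with Q_i taken as the volume (m3) of period i."""
    if len(ssc_g_per_m3) != len(volume_m3):
        raise ValueError("SSC and volume series must have the same length")
    grams = sum(c * v for c, v in zip(ssc_g_per_m3, volume_m3))
    return grams / 1000.0  # g -> kg


# Hypothetical event: 5000 m3 total discharge, 0.5 m3/s base flow, 1 h duration.
runoff = storm_runoff_m3(5000.0, 0.5, 3600.0)
# Hypothetical SSC (g/m3) and discharged volume (m3) for three periods.
load = total_sediment_load_kg([200.0, 500.0, 300.0], [100.0, 400.0, 200.0])
```

If Qi is instead a discharge rate in m3·s−1, each product would also need to be multiplied by the length of the period (900 s for the 15 min stage readings).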

2.5. Clustering Approach and Partial Least Squares Regression (PLSR)

Runoff and sediment generation varies considerably depending on rainfall type. Many studies have suggested that local storm patterns are important for determining runoff and sediment yield [22,23]. Rainfall parameters such as depth, duration, and intensity play a key role in inducing various water erosion rates [24,25]. Thus, prior to PLSR analysis, we used a clustering approach to distinguish rainfall regimes. The clustering was performed with the SPSS 13.0 statistical software package.
Partial least squares regression is a robust multivariate regression method that allows users to perform a wide range of analyses [26]. Partial least squares regression provides a quick overview of the main systematic types of variation in data from complex systems and helps to identify mistakes in the input data. Applied as a multivariate calibration of one dependent variable vs. many independent variables, PLSR is suitable for selectivity enhancements of analytical instruments [26]. PLSR is a method for relating two data matrices, X and Y, by a linear multivariate model, but goes beyond traditional regression in that it also models the structure of X and Y. PLSR derives its usefulness from its ability to analyze data with many, noisy, collinear, and even incomplete variables in both X and Y [27]. Details on the theory, principles, and application of PLSR can be found in the literature [28]. In this study, PLSR was performed with SIMCA-P (Umetrics AB, Umeå, Sweden). Four separate PLSR models were constructed to identify the main variables that control runoff and sediment yield for the two rainfall regimes. For the runoff models of the two rainfall regimes, runoff (R) is the dependent variable and the other variables in Table 3 are the independent variables. Similarly, for the sediment load models, total suspended sediment load (TL) is the dependent variable.
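To make the idea of relating X and Y through a few latent components concrete, the following is a minimal sketch of a single-response PLSR (the NIPALS PLS1 scheme) in plain Python, together with the variable importance for the projection (VIP). It is not the SIMCA-P implementation used in the study; the function names and the small data set are hypothetical, and the unit-variance scaling is an assumption.

```python
import math


def _standardize(col):
    """Center a column and scale it to unit sample variance."""
    m = sum(col) / len(col)
    s = math.sqrt(sum((v - m) ** 2 for v in col) / (len(col) - 1))
    return [(v - m) / s for v in col]


def pls1(x_cols, y, n_components):
    """NIPALS PLS1. x_cols is a list of predictor columns, y the response.
    Returns (weights per component, VIP per predictor)."""
    X = [_standardize(c) for c in x_cols]
    r = _standardize(y)
    n, p = len(y), len(x_cols)
    weights, ss = [], []
    for _ in range(n_components):
        # Weights: covariance of each (deflated) predictor with the response.
        w = [sum(X[j][i] * r[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores, X loadings, and the regression of the response on the scores.
        t = [sum(w[j] * X[j][i] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(X[j][i] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(r[i] * t[i] for i in range(n)) / tt
        # Deflate X and the response before extracting the next component.
        X = [[X[j][i] - t[i] * load[j] for i in range(n)] for j in range(p)]
        r = [r[i] - q * t[i] for i in range(n)]
        weights.append(w)
        ss.append(q * q * tt)  # response variation explained by this component
    total = sum(ss)
    vip = [math.sqrt(p * sum(ss[a] * weights[a][j] ** 2
                             for a in range(n_components)) / total)
           for j in range(p)]
    return weights, vip


# Hypothetical data: x1 and x2 are collinear and drive y; x3 is unrelated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
x3 = [5.0, 3.0, 6.0, 2.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]
W, vip = pls1([x1, x2, x3], y, n_components=2)
```

Because each weight vector has unit length, the squared VIP values sum to the number of predictors, which is why a VIP of about one is a natural cut-off between more and less influential predictors.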
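Table 3. General characteristics of the analyzed rainfall-runoff events recorded in the Wangjiaqiao watershed.

| Date | D (min) | P (mm) | I30 (mm) | Im (mm/h) | Qmax (m3/s) | R (m3) | TQ (m3) | SSCmax (g/m3) | TL (kg) | P1D (mm) | P5D (mm) | P10D (mm) | BF (m3/s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1995.4.16 | 1380 | 14.6 | 3.8 | 0.63 | 1.16 | 12,407 | 45,507 | 520 | 1056 | 1.6 | 13.3 | 24.2 | 0.034 |
| 1995.5.12 | 280 | 19 | 16.8 | 4.07 | 0.97 | 5296 | 217,852 | 1590 | 1345 | 0 | 5.7 | 6.4 | 0.032 |
| 1995.5.19 | 1420 | 27.1 | 4.7 | 1.15 | 4.05 | 155,909 | 170,360 | 1310 | 105,357 | 8 | 8 | 37.2 | 0.036 |
| 1995.6.1 | 610 | 23.6 | 3.8 | 2.32 | 0.21 | 32,918 | 35,269 | 610 | 7618 | 0 | 0.1 | 0.1 | 0.028 |
| 1995.6.5 | 780 | 29.5 | 4.7 | 2.27 | 4.05 | 244,089 | 217,852 | 1120 | 59,781 | 0 | 32.3 | 32.4 | 0.039 |
| 1995.6.13 | 1790 | 21.9 | 1.4 | 0.73 | 1.58 | 57,296 | 53,469 | 240 | 5129 | 0 | 54.1 | 89 | 0.033 |
| 1995.6.20 | 1420 | 18.7 | 2.8 | 0.79 | 3.33 | 73,155 | 95,008 | 400 | 18,391 | 0 | 3.9 | 26.1 | 0.034 |
| 1995.6.21 | 825 | 12.1 | 3.5 | 0.88 | 3.41 | 146,189 | 160,937 | 150 | 23,048 | 18.7 | 22.6 | 44.8 | 0.857 |
| 1995.10.2 | 1250 | 31.4 | 5.8 | 1.51 | 3.13 | 72,058 | 73,796 | 370 | 22,701 | 24.6 | 24.6 | 41.8 | 0.033 |
| 1995.10.13 | 145 | 16.5 | 9.5 | 6.83 | 1.20 | 10,022 | 15,731 | 950 | 73 | 0.4 | 0.4 | 13.8 | 0.039 |
| 1995.10.17 | 480 | 24.1 | 3.2 | 3.01 | 3.37 | 96,509 | 89,487 | 1360 | 44,569 | 1 | 17.9 | 20.8 | 0.053 |
| 1995.10.19 | 1430 | 41.2 | 3.7 | 1.73 | 5.50 | 311,040 | 561,755 | 1460 | 180,757 | 3 | 28.1 | 45 | 0.104 |
| 1995.10.22 | 1035 | 14.8 | 2.3 | 0.86 | 2.87 | 157,378 | 141,538 | 380 | 19,554 | 0 | 69.2 | 87.1 | 0.083 |
| 2000.5.15 | 710 | 28.6 | 3.0 | 2.42 | 0.08 | 4717 | 4001 | 370 | 744 | 0 | 0 | 21.7 | 0.012 |
| 2000.6.3 | 565 | 22 | 2.6 | 2.34 | 0.05 | 2972 | 2624 | 270 | 299 | 0 | 5.4 | 26 | 0.006 |
| 2000.6.5 | 720 | 15.2 | 3.9 | 1.27 | 0.88 | 42,025 | 42,571 | 3020 | 23,263 | 0 | 22.8 | 48 | 0.011 |
| 2000.6.26 | 600 | 18.4 | 16.4 | 1.84 | 0.16 | 7857 | 10,200 | 1110 | 2254 | 1.2 | 37.3 | 50.9 | 0.014 |
| 2000.7.1 | 1420 | 45.2 | 4.1 | 1.91 | 3.16 | 265,766 | 352,487 | 2810 | 162,234 | 0 | 3.2 | 58.6 | 0.012 |
| 2000.7.29 | 235 | 25.9 | 13.1 | 6.61 | 0.07 | 3335 | 3271 | 950 | 917 | 14.3 | 17.8 | 18.6 | 0.011 |
| 2000.8.2 | 1090 | 52.2 | 3.2 | 2.87 | 2.66 | 210,591 | 328,436 | 2170 | 450 | 0 | 40.2 | 43.7 | 0.012 |
| 2000.10.21 | 1195 | 22.6 | 2.6 | 1.13 | 2.32 | 165,335 | 177,404 | 670 | 7449 | 3.7 | 6.4 | 35.4 | 0.088 |
| 2000.10.24 | 1080 | 23.1 | 4.4 | 1.28 | 2.58 | 136,236 | 141,426 | 620 | 56,253 | 4.8 | 31.1 | 37.3 | 0.475 |
| 2000.10.25 | 665 | 10.2 | 3.2 | 0.92 | 5.59 | 126,144 | 972,990 | 18,200 | 42,791 | 23.1 | 50.5 | 60.4 | 0.635 |
| 2004.5.29 | 590 | 26 | 3.1 | 2.64 | 0.68 | 24,710 | 26,772 | 280 | 6053 | 0 | 40.5 | 45.9 | 0.030 |
| 2004.6.3 | 2005 | 72.7 | 10.1 | 2.18 | 5.44 | 263,002 | 265,446 | 5240 | 412,651 | 0 | 26 | 66.5 | 0.020 |
| 2004.8.3 | 300 | 31.1 | 20.1 | 6.22 | 0.68 | 16,278 | 39,391 | 3010 | 7631 | 0 | 2 | 2 | 0.010 |
| 2004.9.19 | 485 | 61 | 10.9 | 7.55 | 2.20 | 80,784 | 70,506 | 660 | 29,117 | 0 | 0 | 0.4 | 0.010 |
| 2004.9.24 | 225 | 39.2 | 3.0 | 10.45 | 2.43 | 177,984 | 125,341 | 490 | 33,074 | 0 | 63.5 | 63.5 | 0.850 |
| 2004.11.12 | 880 | 17.1 | 30.0 | 1.17 | 0.05 | 4640 | 4126 | 130 | 234 | 0 | 15.8 | 15.8 | 0.022 |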

3. Results

3.1. Characteristics of the Flood Events

Table 3 summarizes the general characteristics of the rainfall, runoff, suspended sediment, and antecedent conditions associated with the observed floods and variables as analyzed by statistical analysis.
The maximum amount of precipitation for a single event was 72.7 mm (during the event on 3 June 2004); most of the events were relatively small in magnitude. Only 10 events (34%) were greater than the average rainfall value (27.8 mm), whereas the remaining events were below the average. The mean intensity varied from 0.63 to 10.45 mm·h−1, and eight events were greater than the mean value (2.74 mm·h−1). The maximum 30 min intensity ranged from 1.3 to 30 mm; 24% exceeded 10 mm during the 30 min interval, and eight of the events were greater than the average value (6.89 mm). The antecedent rainfall values varied considerably, ranging from 0 to 24.6 mm, 0 to 69.2 mm, and 0.1 to 89.0 mm of precipitation during the 1-, 5-, and 10-day previous periods, respectively.
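The summary figures for event precipitation can be recomputed from the P column of Table 3. The sketch below uses the 29 values transcribed from that table (in table order); the variable names are illustrative.

```python
# Total precipitation P (mm) of the 29 events, transcribed from Table 3.
P_MM = [14.6, 19, 27.1, 23.6, 29.5, 21.9, 18.7, 12.1, 31.4, 16.5, 24.1, 41.2, 14.8,
        28.6, 22, 15.2, 18.4, 45.2, 25.9, 52.2, 22.6, 23.1, 10.2,
        26, 72.7, 31.1, 61, 39.2, 17.1]

mean_p = sum(P_MM) / len(P_MM)                       # average event rainfall
n_above = sum(1 for p in P_MM if p > mean_p)         # events above the average
share_above = 100.0 * n_above / len(P_MM)            # as a percentage
largest = max(P_MM)                                  # largest single event

print(f"mean = {mean_p:.1f} mm, above mean = {n_above} ({share_above:.0f}%), max = {largest} mm")
```

The same pattern applies to the mean intensity (Im) and the maximum 30 min rainfall depth (I30).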
Figure 3. Bivariate scatter plot matrix of selected event characteristics. Note: ** means very significant levels (p < 0.01), and * means significant levels (p < 0.05).
The runoff generated by rainfall varied between 3674 and 309,618 m3, with a mean value of 100,459 m3. Peak discharge ranged between 0.05 and 5.59 m3·s−1. The peak was greater than the mean value (2.20 m3·s−1) during 16 floods (55% of the total sample). The base flow level fluctuated from 0.006 to 0.857 m3·s−1. The maximum flood sediment concentrations varied from 130 to 18,200 g·m−3, with a mean concentration of 1740 g·m−3. The total suspended sediment load carried by 15 floods exceeded 167 t (representing a specific suspended sediment yield of 10 t·km−2). The maximum yield during a single flood occurred on 4 June 2004 and reached 412,651 kg. This yield was generated by 72.7 mm of precipitation, which created a flood peak discharge of 5.44 m3·s−1. These values illustrate the degree of geomorphic activity of the system and confirm the high sediment contribution and transport capacity of the channels in the Wangjiaqiao watershed, which are largely related to the availability of fine materials in the TGA and the accumulation of these materials along the main channel [15].
We generated a Pearson correlation matrix (Figure 3), which shows the linear correlation coefficients among the rainfall-, runoff-discharge-, sediment-, and antecedent condition-related variables. Peak flow (Qmax) was significantly correlated with R, TQ, SSCmax, TL, and P10D. The strongest correlation was between precipitation (TQ) and runoff (Qmax). Runoff was significantly correlated with TQ, SSCmax, P1D, P5D, P10D, and BF. The results confirmed that many variables were collinear.
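A Pearson correlation matrix of this kind can be sketched as follows. The helper names are hypothetical, and the sample uses only the first five events of Table 3 for brevity, so the coefficients it produces are not those of Figure 3.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (>= 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map every ordered pair of variable names to its correlation coefficient."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}


# First five events of Table 3 (P in mm, Qmax in m3/s, R in m3).
events = {
    "P": [14.6, 19, 27.1, 23.6, 29.5],
    "Qmax": [1.16, 0.97, 4.05, 0.21, 4.05],
    "R": [12407, 5296, 155909, 32918, 244089],
}
matrix = correlation_matrix(events)
```

High off-diagonal coefficients among the predictors are the collinearity that makes ordinary multiple regression unreliable and motivates the use of PLSR.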

3.2. Results of Clustering Approach

The 29 rainfall events were divided into two groups using K-means clustering. Three rainfall variables were used during this process: the depth (P), duration (D), and maximum 30 min rainfall intensity (I30) (Table 4). The general characteristics of the two rainfall regimes can be described as heavy and moderate rainfall (p < 0.0001). Compared with Rainfall Regime II (17 events), Rainfall Regime I (12 events) was composed of rainfall events with a high mean P and D and low I30.
Table 4. Statistical features of the different rainfall regimes.

| Rainfall Regimes | Variables | Mean | SD | Variation | Frequency (time) |
|---|---|---|---|---|---|
| Rainfall Regime I | P (mm) | 32.1 | 17.6 | 0.55 | 12 |
| | D (min) | 1376 | 288 | 0.21 | |
| | I30 (mm) | 4.1 | 2.2 | 0.55 | |
| | Runoff (m3) | 156,601 | 92,988 | 0.59 | |
| | TL (kg) | 82,665 | 121,676 | 1.47 | |
| Rainfall Regime II | P (mm) | 24.7 | 11.9 | 0.48 | 17 |
| | D (min) | 535 | 227 | 0.42 | |
| | I30 (mm) | 8.9 | 8.0 | 0.90 | |
| | Runoff (m3) | 60,829 | 73,380 | 1.21 | |
| | TL (kg) | 16,636 | 19,140 | 1.15 | |
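The two-regime split can be illustrated with a small K-means sketch on the (P, D, I30) values of Table 3. The study used SPSS 13.0; this plain-Python version of Lloyd's algorithm is only an illustration, with a deterministic starting rule chosen here for reproducibility. Whether the variables were standardized before clustering is not stated, so the sketch clusters the raw values, in which case the duration D (in minutes) dominates the distances.

```python
import math

# (P mm, D min, I30 mm) for the 29 events, transcribed from Table 3 in table order.
EVENTS = [
    (14.6, 1380, 3.8), (19, 280, 16.8), (27.1, 1420, 4.7), (23.6, 610, 3.8),
    (29.5, 780, 4.7), (21.9, 1790, 1.4), (18.7, 1420, 2.8), (12.1, 825, 3.5),
    (31.4, 1250, 5.8), (16.5, 145, 9.5), (24.1, 480, 3.2), (41.2, 1430, 3.7),
    (14.8, 1035, 2.3), (28.6, 710, 3.0), (22, 565, 2.6), (15.2, 720, 3.9),
    (18.4, 600, 16.4), (45.2, 1420, 4.1), (25.9, 235, 13.1), (52.2, 1090, 3.2),
    (22.6, 1195, 2.6), (23.1, 1080, 4.4), (10.2, 665, 3.2), (26, 590, 3.1),
    (72.7, 2005, 10.1), (31.1, 300, 20.1), (61, 485, 10.9), (39.2, 225, 3.0),
    (17.1, 880, 30.0),
]


def kmeans(rows, k=2, max_iter=100):
    """Lloyd's algorithm with a deterministic start: the first row, then the
    row farthest from the centers chosen so far (an illustrative choice)."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(math.dist(r, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(r, centers[j]))
                      for r in rows]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:               # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


labels, centers = kmeans(EVENTS, k=2)
sizes = [labels.count(j) for j in range(2)]
```

With these inputs the sketch separates 12 long-duration events from 17 shorter ones, and the cluster mean durations agree with the values of about 1376 and 535 min listed in Table 4.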

3.3. Results of PLSR Analysis

Many studies have demonstrated that land use type can change runoff and sediment yield [29]; thus, the percentages of forest and farmland during the study years were included in the PLSR models. In a PLSR model, the importance of a predictor for both the independent and dependent variables is given by the variable importance for the projection (VIP) [10,12]. Terms with large VIP values are the most relevant for explaining the dependent variable. To overcome the problem of over-fitting, the appropriate number of components of each PLSR model was determined by cross-validation to achieve an optimal balance between the explained variation in the response (R2) and the predictive ability of the model (goodness of prediction: Q2) [10].
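For reference, the quantities named above are commonly defined as follows. The source does not spell these formulas out, and the exact computation in SIMCA-P may differ in detail, so the notation here (p predictors, A components, weight w_ja of predictor j in component a, SS_a the variation in Y explained by component a, n observations, PRESS the cross-validated prediction error sum of squares, RSS the residual sum of squares, SS_Y the total sum of squares of the centered response) should be read as an assumption:

$$\mathrm{VIP}_j = \sqrt{\frac{p \sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert w_a \rVert \right)^2}{\sum_{a=1}^{A} SS_a}}$$

$$R^2 = 1 - \frac{\mathrm{RSS}}{SS_Y}, \qquad Q^2 = 1 - \frac{\mathrm{PRESS}}{SS_Y}, \qquad \mathrm{RMSECV} = \sqrt{\frac{\mathrm{PRESS}}{n}}$$

Because the squared VIP values average to one over the p predictors, a VIP above one marks a predictor that contributes more than average to explaining Y.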
Table 5 provides a summary of the four PLSR models constructed separately for the runoff and sediment yield of the two rainfall regimes.
Table 5. Summaries of the partial least squares regression (PLSR) models.

| Rainfall Regime | Response Variable Y | R2 | Q2 | Component | % of Explained Variability in Y | Cumulative Explained Variability in Y (%) | RMSECV a (m3 or kg) | Q2cum |
|---|---|---|---|---|---|---|---|---|
| Rainfall Regime I | R | 0.99 | 0.85 | 1 | 71.6 | 71.6 | 51,977 | 0.564 |
| | | | | 2 | 22.5 | 94.1 | 25,035 | 0.761 |
| | | | | 3 | 2.7 | 96.8 | 19,507 | 0.737 |
| | | | | 4 | 2.0 | 98.8 | 12,801 | 0.803 |
| | | | | 5 | 0.7 | 99.5 | 8862 | 0.852 |
| | | | | 6 | 0 | 99.5 | 9265 | 0.837 |
| | TL | 0.97 | 0.62 | 1 | 80.6 | 80.6 | 52,181 | 0.570 |
| | | | | 2 | 16.2 | 96.8 | 24,088 | 0.624 |
| | | | | 3 | 1.9 | 98.7 | 16,500 | 0.587 |
| | | | | 4 | 0.5 | 99.2 | 13,434 | 0.546 |
| Rainfall Regime II | R | 0.90 | 0.65 | 1 | 69.6 | 69.6 | 41,776 | 0.444 |
| | | | | 2 | 20.1 | 89.7 | 25,167 | 0.646 |
| | | | | 3 | 3.3 | 93.0 | 21,477 | 0.610 |
| | TL | 0.86 | 0.66 | 1 | 80.6 | 66.6 | 11,427 | 0.503 |
| | | | | 2 | 4.3 | 88.6 | 6912 | 0.656 |
| | | | | 3 | 1.4 | 92.2 | 5933 | 0.637 |

Notes: a The RMSECV (cross-validated root mean squared error), Q2cum (cross-validated goodness of prediction) per component, R2 (goodness of fit), and Q2 (cross-validated goodness of prediction) were calculated for the PLSR models.
For the runoff model of Rainfall Regime I, the prediction error decreased with an increasing number of components, and the minimum RMSECV and maximum Q2 were obtained with five components. An additional increase in the number of components generated a higher prediction error, suggesting that the other components were not strongly correlated with the residuals of the predicted variable [12]. The first component explained 71.6% of the variance in the dataset in terms of the changes in runoff (Table 5). The addition of the second through fifth components to the models cumulatively explained 99.5% of the total variance. For the TL model of heavy rainfall, the maximum Q2 was obtained with two components. The first component explained 80.6% of the variance in the dataset in terms of the changes in TL. The second component explained 16.2% of the variance. These two components explained 96.8% of the total variance. Further addition of components to the PLSR models did not substantially increase the percentage of the variance explained (Table 5). The PLSR weights could be used to describe the quantitative relationship between the predictors and results because they are linear combinations of the original variables that defined the scores [28].
For the runoff model of moderate rainfall, the optimum model had two components, with a maximum Q2 of 0.646. The model explained 89.7% of the total variance, with 69.6% of the variance explained by the first component. The optimum model for the TL of moderate rainfall also contained two components. Those two components explained 88.6% of the total variance. The maximum Q2 was 0.656.
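The choice of the number of components can be reproduced from the Q2cum column of Table 5 by taking, for each model, the component count at which the cross-validated Q2cum peaks. The dictionary keys and the function name below are illustrative.

```python
# Q2cum per component for the four PLSR models, transcribed from Table 5.
Q2CUM = {
    ("Rainfall Regime I", "R"): [0.564, 0.761, 0.737, 0.803, 0.852, 0.837],
    ("Rainfall Regime I", "TL"): [0.570, 0.624, 0.587, 0.546],
    ("Rainfall Regime II", "R"): [0.444, 0.646, 0.610],
    ("Rainfall Regime II", "TL"): [0.503, 0.656, 0.637],
}


def best_n_components(q2cum):
    """Return the 1-based number of components with the highest Q2cum."""
    return max(range(len(q2cum)), key=lambda i: q2cum[i]) + 1


optimal = {model: best_n_components(values) for model, values in Q2CUM.items()}
for (regime, response), n in optimal.items():
    print(f"{regime}, {response}: {n} components (Q2cum = {Q2CUM[(regime, response)][n - 1]})")
```

This gives five components for the runoff model of Rainfall Regime I and two components for the other three models, in line with the text.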
The first component of the runoff model for heavy rainfall (Table 6) was dominated by P, Im, Qmax, TQ, SSCmax, and TL, whereas the second component was dominated by Qmax and TQ. The third, fourth, and fifth components were dominated by many variables, mainly on the negative side. A more convenient and comprehensive expression of the relative importance of the predictors was obtained by exploring their VIP values [12]. For runoff in Rainfall Regime I, higher VIP values were obtained for changes in TQ, Qmax, P, Im, SSCmax, and TL (VIP > 1), followed by the percentage of forest and farmland (0.936 and 0.941) (Figure 4). Predictors with VIP values below one are considered of minor predictive importance. For runoff in Rainfall Regime II, a higher VIP value was obtained for changes in TL, Qmax, BF, P5D, and TQ. Compared with Rainfall Regime I, runoff in Rainfall Regime II was more likely to be affected by antecedent conditions, such as BF and P5D.
Table 6. PLSR for runoff a.

| Predictors | Regime I RCs b | Regime I W* (1) | Regime I W* (2) | Regime I W* (3) | Regime I W* (4) | Regime I W* (5) | Regime II RCs | Regime II W* (1) | Regime II W* (2) |
|---|---|---|---|---|---|---|---|---|---|
| P | −0.023 | 0.357 | −0.038 | −0.295 | −0.346 | −0.137 | 0.206 | 0.102 | 0.496 |
| I30 | −0.187 | 0.172 | −0.397 | −0.124 | −0.300 | −0.17 | −0.114 | −0.251 | −0.040 |
| Im | 0.005 | 0.344 | 0.161 | −0.469 | −0.486 | 0.025 | 0.110 | 0.036 | 0.289 |
| Qmax | 0.288 | 0.365 | 0.305 | 0.293 | −0.031 | −0.188 | 0.279 | 0.478 | 0.263 |
| TQ | 0.538 | 0.446 | 0.693 | 0.294 | 0.182 | 0.192 | 0.018 | 0.251 | −0.251 |
| SSCmax | 0.079 | 0.339 | −0.098 | −0.031 | 0.022 | 0.025 | −0.107 | 0.118 | −0.465 |
| TL | 0.195 | 0.334 | −0.024 | 0.278 | 0.212 | −0.074 | 0.388 | 0.523 | 0.536 |
| P1D | 0.024 | −0.129 | −0.164 | 0.175 | 0.194 | 0.731 | −0.053 | 0.147 | −0.339 |
| P5D | −0.193 | −0.011 | 0.103 | −0.620 | −0.325 | −0.499 | 0.151 | 0.325 | 0.061 |
| P10D | 0.326 | 0.099 | 0.138 | 0.218 | 0.927 | 0.414 | 0.063 | 0.258 | −0.122 |
| BF | 0.007 | −0.011 | 0.136 | 0.004 | −0.081 | −0.347 | 0.167 | 0.346 | 0.086 |
| Forest | 0.144 | 0.260 | −0.150 | 0.063 | 0.394 | 0.382 | −0.059 | −0.095 | −0.064 |
| Farm | −0.133 | −0.26 | 0.160 | −0.054 | −0.375 | −0.357 | 0.053 | 0.091 | −0.064 |
| D | −0.279 | 0.061 | −0.431 | −0.111 | −0.303 | −0.644 | 0.081 | 0.128 | 0.088 |

Notes: a Weights with an absolute value larger than 0.3 indicate that the PLSR components are mainly loaded on the corresponding variables. W* (k) denotes the weights of component k. b RCs means Regression Coefficients.
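The 0.3 rule in the table note can be applied mechanically. The sketch below takes the first-component weights W* (1) of the Rainfall Regime I runoff model from Table 6 and lists the predictors whose absolute weight exceeds the threshold; the variable names are illustrative.

```python
# W* (1) weights of the runoff model for Rainfall Regime I, from Table 6.
W1_REGIME_I = {
    "P": 0.357, "I30": 0.172, "Im": 0.344, "Qmax": 0.365, "TQ": 0.446,
    "SSCmax": 0.339, "TL": 0.334, "P1D": -0.129, "P5D": -0.011,
    "P10D": 0.099, "BF": -0.011, "Forest": 0.260, "Farm": -0.26, "D": 0.061,
}

THRESHOLD = 0.3  # cut-off stated in the note to Table 6


def dominant_predictors(weights, threshold=THRESHOLD):
    """Names of predictors whose absolute weight exceeds the threshold."""
    return [name for name, w in weights.items() if abs(w) > threshold]


dominant = dominant_predictors(W1_REGIME_I)
print(dominant)
```

The resulting list (P, Im, Qmax, TQ, SSCmax, TL) matches the variables reported as dominating the first component of the heavy-rainfall runoff model.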
For the TL model of Rainfall Regime I, higher VIP values were obtained for changes in P, I30, Im, Qmax, SSCmax, R, percentage of forest and farmland, and D (see Figure 5). Thus, all variables have important effects on TL with the exception of antecedent conditions and TQ. For the TL model of Rainfall Regime II, high VIP values were obtained for P, Qmax, TQ, R, and BF. Table 7 indicates that Qmax was important in the first and second components of Rainfall Regimes I and II, whereas P5D and P10D had only minor effects on TL for either rainfall regime.
Figure 4. VIP values of each predictor for PLSR of runoff.
Figure 5. VIP values of each predictor for PLSR of sediment load.
Table 7. PLSR for sediment load.

| Predictors | Regime I RCs a | Regime I W* (1) | Regime I W* (2) | Regime II RCs | Regime II W* (1) | Regime II W* (2) |
|---|---|---|---|---|---|---|
| P | 0.086 | 0.355 | −0.133 | 0.244 | 0.112 | 0.547 |
| I30 | 0.266 | 0.367 | 0.383 | −0.131 | −0.259 | −0.082 |
| Im | −0.116 | 0.200 | −0.559 | 0.052 | −0.014 | 0.156 |
| Qmax | 0.258 | 0.355 | 0.373 | 0.344 | 0.516 | 0.388 |
| TQ | 0.036 | 0.225 | −0.141 | 0.084 | 0.304 | −0.094 |
| SSCmax | 0.175 | 0.396 | 0.084 | 0.031 | 0.216 | −0.148 |
| R | 0.093 | 0.305 | −0.058 | 0.411 | 0.532 | 0.553 |
| P1D | −0.016 | −0.037 | 0.033 | −0.098 | 0.130 | −0.406 |
| P5D | −0.050 | −0.067 | −0.074 | 0.092 | 0.281 | −0.048 |
| P10D | 0.091 | 0.093 | 0.167 | −0.004 | 0.212 | −0.237 |
| BF | 0.021 | −0.042 | 0.106 | −0.008 | 0.233 | −0.272 |
| Forest | 0.048 | 0.285 | −0.170 | −0.052 | −0.097 | −0.039 |
| Farm | −0.056 | 0.291 | 0.152 | 0.048 | 0.094 | 0.031 |
| D | 0.292 | 0.296 | 0.536 | 0.083 | 0.134 | 0.182 |

Notes: a RCs means Regression Coefficients; W* (k) denotes the weights of component k.

4. Discussion

4.1. Control Factors for Runoff

The hydrological response of a watershed to a rainfall event is determined by several interacting factors that control runoff generation [30]. In this study, only sediment flux data from the outlet of the watershed were available, and thus, the within-watershed pattern of erosion and deposition remains uncertain. Soil moisture content is reflected in the antecedent conditions with BF, P1D, P5D and P10D.
The results indicated that TQ, Qmax, and TL are important for runoff in Rainfall Regimes I and II. The rainfall characteristics (P and Im) are more important for runoff in Rainfall Regime I, whereas soil moisture content is more important for Rainfall Regime II. The former pattern is in agreement with the results of Oeurng et al. [29]. In Regime I, TQ and Qmax dominate runoff, and base flow has only a minor effect due to its small magnitude. The effect of soil moisture is greater in Rainfall Regime II than in Rainfall Regime I. The sensitivity of the runoff response to soil moisture depends on the predominant runoff mechanisms [31]. Surface runoff generated by saturation-excess flow is driven from spatially and temporally dynamic variable source areas and requires lower rainfall intensities for its initiation. Both infiltration excess and saturation excess runoff processes may occur during heavy rainfall [32]. Discharge is affected by drainage network density, slope, channel roughness, and soil infiltration characteristics [33], and peak flow rates are typically affected by within-storm rainfall characteristics. Many studies have indicated that storm runoff from steep well-vegetated headwater catchments in humid areas is produced by saturation-excess mechanisms [34,35]. Central to the saturation-excess mechanism is the concept of runoff contributing zones, which expand and contract seasonally and during storms depending on antecedent wetness and storm magnitude [36]. The initial soil water content is more important in catchments with good vegetation. The runoff response is more uniform and is not dependent on initial soil moisture when the infiltration excess overland flow is predominant because of high rain depth or less permeable soils [30]. Runoff from lower-intensity storms in soils of higher permeability is controlled by the soil water content of the surface soil layers and is more dependent on the initial conditions [30]. 
We attributed the different control factors of the two Rainfall Regimes to rainfall characteristics and consequent runoff mechanisms.

4.2. Control Factors of Sediment Yield

The sediment response of catchments is controlled by a complex function of ecological, climatic, and geomorphic processes [37]. All of the rainfall-related variables have important effects on TL for Rainfall Regime I. Raindrop splash detachment and surface overland flow are the two basic drivers of soil peel-back, flood generation, water loss, and sediment mobilization [38]. However, these two drivers are both rooted in rainfall energy; vegetation reduces this energy, and thus, the Qmax observed at the outlet of the watershed may reflect the rainfall energy. The sediment yield of a catchment represents only a part of the total erosion or sediment production within the catchment, as a considerable portion of the sediment is often deposited before reaching the outlet [39]. In the Wangjiaqiao watershed, 76% of the area has slope gradients exceeding 30%. Cultivated sloping lands are major contributors to sediment yield, and a considerable portion of the sediment may move by gravity rather than by shear forces alone [15]. Lu and Higgitt [14] concluded that 60% of sediment is from arable land in 32 catchments in the TGA. The soil parent material of the watershed is predominantly purple sandy shale, and bedrock is typically exposed in the channel bed; thus, channel erosion is rare [18]. Sediments stored in the channel and distributed within tributaries are transported after flood events with sufficient transport capacity. The PLSR model of TL for Rainfall Regime II illustrates that events with low rainfall depth and short duration typically cause very limited hydrological responses and limited sediment transport, in agreement with a previous study [40].
Land use change is important for sediment yield [41]. Soil erosion is largely determined by the absence of protective land cover, whereas sediment export to rivers is determined by onsite sediment production and the connectivity of sediment sources and the river [42,43]. Few studies have focused on how changes in land use can influence inter-event hydrologic processes and sediment yield at a small watershed scale. Our results indicate that the percentages of forest and farmland in the study area have an important influence on sediment yield under Rainfall Regime I, i.e., during heavy rainfall events. The rainfall energy of an event is typically characterized by Im and I30, which had important effects on TL in Rainfall Regime I. Thus, heavy rainfall events, which cause a large sediment yield, are influenced by more factors than lower-intensity rainfall events. A similar conclusion was reported by Ni et al. [44], who demonstrated that soil erosion takes many forms during heavy rainfall events, including sheet erosion, rill erosion, gully erosion, landslip, and even collapses. Thus, the sediment processes are complex, heterogeneous, and controlled by multiple factors.

5. Conclusions

In this paper, K-means clustering was used to classify 29 rainfall events into two rainfall regimes. Four separate PLSR models were constructed to identify the main variables that control runoff and sediment yield in a small agricultural watershed in the TGA. The results confirmed the complex and heterogeneous nature of the hydrology and sediment response in the watershed.
(1) Rainfall Regime I, which is characterized by a high mean rainfall depth and long duration, produced 2.6-fold greater mean runoff and 4.9-fold greater mean sediment yield than Rainfall Regime II.
(2) The initial soil water content was more important for controlling runoff in Rainfall Regime II than in Rainfall Regime I, whereas rainfall characteristics played a greater role in controlling runoff in Rainfall Regime I.
(3) Although land use changed significantly during the study years, these changes were not reflected in the control factors for runoff for Rainfall Regimes I or II or the control factors of sediment yield for Rainfall Regime II. At the inter-event scale, the percentages of forest and farmland only had a significant effect on sediment yield in Rainfall Regime I.
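The regime classification step behind these conclusions can be sketched with a minimal two-cluster K-means on standardized rainfall characteristics. The seven events below are invented for illustration and are not the study's data. The three features (depth, duration, maximum intensity), the helper names, and the choice of seeding the centroids with the largest and smallest events are likewise assumptions, made so that the result is deterministic.

```python
import math


def standardize(rows):
    """Scale each feature to zero mean and unit sample standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda k: sq_dist(p, centroids[k]))
                      for p in points]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:               # keep the old centroid if a cluster empties
                centroids[k] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical events: (depth in mm, duration in h, maximum intensity in mm/h).
events = [(85, 30, 22), (21, 6, 8), (102, 41, 30), (15, 4, 5),
          (76, 26, 18), (30, 9, 10), (18, 5, 7)]
z = standardize(events)
depths = [e[0] for e in events]
# Seed cluster 0 with the largest event and cluster 1 with the smallest one.
seeds = [z[depths.index(max(depths))], z[depths.index(min(depths))]]
labels, _ = kmeans(z, seeds)
print(labels)  # 0 = heavy regime, 1 = moderate regime
```

Standardizing first keeps the depth, which has the largest numerical range, from dominating the Euclidean distance.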
The PLSR methodology presented in this paper partially eliminates the co-dependency of the variables and facilitates a more unbiased view of the control factors for runoff and sediment yield. The variables used for the clustering approach and PLSR are easy to obtain. Thus, this practicable and simple approach could be applied to a variety of other watersheds and enable better management of agricultural watersheds.
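As a rough illustration of how PLSR copes with co-dependent predictors, the sketch below fits a single-response PLS model with the NIPALS algorithm and computes variable importance in the projection (VIP) scores. It is a minimal standard-library sketch, not the software or settings used in this study. The eight events, the variable names, and the choice of two components are hypothetical, and reading VIP values above 1 as important is a common convention assumed here.

```python
import math


def standardize(cols):
    """Center each column and scale it to unit sample standard deviation."""
    out = []
    for c in cols:
        m = sum(c) / len(c)
        s = math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
        out.append([(v - m) / s for v in c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_vip(x_cols, y, n_components):
    """Single-response PLS via NIPALS; returns one VIP score per predictor."""
    X = standardize(x_cols)          # p columns, each of length n
    r = standardize([y])[0]          # response, deflated component by component
    n, p = len(y), len(x_cols)
    weights, explained = [], []
    for _ in range(n_components):
        w = [dot(col, r) for col in X]           # direction of maximum covariance
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                         # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(w[j] * X[j][i] for j in range(p)) for i in range(n)]  # scores
        tt = dot(t, t)
        loadings = [dot(col, t) / tt for col in X]
        q = dot(r, t) / tt                       # regression of response on scores
        X = [[col[i] - t[i] * pl for i in range(n)]
             for col, pl in zip(X, loadings)]    # deflate predictors
        r = [r[i] - q * t[i] for i in range(n)]  # deflate response
        weights.append(w)
        explained.append(q * q * tt)             # response sum of squares explained
    total = sum(explained)
    return [math.sqrt(p * sum(s * w[j] ** 2 for s, w in zip(explained, weights)) / total)
            for j in range(p)]


# Hypothetical events: depth and intensity are strongly collinear by construction.
depth = [10, 20, 30, 40, 50, 60, 70, 80]
intensity = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9]
moisture = [0.30, 0.22, 0.35, 0.25, 0.31, 0.24, 0.33, 0.26]
runoff = [3, 7, 10, 15, 18, 23, 26, 31]

vip = pls1_vip([depth, intensity, moisture], runoff, n_components=2)
for name, score in zip(["depth", "intensity", "moisture"], vip):
    print(name, round(score, 3))
```

Because the latent components are built from weighted combinations of all predictors, the two collinear variables share the importance rather than destabilizing each other's coefficients as they would in ordinary multiple regression. The squared VIP scores sum to the number of predictors, which is why 1 serves as a natural reference level.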

Acknowledgments

Financial support for this research was provided by the National Natural Science Foundation of China (41301294), the West Light Foundation of the Chinese Academy of Sciences, and the Fundamental Research Funds for the Central Universities (2014YB053).

Author Contributions

Both authors contributed equally to this work, including the related research and writing.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhu, A.X.; Wang, P.; Zhu, T.X.; Chen, L.J.; Cai, Q.G.; Liu, H.P. Modeling runoff and soil erosion in the Three-Gorge Reservoir drainage area of China using limited plot data. J. Hydrol. 2013, 492, 163–175.
  2. Estrany, J.; Garcia, C.; Batalla, R.J. Hydrological response of a small mediterranean agricultural catchment. J. Hydrol. 2010, 380, 180–190.
  3. Zabaleta, A.; Martínez, M.; Uriarte, J.A.; Antigüedad, I. Factors controlling suspended sediment yield during runoff events in small headwater catchments of the Basque Country. Catena 2007, 71, 179–190.
  4. Oeurng, C.; Sauvage, S.; Sánchez-Pérez, J.M. Dynamics of suspended sediment transport and yield in a large agricultural catchment, southwest France. Earth Surf. Process. Landf. 2010, 35, 1289–1301.
  5. López-Tarazón, J.; Batalla, R.J.; Vericat, D.; Balasch, J. Rainfall, runoff and sediment transport relations in a mesoscale mountainous catchment: The River Isábena (Ebro basin). Catena 2010, 82, 23–34.
  6. Wine, M.L.; Zou, C.B.; Bradford, J.A.; Gunter, S.A. Runoff and sediment responses to grazing native and introduced species on highly erodible Southern Great Plains soil. J. Hydrol. 2012, 450, 336–341.
  7. Mayor, Á.G.; Bautista, S.; Bellot, J. Factors and interactions controlling infiltration, runoff, and soil loss at the microscale in a patchy Mediterranean semiarid landscape. Earth Surf. Process. Landf. 2009, 34, 1702–1711.
  8. Jarvis, D.; Stoeckl, N.; Chaiechi, T. Applying econometric techniques to hydrological problems in a large basin: Quantifying the rainfall-discharge relationship in the Burdekin, Queensland, Australia. J. Hydrol. 2013, 496, 107–121.
  9. Taguas, E.; Ayuso, J.; Pérez, R.; Giráldez, J.; Gómez, J. Intra and inter-annual variability of runoff and sediment yield of an olive micro-catchment with soil protection by natural ground cover in Southern Spain. Geoderma 2013, 206, 49–62.
  10. Onderka, M.; Wrede, S.; Rodný, M.; Pfister, L.; Hoffmann, L.; Krein, A. Hydrogeologic and landscape controls of dissolved inorganic nitrogen (DIN) and dissolved silica (DSi) fluxes in heterogeneous catchments. J. Hydrol. 2012, 450, 36–47.
  11. Carrascal, L.M.; Galván, I.; Gordo, O. Partial least squares regression as an alternative to current regression methods used in ecology. Oikos 2009, 118, 681–690.
  12. Shi, Z.; Ai, L.; Li, X.; Huang, X.; Wu, G.; Liao, W. Partial least squares regression for linking land-cover patterns to soil erosion and sediment yield in watersheds. J. Hydrol. 2013, 498, 165–176.
  13. Rommens, T.; Verstraeten, G.; Bogman, P.; Peeters, I.; Poesen, J.; Govers, G.; Van Rompaey, A.; Lang, A. Holocene alluvial sediment storage in a small river catchment in the loess area of central Belgium. Geomorphology 2006, 77, 187–201.
  14. Lu, X.; Higgitt, D. Sediment delivery to the Three Gorges: 2: Local response. Geomorphology 2001, 41, 157–169.
  15. Shi, Z.H.; Cai, C.F.; Ding, S.W.; Wang, T.W.; Chow, T. Soil conservation planning at the small watershed level using RUSLE with GIS: A case study in the Three Gorge Area of China. Catena 2004, 55, 33–48.
  16. Liao, C.Y. Soil and water conservation in Yangtze River Basin during past 60 years: Review and perspective. Yangtze River 2010, 41, 2–6. (In Chinese)
  17. Fang, N.F.; Shi, Z.H.; Li, L.; Guo, Z.L.; Liu, Q.J.; Ai, L. The effects of rainfall regimes and land use changes on runoff and soil loss in a small mountainous watershed. Catena 2012, 99, 1–8.
  18. Fang, N.F.; Shi, Z.H.; Li, L.; Jiang, C. Rainfall, runoff, and suspended sediment delivery relationships in a small agricultural watershed of the Three Gorges area, China. Geomorphology 2011, 135, 158–166.
  19. Bidin, K.; Greer, T. A spreadsheet-based technique (Lotus 1-2-3) for separating tropical forest storm hydrographs using Hewlett and Hibbert’s slope. Earth Surf. Process. Landf. 1997, 22, 1231–1237.
  20. Latron, J.; Soler, M.; Llorens, P.; Gallart, F. Spatial and temporal variability of the hydrological response in a small Mediterranean research catchment (Vallcebre, Eastern Pyrenees). Hydrol. Process. 2008, 22, 775–787.
  21. Lana-Renault, N.; Regues, D.; Martí-Bono, C.; Beguería, S.; Latron, J.; Nadal, E.; Serrano, P.; Garcia-Ruiz, J. Temporal variability in the relationships between precipitation, discharge and suspended sediment concentration in a small Mediterranean mountain catchment. Nord. Hydrol. 2007, 38, 139–150.
  22. Wei, W.; Chen, L.; Fu, B.; Huang, Z.; Wu, D.; Gui, L. The effect of land uses and rainfall regimes on runoff and soil erosion in the semi-arid loess hilly area, China. J. Hydrol. 2007, 335, 247–258.
  23. Cerdan, O.; Govers, G.; Le Bissonnais, Y.; Van Oost, K.; Poesen, J.; Saby, N.; Gobin, A.; Vacca, A.; Quinton, J.; Auerswald, K. Rates and spatial variations of soil erosion in Europe: A study based on erosion plot data. Geomorphology 2010, 122, 167–177.
  24. Piccarreta, M.; Capolongo, D.; Boenzi, F.; Bentivenga, M. Implications of decadal changes in precipitation and land use policy to soil erosion in Basilicata, Italy. Catena 2006, 65, 138–151.
  25. Nyssen, J.; Poesen, J.; Moeyersons, J.; Haile, M.; Deckers, J. Dynamics of soil erosion rates and controlling factors in the Northern Ethiopian Highlands—Towards a sediment budget. Earth Surf. Process. Landf. 2008, 33, 695–711.
  26. Martens, H.; Martens, M. Modified Jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (PLSR). Food Qual. Prefer. 2000, 11, 5–16.
  27. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
  28. Abdi, H. Partial least squares regression and projection on latent structure regression (PLS Regression). Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 97–106.
  29. Ouyang, W.; Skidmore, A.K.; Hao, F.; Wang, T. Soil erosion dynamics response to landscape pattern. Sci. Total. Environ. 2010, 408, 1358–1366.
  30. Castillo, V.; Gomez-Plaza, A.; Martínez-Mena, M. The role of antecedent soil water content in the runoff response of semiarid catchments: A simulation approach. J. Hydrol. 2003, 284, 114–130.
  31. McDonnell, J.J. Where does water go when it rains? Moving beyond the variable source area concept of rainfall-runoff response. Hydrol. Process. 2003, 17, 1869–1875.
  32. Srinivasan, M.; Gburek, W.; Hamlett, J. Dynamics of stormflow generation—A hillslope-scale field study in east-central Pennsylvania, USA. Hydrol. Process. 2002, 16, 649–665.
  33. Fiedler, F.R.; Ramirez, J.A. A numerical method for simulating discontinuous shallow flow over an infiltrating surface. Int. J. Numer. Methods Fluids 2000, 32, 219–239.
  34. Latron, J.; Gallart, F. Seasonal dynamics of runoff-contributing areas in a small Mediterranean research catchment (Vallcebre, Eastern Pyrenees). J. Hydrol. 2007, 335, 194–206.
  35. Easton, Z.M.; Fuka, D.R.; Walter, M.T.; Cowan, D.M.; Schneiderman, E.M.; Steenhuis, T.S. Re-conceptualizing the soil and water assessment tool (SWAT) model to predict runoff from variable source areas. J. Hydrol. 2008, 348, 279–291.
  36. Pearce, A.; Stewart, M.; Sklash, M. Storm runoff generation in humid headwater catchments: 1. Where does the water come from? Water Resour. Res. 1986, 22, 1263–1272.
  37. Krishnaswamy, J.; Halpin, P.N.; Richter, D.D. Dynamics of sediment discharge in relation to land-use and hydro-climatology in a humid tropical watershed in Costa Rica. J. Hydrol. 2001, 253, 91–109.
  38. Sanchis, M.; Torri, D.; Borselli, L.; Poesen, J. Climate effects on soil erodibility. Earth Surf. Process. Landf. 2008, 33, 1082–1097.
  39. Walling, D. Tracing suspended sediment sources in catchments and river systems. Sci. Total. Environ. 2005, 344, 159–184.
  40. Shi, Z.H.; Chen, L.D.; Fang, N.F.; Qin, D.F.; Cai, C.F. Research on the SCS-CN initial abstraction ratio using rainfall-runoff event analysis in the Three Gorges Area, China. Catena 2009, 77, 1–7.
  41. Cantón, Y.; Solé-Benet, A.; De Vente, J.; Boix-Fayos, C.; Calvo-Cases, A.; Asensio, C.; Puigdefábregas, J. A review of runoff generation and soil erosion across scales in semiarid south-eastern Spain. J. Arid. Environ. 2011, 75, 1254–1261.
  42. Van Rompaey, A.J.; Govers, G.; Puttemans, C. Modelling land use changes and their impact on soil erosion and sediment supply to rivers. Earth Surf. Process. Landf. 2002, 27, 481–494.
  43. Bakker, M.M.; Govers, G.; van Doorn, A.; Quetier, F.; Chouvardas, D.; Rounsevell, M. The response of soil erosion and sediment export to land-use change in four areas of Europe: The importance of landscape pattern. Geomorphology 2008, 98, 213–226.
  44. Ni, J.P.; Wei, C.F.; Xie, D.T.; Gao, M.; He, B.H. Influence of slope gradient on runoff erosion of purple soil slope in the Three Gorges Reservoir area. J. Sediment Res. 2009, 2, 29–33. (In Chinese)

Share and Cite

MDPI and ACS Style

Fang, N.; Shi, Z.; Chen, F.; Wang, Y. Partial Least Squares Regression for Determining the Control Factors for Runoff and Suspended Sediment Yield during Rainfall Events. Water 2015, 7, 3925-3942. https://doi.org/10.3390/w7073925

