A Calibrated, Watershed-Specific SCS-CN Method: Application to Wangjiaqiao Watershed in the Three Gorges Area, China

The Soil Conservation Service curve number (SCS-CN) method is one of the most popular methods for computing runoff depth because it requires few input parameters. However, recent studies have questioned the inconsistent runoff results produced when the initial abstraction ratio λ is fixed at 0.20. This paper develops a watershed-specific SCS-CN calibration method that applies non-parametric inferential statistics to rainfall–runoff data pairs. The proposed method first analyzes the data and generates confidence intervals to determine the optimum values for SCS-CN model calibration; the runoff depth and curve number are then calculated. The proposed method predicted runoff more accurately than the asymptotic curve number fitting method, a linear regression model and the conventional SCS-CN model, with the highest Nash–Sutcliffe index (0.825), the lowest residual sum of squares (133.04) and the lowest prediction error. Compared to the conventional SCS-CN model, it reduced the residual sum of squares by 66% and the model prediction error by 96%. The estimated curve number for the Wangjiaqiao watershed in China was 72.28, with a confidence interval of 62.06 to 78.00 at the α = 0.01 level.


Introduction
Accurate estimation of direct surface runoff is essential for water resources planning and development, and helps to reduce sedimentation and flooding in downstream areas [1][2][3]. Following the law of parsimony, many researchers prefer a simpler hydrological model that needs the fewest input parameters while still delivering superior predictive performance [4][5][6][7][8][9][10].
Despite the existence of many rainfall-runoff models, the Soil Conservation Service curve number (SCS-CN) model proposed in the SCS National Engineering Handbook (SCS NEH) [11] is widely used in hydrological design [12,13]. The curve number (CN) and the initial abstraction ratio (λ) are the key parameters of the SCS-CN method, and λ plays a vital role in obtaining an accurate runoff estimate [14]. In 1954, SCS proposed λ = 0.20. Equation (1) gives the runoff depth (Q) from the rainfall depth (P), the maximum potential water retention (S) and, through the initial abstraction, λ. Throughout this paper, P, S and Q are in millimeters unless stated otherwise.
$$Q = \frac{(P - I_a)^2}{P - I_a + S}, \quad P > I_a; \ \text{otherwise } Q = 0 \qquad (1)$$
where I_a is the initial abstraction in millimeters, computed using Equation (2).
$$I_a = \lambda S \qquad (2)$$
Substituting λ = 0.20 into Equations (1) and (2) gives the conventional SCS-CN model in Equation (3).
$$Q = \frac{(P - 0.2S)^2}{P + 0.8S}, \quad P > 0.2S; \ \text{otherwise } Q = 0 \qquad (3)$$
The parameter CN, in turn, is a transformation of S (valid only when λ = 0.20), as shown in Equation (4) for S in millimeters.
$$S = \frac{25400}{CN} - 254 \qquad (4)$$
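The relations above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard SCS-CN forms (runoff from P, S and λ, and the metric CN transformation for λ = 0.20). The function names and the sample curve number are hypothetical and are not taken from the study.

```python
def s_from_cn(cn: float) -> float:
    """Maximum potential retention S (mm) from a curve number (lambda = 0.20 basis)."""
    return 25400.0 / cn - 254.0


def cn_from_s(s_mm: float) -> float:
    """Curve number from S (mm); inverse of s_from_cn."""
    return 25400.0 / (s_mm + 254.0)


def runoff_depth(p_mm: float, s_mm: float, lam: float = 0.20) -> float:
    """Runoff depth Q (mm) for rainfall P (mm), retention S (mm) and ratio lambda."""
    ia = lam * s_mm                      # initial abstraction I_a = lambda * S
    if p_mm <= ia:                       # no runoff until I_a is satisfied
        return 0.0
    return (p_mm - ia) ** 2 / (p_mm - ia + s_mm)


# Hypothetical example: a handbook curve number of 75 and a 50 mm storm
s_example = s_from_cn(75.0)
print(round(s_example, 2), round(runoff_depth(50.0, s_example), 2))
```

Keeping λ as an explicit argument makes it easy to compare the conventional model (λ = 0.20) with a calibrated one using the same function.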
However, recent studies concluded that λ should not be a fixed value. Several researchers also reported that λ values lower than the proposed 0.20 gave better runoff predictions in their studies [15][16][17][18][19][20][21][22]. Based on the median values of natural data for 307 watersheds, a group of researchers from the United States of America suggested a rounded value of λ = 0.05 to produce a better estimate of runoff depth [18]. Similarly, another group of researchers adopted λ = 0.05 for their study of the Wangjiaqiao watershed in China [13].
Satellite imaging and geographic information systems (GIS) have been combined with the conventional SCS-CN method, but no attempt to calibrate the primary SCS-CN rainfall-runoff framework with statistics has been reported in recent years. Recently, two groups of researchers developed a globally gridded CN dataset at 250 m spatial resolution [23,24]. However, the 250 m dataset only represents general patterns of soil runoff potential appropriate for regional to global-scale analyses and may not capture the local variance needed for fine-scale applications. Users need to check local conditions and runoff trends, whenever available, in their area of interest.
The tabulated CN values in the NEH handbook [25] are based on λ = 0.20. When λ changes, the CN value changes as well and can no longer be read directly from the NEH handbook. A group of researchers from the United States of America reported λ = 0.05 as the best value to represent watersheds in that country and proposed Equation (5) for runoff prediction. They also derived a correlation between S_0.05 and S_0.20, shown in Equation (6) (in inches), to transform S_0.05 back to S_0.20 so that CN values could be calculated [18]. Without such a correlation, substituting S_λ (e.g., S_0.05) directly into Equation (4) yields CN_λ (the conjugate CN), which differs from the conventional CN (denoted CN_0.20, where λ = 0.20) [18].
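For orientation, the forms usually quoted from [18] for these two relations are given below. This is stated as an assumption about Equations (5) and (6), with S in inches in the second relation.

```latex
Q = \frac{(P - 0.05\,S_{0.05})^2}{P + 0.95\,S_{0.05}}, \quad P > 0.05\,S_{0.05}
\qquad\text{and}\qquad
S_{0.05} = 1.33\,S_{0.20}^{\,1.15}
```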
Since λ varies from location to location, this paper proposes a watershed-specific SCS-CN calibration method that applies non-parametric inferential statistics to rainfall and runoff data pairs, so that λ is no longer fixed at 0.20. To measure the effectiveness of the method, the dataset from a previous study in China [13] was used to derive a watershed-specific SCS-CN rainfall-runoff model, a watershed-specific S correlation equation, the λ value and the CN of the Wangjiaqiao watershed, with the aim of further improving the runoff prediction accuracy reported in that study. A calibrated, watershed-specific SCS-CN rainfall-runoff model was developed, and an equation was derived to correct the runoff prediction of the conventional SCS-CN model. To date, no other published work has incorporated inferential statistics to calibrate the SCS-CN model.

The Proposed Calibrated Watershed-Specific SCS-CN Method
According to SCS, the initial abstraction (I_a) must be smaller than the smallest rainfall amount in the dataset that initiated runoff [25]. The SCS constraints also state that λ must lie in the range [0, 1] and that S must be positive [25]. Based on Equation (3), λ cannot be negative and S should be larger than I_a in order to meet these constraints.
The bias-corrected and accelerated (BCa) bootstrapping method [26][27][28], a non-parametric inferential technique, was applied to the dataset with 2000 random samples (drawn with replacement) to select the key parameters (λ and S) at a 99% confidence interval (CI) in order to calibrate Equation (1) [26,29]. The distribution-free BCa bootstrapping technique was used in this study because it is robust and produces confidence intervals for statistical assessment.
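A minimal pure-Python sketch of a BCa bootstrap interval is shown below to make the procedure concrete. It is not the SPSS implementation used in the study. The function name `bca_interval`, the half-weight treatment of ties in the bias correction, and the sample λ values are all illustrative assumptions; the values are not the Table 1 data.

```python
import random
from statistics import NormalDist, median


def bca_interval(data, stat=median, n_resamples=2000, alpha=0.01, seed=0):
    """Bias-corrected and accelerated bootstrap CI for `stat` at level 1 - alpha."""
    data = list(data)
    n = len(data)
    rng = random.Random(seed)
    nd = NormalDist()
    theta_hat = stat(data)

    # Bootstrap replicates: resample with replacement, recompute the statistic
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))

    # Bias correction z0 from the share of replicates below the estimate
    # (ties counted with half weight; an assumption of this sketch)
    below = sum(b < theta_hat for b in boot) + 0.5 * sum(b == theta_hat for b in boot)
    prop = min(max(below / n_resamples, 1e-6), 1 - 1e-6)
    z0 = nd.inv_cdf(prop)

    # Acceleration from jackknife (leave-one-out) estimates
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jmean = sum(jack) / n
    num = sum((jmean - j) ** 3 for j in jack)
    den = 6.0 * sum((jmean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den > 0 else 0.0

    def adjusted_quantile(z):
        return nd.cdf(z0 + (z0 + z) / (1 - accel * (z0 + z)))

    def pick(q):
        idx = min(max(int(round(q * (n_resamples - 1))), 0), n_resamples - 1)
        return boot[idx]

    lo = pick(adjusted_quantile(nd.inv_cdf(alpha / 2)))
    hi = pick(adjusted_quantile(nd.inv_cdf(1 - alpha / 2)))
    return tuple(sorted((lo, hi)))


# Hypothetical event-wise lambda values (illustrative only)
lam_values = [0.021, 0.030, 0.035, 0.038, 0.041, 0.043, 0.047,
              0.052, 0.058, 0.066, 0.081, 0.120, 0.190]
lo, hi = bca_interval(lam_values)
print(f"99% BCa CI of the median lambda: [{lo:.3f}, {hi:.3f}]")
```

Passing `stat=lambda x: sum(x) / len(x)` instead of `median` would give the corresponding interval for the mean.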
Whether to use the mean or the median of λ and S is a universal dilemma among hydrology researchers [18,30]. IBM SPSS (version 18.0) was used in this study, and a normality test was conducted in SPSS to determine whether the optimum λ and S values should be chosen from the mean or the median confidence intervals. For datasets with fewer than 2000 samples, the Shapiro-Wilk test is suggested rather than the Kolmogorov-Smirnov test; the dataset in this paper has fewer than 2000 samples, so the Shapiro-Wilk test was used. If the p value of the Shapiro-Wilk test is greater than 0.05, the dataset is considered normally distributed [31] and the parameter optimization should be inferred from the mean CI; otherwise, the median CI is used.
A supervised, non-linear genetic optimization algorithm was used to search for the optimum λ and S values. The algorithm used a population size of 2000 and 2000 random seeds with a mutation rate of 0.075, and converged towards an optimal solution within the selected BCa 99% CI range at an error margin of 0.001 mm. A least squares fitting algorithm minimized the residual sum of squares (RSS) between the predicted and observed runoff over the entire dataset.
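The idea of a bounded least-squares search can be illustrated with the sketch below. It deliberately swaps the genetic algorithm for a plain exhaustive grid search, which is easier to verify. The CI bounds, the synthetic rainfall depths and the function names are hypothetical and are not taken from the study.

```python
def runoff(p, lam, s):
    """SCS-CN runoff depth (mm) for rainfall p (mm), ratio lam and retention s (mm)."""
    ia = lam * s
    return (p - ia) ** 2 / (p - ia + s) if p > ia else 0.0


def rss(pairs, lam, s):
    """Residual sum of squares between predicted and observed runoff."""
    return sum((runoff(p, lam, s) - q) ** 2 for p, q in pairs)


def grid_search(pairs, lam_bounds, s_bounds, steps=100):
    """Return (rss, lam, s) minimizing RSS inside the given bounds."""
    # SCS constraint: I_a must stay below the smallest rainfall that produced runoff
    p_min = min(p for p, q in pairs if q > 0)
    best = None
    for i in range(steps + 1):
        lam = lam_bounds[0] + (lam_bounds[1] - lam_bounds[0]) * i / steps
        for j in range(steps + 1):
            s = s_bounds[0] + (s_bounds[1] - s_bounds[0]) * j / steps
            if lam * s >= p_min:
                continue
            err = rss(pairs, lam, s)
            if best is None or err < best[0]:
                best = (err, lam, s)
    return best


# Synthetic data generated from known parameters (lambda = 0.05, S = 250 mm)
rain = [15.0, 20.0, 30.0, 45.0, 60.0, 85.0]
pairs = [(p, runoff(p, 0.05, 250.0)) for p in rain]

# Hypothetical CI bounds standing in for the BCa 99% intervals
best_rss, best_lam, best_s = grid_search(pairs, (0.03, 0.07), (200.0, 300.0))
print(round(best_lam, 3), round(best_s, 1))
```

A grid search scales poorly with finer resolution, which is one reason a stochastic optimizer such as a genetic algorithm may be preferred in practice.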
As proposed by SCS, the S value is calculated from the CN equation, as shown in Equation (4), whereas the CN value is chosen from the NEH handbook [25]. When λ is no longer equal to 0.20, a different λ value yields a different S value, denoted S_λ. In this study, Equation (1) was rearranged to solve for S_λ, as shown in Equation (8), which is known as the general S equation. For the conventional SCS-CN model where λ = 0.20, S_0.20 can be calculated with Equation (8) from the corresponding rainfall and runoff data pair. Since the optimum λ value of the calibrated model might differ from 0.20, a statistically significant S correlation is needed to relate S_λ to S_0.20 [18], so that the equivalent S_0.20 value can be substituted back into Equation (4) to derive the equivalent CN_0.20 used by SCS practitioners. Without the S correlation equation, the CN value derived from any λ value other than 0.20 is known as the conjugate curve number, denoted CN_λ [18].
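One way to see the rearrangement: with I_a = λS, Equation (1) becomes a quadratic in S, namely λ²S² − (2λP + (1 − λ)Q)S + (P² − PQ) = 0, and the smaller root is the one that keeps P > λS. The sketch below implements that root. The function name is hypothetical, and the typeset form of Equation (8) in the paper may be arranged differently while being algebraically equivalent.

```python
import math


def s_lambda(p_mm: float, q_mm: float, lam: float) -> float:
    """Solve the SCS-CN runoff equation for S (mm) given one (P, Q) pair and lambda."""
    if not 0.0 < q_mm < p_mm:
        raise ValueError("requires 0 < Q < P")
    if not 0.0 < lam <= 1.0:
        raise ValueError("requires 0 < lambda <= 1")
    b = 2.0 * lam * p_mm + (1.0 - lam) * q_mm
    # Discriminant of the quadratic, simplified algebraically
    disc = (1.0 - lam) ** 2 * q_mm ** 2 + 4.0 * lam * p_mm * q_mm
    # Smaller root, so that P > lambda * S (runoff actually occurs)
    return (b - math.sqrt(disc)) / (2.0 * lam ** 2)


# Round-trip check with a hypothetical pair: S = 100 mm, lambda = 0.20, P = 50 mm
q_check = (50.0 - 20.0) ** 2 / (50.0 - 20.0 + 100.0)
print(round(s_lambda(50.0, q_check, 0.20), 3))
```

For λ = 0.20 the expression reduces to the familiar S = 5[P + 2Q − √(4Q² + 5PQ)].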
Both the watershed-specific rainfall-runoff model and the conventional SCS-CN model were derived from Equation (1); thus, the runoff prediction difference (Q_v) between the two models can be modeled as a function of rainfall depth and used as a correction equation to adjust the runoff predictions of Equation (3).
The proposed method consists of two main steps:
(1) Analyze the rainfall and runoff data pairs using IBM SPSS (version 18.0) by generating confidence intervals for both the mean and the median of the derived λ and S values; then perform a normality test to decide whether the confidence interval of the mean or of the median is to be used for optimization.
(2) Optimize within the selected confidence interval range.
In short, given rainfall-runoff data pairs (P_i, Q_i) and the derived I_a,i, S_i and λ_i for i > 0, the proposed watershed-specific SCS-CN calibration method consists of the following steps:
1. Perform the bootstrap BCa procedure and the normality test in SPSS (version 18.0, or equivalent statistical software) for (λ_i, S_i).

Remark 1.
The I_a, S and I_a/S values used in this paper came from a study in China [13]. The I_a values were estimated by comparing the hydrograph with the rainfall graph; the methodology used to derive I_a and S for each event is presented in that study. IBM SPSS version 18.0 was used to conduct all statistical analyses in this paper. The study site covers 1670 hectares and lies about 50 km northwest of the Three Gorges Dam. Twenty-nine rainfall-runoff data pairs were collected from 1994 to 1996, as shown in Table 1 [13].

Runoff Model Assessment
To measure the effectiveness of the proposed method, the Nash-Sutcliffe index (E), the model residual sum of squares (RSS) and the overall model prediction error (BIAS) were computed using Equations (9)-(11), respectively.
where n is the total number of rainfall-runoff events in this study. RSS measures the model residual (prediction error); a predictive model with a lower RSS value therefore predicts runoff better. BIAS summarizes the overall prediction error by summing the model residuals. A model with zero BIAS achieves unbiased runoff predictions overall, whereas a positive BIAS indicates a tendency to over-predict runoff and a negative BIAS a tendency to under-predict. Lastly, the E index measures the prediction efficiency of a model. It ranges from −∞ to 1.0, where 1.0 implies a perfectly predictive model [32]. E values between 0 and 1 are generally viewed as acceptable levels of performance; when E < 0, however, the mean of the observed runoff predicts the dataset better than the model does [33].
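The three measures can be written compactly as below. RSS and E follow their standard definitions. BIAS is taken here as the mean residual (predicted minus observed, divided by n), which is an assumption consistent with the sign convention described above. The function names are hypothetical.

```python
def rss(observed, predicted):
    """Residual sum of squares between predicted and observed runoff."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted))


def bias(observed, predicted):
    """Mean residual; positive values indicate over-prediction (assumed form)."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - rss(observed, predicted) / total


# Hypothetical observed and predicted runoff depths (mm)
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.2, 1.9, 3.3, 4.1]
print(round(rss(obs, pred), 3), round(bias(obs, pred), 3),
      round(nash_sutcliffe(obs, pred), 3))
```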

Inferential Statistics Assessment to Obtain Optimum λ and S
In total, 29 different pairs of λ and S values were derived from the rainfall-runoff data pairs of the Wangjiaqiao watershed. At the α = 0.01 level, the proposed method searched for the optimum value within the CI ranges of the mean and median λ values. Finally, an optimized pair of S and λ values was used to represent the watershed. The descriptive statistics of the λ and S values are tabulated in Table 2.
In addition to the skewness and kurtosis values of the λ and S datasets, the Shapiro-Wilk normality test gave p < 0.05 for both datasets, so the median is the better collective representation of λ and S for the watershed. Because neither dataset is normally distributed, the best collective λ and S values were optimized within the median confidence interval ranges of λ and S at the α = 0.01 level, minimizing the RSS between the predicted and observed runoff for the Wangjiaqiao dataset.
The optimum λ value was 0.043 and the optimum S value was 260.081 mm (denoted S_0.043). The product of the optimum S and λ values yields an initial abstraction (I_a) of 11.19 mm, which is smaller than the smallest rainfall amount in the dataset from [13]. It therefore fulfills the SCS constraint that I_a must be satisfied before any runoff occurs. Thus, based on the proposed watershed-specific SCS-CN calibration method, the runoff depth (Q) for the Wangjiaqiao watershed in China can be computed using Equation (12).
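Substituting the reported optimum values into the general form of the SCS-CN runoff equation gives the following calibrated relation (P and Q in mm). This is a sketch of what Equation (12) presumably looks like. The small difference between λS ≈ 11.18 mm and the reported 11.19 mm may reflect rounding of λ.

```latex
Q = \frac{(P - 0.043\,S_{0.043})^2}{P + 0.957\,S_{0.043}}
  \approx \frac{(P - 11.18)^2}{P + 248.90},
\quad P > 11.18; \ \text{otherwise } Q = 0,
\qquad S_{0.043} = 260.081\ \text{mm}
```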
Based on Table 2, neither the mean nor the median BCa 99% CI of λ includes 0.20; therefore, λ = 0.20 is not statistically significant for the dataset of [13] at the α = 0.01 level. Furthermore, the standard deviation of λ at the BCa 99% level is 0.034, and λ fluctuates by 77.14% between its lower and upper CI limits, which shows that λ behaves as a variable rather than a constant.
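The 77.14% figure appears to be the width of the 99% CI relative to its lower limit, assuming the limits of 0.035 and 0.062 reported for the median λ in the Conclusions:

```latex
\frac{\lambda_{\text{upper}} - \lambda_{\text{lower}}}{\lambda_{\text{lower}}}
  = \frac{0.062 - 0.035}{0.035} \approx 0.7714 = 77.14\%
```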

Watershed-Specific S Correlation Equation and CN for Wangjiaqiao Watershed in China
The authors of [13] referred to the S correlation equation derived by [18], in which the median λ value of 0.05 was reported as the better collective representation for US watersheds. According to [18], an S correlation equation is required to convert the conjugate CN value when λ is no longer equal to 0.20: a different λ value leads to a different S value, and the CN value changes accordingly. This study substituted λ = 0.043 and λ = 0.20 into Equation (8) with the respective rainfall-runoff data pairs to calculate the corresponding S_0.043 and S_0.20 values, and then determined the S correlation equation between S_0.043 and S_0.20 for the Wangjiaqiao watershed in SPSS. Equation (6) should not be adopted, as it was derived (in inches) to reflect watershed conditions in the United States of America [18]. SCS practitioners choose the CN value from the NEH handbook and calculate the S value with Equation (4). The correlation in Equation (13) has a low standard error of 0.228 mm, an adjusted R² of 0.998 and a p value of less than 0.001. As the optimum S_0.043 = 260.081 mm, the equivalent S_0.20 value can be found by using Equation (13).
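As a consistency check, the equivalent S_0.20 can be back-calculated from the reported CN of 72.28 with the standard metric CN transformation. This is an inference from the reported CN, not the output of Equation (13) itself.

```latex
S_{0.20} = \frac{25400}{CN_{0.20}} - 254 = \frac{25400}{72.28} - 254 \approx 97.41\ \text{mm},
\qquad
CN_{0.20} = \frac{25400}{97.41 + 254} \approx 72.28
```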

Asymptotic CN of Wangjiaqiao Watershed
Many studies concluded that the CN value can be derived from rainfall-runoff data pairs [34][35][36]. In 1993, the asymptotic CN fitting method (AFM) was introduced to determine the CN of a watershed from rainfall-runoff data pairs alone [34]. At the Wangjiaqiao watershed, the AFM detected the standard CN behavior, with CN stabilizing at 65.10 (see Figure 2); hence, the equivalent S_0.20 value of the asymptotic CN can be calculated with Equation (4) as 136.17 mm. The I_a value, being the product of λ and S, is then 27.24 mm. However, ten of the twenty-nine rainfall events (34.48%) in Table 1 are smaller than this calculated I_a value. The AFM therefore derived an I_a value that conflicts with the aforementioned SCS constraint, since it implies that no runoff would be generated from any rainfall amount below I_a.

Residual Modeling and the Corrected Equation
Some researchers developed new models or modified the existing SCS-CN rainfall-runoff model by adding more parameters to improve the accuracy of surface runoff prediction [14,16,22,37-44]. However, those modified rainfall-runoff models do not solve the problem faced by SCS practitioners, or by any software that has already integrated the conventional SCS-CN model or embedded λ = 0.20 in its algorithm.
To correct the runoff prediction difference (Q_v) between the conventional SCS-CN model and the new calibrated SCS-CN model, and thereby benefit SCS practitioners in their current practice, residual analyses were conducted between the two models to form a correction equation, as shown in Equation (14). Q_v was fitted against rainfall depth with several non-linear regression models in SPSS.
where Q_v is the runoff prediction difference (mm) obtained by subtracting the prediction of Equation (12) from that of Equation (3), and P is the rainfall depth (mm). A positive Q_v indicates that Equation (3) predicted a larger runoff amount than Equation (12). Plotting Q_v shows that Equation (3) produced inconsistent runoff predictions: it over-predicted runoff when rainfall was less than 16 mm or more than 36 mm, but under-predicted runoff when rainfall depth was between 16 mm and 36 mm (see Figure 3). Equation (14) achieved an adjusted R² close to 1.0 and a low standard error of the estimate (0.053 mm) with statistical significance (p < 0.001). As such, Equation (14) can be used to amend Equation (3) and improve its runoff prediction accuracy, as shown in Equation (15).
where P > 0.2S, else Q = 0. Equation (15) adjusted and improved the runoff predictions of Equation (3): the RSS was reduced by 66%, the over-prediction tendency was corrected by 96%, and the E index improved by 71% to 0.826, giving runoff predictions close to those of the new calibrated SCS-CN model. Without model calibration, the conventional SCS-CN model over-predicted runoff by almost 155,000 m³ at a rainfall depth of 85.90 mm when compared to the newly calibrated SCS-CN model at the 1670 ha Wangjiaqiao watershed in China. The over-prediction risk would be even worse for heavier rainfall. On the other hand, the conventional model also under-predicted runoff by up to 8000 m³ at a rainfall depth of 26.5 mm. This shows that the conventional SCS-CN model was not only statistically insignificant at the α = 0.01 level, but also produced inconsistent runoff predictions at different rainfall depths.
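The quoted volumes can be approximately reproduced from the reported parameters. The sketch below assumes that the conventional model uses the optimized S_0.20 = 100.8 mm reported for a fixed λ = 0.20, and that the calibrated model uses λ = 0.043 with S = 260.081 mm. The function and constant names are hypothetical.

```python
AREA_HA = 1670.0          # watershed area reported in the study
M3_PER_MM_PER_HA = 10.0   # 1 mm of depth over 1 ha equals 10 cubic meters


def runoff(p_mm, lam, s_mm):
    """SCS-CN runoff depth (mm)."""
    ia = lam * s_mm
    return (p_mm - ia) ** 2 / (p_mm - ia + s_mm) if p_mm > ia else 0.0


def volume_difference_m3(p_mm):
    """Conventional minus calibrated runoff volume (cubic meters) at rainfall p_mm."""
    conventional = runoff(p_mm, 0.20, 100.8)     # assumed S_0.20 for fixed lambda
    calibrated = runoff(p_mm, 0.043, 260.081)    # reported optimum pair
    return (conventional - calibrated) * AREA_HA * M3_PER_MM_PER_HA


for depth in (26.5, 85.9):
    print(depth, round(volume_difference_m3(depth)))
```

Under these assumptions the difference is positive (over-prediction) at 85.9 mm and negative (under-prediction) at 26.5 mm, in line with the magnitudes stated above.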

Comparison of Runoff Prediction Models
The law of parsimony favors a simple model with fewer fitting parameters. As such, this study explored the possibility of using a linear rainfall-runoff regression model to quantify the runoff behavior at the Wangjiaqiao watershed. Past researchers proposed that the slope of a linear fitting equation could represent the total impervious area of a watershed, while the fitting constant was regarded as the depression loss [45]. In this study, the best linear rainfall-runoff regression model fitted in SPSS for the Wangjiaqiao watershed was Q = 0.214P − 4.623, with an adjusted R² of 0.715 and a standard error of the estimate of 2.779 mm. Both the slope and the constant were statistically significant (p < 0.001), but the model produced negative runoff predictions for five of the twenty-nine events (17.2%).
Table 2 implies that λ can be neither 0.20 nor a constant at the Wangjiaqiao watershed in China. As a result, the conventional SCS-CN model becomes invalid and not statistically significant. When λ is fixed at 0.20, the optimized S_0.20 is 100.8 mm and the I_a value can be calculated as 20.16 mm, but this I_a value violates the SCS constraint, as 17.24% of the rainfall depths in the data pairs from [13] were less than I_a. Thus, the conventional SCS-CN model faced the same problem as the AFM and the linear regression model. Moreover, the conventional SCS-CN model had the lowest E index and the highest RSS and BIAS when compared to the AFM, the linear regression model and the newly calibrated watershed-specific SCS-CN model.
The statistics of the five runoff predictive models are tabulated in Table 3. Residual analyses were conducted in SPSS to measure the runoff prediction error of every model. The model with the smallest residual confidence interval range and the lowest standard deviation and variance of error was taken to be the best runoff predictive model of this study.
The SPSS normality test showed that the significance value (p) of the Shapiro-Wilk test was greater than 0.05 for the AFM model, the newly calibrated watershed-specific SCS-CN model, the corrected SCS-CN model and the linear regression model; thus, their mean residual values were used to compare model accuracy. On the other hand, the p value of the Shapiro-Wilk test for the conventional SCS-CN model was less than 0.05; hence, its median residual value was used as the benchmark for its accuracy.
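A quick calculation shows where each of these two models stops producing positive runoff. The sketch uses only the reported regression coefficients and the reported S_0.20 of 100.8 mm. The variable names are hypothetical.

```python
SLOPE = 0.214        # reported regression slope
INTERCEPT = -4.623   # reported regression constant (mm)


def linear_runoff(p_mm: float) -> float:
    """Runoff depth (mm) predicted by the fitted linear model Q = 0.214P - 4.623."""
    return SLOPE * p_mm + INTERCEPT


# Rainfall depth below which the linear model predicts negative runoff
linear_threshold_mm = -INTERCEPT / SLOPE

# Rainfall depth below which the conventional model predicts zero runoff (I_a = 0.2 S)
conventional_threshold_mm = 0.20 * 100.8

print(round(linear_threshold_mm, 2), round(conventional_threshold_mm, 2))
```

Both thresholds sit near 20 mm, so neither model can represent runoff from smaller storms.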

The mean residual value of the newly calibrated, watershed-specific SCS-CN model was among the lowest (0.056 mm) and nearest to zero, while the 99% BCa confidence interval of its mean residual spanned a small range compared to the other models. In addition, the newly calibrated model had a low residual variance and standard deviation, which indicates that it can predict runoff with low error. As a result, the newly calibrated watershed-specific SCS-CN model is a suitable runoff predictive model for the twenty-nine data pairs at the Wangjiaqiao watershed in this study. The corrected SCS-CN model also corrected the runoff prediction errors of the conventional SCS-CN model and achieved predictions close to those of the newly calibrated watershed-specific SCS-CN model, which shows that the presented residual modeling technique was effective at transforming Equation (3) into a better rainfall-runoff model.
Without model calibration, the conventional SCS-CN model significantly over-predicted runoff volume from rainfall depths of 40 mm onward at the 1670 ha Wangjiaqiao watershed in China (see Figure 4). Through Equation (14), the over-predicted runoff volume can be quantified and modeled as a function of rainfall depth. The equation fitted in SPSS, Q_v (m³) = 21.01P² + 642.62P − 54,520, quantified the runoff volume over-predicted by the conventional SCS-CN model with an adjusted R² close to 1.0 and a low standard error of 712.92 m³ (p < 0.001).
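The fitted quadratic can be evaluated directly, as in the sketch below. It assumes the fit is intended for rainfall depths of about 40 mm and above, the range discussed in this paragraph. The function name and the conversion to an equivalent depth over the 1670 ha watershed are illustrative additions.

```python
AREA_M2 = 1670.0 * 10_000.0   # 1670 ha expressed in square meters


def overprediction_volume_m3(p_mm: float) -> float:
    """Over-predicted runoff volume (cubic meters) from the reported quadratic fit."""
    return 21.01 * p_mm ** 2 + 642.62 * p_mm - 54_520.0


def equivalent_depth_mm(volume_m3: float) -> float:
    """Convert a volume over the watershed into an equivalent depth in millimeters."""
    return volume_m3 / AREA_M2 * 1000.0


vol = overprediction_volume_m3(85.9)
print(round(vol), round(equivalent_depth_mm(vol), 2))
```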

Conclusions
A new watershed-specific SCS-CN calibration method was proposed to identify the optimum λ and S values and to derive CN using non-parametric inferential statistics, rainfall-runoff data pairs and a supervised non-linear numerical optimization technique. The proposed method was applied to the Wangjiaqiao watershed in China. Inferential statistics showed that the conventional SCS-CN model was not statistically significant at the α = 0.01 level and was therefore not applicable for modeling the runoff conditions of the Wangjiaqiao watershed. The proposed model identified the optimum median λ of 0.043 (with a 99% confidence interval of 0.035 to 0.062) as the best collective λ for the Wangjiaqiao watershed. A watershed-specific S correlation equation was derived in this study to show that the CN value of the Wangjiaqiao watershed can be obtained directly, without referring to the NEH handbook. The estimated CN of the Wangjiaqiao watershed was 72.28, with a 99% confidence interval of 62.06 to 78.00.
The newly calibrated watershed-specific SCS-CN model improved on the results of the previous study: the E index increased by 7.4%, the BIAS of the predictive model was reduced by 93.8% and the RSS was lowered by 24.4%. These improvements were achieved with a λ of 0.043 instead of rounding it to 0.050. The proposed model also outperformed the AFM model, the conventional SCS-CN model and the linear regression model in predicting runoff at the Wangjiaqiao watershed, with the lowest BIAS and RSS and the highest E index among those models. The linear regression model had the second highest inaccuracy after the conventional SCS-CN model, and both models produced negative runoff prediction results that have no meaningful hydrological interpretation for surface runoff at the Wangjiaqiao watershed. Both models were also unable to produce positive runoff predictions for rainfall depths of less than 20 mm.
A runoff correction equation was formulated through the proposed residual modeling technique. The equation corrected the runoff inconsistencies of the conventional SCS-CN model and improved its prediction accuracy. An S correlation equation from another study cannot be adopted, as it reflects the conditions of specific watersheds; it must be derived with the watershed-specific λ and rainfall-runoff data pairs in order to convert CN_λ into an equivalent CN_0.20. This study also found that rounding the λ and CN values induces runoff prediction errors. CN values with at least two decimal places are therefore recommended to SCS practitioners for future studies.
Based on the proposed SCS-CN calibration method, Equation (12) is recommended for runoff prediction with the dataset from [13] at the Wangjiaqiao watershed in the Three Gorges Area. When a new rainfall-runoff dataset becomes available, SCS practitioners should re-derive the calibrated SCS runoff model with the proposed methodology.