Impacts of Data Quantity and Quality on Model Calibration: Implications for Model Parameterization in Data-Scarce Catchments

Abstract: The application of hydrological models in data-scarce catchments is usually limited by the amount of available data. It is of great significance to investigate the impacts of data quantity and quality on model calibration, and to further improve the understanding of the effective estimation of robust model parameters. How to make adequate use of external information to identify model parameters of data-scarce catchments is also worthy of further exploration. The HBV (Hydrologiska Byråns Vattenbalansavdelning) model was used to simulate streamflow at 15 catchments using input data of different lengths. The transferability of all calibrated model parameters was evaluated for two validation periods. A simultaneous calibration approach was proposed for data-scarce catchments by using data from the catchment with the smallest spatial distance. The results indicate that the transferability of model parameters increases with the amount of data used for calibration. The sensitivity to data length in calibration varies between the study catchments, while flood events have a key impact on surface runoff parameters. In general, ten years of data are relatively sufficient to obtain robust parameters. For data-scarce catchments, simultaneous calibration with a neighboring catchment may yield more reliable parameters than only using the limited data.


Introduction
Hydrological models are important tools for rainfall-runoff simulation and flood forecasting around the world [1]. Among the different types of models, conceptual models are widely used for the simulation of catchments at different scales due to their relatively low data requirements and simplicity of operation. The parameters of conceptual models normally cannot be estimated directly from catchment characteristics; they usually need to be calibrated on historical records, such as precipitation, temperature, evapotranspiration and discharge. For many locations, especially for small and medium-sized rivers (catchment areas between 200 km² and 3000 km²), observations are not available or the observation devices have only been established for a short time [2]. As a result, the application of models is often limited by insufficient data or poor data quality due to instrument failures or low reliability, which can lead to problems in parameter identification. The reliability of a model is highly affected by the data sets used for its calibration [3]. How to improve the transferability of model parameters in data-scarce catchments remains a challenge in hydrology [4,5]. Obtaining reliable parameter estimates and accurate flood forecasts in these areas is very important to reduce the loss of life and property [6].
Numerous studies have shown that model parameter estimation is sensitive to the number of data points used in the calibration, and many attempts have been made to investigate the data requirements for good model calibration. Yapo et al. [7] tested the sensitivity of the NWSRFS-SMA conceptual model to different lengths of input data and suggested that approximately eight years of data are needed to identify robust parameter estimates. Foglia et al. [8] showed that the parameters of the distributed TOPographic Kinematic APproximation and Integration (TOPKAPI) model are sensitive to the selection of input data. Subsequently, they introduced the generalized Cook's distance approach to effectively identify the impacts of data in hydrological modeling [9]. David et al. [10] evaluated the impacts of data length in a great number of study catchments using a two-stage hybrid framework; a high impact was also detected on the maximum predicted flows. Li et al. [11] concluded that eight years of data were sufficient to obtain reasonable model parameters, while longer data series did not necessarily yield better results.
Model calibration is also strongly impacted by the quality of the observational data. Two data series of the same length but with different hydroclimatic conditions often lead to different model parameters. Yapo et al. [7] indicated that observation records containing relatively wet weather conditions can significantly reduce the uncertainty of the model parameters. Wright et al. [12] evaluated the impact of individual data points on model simulation based on case-deletion methods and analytical diagnostics in catchments under different climate conditions; the results show that a single point could change the maximum streamflow prediction by more than 25%. A considerable number of studies have shown that the data series selected for model calibration and validation should be well representative of the various phenomena experienced by the catchment [13-16]. Continuous data series covering different weather conditions can effectively reduce the uncertainty of model parameter estimation [11].
However, as mentioned previously, for many small and medium-sized rivers, observations are only available for a short period. For example, through the construction project for early flood warning and forecasting for small and medium-sized rivers in China, a great number of rain gauges and hydrological stations have been built in the past few years [17]. Sudden floods caused by intense rainfall occur frequently in these regions, so it is urgent to establish accurate flood forecasting systems to reduce the losses caused by flood disasters. The main problem at present is that the accumulated data may be insufficient to determine the model parameters, which usually leads to high uncertainty in the parameter estimates. Over the past few decades, a considerable number of studies have predicted signatures in ungauged basins by transferring information from gauged catchments based on the idea of hydrological similarity [18,19]. Moreover, simultaneous calibration of a set of catchments was introduced to find model parameters that perform reasonably for all catchments involved in the calibration, which can improve the transferability of model parameters to ungauged basins [20]. This prompts us to consider how to adequately utilize the limited data records of a particular catchment together with information from other catchments to obtain accurate runoff forecasts [17]. Therefore, it is first of all important to evaluate the influences of data quantity and quality on model calibration and parameter identification, in order to learn how much data, and what kind of data sets, are needed to effectively identify robust model parameters. Building on this, further work should focus on how to make full use of the available information to obtain parameters with reasonable transfer performance in data-scarce catchments.
This study aims to investigate the impacts of data variation on model calibration, as well as to minimize the uncertainty of parameter estimation in data-scarce catchments. Two numeric experiments were conducted on 15 small and medium-sized catchments. In the first experiment, the impacts of data quality and quantity on model calibration were investigated: the HBV (Hydrologiska Byråns Vattenbalansavdelning) model was calibrated for all study catchments using data of different lengths, and the transferability of all calibrated model parameters over two different decades was then assessed. In the second experiment, to explore a possible solution for reducing the uncertainty of model parameterization in data-scarce catchments, a simultaneous calibration approach was applied using information from the catchment with the smallest spatial distance.
This study is organized as follows: after the introduction, Section 2 gives a brief description of the study area and the hydro-meteorological data. Section 3 explains the methods and the design of the two numeric experiments. The results and discussion are presented in Section 4. Section 5 presents the summary and outlook of this research.

Study Area and Hydro-Meteorological Data
The study domain is located in the Mid-Atlantic region of the United States. A total of 15 catchments were used in this study (Figure 1). These catchments are a subset of the dataset used for the Model Parameter Estimation Experiment (MOPEX) project [21]. The MOPEX project provides more than 50 years of continuous daily precipitation, potential evapotranspiration, average air temperature and daily streamflow for a large number of catchments. The daily precipitation and air temperature data were supplied by the National Climatic Data Center (NCDC, Asheville, NC, USA), while the discharge series were provided by United States Geological Survey (USGS) gauges. Detailed descriptions of the MOPEX catchments can be found in Duan [21]. Table 1 lists the catchment properties of the investigated area and Table 2 presents the climate conditions. The tables clearly show that the study catchments vary considerably not only in meteorological conditions but also in catchment characteristics. The smallest catchment has a size of 332 km², while the biggest covers about 2929 km²; the median size is about 1186 km². The study catchments are influenced by a humid continental climate with relatively warm summers and heavy snowfall in winter. Precipitation is distributed throughout the year and shows only a slight seasonality. Precipitation increases slowly in summer, but runoff reaches its lowest level of the year due to the high rate of evapotranspiration. From February to April, snowmelt leads to a significant increase in runoff generation. The percentage of snowfall during the cold season increases from 8.5% in the southern region to about 27% at the northeast coast, along with a decline in the long-term average temperature from 13.5 °C to 7.2 °C.

HBV Model
The conceptual HBV model was selected to simulate the rainfall-runoff response of the study catchments. The HBV model was originally developed at the Swedish Meteorological and Hydrological Institute (SMHI) in the early 1970s [22]. Compared with other hydrological models, the HBV model has a relatively simple structure and few parameters; it is therefore convenient for running a great number of simulations in a short time. After years of development, HBV has become a multipurpose model with a variety of applications in flood forecasting, water resources management and studies on the impacts of climate change around the world [23,24].
The HBV model consists of conceptual routines for snow accumulation and snowmelt, evapotranspiration and soil moisture, runoff generation and runoff concentration. The snow accumulation and snowmelt routine is based on the degree-day method with two parameters: the degree-day factor (DD) and the threshold temperature for snowmelt (TT). Actual soil moisture is calculated by balancing precipitation and actual evapotranspiration, using the field capacity (FC) and the permanent wilting point (PWP) as parameters. If the soil moisture exceeds PWP, the actual evapotranspiration occurs at the potential rate; otherwise, the evapotranspiration is limited by the ratio of actual soil moisture to PWP. Runoff generation is calculated by a nonlinear function of precipitation and actual soil moisture with a shape coefficient (Beta). The generated runoff is separated into three flow components: surface flow, interflow and groundwater, represented by two linear reservoirs with corresponding storage constants (K0, K1 and K2). The surface flow is restricted by the threshold water level of the upper reservoir (HL). The upper and lower reservoirs are connected by a linear percolation rate (KD). Finally, the local runoff is routed to the outlet through a transformation function based on a triangular weighting parameter (Maxbas). More detailed descriptions of the HBV model can be found in our previous studies [20,25].
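The routines described above can be summarized in a short sketch of one daily time step. This is an illustrative simplification under stated assumptions (routing by Maxbas is omitted, K0/K1/K2 are treated as outflow coefficients per day, and the state and parameter names are chosen here for readability), not the authors' implementation:

```python
def hbv_step(precip, temp, pet, state, p):
    """One daily step of a simplified lumped HBV model (illustrative sketch).

    `state`: dict with snowpack ("snow"), soil moisture ("sm") and the two
    reservoir levels ("upper", "lower"), all in mm.
    `p`: dict with the parameters named in the text (DD, TT, FC, PWP, Beta,
    K0, K1, K2, HL, KD).
    """
    # Snow routine: degree-day melt above the threshold temperature TT
    if temp < p["TT"]:
        state["snow"] += precip                      # precipitation falls as snow
        liquid = 0.0
    else:
        melt = min(state["snow"], p["DD"] * (temp - p["TT"]))
        state["snow"] -= melt
        liquid = precip + melt

    # Soil routine: nonlinear partitioning with shape coefficient Beta
    recharge = liquid * (state["sm"] / p["FC"]) ** p["Beta"]
    state["sm"] += liquid - recharge

    # Actual evapotranspiration, reduced below the wilting point PWP
    aet = pet if state["sm"] > p["PWP"] else pet * state["sm"] / p["PWP"]
    state["sm"] = max(state["sm"] - aet, 0.0)

    # Response routine: two linear reservoirs connected by percolation KD
    state["upper"] += recharge
    q0 = p["K0"] * max(state["upper"] - p["HL"], 0.0)  # surface flow above HL
    q1 = p["K1"] * state["upper"]                      # interflow
    perc = p["KD"] * state["upper"]                    # percolation to lower box
    state["upper"] -= q0 + q1 + perc
    state["lower"] += perc
    q2 = p["K2"] * state["lower"]                      # baseflow
    state["lower"] -= q2
    return q0 + q1 + q2                                # total runoff (mm/day)
```

Calling `hbv_step` over a daily forcing series, with the output passed through a Maxbas-weighted routing, reproduces the model structure used in this study in outline.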
The lumped version of the HBV model was used to simulate the daily rainfall-runoff response, with areal mean precipitation, potential evapotranspiration and mean air temperature as inputs. As shown in Table 3, a total of 9 parameters were calibrated using historical data. The initial ranges of the model parameters were determined according to the literature and pretest results. The robust parameter estimation (ROPE) algorithm was selected for parameter optimization [26]. The ROPE algorithm is based on the concept of the data depth function; its basic idea is to seek the central points of the multidimensional space constructed from all good parameter sets. Monte-Carlo random sampling is used to generate a pre-given number of parameter sets within the possible ranges of the model parameters. ROPE is a very efficient calibration method that results in a pre-given number of parameter sets with fairly good model performance.
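The iterative idea behind ROPE can be sketched as follows. This is an illustrative simplification, not the published algorithm: the real ROPE uses half-space data depth, whereas here a Mahalanobis-distance proxy stands in for depth, and resampling is reduced to uniform draws within the bounding box of the retained good parameter sets.

```python
import numpy as np

def rope_sketch(objective, bounds, n_sample=2000, n_iter=4, frac=0.1, rng=None):
    """Illustrative sketch of the ROPE idea (maximizes `objective`).

    Uniform Monte-Carlo samples are drawn within `bounds`; the best `frac`
    are retained; new candidates are drawn around the retained set and only
    "deep" points (here: small Mahalanobis distance to the centre of the
    good set) are kept for the next iteration.
    """
    rng = rng or np.random.default_rng(0)
    lo, hi = np.asarray(bounds, dtype=float).T
    pop = rng.uniform(lo, hi, size=(n_sample, len(lo)))
    for _ in range(n_iter):
        scores = np.apply_along_axis(objective, 1, pop)
        good = pop[np.argsort(scores)[-int(frac * len(pop)):]]   # best fraction
        # Resample inside the bounding box of the good parameter sets
        cand = rng.uniform(good.min(0), good.max(0), size=(n_sample, len(lo)))
        # Depth proxy: keep candidates closest to the centre of the good set
        cov = np.cov(good.T) + 1e-9 * np.eye(len(lo))
        diff = cand - good.mean(0)
        d = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        pop = cand[np.argsort(d)[: n_sample // 2]]               # deepest half
    return pop
```

The returned population plays the role of the 10,000 near-equivalent parameter sets described in the text: many distinct parameter vectors with comparably good objective values.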
Considering the nonuniqueness of model parameters, each calibration in this study results in 10,000 parameter sets with very similar model performance but different parameter values. All calibrated parameter sets were transferred to the other periods. For statistical purposes, the mean model efficiency of the optimal 10,000 parameter sets was used to represent the model performance.

Table 3. Description and initial range of the HBV model parameters.

Parameter   Description                                          Max    Min
TT          Threshold temperature for snowmelt initiation (°C)
HL          Threshold water level for near-surface flow (mm)     100    1

Performance Criteria
The Nash-Sutcliffe coefficient (NS) between the observed and modeled discharge is the most frequently used performance criterion in hydrological modeling [27]:

NS = 1 - \frac{\sum_{t=1}^{T} \left( Q_o(t) - Q_m(t) \right)^2}{\sum_{t=1}^{T} \left( Q_o(t) - \overline{Q_o} \right)^2}

where Q_o(t) represents the observed discharge at time t and Q_m(t) the corresponding modeled discharge. \overline{Q_o} is the mean observed discharge over the whole calibration period.
Many studies have shown that the selection of the performance criterion has a strong impact on model performance [28]. In this study, according to the available observations, the HBV model was run on a daily scale. The goal of the calibration is to capture the dynamic behavior and to achieve the water balance simultaneously. The NS efficiency is based on the squared difference between the modeled and observed runoff and therefore pays more attention to high flows than to low flows. Consequently, a performance criterion incorporating the water balance into the NS efficiency was used in model calibration. Viney et al. [29] suggested combining the NS efficiency and a bias constraint using the following equation:

NSB = NS - 5 \left| \ln(1 + B) \right|^{p}

where B denotes the relative bias between the total simulated and observed runoff, and p is a balance factor that specifies the weight controlling the severity of the constraint penalty. The value of p is 2.5 in this study. This formula takes into account both a reasonable water balance and accurate runoff dynamics. The abbreviation NSB (Nash-Sutcliffe and Bias) is used subsequently for this performance measure.
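The two criteria can be written compactly in code. The sketch below assumes the combined criterion takes the Viney et al. [29] form NSB = NS − 5|ln(1 + B)|^p with p = 2.5:

```python
import numpy as np

def nash_sutcliffe(q_obs, q_mod):
    """Nash-Sutcliffe efficiency between observed and modeled discharge."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_mod = np.asarray(q_mod, dtype=float)
    return 1.0 - np.sum((q_obs - q_mod) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

def nsb(q_obs, q_mod, p=2.5):
    """Combined NS-and-bias criterion (assumed Viney et al. form)."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_mod = np.asarray(q_mod, dtype=float)
    bias = (np.sum(q_mod) - np.sum(q_obs)) / np.sum(q_obs)  # relative bias B
    return nash_sutcliffe(q_obs, q_mod) - 5.0 * abs(np.log(1.0 + bias)) ** p
```

For an unbiased simulation the penalty term vanishes and NSB equals NS; any systematic volume error reduces NSB below NS.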

Numeric Experiments
In this study, two numeric experiments were designed to evaluate the influences of data quality and quantity on model parameterization, to investigate how much observational data are sufficient or necessary for good model calibration, and to explore potential solutions for reducing the uncertainty of model parameterization in data-scarce regions.
Numeric Experiment 2 addresses the question of how to incorporate external information into the modeling of data-scarce catchments. We assumed that the target catchment is a data-scarce catchment with only one year of data, while the neighboring catchment with the smallest spatial distance has long-term observations. The data from the neighboring catchment were also considered in model calibration to reduce the uncertainty of the parameters. A simultaneous calibration approach was proposed to calibrate the models for the data-scarce catchment and the neighboring catchment at the same time. The goal is to determine robust parameters for the data-scarce catchment by using information from the data-rich catchment. The simultaneous calibration is a multiobjective optimization; the objective function can be defined as follows:

O(\theta) = NSB_n(\theta) + NSB_s(\theta) - \mu \cdot \max\left( NSB_n^* - NSB_n(\theta),\; NSB_s^* - NSB_s(\theta) \right)

Here, O(θ) is the objective function for a given parameter set θ. NSB*_n and NSB*_s denote the optimal NSB for the neighboring catchment and the data-scarce catchment, respectively; the optimal performance is represented by the model performance of the individual calibration. NSB_n(θ) and NSB_s(θ) denote the NSB values of parameter set θ for the neighboring and the data-scarce catchment, respectively. The greater the value of µ, the more the largest loss in model performance contributes to the objective. The aim is to maximize the objective function and to find parameter sets that perform well for all catchments involved in the calibration. A value of 4 was chosen for the balance factor µ to obtain reasonable performance for both catchments.
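A minimal sketch of such a composite objective follows. It assumes the penalty is the largest performance loss relative to the individually calibrated optima, weighted by the balance factor µ; this is one plausible reading of the description in the text, not the authors' published formulation:

```python
def simultaneous_objective(nsb_n, nsb_s, nsb_n_opt, nsb_s_opt, mu=4.0):
    """Composite objective for simultaneous calibration (illustrative form).

    nsb_n, nsb_s:         NSB of parameter set theta for the neighboring
                          and the data-scarce catchment.
    nsb_n_opt, nsb_s_opt: optimal NSB from the individual calibrations.
    mu:                   balance factor weighting the largest loss.
    """
    # Penalize the larger of the two losses relative to the individual optima
    worst_loss = max(nsb_n_opt - nsb_n, nsb_s_opt - nsb_s)
    return nsb_n + nsb_s - mu * worst_loss
```

Maximizing this quantity rewards parameter sets that keep both catchments close to their individually achievable optima; a larger µ forces the optimizer to sacrifice total performance in order to reduce the single worst loss.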

Comparison of Calibration and Validation Performance
First, the model performances of the calibration on one-year data and the validation on ten-year data were compared. Figure 2 plots the NSBs of the calibration on one-year data against the first validation period (1970-1979). The points are poorly correlated, with a correlation coefficient of 0.45. Most points lie below the diagonal, indicating that the calibration performances are generally higher than the validation values. The result also demonstrates that a parameter set with a high calibration NSB may not yield good validation performance; however, a good calibration result is a prerequisite for good validation. Second, for the two validation periods, the transferred model performances using parameters calibrated on one-year data were compared (Figure 3). The NSBs for the two validation periods are similar, with a correlation of 0.8. The correlations between the two validation periods for the two-year, five-year and ten-year data-based parameter estimates are 0.82, 0.84 and 0.84, respectively. The high correlations imply that the transferability of the model parameters is relatively stable: good model performance for the 1970s generally coincides with good simulation for the 1980s and vice versa. The difference in model performance between the two validation periods increases as the NSBs decrease, indicating that low-performance simulations are more sensitive to the specific period. This is mainly due to the high uncertainty of the model parameters.

Impact of Data Quality
First, the results of the individual calibrations for the two validation periods were compared. As expected, the model performances show a high positive correlation of 0.93 between 1970-1979 and 1980-1989. The histograms in Figure 4 present the calibration results for these two periods. Due to measurement errors, the HBV model performs differently across the catchments. For both validation periods, catchment 15 has the lowest NSB value and catchment 12 the highest. For a more equitable comparison between catchments, we assumed that the individual calibration result is the optimal performance for the validation period; therefore, all validation performances were normalized by the optimal performances. A higher relative performance means better transferability of the parameters, and a value of 100% means a perfect parameter transfer. The plus signs in the upper part of Figure 4 show the relative NSBs obtained by transferring model parameters from the one-year data-based calibrations (20 validations for each catchment). The results show that, for a specific validation period, the model parameters obtained from short-period data perform differently: some parameter sets reproduce the rainfall-runoff response of the study catchments well, while others could not achieve reasonable relative NSBs in validation. The plus signs also show that, in general, the transferred model performances for 1980-1989 outperform those for 1970-1979. For all study catchments, most of the parameters estimated from one-year data achieve relative performances above 60%. For catchments 12 and 15, all parameters calibrated on one-year data perform well, with relative performances greater than 80% for both validation periods. The impacts of data with the same length but different quality on model calibration are examined next.
For the calibrations based on one-year data, the peak flow and the 10th-percentile high flow were calculated. Figure 5 plots the relative NSBs for 1970-1979 against the peak flow and the 10th-percentile high flow of the calibration period, respectively. The scatterplots clearly show that most of the poorly transferred performances (relative NSB below 80%) correspond to relatively small peak flows in the calibration period (less than 20 mm). The correlation between model performance and the 10th-percentile high flow is not as clear as that for the peak flow. A likely explanation is that data sets with low peak flows do not contain as many flood events as data sets with high peak flows. In this case, the calibration is unable to capture the dynamic behavior well and to reproduce the flood events. As a result, the model parameters were underestimated and were not suitable for transfer to other time periods with different climate conditions. Furthermore, the correlation between the peak flow and the model parameters was explored. As mentioned previously, 10,000 parameter sets were obtained for each calibration by the ROPE algorithm; the mean and standard deviation of all optimal parameters were therefore used to represent the distribution of the parameters. The results demonstrate that the peak flow has a significant influence on the estimation of the surface runoff parameters. Figure 6 shows the results for the two parameters describing surface runoff: the threshold water level (HL) and the near-surface flow storage constant (K0). For one-year data-based calibration, the estimates of HL and K0 are strongly affected by the peak flow. A low peak flow implies that the calibration data may not provide sufficient information to cope with various climatic conditions, so the hydrological response of the catchment cannot be captured by the model.
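The two flow statistics used above can be computed directly from a daily runoff series. The sketch below reads the "10th-percentile high flow" as the flow exceeded on 10% of days, i.e. the 90th percentile of the series; that interpretation is an assumption:

```python
import numpy as np

def flow_statistics(q_daily):
    """Peak flow and 10th-percentile high flow of a calibration period.

    `q_daily` is a daily runoff series (mm/day). The high-flow statistic is
    taken here as the 90th percentile (flow exceeded 10% of the time) — an
    assumed reading of "10th percentile high flow".
    """
    q = np.asarray(q_daily, dtype=float)
    return q.max(), np.percentile(q, 90)
```

Applied to each one-year calibration window, these two numbers give the x-axes of the scatterplots in Figure 5.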
Therefore, the calibrated parameters are not suitable for transfer to different time periods. This phenomenon has also been observed in previous studies on the influence of data in model calibration. Yapo et al. [7] found that, for the same length of data, a data set with wet conditions is more likely to yield robust parameter estimates than a data set with dry conditions. Singh and Bárdossy [30] indicated that a model calibrated on a small subset of unusual flood events can achieve transferability as good as parameters calibrated on the whole observation period. Wright et al. [9] showed that, for a two-year daily simulation in a semi-arid catchment, removing a single peak flow record could strongly affect the estimation of the model parameters and the prediction of high flows. The results of this experiment indicate that data quality is of great importance in model calibration. Abundant flood information is important for parameter identification, which is also a typical challenge for model simulation in data-scarce catchments.

Impact of Data Quantity
The effects of data quantity on model calibration were investigated through the transferred results of parameters calibrated on different lengths of data. Following the design of the first experiment, 20, 10, 4 and 2 validation results were obtained from calibrations based on 1, 2, 5 and 10 continuous years of data, and the mean model performance for each data-length category was taken as the transfer result. For a better comparison between catchments, the results were normalized by the optimal performance of each catchment. Figure 7 shows the mean relative NSBs obtained by transferring parameters calibrated on different data lengths to 1970-1979 and 1980-1989, respectively. As expected, for most catchments the relative NSBs increase significantly with the length of the calibration data for both validation periods. The greatest increase in model performance occurs for most catchments when the calibration data increase from two to five years. The sensitivity to data length in model parameterization varies among the study catchments. For example, the transferability of parameters calibrated on different data lengths is very similar for catchments 9 and 15 in the validation period 1970-1979, and for catchments 2, 9 and 15 in the validation period 1980-1989; for these catchments, increasing the amount of calibration data leads to only a slight improvement in validation. For catchment 1, however, the transferred results improve markedly when more years of data are used for parameter estimation. It can also be seen that for some catchments (catchments 2, 3, 9 and 14), the validation NSBs for 1970-1979 obtained by transferring parameters from the ten-year data-based calibrations are slightly smaller than the results based on five-year data. The reason is that the model parameters may be over-adjusted in calibration to anomalously dry (or wet) climate conditions.
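The splitting of the 20-year record (1950-1969) into non-overlapping calibration windows, which yields the 20, 10, 4 and 2 calibrations per catchment described above, can be sketched as:

```python
def calibration_windows(years, lengths=(1, 2, 5, 10)):
    """Split a list of calendar years into non-overlapping calibration
    windows of each requested length.

    For a 20-year record this yields 20 one-year, 10 two-year, 4 five-year
    and 2 ten-year windows, matching the experiment design in the text.
    """
    return {L: [years[i:i + L] for i in range(0, len(years), L)]
            for L in lengths}
```

Each window then defines one calibration run, whose parameter sets are validated against the two fixed decades 1970-1979 and 1980-1989.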
There are only 4 and 2 validation results for the five-year and ten-year data-based calibrations, respectively; therefore, a single poorly transferred performance may have a significant impact on the statistics. For parameter transfer from the ten-year data-based calibrations, 11 out of 15 catchments for 1970-1979 and 14 out of 15 catchments for 1980-1989 obtain relative NSBs above 90%. This indicates that ten-year data are sufficient to achieve robust parameter estimates for the study catchments. The distributions of the model parameters calibrated on different data lengths were also compared. The upper part of Figure 8 shows the typical distributions of the parameters HL and K0 for two catchments, and the lower part of the figure shows the corresponding transferred NSBs for 1970-1979. The range of the selected parameters decreases with the amount of data, indicating that the uncertainty of the parameterization can be reduced if more information is included in the calibration. The model performances improve notably with the amount of data used for calibration. However, for catchment 01611600, the relative NSB decreases by 6% when the data length is increased from five to ten years. This is mainly due to observational errors and special weather conditions during the calibration period: the model was adjusted specifically to the observation period, resulting in over-fitted model parameters. The sensitivity of model calibration to data length is consistent with the general findings of Yapo et al. [7], Anctil et al. [30] and Li et al. [16], who showed that model parameter estimation is strongly affected by the amount of data used in calibration.
Based on the large number of experiments that have been carried out with different types of models in regions with different climate and underlying conditions, it can be concluded that, in general, about eight years of data are required to obtain reliable model parameter estimates. In this study, the results show that including more data in the model calibration usually leads to better transferred model performance, and the transferability of the model parameters is quite stable when ten-year data are used for calibration.
In this study, we assumed that the model parameters are constant over time and we did not consider the variation of climatic conditions and catchment characteristics in parameter transfer. However, due to the non-stationary conditions, the model parameters may vary [1,14]. The purpose of modeling is to predict future signatures. Therefore, model parameters should represent the expected climatic conditions and can be transferred to different time periods. Model simulation based on the separation of data series into different climate conditions (e.g., dry and wet periods, warm and cold periods) can help to detect the temporal variations of the model parameters. The transferability of model parameters under non-stationary conditions is worthy of investigation in future work.

Simultaneous Calibration for Parameterization in Data-Scarce Catchments
As shown before, model parameterization is greatly influenced by the selection of calibration data.
Increasing the length of the observations shows certain improvements when transferring the parameters to other periods. Furthermore, a simultaneous calibration of the HBV model was performed for a data-scarce catchment with one year of data and its neighboring catchment with ten years of data (1950-1959) to identify robust parameters for the data-scarce catchment. We treated every study catchment as a data-scarce catchment with only one year of data for calibration. For the data from 1950 to 1969, the simultaneous calibration was carried out for each year separately, resulting in 20 calibrations, and the validation performance was again evaluated for the periods 1970-1979 and 1980-1989. The results show that simultaneous calibration always leads to slightly weaker calibration performance than the individual calibration based on one-year data. However, the transferred results for the validation periods indicate the robustness of the simultaneous calibration. Taking the validation results for 1970-1979 as an example, the mean NSB over all catchments is 0.62 for the individually calibrated parameters and slightly higher, 0.64, for the simultaneously calibrated parameters. The transferability of the simultaneously calibrated parameters is compared with that of the individually calibrated parameters in Figure 9. The scatterplots show that incorporating information from a neighboring catchment can provide more reliable parameters for data-scarce catchments: the simultaneously calibrated parameters outperform the individual parameters for about 65% of the validation results for 1970-1979 and 64% for 1980-1989. Furthermore, for approximately 55% of the validation results, the simultaneous calibration outperforms the individual calibration in both periods, while only about 20% of the individually calibrated parameters perform better than the simultaneous parameters in both validation periods.
The differences in the validated NSBs appear to be greater for data sets with relatively low performance. For data-scarce catchments, if the parameters cannot be effectively identified from the catchment's own data, using information from a neighboring catchment is a credible way to improve the accuracy of prediction. The results suggest that the simultaneous calibration approach offers a possible solution for model parameterization in data-scarce catchments. A certain number of points lie on the diagonal, indicating that for these cases the additional information does not improve the transferability of the parameters, and about a quarter of the points are located below the diagonal, indicating that the simultaneously calibrated parameters lead to weaker performance than those from the individual calibration. In this study, we assumed that catchments in spatial proximity are more likely to show similar dynamic behavior; catchment similarity measures were not within the scope of this study, and only the catchment with the shortest geographical distance was considered in the simultaneous calibration. Our previous study of simultaneous calibration for a set of catchments suggests that many catchments share parameters and that the selection of catchments for simultaneous calibration is important [20]. Several studies provide schemes for identifying catchment similarity based on catchment signatures [19,31,32], which could guide the selection of catchments for simultaneous calibration. In this experiment, only the information from one neighboring catchment was considered; simultaneous calibration for a set of similar catchments may be a promising approach to identify parameters with robust transferability for data-scarce catchments.

Conclusions and Outlook
In this study, we investigated the impacts of data quantity and quality on model calibration and parameter transfer, and explored a potential solution for model parameterization in data-scarce catchments. Two numerical experiments were conducted on 15 small and medium-sized catchments. The HBV model was calibrated with different lengths of data, and the model performances of parameter transfer for two validation periods were evaluated to investigate the impacts of data quality and quantity on model calibration. Meanwhile, the sensitivity of model parameters to high flows was compared. In addition, simultaneous calibration was proposed to incorporate information from neighboring catchments to improve parameter estimation in data-scarce catchments. The main findings of this study are: (1) The model performances of both calibration and validation were greatly affected by the observations used in model calibration. Good calibration results were usually a prerequisite for good validation. Due to differences in data quality and climatic conditions, model parameters calibrated on the same length of data still performed differently; (2) Flood events during the calibration period had a significant impact on the identification of model parameters, especially those related to surface runoff generation and concentration. A lack of flood information during the calibration period may lead to the underestimation of model parameters and cause high uncertainty. Abundant flood information was essential to identify model parameters with robust transferability in both space and time; (3) The transferability of the model parameters increased notably with the length of data used for model calibration. The sensitivity of parameter estimation to data length varied among the selected catchments.
Using ten-year data for calibration, most catchments attained more than 90% of their best validation performance, indicating that about ten years of data can achieve reliable parameter estimation for the study catchments; (4) For model parameter estimation in data-scarce catchments, the results showed that simultaneous calibration with a neighboring catchment can lead to more reliable parameter estimates than using the limited data alone. Model parameters can be identified with the aid of information from other catchments with a high degree of similarity. The simultaneous calibration approach thus offers a potential route to model parameterization in data-scarce catchments.
This research further demonstrates that model parameter estimation is a complex process. Although model parameterization is known to depend strongly on the observations used for calibration, this impact remains difficult to quantify because the sensitivity to data varies between catchments. Catchment similarity measurement was not explicitly treated in this research, and more studies are required to further investigate the simultaneous calibration approach for catchments with a similar hydrological response, which could further enhance the transferability of model parameters in data-scarce regions. In this study, all models were run at a daily time step. In small and medium-sized catchments, floods often converge to the outlet very quickly, with short lead times; therefore, the hourly response of the catchments also deserves exploration in future work.

Conflicts of Interest:
The authors declare no conflict of interest.