Improving the Reliability of Numerical Groundwater Modeling in a Data-Sparse Region

: In data-sparse areas, due to the lack of hydrogeological data, numerical groundwater models have some uncertainties. In this paper, a nested model and a multi-index calibration method are used to improve the reliability of a numerical groundwater model in a data-sparse region, the Nalinggele River catchment in the Qaidam Basin. Referencing this key study area, a regional three-dimensional groundwater ﬂow model is developed in a relatively complete hydrogeological unit. A complex set of calibration indices, including groundwater ﬁtting errors, dynamic groundwater trends, spring discharges, overﬂow zone location, and groundwater budget status, are proposed to calibrate the regional numerical groundwater model in the Nalinggele alluvial–proluvial fan. Constrained by regional groundwater modeling results, a local-scale groundwater model is developed, and the hydrogeological parameters are investigated to improve modeling accuracy and reliability in this data-sparse region.


Introduction
A groundwater system is a complex and open system that is affected by natural conditions and human activities [1]. Numerical groundwater modeling is used to analyze and predict local-or regional-scale groundwater flows under various conditions. It has already become one of the most important research and application tools in contemporary hydrogeology [2,3].
As society, the economy, and technology have developed, large-scale groundwater flow numerical modeling has received more and more attention [4]. These models are critical to many important applications, such as assessment and management of groundwater resources and planned hazardous waste disposal projects [5]. However, the most difficult part of groundwater modeling is its need for large amounts of geological, hydrogeological, and topographical data, which are usually expensive and hard to obtain [6]. Data for larger regions are generally deficient, mainly due to uneven data distribution and low levels regional verification. Without enough data, considerable uncertainty exists in the conceptualization and construction of the regional numerical groundwater model. Other than improving the database, there are two main methods of improving the reliability of numerical groundwater models.
The first is the nested model. Because the significance of local details can be better grasped when the larger system framework is known [7], nested models were used to improve model reliability, including both fully and loosely nested models. The local-scale model embedded in the large-area model belongs to a fully nested model, which was used in Interactive Ground Water (IGW) with real-time hierarchical modeling [8]. Furthermore, a larger-scale model was constructed that provided specified head boundaries for the smaller model where the boundary conditions of the smaller model were uncertain [9,10], an approach that is called a loosely nested model. Conversely, the larger-scale model could use the results of the more precise local-scale model to enhance its own accuracy [11]. This method improves reliability from the model-building perspective.
The second way to improve groundwater model reliability, which is used in major numerical groundwater simulation procedures, involves model calibration and validation, which can limit the uncertainty in the model construction process. Dynamic groundwater curve simulation is the most commonly used approach to justify groundwater models [12][13][14]. Sahoo et al. [15] and Dokou et al. [16] used mean absolute error (MAE), root mean squared error (RMSE), the correlation coefficient (R), and the normalized RMSE (NRMSE) between calculated and observed groundwater head to evaluate the performance of a groundwater flow model. For model calibration, groundwater-level fitting accuracy and consistency of dynamic trends between calculations and observations are the priorities.
In China, the earliest calibration standards are found in Groundwater Resource Management Model Work Requirements (GB/T14497-1993) [17], where the calibration standards for groundwater flow numerical simulation are as follows. In general, the number of nodes with an absolute error of groundwater head less than 0.5 m must be greater than 70% of all observed nodes. For areas with larger groundwater declines (greater than 5 m), the number of nodes with a relative error of groundwater head less than 10% must be greater than 70% of all observed nodes. In fact, these calibration standards are not easy to achieve. In the Geotechnical Industry Standard of China, Calculating Technical Requirements for Numerical Calculation of Groundwater Resources (DZ/T0224-2004) [18,19], the technical requirements for numerical groundwater model calibration and verification have been adjusted as follows. (1) For calibration of transient flow models, the trend of observed groundwater head must be consistent with the simulated values. (2) In general, the fitting error between observed and simulated values should be less than 10% of the magnitude of water-level fluctuations during the fitting period. If water-level fluctuations are small (<5 m), the water-level fitting error should be generally less than 0.5 m. (3) For calibration of steady-state flow models, the outlines of the simulated and measured flow fields should be consistent, and the directions of groundwater flow velocities should be uniform. In practice, these standards are hard to achieve. Despite the standard's change in emphasis from water-level fitting accuracy alone to considering simultaneously the fitting of groundwater variation trends and the accuracy of water levels, the core of the calibration standard still focuses on groundwater level.
For larger-scale groundwater models, the simulation results fit well with measured groundwater-level data in a single well, but this does not always fully prove the reliability of a groundwater model. To calibrate a larger-scale groundwater model, fitting at local scale is not enough; overall scope fitting is also necessary. Groundwater flow-field fitting and a groundwater budget were used in various ways to calibrate regional models by He et al. [12] and Gebreyohannes et al. [20]. Apparently, more constraints must be used to advance the calibration and validation of numerical groundwater models.
In recent work, nested models and model calibration have been used in different ways to improve model reliability, but research examples that combine the two approaches are infrequent. This study combines the two methods mentioned above to calibrate a groundwater model of the Nalinggele alluvial-proluvial fan and provide a scientific reference for improving the reliability of numerical groundwater modeling in data-sparse regions.

Study Area
Qaidam Basin is one of the largest inland basins in northwestern China. The average annual precipitation is about 29.8 mm, and the average annual evaporation is 1679.3 mm (Figure 1). The water resource shortage and the vulnerable ecological environment are the main characteristics of the Qaidam Basin. Furthermore, economic development is lagging, and the degree of hydrogeological investigation is low in the area. However, the rate of social and economic development in the region has accelerated significantly in recent years due to massive development of mineral resources. Hence, water resource requirements have been increasing rapidly, and it has become imperative to find a stable groundwater resource field. The Nalinggele River is the largest inland river in the Qaidam Basin and is the most abundant storage area for surface water and groundwater in the basin [21]. The area has huge potential to become the most important regional water supply field ( Figure 2). The Nalinggele River, which originates from the Kunlun Mountains, has a total length of 305 km and a catchment area of 21,898 km 2 , with an average annual runoff of 13.35 × 10 8 m 3 /year. The study area is the alluvial-proluvial fan, where groundwater resources are the most abundant. From the top of the alluvial fan to the front, Quaternary phreatic water in porous media presents continuous changes in water quantity, water quality, and burial conditions. The aquifer gradually changes from a single unconfined aquifer to a multi-layer aquifer ( Figure 3). The Nalinggele River flows from south to north into the basin, and the river water, after exiting the mountains, infiltrates into the aquifer and recharges the groundwater. The upper part of the alluvial-proluvial fan in front of the mountain forms the groundwater enrichment zone, with abundant water resources and good water quality. At the leading edge of the alluvial-proluvial fan, the runoff is blocked by a fine-grained aquifer, and the groundwater depth becomes shallow. Evaporation and transpiration are strong, and groundwater overflows into springs, forming the East Taijinaier River, which eventually flows into the East and West Taijinaier Lakes. At present, the development and utilization rates of groundwater resources in the Nalinggele River alluvial-proluvial fan area are relatively low; there is only one groundwater well field, with a water yield of 4.5 × 10 3 m 3 /day. The groundwater in the Nalinggele River Basin is not only an important potential water source for the economic development of Qaidam Basin, but also an important controlling factor for the evolution of the ecological environment in the basin. Therefore, a scientific understanding of the distribution and circulation of groundwater resources in the study area must be the basis for sustainable development of water resources, the environment, the economy, and society. Developing a reliable numerical model is the most critical link in this research.

Methodologies
Based on the requirements of research targets and geological survey results, a relatively reliable numerical groundwater model has been developed. To maximize reliability, the following methods were used.

Data Speculation Based on Hydrogeological Conditions
Because drilling data and groundwater-level observation data points are scarce and the drills are concentrated in one local area, the bottom elevation of each aquifer and the regional groundwater flow fields are difficult to determine reasonably given the limited data. Therefore, to determine rational isograms, an understanding of geological conditions and interpolation must be combined based on reasonable extrapolation. To speculate about the bottom elevation of each aquifer, it is necessary to combine geophysical data. For groundwater flow fields, the groundwater level of historical drills and the locations of springs are needed as references.

Nested Model
USGS MODFLOW 2005 was used to construct the nested model, which was developed by the Waterloo Hydrogeological Company in Canada.
A regional-scale model of the general study area and a local-scale model of the area including the key study area in the Nalinggele River alluvial fan, were developed ( Figure 4). In the eastern part of the study area, there is a surface watershed between the Nalinggele and Kaimuqi Rivers, but the groundwater watershed does not exist because of low precipitation, and hence groundwater exchange (Q 1 ) occurs between the two river watersheds. Moreover, there is a geological fault in the southwestern portion of the study area, which breaks the continuity of the groundwater flow field of the aquifer systems, but groundwater exchange (Q 2 ) does occur. Q 1 and Q 2 , the boundary flow quantities in the local-scale model, can be calculated by the ZONEBUDGET module in the regional model. Accordingly, the authenticity of boundary conditions and the reliability of the local model were improved by using the nested model.

Model Calibration and Validation
Due to the paucity of regional hydrogeology investigations, some data were developed by speculation because the reliability of model calibration and verification could not be justified by groundwater-level fitting alone. Based on the actual characteristics of the study area, the rationality and reliability of the model were improved by multi-index calibration. The indices were set as follows.
(1) Groundwater fitting errors. This method is not only reflected in the groundwater table fitting curves for the calculated and observed values, but also by calculating the error distributions of the simulated and observed water levels. The distributions of water-level fitting errors and of error parameters were used to fit the groundwater table. The error parameters used included the standard error of estimation (SEE), the root mean squared error (RMSE), the correlation coefficient (R), and the normalized RMSE (NRMSE). (2) Dynamic groundwater trends. The simulated and observed groundwater-level change trends should be basically consistent. The fluctuations of groundwater level in the study area varied widely, from 1.4 m to 11.9 m, but the associated groundwater sources or sinks are not very clear. Therefore, the numerical model had difficulty simulating abrupt changes in groundwater level in the region, leading to relatively large groundwater fitting errors. However, a reasonable model should be able to achieve consistency between simulated and observed groundwater-level trends.
(3) Spring discharge. Springs are one of the most important discharge mechanisms for groundwater in this area, and the numerical model should reflect this reasonably. According to survey data, the total flow rate of the spring-source river in the Nalinggele River area can reach 7.97 m 3 /s (2.52 × 10 8 m 3 /year). From June 2010 to March 2013, the monitored flows of this spring-source river were estimated as 1.99 × 10 8 m 3 /year. Although there is some uncertainty in the measurement of spring discharge, 1.99 × 10 8 -2.52 × 10 8 m 3 /year should be a relatively reliable reference for spring discharge. (4) Overflow zone location. The location of the actual overflow zone, which is a typical hydrogeological phenomenon in alluvial-proluvial fans, is clearly shown on the remote-sensing map. The area with a groundwater burial depth less than 0 m can be regarded as the simulated overflow zone. The distribution of overflow zones between simulation and observations should basically be consistent. The above calibrations were achieved mainly by statistical fitting based on comparisons between the simulated and measured values.

Regional Groundwater Model
According to the characteristics of the groundwater system, the key study area is an actively alternating cyclic zone of shallow groundwater flow (out of the mountain exit to the East Taijinaier River). Therefore, an area of about 2845 km 2 constituting a relatively complete hydrogeological unit was selected as a regional-scale numerical simulation range ( Figure 5).

Conceptual Model
The western boundary of the phreatic aquifer was the recharge zone margin of the Nalinggele River, and the eastern boundary was the surface watershed of the Kaimuqi and Lalingzaohuo Rivers; these boundaries were conceptualized as impermeable boundaries. In the southern mountainous area, there is a small amount of bedrock fissure water inflow, which was generalized as the flow boundary, and the recharge rate was estimated as 3.2 × 10 6 m 3 /year. The northern portion of the East Taijinaier River was considered as a River Boundary. The southern, western, and eastern boundaries of the confined aquifer were set as impermeable boundaries, and the northern boundary of the confined aquifer discharge to the East and West Taijinaier Lakes was generalized as a General Head Boundary, which allows adaptive exchange of groundwater through the boundary.
The quantity of groundwater recharge was obtained using the water-table fluctuation method [22]. Evaporation was referenced to data from the nearby Xiaozaohuo weather station, and the extinction depth of evaporation was 2 m. Because rainfall in the area is minimal, precipitation is negligible. Only one groundwater pumping site exists in the region, with an extraction volume of 0.45 × 10 4 m 3 /day.
Monitoring data from groundwater observation wells and groundwater-level data from boreholes, spring points, and observation wells were combined with hydrogeological conditions to determine the initial groundwater flow field ( Figure 5). According to the results of pumping tests and hydrogeological conditions, the initial division of hydrogeological parameters was developed as given in Figure 6.

Spatial and Temporal Discretization
The aquifer is divided into two layers. The upper layer (L1) was an unconfined aquifer composed of Holocene strata from the Quaternary and upper Pleistocene strata, and the lower layer (L2) was a confined aquifer composed of middle and lower middle Pleistocene strata. From 30 December 2010 to 30 December 2011 was selected as the calibration period, and the verification period was from 30 December 2011 to 30 December 2012.

Model Calibration
The fitting errors of groundwater level, spring discharge, overflow zone location, and groundwater budget status were considered to calibrate the model.

•
Water-Level Fitting Figure 7 shows the distribution of the groundwater observation wells. The water-level fitting error conformed to a normal distribution, and its average was near zero (Figure 8). Table 1 shows the values of the error parameters, which indicate that the results of water-level fitting are reasonable.

• Dynamic groundwater trends
Although the single-point fitting precision for water level during the calibration period was relatively low, the model performed well in trend fitting. Figure 9 shows the groundwater-level fitting curves of typical single wells.  • Spring Flow Fitting The spring flow rate of the investigative model was 2.21 × 10 8 m 3 /year, which is in the reasonable range of speculative spring flow (1.80 × 10 8 -2.52 × 10 8 m 3 /year). Therefore, the result was considered to have good reliability.

Fitting of Overflow Zone Location
According to the contour map of groundwater depth (Figure 10), the location of zero groundwater depth was in line with the location of the natural groundwater overflow zone in the remote-sensing image. This shows that the simulated results for groundwater depth were in accord with the location of the actual overflow zone.

• Groundwater Budget Status
To obtain the groundwater budget status, the whole study area was set as the groundwater budget calculation area. According to the ZONEBUDGET module, each simulation period was calculated to obtain the budget items for each aquifer. The total area input was 7.61 × 10 8 m 3 /year (of which river infiltration accounted for 99.6% of total supply), and the output was 7.07 × 10 8 m 3 /year (mainly for spring discharge and evaporation, 31.3% and 36.5%, respectively). The budget difference was 5.4 × 10 7 m 3 /year, and the groundwater budget status was positive, which was consistent with the fact that the year of calibration was a year of abundant water in which river runoff was 4.9% higher than in the previous year. Therefore, the simulated results for groundwater levels, dynamic groundwater trends, spring discharges, overflow zone locations, and groundwater budget status all indicated that the model developed here can reflect the actual movement characteristics of groundwater in the study area. Table 2 gives the parameter values after calibrations, which were used in the verification period. Table 2. Calibrated hydrogeological parameters in the regional groundwater model.

Parameter
Partition

Model Verification
Model calibration parameters cannot necessarily reflect accurately another group of groundwater systems under different boundary conditions and extraction situations. Therefore, model validation must be carried out to verify that the parameters developed in the calibration period are correct. According to groundwater-level time-series data, the period from 30 December 2011 to 30 December 2012 was selected as the verification period.

•
Water-Level Fitting Figure 6 shows the distribution of the groundwater observation wells. The water-level fitting error conformed to a normal distribution, and its average was near zero (Figure 11). Table 1 shows the values of the error parameters. • Dynamic groundwater trends The groundwater-level dynamic trends during the verification period were consistent with the measured data, and the fitting accuracy was very close to that of the identification model ( Figure 12). • Spring Flow Fitting The simulated spring flow was 2.2 × 10 8 m 3 , which was in the reasonable range of speculative spring flow.

•
Fitting of Overflow Zone Location The simulated location of the overflow zone showed little change and was still in accordance with the overflow zone location on the remote-sensing map.

• Groundwater Budget Status
The budget status of groundwater quantity was positive, with an input of 7.87 × 10 8 m 3 /year and an output of 7.03 × 10 8 m 3 /year, giving a budget difference of 8.4 × 10 7 m 3 /year. The simulated situation was consistent with the actual situation, showing an increase in groundwater quantity in the year of the verification period.
The model verification results further showed that the generalized aquifer structure, boundary conditions, processed recharge, and chosen hydrogeological parameters were reasonable and could accurately reflect the characteristics of the aquifer. Hence, the numerical groundwater model developed here was deemed to be useful for predicting the variation regularity of regional groundwater resources.
According to Groundwater Resource Management Model Work Requirements (GB/T14497-1993) [17], the number of nodes with an absolute error of groundwater head less than 0.5 m was 26.2% of all observed nodes, meaning that the result does not meet the cited standard. However, the simulations of the calibration and verification model for groundwater level, spring discharge, overflow zone location, and groundwater budget status can be seen to fit well with the actual situation overall. Hence, this study has at least proved that the regional-scale three-dimensional numerical groundwater flow model developed in this research is reasonable and reliable.

Local Groundwater Model
The calculation range of the 3D numerical simulation of regional-scale groundwater flow was larger, and relative ideal boundaries could be set for the model. The objectives of this model were to integrate hydrogeological conditions and various measured data on a macro scale, to explain reasonably the dynamic characteristics of groundwater flow, and to verify the morphology of the regional groundwater flow field. The results of the regional model provided a relatively accurate exchange between surface water and groundwater for the eastern and southern boundary conditions of the local model. The hydrogeological conceptual model and the mathematical model of the Nalinggele River local study area were almost identical. The main difference in the numerical model was that on the profile, the first model layer was split into two sub-layers (SL1 and SL2), whereas the remaining lower layers in the regional-scale model (SL1, SL2, and SL3) were modeled in the local-scale model. The maximum depth of SL1 was controlled at around 300 m, which was compatible with the existing drilling depth. The hydrogeological parameters used approximate results of pumping test values, which helped the model approximate the actual conditions. On the plane extent, the eastern boundary, which was generalized into the flow boundary, was bounded by the middle line between the Nalinggele and Kaimuqi Rivers, and the southern geological fault was set as the flow boundary ( Figure 13). The ZONEBUDGET module was used to calculate the amount of groundwater exchange between Q1 and Q2.
By calibrating and verifying the new model in the local area, the partition of the hydrogeological parameters ( Figure 14 and Table 3) was further adjusted. The number of nodes with an absolute error of groundwater head less than 0.5 m was 28.8% of all observed nodes, which was higher than the result of the regional groundwater model, reflecting a clear improvement in model reliability. Figures 15 and 16 show a comparison of simulated and measured groundwater levels in the observation wells (observation well numbers and locations are as shown in Figure 7). The simulated groundwater levels and the changes in dynamic groundwater-level trends at different times were closer to the observed values than the larger regional-scale simulation results (Figures 9 and 12). This result indicates that the model fitting accuracy over the principal study area is superior to that of the regional-scale model. Therefore, simulation prediction reliability was further improved by using a nested model.

Conclusions
A nested model was used to improve the reliability of a groundwater flow model in a key study area. A three-dimensional numerical simulation model at large regional scale was set up and used to constrain the boundary conditions of a local-scale groundwater flow model. The hydrogeological conditions in the key study area were further investigated.
A multi-index calibration method was proposed to improve the reliability of a groundwater flow model. It enabled the numerical model to reflect correctly the characteristics of groundwater flow movement and distribution and of important hydrogeological phenomena such as springs and overflow zones.
The combination of the nested model and the multi-index calibration method helps to improve the accuracy and reliability of groundwater flow models in hydrogeological data-sparse areas as well as avoid the problem that local numerical groundwater-flow simulation results do not agree with regional groundwater flow-field characteristics.