Comparison of Various Growth Curve Models in Characterizing and Predicting Water Table Change after Intensive Mine Dewatering Is Discontinued in an East Central European Karstic Area

: The modeling of karst water level ﬂuctuations is a crucial task in the water resource management of vulnerable karstic areas. In the Transdanubian Range (East Central Europe, Hungary), from 1950 to 1990, coal and bauxite mining were carried out, with large amounts of karst water being extracted, thus lowering the water table by amounts ranging between 10 and 100 m. Since the cessation of mining activities in the early 1990s, the volume of natural recharge has exceeded the amount of dewatering, and the system has begun to return to its original undisturbed state. This apparently welcome development does, however, bring economic and technical engineering problems. The estimation and prediction of such water level changes is often tackled via the use of deterministic approaches, however, in the present case, it is also addressed with an alternative approach using trend estimation to monthly water level data from 107 karst water wells over the period 1990–2017. To approximate the change in karst water levels, (i) growth curve models were ﬁtted to the monthly data, allowing the estimation of karst water levels, at least as far as 2030. Similarly, this was also done with (ii) deterministic modelling in order to describe the recovery process up to 2030. Speciﬁcally, measured and predicted values for karst water level were used to derive interpolated (kriged) maps to compare the forecasting power of the two approaches. Comparing the results of the trend analysis with those of the traditional deterministic modelling results, it is apparent that the two approaches predict similar spatial distribution of water levels, but slightly different future water level values.


Introduction
Mining has evolved with the development of humankind helping to meet the everincreasing demand for raw materials [1,2]. However, large-scale mining activities are often associated with a number of environmental problems, such as air, noise, soil, or water pollution. Other effects of mining, such as the infrastructure required by industrial-scale activity and changes to the landscape, may persist well beyond the cessation of mining operations, even decades after the closure of the mine [3,4]. The impact on subsurface waterbodies is among the most significant of these, and this can be both qualitative and quantitative [5][6][7]. One of the most common problems is the decrease in water levels associated with over-abstraction, a widespread global phenomenon as a result of increasing extraction not only in the course of mining operations but also for drinking water. In Australia, e.g., the groundwater system was observed over a 36-year period in the Jarrah forest, where bauxite mining took place. On the basis of the observations there, it was revealed that 11 years after the cessation of mining (and thus water abstraction), the water level had returned to the pre-disturbance state [8]. Dewatering due to mining is a major issue in Asia as well. In a sand-grave mine in Kazan, Turkey, the 30 years of over-abstraction led to the extreme depletion of the aquifer, and even, in places, to its complete removal [9]. In the Heilongdong spring area of China, changes in groundwater levels due to dewatering occasioned by long-term coal mining were examined using different statistical methods, e.g., trend analysis. Results revealed that as an effect of the mining activity, the groundwater level in the area had dropped and the discharge of springs decreased significantly [10]. This is a general negative effect occurring in other vast coal fields in North China [11][12][13]. However, it is important to recognize that the change in water level associated with mining activities can occur not only as a decrease of the water level but also, after the closure of a mine, as a very significant increase. This increase after the halt in dewatering has an adverse effect on the environment [14,15], including the generally fragile karst aquifers in such an area (e.g., [12]). The mine water can rapidly recover after closure and could cause flooding of already urbanized areas (e.g., [16]), adjacent mines (e.g., [17,18]).
In practice, deterministic models are often used for estimating such mining-related groundwater level changes [19][20][21]. The application of numerical models in the case of mines established in karst areas can be particularly problematic, as karst systems are highly heterogeneous, usually with a complex flow system [22,23], a key factor which should be taken into consideration in their study. Furthermore, in case of groundwater level prediction, it may be worth comparing the results of different approaches.

Mining Water Extraction, a Central-Eastern European Example
As part of the Soviet planned economy, mining played a prominent industrial role in many countries of the former Soviet Bloc, which was monitored, and annual targets set by state organizations (the State Planning Committee) [24]. In East Central and Eastern Europe, as a result of the socioeconomic changes in the post-Soviet countries in 1989/1990, mining activities were abandoned in many places [25,26] and, in line with this, large-scale mining water abstraction has also been moderated.
In this process, Hungary was no exception. In relation to the closure of mines, one of the most well-known cases in Hungary is the disappearance of springs around Tata [27]. The use of Tata spring waters dates back almost two millennia, to the Romans, who used these springs [28]. By the end of the 20th century, Tata had become a popular spa, as a result of which the infrastructure had also developed significantly [29]. The development of the spa town was interrupted by the mining water abstraction. Water has been pumped in the nearby town of Tatabánya since the 1920s. The situation became so serious that by the early 1950s, a shortage of drinking water occurred [27].
There were other significant regional focal points of the bauxite and coal mining. These mining centers were located in the vicinity of Nyirád, Iszkaszentgyörgy, Tatabánya, and Dorog ( Figure 1). To provide a continuous water supply for mining, abstraction was required, reaching 700 m 3 min −1 by the beginning of the 1960s and even 800 m 3 min −1 for some years [30]. The long-term dewatering had its effect not only on the mining sites themselves, but over a wider area covering of several tens of kilometer. The amount abstracted exceeded the natural recharge, causing ever more extensive cones of depression in the karst water level of the formation and a decrease in groundwater level [31]. According to pers. comm. of officials, 20% of the abstracted water ended up in the drinking water supply, the remaining portion reached the surface water bodies (rivers and lakes). In the early 1990s, large-scale mining ceased, triggering the rise in karst water levels. Parallel, water abstraction for the town's water supply continued, in 1990 at a level of 100-110 m 3 min −1 . This then decreased, reaching a level of 60 m 3 min −1 at the beginning of the 2000s, due to rising water prices and, as a result, more economical water consumption [30]. From the beginning of the 1990s, the amount of natural recharge started to exceed the artificial dewatering, so the recovery of the aquifer began and still continues. Since then, the system has been trying to return to its near-natural state performed in the 1950s, which led to economic and technical-engineering problems as well [32]. Specifically, industrial and residential and public buildings were previously built-in areas where the karst water level was low, while mining was being actively carried out; however, without the artificial dewatering, these could not have been built there. Therefore, the determination of the spatiotemporal distribution of the rise in karst water levels is essential in the future planning of any industrial real estate investment.
The rate of recovery of a karst system can usually be forecasted using non-linear models. The long-term change in a process can be studied by examining a trend that describes it well; thus, future water levels can be estimated. In the karst aquifer of the Transdanubian Range, the recovery process has a limit, and the expected values of water level have an upper maximum. To characterize the process, different types of growth curves [33] can be used, which approach a given constant-K maximum water level value-in time. The use of these models is widespread in many disciplines [34,35] even in the absence of a known K. In the early 1990s, large-scale mining ceased, triggering the rise in karst water levels. Parallel, water abstraction for the town's water supply continued, in 1990 at a level of 100-110 m 3 min −1 . This then decreased, reaching a level of 60 m 3 min −1 at the beginning of the 2000s, due to rising water prices and, as a result, more economical water consumption [30]. From the beginning of the 1990s, the amount of natural recharge started to exceed the artificial dewatering, so the recovery of the aquifer began and still continues. Since then, the system has been trying to return to its near-natural state performed in the 1950s, which led to economic and technical-engineering problems as well [32]. Specifically, industrial and residential and public buildings were previously built-in areas where the karst water level was low, while mining was being actively carried out; however, without the artificial dewatering, these could not have been built there. Therefore, the determination of the spatiotemporal distribution of the rise in karst water levels is essential in the future planning of any industrial real estate investment.
The rate of recovery of a karst system can usually be forecasted using non-linear models. The long-term change in a process can be studied by examining a trend that describes it well; thus, future water levels can be estimated. In the karst aquifer of the Transdanubian Range, the recovery process has a limit, and the expected values of water level have an upper maximum. To characterize the process, different types of growth curves [33] can be used, which approach a given constant-K maximum water level valuein time. The use of these models is widespread in many disciplines [34,35] even in the absence of a known K.

Aims of the Study
The aims of the study were (i) to group the karst water level time series based on their common pattern and determine the groups' spatial distribution within the study area, to obtain an overview on the spatial development of the nature of water level change. At the same time, the study also focuses on the recovery of the karst system, starting at the beginning of 1990s, and still ongoing (though data are only available up to 2017, Section 2.2). Thus, it was also important (ii) to determine the rate and characteristics of the recovery process itself, with a special focus on how the fitting and the prediction of different models would be affected if the maximum value (K) could not be determined perfectly. Multiple type of growth curves is employed to provide the most appropriate approximation of the recovery and (iii) to investigate if any of these can be successfully applied to estimate water levels that might be the next decade. It should be noted that these types of models have not yet been applied in the field of hydrology for such an aim. Since subsurface water level measurements are often lacking data from their time series, as a result of, e.g., instrument failure, the study also addresses (iv) the question of to what extent missing data might influence the fitting and estimation of a model, if these data were missing from the time the recovery started. Lastly, as measured and predicted, karst water level values are used to derive interpolated (kriged) maps to compare the two (trend estimation and deterministic model) forecasts' power. Efficacy was also measured by comparison with the results of a deterministic model, and (v) the advantages and disadvantages of forecasting by different methods can be determined relative to each other.

Hydrogeological Description of the Study Area
The Transdanubian Range is located in the mid-western part of Hungary ( Figure 1). The extension of its karst system is approximately 300 × 100 km [30], composed mainly of Mesozoic formations lying on slightly metamorphosed Paleozoic formations. The most significant of these are the aquifers forming a continuous hydraulic system of the Upper Triassic Main Dolomite and the Dachstein Limestone Formations, with a total thickness of 1.5-2 km [36]. In some areas of the Transdanubian Mountains, above the thick Triassic aquifer, there are locally occurring Jurassic-Cretaceous (Tata Limestone Fm, Zirc Limestone Fm, Ugod Limestone Fm) and Eocene (Szőc Limestone Fm) carbonates forming a hydraulically connected unit with the older formations. The aquitard formations (with lower hydraulic conductivity) located between the aquifers of different ages have the vertical flow slowed down, but they do not prevent it completely [37]. The pre-Cainozoic structure of the Transdanubian Range Unit can be characterized as mainly due to SW-NE compression. As a result, a syncline structure was formed in the middle of the Cretaceous, characterizing the geological settings of the area with the deformation of previously deposited sediments [38].
In the Late Cretaceous, karstification began in the Upper Triassic formations outcropping at the edge of the syncline. Therefore, bauxite deposits were formed in several places in the Northern Bakony Mts. and the Western part of the Southern Bakony Mts. [39,40]. As a result of tectonic movements beginning in the Late Eocene, so-called Paleogene Basins were formed along lateral displacements [41]. Along the indented coastline, the environment was also favorable to the formation of the Eocene bauxite [42].

Dataset Description
The karst water monitoring network of the Transdanubian Range, established in the early 1970s, has since provided a large amount of information on the status of groundwater changes. The karst water level time series for which trend analysis performed in this study originate in the monitoring wells ( Figure 1) of four Hungarian water directorates, and basically include data from the establishment of these up to 2016 or 2017. After preliminary screening, 107 time-series were found to bear witness to the recovery process of the karst system, a process characterized by rising water levels. In more than a third of these (40 wells), data were only available from 1990 or later, and there were several cases (12 wells) in which monitoring ceased in 2010 or before. Water level measurements were not equidistant in time. Regarding the recovery period (1990-2017), 31,438 monthly averages were examined for 107 wells, of which 4113 cases out of the ca 32,000 monthly average water level data (13% of months) had no measurements. Missing data were mostly from the early 2000s ( Figure 2). ater 2021, 13, x FOR PEER REVIEW 5 of 31 or 2017. After preliminary screening, 107 time-series were found to bear witness to the recovery process of the karst system, a process characterized by rising water levels. In more than a third of these (40 wells), data were only available from 1990 or later, and there were several cases (12 wells) in which monitoring ceased in 2010 or before. Water leve measurements were not equidistant in time. Regarding the recovery period (1990-2017) 31,438 monthly averages were examined for 107 wells, of which 4113 cases out of the ca 32,000 monthly average water level data (13% of months) had no measurements. Missing data were mostly from the early 2000s ( Figure 2). Monthly precipitation data were gathered from 47 precipitation monitoring stations and 5 climate stations in the area.

Applied Methodology
In order to examine the karst water recharge of the karst system in the Transdanubian Range, the following steps were followed ( Figure 3).
To meet the aims of the study, after preprocessing the karst water levels (outlier removal etc.) and choosing the focus period of the research (1990-2017) (Section 1.2), the karst water monitoring wells were grouped using cluster analysis [43] on the basis of their annual average karst water levels. The groupings were then validated using linear discriminant analysis [44] and C-index [45] (Section 2.4), as in other studies dealing with subsurface water [46,47]. The years most important in driving the groupings were determined using Wilks' λ statistics [48]. These steps follow the procedure presented in Hatvani et al. [44].
Monthly average karst water data were used for trend estimation, which was achieved by fitting growth models (Section 2.5). After choosing the model, which best described the recovery process, karst water levels were predicted in each well for January 2030.
The karst water levels were also predicted using deterministic modeling (Section 2.6) For this, the infiltration rate was calculated from precipitation data. The average annua recharge rate was calculated in the infiltration areas of the reservoir on the basis of hydraulic model tests performed in recent decades. The initial hydraulic conductivity values used in the early stages of model calibration were derived from pumping test data and literature. The final calibration of the deterministic model was performed using a dataset consisting of the 107 well data-the subject of the study-extended with historic water levels from additional ~100 subsurface wells and 60 springs. These not necessarily spanned the same time interval neither with each other nor with the 107 wells. Because these data were either gappy or covered a too short time period, these could not have been used for purposes other than the calibration of the deterministic model in the study. The database of the model also contains the monthly production data for karst wells, rising Monthly precipitation data were gathered from 47 precipitation monitoring stations and 5 climate stations in the area.

Applied Methodology
In order to examine the karst water recharge of the karst system in the Transdanubian Range, the following steps were followed ( Figure 3).
To meet the aims of the study, after preprocessing the karst water levels (outlier removal etc.) and choosing the focus period of the research (1990-2017) (Section 1.2), the karst water monitoring wells were grouped using cluster analysis [43] on the basis of their annual average karst water levels. The groupings were then validated using linear discriminant analysis [44] and C-index [45] (Section 2.4), as in other studies dealing with subsurface water [46,47]. The years most important in driving the groupings were determined using Wilks' λ statistics [48]. These steps follow the procedure presented in Hatvani et al. [44].
Monthly average karst water data were used for trend estimation, which was achieved by fitting growth models (Section 2.5). After choosing the model, which best described the recovery process, karst water levels were predicted in each well for January 2030.
The karst water levels were also predicted using deterministic modeling (Section 2.6). For this, the infiltration rate was calculated from precipitation data. The average annual recharge rate was calculated in the infiltration areas of the reservoir on the basis of hydraulic model tests performed in recent decades. The initial hydraulic conductivity values used in the early stages of model calibration were derived from pumping test data and literature. The final calibration of the deterministic model was performed using a dataset consisting of the 107 well data-the subject of the study-extended with historic water levels from additional~100 subsurface wells and 60 springs. These not necessarily spanned the same time interval neither with each other nor with the 107 wells. Because these data were either gappy or covered a too short time period, these could not have been used for purposes other than the calibration of the deterministic model in the study. The database of the model also contains the monthly production data for karst wells, rising shafts, and all of the other water abstraction sites. After running the deterministic model, karst water levels were predicted for January 2030.   For the karst water values forecast by trend estimation and deterministic modeling, variogram analysis was applied [49] to interpolate (krige [50]) the water levels in 2016 and 2030 (Section 2.7). The predicted values of the two methods were then compared on difference maps.

Cluster and Discriminant Analysis
Cluster analysis aims to group observations (in this case, monitoring wells) on the basis of measured data (here, annual average water level time series)-in which those with similar properties (patterns) are placed in the same group (cluster). To achieve this, Ward's [43] hierarchical classification method with squared Euclidean distance was applied to the Z-score transformed data. On account of missing data, the process of grouping could only be performed for 94 time series from 1995 to 2015. In order to obtain maximal separation between groups and simultaneously minimize variability within them, Fisher's linear discriminant analysis (LDA) was used in the transformations (i.e., linear combinations) of the original data [44]. The grouping was also validated with the use of the C-index [45], which is regarded as being among the six best indices of correct cluster recovery [40].
In order to define which parameter plays the greatest role in determining the formations of the cluster groups, Wilks's lambda (λ) [48] quotient was used. The value of λ is the ratio of the within-group sum of squares to the total sum of squares and is a number between 0 and 1. If λ = 1, the role played by that parameter in the grouping is the same in every case, and there is, therefore, no inter-group variability. Thus, that particular parameter did not affect the composition of the cluster groups [44,51,52]. If, on the other hand, λ = 0, the parameter under consideration affected the formation of cluster groups to the greatest degree. Therefore, one can say that the smaller the quotient, the greater the degree to which it determines the formation of the various cluster groups.

Growth Curves
Growth curves in general can be characterized by an upper limit (expected maximum value). Out of the 10 different type of curves used, one type of them does not have an inflection point and have a concave run in the domain of 0-∞.
Three of this type of function were used in the study: Bertalanffy [53], Törnquist1, and Törnquist2 [54,55]. For additional technical specifications on the applied curves see the Supplementary Material Section S1.
There are processes that display an exponential rise initially, but over time, the growth slows down. This type of growth curves has an inflection point and they are also called sigmoid curve [56]. Relative to the inflection point, the initial section is concave, while the section after that is convex. The inflection point indicates that there have been some significant changes in the process. This type of curve is the original logistic function, but there are a number of mathematical functions with a similar shape. In this research, seven of these functions were examined, namely, the logistic function [57], the delayed logistic function [58], the squared logistic function [59], the Gompertz function [60], the "63 percent" trend function [58,59], the Johnson function [61], and the Richards function [62]. The main difference between these functions is to be found in the position of the inflection point, and in the shape of the curve. In the case of the Richards function, an additional v parameter was incorporated into the model, with which the location of the inflection point can be controlled.

Model Fitting
As a first step in the estimation of the functions, the expected maximum water level value (K)-prior to the beginning of intensive dewatering in 1950 [63]-of each well was determined and inserted into the calculation. Estimating the trend requires water level time series of appropriate quality and length, which is an obvious limitation. However, the most specific prerequisite is the maximum water level (K value). The water levels vary depending on the parameters characterizing the aquifer, so they characterize the hydrogeological environment of the observation. No additional parameters (e.g., infiltration and hydraulic conductivity) are required for this method.
In not every case is information available on the maximum water levels, prior to the mining activities, i.e., dewatering. In the present case, these figures were, in fact, available. In should be noted, that if not available, the trend estimation procedure can provide a rough estimate for the K value(s) as well by minimizing the error, which in some cases results in an overestimation of the value (Supplementary Material Table S3). Since the K value is the primary requirement and limitation of the method, it was examined whether its deviation within a certain range affects the goodness of fit or not. It was observed that the K value thus obtained is not realistic in all cases. However, the predicted water level did not deviate significantly. Subsequently, the karst water level values were predicted for each examined well with the best fitting model of the trend analysis. Curved regression was performed using the least square method, with the best fit being determined on the basis of the maximized r 2 value, in the present case called "correlation index" determined by Kenney and Keeping [64] (Equation (1)), which is not similar to the frequently used Pearson or Spearman correlation coefficient. It takes into account the spread of the modelled values by standardizing the squared error of the prediction (SSE) with the sum of squares total (SST) (see Equation (1)). In other words, the smallest the SSE, the closer r 2 will get to 1. The best fit was found with changing each parameter iteratively to maximize the r 2 value.
The r 2 was calculated based on this function (Equation (1)): where SSE = sum of squared errors; SST = sum of squares total.
The question of which model performs the best under different degree of missing data, i.e., measured values from the beginning of the time series were deleted yearly (maximum 5 years), was investigated experimentally. After that, the function fitting was re-performed. For this purpose, the 53 time series clearly displaying the exact beginning of the recharge period were used. This means that data were available for dates before 1990.

Deterministic Model
The deterministic model describing the recovery of the Transdanubian Range is a transient (not steady-state) hydraulic model (Supplementary Material Section S2). The model area is divided up into a constant orthogonal grid, the size being 500 × 500 m for the total model area. The upper 200 m of the karst aquifer was built into the model. The unit of time in the transient model is 1 month. In areas where leakage is possible between the Triassic and overlying aquifers (e.g., Cretaceous limestones, Upper Pannonian porous layers), the model calculates the water exchange between the two aquifers on the basis of hydraulic conductivity. The flow rate between the karst aquifer and surface waters was also calculated using the hydraulic model.
The model also includes the major karst springs, which are regarded as fixed drains in the model. In every such location, the flow of the spring is modelled based on the difference of the calculated karst water level and the elevation of the spring multiplied by the localized conductivity value. On the basis of the difference between the karst water level and the elevation of springs, their discharge can be calculated. The recharge is equal to the spatial characteristic of diffuse infiltration, which can be determined from the product of the open karstic infiltration and its intensity. In the calculation of the latter, data from rainfall measuring stations close to the infiltration areas were used. In Hungary in karstic areas surface runoff is negligible, precipitation is instrumentally measured, thus using the base-function of hydrology seepage can be determined by subtracting evapotranspiration-estimated using the Morton CRAE model [65]-from precipitation. Monthly karst water abstraction data were included in the model, as well.
The results of modeling in general, including deterministic modeling, strongly depend on the uncertainty of the input parameters, let them be measured (e.g., yield) or calculated (e.g., infiltration). Using station data, gridded values have to be estimated, which again is highly dependent on the number and spatial distribution of the measured data. Furthermore, the result of the deterministic modelling was also significantly dependent on the exploration of the study area, i.e., on the (hydro)geological information. Thanks to the previous research, the Transdanubian Range was well explored as a result of which a relatively large amount of data is available.
The regionally updated database of the model allows the simulation of water level changes, discharge changes of major karst springs in the karst reservoir from 1951 up to the present day. Furthermore, it is also suitable for predicting changes in the expected karst water level.

Variography and Interpolation
As a final product, two forecast maps were made for January 2030 based on the water levels predicted by trend estimation and deterministic modeling. For the creation of the maps, the kriging of an "optimal" interpolation is necessary, and this is then be employed in the study to estimate the covariances to the highest degree of accuracy possible before mapping [50,51]. To obtain the weights for kriging, theoretical semivariograms are derived describing the spatial autocorrelation structure of the data by minimizing the variance of the error [66]. One of the most common theoretical models is a spherical model, which displays a linear behavior at the origin and the sill is reached at a, while a Gaussian model has a repressed slope (a tangent with zero slope at the origin), and its sill is a limited value that might be approximated to an infinitesimal degree, but never reached. Thus, in the case of the Gaussian model, a practical and an effective range a e = a × √ 3 have to be determined; a e is then used for further geostatistical modeling. A theoretical model can be given by calculating the empirical semivariogram using the algorithm of Matheron [67] (Equation (2)) where γ(h) is the semivariogram and Z(x) and Z(x + h) are the values of a parameter sampled at a planar distance |h| from each other, N(h) is the number of lag-h differences, i.e., n × (n − 1)/2 and n corresponds to the number of sampling locations at a distance h. The most important properties of the semivariogram used in the present study are the nugget, which quantifies the variance at the sampling location (including information regarding the error of the sampling), the sill, that is, the level at which the variogram stabilizes, which is the sum of the nugget (c 0 ) and the reduced sill (c), and the range (a), which is the distance within which the samples have an influence on each other and beyond which they are uncorrelated [68,69]. If the semivariogram does not have a rising part and the points of the empirical semivariogram align parallel to the abscissa, a nugget-type variogram is obtained. In this case, the sampling frequency is insufficient to estimate the sampling range using variography [50,70].

Used Software
Data preprocessing was performed in Microsoft Excel 16.0. IBM SPSS 22 was used for the cluster, Wilks' λ, and discriminant analyses, and deterministic modeling was carried out in Applied Visual MODFLOW. The geostatistical modeling was performed in GS+ 10 and the mapping in Golden Software Surfer 15.

Cluster Analysis
Three groups of the 94 examined time series could be distinguished for the period 1995-2015. The grouping was 100% successful according to the linear discriminant analysis and can be well illustrated using the LD1 and LD2 functions (Figures 4 and 5). analysis and can be well illustrated using the LD1 and LD2 functions (Figure 4 and Figure  5). The value of the C-index was also calculated in the cases in which there are 2-10 groups, on the basis of which it can be determined that three is the optimal group number (the C-index is the smallest in this case).
Based on the 21 years examined in this study, the values of the Wilks' λ quotient are higher on average in the first part of the period and lower in the second part. Thus, the initial phase of the time series was predominant in the spatial separation of the groups.
The three groups are also separated spatially ( Figure 5). The 17 wells of the first cluster are located in the SW part of the Transdanubian Range, in the vicinity of the Nyirád mining center. These time series are characterized by a constant rising tendency and a minimal fluctuation pattern. The initial phase (ca. 1995-1999) of the recovery was faster than the following period ( Figure 5). The second group contains 60 wells, which are situated in the NE part of the area, in the vicinity of the Dorog, Tatabánya, and Iszakszentgyörgy mining centers. The time series of this group are characterized by a constant increase at a similar rate throughout the examined period, with a minimal fluctuation pattern. The 17 wells of the third group are located in the western part of the Bakony Mts., which is a recharge area. Here, the karst water levels are higher (even at the beginning of the 1990s, the figure was about 290 m asl.) compared to the surrounding areas. The various functions presented previously (Section 2.5) were fitted to the monthly average karst water time series. In order to carry out the trend fitting even under the The value of the C-index was also calculated in the cases in which there are 2-10 groups, on the basis of which it can be determined that three is the optimal group number (the C-index is the smallest in this case).
Based on the 21 years examined in this study, the values of the Wilks' λ quotient are higher on average in the first part of the period and lower in the second part. Thus, the initial phase of the time series was predominant in the spatial separation of the groups.
The three groups are also separated spatially ( Figure 5). The 17 wells of the first cluster are located in the SW part of the Transdanubian Range, in the vicinity of the Nyirád mining center. These time series are characterized by a constant rising tendency and a minimal fluctuation pattern. The initial phase (ca. 1995-1999) of the recovery was faster than the following period ( Figure 5). The second group contains 60 wells, which are situated in the NE part of the area, in the vicinity of the Dorog, Tatabánya, and Iszakszentgyörgy mining centers. The time series of this group are characterized by a constant increase at a similar rate throughout the examined period, with a minimal fluctuation pattern. The 17 wells of the third group are located in the western part of the Bakony Mts., which is a recharge area. Here, the karst water levels are higher (even at the beginning of the 1990s, the figure was about 290 m asl.) compared to the surrounding areas.

Determining the Best-Fitting Model
The various functions presented previously (Section 2.5) were fitted to the monthly average karst water time series. In order to carry out the trend fitting even under the conditionŷ 0 = 0, the trend function was fitted not to the absolute water levels, but to the values of water level change relative to the beginning of the recovery phase, e.g., Figure 6.  Based on a two-tailed T test, all model runs were significant at p = 0.05. In most cases (82.24%), the curves showed a very good fit (r 2 > 0.9). However, it could be established that there is not a single function among those examined that best describes the recovery process in the Transdanubian Range. In most cases, the Richards (29.91% of cases) mode proved to be the most accurate. However, the accuracy of the fitting-that is, the value of r 2 -showed very little difference between the functions in several cases. Because the value of r 2 can be similarly high in the case of different functions, the question of which functions fit best was addressed, taking into consideration not only the case with the largest r 2 value Figure 6. Curves fitted on the monthly average karst water levels of well Bakonybél-2/a and Csákberény-86/a, where the brown line is the measured water level, the 10 colored curves are the fitted trend models described above (Section 2.5.1), and the blue horizontal line is the K value (maximum water level).
Based on a two-tailed T test, all model runs were significant at p = 0.05. In most cases (82.24%), the curves showed a very good fit (r 2 > 0.9). However, it could be established that there is not a single function among those examined that best describes the recovery process in the Transdanubian Range. In most cases, the Richards (29.91% of cases) model proved to be the most accurate. However, the accuracy of the fitting-that is, the value of r 2 -showed very little difference between the functions in several cases. Because the value of r 2 can be similarly high in the case of different functions, the question of which functions fit best was addressed, taking into consideration not only the case with the largest r 2 value, but all of the cases in which r 2 ≥ r 2 max − 0.005 (Table 1). However, even despite this, the Richards model proved to be the functions describing the recovery process the best. It should be noted that the difference between the goodness of fits (r 2 values) obtained for the different functions was quite small.

Limitations of Trend Estimation
In some measured time series, data for the first few years of the recovery are missing. However, these also show the character of the recovery phenomenon and contain valuable information about it. It was necessary to examine to what extent the goodness of fit and the accuracy of the predicted water levels are affected even if a few years of the measured data are missing from the beginning of the time series. On the basis of the examination of 53 wells, it was found that only a few missing data (a 1-year gap in 24.53% of cases) cause little difference. However, as the amount of missing data increases, this causes more differences (a 5-year deficit in 77.55% of cases) over the issue of which function fits best ( Table 2). The example of five randomly selected time series illustrates how r 2 and the water level values predicted for January 2030 would change if a longer period was missing from the beginning of the recovery time series. The aim was to rank the functions in the case of 1-5 years of missing data compared to the case of time series with no missing data (Table 3). Although the best-fitting function does indeed change in many cases for the five randomly selected wells, there was no remarkable change in the estimated values. In the case of 5 years of missing data, the largest change in the predicted water level occurred at the Bakonybél-2a well, where the deviation is about 2.5% compared to the prediction based on the whole recovery time series. It is also noticeable that the time series of Bakonybél-2a are characterized by significant, long-term water level fluctuations (Figure 7). Table 3. Changes in r 2 values and water levels predicted for January 2030 in case of 1-5 years of data missing from the beginning of the time series, in the case of 5 wells representing the different parts of the whole study area. Another important element of the method employed is the determination of the threshold value, which the water level asymptotically approaches. This can be specified before the fitting, on the basis of the user's a priori knowledge. Many observation wells were established only after the beginning of dewatering, in the early 1970s, so the undisturbed water levels in the pre-mining period were not always precisely known. An experiment was performed in which the K values of five randomly selected wells were reduced and increased by 2% repeatedly, until they had reached ± 10%, compared to the value originally determined (Table 4). It was established that the greater the degree of the change in the K value, the greater the variety of the type of functions which gave a best fit. Furthermore, it was established that the water levels predicted for January 2030 showed only a few meters' difference even in the case of a relatively high degree of inaccuracy in determining the K maximum value. The analysis of the five time series demonstrates that the greatest deviation from the water levels originally estimated for January 2030 is 2.15%, as also observed at the Bakonybél-2a well, which has a definite fluctuation pattern.  Another important element of the method employed is the determination of the threshold value, which the water level asymptotically approaches. This can be specified before the fitting, on the basis of the user's a priori knowledge. Many observation wells were established only after the beginning of dewatering, in the early 1970s, so the undisturbed water levels in the pre-mining period were not always precisely known. An experiment was performed in which the K values of five randomly selected wells were reduced and increased by 2% repeatedly, until they had reached ± 10%, compared to the value originally determined (Table 4). It was established that the greater the degree of the change in the K value, the greater the variety of the type of functions which gave a best fit. Furthermore, it was established that the water levels predicted for January 2030 showed only a few meters' difference even in the case of a relatively high degree of inaccuracy in determining the K maximum value. The analysis of the five time series demonstrates that the greatest deviation from the water levels originally estimated for

Variography and Predicted Karst Water Level Maps
Water levels were forecasted using two methods: the extrapolation of the best-fitting trend in each well and the construction of a deterministic model. To assess whether the predicted water levels have a spatial autocorrelation structure, empirical semivariograms were estimated, and these indicated a similar, nested structure. Based on the values obtained by trend estimation, a 32.5 km range stood out, while in the results of the modeling, this was true of the spatial ranges of 18 and 39 km (Figure 8).

Variography and Predicted Karst Water Level Maps
Water levels were forecasted using two methods: the extrapolation of the best-fitting trend in each well and the construction of a deterministic model. To assess whether the predicted water levels have a spatial autocorrelation structure, empirical semivariograms were estimated, and these indicated a similar, nested structure. Based on the values obtained by trend estimation, a 32.5 km range stood out, while in the case of the deterministic modeling, this was 28.6 km (Figure 8). For better illustration, 2D surfaces were modelled with kriging from the predicted values for each observation point based on the two methods. According to the map prepared on the basis of the trend estimation (Figure 9), the highest water levels can be expected in the central-SW part of the Transdanubian Range, in the central areas of the Bakony Mts., where the highest water levels are predicted to be in 2030 between 280 and 310 m asl. Low water levels are still expected around mining centers, but a clearly demarcated depression area can no longer be identified. The lowest water levels are expected at the SW (~65 m asl.) and NE (95-115 m asl.) edges of the area. For better illustration, 2D surfaces were modelled with kriging from the predicted values for each observation point based on the two methods. According to the map prepared on the basis of the trend estimation (Figure 9), the highest water levels can be expected in the central-SW part of the Transdanubian Range, in the central areas of the Bakony Mts., where the highest water levels are predicted to be in 2030 between 280 and 310 m asl. Low water levels are still expected around mining centers, but a clearly demarcated depression area can no longer be identified. The lowest water levels are expected at the SW (~65 m asl.) and NE (95-115 m asl.) edges of the area.
Based on the values of the deterministic modeling, the map of January 2030 ( Figure 10) forecasts slightly lower water levels compared to the other method. The highest predicted water levels can also be expected in the Bakony Mts. according to this approach, with water levels slightly higher than 270 m asl, while the lowest predicted water levels in the edge of the area are between 110 and 120 m asl. Based on the values of the deterministic modeling, the map of January 2030 ( Figure  10) forecasts slightly lower water levels compared to the other method. The highest predicted water levels can also be expected in the Bakony Mts. according to this approach, with water levels slightly higher than 270 m asl, while the lowest predicted water levels in the edge of the area are between 110 and 120 m asl. In order to examine the accuracy of the estimation by trend fitting, a difference map between the measured and estimated water levels was prepared for the beginning of 2016 ( Figure 11). The maximum deviation between the estimated and measured values is about 8 m, while the median of the deviations is approximately 1 m, hence the difference between the measured and estimated values is below 1 m in at least half of the wells. In order to examine the accuracy of the estimation by trend fitting, a difference map between the measured and estimated water levels was prepared for the beginning of 2016 ( Figure 11). The maximum deviation between the estimated and measured values is about 8 m, while the median of the deviations is approximately 1 m, hence the difference between the measured and estimated values is below 1 m in at least half of the wells. The difference between the values measured and those estimated using the deterministic model for January 2016 was also plotted ( Figure 12). The maximum deviation is around 20 m for estimation by this method. The median of the differences is 3.29 m, so it can be concluded that the difference between the measured and estimated values exceeds this value in, at most, half of the wells. The difference between the values measured and those estimated using the deterministic model for January 2016 was also plotted ( Figure 12). The maximum deviation is around 20 m for estimation by this method. The median of the differences is 3.29 m, so it can be concluded that the difference between the measured and estimated values exceeds this value in, at most, half of the wells.
Finally, the difference between the projections for January 2030 using the two different methods was examined. Therefore, a difference map based on the values calculated by trend estimation and deterministic modeling was prepared (Figure 13) in such a way that the results obtained by modeling were subtracted from the values obtained by trend analysis. Finally, the difference between the projections for January 2030 using the two different methods was examined. Therefore, a difference map based on the values calculated by trend estimation and deterministic modeling was prepared (Figure 13) in such a way that the results obtained by modeling were subtracted from the values obtained by trend analysis. In some parts of the area shown on the map, where there are no monitoring wells, large differences can occur between the values estimated by the two methods; this, however, is a consequence of unreliable extrapolation. In areas where the amount of data is adequate on the basis of interpolation, the difference map shows the greatest differences in the central areas of the Bakony Mts. Here, the values estimated by trend analysis exceed the results of modeling, and in some places by as much as 30-40 m. In the NE part of the In some parts of the area shown on the map, where there are no monitoring wells, large differences can occur between the values estimated by the two methods; this, however, is a consequence of unreliable extrapolation. In areas where the amount of data is adequate on the basis of interpolation, the difference map shows the greatest differences in the central areas of the Bakony Mts. Here, the values estimated by trend analysis exceed the results of modeling, and in some places by as much as 30-40 m. In the NE part of the study area, on the other hand, the deterministic forecast gives higher expected water levels, but here, the difference between the results of the two methods is smaller (maximum 5-10 m). The median of the absolute deviations is approximately 4 m; thus, in at least half of the wells the difference between the values estimated by the two methods exceeds this value.

Discussion
When mining activity is ceased and dewatering is discontinued, a number of environmental problems emerge (e.g., [71]). Besides the general water quality deterioration of the subsurface water body, the large-scale change in groundwater levels over an extended area (e.g., [72][73][74]) pose an even higher threat to the generally vulnerable karstic areas. Such changes can cause extremely high hydraulic gradients favoring internal erosion and dissolution in the karst and changes in flow paths etc.; for an extensive review see Gutiérrez et al. [75].
After the collapse of the mining industry in the former Soviet Bloc [25,26] large-scale dewatering ceased in many former mining centers, causing an increase in water levels and the regeneration of the system. In Hungary, an extremely large amount of groundwater was abstracted from the karst aquifer of the Transdanubian Range for mining purposes between 1950 and 1990, as well. However, after the mine closure, the water level began to rise to an extreme degree in some areas (especially near to the mining centers) [27,30]. Thanks to the wealth of data from the monitoring wells in the area, water level changes can be examined extensively, the better to understand the recharge process and the behavior of the karst system.
Clustering groundwater time series is a common method to identify the individual behaviors of the different groups. Variant hydrochemical characterization [76,77] or hydrodynamic differences [78] can be determined by different types of cluster analysis-a method which is, furthermore, applicable to the task of revealing spatial separation. In the case of the Transdanubian range, the three groups of time series distinguished by cluster analysis are also spatially separated. The time series of this group also show a rising trend, but with a definite fluctuation pattern ( Figure 14). As the grouping was based on the patterns, and not on the absolute value of water levels, the actual rate of the water level rise can also be very different for wells belonging to the same group. In the wells of the first group, the rise of the water level was between 3 and 58 m between 1995 and 2015. In the second group, the rise ranged from just a few meters to more than 100 m, while in the third group, it varied between 12 and 59 m. The grouping is influenced by the position of the wells associated with the different depression centers (Nyirád, Iszkaszentgyörgy, Tatabánya, and Dorog). This may be partly explained by the pattern of mine closures, which did not take place simultaneously, having given ever-greater impetus to the recovery of the reservoir. The first and the second cluster groups are not characterized by a definite fluctuation pattern, while in the wells of the third group located in the Bakony recharge area, further from the dewatering sites, the water levels show significant fluctuations. Thus, in addition to the recovery, the effect of infiltration is also significant in this area, a fact which is reflected in the time series. The Wilks' λ quotient values obtained also indicate that the water level changes occur at a higher rate at the beginning of the recovery and gradually slow down towards the end of the process. Thus, the initial stage has the most significant effect on the pattern, and so therefore on the separation of the groups, as well. In most cases, groundwater level changes related to mining activity numerical simulation, but this requires the simplification of a very c What is certain is that significant groundwater increase on a regional scale can induce severe economic and technical-engineering issues [79]. In order to be properly prepared for these problems and to mitigate any damage, it is necessary to forecast the change in groundwater level as accurately as possible.
In most cases, groundwater level changes related to mining activity are modeled by numerical simulation, but this requires the simplification of a very complex system [80,81]. Rapantova et al. [82] studied the possibilities of deterministic modeling of groundwater flow in mining areas, and software with different approaches that can be used in different hydrogeological environments. They concluded that deterministic modeling can be the most suitable for answering certain, more complex hydrogeological questions (e.g., the assessment of the impact of hydraulic stresses) [82].
In karst aquifers, numerical simulation can be particularly complicated due to the peculiarities of such formations and, especially, the high degree of heterogeneity related to its triple porosity [80]. Furthermore, for modeling to be successful, a wide variety of input data are required, and these may be error prone or their calculation may give an uncertain result (e.g., in the case of infiltration) [83,84]. When it is a question of water level forecasting, several approaches can be expedient. Charou et al. [85] investigated water level changes due to the effects of mining by processing remote sensing data, which is an inexpensive and effective tool for indicating groundwater changes in mining areas. In order to obtain the most accurate water level estimation, it is recommended that the results obtained by different methods then be compared.
A special hydrogeological situation evolved as a result of mining activities in the Transdanubian Range, where the expected water levels can be estimated not only by deterministic modeling but also by trend analysis. Since the recovery process has an upper limit, functions with threshold values (growth curves) are best suited to describe its trend. From the 10 models examined, the Richards function proved to be the most suitable for describing the recovery process in the karst aquifer, supposedly to the underlying additional factor-compared to the other methods-that renders its computation flexible at the inflection point (Supplementary Material Section S1). The descriptive statistics of water level values predicted for 107 wells by 10 different methods for 2030 were also prepared ( Table S4). The different models were compared based on the interquartile interval of their estimated values for 2030. The Richards function did not provide the smallest range. However, the interquartile range in the present case cannot be considered as a valid statistic to evaluate the goodness of the model predictions, because the now undisturbed and recovering karst water levels tend to follow topography ( Figure. 1) [86,87], thus are expected to show a relatively high spatial variability in 2030. This variability is reflected in the relatively wider interquartile range of the "Richards" model (Table S4) and concurs with the "Richards" model giving the best fit regarding the correlation index (r 2 ). This means that it was not only able to estimate the trend of the measured data but also its spatial variability among the wells.
The values estimated by trend analysis compared to the water levels measured for 2016 show a small difference in many wells; the median of these values is around 1 m. Considering that water levels in the karst systems may in any case be characterized by high temporal variability [80], this small difference in measured and estimated values confirm that the calculation using trend analysis gives an accurate result. The determination of the K threshold value is essential to this method; another requirement is that there be as many measured values from as long a time period as possible. The trend analysis and the forecasting with MODFLOW modeling give similar results, something which is supported by the similar variograms (with nearly identical ranges) that were obtained, as well as the similar spatial distribution of the water levels. The absolute difference in the water levels calculated for January 2030 by the two methods shows a difference of more than 4 m in at least half of the wells. The greatest differences occur in some of the wells in the south of the Bakony Mts, which is an infiltration area, so the time series here are characterized by relatively intense fluctuations. The difference may also occur because the two methods require fundamentally different input parameters. The calculation by trend analysis is based on the measured water level data, which already includes all the impacts affecting it. For MODFLOW modeling, several parameters (e.g., hydraulic conductivity, infiltration, yield of water abstractions) need to be provided, and any inaccuracy in these will affect the estimated values. Besides, during the construction of a deterministic model, a significant simplification of reality is inevitable, e.g., the aquifer of the Transdanubian Range was considered as continuous and a single layer. In the creation of a deterministic model, a more precise determination of leakage from the Cretaceous and younger cover layers is advisable.
Modeling is time consuming, requires a lot of input data (that is difficult to obtain and process), and can be erroneous. Trend estimation-especially for a lot of data-can also be time consuming, but it requires many fewer different datasets. The water level time series practically bear the imprint of all the effects on the aquifer; changes in the water level are the manifestation of all the effects. In addition, although a special case was examined in the present study, trend estimation as a water level forecast option can be applied to other areas of hydrogeology. In this particular case, the growth curves estimated the hydrogeological changes the most accurately. In other cases, it is also necessary to choose an appropriate trend function on the basis of the professional knowledge of the researcher so the water levels can be reliably predicted.
The more accurate forecast of groundwater level in such areas may become an urgent necessity. Using the methodology applied in this study, rises in groundwater can be examined in a reliable and easy-to-use way.

Conclusions and Outlook
The investigation of groundwater level rises and the prediction of water levels was presented and in which a novel methodology composed of different data analysis methods was applied to the former mining area of the Transdanubian Range. The time series of karst water monitoring wells in the Transdanubian Range were split into three groups by cluster analysis, and these groups may be spatially separated on the basis of their location relative to the depression centers, which are characterized by different patterns. Both the deterministic modeling (using MODFLOW) and trend estimation (using the Richards function out of all trialed ones) proved themselves suitable for forecasting of water level rises in the area for 2030. It should be noted, however, that their comparison reveals distinctive spatial differences in the predictions of the two methods. On the one hand, the results of modeling with MODFLOW were found to be greatly affected by the accuracy of the input parameters (infiltration, yields, geological model of the area, etc.) and how precisely the established conceptual model approximates reality. On the other hand, trend estimation with growth models provided similarly accurate results, but without the need for auxiliary data, simply through the investigation of the hydrographs, as they reveal the hydrogeological characteristics of their vicinity. Furthermore, based on the preliminary knowledge, it was necessary to specify the maximum water level as close as possible to the original undisturbed water level. It is possible to state this with some confidence, as neither the slight inaccuracy in specification of the maximum level, nor the few years of missing data at the beginning of the time series seems to affect the applicability of the growth models significantly.
Comparing the two methods, it can be stated that the application of growth curves in trend analysis can be more successfully applied in the forecasting of groundwater levels. The methodology applied to analyze the increasing groundwater level data can help to examine and forecast rising groundwater levels in many areas of the world characterized by such phenomena.