Assessment of TOPKAPI-X Applicability for Flood Events Simulation in Two Small Catchments in Saxony

Abstract: Numerical simulations of rainfall-runoff processes are useful tools for understanding hydrological processes and performing impact assessment studies. Advancements in computer technology and data availability have assisted their rapid development and wide use. This project aims to evaluate the applicability of the physically based, fully distributed rainfall-runoff model TOPKAPI-X for the simulation of flood events in two small watersheds in Saxony, Germany. The results indicate that the model was calibrated well for the 4.88 km² Wernersbach catchment (NSE 0.89), whereas the calibration for the 276 km² Wesenitz catchment was only satisfactory (NSE 0.7). The addition of the second soil layer improved the model's performance in comparison to the simulations with only one soil layer for Wernersbach (NSE increase from 0.83 to 0.89). During the validation process, the model showed a variable performance. The best performance was achieved for Wernersbach for the year with the highest runoff in the last decade (NSE 0.95). The lowest NSE was 0.64 for both the Wernersbach and Wesenitz catchments. The reasons for the model's low performance in some years are discussed, and include: (i) input data quality and data insufficiency, (ii) methods used within the simulations (interpolation, ETP estimation, etc.), and (iii) assumptions made during the calibration (manual calibration, parameter selection, etc.).


Introduction
Floods are among the most devastating and deadly natural hazards globally, affecting both developing and developed countries [1,2]. Germany is not an exception, considering, for example, the flooding events of 2002 on the Elbe River and of 1995 on the Rhine River [3]. The most recent example is the floods caused by extreme rainfall in mid-July 2021, which hit the states of Rhineland-Palatinate and North Rhine-Westphalia the worst. In general, this type of extreme event accounts for one-third of global economic losses and has affected two-thirds of the people who have experienced any kind of natural hazard [4]. The authors of [5] have stated that around 50% of water-related disasters worldwide are floods, which affect about 196 million people on a yearly basis in more than 90 states. Due to the natural origin of floods as well as their stochastic nature, complete prevention might not be possible. However, it is possible to lower the related risks and negative consequences by means of real-time flood forecasting systems and the implementation of mitigation actions [2]. In this regard, hydrological models are an essential element in addressing them and in considering potential solutions. As a basis of flood forecasting systems, they

Background and Rationale of the Current Project
The current study aims to assess the applicability of the hydrological model TOPKAPI-X (TOPographic Kinematic APproximation and Integration-EXtended) to simulate high flow events in two German catchments. The authors of [6,25,42,43] have pointed out the following as the main advantages of the TOPKAPI approach: (1) Applicability at increasing spatial and temporal scales, while at the same time keeping the physical interpretation of the parameters; (2) Reduced execution times suitable for distributed model calibrations and real-time operational applications; (3) Complete use of DEM, soil, as well as land cover maps for obtaining non-linear reservoir cascades and an approximation of model parameters; (4) Capacity to compute water balance at a highly detailed spatial and temporal resolution by means of combining land use, soil, and other information; (5) Possibility to track the spatial variability of runoff conditions in the catchment, obtaining flow predictions at any point of the channel network; (6) Physical basis, which allows the linkage between the model parameters and catchment characteristics; (7) Possibility to run event-based and continuous simulations (climatological studies); (8) Applicability to ungauged basins due to the possibility of parameter estimation from the literature. Table 1 compares the general characteristics of other hydrological models with those of TOPKAPI-X. It is known that the size and shape of the watershed influence the geometry of the enclosed river system, which in turn influences the amount, distribution, and timing of the runoff [70]. In this regard, TOPKAPI-X has been extensively applied for flood forecasting as well as for the estimation of water balance and its components in different watersheds around the world [71]. For example, the authors of [72] compared the soil moisture estimate of TOPKAPI (the older version of TOPKAPI-X) with that of the remotely sensed ASCAT surface soil moisture product.
The results indicated a good correspondence in dynamic behavior between the two sources. In another study, the authors of [73] assessed the sensitivity of TOPKAPI with an added infiltration model (Green-Ampt) to systematic bias in the variables (e.g., rainfall and ETP) as well as in the parameters of the soil component. Overall, the model's sensitivity was governed by a linear function of the input and parameter bias. In addition, the sensitivity had the same order of magnitude as, or a lower order of magnitude than, the input and parameter bias. According to the authors of [73], these findings indicate that the model can be robust in relation to errors in forcing and parameters. The authors of [32] applied the modified version to the Upper Xixian catchment and found that it can simulate high and low flow events well. The authors of [74] found that the variable-step fourth-order Runge-Kutta algorithm for the non-linear reservoir equations was a good approximation of subsurface flow in the soil matrix, overland flow over the slopes, and surface flow in the channel network. This allowed the retention of the physical properties of the original equations at scales ranging from a few meters to 1 km. The authors of [23] performed a comparison between the lumped and distributed versions of TOPKAPI. The obtained results showed that it is possible to reproduce the results of the distributed version while at the same time shortening the computation time significantly. However, some limitations were identified in the derivation of the relationship linking the soil water content of the catchment to the extent of the saturated area. The authors of [75] used TOPKAPI to understand the response of high-elevation catchments to the changing climate. The authors of [28] simulated the glacier mass balance and demonstrated that the combination of discharge and satellite snow cover could improve the model's performance.
TOPKAPI-ETH, a glaciohydrological version of the model, has been used to simulate glacier melt and runoff generation in the Nepalese Himalayas [76]. A comparison of TOPKAPI and HEC-HMS for the same catchment showed that TOPKAPI simulates the dynamics of soil filling, the depletion mechanisms, and flood routing in a less realistic way [77]. This was assumed to be due to the lack of reservoir modeling in its structure and its hydrograph diffusion effect in the flood wave. The authors of [30] used hourly energy balance simulations together with additional meteorological and surface data to calibrate some of TOPKAPI's parameters for a glacierized basin in central Chile. They showed that the parameters calibrated in this way responded very clearly to the climate settings typical of the dry Andes of Chile. TOPKAPI-X is used as part of the "Global system for hydrological monitoring and forecasting in real-time at high-resolution GH2MF2" project, coupled with the Weather Forecast of Global Circulation Models (Deterministic and Ensembles Run) [78]. It is also being used for real-time flood forecasting in the Campania region (Southern Italy) as part of the Sele Flood Forecasting System [79].
The current project is different from the previous studies where TOPKAPI has been applied, due to the significantly smaller size of the study areas and the small size of the applied grids (10-40 m here, whereas hundreds of meters to 1 km in other studies). To the authors' knowledge, TOPKAPI-X has not been applied to German catchments, which is another reason for the assessment of its applicability.

Wernersbach Catchment
The first study site is a small catchment in the Tharandt Forest, in the state of Saxony, Germany (Figure 1a). It has a drainage area of 4.88 km² and lies in the western part of the forest. The Tharandt Forest mainly consists of Norway spruce and European beech [80]. The land cover of the Wernersbach catchment consists of coniferous forest and transitional woodland shrubs, which comprise 94.9% and 5.1% of the area, respectively (Figure 1c). Figure 1 illustrates the locations of the meteorological and hydrological data stations, the Digital Elevation Model (DEM of 10 m resolution), and the soil type and land cover maps used for modeling this catchment. As the soil map indicates, the main soil type in this catchment is silt (Figure 1c). The bedrock geology is highly variable (gneiss, rhyolite, sandstone, and claystone from the Lower Cretaceous, basalt), forming a mosaic of various soil classes (mostly Dystric Cambisols, Podsols, and Stagnosols). From the geomorphological point of view, level and gently sloped plateaus are predominant, with the height differences between valleys and plateaus ranging between 50 and 100 m [80]. According to the long-term records from the nearby Grillenburg weather station (1960-2007), the mean annual air temperature is 7.4 °C, whereas the mean annual precipitation is 879 mm [80]. The available hourly discharge data for the Wernersbach Hauptpegel (illustrated in Figure 1a) showed that in the period 2007-2017, the maximum runoff was 6.2 m³/s, the average was 34.93 l/s, and the minimum was 0.19 l/s.

Wesenitz Catchment
The second study area (Wesenitz) is located in East Saxony (Figure 2a). The Wesenitz is a right-bank tributary of the Elbe River and originates in the Upper Lusatian Highlands [81]. The contributing catchment area is approximately 276 km² with a velocity range of 1.5-30 m/s [81,82]. Figure 2 illustrates the DEM of the study area and the locations of rainfall and discharge gauge stations (Wesenitz Elbersdorf Pegel). There are 10 land cover and 103 soil classes present in this catchment (Figure 3). This catchment differs from the previous one due to the presence of settlements in its contributing area. As shown in Figure 2, certain parts of the catchment (the eastern side, characterized by a higher elevation) were not covered by the rain gauges. The available daily average discharge data for the Elbersdorf gauge indicated that in the period 2005-2015, the maximum flow was 42.6 m³/s, with an average flow of 2.3 m³/s and a minimum flow of 0.5 m³/s. The land cover map (Figure 3a) shows the dominance of non-irrigated arable land in the catchment. In the soil type map, the major soil type is silt (Figure 3b).

Description of TOPKAPI-X
TOPKAPI-X (TOPographic Kinematic APproximation and Integration-EXtended) is a fully distributed, physically based rainfall-runoff model [72]. It couples the kinematic wave approach with the topography of the catchment area and transfers the rainfall-runoff as well as the runoff routing processes into three "structurally-similar", finite-dimension, non-linear reservoir equations [83,84]. The three equations represent (1) drainage in the soil; (2) overland flow on the saturated or impervious soil; and (3) channel flow [18]. The equations are laterally integrated over the grid cells that describe the geometry of the catchment, leading to a cascade of non-linear reservoir equations [42,74]. The channel component is calculated using the kinematic wave method if the slope is higher than 10⁻³. When the slope is lower than that value, the Muskingum-Cunge-Todini method is applied for this component [85]. A detailed description of the model is given in [85,86]. An overview of the relationships between the equations is given by the authors of [43].
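The behavior of a single non-linear reservoir can be illustrated with a short numerical sketch. It assumes a generic storage equation of the form dV/dt = q_in − b·V^c, which is a simplification and not the TOPKAPI-X source code. The function names, the coefficients `b` and `c`, and the fixed time step are hypothetical, and a fixed-step fourth-order Runge-Kutta scheme stands in for the variable-step variant reported in [74].

```python
def reservoir_rate(volume, inflow, b, c):
    """Rate of change of storage for a generic non-linear reservoir."""
    # Outflow grows as a power of the stored volume (b and c are hypothetical).
    return inflow - b * max(volume, 0.0) ** c


def rk4_step(volume, inflow, b, c, dt):
    """Advance the storage by one fixed time step with classical RK4."""
    k1 = reservoir_rate(volume, inflow, b, c)
    k2 = reservoir_rate(volume + 0.5 * dt * k1, inflow, b, c)
    k3 = reservoir_rate(volume + 0.5 * dt * k2, inflow, b, c)
    k4 = reservoir_rate(volume + dt * k3, inflow, b, c)
    return volume + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate_reservoir(v0, inflows, b, c, dt):
    """Return the storage after each time step for a series of inflows."""
    volumes = []
    volume = v0
    for inflow in inflows:
        volume = rk4_step(volume, inflow, b, c, dt)
        volumes.append(volume)
    return volumes
```

Under a constant inflow, the storage approaches the equilibrium value (q_in/b)^(1/c), at which outflow balances inflow. In a cascade, the outflow of one cell would become part of the inflow of the next cell downstream.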
Values for physical characteristics are assigned to each grid cell, and the mass and momentum balance computations are performed in each of them [83,84]. TOPKAPI-X is a newer version of TOPKAPI. It has several improvements, but the main concepts are those of the original TOPKAPI. The following are the improvements in its formulation [79]: (1) Each grid cell has eight possible flow directions instead of the original four; (2) The infiltration module is based on the Green-Ampt model, which allows the reproduction of Hortonian processes and accounts for the infiltration excess mechanism; (3) A second soil layer is added in order to reproduce different hydrological conditions; (4) A groundwater component is added, based on cellular automata with a full 2D Integrated Finite Difference scheme; (5) New coefficients are introduced, which consider the sun height with respect to the cell aspect and are used for the assessment of radiation and albedo in the snow accumulation and melting module based on mass and energy balance; (6) Reservoir and Lake components are added.
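The eight-direction scheme in item (1) can be illustrated with the widely used D8 steepest-descent rule. The sketch below shows that general rule only; how TOPKAPI-X or the GIS pre-processing actually derives the flow direction map is not described here. The function name and the list-of-lists DEM layout are assumptions made for illustration.

```python
import math

# Offsets (row, column) of the eight neighbours of a grid cell.
D8_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def d8_direction(dem, row, col, cell_size):
    """Return the (d_row, d_col) offset of steepest descent, or None for a pit or flat cell."""
    best_offset = None
    best_slope = 0.0
    for d_row, d_col in D8_OFFSETS:
        r, c = row + d_row, col + d_col
        if not (0 <= r < len(dem) and 0 <= c < len(dem[0])):
            continue  # neighbour lies outside the grid
        # Diagonal neighbours are farther away by a factor of sqrt(2).
        distance = cell_size * math.hypot(d_row, d_col)
        slope = (dem[row][col] - dem[r][c]) / distance
        if slope > best_slope:
            best_slope = slope
            best_offset = (d_row, d_col)
    return best_offset
```

A four-direction scheme would skip the diagonal offsets, so a cell that drains diagonally would be routed to a less steep orthogonal neighbour or treated as a pit.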

Input Data
The Institute of Hydrology and Meteorology of TU Dresden provided the meteorological and hydrological data for both catchments from the relevant sources (Table 2). In the case of Wernersbach, the data was available in hourly time steps, whereas for Wesenitz it was available daily. The temperature data for Wernersbach was obtained from the meteorological station at Grillenburg.

The gaps in the precipitation data for some stations in Wernersbach were filled with the data from the station that had a complete time series. The daily average temperature for Wesenitz was available from only one station, and it was taken as the mean for the whole catchment. Due to the absence of data, several ponds (e.g., Niederteich, Waldscheibenteich, etc.) and lakes (such as Natursee, Kiessee, etc.) in this basin were not considered during the simulations. TOPKAPI-X requires several maps in ASCII format as input for pre-processing: filled DEM, basin mask, outlet map, flow direction, flow accumulation, soil, and land cover. The satellite-based DEM (10-m resolution), land cover, and soil maps for the study areas were provided by the Institute of Hydrology and Meteorology from the corresponding sources (Table 2). All maps were processed in ArcGIS 10.3.1. The model for Wernersbach had a 10-m grid size, whereas the model for Wesenitz had 40-m grids due to model limitations. Table 2 lists the details of the input data used for model generation.
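The gap-filling step for precipitation can be sketched as a direct substitution from the complete station. This is a minimal illustration under the assumption that both series share the same time steps and that missing values are marked with `None`; the function name is hypothetical, and the actual procedure used in the study may have differed in detail.

```python
def fill_gaps(target, reference):
    """Replace missing values (None) in `target` with values from `reference`."""
    if len(target) != len(reference):
        raise ValueError("series must cover the same time steps")
    filled = []
    for own_value, reference_value in zip(target, reference):
        # Keep the station's own record wherever it exists.
        filled.append(reference_value if own_value is None else own_value)
    return filled
```

Direct substitution carries the reference station's rainfall over unchanged, so any spatial difference between the two gauges during the gap is lost.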

Model Calibration and Validation
The variability of the parameters within a small range of time and space and the uncertainties related to soil characteristics, land cover, etc., necessitated the calibration of the model [18]. Manual calibration was performed for a year with average and variable flows rather than the year with the highest flow. As high flow events did not often occur in the considered years, focusing only on them could have biased the calibration. Simulations were performed at an hourly time step for Wernersbach (1 October 2009-1 November 2010) and at a daily time step for Wesenitz (1 October 2012-1 November 2013). During the calibration, the modeled discharge was visually matched to the recorded flow, while keeping the parameters within their meaningful values and in the same order of magnitude as in the previous studies with TOPKAPI. The main likelihood function used during the calibration was the Nash-Sutcliffe efficiency (NSE). The percentage bias (PBIAS), the coefficient of correlation, and the root mean square error (RMSE) were also calculated and displayed in the simulation visualizer tab of TOPKAPI-X. Initial values for all the parameters were derived from the literature. The roughness coefficient for each land cover type was estimated using the values from [1]; these are shown in Tables 3 and 4. Crop coefficients (Kc) were taken from [87] and are also shown in Tables 3 and 4. The Kc and land cover Manning values were not calibrated because the TOPKAPI-X User Manual states that their sensitivity is low. The Manning's value for each Strahler order was taken from [85]. Because only generic descriptions of the German soil types were available and no parameter data existed for them, they were simplified to the main soil types (as classified by the USDA). Initial values for these simplified soil types were then taken from [85]. Following the authors of [9], shallow soils were assigned to grid cells representing water bodies, rocky areas, and urban areas.
All the simulations started in September of the year before the year of interest to allow the warm-up of the model for 1 month and because the snowmelt process required the snow from the previous year. Tables 5 and 6 show the range of calibrated soil parameters for Wernersbach and Wesenitz, respectively.
Model validation and model calibration are essential prerequisites for the derivation of reliable and accurate discharge estimates from hydrological models [88]. Validation should demonstrate the conditions under which the model performs well or poorly and confirm that it is acceptable for the intended application [89]. The calibrated model parameters were used for other years (2009, 2012, and 2013 for Wernersbach; 2006, 2008, and 2010 for Wesenitz) in order to assess whether the model could give reasonably accurate results with other independent datasets. In addition to the above-mentioned goodness of fit (GOF) tests, the Mean Error (ME), Mean Absolute Error (MAE), Kling Gupta Efficiency (KGE), Volume Efficiency (VE), and others were applied in order to address the characteristics of the simulated output, using the hydroGOF package (developed by Zambrano-Bigiarini) of R-Studio (3.5.1). The equations for the corresponding tests can be found in [90-93]. For a better visualization and analysis of the results, the corresponding plots (such as the scatterplot, the flow duration curve, and the hydrograph) were made using the hydroTSM package (developed by Zambrano-Bigiarini) of R-Studio.
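For readers who want to reproduce the main goodness-of-fit measures outside R, the following sketch re-implements their commonly used definitions in plain Python. It is not the hydroGOF code used in the study; the function names are chosen for illustration, the KGE follows the 2009 formulation, and the sign convention for PBIAS (negative values for underestimation) is assumed to match the one used in the results. The definitive equations are those in [90-93].

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is as good as the observed mean."""
    obs_mean = _mean(obs)
    residual = sum((s - o) ** 2 for s, o in zip(sim, obs))
    variance = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(sim, obs):
    """Percentage bias; negative values indicate underestimation."""
    return 100.0 * sum(s - o for s, o in zip(sim, obs)) / sum(obs)


def rmse(sim, obs):
    """Root mean square error, in the units of the flow series."""
    return math.sqrt(_mean([(s - o) ** 2 for s, o in zip(sim, obs)]))


def volumetric_efficiency(sim, obs):
    """Fraction of the observed volume that is matched at the right time."""
    return 1.0 - sum(abs(s - o) for s, o in zip(sim, obs)) / sum(obs)


def kge(sim, obs):
    """Kling-Gupta efficiency (2009 formulation)."""
    sim_mean, obs_mean = _mean(sim), _mean(obs)
    sim_sd = math.sqrt(_mean([(s - sim_mean) ** 2 for s in sim]))
    obs_sd = math.sqrt(_mean([(o - obs_mean) ** 2 for o in obs]))
    covariance = _mean([(s - sim_mean) * (o - obs_mean) for s, o in zip(sim, obs)])
    r = covariance / (sim_sd * obs_sd)   # linear correlation
    alpha = sim_sd / obs_sd              # variability ratio
    beta = sim_mean / obs_mean           # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

Because the NSE sums squared residuals, a few large errors at peak flows weigh far more than many small errors at low flows, which is worth keeping in mind when interpreting a single NSE value.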

Calibration
Wernersbach (One Soil Layer)
Figure 4 illustrates the simulated and observed runoff hydrographs for the calibration year 2010, when only one soil layer was used. There is a good visual correspondence between the simulated and observed streamflow, and the timing of the flow was well simulated. The NSE was 0.83, which is considered good (Table 7) according to the classification by [94]. In some cases, however, the model underestimated the actual flow. In particular, the simulated peak in June was significantly lower than the observed flow. This underestimation could be due to the assignment of a high infiltration capacity to the soil, an incorrect precipitation amount, or the absence of the second soil layer. When there is one soil layer, TOPKAPI-X does not allow the depth of the soil to be more than 50 cm. If the soil is deeper, a second soil layer should be added. The use of a single (shallow) soil layer might have limited the storage capacity of the soil component. The authors of [72] stated that an increase in soil depth could increase the soil saturation index/soil moisture content. Therefore, the shallow soil layer might not have acted as a significant water-holding layer, which otherwise could have led to faster saturation and subsequently to higher runoff generation. To evaluate the effect on model performance, the calibration was therefore continued with a second soil layer added. The developers of TOPKAPI-X have also mentioned the importance of the second soil layer for a better representation of the baseflow (TOPKAPI-X User Manual).

Wernersbach (Two Soil Layers)
As can be observed from the GOF test results, the addition of the second soil layer brought a relatively small improvement (Table 7). The presence of the same underestimation in June indicates that this deviation is not caused by the absence of the second soil layer (Figure 5a). In the original precipitation data, there were gaps during the event in June for the southern rain gauge (located in the highest part of the catchment). As the gaps were filled with the data from another station, this might have resulted in the underrepresentation of the actual rainfall distribution. Significant spatial variability in the precipitation amounts (up to 24 mm) was observed when the hourly rainfall data from the stations were compared. During the events in June and October, the soil moisture content was of a similar magnitude, though the amount of runoff was not (Figure 5b). This might indicate that the soil was sufficiently saturated, but that the rainfall available to the model was not enough to produce such a runoff. The authors of [95] found that the runoff generation mechanisms present in the Wernersbach catchment were saturation excess, Hortonian overland flow, and interflow. Due to the lack of data, the Green-Ampt module, which is responsible for the Hortonian (infiltration excess) overland flow, was not activated. The authors of [95] claimed that this mechanism was clearly observable during small rain events and was caused by impermeable roads of 2.6 km length. However, these roads were not observable in the available land use map due to its lower resolution. This indicates that modeling with an incomplete set of contributing runoff processes could have caused the observed deviation. The comparison of the flow duration curves (FDC) with one and two soil layers demonstrates which flows were better modeled (Figure 6). Both graphs indicate the poor performance of TOPKAPI-X in simulating low flows. The authors of [68] explained that the inability of TOPKAPI to simulate low flows well was due to the absence of the groundwater module. Due to the lack of data, this component was not included here, which is likely to have played a role.
The results are acceptable (NSE 0.7), but the overall underestimation was relatively high (27.5%). The degree of collinearity between the observed and simulated streamflow data was 60% (Figure 8b). Overall, the model simulated the timing of the rises and falls in the hydrograph well, but not the amount of flow (Figure 7a). Due to the large number of initially estimated parameters resulting from the 103 soil classes in this catchment, a local rather than the global optimum might have been reached. This could result from the time-consuming nature of manual calibration, which makes it impossible to consider the whole parameter space and all the possible parameter combinations. Considering the improvement in the model's performance in the case of the Wernersbach catchment, the calibration for Wesenitz started with two soil layers from the beginning. The soil moisture graph does not show a high variability, probably due to the daily averaging of the data (Figure 7b).

The FDC and scatterplot show that the model underrepresented the low flows significantly, which could also be due to the fact that the groundwater component was not included (Figure 8). The FDC demonstrates that the model struggled to simulate low flows, as the discrepancy between the simulated and observed flow increased with a decrease in the flow volume (Figure 8a). The authors of [96] claimed that calibration with NSE concentrates more on the high flows of the hydrograph at the expense of improvements to the low flow predictions. In addition, the authors of [97] stated that certain problems could arise as a result of bias normalization and the tendency for the systematic underestimation of flow variability. They found that optimization using NSE can lead to the underestimation of variability in the calibration period, which will later propagate to the validation period. As a result, in certain cases the low flows would be poorly represented after calibration with NSE.
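The emphasis of NSE on high flows can be illustrated with a minimal sketch. The flow values below are synthetic and not taken from either catchment, the Weibull plotting position used for the FDC is an assumption, and the NSE on log-transformed flows is included only as a commonly used low-flow-sensitive counterpart, not as a measure used in this study. The simulated series matches all high flows exactly and halves every low flow.

```python
import numpy as np

def flow_duration_curve(q):
    """Return exceedance probability (%) and flows sorted from high to low."""
    q_sorted = np.sort(np.asarray(q, dtype=float))[::-1]
    n = q_sorted.size
    # Weibull plotting position (an assumption; other positions are also in use)
    exceedance = 100.0 * np.arange(1, n + 1) / (n + 1)
    return exceedance, q_sorted

def nse(obs, sim):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Synthetic flows (hypothetical values): one flood peak within a low-flow period
obs = np.array([1, 1, 1, 1, 2, 10, 30, 12, 4, 2, 1, 1], dtype=float)
# Simulation reproduces the flood exactly but halves every flow <= 2
sim = np.where(obs <= 2.0, 0.5 * obs, obs)

nse_raw = nse(obs, sim)                   # dominated by the squared errors at high flows
nse_log = nse(np.log(obs), np.log(sim))   # log transform gives low flows more weight

exceedance, q_obs_sorted = flow_duration_curve(obs)
_, q_sim_sorted = flow_duration_curve(sim)

print(f"NSE (raw flows): {nse_raw:.3f}")
print(f"NSE (log flows): {nse_log:.3f}")
for p, qo, qs in zip(exceedance, q_obs_sorted, q_sim_sorted):
    print(f"{p:5.1f}% exceedance: obs {qo:5.1f}, sim {qs:5.1f}")
```

In this constructed case, the NSE on the raw flows stays close to 1 although every low flow is underestimated by 50%, whereas the NSE on the log-transformed flows drops markedly. The discrepancy only becomes visible at the high-exceedance end of the FDC.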

Wernersbach
The validation of the model for the year with the recent high flow event is described next. The year with the highest flow in the last decade gave the best fit for the Wernersbach catchment in comparison to the other validation years (Table 8). Overall, there was a good fit between the simulated and observed runoff time series (Figure 9a). As the soil moisture graph shows, significant runoff becomes noticeable when the soil moisture rises above 80% (Figure 9b). Even though the overall fit is very good (NSE 0.95) and the overall underestimation is low (PBIAS 1.4%), the highest flow was significantly underestimated, with a relative error of 25.6% (Figure 10a,b). Among all the GOF tests, only the volumetric efficiency showed a significant difference between the simulated and observed values (VE 0.69).

Among the three validation years, the model's performance was lowest for the year 2012 (NSE 0.64). This year also had the highest underestimation (−14.2%), but the lowest RMSE and MAE values. For the year 2009, the model performance was acceptable (NSE 0.71) and the underestimation was within the accepted limits (−6.9%). The main difference between the year 2013 and the other two validation years is that the latter had no high flow. Thus, the calibrated model showed very good results for a high flow event, whereas its performance degraded for average years.

Figure 10 provides a closer view of this peak flow event for the simulations with one and two soil layers. The model represents the rising and falling limbs well. However, the highest flow was underestimated in both cases. The case with two soil layers underestimated all the peaks during this event, whereas the case with one soil layer overestimated the second part of the event. In other words, the two soil layers do not become saturated enough, whereas the single soil layer saturates faster and generates a higher runoff during this event.
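The bias- and volume-related measures discussed above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the sign convention (negative values indicate underestimation), the 2009 formulation of KGE, and the synthetic flow values are all assumptions.

```python
import numpy as np

def pbias(obs, sim):
    """Percent bias; negative means the model underestimates (assumed convention)."""
    return 100.0 * np.sum(sim - obs) / np.sum(obs)

def volumetric_efficiency(obs, sim):
    """Fraction of the observed volume delivered at the right time (1 is perfect)."""
    return 1.0 - np.sum(np.abs(sim - obs)) / np.sum(obs)

def kge(obs, sim):
    """Kling-Gupta efficiency (2009 formulation assumed)."""
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

def peak_relative_error(obs, sim):
    """Relative error (%) of the simulated maximum flow against the observed one."""
    return 100.0 * (sim.max() - obs.max()) / obs.max()

# Synthetic flood event (hypothetical values, not data from either catchment)
obs = np.array([1.0, 2.0, 10.0, 30.0, 12.0, 4.0])
sim = np.array([1.0, 2.0, 9.0, 24.0, 11.0, 4.0])

print(f"PBIAS: {pbias(obs, sim):.1f}%")
print(f"VE:    {volumetric_efficiency(obs, sim):.2f}")
print(f"KGE:   {kge(obs, sim):.2f}")
print(f"Peak relative error: {peak_relative_error(obs, sim):.1f}%")
```

Because each measure weights the errors differently, a simulation can score well on an aggregate measure while still missing the peak by a wide margin, which is the pattern reported for the high flow year.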

Wesenitz
The model's performance degraded when it was validated for Wesenitz with the year 2010. A visual analysis showed that the model represented the patterns of the hydrograph well, though the coefficient of determination was only 0.55 (Figure 11a, Table 8). The relative error between the observed and simulated highest flows was around −13.6%. The model performance in terms of the NSE coefficient was the lowest among the validated years (0.64), whereas the KGE coefficient was the highest (0.79) (Table 8). The overall underestimation was less than 10%, which is acceptable. It should be pointed out that an analysis of the model performance based on only 365 data points might not be statistically valid and significant.

The small peak at the end of February is probably due to snowmelt. However, the model calculated a significantly lower flow. Considering that this peak did not occur later in the spring, it could be hypothesized that the model was not provided with sufficient snow. This could be due to the low coverage of the catchment by precipitation gauges, as the available gauges might have underrepresented the snow. The soil moisture graph shows that the deeper layer was mostly stable from January to June (Figure 11b). In general, the results for this catchment are acceptable, although further research should be carried out with hourly data and with a better rain gauge coverage of the area. Considering the large number of parameters, the problems mentioned earlier (i.e., local optimum, equifinality) cannot be excluded in this modeling setup.

The two other years used for validation (2006 and 2008) showed better performances than the year with the highest flow (NSE 0.72 and 0.67, respectively). Interestingly, in the case of Wesenitz, the model performance during validation did not vary significantly among the years (NSE 0.64-0.72), in contrast to what was observed for Wernersbach. The year 2006 demonstrated the best model performance in terms of NSE; however, its overall underestimation was also the highest. The volumetric efficiency values were also in a close range.

Discussion
Hydrological models are clearly important and useful for water resources management, for understanding hydrological processes, and for impact assessment studies. They are tools that allow scientists as well as policy-makers to make decisions based on simulations of catchment behavior [9]. Considering the increasing demand for water and the influence of climate change, they will clearly be part of future water management practices. The results of the current study illustrate that TOPKAPI-X has the potential to simulate the runoff of small catchments in Saxony. According to the authors of [12], the model performance can vary depending on many factors, such as the model structure, the physiographic characteristics of the basin, and the available data (resolution, accuracy, and quantity). Their main point was that no single model is perfect and best for all problems. Similarly, the findings of this study showed that different conditions result in varying representations of the rainfall-runoff processes in each watershed. For one catchment (Wernersbach), the model gave very good results; for the other (Wesenitz), the results were only satisfactory. An analysis of the results and the available literature has shown that there are several potential sources of error, which are discussed next.

Missing Data and the Lack of Precipitation Data
In general, the quality and availability of precipitation data are essential for hydrologic analysis and the design of water resources systems [98]. Nevertheless, complete data are rarely available. Moreover, the authors of [99] pointed out that precipitation variability has a significant influence on peak flows at small scales. In the case of the Wernersbach catchment, there were gaps in the precipitation data for certain stations. Filling these gaps might have added bias to the spatial distribution of precipitation. It was nevertheless a necessary step, as the cumulative difference in the precipitation was significant and leaving the gaps unfilled would have resulted in even higher errors. The authors of [9] stated that elevation differences of more than 100 m could result in pronounced temporal and spatial variations of the climatic elements. Both catchments had these characteristics, which could mean that the existing rain gauge density might not be enough to capture the prevailing meteorological conditions in the catchment. Consequently, this would lead to a pronounced over- or underestimation of the rainfall amount. In addition, hydrological data are subject to significant uncertainty and error (systematic and random) [33,98]. Such errors are described as critical, as they can affect the continuity of rainfall data and ultimately influence the results of the hydrologic models, which rely heavily on them [100].

Spatial Distribution of Precipitation (Interpolation Method)
The outcomes of hydrological models rely on the space-time structure of atmospheric inputs, which is usually provided by interpolation methods [101,102]. In this regard, the authors of [102,103] reported that the spatial variability of precipitation, especially at fine scales, significantly impacts the catchment response, the estimation of model parameters, and the volume and timing of the peak flow. Errors in the areal extent or in the amount of precipitation directly lead to large errors in the simulated flow and to compensation for the input errors through wrong parameter adjustment [101,102]. TOPKAPI-X uses the Thiessen polygon method for the interpolation of rainfall. It is a well-known, simple technique for estimating mean areal rainfall based on proximal mapping [98,104]. However, it might limit the adequate representation of the precipitation distribution. This problem is especially important in the case of sparsely distributed rain gauges, which, as the authors of [101] stated, might result in an over- or underestimation of the total rainfall amount. The authors of [104] listed the following disadvantages of the method: (1) unsuitability for mountainous and hilly regions; (2) unrealistic patchy maps with sudden changes at the polygon boundaries; and (3) loss of information on rainfall gradients. It is possible to input precipitation data distributed by other techniques (e.g., Kriging, inverse distance weighting, etc.) by means of prepared precipitation maps. However, this option was not considered in the current project.
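The difference between proximal mapping and a distance-weighted alternative can be illustrated with a short sketch. It is not the TOPKAPI-X implementation: the gauge coordinates, rainfall depths, grid cells, and the inverse distance weighting power of 2 are hypothetical choices made only for illustration.

```python
import numpy as np

def thiessen(gauge_xy, gauge_p, cell_xy):
    """Proximal mapping: every cell takes the rainfall of its nearest gauge."""
    dist = np.linalg.norm(cell_xy[:, None, :] - gauge_xy[None, :, :], axis=2)
    return gauge_p[np.argmin(dist, axis=1)]

def idw(gauge_xy, gauge_p, cell_xy, power=2.0, eps=1e-12):
    """Inverse distance weighting: cells blend all gauges by 1 / distance**power."""
    dist = np.linalg.norm(cell_xy[:, None, :] - gauge_xy[None, :, :], axis=2)
    weights = 1.0 / np.maximum(dist, eps) ** power   # eps avoids division by zero
    return (weights * gauge_p).sum(axis=1) / weights.sum(axis=1)

# Two hypothetical gauges 10 km apart with 10 mm and 20 mm of rainfall
gauge_xy = np.array([[0.0, 0.0], [10.0, 0.0]])
gauge_p = np.array([10.0, 20.0])
# Two grid cells on either side of the polygon boundary at x = 5 km
cell_xy = np.array([[4.0, 0.0], [6.0, 0.0]])

p_thiessen = thiessen(gauge_xy, gauge_p, cell_xy)
p_idw = idw(gauge_xy, gauge_p, cell_xy)
print("Thiessen:", p_thiessen)
print("IDW:     ", np.round(p_idw, 2))
```

The two neighboring cells receive abruptly different values under proximal mapping, which corresponds to the sudden changes at polygon boundaries and the loss of gradient information noted above, whereas the weighted scheme produces a gradual transition between the gauges.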

Lack of Information on Parameters and Resulting Uncertainties
The basis of physically based models is that the parameters can be measured and then tuned during calibration [75]. However, the authors of [105] stated that the measurement of model parameters is not possible due to the presence of preferential flows. In this project, no data were available on the parameters for these specific locations; as a result, they had to be estimated. The approximation of all the parameters can lead to increased uncertainties in the model performance and to output errors. As the authors of [106] stated, the uncertainties during calibration consequently propagate to the model output. The situation becomes more complicated as some properties can vary significantly even in a relatively homogeneous area (e.g., soil hydraulic properties and water content) [63,80]. On top of this, the strong interaction between the parameters further complicates the situation [89]. The authors of [5] pointed out that the use of land cover-defined Manning's coefficients is acceptable in medium and large catchments, whereas in small catchments (or at the subcatchment level) their use can introduce significant errors to the results.

Manual Calibration with One Aggregated Response
The process of calibration faces several challenges related to the nature of hydrological models, such as nonlinearity, data errors and insufficiency, correlation among parameters, irregular response surfaces, and the single or multi-objective nature of the models [107]. As mentioned earlier, the manual calibration performed in this study could not consider the problem of equifinality and the whole parameter space. The authors of [9,100] stated that manual calibration is time-consuming, partially subjective, and only feasible when a trained and experienced user performs it. In addition, the authors of [89] mentioned the limited repeatability, the difficulty of handling cases with more than ten parameters, and the restriction to calibrating one gauge at a time as the major drawbacks of this method. Moreover, when manual calibration is used, it is hard to identify when the best fit has been obtained [100]. Another important point is the uncertainty related to the use of a single integrated response (e.g., discharge). The authors of [30,75] stated that the use of a single integrated response might result in the problem of equifinality. In addition, it does not guarantee the correct reproduction of single internal processes and cannot avoid the compensation of errors by the different model components [30]. Therefore, in the case of physically based, distributed models, model calibration should include the use of spatially distributed observations to avoid the parameter identifiability problem, producing a better balance between the modeled processes and a better partitioning of the various runoff components [108].

Insufficient or Inaccurate Processes
Due to the lack of data, several components and modules were not activated during the simulations, such as the Lakes and Reservoir component (for Wesenitz), the Green-Ampt Module, and the Groundwater component (for both watersheds). Their activation could have improved the model performance. In addition, interception might not have been adequately considered in TOPKAPI-X. The author of [109] stated that neglecting interception could introduce significant errors, which would propagate into the subsequent processes simulated (such as soil moisture and groundwater recharge). The authors of [17] found that including interception in the simulation improved the streamflow modeling due to better soil moisture accounting. The authors of [110] added that the influence of interception losses is noticeable only during small storms and only influences the surface runoff rate.

Evapotranspiration (ETP) directly influences the runoff, soil water storage, local precipitation, and temperature at the regional scale [110]. The authors of [56] mentioned that the correct estimation of ETP is essential for obtaining the correct total discharge volume. The use of a simplistic method by TOPKAPI-X might be another reason for the low model performance in certain years. According to the authors of [110], the crop coefficient method works well for irrigated agriculture, where uniform phenology is present. In the case of forests, it is considered problematic due to the large variability in species composition, leaf biomass dynamics throughout the season, age, and the density effects on tree biomass and water transport properties.

Conclusions
The use of rainfall-runoff models for the management of water resources and for the simulation of historical events is associated with many benefits. With proper calibration using a variety of historical events, a model can be used for impact assessment studies and the simulation of future events. The current project aimed to assess the applicability of TOPKAPI-X for the simulation of historical flood events in small-scale German watersheds using high spatial resolution maps. The study areas had different sizes and varying land cover and soil types. In general, TOPKAPI-X demonstrated a relatively good performance for the chosen study areas. For the Wernersbach catchment, the obtained results were good (NSE 0.83-0.95), though certain events were significantly underestimated. In comparison to the simulations with only one soil layer, the addition of the second soil layer improved the model's performance. The simulations of the Wesenitz catchment were only acceptable (NSE 0.64-0.72), as in some years the degree of underestimation was high. In this case, the lack of data for certain parts of the catchment might have resulted in a low NSE coefficient. The manual calibration of a large number of parameters might not have guaranteed the best fit. The literature review, the analysis of the input data, and the model structure indicated that there could be several sources of error, such as: (1) missing and insufficient precipitation data; (2) the interpolation method used; (3) lack of information on parameters; (4) manual calibration using a single integrated response; and (5) insufficient or inaccurate representation of processes. It is expected that accounting for the above-mentioned error sources will significantly improve the model's performance.