Global BROOK90 R Package: An Automatic Framework to Simulate the Water Balance at Any Location

: The number of global open-source hydrometeorological datasets and models is large and growing. However, with a constantly growing demand for services and tools from stakeholders, not only in the water sector, we still lack simple solutions, which are easy to use for nonexperts. The new R package incorporates the BROOK90 hydrologic model and global open-source datasets used for parameterization and forcing. The aim is to estimate the vertical water ﬂuxes within the soil–water–plant system of a single site or of a small catchment ( < 100 km 2 ). This includes data scarce regions where no hydrometeorological measurements or reliable site characteristics can be obtained. The end-user only needs to provide a location and the desired period. The package automatically downloads the necessary datasets for elevation (Amazon Web Service Terrain Tiles), land cover (Copernicus: Land Cover 100 m), soil characteristics (ISRIC: SoilGrids250), and meteorological forcing (Copernicus: ERA5 reanalysis). Subsequently these datasets are processed, speciﬁc hydrotopes are created, and BROOK90 is applied. In a last step, the output data of all desired variables on a daily scale as well as time-series plots are stored. A ﬁrst daily and monthly validation based on ﬁve catchments within various climate zones shows a decent representation of soil moisture, evapotranspiration, and runo ﬀ components. A considerably better performance is achieved for a monthly scale.


Introduction
The range of global hydrological models and their applications is vast and growing. These models are covering almost every possible niche in stakeholder's needs: from water/food security and management tools [1] to flood warning systems [2]. One of the major intentions, and indeed a noteworthy one, is to cover the globe by a hydrological model or a dataset that can give stakeholders even in data scarce regions the opportunity to analyze, plan, govern, and decide on the historical and recent hydrological data.
There are numerous studies available on the application of hydrological models for various locations on the globe, suggestions of global parameterizations and their performance [1,[3][4][5][6][7][8][9][10][11]. However, all these models were parameterized by deploying local (i.e., regional, national) datasets of high quality and calibrated to observed discharge data before discussing their performance. On the other side, there are already a few global datasets with major components of the hydrological cycle (i.e., precipitation, evapotranspiration, soil water, runoff) available, which are regularly updated in operational mode (JRA-55 [12], MERRA2 [13], NCEP Climate Forecast System Version 2 (CFSv2) [14], ERA5 [15]). In this case, the resolution of the grid cells was approximately 50 km, which is mostly a tradeoff between the current computational capabilities and the complexity of global climatological models. Therefore, two limitations become obvious and open a potential gap or, more precisely, a challenge. Namely, of gaining highly resolved water cycle components at any location on the globe by an easy model setup and globally available parameterization and meteorological forcing datasets. This challenge was discussed already in 2011 [16] stating that hyper-resolution (i.e., <1 km 2 ) of modeling the global land-surface hydrological processes is critical for scientific and practical needs (for the better understanding and predictability of surface-subsurface and land-atmosphere interactions, water quality and human impacts on the water cycle) but impossible at the moment within current modeling capabilities. Some of the models used in the aforementioned studies are already presented as framework solutions with data input and data processing. Even more, they partly consider model calibration, applications, and result assimilation. There are framework packages in Python [17] and R [18].
Yet, there is to our knowledge no automatic modeling framework "from A to Z" available currently, which incorporates automatic data assimilation, a deterministic hydrological model application, and result postprocessing for any desired location on the globe, and at the same time capable to run on a personal computer.
Recent developments in global datasets show an increase of resolution and quality of the data, which could be used for the parameterization of land cover and soil characteristics and as meteorological forcing of hydrological models. Keeping in mind, that "all models are wrong but some are useful" [19], it becomes a doable challenge to test whether such a framework could be developed in a simple and easy-to-use black-box (or nonexpert) version and still give adequate results for the governing components of the water balance on small spatial scales, starting from a few meters.
For this purpose, the lumped hydrologic model BROOK90 [20] was enabled to gain location related information of parameters and the meteorological input from the ERA 5 [15], the Global Land Cover [21], and the SoilGrids250m [22] datasets.
The main objectives of this study are: 1.
To broaden the BROOK90 community by expanding the scope of its application; 2.
To show the opportunities and limitations of the globally applicable modeling framework based on a lumped physical model and open-source input data; 3.
To simplify a hydrological framework emphasizing usability for nonexperts by full automation of the modeling process in a R package under the motto: "Just drop a catchment and receive a model output"; 4.
A contribution to the open-source hydrological science community by the release of the package code.
The research questions of the paper are stated as following: 1.
Is reasonable location-related model output achievable by deploying a noncalibrated lumped hydrological model based on global parameterization and forcing datasets? 2.
What are potential uncertainties and limitations of such a framework?

Description of the Framework
The description of Global BROOK90 is divided in three main parts, followed by technical remarks on the package. At first, an introduction to the original BROOK90 hydrological model itself is presented. Then, the main input open-source datasets used in the model parameterization are described. Finally, the core functions of the framework are introduced that unite the user input, data download and processing, model parameterization and run, and result postprocessing.

Short Introduction to the Original BROOK90 Model
BROOK90 [23] is a physical lumped hydrological model with a special focus on a detailed representation of vertical water fluxes within the soil-water-plant system at a single site.
Precipitation, maximum and minimum air temperature, solar radiation, vapor pressure, and wind speed on a daily time scale are the standard meteorological input variables (i.e., input of precipitation in higher temporal resolution is possible). Figure 1 shows the model flowchart. It illustrates how the model stores canopy-intercepted rain or snow, the snow storage on the ground, the water in the soil layers, and the groundwater. Net throughfall of precipitation, which was not intercepted by the vegetation, together with the snowmelt either infiltrates (into the soil matrix or into deeper layers by macropores) or concentrates directly to streamflow (overland or vertical macropore flow followed by downslope flow). Additionally, a delayed contribution to the streamflow from vertical or downslope soil drainage and groundwater storage is simulated. Soil water movement is described within several layers by saturated and unsaturated matrix flow and macropore flow using Richard's equation. The parameterization of the soil allows for up to 25 layers (which number is technically possible to extend) and a detailed representation of matrix flow, e.g., depending on uptake by the roots driven through transpiration. Groundwater storage changes by gravity drainage from the deepest soil layer and seepage is modeled via a fixed fraction of groundwater outflow. The evapotranspiration process consists of five components: evaporation of intercepted snow/rain from the canopy, evaporation from snow/soil, and transpiration from a single layer canopy. The resistance framework is built on the Shuttleworth-Wallace approach and allows modeling dense canopies like a forest plantation in temperate climate, but also sparse, open canopies like savannas. For a detailed process description and calculation approaches used, please refer to the original documentation [20].
The main known limitations of BROOK90 [20] are the following: lateral water movement towards downslope areas is not recognized, the model has no implementation of channel routing, hillslope processes are neglected, plant phenology is missing, the vegetation layer is uniformly distributed, and soil frost is not taken into account. Moreover, as the model does not consider the inflow of soil and groundwater, it is not suitable for environments where the vegetation has access to shallow groundwater. Among the unquestionable benefits of BROOK90 are the assumption of the basic physical principles of mass and energy conservation, daily model output based on subdaily routines, a complex evapotranspiration scheme, and a good representation of soil moisture fluxes.
Overall, hydrologists recognize BROOK90 as a useful tool for studies of the water balance for small plots in mainly forested regions (i.e., from the most recent studies [25][26][27][28][29]). Therefore, it has been used in research, teaching, and water management, and also because of its ability to gain reasonable results even under changing climate conditions [20].

Input Open-Source Datasets
Three main datasets are incorporated in the framework of the package. These provide the necessary meteorological, vegetation, and soil input data for the BROOK90 model. A short overview of the datasets is presented in Table 1. The ERA5 global climate reanalysis dataset from Copernicus and European Centre for Medium-Range Weather Forecasts [15] (1979-present, hourly resolution) is used as meteorological forcing of the BROOK90 model in the package. The ERA5 dataset is based on a combination of a global physical model of the atmosphere and observations from across the world using data assimilation principles. This means that the current timestamp of the model run combined with observations is used to derive the next model timestamp. The dataset is accessed via the MARS (Meteorological Archival and Retrieval System) request builder and the "ecwmfr" R package [30]. The following variables are retrieved: 2 m air temperature [K], surface net solar radiation [J m −2 ], precipitation [m], and wind speed in two dimensions [m s −1 ]. The original model resolution is 0.28125 • . This is equivalent to approximately 31 km. It is possible to download ERA5 data on a custom grid and horizontal resolution; however, the data are resampled from the original resolution. The default interpolation method for continuous parameters is bilinear and does not improve the accuracy of the data [31]. For the convenience of further data processing, a 0.1 • × 0.1 • grid size and an "ncdf" (network Common Data Form) output file format is used. The biggest limitations of the dataset are the server restrictions: data retrieval per request is limited to a maximum of 120,000 items (~13 years for five variables) and download speed is strongly affected by the query length of other users' requests. Developers and researchers claim a high quality of the data with regard to a global coverage and the temporal resolution [32][33][34][35][36]. The uncertainty of the dataset can be estimated by the analysis of 10 reanalysis ensemble members, while in the package presented the mean reanalysis output is used. However, it is possible to implement and deploy an ensemble of meteorological forcing, yet with a two times coarser spatial resolution. The SoilGrids250 product [22] provides global information on standard soil properties with a spatial resolution of 250 m. The product is based on machine learning algorithms (namely, random forest, gradient boosting, and multinomial logistic regression) using a large database of worldwide available soil profiles as predictors and remotely sensed measurements as covariates. The following information on soil properties from SoilGrids250 is used: Data are retrieved as "tiff" (Tagged Image File Format) rasters via an API request for the specific coordinate extent from a TU Dresden Geoserver [37]. A global 10-fold cross-validation performed by the developers showed an average prediction error for key soil properties (based on R 2 ) in the range of 54-79%. The predictability is more limited for the variable "depth to the bedrock" and increases for the soil classes. A comparison between various existing soil datasets in the context of model applicability concluded that SoilGrids250 is the current state-of-the-art soil dataset [38]. Some hydrologists have already applied it in their studies [39][40][41][42][43].
The second version of the Land Cover 100 m dataset released in 2019 and distributed by the Copernicus Global Land Service [21,44] represents the 2015 epoch (three-year period giving reference year +/− one year). It covers 23 discrete classes: 12 forest types (evergreen/deciduous, needleleaf/broadleaf, closed/opened, mixed, and unknown), shrubs, herbaceous vegetation, herbaceous wetlands, moss and lichen, bare/sparse vegetation, cropland, urban/built-up areas, snow and ice, permanent water, ocean, and no data. The dataset was obtained using PROBA-V satellite time-series, global training data generated with Geo-Wiki and Google/Bing imagery, and biome-cluster classification algorithms. This algorithm was chosen in order to adapt the algorithm to subcontinental and continental patterns [45]. The original tiles (20 • × 20 • ) [46] were aggregated in one raster and can be accessed at TU Dresden Geoserver by using the location coordinates [37]. In accordance to the validation reports [47,48] the overall classification accuracy is approximately 80% (forest subclasses excluded). The highest accuracy was achieved for forest, bare/sparse vegetation, snow/ice, and permanent water bodies (>85%). The lowest performance was observed for shrubs, herbaceous wetland, and moss/lichen (<65%). The global accuracy yields to 75% with consideration of the forest types in the validation. Currently, the Global Land Cover dataset has global spatial coverage, and the finest spatial resolution and largest number of vegetation types in comparison with all other known alternatives (i.e., ESACCI [49], MCD12Q1 [50], GLC2000 [51], GLOBELAND30 [52]).
Additionally, a digital elevation model (DEM) is required for the calculation of a few orographic characteristics of the catchment, like mean slope and aspect. These are required for the estimation of solar radiation and subsequently evaporation and snowmelt. The DEM is accessed by the using the "elevatr" R package [53], which gives access to various DEMs available in Amazon Web Service Terrain Tiles. The service automatically chooses the dataset with the best resolution for the desired territory: 3DEP [54], ArcticDEM [55], CDEM [56], open data portals of UK [57] and Austria [58], ETOPO1 [59], EUDEM [60], GMTED [61], INEGI [62], Kartverket [63], LINZ [64], SRTM30 [65]. In general, the resolution varies from 3 m to 2.5 km. Figure 2 shows an overview on all datasets collected by the "Global BROOK902" package needed for a simulation of an exemplary catchment of Alto river located on Corsica island, France (104 km 2 ).

Core Functions, Framework, and Parameterization
The package workflow consists of five milestones ( Figure 3): input from user (1), data download (2) and processing (3), BROOK90 application (4), results postprocessing and storage (5). Therefore, four core files (data_downloader, data_processing, model_hydrotope, and brook_framework) and the following main functions are implemented: • brook90.framework. This is the core function of the package. It has only one call function that is visible to the user. It requires the following minimum input information: the path to the catchment shape file ("shp"), the Climate Data Store (CDS) credentials, and the modeling time interval. Incorporating all the functions mentioned below, it creates the necessary (sub) folders in the workspace (if not provided, then it is the location folder of the shp file). It saves the downloaded initial data (land cover, soil and meteorological data, DEM) and runs the BROOK90 routine, and then postprocessing is performed. Finally, it saves the model output results ("csv" files and "png" time-series plots for each variable, "csv" file with hydrotopes and their characteristics).
• download.landcover/download.soil/download.dem/download.meteo. These functions download data required for the model using the data location of the catchment shape file and output folders for each data type (for meteorological data additionally CDS credentials and desired time interval). They create and send API requests and then retrieve and store files. Specifically, these are "tiff" files of the catchment's land cover classification and soil texture classification, stone fraction, depth to the bedrock, and DEM with a~1 km buffer zone, and the "ncdf4" files for temperature, precipitation, solar radiation, and wind speed. • data.processing. This function at first constructs a regular grid over the catchment and creates a subset of hydrotopes. Secondly, it prepares and returns the input for the BROOK90: hydrotope parameters and the processed meteorological data. • brook90.run.subcatchment. The function runs BROOK90 (R version [24]) for a specific unique hydrotope by merging the default BROOK90 parameters with the ones related to the desired hydrotope and executing the model routine. It returns daily time-series of the requested variables. The credentials for the meteorological data download. These consist of the username and key to the Copernicus Climate Data Store (CDS). It can be obtained after registration on the Copernicus website. Before running the framework for the first time, the terms of use of Copernicus products have to be accepted.

•
Optional input information: • The model output folder (default is the same folder of the catchment "shp" file); • The list of output variables (default outputs are soil water content, total runoff, evapotranspiration, precipitation); • The number of days to cut from the output (model "warm-up" period, default is 30 days); • The method of raw ERA5 data averaging (default is weighted mean).
The first step in the package framework when the user is calling the main brook90.framework function consists of a data download. For the extent of the catchment, land cover, soil, and meteorological data are downloaded. They are stored in the corresponding folders created in the model directory. While soil data are retrieved via images, land cover data and the DEM are provided in tiles and need much more time for downloading, unpacking, and clipping. The ERA5 download is implemented as a process that performs multiple requests to overcome the limitation of maximum items per request (~13.7 years of hourly data for one variable). For instance, to cover the period of 1979-2020, 15 requests are needed (i.e., three time divisions for five meteorological variables).
The preparation of the meteorological data starts with an averaging of the downloaded data for the catchment extent. Three methods are available. The first is to take the nearest to the catchment's geometric mass center grid. The second calculates a mean value out of all downloaded cells, and the third derives an area-weighted mean by the interception of the catchment's borders with the regular ERA5 grid. Afterwards, daily minimum and maximum temperature are derived from the hourly data. Furthermore, the calculation of the mean wind speed and mean daily actual vapor pressure (after Bougeault [67]), as well as the upscaling of hourly wind speed and solar radiation data to a daily resolution are performed. Additionally, the data timing is corrected according to the respective time zone and a time shift of −1 h is applied for precipitation [31].
The subset of unique hydrotopes plays a crucial role in the package. A hydrotope is defined by a unique combination of land cover and soil texture type, stone fraction, and depth to bedrock located within the catchment. Since land cover and soil data have different resolutions, a regular grid of 50 × 50 m is constructed over the catchment. The length and the width of a grid cell are adjusted according to the length and width of the longitudinal and the latitudinal degree at the catchment location. However, while land cover and soil texture are coded for each layer with fixed integer indexes, soil stone fracture is presented in percent and soil depth to the bedrock in cm. Thus, to reduce the possible number of unique combinations, the following algorithm is proposed. At first, the depth to the bedrock is reclassified by rounding to 10 cm steps. Secondly, hydrotopes are subset by means of unique land cover, soil texture, and reclassified depth to the bedrock. Finally, all values of the soil stone fraction for each specific layer, which corresponds to a specific subset of unique combination of land cover, soil texture, and depth to bedrock, are averaged to one value per layer and assigned to a specific hydrotope. Additionally, the frequency of the hydrotope occurrence within a catchment is stored for the further processing.
The vegetation of a hydrotope is parameterized using land cover data. For each land cover type, except for "permanent water bodies" and for "open sea", a unique parameter set is defined. Therefore, the initial recommendations from the BROOK90 documentation [20] are followed. They are adapted and enlarged by the additional studies of [68][69][70][71][72][73]. The following vegetation parameters are recognized: The soil hydraulic characteristics for a hydrotope are parameterized using soil texture classes following the recommendations from the model documentation [20]. These classes were derived by field experiments (remark: silt-loam parameters were assigned for the silt soil type, since it is missing in original documentation). The following parameters of soil are used to characterize the soil column in general and each of the layers specifically: • total number of soil layers in the column, • thickness of each layer, • volumetric stone (coarse fragment) fracture of each layer, • soil matrix potential at field capacity of each layer, • volumetric soil water content at field capacity and full saturation of each layer, • exponent of matrix potential-wetness relationship of each layer, • hydraulic conductivity at field capacity of each layer, • soil wetness at the dry end of the near-saturation range of each layer.
Initially, the package creates a standard 200 cm profile with characteristics for each of the layers. However, if the value of the depth to the bedrock is less than 200 cm, the soil column is cut from the bottom to an appropriate depth by means of reduction of layer thickness or the whole layer(s) with its parameters.
The derived parameters of each hydrotope together with meteorological forcing and the environment with the default model parameters are then passed to BROOK90. The model is sequentially executed for each hydrotope. Originally implemented in FORTRAN [23], BROOK90 was translated into the R language in 2018 [24] and the latter code is used as a model routine.
Finally, the outcome of each hydrotope model run for the requested list of variables is allocated in a list of data frames. The daily area-weighted averages are calculated according to the hydrotope occurrence frequency. The results are stored in the model output folder together with standardized time-series plots.
While running the package, all main steps as well as time benchmarks are printed as status messages on the RStudio console to trace the execution of the major routines of the framework's application.
The framework occupies one CPU core and was tested on two different machines: a personal computer (CPU/RAM: 3.4 GHz/15 GB) and a laptop (CPU/RAM: 1.9 GHz/4 GB). The normalized average computational time was estimated by approximately 13 s/23 s for one hydrotope and 1 year with an hourly simulation step. The download of elevation, land cover, and soil data depends fully on the speed of the internet connection (estimated as 10 to 20 min). Apart from that, the download situation for the ERA5 retrieval is different. It depends mainly on the CDS server load and the amount of requested data (per request and per day) and could last for about 9 to 12 h for a complete 1979-2020 period. But once the data are already downloaded, in case of a new model run the data download will be skipped.

Results and Discussion
The performance and first validation of the results are discussed on the basis of five catchments from Iceland, USA, Canada, Germany, and Indonesia. They were selected for two reasons. First, they are located within different climate zones, and second, they have various relief, land cover, and soil texture ( Figure 4, Table 2). To evaluate the performance of the package, two strategies were applied. The first one is to compare the runoff component to the observed discharge data provided by the Global Runoff Data Centre [90]. For the chosen catchments, the following observed time-series are used, respectively : 1979-2018, 1979-2006, 1979-2017, 1979-2016, 1979-2009. The second strategy is to compare the soil water content, evapotranspiration, and runoff to the similar variables available within the ERA5 reanalysis.  [91] and Bing images [92] as background. Numbers near the catchments refer to Table 2.  6 depict the results of the model runs for the major water balance components: soil moisture, evapotranspiration, and runoff on monthly and on daily scales. The soil moisture time-series show that three out of five catchments have clear annual cycles. Minimum-maximum range analysis for all hydrotopes reveals large deviations of some hydrotopes from the catchment weighted mean. Sometimes, this deviation is up to 100%. The evapotranspiration, on the other hand, has a larger interannual variance. The chosen catchments show annual cycles with lowest values for bare/sparse vegetation and highest for broadleaf forests. The runoff plots together with the knowledge of the climate zone allow for distinguishing between the main flood formation factors; i.e., rain seasons for catchments 2 (arid) and 5 (tropics), while others are of a mixed genesis (snowmelt and rainy season). Additionally, big variations in peak runoff between individual hydrotopes can be seen. The highest values for all three variables, with no surprises, were observed in the tropics. However, it should be noted that the absolute values of soil moisture could be severely affected by the uncertainty in the soil column depth estimation (see Section 2.2).
In general, the similar global soil parameterization scheme used in the package appears in the land surface model of ERA5. Soil moisture (the original variable is volumetric soil water content) in ERA5 was also derived according to Darcy's law and with standard soil profiles (four layers). Constant hydraulic characteristics were assigned based on seven different soil textures of the root zone from the FAO/UNESCO Digital Soil Map of the World [93] (~10 km resolution). The seasonality within both monthly and daily interannual cycles obviously behave similarly (except for catchment 3), while large absolute differences could be explained by different soil profile depths used in BROOK90 and ERA5.
While evaporation processes from surfaces in both models have an equivalent level of specification, transpiration in BROOK90 has a more physically detailed representation. The HTESSEL land surface model [94] for the ERA5 uses 1 × 1 km Global Land Cover Characteristics [95] to parameterize 16 vegetation types with five globally fixed parameters (seasonality is represented with leaf area index). The "Global BROOK90" package, on the other hand, has a set of 13 parameters for 18 vegetation types with a 10 times higher dataset resolution.
Focusing on the five catchments, ERA5 gives higher evapotranspiration rates and earlier annual peaks on the monthly scale, especially in winter months. On daily basis, BROOK90 tends to have slightly smaller day-to-day variation. Completely different results for catchment number 5 indicate various representation of the tropical forest in two parameterization schemes, as well as a coarser resolution of ERA5 itself.
The total runoff component from both BROOK90 and HTESSEL is the sum of surface (excess of infiltrability from net throughfall or snowmelt) and subsurface (lateral/downslope and bypass flow in BROOK90 and simple resulting vertical drainage in HTESSEL) components [20,96]. Despite that the models are driven by the same precipitation forcing, the runoff response is different. A general underestimation of the water volume in ERA5 for catchments 1-4 leads to a shortage of low flow conditions and flood peaks on both daily and monthly scales. Runoff from "Global BROOK90" possesses a much better agreement with the observations regarding annual cycles and daily variability, although it can potentially overestimate low flow conditions and is unable to capture the magnitude of flood peaks.
For the quantitative analysis of the package performance, conventional Nash-Sutcliffe Efficiency (NSE) [97] and Kling-Gupta Efficiency (KGE) [98] skill-scores were calculated for the two temporal scales (Figure 7). It can be seen that the performance of simulated weighted mean runoff on a daily scale according to both NSE (0. 36 The addition of NSE and KGE values for all hydrotopes illustrates that for all tested catchments at least one hydrotope performed better than the model weighted mean. Thus, some of the hydrotopes raised skill-score values up to 0.8. This comparison is also valid since technically BROOK90 as a lumped model treats all unique hydrotopes as independent area-dimensionless pseudo-catchments with the same meteorological forcing. Furthermore, it was found that the accuracy of ERA5 runoff is much lower (except only daily NSE for catchment 5) in comparison to the BROOK90 results. Nonetheless, these performance results should be treated with caution, since BROOK90 is actually a lumped water balance model and the model used in ERA5 cannot account for flow accumulation and routing between hydrotopes. Figure 5. Monthly time-series of modeled (weighted mean and maximum/minimum range from all hydrotopes) soil water content, evapotranspiration, and runoff (overlapped with observation line) for the chosen catchments (numbers on the right side refer to Table 2).  The user might face the following limitations or problems applying the package. The biggest uncertainty obviously lies in the core of the framework-the global parameterization. The generalization of vegetation parameters could yield the same parameter set for completely different plant species united with one land cover class, e.g., vineyards and rice (cultivated territories), birch and teak (deciduous broadleaf forest). The same applies to soil hydraulic characteristics assigned based on texture class with values from local USA studies made by the initial model developers. Furthermore, it was found that land cover and soil rasters can have gaps. We observed a lack of consistency and accuracy in high latitudes and islands. This may be resolved in the near future (at least for the Global Land Cover) with the next annual update. Moreover, there are issues with the meteorological forcing dataset that should be considered. At present, with its long time-series, good spatiotemporal resolution, and large number of parameters available [99], ERA5 is one of the best and complete global-gridded reanalysis meteorological datasets [34,[100][101][102][103]. However, its derived precipitation is still far from "state-of-the-art" conditions [104][105][106][107]. As a result, our validation implicitly showed that dataset resolution is still insufficient to capture precipitation heterogeneity in small catchments by smoothing flood peaks. Additionally, there is unfortunately always a chance that the data providers will change the download access interface, so that our retrieval code will need maintenance. There are also a few limitations caused by the BROOK90 itself [20], e.g., no implementation of flow routing, vegetation growth (aging), and snow frost. The first problem could be solved by adding a relatively simple bucket-type-routing module [108]. This will lead to a drastic increase of computation time, since one has to sacrifice the "unique-hydrotope subset" procedure, which significantly reduces model run time. On the other hand, this may serve well for small and relatively homogeneous catchments. Finally, regarding the package build-up itself, all problems presumably lie in the dependency and stability of the third-party R packages, the ERA5 server, and finally, absence currently of a user-friendly interface.

Conclusions and Outlook
This study describes a new R package, which integrates the lumped physical water balance model BROOK90 and open-source datasets within an automatic framework to calculate vertical water fluxes in soil-plant systems at any location on the globe.
The authors consider that the package presented could be beneficial and useful for several reasons. First, the package delivers reliable results for a vast number of water balance components in small catchments or at single sites. Second, the package enables the user to downscale some output variables of global hydrological reanalysis datasets (i.e., soil moisture, evapotranspiration, runoff) to the scale of a hydrotope. Furthermore, the simplicity of the package should encourage nonspecialists with limited or no prior knowledge of hydrological modeling, parameterization, calibration, etc. to simulate water balance components by following the simple guidelines. However, all users should be aware that the simulations and overall water balance estimations are subject to uncertainties and should be treated with caution. Finally, the package is setup to run globally; thus it could be especially valuable for water balance estimations in data-scarce regions, for instance, in regions with the lack or complete absence of hydrometeorological measurements or the inability to obtain reliable site characteristics.
Therefore, the package presented can serve for various applications and stakeholders. Besides detailed water balance analysis of small catchments with global applicability, possible implementations of the package include soil drought monitoring and water management in agriculture or forestry.
Future work will focus on a systematic global validation of the package, as well as on improvements regarding usability and incorporation of alternative datasets.

Funding: This research was funded by the German Federal Ministry of Education and Research (FKZ 01LR
2005A-funding measure "Regional Information on Climate Action" (RegIKlim), section (a) model regions.