Evaluation of Radiation Components in a Global Freshwater Model with Station-Based Observations

In many hydrological models, the amount of evapotranspired water is calculated using the potential evapotranspiration (PET) approach. The main driver of several PET approaches is net radiation, whose downward components are usually obtained from meteorological input data, whereas the upward components are calculated by the model itself. Thus, uncertainties can be large due to both the input data and model assumptions. In this study, we compare the radiation components of the WaterGAP Global Hydrology Model, driven by two meteorological input datasets and two radiation setups from ERA-Interim reanalysis. We assess the performance with respect to monthly observations provided by the Baseline Surface Radiation Network (BSRN) and the Global Energy Balance Archive (GEBA). The assessment is done for the global land area and specifically for energy- and water-limited regions. The results indicate that no single radiation input is optimal across the model variants, but standard meteorological input datasets perform better than those obtained directly from the ERA-Interim reanalysis for the key variable net radiation. The low number of observations for some radiation components, as well as the scale mismatch between station observations and the 0.5° × 0.5° grid cell size, limits the assessment.


Introduction
The estimation of the Earth's surface radiation components is of high interest in climate science (e.g., for the global radiation budget [1][2][3]) and as a driver of the evaporation of water (e.g., [4]). Consequently, global hydrological models (GHMs), which are designed to simulate water fluxes and storages on the terrestrial land surface, incorporate radiation information in their calculation of potential evapotranspiration (PET, the maximum amount of evapotranspiration if no water limitation occurs) and subsequently actual evapotranspiration (AET, which takes possible water availability limitations into account) [5,6].
Until the early 2000s, meteorological input data (hereafter referred to as meteorological forcings) for GHMs solely contained indirect information about radiation (e.g., sunshine duration or cloud cover). Nowadays, meteorological input data for GHMs are based on reanalysis data, which are either driven by observations including satellite data (as in the case of radiation [7]) or sometimes supplemented (corrected) with ground- or satellite-based observations. A number of such forcings are now available that include downward radiation information, e.g., WATCH Forcing Data methodology applied to ERA-Interim (WFDEI, [8]) and Princeton meteorological forcing (PGFv2, [9]). However, these forcings differ from each other due to different underlying reanalysis and correction approaches, which is reflected in uncertain estimates of the water balance components (e.g., [10,11]).
Although precipitation is one of the major drivers of uncertainty (e.g., [12]), uncertainties in (net) radiation also affect model outputs significantly. Döll et al. [13] varied simulated net radiation within the global freshwater model WaterGAP by 20%, resulting in a 4% to 7% difference for global-scale AET and a 6% to 10% difference for global-scale river discharge, which consequently alters the spatial pattern of renewable water resources (see their Table 2 and Figure 1). In the study of Nasonova et al. [11], variations in globally averaged shortwave (longwave) fluxes among their datasets are estimated to be about 14 (13) W m⁻², with subsequent influences on water fluxes. The CMIP5 models vary globally by 19 W m⁻² for both downward fluxes, with a much higher variation over the global land area (41 W m⁻² for shortwave and 33 W m⁻² for longwave downward fluxes) [3]. Consequently, net radiation over land varies among the 43 CMIP5 models by 29 W m⁻², which translates into a latent heat flux (evaporation) variation of 14 W m⁻² [3].
However, neither modifying simulated radiation by a fixed percentage nor comparing different products is appropriate for assessing the quality (in terms of the agreement with in-situ observations) of simulated radiation components. There are a number of studies evaluating spatially distributed radiation datasets with station-based observations. For the first time, Wild et al. [14] used surface observations to assess shortwave downward radiation from reanalyses, and later also the longwave downward radiation from reanalyses and climate models [15]. Troy and Wood [16] compared seven globally available radiation products with 32 stations from the World Radiation Data Centre (WRDC) archive over northern Eurasia and found differences of 20 W m⁻² for net radiation. Furthermore, NASA/GEWEX Surface Radiation Budget (SRB) data were found to be the dataset closest to the observations. In other recent studies, data products were evaluated using station observations (e.g., [17][18][19][20][21]) but were mostly focused on regional scales. Overall, to assess the global-scale coverage of a GHM, mainly two radiation-relevant station datasets are available to the scientific community: the Baseline Surface Radiation Network (BSRN, [22]) and the Global Energy Balance Archive (GEBA, [23]).
Heinemann and Kerschgens [24] investigated how representative station measurements in a heterogeneous landscape in northern Germany are of surface energy fluxes simulated at 250 m and 1 km resolution. They concluded that large uncertainties exist which are also dependent on the aggregation method. Furthermore, a strong dependence on land cover types was found. In addition, Horlacher et al. [25] noted considerable differences between two meteorological stations (8.5 km apart), especially in the upward radiation fluxes, which are mainly land-cover-dependent. In a study of Hakuba et al. [26], an error assessment for the representativeness of mean monthly point measurements compared to grid cell satellite-based datasets for solar radiation showed a relatively small mean error of about 3 to 4 W m⁻². These studies illustrate the problem of representativeness when comparing point measurements with grid cells. Nevertheless, and given the heterogeneous (and not omnipresent) data coverage of radiation observations, such a comparison can still be of value for assessments of coarse-scale gridded products [26].
To the authors' knowledge, there is a lack of studies investigating the reliability of radiation components used in GHMs, and especially their calculation of upward radiation fluxes. In most GHMs, upward (and net) radiation fluxes are calculated according to land surface parameters such as land-cover-dependent albedo or emissivity (e.g., [27]). This approach is common and consistent, as other land-cover-dependent characteristics are also used, e.g., for modeling snow dynamics and rooting depth [27]. However, using radiation-relevant characteristics for calculating upward components is subject to uncertainties. Therefore, it is of interest to evaluate the general performance of state-of-the-art meteorological forcings in terms of radiation fluxes, and whether such an internal model calculation can be substituted by radiation components, e.g., from reanalysis.
Following the approach of Budyko [28], the Earth's land surface (represented here by 0.5° × 0.5° grid cells) can be hydrologically characterized by dividing PET by precipitation (P). If this quotient is less than one, the region can be characterized as energy-limited; otherwise it is water-limited. From the hydrological perspective, changes in net radiation (which is the major driver of PET in most approaches) are especially interesting in energy-limited areas, as, for example, increasing net radiation translates into higher PET and AET and subsequently less available water resources. Therefore, the optimal calculation of radiation components plays a vital role in GHMs.
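The classification described above can be illustrated with a minimal sketch. The function name `classify_cell` and the handling of cells without precipitation are assumptions made for illustration; they are not taken from WaterGAP.

```python
def classify_cell(pet, precip):
    """Classify one grid cell from long-term PET and precipitation (same units).

    A PET/P quotient below one marks the cell as energy-limited,
    otherwise as water-limited.
    """
    # Assumption: cells without precipitation are treated as water-limited,
    # which also avoids a division by zero.
    if precip <= 0:
        return "water-limited"
    return "energy-limited" if pet / precip < 1.0 else "water-limited"


# Illustrative values in mm per year (not from the study)
print(classify_cell(500.0, 800.0))
print(classify_cell(1200.0, 400.0))
```

In a gridded setting, the same test would be applied to every 0.5° × 0.5° cell using long-term mean PET and P.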
The overall objective of this study is to evaluate radiation components of two state-of-the-art meteorological forcings and from ERA-Interim reanalysis by comparing the downward components as well as the net components with station-based observations of the BSRN and GEBA datasets. Furthermore, the upward and net radiation components as calculated by WaterGAP (except for the ERA-Interim variant where net radiation is used) are evaluated. The general aim is to find the most plausible approach to obtain net radiation for PET simulation. Within this, we pursue four objectives:

1. To assess the spatial differences of global-scale net radiation as well as its components from selected meteorological forcings.
Using two meteorological forcings (WFDEI and PGFv2) and two ERA-Interim reanalysis variants, we evaluate spatial patterns of the simulated radiation components (or the provided ones in the case of the ERA-Interim variant where net radiation is taken as input data).

2. To identify optimal downward radiation input from meteorological forcings.
The downward radiation components of the two meteorological forcings, as well as ERA-Interim reanalysis, are assessed with respect to their agreement with station observations of BSRN and GEBA data on a monthly time step.

3. To evaluate the calculation of upward radiation fluxes within WaterGAP and net radiation components.
For the two meteorological forcings as well as the reanalysis variant where only downward radiation is taken into account, the upward radiation simulated by WaterGAP is evaluated by comparing it to observations and to the reanalysis variant that provides upward radiation components. In addition, net radiation fluxes are assessed.

4. To test if improvements are achieved when using net radiation from the ERA-Interim reanalysis directly.
Net radiation is the most important variable for PET simulation. Therefore, the performance of net radiation and the consequences for PET are evaluated in order to investigate whether ERA-Interim net radiation could be used for future assessments within global hydrological models.
In Section 2, the data sources, model and efficiency criteria are described. The results are presented and discussed in Section 3, both for the global scale and with respect to spatial differences between geographic regions and between energy- and water-limited areas. Concluding remarks are given in Section 4.

Data and Methods
In this section, we first describe the model experiment and the meteorological forcings used in this study, followed by the WaterGAP Global Hydrology Model (WGHM). The two in-situ radiation databases, as well as the efficiency metrics, are presented thereafter.

Experimental Setup
Four different radiation data sets were used within WaterGAP, version 2.2a, in order to answer the research questions. Even though the model version is not modified, the term model variant is used to distinguish the meteorological forcings with a five-letter abbreviation (WFDEI, PGFv2, ERAID, and ERAIN, see Table 1). The WFDEI variant (see Section 2.1.2), based on ERA-Interim reanalysis, represents the current standard meteorological forcing and thus the standard radiation input for WaterGAP (see STANDARD model variant in [27]). The PGFv2 forcing (see Section 2.1.3) is included as an alternative forcing that is very frequently applied in global-scale modeling and is thus evaluated here as well. Directly interpolated shortwave downward (S↓) and longwave downward radiation (L↓) data from ERA-Interim reanalysis were taken for variant ERAID, which differs from WFDEI in the interpolation, elevation and bias correction method. Finally, in model variant ERAIN, the calculation of upward radiation components is substituted by the corresponding fluxes from the ERA-Interim reanalysis. For both ERAID and ERAIN, WFDEI was used for P and temperature (T). For ERAIN, the choice of P and T is irrelevant as radiation components are not calculated within WaterGAP. P and T are important for all other variants, especially for the snow dynamics, which have an influence on albedo and thus shortwave upward radiation (S↑), whereas T (at 2 m height) is used to calculate longwave upward radiation (L↑). We decided to use WFDEI T and P for ERAID because: (1) a simple interpolation of T to the 0.5° grid without considering the environmental lapse rate would lead to uncertainties, especially in mountainous areas [29]; (2) a bias correction of P with observations is required as there is still a (slight) wet bias in ERA-Interim when compared to P observations [7]; and (3) if P and T for ERAID are the same as for WFDEI, the emissivity value can be evaluated for longwave upward radiation. In the following sections, the meteorological forcings and their sources used for the intercomparison are described in more detail.

ERA-Interim Reanalysis
The European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis Interim (ERA-Interim) is the third generation of global atmospheric reanalysis from ECMWF and spans from 1979 to recent years. A reanalysis includes a meteorological forecasting model which uses all available observation data (e.g., station measurements, radiosonde profiles or satellite data) to initialize the next forecast step. This is done cyclically a few times per day. The model behind ERA-Interim is called Cy31r1, an atmospheric model and data assimilation system that contains three components: atmosphere, land surface and ocean waves. The land surface model of ERA-Interim is TESSEL (Tiled ECMWF Scheme for Surface Exchanges over Land). Within the data assimilation scheme and using surface (and cloud) albedo and emissivity values, the radiation fluxes are generated in an integrative way using a radiation transfer scheme, with the main aim of preserving the energy balance (details in [30]). Compared to ERA-40, the energy balance at the land surface is improved [7]. The horizontal resolution is ~80 km (spectral T255 grid), with 60 vertical layers [7]. Reanalyses are widely applied in climate monitoring and analysis [31,32]. For this study, S↓ and L↓ from ERA-Interim (model variant ERAID), as well as S↑ and L↑ (model variant ERAIN), were used. ERA-Interim data were interpolated bi-linearly to 0.5° resolution (without bias and altitude correction) and aggregated to daily values.

WATCH Forcing Data Methodology Applied to ERA-Interim Reanalysis
The basis of this meteorological forcing is the ERA-Interim reanalysis [7]. Weedon et al. [8] prepared the output of this reanalysis within the Integrated Project Water and Global Change (WATCH) for global hydrological models by applying an interpolation and bias-correction scheme, which is explained briefly here. ERA-Interim T was obtained from the lowest atmospheric model level (10 m) and interpolated to the 0.5° resolution using an elevation-based environmental lapse rate. Furthermore, T is corrected to Climatic Research Unit Time-Series [33] (CRU TS) version 3.1 (1979-2009) and version 3.21 (2010-2012) average T as well as average diurnal T range [8,29]. The handling of P is described by Weedon et al. [29]. This variable is adjusted to the CRU TS3.1 number of wet days and to monthly totals using either CRU TS3.1/TS3.21 or data from the Global Precipitation Climatology Centre (GPCC) v5/v6. For 1979-2009, the adjustment is based on GPCC v5, and from 2010 to 2012 on CRU TS3.21.
L↓ from ERA-Interim reanalysis data was elevation corrected after interpolation to 0.5° resolution using a fixed relative humidity as well as changes in T, surface pressure and specific humidity. Weedon et al. [8] found no necessity for monthly bias correction. S↓ is not elevation corrected after interpolation. In contrast to the previous version of WATCH Forcing Data (WFD) [29], S↓ was adjusted for interannual (but not seasonal) variations in aerosol loading (using CRU TS3.1/3.21 average cloud cover), which results in higher mean monthly values compared to the WFD forcing [8]. Furthermore, the aerosol distribution was changed in the ERA-Interim reanalysis so that WFDEI has higher values of S↓ in northern Africa (~40 W m⁻² for the year 2000) and lower values of ~30 W m⁻² in northern South America (same year) [7]. WFDEI climate input is already available at the spatial and temporal resolution of WaterGAP. T and P from the WFDEI [8] were used in all variants except for PGFv2, to keep consistency within that meteorological forcing (Table 1).

Princeton Global Meteorological Forcing Dataset
The Princeton Global Meteorological Forcing Dataset, version 2 (PGFv2, http://hydrology.princeton.edu/data.pgf.php) is an updated version of the 60-year forcing (1948-2008) described by Sheffield et al. [9] and covers the period from 1901 to 2012. This dataset blends reanalysis data (NCEP-NCAR) with station and satellite observations. Radiation (S↓, L↓) is adjusted for systematic biases at a monthly scale to a product from the University of Maryland (by Rachel Pinker) developed within the NASA MEaSUREs project. S↓ trends are corrected using CRU TS 3.21 cloud cover and, for L↓, the year-to-year variation of the NCEP-NCAR reanalysis is retained. T is bias corrected by shifting to monthly CRU TS 3.21. P is bias corrected using CRU TS 3.21 and not undercatch corrected (in contrast to the previous version described by Sheffield et al. [9]). All information on this PGFv2 version was provided by personal communication with J. Sheffield, 2015. PGFv2 climate input is already available at the spatial and temporal resolution of WaterGAP.

WaterGAP Global Hydrology Model
The spatial resolution of the model is 0.5° × 0.5° (55 × 55 km at the equator) and calculations are performed on a daily time step, whereas output is analyzed on a monthly scale. A thorough description of the model and its components for version WaterGAP 2.2 can be found in the appendix of Müller Schmied et al. [27]. The version used here is named WaterGAP 2.2a and is described in Döll et al. [38]. The only relevant modification of this model version (compared to the description in Müller Schmied et al. [27]) used in this study is the possibility to read in net radiation components, e.g., from ERA-Interim reanalysis. In this case, WaterGAP does not influence the radiation calculation and acts simply as a tool to calculate PET and to provide the output file format that is then used for the comparison with station observations.
In general, S↓ and L↓ are provided by the meteorological forcings in units of W m⁻². Net shortwave radiation S_net (W m⁻²) for all model variants except ERAIN (Table 1) is calculated as:

$$S_{net} = S{\downarrow} \, (1 - \alpha_{LC}) \tag{1}$$

where α_LC is the albedo (-) based on land cover type LC ([27], their Table A2). Albedo values for WaterGAP are taken from assumptions of the IMAGE model [44]. In the case of a reasonable snow cover, the albedo value varies dynamically in WaterGAP to represent the influence of snow cover dynamics on the radiation balance [27]. S_net from ERAIN is directly used for the assessment. Upward shortwave radiation S↑ (W m⁻²) is calculated as:

$$S{\uparrow} = S{\downarrow} - S_{net} \tag{2}$$

Upward longwave radiation L↑ (W m⁻²) is calculated as:

$$L{\uparrow} = \varepsilon_{LC} \, \sigma \, T^{4} \tag{3}$$

where ε_LC is the emissivity (-) based on land cover type ([27], their Table A2), σ is the Stefan-Boltzmann constant (5.67 × 10⁻⁸ W m⁻² K⁻⁴) and T is temperature (K). Emissivity values are taken from Wilber et al. [45], who assessed the emissivity of different materials (e.g., minerals) in laboratory experiments and then upscaled it to a land cover classification scheme using the predominant material composition of the land cover class. Net longwave radiation L_net (W m⁻²) for all model variants except ERAIN is calculated as:

$$L_{net} = L{\downarrow} - L{\uparrow} \tag{4}$$

L_net from ERAIN is directly used for the assessment. Finally, net radiation R_net (W m⁻²) is calculated as:

$$R_{net} = S_{net} + L_{net} \tag{5}$$
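The chain of calculations described in this section (net shortwave radiation from a land-cover albedo, upward longwave radiation from the Stefan-Boltzmann law, and net radiation as the sum of both net components) can be sketched as follows. This is an illustration and not WaterGAP code: the function name, the dictionary keys and the example albedo and emissivity values are assumptions, and the dynamic snow albedo is not represented.

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant in W m-2 K-4


def radiation_balance(s_down, l_down, temp_k, albedo, emissivity):
    """Return net radiation and its components in W m-2.

    s_down, l_down: downward shortwave and longwave radiation (W m-2)
    temp_k: air temperature in K
    albedo, emissivity: land-cover-dependent parameters (dimensionless)
    """
    s_net = s_down * (1.0 - albedo)         # net shortwave radiation
    s_up = s_down - s_net                   # upward (reflected) shortwave radiation
    l_up = emissivity * SIGMA * temp_k**4   # upward longwave radiation
    l_net = l_down - l_up                   # net longwave radiation
    r_net = s_net + l_net                   # net radiation
    return {"s_net": s_net, "s_up": s_up, "l_up": l_up,
            "l_net": l_net, "r_net": r_net}


# Illustrative values only (not taken from Table A2 of [27])
fluxes = radiation_balance(s_down=200.0, l_down=350.0, temp_k=288.15,
                           albedo=0.2, emissivity=0.95)
for name, value in fluxes.items():
    print(f"{name}: {value:.1f} W m-2")
```

For the ERAIN variant, this calculation would be skipped and the net components would be read directly from the reanalysis.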

Baseline Surface Radiation Network (BSRN)
The Baseline Surface Radiation Network (BSRN) was an initiative of the World Climate Research Programme (WCRP) Radiative Fluxes Working Group and is now incorporated into the WCRP Global Energy and Water Cycle Experiment (GEWEX). The aim of BSRN is to collect and provide data from a high-quality radiometric network. The number of stations is relatively low (Table 2), but shortwave and longwave surface radiation fluxes of "best possible quality currently available" are provided [22] (p. 6). BSRN data were downloaded via the data portal http://pangaea.de/ in January 2015 as monthly means (calculated by the data warehouse tool) of all available radiation flux variables. In cases where more than one sensor reported data values, we used the mean value. Some errors in the database (e.g., interchanged variables) were found and reported, and are being corrected. Forty-three BSRN stations (Figure 1a) are located within the land-ocean mask of WaterGAP and provide data for at least 6 months within the analysis period (1992-2012); they could thus be used for this study (see Table 2 for a summary of the number of stations/months).
Table 2 note (recovered fragment): for GEBA stations, [...] was assigned when the radiation variable(s) were available for at least ten stations, otherwise joined to "all other combinations".

Global Energy Balance Archive (GEBA)
In order to extend the geographical coverage of the evaluation sites, the Global Energy Balance Archive (GEBA) is used in addition to BSRN. GEBA contains long-time averaged surface radiation fluxes of more than 1500 stations worldwide. The data are quality controlled and coded with quality flags [23]. Station data were used if at least 6 monthly data values were available and quality flags indicated no problematic data (quality control procedures 1-4 have values > 4). Twenty GEBA stations were excluded as they provide monthly aggregates of BSRN stations. Figure 1b shows the location of the stations. The data were downloaded via http://www.geba.ethz.ch/ in August 2014. If a GEBA station was located on a 0.5° grid cell border, its coordinate was rounded to the next decimal place.
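The station selection and the assignment of a station to a 0.5° grid cell can be sketched as below. Both helper functions are hypothetical and only illustrate the stated criteria: the grid origin at 90° S, 180° W is an assumption, and neither the quality-flag check nor the special rounding of stations on a cell border is reproduced.

```python
import math

CELL_SIZE = 0.5  # grid resolution in degrees


def cell_center(lat, lon, cell_size=CELL_SIZE):
    """Return the centre (lat, lon) of the grid cell that contains a station."""
    # Assumption: cells are counted from the south-west corner of the globe.
    row = math.floor((lat + 90.0) / cell_size)
    col = math.floor((lon + 180.0) / cell_size)
    return (-90.0 + (row + 0.5) * cell_size,
            -180.0 + (col + 0.5) * cell_size)


def has_enough_data(monthly_values, min_months=6):
    """Keep a station only if at least `min_months` monthly values exist."""
    return sum(value is not None for value in monthly_values) >= min_months


# Illustrative station (coordinates and values are made up)
print(cell_center(50.1, 8.7))
print(has_enough_data([181.2, 175.0, None, 160.4, 150.9, 149.3, 155.8]))
```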

Calculation of Net Radiation and Its Components
Based on the monthly averages of the measurements, S_net (L_net) was calculated as S↓ minus S↑ (L↓ minus L↑) and R_net as the sum of S_net and L_net.

Efficiency Metrics
Efficiency metrics are used to quantify the goodness-of-fit between observations and simulations. Numerous metrics are available (each with benefits and limitations) and have been the subject of reviews in the past, e.g., [46,47]. Möller [48] analyzed WaterGAP model outputs (from the four model variants, Table 1) against BSRN station data using ten efficiency metrics. She found that the Nash-Sutcliffe Efficiency and the Mean Absolute Error E_MAE are the best metrics to reflect the characteristics of the modeled radiation vs. the observed ones. The advantage of the Kling-Gupta Efficiency E_KG, which builds on the Nash-Sutcliffe Efficiency, is that it can be split into three components, so that the dominant component can be determined [49]. In this study, E_KG and its components, as well as E_MAE, are used to assess the simulations with station observations.

Kling-Gupta Efficiency
Gupta et al. [49] and, later on, Kling et al. [50] decomposed the very popular Nash-Sutcliffe Efficiency metric into three components to allow diagnostic insights into the model performance and created the Kling-Gupta Efficiency metric by computing the Euclidean distance of the components from the ideal point (E_KG, Equation (6), used here in its 2012 version):

$$E_{KG} = 1 - \sqrt{(E_{KGr} - 1)^2 + (E_{KGbeta} - 1)^2 + (E_{KGgamma} - 1)^2} \tag{6}$$

where E_KGr is the correlation coefficient between simulated and observed values [-] and can act as an indicator for the timing, E_KGbeta is the bias ratio (Equation (7)) [-] and can act as an indicator of biases in the mean values, and E_KGgamma is the variability ratio (Equation (8)) [-], which can act as an indicator for the variability of simulated (S) and observed (O) values.

$$E_{KGbeta} = \frac{\mu_S}{\mu_O} \tag{7}$$

$$E_{KGgamma} = \frac{CV_S}{CV_O} = \frac{\sigma_S / \mu_S}{\sigma_O / \mu_O} \tag{8}$$

where µ is the mean value (e.g., in W m⁻²), σ is the standard deviation of the value (e.g., in W m⁻²) and CV is the coefficient of variation [-]. The optima of E_KG and its components are one. The lowest component determines the E_KG value. Furthermore, Gupta et al. [49] provided a methodology to assess the relative contribution g_i of the three components as:

$$g_i = \frac{G_i}{\sum_{j=1}^{3} G_j} \tag{9}$$

with

$$G_1 = (E_{KGr} - 1)^2 \tag{10}$$

$$G_2 = (E_{KGbeta} - 1)^2 \tag{11}$$

$$G_3 = (E_{KGgamma} - 1)^2 \tag{12}$$
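A minimal sketch of how the Kling-Gupta Efficiency in its 2012 form, its three components and their relative contributions could be computed for one station is given below. The function name and the returned dictionary keys are assumptions for illustration. Population standard deviations are used, which does not affect the result because only ratios of standard deviations enter the metric.

```python
import math
from statistics import mean, pstdev


def kling_gupta(sim, obs):
    """Return E_KG, its components and their relative contributions."""
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    # Pearson correlation between simulated and observed values (timing)
    cov = mean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)
    beta = mu_s / mu_o                       # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)    # variability ratio (CV_S / CV_O)
    squared = [(r - 1) ** 2, (beta - 1) ** 2, (gamma - 1) ** 2]
    kge = 1.0 - math.sqrt(sum(squared))
    total = sum(squared)
    # Relative contribution of each component; undefined for a perfect fit,
    # in which case zeros are returned (an assumption of this sketch).
    contrib = [g / total for g in squared] if total > 0 else [0.0, 0.0, 0.0]
    return {"kge": kge, "r": r, "beta": beta, "gamma": gamma,
            "contributions": contrib}


# Illustrative monthly values in W m-2 (made up): a simulation that is
# uniformly 10% too high only has a mean bias.
obs = [100.0, 150.0, 200.0, 250.0]
sim = [1.1 * value for value in obs]
print(kling_gupta(sim, obs))
```

The sketch assumes non-constant series with non-zero means; constant or zero-mean series would need separate handling.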

Mean Absolute Error
An absolute measure of model efficiency is the Mean Absolute Error E_MAE. It is calculated as the absolute difference between observed and simulated (monthly) data, averaged over the number of observations (Equation (13)). A great advantage of this absolute efficiency criterion is that the resulting number is in the unit of measurement and can thus be used, for example, to assess absolute errors among the radiation components.

$$E_{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| O_i - S_i \right| \tag{13}$$

where O_i is the observed (monthly) value for time step i, S_i is the modeled (monthly) value of the variable and n is the number of observations.
Water 2016, 8, 450

Sum of Ranks
Following Möller [48], the model variants were ranked for each efficiency metric, assigning the value "1" to the model variant with the highest performance (maximum E_KG or minimum E_MAE, respectively) and incrementing values from the highest to the lowest performing model variant. This is done for each station, and the ranks are then summed over the dataset. Finally, the model variant with the lowest sum of ranks can be seen as the best model variant for the dataset and radiation component considered.
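The ranking procedure, together with E_MAE as one of the ranked metrics, can be sketched as follows. The function names and the station scores are made up for illustration. Ties are broken by input order, which is an assumption, as the handling of ties is not described here.

```python
def mean_absolute_error(sim, obs):
    """Mean absolute difference between observed and simulated values."""
    return sum(abs(o - s) for s, o in zip(sim, obs)) / len(obs)


def rank_variants(scores, higher_is_better):
    """Rank the model variants for one station: 1 = best performing."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {variant: rank for rank, variant in enumerate(ordered, start=1)}


def sum_of_ranks(station_scores, higher_is_better):
    """Sum the per-station ranks of each model variant over a dataset."""
    totals = {}
    for scores in station_scores:
        ranks = rank_variants(scores, higher_is_better)
        for variant, rank in ranks.items():
            totals[variant] = totals.get(variant, 0) + rank
    return totals


# Hypothetical E_MAE values (W m-2) for two stations; lower is better.
stations = [
    {"WFDEI": 12.0, "PGFv2": 10.0, "ERAID": 14.0, "ERAIN": 11.0},
    {"WFDEI": 9.0, "PGFv2": 10.0, "ERAID": 13.0, "ERAIN": 15.0},
]
totals = sum_of_ranks(stations, higher_is_better=False)
best = min(totals, key=totals.get)
print(totals, best)
```

For E_KG, the same functions would be called with `higher_is_better=True`.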

Results and Discussion
In this section, we first compare the radiation components with observations from the BSRN and GEBA datasets (Section 3.1). Then, we assess global averages and spatial differences of net radiation and its components (Section 3.2). Finally, we discuss the optimal radiation input and possible improvements in Section 3.3.

Comparison to Station Measurements
Monthly grid cell values generated from the different data sources (see Table 1) are compared with BSRN and GEBA ground measurements. The comparison results are evaluated and discussed using the efficiency metrics defined in Section 2.4.
Results are displayed as boxplots for the E_KG criterion (Figure 2), for the single components of E_KG (Figures 3-5) and for E_MAE (Figure 6). The BSRN dataset is separated into one set that includes all 43 stations (green color in the figures) and one with only the 16 stations where both components (upward, downward) are measured (blue color in the figures). Marked differences exist between the GEBA and BSRN comparison results for all radiation components (Figures 2 and 6). For the downward components and both efficiency criteria, the simulations fit best to the subset of 16 BSRN stations, followed by all BSRN stations. Generally, lower performance is achieved for the GEBA stations, which is expected due to the larger data uncertainty [51]. However, due to the larger spatial coverage of this dataset (at least for some components), integration of GEBA data allows a more robust assessment.
Figures 2 and 6 show that downward radiation components are represented with high performance by all model variants, indicated by the overall high value for E_KG as well as the (compared to net radiation) low E_MAE of around 10 W m⁻² (slightly higher for S↓, lower for L↓). Except for PGFv2, S↓ has a mean bias (E_KGbeta) greater than one, which corresponds to an overestimation compared to measurements, and L↓ has a (slight) mean bias below one (Figure 3). This is in line with previous assessments [2,3]. L↑ as modeled by WaterGAP (all variants except ERAIN) performs worse than L↑ from ERAIN (except for E_MAE with GEBA). Two possible reasons are that: (a) surface temperature as the main factor for L↑ calculation is strongly constrained by observations; or (b) emissivity values are more realistic in ERAIN. Simulated S↑ agrees poorly with observed values. The timing of S↑ (indicated by E_KGr, Figure 4) is similar among the forcings. Variability (E_KGgamma) is underestimated more strongly by ERAIN than by the simulated variants (Figure 5), but most differences among the forcings occur in the mean bias (E_KGbeta, Figure 3). For GEBA stations and simulated S↑, the mean bias E_KGbeta is greater than one. Besides the non-representative albedo and emissivity values of the surface beneath the measuring instrument, different albedo values are evidently responsible for the different mean biases between simulated and reanalysis-based S↑ (compare ERAID and ERAIN for the GEBA dataset in Figure 3).
There is no generally best meteorological forcing for S_net. Due to the small footprint of upward radiation measurements, we did not expect better agreement. For example, if the area below the measurement consists of grassland but the majority of the 0.5° grid cell contains forest, the agreement is expected to be low due to the strongly differing albedo. For the BSRN stations, PGFv2 has the highest performance, whilst for the GEBA stations WFDEI (and ERAID for E_MAE) ranks first (Table 3). The variability of net shortwave radiation is underestimated by all forcings, but to a lesser degree by ERAIN (Figure 5). Differences in modeling L↑ are small (values close to optimum for E_KGr and E_KGbeta) to moderate (E_KGgamma). Interestingly, for L↓, variability is likely to be overestimated by all model variants (except for PGFv2 with BSRN), while the variability of L↑ is at the same time underestimated by the WaterGAP simulations. All model variants have a low performance for L_net, which is a result of differences in timing and variability (Figures 4 and 5). ERAIN ranks first (for E_MAE and GEBA together with WFDEI). However, the most important variable for calculating PET is R_net. For the BSRN stations, the model variant WFDEI agrees best with observations; for GEBA stations, the PGFv2 variant ranks first, in both cases followed by ERAIN. In terms of mean bias, R_net is overestimated by the model variants, independently of the meteorological forcing (Figure 3). This is in agreement with Wild et al. [3], as they estimated a much lower value for the global land area (see Section 3.2, although Antarctica was included in their assessment). We assume that the overestimation of mean S↓ (Figure 3) is one reason for the high value of global land R_net obtained in this study. The measure of variability (Figure 5) (and to a lesser extent timing, Figure 4) for R_net varies by reference dataset. Nevertheless, at 15 to 20 W m⁻², the mean absolute errors for R_net are the largest among all components across the forcings. Analyzing the relative contribution of the E_KG components to the overall value (Figure 7), it can be stated that the correlation (thus the timing) contributes least (except for S↑) for all radiation components. For the shortwave radiation components (and R_net), the mean bias dominates, whereas variability dominates for the longwave components.

Intercomparison of Global-Scale Net Radiation and Its Components
Figure 8 shows simulated net radiation (bottom row) and its components from WFDEI and differences to the other model variants described in Table 1. It is worth noting that in northern Africa and Somalia, the Saudi-Arabian peninsula, central and Western Australia, as well as the Amazon and Mexico, R_net of ERAIN is significantly lower than that of the current standard meteorological forcing WFDEI (brown to red colors in Figure 8, bottom row). Despite large areas of lower R_net values, there are numerous grid cells with drastically higher values for ERAIN compared to WFDEI (the maximum difference, in the Himalayan Mountains, is 196 W m⁻²). For long term (1979-2012) global land (except [...]
Summarizing the main findings of the intercomparison of the four model variants, it can be stated that ERAIN has the highest net radiation at the global land scale and the standard forcing WFDEI the lowest. In energy-limited regions, the alternative forcings (especially ERAIN) have higher net radiation, which has an influence on the evapotranspiration calculation. Differences among the model variants are most visible for the downward components of PGFv2 and for the upward components of ERAIN.

Discussion of Optimal Radiation Input
The downward components of all model variants show a high performance compared to the observed radiation apparent in Figure 2 (top row) but may still have systematic biases in terms of Comparing WFDEI and ERAID (which differ in terms of the adjustments of Weedon et al. [8]), 40.2% of the global land area are within ±5 Wm −2 difference for R net , which shows a higher agreement when compared to the other forcings (PGFv2 35.1%, ERAIN 19.1%).Whereas higher (lower) R net values for ERAID are calculated in 36.2% (23.7%) of land area, a similar pattern, i.e., the tendency to calculate higher R net can be found for PGFv2 (37.2%, 27.8%) and ERAIN (48.9%, 32.0%).When separating the energy-limited and water-limited regions (Section 1), the forcing leads to pronounced differences.For ERAID and energy-limited (water-limited) regions, 34.3% (45.3%) of land area is within ±5 Wm −2 difference, with similar values for PGFv2 (38.3%, 32.3%) but are in much lower agreement with ERAIN (18.8%, 19.3%).
R net is higher by more than 5 Wm −2 (compared to WFDEI) in 40.3%/32.6% of the land area of energy-limited/water-limited regions for ERAID and in 43.3%/31.9% for PGFv2. For ERAIN, 62.0% of energy-limited regions have R net values that are more than 5 Wm −2 higher than those of WFDEI, whereas in water-limited regions the predominant share of the land area (43.2%) has R net values that are more than 5 Wm −2 lower (see also the spatial differences in Figure 8). Hence, by using ERAID or PGFv2 instead of WFDEI, around 1/3 (and for ERAIN around 1/2) of energy-limited regions could profit from an increasing amount of energy for evapotranspiration. Taking into account the different fractions of energy-limited/water-limited regions, the forcings ERAID, PGFv2 and ERAIN yield higher R net (compared to WFDEI) on 18.7% to 28.8% of the global land area, which would likely increase evapotranspiration. In contrast, on 8.6% to 11.8% of the global land area, R net decreases with the alternative forcings.
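The land area percentages reported above can be illustrated with an area-weighted classification of grid-cell differences. The following is a minimal sketch under stated assumptions, not the procedure used in the study: grid cell area is approximated by the cosine of the cell-center latitude, non-land cells are marked as NaN, and the function name `area_fractions` and all input values are hypothetical.

```python
import numpy as np


def area_fractions(rnet_alt, rnet_ref, lat_deg, threshold=5.0):
    """Percentage of land area where (alt - ref) is within, above or
    below the threshold (W m-2).

    rnet_alt, rnet_ref: 2D arrays (lat, lon); NaN marks non-land cells.
    lat_deg: 1D array of cell-center latitudes in degrees.
    Assumption: cell area is proportional to cos(latitude).
    """
    diff = np.asarray(rnet_alt, dtype=float) - np.asarray(rnet_ref, dtype=float)
    weights = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))[:, None]
    weights = weights * np.ones_like(diff)
    valid = ~np.isnan(diff)
    # Replace NaN by 0 so that the comparisons below raise no warnings;
    # these cells are excluded through the `valid` mask anyway.
    diff = np.where(valid, diff, 0.0)
    total = weights[valid].sum()

    def pct(mask):
        return 100.0 * weights[mask & valid].sum() / total

    return {
        "within": pct(np.abs(diff) <= threshold),
        "higher": pct(diff > threshold),
        "lower": pct(diff < -threshold),
    }


# Illustrative 2 x 2 grid: rows at 0 and 60 degrees latitude, one ocean cell
lat = np.array([0.0, 60.0])
ref = np.zeros((2, 2))
alt = np.array([[10.0, 0.0], [0.0, np.nan]])
result = area_fractions(alt, ref, lat)
```

Applying the same function to masks of energy-limited and water-limited cells (by setting all other cells to NaN) would give the regional percentages in the same way.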
Absolute differences for the radiation components from WFDEI are also shown in Figure 8. Differences in S↓ for WFDEI compared to ERAID and ERAIN are low, but compared to PGFv2, WFDEI has 5 to 25 Wm −2 less S↓ in the US, northern Africa, Australia and the mountain regions in central Asia. In tropical regions, and to a lesser extent in northern Eurasia, S↓ is higher in WFDEI (Figure 8, first row). S↑ differences of WFDEI and ERAID are (except for small regions) within ±5 Wm −2, which is expected due to the smaller absolute values. As the land cover class dependent albedo values are comparable, WFDEI and PGFv2 differ by more than ±5 Wm −2 in areas where S↓ differs, but to a much lesser extent. Most differences (within ±50 Wm −2) occur between WFDEI and ERAIN, which is interesting as S↓ is comparable. This could be related to a higher surface albedo of ERAIN that induces higher S↑. Other factors, e.g., clouds, might also contribute to the higher upward radiation, but no hints for an overestimation of cloudiness have been found by comparisons with BSRN stations in Europe and Africa [18]. S net is very similar between WFDEI and ERAID, but differs from ERAIN (in general with lower values except for the tropics) and PGFv2 (in general with higher values except for the tropics).
The pattern of differences between WFDEI and the other model variants for L↓ is similar in most regions: in the tropics, WFDEI has less L↓, whilst in most other regions, differences are within ±5 Wm −2 or, in the case of PGFv2, higher (up to 25 Wm −2). Differences > 5 Wm −2 for L↑ can be found only between WFDEI and ERAIN. Most of the dry regions have higher values in WFDEI (up to 50 Wm −2 for large areas), whereas the tropics (and Greenland) have lower values by up to 25 Wm −2. The difference pattern for L net is diverse. ERAIN has, in general, smaller values for L net, especially in dry regions; ERAID has lower values in the tropics and PGFv2 has lower values in the tropics but higher values for most other areas. When comparing the differences of WFDEI to ERAIN and ERAID, the effect of modeling outgoing radiation within WaterGAP is visible. Based on an additional model run, where T from ERA-Interim (instead of from WFDEI) was used for ERAID, we can summarize that T affects model results in only a minor manner and thus, differences in emissivity values (or other ERA-Interim related characteristics) determine the difference between ERAIN and the WaterGAP calculations. The same results occur when T and P from WFDEI are used in a PGFv2 model run. Whereas the differences in the Amazon are comparable, ERAIN shows large differences in Africa, the Saudi-Arabian peninsula as well as Australia. For some regions, the calculation of outgoing radiation components in WaterGAP (ERAID, PGFv2) leads to changing signs of the differences compared to those obtained from ERAIN.
Summarizing the main findings of the inter-comparison of the four model variants, it can be stated that ERAIN has the highest net radiation at the global land scale and the standard forcing WFDEI the lowest. In energy-limited regions, the alternative forcings (especially ERAIN) have higher net radiation, which has an influence on the evapotranspiration calculation. Differences among the model variants are visible especially for the downward components with PGFv2 and for the upward components with ERAIN.

Discussion of Optimal Radiation Input
The downward components of all model variants show a high performance compared to the observed radiation apparent in Figure 2 (top row) but may still have systematic biases in terms of over- and underestimation (Figure 3). According to the rank sums for S↓ in Table 3, the highest performance is obtained when WaterGAP is forced by WFDEI, followed by the PGFv2 and ERAID/ERAIN variants. The preparation and adjustments of WFDEI [8,29] to radiation fluxes from ERA-Interim reanalysis (see Section 2.1.2) lead to improved results for S↓, independently of the station dataset and efficiency criterion (Table 3). Interestingly, the E KG components, except E KGgamma (which is the dominant component (Figure 7) and thus influences the overall E KG value most), perform worse than the directly obtained S↓ from ERA-Interim (Tables 4 and 5). For L↓ (only elevation corrected), performance is worse for BSRN stations (except E KGgamma, which is again the dominant component, Figure 7) but of the same quality for GEBA stations (Table 3). As the results of WFDEI are similar to or better than those of ERAID (except for L↓ and BSRN), the thorough interpolation scheme of Weedon et al. [8] improved the quality of the meteorological forcing. The PGFv2 model variant ranks second for S↓ and BSRN and first for GEBA stations, whereas for L↓ the PGFv2 model variant is in first position for both observation datasets (except for E KG and GEBA data) and can, therefore, be seen as a valuable alternative for downward radiation. WaterGAP simulations of S↑ have a higher performance than those from ERAIN for the BSRN stations and a lower performance for WFDEI and ERAID for the GEBA stations, consistently among the two efficiency criteria (Tables 3 and 4). The calculation of S↑ in WaterGAP and the underlying assumptions (e.g., albedo) are of the same quality as those from ERAIN for the grid cells where measurement data are available, even though the E KG values of S↑ are drastically lower compared to S↓.
Due to the small footprint of upward radiation measurements (and thus the questionable representativeness for the 0.5° grid cell) as well as the low number of stations, this result may also be driven by chance. The E KG of L↑ is close to the optimum, with the highest performance with ERAIN for BSRN stations and for E KG also for GEBA stations (Table 3), whereas E KGbeta and E KGr for all variants are close to the optimum and the indicator of variability E KGgamma (which is dominant in the E KG value) differs between the model variants (Figures 3-5 and 7). For BSRN and GEBA stations, the WaterGAP calculation of L↑ variability E KGgamma has a higher performance compared to observations than that from ERAIN (Tables 4 and 5). Even though, overall, the E KG value has a slightly weaker performance than that from ERAIN, the approach of WaterGAP using a land cover class based emissivity value to determine L↑ can be seen as plausible. However, here also, the small number of stations and months available for upward radiation components data (Table 2), as well as the small footprint for upward radiation measurement, limits the assessment. In addition, the emissivity value is currently kept constant within WaterGAP, which is a problem especially when the land surface is covered by snow, which influences emissivity values [52][53][54]. It should, therefore, be tested if a snow-cover-dependent emissivity value leads to improvements for the WaterGAP calculation of L↑.
The evaluated L net performance depends largely on the efficiency metric applied. Whereas E KG values are low for all model variants (Figure 2), values for E MAE are, compared to the other radiation fluxes, relatively low. The subtraction of two large values (L↓, L↑) leads to low absolute values (e.g., compared to S net, Figure 8) and thus, absolute errors can be lower. On the other hand, small uncertainties in L↓ and L↑ are translated into large relative uncertainties for L net, which is visible, e.g., for E KGbeta (Figure 3).
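The effect of subtracting two large fluxes can be made concrete with a short worked example. The numbers are purely illustrative and not taken from the study: it assumes L↓ = 350 Wm −2 and L↑ = 400 Wm −2, each with an independent uncertainty of 2% (7 and 8 Wm −2, respectively), combined by standard Gaussian error propagation.

```latex
\begin{aligned}
L_{net} &= L_{\downarrow} - L_{\uparrow} = 350 - 400 = -50\ \mathrm{W\,m^{-2}} \\
\sigma_{L_{net}} &= \sqrt{\sigma_{L_{\downarrow}}^{2} + \sigma_{L_{\uparrow}}^{2}}
                  = \sqrt{7^{2} + 8^{2}} \approx 10.6\ \mathrm{W\,m^{-2}} \\
\frac{\sigma_{L_{net}}}{\lvert L_{net} \rvert} &\approx \frac{10.6}{50} \approx 21\%
\end{aligned}
```

Under these assumptions, a 2% uncertainty in each component becomes a relative uncertainty of roughly 21% in L net, while the absolute uncertainty (about 11 Wm −2) stays moderate. This is consistent with a low relative criterion and a comparatively low absolute error occurring together.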
For calculating PET, R net is the most important variable. The relative efficiency criterion E KG (Figure 2) is relatively low, independently of the model variant, and the medians of the absolute error E MAE are high at around 20 Wm −2. The rank sums for R net are not best when substituting the upward radiation calculation of WaterGAP with the values obtained directly from ERA-Interim (Table 3). ERAIN has a higher performance than ERAID, but for each efficiency criterion and station database, either WFDEI or PGFv2 performs better than ERAIN (Table 3). Snow dynamics are modeled within WaterGAP and albedo changes when snow cover is present [27]. Therefore, the use of ERAIN may induce an inconsistent representation of land cover dependent parameters, as albedo is also modeled by ERAIN. This, and the small benefit of using ERAIN, has to be considered when deciding about the optimal input.
WaterGAP uses a very simple scheme to calculate the energy balance at the surface. It uses only land-cover dependent albedo and emissivity values. The radiation transfer scheme that is integrated in ERA-Interim is highly complex, taking into account different bandwidths of radiation transfer as well as cloud (and atmosphere) properties, and is designed to close the energy balance both at the surface and at the top of the atmosphere [7]. Even so, in principle, ERA-Interim also considers surface albedo and emissivity and is therefore comparable in a very general sense, even though the processes are far more physically constrained.
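A simple surface energy balance of the kind described above can be sketched as follows. This is an illustration under stated assumptions rather than the WaterGAP implementation: it assumes that S↑ is the albedo times S↓ and that L↑ follows the Stefan-Boltzmann law with a land-cover dependent emissivity and air temperature T. The function name `net_radiation` and all input values are hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant (W m-2 K-4)


def net_radiation(s_down, l_down, t_air_k, albedo, emissivity):
    """Net radiation and its components (all in W m-2).

    Assumptions (not confirmed by the source):
    - upward shortwave is albedo * downward shortwave
    - upward longwave is emissivity * sigma * T^4 with air temperature T
    """
    s_up = albedo * s_down
    l_up = emissivity * SIGMA * t_air_k ** 4
    s_net = s_down - s_up
    l_net = l_down - l_up
    return {
        "s_up": s_up,
        "l_up": l_up,
        "s_net": s_net,
        "l_net": l_net,
        "r_net": s_net + l_net,
    }


# Illustrative inputs: a vegetated cell at 15 degrees Celsius
out = net_radiation(s_down=200.0, l_down=340.0, t_air_k=288.15,
                    albedo=0.2, emissivity=0.98)
```

In such a scheme, only the albedo and the emissivity link the downward forcing to the upward fluxes, which is why differences in these two parameters (rather than in T) can dominate the differences to a reanalysis that computes the upward fluxes itself.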
All model variants have a positive mean bias for R net (Figure 3), which indicates that this variable is overestimated by about 20% compared to measurements. Consequently, E KGbeta is the dominant component of the E KG value (Figure 7). Obviously, the higher E KGbeta of S↓ propagates to S net and R net (Figure 7), as described in the literature [2,3]. For the 16 BSRN stations, the current standard forcing of WaterGAP (WFDEI) provides the highest performance for both efficiency criteria (Table 3). For the 142 GEBA stations with net radiation data, the best match to simulations can be reached when using the PGFv2 forcing (Table 3).
One of the foci of this study was to assess whether differences in results occur in energy- and water-limited areas. However, the discrepancy of the distribution of stations in energy-limited vs. water-limited regions (on average, only 28% of GEBA stations and 38% of BSRN stations are in water-limited grid cells) hinders a robust assessment of the performance in those distinct regions. Rank sums aggregated for BSRN and energy/water-limited regions (not shown) indicate only slight deviations from those in Table 3.
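The rank sums divided by the number of stations, as used in Tables 3 to 5, can be sketched as follows. This is a minimal illustration with assumptions: the model variants are ranked separately at each station (rank 1 is the best match to observations), ties are broken by column order, and the function name `mean_ranks` and all scores are hypothetical. The exact ranking procedure of the study is not described in this section.

```python
import numpy as np


def mean_ranks(scores, higher_is_better=True):
    """Rank sum divided by the number of stations for each model variant.

    scores: array of shape (n_stations, n_variants).
    higher_is_better: True for an efficiency such as E_KG,
                      False for an error measure such as E_MAE.
    Assumption: ties are broken by column order.
    """
    scores = np.asarray(scores, dtype=float)
    keyed = -scores if higher_is_better else scores
    # A double argsort converts values into 0-based ranks per station
    ranks = keyed.argsort(axis=1).argsort(axis=1) + 1
    return ranks.sum(axis=0) / scores.shape[0]


# Illustrative E_KG values for two stations and three variants
kge_scores = np.array([[0.9, 0.5, 0.7],
                       [0.8, 0.6, 0.7]])
kge_ranks = mean_ranks(kge_scores, higher_is_better=True)

# Illustrative E_MAE values (W m-2) for one station
mae_ranks = mean_ranks(np.array([[10.0, 20.0, 15.0]]), higher_is_better=False)
```

The variant with the lowest value is the one that matches the observations best on average, which corresponds to the bold entries in the tables.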
The scale mismatch between (small-scale) station observations and 0.5° grid cells results in uncertain comparison results. The median E MAE for S↓ from this study is about three times higher (10 to 15 Wm −2, Figure 6) than that of Hakuba et al. [26]. Obviously, due to the heterogeneity of land surface characteristics, the representativeness of downward and upward radiation fluxes differs fundamentally. However, monthly averaging can smooth out short-term (and small-scale) atmospheric effects, which barely applies to land surface heterogeneities.
PET was calculated for the four model variants using the Priestley-Taylor approach [55] (Figure 9). In general, the difference patterns are similar to those of R net (Figure 8). Less PET, in comparison with WFDEI, occurs for all other model variants mainly in the tropics and for ERAIN in the very dry regions of Northern Africa and Australia. On a global scale, WFDEI has the highest value for PET and the overall range among the model variants is about 24 mm·year −1 (lowest value for ERAIN). The highest absolute range (100 mm·year −1) among the models occurs in water-limited grid cells, where WFDEI and ERAID have the same values and ERAIN the lowest. For the energy-limited regions, WFDEI has the second lowest value (ERAIN has the highest) while the models vary by about 70 mm·year −1 (Table 6). As AET could only be increased in energy-limited regions, this implies that WaterGAP would simulate higher AET when using ERAIN as radiation input.
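How R net translates into PET in a Priestley-Taylor type approach can be sketched as follows. This is a generic textbook formulation, not the WaterGAP code: the coefficient alpha = 1.26, the psychrometric constant of 0.066 kPa per °C (approximately sea level), the latent heat of vaporization of 2.45 MJ per kg and the FAO-56 style expression for the slope of the saturation vapor pressure curve are all assumptions, and the function name `priestley_taylor_pet` is hypothetical.

```python
import math


def priestley_taylor_pet(rnet_wm2, t_air_c, alpha=1.26, gamma=0.066):
    """Potential evapotranspiration (mm per day) from net radiation.

    rnet_wm2: net radiation (W m-2); negative values are set to zero.
    t_air_c:  air temperature (degrees Celsius).
    alpha:    Priestley-Taylor coefficient (assumed value).
    gamma:    psychrometric constant in kPa per degree Celsius (assumed).
    """
    # Slope of the saturation vapor pressure curve (kPa per degree Celsius)
    es = 0.6108 * math.exp(17.27 * t_air_c / (t_air_c + 237.3))
    slope = 4098.0 * es / (t_air_c + 237.3) ** 2
    latent_heat = 2.45                         # MJ per kg (assumed constant)
    rnet_mj = max(rnet_wm2, 0.0) * 0.0864      # W m-2 -> MJ m-2 per day
    # 1 kg of water per m2 corresponds to a depth of 1 mm
    return alpha * slope / (slope + gamma) * rnet_mj / latent_heat


# Illustrative comparison of two net radiation values at 20 degrees Celsius
pet_low = priestley_taylor_pet(100.0, 20.0)
pet_high = priestley_taylor_pet(105.0, 20.0)
```

Because PET is linear in R net in this formulation, any difference in R net between forcings maps proportionally onto PET at a given temperature, which is in line with the similar difference patterns of R net and PET noted above.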

Conclusions
In this study, we assessed the performance of radiation components in experiments with the global freshwater model WaterGAP on a 0.5° × 0.5° grid scale against station observations from the BSRN and GEBA databases. The model was driven with two state-of-the-art meteorological forcings (WFDEI and PGFv2) as well as with the interpolated downward radiation data from ERA-Interim reanalysis (ERAID) and finally a radiation dataset including both downward and upward components from ERA-Interim (ERAIN). We used two efficiency metrics (one absolute and one relative) for the comparison analysis. In addition, we assessed spatial differences among the radiation components as well as their effect on PET. The results can be summarized as follows:

• For global land averages, R net differs only slightly among the model variants (~2 Wm −2). However, regionally large differences of R net, S↑ and L net were found, especially in comparison with ERA-Interim reanalysis (Figure 8).

• R net values of WaterGAP as forced by WFDEI are within ±5 Wm −2 of those from the ERA-Interim reanalysis for 19.1% of the global land area, with similar numbers for energy-limited and water-limited regions.

• In 62.0% of energy-limited regions (corresponding to 28.8% of the global land area), R net of the full radiation dataset from ERA-Interim is higher by more than 5 Wm −2 and has, therefore, the highest potential to increase the simulated evapotranspiration as compared to the other forcings.

• The downward radiation components of ERA-Interim show less or similar agreement with station observations compared to those from meteorological forcings (Table 3). The interpolation and correction approach of Weedon et al. [8] improves both downward radiation components (Table 3). However, for all model variants, a systematic overestimation of S↓ and R net was found when comparing to observation data.

• The performance of S↑ radiation of ERA-Interim lies between the meteorological forcings WFDEI and PGFv2. For BSRN stations, the model variant where ERA-Interim downward components are used has a higher performance than the variant where ERA-Interim upward components are also used, whereas the opposite is true for GEBA stations. ERA-Interim values for L↑ are superior to those of WaterGAP except for the absolute error measure derived at the GEBA stations (Table 3).

• Best results for R net are found for the current standard forcing (WFDEI, BSRN) or the alternative forcing (PGFv2, GEBA) (Table 3), but median absolute errors are around 20 Wm −2 (comparable to the study of Troy and Wood [16]) and are mainly due to a higher mean value, independent of the model variant (Figures 3 and 4).

• Global values for PET vary only by 25 mm·year −1 and the highest values are achieved with WFDEI forcing. However, in energy-limited regions, where a change in PET directly influences AET, WFDEI has the second lowest value while ERAIN is 60 mm·year −1 higher.

Some limitations were found:

• The relatively small number of some radiation measurements (e.g., 16 stations with upward flux measurements for BSRN) limits the overall assessment and hinders a robust assessment of the performance of the model variants, especially when separating into energy-limited/water-limited areas.

• In contrast to the downward components, which show a reasonable representativeness of station measurements at the grid cell level [26], upward measurements have a small footprint and thus a high uncertainty in terms of representativeness for the grid cell.

The results imply a certain impact of radiation forcing and modeling on the simulation of PET, especially for energy-limited regions, which then affects the modeling of available water resources. However, calculation of PET can be done with many approaches which all have their own uncertainties. To quantify this, in particular, a study on the impact of PET (including different calculation approaches) is needed, followed by a quantification of the consequences for AET and freshwater resources. The results of this study can help to improve the understanding of net radiation estimates as an important driver of simulated evapotranspiration in GHMs. Even though this assessment is model-specific, GHMs with a comparable approach (i.e., land cover dependent albedo and emissivity) can benefit from this analysis as two of the most frequently used meteorological forcings and one reanalysis for two radiation setups were evaluated.

Figure 1. Location of the reference stations used for evaluating the radiation components: (a) Baseline Surface Radiation Network (BSRN) and (b) Global Energy Balance Archive (GEBA). A specific color for GEBA stations was assigned when the radiation variable(s) were available for at least ten stations, otherwise joined to "all other combinations".

Figure 2. Validation of monthly radiation components with measurements of 16 BSRN stations (where all components are measured, blue color), 43 BSRN stations where only downward components were measured (green color) for the time period 1992-2012, as well as GEBA stations (red color) for the time period 1979-2012 using Kling-Gupta Efficiency E KG (-). Numbers after the slash indicate the number of stations that are outside of the 1.5 × inter quartile range and thus not included in the boxplot. Significant differences of metric distribution (two sample Kolmogorov-Smirnov test, p < 0.05) are present for GEBA stations and shortwave downward radiation for all combinations except ERAID-ERAIN. For single components of E KG, see Figures 3-5.

Figure 3. Validation of monthly radiation components with measurements of 16 BSRN stations (where all components are measured, blue color), 43 BSRN stations where only downward components were measured (green color) for the time period 1992-2012, as well as GEBA stations (red color) for the time period 1979-2012 using Kling-Gupta Efficiency component E KGbeta (-) as indicator for mean bias. Numbers after the slash indicate the number of stations that are outside of the 1.5 × inter quartile range and thus not included in the boxplot. Significant differences of metric distribution (two sample Kolmogorov-Smirnov test, p < 0.05) are present for BSRN stations and the downward components for all combinations with PGFv2 and for the net shortwave as well as net longwave radiation for ERAIN-PGFv2. Within GEBA stations, significant differences occur for shortwave downward radiation between ERAIN/ERAID and PGFv2/WFDEI and also PGFv2-WFDEI; for net shortwave radiation between ERAIN-WFDEI and for net radiation between ERAID-ERAIN, ERAIN-PGFv2 and ERAIN-WFDEI.

Figure 4. Validation of monthly radiation components with measurements of 16 BSRN stations (where all components are measured, blue color), 43 BSRN stations where only downward components were measured (green color) for the time period 1992-2012, as well as GEBA stations (red color) for the time period 1979-2012 using Kling-Gupta Efficiency component E KGr (-) as indicator for correlation, thus timing. Numbers after the slash indicate the number of stations that are outside of the 1.5 × inter quartile range and thus not included in the boxplot. Significant differences of metric distribution (two sample Kolmogorov-Smirnov test, p < 0.05) are present for GEBA stations and shortwave downward radiation for ERAID-WFDEI and ERAIN-WFDEI.

Figure 5. Validation of monthly radiation components with measurements of 16 BSRN stations (where all components are measured, blue color), 43 BSRN stations where only downward components were measured (green color) for the time period 1992-2012, as well as GEBA stations (red color) for the time period 1979-2012 using Kling-Gupta Efficiency component E KGgamma (-) as indicator for variability. Numbers after the slash indicate the number of stations that are outside of the 1.5 × inter quartile range and thus not included in the boxplot. Significant differences of metric distribution (two sample Kolmogorov-Smirnov test, p < 0.05) are present for BSRN stations and longwave downward radiation for all combinations with PGFv2, for longwave upward radiation for ERAID-ERAIN, ERAIN-PGFv2 and ERAIN-WFDEI as well as for net radiation between PGFv2 and WFDEI. For the GEBA stations, significant differences occur for shortwave downward radiation among all combinations (except ERAID-ERAIN), and for longwave upward as well as for net radiation between ERAID-ERAIN, ERAIN-PGFv2 and ERAIN-WFDEI.

Figure 6. Validation of monthly radiation components with measurements of 16 BSRN stations (where all components are measured, blue color), 43 BSRN stations where only downward components were measured (green color) for the time period 1992-2012, as well as GEBA stations (red color) for the time period 1979-2012 using Mean Absolute Error E MAE (Wm −2). Numbers after the slash indicate the number of stations that are outside of the 1.5 × inter quartile range and thus not included in the boxplot.

Figure 7. The average fraction of each E KG component (-), showing the different contributions of correlation, variability and mean bias to the overall efficiency criterion.

Table 1. Model variant names for the experiment and source for climate variables.

Table 2. Number of stations and months of the radiation data used within this study. Stations were only considered if at least 6 months of data were available. Columns: Variable, # Stations, # Stations (Calculated), # Months, # Months (Calculated).

Table 3. Rank sums divided by number of stations for the radiation components of BSRN and GEBA stations and the efficiency metrics E KG and E MAE. The lowest rank is shown in bold font, indicating the best match to observations. Model variants are abbreviated with ERD (stands for ERAID), ERN (ERAIN), PGF (PGFv2), WFD (WFDEI) and net radiation components with a subscript "n".

Table 4. Rank sums divided by number of stations for the radiation components, BSRN stations and the efficiency metrics E KGr, E KGbeta and E KGgamma. The lowest rank is shown in bold font, indicating the best match to observations. Model variants are abbreviated with ERD (stands for ERAID), ERN (ERAIN), PGF (PGFv2), WFD (WFDEI) and net radiation components with a subscript "n".

Table 5. Rank sums divided by number of stations for the radiation components, GEBA stations and the efficiency metrics E KGr, E KGbeta and E KGgamma. The lowest rank is shown in bold font, indicating the best match to observations. Model variants are abbreviated with ERD (stands for ERAID), ERN (ERAIN), PGF (PGFv2), WFD (WFDEI) and net radiation components with a subscript "n".