Evaluation of the Soil, Vegetation, and Snow (SVS) Land Surface Model for the Simulation of Surface Energy Fluxes and Soil Moisture under Snow-Free Conditions

: The recently developed Soil, Vegetation, and Snow (SVS) land surface model is being progressively implemented at Environment and Climate Change Canada (ECCC) for operational numerical weather and hydrological predictions. The objective of this study is to evaluate the ability of SVS, in ofﬂine point-scale mode and under snow-free conditions, to simulate the surface heat ﬂuxes and soil moisture when compared to ﬂux tower observations and simulations from the Canadian Land Surface Scheme (CLASS), used here as a benchmark model. To do this, we performed point-scale simulations of between 4 and 12 years of data records at six selected sites of the FLUXNET network under arid, Mediterranean and tropical climates. At all sites, SVS shows realistic simulations of latent heat ﬂux, sensible heat ﬂux and net radiation. Soil heat ﬂux is reasonably well simulated for the arid sites and one Mediterranean site and poorly simulated for the tropical sites. On the other hand, surface soil moisture was reasonably well simulated at the arid and Mediterranean sites and poorly simulated at the tropical sites. SVS performance was comparable to CLASS not only for energy ﬂuxes and soil moisture, but also for more speciﬁc processes such as evapotranspiration and water balance.


Introduction
Land surface models (LSMs) simulate energy and water exchanges between the different components of a land system. LSMs are typically integrated into numerical weather prediction and climate projection schemes in order to provide the required surface boundary conditions [1,2]. They are also a component of land-surface data assimilation systems [3][4][5], as well as hydrological prediction systems [6].
LSMs for use in weather and climate modeling have been the subject of intense development over the last three decades [7,8]. They consolidate knowledge acquired from micrometeorology, soil sciences, as well as field and laboratory research in plant physiology. The complexity inherent to these models makes them remarkable in their capacity to capture energy and moisture exchanges within the soil, vegetation and snow components, as well as hydrological processes at the catchment scale [2]. Substantial progress associated with LSM development has also resulted from model inter-comparison studies [9][10][11] that highlighted the interest in running LSMs decoupled from their host atmospheric model, since offline simulations allow computationally inexpensive research and model improvement [12,13]. A good example is the Canadian Land Surface Scheme (CLASS) [14,15] that has been successfully assessed (e.g., [16][17][18][19]) and widely applied over the North American boreal zone (e.g., [20][21][22][23][24][25]) and other environments (e.g., [26][27][28][29][30]), and more recently in a hydrological context (e.g., [31][32][33]).
According to Pitman [7], the two key equations representing the land surface processes are the energy and water balances. The first one, applied to a surface, is defined as follows: where R n is net radiation; H and LE are the sensible and latent heat fluxes, respectively; and G is the soil heat flux. The water balance (under snow-free conditions) is defined as: where P is precipitation, ET is evapotranspiration, R sur f is the overland flow (possibly including subsurface lateral flow), R base is the drainage (vertical flow from the bottom of the soil column) and ∆S is the variation in the water stored in the surface scheme (i.e., in the vegetation canopy, snowpack or soil column). Note that ET is linked to the surface energy balance by the latent heat flux term, LE = λET, where λ is the latent heat of vaporization. LSM assessment is typically done in the context of the evaluation of all terms of Equation (1). [9,[34][35][36] and/or Equation (2) [37][38][39]. Environment and Climate Change Canada (ECCC) currently uses a modified version of the Interaction between Soil, Biosphere and Atmosphere (ISBA) LSM to produce operational weather forecasts [40][41][42]. ISBA provides surface fluxes of momentum, water and energy to the Global Environmental Multiscale (GEM) atmospheric model. To address particular limitations with their operational version of ISBA [40,41], ECCC developed the Soil, Vegetation and Snow (SVS) LSM. As documented by Alavi et al. [43] and Husain et al. [44], SVS introduces the following improvements compared to ISBA: a new tiling approach, an improved parameterization of the vegetation thermal coefficient, the inclusion (as an option) of photosynthesis processes, a new formulation of land surface albedo and emissivity, a root density function that depends on the vegetation type, an update of the snow melt process and multilayer vertical water transport in the soil.
In a weather prediction context, SVS improved ISBA offline simulations of soil moisture and brightness temperature during the warm season over the North American domain [43,44]. In a hydrologic context, SVS has been shown to produce satisfactory runoff simulations within the large and partly ungauged Lake Ontario basin [6], and in hindcasts of the June 2013 flood in Alberta (Canada) [45]. More recently, Maheu et al. [46] evaluated the point scale soil water transport component of SVS by using precipitation and latent heat fluxes as input data. The model component performed well when simulating soil moisture in Mediterranean and temperate climates of the United States. While previous studies have evaluated the general performance of SVS and contrasted it to ISBA, no detailed evaluation of SVS outside of the North American domain has been performed to date. Moreover, none of the studies focused on evaluating the ability of SVS to simulate surface energy fluxes. Thus, there is a clear need for additional evaluations of SVS focusing on the model's potential to simulate vertical exchanges of heat and water for various climates and biome types. More particularly, we are interested in the evaluation of soil and vegetation components, but not the snow component, which deserves particular attention and would be the subject of another study.
The objective of this study was thus to evaluate the ability of SVS to simulate surface energy fluxes and soil moisture in snow-free conditions for various climate and biome types. SVS was assessed in its offline point-scale configuration by comparing its outputs to energy fluxes and near-surface measurements. It was also compared to simulations from the widely implemented CLASS model, which was used as a benchmark model.

Study Sites
FLUXNET is a network of micrometeorological measurements. It includes more than 800 active and historic flux measurement sites spread across most of the world's climatic regions and representative biomes [47]. Six FLUXNET sites were selected from a total of 212 sites in the FLUXNET2015 dataset updated in November 2016. The sites were filtered progressively using the following four criteria: (1) yearlong snow-free conditions (50 sites); (2) data availability of surface energy fluxes and volumetric soil moisture content measurements over multiple years (12 sites); (3) contrasting climate types; and (4) biomes (6 sites, Figure 1).    1 Köppen-Geiger climate classification: hot semiarid (Bsh) and cold semiarid (BSk). 2 Biome types according to IGBP land cover classification: evergreen needleleaf forest (ENF), grassland (GRA), woody savanna (WSA), closed shrubland (CSH) and evergreen broadleaf forest (EBF). 3 Mean annual precipitation (MAP) and mean annual air temperature (MAT).
The sites show mean annual precipitation, ranging from 306 mm at the arid AU-ASM site to 3041 mm at the tropical GF-Guy site ( Table 1). The mean annual air temperatures range from 15.6 to 25.7 • C (Table 1) The AU-ASM site is covered with a mulga savanna woodland at the southern end of the North Australian tropical transect [48]. At this site, 70% of its annual precipitation occurs during the summer months (December-February) and 85% falls throughout the monsoon season (November-April). The mean monthly temperature can range from 13 • C in June and July to 33 • C in January (Figure 2a). The other arid site, US-Wkg, exhibits a drastic seasonality in precipitation patterns (Figure 2b), with 70% of the annual precipitation taking place during the summer months from July to September because of the annual cycle of the North American monsoon [49]. The land is covered mainly by a diverse mosaic of native grassland species replaced progressively by Lehman lovegrass starting in 2006 [50]. The US-Ton Mediterranean site has a mean annual precipitation of 559 mm, with 80% of it concentrated from November to April and with dry and warm conditions during the summer months from July to September (Figure 2c). Land cover is classified as an oak savanna woodland (dormant in winter and active in spring and summer) interspersed with grassland and forbs (mainly active during the rainy winter periods) [51]. The other Mediterranean site is in Italy, IT-Noe, and displays a similar behavior compared to the US-Ton site in terms of annual temperatures. Precipitation mainly occurs during Spring and high water deficit is observed from May through September. Intense storms provide significant runoff and low water storage during fall [52]. The main species of vegetation are juniper and lentisk with variable-sized patches of bare ground [52]. The MY-PSO tropical site has a mean annual precipitation of 1804 mm, which is less than other regions of Peninsular Malaysia [53]. Rainfall peaks in March and April and from October to December (Figure 2e). Monthly air temperature does not show seasonality and remains uniform around 27 • C ( Figure 2e). The area is covered by a mixed dipterocard forest with a continuous canopy height of around 35 m and some emergent trees surpassing 45 m [54]. The other tropical site, GF-Guy, presents intense precipitation during December-February and April-July due to the north-south movement of the Inter-Tropical Convergence Zone [55]. This site is located in a tropical wet forest with a mean tree height of 35 m and emerging trees exceeding 40 m [55].

Data
Micrometeorological forcing and validation data were both retrieved from FLUXNET2015 dataset (www.fluxnet.fluxdata.org) at 30-min time steps. The forcing data consisted of seven atmospheric variables needed to pilot both SVS and CLASS: air temperature ( • C), precipitation (kg m −2 s −1 ), downward shortwave radiation (W m −2 ), downward longwave radiation (W m −2 ), wind speed (m s −1 ), surface pressure (Pa) and specific humidity (kg kg −1 ). Specific humidity was estimated as a function of relative humidity and air temperature following Gill [56]. FLUXNET2015 data are gap-filled with a combination of autocorrelation gap-filling techniques [57] and downscaled ERA-interim data [58].
The surface energy fluxes to be evaluated are the latent heat flux (LE), sensible heat flux (H), net radiation (R n ), and soil heat flux (G). They were gap-filled using autocorrelation filling techniques [57]. The turbulent fluxes (LE and H) were measured using the eddy-covariance technique [59]. They were also adjusted in the FLUXNET2015 dataset by an energy balance closure correcting factor ( (R n − G)/(H + LE) ). The factor avoid changes in the heat storage in the ecosystem. It was calculated at 30-min time intervals and applied only when energy closure fell outside 1.5 times the interquartile interval and when all the terms were available. Following similar studies by Blyth et al. [12] and Alves et al. [30], a daily energy balance closure filter was applied in order to guarantee a proper observed surface energy balance closure. The daily filter consisted of discarding days for which the error in energy balance closure exceeded 10%. The percentage of the data accepted is the following: AU-ASM (43.2%), US-Wkg (17.3%), US-Ton (11.2%), IT-Noe (29.4%), MY-PSO (91.6%), and GF-Guy was not filtered because observations of G were not available.

SVS
SVS is a land surface model recently developed at ECCC [43,44]. In SVS, each grid cell may be divided into the following four tiling components: (1) bare ground; (2) vegetation (low and high); (3) snow over bare ground and low vegetation; and (4) snow under high vegetation. In the absence of snow, only Components (1) and (2) are taken into account. The model considers a single-layer approach for the vegetation canopy, a one-layer snowpack and a user-defined number of soil layers, with user-defined depth (by default set to seven layers) to compute soil moisture. The force-restore method is used to solve the energy budgets of each surface tile. Schematic diagrams of the hydrological regime and the tiling implementation in SVS soil part can be found in the works of Alavi et al. [43] and Husain et al. [44].
The energy budgets for the different tiles are obtained using the frce-restore approach. In the presence of different vegetation types, in a single grid cell, a single energy budget is calculated using mean vegetation parameters (such as albedo and emissivity) based on a weighted mean of the parameters of each individual type of vegetation present in the grid cell, and their respective fractional coverage. The net radiation for each type of vegetation is estimated taking into account total incoming solar and infrared radiation, as well as vegetation albedo and emissivity. The sensible and latent heat fluxes over vegetation are modeled using the bulk aerodynamic approach [43]. The soil heat flux was computed as the residual of the net radiation and the sensible and latent heat fluxes.
As described by Husain et al. [44], SVS distinguishes among 21 types of vegetation. In each grid cell, a varying combination of these vegetation types can be present. Vegetation parameters, such as leaf area index, roughness length, heat capacity of vegetation, albedo, stomatal resistance and root depth, are assigned separately for each vegetation-type and are specified using look-up tables. The canopy interception capacity is computed as a function of the leaf area index. If the canopy surface is covered with a film of intercepted rainfall (or if the vapor flux is toward the canopy), evaporation (condensation) takes place following the bulk aerodynamic approach. If the canopy is dry, the transpiration from leaves is computed by introducing the stomatal resistance to the evaporation. To compute the stomatal resistance, SVS has the option to use either a Jarvis-type approach [44] or a photosynthesis module that follows a biochemical approach [63][64][65]. In this study, the photosynthesis module was used.
The evolution of volumetric soil moisture in different soil layers is calculated using the diffusion equation. Liquid water flow is calculated using the Darcian equation for one-dimensional fluid flow along with the hydraulic conductivity and soil water suction following Clapp and Hornberger [66]. The saturated volumetric water content, wilting point volumetric water content and volumetric water content at field capacity are calculated based on the percentage of sand and clay contents in the soil according to Boone et al. [67] and Soulis et al. [68]. The saturated hydraulic conductivity and saturated soil water potential are calculated from the percentage of sand [69]. The lateral flow in the downhill direction is estimated through a parameterization of Richard's equation, which estimates lateral flow rate at the seepage face of a hillslope. Lateral flow only occurs when soil water content exceeds water content at field capacity for a sloping element [68]. The water balance for the entire soil column is evaluated as a residual of precipitation, evaporation, water uptake by roots, runoff, lateral flow, baseflow and change in the soil water liquid content.

CLASS
CLASS was initially conceived in response to the perceived need for more soil thermal and moisture layers and for a separate treatment of vegetation canopy within the Canadian Global Circulation Model [14,15]. It has been under continuous development, as evidenced by Wu et al. [70], von Salzen et al. [71], Bartlett et al. [21], and Verseghy [72]. Each grid cell in CLASS may be divided into subareas of: (1) bare soil; (2) vegetation over the soil; (3) vegetation over snow; and (4) snow over soil. The vegetation canopy is treated as a single layer, same for the snowpack. The soil column has the flexibility to handle any number of soil layers to whatever depth the user specifies. The energy fluxes are computed based on the heat conservation equation described by Verseghy et al. [15]. The prognostic evolution of soil water content and the vertical water flow are given by the one-dimensional Darcy and Richards equations, respectively (see Verseghy [14] for a more detailed description.)

Differences between SVS and CLASS
As shown in Table 2, SVS and CLASS rely on similar parameterizations to represent several physical processes. However, there are two distinctive differences. One is associated with stomatal resistance, where the photosynthesis module is used in SVS, while an empirical parameterization is used in CLASS. Note that, although not in the version used in this study, CLASS can be coupled to a similar photosynthesis module [73] and can also have access to a climate vegetation model [71]. The other difference resides in the method used for computing the surface (skin) temperature and the mean temperature of the vegetation and soil components. A one-layer force-restore method is employed in SVS, while CLASS uses a multi-layer heat diffusion equation. Note that a multi-layer Richard's equation framework for soil moisture is used in SVS and CLASS. The force-restore approach is still used in SVS for its computational efficiency and parsimony (less state variable) in the context of operational forecasting. Table 2. Summary of SVS and CLASS main physical processes.

Experimental Setup
In this study, experiments were performed at point-scale and in offline mode with the MESH modeling platform (Modélisation Environnementale communautaire-Surface Hydrology), which is ECCC's community environmental modeling system [31]. MESH allows different LSMs and hydrologic routing models to coexist within the same modeling framework. Although the final goal of MESH is to forecast water discharge at the watershed scale, it is also possible to utilize the stand-alone offline versions of SVS and CLASS, which are the two current LSMs integrated into MESH. A technical description of MESH can be found in Pietroniro et al. [31] and at www.wiki.usask.ca/display/MESH. MESH including the SVS and CLASS codes is available at www.wiki.usask.ca/display/MESH/ Interim+releases.
Three time steps were specified in the models: (1) the input time step was set as 30-min for each model; (2) the model time step was defined as 5-min for SVS and 30-min for CLASS (a test carried out with CLASS for a time step of 5-min revealed almost the same results as those obtained at 30-min); and (3) the output time step was fixed as 30-min for both models.
The soil was divided into seven layers with depths from the top to the bottom equal to 10, 35, 85, 160, 235, 310 and 410 cm. Soil texture was assumed constant in the vertical direction since information at multiple depths was not available in the literature ( Table 3).
The soil texture parameters used at each site (Table 3) are provided to both models, except for the organic matter (orgm), which is used only by CLASS. Vegetation associated parameters are defined in the look-up tables of each model. As shown in Table 3, both models account for the same vegetation type. For simplicity shown treating lateral water fluxes, the grid was considered perfectly flat at all sites.
Both models were set up with the same initial soil moisture and temperature conditions. Soil temperature for each layer was initialized with the yearly mean temperature. Soil moisture for each layer was initialized at 50% of saturation. To account for an equilibrium in the prognostic variables at each site (spin up), the first year of meteorological input was replicated ten times.

Modeling Performance
The performance of SVS and CLASS simulations was evaluated using the following metrics: root mean square error (RMSE), which measures the spread of the residuals between simulation and observation; percent bias (Pbias), which quantifies the average tendency of the simulated value to be larger or smaller than its observation counterpart; the Pearson correlation coefficient, R, which measures the degree of correlation between simulation and observation; and the Nash-Sutcliffe efficiency (NSE) for which the optimal value is 1 and a value inferior to 0 indicates that the average of the observation offers a better predictor than the model.

Energy Fluxes
For a precise evaluation of energy fluxes, the mean annual cycles of latent heat flux (LE), sensible heat flux (H), net radiation (R n ) and soil heat flux (G) are plotted in Figure 3. The metrics of the same fluxes were computed at a 30-min sampling interval and are summarized in Table 4.

Arid Sites
The observed annual cycle of LE displays a maximum peak in February during the rainy season at the AU-ASM site (Figure 3a). The LE annual cycle observations are well represented by the SVS simulations, resulting in a good 30-min correlation (R = 0.77, Table 4) and a small overestimation (Pbias = 3.7%, Table 4). H observations peak during the summer months of November-January. Although the mean annual cycle of H is well represented by SVS, it is underestimated by 20 W m −2 on average during the months of October-December. This could be explained by the energy partition at the surface and more precisely by the underestimation of the net radiation (21 W m −2 during the same months) as explained in Section 3.3. LE and H simulations of SVS are similar to that of CLASS, as reflected by the scores in Table 4. R n is the most dominant term of the annual cycle as well as the best-simulated one in both models (NSE = 0.97, Table 4). However, the bias of Rn is large even if correlation is high. Although incident short and longwave radiations are given, the positive bias suggest that some parameters related to the albedo is unsuitable, as discussed in Section 3.3. In addition, its underestimation during October-December implies that less energy is available for the partition into turbulent fluxes, particularly into H. G is the least dominant flux of the annual cycle, with monthly mean observed value equal to 2.2 W m −2 (Figure 3a). Despite the good correlation of G (R = 0.82 by SVS and R = 0.92 by CLASS, Table 4) , SVS shows a large underestimation (Pbias = −157.6%, Table 4) while CLASS shows a large overestimation (Pbias = 76.9%, Table 4). Note that, for the particular case of G, Pbias may lead to large values as a result of the small absolute values of the observations. The conflicting results for G in SVS are associated to: (1) The way G is computed. As the residual of the other energy fluxes, G does not capture properly the seasonality of the observations in the same way than CLASS where G is estimated as a function of soil temperatures. (2) The conceptual fact that the simulations are not really being compared to the point measurement at a flux plate (see Section 3.3 for more details). We obtained similar results for the US-Wkg arid site, which is partially covered by long grass. No valid data are available in January at this site due to the energy closure imposed on the selection criteria.   Table 4) at this site. R n is the best simulated term by SVS (Pbias = -8.8%, NSE = 0.98, Table 4), even better than by CLASS (Pbias = −20.8%, NSE = 0.97). G, with a mean observed value equal to −1.34 W m −2 , is poorly simulated by SVS (Pbias = 30.7%, NSE = −7.01) and CLASS (Pbias = −55.0%, NSE = −10.97). The performance of SVS and CLASS is similar at the IT-Noe and the US-Ton site for all energy fluxes.

Tropical Sites
MY-PSO shows a tendency towards uniform LE and H observations throughout the year (see Figure 3e). The behavior of LE (Pbias = −0.8%, R = 0.94, Table 4) and H (Pbias = −6.7%, R = 0.89, Table 4) is well captured by SVS. R n is equally well simulated, contrary to G (mean observed value equal to −0.87 W m −2 ), which is largely underestimated (Pbias = 174.2%) and poorly correlated (R = 0.55). SVS simulates all energy fluxes in a similar fashion to CLASS at this site, except for H, which tends to be lower in CLASS (Pbias = −24.2%) than in SVS (Pbias = −6.7%). The LE and H fluxes are accurately simulated by SVS at the GF-Guy site during the wet season (December-August), but are underestimated by SVS during the dry season (September-November), where LE cannot be sustained (Figure 3f). A close examination on the components of LE during the same months (Figure 4e) reveals lower values of SVS canopy evaporation and plant transpiration than CLASS. According to SVS simulations for the AU-ASM arid site, Es is the main ET component (Es/ET = 63%). A similar result was found for the other arid site, US-Wkg, where Es/ET = 84% (Figure 4b). Ts is the main ET component at the US-Ton Mediterranean site (Ts/ET = 63%). The large model differences of maximum ET during the months of April-June are associated with differences in Ts at this site (Figure 4c). The tropical sites illustrate the relevant role of canopy interception and plant transpiration. For example, Ts/ET = 72% at the MY-PSO site, which consists of 70% evergreen forest (Figure 4e). Similarly Ts/ET = 79%, at the GF-Guy tropical site (Figure 4e). The underestimation of ET by SVS during the months of September-November (Figure 4f) is clearly associated with the underestimation of Ts, which is probably due to the overestimation of stomatal resistance.

Partitioning of Evapotranspiration
Ts/ET values from the literature can be used to benchmark model results. SVS simulations during the growing season (June-October) at the site US-Wkg lead to a value of 84%, which is above values observed by Moran et al. [50]: values of 45% in 2002 and 75% in 2007. On the other hand, CLASS had a closer estimate of 61% during the same period. A compilation of 81 studies presented by Schlesinger and Jasechko [84] partitioned annual ET into Ts at the ecosystem scale in Mediterranean (47 ± 10%) and tropical ecosystems (70 ± 14%), which are in line with SVS and CLASS simulations at the Mediterranean (respectively, 50% and 57% on average) and tropical (respectively, 76% and 60% on average) sites.

Sources of Uncertainties Associated with Energy Fluxes
SVS surface energy flux simulations show a tendency to underestimate R n at all sites. More precisely, the average underestimation in terms of the Pbias score is of the order of 10.4% at the arid sites (AU-ASM, US-Wkg), 11.9% at the Mediterranean sites (US-TON, IT-Noe) and 5.9% at the tropical sites (MY-PSO, GF-Guy). Alves et al. [30] showed the same scale of underestimations in R n when they contrasted CLASS simulations with five FLUXNET sites representing semi-arid, temperate, and tropical climate types. The main reason is the underestimation of net shortwave radiation for which the percent bias score is on the same order as R n : 10.3% at arid sites, 13.3% at Mediterranean sites and 3.5% at tropical sites. Furthermore, albedo simulated values in SVS (and also CLASS) tend to be overestimated at all the studied sites. Indeed, daily simulated albedo, averaged between 10:00 and 14:00, for SVS(CLASS) showed poor performances: RMSE = 0.07, Pbias = 60.28, r = 0.24 (RMSE = 0.07, Pbias = 48.23, r = 0.16). Part of the bias may be related to instrumentation uncertainty in the FLUXNET database. Wilson et al. [85] showed that R n observations in FLUXNET are typically characterized by systematic instrument errors of up to 15%.
For turbulent fluxes, H at arid (AU-ASM, US-Wkg) and Mediterranean sites (US-Ton, IT-Noe) is better simulated by SVS (NSE = 0.80 on average) than LE (NSE = 0.39 on average), while the opposite is true at the tropical sites (MY-PSO, GF-Guy) where LE (NSE = 0.81 on average) is better simulated than (NSE = 0.50 on average). The fact that LE is less performing than H at arid and Mediterranean sites is because of the presence of larger variability in biophysical characteristics, where significant climate seasonality exists. A closer look at the evapotranspiration components ( Figure 4) reveals that transpiration is the most likely process to be biased (here it is essential to mention the benchmarking role of the CLASS model). Another factor influencing the simulation of turbulent fluxes may be the nocturnal observations. At nearly all FLUXNET sites, it is assumed that the advection of scalars can be neglected [86]. However, non-zero values of mean vertical velocity and vertical advection are possible and can be induced by horizontal heterogeneity or mesoscale and synoptic scale forcings [87,88].
G simulated by SVS is biased for all sites without exception; one reason may be the method associated with the estimation of soil temperature. As reported by Verseghy [14], the force-restore method causes the surface temperature to respond to diurnal forcing rather slowly, leading to a damping of G (and of turbulent fluxes). Additional studies concerning the implementation of force-restore method into LSMs showed accurate estimates of shallow (upper) soil layers (e.g., [40,[89][90][91][92]), but limitations on deep soil temperature, and under snow cover and frozen soils [93,94]. Furthermore, soil heat flux measurements using plates, which is typical in FLUXNET, can be inaccurate because the thermal conductivity of the heat flux plate and the surrounding soil may be unequal [95]. Heat flux plates typically have a conductivity similar to that of soil, but thermal properties vary temporally because of changes in soil water content and temperature [96]. Soil heat flux plates can also alter the environment they are measuring, notably by limiting water movement in soils [97]. Figure 5 reveals the contribution of the different water balance terms from Equation (2) for the various studied climate conditions when precipitation is considered to be 100%. According to the SVS simulations, ET dominates with 93%, 84% and 53% in arid, Mediterranean and tropical sites, respectively (Figure 5a-f). The remaining contribution is split mainly between overland (R sur f ) and base (R base ) flows. Although CLASS simulates ET similarly to SVS, differences between the distribution of R sur f and R base are observed. For example, at the MY-PSO tropical site, ET is around 65% of precipitation in both models. However, R sur f is 35% in SVS, but only 8% in CLASS (Figure 5e). Such a difference between models could lead to a substantial difference in terms of hydrological responses if R sur f and R base are not aggregated together in the routing model. This is the case in GEM-Hydro, where R base is first provided to an intermediate reservoir in the routing scheme Watroute (aiming to represent the aquifer), before reaching the channel [6].

Water Balance
CLASS guarantees water balance closure for each of its four subareas up to the accuracy limit of 0.1 kg m −2 [98], as this is a requirement of long-term simulations of climate models [72]. SVS closes the water balance with a small positive bias (values in parenthesis) at all sites: AU-ASM (0.2%), US-Wkg (0.2%), US-Ton (0.6%), It-Noe (0.1%), MY-PSO(0.05%) and GF-Guy(0.04%).

Soil Moisture
Scores obtained from the evaluation of the soil water content simulations are summarized in Table 5. They were computed by comparing 30-min point-scale measurements of the shallowest layer (typically measured at a 5-or 10-cm depth) to modeled values corresponding to the topsoil layer (0-10 cm). For deeper layers (10-410 cm), only SVS and CLASS simulations are available and are presented in Figures 6-8.

Arid Sites
The surface layer soil moisture content is highly modulated by precipitation at all arid sites. At AU-ASM (Figure 6a), SVS surface soil moisture performance is low (NSE = −0.04, Table 5) and less than CLASS (NSE = 0.20, Table 5). During the dry periods (from May to September), SVS tends to represent the minimum observed value of 0.05 m 3 m −3 properly. Exceptions occurred during the dry periods of 2013 and 2014 when SVS depleted soil moisture to 0.0 m 3 m −3 while CLASS was comparable to observations (0.05 m 3 m −3 ). As reported by Maheu et al. [46], SVS depletion during dry periods at semiarid sites is explained by the absence of the residual soil water content, which in CLASS was defined as 0.04 m 3 m −3 [14]. During rainy events, SVS overshot the observed maximum soil water content (Pbias = 35.1%, Table 5). For deeper layers (10-420 cm), SVS is slightly wetter than CLASS. As a result, the minimum soil water content of 0 m 3 m −3 is reached at the 3-m depth in SVS, while it occurred at the 2.4-m depth with CLASS (Figures 6c,d). As a consequence, more water is available in the soil and is transferred as overland flow, drainage, or as plant uptake through the roots. Figure 5 shows around 5% more overland flow for SVS than for CLASS at this particular site. A good representation of near-surface soil moisture by SVS and CLASS (NSE = 0.64 and 0.78, respectively, Table 5) is observed for the other arid site US-Wkg (not shown).

Mediterranean Sites
SVS performed well at the US-Ton site during the study period (NSE = 0.73, Table 5), even better than CLASS (NSE = 0.60, Table 5). During the wet season, SVS systematically underestimated surface water content, the maximum observed soil water content was 0.55 m 3 m −3 , while the maximum simulated was 0.45 m 3 m −3 . SVS also slightly underestimated surface soil moisture during the dry seasons. This response is reflected by a negative Pbias (=−22.9%, Table 5). Soil water content for SVS shows significant water percolation (soil moisture between 0.3-0.4 m 3 m −3 ) during the wet season and a progressive water transfer from the surface layer during the dry season (see Figure 7c). This result is contrasted to that of CLASS, which shows slow water percolation during the wet season and drastic transitions during the dry season, resulting in generally drier conditions (Figure 7c). Considering that both models account with the same meteorological forcing and soil texture, divergences in the deep soil moisture could be explained by differences in the parameter setting as a function of the soil texture ( Table 2). See also Section 3.6 for a more detailed discussion.

Tropical
Both models showed low performance for surface water content simulations at the MY-PSO tropical site (NSE = −26.50 and −22.71 for SVS and CLASS, respectively, Table 5). SVS underestimates water content mainly during the dry period. This behavior is depicted by Pbias = −39.4%. Despite the weak performance of SVS, its similarity to CLASS and the reasonably good correlation of both models (R = 0.86 for SVS and CLASS, Table 5) suggest that there is some bias associated with the point-scale measurements that may be associated with local inaccuracies in the soil texture. In fact, the two primary studies associated with the site [54,99] represent soil moisture as the average at 0.1, 0.2 and 0.3 m with a closer variability to our simulations. Additionally, there was large variability in the datasets employed by Cosby et al. [80] when the relationships between soil texture and hydraulic parameters were developed. For deeper layers, both models tend to simulate uniform wet conditions (see Figure 8c,d), which is contrary to the arid and Mediterranean sites.

Uncertainties Associated with Soil Moisture
A mismatch can occur when we compare point-scale soil moisture observations against the averaged simulated values of the top 10 cm layer, which is somewhat close to the surface. Furthermore, the fact that the soil texture had to be defined as uniform over the entire soil column, as already mentioned, certainly has an additional influence on the representativeness of the results. On tropical sites with evergreen tropical forest, the soil surface is usually shaded and covered with organic-matter litter that is frequently wet [100]. It imposes a condition that reduces the surface runoff and increases the base flow [100], which may explain the divergences between SVS and CLASS water balances at tropical sites (Figure 5e,f). Concerning model structure, SVS and CLASS models are not directly informed on the local soil hydraulic properties, which incidentally are unknown at the selected sites. Soil hydraulic properties are instead deduced from pedotransfer functions that provide the most probable values for a given soil texture (Table 2). This procedure inevitably leads to some divergence with reality that will push the models to diverge from observations. In addition, although both models are similarly process-based as to soil description, the constant values differ in the parameterized equations. This is the case for the saturated hydraulic conductivity, which provides higher values in SVS than in CLASS, allowing a greater water content in deeper layers as well as a role in the water uptake from the soil. This effect is observed at the US-TON site ( Figure 5). At the same time, model discrepancies in the description of stomatal resistance may also have a role in water uptake from the soil. A closer introspection of the model structure bias may require a sensitivity analysis of specific parameters, particularly the hydraulic conductivity and stomatal resistance.

Conclusions
This study used flux tower observations to assess the ability of the SVS model to simulate surface energy fluxes and surface soil moisture under snow-free conditions for various climates and biomes. In addition to observations, the CLASS model was used as a benchmark. This approach provided an integral diagnostic of SVS in terms of energy and water exchange between the land surface and the atmosphere. Evapotranspiration was the most significant term for water balance with contributions of 95%, 82% and 50% in arid, Mediterranean, and tropical sites, respectively.
As a result of this study, some deficiencies in the SVS model's ability to simulate energy fluxes and surface soil moisture were identified and thus deserve further attention: 1.
The albedo of vegetation is simply based on a weighted mean of look-up table values of fixed albedos. This scheme may lead to a negative bias of up to 12% in the net shortwave radiation and 13% in the net radiation. Albedo is specified the same way as root depth, density and LAI. These are all parameters that can be improved and will impact SVS simulations. The photosynthesis also has several look up tables for various key parameters controlling the stomatal resistance. External databases and satellite-derived approximation could help better specify some of these parameters.

2.
Abrupt underestimations of LE at GF-Guy site is mainly associated with transpiration simulations. Thus, further attention should be given to this process.

3.
Poorly simulated G is associated with the simple method used (residual of the other energy fluxes) and the single-layer description of the soil. To avoid these structural biases, vertical transport of heat in the soil along with multilayer-energy balance would be an asset in the soil description. 4.
The better simulation of soil moisture with CLASS in arid sites suggests that certain features, such as the inclusion of residual saturation when defining water retention curves, can be advantageous. Alternatively, limits on evaporation and moisture stress parameterizations may help to prevent soil moisture to reach zero values.

5.
Surface soil moisture at tropical sites should take into account organic matter. Funding: This research was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC), Hydro-Québec and the Ouranos consortium on regional climatology and adaptation to climate change, as well as the scientific contribution of Environment and Climate Change Canada and the Ministère de l'Environnement et de la Lutte contre les changements climatiques through NSERC project RDC-477125-14 entitled "Modélisation hydrologique avec bilan énergétique (ÉVAP).