Partitioning Global Surface Energy and Their Controlling Factors Based on Machine Learning

As two competitive pathways of surface energy partitioning, latent (LE) and sensible (H) heat fluxes are expected to be strongly influenced by climate change and wide spread of global greening in recent several decades. Quantifying the surface energy fluxes (i.e., LE and H) variations and controlling factors is still a challenge because of the discrepancy in existing models, parameterizations, as well as driven datasets. In this study, we assessed the ability of random forest (RF, a machine learning method) and further predicted the global surface energy fluxes (i.e., LE and H) by combining FLUXNET observations, climate reanalysis and satellite-based leaf area index (LAI). The results show that the surface energy fluxes variations can be highly explained by the established RF models. The coefficient of determination (R2) ranges from 0.66 to 0.89 for the LE, and from 0.53 to 0.90 for the H across 10 plant functional types (PFTs), respectively. Meanwhile, the root mean square error (RMSE) ranges from 12.20 W/m2 to 21.94 W/m2 for the LE and from 12.05 W/m2 to 22.34 W/m2 for the H at a monthly scale, respectively. The important influencing factors in building RF models are divergent with respect to LE and H, but the solar radiation is common to both LE and H and to all 10 PFTs in this study. We also found a contrasting trend of LE and H: a positive trend in LE and a negative trend in H during 1982–2016 and these contrasting trends are dominated by the elevated CO2 concentration level. Our study suggested an important role of the CO2 concentration in determining surface energy partitioning which is needed to be considered in future studies.


Introduction
The surface energy partitioning between latent (LE) and sensible (H) heat fluxes plays a critical role in controlling the state of the atmospheric boundary layer, hydrological cycle, as well as the weather and climate dynamics [1,2]. It links the atmosphere-land interactions by transmitting the effects of changes in surface characteristics and anthropogenic activities to the atmosphere by modifying the exchanges of vapor (i.e., LE) and energy (i.e., H) fluxes [3][4][5]. However, this process is also affected by the ongoing climate changes and human activities [6]. For example, human activities, like global-scale deforestation over the past decades, influenced the surface energy balance by suppressing the LE released in low latitudes and reducing the energy available at the surface, which leads to a corresponding warming FLUXNET network [12]. The monthly data were used to train the RF models and evaluate the RF outputs. The FLUXNET datasets are contributed by individual researchers across the world. The gathered datasets with various formats were standardized and quality controlled uniformly by the Regional Network and Fluxdata teams. The detailed information about processing including gap-filling, uncertainty quantification are given by Pastorello et al. [12]. Besides, the imbalance in the energy budget within eddy covariance measurements are corrected by multiplying an correction factor proposed by Pastorello et al. [12]. The criteria followed to select subsets of FLUXNET datasets in this study are: (1) the energy fluxes measurements and associated meteorological variables are available from the year 2000 to match with the Moderate Resolution Imaging Spectroradiometer (MODIS) LAI data sets; (2) the sites with poor energy closure in the turbulent fluxes are corrected are adopted. Finally, 152 sites were chosen and are shown in Figure 1. These sites cover 11 plant functional types (PFTs) classified by the International Geosphere-Biosphere Programme (IGBP) definition. They are: evergreen needleleaf forests (ENF, 34 sites), evergreen broadleaf forests (EBF, 12 sites), deciduous broadleaf forests (DBF, 18 sites), mixed forests (MF, 6 sites), croplands (CRO, 19 sites), grasslands (GRA, 30 sites), savanna (SAV, 7 sites), woody savanna (WSA, 6 sites), permanent wetlands (WET, 11 sites), closed (CSH, 2 sites) and open (OSH, 7 sites) shrublands. Since only two sites are covered by CSH, both CSH and OSH are integrated as broad shrublands (SH) in this study.
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 18 outputs. The FLUXNET datasets are contributed by individual researchers across the world. The gathered datasets with various formats were standardized and quality controlled uniformly by the Regional Network and Fluxdata teams. The detailed information about processing including gapfilling, uncertainty quantification are given by Pastorello et al. [12]. Besides, the imbalance in the energy budget within eddy covariance measurements are corrected by multiplying an correction factor proposed by Pastorello et al. [12]. The criteria followed to select subsets of FLUXNET datasets in this study are: (1)

Vegetation and Climate Datasets
Previous studies have reported that vegetation change is a primary driver of surface energy fluxes variations, especially for the LE in which more than 50% of it could be explained by vegetation greening [34,35]. Leaf area index has been widely used as an indicator to monitor vegetation dynamics for its advantage of providing quantitative information of vegetation coverage, productivity and greenness, and is also adopted by this study. Given the facts that uncertainties exist in different LAI products projected by different algorithms, three different satellite-derived LAI products were considered in the RF models to predict global surface fluxes. The Global Inventory Modeling and Mapping Studies (GIMMS) LAI3g dataset was constructed by the Artificial Neural Network (ANN) model from GIMMIS Normalized difference vegetation index (NDVI3g) dataset [36]. The ANN model was established by training GIMMIS NDVI3g and MODIS LAI datasets for the overlapping period 2000-2009. The Long-term Global Mapping (GLOBMAP) LAI dataset was generated by combing both MODIS LAI and Advanced Very High Resolution Radiometer (AVHRR) LAI [37]. The pixel-by-pixel relationship was established first based on MODIS LAI and GIMMIS NDVI during the overlapped period (2000)(2001)(2002)(2003)(2004)(2005)(2006), and was used to derive historical AVHRR LAI during 1982-2000 from GIMMIS NDVI. GLOBMAP and GIMMIS LAI datasets have the same

Vegetation and Climate Datasets
Previous studies have reported that vegetation change is a primary driver of surface energy fluxes variations, especially for the LE in which more than 50% of it could be explained by vegetation greening [34,35]. Leaf area index has been widely used as an indicator to monitor vegetation dynamics for its advantage of providing quantitative information of vegetation coverage, productivity and greenness, and is also adopted by this study. Given the facts that uncertainties exist in different LAI products projected by different algorithms, three different satellite-derived LAI products were considered in the RF models to predict global surface fluxes. The Global Inventory Modeling and Mapping Studies (GIMMS) LAI3g dataset was constructed by the Artificial Neural Network (ANN) model from GIMMIS Normalized difference vegetation index (NDVI3g) dataset [36]. The ANN model was established by training GIMMIS NDVI3g and MODIS LAI datasets for the overlapping period 2000-2009. The Long-term Global Mapping (GLOBMAP) LAI dataset was generated by combing both MODIS LAI and Advanced Very High Resolution Radiometer (AVHRR) LAI [37]. The pixel-by-pixel relationship was established first based on MODIS LAI and GIMMIS NDVI during the overlapped period (2000)(2001)(2002)(2003)(2004)(2005)(2006), and was used to derive historical AVHRR LAI during 1982-2000 from GIMMIS Remote Sens. 2020, 12, 3712 4 of 18 NDVI. GLOBMAP and GIMMIS LAI datasets have the same temporal resolution of half month, but their spatial resolutions are 1/12 degree and 8 km, respectively. The Global LAnd Surface Satellite (GLASS) LAI was constructed based on general regression neural networks, which are trained with the fused MODIS and CYCLOPES LAI (from SPOT/VEGETATION) products and the AVHRR reflectance data [38]. The spatial resolution is 0.05 degree with 8-day time step spanning from 1982-2016. Besides, since LAI observations are not available in eddy covariance sites, the MODIS LAI (MOD15A2H) was used to train RF models because of its high spatial resolution of 500 m. The best pixels were extracted based on its associated QC flags, and finally were aggregated into monthly data from 8-day step to match with observed surface energy fluxes. The MODIS IGBP land cover map (MCD12C1) in 2005 was obtained from the land processes distributed active archive center (LP DAAC). The product has a spatial resolution of 0.05 • . The 0.5 • spatial resolution map was produced by the dominant land cover types in our global surface energy flux estimations.
The latest version of global climate forcing datasets (CRUNCEP V7) were obtained from Research Data Archive at the National Center for Atmospheric Research [39]. These datasets were constructed by combing two existing datasets: the Climate Research Unit (CRU) monthly data and the reanalysis data developed by National Centers for Environmental Prediction (NCEP), and were initially constructed to drive land surface models by their highly temporal (6-h) and spatial (0.5 degree) resolutions. Global monthly mean CO 2 concentration data were obtained from The Global Monitoring Division of NOAA/Earth System Research Laboratory, which is available from 1980 to date [40].
Atmospheric vapor pressure deficit (VPD), defined as the difference between saturate vapor pressure (SVP) and actual vapor pressure (AVP) for a given temperature, is also a potential factor in controlling surface energy fluxes, especially for the LE, by determining air vapor holding capacity and vapor exchange between vegetation and atmosphere through controlling stomatal closure [41]. It was calculated from other existing climate factors from CRUNCEP datasets based on the following equations [42]: (1) f w = 1.0007 + 3.46 × 10 −6 × P s (2) where f w is an enhanced factor in moist air and P s is the surface pressure.

Random Forest Model
The random forest method (RF) was used to reconstruct the surface fluxes during 1982-2016. RF is a tree-based ensemble approach and the predictions are averaged by all decision trees, which can be used for both classification and regression purposes [43]. Each decision tree was built independently by a bootstrap sample of the data set, randomly generated from the predictor variable values [44]. Each node of the decision tree was based on one characteristic (i.e., predictor variable) and the final outputs depend on the majority votes. This allows RF to avoid collinearity between predictor variables. RF has been widely used in agricultural, ecological and meteorological studies, and has demonstrated to be a robust method to reconstruct surface parameters [3,30]. Seven predictor variables, SR, PRE, TEM, wind, LAI, VPD and CO 2 are selected to build RF models for both LE and H, which have proven to play a significant role in regulating surface energy partitioning [3, [45][46][47]. RF was implemented by package "randomForest" in R language [41]. 'mtry' and 'ntree' are two parameters needed to be adjusted in RF. 'mtry' represents the number of randomly chosen predictor variables in each node of one decision tree and predefined as one third of total number of predictor variable, while ntree represents the number of total decision trees and predefined as 500 in RF. Here, 'ntree' is adjusted as 1000 based on the fact that the outputs tend to be stable. The parameter 'mtry' was not tuned because the model performance changed slightly compared to the default 'mtry'.

Morris Sensitivity Method
The Morris sensitivity analysis is an effective method to filter out important factors in models expressed mathematically [48]. The importance ranking of input factors are determined by both the mean (µ i ) and standard deviation (σ i ) of the elementary effects attributable to each input. Before implementation, all the inputs are scaled to [0, 1], and the elementary effects (E i ) of parameter x i are expressed as: where f is the computational model referred to as RF in this study and ∆ is the step size of input parameter x i ; ∆ is equal to j/2(j − 1) [49], and j is the sampling levels that is predefined. The starting parameter is randomly selected and varies by ∆, while other parameters are fixed. Then the second parameter is selected and varies again by ∆, until all the parameters are implemented. This is repeated r times (i.e., r trajectories). Therefore, the mean elementary effect of parameter x i is expressed as: The standard deviation of elementary effects of parameter x i is expressed as: where µ i represents the overall effects of x i on model output and σ i means the interactions/non-linearities with other parameters.
To determine the important parameters, a measure of importance for parameter x i (β i ) is suggested by Lu et al. [50], which is calculated by considering both µ i and σ i , where β 0 is a threshold above which that parameter is considered to be important and is suggested as 0.85 by the study from Lu et al. [50]; s is the standard deviation of µ 2 i + σ 2 i of all parameters. Thus, the important ranking of input parameters can be obtained according to β i values.

Experimental Design
Two types of experimental simulations were designed to extract the relative contributions of individual factors to the surface energy fluxes. In the first simulation, all the drivers, including vegetation and climate predictor variables, were set to vary during 1982-2016, which is defined as the total effects (S total ). In the second type of simulations, only one driver (S LAI , S TEM , S PRE , S VPD , S SW , Sco 2 and S Wind ) was kept at the initial state (1982), while the remaining drivers were set to vary with time. The difference between these two types of simulations was defined as the contribution of the corresponding factor to the changes in surface energy fluxes. For example, the contribution of LAI to LE variation during 1982-2016 was calculated as follows, This equation was also applied to H and other factors.

Model Evaluation
We used coefficient of determination (R 2 ), root mean square error (RMSE) and mean bias error (MBE) to assess the agreement between the predictions and the observations. The equations [51,52] are given as follows, where x i and y i are the observed and modeled values, respectively; n is the number of observed and modeled values; x is the mean value of observations. A value of one for the R 2 indicates a perfect match, and zero indicates no agreement at all. The smaller the value of RMSE and MBE, the better the agreement between the observations and simulations. The k-fold cross validation method [53] was applied to assess the capability and stability of RF to estimate surface energy fluxes. Particularly, the original dataset is randomly partitioned into k equal sized groups. Then, one of the groups was chosen for testing the model, and the remaining k−1 groups were used as training data. This cross-validation process is then repeated k times, until each of the groups was used once as the validation data. The value of k was fixed as 10, as it is commonly used [54].

Evaluation of the Capability of RF in Predicting LE and H
Based on the RF models that were constructed 10 times for each of the 10 PFTs by the k-fold cross validation method, the performance was assessed by R 2 , RMSE and MBE in Figure  For the H, the mean R 2 values were distributed between 0.53 in WET and 0.90 in WSA. The explanatory ability of RF in predicting H is stronger in ENF, EBF, MF, SH, WSA, SAV and GRA with the R 2 values higher than 0.8. However, the explanatory ability of RF is relatively weak in DBF, WET and CRO whose R 2 values are lower than 0.

Important Predictor Variables in Predicting LE and H
The LE and H are considerably constrained by predictor variables. The ranking of importance in these variables were identified by the Morris sensitivity method in this study. Guided by previous analyses, the RF models with the lowest RMSE for each PFT were chosen for the following analyses. As shown in Figure A1, taking GRA as an example, the elementary effects of predictor variables as quantified by β values fluctuated with the number of trajectories (r). However, these fluctuations become small and stable when the r reaches 2000 for both LE and H, which is sufficient for screening out the importance of predictor variables. As shown in Figure 3, the identified important variables are different for LE and H, and are also different across different PFTs. For the LE, specifically, the SR, TEM, LAI are the sensitive factors and are common to all 10 PFTs. VPD is identified as a sensitive one for most PFTs except for SH. PRE is considered to have limited effects in the forests, such as in EBF, ENF, DBF and in MF. For the H, the SR is the most important variable for all PFTs whose β values are higher than any other factors. Furthermore, VPD is identified as a sensitive factor for most PFTs, except for ENF and EBF. Except for the DBF, WET and CRO, the numbers of sensitive factors for LE are higher than the sensitive numbers for the H. These results suggest that the LE are more sensitive to ambient conditions and difficult to be captured than the H as it is constrained by more

Important Predictor Variables in Predicting LE and H
The LE and H are considerably constrained by predictor variables. The ranking of importance in these variables were identified by the Morris sensitivity method in this study. Guided by previous analyses, the RF models with the lowest RMSE for each PFT were chosen for the following analyses. As shown in Figure A1, taking GRA as an example, the elementary effects of predictor variables as quantified by β values fluctuated with the number of trajectories (r). However, these fluctuations become small and stable when the r reaches 2000 for both LE and H, which is sufficient for screening out the importance of predictor variables. As shown in Figure 3, the identified important variables are different for LE and H, and are also different across different PFTs. For the LE, specifically, the SR, TEM, LAI are the sensitive factors and are common to all 10 PFTs. VPD is identified as a sensitive one for most PFTs except for SH. PRE is considered to have limited effects in the forests, such as in EBF, ENF, DBF and in MF. For the H, the SR is the most important variable for all PFTs whose β values are higher than any other factors. Furthermore, VPD is identified as a sensitive factor for most PFTs, except for ENF and EBF. Except for the DBF, WET and CRO, the numbers of sensitive factors for LE are higher than the sensitive numbers for the H. These results suggest that the LE are more sensitive

Inter-Annual Variations in LE and H
Temporal and spatial patterns of global LE and H between 1982 and 2016 were estimated by RF models which were driven by gridded climatic datasets and satellite-based LAI (Figure 4)

Contributions of Individual Factors to H and LE Variations
Almost all the predictor variables show a significant increasing trend during 1982-2016, except for the wind ( Figure A2). Evidently, the largest increasing speed was presented in CO2, from 340 ppm in 1982 to more than 400 ppm in 2016, and increased by ~17%. LAI, TEM and VPD also show a strong increasing trend, as indicated by R 2 ranging from 0.74 to 0.89. However, the SR and PRE showed a slightly increasing trend, but still significantly (p < 0.05). Any changes in these variables can lead to the changes in the surface energy fluxes and further exert influences on surface energy balance. Here, the relative contributions of such individual factor variations to H and LE changes were quantified by conducting single factor experiments. The spatial patterns show that the elevated CO2 concentration played a profound impact on the changes of LE ( Figure 5). The elevated CO2 concentration contributed an increase in LE in the southern hemisphere and some parts in the midlatitudes of northern hemisphere, such as eastern parts of Asia and western parts of Europe, while the decrease of LE caused by CO2 mainly occurred in high-latitudes of northern hemisphere and some arid regions, such as middle parts of Asia and west parts of North America. Nearly 70.5% of the Earth experienced an increase of LE under the effect of LAI increase. Meanwhile, 64.4% of the Earth undergone an increase of LE caused by the TEM rise. The areas that experienced an increase or a decrease in LE caused by PRE changes are comparable (51% vs. 49%). VPD, SR and wind exerted limited impacts on the LE. Our results show that the increased CO2 contributed most to the decrease of LE among these factors with a value of −0.61 W/m 2 during 1982-2016, followed by the VPD and wind with the values of −0.032 W/m 2 and 0.028 W/m 2 , respectively. On the contrary, the increase of LAI causes the most increase in LE with a value of 0.57 W/m 2 , followed by TEM, PRE and SR with the values of 0.26 W/m 2 , 0.08 W/m 2 and 0.03 W/m 2 , respectively. The spatial maps of long-term trends are averaged by outputs of RF models that are driven by three different LAI datasets. Only the cells that are statistically significant at p < 0.05 level are shown. * and ** indicate the significance levels at 0.05 and 0.01, respectively.

Contributions of Individual Factors to H and LE Variations
Almost all the predictor variables show a significant increasing trend during 1982-2016, except for the wind ( Figure A2). Evidently, the largest increasing speed was presented in CO 2 , from 340 ppm in 1982 to more than 400 ppm in 2016, and increased by~17%. LAI, TEM and VPD also show a strong increasing trend, as indicated by R 2 ranging from 0.74 to 0.89. However, the SR and PRE showed a slightly increasing trend, but still significantly (p < 0.05). Any changes in these variables can lead to the changes in the surface energy fluxes and further exert influences on surface energy balance. Here, the relative contributions of such individual factor variations to H and LE changes were quantified by conducting single factor experiments. The spatial patterns show that the elevated CO 2 concentration played a profound impact on the changes of LE ( Figure 5). The elevated CO 2 concentration contributed an increase in LE in the southern hemisphere and some parts in the mid-latitudes of northern hemisphere, such as eastern parts of Asia and western parts of Europe, while the decrease of LE caused by CO 2 mainly occurred in high-latitudes of northern hemisphere and some arid regions, such as middle parts of Asia and west parts of North America. Nearly 70.5% of the Earth experienced an increase of LE under the effect of LAI increase. Meanwhile, 64.4% of the Earth undergone an increase of LE caused by the TEM rise. The areas that experienced an increase or a decrease in LE caused by PRE changes are comparable (51% vs. 49%). VPD, SR and wind exerted limited impacts on the LE. Our results show that the increased CO 2 contributed most to the decrease of LE among these factors with a value of −0.61 W/m 2 during 1982-2016, followed by the VPD and wind with the values of −0.032 W/m 2 and 0.028 W/m 2 , respectively. On the contrary, the increase of LAI causes the most increase in LE with a value of 0.57 W/m 2 , followed by TEM, PRE and SR with the values of 0.26 W/m 2 , 0.08 W/m 2 and 0.03 W/m 2 , respectively.
Contrast to the LE, the elevated CO 2 concentration contributed a decrease in H in the southern hemisphere and some parts in the mid-latitudes of northern hemisphere and the increase of H caused by CO 2 mainly occurred in high-latitudes of northern hemisphere ( Figure 6). There are 84.9% and 61.    dominant controls. The changes of both LE and H dominated by elevated CO2 concentration occurred in most parts of the Earth, which account for 54.8% and 52.0%, respectively (Figure 7). For LE, the LAI controls were mainly distributed in eastern parts of China and Brazil and southeastern parts of North America. Most regions of Australia and southeastern parts of Africa were dominated by PRE. For the H, only some eastern parts of South America and middle parts of Africa were dominated by LAI.

Comparisons with Previous Studies
The understanding of global-scale surface energy partitioning is still a challenging problem due to the complex interactions between the atmosphere and the surface as well as their spatial inhomogeneity. Machine learning methods has greatly enhanced our ability to predict and monitor the surface energy fluxes variations, by avoiding describing complicated biophysical or biochemical mechanisms. Fortunately, the established RF models by feeding site-observed monthly climatic and vegetation factors have a high potential in estimating the LE and H, which overall accounted for 0.66~0.89 variations for LE and 0.53~0.90 for H, respectively. Our estimations show a highly comparable performance to other studies. The detailed comparisons between this study and other products are listed in Table 1. For example, based on a common period between 2001 and 2010, we estimated a mean value of 42.27 ± 1.25 W/m 2 for LE and 36.52 ± 0.52 W/m 2 for H, respectively, which is close to 44.48 ± 0.16 W/m 2 for LE and 37.08 ± 0.33 W/m 2 for H estimated by Jung et al. [29], which is averaged by multiple machine learning methods. However, these estimations are inconsistent with our previous study which used a process-based land surface model (i.e., Common Land Model (CoLM)) [55]. The LE is higher than CoLM estimate with a value of 40.37 ± 1.21 W/m 2 , but the H is lower than CoLM-based estimate 37.14 ± 1.11 W/m 2 . The differences may come from a systematical underestimation of albedo by CoLM, which affect the net radiation received by the surface and

Comparisons with Previous Studies
The understanding of global-scale surface energy partitioning is still a challenging problem due to the complex interactions between the atmosphere and the surface as well as their spatial inhomogeneity. Machine learning methods has greatly enhanced our ability to predict and monitor the surface energy fluxes variations, by avoiding describing complicated biophysical or biochemical mechanisms. Fortunately, the established RF models by feeding site-observed monthly climatic and vegetation factors have a high potential in estimating the LE and H, which overall accounted for 0.66~0.89 variations for LE and 0.53~0.90 for H, respectively. Our estimations show a highly comparable performance to other studies. The detailed comparisons between this study and other products are listed in Table 1. For example, based on a common period between 2001 and 2010, we estimated a mean value of 42.27 ± 1.25 W/m 2 for LE and 36.52 ± 0.52 W/m 2 for H, respectively, which is close to 44.48 ± 0.16 W/m 2 for LE and 37.08 ± 0.33 W/m 2 for H estimated by Jung et al. [29], which is averaged by multiple machine learning methods. However, these estimations are inconsistent with our previous study which used a process-based land surface model (i.e., Common Land Model (CoLM)) [55]. The LE is higher than CoLM estimate with a value of 40.37 ± 1.21 W/m 2 , but the H is lower than CoLM-based estimate 37.14 ± 1.11 W/m 2 . The differences may come from a systematical underestimation of albedo by CoLM, which affect the net radiation received by the surface and further exert an influence on the energy partitioning between LE and H. Forzieri et al. [45] also reported a systematic model underestimation of LE by assessing 10 state-of-the-art LSMs simulations. They reported that the overestimation of the sensitivity to CO 2 and the underestimation of the biophysical response of ecosystems to changes in water availability in current LSMs may lead to the underestimations of LE in LSMs. Besides, we further compared LE with three other ET products derived from Global Land Evaporation Amsterdam Model (GLEAM), MODIS improved ET algorithm and Model tree ensemble (MTE) methods by combining with latent heat of vaporization (λ = 2.45 MJ kg −1 ). These authors estimated a mean annual LE of 39.54 ± 0.51, 38.51 ± 0.50 and 38.84 ± 0.40 W/m 2 for GLEAM, MODIS and MTE, respectively, which are lower than our estimates. The comparisons above suggest a large uncertainty exists in current widely used LE and H products. Therefore, intercomparisons of various datasets estimated by different methods are needed for a better understanding of surface energy partitioning in the future studies. And this study provides a valuable contribution to the surface energy flux prediction.

The Sensitivity of Predictor Variables to H and LE Variations
Assessing the sensitivity of individual factors to predict the LE and H will be helpful to identify which one among the several factors is important in determining H and LE changes. This provides a feasible and efficient way to monitor LE or H changes by focusing on limited variables. The important factors identified by Morris sensitivity method differ in determining variations of LE and H. We found that SR, LAI and TEM are important in LE, while SR is important in H across 10 PFTs. The important factor SR is common to both LE and H and to all 10 PFTs. This is because SR controls energy supply for both LE and H, and the magnitudes of their change are directly affected by SR. Another possible reason may come from the established RF models which are trained by monthly observation. Most of the stations (115 out of 152) are located in temperate regions with periodically or seasonally dynamic values that may give rise to the high sensitivity to SR.
PRE and TEM are also two important climatic factors which affect LE, because they regulate moisture supply and moisture holding capacity of the air [59,60]. However, the forests are not sensitive to the PRE compared to other PFTs. PRE regulates moisture supply through governing the soil water content [61], which becomes more significant in arid/semi-arid ecosystems where the precipitation is the primary channel to provide water resource [56]. This is also demonstrated by our results when the sensitivity is projected into different PFTs. The higher sensitivities occurred in relatively arid ecosystems such as in WSA and SAV where the β values are 2.06 and 3.53, respectively. While lower sensitivities are mainly distributed in relatively humid ecosystems, such as in forests (i.e., ENF, ENF, DBF and MF in this study), where the β values are lower than 0.85. LAI is another sensitive factor that could exert influence on both LE and H, especially for LE. It is reasonable to say that LAI impacts the exchange of energy fluxes between the land and the atmosphere through albedo, because a larger LAI is associated with a lower albedo, and more energy will be absorbed by the surface [62]. Besides, LAI is a key representative of leaf area which is expected to control the transpiration area [63], and thus, to affect the LE variations. CO 2 was expected to exert a critical impact on the the surface energy fluxes because of its greenhouse effect by enhancement of downward long-wave radiation, and also because its elevated concentration would lead to a closure of the stomatal apertures, leading to a decrease of transpiration [64]. However, its importance was not screened out for LE on forests, WSA and CRO, but only occurs in SH, SAV, GRA and WET. The sensitivity of surface energy partitioning to climatic and vegetation factors are divergent with respect to different PFT-scales. This finding is supported by Valayamkunnath et al. [65] which suggested that surface energy partitioning differs among different ecosystems due to heterogeneity in available energy, soil moisture conditions and vegetation characteristics. Beside the factors analyzed in this study, other factors are also reported to have an important role in influencing surface energy fluxes changes. For example, Yao et al. [33] found that snow cover, soil content and soil texture have a large influence on the surface energy balance in high-elevation areas of the Tibet plateau. Therefore, more localized and in-depth analyses are required in the future.

Inter-Annual Trends of LE and H and Their Dominant Factors
Our findings show that a contrasting trend was presented for LE and H: a positive trend in LE (0.054~0.073 W/m 2 per year) and a negative trend in H (−0.005~−0.013 W/m 2 per year) during 1982-2016 and these contrasting trends are largely controlled by the variations of CO 2 . These findings are consistent and comparable with previous studies using statistical algorithm, or process-based biophysical models [66,67]. For example, Zhang et al. [35] used a remote-sensing-driven ET algorithm and found a significant upward global trend of LE at a rate of 0.068 W/m 2 per year. Zeng et al. [34] also found a positive trend of LE at a rate of 0.059 W/m 2 per year during 1982-2011 by using an ensemble of several physics-based formula reconstructions. All these studies show that the positive trend of LE is driven by the global greening (indicated by NDVI or LAI) and their growth rates also fall within our estimates. However, our findings seem to be inconsistent with FLUXCOM product [29], which shows a much lower upward global trend in LE at a rate of 0.004 W/m 2 per year but not significantly (p > 0.05), and also an opposite trend in H at a rate of 0.01 W/m 2 per year. We note that the discrepancy between our findings and their results may come from the predictor variables. The effects of CO 2 were not taken into account in their estimates but were in our established RF models. This finding suggests that CO 2 is the most influencing factor in determining surface energy partitioning. Zhang et al. [66] quantified the contributions of changes in CO 2 to transpiration and evaporation during 1982-2012 using Community Atmosphere-Biosphere Land Exchange model (CABLE). They reported that the increasing CO 2 concentrations reduced transpiration 0.17 mm per year and increased evaporation by 0.04 mm per year, and hence reduced LE by approximate 0.01 W/m 2 per year during 1981-2012 which is comparable to 0.017 W/m 2 per year in our study. Therefore, CO 2 are highly recommended to be investigated in future researches given its important role in affecting surface energy fluxes.

Conclusions
The dramatic climate change and wide spread of global greening in recent past decades are expected to alter the surface energy partitioning. However, large uncertainties exist in predicting both LE and H, as reported in previous studies based on statistical methods, physical-based algorithms or process-based models. The machine learning method of RF was assessed to predict LE and H and their long-term trends and their dominant factors were analyzed. From this study, we conclude that: (1) The established RF models in this study have a high potential in predicting and monitoring the surface energy flux variations. However, their predicted performance is different among PFTs.    (2) The important influencing factors are different in LE and H based on Morris sensitive method. Particularly, The SR, LAI and TEM are important factors in LE, while SR is important in H across 10 PFTs in this study. Besides, the important factors are divergent with respect to PFT-level. (3) A contrasting trend was presented for LE and H: a positive trend in LE with a rate of 0.054~0.073 W/m 2 per year and a negative trend in H with a rate of −0.005~−0.013 W/m 2 per year during 1982-2016 and these contrasting trends are largely controlled by the variation of CO2. Our study emphasizes the need to better account for the influence of elevated CO2 in energy partitioning to improve surface energy fluxes estimations in future studies.