Uncertainty of Partial Dependence Relationship between Climate and Vegetation Growth Calculated by Machine Learning Models

: As more machine learning and deep learning models are applied in studying the quantitative relationship between the climate and terrestrial vegetation growth, the uncertainty of these advanced models requires clariﬁcation. Partial dependence plots (PDPs) are one of the most widely used methods to estimate the marginal effect of independent variables on the predicted outcome of a machine learning model, and it is regarded as the main basis for conclusions in relevant research. As more controversies regarding the reliability of the results of the PDPs emerge, the uncertainty of the PDPs remains unclear. In this paper, we experiment with real, remote sensing data to systematically analyze the uncertainty of partial dependence relationships between four climate variables (temperature, rainfall, radiation, and windspeed) and vegetation growth, with one conventional linear model and six machine learning models. We tested the uncertainty of the PDP curves across different machine learning models from three aspects: variation, whole linear trends, and the trait of change points. Results show that the PDP of the dominant climate factor (mean air temperature) and vegetation growth parameter (indicated by the normalized difference vegetation index, NDVI) has the smallest relative variation and the whole linear trend of the PDP was comparatively stable across the different models. The mean relative variation of change points across the partial dependence curves of the non-dominant climate factors (i.e., radiation, windspeed, and rainfall) and vegetation growth ranged from 8.96% to 23.8%, respectively, which was much higher than those of the dominant climate factor and vegetation growth. Lastly, the model used for creating the PDP, rather than the relative importance of these climate factors, determines the ﬂuctuation of the PDP output of these climate variables and vegetation growth. These ﬁndings have signiﬁcant implications for using remote sensing data and machine learning models to investigate the quantitative relationships between the climate and terrestrial vegetation.


Introduction
Growth and functionality of terrestrial vegetation are highly affected by climate variables [1][2][3]. Quantifying the effects of the climate on terrestrial vegetation growth has Remote Sens. 2023, 15, 2920 2 of 12 been a hot topic in climate change research [4][5][6][7][8][9][10]. In the last two decades, remote sensing technology, machine learning, and deep learning algorithms have made it possible to elaborate the quantitative relationship between the climate and vegetation growth on a global scale [11][12][13]. However, uncertainties accompanied with the new models and technologies have also been perceived as one of the main challenges for understanding the relationship between the climate and terrestrial vegetation growth, as well as implementing climate mitigation action strategies [14,15].
Compared with the conventional statistical methods, machine learning and deep learning models estimate the nonlinear relationships between the climate and vegetation parameters that are not summarized by a small number of interpretable parameters. Instead, their model structures often contain huge numbers of parameters from which the quantitative relationship derives [16,17]. Despite the "black box" reputation of machine learning and deep learning models regarding interpretation, there are plenty of insights to be gained by investigating these models at different levels (e.g., interpretations at system, variable, and individual prediction levels, respectively) [18][19][20].
At the system level, the fitting performance of machine learning models is evaluated with metrics such as r 2 (the proportion of the variance of the observed values explained by the predictions) or the root mean squared error (RMSE), which indicates how well the covariates can quantitatively explain the response variable based on the specified models. At the variable level, the variable importance can be assessed, i.e., examining the importance of the interactions between pairs of covariates and the functional responses of these covariates. The first step is to calculate the factor importance for the machine learning models [11,17,21,22]. Once the essential covariates have been identified based on the factor importance, it is useful to further investigate the variable level relationship between the covariate and the response. For example, the partial dependence plot (PDP) shows the marginal effect that the targeted features have on the predicted outcome of a machine learning model. PDPs are created by evaluating the mean response of the dependent variable to the change in the targeted covariate. It shows different patterns of quantitative relationship (linear, monotonic, or more complex) between the targeted covariate and the dependent variable [23,24]. For studies focusing on quantifying the impact of climate change on vegetation growth using machine learning models, PDPs are the main basis from which the core conclusions can be formulated [4,[25][26][27].
However, while other uncertainties of using machine learning models in estimating quantitative impacts of climate on vegetation growth have been examined, the uncertainty associated with the partial dependence relationship has not been fully investigated. In this study, we tested the variation of the partial dependence relationship between the normalized difference vegetation index (NDVI) of the northern hemisphere deciduous broad-leaved forest and four influencing climatic factors (temperature, rainfall, radiation, and windspeed) across different machine learning and deep learning models, conducting a detailed analysis of the variation characteristics of the PDP by undertaking different statistical methods. This study is vital for understanding the uncertainties and assessing the accuracy behind the emerging "black box" models.

Method
Here we conducted an experiment to simulate a study calculating the partial dependence relationship between the climate parameters and vegetation growth based on remote sensing data and several machine learning models. NDVI was chosen as the proxy for vegetation growth, while the independent climate factors are annual mean temperature (referred to as temperature), annual rainfall (referred to as rainfall), shortwave radiation (referred to as radiation), and windspeed. Then, we imported seven statistical models (including six different machine learning and deep learning models, with a conventional multi-linear model as the comparison group) to calculate the partial dependence relationship between the four climate factors and NDVI (Table 1).  [41,42] We selected deciduous broad-leaved forests in the northern hemisphere as our study region due to its relatively concentrated distribution range, with similar interactive mechanisms observed between the climate and local vegetation ( Figure 1). The land cover data was based on the product MCD12C1 using annual data covering the period 2001-2016 [43]. We first upscaled the dataset from 0.05 • to 0.5 • of spatial resolution. For each 0.5 • pixel, we assigned the land cover type for which the percentage was above 50% for all the subpixels. We extracted the pixels where the land cover type stayed constant over the study period (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016) (including six different machine learning and deep learning models, with a conventional multi-linear model as the comparison group) to calculate the partial dependence relationship between the four climate factors and NDVI (Table 1). learning rate is 0.008 [41,42] We selected deciduous broad-leaved forests in the northern hemisphere as our study region due to its relatively concentrated distribution range, with similar interactive mechanisms observed between the climate and local vegetation ( Figure 1). The land cover data was based on the product MCD12C1 using annual data covering the period 2001-2016 [43]. We first upscaled the dataset from 0.05° to 0.5° of spatial resolution. For each 0.5° pixel, we assigned the land cover type for which the percentage was above 50% for all the subpixels. We extracted the pixels where the land cover type stayed constant over the study period (2001-2016).    The dataset NDVI (MOD13C1, version 6.1, 2001-2016) was downloaded from the MODIS product website (https://lpdaac.usgs.gov/products/mod13c1v006/, accessed on 1 March 2022). The spatial resolution was 5 km and temporal interval was 16 days, respectively [44,45]. We resampled this dataset to the spatial resolution of 0.5 • (in accordance with the meteorological dataset detailed in 2.1.2 below) and extracted the time series NDVI data in deciduous broad-leaved forests in the northern hemisphere, which acted as the dependent variable indicating vegetation growth.

WFDEI Meteorological Dataset
The European Union Water and Global Change project (http://www.eu-watch.org, accessed on 1 March 2022), provided a gridded European Union Water and Global Change-Forcing-Data-ERA-Interim (WFDEI) data product [46]. The WFDEI data product contains several kinds of meteorological variables with a spatial resolution of 0.5 • and temporal resolution of 3 h (2001-2016). We resampled the original data to form meteorological data with a temporal resolution of 16 days, in accordance with the NDVI dataset (2.1.1). The independent meteorological variables influencing NDVI of vegetation included temperature, rainfall, radiation, and windspeed.
After unifying the spatio-temporal resolution for all remote sensing products, the PDP was calculated and developed based on the value of all NDVI pixels and four climate factors at the 16 days temporal scale.

Methods for Analysis
The partial dependence plot is a graphical representation of the marginal effect of a feature(s) on the predicted outcome of a machine learning model. It shows how the output variable changes when one or more features are changed. PDPs are useful for understanding the relationship between the output variable and the features in a machine learning model. The process of calculating and creating PDPs can be described as controlling the nontargeted features while shifting the targeted feature to quantitatively estimate the variation of the model output. We repeated the same step for each independent feature and created PDPs for the machine learning models, respectively.
The definition of partial dependence function for regression is: where x S are the targeted feature(s) selected to create the PDP and X C are the other features impacting the result of the machine learning modelf . Namely, the feature x s is the independent variable for which we want to know the effect on model f predictions (dependent variable). Together, the feature vectors x S and X C combine the total feature space x (independent variable). The partial dependence function runs by marginalizing the machine learning model f output over the distribution of the features in X C . Therefore, this function shows the relationship between the features in x S that we are interested in and the result of the model prediction (dependent variable). By marginalizing over the other features, the above function only depends on features in x S . Then the function of f S is calculated using Monte Carlo method: Based on the above equation, for given value(s) of the features x S , the average marginal effect on the prediction is calculated. In this formula, x (i) C are the other feature values from the dataset for the other features excluding the selected target feature, and n is the number of instances in the dataset [47,48].
After producing PDPs for the seven statistical models, we analyzed the variation of the PDP curve from four perspectives: (1) The overall variation of the curve shape for the PDP curves. We set different percentile intervals for calculating the variation of the response of the NDVI to the change of the climate factors across the different models used.
Remote Sens. 2023, 15, 2920 5 of 12 (2) The variation of the linear trend for PDP curves. First, we performed linear regression for each PDP based on the least squares principle: where y and x refer to the NDVI and each climate factor, respectively. The linear trend is the slope (a) of the fitting line, which demonstrates the macro influence (positive or negative, significant or not) of climate on NDVI in the PDP.
(3) Characteristics of the change points (which were evaluated by setting different threshold values for detecting one, two, or three change points, respectively) for each PDP, using MANOVA (multivariate analysis of variance) to test the influencing factors on the total number of change points in each PDP. Change points indicate abrupt variations within the PDP curves.
(4) Fluctuation of the PDP curves in the frequency domain based on the detrended Fourier decomposition, excluding the influence of the linear trend of the PDP. Fluctuations reveal the fast-varying parts in the PDP.
All steps were processed using relevant statistical packages (e.g., RegressionTree and newgrnn for building BRT and GRNN models, respectively) in Matlab 2021b.

Overall Variation of the PDPs
The PDP of each climate factor and NDVI have similar curve features among the seven statistical models (detailed in Supplementary Material). Among the four influencing climate factors, NDVI was found to have the most sensitivity to temperature change ( Figure 2). The PDP of NDVI and temperature follows an approximately S-shaped curve. In contrast, NDVI has a varied response to changes in rainfall, radiation, and windspeed. (2) The variation of the linear trend for PDP curves. First, we performed linear regression for each PDP based on the least squares principle: where y and x refer to the NDVI and each climate factor, respectively. The linear trend is the slope (a) of the fitting line, which demonstrates the macro influence (positive or negative, significant or not) of climate on NDVI in the PDP.
(3) Characteristics of the change points (which were evaluated by setting different threshold values for detecting one, two, or three change points, respectively) for each PDP, using MANOVA (multivariate analysis of variance) to test the influencing factors on the total number of change points in each PDP. Change points indicate abrupt variations within the PDP curves.
(4) Fluctuation of the PDP curves in the frequency domain based on the detrended Fourier decomposition, excluding the influence of the linear trend of the PDP. Fluctuations reveal the fast-varying parts in the PDP.
All steps were processed using relevant statistical packages (e.g., RegressionTree and newgrnn for building BRT and GRNN models, respectively) in Matlab 2021b.

Overall Variation of the PDPs
The PDP of each climate factor and NDVI have similar curve features among the seven statistical models (detailed in Supplementary Material). Among the four influencing climate factors, NDVI was found to have the most sensitivity to temperature change ( Figure 2). The PDP of NDVI and temperature follows an approximately S-shaped curve. In contrast, NDVI has a varied response to changes in rainfall, radiation, and windspeed.

Linear Trend of PDP
The similarity in the linear trend of PDP was calculated using different statistical models and varied among the different climate factors (Figure 3). The linear trend had a positive slope for the PDP of NDVI and temperature across all seven models. However, the linear trend for the PDP of NDVI and radiation was a significantly negative slope in the multi-linear and BP models, but positive in all the other models. With the exception of the climate factor of temperature, the other three influencing factors had this unconformity of positive or negative slopes of PDP across the different models.
the value of the normalized climate factors was divided into five equal intervals, for which the percentiles were annotated and the SD of PDP in each interval were calculated, respectively).

Linear Trend of PDP
The similarity in the linear trend of PDP was calculated using different statistical models and varied among the different climate factors (Figure 3). The linear trend had a positive slope for the PDP of NDVI and temperature across all seven models. However, the linear trend for the PDP of NDVI and radiation was a significantly negative slope in the multi-linear and BP models, but positive in all the other models. With the exception of the climate factor of temperature, the other three influencing factors had this unconformity of positive or negative slopes of PDP across the different models.

Characteristics and Influencing Factors of Change Points
In the three scenarios for detecting the different numbers of change points for each PDP, the average relative variations of the four climate factors was 15.78%, 12.60%, and 10.73%, for one, two, and three change points, respectively ( Table 2). Among the four climate factors, temperature had the smallest relative variation of 11.5%, 4.4%, and 5.0%, while rainfall was found to have the largest relative variation of 21.4%, 23.8%, and 19.8%, respectively.

Characteristics and Influencing Factors of Change Points
In the three scenarios for detecting the different numbers of change points for each PDP, the average relative variations of the four climate factors was 15.78%, 12.60%, and 10.73%, for one, two, and three change points, respectively ( Table 2). Among the four climate factors, temperature had the smallest relative variation of 11.5%, 4.4%, and 5.0%, while rainfall was found to have the largest relative variation of 21.4%, 23.8%, and 19.8%, respectively.  Among the seven statistical models, the PDP calculated by the LSTM was found to have the most change points, with an average value of 240 (Table S1), followed by BRT and RF (with average values of 60 and 87, respectively). MANOVA analyzed the impact of the model and climate factor on the total number of change points for the PDPs, using every local maximum or minimum point in each PDP curve. The results indicated that the statistical model used has the most significant influence (p < 0.0001), and the climate factor is not the determinant (p > 0.1; Table 3).

Fluctuation of PDP in the Frequency Domain
The results of the detrended Fourier decomposition demonstrates the fast-varying components and degree of detail of the PDP of the climate factors and NDVI. Among the seven statistical models, the PDP calculated by LSTM model for all four pairs of climate factors and NDVI had the highest value of P1 (red line in the four subplots; Figure 4) in both the medium and high frequency domains (higher than 10 Hz), indicating that the fluctuation of the curves is more evident than others. Compared with the different climate factors, the statistical model used was more dominant in determining the fluctuation traits of the PDPs. This phenomenon can also be seen in Supplementary Figure S1

Discussion
The development of remote sensing (from the data resources perspective) and machine learning technology (from the methodology perspective) have greatly promoted this pursuit of better fitting ability and quantitative interpretation for environmental remote sensing research. While the uncertainty of interpretation for factor importance has been studied, our results showed that the uncertainty of machine learning interpretation at a more detailed level needs to be addressed as well. In our experiment, the determination coefficient of all seven models was over 0.8, thereby showing an ideal fitting accuracy (Table S2). However, the PDPs for the climate factors and NDVI revealed evident differences in terms of the curve shape, linear trend, change points, and fluctuation traits. Ignoring this uncertainty may induce significant divergence and therefore reduce the credibility of the associated conclusions.
Based on the results above, we present the recommended steps to be followed for controlling the uncertainty in calculating and creating PDPs between the environmental factors (e.g., climate variables) and vegetation parameters using machine learning models in future studies. Firstly, the whole fitting accuracy (indicated by RMSE and R 2 and other alternative proxies) for the selected machine learning models needs to be evaluated. Once the fitting performance meets the requirement, the dominant and other important factors should then be identified for the second step. The index for this step can be chosen as either the Pearson's correlation coefficient, partial correlation coefficient, or factor importance. In our experiment, the mean temperature was found to have the maximum Pearson's correlation coefficient (0.87) with NDVI, after excluding the multicollinearity of climate factors ( Figure S2, all the other Pearson's correlation coefficients between the different climate factors and NDVI were less than 0.6). The dominance of temperature on vegetation growth in the study region was found to be in accordance with previous studies [49,50]. The last step is to rationally use PDPs for relative analysis and drawing conclusions. The more dominant the variable is, the more stable the resultant PDP of this factor (e.g., temperature) and the dependent factor (e.g., NDVI). For example, the PDPs of temperature and NDVI have the least relative variation, compared with the other three climate factors (rainfall, radiation, and windspeed). The response of NDVI to temperature change calculated by all seven models followed an S-shaped curve. In contrast, the PDP of the other three climate factors and NDVI had more uncertainty in the curve shape, with the average shape being relatively flat ( Figure 2). Additionally, the linear trend of the PDPs of temperature and NDVI was more stable than other climate factors (Figure 3). This relative stability in the PDP of these dominant factors and NDVI was also shown in the change points of the

Discussion
The development of remote sensing (from the data resources perspective) and machine learning technology (from the methodology perspective) have greatly promoted this pursuit of better fitting ability and quantitative interpretation for environmental remote sensing research. While the uncertainty of interpretation for factor importance has been studied, our results showed that the uncertainty of machine learning interpretation at a more detailed level needs to be addressed as well. In our experiment, the determination coefficient of all seven models was over 0.8, thereby showing an ideal fitting accuracy (Table S2). However, the PDPs for the climate factors and NDVI revealed evident differences in terms of the curve shape, linear trend, change points, and fluctuation traits. Ignoring this uncertainty may induce significant divergence and therefore reduce the credibility of the associated conclusions.
Based on the results above, we present the recommended steps to be followed for controlling the uncertainty in calculating and creating PDPs between the environmental factors (e.g., climate variables) and vegetation parameters using machine learning models in future studies. Firstly, the whole fitting accuracy (indicated by RMSE and R 2 and other alternative proxies) for the selected machine learning models needs to be evaluated. Once the fitting performance meets the requirement, the dominant and other important factors should then be identified for the second step. The index for this step can be chosen as either the Pearson's correlation coefficient, partial correlation coefficient, or factor importance. In our experiment, the mean temperature was found to have the maximum Pearson's correlation coefficient (0.87) with NDVI, after excluding the multicollinearity of climate factors ( Figure S2, all the other Pearson's correlation coefficients between the different climate factors and NDVI were less than 0.6). The dominance of temperature on vegetation growth in the study region was found to be in accordance with previous studies [49,50]. The last step is to rationally use PDPs for relative analysis and drawing conclusions. The more dominant the variable is, the more stable the resultant PDP of this factor (e.g., temperature) and the dependent factor (e.g., NDVI). For example, the PDPs of temperature and NDVI have the least relative variation, compared with the other three climate factors (rainfall, radiation, and windspeed). The response of NDVI to temperature change calculated by all seven models followed an S-shaped curve. In contrast, the PDP of the other three climate factors and NDVI had more uncertainty in the curve shape, with the average shape being relatively flat (Figure 2). Additionally, the linear trend of the PDPs of temperature and NDVI was more stable than other climate factors (Figure 3). This relative stability in the PDP of these dominant factors and NDVI was also shown in the change points of the curves ( Table 2). This was due to the high sensitivity of NDVI to temperature variation, which was Remote Sens. 2023, 15, 2920 9 of 12 shown by the corresponding PDP with a clear curve structure and more variation tolerance. Additionally, when analyzing the details of the PDP curves, the uncertainty brought by different machine learning models cannot be neglected. For all pairs between the four climate factors and NDVI, the resultant PDP curve from the LSTM models were always found to have the most change points, while the curves created by BP and GRNN were much smoother (with no change points of PDP detected, Table S1). This phenomenon was mainly caused by the specific structure of each machine learning model [51]. At last, PDP curves for the different machine learning models showed less variation in the midrange of independent variables (Figure 2). This is because the training dataset for machine learning models tends to be larger in this data region than those distributed at both ends. Thus, the fitting accuracy was higher and more stable, which further influenced the traits of the PDP.
Following the invention of new machine learning models, the corresponding methods for interpreting the quantitative relationships inside each model were raised successively. For artificial neural networks, several alternative methods can be selected to calculate the relative importance of the included independent factors in each network. Previous studies showed that both the value of relative importance and the rank of independent variables based on their impact differed greatly among the methods [21,22]. Recently, more novel methods were employed in research analyzing the quantitative relationship in machine learning models from different perspectives. For example, some model-diagnostic, local explanation approaches were designed to explain any given machine learning models displaying "black box" characteristics [52][53][54][55][56][57][58]. These methods can explain the individual predictions of any classifier in an interpretable manner by learning an interpretable model (e.g., linear model) locally around each prediction. However, few studies have yet mentioned the potential uncertainty they may have on the result of interpretation. The results and analyzes in this study imply that the exclusion of this uncertainty brought by the machine learning methods may reduce the credibility of the associated conclusions. In addition to uncertainty brought by data resources and model parameters, we suggest that the variation across the different machine learning and deep learning models should be highly emphasized.
Several other factors which were not evaluated in this study may also impact the variation of the PDP in machine learning models, for example, the hyper-parameter settings and training mode of each model. Hyper-parameters are parameters that are not learned by the model during training but are set prior to training [59,60]. In the context of artificial neural networks, hyperparameters are used to control the architecture of the network and the learning process. Examples of hyperparameters in neural networks include the learning rate, number of hidden layers, number of neurons in each hidden layer, activation function, regularization parameter etc. More systematic studies may therefore be considered in subsequent research.
In the future, when applying machine learning and deep learning methods in environmental remote sensing research, researchers are strongly recommended to evaluate the uncertainty of both the models and the method used. This evaluation of uncertainty should include more detailed examinations of the quantitative relationships calculated by the machine learning and deep learning models (e.g., using the Monte Carlo method to estimate the uncertainty across models in different data intervals). Only in this way can the research conclusions be robust and more significant for reference.

Conclusions
Driven by remote sensing technology and machine learning models, investigations into the quantitative relationships between the climate and terrestrial vegetation in global change studies have emerged in an unprecedented pattern. As an essential basis for drawing relative conclusions, the stability of the PDP has a direct impact on the credibility of the research. Overall, the relative importance of the influencing factors, the selected model structures, and the specific interval in the plot are the three key elements affecting the uncertainty of the PDP curve, which should be fully evaluated. We also provided several suggestions for controlling the uncertainty in calculating PDP, including considering the performance of the machine learning model, the selected index, the factor impact of each influencing factor and details of PDP interpretation. As large numbers of new algorithm structures and relationship interpretation methods have been proposed, reasonable exploration of the uncertainty of the different methods used and the analysis of its influencing factors are indispensable for future studies.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/rs15112920/s1, Figure S1. Partial dependence plots of climate factors and NDVI calculated by different statistical models. Figure S2. Cross correlation coefficient for each pair of climate factors and NDVI (T is mean air temperature). Table S1. Total number of change point for each partial dependence plot. Table S2. Mean determination coefficient of each statistical model.