Climate Change: Linear and Nonlinear Causality Analysis

The goal of this study is to detect linear and nonlinear causal pathways toward climate change as measured by changes in global mean surface temperature and global mean sea level over time using a data-based approach in contrast to the traditional physics-based models. Monthly data on potential climate change causal factors, including greenhouse gas concentrations, sunspot numbers, humidity, ice sheets mass, and sea ice coverage, from January 2003 to December 2021, have been utilized in the analysis. We first applied the vector autoregressive model (VAR) and Granger causality test to gauge the linear Granger causal relationships among climate factors. We then adopted the vector error correction model (VECM) as well as the autoregressive distributed lag model (ARDL) to quantify the linear long-run equilibrium and the linear short-term dynamics. Cointegration analysis has also been adopted to examine the dual directional Granger causalities. Furthermore, in this work, we have presented a novel pipeline based on the artificial neural network (ANN) and the VAR and ARDL models to detect nonlinear causal relationships embedded in the data. The results in this study indicate that the global sea level rise is affected by changes in ice sheet mass (both linearly and nonlinearly), global mean temperature (nonlinearly), and the extent of sea ice coverage (nonlinearly and weakly); whereas the global mean temperature is affected by the global surface mean specific humidity (both linearly and nonlinearly), greenhouse gas concentration as measured by the global warming potential (both linearly and nonlinearly) and the sunspot number (only nonlinearly and weakly). Furthermore, the nonlinear neural network models tend to fit the data closer than the linear models as expected due to the increased parameter dimension of the neural network models. Given that the information criteria are not generally applicable to the comparison of neural network models and statistical time series models, our next step is to examine the robustness and compare the forecast accuracy of these two models using the soon-available 2022 monthly data.


Introduction
In recent years, climate change, especially global warming and sea level rise, has caused increasing alarms among global communities. Mainstream research works based on physics models have identified the potential causal factors in global temperature rise, including, notably, the increased greenhouse gas emission due to human activities [1][2][3][4] and the increased humidity [5,6]. Meanwhile, the physical models have attributed the global sea level rise to glacier and ice sheet mass loss [7][8][9], reduced sea ice coverage [10], and ocean thermal expansion [11,12]. Besides the main stream research works based on physics models, there are also a few works utilizing statistical models, including: linear regression model [13]; structural equation modeling, which is indeed a system of intercorrelated linear regression equations [14,15]; and Granger causality and cointegration analysis, which are traditional time series methods for detecting linear causal relationships [16][17][18][19][20]. There is also a rising trend of utilizing machine learning methods for climate and weather studies in Stats 2023, 6 627 recent years [21][22][23][24][25]; however, none features an explicit causal inference for the pathways leading to global warming and sea level rise as we have presented here. While we have a substantial amount of faith in the existing physics models and the related conclusions, it would be doubly reassuring if we could reach the same conclusions using data-based analytical methods only including both statistical time series models and machine learning methods. This is exactly what we have performed in this work, using these purely databased methods to identify potential climate change causal factors and to examine whether these causal relationships are linear or nonlinear.
Our main contributions to this study are as follows. Firstly, we have conducted a thorough analysis of linear causal relationships and quantified the long-run cointegration relationships and the short-run dynamics related to global temperature change and sea level change. Secondly, we have developed a novel pipeline to combine time series models such as VAR and ARDL with neural network structures to detect nonlinear causality relationships among the time series data of climate factors. Thirdly, we have applied this pipeline to detect nonlinear causal pathways toward climate change in terms of rising global temperature and sea level and compared the model goodness-of-fit between the nonlinear machine learning models and the linear time series models. (Figure 1).
Stats 2023, 6, FOR PEER REVIEW 2 analysis, which are traditional time series methods for detecting linear causal relationships [16][17][18][19][20]. There is also a rising trend of utilizing machine learning methods for climate and weather studies in recent years [21][22][23][24][25]; however, none features an explicit causal inference for the pathways leading to global warming and sea level rise as we have presented here. While we have a substantial amount of faith in the existing physics models and the related conclusions, it would be doubly reassuring if we could reach the same conclusions using data-based analytical methods only including both statistical time series models and machine learning methods. This is exactly what we have performed in this work, using these purely data-based methods to identify potential climate change causal factors and to examine whether these causal relationships are linear or nonlinear.
Our main contributions to this study are as follows. Firstly, we have conducted a thorough analysis of linear causal relationships and quantified the long-run cointegration relationships and the short-run dynamics related to global temperature change and sea level change. Secondly, we have developed a novel pipeline to combine time series models such as VAR and ARDL with neural network structures to detect nonlinear causality relationships among the time series data of climate factors. Thirdly, we have applied this pipeline to detect nonlinear causal pathways toward climate change in terms of rising global temperature and sea level and compared the model goodness-of-fit between the nonlinear machine learning models and the linear time series models. (Figure 1) Figure 1. Flowchart of analysis and methodologies. In linear approaches, vector autoregression (VAR) model is adopted to analyze the linear Granger causality; vector error correction model (VECM) is adopted to analyze the linear long-run equilibrium and short-run dynamic effect; autoregressive distributed lag (ARDL) model is adopted to find lower order parsimonious model and simultaneous effects. In nonlinear approaches, nonlinear VAR neural network model is adopted to analyze nonlinear Granger causality; nonlinear ARDL neural network model is adopted to analyze the nonlinear long-run equilibrium and short-run dynamic effect.

Data Overview and Processing
In this study, seven relevant monthly frequency climate factors from January, 2003 to December, 2021 have been adopted toward the data-driven pathway analysis (the data were retrieved on 7 March 2023): Nonlinear long-run equilibrium and short-run dynamic effect analysis Figure 1. Flowchart of analysis and methodologies. In linear approaches, vector autoregression (VAR) model is adopted to analyze the linear Granger causality; vector error correction model (VECM) is adopted to analyze the linear long-run equilibrium and short-run dynamic effect; autoregressive distributed lag (ARDL) model is adopted to find lower order parsimonious model and simultaneous effects. In nonlinear approaches, nonlinear VAR neural network model is adopted to analyze nonlinear Granger causality; nonlinear ARDL neural network model is adopted to analyze the nonlinear long-run equilibrium and short-run dynamic effect.

Data Overview and Processing
In this study, seven relevant monthly frequency climate factors from January, 2003 to December, 2021 have been adopted toward the data-driven pathway analysis (the data were retrieved on 7 March 2023): Global mean sea level (GMSL) in mm, gathered from climate.nasa.gov, which was computed by the NASA Goddard Space Flight Center [26]; b.
Antarctic and Greenland ice sheet mass (IceSheet) in Gt, gathered from climate.nasa. gov, which was computed by the NASA MEaSUREs program [27]; c.
Northern and Southern hemisphere sea ice extent (SeaIce) in Mkm2, gathered from the National Snow and Ice Data Center [28]; d. Global mean surface temperature (TEMP) (with sea ice area measured by the air above sea ice) in • C, gathered from Berkeley Earth [29]; e.
Global mean specific humidity (Humidity) in kg kg −1 (mass of water vapor per kilogram of moist air), gathered from Copernicus [30]; f.
The trend and seasonality decomposition of each variable is shown in Figure 2.
deviation) while missing values are imputed via Kalman smoothing before the analysis. We combined the Greenland ice sheet and the Antarctic ice sheet data as one new feature: IceSheet. We also combined the Northern and Southern hemisphere sea ice as another new feature: SeaIce. Furthermore, the three major greenhouse gases, CO2, CH4, and N2O, are found to be highly correlated with each other; to avoid the multicollinearity issue, the global warming potential (GWP), which represents the heat absorbing capacities of greenhouse gases gauged in terms of the heat absorbed by the same amount of CO2, has been adopted for analyzing the greenhouse gas effect. The GWP is calculated in the CO2 variable unit and the coefficients are gathered from the global warming potential in 100 years' time length as calculated by IPCC [35]. These coefficients indicate how much energy the emissions of 1 ton of a gas will absorb over 100 years relative to the emissions of 1 ton of CO2. Among the three major greenhouse gases (CO2, CH4, and N2O), CH4 is 28 times more potent, while N2O is 265 times more potent than CO2 in warming up the globe. Therefore, we multiply the concentration of each of these gases in the atmosphere with these coefficients to arrive at the following GWP calculation formula:

Unit Root Test
A time series is (weakly) stationary if its first two moments are time-invariant and the unit root tests are the statistical procedures used to determine the stationarity [36]. In this study, we have used the augmented Dicky-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test to examine the stationarity and to determine the integration order of each time series.
Decomposing each time series as follows: where Δ is the first difference of . For the ADF test, the null hypothesis is = 0 Since there exists strong seasonality in the dataset and the difference in scales is significant between climate factors, all the variables are de-seasoned and normalized (first minus the de-seasoned series mean and then divide it by the de-seasoned series standard deviation) while missing values are imputed via Kalman smoothing before the analysis. We combined the Greenland ice sheet and the Antarctic ice sheet data as one new feature: IceSheet. We also combined the Northern and Southern hemisphere sea ice as another new feature: SeaIce. Furthermore, the three major greenhouse gases, CO 2 , CH 4 , and N 2 O, are found to be highly correlated with each other; to avoid the multicollinearity issue, the global warming potential (GWP), which represents the heat absorbing capacities of greenhouse gases gauged in terms of the heat absorbed by the same amount of CO 2 , has been adopted for analyzing the greenhouse gas effect. The GWP is calculated in the CO 2 variable unit and the coefficients are gathered from the global warming potential in 100 years' time length as calculated by IPCC [35]. These coefficients indicate how much energy the emissions of 1 ton of a gas will absorb over 100 years relative to the emissions of 1 ton of CO 2 . Among the three major greenhouse gases (CO 2 , CH 4 , and N 2 O), CH 4 is 28 times more potent, while N 2 O is 265 times more potent than CO 2 in warming up the globe. Therefore, we multiply the concentration of each of these gases in the atmosphere with these coefficients to arrive at the following GWP calculation formula:

Unit Root Test
A time series is (weakly) stationary if its first two moments are time-invariant and the unit root tests are the statistical procedures used to determine the stationarity [36]. In this study, we have used the augmented Dicky-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test to examine the stationarity and to determine the integration order of each time series.
Decomposing each time series x t as follows: where ∆x t is the first difference of x t . For the ADF test, the null hypothesis is α = 0 while the alternative hypothesis is α < 0. Rejecting the null hypothesis indicates the time series is stationary. For the KPSS test, the hypotheses are reversed, with the null hypothesis indicating that the time series is stationary, while the alternative hypothesis is not stationary [36]. By utilizing both unit root tests, we can further ensure the robustness of the test results to minimize the Type 1 and the Type 2 errors.

Vector Autoregressive Model (VAR) and Granger Causality Test
The vector autoregressive model is a multivariate statistical analysis framework to capture relationships between multiple time series. A VAR model with lag p (VAR(p)) can be expressed as: To determine the lag of a VAR model, several criteria such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Hannan-Quinn criterion (HQC) can be used, with the BIC criterion being the most common choice as it imposes a higher penalty on a larger model (with more parameters).
In this study, we apply the VAR model to formulate and detect the linear causality pathways toward global mean temperature and global mean sea level (TEMP and GMSL) changes. Two VAR pathway models are formulated with one model containing variables potentially related to temperature rise, including TEMP, Humidity, GWP, and SSN; while the other model encompasses potential causal factors for sea level rise, including GMSL, IceSheet, SeaIce, and TEMP. The order of integration for both VAR models are found to be one, as selected by the Bayesian information criterion (BIC).
The Granger causality test is used to determine if one time series is critical in forecasting another one. For multiple time series, the Granger causality test is conducted through a VAR model with stationary time series data. It says that y i Granger causes another time series y j if at least one element of A γ (j, i) is significantly not 0 for any γ ∈ [1, p]. The Granger causality tests are usually conducted through an F-test or a Wald chi-squared test.

Vector Error Correction Model (VECM)
Two or more time series are cointegrated if they can form one or more long-run equilibrium relationship(s). To determine the cointegration relationship between TEMPrelated variables and GMSL-related variables, we have adopted the Johansen cointegration test [37].
If the time series variables are all integrated of order 1, namely I(1), and there exists a cointegration relationship among the time series, a vector error correction model (VECM) can be used to detect the long-run equilibrium and the short-run dynamic effect. In VECM, there is an error correction term (ECT) which is determined by the cointegration relationship: Subsequently, the VECM is written as: Here, the ∑ p i=1 A i ∆y t−i term represents the short-term dynamics.

Autoregressive Distributed Lag Model (ARDL)
The Johansen cointegration test and the VECM are strict on the integration order assumption, which is that all the time series must be of the same integration order. Conversely, the assumptions for the autoregressive distributed lag model (ARDL) are much more relaxed and so it can be applied to mixed integration order data. The ARDL can be formulated as follows: Here, the long-run equilibrium is represented as ECT t−1 = λ 0 y t−1 + ∑ k j=1 λ j x t−1 . The ARDL bound test is often used to detect the long-run cointegration relationship with the null hypothesis being λ i = 0 for all i ∈ [0, k], which means that there is no cointegration relationship.
Notably, co-integrated time series must have Granger causality in either one way or both directions, while the presence of Granger causality in either one way or both directions does not necessarily imply that the time series are cointegrated. Therefore, co-integration should be a stronger indication of Granger causality, in both directions. In summary, besides being able to differentiate the long-run and short-run relationships, the VECM and ARDL models can also help identify dual-directional Granger causality, and thereby serve as an alternative to the paradigm of VAR model with Granger causality test.

Nonlinear VAR Neural Network Model
Based on the VAR model structure and the Granger causality test, and inspired from the work of Rosoł et al. [38], we propose a statistical procedure to detect nonlinear Granger causality among multiple time series data. The basic idea and procedures are trivial and shown as the following:

1.
Fit an artificial neural network with all the predictors of the VAR model (full model): Fit the same artificial neural network structure without a predictor (x i ) to test for its causal relationship with the response variable (partial model): Examine if the residuals as measured by the sum squares of error (SSE) or mean absolute error (MAE) have increased significantly in the partial model compared to the full model; 4.
Repeat steps 2 and 3 above until all the predictors are tested.
If the residuals of the partial model are significantly larger than the full model, the tested predictor is critical in forecasting the response variable, which also indicates the tested predictor nonlinear Granger cause for the response variable if the SSE of the neural network model is smaller than that of the corresponding linear model.
Due to the limitation of data size (not large enough for training) and stochastic properties of the numerical solutions (for example, the stochastic gradient descent), the neural network results are not as stable as the linear model and are dependent upon the random seeds as well as the initial coefficients. To avoid these issues, instead of comparing the median absolute mean of residuals via the Wilcoxon test [38], we chose to repeat the full model and each partial models N times with different random seeds, and then comparing the medians of the sum squares of error (SSE) and mean absolute error (MAE) of the full model and the partial models. The detailed procedure is as follows.

1.
Fit N artificial neural network with all the predictors of the VAR model (full model) with different random seeds and record the SSE of each model: Fit N artificial neural network of the same structure without a given predictor to test its causal relationship with the response variable (partial model), with different random seeds and record the SSE of each model: Conduct a Wilcoxon rank sum test on SSE f and SSE p to test if the median of SSE f is significantly smaller than SSE p ; 4.
Repeat steps 2 and 3 until all the predictors are tested.
The procedure of comparing Mean absolute error (MAE) is similar, just replace SSE with MAE.

Nonlinear ARDL Neural Network Model
Similarly, we have also developed a nonlinear ARDL neural network based on the combined ARDL model and the artificial neural network (ANN) to detect the nonlinear long-run equilibrium relationship and the nonlinear relationship. The detailed procedure is as follows: 1.
Fit N artificial neural network with all the predictors of the ARDL model (full model) with different random seeds and record the SSE of each model: Fit N artificial neural network of the same structure without the cointegration part (partial cointegration model), with different random seed and record the SSE of each model: Fit N artificial neural network of the same structure without the predictor to test its nonlinear relationship with the response variable (partial model), with different random seeds and record the SSE of each model: Conduct a Wilcoxon rank sum test on SSE f and SSE p to test if the median of SSE f is significantly smaller than that of SSE p to test if there exists a nonlinear relationship between the tested predictor and the response variable; 6.
Repeat steps 4 and 5 until all the predicting variables are tested.
The procedure of comparing MAE is similar; simply replace SSE with MAE. It should be mentioned that the nonlinear autoregressive distributed lag (NARDL) model proposed by Shin and colleagues [39] also provides an asymmetric dynamic multipliers method to analyze the nonlinear long-run equilibrium and short-term dynamic relationships. The NARDL model formulates the problems as: . NARDL is widely applied in different research areas, such as business [40], economics [41], and finance [42,43]. When comparing with the machine learning methods such as the neural network in this article, NARDL has more of an advantage in its clear statistical properties, but neural network is more similar to a blackbox, with a structure that provides a more flexible way of complexity and nonlinear function selection. We aim to compare the NARDL model to the neural network structure in the future for climate studies.

Unit Root Tests and Integration Order
To determine the integration order of each time series, the augmented Dicky-Fuller (ADF) test with lag 12 and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test were performed, and the stationarities of level (original time series data) and difference (differenced time series data) were tested. The p-values are listed in Table 1. Both the ADF test and KPSS showed that the levels are not stationary while the firstorder differences are stationary, which indicates that the data integration orders are all I(1). That is, the de-seasoned original time series data are all integrated of order one.

Linear Granger Causality
To demonstrate the linear Granger causality among these climate variables, we fit two separate VAR(1) models and Engle-Granger tests, one for each response variable, global mean sea level (GMSL), and global mean surface temperature (TEMP), respectively. The confirmed VAR models and the Engle-Granger test results are shown in Table 2.    GMSL-related VAR model: TEMP-related VAR model: Humidity t = 0.003 + 0.163 TEMP t−1 + 0.740 Humidity t−1 + 0.040 GWP t−1 + 0.054 SSN t−1 GWP t = 0.015 + 0.003 TEMP t−1 + 1.000 GWP t−1 (8) The VAR models and Granger causality test results are consistent with the current physics-models-based results. The positive parameters from each feature at time t−1 to time t shows the consistent developing trend of each feature. IceSheet mass significantly causes GMSL changing by new water input, which is represented by the negative coefficient form IceSheet to GMSL. The positive coefficients from Humidity to TEMP and from TEMP to Humidity imply that they significantly affect each other with water vapor heat trap effect and water cycle theory. Furthermore, GWP leads the TEMP changing by the greenhouse effect. The thermal expansion effect of TEMP to GMSL, however, is less significant; the reason could be that the effect is not linear and needs to be formulated by a nonlinear model, which we will examine next using the neural network models.

Johansen Test
We first adopted the Johansen procedure to detect potential cointegration relationships in GMSL-related variables and in TEMP-related variables. Results of the Johansen procedure (Table 3) indicate that there could be one to two cointegration relationship(s) in each model.

Vector Error Correction Model (VECM)
To further identify the co-integration relationship, we resorted to two VECM models to detect the long-run equilibrium and short-run effect among related climate variables. VECM can be applied only if all the time series variables are integrated of the same order, such as order one, namely, I(1). VECM with order one (also referred to as lag one) is an equivalent transformation of a VAR(2) model, but, importantly, with the added benefit of focusing on the analysis of a long-run equilibrium (error correction terms; cointegration relationships) and short-run dynamic effect (differential terms). For each model, we have confirmed one strong cointegration relationship based on the VECM model. The VECM equations are shown below, with the corresponding significance levels provided in Tables 4 and 5. GMSL-related VECM model: where TEMP-related VECM model:

Autoregressive Distributed Lag Model (ARDL)
When the integration orders of time series variables are not all equal, the ARDL models are applied with a more relaxed integration order assumption. The ARDL bound test is performed to test the cointegration relationship in each ARDL model. In this work, the related climate factors are found to be integrated of the same order of one; however, ARDL is still employed to show the key relationship for each individual response variable of interest, namely global mean sea level or global mean temperature, respectively, in a univariate linear modeling approach in contrast to the multivariate modeling approach of VECM. The ARDL equations are shown below, with the corresponding significance levels provided in Tables 6 and 7. Compared to VECM, each individual ARDL is more flexible to formulate the problem with a lower order parsimonious model. We can see that these univariate ARDL models (Equation System 12 below) involve only time t and (t−1), whereas the multivariate VECM model (Equation System 11 above) includes time t, (t−1) and (t−2). By showing both VECM and ARDL models, we can demonstrate the climate pathways from both the system and the component perspectives:  Based on the significance levels and coefficients of the Johansen procedure, the VECM results, and the ARDL results, we found that IceSheet and TEMP both affect GMSL through a long-run equilibrium. The negative coefficient of IceSheet implies that a lower IceSheet mass will cause GMSL to increase via new water input, while the positive coefficient of TEMP implies that a TEMP increase will cause GMSL to increase directly by a water thermal expansion and indirectly by a melting IceSheet. Moreover, the effect of SeaIce to GMSL is not significant and there is no short-run dynamic effect. Meanwhile, TEMP is mainly modulated by both the long-term and the short-term effects of the greenhouse gases (GWP) and Humidity-the latter caused by the water vapor heat trap effect and the effect of SSN on TEMP is not as significant as the former two (GWP and Humidity).

Nonlinear Causality Detection
To detect the nonlinear causality of GMSL rise and TEMP change, a combined procedure based on artificial neural network (ANN) with one hidden layer of two nodes, and VAR model is formulated. Each neural network is repeated 100 times with different random seeds, the mean and standard deviation of the sum squares of error (SSE) and mean absolute error (MAE) are listed below. We use the Wilcoxon test to determine if the SSE increases significantly without each given predicting variable (Tables 8 and 9). Boxplots of the SSE comparisons are shown in Figures 3 and 4 for GMSL and TEMP respectively.     Stats 2023, 6, FOR PEER REVIEW 12 The nonlinear VAR neural network model for GMSL implies that there exists nonlinear Granger causality in the GMSL model based on the significant decrease in SSE and MAE of the neural network model in comparison to the linear VAR model. Increases in SSEs and MAEs of the partial models show that IceSheet and TEMP all impact GMSL change significantly. SeaIce's effect is only significant in terms of the SSE metric, but not in the MAE metric. The impact of IceSheet is the strongest corresponding to the model with the highest SSE and MAE, while the impact of SeaIce is the weakest corresponding to the model with the lowest SSE and MAE.  The SSE and MAE of the neural network model decreases significantly compared to that of the linear VAR model, which indicates that nonlinear causality does exist in the TEMP model. The increased SSE and the Wilcoxon test demonstrates that Humidity, GWP, and SSN all impact TEMP change significantly with the impact of Humidity and GWP being stronger, while the impact of SSN being the weakest.

Nonlinear ARDL Model
Similarly, to detect the nonlinear long-run equilibrium and short-run dynamic effect of GMSL rise and TEMP rise, we have developed a novel pipeline based on an artificial neural network (ANN) with one hidden layer of two nodes and ARDL model. Each neural network is repeated 100 times with different random seeds, the mean and standard deviation of SSE and MAE are listed below, and we used the Wilcoxon test to determine if the SSE or MAE increases significantly without a specific predictor variable (Tables 10 and  11). Boxplots of the SSE and MAE comparisons are shown in Figures 5 and 6 for GMSL and TEMP respectively.  The nonlinear VAR neural network model for GMSL implies that there exists nonlinear Granger causality in the GMSL model based on the significant decrease in SSE and MAE of the neural network model in comparison to the linear VAR model. Increases in SSEs and MAEs of the partial models show that IceSheet and TEMP all impact GMSL change significantly. SeaIce's effect is only significant in terms of the SSE metric, but not in the MAE metric. The impact of IceSheet is the strongest corresponding to the model with the highest SSE and MAE, while the impact of SeaIce is the weakest corresponding to the model with the lowest SSE and MAE.
The SSE and MAE of the neural network model decreases significantly compared to that of the linear VAR model, which indicates that nonlinear causality does exist in the TEMP model. The increased SSE and the Wilcoxon test demonstrates that Humidity, GWP, and SSN all impact TEMP change significantly with the impact of Humidity and GWP being stronger, while the impact of SSN being the weakest.

Nonlinear ARDL Model
Similarly, to detect the nonlinear long-run equilibrium and short-run dynamic effect of GMSL rise and TEMP rise, we have developed a novel pipeline based on an artificial neural network (ANN) with one hidden layer of two nodes and ARDL model. Each neural network is repeated 100 times with different random seeds, the mean and standard deviation of SSE and MAE are listed below, and we used the Wilcoxon test to determine if the SSE or MAE increases significantly without a specific predictor variable (Tables 10 and 11). Figures 5 and 6 for GMSL and TEMP respectively.    The nonlinear ARDL neural network and Wilcoxon test of the GMSL model indicate that the effect of IceSheet, SeaIce, and TEMP to GMSL is nonlinear, with the neural network model decreasing the SSE and MAE significantly. Furthermore, these three variables impact GMSL significantly as models without any one of these three variables would increase the prediction SSE significantly. Detailed SSE and MAE increases and the corresponding p-values show that the effect of IceSheet is much stronger than those of TEMP and SeaIce.  The nonlinear ARDL neural network and Wilcoxon test of the GMSL model indicate that the effect of IceSheet, SeaIce, and TEMP to GMSL is nonlinear, with the neural network model decreasing the SSE and MAE significantly. Furthermore, these three variables impact GMSL significantly as models without any one of these three variables would increase the prediction SSE significantly. Detailed SSE and MAE increases and the corresponding p-values show that the effect of IceSheet is much stronger than those of TEMP and SeaIce. The nonlinear ARDL neural network and Wilcoxon test results of the TEMP model demonstrate the nonlinear effect on TEMP caused by Humidity, GWP, and SSN, as the SSE and MAE decreases dramatically in the neural network model compared to the linear ARDL model. Moreover, the SSE and MAE of the neural network morel increased significantly upon removing any one of these variables, indicating that all three variables affect TEMP changing. From the SSE increase amount and p-values, we found that Humidity has the strongest effect on TEMP, followed by GWP and then SSN.

Conclusions and Future Work
In this work, we have performed a thorough data-based analysis of causal factors toward global mean temperature (TEMP) increase and global mean sea level (GMSL) rise, both linearly, using statistical time series models, and nonlinearly, using artificial neural network models. For TEMP change, the impact of global mean specific humidity (Humidity) and global warming potential (GWP) of the greenhouse gases are both significant, linearly and nonlinearly. At the same time, the effect of the sunspot number (SSN) is relatively small and only significant in the nonlinear neural network model. For GMSL change, the most significant factor, both linearly and nonlinearly, is the Antarctic and Greenland ice sheet mass (IceSheet), while the ocean thermal expansion as measured by TEMP came second, and would only impact the GMSL change nonlinearly, and the Northern and Southern Hemisphere sea ice extent (SeaIce) came third with only a weak nonlinear effect.
The unique contribution of our work is that we have delineated whether the causal relationships between each of the key climate variables to global warming or sea level rise is linear or nonlinear or both. Traditional physics-models-based analyses do not differentiate between these relationships and thus we compared our work to those models using the combined linear and nonlinear causal factors and found very good agreement. We also found excellent agreement between our linear causal factors with the classical linear statistical models. For example, GWP is found to be a significant causal factor for global warming by both physics-based and statistical models, and we found GWP to be significant both linearly and nonlinearly. The sunspot number was found to be insignificant for global warming by structural equation modeling, a multivariate linear statistical model [14,15], while the physics-based models, targeting both linear and nonlinear causal factors indiscriminately, have found it to be weakly related to global warming [44]. This is also in perfect agreement with our data-based models as we found the effect of the sunspot number to be relatively small and only significant in the nonlinear neural network model.
Given that the usual information criteria (AIC and BIC, etc.) toward gauging model goodness-of-fit are not generally applicable to the comparison of neural network models and statistical time series models, in order to better examine the robustness and compare The nonlinear ARDL neural network and Wilcoxon test results of the TEMP model demonstrate the nonlinear effect on TEMP caused by Humidity, GWP, and SSN, as the SSE and MAE decreases dramatically in the neural network model compared to the linear ARDL model. Moreover, the SSE and MAE of the neural network morel increased significantly upon removing any one of these variables, indicating that all three variables affect TEMP changing. From the SSE increase amount and p-values, we found that Humidity has the strongest effect on TEMP, followed by GWP and then SSN.

Conclusions and Future Work
In this work, we have performed a thorough data-based analysis of causal factors toward global mean temperature (TEMP) increase and global mean sea level (GMSL) rise, both linearly, using statistical time series models, and nonlinearly, using artificial neural network models. For TEMP change, the impact of global mean specific humidity (Humidity) and global warming potential (GWP) of the greenhouse gases are both significant, linearly and nonlinearly. At the same time, the effect of the sunspot number (SSN) is relatively small and only significant in the nonlinear neural network model. For GMSL change, the most significant factor, both linearly and nonlinearly, is the Antarctic and Greenland ice sheet mass (IceSheet), while the ocean thermal expansion as measured by TEMP came second, and would only impact the GMSL change nonlinearly, and the Northern and Southern Hemisphere sea ice extent (SeaIce) came third with only a weak nonlinear effect.
The unique contribution of our work is that we have delineated whether the causal relationships between each of the key climate variables to global warming or sea level rise is linear or nonlinear or both. Traditional physics-models-based analyses do not differentiate between these relationships and thus we compared our work to those models using the combined linear and nonlinear causal factors and found very good agreement. We also found excellent agreement between our linear causal factors with the classical linear statistical models. For example, GWP is found to be a significant causal factor for global warming by both physics-based and statistical models, and we found GWP to be significant both linearly and nonlinearly. The sunspot number was found to be insignificant for global warming by structural equation modeling, a multivariate linear statistical model [14,15], while the physics-based models, targeting both linear and nonlinear causal factors indiscriminately, have found it to be weakly related to global warming [44]. This is also in perfect agreement with our data-based models as we found the effect of the sunspot number to be relatively small and only significant in the nonlinear neural network model.
Given that the usual information criteria (AIC and BIC, etc.) toward gauging model goodness-of-fit are not generally applicable to the comparison of neural network models and statistical time series models, in order to better examine the robustness and compare the forecast accuracy of these two types of models, we shall perform a follow-up predic-tive study using models derived from this work and the soon available 2022 monthly climate data.
We also point out a limitation of our study due to the limited sample size (namely, number of years) available for the analysis. For non-stationary time series data, the Toda-Yamamoto procedure [45] is the proposed method for causality analysis. However, due to our modest sample size, the asymptotic distribution of the test statistic would not hold true, and thus we found the Toda-Yamamoto causality analysis not significant. This conflicts with the fact that co-integrated time series must have Granger causality in one way or both directions, while the presence of Granger causality in either one way or both directions does not necessarily imply that the series are cointegrated. Therefore, co-integration should be a stronger indication of Granger causality in both directions. We have therefore used the cointegration as an indication of Granger causality instead of the Toda-Yamamoto procedure.
On another note, the neural net VAR and ARDL models proposed in this paper can be seen as an alternative of other nonlinear time series analysis methods, including the nonlinear autoregressive distributed lag (NARDL) models [39,40,42], and those based on state-space reconstruction [46][47][48]. Furthermore, other approaches such as power transformations can also be considered to accommodate potential nonlinear relationships. For example, linear VAR and ARDL in logarithms are widely employed in economic and financial applications [49]. Noteworthy is that we can combine the transformations of data and the linear models to see if the resulting performance is comparable to that of the neural network models. If so, the transformation plus linear model would be preferred as it is an open-box in contrast to the machine learning model, which is a blackbox.
Finally, we point out that the physics-based and the data-driven models are not mutually exclusive. In fact, they can be integrated to complement each other for better forecasting in a more timely and computationally efficient manner. A plethora of white papers on how to integrate AI/machine learning methods to the physical earth systems models, for example, has been provided in the following website by the US Department of Energy (https://www.ai4esp.org/white-papers/, accessed on 4 May 2023). It is our sincere hope that our work will help to elicit more interest in this important area at this critical moment in history.