Article

More Accurate Climate Trend Attribution by Using Cointegrating Vector Time Series Models

by David B. Stephenson 1,*, Alemtsehai A. Turasie 2 and Donald P. Cummins 3

1 Department of Mathematics and Statistics, University of Exeter, Exeter EX4 4QF, UK
2 Department of Mathematics and Computer Science, Adelphi University, Garden City, NY 11530, USA
3 CNRM (National Centre for Meteorological Research), Université de Toulouse, Météo-France, CNRS, 31100 Toulouse, France
* Author to whom correspondence should be addressed.
Sustainability 2023, 15(16), 12142; https://doi.org/10.3390/su151612142
Submission received: 20 June 2023 / Revised: 27 July 2023 / Accepted: 27 July 2023 / Published: 8 August 2023
(This article belongs to the Special Issue Statistics and Econometrics of Environment and Climate Change)

Abstract

Adapting to human-induced climate change is becoming an increasingly important aspect of sustainable development. To be able to do this effectively, it is important to know how much human influence has contributed to observed climate trends. Climate detection and attribution (D&A) studies achieve this by estimating scaling factors usually obtained by performing a least squares regression of the observed trending climate variable on the equivalent variable simulated by a climate model. This study proposed instead to estimate scaling factors by using the econometric approach of dynamically modelling the time series as a cointegrating Vector Auto-Regressive (VAR) time series process. It is shown that a 2nd-order cointegrating VAR(2) model is theoretically justified if the observed and simulated variables can be represented as a one-box AR(1) response to a common integrated forcing. The VAR(2) model can be expressed as a Vector Error-Correction Model (VECM) and then fitted to the data to obtain the cointegration relationship, the stationary linear combination of the two variables, from which the scaling factor is then easily obtained. Estimates of the scaling factor from the VAR(2) model are critically compared to those from Ordinary Least Squares (OLS) and Total Least Squares (TLS) for annual Global Mean Surface Temperature (GMST) data simulated by a simple stochastic model of the carbon–climate system and for historical simulations from 16 climate models in the Coupled Model Intercomparison Project 5 (CMIP5) experiment. Results from the toy model simulations show that the slope estimates from OLS are negatively biased, TLS estimates are less biased but have high variance, and the VAR(2) estimates are unbiased and have lower variance and provide the most accurate estimates with smallest mean squared error. Similar behaviour is noted in the CMIP5 data. 
Hypothesis tests on the VAR(2) fits found strong evidence of a cointegrating relationship with the observations for all the CMIP5 simulations.

1. Introduction

Taking urgent action to combat climate change and its impacts is a major goal of sustainable development. To be able to address this, it is important to be able to accurately quantify what fraction of a climate trend can be attributed to human influence. This study explored the accuracy of various approaches to the attribution of trends in Global Mean Surface Temperature (GMST). GMST is an important variable in attribution because: (a) changes in GMST can be directly attributed to changes in the Earth’s energy budget; (b) other climate variables and the risk of weather extremes on global and regional scales can be predicted using GMST as a climate change index [1,2]; (c) GMST is the metric of climate change used by policymakers (e.g., the global warming limits of the 2015 Paris Agreement). Quantifying the effect of different forcing agents on GMST is therefore critical for informing climate change mitigation and adaptation measures. Furthermore, this study is relevant to recent studies that are starting to use attribution for impacts of climate, e.g., heat-related trends in human health [3].
Detection and attribution studies assess whether climate change can be detected as being significantly outside the range expected from natural internal variability and assess to what extent observed changes can be attributed to external forcings of climate change, both human-induced and naturally occurring. Hegerl et al. [4] defined the detection of climate change as a process of demonstrating that the climate or a system affected by the climate has changed in some defined statistical sense without providing a reason for that change. Attribution was defined as the process of evaluating the relative contributions of multiple causal factors to a change or event with an assignment of statistical confidence. The reliable detection and attribution of changes in the climate is fundamental to our understanding of climate change and to enabling decision-makers to manage climate-related risk. Moreover, confidence in the assessment of climate change will be increased when attribution of the change to a causal factor is robustly quantified and when there is a firm understanding of the processes that are involved in the proposed causal link [4].
Current approaches to the detection and attribution of an anthropogenic influence on climate involve quantifying the level of agreement between model-predicted patterns of externally forced change and observed changes in the recent climate record [5]. Most previous studies have used a regression approach in which it is assumed that observations can be represented as a linear combination of candidate signals (the climate model simulated responses to external forcing) plus noise ([6] p. 712). The regression model has the form $y = X\beta + \epsilon$, where $y = (y_1, y_2, \ldots, y_n)^T$ is a vector of observations, the matrix $X$ contains the simulated responses to the external forcings that are under investigation (with one column for each signal considered), $\beta$ is a vector of slope parameters (scaling factors) that adjusts the amplitudes of those patterns, and $\epsilon$ represents internal climate variability [5,7,8,9]. The unknown scaling factors, $\beta$, are generally estimated using either Generalized Least Squares (GLS) or Total Least Squares (TLS) estimation [8,10,11,12,13,14,15,16]. Some more sophisticated methods have also been used more recently (e.g., maximum likelihood, Bayesian inference, alternative estimators of the internal variability covariance matrix) but many are still essentially fitting a TLS model [17,18]. For a recent short review of methods, see Section 3.2 of [19].
When regressing trending series on one another, the residuals are not guaranteed to be stationary, which often leads to estimates and test statistics having non-standard limiting distributions that diverge as the sample size increases [20]. This gives rise to what is known as either nonsense or spurious regression [21,22]. For example, one can easily obtain large significant slope parameters even if trying to attribute/regress time series of independent random walks on one another (see (Turasie [23] pp. 27–29) for some simulated examples). To avoid this problem, it is useful instead to search for a linear combination of the variables that is stationary (a cointegrating relationship) rather than the one that has the minimum sum of squares ([24] p. 17). It can be shown that a cointegrating relationship is a necessary condition for least squares estimates to be consistent, i.e., the estimates converge onto the true values as the sample size increases [25]. It has recently been proven that model-simulated and -observed climate variables have a cointegrating relationship under the fairly general assumptions that the forcings are linearly related to one another and the responses are time-invariant and linear [17], which justifies the use of least squares regression in detection and attribution studies. Furthermore, scaling factors can be estimated by fitting a time series model known as an Error Correction Model (ECM) that instead involves regressing the first differences of a variable on current and past values of the other variable. The existence of such ECM models for cointegrating variables is proven by the Granger representation theorem [25].
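The spurious regression effect described above can be illustrated with a small simulation. The sketch below is not taken from the study: the function names, the random seed, the series length of 100 and the 200 replications are all illustrative assumptions. It regresses pairs of independent Gaussian random walks on one another and counts how often a naive OLS t-test (which wrongly assumes white-noise residuals) declares the slope significant at the nominal 5% level.

```python
import math
import random


def random_walk(n, rng):
    """Return a driftless Gaussian random walk of length n."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def ols_t_statistic(x, y):
    """Naive OLS t-statistic for the slope of y on x (assumes white-noise residuals)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    standard_error = math.sqrt(rss / (n - 2) / s_xx)
    return slope / standard_error


def spurious_rejection_rate(n_years=100, n_sims=200, seed=1):
    """Fraction of independent random-walk pairs with a nominally significant slope."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x = random_walk(n_years, rng)
        y = random_walk(n_years, rng)
        if abs(ols_t_statistic(x, y)) > 1.96:
            rejections += 1
    return rejections / n_sims


rate = spurious_rejection_rate()
print(f"Share of nominally significant slopes: {rate:.2f}")
```

If the test behaved as advertised, the printed share would be close to 0.05. For unrelated random walks it is typically far larger, which is why a check for a stationary linear combination is needed before interpreting the slope.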
Although these different estimation methods will converge on the same scaling factor estimates as the sample size increases to infinity when there is a cointegrating relationship, they will give different estimates for finite samples. Historical climate time series are typically short, having sample sizes less than $T \approx 200$ years, which raises the important practical question of which method is most accurate in these situations for estimating scaling factors. This study assessed this by comparing the performance of OLS and TLS estimates and estimates from a theoretically justified vector ECM using data simulated by (a) 16 different climate models that participated in the CMIP5 experiment, and (b) a simple stochastic carbon–climate time series model (where the true value is known). The following section briefly describes the three different estimation methods and then Section 3 and Section 4 show results from these methods for the CMIP5 and stochastic model simulations. Section 5 concludes the paper with a summary and discussion of future directions.

2. Methods for Estimating Scaling Factors

This section briefly describes the three different methods for estimating the scaling factor that were compared in this study: OLS/GLS, TLS and VAR(2).

2.1. Ordinary Least Squares and Generalized Least Squares (OLS/GLS)

Fixed-effect least squares regression was the first approach used in detection and attribution studies (e.g., [5,26]). In the simplest case, historical observed values of a variable are related to values simulated by a climate model using the simple linear regression model $y_t = \beta_0 + \beta_1 x_t + \epsilon_t$, where the residuals $\epsilon_t$ are assumed to have an error covariance structure that can be estimated from long historical simulations. For the sake of mathematical simplicity, consider the special OLS case where $\epsilon_t$ are assumed to be independent, i.e., white noise. The OLS slope estimate $\hat{\beta}_1$ is obtained by minimization of the Residual Sum of Squares, given by
$$ RSS_{ols} = \sum_{t=1}^{T} \hat{\epsilon}_t^{\,2} = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{T} (y_t - \hat{\beta}_0 - \hat{\beta}_1 x_t)^2 . \tag{1} $$
Minimizing (1) with respect to $\hat{\beta}_1$ gives
$$ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} , \tag{2} $$
where $S_{xy} = \sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})$, $S_{xx} = \sum_{t=1}^{T} (x_t - \bar{x})^2$, $\bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t$, and $\bar{x} = \frac{1}{T}\sum_{t=1}^{T} x_t$. This estimate, hereafter referred to as $\hat{\beta}_{ols}$, has assumed that the model simulated values $x_t$ are non-random fixed effects. This assumption ignores the natural variability present in the simulated series that contributes to $S_{xx}$, and causes the slope estimate to be biased towards zero, an effect known as regression dilution (or attenuation). Furthermore, this bias, caused by ignored errors in explanatory variables, has been found to be even more severe in GLS fits [27]. For this reason and the sake of simplicity, we considered only OLS fits in the rest of this article.
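Regression dilution can be seen in a few lines of code. The following is a minimal sketch, assuming a hypothetical linear forced trend, a true slope of 1 and arbitrary noise levels; none of these numbers come from the study. It computes the OLS slope as the ratio of the centred cross-product sum to the centred sum of squares of the predictor, once with a noise-free predictor and once with a predictor that carries its own variability.

```python
import random


def ols_slope(x, y):
    """OLS slope computed as S_xy / S_xx."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx


rng = random.Random(0)
true_slope = 1.0                                     # assumed for illustration
signal = [0.01 * t for t in range(150)]              # hypothetical forced trend
y = [true_slope * s + rng.gauss(0.0, 0.1) for s in signal]
x_clean = signal                                     # predictor without noise
x_noisy = [s + rng.gauss(0.0, 0.3) for s in signal]  # predictor with its own variability

slope_clean = ols_slope(x_clean, y)
slope_noisy = ols_slope(x_noisy, y)
print("slope with noise-free predictor:", round(slope_clean, 2))
print("slope with noisy predictor:", round(slope_noisy, 2))
```

The noise in the predictor inflates the denominator without adding to the numerator, so the second slope is pulled towards zero.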

2.2. Total Least Squares (TLS)

Total Least Squares (TLS) is a regression method that attempts to account for observational errors on both predictor and response variables ([28] p. 27); it was first introduced into detection and attribution by [10]. Suppose that $(x_t^*, y_t^*)$ are the true values of the predictor and response variables while $(x_t, y_t)$ are values that we observe such that $x_t = x_t^* + \zeta_t$ and $y_t = y_t^* + \omega_t$. Assume the unobservable true values are related linearly,
$$ y_t^* = \beta_0 + \beta_1 x_t^* . \tag{3} $$
Using $x_t$ and $y_t$ in (3) under the assumption that $\zeta_t$ and $\omega_t$ have finite variances ($\sigma_\zeta^2$ and $\sigma_\omega^2$, respectively) and zero means, we get:
$$ y_t = \beta_0 + \beta_1 x_t + \nu_t , \quad \text{where } \nu_t = \omega_t - \beta_1 \zeta_t . \tag{4} $$
The OLS method assumes that the predictor variable $x$ is deterministic and measured without error (that is, $\zeta_t = 0$ for all $t$). Thus, all the uncertainty in OLS regression is associated with the response variable and, hence, one minimizes the sum of squared distances in the y-direction of data-points from the fitted line. For TLS, it is sufficient to assume equal noise variance ($\sigma_\zeta = \sigma_\omega$) (known as orthogonal regression) since this can always be obtained by pre-multiplying $y_t$ by the ratio $\sigma_\zeta / \sigma_\omega$ (if known and non-zero). TLS minimises the sum of squared perpendicular distances from the line of best fit:
$$ RSS_{tls} = \sum_{t=1}^{T} \left[ (y_t - \hat{y}_t) \cos\theta \right]^2 , \tag{5} $$
where the gradient of the line of best fit is $\tan\theta = \hat{\beta}_1$. This objective function is minimized when
$$ \hat{\beta}_1 = \frac{(S_{yy} - S_{xx}) + \sqrt{(S_{yy} - S_{xx})^2 + 4 S_{xy}^2}}{2 S_{xy}} . \tag{6} $$
Note that, unlike the OLS slope estimate which tends to zero, this slope estimate, hereafter referred to as $\hat{\beta}_{tls}$, tends to infinity (no solution; see [29]) when the sample covariance $S_{xy}/T$ vanishes and $S_{yy} > S_{xx}$. This singular behaviour can lead to large values and much uncertainty in the slope estimate, especially when there is only a weak correlation between $x$ and $y$, e.g., in the detection and attribution of more variable regional climate trends.
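The singular behaviour of the TLS slope can be checked numerically. The sketch below implements the closed-form orthogonal regression slope given above under the equal-variance assumption; the function names and the two small data sets are hypothetical and chosen only to make the contrast visible. The first data set lies exactly on a line of slope 2, and the second has a nearly vanishing sample covariance with far more spread in $y$ than in $x$.

```python
import math


def sums_of_squares(x, y):
    """Return the centred sums S_xx, S_yy and S_xy."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return s_xx, s_yy, s_xy


def tls_slope(x, y):
    """Orthogonal regression slope; undefined when the sample covariance is zero."""
    s_xx, s_yy, s_xy = sums_of_squares(x, y)
    if s_xy == 0.0:
        raise ZeroDivisionError("TLS slope is undefined when S_xy is zero")
    diff = s_yy - s_xx
    return (diff + math.sqrt(diff ** 2 + 4.0 * s_xy ** 2)) / (2.0 * s_xy)


def ols_slope(x, y):
    """OLS slope S_xy / S_xx, for comparison."""
    s_xx, _, s_xy = sums_of_squares(x, y)
    return s_xy / s_xx


# Hypothetical data lying exactly on y = 1 + 2x: both estimators recover the slope.
x_line, y_line = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
print("exact line:", ols_slope(x_line, y_line), tls_slope(x_line, y_line))

# Hypothetical weakly correlated data with S_yy much larger than S_xx.
x_weak, y_weak = [-1.0, 1.0, -1.0, 1.0], [-3.0, -2.9, 3.0, 3.1]
print("weak correlation:", ols_slope(x_weak, y_weak), tls_slope(x_weak, y_weak))
```

In the weakly correlated case the OLS slope stays close to zero while the TLS slope becomes very large, which mirrors the divergence described in the text.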

2.3. A Dynamic Approach: The Cointegrating VAR(2) Model

In the OLS and TLS approaches, the residuals ϵ t will generally not be independent of one another and so the sum of squared residuals will not be simply related to the likelihood function. In other words, the least squares estimates are not necessarily maximum likelihood estimates. To obtain maximum likelihood estimates, it is necessary to dynamically model the serial dependence by specifying and fitting an appropriate bivariate time series model to the x and y time series. This section will present such a model—the 2nd order vector auto-regressive model—and show how it can be used to estimate the cointegrating relationship, and hence the scaling factor.
Based on simple one-box energy balance arguments, one of the most widely used stochastic models for climate variability and its response to climate change forcing is the 1st order auto-regressive process, e.g., see Cox et al. [30] for a recent application to global mean temperature projections. For this model, it can be shown (see Appendix A) that the vector $z_t = (y_t, x_t)^T$ of observed and simulated temperatures responding to a trending forcing evolves as a second-order Vector Auto-Regressive time series VAR(2) process given by
$$ z_t = \mu + \Pi_1 z_{t-1} + \Pi_2 z_{t-2} + \varepsilon_t , \tag{7} $$
where $\varepsilon_t$ is a vector of random error terms which are assumed to be independent and normally distributed with zero mean vector and covariance matrix $\Sigma$ [31]. The $2 \times 2$ matrices $\Pi_1$ and $\Pi_2$ allow the major sources of auto-correlation in the two series to be represented. In practice, the random error terms are not strictly independent since climate models are not perfectly represented by one-box energy balance models [32]. However, the independence assumption is likely to be much more valid here than the residual assumptions in the OLS and TLS static regression approaches.
Various cointegrating time series models have been used previously to relate observed temperature time series to historical estimates of radiative forcing (e.g., [33,34,35,36,37]; and references therein). For example, by considering two-box energy balance equations, Pretis [36] showed that historical estimates of surface temperature, ocean heat content, and radiative forcing could be modelled as a cointegrating VAR(1) process. Unit-root non-stationarity in the process arises from the radiative forcing that is assumed to be well represented by an $I(1)$ unit-root stochastic trend process as greenhouse gas emissions have accumulated in the atmosphere throughout the industrial period. However, because of eventual reabsorption into the land and oceans, the emissions process is close to unit root rather than being strictly unit root. In the pre-industrial period, greenhouse gas emissions into the atmosphere were considerably smaller than during the industrial period and so the series would drift sufficiently slowly that there would be time for carbon dioxide and other greenhouse gases to be reabsorbed.
The novelty of our study is that we used cointegrating VAR models to model observed and climate model-simulated temperatures, which has the advantage of the climate model providing a physically-based response to the radiative forcing.
The unrestricted VAR model in (7) can be reparameterized in terms of differences, lagged differences, and levels of the process to give the VECM(2) Vector Error Correction Model
$$ \Delta z_t = \mu + \Pi z_{t-1} + \Gamma \Delta z_{t-1} + \varepsilon_t , \tag{8} $$
where $\Pi = \Pi_1 + \Pi_2 - I_{2 \times 2}$ and $\Gamma = -\Pi_2$. The $\Pi$ matrix describes the long-run relationship between the variables ($y_t$ and $x_t$), and $\Gamma$ describes transitory effects measured by the lagged changes of the variables [25,38]. This model can easily be fit to data; see Appendix B for R code used to find each of the scaling factor estimates.
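The equivalence of the VAR(2) and VECM forms can be verified numerically. The sketch below is an illustration only: the helper names and all coefficient values are hypothetical and not estimates from the study. It maps the pair $(\Pi_1, \Pi_2)$ to $(\Pi, \Gamma)$ using $\Pi = \Pi_1 + \Pi_2 - I$ and $\Gamma = -\Pi_2$, and then checks that both forms give the same conditional mean for $z_t$.

```python
def mat_vec(matrix, vector):
    """Multiply a 2 x 2 matrix (nested lists) by a length-2 vector."""
    return [
        matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
        matrix[1][0] * vector[0] + matrix[1][1] * vector[1],
    ]


def var2_to_vecm(pi1, pi2):
    """Return (Pi, Gamma) with Pi = Pi1 + Pi2 - I and Gamma = -Pi2."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    pi = [[pi1[i][j] + pi2[i][j] - identity[i][j] for j in range(2)] for i in range(2)]
    gamma = [[-pi2[i][j] for j in range(2)] for i in range(2)]
    return pi, gamma


# Hypothetical coefficients and lagged states, for illustration only.
mu = [0.1, -0.2]
pi1 = [[0.6, 0.3], [0.2, 0.7]]
pi2 = [[-0.1, -0.05], [-0.04, -0.2]]
z_lag1 = [1.0, 0.8]
z_lag2 = [0.9, 0.7]

# Conditional mean of z_t from the VAR(2) form.
var_mean = [
    mu[i] + mat_vec(pi1, z_lag1)[i] + mat_vec(pi2, z_lag2)[i] for i in range(2)
]

# Conditional mean of z_t from the VECM form: z_{t-1} plus the predicted change.
pi, gamma = var2_to_vecm(pi1, pi2)
dz_lag1 = [z_lag1[i] - z_lag2[i] for i in range(2)]
vecm_mean = [
    z_lag1[i] + mu[i] + mat_vec(pi, z_lag1)[i] + mat_vec(gamma, dz_lag1)[i]
    for i in range(2)
]

print(var_mean, vecm_mean)
```

The two printed vectors agree up to floating-point rounding, because the reparameterization only rearranges terms and does not change the model.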
The rank $r$ of $\Pi$ controls the non-stationarity properties of the series. When $r = 2$, then $\Pi$ is of full rank and invertible and $z_t$ is stationary. When $r = 0$, then $\Pi = 0$ and $y_t$ and $x_t$ evolve as two non-cointegrating random walks ([24] p. 115). When $r = 1$, there is a stationary linear combination of $y_t$ and $x_t$ (the cointegrating relationship). In this case, the long-run coefficient matrix can be written as $\Pi = \alpha \beta^T$, where $\alpha = (\alpha_1, \alpha_2)^T$ (vector of adjustment coefficients) and $\beta = (\beta_1, \beta_2)^T$ (cointegration vector). The scaling factor is given by $\beta_{coint} = -\beta_2 / \beta_1$. When $r = 1$, the model has $2 + 3 + 4 + 3 = 12$ parameters, which give the model flexibility to represent the means and serial covariance of the $y_t$ and $x_t$ series.
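The rank-one structure can be checked for the one-box model of Appendix A. The sketch below builds the long-run matrix from the Appendix A expression for $\Pi$, confirms that its determinant vanishes, and reads the scaling factor off the first row as minus the ratio of its second to its first element. The function name, the AR(1) coefficients and the weights are hypothetical; only the two forcing sensitivities are taken from Section 4.2.

```python
def long_run_matrix(gamma1, gamma1_sim, phi, phi_sim, rho, rho_sim):
    """Long-run matrix Pi of the one-box model, following the Appendix A expression."""
    ratio = gamma1 / gamma1_sim
    return [
        [-(1.0 - rho) * (1.0 - phi), (1.0 - rho) * ratio * (1.0 - phi_sim)],
        [(1.0 - rho_sim) * (1.0 - phi) / ratio, -(1.0 - rho_sim) * (1.0 - phi_sim)],
    ]


gamma1, gamma1_sim = 0.00486, 0.00279   # sensitivities quoted in Section 4.2
phi, phi_sim = 0.5, 0.3                 # hypothetical AR(1) coefficients
rho, rho_sim = 0.2, 0.4                 # hypothetical weighting parameters

pi = long_run_matrix(gamma1, gamma1_sim, phi, phi_sim, rho, rho_sim)

# A 2 x 2 matrix has rank one when it is non-zero and its determinant is zero.
determinant = pi[0][0] * pi[1][1] - pi[0][1] * pi[1][0]

# Each row of Pi is proportional to the cointegration vector (beta_1, beta_2),
# so the scaling factor -beta_2 / beta_1 can be read from the first row.
scaling_factor = -pi[0][1] / pi[0][0]
long_run_ratio = gamma1 * (1.0 - phi_sim) / (gamma1_sim * (1.0 - phi))

print(determinant, scaling_factor, long_run_ratio)
```

The scaling factor recovered from $\Pi$ equals the ratio of the long-run responses of the two series to the common forcing. With an instantaneous response (both AR(1) coefficients zero) it reduces to the ratio of the two sensitivities.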
If $y_t$ can be attributed to $x_t$ because the two series share a common forcing, we expect $y_t$ and $x_t$ to be trending series that have a cointegrating relationship, so $r = 1$. A formal test for the cointegration hypothesis can be formulated as a reduced rank test on the $\Pi$ matrix, with null hypothesis $H_0: r \le q$ for some constant $q$ ($q = 0, 1$ for the bivariate model) and alternative $r > q$. The likelihood ratio-based trace statistic introduced by Johansen [39] was used to test the rank of the $\Pi$ matrix in this study. For our bivariate case, the null hypothesis of at most one cointegration vector was rejected in favour of the alternative of more than one if the estimated trace statistic was greater than the critical value provided, for instance, in Table 1* of Osterwald-Lenum [40]. Compared to the ECM for only $y_t$ used in Cummins et al. [17] and Beenstock et al. [41], the vector ECM has the advantage that it treats both $x_t$ and $y_t$ on an equal footing [24]. Confidence intervals can easily be calculated for the scaling factor by performing OLS fits of the restricted VECM, e.g., using the cajorls function in the urca package in R [42].

3. Results from CMIP5 Climate Model Simulations

The three different estimation methods have been applied to historical Global Mean Surface Temperature (GMST) simulations from 1860–2004 for 16 climate models that participated in CMIP5 (see Table 1). Figure 1 shows time series of the GMST anomalies of the first simulation of each model together with those from the HadCRUT3 gridded dataset observations. It can be clearly seen that all the series are non-stationary and trending upwards—using the augmented Dickey–Fuller [43,44] test for unit root, all the series are found to be integrated of order 1, i.e., year-to-year differences are stationary ([23] Ch. 4).
The model simulated series tend to follow the observed series but there are notable deviations (e.g., HadGEM2-CC is warmer than the observed series right up until 1940). To exclude the possibility of spurious regressions, it is important to test whether or not there is a cointegrating relationship between the observed and each of the simulated temperatures. Table 2 shows the result of applying the rank test discussed in Section 2.3. The test statistics under $H_{r1}: r \le 1$ are all less than the corresponding 5% critical value of 9.24, while all those under $H_{r0}: r = 0$ are greater than the corresponding 5% critical value of 19.96. Therefore, we cannot reject that the rank is less than or equal to 1 but we can reject that the rank is 0, which implies that the rank is 1 and there is a cointegrating relationship between each of the simulations and the observed series in contrast to what was found in [41]. We preferred to use CMIP5 simulations here rather than the more recent CMIP6 simulations, which are known to have some overly sensitive models; however, it is worth noting that similar cointegrating relationships were also found in CMIP6 simulations [17].
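The decision logic of the rank test can be written out explicitly. The sketch below uses the two 5% critical values quoted above; the function name and the example trace statistics are hypothetical and are not values from Table 2.

```python
CRITICAL_R0 = 19.96  # 5% critical value for the hypothesis r = 0, as quoted in the text
CRITICAL_R1 = 9.24   # 5% critical value for the hypothesis r <= 1, as quoted in the text


def select_rank(trace_r0, trace_r1):
    """Sequential rank decision for a bivariate system from two trace statistics."""
    if trace_r0 <= CRITICAL_R0:
        return 0  # r = 0 not rejected: no evidence of cointegration
    if trace_r1 <= CRITICAL_R1:
        return 1  # r = 0 rejected, r <= 1 not rejected: one cointegrating relationship
    return 2      # both rejected: the pair is treated as stationary


# Hypothetical trace statistics, for illustration only.
print(select_rank(35.2, 4.1))
```

A result of 1 corresponds to the situation reported for the CMIP5 simulations: the hypothesis of no cointegration is rejected while the hypothesis of at most one cointegrating relationship is not.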
The scaling relationship between the observations and simulations was estimated using the three different approaches. Figure 2 shows an example of the three different approaches applied to a simulation from the GISS-E2-R climate model. Figure 3 shows distributions of the estimated scaling factor for each of the three methods. The OLS method gives scaling factor estimates that are substantially lower than those from TLS and COINT, as might be expected due to the method not accounting for natural variation in the simulated temperature variable. The OLS estimates also have the least spread, whereas the TLS estimates have the most spread with some questionably large scaling factor estimates exceeding 2. The COINT estimates do not have such outliers and have mean and spread intermediate between those of the OLS and TLS estimates. These conclusions across models are confirmed by estimates within multiple runs from individual models (Figure 4).

4. Results from Stochastic Carbon-Climate Model Simulations

To further investigate the properties of the different estimation methods, it is useful to have data generated by a process where we know the value of the true scaling factor. In this section, this is achieved by showing results from data simulated from a very simple conceptual model of the carbon–climate system.

4.1. Simple Stochastic Model of the Carbon-Climate System

A simple stochastic model of the carbon–climate system can be constructed as follows. It is reasonable to assume that the observed ($y_t$) and model-simulated ($x_t$) global mean temperatures in year $t = 1, 2, 3, \ldots, T$ are linear functions of the atmospheric carbon dioxide concentration $c_t$ [45,46] and that the carbon dioxide concentration accumulates by a random amount from year $t - 1$ to year $t$:
$$
\begin{aligned}
y_t &= \gamma_0 + \gamma_1 c_t + \epsilon_t \\
x_t &= \gamma_0' + \gamma_1' c_t + \epsilon_t' \\
c_t &= \gamma_c + c_{t-1} + \epsilon_t^c ,
\end{aligned} \tag{9}
$$
where $\gamma_c > 0$ represents constant emissions that arrive in the atmosphere each year and $\epsilon_t$, $\epsilon_t'$ and $\epsilon_t^c$ are assumed to be independent Gaussian variates with zero means and constant variances. Therefore, $c_t$ is assumed to be a random-walk process with drift, which is consistent with previous studies that have shown that the time series of global mean temperature as well as the atmospheric concentration of CO$_2$ are integrated processes of an order greater than zero [47,48,49,50,51,52]. This model is a special case of the model in Appendix A, having an instantaneous response to forcing ($\phi = \phi' = 0$). The attribution regression equation and scaling factor can be easily derived for the stochastic model. The expression for $x_t$ in Equation (9) can be used to find $c_t$, which can then be substituted into the equation for $y_t$ to give $y_t = \gamma_0 - \beta \gamma_0' + \beta x_t + \epsilon_t - \beta \epsilon_t'$, where the scaling factor $\beta = \gamma_1 / \gamma_1'$.

4.2. Estimation of Parameters for the Stochastic Model

Figure 5a,b show historical estimates of $c_t$ and $c_t - c_{t-1}$ using data from [53]. After 1960, when direct atmospheric measurements were available, the series $c_t - c_{t-1}$ can be seen to vary randomly around a mean of about 3 GtC/year. Before 1960, the annual changes were smoother and less variable but were then estimated from ice core data that have no annual resolution (personal communication, Prof. Pierre Friedlingstein). The sample mean and variance of the annual differences give estimates of $\hat{\gamma}_c = \overline{c_t - c_{t-1}} = 1.53$ GtC/year and $\hat{\sigma}^2_{\epsilon^c} = 1.43$ (GtC/year)$^2$.
Figure 5c,d show time series plots of historical observed GMST and an example simulation from the GISS-E2-R GCM. When these are plotted against the historical CO$_2$ values (Figure 5e,f), it can be seen that there are clear linear relationships as assumed in the stochastic model. OLS fits to these data give estimates of $\hat{\gamma}_1 = 0.00486$ °C/GtC, $\hat{\gamma}_1' = 0.00279$ °C/GtC, $\hat{\gamma}_0 = -3.41$ °C, $\hat{\gamma}_0' = -1.96$ °C, which gives a known scaling factor of $\beta = \hat{\gamma}_1 / \hat{\gamma}_1' = 1.74$. The sample variance of the observed and simulated temperature residuals was similar and so for simplicity pooled sample variance was used to obtain $\hat{\sigma}^2_{\epsilon} = \hat{\sigma}^2_{\epsilon'} = 0.0203$ (°C)$^2$.
The stochastic model with these parameter estimates was used to simulate $K = 50$ independent realizations of the time series $y_t$, $x_t$, and $c_t$, where $t = 1, 2, \ldots, T$ years. The realizations resemble the original series, as can be seen in the $T = 100$ year example shown in Figure 6.
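The simulation step can be sketched in a few lines. The code below is not the authors' R code. It draws 50 realizations of length 100 from Equation (9) using the drift, sensitivities and variances quoted above, and applies the OLS slope to each. Two simplifications are assumptions made here: the intercepts are set to zero, which does not affect slope estimates, and the starting CO$_2$ level `c_start` is an arbitrary hypothetical value. The seed is also arbitrary.

```python
import math
import random

GAMMA_C = 1.53              # mean annual CO2 increment (GtC/year), from the text
SIGMA_C = math.sqrt(1.43)   # standard deviation of the annual increment
GAMMA_1_OBS = 0.00486       # observed sensitivity (degC per GtC)
GAMMA_1_SIM = 0.00279       # simulated sensitivity (degC per GtC)
SIGMA_T = math.sqrt(0.0203) # pooled temperature noise standard deviation
BETA_TRUE = GAMMA_1_OBS / GAMMA_1_SIM


def simulate(n_years, rng, c_start=600.0):
    """One realization of (y_t, x_t); intercepts omitted, c_start is hypothetical."""
    c, y, x = c_start, [], []
    for _ in range(n_years):
        c += GAMMA_C + rng.gauss(0.0, SIGMA_C)  # random walk with drift
        y.append(GAMMA_1_OBS * c + rng.gauss(0.0, SIGMA_T))
        x.append(GAMMA_1_SIM * c + rng.gauss(0.0, SIGMA_T))
    return y, x


def ols_slope(x, y):
    """OLS slope S_xy / S_xx."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx


rng = random.Random(42)
estimates = []
for _ in range(50):
    y, x = simulate(100, rng)
    estimates.append(ols_slope(x, y))

mean_ols = sum(estimates) / len(estimates)
print(f"true scaling factor: {BETA_TRUE:.2f}, mean OLS estimate: {mean_ols:.2f}")
```

Because the noise in the simulated series is comparable in size to the forced change over a century, the OLS estimates fall well below the true scaling factor. The same simulated pairs could be passed to a TLS or cointegration estimator for comparison.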

4.3. Scaling Factor Estimates from Simulated Data

Figure 7 shows scaling factor estimates for time series of length $T = 100$ years for 50 simulations from the stochastic model. The 95% prediction intervals are calculated from $\hat{\beta} \pm 1.96 s_\beta$, where 1.96 is the 97.5th percentile of the standard normal distribution and $s_\beta$ is the sample standard deviation of the 50 estimates. Figure 7a shows that the OLS estimates underestimate the true scaling factor value in all 50 realizations. Thus, the OLS estimators are strongly negatively biased. By contrast, the TLS estimates in Figure 7b appear to overestimate the scaling factor (positively biased), sometimes by a factor of two, and are much more variable. These results for OLS and TLS agree with the comparison made in (Van Huffel and Vandewalle [28] p. 244) for a moderate sample size: TLS estimates have a mean that is closer to the true value (i.e., are less biased) but have greater variance (i.e., are less efficient) than OLS estimates. The COINT estimates in Figure 7c are evenly scattered around the true value with a variance only slightly larger than that of the OLS estimates.
It is of interest to see how the estimates change with sample size $T$. Since a cointegrating relationship exists in the stochastic model ($y_t - \beta x_t$ is stationary), all the estimates should be consistent and converge on the true value as $T \to \infty$. It can be noted from Figure 8 that this appears to be the case, although for OLS (panels a and d) there is a substantial negative bias and the confidence interval does not overlap the true value even for long time series with $T = 250$ years. The TLS estimates have very wide confidence intervals, especially for small sample sizes $T < 200$, which are typical of the sample sizes that have been used in many previous detection and attribution studies. The COINT estimates appear to be unbiased for samples with $T > 50$ and the confidence intervals overlap the true value for all sample sizes. Sensitivity tests reveal that these results are robust to the choice of parameters in the stochastic model (not shown).
Figure 8 shows the accuracy of the estimators as measured by Mean Squared Error (MSE), which is the sum of the squared bias and the variance of the estimators. COINT is by far the most accurate estimator with the smallest MSE because of having small bias and small variance. For small sample sizes with $T < 100$, the OLS estimator is the next most accurate estimator; despite its negative bias, it has much less variance than the TLS estimator. The TLS estimator, often used in detection and attribution studies, is the least accurate estimator of the scaling factor, primarily due to its high variance. Despite the cointegrating VAR(2) model having more parameters than the TLS regression model, it is apparent that the VAR(2) estimates have much less variance than those of TLS. This is most likely related to the problem of the TLS estimates diverging to infinity when the covariance between $y_t$ and $x_t$ is small.
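The decomposition of MSE into squared bias plus variance can be checked directly. The sketch below is illustrative only: the function name and the four estimates are hypothetical, and the true value is the scaling factor of 1.74 from Section 4.2. The variance uses the divisor $K$ rather than $K - 1$, an assumption made here so that the identity holds exactly.

```python
def accuracy_summary(estimates, true_value):
    """Return (bias, variance, mse) for a set of estimates of a known true value."""
    k = len(estimates)
    mean_estimate = sum(estimates) / k
    bias = mean_estimate - true_value
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / k  # divisor K
    mse = sum((e - true_value) ** 2 for e in estimates) / k
    return bias, variance, mse


# Hypothetical scaling factor estimates from four realizations.
bias, variance, mse = accuracy_summary([1.6, 1.9, 1.8, 1.7], 1.74)
print(bias, variance, mse, bias ** 2 + variance)
```

An estimator with a sizeable bias can therefore still have a smaller MSE than a nearly unbiased one if its variance is much lower, which is the pattern reported for OLS relative to TLS at small sample sizes.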

5. Conclusions and Discussion

This study investigated the use of a dynamic time series model for obtaining estimates of scaling factors in detection and attribution and compared the performance to that of more traditional OLS and TLS regression approaches when applied to CMIP5 GMST simulations and simulations of a simple stochastic carbon–climate model. It has been shown that:
  • The cointegrating VAR(2) model is a reasonable time series approach to use for modelling pairs of observed and model-simulated time series if one assumes the variables are AR(1) responses to common integrated forcing;
  • Unlike OLS, which gives negatively biased estimates, and TLS, which gives positively biased estimates (and large positive outliers), the cointegrating VAR(2) model gives estimates of the scaling factor that are unbiased;
  • The cointegrating VAR(2) model estimates are much more accurate (in terms of MSE) than the OLS and TLS estimates;
  • The TLS estimates have very large variance, which causes large MSE. They give infinite slope estimates if the sample covariance between the observed and model-simulated series is zero;
  • Hypothesis tests on the VAR(2) fits for all the CMIP5 models reassuringly found strong evidence of a cointegrating relationship with the observations, as to be expected for observations and simulations responding to shared trending forcing.
Hence, we recommend that the cointegrated VAR(2) approach is used in future detection and attribution studies in order to obtain more reliable estimates of the scaling factor. In particular, it would be of interest to test its performance in attribution studies with more than one simulated series.
It is worth noting that the GMST response to forcing involves more than one adjustment timescale and is generally best described by a 3-box energy balance model rather than a 1-box energy balance model [17,32,54,55,56]. A vector error correction model can also be derived for this situation but results in a much less parsimonious VARMA model for the pair of observed and simulated variables. In fitting to short time series, it is advantageous to have fewer parameters to estimate, so it is likely that the VAR(2) approach remains more appropriate for most climate applications.

Author Contributions

D.B.S. and A.A.T. wrote the main manuscript text. A.A.T. prepared the figures. D.P.C. provided mathematical results for Appendix A and provided useful critical comments on the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original data were freely obtained from: Historical observed and CMIP5 simulations of GMST: https://climexp.knmi.nl (accessed on 26 July 2023); Historical CO$_2$ estimates: https://esgf-node.llnl.gov/search/input4mips/ (accessed on 26 July 2023). The R code and data are freely available from https://github.com/stormrisk/cointegration (accessed on 26 July 2023). Code for the methods is briefly described in Appendix B.

Acknowledgments

The authors wish to thank Hugo Lambert and Peter Cox for providing the CMIP5 and CO$_2$ data. A.A.T. would like to thank the College of Arts and Sciences, Adelphi University, for the research release time and resources, and James Davidson for research guidance on econometric methods.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Cointegrated VAR(2) Derivation

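Let $y_t$ and $x_t$ be the observed and climate model-simulated AR(1) responses to a common stochastically trending forcing $F_t$ plus stationary natural variability forcing:
$$y_t = \gamma_0 + \phi\, y_{t-1} + \gamma_1 F_t + \varepsilon_t, \tag{A1}$$
$$x_t = \gamma_0' + \phi'\, x_{t-1} + \gamma_1' F_t + \varepsilon_t', \tag{A2}$$
where $|\phi|, |\phi'| < 1$ (i.e., stationary AR model). Non-stationarity is assumed to arise from unit root behaviour in the forcing $F_t$, not from $\phi$ or $\phi'$ being equal to 1. An expression for $F_{t-1}$ can be found by rearranging either Equation (A1) or (A2), or more generally, by taking a linear combination of these two expressions,
$$F_{t-1} = \frac{\rho}{\gamma_1}\left(y_{t-1} - \gamma_0 - \phi\, y_{t-2} - \varepsilon_{t-1}\right) + \frac{1-\rho}{\gamma_1'}\left(x_{t-1} - \gamma_0' - \phi'\, x_{t-2} - \varepsilon_{t-1}'\right), \tag{A3}$$
where $\rho$ is an arbitrary weighting parameter. Equation (A1) may equivalently be written
$$y_t = \gamma_0 + \phi\, y_{t-1} + \gamma_1\left(\Delta F_t + F_{t-1}\right) + \varepsilon_t. \tag{A4}$$
Substituting Equation (A3) into (A4) gives
$$y_t = \frac{(1-\rho)(\gamma_1'\gamma_0 - \gamma_1\gamma_0')}{\gamma_1'} + (\phi + \rho)\, y_{t-1} + \frac{(1-\rho)\gamma_1}{\gamma_1'}\, x_{t-1} - \rho\phi\, y_{t-2} - \frac{(1-\rho)\gamma_1\phi'}{\gamma_1'}\, x_{t-2} + \gamma_1 \Delta F_t + \varepsilon_t - \rho\,\varepsilon_{t-1} - \frac{(1-\rho)\gamma_1}{\gamma_1'}\,\varepsilon_{t-1}'. \tag{A5}$$
There exists a similar expression for $x_t$ with weighting parameter $\rho'$. These two equations together give the vector equation for $z_t = (y_t, x_t)^T$:
$$\begin{pmatrix} y_t \\ x_t \end{pmatrix} = \begin{pmatrix} \dfrac{(1-\rho)(\gamma_1'\gamma_0 - \gamma_1\gamma_0')}{\gamma_1'} \\[2ex] \dfrac{(1-\rho')(\gamma_1\gamma_0' - \gamma_1'\gamma_0)}{\gamma_1} \end{pmatrix} + \begin{pmatrix} \phi + \rho & \dfrac{(1-\rho)\gamma_1}{\gamma_1'} \\[2ex] \dfrac{(1-\rho')\gamma_1'}{\gamma_1} & \phi' + \rho' \end{pmatrix} \begin{pmatrix} y_{t-1} \\ x_{t-1} \end{pmatrix} + \begin{pmatrix} -\rho\phi & -\dfrac{(1-\rho)\gamma_1\phi'}{\gamma_1'} \\[2ex] -\dfrac{(1-\rho')\gamma_1'\phi}{\gamma_1} & -\rho'\phi' \end{pmatrix} \begin{pmatrix} y_{t-2} \\ x_{t-2} \end{pmatrix} + \begin{pmatrix} \gamma_1 \Delta F_t + \varepsilon_t - \rho\,\varepsilon_{t-1} - \dfrac{(1-\rho)\gamma_1}{\gamma_1'}\,\varepsilon_{t-1}' \\[2ex] \gamma_1' \Delta F_t + \varepsilon_t' - \rho'\varepsilon_{t-1}' - \dfrac{(1-\rho')\gamma_1'}{\gamma_1}\,\varepsilon_{t-1} \end{pmatrix}. \tag{A6}$$
When $\Delta F$, $\varepsilon$, and $\varepsilon'$ are stationary processes, these equations have the 2nd order Vector Auto-Regressive VAR(2) representation,
$$z_t = \mu + \Pi_1 z_{t-1} + \Pi_2 z_{t-2} + \boldsymbol{\varepsilon}_t. \tag{A7}$$
The noise term $\boldsymbol{\varepsilon}_t$ has a relatively simple MA(1) temporal structure when forcing innovations $\Delta F_t$ and weights $\rho$ and $\rho'$ are negligible, and $\varepsilon_t$ and $\varepsilon_t'$ are independent in time and from one another. The VAR(2) model may be written equivalently as a Vector Error-Correction Model (VECM),
$$\Delta z_t = \mu + \Pi z_{t-1} + \Gamma \Delta z_{t-1} + \boldsymbol{\varepsilon}_t. \tag{A8}$$
Matrix $\Pi = \Pi_1 + \Pi_2 - I$ is given by
$$\Pi = \begin{pmatrix} -(1-\rho)(1-\phi) & \dfrac{(1-\rho)\gamma_1(1-\phi')}{\gamma_1'} \\[2ex] \dfrac{(1-\rho')\gamma_1'(1-\phi)}{\gamma_1} & -(1-\rho')(1-\phi') \end{pmatrix} = \begin{pmatrix} \Pi_{11} & \Pi_{12} \\ \Pi_{21} & \Pi_{22} \end{pmatrix}. \tag{A9}$$
The rank of $\Pi$ gives the number of (stationary) cointegrating relationships and this, plus the number of (non-stationary) unit roots, must add up to the dimension of $z$, which is 2 [39]. The $\Pi$ above has rank one (i.e., has $2 - 1 = 1$ cointegrating relationship) and can be written as $\Pi = \alpha\beta^T$, where
$$\alpha^T = \left(\Pi_{11}, \Pi_{21}\right), \qquad \beta^T = \left(1, \Pi_{12}/\Pi_{11}\right). \tag{A10}$$
Hence, the cointegration scaling factor is $\beta_{coint} = -\beta_2/\beta_1 = \dfrac{\gamma_1}{\gamma_1'}\,\dfrac{1-\phi'}{1-\phi}$, which is simply the ratio of the equilibrium sensitivities of the two AR(1) responses and does not depend on the weights $\rho$ and $\rho'$.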
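The rank-one property of Π and the closed form of the scaling factor can be checked numerically. The following is a minimal sketch in base R, not part of the original derivation: all parameter values are arbitrary illustrative assumptions, and the suffixes `_y` and `_x` are hypothetical names for the parameters of the observed and simulated series, respectively.

```r
# Illustrative parameter values (assumptions, not taken from the paper)
gamma1_y <- 0.8; phi_y <- 0.6; rho_y <- 0.3   # observed series y
gamma1_x <- 0.5; phi_x <- 0.4; rho_x <- 0.7   # simulated series x

g_yx <- gamma1_y / gamma1_x   # ratio of forcing coefficients, y over x
g_xy <- gamma1_x / gamma1_y   # ratio of forcing coefficients, x over y

# Lag-1 and lag-2 coefficient matrices of the VAR(2) representation
Pi1 <- matrix(c(phi_y + rho_y,       (1 - rho_y) * g_yx,
                (1 - rho_x) * g_xy,  phi_x + rho_x),
              nrow = 2, byrow = TRUE)
Pi2 <- -matrix(c(rho_y * phi_y,               (1 - rho_y) * g_yx * phi_x,
                 (1 - rho_x) * g_xy * phi_y,  rho_x * phi_x),
               nrow = 2, byrow = TRUE)

# Error-correction matrix of the VECM
Pi <- Pi1 + Pi2 - diag(2)

det_Pi    <- det(Pi)                 # zero (up to rounding) for a rank-one matrix
b_from_Pi <- -Pi[1, 2] / Pi[1, 1]    # scaling factor read off the first row of Pi
b_theory  <- (gamma1_y / (1 - phi_y)) / (gamma1_x / (1 - phi_x))  # ratio of equilibrium sensitivities

print(c(det = det_Pi, from_Pi = b_from_Pi, theory = b_theory))
```

Changing `rho_y` or `rho_x` alters the entries of Π but should leave `b_from_Pi` unchanged, which illustrates that the scaling factor does not depend on the weights.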

Appendix B. Software Used to Estimate the Scaling Factor

The scaling factors were easily obtained using the following code in the freely available R language. Data object y refers to a vector of observed temperatures and x refers to the corresponding vector of model-simulated temperatures.

Appendix B.1. OLS Regression

> sxx <- sum((x-mean(x))^2)
> sxy <- sum((x-mean(x))*(y-mean(y)))
> b.ols <- sxy/sxx  # the scaling factor
or alternatively
> fit_ols <- lm(y~x)
> b.ols <- fit_ols$coefficients[2] # the scaling factor (slope on x)

Appendix B.2. TLS Regression

> sxx <- sum((x-mean(x))^2)
> syy <- sum((y-mean(y))^2)
> sxy <- sum((x-mean(x))*(y-mean(y)))
> b.tls <- ((syy-sxx)+sqrt((syy-sxx)^2+4*sxy^2))/(2*sxy)   # the scaling factor
or alternatively
> library(deming)
> fit_tls <- deming(y~x)
> b.tls <- fit_tls$coefficients[2] # the scaling factor (slope on x)
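As a cross-check on the closed-form TLS expression, the slope can also be obtained from the leading eigenvector of the 2 × 2 scatter matrix of (x, y), since orthogonal regression with equal error variances fits the first principal axis. The sketch below uses base R only; the synthetic series `signal`, `x`, and `y` are assumptions made for illustration and are not the data used in the study.

```r
set.seed(1)
n <- 200
signal <- cumsum(rnorm(n))                # synthetic trending signal (assumption)
x <- signal + rnorm(n, sd = 0.5)          # noisy "simulated" series
y <- 1.5 * signal + rnorm(n, sd = 0.5)    # noisy "observed" series

sxx <- sum((x - mean(x))^2)
syy <- sum((y - mean(y))^2)
sxy <- sum((x - mean(x)) * (y - mean(y)))

# Closed-form TLS slope, as in Appendix B.2
b_tls_closed <- ((syy - sxx) + sqrt((syy - sxx)^2 + 4 * sxy^2)) / (2 * sxy)

# Same slope from the leading eigenvector of the scatter matrix
S <- matrix(c(sxx, sxy, sxy, syy), nrow = 2)
v <- eigen(S, symmetric = TRUE)$vectors[, 1]   # eigenvalues are returned in decreasing order
b_tls_eigen <- v[2] / v[1]                     # the arbitrary sign of v cancels in the ratio

# OLS slope for comparison; it is never larger in magnitude than the TLS slope
b_ols <- sxy / sxx

print(c(tls_closed = b_tls_closed, tls_eigen = b_tls_eigen, ols = b_ols))
```

The two TLS values should agree to numerical precision, which gives a quick sanity check before applying the formula to real series.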

Appendix B.3. Cointegrated VAR(2) Model

Johansen’s maximum likelihood estimation procedure is used to estimate the cointegrating vector β and other parameters in the VECM [39,57].
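> library(urca)
> coint <- ca.jo(data.frame(y = y, x = x), type = "trace", K = 2, spec = "transitory",
             ecdet = "const", season = NULL, dumvar = NULL)
> coint.r1 <- cajorls(coint, r = 1)   # restricted VECM with one cointegrating relationship
> b.coint <- -coint.r1$beta[2] # the scaling factor (beta is normalised so that the coefficient on y is 1)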
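The same fitted object also carries the trace test statistics used to decide the cointegration rank. The sketch below assumes the `urca` package is installed and applies the procedure to synthetic data: the forcing series and all AR(1) parameter values are illustrative assumptions, not those of the study. The `teststat` and `cval` slots of the object returned by `ca.jo` hold the statistics and their critical values.

```r
library(urca)

set.seed(42)
n <- 150
forcing <- cumsum(rnorm(n, mean = 0.02, sd = 0.1))   # hypothetical integrated forcing

# Two AR(1) responses to the common forcing (illustrative parameters)
y <- numeric(n)
x <- numeric(n)
for (i in 2:n) {
  y[i] <- 0.6 * y[i - 1] + 0.8 * forcing[i] + rnorm(1, sd = 0.1)
  x[i] <- 0.4 * x[i - 1] + 0.5 * forcing[i] + rnorm(1, sd = 0.1)
}

coint <- ca.jo(data.frame(y = y, x = x), type = "trace", K = 2,
               spec = "transitory", ecdet = "const")

trace_stats <- coint@teststat   # trace statistics for the two rank hypotheses
crit_vals   <- coint@cval       # critical values at the 10%, 5% and 1% levels

# Scaling factor from the restricted VECM with one cointegrating relationship
b_coint <- -cajorls(coint, r = 1)$beta[2]

print(cbind(statistic = trace_stats, crit_vals))
print(b_coint)
```

Comparing each statistic with the 5% column of `crit_vals` mirrors the rank test reported for the CMIP5 runs; `summary(coint)` prints the same information in tabular form.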

References

  1. Haustein, K.; Allen, M.R.; Forster, P.M.; Otto, F.E.; Mitchell, D.M.; Matthews, H.D.; Frame, D.J. A real-time global warming index. Sci. Rep. 2017, 7, 15417.
  2. Sutton, R.; Suckling, E.; Hawkins, E. What does global mean temperature tell us about local climate? Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2015, 373, 20140426.
  3. Ebi, K.L.; Åström, C.; Boyer, C.J.; Harrington, L.J.; Hess, J.J.; Honda, Y.; Kazura, E.; Stuart-Smith, R.F.; Otto, F.E. Using Detection And Attribution To Quantify How Climate Change Is Affecting Health. Health Aff. 2020, 39, 2168–2174.
  4. Hegerl, G.; Hoegh-Guldberg, O.; Casassa, G.; Hoerling, M.P.; Kovats, R.S.; Parmesan, C.; Pierce, D.W.; Stott, P.A. Good practice guidance paper on detection and attribution related to anthropogenic climate change. In Meeting Report of the Intergovernmental Panel on Climate Change Expert Meeting on Detection and Attribution of Anthropogenic Climate Change; 2010; Available online: https://www.ipcc.ch/publication/ipcc-expert-meeting-on-detection-and-attribution-related-to-anthropogenic-climate-change/ (accessed on 26 July 2023).
  5. Allen, M.R.; Tett, S.F.B. Checking for model consistency in optimal fingerprinting. Clim. Dyn. 1999, 15, 419–434.
  6. Mitchell, J.F.B.; Karoly, D.J.; Hegerl, G.C.; Zwiers, F.W.; Allen, M.R.; Marengo, J. Detection of Climate Change and Attribution of Causes. In Climate Change 2001: The Scientific Basis. Contribution of Working Group I to the Third Assessment Report of the Intergovernmental Panel on Climate Change; Houghton, J.T., Ding, Y., Griggs, D.J., Noguer, M., Linden, P.J.v., Dai, X., Maskell, K., Johnson, C.A., Eds.; Cambridge University Press: Cambridge, UK; New York, NY, USA, 2001.
  7. Hasselmann, K. Optimal fingerprints for the detection of time-dependent climate change. J. Clim. 1993, 6, 1957–1971.
  8. Stott, P.A.; Allen, M.R.; Jones, G.S. Estimating signal amplitudes in optimal fingerprinting. Part II: Application to general circulation models. Clim. Dyn. 2003, 21, 493–500.
  9. Zhang, X.; Zwiers, F.W.; Hegerl, G.C.; Lambert, F.H.; Gillett, N.P.; Solomon, S.; Stott, P.A.; Nozawa, T. Detection of human influence on twentieth-century precipitation trends. Nature 2007, 448, 461–465.
  10. Allen, M.R.; Stott, P.A. Estimating signal amplitudes in optimal fingerprinting, part I: Theory. Clim. Dyn. 2003, 21, 477–491.
  11. Gillett, N.; Stone, D.; Stott, P.; Nozawa, T.; Karpechko, A.; Hegerl, G.; Wehner, M.; Jones, P. Attribution of polar warming to human influence. Nat. Geosci. 2008, 1, 750–754.
  12. Hegerl, G.C.; Crowley, T.J.; Allen, M.; Hyde, W.T.; Pollack, H.N.; Smerdon, J.; Zorita, E. Detection of human influence on a new, validated 1500-year temperature reconstruction. J. Clim. 2007, 20, 650–666.
  13. Hegerl, G.; Luterbacher, J.; González-Rouco, F.; Tett, S.F.B.; Crowley, T.; Xoplaki, E. Influence of human and natural forcing on european seasonal temperatures. Nat. Geosci. 2011, 4, 99–103.
  14. Lambert, F.; Stott, P.; Allen, M.; Palmer, M. Detection and attribution of changes in 20th century land precipitation. Geophys. Res. Lett. 2004, 31, L10203.
  15. Stott, P.A.; Sutton, R.T.; Smith, D.M. Detection and attribution of atlantic salinity changes. Geophys. Res. Lett. 2008, 35, L21702.
  16. Zwiers, F.; Zhang, X. Toward regional-scale climate change detection. J. Clim. 2003, 16, 793–797.
  17. Cummins, D.P.; Stephenson, D.B.; Stott, P.A. Could detection and attribution of climate change trends be spurious regression? Clim. Dyn. 2022, 59, 2785–2799.
  18. Hannart, A. Integrated optimal fingerprinting: Method description and illustration. J. Clim. 2016, 29, 1977–1998.
  19. Eyring, V.; Gillett, N.P.; Achuta Rao, K.M.; Barimalala, R.; Barreiro Parrillo, M.; Bellouin, N.; Cassou, C.; Durack, P.J.; Kosaka, Y.; McGregor, S.; et al. Human Influence on the Climate System. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S.L., Péan, C., Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M.I., et al., Eds.; Cambridge University Press: Cambridge, UK, 2021; pp. 423–552.
  20. Phillips, P.C.B. Understanding spurious regressions in econometrics. J. Econom. 1986, 33, 311–340.
  21. Granger, C.W.J.; Newbold, P. Spurious regressions in econometrics. J. Econom. 1974, 2, 111–120.
  22. Yule, G.U. Why do we sometimes get nonsense-correlations between time-series?—A study in sampling and the nature of time-series. J. R. Stat. Soc. 1926, 89, 1–63.
  23. Turasie, A. Cointegration Modelling of Climatic Time Series. Ph.D. Thesis, University of Exeter, Exeter, UK, 2012. Available online: https://ethos.bl.uk (accessed on 26 July 2023).
  24. Juselius, K. The Cointegrated VAR Model: Methodology and Applications; Oxford University Press: Oxford, UK, 2006.
  25. Engle, R.F.; Granger, C.W.J. Co-integration and error correction: Representation, estimation and testing. Econometrica 1987, 55, 251–276.
  26. Hegerl, G.C.; von Storch, H.; Hasselmann, K.; Santer, B.D.; Cubasch, U.; Jones, P.D. Detecting Greenhouse-Gas-Induced Climate Change with an Optimal Fingerprint Method. J. Clim. 1996, 9, 2281–2306.
  27. Morton-Jones, A.; Henderson, R. Generalized Least Squares with Ignored Errors in Variables. Technometrics 2000, 42, 366–375.
  28. Van Huffel, S.; Vandewalle, J. The Total Least Squares Problem: Computational Aspects and Analysis; Society for Industrial Mathematics: Philadelphia, PA, USA, 1991.
  29. Golub, G.H.; Van Loan, C.F. Matrix Computations; JHU Press: Baltimore, MD, USA, 2013.
  30. Cox, P.; Huntingford, C.; Williamson, M. Emergent constraint on equilibrium climate sensitivity from global temperature variability. Nature 2018, 553, 319–322.
  31. Johansen, S. Estimation and hypothesis testing of cointegration vectors in gaussian vector autoregressive models. Econom. J. Econom. Soc. 1991, 59, 1551–1580.
  32. Cummins, D.P.; Stephenson, D.B.; Stott, P.A. Optimal Estimation of Stochastic Energy Balance Model Parameters. J. Clim. 2020, 33, 7909–7926.
  33. Beenstock, M.; Reingewertz, Y.; Paldor, N. Polynomial cointegration tests of anthropogenic impact on global warming. Earth Syst. Dyn. 2012, 3, 173–188.
  34. Estrada, F.; Perron, P. Extracting and analyzing the warming trend in global and hemispheric temperatures. J. Time Ser. Anal. 2017, 38, 711–732.
  35. Phillips, P.C.; Leirvik, T.; Storelvmo, T. Econometric estimates of earth’s transient climate sensitivity. J. Econom. 2020, 214, 6–32.
  36. Pretis, F. Econometric modelling of climate systems: The equivalence of energy balance models and cointegrated vector autoregressions. J. Econom. 2020, 214, 256–273.
  37. Stern, D.I.; Kaufmann, R.K. Anthropogenic and natural causes of climate change. Clim. Chang. 2014, 122, 257–269.
  38. Granger, C. Some properties of time series data and their use in econometric model specification. J. Econom. 1981, 16, 121–130.
  39. Johansen, S. Statistical analysis of cointegration vectors. J. Econ. Dyn. Control 1988, 12, 231–254.
  40. Osterwald-Lenum, M. A note with quantiles of the asymptotic distribution of the maximum likelihood cointegration rank test statistics. Oxf. Bull. Econ. Stat. 1992, 54, 461–472.
  41. Beenstock, M.; Reingewertz, Y.; Paldor, N. Testing the historic tracking of climate models. Int. J. Forecast. 2016, 32, 1234–1246.
  42. Johansen, S. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models; Oxford University Press: Oxford, UK, 1995.
  43. Dickey, D.A.; Fuller, W.A. Distribution of the estimators for autoregressive time series with a unit root. J. Am. Stat. Assoc. 1979, 74, 427–431.
  44. Dickey, D.A.; Fuller, W.A. Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 1981, 49, 1057–1072.
  45. Rust, B.W.; Thijsse, B.J. Data-based models for global temperature variations. In Proceedings of the 2007 International Conference on Scientific Computing, Sozopol, Bulgaria, 5–9 June 2007; CSREA Press: Las Vegas, NV, USA, 2007; pp. 10–16.
  46. Rust, B.W. A mathematical model of atmospheric retention of man-made CO2 emissions. Math. Comput. Simul. 2011, 81, 2326–2336.
  47. Harvey, D.I.; Mills, T.C. Modelling global temperature trends using cointegration and smooth transitions. Stat. Model. 2001, 1, 143–159.
  48. Kaufmann, R.K.; Stern, D.I. Cointegration analysis of hemispheric temperature relations. J. Geophys. Res. 2002, 107, 4012.
  49. Kaufmann, R.K.; Kauppi, H.; Stock, J.H. The relationship between radiative forcing and temperature: What do statistical analyses of the instrumental temperature record measure? Clim. Chang. 2006, 77, 279–289.
  50. Liu, H.; Rodríguez, G. Human activities and global warming: A cointegration analysis. Environ. Model. Softw. 2005, 20, 761–773.
  51. Stern, D.I.; Kaufmann, R.K. Econometric analysis of global climate change. Environ. Model. Softw. 1999, 14, 597–605.
  52. Stern, D.I.; Kaufmann, R.K. Detecting a global warming signal in hemispheric temperature series: A structural time series analysis. Clim. Chang. 2000, 47, 411–438.
  53. Meinshausen, M.; Vogel, E.; Nauels, A.; Lorbacher, K.; Meinshausen, N.; Etheridge, D.M.; Fraser, P.J.; Montzka, S.A.; Rayner, P.J.; Trudinger, C.M.; et al. Historical greenhouse gas concentrations for climate modelling (CMIP6). Geosci. Model. Dev. 2017, 10, 2057–2116.
  54. Caldeira, K.; Myhrvold, N.P. Projections of the pace of warming following an abrupt increase in atmospheric carbon dioxide concentration. Environ. Res. Lett. 2013, 8, 034039.
  55. Fredriksen, H.B.; Rypdal, M. Long-range persistence in global surface temperatures explained by linear multibox energy balance models. J. Clim. 2017, 30, 7157–7168.
  56. Tsutsui, J. Quantification of temperature response to CO2 forcing in atmosphere–ocean general circulation models. Clim. Chang. 2017, 140, 287–305.
  57. Pfaff, B. Analysis of Integrated and Cointegrated Time Series with R, 2nd ed.; Springer: New York, NY, USA, 2008; ISBN 0-387-27960-1.
Figure 1. Time series plots of observed (solid lines) and simulated (dashed lines) temperature anomalies for each GCM over the period 1860–2004. For simplicity, only the 1st simulation from each GCM has been shown. Temperature anomalies were calculated by subtracting the time mean over the 1960–1990 base period.
Figure 2. Scatter plot of observed temperature anomalies versus model-simulated temperatures from the GISS-E2-R model. Best fit lines are shown for OLS regression (thin dashed line), TLS regression (thick dashed line), and the VAR cointegrating relationship (thin solid line).
Figure 3. Comparison of the three estimates of the scaling factor across all 16 GCMs: (a) estimates for the first simulation from each of the 16 GCMs: O-OLS, T-TLS, and C-COINT, (b) boxplots of the scaling estimates over all simulations from each of the climate models, i.e., 70 simulated time series (see Table 1).
Figure 4. Boxplots of scaling factor estimates made for all available simulations of each climate model: (a) OLS estimates, (b) TLS estimates, and (c) cointegrated VAR estimates.
Figure 5. Time series plot of historical data over the 20th century: (a) atmospheric concentration of CO2 (in units of Gigatons of Carbon), (b) year-to-year differences in atmospheric concentration of CO2, (c) observed global mean surface temperature anomalies, and (d) global mean surface temperature anomalies simulated by the GISS-E2-R model. Scatter plots of global mean temperature anomalies versus atmospheric concentration of CO2 (with OLS fits as solid lines) for (e) observations, and (f) GISS-E2-R simulation. Temperature anomalies were calculated by subtracting the time mean over the 1960–1990 base period.
Figure 6. A typical example of time series simulated by the stochastic carbon–climate model: (a) atmospheric concentration of CO2, (b) observed temperature anomalies, (c) climate model temperature anomalies, and (d) scatter plot of observed versus simulated temperature anomalies with solid line showing the OLS fit.
Figure 7. Scaling factor point estimates (black circles) and 95% confidence intervals (whiskers) obtained from 50 simulations of 100 years of the carbon–climate model using (a) OLS regression, (b) TLS regression, and (c) cointegrating VAR model. The true scaling factor of 1.74 is depicted by the thick horizontal lines.
Figure 8. Properties of scaling factor estimates from 50 simulations of time series of lengths from 25 to 250 years. Upper panels: mean scaling factor estimates (black lines) and 95% confidence interval (grey shading) obtained by using (a) OLS regression, (b) TLS regression, and (c) cointegrating VAR. The true value of 1.74 for the scaling factor is denoted by the thick horizontal lines. Lower panels show sample properties of the estimators: (d) Mean Squared Error, (e) bias, and (f) variance.
Table 1. Summary of the CMIP5 models used in this study and the number of available historical simulations.
|    | Model | Institution | Simulations |
|----|-------|-------------|-------------|
| 1 | bcc-csm1-1 | Beijing Climate Center, China Meteorological Administration | 3 |
| 2 | CanESM2 | Canadian Centre for Climate Modeling and Analysis | 5 |
| 3 | CCSM4 | NCAR Community Climate System Model | 6 |
| 4 | CNRM-CM5 | Centre National de Recherches Meteorologiques/Centre Europeen de Recherche et Formation Avancees en Calcul Scientifique | 10 |
| 5 | CSIRO-Mk3-6-0 | Commonwealth Scientific and Industrial Research Organisation and the Queensland Climate Change Centre of Excellence | 10 |
| 6 | EC-Earth23 | European Centre for Medium-Range Weather Forecasts | 1 |
| 7 | GISS-E2-R | NASA Goddard Institute for Space Studies | 10 |
| 8 | GISS-E2-H | NASA Goddard Institute for Space Studies | 5 |
| 9 | HadCM3 | Met Office Hadley Centre | 1 |
| 10 | HadGEM2-CC | Met Office Hadley Centre | 1 |
| 11 | HadGEM2-ES | Met Office Hadley Centre | 4 |
| 12 | inmcm4 | Institute for Numerical Mathematics, Moscow, Russia | 1 |
| 13 | IPSL-CM5A-LR | Institut Pierre Simon Laplace, Paris, France | 4 |
| 14 | MIROC5 | Atmosphere and Ocean Research Institute (The University of Tokyo) | 1 |
| 15 | MRI-CGCM3 | Meteorological Research Institute, Tsukuba, Japan | 5 |
| 16 | NorESM1-M | Norwegian Climate Centre | 3 |
Table 2. Trace test statistics for hypotheses about the rank $r$ of $\Pi$ for the first runs of the 16 CMIP5 models: $H_{r1}$: $r \le 1$ (non-stationary process) and $H_{r0}$: $r = 0$ (two independent random walks with no cointegrating relationship). Hypothesis $H_{r1}$ can be rejected at the 5% level if the statistic exceeds 9.24, and $H_{r0}$ can be rejected at the 5% level if the statistic exceeds 19.96.
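| Simulation | Null hypothesis $H_{r1}$: $r \le 1$ | Null hypothesis $H_{r0}$: $r = 0$ |
|------------|------|-------|
| $M_{1.1}$ | 1.67 | 33.35 |
| $M_{2.1}$ | 1.77 | 34.49 |
| $M_{3.1}$ | 2.01 | 36.82 |
| $M_{4.1}$ | 2.55 | 29.47 |
| $M_{5.1}$ | 3.50 | 25.70 |
| $M_{6.1}$ | 1.91 | 28.35 |
| $M_{7.1}$ | 1.66 | 28.63 |
| $M_{8.1}$ | 2.17 | 25.78 |
| $M_{9.1}$ | 1.79 | 30.68 |
| $M_{10.1}$ | 3.60 | 29.17 |
| $M_{11.1}$ | 3.96 | 28.23 |
| $M_{12.1}$ | 2.10 | 50.32 |
| $M_{13.1}$ | 1.64 | 32.46 |
| $M_{14.1}$ | 3.40 | 54.99 |
| $M_{15.1}$ | 2.20 | 36.40 |
| $M_{16.1}$ | 2.17 | 43.72 |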
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Stephenson, D.B.; Turasie, A.A.; Cummins, D.P. More Accurate Climate Trend Attribution by Using Cointegrating Vector Time Series Models. Sustainability 2023, 15, 12142. https://doi.org/10.3390/su151612142