Aggregated World Energy Demand Projections: Statistical Assessment

: The primary purpose of this research is to assess the long-range energy demand assumption made in relevant Roadmaps for the transformation to a low-carbon energy system. A novel interdisciplinary approach is then implemented: a new model is estimated for the aggregated world primary energy demand with long historical time series for world energy, income, and population for the years 1900–2017. The model is used to forecast energy demand in 2050 and assess the uncertainty-derived risk based on the variances of the series and parameters analysed. The results show that large efﬁciency savings—up to 50% in some cases and never observed before—are assumed in the main Roadmaps. This discrepancy becomes signiﬁcantly higher when even moderate uncertainty assumptions are taken into account. A discussion on possible future sources of breaks in current patterns of energy supply and demand is also presented, leading to a new conclusion requiring an active political stance to accelerate efﬁciency savings and lifestyle changes that reduce energy demand, even if energy consumption may be reduced signiﬁcantly. This will likely include replacing the income-growth paradigm with other criteria based on prosperity or related measures.


Introduction
Forecasting long-range energy demand at the aggregated world level is the starting point for building general Roadmaps for the transformation to a low carbon energy system. Specifically, the forecasts are required to design an investment path on renewable energies and the required speed of deployment, balancing the costs and the implied demands on the volume of financial resources. Currently, the available long-range demand forecasts employed in Roadmaps are frequently based on ad hoc assumptions, mainly on population and GDP growth, and savings and efficiency. This procedure is usually pondered by a scenario analysis based on a few modified global assumptions, described in the literature review and discussion presented in Section 2.
In this paper, a further new contribution on the modelling of energy demand is proposed and implemented, based on a statistical analysis of the long-term historical time series that covers that gap. The results allow for comparing and assessing currently available future demand forecasts found in some relevant published Roadmaps [1][2][3]. A proper statistical risk assessment, beyond the standard scenario analysis, is also implemented. The research implements an interdisciplinary approach, applying statistical techniques to analyse a relevant energy issue, and discussing the evaluation of probability forecasting, as suggested in [4]. Finally, a discussion on possible sources of future breaks in energy supply and demand is presented.
Section 2 provides some background on the research topics; Section 3 presents the methodology briefly, explained in detail in Appendix A. Section 4 reports the main results and describes an assessment of the projections in some Roadmaps, and Section 5 discusses sources of future energy supply and demand breaks. Section 6 concludes with a summary This approach, besides, allows a systematic analysis of uncertainty properly embedded in statistics that allows, among other things, accounting for the variability of the driving variables of the model, i.e., the standard deviations of the input variables in the model (e.g., [18][19][20]) for an early implementation of this approach simulating long-range GDP values as a function of population. The early work of [19] is especially remarkable, which addressed the lack of data building a proper statistical distribution by weighting judiciously the answers elicited from a range of experts in the field. The research presented here follows this second aggregated approach, yielding valuable insights.
There is a wealth of literature dealing with energy forecasting, mainly focused in the short-and medium-term, up to 20 years ahead, at several levels of disaggregation, as reviewed, e.g., in [21]. Attempts at forecasting future energy demand, even at high frequencies like hourly and for the long-range, i.e., 2050, have been implemented as well, although [22] assess them instead as exploratory exercises. The research presented here is related but does not belong to the same category, and is based on an aggregate behavioural model estimated statistically with a long-term data set of historical series, intended to serve as a yardstick to assess current projections made commonly in published Roadmaps. The results reported in Section 4 show that this approach yields valuable insights.

Methodology
This study is based on the statistical estimation of an aggregated world demand energy model. The advantage of this approach is that there are long-term historical data series for the relevant variables available. This allows implementation of a new approach based on the estimation of a model based on actual data, as opposed to the many assumptions required in more disaggregated models, where the values for most parameters are ultimately guessed, however reasonable or justified the reasoning supporting the guess may be. Three series were analysed in a joint model for Total Primary Energy Supply (TPES) [23], world Gross Domestic Product (GDP) [24,25], and world population [26,27], an analysis that is missing in the literature. The frequency of the data analysed is annual, spanning the period 1900-2017.
The sources for the series, notably for TPES and GDP, are based on academic research rather than direct data gathering and observation, but are generally considered the most reliable in their respective fields. One potential drawback of this type of analysis is that the sample may be somewhat unstable since it covers such a long time period, i.e., almost 120 years. As it will be seen in the empirical results reported, nevertheless, dealing appropriately with some statistical issues allows for the estimation of a reasonably accurate and stable model for the whole period (Appendix B).
With the main estimated model, and after deriving the long-run dynamic equilibrium solution, simulations were run for the year 2050, the most common horizon date analysed in published Roadmaps for the low carbon energy transformation. Thus, the simulations are a new approach that replaces the usual sensitivity analysis. Besides forecasting specific values for energy demand, this solution allows a proper statistical risk analysis to be conducted; i.e., the reliability of that estimate for given probabilities can and has been calculated (Appendix A). This new risk analysis complements the simulation results in novel ways beyond the standard restricted set of scenario analysis. The risk analysis conducted is based on [28], where the value-at-risk methodology customarily applied in finance was introduced in the energy field. This methodology has been subsequently implemented in related problems in this literature, e.g., [29,30].
Finally, the results obtained were compared to selected relevant projected low carbon Roadmaps (Section 5.1). The comparison allowed an independent assessment of some of the Roadmaps' assumptions, particularly the forecasted world energy demand.

Results
The simulation and risk analysis were conducted with the estimated models reported in column III, considered, covering a reasonably wide array of realistic cases, according to most Roadmaps proposed by the leading organisations in the field. The model has been fitted to TPES actual values, provided in [23]. The final TFEC reported values are obtained assuming a 0.7 ratio of actual consumption to supply values-this coefficient has been remarkably stable over a long historical period [31].  The most salient results follow: (a) Energy intensity decreases with higher GDP growth rates, as seen by comparing the last line of columns I vs. II and III vs. IV: this is an encouraging result, derived from the fact that the elasticity of energy demand to GDP is lower than one-0.67, see Table A1 in Appendix B. (b) Contrarily, energy intensity increases with higher population growth rates, and in a nonnegligible measure-see columns I vs. III and II vs. IV; whether the current declining population growth rates will be sustained in the coming years, as is generally forecast, or not, remains to be seen; it must be remarked in this connection that over a long historical period, the variability of the population growth rate has been significant-0.5% from 1900 to 2017, and this is a matter of concern. (c) Simulated values are commonly projected straightforwardly with a given elasticity. However, this procedure does not account for dynamic adjustments, clearly detected in the empirical estimations; see the results in Table A1 in the Appendix B. Fortunately, this omission does not cause a significant bias, as can be seen by comparing the value in the lines 'contr. GDP' and 'contr. dyn.'. Worth remarking, nevertheless, is that the dynamic contribution is negative for low population growth rates, implying that the usual Roadmaps projections would be slightly overestimated and the opposite for higher growth rates. (d) Possibly the most worrying result concerns the variance of the projected values; if a 90% probability confidence is to be guaranteed, the projected energy demand values increase by 60% to 80%; ensuring an 80% probability does not decrease this range by much. This may have far-reaching implications for the general design of Roadmaps, ranging from the assessment of investment and financial requirements to the sheer availability of some rare earth and other metals required for wind turbines and solar panels.
These results were obtained with the best available data produced in academic research. Eventual future improvements in the data may improve the accuracy of the results to some extent, although the potential impact on the global qualitative implications is less likely.

Assessment of Roadmaps Projections
The previous results may be a useful yardstick to assess commonly published values for global world energy demand in transition Roadmaps to a low carbon economy. The number of these Roadmaps is in the range of several hundred, as analysed in the IPCC report [5,11]. Here, only a few will be considered, deemed to be among the most relevant: two by the leading international nonacademic institutions in the field of energy analysisthe International Energy Agency (IEA) [2,32] and the International Renewable Energy Agency (Irena) [3,33], and the Energy [R]evolution series [34,35], updated by its leading author [1]. The GDP and population growth rate assumptions, and the forecast final energy demand in the three Roadmaps, are gathered in Table 2. All make very close assumptions regarding both driving variables, i.e., population and GDP; the GDP growth rate is generally taken from the OECD and the population from the UN population research department. The significantly remarkable results are reported in the last two columns. First, and regarding the efficiency conversion ratio from primary to final supply, the IEA [1,2,32] kept the historical rate of 0.7, whereas Irena [3,33] assumed a significant improvement, 0.6. The most outstanding result, however, is the final assumed consumption, shown in the last column for all three cases, is far lower than the smallest value reported in the simulations of Table A1. Accounting for uncertainty, the comparison is even more striking. What these assumptions imply, therefore, is a high savings rate that has not been observed historically and is fraught by several sources of uncertainty, as discussed in Section 5.

Future Demand Uncertainties
As emphasised by the Nobel prize winner C. Granger [12,13], it is feasible to achieve reasonably good long-run forecasting and simulation accuracy. However, the unexpected breaks tend to cause the failure of those forecasts. The uncertainty shown and discussed in Section 4 should not be viewed inside this category since it is somehow derived inside the model estimation procedure that was implemented. There are, nevertheless, at least some features that can be broadly identified as possible causes for future breaks, affecting both the supply and the demand. These breaks would possibly imply significant changes in the model-estimated parameters.
As for the future energy demand, a first possible source of future increases is the proliferation of blockchain technology applications and cryptocurrencies, notably the bitcoin, e.g., [36]. However, [37] argues that the problem just affects bitcoin, since blockchain technology can be implemented much more efficiently than has been up to now. A second possible source is an increase in telecommunications services and applications [38,39]. In this respect, it is to be noted that overall energy demand has not decreased during the recent pandemic precisely because of this, a trend that is expected to last [40]. A third possibility is the rebound effect, i.e., demand increases precisely because the product or service considered becomes increasingly cheaper [41,42]. This is what has happened with the leading renewable energy technologies, i.e., wind and solar, for the past 50 years and partially explains its accelerated expansion. Finally, other likely sources of increased demand trends may arise from the increasing trend in urbanisation [43] and the increasing GDP growth of underdeveloped countries [44]. Possible reductions in future demand can also be identified already. First, straightforward demand reductions derived from lifestyle changes are not usually discussed, but are nevertheless natural and feasible as underlined, e.g., in [45,46]. One brief example would be dietary changes, which would simultaneously be more sustainable overall, e.g., reducing certain types of meat consumption. Another possibility, directly derived from engineering analysis, is based on the claim that up to 70% reduction in some cases can be achieved without compromising the services provided by energy, i.e., direct savings [47]. This claim has not been disputed in the literature, suggesting, therefore, that the world is currently consuming far more energy resources than required to sustain the same welfare levels. Finally, a third possibility is a replacement of the GDP growth as the main target of wellbeing by a general prosperity target [48], or some other more specific criterion like the human development index proposed by the UN [49], or other related measures as reviewed in [50] (see [51], e.g., for a discussion).
Overall, it is to be concluded that, besides the many uncertainties derived from the estimation procedures with available data, there are several possible causes of future 'breaks' in past demand and supply patterns. Finally, these breaks can mean demand increases but also decreases, so that the final implication, for now, is a significant increase in future uncertainty.

Conclusions
The relative merits of the disaggregated vs. the aggregated approaches for longrange future energy demand forecasting and simulation were assessed. It was argued that the second approach has been disregarded to a large extent, when in fact, it can offer better forecasts and tools for risk analysis. Consequently, long historical data series for energy, GDP, and population, covering the period 1900-2017, were gathered and analysed by means of a statistical model. Thus, an interdisciplinary approach was implemented, capable of yielding new insights into long-range energy demand forecasting, useful for the design of Roadmaps, among other aspects.
With the resulting model, plausible energy demand projections for the year 2050 were made and used as a yardstick to compare them with published results from a few selected relevant Roadmaps for a low carbon transformation. It was shown that the energy-saving assumptions are quite large and have never been observed historically: e.g., under mild growth assumptions for population (0.5%) and GDP growth rates (3%), the results of this research yield a TFEC of 576 EJ/yr, whereas Irena assumes a value of 370 EJ/yr and the IEA 407 EJ/yr. A risk analysis was conducted, accounting for the future variability of GDP and population; under a moderate probability assumption implying that energy demand does not increase above 30% of its projected value, the model yielded a maximum value of 852 EJ/yr, i.e., almost double the figures assumed by Irena and the IEA. These results may have far-reaching implications, as long as they require an accelerated implementation of the low-carbon Roadmaps to meet that future demand without disruptions to economic activity.
The research was complemented by discussing possible future breakpoints in energy demand and supply patterns. These breaks point to potential demand increases, but also possible and feasible significant decreases that would nevertheless require active policy measures to be realised. These results indicate considerable uncertainty that could lead to even higher energy demand values. All these implications underline the need for an active political stance if energy demand trends are to be reversed to fit the Roadmaps designs for achieving a low carbon and sustainable energy system. Some future research lines following the aggregated approach implemented in the present research include at least the specification of an interrelated model for the three variables considered, i.e., energy demand, GDP growth, and population, and a complementary simulation-based risk analysis. In particular, the dependence of GDP on population and the age structure has been emphasised in the literature [18,20], as has been the inverse relationship, i.e., the impact of GDP growth on population [52,53]. Some degree of geo-graphical and time disaggregation would also be worth pursuing, pending the availability of data. Finally, the potential implications for implementing low-carbon Roadmaps, i.e., the deployment of renewable energy sources, are huge and should be researched further.
Funding: This research received no external funding.

Acknowledgments:
The comments of four anonymous referees and the attendants at several university workshops are gratefully acknowledged. Any possible remaining errors are my sole responsibility.

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A. Theory and Methods
This appendix presents and discusses the statistical model implemented in the empirical estimations. The dynamic equilibrium solution and its variance, implemented to derive the main results reported in Section 4, are also analysed.
The heteroskedasticity corrections applied in the estimation procedure, in particular, deserve a detailed analysis presented next. Consider first the standard estimated linear regression model written as follows:û β is the ordinary least squares estimate of β, i.e., the parameters associated with the explanatory variables x t = (1, log(Gdp t ), log(Pop t )), in the corresponding model, and where y t = log(E t ), is the dependent explained variable-note that for the sake of notational saving, E t is used instead of TPES. The variance of u t , σ 2 t is estimated through a weighted moving average with declining weights as follows: This is designed to tackle a variance that is not constant and has a changing pattern, after an analysis of the first straightforward ordinary least squared residuals,û 2 t . Then, the heteroskedasticity weighted variables are defined from here as given by The next step is to estimate the following model with the weighted variables, The purpose is to account for dynamics in a standard way [54] and simultaneously introduce a further heteroskedasticity correction to account for a different error variance in two sample periods split in 1970. Additionally, the errors follow an AR(1) process dealt with the Prais-Winsten estimation procedure. The dynamic equilibrium solution that will allow for simulating long-run values and estimate their variance can be worked out from this model and is explicitly given by (e.g., [51]): where ∆x is the vector of constant equilibrium growth rates of the variables in x t , and (γ/δ) is the vector of their long-run equilibrium impacts-elasticities if the variables are in logs. as they usually are. It should be noted that while all three coefficients in the vector α are significantly different from zero statistically, only the coefficient for log(Gdp t ) is in the vector γ, as reported in the estimation results in Table A1. Finally, the simulated value for y t at a future horizon date H, i.e., y H , will be given straightforwardly from Expression (A10), replacing the subindex t by H. The next step is to work out the variance of E t = exp(y t ) to allow for establishing probability prediction intervals for its simulated values. First from (A5), the simulated future value for E t is nonrandom. Second, the growth rates of the driving variables Gdp t and Pop t , are assumed as constant, but their variance must be accounted for, and their respective historical values in the sample analysed are considered. Third, and following [13,18], the variance of the estimated parameters is not accounted for. However, the error variance of the fitted model, and although it turns out to be of a smaller order of magnitude, is added. The variance of E t for short term forecasting purposes can be usefully calculated as the variance σ 2 ofˆ t , given bŷ The variance of y t , σ 2 y , is finally given by where σ 2 2 , σ 2 3 , are the variances of x 2 = log(Gdp), x 3 = log(Pop), respectively (see the definition of x t below (A1)); the sample covariance is omitted since it is not significantly different from zero. Finally, the variance of E t = exp(y t ) is given by the variance of a log normal distribution [54], i.e., where y t is the mathematical expectation of y t , and is estimated by replacing the values of the estimated parameters as given in (A11). Finally, and given an expected simulated value at a future horizon date H for E H , E H = exp(y H )(see (A10), (A11), and (A12)), its variance (A14), and assuming Gaussianity, a conventional probability interval for the actual E H value can be set up as follows: where ϕ = (1.645, 1.282, 0.842) for the corresponding values of α = (95%, 90%, 80%)note that in the current context, H is usually taken as the year 2050. The expression E H + ϕ × σ E was originally denoted as value-at-risk (VaR), and it means the maximum value that can be expected for energy demand to attain with the stated probability. This risk analysis methodology was first introduced in the energy literature in [28] and later implemented, e.g., in [29,30]. It was applied in the main empirical results reported in Section 4, Table 1.

Appendix B. Empirical Results
The main estimation results for the model in Equation (A5) are reported in Table A1. The same equation is estimated under the same conditions, i.e., sample period, autoregressive procedure, and identical specification, but with a different treatment of the error variance. The coefficients change somewhat, although the long-run equilibrium elasticity l.r.ε(E, Gdp)-see (A10)-is remarkably stable. As implied by the t-ratios reported below them, the variances of the estimated coefficients generally improve when account is taken of the heteroskedasticity, as they should according to the theory. It should also be underlined that the variance of the estimated long-run elasticity is remarkably low, which provides reliability to the simulations reported.
An autoregressive error was also added to all estimated models, and the final DW does not point to any unresolved dynamic issue, even if its validity is only descriptive. As measured by R 2 , the fit is generally good, notably for the models that account for heteroskedasticity explicitly-columns II and III. A correction for the number of parameters estimated is accounted for by the adjusted R 2 , also reported. It can be seen that it is almost the same as R 2 , because the number of estimated parameters related to the number of observation data points is low-for columns II and III it only differs in the fifth digit, which is not reported. Statistical standard deviations for the regression errors and the dependent variable for every estimated model are reported as well. Note that in columns II and III the heteroskedasticity weighting changes the scale, although the remaining measures are invariant to this correction. Lastly, the two variables, Energy and the GDP, are also cointegrated according to conventional tests, better when heteroskedasticity is adequately dealt with.
It is also worth analysing the behaviour of the original variable graphically, E t , and its fitted valuesÊ t = exp(ŷ t )-see (A11-12), and the fitting errorsˆ t = E t −Ê t and their variance σ 2 . Figure A1 first displays the actual and fitted values, showing a tight fit with a correlation coefficient of 0.999; standard cointegration tests also show that the null hypothesis of cointegration cannot be rejected, and the forecasting error standard deviation is 1.6% compared to the 23% of the actual demand over the last sample period. Formal stability tests of the Chow-type were also conducted, rejecting the null hypothesis of instability; for the last 5 (2013-2017) and 18 (2000-2017) years, the test yields 0.326 and 1.791, respectively, accepting, therefore, the stability hypothesis according to the conventional 5% significance values of the F distribution-F(5,41) = 2.44, F(5,41) = 1.86. Figure A2 depicts the errorsˆ t of the fit, where the higher variance of the second period after 1970 is clearly seen. This is because the series before 1960 is interpolated from actual data every ten years period, and therefore is reasonably smooth. This also underscores the nature of the heteroskedasticity in this context, i.e., a variance change at a point in time. Nevertheless, time-dependent heteroskedasticity of the Arch-Garch-type was also tested and rejected; the null test of joint nonsignificant lags up to the third-order yields 1.75, with p-value 0.159 from and F(3,106) distribution.  , rho-differenced data. (5) rho, autoregressive coefficient. (6) A constant included in all regressions-properly weighted where applicable, columns II and III. (7) I, no heteroskedasticity correction; II, first heteroskedasticity correction (see (A4), (A5)); III, double heteroskedasticity correction (see (A7-9)). (8) . . , , elasticity of Energy demand to GDP. (9) t-ratios in braces below every estimated coefficient. (10) , adjusted for the number of parameters. (11) S.E., standard error of the regression. (12) S.D., standard deviation of the dependent variable.
It is also worth analysing the behaviour of the original variable graphically, , and its fitted values = -see (A11-12), and the fitting errors ̂ = − and their variance . Figure A1 first displays the actual and fitted values, showing a tight fit with a correlation coefficient of 0.999; standard cointegration tests also show that the null hypothesis of cointegration cannot be rejected, and the forecasting error standard deviation is 1.6% compared to the 23% of the actual demand over the last sample period. Formal stability tests of the Chow-type were also conducted, rejecting the null hypothesis of instability; for the last 5 (2013-2017) and 18 (2000-2017) years, the test yields 0.326 and 1.791, respectively, accepting, therefore, the stability hypothesis according to the conventional 5% significance values of the F distribution-F(5,41) = 2.44, F(5,41) = 1.86. Figure A2 depicts the errors ̂ of the fit, where the higher variance of the second period after 1970 is clearly seen. This is because the series before 1960 is interpolated from actual data every ten years period, and therefore is reasonably smooth. This also underscores the nature of the heteroskedasticity in this context, i.e., a variance change at a point in time. Nevertheless, time-dependent heteroskedasticity of the Arch-Garch-type was also tested and rejected; the null test of joint nonsignificant lags up to the third-order yields 1.75, with p-value 0.159 from and F(3,106) distribution.
All calculations were performed with programs written in the hansl language [55] running in gretl. The calculations are straightforward and are described in detail in Appendix A. Since the software is open-source, and the data used are freely available in public repositories, independent checking is guaranteed as required in [4]. The figures were drawn with programs written in the open-source language gnuplot [56]. Details of the specific econometric procedures can be found in [13] and [54,55].  All calculations were performed with programs written in the hansl language [55] running in gretl. The calculations are straightforward and are described in detail in Appendix A. Since the software is open-source, and the data used are freely available in public repositories, independent checking is guaranteed as required in [4]. The figures were drawn with programs written in the open-source language gnuplot [56]. Details of the specific econometric procedures can be found in [13,54,55]. Energies 2021, 14, x FOR PEER REVIEW 11 of 13 Figure A2. Fitting errors and two standard deviations (s.d.) bands. Figure A2. Fitting errors and two standard deviations (s.d.) bands.