3. Comparing Methodologies: Yearly vs. Monthly
In this section, we compare the two methodologies (yearly and monthly) that were proposed in [18,19], respectively. For this purpose, we considered a dataset spanning four years of weather and irradiation data, from 1 January 2019 to 31 December 2022. Furthermore, we retrained all the models of both methodologies exactly as specified in Section 3 of [18] and [19], respectively. To obtain one coherent and consistent dataset (which was not the case for [18,19]), we developed and implemented a database, the details of which are covered in Section 4.
For this paper, having acquired one and two years of additional data relative to [18,19], respectively, we went one step further: for each methodology (i.e., yearly and monthly), we calculated, for the 36 best models (for each of the 12 months, the three best SARIMAX, SARIMA, and ARIMAX models), the averages of all four performance metrics (RMSE, MAE, ME, and MPE) over the four years 2019 to 2022. Note that in calculating the overall averages, we attributed the same weight of 0.25 to each of the four performance metrics.
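As an illustration of this scoring step, the following Python sketch computes the equal-weighted overall average for one candidate model; the numbers are hypothetical, and we assume the four metrics have already been computed for each year.

```python
import numpy as np

# Hypothetical per-year metric values for one candidate model
# (rows: years 2019-2022; columns: RMSE, MAE, ME, MPE).
metrics = np.array([
    [10.2, 8.5, 6.1, 15.0],  # 2019
    [10.8, 8.9, 6.7, 15.9],  # 2020
    [10.4, 8.6, 6.2, 15.1],  # 2021
    [11.0, 8.8, 6.6, 15.6],  # 2022
])

weights = np.full(4, 0.25)             # equal weight of 0.25 per metric

yearly_avg = metrics.mean(axis=0)      # average of each metric over the 4 years
overall = float(yearly_avg @ weights)  # equal-weighted overall average

print(f"overall average score: {overall:.2f}")  # lower is better
```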
Table 1 and Table 2 show the three optimized models of SARIMAX, SARIMA, and ARIMAX for each month of the year, based on the two methodologies of [18,19], respectively. Those tables also show, in bold text, the best ARIMA-based model for each month. It is important to note that the numbers within the brackets are each model's parameters, the details of which can be found in Section A.2 of [18]. It can be seen that the yearly-based methodology generated five ARIMAX, six SARIMAX, and one SARIMA best models, whereas the monthly-based methodology produced three ARIMAX, six SARIMAX, and three SARIMA best models. As mentioned above, the best model for each month was the one that provided the best average accuracy over the four considered performance metrics (RMSE, MAE, ME, and MPE) and the four studied years, 2019 to 2022.
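A minimal sketch of this per-month selection rule, assuming the 4-year metric averages of the three candidate models are already at hand (the model orders and values below are purely illustrative, not the ones reported in Table 1 or Table 2):

```python
import numpy as np

# Hypothetical 4-year metric averages (RMSE, MAE, ME, MPE) of the three
# candidate models for one month; orders and values are illustrative only.
candidates = {
    "SARIMAX(1,1,1)(1,0,1,24)": np.array([10.4, 8.6, 6.2, 15.1]),
    "SARIMA(2,1,2)(1,1,1,24)":  np.array([10.7, 8.8, 6.5, 15.6]),
    "ARIMAX(3,1,2)":            np.array([10.5, 8.5, 6.4, 15.3]),
}

weights = np.full(4, 0.25)  # equal weights, as in the paper

# The best model is the one with the lowest equal-weighted average.
best = min(candidates, key=lambda name: float(candidates[name] @ weights))
print(f"best model for this month: {best}")
```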
Figure 1 and Figure 2 illustrate the four considered performance metrics (RMSE, MAE, ME, and MPE) of the yearly-based methodology. Note that the horizontal dotted lines indicate the average value of the corresponding performance metric over the four years. It can be seen from Figure 1 that, out of 48 months, approximately 20 had values above the average RMSE and MAE of 10.6 and 8.7, respectively, occurring mostly between April and August. Analyzing the ME and MPE values, it can be seen from Figure 2 that, out of 48 months, about 22 exceeded the average ME and MPE of 6.4 and 15.4, respectively, occurring mostly during January, April, June, November, and December. Thus, we concluded that for the month of June, all four performance metrics were above their corresponding averages.
Figure 3 and Figure 4 show the four performance metrics (RMSE, MAE, ME, and MPE) of the monthly-based methodology, where the horizontal dotted lines have the same meaning as in the previous two figures. We can see in Figure 3 that, over the course of the 48 months, both the RMSE and the MAE exceeded their corresponding averages of 10.36 and 8.35, respectively, 25 times, mostly between March and July. Regarding the ME and MPE values, it can be seen in Figure 4 that, over the course of the 48 months, their values were greater than the corresponding averages of 5.99 and 14.8, respectively, about 25 times, mostly during April (for the ME only), July, and from September to November. Thus, we can conclude that for the monthly-based methodology, all four performance metrics were above their averages during the month of July.
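The counting procedure behind Figures 1 to 4 is straightforward; the following sketch, using synthetic stand-in values rather than our actual model outputs, shows it for a single metric:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly RMSE values for the 48 months (Jan 2019 - Dec 2022);
# in the paper, these come from the retrained models, not random numbers.
rmse = rng.normal(loc=10.4, scale=1.5, size=48)

avg = rmse.mean()   # the horizontal dotted line in the figures
above = rmse > avg  # boolean mask: months exceeding the 4-year average

print(f"average RMSE: {avg:.2f}")
print(f"months above average: {above.sum()} of 48")

# How often does each calendar month (Jan..Dec) exceed the average?
per_month = above.reshape(4, 12).sum(axis=0)  # rows: years, columns: months
print(per_month.tolist())
```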
Figure 5 shows boxplots of the yearly-based, the monthly-based, and the hybrid methodologies. The three boxplots were created by considering four years (2019 to 2022) and 12 months (January to December), which resulted in 48 data points; each data point represents the average of the four performance metrics (ME, MPE, RMSE, and MAE). It can be seen in Figure 5 that the monthly-based methodology obtained, in general, lower error rates than the yearly-based methodology: the former had median, minimum, and maximum error rates of 10%, 7% (reported as an outlier), and 12.49%, respectively, whereas the latter had 10.49%, 6.64%, and 13.71%, respectively. Those results led us to carry out a further detailed comparison between the two methodologies, from which we derived the following conclusion: neither methodology absolutely outperformed the other, as there were certain months when monthly-based models performed better than yearly-based ones and vice versa. Consequently, in this paper, we propose a hybrid methodology: we selected, for each month, the best model out of the monthly-based and yearly-based methodologies, and the results are shown in Table 3. The improvements, in terms of performance metrics, are noticeable from the boxplot of the hybrid methodology in Figure 5. The hybrid methodology took the best parts of the monthly-based and yearly-based methodologies and had median, minimum, and maximum error rates of 9.7%, 6.64% (reported as an outlier), and 12.3% (reported as an outlier), respectively. Note that the outliers (i.e., the circles in Figure 5) are, following the standard boxplot convention, the data points lying more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile; all the other error rates lie within the whiskers.
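A minimal sketch of the hybrid selection step, assuming the per-month average error rates of both methodologies have already been computed (all names and numbers below are illustrative, not the values of Table 3):

```python
# Hypothetical per-month average error rates (%) of the two methodologies;
# in the paper these are the equal-weighted averages of RMSE, MAE, ME,
# and MPE over 2019-2022 (only three months shown here).
yearly_err  = {"Jan": 10.9, "Feb": 9.8,  "Mar": 10.2}
monthly_err = {"Jan": 10.1, "Feb": 10.0, "Mar": 9.7}

# Hybrid methodology: for each month, keep the methodology whose best
# model yields the lower average error rate.
hybrid = {
    m: ("monthly", monthly_err[m]) if monthly_err[m] <= yearly_err[m]
       else ("yearly", yearly_err[m])
    for m in yearly_err
}

for m, (source, err) in hybrid.items():
    print(f"{m}: take the {source}-based model (avg error {err:.1f}%)")
```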
6. Conclusions
Accurately forecasting the generation of large-scale renewable energy sources is challenging due to the underlying stochastic nature of the problem. In this paper, we considered the two methodologies, yearly-based and monthly-based, proposed in our respective previous works [1,2]. To compare the accuracy of those two methodologies, unlike in our previous works, we used one consistent dataset spanning four years (1 January 2019 to 31 December 2022). Furthermore, we computed four performance metrics (RMSE, MAE, ME, and MPE) for the best ARIMA-based model of each month and methodology. Weighting those four metrics equally, we then derived a new set of optimal models for each of the two methodologies and compared them to each other. We found that neither methodology was absolutely superior to the other; consequently, we proposed a hybrid of the two, which selected the best ARIMA-based model for each month of the year. By means of boxplots (i.e., minimum, maximum, mean, and median values), we showed that our proposed hybrid methodology improved the accuracy (i.e., reduced the error rates) compared to the yearly-based and monthly-based methodologies.

Furthermore, we implemented those new best models in the open-source REN4KAST platform and provided the implementation details. This platform provides services (data retrieval, forecasting, and evaluation) with respect to the percentage of renewables in Germany. We then carried out an experimental analysis considering four years of data from Germany. We compared the observed values (i.e., real data) to the forecast values (i.e., those of the models derived from our proposed hybrid methodology) and showed that the average annual RMSE over the four years was about 10.5%. To provide evidence of those improvements, we showed the distribution of the error rates for each month of the year. Our results showed that, in general, the months of May to September had higher error rates than the other months of the year, with June suffering the highest error rate, of about 16%; we reserve the investigation of the reasons for future work. As future work, we will also investigate the impact of anomalies (i.e., outliers and missing values) in our dataset on the accuracy of the derived ARIMA-based forecasting models.