1. Introduction
With the rapid increase in electricity demand and concerns over greenhouse gas emissions from fossil fuels [
1,
2,
3,
4], renewable energy has become a vital alternative for sustainable electricity generation [
5,
6,
7]. However, sources such as wind and solar are inherently variable and unpredictable, creating challenges in ensuring a stable and reliable electricity supply [
7,
8]. This variability makes the accurate forecasting of green electricity (GE) generation essential for effective operation and planning [
9,
10]. Given the essential characteristic of the natural resources used for GE generation, the daily actual production output is often unstable, exhibiting a wide range of interval values, which can be represented as interval-valued GE supply data [
11]. Interval-valued GE supply data represent GE production records over a day, captured at specific recording intervals (e.g., 15 min per record). This characteristic presents significant challenges for GE supply management. Consequently, the effective forecasting of interval-valued GE supply data is a crucial area of investigation [
12,
13,
14]. For example, the potential maximum and minimum electricity production can be estimated when forecasting the interval values of GE supply. This information is crucial for power-grid managers to monitor the GE supply and to make informed decisions [
14,
15]. Given the inherent complexity of GE data, developing effective forecasting models and delivering useful information to support decision-making are therefore valuable.
Interval-valued forecasting is an important topic in studies concerning GE supply [
16,
17]. For example, Gupta et al. [
18] forecasted interval-valued wind power supply data by combining wavelet and autoregressive integrated moving average (ARIMA) models. Wang et al. [
19] employed a multi-objective feature extraction approach to reconstruct decomposed interval-valued wind power data during forecasting. Yang et al. [
20] proposed an ensemble-based forecasting system for interval-valued wind power supply data.
However, most existing studies have paid limited attention to multi-step-ahead forecasting for interval-valued GE data [18,19,20], and those that do consider multiple steps ahead often focus on point forecasts [21,22]. To illustrate, Yu et al. [
23] combined convolution layers and long short-term memory (LSTM) for forecasting wind speed 10 min ahead. Nguyen et al. [
24] utilized stacked temporal convolutional networks (TCNs) to forecast wind power output values from one to six steps ahead, with a step length of 30 min. Tsegaye et al. [
25] forecasted power loads with hour- and day-based step lengths using LSTM with population-based adaptive optimization. Given the inherently uncertain nature of GE data, one-step-ahead forecasting provides limited information [
16]. For instance, preparing a GE supply often requires lead time for scheduling, coordination, and resource allocation [
26]. However, one-step-ahead forecasting offers a narrow window of information for action, providing limited flexibility to plan schedules or manage unexpected fluctuations [
12]. In contrast, multi-step-ahead forecasting extends the forecasting horizon, generating more future information that extends the buffer time during operation. This enables managers to gain more informed insights regarding the expected range of the GE supply over longer future horizons, supporting more effective planning such as maintenance or resource allocation [
12,
16]. Therefore, the accurate multi-step-ahead forecasting of the interval-valued GE supply can benefit GE management.
Furthermore, when forecasting interval-valued GE supply data, the targets are typically represented in either the upper and lower bound or the central and radius format [
14]. Either format can express the upper and lower bounds of an interval-valued data point. Most existing studies on forecasting interval-valued GE supply data have primarily considered basic features, such as the time-lagged information of the target variable, as predictor variables directly derived from upper and lower bounds or central and radius data [
14]. However, these basic features fail to capture other useful information for the forecasting task, such as the distribution information within an interval-valued data point [
27,
28,
29]. This information can be described using augmented features such as the quartile values, mean, interquartile range (IQR), skewness, and kurtosis. These augmented features can capture inherent information from an interval-valued data point and can be incorporated into the forecasting model to enhance the forecasting performance. Few studies have considered augmented features when forecasting interval-valued GE supply data. Jiang et al. [
27] demonstrated the potential of utilizing augmented features, but they only considered the quartile values, minimum value, and maximum value for the one-step-ahead forecasting of the regular electricity demand, and not GE supply data. Chang et al. [
28] took a similar approach and added the mean, IQR, skewness, and kurtosis as additional augmented features for the one-step-ahead forecasting of the regular electricity supply. However, to the best of our knowledge, no study has simultaneously considered both augmented feature usage and multi-step-ahead forecasting for interval-valued GE supply data. Therefore, addressing this research gap warrants further investigation.
Various methods are utilized for the interval-valued forecasting of GE supply data, including seasonal ARIMA (SARIMA) and deep-learning (DL) algorithms [
10,
30]. Because interval-valued forecasting involves a pair of target variables (upper and lower bounds or central and radius), it is considered a multi-output task that DL models can handle effectively and conveniently. For example, Niu et al. [
31] proposed a framework combining LSTM networks with several decomposition methods to achieve high performance in wind power interval-valued forecasting. Zhou et al. [
32] forecasted wind power intervals using LSTM. Wang et al. [
33] proposed a framework using a modified scaling approach and efficient feature ranking to improve the performance of gated recurrent unit (GRU) networks in wind power interval-valued forecasting. Wang et al. [
34] forecasted wind power intervals with a modified GRU combining two layers of decomposition methods. TCNs have also been used in interval-valued forecasting studies [
35,
36]. Hu et al. [
35] proposed and applied a slightly modified TCN algorithm for wind power interval forecasting, whereas Gan et al. [
36] added an adaptive layer to the TCN layer to predict wind speed intervals. Therefore, as interval-valued forecasting is a multi-output task, the commonly used LSTM, GRU, and TCN models are employed in this study.
This study proposes a multi-step-ahead forecasting scheme that incorporates augmented features (basic and distribution features) and DL models (GRU, LSTM, and TCN) to forecast the interval-valued GE supply effectively. The proposed scheme is named the Augmented Feature Multi-Step Interval-valued Forecasting (AFMIF) scheme and consists of four steps. First, interval-valued data are generated by preprocessing the original GE supply data, which are records of daily electricity logs within a given time cycle. Second, augmented features are constructed from the interval-valued data that are represented in the upper and lower bound and central and radius formats. Third, GRU, LSTM, and TCN models are constructed, incorporating the basic and distribution features of the interval-valued data, to create three multi-step-ahead interval-valued forecasting models, namely, AFMIF-GRU, AFMIF-LSTM, and AFMIF-TCN, respectively. Finally, these three forecasting models are compared with four other models to evaluate the performance of the proposed AFMIF scheme, and the best model is selected. SARIMA is included among the comparison models as the benchmark because it is commonly used in GE data forecasting. The mean ratio of exclusive-or (MRXOR) is used to assess the overlap between the actual and estimated intervals. Wind power is a key source of GE and has gained global popularity, with a rapid annual growth rate of 12.6% in wind turbine installations in recent years [
9,
37,
38,
39]. Thus, this study uses wind-power-based GE supply data from Belgium and Germany as illustrative examples to evaluate the performance of the proposed AFMIF scheme.
Table 1 provides a comparative overview of the research gap and highlights the contributions of this study relative to the existing literature.
The remainder of this paper is organized as follows: Section 2 introduces the DL methods and proposed AFMIF scheme. Section 3 presents the processed Belgium and Germany data, descriptive statistics of the augmented features, model results, and robustness evaluation. Finally, Section 4 summarizes the findings.
2. Methodology
This section provides a brief introduction to the LSTM, GRU, and TCN methods, followed by a step-by-step explanation of the proposed AFMIF scheme.
2.1. LSTM
LSTM is an extension of the recurrent neural network that utilizes the concept of retaining information over time to capture long-term dependencies for tasks such as time-series forecasting [
40,
41]. LSTM retains information via memory cells, each of which is made up of an input gate, a forget gate, and an output gate. The forget gate controls the extent to which previous information should be retained or discarded, and the calculation is shown in Equation (1):
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \quad (1)$$
where $t$ is the time period during forecasting; $W_f$ is the weight matrix; $h_{t-1}$ is the output of the previous time period; $x_t$ is the input at the current time period; $b_f$ is the bias; and $\sigma$ is the sigmoid activation function. The input gate regulates the flow of new information into the cell state, determining whether new information should overwrite previous information. The calculation is expressed as follows:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \quad (2)$$
where $i_t$ yields a value between 0 and 1, where a value closer to 1 means that the information should be updated; $W_i$ is the weight matrix; and $b_i$ denotes the bias. Finally, the latest output value $h_t$ is generated through the output gate according to Equations (3) and (4):
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \quad (3)$$
$$h_t = o_t \odot \tanh(C_t) \quad (4)$$
where $W_o$ is the weight matrix; $b_o$ denotes the bias; and $C_t$ is the overwritten information. For further details, please refer to the original study [
39].
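As an illustration of how the three gates interact, the following minimal sketch runs a single LSTM step for a scalar input and a scalar hidden state. The scalar simplification, the parameter names (`w_f`, `b_f`, and so on), and the candidate and cell-state update lines are assumptions taken from the standard LSTM formulation rather than from this study's implementation.

```python
import math
from typing import Dict, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t: float, h_prev: float, c_prev: float,
              p: Dict[str, float]) -> Tuple[float, float]:
    """One LSTM step with scalar state (hypothetical parameter names)."""
    # Forget gate: how much of the previous cell state is kept.
    f_t = sigmoid(p["w_fh"] * h_prev + p["w_fx"] * x_t + p["b_f"])
    # Input gate: how much new information is written.
    i_t = sigmoid(p["w_ih"] * h_prev + p["w_ix"] * x_t + p["b_i"])
    # Candidate value and cell-state update (standard form, assumed here).
    c_tilde = math.tanh(p["w_ch"] * h_prev + p["w_cx"] * x_t + p["b_c"])
    c_t = f_t * c_prev + i_t * c_tilde
    # Output gate and latest output value.
    o_t = sigmoid(p["w_oh"] * h_prev + p["w_ox"] * x_t + p["b_o"])
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# With all parameters at zero, every gate equals 0.5 and the candidate is 0.
zero_params = {name: 0.0 for name in (
    "w_fh", "w_fx", "b_f", "w_ih", "w_ix", "b_i",
    "w_ch", "w_cx", "b_c", "w_oh", "w_ox", "b_o")}
h_1, c_1 = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
```

In practice the weights are matrices and the state is a vector; the scalar form is used only to keep the gate arithmetic visible.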
2.2. GRU
The concept of the GRU is similar to that of LSTM, with several differences in the gate mechanism [
42,
43]. The GRU is a lightweight modification of LSTM that simplifies the gate mechanism by merging the gates into update and reset gates. The reset gate $r_t$ can be treated as the forget gate of LSTM, with the purpose of controlling the amount of historical information that should be forgotten. The calculation is shown in Equation (5):
$$r_t = \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \quad (5)$$
where $W_r$ is the weight matrix; $h_{t-1}$ is the output of the previous time period; $x_t$ is the input of the current time period; and $\sigma$ is the sigmoid activation function. The update gate $z_t$ determines the amount of historical information to be carried forward to the current time step. The equation for $z_t$ is expressed as follows:
$$z_t = \sigma\left(W_z \cdot [h_{t-1}, x_t]\right) \quad (6)$$
where $W_z$ is the weight matrix. Finally, the output $h_t$ at the current state of time step $t$ can be determined using Equation (7):
$$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \quad (7)$$
where $\tilde{h}_t$ is the potential candidate value with new information to be added at current time step $t$. The GRU mainly manages the retention and retrieval of input information through reset and update gates for forecasting. For further details, please refer to the original study [
41].
2.3. TCN
The TCN is a DL architecture designed for sequential data modeling that is commonly used for time-series forecasting tasks [
44,
45]. The main mechanism of TCN is the combination of dilated convolution, causal convolution, and residual connection. With these components, the predictions at any time step are conditioned solely on past inputs, preserving the temporal order.
Causal convolution ensures temporal causality by restricting predictions at time step $t$ to depend only on earlier time steps. Its receptive field increases linearly with the number of layers, limiting its ability to capture long-range dependencies in sequential data efficiently. The TCN addresses this limitation by incorporating dilated convolution. Dilated convolution is a variation of standard convolution that introduces gaps between consecutive filter elements, allowing the network to expand its receptive field without increasing the number of parameters, which helps the TCN model to capture long-term dependencies more effectively. Hence, the dilated operation of $F$ on element time step $t$ in a time-series sequence $x$ can be expressed as follows:
$$F(t) = \sum_{i=0}^{k-1} f(i) \cdot x_{t - d \cdot i} \quad (8)$$
where $f$ is the filter; $d$ is the dilation rate; $k$ is the filter size; and $x_{t - d \cdot i}$ is the past state from the previous convolution. Moreover, to improve model stability and reduce the vanishing gradient effect, residual blocks are incorporated into the TCN. A residual block consists of stacked dilated causal convolution layers, followed by a rectified linear unit (ReLU), normalization layers, and a dropout layer for regularization.
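The effect of dilation on a causal filter can be seen in a minimal pure-Python sketch. The function names, the left zero-padding, and the toy series below are illustrative assumptions, not part of the TCN implementation used in this study.

```python
from typing import List, Sequence


def dilated_causal_conv(x: Sequence[float], f: Sequence[float], d: int) -> List[float]:
    """Output at step t is sum_i f[i] * x[t - d*i].

    Positions before the start of the sequence are treated as zero
    (left zero-padding, an assumption of this sketch).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i in range(len(f)):
            j = t - d * i  # index of the current or an earlier element
            if j >= 0:
                acc += f[i] * x[j]
        out.append(acc)
    return out


def receptive_field(k: int, dilations: Sequence[int]) -> int:
    """Input steps visible at the top of a stack with one convolution per layer."""
    return 1 + (k - 1) * sum(dilations)


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = dilated_causal_conv(series, f=[1.0, 1.0], d=2)  # y[t] = x[t] + x[t-2]

linear_growth = receptive_field(k=2, dilations=[1, 1, 1])   # no dilation
dilated_growth = receptive_field(k=2, dilations=[1, 2, 4])  # doubling dilation
```

With the same filter size and three layers, the undilated stack sees four input steps, whereas the stack with doubling dilation sees eight, without any additional filter parameters.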
LSTM, the GRU, and the TCN were selected because they are well-established DL models that have demonstrated strong performance in time-series forecasting tasks. LSTM and the GRU belong to the recurrent neural network family and are effective in capturing both the long- and short-term temporal dynamics that are suitable for modeling the evolving behavior of complex interval-valued GE supply data. The TCN, which is a convolution-based model, offers the advantages of parallel processing and stable gradients, making it well-suited for learning the temporal patterns that are hidden within the interval-valued data. Moreover, as mentioned in the Introduction, interval-valued GE supply forecasting is a multi-output task that involves two target variables. The three selected DL models are more effective for handling multi-output tasks, as traditional models often require separate models to be built for each target variable.
2.4. Proposed AFMIF Scheme
Figure 1 presents the proposed AFMIF scheme.
As shown in the figure, the original data, which consist of wind-power-based GE electricity logs recorded at a specified minute-based recording cycle, are collected first. Subsequently, the collected data are preprocessed to generate interval-valued data using the upper and lower bound and central and radius formats. Once the interval-valued data have been prepared, information can be extracted from them to construct augmented features. Two sets of features are constructed in the AFMIF scheme: the basic feature (BF), consisting of time-lagged information of the upper and lower bounds and central and radius; and the distribution feature (DF), consisting of the Q1, median, Q3, IQR, mean, standard deviation (SD), skewness, and kurtosis. The BF and DF represent different information. The BF describes the time-lagged upper bound, lower bound, central, and radius data points of the interval-valued data. The DF considers the time-lagged information of the detailed statistics that are directly extracted from the interval-valued data. After obtaining the predictor variables constituting the BF and DF, three different DL algorithms, namely, LSTM, the GRU, and the TCN, are used to construct models for multi-step-ahead forecasting (i.e., one, two, and three steps ahead) of the interval-valued wind power GE supply. As the central and radius format is commonly used in interval-valued forecasting tasks, this study uses this format to form the target variable. Finally, the MRXOR metric is used to evaluate the performance of all constructed models.
The processes of the proposed AFMIF scheme are introduced in the following subsections.
2.4.1. Preprocessing of Interval-Valued Data
The original wind power data systematically represent the electricity produced over time, for which electricity logs are recorded in minute-based cycles per day. Hence, the collected wind power data must first be transformed into interval-valued data. Suppose that a total of $m$ electricity logs ($e$) is recorded each day and $n$ days of data are collected. A matrix $E$ of size $n \times m$ can be created and expressed as follows:
$$E = \begin{bmatrix} E_1 \\ E_2 \\ \vdots \\ E_n \end{bmatrix} = \begin{bmatrix} e_{1,1} & e_{1,2} & \cdots & e_{1,m} \\ e_{2,1} & e_{2,2} & \cdots & e_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ e_{n,1} & e_{n,2} & \cdots & e_{n,m} \end{bmatrix} \quad (9)$$
where $E_i$ ($i = 1, 2, \dots, n$) are vectors of the recorded $e$ logs per day.
Figure 2 shows an example of the first three days of produced electricity data. As shown in the figure, each day contains a total of 96 recorded $e$ logs because the data are recorded under 15 min cycles from 00:00 to 23:45 per day. Thus, $E$ has a matrix size of $3 \times 96$. Then, based on Equation (9), $E_1$ is a vector consisting of 96 $e$ records ($e_{1,1}, e_{1,2}, \dots, e_{1,96}$), while $E_2$ follows the same rule ($e_{2,1}, e_{2,2}, \dots, e_{2,96}$), and so on. Therefore, the constructed interval-valued data based on the example data are presented in Table 2.
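The preprocessing step amounts to reshaping a flat chronological series of logs into one row per day. The sketch below shows this under the assumption that the series covers whole days; the function name `to_daily_matrix` and the synthetic values are hypothetical and do not reproduce the Belgian or German data.

```python
from typing import List


def to_daily_matrix(logs: List[float], m: int) -> List[List[float]]:
    """Split a flat chronological list of logs into rows of m records (one row per day)."""
    if m <= 0:
        raise ValueError("m must be positive")
    if len(logs) % m != 0:
        raise ValueError("the logs do not cover a whole number of days")
    return [logs[i:i + m] for i in range(0, len(logs), m)]


# 15 min recording cycle from 00:00 to 23:45 gives 96 records per day.
m = 24 * 60 // 15
# Synthetic placeholder values for three days (not real wind power data).
logs = [float((7 * i) % 50) for i in range(3 * m)]
E = to_daily_matrix(logs, m)
```

Each row of `E` then plays the role of one daily interval from which the bounds and statistics of that day are computed.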
2.4.2. Construction of Augmented Features
Once the interval-valued data (matrix $E$) have been prepared, the augmented features are constructed from the data. The augmented features consist of components of the BF and DF. Both are calculated from the interval of each day, that is, from the daily vectors $E_i$ ($i = 1, \dots, n$) of matrix $E$.
The BF includes the upper and lower bounds and central and radius. The upper bound expresses the maximum value in an interval, whereas the lower bound expresses the minimum value. They are denoted by Equations (10) and (11), respectively:
$$Upper_i = \max(E_i) \quad (10)$$
$$Lower_i = \min(E_i) \quad (11)$$
Following the example data, $Upper_1$ is the largest value among all 96 records of day 1, while $Lower_1$ is the lowest value among them; the resulting day 1 values are listed in Table 3. The same concept is used to obtain the upper and lower bounds from the corresponding intervals of each day. After obtaining the upper and lower bounds of each day, they can be further transformed into the central and radius using Equations (12) and (13), respectively:
$$Central_i = \frac{Upper_i + Lower_i}{2} \quad (12)$$
$$Radius_i = \frac{Upper_i - Lower_i}{2} \quad (13)$$
The central and radius values of day 1 from the example are likewise listed in Table 3.
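A pair of upper and lower bound values can be extracted from the interval-valued data of each day; thus, $n$ pairs of upper and lower bound values can be extracted from $n$ days of data. This can be expressed as matrix $UL$, as shown in Equation (14):
$$UL = \begin{bmatrix} Upper_1 & Lower_1 \\ Upper_2 & Lower_2 \\ \vdots & \vdots \\ Upper_n & Lower_n \end{bmatrix} \quad (14)$$
Following the same concept as that of the upper and lower bounds, each pair of central and radius values can be denoted by $CR$, as shown in Equation (15):
$$CR = \begin{bmatrix} Central_1 & Radius_1 \\ Central_2 & Radius_2 \\ \vdots & \vdots \\ Central_n & Radius_n \end{bmatrix} \quad (15)$$
Both $UL$ and $CR$ can provide information to describe the boundary of the interval, which can be viewed as an overall profile expression of the interval.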
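The BF of a single day can be sketched in a few lines, taking the central as the midpoint of the bounds and the radius as half of their distance. The function names and the four-record toy day below are hypothetical and serve only to show that the two formats carry the same boundary information.

```python
from typing import Dict, Sequence, Tuple


def basic_features(day: Sequence[float]) -> Dict[str, float]:
    """Upper/lower bounds and central/radius of one day's records."""
    upper = max(day)
    lower = min(day)
    return {
        "upper": upper,
        "lower": lower,
        "central": (upper + lower) / 2.0,
        "radius": (upper - lower) / 2.0,
    }


def bounds_from_central_radius(central: float, radius: float) -> Tuple[float, float]:
    """Recover (upper, lower) from the central and radius format."""
    return central + radius, central - radius


toy_day = [12.0, 30.5, 8.0, 21.5]  # made-up values, far shorter than 96 records
bf = basic_features(toy_day)
recovered = bounds_from_central_radius(bf["central"], bf["radius"])
```

Because the mapping is invertible, a model trained on central and radius targets yields the forecasted upper and lower bounds directly.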
The structure of an interval can be either symmetric or asymmetric. When the DFs of the mean and median of an interval are equal, with zero skewness, the interval is symmetrically distributed. However, in real-world scenarios, interval-valued data are often asymmetrically distributed and may exhibit complex structures. The DF offers valuable insights into the underlying characteristics of interval-valued data. Hence, utilizing the DF to obtain information that reflects the distribution of an interval can help to enhance the forecasting effectiveness.
The DF consists of the mean, SD, Q1, median, Q3, IQR, skewness, and kurtosis. All of them are constructed using a similar concept to that of the BF in Step 2.1, and calculated via the daily vectors $E_i$ ($i = 1, \dots, n$) from matrix $E$. Equations (16)–(23) show the DF calculations for these eight statistics, each evaluated for $i = 1, \dots, n$.
Following the example data, the DF values of day 1 are reported in Table 3.
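A set of DFs can be extracted from each day of interval-valued data after construction. Thus, $n$ sets of DFs can be constructed from $n$ days of data, which can be expressed as follows:
$$DF = \begin{bmatrix} Mean_1 & SD_1 & Q1_1 & Median_1 & Q3_1 & IQR_1 & Skewness_1 & Kurtosis_1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ Mean_n & SD_n & Q1_n & Median_n & Q3_n & IQR_n & Skewness_n & Kurtosis_n \end{bmatrix} \quad (24)$$
where row $i$ ($i = 1, \dots, n$) holds the DF of day $i$.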
The augmented features of the BF and DF in each day are constructed from the interval-valued data.
Table 3 presents all augmented features extracted from the first three days of interval-valued data as an example.
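A minimal sketch of the DF computation for one day is given below. The estimator choices are assumptions, since several conventions exist: population (divide-by-count) moments for the SD, skewness, and kurtosis, excess kurtosis (normal distribution equals 0), and linearly interpolated quartiles. The function name and the toy day are hypothetical.

```python
import statistics
from typing import Dict, Sequence


def distribution_features(day: Sequence[float]) -> Dict[str, float]:
    """Eight distribution statistics of one day's records."""
    count = len(day)
    mean = statistics.fmean(day)
    # Central moments of order 2, 3, and 4 (population form, an assumption).
    m2 = sum((v - mean) ** 2 for v in day) / count
    m3 = sum((v - mean) ** 3 for v in day) / count
    m4 = sum((v - mean) ** 4 for v in day) / count
    # Quartiles with linear interpolation between order statistics.
    q1, median, q3 = statistics.quantiles(day, n=4, method="inclusive")
    return {
        "mean": mean,
        "sd": m2 ** 0.5,
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,
        # A constant day has no spread; both shape statistics are set to 0.
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,
    }


df_toy = distribution_features([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up symmetric day
```

For the symmetric toy day the mean equals the median and the skewness is zero, which matches the symmetric case described in the text.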
2.4.3. Training of Multi-Step-Ahead Forecasting Model
Once constructed, the augmented features are used to build the forecasting model. The proposed AFMIF scheme uses a direct strategy for multi-step-ahead forecasting. The direct strategy is a commonly used multi-step-ahead forecasting strategy with a straightforward concept [46,47]. It forecasts the desired target value at $h$ steps ahead from time period $t$ by directly predicting the outcome while ignoring the information within the future time gap [46,47]. This approach builds forecasting models for different time steps ahead separately, allowing each model to focus on learning specific dynamics when making predictions for each time step.
Figure 3 presents a visualized example of constructing models under the direct strategy for multi-step-ahead forecasting. The future time gap refers to the time window between time period $t$ and the desired target step ahead during forecasting; for example, $t + 1$ represents forecasting one step ahead. Moreover, as the direct strategy ignores the information within the future time gap, the predictor variables are time-lagged information within the time window of time period $t$. For example, as shown in the figure, information at time period $t$ and time-lagged information at $t - 1$ and $t - 2$ are used to construct models for multi-step-ahead forecasting.
The forecasting targets in this study are formed from the central and radius and the forecasting is considered as a multi-output forecasting task. As mentioned in the Introduction, as opposed to traditional methods that must forecast both targets separately, DL models can forecast both targets simultaneously, which is more effective than single-output models. Therefore, let $f$ be the forecast model and let the time lag be denoted as $l$. The forecasted central and radius values using the BF and DF as predictors along with the time-lagged information under the direct strategy are expressed as follows:
$$\left(\widehat{Central}_{t+h}, \widehat{Radius}_{t+h}\right) = f\left(BF_t, BF_{t-1}, \dots, BF_{t-l+1}, DF_t, DF_{t-1}, \dots, DF_{t-l+1}\right) \quad (25)$$
where the maximum horizon of future steps considered is three steps ahead ($h = 1, 2, 3$) and a maximum time-lagged information count of 3 ($l = 3$) is considered in this study. Hence, three DL models following the approach of the proposed AFMIF scheme are constructed, namely, AFMIF-GRU, AFMIF-LSTM, and AFMIF-TCN. Furthermore, to confirm the effectiveness of the AFMIF scheme, models following a simple approach, which uses only time-lagged information formed from the targets (central and radius), are constructed for comparison; these models are named S-GRU, S-LSTM, and S-TCN. In addition, a SARIMA model constructed following the simple approach, named S-SARIMA, is considered for comparison.
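The direct strategy can be illustrated by how the training pairs are assembled: one dataset is built per horizon, pairing the predictors at the current and lagged time periods with the target a fixed number of steps ahead. The function name, the flattened row layout, and the toy values are assumptions; a recurrent or convolutional model would typically receive the lags as a sequence instead.

```python
from typing import Dict, List, Sequence, Tuple

Target = Tuple[float, float]  # (central, radius)


def make_direct_dataset(features: Sequence[Sequence[float]],
                        targets: Sequence[Target],
                        h: int,
                        n_lags: int = 3) -> Tuple[List[List[float]], List[Target]]:
    """Pair predictors at t, t-1, ..., t-n_lags+1 with the target at t+h."""
    X: List[List[float]] = []
    Y: List[Target] = []
    for t in range(n_lags - 1, len(features) - h):
        row: List[float] = []
        for lag in range(n_lags):
            row.extend(features[t - lag])  # newest period first
        X.append(row)
        Y.append(targets[t + h])  # periods inside the future gap are skipped
    return X, Y


# Six made-up days; each feature vector holds only [central, radius] here.
targets = [(float(i), 10.0 + i) for i in range(6)]
features = [list(pair) for pair in targets]

# One separate dataset (and hence one separate model) per horizon.
datasets: Dict[int, Tuple[List[List[float]], List[Target]]] = {
    h: make_direct_dataset(features, targets, h) for h in (1, 2, 3)
}
```

Longer horizons leave fewer usable training pairs because the last `h` periods have no observed target.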
All models are built using the expanding window approach based on time when dividing the data into training and testing sets to preserve the chronological order. To ensure fairness and consistency across model comparisons, all three DL models are built using 100 epochs without early stopping, a batch size of 64, and the Adam optimizer. ReLU is used as the activation function for all hidden layers to prevent a negative radius, and a linear function is used for the output layer. Details regarding the DL model structure and corresponding hyperparameters are presented in Supplementary Table S1.
2.4.4. Performance Evaluation
Traditional error metrics evaluate accuracy at the level of single points, measuring the closeness of an estimated point to the corresponding actual point [
48]. While these metrics are appropriate for point forecasting, they may be limited or misleading in the interval-valued setting, in which the primary objective is to ensure that the forecasted interval overlaps with the actual interval as closely as possible. Furthermore, the forecasted interval may be wider than the actual interval [
48]. Such cases should be considered for appropriate performance evaluation. Thus, the MRXOR metric is used to evaluate the performance of the constructed models when forecasting the GE supply interval. MRXOR considers the overlap of the forecasted and actual intervals [
48]. Letting
be the total amount of central and radius data points, the formula for MRXOR is as follows:
where
;
;
.
The four conditions reflect how the forecasted interval overlaps with an actual interval that is too wide or too narrow, or inclines to the right or inclines to the left. MRXOR is similar to traditional error metrics in that it ranges from zero upwards, with zero representing the ideal case of perfect estimation. Thus, when comparing the MRXOR of different models, better ones will have smaller values closer to zero.
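To make the overlap idea concrete, the sketch below computes an exclusive-or style ratio for intervals given in the central and radius format. It is only one plausible reading and not the exact piecewise definition of MRXOR in [48]: the assumed ratio is the length of the non-overlapping parts of the two intervals divided by the length of the actual interval, averaged over all data points. Function names and example values are hypothetical.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # (central, radius), radius assumed non-negative


def xor_ratio(actual: Interval, forecast: Interval) -> float:
    """Length covered by exactly one interval, relative to the actual length."""
    a_lo, a_hi = actual[0] - actual[1], actual[0] + actual[1]
    f_lo, f_hi = forecast[0] - forecast[1], forecast[0] + forecast[1]
    if a_hi <= a_lo:
        raise ValueError("the actual interval must have positive width")
    overlap = max(0.0, min(a_hi, f_hi) - max(a_lo, f_lo))
    xor_length = (a_hi - a_lo) + (f_hi - f_lo) - 2.0 * overlap
    return xor_length / (a_hi - a_lo)


def mean_xor_ratio(actuals: Sequence[Interval], forecasts: Sequence[Interval]) -> float:
    """Average of the per-point ratios; 0 means every interval matches exactly."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(xor_ratio(a, f) for a, f in zip(actuals, forecasts)) / len(actuals)


actual_intervals = [(10.0, 2.0), (10.0, 2.0), (10.0, 2.0)]
forecast_intervals = [(11.0, 2.0), (10.0, 2.0), (21.0, 1.0)]  # shifted, exact, disjoint
score = mean_xor_ratio(actual_intervals, forecast_intervals)
```

Under this reading, a forecast that is too wide, too narrow, or shifted to either side all increase the ratio, and a disjoint forecast can exceed 1, which is consistent with a metric that ranges from zero upwards.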
To evaluate the performance of the AFMIF scheme and the reliability of the constructed models, a robustness evaluation is performed using an expanding window with different training size proportions (60/70/80/90%) to determine whether the models maintain reasonably stable performance across varying training horizons. This approach retrains the models on chronologically ordered data because random data splitting is unsuitable for time-series forecasting: it would break the temporal dependency and disrupt the chronological order. The experiments in this study were implemented in Python version 3.8.8 [
49] and Jupyter Notebook version 6.3.0 [
50]. All forecasting models were constructed using the open-source Python packages Scikit-learn (version 0.24.2) [
51,
52], Keras (version 2.4.3) [
53], TensorFlow (version 2.3.0) [
54], PyTorch (version 2.3.0) [
55], and statsmodels (version 0.14.0) [
56].
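The expanding-window robustness evaluation can be sketched as a set of chronological splits. The sketch assumes that, for each proportion, the earliest observations form the training set and all later observations form the testing set; the function name and the sample count are hypothetical.

```python
from typing import List, Sequence, Tuple


def expanding_window_splits(n_samples: int,
                            train_percentages: Sequence[int] = (60, 70, 80, 90)
                            ) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) for each training proportion."""
    splits = []
    for pct in train_percentages:
        cut = n_samples * pct // 100  # integer arithmetic avoids rounding surprises
        train_idx = list(range(cut))           # always starts at the first period
        test_idx = list(range(cut, n_samples))  # strictly later periods only
        splits.append((train_idx, test_idx))
    return splits


splits = expanding_window_splits(n_samples=10)
train_sizes = [len(train) for train, _ in splits]
```

Each split keeps the chronological order intact, so a model is never trained on observations that occur after those it is tested on.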