Short-Term Firm-Level Energy-Consumption Forecasting for Energy-Intensive Manufacturing: A Comparison of Machine Learning and Deep Learning Models

Abstract: To minimise environmental impact, to avoid regulatory penalties, and to improve competitiveness, energy-intensive manufacturing firms require accurate forecasts of their energy consumption so that precautionary and mitigation measures can be taken. Deep learning is widely touted as a superior analytical technique to traditional artificial neural networks, machine learning, and other classical time-series models due to its high dimensionality and problem-solving capabilities. Despite this, research on its application in demand-side energy forecasting is limited. We compare two benchmarks (Autoregressive Integrated Moving Average (ARIMA) and an existing manual technique used at the case site) against three deep-learning models (simple Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU)) and two machine-learning models (Support Vector Regression (SVR) and Random Forest) for short-term load forecasting (STLF) using data from a Brazilian thermoplastic resin manufacturing plant. We use the grid search method to identify the best configuration for each model and then use Diebold-Mariano testing to confirm the results. The results suggest that the legacy approach used at the case site is the worst performing and that the GRU model outperformed all other models tested.

Author Contributions: [...] and P.T.E.; data curation, A.M.N.C.R. and P.R.X.d.C.; formal analysis, A.M.N.C.R., I.R.R., T.L. and P.T.E.; investigation, A.M.N.C.R., P.R.X.d.C., I.R.R. and P.T.E.; methodology, P.T.E.; project administration, D.S.; resources, T.L.; validation, A.M.N.C.R.; writing (original draft), [...]


Introduction
The industrial sector is the largest consumer of delivered energy worldwide, and energy-intensive manufacturing is the largest component in that sector [1]. Energy-intensive manufacturing includes the manufacture of food, beverage, and tobacco products; pulp and paper; basic chemicals; refining; iron and steel; nonferrous metals; and nonmetallic minerals [2]. Energy intensity is driven by the mix of activity in these sectors, including basic chemical feedstocks; process (including heating and cooling) and assembly; steam and cogeneration; and building-related energy consumption, e.g., lighting, heating, and air conditioning [2]. World industrial energy consumption is forecast to grow from c. 242 quadrillion British thermal units (Btu) in 2018 to about 315 quadrillion Btu in 2050.

Dataset
The data used in this study were sourced from the Brazilian subsidiary of an international thermoplastic resin manufacturer, an energy-intensive manufacturing plant. The plant size is c. 55,000 m² with a production capacity of approximately 500,000 tons per year. Currently, the case site calculates energy consumption forecasts manually and prepares a technical energy consumption index (TECI) as a proxy for energy efficiency. Five years of data, from 1 January 2015 to 31 December 2019, were provided for daily total energy consumption at the plant (ENERGY dataset) as well as process-related data for two different stages of the manufacturing process: polymerisation (POLY_PRODUCTION dataset) and solid-state polymerisation (SSPOLY_PRODUCTION dataset). Each dataset comprised 1826 values. Figure 1 presents the time series of the three datasets used in this work. To identify the impact of variations in the production flow data on total plant energy consumption, we performed a Pearson correlation analysis. It showed a moderate positive correlation between the ENERGY dataset and the production flow datasets (POLY_PRODUCTION and SSPOLY_PRODUCTION). The correlation coefficient between the ENERGY dataset and the POLY_PRODUCTION dataset was 0.71, while that between the ENERGY dataset and the SSPOLY_PRODUCTION dataset was 0.75. Once this analysis was completed, a decision was made to include all three time series (ENERGY, POLY_PRODUCTION, SSPOLY_PRODUCTION) as input in the deep-learning models.
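As an illustration of the correlation analysis described above, the following minimal sketch computes a Pearson coefficient in plain Python. The function name `pearson` and the sample values are hypothetical and are not taken from the plant's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily values (NOT the plant's data), for illustration only.
energy = [10.0, 12.0, 11.0, 15.0, 14.0]
poly_production = [100.0, 118.0, 105.0, 140.0, 150.0]
r = pearson(energy, poly_production)
```

A coefficient close to +1 indicates that energy consumption tends to rise with production flow, which is the kind of relationship that motivated including the production series as model inputs.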

Data Preprocessing
Missing values and measurement errors can lead to unpredictable results. Rather than removing the affected records, we replaced missing and anomalous values with imputed data and then normalised the series. For the former, we replaced missing data with the average of the data from the previous seven days as per [8][9][10]. We then normalised the data so that all model inputs had equal weights and the Sigmoid activation function could be applied in the deep-learning models as per [11][12][13]. Normalisation rescales the data to the range [0, 1]. scikit-learn's MinMaxScaler was used to normalise the data in this study, based on Equation (1):

$$X'_i = \frac{X_i - \min(x)}{\max(x) - \min(x)} \qquad (1)$$

where $X'_i$ is the rescaled value, $X_i$ is the original value, $\min(x)$ is the minimum value of the feature, and $\max(x)$ is the maximum value of the feature.
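The two preprocessing steps can be sketched as follows. This is a minimal plain-Python illustration, not the authors' code: the function names are hypothetical, missing values are assumed to be marked as `None`, and previously imputed values are assumed to count towards later seven-day averages. The scaling function applies the min-max formula of Equation (1) to one feature, which is what MinMaxScaler does per feature with its default range.

```python
def impute_with_trailing_mean(values, window=7):
    """Replace None entries with the mean of up to `window` previous values."""
    filled = []
    for v in values:
        if v is None:
            history = filled[-window:]
            if not history:
                raise ValueError("cannot impute: no earlier values available")
            v = sum(history) / len(history)
        filled.append(v)
    return filled


def min_max_scale(values):
    """Rescale one feature to [0, 1] using the min-max formula."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical series with one missing day.
series = [2.0, 4.0, None, 6.0]
filled = impute_with_trailing_mean(series)  # the gap becomes (2 + 4) / 2
scaled = min_max_scale(filled)
```

In practice the minimum and maximum would normally be taken from the training portion only, so that no information from the test period leaks into the inputs; the source does not state which convention was used.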

Evaluation Metric
Root mean squared error (RMSE), mean absolute percent error (MAPE), and mean absolute error (MAE) are the most commonly used metrics for evaluating the accuracy of energy-consumption models [14] and, in particular, in studies related to STLF using deep learning [6,7,15].
RMSE is defined as the square root of the mean squared error (MSE) [16] (Equation (2) [17]), that is, the square root of the mean of the squared differences between the prediction ($P_i$) and the real value ($R_i$), where $n$ is the sample size:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - R_i\right)^2} \qquad (2)$$

RMSE is more sensitive to larger errors (outliers) because it squares the difference between the predicted value and the real value. RMSE presents error values in the same dimensions as the analysed variable [16]. It is widely applied in models that use time series [18].
MAPE is widely used for evaluating prediction models, particularly where the quality of the forecast is required, and is used in numerous energy-consumption forecasting studies [6,[11][12][13]19]. MAPE is defined in Equation (3) [6,11,17] and expresses the error as a percentage of the real value:

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{R_i - P_i}{R_i}\right| \qquad (3)$$

It can be applied in a wide range of contexts, as it offers a relatively intuitive interpretation of relative errors; however, it can only be used if no real value in the dataset equals zero [20].
MAE is defined in Equation (4) [12,17]:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|P_i - R_i\right| \qquad (4)$$

Like RMSE and unlike MAPE, MAE depends on the scale of the data. It is less sensitive to outliers than RMSE, as it weights all errors linearly. We use it to quantify a model's ability to predict energy consumption.
While one or more of RMSE, MAPE, and MAE have been featured in related studies [6,7], we have chosen to measure all of them for better comprehensiveness and analysis of different aspects of what is being studied [21].
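The three metrics can be computed with a few lines of plain Python. The sketch below follows the standard definitions of RMSE, MAPE, and MAE; the function names and the sample values are illustrative only.

```python
import math


def rmse(real, pred):
    """Root mean squared error between real and predicted values."""
    n = len(real)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(real, pred)) / n)


def mape(real, pred):
    """Mean absolute percent error; undefined if any real value is zero."""
    if any(r == 0 for r in real):
        raise ValueError("MAPE is undefined when a real value equals zero")
    n = len(real)
    return 100.0 / n * sum(abs((r - p) / r) for r, p in zip(real, pred))


def mae(real, pred):
    """Mean absolute error between real and predicted values."""
    n = len(real)
    return sum(abs(p - r) for r, p in zip(real, pred)) / n


# Hypothetical values: absolute errors are 10, 20, and 0.
real = [100.0, 200.0, 400.0]
pred = [110.0, 180.0, 400.0]
scores = (rmse(real, pred), mape(real, pred), mae(real, pred))
```

On this toy example the RMSE (about 12.9) exceeds the MAE (10) because squaring gives the larger error more weight, which is the sensitivity to outliers noted above.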

Finding Models to Predict Energy Consumption
Due to the nonlinear characteristics of the datasets used in this research, the need for both accuracy and fast run times, and the promising results obtained in other works that used deep learning [12,19,22,23], three deep-learning techniques were selected for STLF in this study: simple RNN, LSTM, and GRU. In addition to these techniques, two machine-learning techniques were selected for the purpose of comparison: SVR and Random Forest. These were selected as they are featured in related works on STLF for demand-side energy consumption [6,7]. Our models use three time series as input, namely (i) energy consumption, (ii) polymerisation production flow (POLY), and (iii) solid-state polymerisation production flow (SSPOLY), and have a single output: the predicted energy consumption. For each input, the model uses data from the preceding seven days.
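The input layout described above (three series, seven preceding days, one output) can be sketched as a sliding-window transformation. This is an illustrative reading of the description, not the authors' code; the function name `make_windows` and the toy series are hypothetical.

```python
def make_windows(energy, poly, sspoly, lookback=7):
    """Build supervised samples from three aligned daily series.

    Each input is a list of `lookback` days, each day holding the three
    features [energy, poly, sspoly]; the target is the energy of the next day.
    """
    inputs, targets = [], []
    for t in range(lookback, len(energy)):
        window = [[energy[i], poly[i], sspoly[i]] for i in range(t - lookback, t)]
        inputs.append(window)
        targets.append(energy[t])
    return inputs, targets


# Toy series of 10 days (hypothetical values).
energy = [float(i) for i in range(10)]
poly = [float(10 * i) for i in range(10)]
sspoly = [float(100 * i) for i in range(10)]
X, y = make_windows(energy, poly, sspoly)
```

With a seven-day lookback, a series of N days yields N − 7 samples, each shaped (7 days × 3 features), which is the layout recurrent layers typically expect.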

Deep Learning Models
As mentioned previously, we propose three deep-learning techniques for STLF: simple RNN, LSTM, and GRU. RNNs are a type of Artificial Neural Network (ANN) designed to recognise patterns in sequential data streams. In RNNs, the decision, classification, or learning done at a given moment t−1 influences the decision, classification, or learning at the subsequent time t in the time series. RNNs have two sources of input: the present and the recent past. These data are combined to determine how new data are predicted. RNNs have a memory that, for example, multi-layer perceptrons (MLPs) and Convolutional Neural Networks (CNNs) do not. As such, RNNs use information in the sequence itself to perform tasks that other ANNs are unable to do. RNNs have limitations, the most significant of which is the difficulty of training them to capture long-term dependencies due to vanishing and exploding gradient problems [24,25]. LSTM and GRU are variations of RNN that overcome such problems.
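The way an RNN combines the present input with the recent past can be illustrated with a single recurrent step. The sketch below uses the common textbook (Elman-style) update with a tanh activation, which is an assumption made for illustration; the weight names, sizes, and random values are hypothetical and do not reflect the trained models.

```python
import numpy as np


def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: mix the current input with the previous hidden state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)


rng = np.random.default_rng(0)
n_features, n_hidden = 3, 4  # three input series, four hidden units (illustrative)
W_x = rng.normal(scale=0.1, size=(n_hidden, n_features))
W_h = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)

# Run over a seven-day window; the hidden state carries information forward.
sequence = rng.random((7, n_features))
h = np.zeros(n_hidden)
for x_t in sequence:
    h = rnn_step(x_t, h, W_x, W_h, b)
```

Because the same `W_h` is applied at every step, gradients flowing back through long sequences are repeatedly multiplied by related factors, which is the root of the vanishing and exploding gradient problems mentioned above.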
LSTM [26] is a variation of RNN that overcomes gradient problems through the use of a chain structure containing four neural networks and different blocks of memory [27]. LSTM updates its unit states using three gates: a forget gate, an input gate, and an output gate. The forget gate deletes information that is no longer useful in the unit [27]. The current input $x_t$ and the output from the previous unit $h_{t-1}$ are multiplied by the weight matrix. The result is passed through a sigmoid activation function whose output lies between 0 and 1; values close to 0 cause the corresponding data to be forgotten. The input gate adds useful information to the unit's state. First, the information is regulated using a sigmoid function. Then, the tanh function is used to create a vector of candidate values between −1 and +1. Finally, the output gate extracts useful information from the current state of the unit to be presented as an output. To do so, a vector is generated by applying a tanh function to the cell state. Due to its structure, an LSTM can predict time series with time intervals of unknown duration [28], a significant advantage over traditional RNNs. Notwithstanding this, long training times are a significant limitation [29].
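A single LSTM step can be sketched as below to make the role of the three gates concrete. This follows the standard LSTM formulation, with gate activations given by sigmoid outputs in (0, 1); stacking the four weight blocks into one matrix `W` is an implementation convenience assumed here, and all names and sizes are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps concat[h_prev, x_t] to four stacked pre-activations."""
    n = h_prev.size
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0:n])              # forget gate: how much of c_prev to keep
    i = sigmoid(z[n:2 * n])          # input gate: how much new information to add
    g = np.tanh(z[2 * n:3 * n])      # candidate values in (-1, 1)
    o = sigmoid(z[3 * n:4 * n])      # output gate: how much of the state to expose
    c_t = f * c_prev + i * g         # updated cell state
    h_t = o * np.tanh(c_t)           # new hidden state (the unit's output)
    return h_t, c_t


n_features, n_hidden = 3, 2
W = np.zeros((4 * n_hidden, n_hidden + n_features))  # zero weights for a checkable demo
b = np.zeros(4 * n_hidden)
h_t, c_t = lstm_step(np.ones(n_features), np.zeros(n_hidden), np.ones(n_hidden), W, b)
```

With zero weights every gate evaluates to 0.5, so the cell state is simply halved; with trained weights the gates learn when to keep, overwrite, or expose the stored information.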
GRUs reduce the complexity of LSTMs by utilising only an update gate and a reset gate to determine how values in the hidden states are computed [25]. In GRUs, only one hidden state is transferred between the time steps [25]. This state is capable of maintaining long- and short-term dependencies at the same time. GRU gates are trained to selectively filter out any irrelevant information while maintaining what is useful. These gates are vectors containing values between 0 and 1, as in LSTM, and determine the importance of the information. Crucially, research suggests that GRUs have significantly faster training times with comparable performance to LSTM [25,29].
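For comparison, a single GRU step can be sketched as follows. It uses one common formulation in which the update gate `z` blends the previous hidden state with a candidate state; some references and libraries swap the roles of `z` and `1 - z`, so the convention here is an assumption. All names and sizes are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gru_step(x_t, h_prev, W_z, W_r, W_h, b_z, b_r, b_h):
    """One GRU step with an update gate (z) and a reset gate (r)."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(W_z @ hx + b_z)  # update gate: how much of the state to replace
    r = sigmoid(W_r @ hx + b_r)  # reset gate: how much past state feeds the candidate
    h_cand = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]) + b_h)
    return (1.0 - z) * h_prev + z * h_cand  # single hidden state passed onward


n_features, n_hidden = 3, 2
shape = (n_hidden, n_hidden + n_features)
zeros_w, zeros_b = np.zeros(shape), np.zeros(n_hidden)  # zero weights for a checkable demo
h_new = gru_step(np.ones(n_features), np.array([2.0, -2.0]),
                 zeros_w, zeros_w, zeros_w, zeros_b, zeros_b, zeros_b)
```

Compared with an LSTM step, there is no separate cell state and one fewer gate, which means fewer parameters per unit and is consistent with the faster training times reported for GRUs.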

Deep Learning Model Settings
To determine the most suitable configuration for each deep-learning model, we use the grid search method to determine the respective hyperparameters [30][31][32][33][34]. It is used widely as it is quick to implement, is trivial to parallelise, and intuitively allows an entire search space to be explored [35].
To perform the grid search, the dataset was separated into a training set consisting of 80% of the original dataset (from 1 January 2015 to 30 December 2018) and a test set comprising 20% of the original dataset (from 31 December 2018 to 31 December 2019) using a percentage split. The hyperparameters evaluated by the grid search for the deep-learning techniques were (i) the number of layers and (ii) the number of nodes in each layer (see Table 1). Figure 2 presents the loss convergence during both training and testing of the deep-learning models. It suggests that the models converge after about 40 epochs (loss stabilisation), with no evidence of overfitting.
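The chronological 80/20 split can be checked with a short calculation. The sketch below only reconstructs the daily calendar of the study period and cuts it at 80%; the variable names are illustrative.

```python
from datetime import date, timedelta

# Daily calendar for the study period: 1 January 2015 to 31 December 2019.
start = date(2015, 1, 1)
days = [start + timedelta(days=i) for i in range(1826)]

# Chronological percentage split: first 80% for training, remainder for testing.
split = int(len(days) * 0.8)
train_days, test_days = days[:split], days[split:]
```

Cutting 1826 daily values at 80% gives 1460 training days ending on 30 December 2018 and 366 test days starting on 31 December 2018, which matches the date ranges stated above. No shuffling is applied, so the test set lies strictly after the training set in time.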
For the deep-learning models, the following parameters were fixed: 100 epochs (based on Figure 2), a batch size of 16, Sigmoid [36] as the activation function, MSE as the loss function, and Adam, a method for stochastic optimisation, as the optimiser. These parameters were chosen empirically. Due to the stochastic nature of the optimisation process, the grid search was performed 30 times, and the averages of RMSE, MAPE, and MAE were calculated. As model complexity increases, deep-learning models learn more from the greater volume of available data in the training dataset.
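The repeated grid search can be sketched generically as below. This is an illustration under stated assumptions: `train_and_score` stands in for training a model and returning an error metric, `toy_score` is a made-up deterministic substitute so the example runs without a deep-learning library, and the grid values are hypothetical because Table 1 is not reproduced here.

```python
import itertools
import statistics


def grid_search(train_and_score, grid, repeats=30):
    """Evaluate every combination `repeats` times and average the scores."""
    names = list(grid)
    results = {}
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        # Repeat with different seeds because training is stochastic.
        scores = [train_and_score(seed=run, **params) for run in range(repeats)]
        results[combo] = statistics.mean(scores)
    best = min(results, key=results.get)  # lowest average error wins
    return dict(zip(names, best)), results


def toy_score(layers, nodes, seed):
    """Hypothetical stand-in for 'train a model and return its RMSE'."""
    return abs(layers - 1) * 0.01 + abs(nodes - 30) * 0.001 + (seed % 3) * 1e-4


# Hypothetical search space (not the values from Table 1).
grid = {"layers": [1, 2, 3, 4], "nodes": [10, 30, 50]}
best_params, all_results = grid_search(toy_score, grid, repeats=30)
```

Averaging over repeated runs reduces the chance that a configuration is selected because of one favourable random initialisation. Each combination is independent of the others, which is why grid search is straightforward to parallelise.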

Machine-Learning Models
In addition to deep-learning models, we also propose two machine-learning models: SVR and Random Forest. Support Vector Machines (SVMs) have been proposed as an alternative to traditional ANNs for classification and regression tasks. In particular, SVM provides better support for forecasting time series from nonlinear systems [37]. SVM is a machine-learning technique based on statistical learning theory [38]. Extant literature suggests that SVM performs well in forecasting time series [37,39,40]. Support Vector Regression (SVR) is a regression technique based on SVM [41]. The main differences relate to the formats and types of input and output. Kernel functions are used to map the data through nonlinear functions into an n-dimensional space. In this way, it is possible to transform nonlinear problems into linear problems. Research suggests that SVR presents accurate results for predicting energy consumption and, as such, is commonly used in the field [42]. Despite its advantages, the lack of predetermined heuristics for both the design and parameterisation of SVR models is a major drawback [37]. As such, studies tend to be application-specific and to lack generalisability [37].
Random Forest is an ensemble machine-learning technique built from multiple decision trees. Each tree is grown on a random sample of the training data, with a random subset of features considered when splitting nodes. For regression, the model output is the average of the results of all trees. Random Forest performs better than a single decision tree [43][44][45]. Normally, the greater the number of trees, the better the performance of the model, but the slower the model and the less efficient its real-time predictions. It is one of the most popular machine-learning techniques used for classification and regression problems [21,46,47]. Random Forest's popularity is often attributed to its higher accuracy when compared with ANNs and SVR [48].

Machine-Learning Model Settings
As with the deep-learning models, we performed a grid search to find the best hyperparameters of the machine-learning models, using the same dataset splitting procedure. The hyperparameters evaluated vary according to the technique (see Table 2). For SVR, the cost (C) and the type of kernel were used, whereas the maximum depth of trees and the number of trees were used for Random Forest. For SVR, the best configuration across the three metrics used (RMSE, MAPE, and MAE) is SVR-0.1-linear, which has a C value of 0.1 and uses the linear kernel. For Random Forest, the configurations with (a) a maximum depth of three with 50 trees (Random Forest-3-50), (b) a maximum depth of six with 50 trees (Random Forest-6-50), and (c) a maximum depth of six with 100 trees (Random Forest-6-100) generated the best results for RMSE, MAPE, and MAE, respectively. These four model configurations are used in our benchmark evaluation.

Benchmarks
Two additional benchmarks were selected for comparison purposes. The first benchmark is the manual technique used by the case site providing the dataset for this study. The second benchmark is an ARIMA model. ARIMA was selected because of its widespread use in energy forecasting and, in particular, in related works [6].
The manual technique used by the case site is a simple calculation as per Equation (5), where the predicted energy consumption of a given day, $C_{predicted}$, is the product of the planned production flow ($F_{planned}$) and the TECI ($n_{previous}$) based on measured data collected on the previous day:

$$C_{predicted} = F_{planned} \times n_{previous} \qquad (5)$$
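A minimal sketch of this manual calculation is given below. It rests on two assumptions that the text does not spell out: that Equation (5) combines the planned flow and the TECI as a product, and that the previous day's TECI is computed as measured energy divided by measured production flow. The function name and the numbers are hypothetical.

```python
def manual_forecast(planned_flow, previous_energy, previous_flow):
    """Forecast energy as planned production flow times the previous day's TECI.

    ASSUMPTION: TECI is taken here as energy consumed per unit of production
    flow on the previous day; the source does not give its exact definition.
    """
    teci_previous = previous_energy / previous_flow
    return planned_flow * teci_previous


# Hypothetical figures: yesterday 500 energy units for 1000 units of flow,
# and 1200 units of flow planned for the forecast day.
forecast = manual_forecast(planned_flow=1200.0, previous_energy=500.0, previous_flow=1000.0)
```

Under these assumptions the forecast depends on only one day of history, which may help explain why the manual technique adapts poorly when consumption per unit of production varies from day to day.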
The choice of the ARIMA model for this study was based on the time-series nature of our dataset (numerical data in which the output variable depends on its own past values). Equation (6) [49] represents the mathematical expression for the autoregressive part:

$$x(t) = \sum_{i=1}^{p} \alpha_i \, x(t-i) \qquad (6)$$

where $t$ is the index represented by an integer, $x(t)$ is the estimated value, $p$ is the number of autoregressive terms, and $\alpha$ is the polynomial related to the autoregressive operator of order $p$. Equation (7) [49] reflects the dependency of time-series values on the errors of previous estimates, i.e., the errors of the forecast are taken into account when estimating the next value in the time series:

$$x(t) = \sum_{i=1}^{q} \beta_i \, \varepsilon(t-i) \qquad (7)$$

where $q$ is the number of moving average terms, $\beta$ is the polynomial related to the moving average operator of order $q$, and $\varepsilon$ is the difference between the estimated and actual values of $x(t)$.

Equation (8) [49], a combination of Equations (6) and (7), represents the ARIMA model (with orders $p$ and $q$) used as a benchmark for this study:

$$x(t) = \sum_{i=1}^{p} \alpha_i \, x(t-i) + \sum_{i=1}^{q} \beta_i \, \varepsilon(t-i) \qquad (8)$$
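The combination of an autoregressive part and a moving-average part described for Equations (6) to (8) can be sketched as a one-step-ahead recursion. This is a simplified illustration with given coefficients and no differencing (d = 0); it does not estimate the coefficients, and treating values before the start of the series as zero is an assumption. The function name and sample numbers are hypothetical.

```python
def arma_one_step(series, alpha, beta):
    """One-step-ahead forecasts from past values (AR) and past errors (MA)."""
    p, q = len(alpha), len(beta)
    preds, errors = [], []
    for t in range(len(series)):
        # Autoregressive part: weighted sum of the p previous observed values.
        ar = sum(alpha[i] * series[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        # Moving-average part: weighted sum of the q previous forecast errors.
        ma = sum(beta[j] * errors[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        x_hat = ar + ma
        preds.append(x_hat)
        errors.append(series[t] - x_hat)  # error feeds into later forecasts
    return preds


# Hypothetical series and coefficients with p = 1 and q = 1.
preds = arma_one_step([1.0, 2.0, 3.0], alpha=[0.5], beta=[0.5])
```

In practice the coefficients are fitted from the training data by a statistics library rather than supplied by hand; the sketch only shows how each forecast reuses both the previous observation and the previous forecast error.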
After empirical analysis, the selected ARIMA model had an autoregressive order of p = 1, a degree of differencing of d = 0, and a moving-average order of q = 1.

Table 3 presents the RMSE, MAPE, and MAE results for the four deep-learning models (RNN-1-30, RNN-4-30, LSTM-1-30, and GRU-1-30) and the four machine-learning models (SVR-0.1-linear, Random Forest-3-50, Random Forest-6-50, and Random Forest-3-100) identified by the grid search method, as well as for the manual and ARIMA benchmarks. Based on the RMSE metric, the deep-learning models outperformed the machine-learning models and the manual and ARIMA benchmarks. This behaviour may be explained by the ability of deep-learning models to achieve insights outside the domain of the training data. The GRU model presented the best performance of all models tested and also reduced the complexity inherent in the other deep-learning models analysed; the simple RNN models presented the worst performance. In contrast, based on MAPE and MAE, the ARIMA model outperformed the deep-learning models, the machine-learning models, and the legacy manual approach.

Table 4 presents the average inference times for the four deep-learning models (RNN-1-30, RNN-4-30, LSTM-1-30, and GRU-1-30), the four machine-learning models (SVR-0.1-linear, Random Forest-3-50, Random Forest-6-50, and Random Forest-3-100), and the manual and ARIMA benchmarks. With average inference times of 0.8 s and 0.0085 s, respectively, the deep-learning and machine-learning models performed best as a whole; standard deviations were insignificant. Random Forest-3-50 has the shortest average inference time of the models compared, while the ARIMA model is the slowest when compared to the machine-learning and deep-learning models. Although ARIMA achieved good RMSE, MAPE, and MAE results, its inference time is much longer than that of the deep-learning models, a significant limitation for practical use.
Table 3 suggests that GRU-1-30 and ARIMA achieved the best results for the RMSE, MAPE, and MAE metrics. As the values of RMSE, MAPE, and MAE are very similar, we used the Diebold-Mariano test [50] to confirm the results. The Diebold-Mariano test is a hypothesis test used to assess whether the difference in accuracy between two prediction models is statistically significant. Table 5 presents the results obtained.

Diebold-Mariano Statistical Test
The Diebold-Mariano test result equals zero when the techniques being tested perform equally; negative values indicate that the left technique performs better, and positive values that the right technique performs better. If the absolute Diebold-Mariano results are high, the tested techniques have significantly different prediction values. The first line of Table 5 compares the case site technique with all other models.
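The sign convention described above can be illustrated with a simplified Diebold-Mariano statistic. The sketch assumes one-step-ahead forecasts and a squared-error loss, neither of which is stated in the text, and it omits the autocovariance correction used for longer horizons. The function name and data are hypothetical.

```python
import math


def diebold_mariano(actual, forecast_left, forecast_right):
    """Simplified DM statistic (one-step horizon, squared-error loss).

    Negative values mean the left forecast has the lower average loss.
    """
    # Loss differential for each observation.
    d = [(a - fl) ** 2 - (a - fr) ** 2
         for a, fl, fr in zip(actual, forecast_left, forecast_right)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n
    if var_d == 0.0:
        return 0.0  # identical loss differentials; treated as no difference
    return mean_d / math.sqrt(var_d / n)


# Hypothetical data: the left forecast is perfect, the right one is not.
actual = [1.0, 2.0, 3.0, 4.0]
left = [1.0, 2.0, 3.0, 4.0]
right = [2.0, 2.0, 4.0, 5.0]
dm = diebold_mariano(actual, left, right)
```

Swapping the two forecasts flips the sign of the statistic, and comparing a forecast with itself returns zero, in line with the interpretation given above. Under the usual assumptions the statistic is compared against the standard normal distribution to judge significance.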
It is clear that the existing manual technique used at the case site has the worst performance in comparison to all other models examined. The high statistical values obtained for this technique confirm that it is suboptimal for STLF in this case. The only model that outperformed the ARIMA model was the GRU-1-30 model. All deep-learning models outperformed the machine-learning models; however, the differences in Diebold-Mariano values among them are not as large. While the Diebold-Mariano test results for the deep-learning models are similar, the GRU-1-30 model achieved the best prediction indexes when compared to all models tested. As such, the initial hypothesis from the grid search results is confirmed.
These results suggest a significant improvement in the accuracy of the STLF for this energy-intensive manufacturer. This can be used to provide more accurate energy management to meet production demands, to improve cashflow, to reduce environmental impact, and to mitigate risks associated with energy inefficiencies. Accurate STLF results can be used for anticipatory optimisation and remediation. For example, anomaly detection can be used to identify possible machine degradation or failure from anomalous loads at different stages in the manufacturing process. This would enable predictive maintenance and avoid production downtime.

Related Work
Short-term load forecasting using deep learning and machine learning has been examined from a variety of perspectives. For example, there is a well-established literature on supply-side energy consumption and demand forecasting using deep learning from the perspective of the management and optimisation of power systems and electricity grids. These include studies using deep neural networks [6], deep belief networks [51], CNNs [52], Autoencoder and LSTM [53], and SVM and Random Forest [21], amongst others. The focus of this paper is demand-side. STLF for grids and utility companies has a fundamentally different motivation and context than STLF for manufacturing firms, not least the public interest aspect of energy systems, as opposed to profit maximisation, operational efficiencies, and other business objectives.
Similarly, there have been a number of studies on the use of deep learning for load forecasting for different energy consumer types: residential [19,22,54], commercial [55,56], and industrial [11,57]. While there is certainly new knowledge generated by these works, their focus is overwhelmingly on load forecasting for utility companies and grids. Residential and commercial use cases have fundamentally different energy consumption patterns than industry in terms of decision-making time horizon, building code standards, population density, building design, and response to regional climate, amongst others [58]. As discussed in Section 1, energy-intensive manufacturing has a significantly different energy consumption profile than other industry sectors, leaving aside the obvious differences with residential and commercial use. As a result, the motivation for load forecasting is substantially different than in other industrial use cases. In particular, these operations tend to have high process-related energy requirements, are not subject to climatic variation, and treat energy management as a core capability of their business. As production is central to manufacturing, the demand for energy is derived from production planning, and energy forecasting and optimisation are based on the overriding demands of production [59].
Ryu et al. [6] explore demand-side STLF for a variety of industry categories including manufacturing. Using data sourced from a Korean utility company, they propose a deep neural network (DNN) based STLF framework based on industry category, temporal patterns, location, and weather conditions. A comparative analysis was performed with three different forecasting techniques: shallow neural network (SNN), double seasonal Holt-Winters (DSHW), and ARIMA. Using MAPE and relative root mean squared error (RRMSE) as metrics, the results suggest that the DNN-based STLF model achieved the best performance when compared to the other models, with lower MAPE (2.19%) and lower RRMSE (2.76%). Our approach differs in three important ways. First, we adopt RNNs, which have significant advantages for time-series data. Second, in [6], because the data come from the power company, all energy consumption at a firm level is bundled together and it is not possible to distinguish different sources of energy consumption and their impact within the firm, e.g., buildings vs. process-related consumption. Third, we focus on energy-intensive manufacturing. It is not clear whether energy-intensive manufacturing is included in [6].
Mawson & Hughes [60] explore the use of a deep feedforward neural network (DFNN) and a deep RNN (DRNN) for STLF at a medium-sized manufacturing facility. Inputs to the DNNs included weather conditions and machining schedules. Results suggest that both models performed well but that the DRNN outperformed the DFNN for predictions of building energy consumption, achieving an accuracy of 96.8% compared to 92.4% for the DFNN. The focus of [60] was optimising heating, ventilation, and air conditioning (HVAC). As such, the impact of the production process was not a focus per se. While data for boiler energy, cooling energy, and machine scheduling were taken into account, again, unlike our work, specific process-related energy consumption was not considered and the focus was not energy-intensive manufacturing. Additionally, Mawson & Hughes [60] use simulated data whereas we use ground truth data to train and validate the models.
In contrast to [6,60], Chen et al. [7] study the use of a DNN for STLF in an energy-intensive manufacturing use case. Using data from the melt shop of a steel plant, they sought to use a DNN to predict energy consumption for one specific process, the electric arc furnace (EAF), for different types of scrap. The performance of the DNN was compared with linear regression (LR), SVM, and decision tree (DT) models based on the model correlation coefficient and MAE. The proposed DNN outperformed the other models with the highest correlation index, at 0.854, and the lowest MAE, at 1.5%. While [7] is the closest use case to our paper, it focuses exclusively on one process and does not seek to calculate the overall plant energy consumption. While the authors identify the potential of deep learning over traditional statistical and machine-learning approaches, they do not evaluate its relative performance against other deep-learning architectures.
Yeom & Choi [15] describe a platform, E-IoT, for collecting a wide range of data (over 1556 variables for one process) at a Korean manufacturing plant. From the data collected by E-IoT, they use a least absolute shrinkage and selection operator (LASSO) technique, based on machine learning, to extract relevant variables to predict plant-level STLF based on the first stage of one process, using LSTM. The proposed LSTM model achieved an MAE of 0.07 and an accuracy of 79%. The paper suffers from a significant lack of detail. For example, while the energy consumption profile for the process presented in [15] appears energy-intensive when compared to total plant energy consumption, it is unclear from the paper whether the manufacturing plant was energy-intensive or not. It is also unclear why only the first stage of the manufacturing process was used, and how many other processes are involved. Furthermore, no detail is provided on how the LSTM model configuration was selected, and it is not compared with existing techniques or other deep learning models.
Li et al. [21] explore STLF for industrial customers in China and source data from a cable factory and a lithium factory located in Chongqing. They propose two short-term (20 days) energy consumption forecasting models using SVM and Random Forest based on historical consumption data as well as seasonal factors (holidays) and upstream value chain data, i.e., the price of non-ferrous metals and raw material consumption at each factory. Both models accurately predicted the electricity loads for both factories. The MAPEs of the SVM and Random Forest models were similar for each factory: 5% for the cable factory and 2% for the lithium factory. The study highlighted the need for research using additional industry- and firm-specific variables to increase accuracy.
As can be seen from the above, there is a paucity of research on demand-side STLF for energy-intensive manufacturing using deep-learning and machine-learning models. Decision making for utility companies has little in common with that of manufacturers. Similarly, STLF for residential and commercial use has little relevance to industrial use cases, and within industrial energy consumption, energy-intensive manufacturing is idiosyncratic. The few similar studies lack detail on the degree to which the firms studied are energy-intensive manufacturers, aggregate all energy consumption, or focus on one process alone. Furthermore, where proposed deep-learning and machine-learning models were compared, they were evaluated against only traditional techniques, against only other deep-learning models, or not at all. We address all of these shortcomings in our paper.

Conclusions and Future Work
This paper is one of the first to compare the efficacy of deep-learning and machine-learning models for short-term load forecasting for energy-intensive manufacturing plants. In addition, we benchmark these models against the incumbent manual prediction technique and a classic time-series forecasting technique, ARIMA. Unlike existing studies, we consider multi-year ground truth data including total plant energy consumption data and data from two stages in a complex energy-intensive manufacturing process. The use of production data contributed significantly to improving STLF accuracy by reducing the RMSE.
Based on both the grid search results and Diebold-Mariano test results, we found that all the deep-learning and machine-learning models outperformed the incumbent manual technique. Furthermore, the GRU model (GRU-1-30) outperformed the basic RNN and LSTM models in RMSE (0.0305), MAPE (4.33%), and MAE (0.0305) in a very short inference time (0.7058 s).
Accurate STLF can be used in a variety of manufacturing processes to achieve energy efficiencies and can be used as an input in a range of operational decisions including energy management (e.g., heat storage and cooling), anomaly detection, predictive machine maintenance, and proactive plant and machine management, amongst others. The reduction of machine idle times would seem to be particularly attractive to such manufacturers. Given the dearth of research on this topic in energy-intensive manufacturing, there are many avenues for future research. As the industrial Internet of Things matures, a significantly larger volume of time-series data will be available to further refine the accuracy of the models and to extend the use of deep learning beyond prediction to actuation.
For near real-time prediction, very short-term load forecasting (VSTLF) may be needed. In such use cases, rapid training times will be required. While GRUs may meet this criterion, further research is required. Furthermore, medium-term load forecasting may prove fruitful; for this, deep-learning models may need to be augmented with historic trend data to account for longer seasonal cycles or predictable events. Medium-term load forecasting may enable new use cases including switches to more sustainable or lower-cost power supplies. Similarly, as production planning is prioritised over energy management in energy-intensive manufacturing, a multi-step forecasting strategy may be more appropriate or preferable. This may require ensemble solutions and is worthy of exploration.
This paper highlights the potential of deep learning and ARIMA in energy-intensive manufacturing. The adoption of deep learning, like all data science technologies, requires overcoming human, organisational, and technological challenges; however, against intense rivalry, firms may not have a choice.

Funding:
This research has been partially financially supported by Fundação de Amparo à Ciência e Tecnologia de Pernambuco (FACEPE), by the Irish Institute of Digital Business (IIDB) and dotLAB Brazil.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations were used in this manuscript: