ForecastTB—An R Package as a Test-Bench for Time Series Forecasting—Application of Wind Speed and Solar Radiation Modeling

Abstract: This paper introduces an R package, ForecastTB, that can be used to compare the accuracy of different forecasting methods as related to the characteristics of a time series dataset. ForecastTB is a plug-and-play structured module, and several forecasting methods can be included with simple instructions. The proposed test-bench is not limited to the default forecasting and error metric functions, and users are able to append, remove, or choose the desired methods as per requirements. Besides, several plotting functions and statistical performance metrics are provided to visualize the comparative performance and accuracy of different forecasting methods. Furthermore, this paper presents real application examples with natural time series datasets (i.e., wind speed and solar radiation) to exhibit the features of the ForecastTB package and to evaluate how forecasting comparisons are affected by the characteristics of a dataset. Modeling results indicated the applicability and robustness of the proposed R package ForecastTB for time series forecasting.


Introduction
Decision making is one of the most crucial tasks in many domains, and decisions are often based on the most accurate forecast available in the respective domain. A large number of areas, such as energy [1,2], economics [3], infrastructure [4,5], health [6,7], agriculture [8,9], defense [10], education [11,12], technology [13,14], geo-science [15], climate [16], and structural engineering [17], among several others, look to the benefits that can be achieved with time series forecasting. Time series are consecutive sequences of values ordered with respect to time; statistically, they are represented by the theory of stochastic processes [18]. The notable feature of time series data is the time-based correlation between values, such that the probability of occurrence of a value depends on past or future observations [19,20].
Standardizing the process of comparing forecasting methods is associated with numerous challenges [21]. The apparent concern is the nature of the time series and the selection of appropriate methods, such that the complete variation arising from seasonality, trends, and random components is handled by each method. For instance, one forecasting model may perform better for a seasonal time series but show poor accuracy compared to other models for a trending or random one [22]. Besides, interpretations might be affected by the choice of error metric, for example, the root mean square error (RMSE), as different metrics have different objectives [23].
The ForecastTB package can be helpful for professionals and researchers working in the field of data science and forecasting analysis. The salient features of the ForecastTB package are as follows:

• Reduction in efforts and time consumption: The ForecastTB package is designed to reduce the effort and time consumed in time series forecasting analysis. It avoids the repetitive steps in the analysis and leads to the generation of a promising comparative results report.

• Truthful comparison assurance: The ForecastTB package ensures a truthful and unbiased comparison of forecasting methods. Hence, this package may be considered a reliable tool for industrial report generation or scientific publications based on forecasting models.

• Reproducible research: Along with unbiased comparisons, the ForecastTB package provides ease in reproducible research with minimum effort. In other words, the forecasting comparison can easily be reproduced several times with the help of the ForecastTB package.

• Stepping stone in machine learning automation: Forecasting methods play a very important role in machine learning applications [24]. The ForecastTB package aims to evaluate the best performing forecasting method for a given time series dataset, and this can be presented as a stepping stone in machine learning automation. For example, on a change in the nature and patterns of the time series dataset, a machine learning application could automatically replace the existing forecasting method based on the output of the ForecastTB package.

• A handy tool: The ForecastTB package is a handy tool, especially for researchers who are not comfortable with computer coding, since it is a plug-and-play module based package. A very simple syntax leads to a very impressive and accurate forecasting comparison analysis.
This paper presents the ForecastTB package as a test-bench for the comparative study of different forecasting methods [25]. The motivation behind this package is another R package, named imputeTestbench [26,27], which demonstrated great success in performing comparison analyses of time series imputation methods in several research studies [28][29][30][31]. The ForecastTB package aims at providing an evaluation toolset that overcomes the challenges of discovering the most suitable forecasting method, along with a detailed comparative analysis. The package also supports randomized evaluation strategies, including Monte-Carlo simulation. Values forecasted with several different methods are then compared with a user-defined error metric. A couple of plotting functions are provided for visualizing the overall forecast evaluation. Besides, demonstration examples and case studies on natural time series datasets are discussed to showcase the usage of the ForecastTB package.

Overview of ForecastTB
The ForecastTB package is a plug-and-play structured module, as shown in Figure 1. It is used to compare forecasting methods, beginning by forecasting the time series with distinct strategies. The prediction accuracies are then evaluated with the error metrics for all methods over several repetitions. The prediction_errors() function is employed for the comparative evaluation of forecasting methods in consideration of various input parameters. It returns an object, which is the basic module in the package. Further, this module can be updated with new methods and other parameters with the append_() and choose_() functions. The monte_carlo() function is a further extension of the prediction_errors() module to compare distinct methods on randomly selected patches of the input time series. The remaining two functions, plotMethods() and plot_circle(), are used to visualize the forecasted results and error summaries for the chosen methods. All functions, as a set of modules, are based on and connected with the object provided by the prediction_errors() function. Hence, the framework of the ForecastTB package is version-control-friendly, meaning that new features can easily be introduced in future versions of the package. Dependencies include additional packages for data manipulation (reshape2, [32]; stats, [33]), graphing (ggplot2, [34]; circlize, [35]; graphics, [33]; gridExtra, [36]; RColorBrewer, [37]), and forecasting (forecast, [38]; PSF, [39][40][41]; decomposedPSF, [42]; methods, [33]).
Recently, another R package for streamlining time series forecasting, with limited facilities and features, was proposed, named predtoolsTS [43,44]. This tool assists in forecasting with the automated Auto-Regressive Integrated Moving Average (ARIMA) model [38] and only regression-type algorithms from the caret [45] package, in a closed environment. Beyond this, there is no facility to append new forecasting methods or to update the comparison environment as per the user's needs. On the contrary, the proposed ForecastTB package provides such facilities, along with the possibility of introducing new features, in a simple plug-and-play structure, as discussed in this article.

The prediction_errors() function
The prediction_errors() function is the primary function that compares the performance of different forecasting methods depending on various parameters supplied by the user. The default methods included in prediction_errors() are ARIMA from the forecast package [38] and the pattern sequence based forecast (PSF) method from the PSF package [39,40]. The statistical ARIMA method is one of the most widely and frequently used in time series forecasting, and it is comfortable to model as compared to more complex approaches. Moreover, to avoid further complexity, the auto.arima() function from the forecast package [38] and psf() from the PSF package are used, which automatically estimate the model parameters required for the given dataset. In the following examples, we show the possibility of adding extra, user-defined methods as needed.
Following are the arguments of the prediction_errors() function:

data:
A ts (stats) object as the input dataset (numeric object) to be evaluated. It is a time varied dataset for the comparison of the performance and accuracy of forecasting methods.

nval:
An integer value to decide the number of values to be predicted while comparing the forecasting methods. The default value is 12.

ePara:
The forecasted and observed values are compared with error metrics. The root mean square error (RMSE), mean absolute error (MAE), and mean absolute percent error (MAPE) [46] are the default error metrics included in ForecastTB, but an individual metric can be opted for using ePara = 'rmse', 'mae', or 'mape'. The error metrics are defined as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \hat{X}_i\right)^2}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|X_i - \hat{X}_i\right|$$

$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{X_i - \hat{X}_i}{X_i}\right|$$

where $X_i$ and $\hat{X}_i$ are the observed and forecasted data at time $i$, respectively, and $N$ is the number of data points used for forecast evaluation. Additionally, a new error metric of the user's choice can be introduced in the following manner: ePara = c("errorparametername"), where errorparametername is an external function which returns the desired error values for the original and forecasted values. This is discussed in detail in the following sections.

ePara_name:
A string of characters for the names of error metrics provided in ePara. In the prediction_errors() function, RMSE, MAE and MAPE are the default names for the error parameters.

Method:
A string with the names of the methods used to forecast values, e.g., ARIMA, PSF, and user-defined methods. ARIMA is the default method unless the user changes it. For instance, Method = c("PSF") will use only the PSF method with the prediction_errors() function. New forecasting methods can be included in the prediction_errors() function with a Method = c("test1(data, nval)") vector, where test1 is an external function which forecasts nval future values for the input data and returns the forecasted values. Multiple such functions can be added to the Method parameter. The external function is expected to implement its forecast using the recursive strategy, as in the sketch below.
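For illustration, the following is a minimal sketch of such a user-defined wrapper; the wrapper name test_naive and the use of the naive() function from the forecast package are our own choices for this example, not part of the ForecastTB API:

library(forecast)

# Hypothetical wrapper in the form expected by 'Method': it takes the input
# series and 'nval', and returns 'nval' forecasted values.
test_naive <- function(data, nval){
  as.numeric(naive(data, h = nval)$mean)  # random-walk (naive) forecast
}

# Usage sketch:
# prediction_errors(data = AirPassengers,
#                   Method = c("test_naive(data, nval)"),
#                   MethodName = c("Naive"))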

MethodName:
A string of characters provided by the user, naming the one or more forecast methods passed to Method. The default method name in the prediction_errors() function is ARIMA.

strats:
A character string of names of forecasting strategies used for time series forecasting. The strategies available in the current version of the prediction_errors() function are Recursive and DirRec. In the Recursive approach, the first value is predicted with a prediction model and then appended to the dataset; the same fitted model is then used to predict the next value, and so on. In the DirRec approach, the first value is predicted with a model and appended to the dataset; a new model is then fitted and used to predict the next value, and so on. The default strategy in the prediction_errors() function is the Recursive one, as illustrated by the sketch below.
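To make the distinction concrete, the following is a minimal sketch of the two strategies; fit_model() and predict_one() are hypothetical helpers (a model-fitting function and a one-step-ahead predictor), not functions of the package:

# Recursive: fit once, reuse the same model, feeding forecasts back in.
recursive_strategy <- function(data, nval, fit_model, predict_one){
  model <- fit_model(data)
  for (i in seq_len(nval)) {
    data <- c(data, predict_one(model, data))  # append each forecast
  }
  tail(data, nval)
}

# DirRec: refit the model after every appended forecast.
dirrec_strategy <- function(data, nval, fit_model, predict_one){
  for (i in seq_len(nval)) {
    model <- fit_model(data)                   # new model at each step
    data <- c(data, predict_one(model, data))
  }
  tail(data, nval)
}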

append_:
A flag that indicates whether the function is used to append to another instance. The default value is 1, which indicates that newly added methods are compared together with the default methods. When the flag is 0, all methods except the default ones are compared.

dval:
An integer that specifies the length of the data subset to be used for forecasting. The default value is the length of the data parameter.
With the default entities for all arguments, the prediction_errors() function generates an S4 object of class prediction_errors as the error profile for the forecasting methods. This prediction_errors() function returns two slots as output. The first slot is output, which provides Error_Parameters, indicating error values for the forecasting methods and error parameters defined in the framework, and Predicted_Values, the values forecasted with the same forecasting methods. The second slot is parameters, which returns the parameters used by or provided to the prediction_errors() function. This is illustrated in the following example:

# 'AirPassengers' is a sample dataset in CRAN
prediction_errors(data = AirPassengers)
## An object of class "prediction_errors"
## Slot "output":

The append_() and choose_() functions
The append_() and choose_() functions add new forecasting methods to, and choose selected methods within, the prediction_errors object. These functions make ForecastTB a handy plug-and-play tool for forecasting analysis. The append_() function has the following arguments:

object:
A prediction_errors object with two slots, output and parameters, returned by the prediction_errors() function.

Method, MethodName, ePara, and ePara_name:
These parameters have the same meaning as those supplied to the prediction_errors() function. The ePara and ePara_name parameters of the append_() function are automatically synced with the prediction_errors() function.
The append_() function returns the error profile for the forecasting methods along with newly added ones in the form of an updated prediction_errors object.
Further, the choose_() function helps the user manipulate the prediction_errors object by allowing selection of the desired methods existing in the object. The choose_() function has the following argument:

choose_(object)

object:
A prediction_errors object returned by the prediction_errors() or append_() function.

The syntax of the choose_() function is illustrated in the following example. The variable c is an object with three forecasting method comparisons. After the c object is provided to the choose_() function, it prints the names of the forecasting methods attached to the c object and asks for the index numbers of the methods to be removed from the object. Finally, it stores the newly updated object in the variable d.
## > d <- choose_(object = c)
## Following are the methods attached with the object:
##         [,1]    [,2]   [,3]
## Indices "1"     "2"    "3"
## Methods "ARIMA" "DPSF" "xyz"
## Enter the indices of methods to remove: 2

After removing one of the methods with the choose_() function, a new, updated object d is returned. In this way, the append_() and choose_() functions provide features to manage the desired forecasting methods in the prediction_errors object with simple, single-line syntax. Besides, these functions avoid repetitive computation and execution of the methods which are already attached to the prediction_errors object.

The monte_carlo() function
Monte-Carlo sampling is a very popular strategy for comparing the performance of forecasting methods. In this strategy, multiple patches of a dataset are selected randomly, and the performance tests of the forecasting methods are executed on each of them; the average of the error values obtained over the data patches is then reported. The most notable feature of the Monte-Carlo strategy is that it ensures a more accurate comparison of forecasting methods and avoids biased results that might be obtained by chance.
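As a rough illustration of this idea (a sketch of the sampling scheme, not the package internals), the following function draws random patches, forecasts the last nval values of each patch, and averages the errors; fcast and err are placeholders for a forecasting function and an error metric:

# 'fcast(train, nval)' returns nval forecasts; 'err(obs, pred)' returns an error value.
mc_sketch <- function(data, size, iteration, fcast, err, nval = 12){
  errs <- sapply(seq_len(iteration), function(i){
    start <- sample.int(length(data) - size + 1, 1)  # random patch start
    patch <- data[start:(start + size - 1)]
    train <- patch[1:(size - nval)]                  # fit on the patch head
    test  <- patch[(size - nval + 1):size]           # hold out the tail
    err(test, fcast(train, nval))
  })
  c(errs, mean = mean(errs))                         # per-patch errors plus the mean
}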
The monte_carlo() function has the following arguments:

monte_carlo(object, size, iteration, fval, figs)

object:
An S4 object of the prediction_errors class with two slots, output and parameters, returned by the prediction_errors(), append_(), or choose_() functions. The monte_carlo() function tunes itself based on the parameters returned with the object.

size:
An integer value representing the length of the patches taken from the time series dataset in the Monte-Carlo strategy. The input dataset is taken from the prediction_errors object.

iteration:
A numeric value representing the number of iterations to be used by the Monte-Carlo strategy.

fval:
A flag; when it is on (fval = 1), the monte_carlo() function returns the output along with the forecasted values for each iteration. The default value of fval is 0, which returns only the error values (in RMSE) for each iteration.

figs:
A flag to plot each iteration of the Monte-Carlo strategy with the plot.prediction_errors() function when figs = 1. The default value of figs is 0, which does not return any plots.
In this way, the monte_carlo() function returns the error values for the provided forecasting methods in each iteration of the Monte-Carlo strategy, along with their mean values. This function can be used as a quick tool to build an intuition of the best-suited forecasting method for a given time series dataset.

A Case Study: Performance of Forecasting Methods on Standard Natural Time Series
This section presents a case study to introduce the use of the ForecastTB package as a test-bench to compare the performance of time series forecasting methods. The methods used in the case study are auto.arima(), ets(), psf(), and lpsf(), which are frequently used R functions for time series forecasting. The two standard datasets used in this example are nottem and sunspots. The nottem dataset contains the average air temperatures recorded at Nottingham Castle in degrees Fahrenheit from 1920 to 1930 (datasets package, [33]). Similarly, the sunspots dataset contains mean relative sunspot numbers from 1749 to 1983 (datasets package, [33]). Both datasets are measured on a monthly basis.
In the following examples, the whole comparison analysis and visualization are done for the nottem dataset, but only the error values are shown for the sunspots dataset. The following two functions represent forecasting methods, namely the LPSF and PSF methods from the decomposedPSF [42] and PSF [39] packages, respectively. These functions take the time series dataset and nval as input parameters and return the forecasted values as a vector.
library(decomposedPSF)
test1 <- function(data, nval){
  return(lpsf(data = data, n.ahead = nval))
}

library(PSF)
test2 <- function(data, nval){
  a <- psf(data = data, cycle = 12)
  b <- predict(object = a, n.ahead = nval)
  return(b)
}

The following code chunk uses the main function in the ForecastTB package, in which the performance of ARIMA along with the above mentioned (PSF and LPSF) methods is compared for the nottem time series for the next 12 months' forecast. The output of the function is saved in the variable a1 as an S4 object of the prediction_errors class.

a1 <- prediction_errors(data = nottem, nval = 12,
                        Method = c("test1(data, nval)", "test2(data, nval)"),
                        MethodName = c("LPSF","PSF"),
                        append_ = 1)
a1
## An object of class "prediction_errors"
## Slot "output":

The object a1 shows the comparison of the ARIMA, PSF, and LPSF methods in addition to the parameters provided by the user and the default ones, whenever not provided. Imagine a case where the user wishes to attach a new method to the object a1. Though this could be done with the prediction_errors() function, as discussed above, it would lead to the re-execution of all methods already available in the object. To avoid such redundant calculations and executions, the append_() function can be used. The Error, Trend, Seasonal (ETS) method from the forecast [38] package is framed in a wrapper function, test3(); a sketch of this wrapper is given at the end of this section. The object a1 is updated with the ETS method into object c1 with the append_() function, which returns the error parameters as shown below:

c1 <- append_(object = a1,
              Method = c("test3(data, nval)"),
              MethodName = c('ETS'))
c1@output$Error_Parameters

Further, the object c1 is plotted with the S3 method plot.prediction_errors() as follows, with the result shown in Figure 2. This figure shows the forecasting errors (bar plot) and forecasted values (line plot) for all methods attached to object c1.

d1 <- plot(c1)

From Figure 2, it can be observed that the performance of the ETS method is not satisfactory for the nottem time series dataset. In this case, the user can omit the unwanted method from object d1 with the choose_() function as shown below. After providing the object as input to the choose_() function, it returns the names of the forecasting methods available in the prediction_errors object with their respective indices, and asks the user to enter the indices of the methods to be removed from the object. In the following code chunk, index 4 for the ETS method is provided by the user, and this results in a new object e1, which is nothing but the object d1 without the ETS method. The choose_() function permits the user to remove multiple methods from the object, in which case the user needs to enter multiple indices in a vector format.
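A minimal sketch of the test3() wrapper referenced above, assuming the ets() and forecast() functions from the forecast package, could look as follows:

library(forecast)
test3 <- function(data, nval){
  a <- ets(data)                           # fit the ETS model
  as.numeric(forecast(a, h = nval)$mean)   # return nval forecasted values
}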

Adding New Error Metrics
As described in the earlier section, RMSE, MAPE, and MAE are the default error metrics assigned to ForecastTB. These error metrics compare the forecasting methods from different perspectives [26], and users should choose an appropriate one based on the information provided by each metric. All of these metrics are used internally within the prediction_errors() function.
Additionally, new error metrics can be added using an approach similar to that used for adding forecasting methods, described in earlier sections. The following code demonstrates how the percentage change in variance (PCV, [47]) is added to the existing object as an additional error metric:

$$\mathrm{PCV} = \left|\frac{\mathrm{var}(Observed) - \mathrm{var}(Predicted)}{\mathrm{var}(Observed)}\right| \times 100$$

where var(Predicted) and var(Observed) are the variances of the predicted and observed values, respectively. The user-defined error function should be provided in an R script with two mandatory input arguments, that is, equal-length vectors of observed and forecasted values. The defined function should return a summary of the errors or differences. The function and its name must be supplied to the ePara and ePara_name arguments of the prediction_errors() function, respectively.
# error metric to include with the 'prediction_errors()' function
pcv <- function(obs, pred){
  d <- (var(obs) - var(pred)) * 100 / var(obs)
  d <- abs(as.numeric(d))
  return(d)
}

The following code chunk shows how the new error function is attached to the prediction_errors object, and the corresponding update is observed with the plot() function, as shown in Figure 3.
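A plausible form of this call, assuming the ePara argument accepts the external function name as a string (mirroring the ePara = c("errorparametername") pattern described earlier), is:

# Assumed call pattern; 'e1' is the object from the previous section.
f1 <- append_(object = e1, ePara = c("pcv"), ePara_name = c("PCV"))
plot(f1)  # updated comparison including the PCV metric (Figure 3)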

A Polar Plot
The ForecastTB package provides a polar plot for showing forecasted values, which is especially useful if the time series dataset shows seasonality, as shown in Figure 4. This plot shows the nature of the forecasted values and how they behave over an increasing number of time horizons. The numbers on the circle show the forecasted time steps. Several forecasting methods are suitable for short-range forecasts but fail to maintain accuracy over longer horizons. The proposed plot is capable of reflecting this feature in a more intuitive way. In Figure 4, the yellow-colored line for the ARIMA method moves away from the test values as the time horizon proceeds.
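A minimal usage sketch, assuming the plot_circle() function mentioned in the package overview accepts the prediction_errors object directly, is:

plot_circle(e1)  # polar plot of the forecasted values for the methods in e1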

Monte-Carlo Strategy
The code chunk below shows a Monte-Carlo simulation on the nottem time series with the monte_carlo() function. This function is supplied with the above-mentioned prediction_errors object a1, with the size of the data patch in each iteration set to 144 and the number of iterations set to 10. The flag parameters fval and figs are set to 0 in order to avoid the display of forecasted values and comparison plots for all iterations of the Monte-Carlo simulation. The effect of these flags is further discussed in the vignette published with the ForecastTB package. The output shows the performance of the ARIMA, LPSF, and PSF methods on the nottem time series for 10 different time patches. The first column of the output shows the index in the time series from which each data patch is selected for the analysis. Besides, the last row shows the averaged error values for all methods, which ensures a more accurate and unbiased assessment of the performance of the forecasting methods.
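For reference, with the parameters as stated above, the Monte-Carlo run corresponds to the following call:

monte_carlo(object = a1, size = 144, iteration = 10, fval = 0, figs = 0)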
Furthermore, the following code chunk shows an object created to compare the performance of the forecasting methods on the sunspots time series; a more rigorous comparison is then done with Monte-Carlo simulations.
x1 <- prediction_errors(data = sunspots, nval = 24,
                        Method = c("test1(data, nval)", "test2(data, nval)"),
                        MethodName = c("LPSF","PSF"),
                        append_ = 1)
x1@output$Error_Parameters

It is interesting to note that, for the full-length sunspots time series, the PSF method was observed to be the best among the selected methods with the prediction_errors() function, where the last 24 values of the dataset were forecasted. In contrast, in the Monte-Carlo simulation, with rigorous iterations, ARIMA outperformed the PSF and LPSF methods with a significant error difference. In this way, the monte_carlo() function can be used in comparing forecasting methods while avoiding the biased results that can be obtained by chance.
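A sketch of the corresponding Monte-Carlo call follows; the size and iteration values here are hypothetical illustrations, as the settings used for sunspots are not stated:

# Hypothetical settings for illustration only
monte_carlo(object = x1, size = 480, iteration = 10, fval = 0, figs = 0)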

Case Study Related to the Energy Application
In the current research, it is important to validate the proposed R package ForecastTB with real-world applications. In this manner, a reliable and feasible application can demonstrate the capacity of the developed R package. Wind speed and solar radiation data of two meteorological stations (Robinson and Watford City) located in North Dakota, USA, were used to test the predictability performance of the modified pattern sequence-based forecast (LPSF) and pattern sequence-based forecast (PSF) models. Three time scales (daily, weekly, and monthly) were examined for forecasting 12 steps ahead on each scale. The modeling results were tested using several statistical performance metrics (RMSE, MAE, and MAPE), time convergence, and graphical presentation. The data span covers 20 years of historical records (1 January 2000 to 31 December 2019). Three models were developed for modeling verification: the support vector machine (SVM), the decomposed pattern sequence-based forecast (DPSF), and the hybridized ensemble empirical mode decomposition with autoregressive integrated moving average (EEMD-ARIMA).

Statistical Performance Metrics Results
Following several established studies from the literature on wind speed and solar radiation prediction, RMSE, MAE, and MAPE are the commonly used statistical metrics for forecasting evaluation [48]. These metrics can give an informative evaluation of the prediction capacity of the models at the Robinson station (Table 1). In comparison with the SVM, DPSF, and EEMD-ARIMA, the LPSF and PSF models revealed an acceptable prediction enhancement. For instance, wind speed prediction was improved using the LPSF model by 86.04%, 42.77%, and 19.22% based on the RMSE metric against the SVM, DPSF, and EEMD-ARIMA models, respectively. Similarly, solar radiation prediction accuracy was enhanced by 74.41%, 43.83%, and 16.08% using the LPSF model and the RMSE metric over the SVM, DPSF, and EEMD-ARIMA models, respectively.

Table 1. The statistical performance of forecasting wind speed and solar radiation at the Robinson station on the daily, weekly, and monthly time scales for the applied (LPSF and PSF) and comparable predictive models (i.e., SVM, DPSF, and EEMD-ARIMA).

For the Watford City station, the best forecasting results were achieved by the proposed LPSF and PSF models, with minimum values of RMSE, MAE, and MAPE (Table 2). The statistical results of the Watford City station demonstrated more or less similar predictability performance for wind speed and solar radiation in comparison with the benchmark models (i.e., SVM, DPSF, and EEMD-ARIMA).

Table 2. The statistical performance of forecasting wind speed and solar radiation at the Watford City station on the daily, weekly, and monthly time scales for the applied (LPSF and PSF) and comparable predictive models (i.e., SVM, DPSF, and EEMD-ARIMA).

The convergence execution time is one of the essential modeling components in the field of soft computing and modeling [49]. With respect to the convergence execution time of the developed models, the LPSF and PSF revealed a much faster learning process with an acceptable execution time.

The Graphical Presentation of the Developed Forecasting Models
Figure 5a-c report the performance metrics and execution time results for the daily, weekly, and monthly wind speed forecasting in the form of bar charts. For all investigated time scales, the LPSF and PSF models revealed a broadly similar performance to the DPSF and EEMD-ARIMA. However, the SVM model showed very poor predictability performance, which can be explained by the essential tuning procedure required for the internal parameters of the SVM model, which was not adopted or focused on in this research [50,51]. Figure 6a-c show the performance metrics and execution time results of solar radiation forecasting for the daily, weekly, and monthly scales at the Robinson station. It is clearly seen that on the monthly scale the solar radiation results of the LPSF, PSF, DPSF, and EEMD-ARIMA models were similar. However, the SVM model still reported the lowest capacity. Figure 7a-c display the forecasting capacity of the proposed and comparable forecasting models for wind speed modeling at the Watford City station. For the daily and weekly wind speed scales, the LPSF and PSF models exhibited a distinguished performance against the results of the SVM model. Yet, the monthly scale results of the LPSF and PSF models were slightly poorer, and similar to those attained by DPSF and EEMD-ARIMA. Figure 8a-c report the performance of solar radiation forecasting at the Watford City station for all modeled time series scales. For the daily scale, Figure 8a shows that the performance of the DPSF and EEMD-ARIMA models is superior to that of the LPSF and PSF models. This might be due to the high stochasticity of the solar radiation process on a daily scale, such that the proposed models could not capture the actual trend. On the weekly scale, the DPSF and EEMD-ARIMA models indicated the worst forecasting potential, even worse than the SVM model. In contrast, similar performance was attained by the LPSF, PSF, DPSF, and EEMD-ARIMA models on the monthly scale of solar radiation.

Discussion
This article proposed and demonstrated the ForecastTB package as a test-bench for comparing time series forecasting methods, a crucial step towards more formal time series analysis. This demonstration is further described with case studies that show how characteristics of the temporal correlation structure can influence forecast accuracy. The ForecastTB package greatly assists in comparing different forecasting methods while considering the characteristics of the time series dataset.
Also, this paper explains how new forecasting methods and error metrics can be introduced into the comparative test-bench. As such, the package allows users to include additional forecasting methods for comparison, which can be extremely useful given the capability of R to interface with other programming languages (e.g., matlabr for MATLAB [52]; Rcpp for compiled languages [53]; and many others). Finally, the simple plug-and-play module-based architecture of ForecastTB, which allows users to append or remove forecasting methods and error metrics, makes it a robust and handy tool for forecasting analysis.
It is worth mentioning that forecasting wind speed and solar radiation on different time scales is highly essential for harvesting multiple energy sources [54][55][56]. Providing accurate and reliable intelligent computer-aided forecasting models for wind speed and solar radiation can contribute to the knowledge base of environmentally friendly energy sources and sustainable green cities. To the best knowledge of the current research, the LPSF and PSF models are evidenced to be capable of modeling related energy datasets.

Funding: This study was funded by Apple Inc. as part of the APPLAUSE bio-energy collaboration with Aarhus University.