1. Introduction
In terms of production and exchange relations, economic forecasts at a certain historical level have existed for many centuries. Scottish scientist and genius Adam Smith was among the first to take them seriously at the level of economics as a science. He defined the main economic categories, subjects and relationships in the form that we basically use them today. This concerns economic forecasts, economic models and their use and interpretation [
1].
However, the need for economic forecasting was strongly shown by the great economic crisis of the 1930s, which again accelerated research in this area and reached a level that is still valid nowadays. After this catastrophe, economists, politicians and academic scientists focused on gaining an understanding of the real functioning of the economy and where the economy is heading. All this has led to the emergence, development and use of a wider range of statistical and analytical techniques, which have resulted in economic forecasts.
The principle of economic forecasting consists of a qualified effort to predict the future state of the economy using appropriately selected and reliably reported economic variables. They are one of the starting points for government decisions on investment, hiring, spending and other important policies that affect aggregate economic activity [
2]. A correct estimate of future economic developments is essential for politicians and government officials, as it helps them to determine how to set the basic fiscal and monetary bases and instruments. Economists working in public administration play a crucial role in policy-making and in setting expenditure and tax parameters. In the area of fiscal policy, these are most often economic forecasts in relation to the amount of tax allocation depending on the forecasted GDP growth and public expenditure. Economic forecasting is important for the creation of plans and budgets in the medium and long term. In contrast, in the area of monetary policy, the main goal of central banks is to maintain a high degree of price stability, and this is primarily influenced by the development of demand and costs. Coyle [
3] empirically compared the accuracy of model predictions with equilibrium correction mechanisms and models in differences (or growth rates) in a central bank inflation-targeting environment. As he noted, inflation forecasts for one or two years ahead are the primary source of information for monetary policy decisions in countries that use the inflation-targeting regime.
Thus, economic forecasts involve the creation of statistical models with the input of selected crucial variables, usually in an effort to focus on the future values for predicted economic indicators. These predicted economic indicators often include the inflation rate, interest rates, and the exchange rate as indicators for monetary policy, as well as industrial production, consumer confidence, labor productivity, retail sales and unemployment rate for the areas of fiscal and business policy.
The challenges and subjective aspects of people’s behavior in economic forecasts do not only affect the public sector. The private sector, commercial banks, investors, economists, academics, and also central banks have all issued economic forecasts that were partially or even wildly out of touch with reality [
4].
Time series models that use historical economic data are popular methods of economic forecasting, while practice has verified that they are meaningful and competitive, mainly due to econometric systems of equations, especially in their multivariate forms.
Economic forecasting and prediction are often mentioned as ambiguous sciences. Due to the limited sources of data and their credibility, as well as the variously chosen methods, economic forecasting has its limits, which must be considered in practice and in scientific work [
5]. Looking back, this is evident, for example, in assessing the reliability of crisis forecasting. The inability to record immediate declines reflects pressure on economists and forecasters to predict it sharply. Many of them prefer not to deviate from the mainstream because they keep in mind that bold projections could damage their reputation and the model’s future application [
6].
Managers, business owners, investors and others should also not neglect the subjective nature of results and outputs within economic forecasts. Forecasts are strongly affected by the type of economic theory that forecasters use. Projections can vary very widely. One economist can believe in monetary policy, meaning that GDP growth is primarily influenced by the supply of money, and a second can argue that huge or even deficit spending is harmful for the economy. Thus, some of the outputs and conclusions do not come from qualified and credible economic analysis. On the contrary, they are usually determined on the basis of personal motives and assumptions about the functioning of markets and market forces. This inevitably means that the results of particular policies will be assessed differently [
7].
The limits of the accuracy of economic forecasts also follow from the adoption of a certain economic theory as a basis for drawing conclusions. Economic theory can determine the background and basic features of a forecast, and economic knowledge and education can often play a key role. The educated economist should consider that certain conditions of time and place are specific and that the forecast provided by selected statistical methods may be adjusted while taking into account specific important circumstances. This is particularly important if a fundamentally uneconomic event also has significant economic consequences. If possible, the exact conclusion is not based purely on economic values; it is also necessary to consider uneconomic factors, including their future development. Predictions-based forecasts made only on the basis of subjective judgments cannot meet the strict requirements for the accuracy of outputs obtained by using objective techniques. As a result, the most reliable and the most trustworthy predictions are mostly based on fundamental and upgraded economic knowledge and standard statistical techniques.
The most common source of error in economic forecasting is probably still theoretical. Additionally, knowledge about the functioning of economic entities is relatively limited. Effective and accurate forecasting is also hampered by the fact that economic conditions, economic policies and institutions change over time.
Time series represent consecutive series of observations measured at equal time intervals, and they are one of the most important concepts used to analyze economic processes and to make forecasts.
While the classical theory of econometric time series is now well understood [8,9], techniques from the soft computing area have more recently been adopted for economic forecasts. Fuzzy sets came into use for analyzing and predicting time series because fuzzy models can approximate functions and because rules expressed with linguistic variables are easy for non-experts to understand. Fuzzy time series methods have the advantage that they can deal with uncertainty or imprecision [10,11,12,13,14]. Such uncertainty can arise even for economic indicators, which may be computed from data collected through probabilistic samples, estimated using statistical models when the data are not available, or periodically revised as new data sources become available. (The revision policy for the datasets used in our experiments is given by the European Statistics Code of Practice, adopted by the European Statistical System Committee in 2011, and the European Statistical System Guidelines on the revision policy of the Principal European Economic Indicators.)
Fuzzy time series were introduced by Song and Chissom [10], who developed the fuzzy time series model and applied it to enrollment data from the University of Alabama. The approach was later developed by Chen [11], who simplified the fuzzy time series prediction method by using only arithmetic operations instead of max-min composition operations. Chen [12] also proposed a forecasting method based on high-order fuzzy time series of various orders: second, third, fourth and fifth. His experiments showed that the forecasts generated using the third-order model had better accuracy than other existing models.
Singh [
13,
15] proposed a new and computationally simplified method of fuzzy time series forecasting based on difference parameters. He tested his method on the same data series regarding students’ enrollment at the University of Alabama and compared the forecasted values obtained with his method against the results obtained by other existing methods to argue in favor of his method. Singh introduced a difference parameter as the fuzzy relation for forecasting and developed a forecasting algorithm of linear complexity.
Huarng [16] proposed a new heuristic model for time series forecasting that uses heuristic increasing and decreasing relations to increase the forecast accuracy. He tested his method on the same enrollment data and on Taiwan Futures Exchange data.
Chen and Hsu [17] designed a first-order, time-variant forecasting method that starts from a division of the universe of discourse into intervals and, in a later stage, re-divides them into four, three, two or one sub-intervals depending on the statistical distribution of the data in each interval. The linguistic values represented by fuzzy sets are defined based on the intervals obtained in the second stage, and then the data are fuzzified and the logical relationships are established. Finally, the method uses a set of rules to determine whether the forecasted data have an increasing or decreasing trend.
Recently, Liu et al. [18] proposed an ultra-short-term prediction approach based on the Takagi–Sugeno (T–S) fuzzy model, which has a strong linearization ability because it describes complex nonlinear systems through a number of linear or nearly linear systems, and applied this model to wind power and wind speed forecasting. The authors compared their method with several machine learning algorithms and reported a higher forecasting accuracy for their method.
Compared with classical time series forecasting methods, specific features of fuzzy time series add certain advantages: they do not require long series, and they can work with both crisp and fuzzy values. An extended overview of the fuzzy time series forecasting models was presented in Bose and Mali [
14].
In this paper, using a set of time series with economic indicators, we compared the accuracy of some well-known econometric forecasting methods with that of two new methods based on the concept of fuzzy time series.
The next sections of the paper are organized as follows. In
Section 2, we describe the data that we used in our experiments, the methods and the software used. In
Section 3, we discuss the results that we obtained, and in the last section, we present the main conclusions and directions for future work.
2. Data and Methods
We used data series from the Eurostat database to test all the methods used in this study. Thus, we selected 3 monthly economic indicators that we considered important to assess the economic activity of a country from the Eurostat database with values from January 2017 to February 2020:
The volume index of production in construction (2015 = 100), seasonally and calendar-adjusted, for 20 European countries;
The volume index of production in the industry for electricity, gas, steam and air conditioning supply (2015 = 100), seasonally and calendar-adjusted, for 29 European countries;
The index of deflated turnover in retail trade, except for motor vehicles and motorcycles (2015 = 100), seasonally and calendar-adjusted, for 28 European countries.
From the available data series in the Eurostat database, we eliminated those countries with missing values for the time series, because we did not want the results to be influenced by an external factor such as the imputation method. The econometric methods used in our analysis are sensitive to missing values. In Figure 1, Figure 2 and Figure 3, we show a graphical representation of the datasets, while in Table 1, Table 2 and Table 3, we present descriptive statistics and the order of integration for each economic indicator selected and each country. We checked all the time series for seasonality using the Webel–Ollech overall seasonality test (WO test) [19], and, as expected for seasonally adjusted data, none has a seasonal component.
To test whether fuzzy time series models can be used as an alternative for time series forecasting and based on an overview study [
20] that indicates exponential smoothing, moving average and autoregressive methods as the most widely used econometric forecasting techniques, we used the simple exponential smoothing, Holt and the ARIMA methods, and then compared the accuracy of predictions with the results obtained by using fuzzy methods.
The exponential smoothing method has been developed and used for time series forecasting since the 1950s [21,22], but procedures for automatic model selection were designed only recently [23,24]. Exponential smoothing is a family of methods that was classified initially by Pegels [25], and then by Gardner [26] and Taylor [27]. In our study, we used the simple exponential smoothing method (sometimes called single exponential smoothing) [20] and the Holt method [28].
The simple exponential smoothing method (SEM) is usually applied for forecasting data that do not exhibit a clear trend or seasonal pattern. For a time series $y_t$, the one-step-ahead forecast for time t + 1 through the SEM method is:

$$\hat{y}_{t+1} = \alpha y_t + (1 - \alpha)\,\hat{y}_t,$$

which means that the one-step-ahead prediction is a weighted average of the most recent observation and the previous predicted value. We can write:

$$\hat{y}_{t+1} = \hat{y}_t + \alpha\,(y_t - \hat{y}_t),$$

or

$$\hat{y}_{t+1} = \hat{y}_t + \alpha\,e_t,$$

where $\hat{y}_{t+1}$ is the smoothed value of the data series carried out at time t for t + 1, $\hat{y}_t$ is the previous forecast, $e_t = y_t - \hat{y}_t$ is the forecasting error at time t, and 0 ≤ α ≤ 1 is the smoothing parameter. That is, the series forecasted value for time t + 1 is calculated as the forecasted value for time t, corrected with a proportion (α) of the forecast error. The smoothing parameter α is either exogenous or is calculated to minimize the in-sample sum of squared errors.
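The error-correction recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the experiments (those relied on the forecast R package); the function name `ses_forecasts`, the choice of initialising the first forecast with the first observation, and the sample values are all assumptions made for the example.

```python
def ses_forecasts(y, alpha, initial=None):
    """Return [yhat_1, ..., yhat_{n+1}] for simple exponential smoothing."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Assumption: without an explicit start value, yhat_1 is set to y_1.
    forecast = y[0] if initial is None else initial
    forecasts = [forecast]
    for observation in y:
        error = observation - forecast        # e_t = y_t - yhat_t
        forecast = forecast + alpha * error   # yhat_{t+1} = yhat_t + alpha * e_t
        forecasts.append(forecast)
    return forecasts


series = [10.0, 12.0, 11.0, 13.0]             # illustrative values only
print(ses_forecasts(series, alpha=0.5)[-1])   # forecast for the next period
```

With α = 1 the forecast collapses to the last observation, and with α = 0 it never leaves its starting value, which is a quick way to sanity-check any implementation of the recursion.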
Simple exponential smoothing is generally appropriate for series that are random walk with or without drift. The Holt exponential smoothing method is a better forecasting method for data that reveal a linear trend, and it can be described as follows:

$$\hat{y}_{t+h} = L_t + h\,b_t,$$
$$L_t = \alpha y_t + (1 - \alpha)(L_{t-1} + b_{t-1}),$$
$$b_t = \beta\,(L_t - L_{t-1}) + (1 - \beta)\,b_{t-1},$$

where $\hat{y}_{t+h}$ is the forecast carried out at time t for period t + h (h = 1, 2, …), $L_t$ and $L_{t-1}$ are the smoothed values of the level at time t and t − 1, respectively, $b_t$ and $b_{t-1}$ denote the smoothed values for the trend, and 0 ≤ α ≤ 1 and 0 ≤ β ≤ 1 are the smoothing parameters for the level (α) and the trend (β), respectively. The smoothing parameters are exogenous, or they are calculated by minimizing the sum of squared errors (SSE) (or, alternatively, the mean squared error (MSE) or the root mean square error (RMSE)).
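A minimal Python sketch of Holt's method with a level and a trend component follows. It assumes the standard level and trend recursions, initialises the level with the first observation and the trend with the first difference (one common convention, not necessarily the one used by the forecast R package), and uses the hypothetical name `holt_forecast`.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Holt's linear trend method: forecasts for h = 1..horizon."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    # Assumption: initial level = y_1, initial trend = y_2 - y_1.
    level, trend = y[0], y[1] - y[0]
    for observation in y[1:]:
        previous_level = level
        # Level: weighted average of the observation and the previous projection.
        level = alpha * observation + (1 - alpha) * (previous_level + trend)
        # Trend: weighted average of the latest level change and the old trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # The h-step-ahead forecast extends the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]


print(holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.5, beta=0.5, horizon=2))
```

On a perfectly linear series the level tracks the data and the trend stays equal to the slope, so the forecasts simply continue the line.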
Autoregressive integrated moving average (ARIMA) models [29,30] are widely used methods in the analysis and prediction of time series. In addition to exponential smoothing methods, ARIMA models aim to identify the stationarity and the autocorrelations in the data. The models are usually denoted as ARIMA(p, d, q), using p as the order (number of time lags) of the autoregressive part of the model, d to denote the degree of differencing and q for the order of the moving-average part. For a time series $y_t$ (possibly non-stationary in the sense of mean, but not non-stationary in variance/autocovariance), the ARIMA(p, d, q) model can be written as:

$$\left(1 - \sum_{i=1}^{p} \phi_i L^i\right)(1 - L)^d\, y_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right)\varepsilon_t,$$

where L is the lag (backshift) operator (so that $L y_t = y_{t-1}$), $\phi_i$ are the p coefficients in the autoregressive part of the model, $\theta_i$ are the q coefficients in the moving average part of the model, and $\varepsilon_t$ is white noise. In polynomial notation, the ARIMA model can be written as:

$$\Phi(L)\,(1 - L)^d\, y_t = \Theta(L)\,\varepsilon_t,$$

where Φ(L) is a polynomial function of order p (the autoregressive part of the model), Θ(L) is a polynomial function of order q (the moving average part), and d is the order of integration (degree of differencing involved). A detailed discussion about ARIMA models can be found in Hamilton [9].
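As a worked illustration of the lag-operator notation, consider the special case p = d = q = 1. The expansion below assumes the sign convention in which the autoregressive polynomial is written as 1 − φ₁L and the moving-average polynomial as 1 + θ₁L; other texts use the opposite sign for the moving-average coefficient. The last line gives the one-step-ahead forecast, obtained by replacing the future shock with its expected value of zero.

```latex
\begin{aligned}
(1 - \phi_1 L)(1 - L)\, y_t &= (1 + \theta_1 L)\, \varepsilon_t \\
y_t - (1 + \phi_1)\, y_{t-1} + \phi_1\, y_{t-2} &= \varepsilon_t + \theta_1\, \varepsilon_{t-1} \\
y_t &= (1 + \phi_1)\, y_{t-1} - \phi_1\, y_{t-2} + \varepsilon_t + \theta_1\, \varepsilon_{t-1} \\
\hat{y}_{t+1} &= (1 + \phi_1)\, y_t - \phi_1\, y_{t-1} + \theta_1\, \varepsilon_t
\end{aligned}
```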
Besides classical methods of time series analysis, new, unconventional approaches have been developed based on the theory of fuzzy sets [31]. Among the most well-known time series analysis and forecasting methods based on fuzzy set theory, we can mention the Song and Chissom [10], Chen [11,12], Singh [13,15], heuristic [16], Chen–Hsu [17], Abbasov–Mamedova [32], and NFTS [33] methods. The first five methods cited above are used to fit/forecast time series even when historical values change over time. This could occur when the values of the time series are updated or changed for different reasons, for example when new data are available to compute them. The last two methods allow users to make predictions for various time horizons.
Below, we present a very short introduction to the fuzzy time series concept. For more details, an interested reader can find a comprehensive review of fuzzy time series forecasting models in Bose and Mali [
14].
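Assuming that:

$$U = \{u_1, u_2, \ldots, u_n\}$$

is the universe of discourse, where $u_i$, i = 1 … n, are the linguistic values, we can define the fuzzy set of linguistic variables $A_i$ of U as:

$$A_i = \frac{\mu_{A_i}(u_1)}{u_1} + \frac{\mu_{A_i}(u_2)}{u_2} + \cdots + \frac{\mu_{A_i}(u_n)}{u_n},$$

where $\mu_{A_i}$ is the membership function of the fuzzy set $A_i$, $\mu_{A_i}: U \to [0, 1]$, and $\mu_{A_i}(u_j)$ represents the degree of belonging of $u_j$ to the fuzzy set $A_i$.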
Let us assume that $Z_t$ (t = …, 0, 1, 2, …) is a subset of $\mathbb{R}$ and it is also the universe of discourse of the fuzzy sets $f_i(t)$ (i = 1, 2, …). If Y(t) is the collection of $f_1(t), f_2(t), \ldots$, then Y(t) is a fuzzy time series defined over $Z_t$. If Y(t) is caused only by Y(t − 1), which we will denote by

$$Y(t-1) \to Y(t),$$

then a fuzzy relationship is established between Y(t) and Y(t − 1), and it can be written as the following fuzzy relational equation:

$$Y(t) = Y(t-1) \circ R(t, t-1),$$

where $\circ$ is the max-min composition operator and R is a relation named the first-order model of Y(t). If $R(t, t-1)$ is independent of time, that is, $R(t, t-1) = R(t-1, t-2)$ for any t, then Y(t) is called a time-invariant fuzzy time series.
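To make the max-min composition concrete, the following Python sketch computes Y(t) from the membership degrees of Y(t − 1) and a relation matrix R. The function name `max_min_compose` and the numerical values are assumptions chosen only for illustration; they are not taken from the datasets of this study.

```python
def max_min_compose(y_prev, relation):
    """Max-min composition Y(t) = Y(t-1) o R.

    y_prev   : membership degrees of Y(t-1) in the fuzzy sets A_1..A_n
    relation : n x m matrix R, relation[i][j] = strength of A_i -> A_j
    """
    n_rows, n_cols = len(relation), len(relation[0])
    if len(y_prev) != n_rows:
        raise ValueError("y_prev length must equal the number of rows of R")
    # For each output set A_j: take the min along each path, then the max over paths.
    return [
        max(min(y_prev[i], relation[i][j]) for i in range(n_rows))
        for j in range(n_cols)
    ]


y_prev = [1.0, 0.5, 0.0]          # illustrative membership degrees
R = [
    [0.2, 1.0, 0.5],
    [0.0, 0.5, 1.0],
    [1.0, 0.0, 0.3],
]
print(max_min_compose(y_prev, R))
```

The composition has the same shape as a matrix-vector product, with multiplication replaced by the minimum and summation replaced by the maximum.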
If Y(t) is determined by several fuzzy sets $Y(t-n), \ldots, Y(t-2), Y(t-1)$, then the fuzzy relationship is denoted as:

$$Y(t-n), \ldots, Y(t-2), Y(t-1) \to Y(t).$$

This is called the nth-order fuzzy time series model.
In cases when Y(t) is simultaneously determined by $Y(t-1), Y(t-2), \ldots, Y(t-m)$ (m > 0) and the relations are time variant, then Y(t) is called a time-variant fuzzy time series and we have:

$$Y(t) = Y(t-1) \circ R^w(t, t-1),$$

where w > 1 is a time parameter by which Y(t) is affected, and there are several methods to compute the relation $R^w(t, t-1)$, ranging from the max-min composition operations to more simplified rules.
The foundation of fuzzy time series forecasting was set by Song and Chissom [10] and later developed by various researchers. Thus, the Chen method [11] is basically an adapted version of the method presented in [10], but it is more efficient because it simplifies the arithmetic operations. Chen [11] provided a detailed description of the fitting procedure and showed that it can make robust predictions even in cases when the historical data are not accurate.
According to Chen, fuzzy time series analysis and forecasting consist of the following processing stages:
Set the universe of discourse
Divide the universe of discourse into several intervals
Fuzzify the datasets
Set the fuzzy logical relationships and group them
Defuzzification.
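The five stages listed above can be sketched in Python as follows. This is a simplified, first-order illustration under stated assumptions rather than a faithful reproduction of Chen's procedure: each value is assigned to the single interval that contains it, the forecast is the mean midpoint of the intervals in the matching relationship group, and the name `chen_forecast`, the default of seven intervals and the margins `d1` and `d2` are choices made for the example.

```python
from collections import defaultdict


def chen_forecast(y, n_intervals=7, d1=0.0, d2=0.0):
    """One-step-ahead forecast with a simplified Chen-style fuzzy time series."""
    # Stage 1: universe of discourse U = [min - d1, max + d2].
    lower, upper = min(y) - d1, max(y) + d2
    # Stage 2: equal-length intervals u_1..u_k and their midpoints.
    width = (upper - lower) / n_intervals
    if width == 0:
        return float(y[-1])  # constant series: nothing to partition
    midpoints = [lower + (i + 0.5) * width for i in range(n_intervals)]

    # Stage 3: fuzzify, i.e. map each value to the index of its interval.
    def fuzzify(value):
        return min(int((value - lower) / width), n_intervals - 1)

    states = [fuzzify(value) for value in y]
    # Stage 4: fuzzy logical relationship groups A_i -> {A_j, ...}.
    groups = defaultdict(set)
    for current, following in zip(states, states[1:]):
        groups[current].add(following)
    # Stage 5: defuzzify as the mean midpoint of the consequent intervals.
    last = states[-1]
    consequents = sorted(groups.get(last, {last}))
    return sum(midpoints[j] for j in consequents) / len(consequents)


# Illustrative series that alternates between a low and a high state.
print(chen_forecast([0.0, 10.0, 0.0, 10.0, 0.0], n_intervals=2))
```

When the last observed state has never been followed by another state in the training data, the sketch falls back to the midpoint of that state's own interval.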
Huarng [
16] proposed heuristic forecasting models through the integration of problem-specific heuristic knowledge with Chen’s model to obtain predictions that are more accurate. The empirical studies conducted by the author showed that the heuristic approach better reflects the fluctuations in fuzzy time series and obtained better overall forecasting results compared with previously developed methods.
The Chen–Hsu forecasting method [
17] belongs to the first order and time-variant class of methods, and improves Chen’s method, obtaining a better prediction accuracy measured with MSE.
Singh [13,15] developed a forecasting method based on fuzzy time series that can efficiently cope with cases when there are large fluctuations in consecutive values of the time series. Singh tested the method on time series from crop production and compared its accuracy with other existing methods, showing the superiority of his approach. His method can be summarized as:
Define U as the universe of discourse starting from the range of time series data, setting $U = [D_{min} - D_1, D_{max} + D_2]$, where $D_{min}$ and $D_{max}$ are the minimum and maximum of the data and $D_1$ and $D_2$ are two convenient positive numbers.
Divide U into equal-length intervals $u_1, u_2, \ldots, u_k$. The number of intervals should be equal to the number of linguistic variables $A_1, A_2, \ldots, A_k$.
Build the fuzzy sets $A_i$ according to the previously set intervals and apply the triangular membership rule to each one.
Fuzzify the available data and set the fuzzy logical relations $A_i \to A_j$, where $A_i$ corresponds to the fuzzified value at time step n and $A_j$ to the time step n + 1.
Define the rules for forecasting. An interested reader is invited to consult [8,13] for a detailed description of the rules.
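The first four steps (universe, intervals, triangular fuzzy sets and logical relations) can be illustrated with the short Python sketch below; the forecasting rules of the last step are not reproduced. The exact shape of the triangles is an assumption made for the example: each fuzzy set peaks at the midpoint of its interval and vanishes at the midpoints of the neighbouring intervals. The names `triangular` and `fuzzify_series` are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify_series(y, k, d1=0.0, d2=0.0):
    """Return the fuzzy labels of y and the relations (A_i, A_j) between steps."""
    lower, upper = min(y) - d1, max(y) + d2
    width = (upper - lower) / k
    if width == 0:
        raise ValueError("the universe of discourse has zero length")
    midpoints = [lower + (i + 0.5) * width for i in range(k)]
    labels = []
    for value in y:
        # Assumed shape: A_i peaks at the midpoint of u_i, one width on each side.
        degrees = [triangular(value, m - width, m, m + width) for m in midpoints]
        # Each value is labelled with the set of highest membership degree.
        labels.append(max(range(k), key=lambda i: degrees[i]))
    relations = list(zip(labels, labels[1:]))
    return labels, relations


labels, relations = fuzzify_series([0.0, 10.0, 0.0, 10.0], k=2)
print(labels, relations)
```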
For our forecasting experiments, we chose the Abbasov–Mamedova and NFTS methods because they easily allow us to make predictions for a future time horizon, and they have efficient implementations.
Starting from a population forecasting problem, Abbasov and Mamedova [
32] proposed a new method of time series forecasting that can even use incomplete, fuzzy input data. They successfully applied their procedure to demographic data, showing the effectiveness of the proposal. Their proposal can be described as follows:
Define the universe of discourse (U) starting from the range of time series data;
Divide the set U into equal-length intervals;
Determine the respective values of the linguistic variable;
Fuzzify the input data;
Select the parameter w > 1, and compute the fuzzy relationships matrix;
Defuzzify the obtained results.
In the first step, the universal set U is set as $U = [D_{min} - D_1, D_{max} + D_2]$, where $D_{min}$ and $D_{max}$ bound the range of the data and $D_1$ and $D_2$ are two positive numbers, and then this set is divided into several equal intervals. The default number of intervals is 7, and the fuzzy set
is defined on the universal set U by using the following equation:
, where
is the corresponding interval. After selecting a value for the w parameter, the
relationship matrix is computed, and then the forecasted value can be calculated. For a detailed description of this method, an interested reader can consult [
32].
The NFTS [33] method can interpolate historical data to make predictions. It normalizes the original data, automatically computes the number of clusters and finds the fuzzy relationship of each element in the series to the clusters. The accuracy of its predictions was compared with other existing models using several well-known datasets, showing the advantages of the proposal. The novelty of this method starts with the procedure for computing the number of clusters for the universal set U. While other methods set this number to a constant, the NFTS method proposes an automatic iterative algorithm to determine the suitable number of clusters (SNC). Then, an automatic iterative procedure determines the elements in each cluster and the fuzzy relation of each element to the cluster. The authors demonstrated theoretically, and tested experimentally, the convergence of the proposed iterative procedures for all time series.
We used existing implementations of the above-mentioned forecasting methods, namely the AnalyzeTS R package [34] for the Abbasov–Mamedova and NFTS fuzzy methods.
We employed a standard technique and divided each time series used for experimentation into two parts: a first part used for model fitting (training) and a second part for testing and measuring out-of-sample prediction errors. The second part contains 2, 4 or 6 values for each time series and method tested, thus measuring the performance of the forecasting methods for 2, 4 and 6 steps ahead. The accuracy of the forecasting models was evaluated using the mean absolute error (MAE), which reflects the amplitude of errors; the mean absolute percent error (MAPE), which is equivalent to a standardized MAE and reflects the relative size of the errors; the root mean square error (RMSE), which reflects the closeness of the error distribution; and the mean absolute scaled error (MASE), a measure proposed in Hyndman and Koehler [35]. The formulas for computing these accuracy measures are given in Equations (13)–(16). We mention that MAPE is particularly important for time series with strong trends. MAE and RMSE are considered scale-dependent errors, MAPE is a percentage accuracy measure and is consequently unit free, and MASE is a scaled accuracy measure, considered a good alternative to percentage errors. MAE is very popular, being easy to understand and compute, and minimizing MAE will lead to forecasts of the median, while minimizing the value of the RMSE will lead to forecasts of the mean. The RMSE is also extensively used in forecasting applications, although it implies computations that are more complex.
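$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} \left| y_t - \hat{y}_t \right| \tag{13}$$
$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \tag{14}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2} \tag{15}$$
$$\mathrm{MASE} = \frac{\mathrm{MAE}}{\mathrm{MAE}_{naive}} \tag{16}$$

where $\hat{y}_t$ denotes the forecasted value at the time instant t, $y_t$ denotes the actual value, and n denotes the prediction time horizon. In the case of MASE, the denominator $\mathrm{MAE}_{naive}$ is the MAE of the one-step naive forecast method on the training set.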
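The four accuracy measures and the hold-out split can be sketched in plain Python as below. The function names, the toy series and the "predicted" values are assumptions for illustration only; they do not come from the experiments reported here, where the computations were done in R.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percent error (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mase(actual, predicted, training):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    naive_errors = [abs(b - a) for a, b in zip(training, training[1:])]
    naive_mae = sum(naive_errors) / len(naive_errors)
    return mae(actual, predicted) / naive_mae


# Hold out the last h values for testing; the rest is used for fitting.
series = [1.0, 2.0, 4.0, 7.0, 10.0, 12.0]     # illustrative values only
h = 2
training, test = series[:-h], series[-h:]
predicted = [9.0, 14.0]                       # hypothetical h-step forecasts
print(mae(test, predicted), mape(test, predicted),
      rmse(test, predicted), mase(test, predicted, training))
```

A MASE below 1 means the forecasts are, on average, more accurate than the naive one-step forecast was on the training data.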
All the calculations were performed using the R software system [36], and the script used to process the data, together with the datasets, is freely available [37]. The number of partitions of the universe of discourse was set to 6 for both fuzzy methods. This value was obtained by using the empirical formula given by Efendi et al. [38]:
where T is the forecast horizon length. Forecasting was done using the fuzzy.ts2() function from the AnalyzeTS R package, which implements the Abbasov–Mamedova and NFTS forecasting methods.
The implementations of the econometric time series forecasting methods used in our experiments (simple exponential smoothing, Holt and ARIMA) were those provided by the forecast R package [39,40], which also has an automatic procedure for setting their optimal parameters.
3. Results
The time series processing was performed by using the R script prelTS.R, which is freely available [37]. We first fitted all models using the training datasets and then computed the two-, four- and six-point-ahead predictions for each time series using the simple exponential smoothing, Holt, ARIMA, Abbasov–Mamedova and NFTS methods. Then, we computed the out-of-sample prediction accuracy measures MAE, MAPE, RMSE and MASE using the test datasets.
The AnalyzeTS R package provides a function to compute the optimal value of the C parameter [34] for the fuzzy.ts2() method, and we called this function for all time series and used the corresponding value for fitting and forecasting. The rest of the parameters of the fuzzy.ts2() function were left at their default values.
We present the values of the accuracy measures in the
Supplementary Materials, Tables S1–S9. Then, we counted how many times the corresponding accuracy measure is lower for a fuzzy time series forecasting method than for a classical one. The results are presented in
Table 4, which summarizes all 77 time series under consideration and all 3 prediction horizons.
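The counting step behind such a summary can be sketched as follows. The sketch assumes that the accuracy measure of every method has been stored per time series in a dictionary; the function name `share_fuzzy_better`, the method labels and the error values are made up for illustration and are not results from this study.

```python
def share_fuzzy_better(errors, fuzzy_method, classical_method):
    """Percentage of series for which the fuzzy method has the lower error."""
    wins = sum(1 for e in errors if e[fuzzy_method] < e[classical_method])
    return 100.0 * wins / len(errors)


# One dictionary per time series; the numbers are illustrative only.
rmse_by_series = [
    {"ARIMA": 1.2, "NFTS": 1.0},
    {"ARIMA": 0.8, "NFTS": 1.5},
    {"ARIMA": 2.0, "NFTS": 2.4},
    {"ARIMA": 1.1, "NFTS": 1.3},
]
print(share_fuzzy_better(rmse_by_series, "NFTS", "ARIMA"))
```

Repeating the call for each pair of fuzzy and classical methods, each accuracy measure and each horizon gives one cell of the summary per combination.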
We observed that all four accuracy indicators showed approximately the same results. Considering the RMSE, on average, in 37% of the cases for h = 2, 47% of the cases for h = 4 and 39% of the cases for h = 6, fuzzy forecasting methods had better results than the econometric ones for the volume index of production in construction. For the volume index of production in industry, this percentage was 52% for h = 2, 45% for h = 4 and 25% for h = 6, while for the index of deflated turnover in retail trade, it was 25% for h = 2, 34% for h = 4, and 28% for h = 6.
We also observed that, on average, increasing the prediction horizon from two to four periods improved the results of both fuzzy methods, while going to a prediction horizon of six periods resulted in a drop in accuracy. This drop is expected, since the length of the forecasting horizon also negatively influences the accuracy of the forecasts for several other methods [41].
To support this statement, we computed the average value of the accuracy measures for the volume index of production in construction for both fuzzy time series methods and plotted them against the prediction horizon in Figure 4 and Figure 5.
Although not superior to the classical econometric time series methods, the fuzzy time series forecasting methods performed sufficiently well that, with further improvements and parameter tuning, they could be considered as alternatives to other forecasting methods. We could not detect any relationship between the stationarity/non-stationarity of the time series and the accuracy of the predictions.
4. Conclusions and Future Work
Time series forecasting is very important in decision-making in several economic domains, such as finance, macroeconomics, agriculture and tourism, as well as in other scientific areas such as the global climate or earthquake prediction. One of the biggest challenges for researchers in the field of time series forecasting is the selection of the best method to compute predictions.
Classical econometric techniques for time series analysis and forecasting are well understood now, but new methods coming from the soft computing area have emerged in this field. In this paper, we investigated whether fuzzy time series forecasting methods can be used for economic indicators. We compared three econometric time series methods, namely the simple exponential smoothing, Holt, and ARIMA with two fuzzy time series forecasting methods, Abbasov–Mamedova and NFTS, using 77 time series extracted from the Eurostat database. We ran the five above mentioned methods for two, four and six points ahead predictions and computed several accuracy indicators: root of mean square error, mean absolute error, mean absolute percent error, and mean absolute scaled error. Our experimental results show that fuzzy time series forecasting methods, while not yet superior in terms of prediction accuracy to econometric methods, could be considered, with further improvements, potential alternatives to the econometric techniques, especially for relatively short time series. We also noticed that the forecast horizon influences the prediction accuracy in the case of the fuzzy time series methods; an increase from 4 to 6 months in the forecast horizon leads to an important decrease in the accuracy. The accuracy indicators used in this study showed approximately the same results, and thus any of them could be used in assessing the prediction performance. During our experiments, we noticed that the prediction accuracy is greatly influenced by the parameters of the two fuzzy time series methods. We used only the optimal value of the C parameter provided by the AnalyzeTS R package, with the other parameters being set to their default value. Based on our results, it becomes clear that a grid search procedure to detect the optimal value of each parameter is necessary. The grid search procedure is computationally expensive, so a parallelization of the computations is also needed.
Other soft computing methods have been successfully used for time series predictions, such as recurrent neural networks, long short-term memory networks or SVM-based methods [42]. Together with the fuzzy methods used in this study, they can be considered good alternatives to the classical econometric time series methods when the latter fail to provide good predictions. As future work, we intend to further investigate how the accuracy of the predictions changes with the length of the time series, the integration order and other features of the time series, as well as to test other methods based on fuzzy set theory. We also intend to widen the spectrum of datasets used for experimentation, including time series with different features. To optimize the performance of fuzzy methods, we intend to develop an efficient automatic procedure to select the optimal values for all parameters of the fuzzy time series forecasting methods. Other studies show a better prediction accuracy for forecasting methods coming from the soft computing area than for the econometric methods for specific time series [43]. By widening the spectrum of the time series investigated, we intend to detect patterns for which fuzzy methods are better than classical econometric methods in terms of prediction accuracy.