EBITDA Index Prediction Using Exponential Smoothing and ARIMA Model

Abstract: Forecasting has become essential in different economic sectors for decision making in local and regional policies. The aim of this paper is therefore to use and compare the performance of two linear models in predicting future values of a measure of real profit for a group of companies in the fashion sector, as a financial strategy to determine the economic behavior of this industry. For forecasting purposes, Exponential Smoothing (ES) and autoregressive integrated moving average (ARIMA) models were applied to yearly data. Both are widely used statistical methods for time series forecasting. Accuracy metrics were used to select the best-performing model and the ES parameters. EBITDA (Earnings Before Interest, Taxes, Depreciation, and Amortization), calculated using multiple SQL queries, was used as the measure of real profit for the financial performance of the fashion sector in Colombia.


Introduction
Prediction through information platforms has been one of the advantages of Industry 4.0, facilitating the assurance of information for public and private organizations. It is an essential task in analysis and decision making in the face of multiple and complex alternatives, which requires an understanding of the behavior of economic sectors. Today, uncertainty has permeated all economic sectors, forcing companies to analyze and improve financial results by building predictive models based on historical data [1][2][3][4].
For many years, associations and institutions have needed to promote the economic development of companies and their sectors by adopting models that identify financial performance patterns and clarify the future impact of decisions made over different time periods, in order to formulate policies and programs at regional and local levels [5][6][7][8][9]. To this end, several works have been developed to project future economic behavior based on annual data records. According to Bova and Klyviene (2020), Alkaraan (2020), Dania, Xing and Amer (2020), Chen and Chen (2019), Hernandez de Cos, Ramos and Jimeno (2018), among others, such projection models have many limitations due to the small number of observations, conventional techniques or short time series, which do not allow long-term behavior to be predicted [9][10][11][12][13].
However, these approaches do not consider the multiple inputs of the system, which limits their practical use: variation in unstudied variables makes the models impractical and prevents the strategies and programs developed from obtaining the expected results. Therefore, as a step in the development of predictive multi-input models, this paper presents a comparative study of two linear models, Exponential Smoothing and ARIMA. Exponential Smoothing algorithms evaluate EBITDA performance by adjusting smoothing constants to reduce the error of overestimation or underestimation, while the ARIMA model coefficients are calculated using the maximum likelihood method devised by Box and Jenkins [14]. In this paper, the ARIMA model was chosen since it makes use of both autoregressive (AR) and moving average (MA) components to model changes over time and capture smoothed trends in the data. The (I) of ARIMA determines the level of differencing to be used, which helps make the data stationary. Exponential Smoothing is a time series forecasting method for univariate data that can be extended to support data with a systematic trend or seasonal component. It is a powerful forecasting model that can be used as an alternative to the popular Box-Jenkins ARIMA family of methods.
Exponential smoothing and ARIMA models have been widely used for time series forecasting showing their robustness, especially for short-term forecasting with a low volume of historical data. Different works have made use of them for forecasting in different fields. Williams et al. [15] used these two models to predict traffic flow on urban roads, where time series with cycles are present and these models are very effective, finding that ARIMA produced the best performance. Kotillová [16] uses them to predict electricity demand in Australia, finding that the exponential smoothing provides better prediction results as an industry model, stating that the quality of the model used depends on the historical data. Köppelová and Jindrová [17] used these methods to predict telecommunications indicators, concluding that exponential smoothing performs better in terms of error using the metrics MAPE and AIC, and their combination provides improved results. Following these trends, this paper proposes the use of these two models to predict the EBITDA index for the fashion sector in Colombia.
Information from the Colombian fashion industry was used to evaluate its behavior, using the EBITDA (Earnings Before Interest Taxes Depreciation and Amortization) index, a financial indicator widely used to measure the real profits of companies, since it takes the profits of the operation before interest, taxes, amortization and depreciation [18,19]. In addition, this indicator is used in different financial models and stock market indices, as a basic parameter to establish the gross operating profit before deducting the company's financial expenses, and for large mergers and acquisitions. It is also used to calculate the economic value multiple EV/EBITDA [20]. Therefore, EBITDA serves as a measure to compare the financial efficiency of companies, regardless of the industry type. It is generally calculated by adding to operating profit items such as depreciation, amortization and taxes, which are not actual outlays (cash outflows) but form part of the structure of the annual financial report, measuring the level of real monetary profit made by these companies and sectors of the economy [21,22].
Companies present their financial statements, from which EBITDA is calculated, using parameterized tools such as Enterprise Resource Planning-ERP and Business Intelligence-BI, which help managers to make decisions [5][6][7][8][9][10]. However, to make the right decisions at the strategic level of the companies and to understand the sectors of the economy, it is necessary to build predictive models from the corporate financial information, which allow validating the performance obtained and the growth or decrease that the sector may have over time. Although there are numerous publications in the literature on the use of qualitative and quantitative models in the fashion industry, we only found 11 in which exponential smoothing and ARIMA models were used with data obtained from the areas of marketing, production and finance [23][24][25][26][27][28][29][30][31][32], but none of them used the EBITDA index to predict the real profit of the fashion industry, despite its great relevance for the valuation of economic sectors. This importance lies in the fact that it measures the company's capacity to generate profits through its production process, eliminating from its calculation items that, if considered, would reduce the real profits obtained by the company through these operations. Therefore, this index is useful for companies, organizations, sectors or countries, given that there is a family of valuation multiples that take it into account for its calculation.
For this reason, this research uses the forecast of EBITDA as an indicator to measure the real profit and financial efficiency in the fashion industry sector.
The EBITDA index for the fashion industry sector was studied, according to the financial performance recorded by these companies between 1995 and 2019. For its calculation, historical data reported by the Superintendencia de Sociedades de Colombia (SIS) were used. The exponential smoothing models SES and DES, and ARIMA were implemented to predict the EBITDA index of the fashion sector, finding the best prediction among the smoothing models was obtained with the DES model. However, the ARIMA model provided a superior performance in terms of error for predicting the EBITDA index.

Technical Specifications
For the development of this work, a database was constructed from the information provided by the SIS, corresponding to the period 1995-2019, the longest one available. Only information from the garment sector was used for this analysis. The data, obtained in Excel format, were imported into a PostgreSQL database. The application was developed on a computer with 16 GB of RAM, an 8th Gen. Intel Core i7 processor and a GeForce GTX 1070 with Max-Q Design (8 GB GDDR5), running 64-bit Microsoft Windows 10. The method was implemented in the R language, using the PostgreSQL library for database management.
Financial performance data are supported by the annual financial statements that Colombian companies report to the SIS. Inconsistencies were found in the financial data within the Excel files used to register and store financial balances. Parameters that measure the performance of each sector strongly influence the formulation of national economic development policies, limiting what can be achieved, and it is necessary to analyze sector behavior and variation to determine their effectiveness. To achieve accuracy, the financial data for operating profit, depreciation, amortization and taxes were normalized using a BI framework implemented in R and PostgreSQL, based on the structure shown in Figure 1. The BI framework consisted of a set of layers, each dealing with a specific task and organizing data, storage and queries for index calculation. The EBITDA model required calculating operating profit, depreciation and amortization, and each of these variables is represented by a column name in the Excel files. The main problem is that each column name changes every year, so a first layer that deals with standardization and organization of the data is crucial in this model.
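The standardization layer described above can be illustrated with a minimal Python sketch. It is not the authors' R and PostgreSQL implementation: the alias table, the function names and the sample records are hypothetical, since the actual SIS column names are not given in the text, and EBITDA is assumed here to be operating profit plus depreciation plus amortization.

```python
# Hypothetical alias table: the real year-specific SIS column names are not
# listed in the paper, so these entries are placeholders for illustration.
COLUMN_ALIASES = {
    "operating_profit": {"utilidad operacional", "operating profit"},
    "depreciation": {"depreciacion", "depreciation"},
    "amortization": {"amortizacion", "amortization"},
}


def standardize_row(raw_row):
    """Map a record with year-specific column names to canonical names."""
    clean = {}
    for raw_name, value in raw_row.items():
        key = raw_name.strip().lower()
        for canonical, aliases in COLUMN_ALIASES.items():
            if key in aliases:
                clean[canonical] = float(value)
    missing = set(COLUMN_ALIASES) - set(clean)
    if missing:
        raise KeyError(f"columns not found: {sorted(missing)}")
    return clean


def ebitda(row):
    """Assumed definition: operating profit + depreciation + amortization."""
    return row["operating_profit"] + row["depreciation"] + row["amortization"]


# Two made-up records whose headers differ, as they might from year to year.
record_a = {"Utilidad Operacional": 100, "Depreciacion": 20, "Amortizacion": 5}
record_b = {"OPERATING PROFIT ": "80", "depreciation": "10", "amortization": "2.5"}
for record in (record_a, record_b):
    print(ebitda(standardize_row(record)))
```

Keeping the aliases in a single table means that a new yearly file only requires new alias entries, not changes to the index calculation.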

Predictive Analytics
To analyze companies' behavior in the Colombian fashion industry, two predictive models were used for time series forecasting, using as input the EBITDA index calculated from historical known data. The strategy followed for data collection, SQL queries and development of the predictive analytic model is illustrated in the flow chart shown in Figure 1. EBITDA, or earnings before interest, taxes, depreciation and amortization, is one of many ways of valuing a company, as well as specific loans on a company's assets, and is a widely used metric for corporate performance.
Since this measure shows the real profit of a company, it was selected to study the real profit behavior of the Colombian fashion sector.

Model Selection
For predictive modelling purposes Exponential Smoothing and ARIMA models were used. Forecasting accuracy results are presented for each of these models and the most accurate one is proposed.

Simple Exponential Smoothing
Simple Exponential Smoothing (SES) can provide different smoothers for the EBITDA index depending on the choice of the discount factor λ, where λ ∈ [0, 1]. The underlying model can be represented as

$$y_t = f(t; \beta) + \varepsilon_t,$$

where β is the vector of unknown parameters and ε_t represents the uncorrelated errors, which have zero mean and constant variance. SES is a member of this broad model class, obtained by considering the constant process $y_t = \beta_0 + \varepsilon_t$. To find the least squares estimate of β_0, a sum of squared errors with geometrically decreasing weights in time, called $SS_E$, was minimized by taking its derivative with respect to β_0 and considering |θ| < 1 [37].
SES can be written in recursive form as

$$\tilde{y}_T = \lambda y_T + (1 - \lambda)\tilde{y}_{T-1}.$$

Applying this recursion repeatedly, starting with $\tilde{y}_1$, $\tilde{y}_T$ can be written as

$$\tilde{y}_T = \lambda \sum_{t=0}^{T-1} (1 - \lambda)^t y_{T-t} + (1 - \lambda)^T \tilde{y}_0.$$

It can be noted that $(1 - \lambda)^T \to 0$ when $T \to \infty$, so the contribution of $\tilde{y}_0$ to $\tilde{y}_T$ is almost null. In this work, the common initial value $\tilde{y}_0 = y_1$ is used, or the mean value estimator $\tilde{y}_0 = \bar{y}$ when the time series is constant at the beginning. Under the independence and constant variance assumptions, using variance properties, it can be shown that $\mathrm{Var}(\tilde{y}_T) = \lambda \mathrm{Var}(y_T)/(2 - \lambda)$. For λ = 1 an unsmoothed version of the time series is obtained, and the right λ can be chosen based on forecasting errors and confidence intervals [33][34][35][36].
In this work, λ = 0.4, 0.6, 0.8 in the SES implementation was employed. Figure 2 shows different approximations by the SES using the λ values selected. As can be seen, the closer the λ is to 1, the closer is the SES to the original time series as expected, reducing, in this way, its underestimation bias.
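As a minimal sketch of the SES recursion, smoothed_T = λ·y_T + (1 − λ)·smoothed_{T−1}, the following Python function applies the three λ values mentioned above to a made-up series. The function name, the start-up choice (initial smoother equal to the first observation) and the sample data are assumptions for illustration; this is not the authors' R code and does not use the EBITDA data.

```python
def simple_exponential_smoothing(y, lam, y0=None):
    """Return the SES series for discount factor lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Assumed start-up value: the first observation unless y0 is given.
    prev = y[0] if y0 is None else y0
    smoothed = []
    for obs in y:
        prev = lam * obs + (1.0 - lam) * prev
        smoothed.append(prev)
    return smoothed


# Made-up series with an upward drift (NOT the EBITDA index).
series = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]
for lam in (0.4, 0.6, 0.8):
    fitted = simple_exponential_smoothing(series, lam)
    gap = sum(abs(a - f) for a, f in zip(series, fitted))
    print(lam, [round(v, 3) for v in fitted], "total absolute gap:", round(gap, 3))
```

On this toy series the total absolute gap between the smoother and the data shrinks as λ approaches 1, which matches the behavior described for Figure 2.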

Second Order Exponential Smoothing
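Second-order or Double Exponential Smoothing (DES) can be applied when an upward or downward trend (a bullish or bearish market) is identified in the time series. Here, higher-order exponential smoothing was employed to approximate the signal using the general polynomial model

$$y_t = \beta_0 + \beta_1 t + \cdots + \beta_n t^n + \varepsilon_t,$$

with the same conditions mentioned before for the noise ε_t. An (n + 1)-order exponential smoother can be applied to smooth the general polynomial of degree n. For example, if a linear polynomial is taken to smooth possible linear trends, then considering the exponentially smoothed weights, the fact that |1 − λ| < 1, $E(y_t) = \beta_0 + \beta_1 t$ and the assumption of non-correlation, it can be shown that, for sufficiently large T,

$$E(\tilde{y}_T) = \beta_0 + \beta_1 T - \frac{1 - \lambda}{\lambda}\beta_1 = E(y_T) - \frac{1 - \lambda}{\lambda}\beta_1,$$

which means that the simple exponential smoother is a biased estimator. In the case of a linear polynomial signal the bias is given by $-(1 - \lambda)\beta_1/\lambda$, and since $-(1 - \lambda)\beta_1/\lambda \to 0$ when $\lambda \to 1$, the underestimation error is reduced as λ increases.
As mentioned above, a DES can be obtained by applying simple exponential smoothing to the first-order smoother $\tilde{y}^{(1)}_T$ to get

$$\tilde{y}^{(2)}_T = \lambda \tilde{y}^{(1)}_T + (1 - \lambda)\tilde{y}^{(2)}_{T-1},$$

where $\tilde{y}^{(1)}_T$ and $\tilde{y}^{(2)}_T$ denote the first- and second-order exponential smoothers, respectively. Because DES is a first-order exponential smoother of $\tilde{y}^{(1)}_T$, its expected value is

$$E(\tilde{y}^{(2)}_T) = E(\tilde{y}^{(1)}_T) - \frac{1 - \lambda}{\lambda}\beta_1.$$

Using this equation, estimators are obtained for β_0 and β_1:

$$\hat{\beta}_{1,T} = \frac{\lambda}{1 - \lambda}\left(\tilde{y}^{(1)}_T - \tilde{y}^{(2)}_T\right), \qquad \hat{\beta}_{0,T} = \tilde{y}^{(1)}_T - T\hat{\beta}_{1,T} + \frac{1 - \lambda}{\lambda}\hat{\beta}_{1,T}.$$

Combining these two estimators gives the second-order predictor for the ES model,

$$\hat{y}_T = \hat{\beta}_{0,T} + \hat{\beta}_{1,T}T = 2\tilde{y}^{(1)}_T - \tilde{y}^{(2)}_T.$$

The under- or overestimation bias obtained with the SES is substantially reduced by using the DES [37].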
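The bias argument can be checked numerically with a short Python sketch. On a noise-free linear series with slope β_1, the first-order smoother should settle about (1 − λ)β_1/λ below the data, while the combination 2·(first-order smoother) − (second-order smoother) should track it. The function name, the start-up values and the synthetic series are assumptions made for this illustration; they are not taken from the paper.

```python
def double_exponential_smoothing(y, lam):
    """Return (first, second, fitted) lists, with fitted_T = 2*first_T - second_T."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    s1 = s2 = y[0]  # assumed start-up values for both smoothers
    first, second, fitted = [], [], []
    for obs in y:
        s1 = lam * obs + (1.0 - lam) * s1   # first-order smoother
        s2 = lam * s1 + (1.0 - lam) * s2    # smoother applied to the smoother
        first.append(s1)
        second.append(s2)
        fitted.append(2.0 * s1 - s2)        # second-order (DES) fit
    return first, second, fitted


# Synthetic noise-free trend y_t = 2 + 3 t (NOT the EBITDA index).
lam, slope = 0.5, 3.0
trend = [2.0 + slope * t for t in range(60)]
first, second, fitted = double_exponential_smoothing(trend, lam)

print("SES lag at the end:", round(first[-1] - trend[-1], 6))
print("theoretical bias:  ", -(1.0 - lam) * slope / lam)
print("DES gap at the end:", round(abs(fitted[-1] - trend[-1]), 6))
```

The start-up transient decays like (1 − λ)^T, so only the last values of the run are compared with the asymptotic bias.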

ARIMA Models
ARIMA models, popularized by Box and Jenkins, are a flexible and powerful statistical tool for predictive modelling of time series data [38]. Primarily, ARIMA models approximate the future values of the time series as a linear function of past observations and white noise terms. The model consists of three components: differencing to achieve stationarity, an autoregressive (AR) model and a moving average (MA) model [37].
To define non-stationarity, the backward shift operator, B, is introduced. A time series, $y_t$, is called homogeneous non-stationary if it is non-stationary but its first difference, i.e., $w_t = y_t - y_{t-1} = (1 - B)y_t$, or its dth difference, $w_t = (1 - B)^d y_t$, produces a stationary time series. Moreover, $y_t$ is called an autoregressive integrated moving average (ARIMA) process of orders p, d, and q, denoted ARIMA(p, d, q), if its dth difference yields a stationary ARMA(p, q) process. Therefore, an ARIMA(p, d, q) model can be written as

$$\Phi(B)(1 - B)^d y_t = \delta + \Theta(B)\varepsilon_t,$$

where

$$\Phi(B) = 1 - \sum_{i=1}^{p} \phi_i B^i \quad \text{and} \quad \Theta(B) = 1 - \sum_{i=1}^{q} \theta_i B^i$$

are the back-shift operator terms in the AR(p) and MA(q) models, defined as $\Phi(B)y_t = \delta + \varepsilon_t$ and $y_t = \mu + \Theta(B)\varepsilon_t$, with δ = µ − φµ, where µ is the mean and ε_t the white noise with $E(\varepsilon_t) = 0$. The model orders p and q are determined by the nature of the autocorrelation and partial autocorrelation functions. The model coefficients are calculated using the maximum likelihood method devised by Box and Jenkins [14]. The best model is identified by diagnostic checks such as the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Jarque-Bera normality test on the residual error series.
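The differencing operator (1 − B)^d can be illustrated with a few lines of Python. This is only a sketch of the "I" step of ARIMA on made-up numbers; the function names are hypothetical, and the fitting of the AR and MA coefficients by maximum likelihood is not shown.

```python
def difference(y, d=1):
    """Apply (1 - B)^d: each pass replaces y_t by y_t - y_{t-1}."""
    out = list(y)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def undifference(w, last_level):
    """Invert one first difference, starting from the last observed level."""
    levels = []
    current = last_level
    for step in w:
        current += step
        levels.append(current)
    return levels


# Made-up quadratic series: one difference leaves a trend, two leave a constant.
y = [1, 4, 9, 16, 25]
print(difference(y, d=1))
print(difference(y, d=2))
# Forecast differences are turned back into levels by cumulative summation.
print(undifference([11, 13], last_level=25))
```

Each pass shortens the series by one observation, which matters for very short yearly series such as the one studied here.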

Accuracy Metrics
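To determine which approximation to the time series is the best, the error was calculated using as measures the mean absolute percentage error (MAPE), the mean absolute deviation (MAD) and the mean squared deviation (MSD), defined by the following expressions, where $y_t$ is the observed value, $\hat{y}_t$ the fitted or predicted value and n the number of observations:

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|, \qquad \mathrm{MAD} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right|, \qquad \mathrm{MSD} = \frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2.$$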
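A minimal Python sketch of the three accuracy metrics, using their standard definitions (averages of absolute percentage errors, absolute errors and squared errors), is given below. The function names and sample values are illustrative assumptions; the numbers are not results from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mad(actual, predicted):
    """Mean absolute deviation."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def msd(actual, predicted):
    """Mean squared deviation."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up validation values and forecasts.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(round(mape(actual, predicted), 3), mad(actual, predicted), round(msd(actual, predicted), 3))
```

MAPE is scale-free, whereas MAD and MSD are expressed in the units (and squared units) of the series, so they are most useful for comparing models on the same data.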

Forecasting
So far, exponential smoothing techniques have been considered as visual aids to pinpoint underlying patterns in time series data and to estimate model parameters for a set of models. Exponential smoothing is now used to predict future observations. For example, at time T, it is useful to forecast the observation at T + τ, denoted by $\hat{y}_{T+\tau}(T)$, where τ is the lead time for the forecast made at T. In this paper, first- and second-order exponential smoothers are considered for forecasting time series data of constant and linear trend processes.

Constant Process
In Section 4, first-order exponential smoothing for a constant process was introduced. From Equation (6), Equation (11) was obtained, and it was shown that the constant process f(t, β) = β_0 can be estimated by $\tilde{y}_T$. The constant model consists of two parts: β_0, which can be estimated by the first-order exponential smoother, and the random error ε_t. The forecast for future observations is simply equal to the current value of the exponential smoother:

$$\hat{y}_{T+\tau}(T) = \tilde{y}_T := \lambda y_T + (1 - \lambda)\tilde{y}_{T-1}. \tag{12}$$

To avoid misleading forecasts, once the observation at time T + 1 becomes available, the following expression can be used to forecast future observations:

$$\hat{y}_{T+1+\tau}(T + 1) = \lambda y_{T+1} + (1 - \lambda)\hat{y}_{T+\tau}(T), \tag{13}$$

which for τ = 1 can be written as

$$\hat{y}_{T+2}(T + 1) = \hat{y}_{T+1}(T) + \lambda\left(y_{T+1} - \hat{y}_{T+1}(T)\right) = \hat{y}_{T+1}(T) + \lambda e_{T+1}(1),$$

where $e_{T+1}(1) = y_{T+1} - \hat{y}_{T+1}(T)$ is the one-step-ahead forecast error. Therefore, the prediction of the next observation is given by the previous forecast for the current observation plus λ times the error made in forecasting the current observation, so a good choice of the discount factor λ is crucial. Let $SS_E(\lambda)$ be the sum of the squared one-step-ahead forecast errors, defined as

$$SS_E(\lambda) = \sum_{t=1}^{T} e_t^2(1),$$

where, for the historical data of the EBITDA index, the value of λ that minimizes this sum is used to predict future observations. Considering the process with the signal f(t, β) = β_0, the 100(1 − α) percent prediction interval for any lead time τ is given by

$$\tilde{y}_T \pm Z_{\alpha/2}\,\hat{\sigma}_e,$$

where $\tilde{y}_T$ is the first-order exponential smoother, $Z_{\alpha/2}$ the 100(1 − α/2) percentile of the standard normal distribution, and $\hat{\sigma}_e$ the estimate of the standard deviation of the forecast errors.
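The choice of λ by minimizing the sum of squared one-step-ahead errors, and the resulting forecast with a normal prediction interval, can be sketched in Python as follows. The grid search, the start-up value (smoother initialized at the first observation), the root-mean-square estimate of the error standard deviation and all names are assumptions made for illustration; the paper does not state how these details were handled.

```python
from statistics import NormalDist


def one_step_errors(y, lam):
    """Return (errors, final_level): e_t(1) = y_t - forecast made at t - 1."""
    level = y[0]  # assumed start-up: smoother initialized at the first observation
    errors = []
    for obs in y[1:]:
        error = obs - level          # the forecast for obs is the previous smoother
        errors.append(error)
        level = level + lam * error  # same as lam*obs + (1 - lam)*level
    return errors, level


def sse(y, lam):
    """Sum of squared one-step-ahead forecast errors for a given lam."""
    return sum(e * e for e in one_step_errors(y, lam)[0])


def best_lambda(y, grid):
    """Pick the lam on the grid with the smallest SSE (simple grid search)."""
    return min(grid, key=lambda lam: sse(y, lam))


def ses_forecast_interval(y, lam, alpha=0.05):
    """Return (lower, forecast, upper); the SES forecast is flat for every lead time."""
    errors, level = one_step_errors(y, lam)
    sigma = (sum(e * e for e in errors) / len(errors)) ** 0.5  # assumed estimator
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return level - z * sigma, level, level + z * sigma


# Made-up series (NOT the EBITDA index).
series = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]
grid = [k / 10.0 for k in range(1, 10)]
lam = best_lambda(series, grid)
print("selected lam:", lam)
print("interval:", [round(v, 3) for v in ses_forecast_interval(series, lam)])
```

A finer grid or a numerical optimizer could replace the grid search; the grid is used here only because it mirrors the plot of discount factor against squared error described in the Results.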

Linear Process
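For the linear trend model, a τ-step-ahead forecast is given by

$$\hat{y}_{T+\tau}(T) = \hat{\beta}_{0,T} + \hat{\beta}_{1,T}(T + \tau),$$

which can be written in terms of the first- and second-order exponential smoothers $\tilde{y}^{(1)}_T$ and $\tilde{y}^{(2)}_T$ as

$$\hat{y}_{T+\tau}(T) = \left(2 + \frac{\lambda}{1 - \lambda}\tau\right)\tilde{y}^{(1)}_T - \left(1 + \frac{\lambda}{1 - \lambda}\tau\right)\tilde{y}^{(2)}_T.$$

The forecasts can be improved as more data are collected by updating the parameter estimates with each new observation. A prediction interval for the lead time τ is then constructed around this forecast.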
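A τ-step-ahead forecast for a linear trend can be sketched in Python from the first- and second-order smoothers, using forecast = (2 + λτ/(1 − λ))·s1 − (1 + λτ/(1 − λ))·s2, where s1 and s2 are the final smoother values. The function name, start-up values and synthetic data are assumptions for this example, and the prediction interval is not computed here.

```python
def des_forecast(y, lam, tau):
    """Forecast tau steps ahead of the last observation with double exponential smoothing."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    s1 = s2 = y[0]  # assumed start-up values
    for obs in y:
        s1 = lam * obs + (1.0 - lam) * s1
        s2 = lam * s1 + (1.0 - lam) * s2
    ratio = lam / (1.0 - lam)
    return (2.0 + ratio * tau) * s1 - (1.0 + ratio * tau) * s2


# Synthetic noise-free trend y_t = 2 + 3 t for t = 0..79 (NOT the EBITDA index).
trend = [2.0 + 3.0 * t for t in range(80)]
for tau in (1, 5):
    print(tau, round(des_forecast(trend, lam=0.5, tau=tau), 6))
```

Re-running the function as each new observation is appended is one simple way to keep the slope and level estimates up to date.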

ARIMA Forecasting
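Once an ARIMA time series model is fitted, it can be used to predict future observations. The best prediction, in the mean-square sense, is given by the conditional expected value of $y_{T+\tau}$ given the observations up to time T, expressed mathematically as follows:

$$\hat{y}_{T+\tau}(T) = E\left[y_{T+\tau} \mid y_T, y_{T-1}, \ldots\right] = \mu + \sum_{i=\tau}^{\infty} \psi_i \varepsilon_{T+\tau-i},$$

where the $\psi_i$ are the weights of the infinite moving average representation of the model and the sum starts from the horizon τ. The prediction error is then computed using the expression

$$e_T(\tau) = y_{T+\tau} - \hat{y}_{T+\tau}(T) = \sum_{i=0}^{\tau-1} \psi_i \varepsilon_{T+\tau-i},$$

and the 100(1 − α) percent prediction interval for $y_{T+\tau}$ is $\hat{y}_{T+\tau}(T) \pm z_{\alpha/2}\,\sigma(\tau)$, where σ(τ) is the standard deviation of $e_T(\tau)$. Since the random shocks are not observed directly, they are estimated recursively one step ahead, as in the ES models (see [14]).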
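To make the role of the forecast horizon concrete, the Python sketch below computes the moving-average (ψ) weights of an ARIMA(0, 1, q) model and the resulting forecast standard deviation, σ(τ) = σ·sqrt(ψ_0² + ... + ψ_{τ−1}²). It assumes the Box-Jenkins sign convention Θ(B) = 1 − θ_1 B − ... − θ_q B^q, for which ψ_0 = 1 and ψ_j = ψ_{j−1} − θ_j (with θ_j = 0 for j > q). The θ and σ values are hypothetical and are not the coefficients fitted in the paper.

```python
from statistics import NormalDist


def psi_weights_ima(theta, n):
    """First n psi weights of (1 - B) y_t = Theta(B) eps_t, Theta(B) = 1 - sum(theta_i B^i)."""
    psi = []
    current = 1.0
    for j in range(n):
        if j > 0:
            current -= theta[j - 1] if j <= len(theta) else 0.0
        psi.append(current)
    return psi


def forecast_std(theta, sigma, tau):
    """Standard deviation of the tau-step-ahead forecast error."""
    psi = psi_weights_ima(theta, tau)
    return sigma * sum(p * p for p in psi) ** 0.5


def prediction_interval(point, theta, sigma, tau, alpha=0.05):
    """Normal 100(1 - alpha) percent interval around a point forecast."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * forecast_std(theta, sigma, tau)
    return point - half_width, point + half_width


# Hypothetical MA(3) coefficients and shock standard deviation.
theta, sigma = [0.5, 0.2, 0.1], 1.0
for tau in (1, 3, 5):
    print(tau, round(forecast_std(theta, sigma, tau), 4))
```

With an empty coefficient list the model reduces to a random walk, for which the forecast standard deviation grows as σ·sqrt(τ); this gives a quick sanity check of the sketch.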

Results
For the tests, the fashion sector's EBITDA index was used as the time series of interest and response variable. The EBITDA index was calculated from the available data using SQL queries on the designed database, into which each Excel file with balance sheet information had been imported. The time series was divided into a training set and a validation set using 80% and 20% of the information, respectively.
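For a time series the 80/20 division has to respect time order, so the validation set is formed by the most recent observations. A minimal Python sketch, assuming one observation per year from 1995 to 2019 and a hypothetical helper name:

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into (training, validation) without shuffling."""
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


# Placeholder values: the years stand in for the yearly EBITDA observations.
years = list(range(1995, 2020))
train, validation = chronological_split(years)
print(len(train), len(validation), validation)
```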
To confirm the trend of the original data, ACF and PACF plots were produced. As can be seen in Figure 3, the ACF and PACF decay geometrically, which is a sign of non-stationarity. The p-value = 0.99 > α = 0.05 (5% significance level, or 95% confidence level) and ADF statistic = 0.70619 > −2.5, −3.4, −2.8 (critical values reported for the 1%, 5% and 10% levels); therefore, the time series was not stationary according to the Augmented Dickey-Fuller test, as observed in Table 1. To choose the ARIMA orders, the Akaike Information Criterion (AIC) was used, and the best discount factor for the ES method was obtained by minimizing the sum of squared errors ($SS_E$). In this work, Double Exponential Smoothing (DES) was used to reduce the approximation error. The best discount factor for SES was also the best for DES and was calculated by minimizing $SS_E$. Figure 4 shows a plot of the discount factor against $SS_E$, with a vertical line drawn at the λ value where the minimum is reached. Assuming a linear EBITDA signal, Figure 6 shows how the underestimation error was reduced in the smoothing process and the discount factor that minimized $SS_E$. To forecast future values of EBITDA using DES, Equation (1.18) was used, updating the constants in (1.19) at each step forward in the prediction.
To fit the best model ARIMA for the index EBITDA, coefficients were calculated using the maximum likelihood method introduced by Box and Jenkins [14] as shown in Figure 7.
The orders of the model were calculated based on the AIC, BIC and Jarque-Bera normality test diagnostics. The lowest AIC was obtained with the ARIMA(0, 1, 3) model, which is a third-order MA with an integration for stationarity.
The prediction procedure was developed as explained in Section 5. The ARIMA and DES models were compared to select the one with the lowest prediction error. The horizon selected for forecasting was 5 years, corresponding to 20% of the total data. Figures 8 and 9 show the predicted values obtained using the DES and ARIMA models. As can be seen in these graphs, the predictions were quite similar, but in terms of error, using the metrics proposed in Section 4.4, the best predictive model was ARIMA(0, 1, 3). Table 2 shows the errors obtained from the accuracy tests of the two models proposed in this work: the ARIMA model outperformed DES, providing a good model for forecasting the EBITDA index. A higher order could be used for the exponential smoothing model, but this would require a triple exponential smoother. Therefore, the ARIMA model can be used instead to forecast very short time series such as the EBITDA index for the fashion sector in Colombia.

Summary and Conclusions
Colombia is a country with a textile tradition, whose fashion industry has been affected by economic and social changes that have produced a significant decrease in its real profit. Since there were no statistical models available to predict the future behavior of profit accumulation for this sector and its companies, two predictive models, exponential smoothing and ARIMA, were presented in this paper. For this purpose, a database was designed in PostgreSQL with historical data reported over a period of 24 years (1995-2019) by the SIS, with the aim of studying the value of companies in the fashion sector in Colombia. EBITDA was used as the metric to evaluate operating performance.
In this work, it was found that, by fitting Exponential Smoothing and ARIMA models to predict the EBITDA index, it was possible to study the future behavior of this sector, facilitating decision making related to growth and value maximization. The best exponential smoothing model obtained was the DES, which considers a linear process as the signal; this model provided a better fit for the EBITDA index than the SES, which considers a constant process. By minimizing $SS_E$ for the DES models, it was found that λ = 0.8 was the appropriate discount factor in the smoothing process. The ARIMA model was also used for forecasting purposes. Compared to the DES model, it provided better predictions based on the accuracy metrics MAPE, MAD and MSD, and it is concluded that, for this time series, it is the best alternative among the models studied to predict the EBITDA index for the Colombian fashion sector. The results obtained with these models coincide with the findings of other researchers, who report good predictions in applications such as traffic flow, telecommunication indicators and electricity demand. Machine learning techniques can be considered for future work, using, for example, artificial neural networks (ANN) such as recurrent neural networks to predict different indices and evaluate the behavior of the fashion sector, taking into account the high volume of data available to calculate each index. Likewise, these models can be used in the study of other economic industries.
Author Contributions: All the authors (L.R., A.J.G.-R. and M.G.F.) have participated equally in all aspects of this paper: conceptualisation, methodology, investigation, formal analysis, software, validation, visualisation, writing-original draft preparation, writing-review and editing. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement: Not applicable.