Hybrid Models Combining Technical and Fractal Analysis with ANN for Short-Term Prediction of Close Values on the Warsaw Stock Exchange

Abstract: This paper presents new hybrid methods and models for forecasting stock prices, combining analytical and neural approaches. First, technical and fractal analyses are conducted and selected stock market indices are calculated, such as moving averages and oscillators. Next, on the basis of these indices, an artificial neural network (ANN) provides one-day-ahead predictions of the closing prices of the assets. New technical analysis indicators using fractal modeling are also proposed. Three kinds of hybrid model with different degrees of fractal analysis were considered. The new hybrid modeling approach was compared to previous ANN-based prediction methods. The results showed that the hybrid model with fractal analysis outperforms other models and is more robust over longer periods of time.


Introduction
Information about the current conditions on the Stock Exchange and future closing prices is crucial for investors who wish to maximize their profits and make good investment decisions. Over the years, different theories and stock models have been developed, some of which contradicted each other. Until the second half of the twentieth century, it was believed that price changes were random and thus could not be predicted. This was a basic assumption of the random walk hypothesis [1], which is consistent with the efficient market hypothesis (EMH) formulated by Eugene Fama [2]. This hypothesis states that share prices fully reflect all the available information, and stocks always trade at a fair value. It is therefore impossible to outperform the market overall, and higher returns can only be obtained by chance or by purchasing riskier investments. This hypothesis was not in line with experience, as evidenced by the profits amassed by professional financial institutions trading on stock markets. It was also refuted by a large group of academics, who observed and analyzed market prices over long periods of time and showed that market prices were not random, but exhibited trends [3,4]. Moreover, these trends repeated over time and some followed a sinusoidal shape [5]. These results supported the view that the market is, to some degree, predictable, and that future prices and trends can be predicted on the basis of past prices. This opened the way for technical analysis. Thus, the problem of stock price prediction is closely related to time series analysis and modeling trends.
These days, not only spectral analysis of trade signals [5], but also other more complex methods, such as joint time-frequency-shape analysis, are available [6].
In the present study, new hybrid models combining technical and fractal analyses with ANN are proposed for the short-term prediction of close values on the Warsaw Stock Exchange. Novel structures for hybrid models are presented, along with new fractal analysis indicators, and we compare the accuracy of the new hybrid models. Fractal analysis is based on the fractal market hypothesis, which was formulated in 1994 by Peters [36] and derived from chaos theory [37]. Fractal shapes can be formed in many ways. The simplest is the multiple iteration of a generating rule (e.g., the Koch curve or Sierpinski triangle) [38,39]. Fractals are generated in a deterministic way and all have a fractal, i.e., non-integer dimension. There are also random fractals, which can be generated using probability rules [40]. These ideas inspired the authors to implement fractal theory in the hybrid models and apply the fractal dimension to modify the TA indicators.

Technical Analysis Indicators
Technical analysis indicators are very useful tools for evaluating market trends, predicting price changes, and assessing the strength of a market. They can take different graphical or analytical forms. Specific values or combinations of values can be interpreted by a technical analyst as indications to sell, buy, or hold an asset [27]. A detailed list and description of TA indicators can be found in Reference [41]. Here, only the most important TA indicators are presented, namely the selected moving averages and oscillators used in our work. These are as follows:
a. Moving averages
• The simple moving average (SMA) for 5, 10, or 20 days:
$$\mathrm{SMA}_N(k) = \frac{1}{N}\sum_{i=0}^{N-1} C(k-i) \qquad (1)$$
where C(k) is the closing price on the k-th day and N is the number of days (N = 5, 10, or 20).
• The exponential moving average (EMA) for 5, 10, or 20 days (Equation (2)), where N and C(k) are the same as in Equation (1) and a is a constant coefficient.
b. Oscillators
• The rate of change (ROC) characterizes the rate of price changes over 5, 10, or 20 days.
• The relative strength index (RSI) (Equation (4)) is used to identify whether the market is overbought or oversold. The values are always in the range from 0 to 100. If the RSI is smaller than 30, the market is assumed to be oversold, while a value larger than 70 means that the market is overbought. However, in the case of strong trends, RSI < 20 signifies an oversold market (during a bear market) and RSI > 80 indicates an overbought market (during a bull market). The RSI is a measure of the strength of upward movements in relation to downward movements.
• The stochastic oscillator (K%D) determines the relative value of the last closing price in relation to the considered range of price changes in a given period. Oscillator values cover a range from 0 to 100. If K%D is over 70, the closing price is considered to be near the top end of the range of its fluctuations. A K%D below 30 indicates that it is close to the lower end of that range. The oscillator is computed from L(14) and H(14), the lowest and highest prices, respectively, over the last 14 days.
• The moving average convergence/divergence (MACD) is defined as the difference between the long-term and short-term values of exponential moving averages. This oscillator is used to study buy and sell signals. It is usually compared with its 9-day exponential moving average, which is called the signal line (SL). The intersection of these two lines indicates a buying signal if the MACD line crosses the signal line from below, and a selling signal if it crosses from above.
$$\mathrm{SL}(k) = \mathrm{EMA}_9(\mathrm{MACD}(k)) \qquad (7)$$
• Accumulation/distribution (AD) relates price to volume and indicates whether the price changes in the stock market appear together with increased accumulation and distribution movements. It is computed from V(k), the volume (e.g., the total number of shares traded on the k-th day), and from L(k) and H(k), the lowest and the highest prices on the k-th day.
• The Bollinger oscillator (BOS) indicates whether the market is overbought or oversold. It is derived from the Bollinger bands and uses SDev_C(k), the standard deviation of C(k).
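The indicators listed above can be sketched in a few lines of code. The following is a minimal illustration only: the recursive EMA with smoothing factor 2/(N+1), the percentage form of ROC, the 12/26-day MACD periods, and the 20-day Bollinger window are common textbook conventions assumed here, not values taken from this paper, and all function names are hypothetical.

```python
from statistics import pstdev


def sma(close, n):
    """Simple moving average over the last n closes, for each day k >= n-1."""
    return [sum(close[k - n + 1:k + 1]) / n for k in range(n - 1, len(close))]


def ema(values, n, alpha=None):
    """Recursive EMA; alpha = 2/(n+1) is an assumed, commonly used default."""
    if alpha is None:
        alpha = 2.0 / (n + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def roc(close, n):
    """Percentage rate of change over n days (one common definition)."""
    return [100.0 * (close[k] - close[k - n]) / close[k - n]
            for k in range(n, len(close))]


def rsi(close, n=14):
    """RSI from EMA-smoothed gains U and losses D; values lie in [0, 100]."""
    gains = [max(close[k] - close[k - 1], 0.0) for k in range(1, len(close))]
    losses = [max(close[k - 1] - close[k], 0.0) for k in range(1, len(close))]
    u, d = ema(gains, n), ema(losses, n)
    # 100*U/(U+D) is algebraically the same as 100 - 100/(1 + U/D).
    return [100.0 * uk / (uk + dk) if uk + dk > 0 else 50.0
            for uk, dk in zip(u, d)]


def stochastic_k(close, high, low, n=14):
    """Position of the close within the n-day low-high range, in [0, 100]."""
    out = []
    for k in range(n - 1, len(close)):
        lo = min(low[k - n + 1:k + 1])
        hi = max(high[k - n + 1:k + 1])
        out.append(100.0 * (close[k] - lo) / (hi - lo) if hi > lo else 50.0)
    return out


def macd(close, fast=12, slow=26, signal=9):
    """MACD line (fast EMA minus slow EMA) and its signal line SL = EMA_9(MACD)."""
    line = [f - s for f, s in zip(ema(close, fast), ema(close, slow))]
    return line, ema(line, signal)


def bollinger_oscillator(close, n=20):
    """Distance of the close from its n-day SMA in units of standard deviation."""
    out = []
    for k in range(n - 1, len(close)):
        window = close[k - n + 1:k + 1]
        sd = pstdev(window)
        out.append((close[k] - sum(window) / n) / sd if sd > 0 else 0.0)
    return out
```

Each function returns a plain list, so the outputs can be aligned by their last elements and fed to a network as one input vector per day.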

Fractal Analysis
Fractal-based approaches have been gaining popularity in market analyses. This is due to the fact that fractal analysis (FA) allows for more precise modeling of stock market trends than technical analysis. The implementation of fractal analysis requires recognition of the fractal dimension. For this purpose, we used the so-called box-counting procedure [38]. The analyzed data chart was covered by N small elements (boxes) of size S. If we change the scale and the size of elements from S_1 to S_2, the relationship between the numbers of elements N_1 and N_2 needed to cover the graph with objects of size S_1 and S_2, respectively, is given as [42]:
$$\frac{N_2}{N_1} = \left(\frac{S_1}{S_2}\right)^{D}$$
where D denotes the fractal dimension.
When measuring the fractal dimension of the share price chart, the considered period of time 2T is divided into two equal intervals. For each time period, the chart of share prices is covered by N elements. The number N_{1T} of elements in the first time period T is equal to:
$$N_{1T} = \frac{H_T(k) - L_T(k)}{T}$$
where H_T(k) and L_T(k) are the highest and lowest share prices in the first time period T. The number N_{2T} of elements in the second time period T is equal to:
$$N_{2T} = \frac{H_{2T}(k) - L_{2T}(k)}{T}$$
where H_{2T}(k) and L_{2T}(k) are the highest and the lowest share prices in the second time period, from T to 2T. The number N_{0-2T} of elements in the whole considered time period 2T is equal to:
$$N_{0-2T} = \frac{H_{0-2T}(k) - L_{0-2T}(k)}{2T}$$
where H_{0-2T}(k) and L_{0-2T}(k) are the highest and the lowest share prices over the whole time period 2T. The fractal dimension of the share price chart is given by the relationship [42]:
$$D = \frac{\log\left(N_{1T} + N_{2T}\right) - \log\left(N_{0-2T}\right)}{\log 2}$$
The fractal dimension is used to define the fractal moving average (FRAMA). This moving average is derived from the exponential moving average (Equation (2)) with the coefficient (1 − a), where a is given by [42]:
$$a = \exp(-4.6\,(D - 1)) \qquad (16)$$
In this paper, we extend the idea of using a fractal approach to predict share prices and propose new technical analysis indicators with FRAMA:
a. The relative strength index with FRAMA (RSI_FRAMA). This FA indicator is derived from the TA indicator RSI (Equation (4)) by replacing EMA with FRAMA, where U(k) and D(k) are, respectively, the average increase and the average decrease on the k-th day.
b. The MACD with FRAMA, where SL_F is the signal line with FRAMA.
c. The Bollinger oscillator with FRAMA, where SDev_C(k) denotes the standard deviation.
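As an illustration of the box-counting estimate and of FRAMA, the sketch below follows the widely published formulation of the fractal adaptive moving average, which uses the same coefficient a = exp(−4.6(D − 1)). The 16-day window, the clamping of a to [0.01, 1], the warm-up behavior, and the function names are assumptions made for this example and are not taken from the paper.

```python
import math


def fractal_dimension(high, low):
    """Estimate D over a window of even length 2T from daily highs and lows."""
    two_t = len(high)
    t = two_t // 2
    n1 = (max(high[:t]) - min(low[:t])) / t        # first half, 0..T
    n2 = (max(high[t:]) - min(low[t:])) / t        # second half, T..2T
    n3 = (max(high) - min(low)) / two_t            # whole window, 0..2T
    if n1 + n2 <= 0 or n3 <= 0:
        return 1.0  # flat window: treated as a smooth line (assumption)
    return (math.log(n1 + n2) - math.log(n3)) / math.log(2.0)


def frama(close, high, low, window=16):
    """EMA-like average whose factor a = exp(-4.6 (D - 1)) adapts to D."""
    out = list(close[:window - 1])  # warm-up: closes are passed through
    for k in range(window - 1, len(close)):
        d = fractal_dimension(high[k - window + 1:k + 1],
                              low[k - window + 1:k + 1])
        a = math.exp(-4.6 * (d - 1.0))
        a = min(max(a, 0.01), 1.0)  # keep the factor in a usable range
        prev = out[-1] if out else close[k]
        out.append(a * close[k] + (1.0 - a) * prev)
    return out
```

For a steadily trending chart D is close to 1, so a is close to 1 and the average follows the price almost immediately. For a choppy chart D approaches 2, a drops to about 0.01, and the average becomes very slow.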

Application of Hybrid Analytical-Neural Models for Share Price Forecasting
Information regarding likely upcoming share price changes is of the utmost importance for investors. In this work, new hybrid analytical-neural models are proposed for predicting the closing price of an asset for the next day. In these models, technical or fractal analyses were combined with ANN. The general structure of these hybrid models is shown in Figure 1. In the first module, TA or FA indicators were calculated. These were then used as inputs for the ANN, and the share closing price for the next day was obtained as the ANN output. Three kinds of hybrid model are discussed:
• TA-ANN, which combines technical analysis with an ANN;
• FA-ANN, which combines fractal analysis with an ANN;
• TA-FA-ANN, which combines technical and fractal analyses with an ANN.

The different hybrid model structures were compared to find the best solution. The influence of fractal analysis on the final results was also examined.
In our earlier research, several TA-ANN models [22] and FA-ANN models [30] were tested, in which previously calculated market indicators were used as ANN inputs. This corresponds to the open switch S in Figure 1. In the present work, TA-FA-ANN models are introduced. A moving time window method was also applied in all three kinds of hybrid model, where, in addition to market indicators, the current and several past samples of the CLOSE signal were entered as ANN inputs. This variant corresponds to the closed switch S in Figure 1.
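The two input variants (switch S open or closed) can be illustrated with a short sketch that builds one training sample per day. The function name `build_samples` and the `window` parameter are hypothetical; `window` counts the current and past CLOSE samples appended to the indicator values, and `window=0` corresponds to the open switch.

```python
def build_samples(close, indicators, window=0):
    """Pair each day's input vector with the next day's close.

    close:      list of daily closing prices
    indicators: list of series, each aligned day by day with `close`
    window:     number of CLOSE samples (current + past) appended to the
                indicators; 0 means indicators only (switch S open)
    """
    samples = []
    start = max(window - 1, 0)
    for k in range(start, len(close) - 1):
        x = [series[k] for series in indicators]
        if window:
            x += close[k - window + 1:k + 1]  # moving time window
        samples.append((x, close[k + 1]))     # target: close one day ahead
    return samples
```

Passing an empty `indicators` list with `window > 0` gives the inputs of a purely ANN-based model that sees only past CLOSE values.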
Feed-forward ANNs of the MLP type were applied for close price forecasting. An example of one of the applied two-layer MLP structures, with a 4-7-1 architecture, is shown in Figure 2. It is composed of 4 input nodes, 7 neurons with a sigmoidal transfer function in a hidden layer, and 1 neuron in an output layer. In this study, different two-layer MLP structures were tested with different numbers of inputs and neurons in the hidden layer. Such two-layer ANNs have been shown to be universal approximators [13], i.e., they are capable of mapping any smooth input-output function with appropriate accuracy [43]. In the presented example, four market indicators were entered into the ANN input and the CLOSE value was obtained on the ANN output. Generally, for all the tested ANNs, the number of inputs is determined by the number of implemented TA and FA indices and also by the number of past samples of the CLOSE signal. The number of neurons in the hidden layer should be chosen in such a way as to achieve a compromise between ANN accuracy and generalization capability [44]. An insufficient number of neurons would result in poor ANN accuracy. Too many neurons would increase the network training time and could reduce the ANN generalization capability [44]. Therefore, different types of MLP were tested. The ANNs were trained with a resilient propagation algorithm [45]. Input variables were selected on the basis of expert knowledge gathered in a literature review [14,16,41,46,47] and then tested experimentally according to the rules described in Section 5. The main criterion for selecting the ANN structures was the minimum of the mean square error (MSE) for the testing data.
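The forward pass of such a 4-7-1 network can be sketched as follows. This is an illustration under stated assumptions: the output neuron is taken to be linear and the weights are drawn uniformly from [−0.5, 0.5], neither of which is specified in the text, and the function names are hypothetical. Training (resilient propagation in the paper) is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_mlp(n_in=4, n_hidden=7, seed=0):
    """Random weights; the last entry of every weight row is the bias."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """One hidden sigmoidal layer followed by a single linear output neuron."""
    hidden = [sigmoid(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))
              for w in w_hidden]
    return w_out[-1] + sum(wi * hi for wi, hi in zip(w_out, hidden))
```

With 4 inputs and 7 hidden neurons the network has 4·7 + 7 hidden-layer parameters and 7 + 1 output parameters, 43 in total, which shows how quickly the parameter count grows when inputs or hidden neurons are added.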
The reference method for hybrid modeling was a purely ANN-based approach, in which the moving time window method [48] was applied and the share close price was predicted by MLP.

Methodology
Different hybrid models using TA, FA, and ANN were compared with one another and with a purely ANN-based approach in order to find the best solution for short-term CLOSE value prediction. For this purpose, specialized software was designed using Java and the Encog library, which allows both for stock data collection and pre-processing and for automatically training and testing a large number of different hybrid model structures. In our previous study [22], the choice of ANN input variables was made on the basis of expert knowledge. A small number of input variables were selected in order to design sufficiently accurate yet compact models. In the present work, new algorithms were implemented to create different hybrid models with ANNs of the two-layer MLP type (see Figure 2). A large number of market indicators in different configurations were considered. The ANN input variables were selected randomly and the most accurate model structures were identified experimentally, based on the results of machine learning. The main machine learning algorithm is based on the following rules:

1. A set of ANN input data vectors (i.e., non-repeating combinations of market indicators, with or without past CLOSE samples, for a selected company) is generated randomly.
3. For each input vector and each MLP structure:
a. All input data are normalized to the <0.1; 0.9> range;
b. The training data for each company are divided into a learning data set and a testing data set, where the testing set is about 30%;
c. The ANNs are trained using the resilient propagation algorithm with a momentum factor;
d. Eight different ANNs are trained and the ANN with the smallest MSE for the testing data is chosen as the best.
4. From the whole set of trained ANNs, a small subset of the best ANNs is identified, for which the MSE for the testing data is smaller than the defined error limit.
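The data preparation and selection steps above can be sketched as follows. The linear min-max mapping onto [0.1, 0.9] and the chronological split (the last 30% of samples used for testing) are assumptions made for this illustration; the text states only the target range and the approximate test share. All function names are hypothetical.

```python
def normalize(values, lo=0.1, hi=0.9):
    """Linear min-max scaling of a series onto [lo, hi] (assumed form)."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [(lo + hi) / 2.0 for _ in values]
    scale = (hi - lo) / (v_max - v_min)
    return [lo + (v - v_min) * scale for v in values]


def split(samples, test_fraction=0.3):
    """Learning set first, testing set last (chronological order assumed)."""
    cut = len(samples) - int(round(len(samples) * test_fraction))
    return samples[:cut], samples[cut:]


def mse(targets, predictions):
    """Mean square error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def best_of(trained):
    """From (network, test_mse) pairs, keep the one with the smallest test MSE."""
    return min(trained, key=lambda pair: pair[1])


def below_limit(trained, error_limit):
    """Subset of (network, test_mse) pairs whose test MSE is under the limit."""
    return [pair for pair in trained if pair[1] < error_limit]
```

In this sketch `best_of` corresponds to choosing among the eight networks trained for one structure, and `below_limit` to the final filtering against the defined error limit.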
The proposed stock exchange data forecasting methodology was tested on historical stock data for five chosen companies (ŻYWIEC SA, ASSECO POLAND SA, BANK BPH SA, BUDIMEX SA, and VISTULA SA). The tested companies represent different market sectors, so the results can be generalized. They have been listed on the Warsaw Stock Exchange (WSE) since at least 1999. The WSE has been running since 12 April 1991. All the ANNs were trained on data covering the period from the start of the company's listing on the WSE to the end of 2008, and then tested on data from 1 January 2009 to 1 January 2015 (6 years). The best hybrid model structures were selected in an experimental way according to the following scheme:

1. First, 12,628 various input data vectors were generated randomly for one company. Then, 208,518 different MLPs were trained. Each network was trained according to the rules defined above.
2. In the second step, 29,149 MLPs with the smallest MSE were selected and their structures used to train ANNs for three randomly chosen companies.
3. Next, 5545 ANN structures with an MSE smaller than the defined error limit were used to train ANNs for all five companies.
4. Finally, 300 ANNs with the lowest MSE for the testing data for all five companies were selected and used to predict close values for the next day.

Results
The best networks for all types of the tested ANN-based hybrid models for the five chosen companies are listed in Table 1. The best ANN-based hybrid models in each of the considered classes of models are shown. It can be noted that for the FA-ANN and TA-ANN models, the structures with inputs extended by current and past samples of the CLOSE signal (closed switch S, in Figure 1) produced the best results (i.e., the smallest MSE for the testing data covering a period of 6 years).
Comparison of the MSE obtained for the testing data implies that hybrid models ensure better accuracy than a purely ANN-based approach. The hybrid model with fractal analysis outperforms the hybrid model with only TA and its results are similar to those of models with technical and fractal analysis indicators as input for the ANNs.
The MSE is a commonly used criterion for evaluating the accuracy of models, yet it averages the results. Therefore, the time series of the close prices of the considered companies were analyzed in detail. The results indicate that information from a single ANN is an insufficient basis for investment decisions. Therefore, in order to find the best hybrid ANN-based model, four kinds of model with the smallest MSE listed in Table 1 were tested for all five considered companies. Because the testing time period of 6 years is too long to observe the complete time series charts, it was shortened to one year only (2010). One year is long enough for the results to be representative for the whole set of the tested companies over a long period of time. Exemplary results of CLOSE price predictions obtained for three selected companies are presented in Figures 3-5.
A comparison of the results shows that CLOSE value predictions using the hybrid ANN-based approach with technical and fractal analyses are more stable over a long period of time than those using the ANN model. The best prediction accuracy was obtained for the FA-ANN and FA-TA-ANN models.
The most important prediction parameter for investors is the highest absolute error. For the purposes of comparison, the following error measures were used:
a. The highest prediction error per month E_max, i.e., the highest absolute difference between the real CLOSE value and the value predicted by the ANN within a month:
$$E_{\max} = \max_{k \in \text{month}} \left| C(k) - \hat{C}(k) \right| \qquad (22)$$
where \hat{C}(k) denotes the CLOSE value predicted by the ANN for the k-th day.
b. The arithmetical mean of the monthly E_max values over the tested period of time (one year):
$$\bar{E}_{\max} = \frac{1}{12}\sum_{m=1}^{12} E_{\max}(m)$$
where E_max(m) is the value of E_max in the m-th month.
These error measures were used in the assessment and validation of the tested models. In order to assess the precision and stability of the models, the maximum absolute errors E_max of one-day-ahead predictions were calculated from Equation (22) for each month in the test period. The results obtained for ŻYWIEC SA in the period of one year for the ANN-based model and three hybrid models are given in Figure 6. Analogous results obtained for BANK BPH SA are shown in Figure 7 and for BUDIMEX SA in Figure 8.
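The two error measures described in words above translate directly into code. The sketch below groups daily predictions by calendar month, takes the largest absolute error in each month, and then averages the monthly maxima. The function names and the use of `datetime.date` objects for trading days are choices made for this example.

```python
from collections import defaultdict
from datetime import date


def monthly_emax(dates, actual, predicted):
    """Largest absolute one-day-ahead error within each (year, month)."""
    worst = defaultdict(float)
    for d, c, p in zip(dates, actual, predicted):
        key = (d.year, d.month)
        worst[key] = max(worst[key], abs(c - p))
    return dict(worst)


def mean_emax(emax_by_month):
    """Arithmetical mean of the monthly maxima over the tested period."""
    return sum(emax_by_month.values()) / len(emax_by_month)


if __name__ == "__main__":
    # Three made-up trading days, for illustration only.
    days = [date(2010, 1, 4), date(2010, 1, 5), date(2010, 2, 1)]
    emax = monthly_emax(days, [10.0, 11.0, 12.0], [10.5, 9.0, 12.25])
    print(emax, mean_emax(emax))
```

Because E_max keeps only the worst day of each month, a single badly missed price move dominates the measure, which is why it complements the MSE rather than replacing it.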
The results shown in Figures 6-8 imply that the hybrid approach using technical analysis with fractal indicators combined with the ANN (FA-ANN) ensured the best accuracy for short-term predictions of CLOSE values. The worst results were obtained with the purely ANN-based approach. Similar results were obtained for the other tested companies.
A comparison of the arithmetical means of the monthly E_max errors (absolute and relative) in the tested period of one year (2010) for the five selected companies and for four kinds of ANN-based models is presented in Table 2. These results confirm that the best accuracy was achieved by the hybrid model combining TA, FA, and ANN and the worst using the purely ANN-based approach.

Discussion and Conclusions
In this paper, new structures of hybrid ANN-based models combined with fractal analysis have been proposed for predicting stock exchange share values for the next day. Three kinds of hybrid models were compared. The first combines technical analysis and an ANN. The second combines technical and fractal analyses with an ANN. The third combines fractal analysis with an ANN. The aim was to identify the best model structure. Hybrid ANN-based models combining technical analysis and fractal analysis gave more precise short-term forecasts than the purely ANN-based approach. As can be seen from the results displayed in Tables 1 and 2, the highest precision was achieved by the FA-ANN and TA-FA-ANN models. Marginally worse results were obtained with the TA-ANN model. The worst results were obtained using the purely ANN-based approach. Comparison of the E_max values for the audited period from January to December 2010 given in Figures 6-8 reveals that the proposed hybrid ANN-based prediction models were more accurate over time than the ANN approach. Because technical and fractal analysis are incorporated into the proposed hybrid modeling schemes, the hybrid models were less vulnerable to false signals from the market. The hybrid ANN-based models with fractal analysis produced the best accuracy, and may offer a very useful tool for supporting stock brokers and investors in their decisions.
