1. Introduction
Over the last few decades, the use of non-linear forecasting techniques such as neural networks (NN) and support vector machines (SVM) has increased. The non-linearity of stock market performance has been noted by many researchers, for instance Vrbka and Rowland [1]. In fact, this non-linearity was one of the reasons why Vrbka and Rowland [1] chose neural networks (a non-linear approach) as a model to forecast stock prices on the Prague Stock Exchange. In that paper, the authors successfully applied multilayer perceptron and radial basis networks to that particular stock market. While neural networks are an important stock forecasting technique, they have limitations, like any other technique. In this regard, Horack and Krulicky [2] made an interesting comparison between the exponential time series alignment method and time series alignment with neural networks. The authors highlighted the importance of neural networks in the field of stock forecasting, noting that neural networks generally provide better forecasts than traditional methods. However, they also concluded that in their example, applied to a volatile stock, the traditional forecasting method generated better results than the neural network. This highlights the importance of choosing an appropriate stock forecasting technique, as papers in the existing literature report less than optimal results for some popular techniques. For instance, Groda and Vrbka [3] concluded that the Box–Jenkins method is not appropriate for stocks listed on the Prague Stock Exchange.
An important development in recent years is the very large increase in the data available in many disciplines. It is relatively common to have a large number of variables that can potentially have an impact on non-linear processes; see Guyon and Elisseeff [4]. There is substantial research on variable selection using linear methods, for instance Hocking [5], but these techniques might not be ideal for non-linear modelling. There is relatively little literature covering variable selection for non-linear processes from a combinatorial approach. The basic assumption is that in non-linear processes the way in which different independent variables interact with each other can be highly complex. Identifying which combination of variables works best for a non-linear problem is clearly not trivial; see Yuan and Lin [6]. There are some interesting methods, for instance that of Rech and Terasvirt [7], which uses polynomial approximation. The drawback of this type of approach is that it is only applicable when there is a relatively small number of variables.
Ye and Sun [8] proposed an iterative method in which, starting from all the variables considered, one of the variables is dropped, a neural network is fitted to the resulting set of variables, and the results are then compared.
The stock market, like many other fields, has seen a large increase in the amount of data available. More specifically, many researchers and practitioners have developed a large number of technical indicators. Getting into the details of these indicators is beyond the scope of this article, but it is important to mention that they are typically constructed using historical data such as the closing price of a stock. A moving average is a well-known technical indicator [9,10,11,12,13,14,15]. A simple moving average can be constructed as the average of the closing price of a certain stock or index over a certain period of time. There are many different technical indicators, and strategies based on those indicators, with diverse levels of profitability [16,17,18,19]. Neural networks have been applied successfully to the field of stock market forecasting [20,21,22,23,24,25]. One of the frequently mentioned drawbacks of neural networks is the issue of local minima [26,27,28,29], which can cause the neural network to generalize poorly or, in other words, to generate poor forecasts when faced with new data. In this regard, there has been a focus on reducing the dimensionality of the data to avoid having the neural network stuck in such local minima [30,31,32,33,34]. In this paper we propose a non-linear combinatorial approach for variable selection applied to stock market technical indicators. Given the number of technical indicators available today, it is clear that not all possible combinations can be tested.
In this work we use a pool of 35 technical indicators, so the number of possible combinations is staggering. Because of that, the proposed approach resorts to randomization. The algorithm starts by generating a preset number of combinations, each with a size of half the number of available indicators. The reason for using this size is that the maximum number of combinations for a given n and k, that is, the binomial coefficient n choose k, is attained for k = n/2 (rounded to a neighbouring integer when n is odd). This is evident from inspecting Pascal's triangle, and can be proved using the properties of the Newton binomial. From this initial set of combinations, new combinations are generated randomly, retaining the best ones, in an iterative process that ends when the stop condition is met. The proposed method is shown to improve on the baseline approach, that is, using all the available indicators, when applied to the task of predicting market trends. As a case study for the proposed method, this paper focuses on the Chinese stock market. The Chinese stock market is very dynamic and of increasing importance due to the economic growth of China. The most relevant Chinese stock indexes have been considered. The strategy described in this paper has obtained good results, and better combinations of technical indicators have been identified for these indexes.
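The claim about the binomial coefficient can be checked numerically. The short Python sketch below is only an illustration (the variable names are ours, not the paper's): for a pool of n = 35 indicators it lists the combination sizes k for which n choose k is largest, and compares that count with the total number of subsets.

```python
from math import comb

n = 35  # size of the indicator pool used in this work

# Number of distinct combinations for every possible combination size k.
counts = {k: comb(n, k) for k in range(n + 1)}

# Sizes that attain the maximum; for odd n there are two neighbouring sizes.
largest = max(counts.values())
best_sizes = [k for k, c in counts.items() if c == largest]

print("sizes with the most combinations:", best_sizes)
print("combinations of that size:", largest)
print("all possible subsets:", sum(counts.values()))
```

For odd n the maximum is shared by the two sizes on either side of n/2, which is why "half the number of available indicators" has to be rounded in practice.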
The remainder of the paper is organized as follows: Section 2 presents the proposed approach. Section 3 shows the application of the proposed approach to the Chinese stock market. The results are discussed in Section 4, and Section 5 presents the conclusions of the paper.
2. Technical Indicator Selection Approach
Let
be a tuple of
T values of the
i-th technical indicator from a pool of up to
N nonlinear technical indicators computed at the time period
t (the time period is usually measured in days, but could also be weeks, months, or any other interval), such as a moving average:
where
is the value of the
i-th technical indicator evaluated at time
t.
Let
be a vector that groups the direction of the change of the price of a stock or index, in the period from
to
t, that is
with 0 meaning that the stock increases or remains constant at the end of a period, 1 meaning otherwise. Additionally, let
define a nonlinear mapping from
to
where
is an estimation of
. In order to guarantee that an estimator
can be found, it is necessary to assume the following:
Assumption 1 (Ground truth function existence). It is assumed that, ϕ, a mapping from to exists.
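As a minimal sketch of the quantities defined above, the following Python snippet builds a simple moving average from closing prices, extracts the tuple of the last T values of an indicator at time t, and encodes the direction of the price change as 0 (the price increases or remains constant) or 1 (otherwise). The function names and the toy prices are illustrative assumptions and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def simple_moving_average(close: Sequence[float], window: int) -> List[float]:
    """Average of the last `window` closing prices, for every period where it is defined."""
    return [
        sum(close[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(close))
    ]


def last_values(series: Sequence[float], t: int, T: int) -> Tuple[float, ...]:
    """Tuple of the T most recent values of an indicator, ending at period t (0-based)."""
    return tuple(series[t - T + 1 : t + 1])


def direction_labels(close: Sequence[float]) -> List[int]:
    """0 if the price increases or remains constant over a period, 1 otherwise."""
    return [0 if close[t] >= close[t - 1] else 1 for t in range(1, len(close))]


# Hypothetical closing prices, used only to exercise the functions.
prices = [10.0, 11.0, 11.0, 9.0, 12.0]
sma = simple_moving_average(prices, window=2)
print(sma)
print(last_values(sma, t=3, T=3))
print(direction_labels(prices))
```

Any other technical indicator could replace the moving average here; only the windowing and the 0/1 labelling convention follow the text.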
Technical indicators in the stock market can produce contradictory signals, and some non-linear techniques can suffer from local minima issues when the input variables have high dimensionality (large N or T values). Therefore, it might be convenient to find a combination of the technical indicators rather than using all N available indicators. We present a combinatorial selection approach for technical indicators that does not require the variables to be combined linearly, which makes it suitable for non-linear processes.
The steps are as follows:
Split the available instances of and data into two subsets, an estimation subset and a validation subset . To keep the notation as clear as possible, the term “validation” is used here with the meaning of test.
Generate , a set of M combinations of random numbers in the range denoted from to . Check for repetition within each combination. If there is repetition, leave out the repeated numbers and keep generating random numbers in the aforementioned range until there is no repetition. Similarly, check that there are no repeated combinations and proceed to replace the redundant ones.
For each of the M combinations in , denoted as , perform the following steps:
- (3a)
Compute , i.e., the estimated non-linear classification mapping, using any technique of choice. In this case, the nonlinear mapping will have as arguments . Using neural networks as an example, this step involves training a neural network with the training data set with .
- (3b)
Evaluate over the validation set. Let denote the estimated classification output of .
- (3c)
Calculate the error
of the non-linear classification approach for each instance in the validation set, so that
with the resulting total error being:
where
denotes the cardinality of
. This is therefore the total error for the initial random combination of technical indicators chosen in step 2.
- (3d)
Randomly generate a single value that would denote a technical indicator candidate to be added to the combination chosen in step 2. Check for repetition with the previously generated values. If there is repetition, randomly generate another value . Repeat this step until there is no repetition.
- (3e)
Randomly generate a single value . This value will be used later to denote a technical indicator to be removed from the combination chosen in step 2.
- (3f)
Form a new combination of technical indicator indexes as . In the same fashion, form a new combination as . If some of these combinations are already in repeat either step 3d or 3e until a combination different from those of is obtained.
- (3g)
As in step 3a, compute two new mappings, denoted and , using in this case as input arguments the technical indicators given by the index combinations and , respectively.
- (3h)
As in step 3c compute the resulting total error of evaluating the two mappings computed in the previous step over the validation set. Denote these total errors as and .
- (3i)
Given the three previous index combinations I, , and their resulting total errors , and pick the two combinations with the smallest error.
- (3j)
Substitute the initial combination I by the two combinations picked in the previous step. This will double the number of combinations in set , i.e., it will contain 2M combinations after this step.
Retain the M combinations in with the smallest error and discard the remaining M combinations with greatest error.
Find the minimum of all the combinations .
Repeat step 3 starting from 3d until a certain stop condition is met. Here it is proposed to check if a certain objective error is achieved or a maximum number of iterations is met, so that an infinite loop is avoided.
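The steps above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the authors' implementation: `error_fn` is a hypothetical callable standing in for steps 3a to 3c (training the non-linear classifier on a combination and returning its total validation error); the two new combinations of step 3f are assumed to be the current combination with one index added and with one index removed; and a repeated candidate is skipped rather than regenerated. The toy error function at the end exists only to make the sketch runnable.

```python
import random
from typing import Callable, Dict, FrozenSet, Tuple

Combo = FrozenSet[int]


def select_indicators(
    n_indicators: int,
    m: int,
    error_fn: Callable[[Combo], float],
    max_iter: int = 100,
    target_error: float = 0.0,
    seed: int = 0,
) -> Tuple[Combo, float]:
    rng = random.Random(seed)
    pool = list(range(n_indicators))

    # Step 2: M distinct random combinations of size N // 2
    # (m must not exceed the number of such combinations).
    population: Dict[Combo, float] = {}
    while len(population) < m:
        combo = frozenset(rng.sample(pool, n_indicators // 2))
        if combo not in population:
            population[combo] = error_fn(combo)  # steps 3a-3c

    for _ in range(max_iter):  # step 6: bounded number of iterations
        augmented: Dict[Combo, float] = {}
        for combo, err in population.items():
            trio = {combo: err}
            outside = [i for i in pool if i not in combo]
            if outside:  # steps 3d and 3f: add one unused indicator
                grown = combo | {rng.choice(outside)}
                if grown not in population and grown not in augmented:
                    trio[grown] = error_fn(grown)  # steps 3g-3h
            if len(combo) > 1:  # steps 3e and 3f: remove one indicator
                shrunk = combo - {rng.choice(sorted(combo))}
                if shrunk not in population and shrunk not in augmented:
                    trio[shrunk] = error_fn(shrunk)  # steps 3g-3h
            # Steps 3i-3j: keep the two combinations with the smallest error.
            for kept in sorted(trio, key=trio.get)[:2]:
                augmented[kept] = trio[kept]

        # Step 4: retain the M combinations with the smallest error.
        survivors = sorted(augmented, key=augmented.get)[:m]
        population = {c: augmented[c] for c in survivors}

        # Steps 5-6: stop early if the best error reaches the target.
        if min(population.values()) <= target_error:
            break

    best = min(population, key=population.get)
    return best, population[best]


def toy_error(combo: Combo) -> float:
    """Hypothetical error: indicators 1, 4 and 7 are useful, the rest add noise."""
    useful = {1, 4, 7}
    return len(useful - combo) + 0.1 * len(combo - useful)


best, err = select_indicators(n_indicators=10, m=4, error_fn=toy_error, max_iter=50)
print(sorted(best), err)
```

Because the best combination of each triple is always kept, the smallest error in the set never increases from one iteration to the next. In a real application `error_fn` would wrap the training and validation of a neural network, which dominates the running time.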
Remark 1. This strategy can be easily parallelized, as the tasks in step 3 can be carried out independently for each combination, with one exception: the latter part of step 3f, that is, the rejection of repeated combinations, cannot be done in parallel and must be done sequentially in a separate step outside of step 3.
Remark 2. The number of combinations in , that is M, and the number of iterations are related, in the sense that similar performance can be achieved by using a greater M and fewer iterations, or vice versa. Besides the differences in performance due to the randomized nature of the proposed strategy, a practical consideration can favor one choice over the other: with a greater M one can exploit the parallelizable nature of the algorithm, whereas iterations must be run one after another and cannot be parallelized.
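The split described in Remark 1 can be illustrated with a short Python sketch: duplicates are rejected sequentially first, and the independent error evaluations are then distributed over a pool of workers. The helper name `evaluate_in_parallel` and the use of a thread pool are our assumptions; for CPU-bound training in pure Python, a process pool (`concurrent.futures.ProcessPoolExecutor`) may be the better choice.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, FrozenSet, Iterable

Combo = FrozenSet[int]


def evaluate_in_parallel(
    combos: Iterable[Combo],
    error_fn: Callable[[Combo], float],
    max_workers: int = 4,
) -> Dict[Combo, float]:
    # Sequential part: reject repeated combinations, keeping the first occurrence.
    unique = list(dict.fromkeys(combos))
    # Parallel part: each combination is evaluated independently of the others.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        errors = list(executor.map(error_fn, unique))
    return dict(zip(unique, errors))


# Hypothetical usage with a stand-in error function (the combination size).
candidates = [frozenset({1, 2}), frozenset({1, 2}), frozenset({3})]
print(evaluate_in_parallel(candidates, error_fn=len))
```

With a greater M, more evaluations are available per iteration to keep the workers busy, which is the practical trade-off raised in Remark 2.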
To illustrate the proposed approach, consider a simple example with just two combinations (i.e., M = 2) and one iteration. Assume, for instance, that there is a total of 6 technical indicators to choose from, i.e.,
, where the sub-index 5 means that each of the indicators are evaluated over five periods of time. There is also a related classification vector
identifying up and down movements in the stock at each time t (in this case
). First, the algorithm selects randomly 2 combinations of 3 initial technical indicator indexes. For example:
Then the non-linear classification error is estimated for this configuration; let us assume that the obtained value is (with a slight abuse of notation the parameters of each combination will denote in this example the set of the parameters of both combinations):
Then the indexes
and
are computed for each combination:
The addition of the new technical indicator index is done randomly, ensuring that there is no repetition, i.e., each input variable (technical indicator) is used only once. The index removed is also computed randomly. Then the combinations
and
are formed:
Note that in every combination generated so far no index is repeated and that there are no repeated combinations. Once the new combinations are formed, their total classification errors are computed:
For every combination in
we choose the two combinations (from
I,
and
) with the smallest error. From the first combination we then choose
and
. On the other hand, from the second combination we choose
and
. Therefore, the augmented
will consist of:
with their errors:
We order the combinations using the error as sort criterion:
Now we can reduce the number of combinations to its original value (M = 2), picking the two with the smallest error:
with errors:
Then at the end of this first iteration the algorithm would pick the combination as the best technical indicator combination, as it has the lowest error (). Thus, the technical indicators proposed to forecast the direction of the change of price will be .
In practice, the number of combinations and iterations will be determined by the available computing power. The process is thus repeated until a stopping criterion is reached (that is, a maximum number of iterations or a target classification error).
4. Discussion
The proposed method can be a feasible approach when trying to determine a combination of variables or features to be used when forecasting the behaviour of non-linear processes. In the particular example of the stock market, there is a very large number of technical indicators that are intended to give the investor some indication of the future performance of the stock. These indicators can generate contradictory signals, and selecting the appropriate combination of technical indicators can become a difficult task. Reducing the dimensionality of the problem is also important to avoid issues such as local minima, which can cause poor generalization when applying techniques such as neural networks. We showed in this paper that it is possible to use our approach in the Chinese stock market (generating an appropriate combination of independent variables for non-linear models), obtaining better results than directly applying neural networks to all the available independent variables. This was tested using six Chinese stock indexes (as well as two international indexes) and 35 technical indicators.
There was an average 9.1% improvement when using the combinatorial approach with neural networks over the results obtained by applying neural networks directly to all the technical indicators. The formal statistical analysis comparing the two sets of results shows that there are statistically significant differences in the error rates obtained at the 1%, 5% and 10% significance levels, supporting once more the hypothesis that the combinatorial approach using neural networks is a more appropriate tool for forecasting the direction of the stock market movement, at least for the 8 indexes analyzed, than using neural networks directly. Thus, for eight different indexes, better combinations of the technical indicators have been found, offering a practical choice to improve the forecasting accuracy and hence the expected benefits. The total calculation time per index (100 initial configurations times 2500 iterations) was 157,691. While this is a substantial amount of time, it is a calculation that can be done on a normal laptop computer. Moreover, many of the operations of the proposed algorithm can be done in parallel, further shortening the computation times.
While direct comparison is challenging, the combinatorial approach with neural networks seems to generate better results for stock forecasting purposes than other approaches used in the existing literature, for instance the Box–Jenkins approach used by Groda and Vrbka [3], which those authors considered not suitable. A more comparable paper is Kim and Han [36], which achieved a hit rate of 61% in the Korean market using a genetic algorithm in combination with neural networks, comparable to the 59% rate that we obtained in the Chinese market. Nevertheless, comparisons across different stock markets should be treated with caution. For instance, it would be naïve to believe that the same approach would generate the same results in two markets as different as South Korea and China, with South Korea being an open stock market dominated by institutional investors while the Chinese market is dominated by local retail investors.
The selection approach was illustrated in the context of the stock market and using neural networks, but the approach is easily applicable to other fields. This is increasingly important as the amount of data available in many fields has increased substantially over the last few decades, with an ever increasing need for tools to process large databases. Besides neural networks, other non-linear models, such as support vector machines, can be used with the proposed approach. This could be an interesting area of future work.
5. Conclusions
The proposed combinatorial method for variable selection is applicable to the problem of forecasting the direction of stock market movements using non-linear techniques such as neural networks. This approach generates better results than directly applying non-linear forecasting methods such as neural networks using all the available variables. For a large number of technical indicators (independent variables), it is clearly not possible to estimate the forecasts for all the combinations, and the proposed method provides a reasonable alternative. In fact, it has been shown that using all the available indicators can be counterproductive, as it has a higher error than a pure random approach. Another relevant result is that better indicator choices have been identified for 8 different indexes. The calculation time for the combinatorial approach is another factor to take into account, as it is computationally demanding but clearly more efficient, from a calculation time point of view, than estimating the forecasts for all the possible combinations.
The combinatorial approach was tested thoroughly for the Chinese stock market as well as for some indexes describing the US and European stock markets. The forecasting accuracy of this approach in other markets might be different, and this is a potential area of future work. There are several factors that could impact the forecasting accuracy of this approach. For instance, narrow and deep markets might behave differently, and hence the combinatorial approach might also have rather different forecasting accuracy in each.
Overall, when there are many potential variables (technical indicators) that can affect the performance of the stock market and no strong fundamental support for choosing a specific combination of these variables, the proposed approach can be an appropriate alternative. Similarly, while the approach was tested using neural networks it can be easily applied to other non-linear forecasting techniques such as support vector machines. It can also be generalized to other forecasting problems besides stocks. It is in that sense a rather general approach with many potential applications.