An IPSO-FW-WSVM Method for Stock Trading Signal Forecasting

Trading signal detection is a very popular yet challenging research topic in the financial investment area. This paper develops a novel method integrating piecewise linear representation (PLR), improved particle swarm optimization (IPSO) and a feature-weighted support vector machine (FW-WSVM) to analyze the nonlinear relationships between trading signals and the stock data hidden in historical data. First, PLR is applied to generate numerous turning points (valleys or peaks) from the historical data. The prediction of these turning points is formulated as a three-class classification problem. Then, IPSO is utilized to find the optimal parameters of FW-WSVM. Lastly, we conduct a series of comparative experiments between IPSO-FW-WSVM and PLR-ANN on 25 stocks with 2 different investment strategies. The experimental results show that our proposed method achieves higher prediction accuracy and profitability, which indicates that the IPSO-FW-WSVM method is effective in the prediction of trading signals.


Introduction
Designing and implementing various predictive techniques for complex data analysis has attracted great scientific interest [1,2]. Among the research, stock trading point prediction has been an attractive yet challenging research topic. Therefore, many investors and researchers have conducted plenty of research in this field. However, the stock market is a nonlinear, changeable and complex system affected by many factors such as government policies, international situations, economic environments, interest rates and market capitalization. Despite the volatility of the stock market, researchers are still trying to develop more effective prediction techniques due to the benefits involved in accurate predictions.
With regard to the techniques used to find optimal trading points, some are based on financial analysis, and others are artificial intelligence methods. Financial analysis usually relies on two main methods: fundamental and technical analysis [3,4]. Fundamental analysis uses macroeconomic, industrial and business indicators to predict trend reversals [5][6][7]. Technical analysis assumes that past behavior has an impact on price evolution, so trading decisions are made based on historical prices and useful technical indicators such as moving averages and the relative strength index [8][9][10]. Financial time series data are inherently dynamic and nonlinear [11,12] and do not follow fixed patterns. Therefore, making the right trading decisions through financial analysis alone is very difficult. In contrast, artificial intelligence algorithms excel at handling dynamic and nonlinear data in financial markets and are widely applied to predict trading points.
Over the past few decades, many artificial intelligence and machine learning algorithms have been developed as major tools in the financial investment field, such as artificial neural networks [13][14][15], support vector machines [16][17][18][19], rough set theory [20][21][22][23], Bayesian analysis [24][25][26][27] and evolutionary learning algorithms [28][29][30][31]. However, most past studies have focused on the accurate prediction of stock prices rather than trading decisions, since it is more difficult to predict the buying and selling points of stocks than to predict the changes in stock prices.
In recent years, artificial neural networks have been widely used to predict the turning points of a stock. For example, Chiang et al. proposed an adaptive intelligent stock trading decision support system that uses a particle swarm algorithm and a neural network to predict the future movement direction [32]. This method overcame the weaknesses of input selection and parameter setting in the traditional ANN approach. However, artificial neural networks have many disadvantages, such as their black-box nature, a tendency to overfit, high computational complexity, slow convergence and a tendency to fall into local minima. To overcome these disadvantages, the SVM [33] has attracted much attention and has become a very popular research method in the financial investment field. Luo et al. made a set of improvements to the PLR-WSVM by adding relative indicators carrying more valuable information to the input variables [34], simplifying turning point prediction into a two-class problem, adapting the threshold values of PLR and using the WSVM to find the stock trading signals. Chen and Hao proposed a novel method, PLR-FW-WSVM, to predict trading points [35]. Although the SVM has good generalization performance, its classification performance and generalization ability depend on whether its parameters are chosen appropriately [36,37]. The traditional grid search algorithm is inefficient, computationally expensive and time-consuming, and its results are often unsatisfactory [38]. PSO has not only a global optimization ability but also efficient convergence and a strong local optimization ability [39]. The optimal solution can be found through coordination and information exchange among individuals. To find the optimal parameters of the FW-WSVM model, this paper proposes improved particle swarm optimization (IPSO) and integrates it into FW-WSVM to find trading signals in the financial field (IPSO-FW-WSVM).
First, the turning points are generated by a PLR algorithm. Secondly, the information gain is calculated in order to set the weight of each feature. Lastly, we utilize IPSO to optimize the FW-WSVM model parameters for stock trading decision.
The rest of this paper is organized as follows. In Section 2, we briefly present the theories of PLR, the FW-WSVM based on the information gain and FW-WSVM parameter optimization based on IPSO. In Section 3, we present the research design, which contains input variable selection, data labeling and performance measuring. Section 4 presents some experimental results to validate the performance of our proposed method. Section 5 gives a summary of this work and some brief future work.

Methodology
This paper utilizes the IPSO-FW-WSVM method to detect trading signals in financial time series data. First, PLR generates trading points from the historical stock price database. Secondly, the whole dataset is divided into many overlapping training-testing sets, which reduces the effect of the time-varying characteristics of stock data. Thirdly, the IPSO-FW-WSVM model is applied to learn the relationship between the input features and trading signals. Finally, the trained model is used to compute the prediction accuracy and to evaluate the profitability with two investment strategies. The flow chart of the proposed framework is shown in Figure 1.

PLR
PLR was developed for pattern matching, and it can be used to generate turning points in financial time series data. Let $X = \{x_1, x_2, \ldots, x_N\}$ denote the financial time series data. $X$ can be divided into $M$ segments, which are expressed as follows:

$$S_{PLR} = \{L_1(x_1, \ldots, x_{t_1}), L_2(x_{t_1+1}, \ldots, x_{t_2}), \ldots, L_M(x_{t_{M-1}+1}, \ldots, x_{t_M})\}$$

where $L_i$ is the linear function representing the $i$th segment and $t_i$ ($i = 1, 2, \ldots, M$) is the end of the $i$th segment. Therefore, $S_{PLR}$ provides many segments, each belonging to an uptrend or a downtrend, with low and high price points, as shown in Figure 2. The threshold of segment representation is a significant parameter affecting the results of PLR. Figure 3 shows the segmentations obtained with different threshold values for the Shenzhen Stock Exchange Component Index (SZSE COMP SUB IND). As shown in Figure 3, a higher threshold produced longer trend patterns and only a few segments, while a smaller threshold produced many segments. Therefore, it is more reasonable to specify different thresholds for different price fluctuations. In this paper, we used the SD algorithm [34] to select the threshold automatically, and the thresholds for different price fluctuations were calculated according to the percentage of turning points specified by the parameter pct.
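The effect of the threshold can be illustrated with a minimal sketch. It assumes a common top-down variant of PLR, in which a segment is split at the point farthest (in vertical distance) from the straight line joining its end points whenever that distance exceeds the threshold. The function name `plr_turning_points` and the fixed scalar threshold are illustrative assumptions; the SD-based automatic threshold selection of [34] is not reproduced here.

```python
from typing import List


def plr_turning_points(prices: List[float], threshold: float) -> List[int]:
    """Return sorted indices of segment end points (candidate turning points).

    Assumed top-down splitting rule: a segment [lo, hi] is split at the point
    whose vertical distance from the line joining its end points is largest,
    as long as that distance exceeds `threshold`.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices")

    def split(lo: int, hi: int) -> List[int]:
        if hi - lo < 2:          # no interior point left to split on
            return []
        slope = (prices[hi] - prices[lo]) / (hi - lo)
        best_idx, best_dist = lo, 0.0
        for k in range(lo + 1, hi):
            dist = abs(prices[k] - (prices[lo] + slope * (k - lo)))
            if dist > best_dist:
                best_idx, best_dist = k, dist
        if best_dist <= threshold:
            return []            # the segment is already close to a line
        return split(lo, best_idx) + [best_idx] + split(best_idx, hi)

    return [0] + split(0, len(prices) - 1) + [len(prices) - 1]


prices = [1, 2, 3, 4, 3, 2, 1, 2, 3]
print(plr_turning_points(prices, threshold=0.5))   # small threshold: more segments
print(plr_turning_points(prices, threshold=10.0))  # large threshold: one segment
```

A small threshold keeps the interior peak and valley as segment boundaries, whereas a large threshold collapses the series into a single trend, which mirrors the behavior described for Figure 3.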

FW-WSVM Based on the Information Gain
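In this section, a brief introduction to the FW-WSVM method based on information gain is given; more details can be found in [35]. Let $T_{train} = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$ be the training dataset, where $x_i = (x_i^1, x_i^2, \ldots, x_i^n) \in X = R^n$ denotes the $i$th ($i = 1, 2, \ldots, N$) input vector, $n$ is the dimension of the input vector and $y_i \in Y = \{-1, +1\}$ denotes the label. The dataset $T_{train}$ is separated into two classes by the hyperplane $\omega \cdot \varphi(xP) + b = 0$, where $P$ is an $n \times n$ feature-weighted matrix. The classifier should satisfy the following conditions:

$$y_i(\omega \cdot \varphi(x_i P) + b) \geq 1 - \zeta_i, \quad \zeta_i \geq 0, \quad i = 1, 2, \ldots, N$$

where $\varphi : R^n \rightarrow R^m$ maps the input space $R^n$ to the high-dimensional feature space $R^m$ and $\zeta_i$ is the slack variable. Therefore, the objective optimization can be expressed as follows:

$$\min_{\omega, b, \zeta} \; \frac{1}{2}\|\omega\|^2 + \sum_{i=1}^{N} C_i \zeta_i$$

subject to:

$$y_i(\omega \cdot \varphi(x_i P) + b) \geq 1 - \zeta_i, \quad \zeta_i \geq 0, \quad i = 1, 2, \ldots, N$$

where $C_i$ is the weighted constant parameter, which is defined as follows:

$$C_i = v_i C$$

where $v_i$ is the weight of the instance $x_i$. Thus, the corresponding Lagrangian function is Equation (5):

$$L(\omega, b, \zeta, \alpha, \mu) = \frac{1}{2}\|\omega\|^2 + \sum_{i=1}^{N} C_i \zeta_i - \sum_{i=1}^{N} \alpha_i \left[ y_i(\omega \cdot \varphi(x_i P) + b) - 1 + \zeta_i \right] - \sum_{i=1}^{N} \mu_i \zeta_i \quad (5)$$

where $\alpha_i$ and $\mu_i$ are the non-negative Lagrange multipliers. Therefore, the dual problem can be expressed as follows:

$$\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i P, x_j P)$$

subject to:

$$\sum_{i=1}^{N} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C_i, \quad i = 1, 2, \ldots, N$$

where $K(x_i P, x_j P) = \varphi(x_i P) \cdot \varphi(x_j P)$ is the kernel function. The decision function of the classification problem can be obtained with Equation (7):

$$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{N_S} \alpha_i y_i K(x_i P, xP) + b \right) \quad (7)$$

where $N_S$ is the number of support vectors and $P$ is the feature-weighted matrix.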
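To make the role of the feature-weighted matrix concrete, the following minimal sketch evaluates a feature-weighted kernel and the resulting decision function. It rests on two assumptions that the text does not confirm: $P$ is diagonal, so it can be stored as a list of per-feature weights, and the kernel is an RBF kernel of the form $\exp(-\|uP - vP\|^2 / (2\delta^2))$. The names `weighted_rbf` and `decide` are hypothetical, and the multipliers are supplied by hand rather than obtained by solving the dual problem.

```python
import math
from typing import Sequence


def weighted_rbf(x: Sequence[float], z: Sequence[float],
                 weights: Sequence[float], delta: float) -> float:
    """RBF kernel on feature-weighted inputs (assumed form, diagonal P)."""
    sq_dist = sum((w * (a - b)) ** 2 for a, b, w in zip(x, z, weights))
    return math.exp(-sq_dist / (2.0 * delta ** 2))


def decide(x, support_vectors, alphas, labels, b, weights, delta) -> int:
    """Sign of sum_i alpha_i * y_i * K(x_i P, x P) + b over the support vectors."""
    score = sum(a * y * weighted_rbf(sv, x, weights, delta)
                for sv, a, y in zip(support_vectors, alphas, labels))
    return 1 if score + b >= 0 else -1


# Hand-picked toy values (not a trained model).
svs = [[0.0, 0.0], [2.0, 0.0]]
print(decide([0.1, 0.0], svs, alphas=[1.0, 1.0], labels=[1, -1],
             b=0.0, weights=[1.0, 1.0], delta=1.0))
```

A feature with weight zero has no influence on the kernel value, so uninformative features are effectively ignored by the classifier.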
How to obtain the weighted matrix $P$ is the key to the FW-WSVM algorithm. In this research, information gain is used to determine the weighted matrix $P$. Let $|C_{\{i,T_{train}\}}|$ be the size of the dataset $C_{\{i,T_{train}\}}$. The probability of a sample belonging to class $C_{+1}$ or $C_{-1}$ can be approximated by $|C_{\{i,T_{train}\}}|/N$. The expected information is as follows:

$$Info(T_{train}) = -\sum_{i \in \{-1,+1\}} \frac{|C_{\{i,T_{train}\}}|}{N} \log_2 \frac{|C_{\{i,T_{train}\}}|}{N}$$

Let $F_{feature}$ be the feature selected to split the set $T_{train}$, and let $F_{feature}$ have $v$ different values. It produces $v$ subsets, denoted by $D_1, D_2, \ldots, D_v$. Then, the expected information is as follows:

$$Info_{F_{feature}}(T_{train}) = -\sum_{j=1}^{v} \frac{|D_j|}{N} \sum_{i \in \{-1,+1\}} \frac{|C_{\{i,D_j\}}|}{|D_j|} \log_2 \frac{|C_{\{i,D_j\}}|}{|D_j|}$$

where $|D_j|$ is the size of the set $D_j$, $|C_{\{i,D_j\}}|$ is the number of samples belonging to the class $C_{+1}$ or $C_{-1}$ in set $D_j$ and $i = -1, +1$, $j = 1, 2, \ldots, v$. Thus, the information gain is as follows:

$$InfoGain(F_{feature}) = Info(T_{train}) - Info_{F_{feature}}(T_{train})$$

The greater the information gain of a feature, the more important the feature is and the greater its contribution to classification. Consequently, the feature-weighted matrix $P$ is as follows:

$$P = \mathrm{diag}\left(InfoGain(f_1), InfoGain(f_2), \ldots, InfoGain(f_n)\right)$$

where $InfoGain(f_i)$ describes the weight of each feature and $i = 1, 2, \ldots, n$.
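The information gain computation can be sketched in a few lines. The sketch assumes that every feature has already been discretized into a finite set of values (the text does not say how continuous technical indicators are binned) and that $P$ is the diagonal matrix of information gains. The function names `entropy`, `info_gain` and `feature_weight_matrix` are illustrative.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Expected information of a label set, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(feature_values: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Entropy of the labels minus the entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def feature_weight_matrix(columns: List[Sequence[Hashable]],
                          labels: Sequence[Hashable]) -> List[List[float]]:
    """Diagonal matrix whose ith diagonal entry is the gain of feature i (assumed form of P)."""
    gains = [info_gain(col, labels) for col in columns]
    n = len(gains)
    return [[gains[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


labels = [1, 1, -1, -1]
informative = ["a", "a", "b", "b"]    # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]  # carries no class information
print(feature_weight_matrix([informative, uninformative], labels))
```

In this toy example the first feature receives weight 1 and the second receives weight 0, so only the first feature would contribute to the weighted kernel.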

FW-WSVM Parameter Optimization Based on IPSO
In the FW-WSVM algorithm, the value of the factor C and the value of δ in the kernel function strongly affect the performance of the system. It is difficult to choose optimal values of these two parameters (C and δ) from expert experience alone. To find the parameter values with the smallest generalization error, PSO is used to optimize the parameter selection of the FW-WSVM model.
The PSO algorithm is a heuristic search algorithm which was derived from the flocking behavior of insects, herds of animals, flocks of birds, schools of fish, etc. The algorithm searches the solution space of the problem by simulating the foraging behavior of birds. In the PSO algorithm, each particle moves at a certain speed in the search space, changes its position according to the fitness value in the environment and shares the information with other particles.
Let the total number of particles be $n$, the dimension be $d$, the position of the $i$th particle be $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})$ and the corresponding velocity be $v_i = (v_{i1}, v_{i2}, \ldots, v_{id})$. In each iteration, the particle updates its position by tracking two optimal solutions. The first one is the individual historical optimal solution, indicated as pbest, where $p_i = (p_{i1}, p_{i2}, \ldots, p_{id})$, and the other is the global optimal solution, indicated as gbest, where $p_g = (p_{g1}, p_{g2}, \ldots, p_{gd})$. The particle updates its velocity and position as follows:

$$v_{id}^{k+1} = \omega_k v_{id}^{k} + c_{1k} \gamma_1 (p_{id} - x_{id}^{k}) + c_{2k} \gamma_2 (p_{gd} - x_{id}^{k}) \quad (12)$$

$$x_{id}^{k+1} = x_{id}^{k} + \beta_k v_{id}^{k+1} \quad (13)$$

where $\omega_k$ is the inertia weight, $c_{1k}$ and $c_{2k}$ are the local search ability and global search ability learning factors, respectively, $\gamma_1$ and $\gamma_2$ are random numbers in $[0, 1]$ and $\beta_k$ is the factor that speeds up the convergence of the algorithm when updating the positions of particles: where $T_{max}$ is the maximum evolutionary quantity. The setting of the inertia weight has a great influence on the convergence speed of the algorithm. A larger inertia weight gives a stronger global search ability for jumping out of local optimal solutions, while a smaller inertia weight is conducive to local searches and increases the convergence speed of the algorithm. Because the inertia weight has a great influence on the performance of the algorithm, there have been many studies on it [40][41][42][43]. One of the more successful approaches is the linearly adjusted inertia weight particle swarm algorithm [44], in which the inertia weight decreases linearly as the number of iterations increases. However, the actual search process of the PSO algorithm is nonlinear and highly complex, and the strategy of linearly decreasing the inertia weight often cannot reflect the actual optimal search process.
To overcome the deficiency of a linearly decreasing weight, this paper proposes a nonlinear decreasing strategy: where $\omega_{max}$ and $\omega_{min}$ are the maximum and minimum of the inertia weight, respectively, and $\omega_k$ is the inertia weight at the $k$th iteration.
In this paper, we balance the global and local searching of particles by changing the learning factors $c_{1k}$ and $c_{2k}$ [45]. In the early stage of a search, a larger cognitive coefficient $c_{1k}$ and a smaller social coefficient $c_{2k}$ are set so that individual cognition dominates and the particles can explore new areas of a larger search space. In the later stage, a larger social coefficient and a smaller cognitive coefficient are set. The update formulas of $c_{1k}$ and $c_{2k}$ are as follows: The flow of optimization of the FW-WSVM parameters using IPSO is illustrated in Algorithm 1.

Algorithm 1: Description of FW-WSVM parameter optimization based on IPSO
Step 1: Determine the range of C and δ in FW-WSVM.
Step 3: Train the FW-WSVM model with the training set. The parameters C and δ vary as the particles move.
Step 4: Check whether the desired accuracy has been reached. If so, output the optimal combination of the parameters C and δ of the FW-WSVM model and go to Step 6; otherwise, go to Step 5 and continue iterating.
Step 5: Update the parameters.
substep 1: Update pbest by comparing the current fitness value of the particle with its individual historical optimal value.
substep 2: Update gbest by comparing the current fitness value of the particles with the global optimal value of the population.
substep 3: Update the particle velocity according to Equation (12).
substep 4: Update the particle position according to Equation (13).
Step 6: Substitute C and δ into FW-WSVM, train the model with the training set and output the trained model.
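The loop in Algorithm 1 can be sketched as follows. Several parts of this sketch are assumptions rather than the formulas of this paper: the inertia weight follows an assumed quadratic decay from `w_max` to `w_min`, the learning factors change linearly between `c_hi` and `c_lo`, the position factor β is taken as 1, the loop stops after a fixed number of iterations instead of at a target accuracy, and `toy_fitness` is a hypothetical stand-in for the validation accuracy of an FW-WSVM trained with the candidate (C, δ).

```python
import random
from typing import Callable, List, Sequence, Tuple


def ipso(fitness: Callable[[Sequence[float]], float],
         bounds: List[Tuple[float, float]],
         n_particles: int = 20, t_max: int = 50,
         w_max: float = 0.9, w_min: float = 0.4,
         c_hi: float = 2.5, c_lo: float = 0.5, seed: int = 0):
    """Maximize `fitness` over the box `bounds` with time-varying PSO coefficients."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for k in range(t_max):
        frac = k / t_max
        w = w_min + (w_max - w_min) * (1.0 - frac) ** 2  # assumed nonlinear decay
        c1 = c_hi - (c_hi - c_lo) * frac                 # cognitive: large -> small
        c2 = c_lo + (c_hi - c_lo) * frac                 # social: small -> large
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update with beta = 1, clamped to the search range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:                       # update pbest
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:                      # update gbest
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def toy_fitness(params: Sequence[float]) -> float:
    """Hypothetical surrogate for model accuracy, peaking at C = 10, delta = 0.5."""
    c, delta = params
    return -((c - 10.0) ** 2 + (delta - 0.5) ** 2)


best, best_fit = ipso(toy_fitness, bounds=[(0.1, 100.0), (0.01, 10.0)])
print(best, best_fit)
```

In practice the fitness function would train the classifier on the training set and return a validation score, which makes each evaluation far more expensive than in this toy setting.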

Input Variable Selection
Let $S$ be the dataset, which is described as follows: where $TI_{i,j}$ represents the technical indicators.

Data Labeling
We used PLR to generate class labels. We first used PLR to generate turning points for the stock data and then classified the data points into three categories: valley turning points, peak turning points and other points. A valley turning point was marked as a buying point, a peak turning point was marked as a selling point and all other points were marked as holding points. The holding, buying and selling points were numbered zero, one and two, respectively. After generating the class labels $y_i \in \{0, 1, 2\}$, the input variables were scaled to the range [0, 1] using the standard min-max formula. The dataset $S$ was reformulated as follows:
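The labeling and scaling steps can be sketched as follows. The sketch assumes that the turning point indices are already available, for example from a PLR segmentation, and that an interior turning point is a valley when its price is below both neighboring turning points and a peak when it is above both. The treatment of the first and last points (left as holding points) is an assumption, and the function names are illustrative.

```python
from typing import List, Sequence

HOLD, BUY, SELL = 0, 1, 2


def label_from_turning_points(prices: Sequence[float],
                              turning: Sequence[int]) -> List[int]:
    """Label valleys as BUY, peaks as SELL and every other day as HOLD."""
    labels = [HOLD] * len(prices)
    for prev, cur, nxt in zip(turning, turning[1:], turning[2:]):
        if prices[cur] < prices[prev] and prices[cur] < prices[nxt]:
            labels[cur] = BUY
        elif prices[cur] > prices[prev] and prices[cur] > prices[nxt]:
            labels[cur] = SELL
    return labels


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Scale one input variable to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]


prices = [1, 2, 3, 4, 3, 2, 1, 2, 3]
print(label_from_turning_points(prices, turning=[0, 3, 6, 8]))
print(min_max_scale([2.0, 4.0, 6.0]))
```

To avoid look-ahead bias, the minimum and maximum used for scaling a testing set would normally be taken from the corresponding training set; the sketch scales a single column for brevity.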

Performance Measure
The most important goal of stock forecasting is to obtain high and stable profits. We used two investment strategies to objectively test the profitability of the forecasts. Let $b_{stock}(i)$ be the balance number of shares, $b_{money}(i)$ be the balance money, $Avg_i$ be the average price for day $i$, $v_{money}(i)$ be the total investment money until day $i$ and $c_{buy}$ and $c_{sell}$ be the transaction cost rates of buying and selling, respectively.

Strategy 1:
This investment strategy is used to assess the benefits of having a large amount of funds.
(1) Buying strategy: If the forecast signal is a buying signal ($y_i = 1$), then the investors spend $Need_i = 100 \times (Avg_i \times (1 + c_{buy}))$ to buy 100 shares. After buying the stock, $b_{stock}(i)$, $v_{money}(i)$ and $b_{money}(i)$ are calculated as follows:
(2) Selling strategy: If the forecast signal is a selling signal ($y_i = 2$) and $b_{stock}(i) > 0$, then the investors sell all their shares. After selling the stock, $b_{money}(i)$ and $b_{stock}(i)$ are calculated according to Equations (24) and (25), respectively:

Strategy 2:
This investment strategy is used to evaluate the return of having limited capital. The initial investment capital is set by v money (1) = b money (1) = 10,000, expressed in CNY.
(1) Buying strategy: If $y_i = 1$, then the investors spend the balance money to buy $NeedBuy_i = \lfloor b_{money}(i) / Need_i \rfloor \times 100$ shares, where $\lfloor x \rfloor$ denotes the largest integer that does not exceed $x$. After buying the stock, $b_{stock}(i)$, $v_{money}(i)$ and $b_{money}(i)$ are calculated as follows:
(2) Selling strategy: If $y_i = 2$ and $b_{stock}(i) > 0$, then the investors always sell all their shares. Thus, $b_{money}(i)$ and $b_{stock}(i)$ are updated according to Equations (24) and (25), respectively.
At the end of an investment cycle, all shares must be sold on the last day. The profit of this strategy is calculated as follows:
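The bookkeeping of Strategy 2 can be sketched as a short simulation. The rules taken from the text are the initial capital of CNY 10,000, buying in lots of 100 shares at a cost of 100 × Avg × (1 + c_buy) per lot, selling all shares on a selling signal and liquidating on the last day. The sketch additionally assumes that selling yields shares × Avg × (1 − c_sell) and that the profit is reported as the relative gain over the initial capital; both are assumptions, not the equations of this paper, and `simulate_strategy2` is a hypothetical name.

```python
import math
from typing import Sequence


def simulate_strategy2(avg_prices: Sequence[float], signals: Sequence[int],
                       capital: float = 10000.0,
                       c_buy: float = 0.0015, c_sell: float = 0.0015) -> float:
    """Return the relative profit of the limited-capital strategy."""
    money, shares = capital, 0
    last = len(avg_prices) - 1
    for i, (price, signal) in enumerate(zip(avg_prices, signals)):
        if signal == 1 and i != last:
            need = 100 * price * (1 + c_buy)      # cost of one 100-share lot
            lots = math.floor(money / need)       # as many lots as the balance allows
            shares += 100 * lots
            money -= lots * need
        elif shares > 0 and (signal == 2 or i == last):
            money += shares * price * (1 - c_sell)  # assumed proceeds formula
            shares = 0
    return (money - capital) / capital


# Buy on day 0 at 10.0 and sell on day 1 at 12.0.
print(simulate_strategy2([10.0, 12.0], [1, 2]))
```

With the default cost rate of 0.0015 on both sides, only nine lots are affordable on the first day, so the realized profit is lower than the 20% price rise.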

Data Collection and Experimental Set-Up
To demonstrate the IPSO-FW-WSVM model's performance, 25 stocks were randomly selected from the Shanghai and Shenzhen exchange markets. Table 2 describes the collected datasets. The time span for these stocks was from 1 June 2012 to 30 June 2014. The 25 stocks could be divided into 3 types according to the change rate of the closing price: uptrend, downtrend and steady trend. If the rate of change of the closing price from the starting day to the end of the test period was higher than 10%, then the stock was classified as an uptrend. If the rate of change was lower than −10%, then it was classified as a downtrend; otherwise, it was classified as a steady trend.
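The grouping rule can be written as a small helper. The sketch assumes symmetric thresholds of +10% and −10% on the relative change of the closing price, which is the reading under which a steady trend category can exist; `classify_trend` is an illustrative name.

```python
def classify_trend(first_close: float, last_close: float,
                   threshold: float = 0.10) -> str:
    """Classify a stock by the relative change of its closing price."""
    change = (last_close - first_close) / first_close
    if change > threshold:
        return "uptrend"
    if change < -threshold:       # assumed symmetric lower threshold
        return "downtrend"
    return "steady"


print(classify_trend(10.0, 12.0), classify_trend(10.0, 8.5), classify_trend(10.0, 10.5))
```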
In the experiments, MATLAB R2016b and libsvm-3.11 [50] were used. Table 3 lists the parameters used for the Shanghai and Shenzhen stock exchange data, where the size of each training set was 220, the size of each testing set was 20, the transaction fee rate for buying and selling was 0.0015 and the parameter pct was 0.35. Table 4 lists the parameters used in IPSO. Table 5 lists the technical indicators used as input variables.
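The overlapping training-testing sets can be generated with a sliding window. The window sizes of 220 and 20 come from the experimental set-up; advancing the window by the size of one testing set, so that consecutive training sets overlap and the testing sets do not, is an assumption of this sketch, as is the name `rolling_windows`.

```python
from typing import List, Tuple


def rolling_windows(n_samples: int, train_size: int = 220,
                    test_size: int = 20) -> List[Tuple[range, range]]:
    """Return (train_indices, test_indices) pairs for a sliding-window split."""
    windows = []
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        windows.append((train, test))
        start += test_size        # assumed step: one testing set per shift
    return windows


for train, test in rolling_windows(280):
    print(train[0], train[-1], test[0], test[-1])
```

Each model is trained only on the 220 days immediately preceding its testing set, which limits the influence of older market regimes on the predictions.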

Experimental Results
In this section, we conduct comparative experiments between the IPSO-FW-WSVM and PLR-ANN models to illustrate the effectiveness of the proposed model. We used the neural network toolbox in MATLAB R2016b to construct the compared ANN model, which had a three-layered feedforward structure, and the number of neurons in the hidden layer was selected by a search using five-fold cross-validation.
Tables 6-8 list the comparison results between the IPSO-FW-WSVM and PLR-ANN models on the testing sets for the uptrend stocks, the steady trend stocks and the downtrend stocks. Table 6 shows that the IPSO-FW-WSVM model outperformed the PLR-ANN model on all five uptrend stocks listed in both trading point prediction accuracy and trading profit. Over the five stocks, the average accuracy of trading signal prediction was 52.85% for the IPSO-FW-WSVM method and 46.61% for the PLR-ANN model. The IPSO-FW-WSVM method was also more accurate than the PLR-ANN method on each individual stock. In terms of profit, the IPSO-FW-WSVM model outperformed the PLR-ANN model for all stocks under strategy 1, with an average profit of 42.66%, which was 34.39 percentage points higher than that of the PLR-ANN model. Meanwhile, the IPSO-FW-WSVM algorithm required less capital and fewer purchases than the PLR-ANN method, which shows that our proposed method is effective under strategy 1. Under strategy 2, the IPSO-FW-WSVM model achieved higher profits than the PLR-ANN on all five stocks, with an average profit of 63.87% compared with 11.18% for the PLR-ANN. We therefore conclude that our proposed method was more effective in the uptrend category and achieved better profitability under both investment strategies.
Table 7 shows that the comparative results for the steady trend stocks were similar to those for the uptrend stocks. The prediction accuracy of the IPSO-FW-WSVM model was higher than that of the PLR-ANN model for all steady trend stocks, averaging 51.56% for the IPSO-FW-WSVM and 47.80% for the PLR-ANN. Under strategy 1, the average profit was 19.23% for the IPSO-FW-WSVM method and 7.04% for the PLR-ANN method. Under strategy 2, the IPSO-FW-WSVM also made more profit than the PLR-ANN with the same investment fund, with an average profit of 26.91% compared with 0.41% for the PLR-ANN. The table also shows that the IPSO-FW-WSVM model required less capital and fewer purchases than the PLR-ANN model. From the above analysis, we can see that our proposed IPSO-FW-WSVM method obtains better profits under both transaction strategies for the steady trend stocks.
The results in Table 8 show that the IPSO-FW-WSVM model outperformed the PLR-ANN method for the downtrend stocks. Therefore, our proposed method was also more effective in the downtrend category. Figures 4-28 show the buying and selling signals produced by the IPSO-FW-WSVM model on the testing sets for the uptrend stocks, the steady trend stocks and the downtrend stocks under the two strategies. In the figures, an upward-pointing triangle denotes a buying signal, and a downward-pointing triangle denotes a selling signal. Figures 4-28 show that the buy and sell signals for both strategies were still far from optimal. We consider this reasonable: because of the uncertainty, noise, non-stationarity and nonlinearity of the stock market, stock trend forecasting is a very hard problem.

Conclusions
A large amount of research has been carried out on the behavior of stock price movements. However, investors are more interested in obtaining profits. Therefore, trading decisions are more important than forecasting the stock prices themselves. This paper proposes a comprehensive and efficient trading signal forecasting framework. First, we applied PLR to decompose the historical data into different segments and modeled trading signal prediction as a three-class classification problem. Then, we trained the IPSO-FW-WSVM model using historical training data and compared the proposed method with a PLR-ANN on stocks with different trends in the Chinese stock exchange markets. The experimental results show that the IPSO-FW-WSVM model obtained a higher forecasting accuracy than the PLR-ANN model for stocks in three different trends. Moreover, the proposed framework made higher profits under both trading strategies, which indicates that the proposed system is effective at predicting future trading points. However, some problems remain to be studied further. Although the proposed algorithm has the advantages mentioned above, it also has some disadvantages. For example, it is difficult to apply to large-scale training samples, and it is sensitive to missing data. Future research will involve designing better forecasting models to make the results more accurate. Future research will also explore other investment strategies, since different investment strategies have a large effect on profits, and an unsuitable strategy can lead to poor returns despite high forecasting performance.

Conflicts of Interest:
All authors declare that they have no conflict of interest.