An Analysis and Implementation of the Hidden Markov Model to Technology Stock Prediction

Abstract: Future stock prices depend on many internal and external factors that are not easy to evaluate. In this paper, we use the Hidden Markov Model (HMM) to predict the daily stock prices of three actively traded stocks: Apple, Google, and Facebook, based on their historical data. We first use the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to choose the number of states of the HMM. We then use the models to predict the close prices of these three stocks using both single observation data and multiple observation data. Finally, we use the predictions as signals for trading these stocks. The criteria results show that the HMM with two states worked best among the models with two, three, and four states for the three stocks. Our results also demonstrate that the HMM outperformed the naïve method in forecasting stock prices. In trading, active traders using the HMM obtained a higher return than with the naïve forecast for the Facebook and Google stocks. The stock price prediction method has a significant impact on stock trading and derivative hedging.


Introduction
Stock investments can yield a large return or a significant loss because of the high volatility of stock prices. An adaptable stock price prediction model would reduce risk and enhance potential returns in financial derivative trading. Recently, researchers have applied the hidden Markov model (HMM) to forecast stock prices. Hassan and Nath (2005) used HMM to predict the stock price for interrelated markets. Kritzman, Page, and Turkington (Kritzman et al. 2012) applied HMM with two states to predict regimes in market turbulence, inflation, and the industrial production index. Guidolin and Timmermann (2006) used HMM with four states and multiple observations to study asset allocation decisions based on regime switching in asset returns. Nguyen (2014) used HMM with both single and multiple observations to forecast economic regimes and stock prices. Nobakht, Joseph and Loni (Nobakht et al. 2012) implemented HMM using various observation data (open, close, low, and high prices) of a stock to predict its close prices. In our previous work (Nguyen and Nguyen 2015), we used HMM with a single observation sequence for the S&P 500 to select stocks for trading based on the performances of these stocks during the predicted regimes.
In this study, we use HMM for multiple independent observation sequences to predict stock prices and apply the results to trade stocks. Three stocks, Apple Inc., Alphabet Inc., and Facebook, Inc., were chosen to implement the model. We limit the number of states of the HMM to a maximum of four and use two goodness-of-fit criteria to choose the best model among HMMs with two, three, or four states. The prediction process is based on the work of Hassan and Nath (2005). These authors used HMM with four observations, the close, open, high, and low prices of some airline stocks, to predict their future close price using four states. They used HMM to find a day in the past that was similar to the recent day and used the price change on that day, together with the price of the current day, to predict the future close price. However, the authors did not explain why they chose HMM with four states. Our approach differs from their work in three ways. First, we use the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to test the HMM's performance with two to four states and to find the best model. Second, we apply HMM to stock returns to predict future close prices and compare the results with the naïve forecast method. This modification is based on the assumption of the HMM algorithms presented in this paper: the observation sequences are independent. Because we apply the HMM to stock returns, our prediction method is simpler than the method in Hassan and Nath (2005), as will be explained in Section 4.1. Finally, we use the stock prices predicted via the HMM and the naïve method to trade these three stocks and compare the results.
The paper is organized as follows: Section 2 gives a brief introduction to HMM and its algorithms for multiple observation sequences. Section 3 describes the HMM model selections and data collections for stock price prediction. Section 4 presents the results of stock price predictions and stock trading, and Section 5 gives conclusions.

Hidden Markov Model and Its Algorithms
The Hidden Markov Model (HMM) is a signal detection model that was introduced in 1966 by Baum and Petrie (Baum and Petrie 1966). An HMM assumes that an observation sequence is derived from a hidden sequence of discrete states that satisfies a first-order Markov process. HMM was developed from a model for a single observation variable to a model for multiple observation variables. The applications of HMM have also expanded to many areas such as speech recognition, biomathematics, and financial mathematics. In our previous paper (Nguyen and Nguyen 2015), we described HMM for one observation, its algorithms, and applications. In this section, we present HMM for multiple observations and its corresponding algorithms. We assume that the multiple observation sequences are independent and have the same length. The basic elements of an HMM for multiple observations are:
• Observation data, $O = \{O_t^{(l)},\ t = 1, 2, \ldots, T,\ l = 1, 2, \ldots, L\}$, where $L$ is the number of independent observation sequences and $T$ is the length of each sequence,
• Hidden state sequence of $O$, $Q = \{q_t,\ t = 1, 2, \ldots, T\}$,
• Possible values of each state, $\{S_i,\ i = 1, 2, \ldots, N\}$,
• Possible symbols per state, $\{v_k,\ k = 1, 2, \ldots, M\}$,
• Transition matrix, $A = (a_{ij})$, where $a_{ij} = P(q_t = S_j \mid q_{t-1} = S_i)$, $i, j = 1, 2, \ldots, N$,
• Observation probability matrix, $B = (b_i(k))$, where $b_i(k) = P(O_t = v_k \mid q_t = S_i)$, $i = 1, 2, \ldots, N$, $k = 1, 2, \ldots, M$,
• Initial probability of being in state (regime) $S_i$ at time $t = 1$, $p = (p_i)$, where $p_i = P(q_1 = S_i)$, $i = 1, 2, \ldots, N$.
Parameters of an HMM are the matrices $A$ and $B$ and the vector $p$. For convenience, we use a compact notation for the parameters, given by
$$\lambda \equiv \{A, B, p\}.$$
If the observation probability assumes the Gaussian distribution, then we have a continuous HMM with
$$b_i(k) = b_i(O_t = v_k) = \mathcal{N}(v_k; \mu_i, \sigma_i),$$
where $\mu_i$ and $\sigma_i$ are the mean and variance of the distribution corresponding to the state $S_i$, respectively, and $\mathcal{N}$ is the Gaussian density function. For convenience, we write $b_i(O_t)$ for $b_i(O_t = v_k)$. Then, the parameters of the HMM are
$$\lambda \equiv \{A, \mu, \sigma, p\},$$
where $\mu$ and $\sigma$ are the vectors of means and variances of the Gaussian distributions, respectively.
With the assumption that the observations are independent, the probability of observation, denoted by $P(O|\lambda)$, is
$$P(O|\lambda) = \prod_{l=1}^{L} P(O^{(l)}|\lambda).$$
There are three main questions that readers should consider when using the HMM:
1. Given the observation data O and the model parameters λ, can we compute the probability of the observations, P(O|λ)?
2. Given the observation data O and the model parameters λ, can we find the best hidden state sequence of O?
3. Given the observation data O, can we find the model's parameters λ?
The first problem can be solved by using the forward or backward algorithms (Baum and Egon 1967; Baum and Sell 1968), the second by the Viterbi algorithm (Forney 1973; Viterbi 1967), and the last by the Baum-Welch algorithm (Rabiner 1989). In this paper, we only use the algorithms that solve the first and the last problems. We first use the Baum-Welch algorithm to calibrate the parameters of the model and then the forward algorithm to calculate the probability of observation, which we use to predict trending signals for stocks. In this section, we introduce the forward algorithm and the Baum-Welch algorithm for HMM with multiple observations. These algorithms are written based on Baum and Egon (1967); Baum and Sell (1968); Forney (1973); Petrushin (2000); Rabiner (1989).
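As an illustration of the first problem, the following is a minimal sketch of a scaled forward pass for a Gaussian HMM with several independent sequences. It is not the implementation used for the results in this paper; the function names (`gaussian_pdf`, `forward_loglik`, `total_loglik`) and the use of per-step scaling with log-likelihoods are assumptions made for the example. Under the independence assumption, the log of P(O|λ) is the sum of the log-likelihoods of the L sequences.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a Gaussian with the given mean and variance at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_loglik(obs, A, mu, var, p):
    """Scaled forward pass: log P(obs | lambda) for one observation sequence.

    A is the N x N transition matrix, mu and var are per-state Gaussian means
    and variances, and p is the initial state distribution.
    """
    n = len(p)
    alpha = [p[i] * gaussian_pdf(obs[0], mu[i], var[i]) for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission density.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n))
                * gaussian_pdf(obs[t], mu[j], var[j])
                for j in range(n)
            ]
        # Rescale to avoid underflow; the log of the scales adds up to log P.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


def total_loglik(sequences, A, mu, var, p):
    """Independent sequences: log P(O | lambda) is the sum over sequences."""
    return sum(forward_loglik(seq, A, mu, var, p) for seq in sequences)
```

Working with log-likelihoods is a design choice of this sketch: for blocks of 100 observations, the raw product of densities can become too small or too large to compare reliably.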

Baum-Welch Algorithm
The Baum-Welch algorithm calibrates the parameters of the HMM given the observation data. The algorithm was introduced in 1970 (Baum et al. 1970) to estimate the parameters of an HMM for a single observation. Then, in 1983, the algorithm was extended to calibrate the HMM's parameters for multiple independent observations (Levinson et al. 1983). In 2000, the algorithm was developed for multiple observations without the assumption of independence of the observations (Li et al. 2000). In this paper, we use HMM for independent observations, so we will introduce the Baum-Welch algorithm for this case. The Baum-Welch method, or the expectation-maximization (EM) method, is used to find a local maximizer, λ*, of the probability function P(O|λ).
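In order to describe the procedure, we define the conditional probability
$$\beta_t^{(l)}(i) = P(O_{t+1}^{(l)}, O_{t+2}^{(l)}, \ldots, O_T^{(l)} \mid q_t = S_i, \lambda), \quad i = 1, \ldots, N,\ l = 1, \ldots, L.$$
For $i = 1, \ldots, N$, $\beta_T^{(l)}(i) = 1$, and we have the following backward recursion:
$$\beta_t^{(l)}(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1}^{(l)})\, \beta_{t+1}^{(l)}(j), \quad t = T-1, T-2, \ldots, 1.$$
We then define $\gamma_t^{(l)}(i)$, the probability of being in state $S_i$ at time $t$ of the observation $O^{(l)}$, $l = 1, 2, \ldots, L$, as
$$\gamma_t^{(l)}(i) = P(q_t = S_i \mid O^{(l)}, \lambda) = \frac{\alpha_t^{(l)}(i)\, \beta_t^{(l)}(i)}{P(O^{(l)}|\lambda)},$$
where $\alpha_t^{(l)}(i) = P(O_1^{(l)}, \ldots, O_t^{(l)}, q_t = S_i \mid \lambda)$ is the forward probability. The probability of being in state $S_i$ at time $t$ and state $S_j$ at time $t + 1$ of the observation $O^{(l)}$, $l = 1, 2, \ldots, L$, is
$$\xi_t^{(l)}(i, j) = P(q_t = S_i, q_{t+1} = S_j \mid O^{(l)}, \lambda) = \frac{\alpha_t^{(l)}(i)\, a_{ij}\, b_j(O_{t+1}^{(l)})\, \beta_{t+1}^{(l)}(j)}{P(O^{(l)}|\lambda)}.$$
Note that the parameter $\lambda^*$ is updated in Step 2 of the Baum-Welch algorithm to maximize the function $P(O|\lambda)$, so we will have
$$\Delta = P(O|\lambda^*) - P(O|\lambda) \ge 0.$$
If the observation probability $b_i(k)^*$, defined in Section 2, is Gaussian, we use the following formulas to update the model parameter $\lambda \equiv \{A, \mu, \sigma, p\}$:
$$\mu_i^* = \frac{\sum_{l=1}^{L}\sum_{t=1}^{T} \gamma_t^{(l)}(i)\, O_t^{(l)}}{\sum_{l=1}^{L}\sum_{t=1}^{T} \gamma_t^{(l)}(i)}, \qquad \sigma_i^* = \frac{\sum_{l=1}^{L}\sum_{t=1}^{T} \gamma_t^{(l)}(i)\, (O_t^{(l)} - \mu_i^*)^2}{\sum_{l=1}^{L}\sum_{t=1}^{T} \gamma_t^{(l)}(i)}.$$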
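One Baum-Welch iteration for independent Gaussian sequences can be sketched as follows. This is a hedged illustration rather than the code behind the reported results: the names `e_step` and `baum_welch_step` are hypothetical, the forward and backward passes are rescaled at each step (an implementation choice made here to avoid underflow), and the variance update uses the newly estimated mean.

```python
import math


def pdf(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def e_step(obs, A, mu, var, p):
    """Return gamma[t][i], xi[t][i][j] and log P(obs | lambda) for one sequence."""
    n, T = len(p), len(obs)
    b = [[pdf(o, mu[i], var[i]) for i in range(n)] for o in obs]
    alpha, scale = [], []
    for t in range(T):  # scaled forward pass
        if t == 0:
            row = [p[i] * b[0][i] for i in range(n)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * b[t][j]
                   for j in range(n)]
        c = sum(row)
        scale.append(c)
        alpha.append([a / c for a in row])
    beta = [[1.0] * n for _ in range(T)]  # beta at the final time is 1
    for t in range(T - 2, -1, -1):  # scaled backward recursion
        beta[t] = [sum(A[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(n))
                   / scale[t + 1] for i in range(n)]
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * b[t + 1][j] * beta[t + 1][j] / scale[t + 1]
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    return gamma, xi, sum(math.log(c) for c in scale)


def baum_welch_step(sequences, A, mu, var, p):
    """One EM update; also returns the log-likelihood of the input parameters."""
    n = len(p)
    stats = [e_step(seq, A, mu, var, p) for seq in sequences]
    new_p = [sum(g[0][i] for g, _, _ in stats) / len(sequences) for i in range(n)]
    new_A, new_mu, new_var = [], [], []
    for i in range(n):
        denom = sum(g[t][i] for g, x, _ in stats for t in range(len(x)))
        new_A.append([sum(x[t][i][j] for _, x, _ in stats for t in range(len(x)))
                      / denom for j in range(n)])
        pairs = list(zip(sequences, stats))
        w = sum(g[t][i] for seq, (g, _, _) in pairs for t in range(len(seq)))
        m = sum(g[t][i] * seq[t]
                for seq, (g, _, _) in pairs for t in range(len(seq))) / w
        v = sum(g[t][i] * (seq[t] - m) ** 2
                for seq, (g, _, _) in pairs for t in range(len(seq))) / w
        new_mu.append(m)
        new_var.append(v)
    loglik = sum(ll for _, _, ll in stats)
    return new_A, new_mu, new_var, new_p, loglik
```

Repeating `baum_welch_step` until the change in log-likelihood falls below a tolerance gives the calibration loop; because EM only finds a local maximizer, the result depends on the initial parameters.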

Model Selections and Data Collections
The Hidden Markov Model has been widely used in financial mathematics to predict economic regimes (Kritzman et al. 2012; Guidolin and Timmermann 2006; Ang and Bekaert 2002; Chen 2005; Nguyen 2014) or to predict stock prices (Hassan and Nath 2005; Nobakht et al. 2012; Nguyen 2014). In this paper, we explore a new approach to using HMM to predict stock prices. In this section, we discuss how to use the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to test the HMM's performance with different numbers of states. We then present how to use HMM to predict stock prices and apply the results to trade stocks. First, we describe the chosen data and the AIC and BIC for HMM with the selected numbers of states.

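Algorithm: Baum-Welch for L independent observations $O = (O^{(1)}, O^{(2)}, \ldots, O^{(L)})$ with the same length T
1. Initialization: input parameters λ, the tolerance tol, and a real number ∆.
2. Repeat until ∆ < tol: compute P(O|λ), compute the updated parameters λ*, set ∆ = P(O|λ*) − P(O|λ), and set λ = λ*.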

Overview of Data Selections
We chose three stocks that are actively traded in the stock market to examine our model: Apple Inc. (AAPL), Alphabet Inc. (GOOGL), and Facebook Inc. (FB). The daily stock prices (open, low, high, close) of these stocks and information about these companies can be found at finance.yahoo.com. We used daily historical prices of these stocks from 4 January 2010 to 30 October 2015 in this paper.

Checking Model Assumptions
The HMM algorithms presented in this paper are based on the assumption that the observation sequences are independent. However, the open, low, high, and close prices of a stock are highly correlated, which can be seen from the correlation matrix in Figure 1. On the other hand, the returns of these four price series are independent, as shown in Figure 2.
We use the autocorrelation function (ACF) to calculate the paired correlations between the series and plot them in Figures 1 and 2. The ACFs for the Facebook and Google stocks are presented in Appendix A. We can see clearly from the figures that the return series have low correlations, while the stock price series have very high correlations.
Furthermore, we conduct the Ljung-Box test to test the independence of each time series. We use the test with lag = 1 for the returns of the three stocks, AAPL, FB, and GOOGL, from 1 October 2014 to 1 October 2015, and present the results in Table 1. Note that the stock prices are not independent and failed the Ljung-Box test at significance level α = 5%, so Table 1 only displays results for stock returns.

Table 1. p-values from the Ljung-Box test for independence of the stock return series: Open, High, Low, and Close. "*" indicates that the p-value is statistically significant at α = 5%, "**" indicates that the p-value is statistically significant at α = 1%, and "***" indicates that the p-value is statistically significant at α = 0.1%.

The null hypothesis of the Ljung-Box test is that the data are independently distributed. Thus, we do not reject the null hypothesis if the p-value is larger than the chosen significance level α. From Table 1, we can see that most of the stock return series pass the independence test at the significance level α = 1%, and the only two series that do not pass the test at the significance level α = 0.1% are AAPL's open returns and GOOGL's low returns. The HMM also works for dependent observation data with a modification in calculating the probabilities of observations. We will explore that case in a future study. In the next section, we apply HMM to predict the daily returns and then forecast future stock prices.
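For readers who wish to reproduce this kind of check, a minimal sketch of the Ljung-Box test at lag 1 is given below. The function name `ljung_box_lag1` is hypothetical and the return series passed to it is assumed to be a plain list of numbers; the statistic is the standard Q = n(n + 2)ρ₁²/(n − 1), compared with a chi-square distribution with one degree of freedom, whose upper tail can be written with the complementary error function.

```python
import math


def ljung_box_lag1(x):
    """Ljung-Box Q statistic and p-value at lag 1 for a series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    # Sample autocorrelation at lag 1.
    rho1 = sum(dev[t] * dev[t - 1] for t in range(1, n)) / sum(d * d for d in dev)
    q = n * (n + 2) * rho1 ** 2 / (n - 1)
    # Upper tail of chi-square with 1 degree of freedom: P(Z^2 > q).
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value
```

A small p-value leads to rejecting the null hypothesis of independence at the chosen significance level; a large one does not.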

Model Selection
Choosing the number of hidden states for the HMM is a critical task. We first use two standard criteria, the AIC and the BIC, to examine the performances of HMMs with different numbers of states. The two measures are suitable for HMM because the model training algorithm, the Baum-Welch algorithm, uses the EM method to maximize the log-likelihood of the model. We limit the number of states to between two and four to keep the model simple and feasible for stock prediction. The AIC and BIC are calculated using the following formulas, respectively:
$$\mathrm{AIC} = -2\ln(L) + 2k,$$
$$\mathrm{BIC} = -2\ln(L) + k\ln(M),$$
where L is the likelihood function for the model (here L denotes the likelihood, not the number of observation sequences), M is the number of observation points, and k is the number of estimated parameters in the model. In this paper, we assume that the distribution corresponding to each hidden state is a Gaussian distribution. Therefore, the number of parameters, k, is formulated as $k = N^2 + 2N - 1$, where N is the number of states used in the HMM.
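The two criteria can be computed directly from a model's log-likelihood. The sketch below is only an illustration; the function names are hypothetical, and the log-likelihood value is assumed to come from a separate calibration step.

```python
import math


def n_params(n_states):
    """Number of estimated parameters, k = N^2 + 2N - 1, for a Gaussian HMM."""
    return n_states ** 2 + 2 * n_states - 1


def aic(loglik, n_states):
    """AIC = -2 ln(L) + 2k, with loglik = ln(L)."""
    return -2.0 * loglik + 2.0 * n_params(n_states)


def bic(loglik, n_states, n_points):
    """BIC = -2 ln(L) + k ln(M), with n_points = M observation points."""
    return -2.0 * loglik + n_params(n_states) * math.log(n_points)
```

For two, three, and four states, k is 7, 14, and 23, so the BIC penalty grows quickly with the number of states when the number of observation points is fixed.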
To train the HMM's parameters, we use historical observed data of a fixed length T, $O = \{O_t^{(1)}, O_t^{(2)}, O_t^{(3)}, O_t^{(4)},\ t = 1, 2, \ldots, T\}$, where $O^{(i)}$ with i = 1, 2, 3, or 4 represents the daily returns of the open, low, high, or close price of a stock, respectively. For the HMM with a single observation, we use only the returns of the close price, $O = \{O_t,\ t = 1, 2, \ldots, T\}$, where $O_t$ is the stock's return of close price at time t. We ran the model calibrations with different time lengths, T, and saw that the model worked well for T ≥ 80. For the results below, we used blocks of T = 100 trading days of stock price data, O, to calibrate the HMM's parameters and calculate the AIC and BIC numbers. Thus, the total number of observation points in each BIC calculation is M = 400 for four observation sequences and M = 100 for one observation sequence. For convenience, we did 100 calibrations for 100 blocks of data by moving the block of data forward (we dropped the price of the oldest day in the block and added the price of the day following the most recent day of the block). The calibrated parameters of the previous step are used as initial parameters for the new calibration. The training data set is from 16 January 2015 to 30 October 2015.
The first block of stock prices of 100 trading days from 16 January 2015 to 6 June 2015 was used to calibrate the HMM's parameters and calculate the corresponding AIC and BIC. Let µ(O) and σ(O) be the mean and standard deviation of the observation data, O, respectively. We chose initial parameters for the first prediction as follows: where i, j = 1, ..., N and N(0, 1) is the standard normal distribution. The second block of 100 trading days of data from 17 January 2015 to 7 June 2015 was used for the second calibration, and so on. The HMM parameters calibrated in the current step are used as initial parameters for the next estimation. We continued the process until we had 100 calibrations. We plot the AICs and BICs of the 100 calibrations of the three stocks (AAPL, FB, and GOOGL) in Figures 3-5. In Figures 3-5, the graph of the AIC is on the left and that of the BIC is on the right. A lower AIC or BIC indicates a better model calibration. However, the Baum-Welch algorithm only finds a local maximizer of the likelihood function. Therefore, we did not expect to obtain the same AIC or BIC if we ran the calibration twice. The results in Figures 3-5 show that the calibration performances of the models with different numbers of states differ from one simulation to another. Based on the AIC results, the performances of the HMMs with two, three, or four states are almost the same. However, based on the BIC, the HMM with two states is the best candidate for all three stocks. Therefore, we choose the HMM with two states to predict the prices of the three stocks in the next section.
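The construction of the return series and of the moving blocks can be sketched as follows. This is a minimal illustration under stated assumptions: returns are taken to be simple returns, and `simple_returns` and `rolling_blocks` are hypothetical helper names, not functions from the original study.

```python
def simple_returns(prices):
    """Simple daily returns (P_t - P_{t-1}) / P_{t-1}; an assumed definition."""
    return [(prices[t] - prices[t - 1]) / prices[t - 1]
            for t in range(1, len(prices))]


def rolling_blocks(series, block_len=100, n_blocks=100):
    """Yield up to n_blocks windows of block_len points, each shifted one day."""
    for start in range(n_blocks):
        block = series[start:start + block_len]
        if len(block) < block_len:
            break  # not enough data left for a full block
        yield block
```

In a calibration loop, each block would be passed to the Baum-Welch step, with the parameters calibrated on the previous block used as the starting point for the next one.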

Stock Price Prediction and Stock Trading
In this section, we use HMM to predict stock prices and compare the predictions with the real market prices. We predict the stock prices of GOOGL, AAPL, and FB using HMM with two states, the best model selected in Section 3.3, and calculate the relative errors with respect to the real market prices. The results are compared with the naïve no-change method. A trading strategy using HMM is also presented in this section.

Stock Price Prediction
We first introduce how to predict stock prices using HMM. The prediction process can be divided into three steps.
Step 1: calibrate HMM's parameters and calculate the likelihood of the model.
Step 2: find the day in the past that has a similar likelihood to the recent day.
Step 3: use the stock return on the day after the "similar" day in the history as the predicted return for tomorrow's price.
This prediction approach is based on the work of Hassan and Nath (2005). However, our procedure differs from their method in that we apply the HMM to the returns of the open, low, high, and close prices, which are independent, while the authors applied the HMM directly to the open, low, high, and close prices, which are not independent. Because we apply the HMM to stock returns, our method is simpler than theirs in the third step. As in Hassan and Nath (2005), we use four observation sequences (open, low, high, and close), but of returns rather than prices.
Suppose that we want to predict tomorrow's closing price of stock A. The prediction can be explained as follows. In the first step, we choose a block of length T of the four daily return series of stock A: open, low, high, and close, $O = \{O_t^{(1)}, O_t^{(2)}, O_t^{(3)}, O_t^{(4)},\ t = 1, 2, \ldots, T\}$, to calibrate the HMM's parameters λ and calculate the likelihood P(O|λ). In the second step, we look for a block of the same length in the past, ending at a day T*, whose likelihood under λ is similar to P(O|λ). In the third step, we use the close return on the day after T* as the predicted return for time T + 1:
$$O_{T+1}^{(4)} = O_{T^*+1}^{(4)}. \quad (2)$$
The predicted close price for time T + 1 is then
$$P_{T+1} = P_T \left(1 + O_{T+1}^{(4)}\right), \quad (3)$$
where $P_T$ is the close price at time T and $O_{T+1}^{(4)}$ is the return of the close price calculated in (2).
The naïve no-change method is applied to the returns of the three stocks' close prices. The model simply takes the return of today's close price as the return of tomorrow's close price:
$$O_{T+1}^{(4)} = O_T^{(4)}.$$
After forecasting $O_{T+1}^{(4)}$, we predict the next day's close price by using Equation (3). We use the naïve method for stock returns instead of stock prices because, for trading purposes, if we assume no change in future stock prices, then there is no trade. We present the results of using the HMM to predict the closing prices of these three stocks (AAPL, GOOGL, and FB) for one year of trading, 252 days, in Figures 6-8. The results indicate that the HMM captures the trends of the three stocks well, while the naïve forecasts often go in the opposite direction of the real market trends. We can see from Figure 7 that the naïve forecast method had a few huge errors in predicting stock prices in February. The naïve model also showed its weakness when its predicted prices of Google stock at the end of July 2015 were far from the actual prices.
We also compare the forecasting results of the two-state HMM and the naïve method numerically by calculating the mean absolute percentage error, MAPE, of the estimations.
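The second and third prediction steps can be sketched in a few lines. The sketch assumes that the likelihoods of the historical blocks have already been computed under the calibrated parameters, that "similar" means the smallest absolute difference in log-likelihood, and that returns are simple returns; the function names and arguments are hypothetical.

```python
def predict_next_close(close_today, current_loglik, past_logliks, next_day_returns):
    """Predict tomorrow's close from the most similar historical block.

    past_logliks[k] is the log-likelihood of the k-th historical block under
    the calibrated model, and next_day_returns[k] is the close return on the
    day after that block ends.
    """
    best = min(range(len(past_logliks)),
               key=lambda k: abs(past_logliks[k] - current_loglik))
    predicted_return = next_day_returns[best]
    return close_today * (1.0 + predicted_return), predicted_return


def naive_next_close(close_today, return_today):
    """Naive forecast: tomorrow's return equals today's return."""
    return close_today * (1.0 + return_today)
```

Both forecasts convert a predicted return into a price in the same way, so any difference between them comes only from how the return is chosen.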
The MAPE is defined as
$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N} \frac{|M_i - P_i|}{M_i},$$
where N is the number of predicted points, $M_i$ is the market price, and $P_i$ is the predicted price of a stock. The results are shown in Table 2. In Table 2, the "Price Std." and "Return Std." are the standard deviations of the stock prices and stock returns, respectively, and the efficiencies are calculated by dividing the errors of the naïve method by the errors of the HMM. All efficiencies in the table are greater than one, showing that the HMM outperformed the naïve method in forecasting stock prices. Among these three stocks, GOOGL's prices have the highest volatility, but its returns have the lowest volatility. These factors affect the stock trading results, which we present in the next section.
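The error measure and the efficiency ratio can be computed as in the following sketch. The function names are hypothetical and the price lists are assumed to be aligned day by day.

```python
def mape(market, predicted):
    """Mean absolute percentage error between market and predicted prices."""
    return sum(abs(m - p) / m for m, p in zip(market, predicted)) / len(market)


def efficiency(market, naive_pred, hmm_pred):
    """Error of the naive forecast divided by the error of the HMM forecast."""
    return mape(market, naive_pred) / mape(market, hmm_pred)
```

An efficiency above one means that the HMM forecast has the smaller error on the same set of days.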

Stock Trading
In this section, we use the predicted returns to trade the three stocks: AAPL, FB, and GOOGL. The trading strategy is as follows: if the HMM predicts that the price of a stock will move up tomorrow, that is, its predicted return is positive, we buy the stock today and sell it tomorrow, assuming that we buy and sell at close prices. If the HMM predicts that the stock price will not increase tomorrow, we do nothing. We also assume that there is no trading cost. For each trade, we buy or sell 100 shares of the stock. Based on the AIC and BIC results, we only use the HMM with two states for the stock trading. Again, we use a block of 252 trading days, one year, from 15 August 2016 to 11 August 2017 for model testing. We present the results of the one-year trading in Table 3. In Table 3, the "Investment" is the price at which we bought 100 shares of the stock the first time, the "Earning" is the money gained, and the "Profit" is the percentage return of the one-year trading. The results show that the HMM worked better than the naïve method in trading the Facebook and Google stocks. In particular, over the one-year trading period, trading the GOOGL stock with the HMM yielded a much higher return than with the naïve forecast method. However, the results are reversed for the AAPL stock. From Figures 6-8 and Table 2, we can see that, in the one-year period, the GOOGL prices have the highest volatility and the lowest risk of returns among the three stocks. The naïve results are consistent with the levels of return risk, the "Return Std." in Table 2: the higher the risk, the better the return. The HMM results also stay close to this theoretical risk-return relationship. Based on the results in Table 3, traders using the HMM had returns of 32.00%, 24%, and 25% for AAPL, FB, and GOOGL, respectively. Trading with the HMM gave much higher returns than the naïve method for FB and GOOGL, but a lower return for AAPL.
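The trading rule can be illustrated with a short backtest sketch. It assumes no trading cost, trades at close prices, and computes the "Profit" as the total earning divided by the cost of the first purchase; that last definition is our reading of the text above rather than a stated formula, and the function name `backtest` is hypothetical.

```python
def backtest(closes, predicted_returns, shares=100):
    """Buy at today's close and sell at tomorrow's close on a positive forecast.

    closes[t] is the close on day t; predicted_returns[t] is the forecast,
    made on day t, of the return on day t + 1.
    """
    earning = 0.0
    investment = None
    for t in range(len(closes) - 1):
        if predicted_returns[t] > 0:
            if investment is None:
                investment = shares * closes[t]  # cost of the first purchase
            earning += shares * (closes[t + 1] - closes[t])
    profit_pct = 100.0 * earning / investment if investment else 0.0
    return investment, earning, profit_pct
```

The same function can be run with the HMM forecasts and with the naïve forecasts to compare the two strategies over an identical period.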

Conclusions
A stock's performance is an essential indicator of the strength or weakness of the stock's corporation and of economic viability in general. Many factors drive stock prices up or down. In this paper, we use a Hidden Markov Model (HMM) to predict the prices of three stocks: AAPL, GOOGL, and FB. We first use the AIC and BIC criteria to examine the performances of HMMs with two to four states. The results show that the HMM with two states is the best among the models with two, three, and four states. We then use the models to predict stock prices and compare the predictions with the naïve forecast results by plotting the forecasted prices versus the market prices and evaluating the mean absolute percentage error, MAPE. The prediction errors show that the HMM worked better than the naïve method in predicting the prices of the three stocks, AAPL, FB, and GOOGL. In stock trading, the HMM outperformed the naïve method for two stocks: FB and GOOGL. The graphs indicate that the HMM is a potential model for stock trading since it captures the trends of stock prices well.