Stock Trend Prediction Using Deep Learning Approach on Technical Indicator and Industrial Specific Information

Prachyachuwong, Kittisak; Vateekul, Peerapon

doi:10.3390/info12060250


by Kittisak Prachyachuwong and Peerapon Vateekul *

Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10300, Thailand

* Author to whom correspondence should be addressed.

Information 2021, 12(6), 250; https://doi.org/10.3390/info12060250

Submission received: 12 May 2021 / Revised: 9 June 2021 / Accepted: 10 June 2021 / Published: 15 June 2021

Abstract:

Stock trend prediction has long been in the spotlight, and an enormous amount of information is now available to support it. Prior attempts have tried to forecast the trend using textual information; however, they can be further improved since they relied on fixed word embeddings and depended on the sentiment of the whole market. In this paper, we propose a deep learning model to predict the Thailand Futures Exchange (TFEX) with the ability to analyze both numerical and textual information. We have used Thai economic news headlines from various online sources. To obtain better news sentiment, we have divided the headlines into industry-specific indexes (also called “sectors”) to reflect the movement of securities with the same fundamentals. The proposed method consists of Long Short-Term Memory Network (LSTM) and Bidirectional Encoder Representations from Transformers (BERT) architectures to predict daily stock market activity. We have evaluated model performance by considering predictive accuracy and the returns obtained from the simulation of buying and selling. The experimental results demonstrate that combining the numerical and textual information of each sector can improve prediction performance and outperform all baselines.

Keywords:

deep learning; natural language processing; time series; stock market prediction

1. Introduction

Since the efficient market hypothesis (EMH) was proposed by Fama [1], it has become the mainstream view among financial economists. It states that if the capital market is efficient, information will be quickly and equally disseminated in the market, and rational investors will be able to interpret the information correctly. Thus, when new information arises, it will be captured and reflected in the stock price immediately. The EMH is associated with the random walk theory, which argues that the future stock price is random and unpredictable, so neither technical analysis nor fundamental analysis can be used to predict the future price and generate excess returns. More specifically, if investors are unable to outperform the market using past price information (also known as technical analysis), the market is said to have weak-form efficiency. If investors are unable to outperform the market using all available public information (also known as fundamental analysis), including the historical price and volume, the market is said to have semi-strong-form efficiency. Finally, if investors are still unable to outperform the market using all public and private information available, the market is said to have strong-form efficiency.

A number of studies have tested the hypothesis, providing empirical results on stock price predictability. Many of them show that excess returns can be obtained by following specific trading strategies using past price information or other public information. Some examples of early studies are as follows. First, small-size stocks generate higher returns than large-size stocks [2]. Second, stocks with a low price-to-earnings ratio generate higher returns than stocks with a high price-to-earnings ratio [3]. Third, past winning stocks tend to overreact and continue to generate positive returns over several months [4], and this trend reverses over the long run. Fourth, low liquidity tends to be associated with higher returns [5]. This empirical evidence shows that the excess returns cannot be justified by the risk taken by investors. In other words, investors are able to generate excess returns by exploiting past price and public information. Although the above findings are often referred to as market anomalies, they suggest persuasive strategies for investors to follow.

Since the year 2000, the rise of machines has brought a fundamental shift in the financial markets. Computers now dominate the trading activity that used to be done by humans. More specifically, traders use algorithms to acquire information and make trading decisions at very high speed. Traders use various algorithms. Some use a market-making strategy to offer liquidity to other traders by posting bids and offers in the market. Some predict price changes by learning from past price changes in the market. The attempt to generate excess returns has shifted from forming a portfolio using a certain strategy to predicting the future price path. Nowadays, machine learning (ML) techniques play an essential role as core algorithms for predicting future stock prices.

Deep learning, a subfield of ML, is considered one of the emerging areas showing promising results. Much prior research focused on modeling either numerical data or textual data alone. References [6,7] applied neural networks to technical indicators, assuming that numerical properties can reflect all factors of the stock market. However, these methods have limitations, as stock market behavior also responds to external factors that can be captured in the news. References [8,9,10,11] focused on applying text analysis, mimicking fundamental analysis of sources such as news articles and financial reports, to find relevant relationships and to predict stock market behavior. In real-world scenarios, most investors consider both numerical and textual data. References [12,13,14,15] used news headlines together with historical prices and technical indicators, and most of them agreed that the news headlines alone should be sufficient to represent the whole article for stock prediction. Reference [14] used an event embedding as a representation of events to help increase prediction efficiency. An event is extracted using a tool called “Open Information Extraction (OpenIE)”, which converts the news into three entities: actor, action, and object. Unfortunately, OpenIE supports only documents in English. For the Thai stock market, there are only a few deep learning studies, and, surprisingly, only one of them [16] utilizes both numerical and Thai textual information. Processing a Thai corpus is challenging; one example is the tokenization problem, which arises because written Thai does not use spaces to separate words as English does. Another significant challenge is that the Thai news headlines used in this research are short sentences, so the techniques from previous research cannot convey the essence of the news. Therefore, it is crucial to use a pre-trained model in order to overcome the scarcity of information in the text data.

In this paper, we propose a novel deep learning model to forecast the trend of a stock market, represented by a “stock index”. It utilizes both numerical data (historical data and technical indicators) and textual information (news headlines). The model architecture is a combination of LSTM [17] and a pre-trained BERT [18,19], responsible for numerical and textual data, respectively. We focus on forecasting a stock index (the whole market) rather than an individual stock since it has a lower risk. Since a stock index is a combination of top individual stocks from various sectors, we propose to use news headlines to capture the market’s movement at “the sector level” rather than at the level of the overall market. Thus, we build a separate textual model for each sector and then combine the results of all sectors to reflect the whole market’s movement. In the experiment, we focus on the stock market in Thailand using SET50, a stock index combining the top fifty individual stocks. Since the textual data are news headlines in Thai, all challenges of natural language processing (NLP) in Thai must be addressed. Multilingual BERT is chosen since it supports 104 languages, including Thai. The experiment was conducted on the stock index data from 2014 to 2020 and economic news headlines collected from various online sources. The results showed that our sector-based framework could forecast the stock market trend more accurately than other baseline models. Additionally, a simple investment was simulated using the output from our forecasting model, and the results showed that a straightforward trading strategy could yield better annualized returns than the stock market and other models. This comparison can be used as a test of the EMH.

The rest of the paper is organized as follows. We introduce related literature in Section 2. We present some of the methodologies for our proposed framework in Section 3. Experimental setup and results are explained in Section 4 and Section 5, respectively. We summarize the paper and address the future direction in Section 6.

2. Literature Review

In this section, we discuss various stock forecasting techniques. Since they can be related to EMH, the section starts with the details of EMH, which categorizes markets into three levels. Then, recent techniques for stock market prediction are discussed in two groups: first, stock market prediction using only textual information, mainly inspired by [18,20]; second, stock market prediction using both numerical and textual information, largely inspired by [15,16].

2.1. Efficient Market Hypothesis (EMH)

EMH is the hypothesis that stock market prices already reflect all relevant information. In other words, stock prices reflect investors’ beliefs about future expectations. We can categorize the market’s efficiency into three levels [21]: (1) “a weak form”, under which, if investors can beat the market using historical stock prices (the basis of technical indicators), the stock market is not yet considered to be weak-form efficient; (2) “a semi-strong form”, under which, if investors can beat the market using the first-level data (historical data) together with public data such as news, earnings, and other stock fundamentals, the market is not yet considered to be semi-strong-form efficient; (3) “a strong form”, which is the most demanding level: if investors can beat the market using the data from the first and second levels together with private information, the stock market is not yet considered to be strong-form efficient. To apply EMH concepts to our research, we work with the first two levels, considering both historical data and public data to forecast the market trend. The strong-form level involves private information, whose use in trading is illegal; thus, it is not a focus of our research.

2.2. Stock Market Prediction Using Only Textual Information

Analyzing the stock market using relevant text is complex but exciting [12,22,23,24,25,26,27,28,29,30]. For example, a model named AEnalyst was introduced by Lavrenko et al. [31]. Their goal was to predict intraday stock price trends by analyzing news articles published on the YAHOO finance homepage.

Mittermayer and Knolmayer [32] utilized multiple models to predict short-term market reactions to the news using text mining techniques. Their model predicts the one-day trend of the five major company indices. Wu et al. [22] predicted stock trends by choosing a representative set of ambiguous attributes (keywords) that affect each stock.

However, these simple methods are not suitable for conveying the headline’s meaning and have several limitations, including the disclosure of rules that may govern market dynamics, making forecasting models unable to capture the impact of recent stock market trends.

One of the most definitive studies is that of Ding et al. [12], which assumes that news can influence the stock market’s behavior and that the news of the previous day influences the daily changes in the stock price. They create a representation of each event by using a process named open information extraction, which converts daily news into three elements: actor, action, and object. Their approach works well and outperforms earlier research by increasing prediction efficiency. They also showed experimentally that news headlines should be sufficient as textual features, as opposed to the whole article.

Shi et al. [33] implemented a hierarchical neural network for stock prediction on textual news input. Their structure embeds the input at three levels: word representation, bigram phrases, and news headlines. The results are passed to a feed-forward dense regression layer. Figure 1 illustrates a hierarchical neural network on textual news input data. The blue boxes represent the numerical information. The light blue boxes represent the textual information, and the dark green boxes represent the classification tasks.

Pisut et al. [15] implemented a deep learning architecture for stock prediction that leverages pre-trained word embeddings to increase model performance. We customized their architecture and replaced the GloVe embedding [34] with the FastText embedding [35], which is considered one of the best word embeddings for Thai news headlines, and we took the FastText variant as one of our baseline models.

However, we noticed that a subword-level contextual text representation was significantly better than a fixed, word-level global embedding. A critical factor is the size of the pre-trained model, with BERT and FastText ranging from large to small, respectively.

We now turn to model improvements and embedding generation. Ling et al. [36] proposed a character-to-word model in which they combined character-level embedding with word-level embedding and evaluated it on datasets in five different languages, with satisfactory performance. A character-level embedding with a convolutional neural network (CNN) model was proposed by Kim et al. [37], who showed that a highway network could improve performance on several language prediction tasks. Wehrmann et al. [38] applied character embedding with a CNN to analyze the sentiment of Twitter posts.

However, using only character-level features is not satisfactory in terms of performance on Thai short-text classification because they rely on word segmentation, which sometimes leads to incorrect classification. Bidirectional Encoder Representations from Transformers (BERT) [18] provides pre-trained vector representations of words, which can be used further with various AI models. BERT is trained using masked language modeling, which randomly masks some tokens in a text sequence and then independently recovers the masked tokens by conditioning on the vectors of the surrounding tokens. The BERT architecture provides representations conditioned on both the left and right contexts in all layers. BERT vectors were used in our experiments as a form of transfer learning to enhance the prediction model’s capabilities. Google also provides a pre-trained model, Multilingual Cased (BERT-Base), which was pre-trained on a large-scale text corpus covering the top 104 languages and is suitable for the Thai language. Figure 2 illustrates the architecture of BERT for classification tasks.

Moreover, BERT is one of the best language models available in NLP research and performs strongly on many tasks. Hiew et al. [20] showed that BERT significantly improves sentiment analysis compared to prior models. Othan et al. [39] showed that utilizing deep learning models together with BERT, a new-generation word embedding model, could improve classification performance.

We utilized BERT because it provides a better text representation than previous embeddings. BERT is a deep, multi-layered neural network that generates different vectors for the same word when it appears in different contexts (i.e., contextual vectors). Moreover, BERT suits our problem because the BERT-Base Multilingual Cased model available from Google covers Thai and converts a sequence of text characters into a numeric representation. The disadvantage is that inference is compute-intensive, which would be costly for investors who want to use it in large-scale production. Nevertheless, the predictive performance was noticeably better.

Finally, to the best of our knowledge, this is the first study aimed at analyzing the direction of stocks by using news headlines to capture market movements at the sector level rather than the overall market, as the stock index is a combination of the top fifty stocks from various sectors. Capturing sector information can actually reflect the market trends.

2.3. Stock Market Prediction Using Numerical and Textual Information

Most prior research has focused on either text or numeric data as input but not both types of information. Nowadays, many researchers are interested in improving a model’s performance by utilizing both types of data.

For the traditional machine learning techniques, Tantisantiwong et al. [40] recently proposed a framework to forecast SET50, a stock index in Thailand, using both numerical and textual information. For the textual information, they have gathered social media data with the advice of experts to manually define keywords for sentiments (positive and negative). Then, each document is labeled as positive or negative based on those predefined keywords. Next, sentiment keywords are extracted from those labeled documents. After that, each document is assigned a score called “the market composite sentiment index”. Finally, a multiple linear regression is generated based on both numeric (historical data of SET50) and textual (the sentiment index) information along with other control variables. Although this work is interesting, it requires experts’ effort to define sentiment keywords. Furthermore, these keywords are not publicly available, and they can be changed at different periods of time, so they must be periodically updated. Therefore, this work is not included in our study due to the manual processes required of experts.

For the deep learning approach, Vargas et al. [14] were the first to present the concept of combining the two types of information. They convert the news headlines into word vectors by using the Word2Vec algorithm and then pass them to a convolutional neural network to extract the salient words. The results are passed to a long short-term memory network to find the relationships among all of the news items. Technical indicators, in turn, are created from historical stock price information and are learned directly by a long short-term memory network. The outputs obtained from the two types of data are then combined and used to predict stock market behavior. Figure 2 illustrates the structure of the deep learning model.

The work of Tanawat et al. [16] is another interesting example. To the best of our knowledge, they are the first to extend this line of study to the Thai stock market by combining the natural language processing (NLP) domain with technical indicators. Thai-language headlines raise issues that many other languages do not, for example, the tokenization problem. Thus, they proposed a modified hierarchical structure for textual representation, based on the key idea of Shi et al. [33], but they customized the model by using the “newmm” tokenizer from PyThaiNLP and replaced the word2vec embedding with the thai2fit embedding so that the model can learn from Thai news headlines. Figure 1 illustrates the architecture of the hierarchical textual representation. Their model builds vector representations progressively from the word level to the bigram phrase, headline, and daily news levels. The results showed that adding textual representation increased profitability over previous research.

Another relevant study is presented by Pisut et al. [15]. They offer a deep learning model that takes both textual and numerical data to predict stock market trends. They took the mean of each day’s event embedding vectors as the textual input and split the data into three parts: events from the past thirty days, events from the past seven days, and events from the past day. The event vectors covering the past seven and thirty days are fed into a convolutional neural network (CNN) to create a feature map that represents essential past events, which is then combined with the event vectors from the previous day. Historical price vectors and technical indicators, in turn, are fed into a long short-term memory (LSTM) network for time series analysis. Finally, the outputs obtained from each type of information are combined to predict the stock market trend.

In this research, we consider both textual and numerical data, as many previous studies have shown that using both types of information performs better than focusing on only textual or only numerical data.

3. Methodology

This section presents a deep learning method for predicting industrial stock trends by considering numerical and textual data together. The model focuses on improving prediction accuracy and the returns obtained when prediction results are used to simulate trading. We have divided this section into two main topics: data preprocessing (Section 3.1) and the proposed prediction model (Section 3.2). The model has three main components, as shown in Figure 3.

3.1. Data Preprocessing

This section describes the data preparation to be used as the input to a model for predicting industrial stocks’ trends.

3.1.1. Textual Data Preprocessing

In this research, we collected Thai news headlines published between 2014 and 2020 from online sources and kept only economic topics. Since the raw dataset is very noisy, several preprocessing techniques are used to clean it, e.g., lowercasing, punctuation removal, stop word removal, duplicate removal, and trading symbol extraction.
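The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the stop-word list is hypothetical, whitespace splitting stands in for a proper Thai word tokenizer, and trading symbol extraction is left out.

```python
import string

# Hypothetical stop-word list; a real pipeline would use a curated Thai list.
STOP_WORDS = {"the", "a", "of"}

def clean_headline(text, stop_words=STOP_WORDS):
    """Lowercase, strip ASCII punctuation, and drop stop words from one headline."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    # Assumption: tokens are whitespace-separated; Thai text needs a tokenizer.
    tokens = [tok for tok in no_punct.split() if tok not in stop_words]
    return " ".join(tokens)

def clean_corpus(headlines):
    """Clean every headline and drop duplicates, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for headline in headlines:
        text = clean_headline(headline)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

Deduplicating after normalization, rather than before, also catches headlines that differ only in case or punctuation.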

As we focus on supervised deep learning strategies, it is necessary to label the collected Thai texts. One way to label documents is to read the news titles and label them manually. This would provide high-quality, trustworthy labels; however, it requires a human to analyze the entire dataset. We have followed the authors of [41], who automatically label documents using market feedback, which is not perfect but quick and straightforward. We take a similar approach and group documents that were released on the same day with the corresponding next-day market yield change. The value is calculated with Equation (1).

The labels obtained from the historical stock prices are attached to the textual data using the date as the link. As in much previous research, we divide the stock return ratio into three classes: upward, downward, and sideways, representing a significantly rising, significantly dropping, and steady stock trend on the next date, respectively. We set thresholds to bin the return ratio percentage, based on the research mentioned above: Downward (RisePercent(t) < -0.41%), Upward (RisePercent(t) > 0.87%), and Sideways (-0.41% ≤ RisePercent(t) ≤ 0.87%).

\mathrm{RisePercent}(t) = \frac{\mathrm{Open}_{(t + 1)} - \mathrm{Open}_{(t)}}{\mathrm{Open}_{(t)}}

(1)
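As a rough illustration of Equation (1) and the three thresholds, the labeling rule could be written as below. The function names are ours, and multiplying the ratio by 100 so that it can be compared with the percentage thresholds is an assumption; the class names follow the "UP", "DOWN", and "STABLE" outputs described in Section 3.2.1.

```python
DOWN_THRESHOLD = -0.41  # percent, from the text
UP_THRESHOLD = 0.87     # percent, from the text

def rise_percent(open_today, open_next):
    """Next-day change of the opening price, in percent (Equation (1) times 100)."""
    return (open_next - open_today) / open_today * 100.0

def label_trend(open_today, open_next):
    """Map the next-day return to one of the three trend classes."""
    change = rise_percent(open_today, open_next)
    if change < DOWN_THRESHOLD:
        return "DOWN"
    if change > UP_THRESHOLD:
        return "UP"
    return "STABLE"

def label_series(opens):
    """Label every day except the last, which has no next-day open."""
    return [label_trend(opens[i], opens[i + 1]) for i in range(len(opens) - 1)]
```

All headlines released on day t would then share the label computed for day t.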

3.1.2. Numerical Data Preprocessing

We used the open, high, low, and close prices as the historical prices (Table 1) and derived technical indicators, which are features computed from historical stock prices, using the mathematical equations presented in Reference [42]. In total, there are 15 technical indicators, with parameters adjusted as appropriate. Table 2 lists the technical indicators used in our experiment.

3.1.3. Data Normalization

The numerical inputs are in very different ranges; therefore, it is essential to standardize the dataset to a common range so that the model trains faster. We use the z-score to transform the data so that each input has zero mean and unit standard deviation.

Z = \frac{x - μ}{σ}

(2)

where μ is the mean of the input x, and σ is the standard deviation of the input x.
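A minimal sketch of Equation (2) for a single feature column follows. Two choices here are assumptions rather than details from the paper: the population standard deviation is used, and μ and σ are estimated on the training split only and then reused for the validation and testing splits.

```python
from statistics import mean, pstdev

def fit_zscore(train_values):
    """Estimate the mean and (population) standard deviation of one feature."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    return mu, sigma

def apply_zscore(values, mu, sigma):
    """Apply Z = (x - mu) / sigma; a constant feature maps to zeros."""
    if sigma == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]
```

Reusing the training statistics keeps information from later periods out of the inputs, which matters for time-ordered splits.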

3.2. Proposed Model

3.2.1. Model Architecture

The model we present aims to predict stock market trends, focusing on finding the context of daily headlines and correlating different types of data using numerical and textual data. Initially, we start with a data stream from textual and numerical information and finally feed the data to a forecasting model with correlation inference. An overview of the framework is shown in Figure 3.

First, we used a concept presented in [13], which reports that stocks belonging to the same industry tend to behave similarly, so grouping these stocks can improve the performance of the model. Therefore, we divided the news headlines by sector of the SET50 industry grouping and created a custom stock index for each sector to suit our purposes. Each sector index reflects the stocks within that industry segment, in accordance with Table 5 (discussed in Section 4.1.2), and reuses the construction of an existing index to make it suitable for this research. We use a method known as capitalization-weighted indexing, which is also used by major indexes such as the S&P500 (US), FTSE100 (UK), and SET50 (Thailand), because stocks with a large market capitalization drive the dynamics of the index. Capitalization weighting reflects the actual market well: the largest and most stable companies have the most influence on the index, while small or growing companies carry less weight and have less influence on its movement. A capitalization-weighted index therefore serves as a good gauge of the market’s overall health. Equation (3) shows the capitalization-weighted index calculation, following the SET50 calculation of the Stock Exchange of Thailand.

\mathrm{Index} = \frac{\sum_{i = 1}^{n} (\mathrm{Prices}_{it} \times \mathrm{ListedShare}_{it} \times \mathrm{AdjustmentFactor}_{it})}{\mathrm{AdjustedBMV}} \times \mathrm{BaseValue}

(3)

where Prices_{it} is the price of each security i constituting the index as of the calculation date t, ListedShare_{it} is the number of registered shares of each security constituting the index as of the calculation date, AdjustmentFactor_{it} is the weight limit rate of each security in the index as of the calculation date, and AdjustedBMV is the market capitalization of all securities that make up the index, weighted by the weight limit rate of each security in the index as of the base day.
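Equation (3) for a single calculation date can be sketched as follows. The function name and the tuple layout of the constituents are our own conventions, and the numbers in the usage note are made up for illustration.

```python
def cap_weighted_index(constituents, adjusted_bmv, base_value):
    """Capitalization-weighted index level for one calculation date.

    constituents: iterable of (price, listed_shares, adjustment_factor),
    one tuple per security in the index.
    """
    current_value = sum(
        price * listed_shares * adjustment_factor
        for price, listed_shares, adjustment_factor in constituents
    )
    return current_value / adjusted_bmv * base_value
```

For example, two securities given as (10.0, 100, 1.0) and (20.0, 50, 0.5) have a combined adjusted value of 1500; with an adjusted base market value of 1000 and a base value of 100, the index level is 150.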

Next, we input each headline into the BERT model, initialized with the Multilingual Cased (BERT-Base) pre-trained weights, and fine-tune it on our headline corpus. We take the first token of the BERT-Base output, which is commonly used for classification tasks, as the representation of the headline. We then train the prediction model through the BERT architecture workflow until the loss converges. Lastly, we keep the final embedding values of the best model and join the embeddings of each sector to represent the day’s headlines.

For numerical information, we feed the input into the LSTM network, which is widely used for time series data. The output from the LSTM is fed into a hidden layer that produces a forecast representation based on technical indicators and historical price data.

Finally, we concatenate the textual representation vectors and the numerical vector and feed them into the final hidden layers to predict the following day’s stock trend. The output of the model is a multiclass classification, where class “UP” represents an upward trend of the stock market, “DOWN” represents a downward trend of the stock market, and “STABLE” indicates that the trend of the stock market is in no definite direction or moving in a narrow range (sideways).
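The fusion step can be illustrated with a deliberately simplified sketch: the per-sector text vectors and the numerical vector are concatenated and passed through a single linear layer followed by a softmax over the three classes. The single layer, the vector sizes, and the weights are placeholders of ours; the actual model uses BERT and LSTM outputs and trained hidden layers.

```python
import math

CLASSES = ["UP", "DOWN", "STABLE"]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(sector_vectors, numeric_vector, weights, biases):
    """Concatenate sector and numeric vectors, then apply one linear layer.

    weights has one row per class; each row has one weight per fused feature.
    """
    fused = [v for vec in sector_vectors for v in vec] + list(numeric_vector)
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(CLASSES)), key=lambda k: probs[k])
    return CLASSES[best], probs
```

Keeping one text vector per sector before concatenation is what lets the final layers weigh sectors differently.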

3.2.2. Training Process

To create a model, we need to prepare training, validation, and testing datasets for model construction, model tuning, and model assessment, respectively. In addition, we have tried to avoid overfitting results by following a setup of [43], which divides time-series data into multiple parts over time using a sliding window, as shown in Figure 4. The performance of each model can be evaluated by averaging the results of all testing datasets rather than relying on just one testing dataset.

Our study has three datasets (#1, #2, and #3); this is called “3-fold cross-validation”. Using a sliding window of 1 year, the periods of training, validation, and testing datasets are three years, around one year, and one year, respectively. Although Sutheebanjard and Premchaiswadi [44] recommend that it is sufficient to use only one year of data to train a model, this suggestion cannot be applied to our study. Their model is a multiple linear regression, while ours is a deep learning model, which is much more complex and commonly requires more training data.
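The sliding-window split described above can be sketched as below. Aligning the windows to calendar years is an assumption made for simplicity; the exact boundaries used in the study are those shown in Figure 4.

```python
def sliding_window_folds(first_year, last_year, train_years=3,
                         val_years=1, test_years=1, step=1):
    """Enumerate train/validation/test year ranges with a sliding window."""
    folds = []
    span = train_years + val_years + test_years
    start = first_year
    while start + span - 1 <= last_year:
        val_start = start + train_years
        test_start = val_start + val_years
        folds.append({
            "train": list(range(start, val_start)),
            "validation": list(range(val_start, test_start)),
            "test": list(range(test_start, start + span)),
        })
        start += step
    return folds
```

With data from 2014 to 2020 this yields three folds, tested on 2018, 2019, and 2020, which matches the three datasets mentioned in the text.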

4. Experiment Settings

4.1. Datasets

The experiment was conducted on the stock market in Thailand via a stock index called “SET50”, along with textual data of news headlines in Thai, mainly due to data availability. The stock market in Thailand is one of the emerging markets of the world. The size of the market capitalization in Thailand is 2nd in Asia, as of 2019 (http://www.set.or.th/en/news/econ_mkt_dev/overview_2019.html, accessed on 5 June 2021), and 20th in the world as of 2018 (http://www.indexmundi.com/facts/indicators/CM.MKT.LCAP.CD/rankings, accessed on 5 June 2021). Therefore, it should share common characteristics with other emerging markets. Moreover, it is common to utilize both numerical and textual information for stock market forecasting, as stated for the semi-strong form level of EMH (Section 2.1). Therefore, the proposed model can be applied to other major indexes, and the results of this study may be indicative of other emerging markets.

4.1.1. Numerical Statistic

The end-of-day (EOD) data used in our research come from the Stock Exchange of Thailand (SET50) and cover the period from 2 January 2014 to 14 February 2020 (Table 3). The numerical data include price information, such as open, high, low, close, and volume, as well as fundamental values, such as the book value and the P/E ratio.

4.1.2. Textual Statistic

We collected only economic headlines from online sources that clearly group economic news under a single topic, restricted to the EOD timeline. Since our research scope is the entire Thai stock market index, all relevant economic news headlines are eligible. We therefore cleaned approximately two hundred thousand headlines using the method presented in Section 3 to make them tidy and suitable for further use. After cleaning, approximately one hundred thousand headlines, covering every trading day, remain and are used in this research. Detailed data statistics (Table 4) are given for each experiment, as not all tests use the same data range.

In addition, we have classified the stocks in SET50 into seven industry groups, as shown in Table 5. Next, we split the headlines of each sector into training, validation, and testing sets using three sliding windows, as summarized in Table 6.
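The grouping in Table 5 can be pictured as a lookup from stock symbol to sector. The following is a minimal sketch only: it assumes, purely for illustration, that a headline is routed to a sector when a SET50 ticker appears in it as a standalone uppercase token. This section does not state the actual assignment rule (the method is described in Section 3), the function name `sectors_for_headline` is hypothetical, and only three of the seven groups are included.

```python
import re

# Subset of Table 5 (three of the seven industry groups).
SECTOR_SYMBOLS = {
    "INDUS": ["IVL", "PTTGC", "SCGP"],
    "TECH": ["ADVANC", "DTAC", "INTUCH", "TRUE", "THCOM", "JAS", "KCE"],
    "AGRO": ["CBG", "CPF", "MINT", "TU", "OSP"],
}

# Reverse lookup: stock symbol -> sector.
SYMBOL_TO_SECTOR = {
    symbol: sector
    for sector, symbols in SECTOR_SYMBOLS.items()
    for symbol in symbols
}


def sectors_for_headline(headline):
    """Return the sorted sectors whose tickers appear in the headline.

    Assumption (not from the paper): tickers appear as standalone
    uppercase tokens, so ordinary words such as "Tourism" never match.
    """
    tokens = re.findall(r"\b[A-Z]+\b", headline)
    return sorted({SYMBOL_TO_SECTOR[t] for t in tokens if t in SYMBOL_TO_SECTOR})


# Hypothetical English headlines, used only to exercise the lookup.
print(sectors_for_headline("CPF and PTTGC report quarterly earnings"))
print(sectors_for_headline("Tourism rebounds"))
```

A headline that mentions stocks from several sectors is returned for each of them, and a headline with no ticker yields an empty list.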

4.2. Baseline Model

This section discusses details of baseline models [15,16] for performance comparison.

Numerical Input Only (LSTM (NUM)): This model uses only numerical information, as shown in the numerical module (the right box) in Figure 3. The inputs are the OHLC time series (four time series) and the technical indicators.

Textual Input Only (FastText): This model uses only textual information, in contrast to the previous baseline.

Both Inputs (FastText + NUM): This model uses both numerical and textual information, as shown in Figure 3.

For our model, BERT is chosen as a textual module with three variations. First, “BERT” refers to our model with only textual information. Second, “BERT + NUM” refers to our model with both numerical and textual information. Finally, “BERT_SEC + NUM” refers to our model with both sources of information in addition to a sector strategy for the textual data.
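To make the numerical inputs more concrete, the sketch below computes three of the technical indicators listed in Table 2 (SMA, EMA, and ROC) from a closing-price series. It is an illustration under stated assumptions, not the authors' implementation: the window lengths, the EMA smoothing factor 2 / (window + 1) seeded with the first price, and the sample prices are all choices made for this example and are not given in this section.

```python
def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def ema(values, window):
    """Exponential moving average with alpha = 2 / (window + 1).

    Seeding with the first value is a common convention, assumed here.
    """
    alpha = 2 / (window + 1)
    out = []
    for i, value in enumerate(values):
        if i == 0:
            out.append(float(value))
        else:
            out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def roc(values, window):
    """Rate of change in percent relative to the value `window` steps back."""
    out = []
    for i in range(len(values)):
        if i < window:
            out.append(None)
        else:
            previous = values[i - window]
            out.append(100 * (values[i] - previous) / previous)
    return out


closes = [10, 11, 12, 13, 14]  # made-up closing prices
print(sma(closes, 3))
print(ema(closes, 3))
print(roc(closes, 2))
```

Each function returns a series of the same length as its input, so the indicators can be stacked alongside the OHLC series as additional input channels.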

4.3. Evaluation Metrics

4.3.1. Performance Evaluation

As the current experiment is a supervised learning problem, a confusion matrix is used to compare the results of the different deep learning models in terms of precision, accuracy, recall, and F1 [45,46]. The metrics used to evaluate performance are shown in Table 7, where “TP” represents the true positives of the model, “TN” represents the true negatives, “FN” represents the false negatives, and “FP” represents the false positives.
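The four metrics named in Table 7 can be computed directly from the confusion-matrix counts. The following minimal sketch uses their standard binary definitions; the function name `classification_metrics` and the counts in the usage line are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up counts, only to exercise the formulas.
print(classification_metrics(tp=40, tn=30, fp=10, fn=20))
```

For a three-class problem such as DOWN, STABLE, and UP, the same counts can be taken per class (one class against the rest) and then averaged; how the averaging was done in the experiments is not stated here.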

4.3.2. Trading Profit

This work uses the stock buying simulation concept from Reference [43] to compare the returns obtained from the predictions of the models we have presented. We buy only one contract at a time and set a stop loss at five percent of the total cost to prevent a forced margin call and the loss of the entire investment if the model predicts the wrong direction. The conditions of the trading simulation are as follows: (i) buy when the model predicts an uptrend for the next day; (ii) sell the position when the model predicts a downtrend for the next day; (iii) otherwise, hold the position. Equation (4) as written is suitable only for simulating a bull market. However, an advantage of TFEX is that investors can trade in both rising and falling markets. Therefore, by simply swapping the buy and sell executions in the equation, the simulation can trade in a downtrend as well.

\mathrm{Execution}(t) = \begin{cases} \text{Buy } n \text{ contracts}, & \mathrm{Prediction}_{t-1} = \mathrm{UP} \text{ and } \#\mathrm{contracts} = 0 \\ \text{Sell } n \text{ contracts}, & \mathrm{Prediction}_{t-1} = \mathrm{DOWN} \text{ and } \#\mathrm{contracts} > 0 \\ \text{Hold}, & \text{otherwise} \end{cases}

(4)

The profit or loss is calculated when the sell condition is reached, as the difference between the exit price and the entry price multiplied by the number of contracts held and by the index multiplier. Additionally, if a position is still open on the last day of simulated trading, it is sold at the closing price of that day. The equation for calculating profit or loss is shown in Equation (5).

\mathrm{Margin}(t) = \begin{cases} \mathrm{IndexMul} \times n \times (\mathrm{ExitPrice} - \mathrm{EntryPrice}), & \mathrm{Execution}(t) = \mathrm{Sell} \\ \mathrm{IndexMul} \times n \times (\mathrm{ClosePrice} - \mathrm{EntryPrice}), & t = T \\ 0, & \text{otherwise} \end{cases}

(5)
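As a worked illustration of Equation (5), consider a hypothetical trade that enters at an index level of 1000 and exits at 1010 (both values are made up for this example), with n = 1 and an index multiplier of 200 as stated in the text:

```latex
\mathrm{Margin}(t) = 200 \times 1 \times (1010 - 1000) = 2000
```

If the same trade exited at 990 instead, the result would be 200 × 1 × (990 − 1000) = −2000, that is, a loss of the same size.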

The amount of money available at time t is the amount at the previous time step plus the profit or loss at time t, as shown in Equation (6).

\mathrm{tMoney}(t) = \mathrm{tMoney}(t - 1) + \mathrm{Margin}(t)

(6)

The return is calculated every time the position is sold. It is then converted to an average annual return for easy comparison using Equation (7).

\mathrm{AnnualizedReturn} = (1 + \mathrm{FinaltMoney})^{\frac{1}{\#\mathrm{SimDay}}} - 1

(7)

where n = 1, the index multiplier (IndexMul) = 200, T is the end of the time period, tMoney denotes total money, and $\#\mathrm{SimDay}$ is the total length of the simulation period.
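The trading rules in Equations (4) to (7) can be sketched as a short simulation loop. This is an illustration under explicit assumptions rather than the authors' simulator: trades are assumed to execute at the price of day t, the five-percent stop loss is omitted, the bear-market variant is not included, and the annualization helper assumes that the final return is expressed as a fraction and the simulation length in years. The names `simulate` and `annualized_return` and the sample data are hypothetical.

```python
INDEX_MULTIPLIER = 200  # index multiplier stated in the text
N_CONTRACTS = 1         # n = 1, one contract at a time


def simulate(prices, predictions, start_money=0.0):
    """Long-only simulation following Equations (4) to (6).

    prices[t] is the assumed execution price on day t, and
    predictions[t] is the trend predicted on day t for day t + 1.
    """
    money = start_money
    contracts = 0
    entry_price = 0.0
    actions = []
    last_day = len(prices) - 1
    for t in range(1, len(prices)):
        signal = predictions[t - 1]
        margin = 0.0
        if signal == "UP" and contracts == 0:
            contracts = N_CONTRACTS           # Equation (4): Buy
            entry_price = prices[t]
            actions.append("Buy")
        elif signal == "DOWN" and contracts > 0:
            # Equation (5), first case: realize profit or loss on Sell.
            margin = INDEX_MULTIPLIER * contracts * (prices[t] - entry_price)
            contracts = 0
            actions.append("Sell")
        else:
            actions.append("Hold")
        if t == last_day and contracts > 0:
            # Equation (5), second case: close any open position at t = T.
            margin += INDEX_MULTIPLIER * contracts * (prices[t] - entry_price)
            contracts = 0
        money += margin                       # Equation (6)
    return money, actions


def annualized_return(total_return, sim_years):
    """Equation (7), assuming a fractional return and a length in years."""
    return (1 + total_return) ** (1 / sim_years) - 1


# Made-up prices and predictions, only to exercise the rules.
prices = [1000, 1005, 1020, 1010, 1030, 1040]
predictions = ["UP", "STABLE", "DOWN", "UP", "STABLE", "DOWN"]
final_money, actions = simulate(prices, predictions)
print(final_money, actions)
```

In this toy run the first position is bought at 1005 and sold at 1010, and the second is bought at 1030 and closed at 1040 on the last day, so the profit is 200 × (5 + 10).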

4.4. Training and Hyperparameters

The computer resources used for this research have the following characteristics: Because of the large architectures we have designed, we could scale the batch size only up to 32. We use one of the most widely used optimization algorithms, Adaptive Moment Estimation (Adam) [47], with an initial learning rate of 0.001. We also apply batch normalization [48] to our deep neural network, which accelerates deep network training by reducing internal covariate shift. Lastly, we train each model for a total of fifteen epochs and always select the epoch with the best result on the validation dataset; training each model takes at least thirty-six hours.
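The settings above can be collected in a small configuration object, together with the rule of keeping the epoch that scores best on the validation data. This is only a sketch: the class and function names are hypothetical, the validation scores in the usage lines are made up, and breaking ties in favor of the earliest epoch is an assumption that the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters reported in this section."""
    batch_size: int = 32
    optimizer: str = "adam"
    learning_rate: float = 0.001
    epochs: int = 15
    use_batch_norm: bool = True


def select_best_epoch(validation_scores):
    """Return (1-based epoch, score) of the highest validation score.

    Ties are resolved in favor of the earliest epoch (an assumption).
    """
    best_index = max(range(len(validation_scores)),
                     key=lambda i: validation_scores[i])
    return best_index + 1, validation_scores[best_index]


config = TrainingConfig()
# Made-up validation accuracies, one per epoch.
scores = [0.50, 0.55, 0.58, 0.57, 0.58]
print(config.epochs, select_best_epoch(scores))
```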

5. Results and Discussion

We conducted the study using the numerical and textual information described in the previous section. The performance is reported as the average over the three testing datasets (#1, #2, and #3), which we refer to as “3-fold cross validation”. All models compared in the experiment are described and abbreviated in Section 4.2.

Table 8 and Table 9 show a model comparison in terms of accuracy and F1, respectively. Figure 5 illustrates a confusion matrix to provide details of our sector-based model (BERT_SEC + NUM), where rows and columns refer to actual and predicted classes, respectively. Note that the definition of each class (DOWN, STABLE, UP) is explained in Section 3.1.1. From the results, our sector-based model (BERT_SEC + NUM) is the winner with an average accuracy of 61.28% and an average F1 of 59.58%. In more detail, it achieves the highest accuracy in datasets #1 and #3 with 63.67% and 58.67%, respectively. It shows approximately the same trend in terms of F1, achieving 62.30% and 56.34% in datasets #1 and #3, respectively. Although the accuracy of the winner (61.28%) may not look promising, it is still higher than that of prior work (51.27%) [16]. This can be expected since stock price prediction is considered a complicated problem due to its high volatility. Moreover, if we consider the confusion matrix in Figure 5, there are two serious error cases: lower-left (actual “UP”, but predicted “DOWN”) and upper-right (actual “DOWN”, but predicted “UP”). Our model makes these errors in only 4.33%, 4%, and 4.17% of the cases in datasets #1, #2, and #3, respectively.
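The averages quoted above follow directly from the per-dataset values in Table 8 and Table 9. The short check below reproduces them for the sector-based model; the helper name `fold_average` is hypothetical, and the numbers are copied from the two tables.

```python
def fold_average(scores):
    """Mean of the per-dataset scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


# BERT_SEC + NUM on datasets #1, #2, #3 (Table 8 and Table 9).
accuracy_per_fold = [63.67, 61.50, 58.67]
f1_per_fold = [62.30, 60.10, 56.34]

print(fold_average(accuracy_per_fold))
print(fold_average(f1_per_fold))
```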

In the following subsections, we discuss the effect of each module on our model: transfer learning, numerical and textual information, and the sector-wise strategy. Finally, a trading strategy using our prediction results is simulated to show the annualized returns of our algorithm.

5.1. Effects of Transfer Learning

In this section, we aim to compare different language models: FastText vs. BERT. The results show that BERT outperforms FastText in both accuracy (60.72% vs. 55.22%) and F1 (56.97% vs. 48.15%). This demonstrates that BERT is the right choice for the textual module in our model. Unlike FastText, BERT can embed contextual information and handle unknown words by using subwords instead of a whole word. Furthermore, Multilingual BERT is pretrained with a huge training corpus.
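The remark about handling unknown words through subwords can be illustrated with a toy greedy longest-match-first splitter in the style of BERT's WordPiece tokenization. This is a sketch only: the vocabulary below is hypothetical and tiny (real BERT vocabularies are learned from the pretraining corpus), English words are used for readability, and the function `wordpiece_tokenize` is written for this example rather than taken from any library.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into subword pieces by greedy longest match.

    Pieces that do not start the word carry a "##" prefix. If no piece
    matches at some position, the whole word maps to unk_token.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary.
toy_vocab = {"stock", "##s", "un", "##predict", "##able", "market"}

print(wordpiece_tokenize("unpredictable", toy_vocab))
print(wordpiece_tokenize("stocks", toy_vocab))
print(wordpiece_tokenize("xyz", toy_vocab))
```

A word that is absent from the vocabulary as a whole, such as "unpredictable" here, is still represented by known pieces instead of a single unknown token.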

5.2. Effects of Numerical and Textual Features

In this section, we compare a text-only model with a model that uses both types of information. The results confirm that the model can be improved by using both. For “BERT + NUM” vs. “BERT”, the accuracy and F1 are 60.78% vs. 60.72% and 57.61% vs. 56.97%, respectively. For “FastText + NUM” vs. “FastText”, the accuracy and F1 are 57.67% vs. 55.22% and 56.65% vs. 48.15%, respectively.

5.3. Effects of Industry-Specific News Headlines (Sector)

In this section, we aim to analyze different embedding strategies for textual information. It should be more accurate if news headlines are embedded separately for each sector rather than the whole market. The results show that “BERT_SEC + NUM” outperforms “BERT + NUM” in both accuracy (61.28% vs. 60.78%) and F1 (59.58% vs. 57.61%). In conclusion, our sector-based model is the winner as it employs a contextualized language model (BERT), utilizes both sources of information, and handles textual information properly (sector-wise).

5.4. Annualized Return Based on Trading Simulation

As shown in Table 10, the annualized returns vary considerably across datasets, and several of them are negative. However, the results should not be compared between datasets because they are based on different market conditions. Therefore, we only compare models that use the same dataset, even when the returns are negative. Nevertheless, our tests show that adding the numerical data increases the model’s average annualized return. The negative annualized returns may stem from the trading strategy itself, which can incur high transaction costs and yield negative returns when the price gains are smaller than the trading costs. Moreover, the final model is selected based on the highest prediction accuracy on the validation data, which can lead to negative profits if the model predicts accurately during slight price movements but forecasts wrongly when the price changes significantly.

Based on EMH in Section 2.1, Table 10 shows that the average annualized return of the SET50 is −2.13%, which is the lowest average return. Although it shows the highest return in dataset #1 (18.55%), it obtains the worst returns in datasets #2 and #3 (−7.17% and −17.76%, respectively). This demonstrates the volatility and unpredictability of the stock market; thus, it is crucial to have a trading strategy (e.g., a forecasting model) to help investors outperform the market. Furthermore, our sector-based model (BERT_SEC + NUM) is the winner with the highest average annualized return (8.47%). The model is successful because it utilizes numerical data and textual data specifically for each industry (sector). In addition, the model using only historical data (LSTM (NUM)) gains an average annualized return of 2.47%, which outperforms SET50. Therefore, it can be concluded that Thailand’s stock market is not efficient at the weak-form level.

6. Conclusions

This research proposes a deep learning model to forecast the stock market trend (also called “a stock index”) based on both numerical (historical prices and technical indicators) and textual (news headlines) information. Since a stock index is a combination of many individual stocks from various sectors, we also propose to embed news into several industry-specific vectors. The experiments were conducted on SET50, a stock index in Thailand, along with news headlines in Thai. The results show that our sector-based model outperforms all baselines with an average accuracy of 61.28% and an average F1 of 59.58%. Extensive experiments show that each proposed module improves the performance. Moreover, a trading strategy utilizing our prediction results was simulated. It achieves the highest average annualized return of 8.47%.

For future studies, this research can be extended in two aspects. First, the model can be extended to other indexes, e.g., S&P500 in the United States and Nikkei 225 in Japan. However, the sector-based textual model must be tailored specifically for each market. Second, other external information can be integrated into our model to further improve its performance. In particular, for the COVID-19 epidemic period, government announcements and the number of infected patients can be included in our model.

Author Contributions

Conceptualization, K.P. and P.V.; methodology, K.P. and P.V.; software, K.P. and P.V.; validation, K.P. and P.V.; investigation, K.P. and P.V.; resources, K.P. and P.V.; data curation, K.P.; writing—original draft preparation, K.P. and P.V.; writing—review and editing, K.P. and P.V.; visualization, K.P. and P.V.; supervision, P.V.; project administration, K.P. and P.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Fama, E.F. Efficient Market Hypothesis. Ph.D. Thesis, University of Chicago, Ellis Ave, IL, USA, 1960.
2. Banz, R.W. The relationship between return and market value of common stocks. J. Financ. Econ. 1981, 9, 3–18.
3. Basu, S. Investment performance of common stocks in relation to their price-earnings ratios: A test of the efficient market hypothesis. J. Financ. 1977, 32, 663–682.
4. Jegadeesh, N.; Titman, S. Returns to buying winners and selling losers: Implications for stock market efficiency. J. Financ. 1993, 48, 65–91.
5. Amihud, Y.; Mendelson, H. Liquidity and stock returns. Financ. Anal. J. 1986, 42, 43–48.
6. Leigh, W.; Purvis, R.; Ragusa, J.M. Forecasting the NYSE composite index with technical analysis, pattern recognizer, neural network, and genetic algorithm: A case study in romantic decision support. Decis. Support Syst. 2002, 32, 361–377.
7. Mizuno, H.; Kosaka, M.; Yajima, H.; Komoda, N. Application of neural network to technical analysis of stock market prediction. Stud. Inform. Control 1998, 7, 111–120.
8. Gidofalvi, G.; Elkan, C. Using news articles to predict stock price movements. In Department of Computer Science and Engineering; University of California: San Diego, CA, USA, 2001.
9. Gunduz, H.; Cataltepe, Z. Borsa Istanbul (BIST) daily prediction using financial news and balanced feature selection. Expert Syst. Appl. 2015, 42, 9001–9011.
10. Schumaker, R.P.; Chen, H. Textual analysis of stock market prediction using breaking financial news: The AZFin text system. ACM Trans. Inf. Syst. (TOIS) 2009, 27, 1–19.
11. Wang, B.; Huang, H.; Wang, X. A novel text mining approach to financial time series forecasting. Neurocomputing 2012, 83, 136–145.
12. Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Deep learning for event-driven stock prediction. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
13. Akita, R.; Yoshihara, A.; Matsubara, T.; Uehara, K. Deep learning for stock prediction using numerical and textual information. In Proceedings of the 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS), Okayama, Japan, 26–29 June 2016.
14. Vargas, M.R.; De Lima, B.S.; Evsukoff, A.G. Deep learning for stock market prediction from financial news articles. In Proceedings of the 2017 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), Annecy, France, 26–28 June 2017.
15. Oncharoen, P.; Vateekul, P. Deep learning for stock market prediction using event embedding and technical indicators. In Proceedings of the 2018 5th International Conference on Advanced Informatics: Concept Theory and Applications (ICAICTA), Krabi, Thailand, 14–17 August 2018.
16. Chiewhawan, T.; Vateekul, P. Explainable deep learning for Thai stock market prediction using textual representation and technical indicators. In Proceedings of the 8th International Conference on Computer and Communications Management, Singapore, 17–19 July 2020; pp. 19–23.
17. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
18. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
19. Sun, C.; Qiu, X.; Xu, Y.; Huang, X. How to fine-tune BERT for text classification? In China National Conference on Chinese Computational Linguistics; Springer: Berlin, Germany, 2019; pp. 194–206.
20. Hiew, J.Z.G.; Huang, X.; Mou, H.; Li, D.; Wu, Q.; Xu, Y. BERT-based financial sentiment index and LSTM-based stock return predictability. arXiv 2019, arXiv:1906.09024.
21. Malkiel, B.G. The efficient market hypothesis and its critics. J. Econ. Perspect. 2003, 17, 59–82.
22. Wu, D.; Fung, G.P.C.; Yu, J.X.; Pan, Q. Stock prediction: An event-driven approach based on bursty keywords. Front. Comput. Sci. China 2009, 3, 145–157.
23. Xie, B.; Passonneau, R.; Wu, L.; Creamer, G.G. Semantic frames to predict stock price movement. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, 4–9 August 2013; pp. 873–883.
24. Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Using structured events to predict stock price movement: An empirical investigation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1415–1425.
25. Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Knowledge-driven event embedding for stock prediction. In Proceedings of the Coling 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 2133–2142.
26. Chang, C.Y.; Zhang, Y.; Teng, Z.; Bozanic, Z.; Ke, B. Measuring the information content of financial news. In Proceedings of the Coling 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 3216–3225.
27. Peng, Y.; Jiang, H. Leverage financial news to predict stock price movements using word embeddings and deep neural networks. arXiv 2015, arXiv:1506.07220.
28. Luss, R.; d’Aspremont, A. Predicting abnormal returns from news using text classification. Quant. Financ. 2015, 15, 999–1012.
29. Skuza, M.; Romanowski, A. Sentiment analysis of Twitter data within big data distributed environment for stock prediction. In Proceedings of the 2015 Federated Conference on Computer Science and Information Systems (FedCSIS), Lodz, Poland, 13–16 September 2015; pp. 1349–1354.
30. Sehgal, V.; Song, C. SOPS: Stock prediction using web sentiment. In Proceedings of the Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007), Omaha, NE, USA, 28–31 October 2007; pp. 21–26.
31. Lavrenko, V.; Schmill, M.; Lawrie, D.; Ogilvie, P.; Jensen, D.; Allan, J. Mining of concurrent text and time series. In KDD-2000 Workshop on Text Mining; Citeseer: University Park, PA, USA, 2000; Volume 2000, pp. 37–44.
32. Xiong, G.; Bharadwaj, S. Asymmetric roles of advertising and marketing capability in financial returns to news: Turning bad into good and good into great. J. Mark. Res. 2013, 50, 706–724.
33. Shi, L.; Teng, Z.; Wang, L.; Zhang, Y.; Binder, A. DeepClue: Visual interpretation of text-based deep stock prediction. IEEE Trans. Knowl. Data Eng. 2018, 31, 1094–1108.
34. Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
35. Bojanowski, P.; Grave, E.; Joulin, A.; Mikolov, T. Enriching word vectors with subword information. Trans. Assoc. Comput. Linguist. 2017, 5, 135–146.
36. Ling, W.; Luís, T.; Marujo, L.; Astudillo, R.F.; Amir, S.; Dyer, C.; Black, A.W.; Trancoso, I. Finding function in form: Compositional character models for open vocabulary word representation. arXiv 2015, arXiv:1508.02096.
37. Kim, Y.; Jernite, Y.; Sontag, D.; Rush, A. Character-aware neural language models. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30.
38. Wehrmann, J.; Becker, W.; Cagnini, H.E.; Barros, R.C. A character-based convolutional neural network for language-agnostic Twitter sentiment analysis. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 2384–2391.
39. Othan, D.; Kilimci, Z.H.; Uysal, M. Financial sentiment analysis for predicting direction of stocks using bidirectional encoder representations from transformers (BERT) and deep learning models. In Proceedings of the International Conference on “Innovative & Intelligent Technologies”, Istanbul, Turkey, 5–6 December 2019; pp. 30–35.
40. Tantisantiwong, N.; Komenkul, K.; Channuntapipat, C.; Jeamwatthanachai, W. Capturing investor sentiment from big data: The effects of online social media on SET50 index. CM Res. Innov. 2020, 2020, 1–42.
41. Hu, Z.; Liu, W.; Bian, J.; Liu, X.; Liu, T.-Y. Listening to chaotic whispers: A deep learning framework for news-oriented stock trend prediction. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, Marina Del Rey, CA, USA, 5–9 February 2018; pp. 261–269.
42. Zhai, Y.; Hsu, A.; Halgamuge, S.K. Combining news and technical indicators in daily stock price trends prediction. In International Symposium on Neural Networks; Springer: Berlin, Germany, 2007; pp. 1087–1096.
43. Sezer, O.B.; Ozbayoglu, A.M. Algorithmic financial trading with deep convolutional neural networks: Time series to image conversion approach. Appl. Soft Comput. 2018, 70, 525–538.
44. Sutheebanjard, P.; Premchaiswadi, W. Determining the time period and amount of training data for Stock Exchange of Thailand index prediction. In Proceedings of the 2010 2nd IEEE International Conference on Information and Financial Engineering, Chongqing, China, 17–19 September 2010; pp. 359–363.
45. Gupta, B.B.; Sheng, Q.Z. Machine Learning for Computer and Cyber Security: Principle, Algorithms, and Practices; CRC Press: Boca Raton, FL, USA, 2019.
46. Tzimas, M.; Michopoulos, J.; Po, G.; Reid, A.C.; Papanikolaou, S. Inference and prediction of nanoindentation response in FCC crystals: Methods and discrete dislocation simulation examples. arXiv 2019, arXiv:1910.07587.
47. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
48. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 448–456.

Figure 1. The hierarchical neural network (from Figure 2 in [33]). Detailed structure of a hierarchical representation (left). The overview architecture for the whole model (right).

Figure 2. Illustration of fine-tuning BERT for classification tasks (from Figure 4(b) in [18]), where each news headline (input) contains N tokens (words), and the output is a predicted class label y.

Figure 3. Our proposed model using numerical inputs and sector-based textual information.

Figure 4. Training, validating, and testing approach.

Figure 5. Confusion matrix of our sector-based model (BERT_SEC + NUM) on three datasets.

Table 1. Fundamental parameter summary.

Feature	Explanation
Open Price	The first share price at the start of daily trading
Close Price	The final share price at the end of daily trading
High Price	The highest share price during daily trading
Low Price	The lowest share price during daily trading

Table 2. List of 15 technical indicators.

RSI	CMO	WMA	PRO	Williams %R
EMA	ROC	SMA	HMA	TripleEMA
DMI	PSI	CCI	CMFI	MACD

Table 3. Numerical data summary showing the total number of records (days).

No.	Data Period	Training (days)	Validating (days)	Testing (days)
#1	Jan.-2014 to Apr.-2018	1117	109	351
#2	May-2014 to Mar.-2019	1093	351	314
#3	Jun.-2015 to Feb.-2020	1115	261	344

Table 4. Textual data summary showing the number of total news headlines; there are many news headlines per day.

No.	Data Period	Training (headlines)	Validating (headlines)	Testing (headlines)
#1	Jan.-2014 to Apr.-2018	101,916	14,727	13,662
#2	May-2014 to Mar.-2019	101,875	14,802	14,398
#3	Jun.-2015 to Feb.-2020	101,860	13,722	14,228

Table 5. Industry grouping for the SET50 index.

Industry Symbols	Industry Group	Stock Symbols
INDUS	Petrochemicals and Chemicals, Packaging	IVL, PTTGC, SCGP
TECH	Information and Communication, Electronic components, Technology	ADVANC, DTAC, INTUCH, TRUE, THCOM, JAS, KCE
PROPCON	Property Development, Construction services, Construction materials	AWC, CPN, LH, SCC, TOA, WHA, CK, ITD, PS, SCCC, TPIPL, TASCO
SERVICE	Tourism and Leisure, Commerce, Transportation and Logistics, Health Care Services	AOT, BEM, BTS, CPALL, BH, HMPRO, CRC, BDMS, BJC, GLOBAL, VGI
FINCIAL	Banking, Finance and Securities	BBL, KBANK, KTB, SCB, MTC, TMB, TISCO, KTC, SAWAD, TCAP, BLA, MTLS
RESOURC	Energy and Utilities	EGCO, GULF, GPSC, IRPC, PTTEP, RATCH, TOP, BGRIM, BPP, EA, PTT, TTW
AGRO	Food and Beverage	CBG, CPF, MINT, TU, OSP

Table 6. Data statistics for the industry grouping experiments.

No.	Data Period	Sector	Training (headlines)	Validating (headlines)	Testing (headlines)
#1	Jan.-2014 to Apr.-2018	FINCIAL	27,662	4067	4030
		SERVICE	24,689	3553	2977
		RESOURC	16,176	2215	2236
		PROPCON	14,858	2420	2165
		AGRO	11,480	1723	1598
		TECH	4808	569	440
		INDUS	2070	349	458
#2	May-2014 to Mar.-2019	FINCIAL	27,666	4030	4298
		SERVICE	24,595	3625	3132
		RESOURC	16,226	2195	1378
		PROPCON	14,884	2408	2341
		AGRO	11,477	1756	1655
		TECH	4784	564	399
		INDUS	2091	348	470
#3	Jun.-2015 to Feb.-2020	FINCIAL	27,760	3731	4245
		SERVICE	24,490	3413	3097
		RESOURC	16,195	2034	2299
		PROPCON	14,919	2222	2298
		AGRO	11,453	1670	1609
		TECH	4768	515	469
		INDUS	2089	348	466

Table 7. Metrics for classification evaluations.

Metrics	Formula	Explanation
Accuracy	$\frac{TP + TN}{TP + TN + FP + FN}$	The percentage of correct forecasts among all samples.
Recall	$\frac{TP}{TP + FN}$	The proportion of actual positive samples that are classified as positive.
Precision	$\frac{TP}{TP + FP}$	The proportion of samples classified as positive that are actually positive.
F1	$2 \times \frac{Precision \times Recall}{Precision + Recall}$	The harmonic mean of precision and recall.

Table 8. Model comparison in terms of “accuracy” on testing data based on a 3-fold cross validation (#1, #2, #3 refers to the result of each fold), and the boldface represents the winner.

Group	Model	Accuracy #1 (%)	Accuracy #2 (%)	Accuracy #3 (%)	Avg Accuracy (%)
Baseline	LSTM (NUM)	54.28	45.71	42.86	47.62
	FastText	54.50	58.33	52.83	55.22
	FastText + NUM	60.17	58.67	54.17	57.67
Ours	BERT	62.67	61.17	58.33	60.72
	BERT + NUM	62.17	64.50	55.67	60.78
	BERT_SEC + NUM	63.67	61.50	58.67	61.28

Table 9. Model comparison in terms of “F1-score” on testing data based on a 3-fold cross validation (#1, #2, #3 refers to the result of each fold), and the boldface is the winner.

Group	Model	F1-Score #1 (%)	F1-Score #2 (%)	F1-Score #3 (%)	Avg F1-Score (%)
Baseline	LSTM (NUM)	54.63	43.75	40.44	46.27
	FastText	42.54	54.51	47.50	48.15
	FastText + NUM	58.69	56.34	54.91	56.65
Ours	BERT	56.09	58.47	56.36	56.97
	BERT + NUM	56.64	60.13	56.05	57.61
	BERT_SEC + NUM	62.30	60.10	56.34	59.58

Table 10. Model comparison in terms of “Annualized Return” on testing data based on a 3-fold cross validation (#1, #2, #3 refers to the result of each fold), and boldface is the winner.

Group	Model	Annualized Return #1 (%)	Annualized Return #2 (%)	Annualized Return #3 (%)	Avg Annualized Return (%)
Baseline	SET50	18.55	−7.17	−17.76	−2.13
	LSTM (NUM)	8.40	11.10	−12.10	2.47
	FastText	6.90	8.60	−12.10	1.13
	FastText + NUM	15.30	−4.50	5.20	5.33
Ours	BERT	7.70	12.00	−11.90	2.60
	BERT + NUM	15.80	−3.90	6.30	6.06
	BERT_SEC + NUM	17.50	−3.30	11.20	8.47

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

