Stock Movement Prediction Using Machine Learning Based on Technical Indicators and Google Trend Searches in Thailand
Abstract
1. Introduction
2. Methodology
2.1. Study Design
- Yahoo Finance (stock market data): weekly stock prices and trading volumes for stocks in SET100.
- Google Trends (keywords: internet search): weekly search data for 484 specific terms that are commonly mentioned by the public on the internet. The selection of internet search terms is also based on the research by Preis et al. (2013) and on the websites of Finnomena, SET, Krungsri, Encyclopedia, and stock2morrow.
2.2. Dataset
2.2.1. Stock Market Data
- The opening price is the first price of any listed stock at the start of a trading day.
- The high and low values represent the stock’s highest and lowest prices on that particular day. Generally, traders utilize these statistics to determine the volatility of a stock.
- The closing price is the price of the stock at the close of the trading day.
- The adjusted close price is the closing price adjusted for dividend distributions, so it is regarded as the best reflection of the stock’s actual value.
2.2.2. Keywords (Internet Search)
2.3. Data Preprocessing
2.3.1. Technical Indicators
2.3.2. Keyword Selection
- $r$: Pearson correlation coefficient between variables x and y.
- $\sum x$: The sum of the measured data from variable x.
- $\sum y$: The sum of the measured data from variable y.
- $\sum xy$: The sum of the products of the variables x and y.
- $\sum x^2$: The sum of the squares of variable x.
- $\sum y^2$: The sum of the squares of variable y.
- $n$: Number of data points.
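As a minimal sketch of how these sums combine, the standard computational form of the Pearson coefficient is r = (nΣxy − ΣxΣy) / sqrt((nΣx² − (Σx)²)(nΣy² − (Σy)²)). The function name `pearson_r` and the `weekly_return` values below are illustrative assumptions, not taken from the study; the search series reuses the "Stocks" column of the sample Google Trends table.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from the raw sums defined above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant series")
    return numerator / denominator

# Weekly search interest for one keyword ("Stocks" column of the sample table).
search_interest = [17, 32, 25, 42, 20, 21]
# Hypothetical weekly returns (%) for one stock, invented for illustration only.
weekly_return = [0.5, 1.2, 0.4, 1.9, -0.3, 0.1]

print(round(pearson_r(search_interest, weekly_return), 4))
```

Keywords could then be ranked per stock by this coefficient, with the most positive and most negative values kept as candidate features.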
2.4. Modeling
2.4.1. Logistic Regression
- Use the dataset to form a simple or multiple linear regression equation, depending on the number of independent variables used.
- Pass the output of the regression equation through the sigmoid function. The raw regression output can be greater than 1 or less than 0, whereas a probability must lie between 0 and 1, so the sigmoid function is used to map the output into that range.
- The output of the sigmoid function is the probability of the event of interest.
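The steps above can be sketched in a few lines. This is a minimal illustration only: the coefficients, the two feature values, and the 0.5 decision threshold are hypothetical and are not the fitted model from the study.

```python
import math

def sigmoid(z):
    """Map any real-valued score into the range 0 to 1 (numerically stable form)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(features, weights, bias):
    """Linear regression score followed by the sigmoid function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical coefficients for two inputs (e.g. one indicator, one keyword series).
weights = [0.8, -0.5]
bias = -0.1
features = [1.2, 0.4]

p_up = predict_proba(features, weights, bias)
label = 1 if p_up >= 0.5 else 0  # assumed threshold of 0.5
print(f"P(up) = {p_up:.3f}, predicted class = {label}")
```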
2.4.2. Random Forest
- In Random Forest, n random records are drawn from the dataset of k records to form each bootstrapped dataset.
- An individual decision tree is constructed for each sample.
- Each decision tree generates an output.
- The final output is determined by majority voting across the trees.
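The four steps above can be illustrated with a deliberately simplified sketch. As an assumption made for brevity, each "tree" here is a one-split stump on a single randomly chosen feature, whereas a real random forest grows deeper trees and samples features at every split. The data rows and all function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, n, rng):
    """Step 1: draw n records with replacement from the k available rows."""
    return [rng.choice(rows) for _ in range(n)]

def fit_stump(sample, rng):
    """Step 2: fit a one-split tree on one random feature (stand-in for a full tree)."""
    f = rng.randrange(len(sample[0][0]))
    best = None
    for x, _ in sample:
        threshold = x[f]
        left = [label for xi, label in sample if xi[f] <= threshold]
        right = [label for xi, label in sample if xi[f] > threshold]
        left_vote = Counter(left).most_common(1)[0][0]
        right_vote = Counter(right).most_common(1)[0][0] if right else left_vote
        errors = sum(l != left_vote for l in left) + sum(l != right_vote for l in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_vote, right_vote)
    _, threshold, left_vote, right_vote = best
    return lambda x: left_vote if x[f] <= threshold else right_vote

def fit_forest(rows, n_trees, n, seed=0):
    """Build one stump per bootstrapped dataset."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, n, rng), rng) for _ in range(n_trees)]

def predict(forest, x):
    """Steps 3 and 4: collect each tree's output and take the majority vote."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical rows: ([feature_1, feature_2], label) with label 1 = up, 0 = down.
rows = [
    ([0.10, 0.20], 0), ([0.20, 0.10], 0), ([0.30, 0.30], 0), ([0.15, 0.25], 0),
    ([0.80, 0.90], 1), ([0.90, 0.70], 1), ([0.70, 0.80], 1), ([0.85, 0.95], 1),
]
forest = fit_forest(rows, n_trees=25, n=len(rows))
print(predict(forest, [0.95, 0.90]), predict(forest, [0.05, 0.05]))
```

An odd number of trees is used so that a binary majority vote cannot tie.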
2.4.3. Extreme Gradient Boosting (XGBoost)
2.5. Evaluation
- A True Positive (TP) is a prediction that matches what actually happened: the prediction is “true” and the actual outcome is “true”.
- A True Negative (TN) is a prediction that matches what actually happened: the prediction is “not true” and the actual outcome is “not true”.
- A False Positive (FP) is a prediction that does not match what actually happened: the prediction is “true”, but the actual outcome is “not true”.
- A False Negative (FN) is a prediction that does not match what actually happened: the prediction is “not true”, but the actual outcome is “true”.
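To make the four outcomes concrete, the following sketch counts them from paired labels and derives accuracy, precision, recall, and F1-score using their standard binary definitions. The label lists are hypothetical, and the function names are illustrative rather than taken from the study.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    return tp, tn, fp, fn

def classification_metrics(tp, tn, fp, fn):
    """Standard metrics derived from the confusion matrix; 0.0 when undefined."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical weekly movements: 1 = price went up, 0 = price did not go up.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
metrics = classification_metrics(tp, tn, fp, fn)
print(tp, tn, fp, fn, metrics)
```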
3. Results and Discussion
3.1. Performance on Dataset
3.2. Performance
3.3. Performance for Crisis
3.3.1. Quantitative Easing (QE)
3.3.2. Controversy
3.3.3. Foreign Investors
3.3.4. MSCI
3.3.5. COVID-19 (1)
3.3.6. Protestation
3.3.7. COVID-19 (2)
3.3.8. COVID-19 (3)
3.3.9. COVID-19 (4)
3.3.10. COVID-19 (5)
4. Conclusions
4.1. Conclusion
4.2. Limitation and Future Work
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Alfonso, Gerardo, and Daniel R. Ramirez. 2020. A Nonlinear Technical Indicator Selection Approach for Stock Markets. Application to the Chinese Stock Market. Mathematics 8: 1301.
- Ananthakumar, Usha, and Ratul Sarkar. 2017. Application of Logistic Regression in Assessing Stock Performances. Paper presented at the 2017 IEEE 15th International Conference on Dependable, Autonomic and Secure Computing, 15th International Conference on Pervasive Intelligence and Computing, 3rd International Conference on Big Data Intelligence and Computing and Cyber Science and Technology Congress(DASC/PiCom/DataCom/CyberSciTech), Orlando, FL, USA, November 6–10; pp. 1242–47.
- Anghel, Gabriel Dan I. 2015. Stock Market Efficiency and the MACD. Evidence from Countries around the World. Procedia Economics and Finance 32: 1414–31.
- Antonio Agudelo Aguirre, Alberto, Ricardo Alfredo Rojas Medina, and Néstor Darío Duque Méndez. 2020. Machine learning applied in the stock market through the Moving Average Convergence Divergence (MACD) indicator. Investment Management and Financial Innovations 17: 44–60.
- Atkins, Adam, Mahesan Niranjan, and Enrico Gerding. 2018. Financial news predicts stock market volatility better than close price. The Journal of Finance and Data Science 4: 120–37.
- Basak, Suryoday, Saibal Kar, Snehanshu Saha, Luckyson Khaidem, and Sudeepa Roy Dey. 2019. Predicting the direction of stock market prices using tree-based classifiers. The North American Journal of Economics and Finance 47: 552–67.
- Bhargavi, R., Srinivas Gumparthi, and R. Anith. 2017. Relative Strength Index for Developing Effective Trading Strategies in Constructing Optimal Portfolio. International Journal of Applied Engineering Research 12: 8926–36.
- Breiman, Leo. 2001. Random Forests. Machine Learning 45: 5–32.
- Bustos, Oscar, and Alexandra Pomares-Quimbaya. 2020. Stock market movement forecast: A Systematic review. Expert Systems with Applications 156: 113464.
- Chai, Jian, Chenyu Zhao, Yi Hu, and Zhe George Zhang. 2021. Structural analysis and forecast of gold price returns. Journal of Management Science and Engineering 6: 135–45.
- Chen, Qian, Wenyu Zhang, and Yu Lou. 2020. Forecasting Stock Prices Using a Hybrid Deep Learning Model Integrating Attention Mechanism, Multi-Layer Perceptron, and Bidirectional Long-Short Term Memory Neural Network. IEEE Access 8: 117365–76.
- Dash, Rajashree, and Pradipta Kishore Dash. 2016. A hybrid stock trading framework integrating technical analysis with machine learning techniques. The Journal of Finance and Data Science 2: 42–57.
- DeLong, Elizabeth R., David M. DeLong, and Daniel L. Clarke-Pearson. 1988. Comparing the Areas Under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. Biometrics 44: 837–45.
- Geurts, Pierre, Damien Ernst, and Louis Wehenkel. 2006. Extremely randomized trees. Machine Learning 63: 3–42.
- Ghatasheh, Nazeeh. 2014. Business Analytics using Random Forest Trees for Credit Risk Prediction: A Comparison Study. International Journal of Advanced Science and Technology 72: 19–30.
- Huang, Melody Y., Randall R. Rojas, and Patrick D. Convery. 2019. Forecasting stock market movements using Google Trend searches. Empirical Economics 59: 2821–39.
- Le, Thi-Thu-Huong, Yustus Eko Oktian, and Howon Kim. 2022. XGBoost for Imbalanced Multiclass Classification-Based Industrial Internet of Things Intrusion Detection Systems. Sustainability 14: 8707.
- Li, Qing, TieJun Wang, Ping Li, Ling Liu, Qixu Gong, and Yuanzhu Chen. 2014. The effect of news and public mood on stock movements. Information Sciences 278: 826–40.
- Nishimura, Yoshito, and Jared D. Acoba. 2022. Impact of Breast Cancer Awareness Month on Public Interest in the United States between 2012 and 2021: A Google Trends Analysis. Cancers 14: 2534.
- Papadamou, Stephanos, Alexandros Koulis, Constantinos Kyriakopoulos, and Athanasios P. Fassas. 2022. Cannabis Stocks Returns: The Role of Liquidity and Investors’ Attention via Google Metrics. International Journal of Financial Studies 10: 7.
- Perry, Marcus B. 2011. The Weighted Moving Average Technique. In Wiley Encyclopedia of Operations Research and Management Science. Hoboken: John Wiley & Sons, Inc.
- Poutachidou, Nikoletta, and Stephanos Papadamou. 2021. The Effect of Quantitative Easing through Google Metrics on US Stock Indices. International Journal of Financial Studies 9: 56.
- Praekhaow, Puchong. 2010. Determination of Trading Points using the Moving Average Methods. Paper presented at the International Conference for a Sustainable Greater Mekong Subregion, Bangkok, Thailand, August 26–27; pp. 519–44.
- Preis, Tobias, Helen Susannah Moat, and H. Eugene Stanley. 2013. Quantifying trading behavior in financial markets using Google Trends. Scientific Reports 3: 1684.
- Robles, Víctor, Concha Bielza, Pedro Larrañaga, Santiago González, and Lucila Ohno-Machado. 2008. Optimizing logistic regression coefficients for discrimination and calibration using estimation of distribution algorithms. Top 16: 345–66.
- Sadorsky, Perry. 2021. A Random Forests Approach to Predicting Clean Energy Stock Prices. Journal of Risk and Financial Management 14: 48.
- Smart, Hoon. 2018. หุ้นสุดเหวี่ยงปิดลบ 8 จุด - ต่างชาติขาย 1.8 พันล้าน [Stocks swing sharply and close down 8 points; foreign investors sell 1.8 billion]. Available online: https://hoonsmart.com/archives/39374 (accessed on 8 April 2021).
- Sycinska-Dziarnowska, Magdalena, Liliana Szyszka-Sommerfeld, Krzysztof Woźniak, Steven J. Lindauer, and Gianrico Spagnuolo. 2022. Predicting Interest in Orthodontic Aligners: A Google Trends Data Analysis. International Journal of Environmental Research and Public Health 19: 3105.
- Teixeira, Lamartine Almeida, and Adriano Lorena Inácio de Oliveira. 2010. A method for automatic stock trading combining technical analysis and nearest neighbor classification. Expert Systems with Applications 37: 6885–90.
- Today, Post. 2018. หุ้นไทย27ธ.ค.61ปิดลบ8.56จุด [Thai stocks close down 8.56 points on 27 December 2018]. Available online: https://www.posttoday.com/finance-stock/stock/575197 (accessed on 28 April 2021).
- Trifonova, Oxana, Petr Lokhov, and Alexander Archakov. 2014. Metabolic profiling of human blood. Biomed Khim 60: 281–94.
- Tudor, Cristiana. 2022. The Impact of the COVID-19 Pandemic on the Global Web and Video Conferencing SaaS Market. Electronics 11: 2633.
- Vaidya, Rashesh. 2018. Stochastic and Momentum Analysis of Nepalese Stock Market. The Journal of Nepalese Business Studies XI: 14–22.
- Wang, Kui, Jie Wan, Gang Li, and Hao Sun. 2022. A Hybrid Algorithm-Level Ensemble Model for Imbalanced Credit Default Prediction in the Energy Industry. Energies 15: 5206.
- Yu, Yang, Wenjing Duan, and Qing Cao. 2013. The impact of social and conventional media on firm equity value: A sentiment analysis approach. Decision Support Systems 55: 919–26.
Date | High | Low | Open | Close | Volume | Adj Close | Symbol |
---|---|---|---|---|---|---|---|
4 January 2017 | 3 | 2.96 | 2.96 | 2.98 | 25,413,400 | 2.378223 | WHA |
5 January 2017 | 3.04 | 2.98 | 3 | 3.02 | 82,795,000 | 2.410144 | WHA |
6 January 2017 | 3.04 | 3 | 3.02 | 3.02 | 48,678,800 | 2.410144 | WHA |
9 January 2017 | 3.1 | 3.02 | 3.02 | 3.08 | 1.52 × 10^8 | 2.458028 | WHA |
10 January 2017 | 3.12 | 3.04 | 3.1 | 3.04 | 90,063,300 | 2.426106 | WHA |
11 January 2017 | 3.1 | 3.06 | 3.08 | 3.06 | 74,300,900 | 2.442067 | WHA |
12 January 2017 | 3.2 | 3.1 | 3.1 | 3.16 | 3.55 × 10^8 | 2.521873 | WHA |
16 January 2017 | 3.26 | 3.18 | 3.26 | 3.24 | 90,585,100 | 2.585719 | WHA |
17 January 2017 | 3.26 | 3.18 | 3.24 | 3.2 | 83,648,500 | 2.553796 | WHA |
18 January 2017 | 3.24 | 3.18 | 3.22 | 3.2 | 53,264,600 | 2.553796 | WHA |
Type | Keyword | Definition |
---|---|---|
Basic Investment Term | P/E | Price-to-Earnings (P/E) Ratio: The ratio for valuing a company that measures its current share price relative to its earnings per share (EPS) |
Basic Investment Term | P/BV | Price to Book Value Ratio (P/BV): The market’s valuation of a company relative to its book value |
Basic Investment Term | EPS | Earnings Per Share (EPS): Calculated as a company’s profit divided by the outstanding shares of its common stock |
Industry Group | Agribusiness | Agribusiness |
Industry Group | Food and Beverage | Food and beverage |
Industry Group | Insurance | Insurance |
Stock Name | ADVANC | Advanced Info Service PCL (ADVANC.BKK) |
Stock Name | BBL | Bangkok Bank PCL (BBL.BKK) |
Stock Name | CPN | Central Pattana PCL (CPN.BKK) |
Trading Method | Technical | A trading strategy that primarily relies on technical indicators |
Trading Method | Day Trade | A trading strategy that is often informed by technical analysis of price movements and requires a high degree of self-discipline and objectivity |
Trading Method | Swing Trade | A trading strategy that focuses on taking smaller gains in short term trends and cutting losses quicker |
Global Search | Economics | Economics |
Global Search | Politics | Politics |
Global Search | Conflict | Conflict |
Popular Word | กอง (Kong) | Mutual fund |
Popular Word | ปอบ (Pop) | Broker |
Popular Word | หรั่ง (Rang) | Foreign investor |
Idiom | ลำไย (Lamyai) | Profit |
Idiom | ซื้อควาย (Sue Khwai) | Buy stock(s) right before the stock’s price goes down |
Idiom | ขายหมู (Khai Mu) | Sell stock(s) right before the stock’s price moves up |
Yearly Search Term | คนละครึ่ง (Khon La Khrueng) | Thailand’s government COVID-19 financial relief campaign |
Yearly Search Term | โควิด-19 (COVID-19) | An infectious disease caused by the SARS-CoV-2 virus |
Yearly Search Term | ชิมช้อปใช้ (Chim Chop Chai) | Thailand’s government COVID-19 financial relief campaign |
Date | Debt | Color | Stocks | Restaurant | Portfolio | Inflation | Housing | Dow Jones |
---|---|---|---|---|---|---|---|---|
1 January 2017 | 3 | 21 | 17 | 60 | 30 | 17 | 11 | 3 |
8 January 2017 | 22 | 31 | 32 | 51 | 29 | 24 | 7 | 11 |
15 January 2017 | 9 | 39 | 25 | 43 | 36 | 15 | 9 | 6 |
22 January 2017 | 37 | 41 | 42 | 43 | 38 | 24 | 32 | 9 |
29 January 2017 | 12 | 29 | 20 | 50 | 41 | 38 | 17 | 8 |
5 February 2017 | 34 | 38 | 21 | 51 | 25 | 55 | 30 | 8 |
Technical Indicators | Formula |
---|---|
Simple Moving Average (SMA) | |
Weighted Moving Average (WMA) | |
Exponential Moving Average (EMA) | |
Moving Average Convergence Divergence (MACD) | |
Relative Strength Index (RSI) | |
Stochastic Oscillator K (K) | |
Stochastic Oscillator D (D) |
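As a rough sketch, the indicators listed in the table can be computed with their commonly used textbook definitions. The window lengths (12, 26, 14, 3), the EMA seeding with the first price, and the use of simple averages in the RSI are assumptions chosen for illustration, not parameters confirmed by the study. The sample prices are the WHA rows from the stock data table.

```python
def sma(prices, n):
    """Simple moving average of the last n prices."""
    window = prices[-n:]
    return sum(window) / len(window)

def wma(prices, n):
    """Weighted moving average: the newest price gets the largest weight."""
    window = prices[-n:]
    weights = range(1, len(window) + 1)
    return sum(w * p for w, p in zip(weights, window)) / sum(weights)

def ema(prices, n):
    """Exponential moving average with smoothing 2 / (n + 1), seeded with the first price."""
    alpha = 2 / (n + 1)
    value = prices[0]
    for p in prices[1:]:
        value = alpha * p + (1 - alpha) * value
    return value

def macd(prices, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA."""
    return ema(prices, fast) - ema(prices, slow)

def rsi(prices, n=14):
    """Relative Strength Index from total gains and losses over the last n changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])][-n:]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        return 100.0
    return 100 - 100 / (1 + gains / losses)

def stochastic_k(closes, highs, lows, n=14):
    """%K: position of the latest close within the n-period high-low range."""
    highest, lowest = max(highs[-n:]), min(lows[-n:])
    if highest == lowest:
        return 50.0  # flat range: return the midpoint by convention (assumption)
    return 100 * (closes[-1] - lowest) / (highest - lowest)

def stochastic_d(k_values, n=3):
    """%D: simple moving average of recent %K values."""
    return sma(k_values, n)

# WHA prices from the sample stock data table (4 to 18 January 2017).
closes = [2.98, 3.02, 3.02, 3.08, 3.04, 3.06, 3.16, 3.24, 3.20, 3.20]
highs = [3.00, 3.04, 3.04, 3.10, 3.12, 3.10, 3.20, 3.26, 3.26, 3.24]
lows = [2.96, 2.98, 3.00, 3.02, 3.04, 3.06, 3.10, 3.18, 3.18, 3.18]

print(sma(closes, 5), wma(closes, 5), ema(closes, 5))
print(macd(closes), rsi(closes), stochastic_k(closes, highs, lows))
```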
Symbol | Positive Keyword | Correlation | Negative Keyword | Correlation |
---|---|---|---|---|
WHA | wha | 0.2322 | restaurant | −0.2028 |
WHA | sta | 0.2262 | major | −0.1999 |
WHA | tcap | 0.2161 | holiday | −0.1818 |
WHA | sawad | 0.2066 | fun | −0.1771 |
WHA | settrade | 0.1885 | แขก (kaek) | −0.1693 |
WHA | dow jones | 0.1878 | hybride | −0.1671 |
WHA | banpu | 0.1851 | food& beverage | −0.1642 |
WHA | tisco | 0.1840 | bec | −0.1567 |
WHA | toa | 0.1832 | ประชัย (bpra chai) | −0.1511 |
WHA | aot | 0.1824 | forex | −0.1430 |
WHA | bcp | 0.1822 | short selling | −0.1411 |
WHA | hmpro | 0.1766 | water | −0.1362 |
WHA | ปอด (bpot) | 0.1753 | mbk | −0.1340 |
WHA | lh | 0.1724 | thani | −0.1339 |
WHA | amata | 0.1712 | เบาะหนัง (bor nang) | −0.1331 |
WHA | bdms | 0.1689 | travel | −0.1249 |
WHA | cpf | 0.1679 | top | −0.1224 |
WHA | bbl | 0.1672 | ตกรถ (dtok rot) | −0.1218 |
WHA | บริการ (bor-ri-gaan) | 0.1645 | markets | −0.1214 |
WHA | cpn | 0.1615 | bts | −0.1198 |
 | Actually Positive (1) | Actually Negative (0) |
---|---|---|
Predicted Positive (1) | True Positive (TP) | False Positive (FP) |
Predicted Negative (0) | False Negative (FN) | True Negative (TN) |
Model | Dataset Type | Test Accuracy | Test Precision | Test Recall | Test F1-Score | Test AUC | Unknown Accuracy | Unknown Precision | Unknown Recall | Unknown F1-Score | Unknown AUC |
---|---|---|---|---|---|---|---|---|---|---|---|
Logistic Regression | Indicators | 0.9699 | 0.9589 | 0.9699 | 0.9587 | 0.8457 | 0.9818 | 0.9739 | 0.9818 | 0.9745 | 0.8251 |
Logistic Regression | Keywords | 0.9727 | 0.9664 | 0.9727 | 0.9653 | 0.8999 | 0.9840 | 0.9843 | 0.9840 | 0.9779 | 0.8880 |
Logistic Regression | Keywords and Indicators | 0.9813 | 0.9794 | 0.9813 | 0.9790 | 0.9767 | 0.9834 | 0.9827 | 0.9834 | 0.9830 | 0.9733 |
Random Forest | Indicators | 0.9694 | 0.9555 | 0.9694 | 0.9553 | 0.9348 | 0.9818 | 0.9638 | 0.9818 | 0.9727 | 0.9324 |
Random Forest | Keywords | 0.9727 | 0.9734 | 0.9727 | 0.9619 | 0.9120 | 0.9814 | 0.9717 | 0.9814 | 0.9737 | 0.9045 |
Random Forest | Keywords and Indicators | 0.9734 | 0.9708 | 0.9734 | 0.9642 | 0.9645 | 0.9831 | 0.9833 | 0.9831 | 0.9758 | 0.9623 |
XGBoost | Indicators | 0.9732 | 0.9674 | 0.9732 | 0.9681 | 0.9577 | 0.9818 | 0.9767 | 0.9818 | 0.9782 | 0.9499 |
XGBoost | Keywords | 0.9743 | 0.9694 | 0.9743 | 0.9682 | 0.9339 | 0.9850 | 0.9840 | 0.9850 | 0.9802 | 0.9456 |
XGBoost | Keywords and Indicators | 0.9823 | 0.9810 | 0.9823 | 0.9796 | 0.9787 | 0.9879 | 0.9871 | 0.9879 | 0.9855 | 0.9810 |
Crisis | Logistic Regression (% Success) | Random Forest (% Success) | XGBoost (% Success) | Winning Model |
---|---|---|---|---|
QE | 55.00% | 85.00% | 83.75% | Random Forest |
Controversy | 51.11% | 78.89% | 65.56% | Random Forest |
Foreign Investors | 45.00% | 71.67% | 63.33% | Random Forest |
MSCI | 57.78% | 86.67% | 82.22% | Random Forest |
COVID-19 (1) | 51.54% | 73.85% | 79.23% | XGBoost |
Protestation | 65.56% | 94.44% | 90.00% | Random Forest |
COVID-19 (2) | 42.86% | 90.00% | 91.43% | XGBoost |
COVID-19 (3) | 40.00% | 82.22% | 84.44% | XGBoost |
COVID-19 (4) | 56.25% | 92.50% | 93.75% | XGBoost |
COVID-19 (5) | 51.11% | 90.00% | 90.00% | Random Forest |
AVERAGE | 51.62% | 84.52% | 82.37% | Random Forest |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Saetia, K.; Yokrattanasak, J. Stock Movement Prediction Using Machine Learning Based on Technical Indicators and Google Trend Searches in Thailand. Int. J. Financial Stud. 2023, 11, 5. https://doi.org/10.3390/ijfs11010005