Peer-Review Record

Can Ensemble Machine Learning Methods Predict Stock Returns for Indian Banks Using Technical Indicators?

J. Risk Financial Manag. 2022, 15(8), 350; https://doi.org/10.3390/jrfm15080350
by Sabyasachi Mohapatra 1,*, Rohan Mukherjee 1, Arindam Roy 2, Anirban Sengupta 1 and Amit Puniyani 3
Submission received: 2 June 2022 / Revised: 12 July 2022 / Accepted: 19 July 2022 / Published: 7 August 2022
(This article belongs to the Special Issue Predictive Modeling for Economic and Financial Data)

Round 1

Reviewer 1 Report

An artificial intelligence supported method was used in the study. In recent years, the reflections of machine learning, algorithms and artificial intelligence in the field of finance have been frequently mentioned in the literature, but the method used and the findings and solution proposals to be achieved are not compatible with each other. When the situation is evaluated for this study, it is understood that a sufficient level of data entry and learning process cannot be performed in order for machine learning and the algorithms used to give the optimum result. The data of a study conducted only in India will not be sufficient to measure the predictive power in the stock market. The most basic rule of machine learning is to feed the artificial intelligence with the highest and most accurate data possible in order to realize the learning and reach the optimum prediction. The study has limitations in this aspect. In addition, several different algorithms were used in the study. If the study is instead designed around a hyperheuristic algorithm, the algorithm will present the most appropriate algorithm for the optimum result. The sources used in the literature are not up-to-date; however, this field is undergoing very rapid change and transformation. I think it would be appropriate to reconsider the study in line with these evaluations.

Author Response

Can ensemble machine learning methods predict stock returns for Indian banks using Technical Indicators?

 

Authors’ Response to Reviewer1 Comments and Suggestions

 

Comments and Suggestions for Authors: An artificial intelligence supported method was used in the study. In recent years, the reflections of machine learning, algorithms and artificial intelligence in the field of finance have been frequently mentioned in the literature, but the method used and the findings and solution proposals to be achieved are not compatible with each other. When the situation is evaluated for this study, it is understood that a sufficient level of data entry and learning process cannot be performed in order for machine learning and the algorithms used to give the optimum result. The data of a study conducted only in India will not be sufficient to measure the predictive power in the stock market. The most basic rule of machine learning is to feed the artificial intelligence with the highest and most accurate data possible in order to realize the learning and reach the optimum prediction. The study has limitations in this aspect. In addition, several different algorithms were used in the study. If the study is instead designed around a hyperheuristic algorithm, the algorithm will present the most appropriate algorithm for the optimum result. The sources used in the literature are not up-to-date; however, this field is undergoing very rapid change and transformation. I think it would be appropriate to reconsider the study in line with these evaluations.

 

We have broken down the reviewer's comments and addressed them as below:

 

  • Reviewer Comment: An artificial intelligence supported method was used in the study. In recent years, the reflections of machine learning, algorithms and artificial intelligence in the field of finance have been frequently mentioned in the literature, but the method used and the findings and solution proposals to be achieved are not compatible with each other. When the situation is evaluated for this study, it is understood that sufficient level of data entry and learning process cannot be performed in order for machine learning and the algorithms used to give the optimum result. The data of a study conducted only in India will not be sufficient to measure the predictive power in the stock market. The most basic rule of machine learning is to feed the artificial intelligence with the highest and most accurate data possible in order to realize the learning and reach the optimum prediction. The study has limitations in this aspect.

 

Authors’ Response: The reviewer has rightly suggested that the highest and most accurate data are required to make an optimum prediction. In that respect, the present work considers daily stock-price data for publicly traded banks in India from 1 January 2014 to 31 December 2021. The study covers major private and public sector Indian banks, along with the 10 technical variables, for the study period.

The motivation of the paper is to test whether technical indicators are good enough to predict stock returns for Indian banks using ensemble machine learning techniques. In the results we report the prediction errors and plots (showing fitted lines) that exhibit the accuracy of the prediction, and we observe that the results align with the objective defined in the title of the manuscript (please refer to Figure 2 and Figure 3 of the Results section). The authors acknowledge that the study can also be replicated for other sectors in the Indian stock market to test the efficacy of the existing models. Moreover, the reason for selecting Indian banks as the subject of study has also been elucidated in the paper (please refer to Paras 6, 7 and 8 of the Introduction section, on Pages 3-5).

  • Reviewer Comment: In addition, several different algorithms were used in the study. If the design is designed with a hyper heuristic algorithm by emphasizing the hyper heuristic algorithm instead, the algorithm will present the most appropriate algorithm for the optimum result.

Authors’ Response: Recent literature has already explored the ability of ensemble models to predict stock returns (Ampomah and Qin, 2020; Nti and Adekoya, 2020) using macroeconomic and firm-specific factors. However, the present work takes up the challenge of exploring solely the relevance of technical indicators in predicting Indian banking stocks, and in that respect the paper reports an appreciable degree of accuracy in Table 2 and Figures 2 and 3 using ensemble learning techniques.

The idea of developing a hyper-heuristic algorithm is appreciated and can be explored as a scope for future study.

 

  • Reviewer Comment: The sources used in the literature are not up to date; however, this field is undergoing very rapid change and transformation. I think it would be appropriate to reconsider the study in line with this evaluation.

 

Authors’ Response: Recent literature has been added to the revised manuscript and the Bibliography section has been updated accordingly. The additional references are as follows:

 

Challa, M. L., Malepati, V., & Kolusu, S. N. R. (2020). S&P BSE Sensex and S&P BSE IT return forecasting using ARIMA. Financial Innovation, 6(1), 1-19.

Chen, Y., Wu, J., & Wu, Z. (2022). China’s commercial bank stock price prediction using a novel K-means-LSTM hybrid approach. Expert Systems with Applications, 202, 117370.

 

Day, M. Y., Ni, Y., Hsu, C., & Huang, P. (2022). Do Investment Strategies Matter for Trading Global Clean Energy and Global Energy ETFs?. Energies, 15(9), 3328.

 

Ampomah, E. K., Qin, Z., & Nyame, G. (2020). Evaluation of tree-based ensemble machine learning models in predicting stock price direction of movement. Information, 11, 332.

 

Hsu, M. W., Lessmann, S., Sung, M. C., Ma, T., & Johnson, J. E. (2016). Bridging the divide in financial market forecasting: machine learners vs. financial economists. Expert Systems with Applications, 61, 215-234.

 

Nti, I. K., Adekoya, A. F., & Weyori, B. A. (2020). A comprehensive evaluation of ensemble learning for stock-market prediction. Journal of Big Data, 7(20).

 

Kim, S., Ku, S., Chang, W., & Song, J. W. (2020). Predicting the direction of US stock prices using effective transfer entropy and machine learning techniques. IEEE Access, 8, 111660-111682.

 

Meesad, P., & Rasel, R. I. (2013, May). Predicting stock market price using support vector regression. In 2013 International Conference on Informatics, Electronics and Vision (ICIEV) (pp. 1-6). IEEE.

 

Naik, N., & Mohan, B. R. (2020). Intraday stock prediction based on deep neural network. National Academy Science Letters, 43(3), 241-246.

 

Sadorsky, P. (2022). Forecasting solar stock prices using tree-based machine learning classification: How important are silver prices?  The North American Journal of Economics and Finance, 101705.

 

Wu, J. M. T., Li, Z., Srivastava, G., Tsai, M. H., & Lin, J. C. W. (2021). A graph-based convolutional neural network stock price prediction with leading indicators. Software: Practice and Experience, 51(3), 628-644.

Author Response File: Author Response.docx

Reviewer 2 Report

The authors have proposed ensemble machine learning models that could help traders, investors and portfolio managers to make prudent and informed decisions. The authors have suggested areas for further exploration of their research also. The authors may include the analysis of any typical sector using their proposed model.

Author Response

Can ensemble machine learning methods predict stock returns for Indian banks using Technical Indicators?

 

Authors’ Response to Reviewer2 Comments and Suggestions

 

Comments and Suggestions for Authors: The authors have proposed ensemble machine learning models that could help traders, investors and portfolio managers to make prudent and informed decisions. The authors have also suggested areas for further exploration of their research. The authors may include the analysis of any typical sector using their proposed model.

 

Authors’ Response: As suggested, the current study includes a detailed analysis of Indian banks for the period 2014-2021, and the results are presented and discussed in Section 5 (please refer to the text on Pages 15-19).

Author Response File: Author Response.docx

Reviewer 3 Report

The paper could spend more time providing context and interpretation.

1) Why stocks of Indian banks?  Granted - most readers of this journal are interested in Finance but why might someone be interested in stock returns for Indian banks?  

2) Provide context by describing the Indian banking sector during the sample period.  What happened to their stocks?  Is this a highly regulated sector? 

3) Suppose this paper became very popular and investors used the models in this paper to conduct transactions.  But wouldn't the usefulness of the models then diminish since investors are using tools to conduct transactions that they did not use during the sample period? 

4) The conclusion states that the purpose of this paper is to validate these machine learning methods.  But validate relative to what?  The authors infer from Figure 3 that three of the four models perform well.  But the real question is whether these models outperform more common alternative ways that investors could use to predict market outcomes.  Unfortunately, the paper provides no indication of the extent to which these models outperform non-machine learning techniques. 

5) The paper does a good job of presenting the math behind the models but does little to describe intuitively when these models would work well versus when they would not work well. 

6) Moreover, Random Forest and XGBoost seem to outperform the other two but why?  I can see they give lower errors but is there a more intuitive reason why these models might be better equipped to predict stock movements for Indian banks?    

7) Continuing from (6), the paper presents the results but then ends.  No discussion is included to put results into context or to discuss financial implications.  

Author Response

Please see the attachment - "jrfm-1778384_Reviewer3_Authors Response Sheet".

Author Response File: Author Response.docx

Round 2

Reviewer 3 Report

Thank you for incorporating my suggestions.  Two comments: 

I still would have liked to have seen comparisons with more 'traditional' methodologies.  

Please look over the paper again for minor typos.  

 

Author Response

Can ensemble machine learning methods predict stock returns for Indian banks using Technical Indicators?

 

Authors’ Response to Reviewer3 Comments and Suggestions

  1. Reviewer Comment: Why stocks of Indian banks?  Granted - most readers of this journal are interested in Finance but why might someone be interested in stock returns for Indian banks?

Authors’ Response: To incorporate the above remark, a detailed description has been added to the Introduction section explaining the reasons for selecting Indian banks in the current study (please refer to the newly added text in Paras 7, 8 and 9 of the Introduction section, on Pages 3-5).

As reported by the World Bank, India’s GDP in dollar terms grew to $2.9 trillion in 2019 before falling to $2.7 trillion in 2020 owing to the impact of COVID-19. Despite the contraction, the Indian economy retained its tag of the fastest growing major economy in the world. At present, the Indian Republic is the largest democracy and the sixth largest economy in the world in terms of nominal GDP. At current prices, India’s nominal GDP is estimated at $3.12 trillion in FY22. The Indian economy is driven by a mix of advanced industries and modern agriculture, and recent Government measures (such as reducing the minimum capital barrier in key sectors and simplifying the licensing process) aim to attract further FDI inflows and facilitate economic growth.

During this time, under the supervision of the capital market regulator, the Securities and Exchange Board of India, the financial markets in India became more transparent and robust. This has drawn the attention of global investors to the Indian capital markets, as is evident from the size of the National Stock Exchange (NSE). As of March 2022, the NSE ranked as the 9th largest stock exchange in terms of market cap ($3.45 trillion). The weightage assigned to Indian equities in the widely followed MSCI emerging market index has also increased over time, indicating the interest of global investors in the rising Indian economy. Such tremendous growth of the Indian financial market, particularly in the last decade, has attracted the attention of market participants as well as academicians to predicting Indian stocks by applying machine learning techniques (Dutta et al., 2006; Kumar and Tenmozhi, 2007; Panda et al., 2007; Patel et al., 2015).

To give a further distinction to our current study, we limit our evaluation to Indian banks. There are three major reasons behind this approach. Firstly, banking as a sector plays a critical role in the development of emerging economies like India. The banking sector in India contributes 7-8% of GDP. Banks account for most of the funding of ongoing and future businesses, particularly small and medium enterprises, and they contribute directly to job creation and the overall development of the economy. Secondly, banks and financial services contribute almost 36 percentage points to the S&P Nifty 500 benchmark index. Large banking stocks from the public as well as the private sector largely decide the movement and the overall returns of the benchmark index. Finally, the Indian banking system is set to benefit from the improved credit offtake post COVID-19 and from risk management practices that proved resilient through trying times. Public as well as private sector banks continue to grow their widely spread branch networks, supported by large technology investments that target better customer outreach and higher service-level satisfaction. Both public and private sector banks aim to increase their core profitability on the back of increasing operational efficiencies.

Further, the finance industry has systematically looked for ways to predict future asset returns based on financial time-series data. The focus is mainly on predicting the sign of returns, for shorter or longer horizons, rather than the effective returns. This task has become challenging because markets are noisy and volatile, with fluctuations and shifts in volatility. Recognising these challenges, in the current study we develop ensemble machine learning models that predict stock returns solely from technical indicators built using the Price, Volume and Turnover features for Indian banks.

The study is in line with Chen et al. (2022), who used machine learning techniques to predict the stock prices of Chinese banks.
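As a rough illustration of what "technical indicators built from price data" can look like in practice, the sketch below derives two common indicators from a list of closing prices and pairs them with the next day's return. It is only a sketch under stated assumptions: the manuscript's actual ten indicators are not listed in this record, so the simple moving average, the rate of change, the function names and the sample prices are all hypothetical.

```python
def simple_moving_average(prices, window):
    """Mean of the last `window` closes; None until enough history exists."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def rate_of_change(prices, lag):
    """Fractional price change over `lag` days; None for the first `lag` days."""
    return [
        None if i < lag else (prices[i] - prices[i - lag]) / prices[i - lag]
        for i in range(len(prices))
    ]


def build_rows(prices, window, lag):
    """Pair indicators known at day t with the return realised on day t+1."""
    sma = simple_moving_average(prices, window)
    roc = rate_of_change(prices, lag)
    next_ret = rate_of_change(prices, 1)
    rows = []
    for t in range(len(prices) - 1):
        if sma[t] is None or roc[t] is None:
            continue  # not enough history to compute both indicators
        features = (prices[t] / sma[t], roc[t])
        rows.append((features, next_ret[t + 1]))
    return rows


# Made-up closing prices, for illustration only (not data from the study).
closes = [100.0, 102.0, 101.0, 105.0, 110.0]
rows = build_rows(closes, window=3, lag=2)
```

Each row uses only information available at the close of day t as features, with the day t+1 return as the target, which avoids look-ahead when such rows are fed to any regressor.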

  2. Reviewer Comment: Provide context by describing the Indian banking sector during the sample period.  What happened to their stocks?  Is this a highly regulated sector?

Authors’ Response: Yes, the banking sector in India is highly regulated. It is administered by the Reserve Bank of India. The Indian banking sector has gone through various financial as well as structural changes, which are now described in the Data Description section of the modified manuscript (please refer to Paragraph 2 on Page 8).

During 2014-2021, the Indian banking system was badly hit by rising NPA (non-performing asset) issues. Almost all banks (especially the public sector banks) witnessed an unprecedented level of gross NPAs. The gross NPAs of public sector banks almost doubled, to $700 Million in 2021 from $325 Million. The central bank, i.e. the Reserve Bank of India, passed regulation for the recognition of NPAs and the cleaning of balance sheets. Also during this period, the Indian government not only carried out demonetisation to curb corruption and the related black money in the economy, it also implemented major laws such as the Insolvency and Bankruptcy Code and the Goods and Services Tax, affecting the NPAs as well as the profitability of the banks. The study period also takes into account the financial disruption caused by the COVID-19 pandemic and its impact on the Indian economy, particularly the banking sector.

  3. Reviewer Comment: Suppose this paper became very popular and investors used the models in this paper to conduct transactions.  But wouldn't the usefulness of the models then diminish since investors are using tools to conduct transactions that they did not use during the sample period?

Authors’ Response: The present work considers stock-price data for publicly traded banks in India from 1 January 2014 to 31 December 2021. The dataset is large enough to build a robust model that remains valid for future study. The reviewer has, however, rightly noted that the model's predictive performance may diminish somewhat. In that respect, we feel that the resulting change in errors (which may differ slightly from the current ones) will not greatly affect investors’ decisions, so the model will remain useful.

  4. Reviewer Comment: The conclusion states that the purpose of this paper is to validate these machine learning methods.  But validate relative to what?  The authors infer from Figure 3 that three of the four models perform well.  But the real question is whether these models outperform more common alternative ways that investors could use to predict market outcomes.  Unfortunately, the paper provides no indication of the extent to which these models outperform non-machine learning techniques.

Authors’ Response: Please refer to the newly added 5th paragraph of the Introduction section on Page 3, which addresses the above query.

The present work explores ensemble machine learning models (XGBoost, Gradient Boosting and AdaBoost, in addition to Random Forest) for predicting stock returns of Indian banks using technical indicators. It is established in the literature that machine learning models outperform statistical and econometric models (Hsu et al., 2016; Meesad and Rasel, 2013; Patel et al., 2015). One big advantage of machine learning techniques is that there is no need to justify distributional assumptions; another is their ability to recognize hidden patterns in time-series data. The reduction of variance in machine learning models and the gain in prediction accuracy have also contributed to the popularity of these models in stock prediction. Hsu et al. (2016) established the superiority of machine learning techniques over non-machine learning techniques for intraday and daily predictions across major markets. Within the machine learning domain, the literature further suggests that ensemble models can perform better than a single model in financial prediction systems (Ampomah and Qin, 2020; Nti and Adekoya, 2020). In this context, the current research explores ensemble techniques for predicting stocks in the Indian context, using a particular sector as reference.

 

  5. Reviewer Comment: The paper does a good job of presenting the math behind the models but does little to describe intuitively when these models would work well versus when they would not work well.

Authors’ Response: The methodology section has been updated to address the above query. The newly added text is as follows:

5.1)   Section 4.1 (Random Forest) – Page 9

The Random Forest algorithm can handle large datasets. When the data are very sparse, however, the model may struggle: at some nodes, the bootstrapped sample and the random subset of features may produce an invariant feature space, leaving nothing to split on. Overfitting is also a risk with Random Forest and should be given attention.
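The sparse-data concern can be made concrete with a small simulation. The sketch below is a simplified illustration and not the authors' implementation: it draws a bootstrap sample and a random feature subset, as a Random Forest does when growing a tree, and counts how often every selected feature is constant in the sample. The toy datasets and all function names are hypothetical.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the bagging step)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider at one split."""
    return sorted(rng.sample(range(n_features), k))


def is_invariant(sample, feature_idx):
    """True if every selected feature is constant across the sample,
    so no split on these features can separate the rows."""
    return all(len({row[j] for row in sample}) == 1 for j in feature_idx)


def invariant_rate(rows, k, trials, seed=0):
    """Fraction of (bootstrap, feature subset) draws that leave nothing to split on."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    hits = 0
    for _ in range(trials):
        sample = bootstrap_sample(rows, rng)
        feats = random_feature_subset(n_features, k, rng)
        hits += is_invariant(sample, feats)
    return hits / trials


# Hypothetical toy data: 8 rows, 4 features each.
# Sparse case: almost every entry is zero.
sparse_rows = [[1, 0, 0, 0], [0, 1, 0, 0]] + [[0, 0, 0, 0]] * 6
# Dense case: features vary from row to row.
dense_rows = [[i, i % 2, i % 3, 2 * i] for i in range(8)]

sparse_rate = invariant_rate(sparse_rows, k=2, trials=2000)
dense_rate = invariant_rate(dense_rows, k=2, trials=2000)
```

On the sparse toy data a sizeable share of draws leaves the node with no usable split, while on the dense data this almost never happens, which is the intuition behind the caveat above.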

5.2)   Section 4.2 (AdaBoost) – Page 10

AdaBoost works better for classification problems, and the model is particularly vulnerable to uniform noise. It has difficulty adapting to different link functions to create a linear model with a given outcome.

5.3)   Section 4.3 (Gradient Boosting) – Page 11

Gradient Boosting works well in capturing complex patterns in the data, as it generalizes the boosting framework and allows for easier computation. It is less suitable where interpretability matters, because there is no straightforward way to study how variables interact and contribute to the final prediction. It is also harder to tune than other models because it has many hyperparameters.

5.4)  Section 4.4 (XGBoost) – Page 12

XGBoost performs very well on small and medium-sized data, on data with subgroups, and on structured datasets with not too many features. It gives its best results on structured or tabular datasets and is preferred where computational speed is a major concern. On unstructured data, such as in computer vision and natural language processing, it may not perform according to expectations.

 

  6. Reviewer Comment: Moreover, Random Forest and XGBoost seem to outperform the other two but why?  I can see they give lower errors but is there a more intuitive reason why these models might be better equipped to predict stock movements for Indian banks?

Authors’ Response: The query has been addressed in Para 6 of the Results section on Page 17. The updated text in the revised manuscript is as follows:

Compared with other boosting methods, XGBoost can boost weak learners through better parallel computing and optimized algorithms. XGBoost improves upon the base Gradient Boosting machine framework through systems optimization and algorithmic enhancements. The algorithm proceeds in a cycle: in the initial stage, it tests the existing models on a validation set; it then adds a model to improve the prediction accuracy and tests this new model together with the existing models on the validation set again. This cycle repeats until an optimal ensemble is reached. The performance of the Random Forest model may be due to its ability to model non-linear dynamics in data. Further, the ensemble-based prediction technique incorporated in the Random Forest model contributes to its prediction accuracy. The individual trees in the forest protect each other from their errors, resulting in a reduced cumulative error.
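The add-a-model-then-validate cycle described above can be sketched in a few lines. The code below is a generic gradient-boosting loop with squared error, one-split "stumps" as weak learners and early stopping on a validation set. It is an illustrative assumption and not XGBoost's actual implementation (which adds regularisation, second-order information and system-level optimisations); the learning rate, patience value, toy data and function names are all hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares regressor with a single threshold on 1-D inputs."""
    best = None
    for t in sorted(set(xs))[:-1]:  # dropping the max keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def predict(stumps, lr, base, x):
    """Ensemble prediction: base value plus shrunken stump corrections."""
    return base + lr * sum(s(x) for s in stumps)


def mse(stumps, lr, base, xs, ys):
    return sum((predict(stumps, lr, base, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def boost(train, valid, lr=0.5, max_rounds=50, patience=3):
    """Add one stump per round; stop when validation error stops improving."""
    xs, ys = zip(*train)
    vx, vy = zip(*valid)
    base = sum(ys) / len(ys)
    stumps = []
    history = [mse(stumps, lr, base, vx, vy)]  # history[k] = error with k stumps
    best_rounds = 0
    for _ in range(max_rounds):
        # Fit the next weak learner to what the current ensemble still gets wrong.
        residuals = [y - predict(stumps, lr, base, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
        # Re-test the enlarged ensemble on the validation set.
        history.append(mse(stumps, lr, base, vx, vy))
        if history[-1] < history[best_rounds]:
            best_rounds = len(stumps)
        elif len(stumps) - best_rounds >= patience:
            break
    return stumps[:best_rounds], base, history


# Hypothetical toy data: y = (x / 10)^2, even x for training, odd x for validation.
train = [(x, (x / 10) ** 2) for x in range(0, 20, 2)]
valid = [(x, (x / 10) ** 2) for x in range(1, 20, 2)]
kept, base, history = boost(train, valid)
```

The returned ensemble keeps only the stumps up to the round with the lowest validation error, so later rounds that merely fit noise in the training set are discarded.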

 

  7. Reviewer Comment: Continuing from (6), the paper presents the results but then ends.  No discussion is included to put results into context or to discuss financial implications.

Authors’ Response: The above query has been addressed by including the discussion in the Results and Conclusion sections. The revisions are highlighted in the revised manuscript on Pages 13-17.

Author Response File: Author Response.docx
