1. Introduction
Analysts are routinely required to estimate the impact of economic events on the value of businesses. At first this looks like a difficult task; however, an event study provides a straightforward method of measurement. Using financial market data, an event study measures the impact of a specific event on a company’s value. The usefulness of such a study rests on the idea that, assuming market rationality, the effects of an event are reflected immediately in the stock price [1]. As a result, stock prices recorded over a relatively short period can be used to construct a measure of the event’s economic impact. The event study has several uses. In accounting and finance research, event studies have been applied to a variety of company-specific and economy-wide news events. Examples include mergers and acquisitions, earnings releases, fraud announcements, expert ratings, and announcements of macroeconomic variables such as the trade deficit [
2]. Moreover, because it relies on interpreting numerous structured and unstructured data sources and demands quick and thorough decision making, the finance industry has become a significant test platform for NLP and Information Retrieval (IR) approaches to the automatic analysis of online financial news and opinions [
3].
The first step in conducting an event study is to define the event of interest and identify the period over which the stock prices of the firms involved will be examined (the event window). For instance, when studying quarterly or yearly results with daily returns, the event is the earnings announcement, and the event window will be one day after the news announcement. It is common practice to make the event window larger than the period of interest, which allows the periods surrounding the event to be examined. The period of interest typically covers at least the day the story is published and the following day, which captures the price impact of news released after the stock market closes on the publication day. The periods before and after the event may also be of interest. The event can belong to any sector, such as banking, pharmaceuticals, or technology, in any country.
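The notion of an event window can be made concrete with a small sketch. The helper name `event_window`, the return values, and the position of the event day below are illustrative assumptions, not taken from the study; offsets are counted in trading days relative to the event day (offset 0).

```python
from typing import List, Sequence


def event_window(returns: Sequence[float], event_idx: int,
                 start: int, end: int) -> List[float]:
    """Return the daily returns for offsets start..end (inclusive),
    measured in trading days relative to the event day at event_idx."""
    if start > end:
        raise ValueError("start offset must not exceed end offset")
    lo, hi = event_idx + start, event_idx + end
    if lo < 0 or hi >= len(returns):
        raise ValueError("window extends beyond the available return series")
    return list(returns[lo:hi + 1])


# Hypothetical daily returns; the event day is assumed to sit at index 3.
daily_returns = [0.01, -0.02, 0.00, 0.03, -0.01, 0.02, 0.01]

# Event day plus the next trading day, i.e. the window (0, +1).
two_day = event_window(daily_returns, event_idx=3, start=0, end=1)

# A symmetric window (-2, +2) that also covers the days before the event.
symmetric = event_window(daily_returns, event_idx=3, start=-2, end=2)
```

Widening the window, as in the second call, is what allows the pre-event and post-event periods to be inspected alongside the publication day itself.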
The banking sector is critical to a country’s growth and development, and banks are regarded as the backbone of any business. Driven by forward-looking efforts to strengthen the banking sector and its operations within the financial system, India’s banking sector has seen substantial development. The Indian banking system comprises private and public sector banks. The Indian government nationalized 14 private banks in 1969 and 6 more in 1980 to advance the country’s economic progress. On 1 April 2017, SBI merged with its associate banks and Bharatiya Mahila Bank to form India’s leading bank. Vijaya Bank and Dena Bank were merged into Bank of Baroda in 2019. Smt. Nirmala Sitharaman, India’s finance minister, announced on 30 August 2019 that 10 public sector banks would be merged into 4 large banks, effective 1 April 2020, lowering the number of public sector banks from 27 to 12. Furthermore, in the banking sector, mergers and acquisitions are more important in developing markets than in the US stock market [
4]. Moreover, consolidation of the banking domain is important for a variety of reasons, including the need for more capital, risk assessment, funding development projects, technological advancements, and improved customer service [
5,
6]. However, the stock market, although a well-organized market, is affected by the spread of unexpected news in any economy. Unprecedented news about the COVID-19 pandemic recently affected both developing and developed markets, with the Asian market being the most affected [7]. Similarly, news that two public-sector banks are to be privatized would have an immediate impact on the capital markets.
Despite being well regulated, the banking system faces a number of problems, especially financial difficulties and a lack of standard regulations. According to the RBI annual report (2019) and an article in the Economic Times (2019), fraud has increased dramatically in both number and value over the past 10 years. Bank frauds increased from 4669 in 2009–2010 to 6801 in 2018–2019, with a total value of 71,542.93 crores. The number of fraud cases thus rose by 45.66% between 2009–2010 and 2018–2019, while the amount involved grew more than 35-fold. Fraud at a bank or any other business entity is a completely unforeseen occurrence with far-reaching economic and social ramifications. The negative impact of fraud on stock prices was highlighted in [
8].
Government policies have far-reaching implications on a country’s economy [
9], particularly the banking industry. Uncertainty regarding government policy and election outcomes has serious financial implications. Existing research on policy uncertainty’s impact on economic outcomes suggests that increasing economic policy uncertainty causes enterprises to postpone investments [
10], firms to be less involved in new mergers and acquisitions [
11], and the amount of foreign direct investment to fall [
12]. The question of whether and how government policy uncertainty affects the banking industry is notably absent from this literature. One of the objectives of this research is to fill this gap by analyzing the performance of banking stocks before and after government policy announcements.
Several factors impact the stock market, including national and international news. Some company-specific news or releases influence a single industry, whilst others, such as inflation, GDP growth, and the repo rate, affect the market as a whole [
13]. In a similar vein, this paper analyzes the influence of certain events on the banking industry. To obtain an analytical approximation of the banking events that make up daily financial news, we collected 10,000 financial news articles. What categories are needed to separate the banking news articles from the rest of the collection, which more or less represents the national and global news landscape? We separated the articles into four groups: banking, government (national), global, and non-banking (others). For this study, we analyzed only news about banking and related topics. Additionally, we searched for the most representative examples of news events, whether frequent or infrequent, that had an impact on the banking and financial markets. In the context of a developing market such as India, our research contributes to the existing literature by examining the short-term reaction of Indian stock prices to the banking sector’s most representative events: mergers and acquisitions, frauds, expert ratings, earnings results, government policies, and RBI policies. Further, the banking sector in India is divided into public and private banking. We were specifically interested in the impact of these news announcements on the stock prices of Indian public and private banks separately. An existing study compares the returns on PSB (public sector bank) stocks with the returns on the Sensex (an Indian stock market index) to assess the performance of public sector banks following disinvestment [
14]. That study also calculated the returns of private sector banks relative to the Sensex (Refer to Appendix A Table A1) and compared the performance of public and private sector banks. It found that PSB stock performance was not significantly different from that of private sector banks or the Sensex. Another study examined the effect of interest rates and foreign exchange rates on the movement of banking stocks in India. Its findings show that all banks’ returns are heavily influenced by the performance of the Bank Nifty Index (Refer to
Appendix A Table A1). There was stronger evidence of return spillover from private sector bank equities than from public sector bank equities. In the case of volatility spillover, however, there is evidence of bilateral spillover between private and public bank stocks [15]. The same authors also examined how policy interest rate announcements by the US Federal Reserve and the European Central Bank affect stock returns and volatility for commercial banks listed on the NYSE and on the DAX in Germany [16]. They found that the most significant effect of Federal Reserve news on both US and German bank shares was that an unexpected policy rate hike reduced returns and increased volatility in most cases.
Important financial news is increasingly available in electronic form on the WWW, and it has evolved to be a very useful data source for event studies [
17] incorporating stock market evaluations [
18]. By monitoring a variety of online news sources and building a news classification system, investors in the banking sector of the Indian stock market can be notified of potential banking events. To our knowledge, there is presently no news classification system created exclusively for the banking sector. We therefore classify online financial news as banking, other related news articles of interest, and non-banking. Further, the news articles on the ‘events’ of interest (mergers and acquisitions, frauds, expert ratings, earnings results, government policies, and RBI policies) are extracted from banking news and other news articles using our classification system. Finally, the tone in which banking-sector news events are reported is correlated with observable stock price volatility. Therefore, we intend to classify the above-mentioned six banking news events into positive and negative sentiments and assess their influence on the stock prices of the private and public banks involved.
Sentiment analysis is a computational method for handling the subjectivity, sentiments, and views expressed in a document [19]. It is important for monitoring and spotting key events and suspicious behaviors [20]. It is also regarded as a subfield of text mining, information retrieval, and natural language processing [
21,
22]. Sentiment analysis in the banking financial domain is the process of interpreting readers’ sentiments (negative or positive) about banking news events using computational intelligence, such as Machine Learning or rule-based approaches. Text sentiment classification is a basic subfield of NLP, but the classification process is highly domain-specific. The most popular way to classify text in a given domain is to train on domain-specific data samples. Machine learning-based text sentiment analysis, however, requires a sufficient amount of labeled training data. A transfer learning approach is widely used to overcome this problem [
23]. With the rapid growth of deep learning, various transfer learning approaches are now in use and are producing notable results. Transfer learning fine-tunes pre-trained deep learning models [24,25] using even a small amount of domain-specific data (from the banking financial domain in our current study). To collect banking financial domain data for the subsequent sentiment analysis and event study of Indian private and public banks, a text classification framework was designed.
To extract banking-sector news, and additional financial news relevant to it, from the articles collected from various online news sources, a text classification approach was used. In natural language processing, text classification is a well-known task in which labels are assigned to texts such as phrases or documents. It may be used for a number of applications, such as question answering, spam filtering, topic modelling, and news classification [26]. Moreover, text is a valuable source of information, although extracting insights from it can be challenging and time-consuming because it is often unstructured [27]. Text can be labeled for classification either manually or automatically. Automatic text classification is becoming more important as the amount of text data in industry grows. However, manual labelling remains crucial in edge cases or sectors where public or synthetic datasets are insufficient or unavailable. Banking financial NLP is such an unaddressed edge case, for which suitable synthetic data could not have been generated. Manual labelling was therefore the preferred method in our study, and it was both practical and efficient to use a team of financial experts to label the data. Most text classification and document categorization systems can be decomposed into four steps: feature extraction, dimensionality reduction, classifier selection, and evaluation [26]. The study shows how an effective financial news classification framework can help to ascertain the impact of the tone of such information on the stock prices of related public and private banks, as well as which financial news must be retrieved and which type of news is most relevant to banking stakeholders.
The key contributions of the study are mentioned below:
Extracting news events that are relevant to the banking sector from the overall financial news articles;
Performing sentiment classification on banking financial news into positive and negative classes using the state-of-the-art NLP approach;
Performing an event study on banking stocks listed on the Indian stock exchanges, i.e., BSE and NSE (Refer to
Appendix A Table A1).
The remainder of this paper is organized as follows. The literature review is discussed in
Section 2, the methodology is discussed in
Section 3, the data and experiments are presented in
Section 4, and the conclusions are given in
Section 5.
4. Experiments and Analysis
We used Python code written in Google Colab to obtain news for our experiments from public news sources such as Times of India, Money Control, Bloomberg, and Financial Express. The Python script collected news articles several times each day. As a result, we accumulated roughly 10,000 financial news articles published between 2017 and 2020. We cleaned and prepared the news articles using the Tableau Prep tool. To extract banking and other relevant news, we classified the news stories as banking, government, global, or non-banking. These four categories were assigned to the news items manually. Manual labeling of text articles by human experts is time-consuming and labor-intensive, but it yields greater accuracy because expert knowledge is used to label the texts correctly. We labeled a selection of representative articles for each class as we went along. The labelers were experts in the financial industry and financial markets. For the four-class classification, a team of three experts performed feature selection to identify the key or representative terms for each class. Each text document was then examined and assigned to the appropriate class based on these representative terms. The classification experiments were run on Python 3.8 with several Python libraries (scikit-learn and imblearn) that provide Machine Learning and deep learning classifiers. The data were imbalanced across classes, as shown in
Figure 3 (Refer to
Appendix A Table A2). As a result, several sampling procedures were employed to balance the data across classes.
We further classified the news articles separated from the overall financial news in the previous phase into seven events (Refer to
Appendix A Table A3): RBI Policies, Merger or Acquisition, Results, Rating Agencies or Expert’s View, Governmental, Global, and Fraud [
87]. The news events were relevant to the private and public Indian banking sectors. We also used transfer learning to divide the news events into negative and positive polarity for sentiment categorization as shown in
Table 2. The pre-trained DistilBERT model was fine-tuned on our hand-labeled dataset, which was then fed into the supervised Machine Learning classifier Random Forest. Even for professionals in the field, classifying financial reports or documents is challenging, and tagging news items with the appropriate sentiment requires a competent annotator. For classification, we conducted experiments in Python using the TensorFlow library, created and released by Google.
In this study, we investigated the relationship between the cumulative abnormal returns (CARs) of Indian banks by sector (private and public) and news sentiments (negative and positive) following banking news events. The dependent variable was the CAR obtained from the event study data. We created event windows of different lengths, ranging from 120 days (−60 to +60) down to zero days (i.e., a single-day computation). The Standardised Cross-Sectional t-test technique was used to assess the statistical significance of the CAR estimates. As mentioned earlier, the banks fall under the public and private sectors. We evaluated how public and private banks listed on the NSE and BSE reacted to banking news events over horizons from the short term (1–5 days before and after publication of the news event) to the long term (60 days before and after publication).
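To make the CAR computation and its significance test concrete, the following is a minimal sketch under stated assumptions. It assumes a market-model benchmark fitted by ordinary least squares on an estimation window, which this section does not confirm as the benchmark actually used, and the test statistic is a simplified standardised cross-sectional statistic that omits the forecast-error adjustment used in fuller versions of the test. All function names and return series are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def market_model(stock: Sequence[float],
                 market: Sequence[float]) -> Tuple[float, float, float]:
    """OLS fit of stock = alpha + beta * market on the estimation window.
    Returns (alpha, beta, residual standard deviation)."""
    m_bar, s_bar = mean(market), mean(stock)
    sxx = sum((m - m_bar) ** 2 for m in market)
    sxy = sum((m - m_bar) * (s - s_bar) for m, s in zip(market, stock))
    beta = sxy / sxx
    alpha = s_bar - beta * m_bar
    resid = [s - (alpha + beta * m) for m, s in zip(market, stock)]
    sigma = math.sqrt(sum(e * e for e in resid) / (len(resid) - 2))
    return alpha, beta, sigma


def car(stock_ev: Sequence[float], market_ev: Sequence[float],
        alpha: float, beta: float) -> float:
    """Cumulative abnormal return: sum of (actual - expected) returns
    over the event window."""
    return sum(s - (alpha + beta * m) for s, m in zip(stock_ev, market_ev))


def standardised_cross_sectional_t(cars: Sequence[float],
                                   sigmas: Sequence[float],
                                   window_len: int) -> float:
    """Simplified statistic: scale each CAR by its estimation-window
    residual sigma, then run a cross-sectional t-test on the scaled CARs."""
    scars = [c / (s * math.sqrt(window_len)) for c, s in zip(cars, sigmas)]
    return mean(scars) / (stdev(scars) / math.sqrt(len(scars)))


# Hypothetical estimation-window and event-window returns for two banks.
est_market = [0.010, -0.005, 0.007, -0.012, 0.004, 0.009]
est_banks = [[0.012, -0.004, 0.006, -0.015, 0.007, 0.008],
             [0.008, -0.009, 0.010, -0.010, 0.002, 0.013]]
ev_market = [0.002, -0.001, 0.003]
ev_banks = [[-0.010, -0.006, 0.001],
            [-0.004, -0.012, -0.002]]

cars, sigmas = [], []
for est, ev in zip(est_banks, ev_banks):
    alpha, beta, sigma = market_model(est, est_market)
    cars.append(car(ev, ev_market, alpha, beta))
    sigmas.append(sigma)

mean_car = mean(cars)
t_stat = standardised_cross_sectional_t(cars, sigmas, len(ev_market))
```

Scaling each CAR by the bank's own residual volatility keeps highly volatile stocks from dominating the cross-sectional average, which is the motivation for standardising before testing.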
We ran numerous experiments on our pre-processed data collected from web news portals, applying the standard Machine Learning techniques described in the previous section. The main goal of these experiments was to find the best classifier for each setting. The output of each classifier was evaluated using Precision, Recall, and F1-score. Accuracy was computed for all classifiers using a train/test split of 75%/25% and five-fold cross-validation.
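The evaluation metrics and the cross-validation scheme can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the code used in the experiments: the function names and labels are hypothetical, and the fold construction uses contiguous, unshuffled folds for simplicity, whereas library implementations usually offer shuffling and stratification.

```python
from typing import Dict, List, Sequence, Tuple


def per_class_scores(y_true: Sequence[str],
                     y_pred: Sequence[str]) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f1)} for every label observed."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = (precision, recall, f1)
    return scores


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds; each fold serves as
    the test set exactly once while the rest form the training set."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


# Hypothetical labels for four news articles.
y_true = ["banking", "banking", "global", "non-banking"]
y_pred = ["banking", "global", "global", "non-banking"]
scores = per_class_scores(y_true, y_pred)

# Five folds over ten hypothetical articles.
folds = k_fold_indices(10, k=5)
```

Reporting per-class scores alongside accuracy matters here because accuracy alone can look high on imbalanced data even when a minority class is poorly recognised.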
SMOTE helps to balance class representation by generating synthetic minority-class examples, interpolating between existing minority samples and their nearest minority-class neighbors rather than merely duplicating cases at random. According to previous research, the Random Forest classifier trained on data balanced with SMOTE achieves the highest accuracy compared with alternative down- and up-sampling strategies [90]. In the first setting, the data were vectorized with the TF-IDF feature representation approach and balanced with SMOTE before being fed into the Machine Learning classifiers. In the second setting, the data were vectorized using the DistilBERT feature representation approach and likewise balanced with SMOTE, and the resulting feature set was fed into the same classifiers. The selected classifiers’ results are provided in the tables below.
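The core idea of SMOTE, creating a synthetic minority example on the line segment between a minority sample and one of its nearest minority neighbours, can be illustrated as follows. This is a simplified sketch for intuition only, not the imblearn implementation used in the experiments; the function name, the toy feature vectors, and the parameter defaults are assumptions.

```python
import math
import random
from typing import List, Sequence


def smote_sketch(minority: Sequence[Sequence[float]], n_new: int,
                 k: int = 2, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic points from the minority class.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-dimensional feature vectors for a minority class.
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_sketch(minority_class, n_new=4, k=2, seed=0)
```

Because each synthetic point lies between two real minority samples, the minority region is filled in rather than merely re-weighted, which is the difference from random over-sampling by duplication.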
Table 3 and
Table 4 show the results of each classifier when the data were vectorized with the TF-IDF feature extraction approach and balanced across classes using the SMOTE over-sampling method.
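For readers unfamiliar with TF-IDF, the weighting can be sketched with the textbook formula, term frequency multiplied by the logarithm of inverse document frequency. This is an assumption-laden illustration: library vectorizers such as the one in scikit-learn typically apply smoothing and vector normalisation by default, so their values differ from this plain variant. The function name and the toy tokenised headlines are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenised document, using
    tf = count / document length and idf = ln(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                        for term, count in counts.items()})
    return vectors


# Hypothetical tokenised headlines.
headlines = [["rbi", "cuts", "repo", "rate"],
             ["bank", "fraud", "case"],
             ["rbi", "bank", "merger"]]
weights = tf_idf(headlines)
```

Terms that occur in few documents, such as "fraud" in this toy corpus, receive larger weights than terms shared across many documents, which is what makes the representation useful for separating news classes.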
Among the classifiers evaluated on datasets balanced with SMOTE up-sampling (Multilayer Perceptron, Logistic Regression, Random Forest, Decision Tree, and Linear SVC), the Random Forest performed best in terms of accuracy, with 93% using the train/test method and 94% with cross-validation, as shown in
Table 4. For the classes Banking, Global, Non-Banking, and Governmental, the Random Forest achieved F1 scores of 0.90, 0.94, 0.90, and 1.00, respectively.
Table 3 and
Table 4 compare all of the described classifiers for the four different classes.
Table 5 shows the results of each classifier when data were vectorized using the DistilBERT feature extraction and when data were balanced across classes using the over-sampling approach SMOTE.
Among the classifiers evaluated on datasets balanced with SMOTE up-sampling (Linear SVC, Decision Tree, Random Forest, Logistic Regression, and Multilayer Perceptron), the Random Forest performed best, with 94% accuracy using the train/test method and 94% with cross-validation, as shown in
Table 6. For the classes Banking, Global, Non-Banking, and Governmental, the Random Forest achieved F1 scores of 0.93, 0.94, 0.87, and 1.00, respectively.
Table 5 and
Table 6 compare all of the described classifiers for four different classes.
The Random Forest classifier is effective with features extracted by the pre-trained neural model DistilBERT, and under the train/test split it achieves a classification accuracy one percentage point higher than with the TF-IDF feature extraction and representation technique (94% versus 93%). Although the MLP classifier with DistilBERT feature representation also reached 94% accuracy under the train/test split, the same as the Random Forest classifier, it was 0.02% less accurate under cross-validation. Therefore, the Random Forest with DistilBERT is considered the best classifier overall, with the highest accuracy under both the train/test split and cross-validation.
In addition, we used a hybrid strategy that combines a rule-based approach with a machine-learning algorithm to extract and categorize events in Indian banking news by identifying event scope and event triggers. The banking news was first labeled into seven classes or events: Results, Rating Agencies or Expert’s View, Merger or Acquisition, Governmental, Global, Fraud, and RBI Policies. The accuracy, recall, and F1 score of the hybrid model and of DistilBERT fine-tuned on the banking news events dataset with a Random Forest classifier are shown in
Table 7.
Table 8 shows that, of the two techniques, DistilBERT with a Random Forest classifier and the proposed Hybrid model (i.e., transfer learning through DistilBERT, fine-tuned with our own rules for Random Forest), the Hybrid model performed best, with an accuracy of 100%.
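One common way to combine rules with a learned classifier is to let event-trigger rules decide when exactly one of them fires and to defer to the classifier otherwise. The sketch below illustrates only that general pattern; the trigger words, the function names, and the stand-in fallback classifier are all hypothetical, and the rules actually used in the Hybrid model are not given in this section.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical trigger words per event class (illustrative only).
EVENT_TRIGGERS: Dict[str, List[str]] = {
    "Fraud": ["fraud", "scam"],
    "Merger or Acquisition": ["merger", "acquisition", "amalgamation"],
    "RBI Policies": ["repo rate", "rbi policy"],
}


def rule_label(text: str) -> Optional[str]:
    """Return an event class if exactly one class has a matching trigger,
    otherwise None (no match, or an ambiguous match)."""
    lowered = text.lower()
    matches = [event for event, triggers in EVENT_TRIGGERS.items()
               if any(trigger in lowered for trigger in triggers)]
    return matches[0] if len(matches) == 1 else None


def hybrid_label(text: str, fallback: Callable[[str], str]) -> str:
    """Apply the rules first; defer to the learned classifier when the
    rules are silent or ambiguous."""
    label = rule_label(text)
    return label if label is not None else fallback(text)


def stub_classifier(text: str) -> str:
    """Stand-in for a trained model such as a Random Forest on
    DistilBERT features; it always predicts the same class."""
    return "Results"


label_rule = hybrid_label("Bank reports fraud worth 100 crore", stub_classifier)
label_model = hybrid_label("Quarterly profit rises sharply", stub_classifier)
label_ambiguous = hybrid_label("Fraud uncovered after merger", stub_classifier)
```

Deferring on ambiguous matches keeps the rules high-precision, leaving the harder cases to the statistical model.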
Furthermore, these news events were classified into negative, positive, and neutral sentiments. On banking news-event sentiments,
Table 9 illustrates the accuracy, recall, and F1 score of DistilBERT fine-tuned using different Machine Learning classifiers.
As shown in
Table 10, with an accuracy of 78%, the Random Forest outperformed the other classifiers: Decision Tree, Logistic Regression, and Linear SVC. The influence of the sentiment of these news events on private and public banking stocks listed on the NSE and BSE is examined next.
In
Table 11, we observe highly statistically significant negative mean CARs of −1.05% and −5.05% in the event windows (−5, 5) and (D, 30) for private banks following the publication of a negative banking news event. This indicates that investors responded in line with the tone of the news events. The influence of negative banking news on private banks persisted for one month after the news was published. We therefore infer that banking news events with negative polarity have an adverse impact on private banking stocks or indexes over a short to medium period.
For public banks, however, we observe highly statistically significant negative mean CARs of −2.15%, −3.96%, −8.74%, −11.44%, −0.76%, −4.64%, −9.55%, −14.86%, −24.81%, −17.25%, −8.3%, and −2.87% in the event windows (D, 1), (D, 5), (D, 30), (D, 60), (−1, D), (−5, −1), (−30, −1), (−60, −1), (−60, 60), (−30, 30), (−5, 5), and (−1, 1), respectively, following the publication of a negative banking news event. This indicates that investors responded in line with the polarity of the news articles. The impact of negative banking news events on public banks was observed for two months after the news was released. We therefore expect banking news with negative polarity to have a long-term adverse influence on public banking stocks or indexes. Furthermore, the mean CARs are negative in all event windows preceding the publication of banking news with negative polarity, which suggests that investors may anticipate negative news events affecting public banking stocks before they occur. Banking news events with negative polarity have a greater impact on the returns of public banking stocks or indexes than on those of private banking stocks.
In
Table 12, we observe highly statistically significant negative mean CARs of −2.46% and −6.23% in the event windows (D, 5) and (D, 30) for private banks following the announcement of a banking news event with positive polarity. The impact of positive banking news events on private banks lasted one month after the news was published. It is therefore inferred that positive banking news events have a substantial influence on private banking stocks or indexes over a short to medium period. In the symmetric event window (−30, 30), a statistically significant negative mean CAR of −6.99% is observed, which suggests that investors may anticipate positive news events affecting private banking stocks before they occur.
We also observe statistically significant negative mean CARs of −1.61% and −3.66% in the event windows (D, 1) and (D, 5) for public banks following the publication of a positive banking news event. The impact of positive banking news events on public banks lasted for five days after the news was published. It is therefore inferred that positive banking news events have a substantial impact on public banking stocks or indexes over a short period. It is also clear that public banks’ stocks react more strongly to negative news events than to positive news events, in line with the tone of the news events.
5. Conclusions and Future Works
The goal of this paper was to perform an event study on private and public bank stocks listed on the NSE and BSE. We found that the influence of banking news with negative polarity on private banks lasted one month after the news was published, with statistically significant negative mean CARs of −1.05% and −5.05% in the event windows (−5, 5) and (D, 30) following the announcement of the negative banking news event. By contrast, the impact of negative banking news events on public banks was observed for two months after the news was released, with highly statistically significant negative mean CARs of −2.15%, −3.96%, −8.74%, −11.44%, −0.76%, −4.64%, −9.55%, −14.86%, −24.81%, −17.25%, −8.3%, and −2.87% in the event windows (D, 1), (D, 5), (D, 30), (D, 60), (−1, D), (−5, −1), (−30, −1), (−60, −1), (−60, 60), (−30, 30), (−5, 5), and (−1, 1), respectively, following the announcement of banking news with negative polarity. As a result, banking news with negative polarity is expected to have a longer-lasting negative influence on public banking stocks or indexes than on private bank stocks.
Moreover, the impact of positive banking news events on private banks lasted one month after the news was published, so positive banking news events are inferred to have a substantial impact on private banking stocks or indexes over a short to medium period. The impact of positive banking news events on public banks lasted for five days after the news was published, so positive banking news events are inferred to have a substantial impact on public banking stocks or indexes over a very short period. It is also clear that public bank stocks react more strongly to negative news events than to positive news events, in line with the tone of the news events. We looked at the effects of news events on the banking sector in India; future studies might examine how the tone of major news announcements affects other sectors. Furthermore, although we analyzed data from India, future studies might focus on capturing cross-country implications.
The Random Forest classifier clearly performs better for multiclass classification on financial news datasets than the Linear SVC, Decision Tree, MLP, and Logistic Regression Machine Learning models. Furthermore, for financial news classification, event classification, and sentiment classification, transformer-based pre-trained DistilBERT word embeddings outperform standard TF-IDF features with the Random Forest classifier. The SMOTE sampling technique also handles the imbalanced datasets effectively, producing appropriate samples for each class in multiclass classification and yielding highly accurate classification with the Random Forest classifier. In the future, we intend to explore further applications of sentiment classification in new analytic areas. We would like to obtain sufficient training data for the model using novel transfer learning-based approaches. In addition, for multi-class prediction, we would like to increase the number of classification labels.