1. Introduction
Banks are important financial intermediaries within the financial system, and play a significant role in the economic growth of a country by mobilizing the savings of depositors and making them available to the borrowers for productive ventures (
Jasevičienė et al. 2013). Bank failures can have contagion effects, as witnessed in the global banking crisis of 2007–2008, which affected many sectors of the world's major economies (
Edey 2009;
Laeven and Valencia 2010;
Fethi and Pasiouras 2010;
Jasevičienė et al. 2013). Therefore, prediction of bank performance is important, because bank failures cause vulnerabilities to the financial system (
De Haan and Vlahu 2016). The Federal Deposit Insurance Corporation of the US reported that almost 400 commercial banks failed in the US between 2008 and 2012. Furthermore, 10 large banking groups survived with the help of government bailout packages. The total estimated cost of the financial crisis to the US economy ranged from
$10 trillion to
$20 trillion (
Seamans 2013).
In this context, a large body of literature has emerged on predicting banks' future performance, ranging from traditional econometric models to advanced machine learning methods (see
Horváthová and Mokrišová (
2018) for more details). However, these models were exclusively built on quantitative financial data (
Gaganis et al. 2006). In addition, the focus of the earlier studies was to classify the banks into two groups, ‘failed’ and ‘non-failed’ (
Lin et al. 2009). In response to that, the European Central Bank advised using both the business data as well as qualitative information for the extensive and comprehensive assessment of bank’s performance (
European Central Bank 2010). The US Securities and Exchange Commission (1987) had earlier noted that numerical presentations and brief notes were insufficient for investors to assess the expected profitability of banks. It is already evident from the research on non-financial firms that qualitative information not only helps in understanding financial statements, but also contains incremental information about future financial performance (
Li 2008,
2010b). Therefore,
Craig et al. (
2013) proposed that qualitative data should also be considered for examining the firm’s future performance. This view was also endorsed by
Beattie (
2014) who asserted that narratives could be used to assess the future financial performance of firms. Thus, qualitative data seems a valuable additional information source to be used for supplementing the quantitative financial data provided in corporate financial statements.
Researchers studying non-financial firms have already incorporated qualitative data to empirically test hypotheses of interest, including the relationship between stock market returns and media discussions (
Tetlock 2007), the relationship between firm performance and complexity of corporate textual documents (
Li 2008;
You and Zhang 2009), press releases and future financial performance (
Davis et al. 2012), the language of CEO letter to shareholders and firm performance (
Craig et al. 2013), speculative forecast of investment and behavior of investors (
Mushinada and Veluri 2018), and the relationship between self-attribution bias of managers and firms’ future performance (
Amernic and Craig 2006).
The prevalence of self-attribution bias is likely in annual reports, press releases, conference calls, and the President’s letter to shareholders, because managers try to attribute favorable outcomes to their abilities to get rewards, and unfavorable outcomes to external factors to avoid penalties. Accounting literature has widely analyzed CEO’s self-attribution bias from different perspectives, especially linking it with the firm’s future financial performance.
Keusch et al. (
2012) argued that the self-attribution behavior of managers could signal misleading information about financial performance, especially when firms are performing poorly.
Research on non-financial firms has already started to investigate the possibilities of using the self-attribution bias of managers for testing hypotheses of interest. However, the banking sector has not benefited from this line of research.
De Haan and Vlahu (
2016) maintained that banks are different from non-financial firms in many important ways: (a) banks have a unique role in creating credit for the economy, (b) they are highly regulated because of contagion effects, and (c) they are highly leveraged. Likewise, banks act as delegated monitors of resources, share the risk of the economy, and ensure that firms efficiently use the resources extended to them in the form of loans (
Allen and Carletti 2012). Therefore, the studies conducted on non-financial firms cannot be generalized to the banking sector.
In addition,
Demirgüç-Kunt et al. (
2012) explained that a substantial body of economic theory also emphasizes the comparative importance of banks, and concluded that banks are important in the early stages of a country's economic development. The authors suggested that in the transition stage of economies, banks have a comparative advantage over other financial institutions in financing collateralized and standardized endeavors.
Fink et al. (
2005) also suggest that banks are more important in the majority of the emerging economies. Similarly,
Djalilov and Piesse (
2016) also argue that when economies are in the early transition stage, the share of banks in the whole financial system and in GDP growth is relatively high.
The ‘immeasurably important’ place of banks at the center of economic activity in the emerging economies underscores the merit of studies that exclusively investigate banks' future financial performance. To the best of my knowledge, no study has used the self-attribution bias of management to predict the future performance of banks in the emerging economies. This study attempts to fill this gap.
This study poses the following research question: does the self-attribution of management in annual reports offer incremental predictive power, over and above the models based on the traditional quantitative financial data alone?
The study contributes in two important ways. First, it contributes toward comprehensive performance prediction models for banks, in which contextual information from the self-attribution bias of management is combined with the traditional numerical data to predict the future performance of banks. This could provide potential, as well as existing, investors with comprehensive information about future profitability. Second, the management of banks has the advantage of superior information, and is aware that regulators mostly focus on a few financial ratios, which can be distorted (
Gandhi et al. 2019). This study helps in reducing the information asymmetry between the management and shareholders about expected future performance by getting signals with the help of self-attribution of management in annual reports of the banks (Principal-Agent model).
The remaining part of the paper is organized as follows: the next section sheds light on the existing literature, taking up each issue within the scope of attribution theory.
Section 3 provides the research methodology, which describes the sample selection, the measurement of managerial attribution bias, and the econometric models.
Section 4 explores the data with the help of descriptive statistics, correlation matrix and scatter plot matrix.
Section 5 presents the exploratory data analysis with agglomerative hierarchical cluster analysis to find the latent groups within the data.
Section 6 provides the estimates of the models, which lead to the discussion of results.
Section 7 concludes the research findings.
2. Literature Review
Canbas et al. (
2005) have explained that it is important to predict the future performance of banks, so that the regulators could take timely actions to mitigate the disastrous outcomes resulting from banking crises. In this context, a large body of literature emerged that dealt with investigating the future performance of banks. Methods developed earlier for manufacturing firms were also adopted for financial firms with some modifications. The methods included the analysis of financial ratios, discriminant analysis (
Beaver 1966;
Altman 1968), logistic models, data envelopment analysis or DEA (
Fethi and Pasiouras 2010), and machine learning algorithms, particularly, the neural networks (
Ravi Kumar and Ravi 2007). These methods have provided the foundation for researchers to predict the future financial performance of banks for more than two decades (
Board et al. 2003). In addition to the earlier models, stress testing has become more widespread since the financial crisis of 2007–2008. A stress test is performed under hypothetical adverse economic conditions to assess whether a bank has the capital required to bear the impact of those conditions (
Petrella and Resti 2013). However, this test is performed by management and is based on hypothetical situations, which may be strategically managed by the managers to show a better financial condition of the bank (principal-agent problem).
Gaganis et al. (
2006) criticized earlier developed models on the basis that these models exclusively relied on quantitative financial data. The
European Central Bank (
2010) also observed that researchers were developing models based only on the financial data for predicting the bank’s performance. However, there is a large quantity of unstructured qualitative information that should also be used in conjunction with numerical data for a comprehensive performance prediction model. In this perspective,
Smith and Taffler (
2000) argued that the qualitative sections of the annual report provide “nearly twice the quantity of information as do the basic financial statements”. Moreover, the qualitative information could also help in understanding financial statements, and provide signals about future financial performance (
Li 2008,
2010b). This was also suggested by
Craig et al. (
2013) that the textual data should also be considered in examining whether firms are healthy or at risk.
Earlier studies on non-financial firms have provided empirical evidence that future performance could be predicted using qualitative information in corporate documents (
Li 2008,
2010a;
Craig et al. 2013). Thus, the qualitative data seems to be a valuable additional information source to supplement the financial data available in corporate financial statements. Research in the manufacturing sector has already begun to explore the narratives in textual disclosures in many ways, including complexity of text, sentiment analysis, and self-attribution bias. In this study, self-attribution bias of management is used to predict the future financial performance of banks.
The literature shows that the attributions made by management provide involuntary signals about the expected outcomes of firms in the textual sections of annual reports, especially in the CEO’s letter to the shareholders, and in management discussion and analysis (
Aerts 1994;
Clatworthy and Jones 2003;
Merkl-Davies et al. 2014).
Amernic and Craig (
2006) showed that when the outcomes of firms were good, the CEO took credit and attributed them to internal factors (their ability, skills, vision and foresightedness). In contrast, if the outcomes were unfavorable, they made external attributions (bad economic and market situations) in an attempt to distance themselves from bad performance, so that they would not be held personally accountable.
Accounting literature has widely analyzed management’s self-attribution bias from different perspectives. For instance,
Craig et al. (
2013) analyzed the bankrupted Indian firm ‘Satyam’, and found that there was a shift from first-person singular pronouns to first-person plural pronouns in the CEO’s letter to shareholders. The letter also showed blame-shifting signals about the bad outcomes of the firm. Similarly,
Clatworthy and Jones (
2003) examined the chairman’s letter to shareholders of the 50 top and bottom UK companies, and found that management took credit for good news, and blamed the external environment for bad outcomes. Likewise,
Li (
2010) analyzed managers’ tendency towards self-serving attribution using a computational linguistic technique applied to management discussions and analysis (MD&A), and found that management’s use of attributional words was positively related to firms’ future performance. More recently,
Lehmberg and Tangpong (
2018) examined how top management communicates good or bad firm performance to the stakeholders. The authors found a significant and positive relationship between subsequent good performance and internal attribution, while bad performance was attributed to external factors.
Researchers have also examined self-attribution bias in contexts other than firm performance. For instance,
Adam et al. (
2015) analyzed whether managers’ past speculative forecasts led to increased investment, using data on 92 North American gold mining firms over the period 1989 to 1999. The results of the study demonstrated that bad forecasts did not reduce investment, because managers attributed successful outcomes to their abilities, while losses were attributed to bad luck or some other external factors.
From the investors’ perspective,
Mushinada and Veluri (
2018) analyzed the investment decisions of investors in terms of behavioral biases. The authors used 1290 stocks traded on the Bombay Stock Exchange of India from 2004 to 2012 to examine whether self-attribution bias existed among investors regarding their earnings forecasts. The results of the study revealed that when the forecast was accurate, the investors took credit, while a wrong forecast was attributed to external factors, especially excessive volatility in the stocks. Likewise,
Chen et al. (
2016) investigated the relationship between attribution of managers for their earnings forecast, and investors’ utilization of that attribution. The study found that investors followed the internal attribution of management for their investments. In another study,
Asay et al. (
2018) investigated attribution through the CEO’s use of personal pronouns in predicting the outcome of lawsuits or good and bad news about the CEO. The results indicated that participants’ assessed likelihood of winning lawsuits or of good performance was associated with the use of more personal pronouns.
3. Methodology
3.1. Data
The author selected all the emerging economies included in the IMF list of emerging economies. Specialized banks were excluded to keep the sample homogeneous. Annual reports were downloaded from the websites of banks, and the financial data were obtained from Bureau van Dijk (BvD)
2. Another condition for sample selection was the availability of CEO letters to shareholders, and management discussion and analysis, in annual reports. Moreover, banks that published annual reports in their national language were excluded from the sample, because this could create reliability problems in the construction of self-attribution bias indices.
After accounting for all changes, the data consisted of 58 banks from 16 emerging economies, and the period covered 2007 to 2015. The list of banks is provided in
Appendix A.
3.2. Variables and Their Definitions
3.2.1. Bank Performance Indicator
Return on Average Equity (ROAE): Return on average equity was taken as a performance indicator of banks. It was calculated as the net income divided by average total equity. ROAE is an internal performance measure of shareholder value and has been widely used for performance prediction of banks (
Aerts 2001;
Petria et al. 2015;
Beccalli et al. 2015;
Yao et al. 2018;
Akhisar et al. 2015). It is a fundamental ratio that tells investors how effectively management uses their money. It offers a direct assessment of the financial return on shareholders’ investment. This ratio shows whether the management is growing the bank’s value at an acceptable rate.
3.2.2. Performance Determinants
Self-Serving Attribution Bias (CEO letter to shareholders): In the corporate context, greater use of first- and second-person pronouns, compared to third-person pronouns, is an indication of taking credit for good outcomes. Researchers have widely analyzed corporate narratives to explore the relationship between the CEO’s self-attribution bias in letters to the shareholders and the firm’s future financial performance (
Clatworthy and Jones 2003;
Amernic and Craig 2006;
Craig et al. 2013;
Lehmberg and Tangpong 2018). In this study, a positive relationship is expected between self-attribution bias of CEO in letter to shareholders and bank’s future performance.
Self-Serving Attribution Bias (Management Discussions and Analysis): This section of the annual report explains the firm’s overall performance, the challenges faced by the management, the internal and external risks involved in the operations, and indications about the future prospects. The self-attribution bias of management in management discussions and analysis was calculated with the same methodology as that used for the CEO letter to shareholders. Earlier studies have evidenced a positive relation between the self-attribution of management in management discussions and analysis and firms’ future financial performance (
Li 2010a;
Lehmberg and Tangpong 2018;
Aerts 2001,
2005). A similar positive relationship is expected with banks’ future financial performance.
Total Assets: In this study, total assets represent the size of the bank, with absolute values in millions of US dollars. There was huge variation in the assets of the banks in the emerging economies. Thus, the logarithm of total assets (Log_Assets) was employed as a proxy for bank size. Larger assets could provide higher profit up to a certain level; thereafter, profitability could be lower than that of small banks. Thus, the relationship between total assets and bank performance could be negative, because profit does not increase in proportion to assets (
Trujillo-Ponce 2013;
Panta 2018;
Terraza 2015;
Shehzad et al. 2013). Some other studies have also shown an insignificant relationship of total assets and firms’ performance (
Athanasoglou et al. 2008;
Petria et al. 2015).
Assets Growth Ratio: The assets growth ratio was calculated as current-year assets minus prior-year assets, divided by prior-year assets. The assets growth ratio indicates the percentage increase or decrease over the prior year. The increase in profitability of the bank depends upon the increase in the quality of assets. Higher quality assets would increase the profitability of the banks. Earlier studies suggested that as assets increased, the profitability of the bank also increased (
Mathuva 2009;
Ahamed 2017;
Bougatef 2017;
Yao et al. 2018). Thus, the relationship between assets growth and banks’ future performance is expected to be positive.
Non-Performing Loans to Gross Loans Ratio: Non-performing loans to gross loans was calculated as total non-performing loans divided by gross loans of the bank. Loans are classified as non-performing when the borrower defaults or declares bankruptcy. The ratio measures the effectiveness of a bank in receiving repayments on its loans. The higher the ratio, the lower the profitability of the banks (
Trujillo-Ponce 2013;
Panta 2018;
Petria et al. 2015).
Tier1Capital Ratio: Tier1capital ratio was calculated as total tier1capital
3 divided by total risk-weighted assets
4. The tier1capital ratio measures a bank’s core capital. In 2015, under Basel III, the minimum tier 1 capital ratio was 6 percent. The regulators use this ratio to determine whether a bank is well capitalized, undercapitalized or adequately capitalized relative to the minimum requirements. Since net income is spread over increased equity, the relationship between performance of banks and tier1capital is expected to be negative (
Stovrag 2017).
Loans to Asset Ratio: Loans to asset ratio was calculated as total loans held by the bank’s borrowers divided by total assets of the bank. The loans included cash deposits at other banks, financial assets, securities, and advances to the borrowers. This ratio indicates to what extent assets are devoted to loans. The literature has shown that the relationship between loans to assets ratio and bank performance could be either positive or negative. The higher the ratio, the lower the liquidity position of the bank, which may then face a higher risk of failure (
Goddard et al. 2013). In addition, a bank holding more liquid assets (lower loan to asset ratio) may suffer from lower profitability (
Demirgüç-Kunt and Huizinga 1999;
Yao et al. 2018). On the other hand, higher loans to borrowers could provide more interest income to the banks (
Trujillo-Ponce 2013).
Interest Rate Spread: Interest rate spread is a country-level variable that refers to the difference between the borrowing and the lending rates of banks. A higher spread shows that banks are advancing loans at a higher premium. Thus, the relationship between interest rate spread and banks’ performance is expected to be negative.
Exchange Rate (1 $ equals local currency): The exchange rate is the price of one country’s currency in terms of foreign currency. The financial data for this study were obtained from Bureau van Dijk (BvD), and were available in US dollars. The exchange rate has an indirect impact on the profitability of the banks. There is a tendency among some countries to keep the local currency weaker to stimulate exports. Because export transactions are executed through banking channels, this increases the non-interest income of the banks (
Medura 2006;
Pagratis et al. 2014). Thus, a positive relationship is expected between exchange rate and return on average equity.
The list of variables and their definitions are provided in
Table 1.
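The bank-level ratios defined in this subsection are simple quotients, and a small worked sketch may help make the definitions concrete. The function names and the figures below are hypothetical and are not taken from the study's data. Two details are assumptions made for illustration: average equity is taken as the mean of opening and closing equity, and Log_Assets uses the natural logarithm (the text specifies only "average total equity" and "the logarithm").

```python
import math


def roae(net_income, equity_start, equity_end):
    """Return on average equity: net income over average total equity.

    Assumption: average equity is the mean of opening and closing equity.
    """
    average_equity = (equity_start + equity_end) / 2
    return net_income / average_equity


def assets_growth(assets_current, assets_prior):
    """(Current-year assets - prior-year assets) / prior-year assets."""
    return (assets_current - assets_prior) / assets_prior


def npl_ratio(non_performing_loans, gross_loans):
    """Non-performing loans divided by gross loans."""
    return non_performing_loans / gross_loans


def tier1_ratio(tier1_capital, risk_weighted_assets):
    """Tier 1 capital divided by total risk-weighted assets."""
    return tier1_capital / risk_weighted_assets


def loans_to_assets(total_loans, total_assets):
    """Total loans divided by total assets."""
    return total_loans / total_assets


def log_assets(total_assets):
    """Proxy for bank size. Assumption: natural logarithm."""
    return math.log(total_assets)


# Hypothetical bank, all amounts in millions of US dollars.
example = {
    "ROAE": roae(net_income=120, equity_start=900, equity_end=1100),
    "Assets growth": assets_growth(assets_current=11000, assets_prior=10000),
    "NPL ratio": npl_ratio(non_performing_loans=40, gross_loans=1000),
    "Tier 1 ratio": tier1_ratio(tier1_capital=720, risk_weighted_assets=6000),
    "Loans to assets": loans_to_assets(total_loans=5500, total_assets=11000),
    "Log_Assets": log_assets(11000),
}
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Keeping each ratio as its own function mirrors the one-definition-per-variable layout of the section and makes it easy to change a single definition, such as the averaging rule for equity.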
3.3. Measurement of Managerial Self-Attribution Bias
Text preprocessing is an exhaustive process, especially when unstructured qualitative information needs to be analyzed. The purpose of preprocessing is to convert qualitative text into a usable form for further analysis. Different software programs have built-in functions that convert text into predefined objects. Many of those programs do not offer users customized options to process text according to specific requirements. For instance, two software programs commonly used in the literature, (i) Diction and (ii) Linguistic Inquiry and Word Count (LIWC), contain predefined built-in functions for text analysis. The text analysis in this study required customized functions to measure the managerial self-attribution bias, because the annual reports of the sampled countries did not follow any standard pattern. Therefore, the open-source R software was used for the measurement of attribution bias indices, because it provides a rich selection of text preprocessing packages.
I also used SAS software for agglomerative hierarchical clustering, and estimation of system GMM.
After downloading the annual reports from the websites of the banks, two sections, namely the CEO’s letter to shareholders and the management discussions and analysis (MD&A)
5 were extracted.
At an initial stage, all the documents were converted from pdf to plain, UTF-8 encoded text using “pdftools” of R, and were stored in a “Corpus”. A corpus is considered to be a “library” of all the original documents that have been converted to plain, UTF-8 encoded text.
Figure 1 exhibits the whole process for the measurement of self-attribution bias indices. Self-attribution bias was constructed as the count of first- and second-person pronouns minus the count of third-person pronouns. To construct the self-attribution bias from the CEO’s letter to shareholders (CEO SAB), two separate dictionaries were constructed, one of first- and second-person pronouns and one of third-person pronouns. These two dictionaries were matched against the corpus of CEO letters to shareholders to obtain the score. Each pronoun was counted as many times as it appeared in the document, reflecting the stress placed by management on taking credit for good performance, and vice versa.
A similar process was adopted to obtain the score of first and second-person pronouns and third-person pronouns of management discussions and analysis (MD&A SAB).
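The scoring step described above can be sketched in a few lines. The study itself used R; the sketch below uses Python purely for illustration. The two pronoun lists are assumptions, since the study's actual dictionaries are not reproduced in the text, and `sab_score` and `tokenize` are hypothetical helper names.

```python
import re
from collections import Counter

# Illustrative dictionaries (assumptions, not the study's actual word lists).
FIRST_SECOND_PERSON = {
    "i", "me", "my", "mine", "myself",
    "we", "us", "our", "ours", "ourselves",
    "you", "your", "yours", "yourself", "yourselves",
}
THIRD_PERSON = {
    "he", "him", "his", "himself",
    "she", "her", "hers", "herself",
    "it", "its", "itself",
    "they", "them", "their", "theirs", "themselves",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sab_score(text):
    """First- and second-person pronoun count minus third-person pronoun count.

    Each pronoun is counted as many times as it appears in the document.
    Caveat: after lowercasing, the abbreviation "US" is indistinguishable
    from the pronoun "us"; a real pipeline would need to handle this.
    """
    counts = Counter(tokenize(text))
    first_second = sum(counts[word] for word in FIRST_SECOND_PERSON)
    third = sum(counts[word] for word in THIRD_PERSON)
    return first_second - third


letter = (
    "We delivered strong results because our strategy worked. "
    "They faced difficult markets."
)
print(sab_score(letter))  # two first-person pronouns minus one third-person pronoun
```

A positive score indicates that first- and second-person pronouns dominate, and a negative score that third-person pronouns dominate, which matches the sign convention of the index described in the text.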
3.4. Econometric Model Using System GMM
For performance prediction, three models were estimated. The first model predicted bank performance using only the two self-attribution bias (SAB) indices. The second model was estimated with the SAB indices along with bank-level quantitative financial variables. The third model was estimated with the full set of SAB indices, bank-level variables, and macroeconomic indicators. The notion behind estimating three models was to observe the predictive power of the SAB indices, over and above the models that were based on quantitative financial data alone.
Dynamic panel models are linear regression models that include individual effects, and so yield both individual-level errors and overall model residual errors. They allow the dependent variable to depend on its own value in the previous period, thus making the model dynamic. The following Equation (1) is specified for model 1.
In Equation (1), the dependent variable is the return on average equity one year ahead, taken as the performance indicator of banks. CEO_SAB is the self-attribution bias calculated from the CEO letter to shareholders, and MD&A_SAB is the self-attribution bias calculated from management discussions and analysis. The model also includes the individual effects and the observation-level regression errors.
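As a sketch of the specification described above, model 1 can be written as follows. The notation is an assumption made for illustration and is not taken from the original: $\alpha$ is an intercept, $\gamma$, $\beta_1$ and $\beta_2$ are coefficients, $\mu_i$ denotes the individual effects, and $\varepsilon_{i,t+1}$ the observation-level errors, with $i$ indexing banks and $t$ years. The lagged ROAE term reflects the dynamic structure described in the text.

```latex
\[
\mathrm{ROAE}_{i,t+1} = \alpha + \gamma\,\mathrm{ROAE}_{i,t}
  + \beta_1\,\mathrm{CEO\_SAB}_{i,t}
  + \beta_2\,\mathrm{MD\&A\_SAB}_{i,t}
  + \mu_i + \varepsilon_{i,t+1}
\]
```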
The second model is shown in Equation (2):
The second model consisted of the SAB indices and bank-level quantitative financial variables, which include log of assets, assets growth ratio, ratio of non-performing loans to gross loans, tier1capital ratio, and loans to assets ratio.
The third model is shown in Equation (3):
The third model consists of the full set of SAB indices, bank-level quantitative financial variables, and macroeconomic indicators, which include interest rate spread, GDP growth, and exchange rate.
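Models 2 and 3 can be sketched in the same spirit. All symbols here are assumptions made for illustration: $B_{k,i,t}$ stands for the five bank-level variables of bank $i$ in year $t$ (log of assets, assets growth, non-performing loans to gross loans, tier 1 capital ratio, and loans to assets), $M_{m,c(i),t}$ for the three macroeconomic indicators of the country $c(i)$ in which bank $i$ operates (interest rate spread, GDP growth, and exchange rate), $\mu_i$ for the individual effects, and $\varepsilon_{i,t+1}$ for the observation-level errors.

```latex
% Model 2: SAB indices plus bank-level variables
\[
\mathrm{ROAE}_{i,t+1} = \alpha + \gamma\,\mathrm{ROAE}_{i,t}
  + \beta_1\,\mathrm{CEO\_SAB}_{i,t}
  + \beta_2\,\mathrm{MD\&A\_SAB}_{i,t}
  + \sum_{k=1}^{5} \delta_k\, B_{k,i,t}
  + \mu_i + \varepsilon_{i,t+1}
\]

% Model 3: model 2 plus macroeconomic indicators
\[
\mathrm{ROAE}_{i,t+1} = \alpha + \gamma\,\mathrm{ROAE}_{i,t}
  + \beta_1\,\mathrm{CEO\_SAB}_{i,t}
  + \beta_2\,\mathrm{MD\&A\_SAB}_{i,t}
  + \sum_{k=1}^{5} \delta_k\, B_{k,i,t}
  + \sum_{m=1}^{3} \theta_m\, M_{m,c(i),t}
  + \mu_i + \varepsilon_{i,t+1}
\]
```

Written this way, each model nests the previous one, so the incremental contribution of the SAB indices can be read against the purely quantitative regressors.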
Diagnostic Checks
Initially, the model was estimated using fixed effects and random effects, and the individual effects were examined using an F-test. The Hausman test was used to select between the fixed effects and the random effects models. The problem with both the fixed effects and the random effects models is that the lagged dependent variable is correlated with the error term, which creates an endogeneity problem even if the error term is not autocorrelated (
Greene 1996, p. 536). Both models are presented in
Appendix C. Regressors are said to be endogenous when they are correlated with the random errors. In fact, endogeneity is a major methodological concern in many areas of research in corporate finance (
Abdallah et al. 2015). If endogeneity is present in the model, then the statistical inference from the analysis may be biased (
Abdallah et al. 2015). Endogeneity may occur due to omitted variables, which may result in the error term being correlated with the explanatory variables. Alternatively, the endogeneity may be of the dynamic type, whereby the past realizations of the dependent variable influence current realizations of one or more of the explanatory variables. Finally, endogeneity can be of the simultaneous type, where the contemporaneous realizations of both the dependent variable and the explanatory variables affect each other (
Abdallah et al. 2015;
Roberts and Whited 2012).
The model in this study potentially faces two types of endogeneity issues, namely omitted variable bias and the influence of past realizations of the dependent variable, in terms of earnings persistence. For example, the SAB indices were developed from the textual data to capture the private information of the management. Nevertheless, these indices may not necessarily be a perfect proxy for private information, and are affected by the agency problem. This is referred to as endogeneity from omitted variable bias. Moreover, the lagged dependent variable is included on the right-hand side of the model, making it dynamic. The reason for including the lagged dependent variable is the performance persistence of the banks, that is, the continuity of current earnings, which is affected by the magnitude of the accruals. More persistent earnings are associated with a greater ability to maintain current earnings (
Lipe 1990). Hence, failure to address the endogeneity may lead to poor statistical inference (
Abdallah et al. 2015).
Some researchers have mentioned that there should be theoretical reasoning for considering a regressor as endogenous. However, as a robustness check, the Durbin-Wu-Hausman test was used to test whether the empirical evidence supports the theoretical reasoning for the endogeneity problem.
Similarly, the model could also face a heteroscedasticity problem. The White test was used for detecting the heteroscedasticity of residuals, and the null hypothesis is that the variances of the errors are homoscedastic, where $\sigma_i^2 = \sigma^2$ for all $i$. In the presence of heteroscedasticity, estimates are still unbiased, but become inefficient. Moreover, the standard errors of the estimates are wrong, leading to incorrect inferences (
White 1980).
A generalization of the linear regression model is an autoregressive (AR) model. The AR test was conducted to identify serial correlation in the residuals. The null hypothesis is that there is no serial correlation within the residuals.
The validity of the instruments was confirmed via the Sargan test, which checks for overidentifying restrictions and is asymptotically distributed as χ²(n) with n degrees of freedom under the null hypothesis that the instrument set is appropriate for the data at hand.
All models were estimated using the SAS ‘proc panel’ procedure.
4. Descriptive Statistics
Table 2 provides the descriptive statistics of 58 banks of 16 emerging economies. Furthermore, detailed descriptive statistics were calculated at country level, so that insight could be obtained in depth about how one country’s statistics were different from another. Country level statistics are shown in
Appendix B. The Mean of ROAE was 16 percent, which was close to median, whereas the lowest value was −27 percent, and the highest was 41 percent, and the standard deviation was 6.75. It was found that the highest ROAE belonged to the NDB bank of Sri Lanka in 2012, because in that year NDB bank became the first investment bank. It made divestment of AVIVA insurance, and made new investment with A/A corporation. Further, economic growth, and the post war period in Sri Lanka also helped to increase profitability. Lowest ROAE −27.80 belonged to an Askari bank of Pakistan due to huge non-performing loans, and the bank written-off non-banking assets.
SAB of CEO ranged between −139 to 181 that means there was polarity of managerial attribution, because in some of the banks, the CEO used more first and second-person pronouns, and in other cases, the CEO used more third-person pronouns. SAB of MD&A also showed a greater dispersion, and ranged between −191 to 1128. The highest values of SAB of MD&A related to those banks who had published annual reports of more than 700–1000 pages, meaning that a large number of pages were also allocated to MD&A section.
Financial variables also presented greater variability in the data. Absolute values were reported in millions of US dollars converted from local currency on the last day of the financial year of the respective country. The mean value of total assets was 187,225 US dollars, and ranged between 143 million US dollars to 3,421,363 million US dollars, having a standard deviation of 486,273. Largest value belonged to the ICBC, because it was one of the largest banks in the China, whereas, lowest assets were reported by the PABC bank of the Sri Lanka. It was important to notice that the Chinese banks were much larger than any other banks in selected samples. The lowest value of assets growth was −22% belonged to the Bank Alfalah of the Pakistan in 2008, which was year of financial crisis, and 56% belonged to the ThanaChart bank of the Thailand. The PABC bank of the Sri Lanka has the highest non-performing loans with the value of 27%, whereas, an average value was 4% close to median.
Among the macroeconomic indicators, the Do-Brazil bank had the highest interest rate spread, ranging between 30 and 35 percent. The high interest rate spread was due to Brazil’s large public debt and debt servicing. Another reason for the high interest rate spread was the history of default, which meant that the government had to pay a high default risk premium to attract foreign capital. Moreover, the exchange rate was lowest in Indonesia, where 1 dollar equalled 13,389 Indonesian Rupiah. The GDP growth rate was lowest in Hungary in 2009, at −6.56, and highest in Singapore in 2010, at 15.24.
The author provides the correlation matrix in
Table 3, which shows that there was no multicollinearity between the variables.
Figure 2 provides the scatter plot matrix of all the SAB indices, quantitative financial variables, and macroeconomic indicators.
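As an illustration of how a correlation matrix can be screened for multicollinearity, the sketch below computes pairwise Pearson correlations and lists the pairs whose absolute correlation exceeds a cut-off. The cut-off of 0.8 is a common rule of thumb assumed here, not a value stated in the study; the column names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def correlation_matrix(columns):
    """columns: {name: values} -> nested dict of pairwise correlations."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def high_pairs(matrix, threshold=0.8):
    """List variable pairs whose absolute correlation exceeds the threshold."""
    names = list(matrix)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(matrix[a][b]) > threshold
    ]


# Toy data: var_b is an exact multiple of var_a, var_c is unrelated to both.
toy = {
    "var_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "var_b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "var_c": [1.0, -1.0, 0.0, -1.0, 1.0],
}
matrix = correlation_matrix(toy)
print(high_pairs(matrix))
```

An empty list from `high_pairs` on the real variables would correspond to the conclusion drawn from Table 3.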
5. Agglomerative Hierarchical Clustering Analysis
Cluster analysis is an important form of exploratory data analysis that tries to uncover hidden groups within the data. It is an unsupervised machine learning method, which is widely used for classification problems and for exploring the data based on similarity of features
6 through a structured pattern (
Wu et al. 2009;
Myatt and Johnson 2014). As a result, observations in the same cluster are more analogous than those in other clusters (
Hsu et al. 2007). More importantly, clustering helps to discover latent natural groups, and categorizes data into a hierarchical set of clusters organized in a tree structure (
Loewenstein et al. 2008).
There are different clustering techniques, including k-means clustering and k-modes clustering, but this study adopted hierarchical clustering, because it is mainly used for small data sets, where each observation forms part of a hierarchy. The SAS ‘proc cluster’ procedure was used for the agglomerative hierarchical cluster analysis.
To calculate the similarities between observations, all the variables must first be normalized, so that the distances between observations are computed without disproportionate weights and biases. The data used in this study consisted of total assets reported in millions of dollars, whereas the other financial variables were in percentages. Therefore, the whole dataset was normalized using the following formula:
$$z_n = \frac{x_n - \bar{X}}{s_X}$$
where $z_n$ is the standardized value of observation $n$, $x_n$ is the original value of observation $n$, and $\bar{X}$ and $s_X$ are the mean and standard deviation of the variable $X$.
The sample data consisted of a panel of 58 banks covering the period 2007–2015. Clustering the panel data directly could match one bank’s observation in a given year with another bank’s observation in a different year. This could distort the data and the interpretation of the clusters. Therefore, the mean of each bank over the period was taken, so that each bank represented a single observation. Moreover, the data consisted of 11 variables, which could also make it difficult to identify the variables that differentiate the clusters. Principal component analysis was performed to identify the variables that loaded on the first factor (see
Appendix C). Four variables loaded on factor-1 and were therefore used to form the clusters: assets growth, total assets, NPL to gross loans, and GDP growth.
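The two preprocessing steps described above, averaging each bank over the sample years and then standardizing each variable, can be sketched as follows. This is a minimal illustration under stated assumptions: the sample standard deviation (divisor n − 1) is used, which the text does not specify, and the function names and toy panel are hypothetical.

```python
from statistics import mean, stdev


def bank_means(panel):
    """Collapse (bank, year, value) rows into one mean value per bank."""
    by_bank = {}
    for bank, _year, value in panel:
        by_bank.setdefault(bank, []).append(value)
    return {bank: mean(values) for bank, values in by_bank.items()}


def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - centre) / spread for v in values]


# Toy panel for one variable: three hypothetical banks, two years each.
panel = [
    ("A", 2007, 10.0), ("A", 2008, 14.0),
    ("B", 2007, 20.0), ("B", 2008, 24.0),
    ("C", 2007, 30.0), ("C", 2008, 34.0),
]
means = bank_means(panel)
z_scores = standardize(list(means.values()))
print(means)
print(z_scores)
```

Averaging first and standardizing second means that the z-scores describe differences between banks rather than between bank-years, which matches the one-observation-per-bank design.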
5.1. Measurement of Distance between Observations
The distance between observations was calculated using the Euclidean distance metric (
Myatt and Johnson 2014). It is shown as follows.
$$d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$
where $d$ is the distance between observations $p$ and $q$ measured on $n$ variables. Thus, the Euclidean method calculates the distance for every pair of observations. Then, the linkage rule creates clusters by comparing the distances between clusters and observations. This process continues until a diagram called a dendrogram is created, which shows how the observations are connected to each other based on their similarities. The dendrogram is shown in
Figure 3.
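The distance calculation and the successive merging that produce a dendrogram can be sketched in a few lines. The study used the SAS ‘proc cluster’ procedure; the version below is only an illustration in plain Python, and it assumes average linkage because the linkage rule is not named in this section. All function names and the toy points are hypothetical.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two observations with the same variables."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns the final clusters (lists of point indices) and the merge
    history, which is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: average_linkage(clusters[ab[0]], clusters[ab[1]], points),
        )
        distance = average_linkage(clusters[a], clusters[b], points)
        merges.append((clusters[a], clusters[b], distance))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters, merges


# Toy standardized observations: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, merges = agglomerate(points, n_clusters=3)
print(sorted(sorted(c) for c in clusters))
```

The isolated point stays in a cluster of its own, in the same way that a single bank can remain unmerged when it is far from every other group.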
To determine the optimum number of clusters, (a) the cubic clustering criterion (CCC), (b) the pseudo t² statistic, and (c) the pseudo F statistic were used. The highest peak of the CCC is taken as the optimum number of clusters. In our case, the CCC plot peaked at a value of about 4, which indicated 9 as the optimal number of clusters, as shown in
Figure 4.
The pseudo F statistic (PSF) also supported the selection of 9 clusters by showing its highest peak there (middle graph). For the pseudo t² statistic, reading the graph from right to left until the first peak also indicated about 9 clusters.
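For readers unfamiliar with the pseudo F statistic, the sketch below computes it for a given cluster assignment as the ratio of the between-cluster mean square to the within-cluster mean square, that is, (B / (k − 1)) / (W / (n − k)) for n observations and k clusters. The function names and toy data are hypothetical, and the code is an illustration rather than the SAS implementation.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sum_of_squares(points, centre):
    """Total squared Euclidean distance from the points to a centre."""
    return sum(sum((x - c) ** 2 for x, c in zip(p, centre)) for p in points)


def pseudo_f(points, labels):
    """Pseudo F = (between SS / (k - 1)) / (within SS / (n - k))."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    n, k = len(points), len(groups)
    if k < 2 or k >= n:
        raise ValueError("pseudo F needs 2 <= k < n")
    total = sum_of_squares(points, centroid(points))
    within = sum(sum_of_squares(g, centroid(g)) for g in groups.values())
    between = total - within
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two well-separated pairs of observations.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = pseudo_f(points, [0, 0, 1, 1])  # groups follow the separation
poor = pseudo_f(points, [0, 1, 0, 1])  # groups cut across the separation
print(good, poor)
```

A larger value indicates tighter, better separated clusters, which is why the number of clusters at the peak of the statistic is preferred.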
Table 4 shows the number of banks in each cluster, percent frequency, and cumulative percent frequency.
Figure 5 and
Figure 6 show the distribution of clusters in a pie chart and a bar chart, respectively.
5.2. Cluster Profile
To profile the clusters, descriptive statistics were calculated on the unstandardized dataset, grouped by cluster membership, to explore the reasons for the similarities and differences between clusters.
This helps to understand the patterns between clusters and the rationale behind how these clusters were formed. The statistics included mean, median, standard deviation, skewness, and kurtosis, shown in
Table 5.
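The profiling step amounts to grouping the unstandardized values by cluster label and summarizing each group. A minimal sketch follows; it covers only the count, mean, median, and standard deviation (skewness and kurtosis are omitted for brevity), and the function name, labels, and toy values are hypothetical.

```python
from statistics import mean, median, stdev


def profile(values, labels):
    """Summarize one variable per cluster: n, mean, median, and std."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    summary = {}
    for label, members in sorted(groups.items()):
        summary[label] = {
            "n": len(members),
            "mean": mean(members),
            "median": median(members),
            # The standard deviation is undefined for a single-member cluster.
            "std": stdev(members) if len(members) > 1 else None,
        }
    return summary


# Toy assets-growth values (percent) with hypothetical cluster labels.
growth = [8.0, 10.0, 12.0, 20.0, 24.0, 18.0]
cluster = [1, 1, 1, 2, 2, 9]
summary = profile(growth, cluster)
for label, stats in summary.items():
    print(label, stats)
```

Comparing each cluster's mean and median with the whole-sample values is what allows statements such as "higher assets growth than the sample average" to be made for a given cluster.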
Observing the mean statistics of the four variables used in the cluster analysis, the mean value of assets growth was 10 percent and that of GDP growth was 5.16 percent. However, non-performing loans (NPL) to gross loans were only 4.24 percent. These values were also close to the median, meaning that the distribution of these variables was symmetrical. These statistics were compared with the cluster statistics.
Table 5 shows that cluster-1 included those banks which had smaller mean and median values than the whole dataset, except for total assets. It included 13 banks: four banks from the Philippines, four banks from Sri Lanka, two banks from Indonesia, and one bank each from India, Bangladesh, and Malaysia. Cluster-2 consisted of banks with high assets growth that belonged to countries with a better GDP growth rate, and that had lower NPL and lower average total assets than the whole sample. This cluster consisted of 6 banks: four Indian banks, and one bank each from Sri Lanka and Bangladesh. Cluster-3 comprised the banks with the highest assets growth, belonging to the country with the highest GDP growth rate, with the lowest NPL, and the largest banks in the selected sample. It was interesting to note that all three banks in cluster-3 belonged to China.
Cluster-4 included 16 banks: five from Turkey, three each from Malaysia and Singapore, and one bank each from Thailand, Indonesia, and Nepal. Banks included in this cluster had lower assets growth, which was also reflected in the lower GDP growth of their countries.
Banks included in cluster-4 had lower average NPL and total assets than the whole sample. Cluster-5 contained banks with lower average assets, GDP growth, and NPL than the whole sample, which was also supported by the other statistics. Banks included in this cluster were relatively small banks. Three Chinese banks also formed cluster-6, which included no bank from any other emerging economy. Cluster-3 and cluster-6 both contained Chinese banks, but the major differentiators between them were assets growth and the size of the banks. Cluster-3 banks were smaller in size but had the highest assets growth compared with the banks of cluster-6.
Cluster-7 included banks which had the highest non-performing loans and the second-lowest assets growth. The data also revealed that the Askari bank and NBP bank from Pakistan, and the OTP bank from Hungary, had persistently high non-performing loans. In cluster-8, the major differentiator was assets growth, which had the lowest mean value and the highest coefficient of variation. Finally, cluster-9 contained only one bank, which did not merge into any other cluster. The reason was that this bank had the highest mean assets growth, 18.14 percent, and was the smallest bank in terms of total assets. The list of banks in each cluster, along with their countries, is shown in
Table 6.
7. Conclusions
Banks are important financial intermediaries within the financial system because they help to promote the economic growth of a country. Nevertheless, banking sector crises have a history of contributing to financial turmoil, especially in the 2007–2008 sub-prime crisis. Such banking crises not only reduced industrial production, entrepreneurial innovation, and trade, but also had knock-on effects on the rest of the world. Therefore, the prediction of bank performance is important for regulators to take pre-emptive actions to avoid huge losses. However, the models developed for predicting bank performance were based only on quantitative financial data.
In this research, self-serving attribution bias, which is a text analysis technique based on attribution theory, was used for a contextual understanding of managerial behavioral bias towards the outcomes of banks. The notion behind self-attribution was that management uses more first- and second-person pronouns than third-person pronouns in annual reports if they anticipate better future performance.
The sample consisted of 58 banks from 16 emerging economies for the period 2007–2015. For exploratory data analysis, hierarchical clustering, an unsupervised machine learning method, was performed to detect latent groups within the data. It was observed that some of the banks joined clusters based on asset growth, whereas other banks formed clusters due to high NPL. GDP growth also worked as a differentiator for grouping the banks into clusters. Finally, the size of the banks in terms of assets distinguished the small banks from the medium and large banks.
To predict the future performance of banks, system GMM proposed by
Blundell and Bond (
1998) was used to estimate the models. System GMM helps to deal with the endogeneity and heterogeneity problems within the data. The results of the study have shown that there was a strong relationship between managerial attribution bias and the future performance of banks in emerging economies. The results were consistent with attribution theory, which predicts that managers take credit for good outcomes and distance themselves from bad outcomes. Therefore, the study concludes that the self-attribution bias of management provides a signal about the future performance of banks, over and above the quantitative financial data provided in financial statements.
7.1. Policy Implications
Regulators: Any technique that could even marginally improve the ability of regulatory authorities to make an assessment of overall bank performance would be beneficial, because supervisors may intervene in a timely manner to avoid bank failure (
Gandhi et al. 2019). The findings of the study have shown that the self-attribution bias of management in annual reports provides incremental information, over and above the quantitative data provided in the financial statements of banks. Such information could be used as an early warning indicator, and could help the regulatory authorities of emerging economies to identify poorly performing banks. As a result, the banks could be supervised more efficiently, and preventive and corrective measures could be taken to avoid huge losses to investors and the government.
Investors and Analysts: The findings also show that the contextual information provided by management in emerging economies’ banks can help reduce information asymmetry between shareholders and management (principal-agent theory). It can provide existing as well as potential investors with a better tool for a comprehensive assessment of banks’ profitability for prospective investments.
Researchers: The use of textual information from the annual reports of banks is a relatively unexplored area, which may yield rich insights for researchers testing further hypotheses of interest.
7.2. Limitation of the Study
The sample was relatively small, because the author had to follow certain criteria for the inclusion of banks in the sample. For example, the annual reports of banks had to include a CEO letter to shareholders, and a management discussion and analysis (MD&A) section. The CEO letters were mainly available in the annual reports; however, MD&A was not a regulatory requirement in emerging economies. In addition, the banks in emerging economies were either state owned or family owned, or had become public limited companies only recently. Therefore, most of the banks did not have these two sections. These issues reduced the sample size of banks.
Cultural differences between countries may have their own effects on how management uses first-person and second-person pronouns in annual reports. For instance, it might be the convention in some sample countries that management uses more plural pronouns in these two sections.
Likewise, English was not the first language of the banks in emerging economies whose annual reports were analyzed. Therefore, the way pronouns are used when writing annual reports may have implications for the construction of the self-attribution bias indices.
Finally, the literature on textual analysis has mainly focused on developed economies. Most of the banks in emerging economies are either state owned or family owned. Thus, management might not necessarily take credit for good performance and attribute bad performance to external factors, because of its restricted power relative to the board of directors and shareholders.