Measurement of Investor Sentiment and Its Bi-Directional Contemporaneous and Lead–Lag Relationship with Returns: Evidence from Pakistan

The present study examines bi-directional contemporaneous and lead–lag relationships between investor sentiment and market returns in the emerging market of Pakistan over the period 2006 to 2016. To measure investor sentiment, the study employs a direct proxy, namely the Google search volume index (GSVI), and nine indirect proxies. Besides conventional regression and a VAR model, the study applies Geweke’s (1982) tests to investigate the nature of the relationships between sentiment and returns. Thus, the study adds to the existing literature by providing recent and thorough statistical evidence on the role of investor sentiment in influencing market returns. The study finds sufficient evidence of irrational behavior of investors in the thin market of Pakistan. In particular, the results indicate a substantive role of sentiment in dragging the stock market away from its sustainable path as implied by economic fundamentals.


Introduction
For decades, academicians, economists, and researchers have focused on the role of financial markets in promoting sustainable economic growth and development by channeling capital mobility, risk sharing, etc. However, when stock markets react strongly to sentiment, they deviate from their ideal role of conveying credible signals regarding the health of the economy and instead provide opportunities for unproductive speculative and rent-seeking activities. In extreme cases, sentiment may drag financial markets so far away from their fundamental paths as to create financial crises and thereby make real economic progress unsustainable.
It is not surprising, therefore, that the effect of investor sentiment on stock returns has been the focus of increased theoretical and empirical investigation in the behavioral finance literature, which focuses on three elementary units, namely limits to arbitrage (e.g., [1]), investors' behavioral preferences (see [2]), and their sentiment [3][4][5][6]. Investor sentiment is defined as the conception that depicts in what ways investors develop their preferences and beliefs using moods, emotions, psychological bias, and cognitive bias, and afterward forecast future asset prices. Sentiment is the result of emotional responses rather than changes in the stock market's fundamentals, and it affects the expectations of stock returns [7]. Barberis et al. [8] defined sentiment as a communal judgment fallacy made by a group of stakeholders that are noise traders, as opposed to a chain of uncorrelated errors. According to De Long et al., Brown and Cliff, Lemmon and Portniaguina, Baker and Wurgler, and Berger and Turtle [9][10][11][12][13][14], market sentiment is optimistic or pessimistic convictions of investors on stocks, which are [15][16][17] interpreted sentiment as prejudices and errors in beliefs of investors regarding prospects of future performance of a company. Sentiment can be measured by direct or indirect approaches. Direct measures rely on information gained through surveys, which seek information from individuals regarding their feelings about the stock market and economic conditions, and through electronic sources like the internet and social media. On the other hand, indirect measures are based on financial and economic variables that depict investors' mindset.
Both the direct and indirect measures have been used extensively in the empirical literature. For example, Lemmon and Portniaguina, Brown and Cliff, Zhang, Lux, and Ali et al. [11,18,19,20,21] used surveys to measure sentiment. Da et al. [22] constructed the Financial and Economic Attitudes Revealed by Search (FEARS) index by using daily internet search volume from millions of households as a measure of sentiment. Bollen et al. [23] used Twitter feeds, while Edmans et al., Palomino et al., and Kaplanski and Levy [24][25][26][27][28] used websites to access comments/views on the results of various World Cup games to measure investor sentiment. It is pertinent to note here that recent literature (e.g., [29][30][31][32][33]) has also interpreted information based on internet search, especially Google Trends, as an indicator of uncertainty rather than just a measure of sentiment. Sentiment could also be interpreted as fear. For example, Ghosh et al. [34] used the volatility index (VIX), constructed by the Chicago Board Options Exchange based on 30-day volatility and market expectations, and interpreted it as a measure of fear rather than sentiment or uncertainty. Some of the popular indirect measures used in the empirical literature include the number of IPOs, first-day returns on IPOs and the closed-end fund discount [10,12,35], the ratio of the number of advancing issues to the number of declining issues [10,36], mutual fund flows [10,37], turnover [38], dividend premium, trading volume and equity share in new issues [12], and the bull-bear indicator of financial markets, which includes the relative strength index [36,39].
No single indicator can capture the broad concept of sentiment; each provides information only on certain characteristics of firms such as performance, liquidity, activity level, etc. Therefore, the dominant opinion in the literature proposes constructing a sentiment index from various individual indicators.
The literature reveals a high correlation between investor sentiment and market returns. In periods of great optimism (pessimism) or high (low) sentiment, investors as a group overrate (underrate) assets and, hence, market valuation becomes greater (lower) than the intrinsic value. Thus, it is suggested that besides traditional theory, the role of investor sentiment should be considered in asset-pricing models. Kaplanski and Levy [40] revealed that bad mood and anxiety provoked by media coverage of aviation disasters create negative sentiment and affect stock prices. Curatola et al. [41] used sport sentiment and found that it has a significant impact on the returns of the financial sector. Bijl et al. [42] found that stock returns can be predicted using Google search volume, with a negative mutual relationship. According to Kim et al. [43], search volume has an insignificant relationship with returns but a significant positive relationship with trading volume.
As noted in Nisha [53], academic interest has recently shifted towards emerging economies to investigate the movement in stock prices. For example, according to Anusakumar et al. and Kaplanski et al., investor sentiment has a significant positive relationship with returns. Barber et al. [57], however, showed that at longer horizons there is a negative relationship between stock returns and investor sentiment. Lao et al. [58] observed a positive effect of return shocks on sentiment and a negative impact of sentiment shocks on market returns. Pakistan was re-awarded the status of emerging market by Morgan Stanley Capital International (MSCI) in May 2017 after nine years. However, in the second half of 2017, the KSE-100 index lost 15% in total return (as compared to a 24% average return during the past 10 years). This underperformance was experienced during a period when healthy gains were exhibited by peer countries like India, Bangladesh, Malaysia, etc. [59]. This indicates that, like most other markets, the Pakistani stock market has a substantial noise component, which needs to be analyzed. Although a number of studies have attempted to explore the role of investors' sentiment in this regard, the recently proposed measures of sentiment, such as the Google search volume index, have not yet been used for Pakistan. Besides using a broad index of investors' sentiment, the present study also employs comprehensive statistical tools to measure the relationship between sentiment and returns.
In the Pakistani context, various studies have focused on the behavioral factor of investor sentiment. The recent study of Rehman et al. [60] found a positive relationship between returns and the sentiment index. Similarly, Ahmed and Ullah, Ahmad et al., Awan et al., Rehman, and Sarwar et al. [61][62][63][64][65] found a significant relationship of future trading with investor sentiment. Khan and Rahman [66] explored the influence of sentiment on industry returns and found a significant relationship. Chughtai et al. [67] revealed a negative relationship of sentiment with current and future returns. Sadaqat and Butt [68] concluded that sentiment has a positive contemporaneous and a negative lagged effect on excess returns.
The present study quantifies a new and more comprehensive measure of investor sentiment for Pakistan in comparison to previous studies. Only a few empirical studies have used both direct and indirect indicators of sentiment. In particular, no study can be found for Pakistan that makes use of sentiment data derived from internet search trends. Specifically, the study includes both direct and indirect proxies to construct a composite sentiment index for the Pakistani stock market using principal component analysis. For the first time, the Google search volume index is used as a direct measure of sentiment for the Pakistani market. In addition, nine indirect measures are used: the number of initial public offerings, closed-end fund discount, advance-decline ratio, dividend premium, interest rate, price-earnings ratio, turnover, money flow index, and relative strength index. The composite index is a broad representation of sentiment measures from different categories/perspectives, for instance, stock prices, turnover, mutual funds, initial public offerings, interest rate, and especially Google search volume.
Therefore, the study adds to existing literature by providing latest evidence on the role of investor sentiment in influencing market returns in the emerging market of Pakistan, considering a broader measure of sentiment.
The possible bi-directional contemporaneous and lead-lag relationships between returns and investor sentiment for the Pakistani market are also investigated in the study, which is new to the literature. Another novel feature of the study is that, besides conventional regression and a VAR model, the approach proposed by Geweke [69] is also applied. Thus, in addition to analyzing the lead-lag feedback responses between sentiment and returns, the study also investigates contemporaneous feedback between the two variables, which has not yet been done for the Pakistan Stock Exchange (PSX). The remaining part of this article is organized as follows. Section 2 presents methodology and data, followed by Section 3, which provides empirical results. Section 4 concludes the paper.

Measurement of Sentiment
The first task of this study is to calculate investor sentiment in the Pakistani stock market using direct and indirect proxies for the period 2006 to 2016. Direct measures are developed through surveys of investors and business studies [10,11,44]. Nowadays, a bulk of information is available on different search engines and social media that provide a more efficient and less costly alternative to field surveys. According to Mondria et al. and Choi and Varian [70,71], Google search volumes of certain terms are correlated with investment. Therefore, a change in search volume provides information about investors' buying and selling patterns, which eventually shows an optimistic or pessimistic trend in the market [72].
The indirect approach measures investor sentiment using financial market variables on the basis of financial theories. Brown and Cliff [10] classified indirect sentiment indicators into four categories representing performance, types of trading activity, derivatives trading activity, and 'market weather vanes'. According to Simon and Wiggins [73], every single proxy comes under one of three categories: performance, liquidity, and financing activities of firms. Based on such classifications, quite a few indirect indicators of investor sentiment have been proposed and used in the literature, which include the relative strength index (RSI) and money flow index (MFI), dividend premiums, closed-end fund discount, put-call ratio, and buy-sell imbalance (see, for instance, [9,10,36,39,46,74,75]).
The main advantage of a direct measure of sentiment is that it provides direct information about the psychological mindset of people (pessimism, optimism, etc.), which does not need to be endorsed by financial theory [76,77]. The main disadvantages of survey measures are errors in questionnaires or interviews, prestige bias, and, most importantly, the low frequency of sampling. The indicators of the indirect method have the advantage that they rely on financial theories and are generated from financial data, which are mostly measured reliably and are also available at higher frequencies. However, financial data may capture only the investor's expectations about price changes and no other elements of the investor's attitudes [78]. Compared to survey-based direct measures and indirect measures, the measure derived from financial and economic attitudes based on daily internet search volume has various advantages. For instance, it reveals attitudes instead of inquiring about them, is obtainable at a high frequency, and discloses more personal information than surveys, where the response rate and motivation for truth-telling are low [22].
In the light of the above observations, the present study uses nine indirect measures and one direct measure, the Google search volume index (GSVI). This choice is based on two considerations. First, the selected indirect measures have been used by some of the important studies, for instance, Brown and Cliff, Baker and Wurgler, Chen et al., Fisher and Statman, and Gao et al. [10,12,13,39,79,80]. Second, GSVI is a unique direct measure of sentiment from which a time series can be constructed and combined with data on indirect measures.
It may be noted here that the ability of the sentiment proxies popularly used in the literature to represent sentiment can be questioned because some of these proxies indicate performance and other such attributes of the financial market. One possible way out, which this study also follows, is to remove the non-sentiment component from the proxies by orthogonalizing them to fundamental macroeconomic variables at some stage of computation.
The details of the nine indirect proxies used in the present study and the methods to calculate them are as follows:

1. Number of Initial Public Offerings (NIPO): When a company issues stock to the public for the first time, it is known as an initial public offering (IPO). During periods of optimism in the market, more companies go public; so the higher the number of IPOs, the stronger the positive (bullish) sentiment [81].

2. Closed-End Fund Discount (CEFD): A pooled investment fund that uses an IPO to raise a fixed amount of capital is called a closed-end fund. It is traded like a stock at a price set by the market, and the average difference between the net asset values of closed-end stock fund shares and their market prices is the CEFD. It is inversely related to investor sentiment, i.e., when the discount increases as a result of a decrease in the share price, the investor will have a bearish sentiment.
Only three closed-end funds are left in the mutual fund sector of Pakistan, and their data are available from February 2006 onwards. The value-weighted discount of Lee et al. [44] is applied for the computation, i.e.,
$$CEFD_t = \sum_{i=1}^{n} W_{it}\, DISC_{it}, \qquad DISC_{it} = \frac{NAV_{it} - SP_{it}}{NAV_{it}} \times 100, \qquad W_{it} = \frac{NAV_{it}}{\sum_{j=1}^{n} NAV_{jt}},$$
where $DISC_{it}$ and $W_{it}$ are the discount and associated weight of fund $i$ at the end of period $t$ and $n$ is the number of funds, while $NAV_{it}$ and $SP_{it}$ denote the net asset value and stock price of fund $i$ at the end of period $t$.
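As an illustration, the value-weighted discount can be computed along the following lines. This is a minimal sketch, assuming NAV-based weights and a percentage discount in the spirit of Lee et al.; the function name `value_weighted_discount` and the sample figures are hypothetical and are not taken from the study.

```python
from typing import Sequence


def value_weighted_discount(navs: Sequence[float], prices: Sequence[float]) -> float:
    """Value-weighted closed-end fund discount for one period, in percent.

    navs[i]   : net asset value of fund i at the end of the period
    prices[i] : market (stock) price of fund i at the end of the period
    """
    if len(navs) != len(prices) or not navs:
        raise ValueError("navs and prices must be non-empty and equally long")
    total_nav = sum(navs)
    cefd = 0.0
    for nav, price in zip(navs, prices):
        discount = (nav - price) / nav * 100.0  # percentage discount of fund i
        weight = nav / total_nav                # assumed NAV-based weight of fund i
        cefd += weight * discount
    return cefd


# Three hypothetical funds (illustrative numbers only).
example_cefd = value_weighted_discount([10.0, 20.0, 30.0], [9.0, 18.0, 24.0])
```

With these inputs the individual discounts are 10%, 10%, and 20%, so the larger third fund pulls the weighted figure above the simple average.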

3. Advance-Decline Ratio (AVDC): AVDC is an indicator of market breadth. It is calculated as the ratio between the numbers of advancing and declining issues:
$$AVDC_t = \frac{\text{Number of advancing issues}_t}{\text{Number of declining issues}_t}.$$
This ratio indicates the direction of the market on a net basis. If the ratio is equal to one, it would mean that on average the market is neither bullish nor bearish, whereas a value greater (less) than one indicates bullish (bearish) sentiment. Furthermore, an increase (decrease) in the ratio indicates movement of the market in the bullish (bearish) direction.

4. Dividend Premium (DP): DP is the log difference of the average market-to-book ratios (M/B) of payers and nonpayers:
$$DP_t = \ln\left(\overline{M/B}_t^{\,\text{payers}}\right) - \ln\left(\overline{M/B}_t^{\,\text{nonpayers}}\right).$$
This indicator is inversely related to sentiment because optimistic investors are more interested in stocks that have more investment opportunities than in the attraction of dividends.

5. Interest Rate (IR): The cost of investment is represented by the inter-bank offer rate (IBOR). When it becomes high, the cost of capital increases and profits decline and, as a result, some investors leave the stock market. Therefore, an increase in IBOR is supposed to be a bearish signal.

6. Price-Earnings Ratio (PE): This is a valuation ratio that measures how much investors are willing to pay per unit of current earnings and reveals the market's expectation about a company's growth. Usually a high PE ratio indicates a relative degree of overvaluation in asset prices and, therefore, a high sentiment level. The PE ratio is calculated as
$$PE_t = \sum_{i=1}^{N} \frac{EPS_{it}}{\sum_{j=1}^{N} EPS_{jt}}\, PE_{it},$$
where $PE_{it} = MP_{it}/EPS_{it}$ is the ratio between the market price and earnings per share of company $i$ in period $t$, $EPS_{it}/\sum_{j} EPS_{jt}$ is the weight attached to the price-earnings ratio of company $i$ in period $t$, and $N$ is the number of companies.

7. Turnover (TO): TO indicates the liquidity of the market. It is the ratio of the number of shares traded daily (turnover) to the number of shares outstanding. When turnover increases, investor sentiment also increases. TO is calculated using the formula
$$TO_t = \sum_{i=1}^{N} \frac{OS_{it}}{\sum_{j=1}^{N} OS_{jt}}\, TO_{it},$$
where $TO_{it} = \text{Turnover of company } i / \text{Outstanding shares of company } i$ and $OS_{it}/\sum_{j} OS_{jt}$ represents the associated weights based on outstanding shares.
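The weighted aggregation used for the market-level turnover can be sketched as follows. The sketch assumes that the weights are each company's share of total outstanding shares (the normalization is an assumption); the function name `market_turnover` and the numbers are hypothetical.

```python
from typing import Sequence


def market_turnover(traded: Sequence[float], outstanding: Sequence[float]) -> float:
    """Outstanding-share-weighted average of company turnover ratios.

    traded[i]      : shares of company i traded during the day
    outstanding[i] : shares of company i outstanding
    """
    total_outstanding = sum(outstanding)
    market_to = 0.0
    for shares_traded, shares_out in zip(traded, outstanding):
        to_i = shares_traded / shares_out            # company-level turnover ratio
        weight_i = shares_out / total_outstanding    # assumed normalized weight
        market_to += weight_i * to_i
    return market_to


# Two hypothetical companies: turnover ratios of 5% and 10%.
example_to = market_turnover([50.0, 300.0], [1000.0, 3000.0])
```

Under this weighting the market figure coincides with total shares traded divided by total shares outstanding, which is a convenient cross-check on the data.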

8. Money Flow Index (MFI): This indicator measures buying and selling pressures in the market. The MFI holds both price and turnover information and ranges from 0 to 100. An exceptionally high value (80 or above) indicates that an equity is overbought, while an exceptionally low value (20 or below) indicates that the equity is oversold [39]. To compute this index, one first has to compute the money flow for each company as
$$\text{Money Flow}_{it} = \text{Typical Price}_{it} \times \text{Turnover}_{it},$$
where $\text{Typical Price}_{it} = (\text{Price\_High}_{it} + \text{Price\_Low}_{it} + \text{Price\_Close}_{it})/3$.
If the typical price today is greater (less) than that of yesterday, then today's money flow is considered a positive (negative) money flow. The sum of all positive (negative) money flows in the previous seven days is defined as the positive (negative) money flow. The next step is to calculate the MFI for each company as follows.
$$MFI_{it} = 100 \times \frac{\text{Positive Money Flow}_{it}}{\text{Positive Money Flow}_{it} + \text{Negative Money Flow}_{it}}$$
The MFI for the market is estimated by taking the simple average of all the MFIs, that is,
$$MFI_t = \frac{1}{N}\sum_{i=1}^{N} MFI_{it}.$$

9. Relative Strength Index (RSI): It is a market indicator that shows whether the market is oversold or overbought. An RSI of 80 indicates that the market is overbought, while an RSI of 20 indicates that the market is oversold. The seven-day RSI is used and defined as
$$RSI_{it} = 100 \times \frac{\sum_{k=0}^{6} (P_{i,t-k} - P_{i,t-k-1})_{+}}{\sum_{k=0}^{6} \left|P_{i,t-k} - P_{i,t-k-1}\right|},$$
where $(P_{i,t-k} - P_{i,t-k-1})_{+}$ equals the price change when it is positive and zero otherwise. The RSI for the market is estimated by taking the simple average of all the RSIs, that is,
$$RSI_t = \frac{1}{N}\sum_{i=1}^{N} RSI_{it}.$$

10. Google Search Volume Index (GSVI): The direct proxy used here is GSVI, which is based on households' Google search behavior as recorded by Google Trends. Daily search volumes of English-language search terms that relate to finance and economics are downloaded to measure sentiment concerning economic conditions. The Harvard IV-4 Dictionary and the Lasswell Value Dictionary are used for developing the set of finance- and economics-related search terms. Following Gao et al. [80], the steps given below are performed for developing the Google search volume index (GSVI):
a. Using markers like "Econ@", "ECON", and "EXCH", a set of words is created. The study focuses on words that show positive or negative sentiment (marked with positive or negative tags). This approach provides search words like cost, profit, bankruptcy, etc.
b. The words are fed into Google Trends and the top ten phrases related to each word are extracted. For instance, the top expressions related to gold are "gold rate", "Pakistan gold rate", etc.
c. Only the search terms having at least 100 weekly SVI observations are considered, and the terms that turn out not to be evidently associated with finance and economics are discarded.
d. The daily SVI of the search terms is downloaded for the period 2006-2016, and the daily change in SVI for each term is then calculated. To make them comparable, all data are deseasonalized and standardized to obtain the adjusted daily change in SVI (∆ASVI) for all search terms.
e. We let the data speak for themselves to tag search terms with positive/negative sentiment and identify which of them are most significant for returns. In this regard, expanding backward regressions of ∆ASVI on Pakistan's market returns are used. With a total of 3213 observations, the study uses windows of 21 days (the average number of trading days per month) for the expanding backward regressions. These daily data yield the search terms with significant positive and negative t-statistics of slope coefficients.
f. To construct the sentiment index, the ∆ASVI of positive and negative terms are consolidated into t-statistic weighted averages and the difference between them is calculated as a measure of sentiment, that is,
$$GSVI_t = \Delta ASVI_t^{+} - \Delta ASVI_t^{-},$$
where $\Delta ASVI_t^{+}$ ($\Delta ASVI_t^{-}$) is the t-statistic weighted average of positive (negative) search items.
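Step f can be illustrated with a short sketch. It assumes that each term's weight is proportional to the absolute value of its t-statistic, which is one plausible reading of "t-statistic weighted average"; the function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def tstat_weighted_average(changes: Sequence[float], tstats: Sequence[float]) -> float:
    """Average of adjusted SVI changes, weighted by |t-statistic| (assumed scheme)."""
    weights = [abs(t) for t in tstats]
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, changes)) / total


def gsvi_sentiment(pos_changes, pos_tstats, neg_changes, neg_tstats) -> float:
    """Difference between the weighted averages of positive and negative terms."""
    positive = tstat_weighted_average(pos_changes, pos_tstats)
    negative = tstat_weighted_average(neg_changes, neg_tstats)
    return positive - negative


# One hypothetical day: two positively tagged and two negatively tagged terms.
example_gsvi = gsvi_sentiment(
    pos_changes=[0.5, 1.0], pos_tstats=[2.0, 3.0],
    neg_changes=[0.2, -0.4], neg_tstats=[-2.5, -2.5],
)
```

Here the positive terms average 0.8 and the negative terms average -0.1, so the day's sentiment reading is 0.9.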
To obtain a composite investor sentiment index based on the above-calculated sentiment measures, principal component analysis (PCA) is employed. The first principal component (PC), which captures the maximum common variation in the data, is used as a single measure of sentiment. The main reason for combining all the direct and indirect sentiment proxies into a single composite index is that, for the forthcoming econometric analysis, the sample will consist of 131 months, which will not be sufficient to include more than one sentiment measure along with their lags simultaneously in the same model.
All 10 sentiment proxies are standardized before applying PCA. The investor sentiment index, denoted by $SENT_t$, is given by
$$SENT_t = \sum_{j=1}^{10} \lambda_j Z_{jt},$$
where $Z_{jt}$ is the $j$th standardized sentiment proxy and $\lambda_j$ is its loading on the first principal component.
The sentiment proxies used here also contain non-sentiment components, which are removed by orthogonalizing the sentiment index to a set of fundamental macroeconomic variables. Following Brown and Cliff, Baker and Wurgler, and Hudson and Green [10,12,36], this set of variables includes growth rates of industrial production and household consumption expenditure (calculated by taking the first difference of the natural logs of the variables), investment-output ratio, inflation rate (log first difference of CPI), and term spread (log difference of yield on 10-year government bonds from yield on three-month T-bills).
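The standardization and first-principal-component step described above can be sketched in plain Python. The sketch uses power iteration on the correlation matrix, which is a swapped-in numerical technique chosen for self-containment and not necessarily how the study computed its PCA; the function names and the toy data are hypothetical.

```python
import math


def standardize(column):
    """Return the column with zero mean and unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def first_principal_component(columns, iterations=500):
    """Return (loadings, scores) of the first PC of the standardized columns."""
    z = [standardize(col) for col in columns]
    k, n = len(z), len(z[0])
    # Correlation matrix of the proxies.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(k)]
            for i in range(k)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    # The asymmetric start vector avoids starting orthogonal to that eigenvector.
    v = [1.0 / (j + 1) for j in range(k)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Index value in each period: loading-weighted sum of standardized proxies.
    scores = [sum(v[j] * z[j][t] for j in range(k)) for t in range(n)]
    return v, scores


# Two hypothetical proxies observed over five periods (correlation 0.8).
loadings, scores = first_principal_component([[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]])
```

The sign of a principal component is arbitrary, so in practice the loadings are checked against the signs expected from theory before the scores are read as sentiment.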
The sentiment index ($SENT_t$) is regressed on the macroeconomic variables as follows, and the resultant residuals, standardized to the range zero to 100, are used as the orthogonalized measure of investor sentiment:
$$SENT_t = \beta_0 + \beta_1 INF_t + \beta_2 IP_t + \beta_3 HC_t + \beta_4 IO_t + \beta_5 TS_t + \varepsilon_t,$$
where $SENT_t$, $INF_t$, $IP_t$, $HC_t$, $IO_t$, and $TS_t$ denote the sentiment index, inflation rate, growth rate of industrial production, growth rate of household consumption expenditure, investment-to-output ratio, and the term spread, while $\varepsilon_t$ represents a random error term. The above calculations for the sentiment proxies are applied to daily data, but the highest frequency available for the macroeconomic variables is monthly. To match the frequency of the index with that of the macroeconomic variables, averages of the sentiment proxies over whole calendar months are calculated and a monthly sentiment index is obtained.
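A minimal sketch of the orthogonalization step is given below. It assumes that "standardized to the range zero to 100" means min-max rescaling of the OLS residuals, which the text does not spell out; the helper names `ols` and `orthogonalize` and the toy data are hypothetical.

```python
def ols(y, X):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def orthogonalize(series, regressors):
    """Residuals of `series` on a constant and `regressors`, plus a 0-100 rescaling."""
    X = [[1.0] + list(row) for row in regressors]
    beta = ols(series, X)
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for yi, row in zip(series, X)]
    lo, hi = min(resid), max(resid)
    scaled = [100.0 * (e - lo) / (hi - lo) for e in resid]  # assumed min-max scaling
    return resid, scaled


# Hypothetical raw index and a single macroeconomic regressor.
raw_index = [1.0, 2.0, 2.5, 4.5, 5.0]
macro = [[1.0], [2.0], [3.0], [4.0], [5.0]]
resid, scaled = orthogonalize(raw_index, macro)
```

By construction the residuals are uncorrelated with every regressor, so whatever variation remains in the rescaled series cannot be attributed linearly to the included fundamentals.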

Modeling Sentiment-Return Relationship
The theory of traditional finance is not sufficient to explain stock market returns, and behavioral theory affirms that irrational investor sentiment may affect the markets for a considerable period of time and that arbitrage could be delayed and limited. The noise trader model proposes that a class of investors do not base their investment decisions on market fundamentals, yet their sentiment may affect asset prices [9]. According to Black and Kahneman and Riepe [82,83], noise traders are investors who trade on unrelated information or digress from the standard models of decision making. According to Black [82], in a market with so many noise traders, it pays for those who have information to invest, whereas noise traders on average tend to lose money. However, the presence of noise creates confusion for the information traders because they cannot be sure whether the information they have is already reflected in prices. Since trading on information is like trading on noise, the distinction between noise and informed traders gets diluted. Thus, investors seldom have full market information and tend to trade on the basis of intuition or easily available noisy information, especially in weakly efficient markets.
To begin with, this study proposes the following regression of return on sentiment, in which lagged responses and inertia in returns are also allowed:
$$RM_t = \alpha + \sum_{i=1}^{p} \beta_i RM_{t-i} + \sum_{j=0}^{q} \gamma_j SENT_{t-j} + \varepsilon_t,$$
where the lag lengths $p$ and $q$ are to be determined using selection criteria like AIC and SBC, and $RM$ stands for return orthogonalized to the same set of macroeconomic variables that are used in orthogonalizing sentiment. Note that, according to the Frisch-Waugh theorem, the OLS estimate of the simple regression coefficient based on two orthogonalized variables is identical to the partial regression coefficient relating the corresponding unorthogonalized variables in an equation in which the orthogonalizing variables are also present [84].
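The Frisch-Waugh result invoked above can be checked numerically. The sketch below uses made-up data and a hypothetical single macroeconomic regressor; the helper names `ols` and `residuals` are illustrative, and the data are not from the study.

```python
def ols(y, X):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    a = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def residuals(y, X):
    """Residuals from regressing y on the columns of X."""
    beta = ols(y, X)
    return [yi - sum(b * x for b, x in zip(beta, row)) for yi, row in zip(y, X)]


# Made-up monthly observations: returns, sentiment, one macro variable.
ret = [0.8, -0.3, 1.5, 0.2, -1.1, 0.9, 0.4, -0.6]
sent = [0.5, -0.1, 1.0, 0.3, -0.8, 0.6, 0.1, -0.4]
macro = [0.2, 0.1, 0.4, 0.3, -0.2, 0.5, 0.0, -0.1]

# (1) Full regression: returns on a constant, sentiment, and the macro variable.
full = ols(ret, [[1.0, s, m] for s, m in zip(sent, macro)])

# (2) Orthogonalize both series to the macro variable, then regress residual on residual.
Z = [[1.0, m] for m in macro]
ret_orth = residuals(ret, Z)
sent_orth = residuals(sent, Z)
partial = ols(ret_orth, [[s] for s in sent_orth])

# full[1] and partial[0] are the same sentiment coefficient.
```

The two routes give the same point estimate for the sentiment coefficient, up to floating-point error.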
It is not always the case that the causal relationship runs one way from sentiment to returns. A close look at the sentiment indicators used here suggests that most of them are also directly or indirectly triggered by recent events in stock prices as reflected in stock returns. It appears, therefore, that market returns and investor sentiment are most likely to respond to each other instantaneously or with some time lags. This possibility calls for VAR modeling in order to let the data speak for themselves for a better understanding of the relationship between the two variables.
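The general form of the VAR model is given as
$$RM_t = \alpha_1 + \sum_{i=1}^{k} \beta_{1i} RM_{t-i} + \sum_{i=1}^{k} \gamma_{1i} SENT_{t-i} + u_{1t},$$
$$SENT_t = \alpha_2 + \sum_{i=1}^{k} \beta_{2i} RM_{t-i} + \sum_{i=1}^{k} \gamma_{2i} SENT_{t-i} + u_{2t},$$
where $k$ is the lag order and $u_{1t}$ and $u_{2t}$ are error terms.
Although this model could be extended to include other variables, especially macroeconomic indicators like output, interest rate, etc., to generate rich dynamics, this possibility is not considered here because the extended model would require estimation of too many parameters and would cause a substantial loss in degrees of freedom with the given sample of 131 months. Besides, the effect of macroeconomic variables has already been netted out while orthogonalizing the sentiment and returns series.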
As is well known, individual parameter estimates of a VAR model are not very reliable. Granger causality tests provide a simple procedure to determine the presence or absence of a causal relationship between two variables. It should further be noted that Granger causality tests only allow for lagged responses, with no provision for instantaneous relationships. Geweke [69] has proposed a more general procedure of causality testing that also allows for instantaneous responses between variables. Since Granger tests are well known, here we explain Geweke's [69] procedure, according to which linear dependence is the sum of three measures, i.e., a measure of linear feedback from variable X to variable Y, a measure of linear feedback from Y to X, and a measure of instantaneous linear feedback. Mathematically,
$$f_{X,Y} = f_{X \to Y} + f_{Y \to X} + f_{X \cdot Y}.$$
The test is based on the following six equations.
The tests involve the following null hypotheses and the corresponding Chi-square test statistics.
In the last equation, $|\Upsilon|$ is the determinant of the covariance matrix $\begin{pmatrix} S_2^2 & C \\ C & T_2^2 \end{pmatrix}$, where $S_2^2$ and $T_2^2$ are the variances of the two error terms ($\mu_{2t}$ and $\nu_{2t}$), while $C$ is the covariance between them.
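For reference, the six regressions and the associated test statistics in Geweke's procedure are commonly written as follows. This is a sketch in the notation suggested by the surrounding text: the coefficient symbols ($a$, $b$, $c$, $d$), the common lag order $p$, the sample size $n$, and the variance labels other than $S_2^2$ and $T_2^2$ are labels assumed here for illustration.

```latex
% Three regressions for X, with error variances S_1^2, S_2^2, S_3^2
\begin{align}
X_t &= \sum_{i=1}^{p} a_{1i} X_{t-i} + \mu_{1t}, \\
X_t &= \sum_{i=1}^{p} a_{2i} X_{t-i} + \sum_{i=1}^{p} b_{2i} Y_{t-i} + \mu_{2t}, \\
X_t &= \sum_{i=1}^{p} a_{3i} X_{t-i} + \sum_{i=0}^{p} b_{3i} Y_{t-i} + \mu_{3t}, \\
% Three regressions for Y, with error variances T_1^2, T_2^2, T_3^2
Y_t &= \sum_{i=1}^{p} c_{1i} Y_{t-i} + \nu_{1t}, \\
Y_t &= \sum_{i=1}^{p} c_{2i} Y_{t-i} + \sum_{i=1}^{p} d_{2i} X_{t-i} + \nu_{2t}, \\
Y_t &= \sum_{i=1}^{p} c_{3i} Y_{t-i} + \sum_{i=0}^{p} d_{3i} X_{t-i} + \nu_{3t}.
\end{align}

% Null hypotheses and asymptotic chi-square statistics
\begin{align}
H_0: f_{Y \to X} = 0 &: \quad n \ln\!\left(S_1^2 / S_2^2\right) \sim \chi^2_{p}, \\
H_0: f_{X \to Y} = 0 &: \quad n \ln\!\left(T_1^2 / T_2^2\right) \sim \chi^2_{p}, \\
H_0: f_{X \cdot Y} = 0 &: \quad n \ln\!\left(S_2^2 / S_3^2\right) = n \ln\!\left(T_2^2 / T_3^2\right) \sim \chi^2_{1}, \\
H_0: f_{X,Y} = 0 &: \quad n \ln\!\left(S_1^2 T_1^2 / |\Upsilon|\right) \sim \chi^2_{2p+1}.
\end{align}
```

Adding the first three statistics gives the fourth, which mirrors the decomposition of total linear dependence into its two directional components and the instantaneous component.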
For impulse response and variance decomposition analyses, we need to identify the underlying structural VAR model from the estimated (reduced-form) VAR model. We follow the standard practice of Cholesky ordering. Since the literature mostly focuses on the role of sentiment in driving stock market returns, we set the instantaneous effect of returns on sentiment equal to zero [10]. Needless to say, this restriction does not deny the possible effects of market returns on sentiment with some time lag.
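The identifying restriction can be illustrated for the two-variable case. The sketch below assumes sentiment is ordered first and returns second, which is what a zero instantaneous effect of returns on sentiment implies under a Cholesky scheme; the function name `cholesky_2x2` and the covariance values are hypothetical.

```python
import math


def cholesky_2x2(var_sent: float, cov: float, var_ret: float):
    """Lower-triangular factor L of [[var_sent, cov], [cov, var_ret]] with L L' = Sigma.

    Row/column 0 is sentiment and row/column 1 is returns. The zero in the
    upper-right corner means a structural return shock has no same-period
    effect on sentiment.
    """
    l11 = math.sqrt(var_sent)
    l21 = cov / l11
    l22 = math.sqrt(var_ret - l21 ** 2)
    return [[l11, 0.0], [l21, l22]]


# Hypothetical reduced-form residual covariance matrix.
impact = cholesky_2x2(var_sent=4.0, cov=1.0, var_ret=2.5)
```

In this example a one-unit structural sentiment shock moves sentiment by 2.0 and returns by 0.5 on impact, while a return shock moves only returns (by 1.5) in the same period.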

Data
The sample consists of 102 companies including all those companies (90) [85,86] and the data for the years up to 2016 are extrapolated by using the past three-year averages of the quarterly proportions in annual GDP.Finally, quarterly data are temporally split into monthly values by imposing continuity and exponential growth between months.

Sentiment Index
Descriptive statistics of the ten standardized proxies of the sentiment index are presented in Table 1. Of these, four indicators, namely dividend premium (DP), interest rate (IR), money flow index (MFI), and relative strength index (RSI), have mean values greater than 0.5, while the remaining six indicators, namely number of initial public offerings (NIPO), closed-end fund discount (CEFD), advance-decline ratio (AVDC), price-earnings ratio (PE), turnover (TO), and Google search volume index (GSVI), have mean values less than 0.5. This shows that 60 percent of the indicators are concentrated towards the lower side, showing low sentiment. The table also shows that AVDC has the lowest standard deviation, indicating the lowest variation in the daily series, whereas PE shows the highest variation over the sample period. Table 2 shows that the sentiment indicators are highly correlated with one another, with 22 of the 36 pairs forming statistically significant relationships, which justifies the construction of a single index to represent sentiment. The results further reveal that the highest correlation is present between MFI and RSI, which means that they are measuring the same phenomenon, i.e., whether the market is overbought or oversold. The second highest correlation is found between CEFD and IR, both of which indicate the cost of future earnings (bearish signal). The significance level of Bartlett's test of sphericity is less than 0.05, which shows that the PCA has an acceptable degree of common variance. The first principal component is selected as it has the highest eigenvalue, and its component loadings are presented in the following equation, which shows that all indicators have the signs expected according to theory. The second component has an eigenvalue greater than 2 but has opposite signs for some indicators. Therefore, it is improbable that this component captures sentiment, and it is not used in the analysis.
It is shown that four indicators (NIPO, AVDC, DP, and GSVI) have component loadings less than 0.3.
In the next step, another sentiment index (PCA_6) is constructed by eliminating the above-mentioned four variables, which have component loadings less than 0.3. The equation of component loadings for this index shows that all indicators have the same signs and almost the same coefficients as in the initially calculated index (PCA_10). The correlation between PCA_10 and PCA_6 is also extremely high, i.e., 0.963, showing that eliminating four sentiment indicators did not cause any significant change, which can also be seen from Figure 1. Therefore, the present study uses the sentiment index based on the initially selected 10 indicators in order to retain the maximum number of sentiment proxies for the final analysis.
The correlation vector of the 10 individual sentiment indicators and the sentiment index (based on Principal Component 1, PCA_10) is presented in Table 3. The results show that only one indicator has a low and insignificant correlation with the sentiment index (p-value = 0.50). Out of these 10 indicators, seven proxies, namely NIPO, AVDC, PE, TO, MFI, RSI, and GSVI, are found to be directly related to sentiment. When companies perform well, they raise investment capital by issuing stock to the public through IPOs; the market has more advancing issues, a high PE ratio and turnover, and more buying pressure; and investors search for information about the market more frequently. All these circumstances indicate sound health of the market and optimistic behavior of investors. The remaining three proxies, found to be inversely related to sentiment, are CEFD, DP, and IR. Closed-end funds trade at a discount when the market price of a fund drops below its NAV, which may indicate that the fund's future earnings are risky and its assets may not be in favorable condition, so the closed-end fund discount is a bearish signal. Similarly, the dividend premium has an inverse relationship with sentiment because dividend-paying firms retain less for investment opportunities and their stocks behave more like bonds, offering a stable income as a sign of safety (bearish signal). The interest rate also has a negative relationship with sentiment because when the cost of investment is high, profit margins are reduced and, as a result, investors become pessimistic.
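As a rough illustration of how such an index can be built, the sketch below extracts first-principal-component loadings from standardized proxies, drops proxies whose absolute loading is below 0.3, and correlates the full and reduced indices. The three toy series, the function names, and the use of power iteration to find the first component are assumptions made for illustration; they are not the study's data or software.

```python
import math

def standardize(column):
    """Return z-scores (mean 0, sample standard deviation 1)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]

def correlation(x, y):
    """Pearson correlation of two equally long series."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def first_component_loadings(columns, iterations=500):
    """Dominant eigenvector of the correlation matrix, found by power iteration."""
    k = len(columns)
    corr = [[correlation(ci, cj) for cj in columns] for ci in columns]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def sentiment_index(columns):
    """Loadings and index values (loading-weighted sum of standardized proxies)."""
    loadings = first_component_loadings(columns)
    z = [standardize(c) for c in columns]
    index = [sum(l * col[t] for l, col in zip(loadings, z))
             for t in range(len(z[0]))]
    return loadings, index

# Toy proxies (made-up numbers, not the study's data).
proxies = {
    "proxy_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "proxy_b": [1.2, 1.9, 3.2, 3.8, 5.1, 6.3],
    "proxy_c": [3.0, 1.0, 2.5, 1.5, 1.0, 3.0],
}
names = list(proxies)
loadings_full, index_full = sentiment_index([proxies[n] for n in names])

# Keep only proxies whose absolute loading reaches the 0.3 threshold.
kept = [n for n, l in zip(names, loadings_full) if abs(l) >= 0.3]
loadings_reduced, index_reduced = sentiment_index([proxies[n] for n in kept])

# A high correlation means that dropping the weak proxies changes little.
similarity = correlation(index_full, index_reduced)
```

A correlation close to one between the two indices, as with the 0.963 reported for PCA_10 and PCA_6, suggests that the low-loading proxies contribute little to the first component.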
Figure 1 shows that the sentiment index followed a downward trend during the initial years until 2008, when it nose-dived to its lowest level during the global financial crisis of 2008 and 2009. Since the global financial crisis could not have been caused by sentiment in Pakistan, it must be the case that the pessimistic sentiment was triggered by the financial crisis. After a quick recovery, the index kept fluctuating at a low level until 2011. Thereafter, the index followed an upward secular trend with fluctuations along the way. By the last year of analysis, 2016, the sentiment index had reverted almost completely to its initial level of 2006.

Regression Analysis
As mentioned in the methodology, to remove the effect of macroeconomic fundamentals, returns and the sentiment index are each regressed on a set of macroeconomic variables. Before this, the stationarity of all the series involved is tested. Table 4 shows that the null hypothesis of a unit root is rejected for all the variables. Therefore, all the variables are stationary and ready for further analysis.
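The orthogonalization step can be sketched as an ordinary least squares regression whose residuals are kept as the "purged" series. The code below is a minimal illustration that assumes a constant term in the regression and uses two made-up macroeconomic series; the function names and all numbers are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols_residuals(y, regressors):
    """Residuals from regressing y on a constant and the given regressors."""
    rows = [[1.0] + [reg[t] for reg in regressors] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yt - sum(b * x for b, x in zip(beta, r)) for r, yt in zip(rows, y)]

# Made-up monthly series (illustrative only).
macro_1 = [0.5, 1.2, -0.3, 0.8, 1.5, -0.7, 0.2, 1.0]
macro_2 = [0.9, 0.4, 1.1, 0.6, 0.3, 1.3, 0.8, 0.5]
raw_sentiment = [0.1, 0.9, -0.6, 0.4, 1.2, -1.0, -0.1, 0.7]
raw_returns = [1.0, 2.1, -1.5, 0.3, 2.4, -2.0, 0.5, 1.1]

# The residuals are, by construction, uncorrelated with the macro regressors.
orth_sentiment = ols_residuals(raw_sentiment, [macro_1, macro_2])
orth_returns = ols_residuals(raw_returns, [macro_1, macro_2])
```

Because the residuals are orthogonal to every regressor, any remaining co-movement between the two purged series cannot be attributed to the included macroeconomic variables.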
Table 5 reports parameter estimates of the regression of returns on sentiment along with lagged variables. The F-statistic is highly significant, which indicates a significant contribution of sentiment, along with the lagged variables, in explaining the variance in returns. Furthermore, the adjusted R² is 31 percent, which supports the application of this model. The Durbin-Watson statistic indicates that no significant autocorrelation is left in the errors, which justifies the lag length selected on the basis of the AIC and SBC criteria. The market rate of return has a significant positive relationship with current investor sentiment and a significant negative relationship with lagged sentiment. The estimated coefficients show that a one percentage point increase in sentiment results in a 0.4 percentage point increase in return in the current month and a 0.3 percentage point decrease in return after one month, indicating that most of the effect of sentiment on return is wiped out after one month. This result is consistent with Anusakumar et al., Kaplanski et al., and Jansen and Nahuis [54][55][56]. The results also show that returns have a positive relationship with the one-period lagged return, that is, a one percentage point increase in the previous return increases the current return by 0.22 percentage points. This result indicates a rally effect and inefficiency of the market. This compounding effect shows that an increase in return is related to its previous increasing trend.
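For reference, the Durbin-Watson statistic mentioned above is the ratio of the sum of squared first differences of the residuals to their sum of squares. Values near 2 point to no first-order autocorrelation, values towards 0 to positive autocorrelation, and values towards 4 to negative autocorrelation. The sketch below uses made-up residuals, not those of the study.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences over sum of squares."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Made-up regression residuals (illustrative only).
toy_residuals = [0.5, -0.2, 0.3, -0.4, 0.1, -0.3]
dw = durbin_watson(toy_residuals)
```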
Figure 2 shows the time paths of returns and sentiment, which support the above-mentioned results. It is clearly seen that the returns of the Pakistani market are affected by the sentiment of investors. When Pakistani investors consider previous returns to be high (low), they overvalue (undervalue) the market and develop positive (negative) sentiment; consequently, current returns tend to be high (low), showing the irrationality of the market. The results show that in 2008 the sentiment of Pakistani investors dropped below 10 percent, and returns followed the same pattern by reaching their minimum level.

VAR Analysis
To investigate the possibility of a bidirectional relationship between orthogonalized returns and the orthogonalized investor sentiment index, we now turn to VAR analysis. The parameter estimates of the VAR model are not the focus here because of the expected high degree of multicollinearity. Thus, following standard practice, conclusions regarding the bidirectional relationship between returns and sentiment are drawn on the basis of causality tests, impulse response analysis, and variance decomposition analysis. Lag length selection is a prerequisite for the estimation of a VAR model. The lag selection criteria (AIC, SBC, HQ, etc.) all favor lag order one, except LR, which selects lag order six. In addition, lag exclusion Wald tests are applied, starting with 12 lags and following a stepwise backward elimination technique. The final model retains lags 1, 2 and 6. The lag exclusion tests thus produce different results from the lag selection criteria, except for the LR criterion. However, to benefit from the rich dynamics in the presence of higher lags, we prefer to rely on the results based on the lag exclusion tests and the LR criterion, given that the sample size is reasonably large. The finally estimated model is therefore a VAR in returns (RM) and sentiment (SENT) with lags 1, 2 and 6.
As mentioned earlier, the parameter estimates of the above VAR model are not interpreted directly. Rather, we rely on other analytical tools, to which we now turn our attention.
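As a sketch of the structure implied by the retained lags (not the study's estimated coefficients), and assuming an intercept in each equation, a VAR in RM and SENT with lags 1, 2 and 6 can be written as follows. The coefficient symbols and the error terms u are notation introduced here for illustration only.

```latex
\begin{aligned}
RM_t   &= \alpha_0 + \sum_{k \in \{1,2,6\}} \alpha_k \, RM_{t-k}
          + \sum_{k \in \{1,2,6\}} \beta_k \, SENT_{t-k} + u_{1t}, \\
SENT_t &= \gamma_0 + \sum_{k \in \{1,2,6\}} \gamma_k \, RM_{t-k}
          + \sum_{k \in \{1,2,6\}} \delta_k \, SENT_{t-k} + u_{2t}.
\end{aligned}
```

In this notation, causality from sentiment to returns corresponds to the joint hypothesis that the β coefficients are zero, and causality from returns to sentiment to the joint hypothesis that the γ coefficients (other than the intercept) are zero.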

Causality Analysis
The results of the Granger causality test presented in Table 6 indicate the presence of significant two-way causality between returns and sentiment, even though the causality from sentiment to returns is more pronounced. The existing literature provides support for this bidirectional relationship in the studies of Waleed and Alrabadi for Jordan and Thanou and Tserkezos for Greece [87,88]. Since the critical χ² value at the 5% level of significance is 3.84, all the calculated values of the χ² statistic for the Geweke [69] feedback measures fall in the rejection region. This is a strong result indicating that sentiment and returns not only affect each other with lags, as found in the Granger causality results, but that the contemporaneous relationship between them is also statistically significant. The results further show that the total feedback between returns and sentiment is also statistically significant. Thus, our results point towards a strong two-way relationship between returns and sentiment, which is established instantaneously and persists with the passage of time.
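To make the role of the 3.84 critical value concrete, the sketch below computes a Geweke-type measure of instantaneous feedback from two residual series and compares the sample size times that measure with a χ² distribution with one degree of freedom. In the bivariate case the measure reduces to -ln(1 - r²), where r is the correlation between the two VAR residual series. The residuals are made up and the function names are hypothetical; this is not the study's computation.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def instantaneous_feedback(resid_x, resid_y):
    """Geweke-type instantaneous feedback for two scalar residual series.

    Equals ln(var_x * var_y / det(joint covariance)) = -ln(1 - r^2).
    """
    n = len(resid_x)
    mx, my = sum(resid_x) / n, sum(resid_y) / n
    sxx = sum((a - mx) ** 2 for a in resid_x)
    syy = sum((b - my) ** 2 for b in resid_y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(resid_x, resid_y))
    r_squared = sxy * sxy / (sxx * syy)
    return -math.log(1.0 - r_squared)

# Made-up VAR residuals for returns and sentiment (illustrative only).
u_ret = [0.5, -0.3, 0.8, -0.6, 0.1, -0.4, 0.7, -0.8]
u_sent = [0.4, -0.1, 0.5, -0.7, 0.2, -0.2, 0.3, -0.4]

feedback = instantaneous_feedback(u_ret, u_sent)
statistic = len(u_ret) * feedback      # asymptotically chi-square with 1 df
p_value = chi2_sf_df1(statistic)
reject_at_5_percent = statistic > 3.84
```

The lagged (directional) feedback measures are tested in the same way, except that their degrees of freedom, and hence their critical values, depend on the number of lag coefficients being restricted.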

Impulse Response Analysis
Impulse response functions trace the time paths of the effects of structural shocks on the variables in a VAR model. Structural shocks are extracted from the SVAR model retrieved from the estimated (reduced-form) VAR model. To identify the SVAR model, we use a Cholesky ordering that sets the instantaneous effect of returns on sentiment equal to zero.
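The mechanics of Cholesky identification can be sketched for a simplified two-variable VAR with a single lag (the study's model has lags 1, 2 and 6). Ordering sentiment first makes its impact response to a return shock exactly zero, which is the restriction stated above. The coefficient matrix and residual covariance below are made-up numbers, and the function names are hypothetical.

```python
import math

def cholesky_2x2(sigma):
    """Lower-triangular P with P P' = sigma for a 2x2 covariance matrix."""
    p11 = math.sqrt(sigma[0][0])
    p21 = sigma[1][0] / p11
    p22 = math.sqrt(sigma[1][1] - p21 ** 2)
    return [[p11, 0.0], [p21, p22]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def impulse_responses(coef, sigma, horizon):
    """Orthogonalized responses of a VAR(1).

    Element [h][i][j] is the response of variable i, h periods after a
    one-standard-deviation structural shock to variable j.
    """
    responses = [cholesky_2x2(sigma)]
    for _ in range(horizon):
        responses.append(matmul(coef, responses[-1]))
    return responses

# Variable order: index 0 = SENT, index 1 = RM (made-up parameters).
coef = [[0.6, 0.1],
        [0.2, 0.2]]
sigma = [[1.0, 0.3],
         [0.3, 0.5]]

irf = impulse_responses(coef, sigma, horizon=12)
impact_of_return_shock_on_sentiment = irf[0][0][1]   # zero by construction
impact_of_sentiment_shock_on_returns = irf[0][1][0]  # unrestricted
```

Reversing the order of the two variables would instead restrict the impact response of returns to a sentiment shock, so the ordering is an identifying assumption rather than a result.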
The four impulse response functions over a horizon of 12 months are shown in Figure 3. These graphs show how returns and sentiment react to each other over this horizon. The first graph shows the response of market return to a shock originating from the stock market itself, measured as one standard deviation of return. The resulting impact is positive for the first two lags, negative for the third lag, and positive again from lag 4 until lag 8. It is significantly positive at the first lag, but the response cools down and turns insignificant in the following months.
The second graph shows that when a positive shock of one standard deviation is given to sentiment, market returns respond insignificantly in the positive direction for the first two lags, but the response becomes significant for the third and fourth lags. Later on, the response starts moving downwards but remains positive and almost significant for the fifth and sixth lags. The response then becomes significantly negative in the seventh month and remains so in the eighth month as well. This negative impact of sentiment shocks on returns then moves towards zero and tends to die out. This impulse response profile points toward a substantive role of sentiment in driving stock market outcomes. In particular, the impact of sentiment on returns cannot be regarded as an instantaneous reaction of a temporary nature. Rather, this impact is realized with a lag of three months and then persists for up to six months, after which the market seems to adjust back following a short period of technical correction. This finding is consistent with the results of many previous studies. For example, Berger and Turtle, Fisher and Statman, Schmeling, Baker et al., and Zheng [14,89–92] find that investor sentiment has a negative relationship with subsequent market returns, that is, when investors are optimistic (pessimistic), returns subsequently trend downward (upward) and ultimately move towards equilibrium.
The third graph indicates that, following a positive shock of one standard deviation in market returns, sentiment is positively and significantly affected with a lag of one month. Afterward, the sentiment response turns insignificant. This means that a positive return shock makes sentiment positive, indicating optimism. This result is consistent with the studies of Verma et al., Väljamets and Sekkat, and Anusakumar et al. [54,93,94]. The last graph shows the response of sentiment to a sentiment shock, which is significantly positive during the first seven months. Later on, the impact of the sentiment shock remains positive but turns insignificant.

Variance Decomposition Analysis
We now present the results of forecast error variance decomposition in order to determine what proportions of the variation in returns and sentiment are accounted for by return and sentiment shocks. Table 7 shows the results of the variance decomposition for a horizon of twelve months. The first part of the table reveals that in the short run (i.e., month 3), return shocks account for 97.67 percent of the variation in returns, while sentiment shocks account for only 2.33 percent. Although the contribution of sentiment shocks to the variation in returns increases over time, it remains quite low, a little more than 7 percent even after 10 months. In contrast, return shocks contribute a substantial portion of the variation in sentiment, starting at about 33 percent in period 1 and decreasing to 18 percent after 10 months. At first, this result seems somewhat surprising because, based on the existing literature, one would expect that it is sentiment that mainly influences market returns (see, for example, [51]). In any case, the variance decomposition results are consistent with the impulse response results. The second and third graphs in Figure 3 show that the impulse response of sentiment to returns starts with a large magnitude and then dies down to small values. On the other hand, the response of returns to sentiment starts with a small magnitude and remains small at higher lags. This explains the magnitudes and trends in the proportions of cross effects in the variance decomposition analysis.
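The link between impulse responses and variance shares can be sketched for a simplified two-variable VAR with one lag: the share of shock j in the h-step forecast error variance of variable i is the cumulated squared response of i to j divided by the cumulated squared responses of i to both shocks. The parameters below are made up, the function names are hypothetical, and the order of the two variables is an identifying assumption of the sketch.

```python
import math

def cholesky_2x2(sigma):
    """Lower-triangular P with P P' = sigma for a 2x2 covariance matrix."""
    p11 = math.sqrt(sigma[0][0])
    p21 = sigma[1][0] / p11
    p22 = math.sqrt(sigma[1][1] - p21 ** 2)
    return [[p11, 0.0], [p21, p22]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def variance_decomposition(coef, sigma, horizon):
    """Forecast error variance shares of a VAR(1).

    Element [h - 1][i][j] is the share of shock j in the h-step forecast
    error variance of variable i.
    """
    theta = cholesky_2x2(sigma)          # impact responses
    cumulative = [[0.0, 0.0], [0.0, 0.0]]
    shares = []
    for _ in range(horizon):
        for i in range(2):
            for j in range(2):
                cumulative[i][j] += theta[i][j] ** 2
        shares.append([[cumulative[i][j] / sum(cumulative[i]) for j in range(2)]
                       for i in range(2)])
        theta = matmul(coef, theta)      # responses one period further out
    return shares

# Made-up parameters; the first variable is ordered first in the Cholesky scheme.
coef = [[0.6, 0.1],
        [0.2, 0.2]]
sigma = [[1.0, 0.3],
         [0.3, 0.5]]

fevd = variance_decomposition(coef, sigma, horizon=12)
share_of_first_shock_in_second_variable = [step[1][0] for step in fevd]
```

Small responses at every lag therefore translate into a small variance share, while a response that is large on impact but fades quickly produces a share that starts high and declines with the horizon.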

Conclusions
The objective of the present study has been to develop a composite sentiment index for Pakistan and to investigate its relationship with returns over the period 2006 to 2016. Various direct and indirect indicators related to sentiment are used to construct this monthly index through principal component analysis (PCA). These indicators include the number of IPOs, closed-end fund discount, advance-decline ratio, dividend premium, interest rate, price-earnings ratio, turnover, money flow index, relative strength index, and Google search volume index. This composite index is a broad representation of sentiment measures from different categories and perspectives, for instance, stock prices, turnover, mutual funds, initial public offerings, interest rate, and especially Google search volume. The last indicator, i.e., the Google search volume index, which is a unique direct measure of sentiment on which time series data can be generated, is used for the first time for Pakistan.
The results of PCA show that, out of ten proxies, seven are directly related to sentiment, while the remaining three are inversely related to it. To eliminate the non-sentiment component from this index and to obtain returns generated purely by sentiment, the study orthogonalizes these two series with respect to key macroeconomic variables, namely industrial production, inflation, investment-output ratio, term spread, and household expenditure.
The results show a positive relationship of returns with current sentiment and lagged returns but a negative relationship with lagged sentiment. Therefore, the presence of a rally effect and irrationality in the Pakistani market is corroborated. The causality tests of Granger and Geweke [69] show a strong bidirectional causal relationship between market returns and investor sentiment. These results support the proposition that, in the absence of full information, investors become irrational; they use gut feeling, experience, intuition, emotion or mood, and thereby influence the process of stock price formation. As a result, stock prices deviate from their fundamental values. The results also provide evidence that sentiment is caused by previous returns. Thus, market outcomes influence investor sentiment, which in turn affects investors' decision-making and, hence, market outcomes. The study also reports a contemporaneous association between investor sentiment and market returns using the Geweke measure. The results show that current sentiment immediately affects current returns and vice versa.
The impulse responses between sentiment and returns point towards a substantive role of sentiment in driving stock market outcomes. In particular, the impact of sentiment on returns cannot be regarded as an instantaneous reaction of a temporary nature. Rather, this impact is realized with a lag of three months and then persists for up to six months, after which the market seems to adjust back following a short period of technical correction. The impulse response of sentiment to returns is also substantial, but it is short-lived. These results indicate the presence of noise traders who drive stock market prices based on sentiment in the thin stock market of Pakistan. Since the Pakistani market is found to be inefficient and the adjustment process following any shock is delayed, overreaction (high returns) caused by optimism is followed by a decrease in returns in future time periods.
from the list of 2016 KSE-100 for which data are available since 2006. The remaining 12 companies are those which made initial public offerings during the sample period 2006-2016 but are not included in the KSE-100. Data sources for the sentiment indicators include DataStream, Business Recorder, balance sheets of the companies, and websites of the Pakistan Stock Exchange (PSX), Mutual Fund Association of Pakistan (MUFAP), and State Bank of Pakistan (SBP). The monthly data on CPI, yields on 3-month T-bills and 10-year government bonds are taken from the website of State Bank of Pakistan and the monthly data of industrial production are taken from the Bulletin of Statistics. The quarterly data on GDP, investment, and household expenditure from 2006 to 2013 are taken from the Kemal and Arby and Hanif et al.

Figure 1. Sentiment Index (PCA_10 and PCA_6).

Figure 2. Market Returns and Investor Sentiment.

Table 2. Correlation Matrix for Individual Sentiment Indicators.

Table 3. Correlation Vector of Individual Sentiment Indicators with the First Principal Component Index (PCA_10).
*Values in parentheses are probabilities of t-statistics of the corresponding correlation coefficient.

Table 5. Regression Analysis for Market Returns.
Note: The statistics significant at the 1% level are indicated by *.