1. Introduction
Previous empirical studies provide evidence on the association between consumer sentiment and economic and financial variables (
Gupta et al. 2014;
Fisher and Huh 2016;
Baghestani and Palmer 2017;
Shahzad et al. 2019). For example, the academic literature is rich on how sentiment can explain returns on stocks (
Schmeling 2009;
Akhtar et al. 2011;
Chung et al. 2012;
Balcilar et al. 2018;
Zhou 2018). Given the importance of the consumer sentiment to business-cycle analysis (
Lahiri and Zhao 2016), it is very informative to extend the related literature and understand the determinants of the consumer sentiment, especially in the largest economy, the U.S. This is important as the empirical evidence indicates that economic and financial crises are reflected in a decrease in economic activities, personal incomes, spending, and a depressed labor market. Such economic implications on consumers will ultimately affect consumer sentiment and thus consumers’ perceptions of the overall economy and of their personal financial conditions (
van Giesen and Pieters 2019). In addition to its association with income, wealth, and stock market performance, consumer sentiment can be affected by the establishment of an environmental governance system that is found to be beneficial to economic conditions. Furthermore, there are many advantages of being more energy-efficient, and more environmentally friendly, which leads to more sustainability and satisfaction (
Issa et al. 2011). However, previous studies tend to focus on consumer sentiment in a relatively generic setting and often ignore the determinants of consumer sentiment in linear and nonlinear models. In particular, there is a lack of studies on the roles of the perceptions of general consumers and citizens, as well as of stock market indices and energy-efficiency measures, in determining the U.S. consumer sentiment.
In this paper, we examine the determinants of the U.S. consumer sentiment based on a large set of financial and nonfinancial variables that involve the stock market, personal income, confidence, education, environment, sustainability, and innovation freedom. Methodologically, we borrow some methods from econophysics (e.g., Random Matrix Theory (RMT)) and apply various linear and nonlinear models. Our analyses can also be considered an analysis of the behavioral attitudes and perceptions of U.S. consumers regarding the financial indices that affect their sentiment. After reviewing the related academic literature (see
Section 2), we could not find any direct research that has examined the association between consumer sentiment and variables related to education, environment, sustainability, and innovation freedom.
According to the Oxford English Dictionary, sentiment is defined as “a feeling or an opinion, especially one based on emotions.” Accordingly, we considered several variables that could explain the consumer sentiment, such as confidence, education, environment, sustainability, and innovation freedom. We chose the University of Michigan Consumer Sentiment Index as a measure of the U.S. consumer sentiment. This index is designed to largely reflect fundamentals (
Stivers 2015). It is published monthly by the University of Michigan based on at least 500 telephone interviews conducted with U.S. households.
In our analyses, we applied the RMT and show that more than 8.3% of the eigenvalues deviate from the RMT predictions and thus might contain pertinent information. Then, linear regression analysis was applied, showing that the stock market, confidence, personal income, and unemployment significantly explain the U.S. consumer sentiment. However, to capture nonlinearity, we applied the regime-switching model, and the results show evidence of a switch towards more confidence and more positive sentiment regarding energy efficiency, unemployment rate, student loan, sustainability, and business confidence. We additionally applied the Gradient Descent Algorithm to compare the errors obtained in both linear and nonlinear models, and the results suggest that it yields smaller errors and thus a higher predictive power.
The remainder of this paper is organized as follows. We present a brief literature review in
Section 2. We describe the data in
Section 3. We present our empirical results in
Section 4. We conclude in
Section 5.
2. Literature Review
Previous studies have examined investor sentiment from various perspectives, but not from a consumer sentiment perspective.
Fisher and Statman (
2000) examined the relationship between Wall Street strategists and the sentiment of individual investors and found evidence of a negative relationship.
Baker and Wurgler (
2006) studied how investor sentiment can affect the cross-section of stock returns. They defined investor sentiment as the degree to which market participants are overly optimistic or pessimistic about financial markets. They showed that investor sentiment, broadly defined, has significant cross-sectional effects, which undermines classical finance theory in which investor sentiment does not play any role in the cross-section of stock prices, realized returns, or expected returns.
Baker and Wurgler (
2007) indicated that it is quite possible to measure investor sentiment, and that waves of sentiment have clearly discernible, important, and regular effects on individual firms and on the overall stock market returns.
Kurov (
2008) analyzed the sentiment of traders through feedback trading and found that positive feedback trading appears to be more active in periods of high investor sentiment, which is consistent with the notion that feedback trading is driven by expectations of noise traders.
Cofnas (
2015) indicated that business and consumer confidence data are a powerful source of information that can move financial markets. They showed that, when related survey results are released, they provide important information on expectations regarding the local economy. Focusing on the drivers of consumer sentiment over business cycles,
Lahiri and Zhao (
2016) showed that macroeconomic variables can explain consumer sentiment and highlight the role of household perceptions on their own financial and employment prospects.
Paraboni et al. (
2018) showed the existence of a significant relationship between measures of market sentiment and risk. The developed U.S. and German markets demonstrate a stronger relationship between optimism and risk, while the emerging Chinese market demonstrates a stronger relationship between pessimism and risk.
Wadud et al. (
2019) showed the impact of consumer confidence on U.S. household credit delinquency rates.
Zhang and Pei (
2019) explored the impact of investor sentiment on stock returns of petroleum companies by using a binomial probability distribution model to build a daily investor sentiment endurance index. According to their results, the index can effectively predict the stock returns of petroleum companies, and the sentiment effect becomes stronger in the period of economic expansion.
Bouteska (
2019) examined whether the investor sentiment has a moderating effect on the impact of earnings restatements on security prices by studying the cumulative abnormal returns and investor sentiment. The results show that investor conservatism represents a dominant factor to explain the positive relationship between cumulative abnormal return and investor sentiment.
In light of the above studies, we contribute to the academic literature by studying the consumer sentiment from a different perspective, focusing on the linear and nonlinear relationship between the U.S. consumer sentiment and several financial and nonfinancial explanatory variables related to the stock market, confidence, education, environment, sustainability, and innovation freedom. Notably, the Gradient Descent Algorithm is new to the above academic literature, and its application refines the prediction models involving the determinants of the U.S. consumer sentiment. In fact, we show that the sum of errors computed by using the Gradient Descent Algorithm is smaller than that found in the ordinary linear regressions and the switching regime model, which indicates that the algorithm yields less error and could deliver better predictions than the other models.
3. Data
Our data are at the monthly frequency. Given that data on several variables under study are not all at the daily frequency, we opted for the monthly frequency for all the variables under study. Accordingly, where needed, we computed the monthly growth/returns for daily series, by calculating the average growth/return observed during each month. Our data cover the following series that are often used in previous studies:
University of Michigan Consumer Sentiment Index is a monthly survey of U.S. consumer confidence levels conducted by the University of Michigan. It is based on telephone surveys that gather information on consumer expectations regarding the overall economy.
Bloomberg Barometer Startups Global Index measures both the occurrence and level of historical and recent venture activity for U.S.-based startups excluding biotechnology. The index is a gauge of startup activity that equally considers capital raised, deal count, first financings, and exit count.
Business Confidence Index provides information on future developments, based upon opinion surveys on developments in production, orders, and stocks of finished goods in the industry sector.
Dow Jones Sustainability United States 40 Index is composed of U.S. sustainability leaders as identified by Sustainable Asset Management (SAM) through a corporate sustainability assessment. The index represents the top 20% of the largest 600 U.S. companies in the Dow Jones Sustainability U.S. Index based on long-term economic, environmental, and social criteria.
Morgan Stanley Capital International (MSCI) Global Energy Efficiency Index includes developed and emerging market large-, mid-, and smallcap companies that derive 50% or more of their revenues from products and services in energy efficiency.
MSCI USA ESG leaders index is a capitalization weighted index that provides exposure to companies with high Environmental, Social, and Governance (ESG) performance relative to their sector peers.
Personal Income in Billions is the income that persons receive in return for their provision of labor, land, and capital used in current production and the net current transfer payments that they receive from business and from government.
S&P Carbon Efficiency Index is designed to measure the performance of companies in the S&P 500, while overweighting or underweighting those companies that have lower or higher levels of carbon emissions per unit of revenue.
S&P Consumer Finance Index provides liquid exposure to mortgage real estate investment trusts (REITs), thrifts and mortgage finance companies, diversified and regional banks, consumer finance or data processing services companies trading on U.S. stock exchanges.
S&P Municipal Bond Education Index consists of bonds in the S&P Municipal Bond Index from the Higher Education and Student Loan Sectors.
U.S. unemployment rate is defined as the percentage of unemployed people who are currently in the labor force. In order to be in the labor force, a person either must have a job or have looked for work in the last four weeks.
Data are taken from Bloomberg and S&P databases and cover the period October 2009–July 2019. We denote by $P_t$ the monthly average level of a series in month $t$. We compute the natural logarithmic growth/returns of each series as $r_t = \ln(P_t) - \ln(P_{t-1})$, which yields 118 monthly observations.
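As an illustration, the log-growth transformation described above can be sketched in a few lines of Python. The function name `log_returns` and the index levels are hypothetical and are not taken from the data set used in this paper.

```python
import math


def log_returns(levels):
    """Return ln(P_t) - ln(P_{t-1}) for each pair of consecutive levels."""
    return [math.log(curr) - math.log(prev)
            for prev, curr in zip(levels, levels[1:])]


# Hypothetical monthly average index levels (illustration only).
monthly_levels = [100.0, 102.0, 101.0, 105.0]
returns = log_returns(monthly_levels)
```

A convenient property of log growth rates is that they add up over time: the sum of the three monthly values equals the log growth over the whole window, $\ln(105/100)$.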
4. Empirical Models and Results
In this section, we examine multicollinearity, analyze the correlation matrix based on the Random Matrix Theory (RMT), and conduct regression analyses. First, we analyze the logarithmic growth/returns of each variable to understand some of their features as well as the structure of cross-correlation, which helps in refining our models. Then, we run multiple regressions to uncover how each variable contributes to the explanation of the U.S. consumer sentiment.
4.1. Multicollinearity Analysis
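The presence of multicollinearity among the independent variables is assessed via the variance inflation factor (VIF):
$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2},$$
where $R_j^2$ is the R-squared obtained by regressing the $j$-th independent variable on the remaining independent variables.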
The variables are said to be not correlated if the VIF is close to one, moderately correlated if the VIF is between one and five, and highly correlated if the VIF exceeds five.
Table 1 shows that only three variables have a VIF above five (DOW JONES SUSTAINABILITY U.S. INDEX; MSCI USA LEADERS INDEX; SP500 CARBON EFFICIENT).
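To make the VIF rule concrete, the following minimal sketch treats the special case of two regressors, where the R-squared of regressing one on the other equals their squared Pearson correlation. The helper names and the data are hypothetical; this is not the computation behind Table 1, which involves more regressors.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2); with two regressors, R^2 is the squared correlation."""
    r_squared = pearson(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)


# Hypothetical regressors (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_regressors(x1, x2)
highly_correlated = vif > 5.0  # threshold used in the text
```

Here the correlation is 0.8, so the VIF is $1/(1 - 0.64) \approx 2.78$, which falls in the "moderately correlated" band between one and five.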
4.2. Random Matrix Theory Analysis
Using RMT,
Pafka and Kondor (
2004) found that the effect of noise in the correlation matrices of financial series can be large and that the filtering based on RMT is particularly powerful in this respect.
Laloux et al. (
1999,
2000) indicated that the empirical correlation matrix leads to a dramatic underestimation of the real risk, by overinvesting in artificially low-risk eigenvectors. They showed that less than 6% of the eigenvectors, which are responsible for 26% of the total volatility, appear to carry some information.
In order to quantify correlations, we first calculate the growth/return of series $i = 1, \dots, N$ over a time scale $\Delta t$,
$$G_i(t) = \ln S_i(t + \Delta t) - \ln S_i(t),$$
where $S_i(t)$ denotes the level of the series $i$. Since different series (variables) have varying levels of volatility (standard deviation), we define a normalized return
$$g_i(t) = \frac{G_i(t) - \langle G_i \rangle}{\sigma_i},$$
where $\sigma_i = \sqrt{\langle G_i^2 \rangle - \langle G_i \rangle^2}$ is the standard deviation of $G_i$, and $\langle \cdots \rangle$ denotes a time average over the period studied. We then compute the equal-time cross-correlation matrix $\mathbf{C}$ with elements
$$C_{ij} = \langle g_i(t)\, g_j(t) \rangle.$$
By construction, the elements $C_{ij}$ are restricted to the domain $-1 \le C_{ij} \le 1$, where $C_{ij} = 1$ corresponds to perfect correlations, $C_{ij} = -1$ corresponds to perfect anti-correlations, and $C_{ij} = 0$ corresponds to uncorrelated pairs of series.
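The normalization and the equal-time cross-correlation matrix described above can be sketched as follows. The function names and the three short return series are hypothetical and serve only to show that the diagonal equals one and that the entries stay within $[-1, 1]$.

```python
import math


def normalize(series):
    """Subtract the time average and divide by the standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / std for v in series]


def cross_correlation(return_series):
    """Equal-time cross-correlation matrix: time average of g_i(t) * g_j(t)."""
    g = [normalize(s) for s in return_series]
    length = len(g[0])
    return [[sum(a * b for a, b in zip(gi, gj)) / length for gj in g]
            for gi in g]


# Hypothetical returns: series 2 is a multiple of series 1, series 3 mirrors it.
returns = [
    [0.01, -0.02, 0.03, 0.00],
    [0.02, -0.04, 0.06, 0.00],
    [-0.01, 0.02, -0.03, 0.00],
]
C = cross_correlation(returns)
```

In this toy example, the first two series are perfectly correlated ($C_{12} = 1$), while the first and third are perfectly anti-correlated ($C_{13} = -1$).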
The difficulties in analyzing the significance and meaning of the empirical cross-correlation coefficients are due to the fact that market conditions change with time, and the cross correlations that exist between any pair of variables may not be stationary.
Furthermore, the finite length of time series available to estimate cross-correlations introduces “measurement noise”.
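If we have $N$ return series of the same length $L$, then the empirical cross-correlation matrix $\mathbf{C}$ can be computed by
$$C_{ij} = \frac{1}{L} \sum_{t=1}^{L} g_i(t)\, g_j(t).$$
In our case, we have $N = 12$ and $L = 118$. By diagonalizing the matrix $\mathbf{C}$, we obtain
$$\mathbf{C}\, \mathbf{u}_k = \lambda_k \mathbf{u}_k, \quad k = 1, \dots, N,$$
where $\lambda_k$ are the eigenvalues and $\mathbf{u}_k$ the corresponding eigenvectors.
In matrix notation, the correlation matrix can be expressed as
$$\mathbf{C} = \frac{1}{L}\, \mathbf{G} \mathbf{G}^{\mathsf{T}},$$
where $\mathbf{G}$ is an $N \times L$ matrix with elements $g_{im} = g_i(m \Delta t)$, and $\mathbf{G}^{\mathsf{T}}$ denotes the transpose of $\mathbf{G}$. Therefore, we consider a random correlation matrix
$$\mathbf{R} = \frac{1}{L}\, \mathbf{A} \mathbf{A}^{\mathsf{T}},$$
where $\mathbf{A}$ is an $N \times L$ matrix containing $N$ time series of $L$ random elements $a_{im}$ with zero mean and unit variance, which are mutually uncorrelated.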
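Statistical properties of random matrices such as $\mathbf{R}$ are known (e.g., Dyson 1971; Sengupta and Mitra 1999). Particularly, in the limit $N \to \infty$, $L \to \infty$ such that $Q = L/N$ is fixed, the probability density function $P_{rm}(\lambda)$ of eigenvalues $\lambda$ of the random correlation matrix $\mathbf{R}$ is given by
$$P_{rm}(\lambda) = \frac{Q}{2 \pi \sigma^2}\, \frac{\sqrt{(\lambda_{+} - \lambda)(\lambda - \lambda_{-})}}{\lambda}$$
for $\lambda$ within the bounds $\lambda_{-} \le \lambda \le \lambda_{+}$, where $\lambda_{-}$ and $\lambda_{+}$ are, respectively, the minimum and maximum eigenvalues of $\mathbf{R}$ given by
$$\lambda_{\pm} = \sigma^2 \left( 1 + \frac{1}{Q} \pm 2 \sqrt{\frac{1}{Q}} \right),$$
where $\sigma^2$ is equal to the mean of eigenvalues of the correlation matrix (Bouchaud and Potters 2003). The distribution of the components $\{u_k(l);\ l = 1, \dots, N\}$ of an eigenvector $\mathbf{u}_k$ of a random correlation matrix $\mathbf{R}$ should obey the standard normal distribution with zero mean and unit variance (Plerou et al. 2002),
$$\rho_{rm}(u) = \frac{1}{\sqrt{2 \pi}} \exp\left( -\frac{u^2}{2} \right).$$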
We observe that there are deviations from the interval of eigenvalues predicted by RMT. These deviating values might therefore contain pertinent information and are not noisy elements.
The theoretical eigenvalue bounds, $\lambda_{+}$ (maximum) and $\lambda_{-}$ (minimum), are obtained from the dimensions of the data. We have $N = 12$ series and $L = 118$ monthly returns for each series. Then, the value of $Q$ is equal to $L/N = 118/12 \approx 9.83$.
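For readers who wish to reproduce the order of magnitude of these bounds, a minimal sketch is given below. It assumes $\sigma^2 = 1$, which holds when the eigenvalues of a correlation matrix average to one; the function name `rmt_bounds` is hypothetical, and the resulting numbers are not claimed to match those reported by the authors.

```python
import math


def rmt_bounds(n_series, n_obs, sigma2=1.0):
    """Return Q = L/N and the RMT eigenvalue bounds (lambda_minus, lambda_plus)."""
    q = n_obs / n_series
    root = math.sqrt(1.0 / q)
    lam_minus = sigma2 * (1.0 + 1.0 / q - 2.0 * root)
    lam_plus = sigma2 * (1.0 + 1.0 / q + 2.0 * root)
    return q, lam_minus, lam_plus


# Dimensions stated in the text: 12 series, 118 monthly observations.
q, lam_minus, lam_plus = rmt_bounds(12, 118)
```

Under this assumption, $Q \approx 9.83$ and the bounds are roughly $\lambda_{-} \approx 0.46$ and $\lambda_{+} \approx 1.74$, so an empirical eigenvalue well above 1.74 would fall outside the range expected from pure noise.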
By analyzing the results, we observe in Figure 1 that many eigenvalues deviate from the RMT prediction interval. Laloux et al. (2000) found that less than 6% of eigenvalues might contain pertinent information. In our case, these deviations represent 8.33% of the total of eigenvalues, which is a sizable share. Hence, only 91.67% of eigenvalues conform to the RMT distribution. Moreover, the largest empirical eigenvalue $\lambda_{\max}$ exceeds the upper bound $\lambda_{+}$ predicted by RMT.
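The share of deviating eigenvalues can be computed as sketched below. The spectrum used here is hypothetical (it is not the spectrum shown in Figure 1), and the bounds again assume $\sigma^2 = 1$; the example only shows that one deviating eigenvalue out of twelve corresponds to a share of $1/12 \approx 8.33\%$.

```python
import math


def rmt_interval(n_series, n_obs):
    """RMT eigenvalue interval for a correlation matrix, assuming sigma^2 = 1."""
    root = math.sqrt(n_series / n_obs)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


def deviating_share(eigenvalues, lam_minus, lam_plus):
    """Fraction of eigenvalues lying outside [lam_minus, lam_plus]."""
    outside = [lam for lam in eigenvalues if lam < lam_minus or lam > lam_plus]
    return len(outside) / len(eigenvalues)


# Hypothetical spectrum of a 12 x 12 correlation matrix (illustration only),
# chosen so that exactly one eigenvalue lies above the upper bound.
eigenvalues = [2.65, 1.30, 1.15, 1.05, 0.95, 0.90,
               0.85, 0.75, 0.70, 0.65, 0.55, 0.50]
lam_minus, lam_plus = rmt_interval(12, 118)
share = deviating_share(eigenvalues, lam_minus, lam_plus)
```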
4.3. Regression Analysis
In the following, we present the results from regressing the University of Michigan Consumer Sentiment Index on the various independent variables.
Table 2 presents the results from the ordinary least squares (OLS) linear regression for the undifferentiated variables (Model 1) and the differentiated variables (Model 2). In both models, the F-statistic is significant, suggesting that the independent variables jointly influence the University of Michigan Consumer Sentiment Index. We observe in Model 1 that only two variables are significant at the 5% level and that the adjusted R-squared is about 7%. The MSCI USA Leaders Index is significant at the 5% level, which suggests that companies with high Environmental, Social, and Governance (ESG) performance contribute positively to the improvement of the U.S. consumer sentiment. In Model 2, six variables are significantly related to the University of Michigan Consumer Sentiment Index: Business Confidence Index (-1), Dow Jones Sustainability Index (-1), MSCI Global Energy Efficiency (-1), Personal income (-4), SP Consumer Finance Index (-3), and Unemployment rate (-3). The adjusted R-squared improves from 7.18% in Model 1 to 23.43% in Model 2. Model 2 could only be used for predictions given that the differentiations were done iteratively until we got the best results. Overall, some of our results are in line with
Lahiri and Zhao (
2016), who showed that macroeconomic conditions can explain the sentiment of U.S. consumers.
Next, we split the full sample period into two equal sub-periods to assess whether the estimated model maintains the same predictive power. The related results are given in
Table 3. Notably, they show the significance of seven variables in the first sub-period compared to only one variable in the second sub-period. The Adjusted R-squared is 37% in the first sub-period and 1.5% in the second sub-period. Accordingly, we can say that the model is not stable over time since it shows very different results in each sub-period. This suggests the need to move beyond the linear regression in order to capture nonlinearity in the model.
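The split-sample stability check can be illustrated with a single-regressor toy example. This is only a sketch under strong simplifying assumptions: the helper names and data are hypothetical, and the regressions in Table 3 involve several regressors rather than one.

```python
def ols_simple(x, y):
    """Fit y = a + b*x by least squares; return (slope, intercept, adjusted R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 with k = 1 regressor: 1 - (1 - R^2)(n - 1)/(n - k - 1).
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return slope, intercept, adj_r2


def split_sample_fit(x, y):
    """Estimate the same model on the first and second halves of the sample."""
    half = len(x) // 2
    return ols_simple(x[:half], y[:half]), ols_simple(x[half:], y[half:])


# Hypothetical data: a clean linear link in the first half, none in the second.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0, 4.0, 6.0, 8.0, 5.0, 3.0, 6.0, 4.0]
first, second = split_sample_fit(x, y)
```

In this toy example the adjusted R-squared drops sharply from the first to the second half, which is the kind of instability that motivates moving to a model with changing regimes.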
4.4. Regime Switching Model
For one variable, the typical behavior could be described with a first-order autoregression as follows,
$$y_t = c_1 + \phi y_{t-1} + \varepsilon_t,$$
with $\varepsilon_t \sim N(0, \sigma^2)$, which seemed to adequately describe the observed data for $t = 1, 2, \dots, t_0$. Here, $t_0$ is a date where there is a significant change in the average of the series, so that instead the data would be described as follows,
$$y_t = c_2 + \phi y_{t-1} + \varepsilon_t$$
for $t = t_0 + 1, t_0 + 2, \dots$
This fix of changing the value of the intercept from $c_1$ to $c_2$ might help the model to get back on track with better forecasts, but it is rather unsatisfactory as a probability law that could have generated the data.
Rather than claim that the first equation above governed the data up to date $t_0$ and the second one after that date, it is possible to write both in one equation,
$$y_t = c_{s_t} + \phi y_{t-1} + \varepsilon_t,$$
where $s_t$ is a random variable that, as a result of institutional changes that happened in the sample, takes the value $s_t = 1$ for $t = 1, 2, \dots, t_0$ and $s_t = 2$ for $t = t_0 + 1, t_0 + 2, \dots$
A complete description requires a probabilistic model of what caused the change from $s_t = 1$ to $s_t = 2$, where $s_t$ is the realization of a two-state Markov chain with
$$\Pr(s_t = j \mid s_{t-1} = i, s_{t-2} = k, \dots, y_{t-1}, y_{t-2}, \dots) = \Pr(s_t = j \mid s_{t-1} = i) = p_{ij}.$$
The state $s_t$ is not supposed to be observed directly; we only infer its operation through the observed behavior of $y_t$. The probability of a change in regime depends on the past only through the value of the most recent regime (Hamilton 2005). Furthermore, if the regime change reflects a fundamental change in monetary or fiscal policy, the prudent assumption would seem to be to allow the possibility for it to change back again, suggesting that $p_{22} < 1$ is often a more natural formulation for thinking about changes in regime than $p_{22} = 1$ (Hamilton 2005). We present in Table 4 the results of the regime-switching model.
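To illustrate the data-generating process behind the regime-switching specification, the following sketch simulates an autoregression whose intercept depends on a two-state Markov chain. All parameter values, the function name, and the seed are hypothetical; the sketch simulates the process and does not estimate the model reported in Table 4.

```python
import random


def simulate_markov_switching_ar1(n, c, phi, p_stay, sigma=1.0, seed=0):
    """Simulate y_t = c[s_t] + phi * y_{t-1} + eps_t with a two-state Markov chain.

    c and p_stay are dicts keyed by regime (1 or 2); p_stay[i] is the
    probability of remaining in regime i from one period to the next.
    """
    rng = random.Random(seed)
    state, y_prev = 1, 0.0
    states, ys = [], []
    for _ in range(n):
        # Leave the current regime with probability 1 - p_stay[state].
        if rng.random() >= p_stay[state]:
            state = 2 if state == 1 else 1
        y = c[state] + phi * y_prev + rng.gauss(0.0, sigma)
        states.append(state)
        ys.append(y)
        y_prev = y
    return states, ys


# Hypothetical parameters (illustration only).
states, ys = simulate_markov_switching_ar1(
    n=200, c={1: 0.0, 2: 1.0}, phi=0.5, p_stay={1: 0.9, 2: 0.8}
)
```

Because both staying probabilities are below one, the simulated series can move into Regime 2 and back again, in line with the formulation in which a regime change is not permanent.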
Results from Regime 1 show that the SP Municipal Bond Education is statistically significant at 10%, reflecting the positive impact of the student loan on the University of Michigan Consumer Sentiment Index. The SP Consumer Finance index and unemployment rate are also significant at the level of 5%. The SP Consumer Finance Index contributes significantly and positively to the consumer sentiment index, while the unemployment rate contributes negatively to it. In that state (1), the U.S. consumer seems to be frustrated about the other variables and then has less confidence in sustainability variables, innovation freedom, energy efficiency policies, and personal income expectations. However, MSCI Global Energy Efficiency, SP Municipal Bond Education, and Personal Income contribute negatively to the level of the University of Michigan Consumer Sentiment Index after switching from Regime 1 to Regime 2. This result should be explained as an important change in the social and economic state in the U.S. Furthermore, other variables become significant; Business Confidence Index and MSCI USA Leaders Index have a significant and positive impact. This can be explained by the fact that the improvement of businesses affects directly and positively the consumer sentiment. As for the positive impact of the MSCI USA Leaders Index, it can be explained by the fact that U.S. consumers are highly satisfied by the Environmental, Social, and Governance (ESG) performance of companies belonging to the MSCI USA Leaders Index. Finally, personal income shows a negative relationship with the University of Michigan Consumer Sentiment Index.
Based on the above results, the regime-switching model indicates a switch from a first state, in which U.S. consumers were not confident about the variables related to sustainability, personal income, environment, and business confidence, to a second state marked by more confidence and more positive sentiment regarding energy efficiency, unemployment rate, student loan, sustainability, and business confidence.
Furthermore, results from
Table 5 show that the probability of being in Regime 1 is more than 68%, while the probability of being in Regime 2 is 31.56%. We also observe that both probabilities do not depend on the origin state. The constant expected duration is also higher in Regime 1, at 3.17 months, compared to only 1.47 months in Regime 2. Accordingly, U.S. consumers seem to stay most of the time in Regime 1 and are usually less confident in sustainability variables, innovation freedom, energy efficiency policies, and personal income expectations. The switch to Regime 2 is seasonal and depends on social, economic, political, and institutional changes in the U.S.
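The link between the regime probabilities and the expected durations can be checked with the standard result for a Markov chain, namely that the expected time spent in a regime with staying probability $p$ is $1/(1 - p)$. In the sketch below, the value 0.6844 for Regime 1 is an assumption (one minus the reported 31.56%), and the function name is hypothetical.

```python
def expected_duration(p_stay):
    """Expected number of periods spent in a regime with staying probability p_stay."""
    if not 0.0 <= p_stay < 1.0:
        raise ValueError("p_stay must lie in [0, 1)")
    return 1.0 / (1.0 - p_stay)


# Assumed staying probabilities (see the lead-in for how they were chosen).
duration_regime_1 = expected_duration(0.6844)
duration_regime_2 = expected_duration(0.3156)
```

Under these assumptions, the durations come out at roughly 3.17 and 1.46 months, close to the values reported above; the small gap for Regime 2 may simply reflect rounding in the reported figures.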
4.5. Gradient Descent Algorithm
We computed a gradient descent for the linear regression in order to compare the results obtained. The gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent, as defined by the negative of the gradient (
Cauchy 1847). Algorithms play an important role in the optimization process. They are defined as finite sequences of well-defined, computer-implementable instructions that solve a class of problems or perform a computation. Gradient descent allows us to update the linear regression coefficients iteratively until convergence. We then minimize the mean squared error (the cost function), which measures the average squared difference between the predicted and the observed values,
$$\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2, \qquad h_\theta(x) = \theta_0 + \theta_1 x.$$
Rescaling this equation makes the calculation simpler with the Gradient Descent Algorithm and gives the following cost function,
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2.$$
Gradient descent changes the theta values iteratively in a way that minimizes the cost function. We start the algorithm by initializing $\theta_0$ and $\theta_1$ and then update them as
$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1), \quad j = 0, 1,$$
where $\alpha$ is the learning rate, or how quickly we want to move towards the minimum. If $\alpha$ is too large, however, we can overshoot.
The algorithm is repeated until convergence.
After computing the algorithm, we obtained the results presented in
Table 6.
We can see that the sum of errors computed by using the Gradient Descent Algorithm is smaller than that found in the ordinary linear regressions and the switching regime model. The coefficients cannot be interpreted, since this machine learning algorithm aims to minimize the cost function regardless of the meaning of the coefficients. Thus, this model could be used to make more accurate predictions, since it yields smaller errors than the other models computed above.
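For illustration, a minimal batch gradient descent for a one-regressor linear model is sketched below. The learning rate, the number of iterations, the function names, and the data are all hypothetical choices; this is not the implementation used to produce Table 6.

```python
def cost(theta0, theta1, x, y):
    """Cost J = (1 / 2m) * sum of squared prediction errors."""
    m = len(x)
    return sum((theta0 + theta1 * xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)


def gradient_descent(x, y, alpha=0.05, n_iter=5000):
    """Minimize J over (theta0, theta1) with simultaneous updates."""
    theta0, theta1 = 0.0, 0.0  # initialization
    m = len(x)
    for _ in range(n_iter):
        errors = [theta0 + theta1 * xi - yi for xi, yi in zip(x, y)]
        grad0 = sum(errors) / m                               # dJ/dtheta0
        grad1 = sum(e * xi for e, xi in zip(errors, x)) / m   # dJ/dtheta1
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical data generated from y = 1 + 2x (illustration only).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(x, y)
```

With this learning rate the iterations settle at an intercept of about 1 and a slope of about 2, the values that generated the toy data; a much larger $\alpha$ would make the updates overshoot and diverge.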
5. Conclusions
In this paper, we examined the relationship between the U.S. consumer sentiment and other relevant financial indexes in relation to education, environment, sustainability, and innovation freedom. We started by analyzing the structure of all the variables via cross-correlation and RMT analysis. Results show that more than 8.33% of the eigenvalues deviate from the RMT predictions and thus contain pertinent information, which means that those variables are useful for our analysis. Then, we used the linear regression, which fails to capture the nonlinear interactions among the variables, as became especially clear after estimating the linear regression in two equal sub-periods. Accordingly, we employed the regime-switching regression, and the results show that the model presents a switch from Regime 1, where U.S. consumers were not confident about the variables studied in relation to sustainability, personal income, environment, and business confidence, to Regime 2, where there was a switch towards more confidence and more positive sentiment regarding energy efficiency, unemployment rate, student loan, sustainability, and business confidence. However, U.S. consumers stay most of the time in Regime 1 and are usually less confident in sustainability variables, innovation freedom, energy efficiency policies, and personal income expectations. The switch to Regime 2 is seasonal and depends on social, economic, political, and institutional changes in the U.S. Finally, we computed the Gradient Descent Algorithm to compare the errors obtained in each model. We found that the algorithm yields less error and could deliver better predictions than the other models.
Our analyses and results extend our limited understanding regarding the exogenous factors that determine the U.S. consumer sentiment and the suitability of prediction models. In fact, we have shown that noneconomic and nonfinancial variables matter to the level of the U.S. consumer sentiment and that nonlinear models combined with the Gradient Descent Algorithm have greater predictive power than standard regression models.
Our results have policy implications given that the findings presented above improve our understanding of the factors driving the sentiment of U.S. consumers. The results can be useful to investors in a way that would help them better understand the drivers of the U.S. consumers’ sentiment and the overall level of confidence in the U.S. economy. Furthermore, the results have implications regarding consumption, saving, investment, and other related variables. Consumers can benefit from the findings to enhance their understanding of the most important problems that are impacting their sentiments, which might induce economic, social, and political consequences through voting decisions or economic and social adjustments. For policymakers, there seems to be a possibility to design policies capable of exploiting the association between current economic and stock markets conditions and consumers’ confidence. Given the significant role played by specific factors, a practical policy formulation is merited to enhance U.S. consumer confidence with appropriate initiatives that involve education, environment, sustainability, and innovation freedom. If employed, such policies can enhance the well-being of the U.S. consumers.
Future studies can consider conducting an analysis that involves various developed and emerging countries. Another extension could be the application of a mixed-data sampling to exploit the high frequency of data on stock indices in explaining consumer confidence.