Homeownership for All: An American Narrative

The narrative of homeownership for all citizens is a uniquely American story. Narrative economics is a field that studies the spread of stories to explain economic fluctuations. We quantitatively examine the relationship between the American housing narrative and the run-up in home prices experienced since the Great Recession in the United States. We rely on a natural language processing (NLP) framework to measure the sentiment associated with the narrative. We then use a panel vector autoregression to empirically model the relationship between home prices and homeownership sentiment in the United States. We find that sentiment related to the American homeownership narrative is an important factor in explaining movements in home prices even after taking into account the economic factors typically thought to explain home price fluctuations. Though others have examined the role of sentiment in markets, our study is the first to empirically measure the American homeownership narrative. While this is a narrative promoted at the national level, future research might examine whether sentiment related to homeownership varies across this diverse nation.

species (Shiller 2017). As George Bailey's story attests, owning a home in America signifies success for the family and the nation.
Researchers in economics use theoretical paradigms to tell stories and provide insight into the forces shaping the human experience. McClosky notes that "economists are tellers of stories and makers of poems, and from recognizing this we will know better what economists do" (McClosky 1990, p. 59). We argue that recognizing that economists are storytellers will allow us to know better what people do. McClosky notes that economists' stories "are selective" because "we decide what matters, for our purposes" (McClosky 1990, p. 69). We use narrative economics to measure what matters to people, rather than economists. Narrative economics is a field of research that studies the spread of stories to shed light on what drives economic fluctuations (Shiller 2019). In this paper, we examine the relationship between the time series of home prices in the United States and a measure of sentiment related to the homeownership narrative.
In his Presidential Address before the American Economic Association, Robert Shiller argued that research based on quantitative examinations of popular narratives has the potential to expand our understanding of the workings of the economy (Shiller 2017). A narrative is simply a story that is shared through conversation, written text, social media, radio, television, or other means. Narratives change over time and need not be factual. It is not easy to judge what narratives will become contagious, but typically stories that are shared repeatedly grab people's imaginations or emotions.
The goal of this paper is to provide quantitative evidence on the relationship between the American housing narrative and the run-up in housing prices experienced after the Great Recession of 2007-2009. While the term "American Dream" was not popularized until 1931, an American narrative tying happiness to homeownership began long before. This narrative has evolved over time, as most do, with the Great Recession having a profound impact on the homeownership story. Few empirical studies have heeded Shiller's call for the study of narratives because stories are difficult to quantify. One way to measure what people are talking about is to use simple counts of words in written works or in transcripts of spoken word, referred to as "n-grams." However, as Shiller points out, the emotional tenor of a word or phrase is a factor, with more vivid or emotion-inducing images leading to stronger responses. While a word or phrase may signal a narrative, the connections between various signals are meaningful. To measure the sentiment associated with a word or phrase, we adopt a natural language processing (NLP) framework developed in linguistics, artificial intelligence, and computing fields to extract information from a body of text, referred to as a corpus. As we describe later, the computer is trained to understand the context of words without any fixed notion of words.
In our quantitative analysis, we use an archive of television news to search for terms related to the homeownership narrative. Though the words associated with the narrative are mentioned quite consistently since the Great Recession, we observe variability in counts and sentiment over time. Importantly, we find that a shock to sentiment drives home prices, whereas a shock to price has little importance in explaining sentiment. Thus, the homeownership narrative moves American home prices after the Great Recession. This evidence provides direct support for Shiller's (2019) assertion that narratives drive major economic events.
The remainder of this paper is organized as follows. Section 2 develops the American homeownership narrative. Section 3 describes the quantitative measurement of the narrative using both word counts (n-grams) and sentiment analysis. The empirical method, based on the natural language processing (NLP) framework, is also described in Section 3. Section 4 provides summary information and reports the primary results. Section 5 contains a discussion of the findings and concluding remarks.

The Homeownership Narrative
The phrase "American Dream" was not popularized until 1931 when it appeared in a book by James Truslow Adams, a respected historian who won the Pulitzer Prize for his writing. Adams begins his international bestseller entitled "The Epic of America" with the settling of the country and progresses in the book through the nation's development (Adams 1931). In extolling the "gifts to humankind" made by America, Adams describes the American Dream as "that dream of a land in which life should be better and richer and fuller for every man, with opportunity for each according to his ability or achievement" (Adams 1931, p. 317). According to Adams, the individual liberties and opportunities granted to American citizens constitute a great accomplishment, yet the nation remained a work in progress. Each generation of Americans must struggle to preserve the "life, liberty, and pursuit of happiness" bestowed by the U.S. Declaration of Independence.
While Adams did not single out homeownership as an element of the American Dream, Herbert Hoover, the President of the United States at the time of Adams' writing, did argue that owning a home was a core value in the United States. On 13 August, 1931, Hoover argued that "(h)ome owning is more than the provision of domiciles; it goes to the roots of family life, public morals and standards of living" (Hoover 1931). In fact, the notion of a home for all was embraced in the American narrative much earlier, even by the founders of the nation. Typically, the early narrative used the phrasing "land" rather than "home" ownership, but the implication was that ownership of a place to live for each citizen was an objective. This is evidenced in a letter from Thomas Jefferson to James Madison written on 28 October 1785 in which Jefferson argues that "(t)he earth is given as a common stock for man to labour and live on" (Jefferson 1785). In the letter, Jefferson stresses the importance of land ownership by all in saying "(b)ut it is not too soon to provide by every possible means that as few as possible shall be without a little portion of land. The small landholders are the most precious part of a state." Jefferson seems to be asking, as George Bailey did much later, whether it was "too much to have them work and pay and live and die in a couple of decent rooms and a bath?" This narrative through the years suggests that homeownership is the goal, if not a right, for all Americans.
In more recent decades, American leaders, both Democrats and Republicans, have reinforced the view that a home for every American is fundamental to the American way of life. In remarks during a Roundtable Discussion with Housing Industry Representatives in Arlington, Texas, on 12 April, 1984, President Ronald Reagan said "(h)ome ownership is an essential part – as I told these people out there today – of the American dream. It strengthens the family. It's fundamental to our way of life, and we want to build an opportunity society where more and more families from all walks of life can afford to buy their own homes" (Reagan 1984). In 1995, President Bill Clinton announced a "National Homeownership Strategy" with a goal toward increasing the homeownership rate to an all-time high (Harris 1995). In 2002, President George W. Bush opined that "(w)e can put light where there's darkness, and hope where there's despondency in this country. Additionally, part of it is working together as a nation to encourage folks to own their own home" (Becker et al. 2008). Earlier that same year, Bush argued that economic security relied upon homeownership by Americans. President Barack Obama added his support to the homeownership narrative when he set out policies aimed at promoting "the American dream of homeownership" (Obama 2013). This support continues with the Biden-Harris administration, as evidenced in their platform, which states that President Joe Biden "believes the middle class is not a number, but a value set which includes the ability to own your own home and live in a safe community. Housing should be a right, not a privilege" (Biden 2020). The President does not go as far as suggesting that homeownership is a right of all Americans, but rather suggests it is the goal for middle America.
While American leaders promote a homeownership narrative with a positive tone, evidence suggests that many younger Americans, in particular, are not buying into it. Almost 6 in 10 Americans surveyed by CNN in 2014 said the American Dream is unattainable, with 63% of young adult respondents indicating it is an impossible dream (Luhby 2014). In retrospect, Presidents Bill Clinton and George W. Bush were blamed for policies that drove up home prices, leading to a price bubble in the housing market that later crashed (Becker et al. 2008). Today, the American Dream of homeownership seems more like a fairytale that reflects wishful thinking to many young people. Traditionally, a home is a middle-class person's store of wealth and a safety net for bad times, but this buffer has been diluted. With growth in the interest of professional investors in the housing market in recent years, individual homeownership may be a thing of the past. According to a recent Wall Street Journal article, the American Dream is for rent (Dezember 2020). The author of the article notes that the idea that single family homes in the suburbs are a good large-scale investment took off following the bursting of the housing bubble in 2007. Professional investors saw an opportunity presented by low home prices and they pounced on it.
In recent years, the homeownership narrative in the popular press is negative and predicts a "tsunami of evictions," with an increase in homelessness (Miranda 2020). Progressive organizations are calling for change in order to protect Americans, including requiring disclosures on rents and evictions from the nation's large landlords (Andrews 2019). According to Alexandria Ocasio-Cortez, an extremely popular young member of the U.S. House of Representatives with a massive social media following, "(w)e need a complete overhaul of housing policy. We need to stop commodifying the housing market because it's not a speculative good; it's a human right. Everyone needs a home to thrive." As is often the case, the poor, women, and minority citizens are hit the hardest in a poor economy (e.g., Desmond 2016; Dezember 2021). However, the impact of the change in the homeownership environment in the United States since the Great Recession is reverberating through other groups, even touching senior citizens. These days more older Americans are without permanent living structures and forced to seek seasonal employment, work that is often quite physically demanding (Bruder 2017).
Prior work has considered the role of investor sentiment in markets. As for stock markets, the evidence suggests that stock prices move in response to changes in investor sentiment (Baker and Wurgler 2007). Wilcox (2015) reports that a measure of home purchase sentiment has power in forecasting home sales. On the other hand, Goodwin (2019) reports little success in explaining home prices with real estate market sentiment. However, Soo (2018) finds success when using textual analysis of local newspaper articles to quantify the tone of the reports. Word lists from a commonly used dictionary are used to quantify tone as positive or negative. Soo concludes that local housing media sentiment predicts future home prices. While our approach has similarities with that of Soo, there are also important differences. First, our measure is not dependent on word lists which can capture unintended effects (Loughran and McDonald 2011). Second, our goal is distinct from Soo's, who seeks to examine how the sentiment reflected in newspapers moves home prices at the local level. While we use a news archive to identify the story of homeownership in America, our goal is to examine whether the national narrative can explain the run-up in American home prices since the Great Recession.
As we have described, the American homeownership narrative has experienced a transformation over time. In the next section we turn to a description of how we measure Americans' emotional response to this narrative.

N-Grams
Shiller argues that "we would be wise to add some analysis of what people are talking about if we are to search for the source of economic fluctuations" (Shiller 2017, p. 972). To see what people are talking about, we follow Shiller and begin with n-gram counts. N-grams are simple sequences of n items from a sample of text, and our chosen searches use n-grams of one or two words.
We searched each n-gram on the Internet Archive, a non-profit digital library of internet websites and published works (Internet Archive 2020). The Internet Archive includes a number of searchable transcripts, including TV News. We selected the TV News Archive because it is an appropriate source for news narratives on the American housing experience. We considered a number of corpora and searchable bodies of text, including ProQuest, Twitter, and the searchable corpus available at https://www.english-corpora.org/ (accessed on 28 May 2021). For our purposes, the Internet Archive provided the most comprehensive body of searchable text. The TV News Archive begins in 2009, which follows the crash in the U.S. housing market, and the count data are monthly. The digital library includes over 2 million television news shows broadcast in the United States on the major networks (ABC, CBS, and NBC), as well as cable programs on BBC, CNN, and Fox News and business news broadcasts such as CNBC. Determining the exact linguistic contour of a narrative is difficult, though not necessary for our purposes, because the narrative is sufficiently characterized by a cluster of carefully selected n-grams based on economic priors. Later in the paper, we report on joint tests of the hypothesis that the n-grams are connected and jointly explain the variation in American home prices. We chose n-grams that we believe capture the American dream of homeownership as developed earlier in the paper. We examined the sensitivity of our results by adding several n-grams. When we increased the number of n-grams included in the analysis, our findings were substantively similar. Nine n-grams were included in our analysis: American Dream, Eviction, Home Price, Homeless, Homeowner, Housing Crisis, Housing Market, Real Estate, and Rent.
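The monthly counting step can be illustrated with a minimal sketch. It assumes transcripts are already available as (month, text) pairs and uses case-insensitive, whole-word matching; both the input format and the matching rule are assumptions made for illustration, since the text does not specify how the TV News Archive search treats case, plurals, or word boundaries. The sample transcripts are invented.

```python
import re
from collections import Counter

# The nine n-grams listed in the text.
NGRAMS = [
    "American Dream", "Eviction", "Home Price", "Homeless", "Homeowner",
    "Housing Crisis", "Housing Market", "Real Estate", "Rent",
]

def count_ngram(text: str, ngram: str) -> int:
    """Count case-insensitive, whole-word occurrences of an n-gram.

    Assumption: whole-word matching, so "rent" does not match inside
    "current" and "Homeowner" does not match "homeowners".
    """
    words = (re.escape(w) for w in ngram.split())
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def monthly_counts(transcripts):
    """transcripts: iterable of (month 'YYYY-MM', transcript text) pairs."""
    counts = Counter()
    for month, text in transcripts:
        for ngram in NGRAMS:
            counts[(month, ngram)] += count_ngram(text, ngram)
    return counts

# Hypothetical transcripts, invented for illustration only.
sample = [
    ("2009-06", "The housing market is weak. Rent is up, and current rent hurts."),
    ("2009-06", "Is the American dream over? Real estate agents disagree."),
    ("2009-07", "Another eviction notice for a homeowner in the housing crisis."),
]
counts = monthly_counts(sample)
print(counts[("2009-06", "Rent")])  # occurrences of "Rent" in June 2009
```

A looser rule (for example, also counting plurals) would change the counts, so the matching choice should be reported alongside any results.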

Sentiment Analysis
Narratives are the stories of humans that reflect fact, fiction, and emotion, and they are often developed to leave an imprint on others. We can measure what people are talking about using n-gram counts from a body of written or spoken word. However, it is not just the words that form a narrative that is spread. The emotional response is a factor, with more vivid images leading to stronger emotional responses. An emotional response does not suggest poor decision making because emotion and reason work in parallel to promote rational behavior (Slovic et al. 2004). An immediate affective response to a stimulus gives a rough assessment that directs cognitive processes toward the aspects of greatest concern (Loewenstein et al. 2001). Thus, positive and negative experienced emotion motivate the actions people take. The evidence supports the view that asset markets move due to social dynamics and investor sentiment (Shiller 1984; Baker and Wurgler 2007).
To measure the sentiment associated with an n-gram, we applied the FLAIR algorithm, a natural language processing (NLP) framework proposed by Akbik et al. (2018) and Akbik et al. (2019). NLP frameworks are developed in linguistics, artificial intelligence, and computing fields in order to extract information from documents, including written works and transcripts of spoken word like television broadcasts. The computer is trained to evaluate the context of words and sentences without any fixed notion of words. Instead, the "contextual string embeddings" treat words as strings of characters that are contextualized by the surrounding text. Thus, a word can have multiple embeddings or meanings depending on the use of the word in the context of the passage. Naturally, even humans will disagree on the extent of positivity or negativity of a text passage. We made use of the FLAIR algorithm as it is considered to have information extraction performance that is state-of-the-art (Otter et al. 2020; Li et al. 2020). The text passages searched were from the Internet Archive, described previously.
The text search for the appearance of each n-gram in the TV News Archive returned the transcript of the news. After cleaning extraneous material related to the programming code, we defined a passage of text centered on each n-gram that includes 100 tokens before and after the target word or phrase. A token is an individual occurrence of a linguistic unit in speech or writing. FLAIR models language as distributions over characters. It automatically internalizes linguistic concepts such as words, sentences, subclauses, and even sentiment. For the purpose of sentiment classification, the model was pre-trained over a mix of polarized positive and negative comments. FLAIR has been shown to recognize and correctly classify positive and negative sentiment in English text (Otter et al. 2020; Li et al. 2020). We fed each passage containing the target n-gram extracted from the TV News Archive into the pre-trained model. The model then classified the passage's sentiment based on context. Finally, the algorithm returned a sentiment score (negativity and positivity of each passage), with sentiment ranging from −1.0 to +1.0. We then collected the scores and computed the average for each month. The object of study was then the monthly sentiment average for the nine n-grams.
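The passage-extraction and monthly-averaging steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: tokens are taken to be whitespace-separated words, the classifier is a hypothetical keyword-based stand-in for the pre-trained FLAIR model, and the mapping from a (label, confidence) pair to a signed score in [−1, +1] is assumed for illustration.

```python
from collections import defaultdict
from statistics import mean

_PUNCT = ".,;:!?\"'()"

def normalize(token):
    """Lowercase a token and strip surrounding punctuation."""
    return token.lower().strip(_PUNCT)

def passages(tokens, target, window=100):
    """Yield windows of up to `window` tokens on each side of every
    occurrence of `target` (a list of normalized tokens)."""
    n = len(target)
    normed = [normalize(t) for t in tokens]
    for i in range(len(tokens) - n + 1):
        if normed[i:i + n] == target:
            start = max(0, i - window)
            yield tokens[start:i + n + window]

def signed_score(label, confidence):
    """Assumed mapping of (label, confidence) to a score in [-1, +1]."""
    return confidence if label == "POSITIVE" else -confidence

def toy_classifier(passage):
    """Hypothetical stand-in for a pre-trained sentiment model."""
    negative = {"crisis", "eviction", "falling"}
    hits = sum(normalize(tok) in negative for tok in passage)
    return ("NEGATIVE", 0.9) if hits else ("POSITIVE", 0.8)

def monthly_sentiment(transcripts, target, classify=toy_classifier, window=100):
    """Average the signed passage scores for one n-gram within each month."""
    by_month = defaultdict(list)
    for month, text in transcripts:
        for passage in passages(text.split(), target, window):
            by_month[month].append(signed_score(*classify(passage)))
    return {month: mean(scores) for month, scores in by_month.items()}

# Hypothetical transcripts, invented for illustration only.
sample = [
    ("2009-06", "Home prices keep falling as the housing market struggles."),
    ("2009-06", "Buyers return to the housing market with optimism."),
    ("2009-07", "The housing market is steady."),
]
result = monthly_sentiment(sample, ["housing", "market"])
print(result)
```

Passing the classifier as an argument keeps the windowing and averaging logic independent of the model, so a real pre-trained classifier could be substituted for the toy one without touching the rest of the code.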

Panel Vector Autoregression
Our goal was to examine the relationship between home prices and sentiment related to homeownership in the United States subsequent to the Great Recession. We described our measurement of sentiment above. As for home prices, we used the widely followed Case-Shiller home price index available from the Federal Reserve Bank of St. Louis (Federal Reserve Bank of St. Louis 2020). To capture the relationships among several variables across time, economists often use vector autoregressions, or VARs, which are generally considered to be a useful way of quantifying the links between the time series. A VAR system estimates the dynamic relationships between the variables while treating all included variables as endogenous. Holtz-Eakin et al. (1988) examined the particular case of vector autoregressions with panel data, or panel VAR, an approach that assumes the underlying data-generating process is the same for the cross-sectional units. We used the convenient set of programs provided by Abrigo and Love (2016), which were developed to follow Holtz-Eakin et al. (1988) and allow for nonstationarity, appropriate lag length selection, and computation of test statistics in a generalized method of moments framework.
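To provide insight into the relationship between home prices and the American homeownership narrative as reflected in our measure of sentiment, we used a bivariate panel autoregression, with the time series being home prices (Pt) and sentiment (Si,t). Thus, we estimated the following panel vector autoregression:
$$P_t = \alpha_{1,i} + \sum_{j=1}^{m} \beta_{1,j} P_{t-j} + \sum_{j=1}^{m} \gamma_{1,j} S_{i,t-j} + \varepsilon_{1,i,t} \quad (1)$$
$$S_{i,t} = \alpha_{2,i} + \sum_{j=1}^{m} \beta_{2,j} P_{t-j} + \sum_{j=1}^{m} \gamma_{2,j} S_{i,t-j} + \varepsilon_{2,i,t} \quad (2)$$
where Si,t is the average sentiment for n-gram i, with i = 1 … 9, at time t, the α terms are panel fixed effects, and the ε terms are error terms. The lag length, m, was chosen so that the error term was white noise. Pt is the unexplained change in home prices, and is derived as we subsequently explain.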
As noted above, the variables in a VAR system are treated as endogenous. Before considering the relationship between home prices and homeownership sentiment, we estimated a preliminary VAR to remove the impact of other determinants of home prices. To this end, we followed Case and Shiller (2004, Table 3), who identified six fundamental variables, all available from the Federal Reserve Bank of St. Louis: change in population, change in employment, mortgage rate, unemployment rate, housing starts, and income per capita. The first differences of the six variables were stationary and the VAR system was stable after differencing, with an R² of 87.92% for the home price equation. The preliminary VAR estimation confirms the suitability of the six variables above for understanding home prices with economic data only. The residuals from the preliminary regression represent the variation in prices that is not explained by the six fundamental variables. This unexplained variation is the price series (Pt) included in the estimation of the panel VAR. The vector Pt in Equations (1) and (2) was formed by stacking this unexplained variation nine times to match the dimension of Si,t.
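The stacking step can be sketched in a few lines. The example assumes the residual series and the sentiment series are already aligned month by month; the function name, the long-format row layout, and all numbers are hypothetical and serve only to show how one national price series is repeated across the n-gram panels.

```python
def stack_panel(residuals, sentiment):
    """Build long-format rows (ngram, t, P_t, S_it).

    residuals: list of T unexplained price changes (one national series).
    sentiment: dict mapping each n-gram to its list of T monthly averages.
    The same residual series is repeated once per n-gram.
    """
    rows = []
    for ngram, series in sentiment.items():
        if len(series) != len(residuals):
            raise ValueError(f"length mismatch for {ngram!r}")
        for t, (p, s) in enumerate(zip(residuals, series)):
            rows.append((ngram, t, p, s))
    return rows

# Hypothetical values for two of the nine n-grams over three months.
residuals = [0.10, -0.20, 0.05]
sentiment = {
    "Rent": [-0.40, -0.50, -0.30],
    "American Dream": [0.20, 0.10, 0.30],
}
rows = stack_panel(residuals, sentiment)
for row in rows:
    print(row)
```

With nine n-grams, each month's residual appears nine times, once per panel, which is what allows a single price series to be paired with the sentiment panel.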
The second variable in our VAR system, sentiment (Si,t), was a panel of the average monthly sentiment measure associated with each of the nine n-grams described previously. The panel methodology inherently accommodated the interconnectedness of the n-grams that characterize the homeownership narrative. Therefore, we tested the hypothesis that the average sentiment for each of the nine n-grams in the panel describes the dynamic relationship between the narrative and home prices. The sentiment time series began in June 2009 and ended in November 2020, giving 138 monthly observations on the sentiment associated with each of the nine n-grams and a total of 1242 observations.
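The sample dimensions stated above can be checked with simple date arithmetic. The helper names are illustrative; the six periods dropped in the last step are the adjustments the paper reports in its results section (one for first differencing, two for lags in the preliminary VAR, and three for the fixed effect and lags in the panel VAR).

```python
from datetime import date

def months_inclusive(start: date, end: date) -> int:
    """Number of calendar months from start to end, counting both ends."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

months = months_inclusive(date(2009, 6, 1), date(2020, 11, 1))
n_grams = 9
total_obs = months * n_grams

# Periods lost to differencing, preliminary VAR lags, and panel VAR setup.
periods_lost = 1 + 2 + 3
usable_obs = (months - periods_lost) * n_grams

print(months, total_obs, usable_obs)  # 138 1242 1188
```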

Descriptive Information
As described above, we began with n-gram counts from the Internet Archive. Figure 1 provides graphs of the observed frequencies for each of the nine n-grams over the June 2009-November 2020 sample period. We observed variation over time in the n-gram counts, but no obvious time series patterns, with the possible exception of the rent n-gram, which seems to indicate increased frequency over time. In Panel A of Table 1, we provide summary information for each of the nine n-grams over our sample period. The table reports the mean, median, standard deviation, minimum, and maximum across sample months for the nine n-gram counts. The smallest frequency observed at the monthly observation window is 5 and the largest is 3271. There was quite a lot of variability in the frequencies for each n-gram in the TV News Archive, but the n-grams were mentioned quite consistently, with monthly averages ranging from 104 to 1325.
Table 1. Summary information for each n-gram for the June 2009-November 2020 sample period. The sample includes 138 observations for each of the nine n-grams. The table reports the mean, median, standard deviation, minimum, and maximum across sample months for the n-gram counts (Panel A) and sentiment (Panel B).
Next, we turned to the sentiment associated with each n-gram. Figure 2 provides graphs of the measured n-gram sentiment for each month in our sample. We observed that the sentiment measures for the n-grams associated with homeownership are typically negative. As with the counts, the graphs indicate significant variability over time in sentiment. Panel B of Table 1 reports the mean, median, standard deviation, minimum, and maximum across sample months for n-gram sentiment. Since the Great Recession, the sentiment associated with eight of the nine n-grams has been negative, on average.
Notably, the sentiment associated with Eviction, Home Price, Housing Market, and Rent was always negative, whereas the American Dream is the single n-gram with positive average sentiment. After the Great Recession, Americans continued to feel the pinch on their housing budgets, yet they remained hopeful about the dream of attaining the American ideal. Figure 3 shows the Case-Shiller national home price index for our sample period, June 2009 through November 2020. Though prices continued to fall at the beginning of our sample, which begins just after the Great Recession, we saw a quick rebound and strong upward trend. In the next section we provide estimates of a panel vector autoregression to provide insight into the relationship between American home prices and sentiment.

Panel Vector Autoregressions
Before conducting the panel VAR analysis, some preliminaries related to model selection must be addressed, as noted by Holtz-Eakin et al. (1988) and Abrigo and Love (2016). As discussed earlier in this paper, the sentiment variable measures the negativity and positivity associated with an n-gram in a passage of text. We argue that this is a better measure of the impact of a narrative, as compared to n-gram counts, because sentiment reflects the tone of the passage. However, to confirm that sentiment is a better measure of what moves markets, we examined the panel VAR specification with home price and n-gram counts. This two-equation system was unstable, and estimates suggested poor fit. First, we confirmed that standard panel unit root tests for our home price series (Pt) and average monthly sentiment (Si,t) reject a unit root at any conventional level of significance. Second, we conducted tests to determine the appropriate lag lengths, i.e., the value of m in Equations (1) and (2). We selected two lags (m = 2) based on Akaike's final prediction error (FPE), Akaike's information criterion (AIC), and the Hannan and Quinn information criterion (HQIC). Third, we conducted Granger causality tests of exclusion restrictions. The results indicate that both price and sentiment are useful in predicting the other variable at p < 0.03. Table 2 reports coefficient estimates for the price Equation (1) and sentiment Equation (2). Each VAR included two lags of each variable and was estimated with nine panels and 1188 observations after adjustment for lags. Standard errors and two-sided p-values appear to the right of each estimated coefficient. For the price equation, all variables were significant at p < 0.001. For the sentiment equations, the first lag of home prices (Pt-1) was not significant, but the other three variables were significant at p < 0.03. Insight into the economic significance of these estimates is provided by the variance decomposition and impulse response functions, to which we now turn.
Table 2. Panel VAR results with coefficient estimates for the price Equation (1) and sentiment Equation (2). The panel VAR was estimated with nine panels and 1188 observations, after adjustment for lag length. Though we began with 1242 observations (9 n-grams × 138 time periods), we lost one observation in the time dimension due to first differencing of home prices (Pt), two because of the lags in the preliminary VAR, and three more because of the fixed effect and lags in the PVAR. Hence, our final number of observations is 1188 (9 × 132 = 1188). Standard errors and two-sided p-values appear to the right of each estimated coefficient in the table.
After estimation of the panel VAR, we used a forecast error variance decomposition to provide insight into the amount of information each of the two variables contributed to the other in the autoregression. The decomposition allowed us to determine how much of the forecast error variance of price and sentiment can be explained by exogenous shocks to both price and sentiment. Table 3 reports the response of home price and sentiment to a shock for the subsequent 12 months. We observed that a shock to sentiment drove home price with increasing importance over an extended horizon, whereas an own shock to price had declining importance in explaining home price. In contrast, a shock to the price had a small impact on sentiment and its impact remained small over time. A shock to sentiment was the primary driver of home prices. This finding directly supports and provides a quantitative measure of Shiller's (2019) assertion that narratives drive major economic events.
Table 3. Response of price and sentiment to a shock provided by both price and sentiment.
To further examine how home prices and sentiment evolve over time, we provide impulse response functions in Figure 4. The impulse response functions showed 95% confidence intervals for the impact of a shock in the first variable on the response of the second over a twelve-month time horizon. In the top right figure, it is clear that a shock to homeownership sentiment has a lasting positive impact on home price. In contrast, the impulse response function on the bottom left suggests that a shock to home price has a very small impact on homeownership sentiment, and one that is not significantly different from zero. As with the variance decomposition discussed previously, these results are consistent with the view that a homeownership narrative, as measured by sentiment, drives home prices.
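The mechanics behind the variance decomposition and the impulse response functions can be illustrated with a small, self-contained sketch for a bivariate VAR with two lags. Everything numeric here is hypothetical: the coefficient matrices and residual covariance are invented rather than taken from Table 2, the variables are ordered (price, sentiment) for the Cholesky factorization as an assumption, and confidence intervals are omitted. It is a sketch of the standard textbook calculation, not the Abrigo and Love (2016) routines.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b):
    """Sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def cholesky2(s):
    """Lower-triangular Cholesky factor of a 2x2 covariance matrix."""
    l11 = math.sqrt(s[0][0])
    l21 = s[1][0] / l11
    l22 = math.sqrt(s[1][1] - l21 ** 2)
    return [[l11, 0.0], [l21, l22]]

def orthogonal_irf(a1, a2, sigma, horizon):
    """Return theta[h][i][j]: response of variable i, h periods after a
    one-standard-deviation orthogonalized shock to variable j."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zero = [[0.0, 0.0], [0.0, 0.0]]
    phi = [identity]  # moving-average coefficients, phi[0] = I
    for h in range(1, horizon):
        prev2 = phi[h - 2] if h >= 2 else zero
        phi.append(matadd(matmul(a1, phi[h - 1]), matmul(a2, prev2)))
    p = cholesky2(sigma)
    return [matmul(ph, p) for ph in phi]

def fevd(theta):
    """Share of each variable's forecast error variance attributable to
    each shock, cumulated over the horizons contained in theta."""
    shares = []
    for i in range(2):
        contrib = [sum(th[i][j] ** 2 for th in theta) for j in range(2)]
        total = sum(contrib)
        shares.append([c / total for c in contrib])
    return shares

# Hypothetical lag matrices; rows and columns are ordered (price, sentiment).
a1 = [[0.30, 0.40], [0.02, 0.50]]
a2 = [[0.10, 0.20], [0.01, 0.10]]
sigma = [[1.0, 0.2], [0.2, 1.0]]  # hypothetical residual covariance

theta = orthogonal_irf(a1, a2, sigma, horizon=12)
shares = fevd(theta)
print("price variance shares (price shock, sentiment shock):", shares[0])
print("sentiment variance shares (price shock, sentiment shock):", shares[1])
```

Because the Cholesky factor is lower triangular, the first-ordered variable responds on impact only to its own shock, so the decomposition depends on the ordering; reversing the order is a common robustness check.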

Discussion and Concluding Remarks
This paper reports the results of a quantitative examination of the relationship between the American housing narrative and the run-up in home prices experienced since 2010 in the United States. In order to measure what matters to people and moves markets, we used narrative economics, a field that studies the spread of stories to explain economic fluctuations. Stories are difficult to quantify and, as a result, few empirical studies have used narratives to understand asset market fluctuations. We relied on a natural language processing (NLP) framework to measure sentiment from text, an approach that incorporates the emotional tone of a word or phrase. We then used a panel vector autoregression to empirically model the relationship between home prices and homeownership sentiment in the United States. We found that sentiment related to the American homeownership narrative is an important factor in explaining movements in home prices, even after taking into account the economic factors typically thought to explain home price fluctuations. Financial economists too often exclude storytelling from their mathematical models, which misses many of the important lines of a story, such as the role of the homeownership narrative. Our hope is that future research will bring the story back.
We started this paper by asking whether it was too much to dream the American Dream of homeownership. Young Americans may be the first generation to be worse off than their parents, in which case homeownership may truly be an impossible dream (Leatherby 2017). As we saw earlier in George Bailey's dialogue, the challenge of attaining the American dream is not a new theme. In his classic book, which details the evolution of the nation, Adams (1931) expresses concern about the condition of the United States, in part, because "we forgot to live, in the struggle to 'make a living'" (Adams 1931, p. 319). If the commodification of housing continues, Americans, young and old, low- and middle-income, will lose their safety net and find it more difficult to find a place "to live" as home prices and rents are pushed higher and higher.
We encourage future researchers to consider a narrative approach. The goal of this paper is to provide insight into how the story of homeownership in the United States has impacted American home prices. We found strong evidence that is consistent with the notion that the American homeownership narrative drives movements in home prices. However, many questions remain. For example, the real estate market in the United States may be segmented, and it is possible that the narrative varies across regions of this large and diverse nation. Clearly, in recent years the nation has become more divided, and political posturing has led to increasing polarization among citizens. In addition to the homeownership narrative, other narratives are prevalent and worthy of investigation. For example, nationalist narratives arising around the world lead to potentially isolationist policies such as Brexit. A narrative economics approach has the potential to provide valuable insight into economic growth and prosperity.
Author Contributions: The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Coles College of Business.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The data are available upon request from SM.