Information Spillover Effects of Real Estate Markets: Evidence from Ten Metropolitan Cities in China

Abstract: With the rapid development of information and communication technology and the Internet, information spillover between cities in real estate markets is becoming more frequent, and its influence is becoming more prominent. However, research on information spillover between cities is still relatively insufficient. To address this gap, this paper builds a research framework on the information conduction effect in the real estate markets of 10 Chinese cities using Baidu search data, text mining and principal component analysis, and empirically analyzes the information interaction and dynamic influence of the real estate markets in each city using a vector autoregressive model. The results show that the information interaction among the real estate markets has a network pattern and that there is a significant two-way information spillover effect between most cities. When the “information distance” between cities is shorter, the information interaction between their markets is closer and it is easier for the cities to influence each other. The results help to explain the information spillover mechanism behind house price spillover and to improve the ability to predict and analyze the information spillover process in real estate markets.


Introduction
In the two decades since the beginning of the new century, the boom and bust of real estate markets, especially China's, has aroused heated discussion in academic circles. Since the implementation of the monetary reform of housing allocation in 1998, China's real estate market has experienced a period of rapid development. While housing prices have been rising all over the world, there are spatial linkages and mutual influences between housing prices in different regions, a phenomenon known as housing price spillover (Chiang 2014; Yang et al. 2018; Tsai and Chiang 2019). The relationship of housing price spillover between cities has broken through the simple geographic "nearest neighbor" effect and presents a complex, multi-threaded network structure (Luo et al. 2007; Liu et al. 2008). Because of the spillover effect, housing price fluctuations depend not only on local factors but also on factors in other regions, which poses a new challenge to the analysis and prediction of price trends and interactions in the real estate market. Therefore, clarifying the causes and mechanisms of housing price spillover is of great significance for revealing its characteristics and laws, for understanding the relationships among the real estate markets of different cities and for preventing and resolving systemic risks in the real estate market.
Although many studies have tested the house price spillover effect (Tsai 2015; Teye et al. 2017; Mohammad et al. 2020), relatively few have examined the causes and influencing factors of housing price spillover. Previous studies have mainly argued from the two paths of population mobility and capital transfer (Meen 1999; Gong et al. 2016; Teng et al. 2017; Hong 2020). With the rapid development of information and communication technology (ICT) and the Internet, the impact of information spillover on house price spillovers across regions has become increasingly significant (Oikarinen 2004; Lv and Liu 2019). This is mainly because information spillover is characterized by low cost, high speed and wide scope, which makes it easier to break through the objective constraints on population mobility and capital transfer and more likely to have a significant impact on the house price spillover effect across cities. However, there is still a relative lack of research on information spillover between real estate markets. The reason is that, unlike tangible population flows and capital transfers, information spillover is a hidden mechanism of house price spillover, the intensity of which is difficult to observe directly. Therefore, the key to solving this problem is to find effective indicators that measure changes in the information of urban real estate markets.
J. Risk Financial Manag. 2021, 14, 244
In recent years, big data information, represented by online search data, has been gradually applied to the study of real estate economics, providing new ideas for exploring the information transfer between different urban real estate markets. More and more studies are focusing on the behavior of real estate market participants with the help of the public's web search data (Wang and Wei 2015; Zhang and Tang 2016; Huang et al. 2019; Gargano et al. 2020; Tian and Zhong 2020). These studies have shown that web search data can reflect changes in public attention to real estate market information and the motivations behind search behavior, and can therefore be used to analyze the impact of fluctuations in public attention on real estate market prices. Drawing on this approach to measuring public attention, we use Internet search data to construct an information attention index. By analyzing the patterns of change in information search data in each city, we examine the spillover effect of information attention between urban real estate markets and study the sources and paths of information spillover between cities.
In view of the limitations of the above-mentioned studies on the information spillover mechanism of house price spillover, this paper uses Internet search data to construct an information concern index and then investigates the information spillover effect of real estate market information among cities. The research objectives are (1) to construct a research framework for analyzing the information spillover effect in real estate markets; (2) to test the correlation and causality between the information concern levels in different cities; and (3) to establish a dynamic model for analyzing the information spillover effect in real estate markets among cities.
The main research questions of this paper are: (1) How can we characterize the degree of public interest in real estate market information? (2) How can we test the interaction of information spillover between real estate markets? (3) What kind of information spillover effects exist in different cities' real estate markets? (4) What is the role of each city's real estate market in the information spillover process? The answers to these questions can help us fully understand and grasp the characteristics and laws of information spillover between real estate markets, thus providing a reference for establishing a long-term mechanism for the smooth and healthy development of real estate markets with emphasis on information regulation and guidance.
The remaining sections of this paper are arranged as follows: Section 2 describes the main research methods, mainly including the principal component analysis-based method for constructing the real estate market information concern index and the VAR model for analyzing the interaction of information spillover between real estate markets. Section 3 describes the data sources and preprocessing process as well as the basic data characterization analysis. Section 4 analyzes and discusses the information spillover effects and interaction mechanisms between real estate markets in different cities. Section 5 draws conclusions and makes policy recommendations.

Research Methodology
This paper uses text mining and principal component analysis to construct a real estate market information concern index and then builds a vector autoregressive model to analyze the effect of information spillover between real estate markets.

Text Mining
Text mining techniques were used in this paper to identify and filter the search keywords that the public uses frequently when searching for real estate market information. Text mining, also known as text data mining or text knowledge discovery, is a computer processing technique for extracting valuable information and knowledge from text documents. After the concept of text mining was formally proposed, research on text mining models, text feature extraction models, text representation models and pattern discovery (e.g., association rule extraction, text classification and text clustering) developed rapidly (Hu and Zhang 2013). Text mining can generally be divided into four parts: text acquisition, preprocessing, feature extraction and algorithm mining. The general process of text mining is shown in Figure 1.
(1) Text Acquisition. Text acquisition relies mainly on automated web crawlers, which automatically extract the content of specific pages on the Internet according to pre-defined rules and algorithms (Meng et al. 2016). This paper uses Python (Python Software Foundation 2018) to write a crawler program in PyCharm that fetches the text content and web search index of a particular web page.
(2) Text Preprocessing. Text preprocessing mainly includes three steps: word segmentation, stop-word removal and part-of-speech tagging. First, through segmentation, the original complete text is divided into single words according to semantics and punctuation; second, through stop-word removal, function words and other meaningless words in the text are deleted to reduce the data dimension; finally, words of different categories are tagged with their part of speech to facilitate subsequent processing.
(3) Feature Extraction. Feature extraction is the construction of a subset of significant variables from the original feature set. The remaining variables that are irrelevant or redundant are excluded, leaving only the keywords that are closely related to the real estate market, in order to improve the efficiency and accuracy of the text mining work.
(4) Algorithm Mining. After the above steps are completed, the original text is transformed into a phrase sequence containing only key semantic information. Compared with the original text, the number of words in the phrase sequence is greatly reduced, so it can be expressed in the form of vectors or matrices of lower dimensionality. Based on the obtained phrase matrices, the algorithm mining process can perform efficient analysis and computation according to the objectives of the text mining task.
The text mining method can effectively and accurately identify keywords with a high search frequency. The search frequency of these keywords is used to characterize market participants' attention to real estate market information, and a real estate market information attention index is built on this basis to study information spillover among real estate markets.
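The preprocessing and feature extraction steps described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the stop-word list, the sample queries and the keyword vocabulary are invented for the example, and simple regular-expression tokenization of English text stands in for the Chinese word segmentation that the actual data would require.

```python
import re
from collections import Counter

# Hypothetical stop-word list, invented for illustration only.
STOP_WORDS = {"the", "a", "in", "of", "for", "is", "how", "to", "and"}

def preprocess(text):
    """Split text into lowercase word tokens and drop stop words."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def keyword_frequencies(documents, vocabulary):
    """Count how often each vocabulary keyword occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in preprocess(doc) if t in vocabulary)
    return counts

# Made-up search queries and a made-up real-estate vocabulary.
queries = [
    "How is the house price trend in Beijing",
    "mortgage rate for a second-hand house",
    "house price and mortgage policy",
]
vocabulary = {"house", "price", "mortgage", "second-hand"}
freq = keyword_frequencies(queries, vocabulary)
print(freq.most_common())
```

Restricting the counts to a fixed vocabulary plays the role of feature extraction here: irrelevant tokens are discarded and only market-related keywords are kept for later analysis.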

Principal Component Analysis
Principal component analysis (PCA) was used in this paper to extract the main components of the web search time series of different keywords and to construct a comprehensive index that measures information attention in the real estate market of each city. PCA is a statistical analysis method that recombines a number of correlated indicators, extracts the important dimensions and forms a new set of uncorrelated (orthogonal) composite indicators to replace the original indicators, thereby eliminating redundant information. It has been used in many fields, such as image processing, dimensionality reduction and stock market forecasting (Wang and Zhu 2015).
In this paper, a mean-centered two-dimensional feature matrix $X$ with rows $(a_i, b_i)$, $i = 1, \ldots, m$, is used as an example to illustrate the procedure of principal component analysis, where $a_i$ and $b_i$ are different feature dimensions. First, the covariance matrix $\mathrm{Cov}(X)$ of the feature matrix $X$ is computed by Equation (1).
\[
\mathrm{Cov}(X) = \frac{1}{m} X^{\top} X =
\begin{pmatrix}
\frac{1}{m}\sum_{i=1}^{m} a_i^{2} & \frac{1}{m}\sum_{i=1}^{m} a_i b_i \\
\frac{1}{m}\sum_{i=1}^{m} a_i b_i & \frac{1}{m}\sum_{i=1}^{m} b_i^{2}
\end{pmatrix}
\tag{1}
\]
In Equation (1), the diagonal entries $\frac{1}{m}\sum_{i=1}^{m} a_i^{2}$ and $\frac{1}{m}\sum_{i=1}^{m} b_i^{2}$ are the variances of the features, and the off-diagonal entry $\frac{1}{m}\sum_{i=1}^{m} a_i b_i$ is the covariance between the features, the magnitude of which reflects the correlation between them. Second, to obtain the eigenvectors, the covariance matrix $\mathrm{Cov}(X)$ is diagonalized, that is, a basis is sought in which the covariances (the off-diagonal entries in Equation (1)) are equal to 0. The solution yields a set of eigenvectors, which are arranged according to their corresponding eigenvalues $\lambda$ from largest to smallest, and the first $k$ eigenvectors ($k < n$, where $n$ is the number of feature dimensions) are taken to obtain the spatial mapping matrix $P$. With the eigenvalues ordered as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, the proportion $\eta$ of the original data information retained in the first $k$ dimensions can be obtained by Equation (2).
\[
\eta = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{n} \lambda_i}
\tag{2}
\]
Finally, using the calculated matrix $P$, $X$ is mapped to the new space by $Y = XP$, which gives the principal component matrix $Y$, whose columns are orthogonal (uncorrelated) features. This process discards the low-information dimensions of high-dimensional features, which reduces feature correlation and eliminates information redundancy, so that the resulting matrix retains the main information of the data.
Principal component analysis can synthesize numerous keyword search indexes into a small number of principal components to more concisely and accurately characterize market participants' information concerns and improve the utilization of information.
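The procedure above can be sketched numerically as follows. This is a minimal illustration, not the paper's SPSS workflow: it assumes the 1/m covariance convention, uses NumPy's symmetric eigendecomposition, and runs on synthetic two-feature data generated only for the example.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the first k principal components."""
    Xc = X - X.mean(axis=0)                   # mean-center each feature
    cov = Xc.T @ Xc / Xc.shape[0]             # covariance matrix (1/m convention)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    P = eigvecs[:, :k]                        # mapping matrix: first k eigenvectors
    eta = eigvals[:k].sum() / eigvals.sum()   # share of variance retained
    return Xc @ P, P, eta

# Synthetic data: two strongly correlated features (illustration only).
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 0.8 * a + 0.2 * rng.normal(size=200)
X = np.column_stack([a, b])

Y, P, eta = pca(X, k=1)
print(Y.shape, round(float(eta), 3))
```

Because the two synthetic features are highly correlated, a single component retains most of the variance, which is the situation in which replacing many keyword series with a few components is attractive.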

Vector Autoregressive Model (VAR)
To investigate the dynamic interaction of the information attention index among different cities' real estate markets and to measure the conduction effect of information in each city's real estate market, this paper establishes a vector autoregressive (VAR) model with impulse response analysis and variance decomposition. The model is used to examine the conduction process and dynamic influence of information among cities in the information conduction network and to identify the information sources in that network. The VAR model is a common statistical technique for modeling and forecasting multivariate time series and can be used to examine the dynamic interaction between multiple time series. Its mathematical expression is shown in Equation (3).
\[
y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + B x_t + \varepsilon_t, \qquad t = 1, 2, \ldots, T
\tag{3}
\]
In Equation (3), $y_t$ is an $m$-dimensional vector of endogenous variables, $x_t$ is an $n$-dimensional vector of exogenous variables, $p$ is the lag order and $T$ is the number of samples. The $m \times m$ matrices $A_1, \ldots, A_p$ and the $m \times n$ matrix $B$ are the coefficient matrices to be estimated. $\varepsilon_t$ is an $m$-dimensional vector of perturbations, which can be contemporaneously correlated with each other but not with their own lagged values or with the variables on the right side of the equation.
In this study, the constant term was used as the exogenous variable of the model, and the information attention time series of the cities' real estate markets were used as the endogenous variables $y_t$ to build the VAR model. Provided that the VAR model is stable, impulse response analysis and variance decomposition were used to examine the dynamic impact of random perturbations on the variable system and thus the interaction and dynamic impact between any two information attention index time series.
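As a rough illustration of how such a model can be estimated and used, the sketch below fits a VAR with a constant by ordinary least squares and computes simple (non-orthogonalized) impulse responses. The function names, the two-variable simulated system and its coefficients are assumptions made for the example; the paper itself reports using EViews, and its impulse responses may be orthogonalized differently.

```python
import numpy as np

def fit_var(Y, p):
    """OLS estimate of a VAR(p) with a constant term; Y has shape (T, m)."""
    T, m = Y.shape
    # Regressor row for y_t: [1, y_{t-1}, ..., y_{t-p}]
    Z = np.array([np.concatenate([[1.0]] + [Y[t - j] for j in range(1, p + 1)])
                  for t in range(p, T)])
    coef, *_ = np.linalg.lstsq(Z, Y[p:], rcond=None)
    const = coef[0]
    # Transpose each block so that y_t = A_j @ y_{t-j} in column-vector form.
    A = [coef[1 + j * m: 1 + (j + 1) * m].T for j in range(p)]
    return const, A

def impulse_responses(A, horizon):
    """Phi[h][i, j]: response of variable i after h periods to a unit shock in j."""
    m = A[0].shape[0]
    Phi = [np.eye(m)]
    for h in range(1, horizon + 1):
        Phi.append(sum(Phi[h - j] @ A[j - 1]
                       for j in range(1, min(h, len(A)) + 1)))
    return Phi

# Simulated system (illustration only): variable 2 spills over to variable 1,
# but not the other way round.
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.2], [0.0, 0.4]])
Y = np.zeros((2000, 2))
for t in range(1, 2000):
    Y[t] = A_true @ Y[t - 1] + rng.normal(size=2)

const, A_hat = fit_var(Y, p=1)
Phi = impulse_responses(A_hat, horizon=5)
print(np.round(A_hat[0], 2))
```

In the simulated system the off-diagonal coefficient in the first row is non-zero while the one in the second row is zero, so a shock to the second variable propagates to the first but not in reverse. This is the kind of one-way spillover that impulse response analysis is meant to reveal.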

Data Sources
In this article, Baidu Index, a data sharing platform of Baidu, was chosen as the data source. According to Statcounter (a website traffic monitoring agency that provides various types of statistical reports and website traffic statistics), Baidu search has held a leading share of the search engine market in China, so its data basically reflect the public's searches for information related to the real estate market in each city. In terms of the choice of cities, in some smaller or less economically developed cities people may obtain real estate information mainly through advertisements, friends or real estate agents, and searching for real estate market information through the Internet is relatively rare (Dong et al. 2014). Therefore, we selected 10 cities that are larger in scale, more economically developed and more populous, with relatively active real estate transactions, and used the Baidu search index of their real estate market information as the data source. The selected cities are Beijing (BJ), Tianjin (TJ), Shanghai (SH), Nanjing (NJ), Hangzhou (HZ), Chongqing (CQ), Chengdu (CD), Guangzhou (GZ), Shenzhen (SZ) and Wuhan (WH). These cities are first-tier or new first-tier cities, have populations of nearly or more than 10 million, have strong economic power and are distributed in all regions of China; they are therefore representative for studying information spillover in real estate markets. The basic characteristics of each city are shown in Table 1.

Data Collection
The search index of these keywords reflects the frequency of retrieval and the degree of attention paid by the public to different information, which is an important basis for measuring the degree of attention paid to real estate market information. Therefore, it is extremely important to identify and screen the keywords commonly used by the public. Most existing studies selected keywords manually or through the "demand graph", "source-related words" and "destination-related words" of Baidu Index (Hong and Li 2015; Jiang et al. 2016; Wang et al. 2019). However, this approach has the following problems: the terms are mostly technical and do not cover colloquial search terms; they are mainly single basic terms that do not account for synonyms and related terms; and long-tail keywords are rarely selected to analyze the search situation (Sun and Lv 2011). As a result, the acquired search data may not fully and accurately quantify the information concern, which has a great impact on the accuracy and precision of the research results. To overcome these problems, this paper improves the keyword selection method, as shown in Figure 2. Firstly, the keywords used in previous studies were counted in terms of word frequency, and the high-frequency keywords were screened to determine "house price", "new house", "second-hand house", "second-hand house", "property", "provident fund", "mortgage", "property tax", "property", "down payment" and "mortgage" as initial keywords. Then, with these initial keywords as a starting point, we used the keyword mining technology of the 5118 big data analysis website (5118 is a data platform for search engine optimization (SEO) and big data mining that provides keyword mining, an industry thesaurus, site group weight monitoring, keyword ranking monitoring and other services) to mine the long-tail keywords and related keywords of each initial keyword, forming 10 groups of combination keywords. The 2 groups whose average monthly total search volume in the 10 cities was less than 3000, "down payment" and "mortgage", were eliminated, leaving the final 8 groups of combination keywords, from which the real estate market information search keyword database was built, as shown in Table 2.
The above keywords basically cover the common keywords used by the public when searching for information on the real estate market and the main aspects that people consider in their housing purchase decisions, involving both macro-policies and information on market supply and demand. Specifically, the keyword database includes the macroeconomic situation and the overall trend of the real estate market, policies closely related to the real estate market and various types of information directly related to the housing itself and to transaction details.
Based on the 24 real estate information search keywords obtained from the above mining, a Baidu Index crawler program was written in Python 3.7 in PyCharm to collect monthly average search data for the 10 cities from January 2011 to October 2019 (106 months), which yielded 25,440 observations in total.
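The sample size quoted above follows directly from the number of keywords, months and cities, and a short check makes the arithmetic explicit. The helper name `months_between` is introduced here only for illustration.

```python
from datetime import date

def months_between(start, end):
    """Number of calendar months from start to end, both inclusive."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

n_months = months_between(date(2011, 1, 1), date(2019, 10, 1))
n_keywords, n_cities = 24, 10
total = n_keywords * n_months * n_cities
print(n_months, total)
```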

Construction of the Real Estate Market Information Concern Index
This paper mainly synthesized the time series of the real estate market information concern index by principal component analysis. First of all, the Baidu Index of the combined keywords was standardized, the standardized data were subjected to principal component analysis and then the principal components were weighted on average to obtain the real estate market information attention index of each city. The specific process was carried out in the following three steps.
Firstly, the time series of the monthly average search index for eight groups of keywords in each city were obtained by combining keywords according to their semantic relevance: TSHP, TSNH, TSSH, TSPR, TSPF, TSMO, TSPT and TSRE. The time series data were standardized with SPSS Statistics 21 (SPSS, Inc. 2014) to reduce the error due to differences in data magnitude. According to the results of the KMO and Bartlett tests (as shown in Table 3), the KMO value is 0.849, which exceeds the critical value of 0.8, and the Bartlett test result is significant; both indicate that the combination keyword search index time series obtained in this paper are suitable for principal component analysis.
Secondly, factor analysis was conducted on the monthly average search index of the eight groups of keywords to obtain the component score coefficient matrix. The principal components whose explained variance is greater than 0.1 were extracted, and their time series were computed from the corresponding score coefficients by Formula (4).
\[
P_n = p_{n1}\, TS_{HP} + p_{n2}\, TS_{NH} + \cdots + p_{n8}\, TS_{RE}
\tag{4}
\]
Finally, the principal components with explained variance greater than 0.1 were combined in a weighted average, with the weight $w_n$ being the share of the explained variance of each principal component in the total explained variance, to construct the time series of the real estate market information concern index for each city.
\[
TS_{city} = w_1 P_1 + w_2 P_2 + \cdots + w_n P_n
\tag{5}
\]
Through the above analysis, the information concern index time series of the 10 cities' real estate markets can be obtained: TSBJ, TSCD, TSCQ, TSGZ, TSHZ, TSNJ, TSSH, TSSZ, TSTJ and TSWH, which can be used to characterize the public's concern for information in each city's real estate market. The time series characteristics of the information concern index were then analyzed to explore the information spillover effect between real estate markets.
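The three steps can be sketched as below for a single city. This is an approximation under stated assumptions rather than a reproduction of the SPSS output: components are computed from unit-length eigenvectors of the correlation matrix instead of SPSS score coefficients, the weights are normalized over the retained components, the sign of each component is fixed so that its loadings sum to a positive value, and the eight input series are simulated.

```python
import numpy as np

def concern_index(TS, threshold=0.1):
    """Build a composite index from a (T, 8) array of keyword-group series."""
    Z = (TS - TS.mean(axis=0)) / TS.std(axis=0)     # step 1: standardize
    corr = Z.T @ Z / Z.shape[0]                      # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]                # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; make each column's loadings sum positive.
    eigvecs = eigvecs * np.where(eigvecs.sum(axis=0) < 0, -1.0, 1.0)
    ratio = eigvals / eigvals.sum()                  # explained-variance shares
    keep = ratio > threshold                         # step 2: keep shares > 0.1
    P = Z @ eigvecs[:, keep]                         # component time series
    w = ratio[keep] / ratio[keep].sum()              # step 3: weights sum to 1
    return P @ w, w

# Simulated keyword-group series driven by one common factor (illustration only).
rng = np.random.default_rng(2)
factor = np.cumsum(rng.normal(size=106))
TS = np.column_stack([100 + 10 * factor + 3 * rng.normal(size=106)
                      for _ in range(8)])

index, w = concern_index(TS)
print(index.shape, np.round(w, 3))
```

Applying the same function to each city's eight series would give one index series per city, which is the input required by the later correlation, causality and VAR analyses.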

Descriptive Statistics and Granger Causality Test
To reflect the features and patterns of the information concern indexes derived above for the real estate markets of the 10 cities, we conducted a descriptive statistical analysis of the data; the results are shown in Table 4. Overall, the average value of the information concern index is high in Beijing, Shanghai and Hangzhou and low in Guangzhou, Chengdu and Nanjing. The standard deviation of the index is large in Beijing, Chongqing, Hangzhou and Shanghai, indicating a high degree of volatility. Except for Tianjin, whose distribution peaks to the left, the information concern index series of the other cities are negatively skewed, indicating that the distributions in most cities peak to the right. The kurtosis of the information concern index series in all cities is lower than that of the normal distribution, indicating that the distributions are relatively flat, with no sharp short-term fluctuations and good stability. The time series of the information concern index were then plotted using the econometric analysis software EViews 10.0 (2001), and the results are shown in Figure 3. The figure shows an obvious correlation between the information concern indexes of the cities' real estate markets.
To further determine the correlation between the information concern index time series, the Pearson correlation test was conducted. At the 1% significance level, the information concern indexes of the real estate markets of any two cities are significantly correlated. This indicates a strong correlation between the levels of information concern in the cities' real estate markets, which may influence each other, so a VAR model can be established to explore their dynamic interaction.
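The descriptive statistics and the Pearson correlation discussed above can be computed as in the sketch below. It assumes population (1/m) moments and reports kurtosis on the scale where the normal distribution equals 3; the two "city" series are simulated for illustration and are not the paper's data.

```python
import numpy as np

def describe(x):
    """Mean, standard deviation, skewness and kurtosis (normal = 3) of a series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    s = np.sqrt((d ** 2).mean())
    return {
        "mean": x.mean(),
        "std": s,
        "skewness": (d ** 3).mean() / s ** 3,   # negative: longer left tail
        "kurtosis": (d ** 4).mean() / s ** 4,   # below 3: flatter than normal
    }

def pearson(x, y):
    """Pearson correlation coefficient between two series."""
    return float(np.corrcoef(x, y)[0, 1])

# Two simulated index series sharing a common component (illustration only).
rng = np.random.default_rng(3)
common = rng.normal(size=106)
city_a = common + 0.3 * rng.normal(size=106)
city_b = common + 0.3 * rng.normal(size=106)

stats_a = describe(city_a)
r_ab = pearson(city_a, city_b)
print({k: round(float(v), 3) for k, v in stats_a.items()}, round(r_ab, 3))
```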
According to Figure 3, the information concern index time series of the cities' real estate markets also follow a generally consistent trend. The index of each city was relatively stable from January 2011 to April 2013, rose rapidly from May 2013 until it peaked in February-March 2017 and then gradually declined, so the overall trend was first rising and then falling. Over the whole period, the fluctuations of each city's index are pronounced and show a certain regularity: over the years, the index falls slightly in December, rises in February-March of the following year and falls again after reaching a maximum, showing the characteristics of cyclical changes.
In addition, each city's real estate market information concern index also has certain leading-lagging characteristics. For example, the trends of the information concern indexes of the real estate markets of Beijing and Shanghai, Guangzhou and Shenzhen and Chengdu and Chongqing are highly consistent, but they are not completely consistent in time, which is reflected in the fact that one city's real estate market information concern index changes first and the other city follows closely, with a certain lag. This lead-lag relationship will be further verified in the subsequent empirical analysis.
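The lead-lag pattern described above can be illustrated with a lagged Pearson correlation. The following is only a minimal sketch under stated assumptions: the function names and the two toy series are hypothetical and are not data or code from this paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lagged_correlation(leader, follower, lag):
    """Correlation between leader[t] and follower[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])

def best_lag(leader, follower, max_lag):
    """Lag in 0..max_lag at which the lagged correlation is largest."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(leader, follower, k))

# Hypothetical toy series (not from the paper): city_b repeats city_a one period later.
city_a = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0]
city_b = [0.0] + city_a[:-1]

lag = best_lag(city_a, city_b, max_lag=3)
```

A lag greater than zero with a high lagged correlation is consistent with one index moving first and the other following, but it does not by itself establish causality, which is why a formal test is still needed.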
Considering the lead-lag relationship presented in Figure 3, and in order to determine whether there is an information spillover relationship among the cities' real estate markets, we conducted Granger causality tests on the time series of the information concern index of the 10 cities and compared the results under different lag lengths. According to the test results, the information conduction relationship among the cities' real estate markets is closest when the lag length is 2. The 0-1 matrix of the information conduction relationship under this condition is shown in Table 5, where the vertical axis represents the Granger "cause" and the horizontal axis represents the Granger "effect". A "1" indicates a significant information conduction relationship between the real estate markets of two cities, whereas a "0" indicates that there is no significant information conduction relationship between them.
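How pairwise test results translate into a 0-1 matrix can be sketched as follows. This is an illustrative sketch only: the city labels, the p-values and the 5% threshold `alpha` are hypothetical assumptions (the text does not state the significance level used), and reading the vertical axis as matrix rows is also an assumption.

```python
def causality_matrix(p_values, cities, alpha=0.05):
    """Build a 0-1 matrix: rows are Granger 'causes', columns are 'effects'.

    p_values maps (cause, effect) pairs to the p-value of the test that
    'cause' Granger-causes 'effect'. The diagonal is set to 0.
    """
    return [
        [1 if c != e and p_values[(c, e)] < alpha else 0 for e in cities]
        for c in cities
    ]

def count_outgoing(matrix, cities):
    """Number of cities that each city Granger-causes (row sums)."""
    return {c: sum(row) for c, row in zip(cities, matrix)}

# Hypothetical p-values for three placeholder cities (not results from the paper).
cities = ["A", "B", "C"]
p_values = {
    ("A", "B"): 0.01, ("A", "C"): 0.03,
    ("B", "A"): 0.20, ("B", "C"): 0.04,
    ("C", "A"): 0.60, ("C", "B"): 0.45,
}

matrix = causality_matrix(p_values, cities)
counts = count_outgoing(matrix, cities)
# Cities sorted by the number of significant outgoing links, largest first.
ranking = sorted(cities, key=lambda c: -counts[c])
```

Row sums of such a matrix give the number of significant outgoing causal links per city, which is one simple way to rank cities by how many others they influence.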

Analysis of the Impact and Timeliness of Information Shocks in Real Estate Markets
To check whether the time series of the real estate market information concern index is stationary in the long run, the data were first examined using a unit root test. The test results show that, at the 1% significance level, the time series of the real estate market information concern index obtained in this paper is integrated of order one, i.e., an I(1) series.
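The text does not name the unit root test that was used. As an illustration only, assuming an augmented Dickey-Fuller (ADF) type test, the I(1) finding can be read as follows, where $y_t$ denotes a city's information concern index and $p$ is a lag length chosen by the analyst (both symbols are introduced here for illustration):

```latex
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \ \text{(unit root)}, \quad H_1:\ \gamma < 0 .
```

Under this reading, a series is I(1) when $H_0$ is not rejected for the level $y_t$ but is rejected for the first difference $\Delta y_t = y_t - y_{t-1}$.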
To further investigate the degree of influence and the timeliness of information spillover in the real estate market, impulse response analysis can be performed on the time series of the information concern index. Because the cities occupy different positions and play different roles in the information spillover network, the information concern index variables need to be ordered before the analysis. Compared with other methods, the impulse response function based on Cholesky decomposition is more sensitive to the ordering of the variables and is more suitable for analyzing interactions among variables of different importance. Therefore, the impulse response function based on the Cholesky decomposition was chosen for the analysis in this paper. The Cholesky ordering follows the number of causal relationships found in the Granger causality test, from largest to smallest: Shenzhen, Guangzhou, Chongqing, Beijing, Chengdu, Hangzhou, Wuhan, Shanghai, Tianjin and Nanjing. The results of the impulse response analysis based on this ordering are shown in Figure 4. Since the shock effect curves fluctuate widely in the first three periods and reflect the immediate effects among the real estate markets of different cities, we focus on the changes in the first three periods.
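Why the ordering matters can be seen in a small sketch of Cholesky-orthogonalized impulse responses. The example below assumes a two-variable VAR(1) with made-up coefficients; the model order, the matrices `a1` and `sigma` and all function names are hypothetical and are not the specification estimated in this paper.

```python
from math import sqrt

def cholesky(sigma):
    """Lower-triangular P with P * P^T = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(p[i][k] * p[j][k] for k in range(j))
            if i == j:
                p[i][j] = sqrt(sigma[i][i] - s)
            else:
                p[i][j] = (sigma[i][j] - s) / p[j][j]
    return p

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def orthogonalized_irf(a1, sigma, horizons):
    """Theta_h = A1^h * P for a VAR(1), h = 0..horizons-1.

    Entry [i][j] of Theta_h is the response of variable i, h periods
    after a one-standard-deviation orthogonalized shock to variable j.
    """
    n = len(a1)
    p = cholesky(sigma)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    responses = []
    for _ in range(horizons):
        responses.append(matmul(power, p))
        power = matmul(a1, power)
    return responses

# Hypothetical coefficients: variable 0 is ordered first, variable 1 second.
a1 = [[0.5, 0.0], [0.3, 0.4]]
sigma = [[1.0, 0.5], [0.5, 1.0]]
irf = orthogonalized_irf(a1, sigma, horizons=3)
```

Because the Cholesky factor is lower triangular, the variable ordered first responds on impact only to its own shock, while later variables can respond on impact to shocks from earlier ones. Reordering the variables therefore changes the impact-period responses, which is why the ordering has to be justified.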

Analysis of the Impact of Information Shocks in the Real Estate Market
The real estate markets of the cities can be divided into three categories according to the shock effects obtained from the impulse response analysis of each city, as reflected in Figure 4. The first category consists of cities that have a strong impact on the information concern index of other real estate markets, with an average impact over the first three periods of 20% to 40%, such as Shenzhen. Overall, information shocks from the Shenzhen real estate market have the most significant impact on each city. As shown in Figure 4a,d,f,h, the impact of information spillover from Shenzhen's real estate market on Shenzhen, Beijing, Hangzhou and Shanghai was as high as 80% from the first period onwards, and the impact on several other cities was between 40% and 60%, which suggests that Shenzhen may be a major source of information in the information spillover network.
The second category consists of cities with a medium impact on the information concern index of other real estate markets, with an average impact over the first three periods of 10% to 20%, such as Beijing, Shanghai, Chongqing, Chengdu and Wuhan. The information spillover effect of these cities' real estate markets on other cities is moderate, and the information interaction is relatively close. After the fourth period, the information impact of the real estate markets of Guangzhou and Wuhan gradually declined and finally stabilized at around 20%, and their influence in the middle and late stages is stronger and longer lasting than that of other cities. This indicates that cities such as Guangzhou and Wuhan are not the source of the information spillover effect but may be intermediary cities in the information spillover process.
The third category consists of cities with a weak impact on the information concern index of other real estate markets, with an average impact over the first three periods of less than 10%, as can be seen from Figure 4f,i,j. Hangzhou, Tianjin and Nanjing are more susceptible to information shocks from other cities' real estate markets. From period 1, these cities are affected to different degrees by information shocks from other cities, and as the number of periods increases, the impact of these shocks continues to increase, while their own impact on other cities remains below 10%. This indicates that the changes in these cities' real estate market information concern indexes mainly come from information shocks from other cities, so these cities act as receivers of information from other cities.
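The three-way grouping above can be written as a simple rule on the average response over the first three periods. The sketch below is illustrative only: the function name, the treatment of boundary values and the sample response values are assumptions and are not taken from Figure 4.

```python
def classify_impact(responses, periods=3):
    """Classify a city by its average impact on other cities over the first periods.

    responses: impact shares per period, expressed as fractions (0.20 = 20%).
    Thresholds follow the text: 20% or more is strong, 10% to 20% is medium,
    below 10% is weak. Treating the boundaries as inclusive is an assumption.
    """
    average = sum(responses[:periods]) / periods
    if average >= 0.20:
        return "strong"
    if average >= 0.10:
        return "medium"
    return "weak"

# Hypothetical response paths for three placeholder cities (not from the paper).
examples = {
    "city_x": [0.35, 0.30, 0.25, 0.20],
    "city_y": [0.15, 0.12, 0.18, 0.20],
    "city_z": [0.05, 0.02, 0.08, 0.09],
}
labels = {name: classify_impact(path) for name, path in examples.items()}
```

Only the first three periods enter the average, so a city whose influence builds up later, as described for the intermediary cities, can still fall into the medium or weak group under this rule.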

Real Estate Market Information Spillover Impact Timing Analysis
By analyzing how the information shock effect changes with the number of periods in the impulse response diagrams, the timeliness of the information spillover effect in the real estate market can be derived, and the main source of the shock effect can then be investigated. From Figure 4a-e, it can be found that a shock to Shenzhen and Guangzhou in period 1 brings positive shock effects to the real estate markets of other cities in the same period. As the number of periods increases, the impact of the information shocks from these two cities gradually decreases. From period 5 onwards, the impact of information shocks to the Shenzhen real estate market gradually stabilizes at around 20%, while the impact of information shocks to the Guangzhou real estate market weakens to less than 10%, indicating that the information shocks to the real estate markets of Shenzhen and Guangzhou have a significant and persistent impact on other cities and may be the source of information in the information spillover network. Figure 4e,f,h,j show that after a shock to Beijing's real estate market information concern index in period 1, there is a positive shock effect on the real estate markets of Chengdu, Hangzhou, Shanghai and Nanjing in the current period, while the other cities are not affected in the current period and the shock effect does not appear until period 2, indicating that the shock effect of Beijing's real estate market on these cities lags by one period. A similar situation is reflected in the real estate markets of Chengdu, Wuhan and Chongqing, where the positive shock effect also has a certain lag.
As can be seen from Figure 4f-j, the real estate markets of Hangzhou, Shanghai, Tianjin and Nanjing are mainly influenced by information shocks from Shenzhen, Guangzhou, Chengdu and Beijing in the current period, while their own information shocks have a less pronounced impact. From the second period onwards, the impact of these cities' own information shocks gradually decreases, while the impact of the shocks from Wuhan and Chongqing gradually increases. This result indicates that the real estate markets of Hangzhou, Wuhan, Shanghai, Tianjin and Nanjing are more sensitive to information shocks from other cities, are susceptible to changes in information concern in external real estate markets and occupy a follower position in the information spillover network.
The overall dynamic effects of information shocks in the real estate market of each city are summarized in Table 6. In terms of timeliness, the main source of the current-period information shock effect in each city is Shenzhen, and the main sources of the information shock effect with a one-period lag are Beijing and Chengdu, which shows that Shenzhen, Beijing and Chengdu are the source cities of the information spillover effect. In terms of direction, the information shocks generated by Chongqing, Beijing and Wuhan are mainly negative, while those generated by the other cities are mainly positive. In terms of strength, Shenzhen, Beijing and Chengdu have a strong information shock effect, Guangzhou, Chongqing and Wuhan have a medium information shock effect and Shanghai, Hangzhou, Tianjin and Nanjing have a weak information shock effect.

Analysis of Real Estate Market Information Source and Spillover Network
To further identify the information sources and the characteristics of the information spillover network in the real estate market, this paper performs a variance decomposition on the VAR model of the information concern index to measure how much of the expected fluctuation in each city's information concern index is explained by each city, and then analyzes the role played by each city in the information spillover network. The results are shown in Figure 5.
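For reference, a common textbook form of the forecast error variance decomposition is sketched below. It assumes orthogonalized impulse responses $\Theta_h$ obtained from a Cholesky factor $P$ of the residual covariance matrix, $\Sigma_u = PP^{\top}$; the symbols $K$, $H$, $\Phi_h$ and $\omega_{jk}(H)$ are notation introduced here for illustration and are not taken from the paper.

```latex
\Theta_h = \Phi_h P, \qquad
\omega_{jk}(H) = \frac{\sum_{h=0}^{H-1} \theta_{jk,h}^{2}}
                      {\sum_{h=0}^{H-1} \sum_{m=1}^{K} \theta_{jm,h}^{2}},
\qquad \sum_{k=1}^{K} \omega_{jk}(H) = 1 .
```

Here $\theta_{jk,h}$ is the $(j,k)$ element of $\Theta_h$, $\Phi_h$ is the $h$-step moving-average coefficient matrix of the VAR, $K$ is the number of cities, and $\omega_{jk}(H)$ is the share of city $j$'s $H$-period forecast error variance attributed to shocks from city $k$.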

Real Estate Market Information Spillover City Type Analysis
Based on the results of the variance decomposition analysis, we classified the cities into "information source cities", "information intermediary cities" and "information-receiving cities" according to the proportion of the fluctuation in the information concern index of other cities that they explain and the lag characteristics of the shock effect. The "information source cities" explain about 20% of the fluctuations in the real estate market information concern index of other cities and have a significant shock effect in the current period. The "information intermediary cities" explain about 10% of the fluctuations, with a lag of one period. The remaining cities, which explain close to zero percent of the fluctuations in the real estate market information concern index of other cities, are the "information-receiving cities".
As shown in Figure 5, from period 1 to period 10, Shenzhen explained the fluctuation in the information concern index of each city to the greatest extent. In the first period in particular, at least 60% of the fluctuation in the real estate market information concern index of other cities is explained by Shenzhen, while only about 30% of the fluctuation in Shenzhen's own information concern index is explained by other cities. This shows that Shenzhen is a source of changes in the real estate market information concern index of the remaining cities and plays the role of an "information source city" in the information spillover network. Guangzhou, on the other hand, explained less than 10% of the fluctuations in the information concern index of each city, with almost no direct influence; therefore, it is only an "information intermediary city" in the information spillover network, which is basically consistent with the results of the impulse response analysis above.
In addition, Beijing and Chengdu explain a relatively high proportion of the fluctuations in the information concern index of other cities' real estate markets. In the first period of the variance decomposition in particular, Beijing and Chengdu are second only to Shenzhen in explaining the fluctuations in the information concern index of other cities, and the proportion explained by these two cities continues to increase over time, indicating that these cities are also "information source cities" in the real estate market information spillover network. The proportion of the fluctuation explained by Chongqing and Wuhan is low in the first five periods and gradually increases afterwards, and their influence on the information concern index of most cities then remains at a high level. Thus, these cities are "information intermediary cities" in the information spillover network of the real estate market. From period 1 to period 10, the share of the fluctuations in the information concern indexes of Shanghai, Hangzhou, Nanjing and Tianjin explained by these cities themselves gradually stabilized at about 20%, while the share explained by other cities finally stabilized at about 80%, indicating that these cities are "information-receiving cities".

Analysis of Real Estate Market Information Spillover Network Characterization
Overall, changes in the information concern index of the "information-receiving cities" and "information intermediary cities" were mainly influenced by the "information source cities". Specifically, changes in the information concern index of Chongqing and Tianjin were mainly influenced by Shenzhen, Chengdu and Wuhan; changes in the information concern index of Hangzhou, Wuhan and Shanghai were mainly influenced by Shenzhen, Beijing and Chengdu; and changes in the information concern index of Guangzhou and Nanjing were mainly influenced by Shenzhen, Beijing and Wuhan.
In summary, the types of cities and their areas of influence in the real estate market information spillover network can be derived, as shown in Table 7. From Table 7, it can be concluded that Shenzhen, Beijing and Chengdu have an information spillover effect on the real estate market of each city, with a high intensity, a long duration and a short lag, and thus play the role of information sources. Especially in the early stage of the variance decomposition, these three cities have a dominant influence on the information fluctuation in other cities, indicating that the initial fluctuation in information concern comes from these cities and gradually radiates to other cities in the spillover network.
Although Guangzhou, Wuhan and Chongqing have a great influence on the information concern of other cities' real estate markets at a later stage, their initial effect is not significant and appears with a lag, so they are not the source of changes in the information concern index. The information spillover between the real estate markets of these cities was not obvious in the early stage, but in the middle and later stages, the mutual influence of fluctuations in information concern increased, resulting in secondary information shocks.
The remaining cities, namely Shanghai, Tianjin, Hangzhou and Nanjing, have a relatively small influence on the changes in the real estate market information concern index of other cities, with a long lag, and mainly influence their own information concern; they belong to the information-receiving cities. These cities may be mainly influenced by local policies and local supply and demand and have less interaction with other cities, playing the role of receivers in the information spillover network.

Conclusions
In view of the lack of research on information conduction between real estate markets, this paper built a research framework for analyzing the information conduction effect in real estate markets, verified and analyzed the information conduction effect between cities and put forward new ideas for studying the characteristics and laws of information conduction in real estate markets and for grasping and predicting the dynamic changes in information. Using text mining, principal component analysis and other methods, this paper quantified the information spillover between real estate markets by constructing a comprehensive index that measures the degree of interest in each city's real estate market information, and examined the correlations and causal relationships between the degrees of interest in the real estate market information of different cities. In addition, a dynamic model was developed to analyze the information spillover effects among the cities' real estate markets and to investigate the information sources and network characteristics of the information spillover network.
Different from previous studies, this paper investigated the information spillover effect between urban real estate markets from the perspective of "information distance", which was used as an indicator of the information interaction among different urban real estate markets. When the "information distance" becomes closer, the information shock effect becomes stronger, its lag becomes shorter and it lasts longer. This paper found that the impact of information shocks on the real estate markets of some cities with large geographical distances and economic gaps is immediate and synchronous, and that some cities show a two-way impact. This result shows that, through information spillover, the spillover effect in the real estate market can break through geographical restrictions and produce cross-regional and immediate linkages, and that the closer the "information distance" between cities, the easier it is for their real estate markets to influence each other, which is basically consistent with the linkage of house price fluctuations observed in some cities. This finding helps to explain the information spillover mechanism behind the house price spillover.
The empirical analysis of this paper shows that different cities play different roles in the information spillover network of the real estate market. According to the degree and timeliness of their influence, the cities are divided into three categories, "information source cities", "information intermediary cities" and "information-receiving cities", which have different information spillover relationships with the real estate markets of other cities. In terms of the degree of influence, the information spillover from the real estate markets of the "information source cities" has the most significant influence on the changes in the information concern index of each city, and the influence of the "information intermediary cities" on the real estate markets of other cities is second only to that of Shenzhen, which has the most influence on the real estate markets of other cities. The real estate market information concern index also has a certain leading role. In terms of the timing of influence, the impact on the real estate market information concern index of each city in the early stage mainly comes from the city itself and from the "information source cities", with the latter dominant; in the middle and later stages, the impact from the real estate markets of the "information source cities" gradually declined, while the impact from the real estate markets of the "information intermediary cities" gradually increased and then stabilized, becoming the main driver of changes in the information concern index of other cities.
The research ideas and framework of this paper are conducive to revealing the characteristics and laws of real estate market information spillover in different cities, and to enhancing the ability to analyze and predict real estate market information dynamics.
For example, according to the findings of this paper, different regulatory and guidance measures can be taken at different times for the information conduction between the real estate markets of "information source cities" and "non-information source cities". First, in the early stage of real estate market information conduction, it is necessary to focus on the trend of the real estate market information concern index of the "information source cities", predict the information conduction effect of these cities' real estate markets, promptly curb abnormal fluctuations and guide the degree of concern for these cities' real estate market information in a reasonable direction. Second, in the middle stage of real estate market information conduction, the focus should be on the trend of the real estate market information concern index of the "information intermediary cities". Regulating the changes in the information concern index of these cities' real estate markets can reduce the impact of the secondary information shocks that the "information intermediary cities" cause in the real estate markets of other cities. Third, in the later stage of information spillover in the real estate market, attention should be paid to the trend of the real estate market information concern index of the "information-receiving cities", and the impact of information shocks from other cities on the real estate markets of the "information-receiving cities" should be reduced through information regulation and intervention.
It is worth noting that the research in this paper also has certain limitations. Because of the limited data volume, it was not possible to include more cities or to build information spillover models with other explanatory variables of the real estate market; these extensions will be tested in future work. In addition, the influence of different types of information spillover on the spillover effect of house prices is a research topic that deserves further in-depth discussion.

Conflicts of Interest:
The authors declare no conflict of interest.