Article

Mass Appraisal Modeling of Real Estate in Urban Centers by Geographically and Temporally Weighted Regression: A Case Study of Beijing’s Core Area

1
Department of Geography and Resource Management, The Chinese University of Hong Kong, New Territories 999077, Hong Kong, China
2
Institute of Future Cities, The Chinese University of Hong Kong, New Territories 999077, Hong Kong, China
3
Department of Land & Real Estate Management, Renmin University of China, Beijing 100872, China
*
Author to whom correspondence should be addressed.
Land 2020, 9(5), 143; https://doi.org/10.3390/land9050143
Submission received: 9 April 2020 / Revised: 5 May 2020 / Accepted: 6 May 2020 / Published: 8 May 2020

Abstract

The traditional linear regression model of mass appraisal is increasingly unable to satisfy the standard of mass appraisal with large data volumes, complex housing characteristics and high accuracy requirements. Therefore, it is essential to utilize the inherent spatial-temporal characteristics of properties to build a more effective and accurate model. In this research, we take Beijing’s core area, a typical urban center, as the study area of modeling for the first time. Thousands of real transaction records from 2014, 2016 and 2018 are aggregated at the community level (community annual average price). Three different models, namely multiple regression analysis (MRA) with ordinary least squares (OLS), geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), are adopted for comparative analysis. The results indicate that the GTWR model, with an adjusted R² of 0.8192, performs best in the mass appraisal modeling of real estate. The comparison of different models provides a useful benchmark for policy makers regarding the mass appraisal process of urban centers. The findings also highlight the spatial characteristics of price-related parameters in high-density residential areas, providing an efficient evaluation approach for planning, land management, taxation, insurance, finance and other related fields.

1. Introduction

The urban center is the core area of an urban structure. It is usually the area with the most concentrated functions of urban politics, economy and culture. The high density of urban centers and the central position of urban functions make it different from the cities’ other areas in many aspects, i.e., high population density, traffic congestion and high land development intensity. As a result, the central area of a city often does not have new residential land for development to build first-hand real estate for housing market. Therefore, the real estate market in central areas mainly consists of second-hand housing transactions. For a city’s real estate market, the government can determine and revise policies regarding planning, land, finance, tax, price and other aspects. Tax policy is a very important part of the policies. China is one of a small number of countries that does not levy annual real estate tax on ownership of residential properties. Recently, in the Government Work Report on March 5, 2019, the idea to "steadily promote the legislation of real estate tax" has been clearly put forward. Referring to the experience of developed countries, real estate tax is often based on the value of the houses [1]. How to assess properties quickly and accurately on a large scale is the basic technical requirement for the real implementation of real estate tax.
For this purpose, many scholars have entered the field of mass appraisal modeling of land and real estate from the perspectives of economics, statistics, computer science, operational research and geographical science [2,3,4,5,6,7]. The classic hedonic pricing model, which builds on Lancaster’s consumer behavior theory and rests on the hypothesis that goods are valued for their attributes or characteristics, was introduced into the housing market and performs well in mass appraisal analysis [8]. After years of application, the multiple regression analysis (MRA) model, especially in its linear form, is considered the most widely used model owing to its clear formula, wide recognition by tax departments and long-term stable application [9,10].
At the same time, more and more scholars recognize the distinctive spatial characteristics of properties and take spatial factors into account when building a mass appraisal model. Geographically Weighted Regression (GWR) is a widely used tool for exploring potential spatial heterogeneity in processes over geographic space [11,12,13,14,15]. The traditional MRA model is usually a global model, which assumes that the processes generating the observed data are the same everywhere, so that a single parameter is estimated for each covariate. The GWR model relaxes this assumption by calibrating a model at each location to obtain location-specific parameter estimates for each process. Furthermore, many researchers have noticed the time dimension of real estate transaction databases and combined it with the GWR model. By adding temporal non-stationarity, the conventional GWR model integrates both temporal and spatial information into mass appraisal modeling and takes a new form, namely the Geographically and Temporally Weighted Regression (GTWR) model. Huang et al. (2010) first proposed the GTWR model [16], and Fotheringham et al. (2015) later developed a GTWR model based on their classic GWR model [17]. The GTWR model has good analytical capabilities for data sets with time series and spatial distribution characteristics [18]. Accordingly, scholars in different fields have adopted the GTWR model for analysis, such as satellite-based mapping of air pollution [19], spatial-temporal heterogeneity of industrial pollution [20] and the space-time characteristics of driving trajectories [21]. Other scholars have focused on improving the model prototype, for example through compatibility with multiscale effects [22], combination with the gravity model [23] and extension with space-time kriging [24].
It should also be noted that both institutions and scholars have made a lot of efforts for the research of mass appraisal models. Bourassa et al. (2007) compare and summarize the mass appraisal models related to spatial dependence and housing submarkets [25]. McCluskey et al (2013) concentrate on the accuracy of mass appraisal models and highlight the application potential of spatially weighted approach [26]. Wang and Li (2019) provide a systematic review of mass appraisal models for nearly two decades and identify a 3I-trend, namely AI-based, GIS-based and MIX-based models [27]. At the same time, researchers in many fields have also made effective attempts to the mass appraisal model, such as a multi-criteria decision analysis [28,29], expert system [30], modified evolutionary polynomial regression [31], Markov chain hybrid Monte Carlo method [32], artificial neural networks [33], hierarchical models [34], cluster analysis [35,36], rough set theory [37], reasoning-based models [38], support vector machine [39], geographically weighted principal component analysis [40], spatial error model [41] and geostatistical model [42].
This paper contributes to the study of mass appraisal modeling by utilizing the spatial-temporal characteristics of properties. Therefore, the aim of this article is to build, implement, and test the GTWR model with the community annual average price data in Beijing’s core area to compare the performance with MRA and GWR model. In addition, we also focus on the spatial characteristics of price-related parameters in high-density residential areas, providing an efficient evaluation approach for related fields. The remainder of this article is structured as follows. Section 2 introduces the study area and describes the data structure. It also provides detailed information about the three mass appraisal models which will be applied in this paper. Section 3 compares the results of MRA with the ordinary least squares (OLS) model, GWR model and GTWR model and analyzes the different situations for mass appraisal modeling of the Beijing core area. The final part presents conclusions and recommendations for future research.

2. Data and Methods

2.1. Study Area

Beijing is the capital of China. It is the political, cultural, international communication and technological innovation center of the nation. According to the Beijing Municipal Bureau of statistics (Release date: May 31, 2019), there are 21.54 million permanent residents in Beijing in 2018, and it is planned to stabilize at 23 million after 2020. The study area is the core area of Beijing, also named the Capital Functional Core Area. It is composed of two administrative regions, Xicheng District and Dongcheng District. The total area is 92.54 square kilometers, including 50.7 square kilometers in Xicheng District and 41.84 square kilometers in Dongcheng District. In 2018, the permanent residents of Xicheng District and Dongcheng District are 1.18 million and 0.82 million, and the corresponding population density (unit: person/km2) are 23333 and 19637 respectively. Figure 1 shows the map of the study area.

2.2. Data Description

The second-hand housing transaction database is from Lianjia, the largest second-hand house trading agency in China, with a local market share of nearly 60% in Beijing. The valuable data come from the records of transaction process in real commercial environment. The database contains average price data of annual transactions in each community and annual average value of all housing attributes in corresponding community. The transaction time span is 2014, 2016 and 2018, respectively. By removing the samples with missing attributes and obviously deviated coordinates, Table 1 shows the number of annual effective transaction samples of communities in Dongcheng District and Xicheng District. The total number of samples is 3064.
Figure 2 shows the spatial distribution and kernel density distribution of community annual average price (Unit: Renminbi (RMB) Yuan/m2). At first, it shows the geographical distribution of communities in core area. Some communities have all the three years’ transactions; some others only have one or two years’. Then, the interpolation of community annual average price is utilized to create a price surface by using the inverse distance weighted (IDW) method [43], for 2014, 2016 and 2018, respectively. IDW is a convenient spatial interpolation method, which can intuitively display the spatial distribution of the communities’ annual average price. It takes the distance between the interpolation point and the sample point as the weight for weighted average. The closer the interpolation point is, the greater the weight is given by the sample point. In this paper, the IDW method is supported by ArcGIS Desktop Software (Version: 10.5; Type: Advanced). And the mathematical power parameter of distance is set to the default value of 2. The distance parameter (search radius type) is defined as an adaptive radius with the default value of 12, which specifies 12 nearest input sample points to be utilized to perform interpolation. Finally, the kernel density of community samples in each year is estimated. The kernel density estimation is a natural extension of the histogram which shows the overall trend and density distribution regularity of the variables [44]. Based on MATLAB Software (Version: R2019b), a normal kernel density function is utilized with log(Price) (see Table 2 for definition) on the x-axis, probability density estimate values on the y-axis and default optimal bandwidth.
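The IDW settings described above (power 2, 12 nearest sample points) can be illustrated with a minimal NumPy sketch. This is not the ArcGIS implementation used in the paper; the function name `idw_interpolate` and its arguments are hypothetical, and the brute-force neighbor search is chosen only for readability.

```python
import numpy as np

def idw_interpolate(sample_xy, sample_values, query_xy, power=2.0, k=12):
    """Inverse distance weighted estimate at each query point.

    Uses the k nearest samples with weights 1 / distance**power, mirroring
    the settings quoted in the text (power 2, 12 nearest points).
    """
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    k = min(k, len(sample_xy))  # cannot use more neighbors than samples

    estimates = np.empty(len(query_xy))
    for q, point in enumerate(query_xy):
        dist = np.linalg.norm(sample_xy - point, axis=1)
        nearest = np.argsort(dist)[:k]          # indices sorted by distance
        d, v = dist[nearest], sample_values[nearest]
        if d[0] == 0.0:
            # Query coincides with a sample: return that sample's value.
            estimates[q] = v[0]
            continue
        w = 1.0 / d**power
        estimates[q] = np.sum(w * v) / np.sum(w)
    return estimates
```

A query point midway between two samples receives the plain average of their values, and a query point that coincides with a sample returns that sample's value exactly.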
Residential community refers to a residential area surrounded by urban roads or natural boundaries, with a certain scale of living population, and built with public service facilities to meet the needs of residents. The transaction database of annual average price comes from residential community distributed throughout Beijing’s core area and hence is representative of the core area’s housing market. For mass appraisal of real estate in a city, the community scale is a proper choice. The evaluation value of the community will be the standard baseline of the individual properties within it. Based on the sufficient quantity and coverage of the 3064 community samples, the regression models can be simulated and applied well.
According to the attributes of each community sample in the database, there are 25 variables in total. The community average price is the only dependent variable. Based on the research purpose, the independent variables are divided into four categories: property structure in community, basic condition of community, traffic condition around community and living condition around community. Property structure in community contains community id, buying year, average area, average bedrooms, average decoration condition and average orientation condition. Basic condition of community includes the average house age, average ladder-to-household ratio, average years of property right, average ratio of elevator, average property management fee, number of buildings, number of households, floor area ratio (FAR) and green ratio (GA). Traffic condition around community consists of the shortest distance to bus station and the shortest distance to subway station. Living condition around community contains the shortest distance to kindergarten, the shortest distance to the park, the shortest distance to the hospital, the shortest distance to the shopping mall, the shortest distance to the food market, the shortest distance to the supermarket, the shortest distance to the movie theater and the shortest distance to the restaurant. All the shortest distance calculations are within the 2 kilometer Euclidean-distance buffer zone. A summary of the detailed information and statistical analysis of the variables is listed in Table 2. It gives a whole picture of the community condition in Beijing’s core area. For property structure in community, the average housing price of the community is 79122 RMB Yuan/m2 with an average living area of 71.5 m2. Furthermore, the average of two bedrooms per property is suitable for a family of three. The average property decoration condition in the community is 0.33 (from “best = 1” to “worst = 0”), which is a normal condition for second-hand properties.
The orientation condition of the core area’s community has an average level of 0.70. For each property, the orientation weight ranges from 0.00 to 1.00. According to Beijing’s condition, houses facing south get the most light and good ventilation. Therefore, a weight of 1.00 represents the South with the best orientation. The following orientations are East, West and North with the weights of 0.66, 0.33 and 0, respectively. The data of the orientation in this paper are the average value of all properties in the relevant community, which indicates a comprehensive condition of the orientation. The closer the average value is to 1, the better the comprehensive orientation of the community is. For the basic condition of community, the average age of the building is 25 years. This shows that the development time of real estate in the core area is earlier and most communities are built in the 1990s. It also has an average of eight buildings and 597 households. Meanwhile, 45% of the buildings in community have elevators. For the traffic condition around the community, the bus condition with the average shortest distance of 0.76 kilometers is better than the subway condition with the average shortest distance of 0.90 kilometers. For the living condition, the shortest distance to kindergarten is 0.56 kilometers and the communities in the study area have great accessibility to the leisure and living places. The average shortest distance to parks, hospitals, shopping malls, food markets, supermarkets and restaurants are 0.76, 0.60, 1.03, 0.6, 0.55, 0.68 kilometers, respectively. People could walk to these places within 15 minutes (walk speed: 1.0–1.2 m/s) or ride a bike within 5 minutes (bike speed: 3.0–5.0 m/s).
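The orientation score described above is a simple average of per-property weights. The sketch below shows that arithmetic, assuming each property is recorded with a single facing direction; the dictionary and function names are hypothetical, and how the source data treat properties with several facings is not stated in the text.

```python
# Orientation weights quoted in the text: South 1.00, East 0.66, West 0.33, North 0.
ORIENTATION_WEIGHTS = {"south": 1.00, "east": 0.66, "west": 0.33, "north": 0.00}

def community_orientation_score(orientations):
    """Average orientation weight over all properties in one community."""
    weights = [ORIENTATION_WEIGHTS[o] for o in orientations]
    return sum(weights) / len(weights)

# Hypothetical community with four properties.
score = community_orientation_score(["south", "south", "east", "north"])
```

For this hypothetical community the score is (1.00 + 1.00 + 0.66 + 0.00) / 4 = 0.665, close to the reported core-area average of 0.70.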

2.3. Methods

2.3.1. Multiple Regression Analysis

Multiple regression analysis (MRA) explains the regression of a dependent variable over more than one independent variable. This makes it suitable for property price analysis because property values are determined by more than one property attribute. Equation (1) shows the formal model of an MRA.
$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon, \tag{1}$$
where $Y$ is the community price, $X_1, \ldots, X_k$ are the community attributes, $\beta_0$ is the constant, $\beta_1, \ldots, \beta_k$ are the coefficients and $\varepsilon$ is the error term.
In order to facilitate the calculation and reduce the scale of housing prices, the logarithm calculation is carried out for the annual average house price of the community. In this paper, we adopt the natural logarithm calculation for community price. Then the equation is converted to:
$$\log(Y) = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon. \tag{2}$$
The MRA model is usually estimated by ordinary least squares (OLS), a data-driven method that selects the regression coefficients minimizing the residual sum of squares over all observations [45].
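As an illustration of the log-linear model and its OLS estimation, the following sketch fits the coefficients with NumPy on synthetic, noise-free data. The function name `fit_log_price_ols` and the demo values are assumptions made for the example; they are not the paper's data or software.

```python
import numpy as np

def fit_log_price_ols(X, price):
    """Estimate beta in log(price) = beta_0 + X @ beta_1..k by least squares."""
    X = np.asarray(X, dtype=float)
    y = np.log(np.asarray(price, dtype=float))      # natural logarithm of price
    design = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Synthetic demo: two attributes and known coefficients (hypothetical values).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
true_beta = np.array([11.0, 0.3, -0.2])
price_demo = np.exp(true_beta[0] + X_demo @ true_beta[1:])  # noise-free prices
beta_hat = fit_log_price_ols(X_demo, price_demo)
```

Because the synthetic prices contain no noise, `beta_hat` recovers `true_beta` up to floating-point error; with real transaction data the estimates would carry sampling error.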

2.3.2. GWR Model and GTWR model

The GWR model is also a linear regression model which pays more attention to the local regression based on spatial relationship. The model can be performed as Equation (3).
$$Y_i = \beta_0(u_i, v_i) + \sum_k \beta_k(u_i, v_i) X_{ik} + \varepsilon_i, \quad i = 1, \ldots, n, \tag{3}$$
where $(u_i, v_i)$ represents the $(x, y)$ coordinates of community $i$, $\beta_0(u_i, v_i)$ is the constant or intercept value, $\beta_k(u_i, v_i)$ are the coefficients of variable $X_{ik}$ in community $i$ and $\varepsilon_i$ is the error term.
Then the time variable is introduced into the model. Based on the analysis of Huang et al. (2010) [16], the GTWR model can be performed as Equation (4).
$$Y_i = \beta_0(u_i, v_i, t_i) + \sum_k \beta_k(u_i, v_i, t_i) X_{ik} + \varepsilon_i, \quad i = 1, \ldots, n, \tag{4}$$
where $(u_i, v_i, t_i)$ represents the $(x, y, t)$ spatial-temporal coordinates of community $i$. Other terms in the equation are the same as in Equation (3).
The local coefficients $\beta_0(u_i, v_i, t_i)$ and $\beta_k(u_i, v_i, t_i)$ are then estimated by weighted least squares, as in Equation (5).
$$\hat{\beta}(u_i, v_i, t_i) = [X^T W(u_i, v_i, t_i) X]^{-1} X^T W(u_i, v_i, t_i) Y, \tag{5}$$
where $W(u_i, v_i, t_i)$ is the spatial-temporal weight matrix for community $i$. By defining the spatial distance $d^S$ and temporal distance $d^T$, the spatial-temporal distance $d^{ST}$ can be combined as in Equation (6).
$$d^{ST} = d^S \otimes d^T, \tag{6}$$
where $\otimes$ can be any operator suited to the situation. Here the $+$ operator is adopted, with scale factors $\lambda$ and $\mu$ applied to $d^S$ and $d^T$. The spatial-temporal distance between community $i$ and community $j$, with transaction years $t_i$ and $t_j$, is then given by Equation (7).
$$(d_{ij}^{ST})^2 = \lambda [(u_i - u_j)^2 + (v_i - v_j)^2] + \mu (t_i - t_j)^2. \tag{7}$$
Based on the First Law of Geography [46], the closer an observation is to community $i$, the greater its weight. The same is assumed for time: transactions in different years influence each other, and the closer the transaction year, the greater the weight. Such weights are commonly built with Gaussian distance-decay functions, as shown in Equation (8) [47].
$$W_{ij} = \exp\left\{ -\frac{\lambda [(u_i - u_j)^2 + (v_i - v_j)^2] + \mu (t_i - t_j)^2}{h_{ST}^2} \right\}, \tag{8}$$
where $h_{ST}$ is the spatial-temporal bandwidth and the ratio of the scale factors $\lambda$ and $\mu$ is the spatial-temporal distance ratio. This ratio can be optimized by cross-validation (CV) or the corrected Akaike information criterion (AICc) [48].
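To make the weighting and local estimation concrete, the sketch below computes the Gaussian space-time weights of Equation (8) and the weighted least squares estimate of Equation (5) for one community. It is a minimal illustration under assumed inputs, not the ArcGIS plug-in used in the paper; the function names are hypothetical, and `lam`, `mu` and `h_st` are taken as given rather than optimized by CV or AICc.

```python
import numpy as np

def gtwr_weights(coords, times, i, lam, mu, h_st):
    """Gaussian space-time weights of all observations relative to observation i."""
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    # Squared space-time distance: lam * spatial part + mu * temporal part.
    d2 = lam * np.sum((coords - coords[i]) ** 2, axis=1) + mu * (times - times[i]) ** 2
    return np.exp(-d2 / h_st**2)

def gtwr_local_coefficients(X, y, coords, times, i, lam, mu, h_st):
    """Weighted least squares estimate [X'WX]^-1 X'Wy at observation i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # intercept plus attributes
    w = gtwr_weights(coords, times, i, lam, mu, h_st)
    xtw = design.T * w                               # equals design.T @ diag(w)
    return np.linalg.solve(xtw @ design, xtw @ y)
```

Looping `gtwr_local_coefficients` over all `i` yields one coefficient vector per community and year, which is what the coefficient maps in the results section visualize. Setting `mu = 0` drops the temporal term, so the weights depend on spatial distance alone, as in a Gaussian fixed-kernel GWR.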

3. Results and Discussion

3.1. Multiple Regression Analysis with Ordinary Least Squares

The multiple regression analysis with ordinary least squares is carried out with all the variables in the Beijing core area. Table 3 shows the parameter estimates, their standard errors and the inference results. Six independent variables (p-value > 0.05) are not statistically significant for the community price: the property management fee, the green ratio, the shortest distance to the kindergarten, the shortest distance to the park, the shortest distance to the food market and the shortest distance to the supermarket. The coefficients reflect the degree and direction of each variable's influence on the dependent variable, each in its own measurement unit. The transaction year is the most influential variable, with a coefficient of 0.162. Stderror is the standard error of each regression coefficient; the smaller it is, the more precisely that coefficient is estimated. The t-statistic and p-value are both used to test the significance of the model variables. The larger the absolute value of the t-statistic, the more significant the corresponding covariate. The variance inflation factors (VIF) of all independent variables are also tested and all VIF values are smaller than 7.5 (most of them are smaller than 2), indicating that there is no significant global multicollinearity (also called redundant variables) among the explanatory variables. In terms of the performance of the overall model, the R² is 0.5680 and the adjusted R² is 0.5647, indicating that the OLS model can explain about 56 percent of the variation in community price in the core area.
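The two diagnostics used above can be sketched in a few lines. The adjusted R² formula below is the standard one for OLS, and the VIF of a variable is computed as 1 / (1 − R²) from regressing it on the remaining variables. The function names are hypothetical, and the sample size of 3064 with 23 predictors is an assumption read from the text rather than from the paper's software output.

```python
import numpy as np

def adjusted_r2(r2, n_obs, n_predictors):
    """Standard OLS adjusted R^2 for n_obs samples and n_predictors covariates."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def variance_inflation_factors(X):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        ss_res = np.sum((target - fitted) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        vifs.append(ss_tot / ss_res)  # algebraically equal to 1 / (1 - R^2)
    return np.array(vifs)

# Under the assumed n = 3064 and k = 23, R^2 = 0.5680 gives about 0.5647.
ols_adjusted = adjusted_r2(0.5680, 3064, 23)
```

Under these assumptions the formula reproduces the reported adjusted R² of 0.5647. The same formula does not carry over directly to GWR, where the adjustment typically depends on an effective number of parameters.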

3.2. Geographically Weighted Regression Model

The OLS results in Table 3 show that 17 significant variables are selected from the total of 23 variables. In order to run the GWR model, global and local multicollinearity should also be removed; otherwise, the result will not be feasible. Global multicollinearity can be checked with the VIF values: variables with large VIF values (above 7.5) are redundant. Local multicollinearity is more difficult to detect. One effective way is to create a thematic map for each independent variable and look for areas with little or no variation in values. Combining the OLS results with the thematic map of each variable, we find that the variables with local multicollinearity are transaction year, ladder-to-household ratio and years of property right. Finally, 14 variables are involved in building the GWR model. The result is shown in Table 4.
The model is implemented in ArcGIS Desktop Software (Version: 10.5; Type: Advanced). A Gaussian kernel is used for the GWR model and the kernel type is fixed. The overall R² is 0.2215, the adjusted R² is 0.2007 and the bandwidth is 4098.1515 meters. The bandwidth is an important factor for the GWR model because it determines the smoothness of the model. The optimal bandwidth is estimated by the AICc method. The residual sum of squares is 305.0345; the smaller it is, the better the GWR model fits the observed data. The sigma value is the square root of the normalized residual sum of squares, which is used for the AICc calculation. Detailed information is listed in Table 5.
GWR is a local linear regression model, and the result shows that the GWR model can only explain around 20 percent of the variation in the core area of Beijing for the whole dataset of 2014, 2016 and 2018. This indicates that the GWR model is not effective for the multi-year dataset of the core area in Beijing. Figure 3 shows the distribution of R² and the standard residual value of the GWR model.
The problem arises because, within the same community, the attribute values (independent variables) of 2014, 2016 and 2018 are treated as three different samples at the same location. Different sample data at the same location are therefore calculated and averaged during local regression, which disturbs the spatial characteristics of the local regression and leads to a very low R². For further verification, following the methodology in this paper, the database is split by year and the data of 2014, 2016 and 2018 are extracted separately. First, the OLS test is carried out; then global and local multicollinearity tests are applied to the significant variables. Afterward, all final variables are used to build the GWR model, and the results are shown in Table 6. The GWR models fitted for each year separately perform much better than the GWR model fitted on the pooled data of all years. The adjusted R² of the GWR model for 2014, 2016 and 2018 is approximately 0.5374, 0.4618 and 0.6321, respectively.

3.3. Geographically and Temporally Weighted Regression Model

The independent variables involved in the GTWR model are the same as in the GWR model. The GTWR model is run in ArcGIS Desktop with a plug-in program (Release Version: https://www.researchgate.net/publication/339567248_GTWRv1_1_20_May2020zip. Algorithm Source: reference [16]. Huang, B.; Wu, B.; Barry, M. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. International Journal of Geographical Information Science 2010, 24, 383–401, doi:10.1080/13658810802672469). A Gaussian kernel is used for the GTWR model and the kernel type is fixed. The transaction year variable is set as the timestamp according to the program’s instructions. After the calculation, the R² is 0.8200 and the adjusted R² is 0.8192. The bandwidth of the GTWR is 0.1122 and the spatial-temporal distance ratio is 0.3731. The detailed model diagnostics are shown in Table 7. Figure 4 shows the spatial distribution of the standard residuals for the GTWR model in 2014, 2016 and 2018. Standardized residuals with an absolute value above 2.5 need to be examined. According to the output information and distribution maps, the residuals range from −0.8759 to 1.0417 in 2014, −1.2920 to 0.6153 in 2016 and −0.6944 to 0.3574 in 2018. There is no statistically significant clustering of high and/or low residuals. This indicates that the GTWR model is reliable.
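The screening rule for residuals mentioned above can be sketched as a simple z-score check. This assumes standardization by the sample mean and standard deviation of the residuals, which is not necessarily how the ArcGIS tools compute their standardized residuals; the function name is hypothetical.

```python
import numpy as np

def flag_large_residuals(residuals, threshold=2.5):
    """Return indices whose standardized residual exceeds the threshold in magnitude."""
    residuals = np.asarray(residuals, dtype=float)
    z = (residuals - residuals.mean()) / residuals.std(ddof=1)  # simple z-score
    return np.flatnonzero(np.abs(z) > threshold)
```

Flagged communities would then be inspected on the map to see whether they cluster spatially, which is the pattern the residual maps are meant to rule out.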
For further analysis, independent variables of Area, Decoration, Shortest Distance of Bus station, Shortest Distance of Hospital and Shortest Distance of Restaurant are selected to conduct the coefficient distribution analysis for each year.
According to Figure 5, cold colors (e.g., black, dark blue) indicate that the area variable has a negative effect on the annual average house price of community. The larger the area is, the lower the average house price is. Warm colors (e.g., red, orange) indicate positive effect. The larger the area is, the higher the average house price is. From 2014 to 2018, the residential area in Xicheng District on the left side of the research area gradually changed from positive to negative. By 2018, only a proportion of the southern residential area has a positive correlation with the housing price. One reason for this change is that within five years, the purchasing ability of a family will not change too much. With the rise of housing prices, if most families still want to make a deal, they can only choose a smaller area of housing. Yet small area housing often has a higher unit price. This leads to the distribution that the smaller the area, the higher the unit price by 2018.
Figure 6 is the coefficient analysis of decoration. Considering the condition of 2014, 2016 and 2018, most communities have the pattern that the better decoration, the higher the housing price. This feature is prevalent in second-hand housing transactions. However, there are still obvious differences in the degree of influence. For instance, the coefficient value of the warm colors (e.g., red, orange) can contribute 20% to 30% of the price. In particular, the southern part of Dongcheng District has the lowest sensitivity to decoration in 2014, but the highest sensitivity in 2018. Generally, keeping other factors the same, the house with exquisite decoration will get a higher valuation than the house with ordinary decoration. However, when the transaction price of the house greatly exceeds the cost of decoration, the buyer will become insensitive to the decoration situation.
Figure 7 shows the distribution of coefficients for the shortest distance from the community to the bus stop. Because the variable here is the minimum distance, the situation is the opposite of the variable for area and decoration. Overall, the central area keeps cold colors (e.g., black, dark blue), which means that the closer to the bus station, the higher the house price. On the contrary, the surrounding areas have the opposite trend. Considering the high density of traffic facilities in the study area, it may be a phenomenon of supersaturation, as the convenience of public transport also means traffic congestion and traffic noise. Meanwhile, the warm colors (e.g., red, orange) of the communities are basically around the second and third ring, which is the urban expressway of Beijing.
Figure 8 is the coefficient distribution of the shortest distance to the hospital. Cold colors (e.g., black, dark blue) indicate that the smaller the shortest distance to the hospital, the higher the house price is. This also represents that these areas are still at a stage of positive demand for medical resources. The overall distribution trend for the three years of 2014, 2016 and 2018 has not changed. On the one hand, hospital construction needs a large investment in public infrastructure construction, which takes many years from input to output. On the other hand, it also reflects that the level of medical resources in the core area remains relatively stable.
Figure 9 shows the coefficient distribution of the shortest distance to the restaurant. Most communities of Xicheng District are in warm colors (e.g., red, orange), indicating that the closer the distance is, the smaller the impact on house prices. It presents a state of oversaturation. Most of the communities in Dongcheng District are in cold colors (e.g., black, dark blue), indicating that the more convenient the distance from the restaurant, the higher the house price.

4. Conclusions

Mass appraisal is considered when many properties need to be assessed under an evaluation standard on a given date. Compared with an appraiser’s house-by-house evaluation, software implementations of mass appraisal models can provide a more effective, fair and accurate result, together with easier operation and lower cost in practical application. In this research, the community level is used for mass appraisal modeling with annual average price and other meaningful attributes. The database contains 3064 community samples with price data from 2014, 2016 and 2018. Three mass appraisal models, namely MRA with OLS, the GWR model and the GTWR model, are built for the urban center of Beijing, the core area. The overall performance of the models is shown in Table 8. MRA with OLS, as a global linear regression model, performs moderately and can explain about 56% of the variation in community price. In contrast, as a local linear model, GWR has an adjusted R² of only 0.2007, which indicates that it is not effective for the multi-year data in this study area. However, when the time factor is introduced to form the GTWR model, the model takes advantage of local regression and obtains an adjusted R² of 0.8192. Housing price data are sensitive to spatial factors. At the same time, the influence of the temporal factor on housing prices is also obvious from the results of Table 3 in Section 3.1. Therefore, modeling the sample data with spatial-temporal heterogeneity simulates the characteristics of housing prices in the research area more accurately. Finally, the GTWR model can make good use of multi-year community-level data to conduct mass appraisal modeling.
This study also has some limitations that need further discussion. The first concerns the evaluation scale. This study has shown that the community scale is feasible and effective. However, the community data are aggregated from the original individual transaction data, so some important information may be lost. At the same time, if the transaction data for each individual property were used to perform the mass appraisal for the whole city, the amount of data would increase greatly. For a local linear regression model, the resulting multiplication of the computational load challenges the stability and efficiency of the model. Finally, compared with housing price data, rental data have a higher transaction frequency and are relatively stable within a housing submarket. They can better describe housing value from the perspective of residence and use rather than investment. A useful future study would introduce rental data into the construction of the mass appraisal model and compare the results with those based on housing price data.

Author Contributions

Conceptualization, D.W.; methodology, D.W.; software, D.W.; formal analysis, D.W.; resources, H.Y.; data curation, H.Y.; writing (original draft preparation), D.W.; writing (review and editing), D.W. and V.J.L.; visualization, D.W.; supervision, V.J.L.; funding acquisition, H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 71874195.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. IAAO. Standard on Mass Appraisal of Real Property; IAAO: Kansas City, MO, USA, 2017. [Google Scholar]
  2. Tajani, F.; Morano, P.; Ntalianis, K. Automated valuation models for real estate portfolios a method for the value updates of the property assets. J. Prop. Invest. Financ. 2018, 36, 324–347. [Google Scholar] [CrossRef]
  3. Ciuna, M.; Milazzo, L.; Salvo, F. A Mass Appraisal Model Based on Market Segment Parameters. Buildings 2017, 7, 34. [Google Scholar] [CrossRef]
  4. Zhou, G.; Ji, Y.; Chen, X.; Zhang, F. Artificial Neural Networks and the Mass Appraisal of Real Estate. Int. J. Online Eng. 2018, 14, 180–187. [Google Scholar] [CrossRef] [Green Version]
  5. Bencardino, M.; Nesticò, A. Demographic changes and real estate values. A quantitative model for analyzing the urban-rural linkages. Sustainability 2017, 9, 536. [Google Scholar] [CrossRef] [Green Version]
  6. Battisti, F.; Campo, O.; Forte, F. A Methodological Approach for the Assessment of Potentially Buildable Land for Tax Purposes: The Italian Case Study. Land 2020, 9, 8. [Google Scholar] [CrossRef] [Green Version]
  7. Manganelli, B.; Murgante, B. The dynamics of urban land rent in Italian regional capital cities. Land 2017, 6, 54. [Google Scholar] [CrossRef] [Green Version]
  8. Lancaster, K.J. New approach to consumer theory. J. Political Econ. 1966, 74, 132–157. [Google Scholar] [CrossRef]
  9. Del Giudice, V.; Manganelli, B.; De Paola, P. Hedonic Analysis of Housing Sales Prices with Semiparametric Methods. Int. J. Agric. Environ. Inf. Syst. 2017, 8, 65–77. [Google Scholar] [CrossRef] [Green Version]
  10. Lin, C.C.; Mohan, S.B. Effectiveness comparison of the residential property mass appraisal methodologies in the USA. Int. J. Hous. Mark. Anal. 2011, 4, 224–243. [Google Scholar] [CrossRef]
  11. Brunsdon, C.; Fotheringham, A.S.; Charlton, M.E. Geographically weighted regression: A method for exploring spatial nonstationarity. Geogr. Anal. 1996, 28, 281–298. [Google Scholar] [CrossRef]
  12. Bitter, C.; Mulligan, G.F.; Dall’erba, S. Incorporating spatial variation in housing attribute prices: A comparison of geographically weighted regression and the spatial expansion method. J. Geogr. Syst. 2007, 9, 7–27. [Google Scholar] [CrossRef] [Green Version]
  13. Harris, R.; Dong, G.P.; Zhang, W.Z. Using Contextualized Geographically Weighted Regression to Model the Spatial Heterogeneity of Land Prices in Beijing, China. Trans. GIS 2013, 17, 901–919. [Google Scholar] [CrossRef]
  14. Cao, K.; Diao, M.; Wu, B. A Big Data-Based Geographically Weighted Regression Model for Public Housing Prices: A Case Study in Singapore. Ann. Am. Assoc. Geogr. 2019, 109, 173–186. [Google Scholar] [CrossRef]
  15. Li, Z.Q.; Fotheringham, A.S.; Li, W.W.; Oshan, T. Fast Geographically Weighted Regression (FastGWR): A scalable algorithm to investigate spatial process heterogeneity in millions of observations. Int. J. Geogr. Inf. Sci. 2019, 33, 155–175. [Google Scholar] [CrossRef]
  16. Huang, B.; Wu, B.; Barry, M. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. Int. J. Geogr. Inf. Sci. 2010, 24, 383–401. [Google Scholar] [CrossRef]
  17. Fotheringham, A.S.; Crespo, R.; Yao, J. Geographical and Temporal Weighted Regression (GTWR). Geogr. Anal. 2015, 47, 431–452. [Google Scholar] [CrossRef] [Green Version]
  18. Wang, H.X.; Wang, J.D.; Huang, B. Prediction for spatio-temporal models with autoregression in errors. J. Nonparametric Stat. 2012, 24, 217–244. [Google Scholar] [CrossRef]
  19. He, Q.; Huang, B. Satellite-based mapping of daily high-resolution ground PM2. 5 in China via space-time regression modeling. Remote Sens. Environ. 2018, 206, 72–83. [Google Scholar] [CrossRef]
  20. Cheng, J.; Dai, S.; Ye, X. Spatiotemporal heterogeneity of industrial pollution in China. China Econ. Rev. 2016, 40, 179–191. [Google Scholar] [CrossRef]
  21. Zhang, X.X.; Huang, B.; Zhu, S.Z. Spatiotemporal Influence of Urban Environment on Taxi Ridership Using Geographically and Temporally Weighted Regression. ISPRS Int. J. Geo-Inf. 2019, 8, 23. [Google Scholar] [CrossRef] [Green Version]
  22. Wu, C.; Ren, F.; Hu, W.; Du, Q. Multiscale geographically and temporally weighted regression: Exploring the spatiotemporal determinants of housing prices. Int. J. Geogr. Inf. Sci. 2019, 33, 489–511. [Google Scholar] [CrossRef]
  23. Wang, H.; Zhang, B.; Liu, Y.; Liu, Y.; Xu, S.; Zhao, Y.; Chen, Y.; Hong, S. Urban expansion patterns and their driving forces based on the center of gravity-GTWR model: A case study of the Beijing-Tianjin-Hebei urban agglomeration. J. Geogr. Sci. 2020, 30, 297–318. [Google Scholar] [CrossRef]
  24. Du, Z.; Wu, S.; Kwan, M.-P.; Zhang, C.; Zhang, F.; Liu, R. A spatiotemporal regression-kriging model for space-time interpolation: A case study of chlorophyll-a prediction in the coastal areas of Zhejiang, China. Int. J. Geogr. Inf. Sci. 2018, 32, 1927–1947. [Google Scholar] [CrossRef]
  25. Bourassa, S.C.; Cantoni, E.; Hoesh, M. Spatial dependence, housing submarkets, and house price prediction. J. Real Estate Financ. Econ. 2007, 35, 143–160. [Google Scholar] [CrossRef] [Green Version]
  26. McCluskey, W.J.; McCord, M.; Davis, P.T.; Haran, M.; McIlhatton, D. Prediction accuracy in mass appraisal: A comparison of modern approaches. J. Prop. Res. 2013, 30, 239–265. [Google Scholar] [CrossRef]
  27. Wang, D.; Li, V.J. Mass Appraisal Models of Real Estate in the 21st Century: A Systematic Literature Review. Sustainability 2019, 11, 7006. [Google Scholar] [CrossRef] [Green Version]
  28. Guarini, M.R.; Battisti, F.; Chiovitti, A. A methodology for the selection of multi-criteria decision analysis methods in real estate and land management processes. Sustainability 2018, 10, 507. [Google Scholar] [CrossRef] [Green Version]
  29. Manganelli, B.; Paola, P.D.; Giudice, V.D. A multi-objective analysis model in mass real estate appraisal. Int. J. Bus. Intell. Data Min. 2018, 13, 441–455. [Google Scholar] [CrossRef]
  30. Kilpatrick, J. Expert systems and mass appraisal. J. Prop. Invest. Financ. 2011, 29, 529–550. [Google Scholar] [CrossRef] [Green Version]
  31. Morano, P.; Rosato, P.; Tajani, F.; Manganelli, B.; Di Liddo, F. Contextualized Property Market Models vs. Generalized Mass Appraisals: An Innovative Approach. Sustainability 2019, 11, 4896. [Google Scholar] [CrossRef] [Green Version]
  32. Del Giudice, V.; De Paola, P.; Forte, F.; Manganelli, B. Real estate appraisals with Bayesian approach and Markov chain hybrid Monte Carlo method: An application to a central urban area of Naples. Sustainability 2017, 9, 2138. [Google Scholar] [CrossRef] [Green Version]
  33. Yacim, J.A.; Boshoff, D.G.B. Impact of Artificial Neural Networks Training Algorithms on Accurate Prediction of Property Values. J. Real Estate Res. 2018, 40, 375–418. [Google Scholar]
  34. Hui, S.K.; Cheung, A.; Pang, J. A Hierarchical Bayesian Approach for Residential Property Valuation: Application to Hong Kong Housing Market. Int. Real Estate Rev. 2010, 13, 1–29. [Google Scholar]
  35. Napoli, G.; Giuffrida, S.; Valenti, A. Forms and Functions of the Real Estate Market of Palermo (Italy). Science and Knowledge in the Cluster Analysis Approach. In Appraisal: From Theory to Practice; Stanghellini, S., Morano, P., Bottero, M., Oppio, A., Eds.; Springer: Berlin, Germany, 2017; pp. 191–202. [Google Scholar] [CrossRef]
  36. Calka, B. Estimating Residential Property Values on the Basis of Clustering and Geostatistics. Geosciences 2019, 9, 143. [Google Scholar] [CrossRef] [Green Version]
  37. Del Giudice, V.; De Paola, P.; Cantisani, G.B. Rough Set Theory for Real Estate Appraisals: An Application to Directional District of Naples. Buildings 2017, 7, 12. [Google Scholar] [CrossRef]
  38. Yeh, I.C.; Hsu, T.-K. Building real estate valuation models with comparative approach through case-based reasoning. Appl. Soft Comput. 2018, 65, 260–271. [Google Scholar] [CrossRef]
  39. Chen, J.-H.; Ong, C.F.; Zheng, L.; Hsu, S.-C. Forcasting spatial dynamics of the housing market using support vector machine. Int. J. Strateg. Prop. Manag. 2017, 21, 273–283. [Google Scholar] [CrossRef]
  40. Wu, C.; Ye, X.; Ren, F.; Du, Q. Modified Data-Driven Framework for Housing Market Segmentation. J. Urban Plan. Dev. 2018, 144. [Google Scholar] [CrossRef]
  41. Zhang, R.; Du, Q.; Geng, J.; Liu, B.; Huang, Y. An improved spatial error model for the mass appraisal of commercial real estate based on spatial analysis: Shenzhen as a case study. Habitat Int. 2015, 46, 196–205. [Google Scholar] [CrossRef]
  42. Palma, M.; Cappello, C.; De Iaco, S.; Pellegrino, D. The residential real estate market in Italy: A spatio-temporal analysis. Qual. Quant. 2019, 53, 2451–2472. [Google Scholar] [CrossRef]
  43. Watson, D.F.; Philip, G. A refinement of inverse distance weighted interpolation. Geo-Processing 1985, 2, 315–327. [Google Scholar]
  44. Silverman, B.W. Density Estimation for Statistics and Data Analysis; CRC Press: Boca Raton, FL, USA, 1986; Volume 26. [Google Scholar]
  45. Anselin, L. GIS research infrastructure for spatial analysis of real estate markets. J. Hous. Res. 1998, 9, 113–133. [Google Scholar]
  46. Tobler, W.R. Computer movie simulating urban growth in detroit region. Econ. Geogr. 1970, 46, 234–240. [Google Scholar] [CrossRef]
  47. Fotheringham, A.S.; Brunsdon, C.; Charlton, M. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships; John Wiley & Sons: New Yok, NY, USA, 2003. [Google Scholar]
  48. Hurvich, C.M.; Simonoff, J.S.; Tsai, C.L. Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion. J. R. Stat. Soc. Ser. B Stat. Methodol. 1998, 60, 271–293. [Google Scholar] [CrossRef]
Figure 1. Map of the study area.
Figure 2. Spatial distribution and kernel density distribution of community annual average price.
Figure 3. Distribution of R2 and standard residual of the GWR model.
Figure 4. Standard residuals for the GTWR model in 2014, 2016, and 2018.
Figure 5. Coefficient of area for the GTWR model in 2014, 2016, and 2018.
Figure 6. Coefficient of decoration for the GTWR model in 2014, 2016, and 2018.
Figure 7. Coefficient of SD_Bus for the GTWR model in 2014, 2016, and 2018.
Figure 8. Coefficient of SD_Hospital for the GTWR model in 2014, 2016, and 2018.
Figure 9. Coefficient of SD_Restaurant for the GTWR model in 2014, 2016, and 2018.
Table 1. Transaction sample distributions.

| Year | Xicheng District | Dongcheng District | Total of Communities |
|---|---|---|---|
| 2014 | 543 | 350 | 893 |
| 2016 | 663 | 496 | 1159 |
| 2018 | 597 | 415 | 1012 |
| | | | 3064 |
Table 2. Summary of variables.

| Category | Variable | Definition | Obs. | Minimum | Mean | Median | Maximum | Std. |
|---|---|---|---|---|---|---|---|---|
| Dependent Variables | Price | Annual average transaction price of each community (Renminbi (RMB) Yuan per square meter) | 3064 | 19667.70 | 79122.67 | 78380.10 | 148977.00 | 26610.81 |
| | log(Price) | Natural logarithm calculation of Price | 3064 | 9.89 | 11.22 | 11.27 | 11.91 | 0.36 |
| Property Structure in Community | Year | Transaction year of 2014, 2016 and 2018 | 3064 | 2014 | 2016.08 | 2016 | 2018 | 1.58 |
| | Area | Living area (square meter) | 3064 | 7.92 | 71.52 | 61.33 | 303.44 | 32.37 |
| | Bedroom | Number of bedrooms for unit household | 3064 | 1.00 | 2.01 | 2.00 | 5.00 | 0.54 |
| | Decoration | Decoration condition (from "best = 1" to "worst = 0") | 3064 | 0.00 | 0.33 | 0.30 | 1.00 | 0.31 |
| | Orientation | Orientation condition (from "best = 1" to "worst = 0") | 3064 | 0.00 | 0.70 | 0.80 | 1.00 | 0.34 |
| Basic Condition of Community | Age | The age of the buildings till 2019 | 3064 | 1.00 | 25.24 | 25.17 | 102.00 | 10.83 |
| | Ladder-to-Household Ratio | Ladder and Household number ratio for each floor | 3064 | 0.73 | 3.64 | 3.00 | 45.00 | 2.32 |
| | Years of Property Right | The property right for 70, 50 and 40 years | 3064 | 40.00 | 69.59 | 70.00 | 70.00 | 3.23 |
| | Ratio of Elevator | Buildings with elevators divided by total buildings | 3064 | 0.00 | 0.45 | 0.45 | 1.00 | 0.43 |
| | Property Management Fee | Property management fee (RMB Yuan per square meter per month) | 3064 | 0.50 | 1.86 | 1.86 | 14.00 | 1.25 |
| | Num. Buildings | The number of buildings in community | 3064 | 1.00 | 7.90 | 5.00 | 126.00 | 10.09 |
| | Num. Households | The number of households in community | 3064 | 1.00 | 597.18 | 376.00 | 5877.00 | 651.84 |
| | Floor Area Ratio | The ratio of building's available area to the total area | 3064 | 0.09 | 2.74 | 2.74 | 13.89 | 1.27 |
| | Green Ratio | The ratio of total green space to the total area of residential land | 3064 | 0.10 | 0.30 | 0.30 | 0.60 | 0.06 |
| Traffic Condition around Community | SD_Bus | Shortest distance to bus station within 2 km (kilometer) | 3064 | 0.03 | 0.76 | 0.7 | 2.00 | 0.41 |
| | SD_Subway | Shortest distance to subway station within 2 km (kilometer) | 3064 | 0.12 | 0.90 | 0.79 | 2.00 | 0.49 |
| Living Condition around Community | SD_Kindergarten | Shortest distance to kindergarten within 2 km (kilometer) | 3064 | 0.01 | 0.56 | 0.52 | 2.00 | 0.30 |
| | SD_Park | Shortest distance to park within 2 km (kilometer) | 3064 | 0.01 | 0.76 | 0.71 | 1.87 | 0.35 |
| | SD_Hospital | Shortest distance to hospital within 2 km (kilometer) | 3064 | 0.03 | 0.60 | 0.57 | 1.63 | 0.31 |
| | SD_Shopping mall | Shortest distance to shopping mall within 2 km (kilometer) | 3064 | 0.02 | 1.03 | 0.99 | 2.00 | 0.49 |
| | SD_Food market | Shortest distance to food market within 2 km (kilometer) | 3064 | 0.03 | 0.60 | 0.58 | 2.00 | 0.31 |
| | SD_Supermarket | Shortest distance to supermarket within 2 km (kilometer) | 3064 | 0.02 | 0.55 | 0.54 | 2.00 | 0.27 |
| | SD_Restaurant | Shortest distance to restaurant within 2 km (kilometer) | 3064 | 0.02 | 0.68 | 0.65 | 2.00 | 0.35 |
Table 3. Results of the multiple regression analysis (MRA) with ordinary least squares (OLS).

| Variable | Coefficient | StdErr | t-Statistic | p-Value | VIF ¹ |
|---|---|---|---|---|---|
| Intercept | −315.851187 | 5.620593 | −56.195350 | 0.000000 * | - |
| Year | 0.162138 | 0.002786 | 58.197090 | 0.000000 * | 1.059613 |
| Area | −0.002677 | 0.000249 | −10.751610 | 0.000000 * | 3.572943 |
| Bedroom | 0.032368 | 0.012161 | 2.661753 | 0.007810 * | 2.368106 |
| Decoration | 0.140665 | 0.014599 | 9.635222 | 0.000000 * | 1.125050 |
| Orientation | 0.057093 | 0.013688 | 4.170999 | 0.000037 * | 1.174621 |
| Age | −0.001904 | 0.000535 | −3.557123 | 0.000396 * | 1.846311 |
| Ladder Ratio of Household | −0.006975 | 0.001933 | −3.608488 | 0.000327 * | 1.108054 |
| Years of Property Right | 0.004240 | 0.001334 | 3.179919 | 0.001503 * | 1.022561 |
| Ratio of Elevator | 0.029615 | 0.013138 | 2.254163 | 0.024240 * | 1.742408 |
| Property management fee | 0.003770 | 0.004239 | 0.889347 | 0.373872 | 1.554304 |
| Num. Buildings | 0.004415 | 0.000482 | 9.152315 | 0.000000 * | 1.303021 |
| Num. Households | −0.000036 | 0.000008 | −4.756089 | 0.000003 * | 1.344524 |
| Floor Area Ratio | −0.011992 | 0.003643 | −3.291964 | 0.001023 * | 1.170229 |
| Green Ratio | 0.032692 | 0.075335 | 0.433952 | 0.664370 | 1.038599 |
| SD_Bus | −0.037806 | 0.010777 | −3.508051 | 0.000473 * | 1.060005 |
| SD_Subway | −0.025825 | 0.009012 | −2.865614 | 0.004196 * | 1.054826 |
| SD_Kindergarten | −0.016201 | 0.014863 | −1.090043 | 0.275774 | 1.058128 |
| SD_Park | 0.007161 | 0.012712 | 0.563320 | 0.573266 | 1.085705 |
| SD_Hospital | 0.046371 | 0.014087 | 3.291667 | 0.001024 * | 1.037497 |
| SD_Shopping mall | 0.046011 | 0.009152 | 5.027681 | 0.000001 * | 1.116088 |
| SD_Food market | 0.006466 | 0.014427 | 0.448219 | 0.654042 | 1.095029 |
| SD_Supermarket | 0.026949 | 0.015890 | 1.696033 | 0.089992 | 1.049578 |
| SD_Restaurant | −0.064511 | 0.012821 | −5.031700 | 0.000001 * | 1.108541 |

| OLS Diagnostics | Value |
|---|---|
| Number of Observations | 3064 |
| R² | 0.567956 |
| Adjusted R² | 0.564687 |
| AICc ² | −127.522124 |

¹ VIF: variance inflation factor; ² AICc: corrected Akaike information criterion; * a p-value less than 0.05 is statistically significant.
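The OLS diagnostics in Table 3 can be cross-checked with two standard identities: the adjusted R² formula for a linear model with an intercept, and the t-statistic as the ratio of a coefficient to its standard error. The short sketch below applies both to the reported values. It assumes that all 23 explanatory variables listed in Table 3 entered the regression; the function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a linear model with k predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


n_obs = 3064        # Number of Observations (Table 3)
k_predictors = 23   # explanatory variables in Table 3, excluding the intercept
r2_reported = 0.567956

adj = adjusted_r2(r2_reported, n_obs, k_predictors)
print(round(adj, 6))

# t-statistic = coefficient / standard error, here for Decoration.
# Small differences from the reported value come from rounded inputs.
t_decoration = 0.140665 / 0.014599
print(round(t_decoration, 3))
```

The adjusted R² computed this way agrees with the reported 0.564687. The same formula does not carry over directly to the GWR and GTWR results, because local models use an effective number of parameters rather than a simple count of predictors.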
Table 4. Variables Selection.
Variables | Significant ³ | Without Global ⁴ | Without Local ⁵
Transaction Year
Area
Bedroom
Decoration
Orientation
Age
Ladder Ratio of Household
Years of Property Right
Ratio of Elevator
Property Management Fee
Num. Buildings
Num. Households
Floor Area Ratio
Green Ratio
SD_Bus
SD_Subway
SD_Kindergarten
SD_Park
SD_Hospital
SD_Shopping mall
SD_Food market
SD_Supermarket
SD_Restaurant
³ Significant: variables with significance in OLS; ⁴ Without Global: variables without global multicollinearity; ⁵ Without Local: variables without local multicollinearity.
Table 5. Results of GWR Model.

| Diagnostics Content | Value |
|---|---|
| Number of Observations | 3064 |
| Bandwidth | 4098.151518 |
| Residual Squares | 305.034493 |
| Sigma | 0.319754 |
| AICc | 1760.556297 |
| R² | 0.221452 |
| Adjusted R² | 0.200689 |
Table 6. Comparison of GWR models.

| GWR Diagnostics | All Years | 2014 | 2016 | 2018 |
|---|---|---|---|---|
| Number of Observations | 3064 | 893 | 1159 | 1012 |
| Bandwidth | 4098.151518 | 1290.904633 | 2582.548632 | 1850.362328 |
| Residual Squares | 305.034493 | 20.984072 | 41.746788 | 14.625219 |
| Sigma | 0.319754 | 0.177307 | 0.198781 | 0.129565 |
| AICc | 1760.556297 | −398.932944 | −401.695394 | −1184.098235 |
| R² | 0.221452 | 0.65381 | 0.508987 | 0.682964 |
| Adjusted R² | 0.200689 | 0.537363 | 0.461818 | 0.632098 |
Table 7. Results of the GTWR model.

| Diagnostics Content | Value |
|---|---|
| Number of Observations | 3064 |
| Bandwidth | 0.112248 |
| Residual Squares | 70.5342 |
| Sigma | 0.151724 |
| AICc | −1896.93 |
| R² | 0.820032 |
| Adjusted R² | 0.819206 |
| Spatial-temporal Distance Ratio | 0.373068 |
Table 8. Overall performance of all models.

| | MRA with OLS | GWR | GTWR |
|---|---|---|---|
| Number of Observations | 3064 | 3064 | 3064 |
| Bandwidth | Global | 4098.151518 | 0.112248 |
| AICc | −127.522124 | 1760.556297 | −1896.93 |
| R² | 0.567956 | 0.221452 | 0.820032 |
| Adjusted R² | 0.564687 | 0.200689 | 0.819206 |
