Geographically Weighted Regression Models in Estimating Median Home Prices in Towns of Massachusetts Based on an Urban Sustainability Framework

Housing is a key component of urban sustainability. The objective of this study was to assess the significance of key spatial determinants of median home price in Massachusetts towns that affect sustainable growth. Our analysis investigates the presence or absence of spatial non-stationarity in sustainable growth, measured in terms of the relationship between home values and parameters including the amount of unprotected forest land, residential land, unemployment, education, vehicle ownership, accessibility to commuter rail stations, school district performance, and senior population. We use the standard geographically weighted regression (GWR) and Mixed GWR models to analyze the effects of spatial non-stationarity. Mixed GWR performed better than GWR in terms of Akaike Information Criterion (AIC) values. Our findings highlight the nature and spatial extent of the non-stationary vs. stationary qualities of key environmental and social determinants of median home price. Understanding the key determinants of housing values, such as valuation of green spaces, public school performance metrics, and proximity to public transport, enables towns to use different strategies of sustainable urban planning, while understanding urban housing determinants such as unemployment and senior population can help modify urban sustainable housing policies.


Background
More than 83 percent of the US population now lives in cities, up thirty percent from 50 years ago. By 2050, the US urban population is projected to increase to more than 90 percent (of 423 million in 2050). Urbanization is transforming farmland, wetlands, forests, and other natural ecosystems into urban landscapes at an unprecedented rate resulting in urban sprawl. Urban landscape patterns and dynamics are the physical manifestation of complex interactions between environment, society, and economy [1][2][3][4]. Thus, urban areas are highly relevant, if not central, to any discussion on sustainable development.
The central goal of urban sustainability is efficient use of natural resources within a city region, while simultaneously improving its livability through social amenities, economic opportunity, and health [5]. We offer a conceptual framework for understanding urban sustainability through the lens of urban housing, which sits at the intersection of economic, ecological, and social dimensions. Housing plays a pivotal role in determining the financial (economic) security and well-being of individuals, neighborhoods, and cities. Owning a home is part of the "American Dream". Research on spatial non-stationarity across Massachusetts has policy implications related to development, demographics, housing, and transportation. For example, which forest or woodland to cut depends on the value placed on open land in various towns across the state. Suburban towns north and west of Boston place a higher premium on forest than rural western Massachusetts does. Senior populations living in suburban Boston homes have seen considerable increases in their home prices, and seniors choose to continue to live in these towns, which offer greater access to senior services (free rides to malls or hospitals). A town may adopt stricter regulations to protect unprotected forests from development, or improve the quality of education by imposing more taxes on its citizens, knowing that school performance positively impacts home prices in these towns.
The structure of the paper is as follows: the next section examines the modeling framework and the long history of hedonic and GWR modeling. Section 2 outlines the data sources and discusses the methodology including spatial modeling considerations. Section 3 provides the results of analysis relating to spatio-temporal patterns of median home prices derived from the ordinary least squares (OLS), and two types of GWR models. We highlight the differences in the degree of spatial non-stationarity in the determinants of the median home price in Massachusetts, as well as characterize the temporal differences in median home prices in the period of bust and boom. Section 4 provides conclusions related to the theory and practice of GWR in this field.

Modeling Framework
From a methodological point of view, the hedonic price function f typically describes the property price P as a function of three categories of independent variables: structural, locational, and environmental characteristics. Traditional hedonic approaches adopt a model structure that reduces heteroscedasticity and nonlinearity to produce a single solution for the intercept term, along with the coefficients that determine the significance of independent locational, structural, or environmental characteristics, and the overall model's goodness-of-fit. Hedonic modeling has been executed at a variety of spatial scales ranging from a block or neighborhood [18][19][20] to the district or metro scale [21,22]. However, these models cannot account for spatial autocorrelation resulting from spatially correlated omitted variables or spatial externalities, or for spatial heterogeneity [15].
Spatial autocorrelation indicates that homes in a neighborhood tend to be more similar. Real estate companies (including the popular Zillow), in effect, use spatial autocorrelation to determine the price of a home at a certain location based on the prices of nearby (similar) homes. Furthermore, many homes in a neighborhood tend to be built around the same time, and proximity to both positive and negative externalities has similar effects on the market values of nearby properties [33]. However, spatial heterogeneity or spatial non-stationarity results when the relationship between two or more variables determining the median home price is not constant across space, resulting in locally varying submarkets. Ignoring spatial non-stationarity leads to misspecification in the model including missing local effects that can have profound implications for understanding the temporal and spatial relationship between housing prices, location and housing attributes [12,18].
The spatial modeling literature provides a variety of local and global models to deal with spatial dependence [34,35], as well as models that explicitly incorporate spatial heterogeneity such as the geographically weighted regression (GWR) methodology [36]. GWR calibrates a series of local regression models separately at each location and offers the ability to map local estimates of the intercept, variable coefficients, and other regression diagnostics, and to check for spatial variations in the relationships between dependent and independent variables at each location [19,37]. The keys to GWR modeling are the spatial kernel functions, with fixed or varying bandwidth, that determine the shape and size of the local neighborhood at each location (e.g., a circular neighborhood of fixed radius or a fixed number of neighbors for each location), and the weighting functions that determine the significance of neighbors within that bandwidth. The definition of spatial neighborhood is a critical consideration in the analysis. GWR has been applied to determine home prices in many cities [20,38-41]. The basic GWR model assumes the same degree of spatial smoothness for each coefficient, which may not hold true in all contexts. In such cases GWR can overfit the data and produce biased estimates. Hence the basic GWR has undergone the following significant revisions: First, traditional GWR models define distances as straight line or Euclidean, while more recent modifications of the distance function adopt non-Euclidean distance metrics [39] to improve the model fit. Second, traditional GWR models use a fixed bandwidth for all variables to estimate the spatial relationship between variables while a revised GWR can use a flexible bandwidth [42] to estimate spatially varying relationships at various geographical scales within one model. 
The resulting model estimates coefficient surfaces that may vary at different spatial scales for different variables leading to better model fits. Third, Wheeler [43,44] proposed regularized GWR models, by combining ridge and/or lasso regression with GWR that have shown robustness in addressing the multicollinearity problem. Fourth, there has been a focus on diagnostics to check the model fit such as cross-validation (CV) score [45] to derive an optimal kernel bandwidth for GWR regression to reduce model bias. Another measure is the Akaike Information Criterion (AIC) [46] that is traditionally used to account for model parsimony dealing with the trade-off between prediction accuracy and complexity. In GWR, a corrected version of the AIC is used that accounts for sample size [47] and entails fitting bandwidths with different penalty functions. Fifth, the incorporation of temporal non-stationarity into GWR model is providing more insights on market trends and depreciation of home prices through time [18]. Finally, not all variables in GWR models exhibit non-stationarity in all contexts; hence assuming that all of the independent variables in the GWR exert a spatial influence on the dependent variable can lead to biased estimations [48]. For example, real estate markets may be economically connected through common federal policies such as governmental subsidies while some price-determining effects vary across space resulting in spatial heterogeneity. It would be wrong to assume that both factors exert a spatial influence. Research addressing this issue has led to the formulation of the Mixed Geographically Weighted Regression model (MGWR) that incorporates both linear regression and the GWR [36]. MGWR is a regression model in which the first step involves differentiating non-stationary and stationary variables by explicitly testing for spatial variability. 
This testing results in some independent variable coefficients being held constant, which are considered global parameters, while others vary spatially and are denoted as local parameters [49]. The second step in MGWR involves mapping the spatial variability of the local parameters, while global parameters have no spatial variability. Thus, MGWR differs from GWR, where all independent variables are assumed to have spatial variability resulting in different spatial distributions of local parameters.
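The role of the kernel and bandwidth described above can be illustrated with a minimal Python sketch. The bisquare and Gaussian kernel forms, the function names, and the five coordinates are assumptions chosen for illustration; this section does not specify which kernel function is used.

```python
import numpy as np

def adaptive_bisquare_weights(coords, i, k):
    """Weights for the local regression at location i from its k nearest observations.

    The bandwidth is the distance to the k-th nearest observation (location i
    itself counts as the first), so the kernel widens where observations are
    sparse and narrows where they are dense. Requires k >= 2.
    """
    d = np.linalg.norm(coords - coords[i], axis=1)   # Euclidean distances to location i
    bandwidth = np.sort(d)[k - 1]
    w = np.zeros_like(d)
    inside = d < bandwidth
    w[inside] = (1.0 - (d[inside] / bandwidth) ** 2) ** 2
    return w

def fixed_gaussian_weights(coords, i, bandwidth):
    """Weights from a Gaussian kernel with the same bandwidth at every location."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    return np.exp(-0.5 * (d / bandwidth) ** 2)

# Hypothetical coordinates (arbitrary units), not actual town centroids.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 4.0], [6.0, 8.0]])
w_adaptive = adaptive_bisquare_weights(coords, i=0, k=4)
w_fixed = fixed_gaussian_weights(coords, i=0, bandwidth=2.0)
```

In the adaptive case, observations at or beyond the k-th nearest distance receive zero weight, whereas the Gaussian kernel gives every observation a positive weight that decays with distance.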

Study Area
Massachusetts is located in the northeastern United States and has an area of 27,340 km², 25.7% of which is water and 61% of which is covered by forest. It is the 7th-smallest state in the United States and accounts for around 2.75% of total US GDP. The state has a population of 6.8123 million (2016), of which 4.7 million live in the Boston Metro region. Boston is the 10th-largest metropolitan area in the US. The population of the state is mostly urban (83%). Figure 1 shows the location of Boston and other towns mentioned throughout this paper. Metro Boston (#7 on Figure 1) is located on the eastern coast of the state.
The GWR analysis is conducted at the town and city scale. There are 351 towns and cities in Massachusetts. For each town, we selected data for the census years 2000 and 2010 [50] and ACS data for 2009 and 2011-2013 [51]. We included only 336 towns and cities because of data availability. Our data include the Great Recession, a highly influential period for housing prices in the US economy. The period of the Great Recession, from 2008 to 2009, was characterized by loss of wealth, reduction in consumer spending, and massive job loss that resulted in a decline of home prices [29,52].

Data Sources-Socio-Economic and Environmental Variables
Our data were collected from various sources including the census, remote sensing (land cover), transportation, tax department, and labor statistics. The data are described in Table 1. Our dependent variable is the Median Home Price, recorded from the census (2000 and 2010) and ACS (2009, 2011, 2012, and 2013). Our choice of time periods was dictated by the availability of relevant data. We divided our temporal analysis into two sets based on the source of data: the decennial census collected in 2000 and 2010, and ACS data from 2009 to 2013 (note that ACS data for 2008 are unavailable). The American Community Survey (ACS) data are estimates and therefore cannot be compared with census data directly. We used the ACS data since they represent the period of the Great Recession starting in 2009. We used inflation-adjusted median home prices in order to compare today's real estate prices to their historical norm.
The independent variables used in our models were selected based on past theoretical and empirical work examining their relevance in estimating home values. Population Density and Unemployment Rate have an impact on home prices [27,53,54]. Studies of the impact of unemployment on housing during the Great Recession [29,55] show that employment was key for recovery in some metro areas of the US. Unemployment was a significant factor in the wake of the housing collapse in 2008 in Massachusetts and impacted home prices [56]. Therefore, we assume that unemployment is likely to impact housing prices in our study context. Work on education (college degree) and housing highlights that, since 1988, those who completed college have owned homes at higher rates in the US than those with no college education [32]. Sellers of existing homes provide a major share of the annual supply of homes sold in the US; home sales are driven by the aging of the population since seniors are net home sellers [30]. Older homeowners have emerged as a dominant segment of the housing market following the housing collapse in 2008. The homeownership rate for Americans aged 65 and over has remained at 80 percent while dropping for every other age group. Seniors typically have less mortgage debt than younger homeowners; they typically downsize and sell, or increasingly some stay in the same home [57]. Seniors play a significant role in housing dynamics and hence are included in our analysis.
Residential property tax rates vary across the state in relation to home values and the mix of residential and commercial holdings in each town. Suburban towns such as Weston and Wellesley have some of the lowest residential tax rates in the state, while Longmeadow in western Massachusetts, with few business establishments, relies heavily on residential property taxes to fund town services. Residential taxes may therefore reflect the economic structure of each town and are included in our analysis. Prior studies identified and measured land value uplift (LVU) resulting from rapid transit and other mass transit [28,58]. Thus, we have included distances from town centroids to commuter rail stations as an independent variable. Vehicle ownership is derived by normalizing the census variable called vehicle availability for each town by the population of that town. These data may provide information on a town's travel accessibility and modal choices. More vehicle ownership may imply more travel by car, while less vehicle ownership may imply the use of public transport, walking, or biking [59]. Vehicle ownership has direct implications in urban sustainability studies. The city of Boston is planning to go carbon neutral by 2050 and is seeking solutions for reducing air pollution caused by automobile transportation. There is strong evidence to suggest that school quality substantially affects home prices in the US [60,61]. Earlier studies focused on the relationship between home prices and the quality of local education, using public school expenditures per pupil as the key school variable. However, more recent research highlights that the measure of K-12 student achievement is a more appropriate variable in home value estimations. We use the Composite Performance Index (CPI) scores of school districts in our analysis. 
The CPI is a measure of the extent to which students are progressing toward proficiency (a CPI of 100) in ELA (English Language Arts) and mathematics on the state's MCAS (Massachusetts Comprehensive Assessment System) [62]. CPIs are generated separately for ELA and mathematics at all levels in Massachusetts. For this study, we considered the CPI for mathematics performance in each public school district. In order to estimate the MCAS score for each individual town, we used the Area Interpolation Tool in ArcGIS. First, we created an interpolated surface based on the CPI from all public school districts. Then we applied the Massachusetts Town and City boundary shapefile to estimate the CPI for each town. The State implemented and collected CPI data starting in 2003. Hence, we substituted the CPI score of 2003 for 2000 since we did not have data for 2000. Better public school CPI (MCAS) scores generally correlate with higher home prices [31], a relationship exploited by realtors (such as Zillow) in selling homes.
In our study, we obtained the land cover classes designated as residential (low and high) and forest from the Mass Audubon publication Losing Ground [63], derived by processing and classifying a time series of Landsat-5, 7, and 8 data for this time period. The two thematic classes called low and high residential in the Losing Ground report were summed to define the residential class in our study. A recent study by Cunningham et al. [64] used Landsat archives to examine the change from undeveloped (forest) to developed land use during the real estate bubble (2000-2006) and the subsequent bust (2006-2013) in Massachusetts. The results in this paper further highlight the significance of the land cover change during this period.
According to the US Forest Service, Massachusetts had an estimated 3.0 million acres of forest land in 2014. About 61% of the land area of Massachusetts meets the Forest Inventory and Analysis (FIA) definition of forestland [65]. Forests are not evenly distributed across the state, and their distribution is largely influenced by development patterns. The lowest occurrences of forestland are seen in areas surrounding Metro Boston, Springfield, and Worcester, as well as along the coast and the major transportation corridors. Unprotected forest is an important determinant of home price, and hence we created this new class of forest that is at the greatest risk for development. We derived this class from two data sources, the Losing Ground report [63] and MassGIS data, as follows: First, forest areas were extracted from the Losing Ground dataset for the entire study period. Second, the MassGIS dataset called the Protected and Recreational Open Space layer was extracted; this class includes recreation, conservation, surface water supply protection areas and wellhead protection areas, or scenic sections [66]. Third, we created a new class called unprotected forest by differencing the layers created in Steps 1 and 2. This thematic class gives us the areas at risk for conversion from forest to some form of development in the towns of Massachusetts.
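The three-step differencing described above can be sketched on a raster grid in Python. This is only a raster analogue under stated assumptions: the two 4x4 masks are hypothetical, and the 30 m cell size is assumed from the nominal Landsat resolution rather than taken from the Losing Ground or MassGIS layers themselves.

```python
import numpy as np

# Step 1 (hypothetical data): cells classified as forest.
forest = np.array([[1, 1, 0, 0],
                   [1, 1, 1, 0],
                   [0, 1, 1, 1],
                   [0, 0, 1, 1]], dtype=bool)

# Step 2 (hypothetical data): cells inside protected or recreational open space.
protected = np.array([[1, 1, 0, 0],
                      [1, 0, 0, 0],
                      [0, 0, 0, 0],
                      [0, 0, 0, 1]], dtype=bool)

# Step 3: forest that is not protected, i.e. forest at risk for development.
unprotected_forest = forest & ~protected

# Assumed cell size: 30 m x 30 m = 900 m^2 = 0.09 ha.
cell_area_ha = 0.09
unprotected_area_ha = unprotected_forest.sum() * cell_area_ha
```

Summing the resulting mask within each town boundary would give a per-town unprotected forest area comparable to the variable used in the models.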

Spatial Model Considerations
In this study, we use conventional ordinary least squares (OLS) as a benchmark, GWR and MGWR to describe spatial heterogeneity in housing across the towns in the state of Massachusetts.
The first step in the analysis was to use a traditional OLS model of the form:
$$y_i = a_0 + \sum_{j=1}^{k} a_j x_{ij} + e_i$$
where $y_i$ is the home price of town $i$ at a specified time of the study, $x_{ij}$ is the $j$-th element of the row vector of explanatory variables for town $i$, $a_j$ is the $j$-th element of the column vector of regression coefficients, and $e_i$ is the random error for town $i$. The first term, $a_0$, is the intercept. The initial model considered the relationship between the median price of a home in each town and the following independent variables: Population Density, Unprotected Forest, Unemployment Rate, Percent Residential Area, Vehicle Ownership, Higher Education, Senior Population, distance to the nearest Commuter Rail station, Residential Property Taxes, and CPI (Composite Performance Index) for each town. OLS results were interpreted based on an assessment of multicollinearity, adjusted $R^2$, and Akaike's information criterion (AIC) [46]. The variance inflation factor (VIF) statistic, which measures redundancy among explanatory variables, was used to assess multicollinearity. Explanatory variables with large VIF values, above a threshold of 7.5, were considered to be multicollinear. This screening guards against unstable coefficient estimates caused by redundant predictors. The next step in the analysis was to use GWR, which explicitly incorporates the spatial structure of the variables into the estimation of the regression and shows how those estimates vary across space. We selected an adaptive kernel whose bandwidth was found by minimizing the AIC value. The bandwidth is a count of the number of nearest observations to be included under the kernel. Preference is given to lower values of AIC since they indicate a better trade-off between fit and model complexity.
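The VIF screening described above can be sketched in a few lines of Python. The function name and the synthetic predictors are hypothetical; only the 7.5 threshold comes from the text. Each VIF is computed as 1 / (1 - R²), where R² comes from regressing one explanatory variable on all the others.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (explanatory variables, no intercept column)."""
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic predictors for 336 hypothetical towns: x3 nearly duplicates x1.
rng = np.random.default_rng(0)
n = 336
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

vifs = variance_inflation_factors(X)
flagged = vifs > 7.5   # threshold quoted in the text
```

In this synthetic case the two near-duplicate columns are flagged while the independent column is not; in practice one of the redundant variables would be dropped or combined before fitting.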
We explore the spatial variability of relationships between median home prices and the explanatory variables by mapping GWR coefficients and local $R^2$ values. We also performed an F3 test to probe whether the GWR estimates are a significant improvement on the conventional globally estimated OLS [67]. The Akaike Information Criterion (AIC) is used in this study as a test diagnostic to select the flexible bandwidth $b$ [68,69] from:
$$\mathrm{AIC}_c(b) = 2n \ln(\hat{\sigma}) + n \ln(2\pi) + n \left( \frac{n + \mathrm{tr}(S)}{n - 2 - \mathrm{tr}(S)} \right)$$
where $n$ is the local sample size (according to bandwidth); $\hat{\sigma}$ is the estimated standard deviation of the error term; and $\mathrm{tr}(S)$ represents the trace of the hat matrix $S$. The hat matrix denotes the projection matrix from the observed $y$ to the fitted values. As highlighted before, GWR is not always appropriate if some of the variables do not exhibit spatial non-stationarity and can be held constant. We used a Mixed Geographically Weighted Regression model (MGWR) after testing for spatial variability of all variables [70]. In MGWR, contributing factors that have no spatial variability generate a global parameter, while those with spatial variability produce a local parameter. The MGWR is defined as:
$$y_i = \sum_{j=1}^{k_a} a_j x_{ij} + \sum_{l=k_a+1}^{k} b_l(u_i, v_i)\, x_{il} + e_i$$
where $a_j$ are the global (stationary) coefficients, $b_l(u_i, v_i)$ are the local coefficients evaluated at the location $(u_i, v_i)$ of town $i$, and $e_i$ is the random error. We used a Monte Carlo approach to test for significant (spatial) variation in each regression coefficient of the basic GWR against a series of randomized data sets [71]. If the observed variance of a coefficient did not fall in the top 5% tail of the ranked results, the corresponding variable was treated as a global variable in the specification of MGWR.
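A minimal Python sketch of basic GWR with AIC-based bandwidth selection may help make the procedure concrete. It is not the software used in this study: the bisquare kernel, the candidate bandwidths, the function names, and the synthetic data (one covariate whose effect increases from west to east) are all assumptions, and the criterion coded here is the corrected AIC commonly used for GWR.

```python
import numpy as np

def gwr_fit(X, y, coords, k):
    """Fit a basic GWR with an adaptive bisquare kernel over the k nearest observations.

    Returns the (n, p) array of local coefficients and the (n, n) hat matrix S.
    """
    n, p = X.shape
    betas = np.empty((n, p))
    S = np.empty((n, n))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        bw = np.sort(d)[k - 1]                      # adaptive bandwidth: k-th nearest distance
        w = np.where(d < bw, (1.0 - (d / bw) ** 2) ** 2, 0.0)
        XtW = X.T * w                               # X' W_i without forming the diagonal matrix
        A = np.linalg.solve(XtW @ X, XtW)           # (X' W_i X)^{-1} X' W_i
        betas[i] = A @ y                            # local coefficients at location i
        S[i] = X[i] @ A                             # i-th row of the hat matrix
    return betas, S

def aicc(y, S):
    """Corrected AIC from the residuals and the trace of the hat matrix."""
    n = len(y)
    resid = y - S @ y
    sigma_hat = np.sqrt(resid @ resid / n)
    tr_S = np.trace(S)
    return 2 * n * np.log(sigma_hat) + n * np.log(2 * np.pi) + n * (n + tr_S) / (n - 2 - tr_S)

# Synthetic data: the slope on x grows with the east-west coordinate.
rng = np.random.default_rng(42)
n = 150
coords = rng.uniform(0, 1, size=(n, 2))
x = rng.normal(size=n)
true_slope = 1.0 + 2.0 * coords[:, 0]
y = 2.0 + true_slope * x + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.ones(n), x])

# Choose the number of neighbours that minimizes the criterion.
candidates = [15, 30, 60, 120]
scores = {k: aicc(y, gwr_fit(X, y, coords, k)[1]) for k in candidates}
best_k = min(scores, key=scores.get)
betas, S = gwr_fit(X, y, coords, best_k)
```

Mapping `betas[:, 1]` against the coordinates would show the kind of coefficient surface discussed in the results; a production analysis would search bandwidths more finely than the four candidates used here.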

Results
We discuss three sets of results. We first examine the spatio-temporal patterns of median home prices derived by comparing and validating the results of the OLS and GWR models. Our second set of results shows the application of MGWR to highlight the differences in the degree of spatial non-stationarity in the determinants of the median home price in towns of Massachusetts, while the third set of results describes urban sustainability from the perspective of economic, social, and ecological determinants. The OLS results are presented in Tables 2-4; GWR results are presented in Tables A1-A6 and Table 5; and MGWR results are presented in Tables A7-A12 and Table 6. Summary results for each model are presented in Table 4 (OLS), Table 5 (Basic GWR) and Table 6 (MGWR).

Spatial-Temporal Patterns of Home Prices
We first explore and model the spatiotemporal variability of median home prices and associated determinants in the state of Massachusetts by benchmarking the performance of the global regression model (OLS) against its GWR counterpart with the same set of variables. Table 2 shows the results from the OLS regression for the median home prices using decennial census data from 2000 and 2010, while Table 3 shows similar results for the ACS years 2009, 2011, 2012, and 2013. Table 4 shows overall summary results for the OLS model. Of the covariates, only population density, unprotected forest, and residential area are not significant (at significance level 0.05) in the model in any year of observation, while others including unemployment, vehicles owned, residential taxes, educated population above 25 years, senior population, or distance to commuter rail stations, were statistically significant in all years. Unemployment rate was consistently significant from 2009 to 2013, as seen in other studies. Suburban towns such as Natick, Framingham, Wayland, Wellesley and Cambridge best exemplify this pattern, in which an increase in unemployment is associated with a drop in median home prices. Senior population impacted median home prices more significantly in 2013 (at significance level 0.001) and to a lesser degree from 2010 to 2012 (at significance level 0.05). Perhaps this segment of the population held on to homes that had depreciated in value (in 2008) and sold them starting in 2010. The ownership of homes by seniors and its impact on median home prices deserve further empirical scrutiny. The coefficients associated with unemployment rate and residential taxes are negative, indicating that a decrease in home price is associated with an increase in both unemployment and residential taxes. 
Unprotected forest may not have overall significance in the state, but it may have more significance in the eastern part of the state around Boston where development of housing in unprotected forest has resulted in urban sprawl [9] and the building of "McMansions". Such non-stationarity patterns in this determinant have to be explored using GWR since traditional OLS is unable to account for spatial heterogeneity in local submarkets. Figure 2 shows the OLS residuals in the median price of homes in various towns in different time periods. In general, OLS underestimates median home price in eastern Massachusetts, which is more populated with greater housing density, and overestimates the median home price in the rural western part of the state. This pattern is consistent over the entire time period. The ratio of unprotected forest to residential area in 2013 is computed to highlight such price variations. Eastern towns around Boston have ratios ranging from 0 to 0.626, while in western and central Massachusetts, the ratios range from 5.82 to 75. Higher ranges of ratio in the west suggest that more expansive forest is not influencing median home prices in these towns, while the opposite may be true in the eastern towns. Therefore, unprotected forest may be characterized by non-stationarity and needs to be explored using GWR. The standard error for the intercept is not significant in any year, except 2000. We expect that median home prices exhibit spatial heterogeneity and vary in the state from east to west reflecting proximity to Boston. As highlighted before, most of western Massachusetts is rural with large areas of forest. Hence unprotected forest may not impact overall home prices in the west but may have an impact for residential development in the east.
GWR results are presented in Tables A1-A6; each table pertains to one year of observation and shows the medians and ranges in the values of each coefficient across all towns (columns 1-3). These coefficients, in general, have large ranges. Hence, the results are next summarized using the percentages of coefficient estimates that were positive and negative (columns 4 and 5) for each variable across all towns. GWR results for census years are shown in Tables A1 and A2 along with the p-value (F3 test) and significance. Population density is not a significant factor in either census year. The public-school CPI scores are less significant in all years. The remaining determining variables are highly significant across both census years.
The ACS survey years are shown in Tables A3-A6. Examining the impact of determinants, CPI score is significant in all ACS years (at significance level 0.001). Population density is only significant in recent years 2012-2013, probably driven by increasing urban growth in towns such as Everett, Lawrence, Malden, Arlington and Brookline (see Figure 1). Overall population increased the most in Boston, Cambridge, Somerville, Chelsea and Brookline 2011-2013. Other factors are significant across all years. To summarize, GWR results highlight that most determinants are characterized by non-stationarity leading to spatial variation in median home prices.
We next interpret columns 4 and 5 in Table A6 for 2013. Most towns (65.77%) seem to value unprotected forest (positive coefficient) while some towns (34.23%) in the rural west (negative coefficient) may not. The senior population coefficient is positive in 91.96% of towns, indicating that most senior citizens may have higher incomes and better homes (longer tenure of ownership). The coefficient of this determinant is negative in only 8.04% of towns. This pattern is exemplified in towns outside of Boston Metro, such as Wales, Richmond, and Bernardston, where the total population of the town may be decreasing due to migration of all segments of the population except seniors. Overall, the results across census and ACS years show that unemployment rate, vehicle ownership, residential property taxes, residential area, and distance to commuter rail stations have a larger percentage of towns with negative coefficients (column 5 in all tables). The model predicts a negative relationship between these determinants and median home prices across most towns. On the other hand, CPI, educated population above 25 years, population density, unprotected forest, and senior population have between 60% and 100% of towns with positive coefficients (column 4 in all tables). The model predicts a positive relationship between these determinants and median home prices across most towns. The model fit in terms of positive and negative coefficient signs is consistent across most years, highlighting the strong role of the selected determinants in determining the median home prices in the state. Figure A1 maps the estimated coefficients of senior population in 2000 and in 2009 to 2013, which indicate that proximity to Boston plays a strong role in differentiating eastern (colored red) and western towns (blue). 
Figure A2 shows that coefficients of the unprotected forest class for the time period are highly significant in towns northwest of Boston including Lexington, Winchester, Woburn, and Belmont. Finally, the $R^2$ values are examined to measure the fit of GWR in each town, displayed in Figure 3 for year 2013. Metro Boston has good GWR $R^2$ values. The towns of Bernardston, East Longmeadow, Hampden, Longmeadow, Ludlow, and Springfield have higher $R^2$ values. These towns are located in the central and western regions of the state. Bernardston (Route 91 south of NH Border) had the highest GWR $R^2$ value. In contrast, Seekonk, Somerset, Swansea, Taunton, and Freetown had lower GWR $R^2$ values. These towns are all located south of Boston closer to Providence, RI, and therefore may be impacted by the determinants in Rhode Island. Table 5 shows the overall results in terms of bandwidth, RSS, AIC and adjusted $R^2$. The $R^2$ value was the highest in 2000 and averages around 0.80 over other years. To summarize, most determinants in our study are characterized by spatial non-stationarity.
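The percentage summaries reported in columns 4 and 5 of the GWR tables can be computed directly from the local coefficient estimates. The sketch below assumes a hypothetical array of 336 local coefficients whose sign counts (221 positive, 115 negative) were chosen only so that the shares match the unprotected forest percentages quoted for 2013; the coefficient magnitudes are invented.

```python
import numpy as np

def sign_shares(local_coefs):
    """Percentage of local coefficient estimates that are positive and negative."""
    c = np.asarray(local_coefs, dtype=float)
    positive = 100.0 * np.sum(c > 0) / c.size
    negative = 100.0 * np.sum(c < 0) / c.size
    return positive, negative

# Hypothetical local coefficients for 336 towns (magnitudes are placeholders).
coefs = np.concatenate([np.full(221, 0.4), np.full(115, -0.2)])
pos_share, neg_share = sign_shares(coefs)
summary = (round(pos_share, 2), round(neg_share, 2))
```

Because the two shares sum to 100% whenever no coefficient is exactly zero, reporting either column is sufficient to recover the other.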

Are All Determinants of Median Home Price Non-Stationary?
We next examine the question of non-stationarity of determinants using MGWR, which differentiates the spatially non-stationary from the stationary determinants. Results are shown in Tables A7-A12, where variables with a p-value (Monte Carlo simulations) in column 7 greater than 0.05 are spatially stationary and should be treated as fixed global variables (listed in the lower section of each table). Vehicle ownership and residential taxes should be treated as fixed global variables in most years: the importance of vehicle ownership is uniform across the state, as is that of town taxes. In contrast, MCAS CPI scores were spatially non-stationary (local) in 2009 and 2011. Population density presents contrasting results in the GWR and MGWR analyses. It is not a fixed global variable in MGWR, suggesting spatial non-stationarity; population density was not significant in the GWR analysis until 2012-2013, coinciding with the increase in population in the metro Boston area in this time period.
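The Monte Carlo p-values used to separate local from global variables can be sketched as a permutation test in Python. This is a simplified illustration under assumptions: the bisquare kernel, the bandwidth of 25 neighbours, the 99 permutations, and the synthetic data (one covariate with a spatially varying effect, one with a constant effect) are all hypothetical. Locations are shuffled among observations, the GWR is refitted, and the observed variance of each local coefficient is ranked against the permuted variances.

```python
import numpy as np

def gwr_coefs(X, y, coords, k):
    """Local GWR coefficients with an adaptive bisquare kernel over k nearest observations."""
    n, p = X.shape
    betas = np.empty((n, p))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        bw = np.sort(d)[k - 1]
        w = np.where(d < bw, (1.0 - (d / bw) ** 2) ** 2, 0.0)
        XtW = X.T * w
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

def monte_carlo_pvalues(X, y, coords, k, n_sim=99, seed=0):
    """Pseudo p-value per coefficient: rank of the observed variance among permutations."""
    rng = np.random.default_rng(seed)
    observed = gwr_coefs(X, y, coords, k).var(axis=0)
    exceed = np.zeros(X.shape[1])
    for _ in range(n_sim):
        shuffled = coords[rng.permutation(len(coords))]   # break the link between data and place
        exceed += gwr_coefs(X, y, shuffled, k).var(axis=0) >= observed
    return (exceed + 1) / (n_sim + 1)

# Synthetic data: the effect of x1 varies from west to east, the effect of x2 is constant.
rng = np.random.default_rng(1)
n = 100
coords = rng.uniform(0, 1, size=(n, 2))
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + (1.0 + 3.0 * coords[:, 0]) * x1 + 0.5 * x2 + rng.normal(scale=0.2, size=n)
X = np.column_stack([np.ones(n), x1, x2])

pvals = monte_carlo_pvalues(X, y, coords, k=25)
is_local = pvals <= 0.05   # variables at or below 0.05 would be kept as local terms
```

Variables whose p-value exceeds 0.05 would be entered as global terms in the mixed model, mirroring the rule applied to column 7 of the tables.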
A closer examination of unprotected forest indicates that towns such as Belmont, Billerica, Carlisle, Lexington, and Burlington, within the Greater Boston region, consistently rank highest on this coefficient, indicating the greatest potential for residential development in these towns, if town regulations permit. On the other hand, towns such as Provincetown (on the Cape), North Attleborough (at the intersection of Routes 95 and 295), and Norfolk rank the lowest for at least two years on this coefficient, indicating they have less potential for development. Figure A3 displays unemployment coefficient maps of the GWR and MGWR that differentiate the spatial non-stationarity in this determinant around Worcester (west of Boston) and suburbs south of Boston. Similarly, Figure A4 displays senior population coefficient maps of the GWR and MGWR. Differences emerge in the western suburbs of Boston. Both determinants are non-stationary under GWR and MGWR, but the two models produce different spatial patterns of coefficients around Boston and similar patterns in the western part of the state.
MGWR results presented in Table 6 are better than the GWR results shown in Table A7 across both census and ACS years. As shown in Figure A5, GWR and MGWR have similar AIC values in 2009 and 2010 but display some differences in the remaining years. MGWR consistently produces the lowest AIC values for all years. GWR is the second-best performer on the AIC measure, while OLS is the worst performer. This result highlights the spatial non-stationarity of some factors (Population Density, Unemployment Rate, Residential Area, Vehicle Ownership, Senior Population, Distance to Commuter Rail Stations, and Property Tax) that influence median home prices in Massachusetts towns. The largest difference between the two models' results occurs in 2013, indicating that MGWR is a better choice for this year since the nature of spatial non-stationarity of some determinants has changed.
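The AIC-based ranking of models can be illustrated with a short sketch. It assumes the corrected AIC (AICc) form commonly used for GWR-type models, in which the effective number of parameters is the trace of the hat matrix; the exact AIC variant reported in our tables may differ. The function name `gwr_aicc` and all input numbers are hypothetical placeholders, not values from Table 5 or Table 6.

```python
import math

def gwr_aicc(rss, n, trace_s):
    """Corrected AIC in the form commonly used for GWR-type models (an assumption here).

    rss: residual sum of squares; n: number of observations;
    trace_s: trace of the hat matrix, i.e. the effective number of parameters
    (for OLS this equals the number of regression coefficients).
    """
    sigma_hat = math.sqrt(rss / n)  # maximum-likelihood estimate of the residual std. dev.
    return (2 * n * math.log(sigma_hat)
            + n * math.log(2 * math.pi)
            + n * (n + trace_s) / (n - 2 - trace_s))

# Hypothetical inputs for illustration only (NOT taken from the paper's tables).
n_towns = 336
candidates = {
    "OLS":  {"rss": 60.0, "trace_s": 10.0},
    "GWR":  {"rss": 35.0, "trace_s": 45.0},
    "MGWR": {"rss": 33.0, "trace_s": 30.0},
}
scores = {name: gwr_aicc(m["rss"], n_towns, m["trace_s"]) for name, m in candidates.items()}
ranking = sorted(scores, key=scores.get)  # lowest AICc first, i.e. preferred model first
```

The penalty term grows with the effective number of parameters, so a mixed model that fixes some coefficients globally can score better than a fully local model even when its residual sum of squares is only slightly lower.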

Impact of Housing on Urban Sustainability
Economically, our study captures the bust of the Great Recession and its aftermath through the 2009 ACS data and the 2010 census, as well as the recovery through the 2011-2013 ACS data. Determinants such as unemployment rate show differences from 2009 to 2013, while unprotected forest is significant beginning in 2009, suggesting that housing demand impacted forests in the period of boom that followed the economic recovery. Population density is significant in the recent years 2012-2013 and was likely driven by increasing employment in Metro Boston. Our spatial models capture the economic boom and bust in the various towns from 2009 to 2013, displaying the linkages between economic, social, and ecological factors in urban environments during critical periods. In addressing real estate expansion during periods of boom, one has to consider ecological and social impacts, while during periods of bust, urban planning has to consider the social implications of unemployment and the drop in housing value. Social factors such as unemployment impacted certain towns in Metro Boston (such as Natick, Framingham, Wayland, Wellesley, and Somerville) during this time period. This perspective on social determinants may inform policy makers about provisioning social services and employment opportunities in these towns. Ecological factors related to unprotected forest are also significant in Metro Boston towns (such as Bedford, Lexington, Burlington, Woburn, and Waltham), suggesting these towns should balance residential expansion and forest cover in the future.

Discussion
Our data-driven approach to modeling housing in this study calls for the integration of remote sensing, socio-economic, town, and other data. The study covers 336 towns and cities in the state of Massachusetts for the period 2000-2013, which was characterized by periods of boom and bust in the US economy following the housing market collapse in 2008 [52]. Our modeling approach of GWR and MGWR enables us to analyze how spatial non-stationarity of environmental, economic, and social determinants leads to changes in median home prices. The basic OLS model appears least suited to modeling median home prices, while both GWR and MGWR models using adaptive bandwidths perform better.
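As a minimal sketch of what an adaptive bandwidth means in practice, the example below calibrates a local regression at every location using a bisquare kernel whose reach is the distance to the k-th nearest neighbour, so that sparsely sampled areas borrow data from a wider radius than densely sampled ones. The kernel choice, the value of k, the synthetic data, and the function names (`adaptive_bisquare_weights`, `gwr_fit`) are assumptions for illustration and do not reproduce the calibration used in this study.

```python
import numpy as np

def adaptive_bisquare_weights(coords, i, k):
    """Bisquare weights for location i with a bandwidth equal to the distance
    to its k-th nearest neighbour (index 0 of the sorted distances is i itself)."""
    d = np.linalg.norm(coords - coords[i], axis=1)
    h = np.sort(d)[k]
    w = (1.0 - (d / h) ** 2) ** 2
    w[d >= h] = 0.0  # compact support: only i and its k-1 nearest neighbours get weight
    return w

def gwr_fit(coords, X, y, k):
    """Local weighted least-squares coefficients at every location."""
    n, p = X.shape
    betas = np.empty((n, p))
    for i in range(n):
        w = adaptive_bisquare_weights(coords, i, k)
        Xw = X * w[:, None]
        betas[i] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
    return betas

# Synthetic example (hypothetical data): the slope increases from west to east.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0.0, 1.0, size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_slope = 1.0 + 2.0 * coords[:, 0]
y = 0.5 + true_slope * X[:, 1] + rng.normal(scale=0.05, size=n)

betas = gwr_fit(coords, X, y, k=30)
slope_corr = np.corrcoef(betas[:, 1], true_slope)[0, 1]
n_neighbours_used = np.count_nonzero(adaptive_bisquare_weights(coords, 0, 30))
```

A single global OLS slope cannot follow the west-to-east drift in this synthetic example, whereas the local estimates track it closely, which is the behaviour that motivates GWR and MGWR over OLS.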
The analysis presented here can help us address a series of questions focused on social determinants. What is the relationship between an educated workforce and home prices? Boston and Cambridge are home to sectors that attract an educated workforce with high-wage jobs. Hence, people with higher education and income move closer to the suburban towns around Boston. This finding is shown in our analysis, with marked differences between 2009 and 2013. What is the impact of senior citizens on home prices? Our GWR and MGWR results suggest that senior citizens living in close vicinity to metro Boston exerted an influence on median home prices in 2013 but not in the period 2009-2012. Perhaps this segment of the population held on to homes that had depreciated in value (in 2008) and began selling them in 2010. The ownership of homes by seniors and its impact on home prices deserve further empirical scrutiny. We used public school CPI scores as a determinant of home value. We found that these scores were significant using the GWR model but were fixed as global variables using MGWR in 2000, 2010, 2012, and 2013. In future studies, we could incorporate other test scores (such as English Language Arts) to obtain a more nuanced understanding of the correlation between test performance across school districts in each town and the corresponding home prices.
What are the implications of spatial non-stationarity for environmental determinants? Unprotected forest cover in the eastern part of the state is more valuable than in rural western Massachusetts. This may give towns an opportunity to build resilient communities that can preserve unprotected forest or mobilize citizen activists to transform unprotected forest into protected forest. This investigation has implications for urban planning, as it can guide the conservation of key species. The work presented here can be extended to address reducing ecological footprints, reducing carbon emissions through reduced deforestation, and conservation in towns that could benefit from taking action to be resilient in the future. How does unemployment impact home prices? The unemployment rate is significant in determining home prices in all years using the GWR model. Suburban towns exemplify this pattern, indicating that an increase in unemployment is associated with a drop in home prices. The model predicts a negative relationship between unemployment and home prices across most towns. Property tax is another significant variable in our GWR analysis in determining home prices.
Housing impacts household behavior, policy, and the environment and therefore directly relates to urban sustainability [72]. Housing construction requires land, energy, and materials. Housing is directly connected to transportation and other externalities since people need to move back and forth from their homes for employment, recreation, and other activities [73,74]. Given the complex connectivity between housing, transportation, and other economic sectors, sustainability in housing has typically been examined as a function of resource and location efficiency [75,76]. Home prices are related to environmental determinants, as described in our paper and elsewhere [77][78][79], that will be transformed given the sustainability focus of many cities, including Boston. Attributes such as green homes and solar panels may increasingly impact future home prices. Many new housing units are built with a focus on energy efficiency. Modern urban architecture emphasizes eco-friendly, or "green," homes that use materials and building methods with lower energy requirements and that result in reduced energy bills for the homeowner. The decreasing price of solar panels and their increasing availability will impact home prices in the future. Accessibility to public transport is relevant in this paper today, but in the future accessibility to electric car charging stations may become equally important. We discussed the nature and valuation of open space, which is relevant in any discussion of urban sustainability as cities lower energy consumption, reduce emissions, and promote healthier lifestyles. Our approach can incorporate and estimate the non-stationarity of these sustainability-driven factors that may impact future home prices. Thus, home prices can inform us of the changing public attitudes towards sustainability in cities across the US in the coming decades.

Conclusions
We proposed an urban sustainability framework centered on housing that could address issues involving economic, social, and ecological dimensions. We used GWR and MGWR models to examine the spatial non-stationarity of key economic, ecological, and social determinants for which data are readily available at a larger spatial scale. Traditional approaches in both economics and spatial econometrics involve hedonic and spatial models that estimate prices of individual homes using block-level census data or town parcel data. Such models do not account for spatial non-stationarity at a regional or state scale. Our approach addresses urban sustainability at a broader spatial scale that encompasses the entire state. We show that a data-driven approach to modeling home prices at this broad scale calls for the integration of geospatial data from a number of sources, including remote sensing and census data, to address multiple facets of sustainability. Understanding the key determinants of housing values, such as the valuation of green spaces and proximity to public transport, enables towns to adopt different sustainability strategies, while unemployment rates and public school performance can help shape urban housing policies. We hope to incorporate other variables such as green homes, residential solar panels, charging stations, air quality, and other components to make our work relevant for assessing future urban sustainability.