EPC Labels and Building Features: Spatial Implications over Housing Prices

The influence of building or dwelling energy performance on real estate market dynamics and pricing processes has been widely explored, since energy efficiency improvement is one of the fundamental reasons for retrofitting the existing housing stock. Nevertheless, the joint effect produced by the building energy performance and the architectural, typological, and physical-technical attributes seems poorly studied. Thus, the aim of this work is to investigate the influence of both energy performance and diverse features on property prices, by performing spatial analyses on a sample of housing properties listed on Turin’s real estate market and on different sub-samples. In particular, Exploratory Spatial Data Analyses (ESDA) statistics, standard hedonic price models (Ordinary Least Squares, OLS) and Spatial Error Models (SEM) are first applied to the whole data sample, and then to three different sub-samples: two territorial clusters and a sub-sample representative of the most energy inefficient buildings constructed between 1946 and 1990. Results demonstrate that Energy Performance Certificate (EPC) labels are gaining power in influencing price variations, contrary to the empirical evidence that emerged in some previous studies. Furthermore, the presence of spatial effects reveals that the impact of energy attributes changes in different sub-markets and thus has to be spatially analysed.


Introduction
Changes in consumers' demand behaviours, expressed through the willingness to pay for housing properties or for interventions to improve their energy performance, are the core of a large number of studies in the real estate economics literature. Charalambides et al. [1] note that energy efficiency improvement is one of the principal reasons for retrofitting a house. In recent years the influence on prices of the building energy performance, expressed by means of the Energy Performance Certificate (EPC) labels, has been widely explored, showing evidence of the growing importance of EPC labels in the selling/rental price formation process [2,3]. Meanwhile, the linkage between the building energy performance and the architectural, typological, and physical-technical attributes seems to be less studied, although these building features, besides the location variable, are often relevant in price formation processes [4].
At the European level, the relationship between EPC label and selling prices has been studied in other urban contexts through the use of hedonic regression models, and results demonstrated the presence of a positive, albeit often limited, influence of the EPC on prices [5][6][7]. Furthermore, as highlighted by the most recent literature, it is worth analysing the effects of dwelling attributes (particularly the "green" ones) on property prices, by assuming their spatial variability and by considering the specific market segments to which the properties belong.
Assuming that housing unit intrinsic attributes have to be better analysed also in relation to the building types and typological components, the aim of this paper is to explore the influence of existing housing properties' relevant attributes and energy performance on prices, by also investigating their spatial component. Implicitly, the objective is also to identify which factors are able to strengthen the enhancement process through energy retrofit interventions, aiming both at improving the properties' energy efficiency and at increasing their economic values. Notice that this study analyses a data sample of real estate listings, which includes not only prices and energy attributes, but also other property characteristics, given that the energy retrofitting interventions can affect also the enhancement of other building features.
Methodologically, this study is based on spatial analyses of data points. Firstly, Exploratory Spatial Data Analyses (ESDA) are performed to verify the possible presence of spatial autocorrelation. The global and local spatial association is detected by means of the Local Moran's Index; then, by applying the Local Indicator of Spatial Association (LISA), cluster maps are produced in order to identify possible spatial clusters. Secondly, two regression models, the Ordinary Least Squares (OLS) model and the Spatial Error Model (SEM), are applied to the whole data sample. Thirdly, the same analyses are performed on three different sub-samples.
A sample of 2092 housing properties listed on the Turin's (Northern Italy) real estate market in the 2015-2018 time period is considered as the case study. After the analysis of the whole data sample, this study presents a focus on two different sub-samples (territorial clusters), and, finally, on a sub-sample representative of buildings with low EPC labels constructed between 1946 and 1990, which are the most critical from an energy viewpoint.
In general, the results of the analyses performed on the complete sample suggest that the effects of the EPC, mandatory by law, are starting to affect real estate market prices, in line with some international studies [8][9][10][11]. More precisely, the results of the analyses on the whole sample show that, in comparison to some previous studies on Turin's real estate market [12,13], the EPC label effect on prices is slightly increasing. In particular, it is confirmed that lower EPC labels (E, F, G) negatively affect listing prices in terms of marginal prices; furthermore, for the first time higher EPC labels (A-A4, B) slightly but positively influence property prices.
The analyses performed on the two spatial sub-samples, one in the Northern zone and one in the Centre zone of the city of Turin, reveal that the high EPC labels (A-A4, B) have a positive influence on prices only in the central sub-sample, while the low EPC labels (E, F and G) negatively affect prices only in the northern sub-sample. Finally, the analyses performed on the low energy-efficient dwellings sub-sample, with EPC label (E, F and G) and built in the 1946-1990 time period, confirm that the attributes related to buildings' energy behavior, for the existing buildings stock, are in many cases not related to the building quality.
Furthermore, the research findings open the way to interpretations, illustrated in the paper, which allow also for updating and extending the framework on the influence of the EPC on prices by comparing with the results of two previous studies [12,13], in which the impact of EPC on listing prices and transaction prices was analysed in the same Italian city.
Finally, the results of this study can support construction companies, real estate developers and property owners in their investment decisions: for example, in selecting the retrofit interventions that are particularly effective on the building/dwelling performance and that, consequently, increase the property values.
Assuming these premises, the paper is divided into the following parts: Section 2 presents the considered literature and scientific background on the topic. In Section 3 the methodological approach is presented. In Section 4 the case study and data sampling are illustrated. Section 5 presents the results of the application and, finally, Section 6 concludes.

Analysis of Pricing in Housing Economics
A significant part of housing economics literature is devoted to exploring the determinants of real estate prices. In general, the methods adopted by scholars to analyse residential real estate prices include the following (notice that the cited studies refer specifically to the recent exploration of the contribution of green attributes to prices, but the methodologies illustrated are commonly used in real estate analysis): Hedonic Price Model (HPM) [14], Hedonic Price Model combined with spatial specification [15,16], regression and multivariate analysis [17], binary logit regression models in conjunction with a Polytomous Universal Model [18], quantile regression [19] and evolutionary polynomial regression [20]. Furthermore, Structural Equation Models are explored as an alternative to regression models for exploring the presence of latent variables [21][22][23]. Besides, other studies focus on the use of Neural Networks and Genetic Algorithms [24][25][26].
Among the wide variety of methods, two families of tools emerge from the literature: the first is founded on the consolidated econometric approach through the Hedonic method, and the second opens up to Spatial Econometrics.
Known since the second half of the 1960s, the first method has for many years represented the privileged methodology for real estate market analysis and, specifically, for detecting the impacts of building/dwelling attributes on pricing processes. Rosen [27] explored hedonic models, operatively solved through Multiple Regression Analysis, for detecting the price determinants of housing properties, and even earlier Ridker [28] studied the relation between housing prices and environmental qualities. It must be stressed that the hedonic model was the most frequently applied among the traditional approaches to housing market analysis, specifically for detecting the influence of structural, environmental and neighbourhood attributes, as demonstrated by Malpezzi [29], who provides a review of traditional hedonic regression models. In many works the impact of location on prices is empirically demonstrated [30]. Likewise, the importance of geographical segmentation in price prediction is underlined in many studies [31][32][33][34][35].
The pilot studies have been followed in the last decades by a number of publications worldwide, oriented towards the treatment and modeling of the spatial effects. In fact, the well-known criticalities due to the spatial heterogeneity (heteroskedasticity) and the spatial autocorrelation (interdependence) are explored [32], with the support of tests for the detection of spatial effects [36][37][38], opening the way to Spatial Econometrics. Implicitly, the recent spread of spatial analyses is related to the weight assumed by the location variable in value formation.
From the microeconomics viewpoint, for applying Spatial Econometrics methods it is fundamental to explore the possible presence of spatial dependence in the property values. In fact, as is well known in the scientific research context, since property prices are spatial data, in addition to depending on multiple factors, they are influenced by the values of nearby locations. This phenomenon is well known in literature as the spatial autocorrelation of property prices [39]. Given the importance of taking into consideration the spatial dependence in data and the issues generated by spatial autocorrelation, many studies carry out spatial analyses and spatial regressions. For example, Wilhelmsson [40] discussed the importance of adopting spatial econometrics as a way to explore the size of bias that can occur in parameters when spatial effects are considered. The study starts from the theory of spatial econometrics in real estate economics; through an empirical analysis, it demonstrated (among other conclusions) that the spatial hedonic model is able to explain a higher percentage of the price variation, and, above all, that the economic interpretation changes by including a spatial structure in the hedonic model.
Operatively, spatial analyses were founded on the relevance of space in influencing the real estate prices. Dubin [41] introduced the spatial effects in the hedonic model, assuming, in the hedonic regression, the autocorrelation of the error term. LeSage and Pace [42] discussed the adjacency effect considered as an influence (or spill-over effect) among neighbouring housing prices [43]. By implementing the hedonic model with the management of the spatial effects the Spatial Auto Regressive Model (SAR), the Spatial Lag Model (SLM) and Spatial Error Model (SEM) were obtained [44,45] with lattice data. Notice that the wide variety of spatial regression models were developed due to the growing availability of data technologies [46]. Besides, widely used methods are the Global Indicators of Spatial Autocorrelation (GISA) and the Local Indicators of Spatial Autocorrelation (LISA), among ESDA techniques [33,40,47,48].
In the following sub-section, the background of the study is illustrated, mainly considering some of the studies strictly related to the topics of this research.

A brief Overview on the European and Italian Regulatory Framework
The buildings' energy performance assessment was introduced in the European regulatory framework in 2002 and is expressed by different indicators that are part of the Energy Performance Certificate (EPC) of buildings and residential units. The European regulatory framework is presented in Table 1, which includes the following entries:
• Provisions supplementing the European Directive 2010/31/EU on the energy performance of buildings: they establish a comparative methodology framework for calculating cost-optimal levels of minimum energy performance requirements for buildings and building elements. The cost-optimal methodology has been recently explored [52].
• European Directive 2018/844/EU [53]: it is aimed at accelerating the cost-effective renovation of existing buildings, towards a decarbonized building stock by 2050 and the mobilization of investments.
Concerning the Italian context, laws and standards have been produced aiming at transposing the European Directives. In Italy, the EPBD was adopted firstly by Legislative Decree n. 192 on 19 August 2005, according to which buildings are rated on a scale from A+ to G. The EPC labels are calculated in each Italian region, according to the procedure of the standards UNI/TS 11300 parts 1 and 2 for evaluating the building energy performance index. By construction, the highest label A+ corresponds to the lowest Energy Performance index for Heating (EPH). Coherently, the lowest label G indicates the highest EPH. With the Legislative Decree n. 28 dated 3rd March 2011, since 1st January 2012, the EPC is a mandatory requirement for dwellings listed in the selling/rental market. With the Ministerial Decree dated 22nd November 2012, the owners cannot self-certify the lowest EPC label G.
The EPBD recast was then adopted by Legislative Decree n. 63 dated 4th June 2013, which, with the Law n. 90 dated 3rd August 2013, introduced a new methodology for the calculation of the energy performance of buildings, later adopted by the Ministerial Decree dated 26th June 2015. This last norm includes the National Guidelines for buildings' EPC and introduced a new EPC format, based on ten homogeneous labels at the national level. The highest level is A4 and corresponds to the lowest EP, whilst the lowest level is G with the highest EP.

The Energy Performance Certificate Effects on Housing Prices
Contextually, with the evolution of the regulatory framework and the consequent evolution of the methods for applying the norms and rules, a growing literature at the international/national level has been produced. According to the real estate economics perspective, many studies aim to detect the impact of EPC labels on pricing processes and on market dynamics. Operatively, on the same line as the widespread literature produced, a great number of these studies utilize the Hedonic Price Method.
More recently, a wide amount of literature has explored the potentialities of spatial econometrics in real estate market analyses, even by comparing spatial approaches with traditional Hedonic methods. Special attention is posed on the detection of green attributes' influence on pricing processes. Among the most recent literature, some studies can be cited as examples in the context of the present research due to the object of the analyses.
In a study by Bottero et al. [8], the conjoint use of a spatial econometric model and of the Hedonic Price Method is analysed for estimating the implicit marginal price as a measure of the willingness to pay for building energy consumption in the city of Turin. More precisely, the study tries to estimate the differential of building energy performance in monetary terms. It is assumed necessary to evaluate the social cost of energy wastes. As the authors report, the results show an evident necessity to pay particular attention towards the coherence between the spatial and econometric approaches.
In a recent study, Chen and Marmolejo-Duarte [9,54] reflect on the differences between dwellings whose energy efficiency impacts price formation and dwellings whose energy efficiency does not influence pricing processes, in Barcelona's (Spain) residential market. Starting from studies about the energy efficiency marginal prices methodologically founded on the use of hedonic models, the authors assume that, in some cases, the increase of the relevance of EPC labels, in terms of marginal prices, is not constant for all the classes (in Spain, A-G). They propose to analyse the differences in terms of architectural and location characteristics between dwellings that show an increase in EPC rating marginal prices, and, at the same time, dwellings for which EPC seems not to be relevant for pricing processes. More precisely, the study proposes a pooled Spatial Error Model for exploring the prices of multifamily houses in the period 2014-2016. Among others, the following results are particularly interesting for the present work: firstly, in general there is a correlation between spatial and house characteristics, and the attributes can reveal a weight varying as a function of the building location (spatial effects); secondly, architectural attributes and location contributions to the green premium must be further explored.
In a study by Dell'Anna et al. [11], a comparison between two European cities belonging to two different climate zones, Barcelona (Spain) and Turin (Italy), is presented. Methodologically, datasets of listing prices in the residential market are analyzed with the Hedonic Price Method and Spatial Econometric Models (specifically, Spatial Autoregressive Model and Spatial Error Model), for detecting the contribution (in terms of marginal price) of green attributes, and, at the same time, for controlling the spatial correlation among the prices. As highlighted by the authors, the results related to the two cities are different (in Italy, EPC is more relevant than in Spain, where single characteristics are more appreciated). Furthermore, they deduce that the EPC implementation is still irregular in European States, and also that it can be reinforced by introducing a standardized rating model for EPC. The importance of introducing the effects of location when analyzing green labels is confirmed, and consequently, so is the necessity to strengthen the Hedonic Price Model with Spatial Econometric Models.
Besides the abovementioned researches, other recent experiences on the topic are assumed in the background as being specifically aimed to calculate the spatial autocorrelation in property prices in the city of Turin; in these studies, spatial analyses are proposed to detect the presence of spatial dependence between different kinds of indicators and to manage the spatial latent variables in the property price determination process [55][56][57][58].
To conclude the scientific background, a comparative reading with the results of two previous studies in the Italian context, focused on the relevance of the building EPC label in listing/transaction price formation and on supply and demand behaviors, is fundamental.
The first study by Fregonara et al. [12] aimed at investigating the economic effects of the Italian statutory provisions, related to the energy performance of buildings, on listing behaviors. More precisely, the study was directed towards measuring the impact of the EPC labels on listing prices, for analyzing whether the EPC label, besides being a mandatory requirement, is considered a significant aspect able to influence dwelling prices. The study assumes that listing prices and dwelling characteristics are the initial and fundamental information considered by sellers/buyers during a first preliminary analysis [59,60]. A data sample of 577 listing prices collected in Turin in the year 2012 (source: real estate advertisement websites) is analyzed by means of a log-linear hedonic regression model. The results demonstrate that a low level of EPC (EPC label F) has a significant influence; the same results were confirmed in the model tested with EPC label clusters. In conclusion, EPC labels only partially explain listing prices.
In the second study by Fregonara et al. [13], the impact of the EPC label on dwelling prices and on market liquidity was analyzed, considering both listing and transaction prices. Particularly, the time on the market and the difference between listing prices and transaction prices were considered. A sample of 879 transactions of old apartments in Turin in the period 2011-2014 was considered, to analyze the effect of a set of variables on the listing and transaction prices, and, furthermore, on time on the market and bargaining outcome. Besides the EPC labels, building construction period and the main dwelling characteristics were modelled. The study revealed that low EPC labels (E, F and G) are priced in the market but explain only 6-8 per cent of price variation; then, including the dwelling principal characteristics, EPC labels have no impact on prices (more precisely, the G rating is weakly significant. Notice that the G level is to be considered with special attention, due to the possibility, at least for a certain period, to self-certify the dwelling). Finally, a focus on old buildings built in the period of 1940-1989 confirms the previous results. From the study emerges the clear influence of the building construction period and of assets' location on prices/real estate market dynamics.
As a general conclusion, the results of these last two studies suggest the presence of latent variables that are able to capture the explanatory power of the EPC label, which must be carefully explored in line with the literature. This consideration encourages us to direct the research towards a deeper investigation of the explanatory power of property characteristics, not only in the pricing process, but also in explaining consumers' behavior and the related spatial real estate dynamics. Above all, the examination of the recent literature on the topic reveals the necessity to introduce spatial econometrics beside the traditional Hedonic Price Method to manage the influence of spatial effects on pricing processes.

Methodological Approach
The analyses performed in this paper were mainly based on two widely known methods: Exploratory Spatial Data Analyses (ESDA) and the Spatial Error Model (SEM). The spatial regression model was applied to investigate the influence of the Energy Performance Certificate label and other property intrinsic features on listing prices. The analyses were focused on a data sample of real estate listings and three different sub-samples, which were selected on the basis of two different approaches.
Methods and approaches are briefly illustrated in the following subsections.

ESDA Statistics
Before performing regression models, the correlation among the considered variables has to be analysed by means of the Spearman correlation test. Subsequently, Exploratory Spatial Data Analyses are necessary to investigate the presence of spatial autocorrelation that usually affects the real estate market [30,48,61,62]. When property prices and a series of characteristics are analysed across territorial units, the global and local spatial association that exists between each unit and the neighbouring ones has to be detected. Therefore, the Local Moran's Index is calculated to assess the spatial autocorrelation level [63], as follows (1):
$$I_i = z_i \sum_{j \in J_i} w_{ij} z_j \qquad (1)$$
where z_i and z_j are the standardized values of the variable observed at locations i and j, w_ij are the spatial weights, and the summation over j is such that only neighbouring values j ∈ J_i are included. For ease of interpretation, the weights w_ij may be in row-standardized form, and by convention w_ii = 0. Furthermore, the Local Indicator of Spatial Association (LISA) [64] cluster maps are generated to identify possible spatial clusters and the related spatial association classification between the following four categories:
• spatial cluster defined by high values of the investigated phenomenon with a high level of similarity with their surroundings, named "high-high" or "hot spots";
• spatial cluster defined by observations with low values and a high level of similarity with their surroundings, named "low-low" or "cold spots";
• spatial outliers defined by observations with high values surrounded by low ones, named "high-low";
• spatial outliers defined by observations with low values surrounded by high ones, named "low-high".
Spatial clusters are identified on the basis of a Row Standardized Queen Contiguity-First Order Weight matrix (W), representative of the degree of the spatial autocorrelation of each territorial unit and its surroundings, by means of the GeoDa software (software tool devised by Centre for Spatially Integrated Social Sciences-CSISS) [65].
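As an illustration of Equation (1) and of the four LISA categories, a minimal sketch in plain Python is given below. It is not the GeoDa procedure used in this study: the prices are hypothetical, the neighbour lists of a five-unit chain stand in for a queen contiguity structure, and the permutation-based significance test that decides which units are actually mapped as clusters is omitted.

```python
from statistics import mean, pstdev

def row_standardize(neighbors, n):
    """Build a row-standardized weight matrix from {i: [neighbour ids]}."""
    weights = [[0.0] * n for _ in range(n)]
    for i, js in neighbors.items():
        js = [j for j in js if j != i]        # convention: w_ii = 0
        for j in js:
            weights[i][j] = 1.0 / len(js)     # each row sums to 1
    return weights

def local_moran(values, weights):
    """Return (I_i, category) for each unit, with I_i = z_i * sum_j w_ij z_j."""
    mu, sd = mean(values), pstdev(values)
    z = [(v - mu) / sd for v in values]
    result = []
    for i, zi in enumerate(z):
        lag = sum(w * zj for w, zj in zip(weights[i], z))  # spatial lag of z
        if zi >= 0:
            category = "high-high" if lag >= 0 else "high-low"
        else:
            category = "low-low" if lag < 0 else "low-high"
        result.append((zi * lag, category))
    return result

# Hypothetical unit listing prices (Euro per square meter), units 0-1-2-3-4 in a chain
prices = [3200.0, 3000.0, 2100.0, 1300.0, 1200.0]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

W = row_standardize(neighbors, len(prices))
lisa = local_moran(prices, W)
for unit, (index, category) in enumerate(lisa):
    print(unit, round(index, 3), category)
```

A positive I_i signals a unit that resembles its neighbours (a candidate cluster), while a negative value signals a candidate spatial outlier.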

The Spatial Error Model (SEM)
Spatial hedonic models aim to manage the spatial components of dependent and explanatory variables to reach more precise and unbiased results. All spatial hedonic models are based on the classical ordinary least squares (OLS) regression model. Commonly, the OLS model is preliminarily applied to detect the influence of a set of independent variables (observable attributes) on the dependent variable, which in the real estate market is usually represented by the property price (listing price or transaction price, according to the data availability). The hedonic model, which in turn is founded on the OLS estimation algorithm, can be formally represented as follows (2):
$$Y_i = \alpha_0 + \sum_{k=1}^{K} \alpha_k X_{ik} + \sum_{m=1}^{M} \beta_m Z_{im} + \varepsilon_i \qquad (2)$$
where Y_i stands for the dependent variable observed for property i (i = 1, . . . , n), α_0 stands for the model intercept, X_ik, with k = 1, . . . , K, and Z_im, with m = 1, . . . , M, stand for the variables introduced for the observable characteristics, α_k and β_m represent the hedonic weights assigned to each variable, i.e., the contribution given by each single characteristic level to the price value, and ε_i represents the error term.
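The OLS estimation of the hedonic weights can be sketched as follows. This is only an illustrative example under stated assumptions: the two dummy variables (a high and a low EPC cluster), the prices and the generating weights (2000, +300, -250 Euro per square meter) are all hypothetical, and the data are noise-free so that the estimates reproduce the generating weights.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]          # partial pivoting
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Return [intercept, weight_1, ...] from the normal equations X'X b = X'y."""
    x = [[1.0] + list(r) for r in rows]              # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(xi[a] * xi[b] for xi in x) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(x, y)) for a in range(k)]
    return solve(xtx, xty)

# Hypothetical dummies per listing: (high EPC cluster, low EPC cluster)
rows = [(0, 0), (1, 0), (0, 1), (0, 0), (1, 0), (0, 1)]
# Hypothetical noise-free unit prices: 2000 + 300 * high - 250 * low
y = [2000.0 + 300.0 * high - 250.0 * low for high, low in rows]

coef = ols(rows, y)
print([round(c, 2) for c in coef])
```

In this setting each weight is read as a marginal price: the difference in unit price associated with a characteristic, all other characteristics being equal.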
In performing OLS models, the presence of spatial dependence has to be tested; if the test results are significant, the model is probably biased. The statistics are the simple Moran's I on the spatial autocorrelation of errors, the simple Lagrange Multiplier (LM-lag) test for a missing spatially lagged dependent variable, the simple LM test for error dependence (LM-error), and their robust variants: Robust LM (lag), Robust LM (error) and a portmanteau test (SARMA), composed of Lagrange Multiplier (error) and Robust LM (lag). Therefore, the spatial autocorrelation has to be managed by means of the spatial regression models described in the following subsections [63,66].
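Two of the diagnostics listed above can be sketched in a few lines. The example assumes the textbook forms of the statistics, namely Moran's I of the residuals, I = (n / S0) (e'We / e'e), and the LM-error statistic, LM = (e'We / (e'e / n))^2 / tr(W'W + WW), which is asymptotically chi-squared with one degree of freedom. The residuals and the weight matrix are hypothetical, and the robust variants and the LM-lag test are not covered.

```python
def matvec(w, v):
    """Multiply the matrix w by the vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def morans_i(residuals, w):
    """Moran's I of the residuals: (n / S0) * e'We / e'e."""
    n = len(residuals)
    s0 = sum(sum(row) for row in w)                  # sum of all weights
    return (n / s0) * dot(residuals, matvec(w, residuals)) / dot(residuals, residuals)

def lm_error(residuals, w):
    """LM test for spatial error dependence (chi-squared, 1 d.o.f. under H0)."""
    n = len(residuals)
    sigma2 = dot(residuals, residuals) / n           # ML estimate of the error variance
    trace = sum(w[i][j] * w[i][j] + w[i][j] * w[j][i]
                for i in range(n) for j in range(n)) # tr(W'W + WW)
    return (dot(residuals, matvec(w, residuals)) / sigma2) ** 2 / trace

# Hypothetical OLS residuals for five units in a chain, row-standardized weights
e = [1.0, 0.8, -0.1, -0.8, -0.9]
W = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]

moran = morans_i(e, W)
lm = lm_error(e, W)
print(round(moran, 4), round(lm, 4))
```

The LM value would then be compared with a chi-squared critical value with one degree of freedom; with so few hypothetical units the example only shows the arithmetic, not a meaningful test.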
Assuming that the spatial component of prices has to be managed in order to make regression models unbiased [67] in the presence of spatial autocorrelation of the residuals, two possible spatial models can be used: Spatial Lag Model (SLM) and Spatial Error Model (SEM). In order to identify the best model, the results of Moran's I test and the Lagrange Multiplier tests (LM-lag and LM-error) must be compared in the early phases of the analysis [62].
In this study, price variability shows a non-linear relation with explicative variables, thus the spatial autocorrelation is managed by applying SEM, which is estimated by the Maximum Likelihood Estimator (MLE) algorithm, given the assumption of normality in the error term. This approach is suggested when errors in measuring the locational characteristics are present, or when the errors of the hedonic Model are correlated due to spatial effects. This effect can be reduced by introducing a correction of the error in the model (in fact, SEM is also known as Spatial Error Correction Model).
The SEM can be specified as (3):
$$Y_i = \alpha_0 + \sum_{k=1}^{K} \alpha_k X_{ik} + \sum_{m=1}^{M} \beta_m Z_{im} + \varepsilon_i, \qquad \varepsilon_i = \lambda\, w_i\, \varepsilon + u_i \qquad (3)$$
where u_i is the random error (independent identically distributed, i.i.d.), and the spatially structured error is composed of the added spatial error coefficient (λ) and the original error term (ε) weighted by the i-th row w_i of the weight matrix (W). If there is no spatial correlation between errors, then λ = 0. If λ ≠ 0, OLS is unbiased and consistent, but the standard errors will be wrong and the estimated weights will be inefficient. Notice that a positive and significant value of λ means that the model fit is good.
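To make explicit why Maximum Likelihood estimation is used, the SEM can be rewritten in matrix form. The compact notation below is an assumption introduced only for this sketch: \mathbf{X} stacks a column of ones and all the explanatory variables, \boldsymbol{\theta} stacks the intercept and the hedonic weights, and \mathbf{u} is assumed normal with variance \sigma^2.

```latex
\begin{align}
\boldsymbol{\varepsilon} &= \lambda \mathbf{W} \boldsymbol{\varepsilon} + \mathbf{u}
  \;\Longrightarrow\;
  \boldsymbol{\varepsilon} = (\mathbf{I} - \lambda \mathbf{W})^{-1} \mathbf{u} \\
\mathbf{Y} &= \mathbf{X}\boldsymbol{\theta} + (\mathbf{I} - \lambda \mathbf{W})^{-1} \mathbf{u}
  \;\Longrightarrow\;
  (\mathbf{I} - \lambda \mathbf{W})\mathbf{Y}
  = (\mathbf{I} - \lambda \mathbf{W})\mathbf{X}\boldsymbol{\theta} + \mathbf{u} \\
\ln L(\boldsymbol{\theta}, \lambda, \sigma^2) &=
  -\frac{n}{2}\ln(2\pi\sigma^2)
  + \ln\left|\mathbf{I} - \lambda \mathbf{W}\right|
  - \frac{1}{2\sigma^2}
    (\mathbf{Y} - \mathbf{X}\boldsymbol{\theta})^{\top}
    (\mathbf{I} - \lambda \mathbf{W})^{\top}
    (\mathbf{I} - \lambda \mathbf{W})
    (\mathbf{Y} - \mathbf{X}\boldsymbol{\theta})
\end{align}
```

For a given λ the second line is an ordinary regression on spatially filtered variables; the log-determinant term in the third line is what prevents λ from being estimated by least squares alone.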

Sample Stratification and Spatial Clustering
The analyses on the whole data sample can be focused by applying two different stratification approaches.
Firstly, two different sub-samples can be spatially identified, starting from the results of the Local Indicator of Spatial Association (LISA) cluster maps, generated by analysing the listing price variable.
The first sub-sample can be identified by including the listing prices classified as "high-high" clustered and their surroundings, within a maximum 500 m distance. The second sub-sample instead can be identified by assuming the listing prices "low-low" clustered and their surroundings, within a maximum 500 m distance (Figure 1).
Secondly, another attribute (not spatial) sub-sampling approach can be assumed to create a different sub-sample, based on intrinsic features, representative of the most energy inefficient housing units. In particular, a set of housing units can be selected on the basis of the following assumptions related to building features:
• Construction period: in the 1946-1990 time period, which represents the post-war period, many medium Italian cities were characterized by a consistent population growth, a great urban development and a series of economic policies finalized to support housing construction activities at the municipal level, even if not yet sensitive to the building energy issues. In fact, buildings built during that period can be considered particularly "energy voracious" due to their typological and technological characteristics (one-layer walls, single glass windows, low quality of materials, etc.);
• EPC label: when precise information about the building physical features is not available, this variable can represent a suitable proxy of the energy performance.
It is worth mentioning that this second approach is based on two building intrinsic features (construction period and EPC label) that do not generate spatial clusters. These are significant characteristics for the study area of the city of Turin, but they could differ in other urban contexts and thus they could be modified according to the built heritage specificities.
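The two stratification approaches described in this subsection can be sketched as simple selection rules. The listings, coordinates and field names below are hypothetical, the construction period is assumed to be available as a year, and the great-circle (haversine) distance is an assumption of this sketch, since the distance measure behind the 500 m threshold is not specified here.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    earth_radius_m = 6371000.0
    phi1, phi2 = radians(lat1), radians(lat2)
    a = (sin((phi2 - phi1) / 2) ** 2
         + cos(phi1) * cos(phi2) * sin(radians(lon2 - lon1) / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(a))

def spatial_subsample(listings, cluster_label, max_dist_m=500.0):
    """Ids of listings in a LISA cluster or within max_dist_m of one of its members."""
    cores = [item for item in listings if item["lisa"] == cluster_label]
    return {
        item["id"] for item in listings
        if any(haversine_m(item["lat"], item["lon"], core["lat"], core["lon"]) <= max_dist_m
               for core in cores)
    }

def inefficient_subsample(listings):
    """Ids of listings built in 1946-1990 with a low EPC label (E, F or G)."""
    return {
        item["id"] for item in listings
        if 1946 <= item["year"] <= 1990 and item["epc"] in {"E", "F", "G"}
    }

# Hypothetical listings; "lisa" is the category from the price cluster map
listings = [
    {"id": "a", "lat": 45.0700, "lon": 7.6800, "lisa": "high-high", "year": 1900, "epc": "D"},
    {"id": "b", "lat": 45.0720, "lon": 7.6800, "lisa": "not significant", "year": 1965, "epc": "F"},
    {"id": "c", "lat": 45.0800, "lon": 7.6800, "lisa": "not significant", "year": 1975, "epc": "G"},
    {"id": "d", "lat": 45.1000, "lon": 7.7000, "lisa": "low-low", "year": 1995, "epc": "E"},
    {"id": "e", "lat": 45.1030, "lon": 7.7000, "lisa": "not significant", "year": 1946, "epc": "E"},
]

high_high = spatial_subsample(listings, "high-high")
low_low = spatial_subsample(listings, "low-low")
inefficient = inefficient_subsample(listings)
print(sorted(high_high), sorted(low_low), sorted(inefficient))
```

The first two selections depend on location only, while the third ignores location entirely, which is why it does not generate spatial clusters.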

Case Study, Data and Statistics
The real estate market of the city of Turin (Northern Italy) was assumed as a case study for this study. A data sample of existing housing units listed on the market in the 2015-2018 time period was used by assuming a database from the Turin Real Estate Market Observatory (TREMO). The TREMO was founded in 2000 by a partnership between the Politecnico di Torino and the Municipality of Turin; it constantly monitors and analyses housing prices and the related intrinsic or extrinsic features, assuming the different territorial segments of the city of Turin, which correspond to different real estate submarkets.
The data sample consisted of 2092 real estate listings published on one of the most important Italian real estate market advertisement websites, which were analysed individually in order to study the listing prices and a series of features related both to the housing unit and to the building.
The data set results from a prior phase of selection and cleaning, in which outliers and incomplete records were removed. The unitary mean listing price of the sample is 2210 Euro per square meter, while the standard deviation is 983 Euro per square meter.
As is known, in Italy selling prices of real estate properties are not public information. Thus, selling prices are difficult to observe. Nevertheless, in many studies listing prices are used by researchers, appraisers and real estate companies to perform market analyses and to estimate the value of properties.
In fact, as shown in the literature, listing prices are capable of representing a fundamental aspect of the asset value formation process, specifically for their influence on selling processes and price prediction [68,69]. Furthermore, studies demonstrate the impact of listing prices on asset liquidity [70] and on price spreads, represented by the difference between the listing price and the selling price [71]. In the Italian context, a study analyses whether and to what extent listing prices can be considered a proxy for selling prices [72].
For these reasons, in this study listing prices are assumed as the main information available to analyse house pricing processes. A fundamental aspect is that listing prices are a function of house attributes and are able to influence the bargaining processes [73]; this makes it possible to analyse the impact of green attributes and EPC labels on prices/values, as explored in previous studies [12,13].

Variables and Descriptive Statistics
In addition to the listing price variable, a set of relevant characteristics was selected and analysed, as shown in Table 2.
Lastly, two "green variables" were considered in order to analyse the energy performance level and the technological equipment of the building: the presence of an air-conditioning system (ARC), and the EPC label. The analysis of the latter variable, the EPC label, is particularly relevant considering the aim of the present study, since it can be considered a proxy for a series of other variables related to technological and physical-technical characteristics. Furthermore, being strictly related to energy consumption, if considered jointly with other characteristics (e.g., level of maintenance and age of the building), it can also represent an "indicator of energy voracity" of a building. This indicator might give a more complete picture of the building features that could be appreciated by buyers and might be one part of the comprehensive "quality performance" of the buildings themselves.
In this study, the EPC labels were grouped in three levels: "high EPC labels" (A-A4, B), which account for 3% of the sample; "medium EPC labels" (C, D), which account for 39% of the sample; and "low EPC labels" (E, F, G), which account for 58% of the sample. In Figure 2, the three groups are spatially represented: high labels are a small part of the sample and are mainly located in the city centre. On the contrary, medium and low labels constitute the main part of the sample and are more scattered all over the city. In particular, the representation of the "low EPC labels" group confirms that it cannot correspond to a proper spatial cluster.
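The three-level grouping described above can be sketched as a simple recoding step. This is a minimal illustration, not the authors' procedure: the function names, the exact label spellings (including a plain "A"), and the numeric code 2 for the medium group are assumptions; codes 1 (low) and 3 (high) follow the EPC (1) and EPC (3) levels used in the regression results.

```python
from collections import Counter

# Assumed label spellings; the paper only states "A-A4, B", "C, D" and "E, F, G".
HIGH_LABELS = {"A4", "A3", "A2", "A1", "A", "B"}
MEDIUM_LABELS = {"C", "D"}
LOW_LABELS = {"E", "F", "G"}


def epc_group(label: str) -> int:
    """Recode an EPC label to 3 (high), 2 (medium, assumed code) or 1 (low)."""
    cleaned = label.strip().upper()
    if cleaned in HIGH_LABELS:
        return 3
    if cleaned in MEDIUM_LABELS:
        return 2
    if cleaned in LOW_LABELS:
        return 1
    raise ValueError(f"Unknown EPC label: {label!r}")


def group_shares(labels):
    """Return the share of listings falling in each EPC group."""
    counts = Counter(epc_group(label) for label in labels)
    total = sum(counts.values())
    return {group: counts.get(group, 0) / total for group in (3, 2, 1)}


# Hypothetical toy labels, not TREMO data.
toy_labels = ["A1", "C", "D", "G", "F", "E", "E", "G", "D", "B"]
toy_shares = group_shares(toy_labels)
```

Applied to the full sample, a recoding of this kind would be expected to reproduce the 3%, 39% and 58% shares reported in the text.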

Data Sub-Sampling
The whole data sample was split into different sub-samples by applying the two abovementioned stratification approaches.
Firstly, two spatial sub-samples were selected, on the basis of the LISA "high-high" and "low-low" cluster maps (in Section 5.1, Figure 5), to analyse in depth the different influence of the EPC label in different areas of the city. The so-called "Central Zone" (CZ) included the spatially correlated "high-high" values (LISA = 1), the probable outliers (LISA = 3 and 4) and the "not significant" values (LISA = 0) that fell in the same zone, while the "Northern Zone" (NZ) cluster included the "low-low" values (LISA = 2), the probable outliers (LISA = 3 and 4) and the "not significant" values (LISA = 0) that fell in this zone (as previously explained in Section 3.3, Figure 1). In Figure 3, the result of the first spatial sub-sampling is presented: the Central Zone (CZ) sub-sample (dark green dots) and the Northern Zone (NZ) sub-sample (light green dots).
Secondly, a third sub-sample, including only the energetically inefficient buildings, was created by applying the abovementioned attribute (not spatial) sub-sampling approach. Therefore, this last sub-sample, named "Low Energy Efficient (LEE)," was not based on a spatial clustering, and it included housing units in buildings built between 1946 and 1990 with a low EPC label (EPC = E, F, G), which are randomly located in the city of Turin, as Figure 4 shows.
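The selection rules for the three sub-samples can be sketched as plain filters. This is a minimal sketch under stated assumptions: the `Listing` record, its field names and the `zone` tag (a hypothetical label saying which territorial zone a listing falls in) are illustrative and are not part of the TREMO database as described here; only the LISA codes, the 1946-1990 period and the E, F, G labels come from the text.

```python
from dataclasses import dataclass

LOW_EPC = {"E", "F", "G"}


@dataclass
class Listing:
    listing_id: int
    lisa: int        # 0 = not significant, 1 = high-high, 2 = low-low, 3/4 = outliers
    zone: str        # hypothetical tag: "central", "northern" or "other"
    year_built: int
    epc: str


def central_zone(listings):
    """CZ: high-high, outlier and not-significant listings in the central zone."""
    return [x for x in listings if x.zone == "central" and x.lisa in {0, 1, 3, 4}]


def northern_zone(listings):
    """NZ: low-low, outlier and not-significant listings in the northern zone."""
    return [x for x in listings if x.zone == "northern" and x.lisa in {0, 2, 3, 4}]


def low_energy_efficient(listings):
    """LEE: attribute-based selection, independent of location."""
    return [x for x in listings
            if 1946 <= x.year_built <= 1990 and x.epc in LOW_EPC]


# Hypothetical toy records, not real listings.
toy = [
    Listing(1, 1, "central", 1900, "B"),
    Listing(2, 0, "central", 1960, "F"),
    Listing(3, 2, "northern", 1975, "G"),
    Listing(4, 3, "northern", 2005, "C"),
    Listing(5, 0, "other", 1950, "E"),
]
cz_ids = [x.listing_id for x in central_zone(toy)]
nz_ids = [x.listing_id for x in northern_zone(toy)]
lee_ids = [x.listing_id for x in low_energy_efficient(toy)]
```

Note that the LEE filter cuts across the two territorial zones, which is consistent with the observation that this sub-sample does not form a spatial cluster.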

Results
By applying the methodological approach to the case study presented in Section 4, the dataset was processed and tested by means of the GeoDa software (1.14.0-24 August 2019), and by the open-source software "R" (by R Foundation for statistical computing, software version 3.6.1 (5 July 2019)); spatial analyses and maps were produced by the ArcGIS Desktop software package (software version 10.8.0.12790, Esri, Redlands, CA, USA). The results are presented and discussed in the following sub-sections.

ESDA Statistics and Data Sub-Samples
Firstly, the Spearman correlation test was performed: results showed the absence of significant correlation among the considered variables. Then, to verify the presence of spatial correlation in the dataset, the Moran's Index was calculated and represented in the scatterplot and in the corresponding LISA cluster map in Figure 5. Results showed that most of the observations fall in the II and IV quadrants, suggesting the presence of a positive spatial autocorrelation between LP values and their spatially lagged values (Moran's I = 0.688). The Local Indicator of Spatial Association (LISA) was calculated to explore the significance of the spatial clusters and to produce the related map. The LISA cluster map confirmed the presence of clusters concentrating the highest and the lowest LP values, whereas the significance calculation (99 permutations) on the basis of Monte Carlo statistics confirmed the significance of the clusters, with a p-value between 0.001 and 0.05.
A first cluster called "high-high," collecting 170 data points with a positive autocorrelation of high values, was identified in the historical city centre and in the nearby part of Turin's hillside, which are prestigious and affluent areas. A second "low-low" cluster, collecting 216 data points with a positive autocorrelation of lower values, was identified in the northern and southern outskirts of the city, where a large share of the population has a low income and the building quality is low (Figure 5b).
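For readers unfamiliar with the statistic, the global Moran's I used above can be sketched in a few lines. This is a minimal pure-Python illustration of the standard formula, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, and not the GeoDa implementation; the function names, the toy values and the binary chain-neighbour weights are assumptions made for the example.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in dev))


def spatial_lag(values, weights):
    """Row-standardized spatial lag: average value of each unit's neighbours."""
    lags = []
    for row in weights:
        total = sum(row)
        lags.append(sum(w * v for w, v in zip(row, values)) / total)
    return lags


# Toy example: four units on a line, each neighbouring the adjacent ones.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [10, 9, 2, 1]     # similar values sit next to each other
alternating = [10, 1, 10, 1]  # high and low values alternate

i_clustered = morans_i(clustered, chain)
i_alternating = morans_i(alternating, chain)
lag_clustered = spatial_lag(clustered, chain)
```

In the toy data the clustered series gives a positive index and the alternating series a negative one, which mirrors how a value such as 0.688 signals that similar listing prices tend to be located near each other.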

Hedonic Regression Models
A traditional logarithmic hedonic model was applied on the complete sample (2092 existing housing units listed on the market in the 2015-2018 time period). The unitary listing price (LP), measured in Euro/m², was assumed as the dependent variable. On the basis of the normality test of the LP variable, the logarithmic transformation was preferred: thus, the dependent variable of the OLS model was LogLP. The whole set of explanatory variables was considered to assess the influence of the variables on price variation. All Lagrange Multiplier tests (both simple and robust) on the OLS application were significant and showed the presence of spatial dependence (Moran's I = 0.346); thus the OLS estimates could be biased.
Furthermore, results showed that, among the spatial regression models, the most suitable one was the Spatial Error Model (SEM), since its AIC and log-likelihood values are respectively lower and higher than in the Spatial Lag Model (SLM). The Breusch-Pagan test on the spatial effects, calculated for testing the homoscedasticity hypothesis, showed that the null hypothesis was not rejected (Table 3).
Results showed that the SEM can explain 75% of the price variation (R squared = 0.754) and that regression residuals were not clustered. This result is rather good considering that the aim of this study was not to predict property values, but just to study the influence of a set of characteristics on property prices. The significant variables with the highest marginal prices were the presence of an elevator (LFT (1) = 0.111), the renovated condition of the unit (MTL (4) = 0.181) and the category of the building, namely prestigious buildings (BLC (5) = 0.250). The coefficient on the spatially correlated errors (LAMBDA = 0.797), which accounts for the spatial dependence in the model, was also significant.
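For reference, the Spatial Error Model and the selection criterion mentioned above can be written in their standard textbook form. The notation is an assumption introduced here for illustration and is not taken from the paper: X is the matrix of housing attributes, W the spatial weights matrix, lambda the coefficient on the spatially correlated errors (reported as LAMBDA), k the number of estimated parameters and L-hat the maximized likelihood.

```latex
\begin{aligned}
\log(LP) &= X\beta + u, \\
u &= \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^{2} I_n), \\
\mathrm{AIC} &= 2k - 2\ln \hat{L}.
\end{aligned}
```

Under this formulation, a lower AIC and a higher log-likelihood both point to the better-fitting specification, which is the comparison used to prefer the SEM over the SLM.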
On the other hand, with regard to the characteristics of the housing units, a clear, even if small, influence of the EPC label and of the presence of an air conditioning system on housing prices emerged. In particular, low EPC labels from G to E (EPC (1) = −0.027) significantly and negatively affected housing prices, while high EPC labels from B to A4 (EPC (3) = 0.062) had a positive influence on them, and the presence of air conditioning had a small positive marginal coefficient (ARC = 0.045). These results are in line with the recent literature related to the perception of the EPC label in the real estate market, which is an ongoing research field [74]. For example, Chen and Marmolejo-Duarte highlighted that in Barcelona an energy performance improvement from label G to label A brought an 8.6% increase in housing prices in 2014, and an improvement from label G to label B brought a 10.6% increase in 2016 [75]. Moreover, Dell'Anna et al. estimated a price increase of 6.33% for each EPC rating level from G to A in Turin for the 2014-2018 time period [11].
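Because the dependent variable is logarithmic, the coefficients above can be read as approximate percentage effects on the unitary listing price. The sketch below converts them with the usual exp(beta) - 1 transformation for dummy variables. It assumes that LogLP is a natural logarithm and that each coefficient is measured against an omitted reference level, neither of which is stated explicitly in the text; the function and dictionary names are illustrative.

```python
import math


def pct_effect(beta: float) -> float:
    """Percentage change in price implied by a dummy coefficient in a log-price model."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients as reported for the SEM on the whole sample.
coefficients = {
    "EPC (1), low labels": -0.027,
    "EPC (3), high labels": 0.062,
    "ARC, air conditioning": 0.045,
}

effects = {name: pct_effect(beta) for name, beta in coefficients.items()}

for name, effect in effects.items():
    print(f"{name}: {effect:+.2f}%")
```

Under these assumptions the three coefficients correspond to roughly -2.7%, +6.4% and +4.6% respectively, which puts them on a scale comparable to the percentage premiums quoted from the literature.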
By comparing the results of the current analysis with two previous studies based on the same study area (Turin) and the same source of data (TREMO) [12,13], and with other recent literature [54,75,76], it is possible to confirm that the attention paid to the EPC label has risen slightly in recent years and is starting to be monetized by the real estate market, not only for the lower labels (E, F, G) but also for the higher ones (A-A4, B).
This means that the international energy policies and regulations transposed into national laws and legal constraints (the EPC has been mandatory for all transaction contracts since 2012, and minimum energy requirements have been mandatory for all types of building interventions in Turin since 2007) applied to the residential building stock in the city of Turin are finally starting to be recognized by the real estate market.

Spatial Clustering: "Northern Zone" and "Central Zone" Sub-Samples
A further step was carried out by applying the SEM on both the CZ and the NZ sub-samples, which were previously defined. The CZ sub-sample included 886 observations, while the NZ sub-sample included 660 observations. Both spatial regressions assumed the logarithm of the unitary listing price (LogLP) as the dependent variable, while, on the basis of the significance of the independent variables (Stepwise selection), different intrinsic and extrinsic characteristics were included in the two final models (Table 4).
By comparing the results of the two models, the following variables were significant both in the NZ and in the CZ sub-sample: the presence of Terrace and Lift, the Maintenance Level, the Building Type and the 1946-1975 Construction Time Period (CTP (2)). Furthermore, the presence of Custodian Service and Air Conditioning variables were significant only in the NZ sub-sample, while the Allocation Level, the Number of Bedrooms and Bathrooms, the Number of Views, and the classification of the unit as a Penthouse were significant in the CZ sub-sample. Although the Building Category represents one of the most significant variables, there are also some differences in its significance in the two sub-samples: in the NZ sub-sample almost all levels were significant, while in the CZ only higher levels (Noble (4) and Prestigious (5)) were significant. The analysis of the EPC label highlighted that the high EPC labels (3) were significant only in the CZ sub-sample, while low EPC labels (1) had a negative influence on the price formation process only in the NZ sub-sample. In conclusion, the results confirmed that different spatial clusters behave as different sub-markets: they both include different listing prices and building categories and are able to differently monetize the intrinsic characteristics of the buildings.
Firstly, there is a clear connection between the impact of the EPC label on housing prices and the quality of the buildings and apartments. Namely, dwellings with a high energy label (EPC = A), which slightly impacts housing prices, are more expensive and boast the best architectural attributes. Moreover, dwellings with high EPC labels are located in buildings that are more than 60 years old and have probably been fully refurbished: they are mainly historical buildings listed by the Superintendence of Archaeology, Fine Arts and Landscape, and they are regularly maintained by their owners, who often belong to the wealthiest part of the population.
Secondly, some attributes behaved in totally different ways in the two sub-samples. The air conditioning system is emblematic: it is not monetized by the city centre sub-market because it is a common feature there, guaranteeing high internal comfort in almost all residential units (mainly classy or prestigious). On the other hand, air conditioning is a stronger discriminating factor for buying decisions in the northern suburbs, where the housing quality is lower and where the "housing added services" seem to compensate for it by acquiring the greatest importance.
The other dimensional and physical features seem related to the micro-context of the dwelling. In fact, in the city centre the age and density of the buildings lead to non-ordinary apartment plans, with no balconies or with a single principal view. Furthermore, the building density means that classy buildings stand very close to one another, and medium or low floor levels have little natural light and face the opposite buildings. Thus, the presence of a terrace (in the penthouse) or of more than one view are appreciated features, as emerged in previous studies [77].
These last considerations also seem to confirm the results reported in the most recent literature (previously mentioned in Section 2), which reveal an uneven importance of building features, green features included, depending on the building energy performance and, jointly, on the reference sub-market. The results therefore support two arguments. On the one hand, some attributes are able to capture a part of the EPC and/or energy performance effect, in terms of marginal prices, for reasons that should be further explored (for example, by analysing buyers' awareness of the potentialities of the EPC). On the other hand, there may be "latent" variables able to act as a proxy of the building/dwelling energy rating, given that knowledge of the EPC potentialities could certainly be reinforced.

EPC Clustering: Low Energy-Efficient Dwellings Sub-Sample
A further step of the analysis was conducted by considering the LEE sub-sample, consisting of 234 observations related to housing units with low EPC labels (E, F and G), built in the 1946-1990 time period. The identification of this non-spatial sub-sample, based on an attribute selection, moved from the results of the previous spatial regressions, which highlighted the low EPC labels (EPC = 1) and the buildings built between 1946 and 1990 (CTP = 2) as two variables that significantly and negatively influence listing price formation.
A spatial regression (SEM) was also performed on this sub-sample, assuming LogLP as the dependent variable and different intrinsic features as independent variables. The results (final model with only significant variables) are illustrated in Table 5. Among the numerous variables and their related levels, the final SEM can explain 72% (R squared = 0.718) of the price formation process. The most influential variables were the spatial LAMBDA indicator (LAMBDA = 0.739), the building category, from the medium level (BLC (3) = 0.128) to the prestigious one (BLC (5) = 0.387), the high level of maintenance (MTL (4) = 0.189), and the presence of a terrace, lift, car box and custodian service.
In conclusion, in this case the spatially correlated errors indicator (expressed by the LAMBDA variable) and the "building quality" (expressed by the prestigious building category) are able to explain the greatest part of the price variability. It is interesting to notice that this sub-sample of energy inefficient housing also includes refurbished dwellings with a good level of maintenance and a prestigious building category, confirming that energy efficiency and the techno-physical features related to the energy behaviour of a building, in the case of existing and historical buildings, are often unrelated to the architectural quality of the building. Moreover, also for this sub-sample of existing assets, the "housing added services," such as the presence of an appurtenant car box, a lift and a custodian service, are attributes appreciated by the market.
Finally, it is worth highlighting the presence of two variables that are probably more closely related to present climate change: the presence of a terrace/balcony and the presence of air conditioning. Since the average temperatures increased significantly in Turin from +8.07 °C (in 1963) to +10.69 °C (in 2018) [78], devices (passive and active) related to home cooling are increasingly appreciated by the market.

Conclusions
This paper aimed to explore the pricing processes of the real estate residential properties, considering the influence of the EPC labels and relevant building/dwelling attributes on prices in their spatial context. The study was developed by analysing a sample of residential listing prices in Turin. Exploratory Spatial Data Analyses (ESDA) and two regression models (the Ordinary Least Squares model and the Spatial Error Model) were firstly applied on the whole data sample.
The results highlighted that low EPC labels (E, F and G) significantly and negatively affect housing prices, while high EPC labels (B, A1, A2, A3 and A4) have a lower but positive influence on them. Moreover, some intrinsic building/dwelling features emerged as characteristics particularly able to influence the property price formation: the building category and the housing unit maintenance level.
Secondly, the regression models were applied on three different sub-samples: two of them were generated by a spatial approach, based on LISA cluster maps of LP, while the third was identified by applying an attribute (not spatial) approach based on intrinsic features, representative of the most energy inefficient housing units. The results showed that high EPC labels are significant only in the "high-high" spatial cluster, characterized by a positive autocorrelation of high LP values, while low EPC labels have a negative influence on the price formation process only in the "low-low" spatial cluster, characterized by a positive autocorrelation of low LP values. By comparing the influence of the building/dwelling attributes on property prices in these two spatial sub-samples, some variables were significant in both, but several differences emerged, which means that different spatial clusters behave as different sub-markets.
The results achieved in the third sub-sample, which included low energy efficient dwellings, on the one hand confirmed the great influence of the prestigious "building category" and of high levels of maintenance, and on the other highlighted the relevance of other variables, such as the presence of a terrace, lift, car box, custodian service and air-conditioning system. Therefore, the results suggest that the EPC labels are acquiring explanatory power on property prices, in line with some international studies. Specifically, by comparing the results with two previous studies conducted in the city of Turin in two different periods [12,13], a growing effect of EPCs on prices can be seen.
Furthermore, some building features related to the energy performance emerged which deserve to be explored in depth in the future, being connected to two relevant aspects: on one side, the capacity of certain features, including the "green" ones, to capture a part of the EPC's explanatory power; on the other side, the possible presence of "latent" variables acting as a proxy of the building/dwelling energy rating.
Operatively, the methodological approach illustrated in this work can be considered a support for building/dwelling retrofitting policies, considering the potentialities of technological interventions in terms of dynamics in sub-markets and prices. In a period of deep changes in the real estate market, this study demonstrates that spatial analyses, jointly with hedonic pricing models, represent important approaches to study the real estate sub-markets and explore their different pricing processes. In fact, spatial regression models make it possible to manage the spatial autocorrelation and to study the influence of latent variables on pricing processes, which represent important factors able to change all spatial hierarchies.
Limitations: In the Italian context property transaction prices are not public information. Thus, researchers, real estate companies and public administrations commonly study and analyse listing prices to perform market analyses and to estimate property values. Even if this represents a key limitation of this study, it is worth mentioning that previous studies demonstrated that listing prices can be considered a proxy for transaction prices [79] and that they can influence the selling processes and price prediction [68,69]. Moreover, the data sample consists of a set of property listings containing detailed information that in some cases is not complete: this is the case of the building construction time period and also of the EPC label.

Data Availability Statement: TREMO data (housing listing prices and related characteristics), used to support the findings of this study, have not been made available because of third-party rights.