Next Article in Journal
LiDAR SLAM for Safety Inspection Robots in Large Scale Public Building Construction Sites
Previous Article in Journal
Study on the Energy Distribution and Dissipation of High-Rise Structures Under Long-Period Ground Motions
Previous Article in Special Issue
Model Predictive Control for Smart Buildings: Applications and Innovations in Energy Management
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Decarbonizing the Building Sector: The Integrated Role of Environmental, Social, and Governance Indicators

by
Nicola Magaletti
1,
Valeria Notarnicola
1,
Mauro Di Molfetta
1 and
Angelo Leogrande
2,*
1
LUM Enterprise s.r.l., 70010 Casamassima, Italy
2
Dipartimento di Management, Finanza e Tecnologia (MFT), LUM University Giuseppe Degennaro, 70010 Casamassima, Italy
*
Author to whom correspondence should be addressed.
Buildings 2025, 15(19), 3601; https://doi.org/10.3390/buildings15193601
Submission received: 3 July 2025 / Revised: 18 September 2025 / Accepted: 24 September 2025 / Published: 7 October 2025

Abstract

Climate change mitigation for the built environment has become a subject of greatest urgency, as buildings account for nearly 40% of total energy consumption and nearly one-third of total CO2 emissions. While environmental, social, and governance (ESG) indicators are increasingly used to monitor sustainability performance, their collective role in impacting building-related emissions is yet largely under-investigated. The current research closes that gap through an examination of the ESG dimension–CO2 emissions intersection of 180 nations from 2000 to 2022, in the hope of illuminating how environmental, social, and governance elements interact to facilitate decarbonization. The research is guided by a multi-method design, including econometric examination, cluster modeling, and machine learning techniques, which provide causal evidence and predictive analysis, respectively. The findings reveal that the deployment of renewable energy significantly reduces emissions, while per capita energy use and PM2.5 air pollution exacerbate this effect. The social indicators show mixed results: learning, women’s parliamentary representation, and women’s workforce representation reduce emissions, while food production and growth among the lowest-income individuals demonstrate higher emissions. Governance demonstrates mixed results as well, with good regulation reducing emissions under specific conditions yet primarily supporting high-income countries with superior infrastructure. The examination of clusters reveals that ESG-balanced performance is retained by countries in the low-emission clusters, whereas decentralized ESG pillars are associated with higher emissions. Machine learning confirms the existence of non-linear effects and identifies PM2.5 exposure and renewable energy deployment as the strongest predictors of the relationship. In summary, the findings suggest that successful policies for decarbonizing the built environment are constructed upon the consistency of environmental, social, and governance plans, rather than single steps.

1. Introduction

Climate change is a defining issue of the 21st century. The building sector produces nearly 40% of global energy demand and a third of CO2 emissions [1]. ESG indicators are now recognized as standards of sustainability, yet their macro-scale policy application is limited [2]. Except for investment screens, ESG proxies institutional strength, innovation, and environmental performance. Healthcare and regulation reduce carbon intensity in industrialized sectors [3]. This study examines the connections between governance, education, healthcare, scientific productivity, and construction-sector emissions, utilizing IPCC data, econometrics, and machine learning [4]. Clustering and variable-importance mapping reveal cross-country differences and non-linear institution-emissions relationships. Research gaps are apparent: institutional and socio-ecological determinants are under-investigated, while technology diffusion and urban energy receive the most attention [5]. Findings indicate that effective governance attracts investors and that scientific productivity reduces emissions through innovation and policy. Paradoxically, greater R&D and governance accompany increased emissions, revealing a relationship between infrastructure growth and emissions [6]. Overall, ESG evidence reveals complex environmental performance determinants and policies that balance development and sustainability.
The article continues as follows: Section 2 reviews the current literature on how ESG features map onto emissions in buildings, identifying gaps in integrated studies. Section 3 presents a multi-method approach that combines econometrics, clustering, and machine learning across a global sample from 2000 to 2022. Section 4, Section 5 and Section 6 study environmental, social, and governance features in turn. They find unique but correlated effects on emissions. Section 7 synthesizes findings while highlighting ESG coherence. Section 8 and Section 9 present limitations and conclude on the relevance of ESG to future emission mitigation policy.
The article also contains several appendices dedicated to exploring metric dimensions and analyzing data characteristics: Appendix A provides a comprehensive description of the variables used in the ESG analysis, along with data sources and definitions. Appendix B presents the summary statistics for the ESG model’s Environmental (E) component. Appendix C releases the summary statistics, as well as the robustness checks, pertaining to the Social (S) dimension. Appendix D provides a summary of the results for the Governance (G) indicators, as well as their relationship with building-sector emissions. Appendix E presents diagnostic tests for autocorrelation and heteroscedasticity, ensuring the robustness of the econometric estimations. Appendix F describes the K-Nearest Neighbors (KNN) regression algorithm used, under the part dedicated to machine learning, the hyperparameter optimization process.

2. Literature Review

E-Environmental. Several works criticize the application of environmental certification tools at a symbolic level and a low level of linkage to actual environmental performance. Ref. [7], for instance, found that ESG tools applied within Florence’s residential policy did not cause noticeable decreases in gas or energy consumption. Ref. [1] are similarly critical of the BREEAM system’s application within the UK, as it places too much emphasis on environmental parameters while losing sight of equity and governance themes. In India, green certification is shown by [8] to generate higher rent levels, which are thus socioeconomically exclusionary. The separation between environmental performance and communication is further exacerbated by the minimal attention to facility-level practices in the post-construction phase. Whilst some works note positive practices, such as the use of prefabricated infrastructure for energy use and waste reduction [9], these benefits remain largely negated mainly due to mixed facility-level practices and maintenance [10,11]. Whilst ref. [10] successfully demonstrates that ESG-certified Class A office stock within the urban area of Madrid is a reality and firm energy savings occur, these are dependent upon repeated operating disciplines, such as metering and tenant engagement. This is reiterated in [12], who emphasize that successful facility practice is a necessity if sustainability practices are to be more than a symbolic application, especially within restrictive bureaucratic regimes. It has been posited that digital innovation has a solution. Ref. [13] speculates about a digital monitoring ESG system that applies to commercial stock in real estate, but its hypothetical nature, coupled with actual non-verification, limits its current application. Ref. [14] promote the applications of technology but without any linkages between technology applications and their resultant sustainability impacts. Ref. [15] concur with this disconnect and note that Malaysian ESG reporting still tends to be at a rather transformational level, rather than a reputation level, in its operational application. Other environmental concerns highlight researchers who study wider energy and pollution infrastructures. Ref. [16] cooperate on materials and lifecycle emission traceability but highlight the basis of materials sustainability in successful use and procurement systems. Likewise, ref. [17] outline ESG as a “change engine” of construction but highlight common operating processes between procurement and waste treatment. At the same time, these works suggest that environmental performance within ESG thinking depends not only on innovation or design but also on process practices designed in and repeated, energy source quality, and ongoing administrative resolve.
S-Social. The social ESG of the built environment is conceptually applied but lacks theoretical depth at operational levels. The social ESG literature focuses primarily on design intention, occupant wellness, and occupant health but underscores the long-term socially desirable aspects that buildings do. Ref. [18] cite air and light performance research and studies on occupant satisfaction but remain constrained within first-order design, missing tenant use and after-occupancy engagement—the sustainability drivers where they have the most impact in the long term. Equity concerns apply equally to criticism claims. Ref. [8] conclude that India’s green-awarded buildings rent at higher grade levels and are consequently exclusionary to poorer rent grades, thus exacerbating urban poverty. By analogy, ref. [19] cites a dilemma where ESG uptake still largely accrues within cash-based revenues, but environmental resilience is seldom accompanied by investment in the social or governance quadrants. Ref. [20] cite environmental and societal forces within real assets but provide no further explanation at the operational level beyond facilities. Retrofit and inclusion sustainability reports within comparisons beyond their own region. Ref. [21] suggest that ESG retrofit within South African facilities might be initiated more often due to energy blackouts (e.g., load shedding) rather than sustainability targets and will consequently not be maintained in the long term. Ref. [22] further suggest that larger practicum and service-provider architecture and engineering departments largely direct provision primarily within the SDGs during preparation intervals but overlook the use of ESG findings some years later. Ref. [23] sketch further ESG research mapping within commercial real property but concede that facility-level completion is poorly expounded. Social ESG implementation deterrent concerns often provide structural barriers. Ref. [24] concludes that tomorrow’s knowledge, ignorance, and operational incompetence remain the main deterrents to green uptake within Kenya and calls for firm management reinforcement and institutional reinforcement. These deterrents predict ESG policy change extending beyond disclosure and design but broader, co-participative engagement and governance. Even if research works such as ref. [25] assume user behavior modelers and facilities managers to be promoters of ESG outcomes, only frail empirical evidence can be provided, and experimental confirmation needs to be carried out. G-Governance. Governance is ESG’s third pillar and key to long-term sustainability within the built environment but has been revealed to have an inadequate operational governance mechanism. Regulatory fragmentation and inability to maintain follow-up on constructed buildings are common attributes of certain studies.
G-Governance. Systemic ESG issues, such as inefficient regulatory coordination and regulatory gaps, are highlighted in [26]. Ref. [27] clearly suggest that operational data requirements remain a consideration; however, minimal consideration continues to be given to the long-term ESG performance effects emanating from facility-level operations. Physical elements of design often drive ESG implementation, but ref. [28] finds that it is further influenced by administrative rituals, such as communication between relevant parties and energy use tracking, thereby encapsulating building management within governance effectiveness. Contract provisions that interface with ESG obligations at inception, between design and everyday use, are featured in [29]. This continues to be carried forward by [30], who suggest that Chinese agendas incorporate ESG into property agendas, but making such incorporation a reality in everyday operational use remains elusive. But policy and financial arrangements fail in governance. Investment programs, such as sustainable investments, offer positive financial returns but do not necessarily raise doubts about whether they lead to reform at a facility level [31]. Likewise, ref. [32] note that, while tenants account for ESG certificates within rent premiums, administrators of buildings seldom receive assessments regarding their compliance with ESG standards in the long term, resulting in a gap between market incentives and governance accountability. A green certificate often fails to institutionalize governance in most instances. Ref. [33] note that standardized certificate requirements mostly overlook the operational dimension—the dimension where attention is focused on documentation but not on use. Only in a similar manner do ref. [34] identify gaps between taxonomy at an EU level and schemes such as BREEAM-SE, inducing distress in building managers in making long-term compliance plans. Refs. [35,36] go further to identify how reform efforts within facilities’ sectors primarily call upon ESG but without designing operationally successful observance systems. Furthermore, managers’ roles within buildings and facilities’ governance are repeatedly mentioned in some studies. Ref. [37] identify an increase in property uptake but overlook the facilities’ use of smart technology, while ref. [12] suggest that post-construction monitoring often falls victim to administrative inertia. Ref. [11] again sarcastically identifies ESG compliance and repeatable operating commitments as supporting governance’s role at a building level. If there are no enforcing processes in place, then the ESG will be susceptible to becoming a label rather than a change-based governing process.
Governance, ESG’s third pillar, is central to making the constructed environment sustainable in the long term; however, operating governance mechanisms are surprisingly absent in the literature. Some of the literature refers to regulatory gaps and disjointed follow-up after construction. Ref. [26] present evidence of structural ESG challenges, such as non-coordination and regulatory gaps. Ref. [27] refer to operational data requirements that are increasingly accepted; however, facility-level, everyday processes are essentially left out of consideration for their contribution to long-term ESG performance. Ref. [28] reveals that ESG is not only administratively applied to the physical side of processes but also in processes such as tracking edifice management and energy, and this is at the core of governance performance. Ref. [29] suggest that contractual clauses bring ESG requirements into conception and link design and everyday implementation. This is also reported by [30], who note that ESG adoption is becoming integrated but does not yet seem to be translating into regular operational practice. However, governance also raises questions about policy and the limits of finances. Ref. [31] reveals evidence that sustainable investment funds yield better monetary returns, but does not raise questions about whether these investment increments have a tangible impact on governance at the executive level. Likewise, ref. [32] report that tenants pay premiums for ESG certification, and facade managers work mainly but are seldom measured, on compliance with ESG requirements; an increasingly significant gap is emerging between market incentives and accountability in governance performance. Green certification does not institutionalize governance. Ref. [33] conclude that certified requirements tend to exclude the operational dimension—the theory of documentation, instead. Likewise, ref. [34] identify inconsistencies between the EU taxonomy and local sets, such as BREEAM-SE, which makes it challenging for edifice managers to devise long-term compliance strategies. Refs. [35,36] further outline, based on construction reform, often relying on ESG but not converting it into operationally reflective mechanisms of oversight. Lastly, facilities governance and the positions of building managers receive attention in several remarks. Ref. [37] refer to innovative technological development at a property level but not facility managers’ application of such technology. Ref. [12] note that post-construction monitoring is often hindered by administrative routine inertia. Ref. [11] personally equates ESG compliance with reproducible operational necessities and reiterates the governance function at a building level. Without enforceable, evolutionary standards, ESG risks amount to a label but not a transformational governance role (Table 1).

3. Modeling Building-Related Emissions Through ESG Dimensions: A Multi-Method Analysis Using Econometrics, Clustering, and Machine Learning

The built environment and climate change cannot be separated. The building sector accounts for approximately one-third of global CO2 emissions [1]. The current research examines building-sector CO2 emissions against the ESG prism. Multi-method research combines econometrics, clusterization, and machine learning. The panel econometrics comprises 180 nations (2000–2021). Fixed- and random-effect regressions are linked between building emissions and environmental, social, and governance indicators [45,46]. National typologies are highlighted by cluster analysis. Normalized ESG–emissions data cluster states into groupings with similar structures or fragmented profiles [47,48]. Supervised machine learning also preserves non-linear relationships. Random Forest, SVR, k-NN, and Boosting predict and rank ESG attributes by importance [49,50]. Results converge. Balanced ESG pillars are associated with lower emissions. Broken governance or weak social equity are associated with higher emissions [51,52]. Results show interdependence. Environmental action becomes effective only if governance and social equity co-evolve. Disjointed ESG pillars are ineffective for lowering emissions. This triangulative approach clears up causal, predictive, and structural dynamics. The strong triangulation of econometrics, cluster analysis, and machine learning enhances validity [53,54,55]. Policy prescriptions are immediate: decarbonization of buildings is achievable only with governance quality, social inclusiveness, and environmental enforcement simultaneously. Fragmented ESG strategies are unsuccessful. See Figure 1. A complete description of the variables used to estimate the ESG model is provided in Appendix A.

4. Decoding Building Emissions: Environmental Drivers in an ESG Framework (2000–2022)

Estimating the environmental determinants of building-related carbon dioxide emissions (BCE) holds the key to synchronizing the policies of decarbonization with ESG (environmental, social, and governance) targets. This section introduces the first phase of the empirical work on analyzing the effect of environmental indicators on BCE during the years 2000–2022 in 180 countries. With the help of fixed and random effects panel regressions, the study evaluates the effect of prominent energy and pollution-based indicators—access to clean fuels (CFTC), electricity access (ELEC), per capita energy usage (ENUC), particulate matter pollution (PM2.5), and renewable energy contribution (RENC)—on building-related emissions. The outcomes serve as a quantitative underpinning for the interpretation of the environment’s sustainability through the lens of ESG and present direct and indirect channels through which energy systems impact outcomes on emissions. This model forms a crucial diagnostic tool through which one can assess the environment’s performance as well as the structural dynamics shaping carbon intensity in the built environment on an evidence-based trajectory. The metric characteristics of the variables used to estimate the ECB component with respect to the E-Environmental variables within the ESG model are summarized in Appendix B. Diagnostic tests for autocorrelation and heteroscedasticity for the econometric estimates are provided in Appendix E.

4.1. Modeling the Environmental Determinants of Building Emissions: A Global ESG-Informed Econometric Analysis

To quantify the environmental determinants of building-related carbon emissions (BCE), this section presents a panel regression model estimating the effects of key energy-related indicators. Drawing on data from 180 countries over the period 2000–2022, the model incorporates variables aligned with the “E” pillar of the ESG framework, including clean fuel access, electricity availability, per capita energy use, PM2.5 exposure, and renewable energy consumption. The analysis employs both fixed and random effects estimators, with results providing robust statistical insights into how national energy infrastructure and environmental conditions shape BCE outcomes globally.
Specifically we have estimated the following equation:
B C E i t = α 1 + β 1 C F T C i t + β 2 E L E C i t + β 3 E N U C i t + β 4 P M 25 i t + β 5 R E N C i t
where i = 180 and t = 2000–2022. The results are shown in the following Table 2.
Building-sector CO2 emissions (BCE) are a key environmental (E) metric within the ESG framework, making use of 180 countries’ (2000–2022) panel data. The approach relates BCE to accessibility, quality, and composition of energy, including cleaner fuel transition (CFTC), electricity accessibility (ELEC), per capita energy use (ENUC), exposure to PM2.5, and use of renewable energy (RENC). The results establish that CFTC stimulates social performance while increasing BCE unless fueled by low-carbon energy [48,54]. On the contrary, a transition to electricity reduces the rate if electricity is generated from cleaner sources, thereby maintaining ESG co-benefits for universal electricity accessibility [56]. Increasing ENUC matches with rising emissions, making efficiency policies relevant for high-consumption nations. From a governance (G) perspective, proper planning, conservation incentives, and regulating energy-intensive tech are central [57]. PM2.5 exhibits a high and desirable correlation with BCE, supporting the convergence between environmental damage and public health hazards and integrating the E–S pillars within ESG. On the contrary, the use of renewable energy (RENC) shares a high negative correlation with emissions, validating green energy as a prime avenue for a resilient future. The fixed-effect formulation yields a high R2 value for the panel estimates, validating the use of BCE as a robust metric for ESG while accounting for country-wise dissimilarity. The results, in general, reveal that electrification and renewables lead to abatement of emissions, while energy efficiency is crucial for offsetting per capita demand growth [54,56]. The linkage between BCE and PM2.5 provides a compelling rationale for integrating policies on both air quality and climate that need to be globally instituted, especially for emerging economies with lower regulations [48]. The results position BCE at the core of ESG performance, where the building sector’s performance is dependent on accessibility, quality, efficiency, and governance [47,57].

4.2. Evaluating Clustering Strategies for Building Emissions: A Multi-Metric Comparison

To disclose building-related carbon emission (BCE) global patterns and underlying country groupings that are structurally regular, we employ clustering techniques. Six methods are systematically compared across ten standardized performance measures, which vary from statistical fit and cohesion to quality of separation. The comparative benchmarking identifies the best possible segmentation technique for emission profiles, ensuring appropriate classification across countries by BCE structure. Moving beyond the selection of techniques, the application of clustering supports the interpretational strength of diagnostics that are ESG-based through correlation with environmental indicators. The approach possesses both technical sophistication and analytical coherence and constitutes a tool that scales for comparing emission dynamics across diverse country settings (Figure 2).
Six cluster procedure performances were compared against ten standardized indices (R2, AIC, BIC, silhouette, Dunn, maximum diameter, minimum separation, entropy, and Calinski–Harabasz), each on a 0–1 scale. The density-based algorithm performed exceptionally well, achieving a mean score of 0.599, surpassing the optimal AIC and BIC (1.0) and achieving perfect separation indices (1.0), which is typical for well-defined and tightly clustered data [58]. Its silhouette value was high (0.8), beaten only by hierarchical (1.0), and satisfactory performance on maximum diameter (0.696) and Pearson’s γ (0.489). On R2 and entropy, they performed poorly, although these indices are better suited for non-linear cluster shapes and are more suitable for DBSCAN [59]. Hierarchical was best in terms of R2, silhouette, and γ while performing poorly in terms of separation, suggesting potential overfitting. The neighborhood- and model-based techniques showed one-dimensional strengths and mixed performance. The density-based approach showed the best-rounded performance and handles noisy or cluttered data the best.

Evaluating Cluster Quality and Structure in ESG-Driven Density-Based Analysis

This section reports on results from density-based environmental and energy indicator clustering relevant to ESG (environmental, social, governance). The leading indicators—clean fuel access (CFTC), electricity access (ELEC), per capita energy use (ENUC), PM2.5 exposure, and renewable energy use (RENC)—produced several sharply differentiated clusters and one noise group. The clusters varied in size, shape, and separation, as indicated by the within-cluster sum of squares (WSS), explained heterogeneity, and silhouette statistics. The latter two statistics corroborate both statistical stability and interpretational soundness of the solution. The results link country energy infrastructures and environmental configurations with building-related CO2 emissions (BCE) and reveal clustering as a gauge for benchmarking and ESG-centered sustainability reporting (Table 3).
The DB model identified one noise point and five distinct clusters. The cluster quality was determined based on size, explained variance, WSS, and silhouette measure. The noise point, a typical DBSCAN characteristic [60], exhibited a WSS and silhouette value of 0, indicating evident separation from the cluster. Cluster 1, with 949 data points, exhibited high explained variance (0.999) but low isolation (silhouette = 0.347) and was therefore observed to create an overlap boundary artifact [61]. Clusters 2–5 were much smaller, each having between 6 and 20 data points. Cluster 4 was an outlier (WSS = 0.051; silhouette = 0.985), and Clusters 3 (silhouette = 0.968), 5 (0.820), and 2 (0.632) also passed the 0.5 interpretability cutoff [62]. The entire cluster display showed primarily spherical, compact shapes with acceptable isolation, although resolution within the noise was low. Cluster 1 showed lower discrimination compared to the small clusters, which exhibited sharper and clearer groupings (Figure 3).
Examining the environmental (E) pillar within ESG, the model correlates building-related CO2 emissions (BCE) with a set of five indicators: clean cooking fuels (CFTCs), electricity access (ELEC), per capita energy use (ENUC), PM2.5 exposure, and renewable energy share (RENC). The strategy is consistent with integrated sustainability indices [63,64]. The nations fall within six groups. Cluster 0 represents energy poverty, characterized by low electricity, high PM2.5 levels, low renewable energy sources, and a moderate BCE. Cluster 1 has balanced and low BCE regimes. Cluster 2 is more closely associated with high pollution and low renewable energy sources and thus receives a moderate BCE from fossil-heavily dominated urban–industrial systems. Cluster 3 achieves high electricity and renewables, as well as moderate BCE, due to efficiency gains. Cluster 4 has very low BCE and PM2.5 levels, which is likely due to its nuclear or hydrocarbon dependence. Cluster 5 balances clean energy and low pollution yet maintains a moderate BCE, characteristic of efficient, high-income fossil fuel economies. Overall, each cluster plots distinct energy–environment portfolios, testifying to how integrating indicators strengthens the ESG diagnostics (Figure 4).

4.3. Explaining Carbon Emissions in the Built Environment: A Comparative Machine Learning Approach

This section presents a rigorous comparison of multiple machine learning regression models—ranging from ensemble methods to neural networks and proximity-based learners—evaluated through standardized performance metrics. By identifying the most reliable and accurate algorithm for predicting building-sector emissions, the analysis sheds light on model suitability and interpretability, with a particular focus on the role of local energy consumption, pollution exposure, and access variables in shaping emissions profiles (Figure 5).
Model performance was evaluated using MSE, RMSE, MAE, MAPE, and R2, with the metrics normalized and inverted for ease of comparison. Composite scores showed that k-Nearest Neighbors (k-NNs) strongly dominated the rest, with error rates being low and an R2 level of 1.0, consistent with a perfect explanation of variance. This is the fit between k-NN’s similarity principle and patterned data structures, as well as its adaptiveness with weak functional assumptions [65]. Despite the risk of overfitting driven by a perfect score, k-NN performed strongly across various error measures. Compared to that, Boosting yielded similar results with higher errors and lower R2 [66]. Decision Trees overfitted (low MSE but R2 = 0), while linear, regularization, and neural network models lacked complexity. Random Forest was stable without localized precision. Overall, k-NN’s high level of accuracy, along with its parsimonious and flexible character, highlights its strength for locally regularized data in empirically tested, simple models, consistent with prior results in software engineering estimation [67]. See Figure 6.
By applying k-Nearest Neighbors (k-NNs) with dropout loss (RMSE on 50 permutations), we determined significant environmental associates for building-related CO2 emissions (BCE) in the ESG “Environment” pillar. The power consumption of fossil fuel use, combined with simultaneous air conditioning and heating, was the top contributor to air pollution (PM2.5) (90.969), highlighting the significant impact of fossil fuel use on power, heating, and air conditioning consumption, as well as the climate–health hazard associated with decarbonization policy [62]. Consumer energy consumption of renewable energy (RENC) came a close second (76.284), indicating a reduction in carbon intensity and facilitating policy-mandated decarbonization [65]. Clean fuel for cooking (CFTC) came in a distant third (68.672), associated with biomass dependence and the infrastructural vulnerability of low-income countries [68]. Energy use per capita (ENUC) (63.747) delivered demand-side increases in emissions, while electricity penetration (ELEC) delivered the weakest effect (40.879) due to near-universal penetration. Overall, PM2.5 and RENC were chief initiators, followed by CFTC and ENUC, and followed by marginal ELEC. The implications are for renewable installation, clean fuel consumption, and air quality management as central BCE mitigation activities within ESG policy. See Figure 7.
Additive decompositions for k-NN predictions of building-related CO2 emissions (BCE) provide contributions for clean fuels (CFTC), electricity access (ELEC), energy use per capita (ENUC), PM2.5, and renewables (RENC) compared to a base level of 25.434. For Case 1 (BCE = 12.257), maximum reduction (−16.664) was brought on by low PM2.5, in line with evidence for air quality and emissions associations [62]; ENUC and RENC reduced, while CFTC and ELEC increased emissions by higher accessibility [65]. For Case 2 (BCE = 0.500), a reduction was observed for PM2.5 (−17.110) and ENUC/RENC, offset by CFTC and ELEC, which is typical for low-emitting systems. For Case 3 (BCE = 18.570), a notable reduction by ENUC (−13.891) was offset by a favorable effect by RENC (+9.796) and ELEC. For Cases 4–5 (BCE = 4.024), the high negative contributions for ELEC (−20.170; −18.398) indicate a preference for low-carbon generating alternatives, such as hydro. Generally, PM2.5 and RENC are repeated movers for BCE reduction, while accessibility-related indicators (CFTC, ELEC) increase or decrease emissions according to system efficiency and mix. The exploratory findings for these jobs in ML reveal clean energy and air quality as the key factors for ESG policy [68]. See Figure 8.

5. Equity, Participation, and Emissions: Social Determinants of Building-Sector CO2

This paper examines the linkage between building-sector CO2 emissions (BCE) and social development indicators within the context of ESG, with a focus on the oft-overlooked “Social” pillar. Using panel econometrics and machine learning, we quantify the impact of distributive income inequality, labor force participation, educational parity, food output, and female political empowerment on the level of emissions. The results demonstrate robust relationships across specifications: distributive income inequality and higher food production are linked with higher emissions, while schooling and female political representation are linked with reductions. The relationships reveal that social equity, economic inclusion, and institutional representation are stimulants to sustainable environmental performance and not just moral prescriptions. The integration of social considerations into ESG assessment enhances predictive strength and highlights how inclusive policy directly impacts long-term decarbonization trajectories. The metric characteristics of the variables used to estimate the ECB component with respect to the S-Social variables within the ESG model are summarised in Appendix C. Diagnostic tests for autocorrelation and heteroscedasticity for the econometric estimates are provided in Appendix E.

5.1. Social Dimensions of Carbon Emissions: A Panel Data Approach to the Building Sector

This subsection applies panel data econometrics to estimate the degree to which social considerations drive building-sector CO2 emissions (BCE). From a universe of 180 nations and over 1200 observations, drawn from a global dataset, the research tests associations between emissions and a group of five indicators: food production (FOOD), female parity in primary schooling (GPIE), income share for the bottom 20% (INC20), labor force participation (LABF), and women in legislatures (WPAR). Fixed- and random-effect regressions forecast the size and significance of impacts. Results show that country-level emission profiles are shaped by social equity, accessibility of schooling, and political representation, demonstrating the importance of social dimensions as a driver of ESG-related sustainability, rather than a side concern.
Specifically, we have estimated the following equation:
B C E i t = α + β 1 F O O D i t + β 2 G P I E i t + β 3 I N C 20 i t + β 4 L A B F i t + β 5 W P A R i t
where i = 180 and t = 2000–2020 (Table 4).
This paper positions building-sector CO2 emissions in the ESG matrix with a focus on the overlooked “Social” pillar. Using panel econometrics with fixed- and random-effect specifications, we test four social indicators—gender parity in primary schooling (GPIE), income share for the poorest 20% (INC20), labor force participation (LABF), and women in parliament (WPAR)—for their roles in signaling educational equity, income distribution, labor inclusiveness, and political representation. Results reveal robust associations. Female schooling (GPIE) is associated with lower pollution, confirming the value of schooling in sustainability [69]. Political representation for women (WPAR) also reduces pollution, corroborating evidence that female empowerment produces cleaner building codes, energy standards, and solar subsidies [70,71]. Increasing participation in the labor force (LABF) has a negative impact on residential energy use, suggesting that jobs lead to greater efficiency [72]. Increasing incomes for the poorest (INC20), however, drive up pollution, since new consumption creates more carbon-demanding use from illumination and domestic appliances [73]. Poverty reduction thus demands simultaneous investments in solar infrastructure and energy literacy. Generally, social equity, inclusiveness, and political representation are key drivers of the low-carbon transition, not adjuncts. Building on data for 180 countries over one generation, the study confirms the universality of these associations [74]. Low-carbon inclusivity demands the union of social justice and environmental defense.

5.2. Clustering Social Determinants of Emissions: An Evaluation of Algorithmic Performance

This research positions building-sector CO2 emissions within the ESG framework, aiming to address the often-overlooked “Social” pillar. With fixed- and random-effect panel econometrics, we study the effect on CO2 emissions from female parity in primary schooling (GPIE), income share for the poorest 20% (INC20), labor force participation (LABF), and women in parliament (WPAR) as proxies for educational equity, income distribution, labor inclusion, and political voice. Results show high associations. GPIE is significantly linked to negative CO2 impacts, highlighting the importance of female education for sustainability [69]. Likewise, a negative effect emerges for WPAR, supporting the results that female political leadership is associated with greener building codes, energy codes, and green and renewable subsidies [70,71]. A correlation emerges between residential energy consumption and LABF, indicating a link to job efficiency [72]. INC20 appears to be significantly linked with CO2 increases, as higher incomes for the poor result in a boost to fossil-intensive consumption [73]. The result supports the notion that the “Social” pillar is central, not peripheral, to ESG. Across over 180 countries spanning multiple generations, the research establishes the universality of these associations [74], thus supporting the notion that low-carbon transitions that are inclusive depend on aligning social equity and environmental protection. See Figure 9.
We experimented with six clusters—density-based, Fuzzy C-means, hierarchical, model-based, neighborhood-based, and Random Forest—with ten validity indices (R2, AIC, BIC, entropy, silhouette, maximum diameter, minimum separation, Pearson’s γ, Dunn, and Calinski–Harabasz). They measure fit, error minimization, and quality structure in high-dimensional ESG data. Model-based and Random Forest excelled in R2 (0.003), although their superiority over other techniques was slight, while density-based and neighborhood-based performed the worst (0.001). Random Forest performed best on AIC and BIC (0.016; 0.017), with model-based a close second (0.011; 0.012), indicating that explanatory strengths predominated over penalties for complexity [75]. Separation and silhouette measures were close to zero across techniques, with density-based being slightly higher (0.004), indicating a weak cluster definition. Dunn indices were unanimous zero, supporting reports that compaction in ESG clusterings was low [76]. Entropy measures varied on a small scale, with model-based (0.008) and Random Forest (0.007) best. Calinski–Harabasz was a flat 1.0, likely due to normalization. Overall, Random Forest was the best-balanced performer, excelling in AIC, BIC, and entropy, while tying with model-based on R2. Fuzzy C-means, hierarchical, and neighborhood-based were inconsistent performers, with near-zero separation and validity [77].

Decoding Emissions and Equity: A Density-Based Clustering Approach to ESG Social Metrics

The research employs a density-based cluster model to analyze an ESG-related environmental dataset in the context of sectoral CO2 emissions and certain societal development indicators. The cluster structure identifies four clusters and one outlier, covering both leading and niche configurations in the societal development and emissions process. The evaluation determines compactness and distinctiveness within the cluster, as well as within-cluster heterogeneity, using Silhouette indices and the respective sum of squares, along with their explanatory power. The model acknowledges an implicit consideration of the interrelation between environmental impact and societal dimension, thereby providing a more refined explanation for the diverse sustainability trajectories across nations. See Table 5.
The density-based strategy identified four large and one noise point group, based on the number of firms, captured heterogeneity, as accounted for by WSS, and silhouette statistics. The largest group (Cluster 1), comprising 1188 firms, exhibited 95.6% within-cluster heterogeneity; it had a high size-related WSS (2,288,542) and a moderate silhouette (0.618), characteristics typical of large heterogeneous groups [78]. The small lumps (Clusters 2–4), comprising 21, 15, and 21 firms, were better defined and higher on silhouettes (0.851, 0.878, 0.768) and low on WSS (13,632; 5502; 84,952). The overall model explained 67.9% of the variance (BSS = 5054.78, TSS = 7447.41), characteristic of strong unsupervised performance on non-convex shapes [78]. One noise point signals aptness for ESG data, where such points typically flag data errors or outlier performers [79,80]. The output produces one large, heterogeneous cluster and three small, tight groups, suggesting the possibility of extracting both pervasive shape and weak ESG-related information. See Figure 10.
The density-cluster model associates building-sector CO2 emissions (BCE) and social indicators—food production (FOOD), gender parity in primary enrolment (GPIE), income share of the poorest 20% (INC20), labor force participation (LABF), and women in parliament (WPAR)—for the “S” component of ESG. Cluster 0 has a low BCE (−0.354), high food production (FOOD = 4.285); however, it exhibits low equity within genders and income levels (≈ −0.1) and low labor force participation (LABF = −1.442). High rates of women in parliament (1.410) reflect middle-income countries where political representation has caught up with educational and labor advances [81]. Cluster 1 represents a moderately negative BCE (−0.117) with balanced rates across indicators, typical of transitional economies close to global means, served by a limited mix and efficient social infrastructure [82]. Cluster 2 includes high BCE (6.826) and weak social performance—negative GPIE (−0.129), INC20 (−0.236), and WPAR (−0.107), with just slightly favorable participation in the labor force (LABF = 0.548)—typifying industrial economies that are pollution-increasing and structurally inequitable and relying on fossil fuel [51]. See Figure 11.
Cluster 3 has a low BCE (−0.337) yet extremely heterogeneous social indicators, characterized by extremely low labor participation (−4.439; −2.845) and exceptionally high female parliamentary representation (2.214). This profile characterizes scenarios in which narrow labor market inclusion coexists with political equity advances and low emissions being a by-product more of poverty limits than on-regulation across emissions, consistent with evidence that poor economies are lower emitters yet development challenges [83,84]. Cluster 4 exhibits a near-neutral BCE (0.044) but high social performance variability, characterized by high GPIE (7.518), INC20 (7.328), and WPAR (4.165), which are accompanied by low LABF (−4.759) and a negative FOOD (−3.916). They are possible high-income scenarios with equitable allocations, yet structural economic limits. Past research has confirmed non-linear and, at times, negative associations between social indicators and emissions [51]. On their own, Cluster 2 complements high emissions and inequity, Clusters 0 and 3 exhibit low emissions and mixed social performance, and Cluster 4 illustrates selected equity–emissions decoupling. The finding corroborates multisided ESG, where environmental and social elements exhibit interlocking and context-aware relationships.

5.3. Finding the Best Fit: A Comparative Evaluation of Regression Models on ESG Data

We compared a broad range of regression methods on ESG data with conventional error measures (MSE, RMSE, MAE, MAPE, and R2). The assessment covered classical techniques (linear regression, Decision Trees) and current-state methods (ensemble methods, neural networks). The findings reveal clear trade-offs: ensemble methods consistently achieved optimal predictive quality and generalization, while linear methods achieved the best interpretability and computational efficiency. The quality of neural networks was patchy, being superior on some occasions and unstable. The research presents the optimal regression methods for ESG prediction tasks, striking a balance between quality, efficiency, and interpretability. See Figure 12.
A comparison between error measures and R2 values from regression models reveals sharp contrasts in the prediction and generalization capacities. Compared models are linear, regularized linear, tree-based, k-Nearest Neighbors (k-NNs), support vector machines (SVMs), neural networks, and Boosting. Multi-model benchmarking similar to that provided is common in cryptocurrency predictions [85], vehicle price prediction [86], and e-commerce satisfaction modeling [87]. The best R2 was 1.0, a perfect explanation of variance. Decision Tree (0.677) and k-NN (0.655) were close, while linear and regularized linear regressions (0.014) and SVM (0.0) were poor. From error measures (MSE, RMSE, MAE), Random Forest stood out above the others, with 0s reflecting an extremely close fit that nonetheless raises some concern about overfitting in the absence of cross-validation. The best error was logged by SVM, corroborating its poor prediction performance. k-NN demonstrated consistent performance (RMSE = 0.285, MAE = 0.17), while Decision Tree was decent (RMSE = 0.782, MAE = 0.741). The best MAPE was logged by Boosting (1.0), followed by Neural Networks (0.376), k-NN (0.453), and SVM scored 0.0 again. On MAPE, linear models underperformed across the board, with regularization yielding little improvement. Overall, Random Forest demonstrated the best prediction appropriateness and reliability and thus is best suited for the current regression task, while k-NN and Decision Tree are fair options where interpretability or computational efficiency is paramount.

Social Drivers of Emissions: Interpreting BCE Through Machine Learning and ESG Indicators

Random Forest feature-importance methods were employed to select the most important building-sector CO2 emissions (BCE) determinants at the prediction level with additive accounts. Income distribution, labor force participation, and gender equity were found to be the most significant socio-economic indicators influencing emission outcomes. The indicators were overwhelmingly the most important across measures of importance, revealing a crucial underlying driving force for patterns in BCE. Case-level decompositions also revealed that changes in social equity and participation directly reposition amounts predicted for emissions. The findings provide evidence at a microlevel that environmental performance has social components in an ESG context, and they establish a linkage between equity, governance, and decarbonization strategies. See Figure 13.
Random Forest importance indicators (MDA, TINP, MDL) are significant indicators of socio-economic push factors influencing building-sector CO2 emissions (BCE). The income share for the bottom 20% (INC20) has the greatest predictivity in each measure (MDA = 56.464, TINP = 973.192, MDL = 83.505), suggesting income distribution—the bottom quintile in particular—as crucial for BCE and likely a consequence of unequal accessibility to energy efficiency in buildings, renewable energy, and support infrastructure. The labor force participation (LABF) variable comes in a close second. Although with a moderate level of MDA (4.113), high TINP (755.169) and MDL (81.929) support high predictivity. High participation rates are associated with urban, energy-intensive building use, while low participation rates indicate structural fragility and inefficient energy demand. See Figure 14.
Results from the Random Forest model indicate that social indicators are significant predictors of building-sector CO2 emissions. Female parliamentary representation (WPAR) becomes significant (MDA = 1.018, Node Purity = 599.087, Dropout Loss = 64.896), consistent with its correlation with progressive building standards and energy-efficacy policy. Gender parity in education (GPIE) also emerges strongly (MDA = 1.209, Node Purity = 517.854, Dropout Loss = 61.618), indicating that knowledge and competence are key triggers for low-emissions building. The Food Production Index (FOOD) has a smaller yet complementary effect (Node Purity = 434.772, Dropout Loss = 58.385), which connects rural land use and agro-industrial energy networks. The findings align with studies that identify socio-economic and structural variables as underlying factors in emitters’ trajectories [88,89,90]. Among the myriad variables tested, the income share of the poorest quintile (INC20) emerges as the best predictor, indicating that inclusiveness and equity are at the heart of ESG-driven sustainability. See Figure 15.
Additive model decompositions reveal the respective contributions of social indicators to projected building-related CO2 emissions (BCE) relative to a base level of 35.711 Mt CO2. Case 1 (35.045) experiences the greatest negative effect (−12.268) for labor force participation (LABF), balanced in part by women in parliament (WPAR, +10.702) and parity in children’s educational attainment (GPIE, +6.325); low-infrastructure agrarian settings are experienced for food production (FOOD, −5.862). Case 2 drops to 15.396 on the strength of a sharp loss for the income share for the poorest 20% (INC20, −15.411), and marginal losses for WPAR and LABF. Case 3 (21.101) again experiences a sharp negative effect for LABF (−15.392), while INC20 becomes positive (+2.898), reflecting energy growth due to income equalization. Case 4 (18.029) experiences losses for both LABF (−14.523) and parity in girls’ and boys’ educational attainment (−6.516) against a small gain for INC20 (+7.341). Case 5 jumps to 45.117, as WPAR (+20.099) and LABF (+12.822) propel high-income and high-energy systems, balanced in part by negative INC20 (−15.323) and negative food production (−10.631). Altogether, LABF exerts a constant negative effect on emissions, while WPAR and GPIE act contextually, and INC20 alternates between these two effects.

6. Governance and Carbon: Unpacking the Institutional Drivers of Building-Sector Emissions

This part assesses how governance elements impact building-sector CO2 emissions (BCE), which is highlighted in the “G” pillar of ESG. The fixed- and random-effect regressions are conducted on panel data (n = 982) to ascertain if government effectiveness (GOVT), expenditure on education (EDUE), political stability (STAB), rule of law (LAWR), R&D expenditure (RNDG), hospital expenditure (HOSP), and scientific productivity (SCIE) have significant impacts on BCE. The results reveal complex interplays. Elevated governance capacity often coexists with higher emissions. The latter is a product reflecting infrastructure-push development. Auxiliary clustering (see Section 6.2) categorizes countries on governance and emission patterns. The analysis highlights the various ways in which governance impacts environmental performance in the built environment. A metric summary of the G-Governance component variables within the ESG model used to estimate the value of BCE is shown in Appendix D. Diagnostic tests for autocorrelation and heteroscedasticity for the econometric estimates are provided in Appendix E.

6.1. Governance and the Carbon Cost of Development: A Panel Analysis of Building Emissions

This section examines how considerations regarding governance influence building-sector CO2 emissions (BCE) in relation to the “G” pillar of ESG. On a 982-observation panel, fixed- and random-effects regressions examine government effectiveness, political stability, the rule of law, educational and research and development (R&D) expenditure, hospital infrastructure, and science output. The results present multifaceted and sometimes contradictory associations: higher governance capacity is often accompanied by higher emissions, providing evidence for development being infrastructure-driven. The study provides empirical evidence for a linkage between governance quality and environmental performance in the built environment, as well as for the interdependence between institutional strength and emission pathways across countries.
We have estimated the following equation:
B P = α + β 1 G O V T i t + β 2 E D U E i t + β 3 S T A B i t + β 4 R N D G i t + β 5 L A W R i t + β 6 H O S P i t + β 7 S C I E i t
where i = 180 and t = 2000–2020 (Table 6).
The present study distinguishes the governance dimension of ESG (“G”) as a predictor for building-sector CO2 emissions (BCE), including residential, commercial, and other building-related emissions (IPCC, 2006; AR5). From a global panel for 982 observations, fixed-effect (LSDV) and random-effect (GLS) specifications are estimated, with BCE as the dependent variable and governance quality proxied by government effectiveness (GOVT), political stability (STAB), rule of law (LAWR), expenditure on education (EDUE), R&D expenditure (RNDG), hospital infrastructure (HOSP), and scientific output (SCIE). The results show that GOVT enters significantly and positively for emissions (12.79 FE; 11.29 RE, p < 0.01), reflecting convergence between effective governments and energy-prolific economies with high incomes [91]. EDUE enters significantly and positively (p < 0.10), reflecting a growing demand for infrastructure to support human capital development phases. On the contrary, political stability reduces emissions (−3.19 FE; −2.30 RE), enabling the effective enforcement of regulations [92]. R&D expenditure (−4.12 FE; −4.25 RE) and LAWR (−4.30 FE; −1.36 RE) lower emissions, corroborating the importance of investing in science and legal capacity for building efficiency and ESG conformity. HOSP enters significantly and positively (∼3.80, highly significant), reasserting that healthcare has a high carbon intensity and that mitigation hinges on efficiency upgrades [93,94]. SCIE enters significantly and positively but non-significantly, reflecting that research systems are both high-energy intensity providers and innovation facilitators. The fixed-effect specifications dominate (R2 = 0.992; AIC/BIC superior; Hausman χ2 = 73.19, p < 0.01), supporting unobserved heterogeneity. Overall, governance is a key predictor of BCE, but its impact hinges on context: while better institutions, stability, and R&D lower emissions, governance intensity in high-income economies comes alongside simultaneous growth driven by infrastructure and supporting higher carbon output [95]. Aligning governance quality with focus-tuned policy and conformity mechanisms is thus crucial to align institutional capability for sustainable decarbonization.

6.2. Governance and Emissions: Clustering Insights from Neighborhood-Based Algorithms

As a supplement to regression analysis, we employed clustering to identify latent groupings among nations based on governance attributes and building-sector CO2 emissions (BCE). Performance measures included R2, AIC/BIC, silhouette statistics, and entropy, as well as robust tests for statistical fit, compaction, and interpretability. Neighborhood-based clustering was optimal, achieving perfect explanatory power (R2 = 1.0) with superior cohesion and separation compared to other methods. The outcome provides a rigorous template to unveil how diverse governance structures are plotted against profiles of emissions, with a sounder analytics basis for ESG-driven diagnostics. See Figure 16.
Neighborhood-based clustering was the best performer among the criteria. Performance was strong and consistent across quality indices. The algorithm achieved an R2 value of 1.0, accounting for 100% of the variance, a very stringent test for model fit. Calinski–Harabasz index was 1.0, supporting tight and well-separated clusters. The AIC and BIC were not at a minimum, reflecting higher complexity, yet better practical validity indices did exist. The Silhouette measure (0.449) registered strong cohesion and sharp separation between groups. Entropy was high (0.962), reflecting structural order. Minimum separation and Pearson’s γ were strong, yet not the best. Balanced strength across both statistical and structural indices confirms Neighborhood-based clustering as the best and most comprehensible approach. Neighborhood-based clustering is the best solution when explanatory power and cluster quality are key.

Mapping Governance-Emission Profiles: Insights from Neighborhood-Based Clustering

To further elucidate the relationship between governance traits and structure-associated CO2 emissions (BCE), the following is an exploration of the resulting neighborhood-based clustering algorithm, deemed optimal based on comparisons with preceding models. By clustering nations based on eight governance and infrastructural indicators (including BCE, government effectiveness, education, research and development (R&D), and institutional quality), such an exploration identifies distinct profiles of countries. By identifying divergent relationships between governance and emissions within these clusters, it yields a profound observation regarding the institutional qualities that converge towards environmental ends amidst diverse development contexts. See Figure 17.
Countries cluster based on eight indicators: building-related CO2 emissions, government effectiveness, education, political stability, R&D, rule of law, hospital density, and scientific output. Cluster 3 has high cohesion (silhouette = 0.911). Governance, stability, and law are high, while education and science are low, indicating weak knowledge-investing regimes. Cluster 6 has the highest emissions and is closely linked to scientific production and R&D. Healthcare is relatively weak. They are industrial states that trade innovation for environmental cost. Cluster 2 (204 countries) clusters around the global means. Low silhouette means an undifferentiated group. Cluster 7 has an emphasis on education, low on R&D. Emerging nations are characteristic of this profile. Cluster 10 invests heavily in education but falls short in health and R&D, thereby limiting development spillovers. Clusters 3 and 6 are distinct: knowledge-constrained states that are stable and high-carbon trade-offs that are stuck in knowledge-intensive states. Such profiles reinforce evidence that structural forces are driving emissions [96,97]. Diversified profiles inform bespoke CO2 mitigation, according to policy scenarios supplied by [98].

6.3. Predicting Emissions with Precision: Machine Learning Models for Governance and BCE

The econometric regressions are augmented with machine-learning regressions. The regressions are K-Nearest neighbors, Decision Trees, Random Forests, and Boosting. The performance is calculated as MSE, RMSE, MAE, MAPE, and R2. The results establish the optimal performing algorithms for non-linear and complex relationships. The forecast precision validates governance variables as significant predictors for building-sector CO2 emissions (BCE). The exercise establishes a data-driven benchmark for tracking environmental outcomes that are shaped by governance. See Figure 18.
Evaluation confirms KNN as the best predictor. Errors are eradicated (MSE, RMSE, MAE = 0; MAPE = 0.054). R2 is 1.0 for ideal predictions for building-sector CO2 emissions (BCE). Decision Trees are also robust (R2 = 0.957). Boosting and Random Forest are similarly close (R2 = 0.843, 0.857). Linear and regularized forms perform poorly. Neural networks are unsuccessful, with an R2 value of 0, indicating a failure in proper generalization. KNN’s strength lies in its non-parametric flexibility, yet it is vulnerable to overfitting without validation. Past research has confirmed both the superiority and limitations in scale-extensive tasks [85,99,100]. Despite the computational cost, KNN is the most accurate and reliable method under the research’s configuration. The hyperparameter optimization settings for the KNN machine learning regression algorithm are given in Appendix F.

What Drives Emissions? Feature Importance of Governance and Knowledge Indicators

This section explores the predictive influence of governance-related variables on building-sector CO2 emissions (BCE) using feature importance metrics and additive model explanations. Feature importance, measured through mean dropout loss across 50 permutations, highlights the central role of scientific output (SCIE), healthcare infrastructure (HOSP), and education and R&D investment in driving model accuracy. See Figure 19.
RMSE (50 permutations) drop-out loss ranks building-sector CO2 emissions predictors across fields of science, socio-economics, and governance. Most strongly ranked is scientific output (SCIE, 170.266), citing roles for knowledge intensity and research infrastructures, in line with [101]. Conflict between hospital and system adequacy ranks two (33.765). Education expenditure (EDUE, 31.846) and R&D expenditure (RNDG, 29.399) again feature strongly for human and technological development, in line with [100]. The governance indicators are less contributory yet significant longitudinally: government efficiency (21.112) and political stability (19.785) are favorable for institutional persistence. Rule of law (3.374) has the lowest rank, demonstrating an indirect, intangible effect. The results validate knowledge production, healthcare, and R&D as lead drivers, and governance quality as complementary infrastructure. See Figure 20.
Figure 20 reports additive contributions for five scenarios relative to a base of 50.263. SCIE, the quantity of scientific publications, is the most negative driver. Offsets range between −33.359 (Cases 1–2) and −4.794 (Case 5). There is a systematic negative correlation between building-related CO2 emissions (BCE) and scientific output. Expanding scientific production is consistent with lower BCE. The mechanism accounts for technological innovation, efficiency in building design, and research-driven policy decisions. See Figure 21.
Government effectiveness (GOVT) and R&D expenditure (RNDG) boost emissions. Case 1 exhibits a +24 increase, and similar effects are observed in Cases 2 and 4. Education expenditure (EDUE) and political stability (STAB) exhibit erratic behavior. They are positive for Cases 1–2 and negative for Cases 3–5, which support the notion of non-linear, context-dependent dynamics. Rule of law (LAWR) makes a moderate contribution, peaking in Case 2 at +16.392. Hospital beds (HOSP) add little explanatory power. Scientific production (SCIE) uniformly reduces emissions across scenarios, and it is a successful mitigation force. Overall, GOVT and RNDG push emissions higher, while SCIE balances them, exposing the structural dualism of development: institutional strength drives growth, and science reduces emissions.

7. Harnessing ESG Dimensions for Effective Building Sector Climate Action

Comparative research finds country clusters in building-sector CO2 and ESG. Policies are diverse in context while requiring a technological foundation in building engineering [102]. Emission profiles are derived from envelope performance, material choice, and HVAC efficiency. Strategies must be context-aware while being technologically distinct. Decarbonization in Germany, Denmark, Finland, and Nordic Europe is a sophisticated endeavor, albeit limited by the possibilities of engineering. Institutionalized actors require air-tight envelopes in the retrofit programs, triple glazing, and digitalized energy monitoring. Carbon pricing on building elements has the potential to provide lifecycle neutrality. Governance and innovation flag-bearers must lead domestic decarbonization and invest in green transitions abroad [103,104,105]. Disseminations in modular architecture, passive architecture, and retrofitting have the potential to transform global practices. Emerging markets, such as India, Indonesia, and Brazil, are experiencing rapid urbanization. Policy priorities include green transportation, affordable housing, and procuring ESG-compliant products. Prefabrication leads to waste reduction. District solar or wind cooling deters HVAC emissions. Cooperation enables leapfrogging [106,107,108]. Codes with specifications for bamboo composites or recycled steel reinforce sustainability. Capacity-constrained states such as Nigeria, Kenya, and Pakistan require institution-building and ESG-conformant programs. Ventilation and compressed earth block pilot retrofits suppress emissions while building social acceptability. School and hospital investments build legitimacy. Global fora build cooperative retrofitting with local adaptability [109]. The Orient and East Asia, comprising China and Korea, lead in governance while experiencing equity gaps. Redistributive policy is required. Subsidized low-cost building retrofits are immediate. Green roof and high-performance envelopes are imminent. AI-generational HVACs, fresh materials, and adaptive urbanism must be propagated [110,111]. Indoor air quality requires continuous monitoring. High-income emitters such as the US and Arab states require strict regulation. Near-zero codes, carbon taxes, and thermal storage constitute key policy measures. Double-skin buildings and recoverable heat systems must be the foundation for high-rise portfolios under green leasing schemes. All contexts require ESG integrity, dependent on coherence. Environmental, governance, and social pillars must be brought into alignment. Clean energy uptake requires governance capacity and social equity. Fragmented ESG does not deliver. Indicators must inform interventions rather than merely serving symbolic alignment [112,113]. Engineering indicators, including insulation transmittance, rates of recovery, and embodied carbon, must be delivered alongside ESG reports [114,115]. Clustering evidence confirms bespoke strategies: OECD leaders take leapfrogging and transfer approaches; new economies leap ahead with modularity and construction; at-risk states fortify their institutions; East Asia balances equity and governance; high-carbon OECD states set codes and establish carbon markets. All else being equal, without engineering integration, ESG remains symbolic. Through interconnection, buildings can achieve net-zero targets and make significant progress on climate mitigation at scale.

8. Limitations

Research correlates CO2 output from buildings with environmental, social, and governance (ESG) indicators across 180 countries. There is strong evidence, and some limitations. However, the analysis depends on national aggregates spanning 22 years. Urban–rural gaps and intra-city differences remain largely invisible, despite their significant importance for emission dynamics. Most sectoral and energy policies take place subnationally. The limited availability of local data restricts the transferability to municipal and urban administrations. Approaches are based on machine learning, econometrics, and clustering. The models are computationally tractable but constrained by crude ESG proxies, specifically end-use energy and air quality [116,117,118]. National-level coverage for governance and social ESG data is sparse [102,119], which restricts its use at smaller scales. Neural-network models did not perform well. The R2 value was approximately zero, and errors were large, indicating shortcomings in the dataset and hyperparameters [120]. Noise, low sample boards, and lack of parameter tuning prevented generalizability. Generalizable AI is expected to reveal non-linear and threshold dynamics in follow-up research. Another point of consideration is that the data ends in 2021. Pandemic-driven transformations—hybrid work, office underuse, and plans for recovery are lacking [121]. Given these factors, future research demands near-real-time data, enriched collections, and novel reporting standards. Only thus are evolving trajectories of emissions detectable by ESG models.

9. Conclusions

CO2 releases by the building sector were compared with ESG indicators across 180 countries. Econometrics, cluster analysis, and machine learning detected multifaceted, multidimensional drivers. Releases are shaped by technology, governance, science, and social investment. Institutions and societies are just as important as engineering. Scientific output was associated with CO2 releases. Higher research intensity nations are better buffered, spread knowledge, innovative, and codified sound policy. Institutional quality and R&D investments reported mixed findings. Strong institutions were ubiquitous in low-emissions economies and densely infrastructured, high-demand economies of aging economies. Governance, innovation, and country-tailored sustainability policies need to be aligned for effective buffering. Social investments build viability for sustainable practice and environmental reform capacity. Equity-led policies are ethical and more. They facilitate direct low-carbon transitions. Education, healthcare, and inclusiveness build resilience and transformation. Integration across methodologies is this study’s strength. Regression, feature importance, and cluster detection distinguish country profiles, institutional trajectories, and predictive dynamics beyond monomethod research. Decarbonizing buildings cannot rely solely on markets or technology. Success depends on governance, science, social investments, and ESG frameworks that value the interdependence between the environment, society, and institutions.

Author Contributions

Conceptualization, N.M., V.N., M.D.M. and A.L.; methodology, N.M., V.N., M.D.M. and A.L.; software, N.M., V.N., M.D.M. and A.L.; validation, N.M., V.N., M.D.M. and A.L.; formal analysis, N.M., V.N., M.D.M. and A.L.; investigation, N.M., V.N., M.D.M. and A.L.; resources, N.M., V.N., M.D.M. and A.L.; data curation, N.M., V.N., M.D.M. and A.L.; writing—original draft preparation, N.M., V.N., M.D.M. and A.L.; writing—review and editing, N.M., V.N., M.D.M. and A.L.; visualization, V.N. and M.D.M.; supervision, V.N. and M.D.M.; project administration, V.N., M.D.M. and A.L.; funding acquisition, V.N. and M.D.M. All authors have read and agreed to the published version of the manuscript.

Funding

The research was funded by “xTech NextHub: Competence Center for Innovative Solutions Development” through the Puglia Regional Call for Proposals for Exemption Aid 17 of 30/09/2014-BURP 139 suppl. of 06/10/2014 and subsequent amendments—Title II Chapter 2 of the General Regulations “Notice for the Submission of Projects Promoted by Large Enterprises Pursuant to Article 17 of the Regulation”. Project’s grant number: VTOIFW0.

Data Availability Statement

The data presented in this study are openly available in World Bank Databank, https://databank.worldbank.org/.

Conflicts of Interest

Author Nicola Magaletti, Valeria Notarnicola and Mauro Di Molfetta were employed by the company LUM Enterprise s.r.l. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A. Data Description

Table A1 compiles an exhaustive set of World Bank variables that constitute the foundation of the ESG framework used in the building sector. The environmental dimension encompasses building-related CO2 emissions (BCE), energy use per unit (ENUC), access to electricity (ELEC), cleaner fuel (CFTC), PM2.5 air pollution, and consumption of renewable energy (RENC), reflecting the balance between energy development and sustainability. The social dimension encompasses food production (FOOD), gender parity in education (GPIE), income share held by the bottom quintile (INC20), participation in the labor force (LABF), as well as parliament participation by women (WPAR), which reflect exclusion and equity. The governance dimension encompasses government effectiveness (GOVT), political stability (STAB), rule of law (LAWR), expenditure on research and development (RNDG), expenditure on education (EDUE), hospital beds (HOSP), and scientific output (SCIE). These, together, allow for a comprehensive view of sustainability (Table A1).
Table A1. We used the following variables acquired from the World Bank.
Table A1. We used the following variables acquired from the World Bank.
AcronymVariablesDefinition
BCECarbon Dioxide (CO2) Emissions From Building (energy) (Mt CO2e)Total annual carbon dioxide equivalent (CO2e) emissions from energy use in the buildings sector, covering IPCC 2006 categories 1.A.4 (Residential and other) and 1.A.5 (Unspecified), converted to CO2e using global warming potentials from the IPCC Fifth Assessment Report (AR5). Unit: Mt CO2e per year.
CFTCAccess to Clean Fuels and Technologies for CookingAccess to clean cooking fuel and technology estimates come from the WHO Global Household Energy Database with national representative household surveys as the sole data source (e.g., DHS, MICS, LSMS, WHS, national censuses). A multivariate hierarchical model—split by urban and rural—estimates fuel-type trends by grouping them as ‘clean’ (e.g., gas, electricity, alcohol) and ‘polluting’ (e.g., biomass, charcoal, coal, kerosene). There are estimates for 191 countries. High-income countries (by World Bank 2022 classification) have universal clean fuel access assumed.
ELECAccess to ElectricityReliable and secure electricity is essential for economic growth, poverty reduction, and human development. As countries decarbonize, dependence on clean, efficient power will grow. Electricity access enables basic services (lighting, refrigeration, appliances) and is a key indicator of energy poverty. Especially in lower-income countries, governments are prioritizing electrification through rural programs and national agencies. While vital for raising living standards, electricity generation can harm the environment—its impact depends on the energy sources used, with fossil fuels like coal being especially carbon-intensive.
ENUCEnergy Use per CapitaTotal energy consumption gauges final energy use after conversion into end-use fuels (e.g., electricity, processed oil). It encompasses energy from combustible renewables and waste—like biomass, biogas, and municipal waste. Biomass describes plant materials used as such or converted into fuel, heat, or power. Figures, as gathered by the IEA, use per capita estimates from the World Bank population. National non-OECD data are converted to IEA equivalence. Figures are imprecise and not completely comparable for countries because of limited data quality, particularly for waste and renewables. Energy values have been computed in terms of oil equivalents on the basis of 33% thermal conversion for nuclear and 100% for hydropower.
PM25PM2.5 PollutionPopulation-weighted exposure to ambient PM2.5 refers to the average level of fine particulate matter (PM2.5) pollution that a country’s population is exposed to. PM2.5 particles, with a diameter smaller than 2.5 microns, can penetrate deep into the lungs and pose serious health risks. This measure is calculated by weighting the annual average PM2.5 concentrations by the population distribution across urban and rural areas.
RENCRenewable Energy ConsumptionThe share of total final energy consumption derived from renewable sources, based on data from IEA, IRENA, UNSD, WHO, and the World Bank (Tracking SDG 7, 2023).
FOODFood Production IndexThe Food Production Index reflects the output of edible crops that offer nutritional value. It excludes items like coffee and tea, which, despite being consumable, do not contribute meaningfully to nutrition. This metric emphasizes food sources that support dietary needs, aligning production data with human nutritional requirements rather than general edibility alone.
GPIEGender Parity in EnrollmentThe Gender Parity Index (GPI) in primary education is calculated by dividing female gross enrollment by male gross enrollment. Data are collected by UNESCO from national education surveys and aligned with ISCED standards to ensure international comparability. The current methodology was adopted in 2011. Reference years reflect when the school year ends. A GPI below 1 indicates girls are disadvantaged; above 1 indicates boys are. Achieving gender parity enhances women’s opportunities and contributes to broader social and economic development.
INC20Income Share Lowest 20%The percentage share of income or consumption reflects the portion received by population subgroups, typically divided into deciles or quintiles. Due to rounding, quintile shares may not total exactly 100%. Data come from household surveys via national statistics agencies and World Bank departments, with high-income country data largely from the Luxembourg Income Study. These measures support the World Bank’s goal of shared prosperity—focusing on income growth among the bottom 40%—and help assess inequality within and across countries.
LABFLabor Force ParticipationThe labor force participation rate represents the share of the population aged 15 and older that is economically active, including all individuals engaged in the production of goods and services during a specific period. Data, sourced from the ILO’s modeled estimates, highlight persistent gender disparities: women’s labor force participation is generally lower than men’s due to social, legal, and cultural norms. In low-income countries, women often work unpaid in family enterprises, while, in high-income nations, higher education has expanded their access to better employment opportunities, though inequalities persist.
WPARWomen in ParliamentWomen in parliament refers to the percentage of seats held by women in a single or lower house of national parliaments. Although progress has been made, women remain significantly underrepresented in decision-making roles, especially in lower-income countries. Gender inequality in political participation limits women’s influence on policy and national priorities. Equal representation is essential for inclusive governance and sustainable development. True democracy requires full participation of women, whose perspectives and leadership are vital for shaping equitable and effective public policies.
GOVTGovernment EffectivenessGovernment effectiveness: Estimated measures of perceptions of public service quality, civil service independence, policy formulation and implementation, and government credibility. Scores range from −2.5 to 2.5, based on a standard normal distribution.
EDUEGov. Expenditure on EducationGeneral government expenditure on education, including current spending, capital outlays, and transfers, is measured as a percentage of GDP. It accounts for education funding from all government levels—local, regional, and central—and includes international transfers to the government. This indicator reflects the government’s financial commitment to the education sector relative to the country’s economic output.
STABPolitical StabilityPolitical stability and absence of violence/terrorism reflect perceptions of the risk of political unrest, government instability, and politically motivated violence or terrorism. Countries are ranked by percentile, from 0 (least stable) to 100 (most stable), allowing global comparison. Percentile ranks are adjusted over time to ensure consistency despite changes in the number of countries included in the Worldwide Governance Indicators (WGIs).
RNDGR&D ExpenditureGross domestic expenditures on research and development (R&D), measured as a percentage of GDP, represent a country’s financial commitment to innovation and technological progress. This includes both capital and current spending across four key sectors: business enterprises, government institutions, higher education, and private non-profits. It encompasses all R&D activities—basic research, applied research, and experimental development—supporting economic and scientific advancement.
LAWRRule of LawRule of law reflects perceptions of how much confidence individuals and institutions have in societal rules, particularly regarding contract enforcement, property rights, police effectiveness, and judicial independence. It also considers the likelihood of crime and violence. Countries receive a score ranging from approximately −2.5 (weak rule of law) to 2.5 (strong), based on a standard normal distribution.
HOSPHospital BedsHospital beds refer to the total number of beds that are maintained, staffed, and immediately available for the admission of patients. These include inpatient beds in public and private hospitals, general and specialized institutions, and rehabilitation centers. The count typically covers beds used for both acute and chronic care, reflecting the overall healthcare system’s capacity for treatment and recovery.
SCIEScientific ArticlesScientific and technical journal articles represent the total number of peer-reviewed publications in key research areas, including physics, biology, chemistry, mathematics, clinical medicine, biomedical research, engineering and technology, and earth and space sciences. These articles reflect ongoing advancements, innovation, and collaboration within the global scientific community, contributing to knowledge expansion and technological development across multiple disciplines and industries.

Appendix B. E-Environment

Descriptive statistics suggest high dispersion in all indicators, a very high building-sector CO2 emission (BCE) range, echoing its very high range of (0–729,783 Mt CO2e) and extreme skewness of 7251, as an estimator of high-emitting countries’ tail effect. Clean fuel consumption (CFTC) and electricity consumption (ELEC) indicate a high median and mode of 100%, as would be expected in high-income nations with widespread electricity use. High standard deviations indicate widespread inequity worldwide. Per capita energy consumption (ENUC) and PM2.5 air pollution exhibit high interquartile ranges and kurtosis, indicating the presence of outliers. Renewable energy consumption (RENC) remains uneven, reflecting the uneven nature of the global energy transition (Table A2).
Table A2. Descriptive statistics of building emissions (BCE) and associated energy-environmental indicators.
Table A2. Descriptive statistics of building emissions (BCE) and associated energy-environmental indicators.
CFTCELECBCEENUCPM25RENC
Valid391639404140217321803805
Missing224200019671960335
Mode100,000100,0000.006172017,8690.000
Median83,45097,0001098119022,74824,690
Mean63,31277,34918,177234728,01033,772
Std. Error of Mean0.6260.497105262,2810.3830.488
95% CI Mean Upper64,54078,32320,239246928,76134,728
95% CI Mean Lower62,08576,37616,114222527,26032,816
Std. Deviation39,17931,16967,687290317,87130,072
95% CI Std. Dev. Upper40,06731,87369,178299218,41830,764
95% CI Std. Dev. Lower38,33030,49666,260281917,35629,411
Coefficient of Variation0.6190.403372412370.6380.890
MAD16,55030001064851,251949720,700
MAD Robust24,537444815781,262,06414,08130,690
IQR77,40044,17669762,531,12521,58449,260
Variance1534971,53745818.429 × 106319,361904,338
95% CI Variance Upper1605101547858.954 × 106339,207946,397
95% CI Variance Lower1469930,01843907.949 × 106301,215865,034
Skewness−0.509−11297251275810590.649
Std. Error of Skewness0.0390.0390.0380.0530.0520.040
Kurtosis−1431−0.23659,21899110.716−0.932
Std. Error of Kurtosis0.0780.0780.0760.1050.1050.079
Shapiro–Wilk0.7970.7370.2650.6990.9170.885
p-value of Shapiro–Wilk<0.001<0.001<0.001<0.001<0.001<0.001
Range99,90099,928729,78321,41997,80898,340
Minimum0.1000.0720.0001540−25660.000
Maximum100,000100,000729,78321,42095,24398,340
25th percentile22,60055,8240.168536,61715,5757450
50th percentile83,45097,0001098119022,74824,690
75th percentile100,000100,0007144306737,15856,710
25th percentile22,60055,8240.168536,61715,5757450
50th percentile83,45097,0001098119022,74824,690
75th percentile100,000100,0007144306737,15856,710
Sum247,931304,75575,2515.102 × 10661,062128,502
Kernel density and histogram plots of building-sector CO2 emissions (BCE) and corresponding environmental and social indicators: access to clean fuels (CFTC), electricity access (ELEC), per capita energy use (ENUC), PM2.5 air pollution, and renewable energy consumption (RENC). The graphs reveal right-skewness of BCE and ENUC and clustering at complete access (100%) for CFTC and ELEC, with variation across countries (Figure A1).
Figure A1. Distribution plots of BCE and related ESG indicators.
Figure A1. Distribution plots of BCE and related ESG indicators.
Buildings 15 03601 g0a1
A matrix of scatterplots with marginal density plots of bivariate relations between building-related CO2 emissions (BCE), access to clean fuels (CFTC), electricity access (ELEC), energy use per capita (ENUC), PM2.5 air pollution, and renewable energy consumption (RENC) is presented. Plots indicate non-linear and highly skewed relations, where BCE and ENUC exhibit strong right-skewness and a shared clustering pattern, predominantly at energy access and pollution exposure variables (Figure A2).
Figure A2. Pairwise scatterplots and density distributions of BCE and ESG-related indicators.
Figure A2. Pairwise scatterplots and density distributions of BCE and ESG-related indicators.
Buildings 15 03601 g0a2
Figure 3 shows boxplots of building-related CO2 emissions (BCE) and matching ESG indicators: CFTC, ELEC, ENUC, PM2.5, and RENC. BCE and ENUC exhibit high right-skew, with extreme outliers, such as points exceeding 700 Mt CO2e and per capita energy consumption above 20,000 units, indicating the presence of high-emission, high-consumption countries. Conversely, CFTC, ELEC, and RENC register more slender interquartile ranges with concentrations of points near 100%, indicating widespread access in high-income countries. PM2.5 captures medium variation with some extreme pollution conditions. The visualization reveals the planetary heterogeneity and unequal distribution of emissions and energy access (Figure A3).
Figure A3. Boxplots of BCE and ESG-related indicators with outliers highlighted.
Figure A3. Boxplots of BCE and ESG-related indicators with outliers highlighted.
Buildings 15 03601 g0a3
Figure 4 presents Q–Q (quantile–quantile) plots of building-sector CO2 emissions (BCE) and corresponding ESG indicators: CFTC, ELEC, ENUC, PM2.5, and RENC. Each variable’s empirical distribution is contrasted with a normal distribution. BCE and ENUC considerably deviate from normality, with heavy right tails and outliers, reflecting positive skew and the presence of extreme values, as would be predicted given unequal worldwide energy use and emissions. CFTC and ELEC have truncated left tails, reflecting many observations at near-universal access. PM2.5 displays moderate curvature, while RENC exhibits moderate skewness. These non-normal distributions necessitate the application of robust, non-parametric, or transformed approaches in modeling (Figure A4).
Figure A4. Q–Q plots assessing normality of BCE and ESG-related indicators.
Figure A4. Q–Q plots assessing normality of BCE and ESG-related indicators.
Buildings 15 03601 g0a4
The scatterplot matrix in Figure 5 graphs pairwise relationships between building-sector CO2 emissions (BCE) and five significant ESG-related variables: access to clean fuels (CFTC), electricity access (ELEC), energy use per capita (ENUC), PM2.5 pollution, and renewable energy consumption (RENC). BCE is weakly positive with ENUC and PM2.5, as would be anticipated with higher consumption and pollution with higher emissions. BCE is weakly negative with RENC, indicating a compensating effect of renewable energy. More sizeable correlations emerge between CFTC and ELEC, whereas RENC is negative with CFTC as well as with ELEC. The density curves identify non-normal distributions across variables (Figure A5).
Figure A5. Scatterplot matrix with density overlays for BCE and ESG-related variables.
Figure A5. Scatterplot matrix with density overlays for BCE and ESG-related variables.
Buildings 15 03601 g0a5
Table 2 presents the covariance matrix between building-related CO2 emissions (BCE) and several ESG indicators: clean fuel access (CFTC), electricity access (ELEC), per capita energy use (ENUC), PM2.5 air pollution, and renewable energy use, in relation to BCE. Covariances define the direction and strength. BCE has a moderate positive correlation with CFTC (328.1), ELEC (294.5), and ENUC (29.0) due to related higher consumption and energy accessibility, which is typically associated with higher emissions. BCE has a strong negative correlation with RENC (−416.6), confirming the use of renewable energy in emission offset. Strong covariation between CFTC and ELEC (715.0) defines the associated infrastructuring development process, while negative covariations vs. RENC suggest energy change mismatches. These covariations highlight systemic relationships defining emission dynamics worldwide (Table A3).
Table A3. Covariance matrix of BCE and ESG-related indicators.
Table A3. Covariance matrix of BCE and ESG-related indicators.
CFTCELECBCEENUCPM25RENC
CFTC1.212715.040328.14549.067−281.490−744.185
ELEC715.040771.991294.48233.913−112.064−458.817
BCE328.145294.4826.01128.989−8.701−416.588
ENUC49.06733.91328.9898.975 × 106−3.560−26.307
PM25−281.490−112.064−8.701−3.560310.835109.470
RENC−744.185−458.817−416.588−26.307109.470746.022
The correlation matrix reveals significant relationships between building-related CO2 emissions (BCE) and ESG-energy indicators. BCE is weakly positively correlated with clean fuel access (0.122), electricity access (0.137), and per capita energy consumption (0.125) while exhibiting a moderate growth increase in the latter, particularly with higher structure and consumption. BCE is negatively correlated with renewable consumption (−0.197) in supporting the renewables-driven process of decarbonization. Electricity access is highly correlated with access to electricity and clean fuels (0.739), while demonstrating a dependency between the two infrastructures. Renewable energy consumption (RENC) is highly negatively correlated with consumption-based carbon pollution (CFTC) (−0.783) and with electricity consumption (ELEC) (−0.605), indicating possible energy transition trade-offs. PM2.5 pollution shows weak positive correlations with other indicators (Table A4).
Table A4. Pearson correlation matrix of building-sector CO2 emissions (BCE) and ESG-energy indicators.
Table A4. Pearson correlation matrix of building-sector CO2 emissions (BCE) and ESG-energy indicators.
CFTCELECBCEENUCPM25RENC
CFTC1.0000.7390.1220.470−0.459−0.783
ELEC0.7391.0000.1370.407−0.229−0.605
BCE0.1220.1371.0000.125−0.006−0.197
ENUC0.4700.4070.1251.000−0.067−0.321
PM25−0.459−0.229−0.006−0.0671.0000.227
RENC−0.783−0.605−0.197−0.3210.2271.000

Appendix C. S-Social

Below is the table of descriptive statistics of five chief social indicators—food production (FOOD), gender parity in education (GPIE), income share of the bottom 20% (INC20), labor force participation (LABF), and women in parliament (WPAR)—and building-related CO2 emissions (BCE). BCE is extremely right-skewed (skewness = 7251) and extremely kurtotic due to the impact of some high-emitting states. Socio-environmental indicators like LABF and GPIE exhibit a relatively even distribution, while WPAR is unsurprisingly at the tail end on average, as it registers long-term patterns of divergent gender representation. Incidentally, FOOD and INC20 have high variance coupled with high skewness, which reflect structural inequity. Contrast between dispersion and distribution shows the perception about the lopsidedness of socio-environmental states of nations (Table A5).
Table A5. Descriptive statistics of social indicators and building-related CO2 emissions (BCE).
Table A5. Descriptive statistics of social indicators and building-related CO2 emissions (BCE).
FOODGPIEINC20LABFWPARBCE
Valid410226651607399839634140
Missing38147525331421770
Mode0.1230.990710055,1460.0000.006
Median96,0200.998710067,32417,3021098
Mean92,769216,096787965,66319,26318,177
Std. Error of Mean0.38347,2450.2360.1860.2041052
95% CI Mean Upper93,520308,737834166,02919,66320,239
95% CI Mean Lower92,017123,454741665,29718,86316,114
Std. Deviation24,5422438945811,79112,84767,687
95% CI Std. Dev. Upper25,0852506979712,05513,13669,178
95% CI Std. Dev. Lower24,0222375914211,53812,57066,260
Coefficient of Variation0.26511,28712000.1800.6673724
MAD10,0000.0241500698979271064
MAD Robust14,8260.035222410,36111,7521578
IQR21,7580.049305014,34016,1906976
Variance602,2955.949 × 10689,448139,028165,0384581
95% CI Variance Upper629,2386.281 × 10695,972145,331172,5544785
95% CI Variance Lower577,0535.642 × 10683,570133,129158,0054390
Skewness261511,6017932−0.91611977251
Std. Error of Skewness0.0380.0470.0610.0390.0390.038
Kurtosis43,986135,46365,2601406268959,218
Std. Error of Kurtosis0.0760.0950.1220.0770.0780.076
Shapiro–Wilk0.8160.0600.2500.9540.9300.265
p-value of Shapiro–Wilk<0.001<0.001<0.001<0.001<0.001<0.001
Range502,01733,37693,76976,46787,730729,783
Minimum0.1230.0000.18713,1560.0000.000
Maximum502,14033,37693,95689,62387,730729,783
25th percentile81,7570.971535059,36810,0000.168
50th percentile96,0200.998710067,32417,3021098
75th percentile103,5151020840073,70926,1907144
Sum380,537575,89412,660262,52076,33875,251
Figure A6 presents the distribution of the social component variables under the ESG framework, highlighting their diversity and the occurrence of skewness in some indicators. The Food Production Index (FOOD) reveals a positively skewed distribution, where most observations are clustered below 150, indicating that even though most countries exhibit moderate food production, some countries exhibit significantly higher levels. The Gender Parity Index in Education (GPIE) also reveals a strong positive skew, where most countries are clustered around lower values, with a small fraction achieving very high values, indicating uneven convergence in schooling equity. Likewise, the variable Income Share of the Poorest 20% (INC20) is skewed to the right, indicating that in most instances the poorest quintile shares just a small fraction of the country’s income. Variable Labor Force Participation (LABF), on the other hand, approximates a normal distribution, centered around the value of 60%, indicating a more balanced distribution across the population. Women in Parliament (WPAR) are distributed widely, albeit still skewed, indicating that many countries exhibit modest representation, while others exhibit relatively high representation. Variables Building-sector CO2 Emissions (BCE) and Building Performance (BP) indicate strong right-skewness, where most values cluster around the low values, albeit a few extreme values overshading the distribution. Generally, the figure highlights the asymmetric distribution of social and emission-related variables, where structural inequality and varying distribution paths across countries are indicated (Figure A6).
Figure A6. Distribution of social component variables in the ESG framework.
Figure A6. Distribution of social component variables in the ESG framework.
Buildings 15 03601 g0a6
Boxplots illustrating the spread and outliers of key social indicators used in the ESG framework. Variables include food expenditure (FOOD), public investment in education (GPIE), household income (INC20), labor force participation (LABF), women’s participation rate (WPAR), and basic consumption expenditure (BCE). Outliers are labeled and represent extreme values deviating from the interquartile range (Figure A7).
Figure A7. Boxplot distributions of social ESG variables with outlier detection.
Figure A7. Boxplot distributions of social ESG variables with outlier detection.
Buildings 15 03601 g0a7
Quantile–Quantile (Q–Q) plots for variables represent the social component in ESG analysis. The deviation of points from the theoretical normal line indicates non-normal distributions, particularly for FOOD, GPIE, INC20, and BCE, which exhibit strong right skewness and heavy tails. These findings suggest the need for transformation or robust statistical methods (Figure A8).
Figure A8. Normality assessment of social ESG variables via Q–Q plots.
Figure A8. Normality assessment of social ESG variables via Q–Q plots.
Buildings 15 03601 g0a8
The scatterplot matrix of the social component variables from the ESG framework includes FOOD, GPIE, INC20, LABF, WPAR, and BCE. The diagonal panels display the univariate density distributions, while the off-diagonal panels show pairwise scatterplots. The matrix reveals strong non-normality, right-skewed distributions, and possible non-linear associations between variables such as INC20 and GPIE (Figure A9).
Figure A9. Scatterplot matrix and density distributions of social ESG variables.
Figure A9. Scatterplot matrix and density distributions of social ESG variables.
Buildings 15 03601 g0a9
The plot visualizes the bivariate scatter distributions and marginal densities among the social indicators employed in the model, including FOOD (food production volume), GPIE (gross primary income equity), INC20 (income share of the bottom 20%), LABF (labor force participation rate), WPAR (women in parliament), and BCE (building-related CO2 emissions). This matrix enables an exploratory assessment of linear associations and potential collinearity among the variables constituting the social dimension of the ESG framework, thereby contributing to a deeper understanding of their influence on emission patterns within the built environment (Figure A10).
Figure A10. Pairwise correlation plot of social pillar variables in the ESG framework.
Figure A10. Pairwise correlation plot of social pillar variables in the ESG framework.
Buildings 15 03601 g0a10
The table reports the covariances among key social indicators—FOOD, GPIE, INC20, LABF, WPAR—and building-related CO2 emissions (BCE). Positive values indicate a direct relationship in variability between variable pairs, whereas negative values suggest an inverse relationship. High absolute values reflect stronger associations in the variance structure, which may inform multicollinearity diagnostics in subsequent regression models (Table A6).
Table A6. Covariance matrix of social ESG variables and building-related CO2 emissions.
Table A6. Covariance matrix of social ESG variables and building-related CO2 emissions.
FOODGPIEINC20LABFWPARBCE
FOOD424.789−37.108−96.944114.735−82.864107.769
GPIE−37.1081.262 × 10735.079−24.53826.7191.483
INC20−96.94435.079102.362−65.16577.834−0.529
LABF114.735−24.538−65.165126.359−32.17739.820
WPAR−82.86426.71977.834−32.177191.207−29.946
BCE107.7691.483−0.52939.820−29.9467.518
The table presents Pearson correlation coefficients among the selected social indicators—FOOD, GPIE, INC20, LABF, and WPAR—and the variable BCE, representing building-related CO2 emissions. Strong positive correlations are observed between GPIE and INC20 (r = 0.976), while negative associations emerge between GPIE and LABF (r = −0.615) and between INC20 and LABF (r = −0.573). Correlations with BCE are weak, suggesting limited linear association with social variables in this model (Table A7).
Table A7. Pearson correlation matrix among social ESG variables and building-related CO2 emissions.
Table A7. Pearson correlation matrix among social ESG variables and building-related CO2 emissions.
FOODGPIEINC20LABFWPARBCE
FOOD1.000−0.507−0.4650.495−0.2910.060
GPIE−0.5071.0000.976−0.6150.5440.005
INC20−0.4650.9761.000−0.5730.556−6.033 × 10−4
LABF0.495−0.615−0.5731.000−0.2070.041
WPAR−0.2910.5440.556−0.2071.000−0.025
BCE0.0600.005−6.033 × 10−40.041−0.0251.000

Appendix D. G-Governance

This table includes complete descriptive statistics of seven governance and public service indicators (GOVT: government effectiveness; EDUE: education expenditure; STAB: political stability; RNDG: R&D expenditure; LAWR: rule of law; HOSP: hospital beds; SCIE: scientific publications) and the variable BCE (building-related CO2 emissions). Such statistics include central tendency, dispersion, shape (skewness and kurtosis), and results of the Shapiro–Wilk normality test. High skewness and extreme kurtosis of a number of variables (e.g., BCE, LAWR, SCIE) suggest non-normal distributions likely requiring transformation prior to modeling (Table A8).
Table A8. Descriptive statistics of governance and public service variables related to building CO2 emissions.
Table A8. Descriptive statistics of governance and public service variables related to building CO2 emissions.
BCEGOVTEDUESTABRNDGLAWRHOSPSCIE
Valid41403907276039491883394120303782
Missing0233138019122571992110358
Mode0.006−1.1580.0001.1700.0180.8341.3000.000
Median1.098−0.21514.031−0.0190.577−0.2703.015226.270
Mean18.177−0.02914.370−0.0090.9480.5833.73310.180
Std. Error of Mean1.0520.0210.0970.0250.0230.1380.060688.649
95% CI Mean Upper20.2390.01214.5610.0400.9940.8543.85011.530
95% CI Mean Lower16.114−0.06914.180−0.0580.9020.3123.6168.830
Std. Deviation67.6871.3005.1021.5581.0178.6822.68342.350
95% CI Std. Dev. Upper69.1781.3295.2401.5941.0518.8782.76943.327
95% CI Std. Dev. Lower66.2601.2724.9711.5250.9868.4942.60341.417
Coefficient of Variation3.724−45.2590.355−171.7681.07314.8890.7194.160
MAD1.0640.6603.2350.7110.4300.7001.715222.835
MAD Robust1.5780.9794.7971.0540.6381.0382.543330.375
IQR6.9761.3736.5061.4241.1381.4403.8783.153
Variance4.5811.69026.0282.4291.03475.3707.2011.794 × 109
95% CI Variance Upper4.7851.76727.4582.5401.10478.8127.6651.877 × 109
95% CI Variance Lower4.3901.61724.7072.3250.97172.1496.7771.715 × 109
Skewness7.2514.1520.4546.0091.27311.9681.1118.482
Std. Error of Skewness0.0380.0390.0470.0390.0560.0390.0540.040
Kurtosis59.21834.6101.47763.8631.633143.1741.25985.379
Std. Error of Kurtosis0.0760.0780.0930.0780.1130.0780.1090.080
Shapiro–Wilk0.2650.7340.9770.6150.8580.1120.9110.235
p-value of Shapiro–Wilk<0.001<0.001<0.001<0.001<0.001<0.001<0.001<0.001
Range729.78314.58044.80222.9607.586111.21614.590669.746
Minimum0.000−2.4390.000−3.313−1.880−2.5910.100−2.283
Maximum729.78312.14144.80219.6475.706108.62514.690669.744
25th percentile0.168−0.79610.877−0.7220.230−0.8541.60325.813
50th percentile1.098−0.21514.031−0.0190.577−0.2703.015226.270
75th percentile7.1440.57717.3830.7021.3680.5865.4803.179
Sum75.251−112.21539.661−35.8301.7852.2977.5773.850 × 107
This figure displays kernel density estimates and histograms for BCE (building-related CO2 emissions), GOVT (government effectiveness), EDUE (education expenditure), STAB (political stability), RNDG (research and development expenditure), LAWR (rule of law), HOSP (hospital beds), and SCIE (scientific publications). Most variables exhibit skewed distributions, particularly BCE, LAWR, and SCIE, suggesting potential non-normality and the need for transformation prior to parametric analysis (Figure A11).
Figure A11. Density distributions of governance and public service indicators related to building CO2 emissions.
Figure A11. Density distributions of governance and public service indicators related to building CO2 emissions.
Buildings 15 03601 g0a11
This scatterplot matrix exhibits the bivariate relationships and marginal density distributions between BCE (building-related CO2 emissions), GOVT (government effectiveness), EDUE (education expenditure), STAB (political stability), RNDG (research and development expenditure), LAWR (rule of law), HOSP (hospital beds), and SCIE (scientific publications). It facilitates visualization of patterns of correlation, hetero-scedasticity, and distributional skewness in aid of preliminary diagnostics for multivariate modeling within the ESG context (Figure A12).
Figure A12. Pairwise scatterplot matrix of governance and public service variables related to building CO2 emissions.
Figure A12. Pairwise scatterplot matrix of governance and public service variables related to building CO2 emissions.
Buildings 15 03601 g0a12
Figure A13 shows boxplots of distributions and outliers of BCE (building-related CO2 emissions), GOVT (government effectiveness), EDUE (education expenditure), STAB (political stability), RNDG (research and development expenditure), LAWR (rule of law), HOSP (hospital beds), and SCIE (scientific publications). Most of these variables contain apparent outliers and asymmetries, signifying signs of skewed distributions and a need to normalize or utilize robust estimation methods in subsequent analyses (Figure A13).
Figure A13. Boxplot distribution of governance and public service indicators related to building CO2 emissions.
Figure A13. Boxplot distribution of governance and public service indicators related to building CO2 emissions.
Buildings 15 03601 g0a13
Quantile–Quantile (Q–Q) plots check normality of variables such as BCE (building-related CO2 emissions), GOVT (government effectiveness), EDUE (education expenditure), STAB (political stability), RNDG (research and development expenditure), LAWR (rule of law), HOSP (hospital beds), and SCIE (scientific publications). Strong departures from the reference line in all plots, especially for BCE, LAWR, and SCIE, indicate heavy-tailed and right-skewed distributions and thus non-normality and potential need for transformation in statistical modeling (Figure A14).
Figure A14. Q–Q plots for governance and public service indicators related to building CO2 emissions.
Figure A14. Q–Q plots for governance and public service indicators related to building CO2 emissions.
Buildings 15 03601 g0a14
Figure A15 uncovers a scatter plot matrix exploring the relationship between building-related CO2 emissions (BCE) and ESG governance metrics. Scatters reveal varied patterns, including both positive and negative associations across variables. Government effectiveness (GOVT) is strongly positively associated with BCE, such that high-institution-capability economies also exhibit high emissions, likely the result of infrastructure expansion. Expenditure on education (EDUE) depicts a moderate positive association with emissions, such that investments in human capabilities are accompanied by higher energy consumption. Political stability (STAB) and the rule of law (LAWR) tend to exhibit negative associations with BCE, as hypothesized, considering the role that strong order plays in enforcing regulations that reduce emissions. R&D expenditure (RNDG) exhibits a negative association with BCE, indicating that innovations can serve as a source of emissions savings. Expenditure on hospitals (HOSP) exhibits a positive association, reflecting the carbon intensities embedded in the health sector. Finally, scientific productivity (SCIE) depicts a multifaceted relationship: high output accompanied by emissions reflecting advanced economies, while high scientific productivity is also a source reflecting innovations that yield emissions savings. As a panel, the figure separates the double-edged role of governance: capable institutions enable reductions, while in high-income cases, governance emerges hand-in-hand with carbon-intensive growth trajectories (Figure A15).
Figure A15. Scatter plot matrix of building-related CO2 emissions (BCE) and G-governance indicators.
Figure A15. Scatter plot matrix of building-related CO2 emissions (BCE) and G-governance indicators.
Buildings 15 03601 g0a15
Table A9 provides some indication regarding the covariance connection between building-related CO2 emissions (BCE) and governance indicators. It uncovers both reinforcing and conflicting dynamics. BCE significantly covaries positively with both R&D expenditure (27.514) and scientific output (6.849 × 106), substantiating that innovative economies pay a higher price in emissions due to energy-intensive infrastructure. Conversely, BCE negatively covaries with both the rule of law (−15.110) and political stability (−14.152), indicating that effective institutions can impose environmental regulations as well as decelerate emissions. It is surprising, however, that government effectiveness (4.309) positively covaries with BCE, signaling the irony that effective government is often associated with high-carbon economies that are sophisticated. Similarly, hospital infrastructure (−4.792) also exhibits a moderate negative covariance, reflecting the cost that the carbon sector incurs in health systems. Education expenditure (−3.168) negatively covaries with BCE, although it exhibits strong negative relationships with governance indicators, indicating trade-offs in resource allocation. As a panel, the matrix sheds insight into the double responsibility that governance has, generating growth while also inflicting structure-based challenges within decarbonization (Table A9).
Table A9. Covariance matrix of governance and public service variables in relation to building CO2 emissions.
Table A9. Covariance matrix of governance and public service variables in relation to building CO2 emissions.
BCEGOVTEDUESTABRNDGLAWRHOSPSCIE
BCE11.9504.309−3.168−14.15227.514−15.110−4.7926.849 × 106
GOVT4.3092.653−2.9433.2350.72218.6550.9665.898
EDUE−3.168−2.94316.500−3.925−0.818−21.927−4.630−4.123
STAB−14.1523.235−3.9254.7030.43326.4011.816−8.322
RNDG27.5140.722−0.8180.4331.0111.5600.66522.538
LAWR−15.11018.655−21.92726.4011.560169.8736.960−34.172
HOSP−4.7920.966−4.6301.8160.6656.9607.315−5.751
SCIE6.849 × 1065.898−4.123−8.32222.538−34.172−5.7514.319 × 109
The table shows Pearson correlation coefficients between BCE (building-related CO2 emissions) and governance-related indicators like GOVT (government effectiveness), EDUE (education expenditure), STAB (political stability), RNDG (research and development expenditure), LAWR (rule of law), HOSP (hospital beds), and SCIE (scientific publications). The highly positive relationship between BCE and SCIE (r = 0.953) suggests intense co-movement, while GOVT and STAB suggest high mutual correspondence (r = 0.916) too. These correlations provide some insights into likely multi-collinearity patterns and directional relationships across the governance dimension within the ESG framework (Table A10).
Table A10. Pearson correlation matrix of governance and public service variables related to building CO2 emissions.
Table A10. Pearson correlation matrix of governance and public service variables related to building CO2 emissions.
BCEGOVTEDUESTABRNDGLAWRHOSPSCIE
BCE1.0000.024−0.007−0.0600.250−0.011−0.0160.953
GOVT0.0241.000−0.4450.9160.4410.8790.2190.055
EDUE−0.007−0.4451.000−0.446−0.200−0.414−0.421−0.015
STAB−0.0600.916−0.4461.0000.1990.9340.310−0.058
RNDG0.2500.441−0.2000.1991.0000.1190.2450.341
LAWR−0.0110.879−0.4140.9340.1191.0000.197−0.040
HOSP−0.0160.219−0.4210.3100.2450.1971.000−0.032
SCIE0.9530.055−0.015−0.0580.341−0.040−0.0321.000

Appendix E. Autocorrelation and Heteroscedasticity

Appendix E.1. Autocorrelation for E-Environment

The Wooldridge residual-regression test for the panel-data disturbances shows evident first-order serial correlation: from a sample containing 570 observations, an auxiliary regression of residual on its first lag generates an F statistic of 382.97 with a p-value < 0.001 for rejection of the null hypothesis of no temporal dependence. We observe a coefficient for lagged residual of 0.756, supported by a t ratio of 19.57 and an R-squared of 0.403. This implies that about three-quarters of each disturbance spills over into the next period, with the intercept effectively being 0; there is no systematic bias once persistence is controlled for. Such a strong AR(1) pattern is noteworthy because standard fixed- or random-effect estimators, in conjunction with standard homoscedastic standard errors, are too low in estimating sampling variability and risk misleading inferences and because ordinary least-squares sacrifices efficiency by forgoing information in dependence structure (Table A11).
Table A11. Wooldridge-Type Test for First-Order Autocorrelation (Residual Regression Method).
Table A11. Wooldridge-Type Test for First-Order Autocorrelation (Residual Regression Method).
SourceSSdfMS
Model12,807.621112,807.621
Residual18,995.490256833.4427644
Total31,803.111256955.8929898
Number of obs570
F(1, 568)382.97
Prob > F0.000
R-Squared0.4027
Adj R-Squared0.4017
Root MSE5.783
UhatUhat_lag_cons
Coefficient0.7561607−0.0021011
Std. Err.0.03863940.2423395
T19.57−0.01
p > |t|0.680267−0.4780921
[95% Con. Interval]0.83205430.4738899
To restore valid inference, we therefore re-estimate the model under both fixed-effect and random-effect frameworks using covariance matrices that are cluster-robust by country; this adjustment keeps standard errors consistent in the simultaneous presence of heteroskedasticity and any form of within-panel autocorrelation, allowing t, F, and Wald statistics to be interpreted with confidence even in light of the persistence revealed by the Wooldridge test. The baseline fixed-effect regression explains little of the within-country variation in building-sector CO2 emissions: the within R2 is only 0.11 and the joint F test on the five covariates is far from conventional significance (p = 0.174). Standard errors clustered by country are appropriately large, and, as a result, none of the coefficients reach the 5% level. The point estimates for energy use per capita, particulate pollution, and renewable energy share are plausible in sign, yet their p-values hover just above 0.07, signaling fragile statistical support (Table A12).
Table A12. Fixed-effect regression results (baseline model for building sector emissions).
Table A12. Fixed-effect regression results (baseline model for building sector emissions).
Model StatisticsValueModel StatisticsValueModel StatisticsValue
Number of observations990Observations per group (max)20ρ (rho)0.9784
Number of groups (N)159F(5, 158)1.56Observations per group (avg)6.2
R-squared (Within)0.1127Prob > F0.1740Observations per group (min)1
R-squared (Between)0.0337corr(ui Xb)−0.1328σe (sigma_e)104.782
R-squared (Overall)0.0312σu (sigma_u)705.105
CFTCELECENUCPM25RENC_cons
Coefficient0.3551−0.27310.00280.6920−0.5133102.987
Std. Err.0.33320.25720.00160.38690.2821121.714
t1.07−1.061.831.79−1.820.85
** p < 0.050.2880.2900.0700.0760.0710.399
95% Conf. Interval[−0.3029, 1.0132][−0.7810, 0.2349][−0.0002, 0.0059][−0.0722, 1.4562][−1.0705, 0.0438][−13.7409, 34.3383]
Note: ** p < 0.05.
The random-effect GLS yields almost identical slopes and only a marginal improvement in precision, with the share of renewables attaining significance at the 5% threshold, while the Wald statistic as a whole remains weak (p = 0.066). The diagnostic statistics reveal the root of the problem: the estimated serial-correlation parameter ρ is close to one, indicating that the idiosyncratic disturbance is dominated by highly persistent shocks, and the variance of the unobserved country component significantly exceeds the idiosyncratic variance. Put simply, most of the action takes place in low-frequency movements that the standard within transformation treats as noise, inflating standard errors and obscuring any true relationship between the covariates and emissions. Retaining clustered variance estimators safeguards inference, but it does so at the cost of substantial efficiency; with a mean time dimension of merely six years per country, that loss is keenly felt (Table A13).
Table A13. Random-effect GLS regression results (building sector emissions).
Table A13. Random-effect GLS regression results (building sector emissions).
DescriptionValueDescriptionValueDescriptionValue
Number of observations990Number of groups (N)159R-squared (Within)0.1126
R-squared (Between)0.0341R-squared (Overall)0.0316Observations per group (min)1
Observations per group (avg)6.2Observations per group (max)20Wald chi2(5)10.33
Prob > chi20.0664corr(ui, X) (assumed)0σu (sigma_u)699.504
σe (sigma_e)104.782ρ (rho)0.9781
CFTCELECENUCPM25RENC_cons
Coefficient0.3110−0.23990.00270.6584−0.485394.543
Robust Std. Err.0.27390.22800.00140.43490.2393105.143
z1.14−1.051.861.51−2.030.90
p > |z|0.2560.2930.0630.1300.0430.369
95% Conf. Interval[−0.2259, 0.8479][−0.6868, 0.2070][−0.0001, 0.0055][−0.1940, 1.5108][−0.9542, −0.0164][−11.1532, 30.0619]
To manage autocorrelation and heteroskedasticity in panel regressions carefully, we estimated both fixed-effect and random-effect regression specifications using Driscoll–Kraay standard errors that are robust to general spatial and temporal dependencies. It is a particularly useful method in empirical applications where cross-sectional dependence co-occurs with serial correlation—a common feature in environmental as well as energy-oriented datasets. The fixed-effect model had an F-statistic value of 3071.86 (df = 25, 20) for the joint significance of all covariates with a p-value < 0.0001. The adjusted R-squared was estimated at 0.1188, indicating a significant explanatory capacity for a model that accounts for endogeneity caused by individual heterogeneity (Table A14).
Table A14. Fixed-effect regression with Driscoll–Kraay standard errors and time fixed effects (building sector emissions).
Table A14. Fixed-effect regression with Driscoll–Kraay standard errors and time fixed effects (building sector emissions).
MethodNumber of ObservationsNumber of GroupsGroup Variable (i)F-Statistic (25, 20)Maximum LagProb > FWithin R-Squared
Fixed-effect regression990159n3071.8620.00000.1188
VariableCoefficientStd. Err.tp > |t|CI LowerCI UpperNote
cftc0.33760.05506.140.0000.22280.4523
elec−0.29150.0967−3.010.007−0.4933−0.0897
enuc0.00280.00055.090.0000.00160.0039
pm250.75600.21613.500.0020.305212.067
renc−0.51900.1812−2.860.010−0.8968−0.1411
t = 1 empty
t = 20.286231.1410.090.928−62.09767.820
t = 328.58414.4271.980.061−0.150958.678
t = 439.22314.5212.700.0140.893469.512
t = 539.63414.6402.710.0140.909570.173
t = 613.7000.38003.610.0020.577321.627
t = 737.74514.2512.650.0150.801867.473
t = 8−13.78314.871−0.930.365−44.80317.238
t = 917.01814.9561.140.269−14.18148.216
t = 1015.75114.7641.070.299−15.04746.549
t = 1111.6960.57772.020.056−0.035423.746
t = 120.01260.46350.030.979−0.95410.9794
t = 13−0.30140.5163−0.580.566−13.7850.7757
t = 1413.3680.77111.730.098−0.271729.453
t = 1515.67610.2231.530.141−0.564837.000
t = 160.172210.7590.160.874−20.72024.164
t = 17−26.09714.772−1.770.093−56.9100.4716
t = 18−33.65814.779−2.280.034−64.487−0.2828
t = 19−32.24414.882−2.170.043−63.287−0.1201
t = 20−45.23414.859−3.040.006−76.228−14.240
t = 21−60.56114.831−4.080.001−91.498−29.625
t = 22 omitted
t = 23 omitted
_cons109.81390.2041.220.238−78.349297.975
The random-effect model estimated using generalized least squares (GLS) had a Wald statistic of 36,113.22 with a p-value < 0.0001, which indicated very strongly that there is a significant set of explanatory variables jointly responsible for explaining variance in the dependent variable. Furthermore, an estimated intra-class correlation coefficient (ρ) came out to be 0.9774, suggesting that nearly 98% of all variation in the dependent variable results from variations between groups rather than within groups. Such a large value for rho suggests an overriding influence by unmeasured, time-invariant factors specific to groups in shaping the outcome variable. Under such circumstances, estimation based on random effects is usually appropriate, as it efficiently accounts for both within- and between-group variation. However, robust standard errors should still be used to account for possible classical assumption violations (Table A15).
Table A15. Random-effect GLS regression with time dummies and robust standard errors (building sector emissions).
Table A15. Random-effect GLS regression with time dummies and robust standard errors (building sector emissions).
Maximum Lagcorr(u_i, Xb)Overall R-SquaredSigma_uSigma_eRho
20 (assumed)0.030569.5110.570.9774
Number of observationsNumber of groupsGroup variable (i)MethodWald chi2(25)Prob > chi2
990159nRandom-effects GLS regression36,113.220.0000
VariableCoef.Std. Err.tp > |t|[95% Conf. Interval]
cftc0.29410.0575.160.0(0.1753, 0.4129)
elec−0.26020.0815−3.190.005(−0.4303, −0.0902)
enuc0.00260.0012.650.016(0.0006, 0.0047)
pm250.7050.1634.320.0(0.3649, 1.045)
renc−0.49670.1158−4.290.0(−0.7383, −0.255)
t20.11417.3250.070.948(−3.4999, 3.728)
t329.32915.1021.940.066(−0.2173, 6.083)
t439.90815.1322.640.016(0.8342, 7.1473)
t540.24515.2492.640.016(0.8437, 7.2054)
t613.7750.23825.780.0(0.8807, 1.8743)
t736.42415.4692.350.029(0.4156, 6.8693)
t8−13.29915.465−0.860.4(−4.5559, 1.8962)
t917.42915.7741.10.282(−1.5475, 5.0332)
t1016.28715.9331.020.319(−1.6949, 4.9523)
t1112.9120.14029.210.0(0.9986, 1.5837)
t120.18170.20210.90.379(−0.24, 0.6034)
t13−0.12310.191−0.640.527(−0.5215, 0.2753)
t1414.7410.19397.60.0(1.0698, 1.8785)
t1516.7220.29355.70.0(1.06, 2.2845)
t160.198912.4190.160.874(−2.3917, 2.7895)
t17−25.58414.557−1.760.094(−5.5948, 0.4781)
t18−33.15215.049−2.20.039(−6.4544, −0.176)
t19−31.79615.339−2.070.051(−6.3793, 0.02)
t20−44.76815.331−2.920.008(−7.6748, −1.2789)
t21−60.08115.257−3.940.001(−9.1906, −2.8256)
_cons10.54870.9891.490.153(−4.26, 25.3559)
Driscoll–Kraay errors permit robustness for classical-assumption failures. Nonparametric covariance estimation allows for autocorrelation and heteroscedasticity in large-T panels. Lag truncation at two counterbalances temporal dynamics and overfitting. Predictors CFTC, ELEC, PM2.5, ENUC, and RENC are significant across specifications. Fine particulate matter (PM2.5) presents a consistently positive, high-significance effect (p < 0.001), confirming robust environmental–health linkages. The temporal dummies (t3–t21) show dynamic variations, a couple of periods significantly different from the base. The impacts are not due to model specification. The Driscoll–Kraay estimators compensate for autocorrelated shocks and inter-regional spill-ins, making plausible inferences where classical errors underestimate variability. Robust estimation improves validity. The results confirm genuine empirical relationships, rather than model-driven noise. Robustness of this kind is essential for providing evidence for climate policy based on longitudinal data.

Appendix E.2. Autocorrelation and Heteroscedasticity for S-Social

Robust macro-panel regression inference is called for. When data encompasses many countries over extended time horizons, classical estimators are tainted by serial correlation and heteroscedasticity, which increases significance and weakens reliability. Diagnostic evidence leaves little doubt: the Wooldridge test (F = 8.892, df = 1, 66, p = 0.004) confirms first-order autocorrelation, exposing the fragility of usual fixed-effect models. Driscoll–Kraay and cluster-robust estimators overcome these frailties by simultaneously addressing temporal dependence and heteroscedasticity. Application of those estimators strengthens inference, repressing true signals and removing false ones. The ensuing models are robust under test and develop policy applicability, forming better bases for research on sustainability. The study illustrates a foundational tenet: sound methodology is subordinate to being necessary. Robust estimation transforms tentative results into robust conclusions, providing a solid foundation for scientific and policy advancements using global panel data (Table A16).
Table A16. Wooldridge test for first-order autocorrelation in panel data.
Table A16. Wooldridge test for first-order autocorrelation in panel data.
TestNull HypothesisF-StatisticDegrees of Freedomp-Value
Wooldridge test for autocorrelationNo first-order autocorrelation8.892(1, 66)0.004
The fixed-effect regression was estimated with Driscoll–Kraay errors. The procedure accounts for autocorrelation, cross-sectional and macro-panel level heteroskedasticity, and cross-sectional dependence that are characteristic of macro-panels. The data include 1246 observations across 138 units. The group variable is “n”. The maximum lag is two, a characteristic of practice. The F-stat. is 37.78 (df = 5, 21; p < 0.001). The regressors are jointly significant. The R2 is 0.0325. The explanatory variables leave little within-entity variation to be explained. The result is characteristic of the social sciences due to unobserved heterogeneity and measurement error. The results for the coefficients are as follows. wpar is significantly negative (p < 0.001). High value means low outcome. labf has a significant negative effect (p = 0.009). inc20 has a significant positive effect (p = 0.006). A higher bottom—20% income is linked with a higher outcome. Gpie has a negative and significant effect (p < 0.001). It could be due to economic distress or the inequality effect. Food has a strong and narrow confidence interval, a positive effect. The prediction is robust. The constant is high and significantly high. The baseline outcome remains at a high level when the predictor is at zero (Table A17).
Table A17. Fixed-effect panel regression with Driscoll–Kraay standard errors.
Table A17. Fixed-effect panel regression with Driscoll–Kraay standard errors.
Regression MethodNumber of ObservationsNumber of GroupsGroup Variable (i)F-Statistic (df = 5, 21)Maximum Lag
Fixed-effect regression DK1246138n37.782
Prob > FWithin R-squared
0.00.0325
VariableCoefficientDK Std. Err.tp > |t|95% Conf. Interval
wpar−0.16924850.03145−5.380.0(−0.2346524, −0.1038446)
labf−0.32347650.111633−2.90.009(−0.55563, −0.091323)
inc200.45462870.14803793.070.006(0.1467669, 0.7624904)
gpie−0.00173320.0004127−4.20.0(−0.0025915, −0.0008749)
food0.08636150.02097984.120.0(0.0427317, 0.1299913)
_cons46.578.375.560.0(29.15322, 64.001)
Overall, the results confirm that the model formulation and estimation plan were appropriate in handling the statistical problems inherent in macro-panel datasets. The detection of autocorrelation using the Wooldridge test also justified the use of Driscoll–Kraay standard errors, which correctly addressed both heteroskedasticity and serial correlation to make standard errors and test statistics robust. Although the R-squared value is low, indicating low explanatory power, the statistical significance results and correct signs for the main variables indicate meaningful and significant associations. Significant negative coefficients for variables such as wpar, labf, and gpie indicate social vulnerability pockets, whereas significant positive effects for inc20 and food indicate food-correlated and inclusive economic benefits in predicting social sustainability outcomes. Overall, the model reflects appropriate econometric handling of the information and offers support for plausible inferences, upon which policy-oriented conclusions are made based on observed covariate associations.

Appendix E.3. Autocorrelation and Heteroscedasticity for G-Governance

A first-order autocorrelation test was conducted on the panel data to determine if a first-order serial correlation exists in the residuals within the model. A null hypothesis (H0) is that no first-order autocorrelation exists. An F-statistic of 19.200 was discovered during a test with a degree of freedom (1, 66) and a corresponding p-value of 0.0000. Since the p-value is exceedingly small in comparison to common significance levels (e.g., 0.05 or 0.01), we reject the null hypothesis. This suggests strongly that first-order autocorrelation exists in the residuals. Autocorrelation has the ability to create biased standard errors; hence, it can trigger invalid statistical inferences. Accordingly, it is preferable to treat it with the right corrective measure. Potentially used robust standard errors include Driscoll–Kraay or clustered standard errors, along with estimation strategies that accommodate serial correlation, such as the use of Prais–Winsten or Arellano–Bond estimation strategies (Table A18).
Table A18. Wooldridge test for first-order autocorrelation in panel data.
Table A18. Wooldridge test for first-order autocorrelation in panel data.
TestF-StatisticDegrees of Freedom (df)p-Value (Prob > F)Decision on H0
Wooldridge test for first-order autocorrelation19.200(1, 66)0.0000Reject H0: first-order autocorrelation is present
Fixed-effect regression relates governance and institutions to building-sector CO2 emissions (BCE). There are 982 observations across 102 units. Within R2 is 0.3000, explaining 30% within-unit variation. Joint significance is ongoing (F(7, 873) = 53.46; p < 0.001). The unit effects are dominant in residual variance (ρ = 0.983). Government effectiveness (12.843; p < 0.001) is positive, indicating that governance contributes to emissions through infrastructure development. The hospital availability coefficient (3.800; p < 0.001) is positive, reflecting the carbon intensity of healthcare. Institutional stability, the rule of law, and regulatory quality are negatively correlated, meaning that stronger institutions are associated with lower pollution. Education exerts a weak but positive effect (p = 0.065), supporting energy demand for school development. Scientific output is mildly positive, likely due to research energy use. The test for homoskedasticity is refuted (χ2 = 1,930,609.76; p < 0.001). Robust inference is guaranteed by Driscoll–Kraay estimation (Table A19).
Table A19. Fixed-effect regression results with heteroskedasticity and autocorrelation diagnostics.
Table A19. Fixed-effect regression results with heteroskedasticity and autocorrelation diagnostics.
VariableCoefficientStd. Errortp > |t|95% Confidence Interval
govt12.8432.4215.310.000[8.092, 17.594]
edue0.4520.2451.840.065[−0.029, 0.933]
stab−3.1981.165−2.740.006[−5.485, −0.911]
rndg−4.1021.713−2.390.017[−7.464, −0.740]
lawr−4.3542.011−2.160.031[−8.302, −0.407]
hosp3.8000.6126.210.000[2.598, 5.002]
scie0.0004330.00002616.870.000[0.000383, 0.000484]
_cons14.8016.1182.420.016[2.793, 26.808]
Model InfoValueF-statistic (7, 873)53.46F test (all ui = 0), F(101, 873)78.08
Number of observations982Prob > F0.0000Prob > F0.0000
Number of groups102corr(ui, Xb)0.0035TestValue
Observations per group (min/avg/max)1/9.6/19StatisticValueChi2 (df = 102)1,930,609.76
R-squared (within)0.3000sigma_u (variance due to group effects)77.265Prob > Chi20.0000
R-squared (between)0.2641sigma_e (variance due to idiosyncratic error)10.058DecisionReject H0: Groupwise heteroskedasticity is present
R-squared (overall)0.2527rho (variance due to ui)0.983
A fixed-effect panel regression with Driscoll–Kraay errors accommodates heteroskedasticity, serial correlation, and cross-sectional dependence, which are typical issues in macro-panel data. The data include 982 observations for 102 units. The maximum lag length is limited to two, ensuring short-run persistence. The explanatory variables are jointly significant (F = 59.38; p < 0.001). The within R2 is 0.30, accounting for a significant portion of the variance in building-sector emissions (BCE). Government expenditure has a large positive coefficient (12.84; p < 0.001). Increasing expenditure is linked to increased emissions, possibly because of infrastructure development and energy use. The coefficient for hospital infrastructure is also high and positive (3.80; p < 0.001), reflecting a carbon-intensive approach to healthcare delivery. The coefficient for law-right protections reduces emissions (−4.35; p = 0.032). Robust institutions align with green approaches. Scientific activity is associated with increased emissions (p = 0.004). Research intensity may trigger increased energy demand for industrial and development-related activities. Education and R&D expenditure are weakly significant, possibly reflecting low variation or multicollinearity. Driscoll–Kraay estimation confirms robustness. Robust inference survives elaborate temporal and spatial error structures.

Appendix F. Hyperparameter Optimization of KNN Regression Algorithm

Feature Selection. The methodology forecasts building-related CO2 releases (BCE) in environmental, social, and governance (ESG) regions. The specification reflects the econometric model. The same set of predictors is uniformly applied to ESG regions. The same function is implemented in three environments: econometric analysis, cluster, and machine learning regression. Methodology equivalence permits comparability. Variance disparity is a result of methodology, and not predictor variation. The application of a fixed set of predictors permits interpretability across ESG regions. The conventional versus data-driven choice becomes apparent in the BCE explanation. Methodology equivalence between econometric inference and machine learning applications emerges. Robustness in feature selection confirms findings across methodology. See Figure A16.
Figure A16. Multi-method analytical framework for ESG-driven building sector decarbonization. Note: It shows how environmental, social, and governance (ESG) indicators are decomposed and analyzed through three methods—econometric models, clustering analysis, and machine learning regression—to study building-related CO2 emissions (BCE).
Figure A16. Multi-method analytical framework for ESG-driven building sector decarbonization. Note: It shows how environmental, social, and governance (ESG) indicators are decomposed and analyzed through three methods—econometric models, clustering analysis, and machine learning regression—to study building-related CO2 emissions (BCE).
Buildings 15 03601 g0a16
Choice of k. k-Nearest Neighbors (k-NN) regression parameter k was determined on a per-dimension basis for ESG features through automated hyperparameter tuning. This was carried out to make k’s choice data-dependent, minimize the MSE on a validation hold-out set, and achieve a trade-off between complexity and generalization. For each dimension, we partitioned the dataset into training, validation, and testing sets, and we tested k-values ranging between 1 and 10. We made our selection based on validation performance and then retested it on a separate, independent testing set.
Choice of k for EEnvironment. Parameter choices for the k-Nearest Neighbor (k-NN) regression model were made through an automated hyperparameter tuning process tailored to the Environment (E) component of the ESG model. It aimed to predict building-related CO2 emissions (BCE) based on a set of environmental predictor features like renewable energy use, particulate levels, availability of clean fuels, and energy use per capita. Rather than choosing a fixed k a priori, we allowed the model to search for the optimum number of neighbors and chose k to be such that a minimum mean squared error (MSE) was attained on a held-out validation set consisting of 20% of the data. Training-validation-test split was performed in keeping with an 80-20-20 split. Optimization probed k values between 1 and 10 and concluded that k = 2 resulted in minimized validation error. The adoption of the Euclidean distance metric was coupled with a rectangular weighting function. To ensure fair computation of distance, features were standardized preparatory to model building. Upon application to the testing collection, the optimized k-NN model yielded an MSE of 2264.407, an RMSE of 47.586, and an R-squared (R2) of 0.577, demonstrating satisfactory predictive power in model-building BCE using environmental proxies. Though k-NN is non-parametric and does not involve parameter estimation in typical use, tuning k helped achieve a balance between model complexity and ability to generalize. Optimization was conducted in a validation-based manner, in line with the mainstream literature in machine learning. These methodological details, unique to the Environment component of the ESG framework, have now been transparently spelled out in this revised version of the work, with a focus on improving feature selection transparency where model specification and predictive power are particularly important (Table A20).
Table A20. Hyperparameter optimization results for k-NN regression on environmental (E) indicators of ESG.
Table A20. Hyperparameter optimization results for k-NN regression on environmental (E) indicators of ESG.
ItemValue/Description
Model typek-Nearest Neighbors (k-NN) regression
Target variableBuilding-related CO2 emissions (BCE)
Predictor variables (features)PM2.5, RENC, ENUC, CFTC, ELEC (environmental indicators)
Feature scalingApplied to all variables (standardization-enabled)
Distance metricEuclidean
Weighting functionRectangular (equal weights to neighbors)
Optimization method for kAutomated hyperparameter tuning based on validation MSE
Range of k tested1 to 10
Optimal k selected2
Data splitTraining: 633 (64%), Validation: 159 (16%), Test: 198 (20%)
Validation MSE (used for tuning)6.353
Test MSE2.264
Test RMSE47.586
Test R20.577
Mean dropout loss (feature importance)PM2.5: 83.622 RENC: 71.355 ENUC: 68.092 CFTC: 61.716 ELEC: 48.405
Software settings usedSplit: 20% test/20% validation, Scale features: Yes, Set seed: 1
Note: Summary of k-NN hyperparameter tuning for the environmental (E) dimension of ESG, showing the choice of k = 2 as optimal for predicting building-related CO2 emissions (BCE).
Choice of k for SSocial. For the Social (S) dimension within this ESG model, the choice of parameter k in k-Nearest Neighbors (k-NN) regression was made through an automatic hyperparameter tuning procedure, aiming to minimize the mean squared error (MSE) on a validation set withheld during tuning. This process was utilized in particular to ensure the selection of k was data-driven rather than a matter of arbitrary choice. Experimentation was performed on k levels between 1 and 10, and k = 10 was selected, which achieved the minimum validation error. Optimization was performed within a data partitioning procedure tightly managed: we partitioned the dataset into training (64%), validation (16%), and testing (20%) partitions, ensuring no leakage between these phases. The validation MSE returned during tuning was 5830.302, while the terminal test MSE was 7185.254. While higher than validation MSE, the test MSE does not suggest any level of overfitting, considering that this is clearly accompanied by a moderate level of test R2 (0.253). All features were standardized prior to entering the modeling process to facilitate a comparable state between these within-distance computations. We utilized a Euclidean distance metric accompanied by rectangular (uniform) weighting. These procedural steps follow common best practices within machine learning regression. They provide a transparent and reproducible route to parameter tuning within non-parametric model inference. Therefore, k-selection within the social component is soundly motivated by empirical evidence and fully integrated into a validation-based selection protocol between competing models. That serves to increase the validity of k-NN findings and help validate the comparability across all three ESG dimensions (Table A21).
Table A21. Choice of k for k-NN regression: social (S) dimension.
Table A21. Choice of k for k-NN regression: social (S) dimension.
ItemValue/Description
ESG ComponentS—social
Target VariableBuilding-related CO2 emissions (BCE)
Model Typek-Nearest Neighbors (k-NN) regression
Feature SetSocial indicators (e.g., income, labor force, education, representation)
Feature ScalingApplied (standardization-enabled)
Distance MetricEuclidean
Weighting FunctionRectangular (uniform weights)
k Values Tested1 to 10
Optimization MethodAutomated hyperparameter tuning on validation MSE
Optimal k Selected10
Training Set Size797 observations (64%)
Validation Set Size200 observations (16%)
Test Set Size249 observations (20%)
Validation MSE5.830.302
Test MSE7.185.254
Test RMSE84.684
Test R20.253
Overfitting EvidenceNo significant overfitting observed
Conclusionk = 10 chosen based on best validation performance
Note: Hyperparameter tuning results for the social (S) dimension of the ESG framework, selecting k = 10 as the optimal value for predicting building-related CO2 emissions (BCE).
Choice of k for GGovernance. For this ESG model’s regression method within the Governance (G) dimension, the k-Nearest Neighbor (k-NN) regression choice of k parameter was made through an automated hyperparameter tuning process designed to forestall overfitting and maximize generalization. All observations were entered into three partitions: 628 measurements went into training (approximately 64%), 158 went into validation (16%), and 196 went into testing (20%). Rather than choosing a number of neighbors ahead of time to set a number, model exploration was performed over k within the range 1 to 10. Validation-based MSE was minimized to locate optimal k. That was k = 2. With this k, model validation MSE was 1067.181, while test MSE was a drastically minimized level of 129.452. Root mean squared error was 11.378 on unseen data in the test collection, and a corresponding R2 was witnessed at a level of 0.975, identifying very high predictive power. It appears that the model’s generalization is incredibly good across unseen data, and choosing k = 2 does a very good trade-off between model complexity versus variance. Other model specifications were the use of a rectangular weighting function and the use of Euclidean distance. All features were normalized prior to training such that a distance metric was not skewed due to range of variables. Validation-based tuning coupled with impressive out-of-sample behavior provides powerful justification to a chosen level of k, substantiating that k is a good choice within a Governance-driven regression model setup within this ESG model (Table A22).
Table A22. Hyperparameter optimization results for k-NN regression: governance (G) dimension.
Table A22. Hyperparameter optimization results for k-NN regression: governance (G) dimension.
ItemValue/Description
ESG DimensionGovernance (G)
Model Typek-Nearest Neighbors (k-NN) regression
Training Set Size628 observations (≈64%)
Validation Set Size158 observations (≈16%)
Test Set Size196 observations (≈20%)
Hyperparameter TunedNumber of Nearest Neighbors (k)
Range of k Tested1 to 10
Optimal k Selected2 (minimizes Validation MSE)
Validation MSE1.067
Test MSE129.452
Test RMSE11.378
Test R20.975
Distance MetricEuclidean
Weighting SchemeRectangular (uniform weights)
Feature Scaling AppliedYes (all features standardized)
Conclusion on k SelectionOptimized via validation MSE, k = 2 balances accuracy and complexity
Note: Validation-based tuning of k in the governance (G) dimension identify k = 2 as optimal, yielding high predictive accuracy (R2 = 0.975) for building-related CO2 emissions (BCE).
Multicollinearity. When modeling the environmental (E) dimension of the ESG model, the issue of multicollinearity presents special importance even while operating non-parametrically and making no assumptions about regression coefficients. For ESG research, environmental covariates such as PM2.5, renewable energy use, per capita energy consumption, use of clean fuels, and electricity coverage tend to be correlated, as they serve as indicators of structural complexity in energy systems and environmental strength. If left unaddressed, multicollinearity can obscure the unique effect of different indicators, limiting interpretation and further dampening the strength of sustainability analysis. For instance, renewable energy consumption and electricity access can be highly correlated in higher-income countries, while air pollution and per capita consumption can compensate for each other across a fast-industrializing state. Focusing on the peril of latent redundancy between predictor variables is thus instrumental in ensuring that ESG-based models identify actual distinct drivers in emissions rather than a construct of intersecting variables. For this research work, precautions such as feature scaling, permutation importance-based dropout selection, and cautious selection of predictor features were employed to mitigate the risks of collinearity. Through this process, this research provides a higher level of validity to both a model-based projection of CO2 emissions associated with buildings and makes a broader case for the appropriateness of ESG indicators in producing relevantly interpretable results to inform sustainable policy development.
E—Environment. For the environment (E) component within the ESG model, we understand that, due to the non-parametric nature of the k-Nearest Neighbor (k-NN) algorithm, it does not require formal testing against multicollinearity, as it does not imply the estimation of regression coefficients. Nevertheless, we assumed a possible multicollinearity effect on predictor features, employing a two-pronged approach that combined preprocessing operations and metrics-based informability. First, all environmental input features, including PM2.5, renewable energy consumption (RENC), energy use per capita (ENUC), use of cleaning fuels (CFTC), and electricity supply use (ELEC), were standardized prior to training. Feature scaling is necessary in k-NN to prevent any supplied variable from dominating the distance measurement due to differences. Additionally, to evaluate the relative level of distinctness and informativeness of every feature, we employed a dropout-based feature importance quantification. The resulting dropout losses across a wide range of divergent predictors indicated that the model was not relying on redundant or collinear features while producing estimates. Although we did not apply formal testing for multicollinearity, such as the Variance Inflation Factor (VIF), applicable to linear models, a combination of feature scaling, normalized feature inputs, and permutation-based feature evaluation provided adequate protection against distortions due to highly correlated features. Such methodological precautions were taken, especially within the environment model, in a bid to ensure that estimations of building-related CO2 emissions were not adversely affected by hidden redundancies between input features (Table A23).
Table A23. Feature scaling and multicollinearity control in k-NN regression: environmental (E) dimension.
Table A23. Feature scaling and multicollinearity control in k-NN regression: environmental (E) dimension.
AspectValue/Description
Model typek-Nearest Neighbors (k-NN) regression
Target variable (Y)Building-related CO2 Emissions (BCE)
Predictors (X variables)PM2.5, RENC (renewable energy consumption), ENUC (energy use per Capita), CFTC (clean fuel access), ELEC (electricity access)
Number of predictors5 environmental variables
Feature scaling appliedYes—All variables standardized before training
Reason for scalingTo prevent any feature from dominating distance calculations due to scale differences
Multicollinearity test (formal)Not performed—VIF not applicable to non-parametric models
Alternative control methods usedFeature scaling + feature importance (dropout loss) + manual selection
Feature importance methodDropout-based permutation importance (50 permutations)
Dropout loss valuesPM2.5: 83.62 RENC: 71.36 ENUC: 68.09 CFTC: 61.72 ELEC: 48.41
Interpretation of dropout spreadWide variation in loss suggests that predictors contribute uniquely → no dominance or redundancy
Conclusion on multicollinearityNo evidence of harmful multicollinearity affecting predictions in the E-component model
Note: Evaluation of predictor distinctness for the environmental (E) dimension of the ESG framework. Standardization and dropout-based feature importance confirm no harmful multicollinearity among environmental indicators (PM2.5, RENC, ENUC, CFTC, ELEC) in predicting building-related CO2 emissions (BCE).
S–Social. Regarding the Social (S) element within this application of the ESG model, multicollinearity was addressed through a combination of model-specific issues and non-parametric learning and preprocessing methods. Since k-Nearest Neighbors (k-NN) regression does not yield coefficients, it is not susceptible to the same statistical aberrations caused by multicollinearity infecting linear regression-based applications. Therefore, model-based testing, such as the Variance Inflation Factor (VIF), is neither applicable nor necessary in this case. Precautions were still taken to guard against redundant or highly correlated predictor features that could obfuscate model performance. First, features were normalized beforehand during training, ensuring that no variable dominated distance metrics due to scale differences. More substantively, feature importance was evaluated through dropout loss analysis, whereby the predictive influence contributed by each variable was quantitatively estimated across 50 permutations. Results reflected apparent and clear-cut variability in dropout loss between predictor features (e.g., LABF = 84.697, WPAR = 81.206, GPIE = 66.064), indicating that every variable contributed disparate information to this model. Such variability serves to corroborate evidence that model performance was not overly reliant upon some singular or collinear feature subset. The table of additive explanation further demonstrates that per-individual predictive contributions arise due to a variegated combination of inputs. Overall, these elements provide both quantitative and methodological evidence, indicating that multicollinearity did not diminish the predictive integrity of this k-NN application within the Social component. Although technically not tested due to a non-significant relationship between model applications across these components, this combination of feature scaling and interpretation across permutations provides a powerful safeguard against multicollinearity within this non-parametric application (Table A24).
Table A24. Multicollinearity control and feature importance in k-NN regression: social (S) dimension.
Table A24. Multicollinearity control and feature importance in k-NN regression: social (S) dimension.
ItemValue/Description
ESG ComponentS—Social
Model Typek-Nearest Neighbors (k-NN) Regression
Target VariableBuilding-related CO2 Emissions (BCE)
Predictor VariablesLABF, WPAR, FOOD, INC20, GPIE
Number of Predictors5 Social Indicators
Distance MetricEuclidean
Weighting SchemeRectangular (equal weights)
Feature Scaling AppliedYes (all variables standardized)
Multicollinearity Diagnostic UsedNot Applicable (k-NN is non-parametric; VIF not suitable)
Alternative SafeguardsStandardization + Permutation-based Feature Importance (Dropout Loss)
Feature Importance (Dropout Loss)LABF: 84.697 WPAR: 81.206 FOOD: 77.663 INC20: 73.991 GPIE: 66.064
Dropout Loss MethodBased on 50 Permutations Using RMSE Impact
InterpretationHigh Variability In Dropout Loss Indicates Non-Redundant Predictors
ConclusionNo Harmful Multicollinearity Detected; Each Feature Contributes Uniquely
Note: Although formal multicollinearity diagnostics such as VIFs are not applicable to non-parametric models like k-NN, safeguards are implemented through standardization and dropout-based feature importance. Variability in dropout loss across the five social predictors confirms their unique contributions, ensuring that no single variable dominates or introduces redundancy.
G—Governance. Regarding Governance (G) within the ESG model, the multicollinearity problem was addressed by considering the features of the k-Nearest Neighbors (k-NN) regression algorithm. While linear regression algorithms estimate coefficients, k-NN does not, and it is consequently not susceptible to the parametric model instability associated with multicollinearity. Therefore, formal diagnostic statistics such as the Variance Inflation Factor (VIF) could not be employed. Nonetheless, precautions were taken to prevent redundancy between predictors, which disrupted the distance-based learning process. All predictors related to governance features, such as SCIE, HOSP, EDUE, RNDG, GOVT, STAB, and LAWR, were standardized before training. By doing so, no variable could overpower the distance function due to differing scales. Feature contribution was then observed via dropout-based importance analysis across 50 permutations. Dropout-based importance analysis across permutations numbering 50. Results indicated a wide variance, such that dropout loss ranged from SCIE at 165.690 to LAWR at 6.726. Such diffusivity reveals that the predictors contribute variably and not simply as redundant copies of each other. Furthermore, additive explanations relating to test set cases confirm that model predictions map to the effects of a variegated arrangement of features, rather than a redundant set. Overall, the evidence suggests that multicollinearity did not compromise the predictive validity of the governance model. Although no formal analysis in terms of VIF was required here due to the non-estimation of coefficients, the use of standardization, careful selection between features to be entered, and permutation-based interpretation served to keep the model resistant against distortions likely to be occasioned by highly correlated features (Table A25).
Table A25. Multicollinearity safeguards and feature importance in k-NN regression: governance (G) dimension.
Table A25. Multicollinearity safeguards and feature importance in k-NN regression: governance (G) dimension.
ItemValue/Description
ESG DimensionGovernance (G)
Model Typek-Nearest Neighbors (k-NN) regression
Target VariableBuilding-related CO2 Emissions (BCE)
Predictor VariablesSCIE, HOSP, EDUE, RNDG, GOVT, STAB, LAWR
Number of Predictors7 governance indicators
Feature ScalingApplied (all variables standardized before modeling)
Distance MetricEuclidean
Weighting FunctionRectangular (equal weights for neighbors)
Multicollinearity Test (VIF)Not applied (not relevant to non-parametric k-NN)
Alternative AssessmentPermutation-based dropout loss (50 permutations)
Feature Importance (Dropout Loss)SCIE: 165.690 HOSP: 30.104 EDUE: 28.272 RNDG: 28.038 GOVT: 20.035 STAB: 18.963 LAWR: 6.726
InterpretationWide variability in dropout loss values indicates that predictors contribute uniquely
ConclusionNo evidence of harmful multicollinearity; predictors are complementary, not redundant
Note: Standardization and dropout-based feature importance confirm that governance indicators (SCIE, HOSP, EDUE, RNDG, GOVT, STAB, LAWR) contribute uniquely within the k-NN model. No harmful multicollinearity is detected, supporting the robustness of the governance (G) dimension in predicting building-related CO2 emissions (BCE).
Overfitting. To deter the danger of overfitting in the environment (E) factor within the ESG model, we employed a rigorous model validation and data partitioning procedure. A dataset was partitioned into separate training (64%), validation (16%), and test (20%) sets such that model training and hyperparameter tuning were carried out without consideration of model performance for a concluding evaluation. Optimal k-NN model selection was achieved in selecting a k value associated with a minimum mean squared error on a validation dataset such that a model was safeguarded against a fit to a training dataset. Notably, model performance on a test dataset did not suffer but instead performed better relative to a validation dataset (test MSE = 2264.407 vs. validation MSE = 6353.665) such that model successful generalization was evidenced, and neither was overfit to a training or validation dataset. Additionally, validation in the form of a test R2 of equaling 0.577 ensures a satisfactory predictive power without evidence of model overfitting. These precautions, tailored to include a data partitioning component, validation-based hyperparameter tuning, and out-of-sample evaluation, were particularly adopted to deter a danger of overfitting within a k-NN regression model employed within a calculation of building-related CO2 emissions.
E—Environment. To manage the peril associated with having an environmental (E) element within the ESG model, we adopted a systematic model validation and data partitioning procedure. We divided the dataset into separate training (64%), validation (16%), and testing (20%) sets such that model training and hyperparameter tuning occurred without affecting the final evaluation. We trained a k-NN model and selected k such that it resulted in minimum mean squared error on validation set to avoid overfitting to training data. Incidentally, model performance on unseen testing did not degrade but rather improved compared to that on validation dataset (test MSE = 2264.407 compared to validation MSE = 6353.665), suggesting that model generalized fairly and was not overfitted to both training and validation data. Also, a corresponding test R2 of 0.577 further exemplifies good predictability without generating model overfitting concerns. These precautions, including partitioning data, hyperparameter tuning on a validation dataset rather than on training or testing datasets or even out-of-sample testing, were particularly developed to manage the danger of overfitting within the k-NN regression model employed in approximating building-related CO2 emissions (Table A26).
Table A26. Overfitting control and validation outcomes in k-NN regression: environmental (E) dimension.
Table A26. Overfitting control and validation outcomes in k-NN regression: environmental (E) dimension.
AspectValue/Description
Model typek-Nearest Neighbor (k-NN) regression
Target variableBuilding-related CO2 Emissions (BCE)
Predictor variables (features)PM2.5, RENC, ENUC, CFTC, ELEC (environmental indicators)
Training set size633 observations (64% of dataset)
Validation set size159 observations (16% of dataset)
Test set size198 observations (20% of dataset)
Hyperparameter tuning methodAutomated optimization of k based on lowest validation MSE
Optimal k selected2
Validation MSE6.353.665
Test MSE2.264.407
Test RMSE47.586
Test R20.577
Performance comparisonTest performance better than validation → no overfitting
Overfitting control techniquesTrain-validation-test split + validation-based tuning + out-of-sample test evaluation
InterpretationModel generalizes well with no signs of overfitting
Note: Train–validation–test partitioning and validation-based tuning ensured the environmental (E) k-NN model avoided overfitting. Superior test performance (MSE = 2264.407; R2 = 0.577) confirmed robust generalization in predicting building-related CO2 emissions (BCE).
S—Social. To guard against the peril of overfitting within the social (S) part of the ESG model, a formal breakdown between training, validation, and testing was made such that a stringent separation was maintained between model training, hyperparameter tuning, and final evaluation. Data was split into three sets: 64% for training (797 observations), 16% for validation (200 observations), and 20% into a test dataset (249 observations). Such a setup prevented leaks between model evaluation and optimization. A k-Nearest Neighbor (k-NN) regression model was optimized such that k was selected such that mean squared error (MSE) was minimized in validation dataset. Optimization was performed across k such that k ranged between 1 and 10, and k was selected such that validation MSE was minimized. Validation MSE was 5830.302, while ultimate test MSE was 7185.254. Although a large error rate on a test is very large, this difference was neither dominatingly large nor significant to indicate overfitting. This is further supported by a test R2 of 0.253, such that the model possesses predictability evident across unseen data. All features were normalized before model development so that equal computation across distances and equal weight were given. A Euclidean distance measure and rectangular weighting function were common approaches among k-NN family members. Test set performance overall supports the proposition that the model fairly generalizes but not too fitted within training or validation data. Validation-based tuning accompanied by independent evaluation across a test serves as a strong defense against such overfitting that predictive inferences within the Social model across ESG remain supported (Table A27).
Table A27. Overfitting control and validation outcomes in k-NN regression: social (S) dimension.
Table A27. Overfitting control and validation outcomes in k-NN regression: social (S) dimension.
ItemValue/Description
ESG DimensionSocial (S)
Model Typek-Nearest Neighbor (k-NN) regression
Training Set Size797 observations (64%)
Validation Set Size200 observations (16%)
Test Set Size249 observations (20%)
Hyperparameter TunedNumber of nearest neighbors (k)
Range of k Tested1 to 10
Optimal k Selected10
Validation MSE5.830.302
Test MSE7.185.254
Test RMSE84.766
Test R20.253
Distance MetricEuclidean
Weighting FunctionRectangular (uniform weights)
Feature Scaling AppliedYes (all features standardized)
Overfitting EvidenceNo clear signs of overfitting; test performance remains stable
Safeguard Against OverfittingIndependent validation and test splits, validation-based k selection
ConclusionThe model generalizes well and retains predictive power on unseen data
G—Governance. In the governance (G) component of the ESG model, a risk of overfitting was addressed in a cautious manner through systematic training, validation, and test splits, along with hyperparameter tuning. The data was split into 628 training observations (64%), 158 validation observations (16%), and 196 test observations (20%), allowing for the calibration of the model and evaluation of its performance to occur entirely autonomously. A k-Nearest Neighbor (k-NN) regression was obtained using automatic tuning of the number of neighbors, with testing k-values ranging from 1 to 10 and selecting the k-value that resulted in a minimal mean squared error (MSE) on the validation set. Based on this process, k = 2 was found to be an optimal setup. The validation MSE was 1067.181, but the corresponding test MSE was drastically lower at 129.452, corresponding to an associated RMSE of 11.378 and an R2 measure of 0.975 on the test part. A rock-solid but smaller test error compared to validation error does not appear, but rather it is a strong indication that the model does not seem to be overfitted to the training dataset, suggesting instead that the model has a very good ability to generalize when subjected to unseen data. Other precautions included feature scaling, ensuring a fair measurement of distance, and the use of Euclidean distance with rectangular weights, thereby adhering to the relevant k-NN regression best practices. When combined, partitioning of data, tuning based on validation evidence, and relentless out-of-sample performance provide firm evidence supporting the claim that a governance model does not overscale. Instead, it demonstrates a remarkable ability to generalize while maintaining predictive accuracy (Table A28).
Table A28. Overfitting control and validation outcomes in k-NN regression: governance (G) dimension.
Table A28. Overfitting control and validation outcomes in k-NN regression: governance (G) dimension.
ItemValue/Description
ESG DimensionGovernance (G)
Model Typek-Nearest Neighbor (k-NN) regression
Training Set Size628 observations (≈64%)
Validation Set Size158 observations (≈16%)
Test Set Size196 observations (≈20%)
Hyperparameter TunedNumber of neighbors (k)
Range of k Tested1 to 10
Optimal k Selected2
Validation MSE1.067.181
Test MSE129.452
Test RMSE11.378
Test R20.975
Distance MetricEuclidean
Weighting FunctionRectangular (uniform weights)
Feature ScalingApplied (all variables standardized before training)
Overfitting EvidenceNo signs of overfitting: test error lower than validation error, stable high R2
ConclusionModel generalizes well, predictions are robust and not overfitted
Note: Systematic train–validation–test partitioning and validation-based tuning identify k = 2 as optimal for the Governance (G) model. Superior test performance (MSE = 129.452; R2 = 0.975) confirms robust generalization and no evidence of overfitting in predicting building-related CO2 emissions (BCE).

References

  1. Adewumi, A.S.; Opoku, A.; Dangana, Z. Sustainability assessment frameworks for delivering environmental, social, and governance (ESG) targets: A case of building research establishment environmental assessment method (BREEAM) UK new construction. Corp. Soc. Responsib. Environ. Manag. 2024, 31, 3779–3791. [Google Scholar] [CrossRef]
  2. Costantiello, A.; Laureti, L.; Quarto, A.; Leogrande, A. Methane Emissions in the ESG Framework at the World Level. Methane 2025, 4, 3. [Google Scholar] [CrossRef]
  3. Al Amosh, H.; Khatib, S.F. Toward sustainable healthcare: Linking environmental governance, business innovation and carbon emission in Europe. Manag. Environ. Qual. Int. J. 2024, 35, 1461–1483. [Google Scholar] [CrossRef]
  4. Zhao, X.; Jiang, S.; Gao, J. Spatial Impacts of Green Finance Reform Pilot Zones on Renewable Energy Technology Innovation: Pathways for Accelerating Low-Altitude Economic Development. SSRN 2025, 5122716. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5122716 (accessed on 18 September 2025).
  5. Salarin, T.; Staiano, J.; Bazzana, F.; Paterlini, S. Examining Intraday Stock Market Response to ESG News Classified by AI: An Event Study. SSRN 2024, 5040946. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5040946 (accessed on 10 March 2025).
  6. Valaskova, K.; Gajdosikova, D.; Lazaroiu, G. Has the COVID-19 pandemic affected the corporate financial performance? A case study of Slovak enterprises. Equilib. Q. J. Econ. Econ. Policy 2023, 18, 1133–1178. [Google Scholar] [CrossRef]
  7. Battisti, F. SDGs and ESG Criteria in Housing: Defining Local Evaluation Criteria and Indicators for Verifying Project Sustainability Using Florence Metropolitan Area as a Case Study. Sustainability 2023, 15, 9372. [Google Scholar] [CrossRef]
  8. Banerjee, A.; Das, P.; Fuerst, F. Are green and healthy building labels counterproductive in emerging markets? An examination of office rental contracts in India. J. Clean. Prod. 2024, 455, 141838. [Google Scholar] [CrossRef]
  9. Li, R.Y.M.; Li, B.; Zhu, X.; Zhao, J.; Pu, R.; Song, L. Modularity clustering of economic development and ESG attributes in prefabricated building research. Front. Environ. Sci. 2022, 10, 977887. [Google Scholar] [CrossRef]
  10. Valero, A. The Impact of ESG Certifications on Class A Office Buildings in Madrid: A Multi-Criteria Decision Analysis. Standards 2025, 5, 14. [Google Scholar] [CrossRef]
  11. Voronkov, A. Mechanisms for Attracting Investment into Green Building Projects. Am. J. Manag. Econ. Innov. 2025, 7, 83–89. [Google Scholar] [CrossRef]
  12. Vladimirova, I.; Dmitriev, A.; Kallaur, G.; Oleynikova, N.; Tsygankova, A. Trends in the application of “green” standards on the life cycle of capital housing construction objects. In Proceedings of the E3S Web of Conferences, Seoul, Republic of Korea, 17–19 April 2023; p. 02018. [Google Scholar]
  13. Chungath, J. Modeling Architecture to manage Energy, Social, and Governance (ESG) Data of Commercial Buildings. Master’s Thesis, Stockholm University, Stockholm, Sweden, 2023. [Google Scholar]
  14. Cruz, P.L.; Ávila, L.V.; Dinis, M.A.P.; Baggio, D.K. Environmental, social and governance (ESG) and innovation in the construction sector: Systematic Literature Review. Rev. Adm. UFSM 2023, 16, e1. [Google Scholar] [CrossRef]
  15. Sulaiman, N.N.; Malaysia, U.T.H.O.; Omar, A.J.; Pakir, F.; Mohamad, M. Evaluating environmental, social and governance (ESG) practice among malaysian public listed construction companies using FTSE Russell rating model. Int. J. Sustain. Constr. Eng. Technol. 2024, 15, 25–40. [Google Scholar] [CrossRef]
  16. Yap, C.K.; Leow, C.S.; Goh, B. Sustainable construction materials under ESG: A literature review and synthesis. MOJ Biol. Med. 2024, 9, 1–6. [Google Scholar] [CrossRef]
  17. Wang, J.; Xue, F. Emerging trends of ESG in the construction sector: A promising pathway to sustainable and responsible development. In International Symposium on Advancement of Construction Management and Real Estate; Springer: Singapore, 2023; pp. 2033–2044. [Google Scholar]
  18. Karerat, S.; Houghton, A.; Dickinson, G. Leveraging healthy buildings as a tool to apply a people-centric approach to ESG. Corp. Real Estate J. 2023, 13, 32. [Google Scholar] [CrossRef]
  19. Aagedal, M.T. Who’s Paying for Green?—Property Owners’ and Tenants’ Perspectives on ESG Investments. Master’s Thesis, NTNU, Trondheim, Norway, 2024. [Google Scholar]
  20. Ifediora, C.O.; Igwenagu, C.L. Identifying Environmental, Social and Governance (ESG) Factors as Key Factors in Residential and Commercial Properties/Real Estate Investment Decision. Int. J. Energy Environ. Res. 2024, 12, 1–19. [Google Scholar] [CrossRef]
  21. Masia, T.; Kajimo-Shakantu, K.; Opawole, A. A case study on the implementation of green building construction in Gauteng province, South Africa. Manag. Environ. Qual. Int. J. 2020, 31, 602–623. [Google Scholar] [CrossRef]
  22. Nuradhi, M.; Wibowo, H.; Paulus; Fatimah, R.A. Beyond Green Building, How Giant Architecture Firm Contribute to Sustainable Development Goals. In International Conference on Sustainability in Creative Industries; Springer: Cham, Switzerland, 2023; pp. 249–260. [Google Scholar]
  23. Robinson, S.; McIntosh, M.G. A literature review of environmental, social, and governance (ESG) in commercial real estate. J. Real Estate Lit. 2022, 30, 54–67. [Google Scholar] [CrossRef]
  24. Ombati, P.M. Evaluating Awareness Levels and Barriers to Green Building Implementation. Ph.D. Dissertation, University of Nairobi, Nairobi, Kenya, 2023. [Google Scholar]
  25. Kempeneer, S.; Peeters, M.; Compernolle, T. Bringing the user back in the building: An analysis of ESG in real estate and a behavioral framework to guide future research. Sustainability 2021, 13, 3239. [Google Scholar] [CrossRef]
  26. Roukoz, K.; Ersenkal, D. Environmental, Social and Governance-related Challenges in the Construction Industry (CIC 2023). In Proceedings of the 2nd International Conference on Civil Infrastructure and Construction, Doha, Qatar, 5–8 February 2023; pp. 1238–1252. [Google Scholar]
  27. Richter, T.J.; Soliva, E.; Haase, M.; Wrase, I. Corporate real estate and green building: Prevalence, transparency and drivers. J. Corp. Real Estate 2022, 24, 241–255. [Google Scholar] [CrossRef]
  28. Lam, O.T.N. Exploring the Mutual Reinforcement of ESG and Green Building Management for Long-Term Sustainability: A Case Study of Hong Kong; The University of Hong Kong: Pokfulam, Hong Kong, 2024. [Google Scholar]
  29. Yaman, K.M.; Ghadas, Z.A.A. The Role of Contract in Advancing ESG in Construction: A Proposed Framework of ‘Green’ Standard Form of Contract for Use in Green Building Projects in Malaysia. In Artificial Intelligence and Transforming Digital Marketing; Springer Nature: Cham, Switzerland, 2023; pp. 269–279. [Google Scholar]
  30. Wang, X.; Wen, H. Research on Green Transformation of Real Estate Enterprises under ESG Concept. In Proceedings of the International Conference on Construction and Real Estate Management, Xi’an, China, 23–24 September 2023; pp. 500–505. [Google Scholar]
  31. Dugo, I. Finance Embraces Sustainability: An Empirical Analysis of the Financial Performances of ETFs Investing in Sustainable Real Estate and Green Building. Master’s Thesis, Università Ca’ Foscari, Venice, Italy, 2021. [Google Scholar]
  32. Liu, N.; Zhao, Y.; Yan, B.; Hutchison, N. Tenants’ ESG: Influence on preferences and rent premiums for green buildings in commercial real estate. J. Prop. Res. 2025, 1–35. [Google Scholar] [CrossRef]
  33. Kudinova, Y.; Shushunova, N. Modern certification systems for green construction. AIP Conf. Proc. 2023, 2791, 050034. [Google Scholar] [CrossRef]
  34. Sjöberg, K.; Östling, M. Frameworks for Sustainable Construction: Comparing the EU Taxonomy, ESG Ratings and BREEAM-SE. Master’s Thesis, School of Industrial Engineering and Management Stockholm, KTH Royal Institute of Technology, Stockholm, Sweden, 2024. Available online: https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1874971&dswid=2548 (accessed on 18 September 2025).
  35. Meshcheryakova, T. Development of “Green” Building Standards in the Context of Global ESG-transformation. In Proceedings of the International Scientific and Practical Conference “Young Engineers of the Fuel and Energy Complex: Developing the Energy Agenda of the Future” (EAF 2021), Saint Petersburg, Russia, 10–11 December 2021; Atlantis Press: Paris, France, 2022; pp. 188–194. [Google Scholar]
  36. Miklinska, J. Green buildings for greening supply chains—Selected Organisational and legal solutions on the polish commercial warehouse space market. Eur. Res. Stud. J. 2024, XXVII, 152–168. [Google Scholar] [CrossRef]
  37. Obushnyi, S.; Novikov, A. Influence of it solutions on ESG real estate development and investment attractiveness of the urban projects. Eur. Sci. J. Econ. Financ. Innov. 2024, 1, 96–106. [Google Scholar] [CrossRef]
  38. Gu, Y.; Katz, S.; Wang, X.; Vasarhelyi, M.; Dai, J. Government ESG reporting in smart cities. Int. J. Account. Inf. Syst. 2024, 54, 100701. [Google Scholar] [CrossRef]
  39. Morri, G.; Yang, F.; Colantoni, F. Green investments, green returns: Exploring the link between ESG factors and financial performance in real estate. J. Prop. Invest. Financ. 2024, 42, 435–452. [Google Scholar] [CrossRef]
  40. Sahu, A.; Pahi, D.; Dwibedi, P.; Mishra, A.P.; Mishra, B. Examining the role of ESG disclosure and firm characteristics in promoting global green building adoption: A panel probit approach. Socio-Ecol. Pract. Res. 2025, 7, 77–91. [Google Scholar] [CrossRef]
  41. Debrah, C.; Chan, A.P.C.; Darko, A. Green finance gap in green buildings: A scoping review and future research needs. Build. Environ. 2022, 207, 108443. [Google Scholar] [CrossRef]
  42. Fedchenko, E.; Gusarova, L.; Uskenbayeva, A. Green building in the ESG agenda for sustainable development of Russia: Conditions and trends. Constr. Mater. Prod. 2024, 7, 9. [Google Scholar] [CrossRef]
  43. Kostrikin, P.; Andreeva, A. ESG-criteria for low-rise housing construction projects. AIP Conf. Proc. 2023, 2560, 050003. [Google Scholar]
  44. Oloyede, B. ESG-LED Sustainable Finance: A Blueprint for Ending the Menace of Frequent Building Col-lapse in Nigeria. SSRN 2023, 5043885. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5043885 (accessed on 18 September 2025).
  45. Zheng, J.; Khurram, M.U.; Chen, L. Can green innovation affect ESG ratings and financial performance? Evidence from Chinese GEM listed companies. Sustainability 2022, 14, 8677. [Google Scholar] [CrossRef]
  46. Costantiello, A.; Leogrande, A. The determinants of CO2 emissions in the context of ESG models at world level. Acad. Account. Financ. Stud. J. 2023, 28, 1–15. [Google Scholar] [CrossRef]
  47. Jiao, J.; Shuai, Y.; Li, J. Identifying ESG types of Chinese solid waste disposal companies based on machine learning methods. J. Environ. Manag. 2024, 360, 121235. [Google Scholar] [CrossRef]
  48. Wang, W.; Li, D.; Zhou, S.; Wang, Y.; Yu, L. Exploring the key influencing factors of low-carbon innovation from urban characteristics in China using interpretable machine learning. Environ. Impact Assess. Rev. 2024, 107, 107573. [Google Scholar] [CrossRef]
  49. Ulussever, T.; Depren, S.K.; Kartal, M.T.; Depren, Ö. Estimation performance comparison of machine learning approaches and time series econometric models: Evidence from the effect of sector-based energy consumption on CO2 emissions in the USA. Environ. Sci. Pollut. Res. 2023, 30, 52576–52592. [Google Scholar] [CrossRef] [PubMed]
  50. Giannelos, S.; Bellizio, F.; Strbac, G.; Zhang, T. Machine learning approaches for predictions of CO2 emissions in the building sector. Electr. Power Syst. Res. 2024, 235, 110735. [Google Scholar] [CrossRef]
  51. Yang, Q.; Wang, J.; Liu, X.; Liu, Y. Complexity influence of societal development comprehensive indicators on building carbon emission: Empirical evidence from China. Environ. Sci. Pollut. Res. 2023, 30, 117179–117200. [Google Scholar] [CrossRef] [PubMed]
  52. Cheng, K.; Peng, B.; Arif, M.; Zhang, Y.; Shang, L.; Zhang, Z. Strategic Deployment of CCUS in China: Aiming for Carbon Neutrality in Key Industries. In Proceedings of the SPE Gas & Oil Technology Showcase and Conference, Dubai, United Arab Emirates, 13–15 March 2023. [Google Scholar]
  53. Le, T.T.; Priya, J.C.; Le, H.C.; Le, N.V.L.; Duong, M.T.; Cao, D.N. Harnessing artificial intelligence for data-driven energy predictive analytics: A systematic survey towards enhancing sustainability. Int. J. Renew. Energy Dev. 2024, 13, 270–293. [Google Scholar] [CrossRef]
  54. Mahmood, S.; Sun, H.; Alhussan, A.A.; Iqbal, A.; El-Kenawy, E.-S.M. Active learning-based machine learning approach for enhancing environmental sustainability in green building energy consumption. Sci. Rep. 2024, 14, 19894. [Google Scholar] [CrossRef]
  55. Zhang, Q.; Iqbal, S.; Shahzad, F. Role of environmental, social, and governance (ESG) investment and natural capital stocks in achieving net-zero carbon emission. J. Clean. Prod. 2024, 478, 143919. [Google Scholar] [CrossRef]
  56. Zhang, H.; Li, R.; Cai, X.; Zheng, C.; Liu, L.; Liu, M.; Zhang, Q.; Lin, H.; Chen, L.; Wang, X. Do electricity flows hamper regional economic–environmental equity? Appl. Energy 2022, 326, 120001. [Google Scholar] [CrossRef]
  57. Xianghan, Z.; Zhengwei, L. Research on the Impact of Corporate Green Transformation on Organizational Performance under ESG Strategy Orientation. Risk Financ. Manag. 2025, 6, 24. [Google Scholar] [CrossRef]
  58. Matar, J.; El Khoury, H.; Charr, J.C.; Guyeux, C.; Chrétien, S. Biological sequence clustering: Novel approaches and a comparative study. SN Comput. Sci. 2025, 6, 369. [Google Scholar] [CrossRef]
  59. Rad, D.; Cuc, L.D.; Lile, R.; Balas, V.E.; Barna, C.; Pantea, M.F.; Bâtcă-Dumitru, G.C.; Szentesi, S.G.; Rad, G. A cognitive systems engineering approach using unsupervised fuzzy c-means technique, exploratory factor analysis and network analysis—A preliminary statistical investigation of the bean counter profiling scale robustness. Int. J. Environ. Res. Public Health 2022, 19, 12821. [Google Scholar] [CrossRef] [PubMed]
  60. Mohamed, A.M.O.; Paleologos, E.K.; Mohamed, D.; Fayad, A.; Al Nahyan, M.T. Critical Minerals Mining: A Path Toward Sustainable Resource Extraction and Aquatic Conservation. 2025. Available online: https://www.preprints.org/manuscript/202501.2379/v1 (accessed on 18 September 2025).
  61. Barua, S.; Adeleye, B.N.; Akam, D.; Ogunrinola, I.; Shafiq, M.M. Modeling mortality rates and environmental degradation in Asia and the Pacific: Does income group matter? Environ. Sci. Pollut. Res. 2022, 29, 30548–30567. [Google Scholar] [CrossRef]
  62. Xu, X.; Liu, L. Effects of climate policy on urban carbon emissions—New insights from China’s low-carbon city pilot policy. Appl. Ecol. Environ. Res. 2025, 23, 3637–3661. [Google Scholar] [CrossRef]
  63. Ciski, M.; Rząsa, K. The Environmental Dimension of Sustainable Development in Relation to the Transition from Brown to Green Energy—A Case Study of Poland from 2005 to 2023. Energies 2025, 18, 2993. [Google Scholar] [CrossRef]
  64. Montero, J.-M.; Fernández-Avilés, G.; Laureti, T. A Local Spatial STIRPAT Model for Outdoor NOx Concentrations in the Community of Madrid, Spain. Mathematics 2021, 9, 677. [Google Scholar] [CrossRef]
  65. Alali, Y.; Harrou, F.; Sun, Y. Unlocking the potential of wastewater treatment: Machine learning based energy consumption prediction. Water 2023, 15, 2349. [Google Scholar] [CrossRef]
  66. Hajjar, M.K.; Ulku, I. Forecasting of Turkey’s Total Electricity Consumption in Sectoral Bases Using Machine Learning Algorithms. In Proceedings of the 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Ankara, Turkey, 9–11 June 2022; pp. 1–7. [Google Scholar]
  67. Sarro, F.; Petrozziello, A.; Harman, M. Multi-objective software effort estimation. In Proceedings of the ICSE ‘16: 38th International Conference on Software Engineering, Austin, TX, USA, 14–22 May 2016; pp. 619–630. [Google Scholar]
  68. Cakir, M.A.; Unlu, R.; Cakir, S.C. Future of Clean Cooking Energy Access in Emerging Economies by 2030. In Operations Research Forum; Springer International Publishing: Berlin/Heidelberg, Germany, 2025; Volume 6, pp. 1–30. [Google Scholar]
  69. Sanchez-Olmedo, D.; Ortiz-Yepez, P.; Faytong-Haro, M. Bridging the Gap: The Impact of Gender Equality on CO2 Emissions Across Countries. World 2025, 6, 26. [Google Scholar] [CrossRef]
  70. Deininger, F.; Woodhouse, A.; Kuriakose, A.T.; Gren, A.; Liaqat, S. Placing Gender Equality at the Center of Climate Action; World Bank: Washington, DC, USA, 2023. [Google Scholar]
  71. Al Mamun, M.; Baghdadi, G.; Hossain, M.Z.; Rokonuzzaman, M. Towards Net-Zero: Legislative Gender Diversity and Cleaner Energy Transition. SSRN 2025, 5187015. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5187015 (accessed on 18 September 2025).
  72. Achuo, E.D.; Nchofoung, T.N.; Zanfack, L.J.T.; Epoge, C.E. The nexus between labour force participation and environmental sustainability: Global comparative evidence. Heliyon 2023, 9, e21434. [Google Scholar] [CrossRef]
  73. Yang, B.; Ali, M.; Hashmi, S.H.; Jahanger, A. Do income inequality and institutional quality affect CO2 emissions in developing economies? Environ. Sci. Pollut. Res. 2022, 29, 42720–42741. [Google Scholar] [CrossRef] [PubMed]
  74. Ragazou, K.; Garefalakis, A.; Samara, A. In the fight against climate change: The importance of environ-mental, social, and governance initiatives. In Environmental Sustainability and Global Change; Elsevier: Amsterdam, The Netherlands, 2025; pp. 313–321. [Google Scholar]
  75. Parashar, M.; Jaiswal, R.; Sharma, M. A quantitative analysis of ESG disclosure and financial performance in renewable energy companies: A two-step approach using unsupervised machine learning. Int. J. Energy Sect. Manag. 2024, 19, 1186–1212. [Google Scholar] [CrossRef]
  76. Tomadon, L.d.S.; Couto, E.V.D.; de Vries, W.T.; Moretto, Y. Smart city and sustainability indicators: A bibliometric literature review. Discov. Sustain. 2024, 5, 143. [Google Scholar] [CrossRef]
  77. Sajadi, S.M.A.; Barak, S.; Fereydooni, A. Online Portfolio Selection Using Macroeconomic Pattern Matching. SSRN 2024. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4960922 (accessed on 18 September 2025).
  78. Sahu, R.T.; Verma, M.K.; Ahmad, I. Density-based spatial clustering of application with noise approach for regionalisation and its effect on hierarchical clustering. Int. J. Hydrol. Sci. Technol. 2023, 16, 240–269. [Google Scholar] [CrossRef]
  79. Lenzi, E. SIMBA: Systematic clustering-based methodology to support built environment analysis. In Proceedings of the 30th Italian Symposium on Advanced Database Systems, Tirrenia, Italy, 19–22 June 2022. [Google Scholar]
  80. Järvenstrand, R.; Åström, L. Evaluating Thematic Exposures Using Natural Language processing: An Unsupervised Topic Modeling Approach to Improve Classification and Mitigate Risk. Master’s Thesis, Umeå Universitet, Umeå, Sweden, 2024. [Google Scholar]
  81. Sharma, L.; Salineeta; Poddar, P.N.; Singhal, P.B. Investing in Women, Investing in the Planet: Quantifying the Impact of Women’s Empowerment on Environmental Sustainability. Rev. Gest. Soc. Ambient. 2024, 18, e05345. [Google Scholar] [CrossRef]
  82. Khan, I.; Hou, F. The Impact of Socio-economic and environmental sustainability on CO2 emissions: A novel framework for thirty IEA countries. Soc. Indic. Res. 2021, 155, 1045–1076. [Google Scholar] [CrossRef]
  83. Mohapatra, B.B.; Kumari, A.; Mohapatra, S.; Sahoo, B.K. Dynamic Relationship Between Carbon Dioxide Emissions and Gross Domestic Product for Low, Middle- and High-Income Countries. J. Quant. Econ. 2023, 21, 873–898. [Google Scholar] [CrossRef]
  84. Nimerenco, I.; Soare, B.E.L.; Panait, D.Z. Assessment of the Link between CO2 $emissions and Socio-Economic Indicators. Proc. Int. Conf. Bus. Excel. 2024, 18, 2594–2608. [Google Scholar] [CrossRef]
  85. Jenifel, M.G.; Jasmine, R.A.; Umanandhini, D. Bitcoin Price Predictive Dynamics Using Machine Learning Models. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 24–28 June 2024; pp. 1–6. [Google Scholar]
  86. Gupta, R.; Sharma, A.; Anand, V.; Gupta, S. Automobile Price Prediction using Regression Models. In Proceedings of the 2022 International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 20–22 July 2022; pp. 410–416. [Google Scholar]
  87. Chowdhury, M.S.; Shak, M.S.; Devi, S.; Miah, M.R.; Al Mamun, A.; Ahmed, E.; Mozumder, M.S.A. Optimizing E-Commerce Pricing Strategies: A Comparative Analysis of Machine Learning Models for Predicting Customer Satisfaction. Am. J. Eng. Technol. 2024, 6, 6–17. [Google Scholar] [CrossRef]
  88. Costantini, L.; Laio, F.; Mariani, M.S.; Ridolfi, L.; Sciarra, C. Forecasting national CO2 emissions world-wide. Sci. Rep. 2024, 14, 22438. [Google Scholar]
  89. Wen, L.; Yuan, X. Forecasting CO2 emissions in Chinas commercial department, through BP neural network based on random forest and PSO. Sci. Total Environ. 2020, 718, 137194. [Google Scholar] [CrossRef]
  90. Zhou, X.; Liu, Z.; Wu, L.; Wang, Y. Study on CO2 Emission Forecast of “Four Provinces of Mountains and Rivers” Based on Time-Series Machine Learning. Atmosphere 2024, 15, 949. [Google Scholar] [CrossRef]
  91. Siriteanu, A.-A. The effect of good governance on CO2 emissions and FDI inflows. J. Public Adm. Financ. Law 2024, 30, 225–233. [Google Scholar] [CrossRef]
  92. Sadaoui, N.; Zabat, L.; Sekrafi, H.; Abid, M. The moderating role of natural resources between governance and CO2 emissions: Evidence from MENA countries. Energy Environ. 2024, 35, 1597–1615. [Google Scholar] [CrossRef]
  93. Al Fannah, J.; Al Sabahi, S.; Al Harthi, H.; Al Bahrani, M.; Al Salmi, Q. Towards a green hospital approach in Oman: A case study of quantifying an environmental impact. Int. J. Health Plan. Manag. 2023, 38, 1583–1589. [Google Scholar] [CrossRef] [PubMed]
  94. Kadrić, D.; Blažević, R.; Bajrić, H.; Peco, A.; Kadrić, E. Renovation Measures for Reduction of Primary Energy Consumption and CO2 Emissions of Hospital Building. Teh. Glas. J. 2025, 19, 26–33. [Google Scholar] [CrossRef]
  95. Baldi, F.; Lambertides, N. Exploring the role of ESG for the performance and risks of infrastructure investing: Evidence from the international funds’ market. Manag. Financ. 2024, 50, 92–117. [Google Scholar] [CrossRef]
  96. Jiménez-Preciado, A.L.; Cruz-Aké, S.; Venegas-Martínez, F. Identification of Patterns in CO2 Emissions among 208 Countries: K-Means Clustering Combined with PCA and Non-Linear t-SNE Visualization. Mathematics 2024, 12, 2591. [Google Scholar] [CrossRef]
  97. Chovancová, J.; Petruška, I.; Pata, U.K.; Adamišin, P. Diverse pathways to decarbonization: Cluster-specific impacts of energy sources on CO2 emissions in the European Union. Energy Nexus 2025, 17, 100367. [Google Scholar] [CrossRef]
  98. Matviychuk, A.; Zhytkevych, O.; Osadcha, N. Modeling carbon dioxide emissions reduction. Energy Rep. 2024, 12, 1876–1887. [Google Scholar] [CrossRef]
  99. Srisuradetchai, P.; Suksrikran, K. Random kernel k-nearest neighbors regression. Front. Big Data 2024, 7, 1402384. [Google Scholar] [CrossRef]
  100. Ihzaniah, L.S.; Setiawan, A.; Wijaya, R.W.N. Perbandingan Kinerja Metode Regresi K-Nearest Neighbor dan Metode Regresi Linear Berganda pada Data Boston Housing. Jambura J. Probab. Stat. 2023, 4, 17–29. [Google Scholar] [CrossRef]
  101. Liu, K.; Tatinati, S.; Khong, A.W.H. A Weighted feature extraction technique based on temporal accumulation of learner behavior features for early prediction of dropouts. In Proceedings of the 2020 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), Takamatsu, Japan, 8–11 December 2020; pp. 295–302. [Google Scholar]
  102. Brunauer, W.; Servais, M. Nationwide ESG-Assessment of existing and new buildings at property level. In Proceedings of the 29th Annual European Real Estate Society Conference, London, UK, 12–15 July 2023. [Google Scholar]
  103. Abu Samah, I.H.; Rashid, I.M.A.; Husain, W.A.F.W.; Ibrahim, S.; Hamzah, H.; Amlus, M.H. The impact of healthcare expenditure and healthcare sector growth on CO2 emission using dynamic panel data system GMM estimation model during COVID-19 crisis. Int. J. Energy Econ. Policy 2020, 10, 235–241. [Google Scholar] [CrossRef]
  104. Andreoni, V. Estimating the European CO2 emissions change due to COVID-19 restrictions. Sci. Total Environ. 2021, 769, 145115. [Google Scholar] [CrossRef] [PubMed]
  105. Long, H.; Feng, G. Does national ESG performance curb greenhouse gas emissions? Innov. Green Dev. 2024, 3, 100138. [Google Scholar] [CrossRef]
  106. Junsiri, C.; Sutthichaimethee, P.; Phong-A-Ran, N. Modeling CO2 Emission Forecasting in Energy Consumption of the Industrial Building Sector under Sustainability Policy in Thailand: Enhancing the LISREL-LGM Model. Forecasting 2024, 6, 485–501. [Google Scholar] [CrossRef]
  107. Petrović, P.; Lobanov, M.M. The impact of R&D expenditures on CO2 emissions: Evidence from sixteen OECD countries. J. Clean. Prod. 2020, 248, 119187. [Google Scholar] [CrossRef]
  108. Chen, X.; Wang, J. The Impact of Regional Carbon Emission Reduction on Corporate ESG Performance in China. Sustainability 2024, 16, 5802. [Google Scholar] [CrossRef]
  109. Alqudah, M.Z.; Sierra-García, L.; Garcia-Benau, M.A. ESG sustainable technologies literature review using cluster-based methods. Pressacademia 2024, 18, 88–89. [Google Scholar] [CrossRef]
  110. Dong, Y.; Gao, J.; Qiu, J.; Cui, Y.; Guo, M. The dynamic spatial effects of education investment on carbon emissions: Heterogeneous analysis based on north-south differences in China. Front. Environ. Sci. 2024, 12, 1432457. [Google Scholar] [CrossRef]
  111. Drago, C.; Arnone, M.; Leogrande, A. A Machine Learning and Panel Data Analysis of N2O Emissions in an ESG Framework. Sustainability 2025, 17, 4433. [Google Scholar] [CrossRef]
  112. Lagodiyenko, O.; Lagodiienko, V.; Ivanchenkova, L.; Romanashenko, I.; Laskaiev, O. Prospects for the development of the ESG concept in the face of new challenges. Collect. Pap. New Econ. 2023, 1, 179–195. [Google Scholar]
  113. Paolone, F.; Bitbol-Saba, N. Identifying thematic clusters of ESG in contemporary accounting studies through systematic literature review. Meas. Bus. Excel. 2025, 29, 368–394. [Google Scholar] [CrossRef]
  114. Lee, C.L.; Liang, J. The effect of carbon regulation initiatives on corporate ESG performance in real estate sector: International evidence. J. Clean. Prod. 2024, 453, 142188. [Google Scholar] [CrossRef]
  115. Xie, H.; Qin, Z.; Li, J. ESG performance and corporate carbon emission intensity: Based on panel data analysis of A-share listed companies. Front. Environ. Sci. 2024, 12, 1483237. [Google Scholar] [CrossRef]
  116. Zhang, L.; Song, G.; Ma, X.; Zhan, C.; Zhang, S. Decarbonising residential building energy towards achieving the intended nationally determined contribution at subnational level under uncertainties. J. Clean. Prod. 2020, 272, 122760. [Google Scholar] [CrossRef]
  117. Giannelos, S.; Moreira, A.; Papadaskalopoulos, D.; Borozan, S.; Pudjianto, D.; Konstantelos, I.; Sun, M.; Strbac, G. A Machine learning approach for generating and evaluating forecasts on the environmental impact of the buildings sector. Energies 2023, 16, 2915. [Google Scholar] [CrossRef]
  118. Yang, W.; Hei, Y. Research on the impact of enterprise ESG ratings on carbon emissions from a spatial perspective. Sustainability 2024, 16, 3826. [Google Scholar] [CrossRef]
  119. Jarboui, A.; Mnif, E.; Akrout, Z.; Chakroun, S. Unveiling the drivers behind carbon emissions disclosure: An ESG perspective. Soc. Bus. Rev. 2025, 20, 276–292. [Google Scholar] [CrossRef]
  120. Anonna, F.R.; Mohaimin, M.R.; Ahmed, A.; Nayeem, M.B.; Akter, R.; Alam, S.; Nasiruddin, M.; Hossain, M.S. Machine Learning-Based Prediction of US CO2 Emissions: Developing Models for Forecasting and Sustainable Policy Formulation. J. Environ. Agric. Stud. 2023, 4, 85–99. [Google Scholar]
  121. Dou, X.; Liu, Z. Carbon Monitor Gridded Dataset (CMGD), a near-real-time high resolution gridded CO2 emission estimates dataset. In Proceedings of the EGU General Assembly 2021, Online, 19–30 April 2021. EGU21-4323. [Google Scholar]
Figure 1. Multi-method analytical framework for ESG-driven building sector decarbonization. Note: Multi-method analytical framework for modeling building sector CO2 emissions using ESG indicators. The framework begins with the decomposition of ESG into Environmental (E), Social (S), and Governance (G) dimensions (Panel (A)), followed by econometric modeling through fixed and random effects using panel data from 180 countries over 2000–2022 (Panel (B)). Clustering techniques (Panel (C)) such as density-based and hierarchical algorithms are applied to uncover emission patterns. Finally, machine learning regression models (Panel (D)) including k-NN, Random Forest, and SVM are used to predict emissions. This integrated approach captures both causal relationships and complex predictive patterns across ESG dimensions.
Figure 1. Multi-method analytical framework for ESG-driven building sector decarbonization. Note: Multi-method analytical framework for modeling building sector CO2 emissions using ESG indicators. The framework begins with the decomposition of ESG into Environmental (E), Social (S), and Governance (G) dimensions (Panel (A)), followed by econometric modeling through fixed and random effects using panel data from 180 countries over 2000–2022 (Panel (B)). Clustering techniques (Panel (C)) such as density-based and hierarchical algorithms are applied to uncover emission patterns. Finally, machine learning regression models (Panel (D)) including k-NN, Random Forest, and SVM are used to predict emissions. This integrated approach captures both causal relationships and complex predictive patterns across ESG dimensions.
Buildings 15 03601 g001
Figure 2. Comparative performance of clustering algorithms on building-related carbon emissions (BCE) profiles across ten normalized metrics.
Figure 2. Comparative performance of clustering algorithms on building-related carbon emissions (BCE) profiles across ten normalized metrics.
Buildings 15 03601 g002
Figure 3. Mean indicator values for ESG-based clusters identified through density-based clustering.
Figure 3. Mean indicator values for ESG-based clusters identified through density-based clustering.
Buildings 15 03601 g003
Figure 4. DBSCAN clustering and 5-Nearest Neighbor distance plot for environmental indicator-based country segmentation.
Figure 4. DBSCAN clustering and 5-Nearest Neighbor distance plot for environmental indicator-based country segmentation.
Buildings 15 03601 g004
Figure 5. Comparative performance of machine learning regression models for predicting building-sector CO2 emissions.
Figure 5. Comparative performance of machine learning regression models for predicting building-sector CO2 emissions.
Buildings 15 03601 g005
Figure 6. Feature importance rankings based on mean dropout loss in predicting building-sector emissions.
Figure 6. Feature importance rankings based on mean dropout loss in predicting building-sector emissions.
Buildings 15 03601 g006
Figure 7. Additive feature contributions to building CO2 emissions predictions (test cases).
Figure 7. Additive feature contributions to building CO2 emissions predictions (test cases).
Buildings 15 03601 g007
Figure 8. Additive feature contributions to building-related CO2 emissions: k-NN model explanations across five cases.
Figure 8. Additive feature contributions to building-related CO2 emissions: k-NN model explanations across five cases.
Buildings 15 03601 g008
Figure 9. Comparative performance of clustering algorithms for ESG-based environmental data: evaluation across statistical and validity metrics.
Figure 9. Comparative performance of clustering algorithms for ESG-based environmental data: evaluation across statistical and validity metrics.
Buildings 15 03601 g009
Figure 10. Cluster centroids from density-based model: ESG and building CO2 emissions contextualized by social indicators.
Figure 10. Cluster centroids from density-based model: ESG and building CO2 emissions contextualized by social indicators.
Buildings 15 03601 g010
Figure 11. Clustering the social foundations of building emissions: ESG-based density model output.
Figure 11. Clustering the social foundations of building emissions: ESG-based density model output.
Buildings 15 03601 g011
Figure 12. Performance metrics of regression models for ESG-informed emissions prediction.
Figure 12. Performance metrics of regression models for ESG-informed emissions prediction.
Buildings 15 03601 g012
Figure 13. Feature importance analysis of socio-economic determinants in building CO2 emissions via Random Forest model.
Figure 13. Feature importance analysis of socio-economic determinants in building CO2 emissions via Random Forest model.
Buildings 15 03601 g013
Figure 14. Relative importance of socio-economic indicators in predicting building CO2 emissions using Random Forest.
Figure 14. Relative importance of socio-economic indicators in predicting building CO2 emissions using Random Forest.
Buildings 15 03601 g014
Figure 15. Additive feature contributions to predicted building CO2 emissions: social indicators and Random Forest interpretability. Note: Displayed values represent feature contributions to the predicted value without features (column ‘Base’) for the test set.
Figure 15. Additive feature contributions to predicted building CO2 emissions: social indicators and Random Forest interpretability. Note: Displayed values represent feature contributions to the predicted value without features (column ‘Base’) for the test set.
Buildings 15 03601 g015
Figure 16. Comparison of clustering algorithms on governance indicators and building CO2 emissions (BCE).
Figure 16. Comparison of clustering algorithms on governance indicators and building CO2 emissions (BCE).
Buildings 15 03601 g016
Figure 17. Country clusters by governance profiles and building CO2 emissions: results from neighborhood-based clustering. Normalized data. The y-axis shows the normalized values between 0 and 1. The x-axis shows the number of clusters.
Figure 17. Country clusters by governance profiles and building CO2 emissions: results from neighborhood-based clustering. Normalized data. The y-axis shows the normalized values between 0 and 1. The x-axis shows the number of clusters.
Buildings 15 03601 g017
Figure 18. Regression model performance on governance-based prediction of building CO2 emissions (BCE).
Figure 18. Regression model performance on governance-based prediction of building CO2 emissions (BCE).
Buildings 15 03601 g018
Figure 19. Governance feature importance in predicting building CO2 emissions: dropout loss-based ranking.
Figure 19. Governance feature importance in predicting building CO2 emissions: dropout loss-based ranking.
Buildings 15 03601 g019
Figure 20. Additive explanations of governance feature contributions to BCE predictions.
Figure 20. Additive explanations of governance feature contributions to BCE predictions.
Buildings 15 03601 g020
Figure 21. Additive effects of governance features on predicted building CO2 emissions: case-level decomposition.
Figure 21. Additive effects of governance features on predicted building CO2 emissions: case-level decomposition.
Buildings 15 03601 g021
Table 1. ESG dimensions and their property-level implications in the building sector.
Table 1. ESG dimensions and their property-level implications in the building sector.
ESG DimensionReferencesMain ResultsComparison with Our Study
E-Environmental[1,7,8,9,10,11,12,13,14,15,16,17,20,28,30,31,32,35,36,37,38,39,40]. ESG certifications and technologies are often disconnected from actual emission outcomes; operational routines are key but under-applied.Our study directly quantifies the impact of environmental ESG indicators (e.g., renewable energy, air quality) on building emissions, validating their role with econometric models.
S—Social[8,18,19,20,21,22,23,24,25,29]. Social aspects are conceptually acknowledged but rarely implemented operationally; issues of equity, inclusion, and behavior are underdeveloped.Our study empirically links social variables (e.g., gender equity, income equality, labor participation) to emission levels, revealing both mitigation and rebound effects.
G—Governance[1,10,11,12,15,17,19,24,26,27,28,29,30,31,32,33,34,35,36,39,40,41,42,43,44]. Governance is often fragmented; regulation and institutional routines are disconnected from long-term ESG outcomes.Our study finds that governance indicators (e.g., government effectiveness, political stability) may paradoxically correlate with higher emissions, especially in higher-income countries, highlighting the complexity of governance–ESG links.
Table 2. Results of the panel data.
Table 2. Results of the panel data.
Random Effects (GLS), Using 990 Observations, Dependent Variable: BCEFixed Effects, Using 990 Observations
Dependent Variable: BCE
CoefficientStd. ErrorzCoefficientStd. Errort-Ratio
const9.316611.62740.801310.298710.60610.9710
CFTC0.318025 ***0.08429853.7730.355114 ***0.08930233.977
ELEC−0.245591 ***0.0923988−2.658−0.273064 ***0.0964139−2.832
ENUC0.00269113 ***0.0006914663.8920.00284832 ***0.0007357973.871
PM250.664678 ***0.1314725.0560.691987 ***0.1444704.790
RENC−0.489147 ***0.104604−4.676−0.513324 ***0.112617−4.558
Statistics Mean dependent var2.480 2.480
Sum squared resid5,836,005 90,688
Log-likelihood−5702 −3640
Schwarz criterion11,445 8.413
Rho0.756174 0.756174
S.D. dependent var7.753
S.E. of regression7.697
Akaike criterion11,416
Hannan–Quinn11,427
Durbin–Watson0.324935
LSDV R-squared 0.984747
LSDV F(163, 826) 3.271
Test‘Between’ variance = 4971.73, ‘Within’ variance = 91.6047, mean theta = 0.935247, Joint test on named regressors—Asymptotic test statistic: chi-square(5) = 109.56 with p-value = 5.0646   ×   10 22 Joint test on named regressors—
Test statistic: F(5, 826) = 20.981
with p-value = P(F(5, 826) > 20.981) = 8.85584 × 10−20
Breusch–Pagan test—Null hypothesis: Variance of the unit-specific error = 0, Asymptotic test statistic: chi-square(1) = 3155.73 with p-value = 0Test for differing group intercepts—Null hypothesis: The groups have a common intercept
Test statistic: F(158, 826) = 319.607
with p-value = P(F(158, 826) > 319.607) = 0
Hausman test—Null hypothesis: GLS estimates are consistent, Asymptotic test statistic: chi-square(5) = 5.09644, with p-value = 0.404224
Note: *** p < 0.01.
Table 3. Cluster characteristics from density-based clustering on ESG-linked energy and environmental indicators.
Table 3. Cluster characteristics from density-based clustering on ESG-linked energy and environmental indicators.
ClusterNoise Points12345
Size194966208
Explained proportion within-cluster heterogeneity0.0000.9998.471 × 10−42.529 × 10−51.102 × 10−54.580 × 10−4
Within sum of squares0.0004.5843.8880.1160.0512.103
Silhouette score0.0000.3470.6320.9680.9850.820
Table 4. Impact of social indicators on building-related CO2 emissions: fixed-effect and random-effect panel regression results (2000–2020).
Table 4. Impact of social indicators on building-related CO2 emissions: fixed-effect and random-effect panel regression results (2000–2020).
Fixed Effects, Using 1246 Observations
Dependent Variable: BCE
Random Effects (GLS), Using 1246 Observations
Dependent Variable: BCE
CoefficientStd. Errort-RatioCoefficientStd. Errorz
const46.5798 ***8.086705.76036.040510.54313.418
FOOD0.0863615 ***0.02360103.6590.08685470.02338773.714
GPIE−0.00173333 **0.000727140−2.384−0.001712240.000720050−2.378
INC200.454688 **0.2152272.1130.4517060.2135272.115
LABF−0.323534 ***0.114301−2.831−0.2993590.111774−2.678
WPAR−0.169214 ***0.0504386−3.355−0.1694340.0499015−3.395
Statistics Mean dependent var31.32247 Mean dependent var31.32247
Sum squared resid73,866 Sum squared resid9479
LSDV R-squared0.992108 Log-likelihood−7335.684
LSDV F(142, 1103)976.5221 Schwarz criterion14,714
Log-likelihood−4311.281 Rho0.727702
Schwarz criterion9641.822 S.D. dependent var86.70740
Rho0.727702 S.E. of regression87.39678
S.D. dependent var86.70740 Akaike criterion14,683
S.E. of regression8.183426 Hannan–Quinn14,694
Within R-squared0.032513 Durbin–Watson0.486805
p-value (F)0.000000
Akaike criterion8908.562
Hannan–Quinn9184.263
Durbin–Watson0.486805
TestsJoint test on named regressors—
Test statistic: F(5, 1103) = 7.41335
with p-value = P(F(5, 1103) > 7.41335) = 7.51925 × 10−7
‘Between’ variance = 6423.31, ‘Within’ variance = 59.2827, mean theta = 0.953035, Joint test on named regressors—Asymptotic test statistic: chi-square(5) = 36.391 with p-value = 7.93233 × 10−7
Test for differing group intercepts—
Null hypothesis: The groups have a common intercept
Test statistic: F(137, 1103) = 1002.32
with p-value = P(F(137, 1103) > 1002.32) = 0
Breusch–Pagan test—Null hypothesis: Variance of the unit-specific error = 0
Asymptotic test statistic: chi-square(1) = 5098.24, with p-value = 0
Hausman test—Null hypothesis: GLS estimates are consistent, Asymptotic test statistic: chi-square(5) = 2.65474
with p-value = 0.753031
Note: *** p < 0.01, ** p < 0.05.
Table 5. Cluster characteristics from density-based clustering of ESG and building CO2 emissions data.
Table 5. Cluster characteristics from density-based clustering of ESG and building CO2 emissions data.
ClusterNoise Points1234
Size11188211521
Explained proportion within-cluster heterogeneity0.0000.9560.0060.0020.036
Within sum of squares0.0002.28813.6325.50284.952
Silhouette score0.0000.6180.8510.8780.768
Note: The between sum of squares of the 4-cluster model is 5054.78. Note: The total sum of squares of the 4-cluster model is 7447.41.
Table 6. Panel regression results: governance indicators and building-sector CO2 emissions (BCE).
Table 6. Panel regression results: governance indicators and building-sector CO2 emissions (BCE).
Fixed Effects, Using 982 Observations
Dependent Variable: BCE
Random Effects (GLS), Using 982 Observations Dependent Variable: BCE
CoefficientStd. Errort-RatioCoefficientStd. Errorz
const14.7556 **6.120132.4114.378239.080370.4822
GOVT12.7921 ***2.422175.28111.2943 ***2.156475.237
EDUE0.450092 *0.2450661.8370.404457 *0.2379951.699
STAB−3.19087 ***1.16516−2.739−2.29649 **0.964108−2.382
RNDG−4.11812 **1.71293−2.404−4.25068 **1.65044−2.575
LAWR−4.29609 **2.00928−2.138−1.36151 *0.755280−1.803
HOSP3.80015 ***0.6123546.2063.75312 ***0.5891346.371
SCIE0.000433491 ***2.56973 × 10−516.870.000462643 ***2.50122 × 10−518.50
StatisticsMean dependent var44.83643 Mean dependent var44.83643
Sum squared resid88,339.25 Sum squared resid6,648,511
LSDV R-squared0.992464 Log-likelihood−5724.171
LSDV F(108, 873)1064.614 Schwarz criterion11,503.46
Log-likelihood−3602.578 Rho0.766045
Schwarz criterion7956.121 S.D. dependent var109.3165
Rho0.766045 S.E. of regression82.57715
S.D. dependent var109.3165 Akaike criterion11,464.34
S.E. of regression10.05935 Hannan–Quinn11,479.22
Within R-squared0.299880 Durbin–Watson0.464047
p-value (F)0.000000
Akaike criterion7423.156
Hannan–Quinn7625.898
Durbin–Watson0.464047
TestsJoint test on named regressors-
Test statistic: F(7, 873) = 53.4184
with p-value = P(F(7, 873) > 53.4184) = 1.58042 × 10−63
‘Between’ variance = 5929.08
‘Within’ variance = 89.9585
Mean theta = 0.942699
Joint test on named regressors-
Asymptotic test statistic: chi-square(7) = 437.883
with p-value = 1.77433 × 10−90
Test for differing group intercepts-
Null hypothesis: The groups have a common intercept
Test statistic: F(101, 873) = 78.0485
with p-value = P(F(101, 873) > 78.0485) = 0
Breusch–Pagan test-
Null hypothesis: Variance of the unit-specific error = 0
Asymptotic test statistic: chi-square(1) = 2207.29
with p-value = 0
Hausman test-
Null hypothesis: GLS estimates are consistent
Asymptotic test statistic: chi-square(7) = 73.1884
with p-value = 3.34305 × 10−13
Note: *** p < 0.01, ** p < 0.05, * p < 0.10.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Magaletti, N.; Notarnicola, V.; Di Molfetta, M.; Leogrande, A. Decarbonizing the Building Sector: The Integrated Role of Environmental, Social, and Governance Indicators. Buildings 2025, 15, 3601. https://doi.org/10.3390/buildings15193601

AMA Style

Magaletti N, Notarnicola V, Di Molfetta M, Leogrande A. Decarbonizing the Building Sector: The Integrated Role of Environmental, Social, and Governance Indicators. Buildings. 2025; 15(19):3601. https://doi.org/10.3390/buildings15193601

Chicago/Turabian Style

Magaletti, Nicola, Valeria Notarnicola, Mauro Di Molfetta, and Angelo Leogrande. 2025. "Decarbonizing the Building Sector: The Integrated Role of Environmental, Social, and Governance Indicators" Buildings 15, no. 19: 3601. https://doi.org/10.3390/buildings15193601

APA Style

Magaletti, N., Notarnicola, V., Di Molfetta, M., & Leogrande, A. (2025). Decarbonizing the Building Sector: The Integrated Role of Environmental, Social, and Governance Indicators. Buildings, 15(19), 3601. https://doi.org/10.3390/buildings15193601

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop