Article
Peer-Review Record

Integrating ESG with Digital Twins and the Metaverse: A Data-Driven Framework for Smart Building Sustainability

Systems 2025, 13(12), 1083; https://doi.org/10.3390/systems13121083 (registering DOI)
by Nicola Magaletti 1, Chiara Tognon 2, Mauro Di Molfetta 1, Angelo Zerega 2, Valeria Notarnicola 1, Ettore Zini 2 and Angelo Leogrande 1,3,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 6 October 2025 / Revised: 6 November 2025 / Accepted: 27 November 2025 / Published: 1 December 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for the opportunity to review this work. My detailed comments:

- failure to comply with the journal template - indicates editorial imperfection,

- there should be no period after keywords,

- APA-style citations do not meet the journal's standards,

- the authors write that there is a lack of a codified corpus linking the metaverse and smart buildings, but they do not report any details of the search methodology (range of years, languages, databases other than Scopus, exclusion strategies); the claim about “five” publications is made rhetorically, without a review table, PRISMA diagram, or selection criteria.

- The authors repeatedly use the words prototype and case study, but admit that the evaluation is based on simulated data; there is a lack of raw data, acquisition protocol, sensor parameters, test scenarios, error and uncertainty measures.

- The discussion of the system points to numerous benefits (energy efficiency, reduced MTTR, increased FTFR, improved comfort), but the “Limitations” section itself admits the lack of data from actual implementations and excessive reliance on simulations, thus the conclusions are stronger than the empirical material.

The current version does not meet the criteria for a research article. It is a solid concept paper (white paper) with a KPI table and dashboard suggestions, but without data, analysis, or validation.

Author Response

Point to Point Answers to Reviewer 1

 

Q1. Thank you for the opportunity to review this work.

A1. We thank the reviewer for their work.

Q2. Failure to comply with the journal template - indicates editorial imperfection

A2. This is correct. We should have used the template provided by the journal Systems.

Q3. There should be no period after keywords

A3. Periods have been removed. The new description of keywords is as follows:

Smart Building Management; Metaverse Technologies; Sustainability; Key Performance Indicators (KPIs); Digital Twins and IoT; ESG

Q4. APA-style citations do not meet the journal's standards

A4. We have removed the APA-based citation model and introduced an IEEE-based citation model.

Q5. The authors write that there is a lack of a codified corpus linking the metaverse and smart buildings, but they do not report any details of the search methodology (range of years, languages, databases other than Scopus, exclusion strategies); the claim about “five” publications is made rhetorically, without a review table, PRISMA diagram, or selection criteria.

A5. We thank the reviewer for this valuable observation. The purpose of our investigation was not to perform a full systematic literature review (SLR), but rather to verify whether a structured and codified body of research explicitly connecting metaverse technologies and smart building management already existed. The scarcity of available works constitutes, in itself, an important finding that confirms the novelty of the topic and justifies the originality of the research. To address the reviewer’s concern and enhance methodological transparency, we have expanded the section describing the literature exploration process. The revised text now reads as follows:

“To assess the current state of research linking metaverse technologies to smart building management, a targeted literature search was conducted using the Elsevier Scopus database (2018–2025, English language) with the query: TITLE-ABS-KEY (‘metaverse’) AND TITLE-ABS-KEY (‘smart building’). Only five peer-reviewed results were identified. Although this number is too limited for a formal systematic review, it reveals the absence of a codified corpus in this emerging intersection, thus justifying the originality and timeliness of the present research.”

In addition, we have clarified the methodological aspects in the revised manuscript. Specifically, the literature exploration was conducted using the Elsevier Scopus database, chosen for its comprehensive coverage of peer-reviewed journals in technology, engineering, and sustainability. The search covered the period 2018–2025 and was limited to English-language publications. Only studies explicitly linking metaverse applications with smart building management or operation were included, while broader works on smart cities, virtual reality, or IoT not referring to metaverse contexts were excluded. The five retrieved works have been summarized in the manuscript with a concise descriptive synthesis, providing clear evidence of the fragmentation and early developmental stage of this research domain. Although the limited number of studies does not justify the adoption of a PRISMA diagram or a full systematic review framework, it offers an empirical confirmation of the research gap that this paper addresses. The updated section and clarifications directly respond to the reviewer’s request for greater methodological transparency.

 

Q6. The authors repeatedly use the words prototype and case study, but admit that the evaluation is based on simulated data; there is a lack of raw data, acquisition protocol, sensor parameters, test scenarios, error and uncertainty measures.

A6. Sections 3 to 8 outline an integrated methodology for developing, validating, and scientifically interpreting an ESG dataset used to prototype an intelligent management model for a smart building based on digital twin and metaverse technologies. Section 3 establishes the KPIs for the three components of ESG analysis (environmental, social, and governance); its objective is to define an integrated system of indicators for evaluating smart and sustainable infrastructures. Section 4 introduces the methodology used to test the quality and availability of the data, thereby documenting the robustness of the dataset required to implement the model. Sections 5 and 6 describe the statistical validation of the approach through correlation and regression analysis; they examine the internal coherence of the environmental, social, and economic factors in the ESG analysis and identify which indicators are correlated with one another. Section 7 presents a confirmatory analysis based on Principal Component Analysis (PCA), which confirms the multidimensional character of the dataset. Section 8 extends the analysis with machine learning algorithms, namely Random Forest and Support Vector Machines (SVM). Taken together, Sections 5 to 8 offer evidence for the scientific validation of the dataset underlying the environmental approach, and the overall approach provides a foundation for integrating intelligent management models suited to digital twins and the metaverse.
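As an illustration of the correlation and regression step described for Sections 5 and 6, the following minimal sketch computes a Pearson coefficient and a least-squares line between two KPI series. The series names and values are hypothetical and are not taken from the manuscript's dataset; the manuscript's actual analysis may rely on different tools.

```python
from math import sqrt
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Hypothetical monthly values: on-site energy ratio vs. emission intensity.
oer = [0.2, 0.4, 0.6, 0.8]
emin = [0.40, 0.32, 0.21, 0.13]

r = pearson_r(oer, emin)
slope, intercept = linear_fit(oer, emin)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A strongly negative coefficient in such a check would indicate that higher on-site generation goes together with lower emission intensity, which is the kind of internal coherence the response refers to.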

 

3. Development of the Environmental Dataset for Evaluating Smart Infrastructure Performance through Digital Twin Integration

 

 

Creating the environmental dataset is an essential step in designing an intelligent digital twin model for smart infrastructures, and the environmental KPI set defined in this section is the foundation of the proposed research. The literature on digital twins applied to the environmental performance of smart infrastructures was reviewed to identify the KPIs that such a model requires. The ultimate goal is an integrated system that can simulate the environmental performance of an infrastructure. The KPI set supports an operational analysis of the relationship between energy efficiency, sustainability, and environmental performance. Carbon footprint and emission intensity provide extended information on environmental sustainability, while the load cover factor and on-site energy ratio provide evidence on system autonomy and energy efficiency. The resulting dataset permits an integrated analysis of environmental performance that is consistent with the environmental pillar of the ESG framework, and it allows the environmental performance of any number of infrastructures to be compared on the same basis. It also provides a knowledge base on the environmental sustainability of infrastructures that can inform design decisions on an ongoing basis and can be extended as the system's environmental design evolves. In summary, the digital twin, supported by this dataset, represents an intelligent approach to the environmental sustainability of infrastructures.

 

 

3.1 Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Evaluation of Smart Infrastructure

 

The chosen environmental Key Performance Indicators (KPIs) provide a comprehensive framework for evaluating the environmental performance of smart infrastructures. They support the aims of the proposed digital twin platform, which is to analyze, model, and optimize the environmental, energy, and operational characteristics of buildings and urban infrastructures in an immersive, data-driven setting ([1]; Fokaides, Jurelionis, & Spudys, 2022). Each KPI adds a distinct perspective on energy, resilience, and efficiency, and together they form a holistic model for managing and interpreting the sustainability of urban infrastructures [1]. The Carbon Footprint (CFPT) is a fundamental measure of sustainability that quantifies the total greenhouse gas emissions generated by system activity. It expresses complex system operations in a single comparable unit, CO₂-eq, covering both direct and indirect emissions (Zahedi, Alavi, Sardroud, & Dang, 2024). In digital twin analysis the CFPT is essential because it enables real-time analysis and projection of environmental implications across different operating scenarios (Li, 2025). It effectively serves as the central link between energy performance and global climate goals (Yu, Ye, Xia, & Chen, 2024). Emission Intensity (EMIN) adds a further dimension by normalizing emissions by the energy consumed or produced. This ratio allows emissions to be compared across systems of different scale, making it valuable for multi-building and city-scale analysis (Alibrandi, 2022). The Load Cover Factor (LCF) and Supply Cover Factor (SCF) assess the relationship between energy demand and supply, an important consideration for energy and resource sufficiency. 
The LCF evaluates how much of the energy demand is covered by local energy production over a given period, which indicates system self-sufficiency, while the SCF evaluates how much of the local energy production is actually used to meet that demand, which indicates how well the system uses its own resources (Chávez et al., 2022). The Load Matching Index (LMI) evaluates the synchrony between local energy production and energy demand. High LMI values indicate that local loads are well covered by local production and storage, which is a basis for the efficiency and resilience of smart grids (Klar & Angelakis, 2023). The On-Site Energy Ratio (OER) captures the extent to which local energy consumption is covered by on-site generation from renewable sources, making it a key factor in assessing zero-energy buildings (Prandi et al., 2022). The Grid Interaction Index (GII) and No-Grid Interaction Probability (NGI) place autonomy in the context of the wider grid: the GII captures the intensity and direction of energy exchanges with the grid, while the NGI estimates the probability of autonomous operation (Fokaides et al., 2022). Capacity Factor (CAF) and One Percent Peak Power (OPP) describe system performance at varying loads. The CAF estimates how fully the system uses its installed energy resources, a key index for judging return on investment, while the OPP focuses on peak loads and their intensity, and thus on system stress [2]. Turning to behavior-based performance, the Demand Response Percentage (DRS) estimates the system's flexibility in adapting to varying loads, particularly in smart pricing scenarios [3]. 
The system's overall flexibility in adapting to external stimuli, such as market prices or renewable resource availability, and thus its transition from static energy management to adaptive operation, is captured by three indicators: the Flexibility Factor (FLF), the Flexibility Index (FLI), and Flexible Energy Efficiency (FEE) (Chávez et al., 2022, 2022; Li, 2025). This framework not only meets the requirements of sustainability analysis but also supports decision-making, scenario analysis, and future system optimization ([4]; Zahedi et al., 2024). It is therefore well aligned with the requirements of an intelligent, fully interoperable, and environmentally sustainable smart urban ecosystem supported by measurable performance indicators (Prati, Pelucchi, Dal Fiore, Fuzzati, & Agostini, 2023).
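To make the emission and load-matching indicators described above concrete, here is a minimal Python sketch. It assumes the usual forms: CFPT as the sum of activity quantities times emission factors, EMIN as emitted CO₂ per unit of energy, and LCF, SCF, and OER computed from interval series of on-site generation and load. Storage balance and losses are omitted as a simplifying assumption, and all function names, emission factors, and sample values are illustrative rather than taken from the manuscript.

```python
from typing import Dict, Mapping, Sequence


def carbon_footprint(activity: Mapping[str, float], factor: Mapping[str, float]) -> float:
    """CFPT: sum over activities of quantity times emission factor (tCO2e)."""
    return sum(quantity * factor[name] for name, quantity in activity.items())


def emission_intensity(co2_t: float, energy_kwh: float) -> float:
    """EMIN: emitted CO2 per unit of energy consumed or produced (tCO2/kWh)."""
    return co2_t / energy_kwh


def cover_factors(generation: Sequence[float], load: Sequence[float]) -> Dict[str, float]:
    """LCF, SCF and OER from per-interval generation and load (kWh).

    Assumption: no storage and no losses, so the energy that is generated
    and used on site in each interval is min(generation, load).
    """
    matched = sum(min(g, l) for g, l in zip(generation, load))
    total_gen, total_load = sum(generation), sum(load)
    return {
        "LCF": matched / total_load,    # share of the load covered on site
        "SCF": matched / total_gen,     # share of generation used on site
        "OER": total_gen / total_load,  # generation relative to consumption
    }


# Illustrative inputs only; the emission factors are assumed, not sourced.
activity = {"grid_electricity_kwh": 1000.0, "diesel_l": 50.0}
factor = {"grid_electricity_kwh": 0.0004, "diesel_l": 0.0027}  # tCO2e per unit
cfpt = carbon_footprint(activity, factor)
emin = emission_intensity(cfpt, 1000.0)
kpis = cover_factors(generation=[0.0, 2.0, 4.0, 1.0], load=[1.0, 2.0, 2.0, 3.0])
print(cfpt, emin, kpis)
```

With these sample series the load cover factor is 0.625 while the on-site energy ratio is 0.875, which shows why the two are reported separately: total generation can be close to total demand even when the two are poorly matched in time.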

 

Table xyz. Environmental Key Performance Indicators (KPIs) and Their Computational Formulations

KPI

ACRONYM

Description

Formula

Carbon Footprint

CFPT

Indicates the total amount of greenhouse gas (GHG) emissions caused by an individual, organization, or product, either directly or indirectly. The formula calculates the sum of emissions associated with different activities by multiplying the quantity of each activity by its corresponding emission factor [5].

 

 

 

 

 = Quantity of a specific activity that generates greenhouse gas emissions (e.g., km, kWh, liters).

 = Rate of GHG emissions per unit of activity, expressed in CO₂ equivalent per unit (e.g., tCO₂e/kWh for electricity, tCO₂e/liter for fuel, etc.).

Emission Intensity

EMIN

Evaluates the environmental impact of an energy system by measuring the amount of carbon dioxide (CO₂) emitted per unit of energy consumed or produced. A low EMIN value indicates that the system is more environmentally efficient, emitting less CO₂ for each unit of energy consumed or produced (this can occur through the use of renewable energy sources). Conversely, high EMIN values typically occur in systems that rely heavily on fossil fuels [6].

 


 

 = Total amount of CO₂ emitted over a given period, resulting from the consumption of fossil fuels or the use of grid electricity [tCO₂]

 = Total amount of energy consumed or produced during the same reference period [kWh]

Load Cover Factor

LCF

Represents the ratio between the energy actually supplied by a generation source and the energy demanded or consumed over a given time interval. When LCF = 1, generation meets or exceeds demand and the entire load is satisfied. When LCF < 1, the load is not completely met during part of the period, due to limitations in generation or available resources. Range: 0 ≤ LCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Supply Cover Factor

SCF

Indicates the ability of an organization to meet its energy demand through its own on-site supply resources. When SCF = 1, the amount of usefully supplied energy is exactly equal to the total available amount, which implies that there are no significant losses and that all available resources are fully utilized. When SCF < 1, the amount of effectively usable resources is lower than the total available amount: part of the generated energy is not used to meet the load, likely due to overproduction, losses, or storage capacity limitations. Range: 0 ≤ SCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Load Matching Index

LMI

Measures the efficiency with which on-site energy generation (whether renewable or not) matches the energy load (demand) of a system.

It evaluates how well the energy production profile corresponds to the load profile over time by analyzing the synchrony between supply and demand.

A higher index indicates a better match between generation and load.

When f_(load,i) = 100 %, the load is fully met (i.e., generation and storage are sufficient to cover the required demand) in every considered interval.

When f_(load,i) < 100 %, the load is not fully met at certain times, meaning that the generation and/or storage capacity was lower than the demand.

Range: 0 % ≤ f_(load,i) ≤ 100 % [8].

 

i = Time intervals [hourly, daily, monthly]

 = On-site energy generation at a given time t [kWh]

 = Storage energy balance at a given time t [kWh]

 = Energy losses at a given time t (sum of generation energy losses, storage energy losses, building technical system losses (excluding storage), and load-related energy losses such as distribution losses) [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Number of samples within the evaluation period, from τ₁ to τ₂. When hourly data are used and the evaluation period covers a full year, the number of samples is 8760.

 

On-site Energy Ratio

OER

Determines the amount of energy produced on-site (e.g., from renewable sources such as solar panels or wind turbines) relative to the total energy consumption over a given period of time.

If OER = 1, the on-site generated energy equals the total energy consumption.

If OER < 1, the on-site produced energy is lower than total consumption, meaning that the system depends on external energy sources to meet the demand.

If OER > 1, the on-site generated energy exceeds total consumption, indicating that energy production is greater than demand (and surplus energy may be exported to the grid).

Range: OER ≥ 0 [9].

 

 

 = On-site energy generation at a given time t [kWh]

 = Total energy consumption (energy load) at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 

 

 

 

Grid Interaction Index

GII

Measures the level of interaction and integration of a facility with the power grid, describing its average stress.

If GII = 100%, the energy exchanged with the grid during interval i equals the maximum possible exchange.

If GII = 0%, no energy exchange with the grid occurred at that moment.

If GII < 0%, energy was injected into the grid rather than drawn from it [7], [8].

 

 = Net energy exchanged with the power grid during interval i (can be positive or negative depending on whether energy is being drawn from or injected into the grid) [kWh]

 = Maximum absolute value of the net energy flow with the grid, taken over all considered time intervals [kWh]

i = Time intervals [hourly, daily, monthly]

No grid interaction probability

NGI

Measures the probability that a building or facility operates autonomously from the power grid, and therefore the likelihood of no interaction with it.

It also indicates the extent to which the load is covered by stored energy or renewable energy use.

If NGI = 0, there was no moment during the considered time interval when the net energy was zero or negative.

If NGI = 1, the net energy was zero or negative for the entire considered period.

Range: 0 ≤ NGI ≤ 1 [7], [8].

 

 = Probability that the net energy  is zero or negative during the time interval ||

 = Normalized variable for the net exported energy at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

Capacity Factor

 

CAF

Defines the ratio between the actual energy production of a system (energy exchanged between the building and the grid) and the maximum production that could be achieved if the system operated at full capacity over a given period of time.

If CAF = 1, the system operated at its maximum capacity for the entire considered period.

If CAF = 0, the system did not produce any energy.

Range: 0 ≤ CAF ≤ 1 [8].

 

 = Normalized variable for the net exported energy at a given time t [kWh]

 = Maximum producible energy at full capacity (system capacity) [kWh]

 = = Evaluation period [s]

One Percent Peak Power

OPP

Quantifies the maximum power that an energy system can reach by calculating the energy production corresponding to the top 1% of peak periods.

A high OPP value indicates that the building or system experiences moments (the top 1% of the time) with very high energy consumption. This may point to significant peak loads that place stress on the electrical grid.

If OPP is low, the building’s energy demand is more evenly distributed over time, with fewer or smaller peaks [10].

 

 = Energy associated with the top 1% of a given value, calculated during periods of maximum demand or generation [kWh]

 = Time period over which the energy is measured [h]

Demand Response Percentage

 

DRS

Refers to the percentage variation of the Demand Response relative to a baseline value.

If DRS > 0, the Demand Response was successful in reducing power compared to the baseline level (load “reduction” capability).

If DRS = 0, no variation occurred.

If DRS < 0, it indicates an increase in power during the Demand Response implementation, which is generally undesirable (load “overload” condition) [11].

 

 = Baseline hourly power, i.e., the expected or normal power level without any Demand Response measures [kWh]

 = Hourly power under Load Shifting conditions, i.e., the power recorded during the Demand Response event [kWh]

Flexibility Factor

FLF

Measures the ability of an energy system to adapt to variations in energy demand and resource availability, and to shift energy use from high-price periods to lower-price periods. It applies a daily quartile-based price classification, dividing prices into three categories: low, medium, and high.

A high price is defined as one above the third quartile (price > 75% of all prices during a day).

A low price corresponds to a value within the first quartile (price ≤ 25%).

If FLF = 0, consumption is balanced between low- and high-price periods.

If FLF = 1, consumption occurs only during low-price periods.

If FLF < 0, most consumption occurs during high-price periods.

Range: −1 ≤ FLF ≤ 1 [12].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Low-price periods (first quartile, i.e., the lowest 25% of prices)

 = High-price periods (above the third quartile, i.e., the highest 25% of prices)

 = Number of considered time intervals

 

Flexibility Index

FLI

Calculates the difference between the energy cost under a flexibility-controlled scenario and the energy cost under a reference scenario. The Flexibility Index is used to measure the effectiveness of flexibility strategies in reducing costs compared to a baseline case.

If FLI < 0, the flexibility-controlled case has a higher energy cost than the reference case, meaning an undesirable cost increase.

If FLI = 0, the total energy cost under flexible conditions is identical to that of the reference case, indicating that flexibility yields no savings.

If FLI = 1, the total cost in the flexibility-controlled case is zero relative to the reference case; this represents an ideal but unrealistic situation.

If FLI is positive and close to 1, it means that energy has been effectively shifted or managed, reducing costs compared to the reference scenario.

Range: −∞ < FLI ≤ 1 [13].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Total electricity cost in a flexibility-controlled scenario

 = Total electricity cost in a reference scenario without flexibility control

 = Number of considered time intervals

Flexible Energy Efficiency

FEE

Measures how effectively a system utilizes flexible energy compared to its reference energy consumption. It refers to the system’s ability to manage energy use during Demand Response (DR) events, considering the “rebound effect” (i.e., when energy consumption increases after a reduction event to restore normal operating conditions). A higher FEE value indicates greater flexibility efficiency, meaning the system can better optimize energy use during flexible periods. Range: 0 % ≤ FEE ≤ 100 % [14].

 

 = Flexible energy, i.e., the energy used during periods when the system operates in flexible mode (for example, by optimizing consumption based on renewable resource availability or variable pricing) [kWh]

 = Reference or baseline energy, i.e., the energy consumed under normal or non-flexible operating conditions [kWh]

Note. This table presents the Environmental Key Performance Indicators (KPIs) used to evaluate the environmental, energy, and operational performance of smart infrastructures within a digital twin framework. Each KPI is defined with its acronym, description, and mathematical formulation for standardized and comparative analysis.
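The grid-interaction and flexibility indicators in the table can be sketched in Python as follows. Because the formula column is not reproduced in this record, the expressions below are assumptions chosen to match the verbal descriptions (for example, FLI = 1 − cost_flex / cost_ref, which is 0 when the two costs are equal and 1 when the flexible cost is zero). Function names and sample data are hypothetical.

```python
from statistics import quantiles
from typing import List, Sequence


def grid_interaction_index(net_grid: Sequence[float]) -> List[float]:
    """GII per interval: net exchange as a percentage of the largest absolute exchange."""
    peak = max(abs(e) for e in net_grid)
    return [100.0 * e / peak if peak else 0.0 for e in net_grid]


def no_grid_interaction_probability(net_grid: Sequence[float]) -> float:
    """NGI: fraction of intervals in which the net energy is zero or negative."""
    return sum(1 for e in net_grid if e <= 0) / len(net_grid)


def demand_response_percentage(baseline: float, during_event: float) -> float:
    """DRS: positive when the demand response event reduced power below the baseline."""
    return 100.0 * (baseline - during_event) / baseline


def flexibility_factor(consumption: Sequence[float], price: Sequence[float]) -> float:
    """FLF in [-1, 1] using quartile price classes (low <= Q1, high > Q3)."""
    q1, _, q3 = quantiles(price, n=4, method="inclusive")
    low = sum(c for c, p in zip(consumption, price) if p <= q1)
    high = sum(c for c, p in zip(consumption, price) if p > q3)
    return (low - high) / (low + high) if (low + high) else 0.0


def flexibility_index(consumption_flex: Sequence[float],
                      consumption_ref: Sequence[float],
                      price: Sequence[float]) -> float:
    """FLI: relative cost saving of the flexible scenario against the reference."""
    cost_flex = sum(c * p for c, p in zip(consumption_flex, price))
    cost_ref = sum(c * p for c, p in zip(consumption_ref, price))
    return 1.0 - cost_flex / cost_ref


# Hypothetical eight-interval day: same total energy, shifted toward cheap hours.
price = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
reference = [1.0] * 8
flexible = [2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]

print(flexibility_factor(reference, price), flexibility_factor(flexible, price))
print(flexibility_index(flexible, reference, price))
print(grid_interaction_index([2.0, -1.0, 0.0, 4.0]))
```

In this example the reference profile is balanced between cheap and expensive intervals, while the flexible profile consumes only in the cheap ones, so the flexibility factor moves from 0 toward 1 and the flexibility index becomes positive.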

 

 

3.2 Social and Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Assessment of Smart Urban and Industrial Infrastructures

 

The set of Key Performance Indicators (KPIs) introduced here underpins the digital twin and metaverse software platform proposed in the abstract, since it enables the assessment and optimization of Smart Urban and Industrial Infrastructure (Dovolil & Svítek, 2024; Barykin et al., 2023). The KPIs translate complex environmental phenomena into measurable values that can be processed, simulated, and optimized in real time (Englezos et al., 2022; Hadjidemetriou et al., 2023). This integration is consistent with the ESG (Environmental, Social, and Governance) evaluation framework and targets both its Environmental and Social factors (Shaharuddin et al., 2022). By focusing on KPIs for indoor environmental quality, energy efficiency, and user comfort, the platform supports evidence-based optimization of sustainable design, preventive maintenance, and energy-efficient building operation (Yitmen et al., 2025). Humidity (HUM) is an important KPI for assessing indoor environmental quality. It measures the water vapor content of the air as a percentage of the maximum the air can hold at a given temperature, i.e., relative humidity (RH). Keeping humidity within its optimal range (40% to 60%) is critical for health and comfort: low humidity can cause irritation and static electrical charges, whereas excess humidity can promote mold growth and material degradation. When RH measurement is built into the digital twin, algorithms can automatically regulate Heating, Ventilation, and Air Conditioning (HVAC) operation and, through forecast models, optimize ventilation (Lo, 2025). 
The result is thermal and hygrometric comfort achieved with lower energy use, which links HUM directly to both social well-being and environmental performance. Particulate Matter (PM10 and PM2.5) is another important environmental parameter. This KPI measures the airborne concentration of particles that can cause serious health problems, particularly in densely populated and industrial regions; continuous exposure is associated with heart and pulmonary diseases. In buildings, PM measurement is used to assess the effectiveness of ventilation systems and to identify pollution sources. Integrating PM values into the proposed system supports the ESG “Social” perspective by protecting occupants' health, and the “Environmental” perspective by promoting cleaner, more efficient air circulation (Saleh et al., 2025; Ariansyah et al., 2023). Volatile Organic Compound (VOC) concentration measures air pollution from harmful gases such as benzene, formaldehyde, and toluene, which are released by construction materials, cleaning agents, and interior furnishings. VOCs can significantly affect indoor air quality, comfort, and health, and it is recommended that concentrations not exceed 300 ppb in order to meet health standards. Measuring VOC concentration in the digital twin enables real-time responses: facility managers can trace the source, adjust ventilation rates, or switch to low-emitting materials (Yitmen et al., 2025; Venkateswarlu & Sathiyamoorthy, 2025). 
This preventive strategy improves indoor environmental quality and supports the ESG goal of “Social Sustainability” by reducing health risks and increasing occupant satisfaction. Air Changes per Hour (ACH), denoted “ACC” in this framework, measures how many times per hour the total air volume of an indoor space is replaced. A rate of 3 to 5 air changes per hour generally ensures adequate ventilation for residential and office buildings. Continuously measuring and recalculating ACC in the digital twin allows facility managers to adjust ventilation dynamically, ensuring safe, healthy air while conserving energy (Hadjidemetriou et al., 2023). The ACC KPI therefore has both social and environmental relevance for ESG: it offers “Social Benefits” by ensuring healthy ventilation for occupants, and “Environmental Benefits” by conserving energy through systematically adjusted ventilation rates (Hadjidemetriou et al., 2023). The thermal insulation rate (R-value) estimates the Thermal Resistance Capacity (TRC) of construction materials, i.e., how little heat is conducted through them; higher values mean greater energy conservation. Better insulation reduces heating and cooling loads, which serves the ESG environmental aspect by reducing emissions from energy use and the social aspect by maintaining comfortable temperatures without increasing costs (Englezos et al., 2022). The Sound Insulation Index (SND) rates the sound insulation properties of building elements such as walls, windows, and floors. Noise pollution is increasingly recognized for its impact on both mental and physical health. 
Measuring sound insulation inside buildings helps stakeholders assess acoustic comfort, particularly in densely populated urban areas. This KPI strengthens social sustainability by fostering environments suited to concentration, rest, and quality of life (Lo, 2025). The energy-use KPIs, namely the Energy Efficiency Ratio (EER), the Coefficient of Performance (COP), and System Efficiency (SEF), rate how effectively energy inputs are converted into energy services. They are particularly important for rating chiller and heater performance in cooling and heating, respectively. Higher values indicate more useful output per unit of power consumed, and tracking them improves the digital twin's ability to detect inefficiencies, predict system degradation, and schedule preventive maintenance (Venkateswarlu & Sathiyamoorthy, 2025). This supports the “Environmental” and “Economic” ESG spheres and, through affordability, the “Social” sphere as well. Energy Use Intensity (EUI) and Lighting Power Density (LPD) rate lighting energy use: EUI relates lighting energy consumption to the expected number of occupants, giving a per capita figure, while LPD relates lighting power to the unit of floor area. Together they give deeper insight into the relationship between space, users, and energy use. 
Feeding these data into a digital twin enables various analyses, including simulation of different occupancy scenarios, optimization of lighting schedules, and adoption of intelligent lighting systems that adjust dynamically to user behavior (Yitmen et al., 2025). Such enhancements reduce energy losses and operating costs, in line with the ESG framework from both the environmental and the social perspective, given their benefits for well-being and resource distribution. Overall, integrating these KPIs into a digital twin and metaverse system provides a comprehensive framework for measuring, simulating, and improving sustainability and energy performance across infrastructures in both urban and industrial settings. Each KPI contributes to improving environmental, energy, or human comfort factors. Continuous monitoring of these parameters allows a shift from a reactive governance model to a predictive one, in which interventions respond to real-time conditions rather than fixed rules, consistent with the ESG model's focus on innovation tied to sustainability and inclusion.
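As a rough illustration of how a digital twin could act on the ACH indicator discussed above, the sketch below computes air changes per hour as airflow divided by room volume and derives the airflow needed to return to the 3–5 range cited in the text. The function names and the sample room are hypothetical; the airflow-over-volume definition is the usual ventilation formula and is assumed here rather than taken from the authors' implementation.

```python
"""Minimal ACH sketch; names and sample values are illustrative assumptions."""

ACH_MIN, ACH_MAX = 3.0, 5.0  # adequate range cited in the text


def air_changes_per_hour(airflow_m3_h: float, volume_m3: float) -> float:
    """ACH [1/h] = airflow rate [m3/h] / indoor volume [m3]."""
    if volume_m3 <= 0:
        raise ValueError("volume must be positive")
    return airflow_m3_h / volume_m3


def target_airflow(airflow_m3_h: float, volume_m3: float) -> float:
    """Return the airflow [m3/h] that brings ACH back inside the cited range.

    The current airflow is returned unchanged when ACH is already adequate.
    """
    ach = air_changes_per_hour(airflow_m3_h, volume_m3)
    if ach < ACH_MIN:
        return ACH_MIN * volume_m3
    if ach > ACH_MAX:
        return ACH_MAX * volume_m3
    return airflow_m3_h


if __name__ == "__main__":
    # Hypothetical 250 m3 office supplied with 500 m3/h of fresh air.
    print(air_changes_per_hour(500.0, 250.0))
    print(target_airflow(500.0, 250.0))
```

Clamping to the nearest bound is only one possible control policy; a real controller would also weigh occupancy and energy cost.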

 

 

Table xyz. Social Key Performance Indicators (KPIs) for Indoor Environmental Quality and Energy Efficiency Assessment

| KPI | Acronym | Description | Formula | UoM |
|---|---|---|---|---|
| Relative Humidity | HUM | Indicates the amount of water vapor in the air relative to the maximum that can be contained at the same temperature. The optimal relative humidity (RH) range for occupant comfort and health is between 40% and 60% [15]. | Variables: water vapor pressure [Pa]; saturation vapor pressure [Pa] | % |
| PM Concentration (Particulate Matter, PM10 and PM2.5) | PM10 and PM2.5 | Measures the amount of suspended particles (particulate matter) in the air, typically expressed in micrograms per cubic meter (µg/m³). PM2.5 refers to particles with a diameter smaller than 2.5 micrometers, while PM10 refers to particles smaller than 10 micrometers. Recommended long-term health thresholds are PM2.5 < 20 µg/m³ and PM10 < 50 µg/m³ [16]. | Variables: mass of particulate matter [µg]; volume of air [m³] | µg/m³ |
| Volatile Organic Compounds | VOC | Establishes the concentration of VOCs, such as benzene, formaldehyde, and other potentially harmful gases. Elevated VOC levels can cause discomfort and health issues in occupants. The indicated threshold is < 300 ppb [17]. | Variables: VOC concentration [mg/m³]; molar mass of the VOC [g/mol]; molar volume, generally taken as 24.45 L/mol (the value at 25 °C and 1 atm) | ppb |
| Air Changes per Hour | ACH | Indicates the number of times the air within a space is completely renewed in one hour. An air change rate between 3–5 ACH is considered adequate for residential buildings or office environments [18]. | Variables: airflow rate [m³/h]; volume of the indoor space [m³] | 1/h |
| Thermal Insulation Rate | THR | Determines the thermal resistance of insulating materials, indicating how effectively they prevent heat loss. A higher R-value indicates better insulation performance [19]. | Variables: material thickness [m]; λ = thermal conductivity of the material [W/m·K] | m²·K/W |
| Sound Insulation Index | SND | Evaluates the effectiveness of a building element in reducing sound transmission between two different spaces. It is defined as the difference between the incident sound pressure level on a surface and the transmitted sound pressure level through it. A higher R value indicates that walls, floors, or windows are more effective at blocking sound [20]. | Variables: incident sound pressure level [dB]; transmitted sound pressure level [dB]; equivalent absorption area [m²]; separating surface area [m²] | dB |
| Energy Efficiency Ratio | EER | Measures the efficiency of an air conditioning system (air conditioners or cooling units). A higher EER indicates that the air conditioning system provides more cooling output for each unit of energy consumed, making it more efficient. If EER ≥ 12, the system is considered efficient [21]. | Variables: total cooling capacity provided by the system [kW]; electrical power input consumed by the system [kW] | - |
| Coefficient of Performance | COP | An indicator similar to the EER that can be used to evaluate efficiency in both cooling and heating modes. It is commonly applied to heat pumps. A higher COP indicates that the system can produce a greater amount of useful energy (heating or cooling) for each unit of electrical energy consumed. If COP ≥ 3.5, the system is considered efficient [22]. | Variables: heating or cooling capacity provided by the system [kW]; electrical input power consumed by the system [kW] | - |
| System Efficiency η | SEF | Measures how much of the energy used by the system is effectively converted into useful heating or cooling. A high system efficiency means that a large portion of the consumed energy is actually transformed into useful thermal energy, minimizing losses. If η ≥ 85%, the system is considered efficient [23]. | Variables: useful energy delivered (cooling or heating capacity) [kWh]; total energy consumed (including system losses and auxiliary consumption) [kWh] | - |
| Energy Use Intensity based on people count | EUI | Measures the energy consumption for lighting relative to the number of occupants in the building, reflecting energy efficiency in terms of per capita usage. A high EUI indicates higher energy consumption for lighting per person, suggesting a lack of optimization. Optimal values: EUI < 15 kWh/person/year [23]. | Variables: energy consumed for lighting [kWh]; number of occupants in the building; duration of lighting usage [year] | kWh/person/year |
| Lighting Power Density per floor area | LPD | Determines the power consumed by lighting per unit of floor area. It serves as an indicator of lighting efficiency in relation to the utilized space. A high LPD indicates greater power consumption per unit area, suggesting inefficient lighting design. Optimal values: LPD < 10 W/m² [23]. | Variables: power used for lighting [kW]; illuminated indoor area [m²] | kW/m² |

Note. This table summarizes the Social and Environmental Key Performance Indicators (KPIs) used to assess indoor environmental quality, user comfort, and energy efficiency in smart infrastructures. Each KPI is defined by its acronym, description, and calculation formula, providing measurable parameters that support ESG-oriented evaluation and digital twin integration.
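Because the formula cells of the table did not survive extraction, the following sketch shows one plausible reading of the listed variables for most of the indicators. Every function name and every formula here is an assumption inferred from the variable definitions and units, not the authors' notation; the Sound Insulation Index is left out because its exact form cannot be recovered from the variable list alone.

```python
"""Sketch of the Social KPI formulas; formulas and names are assumptions."""

# Molar volume quoted in the table; 24.45 L/mol corresponds to 25 degC, 1 atm.
MOLAR_VOLUME_L_PER_MOL = 24.45


def relative_humidity(vapor_pressure_pa, saturation_pressure_pa):
    """HUM [%] = 100 * water vapor pressure / saturation vapor pressure."""
    return 100.0 * vapor_pressure_pa / saturation_pressure_pa


def pm_concentration(mass_ug, air_volume_m3):
    """PM [ug/m3] = particulate mass / sampled air volume."""
    return mass_ug / air_volume_m3


def voc_ppb(conc_mg_m3, molar_mass_g_mol):
    """VOC [ppb] from a mass concentration: 1000 * C * Vm / M."""
    return 1000.0 * conc_mg_m3 * MOLAR_VOLUME_L_PER_MOL / molar_mass_g_mol


def r_value(thickness_m, conductivity_w_mk):
    """THR [m2.K/W] = material thickness / thermal conductivity."""
    return thickness_m / conductivity_w_mk


def efficiency_ratio(useful_kw, electrical_kw):
    """EER or COP [-] = delivered capacity / electrical power input."""
    return useful_kw / electrical_kw


def system_efficiency(useful_kwh, consumed_kwh):
    """SEF [%] = 100 * useful energy delivered / total energy consumed."""
    return 100.0 * useful_kwh / consumed_kwh


def lighting_eui(energy_kwh, occupants, years):
    """EUI [kWh/person/year] = lighting energy / (occupants * duration)."""
    return energy_kwh / (occupants * years)


def lighting_power_density(power_kw, area_m2):
    """LPD [kW/m2] = lighting power / illuminated floor area."""
    return power_kw / area_m2


if __name__ == "__main__":
    # Hypothetical readings: 0.1 mg/m3 of formaldehyde (M = 30.03 g/mol).
    print(round(voc_ppb(0.1, 30.03), 2))
    print(lighting_eui(6000.0, 400, 1.0))
```

Keeping each indicator as a small pure function makes it straightforward to recompute a KPI whenever a sensor value changes in the twin.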

 

3.3 Governance Key Performance Indicators (KPIs) for ESG Evaluation in Digital Twin and Metaverse Applications

 

The selected Key Performance Indicators (KPIs) provide an integrated framework for evaluating the ESG performance of smart infrastructure, specifically for digital twin and metaverse applications in the management of urban and industrial environments. Each KPI links technological innovation to sustainability, enabling real-time analysis and optimization of energy use, expenditure, and social impacts. Taken together, the KPIs offer a holistic view of efficiency and equity, so that infrastructure development combines technological innovation, sound ecology, and support for social justice. These KPIs are significant within the ESG framework because they directly cover the environmental and economic perspectives and relate indirectly to Governance, largely through interactions, accountability, and shared decision-making (Wu et al., 2022; Zhang, 2025). The Cost of Energy Saving (CES) is the single most important KPI in this group, since it estimates the financial cost of each unit of energy saved through efficiency measures. It supports the evaluation of the cost-effectiveness and investment-to-benefit ratio of environmental interventions, leading to environmentally viable energy conversion (Dovolil & Svítek, 2024). CES is clearly relevant to the ESG environmental domain, where it helps establish cost-optimal strategies for reducing energy waste and emissions, and it also has implications for Governance, as it supports financial accountability and forward-looking planning of financial resources. The Energy Return on Investment (EROI) is another highly important KPI, calculated as the ratio of the energy output of a system to the energy invested in it. 
EROI has important implications for the ESG environmental domain: the higher the EROI, the more the system's energy output exceeds the energy consumed (Hämäläinen, 2020), which points to better use of energy resources and more sustainable energy. The KPI supports the environmental dimension by enabling transparent evaluation of energy system efficiency and by informing strategic decisions that maximize energy output without harmful resource depletion (de Trizio et al., 2024). The Energy Payback Time (EPBT) complements EROI, as it describes the time a system needs to recover the energy invested in its construction, setup, and maintenance. From an ESG perspective, EPBT plays a crucial role in evaluating the life-cycle sustainability of energy systems (Hu, 2023). In the digital twin environment, EPBT helps evaluate simulation scenarios and establish the sustainability level of different energy technologies, thereby reinforcing the use of transparent data, an important consideration for the governance process in ESG modeling. The Cost of Peak Demand (CPD) measures the cost of peak electricity demand over a given period. CPD is critical for both environmental and economic sustainability, since reducing peak loads relieves energy networks and avoids the need to generate additional energy from fossil fuels, which carry higher emissions (Aghazadeh Ardebili et al., 2025). The Cumulative Cash Flow (CCF) criterion combines financial and environmental considerations, as it evaluates the total cash flow of an energy project against its investment costs. In ESG analysis it supports governance by using financial criteria to express financial transparency and assess future risk (Hien & Hanh, 2024). 
A positive cumulative cash flow is an important signal, as it indicates that investment in a project yields resource savings and sustainability gains in addition to financial benefits. The Share of Project Cost Subsidized (SPC) measures the extent to which a project is financed through grants. This criterion has a dual ESG role: it captures the financial attractiveness of investing in sustainable projects and the social benefit of making sustainable technology accessible to small players from developing communities (Wu et al., 2022). Renewable Energy Use (REU) is an essential ESG criterion that estimates the share of energy derived from renewable sources. A high value signals a strong commitment to sustainability, whereas a low value reflects continued reliance on fossil fuels (Becattini et al., 2024). Digital twin technology is critical here, as it supports monitoring of energy use across different scenarios and thus the assessment of sustainable energy use (Wei, 2023). The Energy Use per Worker Hour (EPWH) relates energy use to labor input and can be read in two ways across different labor productivity scenarios (Zhang et al., 2023). Socially, it signifies environmentally responsible production that does not strain human resources through energy-intensive processes. On a digital twin platform, EPWH supports modeling of the balance between workforce and energy use, as well as effective energy use in labor-intensive industries (Englezos et al., 2022). Taken together, these KPIs form a sound analytical framework for a comprehensive digital twin model that translates complex sustainable-production objectives into specific, quantified, and tractable information. 
These indicators strengthen the proposed digital twin framework's capacity for real-time activity monitoring and, through simulation, for forecasting future ESG performance. By incorporating them, the platform achieves a balanced, integrated, and holistic approach to ESG responsibility: environmentally responsible operation (EROI, REU, EPBT) for low-cost energy use, economic soundness (CES, CCF, CPD, SPC) for sustainable economic growth, and social responsibility (EPWH) for fair social outcomes.
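The two energy-balance indicators described above can be illustrated with a short sketch. It assumes the simplest reading of the text: EROI as total energy delivered over total energy invested (with any quality weighting set to 1) and EPBT as invested energy over annual output. The function names and the sample system are hypothetical.

```python
"""EROI and EPBT sketch; unweighted formulas are a simplifying assumption."""


def eroi(energy_out_kwh, energy_in_kwh):
    """EROI [-] = sum of energy outputs / sum of energy inputs."""
    total_in = sum(energy_in_kwh)
    if total_in <= 0:
        raise ValueError("invested energy must be positive")
    return sum(energy_out_kwh) / total_in


def epbt_years(invested_kwh, annual_output_kwh):
    """EPBT [years] = invested energy / energy produced per year."""
    if annual_output_kwh <= 0:
        raise ValueError("annual output must be positive")
    return invested_kwh / annual_output_kwh


if __name__ == "__main__":
    # Hypothetical system: 12,000 kWh invested, 3,000 kWh/year for 25 years.
    lifetime_output = 3000.0 * 25
    print(eroi([lifetime_output], [12000.0]))
    print(epbt_years(12000.0, 3000.0))
```

In this example the two indicators tell the same story from different angles: the system repays its energy in a fraction of its lifetime, so its EROI is well above 1.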

 

Table xyz. Governance Key Performance Indicators (KPIs) for ESG Evaluation within Digital Twin Frameworks

| KPI | Acronym | Description | Formula | UoM |
|---|---|---|---|---|
| Cost of Energy Saving | CES | Measures the cost associated with energy savings achieved through energy efficiency interventions. This parameter is particularly useful for comparing different investment options in terms of efficiency, as it estimates how much it costs to save one unit of energy (e.g., 1 kWh) through technological or operational measures. The CES formula calculates the total cost of energy savings and divides it by the amount of energy saved, accounting for system inefficiencies. A higher CES indicates a greater cost per unit of energy saved, suggesting that the intervention may be less cost-effective than the alternatives. Conversely, a lower CES means a lower cost per unit of energy saved, making the energy efficiency measure more economically advantageous [24]. | Variables: change in initial investment, i.e., the capital required to implement the energy efficiency measure [€]; change in operating costs, including operation and maintenance of the measure [€]; energy price, i.e., the cost per unit of energy, which can influence the savings achieved [€/kWh]; change in energy consumption, i.e., the energy saved as a result of the intervention [kWh]; energy loss (or efficiency) factor for losses during the energy use process, such as heat losses or other system inefficiencies [-]; Capital Recovery Factor, used to annualize the investment cost and determine how much an investment must generate each year to be recovered over time [-], which depends on the interest rate [-] and the amortization period [years] | €/kWh |
| Energy Return on Investment | EROI | Evaluates the energy efficiency of a production source by measuring how much energy is obtained compared to how much energy is invested to produce it. It is a key indicator of energy sustainability: the higher the EROI, the more efficient the system. If EROI > 1, the energy process is sustainable, as the energy produced exceeds the energy invested. If EROI = 1, the energy produced is exactly equal to the energy invested, meaning the system is at the limit of sustainability and produces no usable net energy. If EROI < 1, the system is inefficient, since it requires more energy than it generates; such a process is neither economically nor energetically sustainable in the long term. This indicator answers the question: “How efficient is the energy investment?” [25]. | Variables: total outgoing or produced energy from process i, for example the electricity generated by a power plant or the fuel produced by a refinery [kWh]; total incoming or consumed energy for process j, for example the energy required to extract, transform, or transport the energy source [kWh]; two scaling factors that can represent the quality of energy, for instance to assign greater or lesser importance to certain forms of energy or technologies [-] | - |
| Energy Payback Time | EPBT | Measures the time required for an energy system to produce the same amount of energy that was needed to build, install, and maintain it. If EPBT is high, it takes longer for the system to return the energy invested. Conversely, if EPBT is low, the energy system quickly recovers the energy used for its construction and startup. It answers the question: “How long does it take for the system to repay the energy invested?” [26]. | Variables: total invested energy required to build, install, maintain, and decommission the energy system throughout its life cycle [kWh]; energy that the system can produce annually once operational [kWh/year] | year |
| Cost of Peak Demand | CPD | Measures the cost associated with the peak electricity demand over a given period. A lower CPD is desirable, as it indicates effective management and reduced exposure to energy costs [27]. | Variables: maximum power demand during a given period [kW]; cost associated with each unit of power [€/kW] | € |
| Cumulative Cash Flow | CCF | Measures the total cash flow generated by the project in relation to the initial investment. The CCF is useful for investors and decision-makers, as it helps assess a project's profitability, compare different investments, and plan future financial needs and returns on investment. A CCF > 0 indicates that the project is generating more cash flow than the costs incurred, while a CCF < 0 indicates a loss [24]. | Variables: Final Energy Savings in period k, achieved through energy efficiency measures or other strategies [kWh]; Energy Carrier Cost, i.e., the cost of energy per unit during period k, such as electricity or gas [€/kWh]; Technical Life, i.e., the project period during which energy savings and economic benefits are expected [years]; Investment Cost, including all expenses necessary to implement the project, such as installation, equipment, and other preliminary costs [€] | € |
| Share of Project Cost Subsidized | SPC | Indicates the proportion of the total project cost that has been financed through grants. A high SPC means that a significant portion of the project has been funded through external aid, while a low SPC suggests that the project has been mainly self-financed. SPC = 0% when no grants have been received (RS = 0), meaning no part of the project costs is subsidized. SPC = 100% when the entire project cost is covered by grants (RS = IC). Range: 0% ≤ SPC ≤ 100% [28]. | Variables: RS = Received Subsidies, the total amount of grants or funding received for the project [€]; IC = Investment Cost, the total investment cost [€] | % |
| Renewable Energy Use | REU | Provides a measure of the proportion of final energy savings that comes from renewable sources compared to all energy sources used. It is useful for energy policies and environmental assessments, as it helps quantify and compare the impact of different energy sources on overall sustainability and efficiency. A higher REU indicates greater use of renewable energy, while a lower REU suggests a higher dependence on fossil fuels. Range: 0% ≤ REU ≤ 100% [28]. | Variables: Final Energy Savings for each energy source k [kWh]; Conversion Factor for each energy source k, used to convert the saved energy into a common unit [-]; Renewable Energy Source factor for each energy source k [-], equal to 0 for fossil fuels, 1 for renewable sources such as biomass, wind, and solar, and a value between 0 and 1 for mixed sources, such as industrial waste or end-of-life tires, depending on the sustainability level of the source | % |
| Energy Use per Worker-Hour | EPWH | Measures the total energy used by a production system in relation to the number of human resources and working time. It calculates the energy used per working hour, taking into account the total supplied energy minus the imported one, and normalizing the result by the number of workers and the annual working hours. This indicator is useful for evaluating the energy efficiency of an organization or an entire economy, allowing comparisons over time or between sectors or countries. A low EPWH is considered positive, as it indicates higher productivity with lower energy use. Conversely, a high EPWH may indicate energy inefficiency, potentially linked to poorly optimized production processes, outdated machinery, or energy-intensive technologies [29]. | Variables: Total Primary Energy Supply, including all available energy sources [kWh]; population number, i.e., the total number of individuals within the studied population; total number of working hours per person per year [hours/year]; Industrial Primary Energy Supply, the portion of TPES used in the industrial sector [kWh]; Industrial Final Consumption, the final energy consumption of the industrial sector [MWh]; Total Final Consumption within a given economic system, including the industrial, residential, tertiary, and transport sectors [MWh] | MJ / (inhabitant · hour/year) |

Note: This table summarizes the Governance Key Performance Indicators (KPIs) used for ESG evaluation within digital twin frameworks. The listed indicators quantify economic efficiency, financial accountability, and strategic resource management, enabling transparent decision-making and long-term sustainability assessment. These variables collectively support the “Governance” dimension of ESG by linking economic performance with responsible investment, policy transparency, and data-driven management.
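The formula cells of this table were also lost, so the sketch below reconstructs the indicators whose form follows fairly directly from the listed variables. These are assumed readings, not the authors' equations: the capital recovery factor uses the standard annuity expression, the cash flow is undiscounted, and all function names are hypothetical. CES and EPWH are omitted because their exact structure cannot be inferred from the variable list.

```python
"""Governance KPI sketch; every formula here is an assumed reading."""


def capital_recovery_factor(rate, years):
    """CRF = i(1+i)^n / ((1+i)^n - 1), the standard annuity factor."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def cost_of_peak_demand(peak_kw, unit_cost_eur_per_kw):
    """CPD [EUR] = maximum power demand * cost per unit of power."""
    return peak_kw * unit_cost_eur_per_kw


def cumulative_cash_flow(savings_kwh, carrier_cost_eur_kwh, investment_eur):
    """CCF [EUR] = sum over periods of savings * energy cost, minus investment."""
    inflow = sum(s * c for s, c in zip(savings_kwh, carrier_cost_eur_kwh))
    return inflow - investment_eur


def share_subsidized(subsidies_eur, investment_eur):
    """SPC [%] = 100 * received subsidies / investment cost."""
    return 100.0 * subsidies_eur / investment_eur


def renewable_energy_use(sources):
    """REU [%] from (savings_kwh, conversion_factor, renewable_factor) tuples."""
    weighted = sum(fes * cf * res for fes, cf, res in sources)
    total = sum(fes * cf for fes, cf, _ in sources)
    return 100.0 * weighted / total


if __name__ == "__main__":
    # Hypothetical project: 5% interest over 10 years, 35% of cost subsidized.
    print(round(capital_recovery_factor(0.05, 10), 4))
    print(share_subsidized(35000.0, 100000.0))
```

A discounted variant of the cash flow would be a natural extension, but the table gives no discount rate for CCF, so none is assumed here.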

 

In addition to the KPIs listed above, the following variables are recorded to describe the context of the system and to make normalization easier:

  • Area (Area_m² – AREA): The total floor space considered for the energy and environmental indices of the building or infrastructure, expressed in square meters.
  • Energy Consumption (Energy_Consumption_kWh – ENCO): The total energy consumption during the period under review, expressed in kilowatt-hours. It is the basic quantity from which comparative energy performance indicators are derived.
  • Occupants (OCC): The number of people using or occupying a given space. It enables per capita calculations of energy use and environmental factors.

These factors establish highly important normalizing variables, enabling true comparability of performance across different buildings, facilities, and circumstances, thereby enhancing the robustness of the entire KPI system.
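A minimal sketch of the normalization described above: dividing energy consumption by floor area and by occupants puts buildings of different size on a comparable footing. The record type, field names, and the sample building are hypothetical and serve only to illustrate the arithmetic.

```python
"""Normalization sketch using AREA, ENCO and OCC; names are illustrative."""
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildingRecord:
    area_m2: float      # AREA
    energy_kwh: float   # ENCO over the review period
    occupants: int      # OCC

    def energy_per_m2(self) -> float:
        """Energy consumption normalized by floor area [kWh/m2]."""
        return self.energy_kwh / self.area_m2

    def energy_per_occupant(self) -> float:
        """Energy consumption normalized by occupants [kWh/person]."""
        return self.energy_kwh / self.occupants


if __name__ == "__main__":
    # Hypothetical building, not taken from the dataset.
    office = BuildingRecord(area_m2=10000.0, energy_kwh=1000000.0, occupants=400)
    print(office.energy_per_m2())
    print(office.energy_per_occupant())
```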

 

  1. Descriptive Statistical Analysis of the KPI Dataset for the Validation of a Digital Twin and Metaverse Prototype for Smart Buildings

 

The descriptive statistical analysis of the dataset highlights the complexity and diversity of the Key Performance Indicators (KPIs) used to evaluate the environmental, energy, and operational performance of smart buildings and infrastructures. This is consistent with existing studies that emphasize the importance of KPI frameworks for optimized building management (Faria et al., 2021; Alrashed, 2020). The average surface area (AREA) of the sites analyzed is about 9,637 m², with considerable variability (SD above 5,200 m²), indicating that small buildings coexist with much larger ones, including structures exceeding 19,000 m². Energy consumption (ENCO) averages about 981,000 kWh, again with considerable variation, indicating that the dataset includes both energy-intensive and optimized buildings (Bandoria et al., 2024; Koutras et al., 2023). The Carbon Footprint (CFPT) averages 296 tCO₂e, a considerable level of emissions that is reasonable given the size of the buildings in the dataset. The Emission Intensity (EMIN), at 0.081 tCO₂/kWh, indicates optimized energy use with lower environmental impact per unit of energy consumed, in line with energy optimization strategies for smart infrastructure (Ho et al., 2021). The average energy coverage factors, Load Cover Factor (LCF) and Supply Cover Factor (SCF), are 0.811 and 0.814, respectively, indicating that approximately 80% of the energy can be covered through on-site production or utilization. 
The Load Matching Index (LMI), averaging 71.7%, indicates good synchronization between energy production and demand, while the On-site Energy Ratio (OER), averaging 0.75, reflects considerable on-site energy production and therefore a strong degree of autonomy (Mustapha et al., 2025; Kumar et al., 2024). The Grid Interaction Index (GII) and the No Grid Interaction Probability (NGI) average 47% and 0.47, respectively, indicating a balance between energy autonomy and grid interaction. The capacity and operation indices give a less clear picture of facility performance. The Capacity Factor (CAF), with an average of 0.54, shows that slightly more than half of the installed capacity is actually used, while One Percent Peak Power (OPP) averages 584 kW, indicating periods of significant peak load. The flexibility and demand response factors (DRS, FLF, FLI, FEE) indicate a mid-level degree of flexibility. The Demand Response (DRS) factor averages 9%, showing some capacity for load reduction or time shifting, whereas the Flexible Energy Efficiency (FEE) factor averages about 49%, showing scope for improvement in dynamic energy use (Romanska-Zapala et al., 2020). As for environmental and comfort factors, indoor conditions are stable and acceptable. The average humidity (HUM) is 49%, well inside the comfort range. The average concentrations of PM₂.₅ and PM₁₀ (11.2 µg/m³ and 24.6 µg/m³) are lower than the World Health Organization's requirements, confirming that indoor air quality is satisfactory (Hakawati et al., 2024). 
Volatile Organic Compound (VOC) concentration, averaging 186 ppb, shows significant variability, which can be influenced by building materials and by the effectiveness and rate of ventilation. The average air change rate (ACH) is 4, confirming that the recommended rates for non-industrial buildings are met (Mustapha et al., 2025). Thermal and acoustic performance is also acceptable, with an average Thermal Insulation Rate (THR) of 2.93 m²·K/W and an average Sound Insulation Index (SND) of 43 dB, indicating well-insulated and acoustically comfortable environments (Mustapha et al., 2025). Regarding the energy subsystems, EER, COP, and SEF average 10.3, 2.86, and 87.5%, respectively: SEF exceeds the 85% efficiency threshold, while EER and COP remain below the thresholds of 12 and 3.5 listed in the Social KPI table. The average Energy Use Intensity per person (EUI) is 16.9 kWh/person/year, slightly above the 15 kWh/person/year optimum, and the average Lighting Power Density (LPD) is 0.008 kW/m² (8 W/m²), below the 10 W/m² limit (Arias-Requejo et al., 2023). From an economic perspective, variability is greater. The Cost of Energy Saving (CES) averages 11.45 €/kWh and the Energy Return on Investment (EROI) averages 14.79, a balanced picture on average, albeit with considerable variability. The average Energy Payback Time (EPBT) is 4.9 years, an acceptable energy recovery time (Hakawati et al., 2023). The average Cumulative Cash Flow (CCF) is negative, indicating that projects on average do not fully recover their costs, while the Share of Project Cost Subsidized (SPC) averages 35%, indicating substantial subsidization, largely financial in nature. Renewable Energy Use (REU) averages 64%, indicating strong integration of clean energy, while Energy Use per Worker Hour (EPWH) averages about 39.8 MJ, indicating that average energy productivity can still improve (Kumar et al., 2024).
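One way to read the reported averages against the thresholds stated in the Social KPI table is a simple screening pass, sketched below. The mean values and thresholds are copied from this document; the dictionary layout, the predicate form, and the function names are illustrative assumptions. Screening a mean says nothing about individual buildings, which may sit on either side of a threshold.

```python
"""Screen the reported dataset means against the stated KPI thresholds."""

# Means from the descriptive statistics table. LPD is converted from
# 0.008 kW/m2 to W/m2 so that it matches the unit of its threshold.
MEANS = {
    "HUM": 49.463, "PM25": 11.233, "PM10": 24.617, "VOC": 186.01,
    "ACH": 4.051, "EER": 10.34, "COP": 2.857, "SEF": 87.511,
    "EUI": 16.932, "LPD": 0.008 * 1000,
}

# Thresholds as listed in the Social KPI table.
CHECKS = {
    "HUM": lambda v: 40 <= v <= 60,   # % relative humidity
    "PM25": lambda v: v < 20,         # ug/m3
    "PM10": lambda v: v < 50,         # ug/m3
    "VOC": lambda v: v < 300,         # ppb
    "ACH": lambda v: 3 <= v <= 5,     # 1/h
    "EER": lambda v: v >= 12,
    "COP": lambda v: v >= 3.5,
    "SEF": lambda v: v >= 85,         # %
    "EUI": lambda v: v < 15,          # kWh/person/year
    "LPD": lambda v: v < 10,          # W/m2
}


def screen(means):
    """Return {kpi: bool} for every KPI that has a stated threshold."""
    return {k: CHECKS[k](v) for k, v in means.items() if k in CHECKS}


def outside_threshold(means):
    """List, alphabetically, the KPIs whose mean misses its threshold."""
    return sorted(k for k, ok in screen(means).items() if not ok)


if __name__ == "__main__":
    print(outside_threshold(MEANS))
```

On the reported means, the comfort and air-quality indicators pass, while the mean EER, COP, and EUI fall outside the thresholds given in the table.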

 

Table xyz. Descriptive Statistics of the KPI Dataset for the Validation of a Digital Twin and Metaverse Prototype Applied to Smart Buildings.

| Variable | Obs | Mean | Std_Dev | Min | Max | p1 | p99 | Skew | Kurt |
|---|---|---|---|---|---|---|---|---|---|
| AREA | 100 | 9637.3 | 5249.252 | 1161 | 19942 | 1175 | 19694 | 0.167 | 1.959 |
| ENCO | 100 | 981000 | 562000 | 63556.65 | 1970000 | 72951.46 | 1960000 | 0.11 | 1.79 |
| CFPT | 100 | 295.725 | 130.658 | 52.28 | 495.52 | 53.21 | 491.685 | -0.275 | 1.887 |
| EMIN | 100 | 0.081 | 0.039 | 0.022 | 0.149 | 0.022 | 0.148 | -0.017 | 1.765 |
| LCF | 100 | 0.811 | 0.125 | 0.604 | 0.997 | 0.604 | 0.996 | -0.146 | 1.722 |
| SCF | 100 | 0.814 | 0.119 | 0.606 | 1 | 0.609 | 1 | -0.069 | 1.784 |
| LMI | 100 | 71.682 | 13.711 | 51 | 99.33 | 51.415 | 99.225 | 0.383 | 1.981 |
| OER | 100 | 0.753 | 0.25 | 0.33 | 1.191 | 0.339 | 1.18 | 0.035 | 1.729 |
| GII | 100 | 47.038 | 29.213 | 0.46 | 99.69 | 0.885 | 99.085 | 0.104 | 1.86 |
| NGI | 100 | 0.469 | 0.281 | 0.011 | 0.984 | 0.012 | 0.966 | 0.076 | 1.823 |
| CAF | 100 | 0.541 | 0.312 | 0.018 | 0.998 | 0.019 | 0.995 | -0.123 | 1.666 |
| OPP | 100 | 584.406 | 263.627 | 105.75 | 995.42 | 116.035 | 989.245 | -0.254 | 1.724 |
| DRS | 100 | 9.006 | 11.919 | -9.61 | 29.88 | -9.575 | 29.675 | 0.073 | 1.825 |
| FLF | 100 | 0.045 | 0.584 | -0.939 | 0.993 | -0.938 | 0.984 | -0.072 | 1.735 |
| FLI | 100 | 0.27 | 0.445 | -0.493 | 0.999 | -0.492 | 0.99 | -0.139 | 1.815 |
| FEE | 100 | 49.036 | 26.92 | 0.76 | 98.78 | 1.29 | 97.535 | -0.023 | 1.98 |
| OCC | 100 | 412.27 | 225.185 | 50 | 933 | 61 | 927 | 0.387 | 2.307 |
| HUM | 100 | 49.463 | 7.495 | 25 | 73.7 | 27.25 | 70.65 | -0.078 | 4.539 |
| PM25 | 100 | 11.233 | 4.714 | 3 | 22.3 | 3 | 21.85 | 0.274 | 2.341 |
| PM10 | 100 | 24.617 | 9.179 | 8 | 42.9 | 8 | 42.65 | 0.074 | 2.285 |
| VOC | 100 | 186.01 | 87.096 | 20 | 383 | 20 | 371 | -0.163 | 2.445 |
| ACH | 100 | 4.051 | 0.795 | 2.25 | 6.05 | 2.285 | 5.82 | 0.043 | 2.616 |
| THR | 100 | 2.934 | 0.859 | 0.8 | 5.5 | 0.97 | 5.025 | 0.099 | 2.921 |
| SND | 100 | 43.343 | 6.227 | 30 | 61.6 | 30.8 | 60.3 | 0.278 | 2.962 |
| EER | 100 | 10.34 | 1.169 | 7.18 | 13.03 | 7.545 | 12.885 | -0.158 | 2.72 |
| COP | 100 | 2.857 | 0.368 | 2.2 | 3.59 | 2.2 | 3.59 | 0.055 | 2.287 |
| SEF | 100 | 87.511 | 4.892 | 72.2 | 97.3 | 74.4 | 97.2 | -0.436 | 3.155 |
| EUI | 100 | 16.932 | 3.683 | 7.5 | 25.4 | 8.6 | 25.35 | 0.019 | 2.616 |
| LPD | 100 | 0.008 | 0.002 | 0.005 | 0.012 | 0.005 | 0.012 | 0.22 | 2.318 |
| CES | 100 | 11.453 | 25.527 | 0.019 | 213.237 | 0.02 | 146.411 | 5.406 | 40.749 |
| EROI | 100 | 14.79 | 21.237 | 0.193 | 121.655 | 0.224 | 114.719 | 3.32 | 14.856 |
| EPBT | 100 | 4.91 | 11.729 | 0.08 | 86.67 | 0.09 | 79.575 | 5.544 | 35.698 |
| CPD | 100 | 141000 | 73729.43 | 14691.18 | 298000 | 15023.01 | 298000 | 0.232 | 2.306 |
| CCF | 100 | -420000 | 785000 | -1780000 | 2390000 | -1760000 | 2050000 | 0.644 | 3.843 |
| SPC | 100 | 34.946 | 20.902 | 0.25 | 69.89 | 0.405 | 69.885 | 0.019 | 1.789 |
| REU | 100 | 64.338 | 13.584 | 30.98 | 95.58 | 34.64 | 92.81 | -0.066 | 2.344 |
| EPWH | 100 | 39.763 | 45.06 | 0.302 | 229.515 | 0.337 | 189.341 | 1.5 | 5.199 |

Note. This table presents the descriptive statistical parameters of the Key Performance Indicator (KPI) dataset developed to support the validation of a prototypal Digital Twin and Metaverse model for Smart Building management. The dataset integrates environmental, energy, operational, and governance-related variables, enabling the characterization of heterogeneous building typologies and operational conditions. The statistical descriptors (mean, standard deviation, minimum, maximum, skewness, and kurtosis) provide a quantitative overview of variability and distribution, essential for model calibration, simulation accuracy, and data-driven performance validation within the digital twin environment.
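As an illustration of how the descriptors in the table can be computed, the following minimal sketch derives the mean, standard deviation, minimum, maximum, skewness, and kurtosis of a single variable. It assumes that the Kurt column reports non-excess kurtosis (equal to 3 for a normal distribution) and that skewness and kurtosis use population central moments; the manuscript does not state these conventions. The function name `describe` and the sample values are hypothetical and are not taken from the KPI dataset. Percentiles (p1, p99) are omitted because their definition varies between statistical packages.

```python
import math


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments using the population (1/n) convention (an assumption).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "obs": n,
        "mean": mean,
        "std_dev": math.sqrt(m2 * n / (n - 1)),  # sample SD (n - 1 denominator)
        "min": min(values),
        "max": max(values),
        "skew": m3 / m2 ** 1.5,
        "kurt": m4 / m2 ** 2,  # non-excess kurtosis: 3 for a normal distribution
    }


# Hypothetical illustrative values, not taken from the KPI dataset.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
stats = describe(sample)
```

Under this convention a uniform distribution has a kurtosis of 1.8, which can serve as a reference point when reading the Kurt column.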

 

 

 

  1. Validation Framework and Data Reliability for ESG-Based Smart Building Model

 

The image illustrates the validation framework for an ESG (Environmental, Social, Governance) Smart Building model, outlining a methodological process divided into four main phases.

 

Figure 1. Validation Framework for ESG-Based Smart Building Model. This framework validates and structures ESG data for Smart Building applications, combining statistical and machine learning methods to ensure data reliability and predictive accuracy. The validated dataset supports testing and prototyping of a management system that integrates metaverse and digital twin technologies for advanced, real-time smart building management.

 

 

The process starts with data preparation and structuring: data on the environmental, social, and governance indicators are collected, normalized, and organized into three analytic blocks, and they are screened for outliers at the same stage to ensure data quality for the subsequent analysis. The second phase consists of correlation analysis and Principal Component Analysis (PCA), which is used to identify latent components and to check structural homogeneity. The third phase applies Ordinary Least Squares (OLS) linear regression to each of the environmental, social, and governance blocks, with AREA as the dependent variable. In this phase, the Variance Inflation Factor (VIF) is used to test for multicollinearity, and the coefficient of determination and the degrees of freedom are reported. The fourth phase uses machine learning algorithms to improve predictive analysis, comparing Boosting, Decision Tree, KNN, Random Forest, Regularized regression, and Support Vector Machine models separately for each block. The framework is designed so that the processed data can be used for testing during the design of a management system that combines the metaverse and a digital twin. A dataset whose structural homogeneity has been verified in this way supports the creation of an advanced digital environment that is both interactive and immersive, and that enables intelligent environmental management.
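The data preparation phase can be sketched as follows. This is a minimal illustration that assumes z-score standardization and a three-standard-deviation screening rule; the manuscript does not specify which normalization or outlier criterion was used, and the function names and readings below are hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def flag_outliers(values, threshold=3.0):
    """Return the indices of observations whose |z| exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(values)) if abs(z) > threshold]


# Hypothetical sensor readings; the last value is deliberately extreme.
readings = [10.0, 11.0, 9.0, 10.5, 9.5, 10.2, 9.8,
            10.1, 9.9, 10.0, 10.3, 9.7, 30.0]
outliers = flag_outliers(readings)
```

Flagged observations would then be inspected rather than removed automatically, since an extreme value can be a genuine operating condition.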

 

 

  5. Scientific Validation of ESG Data through Correlation Analysis for Smart Building Prototyping

 

 

Using the correlation matrix to validate the database that underlies the analysis of the ESG components is well grounded from a scientific perspective. Correlation analysis is a standard statistical tool for assessing the internal consistency of a dataset, because it shows whether the investigated factors are positively or negatively related and how strongly. For the ESG factors, it indicates that each factor captures a specific, non-overlapping aspect of sustainability, which supports the internal consistency of the data structure. In the context of smart building implementation, it validates the quality of the data that flow into the digital management system. The weak correlations among the energy consumption and environmental emission factors indicate that these variables are not redundant, and the moderate correlations found elsewhere confirm that the dataset is multidimensional. The analysis therefore provides a sound foundation for the subsequent PCA and regression, which give further evidence on environmental sustainability, and it supports integrating the data into the digital twin and metaverse environment. Three correlation analyses are carried out, one for each ESG dimension: the Environmental factor (E), the Social factor (S), and the Governance factor (G). Together, these analyses show that the data structure is internally coherent.
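The correlation analysis described above can be sketched in a few lines. The example below computes Pearson correlation coefficients and assembles them into a matrix; it assumes Pearson's r is the coefficient used, which the manuscript does not state explicitly. The column names X1 to X3 and their values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Build a nested dict {name_a: {name_b: r}} for all pairs of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical toy columns, not taken from the ESG dataset.
data = {
    "X1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "X2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "X3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
corr = correlation_matrix(data)
```

In this toy example X1 and X2 are perfectly correlated and would be flagged as redundant, whereas the ESG blocks are expected to show mostly weak to moderate off-diagonal values.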

 

 

 

5.1 Correlation Analysis and Validation of Environmental (E) Factors in the ESG Framework

 

 

 

The environmental factor in the ESG framework covers operational characteristics related to energy, emissions, and environmental issues. The correlation matrix for the environmental factors (AREA, CFPT, ENCO, EMIN, LCF, SCF, LMI, OER, GII, NGI, OPP, DRS, FLF, FLI, FEE) provides an initial check of internal coherence and multicollinearity. The correlations range from weak to moderate, which indicates that the indicators do not measure the same thing twice (Wang, 2024; Eskantar et al., 2024). This supports the construct validity of the environmental block, since it covers a wide range of factors without overlap (Handoko, Afifudin, & Holili, 2024). AREA, the physical size of the asset, shows negligible correlations with the other factors. Its slight negative correlations with energy use (ENCO, -0.06) and Carbon Footprint (CFPT, -0.04) indicate that larger areas do not necessarily lead to greater energy use and emissions (Hou et al., 2025). The positive but trivial relationship between Load Cover Factor (LCF) and AREA (0.05) suggests that larger buildings handle load slightly better, although the relationship is not significant. CFPT is negatively correlated with both energy use (ENCO, -0.14) and Emission Intensity (EMIN, -0.23). The negative relationship between CFPT and ENCO may seem paradoxical, but it could reflect differences in the use of cleaner energy across organizations (Zhou, 2024). The negative association between CFPT and EMIN implies that when total emissions are higher, Emission Intensity tends to fall, suggesting either that larger organizations use different energy resources as they scale or that technological efficiencies improve results (Du et al., 2024).

These weak relationships show that emissions, although driven by many factors, are not determined solely by the quantity of energy used, so CFPT and EMIN are validly treated as distinct factors. ENCO also has weak correlations with the other environmental factors. Its weak negative relationships with LCF (-0.12) and SCF (-0.22), the Load and Supply Cover Factors, imply that greater energy use does not necessarily correspond to better load coverage or supply adequacy, so quantity and management efficiency remain separate (Wang, 2024). This strengthens the theoretical basis for modeling, in which operational intensity and efficiency are separate dimensions of the environmental block. The correlations among LCF, SCF, LMI, and OER, which describe energy balance and autonomy, show an internally logical structure. For example, LCF is positively related to EMIN (0.18) and LMI (0.25), which suggests that systems with greater load coverage tend to show better operational matching. The positive relationship between LCF and EMIN may at first appear contradictory, since greater intensity could indicate inefficiency; it could also indicate systems running at or near full capacity, where higher loads temporarily raise intensity. The slight positive relationship between LCF and OER (On-site Energy Ratio, 0.09) is consistent with this logic, with greater load coverage accompanying greater on-site production, a sensible practice for sustainable system design (Dovolil & Svítek, 2024). GII and NGI, which describe interaction with the power grid, show slight negative or weak correlations with almost all other factors. This is plausible: systems that depend more on the grid (greater NGI, lower GII) do not relate directly to greater efficiency (FEE) or flexibility (FLI) (Zhou, 2024; Wang, 2024).

These weak correlations indicate that grid interaction is an autonomous domain within the environmental block, so the dataset covers the environmental dimension from production to system management (Eskantar et al., 2024; Hou et al., 2025). The flexibility factors (FLF, FLI, FEE) are only slightly correlated with each other, so flexibility and efficiency remain largely autonomous in the analysis. The slight positive correlations between FLF and AREA (0.105) and between FEE and LCF (0.009) suggest that larger systems are marginally more flexible. This limited interdependence indicates that the structural, operational, and efficiency dimensions of the environmental block are only partially related (Eskantar et al., 2024; Hou et al., 2025). Overall, the environmental correlations support the validity of the dataset. The low to moderate correlations indicate that the environmental factors explore different but complementary dimensions of sustainability without considerable redundancy. This strengthens the basis for further analysis such as PCA, which in turn supports the interpretation of the factorial structure underlying the environmental dimension and a meaningful combination of indicators (Handoko et al., 2024; Wang, 2024).
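Because several of these relationships are described as not significant, it may help to see how the size of a correlation translates into a significance test for a sample of 100 observations (the Obs value reported in the descriptive statistics). The sketch below uses the standard t test for a Pearson correlation. The critical value 1.984 is an approximate two-sided 5% value of Student's t for 98 degrees of freedom and is supplied here as an assumption; no adjustment for multiple comparisons is made, so the result is only a rough guide.

```python
import math


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def critical_r(t_crit, n):
    """Smallest |r| that reaches the given critical t value for sample size n."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


N = 100          # observations per variable, as reported in the table
T_CRIT = 1.984   # approximate two-sided 5% critical t for 98 df (assumption)

# CFPT-EMIN coefficient taken from the environmental correlation matrix.
t_cfpt_emin = t_statistic(-0.2254, N)
r_min = critical_r(T_CRIT, N)
```

With these inputs, coefficients smaller than roughly 0.2 in absolute value would not be distinguishable from zero at the 5% level, which is consistent with treating most of the AREA correlations as negligible, while the CFPT and EMIN coefficient lies just beyond that threshold.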

 

 

Table xyz. Correlation Matrix for Environmental (E) Factors in the ESG Model

 

Variables      AREA     CFPT     ENCO     EMIN      LCF
AREA         1.0000  -0.0382  -0.0608  -0.0678   0.0483
CFPT        -0.0382   1.0000  -0.1416  -0.2254  -0.0229
ENCO        -0.0608  -0.1416   1.0000  -0.0344  -0.1235
EMIN        -0.0678  -0.2254  -0.0344   1.0000   0.1844
LCF          0.0483  -0.0229  -0.1235   0.1844   1.0000
SCF         -0.0142  -0.0214  -0.2180  -0.1927  -0.1126
LMI          0.0376   0.0284  -0.0592  -0.0165   0.2509
OER          0.0432   0.0155  -0.1793   0.0205   0.0918
GII         -0.0380   0.0052  -0.2230   0.0543  -0.0519
NGI         -0.0188   0.0250  -0.0573  -0.0805  -0.0523
OPP         -0.1248   0.0472   0.1331   0.2376  -0.0651
DRS         -0.0577   0.1073  -0.1592  -0.0992  -0.1351
FLF          0.1050  -0.1392  -0.0412   0.0490  -0.0770
FLI          0.0023   0.1272  -0.0822   0.0331  -0.0335
FEE          0.0965  -0.0327  -0.0738  -0.0678   0.0085

 

Note: The table presents the correlation coefficients among the environmental indicators used within the ESG framework. The low to moderate correlation values confirm that the variables are largely independent and represent distinct aspects of environmental performance, such as energy use, emissions, and operational efficiency. This statistical consistency validates the internal coherence of the dataset and ensures its suitability for advanced modeling techniques, including PCA and regression analysis. The results further demonstrate that the data are appropriate for use in the prototyping and testing of smart building management systems based on digital twin and metaverse technologies.

 

 

The correlation heat map is a graphical representation of the relationships among the environmental indicators in the dataset. It shows mainly light-colored regions and only a few strong red and blue regions, indicating that most correlations are low to moderate. This supports the initial statistical analysis: most of the environmental factors are largely independent of one another and cover different facets of energy consumption, emissions intensity, load management, and efficiency. Similar correlation patterns are found in other ESG datasets, for which multidimensionality is crucial to the strength and interpretability of modeling (Ioannidis et al., 2022; Loukili & Benli, 2023). The absence of strongly correlated factors indicates that the dataset is well structured and shows no sign of multicollinearity, so it meets the requirements for accurate modeling and interpretation (Eskantar et al., 2024). The few regions with moderate correlations, found in different parts of the heat map, correspond to expected relationships. For example, the low, positive relationship between Emission Intensity (EMIN) and Load Cover Factor (LCF) could reflect operational conditions: when power systems operate at maximum load, emission intensity tends to increase. Other low, positive correlations among the factors related to energy autonomy (on-site energy ratio, OER) and load matching (load matching indicators, LMI) indicate coherent interactions between energy autonomy and system efficiency, in line with results from ESG analyses carried out using alternative methods (Sorathiya et al., 2024). The heat map shows that each set of factors has an inherent, logical structure without compromising mutual independence.

It also shows that no group of strongly correlated factors fully defines the environmental dimension. The dataset is therefore multidimensional, covering energy, emission intensity, load balance, and flexibility, each of which contributes to a comprehensive ESG analysis in its own way. An integrated view of this kind has also been applied in ESG analysis to evaluate smart city infrastructure (Dovolil & Svítek, 2024). The variables are distinct yet conceptually related, which provides a strong basis for PCA and regression models in the ESG framework.

 

 

 

 

 

Figure xyz. Heat Map of the Correlation Matrix for Environmental (E) Factors in the ESG Model. Note: The heat map shows mostly weak to moderate correlations, indicating that the environmental variables are independent and free from multicollinearity. This confirms the dataset’s structural validity and its suitability for integration into digital twin and metaverse-based smart building management models.

 

 

 

5.2 Correlation Analysis and Validation of Social (S) Factors in the ESG Framework

 

 

The social dimension of the ESG model emphasizes human and system characteristics at the building level, such as user comfort, indoor air quality, thermal and acoustic performance, and efficiency. The correlation matrix for the social dimension (OCC, HUM, PM25, PM10, VOC, ACH, THR, SND, EER, COP, SEF, EUI, and LPD) is an important validation step for the dataset used to create a management model based on digital twin technology (Hadjidemetriou et al., 2023). The correlations are largely weak to moderate, which indicates that each variable captures a distinct aspect without overlap. The number of occupants (OCC) shows small positive correlations with relative humidity (HUM, 0.13) and fine particle concentration (PM2.5, 0.20), since higher human presence could slightly increase both (Lo, 2025). These correlations remain weak, which supports the hypothesis that well-designed environmental conditions and ventilation systems can prevent occupancy from having a major impact on indoor air quality (Cai et al., 2023). The near-zero correlations between OCC and the concentration of Volatile Organic Compounds (VOC, -0.07) and thermal resistance (THR, -0.08) indicate little to no relationship between human presence and these factors, which again suggests that the dataset is well balanced. The air quality variables (PM2.5, PM10, and VOC) show mild correlations, particularly PM2.5 and PM10 (0.24), since the two tend to co-occur at the same locations (Hadjidemetriou et al., 2023b). The low correlation between VOC and humidity (-0.06) indicates that air quality factors are largely independent of indoor environmental factors, which supports measuring them as separate elements of the social dimension (Ni et al., 2024).

The air change rate (ACH) shows positive but weak correlations with thermal insulation (THR, 0.03) and sound insulation (SND, 0.11), indicating that ventilation performance is largely independent of envelope characteristics, which matters for digital twin simulations of indoor user comfort (Islam et al., 2024). The thermal and acoustic factors (THR, SND) show weak positive correlations with EER, COP, and SEF (0.01 to 0.14), suggesting that buildings with lower thermal and sound transmission tend to have slightly better energy system efficiency. This pattern is expected and reinforces the dataset's internal validity by associating comfort factors with actual system performance (Alibrandi, 2022). By contrast, EER, COP, and SEF display moderate to strong positive correlations with one another (0.49 to 0.72), because energy efficiency indicators are expected to converge. For a digital twin model, such correlations are acceptable, as the indicators assess system factors that are closely related yet complementary (Hii & Hasama, 2024). Energy use intensity (EUI) and lighting power density (LPD) show a strong positive correlation (r = 0.88), as lighting strongly influences energy use per capita. This pattern indicates that the dataset captures internal load patterns, which matter for assessing social comfort and productivity in digital twin settings and for exploring social system factors through ESG models (Yossef Ravid & Aharon-Gutman, 2023). However, the low correlations of EUI and LPD with the remaining social variables indicate that energy use patterns are a separate system factor, not driven by the other social factors.

Overall, the social correlation matrix shows low to moderate correlations, with logical convergences among comfort, air quality, and energy factors, indicating that the dataset captures complementary, logically related system factors. The social component of the ESG dataset is therefore robustly constructed and reliable for analysis, simulation, and decision support in digital twin frameworks for managing energy-efficient buildings and for improving social outcomes alongside energy performance.
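The two stronger clusters in the social block (EER, COP, SEF and EUI, LPD) matter mainly when these variables enter the same regression. As a rough, hedged illustration, the sketch below converts a pairwise correlation into the variance inflation factor that would apply if the two variables were the only predictors. With more predictors the true VIF can only be equal or higher, so the figure is a lower bound. The function name `pairwise_vif` is hypothetical, and the thresholds commonly quoted for VIF (5 or 10) are rules of thumb rather than values taken from the manuscript.

```python
def pairwise_vif(r):
    """VIF implied by one pairwise correlation: 1 / (1 - r^2).

    Exact when the two variables are the only predictors; otherwise a
    lower bound on the VIF of either variable.
    """
    return 1.0 / (1.0 - r ** 2)


# Coefficients taken from the social correlation matrix.
vif_eui_lpd = pairwise_vif(0.8829)   # EUI vs LPD
vif_eer_sef = pairwise_vif(0.7244)   # EER vs SEF
vif_occ_hum = pairwise_vif(0.1329)   # OCC vs HUM, a typical weak pair
```

On this rough measure the EUI and LPD pair comes closest to the conventional warning range, while weak pairs such as OCC and HUM leave the VIF essentially at 1.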

 

 

 

Table xyz. Correlation Matrix for Social (S) Dimension Variables in the ESG Smart Building Model

 

Variable        OCC      HUM     PM25     PM10      VOC      ACH      THR      SND      EER      COP      SEF      EUI      LPD
OCC          1.0000   0.1329   0.1953   0.0406  -0.0661  -0.0387  -0.0806   0.0172   0.0373   0.0912   0.1720  -0.0849  -0.0437
HUM          0.1329   1.0000   0.0027   0.0540  -0.0592   0.1160  -0.1581   0.0172   0.0477   0.0399   0.0013   0.0618   0.0800
PM25         0.1953   0.0027   1.0000   0.2370   0.0320  -0.0518  -0.2271   0.1503  -0.0616   0.0376   0.0095  -0.0114   0.0935
PM10         0.0406   0.0540   0.2370   1.0000   0.0760   0.0683   0.0201   0.0481  -0.0705  -0.0022  -0.0393   0.0587   0.0935
VOC         -0.0661  -0.0592   0.0320   0.0760   1.0000   0.0005  -0.0622  -0.0455  -0.0209  -0.0401  -0.0085   0.0454   0.0214
ACH         -0.0387   0.1160  -0.0518   0.0683   0.0005   1.0000   0.0289   0.1062   0.0741   0.0784   0.0607   0.0243   0.0072
THR         -0.0806  -0.1581  -0.2271   0.0201  -0.0622   0.0289   1.0000   0.1467   0.1021   0.1425   0.1260  -0.0078   0.0573
SND          0.0172   0.0172   0.1503   0.0481  -0.0455   0.1062   0.1467   1.0000   0.0119   0.0676   0.0631  -0.0225   0.0202
EER          0.0373   0.0477  -0.0616  -0.0705  -0.0209   0.0741   0.1021   0.0119   1.0000   0.4872   0.7244  -0.1632  -0.0750
COP          0.0912   0.0399   0.0376  -0.0022  -0.0401   0.0784   0.1425   0.0676   0.4872   1.0000   0.7074  -0.0906  -0.0529
SEF          0.1720   0.0013   0.0095  -0.0393  -0.0085   0.0607   0.1260   0.0631   0.7244   0.7074   1.0000  -0.1307  -0.0399
EUI         -0.0849   0.0618  -0.0114   0.0587   0.0454   0.0243  -0.0078  -0.0225  -0.1632  -0.0906  -0.1307   1.0000   0.8829
LPD         -0.0437   0.0800   0.0935   0.0935   0.0214   0.0072   0.0573   0.0202  -0.0750  -0.0529  -0.0399   0.8829   1.0000

 

Note. The table displays the correlations among social indicators such as comfort, air quality, and energy efficiency. The weak to moderate correlations confirm that these variables represent distinct yet complementary dimensions, ensuring the dataset’s internal consistency and its suitability for digital twin-based simulations in smart building management.

 

 

 

 

The heat map of the correlation matrix for the Social (S) dimension gives an intuitive view of the relationships among the variables that define indoor comfort, air quality, and energy efficiency in buildings. It is dominated by light colors, indicating weak to moderate correlations across most variables; the dataset therefore covers a broad range of social sustainability factors without duplication. This mix of correlations supports its use in digital twin-based building management systems that optimize building performance and human well-being. The red line along the diagonal marks the perfect correlation of each variable with itself and is distinct from the correlations shown by the off-diagonal colors. The red cells in the lower right corner indicate moderate to strong correlations among the energy-related variables EER, COP, and SEF (about 0.5 to 0.7). This is expected, given that all three measure efficiency and performance. The same applies to the red square connecting EUI and LPD (about 0.9), which indicates that lighting load plays an important role in energy use per capita and hence in defining energy efficiency. The top-left corner, covering the occupancy and air quality indicators (OCC, HUM, PM2.5, PM10, and VOC), shows pale colors with scattered red and blue. These low correlations indicate that building air quality and comfort depend only weakly on occupancy, an important characteristic for datasets used to model building conditions with digital twin methods.

It follows that humidity and pollutant concentrations can be varied in simulations of different scenarios largely independently of occupancy. Overall, the heat map supports the validity of the dataset for modeling the ESG social dimension in systems that apply analytics to building optimization and human well-being.

 

 

 

 

Figure xyz. Heat Map of the Correlation Matrix for Social (S) Factors in the ESG Model. The heat map shows mostly weak to moderate correlations, indicating that the social variables—related to comfort, air quality, and energy efficiency—are distinct yet interrelated. This confirms the dataset’s internal coherence and its suitability for digital twin-based smart building simulations.

 

 

 

 

5.3 Correlation Analysis and Validation of Governance (G) Factors in the ESG Framework

 

The correlation matrix for the Governance (G) component of the ESG model provides valuable insight into the interrelationships among indicators that represent economic and operational aspects of smart building management. These include cost-effectiveness (CES), energy return on investment (EROI), energy payback time (EPBT), capital cost factors (CPD and CCF), system performance (SPC), renewable energy utilization (REU), and energy productivity per worker hour (EPWH). The aim of analyzing these correlations is to validate the dataset used for prototyping a digital twin-based management model for smart buildings (Roda-Sanchez et al., 2023; Alibrandi, 2022), ensuring that the indicators are statistically consistent, complementary, and capable of accurately reflecting the governance dynamics of sustainable infrastructure systems (Poels et al., 2022). The correlations between governance variables are generally weak to moderate, a desirable feature for multidimensional datasets (Li, 2025). This indicates that each variable captures a distinct dimension of governance performance without excessive redundancy. The Cost of Energy Savings (CES) shows a moderate negative correlation with the Capital Cost Factor (CCF) at -0.42, the largest in the matrix, suggesting that higher capital costs are associated with lower cost-efficiency in achieving energy savings. This inverse relationship highlights an important governance trade-off: investments that require significant capital may not always translate into proportional financial efficiency gains (Chungath & Hacks, 2024). The weaker negative correlations between CES and SPC (-0.22) and REU (-0.20) reinforce this interpretation, implying that systems with higher cost-effectiveness tend to have lower levels of spending and a less direct connection with renewable energy deployment intensity. 
EROI, which measures the ratio between energy produced and energy invested, displays weak correlations across most variables, including a slight negative association with EPBT (-0.22), consistent with the expectation that higher energy returns correspond to shorter payback times. Its positive, though modest, correlations with CCF (0.09) and SPC (0.16) suggest that systems with better energy efficiency tend to be embedded in contexts with moderate capital intensity and performance consistency (Elnour et al., 2024). EPBT itself maintains low correlations, except for its mild negative association with SPC (-0.20), which indicates that buildings or systems with shorter payback periods tend to have more stable or efficient operational performance. The CCF variable is positively correlated with SPC (0.23) and EPWH (0.11), showing that capital costs are weakly linked to system performance and worker energy productivity. These modest correlations support the validity of the dataset by suggesting that financial parameters and productivity metrics are related but not overlapping dimensions of governance performance (Zhou et al., 2021). REU and EPWH exhibit a small positive relationship (0.19), consistent with the idea that renewable energy integration enhances the energy productivity per worker, a finding relevant for evaluating the operational efficiency of buildings managed under sustainable frameworks (Dovolil & Svítek, 2024; Kljaić et al., 2024). The overall low correlation magnitudes across variables, with few exceptions, demonstrate that the dataset is well balanced and not dominated by interdependent indicators. 
This structural integrity is fundamental for the calibration and validation of digital twin models (Chungath & Hacks, 2024; Poels et al., 2022), which require clear variable independence to accurately simulate decision-making and policy scenarios in smart buildings. The limited but coherent correlations between cost, performance, and efficiency metrics indicate that the Governance dimension of the ESG dataset is statistically reliable. It captures the complexity of managing financial and operational sustainability (Li, 2025), so that the digital twin model can use these parameters to support optimization, predictive analysis, and performance benchmarking within a robust and transparent governance structure (Kljaić et al., 2024; Elnour et al., 2024).

Table X. Correlation Matrix for Governance (G) Factors in the ESG Smart Building Model

 

Variable        CES     EROI     EPBT      CPD      CCF      SPC      REU     EPWH
CES          1.0000  -0.0596   0.0320   0.0069  -0.4240  -0.2163  -0.1981  -0.0780
EROI        -0.0596   1.0000  -0.2234   0.0083   0.0851   0.1553   0.0725   0.0126
EPBT         0.0320  -0.2234   1.0000  -0.1697   0.0380  -0.1981   0.1305   0.0050
CPD          0.0069   0.0083  -0.1697   1.0000  -0.0017  -0.0894   0.0077   0.0670
CCF         -0.4240   0.0851   0.0380  -0.0017   1.0000   0.2251   0.0327   0.1058
SPC         -0.2163   0.1553  -0.1981  -0.0894   0.2251   1.0000  -0.1582  -0.0859
REU         -0.1981   0.0725   0.1305   0.0077   0.0327  -0.1582   1.0000   0.1860
EPWH        -0.0780   0.0126   0.0050   0.0670   0.1058  -0.0859   0.1860   1.0000

 

Note. The table shows weak to moderate correlations among governance indicators, confirming their independence and validity. The negative link between CES and CCF highlights an inverse cost–efficiency relationship, while the positive ties of CCF with SPC and EPWH indicate consistent governance performance suitable for digital twin-based smart building management.
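To make the strongest relationships in the governance matrix easy to locate, the following sketch ranks all variable pairs by the absolute value of their correlation. The coefficients are copied from the table above; the function name `ranked_pairs` is hypothetical and the ranking is only a reading aid, not part of the authors' method.

```python
names = ["CES", "EROI", "EPBT", "CPD", "CCF", "SPC", "REU", "EPWH"]

# Governance correlation matrix, copied from the table.
matrix = [
    [1.0000, -0.0596, 0.0320, 0.0069, -0.4240, -0.2163, -0.1981, -0.0780],
    [-0.0596, 1.0000, -0.2234, 0.0083, 0.0851, 0.1553, 0.0725, 0.0126],
    [0.0320, -0.2234, 1.0000, -0.1697, 0.0380, -0.1981, 0.1305, 0.0050],
    [0.0069, 0.0083, -0.1697, 1.0000, -0.0017, -0.0894, 0.0077, 0.0670],
    [-0.4240, 0.0851, 0.0380, -0.0017, 1.0000, 0.2251, 0.0327, 0.1058],
    [-0.2163, 0.1553, -0.1981, -0.0894, 0.2251, 1.0000, -0.1582, -0.0859],
    [-0.1981, 0.0725, 0.1305, 0.0077, 0.0327, -0.1582, 1.0000, 0.1860],
    [-0.0780, 0.0126, 0.0050, 0.0670, 0.1058, -0.0859, 0.1860, 1.0000],
]


def ranked_pairs(names, matrix):
    """List each unordered pair once, sorted by descending |r|."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            pairs.append((names[i], names[j], matrix[i][j]))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


top_three = ranked_pairs(names, matrix)[:3]
```

The three largest coefficients are CES with CCF, CCF with SPC, and EROI with EPBT; every other pair lies below 0.22 in absolute value.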

 

 

The heat map for the Governance (G) component shows the structure of the correlations among the main governance factors and gives a quick view of the relationships among the financial, operational, and efficiency factors in the dataset. The color scale, from deep red to blue, conveys the sign and intensity of each correlation, with red for positive and blue for negative correlations. This supports an intuitive assessment of the internal consistency of the dataset, which is important for validating the digital twin model for managing smart buildings (Chungath & Hacks, 2024; Cureton & Dunn, 2021). The first feature of the heat map is the negative association between the Cost of Energy Savings (CES) and the Capital Cost Factor (CCF), shown by the deep blue square (around -0.4), the largest correlation in the matrix. It indicates that when capital costs are higher, energy savings are less cost-efficient, which is relevant to governance-related financial decisions in smart buildings (Pileggi et al., 2020). The heat map thus supports the numerical analysis and makes the correlations among cost and investment factors easy to identify and interpret. A further group can be observed among the efficiency and performance factors (CCF, SPC, and REU), characterized by weak correlations: CCF is positively related to SPC (0.23) and only marginally to REU (0.03), while SPC and REU are weakly negatively related (-0.16). The positive link between CCF and SPC suggests that better performance accompanies higher capital costs, so that higher investment intensity is associated with a positive outcome for energy governance (Lv et al., 2023; Zahedi et al., 2024).

The same holds for REU and EPWH, whose red-toned cell (0.19) indicates a positive relationship between renewable energy use and energy productivity per worker hour, consistent with the operational scenario of the efficiency model for smart buildings (Roda-Sanchez et al., 2023). The dominance of light tones across almost the whole heat map indicates that almost every pair of variables is weakly correlated, so the dataset is balanced and shows no sign of multicollinearity, both of which benefit digital twin applications that rely on cause-and-effect simulations (Yue et al., 2022; Hartmann et al., 2023). The heat map therefore supports the dataset's validity by showing that the governance indicators are differentiated yet logically linked, making it suitable for an integrated system for the governance of sustainable buildings (Dovolil & Svítek, 2024; Cranford, 2023). Governance heat maps are thus a useful tool for analysis and for decision-making on ESG integration in digital twin technology for infrastructure governance in smart cities (Kljaić et al., 2024).

 

 

 


 

Figure X. Heat Map of the Correlation Matrix for Governance (G) Factors in the ESG Model. Note: The heat map illustrates the correlations among governance indicators such as cost-effectiveness, capital costs, and system performance. The predominance of light colors indicates weak to moderate relationships, confirming the independence of the variables and the absence of multicollinearity. This validates the dataset’s consistency and its suitability for digital twin-based simulations in smart building governance.
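For readers who want to see how a correlation matrix of this kind is obtained, the following is a minimal sketch in plain Python. The three columns carry the indicator names CCF, CES, and SPC only for readability; the values are made-up toy numbers, not taken from the ESG dataset, and the helper names `pearson` and `correlation_matrix` are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations: matrix[a][b] = r(a, b)."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Toy values for illustration only (NOT the values behind the heat map).
toy = {
    "CCF": [1, 2, 3, 4, 5],
    "CES": [10, 8, 6, 4, 2],   # falls as CCF rises -> negative correlation
    "SPC": [2, 1, 4, 3, 5],    # loosely rises with CCF -> positive correlation
}
matrix = correlation_matrix(toy)

for row_name, row in matrix.items():
    print(row_name, {col: round(r, 2) for col, r in row.items()})
```

In a heat map, each cell of `matrix` would be colored by sign and magnitude, so a negative CES and CCF cell appears blue and a positive CCF and SPC cell appears red.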

 

 

 

  1. Regression-Based Validation of the ESG Dataset for Digital Twin Smart Building Modeling

 

To demonstrate the efficacy and applicability of the ESG model, regression equations are used as a starting point for evaluating the statistical validity and reliability of the dataset. The equations examine the cohesion, interdependence, and applicability of the environmental, social, and governance dimensions within the broader context of sustainable building resource management. The aim is to show that the ESG dataset can contribute to the design of an integrated building resource management model that uses digital twin technology and the metaverse to model, monitor, and regulate the performance of intelligent buildings in real time (Zhang et al., 2023). The equations are estimated by Ordinary Least Squares (OLS), with the dependent variable (AREA) indicating the scale, size, and functionality of buildings, and the independent variables being the Key Performance Indicators (KPIs) of each ESG dimension. This makes it possible to assess the reliability of the dataset and to examine how the fundamental ESG dimensions relate to building scale, functionality, and efficiency (Dou & Yin, 2024). One equation is specified for each of the three ESG dimensions, giving a consistent methodology for assessing environmental, social, and governance factors (Wang et al., 2024). 
The Environmental equation relates energy consumption, intensity, and efficiency factors to building scale; the Social equation covers factors related to comfort, air quality, and user well-being; and the Governance equation relates financial intensity and efficiency factors to building scale, functionality, and performance (Wang et al., 2024). Together, the equations provide the foundation for validating and calibrating the dataset, so that its integration into digital twin and metaverse technology rests on reliable and accurate model building (Liu et al., 2025).

 

Table X. Regression Equations for ESG Dimensions in the Smart Building Model

ESG

Equations

E-Environment

 

S-Social

 

G-Governance

 

Note: The table presents the Ordinary Least Squares (OLS) regression equations used to validate the Environmental (E), Social (S), and Governance (G) dimensions of the ESG model. Each equation relates specific Key Performance Indicators (KPIs) to the dependent variable AREA, representing building scale and functionality. These equations provide the analytical foundation for integrating the ESG dataset into digital twin-based smart building management and simulation frameworks.
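The typeset equations do not appear in this record. As a hedged sketch only, the generic OLS form implied by the text is written out below, assuming a linear specification with an intercept, one coefficient per KPI, and the KPI lists given in the regression summary table (22 environmental, 8 social, and 6 governance indicators). The symbols \beta and \varepsilon and the index i over observations are conventional notation, not taken from the manuscript.

```latex
% Generic form for ESG dimension d \in \{E, S, G\} with k_d KPIs
\mathrm{AREA}_i = \beta_0^{(d)} + \sum_{j=1}^{k_d} \beta_j^{(d)} \, \mathrm{KPI}^{(d)}_{j,i} + \varepsilon^{(d)}_i ,
\qquad k_E = 22,\; k_S = 8,\; k_G = 6

% Written out for the Governance block (six KPIs)
\mathrm{AREA}_i = \beta_0 + \beta_1 \mathrm{CES}_i + \beta_2 \mathrm{EROI}_i + \beta_3 \mathrm{EPBT}_i
                + \beta_4 \mathrm{CPD}_i + \beta_5 \mathrm{CCF}_i + \beta_6 \mathrm{SPC}_i + \varepsilon_i
```

Under this reading, a negative estimate of \beta_4 or \beta_5 means that larger values of CPD or CCF go together with smaller building area, holding the other governance KPIs fixed.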

 

The results for the Environmental model (E) show an R² of 0.226 and an adjusted R² of 0.005: the included variables explain approximately 22.6% of the variance in AREA, but almost none of this explanatory power remains once it is adjusted for the number of predictors. The F-statistic (1.02) and its probability value (0.451) indicate that the predictors are not jointly significant, so the model is read as a check on the coherence of signs and on multicollinearity rather than as a predictive equation. The significant variables, namely the Capacity Factor (CAF, p = 0.006) with a negative sign and the Renewable Energy Utilization (REU, p = 0.065) with a positive sign, indicate logical relationships. Larger building areas tend to be associated with lower utilization efficiency (CAF) but higher renewable energy use (REU), a pattern coherent with real-world behavior in large smart infrastructures (Guo et al., 2025). The low mean VIF (1.93) indicates the absence of multicollinearity, supporting the reliability of the dataset for modeling energy-environmental dynamics. The Social (S) regression has an R² of 0.085 and an adjusted R² of 0.004, showing that social and comfort-related KPIs explain only a small fraction of the variation in building area. This result aligns with expectations, as social variables such as air quality (PM2.5, PM10), humidity, and acoustic comfort tend to capture internal environmental quality rather than scale-dependent properties. The marginal significance of PM2.5 (p = 0.084) suggests that particulate concentration may have a weak relationship with building size, potentially due to differences in ventilation and occupancy density (Chungath & Hacks, 2024). The low mean VIF (1.08) again indicates the statistical independence of these indicators: the Social dataset is structurally well defined, even if its predictive strength remains marginal. 
The Governance (G) regression yields the most consistent results, with an R² of 0.124 and the highest adjusted R² of the three models (0.067). The F-statistic of 2.19 and a p-value of 0.051 fall just short of statistical significance at the 5% level, implying that the governance and economic indicators together provide a weak but coherent explanation of AREA variability. The negative signs of the significant variables, Capital Development Cost (CPD, p = 0.027) and Capital Cost Factor (CCF, p = 0.054), indicate that larger building areas are associated with lower values of these cost indicators, which is read here as efficiency gains with lower costs per unit area. This outcome is relevant for validating the economic component of the digital twin, as it suggests that financial optimization and governance transparency correlate with spatial and operational efficiency (Dovolil & Svítek, 2024; Cranford, 2023). The low mean VIF (1.15) confirms the absence of collinearity distortions. Overall, the three regressions support the ESG dataset by showing that each component captures a distinct dimension of building performance. While none of the models exhibits high explanatory power individually, their combined interpretation shows structural coherence and logical sign directions. The Environmental model highlights operational and renewable energy dynamics, the Social model reflects the independence of comfort and health indicators, and the Governance model reveals economic efficiency trends. Together, they provide a statistically coherent and multidimensional foundation for implementing a digital twin system capable of assessing, simulating, and optimizing smart building governance and performance in line with ESG principles (Zhang et al., 2023).

Table X. Summary of Regression Results for ESG Dimensions in the Smart Building Model

| ESG Dimension | E (Environment) | S (Social) | G (Governance) |
| --- | --- | --- | --- |
| Included KPIs (X) | ENCO, CFPT, EMIN, LCF, SCF, LMI, OER, GII, NGI, CAF, OPP, DRS, FLF, FLI, FEE, EER, COP, SEF, EUI, LPD, REU, EPWH | OCC, HUM, PM25, PM10, VOC, ACH, THR, SND | CES, EROI, EPBT, CPD, CCF, SPC |
| Vars | 22 | 8 | 6 |
| R² | 0.226 | 0.085 | 0.124 |
| Adj. R² | 0.005 | 0.004 | 0.067 |
| F (df1, df2) | 1.02 (22, 77) | 1.05 (8, 91) | 2.19 (6, 93) |
| Prob > F | 0.451 | 0.403 | 0.051 |
| Root MSE | 5237 | 5238 | 5069 |
| Mean VIF | 1.93 | 1.08 | 1.15 |
| Significant Variables (p < 0.10) | CAF (p = 0.006), REU (0.065), EPWH (0.108) | PM25 (p = 0.084) | CPD (p = 0.027), CCF (0.054) |
| Sign | CAF (−), REU (+) | + | both (−) |
| Interpretation | Environmental KPIs are consistent but weakly predictive of AREA; no multicollinearity; logical directional signs. | Social KPIs are independent and orthogonal; air quality and comfort show limited relation with building scale. | Governance and economic KPIs show structural consistency and marginal significance; negative coefficients suggest efficiency gains with lower costs per area. |

Note: This table summarizes the regression outcomes for the Environmental (E), Social (S), and Governance (G) dimensions of the ESG model. The results show that all models are statistically coherent and free from multicollinearity (Mean VIF < 2). While the Environmental and Social models exhibit low explanatory power, the Governance model shows marginal significance (Prob > F ≈ 0.05), indicating that financial and efficiency indicators play a stronger role in explaining building scale and performance within digital twin-based smart building systems.
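The adjusted R² and F values in the table follow from R², the number of predictors, and the sample size through standard formulas, so they can be cross-checked by hand. The sketch below does this in plain Python. The sample size of 100 is not stated in the text; it is inferred here from the reported degrees of freedom (for example 22 + 77 + 1), and the function names are illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R² for n observations and k predictors (model with intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F-statistic for the joint significance of the k predictors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# (R², number of predictors) as reported in the summary table.
reported = {"E": (0.226, 22), "S": (0.085, 8), "G": (0.124, 6)}
N_OBS = 100  # assumption: inferred from df1 + df2 + 1, not stated in the text

results = {
    dim: (adjusted_r2(r2, N_OBS, k), f_statistic(r2, N_OBS, k))
    for dim, (r2, k) in reported.items()
}

for dim, (adj, f) in results.items():
    print(f"{dim}: adjusted R² = {adj:.3f}, F = {f:.2f}")
```

The recomputed values agree with the table to within the rounding of the reported R². The Environmental case shows how 22 predictors on about 100 observations shrink an R² of 0.226 to an adjusted R² close to zero.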

 

The validation of the ESG dataset through the three regression models—each representing the Environmental, Social, and Governance dimensions—demonstrates the internal coherence and distinct contribution of each component to the explanation of building scale and performance, represented by the variable AREA. The results reveal a layered structure of relationships within the dataset that supports its robustness and analytical validity for modeling within a digital twin framework (Hien & Hanh, 2024). From a global perspective, the Governance (G) regression emerges as the most statistically relevant, with a Prob > F value around 0.05, suggesting marginal significance and indicating that this component captures some consistent structural patterns between governance and economic indicators and the dependent variable AREA. This finding implies that financial and efficiency-related parameters—such as cost per energy unit, return on investment, and payback time—are moderately predictive of the built area, reflecting how governance decisions and economic structures may scale with building size (Dou & Yin, 2024). In contrast, the Environmental (E) and Social (S) models exhibit low explanatory power, with adjusted R² values close to zero. This outcome is not unexpected, as environmental and social metrics often capture operational performance and contextual conditions rather than structural attributes like area. The Environmental model, although statistically weaker, presents logical directional relations, such as negative associations with carbon footprint (CFPT) and positive associations with renewable energy use (REU), which are consistent with theoretical expectations of sustainable design (Su & Sun, 2023). Similarly, the Social model demonstrates independence among variables, showing that indoor air quality, thermal and acoustic comfort, and occupancy metrics vary orthogonally without multicollinearity, as confirmed by low mean VIF values (below 2). 
The analysis of multicollinearity further reinforces the validity of the dataset. All three models have mean variance inflation factors below 5, confirming that no block of variables presents internal redundancy. This indicates that the dataset is structurally well-defined and that each KPI contributes unique information within its respective ESG dimension (Chen & Lin, 2023). From a methodological standpoint, this supports the use of the dataset for higher-level analytical modeling, including multivariate or machine learning regressions, since predictor independence is a prerequisite for robust feature interpretation. The Governance block stands out for its structural coherence. Variables such as CPD (cost per design), CCF (capital cost factor), and SPC (sustainability performance cost) show statistically relevant coefficients, some with negative signs. This pattern indicates that higher building efficiency or optimized financial planning is associated with lower costs per unit area—an interpretation aligned with principles of sustainable financial governance and stakeholder-oriented management (Berman et al., 1999). The presence of negative coefficients further reinforces the logic of efficiency-driven management models, where resource optimization translates into economic and environmental benefits. Overall, the regression-based validation confirms that the ESG dataset is both statistically sound and conceptually coherent. Each dimension provides non-redundant information, supporting the multidimensional structure of ESG analysis. While the E and S models describe operational and contextual variability, the G model anchors the dataset’s structural significance, establishing a measurable link between governance efficiency and building scale. 
The low multicollinearity, consistent variable behavior, and partial significance of the Governance model collectively validate the dataset for use in a digital twin context, where real-time data integration and predictive modeling depend on stable and interpretable variable relationships. This foundational validation demonstrates that the dataset can be reliably used for developing intelligent management systems capable of assessing performance and sustainability through interconnected ESG indicators (Hien & Hanh, 2024; Dou & Yin, 2024).

Table X. Summary of Key Analytical Insights from ESG Regression Models

| Aspect | Observation |
| --- | --- |
| Global significance | Only the G model is marginally significant (Prob > F ≈ 0.05). |
| Internal coherence (VIF) | All Mean VIF < 5 → no multicollinearity in any ESG block. |
| Predictive power vs. AREA | E and S blocks have low explanatory power; G block moderate (Adj R² ≈ 0.07). |
| General interpretation | The three ESG dimensions are statistically distinct and non-redundant. The Governance/Economic dimension shows the strongest structural consistency. |

Note: The table presents the main analytical observations derived from the ESG regression analysis. It highlights that the Governance (G) model demonstrates marginal statistical significance and the strongest internal consistency, while the Environmental (E) and Social (S) models show lower explanatory power but maintain structural independence. The low VIF values confirm the absence of multicollinearity, validating the dataset’s robustness for digital twin–based smart building modeling.
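The mean VIF criterion used above can be illustrated with a short sketch. The variance inflation factor of predictor j is 1 / (1 − R_j²), where R_j² comes from regressing predictor j on the other predictors; equivalently, the VIFs are the diagonal of the inverse of the predictors' correlation matrix, which is the route taken here. The data are synthetic and generated only for illustration, NumPy is assumed to be available, and the helper name `vif` is hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X.

    Uses the identity VIF_j = [R^{-1}]_{jj}, where R is the correlation
    matrix of the predictors.
    """
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

rng = np.random.default_rng(0)
n = 100
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + 0.1 * rng.normal(size=n)  # nearly a copy of `a`: deliberately collinear

# Synthetic predictors, not the ESG dataset.
vif_independent = vif(np.column_stack([a, b]))
vif_collinear = vif(np.column_stack([a, b, c]))

print("independent predictors:", np.round(vif_independent, 2))
print("with a collinear column:", np.round(vif_collinear, 2))
```

Independent predictors give VIFs close to 1, while the near-duplicate column pushes the VIFs of `a` and `c` far above the usual thresholds of 5 or 10. Mean VIFs of 1.93, 1.08, and 1.15 therefore sit well inside the range associated with low multicollinearity.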

 

  7. Principal Component Analysis (PCA) for Technical Validation of the ESG Dataset in Smart Building Governance

 

The analysis presented in this section aims to apply the Principal Component Analysis (PCA) technique to provide a technical and scientific validation of the ESG (Environmental, Social, and Governance) dataset developed for testing the smart building governance prototype. PCA is a widely recognized multivariate statistical method used to reduce the dimensionality of complex datasets while preserving their essential information structure. Its application in this context serves a dual purpose: to verify the internal coherence and multidimensionality of the ESG dataset and to ensure that the selected indicators accurately represent the underlying sustainability dimensions without redundancy. The purpose of this analysis is to confirm that the dataset is robust, logically structured, and suitable for integration into advanced digital environments such as digital twin and metaverse platforms. By identifying the principal components that explain the highest variance among the ESG indicators, PCA enables the researcher to isolate the most influential factors affecting smart building performance and governance. This validation step is essential to ensure that the prototype operates on a reliable and scientifically grounded dataset, capable of supporting dynamic simulations, predictive modeling, and real-time decision-making. Through PCA, the dataset’s structural soundness is assessed, verifying the independence and complementarity of the variables associated with environmental efficiency, social comfort, and governance effectiveness. The resulting components will form the analytical backbone for building an integrated system that governs and optimizes smart buildings in immersive digital environments. 
Ultimately, this approach ensures that the proposed governance prototype—based on digital twin and metaverse technologies—is supported by a technically validated and scientifically reliable data framework, reinforcing its potential for sustainable, data-driven management of intelligent infrastructures.
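As a minimal sketch of the procedure described above, the following performs PCA through an eigendecomposition of the correlation matrix, which amounts to PCA on standardized indicators. It assumes NumPy, uses randomly generated stand-in data rather than the ESG dataset, and the function name `pca_correlation` is illustrative; the software actually used for the analysis is not stated in the text.

```python
import numpy as np

def pca_correlation(X):
    """PCA on standardized variables via the correlation matrix.

    Returns eigenvalues (descending), the matrix of component coefficients
    (one column per principal component), and explained-variance shares.
    """
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, explained

# Synthetic stand-in: 100 observations of 8 indicators (NOT the ESG data).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 8))

eigvals, loadings, explained = pca_correlation(X)
cumulative = np.cumsum(explained)

# Component scores: standardized data projected onto the components.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
scores = Z @ loadings

print("explained variance share:", np.round(explained, 3))
print("cumulative share of first four PCs:", round(float(cumulative[3]), 3))
```

Each column of `loadings` has unit length and the columns are mutually orthogonal, so a statement such as "the first four components explain about 40% of the variance" corresponds to reading `cumulative[3]` for the real indicator matrix.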

 

 

7.1 Principal Component Analysis (PCA) Results for the Environmental (E) Dimension of the ESG Model

The results of the Principal Component Analysis (PCA) applied to the environmental component of the ESG model provide a significant validation of the underlying dataset, confirming both its internal coherence and its multidimensional structure (Kwon, Kim, & Choi, 2024). The PCA technique, which decomposes the dataset into orthogonal principal components, is particularly effective for evaluating the relationships among environmental indicators and identifying latent structures that capture the underlying variance of the data (Ascione et al., 2022). In this case, the eigenvalues associated with the first few components demonstrate that approximately 40% of the total variance is explained by the first four principal components, indicating that the environmental indicators share meaningful correlations without redundancy. This supports the use of PCA as a robust approach to assess data consistency and dimensionality reduction within the ESG framework. From the component loadings, the first principal component (PC1) captures the largest share of variance and is primarily driven by positive contributions from EMIN (Emission Intensity), LCF (Load Cover Factor), and ENCO (Energy Consumption), while variables such as CFPT (Carbon Footprint) and SCF (Supply Cover Factor) contribute negatively. This component appears to represent a balance between efficiency and energy consumption, reflecting how emissions and energy coverage jointly influence environmental performance. The second component (PC2) has strong positive loadings for OER (On-site Energy Ratio) and GII (Grid Interaction Index), while FLF (Flexibility Factor) shows a strong negative contribution. This suggests that PC2 differentiates systems with higher local energy autonomy from those that rely more on flexibility and grid interaction, aligning with the notion of distributed energy management (Zhang et al., 2023). 
The third and fourth components (PC3 and PC4) further refine the structure of the data, capturing subtler aspects of energy-environmental interactions. For instance, PC3 shows high positive loadings for ENCO and OPP (One Percent Peak Power), while LCF and LMI (Load Matching Index) load negatively, suggesting a contrast between energy demand peaks and load coverage capacity. PC4, on the other hand, captures variability associated with EMIN and CAF (Capacity Factor), pointing toward the efficiency of energy conversion processes within the system (Kwon et al., 2024). A noteworthy observation is that none of the variables display extreme loadings across multiple components, which indicates that the dataset lacks strong multicollinearity and maintains a balanced contribution of each indicator to the overall structure. This aligns with the earlier regression analyses that confirmed low mean variance inflation factors (VIF), thereby reinforcing the dataset’s internal consistency (Islam, Guerrieri, Gravina, & Fortino, 2024). The presence of moderate but distributed loadings also implies that each variable contributes uniquely to the multidimensional understanding of environmental performance, making the dataset appropriate for subsequent modeling steps. The negative correlations observed in some components, such as between CFPT and EMIN, or SCF and CAF, emphasize the complexity of the environmental dimension. These negative signs do not indicate inconsistencies but rather complementary dynamics: higher carbon footprints tend to associate with lower emission intensity efficiency, while energy coverage and capacity factors reveal trade-offs between resource use and operational performance. This reinforces the interpretative depth of PCA as a diagnostic validation tool rather than a purely descriptive method (Zhou et al., 2023). Overall, the PCA results validate the environmental dataset as a coherent and structurally reliable foundation for the ESG model. 
The distribution of eigenvalues and loadings supports the presence of independent, interpretable dimensions within the environmental domain. This validation step is crucial, especially considering the dataset’s intended application in the development of a management prototype integrating Digital Twin and Metaverse technologies (Zhang et al., 2023). In this context, PCA ensures that the environmental indicators capture distinct yet complementary aspects of building energy efficiency, emission control, and operational sustainability. Consequently, the PCA model not only confirms the statistical robustness of the environmental data but also establishes a reliable basis for embedding it within a digital simulation environment for smart building management.

 

 

 

 

 

Figure X. Principal Component Loadings for Environmental (E) Factors in the ESG Model. Note: The figure illustrates the loading values of each environmental indicator (ENCO, CFPT, EMIN, LCF, SCF, LMI, OER, GII, NGI, CAF, OPP, DRS, and FLF) across the principal components (PC1–PC15). The distributed and moderate loading patterns confirm that no single factor dominates the variance, indicating balanced variable contributions and low multicollinearity. This supports the dataset’s structural integrity and validates its suitability for digital twin–based smart building governance modeling within the ESG framework.

 

 

 

7.2 Principal Component Analysis (PCA) Results for the Social (S) Dimension of the ESG Model

 

The principal component analysis of the Social (S) dimension in the ESG framework provides a deep understanding of how human-related and comfort variables interact within smart building environments. Incorporating the identified variables — such as Occupants (OCC), Relative Humidity (HUM), Particulate Matter (PM2.5 and PM10), Volatile Organic Compounds (VOC), Air Changes per Hour (ACH), Thermal Insulation (THR), Sound Insulation (SND), Energy Efficiency Ratio (EER), Coefficient of Performance (COP), System Efficiency (SEF), Energy Use Intensity (EUI), and Lighting Power Density (LPD) — the PCA demonstrates the multidimensional structure of the social component, highlighting interdependencies between human comfort, air quality, and building performance metrics (Bonab, Bellini, & Rudko, 2023). The first principal component (PC1) shows strong negative loadings for EER, COP, and SEF, indicating that this dimension captures the efficiency and operational quality aspects of social comfort. These parameters represent the building’s ability to maintain indoor well-being through technological optimization. Negative values suggest an inverse relationship between system efficiency and variability in occupant conditions, implying that as systems become more efficient, fluctuations in perceived comfort decrease. This aligns with the principles of smart building management, where automation and digital control stabilize the indoor environment (Elnour, Meskin, Khan, & Jain, 2021). PC2 exhibits positive contributions from EUI and LPD, suggesting that energy consumption per person and lighting density are key indicators of human activity levels within buildings. This axis can be interpreted as a behavioral-energy dimension, linking occupant presence and usage patterns to energy demand. It supports the concept that social variables are not isolated but are reflections of dynamic interactions between people and infrastructure (Ma et al., 2023). 
The third component (PC3) emphasizes indoor air quality factors, with high negative loadings for PM2.5, PM10, and THR. This reveals an important trade-off between particulate pollution and thermal comfort. In smart building contexts, this component provides insight into how environmental control systems influence both health-related and comfort-related metrics. Lower particulate concentrations may require higher ventilation rates (ACH), which in turn affect energy consumption and humidity balance (Hu & Lu, 2024). PC4 is primarily characterized by strong positive loadings for Occupants (OCC) and Humidity (HUM), alongside moderate contributions from ACH and PM10. This suggests that the fourth component captures spatial and microclimatic comfort interactions, where occupant density and air renewal are central to maintaining an acceptable indoor environment. In digital twin applications, such relationships are essential for predicting comfort variations based on occupancy data and HVAC system behavior. Higher components, such as PC5 through PC7, refine specific comfort and acoustic dimensions. Negative loadings of SND and THR indicate the balance between thermal insulation, noise control, and user satisfaction. These components are crucial for understanding the subtle effects of building envelope performance on perceived comfort, an area that is increasingly relevant for ESG-oriented smart building metrics (Hu & Lu, 2024). Finally, components like PC8 to PC13 capture residual variance associated with specific operational parameters, confirming that while social indicators are diverse, they remain statistically coherent and non-redundant. The consistent spread of variance across components underscores the structural validity of the dataset, confirming that each variable contributes uniquely to the representation of social sustainability within buildings. 
Overall, the PCA confirms that the social dataset is robust and internally coherent, providing strong empirical support for its use in validating the proposed metric model. The clear clustering of efficiency-related, environmental, and comfort indicators reflects a realistic representation of how occupants experience smart buildings. When integrated into a digital twin and metaverse-based management system, these results ensure that the model can simulate user-environment interactions, predict comfort dynamics, and optimize building operations in line with ESG principles (Bonab et al., 2023; Ma et al., 2023).

 

 

 

 

 

 

 

Figure X. Principal Component Loadings for Social (S) Factors in the ESG Model. Note: The figure presents the loading values of social indicators (OCC, HUM, PM2.5, PM10, VOC, ACH, THR, SND, EER, COP, SEF, EUI, and LPD) across the principal components (PC1–PC13). The distribution of moderate and distinct loadings confirms that each factor contributes uniquely to the social dimension. Efficiency variables (EER, COP, SEF) and comfort-related indicators (PM2.5, PM10, HUM) form separate but complementary clusters, validating the dataset’s internal consistency and its suitability for digital twin and metaverse-based smart building governance applications.

 

7.3 Principal Component Analysis (PCA) Results for the Governance (G) Dimension of the ESG Model

 

 

 

 

 

 

 

 

The Principal Component Analysis (PCA) of the Governance (G) component in the ESG model provides critical evidence for the statistical validity and structural coherence of the dataset intended for digital twin and metaverse-based smart building management. This component includes variables related to economic efficiency and governance performance—specifically, cost efficiency (CES), energy return on investment (EROI), energy payback time (EPBT), construction and operational costs (CPD, CCF), system performance and control (SPC), renewable energy utilization (REU), and energy productivity per worker (EPWH). Together, these indicators describe the economic and managerial dimension of sustainable smart buildings, where financial optimization, performance monitoring, and long-term resource efficiency are intertwined (Dovolil & Svítek, 2024). The first principal component (PC1) explains a substantial portion of the variance, with strong negative loading for CES (-0.537) and positive loading for CCF (0.531) and SPC (0.506). This pattern highlights a fundamental trade-off between cost reduction per unit of energy saved and capital or operational investment, which is typical in building governance models (Bezrukov, Sadovnikova, & Lebedinskaya, 2022). In a digital twin context, this suggests that reducing the marginal cost of energy efficiency (CES) is associated with higher upfront or management costs (CCF, SPC), reflecting realistic investment-efficiency dynamics. PC1 can therefore be interpreted as a “financial governance axis,” emphasizing the relationship between cost control, structural investment, and system efficiency. The second component (PC2) shows significant negative correlations for EPBT (-0.472), REU (-0.516), and EPWH (-0.467), suggesting that this component represents the temporal and productivity-related aspect of governance. 
Shorter energy payback times and greater renewable energy utilization contribute to higher system efficiency but require optimization of workforce productivity and process management (Pandhare et al., 2024). This factor can be understood as an “operational sustainability axis,” demonstrating the capacity of governance metrics to reflect the long-term return of energy and human capital investments. PC3 and PC4 reveal more specific structural relations within the dataset. The strong positive loading of EPBT (0.458) and CPD (0.619) in these components indicates that buildings with longer payback periods also tend to have higher cost structures. This pattern validates the consistency of the dataset, showing that financial and temporal metrics are not independent but logically correlated. In a digital twin simulation, these relationships can be used to model the trade-offs between project duration, capital investment, and lifecycle sustainability (Zhang, Yu, & Tian, 2024). The significant contribution of EROI (-0.601 in PC4) further connects governance efficiency to the building’s ability to generate positive energy returns, highlighting the strategic value of integrating real-time energy flow analytics in metaverse-based management systems. The fifth and sixth components (PC5 and PC6) capture more subtle variations related to operational resilience and system integration. REU (0.325 in PC5, -0.584 in PC6) and EPWH (-0.751 in PC5) suggest that renewable energy performance and energy use efficiency per worker vary inversely, reflecting the complexity of aligning workforce productivity with renewable infrastructure adoption. This finding is particularly relevant for smart building governance because it illustrates how data-driven management—enabled by digital twins—can balance human and technological performance indicators (Pandhare et al., 2024). Finally, PC7 and PC8 consolidate the multidimensionality of the governance structure. 
The strong positive loading of CES (0.519 and 0.536) indicates that cost efficiency remains a dominant variable across higher components, confirming that economic optimization is consistently embedded in the model. The coherence of loadings across multiple components demonstrates that each indicator contributes uniquely to the overall governance structure, with no redundancy or distortion. In summary, the PCA results confirm that the governance dataset is statistically robust and conceptually coherent. The clear differentiation of principal components reflects the internal logic of ESG-based governance, where financial, operational, and energy metrics interact systematically. This validates the model’s suitability for integration into a digital twin and metaverse framework, enabling predictive management, optimization of energy investment, and real-time governance of smart building performance. The structure uncovered by the PCA not only supports the empirical reliability of the data but also provides a scientific foundation for developing intelligent, data-driven systems aligned with sustainable management objectives.

Figure X. Principal Component Loadings for Governance (G) Factors in the ESG Model. Note. The figure displays the loadings of governance-related indicators (CES, EROI, EPBT, CPD, CCF, SPC, REU, and EPWH) across the principal components (PC1–PC8). The results highlight clear structural differentiation among financial, operational, and productivity dimensions. PC1 and PC2 capture cost–efficiency and sustainability trade-offs, while higher components (PC4–PC6) reflect investment and performance dynamics. The balanced distribution of loadings confirms the statistical coherence and multidimensional integrity of the governance dataset, validating its use for digital twin and metaverse-based smart building governance models.
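
The way such loadings are obtained can be sketched in outline. This is a minimal illustration, not the procedure used for the study: it assumes that loadings are the unit-length eigenvectors of the correlation matrix of the standardized indicators, and it finds them with plain power iteration and deflation. The function names and the three-variable demo matrix are hypothetical.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns in `rows` (one list per building)."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(k)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(k)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(k)]
            for a in range(k)]

def principal_components(corr, n_components, iters=500):
    """Return [(eigenvalue, loadings), ...] for the leading components."""
    k = len(corr)
    m = [row[:] for row in corr]
    components = []
    for _ in range(n_components):
        v = [float(i + 1) for i in range(k)]  # arbitrary start vector
        for _ in range(iters):
            w = [sum(m[a][b] * v[b] for b in range(k)) for a in range(k)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[a] * m[a][b] * v[b] for a in range(k) for b in range(k))
        components.append((eigval, v))
        # Deflation: remove the component just found before the next search.
        m = [[m[a][b] - eigval * v[a] * v[b] for b in range(k)] for a in range(k)]
    return components

# Hypothetical correlation matrix for three indicators (not the study's data).
demo_corr = [[1.0, 0.8, 0.0],
             [0.8, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
pcs = principal_components(demo_corr, 2)
for i, (eigval, loadings) in enumerate(pcs, start=1):
    share = eigval / len(demo_corr)  # explained-variance share for a correlation matrix
    print(f"PC{i}: share={share:.2f}, loadings={[round(x, 3) for x in loadings]}")
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing an eigenvalue by that number gives the share of variance attributed to the component. The sign of a loading vector is arbitrary, so only the relative signs within a component carry meaning.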

8. Machine Learning Regression for ESG Dataset Validation in Digital Twin and Metaverse-Based Smart Building Governance

The machine learning regression analysis presented in this section was developed as a key step in the technical and scientific validation of a dataset designed for the testing and calibration of a prototype aimed at the management of smart buildings through Digital Twin and Metaverse technologies. The purpose of this process is to ensure that the dataset, structured according to the Environmental, Social, and Governance (ESG) framework, demonstrates high levels of internal consistency, predictive reliability, and interpretability—three essential conditions for its integration into intelligent, data-driven decision systems. By applying advanced machine learning algorithms such as Random Forest and Support Vector Machine (SVM), the study evaluates how effectively the dataset captures the complex, nonlinear relationships that characterize smart building governance. Each ESG dimension—environmental, social, and governance—is analyzed to identify the most suitable model capable of minimizing prediction errors (MSE, RMSE, MAE, MAPE) while maximizing explanatory performance (R²). The Random Forest model proves particularly effective for validating the Environmental and Social components, owing to its ensemble-based structure that captures multidimensional dependencies, avoids overfitting, and enhances interpretability through variable importance measures. The SVM algorithm, conversely, demonstrates superior performance in modeling the Governance dimension, where financial and operational variables interact through complex, non-linear patterns. The outcome of this machine learning validation process confirms that the ESG dataset provides a statistically robust foundation for developing an intelligent management prototype. 
Within a Digital Twin and Metaverse framework, this validated dataset will enable real-time simulation, optimization, and governance of building performance, energy efficiency, and sustainability—transforming smart buildings into adaptive, self-learning systems that support informed, data-driven decision-making.

8.1 Random Forest Regression for Environmental Dataset Validation within the ESG Framework

The selection of Random Forest as the best-performing model for validating the environmental component of the ESG dataset rests on a comprehensive evaluation of multiple performance metrics, some of which are to be minimized and others maximized. In predictive modeling, a reliable validation approach must account for this dual nature of the indicators. Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) must be minimized because they quantify the deviation between predicted and actual values; lower values correspond to better accuracy and less dispersion of residuals. The coefficient of determination (R²), on the other hand, must be maximized, as it measures the proportion of the variance in the dependent variable explained by the model and thus reflects its explanatory power. The Random Forest model offers the best balance between these opposing objectives. Its normalized MSE, RMSE, and MAE values are the lowest among all tested algorithms (0.00 in each case), and its normalized MAPE (0.46) falls in the middle of the range, indicating strong predictive precision and stability (Inala et al., 2024). Even though its R² is moderate compared to the Linear Regression model, this is compensated by the fact that Random Forest captures complex, nonlinear relationships that linear models fail to represent adequately (Ji & Niu, 2024). The ensemble structure of Random Forest, based on the aggregation of multiple decision trees, allows it to reduce variance and limit overfitting, which enhances its robustness when validating large and heterogeneous datasets such as those associated with ESG indicators (Wang et al., 2024). Furthermore, the model’s interpretability and ability to estimate variable importance make it particularly suitable for applications in digital twin and metaverse environments. 
By quantifying the contribution of each variable, Random Forest supports both predictive accuracy and strategic understanding of how environmental, social, and governance components influence building performance. Its capacity to minimize prediction errors while maintaining stable explanatory reliability validates its use as a scientifically sound model for data validation in smart building management and ESG-driven digital infrastructures (Ukwuoma et al., 2024; Vasilica et al., 2025).

Table xyz. Normalized Performance Metrics of Machine Learning Models for ESG Dataset Validation — Environmental Component

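| Metric | Boosting | Decision Tree | KNN | Linear Regression | Random Forest | Regularized Linear | SVM |
|---|---|---|---|---|---|---|---|
| MSE | 0.33 | 0.30 | 0.65 | 1.00 | 0.00 | 0.36 | 0.38 |
| RMSE | 0.35 | 0.33 | 0.73 | 1.00 | 0.00 | 0.42 | 0.44 |
| MAE | 0.44 | 0.53 | 0.78 | 1.00 | 0.00 | 0.27 | 0.52 |
| MAPE | 0.66 | 0.50 | 1.00 | 0.67 | 0.46 | 0.47 | 0.00 |
| R² | 0.01 | 0.07 | 0.38 | 1.00 | 0.00 | 0.44 | 0.00 |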

Note: All metrics — MSE, RMSE, MAE, MAPE, and R² — are normalized to enable direct comparison among algorithms. The Random Forest model shows the lowest normalized errors and a stable R², confirming its superior accuracy and robustness. These results validate its suitability for the technical–scientific verification of the environmental dataset used in developing a digital twin and metaverse-based smart building management prototype.
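
The five metrics and their normalization can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual pipeline: the text says only that the metrics are "normalized", and min-max scaling across models is assumed here because every row of the table spans 0.00 to 1.00. The function names and the demo values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, MAPE (in percent) and R² for one model."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)
    mse = sse / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a target is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae,
            "MAPE": mape, "R2": 1.0 - sse / ss_tot}

def min_max_normalize(value_by_model):
    """Rescale one metric across models to [0, 1] (assumed normalization)."""
    lo, hi = min(value_by_model.values()), max(value_by_model.values())
    # Assumes the models do not all share the same value (hi > lo).
    return {name: (v - lo) / (hi - lo) for name, v in value_by_model.items()}

# Hypothetical targets and predictions, for illustration only.
demo = regression_metrics([2.0, 4.0, 6.0], [2.0, 5.0, 5.0])
scaled = min_max_normalize({"A": 2.0, "B": 4.0, "C": 3.0})
print(demo)
print(scaled)
```

Under this scaling a value of 0.00 marks the model with the smallest raw value of a metric and 1.00 the largest, so for the error metrics lower is better, while for R² a higher normalized value is better.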

 

The analysis of the Random Forest model applied to the ESG dataset reveals a coherent and scientifically valid validation process for its future use in the development of a smart building management prototype that integrates digital twin and metaverse technologies. The feature importance metrics, expressed through the mean dropout loss, serve as a robust indicator of how each environmental variable contributes to the predictive power of the model. This metric, based on fifty permutations of the dataset, measures the increase in error (in terms of RMSE) when a variable is randomly excluded from the model. Lower dropout loss values correspond to less influential variables, whereas higher values indicate features whose removal leads to a significant deterioration in predictive accuracy. In this case, CAF (Capacity Factor), SCF (Supply Cover Factor), and OER (On-site Energy Ratio) exhibit the highest dropout loss, showing that they are essential in explaining variations in the dependent variable AREA. These results are consistent with the physical logic of smart building systems, where energy capacity balance, supply coverage, and on-site generation efficiency play fundamental roles in determining operational performance and sustainability outcomes (Du, 2024; Kinshakov et al., 2021). This importance ranking demonstrates that the Random Forest algorithm not only captures statistical correlations but also reflects the actual structural dynamics governing energy and environmental processes within buildings (Miao & Xu, 2024). In particular, the model’s ability to represent nonlinear interactions enhances the interpretability of variable contributions—especially when dropout loss metrics are combined with ensemble-based prediction strategies (Orlenko & Moore, 2020). 
Moreover, recent studies demonstrate that such methods significantly improve the readability and explainability of complex systems, supporting their use in high-dimensional, heterogeneous ESG datasets (Yu et al., 2021). Consequently, the application of Random Forest in this context is not only statistically justified but also conceptually aligned with the operational goals of smart building governance.

Figure xyz. Feature Importance Based on Mean Dropout Loss — Environmental Component

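| Variables | Mean dropout loss | Variables | Mean dropout loss |
|---|---|---|---|
| CAF | 5.077 | CFPT | 5.068 |
| SCF | 5.074 | LCF | 5.068 |
| OER | 5.070 | OPP | 5.068 |
| FLF | 5.069 | LMI | 5.068 |
| GII | 5.069 | FLI | 5.068 |
| EMIN | 5.069 | ENCO | 5.067 |
| NGI | 5.068 | FEE | 5.067 |
| DRS | 5.068 | | |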

Note. The mean dropout loss values indicate each variable’s contribution to the Random Forest model. Higher values (e.g., CAF, SCF, OER) represent greater influence on model accuracy, confirming their key role in validating the environmental dataset for smart building analysis.
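
A permutation-based dropout loss of this kind can be sketched as follows. The code is a model-agnostic illustration, not the implementation behind the reported values: it returns the mean RMSE after shuffling one feature at a time over fifty permutations, and whether the original tool reports this raw loss or its increase over the unpermuted baseline is not stated in the text. The toy model and data are hypothetical.

```python
import math
import random

def rmse(model, X, y):
    """Root mean squared error of `model` (a callable on one row) over X, y."""
    return math.sqrt(sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y))

def mean_dropout_loss(model, X, y, n_permutations=50, seed=0):
    """Mean RMSE after permuting each feature column in turn."""
    rng = random.Random(seed)
    losses = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_permutations):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [c] + row[j + 1:] for row, c in zip(X, column)]
            total += rmse(model, X_perm, y)
        losses.append(total / n_permutations)
    return losses

# Hypothetical model that depends on feature 0 only and ignores feature 1.
def toy_model(row):
    return 3.0 * row[0]

X = [[float(i), float(i % 3)] for i in range(20)]
y = [toy_model(row) for row in X]
losses = mean_dropout_loss(toy_model, X, y)
print(losses)
```

In this toy case shuffling the ignored feature leaves the error at zero, while shuffling the feature the model relies on raises it, which is the sense in which a higher dropout loss marks a more influential variable.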

 

The model’s ability to identify meaningful predictors supports the internal coherence of the ESG dataset and confirms its reliability as a basis for digital twin modelling. The additive explanations of the predictions further reinforce the model’s interpretability. Each predicted value is constructed from a baseline prediction (the “Base”) adjusted by the additive contributions of each variable. Positive contributions increase the predicted AREA, while negative ones decrease it. For example, in Case 1, positive influences from GII (Grid Interaction Index) and FLI (Flexibility Index) compensate for the negative impact of SCF and EMIN, resulting in a final prediction slightly above the baseline. This additive approach allows for a clear decomposition of the prediction mechanism, offering transparency in understanding how individual environmental factors shape the model’s output. Such interpretability is essential for validating the dataset in a scientific context, as it ensures that the model’s decisions are both explainable and consistent with domain knowledge. By integrating the mean dropout loss and additive prediction explanations, the Random Forest model provides a double-layer validation: it identifies the most influential features for prediction and explains how they act in shaping each result. This combination of accuracy, interpretability, and conceptual alignment with building energy dynamics confirms that the model is methodologically sound and suitable for the prototyping of an intelligent management system for smart buildings, capable of leveraging digital twin and metaverse technologies for real-time performance monitoring and sustainable decision-making.

 

Table xyz. Additive Feature Contributions in Random Forest Predictions — Environmental Component

| Case | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Predicted | 9.141 | 8.936 | 9.175 | 8.931 | 8.931 |
| Base | 9.063 | 9.063 | 9.063 | 9.063 | 9.063 |
| ENCO | -0.298 | -2.765 | 4.759 | -9.130 | 6.100 |
| CFPT | 8.291 | -20.937 | 0.220 | 4.921 | -12.691 |
| EMIN | -10.720 | -1.963 | 1.428 | 16.626 | -17.500 |
| LCF | -16.680 | -18.864 | 9.401 | -21.168 | 22.139 |
| SCF | -9.687 | -2.902 | -13.972 | -59.684 | -46.113 |
| LMI | 2.564 | -1.952 | -4.729 | 2.104 | 1.990 |
| OER | 10.921 | 10.930 | 10.902 | 10.914 | 10.910 |
| GII | 32.240 | -20.779 | 35.824 | -22.670 | 5.397 |
| NGI | 8.787 | 23.709 | -7.874 | -9.781 | 1.382 |
| CAF | -33.998 | 3.153 | 34.955 | -36.673 | -36.673 |
| OPP | 13.678 | -18.241 | -11.706 | 13.536 | -4.859 |
| DRS | 29.411 | -25.220 | -2.092 | -17.218 | -2.467 |
| FLF | 42.525 | -52.681 | 53.365 | -4.552 | -57.637 |
| FLI | 0.446 | 1.780 | 1.584 | 1.260 | -2.368 |
| FEE | -0.064 | -0.010 | -0.139 | -0.194 | -0.013 |

Note. This table shows the additive contributions of each environmental variable to the predicted AREA across five test cases. Positive values increase the prediction, while negative ones reduce it. The results highlight the interpretability of the Random Forest model, confirming that the dataset captures realistic and consistent relationships among energy and environmental indicators.
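
The idea of building a prediction from a base value plus per-variable contributions can be illustrated with a deliberately simple case. The sketch below assumes a linear model, for which the decomposition is exact; the text does not state which attribution method was applied to the Random Forest, and tree ensembles need a model-agnostic method instead. The weights, bias, and data are hypothetical.

```python
def additive_explanation(weights, bias, X_background, x):
    """Split a linear prediction into a base value plus per-feature contributions.

    base            = prediction at the feature means of the background data
    contribution_j  = weight_j * (x_j - mean_j)
    """
    n = len(X_background)
    means = [sum(row[j] for row in X_background) / n for j in range(len(weights))]
    base = bias + sum(w * m for w, m in zip(weights, means))
    contributions = [w * (xj - m) for w, xj, m in zip(weights, x, means)]
    predicted = base + sum(contributions)
    return base, contributions, predicted

# Hypothetical two-feature linear model and background sample.
weights, bias = [2.0, -1.0], 1.0
background = [[0.0, 0.0], [2.0, 4.0]]
base, contributions, predicted = additive_explanation(weights, bias, background, [3.0, 1.0])
print(base, contributions, predicted)
```

A positive contribution pushes the prediction above the base and a negative one pulls it below, which is how the signs in the table are meant to be read.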

 

 

8.2 Machine Learning Validation of the Social (S) Component in the ESG Dataset

In the context of developing a scientifically grounded methodology for validating the ESG dataset, this section focuses on the Social (S) component by applying and comparing different machine learning regression algorithms. The goal is to identify which algorithm best captures the underlying relationships among social performance indicators relevant to smart building management while ensuring both predictive reliability and interpretability (Li & Xu, 2024). After the normalization of performance metrics, the Random Forest algorithm demonstrates the most balanced and consistent results. It achieves the lowest normalized error values across MSE, RMSE, and MAE, indicating superior predictive accuracy and robustness in modeling the social variables. The model’s relatively low MAPE further supports its reliability, as it suggests that Random Forest maintains stable relative error levels across the range of predicted values, ensuring that deviations between observed and estimated outputs remain proportionally small (Gaur et al., 2021; Li, 2025). By contrast, Linear Regression, while producing the highest R² value, exhibits significantly higher normalized error metrics. This indicates that despite its apparent explanatory power, the linear model fails to account for the complex, nonlinear interactions typical of social indicators in ESG frameworks, leading to overfitting and reduced generalizability (Li & Jiang, 2023). In this sense, Random Forest provides a better trade-off between minimizing errors and maximizing interpretability, effectively capturing multidimensional relationships among variables such as occupant comfort, indoor air quality, and system efficiency, which collectively define the social sustainability of building operations. 
The results confirm that the Random Forest approach not only enhances the predictive stability of the validation process but also ensures methodological consistency with the broader objective of dataset validation within a digital twin and metaverse framework (Khan & Vora, 2024). Its ability to model complex nonlinearities and maintain low residual variance validates the dataset’s structural coherence and reinforces its suitability for integration into the prototyping of a smart building management system capable of dynamic, data-driven decision-making.

Table xyz. Normalized Performance Metrics of Machine Learning Models — Social (S) Component

| Metric | Boosting | Decision Tree | KNN | Linear | Random Forest | Regularized Linear | SVM |
|---|---|---|---|---|---|---|---|
| MSE | 0.828 | 0.273 | 0.186 | 0.771 | 0.000 | 0.004 | 0.133 |
| RMSE | 0.989 | 0.123 | 0.027 | 0.949 | 0.000 | 0.002 | 0.067 |
| MAE | 0.713 | 0.210 | 0.044 | 1.000 | 0.006 | 0.000 | 0.038 |
| MAPE | 1.000 | 0.238 | 0.292 | 0.595 | 0.263 | 0.316 | 0.000 |
| R² | 0.000 | 0.182 | 0.727 | 1.000 | 0.667 | 0.182 | 0.000 |

Note: The table presents normalized evaluation metrics for different machine learning algorithms applied to the Social (S) dataset. The Random Forest model achieves the lowest error values (MSE, RMSE, MAE) and balanced performance, confirming its superior predictive accuracy and suitability for validating social indicators within digital twin and metaverse smart building frameworks.

The application of the Random Forest algorithm to the Social (S) dimension of the ESG model provides valuable insights for validating the dataset’s internal coherence and predictive reliability in the context of smart building management. This validation is essential to support the prototyping of a management model based on digital twin and metaverse technologies, which require accurate, interpretable, and scalable data structures to simulate and optimize human-environment interactions within buildings (Li, 2025). The results obtained through three feature importance metrics (mean decrease in accuracy, total increase in node purity, and mean dropout loss) illustrate the role and weight of social indicators such as air quality, comfort, and system efficiency in predicting the dependent variable (Miao & Xu, 2024). The mean dropout loss, calculated through fifty permutations, serves as an indicator of the relative contribution of each feature to model accuracy. As in the environmental analysis, higher dropout loss values correspond to higher importance, because permuting an influential feature degrades model performance more (Xu, 2021). In this dataset, PM2.5 (PM25, 3.919), VOC (3.820), and Humidity (HUM, 3.727) show the highest dropout loss, while Sound Insulation (SND, 3.602), Thermal Insulation (THR, 3.595), System Efficiency (SEF, 3.623), and Coefficient of Performance (COP, 3.666) fall within the same narrow range (3.563 to 3.919), so all of these variables contribute to explaining the variance of the output. These indicators are directly linked to the comfort and operational quality of the indoor environment, which are central to the social sustainability dimension of smart buildings (Chowdhury et al., 2023). Variables such as Occupants (OCC) and Air Changes per Hour (ACH) contribute to the model with a moderate but consistent effect, emphasizing how internal environmental control and occupancy behavior affect building performance through indirect interactions. The other two importance measures, mean decrease in accuracy and total increase in node purity, complement these findings. 
The positive values associated with PM2.5 (PM25), System Efficiency (SEF), and COP indicate that they significantly enhance the model’s predictive capacity, while negative or small values in other variables reflect lower or context-dependent influence. The total increase in node purity, a measure of how much a variable reduces overall model variance when used to split data in decision trees, identifies similar key drivers, suggesting the model’s internal coherence across multiple evaluation metrics (Lou, 2025).

Table xyz. Feature Importance Metrics for the Random Forest Model — Social (S) Component

| Variable | Mean decrease in accuracy | Total increase in node purity | Mean dropout loss |
|---|---|---|---|
| VOC | -310.022 | 1.200×10^8 | 3.820 |
| PM25 | 2.522×10^6 | 7.406×10^7 | 3.919 |
| HUM | -361.803 | 6.387×10^7 | 3.727 |
| OCC | -455.332 | 6.330×10^7 | 3.647 |
| ACH | -777.406 | 5.574×10^7 | 3.644 |
| PM10 | -38.839 | 5.482×10^7 | 3.632 |
| LPD | 153.570 | 5.290×10^7 | 3.638 |
| EUI | 120.725 | 5.279×10^7 | 3.653 |
| COP | 346.376 | 4.928×10^7 | 3.666 |
| SND | 284.558 | 4.860×10^7 | 3.602 |
| THR | -515.408 | 4.721×10^7 | 3.595 |
| SEF | 862.251 | 4.260×10^7 | 3.623 |
| EER | -55.396 | 3.588×10^7 | 3.563 |

Note. This table reports the importance metrics derived from the Random Forest model, including mean decrease in accuracy, total increase in node purity, and mean dropout loss. Variables such as SEF, COP, and PM2.5 show the highest influence on model accuracy, confirming their central role in explaining social sustainability and indoor comfort dynamics in smart buildings.

The additive explanations for predictions provide another layer of interpretability, illustrating how each variable contributes to specific case predictions. For example, in the first test case, variables such as PM25 and EER (Energy Efficiency Ratio) have strong positive contributions to the predicted value, whereas factors like VOC and LPD exert negative effects. These additive contributions allow the decomposition of predictions into comprehensible components, which is particularly valuable for digital twin applications that rely on traceable, feature-level understanding to inform operational decisions (Ozdemir et al., 2025). The capacity to visualize how indoor comfort, air quality, and energy efficiency dynamically influence outcomes reinforces the model’s practical relevance for smart building management. Overall, the Random Forest model demonstrates a robust and balanced capability to capture complex, nonlinear interactions among social variables within the ESG framework. It effectively distinguishes between features with direct physical impacts—such as thermal and acoustic insulation—and those representing behavioral or environmental feedbacks, like occupancy and ventilation rates. This multi-level interpretability confirms the dataset’s scientific validity, showing that it contains coherent, measurable relationships consistent with the physical and social principles of building performance (Orlenko & Moore, 2020; Drobnič et al., 2020). Therefore, this analysis validates the dataset as a reliable foundation for the development of an intelligent management prototype that integrates machine learning with digital twin and metaverse environments. The model’s structure supports the simulation of user comfort and operational efficiency, providing a data-driven mechanism for adaptive, sustainable management of smart buildings (Yu et al., 2021; Akhtar et al., 2024).

Table xyz. Additive Prediction Explanations for the Random Forest Model — Social (S) Component

| Case | Predicted | Base | OCC | HUM | PM25 | PM10 | VOC |
|---|---|---|---|---|---|---|---|
| 1 | 10.119 | 9.706 | 137.642 | 356.475 | 102.444 | -145.549 | -198.943 |
| 2 | 9.497 | 9.706 | -141.526 | -557.700 | 510.704 | -426.882 | -234.224 |
| 3 | 8.345 | 9.706 | 271.758 | 310.474 | -1.262 | -290.846 | -164.060 |
| 4 | 9.409 | 9.706 | -99.743 | -292.405 | -1.277 | 613.552 | 1.001 |
| 5 | 9.857 | 9.706 | -64.956 | 461.545 | 242.598 | -5.588 | -325.978 |

| Case | ACH | THR | SND | EER | COP | SEF | EUI | LPD |
|---|---|---|---|---|---|---|---|---|
| 1 | -182.452 | -28.543 | 158.293 | 285.878 | 74.686 | -124.032 | 81.448 | -104.236 |
| 2 | 151.726 | 122.638 | 432.515 | -131.370 | 255.324 | 100.229 | -32.603 | -258.426 |
| 3 | -167.946 | -121.153 | -146.303 | 6.577 | 260.931 | 207.834 | -42.744 | -223.179 |
| 4 | -253.144 | 52.756 | 403.440 | -192.171 | 189.206 | 139.739 | -392.002 | -191.752 |
| 5 | -52.027 | 84.146 | -43.087 | -56.376 | 267.341 | 184.743 | -363.505 | -178.109 |

Note. The table illustrates the additive contributions of each variable to the predicted values across five test cases. Positive and negative values indicate how each social indicator (e.g., OCC, PM2.5, SND, COP) influences the final prediction relative to the baseline. These results confirm the interpretability of the Random Forest model and its capacity to capture complex interactions between comfort, air quality, and energy efficiency in smart building environments.

 

 

8.3 Machine Learning Validation of the Governance (G) Component within the ESG Framework

 

The validation of the Governance (G) component of the ESG model through machine learning techniques represents a critical step in ensuring the scientific reliability and applicability of the dataset for the prototyping of a smart building management system. Within this context, the application of the Support Vector Machine (SVM) algorithm was identified as the most effective method for the validation process (Wang, 2025). The Governance dataset includes key indicators such as Cost of Energy Saved (CES), Energy Return on Investment (EROI), Energy Payback Time (EPBT), Construction and Capital Costs (CPD and CCF), System Performance Coefficient (SPC), Renewable Energy Utilization (REU), and Energy Productivity per Worker Hour (EPWH). These variables jointly capture the economic and managerial dimensions of building performance, linking financial efficiency with operational sustainability (Wu et al., 2023). SVM was selected due to its superior performance across multiple validation metrics, particularly in minimizing mean absolute error (MAE) and mean absolute percentage error (MAPE), while maintaining a high coefficient of determination (R²). Unlike linear regression or decision trees, which may struggle to represent nonlinear dependencies, SVM effectively models the complex and interrelated relationships among governance variables (Lin & Hsu, 2023). This is crucial for ESG-driven frameworks, where economic efficiency, energy optimization, and operational decision-making are deeply intertwined. The low normalized MSE and RMSE further confirm the algorithm’s capacity to reduce prediction variance, ensuring high accuracy in estimating key governance outcomes such as cost-effectiveness and return efficiency (Koseoglu et al., 2025). The dataset itself, composed of one hundred buildings with diverse energy and cost characteristics, provides a robust foundation for testing the generalization capabilities of the model. 
SVM’s kernel-based approach allows for capturing nonlinear interactions between energy payback time, system costs, and governance efficiency indicators without overfitting the data (Suprihadi & Danila, 2024). This adaptability makes it particularly suitable for applications in digital twin environments, where data-driven models must reflect real-time changes and complex system feedbacks. By integrating this validated model into a digital twin framework, it becomes possible to simulate governance-related decisions in virtual environments before implementing them in physical infrastructures. This enhances predictive control, cost management, and operational resilience in smart buildings. The ability to test policies, predict maintenance needs, or optimize energy-economic trade-offs within the metaverse extends the role of the Governance component beyond data analytics, transforming it into a dynamic management tool. Therefore, the use of SVM for database validation ensures methodological rigor and computational robustness, confirming that the dataset is not only statistically coherent but also operationally meaningful. This validation establishes a scientific foundation for developing a prototype capable of merging machine learning, digital twin technologies, and ESG-based governance metrics into a unified management model for smart, efficient, and sustainable buildings.
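
Two ingredients that give support vector regression its nonlinear behaviour can be sketched briefly. The text does not specify which kernel or hyperparameters were used, so the radial basis function (RBF) kernel and the values of `gamma` and `epsilon` below are assumptions chosen for illustration only.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """RBF similarity between two feature vectors: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss used in support vector regression: errors inside the epsilon tube cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical, already-scaled governance feature vectors for two buildings.
building_a = [0.2, 0.7, 0.1]
building_b = [0.3, 0.5, 0.4]
print(rbf_kernel(building_a, building_b))
print(epsilon_insensitive_loss(1.0, 1.05), epsilon_insensitive_loss(1.0, 1.5))
```

The kernel compares buildings by distance in feature space rather than by a weighted sum of indicators, which is what lets the fitted function bend, and the tolerance band around each target keeps small deviations from influencing the fit.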

Table xyz. Normalized Performance Metrics of Machine Learning Models for ESG Dataset Validation — Governance (G) Component.

| Metric | Boosting | Decision Tree | KNN | Linear Regression | Random Forest | Regularized Linear | SVM |
|---|---|---|---|---|---|---|---|
| MSE | 0.000 | 0.586 | 0.952 | 1.000 | 0.573 | 0.694 | 0.436 |
| RMSE | 0.000 | 0.742 | 0.965 | 1.000 | 0.733 | 0.812 | 0.570 |
| MAE | 0.000 | 0.820 | 0.908 | 1.000 | 0.380 | 0.129 | 0.000 |
| MAPE | 0.164 | 1.000 | 0.682 | 0.620 | 0.783 | 0.968 | 0.000 |
| R² | 0.940 | 0.433 | 0.928 | 0.560 | 0.980 | 0.793 | 0.000 |

Note. The table reports normalized performance metrics for several machine learning models applied to the Governance dimension of the ESG dataset. Among all tested algorithms, the Support Vector Machine (SVM) achieved the best overall balance, minimizing errors (MSE, RMSE, MAE, MAPE) while maintaining high explanatory power (R²), confirming its robustness for dataset validation in smart building governance modeling.

 

The results obtained from the validation of the Governance (G) component of the ESG model using machine learning provide a consistent and technically coherent confirmation of the dataset’s reliability for the prototyping of a smart building management system. In this validation phase, the analysis focuses on the feature importance metrics and the additive explanations derived from the Random Forest regression model, which was used to estimate the AREA variable based on a set of governance-related indicators including CES (Cost of Energy Saved), EROI (Energy Return on Investment), EPBT (Energy Payback Time), CPD (Construction Cost), CCF (Capital Cost Factor), SPC (System Performance Coefficient), REU (Renewable Energy Utilization), and EPWH (Energy Productivity per Worker Hour). The Mean Dropout Loss, which remains consistent across all variables at approximately 5.14 to 5.16, suggests that each feature contributes similarly to the model’s predictive accuracy. This uniformity implies that the dataset is well-structured, without any variable disproportionately influencing the model. The stability in dropout loss is also consistent with the absence of overfitting, supporting the model’s ability to generalize to unseen data. From a methodological standpoint, this homogeneity validates the internal coherence of the Governance dataset and indicates that each metric contributes to explaining different aspects of building efficiency and management performance.

 

 

Table xyz. Feature Importance Metrics Based on Mean Dropout Loss — Governance (G) Component

| Variables | EPWH | CPD | CCF | SPC | REU | CES | EROI | EPBT |
|---|---|---|---|---|---|---|---|---|
| Mean Dropout Loss | 5.155 | 5.154 | 5.151 | 5.149 | 5.148 | 5.144 | 5.144 | 5.142 |

Note. The table presents the Mean Dropout Loss values for each governance-related variable in the ESG dataset. The results show minimal variation among indicators (≈5.14–5.16), confirming a balanced contribution of all features to model accuracy and validating the internal consistency of the dataset used for smart building governance modeling.
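
The "minimal variation" described in the note can be quantified directly from the eight values reported in the table. The short calculation below uses only those published figures; the variable names are illustrative.

```python
from statistics import mean

# Mean Dropout Loss per governance variable, copied from the table.
dropout_loss = {
    "EPWH": 5.155, "CPD": 5.154, "CCF": 5.151, "SPC": 5.149,
    "REU": 5.148, "CES": 5.144, "EROI": 5.144, "EPBT": 5.142,
}

values = list(dropout_loss.values())
spread = max(values) - min(values)       # absolute range across variables
relative_spread = spread / mean(values)  # range as a fraction of the mean

print(f"range = {spread:.3f}, relative range = {relative_spread:.2%}")
```

The range is about 0.013, roughly a quarter of one percent of the mean value, which is the magnitude behind the statement that the variables contribute almost equally.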

The additive explanations of the predictions for the test set provide further insights into how each variable influences the estimated AREA values. The predictions show small but meaningful variations around the base value of 9.309, with feature contributions generally close to zero. These subtle shifts indicate that the model captures complex interactions among governance variables without introducing excessive noise. For instance, the CPD and CCF indicators show minor but systematic effects, reflecting the role of cost-related parameters in determining building scale and resource allocation. Similarly, the contributions from REU and EPWH confirm the connection between renewable energy utilization, labor productivity, and overall building governance efficiency. From a broader perspective, these results substantiate the model’s capacity to interpret governance-related dynamics within the ESG framework. The balanced feature importance distribution demonstrates that the variables are not redundant but complementary, collectively enhancing predictive accuracy and interpretative value. In the context of smart building management, this outcome is particularly relevant because it supports the integration of governance indicators into a decision-support system capable of optimizing energy efficiency, financial sustainability, and operational planning. Therefore, the validation confirms that the database is statistically consistent and suitable for the development of an intelligent management prototype leveraging digital twin and metaverse technologies. The capacity to model economic and performance interdependencies with precision establishes a strong foundation for advanced predictive control, simulation-based policy testing, and strategic governance of smart buildings. This ensures that the system’s management model is both scientifically validated and operationally viable in a real-world digital twin environment.

Table xyz. Additive Prediction Explanations for the Governance (G) Component — Feature-Level Contributions

 

| Case | Predicted | Base | CES | EROI | EPBT | CPD | CCF | SPC | REU | EPWH |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 9.309 | 9.309 | -0.020 | 0.017 | -8.680×10^-4 | -0.093 | -0.007 | 0.091 | -0.049 | -0.092 |
| 2 | 9.309 | 9.309 | 0.022 | -0.024 | 0.027 | 0.299 | 0.034 | -0.120 | 0.116 | -0.149 |
| 3 | 9.309 | 9.309 | -0.020 | -0.007 | -6.765×10^-4 | 0.010 | -0.074 | -0.007 | 0.086 | -0.155 |
| 4 | 9.309 | 9.309 | 0.003 | 0.031 | -9.158×10^-4 | 0.323 | -0.045 | -0.037 | -0.128 | -0.155 |
| 5 | 9.309 | 9.309 | 0.012 | -0.013 | -6.446×10^-4 | -0.265 | 0.108 | -0.136 | -0.053 | 0.17 |

Note: The table shows the additive decomposition of predicted values for five test cases within the Governance (G) component. Each variable’s contribution is expressed as a deviation from the base prediction (9.309), illustrating how governance indicators such as CPD, CCF, and REU subtly influence model output. The small variations confirm the stability and coherence of the dataset and the balanced behavior of the machine learning model.
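One way to read the feature-level contributions is to rank the governance variables by their mean absolute contribution across the five test cases. The sketch below does this with the values copied from the table; the function name is hypothetical, and the ranking is only a reading aid, not the importance measure used by the authors.

```python
# Rank features by mean absolute additive contribution over the five test cases.
# Values are copied from the table above; names of helpers are illustrative.

FEATURES = ["CES", "EROI", "EPBT", "CPD", "CCF", "SPC", "REU", "EPWH"]

CONTRIBUTIONS = [
    [-0.020, 0.017, -8.680e-4, -0.093, -0.007, 0.091, -0.049, -0.092],
    [0.022, -0.024, 0.027, 0.299, 0.034, -0.120, 0.116, -0.149],
    [-0.020, -0.007, -6.765e-4, 0.010, -0.074, -0.007, 0.086, -0.155],
    [0.003, 0.031, -9.158e-4, 0.323, -0.045, -0.037, -0.128, -0.155],
    [0.012, -0.013, -6.446e-4, -0.265, 0.108, -0.136, -0.053, 0.17],
]


def mean_abs_contribution(rows, names):
    """Return (feature, mean |contribution|) pairs, largest first."""
    means = {
        name: sum(abs(row[j]) for row in rows) / len(rows)
        for j, name in enumerate(names)
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


ranking = mean_abs_contribution(CONTRIBUTIONS, FEATURES)
for name, value in ranking:
    print(f"{name:5s} {value:.4f}")
```

Sorting by absolute value rather than signed value keeps positive and negative shifts from cancelling each other out across cases.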

 

 

 

Q7. The discussion of the system points to numerous benefits (energy efficiency, reduced MTTR, increased FTFR, improved comfort), but the “Limitations” section itself admits the lack of data from actual implementations and excessive reliance on simulations, thus the conclusions are stronger than the empirical material.

A7. The reviewer correctly points out that the conclusions rest on simulated data and that empirical data from real-world implementations are not yet available. The simulations were initially used to test interoperability, data integrity, and algorithmic capabilities within the Digital Twin-Metaverse framework. The current version of the work incorporates an expanded, scientifically validated set of environmental, social, and governance KPIs, including Load Cover Factor, Energy Return on Investment, Cost of Energy Saved, Air Changes per Hour, and Thermal Insulation. These indicators provide an initial foundation for a standardized, quantifiable validation structure. Alongside the definition of these new KPIs, the data set proposed for the ESG database was tested and validated extensively to assess its scientific soundness. The tests included correlations and regressions to identify statistically significant factors, principal component analysis to verify the multidimensional nature of the data set and its overall internal coherence, and machine learning models, including Random Forest and Support Vector Machine. Together, these analyses establish the internal coherence and soundness of the data. Empirical data from real-world smart building applications will be addressed in the next set of tests and analyses; the validation already carried out provides a sound basis for that empirical testing of Digital Twin-Metaverse environments.

 

Q8. The current version does not meet the criteria for a research article.

A8. The article has been placed correctly in the expected template.

Q9. It is a solid concept paper (white paper) with a KPI table and dashboard suggestions, but without data, analysis, or validation.

A9. We appreciate the reviewer's observation. Although the original paper offered only a conceptual framework, descriptive KPI tables, and a dashboard design, the revised paper includes a full set of quantitative tests that validate the scientific soundness of the proposed model. The paper has moved from a white paper on theoretical concepts to an analysis and validation paper on the proposed model for environmental, social, and governance data. Starting from the creation of the data set, the model has been validated through a systematic analysis comprising correlation analysis, regression analysis, Principal Component Analysis (PCA), and machine-learning regression tests. These analyses were performed for the environmental factors and for the social and governance factors as well. Correlation and regression analysis confirmed the internal coherence of the data set and the logical links among the KPIs. Principal Component Analysis showed that the data set is not redundant from a multidimensional perspective. Finally, machine-learning algorithms such as Random Forests and Support Vector Machines were used to assess the model's robustness from a predictive perspective. The paper therefore now provides not only a conceptual model but also a scientifically sound, quantitatively validated system that helps bridge concept and implementation. In particular, the paper provides a comprehensive qualitative framework for implementing digital twins in smart building management.

 

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

“The integration of immersive technologies emerges as the new disruptive paradigm for intelligent building management. This research situates itself within the context of ecological intelligence and system harmony, considering how the metaverse, digital twins, and predictive analytics can enhance relationships between people, systems, and environments.” The above content is the conclusion of this study. I believe this research is very meaningful. Although the practicality and actual application data of this paper still require a period of further study and discussion, this paper has good originality and innovation. The main suggestions concern the logic and formatting of the paper, as detailed below:

1. Figures 3 to 7 are unclear; please reformat them for better display of the relevant indicators.

2. The top-level sections in this paper are too many (there are 13). Consider merging sections 3 to 5 and sections 6 to 8. Additionally, in Part 3 on page 5, set the relevant paragraphs as sub-sections 3.1 to 3.5 to highlight the key points. Section 13 should be removed, and the way top-level sections are presented should be adjusted. In the text, for '11. Limitations', remove the sub-sections and use numbering (1) to (9) for better clarity.

3. In the fourth paragraph on page 3, you can divide it into smaller paragraphs according to the perspectives of different scholars, to better explain the logical relationships and differences.

4. Table 1 on page 8 can be reformatted and organized to make it more compact.

5. The three steps in the eighth part on page 12 can be presented as three separate paragraphs to clarify the logical relationships.

Author Response

Point to Point Answers to Reviewer 2

Q1. “The integration of immersive technologies emerges as the new disruptive paradigm for intelligent building management. This research situates itself within the context of ecological intelligence and system harmony, considering how the metaverse, digital twins, and predictive analytics can enhance relationships between people, systems, and environments.” The above content is the conclusion of this study. I believe this research is very meaningful. Although the practicality and actual application data of this paper still require a period of further study and discussion, this paper has good originality and innovation.

A1. We thank the reviewer for recognizing that the research presents a meaningful and innovative contribution to the field of intelligent building management. To strengthen the practical dimension of the study, we have conducted a data validation activity employing a combination of analytical and computational techniques, including correlation analysis, Principal Component Analysis (PCA), Ordinary Least Squares (OLS) regression systems, and machine learning models. These methods were used to verify the reliability, consistency, and predictive capacity of the proposed framework, thereby enhancing the robustness of the research outcomes. This process not only supports the theoretical assumptions of the study but also provides a solid foundation for future applications and empirical developments within the context of ecological intelligence and system harmony. To propose a model based on validated data for prototyping purposes, we have therefore completely rewritten the methodological section and introduced paragraphs describing the validation techniques, which form an integrated system of methodologies comprising correlations, PCA, regression, and machine learning. The changes that have been made are outlined below.

 

3. Development of the Environmental Dataset for Evaluating Smart Infrastructure Performance through Digital Twin Integration

 

 

The creation of the environmental data set is a necessary step in designing an intelligent digital model, based on digital twins, for smart infrastructures. The environmental KPI set therefore forms the foundation of the proposed research. A review of the literature on digital twins applied to the environmental performance of smart infrastructures identified the KPIs that such a model requires. The ultimate goal is an integrated system that enables the environmental performance of infrastructures to be simulated. The KPI set supports an operational analysis of the relationship among energy efficiency, sustainability, and environmental performance. Carbon footprint and emission intensity provide extended information on environmental sustainability, while load cover factor and on-site energy ratio provide evidence on system autonomy and energy efficiency. The resulting data set permits an integrated analysis of environmental performance that corresponds to the principles of the environmental pillar of the ESG framework, allows the environmental performance of any number of infrastructures to be compared, and provides a knowledge base on their environmental sustainability. It also supports a continuous approach to design decisions and can be extended to inform the environmental design of the system. Ultimately, the digital twin represents an intelligent approach toward the environmental sustainability of infrastructures.

 

 

3.1 Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Evaluation of Smart Infrastructure

 

The chosen environmental Key Performance Indicators (KPIs) provide a comprehensive framework for evaluating the environmental performance of smart infrastructures. They support the aims and scope of the proposed digital twin platform, which is intended to analyze, model, and optimize the environmental, energy, and operational characteristics of buildings and urban infrastructures in an immersive, data-driven setting ([1]; Fokaides, Jurelionis, & Spudys, 2022). Each KPI adds a distinct perspective on energy, resilience, and efficiency, and together they form a holistic model for managing and interpreting the sustainability of urban infrastructures [1]. The Carbon Footprint (CFPT) provides a fundamental measure of sustainability by quantifying the total greenhouse gas emissions generated by system activity. It expresses complex system operations through a comparable metric in CO₂-eq that covers both direct and indirect emissions (Zahedi, Alavi, Sardroud, & Dang, 2024). In digital twin analysis the CFPT is essential because it enables real-time analysis and projection of environmental implications across different operating scenarios (Li, 2025). It effectively serves as the central measure connecting energy performance to global climate goals (Yu, Ye, Xia, & Chen, 2024). Emission Intensity (EMIN) adds a further dimension by normalizing emissions by the energy consumed or produced. This ratio allows systems of different scale to be compared, making it highly valuable for multi-building and city-scale analysis (Alibrandi, 2022). The Load Cover Factor (LCF) and Supply Cover Factor (SCF) assess the relationship between energy demand and supply, an important consideration for energy and resource sufficiency.
The LCF evaluates how much of the energy demand over a given period is covered by local energy production, assessing system sufficiency, while the SCF assesses how much of the local energy production over that period is actually used to meet the load, assessing resource use (Chávez et al., 2022). The Load Matching Index (LMI) evaluates the synchrony between local energy production and energy demand. Large LMI values indicate that local production and storage are well matched to local loads, providing a basis for the efficiency and resilience of smart grids (Klar & Angelakis, 2023). The On-Site Energy Ratio (OER) captures the extent to which local energy consumption is supported by on-site renewable energy, a crucial factor in assessing zero-energy buildings (Prandi et al., 2022). The Grid Interaction Index (GII) and No-Grid Interaction Probability (NGI) further establish the context for grid autonomy. The GII captures the intensity and direction of energy exchanges with the grid, while the NGI estimates the probability of autonomous operation (Fokaides et al., 2022). Capacity Factor (CAF) and One Percent Peak Power (OPP) describe system performance at varying loads. The Capacity Factor estimates how fully the system uses its installed energy resources, a crucial index for judging return on investment, while the One Percent Peak Power focuses on peak loads and their intensity, thereby estimating system stress [2]. Building on the concept of behavior-based system performance, the Demand Response Percentage (DRS) estimates the system's flexibility in adapting to varying loads, particularly in smart pricing scenarios [3].
The system's overall flexibility in adapting to external stimuli such as market prices or renewable resource availability, and thus the transition from static energy management to adaptive operation, is captured by three indicators: the Flexibility Factor (FLF), the Flexibility Index (FLI), and Flexible Energy Efficiency (FEE) (Chávez et al., 2022; Li, 2025). This framework not only satisfies the requirements of sustainability analysis but also supports decision-making, scenario analysis, and future system optimization ([4]; Zahedi et al., 2024). It therefore aligns well with the requirements of an intelligent, fully interoperable, and environmentally sustainable smart urban ecosystem, supported by measurable performance indicators (Prati, Pelucchi, Dal Fiore, Fuzzati, & Agostini, 2023).
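As an illustration of the first two indicators, the sketch below computes a carbon footprint as the sum of activity quantities multiplied by emission factors, and an emission intensity as emitted CO₂ divided by energy. The function names, the activity list, and every numeric value are hypothetical and chosen only to make the arithmetic visible.

```python
# Minimal sketch of CFPT and EMIN; all inputs below are illustrative assumptions.


def carbon_footprint(activities):
    """Sum quantity x emission factor over (quantity, factor) pairs, in tCO2e."""
    return sum(quantity * factor for quantity, factor in activities)


def emission_intensity(total_co2_t, total_energy_kwh):
    """CO2 emitted per unit of energy over the same period, in tCO2/kWh."""
    if total_energy_kwh <= 0:
        raise ValueError("total energy must be positive")
    return total_co2_t / total_energy_kwh


# Hypothetical activity data: (quantity, emission factor in tCO2e per unit).
activities = [
    (12_000.0, 0.00025),  # grid electricity, kWh
    (500.0, 0.0027),      # fuel, liters
]

cfpt = carbon_footprint(activities)
emin = emission_intensity(total_co2_t=3.0, total_energy_kwh=12_000.0)
print(f"CFPT = {cfpt:.2f} tCO2e, EMIN = {emin:.6f} tCO2/kWh")
```

Because EMIN is normalized by energy, two buildings of very different size can be compared directly, which is the property the text relies on for multi-building analysis.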

 

Table xyz. Environmental Key Performance Indicators (KPIs) and Their Computational Formulations

KPI

ACRONYM

Description

Formula

Carbon Footprint

CFPT

Indicates the total amount of greenhouse gas (GHG) emissions caused by an individual, organization, or product, either directly or indirectly. The formula calculates the sum of emissions associated with different activities by multiplying the quantity of each activity by its corresponding emission factor [5].

CFPT = Σᵢ (Aᵢ × EFᵢ)

Aᵢ = Quantity of a specific activity that generates greenhouse gas emissions (e.g., km, kWh, liters).

EFᵢ = Rate of GHG emissions per unit of activity, expressed in CO₂ equivalent per unit (e.g., tCO₂e/kWh for electricity, tCO₂e/liter for fuel, etc.).

Emission Intensity

EMIN

Evaluates the environmental impact of an energy system by measuring the amount of carbon dioxide (CO₂) emitted per unit of energy consumed or produced. A low EI value indicates that the system is more environmentally efficient, emitting less CO₂ for each unit of energy consumed or produced (this can occur through the use of renewable energy sources). Conversely, high EI values typically occur in systems that rely heavily on fossil fuels [6].

EI = CO₂_tot / E_tot

CO₂_tot = Total amount of CO₂ emitted over a given period, resulting from the consumption of fossil fuels or the use of grid electricity [tCO₂]

E_tot = Total amount of energy consumed or produced during the same reference period [kWh]

Load Cover Factor

LCF

Represents the ratio between the energy actually supplied by a generation source and the energy demanded or consumed over a given time interval. When LCF = 1, generation is sufficient and the entire load demand is fully satisfied. When LCF < 1, the load is not completely met during part of the period, due to limitations in generation or available resources. Range: 0 ≤ LCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Supply Cover Factor

SCF

Indicates the ability of an organization to meet its energy demand through its own on-site supply resources. When SCF = 1, the amount of useful supplied resources is exactly equal to the total available amount. This implies that there are no significant losses and that all available resources are fully utilized. When SCF < 1, the amount of effectively usable resources is lower than the total available amount. Part of the generated energy is not used to meet the load, likely due to overproduction, losses, or storage capacity limitations. Range: 0 ≤ SCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Load Matching Index

LMI

Measures the efficiency with which on-site energy generation (whether renewable or not) matches the energy load (demand) of a system.

It evaluates how well the energy production profile corresponds to the load profile over time by analyzing the synchrony between supply and demand.

A higher index indicates a better match between generation and load.

When LMI = 1, the load is fully met (i.e., generation and storage are sufficient to cover the required demand) in every considered interval.

When LMI < 1, the load is not fully met at certain times, meaning that the generation and/or storage capacity was lower than the demand.

Range: 0 % ≤ f_(load,i) ≤ 100 % [8].

 

i = Time intervals [hourly, daily, monthly]

 = On-site energy generation at a given time t [kWh]

 = Storage energy balance at a given time t [kWh]

 = Energy losses at a given time t (sum of generation energy losses, storage energy losses, building technical system losses (excluding storage), and load-related energy losses such as distribution losses) [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Number of samples within the evaluation period, from τ₁ to τ₂. When hourly data are used and the evaluation period covers a full year, the number of samples is 8760.

 

On-site Energy Ratio

OER

Determines the amount of energy produced on-site (e.g., from renewable sources such as solar panels or wind turbines) relative to the total energy consumption over a given period of time.

If OER = 1, the on-site generated energy equals the total energy consumption.

If OER < 1, the on-site produced energy is lower than total consumption, meaning that the system depends on external energy sources to meet the demand.

If OER > 1, the on-site generated energy exceeds total consumption, indicating that energy production is greater than demand (and surplus energy may be exported to the grid).

Range: OER ≥ 0 [9].

 

 

 = On-site energy generation at a given time t [kWh]

 = Total energy consumption (energy load) at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 

 

 

 

Grid Interaction Index

GII

Measures the level of interaction and integration of a facility with the power grid, describing its average stress.

If GII = 100%, the energy exchanged with the grid during interval i equals the maximum possible exchange.

If GII = 0%, no energy exchange with the grid occurred at that moment.

If GII < 0%, energy was injected into the grid rather than drawn from it [7], [8].

 

 = Net energy exchanged with the power grid during interval i (can be positive or negative depending on whether energy is being drawn from or injected into the grid) [kWh]

 = Maximum absolute value of the net energy flow with the grid, taken over all considered time intervals [kWh]

i = Time intervals [hourly, daily, monthly]

No grid interaction probability

NGI

Measures the probability that a building or facility operates autonomously from the power grid, and therefore the likelihood of no interaction with it.

It also indicates the extent to which the load is covered by stored energy or renewable energy use.

If NGI = 0, there was no moment during the considered time interval when the net energy was zero or negative.

If NGI = 1, the net energy was zero or negative for the entire considered period.

Range: 0 ≤ NGI ≤ 1 [7], [8].

 

 = Probability that the net energy  is zero or negative during the time interval ||

 = Normalized variable for the net exported energy at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

Capacity Factor

 

CAF

Defines the ratio between the actual energy production of a system (energy exchanged between the building and the grid) and the maximum production that could be achieved if the system operated at full capacity over a given period of time.

If CAF = 1, the system operated at its maximum capacity for the entire considered period.

If CAF = 0, the system did not produce any energy.

Range: 0 ≤ CAF ≤ 1 [8].

 

 = Normalized variable for the net exported energy at a given time t [kWh]

 = Maximum producible energy at full capacity (system capacity) [kWh]

 = = Evaluation period [s]

One Percent Peak Power

OPP

Quantifies the maximum power that an energy system can reach by calculating the energy production corresponding to the top 1% of peak periods.

A high OPP value indicates that the building or system experiences moments (the top 1% of the time) with very high energy consumption. This may point to significant peak loads that place stress on the electrical grid.

If OPP is low, the building’s energy demand is more evenly distributed over time, with fewer or smaller peaks [10].

 

 = Energy associated with the top 1% of a given value, calculated during periods of maximum demand or generation [kWh]

 = Time period over which the energy is measured [h]

Demand Response Percentage

 

DRS

Refers to the percentage variation of the Demand Response relative to a baseline value.

If DRS > 0, the Demand Response was successful in reducing power compared to the baseline level (load “reduction” capability).

If DRS = 0, no variation occurred.

If DRS < 0, it indicates an increase in power during the Demand Response implementation, which is generally undesirable (load “overload” condition) [11].

 

 = Baseline hourly power, i.e., the expected or normal power level without any Demand Response measures [kWh]

 = Hourly power under Load Shifting conditions, i.e., the power recorded during the Demand Response event [kWh]

Flexibility Factor

FLF

Measures the ability of an energy system to adapt to variations in energy demand and resource availability, and to shift energy use from high-price periods to lower-price periods. It applies a daily quartile-based price classification, dividing prices into three categories: low, medium, and high.

A high price is defined as one above the third quartile (price > 75% of all prices during a day).

A low price corresponds to a value within the first quartile (price ≤ 25%).

If FLF = 0, consumption is balanced between low- and high-price periods.

If FLF = 1, consumption occurs only during low-price periods.

If FLF < 0, most consumption occurs during high-price periods.

Range: −1 ≤ FLF ≤ 1 [12].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Low-price periods (first quartile, i.e., the lowest 25% of prices)

 = High-price periods (above the third quartile, i.e., the highest 25% of prices)

 = Number of considered time intervals

 

Flexibility Index

FLI

Calculates the difference between the energy cost under a flexibility-controlled scenario and the energy cost under a reference scenario. The Flexibility Index is used to measure the effectiveness of flexibility strategies in reducing costs compared to a baseline case.

If FLI < 0, the flexibility-controlled case has a higher energy cost than the reference case, meaning an undesirable cost increase.

If FLI = 0, the total energy cost under flexible conditions is identical to that of the reference case, indicating that flexibility yields no savings.

If FLI = 1, the total cost in the flexibility-controlled case is zero relative to the reference case; this represents an ideal but unrealistic situation.

If FLI is positive and close to 1, it means that energy has been effectively shifted or managed, reducing costs compared to the reference scenario.

Range: −∞ < FLI ≤ 1 [13].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Total electricity cost in a flexibility-controlled scenario

 = Total electricity cost in a reference scenario without flexibility control

 = Number of considered time intervals

Flexible Energy Efficiency

FEE

Measures how effectively a system utilizes flexible energy compared to its reference energy consumption. It refers to the system’s ability to manage energy use during Demand Response (DR) events, considering the “rebound effect” (i.e., when energy consumption increases after a reduction event to restore normal operating conditions). A higher FEE value indicates greater flexibility efficiency, meaning the system can better optimize energy use during flexible periods. Range: 0 % ≤ FEE ≤ 100 % [14].

 

 = Flexible energy, i.e., the energy used during periods when the system operates in flexible mode (for example, by optimizing consumption based on renewable resource availability or variable pricing) [kWh]

 = Reference or baseline energy, i.e., the energy consumed under normal or non-flexible operating conditions [kWh]

Note. This table presents the Environmental Key Performance Indicators (KPIs) used to evaluate the environmental, energy, and operational performance of smart infrastructures within a digital twin framework. Each KPI is defined with its acronym, description, and mathematical formulation for standardized and comparative analysis.
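Several of the indicators in the table reduce to short sums over sampled time series. The sketch below computes the Load Cover Factor, Supply Cover Factor, On-site Energy Ratio, and Flexibility Factor from the quantities the table defines (generation, load, storage balance, losses, consumption, and price). The exact expressions are assumptions consistent with those definitions and with the stated ranges, not a transcription of the table's formulas; the sign convention for the storage balance, the inclusive quartile method, and all sample values are likewise assumed.

```python
from statistics import quantiles


def cover_factors(generation, load, storage_balance=None, losses=None):
    """Return (LCF, SCF) for equally spaced samples in kWh."""
    n = len(load)
    storage_balance = storage_balance or [0.0] * n
    losses = losses or [0.0] * n
    # Net on-site supply; assumed convention: positive balance = energy stored.
    net_supply = [g - s - z for g, s, z in zip(generation, storage_balance, losses)]
    # Energy that is both available on site and demanded in the same sample.
    matched = sum(min(ns, l) for ns, l in zip(net_supply, load))
    total_load, total_supply = sum(load), sum(net_supply)
    if total_load <= 0 or total_supply <= 0:
        raise ValueError("load and net supply must be positive over the period")
    return matched / total_load, matched / total_supply


def on_site_energy_ratio(generation, load):
    """Total on-site generation divided by total consumption."""
    return sum(generation) / sum(load)


def flexibility_factor(consumption, prices):
    """(low-price energy - high-price energy) / (their sum), in [-1, 1]."""
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    low = sum(c for c, p in zip(consumption, prices) if p <= q1)
    high = sum(c for c, p in zip(consumption, prices) if p > q3)
    if low + high == 0:
        raise ValueError("no consumption in low- or high-price periods")
    return (low - high) / (low + high)


# Hypothetical four-sample example (kWh and arbitrary price units).
generation = [0.0, 2.0, 4.0, 1.0]
load = [1.0, 1.0, 2.0, 2.0]
lcf, scf = cover_factors(generation, load)
oer = on_site_energy_ratio(generation, load)
flf = flexibility_factor([3.0, 1.0, 1.0, 1.0], [10.0, 20.0, 30.0, 40.0])
print(f"LCF={lcf:.3f} SCF={scf:.3f} OER={oer:.3f} FLF={flf:.2f}")
```

In this example the OER exceeds 1 while the LCF stays below 1: total generation is larger than total load, yet part of it arrives when it is not needed, which is the mismatch the cover factors are designed to expose.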

 

 

3.2 Social and Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Assessment of Smart Urban and Industrial Infrastructures

 

The set of Key Performance Indicators (KPIs) introduced here underpins the proposed digital twin and metaverse software platform, since it enables the assessment and optimization of factors related to smart urban and industrial infrastructure (Dovolil & Svítek, 2024; Barykin et al., 2023). The KPIs translate complex environmental phenomena into measurable values, enabling real-time processing, simulation, and optimization (Englezos et al., 2022; Hadjidemetriou et al., 2023). This integration is consistent with the ESG (Environmental, Social, and Governance) evaluation framework, targeting in particular its Environmental and Social dimensions (Shaharuddin et al., 2022). By focusing on KPIs that assess indoor environmental quality, energy efficiency, and user comfort, the platform supports evidence-based optimization of sustainable design, preventive maintenance, and energy-efficient building operations (Yitmen et al., 2025). Humidity (HUM) is an important KPI for assessing indoor environmental quality. It measures the water content of the air as a percentage of the maximum the air can hold at a given temperature (relative humidity, RH). Keeping humidity within its optimal range (40% to 60%) is critical for health and comfort: low humidity can cause irritation and electrostatic charges, whereas excess humidity promotes mold growth and material degradation. Within the digital twin, RH measurements allow automatic regulation of Heating, Ventilation, and Air Conditioning (HVAC) operation and, through forecast models, optimization of ventilation (Lo, 2025).
The result is thermal and hygrometric comfort achieved with lower energy use, which links HUM directly to both social well-being and environmental performance. Particulate Matter (PM10 and PM2.5) is another important environmental parameter. This KPI measures the airborne concentration of particles that can cause serious health problems, particularly in densely populated and industrialized regions; continuous exposure is associated with heart and lung diseases. Measuring PM in buildings makes it possible to assess the effectiveness of ventilation systems and to identify pollution sources. Integrating PM values into the proposed system supports the ESG “Social” perspective by protecting the health of occupants, and the “Environmental” perspective by promoting cleaner, more efficient air circulation (Saleh et al., 2025; Ariansyah et al., 2023). Volatile Organic Compound (VOC) concentration measures indoor air pollution from harmful gases such as benzene, formaldehyde, and toluene, which are emitted by construction materials, cleaning agents, and interior decor. VOCs can significantly affect indoor air quality, comfort, and health, and it is recommended that concentrations not exceed 300 ppb. Measuring VOC concentration in the digital twin enables real-time responses: facility managers can trace the source, adjust ventilation rates, or switch to low-emitting materials (Yitmen et al., 2025; Venkateswarlu & Sathiyamoorthy, 2025).
This reasonable preventive strategy will enhance indoor environmental quality and enable ESG factors to achieve “Social Sustainability,” resist factors that threaten health, and lead to occupant contentment. The rate of “Air Changes per Hour (ACH) Quantitative Indicator,” expressed by “ACC,” measures the rate at which total air replacement can occur inside an indoor space. An average rate range of 3 to 5 ACC will ensure adequate ventilation for residential and office buildings. The continuous measurement, adjustment, and calculation procedure for ACC using digital twin technology will enable facility managers to dynamically adjust ventilation rates, ensuring safe, healthy air and energy conservation by optimizing ventilation rates (Hadjidemetriou et al., 2023). The ACC Key Performance Indicator has both social and environmentally friendly impacts for ESG achievement. Regarding ACC, it offers “Social Benefits,” ensuring healthy ventilation for human well-being, and “Environmental Benefits,” conserving energy by systematically adjusting ventilation rates to improve energy performance (Hadjidemetriou et al., 2023). The “Thermal insulation rate (R-value) Quantitative Indicator,” also expressed as “R-value,” essentially estimates the “Thermal Resistance Capacity (TRC)” of construction materials to heat, thereby indicating how little heat will conduct through them, thereby ensuring greater energy conservation, as discussed previously. Increased insulation reduces heating and cooling loads, aligning with the ESG environmental aspect by reducing emissions from energy use and the social aspect by ensuring a comfortable temperature level without increasing costs (Englezos et al., 2022). The Sound Insulation Index (SND) rates sound insulation properties for construction structures, such as walls, windows, and floors. Noise pollution is gradually recognized for its impacts on both mental and physical health. 
Measuring sound insulation levels inside buildings helps stakeholders rate acoustic comfort, particularly in densely populated urban areas. This KPI improves social sustainability by fostering environments suited to concentration, rest, and quality of life (Lo, 2025). The energy-use KPIs, namely the Energy Efficiency Ratio (EER), the Coefficient of Performance (COP), and System Efficiency (SEF), rate how well energy inputs are converted into energy services. They are particularly important for rating the performance of chillers and heaters in cooling and heating, respectively. Higher ratios indicate more useful output for every unit of power consumed; tracking them improves the digital twin's capabilities for correcting inefficiencies, predicting system degradation, and scheduling preventive maintenance (Venkateswarlu & Sathiyanmuthu, 2025). This supports the “Environmental” and “Economic” ESG spheres and, through affordability, strengthens the “Social” sphere. Energy Use Intensity (EUI) and Lighting Power Density (LPD) both rate lighting energy use: EUI relates lighting energy consumption to the expected number of occupants, giving insight into per capita energy use, while LPD relates lighting power to the unit floor area. Together they capture the relationship between space, users, and energy use. 
Feeding such data into digital twins enables various analyses, including simulations of different occupancy scenarios, optimization of lighting schedules, and adoption of intelligent lighting systems that adjust dynamically to user behavior (Yitmen et al., 2025). Such enhancements lower energy losses and operational costs, aligning with the ESG framework from both environmental and social perspectives, given their benefits for well-being and resource distribution. Overall, integrating these KPIs into a digital twin and metaverse system constitutes a comprehensive framework for measurement, simulation, and improvement in support of sustainability and energy goals across urban and industrial infrastructures. Each KPI contributes to improving environmental, energy, or human comfort factors. Continuous monitoring of these parameters allows a shift from a reactive governance model to a predictive one, in which interventions depend on real-time conditions rather than fixed rules, consistent with the ESG model's focus on innovation linked to sustainability and inclusion.

 

 

Table xyz. Social Key Performance Indicators (KPIs) for Indoor Environmental Quality and Energy Efficiency Assessment

KPI

Acronym

Description

Formula

UoM

Relative Humidity

HUM

Indicates the amount of water vapor in the air relative to the maximum that can be contained at the same temperature.

The optimal relative humidity (RH) range for occupant comfort and health is between 40% and 60% [15].

 

$HUM = \frac{p_v}{p_{sat}} \times 100$

$p_v$ = Water vapor pressure [Pa]

$p_{sat}$ = Saturation vapor pressure [Pa]

%

PM Concentration (Particulate Matter - PM10 and PM2.5)

PM10 and PM2.5

Measures the amount of suspended particles (particulate matter) in the air, typically expressed in micrograms per cubic meter (µg/m³).

PM2.5 refers to particles with a diameter smaller than 2.5 micrometers, while PM10 refers to particles smaller than 10 micrometers.

Recommended long-term health thresholds are PM2.5 < 20 µg/m³ and PM10 < 50 µg/m³ [16].

 

 

$PM = \frac{m}{V}$

$m$ = Mass of particulate matter [µg]

$V$ = Volume of air [m³]

µg/m³

Volatile Organic Compounds

VOC

Establishes the concentration of VOCs – such as benzene, formaldehyde, and other potentially harmful gases.

Elevated VOC levels can cause discomfort and health issues in occupants.

The indicated threshold is VOC < 300 ppb [17].

$VOC_{ppb} = \frac{C \cdot V_m}{M} \times 10^3$

$C$ = VOC concentration [mg/m³]

$M$ = Molar mass of the VOC [g/mol]

$V_m$ = Molar volume of an ideal gas, generally taken as 24.45 L/mol (at 25 °C and 1 atm)

ppb

Air Changes per Hour

ACH

Indicates the number of times the air within a space is completely renewed in one hour.

An air change rate between 3–5 ACH is considered adequate for residential buildings or office environments [18].

 

$ACH = \frac{Q}{V}$

$Q$ = Airflow rate [m³/h]

$V$ = Volume of the indoor space [m³]

1/h

Thermal Insulation Rate 

THR

Determines the thermal resistance of insulating materials, indicating how effectively they prevent heat loss.

A higher R-Value indicates better insulation performance [19].

 

$R = \frac{d}{\lambda}$

$d$ = Material thickness [m]

$\lambda$ = Thermal conductivity of the material [W/m·K]

m²·K/W

Sound Insulation Index

SND

Evaluates the effectiveness of a building element in reducing sound transmission between two different spaces.

It is defined as the difference between the incident sound pressure level on a surface and the transmitted sound pressure level through it.

A higher R value indicates that walls, floors, or windows are more effective at blocking sound [20].

 

$R = L_1 - L_2 + 10 \log_{10}\left(\frac{S}{A}\right)$

$L_1$ = Incident sound pressure level [dB]

$L_2$ = Transmitted sound pressure level [dB]

$A$ = Equivalent absorption area [m²]

$S$ = Separating surface area [m²]

dB

Energy Efficiency Ratio

EER

Measures the efficiency of an air conditioning system (air conditioners or cooling units). A higher EER indicates that the air conditioning system provides more cooling output for each unit of energy consumed, making it more efficient.

If EER ≥ 12, the system is considered efficient. [21].

 

$EER = \frac{Q_c}{P_{el}}$

$Q_c$ = Total cooling capacity provided by the system [kW]

$P_{el}$ = Electrical power input consumed by the system [kW]

-

Coefficient of Performance

COP

An indicator similar to the EER, it can be used to evaluate efficiency in both cooling and heating modes.

It is commonly applied to heat pumps. A higher COP indicates that the system can produce a greater amount of useful energy (heating or cooling) for each unit of electrical energy consumed.

If COP ≥ 3.5, the system is considered efficient. [22].

 

$COP = \frac{Q}{P_{el}}$

$Q$ = Heating or cooling capacity provided by the system [kW]

$P_{el}$ = Electrical input power consumed by the system [kW]

-

System Efficiency η

SEF

Measures how much of the energy used by the system is effectively converted into useful heating or cooling.

A high system efficiency means that a large portion of the consumed energy is actually transformed into useful thermal energy, minimizing losses.

If η ≥ 85%, the system is considered efficient. [23].

 

$\eta = \frac{E_{useful}}{E_{total}} \times 100$

$E_{useful}$ = Useful energy delivered (cooling or heating capacity) [kWh]

$E_{total}$ = Total energy consumed (including system losses and auxiliary consumption) [kWh]

-

Energy Use Intensity based on people count 

EUI

Measures the energy consumption for lighting relative to the number of occupants in the building, reflecting energy efficiency in terms of per capita usage.

A high EUI indicates higher energy consumption for lighting per person, suggesting a lack of optimization.

Optimal values: EUI < 15 kWh/person/year. [23].

 

 

$EUI = \frac{E_{light}}{N_{occ} \cdot t}$

$E_{light}$ = Energy consumed for lighting [kWh]

$N_{occ}$ = Number of occupants in the building

$t$ = Duration of lighting usage [year]

kWh/person/year

Lighting Power Density per floor area

LPD

Determines the power consumed by lighting per unit of floor area.

It serves as an indicator of lighting efficiency in relation to the utilized space.

A high LPD indicates greater power consumption per unit area, suggesting inefficient lighting design.

Optimal values: LPD < 10 W/m² [23].

 

 

$LPD = \frac{P_{light}}{A}$

$P_{light}$ = Power used for lighting [kW]

$A$ = Illuminated indoor area [m²]

kW/m²

Note. This table summarizes the Social and Environmental Key Performance Indicators (KPIs) used to assess indoor environmental quality, user comfort, and energy efficiency in smart infrastructures. Each KPI is defined by its acronym, description, and calculation formula, providing measurable parameters that support ESG-oriented evaluation and digital twin integration.
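
As a rough illustration of how the indoor-environment indicators in this table could be computed from sensor readings, the following Python sketch implements the variable definitions listed above. It is a minimal sketch under stated assumptions: all function and parameter names are hypothetical, the formulas are the standard textbook forms implied by the definitions rather than a reproduction of the original formula column, and the 24.45 L/mol molar volume assumes 25 °C and 1 atm.

```python
import math

# Assumption: molar volume of an ideal gas at 25 °C and 1 atm [L/mol].
MOLAR_VOLUME_L_PER_MOL = 24.45


def relative_humidity(vapor_pressure_pa: float, saturation_pressure_pa: float) -> float:
    """HUM [%]: water vapor pressure as a share of saturation vapor pressure."""
    return 100.0 * vapor_pressure_pa / saturation_pressure_pa


def pm_concentration(mass_ug: float, air_volume_m3: float) -> float:
    """PM [µg/m³]: particulate mass per unit volume of sampled air."""
    return mass_ug / air_volume_m3


def voc_ppb(concentration_mg_m3: float, molar_mass_g_mol: float) -> float:
    """VOC [ppb]: convert a mass concentration into a volume mixing ratio."""
    ppm = concentration_mg_m3 * MOLAR_VOLUME_L_PER_MOL / molar_mass_g_mol
    return ppm * 1000.0  # 1 ppm = 1000 ppb


def air_changes_per_hour(airflow_m3_h: float, room_volume_m3: float) -> float:
    """ACH [1/h]: airflow rate divided by the volume of the indoor space."""
    return airflow_m3_h / room_volume_m3


def thermal_resistance(thickness_m: float, conductivity_w_mk: float) -> float:
    """R-value [m²·K/W]: material thickness divided by thermal conductivity."""
    return thickness_m / conductivity_w_mk


def sound_reduction_index(
    incident_db: float,
    transmitted_db: float,
    separating_area_m2: float,
    absorption_area_m2: float,
) -> float:
    """SND [dB]: level difference corrected for separating and absorption areas."""
    correction = 10.0 * math.log10(separating_area_m2 / absorption_area_m2)
    return incident_db - transmitted_db + correction


if __name__ == "__main__":
    # Hypothetical readings for a single office room.
    print(relative_humidity(1170.0, 2340.0))
    print(air_changes_per_hour(600.0, 150.0))
    print(voc_ppb(0.1, 30.03))  # 30.03 g/mol is the molar mass of formaldehyde
```

Keeping each indicator as a small pure function makes it straightforward to compare the result against the thresholds quoted in the table (for example, 40% to 60% for HUM or 3 to 5 for ACH) inside a monitoring loop.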

 

3.3 Governance Key Performance Indicators (KPIs) for ESG Evaluation in Digital Twin and Metaverse Applications

 

The selected Key Performance Indicators (KPIs) provide an integrated framework for evaluating the ESG performance of smart infrastructure, specifically in digital twin and metaverse applications for managing urban and industrial environments. Each KPI links technological innovation to sustainability, enabling real-time analysis and optimization of energy use, expenditure, and social impacts. Taken together, the KPIs provide a holistic view of efficiency and equity, supporting infrastructure development that combines technological innovation, sound ecology, and social justice. Their relevance within the ESG framework is significant because they directly cover the environmental and economic perspectives and relate indirectly to Governance, largely through accountability and shared decision-making (Wu et al., 2022; Zhang, 2025). The Cost of Energy Saving (CES) is the single most important KPI under the ESG framework, since it estimates the financial cost of each unit of energy saved through efficiency measures. It evaluates the cost-effectiveness and investment-to-benefit ratio of environmental interventions, guiding environmentally viable energy conversion (Dovolil & Svítek, 2024). CES is clearly relevant to the ESG environmental domain, where it helps establish cost-optimal strategies for cutting energy waste and emissions, and it also bears on Governance, as it supports financial accountability and forward-looking planning of financial resources. The Energy Return on Investment (EROI) is another highly important KPI, calculated as the ratio of the energy output of a system to the energy invested in it. 
EROI has important implications for ESG’s environmental domain: a higher EROI indicates that the energy output of the system is significantly greater than the energy consumed (Hämäläinen, 2020), which points to better use of energy resources and a more sustainable energy supply. It supports the ESG environmental dimension by enabling transparent evaluation of energy system efficiency and by informing strategic decisions that maximize energy output without harmful resource depletion (de Trizio et al., 2024). The Energy Payback Time (EPBT) complements EROI, as it describes the time a system needs to recover the energy invested in its construction, setup, and maintenance. From an ESG perspective, EPBT plays a crucial role in evaluating the life-cycle sustainability of energy systems (Hu, 2023). In the digital twin environment, EPBT helps evaluate simulation scenarios and establish the sustainability level of different energy technologies, thereby strengthening the use of transparent data, an important consideration for the governance process in ESG modeling. The Cost of Peak Demand (CPD) measures the cost of peak electricity demand over a given period. CPD is critical for both environmental and economic sustainability, since reducing peak loads eases the strain on energy networks and avoids generating additional energy from fossil fuels, which carry higher emissions (Aghazadeh Ardebili et al., 2025). The Cumulative Cash Flow (CCF) considers both financial and environmental factors, as it evaluates the total cash flow of an energy project against its investment costs. In ESG analysis, it supports governance by using financial criteria to express financial transparency and assess future risk (Hien & Hanh, 2024). 
A positive cumulative cash flow is important because it shows that investment in a project, beyond its financial returns, also helps achieve resource savings and sustainability. The Share of Project Cost Subsidized (SPC) measures the extent to which grants fund a project. This indicator has a dual ESG role: it explains the financial attractiveness of sustainable investments and highlights the social benefits of including small players from developing communities in the use of sustainable technology (Wu et al., 2022). Renewable Energy Use (REU) is an essential ESG indicator that estimates the share of energy that comes from renewable sources. A high value signals a project's strong commitment to sustainability, in contrast to continued reliance on fossil fuels (Becattini et al., 2024). Digital twin technology is critical here, as it helps monitor energy use across different scenarios and thereby supports the assessment of sustainable energy use (Wei, 2023). The Energy Use per Worker Hour (EPWH) has a dual interpretation, relating energy use to labor productivity across different scenarios (Zhang et al., 2023). Socially, it signifies environmentally responsible production that does not strain human resources by being energy-intensive. On a digital twin platform, EPWH supports modeling of the balance between workforce and energy equity, as well as effective energy use in labor-intensive industries (Englezos et al., 2022). Taken together, these indicators form a sound analytical framework for a comprehensive digital twin model that expresses the complex objectives of sustainable production through specific, quantified, and tractable information. 
These indicators improve the proposed digital twin framework’s capabilities for both real-time activity monitoring and, through simulation, forecasting of future ESG performance. By incorporating them, the proposed digital twin platform offers a balanced, integrated approach to ESG responsibility, covering environmentally responsible operations (EROI, REU, EPBT) for low-cost energy use, economic soundness (CES, CCF, CPD, SPC) for sustainable economic growth, and social responsibility (EPWH) for fair social outcomes.

 

Table xyz. Governance Key Performance Indicators (KPIs) for ESG Evaluation within Digital Twin Frameworks

KPI

Acronym

Description

Formula

UoM

Cost of Energy Saving

CES

Measures the cost associated with energy savings achieved through energy efficiency interventions.

This parameter is particularly useful for comparing different investment options in terms of efficiency, as it estimates how much it costs to save one unit of energy (e.g., 1 kWh) through technological or operational measures.

The CES formula is structured to calculate the total cost of energy savings and divide it by the amount of energy saved, accounting for system inefficiencies.

A higher CES indicates a greater cost per unit of energy saved, suggesting that the intervention may be less cost-effective compared to other alternatives.

Conversely, a lower CES means a lower cost per unit of energy saved, making the energy efficiency measure more economically advantageous [24].

 

 

 = Change in initial investment. Represents the amount of capital required to implement the energy efficiency measure [€]

 = Change in operating costs. Includes expenses related to the operation and maintenance of the energy efficiency measure [€]

 = Energy price. Represents the cost per unit of energy, which can influence the savings achieved by the measure [€/kWh]

 = Change in energy consumption. Indicates the amount of energy saved as a result of the intervention [kWh]

 = Energy loss (or efficiency) factor associated with losses that may occur during the energy use process. It may include heat losses or other system inefficiencies [–]

$CRF$ = Capital Recovery Factor. Used to calculate the annualized cost of the investment and determine how much an investment must generate each year to be recovered over time [-]

$CRF = \frac{i (1 + i)^n}{(1 + i)^n - 1}$

$i$ = Interest rate [-]

$n$ = Amortization period [years].

[€/kWh]

Energy Return on Investment

EROI

Evaluates the energy efficiency of a production source by measuring how much energy is obtained compared to how much energy is invested to produce it. It is a key indicator of energy sustainability: the higher the EROI, the more efficient the system.

If EROI > 1, the energy process is sustainable, as the energy produced exceeds the energy invested.

If EROI = 1, the energy produced is exactly equal to the energy invested, meaning the system is at the limit of sustainability and produces no usable net energy.

If EROI < 1, the system is inefficient, since it requires more energy than it generates. Such a process is neither economically nor energetically sustainable in the long term.

This indicator answers the question: “How efficient is the energy investment?” [25].

 

$EROI = \frac{\sum_i \gamma_i E_{out,i}}{\sum_j \gamma_j E_{in,j}}$

$E_{out,i}$ = Total outgoing or produced energy from process i. This may include, for example, the electricity generated by a power plant or the fuel produced by a refinery [kWh].

$E_{in,j}$ = Total incoming or consumed energy for process j. This may include the energy required to extract, transform, or transport the energy source [kWh].

$\gamma_i$ and $\gamma_j$ = Scaling factors that can represent the quality of energy. For instance, they may be used to assign greater or lesser importance to certain forms of energy or technologies [–].

[-]

Energy Payback Time

EPBT

Measures the time required for an energy system to produce the same amount of energy that was needed to build, install, and maintain it.

If EPBT is high, it takes longer for the system to return the energy invested. Conversely, if EPBT is low, the energy system quickly recovers the energy used for its construction and startup.

It is an indicator that answers the question: “How long does it take for the system to repay the energy invested?” [26].

$EPBT = \frac{E_{inv}}{E_{annual}}$

$E_{inv}$ = Total invested energy required to build, install, maintain, and decommission the energy system throughout its life cycle [kWh].

$E_{annual}$ = Amount of energy that the system is capable of producing annually once it is operational [kWh/year]. 

[year]

Cost of Peak Demand

CPD

Measures the cost associated with the peak electricity demand over a given period.

A lower CPD is desirable, as it indicates effective management and reduced exposure to energy costs [27].

 

$CPD = P_{peak} \times c_{peak}$

$P_{peak}$ = Maximum power demand during a given period [kW].

$c_{peak}$ = Cost associated with each unit of power [€/kW].

[€]

Cumulative Cash Flow

CCF

Measures the total cash flow generated by the project in relation to the initial investment.

The CCF is useful for investors and decision-makers, as it helps assess a project's profitability, compare different investments, and plan future financial needs and returns on investment.

A CCF > 0 indicates that the project is generating more cash flow than the costs incurred, while a CCF < 0 indicates a loss. [24]

 

$CCF = \sum_{k=1}^{TL} FES_k \cdot ECC_k - IC$

$FES_k$ = Final Energy Savings in period k. This value indicates the final energy savings achieved through energy efficiency measures or other strategies [kWh].

$ECC_k$ = Energy Carrier Cost, i.e. the cost of energy per unit during period k. This may include costs for purchasing or using energy such as electricity, gas, etc. [€/kWh].

$TL$ = Technical Life, i.e. the project period during which energy savings and economic benefits are expected [years].

$IC$ = Investment Cost, i.e. the cost of the investment. It includes all expenses necessary to implement the project, such as installation, equipment, and other preliminary costs [€].

[€]

Share of Project Cost Subsidized

SPC

Indicates the proportion of the total project cost that has been financed through grants.

A high SPC means that a significant portion of the project has been funded through external aid, while a low SPC suggests that the project has been mainly self-financed.

SPC = 0% when no grants have been received (RS = 0), meaning no part of the project costs is subsidized.

SPC = 100% when the entire project cost is covered by grants (RS = IC), meaning the entire project is subsidized.

Range: 0% ≤ SPC ≤ 100% [28].

$SPC = \frac{RS}{IC} \times 100$

$RS$ = Received Subsidies, meaning the total amount of grants or funding received for the project [€].

$IC$ = Investment Cost, meaning the total investment cost [€].

 

 

 

[%]

Renewable Energy Use

REU

Provides a measure of the proportion of final energy savings that comes from renewable sources compared to all energy sources used.

It is useful for energy policies and environmental assessments, as it helps quantify and compare the impact of different energy sources on overall sustainability and efficiency.

A higher REU indicates greater use of renewable energy, while a lower REU suggests a higher dependence on fossil fuels.

Range: 0 % ≤   REU ≤   100%   [28].

 

 

$REU = \frac{\sum_k FES_k \cdot CF_k \cdot RES_k}{\sum_k FES_k \cdot CF_k} \times 100$

$FES_k$ = Final Energy Savings for each energy source k. Indicates the final energy savings achieved from that specific source [kWh].

$CF_k$ = Conversion Factor for each energy source k. This factor is used to convert the saved energy into a common unit, allowing comparison among different sources [-].

$RES_k$ = Renewable Energy Source factor for each energy source k, which accounts for the sustainability of the source. This value varies depending on the type of energy:

  • 0 for fossil fuels, indicating they do not contribute to sustainable energy production [-]

  • 1 for renewable sources such as biomass, wind, solar, and other renewables, as they are considered sustainable [-]

  • A value between 0 and 1 for mixed sources, such as industrial waste or end-of-life tires, depending on the sustainability level of the source [-]

[%]

Energy Use per Worker-Hour

EPWH

Measures the total energy used by a production system in relation to the number of human resources and working time.

It calculates the energy used per working hour, taking into account the total supplied energy minus the imported one, and normalizing the result by the number of workers and the annual working hours.

This indicator is useful for evaluating the energy efficiency of an organization or an entire economy, allowing comparisons over time or between different sectors or countries.

A low EPWH is considered positive, as it indicates higher productivity with lower energy use, suggesting a more sustainable use of energy resources.

Conversely, a high EPWH may indicate energy inefficiency, potentially linked to poorly optimized production processes, outdated machinery, or energy-intensive technologies [29].

 

 = Total Primary Energy Supply, i.e., the total amount of primary energy supplied, including all available energy sources [kWh].

 = Population number, meaning the total number of individuals within the studied population.

  = Total number of working hours per person per year [hours/year].

 = Industrial Primary Energy Supply, meaning the portion of TPES specifically used in the industrial sector [kWh].

 

 = Industrial Final Consumption, referring to the final energy consumption by the industrial sector [MWh].

 = Total Final Consumption, referring to the total final energy consumption within a given economic system, including the industrial, residential, tertiary, and transport sectors [MWh].

MJ/(inhabitant·hour/year)

Note: This table summarizes the Governance Key Performance Indicators (KPIs) used for ESG evaluation within digital twin frameworks. The listed indicators quantify economic efficiency, financial accountability, and strategic resource management, enabling transparent decision-making and long-term sustainability assessment. These variables collectively support the “Governance” dimension of ESG by linking economic performance with responsible investment, policy transparency, and data-driven management.
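
To make the governance indicators more concrete, the sketch below shows one way several of them could be evaluated in Python. It is illustrative only: the function names and sample inputs are hypothetical, and the formulas are the conventional forms implied by the variable definitions in the table (capital recovery factor, weighted output-to-input energy ratio, summed savings minus investment, and so on). The CES and EPWH formulas are not reproduced because their exact form cannot be recovered from the definitions alone.

```python
from typing import Iterable, Sequence, Tuple


def capital_recovery_factor(interest_rate: float, years: int) -> float:
    """CRF [-]: annualizes an investment over the amortization period."""
    growth = (1.0 + interest_rate) ** years
    return interest_rate * growth / (growth - 1.0)


def eroi(outputs_kwh: Sequence[float], inputs_kwh: Sequence[float],
         out_weights: Sequence[float], in_weights: Sequence[float]) -> float:
    """EROI [-]: quality-weighted energy produced over energy invested."""
    produced = sum(w * e for w, e in zip(out_weights, outputs_kwh))
    invested = sum(w * e for w, e in zip(in_weights, inputs_kwh))
    return produced / invested


def energy_payback_time(invested_kwh: float, annual_output_kwh: float) -> float:
    """EPBT [years]: life-cycle energy invested over annual energy produced."""
    return invested_kwh / annual_output_kwh


def cost_of_peak_demand(peak_kw: float, cost_per_kw: float) -> float:
    """CPD [€]: maximum power demand times the unit cost of power."""
    return peak_kw * cost_per_kw


def cumulative_cash_flow(savings_kwh: Sequence[float],
                         carrier_cost: Sequence[float],
                         investment_cost: float) -> float:
    """CCF [€]: value of energy savings over the technical life minus IC."""
    revenue = sum(fes * ecc for fes, ecc in zip(savings_kwh, carrier_cost))
    return revenue - investment_cost


def share_subsidized(received_subsidies: float, investment_cost: float) -> float:
    """SPC [%]: grants received as a share of total investment cost."""
    return 100.0 * received_subsidies / investment_cost


def renewable_energy_use(sources: Iterable[Tuple[float, float, float]]) -> float:
    """REU [%]: each source is (FES_k, CF_k, RES_k) with RES_k in [0, 1]."""
    rows = list(sources)
    renewable = sum(fes * cf * res for fes, cf, res in rows)
    total = sum(fes * cf for fes, cf, _ in rows)
    return 100.0 * renewable / total


if __name__ == "__main__":
    # Hypothetical project: three years of savings against a 500 € investment.
    print(cumulative_cash_flow([1000, 1000, 1000], [0.2, 0.2, 0.2], 500.0))
    print(renewable_energy_use([(100, 1, 1.0), (100, 1, 0.0), (50, 1, 0.5)]))
```

A CCF above zero or an EROI above one can then be read directly against the interpretation rules stated in the table.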

 

In addition to the key performance indicators listed above, the following contextual variables are recorded for the system under study to support normalization:

  • Area (Area_m² – AREA): The total floor area of the building or infrastructure facility covered by the energy and environmental indices, expressed in square meters.
  • Energy Consumption (Energy_Consumption_kWh – ENCO): The total energy consumption during the period under review, expressed in kilowatt-hours. It is the base quantity from which comparative energy performance indicators are derived.
  • Occupants (OCC): The number of people using or occupying a given space. It enables per capita calculations of energy use and environmental factors.

These normalizing variables enable meaningful comparison of performance across different buildings, facilities, and circumstances, thereby enhancing the robustness of the entire KPI system.
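
The role of these normalizing variables can be sketched in a few lines of Python. The `Building` class, its field names, and the sample values are hypothetical and serve only to show how AREA, ENCO, and OCC turn absolute consumption into comparable intensities.

```python
from dataclasses import dataclass


@dataclass
class Building:
    """Contextual variables for one site (names are illustrative)."""
    area_m2: float       # AREA
    energy_kwh: float    # ENCO over the review period
    occupants: int       # OCC

    def energy_per_m2(self) -> float:
        """Energy consumption normalized by floor area [kWh/m²]."""
        return self.energy_kwh / self.area_m2

    def energy_per_occupant(self) -> float:
        """Energy consumption normalized by occupancy [kWh/person]."""
        return self.energy_kwh / self.occupants


if __name__ == "__main__":
    # Two hypothetical sites with very different absolute consumption.
    small = Building(area_m2=2000.0, energy_kwh=180000.0, occupants=90)
    large = Building(area_m2=15000.0, energy_kwh=1200000.0, occupants=700)
    for site in (small, large):
        print(site.energy_per_m2(), site.energy_per_occupant())
```

In this example the larger site consumes far more energy in absolute terms yet less per square meter, which is the kind of comparison the normalizing variables are meant to enable.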

 

  1. Descriptive Statistical Analysis of the KPI Dataset for the Validation of a Digital Twin and Metaverse Prototype for Smart Buildings

 

The descriptive statistical analysis of the dataset highlights the complexity and diversity of the Key Performance Indicators (KPIs) used to evaluate the environmental, energy, and operational performance of smart buildings and infrastructures. This aligns with existing studies that emphasize the significance of KPI frameworks for optimized building management (Faria et al., 2021; Alrashed, 2020). The average surface area (AREA) of the sites analyzed is about 9,637 m², with considerable variability (SD above 5,200 m²), indicating that small buildings coexist with larger ones, including structures larger than 19,000 m². Energy consumption (ENCO) averages about 981,000 kWh, again with considerable variation, indicating that the dataset includes both energy-intensive and optimized buildings (Bandoria et al., 2024; Koutras et al., 2023). The Carbon Footprint (CFPT) averages 296 tCO₂e, a considerable level of emissions that is reasonable given the size of the buildings in the dataset. The Emission Intensity (EMIN), at 0.081 tCO₂/kWh, indicates optimized energy use with lower environmental impacts, in line with energy optimization strategies for smart infrastructure (Ho et al., 2021). The energy coverage factors, Load Cover Factor (LCF) and Supply Cover Factor (SCF), average 0.81 and 0.814, respectively, indicating that approximately 80% of the energy can be covered through on-site production or utilization. 
The Load Matching Index (LMI), averaging 71.7%, indicates good synchronization between energy production and energy demand, whereas the On-site Energy Ratio (OER), averaging 0.75, reflects considerable on-site energy production and therefore a strong degree of autonomy (Mustapha et al., 2025; Kumar et al., 2024). The Grid Interaction Index (GII) and the No Grid Interaction Probability (NGI) average 47% and 0.47, respectively, indicating a balance between energy autonomy and grid interaction. The capacity and operation indices show a mixed picture of facility performance. The Capacity Factor (CAF), averaging 0.54, signifies that slightly more than half of the installed capacity is actually used, while One Percent Peak Power (OPP) averages 584 kW, indicating periods of significant peak load. The flexibility and demand response factors (DRS, FLF, FLI, FEE) indicate a mid-level of flexibility. The Demand Response (DRS) factor averages 9%, signifying some flexibility for load reduction or time shifting, while the Flexible Energy Efficiency (FEE) factor averages about 49%, signifying scope for improvement in dynamic energy use (Romanska-Zapala et al., 2020). As for environmental and comfort factors, indoor conditions are stable and meet comfort requirements. The average humidity (HUM) is 49%, well inside the comfort range. The mean levels of PM₂.₅ and PM₁₀ (11.2 µg/m³ and 24.6 µg/m³) are below the World Health Organization’s requirements, confirming that indoor air quality is satisfactory (Hakawati et al., 2024). 
The Volatile Organic Compound (VOC) concentration, averaging 186 ppb, shows significant variability, which can be influenced by building materials and ventilation effectiveness. The average air change rate (ACH) is 4, confirming that the recommended rates for non-industrial buildings are met (Mustapha et al., 2025). Thermal and acoustic performance is also acceptable, with the Thermal Insulation Rate (THR) averaging 2.93 m²·K/W and the Sound Insulation Index (SND) averaging 43 dB, indicating well-insulated and acoustically comfortable environments (Mustapha et al., 2025). Regarding energy subsystems, EER, COP, and SEF average 10.3, 2.86, and 87.5%, respectively: SEF exceeds its 85% threshold, whereas the mean EER and COP remain below the efficiency thresholds of 12 and 3.5 given in the KPI table. The average Energy Use Intensity per person (EUI) is 16.9 kWh/person/year, slightly above the 15 kWh/person/year target, while the average Lighting Power Density (LPD) is 0.008 kW/m² (8 W/m²), within the 10 W/m² limit (Arias-Requejo et al., 2023). From an economic perspective, there is greater variability. The Cost of Energy Saving (CES) averages 11.45 €/kWh and the Energy Return on Investment (EROI) averages 14.79, indicating equilibrium, albeit with considerable variability. The average Energy Payback Time (EPBT) is 4.9 years, indicating an acceptable energy recovery time (Hakawati et al., 2023). The mean Cumulative Cash Flow (CCF) is negative, indicating that project costs are not fully recovered on average, while the Share of Project Cost Subsidized (SPC) averages 35%, indicating strong subsidization, largely financial in nature. Renewable Energy Use (REU) averages 64%, indicating strong integration of clean energy, while Energy Use per Worker Hour (EPWH) averages about 39.8 MJ, indicating that average energy productivity can still improve (Kumar et al., 2024).
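
The comparison between the reported means and the thresholds quoted in the KPI table can be checked mechanically. The following Python sketch does so for the indoor-environment and energy-efficiency indicators; the dictionary layout and function name are hypothetical, the means are the values reported in the descriptive statistics, and the thresholds are those stated in the KPI table (LPD is converted from kW/m² to W/m² before comparison).

```python
# Mean values from the descriptive statistics (LPD in kW/m²).
MEANS = {
    "HUM": 49.463, "PM25": 11.233, "PM10": 24.617, "VOC": 186.01,
    "ACH": 4.051, "EER": 10.34, "COP": 2.857, "SEF": 87.511,
    "EUI": 16.932, "LPD": 0.008,
}

# Thresholds as quoted in the KPI table, expressed as pass/fail tests.
CHECKS = {
    "HUM": lambda v: 40.0 <= v <= 60.0,   # comfort range [%]
    "PM25": lambda v: v < 20.0,           # µg/m³
    "PM10": lambda v: v < 50.0,           # µg/m³
    "VOC": lambda v: v < 300.0,           # ppb
    "ACH": lambda v: 3.0 <= v <= 5.0,     # air changes per hour
    "EER": lambda v: v >= 12.0,
    "COP": lambda v: v >= 3.5,
    "SEF": lambda v: v >= 85.0,           # %
    "EUI": lambda v: v < 15.0,            # kWh/person/year
    "LPD": lambda v: v * 1000.0 < 10.0,   # kW/m² converted to W/m²
}


def failing_kpis(means: dict, checks: dict) -> list:
    """Return the KPIs whose mean does not satisfy the tabulated threshold."""
    return [name for name, check in checks.items() if not check(means[name])]


if __name__ == "__main__":
    print(failing_kpis(MEANS, CHECKS))
```

With the reported means, this check flags EER, COP, and EUI as the indicators that do not meet their tabulated thresholds, while the air-quality, ventilation, SEF, and LPD means pass.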

 

Table xyz. Descriptive Statistics of the KPI Dataset for the Validation of a Digital Twin and Metaverse Prototype Applied to Smart Buildings.

| Variable | Obs | Mean | Std_Dev | Min | Max | p1 | p99 | Skew | Kurt |
|---|---|---|---|---|---|---|---|---|---|
| AREA | 100 | 9637.3 | 5249.252 | 1161 | 19942 | 1175 | 19694 | 0.167 | 1.959 |
| ENCO | 100 | 981000 | 562000 | 63556.65 | 1970000 | 72951.46 | 1960000 | 0.11 | 1.79 |
| CFPT | 100 | 295.725 | 130.658 | 52.28 | 495.52 | 53.21 | 491.685 | -0.275 | 1.887 |
| EMIN | 100 | 0.081 | 0.039 | 0.022 | 0.149 | 0.022 | 0.148 | -0.017 | 1.765 |
| LCF | 100 | 0.811 | 0.125 | 0.604 | 0.997 | 0.604 | 0.996 | -0.146 | 1.722 |
| SCF | 100 | 0.814 | 0.119 | 0.606 | 1 | 0.609 | 1 | -0.069 | 1.784 |
| LMI | 100 | 71.682 | 13.711 | 51 | 99.33 | 51.415 | 99.225 | 0.383 | 1.981 |
| OER | 100 | 0.753 | 0.25 | 0.33 | 1.191 | 0.339 | 1.18 | 0.035 | 1.729 |
| GII | 100 | 47.038 | 29.213 | 0.46 | 99.69 | 0.885 | 99.085 | 0.104 | 1.86 |
| NGI | 100 | 0.469 | 0.281 | 0.011 | 0.984 | 0.012 | 0.966 | 0.076 | 1.823 |
| CAF | 100 | 0.541 | 0.312 | 0.018 | 0.998 | 0.019 | 0.995 | -0.123 | 1.666 |
| OPP | 100 | 584.406 | 263.627 | 105.75 | 995.42 | 116.035 | 989.245 | -0.254 | 1.724 |
| DRS | 100 | 9.006 | 11.919 | -9.61 | 29.88 | -9.575 | 29.675 | 0.073 | 1.825 |
| FLF | 100 | 0.045 | 0.584 | -0.939 | 0.993 | -0.938 | 0.984 | -0.072 | 1.735 |
| FLI | 100 | 0.27 | 0.445 | -0.493 | 0.999 | -0.492 | 0.99 | -0.139 | 1.815 |
| FEE | 100 | 49.036 | 26.92 | 0.76 | 98.78 | 1.29 | 97.535 | -0.023 | 1.98 |
| OCC | 100 | 412.27 | 225.185 | 50 | 933 | 61 | 927 | 0.387 | 2.307 |
| HUM | 100 | 49.463 | 7.495 | 25 | 73.7 | 27.25 | 70.65 | -0.078 | 4.539 |
| PM25 | 100 | 11.233 | 4.714 | 3 | 22.3 | 3 | 21.85 | 0.274 | 2.341 |
| PM10 | 100 | 24.617 | 9.179 | 8 | 42.9 | 8 | 42.65 | 0.074 | 2.285 |
| VOC | 100 | 186.01 | 87.096 | 20 | 383 | 20 | 371 | -0.163 | 2.445 |
| ACH | 100 | 4.051 | 0.795 | 2.25 | 6.05 | 2.285 | 5.82 | 0.043 | 2.616 |
| THR | 100 | 2.934 | 0.859 | 0.8 | 5.5 | 0.97 | 5.025 | 0.099 | 2.921 |
| SND | 100 | 43.343 | 6.227 | 30 | 61.6 | 30.8 | 60.3 | 0.278 | 2.962 |
| EER | 100 | 10.34 | 1.169 | 7.18 | 13.03 | 7.545 | 12.885 | -0.158 | 2.72 |
| COP | 100 | 2.857 | 0.368 | 2.2 | 3.59 | 2.2 | 3.59 | 0.055 | 2.287 |
| SEF | 100 | 87.511 | 4.892 | 72.2 | 97.3 | 74.4 | 97.2 | -0.436 | 3.155 |
| EUI | 100 | 16.932 | 3.683 | 7.5 | 25.4 | 8.6 | 25.35 | 0.019 | 2.616 |
| LPD | 100 | 0.008 | 0.002 | 0.005 | 0.012 | 0.005 | 0.012 | 0.22 | 2.318 |
| CES | 100 | 11.453 | 25.527 | 0.019 | 213.237 | 0.02 | 146.411 | 5.406 | 40.749 |
| EROI | 100 | 14.79 | 21.237 | 0.193 | 121.655 | 0.224 | 114.719 | 3.32 | 14.856 |
| EPBT | 100 | 4.91 | 11.729 | 0.08 | 86.67 | 0.09 | 79.575 | 5.544 | 35.698 |
| CPD | 100 | 141000 | 73729.43 | 14691.18 | 298000 | 15023.01 | 298000 | 0.232 | 2.306 |
| CCF | 100 | -420000 | 785000 | -1780000 | 2390000 | -1760000 | 2050000 | 0.644 | 3.843 |
| SPC | 100 | 34.946 | 20.902 | 0.25 | 69.89 | 0.405 | 69.885 | 0.019 | 1.789 |
| REU | 100 | 64.338 | 13.584 | 30.98 | 95.58 | 34.64 | 92.81 | -0.066 | 2.344 |
| EPWH | 100 | 39.763 | 45.06 | 0.302 | 229.515 | 0.337 | 189.341 | 1.5 | 5.199 |

Note. This table presents the descriptive statistical parameters of the Key Performance Indicator (KPI) dataset developed to support the validation of a prototypal Digital Twin and Metaverse model for Smart Building management. The dataset integrates environmental, energy, operational, and governance-related variables, enabling the characterization of heterogeneous building typologies and operational conditions. The statistical descriptors (mean, standard deviation, minimum, maximum, skewness, and kurtosis) provide a quantitative overview of variability and distribution, essential for model calibration, simulation accuracy, and data-driven performance validation within the digital twin environment.
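The descriptors in the table can be reproduced with a few lines of code. The following is a minimal sketch, not the authors' procedure: it assumes skewness and kurtosis are computed from population central moments and that kurtosis is reported on the non-excess scale (a normal distribution scores 3), which is an assumption about the software used. The function name `describe` and the sample data are hypothetical. The percentiles p1 and p99 are left out because interpolation rules differ between packages.

```python
import math

def describe(values):
    """Return descriptive statistics similar to those reported per variable.

    Skewness and kurtosis use population (divide-by-n) central moments;
    kurtosis is NOT excess kurtosis, so a normal distribution scores 3.
    This convention is an assumption, not something stated in the source.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "obs": n,
        "mean": mean,
        "std_dev": math.sqrt(m2 * n / (n - 1)),  # sample standard deviation
        "min": min(values),
        "max": max(values),
        "skew": m3 / m2 ** 1.5,
        "kurt": m4 / m2 ** 2,
    }

# Hypothetical, evenly spaced sample standing in for one KPI column.
sample = [float(v) for v in range(1, 101)]
stats = describe(sample)
print(stats)
```

On this scale a continuous uniform distribution has a kurtosis of 1.8 and a skewness of 0. Many rows of the table fall close to that combination, which may simply reflect how the values were generated; the source does not say.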

 

 

 

  4. Validation Framework and Data Reliability for ESG-Based Smart Building Model

 

The image illustrates the validation framework for an ESG (Environmental, Social, Governance) Smart Building model, outlining a methodological process divided into four main phases.

 

Figure 1. Validation Framework for ESG-Based Smart Building Model. This framework validates and structures ESG data for Smart Building applications, combining statistical and machine learning methods to ensure data reliability and predictive accuracy. The validated dataset supports testing and prototyping of a management system that integrates metaverse and digital twin technologies for advanced, real-time smart building management.

 

 

The process starts with data preparation and structuring: data on environmental, social, and governance indicators are collected, normalized, and organized into three analytic blocks, and the observations are screened for outliers to ensure data quality for the subsequent analysis. The second phase comprises correlation analysis and Principal Component Analysis (PCA); the PCA is intended to identify latent components and to verify the structural homogeneity of the data. The third phase applies Ordinary Least Squares (OLS) linear regression separately to the environmental, social, and governance components, with AREA as the dependent variable. At this stage the framework uses the Variance Inflation Factor (VIF) to test for multicollinearity and reports the coefficient of determination and the degrees of freedom. The fourth phase uses machine learning algorithms to improve predictive analysis, comparing Boosting, Decision Tree, K-Nearest Neighbors (KNN), Random Forest, Regularized regression, and Support Vector Machine models separately for each component. The framework is designed so that the processed data can be used for testing during the design of a management system that combines the metaverse and a digital twin, provided that the structural homogeneity of the data is ensured. Once that homogeneity is established, it becomes meaningful and timely to build an advanced digital environment that is both interactive and immersive and that supports environmental management in an intelligent digital setting.
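The data preparation phase can be illustrated with a short sketch of normalization and outlier screening. The source does not state which normalization or screening rule is used, so the z-score standardization, the threshold of 3, and the names `zscores` and `flag_outliers` are all assumptions made for illustration.

```python
import statistics

def zscores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def flag_outliers(values, threshold=3.0):
    """Return the indices whose absolute z-score exceeds the threshold.

    The threshold of 3 is a common rule of thumb, assumed here.
    """
    return [i for i, z in enumerate(zscores(values)) if abs(z) > threshold]

# Hypothetical readings: twenty typical values and one extreme value.
readings = [10.0] * 20 + [100.0]
outliers = flag_outliers(readings)
print(outliers)
```

In a sample of n observations the largest attainable z-score is (n - 1) / sqrt(n), so a fixed threshold of 3 can only flag values when n is larger than about 10. With 100 observations per variable, as in the descriptive table, this limit is not binding.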

 

 

  5. Scientific Validation of ESG Data through Correlation Analysis for Smart Building Prototyping

 

 

Using the correlation matrix to validate the database behind the ESG analysis is well grounded scientifically. Correlation analysis is a standard statistical approach for assessing the internal consistency of a dataset, since it shows whether the investigated factors are positively or negatively associated and how strongly. For the ESG factors, it indicates that each factor captures a specific, non-overlapping aspect of sustainability, which supports the internal consistency of the data structure. In the context of smart building implementation, it validates the quality of the data that flow into the digital management system. The correlations among the energy consumption and environmental emission factors indicate that these factors vary largely independently of one another, and the moderate levels of correlation indicate that the dataset is multidimensional. This is in line with accepted practice in environmental analysis and management science. It provides a robust foundation for further analysis, such as PCA and regression, which provide further evidence in support of research on environmental sustainability, and it supports integrating the data into the digital twin and metaverse environment. To assess smart building implementation, the analysis comprises three correlation analyses, one for each ESG dimension: the Environmental factor (E), the Social factor (S), and the Governance factor (G).
Analyzing these factors separately offers an important perspective, as it confirms that the data structure is internally consistent within each dimension.
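As an illustration of the per-dimension correlation analysis described above, the sketch below computes a Pearson correlation matrix for one block of variables and reports its largest off-diagonal coefficient in absolute value. The column names and values are hypothetical placeholders, not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return a nested dict {a: {b: r_ab}} for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

def max_abs_offdiagonal(matrix):
    """Largest absolute correlation between two different variables."""
    return max(abs(r) for a, row in matrix.items() for b, r in row.items() if a != b)

# Hypothetical block of three indicators with five observations each.
block = {
    "X1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "X2": [2.0, 1.0, 4.0, 3.0, 5.0],
    "X3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
matrix = correlation_matrix(block)
print(max_abs_offdiagonal(matrix))
```

Running the same functions once per ESG block (E, S, and G) would give the three matrices discussed in the subsections that follow.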

 

 

 

5.1 Correlation Analysis and Validation of Environmental (E) Factors in the ESG Framework

 

 

 

The environmental factor in the ESG framework covers operational characteristics related to energy, emissions, and environmental issues. The correlation matrix for the environmental factors (AREA, CFPT, ENCO, EMIN, LCF, SCF, LMI, OER, GII, NGI, OPP, DRS, FLF, FLI, FEE) supports initial checks for internal dataset coherence and for multicollinearity among factors. The correlations range from weak to moderate, which indicates that the factors do not measure the same thing twice (Wang, 2024; Eskantar et al., 2024). This supports the construct validity of the environmental elements, since the factors span a wide range of aspects with little overlap (Handoko, Afifudin, & Holili, 2024). AREA, which represents the physical size of the asset, is only negligibly correlated with the other factors. Its slight negative correlations with energy consumption (ENCO, -0.06) and carbon footprint (CFPT, -0.04) indicate that larger areas do not necessarily lead to greater energy use and emissions (Hou et al., 2025). The positive but trivial correlation between the Load Cover Factor (LCF) and AREA (0.05) hints that larger buildings may handle load factors slightly better, though the relationship is negligible. CFPT is negatively correlated with both ENCO (-0.14) and Emission Intensity (EMIN, -0.23). The negative relationship between CFPT and ENCO may seem paradoxical, but it could reflect differences in the use of cleaner energy across organizations (Zhou, 2024). The negative association between CFPT and EMIN implies that emission intensity tends to fall when total emissions are higher, suggesting either that larger organizations draw on different energy resources as they scale or that better technological efficiency accounts for the result (Du et al., 2024). 
These weak relationships underline that emissions, although influenced by many factors, are not determined by the quantity of energy used alone, so it is valid to keep CFPT and EMIN as distinct factors. ENCO is also only weakly correlated with the other environmental factors, which supports retaining it as a separate indicator. Its weak negative correlations with the Load and Supply Cover Factors (LCF, -0.12; SCF, -0.22) imply that greater energy use does not necessarily correspond to better load coverage or supply adequacy, so quantity and management efficiency remain separate (Wang, 2024). This strengthens the theoretical basis for modeling operational intensity and efficiency as separate dimensions of the environmental domain. The correlations among LCF, SCF, LMI, and OER, the factors that describe energy balance and autonomy, show an internally logical structure. For example, LCF is positively correlated with EMIN (0.18) and LMI (0.25), indicating that systems with greater load coverage tend to show better operational matching. The positive relation between LCF and EMIN may at first appear contradictory, since greater intensity could indicate inefficiency, yet it could also reflect systems running at or near full capacity, where greater loads temporarily raise intensity. The slight positive relationship between LCF and the On-site Energy Ratio (OER, 0.09) fits this logic: greater load coverage goes together with greater on-site production, a sensible practice in sustainable system design (Dovolil & Svítek, 2024). GII and NGI, which describe interaction with the power grid, show slightly negative or otherwise weak correlations with almost all other factors. This is also plausible: systems that depend more on the grid (greater NGI, lower GII) are not directly associated with greater efficiency (FEE) or flexibility (FLI) (Zhou, 2024; Wang, 2024). 
These weak correlations indicate that grid interaction is an autonomous domain within the environmental dimension and suggest that the dataset covers most aspects of that dimension, from production to system administration (Eskantar et al., 2024; Hou et al., 2025). The flexibility factors (FLF, FLI, FEE) are only weakly correlated with each other, indicating that flexibility and efficiency remain largely autonomous in the analysis. The slight positive correlations between FLF and AREA (0.105) and between FEE and LCF (0.009) suggest that larger systems display marginally greater flexibility. This limited interdependence indicates that the different dimensions of the environmental domain (structure, operation, and efficiency) are only partially related (Eskantar et al., 2024; Hou et al., 2025). Overall, the environmental correlations support the validity of the dataset. The low to moderate correlations indicate that the environmental factors explore different, albeit complementary, dimensions of sustainability without considerable redundancy. This also strengthens the basis for further analysis, such as PCA, which in turn supports the interpretation of the factorial structure underlying the environmental dimension and a meaningful combination of indicators (Handoko et al., 2024; Wang, 2024).

 

 

Table xyz. Correlation Matrix for Environmental (E) Factors in the ESG Model

 

| Variables | AREA | CFPT | ENCO | EMIN | LCF |
|---|---|---|---|---|---|
| AREA | 1.0000 | -0.0382 | -0.0608 | -0.0678 | 0.0483 |
| CFPT | -0.0382 | 1.0000 | -0.1416 | -0.2254 | -0.0229 |
| ENCO | -0.0608 | -0.1416 | 1.0000 | -0.0344 | -0.1235 |
| EMIN | -0.0678 | -0.2254 | -0.0344 | 1.0000 | 0.1844 |
| LCF | 0.0483 | -0.0229 | -0.1235 | 0.1844 | 1.0000 |
| SCF | -0.0142 | -0.0214 | -0.2180 | -0.1927 | -0.1126 |
| LMI | 0.0376 | 0.0284 | -0.0592 | -0.0165 | 0.2509 |
| OER | 0.0432 | 0.0155 | -0.1793 | 0.0205 | 0.0918 |
| GII | -0.0380 | 0.0052 | -0.2230 | 0.0543 | -0.0519 |
| NGI | -0.0188 | 0.0250 | -0.0573 | -0.0805 | -0.0523 |
| OPP | -0.1248 | 0.0472 | 0.1331 | 0.2376 | -0.0651 |
| DRS | -0.0577 | 0.1073 | -0.1592 | -0.0992 | -0.1351 |
| FLF | 0.1050 | -0.1392 | -0.0412 | 0.0490 | -0.0770 |
| FLI | 0.0023 | 0.1272 | -0.0822 | 0.0331 | -0.0335 |
| FEE | 0.0965 | -0.0327 | -0.0738 | -0.0678 | 0.0085 |

 

Note: The table presents the correlation coefficients among the environmental indicators used within the ESG framework. The low to moderate correlation values confirm that the variables are largely independent and represent distinct aspects of environmental performance, such as energy use, emissions, and operational efficiency. This statistical consistency validates the internal coherence of the dataset and ensures its suitability for advanced modeling techniques, including PCA and regression analysis. The results further demonstrate that the data are appropriate for use in the prototyping and testing of smart building management systems based on digital twin and metaverse technologies.
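A reader may want to check which coefficients in the table clear a conventional significance threshold. The sketch below applies the standard t-test for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), to a few coefficients copied from the table, with n = 100 taken from the Obs column of the descriptive statistics. The critical value 1.984 is an approximate two-sided 5% value of Student's t with 98 degrees of freedom taken from standard tables, and no correction for multiple comparisons is applied; both are assumptions of this illustration rather than part of the authors' analysis.

```python
import math

N_OBS = 100      # observations per variable, from the descriptive table
T_CRIT = 1.984   # approximate two-sided 5% critical t value for 98 df (assumed)

def t_statistic(r, n=N_OBS):
    """t statistic for testing that a Pearson correlation equals zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def critical_r(n=N_OBS, t_crit=T_CRIT):
    """Smallest absolute correlation that reaches the critical t value."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

# A few coefficients copied from the environmental correlation table.
pairs = {
    ("AREA", "ENCO"): -0.0608,
    ("CFPT", "ENCO"): -0.1416,
    ("CFPT", "EMIN"): -0.2254,
    ("EMIN", "LCF"): 0.1844,
    ("LCF", "LMI"): 0.2509,
}
threshold = critical_r()
flagged = sorted(p for p, r in pairs.items() if abs(r) > threshold)
print(round(threshold, 3), flagged)
```

Under these assumptions the threshold is a little below 0.2, so most coefficients in the table fall under it while a handful of the larger ones exceed it.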

 

 

The correlation heat map is a graphical representation of the relationships among the environmental indicators in the dataset. It shows mainly light-colored regions and only a few strong red and blue regions, indicating that most correlations are low to moderate. This graphical reading supports the initial statistical analysis: most environmental factors are largely independent of one another and cover different facets of energy consumption, emissions intensity, load management, and efficiency. Similar correlation patterns are found in other ESG datasets, for which multidimensionality is crucial to the robustness and interpretability of modeling (Ioannidis et al., 2022; Loukili & Benli, 2023). The absence of strongly correlated factors indicates that the dataset is well structured and shows no sign of multicollinearity, so it meets the requirements for accurate modeling and interpretation (Eskantar et al., 2024). The regions with moderately correlated factors, found in different parts of the heat map, correspond to relationships that are expected for the dimensions involved. For example, the low positive correlation between Emission Intensity (EMIN) and the Load Cover Factor (LCF) could reflect operational conditions: when power systems operate at maximum load, emission intensity tends to increase. Other low positive correlations among the factors related to energy autonomy (on-site energy ratio, OER) and load matching (load matching indicators, LMI) show coherent interactions between energy autonomy and system efficiency, in line with results from ESG analyses carried out with alternative methods (Sorathiya et al., 2024). Overall, the heat map shows that each set of factors has an inherent, logical structure without compromising their mutual independence. 
The heat map suggests that no single group of strongly correlated factors fully defines the environmental dimension. This indicates that the dataset is inherently multidimensional, covering different facets of energy, emission intensity, load balance, and flexibility, each of which contributes to a comprehensive ESG analysis in its own way. An integrated view of this kind has also been applied in ESG analysis to evaluate smart city infrastructure (Dovolil & Svítek, 2024). The variables are thus distinct yet conceptually related, providing a strong basis for analyses such as PCA and regression models in the ESG framework.
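To make the link between weak correlations and PCA concrete, the sketch below extracts the leading principal component of the 5 x 5 block (AREA, CFPT, ENCO, EMIN, LCF) printed in the environmental correlation table, using plain power iteration. It is only an illustration under stated assumptions: the authors' PCA presumably uses all environmental variables and a statistical package, and the function names here are hypothetical.

```python
import math

# Correlations among AREA, CFPT, ENCO, EMIN, LCF, copied from the table.
R = [
    [1.0000, -0.0382, -0.0608, -0.0678, 0.0483],
    [-0.0382, 1.0000, -0.1416, -0.2254, -0.0229],
    [-0.0608, -0.1416, 1.0000, -0.0344, -0.1235],
    [-0.0678, -0.2254, -0.0344, 1.0000, 0.1844],
    [0.0483, -0.0229, -0.1235, 0.1844, 1.0000],
]

def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_component(m, iterations=5000):
    """Return (eigenvalue, unit eigenvector) of the dominant eigenpair."""
    v = [float(i + 1) for i in range(len(m))]  # arbitrary non-uniform start
    for _ in range(iterations):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector.
    eigenvalue = sum(a * b for a, b in zip(v, matvec(m, v)))
    return eigenvalue, v

eigenvalue, loadings = leading_component(R)
share = eigenvalue / len(R)  # share of total variance on the first component
print(round(eigenvalue, 3), round(share, 3))
```

For a correlation matrix the eigenvalues sum to the number of variables, so a first-component share only slightly above 1/5 in this block would be consistent with the weak correlations described in the text.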

 

 

 

 

 

Figure xyz. Heat Map of the Correlation Matrix for Environmental (E) Factors in the ESG Model. Note: The heat map shows mostly weak to moderate correlations, indicating that the environmental variables are independent and free from multicollinearity. This confirms the dataset’s structural validity and its suitability for integration into digital twin and metaverse-based smart building management models.

 

 

 

5.2 Correlation Analysis and Validation of Social (S) Factors in the ESG Framework

 

 

The social dimension of the ESG model emphasizes human and system characteristics at the building level, such as user comfort, indoor air quality, thermal and acoustic performance, and efficiency. The correlation matrix for the social dimension (OCC, HUM, PM25, PM10, VOC, ACH, THR, SND, EER, COP, SEF, EUI, and LPD) is an important validation step for the dataset used to create a management model based on digital twin technology (Hadjidemetriou et al., 2023). The correlations between variables are largely weak to moderate, indicating that each variable captures a distinct aspect with little overlap. The number of occupants (OCC) shows small positive correlations with relative humidity (HUM, 0.13) and fine particle concentration (PM2.5, 0.20), since greater human presence could slightly raise both (Lo, 2025). However, these correlations remain weak, which is consistent with the hypothesis that well-designed environmental conditions and ventilation systems can keep occupancy from having a major impact on indoor air quality (Cai et al., 2023). The near-zero correlations between OCC and the concentration of Volatile Organic Compounds (VOC, -0.07) and thermal resistance (THR, -0.08) indicate little to no relationship between human presence and these factors, again suggesting that the dataset is well balanced. The air quality variables (PM2.5, PM10, and VOC) are mildly correlated, particularly PM2.5 and PM10 (0.24), since the two tend to co-occur at the same locations (Hadjidemetriou et al., 2023b). The low correlation between VOC and humidity (-0.06) indicates that air quality factors are largely independent of other indoor environmental factors, which supports measuring them as separate elements of the social dimension (Ni et al., 2024). 
The air change rate (ACH) shows positive, albeit weak, correlations with thermal insulation (THR, 0.03) and sound insulation (SND, 0.11), indicating that ventilation performance is largely independent of envelope characteristics, which matters for digital twin simulations of indoor user comfort (Islam et al., 2024). The thermal and acoustic factors (THR, SND) show weak positive correlations with COP, SEF, and EER (between 0.01 and 0.14), suggesting that buildings with lower thermal and sound transmission tend to exhibit slightly better energy system efficiency. This pattern is expected and supports the dataset's internal validity by linking comfort factors to actual system performance (Alibrandi, 2022). In contrast, EER, COP, and SEF are moderately to strongly positively correlated with one another (0.49 to 0.72), because energy efficiency indicators are expected to converge. For a digital twin model such correlations are acceptable, as the indicators assess system aspects that are closely related yet complementary (Hii & Hasama, 2024). Energy use intensity (EUI) and lighting power density (LPD) show a strong positive correlation (r = 0.88), as lighting strongly influences energy use per capita. This pattern suggests that the dataset captures internal load patterns, which matters for assessing comfort and productivity in digital twin settings and for exploring social system factors through ESG models (Yossef Ravid & Aharon-Gutman, 2023). However, the low correlations between EUI and LPD and the remaining social indicators indicate that energy use patterns remain a separate system factor, not driven by the other social factors. 
Overall, the correlation matrix for the social component shows low to moderate correlations that reflect logical links among comfort, air quality, and energy factors, indicating that the dataset captures complementary system factors. The social component of the dataset is therefore soundly constructed for ESG modeling and can reliably support analysis, simulation, and decision-making in digital twin frameworks for managing energy-efficient buildings, where energy performance and social outcomes are improved together.

 

 

 

Table xyz. Correlation Matrix for Social (S) Dimension Variables in the ESG Smart Building Model

 

| Variable | OCC | HUM | PM25 | PM10 | VOC | ACH | THR | SND | EER | COP | SEF | EUI | LPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OCC | 1.0000 | 0.1329 | 0.1953 | 0.0406 | -0.0661 | -0.0387 | -0.0806 | 0.0172 | 0.0373 | 0.0912 | 0.1720 | -0.0849 | -0.0437 |
| HUM | 0.1329 | 1.0000 | 0.0027 | 0.0540 | -0.0592 | 0.1160 | -0.1581 | 0.0172 | 0.0477 | 0.0399 | 0.0013 | 0.0618 | 0.0800 |
| PM25 | 0.1953 | 0.0027 | 1.0000 | 0.2370 | 0.0320 | -0.0518 | -0.2271 | 0.1503 | -0.0616 | 0.0376 | 0.0095 | -0.0114 | 0.0935 |
| PM10 | 0.0406 | 0.0540 | 0.2370 | 1.0000 | 0.0760 | 0.0683 | 0.0201 | 0.0481 | -0.0705 | -0.0022 | -0.0393 | 0.0587 | 0.0935 |
| VOC | -0.0661 | -0.0592 | 0.0320 | 0.0760 | 1.0000 | 0.0005 | -0.0622 | -0.0455 | -0.0209 | -0.0401 | -0.0085 | 0.0454 | 0.0214 |
| ACH | -0.0387 | 0.1160 | -0.0518 | 0.0683 | 0.0005 | 1.0000 | 0.0289 | 0.1062 | 0.0741 | 0.0784 | 0.0607 | 0.0243 | 0.0072 |
| THR | -0.0806 | -0.1581 | -0.2271 | 0.0201 | -0.0622 | 0.0289 | 1.0000 | 0.1467 | 0.1021 | 0.1425 | 0.1260 | -0.0078 | 0.0573 |
| SND | 0.0172 | 0.0172 | 0.1503 | 0.0481 | -0.0455 | 0.1062 | 0.1467 | 1.0000 | 0.0119 | 0.0676 | 0.0631 | -0.0225 | 0.0202 |
| EER | 0.0373 | 0.0477 | -0.0616 | -0.0705 | -0.0209 | 0.0741 | 0.1021 | 0.0119 | 1.0000 | 0.4872 | 0.7244 | -0.1632 | -0.0750 |
| COP | 0.0912 | 0.0399 | 0.0376 | -0.0022 | -0.0401 | 0.0784 | 0.1425 | 0.0676 | 0.4872 | 1.0000 | 0.7074 | -0.0906 | -0.0529 |
| SEF | 0.1720 | 0.0013 | 0.0095 | -0.0393 | -0.0085 | 0.0607 | 0.1260 | 0.0631 | 0.7244 | 0.7074 | 1.0000 | -0.1307 | -0.0399 |
| EUI | -0.0849 | 0.0618 | -0.0114 | 0.0587 | 0.0454 | 0.0243 | -0.0078 | -0.0225 | -0.1632 | -0.0906 | -0.1307 | 1.0000 | 0.8829 |
| LPD | -0.0437 | 0.0800 | 0.0935 | 0.0935 | 0.0214 | 0.0072 | 0.0573 | 0.0202 | -0.0750 | -0.0529 | -0.0399 | 0.8829 | 1.0000 |

 

Note. The table displays the correlations among social indicators such as comfort, air quality, and energy efficiency. The weak to moderate correlations confirm that these variables represent distinct yet complementary dimensions, ensuring the dataset’s internal consistency and its suitability for digital twin-based simulations in smart building management.
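The few strong correlations in this table (among EER, COP, and SEF, and between EUI and LPD) can be put in perspective with a pairwise Variance Inflation Factor. The sketch below uses VIF = 1 / (1 - r^2), which is the exact VIF only when a predictor has a single correlated partner; with more predictors it is a lower bound on the full VIF. The function name `pairwise_vif` is hypothetical, and the rule-of-thumb cut-offs of 5 or 10 mentioned in the comment are common conventions, not thresholds stated in the source.

```python
def pairwise_vif(r):
    """VIF implied by a single pairwise correlation r (lower bound on full VIF)."""
    return 1.0 / (1.0 - r * r)

# Strongest coefficients copied from the social correlation table.
strong_pairs = {
    ("EER", "COP"): 0.4872,
    ("EER", "SEF"): 0.7244,
    ("COP", "SEF"): 0.7074,
    ("EUI", "LPD"): 0.8829,
}
vifs = {pair: pairwise_vif(r) for pair, r in strong_pairs.items()}

for pair, vif in vifs.items():
    # Values above 5 (or 10) are commonly read as a multicollinearity warning.
    print(pair, round(vif, 2))
```

Because these are lower bounds, the full VIFs from a regression that includes all social indicators could be higher, particularly for EUI and LPD.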

 

 

 

 

The heat map of the correlation matrix for the Social (S) dimension gives an intuitive view of the relationships among the variables that define indoor comfort, air quality, and energy efficiency in buildings. Light colors dominate, indicating weak to moderate correlations among most variables; the dataset therefore covers a broad range of social sustainability factors without duplication. The mix of positive and negative correlations in the heat map supports its use in digital twin-based building management systems that optimize building performance and human well-being. The red line along the diagonal marks the perfect correlation of each variable with itself and should be read separately from the off-diagonal cells, which show the correlations between different variables. The red cells in the lower right corner mark the strong correlations among the energy-related variables EER, COP, and SEF (0.49 to 0.72). These correlations are expected, since all three measure efficiency and performance. The same applies to the red square connecting EUI and LPD (0.88), which indicates that lighting load plays an important role in energy use per capita and hence in energy efficiency. The top-left corner, which covers the occupancy and air quality indicators (OCC, HUM, PM2.5, PM10, and VOC), shows pale colors with scattered red and blue. These weak correlations indicate that building air quality and comfort do not depend strongly on occupancy, an important characteristic for datasets used to model building conditions with digital twin methods. 
It suggests that humidity and pollutant concentrations can be varied across simulated scenarios largely independently of occupancy, so that simulated building conditions are not driven by occupant-related factors alone. Overall, the heat map supports the validity of the dataset for modeling the ESG social dimension in systems that apply analytics to building optimization and human well-being.

 

 

 

 

Figure xyz. Heat Map of the Correlation Matrix for Social (S) Factors in the ESG Model. The heat map shows mostly weak to moderate correlations, indicating that the social variables—related to comfort, air quality, and energy efficiency—are distinct yet interrelated. This confirms the dataset’s internal coherence and its suitability for digital twin-based smart building simulations.

 

 

 

 

5.3 Correlation Analysis and Validation of Governance (G) Factors in the ESG Framework

 

The correlation matrix for the Governance (G) component of the ESG model provides insight into the interrelationships among indicators that represent economic and operational aspects of smart building management. These include cost-effectiveness (CES), energy return on investment (EROI), energy payback time (EPBT), capital cost factors (CPD and CCF), system performance (SPC), renewable energy utilization (REU), and energy productivity per worker hour (EPWH). The aim of analyzing these correlations is to validate the dataset used for prototyping a digital twin-based management model for smart buildings (Roda-Sanchez et al., 2023; Alibrandi, 2022), ensuring that the indicators are statistically consistent, complementary, and capable of reflecting the governance dynamics of sustainable infrastructure systems (Poels et al., 2022). The correlations between governance variables are generally weak to moderate, a desirable feature for multidimensional datasets (Li, 2025). This indicates that each variable captures a distinct dimension of governance performance without excessive redundancy. The Cost of Energy Savings (CES) shows a moderate negative correlation with the Capital Cost Factor (CCF) of -0.42, the largest in magnitude in the matrix, suggesting that higher capital costs are associated with lower cost-efficiency in achieving energy savings. This inverse relationship highlights an important governance trade-off: investments that require significant capital may not always translate into proportional financial efficiency gains (Chungath & Hacks, 2024). The negative correlations between CES and other variables such as SPC (-0.22) and REU (-0.20) reinforce this interpretation, implying that systems with higher cost-effectiveness tend to have lower levels of spending and a less direct connection with the intensity of renewable energy deployment. 
EROI, which measures the ratio between energy produced and energy invested, displays weak correlations across most variables, including a slight negative association with EPBT (-0.22), consistent with the expectation that higher energy returns correspond to shorter payback times. Its positive, though modest, correlations with CCF (0.09) and SPC (0.16) suggest that systems with better energy efficiency tend to be embedded in contexts with moderate capital intensity and performance consistency (Elnour et al., 2024). EPBT itself maintains low correlations, except for its mild negative association with SPC (-0.20), which indicates that buildings or systems with shorter payback periods tend to have more stable or efficient operational performance. The CCF variable is positively correlated with SPC (0.23) and EPWH (0.11), showing that capital costs are weakly linked to system performance and worker energy productivity. These modest correlations support the validity of the dataset by suggesting that financial parameters and productivity metrics are related but not overlapping dimensions of governance performance (Zhou et al., 2021). REU and EPWH exhibit a small positive relationship (0.19), consistent with the idea that renewable energy integration enhances the energy productivity per worker, a finding relevant for evaluating the operational efficiency of buildings managed under sustainable frameworks (Dovolil & Svítek, 2024; Kljaić et al., 2024). The overall low correlation magnitudes across variables, with few exceptions, demonstrate that the dataset is well balanced and not dominated by interdependent indicators. 
This structural integrity is fundamental for the calibration and validation of digital twin models (Chungath & Hacks, 2024; Poels et al., 2022), which require clear variable independence to accurately simulate decision-making and policy scenarios in smart buildings. The limited but coherent correlations between cost, performance, and efficiency metrics indicate that the Governance dimension of the ESG dataset is statistically reliable. It captures the complexity of managing financial and operational sustainability (Li, 2025), so that the digital twin model can use these parameters to support optimization, predictive analysis, and performance benchmarking within a robust and transparent governance structure (Kljaić et al., 2024; Elnour et al., 2024).

Table X. Correlation Matrix for Governance (G) Factors in the ESG Smart Building Model

 

| Variable | CES | EROI | EPBT | CPD | CCF | SPC | REU | EPWH |
|---|---|---|---|---|---|---|---|---|
| CES | 1.0000 | -0.0596 | 0.0320 | 0.0069 | -0.4240 | -0.2163 | -0.1981 | -0.0780 |
| EROI | -0.0596 | 1.0000 | -0.2234 | 0.0083 | 0.0851 | 0.1553 | 0.0725 | 0.0126 |
| EPBT | 0.0320 | -0.2234 | 1.0000 | -0.1697 | 0.0380 | -0.1981 | 0.1305 | 0.0050 |
| CPD | 0.0069 | 0.0083 | -0.1697 | 1.0000 | -0.0017 | -0.0894 | 0.0077 | 0.0670 |
| CCF | -0.4240 | 0.0851 | 0.0380 | -0.0017 | 1.0000 | 0.2251 | 0.0327 | 0.1058 |
| SPC | -0.2163 | 0.1553 | -0.1981 | -0.0894 | 0.2251 | 1.0000 | -0.1582 | -0.0859 |
| REU | -0.1981 | 0.0725 | 0.1305 | 0.0077 | 0.0327 | -0.1582 | 1.0000 | 0.1860 |
| EPWH | -0.0780 | 0.0126 | 0.0050 | 0.0670 | 0.1058 | -0.0859 | 0.1860 | 1.0000 |

 

Note. The table shows weak to moderate correlations among governance indicators, confirming their independence and validity. The negative link between CES and CCF highlights an inverse cost–efficiency relationship, while positive ties among CCF, SPC, and EPWH indicate consistent governance performance suitable for digital twin-based smart building management.
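With 100 observations, each coefficient in the table carries some sampling uncertainty. The sketch below computes an approximate 95% confidence interval for a correlation using the Fisher z-transformation, applied to the CES and CCF coefficient (-0.4240) and to the weak EROI and CCF coefficient (0.0851). It assumes n = 100, as in the descriptive statistics, and approximately bivariate normal data, which may not hold for strongly skewed variables such as CES; the function name `fisher_ci` is hypothetical.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation with standard error 1 / sqrt(n - 3);
    z_crit = 1.959964 corresponds to a two-sided 95% interval.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low, high = fisher_ci(-0.4240, 100)          # CES vs CCF
weak_low, weak_high = fisher_ci(0.0851, 100)  # EROI vs CCF
print(round(low, 3), round(high, 3))
print(round(weak_low, 3), round(weak_high, 3))
```

Under these assumptions the interval for the CES and CCF coefficient stays below zero, while the interval for the weak coefficient spans zero.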

 

 

The heat map for the Governance (G) component shows the structure of the correlations among the main governance factors, offering a quick view of how the financial, operational, and efficiency factors in the dataset relate to one another. The color scale runs from deep red for positive correlations to blue for negative ones, so both the sign and the intensity of each correlation can be read directly. This supports an intuitive assessment of the internal consistency of the dataset, which is important for validating the digital twin model for managing smart buildings (Chungath & Hacks, 2024; Cureton & Dunn, 2021). The first feature that stands out is the negative association between the Cost of Energy Savings (CES) and the Capital Cost Factor (CCF), shown by the deep blue square (around -0.4). It suggests that when capital costs are higher, energy savings are less beneficial and carry less weight in governance-related financial decisions for smart buildings (Pileggi et al., 2020). The heat map thus agrees with the numerical analysis, makes the correlations among cost and investment factors easier to interpret, and adds to the credibility of the dataset. A second group can be observed among the efficiency and performance factors (CCF, SPC, and REU), characterized by weak, mostly red-toned correlations. The positive associations of CCF with SPC (0.23) and, marginally, with REU (0.03) suggest that better performance and renewable energy use go together with higher capital costs, so that a higher level of investment is expected to benefit energy governance (Lv et al., 2023; Zahedi et al., 2024). 
The same reading applies to REU and EPWH: the red-toned cell indicates a positive relationship between them (0.19), consistent with the operational scenario assumed by the efficiency model for smart buildings (Roda-Sanchez et al., 2023). The light-toned palette that dominates the rest of the heat map indicates that most variable pairs are only weakly correlated, so the dataset is balanced and shows no sign of multicollinearity; both properties benefit digital twin applications, which need them for reliable cause-and-effect simulations (Yue et al., 2022; Hartmann et al., 2023). The heat map therefore supports the validity of the dataset by showing that the governance indicators are differentiated yet logically linked, making the dataset suitable for an integrated system for the governance of sustainable buildings (Dovolil & Svítek, 2024; Cranford, 2023). Governance heat maps are thus a valuable tool for analysis and, above all, for decision-making on ESG integration in digital twin technology for infrastructure governance in smart cities (Kljaić et al., 2024).

 

 

 


 

Figure X. Heat Map of the Correlation Matrix for Governance (G) Factors in the ESG Model. Note: The heat map illustrates the correlations among governance indicators such as cost-effectiveness, capital costs, and system performance. The predominance of light colors indicates weak to moderate relationships, confirming the independence of the variables and the absence of multicollinearity. This validates the dataset’s consistency and its suitability for digital twin-based simulations in smart building governance.
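
As an illustration of how the values behind such a heat map can be obtained, the following minimal sketch computes a Pearson correlation matrix and ranks the strongest off-diagonal pairs. It is not the authors' code: the data are synthetic, the helper names are hypothetical, and the labels CES, CCF, and SPC are borrowed only to make the output readable.

```python
import numpy as np

def correlation_matrix(data):
    """Pearson correlation matrix of the columns of an (n_samples, n_vars) array."""
    return np.corrcoef(np.asarray(data, dtype=float), rowvar=False)

def strongest_pairs(corr, names, top=3):
    """Return the `top` off-diagonal pairs ranked by absolute correlation."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            pairs.append((names[i], names[j], float(corr[i, j])))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)[:top]

# Synthetic stand-in data (assumption): NOT the dataset used in the article.
rng = np.random.default_rng(0)
names = ["CES", "CCF", "SPC"]
ccf = rng.normal(size=100)
ces = -0.4 * ccf + rng.normal(size=100)   # built to be negatively related to CCF
spc = rng.normal(size=100)

corr = correlation_matrix(np.column_stack([ces, ccf, spc]))
pairs = strongest_pairs(corr, names)
for a, b, r in pairs:
    print(f"{a}-{b}: r = {r:+.2f}")
```

A heat map is then just a color rendering of `corr`; cells near zero appear light, which is what the caption describes as a predominance of light colors.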

 

 

 

  1. Regression-Based Validation of the ESG Dataset for Digital Twin Smart Building Modeling

 

To assess the applicability of the ESG model, regression equations provide a starting point for evaluating the statistical validity and reliability of the dataset. They examine the cohesion, interdependence, and relevance of the environmental, social, and governance dimensions within the broader context of sustainable building resource management. The aim is to show that the ESG dataset can support an integrated building resource management model that uses digital twin technology and the metaverse to model, monitor, and regulate the performance of intelligent buildings in real time (Zhang et al., 2023). The equations are estimated by Ordinary Least Squares (OLS), with AREA, which indicates the scale, size, and functionality of buildings, as the dependent variable and the Key Performance Indicators (KPIs) of each ESG dimension as independent variables. This makes it possible to assess the reliability of the dataset and to examine how far the fundamental ESG dimensions relate to building scale, functionality, and efficiency (Dou & Yin, 2024). One equation is specified for each of the three ESG dimensions, giving a common methodology for assessing environmental, social, and governance factors (Wang et al., 2024). 
The Environmental equation relates energy consumption, intensity, and efficiency factors to building scale; the Social equation covers comfort, air quality, and user well-being; and the Governance equation relates financial intensity and efficiency factors to building scale, functionality, and performance (Wang et al., 2024). Together, the equations provide the basis for validating and calibrating the dataset, so that its integration into digital twin and metaverse technology rests on reliable and validated data (Liu et al., 2025).

 

Table X. Regression Equations for ESG Dimensions in the Smart Building Model

ESG

Equations

E-Environment

 

S-Social

 

G-Governance

 

Note: The table presents the Ordinary Least Squares (OLS) regression equations used to validate the Environmental (E), Social (S), and Governance (G) dimensions of the ESG model. Each equation relates specific Key Performance Indicators (KPIs) to the dependent variable AREA, representing building scale and functionality. These equations provide the analytical foundation for integrating the ESG dataset into digital twin-based smart building management and simulation frameworks.
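
The equation cells of the table above are empty in this record. As a hedged sketch of their general form only, assuming each dimension is regressed separately on AREA with an intercept and an additive error term, and taking the number of predictors (22, 8, and 6) from the KPI lists in the regression summary table, the three OLS specifications can be written as follows. The symbols are illustrative and are not the authors' notation; no estimated coefficients are implied.

```latex
\begin{aligned}
\text{E:}\quad \mathrm{AREA}_i &= \beta^{E}_{0} + \sum_{j=1}^{22} \beta^{E}_{j}\, x^{E}_{ij} + \varepsilon^{E}_{i}
  &&\text{($x^{E}$: ENCO, CFPT, \dots, REU, EPWH)}\\
\text{S:}\quad \mathrm{AREA}_i &= \beta^{S}_{0} + \sum_{j=1}^{8} \beta^{S}_{j}\, x^{S}_{ij} + \varepsilon^{S}_{i}
  &&\text{($x^{S}$: OCC, HUM, \dots, THR, SND)}\\
\text{G:}\quad \mathrm{AREA}_i &= \beta^{G}_{0} + \sum_{j=1}^{6} \beta^{G}_{j}\, x^{G}_{ij} + \varepsilon^{G}_{i}
  &&\text{($x^{G}$: CES, EROI, EPBT, CPD, CCF, SPC)}
\end{aligned}
```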

 

The results for the Environmental model (E) show an R² of 0.226 and an adjusted R² of 0.005: the included variables explain approximately 22.6% of the in-sample variance in AREA, but almost none of this explanatory power remains once the number of predictors is taken into account. The F-statistic (1.02) and its probability value (0.451) indicate that the model as a whole is not statistically significant, so the coefficients should be read as indications of sign and direction rather than as evidence of predictive strength. The significant variables, namely the Capacity Factor (CAF, p = 0.006) with a negative sign and Renewable Energy Utilization (REU, p = 0.065, significant at the 10% level) with a positive sign, indicate logical relationships. Larger building areas tend to be associated with lower utilization efficiency (CAF) but higher renewable energy use (REU), a pattern coherent with real-world behavior in large smart infrastructures (Guo et al., 2025). The low mean VIF (1.93) indicates that multicollinearity is not a concern, which supports the reliability of the dataset for modeling energy-environmental dynamics. 
The Social (S) regression has an R² of 0.085 and an adjusted R² of 0.004, so social and comfort-related KPIs explain only a small fraction of the variation in building area. This result aligns with expectations, as social variables such as air quality (PM2.5, PM10), humidity, and acoustic comfort tend to capture internal environmental quality rather than scale-dependent properties. PM2.5 is significant only at the 10% level (p = 0.084), which suggests that particulate concentration may have a weak relationship with building size, potentially due to differences in ventilation and occupancy density (Chungath & Hacks, 2024). The low mean VIF (1.08) again indicates that these indicators are statistically independent of one another, so the Social dataset is structurally well defined even if its predictive strength remains marginal. 
The Governance (G) regression yields the most consistent results, with an R² of 0.124 and a higher adjusted R² of 0.067. The F-statistic of 2.19 and its p-value of 0.051 fall just short of significance at the 5% level, implying that the governance and economic indicators together provide a weak but coherent explanation of AREA variability. The negative signs of the significant variables, Capital Development Cost (CPD, p = 0.027) and Capital Cost Factor (CCF, p = 0.054), indicate that larger building areas are associated with lower values of these cost indicators, which is consistent with efficiency gains and lower costs per unit area. This outcome is particularly relevant for validating the economic component of the digital twin, as it suggests that financial optimization and governance transparency correlate with spatial and operational efficiency (Dovolil & Svítek, 2024; Cranford, 2023). The low mean VIF (1.15) indicates the absence of collinearity distortions. 
Overall, the three regressions support the ESG dataset by indicating that each component captures a distinct dimension of building performance. None of the models has high explanatory power individually, but their combined interpretation shows structural coherence and logical sign directions. The Environmental model highlights operational and renewable energy dynamics, the Social model reflects the independence of comfort and health indicators, and the Governance model reveals economic efficiency trends. Together, they provide a multidimensional foundation for implementing a digital twin system capable of assessing, simulating, and optimizing smart building governance and performance in line with ESG principles (Zhang et al., 2023).

Table X. Summary of Regression Results for ESG Dimensions in the Smart Building Model

| ESG Dimension | E (Environment) | S (Social) | G (Governance) |
|---|---|---|---|
| Included KPIs (X) | ENCO, CFPT, EMIN, LCF, SCF, LMI, OER, GII, NGI, CAF, OPP, DRS, FLF, FLI, FEE, EER, COP, SEF, EUI, LPD, REU, EPWH | OCC, HUM, PM25, PM10, VOC, ACH, THR, SND | CES, EROI, EPBT, CPD, CCF, SPC |
| Vars | 22 | 8 | 6 |
| R² | 0.226 | 0.085 | 0.124 |
| Adj. R² | 0.005 | 0.004 | 0.067 |
| F (df1, df2) | 1.02 (22, 77) | 1.05 (8, 91) | 2.19 (6, 93) |
| Prob > F | 0.451 | 0.403 | 0.051 |
| Root MSE | 5237 | 5238 | 5069 |
| Mean VIF | 1.93 | 1.08 | 1.15 |
| Significant Variables (p < 0.10) | CAF (p = 0.006), REU (0.065), EPWH (0.108) | PM25 (p = 0.084) | CPD (p = 0.027), CCF (0.054) |
| Sign | CAF (−), REU (+) | + | both (−) |
| Interpretation | Environmental KPIs are consistent but weakly predictive of AREA; no multicollinearity; logical directional signs. | Social KPIs are independent and orthogonal; air quality and comfort show limited relation with building scale. | Governance and economic KPIs show structural consistency and marginal significance; negative coefficients suggest efficiency gains with lower costs per area. |

Note: This table summarizes the regression outcomes for the Environmental (E), Social (S), and Governance (G) dimensions of the ESG model. The results show that all models are statistically coherent and free from multicollinearity (Mean VIF < 2). While the Environmental and Social models exhibit low explanatory power, the Governance model shows marginal significance (Prob > F ≈ 0.05), indicating that financial and efficiency indicators play a stronger role in explaining building scale and performance within digital twin-based smart building systems.
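
The fit statistics in the table can be cross-checked from the reported R² values alone. The sketch below assumes a sample size of n = 100, which is implied by the reported degrees of freedom (for example 22 + 77 + 1), and uses the standard formulas for adjusted R² and the overall F-statistic of an OLS model with an intercept. The function names are illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for an OLS model with k predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F-statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

n = 100  # assumption: inferred from df2 = n - k - 1 in the table
models = {"E": (0.226, 22), "S": (0.085, 8), "G": (0.124, 6)}  # (R², predictors)

for name, (r2, k) in models.items():
    print(name,
          "adj. R2 =", round(adjusted_r2(r2, n, k), 3),
          "F =", round(f_statistic(r2, n, k), 2),
          "df =", (k, n - k - 1))
```

Because the inputs are R² values already rounded to three decimals, the recomputed figures agree with the table only up to rounding in the last digit. The check also makes the penalty for model size visible: with 22 predictors and 100 observations, an R² of 0.226 shrinks to an adjusted R² close to zero.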

 

The validation of the ESG dataset through the three regression models, one each for the Environmental, Social, and Governance dimensions, indicates the internal coherence and distinct contribution of each component to the explanation of building scale and performance, represented by the variable AREA. The results reveal a layered structure of relationships within the dataset that supports its analytical validity for modeling within a digital twin framework (Hien & Hanh, 2024). From a global perspective, the Governance (G) regression is the most statistically relevant, with a Prob > F value around 0.05, suggesting marginal significance and indicating that this component captures some consistent structural patterns between governance and economic indicators and the dependent variable AREA. This finding implies that financial and efficiency-related parameters, such as cost per energy unit, return on investment, and payback time, are weakly predictive of the built area (adjusted R² ≈ 0.07), reflecting how governance decisions and economic structures may scale with building size (Dou & Yin, 2024). In contrast, the Environmental (E) and Social (S) models exhibit low explanatory power, with adjusted R² values close to zero. This outcome is not unexpected, as environmental and social metrics often capture operational performance and contextual conditions rather than structural attributes like area. The Environmental model, although statistically weaker, presents logical directional relations, such as negative associations with carbon footprint (CFPT) and positive associations with renewable energy use (REU), which are consistent with theoretical expectations of sustainable design (Su & Sun, 2023). Similarly, the Social model shows little linear dependence among its variables: indoor air quality, thermal and acoustic comfort, and occupancy metrics vary largely independently of one another, as indicated by low mean VIF values (below 2). 
The analysis of multicollinearity further supports the validity of the dataset. All three models have mean variance inflation factors below 2, well under the commonly used threshold of 5, indicating that no block of variables presents marked internal redundancy. The dataset is therefore structurally well defined, and each KPI contributes distinct information within its respective ESG dimension (Chen & Lin, 2023). From a methodological standpoint, this supports the use of the dataset for higher-level analytical modeling, including multivariate or machine learning regressions, since low collinearity among predictors makes feature effects easier to interpret. The Governance block stands out for its structural coherence. Within it, CPD and the capital cost factor (CCF) show statistically relevant negative coefficients (p = 0.027 and p = 0.054), while SPC does not reach the 10% threshold. This pattern indicates that higher building efficiency or optimized financial planning is associated with lower costs per unit area, an interpretation aligned with principles of sustainable financial governance and stakeholder-oriented management (Berman et al., 1999). The negative coefficients are consistent with efficiency-driven management models, where resource optimization translates into economic and environmental benefits. Overall, the regression-based validation indicates that the ESG dataset is internally consistent and conceptually coherent, although its explanatory power for AREA is limited. Each dimension provides non-redundant information, supporting the multidimensional structure of ESG analysis. While the E and S models describe operational and contextual variability, the G model carries most of the structural signal in the dataset, establishing a measurable, if weak, link between governance efficiency and building scale. 
The low multicollinearity, consistent variable behavior, and marginal significance of the Governance model collectively support the use of the dataset in a digital twin context, where real-time data integration and predictive modeling depend on stable and interpretable variable relationships. This foundational validation suggests that the dataset can be used for developing intelligent management systems capable of assessing performance and sustainability through interconnected ESG indicators (Hien & Hanh, 2024; Dou & Yin, 2024).

Table X. Summary of Key Analytical Insights from ESG Regression Models

| Aspect | Observation |
|---|---|
| Global significance | Only the G model is marginally significant (Prob > F ≈ 0.05). |
| Internal coherence (VIF) | All Mean VIF < 5 → no multicollinearity in any ESG block. |
| Predictive power vs. AREA | E and S blocks have low explanatory power; G block moderate (Adj R² ≈ 0.07). |
| General interpretation | The three ESG dimensions are statistically distinct and non-redundant. The Governance/Economic dimension shows the strongest structural consistency. |

Note: The table presents the main analytical observations derived from the ESG regression analysis. It highlights that the Governance (G) model demonstrates marginal statistical significance and the strongest internal consistency, while the Environmental (E) and Social (S) models show lower explanatory power but maintain structural independence. The low VIF values confirm the absence of multicollinearity, validating the dataset’s robustness for digital twin–based smart building modeling.
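
For readers unfamiliar with the diagnostic, the variance inflation factor of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the others. The following is a minimal sketch with synthetic data, not the authors' procedure; the function name and the example columns are assumptions made for illustration.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])      # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # auxiliary OLS regression
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()               # R² of the auxiliary regression
        out[j] = 1.0 / (1.0 - r2)
    return out

# Synthetic example (assumption): x3 is almost a copy of x1, x2 is unrelated.
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + 0.1 * rng.normal(size=100)

vifs = vif(np.column_stack([x1, x2, x3]))
print("VIF per column:", np.round(vifs, 2))
print("mean VIF:", round(float(vifs.mean()), 2))
```

In this toy case the near-duplicate columns receive large VIFs while the unrelated one stays close to 1. A mean VIF can mask a single inflated predictor, so inspecting the per-variable values alongside the mean is a reasonable precaution.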

 

  7. Principal Component Analysis (PCA) for Technical Validation of the ESG Dataset in Smart Building Governance

 

The analysis presented in this section aims to apply the Principal Component Analysis (PCA) technique to provide a technical and scientific validation of the ESG (Environmental, Social, and Governance) dataset developed for testing the smart building governance prototype. PCA is a widely recognized multivariate statistical method used to reduce the dimensionality of complex datasets while preserving their essential information structure. Its application in this context serves a dual purpose: to verify the internal coherence and multidimensionality of the ESG dataset and to ensure that the selected indicators accurately represent the underlying sustainability dimensions without redundancy. The purpose of this analysis is to confirm that the dataset is robust, logically structured, and suitable for integration into advanced digital environments such as digital twin and metaverse platforms. By identifying the principal components that explain the highest variance among the ESG indicators, PCA enables the researcher to isolate the most influential factors affecting smart building performance and governance. This validation step is essential to ensure that the prototype operates on a reliable and scientifically grounded dataset, capable of supporting dynamic simulations, predictive modeling, and real-time decision-making. Through PCA, the dataset’s structural soundness is assessed, verifying the independence and complementarity of the variables associated with environmental efficiency, social comfort, and governance effectiveness. The resulting components will form the analytical backbone for building an integrated system that governs and optimizes smart buildings in immersive digital environments. 
Ultimately, this approach ensures that the proposed governance prototype—based on digital twin and metaverse technologies—is supported by a technically validated and scientifically reliable data framework, reinforcing its potential for sustainable, data-driven management of intelligent infrastructures.
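
The mechanics of the PCA step can be sketched as follows. The text does not state whether the analysis was run on the correlation or the covariance matrix, so this example assumes the correlation matrix (that is, standardized variables), uses synthetic data, and introduces hypothetical names throughout.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA on standardized variables via the correlation matrix.

    Returns eigenvalues (descending), eigenvectors (one column per component),
    and the share of total variance explained by each component.
    """
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)     # eigh: R is symmetric
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, explained

# Synthetic stand-in (assumption): 100 observations of 6 hypothetical indicators.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 6))
X[:, 1] += 0.8 * X[:, 0]                      # induce one correlated pair

eigvals, components, explained = pca_from_correlation(X)
cumulative = np.cumsum(explained)
print("share of variance per component:", np.round(explained, 3))
print("cumulative share, first 4 components:", round(float(cumulative[3]), 3))
```

Two properties are worth keeping in mind when reading component tables: the number of components equals the number of input variables, and the sign of each eigenvector is arbitrary, so a column of loadings can be multiplied by −1 without changing the analysis.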

 

 

7.1 Principal Component Analysis (PCA) Results for the Environmental (E) Dimension of the ESG Model

The results of the Principal Component Analysis (PCA) applied to the environmental component of the ESG model provide a significant validation of the underlying dataset, confirming both its internal coherence and its multidimensional structure (Kwon, Kim, & Choi, 2024). The PCA technique, which decomposes the dataset into orthogonal principal components, is particularly effective for evaluating the relationships among environmental indicators and identifying latent structures that capture the underlying variance of the data (Ascione et al., 2022). In this case, the eigenvalues associated with the first few components demonstrate that approximately 40% of the total variance is explained by the first four principal components, indicating that the environmental indicators share meaningful correlations without redundancy. This supports the use of PCA as a robust approach to assess data consistency and dimensionality reduction within the ESG framework. From the component loadings, the first principal component (PC1) captures the largest share of variance and is primarily driven by positive contributions from EMIN (Emission Intensity), LCF (Load Cover Factor), and ENCO (Energy Consumption), while variables such as CFPT (Carbon Footprint) and SCF (Supply Cover Factor) contribute negatively. This component appears to represent a balance between efficiency and energy consumption, reflecting how emissions and energy coverage jointly influence environmental performance. The second component (PC2) has strong positive loadings for OER (On-site Energy Ratio) and GII (Grid Interaction Index), while FLF (Flexibility Factor) shows a strong negative contribution. This suggests that PC2 differentiates systems with higher local energy autonomy from those that rely more on flexibility and grid interaction, aligning with the notion of distributed energy management (Zhang et al., 2023). 
The third and fourth components (PC3 and PC4) further refine the structure of the data, capturing subtler aspects of energy-environmental interactions. For instance, PC3 shows high positive loadings for ENCO and OPP (One Percent Peak Power), while LCF and LMI (Load Matching Index) load negatively, suggesting a contrast between energy demand peaks and load coverage capacity. PC4, on the other hand, captures variability associated with EMIN and CAF (Capacity Factor), pointing toward the efficiency of energy conversion processes within the system (Kwon et al., 2024). A noteworthy observation is that none of the variables display extreme loadings across multiple components, which indicates that the dataset lacks strong multicollinearity and maintains a balanced contribution of each indicator to the overall structure. This aligns with the earlier regression analyses that confirmed low mean variance inflation factors (VIF), thereby reinforcing the dataset’s internal consistency (Islam, Guerrieri, Gravina, & Fortino, 2024). The presence of moderate but distributed loadings also implies that each variable contributes uniquely to the multidimensional understanding of environmental performance, making the dataset appropriate for subsequent modeling steps. The negative correlations observed in some components, such as between CFPT and EMIN, or SCF and CAF, emphasize the complexity of the environmental dimension. These negative signs do not indicate inconsistencies but rather complementary dynamics: higher carbon footprints tend to associate with lower emission intensity efficiency, while energy coverage and capacity factors reveal trade-offs between resource use and operational performance. This reinforces the interpretative depth of PCA as a diagnostic validation tool rather than a purely descriptive method (Zhou et al., 2023). Overall, the PCA results validate the environmental dataset as a coherent and structurally reliable foundation for the ESG model. 
The distribution of eigenvalues and loadings supports the presence of independent, interpretable dimensions within the environmental domain. This validation step is crucial, especially considering the dataset’s intended application in the development of a management prototype integrating Digital Twin and Metaverse technologies (Zhang et al., 2023). In this context, PCA ensures that the environmental indicators capture distinct yet complementary aspects of building energy efficiency, emission control, and operational sustainability. Consequently, the PCA model not only confirms the statistical robustness of the environmental data but also establishes a reliable basis for embedding it within a digital simulation environment for smart building management.

 

 

 

 

 

Figure X. Principal Component Loadings for Environmental (E) Factors in the ESG Model. Note: The figure illustrates the loading values of each environmental indicator (ENCO, CFPT, EMIN, LCF, SCF, LMI, OER, GII, NGI, CAF, OPP, DRS, and FLF) across the principal components (PC1–PC15). The distributed and moderate loading patterns confirm that no single factor dominates the variance, indicating balanced variable contributions and low multicollinearity. This supports the dataset’s structural integrity and validates its suitability for digital twin–based smart building governance modeling within the ESG framework.

 

 

 

7.2 Principal Component Analysis (PCA) Results for the Social (S) Dimension of the ESG Model

 

The principal component analysis of the Social (S) dimension in the ESG framework shows how human-related and comfort variables interact within smart building environments. The analysis covers Occupants (OCC), Relative Humidity (HUM), Particulate Matter (PM2.5 and PM10), Volatile Organic Compounds (VOC), Air Changes per Hour (ACH), Thermal Insulation (THR), Sound Insulation (SND), Energy Efficiency Ratio (EER), Coefficient of Performance (COP), System Efficiency (SEF), Energy Use Intensity (EUI), and Lighting Power Density (LPD). The PCA demonstrates the multidimensional structure of the social component, highlighting interdependencies between human comfort, air quality, and building performance metrics (Bonab, Bellini, & Rudko, 2023). The first principal component (PC1) shows strong negative loadings for EER, COP, and SEF, indicating that this dimension captures the efficiency and operational quality aspects of social comfort. These parameters represent the building’s ability to maintain indoor well-being through technological optimization. Negative values suggest an inverse relationship between system efficiency and variability in occupant conditions, implying that as systems become more efficient, fluctuations in perceived comfort decrease. This aligns with the principles of smart building management, where automation and digital control stabilize the indoor environment (Elnour, Meskin, Khan, & Jain, 2021). PC2 exhibits positive contributions from EUI and LPD, suggesting that energy use intensity and lighting power density act as indicators of human activity levels within buildings. This axis can be interpreted as a behavioral-energy dimension, linking occupant presence and usage patterns to energy demand. It supports the concept that social variables are not isolated but reflect dynamic interactions between people and infrastructure (Ma et al., 2023). 
The third component (PC3) emphasizes indoor air quality factors, with high negative loadings for PM2.5, PM10, and THR. This reveals an important trade-off between particulate pollution and thermal comfort. In smart building contexts, this component provides insight into how environmental control systems influence both health-related and comfort-related metrics. Lower particulate concentrations may require higher ventilation rates (ACH), which in turn affect energy consumption and humidity balance (Hu & Lu, 2024). PC4 is primarily characterized by strong positive loadings for Occupants (OCC) and Humidity (HUM), alongside moderate contributions from ACH and PM10. This suggests that the fourth component captures spatial and microclimatic comfort interactions, where occupant density and air renewal are central to maintaining an acceptable indoor environment. In digital twin applications, such relationships are essential for predicting comfort variations based on occupancy data and HVAC system behavior. Higher components, such as PC5 through PC7, refine specific comfort and acoustic dimensions. Negative loadings of SND and THR indicate the balance between thermal insulation, noise control, and user satisfaction. These components are crucial for understanding the subtle effects of building envelope performance on perceived comfort, an area that is increasingly relevant for ESG-oriented smart building metrics (Hu & Lu, 2024). Finally, components like PC8 to PC13 capture residual variance associated with specific operational parameters, confirming that while social indicators are diverse, they remain statistically coherent and non-redundant. The consistent spread of variance across components underscores the structural validity of the dataset, confirming that each variable contributes uniquely to the representation of social sustainability within buildings. 
Overall, the PCA confirms that the social dataset is robust and internally coherent, providing strong empirical support for its use in validating the proposed metric model. The clear clustering of efficiency-related, environmental, and comfort indicators reflects a realistic representation of how occupants experience smart buildings. When integrated into a digital twin and metaverse-based management system, these results ensure that the model can simulate user-environment interactions, predict comfort dynamics, and optimize building operations in line with ESG principles (Bonab et al., 2023; Ma et al., 2023).

 

 

 

 

 

 

 

Figure X. Principal Component Loadings for Social (S) Factors in the ESG Model. Note: The figure presents the loading values of social indicators (OCC, HUM, PM2.5, PM10, VOC, ACH, THR, SND, EER, COP, SEF, EUI, and LPD) across the principal components (PC1–PC13). The distribution of moderate and distinct loadings confirms that each factor contributes uniquely to the social dimension. Efficiency variables (EER, COP, SEF) and comfort-related indicators (PM2.5, PM10, HUM) form separate but complementary clusters, validating the dataset’s internal consistency and its suitability for digital twin and metaverse-based smart building governance applications.

 

7.3 Principal Component Analysis (PCA) Results for the Governance (G) Dimension of the ESG Model

 

 

 

 

 

 

 

 

The Principal Component Analysis (PCA) of the Governance (G) component in the ESG model provides evidence for the statistical validity and structural coherence of the dataset intended for digital twin and metaverse-based smart building management. This component includes variables related to economic efficiency and governance performance: cost efficiency (CES), energy return on investment (EROI), energy payback time (EPBT), construction and operational costs (CPD, CCF), system performance and control (SPC), renewable energy utilization (REU), and energy productivity per worker (EPWH). Together, these indicators describe the economic and managerial dimension of sustainable smart buildings, where financial optimization, performance monitoring, and long-term resource efficiency are intertwined (Dovolil & Svítek, 2024). The first principal component (PC1) explains a substantial portion of the variance, with a strong negative loading for CES (-0.537) and positive loadings for CCF (0.531) and SPC (0.506). This pattern highlights a fundamental trade-off between cost reduction per unit of energy saved and capital or operational investment, which is typical in building governance models (Bezrukov, Sadovnikova, & Lebedinskaya, 2022). In a digital twin context, this suggests that reducing the marginal cost of energy efficiency (CES) is associated with higher upfront or management costs (CCF, SPC), reflecting realistic investment-efficiency dynamics. PC1 can therefore be interpreted as a “financial governance axis,” emphasizing the relationship between cost control, structural investment, and system efficiency. The second component (PC2) shows sizable negative loadings for EPBT (-0.472), REU (-0.516), and EPWH (-0.467), suggesting that this component represents the temporal and productivity-related aspect of governance. 
Shorter energy payback times and greater renewable energy utilization contribute to higher system efficiency but require optimization of workforce productivity and process management (Pandhare et al., 2024). This factor can be understood as an “operational sustainability axis,” demonstrating the capacity of governance metrics to reflect the long-term return of energy and human capital investments. PC3 and PC4 reveal more specific structural relations within the dataset. The strong positive loading of EPBT (0.458) and CPD (0.619) in these components indicates that buildings with longer payback periods also tend to have higher cost structures. This pattern validates the consistency of the dataset, showing that financial and temporal metrics are not independent but logically correlated. In a digital twin simulation, these relationships can be used to model the trade-offs between project duration, capital investment, and lifecycle sustainability (Zhang, Yu, & Tian, 2024). The significant contribution of EROI (-0.601 in PC4) further connects governance efficiency to the building’s ability to generate positive energy returns, highlighting the strategic value of integrating real-time energy flow analytics in metaverse-based management systems. The fifth and sixth components (PC5 and PC6) capture more subtle variations related to operational resilience and system integration. REU (0.325 in PC5, -0.584 in PC6) and EPWH (-0.751 in PC5) suggest that renewable energy performance and energy use efficiency per worker vary inversely, reflecting the complexity of aligning workforce productivity with renewable infrastructure adoption. This finding is particularly relevant for smart building governance because it illustrates how data-driven management—enabled by digital twins—can balance human and technological performance indicators (Pandhare et al., 2024). Finally, PC7 and PC8 consolidate the multidimensionality of the governance structure. 
The strong positive loading of CES (0.519 and 0.536) indicates that cost efficiency remains a dominant variable across higher components, confirming that economic optimization is consistently embedded in the model. The coherence of loadings across multiple components demonstrates that each indicator contributes uniquely to the overall governance structure, with no redundancy or distortion. In summary, the PCA results confirm that the governance dataset is statistically robust and conceptually coherent. The clear differentiation of principal components reflects the internal logic of ESG-based governance, where financial, operational, and energy metrics interact systematically. This validates the model’s suitability for integration into a digital twin and metaverse framework, enabling predictive management, optimization of energy investment, and real-time governance of smart building performance. The structure uncovered by the PCA not only supports the empirical reliability of the data but also provides a scientific foundation for developing intelligent, data-driven systems aligned with sustainable management objectives.

Figure X. Principal Component Loadings for Governance (G) Factors in the ESG Model. Note. The figure displays the loadings of governance-related indicators (CES, EROI, EPBT, CPD, CCF, SPC, REU, and EPWH) across the principal components (PC1–PC8). The results highlight clear structural differentiation among financial, operational, and productivity dimensions. PC1 and PC2 capture cost–efficiency and sustainability trade-offs, while higher components (PC4–PC6) reflect investment and performance dynamics. The balanced distribution of loadings confirms the statistical coherence and multidimensional integrity of the governance dataset, validating its use for digital twin and metaverse-based smart building governance models.
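As an illustration of how loadings of this kind can be obtained, the following minimal sketch computes the first principal component of a standardized two-variable dataset by power iteration on the correlation matrix. The toy values for `ces` and `ccf` are invented for illustration and are not taken from the study dataset; the assumption that the reported loadings are unit-length eigenvector entries is ours.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(corr, iters=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    p = len(corr)
    v = [1.0 / (j + 1) for j in range(p)]  # asymmetric start vector
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v

# Hypothetical toy data: two negatively correlated indicators.
ces = [1.0, 2.0, 3.0, 4.0, 5.0]
ccf = [10.0, 8.0, 7.0, 4.0, 1.0]
rows = [list(pair) for pair in zip(ces, ccf)]

eigenvalue, loadings = first_component(correlation_matrix(standardize(rows)))
explained_share = eigenvalue / len(loadings)  # share of total variance on PC1
```

With two negatively correlated variables, the two loadings on PC1 have equal magnitude and opposite sign, which is the same kind of pattern described above for CES against CCF and SPC.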

 

 

 

 

 

8. Machine Learning Regression for ESG Dataset Validation in Digital Twin and Metaverse-Based Smart Building Governance

The machine learning regression analysis presented in this section was developed as a key step in the technical and scientific validation of a dataset designed for the testing and calibration of a prototype aimed at the management of smart buildings through Digital Twin and Metaverse technologies. The purpose of this process is to ensure that the dataset, structured according to the Environmental, Social, and Governance (ESG) framework, demonstrates high levels of internal consistency, predictive reliability, and interpretability—three essential conditions for its integration into intelligent, data-driven decision systems. By applying advanced machine learning algorithms such as Random Forest and Support Vector Machine (SVM), the study evaluates how effectively the dataset captures the complex, nonlinear relationships that characterize smart building governance. Each ESG dimension—environmental, social, and governance—is analyzed to identify the most suitable model capable of minimizing prediction errors (MSE, RMSE, MAE, MAPE) while maximizing explanatory performance (R²). The Random Forest model proves particularly effective for validating the Environmental and Social components, owing to its ensemble-based structure that captures multidimensional dependencies, avoids overfitting, and enhances interpretability through variable importance measures. The SVM algorithm, conversely, demonstrates superior performance in modeling the Governance dimension, where financial and operational variables interact through complex, non-linear patterns. The outcome of this machine learning validation process confirms that the ESG dataset provides a statistically robust foundation for developing an intelligent management prototype. 
Within a Digital Twin and Metaverse framework, this validated dataset will enable real-time simulation, optimization, and governance of building performance, energy efficiency, and sustainability—transforming smart buildings into adaptive, self-learning systems that support informed, data-driven decision-making.

 

 

 

8.1 Random Forest Regression for Environmental Dataset Validation within the ESG Framework

The selection of Random Forest as the best-performing model for validating the environmental dataset rests on a comprehensive evaluation of several performance metrics, some of which are to be minimized and others maximized. Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) must be minimized because they quantify the deviation between predicted and actual values; lower values correspond to better accuracy and less dispersion of residuals. The coefficient of determination (R²), on the other hand, must be maximized, as it measures the proportion of the variance in the dependent variable explained by the model. Random Forest attains the lowest normalized MSE, RMSE, and MAE among all tested algorithms (0.00 in each case), indicating superior predictive precision and stability (Inala et al., 2024); its normalized MAPE (0.46) is intermediate, with SVM lowest on that metric. Its normalized R² is the lowest in the comparison, whereas Linear Regression has the highest; this is weighed against the fact that Random Forest captures complex, nonlinear relationships that linear models fail to represent adequately (Ji & Niu, 2024). The ensemble structure of Random Forest, based on the aggregation of multiple decision trees, allows it to reduce variance and limit overfitting, which enhances its robustness when validating large and heterogeneous datasets such as those associated with ESG indicators (Wang et al., 2024). Furthermore, the model’s ability to estimate variable importance makes it particularly suitable for applications in digital twin and metaverse environments.
By quantifying the contribution of each variable, Random Forest supports both predictive accuracy and strategic understanding of how environmental, social, and governance components influence building performance. Its capacity to minimize prediction errors while maintaining stable explanatory reliability validates its use as a scientifically sound model for data validation in smart building management and ESG-driven digital infrastructures (Ukwuoma et al., 2024; Vasilica et al., 2025).

Table xyz. Normalized Performance Metrics of Machine Learning Models for ESG Dataset Validation — Environmental Component

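| Metric | Boosting | Decision Tree | KNN | Linear Regression | Random Forest | Regularized Linear | SVM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MSE | 0.33 | 0.30 | 0.65 | 1.00 | 0.00 | 0.36 | 0.38 |
| RMSE | 0.35 | 0.33 | 0.73 | 1.00 | 0.00 | 0.42 | 0.44 |
| MAE | 0.44 | 0.53 | 0.78 | 1.00 | 0.00 | 0.27 | 0.52 |
| MAPE | 0.66 | 0.50 | 1.00 | 0.67 | 0.46 | 0.47 | 0.00 |
| R² | 0.01 | 0.07 | 0.38 | 1.00 | 0.00 | 0.44 | 0.00 |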

 

Note: All metrics (MSE, RMSE, MAE, MAPE, and R²) are normalized to enable direct comparison among algorithms. The Random Forest model shows the lowest normalized MSE, RMSE, and MAE, supporting its accuracy and robustness. These results support its suitability for the technical and scientific verification of the environmental dataset used in developing a digital twin and metaverse-based smart building management prototype.
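For readers who wish to reproduce a comparison of this kind, the sketch below computes the five metrics for several models and rescales one of them across models. It assumes min-max scaling, which the 0.00 to 1.00 range of each row suggests but which is not stated explicitly; the model names and prediction values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, MAPE and R² for one model (targets must be non-zero)."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return {
        "MSE": sse / n,
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "R2": 1.0 - sse / ss_tot,
    }

def min_max_normalize(values_by_model):
    """Rescale one metric across models so the smallest is 0 and the largest 1."""
    lo, hi = min(values_by_model.values()), max(values_by_model.values())
    span = hi - lo
    return {m: (v - lo) / span if span else 0.0
            for m, v in values_by_model.items()}

# Hypothetical targets and predictions from three placeholder models.
y_true = [2.0, 4.0, 6.0, 8.0]
predictions = {
    "model_a": [2.5, 3.5, 6.5, 7.5],
    "model_b": [3.0, 5.0, 7.0, 9.0],
    "model_c": [2.0, 4.0, 6.0, 8.0],
}
raw = {name: regression_metrics(y_true, pred)
       for name, pred in predictions.items()}
normalized_mse = min_max_normalize({name: m["MSE"] for name, m in raw.items()})
```

Under this scaling, a normalized error of 0.00 marks the best model on that metric, whereas a normalized R² of 0.00 marks the model with the lowest explained variance in the comparison, not an R² of zero.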

 

The analysis of the Random Forest model applied to the ESG dataset reveals a coherent and scientifically valid validation process for its future use in the development of a smart building management prototype that integrates digital twin and metaverse technologies. The feature importance metrics, expressed through the mean dropout loss, serve as a robust indicator of how each environmental variable contributes to the predictive power of the model. This metric, based on fifty permutations of the dataset, measures the increase in error (in terms of RMSE) when a variable is randomly excluded from the model. Lower dropout loss values correspond to less influential variables, whereas higher values indicate features whose removal leads to a significant deterioration in predictive accuracy. In this case, CAF (Capacity Factor), SCF (Supply Cover Factor), and OER (On-site Energy Ratio) exhibit the highest dropout loss, showing that they are essential in explaining variations in the dependent variable AREA. These results are consistent with the physical logic of smart building systems, where energy capacity balance, supply coverage, and on-site generation efficiency play fundamental roles in determining operational performance and sustainability outcomes (Du, 2024; Kinshakov et al., 2021). This importance ranking demonstrates that the Random Forest algorithm not only captures statistical correlations but also reflects the actual structural dynamics governing energy and environmental processes within buildings (Miao & Xu, 2024). In particular, the model’s ability to represent nonlinear interactions enhances the interpretability of variable contributions—especially when dropout loss metrics are combined with ensemble-based prediction strategies (Orlenko & Moore, 2020). 
Moreover, recent studies demonstrate that such methods significantly improve the readability and explainability of complex systems, supporting their use in high-dimensional, heterogeneous ESG datasets (Yu et al., 2021). Consequently, the application of Random Forest in this context is not only statistically justified but also conceptually aligned with the operational goals of smart building governance.

Figure xyz. Feature Importance Based on Mean Dropout Loss — Environmental Component

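| Variables | Mean dropout loss | Variables | Mean dropout loss |
| --- | --- | --- | --- |
| CAF | 5.077 | CFPT | 5.068 |
| SCF | 5.074 | LCF | 5.068 |
| OER | 5.070 | OPP | 5.068 |
| FLF | 5.069 | LMI | 5.068 |
| GII | 5.069 | FLI | 5.068 |
| EMIN | 5.069 | ENCO | 5.067 |
| NGI | 5.068 | FEE | 5.067 |
| DRS | 5.068 | | |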

   

Note. The mean dropout loss values indicate each variable’s contribution to the Random Forest model. Higher values (e.g., CAF, SCF, OER) represent greater influence on model accuracy, confirming their key role in validating the environmental dataset for smart building analysis.
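The permutation procedure behind a mean dropout loss can be sketched as follows. This is a minimal illustration under our own assumptions: the model is any callable that maps a feature row to a prediction, the loss is RMSE, and the toy model and data are hypothetical rather than drawn from the study.

```python
import math
import random

def rmse(model, X, y):
    """Root mean squared error of a callable model on rows X."""
    return math.sqrt(sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y))

def mean_dropout_loss(model, X, y, feature, n_permutations=50, seed=0):
    """Average RMSE after shuffling one feature column n_permutations times."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_permutations):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        X_perm = [row[:feature] + [v] + row[feature + 1:]
                  for row, v in zip(X, column)]
        losses.append(rmse(model, X_perm, y))
    return sum(losses) / n_permutations

# Hypothetical toy model: the target depends only on the first feature.
def toy_model(row):
    return 3.0 * row[0] + 0.0 * row[1]

X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0], [5.0, 3.0]]
y = [toy_model(row) for row in X]

loss_feature_0 = mean_dropout_loss(toy_model, X, y, feature=0)
loss_feature_1 = mean_dropout_loss(toy_model, X, y, feature=1)
```

Shuffling the informative feature raises the loss, while shuffling the uninformative one leaves it at the baseline, so a higher mean dropout loss signals a more influential variable.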

 

The model’s ability to identify meaningful predictors supports the internal coherence of the ESG dataset and confirms its reliability as a basis for digital twin modelling. The additive explanations of the predictions further reinforce the model’s interpretability. Each predicted value is constructed from a baseline prediction (the “Base”) adjusted by the additive contributions of each variable. Positive contributions increase the predicted AREA, while negative ones decrease it. For example, in Case 1, positive influences from GII (Grid Interaction Index) and FLI (Flexibility Index) compensate for the negative impact of SCF and EMIN, resulting in a final prediction slightly above the baseline. This additive approach allows for a clear decomposition of the prediction mechanism, offering transparency in understanding how individual environmental factors shape the model’s output. Such interpretability is essential for validating the dataset in a scientific context, as it ensures that the model’s decisions are both explainable and consistent with domain knowledge. By integrating the mean dropout loss and additive prediction explanations, the Random Forest model provides a double-layer validation: it identifies the most influential features for prediction and explains how they act in shaping each result. This combination of accuracy, interpretability, and conceptual alignment with building energy dynamics confirms that the model is methodologically sound and suitable for the prototyping of an intelligent management system for smart buildings, capable of leveraging digital twin and metaverse technologies for real-time performance monitoring and sustainable decision-making.

 

Table xyz. Additive Feature Contributions in Random Forest Predictions — Environmental Component

| Case | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| Predicted | 9.141 | 8.936 | 9.175 | 8.931 | 8.931 |
| Base | 9.063 | 9.063 | 9.063 | 9.063 | 9.063 |
| ENCO | -0.298 | -2.765 | 4.759 | -9.130 | 6.100 |
| CFPT | 8.291 | -20.937 | 0.220 | 4.921 | -12.691 |
| EMIN | -10.720 | -1.963 | 1.428 | 16.626 | -17.500 |
| LCF | -16.680 | -18.864 | 9.401 | -21.168 | 22.139 |
| SCF | -9.687 | -2.902 | -13.972 | -59.684 | -46.113 |
| LMI | 2.564 | -1.952 | -4.729 | 2.104 | 1.990 |
| OER | 10.921 | 10.930 | 10.902 | 10.914 | 10.910 |
| GII | 32.240 | -20.779 | 35.824 | -22.670 | 5.397 |
| NGI | 8.787 | 23.709 | -7.874 | -9.781 | 1.382 |
| CAF | -33.998 | 3.153 | 34.955 | -36.673 | -36.673 |
| OPP | 13.678 | -18.241 | -11.706 | 13.536 | -4.859 |
| DRS | 29.411 | -25.220 | -2.092 | -17.218 | -2.467 |
| FLF | 42.525 | -52.681 | 53.365 | -4.552 | -57.637 |
| FLI | 0.446 | 1.780 | 1.584 | 1.260 | -2.368 |
| FEE | -0.064 | -0.010 | -0.139 | -0.194 | -0.013 |

 

Note. This table shows the additive contributions of each environmental variable to the predicted AREA across five test cases. Positive values increase the prediction, while negative ones reduce it. The results highlight the interpretability of the Random Forest model, confirming that the dataset captures realistic and consistent relationships among energy and environmental indicators.
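The idea of a baseline adjusted by additive, per-variable contributions can be illustrated with a break-down style decomposition. The sketch below is one possible scheme, assumed here for illustration; the text does not specify which attribution method produced the table, and the toy model, background data, and instance are hypothetical.

```python
def break_down(model, background, instance, order=None):
    """Sequentially fix each feature to the instance value and record the
    change in the mean prediction over the background data."""
    order = list(range(len(instance))) if order is None else list(order)
    rows = [list(r) for r in background]

    def mean_prediction(data):
        return sum(model(r) for r in data) / len(data)

    base = mean_prediction(rows)
    contributions, previous = {}, base
    for j in order:
        for r in rows:
            r[j] = instance[j]          # condition on feature j
        current = mean_prediction(rows)
        contributions[j] = current - previous
        previous = current
    return base, contributions, previous  # previous equals model(instance)

# Hypothetical toy model with an interaction between features 1 and 2.
def toy_model(row):
    return 2.0 * row[0] + row[1] * row[2]

background = [[1.0, 2.0, 3.0], [2.0, 0.0, 1.0], [3.0, 1.0, 2.0]]
instance = [4.0, 1.0, 5.0]

base, contributions, predicted = break_down(toy_model, background, instance)
```

In this scheme the base value plus the sum of all contributions reproduces the prediction for the instance exactly, which is the property that makes each prediction traceable to individual variables. Because of interactions, the individual contributions can depend on the order in which features are fixed.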

 

 

8.2 Machine Learning Validation of the Social (S) Component in the ESG Dataset

In the context of developing a scientifically grounded methodology for validating the ESG dataset, this section focuses on the Social (S) component by applying and comparing different machine learning regression algorithms. The goal is to identify which algorithm best captures the underlying relationships among social performance indicators relevant to smart building management while ensuring both predictive reliability and interpretability (Li & Xu, 2024). After normalization of the performance metrics, the Random Forest algorithm shows the most balanced and consistent results. It achieves the lowest normalized MSE and RMSE (0.000) and a near-lowest MAE (0.006, against 0.000 for Regularized Linear), indicating strong predictive accuracy and robustness in modeling the social variables. The model’s relatively low MAPE (0.263) further supports its reliability, as it suggests that Random Forest maintains stable relative error levels across the range of predicted values, so that deviations between observed and estimated outputs remain proportionally small (Gaur et al., 2021; Li, 2025). By contrast, Linear Regression, while producing the highest R² value, exhibits much higher normalized error metrics. This suggests that, despite its apparent explanatory power, the linear model does not account for the complex, nonlinear interactions typical of social indicators in ESG frameworks, which reduces its generalizability (Li & Jiang, 2023). In this sense, Random Forest provides a better trade-off between minimizing errors and preserving interpretability, capturing multidimensional relationships among variables such as occupant comfort, indoor air quality, and system efficiency, which collectively define the social sustainability of building operations.
The results confirm that the Random Forest approach not only enhances the predictive stability of the validation process but also ensures methodological consistency with the broader objective of dataset validation within a digital twin and metaverse framework (Khan & Vora, 2024). Its ability to model complex nonlinearities and maintain low residual variance validates the dataset’s structural coherence and reinforces its suitability for integration into the prototyping of a smart building management system capable of dynamic, data-driven decision-making.

Table xyz. Normalized Performance Metrics of Machine Learning Models — Social (S) Component

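| Metric | Boosting | Decision Tree | KNN | Linear | Random Forest | Regularized Linear | SVM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MSE | 0.828 | 0.273 | 0.186 | 0.771 | 0.000 | 0.004 | 0.133 |
| RMSE | 0.989 | 0.123 | 0.027 | 0.949 | 0.000 | 0.002 | 0.067 |
| MAE | 0.713 | 0.210 | 0.044 | 1.000 | 0.006 | 0.000 | 0.038 |
| MAPE | 1.000 | 0.238 | 0.292 | 0.595 | 0.263 | 0.316 | 0.000 |
| R² | 0.000 | 0.182 | 0.727 | 1.000 | 0.667 | 0.182 | 0.000 |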

 

Note: The table presents normalized evaluation metrics for different machine learning algorithms applied to the Social (S) dataset. The Random Forest model achieves the lowest normalized MSE and RMSE and a near-lowest MAE, together with balanced performance on the remaining metrics, supporting its predictive accuracy and suitability for validating social indicators within digital twin and metaverse smart building frameworks.

The application of the Random Forest algorithm to the Social (S) dimension of the ESG model provides valuable insights for validating the dataset’s internal coherence and predictive reliability in the context of smart building management. This validation is essential to support the prototyping of a management model based on digital twin and metaverse technologies, which require accurate, interpretable, and scalable data structures to simulate and optimize human-environment interactions within buildings (Li, 2025). The results obtained through three feature importance metrics (mean decrease in accuracy, total increase in node purity, and mean dropout loss) illustrate the role and weight of social indicators such as air quality, comfort, and system efficiency in predicting the dependent variable (Miao & Xu, 2024). The mean dropout loss, calculated through fifty permutations, indicates the relative contribution of each feature to model accuracy. Higher dropout loss values correspond to higher importance, because permuting an influential variable degrades model performance more (Xu, 2021). In this dataset, PM2.5 (PM25), VOC, and Humidity (HUM) display the highest dropout loss values (3.919, 3.820, and 3.727), confirming the weight of indoor air quality in explaining the variance of the output. Sound Insulation (SND), Thermal Insulation (THR), and System Efficiency (SEF) show some of the lowest values (3.602, 3.595, and 3.623), and Coefficient of Performance (COP) an intermediate one (3.666), so on this metric their contribution is smaller, although these indicators remain directly linked to the comfort and operational quality of the indoor environment, which are central to the social sustainability dimension of smart buildings (Chowdhury et al., 2023). Occupants (OCC) and Air Changes per Hour (ACH) contribute with a moderate but consistent effect, emphasizing how internal environmental control and occupancy behavior affect building performance through indirect interactions. The other two importance measures, mean decrease in accuracy and total increase in node purity, complement these findings.
The positive values associated with PM2.5 (PM25), System Efficiency (SEF), and COP indicate that they significantly enhance the model’s predictive capacity, while negative or small values in other variables reflect lower or context-dependent influence. The total increase in node purity, a measure of how much a variable reduces overall model variance when used to split data in decision trees, identifies similar key drivers, suggesting the model’s internal coherence across multiple evaluation metrics (Lou, 2025).

Table xyz. Feature Importance Metrics for the Random Forest Model — Social (S) Component

| Variable | Mean decrease in accuracy | Total increase in node purity | Mean dropout loss |
| --- | --- | --- | --- |
| VOC | -310.022 | 1.200×10^8 | 3.820 |
| PM25 | 2.522×10^6 | 7.406×10^7 | 3.919 |
| HUM | -361.803 | 6.387×10^7 | 3.727 |
| OCC | -455.332 | 6.330×10^7 | 3.647 |
| ACH | -777.406 | 5.574×10^7 | 3.644 |
| PM10 | -38.839 | 5.482×10^7 | 3.632 |
| LPD | 153.570 | 5.290×10^7 | 3.638 |
| EUI | 120.725 | 5.279×10^7 | 3.653 |
| COP | 346.376 | 4.928×10^7 | 3.666 |
| SND | 284.558 | 4.860×10^7 | 3.602 |
| THR | -515.408 | 4.721×10^7 | 3.595 |
| SEF | 862.251 | 4.260×10^7 | 3.623 |
| EER | -55.396 | 3.588×10^7 | 3.563 |

Note. This table reports the importance metrics derived from the Random Forest model, including mean decrease in accuracy, total increase in node purity, and mean dropout loss. Variables such as SEF, COP, and PM2.5 show the highest influence on model accuracy, confirming their central role in explaining social sustainability and indoor comfort dynamics in smart buildings.

The additive explanations for predictions provide another layer of interpretability, illustrating how each variable contributes to specific case predictions. For example, in the first test case, variables such as PM25 and EER (Energy Efficiency Ratio) have strong positive contributions to the predicted value, whereas factors like VOC and LPD exert negative effects. These additive contributions allow the decomposition of predictions into comprehensible components, which is particularly valuable for digital twin applications that rely on traceable, feature-level understanding to inform operational decisions (Ozdemir et al., 2025). The capacity to visualize how indoor comfort, air quality, and energy efficiency dynamically influence outcomes reinforces the model’s practical relevance for smart building management. Overall, the Random Forest model demonstrates a robust and balanced capability to capture complex, nonlinear interactions among social variables within the ESG framework. It effectively distinguishes between features with direct physical impacts—such as thermal and acoustic insulation—and those representing behavioral or environmental feedbacks, like occupancy and ventilation rates. This multi-level interpretability confirms the dataset’s scientific validity, showing that it contains coherent, measurable relationships consistent with the physical and social principles of building performance (Orlenko & Moore, 2020; Drobnič et al., 2020). Therefore, this analysis validates the dataset as a reliable foundation for the development of an intelligent management prototype that integrates machine learning with digital twin and metaverse environments. The model’s structure supports the simulation of user comfort and operational efficiency, providing a data-driven mechanism for adaptive, sustainable management of smart buildings (Yu et al., 2021; Akhtar et al., 2024).

Table xyz. Additive Prediction Explanations for the Random Forest Model — Social (S) Component

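| Case | Predicted | Base | OCC | HUM | PM25 | PM10 | VOC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 10.119 | 9.706 | 137.642 | 356.475 | 102.444 | -145.549 | -198.943 |
| 2 | 9.497 | 9.706 | -141.526 | -557.700 | 510.704 | -426.882 | -234.224 |
| 3 | 8.345 | 9.706 | 271.758 | 310.474 | -1.262 | -290.846 | -164.060 |
| 4 | 9.409 | 9.706 | -99.743 | -292.405 | -1.277 | 613.552 | 1.001 |
| 5 | 9.857 | 9.706 | -64.956 | 461.545 | 242.598 | -5.588 | -325.978 |

| Case | ACH | THR | SND | EER | COP | SEF | EUI | LPD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | -182.452 | -28.543 | 158.293 | 285.878 | 74.686 | -124.032 | 81.448 | -104.236 |
| 2 | 151.726 | 122.638 | 432.515 | -131.370 | 255.324 | 100.229 | -32.603 | -258.426 |
| 3 | -167.946 | -121.153 | -146.303 | 6.577 | 260.931 | 207.834 | -42.744 | -223.179 |
| 4 | -253.144 | 52.756 | 403.440 | -192.171 | 189.206 | 139.739 | -392.002 | -191.752 |
| 5 | -52.027 | 84.146 | -43.087 | -56.376 | 267.341 | 184.743 | -363.505 | -178.109 |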

 

Note. The table illustrates the additive contributions of each variable to the predicted values across five test cases. Positive and negative values indicate how each social indicator (e.g., OCC, PM2.5, SND, COP) influences the final prediction relative to the baseline. These results confirm the interpretability of the Random Forest model and its capacity to capture complex interactions between comfort, air quality, and energy efficiency in smart building environments.

 

 

8.3 Machine Learning Validation of the Governance (G) Component within the ESG Framework

 

The validation of the Governance (G) component of the ESG model through machine learning techniques represents a critical step in ensuring the scientific reliability and applicability of the dataset for the prototyping of a smart building management system. Within this context, the Support Vector Machine (SVM) algorithm was identified as the most effective method for the validation process (Wang, 2025). The Governance dataset includes key indicators such as Cost of Energy Saved (CES), Energy Return on Investment (EROI), Energy Payback Time (EPBT), Construction and Capital Costs (CPD and CCF), System Performance Coefficient (SPC), Renewable Energy Utilization (REU), and Energy Productivity per Worker Hour (EPWH). These variables jointly capture the economic and managerial dimensions of building performance, linking financial efficiency with operational sustainability (Wu et al., 2023). SVM was selected primarily for its performance on mean absolute error (MAE) and mean absolute percentage error (MAPE), for which it attains the lowest normalized values in the comparison. Unlike linear regression or decision trees, which may struggle to represent nonlinear dependencies, SVM can model the complex and interrelated relationships among governance variables (Lin & Hsu, 2023). This is crucial for ESG-driven frameworks, where economic efficiency, energy optimization, and operational decision-making are deeply intertwined. Its normalized MSE and RMSE (0.436 and 0.570) are intermediate rather than lowest, and its normalized R² is the lowest among the tested models, so the selection rests on absolute and percentage error rather than on explained variance when estimating key governance outcomes such as cost-effectiveness and return efficiency (Koseoglu et al., 2025). The dataset itself, composed of one hundred buildings with diverse energy and cost characteristics, provides a robust foundation for testing the generalization capabilities of the model.
SVM’s kernel-based approach allows for capturing nonlinear interactions between energy payback time, system costs, and governance efficiency indicators without overfitting the data (Suprihadi & Danila, 2024). This adaptability makes it particularly suitable for applications in digital twin environments, where data-driven models must reflect real-time changes and complex system feedbacks. By integrating this validated model into a digital twin framework, it becomes possible to simulate governance-related decisions in virtual environments before implementing them in physical infrastructures. This enhances predictive control, cost management, and operational resilience in smart buildings. The ability to test policies, predict maintenance needs, or optimize energy-economic trade-offs within the metaverse extends the role of the Governance component beyond data analytics, transforming it into a dynamic management tool. Therefore, the use of SVM for database validation ensures methodological rigor and computational robustness, confirming that the dataset is not only statistically coherent but also operationally meaningful. This validation establishes a scientific foundation for developing a prototype capable of merging machine learning, digital twin technologies, and ESG-based governance metrics into a unified management model for smart, efficient, and sustainable buildings.

Table xyz. Normalized Performance Metrics of Machine Learning Models for ESG Dataset Validation — Governance (G) Component.

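| Metric | Boosting | Decision Tree | KNN | Linear Regression | Random Forest | Regularized Linear | SVM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MSE | 0.000 | 0.586 | 0.952 | 1.000 | 0.573 | 0.694 | 0.436 |
| RMSE | 0.000 | 0.742 | 0.965 | 1.000 | 0.733 | 0.812 | 0.570 |
| MAE | 0.000 | 0.820 | 0.908 | 1.000 | 0.380 | 0.129 | 0.000 |
| MAPE | 0.164 | 1.000 | 0.682 | 0.620 | 0.783 | 0.968 | 0.000 |
| R² | 0.940 | 0.433 | 0.928 | 0.560 | 0.980 | 0.793 | 0.000 |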

 

Note. The table reports normalized performance metrics for several machine learning models applied to the Governance dimension of the ESG dataset. Among all tested algorithms, the Support Vector Machine (SVM) achieved the lowest normalized MAE and MAPE, which motivated its selection for dataset validation in smart building governance modeling; its normalized MSE, RMSE, and R² are not the best in the comparison.
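One simple way to read the table is to list, for each metric, the model or models with the best normalized value. The sketch below does this with the values reported above, assuming that the unlabeled final row is R², that error metrics are to be minimized, and that R² is to be maximized.

```python
MODELS = ["Boosting", "Decision Tree", "KNN", "Linear Regression",
          "Random Forest", "Regularized Linear", "SVM"]

# Normalized values copied from the Governance (G) metrics table.
TABLE = {
    "MSE":  [0.000, 0.586, 0.952, 1.000, 0.573, 0.694, 0.436],
    "RMSE": [0.000, 0.742, 0.965, 1.000, 0.733, 0.812, 0.570],
    "MAE":  [0.000, 0.820, 0.908, 1.000, 0.380, 0.129, 0.000],
    "MAPE": [0.164, 1.000, 0.682, 0.620, 0.783, 0.968, 0.000],
    "R2":   [0.940, 0.433, 0.928, 0.560, 0.980, 0.793, 0.000],
}

def best_models(metric, values):
    """Models with the lowest value for errors, or the highest for R2."""
    target = max(values) if metric == "R2" else min(values)
    return [m for m, v in zip(MODELS, values) if v == target]

winners = {metric: best_models(metric, values)
           for metric, values in TABLE.items()}

for metric, names in winners.items():
    print(f"{metric}: {', '.join(names)}")
```

A per-metric listing of this kind makes explicit which criteria a model selection rests on, since no single model leads on every metric in this table.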

 

The results obtained from the validation of the Governance (G) component of the ESG model using machine learning provide a consistent and technically coherent confirmation of the dataset’s reliability for the prototyping of a smart building management system. In this validation phase, the analysis focuses on the feature importance metrics and the additive explanations derived from the Random Forest regression model, which was used to estimate the AREA variable from a set of governance-related indicators: CES (Cost of Energy Saved), EROI (Energy Return on Investment), EPBT (Energy Payback Time), CPD (Construction Cost), CCF (Capital Cost Factor), SPC (System Performance Coefficient), REU (Renewable Energy Utilization), and EPWH (Energy Productivity per Worker Hour). The Mean Dropout Loss remains nearly constant across all variables, at approximately 5.15 (from 5.142 to 5.155), suggesting that each feature contributes similarly to the model’s predictive accuracy. This uniformity implies that no single variable disproportionately influences the model. The stability in dropout loss is also consistent with a model that does not overfit to any individual feature, although it does not by itself demonstrate generalization to unseen data. From a methodological standpoint, this homogeneity supports the internal coherence of the Governance dataset and indicates that each metric contributes to explaining different aspects of building efficiency and management performance.

 

 

Table xyz. Feature Importance Metrics Based on Mean Dropout Loss — Governance (G) Component

| Variables | EPWH | CPD | CCF | SPC | REU | CES | EROI | EPBT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mean Dropout Loss | 5.155 | 5.154 | 5.151 | 5.149 | 5.148 | 5.144 | 5.144 | 5.142 |

Note. The table presents the Mean Dropout Loss values for each governance-related variable in the ESG dataset. The results show minimal variation among indicators (≈5.14–5.16), confirming a balanced contribution of all features to model accuracy and validating the internal consistency of the dataset used for smart building governance modeling.

 

 

 

The additive explanations of the predictions for the test set provide further insights into how each variable influences the estimated AREA values. The predictions stay very close to the base value of 9.309, with feature contributions generally close to zero. These subtle shifts indicate that the model captures interactions among governance variables without introducing excessive noise. For instance, the CPD and CCF indicators show minor but systematic effects, reflecting the role of cost-related parameters in determining building scale and resource allocation. Similarly, the contributions from REU and EPWH reflect the connection between renewable energy utilization, labor productivity, and overall building governance efficiency. From a broader perspective, these results support the model’s capacity to interpret governance-related dynamics within the ESG framework. The balanced feature importance distribution suggests that the variables are complementary rather than redundant, collectively contributing to predictive accuracy and interpretative value. In the context of smart building management, this outcome is particularly relevant because it supports the integration of governance indicators into a decision-support system capable of optimizing energy efficiency, financial sustainability, and operational planning. Therefore, the validation indicates that the database is statistically consistent and suitable for the development of an intelligent management prototype leveraging digital twin and metaverse technologies. The capacity to model economic and performance interdependencies establishes a strong foundation for advanced predictive control, simulation-based policy testing, and strategic governance of smart buildings. This supports a management model that is both scientifically validated and operationally viable in a real-world digital twin environment.

Table xyz. Additive Prediction Explanations for the Governance (G) Component — Feature-Level Contributions

 

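| Case | Predicted | Base | CES | EROI | EPBT | CPD | CCF | SPC | REU | EPWH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 9.309 | 9.309 | -0.020 | 0.017 | -8.680×10^-4 | -0.093 | -0.007 | 0.091 | -0.049 | -0.092 |
| 2 | 9.309 | 9.309 | 0.022 | -0.024 | 0.027 | 0.299 | 0.034 | -0.120 | 0.116 | -0.149 |
| 3 | 9.309 | 9.309 | -0.020 | -0.007 | -6.765×10^-4 | 0.010 | -0.074 | -0.007 | 0.086 | -0.155 |
| 4 | 9.309 | 9.309 | 0.003 | 0.031 | -9.158×10^-4 | 0.323 | -0.045 | -0.037 | -0.128 | -0.155 |
| 5 | 9.309 | 9.309 | 0.012 | -0.013 | -6.446×10^-4 | -0.265 | 0.108 | -0.136 | -0.053 | 0.17 |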

 

Note: The table shows the additive decomposition of predicted values for five test cases within the Governance (G) component. Each variable’s contribution is expressed as a deviation from the base prediction (9.309), illustrating how governance indicators such as CPD, CCF, and REU subtly influence model output. The small variations confirm the stability and coherence of the dataset and the balanced behavior of the machine learning model.
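The additive structure described in the note can be checked directly: in an additive explanation, each prediction is expected to equal the base value plus the sum of the per-feature contributions. The following Python sketch is only an illustration of that check, not the authors' code. The function name and the dictionary layout are assumptions; the numbers are copied from case 1 of the table.

```python
def additive_prediction(base, contributions):
    """Return the base value plus the sum of per-feature contributions."""
    return base + sum(contributions.values())


# Feature contributions for case 1, as printed in the table.
case_1 = {
    "CES": -0.020,
    "EROI": 0.017,
    "EPBT": -8.680e-4,
    "CPD": -0.093,
    "CCF": -0.007,
    "SPC": 0.091,
    "REU": -0.049,
    "EPWH": -0.092,
}
base = 9.309  # base value reported in the table

net_shift = sum(case_1.values())
reconstructed = additive_prediction(base, case_1)

print(f"net contribution:     {net_shift:+.6f}")
print(f"base + contributions: {reconstructed:.6f}")
```

Comparing the reconstructed value with the tabulated prediction shows whether the two agree within rounding. If they do not, the contributions and the prediction may have been reported on different scales or at different precision, which is worth stating alongside the table.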

Q2. Figures 3 to 7 are unclear; please reformat them for better display of the relevant indicators.

A2. We have completely modified the …

 

  1. Operationalizing Environmental Sustainability through Digital Twins: A Metaverse-Enhanced ESG Dashboard for Smart Building Management

 

In the rapidly shifting landscape of AI-driven building management, the alignment of ESG factors with digital twin technology and metaverse-based engagement is driving a paradigm shift in sustainability. This article proposes a dashboard, the ESG KPI Framework – Metaverse-Enhanced Operations, developed to support prototyping and training of a digital twin infrastructure for sustainable building management in a smart city setting and to define ESG-based KPIs for digital twin sustainability metrics. The dashboard provides a comprehensive view of environmental building metrics, covering carbon emissions, energy usage patterns, the integration of renewable energy sources, and optimized building management performance. It is fed by real-time IoT streaming sources and supported by computational methods such as PCA and Ordinary Least Squares for predictive modeling. Through the metaverse layer, which includes interactive 3D visualization of sustainability metrics such as carbon emissions and building performance, the dashboard presents these metrics in a cascading (waterfall) fashion. It thus brings sustainability intelligence and digital platform development together on an adaptive platform, pointing to a metaverse-driven paradigm for sustainability intelligence and predictive development.

 

This dashboard represents the Environmental (E) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations and has been designed for prototyping and training a digital twin and metaverse-based system for smarter building management (Noroozinejad Farsangi et al., 2024). It provides information on the building's environmental performance and demonstrates how sustainability metrics can be measured, authenticated, and visualized to improve ecological intelligence and efficiency within a metaverse-based management platform (Mahariya et al., 2023). The dashboard offers a systematic evaluation of key environmental factors, tracking the building's carbon footprint and the renewable energy generated and integrated within the building. The Carbon Footprint, with a value of 453.75 tCO2e, measures the total carbon emissions generated by building operation in a given period; it is a critical indicator for checking whether emissions stay within the limits and goals established by ESG frameworks. The Emission Intensity of 0.0249 tCO2/kWh suggests that the building maintains a high level of energy efficiency and a limited environmental impact. Renewable sources account for 57.4% of the building's energy mix, which aligns it with ESG-based strategies targeting net-zero and the transition to a green building approach (Dovolil & Svítek, 2024).
The building's Energy Consumption of 1,440,827 kWh and the related Energy ROI of 260220.0 suggest that sustainability strategies can deliver returns in this context and keep actions focused on energy efficiency. A Load Covering Factor of 76.5% and a Supply Covering Factor of 84.8% indicate that a substantial portion of demand can be met through on-site and renewable sources. The On-site Energy Ratio of 0.68, together with a grid interaction value of 58.6%, indicates that the building can operate with a considerable degree of autonomy while maintaining a strategic connection to the grid.
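The Load Covering Factor and Supply Covering Factor reported above can be illustrated with a short Python sketch. It assumes the definitions commonly used in the load-matching literature: the share of demand covered by on-site generation, and the share of on-site generation consumed on site. The text does not state the formula used by the dashboard, so this is an assumption, and the hourly values and the function name are hypothetical.

```python
def cover_factors(load_kwh, generation_kwh):
    """Return (load cover factor, supply cover factor) for matched time steps.

    Assumed definitions (not confirmed by the source):
      load cover factor   = on-site generation used on site / total load
      supply cover factor = on-site generation used on site / total generation
    """
    if len(load_kwh) != len(generation_kwh):
        raise ValueError("load and generation series must have equal length")

    # Energy that is both generated and consumed within the same time step.
    self_consumed = sum(min(l, g) for l, g in zip(load_kwh, generation_kwh))
    total_load = sum(load_kwh)
    total_generation = sum(generation_kwh)

    lcf = self_consumed / total_load if total_load else 0.0
    scf = self_consumed / total_generation if total_generation else 0.0
    return lcf, scf


# Hypothetical hourly values in kWh, for illustration only.
load = [10.0, 12.0, 8.0, 6.0]
generation = [0.0, 15.0, 10.0, 3.0]

lcf, scf = cover_factors(load, generation)
print(f"Load cover factor:   {lcf:.1%}")
print(f"Supply cover factor: {scf:.1%}")
```

Both factors depend on the time resolution of the series: coarser steps hide mismatches between generation and demand and tend to give higher values.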
The bottom portion of the dashboard provides graphical representations of the building's energy dynamics. The Energy Flow chart shows how the building switches between energy sources to sustain its operation. “This capability to forecast and react to changes in demand patterns is but one example of how predictive control methods are indeed imbedded within this environmental management framework” (Masubuchi et al., 2025). The information in this dashboard is continuously updated using IoT devices and validated using computational methods such as correlation analysis, Principal Component Analysis (PCA), Ordinary Least Squares (OLS), and machine learning algorithms (Hassani et al., 2022). These methods ensure that the information is trustworthy and make the Key Performance Indicators science-based for training a digital twin. In this way, “the environmental factor in ESG becomes not only descriptive but can accurately forecast future performance scenarios under different scenarios of either operation and climate in which a digital twin performs” (Tsouri & Avgousti-Della, 2024). At the level of design intent, the dashboard makes clear that environmental intelligence and immersive technologies have been integrated.
At the metaverse level, this information is analyzed in real time through interactive 3D visualization to show “the direct effect of building operation decisions on energy and carbon emissions and system performance” (Hernandez et al., 2023). The immersive experience “brings monitoring and controlling traditional environments to an 'experiential' level for learning and strategies” while making “sustainability a not fixed information piece but a 'dynamic' and 'participative' strategy for decision-making” (Zainab & Bawanay, 2023). The dashboard is one of the foundational components of a strategy platform for digital twin applications in smart buildings. It offers continuous feedback between information and simulation, and system-level optimization for better building performance (Markopoulos et al., 2024). It recognizes that strategic developments in building environmental intelligence and immersive display are reaching a decisive point in achieving “an intelligent ecosystem that can self-act to improve its building environment while keeping to transparency and accountability”. In conclusion, the “E” in ESG can support a computational strategy for building digital twins and metaverse technology, and thereby better strategies for smarter buildings.
The dashboard applies a continuous, data-driven strategy for developing a new ecological approach to smart building infrastructure, one that treats carbon emissions and renewable energy not as theoretical constructs but as grounded, dynamic, and interactive realities.

 

 

 

Figure xyz. Environmental Dimension Dashboard – ESG KPI Framework for Smart Building Digital Twin Development. Note: This dashboard represents the Environmental (E) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations, focusing on carbon footprint, renewable energy use, and energy efficiency. The displayed KPIs—such as 453.75 tCO₂e, 57.4% renewable energy, and 0.0249 tCO₂/kWh emission intensity—demonstrate strong environmental performance. The Energy Flow and Load Shifting charts visualize real-time energy dynamics, supporting the digital twin prototype for sustainable and intelligent building management.

 

 

 

 

The above dashboard represents the Social (S) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations within a digital twin and metaverse-based system for smart building management (Noroozinejad Farsangi et al., 2024). Unlike the dashboard views that address environmental and governance factors, it focuses on user well-being and health through quantified Key Performance Indicators for social sustainability. The dashboard combines real-time monitoring from building sensors with digital twin analysis to evaluate indoor environmental quality (IEQ), making it a pivotal layer for ensuring that occupants' well-being is maintained through a digital and sustainable metaverse platform. The top portion summarizes building sustainability. The Carbon Footprint (453.75 tCO2e) is retained as a reference indicator shared across the ESG layers. The key indicator for occupant health is indoor air quality, rated “Excellent” with a PM2.5 concentration of 11.4 µg/m³. This concentration is classified as “Very Low”, which limits occupants' exposure to airborne particulates. The next section evaluates the social factors that define the indoor experience. Relative Humidity is 51.1%, within the “Normal” range. The PM10 concentration is 20.9 µg/m³, and the volatile organic compound (VOC) concentration is 20 ppb.
The air change rate is 2.8 1/h, which supports indoor air quality through regular air renewal. The R-Value of 2.19 characterizes the thermal insulation of the building and is associated with indoor temperatures, which the dashboard reports at the “Comfort” level. Sound Insulation, reported as 35.6 dB in the figure caption, is associated with indoor noise stress, which is kept at a “Low” level. A further health-related indicator is displayed at a “Very Low” level. The dashboard serves as a component of the prototype for the digital twin system, translating abstract social sustainability criteria into measurable indicators to promote a human-centered, resilient approach to building management.

Figure xyz.  Social Dimension Dashboard – ESG KPI Framework for Digital Twin and Metaverse-Based Smart Building Management. This dashboard focuses exclusively on the Social (S) dimension of the ESG framework, illustrating KPIs that measure comfort, health, and indoor environmental quality. Indicators such as PM2.5 (11.4 µg/m³), VOC (20 ppb), Sound Insulation (35.6 dB), and System Efficiency (86.0%) provide quantifiable insight into occupant well-being. These data, integrated within the digital twin and metaverse-based prototype, form the basis for predictive, interactive, and human-centered smart building governance.

 

 

The dashboard above represents the Governance (G) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations. It is designed to train and validate a prototype for an integrated digital twin and metaverse system for smart building management (Noroozinejad Farsangi et al., 2024). It differs from the other ESG dashboards because it focuses solely on economic governance and financial decisions related to energy management and sustainability (Adnan et al., 2024). The top portion frames its scope through three core metrics. Carbon Footprint (453.75 tCO2e) and Indoor Air Quality (11.4 µg/m³ PM2.5, Excellent) are retained to maintain continuity with the other ESG dimensions; in this context, however, the focus is financial. The Energy ROI (26,022.0), with a 0.8-year payback period, indicates how quickly energy investments are returned (Cranford, 2023). This metric reflects how capital and financial governance sustain the digital twin across financial and environmental parameters. The middle portion breaks down the financial parameters. The Investment for system implementation is 1,679,500 €, while Subsidies are 883,081 €, corresponding to 52.6% of the total capital. This breakdown supports transparency and traceability in ESG financial governance (Park et al., 2023). The Cost Energy Saving metric (0.701 €/kWh) expresses the financial return of energy-saving initiatives, and the Annual Savings value (22,513 €/yr) expresses the yearly financial benefit. The Peak Demand Cost (187,854 €) and the related demand metrics complete the picture of financial sustainability.
The digital twin enables the platform to forecast demand cost variability across different scenarios (Aloqaily et al., 2022). The Cash Flow (–372,815 €) is used to assess the financial sustainability of the investment in the digital twin platform. This figure is complemented by a 15-year forecast, shown in the cash flow projection chart. The chart is one of the most important elements of the Governance dimension: it displays both the Annual Cash Flow and the Cumulative Cash Flow to describe financial recovery over time (Masubuchi et al., 2025). The negative bar in the first year represents the capital outflow of the initial investment, and the positive bars that follow represent yearly savings. The point at which the cumulative curve becomes positive marks the stage at which the investment becomes profitable. Simulating this within a digital twin helps the metaverse platform project financial sustainability and performance in line with its ecological and operational potential (Trung, 2022). The left sidebar, titled Building Config, supports the governance strategy by including real and simulated parameters. Users can switch between Real Building Data and Simulated Data and define building configuration parameters such as Building Size (16,795 m²), Building Occupants (701 people), and Total Units (209), which ensures comparability and normalization of governance KPIs across different building types (Zainab & Bawanay, 2023).
The switches that enable Renewable Energy Systems, Smart & Grid Integration, and Metaverse Technology support not only monitoring but also training of the prototype's governance logic, so that it can predict and react to future investment decisions in the simulated metaverse platform (Duong et al., 2023). In essence, the dashboard acts as a Governance Cockpit within the ESG framework, designed for monitoring and predicting the financial and strategic aspects of smart building management. It connects financial accountability with sustainability goals and quantifies Investment Efficiency, Savings, and Payback Time while predicting future economic behavior in the metaverse and digital twin platform. For the prototype, these indicators set up a training ground for algorithms that support decision-making in immersive management scenarios. The dashboard thus translates the “G” factor into a financial and strategic digital governance model within the metaverse digital twin. It shows that good governance is not only a management activity but also a computational task that can be modeled and optimized in a digital twin space, where it can have a significant impact on building management.

 

Figure xyz. Governance Dimension Dashboard – ESG KPI Framework for Metaverse-Enhanced Smart Building Management. This dashboard focuses on the Governance (G) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations, highlighting financial transparency and investment efficiency. KPIs such as Energy ROI (26,022.0), Payback Time (0.8 yrs), and Subsidies (52.6%) demonstrate strong economic performance, while the 15-Year Cash Flow Projection confirms long-term financial sustainability within the digital twin and metaverse-based governance model.
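The financial logic described for the Governance dashboard (subsidy share, cumulative cash flow, and the year in which the cumulative curve turns positive) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the dashboard's implementation. The function names are hypothetical, the investment and subsidy figures are those quoted in the text, and the cash flow series is invented for the example.

```python
from itertools import accumulate


def subsidy_share(investment, subsidies):
    """Return subsidies as a fraction of the total investment."""
    return subsidies / investment


def cumulative_cash_flow(annual_flows):
    """Return the running total of annual cash flows (year 0 first)."""
    return list(accumulate(annual_flows))


def payback_year(annual_flows):
    """Return the first year index whose cumulative cash flow is non-negative.

    Returns None if the investment is never recovered within the series.
    """
    for year, total in enumerate(accumulate(annual_flows)):
        if total >= 0:
            return year
    return None


# Figures quoted in the dashboard description.
share = subsidy_share(investment=1_679_500, subsidies=883_081)
print(f"Subsidy share: {share:.1%}")

# Hypothetical series: an outlay in year 0 followed by constant yearly savings.
flows = [-1000.0] + [300.0] * 5
print("Cumulative:", cumulative_cash_flow(flows))
print("Payback year:", payback_year(flows))
```

A yearly series gives only a whole-year payback index. A fractional payback time such as the one shown on the dashboard would require interpolation within the break-even year or a finer time step.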

Q3. The top-level sections in this paper are too many (there are 13). Consider merging sections 3 to 5 and sections 6 to 8. Additionally, in Part 3 on page 5, set the relevant paragraphs as sub-sections 3.1 to 3.5 to highlight the key points. Section 13 should be removed, and the way top-level sections are presented should be adjusted. In the text, for '11. Limitations', remove the sub-sections and use numbering (1) to (9) for better clarity.

A3. Following the suggestions of other reviewers, we have significantly modified the structure of the article. In particular, because the reviewers raised the issue of data validation to ensure the effectiveness of prototyping, we gave great importance to all issues related to data metric verification. The revised version of the article therefore includes several paragraphs dedicated to data validation techniques using correlation, PCA, regression, and machine learning. The current section structure is as follows:

The second section describes in more detail the key research question, which addresses the integration of digital twin and metaverse technologies to construct a more holistic platform for efficient, optimized smart building management. The third section discusses the environmental dataset created for evaluating the performance of smart infrastructure and has multiple subsections that define the relevant Key Performance Indicators for the environmental, social, and governance aspects considered in digital twin techniques. The fourth section presents a comprehensive validation framework that underpins the trustworthiness and security of ESG information. The fifth and sixth sections apply correlation analysis and Ordinary Least Squares techniques to examine, with greater rigor, ESG-related inter-user correlations and their application to homogeneous datasets. The seventh and eighth sections provide more in-depth information on applying PCA and machine learning models based on random forests. They also point out that, to support more efficient valuation and analysis of ESG across all KPI categories in the datasets, a metaverse should include a dashboard to monitor and manage ESG factors in a realistic, real-time context. The final sections present a broader discussion of the proposed thesis and set out its key limitations.

 

Q4. In the fourth paragraph on page 3, you can divide it into smaller paragraphs according to the perspectives of different scholars, to better explain the logical relationships and differences.

A4. The article has been completely revised to reflect the reviewers' suggestions. In reorganizing the sections, we had to give special emphasis to data construction and validation techniques using models such as correlation, PCA, regression, and machine learning.

Q5. Table 1 on page 8 can be reformatted and organized to make it more compact.

A5. The article has been completely revised to reflect the reviewers' suggestions. The tables have also been modified and adapted to meet the requirements.

Q6. The three steps in the eighth part on page 12 can be presented as three separate paragraphs to clarify the logical relationships.

A6. The article has been completely revised to reflect the reviewers' suggestions. In reorganizing the sections, we had to give special emphasis to data construction and validation techniques using models such as correlation, PCA, regression, and machine learning. The previous organization of paragraphs has been completely changed and new sections have been added.

 

 

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript addresses an emerging and relevant topic within the fields of digital transformation and smart building management. The paper presents an ambitious proposal for an integrated metaverse-based platform that combines digital twins, IoT data, and immersive virtual environments to enhance building operations. The topic is innovative and aligns well with the scope of the journal; however, the current version lacks structural coherence, methodological transparency, and sufficient theoretical grounding. While the manuscript demonstrates creative thinking, it requires major revision to achieve the academic depth and analytical rigor necessary for publication. The authors must significantly improve the paper’s framing, expand its literature base, clarify its methodology, and strengthen the discussion of results and implications.

The research explores the potential of integrating metaverse technologies into smart building management systems, a concept that is both forward-looking and interdisciplinary. However, the scientific quality is limited by several key deficiencies. The paper cites only 47 references, which is insufficient for establishing a coherent theoretical and technological foundation. The authors must expand the bibliography with recent (2020–2025) high-impact publications that reflect the rapid advancements in metaverse technologies, smart buildings, and digital transformation.

The idea of an integrated metaverse-based management system for smart buildings is highly original and aligns with current trends in immersive technologies and cyber-physical integration. However, originality alone is not enough to ensure scholarly contribution. The manuscript must demonstrate conceptual depth and methodological robustness to justify its novelty.

At present, the discussion of metaverse integration remains largely descriptive. The paper would benefit from deeper theoretical engagement—particularly with frameworks such as digital twin interoperability, user-centered virtual environments, and intelligent data-driven control systems. Strengthening these connections would allow the research to move beyond conceptual novelty toward substantive scholarly contribution.

The writing is generally clear and readable, but the organization and flow require significant improvement. Several sections are repetitive and loosely connected, leading to a fragmented narrative. Transitions between conceptual explanation, framework development, and implementation are abrupt. Improving coherence and logical progression will enhance readability and academic professionalism. Furthermore, visual and structural aids—such as conceptual models, flow diagrams, or prototype schematics—should be included to illustrate system architecture and functional relationships. These would help readers grasp how the proposed metaverse platform operates in practice.

Section-by-Section Evaluation

Abstract: The abstract lacks clarity and precision. It should present the research problem, objectives, methods, and key contributions rather than offering a thematic overview. Quantitative or conceptual highlights would strengthen its scientific tone.

Introduction: The introduction presents the background but fails to establish a clear research gap or objective. The authors should add one for research significance, explaining why the study is important. A Theoretical and Technological Framework should follow the introduction to provide context and justification.

Research Question: Toward an Integrated Metaverse-Based Platform for Smart Building Management: The central question is intriguing but lacks clarity and precision. The authors should frame the research question in measurable terms and connect it to the broader discourse on digital transformation. A schematic showing how the metaverse model addresses gaps in current smart building systems would improve this section.

Theoretical and Technological Framework: The section remains descriptive and disconnected from established theoretical constructs. The authors should provide a clear conceptual framework explaining how metaverse technologies interface with digital twins, IoT systems, and building management workflows. Theoretical underpinnings—such as cyber-physical systems theory or human–computer interaction principles—should be referenced.

KPI Framework for Immersive Smart Building Management: This section introduces an important idea but lacks methodological clarity. The authors should explain the selection process, categorization, and weighting of KPIs and relate them to measurable building performance indicators. Comparing these KPIs with established ones in existing BMS literature would demonstrate added value.

Economic Feasibility and Cost-Benefit Analysis of Metaverse Integration: The discussion here is speculative and unsupported by quantitative data. The authors should outline a structured cost-benefit analysis model, referencing empirical or benchmark data from comparable projects. Potential risks and scalability challenges should also be discussed.

Platform Functional Capabilities: While informative, this section is too general. The authors should distinguish between technical, operational, and experiential functionalities and clearly explain how the metaverse environment enhances each. A conceptual diagram or architecture model would improve clarity.

Prototype Implementation and Preliminary Results: The description lacks technical detail and transparency. The authors should specify the prototype development tools, datasets, and testing scenarios. Screenshots or workflow diagrams would help validate claims about functionality.

Methodology and Case Study: The authors should explain how the research was structured, the rationale for case selection, and how the prototype or framework was evaluated. This is critical for reproducibility.

Dashboards: This section remains descriptive. The authors should elaborate on the data visualization logic, real-time monitoring mechanisms, and linkage with the KPI framework. Discussing dashboard usability or decision-support implications would enhance its value.

Discussion: The discussion reiterates prior sections rather than synthesizing insights. It should connect the results to existing research and critically evaluate the implications of the proposed framework for real-world building management. Comparative reflection with other digital twin or BIM-based systems is essential to demonstrate conceptual maturity.

Limitations: The limitations are underdeveloped. The authors should expand this section to acknowledge the lack of empirical validation, reliance on conceptual modeling, and assumptions regarding technological readiness. Transparent discussion of constraints will strengthen credibility.

Conclusion: The conclusion restates ideas but lacks depth. It should synthesize findings, articulate contributions, and explicitly address both theoretical and practical implications.

 

Author Response

Point-by-Point Response to Reviewer 3

Q1. This manuscript addresses an emerging and relevant topic within the fields of digital transformation and smart building management. The paper presents an ambitious proposal for an integrated metaverse-based platform that combines digital twins, IoT data, and immersive virtual environments to enhance building operations.

A1. Thanks, dear editor.

Q2. The topic is innovative and aligns well with the scope of the journal; however, the current version lacks structural coherence, methodological transparency, and sufficient theoretical grounding. While the manuscript demonstrates creative thinking, it requires major revision to achieve the academic depth and analytical rigor necessary for publication. The authors must significantly improve the paper’s framing, expand its literature base, clarify its methodology, and strengthen the discussion of results and implications.

A2. To strengthen the article's technical-scientific dimension, in line with the requests of other reviewers, we substantially rewrote the article, both to expand the literature review currently distributed across all sections and to address methodological issues. A significant focus was placed on defining KPIs that would capture the ESG dimension within smart building management activities. In particular, considerable attention was given to developing a methodology for data validation as a preliminary step towards prototyping. To this end, sections were added presenting the results relating to correlations, PCA analysis, regression analysis, and machine learning applications. The prototype was subsequently modified as well. Below we report the parts that have been substantially modified following the suggestions of the reviewers.

 

 

3. Development of the Environmental Dataset for Evaluating Smart Infrastructure Performance through Digital Twin Integration

 

 

Creating the environmental dataset is an essential step in designing a digital twin model for smart infrastructures, and the environmental KPI set presented in this section is the foundation of the proposed research. The KPIs were selected on the basis of a literature review on digital twins applied to the environmental performance of smart infrastructures. The ultimate goal is an integrated system that can simulate the environmental performance of infrastructures. The KPI set supports an operational analysis of the relationship between energy efficiency, sustainability, and environmental performance. Carbon footprint and emission intensity provide extended information on environmental sustainability, while load cover factor and on-site energy ratio provide evidence on system autonomy and energy efficiency. Built on these KPIs, the dataset permits an integrated analysis of environmental performance that is consistent with the environmental pillar of the ESG framework, allows performance to be compared across any number of infrastructures, and provides a knowledge base to support design decisions. In summary, the dataset positions the digital twin as a data-driven approach to the environmental sustainability of infrastructures.

 

 

3.1 Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Evaluation of Smart Infrastructure

 

The chosen environmental Key Performance Indicators (KPIs) provide a comprehensive framework for evaluating the environmental performance of smart infrastructures. They support the aims and scope of the proposed digital twin platform, which is intended to analyze, model, and optimize the environmental, energy, and operational characteristics of buildings and urban infrastructures in an immersive and data-driven setting ([1]; Fokaides, Jurelionis, & Spudys, 2022). Each KPI adds a distinct perspective on energy, resilience, and efficiency, and together they form a holistic model for managing and interpreting the sustainability of urban infrastructures [1]. The Carbon Footprint (CFPT) provides a fundamental measure of sustainability by quantifying the total greenhouse gas emissions generated by system activity. It expresses complex system operations through a single comparable metric in CO₂-eq that covers both direct and indirect emissions (Zahedi, Alavi, Sardroud, & Dang, 2024). Within digital twin analysis the CFPT is essential because it enables real-time analysis and projection of environmental implications across different operating scenarios (Li, 2025). It effectively links energy performance to global climate goals (Yu, Ye, Xia, & Chen, 2024). Emission Intensity (EMIN) adds a further dimension by normalizing emissions by the energy consumed or produced. This ratio allows systems of different scale to be compared, which makes it valuable for multi-building and city-scale analysis (Alibrandi, 2022). The Load Cover Factor (LCF) and Supply Cover Factor (SCF) assess the relationship between energy demand and supply, an important consideration for energy and resource sufficiency. 
The LCF evaluates how much of the energy demand is covered by local energy production over a given period, assessing system sufficiency, while the SCF evaluates how much of the local energy production is actually used to meet that demand, assessing resource use (Chávez et al., 2022). The Load Matching Index (LMI) evaluates the synchrony between local energy production and local energy demand. High LMI values indicate that local production and storage are well matched to local loads, which underpins the efficiency and resilience of smart grids (Klar & Angelakis, 2023). The On-Site Energy Ratio (OER) captures the extent to which local energy consumption is covered by on-site generation from renewable sources, making it a key factor in assessing zero-energy buildings (Prandi et al., 2022). The Grid Interaction Index (GII) and No-Grid Interaction Probability (NGI) establish the wider context for autonomy: the GII captures the intensity and direction of energy exchanges with the grid, while the NGI estimates the probability of autonomous operation (Fokaides et al., 2022). Capacity Factor (CAF) and One Percent Peak Power (OPP) describe system performance at varying loads. The CAF estimates how fully the system uses its installed energy resources, making it a key index for judging return on investment, while the OPP focuses on peak loads and their intensity, indicating the stress they place on the system [2]. Building on the concept of behavior-based performance, the Demand Response Percentage (DRS) estimates the system's flexibility in adapting to varying loads, particularly under smart pricing scenarios [3]. 
The system's overall flexibility in adapting to external stimuli, such as market prices or renewable resource availability, and thus the transition from static energy management to adaptive operation, is captured by three indicators: the Flexibility Factor (FLF), the Flexibility Index (FLI), and Flexible Energy Efficiency (FEE) (Chávez et al., 2022; Li, 2025). This framework not only satisfies the requirements of sustainability analysis but also supports decision-making, scenario analysis, and future system optimization ([4]; Zahedi et al., 2024). It therefore aligns well with the requirements of an intelligent, fully interoperable, and environmentally sustainable smart urban ecosystem supported by measurable performance indicators (Prati, Pelucchi, Dal Fiore, Fuzzati, & Agostini, 2023).
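
To make the emission and load-matching indicators described above more concrete, the following is a minimal Python sketch. It assumes hourly series in kWh and the ratio formulations commonly used for these indicators; the function names and the variables `g` (on-site generation), `l` (load), `s` (storage balance) and `zeta` (losses) are illustrative assumptions and are not taken from the platform.

```python
from typing import Sequence


def carbon_footprint(activity: Sequence[float], factor: Sequence[float]) -> float:
    """CFPT: sum of each activity quantity times its emission factor (CO2-eq)."""
    return sum(a * f for a, f in zip(activity, factor))


def emission_intensity(co2_total: float, energy_total: float) -> float:
    """EMIN: emissions per unit of energy consumed or produced."""
    return co2_total / energy_total


def _covered(g, s, zeta, l):
    # Assumed formulation: on-site energy usable at each step, capped at the load.
    return [min(gi - si - zi, li) for gi, si, zi, li in zip(g, s, zeta, l)]


def load_cover_factor(g, s, zeta, l) -> float:
    """LCF: share of the load covered by on-site generation."""
    return sum(_covered(g, s, zeta, l)) / sum(l)


def supply_cover_factor(g, s, zeta, l) -> float:
    """SCF: share of on-site generation actually used by the load."""
    return sum(_covered(g, s, zeta, l)) / sum(g)


def on_site_energy_ratio(g, l) -> float:
    """OER: total on-site generation divided by total load."""
    return sum(g) / sum(l)


# Hypothetical four-hour example (kWh), with no storage and no losses.
g = [0.0, 2.0, 4.0, 1.0]
l = [1.0, 1.0, 2.0, 2.0]
s = [0.0] * 4
zeta = [0.0] * 4

lcf = load_cover_factor(g, s, zeta, l)    # 4/6
scf = supply_cover_factor(g, s, zeta, l)  # 4/7
oer = on_site_energy_ratio(g, l)          # 7/6
```

In this example the OER exceeds 1 while the LCF stays below 1: total generation is larger than total load, but not at the hours when the load occurs. This is why the cover factors are read alongside the aggregate ratio.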

 

Table xyz. Environmental Key Performance Indicators (KPIs) and Their Computational Formulations

KPI

ACRONYM

Description

Formula

Carbon Footprint

CFPT

Indicates the total amount of greenhouse gas (GHG) emissions caused by an individual, organization, or product, either directly or indirectly. The formula calculates the sum of emissions associated with different activities by multiplying the quantity of each activity by its corresponding emission factor [5].

 

 

 

 

 = Quantity of a specific activity that generates greenhouse gas emissions (e.g., km, kWh, liters).

 = Rate of GHG emissions per unit of activity, expressed in CO₂ equivalent per unit (e.g., tCO₂e/kWh for electricity, tCO₂e/liter for fuel, etc.).

Emission Intensity EI

EMIN

Evaluates the environmental impact of an energy system by measuring the amount of carbon dioxide (CO₂) emitted per unit of energy consumed or produced. A low EI value indicates that the system is more environmentally efficient, emitting less CO₂ for each unit of energy consumed or produced (for example, through the use of renewable energy sources). Conversely, high EI values typically occur in systems that rely heavily on fossil fuels [6].

 

EI = CO₂ / E

CO₂ = Total amount of CO₂ emitted over a given period, resulting from the consumption of fossil fuels or the use of grid electricity [tCO₂]

E = Total amount of energy consumed or produced during the same reference period [kWh]

Load Cover Factor

LCF

Represents the ratio between the energy actually supplied by on-site generation and the energy demanded or consumed over a given time interval. When LCF = 1, the entire load demand is fully satisfied. When LCF < 1, the load is not completely met during part of the period, due to limitations in generation or available resources. Range: 0 ≤ LCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Supply Cover Factor

SCF

Indicates the share of on-site generated energy that is actually used to meet the load. When SCF = 1, the amount of useful supplied resources is exactly equal to the total available amount. This implies that there are no significant losses and that all available resources are fully utilized. When SCF < 1, the amount of effectively usable resources is lower than the total available amount: part of the generated energy is not used to meet the load, likely due to overproduction, losses, or storage capacity limitations. Range: 0 ≤ SCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Load Matching Index

LMI

Measures the efficiency with which on-site energy generation (whether renewable or not) matches the energy load (demand) of a system.

It evaluates how well the energy production profile corresponds to the load profile over time by analyzing the synchrony between supply and demand.

A higher index indicates a better match between generation and load.

When LMI = 1 (100 %), the load is fully met (i.e., generation and storage are sufficient to cover the required demand) in every considered interval.

When LMI < 1, the load is not fully met at certain times, meaning that the generation and/or storage capacity was lower than the demand.

Range: 0 % ≤ f_(load,i) ≤ 100 % [8].

 

i = Time intervals [hourly, daily, monthly]

 = On-site energy generation at a given time t [kWh]

 = Storage energy balance at a given time t [kWh]

 = Energy losses at a given time t (sum of generation energy losses, storage energy losses, building technical system losses (excluding storage), and load-related energy losses such as distribution losses) [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Number of samples within the evaluation period, from τ₁ to τ₂. When hourly data are used and the evaluation period covers a full year, the number of samples is 8760.

 

On-site Energy Ratio

OER

Determines the amount of energy produced on-site (e.g., from renewable sources such as solar panels or wind turbines) relative to the total energy consumption over a given period of time.

If OER = 1, the on-site generated energy equals the total energy consumption.

If OER < 1, the on-site produced energy is lower than total consumption, meaning that the system depends on external energy sources to meet the demand.

If OER > 1, the on-site generated energy exceeds total consumption, indicating that energy production is greater than demand (and surplus energy may be exported to the grid).

Range: OER ≥ 0 [9].

 

 

 = On-site energy generation at a given time t [kWh]

 = Total energy consumption (energy load) at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 

 

 

 

Grid Interaction Index

GII

Measures the level of interaction and integration of a facility with the power grid, describing its average stress.

If GII = 100%, the energy exchanged with the grid during interval i equals the maximum possible exchange.

If GII = 0%, no energy exchange with the grid occurred at that moment.

If GII < 0%, energy was injected into the grid rather than drawn from it [7], [8].

 

 = Net energy exchanged with the power grid during interval i (can be positive or negative depending on whether energy is being drawn from or injected into the grid) [kWh]

 = Maximum absolute value of the net energy flow with the grid, taken over all considered time intervals [kWh]

i = Time intervals [hourly, daily, monthly]

No grid interaction probability

NGI

Measures the probability that a building or facility operates autonomously from the power grid, and therefore the likelihood of no interaction with it.

It also indicates the extent to which the load is covered by stored energy or renewable energy use.

If NGI = 0, there was no moment during the considered time interval when the net energy was zero or negative.

If NGI = 1, the net energy was zero or negative for the entire considered period.

Range: 0 ≤ NGI ≤ 1 [7], [8].

 

 = Probability that the net energy is zero or negative during the evaluation period from τ₁ to τ₂

 = Normalized variable for the net exported energy at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

Capacity Factor

 

CAF

Defines the ratio between the actual energy production of a system (energy exchanged between the building and the grid) and the maximum production that could be achieved if the system operated at full capacity over a given period of time.

If CAF = 1, the system operated at its maximum capacity for the entire considered period.

If CAF = 0, the system did not produce any energy.

Range: 0 ≤ CAF ≤ 1 [8].

 

 = Normalized variable for the net exported energy at a given time t [kWh]

 = Maximum producible energy at full capacity (system capacity) [kWh]

 = Evaluation period [s]

One Percent Peak Power

OPP

Quantifies the maximum power that an energy system can reach by calculating the energy production corresponding to the top 1% of peak periods.

A high OPP value indicates that the building or system experiences moments (the top 1% of the time) with very high energy consumption. This may point to significant peak loads that place stress on the electrical grid.

If OPP is low, the building’s energy demand is more evenly distributed over time, with fewer or smaller peaks [10].

 

 = Energy associated with the top 1% of a given value, calculated during periods of maximum demand or generation [kWh]

 = Time period over which the energy is measured [h]

Demand Response Percentage

 

DRS

Refers to the percentage variation of the Demand Response relative to a baseline value.

If DRS > 0, the Demand Response was successful in reducing power compared to the baseline level (load “reduction” capability).

If DRS = 0, no variation occurred.

If DRS < 0, it indicates an increase in power during the Demand Response implementation, which is generally undesirable (load “overload” condition) [11].

 

 = Baseline hourly power, i.e., the expected or normal power level without any Demand Response measures [kWh]

 = Hourly power under Load Shifting conditions, i.e., the power recorded during the Demand Response event [kWh]

Flexibility Factor

FLF

Measures the ability of an energy system to adapt to variations in energy demand and resource availability, and to shift energy use from high-price periods to lower-price periods. It applies a daily quartile-based price classification, dividing prices into three categories: low, medium, and high.

A high price is defined as one above the third quartile (price > 75% of all prices during a day).

A low price corresponds to a value within the first quartile (price ≤ 25%).

If FLF = 0, consumption is balanced between low- and high-price periods.

If FLF = 1, consumption occurs only during low-price periods.

If FLF < 0, most consumption occurs during high-price periods.

Range: −1 ≤ FLF ≤ 1 [12].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Low-price periods (first quartile, i.e., the lowest 25% of prices)

 = High-price periods (above the third quartile, i.e., the highest 25% of prices)

 = Number of considered time intervals

 

Flexibility Index

FLI

Calculates the difference between the energy cost under a flexibility-controlled scenario and the energy cost under a reference scenario. The Flexibility Index is used to measure the effectiveness of flexibility strategies in reducing costs compared to a baseline case.

If FLI < 0, the flexibility-controlled case has a higher energy cost than the reference case, meaning an undesirable cost increase.

If FLI = 0, the total energy cost under flexible conditions is identical to that of the reference case, indicating that flexibility yields no savings.

If FLI = 1, the total cost in the flexibility-controlled case is zero relative to the reference case; this represents an ideal but unrealistic situation.

If FLI is positive and close to 1, it means that energy has been effectively shifted or managed, reducing costs compared to the reference scenario.

Range: −∞ < FLI ≤ 1 [13].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Total electricity cost in a flexibility-controlled scenario

 = Total electricity cost in a reference scenario without flexibility control

 = Number of considered time intervals

Flexible Energy Efficiency

FEE

Measures how effectively a system utilizes flexible energy compared to its reference energy consumption. It refers to the system’s ability to manage energy use during Demand Response (DR) events, considering the “rebound effect” (i.e., when energy consumption increases after a reduction event to restore normal operating conditions). A higher FEE value indicates greater flexibility efficiency, meaning the system can better optimize energy use during flexible periods. Range: 0% ≤ FEE ≤ 100% [14].

 

 = Flexible energy, i.e., the energy used during periods when the system operates in flexible mode (for example, by optimizing consumption based on renewable resource availability or variable pricing) [kWh]

 = Reference or baseline energy, i.e., the energy consumed under normal or non-flexible operating conditions [kWh]

Note. This table presents the Environmental Key Performance Indicators (KPIs) used to evaluate the environmental, energy, and operational performance of smart infrastructures within a digital twin framework. Each KPI is defined with its acronym, description, and mathematical formulation for standardized and comparative analysis.
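
The grid-interaction and flexibility indicators in the table can be sketched in the same spirit. The Python fragment below is an illustration under stated assumptions: net grid exchange is given per interval in kWh, the flexibility factor is taken as the difference between low-price and high-price consumption divided by their sum, and the flexibility index as one minus the cost ratio. These forms are chosen because they reproduce the value ranges described in the table; all function names are hypothetical.

```python
import statistics
from typing import List, Sequence


def grid_interaction(net_exchange: Sequence[float]) -> List[float]:
    """Per-interval grid exchange as a percentage of the largest absolute exchange."""
    peak = max(abs(x) for x in net_exchange)
    if peak == 0:
        return [0.0 for _ in net_exchange]
    return [100.0 * x / peak for x in net_exchange]


def demand_response_percentage(p_base: float, p_dr: float) -> float:
    """DRS: positive when the demand response event reduced power."""
    return 100.0 * (p_base - p_dr) / p_base


def flexibility_factor(energy: Sequence[float], price: Sequence[float]) -> float:
    """FLF in [-1, 1]: 1 if all classified use is low-price, -1 if all is high-price."""
    q1, _, q3 = statistics.quantiles(price, n=4)  # daily price quartiles
    low = sum(e for e, p in zip(energy, price) if p <= q1)
    high = sum(e for e, p in zip(energy, price) if p > q3)
    if low + high == 0:
        return 0.0
    return (low - high) / (low + high)


def flexibility_index(cost_flex: float, cost_ref: float) -> float:
    """FLI: 0 means no saving, values near 1 large savings, negative an extra cost."""
    return 1.0 - cost_flex / cost_ref


# Hypothetical data: eight intervals of one day.
price = [10, 10, 20, 20, 30, 30, 40, 40]
energy = [3, 3, 1, 1, 1, 1, 1, 1]
flf = flexibility_factor(energy, price)
gii = grid_interaction([2.0, -4.0, 1.0, 0.0])
```

With these sample values most of the classified consumption falls in the cheapest quartile, so the flexibility factor is positive, and the negative entry in `gii` marks the interval in which energy was injected into the grid.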

 

 

3.2 Social and Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Assessment of Smart Urban and Industrial Infrastructures

 

The proposed set of Key Performance Indicators (KPIs) underpins the digital twin and metaverse platform described in the abstract, because it enables the assessment and optimization of smart urban and industrial infrastructures (Dovolil & Svítek, 2024; Barykin et al., 2023). The KPIs translate complex environmental phenomena into measurable values that can be processed, simulated, and optimized in real time (Englezos et al., 2022; Hadjidemetriou et al., 2023). This integration is consistent with the ESG (Environmental, Social, and Governance) evaluation framework, particularly its Environmental and Social dimensions (Shaharuddin et al., 2022). By focusing on KPIs that assess indoor environmental quality, energy efficiency, and user comfort, the platform supports evidence-based optimization of sustainable design, preventive maintenance, and energy-efficient building operation (Yitmen et al., 2025). Humidity (HUM) is an important KPI for assessing indoor environmental quality. It measures the water vapor content of the air as a percentage of the maximum the air can hold at a given temperature. Keeping relative humidity (RH) within its optimal range (40% to 60%) is critical for health and comfort: low humidity can cause airway irritation and electrostatic charges, whereas excess humidity promotes mold growth and material degradation. When RH measurement is integrated into the digital twin, algorithms can automatically regulate Heating, Ventilation, and Air Conditioning (HVAC) operation and, through forecast models, optimize ventilation and air conditioning (Lo, 2025). 
The result is thermal and hygrometric comfort achieved with lower energy use, which links HUM directly to both social well-being and environmental performance. Particulate Matter (PM10 and PM2.5) is another important environmental parameter. This KPI measures the airborne concentration of particles that can cause significant health problems, particularly in densely populated and industrialized areas. Continuous exposure to particulate matter can cause cardiovascular and pulmonary disease. Measurement inside buildings is used to assess the effectiveness of ventilation systems and to identify pollution sources. Integrating PM values into the proposed system supports the ESG “Social” dimension by protecting occupant health, and the “Environmental” dimension by promoting cleaner and more efficient air circulation (Saleh et al., 2025; Ariansyah et al., 2023). Volatile Organic Compound (VOC) concentration measures air pollution from harmful gases such as benzene, formaldehyde, and toluene, which are emitted by construction materials, cleaning agents, and interior furnishings. VOCs can significantly affect indoor air quality, comfort, and health, and it is recommended that concentrations not exceed 300 ppb to meet health standards. Integrating VOC measurement into the digital twin enables real-time responses: facility managers can trace the source, adjust ventilation rates, or switch to low-emitting materials (Yitmen et al., 2025; Venkateswarlu & Sathiyamoorthy, 2025). 
This preventive strategy improves indoor environmental quality and supports the ESG goal of social sustainability by reducing health risks and increasing occupant satisfaction. Air Changes per Hour (ACH) measures how many times per hour the air in an indoor space is completely replaced. A rate of 3 to 5 ACH ensures adequate ventilation for residential and office buildings. Continuous measurement and adjustment of ACH through the digital twin allows facility managers to regulate ventilation dynamically, ensuring safe, healthy air while conserving energy (Hadjidemetriou et al., 2023). ACH therefore contributes to both ESG dimensions: socially, by ensuring healthy ventilation for human well-being, and environmentally, by improving energy performance through systematic adjustment of ventilation rates (Hadjidemetriou et al., 2023). The Thermal Insulation Rate (THR), expressed as the R-value, quantifies the thermal resistance of construction materials, that is, how little heat they conduct. Higher insulation reduces heating and cooling loads, which supports the ESG environmental dimension by reducing emissions from energy use and the social dimension by maintaining comfortable temperatures without increasing costs (Englezos et al., 2022). The Sound Insulation Index (SND) rates the sound insulation of building elements such as walls, windows, and floors. Noise pollution is increasingly recognized for its effects on both mental and physical health. 
Measuring sound insulation inside buildings helps stakeholders assess acoustic comfort, particularly in densely populated urban areas. This KPI strengthens social sustainability by fostering environments suited to concentration, rest, and quality of life (Lo, 2025). Three energy-performance KPIs, the Energy Efficiency Ratio (EER), the Coefficient of Performance (COP), and System Efficiency (SEF), rate how well energy input is converted into useful energy services. They are particularly important for rating the performance of cooling and heating equipment. Higher values indicate more useful output per unit of power consumed, and tracking them improves the digital twin's ability to detect inefficiencies, predict system degradation, and schedule preventive maintenance (Venkateswarlu & Sathiyamoorthy, 2025). This supports the “Environmental” and “Economic” ESG spheres and, through affordability, the “Social” sphere as well. Energy Use Intensity (EUI) and Lighting Power Density (LPD) rate lighting energy use: EUI relates lighting energy consumption to the number of occupants, giving a per capita measure, while LPD relates lighting power to floor area. Together they capture the relationship between space, users, and energy use. 
Using digital twins with these data enables analyses such as simulation of different occupancy scenarios, optimization of lighting schedules, and adoption of intelligent lighting systems that adjust dynamically to user behavior (Yitmen et al., 2025). These enhancements lower energy losses and operational costs and align with the ESG framework from both environmental and social perspectives, given their benefits for well-being and resource distribution. Overall, integrating these KPIs into a digital twin and metaverse system provides a comprehensive framework for measuring, simulating, and improving sustainability and energy performance across urban and industrial infrastructures. Each KPI contributes to improving environmental, energy, or human comfort factors. Continuous monitoring of these parameters allows a shift from reactive to predictive governance, in which interventions depend on real-time conditions rather than fixed rules, in line with the ESG model's focus on innovation linked to sustainability and inclusion.
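
As an illustration of how some of these indoor environmental indicators could be evaluated against the thresholds mentioned above, the following Python sketch computes relative humidity, VOC concentration in ppb, air changes per hour, and the R-value from raw readings. It assumes standard textbook relations. The function names and sample readings are hypothetical, and the default molar volume of 24.45 L/mol is the value that applies at 25 °C and 1 atm.

```python
def relative_humidity(vapor_pressure_pa: float, saturation_pressure_pa: float) -> float:
    """HUM: actual vapor pressure as a percentage of saturation pressure."""
    return 100.0 * vapor_pressure_pa / saturation_pressure_pa


def voc_ppb(conc_mg_m3: float, molar_mass_g_mol: float,
            molar_volume_l_mol: float = 24.45) -> float:
    """Convert a VOC mass concentration (mg/m3) to a mixing ratio in ppb."""
    return 1000.0 * conc_mg_m3 * molar_volume_l_mol / molar_mass_g_mol


def air_changes_per_hour(airflow_m3_h: float, room_volume_m3: float) -> float:
    """ACH: airflow rate divided by the volume of the indoor space."""
    return airflow_m3_h / room_volume_m3


def r_value(thickness_m: float, conductivity_w_mk: float) -> float:
    """R-value in m2.K/W: material thickness divided by thermal conductivity."""
    return thickness_m / conductivity_w_mk


def within(value: float, low: float, high: float) -> bool:
    """True when the value lies inside the closed interval [low, high]."""
    return low <= value <= high


# Hypothetical sensor readings for one room.
readings = {
    "hum": relative_humidity(1170.0, 2339.0),
    "voc": voc_ppb(0.1, 30.03),  # formaldehyde, molar mass about 30.03 g/mol
    "ach": air_changes_per_hour(300.0, 75.0),
}

# Thresholds as stated in the text: RH 40-60 %, VOC below 300 ppb, 3-5 ACH.
status = {
    "hum_ok": within(readings["hum"], 40.0, 60.0),
    "voc_ok": readings["voc"] < 300.0,
    "ach_ok": within(readings["ach"], 3.0, 5.0),
}
```

A digital twin could run checks of this kind on each incoming sample and raise an alert, or adjust ventilation, whenever one of the flags in `status` turns false.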

 

 

Table xyz. Social Key Performance Indicators (KPIs) for Indoor Environmental Quality and Energy Efficiency Assessment

KPI

Acronym

Description

Formula

UoM

Relative Humidity

HUM

Indicates the amount of water vapor in the air relative to the maximum that can be contained at the same temperature.

The optimal relative humidity (RH) range for occupant comfort and health is between 40% and 60% [15].

 

 = Water vapor pressure [Pa]

 = Saturation vapor pressure [Pa]

%

PM Concentration (Particulate Matter - PM10 and PM2.5)

PM10 and PM2.5

Measures the amount of suspended particles (particulate matter) in the air, typically expressed in micrograms per cubic meter (µg/m³).

PM2.5 refers to particles with a diameter smaller than 2.5 micrometers, while PM10 refers to particles smaller than 10 micrometers.

Recommended long-term health thresholds are PM2.5 < 20 µg/m³ and PM10 < 50 µg/m³ [16].

 

 

 = Mass of particulate matter [µg]

 = Volume of air [m³]

µg/m³

Volatile Organic Compounds

VOC

Establishes the concentration of VOCs – such as benzene, formaldehyde, and other potentially harmful gases.

Elevated VOC levels can cause discomfort and health issues in occupants.

The indicated threshold is VOC < 300 ppb [17].

 

 

 = VOC concentration [mg/m³]

 = Molar mass of the VOC [g/mol]

 = Molar volume of an ideal gas, generally taken as 24.45 L/mol (at 25 °C and 1 atm)

ppb

Air Changes per Hour

ACH

Indicates the number of times the air within a space is completely renewed in one hour.

An air change rate between 3–5 ACH is considered adequate for residential buildings or office environments [18].

 

 = Airflow rate [m³/h]

 = Volume of the indoor space [m³]

1/h

Thermal Insulation Rate 

THR

Determines the thermal resistance of insulating materials, indicating how effectively they prevent heat loss.

A higher R-Value indicates better insulation performance [19].

 

 = Materials thickness [m]

λ = Thermal conductivity of the materials [W/m·K]

m²·K/W

Sound Insulation Index

SND

Evaluates the effectiveness of a building element in reducing sound transmission between two different spaces.

It is defined as the difference between the incident sound pressure level on a surface and the transmitted sound pressure level through it.

A higher R value indicates that walls, floors, or windows are more effective at blocking sound [20].

 

 = Incident sound pressure level [dB]

 = Transmitted sound pressure level [dB]

 = Equivalent absorption area [m²]

 = Separating surface area [m²]

dB

Energy Efficiency Ratio

EER

Measures the efficiency of an air conditioning system (air conditioners or cooling units). A higher EER indicates that the air conditioning system provides more cooling output for each unit of energy consumed, making it more efficient.

If EER ≥ 12, the system is considered efficient [21].

$EER = \frac{Q_c}{P_{el}}$

$Q_c$ = Total cooling capacity provided by the system [kW]

$P_{el}$ = Electrical power input consumed by the system [kW]

-

Coefficient of Performance

COP

An indicator similar to the EER, it can be used to evaluate efficiency in both cooling and heating modes.

It is commonly applied to heat pumps. A higher COP indicates that the system can produce a greater amount of useful energy (heating or cooling) for each unit of electrical energy consumed.

If COP ≥ 3.5, the system is considered efficient [22].

$COP = \frac{Q}{P_{el}}$

$Q$ = Heating or cooling capacity provided by the system [kW]

$P_{el}$ = Electrical input power consumed by the system [kW]

-

System Efficiency η

SEF

Measures how much of the energy used by the system is effectively converted into useful heating or cooling.

A high system efficiency means that a large portion of the consumed energy is actually transformed into useful thermal energy, minimizing losses.

If η ≥ 85%, the system is considered efficient [23].

$\eta = \frac{E_{useful}}{E_{total}} \times 100$

$E_{useful}$ = Useful energy delivered (cooling or heating capacity) [kWh]

$E_{total}$ = Total energy consumed (including system losses and auxiliary consumption) [kWh]

-

Energy Use Intensity based on people count 

EUI

Measures the energy consumption for lighting relative to the number of occupants in the building, reflecting energy efficiency in terms of per capita usage.

A high EUI indicates higher energy consumption for lighting per person, suggesting a lack of optimization.

Optimal values: EUI < 15 kWh/person/year [23].

$EUI = \frac{E_{light}}{N_{occ} \cdot t}$

$E_{light}$ = Energy consumed for lighting [kWh]

$N_{occ}$ = Number of occupants in the building

$t$ = Duration of lighting usage [year]

kWh/person/year

Lighting Power Density per floor area

LPD

Determines the power consumed by lighting per unit of floor area.

It serves as an indicator of lighting efficiency in relation to the utilized space.

A high LPD indicates greater power consumption per unit area, suggesting inefficient lighting design.

Optimal values: LPD < 10 W/m² [23].

 

 

$LPD = \frac{P_{light}}{A}$

$P_{light}$ = Power used for lighting [kW]

$A$ = Illuminated indoor area [m²]

kW/m²

Note. This table summarizes the Social and Environmental Key Performance Indicators (KPIs) used to assess indoor environmental quality, user comfort, and energy efficiency in smart infrastructures. Each KPI is defined by its acronym, description, and calculation formula, providing measurable parameters that support ESG-oriented evaluation and digital twin integration.
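Several of the indicators in this table are simple ratios, so they can be sketched in a few lines of Python. The block below is only an illustration under stated assumptions: the function and parameter names are hypothetical, the formula images are not available in this record, and the expressions follow the variable definitions and units listed in the table (with the 24.45 L/mol molar volume taken at 25 °C and 1 atm).

```python
# Illustrative helpers for a few indoor-environment and efficiency KPIs.
# All function and parameter names are hypothetical, not taken from the paper.

MOLAR_VOLUME_L_PER_MOL = 24.45  # ideal-gas molar volume at 25 °C and 1 atm


def voc_mg_m3_to_ppb(conc_mg_m3: float, molar_mass_g_mol: float) -> float:
    """Convert a VOC mass concentration [mg/m³] into a mixing ratio [ppb]."""
    ppm = conc_mg_m3 * MOLAR_VOLUME_L_PER_MOL / molar_mass_g_mol
    return ppm * 1000.0


def air_changes_per_hour(airflow_m3_h: float, volume_m3: float) -> float:
    """ACH: airflow rate divided by the volume of the indoor space [1/h]."""
    return airflow_m3_h / volume_m3


def thermal_resistance(thickness_m: float, conductivity_w_mk: float) -> float:
    """R-value of a homogeneous layer: thickness over conductivity [m²·K/W]."""
    return thickness_m / conductivity_w_mk


def coefficient_of_performance(capacity_kw: float, input_kw: float) -> float:
    """COP: delivered heating or cooling capacity per unit of electrical input."""
    return capacity_kw / input_kw


def lighting_power_density(power_kw: float, area_m2: float) -> float:
    """LPD: lighting power per unit of illuminated floor area [kW/m²]."""
    return power_kw / area_m2


if __name__ == "__main__":
    # Made-up inputs: formaldehyde (about 30.03 g/mol) at 0.1 mg/m³.
    print(voc_mg_m3_to_ppb(0.1, 30.03))
    print(air_changes_per_hour(1200.0, 300.0))
    print(thermal_resistance(0.10, 0.035))
```

The resulting values can then be compared with the thresholds quoted in the table, for example VOC < 300 ppb or an air change rate between 3 and 5.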

 

3.3 Governance Key Performance Indicators (KPIs) for ESG Evaluation in Digital Twin and Metaverse Applications

 

The selected Key Performance Indicators (KPIs) provide an integrated framework for evaluating the ESG performance of smart infrastructure, specifically for digital twin and metaverse applications that manage urban and industrial environments. Each KPI links technological innovation to sustainability by enabling real-time analysis and optimization of energy use, expenditure, and social impact. Taken together, the KPIs give a combined view of efficiency and equity. Within the ESG framework they bear directly on the environmental and economic perspectives and indirectly on Governance, mainly through accountability and shared decision-making (Wu et al., 2022; Zhang, 2025). The Cost of Energy Saving (CES) is a central KPI: it estimates the financial cost of each unit of energy saved through efficiency measures. It supports the evaluation of the cost-effectiveness and the investment-to-benefit ratio of environmental interventions (Dovolil & Svítek, 2024). CES is relevant to the environmental domain, where it helps identify cost-optimal strategies for reducing energy waste and emissions, and to Governance, where it supports financial accountability and forward-looking planning of financial resources. The Energy Return on Investment (EROI) is calculated as the ratio of the energy output of a system to the energy invested in it.
EROI matters for the environmental domain because a higher EROI means that the system delivers more energy than it consumes (Hämäläinen, 2020), which indicates better use of energy resources. It also supports transparent evaluation of energy system efficiency and strategic decisions that maximize energy output without depleting resources (de Trizio et al., 2024). The Energy Payback Time (EPBT) complements EROI: it is the time a system needs to recover the energy invested in its construction, setup, and maintenance. From an ESG perspective, EPBT is used to evaluate the life-cycle sustainability of energy systems (Hu, 2023). In the digital twin environment, EPBT helps compare simulation scenarios and rank the sustainability of different energy technologies, which strengthens the use of transparent data in governance processes. The Cost of Peak Demand (CPD) measures the cost of peak electricity demand over a given period. CPD matters both environmentally and economically, since reducing peak loads relieves energy networks and avoids additional generation from higher-emission fossil fuels (Aghazadeh Ardebili et al., 2025). The Cumulative Cash Flow (CCF) combines financial and environmental considerations by comparing the total cash flow of an energy project with its investment cost. It supports governance by expressing financial transparency and helping assess future risk (Hien & Hanh, 2024).
A positive CCF indicates that the investment pays back financially while also delivering resource savings. The Share of Project Cost Subsidized (SPC) measures the extent to which a project is financed by grants. It has a dual ESG reading: it describes the financial attractiveness of sustainable investment, and it captures the social benefit of making sustainable technology accessible to small players from developing communities (Wu et al., 2022). Renewable Energy Use (REU) estimates the share of energy that comes from renewable sources. A high REU signals a strong commitment to sustainability in a project that would otherwise rely on fossil fuels (Becattini et al., 2024). Digital twin technology helps here by monitoring energy use across different scenarios (Wei, 2023). The Energy Use per Worker Hour (EPWH) relates energy use to labor input and can be read across different labor productivity scenarios (Zhang et al., 2023). Socially, a low EPWH indicates production that is not energy-intensive relative to the workforce. On a digital twin platform, EPWH supports modeling of the balance between workforce and energy use, particularly in labor-intensive industries (Englezos et al., 2022). Together, these indicators translate broad sustainability objectives into specific, quantified, and tractable information for a digital twin model.
The indicators extend the proposed digital twin framework to both real-time monitoring and, through simulation, forecasting of future ESG performance. They cover environmentally responsible operation (EROI, REU, EPBT), economic soundness (CES, CCF, CPD, SPC), and social responsibility (EPWH), which gives the platform a balanced basis for ESG assessment.

 

Table xyz. Governance Key Performance Indicators (KPIs) for ESG Evaluation within Digital Twin Frameworks

KPI

Acronym

Description

Formula

UoM

Cost of Energy Saving

CES

Measures the cost associated with energy savings achieved through energy efficiency interventions.

This parameter is particularly useful for comparing different investment options in terms of efficiency, as it estimates how much it costs to save one unit of energy (e.g., 1 kWh) through technological or operational measures.

The CES formula is structured to calculate the total cost of energy savings and divide it by the amount of energy saved, accounting for system inefficiencies.

A higher CES indicates a greater cost per unit of energy saved, suggesting that the intervention may be less cost-effective compared to other alternatives.

Conversely, a lower CES means a lower cost per unit of energy saved, making the energy efficiency measure more economically advantageous [24].

 

 

 = Change in initial investment. Represents the amount of capital required to implement the energy efficiency measure [€]

 = Change in operating costs. Includes expenses related to the operation and maintenance of the energy efficiency measure [€]

 = Energy price. Represents the cost per unit of energy, which can influence the savings achieved by the measure [€/kWh]

 = Change in energy consumption. Indicates the amount of energy saved as a result of the intervention [kWh]

 = Energy loss (or efficiency) factor associated with losses that may occur during the energy use process. It may include heat losses or other system inefficiencies [–]

$CRF$ = Capital Recovery Factor. Used to calculate the annualized cost of the investment and determine how much an investment must generate each year to be recovered over time [-]

$CRF = \frac{i(1+i)^n}{(1+i)^n - 1}$

$i$ = Interest rate [-]

$n$ = Amortization period [years].

[€/kWh]

Energy Return on Investment

EROI

Evaluates the energy efficiency of a production source by measuring how much energy is obtained compared to how much energy is invested to produce it. It is a key indicator of energy sustainability: the higher the EROI, the more efficient the system.

If EROI > 1, the energy process is sustainable, as the energy produced exceeds the energy invested.

If EROI = 1, the energy produced is exactly equal to the energy invested, meaning the system is at the limit of sustainability and produces no usable net energy.

If EROI < 1, the system is inefficient, since it requires more energy than it generates. Such a process is neither economically nor energetically sustainable in the long term.

This indicator answers the question: “How efficient is the energy investment?” [25].

 

$EROI = \frac{\sum_i \alpha_i E_{out,i}}{\sum_j \beta_j E_{in,j}}$

$E_{out,i}$ = Total outgoing or produced energy from process i. This may include, for example, the electricity generated by a power plant or the fuel produced by a refinery [kWh].

$E_{in,j}$ = Total incoming or consumed energy for process j. This may include the energy required to extract, transform, or transport the energy source [kWh].

$\alpha_i$ and $\beta_j$ = Scaling factors that can represent the quality of energy. For instance, they may be used to assign greater or lesser importance to certain forms of energy or technologies [–].

[-]

Energy Payback Time

EPBT

Measures the time required for an energy system to produce the same amount of energy that was needed to build, install, and maintain it.

If EPBT is high, it takes longer for the system to return the energy invested. Conversely, if EPBT is low, the energy system quickly recovers the energy used for its construction and startup.

It is an indicator that answers the question: “How long does it take for the system to repay the energy invested?” [26].

$EPBT = \frac{E_{inv}}{E_{annual}}$

$E_{inv}$ = Total invested energy required to build, install, maintain, and decommission the energy system throughout its life cycle [kWh].

$E_{annual}$ = Amount of energy that the system is capable of producing annually once it is operational [kWh/year].

[year]

Cost of Peak Demand

CPD

Measures the cost associated with the peak electricity demand over a given period.

A lower CPD is desirable, as it indicates effective management and reduced exposure to energy costs [27].

 

$CPD = P_{max} \cdot c_P$

$P_{max}$ = Maximum power demand during a given period [kW].

$c_P$ = Cost associated with each unit of power [€/kW].

[€]

Cumulative Cash Flow

CCF

Measures the total cash flow generated by the project in relation to the initial investment.

The CCF is useful for investors and decision-makers, as it helps assess a project's profitability, compare different investments, and plan future financial needs and returns on investment.

A CCF > 0 indicates that the project is generating more cash flow than the costs incurred, while a CCF < 0 indicates a loss [24].

$CCF = \sum_{k=1}^{TL} FES_k \cdot ECC_k - IC$

$FES_k$ = Final Energy Savings in period k, i.e. the final energy savings achieved through energy efficiency measures or other strategies [kWh].

$ECC_k$ = Energy Carrier Cost, i.e. the cost of energy per unit during period k. This may include costs for purchasing or using energy such as electricity, gas, etc. [€/kWh].

$TL$ = Technical Life, i.e. the project period during which energy savings and economic benefits are expected [years].

$IC$ = Investment Cost. It includes all expenses necessary to implement the project, such as installation, equipment, and other preliminary costs [€].

[€]

Share of Project Cost Subsidized

SPC

Indicates the proportion of the total project cost that has been financed through grants.

A high SPC means that a significant portion of the project has been funded through external aid, while a low SPC suggests that the project has been mainly self-financed.

SPC = 0% when no grants have been received (RS = 0), meaning no part of the project costs is subsidized.

SPC = 100% when the entire project cost is covered by grants (RS = IC), meaning the entire project is subsidized.

Range: 0% ≤ SPC ≤ 100% [28].

$SPC = \frac{RS}{IC} \times 100$

$RS$ = Received Subsidies, meaning the total amount of grants or funding received for the project [€].

$IC$ = Investment Cost, meaning the total investment cost [€].

[%]

Renewable Energy Use

REU

Provides a measure of the proportion of final energy savings that comes from renewable sources compared to all energy sources used.

It is useful for energy policies and environmental assessments, as it helps quantify and compare the impact of different energy sources on overall sustainability and efficiency.

A higher REU indicates greater use of renewable energy, while a lower REU suggests a higher dependence on fossil fuels.

Range: 0% ≤ REU ≤ 100% [28].

$REU = \frac{\sum_k FES_k \cdot CF_k \cdot RES_k}{\sum_k FES_k \cdot CF_k} \times 100$

$FES_k$ = Final Energy Savings for each energy source k. Indicates the final energy savings achieved from that specific source [kWh].

$CF_k$ = Conversion Factor for each energy source k. This factor is used to convert the saved energy into a common unit, allowing comparison among different sources [-].

$RES_k$ = Renewable Energy Source factor for each energy source k, which accounts for the sustainability of the source. This value varies depending on the type of energy:

·          0 for fossil fuels, indicating they do not contribute to sustainable energy production [-]

·          1 for renewable sources such as biomass, wind, solar, and other renewables, as they are considered sustainable [-]

·          A value between 0 and 1 for mixed sources, such as industrial waste or end-of-life tires, depending on the sustainability level of the source [-]

[%]

Energy Use per Worker-Hour

EPWH

Measures the total energy used by a production system in relation to the number of human resources and working time.

It calculates the energy used per working hour, taking into account the total supplied energy minus the imported one, and normalizing the result by the number of workers and the annual working hours.

This indicator is useful for evaluating the energy efficiency of an organization or an entire economy, allowing comparisons over time or between different sectors or countries.

A low EPWH is considered positive, as it indicates higher productivity with lower energy use, suggesting a more sustainable use of energy resources.

Conversely, a high EPWH may indicate energy inefficiency, potentially linked to poorly optimized production processes, outdated machinery, or energy-intensive technologies [29].

 

 = Total Primary Energy Supply, i.e., the total amount of primary energy supplied, including all available energy sources [kWh].

 = Population number, meaning the total number of individuals within the studied population.

  = Total number of working hours per person per year [hours/year].

 = Industrial Primary Energy Supply, meaning the portion of TPES specifically used in the industrial sector [kWh].

 

 = Industrial Final Consumption, referring to the final energy consumption by the industrial sector [MWh].

 = Total Final Consumption, referring to the total final energy consumption within a given economic system, including the industrial, residential, tertiary, and transport sectors [MWh].

MJ / (inhabitant · hour/year)

Note: This table summarizes the Governance Key Performance Indicators (KPIs) used for ESG evaluation within digital twin frameworks. The listed indicators quantify economic efficiency, financial accountability, and strategic resource management, enabling transparent decision-making and long-term sustainability assessment. These variables collectively support the “Governance” dimension of ESG by linking economic performance with responsible investment, policy transparency, and data-driven management.
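As a rough illustration of how some of the governance indicators could be evaluated, the sketch below implements the Capital Recovery Factor in its standard form together with EPBT, CCF, SPC, and REU. The formula images are not available in this record, so the CCF and REU expressions are assumptions inferred from the variable definitions in the table (FES, ECC, IC, RS, CF, RES), and all function names are hypothetical.

```python
# Hypothetical helpers for selected governance KPIs; the CCF and REU forms
# are assumed from the variable definitions, not copied from the paper.

def capital_recovery_factor(rate: float, years: int) -> float:
    """Standard CRF: annual payment per unit of capital over `years`."""
    if rate == 0:
        return 1.0 / years
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def energy_payback_time(invested_kwh: float, annual_output_kwh: float) -> float:
    """EPBT in years: invested energy divided by yearly energy output."""
    return invested_kwh / annual_output_kwh


def cumulative_cash_flow(savings_kwh, prices_eur_kwh, investment_eur):
    """Assumed CCF: sum over periods of FES_k * ECC_k, minus IC [EUR]."""
    income = sum(s * p for s, p in zip(savings_kwh, prices_eur_kwh))
    return income - investment_eur


def share_subsidized(subsidies_eur: float, investment_eur: float) -> float:
    """SPC in percent: received subsidies over total investment cost."""
    return 100.0 * subsidies_eur / investment_eur


def renewable_energy_use(sources) -> float:
    """Assumed REU in percent; `sources` holds (FES_k, CF_k, RES_k) tuples."""
    total = sum(fes * cf for fes, cf, _ in sources)
    renewable = sum(fes * cf * res for fes, cf, res in sources)
    return 100.0 * renewable / total


if __name__ == "__main__":
    # Made-up project: 5 years of 10,000 kWh savings at 0.20 EUR/kWh.
    print(capital_recovery_factor(0.05, 10))
    print(cumulative_cash_flow([10_000] * 5, [0.20] * 5, 12_000))
    print(renewable_energy_use([(100, 1.0, 1.0), (100, 1.0, 0.0), (50, 1.0, 0.5)]))
```

With these made-up inputs the cash flow is negative, which under the table's reading (CCF < 0) would mean the investment is not recovered within the five periods considered.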

 

In addition to the KPIs listed above, the following context variables are recorded for the system and used for normalization:

  • Area (Area_m² – AREA): the total floor area, in square meters, of the building or infrastructure facility for which the energy and environmental indices are computed.
  • Energy Consumption (Energy_Consumption_kWh – ENCO): the total energy consumption during the period under review, in kilowatt-hours. It is the base quantity from which comparative energy performance indicators are derived.
  • Occupants (OCC): the number of people using or occupying a given space. It enables per capita calculations of energy use and environmental factors.

These normalizing variables make performance comparable across different buildings, facilities, and operating conditions, which improves the robustness of the KPI system.

 

  1. Descriptive Statistical Analysis of the KPI Dataset for the Validation of a Digital Twin and Metaverse Prototype for Smart Buildings

 

The descriptive statistics show how varied the Key Performance Indicators (KPIs) are across the environmental, energy, and operational dimensions of smart buildings and infrastructures, in line with studies that stress the role of KPI frameworks in building management (Faria et al., 2021; Alrashed, 2020). The average floor area (AREA) of the sites is about 9,637 m², with high variability (SD above 5,200 m²): small buildings coexist with large ones, up to more than 19,000 m². Energy consumption (ENCO) averages about 981,000 kWh, again with wide variation, so the dataset includes both energy-intensive and optimized buildings (Bandoria et al., 2024; Koutras et al., 2023). The Carbon Footprint (CFPT) averages 296 tCO₂e, a considerable level of emissions that is plausible given the size of the buildings. The Emission Intensity (EMIN) averages 0.081 tCO₂/kWh, which is read as consistent with energy optimization strategies for smart infrastructure (Ho et al., 2021). The Load Cover Factor (LCF) and Supply Cover Factor (SCF) average 0.81 and 0.814, respectively, meaning that roughly 80% of the energy is covered by on-site production or utilization.
The Load Matching Index (LMI) averages 71.7%, indicating good synchronization between energy production and demand, and the On-site Energy Ratio (OER) averages 0.75, reflecting substantial on-site production and a marked degree of autonomy (Mustapha et al., 2025; Kumar et al., 2024). The Grid Interaction Index (GII) and No Grid Interaction Probability (NGI) average 47% and 0.47, respectively, which points to a balance between autonomy and exchange with the grid. The capacity and peak-load indices are less conclusive about operating performance. The Capacity Factor (CAF) averages 0.54, so slightly more than half of the installed capacity is actually used, while One Percent Peak Power (OPP) averages 584 kW, indicating periods of significant peak load. The flexibility and demand response indicators (DRS, FLF, FLI, FEE) sit at intermediate levels: Demand Response (DRS) averages 9%, which leaves some room for load reduction or time shifting, and Flexible Energy Efficiency (FEE) averages about 49%, which leaves scope for improvement in dynamic energy use (Romanska-Zapala et al., 2020). Indoor conditions are stable and meet comfort requirements. Average humidity (HUM) is 49%, well inside the comfort range. Particulate Matter averages 11.2 µg/m³ for PM₂.₅ and 24.6 µg/m³ for PM₁₀, below the long-term thresholds of 20 µg/m³ and 50 µg/m³ adopted in the KPI table, so indoor air quality is satisfactory (Haka-wati et al., 2024).
Volatile Organic Compound (VOC) concentration averages 186 ppb with significant variability, which may reflect differences in building materials and ventilation effectiveness. The air change rate (ACH) averages 4, within the rates recommended for non-industrial buildings (Mustapha et al., 2025). Thermal and acoustic performance is also acceptable: the Thermal Insulation Rate (THR) averages 2.93 m²·K/W and the Sound Insulation Index (SND) averages 43 dB (Mustapha et al., 2025). For the energy subsystems, EER, COP, and SEF average 10.3, 2.86, and 87.5%, respectively; SEF exceeds the 85% efficiency threshold, whereas the mean EER and COP remain below the thresholds of 12 and 3.5 given in the KPI table. The Energy Use Intensity per person (EUI) averages 16.9 kWh/person/year, slightly above the optimal value of 15, and the Lighting Power Density (LPD) averages 0.008 kW/m² (8 W/m²), below the 10 W/m² optimum (Arias-Requejo et al., 2023). The economic indicators vary more widely. The Cost of Energy Saving (CES) averages 11.45 €/kWh and the Energy Return on Investment (EROI) averages 14.79, both with large standard deviations and strongly right-skewed distributions. The Energy Payback Time (EPBT) averages 4.9 years, an acceptable energy recovery time (Haka-wati et al., 2023). The mean Cumulative Cash Flow (CCF) is negative, meaning that on average the projects do not fully recover their costs, while the Share of Project Cost Subsidized (SPC) averages 35%, indicating substantial reliance on grants. Renewable Energy Use (REU) averages 64%, indicating strong integration of clean energy, and Energy Use per Worker Hour (EPWH) averages 39.8 MJ, suggesting that energy productivity can still improve (Kumar et al., 2024).

 

Table xyz. Descriptive Statistics of the KPI Dataset for the Validation of a Digital Twin and Metaverse Prototype Applied to Smart Buildings.

| Variable | Obs | Mean | Std_Dev | Min | Max | p1 | p99 | Skew | Kurt |
|---|---|---|---|---|---|---|---|---|---|
| AREA | 100 | 9637.3 | 5249.252 | 1161 | 19942 | 1175 | 19694 | .167 | 1.959 |
| ENCO | 100 | 981000 | 562000 | 63556.65 | 1970000 | 72951.46 | 1960000 | .11 | 1.79 |
| CFPT | 100 | 295.725 | 130.658 | 52.28 | 495.52 | 53.21 | 491.685 | -.275 | 1.887 |
| EMIN | 100 | .081 | .039 | .022 | .149 | .022 | .148 | -.017 | 1.765 |
| LCF | 100 | .811 | .125 | .604 | .997 | .604 | .996 | -.146 | 1.722 |
| SCF | 100 | .814 | .119 | .606 | 1 | .609 | 1 | -.069 | 1.784 |
| LMI | 100 | 71.682 | 13.711 | 51 | 99.33 | 51.415 | 99.225 | .383 | 1.981 |
| OER | 100 | .753 | .25 | .33 | 1.191 | .339 | 1.18 | .035 | 1.729 |
| GII | 100 | 47.038 | 29.213 | .46 | 99.69 | .885 | 99.085 | .104 | 1.86 |
| NGI | 100 | .469 | .281 | .011 | .984 | .012 | .966 | .076 | 1.823 |
| CAF | 100 | .541 | .312 | .018 | .998 | .019 | .995 | -.123 | 1.666 |
| OPP | 100 | 584.406 | 263.627 | 105.75 | 995.42 | 116.035 | 989.245 | -.254 | 1.724 |
| DRS | 100 | 9.006 | 11.919 | -9.61 | 29.88 | -9.575 | 29.675 | .073 | 1.825 |
| FLF | 100 | .045 | .584 | -.939 | .993 | -.938 | .984 | -.072 | 1.735 |
| FLI | 100 | .27 | .445 | -.493 | .999 | -.492 | .99 | -.139 | 1.815 |
| FEE | 100 | 49.036 | 26.92 | .76 | 98.78 | 1.29 | 97.535 | -.023 | 1.98 |
| OCC | 100 | 412.27 | 225.185 | 50 | 933 | 61 | 927 | .387 | 2.307 |
| HUM | 100 | 49.463 | 7.495 | 25 | 73.7 | 27.25 | 70.65 | -.078 | 4.539 |
| PM25 | 100 | 11.233 | 4.714 | 3 | 22.3 | 3 | 21.85 | .274 | 2.341 |
| PM10 | 100 | 24.617 | 9.179 | 8 | 42.9 | 8 | 42.65 | .074 | 2.285 |
| VOC | 100 | 186.01 | 87.096 | 20 | 383 | 20 | 371 | -.163 | 2.445 |
| ACH | 100 | 4.051 | .795 | 2.25 | 6.05 | 2.285 | 5.82 | .043 | 2.616 |
| THR | 100 | 2.934 | .859 | .8 | 5.5 | .97 | 5.025 | .099 | 2.921 |
| SND | 100 | 43.343 | 6.227 | 30 | 61.6 | 30.8 | 60.3 | .278 | 2.962 |
| EER | 100 | 10.34 | 1.169 | 7.18 | 13.03 | 7.545 | 12.885 | -.158 | 2.72 |
| COP | 100 | 2.857 | .368 | 2.2 | 3.59 | 2.2 | 3.59 | .055 | 2.287 |
| SEF | 100 | 87.511 | 4.892 | 72.2 | 97.3 | 74.4 | 97.2 | -.436 | 3.155 |
| EUI | 100 | 16.932 | 3.683 | 7.5 | 25.4 | 8.6 | 25.35 | .019 | 2.616 |
| LPD | 100 | .008 | .002 | .005 | .012 | .005 | .012 | .22 | 2.318 |
| CES | 100 | 11.453 | 25.527 | .019 | 213.237 | .02 | 146.411 | 5.406 | 40.749 |
| EROI | 100 | 14.79 | 21.237 | .193 | 121.655 | .224 | 114.719 | 3.32 | 14.856 |
| EPBT | 100 | 4.91 | 11.729 | .08 | 86.67 | .09 | 79.575 | 5.544 | 35.698 |
| CPD | 100 | 141000 | 73729.43 | 14691.18 | 298000 | 15023.01 | 298000 | .232 | 2.306 |
| CCF | 100 | -420000 | 785000 | -1780000 | 2390000 | -1760000 | 2050000 | .644 | 3.843 |
| SPC | 100 | 34.946 | 20.902 | .25 | 69.89 | .405 | 69.885 | .019 | 1.789 |
| REU | 100 | 64.338 | 13.584 | 30.98 | 95.58 | 34.64 | 92.81 | -.066 | 2.344 |
| EPWH | 100 | 39.763 | 45.06 | .302 | 229.515 | .337 | 189.341 | 1.5 | 5.199 |

Note. This table presents the descriptive statistical parameters of the Key Performance Indicator (KPI) dataset developed to support the validation of a prototypal Digital Twin and Metaverse model for Smart Building management. The dataset integrates environmental, energy, operational, and governance-related variables, enabling the characterization of heterogeneous building typologies and operational conditions. The statistical descriptors (mean, standard deviation, minimum, maximum, skewness, and kurtosis) provide a quantitative overview of variability and distribution, essential for model calibration, simulation accuracy, and data-driven performance validation within the digital twin environment.
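For readers who want to reproduce descriptors of this kind, the following sketch computes the mean, sample standard deviation, range, skewness, and kurtosis of one variable using only the Python standard library. It assumes that the Skew and Kurt columns are moment-based and that kurtosis is reported in its non-excess form (a normal distribution scores 3), which the all-positive values in the table suggest but the text does not state. The percentile columns (p1, p99) are omitted because their interpolation rule is not specified. The function name `describe` is hypothetical.

```python
import math


def describe(values):
    """Moment-based descriptive statistics for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with a 1/n divisor, used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "obs": n,
        "mean": mean,
        "std_dev": math.sqrt(m2 * n / (n - 1)),  # sample standard deviation
        "min": min(values),
        "max": max(values),
        "skew": m3 / m2 ** 1.5,
        "kurt": m4 / m2 ** 2,  # non-excess kurtosis (normal = 3)
    }


if __name__ == "__main__":
    # Evenly spread toy data: skewness is 0 and kurtosis is close to 1.8.
    print(describe(list(range(1, 101))))
```

Under this convention, evenly spread data give a skewness near 0 and a kurtosis near 1.8, which is the pattern shown by many rows of the table (for example EMIN, LCF, and SPC).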

 

 

 

  1. Validation Framework and Data Reliability for ESG-Based Smart Building Model

 

The image illustrates the validation framework for an ESG (Environmental, Social, Governance) Smart Building model, outlining a methodological process divided into four main phases.

 

Figure 1. Validation Framework for ESG-Based Smart Building Model. This framework validates and structures ESG data for Smart Building applications, combining statistical and machine learning methods to ensure data reliability and predictive accuracy. The validated dataset supports testing and prototyping of a management system that integrates metaverse and digital twin technologies for advanced, real-time smart building management.

 

 

The process starts with data preparation and structuring: the data on environmental, social, and governance indicators are collected, normalized, and organized into three analytic blocks, and screened for outliers to ensure data quality for the subsequent analysis. The second phase applies correlation analysis and Principal Component Analysis (PCA), which is used to identify latent components and check structural consistency. The third phase fits an Ordinary Least Squares (OLS) linear regression for each of the environmental, social, and governance blocks, with AREA as the dependent variable. The Variance Inflation Factor (VIF) is used to check for multicollinearity among the predictors, and the coefficient of determination and the degrees of freedom are reported. The fourth phase applies machine learning algorithms to improve predictive analysis, comparing Boosting, Decision Tree, K-Nearest Neighbors (KNN), Random Forest, Regularized linear regression, and Support Vector Machine models separately for each block. The procedure is designed so that the validated data can be used for testing during the design of a management system that combines the metaverse and a digital twin. Once the structural consistency of the data is established, the dataset can support an interactive and immersive digital environment for intelligent environmental management.
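The data preparation phase can be sketched as follows. The text does not say which normalization or outlier rule is applied, so the z-score standardization and the 1.5 × IQR rule used here are assumptions chosen for illustration, and the function names are hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    # Made-up readings for a single indicator, with one implausible value.
    readings = [10, 12, 11, 13, 12, 11, 10, 95]
    print(iqr_outliers(readings))
    print(z_scores([1.0, 2.0, 3.0]))
```

Standardizing each indicator before the correlation and PCA steps keeps variables with large numeric ranges, such as ENCO or CPD, from dominating those with small ranges, such as EMIN or LPD.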

 

 

  1. Scientific Validation of ESG Data through Correlation Analysis for Smart Building Prototyping

 

 

The correlation matrix is used here as a validation technique for the database underlying the analysis of the ESG components. Correlation analysis is a standard statistical approach for assessing the internal consistency of a dataset, since it shows whether the investigated factors are positively or negatively related and how strongly. For the ESG factors, the analysis indicates that each factor captures a distinct, largely non-overlapping aspect of sustainability, which supports the internal consistency of the data structure. In the context of smart building implementation, this matters because it validates the quality of the data that flow into the digital management system. The correlations among the energy consumption and environmental emission factors are moderate, which indicates that the dataset is multidimensional rather than redundant. This provides a sound basis for the subsequent PCA and regression analyses and supports the integration of the data into the digital twin and metaverse environment. Three correlation analyses are carried out, one for each ESG dimension: the Environmental factor (E), the Social factor (S), and the Governance factor (G).
Considering the three dimensions separately shows how each block of the data is structured internally.

 

 

 

5.1 Correlation Analysis and Validation of Environmental (E) Factors in the ESG Framework

 

 

 

The environmental factor in the ESG framework refers to operational characteristics related to energy, emissions, and environmental issues. The correlation matrix for the environmental factors (AREA, CFPT, ENCO, EMIN, LCF, SCF, LMI, OER, GII, NGI, OPP, DRS, FLF, FLI, FEE) allows initial checks for internal dataset coherence and multicollinearity. The correlations range from weak to moderate, which indicates that similar factors are not measured twice (Wang, 2024; Eskantar et al., 2024). This supports the construct validity of the environmental elements, since the indicators cover a wide range of factors with little overlap (Handoko, Afifudin, & Holili, 2024). AREA, which relates to the asset's actual size, shows negligible correlations with the other factors. Its slight negative correlations with energy use (ENCO, -0.06) and Carbon Footprint (CFPT, -0.04) indicate that larger areas do not necessarily lead to greater energy use and emissions (Hou et al., 2025). The positive but very small correlation between Load Cover Factor (LCF) and AREA (0.05) is compatible with larger buildings handling load slightly better, though the relationship is too weak to support that conclusion on its own. CFPT is negatively correlated with both energy use (ENCO, -0.14) and Emission Intensity (EMIN, -0.23). The negative relationship between CFPT and ENCO may seem paradoxical, but it could reflect differences in the use of cleaner energy across organizations (Zhou, 2024). The negative association between CFPT and EMIN implies that when total emissions are higher, Emission Intensity tends to be lower, suggesting either that larger organizations use different energy resources at scale or that technological efficiencies account for the better results (Du et al., 2024).
These weak relationships show that emissions are not determined solely by the quantity of energy used, so it is valid to keep CFPT and EMIN as distinct factors. ENCO likewise has only weak correlations with the other environmental factors. Its weak negative correlations with LCF (-0.12) and SCF (-0.22), the Load and Supply Cover Factors, imply that greater energy use does not necessarily correspond to better load coverage or supply adequacy, so quantity and management efficiency remain separate (Wang, 2024). This strengthens the theoretical basis for modeling, in which operational intensity and efficiency are separate dimensions. The correlations for LCF, SCF, LMI, and OER, the factors that indicate energy balance and autonomy, show an internal logical structure. For example, LCF is positively correlated with EMIN (0.18) and LMI (0.25), which suggests that systems with greater load coverage tend to show better load matching. The positive relation between LCF and EMIN may at first appear contradictory: greater intensity could indicate inefficiency, yet it could also indicate systems running at or near full capacity, where greater loads temporarily raise intensity. The slight positive relationship between LCF and OER (On-site Energy Ratio, 0.09) is consistent with greater load coverage accompanying greater on-site production, a sensible practice for sustainable system design (Dovolil & Svítek, 2024). GII and NGI, which describe interaction with the power grid, show slight negative or weak correlations with almost all other factors. This is plausible: systems that depend more on the power grid (greater NGI, lower GII) do not necessarily show greater efficiency (FEE) or flexibility (FLI) (Zhou, 2024; Wang, 2024).
These slight correlations indicate that grid interaction is an autonomous domain within the environmental dimension, which suggests that the dataset covers most aspects of the environment, from production to system administration (Eskantar et al., 2024; Hou et al., 2025). The flexibility factors (FLF, FLI, FEE) show only slight correlations with each other, so flexibility and efficiency remain largely autonomous in the analysis. The slight positive correlation between FLF and AREA (0.105) suggests that larger systems display marginally greater flexibility, while the correlation between FEE and LCF (0.01) is effectively zero. The different dimensions of the environment (structure, operation, and efficiency) are therefore only partially related (Eskantar et al., 2024; Hou et al., 2025). Overall, the correlations support the dataset's validity. The low to moderate values indicate that the environmental factors explore different but complementary dimensions of sustainability without considerable redundancy. This strengthens the basis for further analysis, such as PCA, which will in turn support the interpretation of the factorial structure underlying the environmental dimension and a meaningful combination of indicators (Handoko et al., 2024; Wang, 2024).

 

 

Table xyz. Correlation Matrix for Environmental (E) Factors in the ESG Model

 

Variables     AREA     CFPT     ENCO     EMIN      LCF
AREA        1.0000  -0.0382  -0.0608  -0.0678   0.0483
CFPT       -0.0382   1.0000  -0.1416  -0.2254  -0.0229
ENCO       -0.0608  -0.1416   1.0000  -0.0344  -0.1235
EMIN       -0.0678  -0.2254  -0.0344   1.0000   0.1844
LCF         0.0483  -0.0229  -0.1235   0.1844   1.0000
SCF        -0.0142  -0.0214  -0.2180  -0.1927  -0.1126
LMI         0.0376   0.0284  -0.0592  -0.0165   0.2509
OER         0.0432   0.0155  -0.1793   0.0205   0.0918
GII        -0.0380   0.0052  -0.2230   0.0543  -0.0519
NGI        -0.0188   0.0250  -0.0573  -0.0805  -0.0523
OPP        -0.1248   0.0472   0.1331   0.2376  -0.0651
DRS        -0.0577   0.1073  -0.1592  -0.0992  -0.1351
FLF         0.1050  -0.1392  -0.0412   0.0490  -0.0770
FLI         0.0023   0.1272  -0.0822   0.0331  -0.0335
FEE         0.0965  -0.0327  -0.0738  -0.0678   0.0085

 

Note: The table presents the correlation coefficients among the environmental indicators used within the ESG framework. The low to moderate correlation values confirm that the variables are largely independent and represent distinct aspects of environmental performance, such as energy use, emissions, and operational efficiency. This statistical consistency validates the internal coherence of the dataset and ensures its suitability for advanced modeling techniques, including PCA and regression analysis. The results further demonstrate that the data are appropriate for use in the prototyping and testing of smart building management systems based on digital twin and metaverse technologies.
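
As a minimal sketch of the kind of check described in the note, the following Python snippet takes the 5 × 5 block (AREA, CFPT, ENCO, EMIN, LCF) of the table above, verifies that it has the form of a correlation matrix, and ranks the variable pairs by absolute correlation. The function names are illustrative assumptions and are not part of the authors' pipeline; only the coefficients come from the table.

```python
# Coefficients copied from the AREA..LCF block of the environmental table.
LABELS = ["AREA", "CFPT", "ENCO", "EMIN", "LCF"]
R = [
    [1.0000, -0.0382, -0.0608, -0.0678, 0.0483],
    [-0.0382, 1.0000, -0.1416, -0.2254, -0.0229],
    [-0.0608, -0.1416, 1.0000, -0.0344, -0.1235],
    [-0.0678, -0.2254, -0.0344, 1.0000, 0.1844],
    [0.0483, -0.0229, -0.1235, 0.1844, 1.0000],
]


def is_correlation_block(matrix, tol=1e-9):
    """Check squareness, unit diagonal, symmetry, and entries within [-1, 1]."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    for i in range(n):
        if abs(matrix[i][i] - 1.0) > tol:
            return False
        for j in range(n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
            if abs(matrix[i][j]) > 1.0 + tol:
                return False
    return True


def ranked_pairs(labels, matrix):
    """Return (label_i, label_j, r) for i < j, sorted by |r| in descending order."""
    pairs = [
        (labels[i], labels[j], matrix[i][j])
        for i in range(len(labels))
        for j in range(i + 1, len(labels))
    ]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


if __name__ == "__main__":
    print("valid block:", is_correlation_block(R))
    for a, b, r in ranked_pairs(LABELS, R)[:3]:
        print(f"{a}-{b}: {r:+.4f}")
```

Within this block the largest absolute coefficient belongs to the CFPT and EMIN pair (-0.2254), which is consistent with the weak-to-moderate reading given in the text.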

 

 

The correlation heat map is a graphical representation of the relationships among the environmental indicators in the dataset. It shows mainly light-colored regions and only a few stronger red and blue cells, indicating that most correlations are low to moderate in magnitude, whether positive or negative. This graphical view supports the initial statistical analysis: most of the environmental factors are only weakly related and cover different facets of energy consumption, emissions intensity, load management, and efficiency. Similar correlation patterns are reported for other ESG datasets, for which multidimensionality is crucial to the strength and interpretability of the modeling (Ioannidis et al., 2022; Loukili & Benli, 2023). The absence of strongly correlated factors indicates that the dataset is well structured and not affected by multicollinearity, so it meets the requirements for accurate modeling and interpretation (Eskantar et al., 2024). The cells with moderate correlations, found in different parts of the heat map, correspond to relationships that are expected between the dimensions concerned. For example, the low positive correlation between Emission Intensity (EMIN) and Load Cover Factor (LCF) could reflect operational conditions: when power systems operate at maximum load, emission intensity tends to increase. Other low positive correlations among factors related to energy autonomy (on-site energy ratio, OER) and load matching (load matching indicators, LMI) point to coherent interactions between energy autonomy and system efficiency, in line with results from ESG analyses carried out with alternative methods (Sorathiya et al., 2024). The heat map therefore shows that each set of factors has a logical internal structure while remaining largely independent of the others.
The heat map also shows that no small group of strongly correlated factors fully defines the environmental dimension. This points to the multidimensional character of the dataset, which covers energy, emission intensity, load balance, and flexibility, each contributing to a comprehensive ESG analysis in its own way. An integrated view of this kind has also been applied in ESG analysis to evaluate smart city infrastructure (Dovolil & Svítek, 2024). The variables are thus distinct yet conceptually related, providing a basis for analyses such as PCA and regression models in the ESG framework.

 

 

 

 

 

Figure xyz. Heat Map of the Correlation Matrix for Environmental (E) Factors in the ESG Model. Note: The heat map shows mostly weak to moderate correlations, indicating that the environmental variables are independent and free from multicollinearity. This confirms the dataset’s structural validity and its suitability for integration into digital twin and metaverse-based smart building management models.

 

 

 

5.2 Correlation Analysis and Validation of Social (S) Factors in the ESG Framework

 

 

The social dimension of the ESG model emphasizes human and system characteristics at the building level, such as user comfort, indoor air quality, thermal and acoustic performance, and efficiency. The correlation matrix for the social dimension (OCC, HUM, PM25, PM10, VOC, ACH, THR, SND, EER, COP, SEF, EUI, and LPD) is an important validation step for the dataset used to create a management model based on digital twin technology (Hadjidemetriou et al., 2023). Most correlations between variables are weak to moderate, indicating that each variable captures a distinct aspect with little overlap. The number of occupants (OCC) shows small positive correlations with relative humidity (HUM, 0.13) and fine particle concentration (PM2.5, 0.20), since higher human presence could lead to slight increases in both (Lo, 2025). These correlations remain weak, which is consistent with the hypothesis that well-designed environmental conditions and ventilation systems limit the effect of occupancy on indoor air quality (Cai et al., 2023). The near-zero correlations between OCC and the concentration of Volatile Organic Compounds (VOC, -0.07) and thermal resistance (THR, -0.08) indicate little to no relationship between human presence and these factors, which again suggests that the dataset is well balanced. The air quality variables (PM2.5, PM10, and VOC) exhibit mild correlations, the largest being between PM2.5 and PM10 (0.24), since both tend to co-occur at the same locations (Hadjidemetriou et al., 2023b). The low correlation between VOC and humidity (-0.06) indicates that air quality factors are largely independent of indoor environmental factors, which supports measuring them as separate elements of the social dimension (Ni et al., 2024).
The air change rate (ACH) shows positive but weak correlations with thermal resistance (THR, 0.03) and sound insulation (SND, 0.11), indicating that ventilation performance is largely independent of envelope characteristics, a property that matters for digital twin simulations of indoor user comfort (Islam et al., 2024). The thermal and acoustic factors (THR, SND) show weak positive correlations with EER, COP, and SEF (between 0.01 and 0.14), which suggests, at most, a slight tendency for buildings with lower thermal and sound transmission to have more efficient energy systems. This pattern is in the expected direction and links comfort factors to system performance (Alibrandi, 2022). By contrast, EER, COP, and SEF display moderate to strong positive correlations with one another (ranging from 0.49 to 0.72), as expected for indicators that all measure energy efficiency. For a digital twin model, such correlations are acceptable, since the indicators assess closely related yet supplementary system properties (Hii & Hasama, 2024). Energy use intensity (EUI) and lighting power density (LPD) show a strong positive correlation (r = 0.88), as lighting loads strongly influence energy use intensity. This pattern indicates that the dataset captures internal load patterns, which are relevant for assessing social comfort and productivity in digital twin settings and for exploring social system factors through ESG models (Yossef Ravid & Aharon-Gutman, 2023). At the same time, the low correlations between EUI and LPD on the one hand and the remaining social factors on the other indicate that energy use patterns are a separate system factor, not driven by social factors.
Overall, the correlation matrix for the social component shows mostly low-to-moderate correlations, with logical convergence among comfort, air quality, and energy factors and strong correlations confined to the efficiency indicators (EER, COP, SEF) and to the EUI and LPD pair. The dataset therefore captures supplementary system factors that are logically related to each other. This supports the robustness of the social component of the ESG dataset and its reliability for analysis, simulation, and decision support in digital twin frameworks for managing energy-efficient buildings.

 

 

 

Table xyz. Correlation Matrix for Social (S) Dimension Variables in the ESG Smart Building Model

 

Variable       OCC      HUM     PM25     PM10      VOC      ACH      THR      SND      EER      COP      SEF      EUI      LPD
OCC         1.0000   0.1329   0.1953   0.0406  -0.0661  -0.0387  -0.0806   0.0172   0.0373   0.0912   0.1720  -0.0849  -0.0437
HUM         0.1329   1.0000   0.0027   0.0540  -0.0592   0.1160  -0.1581   0.0172   0.0477   0.0399   0.0013   0.0618   0.0800
PM25        0.1953   0.0027   1.0000   0.2370   0.0320  -0.0518  -0.2271   0.1503  -0.0616   0.0376   0.0095  -0.0114   0.0935
PM10        0.0406   0.0540   0.2370   1.0000   0.0760   0.0683   0.0201   0.0481  -0.0705  -0.0022  -0.0393   0.0587   0.0935
VOC        -0.0661  -0.0592   0.0320   0.0760   1.0000   0.0005  -0.0622  -0.0455  -0.0209  -0.0401  -0.0085   0.0454   0.0214
ACH        -0.0387   0.1160  -0.0518   0.0683   0.0005   1.0000   0.0289   0.1062   0.0741   0.0784   0.0607   0.0243   0.0072
THR        -0.0806  -0.1581  -0.2271   0.0201  -0.0622   0.0289   1.0000   0.1467   0.1021   0.1425   0.1260  -0.0078   0.0573
SND         0.0172   0.0172   0.1503   0.0481  -0.0455   0.1062   0.1467   1.0000   0.0119   0.0676   0.0631  -0.0225   0.0202
EER         0.0373   0.0477  -0.0616  -0.0705  -0.0209   0.0741   0.1021   0.0119   1.0000   0.4872   0.7244  -0.1632  -0.0750
COP         0.0912   0.0399   0.0376  -0.0022  -0.0401   0.0784   0.1425   0.0676   0.4872   1.0000   0.7074  -0.0906  -0.0529
SEF         0.1720   0.0013   0.0095  -0.0393  -0.0085   0.0607   0.1260   0.0631   0.7244   0.7074   1.0000  -0.1307  -0.0399
EUI        -0.0849   0.0618  -0.0114   0.0587   0.0454   0.0243  -0.0078  -0.0225  -0.1632  -0.0906  -0.1307   1.0000   0.8829
LPD        -0.0437   0.0800   0.0935   0.0935   0.0214   0.0072   0.0573   0.0202  -0.0750  -0.0529  -0.0399   0.8829   1.0000

 

Note. The table displays the correlations among social indicators such as comfort, air quality, and energy efficiency. The weak to moderate correlations confirm that these variables represent distinct yet complementary dimensions, ensuring the dataset’s internal consistency and its suitability for digital twin-based simulations in smart building management.
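
The few strong coefficients in the table can be isolated with a simple threshold scan. The sketch below uses the energy-related block (EER, COP, SEF, EUI, LPD) of the table above and also converts a pairwise correlation into a two-variable variance inflation factor, 1 / (1 - r²). Both the 0.7 threshold and the helper names are assumptions chosen for illustration; the two-variable VIF is only a lower bound on the VIF a predictor would have in a full regression and is not the mean VIF reported by the authors.

```python
# Coefficients copied from the EER..LPD block of the social table.
LABELS = ["EER", "COP", "SEF", "EUI", "LPD"]
R = [
    [1.0000, 0.4872, 0.7244, -0.1632, -0.0750],
    [0.4872, 1.0000, 0.7074, -0.0906, -0.0529],
    [0.7244, 0.7074, 1.0000, -0.1307, -0.0399],
    [-0.1632, -0.0906, -0.1307, 1.0000, 0.8829],
    [-0.0750, -0.0529, -0.0399, 0.8829, 1.0000],
]


def strong_pairs(labels, matrix, threshold=0.7):
    """List (label_i, label_j, r) for i < j where |r| >= threshold.

    The default threshold of 0.7 is an illustrative rule of thumb,
    not a value taken from the manuscript.
    """
    found = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if abs(matrix[i][j]) >= threshold:
                found.append((labels[i], labels[j], matrix[i][j]))
    return found


def pairwise_vif(r):
    """Variance inflation factor for two predictors with correlation r."""
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    for a, b, r in strong_pairs(LABELS, R):
        print(f"{a}-{b}: r = {r:.4f}, pairwise VIF = {pairwise_vif(r):.2f}")
```

Under these assumptions the scan returns the EER and SEF, COP and SEF, and EUI and LPD pairs, which are the same pairs the text singles out as strongly correlated.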

 

 

 

 

The heat map of the correlation matrix for the Social (S) dimension gives an intuitive picture of the mutual relationships among the variables that define indoor comfort, air quality, and energy efficiency in buildings. The map is dominated by light colors, indicating weak to moderate correlations across most variable pairs; the dataset thus covers a broad range of social sustainability factors without duplication. This mixed correlation structure supports its use in digital twin-based building management systems that aim to optimize building performance and human well-being. The red line along the diagonal marks the perfect correlation of each variable with itself and should be read separately from the off-diagonal cells, which show the correlations between different variables. The red cells in the lower right corner show the strong correlations among the energy-related variables EER, COP, and SEF (approximately 0.49 to 0.72). These are expected, given that all three measure efficiency and performance. The same applies to the red cell connecting EUI and LPD (0.88), which indicates that lighting load plays an important role in energy use intensity. The top-left corner, which covers the occupancy and air quality indicators (OCC, HUM, PM2.5, PM10, and VOC), shows pale colors with scattered light red and blue cells. These weak correlations indicate that building air quality and comfort depend only slightly on occupancy, an important characteristic for datasets used to model building conditions with digital twin methods.
It implies that humidity and pollutant concentrations can be varied in simulations of different scenarios largely independently of occupancy, building functionality, and adaptations. Overall, the heat map supports the validity of the dataset for modeling the ESG social dimension in analytics systems for building optimization and human well-being.

 

 

 

 

Figure xyz. Heat Map of the Correlation Matrix for Social (S) Factors in the ESG Model. The heat map shows mostly weak to moderate correlations, indicating that the social variables—related to comfort, air quality, and energy efficiency—are distinct yet interrelated. This confirms the dataset’s internal coherence and its suitability for digital twin-based smart building simulations.

 

 

 

 

5.3 Correlation Analysis and Validation of Governance (G) Factors in the ESG Framework

 

The correlation matrix for the Governance (G) component of the ESG model provides insight into the interrelationships among indicators that represent economic and operational aspects of smart building management. These include the cost of energy savings (CES), energy return on investment (EROI), energy payback time (EPBT), capital cost factors (CPD and CCF), system performance (SPC), renewable energy utilization (REU), and energy productivity per worker hour (EPWH). The aim of analyzing these correlations is to validate the dataset used for prototyping a digital twin-based management model for smart buildings (Roda-Sanchez et al., 2023; Alibrandi, 2022), ensuring that the indicators are statistically consistent, complementary, and capable of reflecting the governance dynamics of sustainable infrastructure systems (Poels et al., 2022). The overall pattern shows that relationships between governance variables are generally weak to moderate, a desirable feature for multidimensional datasets (Li, 2025). This indicates that each variable captures a distinct dimension of governance performance without excessive redundancy. CES shows a moderate negative correlation with the Capital Cost Factor (CCF) at -0.42, the largest in the matrix, suggesting that higher capital costs are associated with lower cost-efficiency in achieving energy savings. This inverse relationship highlights an important governance trade-off: investments that require significant capital may not always translate into proportional financial efficiency gains (Chungath & Hacks, 2024). The weaker negative correlations between CES and SPC (-0.22) and REU (-0.20) point in the same direction, implying that CES tends to be lower where system performance and renewable energy utilization are higher.
EROI, which measures the ratio between energy produced and energy invested, displays weak correlations across most variables, including a slight negative association with EPBT (-0.22), consistent with the expectation that higher energy returns correspond to shorter payback times. Its positive, though modest, correlations with CCF (0.09) and SPC (0.16) suggest that systems with better energy efficiency tend to be embedded in contexts with moderate capital intensity and performance consistency (Elnour et al., 2024). EPBT itself maintains low correlations, except for its mild negative association with SPC (-0.20), which indicates that buildings or systems with shorter payback periods tend to have more stable or efficient operational performance. The CCF variable is positively correlated with SPC (0.23) and EPWH (0.11), showing that capital costs are weakly linked to system performance and worker energy productivity. These modest correlations support the validity of the dataset by suggesting that financial parameters and productivity metrics are related but not overlapping dimensions of governance performance (Zhou et al., 2021). REU and EPWH exhibit a small positive relationship (0.19), consistent with the idea that renewable energy integration enhances the energy productivity per worker, a finding relevant for evaluating the operational efficiency of buildings managed under sustainable frameworks (Dovolil & Svítek, 2024; Kljaić et al., 2024). The overall low correlation magnitudes across variables, with few exceptions, demonstrate that the dataset is well balanced and not dominated by interdependent indicators. 
This structural integrity is fundamental for the calibration and validation of digital twin models (Chungath & Hacks, 2024; Poels et al., 2022), which require clear variable independence to accurately simulate decision-making and policy scenarios in smart buildings. The limited but coherent correlations between cost, performance, and efficiency metrics confirm that the Governance dimension of the ESG dataset is statistically reliable. It effectively captures the complexity of managing financial and operational sustainability (Li, 2025), ensuring that the digital twin model can use these parameters to support optimization, predictive analysis, and performance benchmarking within a robust and transparent governance structure (Kljaić et al., 2024; Elnour et al., 2024).

Table X. Correlation Matrix for Governance (G) Factors in the ESG Smart Building Model

 

Variable       CES     EROI     EPBT      CPD      CCF      SPC      REU     EPWH
CES         1.0000  -0.0596   0.0320   0.0069  -0.4240  -0.2163  -0.1981  -0.0780
EROI       -0.0596   1.0000  -0.2234   0.0083   0.0851   0.1553   0.0725   0.0126
EPBT        0.0320  -0.2234   1.0000  -0.1697   0.0380  -0.1981   0.1305   0.0050
CPD         0.0069   0.0083  -0.1697   1.0000  -0.0017  -0.0894   0.0077   0.0670
CCF        -0.4240   0.0851   0.0380  -0.0017   1.0000   0.2251   0.0327   0.1058
SPC        -0.2163   0.1553  -0.1981  -0.0894   0.2251   1.0000  -0.1582  -0.0859
REU        -0.1981   0.0725   0.1305   0.0077   0.0327  -0.1582   1.0000   0.1860
EPWH       -0.0780   0.0126   0.0050   0.0670   0.1058  -0.0859   0.1860   1.0000

 

Note. The table shows weak to moderate correlations among governance indicators, confirming their independence and validity. The negative link between CES and CCF highlights an inverse cost–efficiency relationship, while positive ties among CCF, SPC, and EPWH indicate consistent governance performance suitable for digital twin-based smart building management.
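
Whether a coefficient of a given size differs from zero depends on the sample size, which this section does not state. The sketch below applies the usual t-test for a Pearson correlation, t = r · sqrt((n - 2) / (1 - r²)), under the explicit assumption of n = 100 observations; that value, the helper names, and the rounded critical value of 1.984 (two-sided 5% level, 98 degrees of freedom) are illustrative assumptions rather than figures taken from the manuscript.

```python
import math

# ASSUMPTION: the sample size is not given in this section; 100 is a placeholder.
N_ASSUMED = 100
# Approximate two-sided 5% critical value of Student's t with 98 degrees of freedom.
T_CRIT = 1.984


def t_statistic(r, n):
    """t-statistic for testing H0: rho = 0 given a sample correlation r."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def critical_r(t_crit, n):
    """Smallest |r| that reaches the critical t value for sample size n."""
    return t_crit / math.sqrt(t_crit * t_crit + (n - 2))


if __name__ == "__main__":
    # Coefficients taken from the governance table above.
    for name, r in [("CES-CCF", -0.4240), ("CCF-EPWH", 0.1058)]:
        t = t_statistic(r, N_ASSUMED)
        verdict = "significant" if abs(t) > T_CRIT else "not significant"
        print(f"{name}: r = {r:+.4f}, t = {t:+.2f} ({verdict} at 5%)")
    print(f"critical |r| for n = {N_ASSUMED}: {critical_r(T_CRIT, N_ASSUMED):.4f}")
```

With these assumed inputs the CES and CCF coefficient clears the 5% threshold comfortably, whereas a coefficient such as CCF and EPWH (0.1058) does not; a different sample size would shift the threshold accordingly.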

 

 

The corresponding heat map for the Governance (G) component shows the structure of correlations among the governance factors, offering a quick view of how the financial, operational, and efficiency factors in the dataset relate to one another. The color scale, from deep red to blue, conveys the sign and intensity of each correlation, with red for positive and blue for negative values. This supports an intuitive assessment of the internal consistency of the dataset, which is important for validating the digital twin model for managing smart buildings (Chungath & Hacks, 2024; Cureton & Dunn, 2021). The first feature visible in the heat map is the negative association between the Cost of Energy Savings (CES) and the Capital Cost Factor (CCF), shown by the darkest blue cell (around -0.4). It indicates that when capital costs are higher, energy savings are less cost-effective, which matters for governance-related financial decisions in smart buildings (Pileggi et al., 2020). The heat map thus agrees with the numerical analysis and makes the correlations among cost and investment factors easier to read. A second group concerns the efficiency and performance factors (CCF, SPC, and REU). CCF and SPC are weakly positively correlated (0.23, light red), which suggests that better system performance goes together with higher capital costs, whereas the correlations of REU with CCF (0.03) and SPC (-0.16) are close to zero or slightly negative (Lv et al., 2023; Zahedi et al., 2024).
The REU and EPWH pair appears as a light red cell (0.19), indicating a weak positive relationship that is consistent with the operational scenario assumed for the efficiency model for smart buildings (Roda-Sanchez et al., 2023). The dominance of light tones across almost the whole heat map indicates that most variable pairs are only weakly correlated, so the dataset is balanced and shows no sign of multicollinearity, both of which benefit digital twin applications that rely on cause-and-effect simulations (Yue et al., 2022; Hartmann et al., 2023). The heat map therefore supports the dataset's validity by showing that the governance indicators are differentiated yet logically linked, which makes the dataset suitable for use in an integrated system for the governance of sustainable buildings (Dovolil & Svítek, 2024; Cranford, 2023). Governance heat maps are thus a useful tool for analysis and for decision-making regarding ESG integration in digital twin technology for infrastructure governance in a smart city (Kljaić et al., 2024).

 

 

 


 

Figure X. Heat Map of the Correlation Matrix for Governance (G) Factors in the ESG Model. Note: The heat map illustrates the correlations among governance indicators such as cost-effectiveness, capital costs, and system performance. The predominance of light colors indicates weak to moderate relationships, confirming the independence of the variables and the absence of multicollinearity. This validates the dataset’s consistency and its suitability for digital twin-based simulations in smart building governance.

 

 

 

6. Regression-Based Validation of the ESG Dataset for Digital Twin Smart Building Modeling

 

To demonstrate the efficacy and applicability of the ESG model, the analysis equations will provide a crucial starting point for evaluating the dataset's statistical validity and reliability. These equations will examine the levels of cohesion, interdependence, and applicability of the environmental, social, and governance dimensions within the broader context of sustainable building resource management. The analysis aims to demonstrate that the ESG dataset has the potential to significantly contribute to the conceptualization and ideation of an integrated building resource management model that leverages digital twin technology and the metaverse to model, monitor, and regulate the performance efficiency of intelligent buildings in real time (Zhang et al., 2023). The equations will use Ordinary Least Squares (OLS), with the dependent variable (AREA) indicating the scale, size, and functionality of buildings, and the independent variables indicating Key Performance Indicators for each ESG dimension. The equations will enable the researcher to assess the reliability and functionality of the dataset, thereby creating the opportunity to examine the applicability of fundamental ESG dimensions that can effectively contribute to sustainable building resource scale, functionality, and efficiency (Dou & Yin, 2024). The structure and form of the equations will apply the three dimensions that govern ESG, creating a sound methodology for assessing environmental, social, and governance factors within a broader context of sustainable building resource scale, functionality, and efficiency (Wang et al., 2024). 
The Environmental equation will appropriately indicate energy consumption, intensity, and efficiency factors that can contribute to building scale, the Social equation will indicate factors related to comfort, air, and user well-being, and the Governance equation will relate financial intensity and efficiency factors to building scale, functionality, and performance (Wang et al., 2024). The equations will provide the crucial foundation for the validation and calibration of the given dataset, ensuring that its integration into digital twin and metaverse technology provides the fundamental soundness for reliable, accurate, and environmentally validated model building (Liu et al., 2025).

 

Table X. Regression Equations for ESG Dimensions in the Smart Building Model

ESG

Equations

E-Environment

 

S-Social

 

G-Governance

 

Note: The table presents the Ordinary Least Squares (OLS) regression equations used to validate the Environmental (E), Social (S), and Governance (G) dimensions of the ESG model. Each equation relates specific Key Performance Indicators (KPIs) to the dependent variable AREA, representing building scale and functionality. These equations provide the analytical foundation for integrating the ESG dataset into digital twin-based smart building management and simulation frameworks.

 

The results for the Environmental model (E) indicate an R² of 0.226 and an adjusted R² of 0.005: the included variables explain approximately 22.6% of the variance in AREA, but almost none of this explanatory power remains once it is adjusted for the number of predictors. The F-statistic (1.02) and its probability value (0.451) show that the predictors are not jointly significant, so the model should be read as a consistency check on the dataset rather than as a predictive relationship. Two individual variables stand out: the Capacity Factor (CAF, p = 0.006), with a negative sign, and Renewable Energy Utilization (REU, p = 0.065), with a positive sign that is significant only at the 10% level. Both signs are logical. Larger building areas tend to be associated with lower utilization efficiency (CAF) but higher renewable energy use (REU), a pattern coherent with real-world behavior in large smart infrastructures (Guo et al., 2025). The low mean VIF (1.93) indicates the absence of problematic multicollinearity, which supports the reliability of the dataset for modeling energy-environmental dynamics. The Social (S) dimension regression exhibits an R² of 0.085 and an adjusted R² of 0.004, showing that social and comfort-related KPIs explain only a small fraction of the variation in building area. This result aligns with expectations, as social variables, such as air quality (PM2.5, PM10), humidity, and acoustic comfort, capture internal environmental quality rather than scale-dependent properties. The marginal significance of PM2.5 (p = 0.084, 10% level) suggests that particulate concentration may have a weak relationship with building size, potentially due to differences in ventilation and occupancy density (Chungath & Hacks, 2024). The low mean VIF (1.08) again indicates that these indicators are statistically distinct, so the Social dataset is structurally well defined even though its predictive strength remains marginal.
The Governance (G) regression yields the most consistent results, with an R² of 0.124 and a higher adjusted R² of 0.067. The F-statistic of 2.19 and a p-value of 0.051 indicate near-statistical significance at the 5% level, implying that the governance and economic indicators together provide a weak but coherent explanation of AREA variability. The negative signs of the two most significant variables, Capital Development Cost (CPD, p = 0.027) and Capital Cost Factor (CCF, p = 0.054), indicate that larger building areas are associated with lower capital cost indicators. This outcome is relevant for validating the economic component of the digital twin, as it suggests that financial optimization and governance transparency correlate with spatial and operational efficiency (Dovolil & Svítek, 2024; Cranford, 2023). The low mean VIF (1.15) indicates the absence of collinearity distortions. Overall, the three regressions support the ESG dataset by indicating that each component captures a distinct dimension of building performance. None of the models exhibits high explanatory power individually, but their combined interpretation shows structural coherence and logical sign directions. The Environmental model highlights operational and renewable energy dynamics, the Social model reflects the independence of comfort and health indicators from building scale, and the Governance model reveals economic efficiency trends. Together, they provide a multidimensional foundation for implementing a digital twin system capable of assessing, simulating, and optimizing smart building governance and performance in line with ESG principles (Zhang et al., 2023).

Table X. Summary of Regression Results for ESG Dimensions in the Smart Building Model

| ESG Dimension | E (Environment) | S (Social) | G (Governance) |
| --- | --- | --- | --- |
| Included KPIs (X) | ENCO, CFPT, EMIN, LCF, SCF, LMI, OER, GII, NGI, CAF, OPP, DRS, FLF, FLI, FEE, EER, COP, SEF, EUI, LPD, REU, EPWH | OCC, HUM, PM25, PM10, VOC, ACH, THR, SND | CES, EROI, EPBT, CPD, CCF, SPC |
| No. of Vars | 22 | 8 | 6 |
| R² | 0.226 | 0.085 | 0.124 |
| Adj. R² | 0.005 | 0.004 | 0.067 |
| F (df1, df2) | 1.02 (22, 77) | 1.05 (8, 91) | 2.19 (6, 93) |
| Prob > F | 0.451 | 0.403 | 0.051 |
| Root MSE | 5237 | 5238 | 5069 |
| Mean VIF | 1.93 | 1.08 | 1.15 |
| Significant Variables (p < 0.10) | CAF (p = 0.006), REU (0.065), EPWH (0.108) | PM25 (p = 0.084) | CPD (p = 0.027), CCF (0.054) |
| Sign | CAF (−), REU (+) | + | both (−) |
| Interpretation | Environmental KPIs are consistent but weakly predictive of AREA; no multicollinearity; logical directional signs. | Social KPIs are independent and orthogonal; air quality and comfort show limited relation with building scale. | Governance and economic KPIs show structural consistency and marginal significance; negative coefficients suggest efficiency gains with lower costs per area. |

Note: This table summarizes the regression outcomes for the Environmental (E), Social (S), and Governance (G) dimensions of the ESG model. The results show that all models are statistically coherent and free from multicollinearity (Mean VIF < 2). While the Environmental and Social models exhibit low explanatory power, the Governance model shows marginal significance (Prob > F ≈ 0.05), indicating that financial and efficiency indicators play a stronger role in explaining building scale and performance within digital twin-based smart building systems.
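The relationship between the R², adjusted R², and F values in the table can be checked with the standard OLS identities. The sketch below is a minimal illustration, not the authors' procedure: the sample size of 100 is an assumption inferred from the reported degrees of freedom (for example 22 + 77 + 1 for the Environmental model), and the function names are hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R² for an OLS model with k predictors and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F-statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# (R², number of predictors) as reported in the table.
# N_OBS = 100 is an assumption inferred from the degrees of freedom.
N_OBS = 100
MODELS = {"E": (0.226, 22), "S": (0.085, 8), "G": (0.124, 6)}

for name, (r2, k) in MODELS.items():
    adj = adjusted_r2(r2, N_OBS, k)
    f_val = f_statistic(r2, N_OBS, k)
    print(f"{name}: adjusted R² = {adj:.3f}, F({k}, {N_OBS - k - 1}) = {f_val:.2f}")
```

Because the table reports R² rounded to three decimals, the recomputed values can differ from the reported ones in the last digit. The comparison also makes the penalty visible: with 22 predictors and roughly 100 observations, an R² of 0.226 leaves an adjusted R² close to zero.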

 

The validation of the ESG dataset through the three regression models—each representing the Environmental, Social, and Governance dimensions—demonstrates the internal coherence and distinct contribution of each component to the explanation of building scale and performance, represented by the variable AREA. The results reveal a layered structure of relationships within the dataset that supports its robustness and analytical validity for modeling within a digital twin framework (Hien & Hanh, 2024). From a global perspective, the Governance (G) regression emerges as the most statistically relevant, with a Prob > F value around 0.05, suggesting marginal significance and indicating that this component captures some consistent structural patterns between governance and economic indicators and the dependent variable AREA. This finding implies that financial and efficiency-related parameters—such as cost per energy unit, return on investment, and payback time—are moderately predictive of the built area, reflecting how governance decisions and economic structures may scale with building size (Dou & Yin, 2024). In contrast, the Environmental (E) and Social (S) models exhibit low explanatory power, with adjusted R² values close to zero. This outcome is not unexpected, as environmental and social metrics often capture operational performance and contextual conditions rather than structural attributes like area. The Environmental model, although statistically weaker, presents logical directional relations, such as negative associations with carbon footprint (CFPT) and positive associations with renewable energy use (REU), which are consistent with theoretical expectations of sustainable design (Su & Sun, 2023). Similarly, the Social model demonstrates independence among variables, showing that indoor air quality, thermal and acoustic comfort, and occupancy metrics vary orthogonally without multicollinearity, as confirmed by low mean VIF values (below 2). 
The analysis of multicollinearity further reinforces the validity of the dataset. All three models have mean variance inflation factors below 2 (1.93, 1.08, and 1.15), well under the commonly used threshold of 5, indicating that no block of variables presents internal redundancy. The dataset is therefore structurally well defined, and each KPI contributes unique information within its respective ESG dimension (Chen & Lin, 2023). From a methodological standpoint, this supports the use of the dataset for higher-level analytical modeling, including multivariate or machine learning regressions, since low collinearity among predictors makes feature contributions easier to interpret.

The Governance block stands out for its structural coherence. CPD (cost per design) and CCF (capital cost factor) show statistically relevant coefficients (p = 0.027 and p = 0.054), both with negative signs, whereas SPC (sustainability performance cost) does not reach significance at the 10% level. This pattern indicates that higher building efficiency or optimized financial planning is associated with lower costs per unit area, an interpretation aligned with principles of sustainable financial governance and stakeholder-oriented management (Berman et al., 1999). The presence of negative coefficients further reinforces the logic of efficiency-driven management models, where resource optimization translates into economic and environmental benefits.

Overall, the regression-based validation confirms that the ESG dataset is both statistically sound and conceptually coherent. Each dimension provides non-redundant information, supporting the multidimensional structure of ESG analysis. While the E and S models describe operational and contextual variability, the G model anchors the dataset’s structural significance, establishing a measurable link between governance efficiency and building scale.
The low multicollinearity, consistent variable behavior, and partial significance of the Governance model collectively validate the dataset for use in a digital twin context, where real-time data integration and predictive modeling depend on stable and interpretable variable relationships. This foundational validation demonstrates that the dataset can be reliably used for developing intelligent management systems capable of assessing performance and sustainability through interconnected ESG indicators (Hien & Hanh, 2024; Dou & Yin, 2024).

Table X. Summary of Key Analytical Insights from ESG Regression Models

| Aspect | Observation |
| --- | --- |
| Global significance | Only the G model is marginally significant (Prob > F ≈ 0.05). |
| Internal coherence (VIF) | All Mean VIF < 5 → no multicollinearity in any ESG block. |
| Predictive power vs. AREA | E and S blocks have low explanatory power; G block moderate (Adj R² ≈ 0.07). |
| General interpretation | The three ESG dimensions are statistically distinct and non-redundant. The Governance/Economic dimension shows the strongest structural consistency. |

Note: The table presents the main analytical observations derived from the ESG regression analysis. It highlights that the Governance (G) model demonstrates marginal statistical significance and the strongest internal consistency, while the Environmental (E) and Social (S) models show lower explanatory power but maintain structural independence. The low VIF values confirm the absence of multicollinearity, validating the dataset’s robustness for digital twin–based smart building modeling.
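For readers who want to reproduce a VIF check, the following is a minimal sketch using only the Python standard library. It relies on the identity that the VIF of predictor j, 1 / (1 − R_j²), equals the j-th diagonal element of the inverse correlation matrix of the predictors. The function names and the sample columns are hypothetical; the source does not state which software computed the reported values.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix: perfect collinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor, read off the diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]


# Illustrative columns only: the second nearly duplicates the first,
# so both receive a large VIF, while the third stays much lower.
example = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 1.9, 3.2, 3.9, 5.1],
    [2.0, 1.0, 4.0, 3.0, 5.0],
]
print([round(v, 2) for v in vif(example)])
```

A mean VIF close to 1, as reported for the Social and Governance blocks, corresponds to predictors that are nearly uncorrelated with one another.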

 

7. Principal Component Analysis (PCA) for Technical Validation of the ESG Dataset in Smart Building Governance

 

The analysis presented in this section aims to apply the Principal Component Analysis (PCA) technique to provide a technical and scientific validation of the ESG (Environmental, Social, and Governance) dataset developed for testing the smart building governance prototype. PCA is a widely recognized multivariate statistical method used to reduce the dimensionality of complex datasets while preserving their essential information structure. Its application in this context serves a dual purpose: to verify the internal coherence and multidimensionality of the ESG dataset and to ensure that the selected indicators accurately represent the underlying sustainability dimensions without redundancy. The purpose of this analysis is to confirm that the dataset is robust, logically structured, and suitable for integration into advanced digital environments such as digital twin and metaverse platforms. By identifying the principal components that explain the highest variance among the ESG indicators, PCA enables the researcher to isolate the most influential factors affecting smart building performance and governance. This validation step is essential to ensure that the prototype operates on a reliable and scientifically grounded dataset, capable of supporting dynamic simulations, predictive modeling, and real-time decision-making. Through PCA, the dataset’s structural soundness is assessed, verifying the independence and complementarity of the variables associated with environmental efficiency, social comfort, and governance effectiveness. The resulting components will form the analytical backbone for building an integrated system that governs and optimizes smart buildings in immersive digital environments. 
Ultimately, this approach ensures that the proposed governance prototype—based on digital twin and metaverse technologies—is supported by a technically validated and scientifically reliable data framework, reinforcing its potential for sustainable, data-driven management of intelligent infrastructures.
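As a concrete illustration of what PCA computes, the sketch below extracts the leading components of a correlation matrix and the share of variance each one explains. It is a simplified stand-in for a library eigen-solver, using power iteration with deflation, and is not the authors' implementation. The correlation matrix `CORR` and all function names are hypothetical.

```python
import math


def mat_vec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dominant_eigenpair(m, iters=500):
    """Power iteration on a symmetric positive semi-definite matrix."""
    v = [1.0 / (i + 1) for i in range(len(m))]  # arbitrary non-uniform start vector
    for _ in range(iters):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left after deflation
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return eigenvalue, v


def principal_components(corr, k):
    """Return the k leading (eigenvalue, unit eigenvector) pairs."""
    m = [row[:] for row in corr]
    pairs = []
    for _ in range(k):
        lam, v = dominant_eigenpair(m)
        pairs.append((lam, v))
        # Deflation: subtract the component just found before the next search.
        for i in range(len(m)):
            for j in range(len(m)):
                m[i][j] -= lam * v[i] * v[j]
    return pairs


def explained_variance_ratios(corr, k):
    """Share of total variance captured by each of the k leading components."""
    total = sum(corr[i][i] for i in range(len(corr)))  # equals the number of variables
    return [lam / total for lam, _ in principal_components(corr, k)]


# Hypothetical correlation matrix for three indicators (illustrative values only).
CORR = [
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]
ratios = explained_variance_ratios(CORR, 2)
print([round(r, 3) for r in ratios], "cumulative:", round(sum(ratios), 3))
```

The entries of each unit eigenvector play the role of the component coefficients discussed in the following subsections. The overall sign of an eigenvector is arbitrary, so only the relative signs of coefficients within one component carry meaning.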

 

 

7.1 Principal Component Analysis (PCA) Results for the Environmental (E) Dimension of the ESG Model

The results of the Principal Component Analysis (PCA) applied to the environmental component of the ESG model provide a significant validation of the underlying dataset, confirming both its internal coherence and its multidimensional structure (Kwon, Kim, & Choi, 2024). The PCA technique, which decomposes the dataset into orthogonal principal components, is particularly effective for evaluating the relationships among environmental indicators and identifying latent structures that capture the underlying variance of the data (Ascione et al., 2022). In this case, the eigenvalues associated with the first few components demonstrate that approximately 40% of the total variance is explained by the first four principal components, indicating that the environmental indicators share meaningful correlations without redundancy. This supports the use of PCA as a robust approach to assess data consistency and dimensionality reduction within the ESG framework. From the component loadings, the first principal component (PC1) captures the largest share of variance and is primarily driven by positive contributions from EMIN (Emission Intensity), LCF (Load Cover Factor), and ENCO (Energy Consumption), while variables such as CFPT (Carbon Footprint) and SCF (Supply Cover Factor) contribute negatively. This component appears to represent a balance between efficiency and energy consumption, reflecting how emissions and energy coverage jointly influence environmental performance. The second component (PC2) has strong positive loadings for OER (On-site Energy Ratio) and GII (Grid Interaction Index), while FLF (Flexibility Factor) shows a strong negative contribution. This suggests that PC2 differentiates systems with higher local energy autonomy from those that rely more on flexibility and grid interaction, aligning with the notion of distributed energy management (Zhang et al., 2023). 
The third and fourth components (PC3 and PC4) further refine the structure of the data, capturing subtler aspects of energy-environmental interactions. For instance, PC3 shows high positive loadings for ENCO and OPP (One Percent Peak Power), while LCF and LMI (Load Matching Index) load negatively, suggesting a contrast between energy demand peaks and load coverage capacity. PC4, on the other hand, captures variability associated with EMIN and CAF (Capacity Factor), pointing toward the efficiency of energy conversion processes within the system (Kwon et al., 2024). A noteworthy observation is that none of the variables display extreme loadings across multiple components, which indicates that the dataset lacks strong multicollinearity and maintains a balanced contribution of each indicator to the overall structure. This aligns with the earlier regression analyses that confirmed low mean variance inflation factors (VIF), thereby reinforcing the dataset’s internal consistency (Islam, Guerrieri, Gravina, & Fortino, 2024). The presence of moderate but distributed loadings also implies that each variable contributes uniquely to the multidimensional understanding of environmental performance, making the dataset appropriate for subsequent modeling steps. The negative correlations observed in some components, such as between CFPT and EMIN, or SCF and CAF, emphasize the complexity of the environmental dimension. These negative signs do not indicate inconsistencies but rather complementary dynamics: higher carbon footprints tend to associate with lower emission intensity efficiency, while energy coverage and capacity factors reveal trade-offs between resource use and operational performance. This reinforces the interpretative depth of PCA as a diagnostic validation tool rather than a purely descriptive method (Zhou et al., 2023). Overall, the PCA results validate the environmental dataset as a coherent and structurally reliable foundation for the ESG model. 
The distribution of eigenvalues and loadings supports the presence of independent, interpretable dimensions within the environmental domain. This validation step is crucial, especially considering the dataset’s intended application in the development of a management prototype integrating Digital Twin and Metaverse technologies (Zhang et al., 2023). In this context, PCA ensures that the environmental indicators capture distinct yet complementary aspects of building energy efficiency, emission control, and operational sustainability. Consequently, the PCA model not only confirms the statistical robustness of the environmental data but also establishes a reliable basis for embedding it within a digital simulation environment for smart building management.

 

 

 

 

 

Figure X. Principal Component Loadings for Environmental (E) Factors in the ESG Model. Note: The figure illustrates the loading values of each environmental indicator (ENCO, CFPT, EMIN, LCF, SCF, LMI, OER, GII, NGI, CAF, OPP, DRS, and FLF) across the principal components (PC1–PC15). The distributed and moderate loading patterns confirm that no single factor dominates the variance, indicating balanced variable contributions and low multicollinearity. This supports the dataset’s structural integrity and validates its suitability for digital twin–based smart building governance modeling within the ESG framework.

 

 

 

7.2 Principal Component Analysis (PCA) Results for the Social (S) Dimension of the ESG Model

 

The principal component analysis of the Social (S) dimension in the ESG framework shows how human-related and comfort variables interact within smart building environments. The analysis covers Occupants (OCC), Relative Humidity (HUM), Particulate Matter (PM2.5 and PM10), Volatile Organic Compounds (VOC), Air Changes per Hour (ACH), Thermal Insulation (THR), Sound Insulation (SND), Energy Efficiency Ratio (EER), Coefficient of Performance (COP), System Efficiency (SEF), Energy Use Intensity (EUI), and Lighting Power Density (LPD). The PCA demonstrates the multidimensional structure of the social component, highlighting interdependencies between human comfort, air quality, and building performance metrics (Bonab, Bellini, & Rudko, 2023). The first principal component (PC1) shows strong negative loadings for EER, COP, and SEF, indicating that this dimension captures the efficiency and operational quality aspects of social comfort. These parameters represent the building’s ability to maintain indoor well-being through technological optimization. Negative values suggest an inverse relationship between system efficiency and variability in occupant conditions, implying that as systems become more efficient, fluctuations in perceived comfort decrease. This aligns with the principles of smart building management, where automation and digital control stabilize the indoor environment (Elnour, Meskin, Khan, & Jain, 2021). PC2 exhibits positive contributions from EUI and LPD, suggesting that energy use per unit of floor area and lighting power density are key indicators of human activity levels within buildings. This axis can be interpreted as a behavioral-energy dimension, linking occupant presence and usage patterns to energy demand. It supports the concept that social variables are not isolated but reflect dynamic interactions between people and infrastructure (Ma et al., 2023). 
The third component (PC3) emphasizes indoor air quality factors, with high negative loadings for PM2.5, PM10, and THR. This reveals an important trade-off between particulate pollution and thermal comfort. In smart building contexts, this component provides insight into how environmental control systems influence both health-related and comfort-related metrics. Lower particulate concentrations may require higher ventilation rates (ACH), which in turn affect energy consumption and humidity balance (Hu & Lu, 2024). PC4 is primarily characterized by strong positive loadings for Occupants (OCC) and Humidity (HUM), alongside moderate contributions from ACH and PM10. This suggests that the fourth component captures spatial and microclimatic comfort interactions, where occupant density and air renewal are central to maintaining an acceptable indoor environment. In digital twin applications, such relationships are essential for predicting comfort variations based on occupancy data and HVAC system behavior. Higher components, such as PC5 through PC7, refine specific comfort and acoustic dimensions. Negative loadings of SND and THR indicate the balance between thermal insulation, noise control, and user satisfaction. These components are crucial for understanding the subtle effects of building envelope performance on perceived comfort, an area that is increasingly relevant for ESG-oriented smart building metrics (Hu & Lu, 2024). Finally, components like PC8 to PC13 capture residual variance associated with specific operational parameters, confirming that while social indicators are diverse, they remain statistically coherent and non-redundant. The consistent spread of variance across components underscores the structural validity of the dataset, confirming that each variable contributes uniquely to the representation of social sustainability within buildings. 
Overall, the PCA confirms that the social dataset is robust and internally coherent, providing strong empirical support for its use in validating the proposed metric model. The clear clustering of efficiency-related, environmental, and comfort indicators reflects a realistic representation of how occupants experience smart buildings. When integrated into a digital twin and metaverse-based management system, these results ensure that the model can simulate user-environment interactions, predict comfort dynamics, and optimize building operations in line with ESG principles (Bonab et al., 2023; Ma et al., 2023).

 

 

 

 

 

 

 

Figure xyz. Principal Component Loadings for Social (S) Factors in the ESG Model. Note: The figure presents the loading values of social indicators (OCC, HUM, PM2.5, PM10, VOC, ACH, THR, SND, EER, COP, SEF, EUI, and LPD) across the principal components (PC1–PC13). The distribution of moderate and distinct loadings confirms that each factor contributes uniquely to the social dimension. Efficiency variables (EER, COP, SEF) and comfort-related indicators (PM2.5, PM10, HUM) form separate but complementary clusters, validating the dataset’s internal consistency and its suitability for digital twin and metaverse-based smart building governance applications.

 

7.3 Principal Component Analysis (PCA) Results for the Governance (G) Dimension of the ESG Model

 

The Principal Component Analysis (PCA) of the Governance (G) component in the ESG model provides critical evidence for the statistical validity and structural coherence of the dataset intended for digital twin and metaverse-based smart building management. This component includes variables related to economic efficiency and governance performance—specifically, cost efficiency (CES), energy return on investment (EROI), energy payback time (EPBT), construction and operational costs (CPD, CCF), system performance and control (SPC), renewable energy utilization (REU), and energy productivity per worker (EPWH). Together, these indicators describe the economic and managerial dimension of sustainable smart buildings, where financial optimization, performance monitoring, and long-term resource efficiency are intertwined (Dovolil & Svítek, 2024). The first principal component (PC1) explains a substantial portion of the variance, with strong negative loading for CES (-0.537) and positive loading for CCF (0.531) and SPC (0.506). This pattern highlights a fundamental trade-off between cost reduction per unit of energy saved and capital or operational investment, which is typical in building governance models (Bezrukov, Sadovnikova, & Lebedinskaya, 2022). In a digital twin context, this suggests that reducing the marginal cost of energy efficiency (CES) is associated with higher upfront or management costs (CCF, SPC), reflecting realistic investment-efficiency dynamics. PC1 can therefore be interpreted as a “financial governance axis,” emphasizing the relationship between cost control, structural investment, and system efficiency. The second component (PC2) shows significant negative correlations for EPBT (-0.472), REU (-0.516), and EPWH (-0.467), suggesting that this component represents the temporal and productivity-related aspect of governance. 
Shorter energy payback times and greater renewable energy utilization contribute to higher system efficiency but require optimization of workforce productivity and process management (Pandhare et al., 2024). This factor can be understood as an “operational sustainability axis,” demonstrating the capacity of governance metrics to reflect the long-term return of energy and human capital investments. PC3 and PC4 reveal more specific structural relations within the dataset. The strong positive loading of EPBT (0.458) and CPD (0.619) in these components indicates that buildings with longer payback periods also tend to have higher cost structures. This pattern validates the consistency of the dataset, showing that financial and temporal metrics are not independent but logically correlated. In a digital twin simulation, these relationships can be used to model the trade-offs between project duration, capital investment, and lifecycle sustainability (Zhang, Yu, & Tian, 2024). The significant contribution of EROI (-0.601 in PC4) further connects governance efficiency to the building’s ability to generate positive energy returns, highlighting the strategic value of integrating real-time energy flow analytics in metaverse-based management systems. The fifth and sixth components (PC5 and PC6) capture more subtle variations related to operational resilience and system integration. REU (0.325 in PC5, -0.584 in PC6) and EPWH (-0.751 in PC5) suggest that renewable energy performance and energy use efficiency per worker vary inversely, reflecting the complexity of aligning workforce productivity with renewable infrastructure adoption. This finding is particularly relevant for smart building governance because it illustrates how data-driven management—enabled by digital twins—can balance human and technological performance indicators (Pandhare et al., 2024). Finally, PC7 and PC8 consolidate the multidimensionality of the governance structure. 
The strong positive loading of CES (0.519 and 0.536) indicates that cost efficiency remains a dominant variable across higher components, confirming that economic optimization is consistently embedded in the model. The coherence of loadings across multiple components demonstrates that each indicator contributes uniquely to the overall governance structure, with no redundancy or distortion. In summary, the PCA results confirm that the governance dataset is statistically robust and conceptually coherent. The clear differentiation of principal components reflects the internal logic of ESG-based governance, where financial, operational, and energy metrics interact systematically. This validates the model’s suitability for integration into a digital twin and metaverse framework, enabling predictive management, optimization of energy investment, and real-time governance of smart building performance. The structure uncovered by the PCA not only supports the empirical reliability of the data but also provides a scientific foundation for developing intelligent, data-driven systems aligned with sustainable management objectives.

Figure X. Principal Component Loadings for Governance (G) Factors in the ESG Model. Note. The figure displays the loadings of governance-related indicators (CES, EROI, EPBT, CPD, CCF, SPC, REU, and EPWH) across the principal components (PC1–PC8). The results highlight clear structural differentiation among financial, operational, and productivity dimensions. PC1 and PC2 capture cost–efficiency and sustainability trade-offs, while higher components (PC4–PC6) reflect investment and performance dynamics. The balanced distribution of loadings confirms the statistical coherence and multidimensional integrity of the governance dataset, validating its use for digital twin and metaverse-based smart building governance models.

 

8. Machine Learning Regression for ESG Dataset Validation in Digital Twin and Metaverse-Based Smart Building Governance

The machine learning regression analysis presented in this section was developed as a key step in the technical and scientific validation of a dataset designed for the testing and calibration of a prototype aimed at the management of smart buildings through Digital Twin and Metaverse technologies. The purpose of this process is to ensure that the dataset, structured according to the Environmental, Social, and Governance (ESG) framework, demonstrates high levels of internal consistency, predictive reliability, and interpretability—three essential conditions for its integration into intelligent, data-driven decision systems. By applying advanced machine learning algorithms such as Random Forest and Support Vector Machine (SVM), the study evaluates how effectively the dataset captures the complex, nonlinear relationships that characterize smart building governance. Each ESG dimension—environmental, social, and governance—is analyzed to identify the most suitable model capable of minimizing prediction errors (MSE, RMSE, MAE, MAPE) while maximizing explanatory performance (R²). The Random Forest model proves particularly effective for validating the Environmental and Social components, owing to its ensemble-based structure that captures multidimensional dependencies, avoids overfitting, and enhances interpretability through variable importance measures. The SVM algorithm, conversely, demonstrates superior performance in modeling the Governance dimension, where financial and operational variables interact through complex, non-linear patterns. The outcome of this machine learning validation process confirms that the ESG dataset provides a statistically robust foundation for developing an intelligent management prototype. 
Within a Digital Twin and Metaverse framework, this validated dataset will enable real-time simulation, optimization, and governance of building performance, energy efficiency, and sustainability—transforming smart buildings into adaptive, self-learning systems that support informed, data-driven decision-making.

 

 

8.1 Random Forest Regression for Environmental Dataset Validation within the ESG Framework

The selection of the Random Forest algorithm as the best-performing model for the validation of the environmental component of the ESG dataset is grounded in a comparison across multiple performance metrics, some of which are to be minimized and others maximized. A reliable validation approach must consider this dual nature of the indicators. Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) must be minimized because they quantify the deviation between the predicted and actual values; lower values correspond to better accuracy and less dispersion of residuals. The coefficient of determination (R²), on the other hand, must be maximized, as it measures the proportion of the variance in the dependent variable explained by the model, thus reflecting its explanatory power. The Random Forest model offers the best balance on the error metrics: its normalized MSE, RMSE, and MAE are the lowest among all tested algorithms (0.00 in each case), while its normalized MAPE (0.46) is mid-range, with SVM lowest on that metric (Inala et al., 2024). Its normalized R² sits at the bottom of the range (0.00, against 1.00 for Linear Regression); this is weighed against the fact that Random Forest captures complex, nonlinear relationships that linear models fail to represent adequately (Ji & Niu, 2024). The ensemble structure of Random Forest, based on the aggregation of multiple decision trees, allows it to reduce variance and limit overfitting, which enhances its robustness when validating large and heterogeneous datasets such as those associated with ESG indicators (Wang et al., 2024). Furthermore, the model’s ability to estimate variable importance makes it particularly suitable for applications in digital twin and metaverse environments. 
By quantifying the contribution of each variable, Random Forest supports both predictive accuracy and strategic understanding of how environmental, social, and governance components influence building performance. Its capacity to minimize prediction errors while maintaining stable explanatory reliability validates its use as a scientifically sound model for data validation in smart building management and ESG-driven digital infrastructures (Ukwuoma et al., 2024; Vasilica et al., 2025).
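The five metrics discussed above can be written out explicitly. The sketch below uses their standard textbook definitions; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE (in percent), and R² for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is zero; the target is assumed
    # to be strictly positive, as a building area would be.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape, "R2": r2}


# Illustrative values only.
observed = [1200.0, 3400.0, 5600.0, 8100.0]
predicted = [1500.0, 3000.0, 6000.0, 7900.0]
for metric, value in regression_metrics(observed, predicted).items():
    print(f"{metric}: {value:.3f}")
```

MSE and RMSE weight large errors more heavily than MAE, while MAPE expresses errors relative to the size of the true value. A model can therefore rank differently on MAPE than on the other three error metrics.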

Table xyz. Normalized Performance Metrics of Machine Learning Models for ESG Dataset Validation — Environmental Component

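| Metric | Boosting | Decision Tree | KNN | Linear Regression | Random Forest | Regularized Linear | SVM |
|---|---|---|---|---|---|---|---|
| MSE | 0.33 | 0.30 | 0.65 | 1.00 | 0.00 | 0.36 | 0.38 |
| RMSE | 0.35 | 0.33 | 0.73 | 1.00 | 0.00 | 0.42 | 0.44 |
| MAE | 0.44 | 0.53 | 0.78 | 1.00 | 0.00 | 0.27 | 0.52 |
| MAPE | 0.66 | 0.50 | 1.00 | 0.67 | 0.46 | 0.47 | 0.00 |
| R² | 0.01 | 0.07 | 0.38 | 1.00 | 0.00 | 0.44 | 0.00 |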

 

Note: All metrics (MSE, RMSE, MAE, MAPE, and R²) are normalized to enable direct comparison among algorithms. The Random Forest model shows the lowest normalized MSE, RMSE, and MAE and a stable R², supporting its accuracy and robustness. These results support its suitability for the technical–scientific verification of the environmental dataset used in developing a digital twin and metaverse-based smart building management prototype.
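As a worked illustration, the normalized error rows of the table above can be collapsed into a single score per model. The unweighted mean used here is an assumption made for this sketch, not the selection rule stated by the authors, and R² is left out because it is maximized rather than minimized. The names `NORMALIZED_ERRORS` and `mean_normalized_error` are hypothetical.

```python
# Normalized error metrics for the Environmental component, copied from the table.
NORMALIZED_ERRORS = {
    "Boosting": {"MSE": 0.33, "RMSE": 0.35, "MAE": 0.44, "MAPE": 0.66},
    "Decision Tree": {"MSE": 0.30, "RMSE": 0.33, "MAE": 0.53, "MAPE": 0.50},
    "KNN": {"MSE": 0.65, "RMSE": 0.73, "MAE": 0.78, "MAPE": 1.00},
    "Linear Regression": {"MSE": 1.00, "RMSE": 1.00, "MAE": 1.00, "MAPE": 0.67},
    "Random Forest": {"MSE": 0.00, "RMSE": 0.00, "MAE": 0.00, "MAPE": 0.46},
    "Regularized Linear": {"MSE": 0.36, "RMSE": 0.42, "MAE": 0.27, "MAPE": 0.47},
    "SVM": {"MSE": 0.38, "RMSE": 0.44, "MAE": 0.52, "MAPE": 0.00},
}


def mean_normalized_error(scores):
    """Unweighted mean of the normalized error metrics (assumed aggregation)."""
    return sum(scores.values()) / len(scores)


# Sort models from lowest to highest mean normalized error.
ranking = sorted(
    NORMALIZED_ERRORS,
    key=lambda name: mean_normalized_error(NORMALIZED_ERRORS[name]),
)
best_model = ranking[0]
```

Under this aggregation, Random Forest has a mean normalized error of 0.115, the lowest of the seven models. A different weighting of the metrics could change the ordering of the remaining models.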

 

The analysis of the Random Forest model applied to the ESG dataset indicates a coherent validation process for its future use in developing a smart building management prototype that integrates digital twin and metaverse technologies. The feature importance metrics, expressed through the mean dropout loss, indicate how much each environmental variable contributes to the predictive power of the model. This metric, based on fifty permutations of the dataset, measures the model error (in terms of RMSE) when the values of a variable are randomly permuted, which breaks the link between that variable and the target. Lower dropout loss values correspond to less influential variables, whereas higher values indicate features whose perturbation leads to a greater deterioration in predictive accuracy. In this case, CAF (Capacity Factor), SCF (Supply Cover Factor), and OER (On-site Energy Ratio) exhibit the highest dropout loss (5.077, 5.074, and 5.070, against 5.067 to 5.069 for the remaining variables), showing that they matter most in explaining variations in the dependent variable AREA. These results are consistent with the physical logic of smart building systems, where energy capacity balance, supply coverage, and on-site generation efficiency play fundamental roles in determining operational performance and sustainability outcomes (Du, 2024; Kinshakov et al., 2021). This importance ranking suggests that the Random Forest algorithm not only captures statistical correlations but also reflects the structural dynamics governing energy and environmental processes within buildings (Miao & Xu, 2024). In particular, the model’s ability to represent nonlinear interactions enhances the interpretability of variable contributions, especially when dropout loss metrics are combined with ensemble-based prediction strategies (Orlenko & Moore, 2020).
Moreover, recent studies demonstrate that such methods significantly improve the readability and explainability of complex systems, supporting their use in high-dimensional, heterogeneous ESG datasets (Yu et al., 2021). Consequently, the application of Random Forest in this context is not only statistically justified but also conceptually aligned with the operational goals of smart building governance.
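The permutation procedure behind the mean dropout loss can be sketched as follows. This is a minimal illustration, not the authors' implementation: `model` is any callable that maps a feature row to a prediction, and the names `rmse` and `mean_dropout_loss` are hypothetical. The default of fifty permutations mirrors the number reported in the text.

```python
import math
import random


def rmse(model, X, y):
    """Root mean squared error of `model` on rows X with targets y."""
    return math.sqrt(sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y))


def mean_dropout_loss(model, X, y, feature, n_permutations=50, seed=0):
    """Average RMSE obtained after shuffling one feature column repeatedly."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_permutations):
        column = [row[feature] for row in X]
        rng.shuffle(column)  # break the link between this feature and the target
        X_perm = [
            row[:feature] + [value] + row[feature + 1:]
            for row, value in zip(X, column)
        ]
        losses.append(rmse(model, X_perm, y))
    return sum(losses) / len(losses)
```

A feature the model does not rely on leaves the loss at its baseline level, while an influential feature raises it. When all values sit close together, as in the figure below, the ranking rests on small differences.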

Figure xyz. Feature Importance Based on Mean Dropout Loss — Environmental Component

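| Variables | Mean dropout loss | Variables | Mean dropout loss |
|---|---|---|---|
| CAF | 5.077 | CFPT | 5.068 |
| SCF | 5.074 | LCF | 5.068 |
| OER | 5.070 | OPP | 5.068 |
| FLF | 5.069 | LMI | 5.068 |
| GII | 5.069 | FLI | 5.068 |
| EMIN | 5.069 | ENCO | 5.067 |
| NGI | 5.068 | FEE | 5.067 |
| DRS | 5.068 | | |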

Note. The mean dropout loss values indicate each variable’s contribution to the Random Forest model. Higher values (e.g., CAF, SCF, OER) represent greater influence on model accuracy, confirming their key role in validating the environmental dataset for smart building analysis.

 

The model’s ability to identify meaningful predictors supports the internal coherence of the ESG dataset and confirms its reliability as a basis for digital twin modelling. The additive explanations of the predictions further reinforce the model’s interpretability. Each predicted value is constructed from a baseline prediction (the “Base”) adjusted by the additive contributions of each variable. Positive contributions increase the predicted AREA, while negative ones decrease it. For example, in Case 1, positive influences from GII (Grid Interaction Index) and FLI (Flexibility Index) compensate for the negative impact of SCF and EMIN, resulting in a final prediction slightly above the baseline. This additive approach allows for a clear decomposition of the prediction mechanism, offering transparency in understanding how individual environmental factors shape the model’s output. Such interpretability is essential for validating the dataset in a scientific context, as it ensures that the model’s decisions are both explainable and consistent with domain knowledge. By integrating the mean dropout loss and additive prediction explanations, the Random Forest model provides a double-layer validation: it identifies the most influential features for prediction and explains how they act in shaping each result. This combination of accuracy, interpretability, and conceptual alignment with building energy dynamics confirms that the model is methodologically sound and suitable for the prototyping of an intelligent management system for smart buildings, capable of leveraging digital twin and metaverse technologies for real-time performance monitoring and sustainable decision-making.
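The additive structure described above (a base value plus one contribution per variable) can be illustrated with a sequential "break-down" scheme. This is a sketch of one common way to obtain such a decomposition; the text does not say which method the authors' software uses, and the name `break_down` is hypothetical. The base is the mean prediction over a background dataset, and each contribution is the change in that mean once a feature is fixed to the value it takes in the explained case.

```python
def break_down(model, X_background, instance, order=None):
    """Return (base, contributions) with base + sum(contributions) == model(instance)."""
    order = list(range(len(instance))) if order is None else list(order)
    rows = [list(row) for row in X_background]  # work on a copy

    def mean_prediction(data):
        return sum(model(row) for row in data) / len(data)

    base = mean_prediction(rows)
    contributions = {}
    previous = base
    for j in order:
        for row in rows:
            row[j] = instance[j]  # fix feature j to the explained case's value
        current = mean_prediction(rows)
        contributions[j] = current - previous
        previous = current
    return base, contributions
```

Because every feature has been fixed by the end of the loop, the contributions always sum to the difference between the case prediction and the base. With interacting features, the individual contributions depend on the order in which the features are fixed.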

 

Table xyz. Additive Feature Contributions in Random Forest Predictions — Environmental Component

| Case | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Predicted | 9.141 | 8.936 | 9.175 | 8.931 | 8.931 |
| Base | 9.063 | 9.063 | 9.063 | 9.063 | 9.063 |
| ENCO | -0.298 | -2.765 | 4.759 | -9.130 | 6.100 |
| CFPT | 8.291 | -20.937 | 0.220 | 4.921 | -12.691 |
| EMIN | -10.720 | -1.963 | 1.428 | 16.626 | -17.500 |
| LCF | -16.680 | -18.864 | 9.401 | -21.168 | 22.139 |
| SCF | -9.687 | -2.902 | -13.972 | -59.684 | -46.113 |
| LMI | 2.564 | -1.952 | -4.729 | 2.104 | 1.990 |
| OER | 10.921 | 10.930 | 10.902 | 10.914 | 10.910 |
| GII | 32.240 | -20.779 | 35.824 | -22.670 | 5.397 |
| NGI | 8.787 | 23.709 | -7.874 | -9.781 | 1.382 |
| CAF | -33.998 | 3.153 | 34.955 | -36.673 | -36.673 |
| OPP | 13.678 | -18.241 | -11.706 | 13.536 | -4.859 |
| DRS | 29.411 | -25.220 | -2.092 | -17.218 | -2.467 |
| FLF | 42.525 | -52.681 | 53.365 | -4.552 | -57.637 |
| FLI | 0.446 | 1.780 | 1.584 | 1.260 | -2.368 |
| FEE | -0.064 | -0.010 | -0.139 | -0.194 | -0.013 |

 

Note. This table shows the additive contributions of each environmental variable to the predicted AREA across five test cases. Positive values increase the prediction, while negative ones reduce it. The results highlight the interpretability of the Random Forest model, confirming that the dataset captures realistic and consistent relationships among energy and environmental indicators.

 

 

8.2 Machine Learning Validation of the Social (S) Component in the ESG Dataset

As part of developing a scientifically grounded methodology for validating the ESG dataset, this section focuses on the Social (S) component by applying and comparing different machine learning regression algorithms. The goal is to identify which algorithm best captures the underlying relationships among social performance indicators relevant to smart building management while ensuring both predictive reliability and interpretability (Li & Xu, 2024). After normalization of the performance metrics, the Random Forest algorithm shows the most balanced and consistent results. It achieves the lowest normalized MSE and RMSE (0.000) and a near-minimal MAE (0.006, against 0.000 for the Regularized Linear model), indicating strong predictive accuracy and robustness in modeling the social variables. The model’s relatively low MAPE (0.263) further supports its reliability, as it suggests that Random Forest maintains stable relative error levels across the range of predicted values, so that deviations between observed and estimated outputs remain proportionally small (Gaur et al., 2021; Li, 2025). By contrast, Linear Regression, while producing the highest R² value, exhibits markedly higher normalized error metrics. This indicates that, despite its apparent explanatory power, the linear model fails to account for the complex, nonlinear interactions typical of social indicators in ESG frameworks, which reduces its generalizability (Li & Jiang, 2023). In this sense, Random Forest provides a better trade-off between minimizing errors and preserving interpretability, capturing multidimensional relationships among variables such as occupant comfort, indoor air quality, and system efficiency, which collectively define the social sustainability of building operations.
The results confirm that the Random Forest approach not only enhances the predictive stability of the validation process but also ensures methodological consistency with the broader objective of dataset validation within a digital twin and metaverse framework (Khan & Vora, 2024). Its ability to model complex nonlinearities and maintain low residual variance validates the dataset’s structural coherence and reinforces its suitability for integration into the prototyping of a smart building management system capable of dynamic, data-driven decision-making.

Table xyz. Normalized Performance Metrics of Machine Learning Models — Social (S) Component

| Metric | Boosting | Decision Tree | KNN | Linear | Random Forest | Regularized Linear | SVM |
|---|---|---|---|---|---|---|---|
| MSE | 0.828 | 0.273 | 0.186 | 0.771 | 0.000 | 0.004 | 0.133 |
| RMSE | 0.989 | 0.123 | 0.027 | 0.949 | 0.000 | 0.002 | 0.067 |
| MAE | 0.713 | 0.210 | 0.044 | 1.000 | 0.006 | 0.000 | 0.038 |
| MAPE | 1.000 | 0.238 | 0.292 | 0.595 | 0.263 | 0.316 | 0.000 |
| R² | 0.000 | 0.182 | 0.727 | 1.000 | 0.667 | 0.182 | 0.000 |

 

Note: The table presents normalized evaluation metrics for different machine learning algorithms applied to the Social (S) dataset. The Random Forest model achieves the lowest MSE and RMSE, a near-minimal MAE, and balanced performance overall, supporting its predictive accuracy and suitability for validating social indicators within digital twin and metaverse smart building frameworks.

The application of the Random Forest algorithm to the Social (S) dimension of the ESG model provides valuable insights for validating the dataset’s internal coherence and predictive reliability in the context of smart building management. This validation is essential to support the prototyping of a management model based on digital twin and metaverse technologies, which require accurate, interpretable, and scalable data structures to simulate and optimize human-environment interactions within buildings (Li, 2025). The results obtained through three feature importance metrics (mean decrease in accuracy, total increase in node purity, and mean dropout loss) illustrate the role and weight of social indicators such as air quality, comfort, and system efficiency in predicting the dependent variable (Miao & Xu, 2024). The mean dropout loss, calculated over fifty permutations, indicates the relative contribution of each feature to model accuracy. Higher dropout loss values correspond to higher importance, because permuting an influential variable degrades model performance more (Xu, 2021). In this dataset, PM2.5 (PM25, 3.919) and VOC (3.820) show the highest dropout loss. Sound Insulation (SND), Thermal Insulation (THR), System Efficiency (SEF), and Coefficient of Performance (COP) show lower values (3.595 to 3.666), so by this measure their individual effect is smaller, although SEF, COP, and SND rank among the highest variables by mean decrease in accuracy. These indicators are directly linked to the comfort and operational quality of the indoor environment, which are central to the social sustainability dimension of smart buildings (Chowdhury et al., 2023). Variables such as Humidity (HUM), Occupants (OCC), and Air Changes per Hour (ACH) contribute to the model with a moderate but consistent effect (3.644 to 3.727), emphasizing how internal environmental control and occupancy behavior affect building performance through indirect interactions. The other two importance measures, mean decrease in accuracy and total increase in node purity, complement these findings.
The positive mean decrease in accuracy associated with PM2.5 (PM25), System Efficiency (SEF), and COP indicates that they significantly enhance the model’s predictive capacity, while negative or small values in other variables reflect lower or context-dependent influence. The total increase in node purity, a measure of how much a variable reduces overall variance when it is used to split data in the decision trees, places VOC, PM25, and HUM at the top, in line with the dropout loss ranking, which suggests internal coherence across evaluation metrics (Lou, 2025).
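The node purity measure mentioned above can be illustrated for a single split of a regression tree. In this sketch, purity gain is taken as the reduction in the residual sum of squares produced by splitting on one variable at one threshold. In a Random Forest these gains are accumulated over every split that uses the variable and over all trees, and the exact scaling depends on the software, so this is an assumption-laden illustration rather than the authors' computation. The names `sse` and `purity_increase` are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def purity_increase(x, y, threshold):
    """Reduction in residual sum of squares from splitting on x <= threshold."""
    left = [t for v, t in zip(x, y) if v <= threshold]
    right = [t for v, t in zip(x, y) if v > threshold]
    return sse(y) - (sse(left) + sse(right))
```

For example, with `x = [1, 2, 3, 4]` and `y = [10, 10, 20, 20]`, a split at 2.5 separates the two target levels perfectly and removes the whole sum of squares, whereas a split at 1.5 removes only part of it.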

Table xyz. Feature Importance Metrics for the Random Forest Model — Social (S) Component

| Variable | Mean decrease in accuracy | Total increase in node purity | Mean dropout loss |
|---|---|---|---|
| VOC | -310.022 | 1.200×10⁸ | 3.820 |
| PM25 | 2.522×10⁶ | 7.406×10⁷ | 3.919 |
| HUM | -361.803 | 6.387×10⁷ | 3.727 |
| OCC | -455.332 | 6.330×10⁷ | 3.647 |
| ACH | -777.406 | 5.574×10⁷ | 3.644 |
| PM10 | -38.839 | 5.482×10⁷ | 3.632 |
| LPD | 153.570 | 5.290×10⁷ | 3.638 |
| EUI | 120.725 | 5.279×10⁷ | 3.653 |
| COP | 346.376 | 4.928×10⁷ | 3.666 |
| SND | 284.558 | 4.860×10⁷ | 3.602 |
| THR | -515.408 | 4.721×10⁷ | 3.595 |
| SEF | 862.251 | 4.260×10⁷ | 3.623 |
| EER | -55.396 | 3.588×10⁷ | 3.563 |

Note. This table reports the importance metrics derived from the Random Forest model, including mean decrease in accuracy, total increase in node purity, and mean dropout loss. Variables such as SEF, COP, and PM2.5 show the highest influence on model accuracy, confirming their central role in explaining social sustainability and indoor comfort dynamics in smart buildings.

The additive explanations for predictions provide another layer of interpretability, illustrating how each variable contributes to specific case predictions. For example, in the first test case, variables such as PM25 and EER (Energy Efficiency Ratio) have strong positive contributions to the predicted value, whereas factors like VOC and LPD exert negative effects. These additive contributions allow the decomposition of predictions into comprehensible components, which is particularly valuable for digital twin applications that rely on traceable, feature-level understanding to inform operational decisions (Ozdemir et al., 2025). The capacity to visualize how indoor comfort, air quality, and energy efficiency dynamically influence outcomes reinforces the model’s practical relevance for smart building management. Overall, the Random Forest model demonstrates a robust and balanced capability to capture complex, nonlinear interactions among social variables within the ESG framework. It effectively distinguishes between features with direct physical impacts—such as thermal and acoustic insulation—and those representing behavioral or environmental feedbacks, like occupancy and ventilation rates. This multi-level interpretability confirms the dataset’s scientific validity, showing that it contains coherent, measurable relationships consistent with the physical and social principles of building performance (Orlenko & Moore, 2020; Drobnič et al., 2020). Therefore, this analysis validates the dataset as a reliable foundation for the development of an intelligent management prototype that integrates machine learning with digital twin and metaverse environments. The model’s structure supports the simulation of user comfort and operational efficiency, providing a data-driven mechanism for adaptive, sustainable management of smart buildings (Yu et al., 2021; Akhtar et al., 2024).

Table xyz. Additive Prediction Explanations for the Random Forest Model — Social (S) Component

| Case | Predicted | Base | OCC | HUM | PM25 | PM10 | VOC |
|---|---|---|---|---|---|---|---|
| 1 | 10.119 | 9.706 | 137.642 | 356.475 | 102.444 | -145.549 | -198.943 |
| 2 | 9.497 | 9.706 | -141.526 | -557.700 | 510.704 | -426.882 | -234.224 |
| 3 | 8.345 | 9.706 | 271.758 | 310.474 | -1.262 | -290.846 | -164.060 |
| 4 | 9.409 | 9.706 | -99.743 | -292.405 | -1.277 | 613.552 | 1.001 |
| 5 | 9.857 | 9.706 | -64.956 | 461.545 | 242.598 | -5.588 | -325.978 |

| Case | ACH | THR | SND | EER | COP | SEF | EUI | LPD |
|---|---|---|---|---|---|---|---|---|
| 1 | -182.452 | -28.543 | 158.293 | 285.878 | 74.686 | -124.032 | 81.448 | -104.236 |
| 2 | 151.726 | 122.638 | 432.515 | -131.370 | 255.324 | 100.229 | -32.603 | -258.426 |
| 3 | -167.946 | -121.153 | -146.303 | 6.577 | 260.931 | 207.834 | -42.744 | -223.179 |
| 4 | -253.144 | 52.756 | 403.440 | -192.171 | 189.206 | 139.739 | -392.002 | -191.752 |
| 5 | -52.027 | 84.146 | -43.087 | -56.376 | 267.341 | 184.743 | -363.505 | -178.109 |

 

Note. The table illustrates the additive contributions of each variable to the predicted values across five test cases. Positive and negative values indicate how each social indicator (e.g., OCC, PM2.5, SND, COP) influences the final prediction relative to the baseline. These results confirm the interpretability of the Random Forest model and its capacity to capture complex interactions between comfort, air quality, and energy efficiency in smart building environments.

 

 

8.3 Machine Learning Validation of the Governance (G) Component within the ESG Framework

 

The validation of the Governance (G) component of the ESG model through machine learning techniques represents a critical step in ensuring the scientific reliability and applicability of the dataset for the prototyping of a smart building management system. Within this context, the application of the Support Vector Machine (SVM) algorithm was identified as the most effective method for the validation process (Wang, 2025). The Governance dataset includes key indicators such as Cost of Energy Saved (CES), Energy Return on Investment (EROI), Energy Payback Time (EPBT), Construction and Capital Costs (CPD and CCF), System Performance Coefficient (SPC), Renewable Energy Utilization (REU), and Energy Productivity per Worker Hour (EPWH). These variables jointly capture the economic and managerial dimensions of building performance, linking financial efficiency with operational sustainability (Wu et al., 2023). SVM was selected due to its superior performance across multiple validation metrics, particularly in minimizing mean absolute error (MAE) and mean absolute percentage error (MAPE), while maintaining a high coefficient of determination (R²). Unlike linear regression or decision trees, which may struggle to represent nonlinear dependencies, SVM effectively models the complex and interrelated relationships among governance variables (Lin & Hsu, 2023). This is crucial for ESG-driven frameworks, where economic efficiency, energy optimization, and operational decision-making are deeply intertwined. The low normalized MSE and RMSE further confirm the algorithm’s capacity to reduce prediction variance, ensuring high accuracy in estimating key governance outcomes such as cost-effectiveness and return efficiency (Koseoglu et al., 2025). The dataset itself, composed of one hundred buildings with diverse energy and cost characteristics, provides a robust foundation for testing the generalization capabilities of the model. 
SVM’s kernel-based approach allows for capturing nonlinear interactions between energy payback time, system costs, and governance efficiency indicators without overfitting the data (Suprihadi & Danila, 2024). This adaptability makes it particularly suitable for applications in digital twin environments, where data-driven models must reflect real-time changes and complex system feedbacks. By integrating this validated model into a digital twin framework, it becomes possible to simulate governance-related decisions in virtual environments before implementing them in physical infrastructures. This enhances predictive control, cost management, and operational resilience in smart buildings. The ability to test policies, predict maintenance needs, or optimize energy-economic trade-offs within the metaverse extends the role of the Governance component beyond data analytics, transforming it into a dynamic management tool. Therefore, the use of SVM for database validation ensures methodological rigor and computational robustness, confirming that the dataset is not only statistically coherent but also operationally meaningful. This validation establishes a scientific foundation for developing a prototype capable of merging machine learning, digital twin technologies, and ESG-based governance metrics into a unified management model for smart, efficient, and sustainable buildings.
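The kernel idea mentioned above can be illustrated with the Gaussian (RBF) kernel, a common default for support vector regression. The text does not state which kernel was used, so both the kernel choice and the `gamma` value are assumptions, and the name `rbf_kernel` is hypothetical.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) similarity between two feature vectors of equal length."""
    squared_distance = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-gamma * squared_distance)
```

The kernel returns 1 for identical points and decays toward 0 as points move apart, which is what allows a model built on it to fit nonlinear relationships in the original feature space.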

Table xyz. Normalized Performance Metrics of Machine Learning Models for ESG Dataset Validation — Governance (G) Component.

| Metric | Boosting | Decision Tree | KNN | Linear Regression | Random Forest | Regularized Linear | SVM |
|---|---|---|---|---|---|---|---|
| MSE | 0.000 | 0.586 | 0.952 | 1.000 | 0.573 | 0.694 | 0.436 |
| RMSE | 0.000 | 0.742 | 0.965 | 1.000 | 0.733 | 0.812 | 0.570 |
| MAE | 0.000 | 0.820 | 0.908 | 1.000 | 0.380 | 0.129 | 0.000 |
| MAPE | 0.164 | 1.000 | 0.682 | 0.620 | 0.783 | 0.968 | 0.000 |
| R² | 0.940 | 0.433 | 0.928 | 0.560 | 0.980 | 0.793 | 0.000 |

 

Note. The table reports normalized performance metrics for several machine learning models applied to the Governance dimension of the ESG dataset. Among the tested algorithms, the Support Vector Machine (SVM) was selected for its overall balance: it achieves the lowest normalized MAE and MAPE (0.000) with mid-range MSE and RMSE, supporting its robustness for dataset validation in smart building governance modeling.

 

The results obtained from the validation of the Governance (G) component of the ESG model using machine learning provide a consistent and technically coherent confirmation of the dataset’s reliability for the prototyping of a smart building management system. In this validation phase, the analysis focuses on the feature importance metrics and the additive explanations derived from the Random Forest regression model, which was used to estimate the AREA variable from a set of governance-related indicators: CES (Cost of Energy Saved), EROI (Energy Return on Investment), EPBT (Energy Payback Time), CPD (Construction Cost), CCF (Capital Cost Factor), SPC (System Performance Coefficient), REU (Renewable Energy Utilization), and EPWH (Energy Productivity per Worker Hour). The Mean Dropout Loss is nearly constant across all variables, ranging only from 5.142 to 5.155, which suggests that each feature contributes similarly to the model’s predictive accuracy. This uniformity implies that the dataset is well-structured, with no variable disproportionately influencing the model. The stability in dropout loss is also consistent with the absence of overfitting. From a methodological standpoint, this homogeneity supports the internal coherence of the Governance dataset and indicates that each metric contributes to explaining different aspects of building efficiency and management performance.

 

 

Table xyz. Feature Importance Metrics Based on Mean Dropout Loss — Governance (G) Component

| Variables | EPWH | CPD | CCF | SPC | REU | CES | EROI | EPBT |
|---|---|---|---|---|---|---|---|---|
| Mean Dropout Loss | 5.155 | 5.154 | 5.151 | 5.149 | 5.148 | 5.144 | 5.144 | 5.142 |

Note. The table presents the Mean Dropout Loss values for each governance-related variable in the ESG dataset. The results show minimal variation among indicators (≈5.14–5.16), confirming a balanced contribution of all features to model accuracy and validating the internal consistency of the dataset used for smart building governance modeling.

 

 

 

The additive explanations of the predictions for the test set provide further insight into how each variable influences the estimated AREA values. The predictions stay very close to the base value of 9.309, with feature contributions generally close to zero. These subtle shifts indicate that the model captures interactions among governance variables without introducing excessive noise. For instance, the CPD and CCF indicators show minor but systematic effects, reflecting the role of cost-related parameters in determining building scale and resource allocation. Similarly, the contributions from REU and EPWH reflect the connection between renewable energy utilization, labor productivity, and overall building governance efficiency. From a broader perspective, these results support the model’s capacity to interpret governance-related dynamics within the ESG framework. The balanced feature importance distribution suggests that the variables are not redundant but complementary, collectively enhancing predictive accuracy and interpretative value. In the context of smart building management, this outcome is particularly relevant because it supports the integration of governance indicators into a decision-support system capable of optimizing energy efficiency, financial sustainability, and operational planning. The validation therefore indicates that the database is statistically consistent and suitable for the development of an intelligent management prototype leveraging digital twin and metaverse technologies. The capacity to model economic and performance interdependencies establishes a foundation for advanced predictive control, simulation-based policy testing, and strategic governance of smart buildings, so that the system’s management model is both scientifically validated and operationally viable in a real-world digital twin environment.

Table xyz. Additive Prediction Explanations for the Governance (G) Component — Feature-Level Contributions

 

| Case | Predicted | Base | CES | EROI | EPBT | CPD | CCF | SPC | REU | EPWH |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 9.309 | 9.309 | -0.020 | 0.017 | -8.680×10⁻⁴ | -0.093 | -0.007 | 0.091 | -0.049 | -0.092 |
| 2 | 9.309 | 9.309 | 0.022 | -0.024 | 0.027 | 0.299 | 0.034 | -0.120 | 0.116 | -0.149 |
| 3 | 9.309 | 9.309 | -0.020 | -0.007 | -6.765×10⁻⁴ | 0.010 | -0.074 | -0.007 | 0.086 | -0.155 |
| 4 | 9.309 | 9.309 | 0.003 | 0.031 | -9.158×10⁻⁴ | 0.323 | -0.045 | -0.037 | -0.128 | -0.155 |
| 5 | 9.309 | 9.309 | 0.012 | -0.013 | -6.446×10⁻⁴ | -0.265 | 0.108 | -0.136 | -0.053 | 0.17 |

 

Note: The table shows the additive decomposition of predicted values for five test cases within the Governance (G) component. Each variable’s contribution is expressed as a deviation from the base prediction (9.309), illustrating how governance indicators such as CPD, CCF, and REU subtly influence model output. The small variations confirm the stability and coherence of the dataset and the balanced behavior of the machine learning model.

 

 

  1. Operationalizing Environmental Sustainability through Digital Twins: A Metaverse-Enhanced ESG Dashboard for Smart Building Management

 

In the rapidly shifting landscape of AI-driven building management, the alignment of ESG factors with digital twin technology and metaverse-based engagement is driving a paradigm shift in sustainability. To enable prototyping and training of a digital twin infrastructure for sustainable building management in a smart city setting, and to support the development of ESG-based KPIs for digital twin sustainability metrics, this article proposes a dashboard titled The ESG KPI Framework – Metaverse-Enhanced Operations. The dashboard provides a comprehensive view of environmental building metrics, covering carbon emissions, energy usage patterns, and the integration of renewable energy sources, in addition to sustainable and optimized building management performance. It is fed by real-time IoT-based streaming sources and supported by computational methods such as PCA and Ordinary Least Squares for predictive modeling. Through metaverse-based infrastructure, including an interactive 3D platform for key sustainability metrics such as carbon emissions and building performance, the dashboard presents these metrics in a view that resolves in a waterfall fashion. It thus brings together sustainability intelligence and digital platform development, and points toward a metaverse-driven paradigm for sustainability intelligence and predictive development.

 

This dashboard represents the Environmental (E) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations and has been designed for prototyping and training a digital twin and metaverse-based system for smarter building management (No-Roozinejad Farsangi et al., 2024). It provides information on the building's environmental performance and demonstrates how sustainability metrics can be measured, authenticated, and visualized to improve ecological intelligence and efficiency within a metaverse-based management platform (Mahariya et al., 2023). It offers a systematic evaluation of key environmental factors for tracking the building's carbon footprint and the renewable energy generated and integrated within the building. The Carbon Footprint, at 453.75 tCO2e, measures the total carbon emissions generated by building operation in a given period; it is the reference figure against which the building's emission-reduction commitments and the limits and goals set within ESG frameworks are assessed. The Emission Intensity of 0.0249 tCO2/kWh suggests that the building maintains a high level of energy efficiency and a low environmental impact. Renewable resources account for 57.4% of its energy structure, which indicates that the building is well aligned with ESG-based strategies targeting net-zero and the transition to a green building approach (Dovolil & Svítek, 2024).
The building’s Energy Consumption of 1,440,827 kWh and related Energy ROI of 260220.0 suggest that sustainability strategies can deliver returns in this context and that actions are focused on energy efficiency. The building can cover a substantial portion of its demand through on-site and renewable sources, as indicated by a Load Covering Factor of 76.5% and a Supply Covering Factor of 84.8%. The On-site Energy Ratio of 0.68, together with a grid-interaction value of 58.6%, suggests that the building can operate with a considerable degree of autonomy while maintaining a strategic connection to the grid.
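The two cover factors reported on the dashboard can be illustrated with a short sketch. The text does not give the formulas, so the definitions below are assumptions: they follow a widely used load-matching convention in which the energy matched in each time step is the smaller of load and on-site generation. The function name `cover_factors` and the sample series are hypothetical.

```python
def cover_factors(load_kwh, generation_kwh):
    """Return (load cover factor, supply cover factor) for paired time steps.

    Assumed definitions: matched energy per step is min(load, generation);
    the load cover factor divides it by total load, and the supply cover
    factor divides it by total on-site generation.
    """
    matched = sum(min(l, g) for l, g in zip(load_kwh, generation_kwh))
    load_cover = matched / sum(load_kwh)
    supply_cover = matched / sum(generation_kwh)
    return load_cover, supply_cover


# Hypothetical four-step profile (kWh per step), for illustration only.
lcf, scf = cover_factors([4.0, 4.0, 4.0, 4.0], [0.0, 6.0, 2.0, 0.0])
```

In this toy profile, 6 of the 16 kWh of demand are met on site (load cover factor 0.375), and 6 of the 8 kWh generated are used on site (supply cover factor 0.75).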
The bottom portion of the dashboard gives a graphical representation of Energy Flow, showing the building's energy dynamics and its ability to switch between sources to sustain operation. “This capability to forecast and react to changes in demand patterns is but one example of how predictive control methods are indeed imbedded within this environmental management framework” (Masubuchi et al., 2025). The information in this dashboard is continuously updated using IoT devices and validated using computational methods such as correlation analysis, Principal Component Analysis (PCA), Ordinary Least Squares (OLS), and machine learning algorithms (Hassani et al., 2022). These methods ensure that the information is trustworthy and help make the Key Performance Indicators science-based for training a digital twin. In this way, “the environmental factor in ESG becomes not only descriptive but can accurately forecast future performance scenarios under different scenarios of either operation and climate in which a digital twin performs” (Tsouri & Avgousti-Della, 2024). At the level of design intent, the dashboard shows that environmental intelligence and immersive technologies have been integrated.
At the metaverse level and within this context-based scenario, this information is analyzed in real time through interactive 3D to assist in “the direct effect of building operation decisions on energy and carbon emissions and system performance” (Hernandez et al., 2023).The immersive experience is one that “brings monitoring and controlling traditional environments to an 'experiential' level for learning and strategies” while making “sustainability a not fixed information piece but a 'dynamic' and 'participative' strategy for decision-making” (E Zainab & Bawanay, 2023). The dashboard is one of those building foundational ingredients that contribute to a strategy platform for a digital twin application in a smarter building environment. On its platform strategy formulation, this one offers “continuing” feedback between information and simulation and system-level “optimization” for better building performance through a logical, optimized logical calculus for a better development strategy (Markopoulos et al., 2024). IT recognizes that “strategic” developments for “building environmental intelligence and immersive display” are reaching a decisive point to achieve “an intelligent ecosystem that can self-act to improve its building environment while keeping to transparency and accountability” and for itself “the strategic de-velopment level for optimized building development” and “strategic” that can self-act through its “strategic development level for optimized building development” in making better strategies for building. In conclusion, and in relation to how this ESG theory can help develop better building strategies for smarter buildings through a digital twin immaterial platform. The “E” in ESG theory can actually help in making a computational strategy for building digital twins and metaverse technology, and making better building strategies. 
The dashboard has a “continuous” and “data-driven” strategy for developing a new “ecological” approach to smarter building infrastructure systems that are not “theoretical” but “founded,” “dynamic,” and “interactive” realities in carbon emissions and renewable energy.

 

 

 

Figure xyz. Environmental Dimension Dashboard – ESG KPI Framework for Smart Building Digital Twin Development. Note: This dashboard represents the Environmental (E) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations, focusing on carbon footprint, renewable energy use, and energy efficiency. The displayed KPIs—such as 453.75 tCO₂e, 57.4% renewable energy, and 0.0249 tCO₂/kWh emission intensity—demonstrate strong environmental performance. The Energy Flow and Load Shifting charts visualize real-time energy dynamics, supporting the digital twin prototype for sustainable and intelligent building management.

 

 

 

 

The above dashboard represents the Social (S) factor in relation to the ESG Key Performance Indicator Framework – Metaverse-Enhanced Operations within a digital twin and Metaverse-based system for a smart building management system (Farsangi et al., 2024). Unlike other dashboard views that consider sustainability in relation to environmental and governance factors, this dashboard focuses on directly improving user well-being and health through quantified, measurable Key Performance Indicators for social sustainability. The dashboard combines real-time monitoring from building sensors and digital twin analysis to evaluate indoor environmental quality (IEQ). This makes it a pivotal framework for ESG layers to ensure that building users’ well-being is maintained through a digital and sustainable metaverse platform. The top portion of this dashboard provides a summary of information on building sustainability. The Carbon Footprint (453.75 tCO2e) is maintained as indicative in ESG layers to promote sustainability. The key indicator directly related to building users’ health and well-being is given as an “Excellent” rating for indoor air quality. It is denoted as “11.4 μg/m3” and classified as a “PM2.5” concentration for air purity. The building maintains a “PM2.5” concentration within a “Very Low” level to ensure that building users are protected from inhaling building air pollutants. The subsequent section of this dashboard provides a deeper evaluation of key social factors that define the indoor experience for building users. The building maintains its “Relative Humidity” at “51.1%,” within the “Normal” range. Therefore, this ensures that building users experience health and well-being related to indoor humidity. The dashboard shows that the “PM10” air concentration is maintained at “20.9” “µg/m3,” while the “Volatile Concentration” is denoted as “20” “ppb” to support building health and well-being. 
The building maintains its “Air Changes/h” at “2.8” “1/h”, which supports indoor air quality and, in turn, building users’ health and well-being. The “R-Value” is maintained at “2.19”, which supports stable indoor temperatures. The “Sound Insulation” level is shown as “-” “dB”; it is associated with indoor noise stress among building users, which the dashboard reports at a “Low” level. The dashboard also displays a health and well-being factor at a “Very Low” level and shows that indoor temperatures are kept within a “Comfort” level. The dashboard serves as a component of the prototype for the digital twin system, translating abstract social sustainability criteria into measurable indicators to promote a new paradigm of human-centered, resilient building management.

Figure xyz.  Social Dimension Dashboard – ESG KPI Framework for Digital Twin and Metaverse-Based Smart Building Management. This dashboard focuses exclusively on the Social (S) dimension of the ESG framework, illustrating KPIs that measure comfort, health, and indoor environmental quality. Indicators such as PM2.5 (11.4 µg/m³), VOC (20 ppb), Sound Insulation (35.6 dB), and System Efficiency (86.0%) provide quantifiable insight into occupant well-being. These data, integrated within the digital twin and metaverse-based prototype, form the basis for predictive, interactive, and human-centered smart building governance.

 

 

The dashboard above represents the Governance (G) dimension within the ESG KPI Framework – Metaverse-Enhanced Operations. The dashboard is specifically designed to train and validate a prototype for building an integrated digital twin and metaverse system for smart building management (Noroozinejad et al., 2024). The dashboard is distinct from the other ESG views because it focuses solely on economic governance and financial decisions related to energy management and sustainability (Adnan et al., 2024). The top portion of this dashboard frames its scope in relation to governance through three core metrics. Carbon Footprint (453.75 tCO2e) and Indoor Air Quality (11.4 μg/m³ PM2.5, Excellent) are retained to maintain continuity with the other ESG dimensions. All remaining metrics in this view, however, are financial in scope. The Energy ROI (26,022.0), with a 0.8-year payback period, signifies economic sustainability in terms of how energy investments are returned (Cranford, 2023). This metric indicates how efficiently capital and financial governance sustain the digital twin across financial and environmental parameters. The middle portion of this dashboard breaks down the financial parameters relevant to governance. The Investment for system implementation is 1,679,500 €, while Subsidies are 883,081 €, corresponding to 52.6% of the total capital. These metrics indicate the level of transparency and traceability of ESG financial governance by showing how the investment is split within this ESG platform (Park et al., 2023). The Cost Energy Saving metric (0.701 €/kWh) signifies the financial return of energy-saving initiatives. The Annual Savings value (22,513 €/yr) reflects the yearly financial benefit associated with sustainability measures. The Peak Demand Cost (187,854 €) and the related demand metrics on this ESG platform are further indicators of financial sustainability.
The digital twin and the demand metrics within this ESG platform enable it to forecast demand cost variability across different scenarios (Aloqaily et al., 2022). The Cash Flow (–372,815 €) reflected in this ESG platform is used to assess the investment and the financial sustainability of the digital twin platform. This metric is complemented by a 15-year forecast, shown in the cash flow projection. The projection graph is one of the most important aspects of the Governance factor: it displays both the “Annual Cash Flow” and the “Cumulative Cash Flow” to better define financial recovery dynamics over time (Masubuchi et al., 2025). The “Negative bar” in the first year refers to capital outflows as a direct investment cost, and the “Positive bars” that follow denote “yearly savings”. The point where the “cumulative curves shift into positive zones” denotes the stage at which the investment becomes profitable. The ability to simulate this within a digital twin system helps the metaverse platform project financial sustainability and performance in line with its ecological and operational potential (Trung, 2022). The left sidebar titled “Building Config” enhances this “governance strategy” by including “Real and Simulated parameters” within the digital twin platform. Users can switch between “Real Building Data” and “Simulated Data” and define “building config parameters” such as “Building Size” (16,795 m²), “Building Occupants” (701 people), and “Total Units” (209), with a view to ensuring “comparability” and “normalization” of “governance KPIs” when different building types come under this ESG factor and “Governance strategy” classification (Zainab & Bawanay, 2023).
The switches that enable “Renewable Energy Systems”, “Smart & Grid Integration”, and “Metaverse Technology” not only support monitoring but also enhance training of the “governance logic” of the “prototype”, so that it can react to and “predict” future investment decisions in its “simulated metaverse” platform (Duong et al., 2023). In essence, this dashboard acts as a “Governance Cockpit” within the ESG framework, designed for monitoring and predicting the “financial and strategic” aspects of “smarter building” management. The dashboard directly connects financial accountability and sustainability goals and can define “Investment Efficiency”, “Savings” impact, and “Pay Back Time” while predicting future “economic behavior” in this metaverse and digital twin platform. Within the prototype, these “indicators” set up a “training ground” for “algorithms” to support direct “decision-making” in “immersive” management scenarios. The dashboard thus translates the “G” factor of the ESG strategy into a “financial strategic digital governance model” within the metaverse digital twin. It shows that good governance is not only a management activity but also a computational task that can be modeled and optimized in a digital twin space. In this context, a digital twin can have a significant impact on building management.
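The cumulative cash flow logic described above (an initial outlay followed by yearly savings, with profitability reached where the cumulative curve turns non-negative) can be sketched in a few lines. This is an illustrative sketch only: the function names, the year-0 convention for the outlay, and the investment and savings values are hypothetical assumptions rather than the prototype's implementation. The subsidy share at the end uses the two figures reported in the text.

```python
from itertools import accumulate
from typing import List, Optional


def cumulative_cash_flow(net_investment: float,
                         annual_savings: float,
                         years: int) -> List[float]:
    """Cumulative cash flow per year; index 0 carries the initial outlay."""
    flows = [-net_investment] + [annual_savings] * years
    return list(accumulate(flows))


def break_even_year(cumulative: List[float]) -> Optional[int]:
    """First year in which the cumulative cash flow is non-negative, if any."""
    for year, value in enumerate(cumulative):
        if value >= 0:
            return year
    return None


# Hypothetical inputs chosen for illustration; not the dashboard's values.
curve = cumulative_cash_flow(net_investment=100_000.0,
                             annual_savings=12_500.0,
                             years=15)
payback = break_even_year(curve)
print(f"Break-even year: {payback}, cumulative at year 15: {curve[-1]:.0f}")

# Subsidy share recomputed from the figures reported in the text.
subsidy_share = 883_081 / 1_679_500
print(f"Subsidy share: {subsidy_share:.1%}")
```

A constant yearly saving is the simplest possible assumption; a discount rate or time-varying savings would change the break-even year and would need to be stated alongside any reported payback time.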

 

Figure xyz. Governance Dimension Dashboard – ESG KPI Framework for Metaverse-Enhanced Smart Building Management. This dashboard focuses on the Governance (G) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations, highlighting financial transparency and investment efficiency. KPIs such as Energy ROI (26,022.0), Payback Time (0.8 yrs), and Subsidies (52.6%) demonstrate strong economic performance, while the 15-Year Cash Flow Projection confirms long-term financial sustainability within the digital twin and metaverse-based go

 

Q3. The research explores the potential of integrating metaverse technologies into smart building management systems—a concept that is both forward-looking and interdisciplinary. However, the scientific quality is limited by several key deficiencies. The paper cites only 47 references, which is insufficient for establishing a coherent theoretical and technological foundation. The authors must expand the bibliography with recent (2020–2025) high-impact publications that reflect the rapid advancements in metaverse technologies, smart buildings, and digital transformation.

A3. Following the changes made, the number of references has significantly increased. Furthermore, the references are not only present within the literature review but are distributed throughout the text to provide technical, scientific, and academic support for the entire research effort, both during the dataset creation and validation phases and during the prototype development phases.

Q4. The idea of an integrated metaverse-based management system for smart buildings is highly original and aligns with current trends in immersive technologies and cyber-physical integration. However, originality alone is not enough to ensure scholarly contribution. The manuscript must demonstrate conceptual depth and methodological robustness to justify its novelty.

A4. To enhance the scientific dimension of the analysis, we modified relevant sections of the article. First, we began by revising the KPIs. Since our article is related to an industrial research project, we determined the need to introduce previously tested industrial KPIs into the project documentation developed with our research partners. Therefore, we modified the approach, starting with the presence of ESG KPIs. We then carefully described the ESG KPIs used within the project, also referencing the relevant scientific literature. We then explored database validation techniques in greater depth. To this end, we conducted various types of experiments to validate the data, including correlations, PCA, OLS linear regression, and machine learning regression algorithms. Once the database was validated, we completely redid the prototyping phase, commenting on the new results obtained.

 

Q5. At present, the discussion of metaverse integration remains largely descriptive. The paper would benefit from deeper theoretical engagement—particularly with frameworks such as digital twin interoperability, user-centered virtual environments, and intelligent data-driven control systems. Strengthening these connections would allow the research to move beyond conceptual novelty toward substantive scholarly contribution.

A5. We have added a paragraph to analyze this issue as follows:

 

  1. Theoretical Framework: Digital Twin Interoperability, User-Centered Virtual Environments, and Intelligent Data-Driven Control Systems in Smart Building Management

 

 

The convergence of digital twin interoperability, user-centric virtual realities, and data-driven control systems constitutes the theoretical framework for the smart management of buildings in metaverse-scale ecosystems (Aloqaily et al., 2022; Masubuchi et al., 2025). Digital twin interoperability forms the backbone that connects the different units in a building, such as HVAC, lighting, storage, and environmental units, via standardized data exchange protocols and modular interfaces (Lyu & Fridenfalk, 2024; Picone et al., 2023). Interoperability ensures real-time synchronization between physical units and their corresponding digital twins, facilitating predictive maintenance, dynamic resource management, and adaptive energy management (Li et al., 2024). The continuous one-way data flow facilitates an integrated and dynamic infrastructure network that can evolve with users and adapt to different environmental settings, thereby forming the basis for scalable metaverse-scale governance models (Aloqaily et al., 2022). On the one hand, the convergence of user-centric realities in the metaverse brings in the human factor, facilitating the transition from a visualization technology to an interactive management solution space (Ruiu et al., 2024). Occupants and users can use immersive visualization to interact directly with the different units in the metaverse, simulate spatial arrangements, analyze situations in depth, and co-develop sustainability strategies by exploring the digital space in real time (Zainab & Bawanay, 2023). On the other hand, complemented by human-centric interaction technologies, data-driven control technology forms the analytical core of metaverse-scale management platforms.
AI and machine learning techniques enable multisource data collection and analysis to identify previously undetected data patterns, predict consumption patterns, and optimize performance (Stefko et al., 2025). The integration of smart algorithms into control mechanisms enables a transition from reactive to proactive control that self-adjusts to variations in occupancy, external environments, and consumption (Masubuchi et al., 2025).

 

 

 

 

 

 

 

 

Q6. The writing is generally clear and readable, but the organization and flow require significant improvement. Several sections are repetitive and loosely connected, leading to a fragmented narrative. Transitions between conceptual explanation, framework development, and implementation are abrupt. Improving coherence and logical progression will enhance readability and academic professionalism. Furthermore, visual and structural aids—such as conceptual models, flow diagrams, or prototype schematics—should be included to illustrate system architecture and functional relationships. These would help readers grasp how the proposed metaverse platform operates in practice.

A6. We have completely modified the structure of the article. Following the comments raised by other reviewers, we have given significant space to the presentation of KPIs within the ESG context, and then to dataset validation techniques through the application of a series of methodologies: correlation, PCA, regression, and machine learning regression. We then reanalyzed the characteristics of the reprogrammed prototype based on the newly acquired information.

Q7. Abstract: The abstract lacks clarity and precision. It should present the research problem, objectives, methods, and key contributions rather than offering a thematic overview. Quantitative or conceptual highlights would strengthen its scientific tone.

A7. The abstract has been modified as follows:

This article proposes a complex solution to improve sustainable intelligent building management based on the principles of Environmental, Social, and Governance (ESG) factors. The ESG KPI Framework – Metaverse-Enabled Operations incorporates the latest digital twin solutions, IoT sensor systems, and metaverse platforms to deliver real-time management and optimization of ESG factors. A hybrid solution strategy has been used in this framework, focusing on auto-acquisition of information and multiple validations at different levels through correlation analysis, Principal Component Analysis (PCA), Ordinary Least Squares (OLS) regression, and Machine Learning. The designed prototype links all the solutions together in a multi-level dashboard to represent key performance factors such as carbon footprint, energy consumption, renewable energy use, and occupant wellness. Experiments conducted validate the effectiveness of the proposed solution in improving prediction efficiency and user interaction experience during metaverse simulations.

 

 

Q8. Introduction: The introduction presents the background but fails to establish a clear research gap or objective. The authors should add one for research significance, explaining why the study is important.

 

A8. The introduction has been modified as follows

Recent technological advancements and the sustainable development movement have led to revolutionary changes in intelligent building management. The present-day system has evolved from simple automation into complex network-based architectures that support data-driven and adaptive operations (Lemian & Bode, 2025). This transformation reflects the convergence of artificial intelligence, machine learning, Building Information Modeling (BIM), and digital twin technologies in the built environment (Solmaz, 2024). The focus of smart building systems has consequently shifted from energy optimization alone to encompassing user health, carbon emission reduction, and ethical management principles (Fokaides, Jurelionis, & Spudys, 2022). However, the ever-increasing use of digital solutions has introduced greater complexity into urban systems, which now require more advanced tools for planning, operation, and monitoring (Sabri & Witte, 2023). This complexity has encouraged the development of innovative methodologies that extend beyond traditional control or energy management systems, emphasizing prediction, automation, and adaptability as the new priorities (Zavaleta, 2025). Consequently, the need for real-time monitoring systems with predictive and visual capabilities has become increasingly urgent, as current approaches are often static and reactive rather than dynamic and proactive (Gao et al., 2024). Existing systems, in fact, have remained relatively basic, providing passive reporting based solely on limited variables such as temperature, occupancy, or energy efficiency. While such systems have contributed to initial efficiency gains, they neglect the broader context of sustainability and governance. Most fail to integrate Environmental, Social, and Governance (ESG) principles, which extend beyond energy metrics to include carbon neutrality, renewable energy use, user safety and well-being, and institutional transparency (Matei & Cocoșatu, 2024).
Digital twin technology, in particular, has emerged as one of the most promising solutions for implementing these multidimensional sustainability requirements. As highlighted by Lemian and Bode (2025), digital twins enable real-time integration and optimization between physical and virtual systems in the building sector, supporting continuous feedback and adaptive management. Similarly, Fokaides et al. (2022) demonstrate that the integration of digital twins within urban infrastructures—such as in the SmartWins project—can significantly contribute to achieving carbon neutrality and operational resilience in smart cities. Complementing digital twins, Internet of Things (IoT) systems play a fundamental role by using sensor networks to monitor environmental parameters, including energy use, air quality, temperature, and occupancy patterns, which provide the empirical foundation for real-time analytics (Matei & Cocoșatu, 2024). Gao et al. (2024) further emphasize how such sensor-based systems, when coupled with digital twins, can predict thermal comfort and environmental performance, thereby enabling more precise control and sustainable decision-making. At the same time, metaverse environments have introduced new possibilities for immersive 3D visualization and interaction within the digital twin ecosystem. Misilmani and Elbastawissi (2023) explore this potential through their integration of BIM–GIS–metaverse technologies in urban heritage planning, showing how immersive digital environments enhance both stakeholder engagement and design accuracy. Likewise, Sabri and Witte (2023) point out that digital technologies such as VR and AR are reshaping urban management and participatory planning, providing stakeholders with greater spatial awareness and collaborative capability in virtual spaces. Despite these advancements, the synergistic potential between metaverse technologies, digital twins, and ESG-oriented management remains largely unexploited. 
Most studies have focused primarily on optimization or visualization, with little attention given to embedding ESG metrics into real-time interactive frameworks (Zavaleta, 2025; Fokaides et al., 2022). Current ESG reporting systems are typically retrospective and periodic, offering limited predictive or prescriptive capabilities. This creates a disconnect between sustainability objectives and day-to-day operational management. The innovative solution presented in this study aims to overcome these limitations by linking the quantitative dimension of ESG factors to immersive digital environments, where stakeholders can experience and analyze sustainability performance dynamically. The proposed integration of digital twins, IoT sensing, and metaverse visualization establishes a continuous information loop, allowing users to access and interpret key ESG indicators—such as carbon footprint, energy consumption, renewable energy utilization, and occupant comfort—in real time (Gao et al., 2024; Matei & Cocoșatu, 2024). This interactive model empowers stakeholders to move beyond passive observation toward active scenario development and decision-making, enhancing participatory governance and system transparency. By continuously feeding validated IoT data into predictive digital twin models and visualizing outcomes within the metaverse, the framework supports self-correcting, adaptive performance management. Predictions are refined through iterative simulation, improving responsiveness to dynamic environmental and behavioral changes in building operations. Furthermore, the spatial and immersive dimension of sustainability, as visualized in metaverse environments, transforms abstract data into geographically intuitive experiences. 
As Zavaleta (2025) and Matei and Cocoșatu (2024) suggest, combining digital twins, blockchain networks, and sensor-based algorithms can create a foundation for sustainable smart city administration capable of integrating complex ESG objectives across multiple spatial and temporal layers. This approach redefines sustainability not merely as a numerical output but as an experiential geography, facilitating more effective spatial planning and community engagement toward sustainable urban development. Overall, this integrative strategy bridges the gap between ESG principles and the operational reality of sustainable development, contributing to the advancement of intelligent building management and the broader smart city paradigm. The resulting system aspires to move from static sustainability reporting to dynamic, data-validated, and user-centered decision-making, positioning metaverse-enabled digital twins as a transformative tool for achieving the next generation of sustainable, adaptive, and intelligent infrastructures. Despite the technological advancements achieved in previous studies, the integration of ESG principles into immersive, real-time building management systems remains underdeveloped, because previous studies have addressed only the technological capabilities of digital twins and IoT systems or the development of the metaverse. This study addresses that gap through the development and verification of the ESG KPI Framework: Metaverse-Enabled Operations, which combines digital twins, IoT-based sensors, and metaverse visualization components to create a real-time ESG optimization solution. The significance of the research lies in the potential of immersive, integrated environments to revolutionize the management of ESG.

 

Q9. A Theoretical and Technological Framework should follow the introduction to provide context and justification.

A9. The following section has been added:

The proposed ESG KPI Framework – Metaverse-Enabled Operations is based on the integration of the following three main components: theoretical, technological, and operational. From a theoretical point of view, the scheme depends on integrating the principles of Environment, Society, and Governance (ESG), as mentioned above. These principles form the basis for determining sustainability performance from a multidimensional perspective. They form the basis for understanding the information and the goals the system aims to achieve. The principles of the ESG model serve as the basis for achieving the goals related to the appropriate management and operation of intelligent structures. The technological foundation of the scheme relies on the Digital Twinning technique. This technique forms the basis for creating a virtual replica of the physical environment. Digital Twinning provides a real-time link between the physical and virtual worlds. This link enables the optimization of performance factors in real time. This can be achieved by recognizing that the Digital Twinning technique has undergone many changes over the last decade. In fact, the transition from BIM techniques to digital twins and, finally, to metaverse-based systems has been viewed as the biggest paradigm shift. The paradigm shift has significantly impacted the entire field of digital transformation. The Internet of Things (IoT) serves as the bridge between the physical and digital worlds. This enables the real-time collection of information on consumption and behavior related to occupation levels. This real-time information has been viewed as the best basis for generating the empirical data required for real-time decision-making. Moreover, the development based on IoT has also led to the integration of Blockchain and analytics techniques. This has contributed to the creation of reliable information about the entire built environment. 
An important advantage of digital structures has been the development of the metaverse. The metaverse serves as the basis for adding interactive realizations of visualization techniques. This has been achieved because the metaverse serves as the basis for the virtual reality simulator. This enables the determination of sustainability factors in practice. Moreover, the creation of the metaverse has enabled the application of visualization techniques, fostering an interactive atmosphere. This has contributed to the fact that the visualization techniques can reach 100% realism, as mentioned above. The metaverse context enables immersive, real-time experience of ESG performance through virtualization, simulation, and interactive interfaces that enhance cognitive insights and understanding of sustainability relationships and interactions (Hernandez et al., 2023). This immersive digital reality enables greater cooperative management and understanding of ESG factors through experiential visualization. In addition, the metaverse serves as a gateway to Industry 5.0, emphasizing human-centered, sustainable, and intelligent systems (De Giovanni, 2023). From a systemic perspective, the solution functions as a networked architecture that integrates the concurrent use of IoT-based information acquisition, AI-based predictive analytics, and metaverse-empowered 3D visualization. This interactive flow of information among the stacked components enables intelligent adaptation and feedback for continuous enhancement. This synergy enables perpetual assessment and the effective communication of sustainability performance in real time. In this regard, the ESG KPI Framework represents the integration of ESG principles and the latest advancements in digital infrastructure, that is, the convergence of theory and human experiential interaction.

 

Q10. Research Question: Toward an Integrated Metaverse-Based Platform for Smart Building Management: The central question is intriguing but lacks clarity and precision. The authors should frame the research question in measurable terms and connect it to the broader discourse on digital transformation. A schematic showing how the metaverse model addresses gaps in current smart building systems would improve this section.

A10. We have modified the section containing the research question as indicated below:

 

The main topic of this research stems from the purposes of the NextHub Project, in which the use of the metaverse in Smart Building solutions has been investigated. In this regard, the research envisions the design and development of a management structure that integrates performance measurement based on ESG factors with digital immersive environments. This should enable real-time interaction with, and optimization of, building performance by leveraging the interplay among the Digital Twin paradigm, the Internet of Things (IoT), Artificial Intelligence (AI), and Extended Reality (XR) systems. The motivation is that Smart Building systems remain disparate and consist mainly of separate entities such as energy-efficiency tracking and control systems and user comfort control systems. In other words, these systems lack a management structure that can connect performance metrics to digital immersion. Given that such structures continue to evolve very rapidly on the technological plane, driven by the metaverse environment, the development of a data-driven management structure focused on these factors has become the essence of the research. This forms the basis for the following research question: In what ways can a metaverse-based management system that utilizes Digital Twins, IoT, AI, and XR be more effective than currently available non-immersive systems in measurable aspects of ESG performance, such as energy efficiency, carbon footprint reduction, occupant comfort, and transparency in governance?
This question contextualizes the research effort within the wider discourse on the digital transformation of the built environment, in which the interplay between immersive visualization techniques, real-time analytics, and intelligent automation represents “the next phase of innovation in the built environment.” The emphasis on verifiable ESG factors ensures that the research remains more than merely speculative at the level of “grand concepts,” since empirical verification is “grounded in measurable outcomes.” To inform the literature search, a specific search strategy was applied to the Elsevier Scopus database (English-language publications between 2018 and 2025), using the following query:
“TITLE-ABS-KEY("metaverse") AND TITLE-ABS-KEY("smart building")”. This query returned only five peer-reviewed documents from the database, confirming that the topic has been addressed only very rarely. While these results fall short of the level required for a formal Systematic Review, they qualitatively underscore the lack of a well-defined body of research. Current studies offer valuable but fragmented insights: computing systems for intelligent building networks (Zhang et al., 2022), extended reality solutions for building maintenance and user experience (Casini, 2022), IoT-blockchain integration for the management of decentralized data (Ud Din et al., 2023), experimental systems combining IoT sensors and commercial metaverse platforms (Masubuchi et al., 2025), and theoretical systems merging the concepts of digital twins and the metaverse (Tang et al., 2025). Though very useful, these studies lack a standardized metric and a management approach to synchronize intelligent building management across technological, management, and sustainability aspects. This work builds upon these prior achievements and proposes a measurable, interactive management system that integrates the metaverse into building management. The system goes beyond the design of a management model and introduces standardized Key Performance Indicators for energy consumption efficiency, building maintenance efficiency, occupants' health and well-being, and governance. All four performance indicators can be quantitatively measured using real-time building sensor data streams and validated through AI-enabled analytics. In addition, the immersive nature of the system facilitates the visualization of these four building performance factors in physical space and supports building management decision-making.
To show how the three technology components of the building management system are interconnected, Figure 1 presents a schematic overview of the system. The management process culminates in a level that demonstrates the effectiveness of the digital transformation of building management with respect to environmental sustainability.

 

 

 

Q11. Theoretical and Technological Framework: The section remains descriptive and disconnected from established theoretical constructs. The authors should provide a clear conceptual framework explaining how metaverse technologies interface with digital twins, IoT systems, and building management workflows. Theoretical underpinnings—such as cyber-physical systems theory or human–computer interaction principles—should be referenced.

A11. We have modified the section entitled “Theoretical and Technological Framework” as follows:

 

The above-mentioned ESG KPI Framework – Metaverse-Enabled Operations has been designed based upon the following three fundamental pillars: the theoretical foundation, the technological foundation, and the operational factors. From a theoretical foundation, the ESG KPI Framework has been based on the constructs of “Cyber-Physical Systems” and “Human-Computer Interaction,” which describe the relationship between the digital and physical worlds in intelligent environments. The “Cyber-Physical Systems” theory explains the symbiotic relationship between the building's physical systems and their digital counterparts, and the interfaces that connect them, to establish a real-time sensing and control process between the virtual and physical systems (Elias et al., 2023; Rafique & Qadir, 2024). This “Cyber-Physical” relationship has become more robust through the integration of “Human-Computer Interaction” principles, which simplify the development of human-centered interfaces that promote intuitive interaction and informed decision-making in the metaverse (Markopoulos, 2024; Stary, 2023). On the other hand, from a technological foundation perspective, the framework uses the “Digital Twin” as its computational core. A “Digital Twin” is the real-time digital representation of a physical structure that uses IoT-based information flows on energy consumption, indoor air quality, and occupancy patterns to optimize these factors. Moreover, the AI analytics module in the framework enhances the predictive capacity for pattern analysis, irregularity detection, and the development of self-adaptive management strategies (Elias et al., 2023). Blockchain-based solutions are being adopted to make ESG reporting more trustworthy and reliable. Blockchain-based solutions add authenticity to information about the actual sustainability records of the concerned structure (Picone et al., 2023; Rafique & Qadir, 2024). The “Metaverse” layer lies between human operators and the digital world. 
This layer enables users to navigate the real-time ESG performance indicators presented through “Human-Computer Interaction” visualization. This layer converts sustainability information from a numeric form into a spatial interface that can be navigated in 3D. Moreover, the “Virtual Reality” and “AR” solutions are adapted to enhance the perception of information on ESG factors in realistic contexts, thereby improving decision-making (Markopoulos, 2024; Stary, 2023). From the systemic and functional perspective: “The ESG KPI Framework can be viewed as a hierarchical cyber-physical architecture for intelligent building management. In the architecture, the IoT layer can be considered as the sensory layer. The Digital Twin layer can represent the analytics and prediction module. The Metaverse layer can represent the human-computer interaction layer. The interactions among the aforementioned layers can support a self-learning process encompassing the stages of “sensing,” “simulation,” “visualization,” and “optimization,” forming a closed loop. This closed-loop process enables real-time responsiveness and optimization. For example, Elias et al. (2023) describe a self-learning process for sensors that can monitor and analyze sensor readings. This can predict future sensor outputs. This can represent the prediction module. The prediction module can predict future human behavior as needed. This can represent the human module. The self-learning human module can generate appropriate human-machine interaction processes. This can represent the human-computer interaction module. The self-learning human module can influence the prediction module. This can represent the human-computer interaction module” (Elias et al., 2023). In total, the proposed ESG KPI Framework incorporates the concepts of CPS, HCI, and ESG in the digital universe.

 

Q12. KPI Framework for Immersive Smart Building Management: This section introduces an important idea but lacks methodological clarity. The authors should explain the selection process, categorization, and weighting of KPIs and relate them to measurable building performance indicators. Comparing these KPIs with established ones in existing BMS literature would demonstrate added value.

A12. The following section has been added to analyze and illustrate, also in light of the relevant scientific literature, the characteristics that led to the selection of the KPIs.

 

 

Development of the Environmental Dataset for Evaluating Smart Infrastructure Performance through Digital Twin Integration

 

 

Creating the environmental dataset for performance evaluation is an essential step in designing an intelligent, digital-twin-based model of smart infrastructures. The environmental KPI set therefore forms the foundation of the proposed research. The review of the literature on digital twins applied to the environmental performance of smart infrastructures identified the KPIs that such a model requires. The ultimate goal is an integrated system that can simulate the environmental performance of infrastructures. The KPI set supports an operational analysis of the interplay among energy efficiency, sustainability, and environmental performance. Carbon footprint and emission intensity provide extended information on environmental sustainability, while the load cover factor and on-site energy ratio provide evidence on system autonomy and energy efficiency. The dataset built from the KPI set permits an integrated analysis of environmental performance that corresponds to the environmental pillar of the ESG framework. It also permits an operational comparison of environmental performance across any number of infrastructures and provides a knowledge base on their environmental sustainability. Finally, it supports a continuous approach to design decisions concerning the creation of the dataset.
The dataset design can also be extended to the environmental design of the system as a whole, providing knowledge on the system's environmental sustainability. Ultimately, the digital twin applied to the environmental performance of infrastructures represents an intelligent approach toward their environmental sustainability.
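As an illustration of how the carbon footprint and emission intensity indicators named above could be computed from the dataset, the following minimal Python sketch may help. It assumes the carbon footprint is the sum of activity quantities multiplied by their emission factors, and that emission intensity is total CO₂ divided by total energy over the same period. The function names and all numeric inputs are hypothetical and are not taken from the manuscript.

```python
def carbon_footprint(activities):
    """Return total emissions as the sum of quantity * emission factor."""
    return sum(quantity * factor for quantity, factor in activities)


def emission_intensity(total_co2, total_energy_kwh):
    """Return CO2 emitted per unit of energy over the same period."""
    if total_energy_kwh <= 0:
        raise ValueError("total energy must be positive")
    return total_co2 / total_energy_kwh


# Hypothetical activity data: (quantity, emission factor in tCO2e per unit).
activities = [
    (12000.0, 0.0004),  # grid electricity in kWh
    (500.0, 0.0025),    # fuel in liters
]

cfpt = carbon_footprint(activities)
# Hypothetical period totals: 4.8 tCO2 emitted for 12000 kWh consumed.
emin = emission_intensity(total_co2=4.8, total_energy_kwh=12000.0)
print(f"CFPT = {cfpt:.2f} tCO2e, EMIN = {emin:.6f} tCO2/kWh")
```

Keeping the emission factors as explicit inputs makes it straightforward to rerun the same calculation under different operating scenarios.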

 

 

3.1 Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Evaluation of Smart Infrastructure

 

The chosen environmental Key Performance Indicators (KPIs) provide a comprehensive framework for evaluating the environmental performance of smart infrastructures. They support the aims and scope of the proposed digital twin platform, which is intended to analyze, model, and optimize the environmental, energy, and operational characteristics of buildings and urban infrastructures in an immersive and data-driven setting ([1]; Fokaides, Jurelionis, & Spudys, 2022). Each KPI adds a distinct perspective on energy, resilience, and efficiency, and together they form a holistic model for managing and interpreting the sustainability of urban infrastructures [1]. The Carbon Footprint (CFPT) provides a fundamental measure of sustainability by quantifying the total greenhouse gas emissions generated by system activity. It allows complex system operations to be interpreted through a single comparative metric expressed in CO₂-eq, covering both direct and indirect emissions (Zahedi, Alavi, Sardroud, & Dang, 2024). In digital twin analysis, the CFPT is essential because it enables real-time analysis and projection of environmental implications across different operating scenarios (Li, 2025). It effectively serves as the central link between energy performance and global climate goals (Yu, Ye, Xia, & Chen, 2024). Emission Intensity (EMIN) adds a further dimension by normalizing emissions by the energy consumed or produced. This ratio allows systems of different scales to be compared, making it highly valuable for multi-building and city-scale analysis (Alibrandi, 2022). The Load Cover Factor (LCF) and Supply Cover Factor (SCF) assess the relationship between energy demand and supply, an important consideration for energy and resource sufficiency.
The LCF evaluates how much of the energy demand over a given period is covered by local energy production, assessing system self-sufficiency, while the SCF assesses how much of the local energy production is actually used to meet that demand, assessing how fully the resource is used (Chávez et al., 2022). The Load Matching Index (LMI) evaluates the synchrony between local energy production and energy demand. High LMI values indicate that local production and storage are well matched to local loads, providing a basis for the efficiency and resilience of smart grids (Klar & Angelakis, 2023). The On-Site Energy Ratio (OER) captures the extent to which local energy consumption is covered by on-site generation from renewable energy sources, making it a crucial factor in assessing zero-energy buildings (Prandi et al., 2022). The Grid Interaction Index (GII) and No-Grid Interaction Probability (NGI) place autonomy in the context of the wider grid: the GII captures the intensity and direction of energy exchanges with the grid, while the NGI estimates the probability of autonomous operation (Fokaides et al., 2022). The Capacity Factor (CAF) and One Percent Peak Power (OPP) describe system performance at varying loads. The CAF estimates how fully the system uses its installed energy resources, a crucial index for judging return on investment, while the OPP focuses on peak loads and their intensity, estimating their impact on system stress [2]. Building on the concept of behavior-based system performance, the Demand Response Percentage (DRS) estimates the system's flexibility in adapting to varying loads, particularly in smart pricing scenarios [3].
The system's overall flexibility in adapting to external stimuli, such as market prices or renewable resource availability, and thus the transition from static energy management to adaptive operation, is captured by three indicators: the Flexibility Factor (FLF), the Flexibility Index (FLI), and Flexible Energy Efficiency (FEE) (Chávez et al., 2022; Li, 2025). This framework not only satisfies the requirements of system sustainability analysis but also supports decision-making, scenario analysis, and future system optimization ([4]; Zahedi et al., 2024). It therefore aligns well with the requirements of an intelligent, fully interoperable, and environmentally sustainable Smart Urban Ecosystem, supported by measurable performance indicators (Prati, Pelucchi, Dal Fiore, Fuzzati, & Agostini, 2023).
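The flexibility indicators discussed above can be sketched in a few lines of Python. This is only an illustrative reading, under stated assumptions: the Flexibility Factor is taken as (low-price consumption − high-price consumption) divided by their sum, with low and high prices defined by the first and third quartiles; the Flexibility Index is taken as one minus the ratio of flexible cost to reference cost; and the Demand Response Percentage is the relative reduction against a baseline. All function names, the zero-denominator convention, and the sample data are hypothetical.

```python
from statistics import quantiles


def flexibility_factor(consumption, prices):
    """(low-price use - high-price use) / (low-price use + high-price use)."""
    if len(consumption) != len(prices):
        raise ValueError("consumption and prices must have the same length")
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    low = sum(e for e, p in zip(consumption, prices) if p <= q1)
    high = sum(e for e, p in zip(consumption, prices) if p > q3)
    if low + high == 0:
        return 0.0  # assumed convention when nothing falls in either band
    return (low - high) / (low + high)


def flexibility_index(flexible_consumption, reference_consumption, prices):
    """1 - (cost with flexibility control / cost of the reference case)."""
    cost_flex = sum(e * p for e, p in zip(flexible_consumption, prices))
    cost_ref = sum(e * p for e, p in zip(reference_consumption, prices))
    if cost_ref <= 0:
        raise ValueError("reference cost must be positive")
    return 1.0 - cost_flex / cost_ref


def demand_response_percentage(baseline, during_event):
    """Percentage reduction of the load relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - during_event) / baseline


# Hypothetical day with eight price intervals.
prices = [10, 20, 30, 40, 50, 60, 70, 80]
flf = flexibility_factor([3, 3, 1, 1, 1, 1, 1, 1], prices)
fli = flexibility_index([3, 3, 1, 1, 0, 0, 0, 0], [1] * 8, prices)
drs = demand_response_percentage(baseline=10.0, during_event=8.0)
print(flf, fli, drs)
```

With these sample values, consumption concentrated in the cheapest intervals yields a positive Flexibility Factor and a positive Flexibility Index, matching the interpretation given in the table.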

 

Table xyz. Environmental Key Performance Indicators (KPIs) and Their Computational Formulations

KPI

ACRONYM

Description

Formula

Carbon Footprint

CFPT

Indicates the total amount of greenhouse gas (GHG) emissions caused by an individual, organization, or product, either directly or indirectly. The formula calculates the sum of emissions associated with different activities by multiplying the quantity of each activity by its corresponding emission factor [5].

 

 

 

 

$\mathrm{CFPT} = \sum_{i} A_i \cdot EF_i$

$A_i$ = Quantity of a specific activity that generates greenhouse gas emissions (e.g., km, kWh, liters).

$EF_i$ = Rate of GHG emissions per unit of activity, expressed in CO₂ equivalent per unit (e.g., tCO₂e/kWh for electricity, tCO₂e/liter for fuel, etc.).

Emission Intensity EI

EMIN

Evaluates the environmental impact of an energy system by measuring the amount of carbon dioxide (CO₂) emitted per unit of energy consumed or produced. A low EMIN value indicates that the system is more environmentally efficient, emitting less CO₂ for each unit of energy consumed or produced (this can occur through the use of renewable energy sources). Conversely, high EMIN values typically occur in systems that rely heavily on fossil fuels [6].

 

$\mathrm{EMIN} = \frac{E_{\mathrm{CO_2}}}{E_{\mathrm{tot}}}$

$E_{\mathrm{CO_2}}$ = Total amount of CO₂ emitted over a given period, resulting from the consumption of fossil fuels or the use of grid electricity [tCO₂]

$E_{\mathrm{tot}}$ = Total amount of energy consumed or produced during the same reference period [kWh]

Load Cover Factor

LCF

Represents the ratio between the energy actually supplied by a generation source and the energy demanded or consumed over a given time interval. When LCF = 1, the entire load demand is fully satisfied by generation. When LCF < 1, the load is not completely met during part of the period, due to limitations in generation or available resources. Range: 0 ≤ LCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Supply Cover Factor

SCF

Indicates the share of an organization's on-site energy supply that is effectively used to meet its energy demand. When SCF = 1, the amount of useful supplied resources is exactly equal to the total available amount. This implies that there are no significant losses and that all available resources are fully utilized. When SCF < 1, the amount of effectively usable resources is lower than the total available amount. Part of the generated energy is not used to meet the load, likely due to overproduction, losses, or storage capacity limitations. Range: 0 ≤ SCF ≤ 1 [7], [8].

 

 

 

= On-site energy generation at a given time t [kWh]

 = Storage energy losses at a given time t [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Storage energy balance at a given time t [kWh]

 = Charging energy of the storage system [kWh]

 = Discharging energy of the storage system [kWh]

Load Matching Index

LMI

Measures the efficiency with which on-site energy generation (whether renewable or not) matches the energy load (demand) of a system.

It evaluates how well the energy production profile corresponds to the load profile over time by analyzing the synchrony between supply and demand.

A higher index indicates a better match between generation and load.

When LMI = 1, the load is fully met (i.e., generation and storage are sufficient to cover the required demand) in every considered interval.

When LMI < 1, the load is not fully met at certain times, meaning that the generation and/or storage capacity was lower than the demand.

Range: 0 % ≤ f_(load,i) ≤ 100 % [8].

 

i = Time intervals [hourly, daily, monthly]

 = On-site energy generation at a given time t [kWh]

 = Storage energy balance at a given time t [kWh]

 = Energy losses at a given time t (sum of generation energy losses, storage energy losses, building technical system losses (excluding storage), and load-related energy losses such as distribution losses) [kWh]

 = Building load at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

 = Number of samples within the evaluation period, from τ₁ to τ₂. When hourly data are used and the evaluation period covers a full year, the number of samples is 8760.

 

On-site Energy Ratio

OER

Determines the amount of energy produced on-site (e.g., from renewable sources such as solar panels or wind turbines) relative to the total energy consumption over a given period of time.

If OER = 1, the on-site generated energy equals the total energy consumption.

If OER < 1, the on-site produced energy is lower than total consumption, meaning that the system depends on external energy sources to meet the demand.

If OER > 1, the on-site generated energy exceeds total consumption, indicating that energy production is greater than demand (and surplus energy may be exported to the grid).

Range: OER ≥ 0 [9].

 

 

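$\mathrm{OER} = \frac{\int_{\tau_1}^{\tau_2} G(t)\,dt}{\int_{\tau_1}^{\tau_2} L(t)\,dt}$

$G(t)$ = On-site energy generation at a given time t [kWh]

$L(t)$ = Total energy consumption (energy load) at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]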

 

 

 

 

Grid Interaction Index

GII

Measures the level of interaction and integration of a facility with the power grid, describing its average stress.

If GII = 100%, the energy exchanged with the grid during interval i equals the maximum possible exchange.

If GII = 0%, no energy exchange with the grid occurred at that moment.

If GII < 0%, energy was injected into the grid rather than drawn from it [7], [8].

 

 = Net energy exchanged with the power grid during interval i (can be positive or negative depending on whether energy is being drawn from or injected into the grid) [kWh]

 = Maximum absolute value of the net energy flow with the grid, taken over all considered time intervals [kWh]

i = Time intervals [hourly, daily, monthly]

No-Grid Interaction Probability

NGI

Measures the probability that a building or facility operates autonomously from the power grid, and therefore the likelihood of no interaction with it.

It also indicates the extent to which the load is covered by stored energy or renewable energy use.

If NGI = 0, there was no moment during the considered time interval when the net energy was zero or negative.

If NGI = 1, the net energy was zero or negative for the entire considered period.

Range: 0 ≤ NGI ≤ 1 [7], [8].

 

 = Probability that the net energy  is zero or negative during the time interval ||

 = Normalized variable for the net exported energy at a given time t [kWh]

τ₁ and τ₂ = Start and end of the evaluation period [s]

Capacity Factor

 

CAF

Defines the ratio between the actual energy production of a system (energy exchanged between the building and the grid) and the maximum production that could be achieved if the system operated at full capacity over a given period of time.

If CAF = 1, the system operated at its maximum capacity for the entire considered period.

If CAF = 0, the system did not produce any energy.

Range: 0 ≤ CAF ≤ 1 [8].

 

 = Normalized variable for the net exported energy at a given time t [kWh]

 = Maximum producible energy at full capacity (system capacity) [kWh]

 = = Evaluation period [s]

One Percent Peak Power

OPP

Quantifies the maximum power that an energy system can reach by calculating the energy production corresponding to the top 1% of peak periods.

A high OPP value indicates that the building or system experiences moments (the top 1% of the time) with very high energy consumption. This may point to significant peak loads that place stress on the electrical grid.

If OPP is low, the building’s energy demand is more evenly distributed over time, with fewer or smaller peaks [10].

 

 = Energy associated with the top 1% of a given value, calculated during periods of maximum demand or generation [kWh]

 = Time period over which the energy is measured [h]

Demand Response Percentage

 

DRS

Refers to the percentage variation of the Demand Response relative to a baseline value.

If DRS > 0, the Demand Response was successful in reducing power compared to the baseline level (load “reduction” capability).

If DRS = 0, no variation occurred.

If DRS < 0, it indicates an increase in power during the Demand Response implementation, which is generally undesirable (load “overload” condition) [11].

 

 = Baseline hourly power, i.e., the expected or normal power level without any Demand Response measures [kWh]

 = Hourly power under Load Shifting conditions, i.e., the power recorded during the Demand Response event [kWh]

Flexibility Factor

FLF

Measures the ability of an energy system to adapt to variations in energy demand and resource availability, and to shift energy use from high-price periods to lower-price periods. It applies a daily quartile-based price classification, dividing prices into three categories: low, medium, and high.

A high price is defined as one above the third quartile (price > 75% of all prices during a day).

A low price corresponds to a value within the first quartile (price ≤ 25%).

If FLF = 0, consumption is balanced between low- and high-price periods.

If FLF = 1, consumption occurs only during low-price periods.

If FLF < 0, most consumption occurs during high-price periods.

Range: −1 ≤ FLF ≤ 1 [12].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Low-price periods (first quartile, i.e., the lowest 25% of prices)

 = High-price periods (above the third quartile, i.e., the highest 25% of prices)

 = Number of considered time intervals

 

Flexibility Index

FLI

Calculates the difference between the energy cost under a flexibility-controlled scenario and the energy cost under a reference scenario. The Flexibility Index is used to measure the effectiveness of flexibility strategies in reducing costs compared to a baseline case.

If FLI < 0, the flexibility-controlled case has a higher energy cost than the reference case, meaning an undesirable cost increase.

If FLI = 0, the total energy cost under flexible conditions is identical to that of the reference case, indicating that flexibility yields no savings.

If FLI = 1, the total cost in the flexibility-controlled case is zero relative to the reference case; this represents an ideal but unrealistic situation.

If FLI is positive and close to 1, it means that energy has been effectively shifted or managed, reducing costs compared to the reference scenario.

Range: −∞ < FLI ≤ 1 [13].

 

 = Electricity consumption during time interval i [kWh]

 = Energy price during time interval i

 = Total electricity cost in a flexibility-controlled scenario

 = Total electricity cost in a reference scenario without flexibility control

 = Number of considered time intervals

Flexible Energy Efficiency

FEE

Measures how effectively a system utilizes flexible energy compared to its reference energy consumption. It refers to the system’s ability to manage energy use during Demand Response (DR) events, considering the “rebound effect” (i.e., when energy consumption increases after a reduction event to restore normal operating conditions). A higher FEE value indicates greater flexibility efficiency, meaning the system can better optimize energy use during flexible periods. Range: 0% ≤ FEE ≤ 100% [14].

 

 = Flexible energy, i.e., the energy used during periods when the system operates in flexible mode (for example, by optimizing consumption based on renewable resource availability or variable pricing) [kWh]

 = Reference or baseline energy, i.e., the energy consumed under normal or non-flexible operating conditions [kWh]

Note. This table presents the Environmental Key Performance Indicators (KPIs) used to evaluate the environmental, energy, and operational performance of smart infrastructures within a digital twin framework. Each KPI is defined with its acronym, description, and mathematical formulation for standardized and comparative analysis.
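To make the load matching indicators in the table more concrete, the following minimal Python sketch computes LCF, SCF, and OER from discrete time series. It is offered only as an illustration under stated assumptions: the net on-site supply in each interval is taken as generation minus storage balance minus losses, the matched energy as the minimum of net supply and load, and integrals are approximated by sums over equally spaced samples. The function name, the clamping of negative net supply to zero, the zero-supply convention, and the sample data are all hypothetical.

```python
def load_matching(generation, load, storage_balance=None, losses=None):
    """Return LCF, SCF, and OER for equally spaced samples (values in kWh)."""
    n = len(load)
    storage_balance = [0.0] * n if storage_balance is None else storage_balance
    losses = [0.0] * n if losses is None else losses
    if not (len(generation) == len(storage_balance) == len(losses) == n):
        raise ValueError("all series must have the same length")

    total_load = sum(load)
    if total_load <= 0:
        raise ValueError("total load must be positive")

    # Net on-site supply per interval; negative values are clamped to zero
    # (an assumption made here to keep the ratios within 0..1).
    net_supply = [
        max(g - s - z, 0.0)
        for g, s, z in zip(generation, storage_balance, losses)
    ]
    # Energy that is both available on site and actually demanded.
    matched = sum(min(ns, l) for ns, l in zip(net_supply, load))
    total_supply = sum(net_supply)

    return {
        "LCF": matched / total_load,
        # Assumed convention: SCF is 0 when there is no on-site supply.
        "SCF": matched / total_supply if total_supply > 0 else 0.0,
        "OER": sum(generation) / total_load,
    }


# Hypothetical four-interval profile with no storage and no losses.
kpis = load_matching(generation=[0.0, 2.0, 4.0, 1.0], load=[1.0, 1.0, 2.0, 2.0])
print(kpis)
```

In this example the on-site generation exceeds the total load (OER above 1), yet only part of the load is covered (LCF below 1) because generation and demand are not synchronized, which is the distinction the table draws between these indicators.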

 

 

3.2 Social and Environmental Key Performance Indicators (KPIs) for Digital Twin-Based Assessment of Smart Urban and Industrial Infrastructures

 

The set introduced for Key Performance Indicators (KPIs) plays an important role in facilitating the digital twin and metaverse software platform proposed, highlighted in the abstract, since it plays an important enabling role in assessing, optimizing, and ensuring the factors related to Smart Urban and Industrial Infrastructure (Dovolil & Svítek, 2024; Barykin et al., 2023). The proposed set of KPIs serves as parameters that enable the processing of complex phenomena related to the environment into measurable values, enabling real-time processing, simulation, and optimization (Englezos et al., 2022; Hadjidemetriou et al., 2023). The integration process fully meets the aims of the ESG (Environmental, Social, and Governance) evaluation framework, particularly targeting both Environmental and Social factors (Shaharuddin et al., 2022). Focusing on KPIs that assess indoor environmental quality, energy efficiency, and user comfort, the proposed platform enables, through an evidence-based process, the optimization of sustainable design, preventive maintenance, and energy-efficient building operations (Yitmen et al., 2025). Humidity (HUM) is an important KPI for assessing indoor environmental quality. This parameter measures the actual water content percentage in the air, relative to its maximum threshold at a given temperature scale. Humidity level, when maintained within its optimal range (40% to 60%), plays a critical role in health and comfort, since low air humidity can lead to air irritation and electrical charges, whereas excess humidity can contribute to mold growth, causing material degradation. This phenomenon, when implemented in digital twin functionality, enables RH measurement, permitting, through algorithmic processing, automatic regulation of Heating, Ventilation, and Air Conditioning (HVAC) operation and, through forecast models, optimizing air-conditioned ventilation (Lo, 2025). 
This leads, therefore, to thermal and hygrometric comfort, optimized through energy conservation, directly linking HUM to both social well-being and environmental factors, concerning optimized energy savings. Particulate Matter (PM10 and PM2.5) is an important environmental parameter. The proposed KPI aims to assess the level of air concentration of particles that can significantly provoke health problems, particularly in densely populated and industrially developed regions. Continuous exposure to particles can cause problems relating to heart and pulmonary diseases. The measurement process, set up for buildings, aims to assess effectiveness and identify pollution sources through functional analysis of ventilation systems. The integration of PM values in the proposed system contributes to the support for the ESG “Social” perspective by ensuring health for the inhabitants, along with achieving healthier approaches for efficient air circulation systems, thereby contributing to improvements in the “Environmental” perspective by ensuring cleaner, more efficient air circulation methods (Saleh et al., 2025; Ariansyah et al., 2023). Volatile Organic Compound (VOC) concentrations enable the measurement of air pollution from harmful gases such as benzene, formaldehyde, and toluene, which are derived from construction materials, cleaning agents, and interior decor. Volatile organic compounds can significantly affect indoor air quality, comfort, and health. However, it is recommended that VOC concentrations not exceed 300 ppb to maintain global health standards. The integration of VOC concentration measurement in the digital twin system will enable real-time responses, enabling facility managers to trace the cause, adjust ventilation rates, or use low-emitting materials (Yitmen et al., 2025; Venkateswarlu & Sathiyamoorthy, 2025). 
This preventive strategy enhances indoor environmental quality and supports the ESG goal of “Social Sustainability” by mitigating health threats and improving occupant satisfaction. The Air Changes per Hour (ACH) indicator measures how many times the total air volume of an indoor space is replaced in one hour. A rate of 3 to 5 ACH ensures adequate ventilation for residential and office buildings. Continuously measuring and calculating ACH with digital twin technology enables facility managers to adjust ventilation rates dynamically, ensuring safe, healthy air while conserving energy (Hadjidemetriou et al., 2023). The ACH KPI therefore contributes to ESG on two fronts: it offers social benefits by ensuring healthy ventilation for human well-being, and environmental benefits by conserving energy through systematic adjustment of ventilation rates (Hadjidemetriou et al., 2023). The Thermal Insulation Rate (R-value) indicator estimates the thermal resistance of construction materials, indicating how little heat is conducted through them; a higher value means greater energy conservation. Increased insulation reduces heating and cooling loads, aligning with the ESG environmental aspect by reducing emissions from energy use and with the social aspect by ensuring a comfortable temperature without increasing costs (Englezos et al., 2022). The Sound Insulation Index (SND) rates the sound insulation properties of building elements such as walls, windows, and floors. Noise pollution is increasingly recognized for its impacts on both mental and physical health. 
Measuring the sound insulation level inside buildings helps stakeholders rate acoustic comfort, particularly in densely populated urban areas. This KPI improves the social sustainability aspect by fostering environments suited to concentration, rest, and quality of life (Lo, 2025). Three energy performance KPIs, the Energy Efficiency Ratio (EER), the Coefficient of Performance (COP), and System Efficiency (SEF), rate how effectively energy inputs are converted into energy services. They are particularly important for rating chiller and heater performance in cooling and heating, respectively. Higher values indicate more useful output per unit of power consumed, and tracking them improves the digital twin's ability to detect inefficiencies, predict system degradation, and schedule preventive maintenance (Venkateswarlu & Sathiyanmuthu, 2025). These capabilities support the ‘Environmental’ and ‘Economic’ ESG spheres and, through affordability, strengthen ‘Social’ ESG factors. Energy Use Intensity (EUI) and Lighting Power Density (LPD) rate lighting energy use: EUI relates lighting energy consumption to the expected number of occupants, giving insight into energy use per capita, while LPD relates lighting power to the floor area served. Together they capture the relationship between space, users, and energy use. 
The actual use of digital twins with similar data can enable various analyses, including simulations for different user occupancy scenarios, lighting system schedule optimizations, and adoption of intelligent lighting systems that dynamically adjust to different user behaviors (Yitmen et al., 2025). Such enhancements lead to lower energy losses and operational costs, thereby aligning well with the ESG framework from both environmental and social perspectives, given their well-being benefits and resource distribution. Overall, integrating such KPIs into a digital twin and metaverse system constitutes a comprehensive framework for measurement, simulation, and improvement efforts to support greater sustainability and energy goals across various infrastructures in both urban and industrial settings. Each KPI has applicability to advancing or improving environmental, energy, and human comfort factors. Continuous surveillance using the set parameters allows a shift from a reactive governance model to a predictive one, in which any intervention depends on real-time factors rather than fixed paradigms that lack dynamic scope, thereby adhering to the ESG model's focus on innovation directly linked to sustainable and inclusive elements.

 

 

Table xyz. Social Key Performance Indicators (KPIs) for Indoor Environmental Quality and Energy Efficiency Assessment

KPI

Acronym

Description

Formula

UoM

Relative Humidity

HUM

Indicates the amount of water vapor in the air relative to the maximum that can be contained at the same temperature.

The optimal relative humidity (RH) range for occupant comfort and health is between 40% and 60% [15].

 

 = Water vapor pressure [Pa]

 = Saturation vapor pressure [Pa]

%

PM Concentration (Particulate Matter - PM10 and PM2.5)

PM10 and PM2.5

Measures the amount of suspended particles (particulate matter) in the air, typically expressed in micrograms per cubic meter (µg/m³).

PM2.5 refers to particles with a diameter smaller than 2.5 micrometers, while PM10 refers to particles smaller than 10 micrometers.

Recommended long-term health thresholds are PM2.5 < 20 µg/m³ and PM10 < 50 µg/m³ [16].

 

 

 = Mass of particulate matter [µg]

 = Volume of air [m³]

µg/m³

Volatile Organic Compounds

VOC

Establishes the concentration of VOCs – such as benzene, formaldehyde, and other potentially harmful gases.

Elevated VOC levels can cause discomfort and health issues in occupants.

The indicated threshold is  < 300 ppb. [17].

 

 

 = VOC concentration [mg/m³]

 = Molar mass of the VOC [g/mol]

 = Molar volume of an ideal gas, generally taken as 24.45 L/mol (at 25 °C and 1 atm)

ppb

Air Changes per Hour

ACH

Indicates the number of times the air within a space is completely renewed in one hour.

An air change rate between 3–5 ACH is considered adequate for residential buildings or office environments [18].

 

 = Airflow rate [m³/h]

 = Volume of the indoor space [m³]

1/h

Thermal Insulation Rate 

THR

Determines the thermal resistance of insulating materials, indicating how effectively they prevent heat loss.

A higher R-Value indicates better insulation performance [19].

 

 = Materials thickness [m]

λ = Thermal conductivity of the materials [W/m·K]

m²·K/W

Sound Insulation Index

SND

Evaluates the effectiveness of a building element in reducing sound transmission between two different spaces.

It is defined as the difference between the incident sound pressure level on a surface and the transmitted sound pressure level through it.

A higher R value indicates that walls, floors, or windows are more effective at blocking sound [20].

 

 = Incident sound pressure level [dB]

 = Transmitted sound pressure level [dB]

 = Equivalent absorption area [m²]

 = Separating surface area [m²]

dB

Energy Efficiency Ratio

EER

Measures the efficiency of an air conditioning system (air conditioners or cooling units). A higher EER indicates that the air conditioning system provides more cooling output for each unit of energy consumed, making it more efficient.

If EER ≥ 12, the system is considered efficient. [21].

 

 = Total cooling capacity provided by the system [kW]

 = Electrical power input consumed by the system [kW]

-

Coefficient of Performance

COP

An indicator similar to the EER, it can be used to evaluate efficiency in both cooling and heating modes.

It is commonly applied to heat pumps. A higher COP indicates that the system can produce a greater amount of useful energy (heating or cooling) for each unit of electrical energy consumed.

If COP ≥ 3.5, the system is considered efficient. [22].

 

 = Heating or cooling capacity provided by the system [kW]

 = Electrical input power consumed by the system [kW]

-

System Efficiency η

SEF

Measures how much of the energy used by the system is effectively converted into useful heating or cooling.

A high system efficiency means that a large portion of the consumed energy is actually transformed into useful thermal energy, minimizing losses.

If η ≥ 85%, the system is considered efficient. [23].

 

 = Useful energy delivered (cooling or heating capacity) [kWh]

 = Total energy consumed (including system losses and auxiliary consumption) [kWh]

-

Energy Use Intensity based on people count 

EUI

Measures the energy consumption for lighting relative to the number of occupants in the building, reflecting energy efficiency in terms of per capita usage.

A high EUI indicates higher energy consumption for lighting per person, suggesting a lack of optimization.

Optimal values: EUI < 15 kWh/person/year. [23].

 

 

 = Energy consumed for lighting [kWh]

 = Number of occupants in the building

 = Duration of lighting usage [year]

kWh/person/year

Lighting Power Density per floor area

LPD

Determines the power consumed by lighting per unit of floor area.

It serves as an indicator of lighting efficiency in relation to the utilized space.

A high LPD indicates greater power consumption per unit area, suggesting inefficient lighting design.

Optimal values: LPD < 10 W/m² [23].

 

 

 = Power used for lighting [kW]

 = Illuminated indoor area [m²]

kW/m²

Note. This table summarizes the Social and Environmental Key Performance Indicators (KPIs) used to assess indoor environmental quality, user comfort, and energy efficiency in smart infrastructures. Each KPI is defined by its acronym, description, and calculation formula, providing measurable parameters that support ESG-oriented evaluation and digital twin integration.
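
As an illustration of how the indicators in the table could be evaluated inside a digital twin data pipeline, the following minimal Python sketch computes several of them from the variable definitions listed in the Formula column. The formula images are not reproduced in this record, so the expressions below are standard textbook forms that are assumed here to match the intended definitions. All function and parameter names are hypothetical.

```python
# Illustrative KPI calculations. Function names and formulas are assumptions
# derived from the variable definitions in the table, not the platform's code.

MOLAR_VOLUME_L_PER_MOL = 24.45  # ideal gas at 25 °C and 1 atm


def relative_humidity(vapor_pressure_pa: float, saturation_pressure_pa: float) -> float:
    """HUM [%]: water vapor pressure relative to saturation vapor pressure."""
    return 100.0 * vapor_pressure_pa / saturation_pressure_pa


def pm_concentration(mass_ug: float, air_volume_m3: float) -> float:
    """PM [µg/m³]: particulate mass per volume of sampled air."""
    return mass_ug / air_volume_m3


def voc_ppb(concentration_mg_m3: float, molar_mass_g_mol: float) -> float:
    """VOC [ppb]: convert a mass concentration into a volume mixing ratio."""
    ppm = concentration_mg_m3 * MOLAR_VOLUME_L_PER_MOL / molar_mass_g_mol
    return ppm * 1000.0


def air_changes_per_hour(airflow_m3_h: float, room_volume_m3: float) -> float:
    """ACH [1/h]: airflow rate divided by the volume of the indoor space."""
    return airflow_m3_h / room_volume_m3


def thermal_resistance(thickness_m: float, conductivity_w_mk: float) -> float:
    """R-value [m²·K/W]: material thickness divided by thermal conductivity."""
    return thickness_m / conductivity_w_mk


def efficiency_ratio(useful_output_kw: float, electrical_input_kw: float) -> float:
    """EER or COP [-]: useful thermal capacity per unit of electrical input."""
    return useful_output_kw / electrical_input_kw


def lighting_eui(energy_kwh: float, occupants: int, years: float) -> float:
    """EUI [kWh/person/year]: lighting energy per occupant per year."""
    return energy_kwh / (occupants * years)


def lighting_power_density(power_w: float, area_m2: float) -> float:
    """LPD [W/m²]: lighting power per unit of illuminated floor area."""
    return power_w / area_m2


if __name__ == "__main__":
    rh = relative_humidity(1170.0, 2340.0)
    ach = air_changes_per_hour(300.0, 100.0)
    print(f"RH = {rh:.1f} % (within 40-60 %: {40.0 <= rh <= 60.0})")
    print(f"ACH = {ach:.1f} 1/h (within 3-5: {3.0 <= ach <= 5.0})")
```

Keeping each KPI as a small pure function makes it straightforward to compare the computed value against the thresholds quoted in the Description column, as the two printed checks do.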

 

3.3 Governance Key Performance Indicators (KPIs) for ESG Evaluation in Digital Twin and Metaverse Applications

 

The selected Key Performance Indicators (KPIs) provide an integrated framework for evaluating ESG performance for Smart Infrastructure, specifically for the digital twin and metaverse applications related to the management of urban and industrial environments. Each Key Performance Indicator is a link that connects technology innovation and sustainability to enable real-time analysis and optimization of energy use, expenditure, and social impacts. The use of Key Performance Indicators, in aggregate, provides a holistic view of efficiency and equity, ensuring infrastructural advancement that encompasses technological innovation, sound ecology, and support for social justice. The relevance of the Key Performance Indicators is significant in the ESG framework, particularly because it directly covers both environmental and economic perspectives, and it has an indirect relationship with Governance, largely through interactions, accountabilities, and shared decision-making (Wu et al., 2022; Zhang, 2025). The Cost of Energy Saving (CES) is the single most important Key Performance Indicator under the ESG framework, since it estimates the financial costs of unit energy savings from efficiency. This Key Performance Indicator assists by evaluating the cost-effectiveness and investment-to-benefit ratio for environmental elements, leading to environmentally viable energy conversion (Dovolil & Svítek, 2024). The CES Key Performance Indicator has clear relevance to the ESG environmental domain, helping establish cost-optimal strategies for energy waste and emission savings, and also has implications for Governance, as it assists with financial accountabilities and forward-looking strategic planning for financial resource use. The Energy Return on Investment (EROI) is another highly important Key Performance Indicator, calculated as the ratio of energy output to energy invested for any given system. 
The Key Performance Indicator for energy has important implications for ESG’s environmental domain, as it indicates that when EROI increases, the energy output of the system is significantly higher than the energy consumed (Hämäläinen, 2020). This shift leads to optimized energy resources and sustainable energy. This Key Performance Indicator has several ESG factors, as it supports the ESG environmental dimension by enabling transparent evaluation of energy system efficiency and helping strategic decision-making to maximize energy output from resources without harmful depletion (de Trizio et al., 2024). The Energy Payback Time (EPBT) Key Performance Indicator complements the EROI Key Performance Indicator, as it describes the time required for a particular system to recover the energy invested in construction, setup, and maintenance operations. Functionally, from an ESG perspective, EPBT plays a crucial role in evaluating the life-cycle sustainability of energy systems (Hu, 2023). In the digital twin environment, EPBT helps evaluate simulation scenarios and establish the sustainability level of different energy technologies, thereby strengthening the use of transparent data —an important consideration in ESG modeling for the governance process. The Cost of Peak Demand (CPD) measures the cost of peak electricity demand over a given time period. The use of CPD is critical for sustainability, both environmental and economic, since maximizing efforts to reduce peak loads will ease energy networks and prevent the need to generate additional energy from fossil fuels, which are characterized by higher emissions (Aghazadeh Ardebili et al., 2025). The Cumulative Cash Flow (CCF) criterion considers both financial and environmental factors, as it evaluates total cash flow for an energy project alongside investment costs. ESG analysis supports governance by using financial criteria to express financial transparency and assess future risk (Hien & Hanh, 2024). 
The positive interpretation of a project’s cash flow feature is critical, as it asserts that financial investment in a project, beyond financial benefits, helps achieve resource savings and sustainability. The Share of Project Cost Subsidized (SPC) measures the extent of grant use. This criterion assumes ESG duality, as it explains the financial attractiveness of sustainable project investment by focusing on social benefits arising from inclusivity for small players from developing communities in the use of sustainable technology (Wu et al., 2022). Renewable Energy Use (REU) assumes critical importance as an essential ESG criterion that estimates the level of energy use from conservation to sustainable energy. Indicative interpretation assumes critical importance, particularly because it signifies a strong commitment to sustainability for a project, which is otherwise characterized by the continuous use of fossil fuels (Becattini et al., 2024). The use of digital twin technology is critical, as it assists in monitoring energy use across different scenarios, thereby enabling interpretation for sustainable energy use (Wei, 2023). The Energy Use per Worker Hour (EPWH) is dual in its interpretation of energy use across different labor productivity scenarios (Zhang et al., 2023). Socially, it signifies environmentally responsible production that does not strain human resources by being energy-intensive. EPWH, on a digital twin platform, supports modeling for appropriate workforce and energy equity balance interpretation, as well as effective energy use in labor-intensive industries (Englezos et al., 2022). Taking it all in, it forms a sound analysis framework for a comprehensive digital twin model that expresses difficult objectives for sustainable production through specific, quantified, and tractable information. 
These indicators improve the proposed digital twin framework’s capabilities for both real-time activity monitoring and, through simulation, forecasting future ESG performance. By incorporating them, the platform offers a balanced, integrated approach to ESG responsibility that covers environmentally responsible operations (EROI, REU, EPBT) for low-cost energy use, economic soundness (CES, CCF, CPD, SPC) for sustainable economic growth, and social responsibility (EPWH) for fair social outcomes.

 

Table xyz. Governance Key Performance Indicators (KPIs) for ESG Evaluation within Digital Twin Frameworks

KPI

Acronym

Description

Formula

UoM

Cost of Energy Saving

CES

Measures the cost associated with energy savings achieved through energy efficiency interventions.

This parameter is particularly useful for comparing different investment options in terms of efficiency, as it estimates how much it costs to save one unit of energy (e.g., 1 kWh) through technological or operational measures.

The CES formula is structured to calculate the total cost of energy savings and divide it by the amount of energy saved, accounting for system inefficiencies.

A higher CES indicates a greater cost per unit of energy saved, suggesting that the intervention may be less cost-effective compared to other alternatives.

Conversely, a lower CES means a lower cost per unit of energy saved, making the energy efficiency measure more economically advantageous [24].

 

 

 = Change in initial investment. Represents the amount of capital required to implement the energy efficiency measure [€]

 = Change in operating costs. Includes expenses related to the operation and maintenance of the energy efficiency measure [€]

 = Energy price. Represents the cost per unit of energy, which can influence the savings achieved by the measure [€/kWh]

 = Change in energy consumption. Indicates the amount of energy saved as a result of the intervention [kWh]

 = Energy loss (or efficiency) factor associated with losses that may occur during the energy use process. It may include heat losses or other system inefficiencies [–]

 = Capital Recovery Factor. Used to calculate the annualized cost of the investment and determine how much an investment must generate each year to be recovered over time [-]

 

 = Interest rate [-]

 = Amortization period [years].

[€/kWh]

Energy Return on Investment

EROI

Evaluates the energy efficiency of a production source by measuring how much energy is obtained compared to how much energy is invested to produce it. It is a key indicator of energy sustainability: the higher the EROI, the more efficient the system.

If EROI > 1, the energy process is sustainable, as the energy produced exceeds the energy invested.

If EROI = 1, the energy produced is exactly equal to the energy invested, meaning the system is at the limit of sustainability and produces no usable net energy.

If EROI < 1, the system is inefficient, since it requires more energy than it generates. Such a process is neither economically nor energetically sustainable in the long term.

This indicator answers the question: “How efficient is the energy investment?” [25].

 

 = Total outgoing or produced energy from process i. This may include, for example, the electricity generated by a power plant or the fuel produced by a refinery [kWh].

 = Total incoming or consumed energy for process j. This may include the energy required to extract, transform, or transport the energy source [kWh].

 and  = Scaling factors that can represent the quality of energy. For instance, they may be used to assign greater or lesser importance to certain forms of energy or technologies [–].

[-]

Energy Payback Time

EPBT

Measures the time required for an energy system to produce the same amount of energy that was needed to build, install, and maintain it.

If EPBT is high, it takes longer for the system to return the energy invested. Conversely, if EPBT is low, the energy system quickly recovers the energy used for its construction and startup.

It is an indicator that answers the question: “How long does it take for the system to repay the energy invested?” [26].

 = Total invested energy required to build, install, maintain, and decommission the energy system throughout its life cycle [kWh].

 = Amount of energy that the system is capable of producing annually once it is operational [kWh/year]. 

[year]

Cost of Peak Demand

CPD

Measures the cost associated with the peak electricity demand over a given period.

A lower CPD is desirable, as it indicates effective management and reduced exposure to energy costs [27].

 

 = Represents the maximum power demand during a given period [kW].

 = Represents the cost associated with each unit of power [€/kW].

[€]

Cumulative Cash Flow

CCF

Measures the total cash flow generated by the project in relation to the initial investment.

The CCF is useful for investors and decision-makers, as it helps assess a project's profitability, compare different investments, and plan future financial needs and returns on investment.

A CCF > 0 indicates that the project is generating more cash flow than the costs incurred, while a CCF < 0 indicates a loss. [24]

 

 = Represents the Final Energy Savings in period k. This value indicates the final energy savings achieved through energy efficiency measures or other strategies [kWh].

 = Energy Carrier Cost, i.e. the cost of energy per unit during period k. This may include costs for purchasing or using energy such as electricity, gas, etc. [€/kWh].

 = Technical Life, i.e. the project period during which energy savings and economic benefits are expected [years].

 = Investment Cost, i.e. the cost of the investment. It includes all expenses necessary to implement the project, such as installation, equipment, and other preliminary costs [€].

[€]

Share of Project Cost Subsidized

SPC

Indicates the proportion of the total project cost that has been financed through grants.

A high SPC means that a significant portion of the project has been funded through external aid, while a low SPC suggests that the project has been mainly self-financed.

SPC = 0% when no grants have been received (RS = 0), meaning no part of the project costs is subsidized.

SPC = 100% when the entire project cost is covered by grants (RS = IC), meaning the entire project is subsidized.

Range: 0% ≤ SPC ≤ 100% [28].

 

 = Received Subsidies, meaning the total amount of grants or funding received for the project [€].

 = Investment cost, meaning the total investment cost [€].

 

 

 

[%]

Renewable Energy Use

REU

Provides a measure of the proportion of final energy savings that comes from renewable sources compared to all energy sources used.

It is useful for energy policies and environmental assessments, as it helps quantify and compare the impact of different energy sources on overall sustainability and efficiency.

A higher REU indicates greater use of renewable energy, while a lower REU suggests a higher dependence on fossil fuels.

Range: 0% ≤ REU ≤ 100% [28].

 

 

 = Final Energy Savings for each energy source k. Indicates the final energy savings achieved from that specific source [kWh].

 = Conversion Factor for each energy source k. This factor is used to convert the saved energy into a common unit, allowing comparison among different sources [-].

 = Renewable Energy Source factor for each energy source k, which accounts for the sustainability of the source. This value varies depending on the type of energy:

·         0 for fossil fuels, indicating they do not contribute to sustainable energy production [-]

·         1 for renewable sources such as biomass, wind, solar, and other renewables, as they are considered sustainable [-]

·         A value between 0 and 1 for mixed sources, such as industrial waste or end-of-life tires, depending on the sustainability level of the source [-]

[%]

Energy Use per Worker-Hour

EPWH

Measures the total energy used by a production system in relation to the number of human resources and working time.

It calculates the energy used per working hour, taking into account the total supplied energy minus the imported one, and normalizing the result by the number of workers and the annual working hours.

This indicator is useful for evaluating the energy efficiency of an organization or an entire economy, allowing comparisons over time or between different sectors or countries.

A low EPWH is considered positive, as it indicates higher productivity with lower energy use, suggesting a more sustainable use of energy resources.

Conversely, a high EPWH may indicate energy inefficiency, potentially linked to poorly optimized production processes, outdated machinery, or energy-intensive technologies [29].

 

 = Total Primary Energy Supply, i.e., the total amount of primary energy supplied, including all available energy sources [kWh].

 = Population number, meaning the total number of individuals within the studied population.

  = Total number of working hours per person per year [hours/year].

 = Industrial Primary Energy Supply, meaning the portion of TPES specifically used in the industrial sector [kWh].

 

 = Industrial Final Consumption, referring to the final energy consumption by the industrial sector [MWh].

 = Total Final Consumption, referring to the total final energy consumption within a given economic system, including the industrial, residential, tertiary, and transport sectors [MWh].

MJ / (inhabitant · hour/year)

Note: This table summarizes the Governance Key Performance Indicators (KPIs) used for ESG evaluation within digital twin frameworks. The listed indicators quantify economic efficiency, financial accountability, and strategic resource management, enabling transparent decision-making and long-term sustainability assessment. These variables collectively support the “Governance” dimension of ESG by linking economic performance with responsible investment, policy transparency, and data-driven management.
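
To make the relationships between the governance indicators concrete, the following Python sketch evaluates several of them from the variables defined in the table. The formula images are not reproduced in this record, so the expressions are conventional forms assumed to be consistent with those definitions. CES is left out because its exact structure cannot be recovered from the variable list alone; only its Capital Recovery Factor is shown. All function and parameter names are hypothetical.

```python
# Illustrative governance KPI calculations. Names and formulas are assumptions
# based on the variable definitions in the table, not the platform's code.
from typing import Optional, Sequence


def capital_recovery_factor(interest_rate: float, years: int) -> float:
    """CRF [-]: annualizes an investment over the amortization period."""
    if interest_rate == 0:
        return 1.0 / years
    growth = (1.0 + interest_rate) ** years
    return interest_rate * growth / (growth - 1.0)


def eroi(energy_out_kwh: Sequence[float], energy_in_kwh: Sequence[float],
         out_weights: Optional[Sequence[float]] = None,
         in_weights: Optional[Sequence[float]] = None) -> float:
    """EROI [-]: weighted energy produced divided by weighted energy invested."""
    out_w = out_weights if out_weights is not None else [1.0] * len(energy_out_kwh)
    in_w = in_weights if in_weights is not None else [1.0] * len(energy_in_kwh)
    produced = sum(w * e for w, e in zip(out_w, energy_out_kwh))
    invested = sum(w * e for w, e in zip(in_w, energy_in_kwh))
    return produced / invested


def energy_payback_time(invested_kwh: float, annual_output_kwh: float) -> float:
    """EPBT [year]: invested energy divided by annual energy production."""
    return invested_kwh / annual_output_kwh


def cost_of_peak_demand(peak_kw: float, cost_per_kw: float) -> float:
    """CPD [€]: maximum power demand times the cost per unit of power."""
    return peak_kw * cost_per_kw


def cumulative_cash_flow(savings_kwh: Sequence[float],
                         carrier_cost_eur_kwh: Sequence[float],
                         investment_eur: float) -> float:
    """CCF [€]: value of energy saved over the technical life minus investment."""
    saved = sum(s * c for s, c in zip(savings_kwh, carrier_cost_eur_kwh))
    return saved - investment_eur


def share_subsidized(subsidies_eur: float, investment_eur: float) -> float:
    """SPC [%]: received subsidies as a share of the investment cost."""
    return 100.0 * subsidies_eur / investment_eur


def renewable_energy_use(savings_kwh: Sequence[float],
                         conversion: Sequence[float],
                         renewable_factor: Sequence[float]) -> float:
    """REU [%]: renewable-weighted share of converted final energy savings."""
    total = sum(s * c for s, c in zip(savings_kwh, conversion))
    renewable = sum(s * c * r for s, c, r in zip(savings_kwh, conversion, renewable_factor))
    return 100.0 * renewable / total


if __name__ == "__main__":
    ratio = eroi([100.0], [25.0])
    print(f"EROI = {ratio:.2f} (sustainable if > 1: {ratio > 1.0})")
```

In this sketch each period of the technical life is one element of the input sequences, so the length of `savings_kwh` plays the role of the technical life in the CCF definition.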

 

In addition to the key performance indicators listed above, the following contextual variables are recorded for the given system to support normalization:

  • Area (Area_m² – AREA): This signifies the total floor space investigated for the energy and environment indices related to the building or infrastructural facility. The total floor space is presented in square meters.
  • Energy Consumption (Energy_Consumption_kWh – ENCO): This refers to the total energy consumption during the period under review, expressed in kilowatt-hours. It is the base quantity from which comparative energy performance indicators are derived.
  • Occupants (OCC): This variable measures the number of people using or occupying any given space. This parameter enables calculations related to energy use and per capita environmental factors, making analysis easier for the user.

These factors establish highly important normalizing variables, enabling true comparability of performance across different buildings, facilities, and circumstances, thereby enhancing the robustness of the entire KPI system.
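
A minimal sketch of how these three variables could be used for normalization is given below. The class and field names are hypothetical, and the input values are placeholders rather than measured data.

```python
# Hypothetical normalization helper; names and values are illustrative only.
from dataclasses import dataclass


@dataclass
class BuildingContext:
    area_m2: float     # AREA: total floor space under investigation
    energy_kwh: float  # ENCO: total energy consumption in the review period
    occupants: int     # OCC: number of people using the space


def normalized_energy(ctx: BuildingContext) -> dict:
    """Return energy consumption per unit of floor area and per occupant."""
    return {
        "kwh_per_m2": ctx.energy_kwh / ctx.area_m2,
        "kwh_per_occupant": ctx.energy_kwh / ctx.occupants,
    }


if __name__ == "__main__":
    example = BuildingContext(area_m2=1000.0, energy_kwh=50000.0, occupants=100)
    print(normalized_energy(example))
```

Dividing by area or by occupants is what allows two buildings of different size or usage to be compared on the same scale.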

 

Q13. Economic Feasibility and Cost-Benefit Analysis of Metaverse Integration: The discussion here is speculative and unsupported by quantitative data. The authors should outline a structured cost-benefit analysis model, referencing empirical or benchmark data from comparable projects. Potential risks and scalability challenges should also be discussed.

A13. The section “Economic Feasibility and Cost-Benefit Analysis of Metaverse Integration” has been deleted.

Q14. Platform Functional Capabilities: While informative, this section is too general. The authors should distinguish between technical, operational, and experiential functionalities and clearly explain how the metaverse environment enhances each. A conceptual diagram or architecture model would improve clarity.

A14. The section “Platform Functional Capabilities” has been deleted.

Q15. Prototype Implementation and Preliminary Results: The description lacks technical detail and transparency. The authors should specify the prototype development tools, datasets, and testing scenarios. Screenshots or workflow diagrams would help validate claims about functionality.

A15. This section has been removed and replaced, following changes requested by other co-authors. The amended section is listed below.

  1. Operationalizing Environmental Sustainability through Digital Twins: A Metaverse-Enhanced ESG Dashboard for Smart Building Management

 

In the rapidly shifting landscape of AI-driven building management, the alignment of ESG factors with digital twin technology and metaverse-based engagement is driving a paradigm shift in sustainability. Building on a solution developed to enable prototyping and training of a digital twin infrastructure for sustainable building management in a smart city setting, and to develop ESG-based KPIs as digital twin sustainability metrics, this article presents a dashboard titled The ESG KPI Framework – Metaverse-Enhanced Operations. The dashboard provides a comprehensive view of environmental building metrics, including carbon emissions, energy usage patterns, and the integration of renewable energy sources, in addition to sustainable and optimized building management performance. It is fed by real-time IoT-based streaming sources and supported by computational methods such as Principal Component Analysis (PCA) and Ordinary Least Squares (OLS) for predictive modeling. Moreover, through interactive 3D visualization of critical sustainability metrics such as carbon emissions and building performance in the metaverse environment, the dashboard presents sustainability information in a cascading, waterfall-like fashion. It thus represents a confluence of sustainability intelligence and digital platform development, and a step toward a metaverse-driven paradigm for predictive sustainability management.

 

“This dashboard is a representation of the Environmental (E) dimension of ESG KPI Framework – Metaverse-Enhanced Operations- and has specifically been designed for prototyping and training a digital twin and metaverse-based system for a smarter building management (No-Roozinejad Farsangi et al., 2024). The dashboard is specifically designed to provide information on the building and its environmental performance, and to demonstrate how sustainability metrics can be measured, authenticated, and even visualized to improve ecological intelligence and efficiency within a metaverse-based management platform (Mahariya et al., 2023). The dashboard essentially represents a systematic evaluation of key environmental factors designed to monitor and track the building's carbon footprint and the renewable energy generated and integrated within the building. The Carbon Footprint, with a value of 453.75 TCO2e, essentially represents a measure of total carbon emissions that are generated as a result of building operation in a given period and is a critical factor within this context as it essentially suggests that a building is striving to achieve sustainability and is committed to reducing carbon emissions and staying within limitations and goals established within ESG frameworks. The Emission Intensity of 0.0249 TCO2/kWh suggests that this building maintains a high level of energy efficiency and has a negligible environmental impact. The building’s commitment to a sustainable cause is essential, as it provides critical information on the adoption and integration of 57.4% of renewable resources into its energy structure. The building is essentially committed to sustainability and is well aligned with ESG-based strategies that identify net-zero and transition to a green building approach (Dovolil & Svítek, 2024). 
The building's Energy Consumption of 1,440,827 kWh and the related Energy ROI of 26,022.0 suggest that sustainability strategies can deliver returns in this context and that actions are focused on energy efficiency. A Load Covering Factor of 76.5% and a Supply Covering Factor of 84.8% indicate that the building can meet a substantial portion of its demand through on-site renewable sources. The On-site Energy Ratio of 0.68, together with a grid-interaction value of 58.6%, suggests that the building can operate with a considerable degree of autonomy while retaining a strategic connection to the grid. 
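To make the two covering factors concrete, the following minimal sketch computes them from paired load and on-site generation series. The formulas are a commonly used load-matching definition and are an assumption here, since the response does not state how the dashboard derives its values; the function name `cover_factors` and the hourly numbers are hypothetical.

```python
# Illustrative sketch only: these are common load-matching definitions,
# ASSUMED here because the text does not give the dashboard's own formulas.

def cover_factors(load_kwh, onsite_kwh):
    """Return (load_cover_factor, supply_cover_factor) for paired time steps."""
    if len(load_kwh) != len(onsite_kwh):
        raise ValueError("series must have the same length")
    # Energy that is generated on site and consumed in the same time step.
    matched = sum(min(l, g) for l, g in zip(load_kwh, onsite_kwh))
    total_load = sum(load_kwh)
    total_gen = sum(onsite_kwh)
    # Share of demand met on site, and share of generation used on site.
    load_cover = matched / total_load if total_load else 0.0
    supply_cover = matched / total_gen if total_gen else 0.0
    return load_cover, supply_cover


# Hypothetical hourly values in kWh, not taken from the dashboard.
load = [10.0, 12.0, 8.0, 10.0]
onsite = [0.0, 15.0, 8.0, 5.0]

lcf, scf = cover_factors(load, onsite)
print(f"Load cover factor:   {lcf:.1%}")
print(f"Supply cover factor: {scf:.1%}")
```

Under this definition, both factors share the same numerator (on-site energy used at the moment it is produced) and differ only in whether it is related to total demand or to total generation.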
The bottom portion of the dashboard provides a graphical representation of the building's energy dynamics. The Energy Flow chart shows how the building switches between energy sources to sustain its operation. “This capability to forecast and react to changes in demand patterns is but one example of how predictive control methods are indeed imbedded within this environmental management framework” (Masubuchi et al., 2025). The information in this dashboard is continuously updated using IoT devices and validated using computational methods such as correlation analysis, Principal Component Analysis (PCA), Ordinary Least Squares (OLS), and Machine Learning algorithms (Hassani et al., 2022). These methods help ensure that the information is trustworthy and make the Key Performance Indicators science-based for training a digital twin. It is through this that “the environmental factor in ESG becomes not only descriptive but can accurately forecast future performance scenarios under different scenarios of either operation and climate in which a digital twin performs” (Tsouri & Avgousti-Della, 2024). At the level of design intent, the dashboard shows how environmental intelligence and immersive technologies are integrated. 
At the metaverse level, this information is analyzed in real time through interactive 3D to help assess “the direct effect of building operation decisions on energy and carbon emissions and system performance” (Hernandez et al., 2023). The immersive experience is one that “brings monitoring and controlling traditional environments to an 'experiential' level for learning and strategies” while making “sustainability a not fixed information piece but a 'dynamic' and 'participative' strategy for decision-making” (E Zainab & Bawanay, 2023). The dashboard is one of the foundational components of a strategy platform for a digital twin application in a smarter building environment. At the level of platform strategy, it offers “continuing” feedback between information and simulation, and system-level “optimization” for better building performance (Markopoulos et al., 2024). It recognizes that developments in building environmental intelligence and immersive display are reaching a decisive point in achieving “an intelligent ecosystem that can self-act to improve its building environment while keeping to transparency and accountability”. In conclusion, the “E” in ESG can support a computational strategy for building digital twins and metaverse technology, and thereby better strategies for smarter buildings. 
The dashboard follows a continuous, data-driven strategy for developing a new ecological approach to smarter building infrastructure, one that is not merely theoretical but grounded in dynamic, interactive data on carbon emissions and renewable energy.

 

 

 

Figure xyz. Environmental Dimension Dashboard – ESG KPI Framework for Smart Building Digital Twin Development. Note: This dashboard represents the Environmental (E) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations, focusing on carbon footprint, renewable energy use, and energy efficiency. The displayed KPIs—such as 453.75 tCO₂e, 57.4% renewable energy, and 0.0249 tCO₂/kWh emission intensity—demonstrate strong environmental performance. The Energy Flow and Load Shifting charts visualize real-time energy dynamics, supporting the digital twin prototype for sustainable and intelligent building management.

 

 

 

 

The above dashboard represents the Social (S) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations within a digital twin and metaverse-based system for smart building management (Farsangi et al., 2024). Unlike the dashboard views that address the environmental and governance dimensions, this one focuses on user well-being and health through quantified Key Performance Indicators for social sustainability. It combines real-time monitoring from building sensors with digital twin analysis to evaluate indoor environmental quality (IEQ), which makes it a pivotal ESG layer for maintaining occupants' well-being through a digital and sustainable metaverse platform. The top portion of the dashboard summarizes building sustainability. The Carbon Footprint (453.75 tCO2e) is retained for continuity with the other ESG layers. The key indicator directly related to occupants' health is indoor air quality, rated “Excellent” on the basis of a PM2.5 concentration of 11.4 μg/m3. The dashboard classifies this concentration as “Very Low”, which indicates limited exposure of occupants to fine particulate matter. The next section of the dashboard evaluates in more depth the social factors that define the indoor experience. Relative Humidity is 51.1%, within the “Normal” range. The PM10 concentration is 20.9 µg/m3, and the “Volatile Concentration” (volatile organic compounds, VOC) is 20 ppb. 
The ventilation rate (“Air Changes/h”) is 2.8 1/h, which supports indoor air quality. The “R-Value” of 2.19 describes the thermal resistance of the envelope and is associated with stable indoor temperatures, which the dashboard reports within the “Comfort” level. “Sound Insulation” is reported in dB and is associated with noise stress among building users, which the dashboard rates as “Low”. The dashboard thus serves as a component of the digital twin prototype, translating abstract social sustainability criteria into measurable indicators to promote a new paradigm of human-centered, resilient building management.

Figure xyz.  Social Dimension Dashboard – ESG KPI Framework for Digital Twin and Metaverse-Based Smart Building Management. This dashboard focuses exclusively on the Social (S) dimension of the ESG framework, illustrating KPIs that measure comfort, health, and indoor environmental quality. Indicators such as PM2.5 (11.4 µg/m³), VOC (20 ppb), Sound Insulation (35.6 dB), and System Efficiency (86.0%) provide quantifiable insight into occupant well-being. These data, integrated within the digital twin and metaverse-based prototype, form the basis for predictive, interactive, and human-centered smart building governance.

 

 

The dashboard above represents the Governance (G) dimension within the ESG KPI Framework – Metaverse-Enhanced Operations. It is designed to train and validate a prototype of an integrated digital twin and metaverse system for smart building management (Noroozinejad et al., 2024). It differs from the other ESG dashboards in that it focuses on economic governance and financial decisions related to energy management and sustainability (Adnan et al., 2024). The top portion frames the governance scope through three core metrics. Carbon Footprint (453.75 tCO2e) and Indoor Air Quality (11.4 μg/m³ PM2.5, Excellent) are retained to maintain continuity with the other ESG dimensions; the remaining metrics are financial in scope. The Energy ROI (26,022.0), with a 0.8-year payback period, indicates how energy investments are returned (Cranford, 2023) and reflects the financial efficiency with which capital sustains the digital twin across financial and environmental parameters. The middle portion breaks down the financial parameters of governance. The Investment for system implementation is 1,679,500 €, while Subsidies are 883,081 €, corresponding to 52.6% of the total capital. Showing this breakdown supports the transparency and traceability of ESG financial governance (Park et al., 2023). The Cost Energy Saving metric (0.701 €/kWh) indicates the financial return of energy-saving initiatives, and the Annual Savings value (22,513 €/yr) expresses the yearly financial benefit. The Peak Demand Cost (187,854 €) and the related demand metrics complete the picture of financial sustainability. 
The digital twin enables the platform to forecast demand cost variability across different scenarios (Aloqaily et al., 2022). The Cash Flow (–372,815 €) is used to assess the investment's financial sustainability and is complemented by the 15-year cash flow projection shown in the lower chart. This chart is one of the most important elements of the Governance dimension: it displays both the “Annual Cash Flow” and the “Cumulative Cash Flow” to describe financial recovery dynamics over time (Masubuchi et al., 2025). The negative bar in the first year represents capital outflows as a direct investment cost, and the positive bars that follow denote yearly savings. The point where the cumulative curve moves into the positive zone marks the stage at which the investment becomes profitable. Simulating this within a digital twin helps the metaverse platform project financial sustainability and performance in line with its ecological and operational potential (Trung, 2022). The left sidebar, titled “Building Config”, supports this governance strategy by including real and simulated parameters within the digital twin platform. Users can switch between “Real Building Data” and “Simulated Data” and define building configuration parameters such as “Building Size” (16,795 m²), “Building Occupants” (701 people), and “Total Units” (209), to ensure comparability and normalization of governance KPIs when different building types are assessed under this ESG dimension (Zainab & Bawanay, 2023). 
The switches that enable “Renewable Energy Systems”, “Smart & Grid Integration”, and “Metaverse Technology” support not only monitoring but also training of the prototype's governance logic, so that it can anticipate future investment decisions in its simulated metaverse platform (Duong et al., 2023). In essence, the dashboard acts as a “Governance Cockpit” within the ESG framework, designed for monitoring and predicting the financial and strategic aspects of smarter building management. It connects financial accountability with sustainability goals and makes it possible to define investment efficiency, savings impact, and payback time while predicting future economic behavior in the metaverse and digital twin platform. Within the prototype, these indicators provide a training ground for algorithms that support decision-making in immersive management scenarios. The dashboard thus translates the “G” factor into a financial and strategic digital governance model within the metaverse digital twin. It suggests that good governance is not only a management activity but also a computational task that can be modeled and optimized in a digital twin space. In this context, a digital twin can have a significant impact on building management.

 

Figure xyz. Governance Dimension Dashboard – ESG KPI Framework for Metaverse-Enhanced Smart Building Management. This dashboard focuses on the Governance (G) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations, highlighting financial transparency and investment efficiency. KPIs such as Energy ROI (26,022.0), Payback Time (0.8 yrs), and Subsidies (52.6%) demonstrate strong economic performance, while the 15-Year Cash Flow Projection confirms long-term financial sustainability within the digital twin and metaverse-based governance model.

Q16. Methodology and Case Study: The authors should explain how the research was structured, the rationale for case selection, and how the prototype or framework was evaluated. This is critical for reproducibility.

A16. The following section has been added:

 

 

  1. Validation Framework and Data Reliability for ESG-Based Smart Building Model

 

The image illustrates the validation framework for an ESG (Environmental, Social, Governance) Smart Building model, outlining a methodological process divided into four main phases.

 

Figure 1. Validation Framework for ESG-Based Smart Building Model. This framework validates and structures ESG data for Smart Building applications, combining statistical and machine learning methods to ensure data reliability and predictive accuracy. The validated dataset supports testing and prototyping of a management system that integrates metaverse and digital twin technologies for advanced, real-time smart building management.

The process starts with data preparation and structuring: data on environmental, social, and governance indicators are collected, normalized, and organized into three analytic blocks. Screening for outliers is carried out at the same stage to ensure data quality for the subsequent analysis. The second phase consists of correlation analysis and Principal Component Analysis (PCA); PCA is used to identify latent components and to check structural homogeneity. The third phase applies Ordinary Least Squares (OLS) linear regression to each of the environmental, social, and governance components, with the respective area serving as the output variable. At this stage, the Variance Inflation Factor (VIF) is used to test for multicollinearity among the predictors, and the coefficient of determination and the degrees of freedom are reported. In the fourth phase, machine learning algorithms are used to improve predictive analysis, comparing Boosting, Decision Tree, KNN, Random Forest, Regularized regression, and Support Vector Machines. The analysis is carried out separately for each component. The procedure is designed so that the processed data can be used for testing during the design of a management system that combines the metaverse and a digital twin. On the basis of the structural homogeneity of the data, it becomes meaningful to create an advanced digital environment that is both interactive and immersive and that supports environmental management in an intelligent digital setting.
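The statistical phases described above can be sketched as follows. This is a minimal illustration on synthetic data using NumPy only, not the authors' pipeline: the variables, the |z| > 3 outlier rule, and the helper names `ols` and `vif` are assumptions introduced for the example, and the machine learning comparison phase is left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one analytic block (e.g. environmental indicators).
# None of these values come from the article.
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)  # deliberately correlated with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 - 1.0 * x3 + 0.1 * rng.normal(size=n)

# Phase 1: normalise to z-scores and drop rows with any |z| > 3 (assumed rule).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
keep = (np.abs(Z) <= 3).all(axis=1)
Z, y = Z[keep], y[keep]

# Phase 2: PCA via SVD of the centred data; explained-variance ratios.
Zc = Z - Z.mean(axis=0)
_, s, _ = np.linalg.svd(Zc, full_matrices=False)
explained = s**2 / np.sum(s**2)


def ols(A, b):
    """OLS with intercept: coefficients, R^2, residual degrees of freedom."""
    A1 = np.column_stack([np.ones(len(A)), A])
    beta, *_ = np.linalg.lstsq(A1, b, rcond=None)
    resid = b - A1 @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((b - b.mean()) ** 2)
    return beta, r2, len(b) - A1.shape[1]


def vif(A):
    """VIF_j = 1 / (1 - R_j^2), regressing predictor j on the remaining ones."""
    values = []
    for j in range(A.shape[1]):
        others = np.delete(A, j, axis=1)
        _, r2_j, _ = ols(others, A[:, j])
        values.append(1.0 / (1.0 - r2_j))
    return np.array(values)


# Phase 3: regression diagnostics for this block.
beta, r2, dof = ols(Z, y)
vifs = vif(Z)

print("explained variance ratio:", np.round(explained, 3))
print("R^2:", round(float(r2), 3), "| residual dof:", dof)
print("VIF per predictor:", np.round(vifs, 2))
```

In this sketch the correlated pair of predictors receives a higher VIF than the independent one, which is the pattern the VIF check is meant to expose before regression coefficients are interpreted.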

 

 

Q17. Dashboards: This section remains descriptive. The authors should elaborate on the data visualization logic, real-time monitoring mechanisms, and linkage with the KPI framework. Discussing dashboard usability or decision-support implications would enhance its value.

A17. The following paragraphs have been added.

 

Figure X above highlights the Environmental dimension of the ESG KPI Framework – Metaverse-Enhanced Operations. The dashboard has been developed not only to visualize information but also to support active decision-making. Every item displayed links directly to the framework's logic and comes from real-time IoT sensors and the analytical engine used for the ESG performance measures. The top of the page gives immediate information on the main indicators: Carbon Footprint, Indoor Air Quality, and Energy ROI. These are updated as new data arrive on the relationship between the physical building and its metaverse counterpart, so the user always sees current values. Below this area, the dashboard shows Emission Intensity, Renewable Energy Share, Load Cover Factor, and On-site Energy Ratio. These indicators are derived from multiple data sources and verified with analytical techniques such as PCA and OLS regression analysis, which makes the sensor data more meaningful and reveals patterns and trends at the operational level of the building. These insights are what make the information useful for bottom-line strategy. The bottom area of the page adds the Energy Flow and Load Shifting charts, which show the balance between grid-based and on-site sources. 
On the right of the page, the demand structure chart tracks demand in real time and shows expected peak demand compared with alternative strategies, which makes the page a living view rather than a static one and gives the user a bird's-eye view of the building. Integration into the metaverse makes it possible to analyze the data interactively in 3D. The user can examine how operational changes in the building affect carbon emissions, energy savings, and investment costs, and can therefore anticipate these effects before implementing the changes in the real building. In this way, the dashboard turns sustainability information into a decision-making tool and shows how the ESG KPI Framework can shift from a descriptive model to a management experience.

 

 

Figure X illustrates the Social dimension of the ESG KPI Framework – Metaverse-Enhanced Operations. Rather than functioning as a simple display of information, this dashboard has been designed as a living interface that connects data visualization, real-time monitoring, and human decision-making within the metaverse. It translates technical sensor readings into understandable patterns, allowing managers to track comfort, health, and efficiency in a continuous feedback loop. At first glance, the top section gives a quick overview of the building’s overall performance, highlighting key indicators such as Carbon Footprint (453.75 tCO₂e), Indoor Air Quality (11.4 µg/m³ PM2.5, Excellent), and Energy ROI (26,022.0 with a 0.8-year payback period). These metrics establish a bridge between the environmental and social layers of the framework. They are constantly refreshed through IoT-based data streams, ensuring that the virtual model mirrors the real conditions of the building as they change throughout the day. At the bottom, visual elements such as Air Quality and HVAC Efficiency graphs tell the story of how the building breathes and performs. Bars compare live values with target thresholds, helping the user to see whether the building is maintaining healthy indoor conditions or if interventions are needed. The HVAC Efficiency chart reveals the system’s responsiveness, showing metrics such as EER, COP, and overall System Efficiency (86%). The clarity of these visuals allows users to identify inefficiencies and understand how adjustments might affect energy performance or comfort levels. The dashboard’s layout follows human-centered design principles. Clear labeling, intuitive grouping, and balanced color contrasts reduce visual fatigue and make the interface accessible to users beyond the technical sphere. 
The configuration panel on the left enables quick customization—users can toggle between simulated and real-time data, select different buildings, or activate modules such as renewable energy tracking or metaverse visualization. Ultimately, this dashboard functions as a decision-support environment rather than a static report. Its integration within the metaverse gives users the ability to interact with the data in immersive 3D space, experimenting with “what-if” scenarios before implementing real-world actions. For example, users can simulate how changes in HVAC operation might influence air quality or energy savings, observing projected outcomes instantly. In short, the dashboard transforms sustainability metrics into an interactive management experience. It connects live data, predictive analytics, and human judgment in one cohesive interface—bringing the ESG KPI Framework to life as a dynamic, adaptive tool for smarter and more sustainable building management.
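The comparison of live values with target thresholds described for the Air Quality bars could be expressed along the following lines. This is a hedged sketch: the readings are the values quoted in the text, but the target limits, the 80% warning ratio, and the names `Reading` and `status` are hypothetical and are not taken from the dashboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    name: str
    value: float   # live sensor value
    target: float  # HYPOTHETICAL upper limit, not the dashboard's threshold
    unit: str


def status(reading: Reading, warn_ratio: float = 0.8) -> str:
    """Classify a reading by how close it is to its upper target limit."""
    ratio = reading.value / reading.target
    if ratio > 1.0:
        return "intervene"  # limit exceeded
    if ratio >= warn_ratio:
        return "watch"      # approaching the limit
    return "ok"


# Values quoted in the text; target limits are assumptions for illustration.
readings = [
    Reading("PM2.5", 11.4, 15.0, "µg/m³"),
    Reading("PM10", 20.9, 45.0, "µg/m³"),
    Reading("VOC", 20.0, 100.0, "ppb"),
]

for r in readings:
    print(f"{r.name}: {r.value} {r.unit} (target {r.target}) -> {status(r)}")
```

A three-level outcome of this kind is one simple way to turn a bar chart comparison into a prompt for intervention; the actual rating scheme used by the prototype may differ.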

 

 

 

Figure X above shows the Governance (G) dimension of the ESG KPI Framework – Metaverse-Enhanced Operations. This dashboard has been created not only as a summarizing tool for financial information but also as a living space that links the dimensions of economy, environment, and operations in real time. Its design consists of a hierarchical system that enables the understanding of financial information through the ESG-based management of the building. Starting at the very top of the interface, the user can clearly see the first set of significant factors: Carbon Footprint (453.75 tCO₂e), Indoor Air Quality (11.4 µg/m³ PM2.5, Excellent), and Energy ROI (26,022.0 with a payback time of 0.8 years). This set of figures serves as the link between the different components of the Environmental, Social, and Governance segments as a whole. This information is updated in real time using IoT-based data acquisition systems that track the behavior of the physical structure. The middle section of the screen highlights financial aspects that pertain to governance. Financial measures such as Investment (1,679,500 €), Subsidies (883,081 € -52.6%), Cost of Energy Saving per kWh (0.701 €), Average Annual Savings (€ 22,513 per year), or Peak Demand Cost (187,854 €) are processed by the analytics engine. In reality, the analytics engine uses regression analysis, PCA weights, and prediction formulas to analyze the raw data presented to the user. This way, the field of governance goes beyond the realm of accountancy. Thus, the results can lead to a more straightforward, easy-to-understand process for governing the field of cost efficiency and sustainability. The lower panel of the dashboard provides a 15-year Cash Flow Projection that highlights the long-term viability of financial investments through a graph integrating Annual Cash Flow (modeled as blue columns) and Cumulative Cash Flow (illustrated as a green line). 
The area where the green curve enters the positive region indicates the payback zone—the transition from mathematical financial investment predictions to easy visual understanding. Decision-makers can easily distinguish between investment and economic viability through the graph's visual representation. In the design of the dashboard as a whole, clarity and usability were the emphases. The financial measures related to ESG factors are presented in chronological order, allowing trends to be easily observed. Guidelines based on Human-Computer Interaction (HCI) were at the forefront of the design. The use of contrast to highlight trends in performance across financial measures stands out. The real-time monitoring logic has been structured based on the connectivity between the digital twin and the IoT sensors. With the real-time connectivity between the digital twin and the IoT sensors, the financial aspects of the concerned entity —whether ROIs or subsidies, depending on the type of entity —are dynamically linked to variations in related operating factors, such as consumption and efficiency. From the perspective of a decision-support tool, the dashboard functions as a smart cockpit for facility management and policymakers. In the metaverse, users can navigate the financial flow experience in 3D. In the 3D world, the user can analyze investment options or run “what-if” scenarios on the cost of energy or the subsidy rate. The Governance Dashboard illustrates how the realms of digital visualization, real-time analytics, and the metaverse can come together to create a truly end-to-end solution for management. This transforms financial analysis from a passive reporting process into a hands-on experience that can promote greater transparency and adaptability across the entire ESG realm.
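The logic behind the annual and cumulative cash flow series, and the point at which the cumulative curve enters the positive region, can be illustrated with a short sketch. It assumes constant yearly savings, no discounting, and the whole net outlay booked in the first year; the function name `cash_flow_projection` and all figures are hypothetical and do not reproduce the dashboard's values.

```python
from itertools import accumulate


def cash_flow_projection(investment, subsidies, annual_savings, years=15):
    """Return (annual, cumulative, payback_year) for a simple projection.

    Assumptions (not from the article): constant savings, no discounting,
    and the net outlay (investment minus subsidies) falls in year 1.
    """
    net_outlay = investment - subsidies
    # Year 1 carries the capital outflow; later years show savings only.
    annual = [annual_savings - net_outlay] + [annual_savings] * (years - 1)
    cumulative = list(accumulate(annual))
    # First year in which the cumulative series is no longer negative.
    payback_year = next(
        (year for year, total in enumerate(cumulative, start=1) if total >= 0),
        None,
    )
    return annual, cumulative, payback_year


# Hypothetical figures in euros, chosen for readability.
annual, cumulative, payback = cash_flow_projection(100_000, 40_000, 12_000)
print("year-1 cash flow:", annual[0])
print("cumulative after 15 years:", cumulative[-1])
print("payback year:", payback)

# "What-if" run: a higher subsidy shortens the payback period.
_, _, payback_high_subsidy = cash_flow_projection(100_000, 52_000, 12_000)
print("payback year with higher subsidy:", payback_high_subsidy)
```

Re-running the function with a different subsidy or savings value corresponds to the kind of “what-if” comparison on energy cost or subsidy rate mentioned above.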

Q18. Discussion: The discussion reiterates prior sections rather than synthesizing insights. It should connect the results to existing research and critically evaluate the implications of the proposed framework for real-world building management. Comparative reflection with other digital twin or BIM-based systems is essential to demonstrate conceptual maturity.

A18. The discussion section has been added to the article.

The ESG KPI Framework – Metaverse-Enhanced Operations offers a new paradigm for managing smart buildings. By drawing on the synergy among ESG factors, digital twins, and the metaverse (Noroozinejad Farsangi et al., 2024), the framework improves on passive sustainability reporting. It goes beyond earlier digital twin-based systems, which focus on predicting maintenance activities and optimizing energy consumption (Kuru, 2023), by incorporating an ESG dimension that links environmental, user, and management factors within a holistic management system. The metaverse complements this system: the manager can dynamically evaluate functional indicators (HVAC efficiency, use of renewables, and users' level of comfort) and see the immediate outcomes of different management approaches. The combination of digital twins and the metaverse thus creates a “management space” in which ESG factors serve as real-time inputs to simulation processes. From a theoretical perspective, the framework incorporates concepts from Cyber-Physical Systems (CPS) and Human-Computer Interaction (HCI). The CPS structure provides real-time alignment between the physical and digital layers through IoT sensors and AI-based analytics, while HCI principles guide the human-centered design of the interaction. Together, these concepts strengthen the robustness and human-centered orientation of the management strategy. According to Masubuchi et al. (2025), the synergy between the IoT and the metaverse enables real-time prediction and management. Taking this as its starting point, the presented approach provides a metaverse-integrated ESG management system. 
This management system interlinks real-time performance information on physical factors (CO₂ intensity factor, renewables-based factor) with financial and social factors (ROI factor, payback factor, user comfort factor), thereby overcoming the main weakness of current BIM systems. This weakness consists of the systems' exclusive reliance on geometry-based factors and their merely retrospective nature. This enables managers and decision-makers to model alternative sustainability strategies, predict the system's behavior under changing conditions, and analyze trade-offs among the system's goals related to green efficiency, occupant comfort, and financial gain. This paradigm enables traceability, transparency, and planning that align with the world's Net-Zero and Industry 5.0 goals. In the end, the ESG KPI Framework – Metaverse-Enhanced Operations offers a scalable and human-centered paradigm that redefines the role of ESG principles and the built environment. In fact, incorporating the principles of CPS and HCI rectifies the current state of digital twins and repositions the technology from the descriptive phase to the intelligent and adaptive phase.

Q19. Limitations: The limitations are underdeveloped. The authors should expand this section to acknowledge the lack of empirical validation, reliance on conceptual modeling, and assumptions regarding technological readiness. Transparent discussion of constraints will strengthen credibility.

A19. Although the ESG KPI Framework – Metaverse-Enhanced Operations provides an innovative and holistic approach to the management of smart buildings, several limitations must be acknowledged to promote transparency and strengthen the credibility of the proposed framework. First, the main limitation is the lack of large-scale empirical verification. The proposed framework currently serves as a technological and conceptual prototype, intended primarily for simulation testing and training (Farsangi et al., 2024). Although specific test contexts have been created, the complexity and variability of real building systems have not been adequately considered. The real-time interaction and control logic has been optimized, but additional iterations and benchmarking are needed to confirm that it works reliably in practice. Second, the conceptual modeling scheme and its boundary conditions impose constraints. The current digital twin simulations and the AI-based prediction schemes operate under predetermined, simplified patterns. Although the computational techniques used, namely Principal Component Analysis (PCA), Ordinary Least Squares (OLS) regression, and AI-based learning processes, are established as robust and efficient, their effectiveness on unpredictable multi-source data flows has yet to be examined in a real-time context (Chambon et al., 2023). Third, from a technological perspective, the readiness level of metaverse-based immersive systems is still limited primarily by hardware constraints and associated costs. 
These constraints stem mainly from the high computational requirements of concurrently integrating XR solutions, their IoT components, and blockchain. Moreover, such solutions are currently neither universally accessible nor cost-effective when deployed together. Consequently, the immediate generalizability of the framework is limited. In addition, interfacing complications between the metaverse solutions and existing legacy Building Management Systems (BMS) are likely to delay implementation. A fourth limitation concerns scope: the present computational model focuses only on the “Environmental” pillar of ESG. The “Social” and “Governance” pillars are currently developed only partly and at a conceptual level, and they still require computational incorporation and verification. Strengthening the interconnections among the three pillars is fundamental for achieving a holistic ESG digital system. Finally, the question of data privacy remains unresolved. The more AI and ML components control decision-making and prediction, the greater the risk to user privacy, which must be mitigated through openness and regulatory alignment (Jagatheesaperumal et al., 2023). Together, these constraints outline the roadmap for future development. 
The next phases will concentrate on: (i) empirical verification of the framework through implementation in real buildings, (ii) optimization of the solution for interoperability and scalability, (iii) incorporation of the full set of social and governance factors, and (iv) development of privacy, ethics, and metaverse-related protocols for smart infrastructure. Overall, the prototype presented offers a sound theoretical and technological foundation and is a strong candidate among the available approaches.

Q20. Conclusion: The conclusion restates ideas but lacks depth. It should synthesize findings, articulate contributions, and explicitly address both theoretical and practical implications.

A20. The following section has been added:

This research proposes a holistic, innovative solution for the sustainable management of green buildings through the ESG KPI Framework – Metaverse-Enabled Management. The solution extends sustainable management in the context of digital twins and metaverse-based systems and shows the potential of real-time IoT data for monitoring and analyzing ESG factors. Methodologically, the KPIs were validated with a mathematical approach comprising Principal Component Analysis (PCA), Ordinary Least Squares (OLS) regression, and predictive modeling, which strengthens the scientific basis and the reliability of the ESG KPIs. Practically, the solution enables real-time monitoring of key sustainability factors, such as the structure's carbon impact and energy use, and supports decision-making among stakeholders through its 3D interactive platform. Theoretically, it extends sustainable management systems built on digital twins and BIM by covering human and financial factors, so that all ESG-related factors are integrated. It also contributes at the juncture with Human-Computer Interaction (HCI) by demonstrating the effectiveness of metaverse-based systems in this context. 
Because of its applicability, the solution is beneficial for both commercial and government structures, where it supports the development of sustainable management systems and the real-time monitoring of sustainability factors. In conclusion, this research lays the foundation for a new paradigm for the management of smart, sustainable buildings, shifting from a reporting model of performance to an interactive, predictive model of decision-making. This has been achieved through the integration of data science innovation, digital twin simulation techniques, and the metaverse approach.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

My comments, which are still relevant:

The bibliography entries contain incorrect volumes/numbers (e.g., "Energies, vol. 3785"). Double numbering of entries in references.

Add a mini-table of 5 works (author, year, type, inclusion criterion, conclusion) and a short PRISMA-lite figure with numbers to validate the selection. Ensure consistent numbering and complete captions; ensure that every table/figure is referred to in the text.

The definition of BIM is missing, so I recommend the narrow and broad approach from 2023.

Author Response

Point-by-Point Answers to Reviewer 1

Q1. The bibliography entries contain incorrect volumes/numbers (e.g., "Energies, vol. 3785").

A1. The reference has been corrected as follows:

[10] Casini, M. (2022). Extended reality for smart building operation and maintenance: A review. Energies, 15(10), 3785. https://doi.org/10.3390/en15103785

Q2. Double numbering of entries in references.

A2. Duplicate references have been eliminated.

Q3. Add a mini-table of 5 works (author, year, type, inclusion criterion, conclusion) and a short PRISMA-lite figure with numbers to validate the selection.

A3. The following part has been added within the second section:

Table 1. Overview of the five peer-reviewed studies identified through the Scopus search (“metaverse” AND “smart building”), including type, methodology, inclusion criteria, main conclusions, and relevance to the present research.

| Ref. | Type | Methodology | Inclusion Criterion | Main Conclusion | Relevance for Our Study |
|---|---|---|---|---|---|
| [9] | Technical study | Simulation-based system architecture analysis for intelligent building networks | Addresses integration of intelligent computing systems for smart building networks | Demonstrates that distributed control enhances communication and data flow efficiency | Provides the foundation for connecting digital twin data with real-time management systems |
| [10] | Review paper | Systematic review of XR applications in building operation and maintenance | Focuses on extended reality (XR) for maintenance and user experience in smart buildings | Shows that XR improves monitoring, maintenance, and user engagement | Informs the immersive visualization layer of the proposed metaverse-based management framework |
| [11] | Conceptual framework | Analytical study integrating IoT and blockchain technologies for decentralized data management | Examines blockchain–IoT convergence for transparent smart building data handling | Highlights traceability and security benefits of decentralized systems | Supports the governance and transparency dimension in ESG-related metrics |
| [12] | Experimental study | Empirical testing using IoT sensors and commercial metaverse platforms | Investigates how physical IoT data interact with virtual metaverse spaces | Validates feasibility of real-time immersive visualization | Demonstrates the interoperability between physical and virtual building environments |
| [13] | Theoretical model | Conceptual modelling and synthesis of digital twin–metaverse integration | Explores merging digital twin and metaverse paradigms for smart building management | Proposes a conceptual framework for immersive, data-driven control of buildings | Provides the theoretical baseline for designing the integrated digital management model proposed in this research |

Note. The five studies summarized in Table 1 were identified through a targeted Scopus search using the query TITLE-ABS-KEY("metaverse") AND TITLE-ABS-KEY("smart building") for the period 2018–2025. All selected papers are peer-reviewed and directly address the intersection of metaverse technologies and smart building management. Their synthesis provides the conceptual and methodological foundation for the development of the proposed integrated metaverse-based management framework.

To ensure transparency in the literature selection process, a PRISMA-lite flow diagram was developed to illustrate the identification and screening of the retrieved studies. As shown in Figure 1, the Scopus database search conducted with the query TITLE-ABS-KEY("metaverse") AND TITLE-ABS-KEY("smart building") for the period 2018–2025 yielded five peer-reviewed records. No duplicates were found and no records were excluded, so all five studies were included in the qualitative synthesis summarized in Table 1.
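The counts reported for the PRISMA-lite flow can be checked with a few lines of Python. The following is only a minimal sketch: the class `PrismaLiteFlow` and the helper `scopus_query` are hypothetical names introduced here for illustration, and the `PUBYEAR` clause is an assumption about how the 2018–2025 period might be expressed in a Scopus advanced search, since the response states only the period and the two TITLE-ABS-KEY terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrismaLiteFlow:
    """Hypothetical container for the stage counts of a PRISMA-lite diagram."""
    identified: int
    duplicates_removed: int
    excluded_at_screening: int
    excluded_at_eligibility: int

    @property
    def screened(self) -> int:
        # Records left after duplicate removal.
        return self.identified - self.duplicates_removed

    @property
    def assessed_for_eligibility(self) -> int:
        # Records left after title/abstract screening.
        return self.screened - self.excluded_at_screening

    @property
    def included(self) -> int:
        # Records that reach the qualitative synthesis.
        return self.assessed_for_eligibility - self.excluded_at_eligibility


def scopus_query(terms: list[str], first_year: int, last_year: int) -> str:
    """Build a query string; the PUBYEAR form is an assumption, not from the source."""
    clauses = " AND ".join(f'TITLE-ABS-KEY("{term}")' for term in terms)
    return f"{clauses} AND PUBYEAR > {first_year - 1} AND PUBYEAR < {last_year + 1}"


# Counts as reported in the response: five records, no duplicates, no exclusions.
flow = PrismaLiteFlow(
    identified=5,
    duplicates_removed=0,
    excluded_at_screening=0,
    excluded_at_eligibility=0,
)
query = scopus_query(["metaverse", "smart building"], 2018, 2025)

print(query)
print(f"identified={flow.identified}, screened={flow.screened}, included={flow.included}")
```

Deriving each stage from the previous one, instead of typing every box by hand, keeps the numbers in the diagram consistent with those in the text.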

Figure 1. PRISMA-lite flow diagram illustrating the identification, screening, eligibility, and inclusion process of studies retrieved from Scopus (2018–2025) using the query TITLE-ABS-KEY(“metaverse”) AND TITLE-ABS-KEY(“smart building”).

Q4. Ensure consistent numbering and complete captions; ensure that every table/figure is referred to in the text.

A4. All tables and figures have been renumbered.

Q5. The definition of BIM is missing, so I recommend the narrow and broad approach from 2023.

A5. The following definition of BIM has been added on page 2.

Building Information Modeling (BIM) can be defined at two levels of meaning: narrow and broad. In its narrow sense, BIM is a technology that enables the creation, visualization, and management of three-dimensional building elements with parametric information attached or embedded. In its broad sense, BIM is an integrated, collaborative approach that links and manages multidisciplinary information and workflows throughout the life cycle of a building, from creation and construction to operation and maintenance.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

Having carefully reviewed the manuscript, I can say that all my previous comments have been addressed and the manuscript has been greatly improved, so I accept it as is.

Best Regards 

Author Response

Point-by-Point Answers to Reviewer 3

Q1. Dear Authors, Having carefully reviewed the manuscript, I can say that all my previous comments have been addressed and the manuscript has been greatly improved, so I accept it as is. Best Regards.

A1. Thank you, dear reviewer.

Author Response File: Author Response.pdf
