Federated Learning-Enabled Building Stock Modeling for Privacy-Preserving Embodied Carbon Benchmarking in Residential Construction

Albelwi, Naif

doi:10.3390/buildings16051029

Open AccessArticle

Federated Learning-Enabled Building Stock Modeling for Privacy-Preserving Embodied Carbon Benchmarking in Residential Construction

by

Naif Albelwi

Department of Architectural Engineering, College of Engineering at Yanbu, Taibah University, Yanbu Al-Bahr 46425, Saudi Arabia

Buildings 2026, 16(5), 1029; https://doi.org/10.3390/buildings16051029

Submission received: 3 February 2026 / Revised: 25 February 2026 / Accepted: 2 March 2026 / Published: 5 March 2026

(This article belongs to the Section Building Energy, Physics, Environment, and Systems)

Download

Browse Figures

Versions Notes

Abstract

Benchmarking embodied carbon in residential building stock accurately would involve a high volume of data sharing and would pose serious privacy and competitive issues among building construction stakeholders. This study introduces a new federated learning-based building stock modeling system (FedCarbon) that can allow embodied carbon to be evaluated collaboratively without data aggregation at a central place. The architecture proposed enables construction firms, cities, and providers of construction materials to collectively train predictive models at the same time as data sovereignty is achieved via a hierarchical federated aggregation mechanism with attention-based client weighting. A differentiated privacy scheme that is adaptively calibrated on noise guarantees the privacy of individual projects and allows for statistically significant benchmarking based on heterogeneous building portfolios. The framework also includes a gradient compression scheme based on momentum, which incurs an 82.6% reduction in communication overhead over traditional federated averaging-based methods and still maintains model convergence. The effectiveness of the approach is demonstrated with the help of comprehensive validation with the UCI Energy Efficiency Dataset, which includes 768 residential building configurations, and the Embodied Carbon in European Buildings Database, which includes 2340 residential units in 12 European jurisdictions. It has been experimentally shown that FedCarbon has a 94.2% prediction accuracy (R2) on embodied carbon intensity, with a mean absolute error of 21.4 kgCO₂e/m², and that (ε, δ) differential privacy can be guaranteed with ε = 1.0 and −δ = 10⁻⁵. This structure opens up building stock knowledge and hastens industry-wide implementation of low-carbon building strategies.

Keywords:

federated learning; embodied carbon; building stock modeling; differential privacy; sustainable construction; privacy-preserving machine learning; residential buildings; smart buildings

1. Introduction

The built environment contributes about 37% of all carbon dioxide emissions that are associated with energy [1,2,3], and the embodied carbon of building materials and construction processes is increasingly making up a larger share of the overall carbon footprint [4,5,6]. With reduced operational carbon emissions due to the increase in energy efficiency and the use of renewable energy, the relative significance of embodied carbon in life cycle assessment has increased significantly [7,8,9]. In developed economies, residential construction represents over 60% of current floor area, and presents a serious opportunity to reduce carbon by means of evidence-based material selection and design optimization [10,11,12]. Nevertheless, to set the right embodied carbon standards, it is necessary to have detailed data gathering under various building typologies and material supply chains, which poses serious difficulties in terms of coordination at the industry level.

The construction industry is a very fragmented ecosystem, with the companies engaged in construction, and the material suppliers, architects, and municipalities all keeping building performance data separately [13,14,15]. This information fragmentation poses significant challenges to creating strong carbon benchmarking systems that would be able to inform policy-making and make significant comparisons across building portfolios. Conventional centralized methods have major challenges, such as competitive issues among construction companies, regulatory limitation of cross-jurisdictional data exchange as well as privacy concerns for building owners [16,17,18]. These have constrained the creation of holistic building stock models that would hasten the process of moving towards low-carbon construction practices [19,20,21].

Urban science is based on the view that cities and regions are becoming more and more in need of credible portfolio-level embodied carbon standards to facilitate evidence-based urban planning, retrofit prioritization, and climate action plans. Municipal governments are supposed to calculate and contrast the carbon performance of residential building stocks at the city and regional levels so to use them in sustainability reporting, tracking of progress towards climate goals and development of low-carbon development policies. Nevertheless, it is difficult to put such benchmarking into practice, with the information about residential buildings being spread among various private and public stakeholders, including construction companies, home distributors, material suppliers, and municipal authorities. Detailed building and material data are often not centralized, as aggregation is often infeasible because of privacy laws, commercial sensitivity and fragmented ownership structures. Here, the suggested federated learning model is clearly placed as a facilitator of cross-city and cross-region embodied carbon benchmarking, where urban stakeholders can obtain similar, city-scale information without exchanging raw building data.

Federated learning has become one of the prospective paradigms of collaborative machine learning that can allow several stakeholders to train predictive models together without centralizing raw data [13]. Federated learning resolves the underlying privacy issues by distributing model training to the clients participating in it and summing up model parameters but not sensitive data, allowing knowledge sharing across organizational borders [16]. The recent use of smart building settings has proven that federated methods are possible in energy consumption prediction, thermal comfort, and building performance optimization [5]. Nonetheless, some of the issues of embodied carbon benchmarking are particular to specialized framework adaptations that have not been resolved sufficiently in the literature [9].

Why Existing FL Methods Are Insufficient for Embodied Carbon Assessment

Existing federated learning methods successful in operational energy prediction and thermal comfort modeling cannot be directly applied to embodied carbon assessment due to three domain-specific challenges:

(1): Extreme Data Heterogeneity in Material Composition—embodied carbon datasets exhibit significantly higher feature heterogeneity than operational energy datasets; while operational energy depends on relatively standardized building geometry and HVAC configurations, embodied carbon is determined by material composition (concrete types, steel grades, timber species, and insulation), varying dramatically across construction traditions, local material availability, and supply chain structures, creating non-IID distributions fundamentally different from the energy forecasting where standard FedAvg and FedProx were validated.
(2): Multi-Scale Feature Sensitivity—embodied carbon features span multiple orders of magnitude in sensitivity to privacy-preserving noise (concrete volume 50–500 m³ vs. reinforcement steel 1000–20,000 kg have vastly different scales and privacy implications); uniform noise calibration in standard DP-FedAvg either over-protects low-sensitivity features (degrading accuracy) or under-protects high-sensitivity features (compromising privacy), motivating our adaptive noise calibration (Equations (11) and (12)).
(3): Assessment Boundary Heterogeneity—different stakeholders use different LCA boundaries (A1–A3 cradle-to-gate, A1–A5 cradle-to-site, and A1–C4 cradle-to-grave), creating systematic target variable definition differences across clients, a unique challenge absent in operational energy benchmarking, where energy consumption (kWh) is universally defined, which standard federated aggregation methods do not account for, thereby producing biased global models.

Federated learning coupled with the implementation of differential privacy mechanisms can offer both formal mathematical privacy guarantees on the privacy of individual data and allow meaningful aggregate analysis [22,23,24]. Differential privacy adds some noise to the model training processes with a carefully selected level of noise, such that the existence (or absence) of a single data point cannot be confidently estimated based on the published model parameters [14]. In the case of embodied carbon benchmarking applications, the privacy of the approach to differentiate between the different construction stakeholders should be safeguarded as a way to motivate them to participate in the process in case they do not want to disclose their competitive advantage in terms of material costs or proprietary construction methods [17].

Figure 1 provides a conceptual map of privacy-preserving embodied carbon benchmarking, which shows the issues of data sovereignty and the suggested federated learning solution architecture. The framework allows the parties to participate in the joint model training as multiple categories of stakeholders, such as construction companies, municipalities, and material suppliers, without having to surrender their proprietary data assets.

The primary contributions of this research are summarized as follows:

Novel Hierarchical Federated Architecture: To fulfill the needs of heterogeneous building data, the multi-stakeholder coordination requirements of different life cycle assessment methods, and to address heterogeneous building data, we suggest FedCarbon, a hierarchical federated learning architecture with attention-based client weighting especially suitable to the needs of embodied carbon assessment.
Adaptive Differential Privacy Mechanism: We introduce a dynamic and adaptive noise calibration program, which varies the privacy settings according to the sensitivity of the embodied carbon features, to obtain formal (ϵ, δ) differential privacy guarantees and achieve a prediction accuracy of over 94%.
Momentum-Enhanced Gradient Compression: We propose an error-feedback sparsification method based on momentum that, when used, decreases communication overhead by 82.6% over a typical federated averaging method and facilitates stakeholders who have lower bandwidth connectivity to participate.
Comprehensive Empirical Validation: We perform comprehensive tests on two publicly available datasets of 3108 residential building configurations in various geographic locations, which prove to be practically useful to embodied carbon benchmarking applications in real-life settings.

The rest of this paper is structured in the following way: Section 2 provides the review of related work concerning federated learning, differential privacy, and the construction of carbon assessment. Section 3 provides the FedCarbon methodology and mathematical model. The results of the experiment and analysis are discussed in Section 4. Section 5 is a discussion and analysis. Section 6 is the conclusion of the paper.

2. Related Work

In this section, the literature in three interrelated areas is reviewed: federated learning applications in smart buildings, privacy-aware approaches in distributed machine learning, and embodied carbon assessment methods.

2.1. Federated Learning in Smart Building Environments

Federated learning has received much interest in smart building systems because it allows for the collaborative creation of models without the loss of data privacy [1]. Wang et al. [5] suggested a personalized federated learning model to develop energy consumption forecasting that can deal with non-IID data distributions among building typologies. Abbas et al. [4] proposed a privacy-aware thermal comfort prediction model on smart buildings based on federated learning, which proves that it is possible to use distributed machine learning to predict building performance [25,26,27,28,29,30,31,32,33,34].

Amangeldy et al. [6] provided a thorough review of artificial intelligence and deep learning techniques to manage resources in smart buildings and found federated learning as a future opportunity to use in privacy-preserving analytics. Shan et al. [7] discussed AI-based multi-objective optimization methods to optimize the energy retrofit of urban buildings, and stated the possibility of using machine learning to make decisions faster in carbon reduction [28,29]. Hinterstocker et al. [15] applied federated learning to building energy performance prediction across over 25,000 residential buildings, incorporating differential privacy techniques and demonstrating that privacy-preserving FL achieves comparable accuracy to centralized approaches in building stock-level energy assessment. Rizwan et al. [30] proposed a convergence-aware federated transfer learning framework for residential energy consumption prediction that enables collaborative model training across multiple buildings without disclosing raw energy data, demonstrating the applicability of FL to multi-building stock-level performance assessment.

2.2. Privacy-Preserving Mechanisms for Distributed Learning

Federated learning has been widely combined with differential privacy [9]. Mohammadi et al. [10] demonstrated the integration of federated learning with differential privacy for secure anomaly detection in smart grid infrastructure, showing that DP-enhanced FL can achieve effective privacy–utility balance in distributed energy systems. A detailed study of collaborative intelligence in federated learning was conducted by Lazaros et al. [13], who analyzed different aggregation strategies and its consequences to privacy and utility. The survey of scalable and secure edge AI systems conducted by Rourke and Leclair [14] explored privacy-preserving mechanisms that can be applied in resource-constrained environments of deployment. Folino et al. [17] created a scalable vertical federated learning system that has shown privacy-preserving analytics in the field of cybersecurity and generalizable methodological insights. Deng et al. [19] proposed a privacy-preserving federated learning framework for collaborative risk assessment across smart grid operators, demonstrating that distributed benchmarking of critical infrastructure performance is achievable without centralizing sensitive operational data. Yang et al. [22] developed a gradient compression federated learning framework with adaptive local differential privacy budget allocation, demonstrating the feasibility of jointly optimizing communication efficiency and privacy guarantees in distributed learning settings.

2.3. Embodied Carbon Assessment and Building Stock Modeling

Proper embodied carbon evaluation demands in-depth life cycle evaluation of building substances [2]. Feng et al. [8] showed how digital twin and edge intelligence could be used to decarbonize more precisely, and the methodological findings can be adopted and applied to the building sector. Bahadori-Jahromi et al. [11] discussed the applicability of artificial intelligence in promoting civil engineering, such as sustainable building. Siakas et al. [12] examined self-directed cyber-physical systems that facilitate intelligent positive energy districts.

Gupta et al. [18] studied the concept of federated learning in smart farming application, and showed that privacy-preserving distributed learning could be used in sustainability applications. Goktas and Ibrahim [20] wrote about the energy management and communication systems of smart grids, and their role in the optimization of building energy. El Hafdaoui et al. [21] demonstrated that machine learning models using supervised learning techniques can estimate embodied carbon throughout the building life cycle, with average errors of approximately 15.71%, though centralized data requirements limit scalability across diverse building stocks and geographic regions. Zhang et al. [28] provided a comprehensive NIST systematic review of embodied carbon assessment and reduction methods across building life cycles, identifying significant inconsistencies in assessment methodologies and database selection that underscore the need for standardized benchmarking frameworks. Li et al. [31] published a harmonized dataset of high-resolution embodied life cycle assessment results for North American buildings, revealing that inconsistent LCA scopes, methods, and background datasets across geographies severely limit the comparability of embodied carbon benchmarks—a challenge that federated learning approaches can address by enabling collaborative model training without requiring data centralization.

2.4. Research Gap Analysis

A systematic comparison of the current methods based on seven key capabilities, as shown in Table 1, indicates that, though previous literature considers individual elements like the federated learning of smart buildings [1,5], differential privacy mechanisms [4,13], or embodied carbon assessment [7,8], none of them incorporates all of the key components necessary to achieve privacy-preserving carbon benchmarking. The analysis reveals that there is a large research gap in which no framework is used to combine federated learning, embodied carbon modeling, differential privacy, gradient compression, and multi-stakeholder coordination to develop building stock applications. The gap that FedCarbon bridges—by offering the first end-to-end solution that is comprehensive and covers all seven dimensions of capability—allows full coverage in practical applications in the construction industry ecosystem when it comes to collaborative carbon benchmarking.

Table 1 evaluates related work against seven capability dimensions:

Federated—uses distributed model training across multiple clients without centralizing raw data.
Embodied Carbon—explicitly models embodied carbon, life cycle carbon, or material-related CO₂ emissions (not limited to operational energy).
Diff. Privacy—implements formal (ε, δ) differential privacy or equivalent mathematical guarantees (not merely anonymization or access control).
Compression—uses gradient compression, sparsification, quantization, or communication-efficient techniques to reduce bandwidth.
Multi-Stakeholder—designed for or validated with multiple distinct organizational entities (not merely multiple devices within one organization).
Building Stock—operates at building portfolio or urban stock levels, modeling multiple buildings across typologies (not single-building optimization).
Real Data—validated on real-world measured data or verified simulation datasets (not purely synthetic or toy examples).

Table 2 shows the Quantitative operationalization of capability dimensions.

3. Proposed Methodology

In this section, the FedCarbon framework on privacy-preserving embodied carbon benchmarking is provided. We start with the overview of the system, and then perform mathematical modeling of the federated learning structure, differential privacy, and gradient compression.

3.1. System Overview

Figure 2 shows the overall FedCarbon design, which shows the hierarchical layout between construction stakeholders, local training, and the privacy-sensitive aggregation server. The system consists of three main layers, including the client layer, which involves local building data and model training, the aggregation layer, which involves differential privacy and gradient compression, and the application layer, which involves carbon benchmarking services.

Let

K = {1, 2, \dots, K}

denote the set of K participating clients, where each client k maintains a local dataset

D_{k} = {(x_{i}^{(k)}, y_{i}^{(k)})}_{i = 1}^{n_{k}}

containing

n_{k}

building records. The feature vector

x_{i}^{(k)} \in R^{d}

encodes building characteristics relevant to embodied carbon estimation. The target variable

y_{i}^{(k)} \in R^{+}

represents embodied carbon intensity in kgCO₂e/m².

Hierarchical Architecture Implementation Details

The FedCarbon hierarchical architecture employs a three-level aggregation structure:

Level 1—Client-Level Training: Each client k (construction firm, municipality, or material supplier) trains the model locally on its private dataset D_k for E local epochs. Clients compute local model updates Δθ_k using differentially private SGD with per-sample gradient clipping (Equation (9)) and Gaussian noise injection (Equation (10)).

Level 2—Regional Aggregation: Clients are grouped into R = 4 geographic regions (Northern EU, Central EU, Southern EU, and Eastern EU). Each region r has a designated regional aggregator that collects compressed updates from its member clients and performs intra-region aggregation:

θ_{r}^{\{(t + 1)\}} = θ^{\{(t)\}} + {}_{{k \in R_{r}} α_{k}}{\sum_{\tilde{\{(t)\}} {\{Δ \ t h e t a\}}_{k}^{\{(t)\}}}}

where

α_{k}^{t}

are attention-based weights computed within the region using Equation (6). The regional aggregator does not access raw data—it only processes compressed model updates.

Level 3—Global Aggregation: A global server collects regionally aggregated updates and computes the global model:

θ^{\{(t + 1)\}} = θ^{\{(t)\}} + \sum_{\{r = 1\}}^{\{R\} w_{r}^{\{(t)\},} Δ \ t h e t a_{r}^{\{(t)\}}}

where w_r = n_r/N represents the proportion of total samples in region r.

Communication Timing and Synchronization Protocol:

FedCarbon adopts fully synchronous aggregation at both hierarchical levels using the following protocol.

Intra-Region Synchronization (Synchronous):

Within each region (r), all clients k∈Rr perform (E = 5) local epochs and send compressed updates to the regional aggregator. The aggregator waits for all selected clients St∩Rr, with a timeout of τ_timeout = 300 s; late clients are dropped and aggregation proceeds. With K = 20 clients (5 per region), no timeouts were observed.

Inter-Region Synchronization (Synchronous):

The global server waits for all (R = 4) regional aggregators before computing the global model θ^{(t + 1)}, resulting in fully synchronous global aggregation.

DP Noise Application Order (Critical Design Choice):

Differential privacy is applied at the client before communication, following this order:

Local update Δθ_k^(t) via SGD;
Gradient clipping and Gaussian noise injection;
Top-K gradient compression;
Transmission of compressed DP-sanitized update Δθ_k^(t);
Attention weight computation at the regional aggregator;
Attention-weighted regional aggregation and forwarding to the global server.

Implication for Attention–Noise Interaction:

Since attention operates on DP-noisy updates, it may reweight noise across clients; however, it learns to downweight low signal-to-noise updates. Empirically, this yields higher performance (R² = 0.942) than uniform-weighted DP-FedAvg (R² = 0.924).

Regional Aggregators and Raw Data Access:

Regional aggregators never access raw data or pre-DP updates; they use only compressed, DP-sanitized updates, and their attention parameters are updated using global validation feedback rather than raw client information.

3.2. Embodied Carbon Prediction Model

The embodied carbon prediction problem is presented as a regression problem in which the objective is to learn a mapping function

f_{θ} : R^{d} \to R^{+}

parameterized by weights

θ

:

f_{θ} (x) = W_{L} \cdot σ (W_{L - 1} \cdot σ (\dots σ (W_{1} \cdot x + b_{1}) \dots) + b_{L - 1}) + b_{L}

(1)

The local loss function for client k is defined using mean squared error with L2 regularization:

L_{k} (θ) = \frac{1}{n_{k}} \sum_{i = 1}^{n_{k}} {(f_{θ} (x_{i}^{(k)}) - y_{i}^{(k)})}^{2} + λ ∥ θ ∥_{2}^{2}

(2)

The global objective function aggregates local losses:

L (θ) = \sum_{k = 1}^{K} \frac{n_{k}}{N} L_{k} (θ)

(3)

3.3. Federated Learning Framework

Algorithm 1 presents the complete FedCarbon training procedure.

Algorithm 1: FedCarbon: Federated Learning for Embodied Carbon

Require: Clients K, rounds T, epochs E, learning rate η, privacy (ϵ, δ), compression ρ

Ensure: Global model θ^(T)

1.: Initialize global model θ(0), attention parameters for each region
2.: for t = 0 to T − 1 do
3.: Server broadcasts θ(t) to all regional aggregators
4.: for each region r ∈ {1, …, R} in parallel do
5.: Regional aggregator r broadcasts θ^(t) to clients in region r
6.: St ← random subset of m clients in region r
7.: for each client k ∈ St in parallel do
8.: θ _k ^(t,0) ← θ(t)
9.: for e = 0 to E − 1 do
10.: Sample mini-batch Bk from Dk
11.: gk ← ∇θ Lk(θ _k ^(t,e); Bk)
12.: $\tilde{g}$ k ← ClipGradient(gk, C)
Adaptive per-feature clipping
13.: $\hat{g}$ k ← AddNoise( $\tilde{g}$ k, σ)
Adaptive per-feature noise
14.: θ _k ^{(t,e + 1)} ← θ_k ^(t,e) − η $\hat{g}$ k
Momentum update (Equations (4) and (5))
15.: end for
16.: ∆θ(t) k ← θ(t,E)
17.: ∆ $\tilde{θ}$ _k^(t) ← Compress(∆θ _k^(t), ρ)
18.: Updates transmitted to regional aggregator are DP-sanitized and compressed
19.: end for
20.: θ(t+1)←θ⁽^t⁾ + ∑_k_∈_St nk/∑_jn_j Δθ^~⁽^t⁾
21.: end for
22.: Return θ

The local update rule with momentum is

v_{k}^{(τ + 1)} = β v_{k}^{(τ)} + (1 - β) {\hat{g}}_{k}^{(τ)}

(4)

θ_{k}^{(τ + 1)} = θ_{k}^{(τ)} - η v_{k}^{(τ + 1)}

(5)

The attention-weighted aggregation is

α_{k}^{(t)} = \frac{e x p (v^{⊤} t a n h (W_{a} Δ {\tilde{θ}}_{k}^{(t)}))}{\sum_{j \in S_{t}} e x p (v^{⊤} t a n h (W_{a} Δ {\tilde{θ}}_{j}^{(t)}))}

(6)

θ^{(t + 1)} = θ^{(t)} + \sum_{k \in S_{t}} α_{k}^{(t)} Δ {\tilde{θ}}_{k}^{(t)}

(7)

Attention parameters W_a ∈ ℝ^da×p} and v∈ ℝ^da (d_a = 64) operate exclusively on compressed model updates Δ

\tilde{θ}

_k^(t), not raw client data, maintained at the regional aggregator level. This ensures no aggregator accesses raw building data; attention learns to assign higher weights to informative and consistent updates as a proxy for quality without direct data inspection.

Training Procedure:

Clients transmit compressed model updates Δ $\tilde{θ}$ _k^t (no raw data transmitted);
Regional aggregator computes attention scores α_k^t using Equation (6) based solely on model updates;
After distributing regionally aggregated model, aggregator updates W_a and v using validation performance feedback—reinforcing current distribution if loss decreases, otherwise adjusting via gradient step on attention parameters.

Attention Weight Properties:

Attention weights moderately correlate with data share (Pearson r = 0.82) but not perfectly proportional, balancing quantity with quality;
Without regularization: attention variance increased (std = 0.058 vs. 0.031), while the lowest-weighted region’s R² dropped to 0.014;
Regularization ensures balanced contributions while downweighting consistently low-quality updates.

This design ensures that at no point does any aggregator access raw building data.

3.4. Differential Privacy Mechanism

We use the Rényi Differential Privacy (RDP) accountant [34] via Opacus for tight privacy composition, providing tighter bounds than basic composition or moments accountant. We adopt per-record adjacency (datasets D and D’ differ by ≤1 building record), protecting individual building-level information. Subsampling amplification with mini-batches (size B) from the client dataset (size n_k) yields the subsampling rate q = B/n_k; by privacy amplification lemma, mechanisms satisfying (α,

\tilde{ε}

)-RDP on the full dataset satisfy (α, log(1 + q(exp(

\tilde{ε}

) − 1)))-RDP on the subsampled dataset. The total privacy budget for T communication rounds with E local epochs, and the Gaussian mechanism (noise multiplier σ) is computed via RDP composition.

P r [M (D) \in S] \leq e^{ϵ} P r [M (D^{'}) \in S] + δ

(8)

Per-sample gradients are clipped:

{\tilde{g}}_{i} = g_{i} \cdot m i n (1, \frac{C}{∥ g_{i} ∥_{2}})

(9)

Gaussian noise is added:

\hat{g} = \frac{1}{|B|} \sum_{i \in B} {\tilde{g}}_{i} + N (0, σ^{2} C^{2} I)

(10)

The adaptive clipping threshold is

C_{j} = C_{base} \cdot (1 + α_{c} \cdot s_{j})

(11)

The adaptive noise variance is

σ_{j}^{2} = σ_{base}^{2} \cdot {(1 + α_{n} \cdot s_{j})}^{2}

(12)

Intuitive Explanation of Adaptive Noise Calibration:

The adaptive noise calibration mechanism adjusts the clipping threshold Cj and noise variance σj² per-feature-group based on the sensitivity score sj, quantifying how much information a feature group reveals about individual building projects. The sensitivity score sj combines: (1) value range sensitivity—ratio of feature group’s inter-quartile range to median, capturing distributional spread; and (2) gradient contribution—average magnitude of gradient components for feature group j over previous Tw = 10 communication rounds.

Gradient Independence from Private Aggregation: The gradient magnitudes used in the sensitivity score sj are computed exclusively on the frozen global model parameters θ^(t) at the beginning of each communication round, prior to any local private training updates. Specifically, at the start of round t, each client computes ∇_θ L^k(θ^(t); B_calibration) on a designated public calibration subset B_calibration (10% of each client’s local data, held out from training and pre-registered before FL training begins). These calibration gradients are aggregated across clients via simple averaging (without DP noise) to produce the global sensitivity score sj^(t). Crucially, the gradient operator used for sj is evaluated at θ^(t)—the global model parameters broadcast by the server—which is itself a post-processed output of the DP-protected aggregation from round t − 1. By the post-processing theorem, θ^(t) inherits the cumulative DP guarantee, and any deterministic function of θ^(t) (including gradient evaluation on public data) does not incur additional privacy cost. The calibration subset is excluded from the private training mini-batches to ensure no double-dipping between sensitivity estimation and private model updates

For analytical reference, the privacy budget under basic advanced composition can be upper-bounded as

ϵ = \sqrt{2 T l n (1.25 / δ)} \cdot \frac{q}{σ_{eff}}

(13)

Equation (13) provides a loose analytical upper bound based on the advanced composition theorem [12] and is included for interpretive reference only. The actual privacy budget reported in all experiments (ε = 0.97 at δ = 10⁻⁵) is computed using the Rényi Differential Privacy (RDP) accountant implemented via Opacus 1.4.0, which provides strictly tighter composition bounds through numerical RDP-to-(ε,δ)-DP conversion [34]. The RDP accountant tracks privacy expenditure across T = 200 communication rounds × E = 5 local epochs with subsampling rate of

q = \frac{B}{n_{k}} = \frac{32}{n_{k}}

per client, yielding

ε_{R D P} = 0.97 < ε_{a n a l y t i c a l}

, confirming that Equation (13) is indeed a conservative upper bound

3.5. Gradient Compression

Top-k components are chosen with the help of the compression operator:

{[TopK (u, ρ)]}_{j} = \{\begin{array}{l} {[u]}_{j} & if |{[u]}_{j}| \geq τ_{ρ} (u) \\ 0 & otherwise \end{array}

(14)

Error feedback accumulation:

e_{k}^{(t + 1)} = e_{k}^{(t)} + Δ θ_{k}^{(t)} - TopK (e_{k}^{(t)} + Δ θ_{k}^{(t)}, ρ)

(15)

Algorithm 2 details the compression procedure.

Algorithm 2: Gradient Compression with Error Feedback

Require: Update ∆θ, ratio ρ, error buffer e

Ensure: Compressed ∆ $\tilde{θ}$ , updated e′

u ← e + ∆θ

τ ← top-⌈ρp⌉ magnitude threshold in u

∆

\tilde{θ}

← 0

for j = 1 to p do

if |[u]j | ≥ τ then

[∆

\tilde{θ}

]j ← [u]j

end if

end for

e′ ← u − ∆

\tilde{θ}

return ∆

\tilde{θ}

, e′

3.6. Convergence Analysis

L_{k} (θ)

is L-smooth:

∥ \nabla L_{k} (θ) - \nabla L_{k} (θ^{'}) ∥_{2} \leq L ∥ θ - θ^{'} ∥_{2}

(16)

E [∥ \nabla L_{k} (θ; ξ) - \nabla L_{k} (θ) ∥_{2}^{2}] \leq σ_{g}^{2}

(17)

∥ \nabla L_{k} (θ) - \nabla L (θ) ∥_{2}^{2} \leq γ^{2}

(18)

Under Assumptions 1–3, FedCarbon achieves:

\frac{1}{T} \sum_{t = 0}^{T - 1} E [∥ \nabla L (θ^{(t)}) ∥_{2}^{2}] \leq O (\frac{1}{\sqrt{T}} + \frac{σ^{2} C^{2} p}{B^{2}} + (1 - ρ) γ^{2})

(19)

Relationship to Analytical Optimality Bounds: We note that the privacy–utility trade-off in FedCarbon is characterized empirically rather than through closed-form analytical bounds. Information-theoretic frameworks such as Sankar et al. [32] have established tight privacy–utility trade-off bounds for smart meter data using rate-distortion theory, demonstrating that optimal privacy-preserving solutions can be derived analytically under Gaussian assumptions. More recently, communication–privacy trade-offs in distributed settings have been characterized through explicit rate expressions at the 60th Allerton Conference on Communication, Control, and Computing [33], providing tight bounds on achievable accuracy under joint communication and privacy constraints. FedCarbon does not claim analytical optimality in the information-theoretic sense; rather, it demonstrates empirical near-optimality by achieving R² = 0.942 with a (ε = 1.0, δ = 10⁻⁵)-DP guarantee and an 82.6% communication reduction—which is within 2.6% of the non-private centralized upper bound (R² = 0.968). Our ‘first comprehensive’ claim refers specifically to the integration of all seven capability dimensions within a single operational framework for embodied carbon benchmarking, not to theoretical optimality of any individual component. A framework satisfying six of seven dimensions with tighter theoretical bounds would represent a complementary rather than superseding contribution, as practical deployment in multi-stakeholder construction ecosystems requires the full integration we provide.

4. Results and Evaluation

4.1. Datasets

The UCI Energy Efficiency Dataset is included to (i) leverage building geometry and envelope features that are strongly linked to material quantities and embodied carbon, (ii) enable reproducible comparison with prior federated learning studies in smart building research, and (iii) evaluate the generalizability of the FedCarbon framework across different building performance targets. While ECEBD remains the primary dataset for embodied carbon benchmarking, the UCI dataset provides complementary evidence of model robustness and cross-task applicability.

We use two publicly available datasets:

Dataset 1: UCI Energy Efficiency Dataset. Eight building features and 768 samples. URL: https://archive.ics.uci.edu/ml/datasets/energy+efficiency (accessed on 12 August 2025).

Dataset 2: Embodied Carbon in European Buildings Database (ECEBD). A total of 2340 residential buildings in 12 EU countries consisting of 24 features. URL: https://github.com/mroeck/Embodied-Carbon-of-European-Buildings-Database (accessed on 12 August 2025).

Table 3 summarizes dataset characteristics.

4.2. Experimental Setup

Our implementation of FedCarbon is based on PyTorch, PySyft and Opacus 3.9. Hyperparameters: K = 20 clients, R = 4 regions, E = 5 local epoch, batch size = 32, learning rate = 0.001, clipping = C = 1.0, noise

σ = 1.2

, compression

ρ = 0.1

, and momentum

β = 0.9

.

Baseline Hyperparameter Tuning and Fairness Protocol:

To ensure fair comparison, all baseline methods were evaluated under a standardized protocol.

Standardization:

Identical Data Partitions: All methods use the same client data partitions from Dirichlet allocation (α = 0.5 and seed = 42), an identical 80/20 train–test split per client.
Uniform Privacy Budget: All DP methods (DP-FedAvg, DP-SCAFFOLD, and FedCarbon) are evaluated at ε = 1.0, δ = 10⁻⁵, and the same clipping threshold C = 1.0 and noise multiplier σ = 1.2 for DP-FedAvg/DP-SCAFFOLD; FedCarbon uses adaptive clipping/noise (Equations (11) and (12)), constrained to ε = 1.0 via composition bound (Equation (13)).
Architecture: All baselines use the same three-layer MLP (hidden dimensions [128, 64, 32], ReLU activations, and L2 regularization λ = 10⁻⁴).
Infrastructure: NVIDIA A100 GPU (40GB VRAM), PyTorch 2.1, PySyft 0.8.7, and Opacus 1.4.0. Table 4 shows the Hyperparameter optimization (grid search).

The datasets are partitioned using a standard Dirichlet-based non-IID strategy to simulate heterogeneous federated settings with 20 clients organized into four regions (five clients per region). For the ECEBD dataset, buildings are first assigned to regions based on geography (Northern, Central, Southern, and Eastern Europe), while UCI records are randomly assigned to regions due to the absence of location attributes. Within each region, data are distributed to clients using a Dirichlet distribution over 10 target-variable (ECI) quantile bins, where the concentration parameter α controls heterogeneity (α = 0.1 for highly skewed, α = 0.5 for moderate non-IID, and α = 10 for near-IID partitions). The degree of heterogeneity is quantified using Earth Mover’s Distance between client distributions and Weight Divergence from the uniform distribution. Table 5 shows the Dirichlet concentration parameter (α) on federated data heterogeneity measured by Earth Mover’s Distance (mean ± std) and Weight Divergence. While, Table 6 shows the Impact of heterogeneity on FedCarbon performance (ECEBD).

The experimental setting, in the context of the present study, is a simulated multi-stakeholder urban setting. Every client is associated with an urban stakeholder, i.e., construction businesses, residential developers, material distributors, or local governments, which functions within a particular city or administrative area. The regional parameter R captures different urban or regional settings, which allow for modeling the city-level heterogeneity in residential building features. The non-IID data distributions among clients reflect the realistic variations in the typologies of buildings, material selections, and construction processes that are usually seen in varying urban environments. In such a setup, the proposed framework can be evaluated with the conditions that are close to the real-world urban and inter-city benchmarking scenarios, and the data sovereignty can be maintained across the stakeholders that are part of the setup.

4.3. Training Convergence

Figure 3 shows training loss curves in terms of communication rounds.

4.4. Prediction Accuracy

Figure 4 shows how the prediction accuracy (R²) increases with 200 communication rounds on both UCI Energy Efficiency and ECEBD datasets, and FedCarbon (red squares) reaches its final accuracy of 0.921 and 0.942 respectively, surpassing DP-FedAvg and Local Only baselines, and approaching the non-private Centralized upper bound. The convergence curves validate the fact that the adaptive differential privacy and attention-based aggregation mechanisms of FedCarbon are effective in balancing the privacy preservation with the model utility, and converge to a stable point with 150 rounds, despite the noise injection and gradient compression overhead.

Figure 5 presents predicted versus actual values.

Table 7 is the table with detailed performance comparisons.

4.5. Urban-Scale Embodied Carbon Benchmarking Demonstration

Although predictive accuracy measures of R², MAE, and RMSE are needed to verify the performance of models, urban sustainability models demand interpretable benchmarking results that allow for the comparison of cities and regions. To illustrate how the suggested framework can be applied in practice to decision making at the urban scale, we use percentile-based embodied carbon standards based on the federated predictions obtained on the ECEBD dataset. These benchmarks represent the possibility of positioning the residential building stocks against each other without the need to have central access to raw building data. Indicators based on percentiles are typical of reporting on the sustainability of cities and enable municipalities to understand the high-carbon segments, trace progress across time, and focus on the targeted retrofit or policy interventions.

The proposed federated learning approach provides the ability to extract portfolio-level embodied carbon benchmarks, which can be directly interpreted at the urban scale, as illustrated in Table 8. Cities or regions can use their residential building stock as a point of comparison between the percentile thresholds of the building stock to determine whether they are in the low-, medium-, or high-carbon categories compared to other jurisdictions. Notably, these standards can be generated with no access to personal building records or proprietary data, which enables comparison between cities and taking joint climate action in a demanding environment of data sovereignty and privacy.

The bootstrap confidence intervals in Table 9 are computed by resampling the federated model predictions with replacement 1000 times and computing percentiles on each resample. The CI widths (14.8–23.5 kgCO₂e/m²) represent 4.0–5.2% of the respective benchmark values, indicating moderate stability. Higher percentiles show wider intervals due to greater variance in the upper tail of the ECI distribution.

Percentile Boundary Crossing Analysis:

To quantify the practical impact of benchmark uncertainty on building classification, we analyze how many test-set buildings would be reclassified when percentile thresholds are adjusted to their CI bounds.” Table 10 shows the Boundary crossing analysis under CI-adjusted thresholds.

Leave-One-Country-Out (LOCO) Percentile Robustness Analysis:

To demonstrate robustness of the percentile benchmarks across geographic jurisdictions, we performed leave-one-country-out validation, where the federated model is retrained excluding all buildings from one country, and percentile benchmarks are recomputed on the remaining test set.

The LOCO analysis reveals that percentile benchmarks are moderately robust to the exclusion of any single country, with mean absolute shifts of 2.6–4.7 kgCO₂e/m² (0.9–1.0% of benchmark values) and maximum shifts of 8.4 kgCO₂e/m² (1.9%) when Poland is excluded. The largest perturbations occur when excluding countries with distinctive construction traditions (Poland: high masonry/concrete mix; Germany: largest sample contributing to Central EU calibration; Spain: high ECI variability in Southern EU). All LOCO percentile shifts fall within the bootstrap confidence intervals reported in Table 11, confirming that no single country disproportionately determines the aggregate benchmarks. The analysis validates that the federated model produces geographically robust benchmarks suitable for cross-jurisdictional comparison.

Recommendation for Municipal Use: Given the boundary crossing rates (6.6–9.4%) and LOCO variability (max 1.9%), we recommend that municipalities adopt the following protocol: (1) use CI-adjusted thresholds (lower bound of CI for P25 and upper bound for P75) to ensure conservative classification; (2) apply ‘buffer zones’ of ±15 kgCO₂e/m² around each percentile threshold; (3) buildings within buffer zones should undergo project-level LCA verification before policy classification.

Privacy Status of Released Benchmarks: The percentile benchmarks in Table 5 inherit the (ε = 1.0, δ = 10⁻⁵)-DP guarantee from the global model θ^(T) by using the post-processing theorem [20], since benchmarks are deterministic functions of θ^(T) applied to test inputs and not raw data. However, if benchmarks are computed on training data where a single city dominates a regional partition, information leakage risks exist. Mitigation: (i) compute benchmarks on separate held-out public building stock survey; (ii) reported Table 5 benchmarks use the test set (20% held-out); (iii) DP guarantee applies regardless, as the model is DP-protected. For policy deployment, municipalities should apply benchmarks as approximate reference ranges with bootstrap confidence intervals rather than hard regulatory thresholds, acknowledging inherent uncertainty in model-derived statistics.

4.6. Privacy–Utility Trade-Off

The privacy–utility trade-off is presented in Figure 6. The privacy–utility trade-off curve shows that prediction accuracy (R²) rises as the privacy budget (ϵ) becomes larger, and FedCarbon (square markers) consistently has better results in both datasets and has a high privacy guarantee (ϵ ≤ 1 green region). Its visual representation of R² values (blue (low 0.65) to red (high ∼0.96)) shows that FedCarbon can perform almost at the same level as non-privacy FedAvg (gray dashed line) despite the severe privacy settings, which proves its superior adaptive noise calibration scheme

4.7. Communication Efficiency

Figure 7 shows communication analysis. Communication efficiency analysis shows that the momentum-enhanced gradient compression of FedCarbon has a bandwidth reduction of 82.6 percent at a compression ratio 0.1 or 0.38 MB of the bandwidth instead of 2.18 MB of the bandwidth, with a prediction accuracy R² = 0.942, which is within the reasonable range (0.90). FedCarbon has a better preservation of accuracy as compared to TopK-Basic and Random-Sparse compression techniques because it has a mechanism of accumulating error feedback that only needs 165 convergence rounds compared to 195 and 232 rounds of baseline methods at the same compression levels. The trade-off analysis supports the fact that FedCarbon allows for practical implementation by construction industry players with low network connectivity without compromising the performance of the model. Table 12 shows the Computational overhead and training time comparison.

4.8. Regional Performance

Figure 8 visualizes variations in the regional performance. Regional performance analysis proves the performance of FedCarbon in heterogeneous geographic regions, with the COMSOL-style heatmaps indicating that the highest accuracy (R² = 0.959) is obtained in Region 2 (Central EU), whereas slower convergence (R² = 0.923) is seen in Region 3 (Southern EU), as the heterogeneity of the data is more pronounced in that region, but all regions are converging to the satisfactory level of performance above 0.92. The client attention weights evolution heatmap shows how the attention-based aggregation process dynamically adapts the contributions of clients to 200 communication rounds, where the larger the attention (red), the larger the client contribution to the aggregation. The polar attention distribution validates the balanced regional contributions of 23.1\% to 26.8\% that prove hierarchical aggregation of FedCarbon to be effective in non-IID data distributions of various European building stock properties.

4.9. Ablation Study

Table 13 presents ablation results.

4.10. Error Decomposition Analysis

As shown in Table 14, Table 15 and Table 16, the model performs best on (i) single-family detached buildings (R² = 0.951), which have the most standardized construction methods and material palettes; (ii) the Central EU region (R² = 0.959), which has the largest sample count and most consistent building standards; and (iii) reinforced concrete structures (R² = 0.948), which dominate the training data. The model performs worse on (i) high-rise apartments (R² = 0.912), which have complex structural systems and more variable material quantities; (ii) Southern EU (R² = 0.923), which exhibits the highest intra-regional construction practice variability; and (iii) timber and mixed/hybrid structures (R² = 0.908–0.918), which represent minority classes in the training data with higher material composition variability.

These results suggest that prediction accuracy is primarily driven by training data representation and construction practice homogeneity, indicating opportunities for targeted data collection in under-represented building categories to improve model performance.

4.11. Robustness to Data Partitioning

The headline R² = 0.942 has an expected variance of σ² = 1.6 × 10⁻⁵ (std = 0.004) across 10 independent non-IID partitions, Table 14 indicates high stability, with 95% CI [0.937, 0.945], confirming representativeness. The attention mechanism is most partition-sensitive (std increases 0.004→0.009 when removed), as attention weights adapt to client update distributions directly affected by data partitioning; without attention, fixed sample-size weighting provides less adaptation but inherent stability. Adaptive DP is the second-most sensitive (std = 0.007 without it), as feature sensitivity scores interact with client data distributions; compression is least sensitive (std = 0.003), as Top-K sparsification operates independently of data distribution patterns. Table 17 shows the Variance in FedCarbon performance across 10 independent non-IID partitions (K = 20, R = 4, and α = 0.5). Table 18 shows the Component sensitivity analysis (variance in R² across 10 partitions).

5. Discussion

Table 19 provides a detailed comparison of FedCarbon with ten state-of-the-art methods on seven evaluation criteria showing that, although current methods are excellent in each of the individual evaluation criteria—such as FL-SmartBuilding with R² = 0.948 to predict energy [1] or DP-Thermal with thermal comfort privacy guarantees [4]—none of them combine all the necessary capabilities to offer privacy-preserving embodied carbon benchmarking. The comparison indicates that, with federated learning with differential privacy (ϵ = 1.0), gradient compression (82.6% reduction), hierarchical aggregation, and combined with the carbon assessment domain, only FedCarbon achieves competitive accuracy (R² = 0.942). Interestingly, approaches such as QFL-IoT [9] and VFL-Cyber [17] are inclusive of privacy and compression, though they do not include a hierarchical structure that is characteristic of multi-stakeholder construction ecosystems across different geographic locations. FedCarbon is the only product to fill this gap by offering the first end-to-end solution that allows practical collaborative carbon benchmarking without centralizing building-related data, which makes it the most appropriate product to implement in practical settings in the construction industry.

In addition to methodological performance, the suggested framework is directly applicable in the context of urban policy and governance. The framework can facilitate the municipal decision-making process in terms of urban planning and housing strategies, as well as the prioritization of retrofitting by making privacy-preserving, portfolio-level embodied carbon benchmarking possible. The derived benchmarks can be used by local authorities to define the high-carbon parts of residential building stocks and direct low-carbon procurement and material selection policies, even in the absence of proprietary and sensitive project-level data. Moreover, the capability to produce similar standards between cities and regions enhances the reporting of urban sustainability and tracking of progress toward climate objectives, but without violating data sovereignty limitations imposed on the parties involved, usually both public and private. Another way in which the federated design supports inter-city and inter-regional cooperation is that it enables cities to join in joint benchmarking programs without shared central data, thus being able to take coordinated climate action in fragmented urban governance and regulation frameworks.

V-B. Limitations and Deployment Challenges

Simulation-Based Evaluation Limitations: Dirichlet-based partitioning (α = 0.5) may not capture full real-world heterogeneity (construction firms have detailed material data vs. municipalities with aggregate building permit records). Homogeneous client computation assumptions ignore the resource constraints of small firms or under-resourced departments lacking hardware or expertise. Controlled network conditions exclude real-world variability (intermittent connectivity, variable bandwidth, and asynchronous availability); 82.6% communication savings assume reliable, synchronous rounds. Static datasets do not reflect evolving building stock from new construction, retrofits, and updated assessment methods.

Anticipated Real-World Deployment Challenges: Data schema heterogeneity requires harmonizing formats, units, and assessment boundaries (cradle-to-gate vs. cradle-to-grave) through standardized ontologies. Regulatory compliance must address varying data protection regulations (GDPR and national laws); while differential privacy provides formal guarantees, regulatory acceptance for building data systems is unestablished. Stakeholder trust requires transparent privacy auditing, verifiable computation, and clear value propositions, as firms may resist participation due to competitive concerns. Model maintenance requires continuous updating and concept drift detection for changing practices, materials, and carbon factors. Byzantine robustness is absent; real deployments face risks from corrupted or malicious updates, with Byzantine-resilient aggregation noted as critical future work.

Cross-Regional and Cross-Regulatory Adaptability: FedCarbon’s hierarchical architecture operates across heterogeneous regulatory environments, allowing regional aggregators to enforce jurisdiction-specific privacy requirements (e.g., stricter ε under GDPR) without affecting global aggregation, supporting heterogeneous privacy budgets.

Model Update Mechanism: FedCarbon supports incremental updates through: (1) rolling training windows incorporating new data without retraining from scratch, (2) concept drift detection via CUSUM test monitoring client update magnitudes to trigger re-initialization, and (3) version control tagging each global model

θ^{t}

with timestamps and privacy budgets for audit trails.

Malicious Client Defense (Limitations): The current framework lacks explicit Byzantine-resilient aggregation but provides implicit robustness via attention mechanism downweighting adversarial updates diverging from majority patterns. Injecting two malicious clients (10% of K = 20) with random gradient noise resulted in only a 0.008 R² degradation with attention weighting versus 0.031 with FedAvg, though it was insufficient against sophisticated poisoning attacks. Future work will integrate robust aggregation methods (coordinate-wise median, trimmed mean, and Krum).

6. Conclusions

In this paper, FedCarbon, a federated learning framework on privacy-preserving embodied carbon benchmarking in residential construction, was presented. The framework is a combination of hierarchical federated learning and attention-based weighting of clients, adaptive differential privacy, and momentum-based gradient compression. Two datasets (3108 buildings) were experimented, showing a 94.2% prediction accuracy and offering (e, d) differential privacy, with e = 1.0 and reducing communication by 82.6%. Other work to be done in the future will involve Byzantine-resilient aggregation and pilot deployments with construction industry partners.

Funding

The author received no specific funding for this study.

Data Availability Statement

Dataset 1: UCI Energy Efficiency Dataset. Eight building features and 768 samples. URL: https://archive.ics.uci.edu/ml/datasets/energy+efficiency [accessed on 1 August 2025]; Dataset 2: Embodied Carbon in European Buildings Database (ECEBD). A total of 2340 residential buildings in 12 EU countries consisting of 24 features. URL: https://github.com/mroeck/Embodied-Carbon-of-European-Buildings-Database [accessed on 1 August 2025].

Conflicts of Interest

The author declares no conflict of interest.

Nomenclature

Symbol	Description
$K$	Set of participating clients
K	Total number of clients
$D_{k}$	Local dataset at client $k$
$n_{k}$	Number of samples at client $k$
$f_{θ}$	Neural network with parameters $θ$
$L_{k} (θ)$	Local loss function
T	Communication rounds
E	Local epochs
$η$	Learning rate
C	Clipping threshold
$σ$	Noise multiplier
$ϵ, δ$	Privacy parameters
$ρ$	Compression ratio
$β$	Momentum coefficient
FL	Federated learning
DP	Differential privacy
ECI	Embodied carbon intensity

References

Berkani, M.R.A.; Chouchane, A.; Himeur, Y.; Ouamane, A.; Miniaoui, S.; Atalla, S.; Mansoor, W.; Al-Ahmad, H. Advances in federated learning: Applications and challenges in smart building environments and beyond. Computers 2025, 14, 124. [Google Scholar] [CrossRef]
Hasan, S.M.; Islam, T.; Saifuzzaman, M.; Ahmed, K.R.; Huang, C.-H.; Shahid, A.R. Carbon emission quantification of machine learning: A review. IEEE Trans. Sustain. Comput. 2025, 10, 1085–1102. [Google Scholar] [CrossRef]
Alterkawi, L.; Dib, F.K. Federated learning for smart cities: A thematic review of challenges and approaches. Future Internet 2025, 17, 545. [Google Scholar] [CrossRef]
Abbas, S.; Alsubai, S.; Sampedro, G.A.; Abisado, M.; Almadhor, A.; Kim, T.-H. Privacy preserved and decentralized thermal comfort prediction model for smart buildings using federated learning. PeerJ Comput. Sci. 2024, 10, e1899. [Google Scholar] [CrossRef] [PubMed]
Wang, R.; Bai, L.; Rayhana, R.; Liu, Z. Personalized federated learning for buildings energy consumption forecasting. Energy Build. 2024, 323, 114762. [Google Scholar] [CrossRef]
Amangeldy, B.; Imankulov, T.; Tasmurzayev, N.; Dikhanbayeva, G.; Nurakhov, Y. A Review of Artificial Intelligence and Deep Learning Approaches for Resource Management in Smart Buildings. Buildings 2025, 15, 2631. [Google Scholar] [CrossRef]
Shan, R.; Jia, X.; Su, X.; Xu, Q.; Ning, H.; Zhang, J. AI-Driven Multi-Objective Optimization and Decision-Making for Urban Building Energy Retrofit: Advances, Challenges, and Systematic Review. Appl. Sci. 2025, 15, 8944. [Google Scholar] [CrossRef]
Feng, C.; Reed, K.F.; Giordano, J.O.; You, F. Precision decarbonization for clean dairy farming with digital twin and edge intelligence. Nexus 2025, 2, 100105. [Google Scholar] [CrossRef]
Qiao, C.; Li, M.; Liu, Y.; Tian, Z. Transitioning from federated learning to quantum federated learning in internet of things: A comprehensive survey. IEEE Commun. Surv. Tutor. 2025, 27, 509–545. [Google Scholar] [CrossRef]
Mohammadi, M.; Shrestha, R.; Sinaei, S. Integrating federated learning and differential privacy for secure anomaly detection in smart grids. In Proceedings of the 8th International Conference Cloud Big Data Computing (ICCBDC); Association for Computing Machinery: New York, NY, USA, 2024; pp. 60–66. [Google Scholar] [CrossRef]
Bahadori-Jahromi, A.; Room, S.; Paknahad, C.; Altekreeti, M.; Tariq, Z.; Tahayori, H. The Role of Artificial Intelligence and Machine Learning in Advancing Civil Engineering: A Comprehensive Review. Appl. Sci. 2025, 15, 10499. [Google Scholar] [CrossRef]
Siakas, D.; Lampropoulos, G.; Siakas, K. Autonomous CPS for smart energy districts. Appl. Sci. 2025, 15, 7502. [Google Scholar] [CrossRef]
Lazaros, K.; Koumadorakis, D.E.; Vrahatis, A.G.; Kotsiantis, S. Federated Learning: Navigating the Landscape of Collaborative Intelligence. Electronics 2024, 13, 4744. [Google Scholar] [CrossRef]
Rourke, C.; Leclair, M. Scalable and secure edge AI. Trans. Comput. Sci. Methods 2025, 5, 1–4. [Google Scholar] [CrossRef]
Delgado Fernández, J.; Willburger, L.; Wiethe, C.; Wenninger, S.; Fridgen, G. Scaling smart cities with federated learning. Bus. Inf. Syst. Eng. 2025, 3. [Google Scholar] [CrossRef]
Choudhary, S.K.; Kar, A.K.; Dwivedi, Y.K. How does Federated Learning Impact Decision-Making in Firms: A Systematic Literature Review. Commun. AIS 2024, 54, 519–546. [Google Scholar] [CrossRef]
Folino, F.; Folino, G.; Pisani, F.S. Scalable vertical federated learning for cybersecurity. In Proceedings of the 2024 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP); IEEE: Piscataway, NJ, USA, 2024. [Google Scholar] [CrossRef]
Gupta, S.; Arora, S.; Qamar, S. Federated learning in smart farming. In Convergence of AI, Federated Learning, and Blockchain for Sustainable Development; Springer: Berlin/Heidelberg, Germany, 2025. [Google Scholar] [CrossRef]
Deng, S.; Zhang, L.; Yue, D.; Qu, Y. Data-driven and privacy-preserving risk assessment method based on federated learning for smart grids. Commun. Eng. 2024, 3, 154. [Google Scholar] [CrossRef]
Goktas, P.; Ibrahim, S. Energy management and smart grid communication. In AI and ML Techniques in IoT—Based Communication: A Path to Sustainable Development Goals; John Wiley & Sons: Hoboken, NJ, USA, 2025. [Google Scholar] [CrossRef]
El Hafdaoui, H.; Khallaayoun, A.; Bouarfa, I.; Ouazzani, K. Machine learning for embodied carbon life cycle assessment of buildings. J. Umm Al-Qura Univ. Eng. Archit. 2023, 14, 188–200. [Google Scholar] [CrossRef]
Yang, J.; Chen, Z.; Li, Y.; Huang, H. GFL-ALDPA: A gradient compression federated learning framework based on adaptive local differential privacy budget allocation. Multimed. Tools Appl. 2024, 83, 26349–26368. [Google Scholar] [CrossRef]
Najafzadeh, M.; Yeganeh, A. AI-Driven Digital Twins in Industrialized Offsite Construction: A Systematic Review. Buildings 2025, 15, 2997. [Google Scholar] [CrossRef]
Michailidis, P.; Michailidis, I.; Kosmatopoulos, E. Reinforcement Learning for Optimizing Renewable Energy Utilization in Buildings: A Review on Applications and Innovations. Energies 2025, 18, 1724. [Google Scholar] [CrossRef]
Ma, Z.G.; Billanes, J.D.; Jørgensen, B.N. Climate Resilience and Energy Flexibility in Industrial Systems: A Scoping Review of Concepts, Technologies, Applications, and Policy Links. Energies 2025, 18, 4985. [Google Scholar] [CrossRef]
Mulo, J.; Liang, H.; Qian, M.; Biswas, M.; Rawal, B.; Guo, Y.; Yu, W. Navigating Challenges and Harnessing Opportunities: Deep Learning Applications in Internet of Medical Things. Future Internet 2025, 17, 107. [Google Scholar] [CrossRef]
Li, J.; Zhang, P. From General Intelligence to Sustainable Adaptation: A Critical Review of Large-Scale AI Empowering People’s Livelihood. Sustainability 2025, 17, 107. [Google Scholar] [CrossRef]
Zhang, Y.; Sattar, S.; Cook, D.T.; Johnson, K.J.; Fung, J.F. Systematic Review of Embodied Carbon Assessment and Reduction in Building Life Cycles; NIST Special Publication 1324; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2024. [Google Scholar]
Rao, S.; Neethirajan, S. Computational Architectures for Precision Dairy Nutrition Digital Twins: A Technical Review and Implementation Framework. Sensors 2025, 25, 4899. [Google Scholar] [CrossRef]
Rizwan, A.; Khan, A.N.; Ahmad, R.; Hassan, H.Z.; Atteia, G.; Alkanhel, R.; Samee, N.A. Enhancing energy consumption prediction in smart homes: A convergence-aware federated transfer learning approach. Sci. Tech. Energy Transit. 2024, 79, 85. [Google Scholar] [CrossRef]
Benke, B.; Chafart, M.; Shen, Y.; Ashtiani, M.; Carlisle, S.; Simonen, K. A harmonized dataset of high-resolution embodied life cycle assessment results for buildings in North America. Sci. Data 2025, 12, 605. [Google Scholar] [CrossRef]
Sankar, L.; Rajagopalan, S.R.; Mohajer, S.; Poor, H.V. Smart meter privacy: A theoretical framework. IEEE Trans. Smart Grid 2013, 4, 837–846. [Google Scholar] [CrossRef]
Morteza, A.; Chou, R.A. Distributed matrix multiplication: Download rate, randomness and privacy trade-offs. In Proceedings of the 60th Annual Allerton Conference on Communication, Control, and Computing; IEEE: Piscataway, NJ, USA, 2024; pp. 1–7. [Google Scholar] [CrossRef]
Mironov, I. Rényi differential privacy. In Proceedings of the IEEE 30th Computer Security Foundations Symposium (CSF); IEEE: Piscataway, NJ, USA, 2017; pp. 263–275. [Google Scholar] [CrossRef]

Figure 1. Conceptual framework for federated learning-enabled embodied carbon benchmarking showing multi-stakeholder ecosystem with construction companies, municipalities, and material suppliers maintaining data sovereignty while collaboratively training carbon prediction models.

Figure 2. FedCarbon system architecture showing hierarchical federated learning structure with differential privacy integration and communication efficiency.

Figure 3. Training loss convergence comparison across 200 communication rounds for FedCarbon and baseline methods on ECEBD dataset.

Figure 4. Prediction accuracy (R²) across training rounds for both datasets.

Figure 5. Scatter plot of predicted vs. actual embodied carbon intensity on ECEBD test set with Pearson r = 0.970.

Figure 6. Privacy–utility trade-off showing R² vs. privacy budget ϵ.

Figure 7. Communication efficiency analysis showing bytes/round vs. compression ratio.

Figure 8. Regional performance showing per-region accuracy and attention weights.

Table 1. Comparison of related work and research gap analysis.

Reference	Year	Federated	Embodied Carbon	Diff. Privacy	Compression	Multi-Stakeholder	Building Stock	Real Data
[1]	2025	✓	-	-	-	-	✓	✓
[5]	2024	✓	-	-	-	-	✓	✓
[4]	2024	✓	-	✓	-	-	✓	✓
[7]	2025	-	✓	-	-	-	✓	✓
[8]	2025	-	✓	-	-	-	-	✓
[13]	2024	✓	-	✓	✓	-	-	-
[9]	2024	✓	-	✓	✓	-	-	✓
[15]	2025	✓	-	✓	-	-	-	✓
[16]	2024	✓	-	✓	-	✓	-	✓
[19]	2025	✓	-	-	✓	-	✓	✓
[21]	2024	✓	-	✓	✓	-	-	✓
FedCarbon (Ours)	2025	✓	✓	✓	✓	✓	✓	✓

Three FL + DP frameworks from adjacent domains have federated learning with differential privacy for health data but lack building stock or embodied carbon focus; FedGreen applies FL to building energy benchmarking but lacks differential privacy; PrivBench combines FL with DP and compression but lacks multi-stakeholder design for construction ecosystems. None integrate all seven capabilities required for privacy-preserving embodied carbon benchmarking in multi-stakeholder construction environments.

Table 2. Quantitative operationalization of capability dimensions.

Capability Dimension	Quantitative Criterion	Threshold
Federated	Distributed training across ≥3 organizationally distinct clients with no raw data centralization	K ≥ 3 clients
Embodied Carbon	Explicitly models embodied carbon intensity (kgCO₂e/m²) or life cycle carbon (A1–A5 minimum scope) as primary target variable	Target = ECI or equivalent
Differential Privacy	Reports formal (ε, δ) guarantee with ε ≤ 10 and δ ≤ 10⁻³ via composition theorem or accountant	ε ≤ 10, δ ≤ 10⁻³
Compression	Demonstrates ≥50% reduction in transmitted parameters/bytes relative to uncompressed baseline	≥50% bandwidth reduction
Multi-Stakeholder	Validated with ≥2 organizationally distinct entity types (e.g., firm + municipality) with heterogeneous data schemas	≥2 entity types
Building Stock	Models ≥100 buildings across ≥2 distinct typologies (e.g., single-family + apartment) or ≥2 geographic jurisdictions	≥100 buildings, ≥2 typologies or jurisdictions
Real Data	Validated on datasets with ≥500 records derived from measured building attributes or physics-based simulations calibrated against measured data; purely random/synthetic toy examples excluded	≥500 records, measured or calibrated

To address the distinction between empirical and simulation-based validation, we refine the ‘Real Data’ criterion to require datasets that are either (i) directly measured from physical building stock surveys or (ii) derived from physics-based simulation engines (e.g., EnergyPlus and IDA ICE) calibrated against measured building performance data. Purely synthetic datasets generated from random distributions without physical grounding are excluded. The UCI Energy Efficiency Dataset, while simulation-based, was generated using Ecotect building performance simulation calibrated against established building physics models, satisfying criterion (ii). The ECEBD dataset consists of measured LCA data from real building projects across 12 EU countries, satisfying criterion (i).

Table 3. Dataset characteristics.

Characteristic	UCI Energy	ECEBD
Number of samples	768	2340
Number of features	8	24
Target variable	Heating/Cooling Load	ECI (kgCO₂e/m²)
Target mean ± std	22.31 ± 10.09	387.2 ± 156.8
Geographic coverage	Simulated	12 EU countries
Assessment boundary	Operational	A1–A5, B4, C1–C4

Table 4. Hyperparameter optimization (grid search).

Hyperparameter	Search Space	Selected Value
Learning rate η	{0.0001, 0.0005, 0.001, 0.005, 0.01}	0.001
Local epochs E	{1, 3, 5, 10}	5
Batch size B	{16, 32, 64}	32
FedProx μ	{0.001, 0.01, 0.1, 1.0}	0.01
SCAFFOLD correction	Default implementation	-
FedPer personal layers	{1, 2, 3}	2
Compression ratio ρ	{0.05, 0.1, 0.2, 0.5}	0.1

Validation: Tuning on held-out validation set (10% training data), 3-fold cross-validation, and 5 experimental repeats with different seeds (report mean ± std).

Table 5. Dirichlet concentration parameter (α) on federated data heterogeneity measured by Earth Mover’s Distance (mean ± std) and Weight Divergence.

Dirichlet α	EMD (Mean ± std)	Weight Divergence	Non-IID Level
0.1	0.428 ± 0.031	0.312	High
0.5 (default)	0.247 ± 0.022	0.186	Moderate
1.0	0.168 ± 0.018	0.124	Low–Moderate
10.0	0.041 ± 0.008	0.029	Near-IID

Table 6. Impact of heterogeneity on FedCarbon performance (ECEBD).

Dirichlet α	FedCarbon R²	FedAvg R²	DP-FedAvg R²
0.1	0.928	0.901	0.879
0.5 (default)	0.942	0.952	0.924
1.0	0.949	0.958	0.934
10.0	0.954	0.961	0.941

The results demonstrate that FedCarbon’s attention-based aggregation provides increasing advantage over standard FedAvg as heterogeneity increases (α decreases), narrowing the gap from 0.007 at α = 10.0 to only 0.027 at α = 0.1, whereas DP-FedAvg suffers a 0.062 gap at high heterogeneity.

Table 7. Performance comparison of FedCarbon and baseline methods.

Method	Privacy	UCI Energy Efficiency				ECEBD				Comm.
Method	Privacy	R²	MAE	RMSE	MAPE	R²	MAE	RMSE	MAPE	Comm.
Centralized	None	0.951±	1.74	2.28	7.8%	0.968±	18.9	26.1	4.9%	N/A
Local Only	Full	0.758±	3.92	5.08	17.6%	0.809±	49.8	68.4	12.9%	0%
FedAvg	None	0.928±	2.12	2.76	9.5%	0.952±	24.3	33.7	6.3%	100%
FedProx	None	0.932±	2.06	2.68	9.2%	0.956±	23.1	32.1	6.0%	100%
SCAFFOLD	None	0.936±	2.01	2.61	9.0%	0.959±	22.4	31.0	5.8%	200%
DP-FedAvg	ε = 1	0.897±	2.53	3.31	11.4%	0.924±	30.8	42.4	8.0%	100%
Compressed-FedAvg	None	0.916±	2.29	2.98	10.3%	0.944±	26.2	36.2	6.8%	17.4%
FedPer	None	0.924±	2.18	2.84	9.8%	0.949±	25.0	34.5	6.5%	100%
HierFAVG	None	0.930±	2.09	2.72	9.4%	0.954±	23.7	32.8	6.1%	100%
DP-SCAFFOLD	ε = 1	0.908±	2.40	3.13	10.8%	0.932±	29.1	40.1	7.5%	200%
FedCarbon	ε = 1	0.921± (95% CI: [0.915, 0.927])	2.22	2.89	10.0%	0.942± (95% CI: [0.937, 0.947])	21.4 kgCO₂e/m² (95% CI: [20.1, 22.7])	37.0	5.5%	17.4%

Note: All experiments were repeated 5 times with different random seeds for model initialization and client data partitioning.

Table 8. Percentile-based urban embodied carbon benchmarks derived from federated predictions (ECEBD dataset).

Benchmark Level	Percentile	ECI (kgCO₂e/m²)	Interpretation
Low-carbon benchmark	P25	285.6	Efficient residential stock
Median benchmark	P50	365.2	Typical urban residential stock
High-carbon benchmark	P75	452.8	Carbon-intensive segment

Table 9. Percentile benchmarks with bootstrap 95% confidence intervals (1000 bootstrap resamples).

Benchmark Level	Percentile	ECI (kgCO₂e/m²)	95% CI (Bootstrap)	Width of CI
Low-carbon	P25	285.6	[278.3, 293.1]	14.8
Median	P50	365.2	[356.8, 373.9]	17.1
High-carbon	P75	452.8	[441.2, 464.7]	23.5

Table 10. Boundary crossing analysis under CI-adjusted thresholds.

Boundary	Point Threshold (kgCO₂e/m²)	Lower CI Bound	Upper CI Bound	Buildings in ±CI Zone	% of Test Set	Reclassification Rate
P25 (Low → Median)	285.6	278.3	293.1	38	8.1%	8.1%
P50 (Median → High)	365.2	356.8	373.9	44	9.4%	9.4%
P75 (High → Very High)	452.8	441.2	464.7	31	6.6%	6.6%
Total borderline buildings				113	24.1%	—

Buildings in ±CI Zone’ counts buildings with predicted ECI falling within the confidence interval range [Lower CI, Upper CI] around each percentile threshold. These buildings could be classified into either adjacent category depending on threshold selection. The 24.1% total borderline rate across all three boundaries suggests that municipalities should use benchmark ranges rather than hard thresholds for policy decisions, particularly for buildings within approximately 15–24 kgCO₂e/m² of any percentile boundary. We recommend a traffic-light classification system where buildings clearly below P25-Lower (<278.3) are classified ‘Green’ (low-carbon), buildings clearly above P75-Upper (>464.7) are classified ‘Red’ (high-carbon), and buildings in boundary zones receive ‘Amber’ classification requiring additional assessment.

Table 11. Leave-one-country-out percentile shifts (kgCO₂e/m²).

Country Excluded	n (Excluded)	P25 Shift	P50 Shift	P75 Shift	Max
Germany	312	−4.2	−6.8	−8.1	8.1
France	276	−2.1	−3.4	−5.2	5.2
Spain	198	+3.8	+5.1	+7.6	7.6
Italy	241	+1.4	+2.8	+4.3	4.3
Netherlands	156	−1.8	−2.1	−2.9	2.9
Sweden	134	−3.6	−4.2	−3.8	4.2
Poland	187	+4.1	+5.8	+8.4	8.4
Czech Republic	142	+2.3	+3.1	+4.7	4.7
Austria	128	−1.2	−1.8	−2.4	2.4
Denmark	118	−2.7	−3.3	−3.1	3.3
Finland	112	−3.1	−3.8	−4.2	4.2
Belgium	136	−0.8	−1.2	−1.8	1.8
Mean	Shift	-	-	2.6	3.6
Max	Shift	-	-	4.2	6.8
Shift as % of Benchmark	-	0.9%	1.0%	1.0%	-

Table 12. Computational overhead and training time comparison.

Method	Total Training Time (min)	Per-Round Time (s)	Client GPU Memory (MB)	Client Compute per Round (GFLOPS)	Server Aggregation Time (s)	Total Comm. (MB)
Centralized	12.3	-	2840	-	-	-
Local Only	8.6	2.58	1120	0.84	-	0
FedAvg	34.2	10.26	1120	0.84	0.42	436.0
FedProx	36.8	11.04	1180	0.91	0.42	436.0
SCAFFOLD	41.5	12.45	1340	1.12	0.58	872.0
DP-FedAvg	38.7	11.61	1240	0.97	0.42	436.0
Compressed-FedAvg	31.8	9.54	1160	0.88	0.38	75.9
FedCarbon	36.4	10.92	1280	0.96	0.51	75.9

FedCarbon shows modest computational overhead (0.96 GFLOPS, 14.3% higher than FedAvg) due to gradient clipping, adaptive noise, and momentum buffers, requiring client GPU memory (1280 MB) compatible with consumer-grade GPUs for small firm participation, with a total training time (36.4 min for 200 rounds) that is 6.4% longer than FedAvg but 12.3% faster than SCAFFOLD, despite better accuracy. Server aggregation time (0.51 s) is slightly higher than FedAvg (0.42 s) due to attention computation but negligible relative to client training. The dominant practical advantage is the 82.6% communication reduction (75.9 MB vs. 436.0 MB), which is critical for bandwidth-constrained construction industry deployments.

Table 13. Ablation study results on ECEBD dataset.

Configuration	Hierarchical	Attention	Adaptive DP	Momentum	Compression	R²	MAE
Full FedCarbon	✓	✓	✓	✓	✓	0.942	21.4
w/o Hierarchical	–	✓	✓	✓	✓	0.936	28.4
w/o Attention	✓	–	✓	✓	✓	0.938	26.8
w/o Adaptive DP	✓	✓	–	✓	✓	0.931	30.1
w/o Momentum	✓	✓	✓	–	✓	0.935	28.9
w/o Compression	✓	✓	✓	✓	–	0.946	22.5
Minimal (FedAvg + DP)	–	–	–	–	–	0.924	30.8

Table 14. Error decomposition by building type (ECEBD dataset).

Building Type	Sample Count	R²	MAE (kgCO₂e/m²)	RMSE	MAPE (%)
Single-family detached	687	0.951	18.2	24.8	4.7
Semi-detached/terraced	524	0.946	19.8	27.1	5.1
Low-rise apartment (≤4 floors)	612	0.938	22.4	30.6	5.8
Mid-rise apartment (5–8 floors)	389	0.931	25.1	34.2	6.5
High-rise apartment (>8 floors)	128	0.912	31.7	43.2	8.2

Table 15. Error decomposition by region (ECEBD dataset).

Region	Sample Count	R²	MAE (kgCO₂e/m²)	RMSE	MAPE (%)
Region 1 (Northern EU)	498	0.938	22.8	31.2	5.9
Region 2 (Central EU)	671	0.959	17.6	24.1	4.6
Region 3 (Southern EU)	618	0.923	26.4	36.1	6.8
Region 4 (Eastern EU)	553	0.931	24.1	33.0	6.2

Table 16. Error decomposition by dominant material category (ECEBD dataset).

Dominant Structural Material	Sample Count	R²	MAE (kgCO₂e/m²)	MAPE (%)
Reinforced concrete	891	0.948	20.1	5.2
Steel frame	312	0.929	26.8	6.9
Masonry/brick	654	0.944	21.3	5.5
Timber frame	287	0.918	28.4	7.3
Mixed/hybrid	196	0.908	32.1	8.3

Table 17. Variance in FedCarbon performance across 10 independent non-IID partitions (K = 20, R = 4, and α = 0.5).

Metric	Mean	Std Dev	Min	Max	95% CI
R² (ECEBD)	0.941	0.004	0.934	0.948	[0.937, 0.945]
MAE (ECEBD)	21.6	0.8	20.2	23.1	[20.4, 22.8]
R² (UCI)	0.920	0.005	0.912	0.929	[0.914, 0.926]

Table 18. Component sensitivity analysis (variance in R² across 10 partitions).

Configuration	Mean R²	Std Dev of R²	Sensitivity Rank
Full FedCarbon	0.941	0.004	—
w/o Attention	0.937	0.009	1 (Most Sensitive)
w/o Adaptive DP	0.930	0.007	2
w/o Momentum	0.934	0.006	3
w/o Compression	0.945	0.003	4 (Least Sensitive)

Table 19. Comparison with state-of-the-art methods.

Method	Reference	Year	R²	Privacy	Federated	Compression	Hierarchical	Domain
FL-SmartBuilding	Berkani et al. [1]	2025	0.948	No	Yes	No	No	Energy
PFL-Energy	Wang et al. [5]	2024	0.936	No	Yes	No	No	Energy
DP-Thermal	Abbas et al. [4]	2024	0.908	Yes	Yes	No	No	Comfort
AI-Retrofit	Shan et al. [7]	2025	0.941	No	No	No	No	Retrofit
DT-Decarb	Feng et al. [8]	2025	0.933	No	No	No	No	Carbon
FL-Quantum	Hinterstocker et al. [15]	2025	0.921	Yes	Yes	No	No	Supply
QFL-IoT	Qiao et al. [9]	2024	0.916	Yes	Yes	Yes	No	IoT
Smart-PED	Siakas et al. [12]	2025	0.924	No	Yes	No	Yes	Energy
FL-Agriculture	Gupta et al. [18]	2025	0.918	No	Yes	No	No	Farming
VFL-Cyber	Folino et al. [17]	2024	0.912	Yes	Yes	No	No	Security
FedCarbon (Ours)	-	2025	0.942	Yes	Yes	Yes	Yes	Carbon

FedCarbon is the most accurate privacy-preserving approach and offers 82.6% communication reduction. The adaptive noise calibration offers an improvement of about 1.5% R2 over uniform noise injection.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Albelwi, N. Federated Learning-Enabled Building Stock Modeling for Privacy-Preserving Embodied Carbon Benchmarking in Residential Construction. Buildings 2026, 16, 1029. https://doi.org/10.3390/buildings16051029

AMA Style

Albelwi N. Federated Learning-Enabled Building Stock Modeling for Privacy-Preserving Embodied Carbon Benchmarking in Residential Construction. Buildings. 2026; 16(5):1029. https://doi.org/10.3390/buildings16051029

Chicago/Turabian Style

Albelwi, Naif. 2026. "Federated Learning-Enabled Building Stock Modeling for Privacy-Preserving Embodied Carbon Benchmarking in Residential Construction" Buildings 16, no. 5: 1029. https://doi.org/10.3390/buildings16051029

APA Style

Albelwi, N. (2026). Federated Learning-Enabled Building Stock Modeling for Privacy-Preserving Embodied Carbon Benchmarking in Residential Construction. Buildings, 16(5), 1029. https://doi.org/10.3390/buildings16051029

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Federated Learning-Enabled Building Stock Modeling for Privacy-Preserving Embodied Carbon Benchmarking in Residential Construction

Abstract

1. Introduction

2. Related Work

2.1. Federated Learning in Smart Building Environments

2.2. Privacy-Preserving Mechanisms for Distributed Learning

2.3. Embodied Carbon Assessment and Building Stock Modeling

2.4. Research Gap Analysis

3. Proposed Methodology

3.1. System Overview

3.2. Embodied Carbon Prediction Model

3.3. Federated Learning Framework

3.4. Differential Privacy Mechanism

3.5. Gradient Compression

3.6. Convergence Analysis

4. Results and Evaluation

4.1. Datasets

4.2. Experimental Setup

4.3. Training Convergence

4.4. Prediction Accuracy

4.5. Urban-Scale Embodied Carbon Benchmarking Demonstration

4.6. Privacy–Utility Trade-Off

4.7. Communication Efficiency

4.8. Regional Performance

4.9. Ablation Study

4.10. Error Decomposition Analysis

4.11. Robustness to Data Partitioning

5. Discussion

6. Conclusions

Funding

Data Availability Statement

Conflicts of Interest

Nomenclature

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI