Enhancing Embodied Carbon Calculation in Buildings: A Retrieval-Augmented Generation Approach with Large Language Models

Zou, Yushi; Zheng, Rengeng; Xia, Jun

doi:10.3390/buildings15193449


by Yushi Zou, Rengeng Zheng and Jun Xia *

Department of Civil Engineering, Design School, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China

* Author to whom correspondence should be addressed.

Buildings 2025, 15(19), 3449; https://doi.org/10.3390/buildings15193449

Submission received: 25 August 2025 / Revised: 21 September 2025 / Accepted: 22 September 2025 / Published: 24 September 2025

(This article belongs to the Section Building Energy, Physics, Environment, and Systems)


Abstract

Accurate calculation of embodied carbon emissions (ECE) in buildings is crucial to achieving global carbon neutrality. However, fragmented data, inconsistent regional standards, and low computational efficiency have long hindered existing methods. This study integrates large language models (LLMs) with retrieval-augmented generation (RAG) technology to establish an intelligent accounting paradigm for embodied carbon in buildings. Through a systematic evaluation of three base models (Kimi, Doubao, and DeepSeek-R1) in a five-level progressive input scenario, the study quantifies the “information sensitivity” patterns of LLMs. To address the hallucination errors of general models in professional scenarios, a three-stage closed-loop architecture of “knowledge retrieval–calculation embedding–trustworthy generation” is proposed. By dynamically invoking domain knowledge bases and embedded computing modules, zero-error verification against the benchmark data is achieved. The core contributions are as follows: (1) Base LLMs are shown to have application potential in calculating the embodied carbon emissions of buildings, although the reliability of their results is limited. (2) The influence of data elements on calculation accuracy is revealed. (3) An application path for integrating RAG with LLMs is established, and the results show that RAG can improve the performance of LLMs in calculating the embodied carbon emissions of buildings by approximately 25%. (4) The significant efficiency improvement of RAG technology is verified. (5) A supporting theoretical and application system is established.

Keywords: embodied carbon emissions; large language models; retrieval-augmented generation; life cycle assessment; building sustainability; carbon neutrality

1. Introduction

In the context of escalating global climate change, the construction industry, as one of the significant sources of carbon emissions, has become a core focus of sustainable development. Although carbon emissions during the operation phase of buildings have been extensively studied, embodied carbon emissions, which cover the life cycle process from material production and transportation to construction and demolition, remain a key area that has not been fully explored [1,2]. Accurately quantifying the embodied carbon emissions of buildings is of great significance for achieving the global carbon neutrality goal [3]. Meanwhile, large language models (LLMs) have triggered changes in multiple fields thanks to their robust natural language understanding, multimodal data integration, and complex task processing capabilities [4]. Their applications cover scenarios such as industrial code generation, medical diagnosis, financial risk assessment, and scientific research, demonstrating significant advantages in improving efficiency and accuracy in various professional fields [4,5,6,7]. LLMs have also seen preliminary use in the construction industry for energy audits and normative question answering [8]. However, the potential of LLMs in solving the complex problem of calculating embodied carbon emissions of buildings has not been fully exploited.

The calculation of embodied carbon emissions in buildings faces numerous challenges. At the methodological level, the Life Cycle Assessment (LCA) approach is comprehensive but requires extensive data, while the process energy consumption estimation method and the list-based statistical method fall short in accuracy because of the complexity of construction [1]. At the technical level, Building Information Modeling (BIM) and machine learning show promise, but their application scope remains limited by weak real-time monitoring, limited dynamic simulation, and interoperability problems [9]. Additionally, differences in standards between regions further hinder the comparability of calculation results and the effectiveness of policy implementation [10]. These limitations highlight the urgent need for solutions that can simplify data processing, improve accuracy, and adapt to changing scenarios. Large language models offer transformative opportunities to address these challenges. Their advantages in automated data processing, semantic parsing, and multimodal fusion are expected to enable efficient calculation of embodied carbon emissions in buildings [8]. However, the lack of domain knowledge limits the direct application of general LLMs: they may produce “hallucination” outputs, such as mismatched material and emission factor pairs, which introduce errors into the calculation. Retrieval-augmented generation (RAG) technology, which enables LLMs to dynamically call external knowledge bases without modifying model parameters, is a key solution to these issues and significantly improves calculation accuracy and scene adaptability [11].

Therefore, this study combines LLMs with RAG technology. A workflow is established to construct a retrieval-enhanced large model for calculating the embodied carbon emissions of buildings. First, the performance of the base large models (Kimi, Doubao, DeepSeek-R1) is systematically evaluated under different information input conditions to quantify the impact of data integrity on calculation accuracy; subsequently, a RAG-enhanced framework integrating a domain knowledge base and a calculation module is constructed to overcome the limitations of general models. The research objectives are as follows: (1) Systematically evaluate the performance boundaries and applicable conditions of LLMs in calculating building embodied carbon emissions. (2) Establish a theoretical framework for optimizing input strategies for building embodied carbon calculation. (3) Design and verify a domain-enhanced retrieval large model based on RAG technology. (4) Propose a tiered intelligent technical path for building carbon accounting. The innovations of this research are reflected in four aspects: (1) RAG technology is applied to calculating embodied carbon emissions in buildings, providing pioneering ideas for the intelligent transformation of this field. (2) The dynamic relationship between data integrity and computational accuracy is revealed. (3) A domain-specific RAG framework is developed, achieving zero-error calculation on the benchmark case through the collaborative work of knowledge retrieval and calculation code. (4) Operational technical paths are provided for different precision requirements, together with practical tools for accelerating the industry’s transition to carbon neutrality, and the theoretical understanding of the application of artificial intelligence in the sustainable building field is advanced.

2. Literature Review

2.1. Large Language Models

With the rapid development of artificial intelligence technology, large language models, as one of the core technologies in natural language processing, have received widespread attention. In recent years, LLMs such as the GPT series, BERT, and LLaMA, with their robust natural language understanding and generation capabilities, have triggered revolutionary changes in artificial intelligence [12,13]. These models are based on the transformer architecture and can integrate various data modalities such as text, images, and sounds [14]. Through pre-training on massive text data, they can complete complex tasks such as machine translation, text summarization, and code generation [15]. LLMs offer several advantages, including speed, accuracy, creativity, efficiency, and versatility, so that they can be applied to almost every industry [16].

Large language models face numerous challenges and limitations during their development and application. From a technical perspective, the first issue is accuracy and reliability. LLMs may generate content that does not match the facts (i.e., “hallucinations”), which poses particularly significant risks in critical fields. For example, in medical scenarios, incorrect advice on disease diagnosis or treatment plans may directly threaten patients’ lives [7]; in industrial knowledge question answering, incorrect information may lead to deviations in production processes, affecting product quality or production safety [5]. Second, insufficient generalization ability is a significant drawback. A model performs well in common scenarios covered by the training data, but its performance declines significantly on tasks outside the training data distribution. For instance, models often struggle to provide accurate results for rare industrial fault diagnoses or exceptional medical cases because such data make up an extremely small proportion of the training set [16]. Third, regarding ethical and social issues, data privacy is the primary concern. The training of LLMs relies on massive data, which may contain sensitive content such as personal privacy information and trade secrets, posing a risk of leakage. For example, electronic health records (EHRs) in the medical field contain detailed patient conditions and personal information, and mishandling them during training may violate privacy protection regulations [7]; public feedback data in the government sector also involve much personal information, and a leak would undermine public trust [17]. Finally, from a professional perspective, the characteristics and requirements of different industries make it difficult for general large language models to meet domain needs, so each industry requires specialized LLMs.

2.2. Embodied Carbon Emissions Estimation

As one of the significant sectors in energy consumption and carbon emissions, the construction industry has drawn increasing attention to its embodied carbon [18]. Embodied carbon in construction refers to the carbon emissions generated during the life cycle of buildings, including raw material extraction, transportation, and material production [1]. Accurate calculation and effective control of embodied carbon in construction are significant for achieving the global carbon neutrality goal [19]. Research on embodied carbon calculation in construction is continuously deepening, and diverse calculation methods have emerged, which are discussed below.

(1): Life cycle assessment (LCA)

Life cycle assessment (LCA) is a method for comprehensively evaluating the environmental impact of products or services throughout their life cycle. It is widely used to calculate embodied carbon in construction [3]. This method takes a “cradle to grave” perspective, covering the entire process of building material extraction, production, transportation, construction, use, maintenance, demolition, and disposal [20]. By collecting data on energy consumption and material flows at each stage and combining them with the corresponding carbon emission factors, the embodied carbon of buildings throughout their life cycle can be calculated [1]. Luo et al. calculated the embodied carbon of office buildings, considering the production and transportation of building materials and equipment as well as the emissions generated during the construction process [19]. Zhai et al. calculated the embodied carbon of a wooden structure building, covering the stages of material production, transportation, construction, demolition, waste transportation, treatment, and recycling [21]. The advantage of this method is that it can present the carbon emission contributions by stage, providing support for the formulation of precise emission reduction strategies [1]. However, it also has limitations. It requires complete life cycle data and involves complex calculations, and results differ significantly among models [3]. In addition, industry data confidentiality or poor data management can leave material production energy consumption data unavailable.

(2): The construction energy consumption list statistics method

The construction energy consumption list statistics method collects data from on-site electricity, gasoline, and diesel meters and sums the measured total energy consumption during the construction stage. The embodied carbon emissions are then calculated from the conversion relationship between energy consumption and carbon emissions [1]. In theory, because this method is based on on-site measured data, its results are accurate and reliable [22]. However, this method has limitations. It relies on theoretical parameters such as mechanical efficiency and does not take into account uncertainties during construction, such as plan changes and equipment failures. This can lead to deviations between its results and actual energy consumption [23].

(3): Integrating Building Information Modeling (BIM)

Furthermore, with the development of information technology, BIM can be integrated with carbon emission calculation: the project’s material usage and equipment energy consumption information is embedded into the BIM model, and the carbon emission factors are matched automatically [24,25]. Based on the Industry Foundation Classes (IFC) standard, the BIM model can integrate data from multiple heterogeneous BIM software packages, solve the problem of data incompatibility between different software, and improve calculation efficiency [26]. However, the application depth remains insufficient. Most implementations stay at the level of data visualization and lack real-time monitoring and dynamic optimization functions [25].

(4): Machine learning methods

Machine learning methods have been used to predict building embodied carbon emissions. A model is trained on relevant building parameters and historical carbon emission data and then used to predict the embodied carbon emissions of buildings [27]. Intelligent algorithms have also been applied to this prediction task. For example, the BAS-LSTM model proposed by Jiang applies secondary decomposition of the data using VMD and EEMD and uses the BAS algorithm to optimize the LSTM weights, reaching a fitting coefficient (R²) above 0.94 [26]. Cang et al. established linear regression models for different structural forms based on data from 129 residential projects, with an error of approximately 10% [9]. The QCEPM model uses concrete, steel bars, and masonry as variables and achieves a MAPE of only 2.36% when predicting multi-story frame structures [28]. However, this approach relies on high-quality historical data. Data deviations or insufficient samples can significantly amplify the prediction errors [9,27,28,29].

2.3. The Application Prospects of LLMs in Calculating the Embodied Carbon Emissions of Buildings

Although large-scale models have initiated a wave of intelligent transformation in numerous fields, their application in calculating embodied carbon emissions of buildings is still relatively scarce. Large language models can provide a new approach to solving the problems of existing carbon emission calculation methods. Their advantages in automated data processing, semantic parsing, and multimodal fusion have brought revolutionary breakthroughs, and are expected to achieve substantive carbon emission calculations and low-carbon data applications [11]. By widely applying large-scale language models in building energy consumption assessment, it is possible to meet the increasing demand for efficient energy consumption assessment and management in building projects [30].

Gu et al. proposed a novel “comprehensive carbon assessment” method. This method uses large language models for intelligent semantic parsing and automatically matches material and equipment information with the corresponding carbon emission coefficients, significantly improving calculation efficiency [30]. However, directly using large models for tasks in specialized fields such as calculating the embodied carbon emissions of buildings may encounter some issues. On the one hand, although general large models have strong natural language understanding capabilities, they lack the professional knowledge of the construction industry, such as the carbon emission factors of various materials and the correlation between construction processes and carbon footprints. This can lead to “hallucination” outputs, such as incorrectly matching the carbon emission parameters of steel bars and concrete [8]. On the other hand, data in the construction field are highly specialized and scenario dependent. Building regulations and material standards differ between regions, and general models find it difficult to cover these nuances, resulting in significant deviations in calculation results. RAG technology addresses these issues without modifying model parameters. By dynamically calling external knowledge bases, it enhances the professionalism of the output, mitigates “hallucination” problems, and enables flexible adaptation to different scenarios [8].

Currently, RAG technology has been applied in multiple fields. For instance, Liu et al. pointed out that combining large models and RAG can achieve multimodal fusion of building energy data, enabling building system fault diagnosis and energy consumption optimization [31]. Combining RAG with a knowledge base built from clinical medical textbooks can improve the accuracy and reliability of intelligent question-answering systems and reduce the occurrence of hallucinations. Applied to defense technology intelligence, RAG can significantly improve the efficiency, accuracy, relevance, and timeliness of intelligence collection, enabling LLMs to better meet the needs of that field [32]. However, the application of RAG technology to building embodied carbon emissions remains unexplored. If RAG technology is introduced and a professional vector database is constructed, it will be possible to achieve real-time and accurate matching of carbon emission factors and cross-document information integration, thereby significantly improving the efficiency and accuracy of the accounting process.

In conclusion, applying large models and RAG technology to calculating embodied carbon emissions in buildings holds significant research value and practical significance. This integration can overcome the limitations of fragmented data and cumbersome processes in traditional calculations. Using intelligent retrieval and in-depth analysis can enhance the efficiency and accuracy of the calculations, providing reliable support for building low-carbon design, material selection, and policy formulation, and facilitating the realization of the dual-carbon goals.

3. Methods

3.1. Benchmark Case

A case study is selected for verification [12]. The 3D diagram of the British standard Lidl supermarket is shown in Figure 1. The building is a 2500-square-meter single-story structure designed with a steel frame and composite exterior walls.

Obtaining correct and accurate carbon emission factors is crucial for embodied carbon estimation, and carbon emission factors can be extracted from secondary sources such as environmental product declarations (EPDs), industry data, government data, and commercial LCA databases [30]. In the Lidl supermarket case, the carbon emission coefficients (ECFs) mainly come from two core databases. One is the greenhouse gas conversion factor dataset released by the Department of Energy and Climate Change of the United Kingdom, which classifies the ECFs into 40 categories. The other is the ICE database published by the Institution of Engineers, which supplements the ECFs by combining published data and derived data [4]. Two strategies are adopted for assigning factors. The first assigns values based on general material categories, and the second selects the factors that best match the material details.

This building had previously undergone a life cycle assessment within the A1–A3 boundary based on the 2020 greenhouse gas conversion factor dataset of the UK Department of Energy and Climate Change and the 2019 ICE database. The embodied carbon of each material was calculated with the formula “material usage (kg) × carbon emission factor (kgCO2e/kg) = embodied carbon (kgCO2e)”. The calculation of the embodied carbon emissions of the case building covers the substructure (foundations, floor bearing plates, etc.) and the superstructure (structural frames, floor slabs, roofs, stairs and ramps, interior and exterior walls and partitions, etc.). It excludes the weights of mechanical ducts, air handling units, and electrical and plumbing systems, as well as all life cycle stages other than A1–A3. The calculated embodied carbon emissions of the building are shown in Table 1. These values serve as the reference against which the results of this study are compared.
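The per-material formula quoted above can be illustrated with a minimal Python sketch. The class and function names are hypothetical, and the quantities and factors are placeholder numbers chosen for readability; they are not values from Table 1 or from the ICE database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterialLine:
    name: str
    quantity_kg: float            # material usage in kg
    factor_kgco2e_per_kg: float   # carbon emission factor in kgCO2e/kg


def embodied_carbon(line: MaterialLine) -> float:
    """Embodied carbon of one material line in kgCO2e (stages A1-A3)."""
    return line.quantity_kg * line.factor_kgco2e_per_kg


def total_embodied_carbon(lines) -> float:
    """Sum the embodied carbon over all material lines."""
    return sum(embodied_carbon(line) for line in lines)


# Placeholder bill of materials for illustration only (not case-study data).
bill_of_materials = [
    MaterialLine("steel frame", 1000.0, 1.5),
    MaterialLine("concrete slab", 20000.0, 0.125),
]
total = total_embodied_carbon(bill_of_materials)
print(f"Total embodied carbon: {total:.1f} kgCO2e")
```

Keeping the factor attached to each material line makes the two assignment strategies described earlier (general category versus best-matching detail) a matter of which factor is stored, while the arithmetic stays unchanged.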

3.2. Application of the Basic Large Model

3.2.1. Selection of the Basic Large Model

To systematically evaluate the applicability of large language models in calculating the embodied carbon emissions of buildings, this study selected Kimi, Doubao, and DeepSeek-R1 as the base models, mainly because of their wide application and high popularity in China. These models have significant influence in the domestic market and, after large-scale data training and optimization, possess strong natural language processing capabilities. Among them, Kimi (available at https://www.kimi.com/ accessed on 11 August 2025) is proficient in real-time information retrieval and long-context processing, and can accurately extract key parameters. Doubao (available at https://www.doubao.com/chat/ accessed on 13 August 2025) excels in integrating professional knowledge and adapting to scenarios, which can reduce calculation errors caused by deviations in standard interpretation. DeepSeek-R1 (available at https://chat.deepseek.com/ accessed on 16 August 2025) focuses on multi-modal data processing and can provide a scientific basis for the calculation. These three models have all undergone engineering corpus pre-training and fine-tuning, covering local building standards and policy documents. Unified case data are input into all three models, and their calculation accuracy and robustness are compared to clarify the application potential of large models in building carbon accounting and to provide empirical evidence for constructing an intelligent building carbon accounting framework.

3.2.2. Information Input Combination

To calculate the building’s embodied carbon emissions (A1–A3 stages) reliably with large models, the calculation elements must be standardized through prompts. The prompts should specify basic information such as building type, area, and location, define the scope of the A1–A3 stages, require the provision of material usage data, specify the calculation standards, emphasize the source of parameters, and stipulate the requirements for result output and analysis. This reduces the deviation caused by ambiguous information and keeps the calculation logic consistent with industry norms, which improves the rigor and reliability of the results and lays a sound foundation for subsequent comparative analysis. To verify the applicability and robustness of large models in calculating the embodied carbon (A1–A3) of buildings, this study designs five progressive input information combinations (InfoSet-1 to InfoSet-5). For the benchmark case, the functional unit (GFA = 2500 m²) remains constant, and only the information granularity and data quality are changed. The information combination design is shown in Table 2. The detailed content of each input combination is given in Appendix A.
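As a rough illustration of how such prompts might be assembled with progressively richer information, the following Python sketch builds a prompt from the elements listed above. The function name, field names, and wording are assumptions made for this example; the sketch does not reproduce the actual InfoSet-1 to InfoSet-5 texts of Table 2 and Appendix A.

```python
from typing import Optional


def build_prompt(
    building_type: str,
    gfa_m2: float,
    location: Optional[str] = None,
    materials_kg: Optional[dict] = None,
    standard: Optional[str] = None,
) -> str:
    """Assemble a calculation prompt; optional fields are left out when absent."""
    parts = [
        "Task: estimate the embodied carbon of the building for stages A1-A3.",
        f"Building type: {building_type}",
        f"Gross floor area: {gfa_m2:g} m2",
    ]
    if location is not None:
        parts.append(f"Location: {location}")
    if materials_kg:
        parts.append("Material quantities (kg):")
        parts.extend(f"- {name}: {qty:g}" for name, qty in materials_kg.items())
    if standard is not None:
        parts.append(f"Calculation standard: {standard}")
    parts.append("State the source of every carbon emission factor used.")
    parts.append("Report embodied carbon per material and the total in kgCO2e.")
    return "\n".join(parts)


# A sparse prompt and a richer one; the material quantity is a placeholder.
sparse = build_prompt("supermarket", 2500)
rich = build_prompt(
    "supermarket",
    2500,
    location="United Kingdom",
    materials_kg={"steel": 1000.0},
    standard="ICE database",
)
print(rich)
```

Holding the building type and floor area fixed while toggling the optional fields mirrors the idea of varying only the information granularity between input combinations.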

3.2.3. Verification Process

For each model and every InfoSet, acquire the response of the LLM and record the estimated embodied carbon.

Step-1: Conduct 5 independent experiments for each model and each InfoSet, and record the metrics of each experiment.

Step-2: Calculate the Score for each model and each InfoSet based on the 5 experiments.

Step-3: Compute the average Score for each model and each InfoSet.

Step-4: Carry out the statistics and analysis of the data.

The specific details of the proposed assessment indicators are summarized in Table 3. Among them, the matching factor is used to measure the degree of agreement between the calculated embodied carbon value and the reference value of each material, requiring the difference between the two to be controlled within 20%, thereby determining the accuracy of the matching. The total relative error of carbon emissions focuses on the deviation between the carbon emissions calculated by the large-scale model and the reference carbon emissions of the building, and is an essential basis for reflecting the accuracy of the model’s carbon emissions calculation. The prediction accuracy is quantified to reflect the reliability of the model’s prediction results, and intuitively demonstrates the model’s predictive accuracy for carbon emissions of buildings. The standard deviation can represent the dispersion of carbon emission results obtained from multiple experiments under the same information set by the same model, and the smaller the value, the weaker the fluctuation of the experimental results, and the better the stability of the model performance. The coefficient of variation, as a nondimensional indicator, can describe the relative dispersion of data and is applicable for the stability comparison of indicators from different units. The smaller the coefficient, the lower the relative dispersion of the data from the average value, and the higher the reliability of the results. Stability comprehensively reflects the reliability and stability of the model results, which is calculated by subtracting the coefficient of variation from 1.

$EC_{\text{model-}i}$: The embodied carbon emission amount of the building calculated by the large model in the $i$-th iteration.

$EC_{\text{ref}}$: The baseline embodied carbon emissions of the case buildings.

$EC_{\text{model-avg}}$: The average value of the embodied carbon emissions of the building resulting from $N$ iterations of the large model calculation.

The comprehensive score is calculated as:

$$\text{Score} = 0.1\,FM_{\text{avg}} + 0.7\,PA_{\text{avg}} + 0.2\,S \qquad (7)$$

Score ranges from 0 to 1, with a value closer to 1 indicating better performance.
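The scoring procedure can be sketched in Python as follows. Only the 20% matching tolerance, the relation S = 1 − CV, and the weights 0.1, 0.7, and 0.2 come from the text; the exact formulas of Table 3 are not reproduced here, so the definitions of the matching factor (share of materials within tolerance), the prediction accuracy (one minus the relative error, floored at zero), and the use of the population standard deviation are assumptions of this sketch.

```python
from statistics import mean, pstdev


def matching_factor(model_by_material, ref_by_material, tol=0.20):
    """Assumed FM: share of materials within `tol` of the reference value."""
    hits = sum(
        1
        for name, ref in ref_by_material.items()
        if name in model_by_material
        and abs(model_by_material[name] - ref) / ref <= tol
    )
    return hits / len(ref_by_material)


def prediction_accuracy(ec_model, ec_ref):
    """Assumed PA: 1 minus the total relative error, floored at 0."""
    return max(0.0, 1.0 - abs(ec_model - ec_ref) / ec_ref)


def stability(ec_runs):
    """S = 1 - CV with CV = standard deviation / mean (floored at 0 here)."""
    cv = pstdev(ec_runs) / mean(ec_runs)
    return max(0.0, 1.0 - cv)


def score(fm_runs, ec_runs, ec_ref):
    """Comprehensive score: 0.1 * FM_avg + 0.7 * PA_avg + 0.2 * S."""
    fm_avg = mean(fm_runs)
    pa_avg = mean(prediction_accuracy(ec, ec_ref) for ec in ec_runs)
    return 0.1 * fm_avg + 0.7 * pa_avg + 0.2 * stability(ec_runs)


# Five hypothetical runs against a hypothetical reference of 100 kgCO2e.
runs = [90.0, 110.0, 100.0, 95.0, 105.0]
print(round(score([0.8] * 5, runs, 100.0), 4))
```

Because prediction accuracy carries 70% of the weight, a model that is stable but consistently biased still scores poorly under this scheme.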

3.3. Enhanced Retrieval-Based LLM

This study employs retrieval-augmented generation (RAG) technology to construct a retrieval-enhanced large model for building embodied carbon emissions. The aim is to address the “hallucination” problem of large models and improve the accuracy of building embodied carbon emission calculation. The framework’s core is a “retrieval–enhancement–generation” three-stage closed-loop process. It achieves high-precision and interpretable carbon accounting results by integrating domain knowledge with the generation capabilities of large language models (LLMs). The system is based on embodied carbon emission data for the A1–A3 stages of the building life cycle, and through the coordinated operation of text preprocessing, vector retrieval, and intelligent generation, it turns user queries into structured accounting reports.

3.3.1. Systematic Architecture Design

The system adopts a modular design concept, consisting of three core components: the knowledge retrieval module, the large language model module, and the Dify coordination hub. The knowledge retrieval module extracts relevant information fragments from the structured knowledge base. The large language model module is responsible for text understanding and generation. The Dify platform is the system hub, responsible for workflow orchestration, parameter configuration, and performance monitoring.

The knowledge retrieval system first converts the user query into a vector representation and searches for the most relevant document fragments in the vector database. The retrieval results are integrated with the original query through the Dify platform. The calculation code module is then called to perform the numerical calculations on the retrieved values. Because the calculation code follows fixed, deterministic rules, it handles complex numerical calculations precisely and reduces the arithmetic errors that an LLM might otherwise introduce. The processed data is then sent to the base large language model for answer generation. The Dify platform optimizes the context length through an intelligent context window management mechanism. Specifically, the system ranks the retrieved document fragments by their relevance to the user’s query and automatically truncates or compresses redundant and low-relevance text segments while retaining key data and computational logic. The platform also supports dynamic context assembly: only the necessary knowledge base retrieval results, the user’s current query, and the results of calculation code calls are included in the generation context, which avoids attention dispersion or performance degradation caused by excessive input. This optimization ensures the accuracy and efficiency of context information in professional computing tasks and improves the response quality and stability of large language models in complex carbon calculation scenarios. Throughout the process, the Dify platform screens the quality of the retrieval results, optimizes the context length, coordinates the invocation of calculation code, and dynamically adjusts the generation parameters.
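The retrieve, calculate, and assemble flow described above can be sketched in plain Python. This is a toy stand-in under stated assumptions, not the Dify workflow itself: a bag-of-words cosine similarity replaces the real embedding model and vector database, the knowledge base entries and factor values are placeholders, all function names are hypothetical, and the final LLM call is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base fragments; factor values are placeholders.
KNOWLEDGE_BASE = [
    {"material": "steel", "factor": 1.5,
     "text": "structural steel section carbon emission factor A1-A3"},
    {"material": "concrete", "factor": 0.125,
     "text": "ready-mix concrete carbon emission factor A1-A3"},
    {"material": "timber", "factor": 0.5,
     "text": "sawn timber carbon emission factor A1-A3"},
]


def embed(text):
    """Toy bag-of-words vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, top_k=3):
    """Rank knowledge-base fragments by similarity to the query."""
    q = embed(query)
    ranked = sorted(
        KNOWLEDGE_BASE, key=lambda d: cosine(q, embed(d["text"])), reverse=True
    )
    return ranked[:top_k]


def pack_context(fragments, budget_chars):
    """Keep the highest-ranked fragments that fit in the character budget."""
    kept, used = [], 0
    for frag in fragments:  # fragments arrive sorted by relevance
        if used + len(frag) > budget_chars:
            break
        kept.append(frag)
        used += len(frag)
    return kept


def build_llm_input(query, quantity_kg, budget_chars=120):
    hits = retrieve(query)
    best = hits[0]
    # The arithmetic is done in code rather than left to the language model.
    carbon_kgco2e = quantity_kg * best["factor"]
    fragments = pack_context([h["text"] for h in hits], budget_chars)
    prompt = "\n".join(
        ["Context:"]
        + fragments
        + [f"Computed result: {carbon_kgco2e} kgCO2e ({best['material']})",
           f"Question: {query}"]
    )
    return best["material"], carbon_kgco2e, prompt


material, carbon, prompt = build_llm_input(
    "embodied carbon of 1000 kg of structural steel", 1000.0
)
print(prompt)
```

The character budget is a crude proxy for a token budget; the point is only that low-ranked fragments are dropped first while the computed result and the query are always kept.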

3.3.2. Realization Process

Constructing a carbon emission factor database requires systematic collection and organization of original emission data from authoritative institutions, followed by strict data cleaning and standardization processing, including uniformization of measurement units, detection of outliers, etc. Subsequently, the emission factors and their metadata (such as applicable scope, data creation time, geographical characteristics, etc.) are transformed into a standardized format through structured processing. In the Dify platform, data update strategies and quality verification rules can be conveniently configured, and a data version control and cross-validation mechanism can be established to ensure the database’s authority, accuracy, and timeliness, providing reliable data support for carbon accounting and emission reduction decision-making. The system adopts a two-stage processing flow: the first stage retrieves relevant document fragments through semantic search, and the second stage inputs the search results and the user’s query into the large language model. The Dify platform provides a flexible prompt engineering interface, allowing researchers to design dynamic prompt templates and automatically adjust the generation strategy based on the quality and quantity of the search results. In the Dify environment, various basic large language models can be easily integrated and fine-tuned for specific tasks. The platform supports refined control of generation parameters, including key parameters such as creativity level and response length. The system also implements an intelligent caching mechanism to cache results for common queries to improve response speed.
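The cleaning and standardization step for the factor database can be sketched as below. The record layout, the unit labels, and the median-absolute-deviation rule with threshold k are assumptions chosen for illustration; the source does not specify which units occur in the raw data or which outlier test is used, and the numbers are placeholders.

```python
from statistics import median

# Divisors that bring a factor to kgCO2e/kg; the unit labels are assumptions.
UNIT_DIVISOR = {"kgCO2e/kg": 1.0, "kgCO2e/t": 1000.0, "tCO2e/t": 1.0}


def standardize(record):
    """Return a copy of the record with the factor expressed in kgCO2e/kg."""
    divisor = UNIT_DIVISOR[record["unit"]]  # a KeyError flags an unknown unit
    return {**record, "factor": record["factor"] / divisor, "unit": "kgCO2e/kg"}


def flag_outliers(values, k=5.0):
    """Flag values more than k median absolute deviations from the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(v - med) / mad > k for v in values]


# Placeholder record with metadata fields mentioned in the text.
raw = {"material": "steel", "factor": 1500.0, "unit": "kgCO2e/t",
       "region": "UK", "year": 2019}
clean = standardize(raw)
flags = flag_outliers([1.0, 1.1, 0.9, 1.0, 50.0])
print(clean["factor"], flags)
```

Flagged entries would be candidates for manual review rather than automatic deletion, since a high factor can be legitimate for a carbon-intensive material.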

3.3.3. Analysis of Method Advantages

This method fully leverages the integration advantages of the Dify platform, simplifying the originally complex RAG system into a visual configuration process. The built-in performance analysis tools of the platform provide data support for system optimization, and the modular design enables each component to be updated and upgraded independently. Moreover, the hybrid architecture retains the language understanding and generation capabilities of large language models, while ensuring the accuracy of key information through the retrieval mechanism. Particularly in professional domain question answering, this method significantly reduces the phenomenon of model hallucination, while maintaining a natural and fluent conversation experience. The collaboration function of Dify also supports team members in jointly maintaining the knowledge base and optimizing the prompt strategy.

3.3.4. Evaluation and Verification

The retrieval-enhanced large model proposed in this study is compared with the actual values and with the results obtained from the basic LLMs. Under the same dataset and query conditions, the differences in the results are analyzed, and the advantages of the proposed methodology in knowledge retrieval and answer generation are examined. Optimization suggestions are then proposed based on the experimental results.

4. Results and Discussion

4.1. Analysis of Carbon Emission Calculation Results in Buildings Based on the Basic LLMs

The scores of each InfoSet under the three models are shown in Figure 2. A comparison of the different input combinations shows that the richness and completeness of the input information have a significant impact on the computational performance of the large models. The full information package (InfoSet 5) is the optimal input combination, as it contains all the core parameters required for the calculation and reduces interpretation errors. Region marking plays an important role in improving calculation accuracy. Comparing the scores of InfoSet 2 (without region marking) and InfoSet 3 (with region marking) shows that both Kimi and DeepSeek achieved significantly improved scores after region marking was added. This indicates that regional information provides clear geographical boundaries for carbon factor retrieval and reduces the error caused by regional deviation. The form in which information is presented also affects model performance. The average score of InfoSet 4 is relatively low. This might be because some models parse material data presented as text more reliably, while their handling of structured Excel data still leaves room for improvement in the details.

The performance of different large models in calculating the embodied carbon emissions of buildings varies and is related to the scale of the model parameters. The parameter scale determines the breadth of a model’s knowledge base, but it is not the only factor. If professional knowledge such as building material databases, LCA standards, and dynamic emission factors is not fully absorbed during training, the calculation will still lack accuracy even with a large number of parameters. More importantly, the knowledge of general models is broad and semantically associative, whereas precise carbon calculation requires a strict, tree-like professional logical framework. During reasoning, the limited context length also restricts the model’s ability to handle lengthy standards, complex calculation rules, and real-time data simultaneously, leading to the omission of key steps or confusion of system boundaries in multi-step calculations, and ultimately to deviations in the results.

Looking at the individual LLMs, Kimi’s score increased from 0.71 in InfoSet 1 to 1 in InfoSet 5, an upward trend overall. It reached 0.86 in InfoSet 3 and dropped slightly to 0.83 in InfoSet 4 before performing best in InfoSet 5, which contained all the input information. Kimi was relatively sensitive to the richness of the input information, showing a marked improvement in score after region markers were added, which indicates strong adaptability to regional parameters. However, with structured data input, its performance was slightly inferior to the combination of text and region markers, which might be related to the details of Excel data parsing. Doubao performed outstandingly in InfoSet 1, but its score was lower than those of the other two models in InfoSets 2 to 4. The built-in vertical knowledge base it relies on maintained a high score when the information was incomplete but was insensitive to the incremental information in the intermediate stages. This might be related to the integration logic between general information and the professional knowledge base, which requires complete information to be triggered. Doubao performed best in InfoSet 5, which contained all the input information, indicating high professional calculation accuracy when the parameters are complete. DeepSeek’s score rose rapidly from 0.53 in InfoSet 1 to 0.80 in InfoSet 2 and then remained stable. It performed best in InfoSet 5, showing the largest overall growth and reacting strongly to the addition of material data. This indicates that it performs well in complex information integration and computational stability. In the early stage, however, its score was relatively low owing to a lack of professional knowledge reserves. In conclusion, large models have significant potential for application in building embodied carbon emission calculations. In particular, when the input information is sufficient, they can achieve high-precision calculations. When the full information package (InfoSet 5) is input, the comprehensive scores of the three models all reach 1, indicating that when complete information, such as structured material data, regional markers, and carbon emission factors, is provided, large models can effectively integrate information and output reliable calculation results.

Based on the above experimental results, the basic large model has certain limitations in calculating the embodied carbon emissions of buildings. Because its knowledge system is general-purpose and fixed, it has difficulty adapting to the professionalism, data dynamics, and regional specificity of this field. When the supplied information is insufficient, the built-in knowledge of the model cannot precisely match the specific needs of the professional scenario. The model either fills the information gap with incorrect assumptions or relies too heavily on partial information for one-sided inference, which exposes the hallucination problem and reduces the reliability of the calculation results.

It should be noted that even when all the information is complete, the calculations performed by large models can still contain errors. This is closely related to the characteristics of the large models themselves. In essence, a large model is a probabilistic prediction model trained on massive text data: it generates “the most likely output” by learning language patterns and context correlations rather than by performing precise mathematical and logical calculations. For simple numerical operations in calculating embodied carbon emissions in buildings, the model may “remember” the results because such combinations occur frequently in the training data, and it performs relatively accurately. For complex calculations, however, the coverage of such specific combinations in the training data is limited, and the model cannot apply explicit formulas as professional calculation tools do. Instead, it relies on similar patterns to “guess” the results, which is prone to errors. Meanwhile, the attention mechanism of large models has limitations when dealing with long sequences or multi-step logical operations. The calculation of embodied carbon emissions in buildings involves a large number of consecutive numerical operations and logically progressive steps. As the number of calculation steps increases and the data volume expands, the model finds it challenging to track the details of each step precisely. Its attention may become dispersed, causing errors in intermediate steps that ultimately result in deviations in the final results.

Therefore, building a specialized RAG-enhanced large model is essential. Such a model combines the basic large model with a knowledge base of expertise in embodied carbon emissions. When handling computational tasks, it first retrieves precise information related to the current task from the knowledge base and then conducts reasoning and calculation based on this reliable, professional information. This effectively reduces the reliance on the internal knowledge of the basic large model, avoids the subjective completion and hallucination problems caused by insufficient information, improves the accuracy and reliability of the calculation results, and better meets the practical application needs of this field. In addition, the embedded calculation code applies rigorous logic and precise calculation rules to complex numerical calculations, further reducing errors.

4.2. Construction of Enhanced Retrieval-Based LLM

4.2.1. Selection of the Foundational Large Model

This study selects DeepSeek for RAG deployment, mainly based on its unique advantages demonstrated in the task of calculating the embodied carbon emissions in construction: Firstly, DeepSeek responds extremely strongly to the supplementation of basic information, with its scores rapidly improving from InfoSet 1 to InfoSet 3, showing its efficient absorption capacity for external information, which is highly consistent with the core requirement of the RAG architecture that relies on the supplementation of information from a knowledge base. Secondly, DeepSeek’s multi-modal data processing advantages in structured data input can better adapt to the diverse information retrieval and integration scenarios involved in deploying RAG.

4.2.2. Knowledge Base Construction

When building the database required for calculating the embodied carbon emissions of buildings, comprehensive standardization of the data is necessary. This includes the following: (1) Unifying measurement units to ensure consistency in measurement scales across all data, thereby avoiding analytical errors caused by unit differences. (2) Annotating the creation time of the data to facilitate the tracing of its timeliness and provide a basis for considering time factors in subsequent analyses. (3) Specifying the geographical characteristics of the data, such as coverage of different regions like Europe, Africa, and Asia, to make the connection between the data and the geographical dimension explicit. The standardized embodied carbon emission data of buildings are then aggregated in Excel so that they are easy to read.

This database was established based solely on the data required for the selected case. However, to verify the accuracy of information retrieval by the RAG technology, a set of hypothetical virtual data was specially designed based on the aforementioned carbon emission factor information. These virtual data cover different regions such as Europe, Africa, and Asia. Their data creation time is set to be earlier than that of the actual information, and all their values are set to −10,000,000. This highly distinctive value enables rapid verification of the model’s retrieval results. Because it is unique and differs greatly from the reasonable range of building embodied carbon emission data, its appearance during the model’s solution process immediately indicates that the retrieved information came from the virtual data, which makes it possible to determine whether the model has accurately retrieved the real data in the RAG knowledge base.
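The decoy-value check lends itself to a very small automated test. The sketch below assumes the retrieved factors are available as a mapping from material name to factor value. The function name and the sample factor values are hypothetical, and only the constant −10,000,000 comes from the text.

```python
SENTINEL = -10_000_000  # decoy value assigned to every virtual record


def decoy_hits(retrieved_factors):
    """Return the materials whose retrieved factor is the decoy value."""
    return [name for name, factor in retrieved_factors.items() if factor == SENTINEL]
```

An empty list means that every retrieved factor came from a real record. Any entry in the list identifies a material for which the retrieval step selected a virtual record instead.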

4.2.3. Information Input Method

Based on the above experimental results, choosing InfoSet 3 (plain text + material data + regional markers) as the input for the RAG technology combined with the large model has significant advantages. This input form has both practicality and technical adaptability. Firstly, the text-based material data is highly compatible with the parsing characteristics of natural language models, avoiding the format compatibility issues of structured data. Secondly, the clear regional markers provide precise geographical boundaries for the retrieval module of RAG, ensuring the efficient positioning of the carbon factor database and reducing the risk of regional bias from the source. By embedding core parameters such as material usage and process type in text form within the question context, it can meet the semantic understanding requirements of the large model and provide structured feature anchors for the vectorized retrieval of RAG, forming a “text-friendly structured” input paradigm. This design retains the integrity of information while optimizing the model’s processing efficiency for professional data. By selecting InfoSet 3 as the input template for the RAG system, it is possible to quickly build a large model application for building material embodied carbon emissions at a lower cost while ensuring calculation accuracy.

4.2.4. Workflow Setup

(1): Information Extraction Phase

By analyzing the input text description, identify and extract the material name and the corresponding quantity. For example, from “use 500 kg of steel”, extract the material name as “steel” and the quantity as 500 kg. The extraction method combines the association patterns of quantities, units, and material names in the text to accurately identify various materials and their quantities. Search for region-related keywords in the description, such as “UK”, “China”, etc., to determine the applicable regional scope for the calculation. If no regional information is in the description, mark it as “Region not specified”.
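A rule-based version of this extraction step might look like the sketch below. It is an assumption for illustration only, since in the described workflow the extraction is performed inside the Dify pipeline. The regular expression, the two-entry region list, and the `extract` function are hypothetical.

```python
import re

# Quantity, unit, then material name, for example "500 kg of steel".
# The material name ends at punctuation, at " and", at " in", or at the end of the text,
# so multi-word names containing "in" or "and" are not handled by this simple pattern.
PATTERN = re.compile(
    r"(\d[\d,]*(?:\.\d+)?)\s*(kg|t)\s+of\s+"
    r"([A-Za-z][A-Za-z \-]*?)(?=[,.;]|\s+(?:and|in)\b|$)",
    re.IGNORECASE,
)
REGIONS = ("UK", "China")  # illustrative keyword list


def extract(description):
    """Return (list of material records, region or 'Region not specified')."""
    materials = [
        {
            "material": m.group(3).strip().lower(),
            "quantity": float(m.group(1).replace(",", "")),
            "unit": m.group(2).lower(),
        }
        for m in PATTERN.finditer(description)
    ]
    region = next(
        (r for r in REGIONS if re.search(rf"\b{re.escape(r)}\b", description)),
        "Region not specified",
    )
    return materials, region
```

For the example in the text, `extract("use 500 kg of steel")` yields one record for steel with quantity 500.0 and the marker "Region not specified".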

(2): Carbon Emission Factor Retrieval Phase

Based on the extracted material name and regional information, search in the carbon emission factor database following the rules. Prioritize searching for completely matching carbon emission factor records based on the combination of “material name + region” in the database. For example, “steel + UK” corresponds to the carbon emission factor for steel in the UK. If there are multiple records for the same “material + region”, select the one with the latest update date; if the update dates are the same, select the first record with reasonable carbon emission factor data. If no region is specified and there are multiple records for the same material in the database, directly return the prompt: “Please provide the region information in the description. I will recalculate for you!” If the material name does not fully match, such as inputting “rebar” while the database is “steel”, select the record with the highest matching degree through name similarity comparison as a substitute. If there is no matching record (including fuzzy matching) for a specific material in the database, return the prompt: “No carbon emission factor found for the following material. Please check the material name or provide the carbon emission factor data”, and list the unmatched material names.
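The matching rules above can be sketched as a single lookup function. The records, their factor values, and the similarity threshold are made up for illustration. Character-level similarity from `difflib` is used here only as a stand-in for the "name similarity comparison": it would not map "rebar" to "steel", which would require a synonym table or embedding-based similarity. The check for "reasonable" factor data is omitted.

```python
from datetime import date
from difflib import SequenceMatcher

# Hypothetical records: (material, region, factor in kgCO2e/kg, update date).
RECORDS = [
    ("steel", "UK", 1.50, date(2019, 1, 1)),
    ("steel", "UK", 1.60, date(2023, 1, 1)),
    ("steel", "China", 2.00, date(2022, 1, 1)),
    ("concrete", "UK", 0.10, date(2021, 1, 1)),
]
REGION_PROMPT = "Please provide the region information in the description. I will recalculate for you!"
NOT_FOUND_PROMPT = (
    "No carbon emission factor found for the following material. "
    "Please check the material name or provide the carbon emission factor data"
)


def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()


def lookup(material, region, records=RECORDS, min_similarity=0.6):
    """Return (record, None) on success or (None, prompt) when a rule blocks the match."""
    names = sorted({r[0] for r in records})
    name = material
    if name not in names:
        # Fuzzy fallback: take the most similar known name if it is similar enough.
        best = max(names, key=lambda n: similarity(material, n))
        if similarity(material, best) < min_similarity:
            return None, f"{NOT_FOUND_PROMPT}: {material}"
        name = best
    candidates = [r for r in records if r[0] == name]
    if region is None:
        if len(candidates) > 1:
            return None, REGION_PROMPT
    else:
        candidates = [r for r in candidates if r[1] == region]
        if not candidates:
            return None, f"{NOT_FOUND_PROMPT}: {material}"
    # Latest update date wins; max() keeps the first record when dates tie.
    return max(candidates, key=lambda r: r[3]), None
```

With these sample records, `lookup("steel", "UK")` returns the 2023 record, and `lookup("steel", None)` returns the region prompt because several steel records exist.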

(3): Carbon Emission Calculation

For materials that are successfully matched with carbon emission factors, calculate the carbon emission of a single material using the formula “carbon emission = material quantity × carbon emission factor”. Sum up the carbon emission results of all single materials to obtain the final total carbon emission.
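The formula and the summation translate directly into code. In the sketch below the function name is hypothetical, and quantities and factors are assumed to be in consistent units (for example kg and kgCO2e/kg). Materials without a matched factor are skipped, mirroring the "successfully matched" condition.

```python
def total_emissions(quantities, factors):
    """Return per-material emissions and their sum for materials with a matched factor."""
    per_material = {
        name: qty * factors[name]
        for name, qty in quantities.items()
        if name in factors
    }
    return per_material, sum(per_material.values())
```

For instance, 500 kg at an illustrative factor of 2.0 plus 1000 kg at 0.5 gives 1000 + 500 = 1500 kgCO2e.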

(4): Result Presentation

The output content includes: the matched material name, corresponding region, carbon emission factor update date, carbon emission of a single material, and the cumulative total carbon emission; if there are unmatched materials or missing regions, display the corresponding prompt information synchronously.

Through the above process, the entire process from information parsing to carbon emission calculation can be systematically completed, ensuring that the results comply with the rules and are logically clear. The workflow is shown in Figure 3.

4.3. Analysis of Carbon Emission Calculation Results in Buildings Based on Retrieval-Enhanced LLM

After experimental verification, under the input condition of InfoSet 3, this study conducted five repeated tests on the RAG large model. The implicit carbon emission accounting results output by the model were all entirely consistent with the actual data, demonstrating the stability and accuracy of the retrieval-enhanced generation framework. Comparative tests showed that the average score of the basic large model based on DeepSeek under the same conditions was 0.8. In contrast, by leveraging the collaborative optimization of knowledge retrieval and calculation code, the RAG large model increased the average score to 1, with a performance improvement of 25%. This significant improvement stems from the three-stage closed-loop design of the RAG architecture: vector retrieval ensures data authority, relying on calculation code guarantees numerical accuracy, and finally, the large language model generates an interpretable report, thereby comprehensively surpassing pure generative models in terms of professionalism and accuracy. Thus, the results fully prove the significant optimization effect of the “retrieval–enhancement–generation” closed-loop mechanism on the accuracy of building implicit carbon emission calculation.
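The reported 25% appears to be a relative gain over the baseline score rather than a difference in score points. Using only the two average scores given above, the arithmetic is:

```latex
\[
\text{relative improvement} = \frac{1.0 - 0.8}{0.8} = 0.25 = 25\%
\]
```

The absolute difference between the two scores is 0.2.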

5. Further Discussion

This study combines large language models (LLMs) with retrieval-enhanced generation (RAG) technology to systematically explore its application potential in calculating embodied carbon emissions in buildings. It provides an innovative solution to address data fragmentation issues and inconsistent standards in traditional calculation methods. The research results not only verify the applicability of large models in this field but also, through the optimized design of the RAG architecture, break through the limitations of general models in professional scenarios, providing theoretical and practical support for the intelligent transformation of the building low-carbon field.

The findings of this study effectively echo and expand upon the conclusions in existing literature regarding the application of large models in professional fields. In terms of the industry adaptability of large models, the research results show that the basic large model can achieve high-precision calculations when the input information is complete (such as InfoSet 5), which is consistent with the conclusion proposed by Gu et al. [30] in the CECA method that “large models enhance calculation efficiency through semantic parsing”, confirming the potential of LLMs in handling structured data and professional logic. At the same time, this study further reveals the influence pattern of information input granularity on model performance: regional marking (such as “the United Kingdom”) can reduce the carbon factor matching error by more than 20%, which supplements Zhang et al.’s research conclusion on “the need for regional parameter adaptation in the construction industry” [8].

Regarding the selection of technical paths, the RAG architecture proposed in this study contrasts sharply with the fine-tuning method. Compared with the Claude-3.5 fine-tuning scheme adopted by Gu et al. [30], the RAG model in this study achieved zero-error calculation in case verification and did not rely on large-scale labeled data. This advantage effectively avoids the risks of “data dependence” and “overfitting” faced by the fine-tuning method and is especially suited to the scattered data and dynamically updated standards of the construction industry. Moreover, the closed-loop design of “retrieval–calculation–generation” in the RAG architecture significantly improves the processing efficiency of steps such as material usage analysis and carbon factor matching and avoids much of the complexity of the traditional LCA method, providing a feasible solution for real-time carbon accounting scenarios.

Compared with the computing mode of the basic large model, which relies on uploading the complete information (such as InfoSet 5), the RAG architecture constructed in this study achieves a synergistic improvement in accuracy and efficiency through a dual design of code embedding and knowledge base embedding. This design offers a clear advantage in professional computing scenarios. Regarding computational accuracy, the RAG architecture processes numerical operations through independent calculation code modules, avoiding the computational deviation caused by the “probabilistic generation” of the basic large model. Experimental data show that when the basic large model processes multi-material cumulative calculations, there may be a 5–8% error due to dispersion of the attention mechanism. In the RAG architecture, by contrast, the building data embedded in the code and their associated carbon emission calculation formulas are evaluated strictly according to mathematical logic. In five repeated tests, zero deviation from the benchmark data was achieved. This combination of “machine logic + professional formula” is particularly suitable for the complex “multi-material–multi-factor” calculation scenarios of building embodied carbon calculation, and its accuracy advantage grows as the number of material types increases. Regarding operational convenience, the RAG architecture embeds the carbon factor database within the system design, significantly reducing users’ information input burden. When the basic large model operates in the full information upload mode, users who want relatively accurate results need to manually upload complete project information each time, which is cumbersome and prone to failure due to format errors. The RAG system, however, uses preset “material–region–factor” association rules. Users only need to input the text-based material list and regional information, and the system automatically completes database retrieval and parameter matching. More importantly, when industry standards are updated, users do not need to upload the complete data again. They only need to update the corresponding entries through the knowledge base management interface, and the system can call the latest parameters in real time. For the same case, repeatedly integrating and uploading information with traditional large models requires at least 60 min, whereas repeated calculations with the large model built on RAG technology take only about 20 min, which demonstrates that RAG technology significantly enhances the automation level and efficiency of the carbon accounting process.

6. Conclusions

This study innovatively integrates large language models (LLMs) with retrieval-enhanced generation (RAG) technology, leveraging a dual mechanism of “knowledge plug-in” and “computational embedding” to transform LLMs from “general assistants” into “domain experts”, thereby achieving a leap in the accuracy and efficiency of ECE calculations. The system has successfully addressed long-standing issues in the calculation of building implicit carbon emissions, such as data fragmentation, inconsistent regional standards, and low computational efficiency. The main conclusions drawn in this study are as follows:

(1): The basic large model has application potential in the calculation of embodied carbon emissions in buildings, but the reliability of its results is limited: Because the basic large model is trained on massive text data and generates “most likely outputs” by learning language patterns and context correlations rather than by performing precise mathematical and logical operations, the reliability of its results is limited. This clarifies the direction for subsequent technical optimization.
(2): Revealing the influence of data elements on calculation accuracy: It is clarified that data integrity and regional identification significantly affect calculation accuracy. It is proposed that inputting complete information, including structured material data, regional identification, and carbon emission factors, can minimize parsing errors, providing an optimization idea at the data level for improving calculation accuracy.
(3): Creating a pioneering application path for integrating RAG and large models: This is the first proposal combining RAG technology and large models for application in the embodied carbon emissions calculation scenario for buildings. The core defect of general large models in this field is solved at the technical architecture level through the dynamic invocation of external knowledge bases to achieve precise matching of carbon emission factors and linkage with the calculation module to complete zero-error numerical operations. It provides a new perspective and new methods for theoretical research in this field.
(4): Verifying the significant efficiency improvement of RAG technology: Empirical evidence shows that the combination of RAG technology and the basic large model outperforms the basic large model by 25%, fully demonstrating the optimization value of RAG technology for the application of large models in carbon emission calculations, and providing a reusable performance improvement benchmark for the technical implementation in this field.
(5): Establishing a supporting theory and application system: Based on the above integration practice, a dynamic mapping relationship between data integrity and calculation accuracy is established, and an exclusive RAG architecture and hierarchical application strategy suitable for embodied carbon calculation are proposed and verified.

This study verified the feasibility of large models in calculating the implicit carbon emissions of buildings through the RAG architecture. However, there are certain limitations to the research scope. At the knowledge base level, the current carbon emission factor database is constructed based on the requirements of the Lidl Supermarket case in the UK, only integrating relevant data sources in the UK, with a narrow coverage range, and it is not easy to adapt to building projects in different regions around the world. Its application in uncovered areas may result in a calculation error. Although the existing RAG workflow achieves a basic closed loop in the workflow design, it lacks flexible adaptation modules for different calculation scenarios.

First, subsequent research will establish a global carbon emission factor database to provide comprehensive and accurate data support for calculating building carbon emissions, thereby improving model performance and providing tools for the low-carbon development of the global construction industry. At the same time, a dynamic and scalable knowledge base management architecture will be constructed, including automated data capture, version control, and verification synchronization with authoritative databases. Through rule-triggered updates combined with manual review, the latest policies and standards will be promptly incorporated to ensure the timeliness and accuracy of carbon accounting and guarantee the long-term effectiveness of the research methods in the low-carbon construction field.

Second, subsequent research will focus on developing technologies for the analysis and structured transformation of multimodal unstructured data. Priority will be given to breakthroughs in the automated association mapping of Excel table data and in intelligent recognition and extraction algorithms for key parameters in engineering drawing images. A standardized processing flow applicable to multiple sources of unstructured data will be constructed. On this basis, deep data interaction and integration with BIM tools will be implemented. A mechanism for the automatic extraction and standardized conversion of core data such as component geometric properties, material types, and engineering quantities in BIM models will be established, forming a comprehensive data source processing system covering “unstructured data–structured data–BIM model data”.

Finally, our next stage of research will focus on developing a pluggable “standard adaptation module”. This module is designed to store the core calculation rules, system boundary definitions, and emission factor databases of various standards, forming an independent and configurable set of rules. The core of the system will call this module through a unified interface to achieve dynamic matching with specific regional standards. This move will significantly enhance the regulatory flexibility and regional applicability of the system, with the ultimate goal being that users only need to specify the location of the project, and the system can automatically call the corresponding standards to complete precise and compliant implicit carbon calculations, significantly improving the level of automated application.

Implementing the above technical solutions will provide high-quality and highly reliable data support for carbon accounting throughout the entire lifecycle of buildings. This will further enhance the automation level of the carbon accounting process and the accuracy of the accounting results, and expand the academic research value and engineering application scenarios of this system in the field of building carbon management.

In summary, this paper combines LLMs with RAG technology and applies them to the calculation of embodied carbon emissions in buildings for the first time, providing theoretical support and operational technical paths for the carbon neutrality of the industry and promoting the intelligent development of this field.

Author Contributions

Conceptualization, J.X.; methodology, Y.Z. and J.X.; software, Y.Z. and J.X.; validation, none specified; formal analysis, Y.Z.; investigation, Y.Z.; resources, none specified; data curation, Y.Z.; writing—original draft preparation, Y.Z.; writing—review and editing, J.X. and R.Z.; visualization, J.X.; supervision, J.X.; project administration, none specified; funding acquisition, J.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

The detailed input combination content is as follows:

(1): InfoSet-1 Minimal Text—No Building Materials Data

You are now an expert in building LCA. Please estimate the embodied carbon emissions of buildings A1–A3 based on the following description: Building A1–A3 is a single-story supermarket with a 2500 square meter floor area, using a steel frame and composite exterior wall design. The building materials include multiple types, such as Concrete (Concrete, Cast-in-Place-C15, Concrete-Cast-in-Place Concrete, Concrete, Cast in Situ, Concrete-Precast Concrete-35 MPa, and Default Mass Floor), Metal (Steel, Metal-Steel 50-355, Metal-Steel-S275, QuadCore Trapezoidal Roof Panel_KS1000RW_ExternalWeatherSheet, QuadCore Trapezoidal Roof Panel_KS1000RW_Internal LinerSheet, Default Roof-metal single skin, PrepaintSteel_ArcelorMittal_Construction_HARULTRA-35-CORAL, Cladding, Vertical Ribbed, Metal Stud Layer, Aluminium, PrepaintSteel_ArcelorMittal_Construction_INTERIEUR-12-WHITE), Insulation and Support (Default Roof-Generic insulation 125 mm, ArcelorMittal-Mineral Wool, Insulation/Support Frame, Rock Wool), Walls and Finishes (Default Wall, Gypsum Wall Board, Plaster, Brick, Common, Paint-White Lining), and Glass.

Please calculate the material usage for each material based on the average material usage coefficient for similar buildings (kg/m² GFA).

Automatically retrieve the carbon emission factors.

The formula for calculating the embodied carbon of the building is Embodied Carbon = Material Usage × Carbon Factor. Please note the consistency of units and retain the result to two decimal places.

Output: A calculation summary table, with each material name listed separately in a new row, and column titles as material, specification, quantity, unit, Carbon emission coefficient, source_database_or_EPD_ID, A1–A3_EC_kgCO2e, remarks, and calculate the total carbon emission value for each material.

(2): InfoSet-2 Plain Text—Contains building material data

You are now an expert in building LCA. Please estimate the embodied carbon emissions for buildings A1 to A3 based on the following description: Building A1 to A3 is a single-story supermarket with a 2500 square meter area, using a steel frame and composite exterior wall design. The building materials include multiple types and corresponding weights, among which the concrete types include Concrete, Cast-in-Place-C15 (14,674.397 kg), Default Mass Floor (1190.908 kg), Concrete-Cast-in-Place Concrete (97,024.061 kg), Concrete, Cast in Situ (4.813 kg), Concrete (9764.194 kg), Concrete-Precast Concrete-35 MPa (12,790.706 kg); the metal types cover Steel (4528.741 kg), QuadCore Trapezoidal Roof Panel_KS1000RW_ExternalWeatherSheet (19,720.414 kg), QuadCore Trapezoidal Roof Panel_KS1000RW_Internal LinerSheet (29,468.4 kg), Default Roof-metal single skin (43.79 kg), Metal-Steel 50-355 (20,982.768 kg), Metal-Steel-S275 (64,730.549 kg), PrepaintSteel_ArcelorMittal_Construction_HARULTRA-35-CORAL (27,644.39 kg), Cladding, Vertical Ribbed (18,124.461 kg), Metal Stud Layer (169,630.17 kg), Aluminium (2678.138 kg), PrepaintSteel_ArcelorMittal_Construction_INTERIEUR-12-WHITE (3213.457 kg); the insulation and support types include Default Roof-Generic insulation 125 mm (178.04 kg), ArcelorMittal-Mineral Wool (244,027.357 kg), Insulation/Support Frame (28.463 kg), Rock Wool (33,027.235 kg); the wall and finish types involve Default Wall (768.253 kg), Gypsum Wall Board (36,163.635 kg), Plaster (4467.497 kg), Brick, Common (25,933.089 kg), Paint-White Lining (0.557 kg); and there are also Glass (8286.045 kg).

Please conduct an automatic search for carbon emission factors. The implicit carbon calculation formula for buildings is: Implicit carbon = Material usage × Carbon factor. Please pay attention to the consistency of units. The result should be rounded to two decimal places.

Output: Generate a calculation detail table. Each material name should be listed on a separate line. The column titles should be: material, specification, quantity, unit, Carbon emission coefficient, source_database_or_EPD_ID, A1–A3_EC_kgCO2e, remarks, and calculate the total carbon emission value for each material.

(3): InfoSet-3 Plain Text—Includes building material data + area markings

You are a British expert in building life cycle assessment (LCA). Based solely on the following description, estimate the embodied carbon emissions of buildings A1–A3: The buildings are single-story supermarkets, using steel frames and composite facades. The building materials include various types and their corresponding weights. All the materials are sourced from the UK. Among them, the concrete types include Concrete, Cast-in-Place-C15 (14,674.397 kg), Default Mass Floor (1190.908 kg), Concrete-Cast-in-Place Concrete (97,024.061 kg), Concrete, Cast in Situ (4.813 kg), Concrete (9764.194 kg), Concrete-Precast Concrete-35 MPa (12,790.706 kg); the metal types cover Steel (4528.741 kg), QuadCore Trapezoidal Roof Panel_KS1000RW_ExternalWeatherSheet (19,720.414 kg), QuadCore Trapezoidal Roof Panel_KS1000RW_Internal LinerSheet (29,468.4 kg), Default Roof-metal single skin (43.79 kg), Metal-Steel 50-355 (20,982.768 kg), Metal-Steel-S275 (64,730.549 kg), PrepaintSteel_ArcelorMittal_Construction_HARULTRA-35-CORAL (27,644.39 kg), Cladding, Vertical Ribbed (18,124.461 kg), Metal Stud Layer (169,630.17 kg), Aluminium (2678.138 kg), PrepaintSteel_ArcelorMittal_Construction_INTERIEUR-12-WHITE (3213.457 kg); the insulation and support types include Default Roof-Generic insulation 125 mm (178.04 kg), ArcelorMittal-Mineral Wool (244,027.357 kg), Insulation/Support Frame (28.463 kg), Rock Wool (33,027.235 kg); the wall and finish types involve Default Wall (768.253 kg), Gypsum Wall Board (36,163.635 kg), Plaster (4467.497 kg), Brick, Common (25,933.089 kg), Paint-White Lining (0.557 kg); and there is also Glass (8286.045 kg).

Please conduct an automatic search for carbon emission factors. The implicit carbon calculation formula for buildings is: Implicit carbon = Material usage × Carbon factor. Please pay attention to the consistency of units. The result should be rounded to two decimal places.

Output: Generate a calculation detail table. Each material name should be listed on a separate line. The column titles should be: material, specification, quantity, unit, Carbon emission coefficient, source_database_or_EPD_ID, A1–A3_EC_kgCO2e, remarks, and calculate the total carbon emission value for each material.

(4): InfoSet-4 Structured Excel + Area Marking

You are a British expert in building life cycle assessment (LCA). Please estimate the embodied carbon emissions of buildings A1–A3 based on the following description: The building is a single-story supermarket with a steel frame and composite exterior walls in the UK, with multiple types of building materials and corresponding weights. I have uploaded an Excel file containing information about the building materials.

The formula for calculating embodied carbon is: Embodied Carbon = Material Usage × Carbon Factor. Please note the consistency of units. The result should be rounded to two decimal places.

Output: A calculation detail table should be generated, with each material name on a separate line. The column titles should be: material, specification, quantity, unit, Carbon Emission Coefficient, source_database_or_EPD_ID, A1–A3_EC_kgCO2e, remarks, and calculate the total carbon emission value of each material.

(5): InfoSet-5—Complete Package Information

You are an expert in building life cycle assessment (LCA) in the UK. Please estimate the embodied carbon emissions of buildings A1–A3 based on the following description: The building is a single-story supermarket with a steel frame and composite exterior walls in the UK, covering a total area of 2500 square meters. The building materials include multiple types and corresponding weights. I have uploaded an Excel file containing relevant information about the building materials.

The formula for calculating embodied carbon is: Embodied Carbon = Material Usage × Carbon Factor. Please note the consistency of units. The result should be rounded to two decimal places.

Output: Generate a calculation detail table, with each material name listed separately on a new line. The column titles should be: material, specification, quantity, unit, Carbon Emission Coefficient, source_database_or_EPD_ID, A1–A3_EC_kgCO2e, remarks, and calculate the total carbon emission value for each material.
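
The prompts above ask for a fixed set of column titles, the formula Embodied Carbon = Material Usage × Carbon Factor, and results rounded to two decimal places. As an illustration only, and not part of the authors' workflow, a returned table row could be checked against these requirements with a small Python sketch. The function names, the dictionary-based row format, the example row, and the half-up rounding mode are all assumptions made here.

```python
from decimal import Decimal, ROUND_HALF_UP

# Column titles requested in the Appendix A prompts (compared case-insensitively,
# because the prompts vary the capitalization of "Carbon emission coefficient").
REQUIRED_COLUMNS = [
    "material", "specification", "quantity", "unit",
    "Carbon emission coefficient", "source_database_or_EPD_ID",
    "A1–A3_EC_kgCO2e", "remarks",
]
QUANTITY = "quantity"
FACTOR = "Carbon emission coefficient".lower()
EC = "A1–A3_EC_kgCO2e".lower()


def round2(value):
    """Round to two decimal places (half-up is an assumption, not stated in the prompts)."""
    return Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def check_row(row):
    """Return a list of problems found in one table row given as a dict."""
    lowered = {key.lower(): value for key, value in row.items()}
    problems = [
        f"missing column: {col}"
        for col in REQUIRED_COLUMNS
        if col.lower() not in lowered
    ]
    if problems:
        return problems
    # Embodied Carbon = Material Usage x Carbon Factor, rounded to two decimals.
    expected = round2(Decimal(str(lowered[QUANTITY])) * Decimal(str(lowered[FACTOR])))
    reported = round2(lowered[EC])
    if expected != reported:
        problems.append(f"EC mismatch: expected {expected}, reported {reported}")
    return problems


# Hypothetical row, not taken from the paper's data.
example_row = {
    "material": "Example material",
    "specification": "n/a",
    "quantity": 1000.0,
    "unit": "kg",
    "Carbon emission coefficient": 0.25,
    "source_database_or_EPD_ID": "hypothetical",
    "A1–A3_EC_kgCO2e": 250.00,
    "remarks": "",
}
print(check_row(example_row))
```

A check of this kind only confirms internal consistency of a row; it does not say whether the carbon factor itself was matched correctly.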

References

1. Huang, Z.J.; Zhou, H.; Miao, Z.J.; Tang, H.; Lin, B.R.; Zhuang, W.M. Life-Cycle Carbon Emissions (LCCE) of Buildings: Implications, Calculations, and Reductions. Engineering 2024, 35, 115–139.
2. Farhan, S.A.; Shafiq, N.; Azizli, K.A.; Umar, U.A.; Gardezi, S.S.S. Embodied Carbon of Buildings: Tools, Methods and Strategies. In Proceedings of the 2nd International Conference on Civil, Offshore and Environmental Engineering (ICCOEE), Kuala Lumpur, Malaysia, 3–5 June 2014.
3. Peng, C.H. Calculation of a building’s life cycle carbon emissions based on Ecotect and building information modeling. J. Clean. Prod. 2016, 112, 453–465.
4. Erzurum, T.; Bettemir, Ö. Analysis of Embodied Energy and Carbon Emission of a Single Story Rural Area Structure. J. Polytech.-Politek. Derg. 2024, 27, 1565–1580.
5. Jiang, B.Z.; Zhang, H.; Li, Y.; Zhou, H.; Xiao, Z.; He, S.; Qiu, W.; Li, Y. A Practical Investigation of the Accuracy of Large Language Models in Various Industrial Application Scenarios. In Proceedings of the 1st International Workshop on IoT Datasets for Multi-modal Large Model, Hangzhou, China, 4–7 November 2024.
6. Pan, H.N.; Mudur, N.; Taranto, W.; Tikhanovskaya, M.; Venugopalan, S.; Bahri, Y.; Brenner, M.P.; Kim, E.-A. Quantum many-body physics calculations with large language models. Commun. Phys. 2025, 8, 49.
7. Rezgui, K. Large Language Models for Healthcare: Applications, Models, Datasets, and Challenges. In Proceedings of the 10th International Conference on Control, Decision and Information Technologies (CoDIT), Valletta, Malta, 1–4 July 2024.
8. Zhang, L.; Chen, Z.L. Opportunities of applying Large Language Models in building energy sector. Renew. Sustain. Energy Rev. 2025, 214, 115558.
9. Cang, Y.J.; Yang, L.; Luo, Z.; Zhang, N. Prediction of embodied carbon emissions from residential buildings with different structural forms. Sustain. Cities Soc. 2020, 54, 101946.
10. Gao, H.; Wang, X.; Wu, K.; Zheng, Y.; Wang, Q.; Shi, W.; He, M. A Review of Building Carbon Emission Accounting and Prediction Models. Buildings 2023, 13, 1617.
11. Chang, Y.J.; Yu, T.Y.; Chang, C.H. Evaluating the Performance of Open-Source LLMs in Local RAG Systems: A Practical Study on Low-Carbon Data Applications. In Proceedings of the Communications in Computer and Information Science, New Delhi, India, 24 May 2025.
12. Mohebbi, G.; Bahadori-Jahromi, A.; Ferri, M.; Mylona, A. The Role of Embodied Carbon Databases in the Accuracy of Life Cycle Assessment (LCA) Calculations for the Embodied Carbon of Buildings. Sustainability 2021, 13, 7988.
13. Mohan, G.B.; Kumar, R.P.; Krishh, P.V.; Keerthinathan, A.; Lavanya, G.; Meghana, M.K.U.; Sulthana, S.; Doss, S. An analysis of large language models: Their impact and potential applications. Knowl. Inf. Syst. 2024, 66, 5047–5070.
14. Chen, Z.Y.; Xu, L.; Zheng, H.; Chen, L.; Tolba, A.; Zhao, L.; Yu, K.; Feng, H. Evolution and Prospects of Foundation Models: From Large Language Models to Large Multimodal Models. CMC-Comput. Mater. Contin. 2024, 80, 1753–1808.
15. Saleh, Y.; Abu Talib, M.; Nasir, Q.; Dakalbab, F. Evaluating large language models: A systematic review of efficiency, applications, and future directions. Front. Comput. Sci. 2025, 7, 1523699.
16. Naik, D.; Naik, I.; Naik, N. Applications of AI Chatbots Based on Generative AI, Large Language Models and Large Multimodal Models. In Proceedings of the 2024 International Conference on Computing, Communication, Cybersecurity and AI, London, UK, 3–4 July 2024.
17. Dai, Z.Q. Applications and Challenges of Large Language Models in Smart Government—From technological Advances to Regulated Applications. In Proceedings of the 3rd International Conference on Frontiers of Artificial Intelligence and Machine Learning (FAIML), College of Computer and Information Technology, Yichang, China, 26–28 April 2024.
18. Santoro, J.F.; Kripka, M. Evaluation of CO₂ emissions in RC structures considering local and global databases. Innov. Infrastruct. Solut. 2024, 9, 33.
19. Luo, Z.X.; Yang, L.; Liu, J.P. Embodied carbon emissions of office building: A case study of China’s 78 office buildings. Build. Environ. 2016, 95, 365–371.
20. Aparna, K.; Baskar, K. Scientometric analysis and panoramic review on life cycle assessment in the construction industry. Innov. Infrastruct. Solut. 2024, 9, 96.
21. Zhai, Y.K.; Li, Y.; Tang, S.; Liu, Y.; Liu, Y. Lightweight Strategies for Wooden-Structure Buildings Based on Embodied Carbon Emission Calculations for Carbon Reduction. Buildings 2024, 14, 3460.
22. Robati, M.; Daly, D.; Kokogiannakis, G. A method of uncertainty analysis for whole-life embodied carbon emissions (CO₂-e) of building materials of a net-zero energy building in Australia. J. Clean. Prod. 2019, 225, 541–553.
23. Su, S.; Zang, Z.; Yuan, J.; Pan, X.; Shan, M. Considering critical building materials for embodied carbon emissions in buildings: A machine learning-based prediction model and tool. Case Stud. Constr. Mater. 2024, 20, e02887.
24. Rodrigues, F.; Isayeva, A.; Rodrigues, H.; Pinto, A. Energy efficiency assessment of a public building resourcing a BIM model. Innov. Infrastruct. Solut. 2020, 5, 41.
25. Cang, Y.J.; Luo, Z.; Yang, L.; Han, B. A new method for calculating the embodied carbon emissions from buildings in schematic design: Taking “building element” as basic unit. Build. Environ. 2020, 185, 107306.
26. Jiang, X. Prediction method of carbon emissions of intelligent buildings based on secondary decomposition BAS-LSTM. Clean Technol. Environ. Policy 2025, 27, 1903–1913.
27. Zheng, Y.; Li, J.; Wang, S.; Ying, D.; Chew, B.C. Research on the Prediction Model of Green Building Carbon Emission Based on Computer Big Data. In Proceedings of the 2024 International Conference on Telecommunications and Power Electronics, TELEPE 2024, Frankfurt, Germany, 29–31 May 2024.
28. Xie, Q.M.; Jiang, Q.; Kurnitski, J.; Yang, J.; Lin, Z.; Ye, S. Quantitative Carbon Emission Prediction Model to Limit Embodied Carbon from Major Building Materials in Multi-Story Buildings. Sustainability 2024, 16, 5575.
29. Li, L. Research on Low Carbon Building Technology System and Carbon Emission Measurement Method Based on Neural Network. In Proceedings of the ACM International Conference Proceeding Series, Nantes, France, 12–15 June 2023.
30. Gu, X.R.; Chen, C.; Fang, Y.; Mahabir, R.; Fan, L. CECA: An intelligent large-language-model-enabled method for accounting embodied carbon in buildings. Build. Environ. 2025, 272, 112694.
31. Liu, M.; Zhang, L.; Chen, J.; Chen, W.-A.; Yang, Z.; Lo, L.J.; Wen, J.; O’Neill, Z. Large language models for building energy applications: Opportunities and challenges. Build. Simul. Int. J. 2025, 18, 225–234.
32. Zhou, L.; Yan, S.; Li, Z.; Ma, J. Exploring the Application of Retrieval-Augmented Generation Technology in Defense Technology Intelligence. In Proceedings of the 2024 International Annual Conference on Complex Systems and Intelligent Science (CSIS-IAC), Guangzhou, China, 20–22 September 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 664–669.

Figure 1. 3D model of the Lidl standard supermarket design [12].

Figure 2. The scoring results for each InfoSet.

Figure 3. Building carbon emission calculation workflow.

Table 1. Embodied carbon emissions of Lidl standard supermarket (A1–A3).

Material Name	Material Weight (kg)	Carbon Emission Coefficient (kgCO2e/kg)	Embodied Carbon Emissions (kgCO2e)
Concrete, Cast-in-Place-C15	14,674.40	0.10	1511.46
Default Mass Floor	1190.91	0.10	122.66
Concrete-Cast-in-Place Concrete	97,024.06	0.10	9993.48
Concrete, Cast in Situ	4.81	0.10	0.50
Concrete	9764.19	0.10	1005.71
Concrete-Precast Concrete-35 MPa	12,790.71	0.28	3581.40
Steel	4528.74	3.02	13,676.80
QuadCore Trapezoidal Roof Panel_KS1000RW_External Weather Sheet	19,720.41	3.29	64,899.88
QuadCore Trapezoidal Roof Panel_KS1000RW_Internal LinerSheet	29,468.40	3.06	90,173.30
Default Roof-metal Single Skin	43.79	3.06	134.00
Metal-Steel 50-355	20,982.77	2.45	51,407.78
Metal-Steel-S275	64,730.55	2.45	158,589.85
PrepaintSteel_ArcelorMittal_Construction_HARULTRA-35-CORAL	27,644.39	3.06	84,591.83
Cladding, Vertical Ribbed	18,124.46	3.29	59,647.60
Metal Stud Layer	169,630.17	2.97	502,953.45
Aluminium	2678.14	1.71	4568.90
Default Roof-Generic Insulation 125 mm	178.04	1.44	256.38
ArcelorMittal-Mineral Wool	244,027.36	0.74	179,360.11
Insulation/Support Frame	28.46	0.74	21.06
Rock Wool	33,027.24	1.44	47,559.22
Default Wall	768.25	0.39	299.62
Gypsum Wall Board	36,163.64	0.39	14,103.82
Plaster	4467.50	0.39	1742.32
Brick, Common	25,933.09	0.21	5523.75
PrepaintSteel_ArcelorMittal_Construction_INTERIEUR-12-WHITE	3213.46	15.40	49,487.24
Paint-White Lining	0.56	2.33	1.30
Glass	8286.05	1.67	13,812.84
Total	849,094.53	-	1,359,026.26
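
The per-material values in Table 1 follow the relation Embodied Carbon = Material Usage × Carbon Factor. A minimal Python sketch of that arithmetic is given below for illustration; the function names are hypothetical, and the coefficient unit (kgCO2e per kg) is inferred from the table's weight and emission columns. Because the coefficients in Table 1 are displayed to two decimal places, products recomputed from the displayed values can differ somewhat from the listed emissions.

```python
def embodied_carbon(weight_kg, factor_kgco2e_per_kg):
    """Embodied carbon of one material: material usage times carbon factor."""
    return weight_kg * factor_kgco2e_per_kg


def summarize(materials):
    """Return per-material rows and the overall total, rounded to two decimals.

    `materials` is a list of (name, weight_kg, factor) tuples.
    """
    rows = [
        (name, weight, factor, round(embodied_carbon(weight, factor), 2))
        for name, weight, factor in materials
    ]
    # Sum the unrounded products, then round once at the end.
    total = round(sum(embodied_carbon(w, f) for _, w, f in materials), 2)
    return rows, total


# Three rows copied from Table 1 (weight in kg, displayed coefficient).
sample = [
    ("Steel", 4528.74, 3.02),
    ("Metal-Steel-S275", 64730.55, 2.45),
    ("Glass", 8286.05, 1.67),
]

rows, total = summarize(sample)
for name, weight, factor, ec in rows:
    print(f"{name}\t{weight}\t{factor}\t{ec}")
print(f"Subtotal for these rows\t{total}")
```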

Table 2. Information input combination.

Number	Material Quantity	ECF	Regional Data Source Indication
InfoSet 1	No	No	No
InfoSet 2	Yes, in plain text	No	No
InfoSet 3	Yes, in plain text	No	Yes, the prompt states the region where the building is located.
InfoSet 4	Yes, in Excel	No	Yes, the prompt states the region where the building is located.
InfoSet 5	Yes, in Excel	Yes	Yes, the prompt states the region where the building is located.

Table 3. Summary of evaluation indicators.

Indicator Type	Indicator Name	Symbol	Formula	Eq.
Factor Matching Indicator	FM Score	FM_i	$FM_{i} = \frac{\text{Number of correctly matched factors}}{\text{Total factor quantity}}$	(1)
Error Correlation Indicator	Total ECE Relative Error	δ_i	$\delta_{i} = \frac{\left| EC_{\mathrm{model}-i} - EC_{\mathrm{ref}} \right|}{EC_{\mathrm{ref}}} \times 100\%$	(2)
Discrepancy Index	Prediction Accuracy	PA_i	$PA_{i} = 1 - \delta_{i}$	(3)
	Standard Deviation	σ	$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( EC_{\mathrm{model}-i} - EC_{\mathrm{model-avg}} \right)^{2}}$	(4)
	Coefficient of Variation	CV	$CV = \frac{\sigma}{EC_{\mathrm{model-avg}}} \times 100\%$	(5)
	Stability	S	$S = 1 - CV$	(6)
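
As a rough illustration of how the indicators in Table 3 fit together, Equations (1)–(6) can be sketched in Python as follows. The function names and the example run values are hypothetical. The sketch expresses the relative error and the coefficient of variation as fractions rather than percentages, which is an assumption made here so that PA = 1 − δ and S = 1 − CV are unit-consistent. The reference value is the Table 1 total.

```python
import math


def fm_score(n_correct, n_total):
    """Eq. (1): share of carbon factors that were matched correctly."""
    return n_correct / n_total


def relative_error(ec_model, ec_ref):
    """Eq. (2): relative error of the total ECE, returned as a fraction."""
    return abs(ec_model - ec_ref) / ec_ref


def prediction_accuracy(ec_model, ec_ref):
    """Eq. (3): PA = 1 - delta."""
    return 1 - relative_error(ec_model, ec_ref)


def stability(ec_runs):
    """Eqs. (4)-(6): S = 1 - CV, using the 1/N (population) standard deviation."""
    n = len(ec_runs)
    avg = sum(ec_runs) / n
    sigma = math.sqrt(sum((ec - avg) ** 2 for ec in ec_runs) / n)
    cv = sigma / avg
    return 1 - cv


EC_REF = 1_359_026.26  # total A1-A3 embodied carbon from Table 1, kgCO2e

# Hypothetical totals from repeated runs of one model; not results from the paper.
runs = [1_300_000.0, 1_400_000.0, 1_350_000.0]

for ec in runs:
    print(f"delta = {relative_error(ec, EC_REF):.4f}, "
          f"PA = {prediction_accuracy(ec, EC_REF):.4f}")
print(f"S = {stability(runs):.4f}")
```

Multiplying the returned fractions by 100 recovers the percentage form shown in Equations (2) and (5).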

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zou, Y.; Zheng, R.; Xia, J. Enhancing Embodied Carbon Calculation in Buildings: A Retrieval-Augmented Generation Approach with Large Language Models. Buildings 2025, 15, 3449. https://doi.org/10.3390/buildings15193449

