1. Introduction
Artificial intelligence (AI) systems have become deeply embedded in scientific research, industry, and public-sector decision-making, driving advances across domains such as healthcare, finance, transportation, and climate science [1,2,3]. Recent progress has been fueled by the rapid scaling of machine learning models, increased availability of data, and the proliferation of specialized computational hardware [4]. However, this expansion has also raised growing concerns regarding the energy consumption and carbon footprint associated with the development, deployment, and large-scale use of AI systems [5,6].
Early studies of AI-related environmental impacts primarily focused on the substantial energy required to train large models, particularly in deep learning and natural language processing [7,8]. More recent work, however, has demonstrated that the environmental footprint of AI extends far beyond training alone. Inference-phase computation, continuous deployment in user-facing applications, and embodied emissions from hardware manufacturing and turnover often represent comparable or even dominant sources of impact [9,10]. As AI systems increasingly operate at a global scale and in real-time environments, their cumulative contribution to energy demand and greenhouse-gas emissions has become a non-negligible component of the broader digital carbon footprint.
In response, the emerging field of Green AI has promoted energy-aware model design, improved measurement and reporting practices, and optimization techniques aimed at reducing per-task energy consumption [11,12,13]. While these efforts have advanced awareness and methodological rigor, they largely remain fragmented. Existing studies tend to focus on isolated components of the AI lifecycle, rely on heterogeneous system boundaries and metrics, or emphasize voluntary disclosure rather than enforceable or adaptive mitigation. As a result, it remains difficult to translate empirical findings into coordinated, system-level strategies for reducing aggregate environmental impact.
This limitation is increasingly problematic given the evolving socio-technical context in which AI systems operate [14]. Energy systems are transitioning toward higher shares of renewable generation, introducing temporal and geographic variability in electricity carbon intensity. AI infrastructures are geographically distributed across heterogeneous hardware platforms, and efficiency gains are frequently offset by rebound effects, whereby reduced per-task energy costs lead to expanded deployment and higher total emissions [15,16,17]. These dynamics suggest that sustainability challenges in AI are not static optimization problems, but adaptive control problems unfolding under uncertainty.
Against this backdrop, there is a clear need for work that goes beyond cataloguing impacts or proposing isolated efficiency improvements [13,18]. Specifically, the literature lacks (i) a systematic synthesis that jointly examines operational energy use, carbon emissions, and embodied life-cycle impacts across the AI lifecycle, and (ii) conceptual frameworks that explicitly translate these empirical insights into coordinated, adaptive mitigation mechanisms. To date, no prior study integrates these dimensions while linking sustainability assessment to control-oriented system design.
The contribution of this work is therefore twofold. First, we conduct a systematic review that synthesizes empirical and analytical evidence on the energy consumption and carbon footprint of contemporary AI systems across their lifecycle, identifying key drivers, methodological limitations, and sources of variability. Second, building directly on the patterns identified in this review, we propose a conceptual, system-level framework that illustrates how sustainability constraints could be operationalized within AI systems through coordinated, carbon-aware control mechanisms under dynamic and uncertain conditions.
By explicitly connecting empirical evidence to a hypothesis-driven control framework, this study advances the literature beyond descriptive assessment toward an integrated research agenda for sustainable and climate-resilient AI development and deployment.
3. Materials and Methods
3.1. Research Question and PICOS Framework
This systematic review was registered in PROSPERO (registration number: CRD420251276494) and conducted in accordance with PRISMA 2020 guidelines [29] and AMSTAR-2 [30] methodological standards (Figure 1). A completed PRISMA checklist is provided as Supplementary File S1. The final literature search was completed on 19 December 2025. The research question was formulated using the PICOS framework to ensure methodological rigor and relevance to the environmental assessment of artificial intelligence (AI) systems. Specifically, we aimed to evaluate whether the development, training, deployment, and large-scale use of AI and machine learning models (Population/Exposure) are associated with increased energy consumption and carbon emissions (Outcomes), compared with alternative computational configurations, deployment strategies, or baseline scenarios reported in the literature (Comparator). Eligible studies assessed at least one dimension of the environmental impact of AI, including operational energy consumption, computational cost, carbon footprint, greenhouse-gas emissions, or life-cycle emissions associated with AI hardware, software pipelines, or deployment infrastructures. We considered empirical measurement studies, life-cycle assessments, modelling and simulation studies, comparative experimental benchmarks, and systematic or exploratory reviews with quantitative or analytical components (Study design). Studies focused exclusively on AI applications without assessment of energy use or environmental impact, as well as purely conceptual or narrative work lacking empirical or analytical evaluation, were excluded.
Subgroup analyses considered AI system characteristics (e.g., model type, scale, and application domain), computational context (e.g., cloud-based vs. on-device deployment, hardware configuration), methodological approach (e.g., direct energy measurement, modelling, life-cycle assessment), and geographic context of electricity generation as potential sources of heterogeneity. Through this comprehensive approach, the review aimed to synthesize current evidence on the energy and carbon implications of contemporary AI systems, identify key drivers of environmental impact across the AI lifecycle, and inform the development of energy-aware, transparent, and sustainable AI practices.
3.2. Eligibility Criteria
Studies were excluded if they met any of the following conditions: (i) commentary articles, policy notes, or purely conceptual papers without empirical data or analytical assessment; (ii) review articles (systematic or narrative), editorials, or conference abstracts lacking original quantitative results; (iii) duplicate publications derived from the same dataset, experimental setup, modelling framework, or computational pipeline; or (iv) studies judged to provide insufficient methodological detail, non-transparent assumptions, or a high risk of bias that precluded reliable interpretation of results.
Additional exclusions included studies that did not quantify the environmental impact of artificial intelligence systems, such as those reporting AI applications without measurable energy consumption, computational cost, or carbon footprint outcomes. Analyses focused exclusively on climate or sustainability domains where AI served solely as a predictive or analytical tool, without assessment of the energy or emissions associated with AI development, training, inference, or deployment, were also excluded. Furthermore, studies lacking comparative evaluation (e.g., absence of comparisons across model scales, hardware configurations, software pipelines, deployment contexts, or baseline versus optimized scenarios) were not considered eligible, as they did not allow meaningful synthesis of drivers influencing energy use or carbon emissions.
3.3. Information Sources
A comprehensive and systematic literature search was conducted using major scientific databases, including Web of Science, PubMed, and Scopus, without restrictions on publication date or language. These databases were selected to ensure broad coverage of the scientific literature related to artificial intelligence, machine learning, and deep learning, as well as studies assessing energy consumption, computational cost, carbon footprint, and greenhouse-gas emissions associated with AI systems.
To complement the database searches, the reference lists of all eligible articles were manually screened to identify additional relevant studies not captured by the initial search strategy, including foundational methodological papers, life-cycle assessment studies, and domain-specific applications evaluating the environmental impact of AI. Gray literature sources and non-peer-reviewed reports were not systematically searched and were included only when cited within eligible peer-reviewed articles and deemed necessary to contextualize methodological frameworks or reporting standards.
3.4. Search Methods for Identification of Studies
The search strategy combined controlled vocabulary and free-text terms related to artificial intelligence and its environmental impacts, with a specific focus on energy consumption and carbon emissions. Search terms covered AI systems and modelling approaches, energy and computational cost metrics, carbon footprint and greenhouse-gas emissions, and climate and sustainability contexts. Search strings were adapted to the syntax and field requirements of each database using Boolean operators and truncation to ensure comprehensive retrieval of relevant studies. The search query was constructed by combining the following terms using Boolean operators: ((“artificial intelligence” OR “machine learning” OR “deep learning” OR “large-scale AI” OR “foundation model” OR “large language model” OR LLM*) AND (“energy consumption” OR “energy footprint” OR “computational cost” OR “training cost” OR “inference cost” OR “electricity consumption”) AND (“carbon footprint” OR “carbon emission*” OR “greenhouse gas emission*” OR “CO2 emission*” OR “carbon cost”) AND (“climate change” OR “climate mitigation” OR “environmental sustainability”)). The complete search strategies for all databases are provided in Supplementary File S2.
Two reviewers independently assessed study eligibility at both the title and abstract screening stage and the full-text review stage. Disagreements were resolved through discussion and consensus, with consultation of a third reviewer when necessary. No language restrictions were applied.
3.5. Data Extraction and Data Items
Two authors independently extracted data from all eligible studies. For each included article, key characteristics were recorded, including first author, year of publication, study scope or application domain, type of artificial intelligence system assessed, and study design. Study designs included empirical measurement studies, life-cycle assessments, comparative experimental analyses, modelling and simulation studies, and systematic or exploratory reviews with quantitative or analytical components. Discrepancies in data extraction were resolved through discussion and consensus. Record management, including duplicate removal and eligibility tracking, was performed using Microsoft Excel®.
Primary variables extracted included indicators of computational and energy demand, such as energy consumption, electricity use, execution time, computational cost, and hardware utilization (Central Processing Unit (CPU), Graphics Processing Unit (GPU), or accelerator usage), as well as reported measures of energy efficiency. Carbon-related variables included carbon footprint, greenhouse-gas emissions, carbon dioxide equivalent (CO2eq) emissions, and life-cycle emissions where available. When reported, information on electricity grid characteristics or emission factors used to estimate carbon emissions was also recorded.
Additional data items captured methodological characteristics relevant to environmental assessment, including the type of measurement or estimation approach (direct energy logging, modelling, or life-cycle assessment), system boundaries considered (operational-only versus cradle-to-grave), and the computational context (cloud-based, on-device, or hybrid deployment). Where applicable, comparative dimensions such as alternative model configurations, software pipelines, hardware platforms, or deployment scenarios were extracted.
Further information included temporal scope, scale of analysis, uncertainty assessment methods, and author-reported limitations. Potential sources of bias (such as reliance on single experimental setups, lack of comparative baselines, incomplete reporting of assumptions, or undisclosed funding or conflicts of interest) were documented to support interpretation of heterogeneity and robustness across studies.
3.6. Quality Appraisal and Risk of Bias Assessment
The methodological quality and potential risk of bias of the included studies were assessed using a structured appraisal framework informed by AMSTAR-2 principles and established guidance for environmental and computational impact assessments. As AMSTAR-2 is designed to evaluate systematic reviews rather than primary studies, it was used to guide the rigor and transparency of the review process, while study-level quality was evaluated using operational criteria tailored to the objectives of this synthesis.
Specifically, included studies were assessed according to: (i) transparency of energy measurement or estimation methods; (ii) clarity and justification of carbon intensity factors or emission conversion assumptions; (iii) definition of system boundaries (operational-only versus life-cycle assessment); and (iv) sufficiency of methodological detail to enable reproducibility. Studies relying primarily on indirect modelling assumptions without validation, or lacking explicit boundary definitions, were considered at higher risk of bias.
4. Results
To improve analytical clarity and comparability across the heterogeneous literature, the results are organized using a standardized categorization framework that distinguishes between energy use and carbon emissions, as well as between operational and embodied sources of impact. Energy use refers to the direct electricity consumption associated with AI computation, typically reported in kilowatt-hours (kWh) or related units, whereas carbon emissions represent the climate-relevant impacts derived by converting energy use into CO2eq emissions using electricity grid–specific emission factors. Operational impacts encompass energy use and emissions arising from model training and inference during system operation, while embodied impacts capture life-cycle emissions associated with hardware manufacturing, transportation, maintenance, and end-of-life processes. All results reported below are interpreted within this four-category framework, enabling a consistent synthesis of methodological approaches and reported outcomes across studies.
This section is intentionally limited to a descriptive synthesis of the evidence reported in the included studies. No new conceptual models, prescriptive frameworks, or system-level design proposals are introduced here. Interpretative integration and the proposed conceptual framework are developed separately in Section 5 and Section 6.
4.1. Study Selection
A total of 679 records were identified through database searches, including Web of Science (n = 268), PubMed (n = 27), and Scopus (n = 384) (Figure 1). After removal of duplicates and title and abstract screening, 547 records were excluded for not addressing the energy consumption, carbon footprint, or environmental impacts of artificial intelligence systems; lacking quantitative, life-cycle, or comparative analysis; or being commentaries, conceptual papers, or narrative reviews. Subsequently, 132 full-text articles were assessed for eligibility. Of these, 123 full-text articles were excluded for the following main reasons: absence of quantitative assessment of energy consumption or carbon emissions (n = 46); focus on AI applications without explicit evaluation of environmental impact (n = 32); lack of comparative or life-cycle analysis (n = 25); insufficient methodological transparency or incomplete reporting of assumptions (n = 12); and high risk of bias or non-reproducible results (n = 8). One additional study was identified through manual screening of reference lists. In total, 10 studies met the inclusion criteria and were included in the qualitative synthesis [9,31,32,33,34,35,36,37,38,39].
4.2. Methodological Quality of Included Studies
Of the 10 included studies, 6 relied primarily on direct energy measurements using CPU/GPU monitoring or execution-time logging, while 4 estimated energy use through modelling approaches. Explicit electricity grid emission factors were reported in 7 studies, whereas 3 relied on generic or assumed conversion factors. Life-cycle assessment boundaries extending beyond operational energy use were clearly defined in 4 studies, while the remaining studies focused on operational emissions only. Overall, the included studies exhibited moderate methodological quality, with the most common limitations related to incomplete life-cycle boundary specification and heterogeneous carbon accounting approaches.
4.3. Study Characteristics
Table 1 summarizes the principal characteristics of the studies included in this synthesis, which examine the energy consumption and carbon footprint of artificial intelligence (AI) systems across diverse computational contexts and application domains. The included evidence spans methodological frameworks, empirical measurement studies, life-cycle assessments, comparative experimental analyses, modelling studies, and systematic or exploratory reviews. Together, these studies address a wide range of AI systems, including large-scale machine learning experiments, large language models, deep neural networks deployed in cloud and on-device environments, AI hardware accelerators, data-processing pipelines, and domain-specific applications such as healthcare and cybersecurity.
Methodologically, the body of evidence encompasses direct energy measurements obtained from CPU and GPU monitoring, execution-time analysis, and workload profiling, as well as modelling approaches that convert energy use into CO2eq emissions using region-specific electricity grid intensities. Several studies adopt a life-cycle perspective, extending the analysis beyond operational energy consumption to include embodied emissions associated with hardware manufacturing and infrastructure. Comparative studies evaluate alternative software frameworks, pipelines, and system configurations, demonstrating how technical choices can substantially influence energy efficiency and carbon outcomes. Review and framework-based studies synthesize algorithmic, hardware-level, and training-level strategies aimed at reducing the environmental impact of AI, while also highlighting governance mechanisms such as auditing, reporting, and sustainability budgeting.
Across the included studies, the primary objectives are to quantify the environmental costs associated with AI development and deployment, identify key drivers of energy demand and emissions, and explore mitigation strategies that balance computational performance with sustainability considerations. Collectively, the evidence indicates that modern AI systems, particularly large-scale and data-intensive models, can entail substantial energy use and carbon footprints. At the same time, the synthesis highlights considerable variability across systems and contexts, underscoring the potential for targeted technical and organizational interventions to reduce the environmental impact of AI.
As summarized in Table 1, the functional units used to report energy consumption and carbon emissions varied substantially across studies, reflecting differences in system boundaries, analytical scope, and deployment context. Reported units included energy or emissions per training run, per inference workflow, per operational period, or across defined life-cycle stages. Given this heterogeneity, results were interpreted within their respective functional contexts rather than normalized to a single unit, as forced normalization would require additional assumptions and could reduce interpretability. This variability highlights a current limitation in cross-study comparability and underscores the need for standardized functional units in future assessments of AI-related environmental impacts.
4.4. Outcomes
To synthesize the diverse empirical findings across the reviewed studies, Figure 2 provides a comparative breakdown of environmental impacts. Figure 2A highlights the shift from training-dominant footprints in early-stage development to inference and hardware-dominant (embodied) footprints in large-scale deployments. Figure 2B illustrates the critical role of geographic context, showing that identical energy demands result in vastly different carbon outcomes depending on the local electricity grid’s carbon intensity (CO2eq/kWh).
4.4.1. Energy Consumption and Computational Demand of AI Systems
This subsection synthesizes evidence on operational energy use, focusing on the electricity consumed during AI model training and inference, independent of location-specific carbon intensity or life-cycle conversion assumptions.
Across the included studies, energy consumption emerges as a central and consistently reported outcome, highlighting the substantial computational demand associated with contemporary AI systems. Empirical measurement frameworks demonstrate that both training and inference phases contribute meaningfully to overall energy use, although their relative importance varies according to model scale, deployment frequency, and system architecture. Henderson et al. [9] provide one of the most influential methodological contributions by introducing standardized approaches to log energy consumption during machine learning experiments, revealing large variability in GPU- and CPU-related energy use even for comparable tasks. Their findings underscore that experimental design choices, such as hyperparameter tuning strategies and repeated model retraining, can dramatically increase energy demand.
The disproportionate computational burden imposed by large-scale models, specifically deep neural networks and Large Language Models (LLMs), has led to a critical re-evaluation of the AI energy lifecycle. While the “Green AI” discourse historically focused on the massive energy spikes during model training, recent empirical evidence suggests a significant shift toward inference-phase dominance. Jiang et al. [32] demonstrate that LLM-powered systems incur cumulative energy costs across multiple interaction stages, highlighting that under high-frequency, real-world deployment, sustained inference demand can match or even exceed the initial training expenditure.
This transition from training-centric to inference-centric impact has ignited a rigorous scholarly debate concerning measurement methodologies and system boundaries. Currently, the literature exhibits a methodological dichotomy. On one hand, studies utilizing direct hardware-level logging, such as those employing Running Average Power Limit (RAPL) or NVIDIA Management Library (NVML) sensors, provide high-precision, real-time data (e.g., Henderson et al. [9]). While these methods offer granular accuracy, they are often restricted to local hardware environments and frequently fail to account for the complex energy overheads of shared cloud infrastructures. Conversely, software-based estimation frameworks offer scalability by utilizing Floating Point Operation (FLOP)-to-energy conversion models. However, these estimations often overlook the “energy tail” of inference—encompassing data movement, cooling requirements, and network latency—resulting in a systematic underestimation of the total environmental footprint in live production environments.
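To make the measurement distinction concrete, the following minimal Python sketch illustrates NVML-style power sampling of a single GPU via the pynvml bindings; the sampling interval, measurement window, and single-device assumption are illustrative choices rather than the instrumentation used in the reviewed studies.

```python
import time

import pynvml


def sample_gpu_energy_kwh(duration_s: float, interval_s: float = 1.0) -> float:
    """Estimate GPU energy use by integrating sampled power draw over a window."""
    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first visible GPU
        joules = 0.0
        elapsed = 0.0
        while elapsed < duration_s:
            power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # NVML reports milliwatts
            joules += power_w * interval_s                             # rectangle-rule integration
            time.sleep(interval_s)
            elapsed += interval_s
        return joules / 3.6e6  # joules -> kWh
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    # Sample for one minute while a workload runs elsewhere on the same GPU.
    print(f"Estimated GPU energy: {sample_gpu_energy_kwh(60.0):.6f} kWh")
```

Sampling-based logging of this kind captures only device-level draw; the shared-cloud, cooling, and networking overheads discussed above fall outside its measurement boundary.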
Beyond measurement techniques, the debate centers on where the “computational boundary” of an AI system should be drawn. Proponents of operational boundaries argue that isolating per-inference energy provides the most actionable data for developers seeking immediate algorithmic optimizations. However, a growing body of scholarship led by proponents of life-cycle assessment (LCA), such as Schneider et al. [37], contends that such narrow boundaries are insufficient. They argue that the environmental shift toward inference is further exacerbated by the embodied emissions of specialized hardware (e.g., high-performance GPUs and Tensor Processing Units (TPUs)) required for low-latency deployment. These embodied impacts are rarely captured in standard “Green AI” training logs, suggesting that current metrics, which favor training-phase transparency, may be structurally incapable of governing the cumulative, distributed energy demand of AI at a global scale.
This indicates that while training is energy-intensive, the cumulative demand of inference in production environments is the primary driver of long-term energy consumption (see Figure 2A). The software-based estimation methods often used in the literature may still struggle to capture the full ‘energy tail’ of these distributed inference phases.
Comparative experimental analyses provide additional insight into how software and pipeline choices influence computational efficiency. Mekouar et al. [33] benchmark data-processing frameworks commonly used in Green AI workflows and show that execution time, CPU utilization, and total energy consumption differ substantially across libraries, even when performing identical tasks. Such results indicate that energy consumption is not solely determined by model architecture but is also shaped by upstream data handling and software engineering decisions. Similarly, Boumendil et al. [31], through a systematic survey, identify algorithmic design, batch size, precision settings, and parallelization strategies as key determinants of energy efficiency in deep learning systems.
On-device deployment scenarios further illustrate the importance of context. The survey on on-device deep learning highlights how resource-constrained environments necessitate energy-aware model design, emphasizing trade-offs between accuracy, latency, and power consumption. Collectively, the evidence demonstrates that energy consumption is a multi-dimensional outcome influenced by model characteristics, computational workflows, and deployment conditions. Rather than being an inherent property of AI, computational demand reflects a series of technical and design choices, many of which remain underexplored in mainstream performance-driven research.
It should be noted that only a subset of the included studies relies on continuous, real-world inference measurements, while others estimate inference-related energy use through scaling assumptions, which introduces uncertainty into cross-study comparisons.
Critically, the current Green AI paradigm is often limited by “carbon tunnel vision”. While carbon footprint (CO2eq) is the primary metric reported, it fails to capture the multi-dimensional environmental impact of AI. Scholars are increasingly calling for the inclusion of:
Water Footprint: The massive freshwater requirements for cooling the data centers that host large-scale model training and inference.
Electronic Waste (E-waste): The environmental cost of specialized AI hardware (e.g., GPUs and TPUs), which has a high turnover rate due to rapid technological shifts.
The contrast between these metrics reveals a fundamental trade-off: carbon-centric metrics are easier to standardize but may inadvertently encourage “efficiency” gains that lead to higher hardware turnover or increased water usage, thus shifting rather than reducing the total environmental burden.
4.4.2. Carbon Footprint and Greenhouse-Gas Emissions
Beyond raw energy consumption, the reviewed literature identifies carbon footprint as the primary metric linking AI systems to broader climate impacts. This subsection distinguishes between operational emissions, derived from energy use under region-specific electricity mixes, and embodied emissions, which arise from hardware manufacturing, infrastructure, and end-of-life processes. A central finding across the included studies is that the carbon footprint of an AI system is not a static attribute of its model architecture, but rather a dynamic function of the local energy infrastructure. As illustrated in Figure 2B, identical energy requirements result in vastly different CO2eq outcomes depending on the carbon intensity of the regional electricity grid. This variability highlights the inherent limitations of “energy-only” metrics for assessing environmental sustainability.
Methodologically, carbon emissions are quantified by converting energy demand into CO2eq using grid-specific emission factors; however, substantial heterogeneity remains in how these factors are applied. Henderson et al. [9] emphasize this contextual dependence, demonstrating that the geographic location of a data center can be as significant as the algorithmic efficiency itself in determining the final environmental cost. Consequently, achieving “Green AI” requires a shift in focus from purely computational optimization to the strategic, carbon-aware geographic placement of training and inference workloads.
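The conversion itself is straightforward; the sensitivity lies in the emission factor. The following short Python sketch illustrates the calculation, using placeholder grid intensities chosen only for illustration (they are not values reported by the included studies):

```python
def operational_emissions_kgco2eq(energy_kwh: float, grid_intensity_gco2_per_kwh: float) -> float:
    """Convert operational energy use into CO2-equivalent emissions (kg)."""
    return energy_kwh * grid_intensity_gco2_per_kwh / 1000.0  # grams -> kilograms


# Placeholder grid intensities in gCO2eq/kWh (illustrative, not study-reported values).
grids = {"low-carbon grid": 50.0, "mixed grid": 400.0, "coal-heavy grid": 800.0}

training_run_kwh = 1_000.0  # hypothetical training workload
for name, intensity in grids.items():
    emissions = operational_emissions_kgco2eq(training_run_kwh, intensity)
    print(f"{name}: {emissions:.0f} kgCO2eq")
```

Under these placeholder factors, the same 1,000 kWh workload spans more than an order of magnitude in estimated emissions, which is precisely the contextual dependence emphasized above.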
Life-cycle–oriented analyses substantially extend this perspective by accounting for emissions beyond operational energy use. Schneider et al. [37] provide a comprehensive LCA of AI hardware accelerators, showing that embodied emissions from manufacturing, transport, and end-of-life phases can equal or exceed operational emissions over the lifespan of the hardware. These findings challenge narrow operational assessments and underscore the importance of cradle-to-grave accounting when evaluating the true carbon cost of AI systems.
Jiang et al. [32] further reinforce this systemic view by examining the life-cycle energy and carbon implications of LLM-powered chatbots. Their analysis reveals that emissions accumulate across development, deployment, and user interaction phases, with scaling effects that are often underestimated in conventional assessments. Importantly, the study demonstrates that even modest per-interaction emissions can translate into substantial carbon costs when deployed at a global scale.
Domain-specific modelling studies illustrate how these impacts may manifest in applied settings. Vafaei Sadr et al. [39] estimate CO2eq emissions associated with deep learning inference in digital pathology workflows, showing that routine clinical deployment could generate non-negligible emissions at scale. Similarly, Sarkodie et al. [36], while focused on blockchain-related systems, provide relevant insights into the carbon intensity of energy-intensive digital infrastructures, offering a useful parallel for understanding AI-driven computational systems.
Across the corpus, a consistent signal emerges: AI-related carbon emissions are highly variable but potentially substantial, particularly for large-scale and continuously operating systems. The evidence highlights the need for standardized reporting practices and transparent assumptions to ensure comparability across studies. Without such harmonization, carbon footprint estimates risk underrepresenting the true climate impact of AI technologies.
4.4.3. Mitigation and Optimization Strategies
A third outcome domain concerns mitigation strategies aimed at reducing the energy and carbon intensity of AI systems without compromising performance. Several studies identify algorithmic, hardware-level, and software-based optimization techniques as promising avenues for impact reduction. Boumendil et al. [31] synthesize a wide range of strategies, including model pruning, quantization, knowledge distillation, and efficient training protocols, demonstrating their potential to significantly reduce computational demand.
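As one concrete instance of the quantization strategies surveyed in this literature, the following PyTorch sketch applies post-training dynamic quantization to a small stand-in model; the model, layer sizes, and expected savings are illustrative assumptions rather than results from the included studies.

```python
import torch
import torch.nn as nn

# Stand-in model; any network with nn.Linear layers is treated analogously.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Post-training dynamic quantization: weights are stored in int8 and
# activations are quantized on the fly, reducing memory traffic and,
# in many deployments, per-inference energy use.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    x = torch.randn(1, 512)
    print(quantized_model(x).shape)  # same interface, lower-precision arithmetic
```

Whether such per-task savings translate into net reductions depends on deployment scale, a point taken up below in relation to rebound effects.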
On-device deep learning studies further highlight optimization as a necessity rather than an optional enhancement. By emphasizing lightweight architectures and energy-aware inference, these approaches illustrate how deployment constraints can drive innovation in efficiency-oriented model design. Comparative benchmarking by Mekouar et al. [33] extends this logic to data pipelines, showing that software choices alone can yield meaningful reductions in energy use and associated emissions.
Hardware-related strategies are also emphasized. Schneider et al. [37] discuss the role of hardware specialization and improved manufacturing practices in reducing life-cycle emissions, while also cautioning that efficiency gains at the device level may be offset by rebound effects associated with increased deployment. Jiang et al. [32] similarly note that optimization at individual stages of the AI lifecycle must be considered in a system-wide context to avoid shifting emissions rather than reducing them.
Together, the literature suggests that mitigation is feasible but requires coordinated action across the AI development pipeline. Isolated technical optimizations, while beneficial, are unlikely to achieve meaningful reductions without complementary changes in deployment practices and evaluation criteria.
4.4.4. Governance, Reporting, and Ethical Frameworks
The final outcome domain addresses governance and reporting frameworks that seek to integrate sustainability into AI development and deployment. Across the included studies, a recurring theme is that technical optimization alone may be insufficient to adequately address the environmental impact of AI systems, particularly under conditions of large-scale or continuous deployment. As methodological approaches and boundary definitions vary substantially, several authors emphasize the importance of standardized reporting and institutional oversight to improve transparency and comparability. Henderson et al. [9], for instance, advocate for routine disclosure of energy and carbon metrics alongside performance results, framing transparency as a prerequisite for accountability rather than as a definitive impact mitigation measure.
This comparative analysis reveals a fundamental trade-off: measurement-focused metrics offer high precision but low systemic relevance, whereas lifecycle and governance frameworks offer high relevance but suffer from significant data-gathering hurdles and a lack of universal standardization.
Raper et al. [34] introduce the concept of sustainability budgets, proposing governance mechanisms that allocate explicit energy and carbon constraints to AI projects. Rather than emerging from a single causal estimate, this approach is motivated by the recognition of scaling effects, uncertainty in inference-related energy use, and the potential for rebound effects, reframing environmental impact as a design parameter within existing project management practices. Usman et al. [38] extend this perspective to cybersecurity and digital infrastructure, highlighting the cumulative carbon burden associated with continuously operating AI-driven systems and the corresponding need for policy-level guidance under conditions of limited empirical evidence.
Studies focused on healthcare and applied domains further emphasize ethical considerations. Richie et al. [35] argue that environmental impacts should be considered alongside clinical benefits when evaluating AI deployment in healthcare, particularly given the potential for widespread adoption and long-term operational use. Similarly, Green Cybersecurity frameworks position sustainability as an integral component of responsible AI governance, not as a consequence of isolated performance metrics but as part of broader accountability structures.
Taken together, the evidence does not support a single causal pathway from specific energy estimates to governance prescriptions. Instead, it indicates that the combination of heterogeneous methodologies, limited long-term inference measurements, and scaling uncertainty motivates a precautionary, system-level governance perspective. Embedding energy and carbon considerations into reporting standards, funding criteria, and institutional policies is therefore presented as a pragmatic response to current evidence gaps, rather than as a definitive solution to the environmental footprint of AI systems.
4.5. Limitations and Strengths of the Review
A potential limitation of this review is the inclusion of a relatively small number of studies (n = 10). This result is indicative of the current state of the field: while the “Green AI” discourse is expansive, there remains a significant lack of peer-reviewed, primary empirical research that provides granular, hardware-verified energy data across the full lifecycle. However, the strength of this review lies not in volume but in the rigorous exclusion of secondary data that often propagate unverified estimates. By focusing only on high-fidelity, quantitative studies, we provide a more accurate baseline for the carbon-aware governance framework proposed in this study. This limited sample size underscores the critical ‘transparency gap’ in AI reporting and highlights the urgent need for standardized benchmarks that include water and e-waste metrics alongside carbon.
5. Discussion
This systematic review synthesizes emerging evidence on the energy consumption and carbon footprint of contemporary artificial intelligence systems, revealing that environmental impact is not a marginal side effect of AI development but a structural and increasingly consequential dimension of digital infrastructure. Across heterogeneous methodologies and application domains, the literature converges on the conclusion that modern AI systems, particularly large-scale, continuously deployed models, can entail substantial and highly variable environmental costs [7].
Beyond the studies included in the PRISMA-guided synthesis, a broader empirical literature—spanning numerical modelling, AI systems research, and experimental measurement—addresses complementary dimensions of the environmental footprint of AI. However, these contributions differ fundamentally from the studies included in this review in terms of methodological assumptions, system boundaries, and analytical scope, and their conclusions must therefore be interpreted with caution.
Numerical and simulation-driven analyses provide valuable insights into the sensitivity of AI-related emissions to workload parameters, infrastructure design, and carbon-intensity assumptions, and they highlight the theoretical mitigation potential of carbon-aware workload shifting and control strategies under controlled or idealized conditions [40,41]. Nevertheless, these studies are predominantly scenario-based, rely on synthetic workloads or assumed energy–carbon mappings, and rarely capture deployment-scale dynamics, which limits their empirical grounding and cross-context comparability.
In parallel, AI-based systems and deployment-oriented research on large language model inference clusters and carbon-aware scheduling demonstrate that operational energy demand is strongly shaped by serving stack design, batching and quantization strategies, and cluster orchestration under latency and throughput constraints [42,43,44,45]. While these studies offer high-resolution, system-specific evidence, their findings are inherently context-dependent, optimized for particular infrastructures, and not designed to support lifecycle-wide or cross-study synthesis.
Experimental and measurement-focused studies further contribute hardware-grounded evidence through NVML- and RAPL-based telemetry and benchmarking protocols that directly measure energy and power during training and inference across specific hardware configurations [46,47,48,49]. However, such measurements are typically limited to isolated components of the AI pipeline, short temporal windows, or single hardware generations, and do not account for embodied emissions, deployment scale, or long-term system feedback.
Complementary life-cycle assessments of data-centre and computing-centre infrastructures underscore the importance of embodied and infrastructure-mediated impacts, including cooling systems and facility-level design [50,51]. Yet, these analyses generally operate at an infrastructural level that precludes direct linkage to model-level or workload-level decision-making.
Overall, this body of literature provides fragmented but complementary evidence: it elucidates specific mechanisms, sensitivities, and localized optimization opportunities, but does not offer an integrated, lifecycle-wide synthesis across heterogeneous empirical contexts. This fragmentation directly motivates the present systematic review, which consolidates evidence across operational and embodied dimensions using transparent inclusion criteria and comparative synthesis to enable robust inference on system-level environmental impacts of AI.
A first critical insight concerns the distribution of energy demand across the AI lifecycle. While early discourse emphasized the environmental burden of training large models, the reviewed evidence consistently challenges this training-dominant narrative. In real-world deployment contexts, inference-phase energy consumption emerges as a co-equal or dominant contributor, especially for systems that operate continuously or at a global scale, such as large language model–based services [10,52]. This finding has important implications for both research evaluation and policy, as inference energy use is rarely reported, regulated, or optimized with the same rigor as training.
A second major theme is the importance of system boundaries in environmental assessment. Studies adopting a life-cycle perspective demonstrate that embodied emissions associated with hardware manufacturing, transport, and disposal can rival or exceed operational emissions over the lifespan of AI systems [10,52]. These findings expose the limitations of operational-only accounting practices and highlight a systematic underestimation of AI’s true carbon footprint in much of the existing literature. Moreover, embodied emissions are tightly coupled to hardware refresh cycles and utilization efficiency, linking technical design decisions directly to long-term environmental outcomes.
The review further underscores the contextual dependence of AI-related emissions, particularly with respect to geographic variation in electricity generation. Identical computational workloads can result in markedly different carbon footprints depending on grid composition, temporal availability of renewable energy, and regional emission factors [53]. This geographic sensitivity complicates cross-study comparison and reinforces calls for standardized reporting practices that transparently document assumptions, system boundaries, and emission factors.
Importantly, the evidence also reveals that energy consumption and carbon emissions are not intrinsic properties of AI models, but emergent outcomes shaped by interacting choices across algorithms, software pipelines, hardware configurations, and deployment strategies. Comparative studies demonstrate that engineering decisions, such as data-processing frameworks, numerical precision, batch size, and workload scheduling, can yield order-of-magnitude differences in energy use [11,54,55]. This observation shifts responsibility from model architecture alone to the broader socio-technical systems in which AI is embedded.
Despite growing awareness of these issues, the review identifies a persistent gap between measurement and mitigation. While numerous studies quantify environmental impacts and propose efficiency-enhancing techniques, these interventions are typically evaluated in isolation and without accounting for system-level feedbacks. In particular, the rebound effect emerges as a central unresolved challenge: efficiency gains achieved through algorithmic or hardware optimization may be offset by expanded deployment, increased demand, or accelerated hardware turnover, leading to stable or rising net emissions [15]. As a result, technical optimization alone is unlikely to deliver absolute reductions in environmental impact.
These findings align closely with the broader Green AI literature, which critiques the performance-centric paradigm of “Red AI” and advocates for energy- and resource-aware evaluation [11]. However, the reviewed evidence suggests that even Green AI approaches risk being insufficient if they remain confined to voluntary reporting or localized efficiency improvements. Without enforceable constraints or coordinated decision-making across the AI lifecycle, sustainability remains an external consideration rather than an operational imperative.
Taken together, the results of this review point toward a fundamental conclusion: sustainable AI cannot be achieved through isolated optimizations or post hoc reporting alone. Instead, the environmental impact of AI must be addressed as a dynamic control problem, in which performance, energy use, and carbon emissions are jointly managed under uncertainty and evolving constraints. This insight directly motivates the conceptual framework introduced in Section 6. The empirical evidence reviewed here—particularly the dominance of inference-phase emissions [10], the significance of embodied carbon [52], the sensitivity to energy-system context, and the prevalence of rebound effects [15]—collectively indicates the need for coordinated, adaptive, and constraint-aware governance mechanisms. A multi-agent reinforcement learning paradigm provides a principled means of operationalizing these requirements by enabling distributed decision-makers across software, hardware, and energy systems to learn policies that internalize environmental limits while maintaining system utility.
By embedding carbon constraints directly into optimization objectives, such approaches shift sustainability from a descriptive metric to a design-time and run-time control variable.
Section 6 therefore extends the findings of this review from diagnosis to prescription, illustrating how environmental assessment can inform the development of AI systems aligned with long-term climate mitigation and adaptation goals.
6. Implications for Sustainable AI: A Carbon-Aware Multi-Agent Reinforcement Learning Framework
This section introduces a conceptual framework and research agenda for carbon-aware, climate-resilient AI systems. While grounded in empirical findings from the reviewed literature, the proposed multi-agent reinforcement learning (MARL) framework is not empirically validated within this study. Instead, it should be understood as a hypothesis-driven systems design that synthesizes existing evidence into a coherent control-oriented architecture, intended to guide future methodological development, empirical evaluation, and policy experimentation.
6.1. Motivation and Conceptual Contribution
The results of this systematic review highlight a critical limitation in current approaches to sustainable artificial intelligence: while environmental impacts are increasingly measured and reported, they are rarely operationalized as control variables within AI systems themselves. Most existing work remains descriptive, retrospective, or localized, focusing on individual models, experiments, or efficiency techniques rather than coordinated system-level intervention [9,11].
Three empirically grounded challenges identified in the reviewed literature motivate the need for a new paradigm. First, inference-phase energy consumption has emerged as a dominant contributor to total energy use in large-scale and continuously deployed AI systems, particularly in user-facing services [10,52]. Second, embodied emissions associated with hardware manufacturing and turnover can rival or exceed operational emissions, underscoring the inadequacy of operational-only optimization strategies [19,52]. Third, efficiency improvements are frequently undermined by the rebound effect, whereby reductions in per-task energy consumption are offset by increased deployment, usage, or scale [15].
In response, this section proposes a conceptual Carbon-Aware MARL framework that reframes AI deployment as a coordinated, adaptive decision problem across software, hardware, and energy infrastructures. Unlike existing Green AI approaches that emphasize voluntary reporting or isolated optimization, the framework hypothesizes that embedding environmental constraints directly into the decision-making logic of AI systems enables proactive mitigation under dynamic conditions.
As a simple illustrative example, consider a large-scale language model deployed as a cloud-based conversational service. Under the proposed carbon-aware MARL framework, a model optimization agent adapts inference precision and batch size, a hardware lifecycle agent manages accelerator utilization, and a grid-aware scheduling agent shifts non-urgent inference workloads toward periods of lower electricity carbon intensity. This coordinated control aims to reduce cumulative operational and embodied emissions while maintaining service-level performance.
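A minimal sketch of the temporal-shifting behaviour attributed to the grid-aware scheduling agent is given below; the forecast values, deferral window, and greedy selection rule are hypothetical simplifications of what a learned policy would provide.

```python
from typing import Sequence


def pick_lowest_carbon_hour(forecast_gco2_per_kwh: Sequence[float], deadline_hours: int) -> int:
    """Return the hour index, within the deadline, with the lowest forecast carbon intensity."""
    window = forecast_gco2_per_kwh[:deadline_hours]
    return min(range(len(window)), key=window.__getitem__)


# Hypothetical 12-hour carbon-intensity forecast (gCO2eq/kWh) for a deferrable batch-inference job.
forecast = [420, 410, 390, 300, 220, 180, 200, 260, 340, 400, 430, 450]
hour = pick_lowest_carbon_hour(forecast, deadline_hours=8)
print(f"Defer non-urgent inference to hour {hour} ({forecast[hour]} gCO2eq/kWh)")
```

In the full framework, this greedy rule would be replaced by a learned policy that also accounts for latency constraints, hardware utilization, and the shared carbon budget.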
Figure 3 provides a conceptual overview of the proposed carbon-aware multi-agent reinforcement learning framework, illustrating how coordinated agents operate across software, hardware, and energy-system layers under shared carbon constraints. The figure highlights the interaction between centralized training, decentralized execution, and governance mechanisms that jointly internalize operational and embodied emissions throughout the AI lifecycle.
6.2. System-Level Architecture and Coordinated Control
The proposed framework adopts a Centralized Training, Decentralized Execution (CTDE) paradigm, which has proven effective for coordination in complex, distributed systems [56,57]. As shown in Figure 3, centralized training allows agents to learn from a shared global state incorporating performance requirements, carbon constraints, and infrastructure context, while decentralized execution ensures scalability and robustness by allowing each agent to act autonomously at deployment time.
Within this architecture, three functionally distinct agents are hypothesized to operate at complementary layers of the AI lifecycle:
A Model Optimization Agent, which dynamically adjusts computational complexity using techniques such as adaptive precision, early exiting, and knowledge distillation, reflecting evidence that computational demand varies substantially across tasks and contexts [11];
A Hardware Lifecycle Agent, which manages accelerator utilization, consolidation, and replacement timing to reduce embodied emissions per unit of computation, directly responding to life-cycle assessment findings on hardware-related carbon impacts [19,52];
A Grid-Aware Scheduling Agent, which interfaces with real-time and forecasted electricity grid carbon intensity to temporally shift energy-intensive workloads toward periods of lower marginal emissions, operationalizing geographic and temporal variability identified in prior work [9,53].
Together, these agents define a hypothetical cross-layer control system capable of internalizing both operational and embodied carbon costs, rather than optimizing them in isolation.
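To illustrate how these roles could be expressed in software, the following Python skeleton sketches the three agents acting on a shared state; all field names, thresholds, and action encodings are hypothetical placeholders rather than a specification of the framework.

```python
from dataclasses import dataclass


@dataclass
class SharedState:
    """Simplified global state visible during centralized training."""
    latency_slo_ms: float            # service-level latency target
    grid_intensity_gco2_kwh: float   # current or forecast grid carbon intensity
    carbon_budget_kg: float          # remaining sustainability budget
    accelerator_utilization: float   # fleet utilization, 0..1


class ModelOptimizationAgent:
    def act(self, s: SharedState) -> dict:
        # Lower numerical precision when the carbon budget is nearly exhausted.
        return {"precision": "int8" if s.carbon_budget_kg < 10.0 else "fp16"}


class HardwareLifecycleAgent:
    def act(self, s: SharedState) -> dict:
        # Consolidate workloads onto fewer accelerators when utilization is low.
        return {"consolidate": s.accelerator_utilization < 0.4}


class GridAwareSchedulingAgent:
    def act(self, s: SharedState) -> dict:
        # Defer deferrable jobs when the grid is carbon-intensive.
        return {"defer_batch_jobs": s.grid_intensity_gco2_kwh > 400.0}
```

At execution time each agent would observe only its local slice of this state, consistent with the decentralized-execution side of the CTDE paradigm.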
6.3. Carbon-Constrained Learning and Rebound Effect Mitigation
A central innovation of the framework is the explicit treatment of carbon emissions as a first-class optimization objective. This is implemented through a multi-objective reward function (R_total) that balances system utility against operational and embodied emissions, with weighting coefficients (w_i) regulating the trade-off. Sustainability budgets are conceptualized as explicit upper bounds on allowable emissions and act as binding constraints that activate penalties when exceeded.
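As an illustrative formulation only (the specific utility and emission terms below are introduced for exposition and are not prescribed by the reviewed studies), such a reward can be written as:

```latex
R_{\mathrm{total}}
  = w_{1}\, U_{\mathrm{task}}
  - w_{2}\, E_{\mathrm{op}}
  - w_{3}\, C_{\mathrm{emb}}
  - w_{4}\, \max\bigl(0,\; C_{\mathrm{cum}} - B_{\mathrm{carbon}}\bigr)
```

Here U_task denotes task-level utility (e.g., accuracy or service quality), E_op the operational emissions attributable to a decision, C_emb its amortized embodied emissions, C_cum the cumulative system emissions, and B_carbon the sustainability budget; the final term is inactive until the budget is exceeded, acting as the binding constraint described above.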
While not empirically evaluated here, this reward structure hypothesizes a mechanism by which efficiency gains can be prevented from translating into unbounded deployment and associated rebound effects. By embedding environmental constraints directly into the learning objective, the framework conceptually shifts sustainability from an external compliance metric to an endogenous optimization target aligned with long-term environmental goals.
6.4. Uncertainty-Aware Coordination in Non-Stationary Systems
Climate and energy systems are inherently non-stationary, characterized by fluctuating renewable generation, evolving demand patterns, and increasing frequency of extreme events. Prior research has shown that reinforcement learning systems trained under static assumptions often fail under distributional shifts [58,59].
To address this, the framework proposes uncertainty-driven exploration and cooperative coordination as design principles rather than validated solutions. Agents are assumed to seek strategies that remain robust across a wide range of plausible future states, consistent with concepts from robust and constrained reinforcement learning [60].
Conceptually, uncertainty-aware coordination enables agents to negotiate workload prioritization under fluctuating energy availability and climate-driven system stress. During periods of scarcity, agents are assumed to prioritize high-societal-value workloads, such as healthcare, emergency response, or climate modeling, while deferring elective or carbon-intensive tasks. This mechanism illustrates how AI deployment could be aligned with broader climate adaptation objectives, pending empirical validation.
6.5. Explainability, Ethics, and Governance by Design
Delegating cross-layer control to autonomous agents raises significant ethical and governance concerns, particularly in contexts where AI systems influence access to critical infrastructure during climate stress events. Opaque decision-making processes risk undermining public trust and accountability [61,62].
Accordingly, the framework incorporates Explainable Reinforcement Learning (XRL) as a design requirement, enabling post-hoc interpretation and auditing of agent decisions [63,64]. These mechanisms are intended to support accountability, regulatory oversight, and fairness auditing, particularly where historically biased data could otherwise lead to inequitable outcomes. In this sense, explainability and governance are treated as structural components of the framework rather than downstream add-ons.
6.6. Implementation Roadmap and Policy Implications
To clarify the pathway from conceptual design to empirical evaluation, we outline a high-level implementation roadmap for future research and pilot deployments:
Data Requirements: Continuous, interoperable data streams spanning model performance, energy consumption, grid carbon intensity, hardware utilization, and social vulnerability indicators. Regulatory mandates for standardized, real-time data sharing are likely prerequisites.
Digital Twin Infrastructure: High-fidelity digital twins of urban, energy, or data center systems are required to train and stress-test MARL agents under diverse and extreme scenarios. These environments must support scenario engineering, including rare and compound climate events.
Metrics and Validation: Beyond accuracy and efficiency, evaluation metrics should include operational and embodied emissions, rebound-adjusted carbon impact, system resilience under distributional shift, and equity-aware performance indicators.
Governance and Oversight: Clear legal mandates must define the scope and limits of autonomous decision-making, particularly during climate emergencies. Human-in-the-loop oversight, inter-institutional agreements, and long-term funding mechanisms are essential for responsible deployment.
Taken together, these considerations reinforce that the proposed framework should be understood as a conceptual synthesis and research agenda, rather than a validated solution. Its primary contribution lies in demonstrating how insights from environmental assessment, reinforcement learning, and climate governance can be integrated into a unified control-oriented vision for sustainable and climate-resilient AI systems.
7. Conclusions
This systematic review demonstrates that the environmental footprint of artificial intelligence is a structural and increasingly consequential challenge, shaped by interdependent decisions across algorithms, software pipelines, hardware infrastructures, and deployment contexts. The synthesized evidence shows that energy consumption and carbon emissions associated with AI systems are highly variable, context-dependent, and often underestimated, particularly when inference-phase demand and embodied hardware emissions are omitted from analysis.
A central conclusion of this work is that sustainable AI cannot be achieved through isolated technical optimizations or voluntary reporting alone. While efficiency-enhancing techniques and standardized measurement frameworks are necessary, they are insufficient to deliver absolute reductions in environmental impact due to rebound effects, scaling dynamics, and non-stationary energy systems. Sustainability must therefore be embedded directly into the operational logic and governance of AI systems.
To move from principle to practice, this study proposes a multi-layered implementation pathway. At the technical level, sustainability objectives can be operationalized through carbon-aware, multi-agent control frameworks that integrate energy and emission constraints as endogenous system goals. At the organizational level, AI developers and operators should adopt life-cycle assessment protocols, incorporate sustainability metrics into performance evaluations, and align Research and Development (R&D) priorities with energy-efficient design principles. At the policy level, targeted incentives, regulatory standards, and sectoral reporting requirements can reinforce adoption, creating an environment in which sustainable AI is economically viable and institutionally supported. Together, these layers form a feasible pathway for embedding environmental considerations into AI’s operational logic within existing commercial and policy landscapes.
By synthesizing empirical evidence on AI’s environmental impacts and integrating it with a forward-looking conceptual framework, this study advances the field in two key ways. First, it consolidates fragmented evidence into a coherent synthesis that identifies common drivers, methodological gaps, and actionable insights for policymakers and practitioners. Second, it illustrates how sustainability can be treated as a core operational objective rather than an external afterthought, enabling AI systems to align technological innovation with climate mitigation goals.
Future research should empirically evaluate system-level approaches through high-fidelity simulations and real-world pilot deployments, develop standardized benchmarks for inference-phase and life-cycle emissions, and investigate governance mechanisms that incentivize sustainable AI across commercial and regulatory contexts. As AI systems become increasingly embedded in critical societal functions, integrating environmental sustainability into their design and deployment is not merely a technical challenge; it is an ethical and policy imperative.