Large Language Models in Sustainable Energy Systems: A Systematic Review on Modeling, Optimization, Governance, and Alignment to Sustainable Development Goals

Alka, T. A.; Suresh, M.; Mandal, Santanu; Filho, Walter Leal; Raman, Raghu

doi:10.3390/en19061588

Open Access Systematic Review

by T. A. Alka ¹, M. Suresh ¹,*, Santanu Mandal ², Walter Leal Filho ³ and Raghu Raman ⁴,*

¹ Amrita School of Business, Amrita Vishwa Vidyapeetham, Coimbatore 641112, Tamil Nadu, India

² Amrita School of Business, Amrita Vishwa Vidyapeetham, Amaravati 522503, Andhra Pradesh, India

³ Department of Natural Sciences, Manchester Metropolitan University, Chester Street, Manchester M1 5GD, UK

⁴ Amrita School of Business, Amrita Vishwa Vidyapeetham, Amritapuri 690525, Kerala, India

* Authors to whom correspondence should be addressed.

Energies 2026, 19(6), 1588; https://doi.org/10.3390/en19061588

Submission received: 14 February 2026 / Revised: 9 March 2026 / Accepted: 20 March 2026 / Published: 23 March 2026

(This article belongs to the Special Issue Sustainable Energy Systems: Progress, Challenges and Prospects)

Abstract

Sustainable energy systems (SESs) support intelligent modeling, automation, and governance that enable energy access, infrastructure innovation, and climate resilience. Despite this potential, integrating large language models (LLMs) into SESs raises concerns regarding energy intensity, transparency, equity, and regulation. This study adopts a mixed-methods review combining a BERTopic-based thematic analysis and case-based synthesis to examine applications of LLMs in energy modeling, optimization, and governance, and to assess their alignment with the United Nations Sustainable Development Goals (SDGs). These applications support SDG 7 (Affordable and Clean Energy) by improving access to energy knowledge and decision support, SDG 9 (Industry, Innovation and Infrastructure) through intelligent and scalable digital infrastructure, and SDG 13 (Climate Action) through climate-responsive planning and operational efficiency. The findings reveal that modular, agent-based LLM workflows enhance energy modeling and regulatory compliance. However, sustainability trade-offs necessitate responsible artificial intelligence (AI) governance emphasizing transparency, ethical design, and inclusivity. This review informs policy and practice by suggesting that LLMs offer potential value for sustainable energy applications when deployed within responsible AI governance frameworks that emphasize ethical design, accountability, and equitable access. The study provides future research directions using the ADO (antecedents–decisions–outcomes) framework, emphasizing regulatory readiness, ethical design, and inclusive governance aligned with SDGs 7, 9, and 13, among others.

Keywords:

sustainable energy systems; large language models; energy optimization; artificial intelligence in energy; sustainable development goals

1. Introduction

The global energy sector plays a crucial role in enabling innovation, transforming infrastructure, and promoting environmental stewardship, making it central to achieving the United Nations Sustainable Development Goals (SDGs) [1,2,3]. As climate volatility intensifies [4] and centralized fossil-based grids face growing scrutiny, the transition toward sustainable energy systems (SESs) has become increasingly urgent [5,6,7]. Sustainable energy systems are defined as integrated sociotechnical systems that facilitate clean energy production, efficient distribution, equitable access, and long-term ecological balance by leveraging renewable resources, digital technologies, and inclusive governance structures [8,9]. This approach encompasses not only technological advancements but also fundamental changes in the interactions between energy systems and economic, social, and ecological domains [10,11].

In this context, artificial intelligence (AI) has emerged as an enabler of SESs for addressing challenges across forecasting, infrastructure optimization, and policy automation [12,13,14]. Among AI technologies, LLMs are gaining prominence because of their unique ability to support both decision-making and adaptive system design [15,16,17]. LLMs, such as GPT-4 and RE-LLaMA, leverage deep neural architectures trained on massive corpora to process complex regulatory texts, generate stakeholder-specific insights, and enable user-friendly, real-time policy interpretation interfaces [18,19]. Their applications also extend to operational tasks such as demand response, hybrid energy grid coordination, and real-time battery storage optimization [16,20,21,22].

Despite their promise, integrating LLMs into SESs involves important trade-offs. The computational resources required for LLM training and inference contribute to increased emissions [23]. Moreover, SESs are influenced by increasing data intensity, regulatory complexity, and the imperative for inclusive, climate-resilient decision-making [9,24]. While LLMs offer unique capabilities in semantic reasoning, natural language interaction, and automation, enabling advanced energy modeling, policy interpretation, diagnostics, and system control beyond conventional optimization approaches [25,26], evaluating their impact through the lens of sustainable development goals (SDGs) is essential.

This normative framework helps to determine whether these technologies advance clean energy access (SDG 7), intelligent infrastructure (SDG 9), responsible resource use (SDG 12), and climate action (SDG 13) rather than focusing solely on technical performance. These SDGs encompass more than clean energy transitions; they represent a broader agenda for equitable access, resilient infrastructure, and climate-resilient innovation, as well as ethical and institutional imperatives that influence how AI technologies should be deployed in energy systems [27,28]. The integration of LLMs into SESs represents a shift from traditional, isolated, and manually operated infrastructures toward intelligent, adaptive, and decentralized energy ecosystems [29]. Owing to their advanced natural language processing and reasoning abilities, LLMs are particularly well suited to address these emerging needs [12,19,25]. This review therefore draws on three complementary theoretical lenses (sustainability transitions, intelligent infrastructure, and responsible AI governance) to critically examine the roles of these technologies.

The 2030 Agenda for Sustainable Development identifies SESs as a foundational pillar for ensuring global resilience, equity, and environmental sustainability [8,9]. SESs are defined not only by their use of renewable energy sources such as solar, wind, and hydrogen but also by their ability to ensure inclusive access, adaptive infrastructure, and lifecycle sustainability, which requires systemic transformation across generation, distribution, consumption, and governance layers facilitated by data, automation, and participatory policy [12,30].

Despite the global expansion of renewable energy, critical bottlenecks remain: integrating intermittent power sources into the grid, optimizing consumption patterns, and democratizing access in underserved regions. These challenges limit progress in SDG 7. AI technologies, when responsibly deployed, can enable predictive diagnostics, system-wide adaptability, and user-specific interventions to fill these gaps [20,31]. However, without governance safeguards, AI may replicate or exacerbate existing disparities. As emphasized by Wang et al. [32], interventions must be grounded in principles of equity, transparency, and accountability to avoid reinforcing technoelitism and exclusion. LLM-guided decision frameworks enhance interpretability by explaining system recommendations, assumptions, and trade-offs in natural language [25]. This addresses a key governance gap in energy transitions, where trust, accountability, and stakeholder engagement are critical. However, the sustainability contribution of LLMs is not without challenges. Training and integrating large models remain computationally intensive and energy-intensive, with concerns about lifecycle emissions and alignment with climate goals [23]. Hence, evaluation must extend beyond task performance to include energy efficiency, equity of access, and long-term system resilience.

The growing integration of AI into energy infrastructures raises complex governance questions. As LLMs are increasingly embedded in policy interpretation, energy trading, and demand–response systems, there is an urgent need to evolve institutional frameworks capable of overseeing their ethical and sustainable deployment [28,33]. Pilot initiatives like Arslan et al. [18] already highlight the practical utility of these technologies in navigating complex renewable energy policy landscapes, thereby improving accessibility and compliance and contributing directly to context-sensitive energy access and operational flexibility aligned with SDG 7 and SDG 9. The environmental cost of AI, particularly in the case of LLM training and inference, raises pressing concerns about life-cycle sustainability. As AI becomes a foundational layer of energy infrastructures, institutions must adopt holistic, SDG-aligned governance frameworks and establish monitoring protocols to track both intended and unintended impacts. Without such guardrails, the energy transition risks drifting toward a technocentric, rather than a human-centric, trajectory [2,12].

Recent reviews have examined artificial intelligence, transformer models, and large language models in energy-related domains, but important gaps remain. Pimenow et al. [23] analyze the dual role of AI in energy systems, highlighting efficiency gains alongside the high energy and carbon costs of LLMs; however, they do not assess governance structures or SDG alignment. Barahona and Almulhim [34] examine circular economy aspects associated with energy-intensive digital technologies, yet they do not evaluate how LLMs function within operational energy infrastructures or governance systems. Arslan et al. [18] highlighted how LLM-based chatbots reduce information asymmetries for SMEs, but their analysis is application-specific and does not extend to system-level sustainability trade-offs. Antonesi et al. [35] provide a systematic review of transformer and LLM architectures for forecasting and grid management, focusing on technical and architectural advances without integrating institutional, policy, or SDG dimensions.

Elsisi et al. [33] examine Machine Learning (ML)–Internet of Things (IoT) integration for emission monitoring and energy management in port environments, emphasizing operational optimization, cybersecurity, and infrastructure resilience, yet without addressing transformer- or LLM-specific governance implications across energy transitions. Overall, prior reviews emphasize technical performance, architectural innovation, or sector-specific applications. No prior review consolidates the deployment of LLMs across technical, operational, infrastructure, modeling, optimization, and governance dimensions within an SESs–SDG analytical framework. In particular, governance mechanisms, lifecycle sustainability trade-offs, and the institutional embedding of LLMs within real-world energy transitions remain insufficiently consolidated. To address this gap, this study explores three guiding questions:

RQ1: How are LLMs being applied across infrastructure, modeling, and governance dimensions in SESs?

RQ2: In what ways do these technologies contribute to SDG-aligned outcomes such as energy access, intelligent infrastructure, and climate resilience?

RQ3: What sustainability trade-offs, risks, and governance considerations emerge from their deployment in real-world systems?

To ensure methodological transparency, this study follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [36] for systematic literature identification, screening, and inclusion. PRISMA provides a standardized protocol for documenting search strategies, eligibility criteria, and exclusion decisions, thereby reducing selection bias and enhancing methodological rigor in review-based research [36]. Its use is well-established in interdisciplinary domains that span technical, policy, and governance studies, making it particularly suitable for synthesizing fragmented research on LLMs in sustainable energy systems.

Within this PRISMA-guided corpus, the study applies bidirectional encoder representations from transformers (BERT)-based topic modeling to systematically identify dominant themes, emerging research trends, and conceptual gaps. These insights are then organized via the Antecedents–Decisions–Outcomes (ADO) framework [37], which structures future research on emerging technologies such as LLMs around drivers (antecedents), decisions, and impacts (outcomes), in settings where regulatory readiness, ethical considerations, and inclusive governance influence both adoption and impact. This methodological transparency and theoretical coherence allow the review to move beyond descriptive mapping toward SDG-aligned insights for the responsible use of LLMs in SESs.

In this way, this research makes three specific contributions. (1) It provides an integrated cross-sector synthesis of LLM applications spanning technical, operational, and governance dimensions of SESs, moving beyond sector-bounded or architecture-focused reviews. (2) It systematically links LLM use cases to SDG-aligned outcomes, including energy access, intelligent infrastructure, climate resilience, and resource efficiency, through a mixed-method design that combines PRISMA-based systematic review, BERTopic thematic analysis, and case study synthesis. (3) It identifies critical sustainability trade-offs and governance gaps related to energy intensity, equity, transparency, accountability, and responsible AI deployment, thereby establishing a structured research agenda for sustainable and accountable AI-enabled energy transitions.

The examination addresses the emerging role of LLMs and generative AI (GenAI) in sustainable energy systems, focusing on applications in smart grids, energy system modeling and optimization, policy and governance support, and data-driven decision-making. The study also explores how these technologies contribute to advancing sustainability goals within the energy transition. This integrated SESs–governance perspective provides actionable clarity for engineers, researchers, and policymakers seeking to operationalize AI responsibly within inclusive, adaptive, and environmentally sustainable energy systems.

The remainder of this paper is organized as follows: Section 2 details the research methodology, including PRISMA-based document selection, BERTopic modeling, case study selection, and ADO framework; Section 3 presents the results; Section 4 discusses the study’s contributions, extensions to the literature, and limitations; Section 5 outlines the implications of the study; Section 6 provides ADO-based directions for future research; and Section 7 concludes the study.

2. Methodology

This study adopts a mixed-method research design that integrates quantitative BERTopic modeling and qualitative case study analysis to examine the role of LLMs in sustainable energy systems (SESs). The approach enables (1) data-driven identification of thematic patterns from large-scale literature using BERTopic [38], (2) in-depth contextual insights through the analysis of six representative case studies on LLM applications in SESs, and (3) the development of future research directions using the ADO framework by identifying antecedents, decisions, and outcomes [37]. This integration supports a structured understanding of both macrolevel trends and microlevel mechanisms, effectively bridging theory and practice while enhancing the reliability and validity of findings through methodological triangulation [39].

Figure 1 summarizes the overall methodological workflow, beginning with document identification and selection via the PRISMA protocol [40,41] and machine learning-based SDG mapping [42], followed by BERTopic analysis, case study selection, and the formulation of future research directions via the ADO framework [37]. The PRISMA stages ensure the systematic and transparent selection of relevant studies. BERTopic is then applied to extract thematic structures via advanced natural language processing (NLP) techniques [38], after which contextually relevant case studies are analyzed. Finally, the ADO framework is employed to synthesize insights and propose structured directions for future research. The detailed methodological steps are presented in the following section.

2.1. PRISMA

This review was conducted in accordance with the PRISMA 2020 guidelines recommended by Page et al. [43], a widely adopted reporting framework for ensuring transparency and a systematic process of document selection through identification, screening, eligibility assessment, and synthesis [44]. Figure 2 presents the complete process adopted in this research, and the PRISMA checklist is included in the Supplementary Materials. The PRISMA approach is also applied in bibliometric and case study research [3,45]. In this research, we searched Scopus for high-quality peer-reviewed studies, using Boolean operators to obtain comprehensive coverage suitable for detailed analysis [46]. The major steps in this process are as follows [36]: formulating the research questions and the research scope, developing a search string from the keywords, and searching the database. The following search string was used in Scopus:

TITLE-ABS-KEY ("sustainab* energ*" OR "renewable energ*") AND TITLE-ABS-KEY (("large language model*" OR "llm") OR (openai OR gpt* OR chatgpt) OR (bard OR gemini OR llama))
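As an illustration of how such a search string can be kept reproducible, the sketch below assembles it from term lists. The function and variable names are hypothetical, and the string is written with straight double quotes, which is an assumption about how it would be entered in the Scopus advanced search box.

```python
# Term lists copied from the search string reported in the text.
ENERGY_TERMS = ['"sustainab* energ*"', '"renewable energ*"']
LLM_GROUPS = [
    ['"large language model*"', '"llm"'],
    ["openai", "gpt*", "chatgpt"],
    ["bard", "gemini", "llama"],
]


def any_of(terms):
    """Wrap a list of terms in parentheses, joined by OR."""
    return "(" + " OR ".join(terms) + ")"


def build_query(energy_terms, llm_groups):
    """Combine two TITLE-ABS-KEY clauses with AND (hypothetical helper)."""
    energy_clause = "TITLE-ABS-KEY " + any_of(energy_terms)
    llm_clause = "TITLE-ABS-KEY (" + " OR ".join(any_of(g) for g in llm_groups) + ")"
    return energy_clause + " AND " + llm_clause


QUERY = build_query(ENERGY_TERMS, LLM_GROUPS)
print(QUERY)
```

Keeping the terms in lists makes it easier to document any later change to the keyword set alongside the PRISMA record.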

The use of Scopus as the data source for BERTopic-based thematic modeling is established in recent peer-reviewed studies in sustainable energy and AI research. For example, Raman et al. [3] on sustainable aviation fuel and Raman et al. [9] on sustainable energy systems used Scopus-derived datasets to conduct structured thematic analyses aligned with the Sustainable Development Goals (SDGs). Similarly, Raman et al. [47] applied integrated thematic and topic modeling techniques using Scopus metadata to examine green and sustainable AI research.

In addition, BERTopic-based modeling using Scopus data has been validated in interdisciplinary domains, including migrant entrepreneurship [36] and AI applications in digital ecosystems [40]. Energy-focused bibliometric and SDG-linked analyses have also relied on Scopus datasets, as highlighted by Raman et al. [41] and Alka et al. [44] in the sustainable energy entrepreneurship process. These studies confirm that Scopus provides sufficiently structured, high-quality, and standardized metadata suitable for reproducible thematic clustering, machine learning-driven topic extraction, and subsequent qualitative synthesis.

Scopus supports advanced search functionalities essential for systematic and reproducible reviews [41]. It allows precise Boolean operators, field-specific searches (e.g., TITLE-ABS-KEY), wildcard characters (e.g., *), phrase searching, and controlled filtering by year, document type, subject area, and language. These features support the development of structured, transparent search strings aligned with PRISMA guidelines [44]. Integrating multiple databases such as IEEE Xplore and Web of Science may introduce duplication, inconsistencies in metadata structures, and citation harmonization challenges, which can affect clustering stability and thematic modeling outputs. Given that the first phase of this mixed-method study relies on consistent thematic outputs to inform the following case study analysis, maintaining a single standardized database with comprehensive coverage ensured methodological coherence and analytical depth.

A total of 187 documents, published from 2012 onward, were identified on the search date of 21 December 2025. The data were subsequently screened, and the cleaned data were analyzed using BERTopic. The next step was the identification of topics and the classification of each under a theme, with papers ordered by citations. Each paper under each theme was then systematically analyzed, synthesized, and mapped to the SDGs; this detailed synthesis provides answers to the research questions along with contributions to existing knowledge and theory-building. Such structured steps, including the definition of search criteria, search strings, and filters based on inclusion and exclusion criteria, are required in a systematic review process [41].

The search was limited to studies from 2022 to 2025 (2025, 93; 2024, 58; 2023, 15; 2022, 4), yielding 170 recent publications relevant to the application of LLMs within sustainable energy systems. Publications before 2022 were excluded because they lack integration of advanced AI frameworks, such as LLMs and reinforcement learning, in energy contexts. The chosen timeframe (2022–2025) reflects the most current research landscape, covering recent methods, tools, and technological innovations that match the focus on intelligent infrastructure, real-time optimization, and SDG-linked sustainability outcomes. This supports both the recency and the relevance expected of PRISMA's systematic selection standards. The source type was limited to journals and conference proceedings. Documents published as a Note (1) or Editorial (1) were excluded. Articles (72), conference papers (54), reviews (16), and conference reviews (12) were selected. During the screening phase, documents published in English, a widely accepted language across the globe, were selected (n = 147). The title, abstract, and keywords of each of these papers were then examined, and the final 92 papers that fall within the research context were considered eligible and included after manual review (see Table S1, Summary of Selected Studies, in the Supplementary Materials).
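The counts reported above can be tallied as a simple selection funnel. The following is a minimal sketch that uses only numbers quoted in the text; it assumes the stages apply in the order the paragraph lists them, and the per-stage differences it prints are derived arithmetic rather than figures reported by the authors.

```python
# Counts quoted in the text; nothing here is new data.
RECORDS_BY_YEAR = {2025: 93, 2024: 58, 2023: 15, 2022: 4}
SELECTED_TYPES = {
    "article": 72,
    "conference paper": 54,
    "review": 16,
    "conference review": 12,
}

# Assumed stage order: time window, document type, language, manual review.
FUNNEL = [
    ("identified in Scopus", 187),
    ("published 2022-2025", sum(RECORDS_BY_YEAR.values())),
    ("selected document types", sum(SELECTED_TYPES.values())),
    ("English language", 147),
    ("included after manual review", 92),
]


def removed_per_stage(funnel):
    """Return (stage name, records dropped on entering that stage)."""
    return [
        (later[0], earlier[1] - later[1])
        for earlier, later in zip(funnel, funnel[1:])
    ]


for stage, removed in removed_per_stage(FUNNEL):
    print(f"{stage}: {removed} removed")
```

A tally of this kind is a quick consistency check when drawing the PRISMA flow diagram, since each box should equal the previous box minus the exclusions at that step.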

This process was conducted by tabulating each study's characteristics and deciding which studies were eligible for each synthesis. In this step, research works not directly addressing LLM-based applications in SESs, including those focused on general AI/ML methods without LLM integration or conducted outside the energy domain (e.g., education or cybersecurity), were removed to arrive at 92 relevant documents. Among these, only the most representative studies are discussed in the manuscript, selected based on citations, thematic relevance to the identified clusters, and their contribution to explaining key LLM–SES relationships, while the remaining studies contributed to the overall clustering and thematic structure of the analysis.

As this review is based on published literature, no human participants were directly involved. The included studies comprise journal articles and conference proceedings published between 2022 and 2025, showing the recent emergence of LLM applications in SESs. The sample includes empirical studies, review papers, and conceptual contributions covering positive contributions of LLM-enabled systems like AI-based forecasting, optimization, intelligent infrastructure, governance mechanisms, and SDG-aligned sustainability outcomes. The studies differ in methodological design, geographic context, and system scale, covering applications such as smart grids, renewable energy forecasting, building energy modeling, and climate-resilient infrastructure.

A further proportion of studies highlight improvements in energy efficiency, grid management, and climate resilience planning. However, some studies identify trade-offs, including data dependency, the energy consumption of AI systems, cybersecurity risks, and governance challenges. Because this review synthesizes secondary literature, no meta-analysis was conducted; therefore, no effect estimates or confidence intervals are reported. Results are presented through thematic categorization, narrative interpretation, and framework-based synthesis.

The risk of bias in the included studies was assessed through a structured qualitative evaluation approach appropriate for systematic reviews without a meta-analysis. As this review synthesizes empirical, conceptual, and review-based studies on LLM applications in sustainable energy systems, formal quantitative risk-of-bias tools were not applicable. Risk of bias was evaluated at three levels: study selection, methodological quality and reporting, and synthesis. Study selection bias was minimized through a PRISMA-guided screening process using inclusion and exclusion criteria (publication type, timeframe 2022–2025, English language, relevance to LLM applications in SESs). Titles, abstracts, and keywords were manually screened by three reviewers, and the collected data were reviewed to ensure contextual relevance to the research questions. Each report was independently reviewed by two reviewers, with discrepancies resolved by discussion or a third reviewer. Study investigators were contacted when clarification was needed. Methodological quality and reporting bias were addressed by qualitatively appraising each included study for clarity of research objectives and design; transparency in describing LLMs, datasets, and implementation context; and empirical validation or analytical depth. Studies lacking methodological transparency or clear relevance to SES applications were excluded during the eligibility stage.

Synthesis-level bias, in particular interpretive bias in thematic modeling, was reduced in several ways. BERTopic outputs were manually validated through a topic coherence check, topic labels were cross-checked against original abstracts, and themes were reviewed iteratively to avoid misclassification. Methodological triangulation was adopted by synthesizing results using a mixed-method approach. First, BERTopic modeling was applied to identify thematic clusters from the included studies based on semantic similarity. All data, including outputs from BERTopic modeling, were manually verified for accuracy. The generated topics were manually reviewed, refined, and grouped into higher-order themes aligned with the research questions. Second, a case study analysis was conducted using a functional–comparative framework based on application domain, LLM type, decision logic, optimization objectives, and SDG relevance. Finally, the findings were structured using SDG mapping and the Antecedents–Decisions–Outcomes (ADO) framework to organize insights and derive future research directions.

This multi-layer synthesis approach enhanced analytical depth and reduced single-method bias. Because no statistical aggregation of effect sizes was conducted, quantitative risk-of-bias scoring tools were not used. Instead, transparency, structured screening, qualitative appraisal, and triangulation were applied to ensure credibility and minimize bias in the included studies.

2.2. BERTopic Modeling

Traditional topic modeling techniques include nonnegative matrix factorization (NMF), latent Dirichlet allocation (LDA), probabilistic latent semantic analysis (PLSA), and Top2Vec. Their limitations in capturing semantic relationships and short-text contexts create opportunities for BERT (bidirectional encoder representations from transformers) [48]. BERT, a deep-learning language model developed by Google, employs embeddings to provide richer contextual insights than bag-of-words approaches [9,38]. Its multilingual capability supports document extraction across 50 languages through a sentence-transformer framework.

BERTopic integrates BERT embeddings with uniform manifold approximation and projection (UMAP) for dimensionality reduction and hierarchical density-based spatial clustering of applications with noise (HDBSCAN) for clustering, offering advantages over LDA by eliminating the need for a predefined number of topics and effectively handling noise and outliers [49]. In this study, the all-MiniLM-L6-v2 model is adopted because of its optimization for clustering and semantic search [50]. Topic selection is based on term representations with probabilistic relevance to themes [51], and the BERTopic workflow is illustrated in Figure 3.

To ensure that thematic interpretation extends beyond technical clustering, multiple validation layers were incorporated to ensure the semantic coherence and analytical reliability of identified themes. Model optimization involves three hyperparameters: the n-gram range, the number of topics, and the minimum topic size. The n-gram range is set to (1, 2) to enhance contextual richness while avoiding overcomplexity. Topic numbers are tested between 5 and 20, with a minimum topic size of 20 keywords to ensure coherence and coverage. After stop word removal, common terms such as use, add, and related terms are excluded to reduce noise [38]. UMAP is configured with default settings and probability calculations for English-language document-topic associations, using a minimum distance of 0.05. Topic experimentation was evaluated through intertopic distance and coherence scores, with cosine similarity applied to measure vector alignment.
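The configuration described above could be expressed roughly as follows. This is a sketch under stated assumptions, not the authors' script: the values in `SETTINGS` come from the text, while `n_neighbors=15`, `n_components=5`, the cosine metric, and `random_state=42` are assumptions chosen to mirror common BERTopic defaults and to fix the random seed. The heavy libraries (bertopic, sentence-transformers, umap-learn, scikit-learn) are imported only when the helper is called.

```python
# Values stated in the text; anything else below is an assumption.
SETTINGS = {
    "embedding_model": "all-MiniLM-L6-v2",
    "n_gram_range": (1, 2),
    "min_topic_size": 20,
    "umap_min_dist": 0.05,
    "nr_topics_grid": list(range(5, 21)),  # topic numbers tested from 5 to 20
}


def build_topic_model(nr_topics, settings=SETTINGS):
    """Assemble a BERTopic model for one candidate topic number (hypothetical helper)."""
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer
    from sklearn.feature_extraction.text import CountVectorizer
    from umap import UMAP

    umap_model = UMAP(
        n_neighbors=15,       # assumption
        n_components=5,       # assumption
        min_dist=settings["umap_min_dist"],
        metric="cosine",      # assumption
        random_state=42,      # assumption, for repeatable runs
    )
    vectorizer = CountVectorizer(
        ngram_range=settings["n_gram_range"],
        stop_words="english",
    )
    return BERTopic(
        embedding_model=SentenceTransformer(settings["embedding_model"]),
        umap_model=umap_model,
        vectorizer_model=vectorizer,
        min_topic_size=settings["min_topic_size"],
        nr_topics=nr_topics,
        calculate_probabilities=True,
    )
```

With a list `docs` of strings (for example, title, abstract, and keywords joined per paper), a run for one setting would look like `topics, probs = build_topic_model(10).fit_transform(docs)`, repeated over `SETTINGS["nr_topics_grid"]` to compare outcomes.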

Robustness was assessed through sensitivity analysis across varying topic numbers, minimum topic size thresholds, and coherence metrics. Stable clustering patterns and consistent intertopic distance structures across parameter variations confirmed the reliability of the thematic configuration. Cosine similarity and coherence score comparisons were used to ensure that conclusions were not dependent on a single parameter setting.
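For readers unfamiliar with the measure, cosine similarity between two topic vectors is the dot product divided by the product of their norms. The sketch below is a plain-Python illustration; the example vectors are hypothetical and are not taken from the study's topic model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical topic vectors, for illustration only.
topic_a = [1.0, 0.0, 1.0]
topic_b = [1.0, 1.0, 0.0]
similarity = cosine_similarity(topic_a, topic_b)
print(round(similarity, 3))
```

Values near 1 indicate topics whose vectors point in nearly the same direction, so comparing such values across parameter settings is one way to judge whether a thematic structure is stable.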

Manual validation is conducted through topic-wise inspection of classified papers, probability values, and citation counts. Three researchers were actively involved in the thematic validation process. Each researcher separately examined topic-document assignments, representative keywords, probability scores, full-text alignment, etc., to assess semantic coherence and relevance. Inter-rater consistency was first evaluated through independent coding comparison. In cases of disagreement regarding topic labeling, document inclusion, or thematic interpretation, a structured reconciliation process was followed: (1) re-examination of probabilistic scores and intertopic distance positioning, (2) re-reading of full-text articles within the disputed cluster, and (3) consensus discussion supported by citation patterns and keyword prominence.
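One simple way to operationalize an independent coding comparison is pairwise percent agreement, together with a list of the documents that need reconciliation. The text does not state which agreement statistic was used, so the sketch below is an assumption for illustration; the reviewer names and topic labels are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels_by_reviewer):
    """Share of documents on which each pair of reviewers assigned the same label."""
    result = {}
    for (name_a, labels_a), (name_b, labels_b) in combinations(labels_by_reviewer.items(), 2):
        if len(labels_a) != len(labels_b):
            raise ValueError("reviewers must code the same documents")
        matches = sum(x == y for x, y in zip(labels_a, labels_b))
        result[(name_a, name_b)] = matches / len(labels_a)
    return result


def disputed_documents(labels_by_reviewer):
    """Indices of documents where the reviewers do not all agree."""
    columns = zip(*labels_by_reviewer.values())
    return [i for i, column in enumerate(columns) if len(set(column)) > 1]


# Hypothetical codings of four documents by three reviewers.
codings = {
    "R1": ["modeling", "governance", "optimization", "modeling"],
    "R2": ["modeling", "governance", "modeling", "modeling"],
    "R3": ["modeling", "optimization", "optimization", "modeling"],
}
agreement = pairwise_agreement(codings)
to_reconcile = disputed_documents(codings)
print(agreement)
print(to_reconcile)
```

The disputed indices are the items that would then go through the reconciliation steps described above.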

Final topic labels were assigned only after agreement was reached among all three reviewers. This multi-stage consensus approach minimized individual interpretative bias and supported the rigor, transparency, and reproducibility of the thematic analysis, as well as the relevance and clarity of the topic labels, which is consistent with prior BERTopic studies [40,49]. Compared with LDA and NMF, BERT-based modeling better captures semantic similarity in complex textual datasets, allows overlapping themes, and supports scalable, domain-specific topic identification through UMAP-enhanced integrity [51].

Following prior BERTopic-based studies such as Alka et al. [36] and Nedungadi et al. [40], topic modeling was conducted using publication titles, abstracts, and authors' keywords, an approach that facilitates scalable semantic clustering across large bibliometric datasets. After automated topic generation, full-text articles within each cluster were reviewed to validate thematic coherence and ensure classification accuracy. In this study, the alignment of LLMs applied in SESs with the Sustainable Development Goals (SDGs) was identified via Elsevier's Scopus-based SDG mapping methodology, in which Boolean keyword searches derived from UN policy documents and SDG-specific academic vocabularies are combined with machine learning models trained on large, manually labeled datasets to improve precision and recall [38,52]. The integration of the Scopus SDG classification provides external methodological triangulation, strengthening construct validity by cross-referencing machine-generated themes with independently validated SDG taxonomies. This dual-layer validation enhances the credibility and reproducibility of the derived thematic conclusions.

This SDG mapping approach has been validated in prior energy and sustainability studies, including green hydrogen, sustainable aviation fuel, and SDG trajectory analyses, highlighting its suitability for large-scale, interdisciplinary energy research [3,51]. Accordingly, the SDG associations identified in this study represent thematic alignment between the reviewed research and SDG objectives rather than a quantitative mapping to specific SDG targets or indicators.

2.3. Case Studies

Case studies support the development of new theoretical understanding and provide contextual insights into particular cases; they can also draw on multiple methods and data sources for data compilation [53]. This method is suitable here because it offers qualitative insight into how theoretical and methodological approaches are applied in real-world cases [54]. From an initial 92 relevant documents, title and abstract screening identified 8 candidate case studies that met the inclusion criteria: (i) direct application of LLMs in energy systems, (ii) empirical case study design, and (iii) relevance to SDGs.

To minimize subjectivity, two independent reviewers coded the studies via a predefined framework covering the AI application context, system level, and SDG alignment. Disagreements were resolved through discussion. Based on the full agreement on relevance, clarity, and system integration depth, 6 case studies were retained. Two studies were excluded because they focused on education or cybersecurity rather than operational energy systems. The case studies are synthesized via a functional–comparative framework rather than a performance benchmarking approach. This choice is driven by heterogeneity across cases in terms of system scale, geographic context, energy infrastructure type, data availability, and evaluation protocols. Most source studies do not report standardized metrics such as cost reductions, peak load shaving percentages, emission savings, or regulatory violation counts in a comparable way [24]. The cases are analyzed on the basis of the application domain, LLMs, decision-making logic, optimization objective, and SDG relevance via cross-case analytical comparison without introducing nonreplicable or inferred quantitative estimates.

These case studies cover AI-based forecasting, modeling, optimization, and real-time scheduling in sustainable energy systems, including offshore wind farm modeling, agentic LLM workflows for building energy models (BEMs) in EnergyPlus, LLM-assisted autoformalism for personalized energy optimization, a two-stage framework combining linear programming and reinforcement learning (RL) for enterprise-level smart grid optimization, and a deep RL-based real-time scheduling framework.

While the cases have limitations related to geographic scope, methodological diversity, and limited long-term performance data, these gaps are addressed through complementary BERTopic-based thematic analysis. Together, the case studies provide a comprehensive understanding of how LLMs and AI contribute to sustainable energy systems across technical, operational, and policy dimensions, aligned with relevant SDGs.

2.4. ADO Framework

The ADO framework proposed by Paul and Benito [37] enables a systematic examination of the literature by categorizing research into antecedents (drivers), decisions (strategic choices), and outcomes (impacts). This structured synthesis supports clearer propositions for future research and provides deeper, contextually grounded insights [55,56,57]. Framework-based reviews using ADO offer greater transparency and analytical depth than narrative approaches [57]. Accordingly, the ADO framework is particularly suitable for this study, which integrates thematic analysis through BERTopic. ADO is well-suited for theory-building research requiring conceptual clarity and contextual coverage [56,57,58]. In this framework, antecedents include foundational drivers such as institutional readiness, technological maturity, policy support, and sociocultural factors; decisions represent key strategic choices and behaviors; and outcomes capture the operational and governance-level consequences of these decisions [57]. The framework has been effectively applied in prior studies on IT project success [55] and fintech transformation [59].

ADO facilitates future research by logically structuring interrelated constructs and examining their sequential relationships in a multidisciplinary context [37,57]. By organizing variables into drivers, choices, and impacts, the framework supports both exploratory and confirmatory research, enabling the development of theory-informed conceptual models and hypothesized relationships [37]. Consequently, ADO serves as a useful tool for theoretical development and empirical analysis, supporting the systematic model building needed to advance future research.

3. Results

This section presents the thematic analysis, followed by the case studies.

3.1. Themes Based on BERTopic

3.1.1. LLMs and Sustainable AI: Enablers, Trade-Offs, and Governance Pathways in Energy Transition

LLMs are increasingly viewed as enabling infrastructures for sustainable energy transitions, supporting decision-making, knowledge diffusion, energy optimization, and policy access across decentralized energy systems. By processing large-scale and heterogeneous data, LLMs facilitate stakeholder engagement and the integration of AI tools with sustainability objectives [32]. However, recent review and perspective studies highlight variation in how the sustainability value of LLMs is assessed, pointing to a recurring trade-off between the energy and carbon intensity of LLMs and the need for responsible governance, transparency, and inclusive deployment [23,27,28].

On the enabling side, studies frame LLMs as instruments for reducing information asymmetries and democratizing access to energy knowledge. Arslan et al. [18] identify information gaps among SMEs as a structural barrier to the UK’s net-zero transition and position LLM-based tools as institutional correctives rather than purely technical solutions. RAG-based energy chatbots improve access to policies, technologies, and funding, supporting SDG 7 and SDG 9. Likewise, Nammouchi et al. [60] show that Chat-SGP enables renewable energy communities (RECs) to engage with energy data through transparent natural language interfaces. However, a comparison reveals differentiated impacts: SME-oriented systems strengthen market participation and innovation (SDG 9), whereas REC-focused applications emphasize local empowerment and equity (SDG 7). This suggests that LLMs function as boundary-spanning infrastructures, but their sustainability contribution depends on governance design and institutional context.

However, another stream of research challenges the assumption that LLMs are inherently aligned with sustainability. Pimenow et al. [23] highlight the energy and carbon intensity of LLM training and deployment, questioning their compatibility with SDG 13 unless efficiency gains are realized. Achieving SDG 12, therefore, requires energy-efficient architectures and lifecycle management. Complementing this, Wang et al. [61], through analysis of 100,000 U.S. tweets, show that public concern centers on rising energy demand and the need for transparent, localized governance. Rather than focusing purely on technical lifecycle assessments, this perspective emphasizes social legitimacy as a critical constraint. These studies shift attention from downstream democratization benefits to upstream resource costs and societal acceptance, revealing a structural stress at the core of LLM-enabled sustainable energy transitions.

From a governance and regulatory perspective, intelligent analytical tools such as LLMs can support regulatory bodies in interpreting complex energy system data and policy frameworks. Integrated renewable energy systems involve multidimensional technological, economic, environmental, and regulatory interactions that require coordinated oversight and stakeholder collaboration [11]. In addition, regulatory-driven optimization frameworks highlight the importance of embedding policy and legal compliance into energy system operations to ensure adherence to carbon regulations and market mechanisms [24]. Such capabilities suggest that AI-assisted tools could strengthen regulatory monitoring, grid security analysis, and compliance with environmental standards in evolving energy systems.

The trade-off between these perspectives becomes more pronounced when examined through sustainability paradigms. Barahona and Almulhim [34], drawing on circular economy principles such as modularity, reusability, and energy efficiency, question the compatibility of energy-intensive digital infrastructures with sustainability objectives historically focused on low-energy and regenerative solutions. Their analysis highlights a structural stress between the resource-intensive architecture of current LLMs and the principles of responsible production and lifecycle efficiency associated with SDG 12. This suggests that without redesign toward modular, energy-efficient, and reusable AI systems, LLM utilization may risk reinforcing extractive digital infrastructures rather than supporting circular transitions. In contrast to application-oriented studies that treat LLMs as neutral optimization tools, this perspective reframes AI systems themselves as subjects of sustainability governance. Integrating circular economy principles into AI design and deployment is, therefore, not merely desirable but necessary to mitigate environmental trade-offs and ensure alignment with long-term sustainability objectives.

Despite the advantages of AI-enabled optimization in energy systems, operational and governance risks require systematic attention, particularly in safety-critical grid environments. LLM deployment in energy systems raises reliability concerns because generated outputs may contain inaccuracies or hallucinations, which can affect operational decision-making. Effective AI data governance is required to ensure data security, privacy protection, ethical AI practices, bias mitigation, and regulatory compliance throughout the model lifecycle [62]. Scalability challenges related to large model size, training data requirements, and computational resources remain a limitation for practical deployment [63]. At the system level, integrating inverter-based resources and community energy prosumers introduces coordination and policy dependency challenges that influence the scalability of grid innovations [5]. Renewable energy integration involves multidimensional constraints across technological, economic, environmental, social, and regulatory domains, highlighting the importance of coordinated planning and stakeholder governance mechanisms [11].

Regulatory complexity also plays a role in operational optimization, as integrated energy systems must comply with evolving carbon markets, policy incentives, and legal frameworks when determining dispatch strategies [24]. Energy community optimization frameworks that incorporate behavioral and demand-side data introduce additional data governance and coordination challenges [64]. Reviews of hydrogen-based microgrid management emphasize that advanced AI-driven optimization approaches increase computational complexity and deployment constraints in heterogeneous energy infrastructures [65].

These findings indicate that LLMs are neither inherently sustainable nor inherently unsustainable; rather, their systemic impact depends on lifecycle efficiency, governance mechanisms, and institutional oversight embedded within their design and application. Reliability assurance, regulatory compliance, data governance, and scalability limitations must be addressed systematically before AI-enabled decision systems can be safely deployed in critical energy infrastructures.

3.1.2. Advancing Intelligent Energy Systems with LLMs: Forecasting, Modeling, and Policy Integration

The literature indicates that LLM integration contributes to intelligent, automated, adaptive, and resilient energy infrastructures by extending beyond natural language processing into forecasting, modeling, and governance functions. Across studies, LLMs manage heterogeneous energy datasets, automate complex workflows, and bridge technical and regulatory barriers in the clean energy transition [29]. These capabilities support renewable integration (SDG 7), digital infrastructure innovation (SDG 9), and informed modeling and governance for climate adaptability (SDG 13) [4], reflecting a shift toward governance-integrated intelligent energy systems.

In forecasting, Dai et al. [66] and Duan et al. [67] develop LLM-based architectures for complex wind energy datasets, with Duan et al. [67] introducing a hybrid hard-soft prompting mechanism for flexible, multilocation forecasting. Chang et al. [68] prioritize accessibility through a zero-code, prompt-driven approach that reduces computational and expertise barriers, contributing to SDG 7 and SDG 9. While all three approaches build decentralized forecasting, they vary in emphasis on performance optimization versus usability and inclusiveness. Akinci et al. [69] enhance transparency by combining LLMs with decision trees and Shapley values, improving interpretability and trust in climate-related decision-making (SDG 13). These studies show that forecasting advances integrate accuracy, accessibility, and transparency.
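
To make the role of Shapley values in interpretability concrete, the following minimal sketch computes exact Shapley attributions for a toy value function over three input features. The feature names and the value function are hypothetical and are not taken from Akinci et al. [69]; practical applications typically rely on approximate estimators rather than full enumeration.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            totals[p] += value(extended) - value(coalition)
            coalition = extended
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_forecast_value(coalition):
    """Hypothetical contribution of a feature subset to a forecast score."""
    score = 0.0
    if "wind_speed" in coalition:
        score += 2.0
    if "temperature" in coalition:
        score += 1.0
    # Interaction term: pressure only helps when wind speed is also known.
    if {"wind_speed", "pressure"} <= coalition:
        score += 1.0
    return score


features = ["wind_speed", "temperature", "pressure"]
attributions = shapley_values(features, toy_forecast_value)
```

The attributions sum to the value of the full feature set, and the interaction term is split equally between the two interacting features, which is the property that makes the method attractive for explaining forecasts.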

Beyond forecasting, LLMs enable automation in energy system modeling. Zhang et al. [64] propose a modular, agent-based workflow that converts textual building descriptions into validated EnergyPlus models. Specialized agents for preprocessing, IDF object extraction, object generation, and debugging decompose complex tasks into structured stages, improving modeling accuracy and reliability, expanding digital modeling capacity (SDG 9), and supporting climate-responsive building design (SDG 13). However, policy integration within modeling workflows remains limited.
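
The staged decomposition described above can be sketched as a simple pipeline. Every stage here is a plain-Python stand-in for what would be an LLM agent, and the object vocabulary, field names, and validation rule are hypothetical; the sketch shows only the control flow, not the workflow implemented in the cited study.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    text: str
    object_types: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    issues: list = field(default_factory=list)
    trace: list = field(default_factory=list)


def preprocess(draft):
    # Normalize case and whitespace in the textual building description.
    draft.text = " ".join(draft.text.lower().split())
    draft.trace.append("preprocess")
    return draft


def extract_object_types(draft):
    # Stand-in for an extraction agent: look up a hypothetical vocabulary.
    vocabulary = ("zone", "wall", "window")
    draft.object_types = [t for t in vocabulary if t in draft.text]
    draft.trace.append("extract")
    return draft


def generate_objects(draft):
    # Stand-in for a generation agent: emit one placeholder record per type.
    draft.objects = [{"type": t, "name": f"{t}_1"} for t in draft.object_types]
    draft.trace.append("generate")
    return draft


def debug(draft):
    # Stand-in for a debugging agent: flag records that lack a name.
    draft.issues = [o for o in draft.objects if not o.get("name")]
    draft.trace.append("debug")
    return draft


def run_workflow(description):
    draft = Draft(text=description)
    for stage in (preprocess, extract_object_types, generate_objects, debug):
        draft = stage(draft)
    return draft


result = run_workflow("A single  ZONE office with one south-facing window.")
```

Keeping each stage as a separate function with a shared state object is one way to make intermediate outputs inspectable, which is the property the modular design is credited with.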

Policy-oriented applications are addressed by Buster et al. [13], who use LLMs to extract zoning and siting regulations for solar and wind energy projects from legal texts. By combining LLMs with decision trees to develop a public regulatory database, their approach improves scalability, speed, and transparency in energy infrastructure planning, directly linking AI-enabled systems to climate-aligned governance under SDG 13.

These studies highlight the role of LLMs as integrative mediators in the development of intelligent, operational, and policy-driven sustainable energy systems. While forecasting and modeling studies emphasize technical performance and automation, policy-focused work highlights governance and transparency. The ability of LLMs to predict, model, automate, and integrate renewable energy may support progress toward SDG 7, SDG 9, and SDG 13 through potential improvements in energy access, innovation, and climate-driven system development.

3.1.3. LLMs and Generative AI for Decarbonized Innovation: A Multisectoral Application in Hydrogen, Electrochemical Storage, and Infrastructure Optimization

The literature conceptualizes LLMs and GenAI as cross-sectoral enablers of decarbonized innovation, extending beyond system optimization to materials discovery, infrastructure management, governance, and energy value chains, including predictive maintenance, catalyst discovery, battery optimization, and hydropower management. This expansion indicates a transition from isolated optimization tasks toward integrated value-chain intelligence.

In materials and hydrogen innovation, Ock et al. [70] propose CatBERTa, a transformer-based model using textual descriptors to predict catalyst adsorption energy. By removing reliance on atomic coordinates while maintaining high accuracy, the model improves interpretability and methodological accessibility, supporting sustainable hydrogen innovation under SDG 9 and SDG 13. In contrast, Shahin and Simjoo [26] evaluate ChatGPT across hydrogen case studies and report up to 97% predictive accuracy in maintenance, policy analysis, and process evaluation. Whereas CatBERTa targets early-stage material discovery, ChatGPT applications address operational and governance functions, demonstrating complementary roles of LLMs across the hydrogen value chain and supporting SDG 7 and SDG 9.

The role of domain-specific model design is further emphasized by Gabber and Hemied [19], who develop RE-LLaMA based on LLaMA 3.1 (8B). Their results show superior performance over generic LLMs in zero- and few-shot tasks related to material discovery, battery lifecycle assessment, cost optimization, and grid-scale storage adoption. Compared with generic LLM applications, this domain-adapted approach highlights the importance of contextual pretraining for reliability in specialized energy contexts, reinforcing industrial innovation (SDG 9) and clean energy deployment (SDG 7).

Beyond hydrogen, GenAI applications extend to storage and infrastructure optimization. Li and You [71] show that generative adversarial networks, diffusion models, and multimodal LLMs are used for nano- and microscale material discovery, battery lifecycle assessment, cost efficiency, and grid-scale storage adoption. Zhu et al. [72] report that ChatGPT-like models improve hydropower station management through enhanced control systems, automation, and decision support, contributing to safe and efficient clean energy operation. While storage-focused studies emphasize lifecycle efficiency and scalability, hydropower applications prioritize operational safety and reliability (SDG 9), indicating sector-specific sustainability priorities within a shared AI-enabled framework.

Energy system modeling and optimization are essential for managing renewable integration, load balancing, and grid stability. Recent studies apply AI-based optimization methods to improve system performance. For instance, deep reinforcement learning has been used for adaptive microgrid energy management under uncertain demand and renewable generation [20,73]. Community energy optimization frameworks also incorporate demand-side behavior and shared energy resources to improve operational efficiency [74]. In addition, regulatory-driven optimization models integrate carbon policies and market constraints into energy system dispatch, enabling policy-compliant and flexible operations [24]. These approaches show how AI-driven modeling and optimization can support intelligent energy system management and complement emerging LLM-based decision-support tools.

Recent studies also extend LLM applications to sustainability assessment. For example, SustainLLM proposed by Li et al. [75] uses LLMs to automatically extract lifecycle sustainability data from scientific literature, regulatory reports, and monitoring systems, improving the efficiency and consistency of Life Cycle Assessments (LCA). The framework integrates multi-objective optimization to evaluate environmental, economic, and technical trade-offs in energy transition planning. By enabling data-driven evaluation of emissions, resource efficiency, and energy system performance, SustainLLM supports SDG 7 (clean energy systems), SDG 9 (data-driven infrastructure innovation), and SDG 13 (climate action) through improved lifecycle sustainability analysis and evidence-based decarbonization planning.

These studies indicate that LLMs and GenAI can act as innovation catalysts across multiple energy sectors. Domain-specific models improve technical performance, while general-purpose LLMs support operational, maintenance, and policy decision-making. Frameworks such as SustainLLM extend this role by enabling dynamic Life Cycle Assessments (LCAs) for evidence-based energy transition planning. These approaches accelerate innovation, improve system efficiency, and support decarbonized energy infrastructure aligned with SDG 7, SDG 9, and SDG 13.

3.1.4. Deep Reinforcement Learning for Intelligent Energy System Optimization: Decarbonization, Infrastructure Innovation, and Clean Energy Access

Reinforcement learning (RL)/deep reinforcement learning (DRL) and LLMs are architecturally and functionally different AI paradigms. While LLMs primarily provide semantic reasoning, natural language interfaces, and governance-related interpretation, RL frameworks operate as numerical optimization and control mechanisms under uncertainty. Their co-occurrence in the thematic results indicates complementary roles within intelligent energy system architectures, where RL supports operational decision optimization, and LLMs contribute higher-level coordination and policy integration. RL and DRL function as core optimization mechanisms for adaptive, data-driven, low-carbon energy systems [76]. Across renewable integration, grid coordination, and decentralized infrastructure, RL-based frameworks address uncertainty, intermittency, and real-time control challenges, supporting SDG 7 (clean energy access), SDG 9 (digital infrastructure innovation), and SDG 13 (decarbonization). The reviewed studies collectively demonstrate how RL and DRL contribute to optimization, equitable access, and infrastructure intelligence.

At the system-integration level, the DRL-based optimization framework by Yi et al. [76] for a nuclear-renewable integrated energy system (NR-IES) coordinates nuclear power, hydrogen production, renewable generation, and energy storage. The results reported by Yi et al. [76] indicate that proximal policy optimization (PPO) achieves 13.9% higher mean-episode returns during training and 29.4% higher mean-episode returns during testing compared with Particle Swarm Optimization (PSO) under the same optimization conditions. This hybrid system illustrates how DRL enables flexible and low-carbon energy production, contributing to SDG 7 and SDG 9 through intelligent energy infrastructure integration. This contrasts with grid-focused studies that focus on fast adaptive control rather than multivector integration.
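
For readers comparing such figures across studies, the sketch below shows one common way a percentage gain in mean-episode return is computed. The definition (difference relative to the absolute baseline return) and the numerical returns are assumptions made for illustration; the source study may define the comparison differently.

```python
def relative_improvement(candidate, baseline):
    """Percentage gain of a candidate return over a baseline return."""
    if baseline == 0:
        raise ValueError("baseline return must be non-zero")
    return (candidate - baseline) / abs(baseline) * 100.0


# Hypothetical mean-episode returns chosen only to reproduce a 13.9% gain.
pso_return = 100.0
ppo_return = 113.9
gain_percent = relative_improvement(ppo_return, pso_return)
```

Dividing by the absolute value of the baseline keeps the sign of the gain meaningful when returns are negative, which is common when rewards are defined as negative costs.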

Singh et al. [73] apply a deep Q-network (DQN) within an OpenAI Gym environment to integrate solar energy into smart grids. The focus is on real-time load balancing and battery optimization to manage intermittency. Compared with Yi et al. [76], this value-based RL approach prioritizes localized operational responsiveness rather than long-horizon system coordination, contributing to SDG 7 and advancing digital grid infrastructure under SDG 9.
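
The value-based update underlying this family of methods can be sketched in tabular form. A DQN replaces the table with a neural network and adds components such as experience replay; the states, actions, reward, and hyperparameters below are hypothetical and do not reproduce the environment used by Singh et al. [73].

```python
from collections import defaultdict

ACTIONS = ("charge", "discharge", "idle")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the bootstrapped target."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)  # unseen state-action pairs start at 0.0

# Hypothetical transition: battery is low, solar is in surplus, agent charges.
state = ("battery_low", "solar_surplus")
next_state = ("battery_mid", "solar_surplus")
first_value = q_update(q_table, state, "charge", 1.0, next_state)

# Once the next state has a known value, that value propagates backward.
q_table[(next_state, "idle")] = 2.0
second_value = q_update(q_table, state, "charge", 1.0, next_state)
```

The second update is larger than the first because the discounted value of the next state is now included in the target.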

At decentralized and enterprise scales, Maryasin [77] proposes a two-stage framework combining linear programming with RL-based scheduling for real-time control of local production and storage, minimizing waste and strengthening decentralized systems under SDG 7 and SDG 9. Yang et al. [78], using the CityLearn platform in cold-climate contexts, show that RL-based coordination of storage and adaptive demand control enhances clean energy utilization and community-level infrastructure innovation, supporting SDG 7 and SDG 9. Although differing in scale, both studies reinforce RL’s role in decentralized and community-level energy innovation [16].

Platform-oriented contributions highlight scalability and accessibility. Zhou et al. [74] introduce SEDRL, an open-source DRL platform integrating PandaPower and Stable Baselines for real-time renewable scheduling under uncertainty. By improving performance in managing intermittency and broadening access to advanced optimization tools, the platform supports renewable integration (SDG 7) and democratizes intelligent infrastructure solutions (SDG 9).

Across system-level integration [76], enterprise and community applications [77,78], and open-source platforms [74], RL and DRL consistently enhance sustainability, resilience, and efficiency by reducing fossil fuel dependence (SDG 7) and enabling adaptive infrastructure (SDG 9). The comparison across scales shows a clear differentiation: system-level models emphasize coordinated low-carbon integration, while grid- and community-level applications prioritize real-time responsiveness and equitable access. These findings suggest that RL-based optimization provides the operational control backbone upon which LLM-enabled forecasting, modeling, and governance functions can be integrated, reinforcing a layered hybrid AI architecture in sustainable energy systems.

Microgrid energy management is a key application of intelligent optimization because it coordinates renewable generation, storage, and demand under uncertainty. Guo et al. [20] formulate the microgrid optimal energy management problem as a Markov decision process and apply a proximal policy optimization (PPO)-based DRL approach for real-time scheduling. Their results show improved computational efficiency and operational performance compared with conventional optimization methods. The study also highlights the importance of carefully designed training and operational phases to ensure reliable decision-making under uncertainty [20].

Complementary evidence from Sarwar et al. [65] indicates that energy management strategies in hydrogen-based building microgrids require different optimization approaches depending on system predictability, computational complexity, and deployment conditions. Their review shows that deterministic, stochastic, and machine-learning-based methods each provide advantages under different operating scenarios. These studies highlight that advanced optimization methods play a critical role in enabling adaptive and efficient microgrid operation in renewable-based energy systems.

Although the reviewed studies highlight significant potential for LLMs and related AI systems to enhance forecasting, modeling, optimization, and governance in sustainable energy systems, several systemic limitations and trade-offs emerge across themes. The high computational intensity associated with training and deploying large models may increase energy demand and lifecycle carbon emissions, potentially offsetting climate benefits if efficiency gains are not realized. Reliance on large-scale training datasets raises concerns regarding bias, transparency, and interpretability, particularly in policy-sensitive and governance applications. Many applications remain context-specific, and the generalizability of domain-adapted models across diverse regulatory, infrastructural, and socioeconomic environments remains uncertain. Increased automation in decision-making processes may reduce human oversight if accountability mechanisms are not embedded within system design. These trade-offs suggest that the sustainability of LLM-enabled energy transitions depends not only on technical performance but also on lifecycle efficiency, governance safeguards, contextual adaptation, and responsible deployment strategies.

Figure 4 summarizes the key characteristics of the four thematic clusters identified through BERTopic analysis. The themes represent different but complementary roles of AI or LLMs in SESs, including governance-oriented applications of LLMs, intelligent energy system development, generative AI-driven decarbonization innovation, and DRL for operational optimization. The comparison highlights how these themes differ in their roles, applications, system design approaches, and sustainability impacts, revealing the evolving integration of LLMs across the energy transition.

3.2. Case Studies

The six recent application contexts summarized in Table 1 examine LLM-enabled and AI-supported approaches within SESs, while also recognizing the continued importance of complementary techniques for forecasting, modeling, optimization, and real-time scheduling. Collectively, these studies position LLMs as higher-level semantic, coordination, and decision-support layers operating alongside established technical models to advance infrastructure development aligned with SDG 7, SDG 9, and SDG 13.

To enhance clarity and benchmarking, the case analyses are structured around four performance dimensions: economic efficiency (cost and computational implications), environmental performance (emission reduction and renewable integration effects), operational reliability and accuracy (model validation, convergence, and scheduling stability), and governance or decision-support relevance (transparency, coordination, and institutional embedding). Not all studies report standardized quantitative metrics across these dimensions; however, this analytical structure provides systematic coverage and comparative interpretation of reported outcomes.

Ali et al. [79] conducted a case analysis of five wind-farm parameterizations using the Weather Research and Forecasting (WRF) Model v4.3.3 to simulate offshore wind interactions in the North Sea. Model performance was validated through airborne observations, satellite imagery, and FINO-1 measurements. In terms of operational reliability, the study quantified inter-array wake effects, reporting that the wake generated by the Veja Mate wind farm reduced the capacity factor of the Bard Offshore 1 wind farm from 0.96 to 0.82 under stable atmospheric conditions. This measurable reduction demonstrates the impact of turbulence modeling on renewable generation efficiency. Although no direct cost-savings metrics were reported, improved prediction accuracy supports more reliable renewable yield estimation and infrastructure planning, contributing to SDG 7 (renewable energy integration) and SDG 13 (climate-resilient infrastructure) (RQ2). While LLMs were not employed in this study, it establishes a performance baseline at the infrastructure layer within which higher-level AI or LLM-enabled coordination could operate.
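
The reported capacity factor change can be restated as absolute and relative losses with a short calculation. The capacity factors are those quoted above; the 100 MW rated capacity and the one-hour window are illustrative assumptions used only to express the loss in energy terms.

```python
def capacity_factor_loss(cf_reference, cf_waked):
    """Return the absolute drop and the relative drop (in percent)."""
    absolute = cf_reference - cf_waked
    relative = absolute / cf_reference * 100.0
    return absolute, relative


def energy_output_mwh(rated_mw, capacity_factor, hours):
    """Energy produced at a given capacity factor over a period."""
    return rated_mw * capacity_factor * hours


absolute_drop, relative_drop = capacity_factor_loss(0.96, 0.82)

# Hypothetical 100 MW farm over one hour, for illustration only.
lost_mwh = energy_output_mwh(100.0, 0.96, 1.0) - energy_output_mwh(100.0, 0.82, 1.0)
```

Under these assumptions, a drop of 0.14 in capacity factor corresponds to a relative loss of roughly 14.6% and to about 14 MWh of forgone generation per hour for every 100 MW of rated capacity.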

In contrast, Zhang et al. [64] placed LLMs at the core of system design by developing a modular, agent-based workflow for automated creation and debugging of Building Energy Models (BEMs) using EnergyPlus. The study reports explicit computational cost metrics: processing approximately 120,000 tokens incurs around $0.4 input and $1.6 output per session for GPT-o1-mini, and $0.4 input and $2.0 output per session for Claude 3.5, with repeated iterations potentially accumulating to hundreds of dollars. In terms of operational performance, the agentic workflow outperformed naive prompt engineering, alternative LLM workflows, and manual modeling in accuracy, reliability, and time efficiency, although percentage improvements were not numerically specified. By reducing modeling errors and technical barriers in a sector responsible for approximately 40% of global energy use and 33% of carbon emissions [64], the approach indirectly supports decarbonization planning aligned with SDG 9.
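
The per-session figures quoted above can be combined to show how repeated iterations accumulate. The input and output costs are taken from the text; the USD 100 budget threshold is a hypothetical value chosen for illustration.

```python
import math

# (input cost, output cost) in USD per session, as quoted in the text.
SESSION_COST_USD = {
    "GPT-o1-mini": (0.4, 1.6),
    "Claude 3.5": (0.4, 2.0),
}


def session_cost(model):
    """Total cost of one session for the given model."""
    input_cost, output_cost = SESSION_COST_USD[model]
    return input_cost + output_cost


def sessions_to_reach(model, budget_usd):
    """Smallest number of sessions whose cumulative cost reaches the budget."""
    return math.ceil(budget_usd / session_cost(model))


# Hypothetical budget threshold of USD 100.
sessions_o1_mini = sessions_to_reach("GPT-o1-mini", 100.0)
sessions_claude = sessions_to_reach("Claude 3.5", 100.0)
```

At roughly USD 2.0 to 2.4 per session, a few dozen debugging iterations are enough to reach USD 100, which is consistent with the observation that repeated runs can accumulate to hundreds of dollars.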

Likewise, Jin et al. [80] proposed an LLM-assisted optimization autoformalism that translates natural-language objectives into customized energy optimization strategies for HVAC control, electric vehicle charging, and renewable planning. By improving the coordination of building energy systems, urban mobility charging, and distributed renewables, the approach supports energy-efficient and resilient urban infrastructure, thereby contributing to sustainable city energy management (SDG 11) and climate-oriented optimization (SDG 13). Under the economic and operational dimensions, the system promotes off-peak charging and optimized scheduling, which reduces grid stress and stabilizes electricity prices; however, quantified cost savings or emission reductions were not reported. The primary measurable contribution lies in enhanced decision accessibility and usability. By translating complex optimization problems into actionable recommendations (e.g., thermostat setpoints), the system strengthens decision transparency and user engagement, supporting decentralized energy management aligned with SDG 7 and SDG 9 (RQ2). In a related contribution, Zhang and Chen [81] review LLM applications in the building energy sector, focusing on intelligent control, automation, regulatory compliance, and lifecycle management within buildings; however, their analysis remains confined to the building domain and does not generalize across broader SES contexts or governance integration.

At the enterprise level, Maryasin [77] presented a two-stage optimization framework combining linear programming and reinforcement learning for smart grid control. In terms of economic performance, the optimized energy consumption profile reduced central grid usage during peak-price hours compared to the base case, fully compensating for peak load reductions, although exact monetary savings were not specified. Regarding operational reliability, reinforcement learning algorithms demonstrated good convergence and stable scheduling under system constraints. While emission reductions were not directly quantified, the integration of renewable generators and storage devices supports improved local energy efficiency and load management consistent with SDG 7.

At the community and regional scale, Yang et al. [78] showed that reinforcement learning-based controllers effectively coordinate energy storage and demand in cold-climate contexts, underscoring RL’s adaptability for decentralized energy coordination (RQ1). Zhou et al. [74] examined the relationship between industrial automation and pollution intensity across Chinese provinces. The study found that higher levels of industrial robot application were associated with significantly lower pollution emission intensity, which provides an empirically grounded environmental performance indicator that supports clean energy transitions and pollution reduction (SDG 7 and SDG 13) while also contributing to sustainable urban and industrial systems (SDG 11). However, specific percentage reductions were not detailed in the summarized results, and cost savings and reliability metrics were not explicitly evaluated.

Therefore, these studies indicate measurable performance improvements across renewable yield reliability, computational cost transparency, scheduling stability, and pollution intensity reduction. However, standardized reporting of cost savings, quantified emission reductions, reliability indices, and compliance metrics remains inconsistent across cases. This heterogeneity limits direct cross-case benchmarking and highlights the need for harmonized performance reporting frameworks in evaluating LLM-enabled and AI-assisted sustainable energy systems (RQ3).

While the results show extensive applications of LLMs and related AI systems in forecasting, modeling, optimization, governance, and cross-sectoral energy innovation, the analysis also reveals important structural limitations and trade-offs. Studies clearly question the assumption that LLMs are inherently aligned with sustainability goals. For example, Pimenow et al. [23] highlight the significant energy and carbon intensity associated with LLM training and deployment, while Barahona and Almulhim [34] identify a structural tension between energy-intensive LLM architectures and circular economy principles linked to SDG 12. Forecasting and modeling frameworks (e.g., Duan et al. [67]; Zhang et al. [64]; Yi et al. [76]) emphasize performance gains and automation efficiency, yet they depend on computational infrastructures whose lifecycle environmental impacts are not comprehensively assessed. Domain-adapted models such as RE-LLaMA [19] and CatBERTa [70] improve contextual reliability, but their specialization raises concerns regarding scalability and cross-context generalizability. Governance-oriented applications (Buster et al. [13]; Akinci et al. [69]) emphasize transparency and interpretability, implicitly acknowledging that accountability mechanisms are necessary to maintain policy legitimacy. Wang et al. [61] show that public perception and social acceptance influence sustainability outcomes, indicating that AI-enabled energy transitions are socially negotiated rather than purely performance-driven. Therefore, these findings suggest that the sustainability contribution of LLMs in energy systems is conditional upon lifecycle efficiency, governance safeguards, contextual adaptability, and responsible deployment, rather than being an automatic outcome of technical advancement.

4. Discussion

This study integrates case study evidence with BERTopic-based thematic analysis to interpret the role of large language models in sustainable energy systems. Rather than reiterating where LLMs are applied, the findings indicate a structural repositioning of AI within energy systems. The concentration of applications in modeling, diagnostics, optimization, and policy interaction (RQ1) suggests that LLMs are evolving from task-specific analytical tools into coordination and interpretation layers that mediate between infrastructure, regulatory frameworks, and operational decision-making. Their deployment across intelligent infrastructure, hydrogen systems, building energy models, and storage contexts shows a broader transition toward digitally embedded and cognitively augmented energy systems aligned with SDG targets 7.2 (expanding renewable energy integration through improved forecasting and system coordination), 7.3 (enhancing energy efficiency via intelligent diagnostics and operational optimization), 9.4 (enabling sustainable and resilient infrastructure through AI-supported energy system management), and 13.2 (supporting integration of climate considerations into energy planning and policy decision-making).

The dual role of AI in optimization and diagnostics, consistent with prior findings in adaptive control and battery management [29,77], carries systemic implications beyond performance enhancement. While earlier studies highlighted improvements in resilience and efficiency within specific domains, the present synthesis indicates that these adaptive capabilities are diffusing into fixed and decentralized infrastructures. This suggests maturation from sector-specific AI deployment toward system-level integration (RQ2). The recurring concerns regarding equity, environmental burden, and responsible AI integration (RQ3) indicate that scaling LLM-enabled systems introduces governance challenges that cannot be resolved through technical optimization. Thus, resilience emerges not only as a technical outcome but as an institutional and socio-technical condition.

In energy modeling, the significance of LLM integration extends beyond the automation of EnergyPlus-based Building Energy Models. Although modular model generation has been indicated by Zhang et al. [25] and Zhang & Chen [81], the incorporation of legal interpretation and regulatory alignment [24] signals a deeper transformation: modeling environments are becoming governance-aware systems. Compared with forecasting-accuracy-oriented research such as Kim & Kim [82], the emphasis here is on end-to-end integration, where LLMs and DRL converge to embed compliance, scalability, and optimization within a unified pipeline. This indicates that the future trajectory of LLM deployment lies in coupling computational modeling with institutional reasoning rather than improving predictive precision alone.

The extension to hydrogen systems further reinforces this integrative interpretation. While hydrogen applications have previously been grouped under broader machine learning frameworks [65], the evidence synthesized here suggests that LLMs span multiple lifecycle stages, from catalysis to operational coordination (RQ1). The integration of LLMs with DRL results in a hybrid intelligence architecture in which semantic reasoning complements numerical optimization. The broader implication is that industrial decarbonization may increasingly rely on such hybrid systems to coordinate technological innovation with strategic decision-making, thereby aligning technical progress with SDG-driven objectives.

A critical interpretive contribution concerns sustainability trade-offs. The identification of environmental and energy burdens associated with LLM integration [61], supported by evidence on computational intensity and training-related emissions [83], highlights a structural paradox: technologies intended to accelerate sustainable transitions may themselves generate additional resource pressures. This reframes the discussion from performance gains to lifecycle accountability. Modular, decentralized, and task-specific LLM designs therefore emerge not only as efficiency strategies but as governance mechanisms to reconcile digital innovation with responsible consumption, aligning with SDG 12.2 (promoting efficient use of natural resources by reducing computational and energy intensity) and 12.6 (encouraging responsible technological practices and sustainability reporting in digital and energy systems) (RQ2).

Likewise, in predictive maintenance and market optimization contexts, prior findings on AI-enabled safety and monitoring [2], together with evidence from energy market optimization [84] and real-time scheduling under uncertainty [74,78], frame predictive capabilities as enablers of climate-adaptive and resilient infrastructure rather than merely operational tools, aligning with SDG 13.1 (resilience and adaptive capacity to climate-related risks in infrastructure systems). However, the broader significance lies in how LLMs and GenAI tools reshape coordination dynamics within energy markets and infrastructure systems [84,85]. Increasing interdependence between algorithmic decision-making, institutional regulation, financial efficiency, and operational resilience encourages stakeholder collaboration and supports the optimization of intelligent infrastructure, in line with SDG 9.4 (upgrading infrastructure through advanced technologies to improve sustainability and efficiency). This highlights the need for transparency, accountability, and standardized evaluation frameworks when embedding LLMs within economically sensitive systems.

The convergence between case analyses and BERTopic thematic modeling indicates that LLMs represent more than incremental technological upgrades within SESs. They signal a transition toward hybrid socio-technical systems in which cognitive automation, adaptive optimization, and governance structures co-evolve. The long-term impact of LLM integration will depend less on isolated technical performance and more on institutional embedding, lifecycle-aware evaluation, and context-sensitive governance. Consequently, the significance of these findings lies in reframing LLM-enabled energy transitions as governance-intensive transformations rather than purely technological advancements.

In summary, this study integrated case analyses and BERTopic-based thematic analysis to reveal the role of LLMs in SES development. The research approaches identified are mixed, ranging from cognitive automation (through LLMs) to adaptive control, with functions in modeling, governance, and optimization. The convergence between cases and thematic analysis suggests the potential for system-level integration that could support clean energy access (SDG 7), infrastructure innovation (SDG 9), responsible AI lifecycle management (SDG 12), and climate action (SDG 13). The findings indicate that LLMs have large potential in sustainable energy systems, but their long-term impact depends on responsible governance, standardized evaluation frameworks, and context-sensitive implementation.

The evidence included in this review has several limitations. It relies on the published literature and selected case studies from Scopus only, excluding other databases. The study also has methodological limitations. In the initial review, BERTopic modeling was conducted using titles, abstracts, and author keywords rather than full-text articles. Although post-clustering full-text review by domain experts was performed to validate thematic coherence and improve classification accuracy, some methodological details may not have been fully reflected in the automated clustering process. The PRISMA screening process identified 92 studies, which constituted the corpus used for BERTopic modeling and thematic analysis. However, only the most representative and thematically relevant studies are discussed and cited in the manuscript, while the remaining studies [86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116] contributed to the clustering process and thematic identification but are not individually referenced. Future research incorporating full-text corpora and structured technical metadata could further enhance thematic depth and analytical precision.

Although comprehensive keywords were used for the literature search, some relevant studies using alternative terminology or emerging concepts may not have been captured. The case studies have limited geographical coverage and lack long-term performance evaluation, reducing generalizability across regulatory and climatic contexts. The SDG mapping in this study may not fully capture all SDG contributions. Some studies may be partially misclassified or underrepresented when sustainability impacts are indirect, context-specific, or involve multiple SDGs.

The included evidence presents a potential risk of bias, as many studies emphasize successful implementations while limited attention is given to failed applications or negative outcomes. There is variation in study design quality, with differences in transparency, validation, and empirical depth across the literature. The findings show inconsistency in reported impacts, as outcomes are context-dependent and influenced by system-specific conditions. There is also imprecision, as many studies may not report standardized quantitative indicators (e.g., cost savings, emission reductions), preventing precise measurement of sustainability impacts.

SDG mapping here is interpretive, literature-driven, and based on reported outcomes, application contexts, and sustainability objectives rather than statistical classification. There is methodological heterogeneity across studies; quantitative validation metrics are not applied, and confirmation relies on triangulation across bibliometric analysis, topic modeling, and representative application contexts. Future research may extend this work by developing quantitative indicators to measure the SDG impacts of LLM-enabled energy systems. Although the sustainability trade-offs of AI are discussed, this study does not quantitatively assess the life-cycle environmental impacts of LLMs. Inconsistent reporting of performance indicators such as economic cost reductions, emissions impacts, constraint violations, or peak load reductions in the source literature has limited standardized, metric-based comparisons, making the case analysis more conceptual than statistical. Future research should focus on unified reporting standards, benchmark datasets, and meta-analytic comparisons of LLM interventions across technical, economic, and environmental dimensions.

5. Implications

The major implications of the study in terms of theory, practice, and policy are described below.

5.1. Implications for Theory

The examination of LLMs in SESs provides a theoretical contribution to intelligent infrastructure, responsible AI, and sustainable energy transitions. This study contributes to sustainability transition theory by showing how LLMs and AI accelerate the shift toward low-carbon and adaptive sustainable energy systems through models that increase responsiveness, automation, semantic reasoning, decentralized control, adaptive modeling, and infrastructure development. For example, LLM-based modular agent workflows for automating BEMs highlight the shift from deterministic, rule-based energy systems to more flexible systems based on natural language processing, which addresses SDG 9.

The study also highlights the importance of transparency about energy intensity, which aligns directly with responsible AI governance in the design phase; this offers a distinct perspective in the literature by revealing how AI can both support and restrict the achievement of sustainability goals. Socio-technical systems theory is also relevant in this context: the findings point to requirements for policy consistency, institutional readiness, and trust in community-level mechanisms, and show how trust and environmental trade-offs affect AI- and LLM-based energy systems in both centralized and decentralized community-based energy system development. The use of green AI in the development of equitable and energy-efficient systems shows that AI can be not only accurate but also conducive to sustainability, contributing to SDG 12 and SDG 13.

This examination, conducted through a mixed-method approach, shows that LLMs have a role in infrastructure, social behavior, and institutional governance. The chatbot systems for SMEs and renewable energy communities show how AI supports inclusive policy engagement and the democratization of knowledge, directly addressing SDG 7 and SDG 9 through stakeholder-based innovation, which is a novel contribution. This study provides a broader view of how LLMs contribute to the multilevel SES transition, combining optimization, governance, and equity.

5.2. Implications for Policy

These findings indicate an urgent need for SDG-aligned policy interventions to regulate the use of LLMs in sustainable energy systems (SESs). Government-led AI governance is essential for managing automation risk while enabling transparent and accountable interpretations of energy regulations, grid operations, and demand-response mechanisms, supporting SDG 13, particularly target 13.2, in climate-integrated energy planning. The study further shows that inclusive and decentralized SESs can be promoted through LLM-based tools such as Energy Chatbot and Chat-SGP, which support SMEs and local communities. Accordingly, policymakers should promote localized and multilingual AI systems to improve energy literacy and participation, directly contributing to SDG 7, especially indicator 7.1.1 on access to affordable and modern energy services. Impact can be assessed via pre/post-adoption measures such as changes in information access and participation in clean energy programs.

Compulsory green computing mandates, the carbon labeling of AI systems, and a culture of modular and recyclable AI hardware aligned with circular economy principles can reduce high energy consumption and advance sustainability goals, supporting both SDG 12 and SDG 13. Policies should also encourage the use of small or domain-specific LLMs with energy-aware integration practices to reduce lifecycle impacts. Responsible AI governance should be implemented through procurement standards, including the carbon labeling of AI models (training and inference emissions), mandatory modular and recyclable hardware criteria, and regulatory sandboxes for participatory policy chatbots with human support. Effectiveness can be monitored via clear metrics such as emissions deltas, regulatory violation rates, human override frequency, and equity-of-access indicators.
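As an illustration of how three of the monitoring metrics named above could be computed from simple period-level records, the sketch below uses a hypothetical record structure. All field names, the choice of kilograms as the emissions unit, and the definitions (difference between periods, share of decisions) are assumptions made for illustration; the reviewed literature does not prescribe them, and equity-of-access indicators are omitted because no definition is given.

```python
from dataclasses import dataclass


@dataclass
class MonitoringPeriod:
    """Hypothetical record for one reporting period of an LLM-enabled tool."""
    emissions_kg: float    # reported training + inference emissions (assumed unit)
    decisions: int         # automated decisions or recommendations issued
    violations: int        # decisions flagged as regulatory violations
    human_overrides: int   # decisions overridden by a human operator


def emissions_delta(baseline: MonitoringPeriod, current: MonitoringPeriod) -> float:
    """Change in emissions between two periods (negative means a reduction)."""
    return current.emissions_kg - baseline.emissions_kg


def violation_rate(period: MonitoringPeriod) -> float:
    """Share of decisions flagged as regulatory violations."""
    return period.violations / period.decisions if period.decisions else 0.0


def override_frequency(period: MonitoringPeriod) -> float:
    """Share of decisions overridden by a human."""
    return period.human_overrides / period.decisions if period.decisions else 0.0


if __name__ == "__main__":
    # Illustrative numbers only; they are not data from any reviewed study.
    before = MonitoringPeriod(emissions_kg=120.0, decisions=200, violations=6, human_overrides=20)
    after = MonitoringPeriod(emissions_kg=90.0, decisions=200, violations=3, human_overrides=10)
    print(emissions_delta(before, after), violation_rate(after), override_frequency(after))
```

Expressing the rates as shares of decisions keeps periods with different activity levels comparable, although other normalizations would be equally defensible.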

5.3. Implications for Practice

The practical implications of integrating LLMs into SESs are broad, ranging from engineering design and infrastructure operation strategies to improved operational efficiency through LLM workflows that mitigate errors in modeling and simulation. Agents such as IDF debuggers, preprocessors, and energy engineers can promote the development of climate-responsive infrastructure supporting SDG 13 and SDG 9. Another implication of this study concerns AI and DRL-based controllers and LLM-driven advisory systems, which energy operators can integrate into community-level energy planning so that the energy equity, real-time responsiveness, agility, and decision-making goals of SMEs align with the SDG 7 goals of energy access.

The integration of green AI and life cycle monitoring is another area that can be considered. AI developers are integrating eco-design and sustainability principles, such as building energy-efficient data centers, reducing model size while limiting performance loss, and applying circular economy principles through modularity and upgradability, all of which will help create SESs. This will contribute to SDG 12 (sustainable production) and support SDG 13 (climate action). This study identified hydrogen energy and LLM-assisted battery optimization as particularly important cases. The development of domain-based LLMs such as RE-LLaMA in planning, prediction, and maintenance will increase efficiency and support decarbonization.

Another major need is skill development and capacity building to promote LLM and AI literacy in terms of their functions, constraints, and metrics. This can be pursued through collaboration between universities and industry to introduce courses on AI for sustainable energy in the engineering and environmental disciplines, creating readiness for developing more sustainable and responsible SESs driven by these technology trends.

6. Future Research Directions

The Antecedents–Decisions–Outcomes (ADO) framework [37,57] is a widely accepted theory-building tool that provides a systematic method to guide future research. It covers the enablers (antecedents) of a specific phenomenon, how actors respond (decisions), and the associated consequences (outcomes) [44]. In this study, LLMs in SESs are examined through the ADO model to inform further research that is contextually relevant, policy-aligned, and globally impactful, addressing the SDG targets by highlighting key drivers, decisions, and effects and by proposing research questions and propositions (Table 2 and Figure 5).

Figure 5 presents the Antecedents–Decisions–Outcomes (ADO) framework for the integration of LLMs in SESs. The antecedents represent enabling conditions such as regulatory support, infrastructure readiness, and data governance. These factors influence key decisions related to model design, governance, and real-time optimization, which ultimately lead to outcomes including emission reduction, improved grid efficiency, energy equity, and broader sustainability impacts.

6.1. Antecedents: Enablers of LLMs in SESs

The antecedents are the enablers of LLMs and AI technologies in SESs. These include technological maturity, regulatory clarity, trust in data systems, institutional capacity, and sociocultural readiness. The results from this study highlight different antecedents; for example, LLM-based tools such as policy chatbots [18] are effective when supported by institutional policies and regulatory data accessibility. These aspects are linked to SDG 9.5 (enhancing scientific research and technological innovation) and SDG 17.16 (strengthening cross-sector partnerships and institutional collaboration). The emphasis on equitable access to digital platforms and addressing differences in data literacy also supports SDG 12.8 (ensuring access to information for sustainable development and responsible decision-making).

The drivers for the adoption of LLMs include problems in traditional systems regarding modeling issues, inadequate access policies, and issues related to integration in energy planning [66]. Zhou et al.’s [74] thematic analysis highlights that dynamic scheduling, weather-dependent systems, and demand variations necessitate adaptive, intelligent tools addressing SDG 9.4 (upgrading infrastructure through cleaner and more efficient technologies) and SDG 13.2 (integrating climate change measures into planning and systems). Existing digital knowledge gaps in microgrids and hydrogen energy systems further limit effective system integration and optimization. Supporting these findings, inadequately integrated user-adaptive AI tools create hurdles for the efficient management of decentralized systems aligned with SDG 9.5 (enhancing technological capabilities and innovation). Increasing attention is also given to the demand for automation, governance, and policy-based AI tools [18] among SMEs and local stakeholders who face accessibility barriers, which relates to the first topic. To focus on the enablers, future research can examine how policy readiness, community trust, and technological infrastructure influence the adoption of LLMs in energy systems, especially by exploring differences between urban and rural settings; small, medium, and large enterprises; and high- and low-income regions. The antecedent-based future research questions and propositions are listed below.

Future Research Questions (Antecedents):

FR1: What is the impact of regulatory flexibility on the adoption and scale-up of LLM-based applications in sustainable energy governance?
FR2: How does stakeholder trust in AI systems and data governance facilitate decentralized energy optimization?

Propositions (Antecedents):

P1: Energy organizations in adaptive regulatory environments adopt LLM-based planning with greater readiness, in addition to the presence of stringent policy frameworks.
P2: Communities that accept AI systems as transparent and inclusive are willing to adopt LLM-based energy tools.

6.2. Decisions: Strategic Choices in LLM-Driven SESs

Decisions are actions and strategies based on antecedents. In the study context, the choices of model, transparency protocols, user interface design, and automation are important. Modular agent-based LLM workflows [66] help to eliminate errors in energy modeling and are integrated with human-readable explanations. Likewise, DRL optimization systems for load balancing are effective. However, stakeholders are questioning expansion and control issues. These decisions are aligned with SDG 7.3 because they improve energy efficiency through intelligent optimization tools. The environmental effect of decisions on the energy-intensive training of LLMs relates to SDG 13.2 through the integration of climate considerations into technological and operational strategies. In addition, governance-based decisions, such as whether to integrate human explanations, contribute to SDG 16.6 in terms of transparent and accountable institutional practices in AI-enabled decision systems.

Another key decision, the application of modular LLM workflows for energy modeling, diagnostics, and policy automation [66], aligns with the BERTopic analysis and is important for scheduling, adaptive control, and decision-making in real time [74,78], related to Topic 3. LLMs promote participatory models such as energy chatbots and multi-agent systems to eliminate knowledge asymmetries [18], as highlighted in the topic. Another primary decision involves the integration of LLMs in hydrogen and battery systems for predictive maintenance and storage coordination [29,79].

Building on these decisions, future research can examine how they affect social legitimacy, environmental sustainability, and operational scalability in SESs. For this purpose, the research questions and propositions are as follows:

Future Research Questions (Decisions):

FR3: How do governance mechanisms and participatory governance influence public acceptance of LLMs and AI-enabled energy tools?
FR4: What are the environmental trade-offs between centralized and federated learning approaches in LLM applications for SESs?

Propositions (Decisions):

P3: Users are more likely to accept LLM-based energy scheduling systems if they are provided with transparent explanations of decision logic.
P4: Federated AI systems in smart grid applications will produce at least 30% fewer training-related emissions than centralized models with the same levels of accuracy.
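P4 states a quantitative threshold, so it can be checked with a simple relative-reduction calculation once training emissions and accuracy are measured for both approaches. The sketch below is one possible operationalization under stated assumptions: the function names are hypothetical, and the accuracy tolerance used to decide whether two models have "the same levels of accuracy" is an assumed parameter that the proposition itself does not specify.

```python
def relative_reduction(baseline: float, alternative: float) -> float:
    """Fractional reduction of `alternative` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - alternative) / baseline


def supports_p4(
    centralized_emissions: float,
    federated_emissions: float,
    centralized_accuracy: float,
    federated_accuracy: float,
    min_reduction: float = 0.30,       # the 30% threshold stated in P4
    accuracy_tolerance: float = 0.01,  # assumed; P4 only says "same levels"
) -> bool:
    """True if the federated model is comparably accurate and emits
    at least `min_reduction` less than the centralized model."""
    comparable = abs(centralized_accuracy - federated_accuracy) <= accuracy_tolerance
    reduction = relative_reduction(centralized_emissions, federated_emissions)
    return comparable and reduction >= min_reduction


if __name__ == "__main__":
    # Illustrative inputs only; they are not measurements from any study.
    print(relative_reduction(100.0, 65.0))
    print(supports_p4(100.0, 65.0, 0.92, 0.92))
```

Emissions can be in any consistent unit, since only the ratio matters; the same calculation with a different `min_reduction` would apply to other threshold-style propositions.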

6.3. Outcomes: Impacts and Consequences of Decisions

The outcomes are the tangible and intangible consequences of LLM adoption in SESs, comprising operational efficiency, environmental advantages (emissions reduction), performance improvement, equity, and positive public perception. The results indicate that these models have high potential for increasing energy reliability, optimizing resources, and contributing to climate goals. LLM-based policy tools enhance transparency and participatory governance. These outcomes align with SDG 13.2 by enabling data-driven strategies for climate mitigation and decarbonization planning, and with SDG 7.1 by supporting improved accessibility and affordability of energy services through intelligent automation and system optimization.

However, unfavorable consequences, namely the exclusion of marginalized populations who are technically vulnerable and the possibility of increased energy costs through high-compute models, restrict the achievement of SDG 10.2 (reducing social and digital inequalities) and SDG 13.3 (climate awareness and institutional capacity). Hence, future research should focus on performance metrics reflecting the societal influence or outcomes of LLMs and AI-based SESs to promote inclusivity. Future research can also examine how LLM modeling influences microgrid performance and aligns with SDG 7.1 by improving access to reliable and affordable energy.

In addition, future research could extend [20] by improving the capability of DRL with more advanced algorithms and multi-microgrid settings. The integration of LLM workflows with policy frameworks may also incorporate life-cycle impact assessments, supporting SDG 9.4 through the development of sustainable and resource-efficient infrastructure and AI design, and SDG 12.2 by promoting responsible resource and energy use. For further research, we propose the following research questions and propositions.

Future Research Questions (Outcomes):

FR5: What are the measurable outcomes of environmental sustainability achieved through LLM applications in smart energy systems?
FR6: How does the integration of AI in SESs foster energy access equity in various socioeconomic and geographic regions?

Propositions (Outcomes):

P5: AI-optimized grids using DRL achieve at least 15% lower carbon emissions over five years than conventionally managed grids.
P6: Rural and marginalized communities have less satisfaction and do not benefit from LLM-based energy systems.

Table 2 presents the key elements of the Antecedents–Decisions–Outcomes (ADO) framework in the context of LLM adoption in Sustainable Energy Systems (SESs) and their alignment with relevant SDG targets. The table highlights enabling factors, strategic decisions, and expected outcomes associated with integrating LLMs in energy system planning and management.

7. Conclusions

This study presents the first integrated analysis of the application of large language models (LLMs) across the technical, governance, and operational dimensions of sustainable energy systems (SESs) (RQ1). Its novelty lies in a mixed-method approach that combines BERTopic-based thematic analysis with SDG alignment, enabling a systematic assessment of how LLMs support intelligent infrastructure, inclusive policymaking, and climate resilience (RQ2). The analysis identifies critical research gaps primarily linked to SDG 7 (affordable and clean energy), SDG 12 (responsible consumption and production), and SDG 13 (climate action), highlighting the urgency of responsible AI integration in energy transitions (RQ2, RQ3). Key findings indicate that modular, agent-based building energy modeling (BEM) frameworks and LLM-driven regulatory extraction systems help to automate technical and policy-related tasks and improve speed, scalability, accessibility, and affordability. These capabilities support digital energy infrastructure (SDG 9) while supporting climate-smart planning and governance (SDG 13). Moreover, this study highlights the sustainability challenges associated with LLM integration, including the high computational burden and energy-intensive nature of AI architectures, limited transparency, unequal access to AI tools, governance and privacy gaps, and the absence of dedicated policy frameworks for AI–energy integration (RQ3).

The findings suggest that LLMs may support SDG-aligned outcomes by advancing clean energy access (SDG 7), digital infrastructure innovation (SDG 9), climate-responsive operations (SDG 13), and resource efficiency through sustainable AI practices (SDG 12). The study concludes that, while LLMs can act as powerful accelerators of sustainable energy transitions, their benefits are conditional on governance, accountability, and accessibility mechanisms. This calls for urgent policy interventions to establish regulatory frameworks emphasizing energy efficiency, fairness, and accountability, particularly for decentralized and community-level applications. Enforceable procurement standards, mandatory carbon and energy-use disclosure for AI models, and modular, recyclable AI infrastructure are essential to ensure that LLM-enabled SESs advance sustainability goals without reinforcing environmental or social inequities.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/en19061588/s1, File S1: PRISMA checklist; Table S1: Summary of selected studies.

Author Contributions

Conceptualization, T.A.A. and R.R.; methodology, T.A.A. and R.R.; formal analysis, T.A.A., M.S., S.M., and R.R.; investigation, T.A.A., S.M., and R.R.; data curation, T.A.A. and R.R.; writing—original draft preparation, T.A.A., M.S., S.M., and R.R.; writing—review and editing, R.R. and W.L.F.; visualization, T.A.A., W.L.F., and R.R.; supervision, R.R. and W.L.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors thank the anonymous reviewers.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SESs	Sustainable Energy Systems
LLMs	Large Language Models
AI	Artificial Intelligence
SDG	Sustainable Development Goal
GenAI	Generative Artificial Intelligence
PRISMA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses
ADO	Antecedents–Decisions–Outcomes
RL	Reinforcement Learning
DRL	Deep Reinforcement Learning
BERT	Bidirectional Encoder Representations from Transformers
NMF	Non-negative Matrix Factorization
LDA	Latent Dirichlet Allocation
PLSA	Probabilistic Latent Semantic Analysis
NLP	Natural Language Processing
UMAP	Uniform Manifold Approximation and Projection
HDBSCAN	Hierarchical Density-Based Spatial Clustering of Applications with Noise
BEMs	Building Energy Models
RAG	Retrieval-Augmented Generation
RECs	Renewable Energy Communities
NR-IES	Nuclear-Renewable Integrated Energy System
PPO	Proximal Policy Optimization
SAC	Soft Actor–Critic
TD3	Twin Delayed Deep Deterministic Policy Gradient
DQN	Deep Q-network
WRF	Weather Research and Forecasting

References

Kyei, S.K.; Boateng, H.K.; Frimpong, A.J. Renewable energy innovations: Fulfilling SDG targets. Clean Energy 2025, 9, 190–203.
Zhao, Q.; Wang, L.; Stan, S.E.; Mirza, N. Can artificial intelligence help accelerate the transition to renewable energy? Energy Econ. 2024, 134, 107584.
Raman, R.; Gunasekar, S.; Dávid, L.D.; Rahmat, A.F.; Nedungadi, P. Aligning sustainable aviation fuel research with sustainable development goals: Trends and thematic analysis. Energy Rep. 2024, 12, 2642–2652.
Zhao, A.P.; Alhazmi, M.; Li, S.; Li, J.; Xie, D.; Chen, S.; Hu, P.J.-H.; Ju, X. Enhancing Los Angeles’ resilient energy systems amid wildfires. Sci. Rep. 2025, 15, 20813.
Tsao, Y.C.; Banyupramesta, I.A.; Lu, J.C. Optimizing grid energy management through the integration of inverter-based resources and community energy prosumers for sustainable energy systems. Energy Convers. Manag. 2025, 344, 120265.
Yaffid, R.; Sundari, S. Renewable Energy and Economic Growth: Comparative Evidence from Germany, Denmark, and China and Strategic Implications for Indonesia. Asian J. Manag. Entrep. Soc. Sci. 2025, 5, 608–627.
Wang, Y.; Xie, Y.; Shao, G.; Zhao, P.; Xie, D.; Liu, T.; Gu, C.; Li, S.; Shi, C.; Zhang, Y.; et al. Low-carbon upgrading to China’s communications base stations for economic profits and additional environmental and public health benefits. Cell Rep. Sustain. 2025, 2, 100492.
Klemm, C.; Wiese, F. Indicators for the optimization of sustainable urban energy systems based on energy system modeling. Energy Sustain. Soc. 2022, 12, 3.
Raman, R.; Pattnaik, D.; Kumar, C.; Nedungadi, P. Advancing sustainable energy systems: A decade of SETA research contribution to sustainable development goals. Sustain. Energy Technol. Assess. 2024, 71, 103978.
Sharma, S.; Ali, I. Efficient Optimization Techniques for Renewable and Sustainable Energy Systems. In Optimization in Sustainable Energy: Methods and Applications; Scrivener Publishing LLC: Beverly, MA, USA, 2026; pp. 405–464.
Nwala, K.C.; Kabeyi, M.J.B.; Olanrewaju, O.A. A Visual and Strategic Framework for Integrated Renewable Energy Systems: Bridging Technological, Economic, Environmental, Social, and Regulatory Dimensions. Energies 2025, 18, 5468.
Yao, Z.; Lum, Y.; Johnston, A.; Mejia-Mendoza, L.M.; Zhou, X.; Wen, Y.; Aspuru-Guzik, A.; Sargent, E.H.; Seh, Z.W. Machine learning for a sustainable energy future. Nat. Rev. Mater. 2023, 8, 202–215.
Buster, G.; Pinchuk, P.; Barrons, J.; McKeever, R.; Levine, A.; Lopez, A. Supporting energy policy research with large language models: A case study in wind energy siting ordinances. Energy AI 2024, 18, 100431.
Dewi, T.; Risma, P.; Oktarina, Y.; Dwijayanti, S.; Mardiyati, E.N.; Sianipar, A.B.; Hibrizi, D.R.; Azhar, M.S.; Linarti, D. Smart integrated aquaponics system: Hybrid solar-hydro energy with deep learning forecasting for optimized energy management in aquaculture and hydroponics. Energy Sustain. Dev. 2025, 85, 101683.
Zhao, Z.; Tang, D.; Liu, C.; Wang, L.; Zhang, Z.; Zhu, H.; Chen, K.; Nie, Q.; Ji, Y. A Large language model-based multiagent manufacturing system for intelligent shopfloors. Adv. Eng. Inform. 2026, 69, 103888.
Zhao, A.P.; Li, S.; Qian, T.; Guan, A.; Cheng, X.; Kim, J.; Alhazmi, M.; Hernando-Gil, I. Can People Flow Enhance the Shared Energy Facility Management? IEEE Trans. Smart Grid 2025, 16, 4673–4684.
Li, H.; He, X.; Wu, Y.; Liu, G.; Wang, H.; Wen, X.; Li, L. Digital twin and AI-driven robotic embodied control system: A novel adaptive learning and decision optimization method. Robot. Comput. Integr. Manuf. 2026, 98, 103138.
Arslan, M.; Mahdjoubi, L.; Munawar, S. Driving sustainable energy transitions with a multi-source RAG-LLM system. Energy Build. 2024, 324, 114827.
Gabber, H.A.; Hemied, O.S. Domain-Specific Large Language Model for Renewable Energy and Hydrogen Deployment Strategies. Energies 2024, 17, 6063.
Guo, C.; Wang, X.; Zheng, Y.; Zhang, F. Real-time optimal energy management of microgrid with uncertainties based on deep reinforcement learning. Energy 2022, 238, 121873.
Boukoberine, M.N.; Zia, M.F.; Berghout, T.; Benbouzid, M. Reinforcement learning-based energy management for hybrid electric vehicles: A comprehensive up-to-date review on methods, challenges, and research gaps. Energy AI 2025, 21, 100514.
Ye, Z.; Qiu, D.; Li, S.; Fan, Z.; Strbac, G. Federated Reinforcement Learning for decentralized peer-to-peer energy trading. Energy AI 2025, 20, 100500.
Pimenow, S.; Pimenowa, O.; Prus, P. Challenges of artificial intelligence development in the context of energy consumption and impact on climate change. Energies 2024, 17, 5965.
Ding, C.X.; Zhong, S.; Li, S.; Alhazmi, M. Regulatory-driven optimization of integrated energy systems: A legal and policy-compliant framework for flexibility and carbon management. Energy Rep. 2025, 14, 157–172.
Zhang, H.; Liao, K.; Yang, J.; Yin, Z.; He, Z. Long-term and Short-term Coordinated Scheduling for wind-PV-hydro-storage hybrid energy system based on deep reinforcement learning. IEEE Trans. Sustain. Energy 2025, 16, 1697–1710.
Shahin, M.; Simjoo, M. Potential applications of innovative AI-based tools in hydrogen energy development: Leveraging large language model technologies. Int. J. Hydrogen Energy 2025, 102, 918–936.
Ashraf, A.; Basheer, M.; Gonzalez, J.M.; Martínez Ceseña, E.A.; Etichia, M.; Obuobie, E.; Bottacin-Busolin, A.; Adamowski, J.; Panteli, M.; Harou, J.J. Delivering equity in low-carbon multisector infrastructure planning. Nat. Commun. 2025, 16, 5320.
Wen, J.; Yin, H.T.; Chang, C.P.; Tang, K. How AI shapes greener futures: Comparative insights from equity vs debt investment responses in renewable energy. Energy Econ. 2024, 136, 107700.
Li, T.T.; Zhao, A.P.; Wang, Y.; Li, S.; Fei, J.; Wang, Z.; Xiang, Y. Integrating solar-powered electric vehicles into sustainable energy systems. Nat. Rev. Electr. Eng. 2025, 2, 467–479.
Cai, H.; Zhang, W.; Yuan, Q.; Salameh, A.A.; Alahmari, S.; Ferrara, M. Cost-effective intelligent building: Energy management system using machine learning and multicriteria decision support. Energy Econ. 2025, 142, 108184.
Hussain, A.; Musilek, P. Energy management of buildings with energy storage and solar photovoltaic: A diversity in experience approach for deep reinforcement learning agents. Energy AI 2024, 15, 100313.
Wang, Y.; Wu, J.; He, H.; Wei, Z.; Sun, F. Data-driven energy management for electric vehicles using offline reinforcement learning. Nat. Commun. 2025, 16, 2835.
Elsisi, M.; Amer, M.; Su, C.L.; Aljohani, T.; Ali, M.N.; Sharawy, M. A comprehensive review of machine learning and Internet of Things integrations for emission monitoring and resilient sustainable energy management of ships in port areas. Renew. Sustain. Energy Rev. 2025, 218, 115843.
Barahona, I.; Almulhim, T. Renewable energies and circular economies: A systematic literature review before the ChatGPT boom. Energy Rep. 2024, 11, 2656–2669.
Antonesi, G.; Cioara, T.; Anghel, I.; Michalakopoulos, V.; Sarmas, E.; Toderean, L. From Transformers to Large Language Models: A systematic review of AI applications in the energy sector towards Agentic Digital Twins. arXiv 2025, arXiv:2506.06359.
Alka, T.A.; Suresh, M.; Raman, R. Exploring the trajectory of migrant entrepreneurship research: BERTopic modeling. J. Organ. Change Manag. 2025. epub ahead of print.
Paul, J.; Benito, G.R.G. A review of research on outward foreign direct investment from emerging countries, including China: What do we know, how do we know and where should we be heading? Asia Pac. Bus. Rev. 2018, 24, 90–115.
Grootendorst, M. BERTopic: Neural topic modeling with a class-based TF-IDF procedure. arXiv 2022, arXiv:2203.05794.
Gillespie, A.; Glăveanu, V.; de Saint-Laurent, C.; Zittoun, T.; Bernal Marcos, M.J. Multiresolution design: Using qualitative and quantitative analyses to recursively zoom in and out of the same dataset. J. Mix. Methods Res. 2024, 20, 16–35.
Nedungadi, P.; Veena, G.; Tang, K.Y.; Menon, R.R.; Raman, R. AI techniques and applications for online social networks and media: Insights from BERTopic modeling. IEEE Access 2025, 13, 37389–37407.
Raman, R.; Sreenivasan, A.; Kulkarni, N.V.; Suresh, M.; Nedungadi, P. Analyzing the contributions of biofuels, biomass, and bioenergy to sustainable development goals. iScience 2025, 28, 112157.
Roberge, G.; Kashnitsky, Y.; James, C. Elsevier 2022 sustainable development goals (sdg) mapping. Mendeley Data 2022, 1, 2022.
Page, M.J.; Moher, D.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ 2021, 372, n160.
Alka, T.A.; Raman, R.; Suresh, M. Analyzing the causal relationships among socioeconomic factors influencing sustainable energy enterprises in India. Energies 2025, 18, 4373.
Alka, T.A.; Raman, R.; Suresh, M. Research trends in innovation ecosystem and circular economy. Discov. Sustain. 2024, 5, 323.
Donthu, N.; Kumar, S.; Mukherjee, D.; Pandey, N.; Lim, W.M. How to conduct a bibliometric analysis: An overview and guidelines. J. Bus. Res. 2021, 133, 285–296.
Raman, R.; Pattnaik, D.; Lathabai, H.H.; Kumar, C.; Govindan, K.; Nedungadi, P. Green and sustainable AI research: An integrated thematic and topic modeling analysis. J. Big Data 2024, 11, 55.
Egger, R.; Yu, J. A topic modeling comparison between lda, nmf, Top2Vec, and bertopic to demystify twitter posts. Front. Sociol. 2022, 7, 886498.
Wang, Z.; Chen, J.; Chen, J.; Chen, H. Identifying interdisciplinary topics and their evolution based on BERTopic. Scientometrics 2024, 129, 7359–7384.
Kim, K.; Kogler, D.F.; Maliphol, S. Identifying interdisciplinary emergence in the science of science: Combination of network analysis and BERTopic. Humanit. Soc. Sci. Commun. 2024, 11, 603.
Khodeir, N.; Elghannam, F. Efficient topic identification for urgent MOOC Forum posts using BERTopic and traditional topic modeling techniques. Educ. Inf. Technol. 2025, 30, 5501–5527.
Bedard-Vallee, A.; James, C.; Roberge, G. Elsevier 2023 Sustainable Development Goals (SDGs) Mapping. Elsevier Data Repository. 2023, Version 1. Available online: https://elsevier.digitalcommonsdata.com/datasets/y2zyy9vwzy/1 (accessed on 13 February 2026).
Trevisan, L.V.; Leal Filho, W.; Pedrozo, E.Á. Transformative organisational learning for sustainability in higher education: A literature review and an international multicase study. J. Clean. Prod. 2024, 447, 141634.
Kasaraneni, H.; Rosaline, S. Automatic merging of Scopus and Web of Science data for simplified and effective bibliometric analysis. Ann. Data Sci. 2024, 11, 785–802.
Koi-Akrofi, G.Y.; Aboagye-Darko, D.; Gaisie, E.; Banaseka, F. IT project success in perspective: Systematic literature review analysis founded on the ADO, TCM, and the PSALAR frameworks. Manag. Rev. Q. 2023, 74, 2401–2441.
Paul, J.; Merchant, A.; Dwivedi, Y.K.; Rose, G. Writing an impactful review article: What do we know and what do we need to know? J. Bus. Res. 2021, 133, 337–340.
Paul, J.; Khatri, P.; Kaur Duggal, H. Frameworks for developing impactful systematic literature reviews and theory building: What, Why and How? J. Decis. Syst. 2024, 33, 537–550.
Singh, S.; Paul, J.; Dhir, S. Innovation implementation in Asia-Pacific countries: A review and research agenda. In Trends in Asia Pacific Business and Management Research; Routledge: London, UK, 2022; pp. 36–64.
Choudhary, P.; Thenmozhi, M. Fintech and financial sector: ADO analysis and future research agenda. Int. Rev. Financ. Anal. 2024, 93, 103201.
Nammouchi, A.; Chaabani, C.; Theocharis, A.; Kassler, A. Towards Explainable Renewable Energy Communities Operations Using Generative AI. In Proceedings of the 2024 IEEE PES Innovative Smart Grid Technologies Europe (ISGT EUROPE); IEEE: New York, NY, USA, 2024; pp. 1–5.
Wang, H.; Hua, W.; Peng, J.; Hu, M. Public sentiment analysis of data center energy consumption using social media data and large language models. Energy Build. 2025, 341, 115802.
Pahune, S.; Akhtar, Z.; Mandapati, V.; Siddique, K. The importance of AI data governance in large language models. Big Data Cogn. Comput. 2025, 9, 147.
Patil, R.; Gudivada, V. A review of current trends, techniques, and challenges in large language models (llms). Appl. Sci. 2024, 14, 2074.
Zhang, L.; Ford, V.; Chen, Z.; Chen, J. Automatic building energy model development and debugging using large language models agentic workflow. Energy Build. 2025, 327, 115116.
Sarwar, F.A.; Hernando-Gil, I.; Vechiu, I. Review of energy management systems and optimization methods for hydrogen-based hybrid building microgrids. Energy Convers. Econ. 2024, 5, 259–279.
Dai, X.; Liu, G.P.; Hu, W.; Lei, Z.; Zhou, H. Learning from ChatGPT: A transformer-based model for wind power forecasting. In Proceedings of the 2023 IEEE International Conference on Environment and Electrical Engineering and 2023 IEEE Industrial and Commercial Power Systems Europe (EEEIC/I&CPS Europe); IEEE: New York, NY, USA, 2023; pp. 1–6.
Duan, Z.; Bian, C.; Yang, S.; Li, C. Prompting large language model for multi-location multi-step zero-shot wind power forecasting. Expert Syst. Appl. 2025, 280, 127436.
Chang, X.; Liang, D.; Peng, F.; Zhang, G. A New Prediction Paradigm for Nonprofessionals: Prompt-based Zero-code Wind Power Prediction. In Proceedings of the 2024 6th International Conference on Power and Energy Technology (ICPET); IEEE: New York, NY, USA, 2024; pp. 1782–1787.
Akinci, T.C.; Nogay, H.S.; Penchev, M.; Martinez-Morales, A.A.; Raju, A. A hybrid approach to wind power intensity classification using decision trees and large language models. Renew. Energy 2025, 250, 123388.
Ock, J.; Guntuboina, C.; Barati Farimani, A. Catalyst energy prediction with CatBERTa: Unveiling feature exploration strategies through large language models. ACS Catal. 2023, 13, 16032–16044.
Li, S.; You, F. GenAI for Scientific Discovery in Electrochemical Energy Storage: State-of-the-Art and Perspectives from Nano-and Micro-Scale. Small 2024, 20, 2406153.
Zhu, J.; Zhang, L.; Lu, J.; Wu, X.; Wei, C. Research on the Application of ChatGPT-like Language Models in Hydropower Stations. In E3S Web of Conferences; EDP Sciences: Les Ulis, France, 2024; Volume 561, p. 02026.
Singh, D.; Shah, O.A.; Arora, S. Adaptive control strategies for effective integration of solar power into smart grids using reinforcement learning. Energy Storage Sav. 2024, 3, 327–340.
Zhou, W.; Zhuang, Y.; Chen, Y. How does artificial intelligence affect pollutant emissions by improving energy efficiency and developing green technology. Energy Econ. 2024, 131, 107355.
Li, T.T.; Liang, R.; Shang, Y.; Ding, C.X.; Hua, Y.; Wang, Z.; Alhazmi, M. SustainLLM: AI-driven lifecycle sustainability assessment and energy transition optimization. Sustain. Energy Technol. Assess. 2025, 82, 104475.
Yi, Z.; Luo, Y.; Westover, T.; Katikaneni, S.; Ponkiya, B.; Sah, S.; Mahmud, S.; Raker, D.; Javaid, A.; Heben, M.J.; et al. Deep reinforcement learning based optimization for a tightly coupled nuclear renewable integrated energy system. Appl. Energy 2022, 328, 120113.
Maryasin, O.Y. Two-stage problem of optimizing smart grid energy consumption at the enterprise. In Proceedings of the 2022 4th International Conference on Control Systems, Mathematical Modeling, Automation and Energy Efficiency (SUMMA); IEEE: New York, NY, USA, 2022; pp. 808–813.
Yang, Y.; Brozovsky, J.; Liu, P.; Goia, F. Data-driven energy management of a neighborhood with renewable energy sources. In Building Simulation 2023; IBPSA: Las Cruces, NM, USA, 2023; Volume 18, pp. 3499–3507.
Ali, K.; Schultz, D.M.; Revell, A.; Stallard, T.; Ouro, P. Assessment of five wind-farm parameterizations in the weather research and forecasting model: A case study of wind farms in the North Sea. Mon. Weather Rev. 2023, 151, 2333–2359.
Jin, M.; Sel, B.; Hardeep, F.; Yin, W. Democratizing energy management with llm-assisted optimization autoformalism. In Proceedings of the 2024 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm); IEEE: New York, NY, USA, 2024; pp. 258–263.
Zhang, L.; Chen, Z. Opportunities of applying Large Language Models in building energy sector. Renew. Sustain. Energy Rev. 2025, 214, 115558.
Kim, H.J.; Kim, M.K. Real-time price forecasting combined deep reinforcement learning for predictive home energy management system. IEEE Internet Things J. 2025, 12, 34806–34821.
Chen, J.; Liu, Z.; Huang, X.; Wu, C.; Liu, Q.; Jiang, G.; Pu, Y.; Lei, Y.; Chen, X.; Wang, X.; et al. When large language models meet personalization: Perspectives of challenges and opportunities. World Wide Web 2024, 27, 42.
Świrski, K.; Błach, P. Energy Storage Management Using Artificial Intelligence to Maximize Polish Energy Market Profits. Energies 2024, 17, 4855.
Alhazmi, M.; Li, O.P.; Huo, D.; Zhang, H.; Bao, Z. Can digital twin technology revolutionize wildfire management and energy resilience in Los Angeles? Energy Rep. 2025, 14, 3118–3131.
Tsihrintzis, G.A.; Sarmas, E.; Marinakis, V.; Panagoulias, D.; Tsihrintzi, E.A.; Virvou, M. Evaluating non-expert stakeholder interaction with artificial intelligence on energy urban domain using VIRTSI: The case of ChatGPT. In Proceedings of the 2024 15th International Conference on Information, Intelligence, Systems & Applications (IISA); IEEE: New York, NY, USA, 2024; pp. 1–10.
Mohamed, B.; Babiker, S.; Aldybous, A.; Kabbaj, N.; Brahimi, T. “Fiction to Function” Shaping Renewable Energy Education with MATLAB and ChatGPT-Driven Environments. In Proceedings of the 2024 21st Learning and Technology Conference (L&T); IEEE: New York, NY, USA, 2024; pp. 7–12.
Oh, S.; Kum, S.; Moon, J. Real-time environment monitoring and response through iot and retrieval-augmented generation. In Proceedings of the 2024 15th International Conference on Information and Communication Technology Convergence (ICTC); IEEE: New York, NY, USA, 2024; pp. 1658–1659.
Amjad, F.; Korotko, T.; Rosin, A. Forecasting pv energy generation using transformer-based architectures: A comparative study of lag-llama, tft, and deepar. In Proceedings of the 2024 IEEE 65th International Scientific Conference on Power and Electrical Engineering of Riga Technical University (RTUCON); IEEE: New York, NY, USA, 2024; pp. 1–6.
Li, Y.; Ni, S.; Tang, X.; Xie, S.; Wang, P. Analysis of EU’s coupled carbon and electricity market development based on generative pre-trained transformer large model and implications in China. Sustainability 2024, 16, 10747.
Wang, X.; Bai, F.; Cai, Y. Electricity load forecasting using large language models. In Proceedings of the International Conference on Mechatronic Engineering and Artificial Intelligence (MEAI 2024); SPIE: Bellingham, WA, USA, 2025; Volume 13555, pp. 664–668.
Sriram, A.; Miller, B.K.; Chen, R.T.; Wood, B.M. Flowllm: Flow matching for material generation with large language models as base distributions. Adv. Neural Inf. Process. Syst. 2024, 37, 46025–46046.
Wang, Q.; Yang, F.; Wang, Y.; Zhang, D.; Sato, R.; Zhang, L.; Cheng, E.J.; Yan, Y.; Chen, Y.; Kisu, K.; et al. Unraveling the Complexity of Divalent Hydride Electrolytes in Solid-State Batteries via a Data-Driven Framework with Large Language Model. Angew. Chem. Int. Ed. 2025, 64, e202506573.
Almilaify, Y.; Nweye, K.; Nagy, Z. Scalex: Scalability exploration of multi-agent reinforcement learning agents in grid-interactive efficient buildings. In Proceedings of the 10th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, Istanbul, Turkey, 15–16 November 2023; pp. 261–264.
Friansa, K.; Pradipta, J.; Nanda, R.M.; Haq, I.N.; Mangkuto, R.A.; Iskandar, R.F.; Wasesa, M.; Leksono, E. Enhancing university building energy flexibility performance using reinforcement learning control. IEEE Access 2024, 12, 192377–192395.
Egbemhenghe, A.U.; Ojeyemi, T.; Iwuozor, K.O.; Emenike, E.C.; Ogunsanya, T.I.; Anidiobi, S.U.; Adeniyi, A.G. Revolutionizing water treatment, conservation, and management: Harnessing the power of AI-driven ChatGPT solutions. Environ. Chall. 2023, 13, 100782.
Oprea, S.V.; Bâra, A. A recommendation system for prosumers based on large language models. Sensors 2024, 24, 3530.
Alarcón-López, C.; Krütli, P.; Gillet, D. Assessing ChatGPT’s Influence on Critical Thinking in Sustainability Oriented Activities. In Proceedings of the 2024 IEEE Global Engineering Education Conference (EDUCON); IEEE: New York, NY, USA, 2024; pp. 1–10.
Adem, T.; McCrabb, A.; Goyal, V.; Bertacco, V. Evergreen: Comprehensive carbon model for performance-emission tradeoffs. In Proceedings of the 2024 IEEE International Symposium on Workload Characterization (IISWC); IEEE: New York, NY, USA, 2024; pp. 132–143.
Leon, M. The escalating AI’s energy demands and the imperative need for sustainable solutions. WSEAS Trans. Syst. 2024, 23, 444–457.
Bernardini, A.; Lezoche, M.; Angelini, S.; Dondossola, G.; Terruggia, R. Advancing Internet-Connected Devices Posture Analysis with a Meta-Search Engine: A Case Study in Energy Systems. In Proceedings of the Joint National Conference on Cybersecurity (ITASEC & SERICS 2025), Bologna, Italy, 3–8 February 2025.
Huang, L.; Deng, W.; Jiang, Y.; Zhong, Q. Development trends of large language models and their applications in green digital intelligence of supply chains. In Proceedings of the 2024 5th International Conference on Computer Science and Management Technology, Xiamen, China, 18–20 October 2024; pp. 770–774.
Weers, J.; Podgorny, S.; Taverna, N.; Anderson, A.; Porse, S.; Buster, G. Empowering Geothermal Research: The Geothermal Data Repository’s New AI Research Assistant; No. NREL/CP-6A20-90339; National Renewable Energy Laboratory (NREL): Golden, CO, USA, 2024.
Verdoodt, J.; Milleville, K.; Huang, H.; Vandeviver, C.; Verstockt, S.; Van de Weghe, N. Geosocial media’s perspective on energy: A text classification approach using natural language processing. J. Locat. Based Serv. 2025, 1–26.
Isaza-Giraldo, A.; Bala, P.; Jiskrová, A.; Sachser, L.; Campos, P.; Pereira, L. Meta-evaluating the Effects of Social Preferences on NPC-evaluators in an Energy Community Game. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 26 April–1 May 2025; pp. 1–10.
Cha, S. The potential role of small modular reactors (SMRs) in addressing the increasing power demand of the artificial intelligence industry: A scenario-based analysis. Nucl. Eng. Technol. 2025, 57, 103314.
Petzinna, N.; Nikora, V.; Onoufriou, J.; Williamson, B.J. Evaluating the performance of a dual-frequency multibeam echosounder for small target detection. J. Mar. Sci. Eng. 2023, 11, 2084.
Rodríguez-Muñoz-de-Baena, I.; Coronado-Vaca, M.; Vaquero-Lafuente, E. Fine-tuning transformer models for M&A target prediction in the US ENERGY sector. Cogent Bus. Manag. 2025, 12, 2487219.
Na, G.S. Artificial intelligence for learning material synthesis processes of thermoelectric materials. Chem. Mater. 2023, 35, 8272–8280.
Xie, T.; Wan, Y.; Zhou, Y.; Huang, W.; Liu, Y.; Linghu, Q.; Wang, S.; Kit, C.; Grazian, C.; Zhang, W.; et al. Creation of a structured solar cell material dataset and performance prediction using large language models. Patterns 2024, 5, 100955.
Feng, Z.; Lu, J.; Zhang, C.; Liu, D.; Jia, Z.; Wang, Y.; Wang, C. New Facile Continuous Microwave Pipeline Technology for the Preparation of Highly Stable and Active Carbon-Supported Platinum Catalyst. ChemElectroChem 2024, 11, e202300546.
Jo, B.; Chen, W.; Jung, H.S. Comprehensive review of advances in machine-learning-driven optimization and characterization of perovskite materials for photovoltaic devices. J. Energy Chem. 2025, 101, 298–323.
Shehata, A.M.; Kalia, N.; Comer, R. Advancing isolation techniques for geothermal wells: Development of polymer and nanoparticle system. In SPE Canadian Energy Technology Conference; SPE: Richardson, TX, USA, 2024; p. D021S015R001.
Parsamanesh, M.; Shekarriz, S.; Montazer, M. Enhanced thermal stability of eutectic PCMs via microencapsulation: Inverse emulsion polymerization with silica shells. Therm. Sci. Eng. Prog. 2025, 60, 103420.
Sanguinetti, M.; Pani, A.; Perniciano, A.; Zedda, L.; Loddo, A.; Atzori, M. Assessing Italian large language models on energy feedback generation: A human evaluation Study. In Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), Pisa, Italy, 4–6 December 2024; pp. 880–887.
Saeed, F.; Aldera, S.; Al-Shamma’a, A.A.; Farh, H.M.H. Rapid adaptation in photovoltaic defect detection: Integrating CLIP with YOLOv8n for efficient learning. Energy Rep. 2024, 12, 5383–5395.

Figure 1. Research process for analyzing LLMs in SESs.

Figure 2. PRISMA 2020 flow diagram for study selection on LLMs in SESs.

Figure 3. BERTopic process in examining LLMs in SESs.

Figure 4. Summary of results for LLMs in SESs.

Figure 5. ADO framework for LLMs in SESs.

Table 1. Case studies on LLMs in sustainable energy systems.

References	Context	Characteristics
Ali et al. [79]	Offshore wind farm simulation using WRF	Wind turbine parameterization, mesoscale modeling
Zhang et al. [64]	BEM development through LLMs	Agentic workflow, EnergyPlus
Jin et al. [80]	Personalized energy optimization through LLMs	Autoformalism, language-based control
Maryasin [77]	Smart grid optimization at the enterprise level	Two-stage optimization, RL plus linear programming
Yang et al. [78]	RL in neighborhood energy systems in cold climates	CityLearn, storage coordination, and low-quality data processing
Zhou et al. [74]	Deep RL-based real-time scheduling under uncertainty	SEDRL platform, open-source integration

Table 2. ADO framework with thematic focus and SDG mapping in LLMs in SESs.

Elements	Area of Focus	SDG Targets
Antecedents	Regulatory support, technological maturity, infrastructure, sociocultural readiness, digital equity, data governance, SME engagement, and knowledge asymmetry
Decisions	Model architecture, model training, governance, real-time optimization, human–AI, energy chatbot, policy, adaptive scheduling
Outcomes	Emission reduction, energy equity, grid efficiency, social inclusion, lifecycle impacts
