How Can Large Language Models Drive Environmental Sustainability? A Systematic Scoping Review
Abstract
1. Introduction
- (1) What are the geographical and temporal trends of research utilizing LLMs for environmental sustainability?
- (2) Which LLMs are most widely applied in the environmental sustainability domain?
- (3) Which environmental sustainable development domains can LLMs be used in?
- (4) What are the roles and impacts of LLMs in the field of environmental sustainability?
- (5) What are the future potential and development trends for the process of environmental sustainability supported by current LLMs?
2. Methods
2.1. Search Strategy
2.2. Data Selection and Extraction
2.3. Data Charting, Synthesis, and Reporting
2.4. Quality Assessment of Included Studies
3. Results
3.1. Characteristics of Studies
3.2. Research Methodology
3.3. Types of Input and Output Data
3.4. Application Domains
4. Discussion
4.1. Main Findings and Results of Studies
4.2. Impact of LLMs on Environmental Sustainability
4.3. Role and Significance of LLMs for Environmental Sustainability
4.4. Limitations of LLMs Applications in Environmental Sustainability
4.5. Potential and Future Trends of LLMs Applications in Environmental Sustainability
4.6. Comparative Synthesis of Performance and Efficacy
4.7. Patterns of Success and Failure: An Interpretative Analysis
4.8. Limitations
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
| Type of Research Design | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. QUALITATIVE STUDIES | ||||||||||
| NO. | Authors (year) | S1. | S2. | 1.1. | 1.2. | 1.3. | 1.4. | 1.5. | Score | Study Quality |
| [17] | Shoaib et al. (2023) | + | + | + | + | + | + | ? | 85.71% | Good |
| [21] | Cecconi et al. (2024) | + | + | + | + | + | + | ? | 85.71% | Good |
| [25] | Stein (2025) | + | + | + | + | + | + | + | 100% | Good |
| [31] | Rivero et al. (2025) | + | + | + | + | + | + | − | 85.71% | Good |
| 2. RANDOMIZED CONTROLLED TRIALS | ||||||||||
| NO. | Authors (year) | S1. | S2. | 2.1. | 2.2. | 2.3. | 2.4. | 2.5. | Score | Study Quality |
| [22] | Bhaskar et al. (2024) | + | + | + | + | + | ? | + | 85.71% | Good |
| [26] | Bibri et al. (2025) | + | + | + | + | + | ? | + | 85.71% | Good |
| [28] | Cheng et al. (2024) | + | + | + | + | + | ? | + | 85.71% | Good |
| [33] | Cheng et al. (2025) | + | + | + | + | + | ? | + | 85.71% | Good |
| 3. NON-RANDOMIZED STUDIES | ||||||||||
| NO. | Authors (year) | S1. | S2. | 3.1. | 3.2. | 3.3. | 3.4. | 3.5. | Score | Study Quality |
| [18] | Song et al. (2025) | + | + | + | + | + | + | + | 100% | Good |
| [19] | Cappendijk et al. (2025) | + | + | + | + | + | + | + | 100% | Good |
| [16] | Lin et al. (2025) | + | + | + | + | + | ? | + | 85.71% | Good |
| [23] | Wang et al. (2025) | + | + | + | + | + | + | + | 100% | Good |
| [24] | Chew et al. (2024) | + | + | + | + | + | + | + | 100% | Good |
| [27] | Huang et al. (2025) | + | + | + | + | + | + | + | 100% | Good |
| [30] | Krzyżewska (2025) | + | + | + | + | + | + | + | 100% | Good |
| [34] | Zhang et al. (2025) | + | + | + | + | + | + | + | 100% | Good |
| 4. QUANTITATIVE DESCRIPTIVE STUDIES | ||||||||||
| NO. | Authors (year) | S1. | S2. | 4.1. | 4.2. | 4.3. | 4.4. | 4.5. | Score | Study Quality |
| [32] | Hou et al. (2025) | + | + | + | + | + | + | + | 100% | Good |
| 5. MIXED METHODS STUDIES | ||||||||||
| NO. | Authors (year) | S1. | S2. | 5.1. | 5.2. | 5.3. | 5.4. | 5.5. | Score | Study Quality |
| [20] | Lee et al. (2024) | + | + | + | + | + | + | + | 100% | Good |
| [7] | Martín-Domingo et al. (2025) | + | + | + | + | + | + | + | 100% | Good |
| [29] | Karlsson (2025) | + | + | + | + | + | ? | + | 85.71% | Good |
| “+” = Yes; “−” = No; “?” = Can’t tell. | | | | | | | | | | |
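
The percentage scores in Appendix A appear to be the share of the seven appraisal items (S1, S2, and the five design-specific criteria) rated “+”. The following is a minimal sketch of that arithmetic, assuming this is how the scores were derived; the function name `mmat_score` is hypothetical and is not taken from the review or from the MMAT itself.

```python
def mmat_score(ratings):
    """Return the percentage of items rated '+', rounded to two decimals.

    Assumption: any rating other than '+' (for example '?' or a minus
    sign) counts as not met, which matches the scores shown in Appendix A.
    """
    met = sum(1 for rating in ratings if rating == "+")
    return round(100 * met / len(ratings), 2)


# Ratings copied from two Appendix A rows (S1, S2, then the five criteria).
shoaib_2023 = ["+", "+", "+", "+", "+", "+", "?"]
stein_2025 = ["+", "+", "+", "+", "+", "+", "+"]

print(mmat_score(shoaib_2023))  # 85.71
print(mmat_score(stein_2025))   # 100.0
```

Under this assumption, six of seven items met gives 85.71%, which is the value reported for every study with a single “?” or “−” rating.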
References
- Luo, Y.; Pang, P.C.-I.; Chang, S. Enhancing exploratory learning through exploratory search with the emergence of large language models. arXiv 2024, arXiv:2408.08894.
- Mao, C.; Li, J.; Pang, P.C.-I.; Zhu, Q.; Chen, R. Identifying Kidney Stone Risk Factors Through Patient Experiences with a Large Language Model: Text Analysis and Empirical Study. J. Med. Internet Res. 2025, 27, e66365.
- Annepaka, Y.; Pakray, P. Large language models: A survey of their development, capabilities, and applications. Knowl. Inf. Syst. 2025, 67, 2967–3022.
- Han, S.; Wang, M.; Zhang, J.; Li, D.; Duan, J. A review of large language models: Fundamental architectures, key technological evolutions, interdisciplinary technologies integration, optimization and compression techniques, applications, and challenges. Electronics 2024, 13, 5040.
- Peduzzi, P. The disaster risk, global change, and sustainability nexus. Sustainability 2019, 11, 957.
- Salmi, A.; Jussila, J.; Hämäläinen, M. The role of municipalities in transformation towards more sustainable construction: The case of wood construction in Finland. Constr. Manag. Econ. 2022, 40, 934–954.
- Martín-Domingo, L.; Fernandez, J.B.; Efthymiou, M.; Ali, M.I. Extracting airline emission KPIs from sustainability reports using large language models (LLMs). Transp. Res. Interdiscip. Perspect. 2025, 33, 101599.
- Taheri Hosseinkhani, N. Artificial Intelligence and Large Language Models in Energy Systems and Climate Strategies: Economic Pathways to Cost-Effective Emissions Reduction and Sustainable Growth. SSRN Electron. J. 2025. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5385513 (accessed on 12 January 2026).
- Al Khourdajie, A. The role of artificial intelligence in climate change scientific assessments. PLoS Clim. 2025, 4, e0000706.
- Cao, C.; Zhuang, J.; He, Q. LLM-Assisted Modeling and Simulations for Public Sector Decision-Making: Bridging Climate Data and Policy Insights. In Proceedings of the AAAI—2024 Workshop on Public Sector LLMs: Algorithmic and Sociotechnical Design, Vancouver, BC, Canada, 27 February 2024.
- Van de Weghe, N.; De Sloover, L.; Cohn, A.; Huang, H.; Scheider, S.; Sieber, R.; Timpf, S.; Claramunt, C. Opportunities and challenges of integrating geographic information science and large language models. J. Spat. Inf. Sci. 2025, 30, 93–116.
- Everman, B.; Villwock, T.; Chen, D.; Soto, N.; Zhang, O.; Zong, Z. Evaluating the carbon impact of large language models at the inference stage. In Proceedings of the 2023 IEEE International Performance, Computing, and Communications Conference (IPCCC), Anaheim, CA, USA, 17–19 November 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 150–157.
- Alizadeh, N.; Castor, F. Green AI: A preliminary empirical study on energy consumption in dl models across different runtime infrastructures. In Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering-Software Engineering for AI, Lisbon, Portugal, 14–15 April 2024; pp. 134–139.
- McGowan, J.; Straus, S.; Moher, D.; Langlois, E.V.; O’Brien, K.K.; Horsley, T.; Aldcroft, A.; Zarin, W.; Garitty, C.M.; Hempel, S. Reporting scoping reviews—PRISMA ScR extension. J. Clin. Epidemiol. 2020, 123, 177–179.
- Hong, Q.N. Revision of the Mixed Methods Appraisal Tool (MMAT): A Mixed Methods Study. Ph.D. thesis, McGill University, Montréal, QC, Canada, 2018.
- Lin, W.-C.; Tseng, M.-H. Autonomous Epidemic and Geographic Disaster Mapping: Assessing the Performance of Large Language Models in Spatial Information Integration. J. Disaster Res. 2025, 20, 386–395.
- Shoaib, M.R.; Emara, H.M.; Zhao, J. A survey on the applications of frontier ai, foundation models, and large language models to intelligent transportation systems. In Proceedings of the 2023 International Conference on Computer and Applications (ICCA), Cairo, Egypt, 28–30 November 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–7.
- Song, J.; Ma, C.; Ran, M. AirGPT: Pioneering the convergence of conversational AI with atmospheric science. npj Clim. Atmos. Sci. 2025, 8, 179.
- Cappendijk, T.; de Reus, P.; Oprescu, A. An exploration of prompting LLMs to generate energy-efficient code. In Proceedings of the 2025 IEEE/ACM 9th International Workshop on Green and Sustainable Software (GREENS), Ottawa, ON, Canada, 29 April 2025; IEEE Computer Society: Washington, DC, USA, 2025; pp. 31–38.
- Lee, S.; Peng, T.-Q.; Goldberg, M.H.; Rosenthal, S.A.; Kotcher, J.E.; Maibach, E.W.; Leiserowitz, A. Can large language models estimate public opinion about global warming? An empirical assessment of algorithmic fidelity and bias. PLoS Clim. 2024, 3, e0000429.
- Cecconi, F.; Marconi, L.; Barazzetti, A. Climate Change Mitigation Policies Using GPT-4; MISC: Berlin/Heidelberg, Germany, 2024.
- Bhaskar, P.; Seth, N. Environment and sustainability development: A ChatGPT perspective. In Applied Data Science and Smart Systems; CRC Press: Boca Raton, FL, USA, 2024; pp. 54–62.
- Wang, Z.; Zheng, X.; Meng, F.; Wang, K.; Wu, X.; Yu, D. Exploring the Joint Influence of Built Environment Factors on Urban Rail Transit Peak-Hour Ridership Using DeepSeek. Buildings 2025, 15, 1744.
- Chew, Y.J.; Ooi, S.Y.; Pang, Y.H.; Lim, Z.Y. Framework to create inventory dataset for disaster behavior analysis using google earth engine: A Case Study in Peninsular Malaysia for historical forest fire behavior analysis. Forests 2024, 15, 923.
- Stein, A.L. Generative AI and Sustainability. In The Oxford Handbook of the Foundations and Regulation of Generative AI; Hacker, P., Engel, A., Hammer, S., Mittelstadt, B., Eds.; Oxford University Press: Oxford, UK, 2025.
- Bibri, S.E.; Huang, J. Generative AI of Things for sustainable smart cities: Synergies in cognitive augmentation, resource efficiency, network traffic, and anomaly and threat detection for environmental optimization. Sustain. Cities Soc. 2025, 133, 106826.
- Huang, J.; Bibri, S.E.; Keel, P. Generative spatial artificial intelligence for sustainable smart cities: A pioneering large flow model for urban digital twin. Environ. Sci. Ecotechnol. 2025, 24, 100526.
- Cheng, Y.; Zhou, X.; Zhao, H.; Gu, J.; Wang, X.; Zhao, J. Large Language Model for Low-Carbon Energy Transition: Roles and Challenges. In Proceedings of the 2024 4th Power System and Green Energy Conference (PSGEC), Shanghai, China, 22–24 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 810–816.
- Karlsson, J.; Käck, J. Targeting Green Prospects: Identifying Environmentally Conscious Prospects Using AI-driven Tools Within the Swedish Energy Sector. 2025. Available online: https://www.diva-portal.org/smash/get/diva2:1964601/FULLTEXT01.pdf (accessed on 8 January 2026).
- Krzyżewska, A. The applications of ai tools in the fields of weather and climate—Selected examples. Atmosphere 2025, 16, 490.
- Rivero, S.; Chinarro Vadillo, D.; Prieto Andres, A. The green algorithm: Can sustainability define the winner in the AI race? Front. Political Sci. 2025, 7, 1629914.
- Hou, Y.; Yang, S.; Li, L.; Chen, L. Unlocking Environmental Sustainability with Generative Artificial Intelligence: Insights from Resource Orchestration Theory. IEEE Trans. Eng. Manag. 2025, 72, 3080–3093.
- Cheng, Y.H.; Wang, Y.W.; Kuo, C.N. The Potential and Applications of Utilizing the ChatGPT Model for Comparative Analysis of Carbon Emission Calculation Formulas in Public Transportation. In Proceedings of the 2023 12th International Conference on Awareness Science and Technology (iCAST), Taichung, Taiwan, 9–11 November 2023; pp. 31–34.
- Zhang, L.; Yue, D.; Hancke, G.P.; Dou, C.; Yu, L.; Chen, Z. Optimization of Energy and Carbon Emissions in Integrated Energy System Based on Deep Reinforcement Learning Assisted by Large Language Model. IEEE Trans. Ind. Inf. 2025, 21, 8186–8197.
- Ncube, M.M.; Ngulube, P. Enhancing environmental decision-making: A systematic review of data analytics applications in monitoring and management. Discov. Sustain. 2024, 5, 290.
- Wen, J.; Zhang, R.; Niyato, D.; Kang, J.; Du, H.; Zhang, Y.; Han, Z. Generative AI for low-carbon artificial intelligence of things with large language models. IEEE Internet Things Mag. 2024, 8, 82–91.
- Tao, L.; Zhang, H.; Jing, H.; Liu, Y.; Yan, D.; Wei, G.; Xue, X. Advancements in vision–language models for remote sensing: Datasets, capabilities, and enhancement techniques. Remote Sens. 2025, 17, 162.
- Lei, Z.; Dong, Y.; Li, W.; Ding, R.; Wang, Q.R.; Li, J. Harnessing large language models for disaster management: A survey. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2025, Vienna, Austria, 27 July–1 August 2025; ACL: San Diego, CA, USA, 2025; pp. 14528–14551.
- Jiang, P.; Sonne, C.; Li, W.; You, F.; You, S. Preventing the immense increase in the life-cycle energy and carbon footprints of LLM-powered intelligent chatbots. Engineering 2024, 40, 202–210.
- Folke, O.; Ivan Erik Troedsson, A. How Effectively Can AI Be Applied to Extract ESG-Related KPIs from Annual Reports? 2025. Available online: https://www.diva-portal.org/smash/get/diva2:1985641/FULLTEXT01.pdf (accessed on 29 January 2026).
- Zhong, Y.; Zhao, K. Application and Research of Large Language Model in Foreign Language Translation. In Proceedings of the 2024 International Conference on Information Technology, Comunication Ecosystem and Management (ITCEM), Bangkok, Thailand, 20–22 December 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 63–68.
- Bina, R.; Luong, K.; Mehta, S.; Pang, D.; Xie, M.; Chou, C.; Kimbrough, S.O. On Large Language Models as Data Sources for Policy Deliberation on Climate Change and Sustainability. arXiv 2025, arXiv:2503.05708.
- Rahman, G.; Fitriyah, A. Harnessing AI for Climate Change Communication: Analyzing Public Perception through NLP and Machine Learning. Sinergi Int. J. Commun. Sci. 2025, 3, 87–98.
- Sharif, S.; Zeadally, S.; Ejaz, W. Resource optimization in UAV-assisted IoT networks: The role of generative AI. IEEE Internet Things Mag. 2024, 8, 34–41.
- Vivekanandan, V.; Sureshkumar, R.; Manikandan, S.; Ram Kumar, R.; Das, M.S.; Kumar, G.R.; Nandakumar, S. Environmental Monitoring and Sustainability: LLMs for Climate-Responsive Urban Design. In Large Language Models for Sustainable Urban Development; Springer: Berlin/Heidelberg, Germany, 2025; pp. 89–109.
- Das, B.C.; Amini, M.H.; Wu, Y. Security and privacy challenges of large language models: A survey. ACM Comput. Surv. 2025, 57, 1–39.
- Gao, C.; Fan, G.; Chong, C.Y.; Chen, S.; Liu, C.; Lo, D.; Zheng, Z.; Liao, Q. A Systematic Literature Review of Code Hallucinations in LLMs: Characterization, Mitigation Methods, Challenges, and Future Directions for Reliable AI. arXiv 2025, arXiv:2511.00776.
- Mirshekali, H.; Shadi, M.R.; Ladani, F.G.; Shaker, H.R. A Review of Large Language Models for Energy Systems: Applications, Challenges, and Future Prospects. IEEE Access 2025, 13, 163162–163188.
- Leon, M. The escalating AI’s energy demands and the imperative need for sustainable solutions. WSEAS Trans. Syst. 2024, 23, 444–457.
- Agoro, H.; Llorient, M. Methodologies for Testing the Performance and Reliability of Large Language Models in Real-World AI Applications. 2025. Available online: https://www.researchgate.net/publication/391986163_Methodologies_for_Testing_the_Performance_and_Reliability_of_Large_Language_Models_in_Real-World_AI_Applications (accessed on 7 February 2026).
- Hadi, M.U.; Al-Tashi, Q.; Qureshi, R.; Shah, A.; Muneer, A.; Irfan, M.; Zafar, A.; Shaikh, M.B.; Akhtar, N.; Al-Garadi, M.A. Large Language Models: A Comprehensive Survey of Applications, Challenges, Datasets, Models, Limitations, and Future Prospects. TechRxiv 2024, techrxiv.23589741.
- Feretzakis, G.; Verykios, V.S. Trustworthy AI: Securing sensitive data in large language models. AI 2024, 5, 2773–2800.
- Afreen, J.; Mohaghegh, M.; Doborjeh, M. Systematic literature review on bias mitigation in generative AI. AI Ethics 2025, 5, 4789–4841.
- Piwowar-Sulej, K. Sustainable development and national cultures: A quantitative and qualitative analysis of the research field. Environ. Dev. Sustain. 2022, 24, 13447–13475.
- Svoboda, I.; Lande, D. Enhancing multi-criteria decision analysis with ai: Integrating analytic hierarchy process and gpt-4 for automated decision support. arXiv 2024, arXiv:2402.07404.
- Huang, Y. Advancing industrial sustainability research: A domain-specific large language model perspective. Clean. Technol. Environ. Policy 2025, 27, 1899–1901.
- Zou, Y.; Shi, M.; Chen, Z.; Deng, Z.; Lei, Z.; Zeng, Z.; Yang, S.; Tong, H.; Xiao, L.; Zhou, W. ESGReveal: An LLM-based approach for extracting structured data from ESG reports. J. Clean. Prod. 2025, 489, 144572.
- Amangeldy, B.; Tasmurzayev, N.; Imankulov, T.; Baigarayeva, Z.; Izmailov, N.; Riza, T.; Abdukarimov, A.; Mukazhan, M.; Zhumagulov, B. AI-Powered Building Ecosystems: A Narrative Mapping Review on the Integration of Digital Twins and LLMs for Proactive Comfort, IEQ, and Energy Management. Sensors 2025, 25, 5265.
- Ullah, A.; Qi, G.; Hussain, S.; Ullah, I.; Ali, Z. The role of llms in sustainable smart cities: Applications, challenges, and future directions. arXiv 2024, arXiv:2402.14596.
- Peykani, P.; Ramezanlou, F.; Tanasescu, C.; Ghanidel, S. Large language models: A structured taxonomy and review of challenges, limitations, solutions, and future directions. Appl. Sci. 2025, 15, 8103.
- Zheng, D.; Li, J.; Yang, Y.; Wang, Y.; Pang, P.C.-I. MicroBERT: Distilling MoE-based Knowledge from BERT into a Lighter Model. Appl. Sci. 2024, 14, 6171.





| Database | Search Formula |
|---|---|
| WOS | (TS=): (“LLM*” OR “Large Language Model*” OR “ChatGPT” OR “GPT*” OR “Foundation Model*” OR “AIGC”) AND (“Sustainability” OR “Environmental Sustainability” OR “Green Economy” OR “Climate” OR “Environment*”) |
| Scopus | (TITLE-ABS-KEY): (“LLMs” OR “Large Language Model*” OR “ChatGPT” OR “GPT*” OR “Foundation Model*” OR “AIGC”) AND (“Sustainability” OR “Environmental Sustainability” OR “Green Economy” OR “Climate” OR “Environment*”) |
| IEEE Xplore | (All Metadata): (“LLMs” OR “Large Language Model*” OR “ChatGPT” OR “GPT*”) AND (“Sustainability” OR “Environmental Sustainability” OR “Green Economy” OR “Climate”) |
| ACM Digital Library | (Full Text): (“LLMs” OR “Large Language Model*” OR “ChatGPT” OR “GPT*”) AND (“Sustainability” OR “Environmental Sustainability” OR “Green Economy” OR “Climate”) |
| ScienceDirect | (Title/Abstract/Keywords): (“LLMs” OR “Large Language Model*” OR “ChatGPT” OR “GPT*”) AND (“Sustainability” OR “Environmental Sustainability” OR “Green Economy” OR “Climate”) |
| Google Scholar | (LLMs OR Large Language Model OR ChatGPT OR GPT) AND (Sustainability OR Environmental Sustainability OR Green Economy OR Climate) |
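
The search formulas above share one pattern: a group of LLM terms joined with OR, combined by AND with a group of sustainability terms, and prefixed by a database-specific field tag. The following is a minimal sketch of how such strings could be assembled; the helper names `or_group` and `build_query` are hypothetical, straight quotes replace the typographic quotes used in the table, and each database's own interface may require further syntax adjustments.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(field_tag, llm_terms, topic_terms):
    """Combine the two OR groups with AND, prefixed by the field tag."""
    return f"{field_tag}: {or_group(llm_terms)} AND {or_group(topic_terms)}"


# Term lists copied from the Scopus row of the search formula table.
LLM_TERMS = ["LLMs", "Large Language Model*", "ChatGPT", "GPT*",
             "Foundation Model*", "AIGC"]
TOPIC_TERMS = ["Sustainability", "Environmental Sustainability",
               "Green Economy", "Climate", "Environment*"]

scopus_query = build_query("(TITLE-ABS-KEY)", LLM_TERMS, TOPIC_TERMS)
print(scopus_query)
```

Keeping the term lists separate from the field tag makes it easier to see where the per-database formulas differ, for example the shorter term lists used for IEEE Xplore, ACM Digital Library, and ScienceDirect.
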
| Inclusion Criteria | Exclusion Criteria |
|---|---|
| Research specifically targeting environmental sustainability | Research fields outside of environmental sustainability |
| Research on LLM technology within the environmental sustainability domain | Research on LLM technology outside of environmental sustainability |
| Applied research utilizing LLM technology | Studies focusing on attitudes, opinions, intentions, benefits, barriers, impacts, experiences, or usage demands regarding LLM technology |
| Research articles and conference papers (full text) | Review articles, theses/dissertations, non-academic publications, book chapters, etc. |
| Full text published in English | Full text published in other languages |
| Studies deploying LLMs as an active analytical or operational tool with documented input/output processes | Studies presenting solely conceptual, theoretical, or argumentative analyses of LLM potential without empirical demonstration or system implementation |
| Author/Year/Country | Methodology | Input and Output Content Type/Publication | Application | Model | Significance | Impact | Potential & Future Trends |
|---|---|---|---|---|---|---|---|
| Shoaib et al. 2023 Singapore [17] | Qualitative | Text/Conference papers | Intelligent Transportation Systems (ITS), Vehicle Technology, Smart Cities | LLM, ITS Domain | LLMs not only facilitate the development of autonomous vehicles but also contribute to smart cities by alleviating congestion and optimizing traffic routes through ITS. They address the challenge that foundation models and frontier AI do not solve problems efficiently | Researchers need to proactively use regulatory frameworks to mitigate potential harms and the unique challenges associated with integrating frontier AI and foundation models into ITS | Future research in this area should delve into the ad hoc response capabilities of these AI technologies when confronting new challenges. This involves refining their real-time decision-making in traffic management, tailoring responses to the specific needs of smart cities, and encouraging interdisciplinary collaboration to fully unlock the potential of LLMs in creating sustainable, smart, and human-centered transportation ecosystems |
| Song et al. 2025 Hong Kong, China [18] | Quantitative (Atmospheric Simulation Systems, Retrieval-Augmented Generation (RAG) Model, AirGPT Framework) | Literature Corpus, Related Professional Data/Journal Articles | Air Quality Assessment | LLM | Demonstrates exceptional ability in providing accurate regulatory information, executing basic data analysis, and generating location-specific management advice. AirGPT outperforms others in professionalism and accuracy across most query categories. Although GPT-4o’s quality score is comparable, its broader knowledge base occasionally leads to more detailed answers | The system aids researchers by providing accurate regulatory information, executing basic data analysis, and generating location-specific management advice. It avoids hallucinations by strictly adhering to validated scientific sources, which is especially critical when addressing air quality management policies that affect public welfare | Although AirGPT exhibits strong analytical capabilities, it is a decision support tool, not a substitute for expert judgment. Users are advised to exercise professional discretion and verify key recommendations, particularly those affecting public health and environmental policy. Future work needs to incorporate safeguards in training data to prevent potential biases and continuously monitor system output to ensure alignment between LLMs and established scientific consensus in environmental science and air quality management |
| Cappendijk et al. 2025 Netherlands [19] | Quantitative (Peak and Floating Point Operations (FLOPs)) | Energy-Efficient Code/Conference papers | Atmospheric Greenhouse Gas Emissions | Code Llama-70b, Code Llama-70b-Instruct, Code Llama-70b-Python, DeepSeek-Coder-33b-Base, and DeepSeek-Coder-33b-Instruct | LLMs are largely capable of generating code with lower energy consumption than human-written code | For-loop optimization often leads to lower energy consumption compared to baseline solutions. Replacing native Python 3.8 code with more efficient library functions allows some LLM-generated solutions to outperform baseline solutions | AI-generated code is not always superior to human-generated code, especially since the AI-generated code setups were simple (single prompt, no hyperparameter tuning). In future coding processes, the same result might sometimes lead to lower or higher energy consumption |
| Lin et al. 2025 Taiwan, China [16] | Quantitative (Python mapping libraries, JSON module, Folium Heatmap plugin) | Geographic Information Maps, Taiwan CDC Open Data Platform/Journal Articles | Dengue Fever Epidemic and Earthquake Intensity Maps | ChatGPT 4, Copilot1.8, Claude, Chatbot UI 2.0, Code Lats, Code Llama, Gemini, and BigCode | Enhances the autonomy of geographical monitoring for dengue fever and earthquakes, saving researchers substantial time and labor costs. Autonomously generated maps provide a clear understanding of the distribution and clustering of disasters in different regions. Improves the timeliness, consistency, scalability, and standardization of data processing while automating repetitive tasks and assisting with data analysis | Highlights the broad applicability of LLMs in text generation and data visualization, and offers valuable reference for research in automated disaster monitoring, prevention, and mitigation | Challenges remain in ensuring the accuracy of geographically related code generated by LLMs and effectively interpreting results. Future research still requires targeted training of LLMs in the geospatial domain |
| Lee et al. 2024 USA [20] | Mixed (Sampling Survey, Silicon Sample Data Collection, Survey Measurement, MAF) | Questionnaire/Journal Articles | Human Attitudes and Behavior, Global Warming Opinions | GPT-4 | LLMs can effectively reproduce presidential voting behavior but not global warming opinions unless issue-related covariates are included. When both demographics and covariates are considered, GPT-4 demonstrates higher accuracy in predicting beliefs and attitudes toward global warming | Provides valuable insights into the algorithmic fidelity and bias of LLMs when simulating public opinions on global warming. Offers targeted practical guidance on conditional prompting and model selection to maximize fidelity in social science applications, while emphasizing the importance of validating LLMs, especially for minority groups | Future researchers need to pay close attention to whether massive training data favors certain attitudes towards climate change, or whether post-training adjustments guided by human feedback lack representativeness. A nuanced approach is needed for human attitudes and behavior to utilize the low-cost management capabilities while addressing their limitations through proactive algorithmic auditing and bias mitigation |
| Cecconi et al. 2024 Italy [21] | Qualitative (Predictive Modeling, NLP) | Legislative Documents/Book Chapter | Policies related to Climate Change Mitigation Efforts | GPT-4 | Through NLP, LLMs can help interpret complex climate model results, translating technical model outputs into understandable narratives that describe potential future states of the climate system under different policy pathways. Enhancing public engagement and education: LLMs play a crucial role in transforming complex climate science into content that is understandable and engaging for the public | The integration of LLMs into climate change research and policymaking represents a promising convergence of technology and environmental science. By leveraging the power of LLMs, researchers, policymakers, and advocates can enhance citizens’ understanding of climate dynamics, improve the formulation of effective policies, and promote greater public participation in climate action | As the urgency to address global warming intensifies, current research techniques are temporarily unable to efficiently support environmental research. Integrating complex AI technologies like LLMs into environmental research is a direction researchers need to prioritize |
| Bhaskar et al. 2024 India [22] | Quantitative | Digital Articles, Theses, Reputable Journals, Government Websites, Statistical Websites, etc./Book Chapter | Environmental Impact, Carbon Footprint, Water Footprint, Greenhouse Gas Emissions | ChatGPT | Emphasizes the importance of sustainability in AI development and reveals the negative environmental impacts of ChatGPT. Highlights the excessive energy consumption associated with training and running ChatGPT models. The carbon footprint generated by the training process raises concerns about AI development’s contribution to climate change and environmental degradation | Longitudinal studies using LLMs over extended periods can help researchers monitor changes in the environmental impact of AI models. Provides researchers with insightful information on the precise environmental impact of AI models across various industries or applications. Future focus can be placed on actual settings, considering factors like data centers, power supply, and energy-saving technologies | The study relies solely on secondary data sources, meaning the actual findings are limited by the availability and quality of existing studies and reports. Future research should focus on providing more comprehensive and accurate empirical data on the environmental impact of AI models like ChatGPT |
| Wang et al. 2025 China [23] | Quantitative (Points of Interest (POI) calculation, Descriptive Statistics, Mean Absolute Percentage Error (MAPE), Coefficient of Determination, and A20 index) | POI, Road Network Data, Housing Prices, and Population Data/Journal Articles | Built Environment; Public Transit Ridership | DeepSeek-R1 | Offers valuable insights for transportation planners and policymakers, assisting government and transportation departments in urban traffic layout planning and providing guidance for environmental sustainability governance policies | Research findings can lead to land use intensification, an increased modal share for public transport, reduced traffic congestion, thereby lowering associated carbon emissions from traffic, and ultimately enhancing the overall sustainability of the transportation system | The mechanism by which the built environment influences various modes of transportation remains a vital topic requiring further research. With the continuous improvement of LLM reasoning capabilities in future research, LLMs can be utilized by researchers to consider a wider range of environmental sustainability impact factors and explore the joint influence between environmental sustainability and other factors |
| Martín-Domingo et al. 2025 Ireland [7] | Mixed (Sustainability Reporting, Systematic Review) | Environmental Key Performance Indicators (KPIs)/Journal Articles | Environmental Sustainability Indicators and Regulatory Compliance | GPT-4.0, o3-mini, and DeepSeek-R1 | Expands the scope of previous research to include a broader range of sustainability indicators, different transport modes or domains, and the use of multiple languages and further fine-tuning for Small Island Developing States (SIDS). Precisely evaluates the accuracy and reliability of LLMs in extracting emission-related KPIs from European airline sustainability reports | The model analysis covers multiple data sources, extraction strategies, and model architectures, providing a comprehensive overview of factors affecting automated KPI extraction performance. Using commercial LLMs may lead to data being publicly disclosed outside the responsible organization’s jurisdiction, potentially violating GDPR or other regulations when processed by cloud-hosted LLMs, and risks exposing sensitive information to external systems | Future research needs to further explore the integration of LLMs with structured data extraction tools. Cost–benefit and business scalability assessments will help promote the automation of ESG data extraction and support constantly changing regulatory requirements. Facilitating collaboration across the interdisciplinary boundaries of AI, sustainability, and compliance will lay the groundwork for future sustainable digital transformation and standards |
| Chew et al. 2024 Malaysia [24] | Quantitative (Machine Learning (ML), Data Analysis) | Keetch-Byram Drought Index (KBDI), Soil Moisture, Temperature, Wind Speed, Land Surface Temperature (LST), Palmer Drought Severity Index (PDSI), Normalized Difference Vegetation Index (NDVI), Land Cover, and Rainfall, etc./Journal Articles | Forest Fire Investigation | Google Earth Engine Integrated Framework, ChatGPT | Provides valuable insights into the fire scenarios in Peninsular Malaysia. Preliminary analysis of the annual average of forest fires concludes that the main factors | Lowers the threshold for data scientists, allowing users to apply their analytical skills directly to datasets extracted by the Global Environmental Research Centre, thereby reducing the need for in-depth remote sensing knowledge | No manual coding was performed during the analysis; the analysis and Python scripts for generating the results were created through simple prompts in the ChatGPT interface. As technology continues to evolve, researchers leading future studies should consider adopting this technique to improve their methods and analysis |
| Stein 2025 USA [25] | Qualitative | Digital Data of the Internet, Social Media, and AI/Journal Articles | Water, Energy, Carbon, Waste, and Land Use | GenAI | GenAI can bring more net benefits in our collective pursuit of a more sustainable environment, while stimulating innovation and sustainability potential, offering more targeted innovation and upgrades compared to previous research | GenAI, like all major industries, places pressure on the world’s limited resources while bringing convenience. Data centers training and running GenAI models generate non-negligible negative impacts on water, energy, carbon, waste, and land use | Most AI users are distant from these impacts. Future involvement from government or industry is needed to minimize the socially borne costs of data centers. The GenAI industry is better suited to record its environmental impact and integrate sustainability into corporate ethical commitments. Future research in related fields needs to strive to minimize the negative environmental impact of GenAI |
| Bibri et al. 2025 Switzerland [26] | Quantitative | Heterogeneous Real-time Data from Urban Systems/Journal Articles | Cognitive Enhancement, Resource Efficiency, Network Traffic, Cybersecurity and Anomaly Detection, Resource Sustainability | GenAI, AI + Internet of Things (AIoT) Systems | The study integrates current GenAI and AIoT, further emphasizing domain-specific advancements and their synergy. It promotes the development of sustainable smart cities by fostering a smarter, more energy-efficient, adaptive, secure, robust, and autonomous AIoT ecosystem through the strategic application of generative intelligence | Provides practical guidance for policymakers, urban planners, system designers, and technology developers, helping researchers utilize GAIoT to enhance the resilience, sustainability, and operational capabilities of smart cities | Future research should consider simulation-based testing, prototyping, and real-world implementation to validate the practical value of the framework. This includes specifying data flows, control algorithms, feedback mechanisms, and system interactions to convert the high-level conceptual framework into a functional architecture |
| Huang et al. 2025 Switzerland [27] | Quantitative | Lausanne’s Blue City Project Data/Journal Articles | Innovative Urban Development Solutions | GenAI, Foundation Models (FMs), and Urban Digital Twin (UDT) Framework | Enhances decision-making processes, supports evidence-based planning and design, promotes integrated development strategies, and enables the development of more efficient, resilient, and sustainable urban environments. It advances the theory and practice of AI-driven, environmentally sustainable urban development through the implementation of GenAI and FMs within the UDT framework | Provides complex decision tools and valuable insights for urban planners, designers, policymakers, and researchers, helping them address the complexities of modern cities and accelerate the transition towards a sustainable urban future | N/A |
| Cheng et al. 2024 China [28] | Quantitative (Time Series Analysis, Regression Analysis) | Real-time Meteorological Data, Electricity Market Data, and Equipment Operating Status Data/Conference papers | Low-Carbon Power Scenarios, Carbon Market Dynamics, Climate Risk Assessment, and Urban Planning Strategies | LLM | Low-carbon energy management can drive innovation and sustainability in the low-carbon energy transition, and is capable of optimizing grid operations, integrating renewable energy, and predicting demand patterns | LLMs can facilitate better decision-making for researchers, optimize resource allocation, and accelerate the development of innovative low-carbon technologies. It also maximizes the impact of low-carbon energy management in the low-carbon energy transition | Future research may face challenges in data quality and availability, cybersecurity, legal management training costs, interdisciplinary collaboration, and evaluation benchmarks. Further exploration of land use management applications in various low-carbon energy transition scenarios and its combination with other advanced technologies will help unlock its full potential in driving sustainable development |
| Karlsson 2025 Sweden [29] | Mixed | Sustainability Reports, Corporate Websites, Social Media, and News Articles/Book Chapter | Green Transition | GenAI | Microsoft Copilot and ChatGPT achieved similar results in identifying environmental prospects, with only minor differences observed between the AI models | GenAI adds value to exploration tasks, but the usefulness of its models depends on the implementation strategy. The models’ conclusions are accurate enough for the task but should be continuously monitored. The suggested tools are not meant to replace human judgment but to assist the sales process and drive green transition | Researchers should adopt alternative methods, emphasizing strong initial economic partnerships before focusing on sustainability intentions |
| Krzyżewska 2025 Poland [30] | Quantitative (Cloud Identification and Classification) | Official World Meteorological Organization (WMO) Cloud Atlas/Journal Articles | WMO Cloud Classification; AI Map Interpretation | ChatGPT o3-mini, o1, 4.0, 4.0; Gemini Advanced 1.5 and 2.0; Copilot; Perplexity; DataAnalyst; Consensus; ScholarGPT; SciSpace; Claude; and DeepSeek | Current systems in the meteorological field offer tremendous support in areas such as cloud classification, map interpretation, and literature review, but their performance remains inconsistent and varies across models and tasks | Standardized protocols must be established to evaluate their performance over time. Repeating tests across model updates and platforms will help determine if these tools can achieve the consistency and reliability required for broader adoption in meteorology and climate science | Future research should focus on improving the ability of AI models to interpret structured geo-scientific data, such as time series, gridded weather data, and integrated model outputs. Specialized evaluation benchmarks are also needed to reflect the complexities of specific domains, such as the WMO classification system or geospatial metadata interpretation |
| Rivero et al. 2025 Spain [31] | Qualitative (Theoretical and Comparative Analysis) | Publicly Available Technical Literature, Academic Literature, and Official Policy Documents related to LLM Development and Deployment/Journal Articles | Environmental Sustainability | ChatGPT and DeepSeek | The study highlights certain security risks associated with the DeepSeek distilled model. It indicates that sustainability is no longer a marginal issue but is increasingly viewed as a crucial factor in the geopolitical agenda | While it is too early to definitively conclude that LLMs are the decisive axis of technological competition, current findings suggest that China is progressively adjusting its strategic focus toward more responsible innovation in the field of environmental sustainability | Parts of the study rely on developer information and white papers, which may not fully reflect the technical specifications or energy consumption data of ChatGPT and DeepSeek. Inherent opacity limits full comparability between models. The academic community must remain focused on empirically validated new data in the future. Although the article positions sustainability as a potential strategic advantage axis, this hypothesis has not been empirically tested through deployment or real-time performance measurement |
| Hou et al. 2025 China [32] | Quantitative (Questionnaire Survey, Back-translation Method) | Questionnaire from 260 High-tech Manufacturing Enterprises in China/Journal Articles | Decarbonization Capability, Environmental Performance | GenAI | Given the dual impact of GenAI on the environment, achieving sustainability through LLMs requires careful management of the technology’s footprint, posing a key challenge for engineering managers | GenAI contributes to the technology-driven management literature in environmental sustainability and provides valuable insights to help companies take steps toward achieving carbon neutrality | Future research should collect historical data on sustainability initiatives and incorporate other research methods to further improve the accuracy of conclusions. It emphasizes the boundary conditions of environmental digitization but does not consider some contextual factors, including pressure from customer engagement, policy support, and industry pollution levels. Future research may include more contextual factors to systematically explore the boundary conditions for GenAI in unlocking environmental sustainability |
| Cheng et al. 2025 Taiwan, China [33] | Quantitative | Carbon emission formulas for various modes of transportation in 2021/Conference papers | Public Transport Carbon Emissions, Carbon Emissions, Carbon Reduction | ChatGPT | The study demonstrates the complexity and challenges of carbon emission calculation, emphasizing the importance of comparing and validating formulas from different sources. The potential of the ChatGPT model as an NLP technology is showcased in the context of carbon emission calculation | Potential to provide more accurate carbon emission estimation methods and recommendations, helping governments and relevant organizations formulate effective emission reduction policies and measures | Future calculations of carbon emissions for different transportation modes should consider factors like vehicle type and driving conditions in greater detail to improve calculation accuracy. A meticulous assessment of accuracy, reliability, interpretability, and practicality is required when selecting carbon emission calculation formulas |
| Zhang et al. 2025 China [34] | Quantitative (Integrated Energy System Model) | Problems of Supply–Demand Imbalance in Energy Systems, Operational Optimization Problems in Grid-Interactive High-Efficiency Commercial Buildings/Journal Articles | Efficient Energy Conversion and Utilization | Deep Reinforcement Learning (DRL), Integrated Energy System (IES), LLMs | Provides a new approach for improving the optimization and decision-making results of intelligent evolutionary systems in environmental sustainability, enhancing efficiency in environmental sustainability governance | The combined mechanism is specifically designed to support dynamic transactions and enhances decision performance | Consumer satisfaction was not explicitly reflected in the reward function, which leaves the analysis incomplete. Future research needs to design specific consumer satisfaction evaluation segments to ensure coverage of various consumer groups |

| Method Type | Frequency (n) | Specific Methodologies |
|---|---|---|
| Quantitative | 13 | Atmospheric simulation systems, RAG models, AirGPT framework, LLMs, Peak and Flops, Python mapping libraries, JSON modules, Folium Heat map plugins, data analysis, POI calculation, descriptive statistics, MAPE, Coefficient of Determination, A20 index, machine learning, flowchart, conceptual framework, large flow models, time series analysis, regression analysis, cloud identification and classification, carbon emission formulas, IES models |
| Qualitative | 4 | Predictive modeling, NLP, comparative discussion, theoretical and comparative analysis |
| Mixed | 3 | Sampling surveys, silicon wafer sampling, data collection, survey measurements, sustainable investigation reports, systematic review, LLMs, and manual control groups |
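
Several of the quantitative studies above report prediction accuracy with MAPE, the coefficient of determination, and the A20 index. As a minimal sketch of how these three metrics are commonly computed, the Python below uses only the standard library. The function names and the sample values are hypothetical and do not come from any included study, and the A20 index is assumed here to be the share of predictions whose predicted-to-observed ratio lies within 0.8 to 1.2, which is one common definition; individual studies may define it differently.

```python
from statistics import mean


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero in y_true)."""
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def a20_index(y_true, y_pred):
    """Share of samples whose predicted/observed ratio lies in [0.8, 1.2].

    Assumption: ratio taken as predicted over observed, with nonzero y_true.
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if 0.8 <= p / t <= 1.2)
    return hits / len(y_true)


# Hypothetical observed and predicted values, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 300.0]

print(f"MAPE: {mape(observed, predicted):.2f}%")
print(f"R^2:  {r_squared(observed, predicted):.3f}")
print(f"A20:  {a20_index(observed, predicted):.2f}")
```

Lower MAPE and higher coefficient of determination and A20 values indicate a closer fit. Reporting the three together is useful because a single large miss (the last sample here) affects each metric differently.
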

| Data Type (Input/Output) | Frequency (n) | Core Function/Application Scenarios |
|---|---|---|
| Structured Professional Data | 13 | Climate change analysis, drought warning, and ecosystem health assessment; Simulating carbon emission distribution, planning green infrastructure, and evaluating the socio-economic impact of policies; Load forecasting, energy efficiency analysis, and grid balance optimization; Traffic carbon emission formulas to quickly and accurately calculate the environmental footprint of various activities |
| Text | 4 | Forming the knowledge base and contextual understanding essential for comprehending concepts, interpreting policies, tracking frontiers, and gaining insight into public sentiment. Rapidly reviewing academic literature to distill the latest research findings and technological trends; Analyzing corporate sustainability reports to assess ESG performance and greenwashing risks; Capturing public concern and discourse trends on environmental issues from news and social media |
| Real-time Data | 5 | Providing dynamic, real-world information for real-time monitoring, warning, and adaptive control. Integrating real-time meteorological data to issue warnings for extreme weather (e.g., typhoons, heatwaves) and generating emergency recommendations; Accessing real-time urban data (e.g., traffic, energy consumption) to dynamically optimize traffic signals and adjust energy distribution |
| Sustainable Problem-specific Data | 3 | Translating abstract sustainability challenges into specific task instructions that LLMs can comprehend and execute, guiding the model to solve highly specialized and complex industry problems. Enabling LLMs to generate concrete solutions for reducing energy consumption and enhancing system resilience; Guiding LLMs to propose innovative solution pathways by synthesizing their knowledge base and data analysis capabilities |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Su, X.; Liu, T.; Pang, P.; Luo, Y.T.; Wong, D. How Can Large Language Models Drive Environmental Sustainability? A Systematic Scoping Review. Sustainability 2026, 18, 4327. https://doi.org/10.3390/su18094327

