Artificial Intelligence for Infrastructure-as-Code—A Systematic Literature Review

Pahl, Claus; Sezen, Övgüm Can; Hofer, Florian

doi:10.3390/electronics15040755

by Claus Pahl *, Övgüm Can Sezen and Florian Hofer

Faculty of Engineering, Free University of Bozen-Bolzano, 39100 Bolzano, Italy

* Author to whom correspondence should be addressed.

Electronics 2026, 15(4), 755; https://doi.org/10.3390/electronics15040755

Submission received: 23 December 2025 / Revised: 28 January 2026 / Accepted: 31 January 2026 / Published: 10 February 2026

(This article belongs to the Special Issue Advanced System Architectures and AI-Driven Innovations for Next-Generation Computing)

Abstract

Infrastructure-as-Code (IaC) is a systems management practice that involves managing and provisioning computing infrastructure through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. IaC is an essential contribution to the complete automation of the entire software lifecycle in a Development and Operations (DevOps) context. The deployment and management of software through coded configuration, monitoring, and analysis is the IaC solution. In recent times, artificial intelligence (AI), including generative AI, machine learning, and related techniques, offers opportunities to improve techniques across the IaC life cycle from IaC code generation to its deployment and runtime analysis. We conducted a comprehensive and systematic literature review for all IaC code development and operations phases, considering IaC as a specific software type that we map to the DevOps model. We present the bibliographic review results and investigate in which phases and how AI can enhance IaC techniques by extracting a framework of phase-specific AI contributions and research challenges, contrasting, in particular, generative AI and machine-learning applications across the phases. Key findings are that Large Language Models (LLMs) dominate generation activities, that Machine Learning (ML) dominates analysis activities, and that operations phases are less studied than IaC development. This review extends previous literature reviews by covering the full DevOps lifecycle, developing a phase-specific taxonomy of AI techniques for IaC, and providing a comprehensive analysis of research challenges and directions that highlights current innovations for developers and points researchers to future directions.

Keywords:

infrastructure-as-code; IaC; DevOps; artificial intelligence (AI); generative AI; machine learning; systematic literature review; research challenges

1. Introduction

Infrastructure-as-Code (IaC) is concerned with provisioning and managing infrastructure resources through programmatic solutions. It is an essential contributor to the full automation of the software lifecycle in a DevOps context as the last activity in the full cycle. The deployment and management of systems through coded DevOps configuration, deployment, monitoring, analysis, and self-healing activities is the IaC solution.

In recent times, artificial intelligence (AI) offers opportunities to improve the technique across the IaC life cycle, from IaC generation to self-healing IaC deployments. Machine learning (ML) can help analyze code and execution logs to detect anomalies and undesirable events. Generative AI (GenAI), specifically large language models (LLMs), can help generate or analyze IaC code.

IaC is a programmatic tool, i.e., it is a specific kind of software with its own languages, tools, and lifecycle. We embed this IaC lifecycle within a DevOps lifecycle model, enabling us to focus on automation. We conducted a comprehensive systematic literature review (SLR) for all IaC development and operations lifecycle phases. We based the extraction on four common literature databases and applied the PICO protocol for filtering relevant contributions. In addition to presenting a range of bibliometric data from the review, we extracted a framework of IaC phase-specific AI contributions and challenges that organizes and concretizes the AI contributions by mapping them to specific IaC DevOps tasks and their respective solutions.

While some literature reviews and technical surveys exist on IaC and AI utilization, this is the first review to comprehensively cover the full IaC DevOps cycle, following a systematic literature review protocol. The results indicate, for instance, that successful applications of AI to software development in general are also being transferred to IaC as a specific type of software. Examples include code generation (via LLMs) and code analysis (via ML). Currently, the use of AI for the later IaC operation phases of the lifecycle is less well explored.

The paper is structured as follows. We begin with a brief background on IaC in Section 2, then present our review methodology in Section 3. General observations from the literature review are presented first in Section 4, followed by a deeper analysis of existing contributions based on a phased IaC DevOps framework in Section 5. In a further round, we contrast our findings with those of other reviews in Section 6, and conclude in Section 7.

2. Infrastructure-as-Code and DevOps

We start with a definition of ‘Infrastructure-as-Code’ as a concept to frame the review. Generally, Infrastructure-as-Code (IaC) refers to the concept of managing and provisioning infrastructure resources, such as servers or networks. IaC is a software operations solution, i.e., it refers to the later phases of a DevOps chain. DevOps is a set of practices that aims to improve collaboration and communication between software development and IT operations teams, to deliver high-quality software more quickly and reliably.

2.1. IaC Definition

We considered the IaC definitions from two international companies with different business areas (AWS and Red Hat), from Wikipedia as a widely used knowledge source, and from the Cloud Native Computing Foundation (CNCF) as a foundation with a diverse set of participants. These four sources cover a range of perspectives, but allowed us to identify common aspects used in the respective definitions.

The four definitions stem from different professional contexts, but share the following aspects:

an activity, which is sometimes also called a process, an ability, or a practice;
an objective, which can be divided into expected benefits (e.g., to automate or to transition from fixed to software-based flexibility) and foreseen disadvantages or problems that should be avoided (e.g., undocumented manual changes could entail configuration problems);
a means or a mechanism that is linked to the objective, which aims at realizing the intended benefit, often directly referring to software scripts (generally declarative or procedural) as the most common solution;
subjects, which are the assets that are affected by both benefits or disadvantages through the application of the means, which in general is the IT infrastructure;
human agents, or in some cases only an agent, which is generally the organization in charge, or a dedicated developer.

We provide a concrete definition of Infrastructure-as-Code by taking into account the common indications from the above sources that we surveyed:

IaC is an activity/practice.
The IaC objective is to realize automation and quality benefits and to avoid problems and manual costs caused by complexity.
The IaC means are digitally stored instructions that can be automatically executed (usually in the form of software scripts).
The IaC subject is the IT infrastructure.
The human agents are generally developers or other DevOps teams specifically concerned with a system’s operation.

The definition identifies commonalities among IaC technologies, of which a variety exists. Despite these commonalities, existing tools in the market are fragmented by the functions they support, such as infrastructure provisioning, configuration, orchestration, or deployment. As concrete examples, CloudFormation and Terraform are cloud-oriented and are primarily used to set up VMs and connect them to a network, whereas Chef is used to configure and secure those VMs.

2.2. IaC Context

The use of AI for operations is often referred to as AIOps [1], which is a related term. AIOps, or Artificial Intelligence for IT Operations, is a term that refers to the use of artificial intelligence, machine learning, and data analytics to automate and improve IT operations. AI solutions aim to handle the complexity and volume of data, helping to detect, diagnose, and resolve incidents more quickly and efficiently. As AIOps is a term that encompasses broader activities beyond IaC, we do not use it to describe our focus, which is narrower, with AI applied only to IaC.

We classify the following selected IaC technologies in the market in Table 1 in terms of context, functionality, language, and architecture concerns: Chef, Puppet, Ansible, Pulumi, Heat, DOML, Terraform, TOSCA, and CloudFormation. This reveals some industry trends. There is a shift towards declarative paradigms (7 out of 9 tools) and DSLs (6 out of 9 tools). Because declarative specifications state the desired result rather than the steps to reach it, they leave more freedom in how the result is achieved, which AI solutions could exploit. The concept of immutable infrastructure is now embraced by the majority of tools (6 out of 9). This addresses so-called configuration drift, as each deployment starts from a fresh, version-controlled state, and it would also benefit from AI-controlled state management that builds on AI-driven analysis and self-healing.
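To make the notions of a declarative specification and configuration drift more concrete, the following minimal Python sketch compares a desired state with an observed state and reports the settings that differ. It is an illustration only and does not reflect how any of the listed tools work internally; the function name detect_drift and the example settings (instance_type, open_ports, monitoring) are hypothetical.

```python
from __future__ import annotations

from typing import Any


def detect_drift(desired: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (desired_value, actual_value)} for every setting that differs.

    A setting that is absent on one side is reported with None on that side.
    """
    drift: dict[str, tuple[Any, Any]] = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical desired state, as it might be kept under version control.
desired_state = {"instance_type": "small", "open_ports": [22, 443], "monitoring": True}

# Hypothetical observed state after an undocumented manual change (port 8080 opened).
actual_state = {"instance_type": "small", "open_ports": [22, 443, 8080], "monitoring": True}

drift = detect_drift(desired_state, actual_state)
for setting, (want, have) in drift.items():
    print(f"drift in {setting}: desired={want}, actual={have}")
```

With immutable infrastructure, a non-empty result of this kind would be resolved by redeploying from the version-controlled desired state rather than by patching the running resource.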

2.3. SLR Motivation

The lack of comprehensive SLR-type surveys on AI for IaC motivates a dedicated study. We approach this by considering IaC as a specific software type with its own languages and coding techniques, as well as deployment tools and techniques. IaC is based on coded infrastructure operations and is thus a specific software development and operations technique in itself that uses common software techniques to create, manage, and deploy infrastructure resources throughout the software lifecycle. In Figure 1, we outline an IaC lifecycle by adapting a generic DevOps lifecycle. Thus, we define an IaC-specific DevOps, or IaCDevOps, cycle. An IaC lifecycle process therefore follows a common, general-purpose DevOps lifecycle that covers development and operations activities:

Planning: Define infrastructure requirements through general planning and structured requirements elicitation with stakeholders.
Code creation: Create code or scripts that define the required infrastructure resources and their management according to the requirements.
Building and Packaging: Use software engineering tools to compose, build, and package IaC software.
Testing and verification: Validate the software using code testing and other suitable verification techniques, such as static code analysis.
Release, configuration, and deployment: Use deployment automation tools, e.g., Jenkins or Ansible/Terraform, to automate the configuration and deployment of infrastructure resources.
Operation, monitoring and self-healing: Configure and use monitoring tools, e.g., Prometheus or Nagios, to track the general health, and specifically performance and security of the infrastructure resources, and, where possible, self-heal through remediation actions.

Figure 1. IaCDevOps—a dedicated DevOps Lifecycle for IaC.

3. SLR—Methodology and Execution

Systematic Literature Reviews (SLRs) follow a defined protocol to reduce bias through a rigorous sequence of methodological steps in the research literature. They rely on well-defined review protocols to extract, analyze, and document results. We follow the process presented in [3] with a three-step review that includes planning, conducting, and documenting.

Plan: identify need, specify RQs, define protocol
Conduct: select primary studies, extract and synthesize data
Document: document observations, analyze threats, report

The review is complemented by an evaluation of each step’s outcome. Furthermore, we provide an additional characterization framework for the DevOps for IaC study context.

The individual steps, which follow the PRISMA guidelines, are outlined below. Based on the objectives, we specify the research questions (RQs) and the review scope to formulate search strings for literature extraction.

3.1. PRISMA Compliance

The literature review follows the PRISMA guidelines [4]. The corresponding PRISMA flow chart in Figure 2 presents the respective activities and reports the number of studies considered for each activity. In line with the PRISMA guidelines, the title identifies the manuscript as a systematic review. The main objectives of the review are outlined in Section 2, which confirms the lack of a comprehensive review in an area receiving increasing attention. The research questions are summarized in Table 3. The inclusion and exclusion criteria for the review are listed in Table 4. The information sources used to identify studies are the four databases Scopus, SpringerLink, Google Scholar, and IEEE Xplore, covering studies published until 1 October 2025. We apply the PICO criteria and specifically follow [3] as a protocol for software engineering research to avoid bias and to present and synthesize results.

For the total number of 44 included studies, we summarize relevant characteristics of the studies in Section 4. We present more detailed results, describing the included studies in Section 5. We provide a general interpretation of the results and important implications in the research challenges and directions subsections of Section 5. The limitations of the evidence included in the review can be subject to imprecision. This is discussed in the Summary Section 6.4 and also the final limitations in Section 7.2. No funding was received for this review work. Registers have not been used.

3.2. Planning the Review

SLR planning is the first step and covers justifying the need, defining research questions, and specifying a corresponding review protocol.

Identify the Need for SLR: We have already discussed the need for an SLR, and we also make the general goal and scope of the study explicit using the PICO (Population, Intervention, Comparison, Outcome) criteria; see Table 2.
Specifying the Research Questions: As the next activity, we define the research questions to help shape the review protocol, see Table 3. This comprises the motivation to use IaC (e.g., to automate a particular IaC lifecycle activity), the identification of the different types of AI generally used (e.g., the specific forms of Generative AI or Machine Learning), and how these AI techniques are used (e.g., in which phase of the lifecycle they are applied) as well as the identification of open research challenges.
Define and Evaluate Review Protocol: We define a protocol for a literature study based on [3] and our experience with SLRs to define key elements such as the PICO criteria, inclusion/exclusion criteria, as well as the internal extraction and review activities.

The PICO criteria (Population, Intervention, Comparison, Outcome) are explained in Table 2. The population concern is defined through the research questions, which are defined and motivated in Table 3. Intervention refers to the activities of characterization, validation, data extraction, and synthesis. These are the activities applied in our methodology and are explained in Section 3.3. Comparison refers to the mapping of the selected studies to a characterization framework. This framework consists of a bibliometric framework in Section 4 and an AI activities framework organized by DevOps phases in Section 5. The outcome, as the final PICO concern, refers to the characterization framework itself.

3.3. Conducting the Review

Conducting starts with study selection and results in extracted data and synthesized information. We used Scopus, SpringerLink, Google Scholar, and IEEE Xplore as the literature databases to survey. These four databases comprise the publisher databases of two of the most widely active publishers (SpringerLink and IEEE Xplore), Scopus as a curated database often used for evaluations and assessments, and Google Scholar with its very comprehensive coverage of published material.

To extract studies from the four databases, a suitable search term was defined. It focuses on three core concepts—IaC, AI, and DevOps—that we already defined and distinguished in Section 2. For each of the three concerns, we included various spellings and related concepts. For IaC, we used the standard spelling, but also infrastructure automation as a frequently used objective. For AI, we expanded the AI term to include machine learning, generative AI, and large language models as central AI concepts, but also included deep learning and natural language processing as linked terms of higher relevance. For the DevOps concern, we include a range of related concepts from continuous integration to software deployment and automation.

Thus, the combined search term is built from the three distinct thematic concerns: IaC, AI, and DevOps. These were then formalized using suitable acronyms and synonyms and combined as a conjunction of these core thematic concerns:

(“IaC” OR “Infrastructure as Code” OR “Infrastructure-as-Code” OR “Infrastructure Automation”)

AND

(“Artificial Intelligence” OR “Machine Learning” OR “Deep Learning” OR “Natural Language Processing” OR “Large Language Models” OR “AI” OR “ML” OR “DL” OR “NLP” OR “LLM” OR “Generation” OR “Generative”)

AND

(“DevOps” OR “CI/CD” OR “Continuous Integration” OR “Continuous Deployment” OR “Software Deployment” OR “Software Delivery” OR “Automation”)

The search term was developed using [3] and guided by the research questions. These terms were applied to the document title, abstract, and keywords.
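The construction of the combined search term can be sketched in a few lines of Python: each thematic concern becomes a disjunction of quoted terms, and the three disjunctions are joined as a conjunction. The term lists are taken from the search string above; the helper names or_group and build_query are illustrative, and the use of straight quotes is an assumption, since the exact query syntax differs between databases.

```python
# Term lists copied from the search string above.
IAC_TERMS = [
    "IaC", "Infrastructure as Code", "Infrastructure-as-Code", "Infrastructure Automation",
]
AI_TERMS = [
    "Artificial Intelligence", "Machine Learning", "Deep Learning",
    "Natural Language Processing", "Large Language Models",
    "AI", "ML", "DL", "NLP", "LLM", "Generation", "Generative",
]
DEVOPS_TERMS = [
    "DevOps", "CI/CD", "Continuous Integration", "Continuous Deployment",
    "Software Deployment", "Software Delivery", "Automation",
]


def or_group(terms):
    """Join the quoted terms of one concern with OR and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Combine the per-concern disjunctions into one conjunction."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(IAC_TERMS, AI_TERMS, DEVOPS_TERMS)
print(query)
```

Keeping the three concerns as separate lists makes it easy to add a synonym to one concern without touching the others.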

Step 1–Select Primary Studies (Study Selection and Qualitative Assessment). Using the above search term, the extraction from the four databases proceeded as follows in three steps:

1. We initially extracted 102 studies from 2020 to 2025 (until 1 October 2025) from Scopus, SpringerLink, Google Scholar, and IEEE Xplore. No papers published before 2020 were identified during the search.
2. Removing duplicates from the joined list from the four databases resulted in 83 unique publications.
3. The application of exclusion criteria in a quality assurance process was carried out in two steps. (i) Firstly, the title, abstract, and keywords were used to remove non-relevant publications. (ii) Secondly, a further manual review of the remaining papers was carried out to assess the significance of the contribution of AI utilization for IaC. This resulted in a final list of 44 publications, which were then categorized into technical contributions and review papers.
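The mechanical part of these steps (merging database exports, removing duplicates, and a first keyword screen on title and abstract) can be sketched as follows. This is a hypothetical illustration under stated assumptions, not the tooling used in the review: the Record type, the title normalization, the two regular expressions, and the four toy records are all invented for the example. The second screening stage, the manual full-text assessment, is a human judgment and is deliberately not modeled.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    abstract: str
    source: str  # database the record was exported from


def normalize(title: str) -> str:
    """Lower-case a title and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each normalized title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record.title)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Assumed screening patterns; a real screen would use the full search-term lists.
IAC_PATTERN = re.compile(r"\b(iac|infrastructure[- ]as[- ]code)\b")
AI_PATTERN = re.compile(r"\b(machine learning|deep learning|large language models?|llms?)\b")


def passes_screening(record: Record) -> bool:
    """First-stage screen: title or abstract must mention both IaC and an AI technique."""
    text = f"{record.title} {record.abstract}".lower()
    return bool(IAC_PATTERN.search(text)) and bool(AI_PATTERN.search(text))


# Toy records (invented, not taken from the reviewed studies).
records = [
    Record("LLM-Based Generation of IaC Scripts", "We use a large language model.", "Scopus"),
    Record("LLM-based generation of IaC scripts.", "We use a large language model.", "Google Scholar"),
    Record("Manual Server Administration Practices", "A survey of operators.", "IEEE Xplore"),
    Record("Defect Prediction for Infrastructure as Code", "Machine learning classifiers are trained.", "SpringerLink"),
]

unique = deduplicate(records)
screened = [record for record in unique if passes_screening(record)]
print(len(records), "extracted,", len(unique), "unique,", len(screened), "after keyword screen")
```

Title-based deduplication is only one possible choice; matching on DOI, where available, would be stricter.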

The extraction and internal summarizing of the publications with respect to their AI for IaC content was carried out by the second and third authors. The first author served to validate this and to carry out the further selection process. He also allocated the publications to the respective DevOps phases based on a term-extraction process, which in turn was verified by the other two authors.

Step 1a–Initial Selection. This includes screening of titles and abstracts of potential primary studies, performed against inclusion/exclusion criteria as defined in Table 4. We validated whether the abstract or keywords included the key terms and whether it was clear that both a contribution towards IaC and an AI-based contribution were made. We excluded literature only available in the form of an abstract, blog, or presentation, and studies with AI and IaC terms only in the abstract, but with few concrete details in the main sections.

Step 1b–Final Selection. This is based on a validation scan of the studies, a focus on methods for AI for IaC, and tool support, and details of the evaluation approach. Final selected studies are listed in Table 5. At the end, 44 studies were selected. A total of 34 of these were technical contribution papers, and 11 were reviews or surveys. For this review category, we included systematic literature reviews or mapping studies, but also technical surveys of the technology without a specific technical contribution, based on a comparison of methods, techniques, and tools.

Table 5. Final List of Selected Papers based on the PRISMA Protocol.

[5]	Openja, M.; Adams, B.; Khomh, F. Analysis of Modern Release Engineering Topics: A Large-Scale Study using StackOverflow. 2020.
[6]	Bhuiyan, F.A.; Rahman, A. Characterizing Co-located Insecure Coding Patterns in Infrastructure as Code Scripts. 2020.
[7]	Borovits, N.; Kumara, I.; Krishnan, P.; Palma, S.D.; Di Nucci, D.; Palomba, F.; Tamburri, D.A.; van den Heuvel, W.J. DeepIaC: deep learning-based linguistic anti-pattern detection in IaC. 2020.
[8]	Opdebeeck, R.; Zerouali, A.; Velázquez-Rodríguez, C.; Roover, C.D. Does Infrastructure as Code Adhere to Semantic Versioning? An Analysis of Ansible Role Evolution. 2020, 238–248.
[9]	Palma, S.D.; Mohammadi, M.; Di Nucci, D.; Tamburri, D.A. Singling the odd ones out: a novelty detection approach to find defects in infrastructure-as-code. 2020.
[10]	Rahman, A.; Williams, L. Different Kind of Smells: Security Smells in Infrastructure as Code Scripts. 2021.
[11]	Alonso, J.; Orue-Echevarria, L.; Osaba, E.; López Lobo, J.; Martinez, I.; Diaz de Arcaya, J.; Etxaniz, I. Optimization and Prediction Techniques for Self-Healing and Self-Learning Applications in a Trustworthy Cloud Continuum. 2021.
[12]	Alnafessah, A.; Gias, A.U.; Wang, R.; Zhu, L.; Casale, G.; Filieri, A. Quality-Aware DevOps Research: Where Do We Stand? 2021.
[13]	Recupito, G.; Pecorelli, F.; Catolino, G.; Moreschini, S.; Nucci, D.D.; Palomba, F.; Tamburri, D.A. A Multivocal Literature Review of MLOps Tools and Features. 2022.
[14]	Petrovic, N.; Cankar, M.; Luzar, A. Automated Approach to IaC Code Inspection Using Python-Based DevSecOps Tool. 2022.
[15]	Borovits, N.; Kumara, I.; Di Nucci, D.; Krishnan, P.; Palma, S.D.; Palomba, F.; Tamburri, D.A.; Heuvel, W.J.v.d. FindICI: Using machine learning to detect linguistic inconsistencies between code and natural language descriptions in infrastructure-as-code. 2022.
[16]	Kyryk, M.; Pleskanka, N.; Pleskanka, M.; Kyryk, V. Infrastructure as Code and Microservices for Intent-Based Cloud Networking. 2022.
[17]	Quattrocchi, G.; Tamburri, D.A. Predictive maintenance of infrastructure code using “fluid” datasets: An exploratory study on Ansible defect proneness. 2022.
[18]	Chiari, M.; De Pascalis, M.; Pradella, M. Static Analysis of Infrastructure as Code: a Survey. 2022.
[19]	Myat, H.M.; Phyu, M.P.; Paing, A.M.M. Towards Infrastructure Automation Using IaC in the Era of GenAI. 2025.
[20]	Dalla Palma, S.; Di Nucci, D.; Palomba, F.; Tamburri, D.A. Within-Project Defect Prediction of Infrastructure-as-Code Using Product and Process Metrics. 2022.
[21]	Srivatsa, K.G.; Mukhopadhyay, S.; Katrapati, G.; Shrivastava, M. A Survey of using Large Language Models for Generating Infrastructure as Code. 2023.
[22]	Lanciano, G.; Stein, M.; Hilt, V.; Cucinotta, T. Analyzing Declarative Deployment Code with Large Language Models. 2023.
[23]	Opdebeeck, R.; Zerouali, A.; De Roover, C. Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort? 2023.
[24]	Rahman, A.; Parnin, C. Detecting and Characterizing Propagation of Security Weaknesses in Puppet-Based Infrastructure Management. 2023.
[25]	de la Fuente Ruiz, A.E.; Novakova Nedeltcheva, G. Game-theory strategies for open-source Infrastructure-as-Code. 2023.
[26]	Cankar, M.; Petrovic, N.; Pita Costa, J.; Cernivec, A.; Antic, J.; Martincic, T.; Stepec, D. Security in DevSecOps: Applying Tools and Machine Learning to Verification and Monitoring Steps. 2023.
[27]	Reddy Konala, P.R.; Kumar, V.; Bainbridge, D. SoK: Static Configuration Analysis in Infrastructure as Code Scripts. 2023.
[28]	Bär, F.; Leyer, M. YUMA—An AI Planning Agent for Composing IT Services from Infrastructure-as-Code Specifications. 2023.
[1]	Diaz-de Arcaya, J.; Torre-Bastida, A.I.; Zárate, G.; Miñón, R.; Almeida, A. A Joint Study of the Challenges, Opportunities, and Roadmap of MLOps and AIOps: A Systematic Survey. 2023.
[29]	Abbas, S.I.; Garg, A. AIOps in DevOps: Leveraging Artificial Intelligence for Operations and Monitoring. 2024.
[30]	Sokolowski, D.; Spielmann, D.; Salvaneschi, G. Automated Infrastructure as Code Program Testing. 2024.
[31]	Begoug, M.; Chouchen, M.; Ouni, A.; Abdullah Alomar, E.; Mkaouer, M.W. Fine-Grained Just-In-Time Defect Prediction at the Block Level in Infrastructure-as-Code (IaC). 2024.
[32]	Kon, P.T.J.; Liu, J.; Qiu, Y.; Fan, W.; He, T.; Lin, L.; Zhang, H.; Park, O.M.; Elengikal, G.S.; Kang, Y.; et al. IaC-Eval: A Code Generation Benchmark for Cloud Infrastructure-as-Code Programs. 2024.
[33]	Ragothaman, H.; Udayakumar, S.K. Optimizing Service Deployments With NLP Based Infrastructure Code Generation–An Automation Framework. 2024.
[34]	Low, E.; Cheh, C.; Chen, B. Repairing Infrastructure-as-Code using Large Language Models. 2024.
[35]	Vasileiou, Z.; Kumara, I.; Meditskos, G.; Tokmakov, K.; Radolović, D.; Cruz, J.; Nitto, E.; Tamburri, D.; Heuvel, W.J.; Vrochidis, S. A knowledge-based approach for guided development of Infrastructure as Code. 2025.
[36]	Eken, B.; Pallewatta, S.; Tran, N.; Tosun, A.; Babar, M.A. A Multivocal Review of MLOps Practices, Challenges and Open Issues. 2025.
[37]	Seth, D.K.; Ratra, K.K.; Sundareswaran, A.P. AI and Generative AI-Driven Automation for Multi-Cloud and Hybrid Cloud Architectures: Enhancing Security, Performance, and Operational Efficiency. 2025.
[38]	Opdebeeck, R.; Adams, B.; De Roover, C. Analysing Software Supply Chains of Infrastructure as Code: Extraction of Ansible Plugin Dependencies. 2025.
[39]	Peng, J.; Qiu, Y.; Kon, P.T.J.; Zhao, P.; Huang, Y.; Guo, Z.; Wang, X.; Chen, A. Automated Lifting for Cloud Infrastructure-as-Code Programs. 2025.
[40]	Senthamarai, N.; Jeyaselvi, M.; Hemamalini, V. Automatic Cloud Formation Using LLM. 2025.
[41]	Vorel, R. Generative AI for IaC and Data Provisioning. 2025.
[42]	Toprani, D.; Madisetti, V.K. LLM Agentic Workflow for Automated Vulnerability Detection and Remediation in Infrastructure-as-Code. 2025.
[43]	Kosbar, S.; Hamdaqa, M. Smells-sus: Sustainability Smells in IaC. 2025.
[44]	Muthukrishnan, H.; Viradia, V.; Yadav, D. Unified AI and ML Framework in DevSecOps Practices, Solving Real-World Problems. 2025.
[45]	Brojabasi, S.; Paul, S.; Mitra, A. Cloud native engineering: A comprehensive review of principles, practices, and challenges. 2025.
[46]	Ramos, R.C.B.; Yoo, S.G. Cybersecurity in DevOps Environments: A Systematic Literature Review. 2025.
[47]	Novakova Nedeltcheva, G.; De La Fuente Ruiz, A.; Orue-Echevarria Arrieta, L.; Bat, N.; Blasi, L. Towards Supporting the Generation of Infrastructure as Code Through Modelling Approaches–Systematic Literature Review. 2022.

Step 1c–Qualitative Assessment of Studies. For the 34 included technical contribution studies, we primarily focused on the technical rigor of the content presented and excluded those with little convincing evidence. The quality assessment was carried out manually, taking full manuscripts into account. In a first stage, one of the authors (CP), as the one not having been involved in the database extraction, reviewed all remaining studies for technical rigor, with the proposed final list then being validated by the other authors in a second step.

Steps 2 and 3–Data Extraction and Synthesis. To record the extracted data from the selected studies, we follow [3,48] and use a structured format based on characterization dimensions.

4. SLR—Bibliometric Results

We categorize the results in terms of concerns such as publication format, forum, and the technical contribution with the evaluation method. The results are discussed, and the validity of the results and their discussion is addressed. We present a key terms extraction process based on the studies in the following section.

To examine the state of research on AI for IaC, the following questions are considered:

When did research on AI in IaC become active in the computing community?
What are the fora in which research work on AI for IaC has been published? On which communities does the focus lie?
How is AI for IaC research reported, and what is the maturity level of the research in this field?

In order to identify the start of activities and the current maturity, we start with a temporal overview.

4.1. Temporal Overview of Studies

With only a few studies in 2020, the number of studies increases in subsequent years, signaling that the topic is a significant concern; see Figure 3. No relevant publications were found prior to 2020, as IaC in itself is a relatively new concept, and AI has only been used systematically with the general proliferation of AI use cases:

2020: 5, 2021: 3, 2022: 9, 2023: 8, 2024: 7, 2025: 12

It can be noted that the earlier activities in the years 2020 to 2023 had a strong focus on ML for code analysis, whereas since 2023 this has shifted to LLMs for code generation. The highest number of publications is recorded for 2025. Overall, the six years covered indicate an area of still growing interest that has consequently not yet reached substantial maturity. The spike in 2025 can be explained by more work addressing Generative AI. This is reflected in the titles of the screened 2025 papers, where the terms Generative AI, LLM, and Unified AI (a wider concept that includes LLMs) dominate.
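As a quick arithmetic check of the temporal distribution, the yearly counts listed above can be summed and converted to shares. The counts are copied from the list; the variable names are illustrative.

```python
# Yearly study counts as reported in the temporal overview.
studies_per_year = {2020: 5, 2021: 3, 2022: 9, 2023: 8, 2024: 7, 2025: 12}

total = sum(studies_per_year.values())  # 44, matching the number of included studies

# Percentage share of each year, rounded to one decimal place.
shares = {year: round(100 * count / total, 1) for year, count in studies_per_year.items()}

# Year with the highest number of studies.
peak_year = max(studies_per_year, key=studies_per_year.get)

print(f"total studies: {total}")
print(f"peak year: {peak_year} ({shares[peak_year]}% of all studies)")
```

The counts add up to the 44 included studies, and 2025 alone accounts for roughly 27% of them.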

Figure 3. Temporal Distribution of Studies.

The remaining bibliometric analysis results, which are described in the next two subsections, are visualized in Figure 4.

4.2. Publication Fora and Formats

We have categorized the publication fora by computing field, as indicated below. The ‘wider computing context’ refers to venues (journals or conferences) that accept contributions from a variety of computing fields.

software engineering (22), AI (5), networks, telecoms and distributed systems (3), security (2), cloud computing (2), wider computing context (10).

This shows that DevOps is a predominantly software engineering concern, with a few publications in relevant other fields such as AI, networks, security, and cloud. Within the wider computing context, ACM Computing Surveys and IEEE Access, for example, appear as journal venues that are not restricted to any specific computing field.

The software engineering field, with 13 conferences, 7 journals, and 1 chapter publication, dominates the publication fora and is therefore analyzed further. In terms of CORE and Scimago rankings, these are almost equally distributed between A/Q1 and B/Q2 ranked fora. IEEE Transactions on Software Engineering is the top journal venue with 3 publications, and the Conference on Mining Software Repositories is the top conference, also with 3 publications. A focus on software architecture and maintenance can be noted for the conferences. Besides the two software engineering venues above, IEEE Access was the third forum with a high count.

Regarding the sources, we recognize and distinguish the following publication formats:

journal: 15, conferences and workshops: 25 + 2, chapters: 2.

Conference and workshop contributions are the majority, indicating this as an area of ongoing activity. A relatively high number of journal publications is split between the reviews we considered and mature work on code analysis as a specific concern, where activities from general software code analysis have been transferred to IaC. The reviews are focused on AI within specific IaC lifecycle phases and do not yet cover the whole lifecycle.

4.3. Research and Evaluation Methods

Typically, in SLRs, the contribution type (Solution Proposal, Evaluation Research, Validation Research, Experience Report, Review) and the evaluation method (Case Study, Mathematical Proof, Experience Report, Example Application, Controlled Experiment) are distinguished.

Study Distribution by Contribution Type:

Solution Proposal: 16, Validation Research: 8, Experience: 5, Report: 4, Review: 11.

Most papers were solution proposals that were also evaluated through controlled experiments, which can be expected to be the case for AI-based techniques utilized to enhance some basic tool functions. The validation research included various tools for experimental evaluation and developers as the main target group for mainly qualitative analyses. Only 5 papers presented an experience report to a greater extent.

Study Distribution by Evaluation Method for the Non-Review Studies:

Case Study: 3, Experience Report: 7, Example Application: 3, Controlled Experiment: 20.

We noted that some reviews include experience reports with various tools. If a comparison of tools was the main aim, we still classified them as reviews. Also, some controlled experiments included elements of an example application, but we classified them as controlled experiments when a rigorous evaluation was performed and the example served only to further illustrate the technical contribution. The low number of experience reports indicates a lack of industrial studies. Most of those that are reported in the selected studies cover collaborative academia-industry projects in which the authors were generally direct participants. Among the 11 reviews, 6 follow a systematic literature review or mapping methodology; the remaining ones are generally tool and technology reviews that mix methods for their evaluation.

Overall, given the relative immaturity of the domain, the evaluations lack detailed experience reports and proofs, while some controlled lab experiments have been reported.
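The distributions reported in Sections 4.2 and 4.3 can be cross-checked with a short tally. The following is a minimal sketch that uses the counts exactly as printed above. The dictionary and function names are illustrative, the split of "25 + 2" into 25 conference and 2 workshop papers is our reading of the format list, and the labels "Experience" and "Report" are kept as they appear in the contribution-type list.

```python
# Counts copied from Sections 4.2 and 4.3; all variable names are illustrative.
fora = {
    "software engineering": 22,
    "AI": 5,
    "networks, telecoms and distributed systems": 3,
    "security": 2,
    "cloud computing": 2,
    "wider computing context": 10,
}
# Assumption: "25 + 2" is read as 25 conference and 2 workshop papers.
formats = {"journal": 15, "conference": 25, "workshop": 2, "chapter": 2}
contribution = {
    "Solution Proposal": 16,
    "Validation Research": 8,
    "Experience": 5,
    "Report": 4,
    "Review": 11,
}
evaluation = {
    "Case Study": 3,
    "Experience Report": 7,
    "Example Application": 3,
    "Controlled Experiment": 20,
}

TOTAL_STUDIES = 44  # final number of selected studies


def share(counts):
    """Return the percentage share of each category, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * value / total, 1) for name, value in counts.items()}


# The evaluation methods only cover the non-review studies.
non_review_studies = TOTAL_STUDIES - contribution["Review"]

for label, counts in [("fora", fora), ("formats", formats), ("contribution", contribution)]:
    print(label, sum(counts.values()), share(counts))
print("evaluation", sum(evaluation.values()), "of", non_review_studies, "non-review studies")
```

Each of the first three distributions sums to the 44 selected studies, and the evaluation methods sum to the number of non-review studies.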

5. SLR—AI Techniques for IaC

AI can be applied to support the Infrastructure-as-Code (IaC) lifecycle in several ways. In this section, we report on a term-extraction process for the main identified AI uses in IaC techniques. We associate the extracted AI technique with the DevOps phase it was applied to, based on the technical contribution papers. This results in a finer analysis of AI techniques by associating them with specific lifecycle activities. Specific AI techniques are associated with concrete IaC development and operations techniques.

5.1. Key Terms Extraction and Phase Contribution

The key terms extracted from all selected studies help categorize them by individual focus and contribution and provide an understanding of key research concerns. The term extraction starts with identifying the IaC-related activities that are supported by the proposed AI technique. That is, we view IaC code as an artefact that is processed by activities in the four DevOps phases, as indicated in Table 6. Figure 5 visualizes these terms using the DevOps loop as a diagrammatic reference. We organized these development and operations terms by the DevOps phase they primarily contribute to. The terms were manually extracted by the co-authors in two steps: first, by providing summaries for each paper and, second, by merging terms that were considered to be overlapping. This was the case for activities such as code generation, benchmarking, or code analysis. In those cases, we used common terms stemming from programming and software engineering.

Please note that we first summarize the extracted concepts. In the following sections, we also list the individual publications from which these originate in order to (i) give a summary of the specific contribution and (ii) make the number of occurrences countable, thus indicating the strength of an interest.

Extracted key concepts from the publications and their respective phases, i.e., the main phase that a contribution relates to, are presented in Table 6. The alignment with the IaC DevOps model is presented in Figure 5.

Based on this initial term extraction in terms of two concerns (DevOps phases and AI techniques), we now analyze the contributions in more detail following the extracted phase allocation. We present the Dev (phases 1 and 2) and Ops (phases 3 and 4) concerns in the following subsections.

Table 6. Term Extraction for the literature review—with Association to IaC Lifecycle Phases.

Plan	(i) game-theoretic analysis of strategic decisions (e.g., type of IaC technology to use), (ii) user story analysis [44].
Code	(i) generate code, with tools, (ii) guided coding (AI coding assistants)—completion, integrated analysis, (iii) generate and verify, (iv) lift, (v) benchmark.
Compose	(i) compose scripts.
Test/Verify	(i) test–code analysis: defect prediction, anti-pattern detection, specific aspects (platforms (K8s), Ansible roles (module/abstraction concept)), (ii) code analysis and fix.
Release	(i) repo analysis (NLP).
Configure/Deploy	(i) anomaly detection (e.g., Ansible exec env), confirmed by [44].
Operate	(i) RL-based resource management, (ii) API/tool integration via GenAI [37].
Monitor/Self-Heal	(i) anomaly detection, confirmed by [44] for security threats, (ii) predictive anomaly detection [29], (iii) RL/rule-based resource management/optimization, confirmed by [29], (iv) optimization of remediation strategies via RL [44], (v) cross-cloud orchestration and optimization [37], (vi) drift detection [44], (vii) incident management via NLP [37].

Following term extraction and phase alignment as the first step, we add a second analysis step that identifies the main challenges and open research directions in the Dev and Ops phases, respectively. The analysis of open challenges covers phases 1 and 2 together, and likewise phases 3 and 4, because many software techniques integrate, for instance, coding and testing (phases 1 and 2) or deployment and monitoring (phases 3 and 4).

We will present the analysis by following a common structure for all DevOps phases, consisting of (i) a context description describing the main concepts and tasks covered for the phase, (ii) the current research challenges, based on the phase contributions extracted from the publications, as a summary of current ongoing research informing both researchers and practitioners, (iii) a review of ongoing research directions, based on our analysis and observations, that primarily sets an agenda for researchers. The latter point (iii) provides evidence of currently ongoing research that addresses the challenges from point (ii). A further identification of future research directions then follows in the two open-challenge sections at the end of the Dev and Ops phases, separately as a higher-level research agenda.

5.2. Phase 1—Plan, Code, Build

This addresses the first phase in Figure 5 with respect to the context, challenges, and current directions.

Context—Concepts and Tasks: Planning is concerned with defining infrastructure requirements through planning and requirements elicitation. Code creation is about creating code, templates, or scripts that define the required infrastructure resources and their management. Building and Packaging addresses the use of software tools to compose, build, and package IaC software.

Research Challenges: The research challenges are divided into the following, based on the term extraction process: Planning, Code generation, Code verification, Code benchmarks, Code lifting, Integrated generation tools.

Review of Research Directions: The six research challenges above will be further analyzed and illustrated using documented research efforts, drawing on the selected studies for the SLR. We organize this into the three sub-steps of phase 1.

1. Plan: The IaC platform selection is the first decision, particularly the supported strategy in terms of open-source or proprietary platforms. Platform planning analysis using game theory is presented in [25], which investigates the benefits and risks of different formats.

2. Code: Automated infrastructure-as-code generation using LLMs as coding assistants is an active direction.

(a): Direct Code Generation: several authors address this concern. In [40], an LLM is used to generate Terraform code. Equally, ref. [33] uses NLP and LLM to generate Terraform code from natural-language queries. Ref. [19] uses the AIaC library (which accesses LLMs) to generate code. Ref. [35] reports on the use of an ontology and SPARQL queries to guide IaC code development as a non-LLM solution.
(b): Code Verification: As a kind of verification for LLM-based generation, ref. [15] uses NLP and ML to detect inconsistencies between natural-language descriptions and IaC code.
(c): Code Benchmarks: The authors in [32] define an LLM benchmark for LLM-generated IaC code. An evaluation of LLMs for infrastructure as code generation, which covers Gemini 1.5, ChatGPT-4, LLAMA 3.8, DeepSeek-V3, and others based on a defined benchmark, can be found at https://medium.com/gft-engineering/evaluating-llms-for-infrastructure-as-code-9f8b9ac4ca33 (accessed on 23 December 2025).
(d): Lift Code to IaC: The aim is to lift low-level cloud states and translate them into corresponding IaC programs, which is a type of legacy migration in a brownfield development context. The Lilac tool enables lifting existing cloud states into IaC using an LLM [39].
(e): Platform LLM Tools: Some IaC platforms already provide LLM support. (1) Ansible Lightspeed is an Ansible-specific code generator that builds on the IBM Watsonx Code Assistant to generate tasks or even full playbooks from a prompt. (2) Pulumi provides the Pulumi AI Assistant, building on LLM technology for IaC generation.

3. Compose/Build/Package: In [28], an AI planning agent for service composition using live context information is presented.

We can conclude that code generation is by far the most active area in phase 1.

5.3. Phase 2—Test/Verify

This addresses the second phase in Figure 5 with respect to the context, challenges, and future directions.

Context—Concepts and Tasks: Testing and verification aim to validate the software system using common code testing and verification techniques.

Research Challenges: The research challenges are Automated testing, Code-level syntax analysis (ML and other AI), and Code analysis and repair. Please note that, in general, there are different drivers behind these testing, analysis, and repair techniques, such as better quality or more efficiency through automation.

Review of Research Directions: These challenges shall be analyzed further, as before, by concretizing ongoing research work through the selected SLR studies.

1.: Automated infrastructure testing: Machine-learning algorithms can analyze infrastructure changes and automatically test them for potential issues, reducing manual testing and improving infrastructure quality. In [30], an ACT configuration testing approach is presented.
2.: Code-level syntax analysis: Code analysis specifically using ML techniques is widely used for defect prediction or anti-pattern detection.
In [14], an ML-based code analysis method is proposed. Ref. [7] presents an approach using a convolutional neural network (CNN) for anti-pattern detection. Ref. [31] compares six ML models for defect prediction. In [17], ML is used for defect detection and analysis. In [9], different ML models are compared for defect prediction. Ref. [27] introduces a CNN-based method for anti-pattern detection. Ref. [20] also uses ML for defect prediction.
Some specific technical aspects merit highlighting in this context. Ref. [22] presents a solution for analyzing Kubernetes manifest quality using an LLM (GPT). In [8], an analysis of Ansible role evolution is conducted using a random forest classifier.
3.: Code-level syntax analysis (beyond core AI) employs various intelligent methods, including graphs, data mining, statistics, ontologies, and model checking. Please note that established work on graph-based analyses, data mining, statistical methods, or model checking exists. We refer to some exemplary publications to indicate these directions: Ref. [38] on call graph analysis for Ansible; Ref. [6] on insecure pattern mining via association rule mining; Ref. [23] on graph-based smell detectors for Ansible; Ref. [24] on rule-based code analysis; Ref. [10] on linters for security smell detection; Ref. [43] on statistical methods for smell category identification; Ref. [18] on model checking for code analysis; Ref. [35] on ontologies for smell detection (SPARQL).
4.: Code analysis and repair using LLM: includes vulnerability detection and remediation (LLM-based). Code analysis and heal/fix research includes: Ref. [34]—repair via LLM; Ref. [42]—detect and repair security vulnerabilities.
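To make the rule-based and linter-style analyses listed above more concrete, the following is a minimal sketch of a pattern-based security smell scan over an IaC script. It is not the method of any cited study: the three rules, the names `RULES`, `Finding`, and `scan`, and the sample playbook fragment are all hypothetical and only illustrate the general idea of matching known insecure patterns line by line.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    line_no: int
    rule: str
    text: str


# Hypothetical, deliberately small rule set; real linters use far richer rules.
RULES = {
    # A secret-like key followed by a literal value (templated values are skipped).
    "hard-coded-secret": re.compile(
        r"(password|secret|token)\s*[:=]\s*['\"]?[^\s'\"{]+", re.IGNORECASE
    ),
    # Binding a service to all network interfaces.
    "unrestricted-bind": re.compile(r"0\.0\.0\.0"),
    # Use of unencrypted HTTP.
    "insecure-http": re.compile(r"http://"),
}


def scan(script: str) -> list[Finding]:
    """Return one finding per (line, rule) match, ignoring comment lines."""
    findings = []
    for line_no, line in enumerate(script.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        for rule, pattern in RULES.items():
            if pattern.search(stripped):
                findings.append(Finding(line_no, rule, stripped))
    return findings


# Hypothetical Ansible-like fragment used only for illustration.
SAMPLE = """\
- name: configure service
  vars:
    db_password: "hunter2"
    listen_address: 0.0.0.0
    repo_url: http://example.com/pkg
    api_token: "{{ vault_api_token }}"
"""

for finding in scan(SAMPLE):
    print(finding.line_no, finding.rule, finding.text)
```

In this sketch the templated `api_token` value is not reported, because the secret rule only matches literal values. ML-based approaches such as those in item 2 would replace the fixed rules with a trained classifier over features of the script.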

It becomes apparent that AI-based code analysis is an active research area, with techniques that have been transferred to the specific IaC code context. It should be noted that integrated feedback loops are essential if LLMs are used for Phase 1 to support code generation. Some ongoing work even addresses automated fixes.

5.4. Analysis Open Challenges—Phases 1 and 2

Some open research challenges across the two phases can be identified by analyzing the principal challenges and current work. For this analysis, we also take into consideration wider DevOps concerns, such as general tool support or version control, which are important for DevOps but have not been addressed in the reviewed research.

Integration with Existing Tools: LLMs can be integrated with existing IaC tools such as Ansible or Pulumi. For example, Pulumi AI leverages LLMs to author IaC code for various architectures and clouds, enabling users to generate custom configurations tailored to their specific needs.
Version Control and Collaboration: Version control is a central aspect in IaC management in general, although specific AI uses have not been reported to address change management and consistency. IaC configurations generated by LLMs should be stored in version control systems. This practice enables collaboration, tracking, and, if necessary, rollback, ensuring that changes can be managed and reviewed systematically.
Benchmarking and Evaluation: Benchmarking frameworks are needed to assess the capabilities of LLMs specifically in generating IaC configurations. These evaluations help identify the strengths and limitations of different LLMs, providing insights into their performance and areas for improvement. While quality assurance is important, only one such effort is reported.

Large Language Models (LLMs) have recently been used to generate IaC configurations and play an important role, which the three observations above confirm. This generation leverages LLM capabilities to automate infrastructure configuration, thereby enhancing efficiency and reducing manual effort. Thus, we end the Dev part with a note on LLMs.

1.: Phase 1: Generation–Automation and Efficiency: LLMs can automate the generation of IaC scripts, reducing the time and effort required for manual configurations. This can lead to more efficient infrastructure management and deployment processes. While LLMs can generate IaC scripts, it remains essential to review these scripts to ensure compliance and mitigate configuration errors. Currently, the human-in-the-loop approach is needed to maintain the quality and reliability of the generated code.
2.: Phases 2–4: Feedback Loops—Quality through Testing and Monitoring: Implementing explicit, automated feedback loops that return errors and warnings from the generated IaC to the LLM can improve code quality, potentially with less human intervention.
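The feedback loop described in item 2 can be sketched as a bounded generate-and-validate cycle. This is a minimal illustration under stated assumptions, not an implementation from any reviewed study: `stub_generate` is a hypothetical stand-in for an LLM call, `stub_validate` is a hypothetical validator that checks for required keys, and `REQUIRED_KEYS` is an invented policy.

```python
from typing import Callable, List, Tuple

REQUIRED_KEYS = {"resource", "size", "region"}  # hypothetical validation policy


def stub_generate(prompt: str, feedback: List[str]) -> str:
    """Hypothetical stand-in for an LLM call that reacts to validator feedback."""
    config = {"resource": "vm", "size": "small"}
    if any("region" in message for message in feedback):
        config["region"] = "eu-west"
    return "\n".join(f"{key}: {value}" for key, value in config.items())


def stub_validate(code: str) -> List[str]:
    """Hypothetical validator: returns error messages, empty if the code passes."""
    present = {line.split(":", 1)[0].strip() for line in code.splitlines() if ":" in line}
    return [f"missing required key: {key}" for key in sorted(REQUIRED_KEYS - present)]


def generate_with_feedback(
    prompt: str,
    generate: Callable[[str, List[str]], str],
    validate: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Tuple[str, List[List[str]]]:
    """Regenerate until validation passes or the round budget is exhausted."""
    feedback: List[str] = []
    history: List[List[str]] = []
    code = ""
    for _ in range(max_rounds):
        code = generate(prompt, feedback)  # errors from the last round are passed back
        feedback = validate(code)
        history.append(feedback)
        if not feedback:
            break
    # If the last history entry is non-empty, the code still needs human review.
    return code, history


code, history = generate_with_feedback("provision a small VM", stub_generate, stub_validate)
print(code)
print(history)
```

The round limit reflects the human-in-the-loop point made in item 1: when the loop does not converge, the remaining errors are handed over to a reviewer rather than retried indefinitely.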

5.5. Phase 3—Release, Configure, Deploy

This addresses the third phase in Figure 5 with respect to the context, challenges, and future directions.

Context—Concepts and Tasks: Release, configuration, and deployment depend on the use of deployment automation tools, such as Jenkins or Ansible/Terraform, which allow the configuration and deployment of the infrastructure resources to be automated.

Research Challenges: The research challenges are Automated provisioning, Improved configuration, and Improved integration.

Review of Research Directions: The research challenges shall be analyzed further.

1.: Automated infrastructure provisioning: Generally, machine learning can analyze infrastructure requirements and automatically provision the necessary resources. This reduces the need for manual intervention and improves the speed and accuracy of provisioning, as demonstrated for non-coded cloud management solutions, such as VM configuration optimization.
2.: Predictive autoscaling for infrastructure resources: Historical data can be used to predict potential infrastructure scaling needs and recommend proactive remediation strategies to address the overall performance and reliability of the infrastructure. Reinforcement learning has already been used for this in cloud resource autoscaling.
3.: Integration of IaC infrastructure objects, specifically for multi/edge/hybrid cloud management, is reported in [37].
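The idea of predictive autoscaling in item 2 can be illustrated with a deliberately simple forecast. The sketch below substitutes a linear trend extrapolation for the reinforcement learning approaches mentioned above, so it only shows the overall shape of the computation. The function names, the per-replica capacity, and the load values are hypothetical.

```python
import math
from typing import Sequence


def forecast_next(history: Sequence[float], window: int = 3) -> float:
    """Extrapolate the next load value from the average step in a recent window."""
    recent = list(history[-window:])
    if len(recent) < 2:
        return float(recent[-1])
    trend = (recent[-1] - recent[0]) / (len(recent) - 1)
    return max(recent[-1] + trend, 0.0)


def replicas_for(load: float, capacity_per_replica: float,
                 min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Translate a forecast load into a bounded replica count."""
    needed = math.ceil(load / capacity_per_replica)
    return max(min_replicas, min(needed, max_replicas))


# Hypothetical request rates observed in past intervals.
observed = [100, 120, 140, 160]
predicted = forecast_next(observed)
print(predicted, replicas_for(predicted, capacity_per_replica=50))
```

In an IaC setting, the resulting replica count would be written into the declared configuration before demand arrives, rather than applied reactively after a threshold is crossed.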

This shows that this phase is only partially covered: only the integration concern is supported by a selected study [37]. For the other concerns, automated provisioning and predictive management, no IaC-specific results have been reported. Both are, however, widely covered in cloud infrastructure management research, though without reference to the utilization of IaC.

5.6. Phase 4—Operate, Monitor, Self-Heal

This addresses the fourth phase in Figure 5 with respect to the context, challenges, and future directions.

Context—Concepts and Tasks: Operation, monitoring, and self-healing are about configuration and the use of monitoring tools, such as Prometheus or Nagios, to track the general health, specifically performance and security of the infrastructure resources, and, where possible, self-heal through remediation actions.

Research Challenges: The challenges are Monitoring and anomaly detection (detect, diagnose); Controller construction (including self-adaptation/self-healing/remediation), with further concerns (a) Root cause analysis, (b) Remediation, and (c) Predictive management (autoscaling, etc.).

Review of Research Directions: Their deeper analysis shows the following:

1. Continuous infrastructure monitoring: real-time performance and health data can be obtained, enabling the identification of anomalies and the remediation of potential issues before they impact the system.

2. IaC controllers: ML/RL-based IaC controllers for self-healing, covering the following functions:

release: Ref. [5] describes an ML-based repository analysis using NLP to extract release-related concerns.
configure: No study was selected, but in an extended abstract, ref. [49] proposes using AI for anomaly detection in configuration specifications.
monitor, self-heal: These two phases are often interlinked. In [29], ML is used for anomaly detection and reinforcement learning (RL) for provisioning and control. In [26], another ML-based anomaly detection method (called LOMOS) for logs is presented.
full phase 3 and 4 coverage: In [11], a full support model covering deployment, monitoring, analysis, and healing is described.
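The interplay of monitoring and self-healing described above can be sketched as a detect-then-remediate step. The example below uses a plain z-score test in place of the ML-based detectors of the cited studies, so it is a simplified stand-in rather than a reproduction of their methods. The function names, the threshold, the metric values, and the remediation action are hypothetical.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def is_anomalous(baseline: Sequence[float], value: float, threshold: float = 3.0) -> bool:
    """Flag a value that deviates from the baseline mean by more than `threshold` sigmas."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


def monitor_step(baseline: Sequence[float], value: float,
                 remediate: Callable[[float], str]) -> str:
    """Run one monitoring step and trigger remediation if an anomaly is detected."""
    if is_anomalous(baseline, value):
        return remediate(value)
    return "ok"


# Hypothetical CPU utilisation samples (percent) forming the baseline.
cpu_baseline = [50, 52, 49, 51, 50, 48, 50]


def restart_service(value: float) -> str:
    """Hypothetical remediation action; a real controller would re-apply IaC state."""
    return f"restart triggered at {value}"


print(monitor_step(cpu_baseline, 51, restart_service))
print(monitor_step(cpu_baseline, 95, restart_service))
```

An RL-based controller, as discussed for [29], would learn which remediation to choose from observed outcomes instead of calling a fixed action.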

There is more documented research in this phase than in phase 3. However, it should be noted that, although the studies provide novel contributions to this phase, they rely on phase 3 functionality for their implementation; the full support model in [11] is a good example.

5.7. Analysis Open Challenges—Phases 3 and 4

AI can support IaC by providing the following benefits to DevOps tasks in phases 3 and 4 (Operations). As in the Dev part, we consider wider concerns in the discussion, here general controllers and self-adaptation functions, even if they have not been addressed in an IaC-specific setting.

1.: Improved incident management: AI can help to detect and diagnose issues more quickly, reducing the mean time to recovery (MTTR) and improving the overall reliability of software. This can be divided into the following specific aspects: monitoring, analysis, and prediction.
2.: Enhanced performance monitoring: Real-time data on application and infrastructure performance can improve processing performance.
3.: Automated root cause analysis: ML can be used to analyze large volumes of data and determine the root cause of incidents.
4.: Predictive analysis and management: historical data can be used to predict potential issues and recommend proactive remediation.

There are fewer publications for phases 3 and 4 than for phases 1 and 2. There is evidence that, for phases 1 and 2, prior generic SE research has been transferred to the IaC context. A comparable transfer of work on software operations has not yet taken place, which explains the lower numbers here.

6. A Review of Surveys

To validate and complement the review of the technical contribution papers, we summarized and extracted information from a range of surveys or review-type papers.

6.1. Objective

We now detail those surveys that provide a deeper analysis and yield new insights for our context. The purpose of this section is twofold: firstly, to confirm the ongoing work identified in Section 5 and, secondly, to confirm the importance of specific concerns for which no research has yet been reported.

6.2. Major Surveys

In [29], phases 3 and 4 are covered through anomaly detection and prediction, which are investigated for cloud resource provisioning, referring to machine learning and reinforcement learning in particular as AI solutions. The aim is enhanced anomaly detection. New AI aspects not covered by the previous technical contributions survey are (i) proactive problem-solving through predictive analytics, and also (ii) more effective incident management using NLP. Confirmed is the use of RL for efficient cloud resource provisioning.

Ref. [21] focuses on phases 1 and 2 in particular, providing a review of LLM technology, including tools and LLM support. The focus is on IaC code generation through LLMs, with GPT-3.5-turbo and Codeparrot having been used. For instance, IaC generation through ChatGPT queries is illustrated with SSO configurations for Kubernetes. Sample LLM-based IaC generation tools were reviewed (Infracopilot, K8sGPT, and Pulumi AI), which shows that some IaC platforms proactively adopt and integrate AI solutions. The use of LLMs in DevSecOps, i.e., for cybersecurity-specific tasks, is highlighted, covering static code analysis of IaC and runtime analysis of server logs, here again using ChatGPT. A further example is the generation of Ansible-YAML code by open-source LLMs, where BERT has been applied.

Ref. [44] covers all phases 1, 2, 3, and 4, but is not an SLR and sees its contribution in the form of a conceptual model. It introduces an abstract model of AI use across the IaC lifecycle. This work is DevSecOps-specific, i.e., it focuses primarily on security concerns and provides a highly detailed model. Table 7 reproduces their main findings. We highlighted the DevOps phase in the second column, while the first makes the specific activity more concrete. Columns 3 and 4 then detail the AI technique used to support the activity. This review confirms many of the AI use cases highlighted above. Specific new contributions are the first and last table entries, which refer to using NLP and LLMs for early requirements processing in the form of user stories, and to using RL for security-specific incident management.

Table 7. AI techniques for DevSecOps, based on the technology review [44].

DevSecOps Activity	Phase	ML Models Used
User Story Definition	Plan	NLP-based Risk Analysis
Development (Code Writing)	Code	Code Completion & Static Analysis (GPTCode)
Static Analysis (SAST)	Verify	Random Forest, Decision Trees, XGBoost
Build & CI Testing	Verify	Isolation Forest, Autoencoders (Anomaly Detection)
Dynamic Analysis (DAST)	Monitor/Self-Heal	Reinforcement Learning, CNN-based Models for Traffic Analysis
Deployment Security—Drift detection	Monitor/Self-Heal	LSTM-based Threat Detection, Transformer-based Security Policies
Runtime Security & Operations	Monitor/Self-Heal	LSTM, GRU-based Anomaly Detection
Incident Response & Remediation	Monitor/Self-Heal	Reinforcement Learning, AutoML-based Threat Mitigation

Ref. [37] presents a survey with use cases on the later phases. AI-driven resource optimization in autonomous cloud management is the focus. Generative AI is mentioned in the context of customized configurations and security. New aspects are enhanced API integration with generative AI, cross-cloud orchestration, and cost optimization (multi, edge, and hybrid clouds).

6.3. Other Surveys

Furthermore, some other reviews provide additional insights. Ref. [13] covers all phases 1, 2, 3, 4 from an MLOps and tool-centric perspective. The authors carried out a multivocal literature review, in which 13 MLOps tools were surveyed, including Gitlab, Jenkins, and AzureML. In this review, activities from ML model generation tools to CI/CD in the DevOps context were considered. The tool survey identified general tool properties and data and model management features across these 13 tools.

Ref. [36] has a similar MLOps focus, but in a wider setting than the above reference. It investigates DevOps for an ML life cycle, but covers MLOps-as-Pipeline-Automation. No phase-oriented organization, however, is present. Ref. [1] is similar in addressing MLOps and AIOps as a wider scope with CI/CD orchestration, cloud, and big data. Here, some emphasis is given to later deployment, monitoring, and management of infrastructure phases. Ref. [12] is a 2021 survey on DevOps, similar in style of presentation to ours, but predating the AI era. Thus, only an outlook of potential AI usages is given. The possibility for full DevOps support through AI is, though, noted as an emerging trend.

Other review examples are [41,45,46,47], but they do not add any additional AI use cases.

6.4. Summary

In summary, the surveys we reviewed here confirmed our findings from Section 5, i.e., demonstrating that DevOps activities can be supported by AI tools across all phases to provide some improvement. Firstly, they confirm a number of activities identified in the term extraction in Section 5, adding to the direct evidence already provided. The challenges and directions we identified were not opposed or contradicted. Secondly, those aspects where no ongoing activity was noted in the selected research papers analyzed in Section 5 point to possible other research challenges and directions. We noted the respective new observations in the discussion of the surveys above, but did not include these in Figure 5 and Table 6 as these are based on quality-assessed primary contribution studies.

Table 8 compares our SLR with the others described above. We report on a number of criteria that also confirm the novelty and advancement of this survey. The comparison criteria are: (i) recency, i.e., when published, (ii) literature or technology focus, i.e., whether publications or tools/technology were reviewed, (iii) coverage of the full DevOps lifecycle, indicated through the phases covered, (iv) development of a phase-specific taxonomy of AI techniques for IaC, (v) inclusion of a comprehensive analysis of research challenges and directions, assessed by its extent.

Table 8. Comparison of contributions with relevant related surveys—SLR (systematic literature review), MLR (multivocal literature review), technology review (tools and techniques).

Publication	Type	Year	Coverage	Taxonomy	Directions
[29]	literature (non-SLR)	2024	3, 4	AIOps	basic AI factor taxonomy
[21]	literature (non-SLR)	2023	1, 2	LLM Generation	basic recommendations and challenges
[44]	technology	2025	1, 2, 3, 4	conceptual framework	model selection criteria and DevSecOps challenges
[37]	literature (non-SLR)	2025	1, 2	LLM Generation	detailed best practices summary and future trends
[13]	technology (MLR)	2022	3, 4	AI for MLOps	basic future work
[36]	literature (MLR-based)	2025	3, 4	AI for MLOps—practices and techniques	future research directions and evolution of MLOps
[12]	literature (SLR-based)	2021	1, 2, 3, 4	no specific AI focus	general DevOps future directions (not AI-specific)
[1]	literature (SLR-based)	2023	3, 4	Conceptual AIOps and MLOps framework	detailed future research directions and trends in AIOps/MLOps
this SLR	literature (SLR-based)	2025	1, 2, 3, 4	full LLM and ML activity taxonomy	detailed, DevOps-phase based research challenges and directions

This review of surveys served to address a possible imprecision arising from the final number of 44 selected studies, which show an increasing trend but also that the field has not yet reached maturity. The survey in this section has served to put the individual contributions covered in Section 5 into context.

7. Conclusions

AI is a technology that can improve IaC development and operation in terms of efficiency, reliability, and quality of software and infrastructure. Through artificial intelligence, including machine learning and generative AI, many tasks in the DevOps and IaC lifecycles can be automated and optimized [50,51].

7.1. Observations and Directions

Our literature review shows a clear trend toward greater AI adoption in IaC across all activities. Although still a young topic with only around five years of intensive work, it already shows evidence of two distinct waves: first, the use of ML for code and operations analyses, and second, the use of Generative AI/LLM for generation in recent years.

Another observation is that, in terms of DevOps, the Development part, which has been a focus of software engineering research for a long time, is better covered. The Operations part is less well addressed. In particular, the later phases, such as self-healing, would benefit significantly from further research to fully automate the life cycle. Here, earlier work on cloud infrastructure management could also be transferred to IaC, as has been done with code analysis techniques.

We condense the previous observations into a future research agenda, aligned with the gaps identified in each phase.

Phase 1: LLMs for Generation: LLMs can further automate the generation of quality IaC scripts, reducing the time and effort for manual configurations.
Phase 1, 2: LLMs in Feedback Loops: Automated feedback loops that return errors/warnings from generated IaC to the LLM can improve code quality through testing and monitoring.
Phase 2: Version Control and Collaboration: Version control is a central aspect in IaC management, but specific AI is needed to address change management and consistency.
Phase 2: Benchmarking and Evaluation: Benchmarking frameworks are needed to benchmark the capabilities of specifically LLMs in generating IaC configurations.
Phase 3: Performance monitoring: Real-time data on application and infrastructure performance can improve processing performance.
Phase 3, 4: Root cause analysis: ML can be used to analyze large volumes of data and determine the root cause of incidents.
Phase 3, 4: Predictive analysis and management: historical data can be used to predict potential issues and recommend proactive remediation.

Automation is the key aim of IaC. AI can add efficiency and quality. However, the results on LLM usage also reveal a known weakness of all AI techniques: automated construction can cause explainability issues, which in turn may raise quality concerns. Other general AI concerns, such as fairness and bias, can also be applied here. In [52], five quality criteria for AI-generated software controllers are proposed: Performance addresses the core quality management function; Robustness relates to the input of monitoring the environment and subsequent processing; Explainability concerns the governance of the quality framework in order to allow for trustworthy behavior; Fairness refers to the avoidance of bias in automated decisions and the effect on system performance; Sustainability relates to the effect of actions on the consumption of resources in terms of cost and energy [53]. For instance, for fairness, in a technical sense, there may be a technical bias towards certain strategies. The discussion on over- or under-provisioning of infrastructure might serve as an example of possibly favored strategies. Thus, the investigation of common AI quality needs to be extended for AI in IaC as another future research task.

7.2. Limitations

Finally, we discuss possible limitations that apply to the method and results of this SLR.

The search scope could have been too limited. We selected four databases, and others could have been included. However, two of them belong to the two largest publishers in computer science; Scopus was added as a widely used curated database, and Google Scholar as the database with the widest coverage of publication sources. The search terms cover the three core aspects: IaC, AI, and DevOps. Here, only a few minor concrete search terms that are semantically related to the core terms might have been missed. The time range, starting in 2020, includes all publications that combine the three aspects; no earlier publication was found in any of the databases.

A selection bias could have been introduced by limiting the selection to English as the publication language and to peer review as the publication assessment. However, these two restrictions are commonly accepted: they ensure that quality has already been ascertained for the first cohort of publications and allow for further manual quality checks. Similarly, grey literature was omitted so that clear quality assurance had taken place for all selected studies.

Author Contributions

Conceptualization, C.P., Ö.C.S. and F.H.; methodology, C.P. and F.H.; validation, C.P., Ö.C.S. and F.H.; formal analysis, C.P.; resources, Ö.C.S. and F.H.; data curation, C.P., Ö.C.S. and F.H.; writing—original draft preparation, C.P. and F.H.; writing—review and editing, C.P., Ö.C.S. and F.H.; visualization, C.P.; supervision, C.P. and F.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Diaz-de Arcaya, J.; Torre-Bastida, A.I.; Zárate, G.; Miñón, R.; Almeida, A. A Joint Study of the Challenges, Opportunities, and Roadmap of MLOps and AIOps: A Systematic Survey. ACM Comput. Surv. 2023, 56, 1–30.
2. Pahl, C.; Gunduz, N.; Sezen, Ö.C.; Ghamgosar, A.; Ioini, N.E. Infrastructure as Code: Technology Review and Research Challenges. In Proceedings of the 15th International Conference on Cloud Computing and Services Science–Volume 1: CLOSER; SCITEPRESS: Setúbal, Portugal, 2025; pp. 151–158.
3. Petersen, K.; Feldt, R.; Mujtaba, S.; Mattsson, M. Systematic mapping studies in software engineering. In 12th International Conference on Evaluation and Assessment in Software Engineering, EASE’08; BCS Learning & Development Ltd.: Swindon, UK, 2008; pp. 68–77.
4. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Br. Med. J. Publ. Group 2021, 372, n71.
5. Openja, M.; Adams, B.; Khomh, F. Analysis of Modern Release Engineering Topics: –A Large-Scale Study using StackOverflow–. In 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME); IEEE: Piscataway, NJ, USA, 2020; pp. 104–114.
6. Bhuiyan, F.A.; Rahman, A. Characterizing Co-located Insecure Coding Patterns in Infrastructure as Code Scripts. In Proceedings of the 2020 35th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW); ACM: New York, NY, USA, 2020; pp. 27–32.
7. Borovits, N.; Kumara, I.; Krishnan, P.; Palma, S.D.; Di Nucci, D.; Palomba, F.; Tamburri, D.A.; van den Heuvel, W.J. DeepIaC: Deep learning-based linguistic anti-pattern detection in IaC. In MaLTeSQuE 2020: Proceedings of the 4th ACM SIGSOFT International Workshop on Machine-Learning Techniques for Software-Quality Evaluation; ACM: New York, NY, USA, 2020; pp. 7–12.
8. Opdebeeck, R.; Zerouali, A.; Velázquez-Rodríguez, C.; Roover, C.D. Does Infrastructure as Code Adhere to Semantic Versioning? An Analysis of Ansible Role Evolution. In 2020 IEEE 20th International Working Conference on Source Code Analysis and Manipulation (SCAM); IEEE: Piscataway, NJ, USA, 2020; pp. 238–248.
9. Palma, S.D.; Mohammadi, M.; Di Nucci, D.; Tamburri, D.A. Singling the odd ones out: A novelty detection approach to find defects in infrastructure-as-code. In MaLTeSQuE 2020: Proceedings of the 4th ACM SIGSOFT International Workshop on Machine-Learning Techniques for Software-Quality Evaluation; ACM: New York, NY, USA, 2020; pp. 31–36.
10. Rahman, A.; Williams, L. Different Kind of Smells: Security Smells in Infrastructure as Code Scripts. IEEE Secur. Priv. 2021, 19, 33–41.
11. Alonso, J.; Orue-Echevarria, L.; Osaba, E.; López Lobo, J.; Martinez, I.; Diaz de Arcaya, J.; Etxaniz, I. Optimization and Prediction Techniques for Self-Healing and Self-Learning Applications in a Trustworthy Cloud Continuum. Information 2021, 12, 308.
12. Alnafessah, A.; Gias, A.U.; Wang, R.; Zhu, L.; Casale, G.; Filieri, A. Quality-Aware DevOps Research: Where Do We Stand? IEEE Access 2021, 9, 44476–44489.
13. Recupito, G.; Pecorelli, F.; Catolino, G.; Moreschini, S.; Nucci, D.D.; Palomba, F.; Tamburri, D.A. A Multivocal Literature Review of MLOps Tools and Features. In 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA); IEEE: Piscataway, NJ, USA, 2022; pp. 84–91.
14. Petrović, N.; Cankar, M.; Luzar, A. Automated Approach to IaC Code Inspection Using Python-Based DevSecOps Tool. In 2022 30th Telecommunications Forum (TELFOR); IEEE: Piscataway, NJ, USA, 2022; pp. 1–4.
15. Borovits, N.; Kumara, I.; Di Nucci, D.; Krishnan, P.; Palma, S.D.; Palomba, F.; Tamburri, D.A.; Heuvel, W.J.v.d. FindICI: Using machine learning to detect linguistic inconsistencies between code and natural language descriptions in infrastructure-as-code. Empir. Softw. Eng. 2022, 27, 178.
16. Kyryk, M.; Pleskanka, N.; Pleskanka, M.; Kyryk, V. Infrastructure as Code and Microservices for Intent-Based Cloud Networking. In Future Intent-Based Networking; Klymash, M., Beshley, M., Luntovskyy, A., Eds.; Springer: Cham, Switzerland, 2022; pp. 51–68.
17. Quattrocchi, G.; Tamburri, D.A. Predictive maintenance of infrastructure code using “fluid” datasets: An exploratory study on Ansible defect proneness. J. Softw. Evol. Process 2022, 34, e2480.
18. Chiari, M.; De Pascalis, M.; Pradella, M. Static Analysis of Infrastructure as Code: A Survey. In 2022 IEEE 19th International Conference on Software Architecture Companion (ICSA-C); IEEE: Piscataway, NJ, USA, 2022; pp. 218–225.
19. Myat, H.M.; Phyu, M.P.; Paing, A.M.M. Towards Infrastructure Automation Using IaC in the Era of GenAI. In Genetic and Evolutionary Computing, Proceedings of the Sixteenth International Conference on Genetic and Evolutionary Computing, Miyazaki, Japan, 28–30 August 2024; Pan, J.S., Zin, T.T., Sung, T.W., Lin, J.C.W., Eds.; Springer: Singapore, 2025; pp. 486–494.
20. Dalla Palma, S.; Di Nucci, D.; Palomba, F.; Tamburri, D.A. Within-Project Defect Prediction of Infrastructure-as-Code Using Product and Process Metrics. IEEE Trans. Softw. Eng. 2022, 48, 2086–2104.
21. Srivatsa, K.G.; Mukhopadhyay, S.; Katrapati, G.; Shrivastava, M. A Survey of using Large Language Models for Generating Infrastructure as Code. In Proceedings of the 20th International Conference on Natural Language Processing (ICON), Goa, India, 14–17 December 2023; Pawar, J.D., Lalitha Devi, S., Eds.; NLP Association of India: Patna, India, 2023; pp. 523–533.
22. Lanciano, G.; Stein, M.; Hilt, V.; Cucinotta, T. Analyzing Declarative Deployment Code with Large Language Models. In Proceedings of the 13th International Conference on Cloud Computing and Services Science (CLOSER 2023); SCITEPRESS: Setúbal, Portugal, 2023; pp. 289–296.
23. Opdebeeck, R.; Zerouali, A.; De Roover, C. Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort? In 2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR); IEEE: Piscataway, NJ, USA, 2023; pp. 534–545.
24. Rahman, A.; Parnin, C. Detecting and Characterizing Propagation of Security Weaknesses in Puppet-Based Infrastructure Management. IEEE Trans. Softw. Eng. 2023, 49, 3536–3553.
25. de la Fuente Ruiz, A.E.; Novakova Nedeltcheva, G. Game-theory strategies for open-source Infrastructure-as-Code. In 2023 IEEE 20th International Conference on Software Architecture Companion (ICSA-C); IEEE: Piscataway, NJ, USA, 2023; pp. 328–332.
26. Cankar, M.; Petrovic, N.; Pita Costa, J.; Cernivec, A.; Antic, J.; Martincic, T.; Stepec, D. Security in DevSecOps: Applying Tools and Machine Learning to Verification and Monitoring Steps. In ICPE ’23 Companion: Proceedings of the Companion of the 2023 ACM/SPEC International Conference on Performance Engineering; ACM: New York, NY, USA, 2023; pp. 201–205.
27. Reddy Konala, P.R.; Kumar, V.; Bainbridge, D. SoK: Static Configuration Analysis in Infrastructure as Code Scripts. In 2023 IEEE International Conference on Cyber Security and Resilience (CSR); IEEE: Piscataway, NJ, USA, 2023; pp. 281–288.
28. Bär, F.; Leyer, M. YUMA—An AI Planning Agent for Composing IT Services from Infrastructure-as-Code Specifications. In Proceedings of the Hawaii International Conference on System Sciences 2023 (HICSS-56); University of Hawai’i at Mānoa: Honolulu, HI, USA, 2023.
29. Abbas, S.I.; Garg, A. AIOps in DevOps: Leveraging Artificial Intelligence for Operations and Monitoring. In 2024 3rd International Conference on Sentiment Analysis and Deep Learning (ICSADL); IEEE: Piscataway, NJ, USA, 2024; pp. 64–70.
30. Sokolowski, D.; Spielmann, D.; Salvaneschi, G. Automated Infrastructure as Code Program Testing. IEEE Trans. Softw. Eng. 2024, 50, 1585–1599.
31. Begoug, M.; Chouchen, M.; Ouni, A.; Abdullah Alomar, E.; Mkaouer, M.W. Fine-Grained Just-In-Time Defect Prediction at the Block Level in Infrastructure-as-Code (IaC). In MSR ’24: Proceedings of the 21st International Conference on Mining Software Repositories; ACM: New York, NY, USA, 2024; pp. 100–112.
32. Kon, P.T.J.; Liu, J.; Qiu, Y.; Fan, W.; He, T.; Lin, L.; Zhang, H.; Park, O.M.; Elengikal, G.S.; Kang, Y.; et al. IaC-Eval: A Code Generation Benchmark for Cloud Infrastructure-as-Code Programs. In Proceedings of the Advances in Neural Information Processing Systems; Globerson, A., Mackey, L., Belgrave, D., Fan, A., Paquet, U., Tomczak, J., Zhang, C., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2024; Volume 37, pp. 134488–134506.
33. Ragothaman, H.; Udayakumar, S.K. Optimizing Service Deployments With NLP Based Infrastructure Code Generation—An Automation Framework. In 2024 IEEE 2nd International Conference on Electrical Engineering, Computer and Information Technology (ICEECIT); IEEE: Piscataway, NJ, USA, 2024; pp. 216–221.
34. Low, E.; Cheh, C.; Chen, B. Repairing Infrastructure-as-Code using Large Language Models. In 2024 IEEE Secure Development Conference (SecDev); IEEE: Piscataway, NJ, USA, 2024; pp. 20–27.
35. Vasileiou, Z.; Kumara, I.; Meditskos, G.; Tokmakov, K.; Radolović, D.; Cruz, J.; Nitto, E.; Tamburri, D.; Heuvel, W.J.; Vrochidis, S. A knowledge-based approach for guided development of Infrastructure as Code. Softw. Syst. Model. 2025, 1–34.
36. Eken, B.; Pallewatta, S.; Tran, N.; Tosun, A.; Babar, M.A. A Multivocal Review of MLOps Practices, Challenges and Open Issues. ACM Comput. Surv. 2025, 58, 39.
37. Seth, D.K.; Ratra, K.K.; Sundareswaran, A.P. AI and Generative AI-Driven Automation for Multi-Cloud and Hybrid Cloud Architectures: Enhancing Security, Performance, and Operational Efficiency. In 2025 IEEE 15th Annual Computing and Communication Workshop and Conference (CCWC); IEEE: Piscataway, NJ, USA, 2025; pp. 00784–00793.
38. Opdebeeck, R.; Adams, B.; De Roover, C. Analysing Software Supply Chains of Infrastructure as Code: Extraction of Ansible Plugin Dependencies. In 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER); IEEE: Piscataway, NJ, USA, 2025; pp. 181–192.
39. Peng, J.; Qiu, Y.; Kon, P.T.J.; Zhao, P.; Huang, Y.; Guo, Z.; Wang, X.; Chen, A. Automated Lifting for Cloud Infrastructure-as-Code Programs. In 2025 IEEE/ACM International Workshop on Cloud Intelligence & AIOps (AIOps); IEEE: Piscataway, NJ, USA, 2025; pp. 4–9.
40. Senthamarai, N.; Jeyaselvi, M.; Hemamalini, V. Automatic Cloud Formation Using LLM. In 2025 International Conference on Intelligent and Cloud Computing (ICoICC); IEEE: Piscataway, NJ, USA, 2025; pp. 1–6.
41. Vorel, R. Generative AI for IaC and Data Provisioning. In NoOps: How AI Agents Are Reinventing DevOps and Software; Apress: Berkeley, CA, USA, 2025; pp. 133–148.
42. Toprani, D.; Madisetti, V.K. LLM Agentic Workflow for Automated Vulnerability Detection and Remediation in Infrastructure-as-Code. IEEE Access 2025, 13, 69175–69181.
43. Kosbar, S.; Hamdaqa, M. Smells-sus: Sustainability Smells in IaC. In 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR); IEEE: Piscataway, NJ, USA, 2025; pp. 801–812.
44. Muthukrishnan, H.; Viradia, V.; Yadav, D. Unified AI and ML Framework in DevSecOps Practices, Solving Real-World Problems. In SoutheastCon 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 1250–1257.
45. Brojabasi, S.; Paul, S.; Mitra, A. Cloud Native Engineering: A Comprehensive Review of Principles, Practices, and Challenges; Advances in Computers; Elsevier: Amsterdam, The Netherlands, 2025.
46. Ramos, R.C.B.; Yoo, S.G. Cybersecurity in DevOps Environments: A Systematic Literature Review. IEEE Access 2025, 13, 191959–191979.
47. Novakova Nedeltcheva, G.; De La Fuente Ruiz, A.; Orue-Echevarria Arrieta, L.; Bat, N.; Blasi, L. Towards Supporting the Generation of Infrastructure as Code Through Modelling Approaches–Systematic Literature Review. In 2022 IEEE 19th International Conference on Software Architecture Companion (ICSA-C); IEEE: Piscataway, NJ, USA, 2022; pp. 210–217.
48. Kitchenham, B.; Pearl Brereton, O.; Budgen, D.; Turner, M.; Bailey, J.; Linkman, S. Systematic literature reviews in software engineering—A systematic literature review. Inf. Softw. Technol. 2009, 51, 7–15.
49. Diefenbach, A.; Raymond, B.; Esther, D. AI-Driven Configuration Management: Automating Infrastructure as Code (IaC). 2023. Available online: https://www.researchgate.net/profile/Dorcas-Esther/publication/388633079_AI-Driven_Configuration_Management_Automating_Infrastructure_as_Code_IaC/links/67a012d7207c0c20fa72eac5/AI-Driven-Configuration-Management-Automating-Infrastructure-as-Code-IaC.pdf (accessed on 23 December 2025).
50. Pahl, C. Research challenges for machine learning-constructed software. Serv. Oriented Comput. Appl. 2023, 17, 1–4.
51. Azimi, S.; Pahl, C. Anomaly analytics in data-driven machine learning applications. Int. J. Data Sci. Anal. 2025, 19, 155–180.
52. Pahl, C.; Barzegar, H.R.; El Ioini, N. Quality Management for AI-Generated Self-Adaptive Resource Controllers. Machines 2026, 14, 25.
53. Pahl, C.; Jamshidi, P.; Weyns, D. Cloud architecture continuity: Change models and change rules for sustainable cloud software architectures. J. Softw. Evol. Process 2017, 29, e1849.

Figure 2. PRISMA Flow Chart for this SLR.

Figure 4. Visualized Results of Publication Fora, Formats, Contribution Types, and Evaluation Methods.

Figure 5. DevOps Lifecycle—Term Extraction of AI for DevOps phases.

Table 1. Classification Table for IaC Technologies—see [2].

Dimension	Aspect	Chef	Puppet	Ansible	Pulumi	CloudFormation	Heat	Terraform	TOSCA	DOML
Context	Accessibility	Open-Source	Open-Source	Open-Source	Open-Source	Closed-Source	Open-Source	Open-Source	Open-Source	Open-Source
Context	Cloud Compatibility	All	All	All	All	AWS	All	All	All	All
Context	Community	Large	Large	Huge	Small	Small	Small	Huge	Large	Small
Context	Maturity	High	High	Medium	Medium	Low	Medium	Medium	Medium	Low
Functionality	Type	Configuration	Configuration	Configuration	Provisioning	Provisioning	Provisioning	Provisioning	Configuration	Provisioning
Functionality	Infrastructure	Mutable	Mutable	Mutable	Immutable	Immutable	Immutable	Immutable	Immutable	Immutable
Language	Paradigm	Procedural	Declarative	Declarative	Declarative	Declarative	Declarative	Declarative	Declarative	Declarative
Language	Scope	GPL	DSL	DSL	GPL	DSL	DSL	DSL	GPL	DSL
Architecture	Master Server	Required	Required	Not Required	Not Required	Not Required	Not Required	Not Required	Not Required	Not Required
Architecture	Agent Client	Required	Required	Not Required	Not Required	Not Required	Not Required	Not Required	Not Required	Not Required

Table 2. PICO Criteria.

Concern	Explanation
Population	RQ1: Practical motivation, RQ2: AI Techniques, RQ3: AI Application, RQ4: Research challenges and future directions [all detailed below]
Intervention	Characterization, Internal/external validation; Extracting data and Synthesis
Comparison	A comparison by mapping the primary studies to a characterization framework
Outcome	A characterization framework

Table 3. Research Questions.

Research Question	Motivation
RQ1 What are the main motivations behind using AI for IaC?	The aim is to obtain insight into what the main reasons are for using AI techniques to improve IaC.
RQ2 What are the different types of AI techniques used?	The aim is to investigate the technical possibilities for achieving IaC improvements.
RQ3 What are the IaC phases and tasks that are specifically supported by AI?	The aim is to identify existing opportunities and progress in specific activities.
RQ4 What are the existing research challenges, and what should be the future research agenda?	The aim is to understand and reveal the research gaps and identify future directions.

Table 4. Inclusion/Exclusion Criteria.

Criteria	Definition
Inclusion	(i) Abstract/keywords include key terms (ii) From the abstract, it is clear that a contribution towards IaC and an AI-based contribution is made
Exclusion	(i) Type: literature only in the form of an abstract, blog, or presentation is excluded (ii) Papers with AI and IaC terms only in the abstract or with little concrete details


