Article

MPC-Coder: A Dual-Knowledge Enhanced Multi-Agent System with Closed-Loop Verification for PLC Code Generation

1 China Academy of Machinery Science & Technology, Beijing 100044, China
2 National Engineering Research Center for Manufacturing Automation, Beijing 100120, China
3 Department of Precision Instrument, Tsinghua University, Beijing 100084, China
4 School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
* Author to whom correspondence should be addressed.
Symmetry 2026, 18(2), 248; https://doi.org/10.3390/sym18020248
Submission received: 22 December 2025 / Revised: 20 January 2026 / Accepted: 21 January 2026 / Published: 30 January 2026
(This article belongs to the Section Computer)

Abstract

Industrial PLC programming faces persistent difficulties: lengthy development cycles, low fault tolerance, and cross-platform incompatibility among vendors. While large language models (LLMs) show promise for automated code generation, their direct application is hindered by the gap between ambiguous natural language and the strict determinism required by control logic. This paper proposes MPC-Coder, a dual-knowledge enhanced multi-agent system that addresses this gap. The system combines a structured knowledge graph that imposes hard constraints on process parameters and equipment specifications with a vector database that offers implementation references such as code templates and function blocks. These two knowledge sources form a symmetric complementary architecture. A closed-loop “generation–verification–repair” mechanism leverages formal verification tools to iteratively refine the generated code. Experiments demonstrate that MPC-Coder achieves 100% syntactic correctness and 78% functional consistency, significantly outperforming general-purpose LLMs. The results indicate that the complementary fusion of domain knowledge and closed-loop verification effectively enhances the reliability of code generation, offering a viable technical pathway for the reliable application of LLMs in industrial control systems.

1. Introduction

The Programmable Logic Controller (PLC) serves as the core control unit of industrial automation systems, and the efficiency and quality of its program development directly affect the operational reliability of production lines [1,2]. The PLC’s control logic is typically written using languages defined by the IEC 61131-3 standard [3], among which Ladder Diagram and Structured Text (ST) are the most commonly used in engineering practice [4,5]. Traditional PLC programming predominantly relies on the domain expertise of process experts and the on-site debugging experience of control engineers, resulting in lengthy development cycles and repetitive debugging iterations [6]. Furthermore, technical barriers exist among PLC platforms from different vendors, such as incompatible function block libraries, which render cross-platform migration very difficult [7]. How to enhance the automation level of PLC program development has become an urgent problem to be addressed in the field of industrial control.
In recent years, generative artificial intelligence technologies, represented by large language models (LLMs), have provided new approaches for automated code generation. However, the direct application of general-purpose LLMs to industrial control code generation faces the following challenge: natural language requirement descriptions are inherently ambiguous, whereas PLC control logic demands strict determinism. LLMs are essentially probabilistic generative models that, in the absence of domain knowledge constraints, tend to produce “hallucinations”, in which the generated code may be syntactically correct but contains defects in process logic or safety specifications, making it difficult to deploy directly in industrial environments where reliability requirements are stringent [8].
Existing research has explored several techniques to incorporate domain knowledge, including retrieval-augmented generation (RAG) and model fine-tuning [9,10,11]. RAG methods based on vector retrieval offer abundant code snippet references, yet they cannot precisely capture hard constraint relationships among process parameters. Knowledge graph-based approaches provide structured logical constraints but lack coverage of diverse code implementation patterns. Most existing code generation frameworks also follow an “open-loop” process, directly mapping requirements to code without effective verification or error correction. As a result, they cannot guarantee that the generated code aligns with the original intent.
To address the aforementioned problems, this paper proposes a dual-knowledge enhanced multi-agent system (MPC-Coder), which is developed and validated in the context of thermal processing control scenarios. The main contributions of this paper are as follows:
First, a dual-knowledge architecture combining structured and unstructured knowledge is constructed. The knowledge graph stores process entities, equipment parameters, and their logical relationships, providing hard constraints. The vector database stores ST code snippets and function block documentation, providing semantic references. The symmetric complementary fusion of these two types of knowledge simultaneously satisfies the requirements for logical precision and implementation diversity.
Second, we designed a closed-loop collaborative framework comprising five specialized agents: parsing, planning, coding, verification, and fixing. The verification agent integrates formal verification tools capable of mapping the generated code back to logical specifications for consistency checking; the fixing agent performs iterative corrections based on verification feedback. Through this “generation–verification–repair” closed-loop mechanism, the reliability of generated code is effectively enhanced.

2. Related Work

This section reviews the research status in the field of automated industrial control code generation from three aspects: LLM-based code generation, knowledge-enhanced generation methods, and multi-agent collaboration with verification mechanisms.

2.1. LLM-Based Code Generation

Research on automated PLC program generation has evolved from early formal methods to model-driven engineering [12,13,14,15,16,17,18,19,20]. In recent years, LLMs based on the Transformer architecture have achieved significant progress in the field of code generation. IntelliCode Compose, proposed by Svyatkovskiy et al. [21], demonstrated sequence generation capabilities for general-purpose programming languages; the study by Tran et al. [22] showed that GPT-4 outperforms lightweight local models in terms of pass rate when generating IEC 61131-3 standard ST code.
However, applying general-purpose LLMs to the industrial control domain faces notable challenges. LLMs are probabilistic generative models, whereas PLC control logic demands strict determinism and safety. Haag et al. [23] attempted to improve domain adaptability through fine-tuning, but purely data-driven approaches cannot fully resolve domain knowledge deficiency, with the generated code frequently exhibiting logical errors or violating physical constraints.

2.2. Knowledge-Enhanced Code Generation

To compensate for the domain knowledge deficiency of general-purpose models, researchers have explored various knowledge enhancement methods, primarily along two technical routes.
The first route is vector retrieval-based RAG. Koziolek et al. [24] proposed enhancing the generation context through vectorized retrieval of function block documentation. This method utilizes unstructured data such as technical documents and code snippets to provide semantic references for the model. However, vector retrieval relies on semantic similarity matching and cannot precisely express hard constraint relationships among process parameters.
The second route is knowledge graph-based enhancement. As carriers of structured knowledge, knowledge graphs can store entities and their relationships, providing factual constraints. Yang et al. [25] and Ji et al. [26] explored combining knowledge graphs with LLMs to enhance factual awareness; An et al. [27] addressed the semantic heterogeneity problem in the PLC domain through ontology construction; and Zhao et al. [28] leveraged knowledge graphs to enhance the requirements analysis phase of code generation. However, relying solely on knowledge graphs fails to cover diverse code implementation patterns and lacks references for programming styles and algorithmic templates.
Existing knowledge enhancement methods each have their own emphasis: vector retrieval provides semantic richness but with loose logical constraints, while knowledge graphs provide structured constraints but insufficient implementation references. Recently, Ye et al. [29] proposed SWP-Chat for welding process Q&A, which combines Neo4j knowledge graphs with vector databases in a dual-channel design, demonstrating the feasibility of integrating symbolic reasoning with semantic retrieval in manufacturing domains. However, few studies in the field of PLC control code generation have attempted to complementarily fuse these two types of knowledge to simultaneously satisfy the requirements for logical precision and implementation diversity.

2.3. Multi-Agent Collaboration and Verification Mechanisms

To handle complex programming tasks, the research paradigm is shifting from single large models to multi-agent collaborative systems. Islam et al. [30] proposed MapCoder, a multi-agent framework that simulates human problem-solving through four specialized agents for retrieval, planning, coding, and debugging. Bai et al. [31] introduced a collaborative framework that decomposes code generation into role definition, demand optimization, code writing, and code review phases. The Self-Collaboration framework proposed by Dong et al. [32] and the CodeAgent framework proposed by Zhang et al. [33] also improved code generation quality by simulating different roles in software development. In the PLC domain, Fakih et al. [11] proposed the LLM4PLC framework with automated verification for iterative correction.
For PLC testing and verification, Koziolek et al. [34] explored using LLMs to automatically generate test cases for industrial control logic, demonstrating high coverage but noting limitations in handling complex logic assertions. He et al. [35] proposed STAutoTester based on dynamic symbolic execution for automated test generation of IEC 61131-3 ST programs, achieving higher efficiency than traditional symbolic execution tools. On the formal verification side, Fink et al. [36] extended PLCverif to integrate NASA’s FRET tool, enabling monitor-based verification with natural language requirements and supporting timed properties. The nuXmv symbolic model checker [37], as an evolution of NuSMV, provides enhanced verification capabilities for industrial control logic through SAT and SMT techniques. Additionally, Wang et al. [38] developed K-ST, a formal executable semantics for the ST language based on the K Framework, which provides a mathematical foundation for verifying ST program correctness and addresses the inconsistency issues among different vendor compilers.
However, existing multi-agent systems still have limitations in industrial applications: most focus on forward generation capabilities and give less attention to reverse verification. Although some studies have introduced syntax checking or unit testing, they lack closed-loop mechanisms that integrate formal verification. In industrial control scenarios with strict safety requirements, relying solely on syntax-level checking is insufficient to ensure logical consistency between generated code and the original intent.

2.4. Major Limitations of Existing Research

Based on the above analysis, the existing research exhibits two major limitations:
First, the singularity of knowledge representation. Existing methods either rely on unstructured vector retrieval or structured knowledge graphs, failing to effectively fuse the two types of knowledge, and are thus unable to simultaneously satisfy the dual requirements of industrial code generation for logical constraints and implementation references.
Second, the absence of verification mechanisms. Most existing code generation frameworks adopt open-loop processes, lacking effective formal verification and iterative repair mechanisms, and consequently failing to guarantee that the generated code meets industrial safety specifications.
Compared to existing approaches, our work differs in several key aspects. The SWP-Chat system explored combining knowledge graphs with vector databases for welding process Q&A, demonstrating the feasibility of dual-channel knowledge integration in manufacturing domains; however, it does not address code generation or formal verification. In the PLC domain, standard RAG-based methods provide semantic references through vector retrieval but cannot enforce hard constraints on process parameters. LLM4PLC employed LoRA fine-tuning for code generation and introduced formal verification tools, but the iterative repair process requires semi-manual guidance. To the best of our knowledge, no prior work has applied knowledge graphs directly to PLC code generation. Our MPC-Coder uniquely combines dual-knowledge sources with fully automated closed-loop verification, enabling end-to-end code generation without human intervention during the repair process. A quantitative comparison with these system classes is presented in Section 4.3.
To address these limitations, this paper proposes a dual-knowledge enhanced multi-agent system that improves the reliability of industrial control code generation through the complementary fusion of structured and unstructured knowledge, as well as the closed-loop coupling of generation and verification.

3. Methodology

3.1. Overall System Architecture

The automated generation of industrial control code is essentially the transformation of natural language requirements into ST programs that are compliant with the IEC 61131-3 standard. This process involves two key concerns: the effective injection of domain knowledge and the quality assurance of generated code.
Taking thermal processing control as an application scenario based on the domain data that are available to our research team, this paper designs a dual-knowledge enhanced multi-agent system (MPC-Coder). As shown in Figure 1, the system comprises three core modules: the knowledge graph module, the vector database module, and the multi-agent system module.
The knowledge graph module stores structured knowledge in the thermal processing domain, including process entities, equipment parameters, and logical constraints, providing hard rules for code generation. The vector database module stores unstructured knowledge such as ST code snippets and function block documentation, providing implementation references for code generation. The multi-agent collaboration module is responsible for coordinating five specialized agents (parsing, planning, coding, verification, and fixing) to complete the entire workflow, from understanding requirements to code generation, verification, and repair.
The overall workflow of the system can be formally expressed as follows:
C = \mathrm{ClosedLoop}\big(F_{\mathrm{code}}\big(F_{\mathrm{plan}}(F_{\mathrm{parse}}(R, P), G), D\big), F_{\mathrm{verify}}, F_{\mathrm{fix}}\big)
where R denotes the natural language requirements, P denotes the process parameter file, G denotes the knowledge graph, D denotes the vector database, and C denotes the generated ST code. F_parse, F_plan, F_code, F_verify, and F_fix correspond to the mapping functions of the parsing, planning, coding, verification, and fixing agents, respectively.
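To make this composition concrete, the following minimal Python sketch mirrors the formula above. The function names, the dictionary-based agent interface, and the report structure are illustrative assumptions and not the actual MPC-Coder implementation.

```python
# Minimal sketch of the workflow formula; all names are illustrative placeholders.

def closed_loop(code, verify, fix, max_repairs=4):
    """Iterate 'verification -> repair' until the code passes or the budget is spent."""
    report = verify(code)
    repairs = 0
    while not report["passed"] and repairs < max_repairs:
        code = fix(code, report)   # repair guided by the structured verification report
        report = verify(code)
        repairs += 1
    return code, report

def generate(requirements, parameters, graph, vector_db, agents):
    spec = agents["parse"](requirements, parameters)            # F_parse(R, P)
    plan = agents["plan"](spec, graph)                          # F_plan(., G)
    draft = agents["code"](plan, vector_db)                     # F_code(., D)
    return closed_loop(draft, agents["verify"], agents["fix"])  # C = ClosedLoop(...)
```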
The core design philosophy of the system embodies symmetric principles: addressing the domain knowledge injection problem through the symmetric complementary fusion of structured and unstructured knowledge, and addressing the code quality assurance problem through the symmetric closed-loop coupling of forward generation and reverse verification.

3.2. Dual-Knowledge Architecture

3.2.1. Knowledge Graph Construction

This study constructs a domain knowledge graph for thermal processing control scenarios, illustrated using aluminum alloy vacuum heat treatment processes as an example.
The knowledge graph is defined as a set of triplets:
G = \{ (h, r, t) \mid h, t \in E, r \in R \}
where E denotes the entity set, R denotes the relation set, and (h, r, t) indicates that the head entity h is connected to the tail entity t through relation r.
The knowledge graph construction adopts a combined approach of “top-down” ontology design and “bottom-up” data extraction, following a methodology that has been proven to be effective in manufacturing domains [39,40,41]. First, a standardized ontology layer is defined, with core entities comprising five categories: materials (e.g., 6061 and 7075 aluminum alloys), processes (e.g., solution treatment and artificial aging), equipment (e.g., vacuum furnaces and controllers), parameters (e.g., holding time and quenching rate), and properties (e.g., tensile strength). Entities are connected through semantic relations, such as “process–hasParameter–parameter” and “material–applicableTo–process”. The ontology design is primarily based on equipment specifications provided by industrial partners and process data accumulated from our team’s historical projects, rather than a single industry standard. In the thermal processing and automation domain, individual standards such as AMS2770R [42] cover only specific aspects and cannot independently guide comprehensive knowledge modeling. Therefore, our ontology integrates heterogeneous knowledge sources to capture the complete scope of control-relevant entities and constraints. Figure 2 illustrates the knowledge graph construction workflow.
For data extraction, heterogeneous data sources provided by collaborating organizations, including process manuals and equipment specification documents, are first converted to text through OCR, followed by the design of structured prompts to guide LLMs in identifying entities and relations. For terminology inconsistencies (e.g., “Al-6061” versus “6061-T6”), we employ a vector similarity-based entity alignment mechanism with a cosine similarity threshold of 0.85. The extracted results are exported to a CSV format for manual review and correction before being imported into the graph database, ensuring quality control throughout the construction process. The resulting knowledge graph covers heat treatment process knowledge for 2xxx, 6xxx, and 7xxx series aluminum alloys, comprising 326 entities, 30 relation types, and 289 explicit triples. A substantial portion of specific process parameters are stored as node attributes rather than explicit triples to improve retrieval efficiency. Figure 3 presents the knowledge graph visualization.
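As one possible realization of the import step described above, the sketch below loads reviewed CSV triples into Neo4j with the official Python driver. The connection URI, credentials, node label, and relationship encoding are assumptions for illustration and do not reflect the paper's actual graph schema.

```python
# Hypothetical sketch: import reviewed (head, relation, tail) triples into Neo4j.
# Connection details, labels, and the relation encoding are assumptions.
import csv
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def load_triples(csv_path):
    with driver.session() as session, open(csv_path, newline="", encoding="utf-8") as f:
        for head, relation, tail in csv.reader(f):
            # MERGE keeps the import idempotent when the reviewed CSV is re-loaded.
            session.run(
                "MERGE (h:Entity {name: $h}) "
                "MERGE (t:Entity {name: $t}) "
                "MERGE (h)-[:RELATED {type: $r}]->(t)",
                h=head, t=tail, r=relation,
            )

load_triples("reviewed_triples.csv")
driver.close()
```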

3.2.2. Vector Database Construction

The vector database stores unstructured code knowledge, including OSCAT standard function block documentation (496 pages), function block libraries developed by our team (approximately 200 blocks), and industrial PLC source code files (970 files).
The construction workflow comprises three phases: text chunking, vector embedding, and index construction. Text chunking employs a sliding-window strategy to recursively segment source documents into 512-token chunks, retaining a 128-token overlap to ensure contextual continuity. Vector embedding utilizes the Multilingual-e5-large model to map text chunks into 1024-dimensional dense vectors:
v_i = \mathrm{Embed}(c_i) \in \mathbb{R}^{1024}
where c_i denotes the i-th text chunk and v_i denotes the corresponding vector representation. Index construction employs the Hierarchical Navigable Small World (HNSW) algorithm to balance retrieval speed and accuracy. Figure 4 illustrates the vector database construction workflow.
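A construction workflow of this shape can be sketched as follows, assuming the sentence-transformers release of Multilingual-e5-large and the hnswlib library; whitespace tokenization and the HNSW parameters (ef_construction, M) are simplifying assumptions, not values from the paper.

```python
# Sketch of the construction workflow: sliding-window chunking (512/128 tokens),
# e5 embedding into 1024-d vectors, and HNSW index construction.
import hnswlib
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/multilingual-e5-large")  # 1024-dimensional embeddings

def chunk_tokens(tokens, size=512, overlap=128):
    """Sliding window with a 128-token overlap to preserve context across chunks."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

def build_index(documents):
    chunks = [" ".join(c) for doc in documents for c in chunk_tokens(doc.split())]
    vectors = model.encode(chunks, normalize_embeddings=True)   # v_i = Embed(c_i)
    index = hnswlib.Index(space="cosine", dim=1024)
    index.init_index(max_elements=len(chunks), ef_construction=200, M=16)
    index.add_items(vectors, list(range(len(chunks))))
    return index, chunks
```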
During the query phase, the input from the coding agent is converted into a query vector v_q using the same embedding model, and the most relevant knowledge chunks are retrieved via cosine similarity:
\mathrm{Sim}(v_q, v_i) = \frac{v_q \cdot v_i}{\lVert v_q \rVert \, \lVert v_i \rVert}
The system returns the Top-k text chunks with the highest similarity as retrieval results:
K_q = \operatorname{Top\text{-}k}_{\, i \in \{1, \dots, N\}} \mathrm{Sim}(v_q, v_i)
In this study, k is set to 5, providing the coding agent with code templates and function block references.
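Continuing the construction sketch above, query-time retrieval could look roughly like this; the example query string is hypothetical, and hnswlib's cosine space returns distances, so similarity is recovered as 1 minus the distance.

```python
# Query-phase sketch (continues the construction sketch above): embed the coding
# agent's request with the same model and return the Top-k (k = 5) chunks.
def retrieve(model, index, chunks, query, k=5):
    v_q = model.encode([query], normalize_embeddings=True)     # same embedding model
    labels, distances = index.knn_query(v_q, k=k)              # cosine distance = 1 - Sim
    return [(chunks[i], 1.0 - d) for i, d in zip(labels[0], distances[0])]

# Hypothetical usage with the index built above:
# hits = retrieve(model, index, chunks, "PID block for furnace temperature ramp control")
```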

3.2.3. Hybrid Retrieval Strategy

The system employs a hybrid retrieval strategy, invoking the two types of knowledge bases according to task requirements.
The planning agent leads structured retrieval. This study uses Cypher query syntax to support structured access to the Neo4j graph database. The agent extracts information such as process parameter thresholds, equipment connection relationships, and equipment I/O addresses through graph traversal, providing hard constraints for control logic planning. Since a substantial portion of process parameters are stored as node attributes, the required information can typically be obtained within 2–3 hops, reducing traversal complexity. The retrieved constraints are converted to a structured JSON format before being passed to subsequent agents, as sketched below. Figure 5 illustrates the ReAct (Reasoning and Acting) query strategy workflow for the knowledge graph.
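For illustration, a single constraint query of this kind might be issued as below. The node labels, the hasParameter relation, the property keys, and the example process name are assumptions about the graph schema rather than the paper's exact design.

```python
# Hypothetical planning-agent query: read parameter thresholds stored as node
# attributes and emit them as structured JSON for downstream agents.
import json
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def query_process_constraints(process_name):
    with driver.session() as session:
        records = session.run(
            "MATCH (p:Process {name: $name})-[:hasParameter]->(param:Parameter) "
            "RETURN param.name AS name, param.value AS value, param.unit AS unit",
            name=process_name,
        )
        constraints = [r.data() for r in records]   # consume inside the session
    return json.dumps({"process": process_name, "constraints": constraints},
                      ensure_ascii=False)

print(query_process_constraints("artificial aging"))
```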
The coding agent leads semantic retrieval. It retrieves code snippets from similar scenarios in the vector database through an HNSW search, obtaining implementation references such as variable naming conventions and function block invocation patterns.
When conflicts arise between the two types of knowledge (e.g., a code template retrieved from the vector database suggests an upper temperature limit of 1000 °C, while the equipment safety boundary defined in the knowledge graph is 900 °C), the structured constraints from the knowledge graph take precedence. This is because the knowledge graph encodes hard constraints that are non-negotiable, while the vector database provides soft implementation suggestions that serve as references rather than mandatory rules.
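The precedence rule can be stated compactly. The sketch below is a simplified illustration using the temperature-limit example from the text; the function name and value types are hypothetical.

```python
# Simplified sketch of the precedence rule: knowledge-graph hard constraints
# override soft suggestions from retrieved code templates.
def binding_upper_limit(kg_limit, template_limit):
    """Return the limit the generated code must respect."""
    return kg_limit if kg_limit is not None else template_limit

# Example from the text: a template suggests 1000 degC, the graph bounds it at 900 degC.
assert binding_upper_limit(kg_limit=900.0, template_limit=1000.0) == 900.0
```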

3.3. Multi-Agent Collaboration Mechanism

This study designs a collaborative system comprising five specialized agents, divided into two parts: the forward generation workflow and the closed-loop verification workflow. The agents are implemented using the LangChain framework, with DeepSeek-R1 as the underlying LLM. Given the large context window of DeepSeek-R1, combined with our retrieval strategy that converts knowledge to compact representations, the risk of token overflow is minimal.

3.3.1. Forward Generation Workflow

The forward generation workflow progressively transforms natural language requirements into ST code, involving three agents.
The parsing agent is responsible for structuring requirements. This agent processes two types of inputs: unstructured natural language requirement text and semi-structured CSV process parameter files. Through a dual-path parsing mechanism, it extracts process types and control constraints from the text and extracts critical curve data such as time–temperature profiles from the parameter files, ultimately outputting a standardized JSON structural description.
The planning agent is responsible for the control logic design. This agent integrates the ReAct reasoning mechanism to query the knowledge graph based on parsing results, extracting the process parameters, equipment constraints, and interlock logic. Information deficiency is detected through LLM-based semantic judgment: the prompt instructs the LLM to evaluate whether the retrieved knowledge is sufficient to complete the control logic. If insufficient, the LLM leverages its reasoning capability to reformulate query statements rather than repeating previous queries. The agent automatically triggers iterative queries (up to 3 rounds) when needed. The final output comprises a detailed control plan including control steps, state transition conditions, and exception handling strategies.
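A simplified version of this iterative querying could be structured as below. The llm and run_cypher callables, the prompt wording, and the SUFFICIENT convention are placeholders used only to illustrate the up-to-three-round loop.

```python
# Sketch of the planning agent's iterative knowledge-graph querying (up to 3 rounds).
def plan_with_react(llm, run_cypher, parsed_spec, max_rounds=3):
    knowledge = []
    query = llm(f"Write a Cypher query for the constraints needed by: {parsed_spec}")
    for _ in range(max_rounds):
        knowledge.extend(run_cypher(query))
        verdict = llm(
            f"Specification: {parsed_spec}\nRetrieved knowledge: {knowledge}\n"
            "If this is sufficient to design the control logic, answer SUFFICIENT; "
            "otherwise return a reformulated Cypher query (do not repeat the last one)."
        )
        if verdict.strip().upper().startswith("SUFFICIENT"):
            break
        query = verdict   # reformulated query for the next round
    return llm(f"Design a control plan from {parsed_spec} using {knowledge}")
```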
The coding agent is responsible for code implementation. Based on the control plan, this agent retrieves code templates from similar scenarios through vector retrieval, assembles the planning logic and reference code into structured prompts, guides LLMs to generate IEC 61131-3 standard ST code, and removes non-code content such as Markdown tags through a post-processing module.
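The prompt assembly and the Markdown-stripping post-processing might look roughly like the sketch below; the template wording, the llm interface, and the fence-matching pattern are assumptions.

```python
# Sketch of the coding agent: assemble the plan and retrieved references into a
# prompt, then strip Markdown fences from the response.
import re

def generate_st_code(llm, control_plan_json, reference_snippets):
    prompt = (
        "Generate IEC 61131-3 Structured Text that implements the control plan below.\n"
        f"Control plan (JSON):\n{control_plan_json}\n\n"
        f"Reference code retrieved from the vector database:\n{reference_snippets}\n\n"
        "Return only the ST code."
    )
    raw = llm(prompt)
    # Post-processing: keep only the code inside a fenced block, if one is present.
    match = re.search(r"`{3}[^\n]*\n(.*?)`{3}", raw, re.DOTALL)
    return (match.group(1) if match else raw).strip()
```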

3.3.2. Closed-Loop Verification and Repair

The closed-loop verification workflow performs multi-level checking and iterative repair on the generated code, involving two agents.
The verification agent is responsible for code verification, employing a two-phase detection mechanism. The first phase is syntax checking: the LLVM-based ST compiler ruSTy is invoked to perform static analysis, filtering basic syntax errors and parsing error logs. The second phase is logic verification: the PLCverif tool is utilized to convert ST code into SMV models that are compatible with symbolic model checkers, and bounded model checking is executed through the nuXmv engine in conjunction with predefined Linear Temporal Logic (LTL) specifications. The verification results from both phases are consolidated into a structured JSON report. If verification fails, the report includes error types, locations, and counterexamples containing state transition traces, along with repair suggestions, which serve as the basis for the fixing agent to perform targeted repairs.
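Structurally, the two-phase check amounts to a wrapper of the following shape. The command names and flags for ruSTy, PLCverif, and nuXmv below are placeholders and are not claimed to match the tools' actual command-line interfaces; only the overall flow and the report structure follow the text.

```python
# Structural sketch of the two-phase verification; CLI invocations are assumed.
import subprocess

def verify(st_file, ltl_spec_file):
    report = {"passed": False, "syntax": {}, "logic": {}}

    # Phase 1: static syntax check with the ruSTy compiler (command name assumed).
    syn = subprocess.run(["rusty-compiler", st_file], capture_output=True, text=True)
    report["syntax"] = {"ok": syn.returncode == 0, "log": syn.stderr}
    if not report["syntax"]["ok"]:
        return report

    # Phase 2: ST -> SMV model via PLCverif, then bounded model checking with
    # nuXmv against predefined LTL specifications (invocations assumed).
    subprocess.run(["plcverif", "--input", st_file, "--output", "model.smv"], check=True)
    logic = subprocess.run(["nuXmv", "-source", ltl_spec_file, "model.smv"],
                           capture_output=True, text=True)
    # nuXmv reports violated specifications as "... is false" with a counterexample trace.
    report["logic"] = {"ok": "is false" not in logic.stdout, "log": logic.stdout}
    report["passed"] = report["syntax"]["ok"] and report["logic"]["ok"]
    return report
```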
The fixing agent is responsible for iterative correction. This agent employs chain-of-thought reasoning and retains access to the shared knowledge bases, previous code generation context, and verification agent’s repair suggestions. For compilation errors, the chain-of-thought method is employed to parse error stacks, locating and correcting syntax issues. For logic errors, the state variable change paths that lead to specification violations in counterexamples are analyzed, and the corresponding logic branches are corrected. The repaired code is resubmitted to the verification agent for validation, forming an iterative “generation–verification–repair” closed loop until all checks are passed or the maximum of 4 repair iterations is reached (5 iterations including initial generation). Unresolved cases are flagged for manual review with full diagnostic information.
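The fixing agent's dispatch between the two error types can be sketched as follows, reusing the report structure from the verification sketch above; the prompt wording and the llm interface are assumptions.

```python
# Sketch of the fixing agent's dispatch: compilation errors are repaired from the
# parsed compiler log, logic errors from the counterexample trace.
def fix(llm, code, report):
    if not report["syntax"]["ok"]:
        prompt = (
            "The following Structured Text fails to compile.\n"
            f"Compiler log:\n{report['syntax']['log']}\n\nCode:\n{code}\n"
            "Reason step by step about the error location, then return corrected ST code."
        )
    else:
        prompt = (
            "The following Structured Text violates an LTL specification.\n"
            f"Counterexample (state transition trace):\n{report['logic']['log']}\n\n"
            f"Code:\n{code}\n"
            "Trace the state variables that lead to the violation, correct the "
            "responsible logic branch, and return the corrected ST code."
        )
    return llm(prompt)
```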
Through this closed-loop mechanism, the system can maximize the generative capabilities of LLMs while ensuring that the generated code meets industrial safety specifications through formal verification.
To intuitively demonstrate the practical effects of multi-agent collaboration and dual-knowledge injection, Figure 6 presents a complete end-to-end generation process using a 6061 aluminum alloy aging heat treatment temperature ramp control task as an example. As shown in the figure, after receiving the user’s natural language requirements (A), the planning agent queries the knowledge graph and outputs a control scheme containing key control steps and I/O configurations; the coding agent retrieves a reference implementation of the PID control block from the vector database (B). Subsequently, the coding agent generates initial code (C), which contains a “dual-coil conflict” logic error. At this point, the closed-loop verification mechanism intervenes: the formal verification tool detects this logic violation and drives the fixing agent to perform corrections, ultimately outputting the correct code (D). This example illustrates the complementary roles of the dual-knowledge architecture: process parameter constraints (such as temperature thresholds) originate from the knowledge graph, while code implementation details (such as the programming convention of defaulting maximum values to 32,767) originate from historical code references in the vector database.

4. Experiments and Analysis

This section evaluates the proposed method through experiments. The experiments address two key questions: (1) Can the dual-knowledge architecture improve code generation quality? (2) Can the closed-loop verification mechanism correct generation errors?

4.1. Experimental Setup

4.1.1. Experimental Environment

Experiments were conducted on a workstation with the configuration shown in Table 1.

4.1.2. Evaluation Dataset

This study constructed an evaluation dataset comprising 50 thermal processing control programming tasks, divided into two groups based on control logic complexity:
Simple task group (30 tasks): This group covers tasks such as LED status indication, counter logic, and single-loop start–stop control. These tasks involve a relatively simple control logic and primarily test the system’s basic knowledge retrieval capabilities.
Moderately complex task group (20 tasks): This group covers tasks such as multi-segment temperature curve control, multi-axis coordinated motion, and pressure closed-loop control with safety interlocks. These tasks involve complex temporal constraints and state transition logic and are used to test the comprehensive effectiveness of dual-knowledge fusion and closed-loop verification mechanisms.
It should be noted that all 50 evaluation tasks were independently designed by domain experts and are completely separate from the 970 source files in the vector database.

4.1.3. Evaluation Metrics

This study adopts the following evaluation metrics:
Syntactic Correctness: The proportion of code passing the ruSTy compiler syntax check, measuring the basic validity of generated code.
Functional Consistency: The proportion of code passing nuXmv formal verification, measuring whether the code logic satisfies requirement specifications.
Pass@k: The probability that at least one out of k independent generations simultaneously passes both syntax checking and functional verification. In this study, k is set to 3.
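With k = 3 samples per task, the metric reduces to a simple empirical estimate; the sketch below uses hypothetical pass/fail data for illustration.

```python
# Sketch of the Pass@k estimate (k = 3): a task counts if at least one of its
# k generations passes both syntax checking and functional verification.
def pass_at_k(per_task_trials, k=3):
    """per_task_trials: one list of booleans (pass/fail of each generation) per task."""
    return sum(any(trials[:k]) for trials in per_task_trials) / len(per_task_trials)

# Hypothetical results for three tasks, three generations each.
print(pass_at_k([[False, True, False], [False, False, False], [True, True, True]]))  # ~0.667
```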

4.2. Overall Performance Comparison

4.2.1. Comparative Methods

To validate the effectiveness of the proposed method, the following comparison groups are established:
Group A: General-purpose large models. GPT-4o and DeepSeek-R1 are accessed through their public APIs and prompted to generate code directly using one-shot prompting. Although these models may offer built-in RAG or web search capabilities, they are not connected to the private domain database constructed in this study, representing general generation approaches that lack deep customization of domain knowledge.
Group B: Vector retrieval augmentation. DeepSeek-R1 is combined with the vector database constructed in this study for retrieval-augmented generation. This group represents a single enhancement approach that only introduces unstructured knowledge.
Group C: MPC-Coder. The dual-knowledge enhanced multi-agent system proposed in this paper. This group represents the complete solution with fusion of structured and unstructured knowledge, as well as closed-loop verification capabilities.

4.2.2. Comparison Results

Table 2 presents the performance comparison of each method across 50 tasks.
Figure 7 illustrates the performance differences of each method across the three metrics in a bar chart form.
The experimental results demonstrate the following:
First, the necessity of domain knowledge injection. Despite possessing built-in RAG capabilities, general-purpose LLMs (GPT-4o, DeepSeek-R1) perform poorly on industrial control code generation tasks, with syntactic correctness below 50% and functional consistency below 25%. This indicates that general knowledge bases cannot adequately cover the specialized specifications of PLC programming, and deep customization of domain knowledge is essential.
Second, the advantages of dual-knowledge fusion. Compared to Vector-RAG, which only utilizes vector retrieval, MPC-Coder achieves improvements of 26 percentage points in syntactic correctness and 22 percentage points in functional consistency. This demonstrates that the structured constraints provided by the knowledge graph can effectively compensate for the deficiencies of vector retrieval in logical rigor.
Third, the effectiveness of closed-loop verification. MPC-Coder achieved 98% syntactic correctness (49/50) in initial generation, with only one task containing syntax errors that were resolved after a single repair iteration, ultimately reaching 100%. This indicates that the dual-knowledge enhancement mechanism itself has significantly improved code quality, while the closed-loop verification mechanism ensures complete correctness of the final output. For the functional consistency metric (78%), the 95% confidence interval calculated using the Wald method is [66.5%, 89.5%]. While the modest sample size (N = 50) yields a relatively wide interval, the lower bound still substantially exceeds baseline performance (18–22%), supporting the robustness of our findings.

4.3. Ablation Study

To quantify the contribution of each module, this section conducts ablation experiments to evaluate system performance after removing the knowledge graph, vector database, and closed-loop verification, respectively.

4.3.1. Ablation Settings

w/o Knowledge Graph: The knowledge graph module is removed. The planning agent still performs control logic design tasks but no longer retrieves structured constraints from the knowledge graph, relying solely on the reasoning capabilities of the LLM itself.
w/o Vector Database: The vector database module is removed. The coding agent no longer retrieves code templates and function block references, relying only on the structured information provided by the knowledge graph and the built-in RAG capabilities of the LLM.
w/o Closed-Loop Verification: The invocation of formal verification tools is removed. The verification agent and fixing agent still exist but cannot invoke the ruSTy compiler and nuXmv model checker, relying solely on the self-reflection capability of the LLM for code review. The fixing agent retains the ability to retrieve reference code from the vector database.

4.3.2. Ablation Results

Table 3 presents the ablation experiment results.
Figure 8 visually presents the performance variation trends of the system after removing different modules.

4.3.3. Analysis of Module Contributions

The first result is the critical role of closed-loop verification. After removing closed-loop verification, system performance exhibits the most significant decline, with syntactic correctness dropping from 100% to 74% and functional consistency dropping from 78% to 42%. This indicates that single-pass generated code often contains errors, and the objective feedback provided by formal verification tools is critical for iterative correction. Relying solely on LLM self-reflection is insufficient for effectively identifying deeper logic issues.
The second result is the constraining role of the knowledge graph. After removing the knowledge graph, syntactic correctness remains at 100% (benefiting from closed-loop verification), but functional consistency drops from 78% to 56%, a decrease of 22 percentage points. Analysis of failure cases reveals that the primary issues include process parameter settings exceeding physical boundaries and missing equipment interlock logic. This demonstrates that the structured constraints provided by the knowledge graph are essential for ensuring the correctness of the control logic. Regarding the quality of the knowledge graph itself, this study adopted a construction approach combining LLM-assisted extraction with expert review, where all entities and relations were ultimately verified by domain experts.
The third result is the referential role of the vector database. After removing the vector database, syntactic correctness drops from 100% to 92%, and functional consistency drops from 78% to 62%. Without code template references, the generated code shows less structural standardization, increasing the probability of compilation errors. To further verify the effectiveness of the retrieval strategy, we conducted manual relevance annotation on Top-5 retrieval results for 20 randomly selected tasks (100 code snippets in total), achieving a Precision@5 of 90%. This indicates that the implementation references provided by vector retrieval make positive contributions to code quality.
To provide a reproducible breakdown of these contributions, we categorized detected errors into four types. Table 4 presents the error categories with definitions, instance counts, and representative examples.
The error category analysis further confirms these roles: E1 errors are largely mitigated by knowledge graph constraints (71% resolved), E4 errors are fully resolved through closed-loop verification, while E3 errors remain the most challenging (20% resolved), indicating the need for enhanced temporal reasoning capabilities.
To contextualize these findings against related work, Table 5 compares MPC-Coder with the closest system classes discussed in Section 2, using our experimental configurations as proxies.
The comparison reveals that RAG-based and verification-assisted methods achieve identical functional consistency (56%) through different mechanisms, yet neither alone reaches the level attained by MPC-Coder (78%). This 22-percentage-point improvement demonstrates that structured knowledge constraints and closed-loop verification are complementary rather than substitutable.
In summary, the performance improvement of MPC-Coder stems from the synergistic effect of the three modules: the knowledge graph provides logical constraints, the vector database provides implementation references, and the closed-loop verification provides error correction capabilities. The error category analysis (Table 4) and system class comparison (Table 5) provide quantitative evidence for these complementary roles.

4.4. Convergence Analysis

To examine the iterative convergence characteristics of the closed-loop verification mechanism, this section analyzes the variation in error rates during multiple rounds of the “verification–repair” process.

4.4.1. Evaluation Metric

The functional failure rate (FFR) is defined as follows:
\mathrm{FFR}(t) = 1 - \frac{N_{\mathrm{pass}}(t)}{N_{\mathrm{total}}}
where N_pass(t) denotes the number of tasks passing formal verification after the t-th iteration, and N_total denotes the total number of tasks.
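The metric is straightforward to compute per iteration. The sketch below reproduces the full-system trajectory reported in Section 4.4.3 (26, 33, 36, 38, and 39 of 50 tasks passing after iterations 0 through 4).

```python
# Sketch of the FFR computation using the full-system pass counts from Section 4.4.3.
def ffr(n_pass_by_iteration, n_total):
    """n_pass_by_iteration[t]: tasks passing formal verification after iteration t."""
    return [round(1 - n_pass / n_total, 2) for n_pass in n_pass_by_iteration]

print(ffr([26, 33, 36, 38, 39], 50))   # [0.48, 0.34, 0.28, 0.24, 0.22]
```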

4.4.2. Iteration Trajectory

We recorded the FFR of each system variant on the full task set across five iterations, with the results shown in Table 6.
Figure 9 illustrates the convergence trajectories of the functional failure rates for each system variant during the iteration process.

4.4.3. Convergence Behavior Analysis

First, the full system converges rapidly. In total, 52% of tasks (26/50) passed verification at initial generation; this increased to 66% after one repair iteration, 72% after two, 76% after three, and 78% after four. The average number of iterations for successful tasks was only 0.59, while tasks requiring repair averaged 1.77 iterations, with a maximum of 4. For the 39 successful tasks, the median number of iterations was 0, indicating that the majority of tasks required no repair. The average total processing time was 172.4 s per task. This indicates that formal verification tools can precisely locate errors, guiding the fixing agent to correct problems efficiently.
Second, the absence of the knowledge graph limits the convergence ceiling. After removing the knowledge graph, although the system can still reduce error rates through iteration, it ultimately stabilizes at a relatively high level of 44%. Analysis reveals that without structured constraints, certain physical parameter errors cannot be autonomously detected by the model, forming repair blind spots.
Third, the absence of closed-loop verification leads to slow convergence. After removing closed-loop verification, the error rate decreases extremely slowly (only 18 percentage points over five rounds) and ultimately remains at a high level of 58%. This demonstrates that the “open-loop” mode relying solely on LLM self-reflection is ineffective for error correction.
The convergence analysis demonstrates that the closed-loop verification mechanism provides the driving force for iterative correction, while the knowledge graph defines the quality ceiling for convergence. The synergy of both ensures effective evolution of the system from initial generation toward correct code.

5. Conclusions and Future Work

5.1. Research Summary

This paper addresses the problems of domain knowledge deficiency and difficulty in ensuring generation quality in automated industrial control code generation, proposing a dual-knowledge enhanced multi-agent system (MPC-Coder).
At the knowledge representation level, a dual-knowledge architecture combining structured and unstructured knowledge is constructed. The knowledge graph stores process entities, equipment parameters, and their logical relationships, providing hard constraints; the vector database stores code snippets and function block documentation, providing implementation references. The complementary fusion of these two types of knowledge simultaneously satisfies the requirements for logical precision and implementation diversity in industrial code generation.
At the system functionality level, we design a collaborative framework with five agents: parsing, planning, coding, verification, and fixing. The verification agent integrates a formal verification toolchain (ruSTy/PLCverif/nuXmv), enabling syntax checking and logic verification of generated code; the fixing agent performs iterative corrections based on verification feedback. Through the “generation–verification–repair” closed-loop mechanism, the reliability of the generated code is improved.

5.2. Main Conclusions

Based on experimental validation in thermal processing control scenarios, the following main conclusions are drawn:
First, dual-knowledge fusion effectively improves code quality. Experiments demonstrate that MPC-Coder achieves a functional consistency rate of 78%, compared to 56% for methods using only vector retrieval augmentation, an improvement of 22 percentage points. The structured constraints provided by the knowledge graph effectively compensate for the deficiencies of vector retrieval in logical rigor, avoiding issues such as process parameter boundary violations and missing interlock logic.
Second, closed-loop verification is critical for quality assurance. Ablation experiments show that after removing closed-loop verification, syntactic correctness drops from 100% to 74%, and functional consistency drops from 78% to 42%. The objective feedback provided by formal verification tools is critical for iterative correction, since LLM self-reflection alone cannot reliably catch deeper logic errors.
Third, the system exhibits favorable convergence characteristics. Convergence analysis demonstrates that most repairs succeed within the first two iterations, with the functional failure rate of the full system decreasing from an initial 48% to 22% over four repair iterations. Closed-loop verification provides the driving force for iterative correction, while the knowledge graph defines the quality ceiling for convergence. The synergy of both ensures effective system evolution.

5.3. Limitations and Future Work

This study has the following limitations:
First, knowledge graph construction relies on domain expert involvement. The current ontology design and data validation of the knowledge graph require support from process experts, and the degree of automation needs improvement. When applied to new process domains, certain knowledge engineering costs are required.
Second, the coverage of verification specifications is limited. Current formal verification primarily targets predefined safety specifications, and the system’s detection capability for implicit constraints that are not explicitly expressed in requirements remains insufficient.
Third, the agent collaboration follows a linear process. The current five agents execute in a fixed sequence, lacking dynamic scheduling and parallel collaboration capabilities, which limits efficiency when handling large-scale complex tasks.
Regarding system scalability, the multi-agent framework and closed-loop verification mechanism proposed in this study are domain-agnostic and can be directly reused for other manufacturing scenarios. When extending to new domains, the main work involves constructing the corresponding domain knowledge graph, populating the vector database, adjusting LTL verification specifications, and adapting some prompt templates. Currently, our team is applying this system to automated storage and retrieval system control and film production line control projects, preliminarily validating the transferability of the framework.
Future work may address these limitations in several directions:
First, automatic construction and evolution of knowledge graphs. Research on LLM-based automatic knowledge extraction and graph completion techniques can reduce the manual cost of knowledge engineering. Establishing continuous update mechanisms for knowledge graphs will enable the system to accumulate new knowledge from practical applications.
Second, enhancement of verification capabilities. Exploring methods for automatically generating formal specifications from natural language requirements can improve the detection of implicit constraints. Introducing runtime verification techniques will enable more comprehensive quality assessment of generated code.
Third, optimization of agent collaboration mechanisms. Research on dynamic scheduling and parallel collaboration strategies for multi-agent systems can improve efficiency in handling complex tasks. Exploring adversarial collaboration mechanisms, where agents mutually verify each other, can further enhance generation quality.
Fourth, extension to multimodal inputs. The current system primarily accepts text and parameter tables as inputs. Future work may consider introducing parsing capabilities for visual information such as P&ID diagrams and electrical schematics, enabling end-to-end generation from engineering design documents to control code.
In conclusion, the dual-knowledge enhanced multi-agent system proposed in this paper provides a viable technical pathway for the application of LLMs in the industrial control domain. Through effective injection of domain knowledge and establishment of closed-loop verification mechanisms, the system can leverage the generative capabilities of large models while ensuring the reliability of industrial code, offering new perspectives for automation of control system development in the context of intelligent manufacturing.

Author Contributions

Conceptualization, Y.Z.; methodology, Y.Z. and W.X.; software, W.X.; validation, Y.Z. and W.X.; formal analysis, W.X.; investigation, Y.Z. and B.Z.; resources, Y.Z.; data curation, W.X.; writing—original draft preparation, Y.Z. and W.X.; writing—review and editing, W.X., B.Z., T.Y. and X.Y.; visualization, W.X.; supervision, X.Y.; project administration, Y.Z. and X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All the data supporting the findings of this study can be made available upon reasonable request.

Conflicts of Interest

Author Yinggang Zhang was employed by the China Academy of Machinery Science & Technology. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Zhou, J.; Li, P.G.; Zhou, Y.H.; Wang, B.C.; Zang, J.Y.; Meng, L. Toward New-Generation Intelligent Manufacturing. Engineering 2018, 4, 11–20. [Google Scholar] [CrossRef]
  2. Zhou, J.; Zhou, Y.H.; Wang, B.C.; Zang, J.Y. Human-Cyber-Physical Systems (HCPSs) in the Context of New-Generation Intelligent Manufacturing. Engineering 2019, 5, 624–636. [Google Scholar] [CrossRef]
  3. IEC 61131-3:2013; Programmable Controllers—Part 3: Programming Languages. IEC: Geneva, Switzerland, 2013.
  4. Tiegelkamp, M.; John, K.-H. IEC 61131-3: Programming Industrial Automation Systems; Springer: Berlin/Heidelberg, Germany, 2010; Volume 166. [Google Scholar]
  5. Walters, E.G.; Bryla, E.J. Software Architecture and Framework for Programmable Logic Controllers: A Case Study and Suggestions for Research. Machines 2016, 4, 13. [Google Scholar] [CrossRef]
  6. Dai, W.W.; Vyatkin, V. A Case Study on Migration from IEC 61131 PLC to IEC 61499 Function Block Control. In Proceedings of the 7th IEEE International Conference on Industrial Informatics, Cardiff, UK, 23–26 June 2009; IEEE: New York, NY, USA, 2009; pp. 79–84. [Google Scholar]
  7. Renard, D.; Saddem, R.; Annebicque, D.; Riera, B. From Sensors to Digital Twins toward an Iterative Approach for Existing Manufacturing Systems. Sensors 2024, 24, 1434. [Google Scholar] [CrossRef] [PubMed]
  8. Chen, P.; Liu, X.; Wang, Y. Fine-Tune LLMs for PLC Code Security: An Information-Theoretic Analysis. Mathematics 2025, 13, 3211. [Google Scholar] [CrossRef]
  9. Haider, S.A.; Prabha, S.; Cabello, C.A.G.; Genovese, A.; Collaco, B.; Wood, N.; London, J.; Bagaria, S.; Tao, C.; Forte, A.J. The Development and Evaluation of a Retrieval-Augmented Generation Large Language Model Virtual Assistant for Postoperative Instructions. Bioengineering 2025, 12, 1219. [Google Scholar] [CrossRef]
  10. Kizi, M.K.Z.; Suh, Y. Design and Performance Evaluation of LLM-Based RAG Pipelines for Chatbot Services in International Student Admissions. Electronics 2025, 14, 3095. [Google Scholar] [CrossRef]
  11. Fakih, M.; Dharmaji, R.; Moghaddas, Y.; Araya, G.Q.; Ogundare, O.; Al Faruque, M.A.; Assoc Computing, M. LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems. In Proceedings of the ACM/IEEE 46th International Conference on Software Engineering—Software Engineering in Practice (ICSE-SEIP), Lisbon, Portugal, 14–20 April 2024; Association for Computing Machinery: New York, NY, USA, 2024; pp. 192–203. [Google Scholar]
  12. Wang, R.; Gu, M.; Song, X.Y.; Wan, H. Formal Specification and Code Generation of Programable Logic Controllers. In Proceedings of the 14th IEEE International Conference on Engineering Complex Computer Systems, Potsdam, Germany, 2–4 June 2009; IEEE: New York, NY, USA, 2009; p. 102. [Google Scholar]
  13. Tikhonov, D.; Schütz, D.; Ulewicz, S.; Vogel-Heuser, B. Towards Industrial Application of Model-driven Platform-independent PLC Programming Using UML. In Proceedings of the 40th Annual Conference of the IEEE-Industrial-Electronics-Society (IECON), Dallas, TX, USA, 29 October–1 November 2014; IEEE: New York, NY, USA, 2014; pp. 2638–2644. [Google Scholar]
  14. Thapa, D.; Park, C.M.; Park, S.C.; Wang, G.N. Auto-Generation of IEC Standard PLC Code Using t-MPSG. Int. J. Control Autom. Syst. 2009, 7, 165–174. [Google Scholar] [CrossRef]
  15. Swartjes, L.; van Beek, D.A.; Fokkink, W.J.; van Eekelen, J. Model-based design of supervisory controllers for baggage handling systems. Simul. Model. Pract. Theory 2017, 78, 28–50. [Google Scholar] [CrossRef]
  16. Steinegger, M.; Zoitl, A. Automated Code Generation for Programmable Logic Controllers based on Knowledge Acquisition from Engineering Artifacts: Concept and Case Study. In Proceedings of the 17th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), AGH Univ Sci & Technol, Krakow, Poland, 17–21 September 2012; IEEE: New York, NY, USA, 2012. [Google Scholar]
  17. Prenzel, L.; Provost, J. PLC Implementation of Symbolic, Modular Supervisory Controllers. IFAC-PapersOnLine 2018, 51, 304–309. [Google Scholar] [CrossRef]
  18. Pavlovskyi, Y.; Kennel, M.; Schmucker, U. Template-Based Generation of PLC Software from Plant Models Using Graph Representation. In Proceedings of the 25th International Conference on Mechatronics and Machine Vision in Practice (M2VIP), Stuttgart, Germany, 20–22 November 2018; IEEE: New York, NY, USA, 2018; pp. 278–285. [Google Scholar]
  19. Julius, R.; Trenner, T.; Neidig, J.; Fay, A. A model-driven approach for transforming GRAFCET specification into PLC code including hierarchical structures. IFAC-PapersOnLine 2019, 52, 1767–1772. [Google Scholar] [CrossRef]
  20. Cheng, C.H.; Huang, C.H.; Ruess, H.; Stattelmann, S. G4LTL-ST: Automatic Generation of PLC Programs. In Proceedings of the 26th International Conference on Computer Aided Verification (CAV) Held as Part of the Vienna Summer of Logic (VSL), Vienna Univ Technol, Vienna, Austria, 18–22 July 2014; Springer: Cham, Switzerland, 2014; pp. 541–549. [Google Scholar]
  21. Svyatkovskiy, A.; Deng, S.K.; Fu, S.Y.; Sundaresan, N. IntelliCode Compose: Code Generation using Transformer. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Virtual, 8–13 November 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1433–1443. [Google Scholar]
  22. Tran, K.; Zhang, J.X.; Pfeiffer, J.; Wortmann, A.; Wiesmayr, B. Generating PLC Code with Universal Large Language Models. In Proceedings of the 29th Conference on Emerging Technologies and Factory Automation-ETFA-Annual, Padova, Italy, 10–13 September 2024. [Google Scholar]
  23. Haag, A.; Fuchs, B.; Kacan, A.; Lohse, O.; Ieee Computer, S.O.C. Training LLMs for Generating IEC 61131-3 Structured Text with Online Feedback. In Proceedings of the 2025 International Workshop on Large Language Models for Code-LLM4Code, Ottawa, Canada, 3 May 2025; IEEE: New York, NY, USA, 2025; pp. 65–71. [Google Scholar]
  24. Koziolek, H.; Grüner, S.; Hark, R.; Ashiwal, V.; Linsbauer, S.; Eskandani, N. LLM-based and Retrieval-Augmented Control Code Generation. In Proceedings of the 1st International Workshop on Large Language Models for Code (LLM4Code), Lisbon, Portugal, 20 April 2024; ACM: New York, NY, USA, 2024; pp. 22–29. [Google Scholar]
  25. Yang, L.Y.; Chen, H.Y.; Li, Z.; Ding, X.; Wu, X.D. Give us the Facts: Enhancing Large Language Models with Knowledge Graphs for Fact-Aware Language Modeling. IEEE Trans. Knowl. Data Eng. 2024, 36, 3091–3110. [Google Scholar] [CrossRef]
  26. Ji, S.W.; Liu, L.F.; Xi, J.Z.; Zhang, X.X.; Li, X.L. KLR-KGC: Knowledge-Guided LLM Reasoning for Knowledge Graph Completion. Electronics 2024, 13, 5037. [Google Scholar] [CrossRef]
  27. An, Y.M.; Qin, F.W.; Sun, D.F.; Wu, H.F. A multi-facets ontology matching approach for generating PLC domain knowledge graphs. IFAC-PapersOnLine 2020, 53, 10929–10934. [Google Scholar] [CrossRef]
  28. Zhao, Z.L.; Zhang, N.; Yu, B.; Duan, Z.H. Generating Java code pairing with ChatGPT. Theor. Comput. Sci. 2024, 1021, 20. [Google Scholar] [CrossRef]
  29. Ye, S.X.; Cai, L.W.; Zhang, Y.W.; Xin, X.Q.; Jiang, B.; Qi, L. Intelligent Q&A System for Welding Processes Based on a Symmetric KG-DB Hybrid-RAG Strategy. Symmetry 2025, 17, 1994. [Google Scholar] [CrossRef]
  30. Islam, M.A.; Ali, M.E.; Parvez, M.R. MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. In Proceedings of the 62nd Annual Meeting of the Association-for-Computational-Linguistics (ACL)/Student Research Workshop (SRW), Bangkok, Thailand, 11–16 August 2024; Association for Computational Linguistics: Bangkok, Thailand, 2024; pp. 4912–4944. [Google Scholar]
  31. Bai, X.Y.; Huang, S.B.; Wei, C.; Wang, R. Collaboration between intelligent agents and large language models: A novel approach for enhancing code generation capability. Expert Syst. Appl. 2025, 269, 19. [Google Scholar] [CrossRef]
  32. Dong, Y.H.; Jiang, X.; Jin, Z.; Li, G. Self-Collaboration Code Generation via ChatGPT. ACM Trans. Softw. Eng. Methodol. 2024, 33, 38. [Google Scholar] [CrossRef]
  33. Zhang, K.C.; Li, J.; Li, G.; Shi, X.J.; Jint, Z. CODEAGENT: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. In Proceedings of the 62nd Annual Meeting of the Association-for-Computational-Linguistics (ACL)/Student Research Workshop (SRW), Bangkok, Thailand, 11–16 August 2024; Association for Computational Linguistics: Bangkok, Thailand, 2024; pp. 13643–13658. [Google Scholar]
  34. Koziolek, H.; Ashiwal, V.; Bandyopadhyay, S.; Chandrika, K.R. Automated Control Logic Test Case Generation using Large Language Models. In Proceedings of the 29th Conference on Emerging Technologies and Factory Automation-ETFA-Annual, Padova, Italy, 10–13 September 2024. [Google Scholar]
  35. He, W.G.; Shi, J.Q.; Su, T.; Lu, Z.Y.; Hao, L.; Huang, Y.H. Automated test generation for IEC 61131-3 ST programs via dynamic symbolic execution. Sci. Comput. Program. 2021, 206, 12. [Google Scholar] [CrossRef]
  36. Fink, X.; Mavridou, A.; Katis, A.; Adiego, B.F. Verifying PLC Programs via Monitors: Extending the Integration of FRET and PLCverif. In Proceedings of the 16th International Symposium on NASA Formal Methods (NFM), Moffett Field, CA, USA, 4–6 June 2024; Springer: Cham, Switzerland, 2024; pp. 427–435. [Google Scholar]
  37. Cavada, R.; Cimatti, A.; Dorigatti, M.; Griggio, A.; Mariotti, A.; Micheli, A.; Mover, S.; Roveri, M.; Tonetta, S. The nuXmv Symbolic Model Checker. In Proceedings of the 26th International Conference on Computer Aided Verification (CAV), Vienna, Austria, 18–22 July 2014; Springer: Cham, Switzerland, 2014; pp. 334–342. [Google Scholar]
  38. Wang, K.; Wang, J.Y.; Poskitt, C.M.; Chen, X.X.; Sun, J.; Cheng, P. K-ST: A Formal Executable Semantics of the Structured Text Language for PLCs. IEEE Trans. Softw. Eng. 2023, 49, 4796–4813. [Google Scholar] [CrossRef]
  39. Zheng, X.Y.; Kong, Y.; Chang, T.T.; Liao, X.; Ma, Y.W.; Du, Y. High-Throughput Computing Assisted by Knowledge Graph to Study the Correlation between Microstructure and Mechanical Properties of 6XXX Aluminum Alloy. Materials 2022, 15, 5296. [Google Scholar] [CrossRef] [PubMed]
  40. Trelles, E.G.; Schweizer, C.; Thomas, A.; von Hartrott, P.; Janka-Ramm, M. Digitalizing Material Knowledge: A Practical Framework for Ontology-Driven Knowledge Graphs in Process Chains. Appl. Sci. 2024, 14, 11683. [Google Scholar] [CrossRef]
  41. Li, L.; Liang, J.X.; Li, C.L.; Liu, Z.; Wei, Y.Y.; Ji, Z.Y. Construction of a Machining Process Knowledge Graph and Its Application in Process Route Recommendation. Electronics 2025, 14, 3156. [Google Scholar] [CrossRef]
  42. AMS2770R; Heat Treatment of Wrought Aluminum Alloy Parts. SAE International: Warrendale, PA, USA, 2020.
Figure 1. Overall architecture of the MPC-Coder system.
Figure 2. Knowledge graph construction workflow.
Figure 3. Visualization of the thermal processing knowledge graph.
Figure 4. Vector database construction workflow.
Figure 5. ReAct-based knowledge graph query workflow of the planning agent.
Figure 6. End-to-end code generation example.
Figure 7. Overall performance comparison across different methods.
Figure 8. Ablation study results.
Figure 9. Convergence curves of functional failure rate across iterations.
Table 1. Experimental environment configuration.
Category | Item | Specification/Version
Hardware | CPU | Intel Core i9-14900K
Hardware | GPU | NVIDIA GeForce RTX 4090D (24 GB)
Hardware | Memory | 64 GB DDR5
Hardware | Storage | 2 TB NVMe SSD
Software | Operating System | Ubuntu 22.04.4 LTS
Software | Programming Language | Python 3.10.12
Software | Framework | LangChain 0.1.0
Knowledge Base | Graph Database | Neo4j Community 5.15.0
Knowledge Base | Vector Database | ChromaDB 0.6.3
Verification Tools | Compiler | ruSTy (https://github.com/PLC-lang/rusty, accessed on 20 January 2026)
Verification Tools | Model Checking | nuXmv (Fondazione Bruno Kessler, Trento, Italy) / PLCverif (CERN, Geneva, Switzerland)
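As a rough illustration of how the toolchain listed in Table 1 could be driven from the Python side of the pipeline, the sketch below wraps a compiler invocation with the standard subprocess module. The binary name plc and its command-line usage are assumptions made purely for this example and should be checked against the ruSTy documentation; the sketch only shows the shape of a syntax-check step, not the system's actual verification harness.

```python
import subprocess
from pathlib import Path

def syntax_check(st_file: Path) -> tuple[bool, str]:
    """Run the ST compiler on a source file and report success plus diagnostics.

    NOTE: the command name "plc" and its argument below are illustrative
    assumptions, not the documented ruSTy CLI.
    """
    result = subprocess.run(
        ["plc", str(st_file)],   # hypothetical compiler invocation
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr

if __name__ == "__main__":
    ok, diagnostics = syntax_check(Path("furnace_control.st"))
    print("syntax OK" if ok else f"compiler diagnostics:\n{diagnostics}")
```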
Table 2. Performance metrics of compared methods.
Method | Syntactic Correctness (%) | Functional Consistency (%) | Pass@3 (%)
DeepSeek-R1 | 46 | 18 | 22
GPT-4o | 40 | 22 | 24
Vector-RAG | 74 | 56 | 58
MPC-Coder | 100 | 78 | 82
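The Pass@3 column in Table 2 can be read with the widely used unbiased pass@k estimator. The snippet below is a minimal sketch assuming that convention (three generations per task, success if at least one passes the checks), which the table itself does not spell out.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples drawn
    from n generations, c of which are correct, passes the checks."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 3 generations per task, a single correct generation already yields
# pass@3 = 1.0 for that task, while zero correct generations yields 0.0.
print(pass_at_k(n=3, c=1, k=3), pass_at_k(n=3, c=0, k=3))
```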
Table 3. Ablation experiment results.
System Variant | Syntactic Correctness (%) | Functional Consistency (%) | Pass@3 (%)
Full System | 100 | 78 | 82
w/o Knowledge Graph | 100 | 56 | 60
w/o Vector Database | 92 | 62 | 66
w/o Closed-Loop Verification | 74 | 42 | 46
Table 4. Error category distribution.
Category | Definition | Initial | Final
E1: Missing Input Validation | Unchecked parameter boundaries | 7 | 2
E2: Insecure State Machines | Bypassed interlocks or illegal transitions | 9 | 5
E3: Timing/Control Flow Errors | Race conditions or sequencing issues | 5 | 4
E4: Duplicate Writes | Conflicting output assignments in one cycle | 4 | 0
Note: Counts represent error instances; one task may contain multiple error types.
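To make the E4 category in Table 4 concrete, the illustrative check below flags variables that receive more than one assignment in a Structured Text body, a rough line-based proxy for conflicting writes within a single scan cycle. It is a simplified teaching example, not the formal analysis performed by the verification tools in the closed loop.

```python
import re
from collections import Counter

def find_duplicate_writes(st_code: str) -> list[str]:
    """Return variables assigned more than once in an ST program body.

    A simplified, line-based heuristic for exposition only; the actual
    system relies on compilation and formal verification, not patterns.
    """
    assigned = Counter()
    for line in st_code.splitlines():
        # Match "variable := expression;" assignments (ignores nested logic).
        m = re.match(r"\s*([A-Za-z_]\w*)\s*:=", line)
        if m:
            assigned[m.group(1)] += 1
    return [var for var, count in assigned.items() if count > 1]

example = """
Heater_On := TRUE;
Valve_Open := Temp > 520.0;
Heater_On := FALSE;  (* conflicting write in the same scan cycle *)
"""
print(find_duplicate_writes(example))  # ['Heater_On']
```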
Table 5. Capability comparison across system classes.
System Class | Constraint Enforcement | Logic Repair | Functional Consistency
Koziolek et al. [24] | Low (probabilistic) | None (open-loop) | 56% (Vector-RAG)
LLM4PLC [11] | Low (probabilistic) | High (formal verification) | 56% (w/o KG)
MPC-Coder (ours) | High (KG-based) | High (formal verification) | 78% (Full System)
Note: Direct comparison with prior work is infeasible due to different datasets. Functional consistency is approximated using the configurations from Section 4.2 and Section 4.3 that represent each system class.
Table 6. Functional failure rate across repair iterations R1–R5 (reported as a fraction of tasks).
System Variant | R1 | R2 | R3 | R4 | R5
Full System | 0.48 | 0.34 | 0.28 | 0.24 | 0.22
w/o Knowledge Graph | 0.68 | 0.58 | 0.50 | 0.46 | 0.44
w/o Vector Database | 0.60 | 0.50 | 0.44 | 0.40 | 0.38
w/o Closed-Loop Verification | 0.76 | 0.68 | 0.64 | 0.60 | 0.58
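Readers who wish to re-plot the convergence behaviour summarised in Figure 9 can do so directly from the Table 6 values; the matplotlib sketch below is an illustrative script and not part of the published pipeline.

```python
import matplotlib.pyplot as plt

# Functional failure rates per repair iteration, taken directly from Table 6.
rounds = [1, 2, 3, 4, 5]
curves = {
    "Full System":                  [0.48, 0.34, 0.28, 0.24, 0.22],
    "w/o Knowledge Graph":          [0.68, 0.58, 0.50, 0.46, 0.44],
    "w/o Vector Database":          [0.60, 0.50, 0.44, 0.40, 0.38],
    "w/o Closed-Loop Verification": [0.76, 0.68, 0.64, 0.60, 0.58],
}

for label, values in curves.items():
    plt.plot(rounds, values, marker="o", label=label)

plt.xlabel("Repair iteration")
plt.ylabel("Functional failure rate")
plt.xticks(rounds)
plt.legend()
plt.title("Convergence of functional failure rate (data from Table 6)")
plt.tight_layout()
plt.show()
```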