Anomaly Detection Method Considering PLC Control Logic Structure for ICS Cyber Threat Detection

Lee, Ju Hyeon; Ji, Il Hwan; Jeon, Seung Ho; Seo, Jung Taek

doi:10.3390/app15073507


by Ju Hyeon Lee ¹, Il Hwan Ji ¹, Seung Ho Jeon ² and Jung Taek Seo ²,*

¹ Department of Information Security, Gachon University, Seongnam-daero 1342, Seongnam-si 13120, Republic of Korea
² Department of Smart Security, Gachon University, Seongnam-daero 1342, Seongnam-si 13120, Republic of Korea
* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(7), 3507; https://doi.org/10.3390/app15073507

Submission received: 3 March 2025 / Revised: 15 March 2025 / Accepted: 20 March 2025 / Published: 23 March 2025

(This article belongs to the Special Issue Advances in Attack Detection and Secure State Estimation for Cyber–Physical Systems (CPS))


Abstract

Anomaly detection systems are being studied to detect cyberattacks in industrial control systems (ICSs). Existing ICS anomaly detection systems monitor network packets or operational data. However, these anomaly detection systems cannot detect control logic targeted attacks such as Stuxnet. Control logic tampering detection studies also exist, but they detect code modifications rather than determining whether the logic is normal. These tampering detection methods classify control logic as abnormal if any code modifications occur, even if the logic represents normal behavior. For this reason, this paper proposes an anomaly detection method that considers the structure of control logic. The proposed embedding method performs embedding based on control logic Instruction List (IL) code. The opcode and operand of IL code use separate embedding models. The embedded vectors are then sequentially combined to preserve the IL structure. The proposed method was validated using Long Short-Term Memory (LSTM), LSTM-Autoencoder, and Transformer models with a dataset of normal and malicious control logic. All models achieved an anomaly detection performance with an F1 score of at least 0.81. Additionally, models adopting the proposed embedding method outperformed those using conventional embedding methods by 0.088259. The proposed control logic anomaly detection method enables the model to learn the context and structure of control logic and identify code with inherent vulnerabilities.

Keywords:

industrial control systems; anomaly detection; control logic; programmable logic controller cyber security

1. Introduction

With the advent of the Fourth Industrial Revolution, various technologies and systems have been introduced to industrial control systems (ICSs). While these advancements enhance ICS efficiency, they also increase external network exposure, leading to a surge in cyberattacks [1]. Recent studies on ICS cybersecurity have focused on AI-based anomaly detection systems to detect increasingly sophisticated and complex ICS cyberattacks [2]. AI-based anomaly detection systems in ICSs typically monitor network packets or process data to detect anomalies. Network packet-based anomaly detection systems cannot monitor all network segments in ICSs, so they collect and monitor network packets at specific segments or root switches [3]. Process data-based anomaly detection systems monitor power and sensor data generated within ICSs [4]. However, network packet-based or process data-based anomaly detection systems cannot detect programmable logic controller (PLC) [5] control-logic-tampering attacks such as Stuxnet [6]. Stuxnet physically accessed the system via a Universal Serial Bus (USB) and installed malware on the Engineering Workstation (EWS). The infected EWS then uploaded malicious control logic to the PLC, causing abnormal operations. Subsequently, the compromised EWS altered abnormal process data to appear normal before transmitting it to the monitoring system. Since Stuxnet used a USB for physical access, it cannot be detected by network packet-based anomaly detection systems. Additionally, since anomalous process data are altered to appear normal, they cannot be detected by process data-based anomaly detection systems. To detect PLC control-logic-tampering attacks, studies have been conducted to identify control logic modifications. However, these tampering detection methods determine only whether the code has been modified, which can lead to false positives by detecting minor changes in normal control logic as unauthorized tampering. Choia, M.K. et al. [7] used encryption algorithms and blockchain technology to verify the integrity of PLC control logic and detect tampering. This study detected tampering by storing the hash values of PLC control logic in the blockchain. Kai Yang et al. [8] proposed a resilience approach to protect PLCs from control logic data tampering attacks. This study detects tampering in transmitted PLC control logic packets using message digests and restores devices from anomalous states. However, these studies determine code modifications rather than assessing the normality of control logic. If normal control logic is modified for maintenance purposes, it may be mistakenly classified as anomalous despite being a legitimate change. Additionally, they require a storage system, such as a blockchain or database, to store normal control logic hashes.

The threat model presented in this study involves a malicious control logic upload attack, which manipulates PLC control logic to induce unintended system behavior. In this attack, an attacker (e.g., an individual with access to EWS) utilizes physical storage media to access EWS and inject malware. The compromised EWS subsequently uploads malicious control logic to the PLC. The primary targets of this attack include industrial processes that rely on PLCs. Malicious control logic can lead to physical damage, operational disruptions, or safety hazards. Such attacks modify anomalous process data to appear normal. Additionally, they inject malicious control logic through trusted communication channels, enabling them to evade traditional anomaly detection systems. Given these challenges, an AI-based control logic anomaly detection method is required to distinguish between legitimate modifications and malicious alterations in PLC control logic.

Therefore, an AI-based control logic anomaly detection technique is needed to detect anomalous control logic that deviates from the normal operating range without requiring additional storage. Since control logic is structured in graph or text format, embedding must be performed to make it suitable for AI model input. However, embedding should preserve the relationships and structure of the code while considering the characteristics of control logic. To perform embedding suitable for control logic, embedding is conducted based on the Instruction List (IL) [9] code of the control logic. The proposed control logic embedding method separates the opcode and operand in the IL code, embedding them individually before combining them. This approach preserves the structure and context of the IL code. Since the resulting embedding vectors reflect the structure of the control logic, the anomaly detection system can learn the control logic structure and effectively detect anomalous control logic.

For this reason, this paper proposes a control logic anomaly detection method that considers the PLC control logic structure to detect ICS cyber threats. The proposed method embeds control logic composed of IL code to enable AI-based learning. The proposed embedding preserves the structural and syntactic relationships of the control logic. The control logic anomaly detection model detects anomalous code within control logic by learning control logic embedding vectors based on deep learning algorithms. This approach enables the identification of which PLC control logic code is anomalous. Furthermore, it allows the detection of code with embedded vulnerabilities. The proposed AI-based control logic anomaly detection method is the first approach of its kind for detecting ICS cyber threats, as it has not been previously explored. The proposed method can detect cyberattacks that conventional anomaly detection systems fail to identify. To validate the proposed method, publicly available datasets containing both normal and anomalous control logic were used for anomaly detection. Experimental results show that all models achieved an F1 score of at least 0.81. The proposed control logic anomaly detection method identifies anomalous code that deviates from the normal control flow. The contributions of this paper are as follows.

Propose an embedding method optimized for PLC control logic by embedding PLC IL code while preserving the structure and context of the control logic;
Develop a control logic-based anomaly detection model to identify anomalous patterns within PLC control logic;
Conduct experiments using 30 normal control logic samples and 30 malicious control logic samples, demonstrating that the proposed control logic anomaly detection method achieved an F1 score of 81% in detecting anomalous control logic code.

The remainder of this paper is organized as follows: Section 2 introduces background knowledge on PLC control logic and existing research on detecting PLC control logic tampering. Section 3 describes a control logic anomaly detection method that considers the structure of PLC control logic. Section 4 verifies the proposed anomaly detection method. Section 5 compares the proposed method with existing control logic tampering detection approaches. Finally, Section 6 concludes the paper.

2. Background and Related Works

Section 2 introduces background knowledge of PLC control logic and analyzes studies on detecting PLC control logic tampering.

2.1. Background

A PLC is an embedded device in industrial control systems that controls field devices such as sensors, actuators, and motors. The PLC receives the current state from input devices such as sensors and switches, processes it through control logic, and controls output devices such as motors and valves [10]. A PLC operates based on control logic downloaded to its memory, which can be uploaded or downloaded via an EWS. A PLC consists of a power supply module, input/output modules, an operating system, memory, and a communication interface [11]. IEC 61131-3 standardizes the PLC programming languages used to develop PLC control logic [12]. Table 1 shows the PLC control logic languages specified in the standard.

Instruction List (IL) and Structured Text (ST) are text-based languages, while Ladder Diagram (LD), Function Block Diagram (FBD), and Sequential Function Chart (SFC) are graphical languages. However, neither the text nor the graphical format is directly suitable as input for a control logic anomaly detection model. Therefore, control logic must be embedded into a suitable format for model input.

2.2. Related Works

Most studies on anomaly detection in PLC control logic focus on detecting control logic tampering. Control logic tampering detection research aims to identify attacks that modify control logic, such as Stuxnet.

Choia, M.K. et al. [7] used encryption algorithms and blockchain technology to monitor PLC tampering. The study aimed to ensure the integrity of PLC logic and protect it from cyber threats such as modifications to deployed logic or configuration values. The code deployed in PLC memory is hashed using a hashing algorithm and recorded in the blockchain. This study was the first to apply blockchain technology to the cybersecurity of nuclear power instrumentation and control (I&C) systems.

Kai Yang et al. [8] proposed a novel resilience method for PLCs to counter data tampering attacks. This study generates message digests for communication data (control logic) between PLCs and determines whether the transmitted data have been tampered with. If tampering is detected, the method uses a data recovery algorithm to resend the data in encrypted form for restoration.

J.C. Lee et al. [13] analyzed the project file structure of PLC control logic to detect manipulation attacks and verify control logic tampering. The proposed PLC control logic tampering detection method stores the control logic from the Engineering Workstation (EWS) and compares it with the extracted control logic from the PLC to check for tampering. The detection method converts LD code into a binary file and identifies discrepancies by comparing it with the extracted logic source code from the PLC.

Yau, Ken et al. [14] proposed a program called the Control Program Logic Change Detector (CPLCD) to detect PLC tampering. CPLCD operates with a set of detection rules to identify and log unintended incidents that disrupt the normal operation of a PLC. It organizes reserved variables and logical expressions used in the PLC, such as I (input), Q (output), and Database, on a per-Rung basis. A Rung represents a single line of control logic code. CPLCD generates detection rules for each Rung to identify modifications in control logic.

Studies utilizing AI for anomaly detection in PLCs have primarily focused on monitoring process data, such as PLC power consumption, input/output signals, CPU usage, PLC log information, and runtime data. Yu-jun Xiao et al. [15] analyzed PLC power consumption to detect malicious software execution as a means of protecting PLCs from cyberattacks. Their study measured power consumption during command execution within the PLC. Using an LSTM-based anomaly detection approach, their method achieved a 99.83% accuracy in detecting malicious code.

Arup Ghosh et al. [16] proposed an error and behavior monitoring tool for PLCs to detect faults and anomalies in PLC operations. Their study constructed a nominal deterministic finite-state automaton-based model of the PLC control process and utilized this model to detect and isolate faults and behavioral anomalies. The system was considered free of faults and behavioral anomalies if there was little to no difference between the observed behavior of PLC input/output signals and the behavior predicted by the model.

Seungjae Han et al. [17] proposed a method for detecting abnormal temporal behavior by monitoring CPU usage to identify denial-of-service (DoS) attacks targeting PLCs. When a temporal anomaly is detected, the proposed method examines function call sequences to identify control flow anomalies, followed by the detection of stack-based buffer overflow attacks.

Chan and Chun Fai [18] proposed a PLC logging-based anomaly detection system to detect anomalous PLC behavior. Their study introduced a new framework and preprocessing method to improve the quality of PLC log data. The extracted data were then used to train an unsupervised machine learning model, enabling precise monitoring of anomalous behavior in PLC systems.

Irfan Ahmed [19] proposed a detection framework that captures and analyzes PLC runtime data to identify control logic attacks. Their study involved discovering PLC vulnerabilities, developing exploits to understand attack methodologies, and subsequently modifying PLC memory using the developed attack techniques. The proposed detection framework detects cyberattacks on PLCs by utilizing memory structures such as stacks to determine whether the control logic has been manipulated.

However, existing PLC anomaly detection methods cannot identify the root cause of control logic anomalies and are vulnerable to attacks that manipulate process data, such as Stuxnet. Existing studies on detecting control logic tampering require a separate control logic repository and a verification system. Additionally, they detect tampering by comparing hashes or individual lines of code. As a result, even minor modifications that do not affect functionality, such as changing the order of variable declarations, can cause normal control logic to be falsely flagged as tampered. For example, if {A = 1, B = 2} is changed to {B = 2, A = 1}, the change is detected as tampering and the logic is classified as anomalous. In other words, existing methods increase false positives by flagging slight variations in control logic within the normal operating range. Moreover, they require dedicated storage for control logic or hash values. For this reason, an AI-based anomaly detection model is needed that analyzes control logic without requiring additional storage. Instead of detecting changes, it should determine whether the control logic operates within a normal range.
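The false-positive behavior of hash-based comparison can be illustrated with a small sketch. It is not taken from any of the cited studies: the declaration syntax, the `digest` and `parse` helpers, and the choice of SHA-256 are assumptions made only for illustration.

```python
import hashlib

def digest(logic_text: str) -> str:
    """SHA-256 digest of the logic text, as a hash-based integrity check might store it."""
    return hashlib.sha256(logic_text.encode("utf-8")).hexdigest()

def parse(logic_text: str) -> dict:
    """Parse hypothetical 'NAME = VALUE' declarations into a dictionary."""
    pairs = (line.split("=") for line in logic_text.splitlines() if line.strip())
    return {name.strip(): value.strip() for name, value in pairs}

# Two hypothetical declaration blocks that differ only in ordering.
baseline = "A = 1\nB = 2\n"
reordered = "B = 2\nA = 1\n"

same_meaning = parse(baseline) == parse(reordered)           # identical assignments
flagged_as_tampered = digest(baseline) != digest(reordered)  # digests differ
print(same_meaning, flagged_as_tampered)
```

The two blocks assign the same values, yet their digests differ, so a pure hash comparison would report tampering.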

3. Proposal for PLC Control Logic Embedding and Anomaly Detection

Section 3 proposes a control logic anomaly detection approach that considers the structure of PLC control logic. First, Section 3.1 introduces an overview of control logic anomaly detection. Section 3.2 explains the method and rationale for converting control logic into IL code. Section 3.3 proposes an embedding technique suitable for control logic written in IL code. Finally, Section 3.4 presents a control logic anomaly detection model using the embedded control logic.

3.1. Overview

The proposed PLC control logic anomaly detection approach targets PLC control logic composed of IL code. It detects anomalies by embedding IL code and training an AI model. Since control logic programming methods are standardized under IEC 61131-3, other control logic languages can be converted into IL, ensuring the generalizability of the proposed method. Figure 1 presents an overview of the proposed PLC control logic anomaly detection approach.

The proposed PLC control logic anomaly detection approach first performs embedding on IL code. Since the opcode and operand of IL code have distinct characteristics, they are separated and embedded using different embedding models. The opcode and operand are then combined to restore the original IL code structure. This embedding method preserves the structure of the control logic, ensuring that the embedded control logic retains the original IL code’s structure and context. Finally, the anomaly detection model learns only the embedded normal control logic and detects anomalous control logic that deviates from the normal range.

Control Logic IL Code Conversion: Before being deployed to the PLC, normal control logic is converted into IL code. The anomaly detection model is trained using this IL code;
Control Logic Embedding: The IL code is separated into opcodes and operands, which are then embedded using different integer encoding and embedding models. If an opcode has multiple operands, the average of the embedded operand values is used. The opcode and operand are then combined to preserve the original IL code structure;
Control Logic Anomaly Detection: The embedded control logic is fed into a deep learning model for training. The model learns normal control logic and predicts expected normal behavior. If the difference between the predicted control logic and the input control logic exceeds a predefined threshold, it is detected as an anomaly.

3.2. Control Logic IL Code Conversion

The proposed PLC control logic embedding method is designed for PLC control logic represented in IL code. However, IEC 61131-3 standardizes multiple PLC programming languages, allowing control logic written in different languages to be systematically converted into IL code. Most PLC manufacturers comply with this standard, ensuring that conversion between IEC 61131-3 languages, including IL, is generally supported. Engineering Workstations (EWSs) typically provide built-in functionality to facilitate this conversion. Therefore, if the control logic follows the IEC 61131-3 standard, it can be transformed into IL code. This ensures that our method can be applied across different PLC programming languages. Figure 2 illustrates an example of converting LD code into IL code.

An opcode refers to the name of an LD function (symbol) or command specified by the user. Operand 1 is the first operand, designated as the input value for the operator. Operand 2 is the second operand, designated as the output value for the operator. Since an operator can have multiple input and output values, operands may contain more than two variables.
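As a rough illustration of this opcode/operand split, the sketch below separates one IL line into its opcode and operand list. The line layout (opcode first, then comma-separated operands), the function name `split_il_line`, and the sample instructions are assumptions for illustration; the exact IL text produced by a given EWS may differ.

```python
from typing import List, Tuple

def split_il_line(line: str) -> Tuple[str, List[str]]:
    """Split one IL line into (opcode, operands).

    Assumes the first whitespace-separated token is the opcode and the
    rest of the line is a comma-separated operand list.
    """
    opcode, _, rest = line.strip().partition(" ")
    operands = [tok.strip() for tok in rest.split(",") if tok.strip()]
    return opcode, operands

# Hypothetical IL lines; variable names are made up.
il_program = ["LD start_button", "ANDN stop_button", "ST motor", "ADD level, offset"]
parsed = [split_il_line(line) for line in il_program]
print(parsed)
```

An opcode without operands yields an empty operand list, and an opcode with several operands yields all of them in order.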

At this stage, normal control logic before being deployed to the PLC is used. If control logic was extracted from a PLC and used, there would be a possibility that it had been tampered with, as seen in Stuxnet-like attacks. Therefore, control logic in a secure state is used as the training dataset.

3.3. Control Logic Embedding

The proposed PLC control logic embedding method was inspired by Asm2Vec [20] and restructured to suit control logic. Asm2Vec is a technique for embedding assembly language. The objective of the Asm2Vec model is to learn vector representations that reflect the semantic similarity of functions and instruction sequences in assembly code. The model aims to maximize log-likelihood to construct a vector space that best preserves the structure and semantics of the code. In other words, the Asm2Vec model seeks to maximize the value of Equation (1) [20].

\sum_{f_n}^{RP} \sum_{seq_i}^{S(f_n)} \sum_{in_j}^{I(seq_i)} \sum_{t_c}^{T(in_j)} \log P(t_c \mid f_n, in_{j-1}, in_{j+1})

(1)

Here, $RP$ represents all functions within the binary code being analyzed, and $f_n$ refers to a function contained in the binary code. $S(f_n)$ is the set of instruction sequences extracted from the function $f_n$, and $seq_i$ refers to a specific instruction sequence extracted from $f_n$. $I(seq_i)$ is the set of individual instructions contained within the sequence $seq_i$, and $in_j$ represents the $j$-th instruction within a given sequence. $T(in_j)$ is the set of individual tokens that make up the instruction $in_j$. Assembly code consists of multiple tokens, such as opcodes and operands. $t_c$ refers to an individual token, which is a specific code included in an opcode or operand. For example, if the instruction is ‘mov eax, ebx’, then ‘mov’, ‘eax’, and ‘ebx’ are each considered tokens. $\log P(t_c \mid f_n, in_{j-1}, in_{j+1})$ represents the log-probability of a specific token $t_c$ occurring given the function $f_n$ and the surrounding instructions $in_{j-1}$ and $in_{j+1}$. In other words, the Asm2Vec model learns to maximize the probability of a specific token appearing within a given context.

The proposed control logic embedding method embeds IL code, which consists of opcodes and operands. Since opcodes and operands have distinct characteristics, they contain only designated data. For example, a command like ‘mov’ can only be assigned to an opcode. Therefore, opcode values cannot appear in operands, and embedding must be performed independently for each. For this reason, opcodes and operands are embedded using separate embedding models. Algorithm 1 presents the proposed control logic embedding process.

Algorithm 1: Control logic embedding

Input: Control logic IL code $X$
Output: Control logic vector $V$
Start:
    $Opcode, Operand \leftarrow X$;
    $O \leftarrow f_{map}(Opcode)$;
    $P \leftarrow f_{map}(Operand)$;
    $E_{Opcode} \leftarrow f_{emb}(O)$;
    $E_{Operand} \leftarrow f_{emb}(P)$;
    $E_{Operand}^{final} \leftarrow \frac{1}{|P_O|} \sum_{p_i \in P_O} E_{Operand}(p_i)$;
    $V \leftarrow [E_{Opcode}, E_{Operand}^{final}]$;
End

The control logic embedding algorithm first takes the control logic IL code $X$ as input. $X$ is then separated into opcodes and operands. To perform integer encoding, the opcodes and operands are mapped to integer indices $O$ and $P$ using the mapping function $f_{map}$. $O$ and $P$ are transformed into the vectorized representations $E_{Opcode}$ and $E_{Operand}$ through the embedding function $f_{emb}$. The embedding dimension is determined by the configuration of $f_{emb}$. The final value of $E_{Operand}$, denoted as $E_{Operand}^{final}$, is computed by summing the embedding vectors $E_{Operand}(p_i)$ assigned to the opcode and averaging them based on the number of operands in the set, $|P_O|$. In other words, when an opcode has multiple operands, the final vector representation of the operands is defined as the average of their embedding vectors. Finally, to preserve the IL code format, the generated $E_{Opcode}$ and $E_{Operand}^{final}$ are sequentially combined to form the control logic vector $V$. This process ensures that the IL code structure is maintained while minimizing information loss during the vectorization process. Figure 3 is an example of the proposed control logic embedding approach.

The control logic embedding process consists of five steps:

1. The IL code is divided into opcode and operands.
2. Since the embedding layer cannot process text directly, opcode and operands are separately converted into integers using distinct integer encoding methods.
3. The integer-encoded opcode and operands are embedded using their respective embedding layers. At this stage, the embedding dimension is set, and each opcode and operand are transformed into a vector representation.
4. If an opcode has multiple operands, the average of the operand embeddings is calculated and used as the final embedding representation.
5. The embedded opcode and operand are combined while preserving the original IL code structure.
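The five steps can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the embedding tables are random rather than trained, "combined" is interpreted as concatenation, an opcode without operands is given a zero operand vector, and the names `build_vocab` and `embed_program` as well as the sample IL tuples are hypothetical.

```python
import numpy as np

EMBED_DIM = 40  # embedding dimension reported in the paper

def build_vocab(tokens):
    """f_map: assign each unique token an integer index in first-seen order."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

def embed_program(parsed_il, rng):
    """Return one vector [E_opcode, mean(E_operand)] per IL line."""
    op_vocab = build_vocab(op for op, _ in parsed_il)
    arg_vocab = build_vocab(a for _, args in parsed_il for a in args)
    # Separate embedding tables for opcodes and operands (random stand-ins
    # for what would be trainable embedding layers).
    op_table = rng.normal(size=(len(op_vocab), EMBED_DIM))
    arg_table = rng.normal(size=(max(len(arg_vocab), 1), EMBED_DIM))
    rows = []
    for op, args in parsed_il:
        e_op = op_table[op_vocab[op]]
        if args:
            # Average the operand embeddings when there are several operands.
            e_arg = arg_table[[arg_vocab[a] for a in args]].mean(axis=0)
        else:
            e_arg = np.zeros(EMBED_DIM)  # assumption: no operands -> zero vector
        rows.append(np.concatenate([e_op, e_arg]))  # opcode part first, then operands
    return np.stack(rows)

# Hypothetical, already-split IL lines: (opcode, [operands]).
parsed = [("LD", ["start"]), ("AND", ["ready"]), ("ADD", ["level", "offset"]), ("ST", ["motor"])]
vectors = embed_program(parsed, np.random.default_rng(0))
print(vectors.shape)
```

Each IL line becomes one row whose first 40 values come from the opcode table and whose last 40 values come from the averaged operand embeddings, so the line order of the program is kept.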

The proposed embedding method effectively maintains the relationships between opcode and operands within IL code, making it highly beneficial for anomaly detection and code analysis. This approach offers two key advantages over conventional embedding techniques. First, the proposed method clearly distinguishes the semantic differences between operators and operands, allowing the model to better preserve contextual information in the code. Traditional single vector embedding approaches use a single embedding model without differentiating between opcodes and operands. As a result, if the same text appears in both an opcode and an operand, it is interpreted with the same meaning. However, the proposed embedding method separates opcodes and operands and applies different embedding models to each. This ensures that even if identical text appears, its semantic meaning is correctly distinguished. Second, the proposed method preserves the original IL code structure. By sequentially combining the embedded opcode and operand, the method maintains the structural integrity of the IL code. This approach enables the model to effectively capture both the context and structure of IL code.

3.4. Control Logic Learning and Anomaly Detection

The control logic anomaly detection model learns the structure of control logic and predicts normal control logic. It consists of fully connected layers that train on embedded normal control logic and detect anomalous control logic. Figure 4 shows the architecture of the control logic anomaly detection model.

The control logic anomaly detection model embeds and learns control logic before it is deployed to the PLC. The trained anomaly detection model extracts control logic from field PLCs, performs the embedding process, and uses it as input. Anomaly detection operates by predicting the $(t+1)$th IL code based on the $t$ input IL codes. The model then calculates the error between the actual $(t+1)$th IL code and the predicted $(t+1)$th IL code. Since the anomaly detection model learns the patterns of normal control logic, it predicts normal control logic. If the error between the predicted control logic and the input control logic exceeds a predefined threshold, it is classified as an anomaly. This process determines whether the control logic uploaded to the PLC is anomalous. The proposed control logic anomaly detection model detects control logic that deviates from the normal range, rather than merely identifying whether modifications have occurred. Additionally, since the anomaly detection model consists of fully connected layers capable of detecting even subtle code changes, it can identify previously unseen patterns in control logic.
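The thresholding step described above can be sketched as follows. The paper does not state which error measure is used, so mean squared error is an assumption here, as are the function name `anomaly_flags`, the toy two-dimensional vectors, and the threshold value.

```python
import numpy as np

def anomaly_flags(actual_next, predicted_next, threshold):
    """Flag each IL line whose prediction error exceeds the threshold.

    Mean squared error per line is an assumed error measure.
    """
    errors = np.mean((actual_next - predicted_next) ** 2, axis=1)
    return errors, errors > threshold

# Toy embedded IL lines: rows are the actual and predicted (t+1)th vectors.
actual = np.array([[0.1, 0.2], [0.9, 0.8], [0.0, 0.1]])
predicted = np.array([[0.1, 0.2], [0.1, 0.2], [0.0, 0.0]])
errors, flags = anomaly_flags(actual, predicted, threshold=0.1)
print(flags)
```

Only the second line is flagged, because its embedded vector is far from what the model predicted as normal.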

4. Experiment and Evaluation

Section 4 conducts experiments and validation on the proposed anomaly detection method that considers the PLC control logic structure. Section 4.1 describes the PLC control logic dataset used in the experiments. Section 4.2 explains the experimental environment, performance metrics, and models used. Section 4.3 evaluates the proposed method using various performance metrics. Additionally, this section attempts to answer the following research questions.

How does the embedding process preserve the structure and context of control logic?
How does the proposed method detect cyber threats that existing ICS anomaly detection systems fail to identify?
What makes the proposed method more effective than existing PLC control logic tampering detection studies?

4.1. Dataset Description

The dataset used in the experiment is the PLC control logic Ladder Logic Bombs (LLB) dataset [21]. LLB is a broad category of attack types that focus on manipulating PLC control logic, like Stuxnet. The dataset consists of 30 normal control logic samples and 30 malicious control logic samples. The study claims to be the first to construct a PLC control logic dataset. For this reason, the LLB dataset is the only available dataset for this experiment. The control logic dataset is provided in LD and ST formats. The dataset was created in the OpenPLC [22] environment. Normal control logic is a program that models a water tank filling system. The malicious control logic was designed by incorporating LLB attack patterns. A total of 26 variations of malicious samples were generated by applying different attack techniques, such as comparison operations, water level detection, and specific component modifications. These variations are classified into two main types: (1) modifications in the placement of function blocks containing malicious code within the LD code or changes in the connection methods between LD components, and (2) replacement of function blocks containing logic bombs with other similar function blocks that serve the same purpose. While the proposed LLB attacks were tested on a water tank filling system, they can also be applied to other ICS environments utilizing PLCs.

For the experiment, we converted the control logic in the dataset into IL format using OpenPLC. The additional code introduced for LLB was labeled as anomalous. The training dataset consists of 30 normal control logic samples, comprising 714 lines of IL code. The test dataset consists of 30 malicious control logic samples, comprising 1260 lines of IL code. The goal of this experiment is to identify malicious code embedded within the anomalous control logic. Because the model is trained only on normal control logic, this malicious code represents previously unknown threats.

4.2. Experiment Settings

Computational Environment: We performed all experiments in the same computational environment: Intel(R) Core(TM) i5-13600KF 3.50 GHz, 64 GB RAM, and Windows 10 22H2. Additionally, an NVIDIA GeForce RTX 4070 Ti GPU was used as a hardware accelerator for efficient model training.

Evaluation metrics: In our experiment, we use accuracy (Equation (2)), recall (Equation (3)), precision (Equation (4)), F1-score (Equation (5)), false negative rate (FNR, Equation (6)), and false positive rate (FPR, Equation (7)) as evaluation metrics [23,24]. These metrics are calculated using true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs).

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(2)

\mathrm{Recall} = \frac{TP}{TP + FN}

(3)

\mathrm{Precision} = \frac{TP}{TP + FP}

(4)

\mathrm{F1\text{-}Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(5)

\mathrm{FNR} = \frac{FN}{FN + TP}

(6)

\mathrm{FPR} = \frac{FP}{FP + TN}

(7)
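Equations (2) to (7) translate directly into code. The sketch below is illustrative only: the function name `detection_metrics` and the confusion-matrix counts are made up and are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute the metrics of Equations (2)-(7) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "fnr": fn / (fn + tp),
        "fpr": fp / (fp + tn),
    }

# Hypothetical counts chosen only to exercise the formulas.
m = detection_metrics(tp=80, tn=90, fp=10, fn=20)
print(m)
```

Note that FNR equals 1 minus recall, so the two metrics carry the same information from opposite directions.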

Experimental Setup and Parameters: Control logic embedding was performed using Keras’ embedding layer, with the embedding dimension set to 40. A larger embedding dimension allows for richer feature representation but increases computational complexity and overfitting risks. Through experiments, we confirmed that setting the embedding dimension to 40 results in the best model performance. For the experiment, Long Short-Term Memory (LSTM) [25], LSTM-Autoencoder [26], Transformer [27], recurrent neural network (RNN) [28], and gated recurrent unit (GRU) [29] models were used. ReLU [30] was used as the activation function for all models. Training was performed using the Adam optimizer [31] with a learning rate of either 0.001 or 0.0005. The learning rate is a crucial hyperparameter that determines how quickly the model updates its weights. If the learning rate is too high, the model may overshoot the optimal point, whereas if it is too low, convergence may slow down. Experimental results showed that LSTM and LSTM-Autoencoder achieved the highest performance at a learning rate of 0.0005, while the Transformer model performed optimally at a learning rate of 0.001. The LSTM model consists of a total of eight layers, with 128 nodes per layer. RNN and GRU were analyzed using the same hyperparameters as LSTM and compared against LSTM. The LSTM-Autoencoder designed in this study reduces the number of nodes by half at each encoder layer. Conversely, in the decoder layers, the number of nodes doubles at each layer, restoring the original input dimensions. The LSTM-Autoencoder consists of three encoder layers and three decoder layers. The LSTM-Autoencoder starts with 512 nodes, and the latent variables correspond to the number of features. The Transformer model consists of two encoder layers. The number of attention heads is set to eight, and the vector dimension is 512.

The batch size was selected based on the model’s convergence behavior and memory constraints. For LSTM, a batch size of 8 was chosen as it yielded stable training and better performance in our experiments. For LSTM-Autoencoder, a batch size of 32 was used to improve learning stability in autoencoder-based anomaly detection. For the Transformer model, a batch size of 8 was used since increasing the batch size resulted in performance degradation due to overfitting. The parameters for each model were determined through the ablation study mentioned in Section 4.3. Hyperparameter tuning was performed by modifying the key parameters of each model and evaluating their performance.

4.3. Ablation Study of Proposal Methods

In Section 4.3, an ablation study is conducted to adjust the hyperparameters of each model and evaluate the performance of the proposed control logic anomaly detection method.

In this experiment, an embedding approach was applied that preserves the structure of the control logic IL code while ensuring effective embedding. Specifically, opcodes and operands were embedded separately and then combined. Unique words for opcodes and operands were extracted, assigned integer values, and stored in a dictionary format. The IL code was then integer-encoded using this dictionary. The integer-encoded code was mapped to a 40-dimensional vector using Keras’ embedding layer [32]. The vectorized opcodes and operands were arranged sequentially in the same order as they appear in the IL code, preserving the structural characteristics of the IL code. The embedding dimension is a critical factor in determining the level of compression when converting control logic into a continuous vector representation. Using a dimension that is too small may lead to information loss, while using a dimension that is too large may result in overfitting and increased computational cost. Through experimentation, we confirmed that setting the embedding dimension to 40 yielded the best model performance.
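The embedding steps described above can be sketched in plain Python as follows. The IL snippet, the helper names, and the random lookup tables are hypothetical; the random tables merely stand in for the trainable Keras embedding layers, and interleaving opcode and operand vectors line by line is our reading of "arranged sequentially in the same order as they appear in the IL code".

```python
import random

EMBEDDING_DIM = 40  # dimension reported in the text

# Hypothetical IL snippet; each line is "<opcode> <operand>".
il_code = [
    "LD %IX0.0",
    "AND %IX0.1",
    "ST %QX0.0",
    "LD %IX0.1",
    "ST %QX0.1",
]


def build_vocab(tokens):
    """Assign each unique token an integer ID (0 is left free for padding)."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def random_table(vocab_size, dim, rng):
    """Stand-in for a trainable embedding layer: one random vector per ID."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size + 1)]


pairs = [line.split(maxsplit=1) for line in il_code]

# Separate dictionaries, so an opcode and an operand never share an ID space.
opcode_vocab = build_vocab(op for op, _ in pairs)
operand_vocab = build_vocab(arg for _, arg in pairs)

rng = random.Random(0)
opcode_table = random_table(len(opcode_vocab), EMBEDDING_DIM, rng)
operand_table = random_table(len(operand_vocab), EMBEDDING_DIM, rng)

# Interleave the vectors in source order: op1, arg1, op2, arg2, ...
sequence = []
for op, arg in pairs:
    sequence.append(opcode_table[opcode_vocab[op]])
    sequence.append(operand_table[operand_vocab[arg]])

print(opcode_vocab, operand_vocab, len(sequence))
```

Keeping two vocabularies means each lookup table stays small, and the position of a vector in `sequence` (even or odd) always tells the model whether it represents an opcode or an operand.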

Answer to (RQ1): To preserve the structure of control logic during the embedding process, the IL code’s opcodes and operands are embedded separately. Since opcodes and operands have different meanings, they should not be transformed using a single embedding model; instead, they must be embedded individually to prevent semantic mixing. The embedded vectors are combined in the order of opcodes and operands to maintain the IL code structure. This approach ensures that the instruction execution flow and structural relationships of the control logic are reflected. Additionally, it prevents the loss of contextual information during the vectorization process. By applying this method, the original meaning of the IL code is preserved, enabling the effective use of vectorized data for anomaly detection.

Control logic anomaly detection was performed using LSTM, LSTM-Autoencoder, and Transformer models to learn and validate normal control logic. Anomaly detection is formulated as a one-class classification problem. One-class classification determines whether an object belongs to the class observed during training [33]. By performing one-class classification, the deep learning model can detect malicious control logic that does not exist in normal control logic. The parameters of the anomaly detection model are key factors influencing detection performance. Therefore, in this experiment, the effects of hyperparameters impacting performance were observed to determine the optimal hyperparameter settings. Additionally, to enable analysis from various perspectives, normalization techniques such as Minmax scaler, Maxabs scaler, Robust scaler, and Standard scaler were applied [34]. Table 2, Table 3 and Table 4 present the performance evaluation results of the LSTM, LSTM-Autoencoder, and Transformer models based on different hyperparameter configurations.
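For reference, the four normalization techniques named above can be written out as follows. This is a sketch that assumes the common definitions (those used, for example, by scikit-learn with default settings); the text does not state which implementation was used, and the sample feature column is hypothetical.

```python
import statistics


def minmax_scale(xs):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def maxabs_scale(xs):
    """Divide by the largest absolute value, giving values in [-1, 1]."""
    m = max(abs(x) for x in xs)
    return [x / m for x in xs]


def standard_scale(xs):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]


def robust_scale(xs):
    """Subtract the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return [(x - median) / (q3 - q1) for x in xs]


# Hypothetical feature column with one outlier.
# Note: a constant column would cause a division by zero in every scaler here.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]

print(minmax_scale(feature))
print(robust_scale(feature))
```

With the outlier present, min-max scaling squeezes the first four values into a narrow band near 0, whereas the robust scaler keeps them evenly spread around 0 because the median and interquartile range are barely affected by the outlier.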

LSTM achieved the highest performance using Robust Scaler with a learning rate of 0.0005. The key parameters for LSTM are batch size and the number of nodes per layer. LSTM achieved the highest F1 score of 0.8129 with a batch size of 8 and 128 nodes. Similarly, LSTM-Autoencoder achieved the highest performance using the Robust Scaler with a learning rate of 0.0005. Like LSTM, the key parameters for LSTM-Autoencoder are batch size and the number of nodes per layer. In Table 3, ‘Node’ refers to the number of nodes in the first encoder layer. LSTM-Autoencoder gradually decreases the number of nodes in the encoder and increases them in the decoder. LSTM-Autoencoder achieved the highest F1 score of 0.8160 with a batch size of 32 and 512 nodes. The Transformer model achieved the highest performance using the Standard Scaler with a learning rate of 0.001. In the multi-head attention mechanism, attention was measured in parallel by dividing it into eight heads. The key parameters for Transformer are batch size, the number of encoder and decoder layers, and the dimension of the embedding vector. Transformer achieved the highest F1 score of 0.8117 with a batch size of 8, 2 layers, and a dimension of 512. Table 5 presents the performance of the models after hyperparameter tuning.
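The selection step can be reproduced directly from the reported grids. The sketch below transcribes the F1 scores of Table 2 and Table 3, keyed by (batch size, nodes), and picks the best configuration; the variable and function names are hypothetical.

```python
# F1 scores transcribed from Table 2 (LSTM, Robust scaler, learning rate 0.0005).
lstm_grid = {
    (8, 64): 0.7610, (8, 128): 0.8129, (8, 256): 0.7610, (8, 512): 0.7780,
    (16, 64): 0.7383, (16, 128): 0.7606, (16, 256): 0.7763, (16, 512): 0.7520,
    (32, 64): 0.6958, (32, 128): 0.7587, (32, 256): 0.7547, (32, 512): 0.7437,
}

# F1 scores transcribed from Table 3 (LSTM-Autoencoder, same scaler and learning rate).
lstm_ae_grid = {
    (8, 64): 0.7782, (8, 128): 0.7712, (8, 256): 0.7685, (8, 512): 0.7828,
    (16, 64): 0.7393, (16, 128): 0.7756, (16, 256): 0.7834, (16, 512): 0.7690,
    (32, 64): 0.7355, (32, 128): 0.7598, (32, 256): 0.7847, (32, 512): 0.8160,
}


def best_configuration(grid):
    """Return the ((batch_size, nodes), f1) pair with the highest F1 score."""
    return max(grid.items(), key=lambda item: item[1])


best_lstm = best_configuration(lstm_grid)
best_lstm_ae = best_configuration(lstm_ae_grid)
print(best_lstm, best_lstm_ae)
```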

RNN-based models are primarily used for learning sequential data [35]. In this experiment, several RNN-based models were trained, and LSTM demonstrated the highest performance among them. RNN exhibited a lower FNR than GRU (0.199675 vs. 0.232143) but a considerably higher FPR (0.386646 vs. 0.256211). GRU showed similar values for FNR and FPR, but LSTM achieved lower FNR and FPR values than both RNN and GRU. LSTM outperformed RNN and GRU due to its ability to effectively mitigate the vanishing gradient problem and capture long-term dependencies in control logic sequences. Among all models in Table 5, LSTM and Transformer achieved the lowest FNR (0.121160), while LSTM-Autoencoder achieved the lowest FPR (0.100148). The F1 scores of LSTM, LSTM-Autoencoder, and Transformer were all above 0.81. This experiment is similar to studies that identify code vulnerabilities through AI-based static analysis. Existing studies on vulnerability detection have focused on identifying vulnerabilities in C/C++ programs [36], and the highest average F1 score reported in those studies was 0.69. Detecting malicious code with embedded vulnerabilities is inherently challenging. Therefore, relative to similar studies, the F1 scores obtained in this experiment are not low.

Additionally, the effectiveness of the proposed control logic embedding method was validated. For verification, two embedding approaches for IL code were compared. The two embedding methods are the proposed embedding method and general embedding methods. The general embedding method converts opcodes and operands into a single vector without distinguishing between them. The same model was trained using both embedding methods, and their performance was compared. Table 6 presents the performance of models that applied the proposed embedding method versus those that did not.

In all comparison models, performance consistently declined when the proposed embedding method was not applied. Specifically, the performance decreased by 0.072154 in the LSTM model, 0.059463 in the LSTM-Autoencoder model, and 0.088259 in the Transformer model. The general embedding method has two major issues. First, it generates an excessively large dictionary. As the dictionary size increases, the dimensionality grows, leading to higher computational costs. This increases the number of parameters the model must process, affecting both training and inference speed. Second, words with different meanings may be represented as the same entity. This issue results in the loss of contextual information in the code. These results support that the proposed control logic embedding method effectively preserves the structural characteristics of IL code, enabling the model to learn code more accurately.
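The reported decreases follow directly from the F1 scores in Table 6, as the short check below shows. It uses `Decimal` so that the six-digit values are subtracted exactly; the variable names are hypothetical.

```python
from decimal import Decimal

# (proposed embedding, general embedding) F1 scores transcribed from Table 6.
table6 = {
    "LSTM": (Decimal("0.812944"), Decimal("0.74079")),
    "LSTM-Autoencoder": (Decimal("0.816012"), Decimal("0.756549")),
    "Transformer": (Decimal("0.811663"), Decimal("0.723404")),
}

# Drop in F1 score when the general embedding replaces the proposed one.
f1_drop = {model: proposed - general
           for model, (proposed, general) in table6.items()}

largest_drop_model = max(f1_drop, key=f1_drop.get)

for model, drop in f1_drop.items():
    print(f"{model}: {drop}")
print("largest drop:", largest_drop_model)
```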

Answer to (RQ2): The proposed control logic anomaly detection method can detect cyber threats that existing ICS anomaly detection systems fail to identify, such as Stuxnet. Traditional ICS anomaly detection systems monitor network packets or process data. However, Stuxnet physically accesses the system to manipulate control logic and generates falsified process data to conceal anomalies. For this reason, conventional ICS anomaly detection systems cannot detect attacks like Stuxnet. In contrast, the proposed method monitors PLC control logic to detect abnormal control logic behavior. Moreover, the embedding process adopts a control logic-specific approach. Table 6 presents the performance of models using control logic structure-aware embedding compared to those using conventional embedding methods. The proposed method outperformed conventional embedding approaches. This demonstrates that the proposed control logic anomaly detection method introduces a novel approach and can detect cyber threats that existing ICS anomaly detection systems fail to identify.

5. Comparative Study

Section 5 qualitatively compares existing control logic tampering detection studies with the proposed method. The proposed control logic anomaly detection method trains AI models on PLC control logic to detect anomalies. This approach is unprecedented in existing ICS anomaly detection research, which primarily focuses on monitoring network packets and process data. Due to this fundamental difference, direct comparison with conventional ICS anomaly detection studies is challenging. Therefore, we compare our method with control logic tampering detection research, which shares a similar objective of identifying anomalies within PLC control logic. Existing studies primarily focus on detecting whether control logic has been tampered with, while research on utilizing the structural characteristics of control logic for anomaly detection remains limited. As a result, a direct quantitative comparison with these studies is also difficult. Instead, the comparison analyzes detection objectives, detection methods, additional storage requirements, tampering detection capability, and anomaly detection capability. The detection objective refers to the purpose of the study and is classified as either tampering detection or anomaly detection. The detection method refers to the approach used to achieve the detection objective. Additional storage requirement refers to whether extra storage space is needed. Tampering detection capability refers to whether tampering can be detected. Anomaly detection capability refers to whether anomaly detection is possible. Table 7 compares the proposed method with existing control logic tampering detection approaches.

Existing studies, except for our research, focus on detecting tampering, whereas our study detects control logic anomalies using AI. Traditional control logic tampering detection studies identify modifications through blockchain, message digests, and line-by-line code comparisons. The methods of Choia, M.K. [7] and Lee, J.C. [13] require additional storage for hashes or normal control logic. Tampering detection studies identify modifications directly, whereas our method detects anomalies and can therefore reveal tampering only when the tampering produces anomalous logic (Table 7). Previous studies cannot perform anomaly detection on control logic, but our method can. Additionally, since existing studies detect only code modifications, even legitimate changes may be misclassified as tampering.
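The last point can be illustrated with a minimal digest-based check. This sketch is not the implementation of any of the cited studies; it only shows the general principle that a digest comparison flags every modification, whether malicious or legitimate. The IL snippets and function names are hypothetical, and SHA-256 is chosen here purely for illustration.

```python
import hashlib


def digest(il_code: str) -> str:
    """Return the SHA-256 digest of a control logic program given as text."""
    return hashlib.sha256(il_code.encode("utf-8")).hexdigest()


def is_modified(current_code: str, reference_digest: str) -> bool:
    """Digest-based check: any byte-level difference counts as tampering."""
    return digest(current_code) != reference_digest


# Hypothetical approved logic and a hypothetical authorized update
# (the second input is rewired from %IX0.1 to %IX0.2).
approved_logic = "LD %IX0.0\nAND %IX0.1\nST %QX0.0\n"
authorized_update = "LD %IX0.0\nAND %IX0.2\nST %QX0.0\n"

reference = digest(approved_logic)  # recorded when the logic was approved

print(is_modified(approved_logic, reference))     # unchanged logic
print(is_modified(authorized_update, reference))  # legitimate change, still flagged
```

The check cannot distinguish the authorized update from an attack, because it compares bytes and not behavior. An anomaly detector trained on normal control logic instead judges whether the new logic still resembles normal logic.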

Answer to (RQ3): Unlike traditional tampering detection methods, the proposed approach detects anomalous control logic by reflecting the structural characteristics of control logic. While existing studies focus solely on determining whether modifications have occurred, the proposed method can identify patterns that deviate from normal control flow. Through this approach, anomalies can be detected, allowing the identification of vulnerable code embedded within control logic and precisely analyzing which parts of the code cause abnormalities. This capability is not achievable with conventional methods. Additionally, since the model learns normal control logic to detect anomalies, there is no need for separate storage to retain normal control logic.

The proposed control logic anomaly detection method utilizes IL code standardized under IEC 61131-3. PLC models that comply with this standard support cross-language conversion regardless of the manufacturer, ensuring high scalability and broad applicability across various PLC models. However, some ICS infrastructures do not fully follow the IEC 61131-3 standard. In such cases, additional preprocessing steps may be required to convert control logic into IL code before applying the proposed method.

6. Conclusions

Anomaly detection systems have been introduced to detect cyber threats in ICS. Traditional ICS anomaly detection systems monitor network packets or process data. However, these systems cannot detect control logic tampering attacks. Various studies have been conducted to detect control logic tampering. However, tampering detection focuses on identifying code modifications rather than determining whether the control logic is functioning normally. As a result, even legitimate modifications to normal operations are detected as anomalies, leading to an increase in false positives. For this reason, this paper proposes an anomaly detection method that considers the structure of control logic to detect ICS cyber threats. The proposed method performs embedding on IL code. To preserve the context and structure of IL code, opcodes and operands are embedded separately using different embedding models. The embedded values are then combined sequentially to maintain the IL code structure. This embedding approach effectively preserves the relationships between operators and operands in IL code, making it useful for future anomaly detection and code analysis. To validate the effectiveness of the embedded values, LSTM, LSTM-Autoencoder, and Transformer models were implemented for anomaly detection. All models achieved an F1 score of 0.81 or higher. These models learn normal control logic and detect control logic exhibiting anomalous behavior. This approach enables the detection of not only anomalous code but also code containing vulnerabilities. Additionally, the proposed embedding method was compared with an approach that embeds opcodes and operands into a single vector without differentiation. When applying the proposed embedding method, the anomaly detection model achieved an F1 score that was 0.088259 higher. 
Experimental results demonstrate that the proposed embedding method allows the anomaly detection model to better understand the execution flow and structural relationships of control logic instructions. Existing ICS anomaly detection methods cannot detect malicious control logic injection attacks targeting critical infrastructure, such as power plants and smart grids. These attacks can cause abnormal operations in critical infrastructure, leading to financial and physical damage. The proposed control logic anomaly detection method enhances ICS cybersecurity and minimizes damage by detecting control logic anomalies that existing ICS anomaly detection methods fail to identify. Additionally, this study provides a foundation for further research on AI-based ICS anomaly detection. As future work, a model will be proposed to expand the monitoring scope by simultaneously analyzing both control logic and process data.

Author Contributions

Conceptualization, J.H.L. and S.H.J.; methodology, J.H.L. and S.H.J.; validation, J.H.L. and I.H.J.; formal analysis, J.H.L.; investigation, J.H.L. and I.H.J.; resources, J.H.L. and I.H.J.; data curation, J.H.L. and I.H.J.; writing—original draft preparation, J.H.L. and I.H.J.; writing—review and editing, S.H.J. and J.T.S.; visualization, J.H.L.; supervision, S.H.J. and J.T.S.; project administration, S.H.J. and J.T.S.; funding acquisition, J.T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partly supported by the Institute of Information and communications Technology Planning and Evaluation (IITP) grant funded by the Korean government (MSIT) (No. RS-2024-00354169, Technology Development of Threat Model/XAI-based Network Abnormality Detection, Response and Cyber Threat Prediction, 90%) and the Gachon University research fund of 2025 (GCU-202406180001, 10%).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting this study’s findings are openly available in the PLC-LD-dataset public repository at https://github.com/UniboSecurityResearch/PLC-LD-dataset (accessed on 22 March 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ICS	Industrial Control System
AI	Artificial Intelligence
PLC	Programmable Logic Controller
EWS	Engineering Workstation
USB	Universal Serial Bus
LD	Ladder Diagram
IL	Instruction List
ST	Structured Text
FBD	Function Block Diagram
SFC	Sequential Function Chart
I&C	Instrumentation and Control Systems
CPLCD	Control Program Logic Change Detector
LLB	Ladder Logic Bombs
TPs	True Positives
TNs	True Negatives
FPs	False Positives
FNs	False Negatives
FNR	False Negative Rate
FPR	False Positive Rate
LSTM	Long Short-Term Memory
RNN	Recurrent Neural Network
GRU	Gated Recurrent Unit

References

1. Koay, A.M.; Ko, R.K.L.; Hettema, H.; Radke, K. Machine learning in industrial control system (ICS) security: Current landscape, opportunities and challenges. J. Intell. Inf. Syst. 2023, 60, 377–405.
2. Luo, Y.; Xiao, Y.; Cheng, L.; Peng, G.; Yao, D. Deep learning-based anomaly detection in cyber-physical systems: Progress and opportunities. ACM Comput. Surv. CSUR 2021, 54, 1–36.
3. Jiang, J.-R.; Chen, Y.-T. Industrial control system anomaly detection and classification based on network traffic. IEEE Access 2022, 10, 41874–41888.
4. Mokhtari, S.; Abbaspour, A.; Yen, K.K.; Sargolzaei, A. A machine learning approach for anomaly detection in industrial control systems based on measurement data. Electronics 2021, 10, 407.
5. Hudedmani, M.G.; Umayal, R.; Kabberalli, S.K.; Hittalamani, R. Programmable logic controller (PLC) in automation. Adv. J. Grad. Res. 2017, 2, 37–45.
6. Baezner, M.; Robin, P. Stuxnet; ETH Zurich: Zürich, Switzerland, 2017.
7. Choia, M.K.; Yeunb, C.Y.; Seonga, P.H. Development of a Monitoring System for Data integrity of PLC code using blockchain technologies. In Proceedings of the Transactions of the Korean Nuclear Society Spring Meeting, Jeju, Republic of Korea, 23–24 May 2019.
8. Yang, K.; Wang, H.; Sun, L. An effective intrusion-resilient mechanism for programmable logic controllers against data tampering attacks. Comput. Ind. 2022, 138, 103613.
9. Shedge, S.; Tade, S. Design of instruction list processor for industrial applications. In Proceedings of the 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, India, 16–18 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–3.
10. Alphonsus, E.R.; Abdullah, M.O. A review on the applications of programmable logic controllers (PLCs). Renew. Sustain. Energy Rev. 2016, 60, 1185–1205.
11. Alsabbagh, W.; Langendörfer, P. A flashback on control logic injection attacks against programmable logic controllers. Automation 2022, 3, 596–621.
12. Tiegelkamp, M.; John, K.-H. IEC 61131-3: Programming Industrial Automation Systems; Springer: Berlin/Heidelberg, Germany, 2010; Volume 166.
13. Lee, J.; Choi, H.; Shin, J.; Seo, J.T. Detection and analysis technique for manipulation attacks on plc control logic. In Proceedings of the 2020 ACM International Conference on Intelligent Computing and Its Emerging Applications, GangWon, Republic of Korea, 12–15 December 2020; pp. 1–6.
14. Yau, K.; Chow, K.-P. PLC forensics based on control program logic change detection. J. Digit. Forensics Secur. Law 2015, 10, 5.
15. Xiao, Y.-j.; Xu, W.-y.; Jia, Z.-h.; Ma, Z.-r.; Qi, D.-l. NIPAD: A non-invasive power-based anomaly detection scheme for programmable logic controllers. Front. Inf. Technol. Electron. Eng. 2017, 18, 519–534.
16. Ghosh, A.; Qin, S.; Lee, J.; Wang, G.-N. FBMTP: An automated fault and behavioral anomaly detection and isolation tool for PLC-controlled manufacturing systems. IEEE Trans. Syst. Man Cybern. Syst. 2016, 47, 3397–3417.
17. Han, S.; Lee, K.; Cho, S.; Park, M. Anomaly detection based on temporal behavior monitoring in programmable logic controllers. Electronics 2021, 10, 1218.
18. Chan, C.F. Enhancing PLC Logging and Abnormality Detection Through Machine Learning and Process Mining. Ph.D. Thesis, The University of Hong Kong, Hong Kong, China, 2024.
19. Ayub, A. Stealthy Control Logic Attacks and Defense in Industrial Control Systems. Ph.D. Thesis, Virginia Commonwealth University, Richmond, VA, USA, 2024.
20. Ding, S.H.; Fung, B.C.; Charland, P. Asm2vec: Boosting static representation robustness for binary clone search against code obfuscation and compiler optimization. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 472–489.
21. Iacobelli, A.; Rinieri, L.; Melis, A.; Al Sadi, A.; Prandini, M.; Callegati, F. Detection of Ladder Logic Bombs in PLC Control Programs: An Architecture based on Formal Verification. In Proceedings of the 2024 IEEE 7th International Conference on Industrial Cyber-Physical Systems (ICPS), St. Louis, MO, USA, 12–15 May 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–7.
22. Alves, T.R.; Buratto, M.; De Souza, F.M.; Rodrigues, T.V. OpenPLC: An open source alternative to automation. In Proceedings of the IEEE Global Humanitarian Technology Conference (GHTC 2014), San Jose, CA, USA, 10–13 October 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 585–589.
23. Hossin, M.; Sulaiman, M.N. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1.
24. Dalvi, S.; Gressel, G.; Achuthan, K. Tuning the false positive rate/false negative rate with phishing detection models. Int. J. Eng. Adv. Technol. 2019, 9, 7–13.
25. Lindemann, B.; Maschler, B.; Sahlab, N.; Weyrich, M. A survey on anomaly detection for technical systems using LSTM networks. Comput. Ind. 2021, 131, 103498.
26. Nguyen, H.D.; Tran, K.P.; Thomassey, S.; Hamad, M. Forecasting and Anomaly Detection approaches using LSTM and LSTM Autoencoder techniques with the applications in supply chain management. Int. J. Inf. Manag. 2021, 57, 102282.
27. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems 30; NeurIPS: Long Beach, CA, USA, 2017.
28. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
29. Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1597–1600.
30. Agarap, A.F. Deep learning using rectified linear units (relu). arXiv 2018, arXiv:1803.08375.
31. Chen, Y.; Zhang, S.; Zhang, W.; Peng, J.; Cai, Y. Multifactor spatio-temporal correlation model based on a combination of convolutional neural network and long short-term memory neural network for wind speed forecasting. Energy Convers. Manag. 2019, 185, 783–799.
32. Gulli, A.; Pal, S. Deep Learning with Keras; Packt Publishing Ltd.: Birmingham, UK, 2017.
33. Perera, P.; Oza, P.; Patel, V.M. One-class classification: A survey. arXiv 2021, arXiv:2101.03064.
34. Ahsan, M.M.; Mahmud, M.P.; Saha, P.K.; Gupta, K.D.; Siddique, Z. Effect of data scaling methods on machine learning algorithms and model performance. Technologies 2021, 9, 52.
35. Waqas, M.; Humphries, U.W. A critical review of RNN and LSTM variants in hydrological time series predictions. MethodsX 2024, 13, 102946.
36. Zheng, Y.; Pujar, S.; Lewis, B.; Buratti, L.; Epstein, E.; Yang, B.; Laredo, J.; Morari, A.; Su, Z. D2a: A dataset built for ai-based vulnerability detection methods using differential analysis. In Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), Madrid, Spain, 25–28 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 111–120.

Figure 1. Overview of structure-based control logic anomaly detection.

Figure 2. Example of conversion of LD code to IL code.

Figure 3. Control logic embedding example.

Figure 4. Control logic anomaly detection model configuration.

Table 1. IEC 61131-3 programming language.

Types of Programming Languages	Description
Ladder Diagram (LD)	A programming language written using ladder diagram symbols
Instruction List (IL)	Programming using one operator and one or more operands (like assembly language)
Structured Text (ST)	High-level programming languages like Basic, PASCAL, and C
Function Block Diagram (FBD)	Programming language using block diagrams
Sequential Function Chart (SFC)	A programming language like the format of a sequence diagram.

Table 2. LSTM hyperparameter tuning process (optimization algorithm learning rate is 0.0005).

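Scaler	Batch Size	Nodes	F1 Score
Robust	8	64	0.7610
Robust	8	128	0.8129
Robust	8	256	0.7610
Robust	8	512	0.7780
Robust	16	64	0.7383
Robust	16	128	0.7606
Robust	16	256	0.7763
Robust	16	512	0.7520
Robust	32	64	0.6958
Robust	32	128	0.7587
Robust	32	256	0.7547
Robust	32	512	0.7437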

Table 3. LSTM-Autoencoder hyperparameter tuning process (optimization algorithm learning rate is 0.0005).

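Scaler	Batch Size	Nodes	F1 Score
Robust	8	64	0.7782
Robust	8	128	0.7712
Robust	8	256	0.7685
Robust	8	512	0.7828
Robust	16	64	0.7393
Robust	16	128	0.7756
Robust	16	256	0.7834
Robust	16	512	0.7690
Robust	32	64	0.7355
Robust	32	128	0.7598
Robust	32	256	0.7847
Robust	32	512	0.8160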

Table 4. Transformer hyperparameter tuning process (optimization algorithm learning rate is 0.001; the number of parallels (heads) to perform the attentions is 8).

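Scaler	Batch Size	Layers	Vector Dimension	F1 Score
Standard	8	2	64	0.7657
Standard	8	2	128	0.6987
Standard	8	2	256	0.7458
Standard	8	2	512	0.8117
Standard	8	4	64	0.7795
Standard	8	4	128	0.7232
Standard	8	4	256	0.7558
Standard	8	4	512	0.6878
Standard	8	6	64	0.7467
Standard	8	6	128	0.7370
Standard	8	6	256	0.6878
Standard	8	6	512	0.6974
Standard	16	2	64	0.7493
Standard	16	2	128	0.7612
Standard	16	2	256	0.6341
Standard	16	2	512	0.6761
Standard	16	4	64	0.7333
Standard	16	4	128	0.6882
Standard	16	4	256	0.7031
Standard	16	4	512	0.7285
Standard	16	6	64	0.7232
Standard	16	6	128	0.7489
Standard	16	6	256	0.7651
Standard	16	6	512	0.7146
Standard	32	2	64	0.7766
Standard	32	2	128	0.6653
Standard	32	2	256	0.6687
Standard	32	2	512	0.7334
Standard	32	4	64	0.7034
Standard	32	4	128	0.7312
Standard	32	4	256	0.7431
Standard	32	4	512	0.7213
Standard	32	6	64	0.7013
Standard	32	6	128	0.7305
Standard	32	6	256	0.7558
Standard	32	6	512	0.7099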

Table 5. Control logic anomaly detection model performance.

Model	Accuracy	FNR	FPR	Precision	Recall	F1-Score
RNN	0.704762	0.199675	0.386646	0.66442	0.800325	0.726068
GRU	0.755556	0.232143	0.256211	0.741379	0.767857	0.754386
LSTM	0.811905	0.121160	0.133869	0.756241	0.87884	0.812944
LSTM-Autoencoder	0.810317	0.139609	0.100148	0.775988	0.86039	0.816012
Transformer	0.810317	0.121160	0.137046	0.754026	0.87884	0.811663

Table 6. Comparison of the proposed embedding method and general embedding methods in terms of performance.

Model	F1 Score (Proposed Embedding Method)	F1 Score (General Embedding Method)
LSTM	0.812944	0.74079
LSTM-Autoencoder	0.816012	0.756549
Transformer	0.811663	0.723404

Table 7. Qualitative comparison between the proposed method and control logic tampering research.

Reference	Detection Objective	Detection Method	Additional Storage Requirement	Tampering Detection Capability	Anomaly Detection Capability
Choia, M.K. [7]	Tampering verification	Blockchain (hash value comparison)	O	O	X
Yang, K. [8]	Tampering verification	Message digest	X	O	X
Lee, J.C. [13]	Tampering verification	Comparison with normal control logic code	O	O	X
Yau, Ken [14]	Tampering verification	Review of modifications to rung	X	O	X
Ours	Anomaly detection	AI-based anomaly detection	X	△ (Detection if tampering causes anomalies)	O

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

