Article

Neural Network Method of Analysing Sensor Data to Prevent Illegal Cyberattacks

by Serhii Vladov 1,*, Vladimir Jotsov 2,3,*, Anatoliy Sachenko 4,5, Oleksandr Prokudin 6, Andrii Ostapiuk 7,† and Victoria Vysotska 8
1 Department of Scientific Activity Organization, Kharkiv National University of Internal Affairs, 27, L. Landau Avenue, 61080 Kharkiv, Ukraine
2 Department of Information Systems and Technologies, University of Library Studies and Information Technologies, 119, Tsarigradsko Shose, 1784 Sofia, Bulgaria
3 Department of Cybersecurity, International Information Technology University, 34A, Manas Street, Almaty 050000, Kazakhstan
4 Research Institute for Intelligent Computer Systems, West Ukrainian National University, 11, Lvivska Street, 46009 Ternopil, Ukraine
5 Department of Teleinformatics, Casimir Pulaski Radom University, 29, Malczewskiego Street, 26-600 Radom, Poland
6 Department of Organization of Educational and Scientific Training, Kharkiv National University of Internal Affairs, 27, L. Landau Avenue, 61080 Kharkiv, Ukraine
7 Lviv State University of Life Safety, 79000 Lviv, Ukraine
8 Information Systems and Networks Department, Lviv Polytechnic National University, 12, Bandera Street, 79013 Lviv, Ukraine
* Authors to whom correspondence should be addressed.
† First Vice-Rector.
Sensors 2025, 25(17), 5235; https://doi.org/10.3390/s25175235
Submission received: 28 July 2025 / Revised: 8 August 2025 / Accepted: 12 August 2025 / Published: 22 August 2025
(This article belongs to the Section Communications)

Abstract

This article develops a method for analysing sensor data to prevent cyberattacks using a modified LSTM network. The work is motivated by the rapid increase in sensor devices used in critical infrastructure, which makes it an urgent task to secure these systems against attacks such as data forgery, man-in-the-middle attacks, and denial of service. The method predicts normal system behaviour with a modified LSTM network (F1 score = 0.90) and analyses anomalies detected through residual values, which makes it highly sensitive to changes in the data. The main result is high accuracy of attack detection (precision = 0.92), achieved through a hybrid approach combining prediction with statistical deviation analysis. In the computational experiment, the developed method demonstrated real-time efficiency with minimal computational costs, providing accuracy up to 92% and recall up to 89%, together with AUC = 0.94. These results show that the developed method can effectively protect critical infrastructure facilities with limited computing resources, which is especially important for cyber police.

1. Introduction and Related Works

In recent years, there has been a rapid growth in the number of Internet of Things (IoT) devices [1] and sensor systems [2] integrated into industrial [3,4], consumer [5,6], transport [7,8,9], and critical infrastructure [10,11,12]. These devices collect massive amounts of data in real time, providing environmental monitoring [13], process control [14], and user scenarios [15]. At the same time, the growth in the number of connected sensors creates new vectors for cyberattacks [16,17,18]—attackers can inject malicious packets into sensors, distorting readings or disrupting system operation.
Sensor data have traditionally been considered in the context of controlled processes [19,20], but in recent years, they have been actively used to detect anomalies [21] and cyberattacks [22]. Information coming from different points of the network makes it possible to identify atypical patterns of behaviour, such as a sudden increase in message frequency [23] or deviations in the spectrum of a physical quantity [24]. However, classic statistical methods often fail to cope with high dimensionality and non-trivial correlation between data channels.
The relevance of developing a neural network method for analysing sensor data to prevent cyberattacks is determined by a combination of factors: the increasing complexity of attack types (including targeted and hidden attacks), the need to process multidimensional flows in real time, and limited device resources (energy consumption, computing power). Machine learning methods and neural networks [25,26,27,28] allow adaptation to the object’s changing operating conditions and the detection of new, previously unseen attack signatures.
The first studies on neural network application in information security problems relied on simple fully connected architectures [29] and “shallow learning” [30]. These methods demonstrated satisfactory accuracy (up to 80–85%) on small datasets but were inferior in scalability and sensitivity to noise in the data. Their key limitation was the lack of a built-in feature extraction mechanism, which required careful manual selection and preprocessing.
With the development of deep learning technologies, convolutional neural networks (CNNs) have become widely used for working with sensory data in time series [31,32]. CNNs are good at extracting local patterns and are able to detect temporal anomalies without complex feature engineering. A number of studies, such as [32,33], have shown that convolutional filters are effective in analysing vibration, acoustic, and temperature sensors.
In parallel with CNNs, special attention has been paid to recurrent neural networks (RNNs), in particular, LSTM [34] and GRU [35], due to their ability to take into account long-term dependencies in time series. The developed models have shown high accuracy (over 90%) in detecting gradual changes in the sensor system’s behaviour, median attack characteristics [36], or covert malware injection [37].
Autoencoders [38] and variational autoencoders (VAEs) [39] have laid the foundation for unsupervised anomaly detection; by learning to reconstruct “normal” sensor patterns with low reconstruction errors (less than 1–2% [40,41]), they reveal anomalous, potentially malicious signals through a markedly higher reconstruction error. These approaches are instrumental in settings where labelled data are scarce.
Recently, graph neural networks (GNNs) have been actively developed for modelling correlations between sensors in distributed systems [42]. By representing a sensor network as a graph, where nodes are sensors and edges are communication channels, GNNs allow spatial and topological dependencies to be taken into account, which increases the accuracy of localisation.
Hybrid methods that combine classic statistical models with neural network blocks [43,44,45] have become popular due to the balance between interpretability and adaptability. In [46], an approach for unsupervised anomaly detection in sensor streams by combining statistical models and deep networks is presented. In [47], a label noise filter ensemble is proposed to improve local diagnostic interpretation of diabetes readings; in [48], an LSTM soft sensor for batch processes with just-in-time multi-model learning is developed; and in [49], a discrete conformal fractional sensing system for predicting CO2 emissions is described. However, the approaches considered [46,47,48,49] face a number of limitations: they require a large amount of labelling or threshold fine-tuning, are rarely tested on real data streams with conceptual drift, and often do not take into account the strict limitations of computational resources and the need for interpretability in online monitoring conditions. Thus, architectures where preliminary feature selection is based on signal entropy analysis and subsequent classification is performed using deep neural networks demonstrate better results on multichannel sensor arrays (Table 1).
Despite the successes, there are a number of unsolved problems:
  • The lack of unified datasets for evaluating the comparative effectiveness complicates the optimal architecture choice.
  • Many studies do not take into account the real transmission delays and end-device resource limitations.
  • Neural network solution interpretability remains limited. Security operators often require explanations of why a particular point was marked as abnormal, and without opening the deep models’ “black box”, practical implementation in industrial systems is difficult.
In addition, when developing methods for analysing sensory data, it is vital to consider resistance to adversarial attacks [50]—attackers can deliberately distort the input signal in such a way that the model will not notice the anomaly or, conversely, will create false positives. Embedding neural network models directly into sensor units (edge computing [51]) requires architecture optimisation in terms of memory and energy consumption, which remains a relevant research area. Thus, the development of a new neural network method for analysing sensory data should solve the following key problems: adaptability to new types of attack, resistance to adversarial disturbances, results interpretability, and the ability to work under conditions of limited computing resources.

2. Materials and Methods

2.1. Theoretical Foundations of Sensory Data Analysis for Cyberattack Prevention

It is known that in complex technical systems operating under modern conditions, the volume of sensory data grows exponentially. These data flows allow anomalies associated with cyberattacks to be detected and analysed at early stages [2,16,18]. This study proposes a structure for the sensory data analysis method for preventing cyberattacks (Figure 1) that reflects the main stages of processing and detecting such threats.
The sensory data model is based on the fact that x(t) is a readings vector from n sensors at time t [18]. It is assumed that in regular operation, the system is described by a linear stochastic differential equation:
$$dx(t) = A\,x(t)\,dt + B\,dw(t),$$
where $A \in \mathbb{R}^{n \times n}$ is the normal behaviour dynamics matrix, $B \in \mathbb{R}^{n \times m}$ is the noise intensity matrix, and $w(t)$ is an m-dimensional Wiener process [52].
In discrete time (with step Δt), this gives the following model:
$$x_{k+1} = \Phi\,x_k + v_k, \quad v_k \sim N(0, Q),$$
where
$$\Phi = \exp(A\,\Delta t), \quad Q = \int_0^{\Delta t} \exp(A\tau)\,B\,B^{\top} \exp(A^{\top}\tau)\,d\tau.$$
To minimise the estimated mean square error and obtain an optimal approximation of the actual state in the presence of the dynamics and observations, a recursive Kalman filter is used [53,54], which, at each step, predicts the system state vector based on the previous estimate and then corrects it, taking into account the incoming measurements. The following expressions describe the recursive Kalman filter:
$$\hat{x}_{k|k-1} = \Phi\,\hat{x}_{k-1|k-1}, \quad P_{k|k-1} = \Phi\,P_{k-1|k-1}\,\Phi^{\top} + Q,$$
$$K_k = P_{k|k-1}\,H^{\top}\left(H\,P_{k|k-1}\,H^{\top} + R\right)^{-1},$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\left(y_k - H\,\hat{x}_{k|k-1}\right), \quad P_{k|k} = \left(I - K_k H\right)P_{k|k-1},$$
where $y_k$ is the measurement vector (the observed part of $x_k$), $H$ is the observation matrix, and $R$ is the measurement noise covariance.
The residual (innovation) at the k-th step is the difference between the actual measurement and the predicted observed value:
$$r_k = y_k - H\,\hat{x}_{k|k-1}.$$
It characterises the magnitude of the deviation from the model; under normal behaviour, $r_k \sim N(0, S_k)$ has a zero mean and a covariance of the following form:
$$S_k = H\,P_{k|k-1}\,H^{\top} + R.$$
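The predict/correct cycle and the innovation can be illustrated with a minimal scalar sketch (n = 1). It assumes the standard gain form K = P_pred · h / S; the function name `kalman_step` and all numerical values are hypothetical and chosen only for illustration, not taken from the study.

```python
def kalman_step(x_est, p_est, y, phi, q, h, r):
    """One predict/correct cycle of a scalar Kalman filter.

    Returns the corrected estimate, its variance, the innovation r_k
    and the innovation variance S_k.
    """
    # Prediction from the previous estimate.
    x_pred = phi * x_est
    p_pred = phi * p_est * phi + q
    # Innovation (residual) and its variance.
    resid = y - h * x_pred
    s = h * p_pred * h + r
    # Gain and correction.
    k = p_pred * h / s
    x_new = x_pred + k * resid
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new, resid, s


# Hypothetical temperature-like readings; the last one is an outlier.
x, p = 20.0, 1.0
for y in [20.1, 19.9, 20.2, 25.0]:
    x, p, resid, s = kalman_step(x, p, y, phi=1.0, q=0.01, h=1.0, r=0.25)
    d = resid * resid / s  # scalar form of the normalised squared residual
    print(f"y={y:5.1f}  residual={resid:+.3f}  d={d:.2f}")
```

In this sketch the normalised squared residual stays small for the first three readings and becomes large for the outlier, which is the behaviour the residual test relies on.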
To reliably detect deviations from the system’s normal behaviour, a statistical test is constructed on the residuals $r_k$. The key idea is to compare the “distance” between the actual measurements and the predicted state with a threshold that ensures a given false-positive level. For this, the following statistic is defined:
$$d_k = r_k^{\top} S_k^{-1} r_k.$$
Under the “no attack” hypothesis ($H_0$), $r_k \sim N(0, S_k)$ holds, whence $d_k \sim \chi^2_m$, where m = rank($S_k$), usually equal to the dimension of the measurement vector. The threshold γ is selected so that the probability of exceeding it during normal behaviour does not exceed a predetermined level α (the false alarm level):
$$P_{FA} = P(d_k > \gamma \mid H_0) = \int_{\gamma}^{\infty} f_{\chi^2_m}(x)\,dx = \alpha.$$
From the inverse distribution function of $\chi^2_m$, we obtain the following:
$$\gamma = F^{-1}_{\chi^2_m}(1 - \alpha).$$
For example, for m = 3 and α = 0.01, the threshold is γ ≈ 11.34.
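The threshold for a given false alarm level can be checked numerically. The sketch below uses only the Python standard library: it evaluates the χ² distribution function through the series for the regularised incomplete gamma function and inverts it by bisection. The helper names `chi2_cdf` and `chi2_threshold` are assumptions introduced for this illustration.

```python
import math


def chi2_cdf(x, m):
    """CDF of the central chi-square distribution with m degrees of freedom."""
    if x <= 0.0:
        return 0.0
    a, z = m / 2.0, x / 2.0
    # Series for the regularised lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_threshold(alpha, m):
    """Threshold gamma with P(d > gamma | H0) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, m) < 1.0 - alpha:  # bracket the quantile
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, m) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


gamma = chi2_threshold(alpha=0.01, m=3)
print(round(gamma, 2))  # 11.34, matching the value quoted in the text
```

Where SciPy is available, `scipy.stats.chi2.ppf(1 - alpha, m)` returns the same quantile directly.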
In the presence of an attack (hypothesis $H_1$), the residuals acquire a nonzero mathematical expectation:
$$r_k \sim N(\mu_r, S_k), \quad \mu_r = H\left(\Phi\,\hat{x}_{k-1|k-1} + U u_0\right) - H\,\Phi\,\hat{x}_{k-1|k-1} = H\,U u_0,$$
where $u_0$ is the constant offset introduced by the attack and $U$ maps it into the state. Then, $d_k$ has a noncentral distribution (a quadratic form with a nonzero mean):
$$d_k \sim \chi^2_m(\lambda), \quad \lambda = \mu_r^{\top} S_k^{-1} \mu_r,$$
that is, the noncentral χ² distribution with the noncentrality parameter λ [55]. Then, the detection probability is represented as follows:
$$P_D = P(d_k > \gamma \mid H_1) = 1 - F_{\chi^2_m(\lambda)}(\gamma).$$
The value of λ is the attack “strength” $u_0^{\top} U^{\top} H^{\top} S_k^{-1} H\,U u_0$.
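To see how the detection probability grows with the attack strength, the noncentral χ² distribution function can be evaluated as a Poisson-weighted mixture of central χ² distribution functions. The following standard-library sketch does this; the function names and the default of 200 series terms are assumptions that are adequate only for moderate values of λ.

```python
import math


def chi2_cdf(x, m):
    """CDF of the central chi-square distribution with m degrees of freedom."""
    if x <= 0.0:
        return 0.0
    a, z = m / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def detection_probability(gamma, m, lam, terms=200):
    """P_D = 1 - F(gamma) for a noncentral chi-square with parameter lam."""
    if lam <= 0.0:
        return 1.0 - chi2_cdf(gamma, m)
    cdf = 0.0
    for j in range(terms):
        # Poisson(lam / 2) weight of the j-th central chi-square component.
        log_w = -lam / 2.0 + j * math.log(lam / 2.0) - math.lgamma(j + 1)
        cdf += math.exp(log_w) * chi2_cdf(gamma, m + 2 * j)
    return 1.0 - cdf


gamma = 11.34  # threshold for m = 3 and alpha = 0.01 quoted in the text
for lam in (0.0, 2.0, 5.0, 10.0, 20.0):
    print(lam, round(detection_probability(gamma, 3, lam), 3))
```

For λ = 0 the function returns the false alarm level itself, and the probability then rises monotonically with λ.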
In the sequential control scheme (at every frame k), the detection time is introduced:
$$T = \min\{k \ge 1 : d_k > \gamma\}.$$
For a small α and a constant drift, it can be shown through the additivity property of the noncentral χ² distribution that
$$E[T] \approx \frac{\gamma - m}{\lambda}.$$
This follows from the uniform increment of the mean, $E[d_k] \approx m + k\lambda$, that is, from the linear growth rate of the expectation.
Thus, the choice of threshold γ determines a trade-off between the false positive rate and the probability of missing an attack, while the noncentral χ² distribution allows $P_D$ to be expressed explicitly in terms of the noncentrality parameter λ, and the average detection delay is estimated as approximately $(\gamma - m)/\lambda$. This allowed us to formulate Theorem 1, the “Detection Time Theorem”:
Theorem 1. 
Let the attack introduce a constant offset $u_0 \ne 0$ into the system with the described model, and let the detector be tuned to a threshold γ corresponding to the false positive rate α. Then, the detection time $T = \min\{k \ge 1 : d_k > \gamma\}$ satisfies $E[T] \approx \dfrac{\gamma - m}{u_0^{\top} W u_0}$.
Proof of Theorem 1. 
The proof of this theorem consists of three steps:
  • Step 1. Growth of the expectation of the $d_k$ statistic.
  • Step 2. Markov estimate of the time to reach the level.
  • Step 3. Estimation of the average E[T].
To estimate the growth of the expectation of the $d_k$ statistic, the statistic $d_k = r_k^{\top} S_k^{-1} r_k$ introduced above is used, which compares the “distance” between the actual measurements and the predicted state with the threshold that provides a given false-positive level. In the presence of an attack, the residual model has the form $r_k = H\,U u_0 + \tilde{r}_k$, where $\tilde{r}_k \sim N(0, S_k)$ is the noise part. Then,
$$E[d_k] = E\left[\left(H\,U u_0 + \tilde{r}_k\right)^{\top} S_k^{-1} \left(H\,U u_0 + \tilde{r}_k\right)\right].$$
Expanding the brackets and taking into account that $E[\tilde{r}_k] = 0$ and $E[\tilde{r}_k \tilde{r}_k^{\top}] = S_k$, we obtain
$$E[d_k] = u_0^{\top} U^{\top} H^{\top} S_k^{-1} H\,U u_0 + \mathrm{tr}\left(S_k^{-1} S_k\right) = \lambda + m,$$
where
$$\lambda = u_0^{\top} W u_0, \quad W = U^{\top} H^{\top} S_k^{-1} H\,U, \quad m = \mathrm{rank}(S_k).$$
Note that in the steady state, $S_k$, $P$, and $W$ are constant; therefore,
$$E[d_k] \approx m + k\lambda.$$
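At the stage of the Markov estimate of the time to reach the level, T is taken to be the first time the threshold γ is crossed. Then, for any N ∈ ℕ,
$$\{T > N\} \Rightarrow \{d_1 \le \gamma, \ldots, d_N \le \gamma\} \Rightarrow \sum_{k=1}^{N} d_k \le N\gamma.$$
By Markov’s inequality for the non-negative random variable $\sum_{k=1}^{N} d_k$,
$$P(T > N) \le P\left(\sum_{k=1}^{N} d_k \le N\gamma\right) \le \frac{E\left[\sum_{k=1}^{N} d_k\right]}{N\gamma} \approx \frac{\sum_{k=1}^{N}\left(m + k\lambda\right)}{N\gamma}.$$
The arithmetic progression sum gives
$$\sum_{k=1}^{N}\left(m + k\lambda\right) = N m + \lambda\,\frac{N(N+1)}{2} = N\left(m + \frac{\lambda}{2}(N+1)\right).$$
This is why
$$P(T > N) \le \frac{m + \frac{\lambda}{2}(N+1)}{\gamma}.$$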
At the average E[T] estimating stage, the classic equality for non-negative integer T is used:
$$E[T] = \sum_{N=0}^{\infty} P(T > N).$$
Taking into account the estimate of P(T > N),
$$E[T] \le \sum_{N=0}^{\infty} \frac{m + \frac{\lambda}{2}(N+1)}{\gamma} = \frac{1}{\gamma} \sum_{N=0}^{\infty} \left(m + \frac{\lambda}{2}(N+1)\right).$$
However, the series $\sum_{N=0}^{\infty}(N+1)$ diverges, so we truncate the sum before the first crossing, approximating the expected detection time by the value $T^{*}$ at which the average $E[d_{T^{*}}] = \gamma$. From the first step,
$$m + T^{*}\lambda = \gamma \;\Rightarrow\; T^{*} = \frac{\gamma - m}{\lambda}.$$
Based on the inequality for the first level crossing time, an estimate was obtained:
$$E[T] \approx \frac{\gamma - m}{\lambda} = \frac{\gamma - m}{u_0^{\top} W u_0}.$$
Thus, the detection time is inversely proportional to the attack “strength” $u_0^{\top} W u_0$, which is quadratic in $u_0$, and depends linearly on the chosen threshold γ. The theorem is proved. □
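The estimate of Theorem 1 is simple to evaluate. The sketch below computes the quadratic form and the resulting delay estimate (γ − m)/λ in plain Python; the matrix `w` and the offsets are hypothetical values used only to show the scaling, not quantities from the study.

```python
def quadratic_form(u, w):
    """Return u' W u for a vector u and a square matrix w (list of rows)."""
    size = len(u)
    return sum(u[i] * w[i][j] * u[j] for i in range(size) for j in range(size))


def expected_detection_time(gamma, m, u0, w):
    """Approximate mean detection delay (gamma - m) / (u0' W u0)."""
    lam = quadratic_form(u0, w)
    if lam <= 0.0:
        raise ValueError("the attack strength u0' W u0 must be positive")
    return (gamma - m) / lam


# Hypothetical W and offsets; gamma and m follow the example in the text.
w = [[0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0],
     [0.0, 0.0, 0.5]]
weak = expected_detection_time(11.34, 3, [1.0, 0.0, 0.0], w)
strong = expected_detection_time(11.34, 3, [2.0, 0.0, 0.0], w)
print(weak, strong)
```

Doubling the offset quadruples λ, so the estimated delay shrinks by a factor of four.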

2.2. Development of a Neural Network Method for Analysing Sensory Data to Prevent Cyberattacks

Based on the theoretical model and residual statistics, the study proposes a hybrid method (Figure 2) that consists of training a neural network predictor of normal behaviour and an anomaly classification component based on residuals.
The LSTM predictor $f_\theta$ takes as input a sliding window of k − 1 previous samples $\{x_{t-k+1}, \ldots, x_{t-1}\}$ and passes them through several LSTM layers [21,34,56] to capture both short-term and long-term dependencies, after which a linear layer produces a one-step prediction $\hat{x}_t$. In the next step, the residual $r_t = x_t - \hat{x}_t$ is calculated, and the matrix $R_t = [r_{t-m+1}, \ldots, r_t]$ is formed from the last m residuals. This matrix is fed to the input of the anomaly classifier $g_\phi$ (e.g., a feedforward neural network) trained on normal and synthetically distorted data, which outputs a scalar score $s_t$. Threshold τ is chosen on the validation set so that the false-positive proportion $P_{FA}$ does not exceed a predetermined level, which ensures the required balance between the detector’s sensitivity and specificity.
The LSTM predictor (Figure 3) is a modified standard LSTM cell with an added “drift module”: an additional gate $d_t$ that introduces adaptive drift into the cell state. The following notation is used: $x_t \in \mathbb{R}^n$ is the input vector at time t, $h_{t-1} \in \mathbb{R}^h$ and $c_{t-1} \in \mathbb{R}^h$ are the previous hidden state and cell state, $W_*$ and $U_*$ are weight matrices, $b_*$ are the biases of the corresponding gates, σ(·) is the sigmoid, ⊙ is element-wise multiplication, and tanh(·) is the hyperbolic tangent.
The LSTM cell’s classic gates (input gate, forget gate, output gate, and state candidate) are described by traditional expressions [21,34,56]:
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right), \quad f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right),$$
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right), \quad \tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right).$$
The proposed drift module introduces an additional gate $d_t$ into the LSTM cell, which adaptively regulates the influence of a special “drift” vector on the state update, taking into account both the current input and the previous hidden state. It allows the model to adjust its predictions when the base level of the sensory data changes gradually, according to the following expression:
$$d_t = \sigma\left(W_d x_t + U_d h_{t-1} + b_d\right),$$
where $W_d \in \mathbb{R}^{h \times n}$, $U_d \in \mathbb{R}^{h \times h}$, $b_d \in \mathbb{R}^{h}$.
The drift signal can be taken either as a constant vector $u_0$ or as a function of the previous state:
$$\delta_t = V_d h_{t-1} + c_d,$$
where $V_d \in \mathbb{R}^{h \times h}$, $c_d \in \mathbb{R}^{h}$. Then, the “drift contribution” to the cell state is given by the following expression:
$$\Delta c_t = d_t \odot \delta_t.$$
The cell state update combines forgetting of the previous state, addition of new information, and adaptive drift through the corresponding gates:
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t + d_t \odot \delta_t.$$
The hidden state $h_t$ is obtained by filtering the activated cell state through the output gate:
$$h_t = o_t \odot \tanh(c_t).$$
To obtain a one-step prediction after L cells, the hidden state $h_t$ is passed through a linear output layer:
$$\hat{x}_{t+1} = W_y h_t + b_y,$$
where $W_y \in \mathbb{R}^{n \times h}$, $b_y \in \mathbb{R}^{n}$.
Thus, the proposed extension allows the LSTM model not only to take into account the input and past states but also to adaptively correct its predictions for drift, which allows the slowly changing underlying mode of the sensory data to be considered.
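The cell equations can be traced in a few lines of plain Python. The following is a minimal sketch of a single forward step of an LSTM cell with the extra drift gate, with matrices stored as lists of rows; the parameter dictionary layout and the helper `zero_params` are assumptions made for this illustration and do not reproduce the MATLAB implementation.

```python
import math


def _sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def _affine(w, x, u, h, b):
    """Return W x + U h + b with matrices stored as lists of rows."""
    return [
        sum(wi * xi for wi, xi in zip(w_row, x))
        + sum(ui * hi for ui, hi in zip(u_row, h))
        + bi
        for w_row, u_row, bi in zip(w, u, b)
    ]


def drift_lstm_step(x, h_prev, c_prev, p):
    """One step of the LSTM cell with the additional drift gate d_t."""
    def pre(name):
        return _affine(p["W" + name], x, p["U" + name], h_prev, p["b" + name])

    i = [_sigmoid(a) for a in pre("i")]        # input gate
    f = [_sigmoid(a) for a in pre("f")]        # forget gate
    o = [_sigmoid(a) for a in pre("o")]        # output gate
    d = [_sigmoid(a) for a in pre("d")]        # drift gate
    c_cand = [math.tanh(a) for a in pre("c")]  # state candidate
    # Drift signal delta_t = V_d h_{t-1} + c_d.
    delta = [
        sum(v * hj for v, hj in zip(row, h_prev)) + cd
        for row, cd in zip(p["Vd"], p["cd"])
    ]
    c = [
        fk * ck + ik * gk + dk * dl
        for fk, ck, ik, gk, dk, dl in zip(f, c_prev, i, c_cand, d, delta)
    ]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def zero_params(n, hidden):
    """Hypothetical helper: all-zero weights, useful only for checking shapes."""
    p = {}
    for name in "ifocd":
        p["W" + name] = [[0.0] * n for _ in range(hidden)]
        p["U" + name] = [[0.0] * hidden for _ in range(hidden)]
        p["b" + name] = [0.0] * hidden
    p["Vd"] = [[0.0] * hidden for _ in range(hidden)]
    p["cd"] = [0.0] * hidden
    return p


params = zero_params(n=3, hidden=2)
params["cd"] = [1.0, 1.0]  # constant drift signal
h, c = drift_lstm_step([21.0, 45.0, 400.0], [0.0, 0.0], [0.0, 0.0], params)
print(h, c)
```

With all weights at zero every gate equals 0.5 and the candidate is zero, so the new cell state comes entirely from the drift term, which isolates the effect of the added gate.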
The loss function for training the LSTM predictor $f_\theta$ is constructed from the mean square error of the predictions and weight regularisation [21,34,56], which is formalised as follows. Let the training dataset be a sequence of length T, and for each step t = k, …, T − 1, let an input window $X_t = (x_{t-k+1}, \ldots, x_t)$ be formed and the prediction $\hat{x}_{t+1} = f_\theta(X_t)$ be made. Then, the empirical loss function is defined as follows:
$$L_{pred}(\theta) = \frac{1}{T-k} \sum_{t=k}^{T-1} \left\| x_{t+1} - f_\theta(x_{t-k+1}, \ldots, x_t) \right\|_2^2 + \lambda_\theta \left\|\theta\right\|_2^2,$$
where $\|x_{t+1} - \hat{x}_{t+1}\|_2^2$ is the squared prediction error, the coefficient $1/(T-k)$ normalises by the number of steps, and $\|\theta\|_2^2 = \sum_i \theta_i^2$ is the L2 (weight decay) regularisation term controlled by the hyperparameter $\lambda_\theta > 0$, which helps prevent overfitting and favours smoother solutions.
In the optimisation statement, the mathematical expectation of this function is minimised over the training data distribution [57]:
$$\theta^{*} = \arg\min_{\theta}\; E_{train}\left[\left\| x_{t+1} - f_\theta(X_t) \right\|_2^2\right] + \lambda_\theta \left\|\theta\right\|_2^2.$$
For numerical optimisation, mini-batch stochastic gradient descent (SGD) [58] is usually used, in which at each j-th step, the parameter update is given by the following rule:
$$\theta_{j+1} = \theta_j - \eta\,\nabla_\theta L_{pred}^{B_j}(\theta_j),$$
where η is the learning rate and $L_{pred}^{B_j}$ is the loss function averaged over batch $B_j$. Adaptive variants of mini-batch SGD, such as the Adam or RMSProp optimisers [59], may also be used.
To better reproduce the temporal structure of the process, it is proposed that the term $\mu \sum_{t=k}^{T-2} \left\| (\hat{x}_{t+2} - \hat{x}_{t+1}) - (x_{t+2} - x_{t+1}) \right\|_2^2$ (where the coefficient μ > 0 specifies the weight of this penalty) is added to the loss function, penalising the discrepancy between the predicted and actual rates of change of the data, which ensures the prediction’s “smoothness”.
Taking into account the added penalty for the prediction’s “smoothness”, the resulting loss function is represented as follows:
$$L_{pred}(\theta) = \underbrace{\frac{1}{T-k} \sum_{t=k}^{T-1} \left\| x_{t+1} - f_\theta(x_{t-k+1}, \ldots, x_t) \right\|_2^2 + \lambda_\theta \left\|\theta\right\|_2^2}_{\text{MSE predictions}} + \underbrace{\mu \sum_{t=k}^{T-2} \left\| (\hat{x}_{t+2} - \hat{x}_{t+1}) - (x_{t+2} - x_{t+1}) \right\|_2^2}_{\text{Penalty for the dynamics inconsistency}}.$$
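The three components of the loss can be computed directly from aligned lists of actual and predicted vectors. The sketch below is a plain-Python illustration under the assumption that `x_true[j]` and `x_pred[j]` refer to the same time step; the function and argument names are hypothetical.

```python
def _diff(a, b):
    return [ai - bi for ai, bi in zip(a, b)]


def _sq_norm(v):
    return sum(vi * vi for vi in v)


def prediction_loss(x_true, x_pred, theta, lam_theta, mu):
    """MSE + L2 weight penalty + penalty on mismatched rates of change."""
    steps = len(x_true)
    mse = sum(_sq_norm(_diff(a, b)) for a, b in zip(x_true, x_pred)) / steps
    weight_decay = lam_theta * sum(w * w for w in theta)
    dynamics = 0.0
    for j in range(steps - 1):
        d_pred = _diff(x_pred[j + 1], x_pred[j])  # predicted increment
        d_true = _diff(x_true[j + 1], x_true[j])  # actual increment
        dynamics += _sq_norm(_diff(d_pred, d_true))
    return mse + weight_decay + mu * dynamics


# Hypothetical one-channel example.
x_true = [[0.0], [1.0], [2.0]]
x_pred = [[0.0], [2.0], [2.0]]
print(prediction_loss(x_true, x_pred, theta=[1.0, 2.0], lam_theta=0.1, mu=0.5))
```

In this example a single wrong prediction contributes once to the MSE term but twice to the dynamics penalty, because it distorts both the increment into and out of that step.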
To describe the residuals and the classifier input, it is assumed that at each moment t, the vector of actual sensory data values $x_t = (x_t^{1}, x_t^{2}, \ldots, x_t^{n}) \in \mathbb{R}^n$ is received and the corresponding prediction $\hat{x}_t = f_\theta(x_{t-k}, \ldots, x_{t-1}) \in \mathbb{R}^n$ is made. Then, the residual vector is defined as follows:
$$r_t = x_t - \hat{x}_t \in \mathbb{R}^n.$$
To feed information about the residual dynamics into the classifier input, a sliding window of length m is formed:
$$R_t = \left[r_{t-m+1}, r_{t-m+2}, \ldots, r_t\right] \in \mathbb{R}^{n \times m}.$$
In order to take into account the different scales of the sensor measurements, the residuals are normalised (“whitened”) using their empirical covariance S:
$$\tilde{r}_\tau = S^{-1/2} r_\tau, \quad \tilde{R}_t = \left[\tilde{r}_{t-m+1}, \tilde{r}_{t-m+2}, \ldots, \tilde{r}_t\right].$$
To feed the classifier, the matrix $\tilde{R}_t$ is flattened into a vector of the form
$$z_t = \mathrm{vec}(\tilde{R}_t) = \left[\tilde{r}_{t-m+1}^{\top}, \tilde{r}_{t-m+2}^{\top}, \ldots, \tilde{r}_t^{\top}\right]^{\top} \in \mathbb{R}^{n m},$$
which serves as the input of the anomaly classifier $s_t = g_\phi(z_t)$. The classifier produces a scalar score $s_t \in [0, 1]$, interpreted as the posterior probability of an anomaly.
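The residual window can be maintained with a fixed-length buffer. The sketch below makes a simplifying assumption that is not in the text: the covariance S is taken to be diagonal, so that whitening reduces to dividing each channel by its standard deviation. The class name `ResidualWindow` is hypothetical.

```python
import math
from collections import deque


class ResidualWindow:
    """Keeps the last m whitened residuals and flattens them into z_t."""

    def __init__(self, variances, m):
        # Diagonal-covariance assumption: S^(-1/2) is a per-channel scale.
        self.scale = [1.0 / math.sqrt(v) for v in variances]
        self.buffer = deque(maxlen=m)

    def push(self, x, x_hat):
        """Add the whitened residual of one measurement/prediction pair."""
        r = [(a - b) * s for a, b, s in zip(x, x_hat, self.scale)]
        self.buffer.append(r)

    def ready(self):
        return len(self.buffer) == self.buffer.maxlen

    def vector(self):
        """Return z_t = vec(R_t), oldest residual first."""
        return [value for r in self.buffer for value in r]


window = ResidualWindow(variances=[4.0, 1.0], m=2)
window.push([2.0, 1.0], [0.0, 0.0])
window.push([4.0, 3.0], [0.0, 1.0])
print(window.ready(), window.vector())
```

For a full covariance matrix the per-channel scale would be replaced by a matrix square root, for example one obtained from a Cholesky or eigenvalue decomposition.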
In this research, an MLP detector with a single hidden layer is used [60,61] (Figure 4), which is described by the following expression for the linear transformation and activation:
$$h_t = \phi_1\left(W^{(1)} z_t + b^{(1)}\right) \in \mathbb{R}^{d},$$
where $W^{(1)} \in \mathbb{R}^{d \times (n m)}$ and $b^{(1)} \in \mathbb{R}^{d}$ are the first layer parameters, $\phi_1(\cdot)$ is the SmoothReLU activation function [62,63], and d is the number of hidden neurons. The following expressions describe the output layer:
$$a_t = W^{(2)\top} h_t + b^{(2)}, \quad s_t = \sigma(a_t) = \frac{1}{1 + \exp(-a_t)},$$
where $W^{(2)} \in \mathbb{R}^{d}$, $b^{(2)} \in \mathbb{R}$.
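A forward pass of such a detector fits in a few lines. The sketch below applies the output layer to the hidden activations, as the dimension of $W^{(2)}$ implies, and uses softplus as a stand-in for SmoothReLU; that substitution is an assumption, since the exact form of SmoothReLU is defined in the cited references rather than in this section. All weights are hypothetical.

```python
import math


def softplus(a):
    """Numerically stable log(1 + exp(a)), used here in place of SmoothReLU."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def anomaly_score(z, w1, b1, w2, b2):
    """Single-hidden-layer MLP: z -> hidden activations -> sigmoid score."""
    hidden = [
        softplus(sum(w * zi for w, zi in zip(row, z)) + b)
        for row, b in zip(w1, b1)
    ]
    a = sum(w * hk for w, hk in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-a))


# Hypothetical weights for a 4-dimensional input and 2 hidden neurons.
w1 = [[0.5, 0.5, 0.5, 0.5],
      [-0.5, -0.5, -0.5, -0.5]]
b1 = [0.0, 0.0]
w2 = [1.0, 1.0]
b2 = -3.0
print(anomaly_score([0.0, 0.0, 0.0, 0.0], w1, b1, w2, b2))
print(anomaly_score([3.0, 3.0, 3.0, 3.0], w1, b1, w2, b2))
```

With these illustrative weights the score grows with the magnitude of the whitened residuals regardless of their sign, because the two hidden units respond to positive and negative deviations respectively.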
According to the logit model derived from the Maxwell–Boltzmann distribution principles [64], the scalar score $s_t = \sigma(a_t)$ is directly interpreted as the posterior probability that the current residual window belongs to the anomalous class. That is,
$$\log \frac{P(y_t = 1 \mid z_t)}{P(y_t = 0 \mid z_t)} = a_t,$$
which is why
$$s_t = P(y_t = 1 \mid z_t), \quad 1 - s_t = P(y_t = 0 \mid z_t).$$
The loss function for training the binary anomaly classifier is based on the cross-entropy between the true labels $y_t$ and the predicted probabilities $s_t$, which directly maximises the log likelihood of correct classification, while the added L2 regularisation of the weights, $\|\phi\|_2^2$, helps prevent overfitting and improves the model’s generalisation ability [21,34]. Thus,
$$L_{clas}(\phi) = -\frac{1}{N} \sum_{t=1}^{N} \left[ y_t \log s_t + (1 - y_t) \log(1 - s_t) \right] + \lambda_\phi \left\|\phi\right\|_2^2.$$
Gradient descent produces updates of the form:
$$\phi \leftarrow \phi - \eta\,\nabla_\phi L_{clas}(\phi).$$
The decision rule compares the obtained scalar score $s_t$ with a pre-selected threshold τ, which is defined on the validation dataset as the value at which the empirical false alarm rate
$$\hat{P}_{FA}(\tau) = \frac{1}{N_0} \sum_{t:\,y_t = 0} \mathbf{1}\{s_t > \tau\}$$
best corresponds to the required $P_{FA}$, where $N_0$ is the number of normal windows. If $s_t > \tau$, the presence of an attack is declared ($\hat{y}_t = 1$); otherwise, the system is considered to be in its normal state ($\hat{y}_t = 0$). Thus,
$$\hat{y}_t = \begin{cases} 1, & \text{if } s_t > \tau, \\ 0, & \text{if } s_t \le \tau. \end{cases}$$
In this case, threshold τ is set in such a way as to ensure the required level of false alarms $P_{FA}$ during validation:
$$\tau = \arg\min_{\tau} \left| \hat{P}_{FA}(\tau) - P_{FA} \right|,$$
where $\hat{P}_{FA}(\tau)$ is determined according to (48).
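Threshold calibration on validation data can be sketched as a search over the observed scores of normal windows. The code below restricts the candidate thresholds to those observed scores, which is an assumption of this illustration; the function names are hypothetical.

```python
def select_threshold(normal_scores, target_pfa):
    """Pick tau whose empirical false alarm rate is closest to the target."""
    n0 = len(normal_scores)
    best_tau, best_gap = None, float("inf")
    for tau in sorted(set(normal_scores)):
        pfa = sum(s > tau for s in normal_scores) / n0
        gap = abs(pfa - target_pfa)
        if gap < best_gap:
            best_tau, best_gap = tau, gap
    return best_tau


def decide(score, tau):
    """Decision rule: 1 (attack) if the score exceeds tau, otherwise 0."""
    return 1 if score > tau else 0


# Hypothetical validation scores of normal windows: 0.00, 0.01, ..., 0.99.
normal_scores = [i / 100 for i in range(100)]
tau = select_threshold(normal_scores, target_pfa=0.05)
print(tau, decide(0.97, tau), decide(0.50, tau))
```

Only scores from windows labelled as normal are needed for this step, so the threshold can be recalibrated without new attack examples.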
The anomaly detector quality metrics [65,66,67,68,69] are based on the confusion matrix, which is presented in Table 2.
According to Table 2, four indicators were obtained for N tested windows:
$$TP = \sum_{t=1}^{N} \mathbf{1}\{y_t = 1, \hat{y}_t = 1\}, \quad FP = \sum_{t=1}^{N} \mathbf{1}\{y_t = 0, \hat{y}_t = 1\},$$
$$TN = \sum_{t=1}^{N} \mathbf{1}\{y_t = 0, \hat{y}_t = 0\}, \quad FN = \sum_{t=1}^{N} \mathbf{1}\{y_t = 1, \hat{y}_t = 0\},$$
where TP is the number of cases in which the model correctly identified an anomaly (an “attack” signal in the presence of a real attack), TN is the number of cases in which the model correctly identified a normal state (no “attack” signal and no actual attack), FP is the number of false positives, i.e., the model generated an “attack” signal although the system was working normally, and FN is the number of misses, i.e., the model did not generate an “attack” signal although an attack actually occurred.
Recall (the true positive rate) shows the proportion of real attacks that were successfully detected and is defined as follows:
$$Recall = TPR = \frac{TP}{TP + FN}.$$
Precision reflects the proportion of correctly recognised attacks among all “attack” signals and is defined as follows:
$$Precision = \frac{TP}{TP + FP}.$$
The false alarm rate characterises the detector’s tendency to generate false signals and is defined as follows:
$$FPR = \frac{FP}{FP + TN}.$$
The overall recognition accuracy is defined as follows:
$$Accuracy = \frac{TP + TN}{N}.$$
To balance precision and recall, the F-measure ($F_\beta$) is used and is defined as follows:
$$F_\beta = \frac{(1 + \beta^2) \cdot Precision \cdot Recall}{\beta^2 \cdot Precision + Recall},$$
where β > 1 enhances the recall weight (detection value) and β < 1 enhances the precision weight.
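These metrics follow directly from the four confusion matrix counts. The sketch below computes them from label lists, using the standard definition of the false positive rate, FP / (FP + TN); the function names and the toy labels are hypothetical.

```python
def f_beta(precision, recall, beta=1.0):
    """F-measure; returns 0.0 when both precision and recall are zero."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0


def detector_metrics(y_true, y_pred, beta=1.0):
    """Confusion-matrix metrics for binary labels (1 = attack, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "fpr": fpr,
        "accuracy": (tp + tn) / len(pairs),
        "f_beta": f_beta(precision, recall, beta),
    }


# Hypothetical labels for eight windows.
metrics = detector_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
print(metrics)
print(round(f_beta(0.92, 0.89), 2))
```

As a consistency check, the precision of 0.92 and recall of 0.89 quoted in the abstract give an F1 score of about 0.90, in agreement with the value reported there.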
The ROC curve displays the dependence of the proportion of correctly detected attacks TPR(τ) on the proportion of false alarms FPR(τ) as threshold τ varies, and the area under the ROC curve (AUC) numerically characterises the model’s overall ability to separate classes: the closer the AUC is to 1, the better the detector distinguishes between normal and abnormal states. The AUC is equal to the probability that a randomly selected attack sample will receive a higher score $s_t$ than a randomly chosen normal sample. It is defined as follows:
$$AUC = \int_0^1 TPR\left(FPR^{-1}(u)\right) du$$
and serves as an integral measure of class separability.
The PR curve plots precision against recall as threshold τ changes, which is especially important for rare attacks, where the balance between false alarms and missed events is critical. The average precision is calculated as the sum of precision values weighted by recall increments:
$$AP = \sum_{i=1}^{K} \left(Recall_i - Recall_{i-1}\right) \cdot Precision_i,$$
where $Recall_0 = 0$ and $\{(Precision_i, Recall_i)\}_{i=1}^{K}$ are the PR curve points, ordered by decreasing score values.
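Both summary measures can be computed from scores and labels without plotting the curves. The sketch below evaluates the AUC through its probabilistic interpretation (pairwise comparison of attack and normal scores, with ties counted as one half) and the average precision as the recall-weighted sum of precision values; the function names and the toy data are hypothetical.

```python
def roc_auc(scores, labels):
    """Probability that a random attack scores higher than a random normal."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Sum of precision values weighted by the recall increments."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    tp, ap, prev_recall = 0, 0.0, 0.0
    for rank, i in enumerate(order, start=1):
        tp += labels[i]
        recall = tp / n_pos
        ap += (recall - prev_recall) * (tp / rank)
        prev_recall = recall
    return ap


scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels), average_precision(scores, labels))
```

The pairwise AUC computation is quadratic in the number of samples, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large validation sets.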
In addition to traditional metrics, the average response delay is also estimated:
$$E[T] = \frac{1}{N_1} \sum_{t:\,y_t = 1} \left(t - t_{attack,t}\right),$$
where $t_{attack,t}$ is the moment the attack begins, $N_1$ is the number of windows with real attacks, and the sum is taken over all such windows.
To enhance reactivity to prolonged minor deviations, one can construct “sliding” truncated cumulative sums of the score values:
$$S_t = \max\left(0,\; S_{t-1} + s_t - v\right),$$
and signal when $S_t > h$, similar to the CUSUM method [70].
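The cumulative rule can be sketched in a few lines, with v acting as the reference level subtracted at each step and h as the alarm level. The function name and the numerical values below are hypothetical and serve only to show the accumulation.

```python
def first_cusum_alarm(scores, v, h):
    """Index of the first step where the truncated cumulative sum exceeds h."""
    s = 0.0
    for t, score in enumerate(scores):
        s = max(0.0, s + score - v)  # reset to zero when evidence fades
        if s > h:
            return t
    return None


# Hypothetical scores: a sustained but moderate rise starting at index 2.
scores = [0.1, 0.1, 0.4, 0.4, 0.4, 0.4]
print(first_cusum_alarm(scores, v=0.2, h=0.5))  # 4
```

None of the individual scores in this example exceeds 0.5, yet the accumulated evidence raises an alarm at the third elevated sample, which is the intended sensitivity to prolonged minor deviations.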
Thus, based on the above, the developed method algorithm was synthesised and is presented in Table 3.
Thus, the developed hybrid approach combines the power of the LSTM predictor, which can capture complex nonlinearities, with the classic residual statistics of the SDE model and χ² tests, which ensures accurate and provable detection. At the same time, when normal conditions change, it is sufficient to retrain $f_\theta$ on new “clean” data, and threshold τ is readily adjusted through the empirical distribution of the anomaly score $s_t$ in accordance with the required false alarm level $P_{FA}$. For weak attacks, and with correct modelling at the stage of generating the vector $u_0$, a theorem analogous to that on the average detection delay is easily derived, where the expected increase $E[s_t]$ appears instead of the noncentrality parameter λ.

3. Case Study

3.1. Description of the Research Object and Experimental Setup

In this study, the research object is the industrial IoT controller sensor system for monitoring the environment in a government facility server room [71] (Figure 5) where critical equipment is located: servers, network storage, and confidential data processing systems. In this room, it is essential to maintain a stable temperature, optimal humidity, and the absence of harmful gases since even minor deviations can lead to equipment overheating, moisture condensation, or corrosion, which creates the risk of failure of the entire IT infrastructure. The IoT controller consists of three sensors:
  • The temperature sensor measures room temperature.
  • The humidity sensor measures air humidity.
  • The gas sensor measures gas concentrations, such as CO2 or volatile organic compounds.
Possible cyberattacks on an IoT controller include the following:
  • Spoofing, in which an attacker sends fake numeric values, such as an elevated temperature, to cause an emergency shutdown of the equipment.
  • A man-in-the-middle (MITM) attack, which allows the information being transmitted to be intercepted and modified.
  • A replay attack, in which old but correct data is transmitted to hide current conditions.
  • A denial of service (DoS) attack, which disrupts data transmission and paralyses the system.
Thus, by distorting sensor readings, attackers can cause cooling to be turned off, equipment to overheat, server failures to occur, and, as a result, denial of service (DoS) or loss of access to critical information (Figure 6).
Figure 6 shows how three sensors (temperature, humidity, and gas concentration) transmit key parameters of the server room environment to the IoT controller, while vector cyberattacks are implemented through spoofing (sensor value substitution), MITM (message interception and modification), replay attacks (re-sending old correct data), and DoS (communication channel disruption), which allows attackers to distort or block information and, accordingly, destabilise the entire system operation.
Given that the developed neural network method for analysing sensory data is deployed in a cyber police unit to prevent cyberattacks, the structural diagram of the research object is shown in Figure 7.
Figure 7 shows how data from three sensors (temperature, humidity, and gas concentration) is received by a local IoT controller and then transmitted via a secure channel to a remote cyber police analytical centre, where a neural network model in real time identifies anomalies characteristic of spoofing, MITM, replay, and DoS attacks and returns alarm signals and recommendations to the controller to block or filter suspicious messages.
In this research, to conduct a computational experiment, the developed neural network method for analysing sensory data to prevent cyberattacks is implemented as a test sample in the MATLAB Simulink R2014b software environment (Figure 8).
The model subsystems are organised as follows:
  • The input sensor data are first normalised in the preprocessing block (tanh transformation taking into account pre-calculated μ and σ);
  • The sliding-window buffer block accumulates the previous k − 1 samples into a vector for the LSTM predictor (MATLAB Function block), which, based on the states h(t − 1) and c(t − 1) and the model equations, produces a prediction $\hat{x}_t$ and the updated states;
  • The Residual Computation block calculates the residual $r_t = x_t - \hat{x}_t$ and “whitens” it by multiplying by $S^{-1/2}$;
  • The Residual Window Buffer accumulates the last m whitened residual vectors to form the matrix $R_t$;
  • In the Anomaly Classifier subsystem (a MATLAB Function block), an MLP (two linear layers with SmoothReLU and sigmoid activations) produces the anomaly probability estimate s(t) for the vector z = reshape($R_t$), which the Threshold Decision block (Compare To Constant) compares with the threshold τ to generate a Boolean alarm signal.
The model in MATLAB Simulink begins by receiving a three-channel sequential signal (temperature, humidity, and CO2) via the Sequence Input block (dimension is 3). Then, the data passes through a sliding-window mechanism based on the Buffer block (window length is 50 samples, step is five samples, and window overlap is 90%), after which it enters two LSTM layers:
  • The first LSTM layer contains 128 hidden elements, with tanh activation, output mode “sequence,” and a dropout of 0.2;
  • The second contains 64 hidden elements, the output mode is “last,” and dropout is 0.1. The second LSTM layer output is passed to the Fully Connected block (10 neurons, softmax) and then to the Classification Output block, which forms a probabilistic vector class label (norm, spoofing, replay, or DoS).
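The windowing step described above (50 samples per window, a step of five samples) can be illustrated with a minimal Python sketch. It is not the authors' Simulink Buffer block; the function names `sliding_windows` and `overlap_fraction` and the dummy series are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (temperature, humidity, CO2), hypothetical layout


def sliding_windows(samples: Sequence[Sample], length: int = 50, step: int = 5) -> List[List[Sample]]:
    """Split a multichannel series into overlapping windows of fixed length."""
    windows = []
    for start in range(0, len(samples) - length + 1, step):
        windows.append(list(samples[start:start + length]))
    return windows


def overlap_fraction(length: int, step: int) -> float:
    """Share of samples that two consecutive windows have in common."""
    return (length - step) / length


# One hour of dummy data at 1 Hz; the first channel simply stores the sample index.
series = [(float(t), 0.0, 0.0) for t in range(3600)]
windows = sliding_windows(series)
print(len(windows), overlap_fraction(50, 5))
```

With these settings, consecutive windows share 45 of their 50 samples, which matches the 90% overlap stated in the text.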
Network training is organised via a MATLAB R2014b script using the trainNetwork function and the following options: solver = “adam”, initialLearnRate = $10^{-3}$, maxEpochs = 50, miniBatchSize = 32, gradientThreshold = 1, shuffle = “every-epoch”. Upon training completion, the trained model is exported to Simulink format and, if necessary, converted using Simulink Quantizer for real-time deployment on embedded platforms.

3.2. Analysis and Preprocessing of the Training Dataset

In this research, the input data were the temperature, air humidity, and CO2 concentration values recorded by the IoT controller from the readings of the Texas Instruments TMP117 temperature sensor (Texas Instruments, Dallas, TX, USA) [72], the TE Connectivity HTU21D humidity sensor (TE Connectivity, Galway, Ireland) [73], and the SGX Sensortech MiCS-6814 gas sensor (SGX Sensortech SA, Corcelles-Cormondrèche, Switzerland) [74]. The IoT controller recorded the values at 1 s intervals for one hour (Figure 9), giving 3600 values of each parameter. Since the developed neural network method requires input data that are zero-centred, resistant to outliers, and smoothly compressed at extreme values, the research used hyperbolic tangent (tanh) normalisation, which converts the temperature, air humidity, and CO2 concentration values into dimensionless values in the interval [−1; 1] (Table 4) as follows [75]:
$$x_{\mathrm{norm}} = \tanh\left(\frac{x - \mu}{\sigma}\right),$$
where μ is the mean value and σ is the standard deviation.
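A minimal Python sketch of this normalisation is given below. It assumes that μ and σ are the population mean and standard deviation computed over the recorded series; the function name `tanh_normalise` and the sample readings are hypothetical.

```python
import math
from statistics import fmean, pstdev


def tanh_normalise(values):
    """Map raw readings to (-1, 1) with tanh((x - mu) / sigma)."""
    mu = fmean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("constant series cannot be normalised")
    return [math.tanh((v - mu) / sigma) for v in values], mu, sigma


# Hypothetical temperature readings in degrees Celsius with one extreme value.
readings = [21.0, 21.5, 22.0, 22.5, 35.0]
normalised, mu, sigma = tanh_normalise(readings)
print([round(z, 3) for z in normalised])
```

The extreme reading is compressed towards 1 rather than dominating the scale, which is the outlier-resistance property the text refers to.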
For the developed neural network method, the input dataset is the residual vector $r_t^{(i)} = x_t^{(i)} - \hat{x}_t^{(i)}$, which is used to form the anomaly classifier inputs (Table 5).
Figure 10 shows the residuals of each sensor (the difference between the signal and its moving average prediction) over normalised time from 0.5 to 1, that is, the noise component that the anomaly detector uses to estimate deviations from normal behaviour.
All three channels show an approximately zero mean and random fluctuations on the order of ±0.2…0.3, with the amplitude and frequency of the noise peaks remaining relatively stable over the entire interval. Such uniform “ringing” without pronounced trends or structures allows these residuals to be used as the anomaly detector input, which tracks significant deviations from this baseline noise level.
Table 6 shows the results of the homogeneity assessment of the standardised-residual training dataset for the three sensors, with each series divided into five equal segments.
The W(p > 0.05) label indicates homogeneity of variances for all sensors. The F(p > 0.05) label indicates homogeneity of mean values only for Sensor 3, while for Sensors 1 and 2, statistically significant differences in mean values between segments are observed.
The residuals’ overall sample mean for the entire period is defined as follows [76]:
$$\bar{r} = \frac{1}{N} \sum_{t=1}^{N} r_t,$$
where $r_t$ is the standardised residual at time t and N is the total number of observations. In Table 6, the value close to zero confirms the absence of a systematic shift in the predictor. The residuals’ sample variance over the entire interval is defined as follows [76]:
$$s^2 = \frac{1}{N} \sum_{t=1}^{N} \left(r_t - \bar{r}\right)^2.$$
The obtained $s^2$ value near one indicates that the standardisation was performed correctly, and the residuals have the same variance.
The Levene test value for the homogeneity between k segments is defined as follows [77]:
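$$W = \frac{(N - k) \sum_{j=1}^{k} n_j \left(\bar{z}_j - \bar{z}\right)^2}{(k - 1) \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(z_{ij} - \bar{z}_j\right)^2},$$
where $z_{ij} = \left|r_{ij} - \bar{r}_j\right|$, $\bar{z}_j$ is the mean of $z_{ij}$ in segment j, $\bar{z}$ is the overall mean of $z_{ij}$, and $n_j$ is the number of observations in segment j. A p-value greater than 0.05 means that the null hypothesis of equal variances is not rejected.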
The one-way ANOVA (F-test) value for means equality across k segments is defined as follows [78]:
$$F = \frac{\sum_{j=1}^{k} n_j \left(\bar{r}_j - \bar{r}\right)^2 / (k - 1)}{\sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(r_{ij} - \bar{r}_j\right)^2 / (N - k)}.$$
The F-distribution p-value evaluates whether there are significant differences between segment means.
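The logical label for variance homogeneity (var_homogeneous) is defined as follows:
$$H_0 = \left(W_p > \alpha\right),$$
where $W_p$ is the p-value of the Levene statistic and α = 0.05 is the significance level.
The logical label for homogeneity of means is defined in the same way from the p-value $F_p$ of the F-test:
$$H_0 = \left(F_p > \alpha\right).$$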
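The two segment-wise statistics can be computed directly from their textbook definitions. The sketch below is a plain Python illustration and not the authors' MATLAB code. It uses the mean-centred form of the Levene test, relies on the fact that W is the one-way ANOVA F statistic applied to the absolute deviations, and omits the p-values because the F-distribution is not available in the Python standard library. The function names are hypothetical.

```python
from statistics import fmean


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of segments."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([v for g in groups for v in g])
    means = [fmean(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    # A zero within-segment sum of squares makes the statistic undefined.
    return (between / (k - 1)) / (within / (n_total - k))


def levene_w(groups):
    """Levene statistic: the F statistic of the absolute deviations from segment means."""
    deviations = [[abs(v - fmean(g)) for v in g] for g in groups]
    return one_way_f(deviations)


# Three hypothetical segments with shifted means but identical spread.
segments = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(one_way_f(segments), levene_w(segments))
```

In this toy example the means differ (F is about 3) while the spreads are identical (W is 0), the same pattern of heterogeneous means with homogeneous variances that the text reports for Sensors 1 and 2.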
To assess the representativeness of the training dataset (see Table 5), the k-means clustering method was used [79,80]. The dataset of 10,800 values was randomly divided in a 2:1 ratio: 67% (7236 values) formed the training subset, and 33% (3564 values) formed the validation subset. Clustering the training subset identified seven classes (classes I…VII), with a metric distance between clusters of no more than 0.12, which confirms that both subsets have a homogeneous structure (see Figure 11). On this basis, the final split was fixed at 7236 values in the training subset and 3564 values in the validation subset.
Thus, the homogeneity and representativeness assessment results indicate that the training dataset can be used in the computational experiment with the developed neural network method.

3.3. Results of Testing a Neural Network Method for Analysing Sensory Data to Prevent Cyberattacks

3.3.1. Test Results

For synthetic enrichment of the dataset, a procedure consisting of attack modelling, class balancing, annotation, and partitioning was used, as presented in Table 7.
Table 8 provides a detailed classification of the cyberattacks used in the experiments, indicating their subtypes and key parameters.
Synthetic attacks are introduced into the original series as follows: a MATLAB script adds anomaly fragments to each of the three normalised reading channels (temperature, humidity, and CO2) according to pre-set scenarios (spoofing, replay, and DoS):
  • For spoofing, constant and drift offsets (Δ from 0.2 to 0.8 of the norm, drift rate α of 0.01…0.05·Aₘₐₓ) and random bursts of up to 1.0·Aₘₐₓ with a frequency of 0.5…1 times/min are introduced on 30…120 s segments;
  • For replay, 25…100 s segments are replaced with previously recorded “clean” segments;
  • For DoS, 10…60 s fragments are either zeroed out or contaminated with white noise with variance $\sigma^2 = 0.5 \cdot \mathrm{Var}(u_{\mathrm{norm}})$.
After generation, the overall “norm/attack” ratio is set at ≈85%:15%, which corresponds to the class imbalance coefficient $IR = N_{\mathrm{norm}} / N_{\mathrm{attack}} \approx 5.7$.
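The injection scenarios can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the authors' MATLAB script: only a constant spoofing offset, a replay of an earlier segment, and a zero-level DoS are shown, and all function names, segment positions, and the synthetic sine signal are hypothetical.

```python
import math


def inject_spoof_offset(signal, start, length, delta):
    """Spoofing: add a constant offset delta on [start, start + length)."""
    out = list(signal)
    for i in range(start, start + length):
        out[i] += delta
    return out


def inject_replay(signal, start, length, source_start):
    """Replay: overwrite a segment with an earlier clean segment of equal length."""
    out = list(signal)
    out[start:start + length] = signal[source_start:source_start + length]
    return out


def inject_dos_zero(signal, start, length):
    """DoS: erase a fragment to the zero level."""
    out = list(signal)
    out[start:start + length] = [0.0] * length
    return out


def imbalance_ratio(labels):
    """IR = N_norm / N_attack for 0/1 labels (1 marks an attack sample)."""
    n_attack = sum(labels)
    return (len(labels) - n_attack) / n_attack


# One hour of a hypothetical normalised channel sampled at 1 Hz.
clean = [0.1 * math.sin(t / 50.0) for t in range(3600)]
spoofed = inject_spoof_offset(clean, start=600, length=60, delta=0.5)
replayed = inject_replay(clean, start=1000, length=50, source_start=100)
dropped = inject_dos_zero(clean, start=2000, length=30)

# An 85%:15% split of 3600 samples gives 3060 normal and 540 attack labels.
labels = [1] * 540 + [0] * 3060
print(round(imbalance_ratio(labels), 1))  # 5.7
```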
The results of testing the developed neural network method for analysing sensor data to prevent cyberattacks (see Figure 2, Figure 3 and Figure 4) on the research object, using cyberattacks on an IoT controller as an example (see Figure 5, Figure 6 and Figure 7), were as follows:
  • The original signal and prediction time series (Figure 12), which superimpose $x_t$ and $\hat{x}_t$ for each sensor, show the prediction quality of the LSTM predictor.
  • Residual diagrams (Figure 13) reflecting the noise component and identified outliers.
  • Standardised residual diagrams (Figure 14), similar to residual diagrams but normalised to zero mean and unit variance, are used to assess the distribution normality.
  • The “whitened” residual diagrams (Figure 15), in which correlated channels are transformed into uncorrelated ones, which is convenient for clustering anomalies.
  • The residual matrix in a sliding window (Figure 16), the array $R_t \in \mathbb{R}^{n \times m}$, allows for local deviation pattern analysis.
  • The diagram of the abnormal rate $s_t$ over time (Figure 17) tracks changes in the attack probability and the locations of sharp peaks.
  • The cumulative sum (CUSUM) diagram (Figure 18) provides a curve $S_t = \max(0, S_{t-1} + s_t - \nu)$ for early response to protracted minor anomalies.
  • The ROC curve (Figure 19), which represents the dependence of TPR(τ) on FPR(τ) for different thresholds τ, illustrates the “sensitivity–false alarms” trade-off.
  • The PR curve (Figure 20), which represents the dependence of precision on recall with varying τ, is more informative for rare attacks.
  • The detection delay histogram (Figure 21), which represents the distribution of the times T from the attack’s actual start to the moment the detector is triggered, is used to estimate the reaction speed.
Figure 12 superimposes the actual $x_t$ time series (blue, green, and purple curves) and the $\hat{x}_t$ predictions (red curve) for each of the three sensors over normalised time from 0.5 to 1. The close agreement of the curves and the density of their intersections indicate the prediction model’s adequacy: the prediction deviations from the actual values do not exceed the noise component spread, which provides a reliable basis for subsequent residual analysis and anomaly detection.
Figure 13 shows the residual time dynamics for each of the three sensors: for Sensor 1 and Sensor 2, rare outliers in the order of ±0.25 are observed. At the same time, Sensor 3 demonstrates a more uniform distribution of the noise component without pronounced peaks. In all three cases, the residual average value is close to zero, which indicates the absence of systematic biases, and the presence of unit variance confirms that the prediction model has adequately separated the trend component from the noise component, allowing the anomaly detector to respond to statistically significant deviations.
The diagrams of the standardised residuals $\hat{r}_t = (r_t - \bar{r}) / \sigma_t$ for each sensor (Figure 14) show that all three series fluctuate around the zero level (dashed line) with approximately the same point density in the ±3σ range, which indicates a satisfactory approximation to the normal distribution without strong asymmetries or artefacts. At the same time, the uniformity of the spread and of the rare extremes (close to the ±3 boundaries) confirms the residuals’ correct standardisation and suitability for subsequent statistical testing for anomalies.
The “whitened” residual diagrams (Figure 15) for each sensor show that after applying the whitening transformation, the noise fluctuations remain centred around zero at normalised times from 0.5 to 1, while the peak amplitudes are more evenly distributed and do not show cross-correlation between channels. The components’ independence simplifies the application of clustering and statistical detection methods since each whitened residual value for a sensor now reflects its own, uncorrelated noise component.
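The effect of whitening can be illustrated with a short Python sketch. It deliberately swaps in Cholesky whitening (multiplying by the inverse of the lower-triangular factor L, where S = L·Lᵀ) in place of the symmetric $S^{-1/2}$ transform used in the article, because it needs only the standard library; both variants yield residuals with an identity covariance matrix. The function names and the synthetic correlated residuals are assumptions.

```python
import math
import random


def covariance(rows):
    """Return the channel means and the (1/n) covariance matrix of the rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / n
            for b in range(d)] for a in range(d)]
    return means, cov


def cholesky(s):
    """Lower-triangular L with s = L * L^T (s must be positive definite)."""
    d = len(s)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def whiten(rows):
    """Centre each row and solve L * y = r by forward substitution."""
    means, cov = covariance(rows)
    low = cholesky(cov)
    out = []
    for r in rows:
        centred = [r[j] - means[j] for j in range(len(r))]
        y = []
        for i in range(len(centred)):
            y.append((centred[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
        out.append(y)
    return out


# Synthetic three-channel residuals with deliberate cross-correlation.
rng = random.Random(0)
residuals = []
for _ in range(500):
    a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    residuals.append([a, 0.8 * a + 0.3 * b, -0.5 * a + 0.2 * c])

whitened = whiten(residuals)
_, white_cov = covariance(whitened)
print([[round(v, 6) for v in row] for row in white_cov])
```

After the transform, the empirical covariance of the three channels is the identity matrix up to rounding, so no cross-correlation between channels remains.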
The residual matrix heat map $R_t \in \mathbb{R}^{n \times m}$ (Figure 16) constructed over the time window of the last m = 20 points for the three sensors visualises local anomaly patterns: the axes represent sensors and time indices, and the colour scale displays the whitened residuals’ deviation magnitude. Consecutive “hot” (red) and “cold” (blue) zones are observed, which may indicate short-term structural changes or potential attack signatures. The residual matrix heat map allows for the rapid detection of correlated violations or single outliers for preliminary analysis and anomaly flagging before being fed into the classifier.
The time history diagrams of the abnormal rate $s_t$ for each sensor (Figure 17) show the squared norm of the whitened residual as a local anomaly measure: the peaks of $s_t$ correspond to potential deviations from normal behaviour caused by noise, failures, or cyberattacks. Figure 17 shows that in most cases, the abnormal rate remains close to low values corresponding to regular operation. However, sharp spikes exceeding the adaptively selected threshold (dashed red line) are occasionally observed, indicating an increased anomaly probability.
The CUSUM diagrams (Figure 18) show the dynamics of the accumulated rate $S_t = \max(0, S_{t-1} + s_t - \nu)$ for each sensor, where ν acts as an acceptable average level of anomaly. Such curves are sensitive to long-term but weak deviations that could remain unnoticed with threshold detection. Visually, one can observe phases of smooth $S_t$ growth, indicating the accumulation of weak anomalies, which, in total, signal a potential threat. The return of $S_t$ to zero after reaching a local maximum corresponds to the period of return to the normal state, which makes CUSUM one of the main tools for monitoring stable deviations and identifying protracted attacks.
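The one-sided CUSUM recursion is simple enough to show in a few lines of Python. The sketch below assumes a scalar anomaly score per time step and a fixed alarm level h. The value of h, the example scores, and the helper names are hypothetical and are not taken from the article.

```python
def cusum(scores, nu):
    """One-sided CUSUM: S_t = max(0, S_{t-1} + s_t - nu), starting from S_0 = 0."""
    total = 0.0
    path = []
    for s in scores:
        total = max(0.0, total + s - nu)
        path.append(total)
    return path


def first_alarm(path, h):
    """Index of the first time step at which the accumulated sum exceeds h, else None."""
    for t, value in enumerate(path):
        if value > h:
            return t
    return None


# Weak but persistent deviation: individual scores stay moderate, the sum grows.
scores = [0.1, 0.1, 0.6, 0.6, 0.6, 0.1]
path = cusum(scores, nu=0.3)
print([round(v, 2) for v in path], first_alarm(path, h=0.5))
```

Scores below ν pull the sum back towards zero, while a run of scores slightly above ν accumulates until the alarm level is crossed, which is how CUSUM catches protracted weak anomalies that a per-sample threshold would miss.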
ROC curve analysis (Figure 19) for the three sensors (temperature, humidity, and gas concentration) shows different levels of model performance for each type of data. The temperature sensor shows the highest AUC, indicating the model’s high accuracy and sensitivity to deviations in the temperature data, allowing it to effectively distinguish between normal and abnormal conditions with a minimum number of false positives. The humidity sensor shows a slightly lower AUC, indicating the model’s lesser ability to accurately distinguish between normal and abnormal conditions. However, this does not necessarily imply poor performance since such data may contain more complex or less pronounced anomalies. The gas sensor, in turn, has the lowest AUC, indicating a more difficult classification task for gas concentration data due to noise or subtle deviations, which may require more complex algorithms [66,67]. For all sensors, there is a trade-off between the true positive rate (TPR) and the false positive rate (FPR): increasing the threshold results in fewer false alarms but may reduce the number of correctly detected anomalies.
The PR curves for the temperature, humidity, and gas concentration sensors (Figure 20) show how precision changes with recall for different threshold values τ. For all sensors, precision increases noticeably in the initial stages of increasing recall, which is usually associated with a decrease in the number of false alarms with increasing threshold.
For the temperature and humidity sensors, the curves show a steady increase in precision up to a specific recall value, after which the precision stabilises. This confirms that for rare attacks, the model classifies true positives better, and with increasing recall, the number of false alarms also increases, which reduces precision.
For the gas concentration sensor, the curves show similar behaviour, but precision remains slightly lower, which may indicate difficulty in classifying anomalies in gas data, possibly due to noise or a comparatively large standard deviation. Overall, the PR curves demonstrate the importance of tuning the threshold to balance recall (the model’s ability to detect anomalies) and precision (the ability to avoid false alarms).
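How a single threshold τ produces one point on the ROC curve and one on the PR curve can be shown with a small Python sketch. The scores and labels below are invented for illustration and do not come from the article; the function names are hypothetical.

```python
def confusion_at(scores, labels, tau):
    """Count TP, FP, FN, TN when a score >= tau is flagged as an attack."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= tau
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def operating_point(scores, labels, tau):
    """Return (TPR, FPR, precision) for one threshold."""
    tp, fp, fn, tn = confusion_at(scores, labels, tau)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when nothing is flagged
    return tpr, fpr, precision


# Hypothetical anomaly probabilities and ground-truth labels (1 = attack).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
for tau in (0.35, 0.5, 0.65):
    print(tau, operating_point(scores, labels, tau))
```

Raising τ from 0.35 to 0.65 removes the single false alarm (FPR falls from 0.25 to 0, precision rises from 0.8 to 1.0) at the cost of one missed attack (TPR falls from 1.0 to 0.75), which is the trade-off that both curves visualise.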
To construct a detection delay histogram for each sensor, which shows the time T distribution from the attack’s actual start to the moment the detector is triggered, data on the detector response time for each sensor were used [81] (Figure 21).
The detection latency analysis (Figure 21) shows that the temperature sensor exhibits the most compact latency distribution, with the majority of values in the 1 to 3 s range, indicating the model’s high sensitivity and fast response to anomalies, especially for attacks related to temperature changes. The humidity sensor has a slightly wider latency distribution, with most triggers occurring in the 2 to 4 s range, which may indicate that the model is less sensitive to changes in humidity data, slowing its response. The gas concentration sensor exhibits the widest latency distribution, with triggers occurring in the 3 to 6 s range. This is associated with greater difficulty in detecting anomalous changes in gas concentration data, which require more time for processing and model response.
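The delay T for each attack can be computed as the time from the attack start to the first alarm raised while the attack is still in progress. The Python sketch below illustrates this under the assumption that attack intervals and alarm timestamps are available in seconds; the data and names are hypothetical.

```python
from collections import Counter


def detection_delays(attacks, alarm_times):
    """Delay from each attack start to the first alarm inside its interval (None if missed)."""
    alarms = sorted(alarm_times)
    delays = []
    for start, end in attacks:
        hits = [a for a in alarms if start <= a <= end]
        delays.append(hits[0] - start if hits else None)
    return delays


# Hypothetical attack intervals (start, end) and alarm timestamps, in seconds.
attacks = [(100, 160), (500, 530), (900, 960)]
alarms = [102, 503, 504, 700]

delays = detection_delays(attacks, alarms)
histogram = Counter(d for d in delays if d is not None)
print(delays, dict(histogram))
```

The alarm at 700 s falls outside every attack interval and would count as a false alarm rather than a detection, and the third attack is reported as missed instead of being assigned an arbitrary delay.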

3.3.2. Implementation for Practical Activities of Cyber Police

The developed method of sensor data analysis using neural networks is integrated into cyber police activities to prevent cyberattacks aimed at systems using sensors. This method allows sensor data to be monitored in real time, which makes it possible to promptly detect anomalies indicating possible cyberattacks. The use of this method by cyber police will ensure the protection of critical facilities, such as server rooms, industrial systems, and transport networks, which are highly dependent on sensor stability [82,83,84].
The system receives data from several sensors, such as temperature, humidity, and gas, at a high frequency, for example, once per second. Figure 12 shows both the original data and their predictions obtained using a neural network. These data are used to calculate residuals, which allow deviations from the system’s normal state to be detected. In particular, the model predictions (red line) are compared with the actual sensor values (blue and green lines), which clearly show possible deviations.
After receiving the data and making predictions, residuals are calculated, which show the difference between the actual values and the model predictions (Figure 13). If the residuals significantly exceed normal values, this may indicate the presence of an anomaly. To improve the accuracy of the analysis, the residuals are standardised (Figure 14), which reduces the noise impact and improves data interpretation. Residual whitening (Figure 15) eliminates the correlation between channels, simplifying further analysis and anomaly localisation.
The CUSUM method is used in the analysis (Figure 18) to track accumulated deviations from the normal state. It allows the system to detect long-term but weak deviations that could otherwise go unnoticed. Anomaly probability diagrams (Figure 17) show the moments when the anomaly probability increases significantly, which signals a possible attack. Assessing system performance using ROC and PR curves (Figure 19 and Figure 20) helps tune system parameters to achieve the optimal balance between precision and recall, minimising false alarms while maximising attack detection. The detection latency histogram (Figure 21) is used to evaluate the speed of the system’s response to attacks, which is critical to preventing significant damage in real time.
A flowchart (Figure 22) has been developed for implementing the neural network method for analysing sensor data to prevent cyberattacks in cyber police activities. At the first stage, data is collected from sensors (temperature, humidity, and gas concentration), which enter the system at a high frequency. Then, the data is transferred to the LSTM network, which predicts the expected sensor values based on normal data. After that, the residuals, which are the difference between the actual values and the model’s predictions, are calculated. The residuals are then standardised to reduce the noise influence and ensure the accuracy of detecting deviations, and whitened to eliminate the correlation between the channels, which simplifies further analysis. The CUSUM method is used to detect deviations, which helps to track long-term but weak anomalies. After that, the system estimates the anomaly probability based on probability diagrams and also evaluates the performance using ROC and PR curves to configure optimal parameters. Based on the detection latency histogram, the system evaluates the response speed and decides to block or filter suspicious data.
To increase operator confidence in the model’s decisions and ensure transparency in the cyber police environment, the implementation provides for multi-level interpretation. In each window of the residual matrix $R_t$, contribution indicators are calculated for each channel. These are the components $d_k$ of the weighted squared residual score according to (7), which allow one to unambiguously identify the sensors with the most significant deviation from the prediction. For example, during spoofing on the temperature channel, a gradual increase in the $r_{\mathrm{temp}}^2$ component is observed, which signals a baseline drift; during a DoS attack on the humidity channel, the contribution of the noise variance $\sigma^2$ increases sharply, indicating the signal’s high-frequency “contamination”.
At the post-processing stage, the SHAP (SHapley Additive exPlanations) method is applied to the MLP classifier outputs. Accordingly, for each classified fragment, Shapley values are estimated, showing how much each input feature (the residual sliding window for each of the three channels and their statistical characteristics: the mean, variance, and peak outliers) influenced the final “abnormality” logit. The SHAP method allows cyber police to receive not only a binary “attack/norm” signal but also a report on which sensor indicators and which deviation types (long-term drift, single bursts, noise bursts) serve as the basis for the response. The interpretation results are visualised as a contribution bar chart and a text explanation. This ensures prompt adoption of countermeasures and serves as an evidence base for the incident’s subsequent investigation.

3.4. Evaluation of the Effectiveness of the Neural Network Method for Analysing Sensory Data to Prevent Cyberattacks

A comparative study with several popular anomaly detection methods was conducted to evaluate the developed method of sensor data analysis for the early detection and prevention of cyberattacks. Table 9 presents the quality metric values for the compared methods, including the developed method, the isolation forest-based method (IForest), support vector machine (SVM), and the k-means method (K-means). During the comparative analysis, precision, recall, F1 score, AUC (area under the ROC curve), and training time were evaluated.
Analysis of the comparative testing results shows that the developed method outperforms all the compared algorithms in terms of the main quality metrics: precision is 0.92, recall is 0.89, the F1 score is 0.90, and the area under the ROC curve (AUC) is 0.94; this indicates its high ability to correctly distinguish between normal and abnormal states with a minimum level of false positives. At the same time, the training time, which is 15 s, remains moderate and ensures deployment efficiency. The isolation forest method, despite having the shortest training time (5 s), demonstrates lower precision and recall, which limits its use in tasks that are sensitive to false alarms. SVM shows close precision at 0.89 and recall at 0.87 but requires significantly more training time (25 s), which may be undesirable when processing large amounts of data. VAE (precision is 0.85, recall is 0.83, AUC is 0.88) and K-means (precision is 0.80, recall is 0.75, AUC is 0.83) are inferior in all key indicators, and CNN with MLP (precision is 0.90, recall is 0.88, AUC is 0.92) is close to the developed method in quality but has the highest training load (30 s), which reduces its practical attractiveness in resource-limited scenarios.
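As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The Python snippet below does only this arithmetic; the function name is hypothetical.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported for the developed method.
f1 = f1_score(0.92, 0.89)
print(round(f1, 2))  # 0.9
```

The result, 2 · 0.92 · 0.89 / (0.92 + 0.89) ≈ 0.905, agrees with the reported F1 score of 0.90.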
To compare the performance of the modified LSTM neural network architecture with other popular architectures for anomaly detection, a comparative analysis of the following neural networks was performed (Table 10): LSTM, GRU, CNN, and MLP (multilayer perceptron). All methods were evaluated by precision, recall, F1 score, AUC (area under the ROC curve), and training time.
The comparative analysis results show that the modified LSTM network architecture provides the best precision, recall, and F1 score, making it the most suitable for sensor data anomaly detection tasks where long-term dependencies in time series are essential to consider. The high area under the ROC curve (AUC = 0.94) confirms its ability to effectively separate normal data from anomalies while minimising the number of false alarms. At the same time, GRU, although inferior to LSTM in precision and recall, demonstrates precision = 0.88 and recall = 0.85 with a shorter training time (20 s). CNN shows precision = 0.85 and recall = 0.83, but its training time is significantly higher (30 s), and its application to time series analysis is limited, as this architecture is more suitable for image processing. MLP, although having the lowest precision and recall (0.82 and 0.80, respectively), is fast to train (15 s), which can be helpful for simple problems where temporal dependencies and complex patterns are not so critical.
The modified LSTM network with an additional drift module (see Figure 3) was compared with the traditional LSTM, as well as other adaptive models (e.g., adaptive LSTM, residual LSTM), in terms of the metrics of precision, recall, F1 score, AUC (area under the ROC curve), and training time (Table 11).
In Table 3, Traditional LSTM is a classic LSTM network without a drift module, Adaptive LSTM is an LSTM with an adaptive mechanism for changing the weight learning rate, Residual LSTM is an LSTM with residual connections between layers to reduce gradient attenuation, and Modified LSTM is a developed LSTM with an additional drift module (see Figure 3).
The comparative results analysis (Table 11) shows that the developed modified LSTM network with an integrated drift module (see Figure 3) provides the best performance in all key quality metrics (precision is 0.92, recall is 0.89, the F1 score is 0.90, and AUC is 0.94), which indicates its increased ability to accurately and promptly detect anomalies in sensor data compared to the other LSTM architectures. At the same time, the training time of 25 s remains comparable to that of Traditional LSTM (22 s) and is shorter than that of Adaptive LSTM (28 s) and Residual LSTM (30 s), indicating a good balance between threat detection efficiency and computational costs. This is critical for implementation in resource-limited cyber defence systems.

3.5. Development of an Optimisation Method for Low-Power Embedded Devices

Let $W^{(l)} \in \mathbb{R}^{n_l \times n_{l-1}}$ be the weight matrix of the l-th layer. At the initial stage, hard thresholding (pruning) is performed:
$$\tilde{W}_{ij}^{(l)} = \begin{cases} W_{ij}^{(l)}, & \text{if } \left|W_{ij}^{(l)}\right| \ge \tau_l, \\ 0, & \text{if } \left|W_{ij}^{(l)}\right| < \tau_l, \end{cases}$$
where $\tau_l$ is selected so as to preserve only p% of the most significant elements in absolute value.
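In the main stage, symmetric quantisation into Q levels (+/−) is applied to nonzero elements:
$$\hat{W}_{ij}^{(l)} = \mathrm{round}\left(\frac{\tilde{W}_{ij}^{(l)}}{\Delta_l}\right) \cdot \Delta_l, \quad \Delta_l = \frac{\max\left|\tilde{W}^{(l)}\right|}{Q - 1},$$
where $\Delta_l$ is the quantisation step of the l-th layer.
As a result, the matrix $\hat{W}^{(l)}$ is stored in Q-bit integer format, and zero elements are skipped during multiplication, which allows for a strong reduction in model size and inference acceleration.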
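The two stages can be sketched in Python for a small weight matrix. This is an illustration only: it assumes that the threshold is taken as the magnitude of the last kept element (so ties may keep slightly more than the requested fraction) and that the quantisation step equals the largest remaining magnitude divided by (levels − 1). The function names, the 2 × 3 matrix, and the choice of four levels are hypothetical.

```python
def prune(weights, keep_fraction):
    """Hard thresholding: keep roughly the largest keep_fraction of weights by magnitude."""
    magnitudes = sorted((abs(w) for row in weights for w in row), reverse=True)
    keep = max(1, round(keep_fraction * len(magnitudes)))
    tau = magnitudes[keep - 1]
    pruned = [[w if abs(w) >= tau else 0.0 for w in row] for row in weights]
    return pruned, tau


def quantise(weights, levels):
    """Symmetric uniform quantisation; returns integer codes and the step size."""
    peak = max(abs(w) for row in weights for w in row)
    if peak == 0:
        raise ValueError("all weights are zero")
    step = peak / (levels - 1)  # assumed step definition
    codes = [[round(w / step) for w in row] for row in weights]
    return codes, step


weights = [[0.9, -0.05, 0.3],
           [0.02, -0.6, 0.1]]
pruned, tau = prune(weights, keep_fraction=0.5)
codes, step = quantise(pruned, levels=4)
restored = [[c * step for c in row] for row in codes]
print(pruned, codes)
```

Only the integer codes and one step value per layer need to be stored, and the zero entries can be skipped entirely by a sparse format such as CSR.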
Based on the above, a processing scheme for a low-power device was developed (Figure 23). The Sequence Input block accepts sequentially formed sliding windows of 50 samples in length with a step of 5. These data are transferred to the Sparse Quantised LSTM block, in which, instead of the original weight matrices $W^{(l)}$, the pruned $\tilde{W}^{(l)}$ and quantised $\hat{W}^{(l)}$ matrices are used, and multiplication is organised via the CSR format, skipping the zero elements. The Lightweight Fully Connected block is a fully connected layer with Q = 8-bit weight quantisation. The Softmax and Argmax blocks calculate the probability vector and the final class label. The post-processing block performs a threshold check, “packs” the result into one byte, and then transfers it to the monitoring system.
To estimate resource intensity, it is assumed that the model memory size after the pruning and quantisation procedures is reduced to approximately $p \times \frac{Q}{32}$ of the original 32-bit representation. With p = 20% and Q = 8, this gives approximately $20\% \times \frac{8}{32} = 5\%$ of the original size, which allows the model to fit into modern MCUs with 256 KB of flash memory, while the inference time on a 6-core ARM Cortex-M7 is reduced from ~200 ms to ~30 ms.
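The size estimate is plain arithmetic and can be checked in Python. The sketch below assumes the simple model from the text (kept fraction times bit-width ratio) and ignores the index arrays that a CSR layout stores alongside the values, so a real footprint would be somewhat larger. The function name and the 1 MB baseline are hypothetical.

```python
def compressed_fraction(keep_fraction, bits, baseline_bits=32):
    """Share of the original model size left after pruning and quantisation."""
    return keep_fraction * bits / baseline_bits


fraction = compressed_fraction(keep_fraction=0.20, bits=8)
baseline_kb = 1024  # hypothetical 1 MB model in 32-bit floats
print(round(fraction, 4), round(fraction * baseline_kb, 1))
```

Under these assumptions a 1 MB model would shrink to roughly 51 KB, which is consistent with the claim that the compressed model fits into 256 KB of flash memory.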

4. Discussion

A method is proposed for analysing sensor data to prevent cyberattacks using neural networks. First, a model of the system’s expected behaviour is introduced, implemented as a linear stochastic differential equation together with a recursive Kalman filter algorithm for estimating the system state under noise (according to (1)–(6)). The residual vector is then used to detect deviations with a statistical test capable of distinguishing between nominal and anomalous states, such as a cyberattack (according to (7)–(12)). To increase accuracy and flexibility with respect to both the described attacks and new attack types, a hybrid approach is used that combines LSTM-based prediction of normal behaviour with neural anomaly analysis, which evaluates deviations effectively at minimal computational cost (see Figure 2). The technique comprises anomaly grouping based on a matrix of residual values and subsequent estimation of their probability, with the model trained by gradient descent (according to (34)–(50)).
The results of the computational experiment conducted using the developed method for analysing sensor data showed high efficiency in detecting cyberattacks. Based on real data collected from temperature, humidity, and gas concentration sensors, it was demonstrated that the LSTM model predictions correspond well to real values (see Figure 12), which is confirmed by the low deviation of the predicted values from the actual ones. The resulting residual diagrams (see Figure 13) show that the residual values fluctuate around zero. This demonstrates the adequacy of the developed method in extracting “useful” data from noisy data. Residual standardisation (see Figure 14) and residual whitening (Figure 15) allowed us to eliminate correlations between data channels. Analysis using the CUSUM deviation accumulation method (see Figure 18) made it possible to detect long-term but weak deviations that could be missed using simple threshold methods. Additionally, the anomaly probability diagrams (see Figure 17) and detector quality assessment (ROC and PR curves in Figure 19 and Figure 20) showed promising results for detection precision and recall, with high AUC and a good balance between detection accuracy and false positive rate.
The implementation of the developed method in cyber police practical activities (see Figure 22) includes a neural network for analysing sensor data integrated into the cyberattack monitoring method used by cyber police. The system analyses data from various sensors, such as temperature, humidity, and gas concentration, in real time to identify attacks, anomalies, and their characteristics. The algorithm, which includes LSTM model predictions, residual calculation, and their standardisation, helps to promptly detect deviations and take measures to block suspicious sensor data.
Despite the high results achieved in testing the developed method and its implementation in cyber police practical activities, some of its limitations should be highlighted:
  • The technique requires a large amount of “clean”, attack-free data to train the LSTM model, which may be a problem in real-world conditions, where data reliably labelled as attack-free or as containing cyberattacks may be limited or unavailable.
  • Determining the optimal threshold for classifying anomalies depends on the chosen level and requires additional tuning and adaptation to the specifics of the particular practical application.
  • Despite its effectiveness, the method requires significant computing resources to process large amounts of data in real time, which limits its use on devices with restricted computing power and energy budgets.
  • Like many other neural network-based methods, the proposed approach suffers from the “black box” problem: it may be difficult to explain to the operator why a particular observation was classified as anomalous, and such explanations are vital for real-world operation in the cybersecurity field.
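The threshold-tuning limitation can be made concrete with a small sketch. It assumes the threshold is chosen on a labelled validation set by maximising the F1 score, which is one common choice and not necessarily the procedure used by the authors; the scores and labels below are hypothetical.

```python
def f1_at_threshold(scores, labels, tau):
    """F1 score when samples with score > tau are flagged as attacks."""
    tp = sum(1 for s, y in zip(scores, labels) if s > tau and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > tau and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= tau and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_threshold(scores, labels):
    """Try every observed score as a candidate threshold and keep the best F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda tau: f1_at_threshold(scores, labels, tau))


# Hypothetical validation scores and labels (1 = attack, 0 = norm).
val_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.70, 0.80, 0.90]
val_labels = [0, 0, 0, 0, 1, 1, 1, 1]
tau = pick_threshold(val_scores, val_labels)
```

A threshold chosen this way is only as good as the validation data, so it would need to be re-tuned whenever the sensor set or the attack mix changes.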
Future research is aimed at developing methods that explain the neural network's results with high accuracy, which will improve cyber police activities. At the same time, methods are needed for optimising neural networks to work with limited computing resources, which will make it possible to implement the proposed method on low-power computing devices. In addition, it is relevant to investigate adapting the technique to new, previously unknown types of cyberattacks, including algorithms for training the model on small amounts of data and in a dynamically changing attack context. Further research is also needed on integrating the developed method with other security tools, such as intrusion detection systems (IDSs) and intrusion prevention systems (IPSs), for comprehensive protection of critical infrastructure.
In further research, it is also advisable to study the robustness of the developed LSTM architecture (see Figure 3) to targeted adversarial attacks (e.g., FGSM, PGD) at the level of both the neural network itself and the neural network classifier. This involves conducting formal experiments on generating and introducing minor targeted distortions into sensory data to assess their impact on the anomaly detection accuracy, and developing countermeasures (e.g., adversarial training or detection of anomalous gradient patterns), which will improve the system's robustness in adversarial conditions.
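As a rough illustration of the kind of experiment meant here, the sketch below applies a single FGSM step to a toy logistic detector. The logistic model is a stand-in chosen only because its input gradient has a closed form; the weights, the sample, and `eps` are hypothetical and unrelated to the trained LSTM or MLP.

```python
import math


def detector_probability(x, w, b):
    """Attack probability from a logistic detector (a stand-in for the real model)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(x, y, w, b, eps):
    """One FGSM step: move x by eps along the sign of the loss gradient.

    For binary cross-entropy with a logistic model, dLoss/dx = (p - y) * w.
    """
    p = detector_probability(x, w, b)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * si for xi, si in zip(x, sign)]


# Hypothetical weights and an attack sample (y = 1) that the detector currently flags.
w, b = [2.0, -1.0, 1.5], -0.5
x_attack = [0.9, 0.1, 0.8]
p_before = detector_probability(x_attack, w, b)
x_adv = fgsm(x_attack, 1, w, b, eps=0.3)
p_after = detector_probability(x_adv, w, b)
```

The perturbation is bounded by `eps` in every channel, yet it lowers the detector's attack probability, which is the effect a robustness study would need to quantify for the real architecture.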

5. Conclusions

In this article, a neural network method for analysing sensor data to prevent cyberattacks, based on a modified LSTM predictor, was developed. Combining the LSTM predictor with the residual value method ensured high accuracy and few false positives when analysing sensor data, as confirmed by AUC = 0.94, precision = 0.92, and recall = 0.89.
The developed method effectively processes data from various sensors (temperature, humidity, gas concentration), which allows multiple types of attacks, such as data forgery (spoofing), man-in-the-middle attacks, and denial of service (DoS), to be detected. This is ensured by a hybrid model based on a modified LSTM network, which combines an analysis of predicted values with the calculation of residual values and their statistical processing.
The developed method operates in real time with minimal computational costs: processing one hour of data (3600 points across three channels) takes no more than 15 s when using traditional cyber police tools. At the same time, training the model requires a “clean” dataset without attacks, and application on devices with severely limited resources is possible only after further optimisation of both the architecture and the response threshold, which is a prospect for further research and implementation in cyber police practical activities.

Author Contributions

Conceptualisation, S.V., O.P., A.O. and V.V.; methodology, S.V., V.J., A.S. and V.V.; software, S.V., V.J., A.S. and V.V.; validation, V.J., O.P., A.O. and V.V.; formal analysis, S.V. and V.V.; investigation, V.J., A.S. and V.V.; resources, S.V., V.J., A.S. and V.V.; data curation, S.V., O.P. and A.O.; writing—original draft preparation, S.V., V.J. and V.V.; writing—review and editing, A.S., O.P. and A.O.; visualisation, S.V., A.S. and V.V.; supervision, S.V., V.J., A.S. and V.V.; project administration, S.V., O.P. and A.O.; funding acquisition, S.V. and V.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Acknowledgments

The research was carried out with the grant support of the Ministry of Education and Science of Ukraine “Methods and tools for detecting disinformation in social networks based on deep learning technologies” under Project No. 0125U001852. During the preparation of this manuscript, the authors used ChatGPT 4o, Gemini 2.5 Flash, and Grammarly to correct and improve the text quality and to eliminate grammatical errors. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Song, W.; Zhu, X.; Ren, S.; Tan, W.; Peng, Y. A Hybrid Blockchain and Machine Learning Approach for Intrusion Detection System in Industrial Internet of Things. Alex. Eng. J. 2025, 127, 619–627.
  2. Abidi, H.; Sidhom, L.; Bollen, M.; Chihi, I. Adaptive Software Sensor for Intelligent Control in Photovoltaic System Optimization. Int. J. Electr. Power Energy Syst. 2025, 170, 110921.
  3. Yang, C.; Wang, J.; Liu, Y.; Ding, Y.; Liu, Z.; Wang, S. A Lightweight Decentralized Federated Learning Framework for the Industrial Internet of Things. Ad Hoc Netw. 2025, 178, 103962.
  4. Afrin, S.; Rafa, S.J.; Kabir, M.; Farah, T.; Alam, M.S.B.; Lameesa, A.; Ahmed, S.F.; Gandomi, A.H. Industrial Internet of Things: Implementations, Challenges, and Potential Solutions across Various Industries. Comput. Ind. 2025, 170, 104317.
  5. Mengash, H.A.; Mahgoub, H.; Alshuhail, A.; Darem, A.A.; Majdoubi, J.; Yafoz, A.; Alsini, R.; Alghushairy, O. Agricultural Consumer Internet of Things Devices: Methods for Optimizing Data Aggregation. Alex. Eng. J. 2025, 125, 692–699.
  6. Liu, X. Research on Consumers’ Personal Information Security and Perception Based on Digital Twins and Internet of Things. Sustain. Energy Technol. Assess. 2022, 53, 102706.
  7. Baláž, M.; Kováčiková, K.; Novák, A.; Vaculík, J. The Application of Internet of Things in Air Transport. Transp. Res. Procedia 2023, 75, 60–67.
  8. Yin, Y.; Wang, H.; Deng, X. Real-Time Logistics Transport Emission Monitoring-Integrating Artificial Intelligence and Internet of Things. Transp. Res. Part D Transp. Environ. 2024, 136, 104426.
  9. Rey, A.; Panetti, E.; Maglio, R.; Ferretti, M. Determinants in Adopting the Internet of Things in the Transport and Logistics Industry. J. Bus. Res. 2021, 131, 584–590.
  10. Knieps, G. Internet of Things, Critical Infrastructures, and the Governance of Cybersecurity in 5G Network Slicing. Telecommun. Policy 2024, 48, 102867.
  11. Bisikalo, O.; Danylchuk, O.; Kovtun, V.; Kovtun, O.; Nikitenko, O.; Vysotska, V. Modeling of Operation of Information System for Critical Use in the Conditions of Influence of a Complex Certain Negative Factor. Int. J. Control Autom. Syst. 2022, 20, 1904–1913.
  12. Bisikalo, O.; Kovtun, O.; Kovtun, V.; Vysotska, V. Research of pareto-optimal schemes of control of availability of the information system for critical use. CEUR Workshop Proc. 2020, 2623, 174–193. Available online: https://ceur-ws.org/Vol-2623/paper17.pdf (accessed on 29 May 2025).
  13. Wang, Q.; Mu, Z. Risk Monitoring Model of Intelligent Agriculture Internet of Things Based on Big Data. Sustain. Energy Technol. Assess. 2022, 53, 102654.
  14. Lan, Y.; Li, L.; Peng, H. A Verifiable Efficient Federated Learning Method Based on Adaptive Boltzmann Selection for Data Processing in the Internet of Things. J. Syst. Archit. 2025, 168, 103523.
  15. Zhao, J.; Huang, F.; Hu, H.; Liao, L.; Wang, D.; Fan, L. User Security Authentication Protocol in Multi Gateway Scenarios of the Internet of Things. Ad Hoc Netw. 2024, 156, 103427.
  16. Abdullah, M. IoT-CDS: Internet of Things Cyberattack Detecting System Based on Deep Learning Models. Comput. Mater. Contin. 2024, 81, 4265–4283.
  17. Alanazi, M.; Aljuhani, A. Anomaly Detection for Internet of Things Cyberattacks. Comput. Mater. Contin. 2022, 72, 261–279.
  18. Mohamed, H.; Koroniotis, N.; Schiliro, F.; Moustafa, N. IoT-CAD: A Comprehensive Digital Forensics Dataset for AI-Based Cyberattack Attribution Detection Methods in IoT Environments. Ad Hoc Netw. 2025, 174, 103840.
  19. Kishor, G.; Mugada, K.K.; Mahto, R.P. Sensor-Integrated Data Acquisition and Machine Learning Implementation for Process Control and Defect Detection in Wire Arc-Based Metal Additive Manufacturing. Precis. Eng. 2025, 95, 163–187.
  20. Xue, D.; El-Farra, N.H. Optimal Sensor and Actuator Scheduling in Sampled-Data Control of Spatially Distributed Processes. IFAC-Pap. 2018, 51, 327–332.
  21. Fang, W.; Shao, Y.; Love, P.E.D.; Hartmann, T.; Liu, W. Detecting Anomalies and De-Noising Monitoring Data from Sensors: A Smart Data Approach. Adv. Eng. Inform. 2023, 55, 101870.
  22. Messina, D.; Durand, H. Lyapunov-Based Cyberattack Detection for Distinguishing Between Sensor and Actuator Attacks. IFAC-Pap. 2024, 58, 604–609.
  23. Awad, A.H.; Alsabaan, M.; Ibrahem, M.I.; Saraya, M.S.; Elksasy, M.S.M.; Ali-Eldin, A.M.T.; Abdelsalam, M.M. Low-Cost IoT-Based Sensors Dashboard for Monitoring the State of Health of Mobile Harbor Cranes: Hardware and Software Description. Heliyon 2024, 10, e40239.
  24. Golovko, V.; Egor, M.; Brich, A.; Sachenko, A. A Shallow Convolutional Neural Network for Accurate Handwritten Digits Classification. Commun. Comput. Inf. Sci. 2017, 673, 77–85.
  25. Bodyanskiy, Y.; Deineko, A.; Skorik, V.; Brodetskyi, F. Deep Neural Network with Adaptive Parametric Rectified Linear Units and Its Fast Learning. Int. J. Comput. 2022, 21, 11–18.
  26. Sun, L.; Sun, Y. Photovoltaic Power Forecasting Based on Artificial Neural Network and Ultraviolet Index. Int. J. Comput. 2022, 21, 153–158.
  27. Vladov, S.; Shmelov, Y.; Petchenko, M. A Neuro-Fuzzy Expert System for the Control and Diagnostics of Helicopters Aircraft Engines Technical State. CEUR Workshop Proc. 2021, 3013, 40–52. Available online: https://ceur-ws.org/Vol-3013/20210040.pdf (accessed on 8 June 2025).
  28. Turchenko, V.; Kochan, V.; Sachenko, A. Estimation of Computational Complexity of Sensor Accuracy Improvement Algorithm Based on Neural Networks. Lect. Notes Comput. Sci. 2001, 2130, 743–748.
  29. Hamolia, V.; Melnyk, V.; Zhezhnych, P.; Shilinh, A. Intrusion detection in computer networks using latent space representation and machine learning. Int. J. Comput. 2020, 19, 442–448.
  30. Tian, J.; Mercier, P.; Paolini, C. Ultra Low-Power, Wearable, Accelerated Shallow-Learning Fall Detection for Elderly at-Risk Persons. Smart Health 2024, 33, 100498.
  31. Jokic, A.; Zivkovic, M.; Jovanovic, L.; Mravik, M.; Sarac, M.; Simic, V.; Khan, M.A.; Bacanin, N. A Convolutional Neural Network-Enhanced Attack Detection Framework with Explainable Artificial Intelligence for Internet of Things-Based Metaverse Security. Eng. Appl. Artif. Intell. 2025, 144, 111358.
  32. Vijayalakshmi, P.; Karthika, D. Hybrid Dual-Channel Convolution Neural Network (DCCNN) with Spider Monkey Optimization (SMO) for Cyber Security Threats Detection in Internet of Things. Meas. Sens. 2023, 27, 100783.
  33. Balingbing, C.B.; Kirchner, S.; Siebald, H.; Kaufmann, H.-H.; Gummert, M.; Van Hung, N.; Hensel, O. Application of a Multi-Layer Convolutional Neural Network Model to Classify Major Insect Pests in Stored Rice Detected by an Acoustic Device. Comput. Electron. Agric. 2024, 225, 109297.
  34. Zarzycki, K.; Chaber, P.; Cabaj, K.; Ławryńczuk, M.; Marusak, P.; Nebeluk, R.; Plamowski, S.; Wojtulewicz, A. Forgery Cyber-Attack Supported by LSTM Neural Network: An Experimental Case Study. Sensors 2023, 23, 6778.
  35. Gupta, B.B.; Chui, K.T.; Gaurav, A.; Arya, V.; Chaurasia, P. A Novel Hybrid Convolutional Neural Network- and Gated Recurrent Unit-Based Paradigm for IoT Network Traffic Attack Detection in Smart Cities. Sensors 2023, 23, 8686.
  36. Vladov, S.; Vysotska, V.; Sokurenko, V.; Muzychuk, O.; Nazarkevych, M.; Lytvyn, V. Neural Network System for Predicting Anomalous Data in Applied Sensor Systems. Appl. Syst. Innov. 2024, 7, 88.
  37. Yu, X.; Meng, W.; Liu, Y.; Zhou, F. TridentShell: An Enhanced Covert and Scalable Backdoor Injection Attack on Web Applications. J. Netw. Comput. Appl. 2024, 223, 103823.
  38. Vladov, S.; Yakovliev, R.; Vysotska, V.; Nazarkevych, M.; Lytvyn, V. The Method of Restoring Lost Information from Sensors Based on Auto-Associative Neural Networks. Appl. Syst. Innov. 2024, 7, 53.
  39. Zhang, X.; Wang, G.; Chen, Y.; Yang, W.; Wang, G. Inter-Layer Explainable Variational Autoencoder Model for Multivariate Time Series Anomaly Detection. Eng. Appl. Artif. Intell. 2025, 159, 111585.
  40. Lu, R.; Zheng, D.; Yang, Q.; Cao, W.; Zhu, C. Anomaly Detection for Non-Stationary Rotating Machinery Based on Signal Transform and Memory-Guided Multi-Scale Feature Reconstruction. Eng. Appl. Artif. Intell. 2025, 154, 110824.
  41. Omatu, S. Classification of Mixed Odors Using A Layered Neural Network. Int. J. Comput. 2017, 16, 41–48.
  42. Lynnyk, R.; Vysotska, V.; Matseliukh, Y.; Burov, Y.; Demkiv, L.; Zaverbnyj, A.; Sachenko, A.; Shylinska, I.; Yevseyeva, I.; Bihun, O. DDOS Attacks Analysis Based on Machine Learning in Challenges of Global Changes. CEUR Workshop Proc. 2020, 2631, 159–171. Available online: https://ceur-ws.org/Vol-2631/paper12.pdf (accessed on 16 June 2025).
  43. Striuk, O.; Kondratenko, Y. Generative Adversarial Neural Networks and Deep Learning: Successful Cases and Advanced Approaches. Int. J. Comput. 2021, 20, 339–349.
  44. Wang, X.; Zhang, Y.; Bai, N.; Yu, Q.; Wang, Q. Class-Imbalanced Time Series Anomaly Detection Method Based on Cost-Sensitive Hybrid Network. Expert Syst. Appl. 2024, 238, 122192.
  45. Mahdi, Z.; Abdalhussien, N.; Mahmood, N.; Zaki, R. Detection of Real-Time Distributed Denial-of-Service (DDoS) Attacks on Internet of Things (IoT) Networks Using Machine Learning Algorithms. Comput. Mater. Contin. 2024, 80, 2139–2159.
  46. Munir, M.; Siddiqui, S.A.; Chattha, M.A.; Dengel, A.; Ahmed, S. FuseAD: Unsupervised Anomaly Detection in Streaming Sensors Data by Fusing Statistical and Deep Learning Models. Sensors 2019, 19, 2451.
  47. Xu, C.; Zhu, P.; Wang, J.; Fortino, G. Improving the Local Diagnostic Explanations of Diabetes Mellitus with the Ensemble of Label Noise Filters. Inf. Fusion 2025, 117, 102928.
  48. Shen, F.; Zheng, J.; Ye, L.; Ma, X. LSTM Soft Sensor Development of Batch Processes with Multivariate Trajectory-Based Ensemble Just-in-Time Learning. IEEE Access 2020, 8, 73855–73864.
  49. Zhu, P.; Zhang, H.; Shi, Y.; Xie, W.; Pang, M.; Shi, Y. A Novel Discrete Conformable Fractional Grey System Model for Forecasting Carbon Dioxide Emissions. Environ. Dev. Sustain. 2024, 27, 13581–13609.
  50. Amin, R.; Gantassi, R.; Ahmed, N.; Hassan Alshehri, A.; Alsubaei, F.S.; Frnda, J. A Hybrid Approach for Adversarial Attack Detection Based on Sentiment Analysis Model Using Machine Learning. Eng. Sci. Technol. Int. J. 2024, 58, 101829.
  51. Sheikh, A.M.; Islam, M.R.; Habaebi, M.H.; Zabidi, S.A.; Bin Najeeb, A.R.; Kabbani, A. A Survey on Edge Computing (EC) Security Challenges: Classification, Threats, and Mitigation Strategies. Future Internet 2025, 17, 175.
  52. Cho, C.; Kim, C.; Sull, S. PIABC: Point Spread Function Interpolative Aberration Correction. Sensors 2025, 25, 3773.
  53. Connolly, G.; Sachenko, A.; Markowsky, G. Distributed Traceroute Approach to Geographically Locating IP Devices. In Proceedings of the Second IEEE International Workshop on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, Lviv, Ukraine, 8–10 September 2003; pp. 128–131.
  54. Vladov, S.; Muzychuk, O.; Vysotska, V.; Yurko, A.; Uhryn, D. Modified Kalman Filter with Chebyshev Points Based on a Recurrent Neural Network for Automatic Control System Measuring Channels Diagnosing and Parring off Failures. Int. J. Image Graph. Signal Process. 2024, 16, 36–61.
  55. Sachenko, A.; Kochan, V.; Turchenko, V. Intelligent Distributed Sensor Network. In Proceedings of the IMTC/98 Conference Proceedings. IEEE Instrumentation and Measurement Technology Conference. Where Instrumentation is Going (Cat. No.98CH36222), St. Paul, MN, USA, 18–21 May 1998; Volume 1, pp. 60–66.
  56. Vitulyova, Y.; Babenko, T.; Kolesnikova, K.; Kiktev, N.; Abramkina, O. A Hybrid Approach Using Graph Neural Networks and LSTM for Attack Vector Reconstruction. Computers 2025, 14, 301.
  57. Vladov, S.; Shmelov, Y.; Yakovliev, R. Modified Helicopters Turboshaft Engines Neural Network On-board Automatic Control System Using the Adaptive Control Method. CEUR Workshop Proc. 2022, 3309, 205–224. Available online: https://ceur-ws.org/Vol-3309/paper15.pdf (accessed on 29 June 2025).
  58. Morales, M.D.; Antelis, J.M.; Moreno, C.; Nesterov, A.I. Deep Learning for Gravitational-Wave Data Analysis: A Resampling White-Box Approach. Sensors 2021, 21, 3174.
  59. Park, H.; Lee, K. Adaptive Natural Gradient Method for Learning of Stochastic Neural Networks in Mini-Batch Mode. Appl. Sci. 2019, 9, 4568.
  60. Todo, H.; Chen, T.; Ye, J.; Li, B.; Todo, Y.; Tang, Z. Single-Layer Perceptron Artificial Visual System for Orientation Detection. Front. Neurosci. 2023, 17, 1229275.
  61. Jeong, S.; Lee, J. Soft-Output Detector Using Multi-Layer Perceptron for Bit-Patterned Media Recording. Appl. Sci. 2022, 12, 620.
  62. Vladov, S.; Scislo, L.; Sokurenko, V.; Muzychuk, O.; Vysotska, V.; Osadchy, S.; Sachenko, A. Neural Network Signal Integration from Thermogas-Dynamic Parameter Sensors for Helicopters Turboshaft Engines at Flight Operation Conditions. Sensors 2024, 24, 4246.
  63. Vladov, S.; Sachenko, A.; Sokurenko, V.; Muzychuk, O.; Vysotska, V. Helicopters Turboshaft Engines Neural Network Modeling under Sensor Failure. J. Sens. Actuator Netw. 2024, 13, 66.
  64. Biçer, C.; Bakouch, H.S.; Biçer, H.D.; Alomair, G.; Hussain, T.; Almohisen, A. Unit Maxwell-Boltzmann Distribution and Its Application to Concentrations Pollutant Data. Axioms 2024, 13, 226.
  65. Nazarkevych, M.; Kowalska-Styczen, A.; Lytvyn, V. Research of Facial Recognition Systems and Criteria for Identification. In Proceedings of the IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, IDAACS, Dortmund, Germany, 7–9 September 2023; pp. 555–558.
  66. Vlasenko, D.; Inkarbaieva, O.; Peretiatko, M.; Kovalchuk, D.; Sereda, O. Helicopter Radio System for Low Altitudes and Flight Speed Measuring with Pulsed Ultra-Wideband Stochastic Sounding Signals and Artificial Intelligence Elements. Radioelectron. Comput. Syst. 2023, 3, 48–59.
  67. Marakhimov, A.R.; Khudaybergenov, K.K. Approach to the synthesis of neural network structure during classification. Int. J. Comput. 2020, 19, 20–26.
  68. Bodyanskiy, Y.; Shafronenko, A.; Pliss, I. Clusterization of Vector and Matrix Data Arrays Using the Combined Evolutionary Method of Fish Schools. Syst. Res. Inf. Technol. 2022, 4, 79–87.
  69. Dyvak, M.; Manzhula, V.; Melnyk, A.; Rusyn, B.; Spivak, I. Modeling the Efficiency of Biogas Plants by Using an Interval Data Analysis Method. Energies 2024, 17, 3537.
  70. Lopes, J.F.; Barbon Junior, S.; de Melo, L.F. Online Meta-Recommendation of CUSUM Hyperparameters for Enhanced Drift Detection. Sensors 2025, 25, 2787.
  71. Riad, K. Robust Access Control for Secure IoT Outsourcing with Leakage Resilience. Sensors 2025, 25, 625.
  72. Pieniazek, J. Thermocouple Sensor Response in Hot Airstream. Sensors 2025, 25, 4634.
  73. He, Y.; Yang, F.; Wei, P.; Lv, Z.; Zhang, Y. A Novel Adaptive Flexible Capacitive Sensor for Accurate Intravenous Fluid Monitoring in Clinical Settings. Sensors 2025, 25, 4524.
  74. Ali, M.; Ahmad, I.; Geun, I.; Hamza, S.A.; Ijaz, U.; Jang, Y.; Koo, J.; Kim, Y.-G.; Kim, H.-D. A Comprehensive Review of Advanced Sensor Technologies for Fire Detection with a Focus on Gasistor-Based Sensors. Chemosensors 2025, 13, 230.
  75. Gao, G.F.; Oh, C.; Saksena, G.; Deng, D.; Westlake, L.C.; Hill, B.A.; Reich, M.; Schumacher, S.E.; Berger, A.C.; Carter, S.L.; et al. Tangent Normalization for Somatic Copy-Number Inference in Cancer Genome Analysis. Bioinformatics 2022, 38, 4677–4686.
  76. Gao, X.; Yao, X.; Chen, B.; Zhang, H. SBCS-Net: Sparse Bayesian and Deep Learning Framework for Compressed Sensing in Sensor Networks. Sensors 2025, 25, 4559.
  77. Wang, Y.; Tang, M.; Wang, P.; Liu, B.; Tian, R. The Levene Test Based-Leakage Assessment. Integration 2022, 87, 182–193.
  78. Zhang, G.; Christensen, R.; Pesko, J. Parametric Boostrap and Objective Bayesian Testing for Heteroscedastic One-Way ANOVA. Stat. Probab. Lett. 2021, 174, 109095.
  79. Lytvyn, V.; Dudyk, D.; Peleshchak, I.; Peleshchak, R.; Pukach, P. Influence of the Number of Neighbours on the Clustering Metric by Oscillatory Chaotic Neural Network with Dipole Synaptic Connections. CEUR Workshop Proc. 2024, 3664, 24–34. Available online: https://ceur-ws.org/Vol-3664/paper3.pdf (accessed on 8 July 2025).
  80. Hu, Z.; Kashyap, E.; Tyshchenko, O.K. GEOCLUS: A Fuzzy-Based Learning Algorithm for Clustering Expression Datasets. Lect. Notes Data Eng. Commun. Technol. 2022, 134, 337–349.
  81. Cai, H.; Xie, Z.; Ma, Y.; Xiang, L. A 209 Ps Shutter-Time CMOS Image Sensor for Ultra-Fast Diagnosis. Sensors 2025, 25, 3835.
  82. Ablamskyi, S.; Tchobo, D.L.R.; Romaniuk, V.; Šimić, G.; Ilchyshyn, N. Assessing the Responsibilities of the International Criminal Court in the Investigation of War Crimes in Ukraine. Novum Jus 2023, 17, 353–374.
  83. Ablamskyi, S.; Nenia, O.; Drozd, V.; Havryliuk, L. Substantial Violation of Human Rights and Freedoms as a Prerequisite for Inadmissibility of Evidence. Justicia 2021, 26, 47–56.
  84. Kovtun, V.; Izonin, I.; Gregus, M. Model of Functioning of the Centralized Wireless Information Ecosystem Focused on Multimedia Streaming. Egypt. Inform. J. 2022, 23, 89–96.
Figure 1. The sensory data analysis method for preventing cyberattacks: a block diagram.
Figure 2. The proposed method’s architectural scheme.
Figure 3. The LSTM predictor architecture based on a modified LSTM cell.
Figure 4. Architecture diagram of the MLP detector with one hidden layer.
Figure 5. The research object structural diagram.
Figure 6. Structural diagram of the research object during cyberattacks (spoofing, MITM, replay) on the IoT controller.
Figure 7. Structural diagram of the research object incorporating the developed neural network method for analysing sensory data to prevent cyberattacks.
Figure 8. Test scheme of the developed neural network method for analysing sensory data to prevent cyberattacks in the MATLAB Simulink R2014b software environment.
Figure 9. The parameter values dynamics diagram: (a) temperature; (b) air humidity; (c) CO2 concentration.
Figure 10. Sensory data residual diagrams.
Figure 11. Clustering analysis results: (a) training subdataset; (b) validation subdataset.
Figure 12. The original signal and prediction time series diagrams: (a) temperature; (b) air humidity; (c) CO2 concentration.
Figure 13. The residual time dynamics diagrams: (a) temperature sensor; (b) air humidity sensor; (c) CO2 concentration sensor.
Figure 14. The standardised residual time dynamics diagrams: (a) temperature sensor; (b) air humidity sensor; (c) CO2 concentration sensor.
Figure 15. The “whitened” residual time dynamics diagrams: (a) temperature sensor; (b) air humidity sensor; (c) CO2 concentration sensor.
Figure 16. Heat map of the residual matrix $R_t \in \mathbb{R}^{n \times m}$ in a sliding window.
Figure 17. The abnormal rate $s_t$ over time diagrams: (a) temperature sensor; (b) air humidity sensor; (c) CO2 concentration sensor.
Figure 18. The cumulative summation CUSUM diagrams: (a) temperature sensor; (b) air humidity sensor; (c) CO2 concentration sensor.
Figure 19. The ROC curve diagram.
Figure 20. The PR curve diagram.
Figure 21. The detection delays histogram.
Figure 22. Flowchart of implementing the developed neural network method for analysing sensor data to prevent cyberattacks in cyber police activities.
Figure 23. The optimised method’s block diagram.
Table 1. Related works comparative analysis.
| Neural Network Method | Sensor Type | Key Results | Limitations | References |
|---|---|---|---|---|
| CNN | Vibration, acoustic | 95% anomaly detection accuracy | High computational load | [32,33] |
| LSTM | Temperature, pressure | AUC ROC = 0.97 | Requires a large amount of labelled data | [34] |
| Variational autoencoder (VAE) | IoT flows (multiple data) | FP reduction by 30% | Difficulty in selecting threshold values | [38,39,40,41] |
| GNN | Distributed network | Anomaly source localisation to a node ±1 m | Unaccounted impact of delays | [42] |
| Hybrid autoencoder and GCN | Multi-sensor (4 or more sensors) | TPR = 93%, average latency < 50 ms | Lack of explanation of the model for the operator | [43,44,45] |
| Hybrid approaches (FuseAD, ensembles, JIT, DCFGM) | Streaming sensory data, medical data, batch processes, and environmental data | High accuracy of anomaly detection; improved local diagnostic interpretation; adaptive soft sensors; accurate CO2 emission forecasting | Require fine-tuning and labelling, are rarely tested under concept drift, and have high computational requirements and low interpretability | [46,47,48,49] |
Table 2. The confusion matrix.
| | Predicted Attack | Predicted Normal |
|---|---|---|
| Actual Attack | TP | FN |
| Actual Normal | FP | TN |
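The quality metrics quoted in the article follow directly from the four cells of the confusion matrix. The sketch below uses the standard definitions; the cell counts passed to `detection_metrics` are hypothetical, while the precision and recall passed to `f1_from` are the reported values.

```python
def detection_metrics(tp, fn, fp, tn):
    """Precision, recall, F1 and accuracy from the four confusion-matrix cells."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return precision, recall, f1, accuracy


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, used only to exercise the formulas.
precision, recall, f1, accuracy = detection_metrics(tp=89, fn=11, fp=8, tn=892)

# Reported precision = 0.92 and recall = 0.89 give an F1 of about 0.90.
reported_f1 = f1_from(0.92, 0.89)
```

The harmonic mean of the reported precision and recall rounds to 0.90, which agrees with the F1 score stated in the abstract.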
Table 3. Algorithm of the neural network method for analysing sensory data to prevent cyberattacks.
| Stage Number | Stage Name | Stage Description |
|---|---|---|
| 1 | Pre-training phase | Clean, attack-free data is collected, and the LSTM-based predictor $f_\theta$ is trained on it by minimising $L_{pred}$. |
| 2 | Attack generation | The vector $u_0$ is modelled using the SDE model (different intensities and directions). |
| 3 | Classifier training | The residuals from the predictor are used to train an MLP detector to recognise “attacks”. |
| 4 | Online stage | For each new measurement $x_t$, a prediction is made, the residual is calculated, the matrix $R_t$ is formed, and the scalar rate $s_t$ is calculated. If $s_t > \tau$, the response system is launched. |
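The online stage (stage 4) can be sketched as a simple streaming loop. This is a minimal illustration only: `predict` stands in for the trained LSTM predictor and `score` for the MLP detector, and the "last value" predictor, mean absolute residual score, window size, and threshold below are hypothetical placeholders.

```python
from collections import deque


def online_stage(stream, predict, score, window=5, tau=0.5):
    """Yield (t, s_t, alarm) for each measurement x_t in the stream.

    predict(history) stands in for the trained LSTM predictor and
    score(R_t) for the MLP detector; both are supplied by the caller.
    """
    history = []
    residual_window = deque(maxlen=window)  # rows of the residual matrix R_t
    for t, x_t in enumerate(stream):
        x_hat = predict(history) if history else x_t
        residual = [a - b for a, b in zip(x_t, x_hat)]
        residual_window.append(residual)
        s_t = score(list(residual_window))
        yield t, s_t, s_t > tau
        history.append(x_t)


# Stand-ins for illustration only: a "last value" predictor and a mean-|r| score.
def last_value(history):
    return history[-1]


def mean_abs(r_matrix):
    cells = [abs(v) for row in r_matrix for v in row]
    return sum(cells) / len(cells)


# Three channels; the first channel jumps at t = 6.
stream = [[0.1, 0.9, 0.5]] * 6 + [[2.1, 0.9, 0.5]] * 2
results = list(online_stage(stream, last_value, mean_abs, window=2, tau=0.3))
alarms = [t for t, _, alarm in results if alarm]
```

Because the score is computed over a sliding window of residual rows, an abrupt jump keeps the rate above the threshold for as long as the offending row remains in the window.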
Table 4. The normalised sensory data fragment.
| Time, Hours | Sensor 1 (Temperature) | Sensor 2 (Air Humidity) | Sensor 3 (CO2 Concentration) |
|---|---|---|---|
| 0.500000 | 0.103013 | 0.889087 | 0.489480 |
| 0.501002 | 0.130157 | 1.023572 | 0.282682 |
| 0.502004 | 0.047899 | 1.030167 | 0.306659 |
| 0.503006 | 0.069855 | 0.772710 | 0.222945 |
| 0.504008 | 0.054991 | 1.003269 | 0.412067 |
Table 5. A fragment of the residuals for each sensor, where the prediction is a moving average over a window of size 5.
| Time, Hours | Residual Sensor 1 (Temperature) | Residual Sensor 2 (Air Humidity) | Residual Sensor 3 (CO2 Concentration) |
|---|---|---|---|
| 0.500000 | 0.000000 | 0.000000 | 0.000000 |
| 0.501002 | −0.033446 | 0.107082 | −0.017275 |
| 0.502004 | 0.067837 | 0.084994 | 0.070574 |
| 0.503006 | 0.039845 | −0.006814 | 0.033270 |
| 0.504008 | −0.031736 | −0.125322 | 0.054965 |
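
A residual of this kind is the difference between a measurement and its moving-average prediction. The sketch below assumes that the prediction for sample t is the mean of up to five preceding samples and that the first sample is predicted by itself, which gives a zero first residual as in the table. The exact windowing and scaling used by the authors are not stated here, so the function (a hypothetical `moving_average_residuals`) is not expected to reproduce the tabulated values.

```python
from typing import List, Sequence


def moving_average_residuals(series: Sequence[float], window: int = 5) -> List[float]:
    """Residual = measurement minus the mean of up to `window` preceding samples."""
    residuals: List[float] = []
    for t, value in enumerate(series):
        past = series[max(0, t - window):t]
        # Assumption: with no past samples, the prediction equals the measurement.
        prediction = sum(past) / len(past) if past else value
        residuals.append(value - prediction)
    return residuals


# Sensor 1 (temperature) column of the normalised data fragment.
temperature = [0.103013, 0.130157, 0.047899, 0.069855, 0.054991]
res = moving_average_residuals(temperature)
print([round(r, 6) for r in res])
```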
Table 6. The training dataset’s homogeneity evaluation results.
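| Sensor | r̄ | σ² | σ | W | F | Title 6 | Title 7 |
|---|---|---|---|---|---|---|---|
| Sensor 1 | ≈0 | 1.002 | 1.0009995 | 0.979 | 2.72 × 10⁻⁸ | True | False |
| Sensor 2 | 0 | 1.002 | 1.0009995 | 0.707 | 6.22 × 10⁻⁶ | True | False |
| Sensor 3 | ≈0 | 1.002 | 1.0009995 | 0.148 | 0.1068 | True | True |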
Table 7. Synthetic dataset enrichment procedure.
| Number | Name | Description |
|---|---|---|
| 1 | Attack modelling | An adaptive noise vector u_a(t) was added to each channel, generated as a Gaussian process with zero mean and variance σ² varying from 0.1 to 0.5 of the original signal range. The duration of each attacked fragment was chosen randomly in the 30…120 s range, which gives an overall normal-to-attack ratio of ≈85%:15%. Attack scenarios included spoofing (smooth drift shift), replay (repetition of previous segments), and DoS (fixed signal erasure). |
| 2 | Balancing classes | To compensate for the imbalance, attacks with rare combinations on adjacent channels were additionally synthesised, bringing the final proportion of the “attack type” classes to 1…5% for each subtype and 10…15% in total. |
| 3 | Annotation and partitioning | Each timestamp was assigned a “norm” or “attack” label, and the data was then randomly split into training (67%) and validation (33%) sets without overlapping fragments. |
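
The attack modelling and annotation steps can be illustrated with a short sketch. It rests on several assumptions that are not in the source: one sample per second, a synthetic sine wave as the clean channel, and a reading of the variance rule as σ² = fraction × (max − min) of the clean signal. The function name `inject_noise_attack` is hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def inject_noise_attack(signal: Sequence[float], start: int, duration: int,
                        variance_fraction: float,
                        rng: random.Random) -> Tuple[List[float], List[int]]:
    """Add zero-mean Gaussian noise to one fragment and label each timestamp."""
    signal_range = max(signal) - min(signal)
    # Assumed reading of the rule: variance is a fraction of the signal range.
    sigma = math.sqrt(variance_fraction * signal_range)
    attacked = list(signal)
    labels = [0] * len(signal)  # 0 = "norm", 1 = "attack"
    for t in range(start, min(start + duration, len(signal))):
        attacked[t] += rng.gauss(0.0, sigma)
        labels[t] = 1
    return attacked, labels


rng = random.Random(0)
# Synthetic stand-in for one clean sensor channel (assumed 1 sample per second).
clean = [0.5 + 0.1 * math.sin(0.05 * t) for t in range(600)]
duration = rng.randint(30, 120)  # fragment length in the 30...120 s range
attacked, labels = inject_noise_attack(
    clean, start=200, duration=duration,
    variance_fraction=rng.uniform(0.1, 0.5), rng=rng)
print(duration, sum(labels))
```

Labelling at the same moment the noise is injected keeps the per-timestamp annotation exactly aligned with the attacked fragment.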
Table 8. Classification of cyberattacks and their subtypes.
| Cyberattack Type | Cyberattack Subtype | Description | Parameters |
|---|---|---|---|
| Spoofing attacks | Constant Substitution | A fixed error Δ_0, equal to 20…50% of the normal signal amplitude, is added to each time sample. | Intensity: Δ_0 or α as a proportion of the normalised amplitude. |
| | Incremental Drift | The signal shifts linearly: u(t) = u_norm(t) + α · t, where α is set in the (0.01…0.05) · A_max range and A_max is the normalised amplitude. | Fragment duration is 30…120 s. |
| | Random Spikes | Bursts of 1…5 s duration with peaks up to (0.8…1.0) · A_max, repeating every 60…120 s. | Spike frequency (for Random Spikes) is 0.5…1 times per minute. |
| Replay attacks | Full Replay | The current 50…100 s window is replaced with a pre-recorded “clean” segment. | Segment duration is 25…100 s. |
| | Segmented Replay | Only part of the signal is repeated (e.g., the first 25 s out of 50 s), while the rest of the data is preserved. | Interval between playbacks is 100…300 s. |
| DoS attacks | Blackout | The signal is replaced by a zero level or a constant of 0 ± 1% of A_max for 10…30 s. | Block duration is 10…60 s; the noise intensity σ² is given as a fraction of the normal signal variance. |
| | High-Frequency Noise | White noise with variance σ² = 0.5 · Var(u_norm) is added over 20…60 s intervals. | The interval between DoS episodes is 200…400 s. |
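
Three of the subtypes above (incremental drift, full replay, and blackout) can be sketched as simple list transformations. The function names, the synthetic signal, the one-sample-per-second rate, and the choice to count t from the start of the attacked fragment in u(t) = u_norm(t) + α · t are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def incremental_drift(signal: Sequence[float], start: int, duration: int,
                      alpha: float) -> List[float]:
    """Spoofing: add alpha * t, with t counted from the fragment start (assumed)."""
    out = list(signal)
    for k, t in enumerate(range(start, min(start + duration, len(out)))):
        out[t] += alpha * k
    return out


def full_replay(signal: Sequence[float], start: int, duration: int,
                source_start: int) -> List[float]:
    """Replay: overwrite a window with an earlier segment of the same length."""
    out = list(signal)
    end = min(start + duration, len(out))
    length = end - start
    if source_start < 0 or source_start + length > len(signal):
        raise ValueError("source segment does not fit inside the signal")
    out[start:end] = signal[source_start:source_start + length]
    return out


def blackout(signal: Sequence[float], start: int, duration: int,
             level: float = 0.0) -> List[float]:
    """DoS: replace a window with a constant level."""
    out = list(signal)
    end = min(start + duration, len(out))
    out[start:end] = [level] * (end - start)
    return out


a_max = 1.0  # normalised amplitude
signal = [a_max * math.sin(0.07 * t) for t in range(400)]  # synthetic stand-in
drifted = incremental_drift(signal, start=100, duration=60, alpha=0.02 * a_max)
replayed = full_replay(signal, start=200, duration=50, source_start=0)
blacked = blackout(signal, start=300, duration=20)
```

Each function returns a copy, so several attack types can be generated from the same clean recording without interfering with one another.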
Table 9. Comparative analysis results.
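| Method | Precision | Recall | F1 Score | AUC | Training Time, s |
|---|---|---|---|---|---|
| Developed method | 0.92 | 0.89 | 0.90 | 0.94 | 15 |
| IForest | 0.87 | 0.85 | 0.86 | 0.90 | 5 |
| SVM | 0.89 | 0.87 | 0.88 | 0.91 | 25 |
| K-means | 0.80 | 0.75 | 0.77 | 0.83 | 10 |
| VAE | 0.85 | 0.83 | 0.84 | 0.88 | 20 |
| CNN with MLP | 0.90 | 0.88 | 0.89 | 0.92 | 30 |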
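
As a quick consistency check on the first row, the F1 score can be recomputed from the reported precision and recall using the standard harmonic-mean definition (the formula is the textbook one, not quoted from the article):

```latex
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
    = \frac{2 \cdot 0.92 \cdot 0.89}{0.92 + 0.89}
    = \frac{1.6376}{1.81}
    \approx 0.9048
```

This agrees with the reported F1 score of 0.90 at two decimal places.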
Table 10. Comparative analysis results.
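| Method | Precision | Recall | F1 Score | AUC | Training Time, s |
|---|---|---|---|---|---|
| LSTM (proposed) | 0.92 | 0.89 | 0.90 | 0.94 | 25 |
| GRU | 0.88 | 0.85 | 0.86 | 0.91 | 20 |
| CNN | 0.85 | 0.83 | 0.84 | 0.89 | 30 |
| MLP | 0.82 | 0.80 | 0.81 | 0.85 | 15 |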
Table 11. Comparative analysis results.
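| Method | Precision | Recall | F1 Score | AUC | Training Time, s |
|---|---|---|---|---|---|
| Modified LSTM | 0.92 | 0.89 | 0.90 | 0.94 | 25 |
| Traditional LSTM | 0.88 | 0.85 | 0.86 | 0.91 | 22 |
| Adaptive LSTM | 0.90 | 0.87 | 0.88 | 0.92 | 28 |
| Residual LSTM | 0.89 | 0.86 | 0.87 | 0.90 | 30 |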
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Vladov, S.; Jotsov, V.; Sachenko, A.; Prokudin, O.; Ostapiuk, A.; Vysotska, V. Neural Network Method of Analysing Sensor Data to Prevent Illegal Cyberattacks. Sensors 2025, 25, 5235. https://doi.org/10.3390/s25175235
