1. Introduction
Modern industry is undergoing significant transformation due to the emergence of increasingly advanced technologies. These technologies are driving the development of new paradigms and are shaping trends that are receiving considerable attention across various fields of science and engineering. One such technology that has garnered particular focus is the concept of digital twins of real-world systems.
A digital twin is an intelligent virtual model of a real-world system that serves as its counterpart, maintaining a synchronization link with the physical system. Various definitions exist in the literature depending on the context, as the term is quite broad; however, the most widely accepted definitions are provided in [1]. Digital twins have found widespread application across numerous domains such as industry/manufacturing [2], energy [3], infrastructure [4], automotive [5], aerospace [6], computer science [7], healthcare [8], agriculture [9], and many others [10]. In the management of technological processes and systems, digital twins aim to achieve time synchronization between the virtual and physical models, preferably in real time. This enables the implementation of adaptability within the constructed model and even allows for further development and refinement to achieve higher accuracy.
Due to the nature of the digital twin, which tracks the temporal evolution of the real-world object, the development of control systems involving digital twins presents a complex challenge, primarily because of the presence of multiple time scales [11]. A key limitation of this type of control lies in the fact that digital twins typically account only for parameter variations within the model, but not for changes in its structure. As a result, they can capture parametric uncertainty but often fail to account for unmodeled dynamics. In such cases, accurately predicting the behavior of the physical system becomes crucial. For this reason, control systems based on digital twins frequently employ adaptive PID and MPC controllers. However, even with an adaptive controller in place, a coarse approximation can lead to discrepancies between the digital twin and the real system, ultimately resulting in desynchronization.
At the core of the digital twin concept lies the understanding that the virtual model approximates the real object. From this, it can be inferred that the use of functions for model description, updating, and extension could be a promising solution to the limitations outlined above. This, in turn, frames the challenge as one of system approximation, along with related procedures such as data processing (e.g., filtering), identification, discretization, and others that are integral to the functioning of the digital twin. One key advantage of using approximated models is that standard stability criteria can provide robust control properties, offering good tolerance of modeling errors [12].
System approximation is a widely studied problem, and the scientific literature offers a vast array of methods, with new approaches—often modifications or combinations of existing ones—continually being published. In fact, the body of work in the technical literature is so extensive that it is practically impossible to cover the full spectrum of available techniques. Nevertheless, the main approximation methods can generally be classified into the following key groups:
- ‑
- ‑
- Series expansion methods [15];
- Smoothing-based approximation methods [16];
- Approximation based on intelligent techniques [17].
The classification of the known groups of approximation methods, along with their advantages and disadvantages, is presented in Table 1.
Among all the approaches used for approximation, an important class of approximation functions are the orthonormal functions, with Laguerre and Kautz functions being among the most prominent representatives. In the current study, Laguerre functions are not the focus. Attention is directed toward the orthonormal Kautz functions due to their more generalizable nature, as they allow for the approximation of systems with both real and complex poles.
In the literature review regarding orthonormal Kautz functions, greater attention is traditionally given to their application in system approximation. They are used in recursive algorithms for constructing equivalent transfer functions of systems [18], for representing measurement dynamics [19], for approximating control impulse trajectories [20], and in stochastic systems [21].
Identification is another process in which Kautz functions give good results. New methods are proposed based on the extension of the impulse response function (IRF) of linear systems [22], algorithms based on maximum likelihood and Kautz functions [18], the use of regression spaces of orthonormal Kautz and Laguerre functions to improve performance [23], and identification of stochastic systems [21].
In modeling nonlinear systems, the approach using models, filters, and Volterra series gives good results. The combination with Kautz functions is proposed in [24,25,26], through extended implementations that address issues related to convergence, stability, and over-parameterization.
Kautz functions are used in nearly all processes and procedures related to digital twins, even in signal filtering [27]. The orthonormal Kautz functions provide a simple and elegant method for capturing the dynamics of various systems, and for this reason, they are applied in the synthesis of model predictive control systems [20,28,29].
This paper proposes the combination of orthonormal Kautz functions and deep neural networks for system approximation. The goal is for the resulting hybrid structure to be implementable in more than one module within the architecture of a system’s digital twin, in the presence of parametric uncertainties and noise.
2. Materials and Methods
2.1. Digital Twins
Digital twins are implemented based on the idea of representing real objects through their virtual analogs. They can be synchronized with the real system and exchange information in a bidirectional connection or operate in offline mode, depending on the objectives. The main modules included in the structure of a digital twin can vary, with the five-dimensional structure becoming increasingly popular [30]. In this structure, the output of the twin is a function of five variables, which are the outputs of the five modules contained within it. These modules are as follows:
Physical system—This includes the real object itself, along with sensors, sensor systems, and data collection techniques. Depending on the type of data, this module may also include primary data processing.
Digital system—This represents the digital model, which forms the core of the digital twin’s operation. When using information from a measurement system or a simulated one, models can primarily be classified as statistical and machine learning (ML) models.
Update module—In this module, the parameters of the digital model are updated based on information about the measured quantities.
Prediction module—The updated digital model is then used to predict future states of the real object or the probabilities of certain processes (such as wear, failures, lifespan of mechanisms, etc.).
Optimization module—The optimization module includes additional functional capabilities of the digital twin, working in close collaboration with the other four modules to optimize the processes carried out within them. Optimization can be performed in two modes: offline and online.
The working structure of the digital twin is chosen to be based on machine learning. Specifically, a structure based on supervised machine learning with deep neural networks and orthonormal Kautz functions is proposed and examined, as it operates with raw but labeled data. In this implementation, the update module approximates the real system to build the digital twin, providing the possibility to implement MPC using orthonormal Kautz functions in the prediction module.
Figure 1 shows a block diagram of the concept of the proposed structure.
This type of structure has advantages such as accuracy and precision, simplified understanding of the model representation, broad applicability, labeled data making the model practical, and the possibility for continuous improvement.
Focusing on the Update module, the following sections will provide an overview of the implemented and utilized approaches and methodologies.
2.2. Orthonormal Kautz Functions
Kautz functions are, in general, orthonormal functions that allow the approximation of impulse (transient) characteristics of systems with real and complex poles. Modeling using Kautz functions is a generalized foundation for the synthesis of MPC, which is presented in detail in [31].
Orthonormal Kautz functions overcome the limitation of another popular type of functions—Laguerre functions—as they allow the system under consideration to have complex poles, thereby providing the possibility to approximate more complex systems. Kautz orthonormal functions are calculated through the inverse Laplace transform, with three main cases being distinguished depending on the type of poles. They are obtained in operator form and are also referred to as Kautz networks. The three main cases based on the type of poles are as follows:
Case A: For non-identical real poles;
Case B: For non-identical complex poles;
Case C: For a combination of real and complex poles.
The Kautz functions K(t) are obtained from a network, including both real and complex poles, through a representation using a state-space model:
where F(t) is the state vector,
is the impulse response of the system, and the matrices
are determined by the location of the poles.
Thus, if the system has only real poles p1, p2, and p3, the state vector is chosen accordingly, with the corresponding three Kautz functions obtained through the system (1) as follows:
When the values of p1, p2, and p3 are equal, a Laguerre network is obtained.
If the system, for example, has three real poles p1, p2, and p3 and a pair of complex conjugate poles, then the Kautz functions are given by the following:
In Figure 2a, the graphs of the Kautz functions for N = 3 are presented, with p1 = 1.1, p2 = 0.8, and p3 = 2.1. In Figure 2b, the Kautz functions for N = 4 are shown, with p1 = 1.1, p2 = 0.8, p3 = 1.5, and p4 = 0.8 ± i2, implemented in the MATLAB environment.
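For the case of distinct real poles, the construction can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in the paper. It assumes the standard cascaded form K_n(s) = sqrt(2 p_n)/(s + p_n) · Π_{j<n} (s − p_j)/(s + p_j), with the listed values treated as positive decay rates (poles at s = −p). The function names are hypothetical. The time-domain functions follow from a partial-fraction expansion, and orthonormality is checked analytically through the Gram matrix.

```python
import math

def kautz_real_residues(poles):
    """Residues r[n][k] with K_n(t) = sum_k r[n][k] * exp(-poles[k] * t).

    Assumes distinct real values p > 0, i.e. s-plane poles at s = -p.
    """
    residues = []
    for n, p_n in enumerate(poles):
        row = []
        for k in range(n + 1):
            p_k = poles[k]
            num = math.sqrt(2.0 * p_n)
            for j in range(n):           # all-pass numerators (s - p_j) at s = -p_k
                num *= (-p_k - poles[j])
            den = 1.0
            for j in range(n + 1):       # remaining factors (s + p_j) at s = -p_k
                if j != k:
                    den *= (poles[j] - p_k)
            row.append(num / den)
        residues.append(row)
    return residues

def kautz_eval(poles, residues, n, t):
    """Value of the n-th function (0-based index) at time t."""
    return sum(r * math.exp(-poles[k] * t) for k, r in enumerate(residues[n]))

def gram_matrix(poles, residues):
    """Inner products <K_m, K_n> on [0, inf), using int exp(-(a+b)t) dt = 1/(a+b)."""
    size = len(poles)
    gram = [[0.0] * size for _ in range(size)]
    for m in range(size):
        for n in range(size):
            gram[m][n] = sum(
                r_m * r_n / (poles[k] + poles[l])
                for k, r_m in enumerate(residues[m])
                for l, r_n in enumerate(residues[n])
            )
    return gram

poles = [1.1, 0.8, 2.1]          # values quoted for Figure 2a
residues = kautz_real_residues(poles)
gram = gram_matrix(poles, residues)
```

Under these assumptions the Gram matrix should be the identity up to rounding, which is the orthonormality property the approximation relies on. Complex pole pairs need the two-parameter form and are not covered by this sketch.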
2.3. Approximating Systems Using Kautz Functions
Approximating systems using Kautz orthonormal functions is similar to the use of Laguerre functions, although the Kautz network is much more complex and allows for the approximation of much more intricate systems. This, in turn, expands the capabilities of digital twins and the synthesis of MPC.
Approximating systems means approximating the transient or impulse response. Kautz orthonormal functions are used to approximate only decaying functions, with the condition of exponentiality not being mandatory. This means that the focus is mainly on impulse responses. However, when using a “gray box” or even more so a “black box” approach in creating digital twins, identification based on the transient response is implemented. This is not a problem, as the transition from one to the other characteristic is possible through the following dependencies:
$$h(t) = \frac{dy(t)}{dt}, \qquad y(t) = \int_0^{t} h(\tau)\, d\tau \tag{4}$$
where y(t) is the transient response and h(t) is the impulse response.
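The transition between the two characteristics can be sketched numerically for sampled data. The example below is an illustration only: it assumes uniformly sampled values with a hypothetical step dt, uses central differences for the derivative and the trapezoidal rule for the integral, and the function names are not taken from the paper.

```python
import math

def impulse_from_step(y, dt):
    """Estimate h(t) = dy/dt from uniformly sampled transient-response values."""
    n = len(y)
    if n < 2:
        raise ValueError("at least two samples are required")
    h = [0.0] * n
    h[0] = (y[1] - y[0]) / dt                    # forward difference at the start
    h[-1] = (y[-1] - y[-2]) / dt                 # backward difference at the end
    for i in range(1, n - 1):
        h[i] = (y[i + 1] - y[i - 1]) / (2.0 * dt)  # central difference inside
    return h

def step_from_impulse(h, dt):
    """Recover y(t) as the running trapezoidal integral of h, with y(0) = 0."""
    y = [0.0]
    for i in range(1, len(h)):
        y.append(y[-1] + 0.5 * dt * (h[i - 1] + h[i]))
    return y

# Test signal: first-order lag, y(t) = 1 - exp(-t), so h(t) = exp(-t).
dt = 0.01                                        # assumed sampling step
t = [i * dt for i in range(900)]
y = [1.0 - math.exp(-ti) for ti in t]
h = impulse_from_step(y, dt)
y_back = step_from_impulse([math.exp(-ti) for ti in t], dt)
```

Numerical differentiation amplifies measurement noise, so with recorded data some filtering before this step is usually advisable.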
Thus, by representing the impulse response using the Kautz functions, an approximated model of the following form is obtained:
$$h(t) \approx \sum_{i=1}^{N} c_i K_i(t) \tag{5}$$
Equation (5) is valid only when all the poles of the system lie in the left half of the complex plane and the condition for L2 stability of the system is satisfied. The coefficients c_i are determined by the following:
$$c_i = \int_0^{\infty} h(t)\, K_i(t)\, dt \tag{6}$$
The integral mean square error of the approximation is calculated using the following relation:
$$\int_0^{\infty} \Big( h(t) - \sum_{i=1}^{N} c_i K_i(t) \Big)^2 dt \tag{7}$$
Theoretically, if the order of the approximated model N is equal to the number of poles and the values of the poles of the Kautz filter are the same as the values of the system’s poles, the approximated system will be identical to the real system.
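The statement above can be checked with a small numerical sketch. It assumes distinct real poles only, the standard cascaded form of the Kautz functions, and a hypothetical test impulse response built from exponentials with the rates 1.1, 0.8, and 2.1. The coefficients are taken as the inner products of the impulse response with each function, and the error as the integral of the squared residual, both evaluated with the trapezoidal rule on a finite horizon.

```python
import math

def kautz_real(poles, t):
    """Values K_1(t)..K_N(t) for distinct real poles at s = -p, with p > 0."""
    values = []
    for n, p_n in enumerate(poles):
        total = 0.0
        for k in range(n + 1):
            num = math.sqrt(2.0 * p_n)
            for j in range(n):
                num *= (-poles[k] - poles[j])
            den = 1.0
            for j in range(n + 1):
                if j != k:
                    den *= (poles[j] - poles[k])
            total += num / den * math.exp(-poles[k] * t)
        values.append(total)
    return values

def trapezoid(samples, dt):
    """Trapezoidal rule on a uniform grid."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

def kautz_fit(h, t, poles, dt):
    """Return the expansion coefficients and the integral squared error."""
    rows = [kautz_real(poles, ti) for ti in t]
    basis = [[row[i] for row in rows] for i in range(len(poles))]
    coeffs = [trapezoid([hv * kv for hv, kv in zip(h, k_i)], dt) for k_i in basis]
    residual = [
        hv - sum(c * k_i[m] for c, k_i in zip(coeffs, basis))
        for m, hv in enumerate(h)
    ]
    return coeffs, trapezoid([r * r for r in residual], dt)

dt = 0.005                               # assumed integration step
t = [i * dt for i in range(6001)]        # horizon of 30 s, long enough for decay
# Hypothetical impulse response with poles at -1.1, -0.8 and -2.1.
h = [math.exp(-1.1 * ti) - math.exp(-0.8 * ti) + 0.5 * math.exp(-2.1 * ti) for ti in t]

coeffs_matched, ise_matched = kautz_fit(h, t, [1.1, 0.8, 2.1], dt)
coeffs_mismatched, ise_mismatched = kautz_fit(h, t, [3.0, 4.0, 5.0], dt)
```

With the network poles equal to the system poles the error drops to the level of the quadrature error, while the deliberately mismatched poles leave a clearly visible residual.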
In practice, when calculating the Kautz decomposition coefficients, the values of the poles can vary within certain limits, which accounts for the good robust properties of control systems that use MPC synthesized in this way. The graphs showing the coincidence of the real and approximated impulse responses of a system with two real poles and one complex conjugate pair of poles as the pole values change are shown in Figure 3a,b.
The graphs shown in Figure 3 clearly demonstrate that the values of the real and complex poles have a critical impact on the quality of approximation. For the real poles of the system, a maximum is observed around the actual pole values, forming an insensitivity zone within approximately 10% of the true value. As the value increases, the approximation accuracy decreases exponentially, while at lower values, zones of deteriorated accuracy emerge. Regarding the complex poles, Figure 3b reveals a very small region (a plateau) near the actual values of the pole pair, where the approximation error is extremely low. The resulting surface exhibits local minima and folds, indicating that the relationship is nonlinear and requires careful tuning of the pole values for optimal results.
Complex poles have a significant effect on approximation accuracy—poorly chosen values lead to weak results, while well-selected values yield substantially better approximations.
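The flat maximum around the true pole can be illustrated with the simplest possible case, which is not the four-pole system of Figure 3. The sketch assumes a first-order impulse response h(t) = exp(−a t) approximated by a single function K(t) = sqrt(2p) exp(−p t). In this case the captured fraction of the impulse-response energy has the closed form 4ap/(a + p)². The function name and the chosen values are hypothetical.

```python
def relative_accuracy(a, p):
    """Energy fraction of h(t) = exp(-a t) captured by one function with pole p."""
    c = (2.0 * p) ** 0.5 / (a + p)   # c = integral of h(t) K(t) over [0, inf)
    energy = 1.0 / (2.0 * a)         # integral of h(t)^2 over [0, inf)
    ise = energy - c * c             # residual energy after projection
    return 1.0 - ise / energy        # equals 4 a p / (a + p)^2

a = 1.1                              # assumed true pole value
sweep = {scale: relative_accuracy(a, scale * a) for scale in (0.5, 0.9, 1.0, 1.1, 2.0)}
```

In this toy case a 10% pole error costs roughly 0.2% of the energy, while halving or doubling the pole costs about 11%, which is consistent in spirit with an insensitivity zone near the true value.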
2.4. System Approximation Using Kautz Functions and DNN
In the results derived in the previous section, it is evident that by using orthonormal Kautz functions, we can approximate systems with a certain tolerance regarding parametric uncertainties. However, this tolerance is not very large, around 16% for real poles and much smaller for complex poles.
On the other hand, it is well known that DNNs yield very good results in tasks related to system approximation. They are also a key element in the structure of almost every digital twin, as well as in the current one.
The use of DNNs could address some of the issues associated with the Kautz functions by expanding the scope of uncertainties, including unmodeled dynamics, while providing accuracy and broad applicability and enhancing the procedures in the Update module of the digital twin when using the “gray box” or “black box” method in the identification process. Since the digital twin has the capability to operate in real time, which is a relative term and depends on the transient processes of the system, DNNs are used to approximate the Kautz functions, rather than the transfer characteristic itself. This is a significantly simpler task and helps prevent additional issues typical of modules that use elements of machine learning and artificial intelligence, such as small datasets.
To achieve the goal set, it is necessary to construct a hybrid structure combining a Kautz network with a DNN. The deep neural network must have good approximation capability, without saturation, and with controllable parameters. The neural network used has four parallel branches, each with several hidden layers. The hybrid structures that are formed cover the variants for Kautz functions with only real poles (Figure 4) and combinations of real and complex poles (Figure 5). Implementing a Kautz network with only complex poles represents a special case.
The neural network used solves a task that cannot be classified entirely as either approximation or decomposition and is better characterized as feature extraction.
Figure 6 shows the architecture of the implemented DNN.
The DNN employs a branched architecture, with all branches merged by a summation layer. The goal is to achieve significant depth for the extraction of various characteristics of the Kautz functions and the integration of the information. Since the Kautz functions are significantly more complex than the Laguerre functions, it is necessary to increase the number of neurons in the fullyConnectedLayer layers.
The input layer is chosen to be of the sequenceInputLayer type, as the current task involves data with time dynamics.
In the main part of the network, four parallel branches are implemented, built with a combination of fullyConnectedLayer and preluLayer. The neurons in the first branch start from 256, in the second from 128, and so on, with all branches ending with 32 outputs before merging. The use of PReLU allows the network to flexibly adapt the slope for negative values, which is important for capturing nonlinearities and oscillations, typical of systems with complex poles, such as those used in Kautz functions. After merging, the softmaxLayer normalizes the information. This layer also serves as a mechanism for weighting the characteristics from the four branches, allowing the network to dynamically adapt to the type of input dynamics.
After parallel processing, further refinement of the result is performed by merging the information from the branches using additionLayer and softmaxLayer layers, implementing a residual structure (residual path). This allows the network to maintain depth without suffering from gradient loss, while simultaneously keeping information from earlier layers active during training.
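The data flow through the branched part of the network can be sketched as a plain forward pass. This is not the MATLAB network of Figure 6. Several details are assumptions made only for illustration: the widths of the third and fourth branches (64 and 32), the number of layers per branch, a fixed PReLU slope of 0.25 (it is learnable in practice), a single input feature, random untrained weights, and the omission of any output layers after the softmax.

```python
import math
import random

def init_dense(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def prelu(x, slope=0.25):
    """PReLU with a fixed slope for negative inputs (assumed value)."""
    return [v if v >= 0.0 else slope * v for v in x]

def softmax(x):
    peak = max(x)                     # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def build_branch(n_in, widths, rng):
    layers, size = [], n_in
    for width in widths:
        layers.append(init_dense(size, width, rng))
        size = width
    return layers

def run_branch(layers, x):
    for layer in layers:
        x = prelu(dense(x, layer))    # fully connected layer followed by PReLU
    return x

def forward(branches, x):
    outputs = [run_branch(branch, x) for branch in branches]
    merged = [sum(vals) for vals in zip(*outputs)]   # element-wise addition merge
    return softmax(merged)                           # normalization after merging

rng = random.Random(0)
# Assumed widths: branches start at 256, 128, 64, 32 and all end with 32 outputs.
BRANCH_WIDTHS = [[256, 128, 64, 32], [128, 64, 32], [64, 32], [32]]
branches = [build_branch(1, widths, rng) for widths in BRANCH_WIDTHS]
merged_features = forward(branches, [0.5])
```

Because every branch ends with the same width, the element-wise addition is well defined, which is the practical reason for the common 32-output constraint.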
3. Results and Analysis
This section presents results from the operation of the developed software system shown in Figure 4 and Figure 5, using the DNN shown in Figure 6, implemented in the MATLAB environment. The section is divided into two parts. The first part presents simulation results for the approximation of a sample system, while the second part presents experimental results with data from measuring the rotational speed of an electrohydraulic system.
3.1. Simulation Results
The implementation of approximation with orthonormal Kautz functions and DNN for generating decomposition coefficients is tested with the impulse characteristics of a sample system described by its transfer function of the following form:
The system has two real poles and one pair of complex conjugate poles.
The Kautz network used has the same number of poles (N = 4) and generates four decomposition coefficients for the four generated orthonormal functions.
The training dataset is created by generating the impulse response of the system (8), with embedded parametric uncertainties and random noise. Each parameter has a 50% uncertainty, and the noise is set to 10%, with a sample size of 50 different functions.
The neural network is trained using the formed training samples, with 85% of the data used for training and 15% for validating the training. The training parameters are as follows: MaxEpochs = 1000, InitialLearnRate = 0.00001, and GradientThreshold = 0.0001, using the Adam optimizer.
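The dataset construction and the split can be sketched as follows. The nominal system here is hypothetical, since it only mimics the pole structure described above (two real poles and one complex pair). Two interpretations are also assumed: a 50% uncertainty is taken as a uniform scaling of each parameter between 0.5 and 1.5 of its nominal value, and 10% noise as Gaussian noise with a standard deviation of 10% of the peak response.

```python
import math
import random

# Hypothetical nominal parameters: two real poles and one complex pair.
NOMINAL = {"p1": 1.0, "p2": 3.0, "sigma": 0.5, "omega": 2.0}

def impulse_response(params, t):
    """Impulse response of the hypothetical system on the time grid t."""
    return [
        math.exp(-params["p1"] * ti)
        - math.exp(-params["p2"] * ti)
        + math.exp(-params["sigma"] * ti) * math.sin(params["omega"] * ti)
        for ti in t
    ]

def perturb(nominal, spread, rng):
    """Scale every parameter by a uniform factor in [1 - spread, 1 + spread]."""
    return {name: value * rng.uniform(1.0 - spread, 1.0 + spread)
            for name, value in nominal.items()}

def add_noise(signal, level, rng):
    """Add Gaussian noise with standard deviation level * peak amplitude."""
    sigma = level * max(abs(v) for v in signal)
    return [v + rng.gauss(0.0, sigma) for v in signal]

def make_dataset(n_samples, t, spread, noise_level, rng):
    dataset = []
    for _ in range(n_samples):
        params = perturb(NOMINAL, spread, rng)
        dataset.append((params, add_noise(impulse_response(params, t), noise_level, rng)))
    return dataset

def split(dataset, train_fraction, rng):
    """Shuffle a copy and split it into training and validation parts."""
    shuffled = dataset[:]
    rng.shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

rng = random.Random(42)
t = [0.02 * i for i in range(500)]        # assumed time grid of 10 s
dataset = make_dataset(50, t, spread=0.5, noise_level=0.1, rng=rng)
train_set, val_set = split(dataset, 0.85, rng)
```

With 50 responses, an 85%/15% split cannot be exact; truncating gives 42 training and 8 validation responses, which is one possible rounding.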
Figure 7a–d show the error graphs for the training of the four neural networks, along with the validation data.
Figure 8 graphically presents the system approximation with coefficients generated by the neural networks. The achieved approximation accuracy between the real and approximated characteristics is 99.7% under the given uncertainties and noise.
3.2. Experimental Results
The real electro-hydraulic servo system with throttle control is considered. The system is described in detail in [32,33,34]. In general, the system represents a laboratory stand for studying methods for controlling the speed of a hydraulic motor. The stand is equipped with all the necessary elements and subsystems to implement several operating modes. In this type of electro-hydraulic system, there are significant parametric uncertainties and unmodeled dynamics, which depend on the increase in the temperature of the working fluid.
Using the Series Data Acquisition NI USB-6215 measurement system, real-time data for the transient characteristic of the considered open-loop system were recorded. A set of 30 transient characteristics was obtained, which will be used for training the neural network. For each transient characteristic, 900 values were recorded. From a practical perspective, during identification, the transient characteristic of the system is always used, while the impulse response is obtained using the following dependency:
$$h(t) = \frac{dy(t)}{dt} \tag{9}$$
where y(t) is the transient response and h(t) is the impulse response.
Figure 9 shows the graphical results for the recorded transient responses and the obtained impulse responses.
It is assumed that the structure and model of the system are unknown, and system identification is performed with the MATLAB System Identification Toolbox using the “black box” method. Although a complete mathematical model of the system is provided in the literature sources cited above, it is used for validation of the obtained model. During the identification process, the transfer function of the system is obtained with an accuracy of 97.31%. The obtained transfer function is of the third order with three distinct real poles and has the following form:
The obtained model of the system is used in the training of the DNN as the nominal model of the system. The same percentage ratio is maintained in the training, with 85% of the data used for training and 15% for validating the training.
Figure 10a–c show the graphs for the training errors using the training data.
The obtained training graphs of the DNNs show that both the training and validation errors decrease sharply and stabilize around the 300th iteration, reaching values below 10⁻⁶, and even below 10⁻⁷ for the second approximating coefficient. This indicates excellent approximation and good generalization of the model. This result is expected, considering that after the identification process, system (10) contains only real poles. All three training processes demonstrate high efficiency and excellent agreement between the model and the data, without any signs of overfitting. This consistency of the results under different input conditions indicates that the proposed network architecture is well-suited to the task. Moreover, the chosen data split strategy (85% training/15% validation) proves to be effective. The achieved accuracy in generating the approximating coefficients confirms the reliability of the model in practical applications. The trained DNNs are thus able to generate the approximating coefficients with high accuracy, which leads to a precise approximation of the system.
Figure 11 shows the graphical results from the approximation of the real system. The red dashed line represents the approximated characteristic, while the blue color indicates all the measured values of the impulse characteristics of the real system that are used in the training dataset. It can be seen that the approximated characteristic matches the values obtained through the measurements. The achieved result confirms that the combination of orthonormal Kautz functions and DNN can be effectively used in the architecture of the digital twin.
4. Conclusions
This paper presents a method for combining orthonormal Kautz functions and deep neural networks for system approximation, aimed at use within the structure of a digital twin. The architecture of the digital twin, based on supervised machine learning, highlights the significant role of DNNs. By combining with Kautz functions, several advantages are achieved, such as accuracy and precision, clarity in model evaluation and understanding, practicality through labeled data, continuous updates, and the possibility of applying the approach in both the Update and Prediction modules of the digital twin.
The designed DNN is tested for training using both simulation and experimental data, achieving an average MSE value around 1 × 10⁻⁶ in both cases. Validation with other data shows that the neural network is well-trained without overfitting, and the model is both adequate and stable. The results demonstrate that the use of orthonormal Kautz functions and DNN in the structure of a digital twin is effective, covering a broad range of systems under conditions of noise and uncertainty.