Investigating Data-Driven Systems as Digital Twins: Numerical Behavior of Ho–Kalman Method for Order Estimation

: System identiﬁcation has been a major advancement in the evolution of engineering. As it is by default the ﬁrst step towards a signiﬁcant set of adaptive control techniques, it is imperative for engineers to apply it in order to practice control. Given that system identiﬁcation could be useful in creating a digital twin, this work focuses on the initial stage of the procedure by discussing simplistic system order identiﬁcation. Through speciﬁc numerical examples, this study constitutes an investigation on the most “natural” method for estimating the order from responses in a convenient and seamless way in time-domain. The method itself, originally proposed by Ho and Kalman and utilizing linear algebra, is an intuitive tool retrieving information out of the data themselves. Finally, with the help of the limitations of the methods, the potential future outlook is discussed, under the prism of forming a digital twin.


Introduction
Adaptive control has been quite popular over the last fifty years [1,2], with a variety of methodologies available [3]. As a matter of fact, as early as 1955, the adaptive techniques have been reported to be widely utilized in industry and this can be come across in literature [4]. The comparative advantage, being the lack of the model, has helped in creating huge related literature. Even nowadays, with Industry 4.0-like movements across the Globe being the main streams of digitalization trends in industry [5,6], the cognitive functionalities of automation (exploiting Cyber-Physical Systems and Internet of Things) have been integrated to a great extent and the use of adaptive control techniques has been spread even more. Also, there have been reported works [7], where the well-established technology of adaptive system identification has been presented as an underlying technology for a digital twin.
Applications of adaptive control can be found literally everywhere. From domestic applications [8], to engineering [9] and manufacturing [10], it is highly evident that adaptive control is very useful. Indicatively, recently, identification techniques had been used to model a system response originating from Partial Differential Equations [11] and attempting to control it in an empirical, yet adaptive way. In the case of a digital twin formation, automatic operation is highly important, so the identification phase is of utmost importance.
A brief, yet full, review on the State-of-The-Art on methods of identification techniques-and more specifically on the issue of choosing the order of the model-reveals initially the use of empirical methods such as trial and error [12], estimation utilizing the frequency domain [13], and maximum a posteriori (MAP) method [14]. Works have been done previous on the choice the the most suitable method [15]. Furthermore, the co-variances matrix and the residual whiteness are two more methods [16,17] that are often discussed. Moreover, the set of Bayesian information criterion (BIC)/Akaike information criterion (AIC)/generalized information criterion (GIC) methodologies is another set of methods [18,19] highly utilized; in the literature there has also been a practical comparison between residual sum of squares (RSS) and BIC [20]. It is worth mentioning at this point that the later method(s) implies the integration of the concept of information.
What seems to be missing, however, is a numerical illustration on the simplest, intuitive way to extract such information (meaning the order of the system) from data (the responses themselves). To this end, this work attempts to investigate numerically a simple method for the estimation of the system order, in time domain, utilizing the linear dependence between the sampled data. The concept of the rank of the matrix is utilized as the tool to perform this, as originally suggested by Ho and Kalman [21]. The paper is structured as follows: Firstly, an underlying framework is given. Thus, the methodology of creating a digital twin for a manufacturing process through data-driven models is illustrated. Also, the significance of introducing automated order estimation techniques to such digital twins is pointed out. Next, the Ho-Kalman algorithm is presented and is compared against other methods. In continuation, numerical examples are given on the efficiency of the Ho-Kalman algorithm in various cases. Finally, conclusions are extracted on the significance and the usability of the algorithm.

Framework
Regarding manufacturing processes digital twins, it is extremely useful that they are "near-real-time" [22]. This could be defined as having a running time of at least one order of magnitude smaller than process time constants. This way, control and optimization would be feasible. Data-driven problems in particular are very flexible towards this end, as they can be based on adaptive control techniques. However, the estimation of system order may be yet another loophole as proved with numerical examples herein. The framework implied herein is based on such technologies and the order estimation is discussed.
Under the current framework, the Ho-Kalman estimation is considered as an order estimation algorithm. As proved hereafter, the algorithm could be applied with great success in some cases and the scope of the current work is to examine the applicability of this method. The framework, in full description, is described in the list below, given the fact that the main module of the digital twin is a dynamic system. The main functionality is the control of the physical system (process), but other scopes can be defined on top of that, such as running the simulation to respond to What-if scenarios and be able to find proper working conditions (process parameters). Figure 1 is used to illustrate the operation of such a control-based digital twin.
Ho-Kalman algorithm is applied to estimate the order of the system 3.
Plain estimation techniques are applied (i.e., mean least squares) to retrieve the transfer function(s) Design phase: 1.
Sensors are used to detect the model that should be applied 2.
Sensors are used to measure input and output of the system 2.
(optional) An observer is used to estimate the state (inner variables) of the system 3.
The control signal is generated and control is applied (these may be two different steps depending on the implementation) Figure 1. Underlying framework that takes order estimation into account.

Method
As briefly aforementioned, the method investigated here is based on the fact that linear systems response values at time n (in the case of discrete systems) are linear combinations of previous values at time n − k, for some n, k ∈ N. Therefore, the concept of linear independence is exploited, through the concept of ranks of matrices. To achieve this, a matrix is formed, containing translated versions of the response, as shown in Equation (1), given a response y[n].
The order of the system S that had y[n] as a response, is expected to be equal to the rank of this matrix, namely ρ N×N = ρ(Ỹ N×N ). Even in the marginal case where the N is taken to be equal to M + 1 (with M being the order of the system) it is evident that the rank of the matrix is equal to the order of the system, as shown in Equation (2).
In the next sections, the numerical performance of this algorithm is investigated with respect to the complexity of the system; the order itself, the system structure and potential noise interfering.

Comparison to Other Methods and Correlation to Information
For reasons of completeness, this simple method should be compared against other ones. So, to this end, the following response of Equation (3) is utilized. The investigated method gives out explicitly (and correctly) an order of 5. However, AIC-based order estimation gives out 8, as shown in Figure 2, while Co-variant Matrix Method leads to inclusive results, as showed in Table 1 (a potential adoption of order 3 could take place).
(3) Figure 2. Akaike information criterion (AIC) values as a function of the system order (for system of Equation (3)). This small numerical example has pointed out the numerical superiority of this algorithm-in a case where the method is applicable in its current form. Interestingly enough, the whole point of modelling with a differences equation is of course to be able to reproduce a sequence by a finite (smaller) number of numbers. This slightly reminds one of Chaitin's work [23] on linking compression with theory (the concept of statistical inference is also relevant). The only difference herein is that instead of utilizing bits, one tries to compress numbers into numbers, regardless of digits. The same principle lies behind the use of auto-encoders, as illustrated in Figure 3. The objective is to utilize a representative feature or equivalently representative compressed signal or image. Thus, the complexity and the dimensionality of the data is reduced. This is very useful for cases where the computational effort has to be reduced. The rank of the responses matrix (even in its full infinite version) is an index of such a complexity (information). So, transformation metrics related to invertibility can also be used, such as the determinant or the eigenvalues distribution. Alternatively, the Lagrangian (or instead a custom Liapunov function) of the system can be used as a different metric. Such a function is of degree higher than linear, thus there is link to correlation matrix method as well. This kind of compression has been very useful in cases where the complexity is simply measured by the rank of the system, such as the case of tool-wear [24].

Numerical Behaviour and Applicability
So, in the context of finding the numerical limitations of this simple method, various systems have been studied in terms of system order identification. In this section particularly, the applicability and the limitations of the method are shown and discussed through specific paradigms.

Simple Numerical Examples
To begin with, a first-order system-that would give a response of the form y 1 [n] = Be −Qn -is utilized (Equation (4)). Thus, a responses matrix of dimensions 5 × 5 would be given byỸ 1 5×5 .
Even using symbolic matrices, without specific values, the rank of the matrix, for various dimensions, is equal to 1, as also computationally shown in Figure 4, for a specific value of Q. This is easily proved, as each row (or column) is the product of the previous one with e −Q . So far, everything seems to work well. However, in reality, the sampled values of the responses contain noise, either from measurement, or from sampling itself. Therefore, in this section, noise is going to be regarded, as this is the case in all measured responses. To simulate this, a uniform random number is added after sampling the response, which is regarded in continuous time. Supposing that the continuous-time response is a simplified version of the above one, sampling is applied. In the case where the amplitude of the noise is relatively small, then, as shown in Figure 5, the convergence is rather rapid. However, if the noise amplitude is increased by one order of magnitude (same Figure), then the convergence becomes much slower. Oddly enough, the unitary signal (f (t) = 1) has been added to the response on purpose. It has been observed that if the mean value of the response is increased by an offset, then the method converges much faster. Also, the adoption of a row-echelon form of the responses matrix also seems to accelerate the convergence of the method. Furthermore, to study the effect of the poles' proximity on a second-order system, the following response of Equation (5) is regarded given that δp = 2 −q .
The elaboration of such a system has as a goal to study the numerical limitations of the method, as the system tends to be a double-pole system in the limit of q approaching infinity. Simultaneously, the effect of the dominance of one pole is studied. The results are shown in Figure 5. Evidently, the rank remains equal to 2, for values of q < q 0 and N ∈ {2, . . . , 8}, depending also on Signal-to-Noise Ratio (SNR) value. As proved with this numerical study, it seems that for extreme cases of poles' proximity, if the SNR becomes substantially small, then it requires a lot of data for the order estimation alone. For precise calculations in some applications, specifically where this almost double pole affects the controller design, then the response size has to reach up to thousands of samples.
Regarding the implementation of the training phase, this affects the use of higher storage capacity, higher memory for processing and faster processors, potentially prohibiting the use of low-performance embedded systems.

Performance on Systems of Higher Order
To move on to higher order systems, the (arbitrarily chosen) following response (Equation (6)) consisting of N 0 terms is considered: The choice of this series as a system leads to a very interesting diagram of ρ N×N as a function of N. The results are given in Figures 6 and 7, for f 0 (t) = sint and f 0 (t) = e −t , respectively. It is quite interesting that in the case of sinusoidal functions, some sort of numerical effect takes place. This drives the rank estimation evolution (since it is a function of responses matrix size) to converge to the value of 2N 0 at a "faster" rate. This should be investigated to a further extent, through the consideration of a case of a system of even larger order. This is not the case when dealing with exponential functions, probably due to bad condition number of the responses matrix. Also, since often there can be responses of high order [25], i.e., close to 100 [26], similar case have also been included here.

Non-Homogeneous Systems
Moving on to a different kind of complexity, one can form a matrix for the case of for a non-homogeneous system, such as in the case of y (t) + 0.9y(t) = e −t/8 . The rank of a 10 × 10 responses matrix would be equal to P + Z = 2, where the Z is the number of Zeros and P the number of poles (1 and 1 respectively). If one augments this matrix to be 11 × 10 or 10 × 11, padding with (translated) excitation function values to the bottom or to the right, as shown in Equation (7) below, then the rank remains equal to 2. This indicates that the order of the differential equation is equal to P (equal to 1 in this case), as the input and the output have linearly dependent terms.

Summary and Future Outlook
The simplest and most intuitive method for estimating the order of the model by the responses themselves in time-domain, utilizing the rank of the responses matrix, has been reviewed as a candidate method for forming automatically a digital twin. It has been proved to be quite successful in many cases, with the efficiency of the algorithm depending highly on the order of the problem as well as the available response dataset size. More specifically, the algorithm appears to work successfully under any kind of system, even though a large amount of data is needed when SNR becomes smaller. The use of this large amount of data seems to cancel the uncertainty introduced by noise. However, its extension towards practical rank estimation algorithms or iterative decomposition of responses is required so that it is able to handle highly noisy data. This iterative method, may be loosely correlated to the Gram-Schmidt procedure [27], however, one must have in mind the issue of orthogonality of signals. Nevertheless, the Ho-Kalman algorithm has been proved a very promising tool that may be really useful in slightly noisy data of high dimensionality. In addition to the above, it can be very useful in terms of clustering, in a two-fold way; speeding up computations and offering intuition to a human operator. Also, it seems to be a powerful tool towards compression of data. It has the capability to represent a whole signal through some sort of complexity abstraction.
The extensibility of the Ho-Kalman method in non-linear systems ought to be further investigated. The first barrier in this direction is expected to be the very formation of a responses matrix. However, more elaborate tools can be used towards this. Moreover, it seems that as far as the formation of a digital twin is concerned, the use of such a method is promising, however, in the presence of noise, the method has to be combined with the use of another method. Digital twins, either for continuous manufacturing processes, such as plastic extrusion and chemical reactions, or discrete manufacturing processes, such as laser welding and additive manufacturing, will highly benefit from order estimation techniques. This benefit is highly linked to the automated decision making procedure, as the order estimation will not be some kind of fuzzy process that the engineer has to go through. The digital twin will automatically select the values of the (data-driven) model parameters and the control signals generation will be performed automatically and seamlessly.
Funding: This research was partially funded as per the acknowledgement.