Causality in Control Systems Based on Data-Driven Oscillation Identification

Abstract: This paper addresses causality analysis using simulated data and data collected from a real control system. The simulated data include Gaussian and Cauchy noise signals, while the real time series include various, mostly unknown distortions such as trends, oscillations, and noise. The presented research focuses on the oscillatory component in the data and its propagation in multi-loop control systems. Oscillation identification is based on a deep decomposition process applied to control error time series, and the identified periodic signals are used for further causality processing. The analysis uses the Transfer Entropy approach, which belongs to the group of model-free methods: information pathways are determined without any model or a priori process knowledge. The research investigates the impact of the oscillatory time-series component on Transfer Entropy causality analysis. The summary compares the observations obtained for the simulated datasets with those collected from real processes. The obtained results show that the analysis works properly on simulated data. In contrast, the direct application of oscillation decomposition in real industrial cases may be misleading, and large datasets demand modifications to the methodology. Different variants are tested. They show that oscillation propagation is biased in real systems and, therefore, the decomposition should be applied with caution. Furthermore, it is important to remember that transitioning an algorithm from simulated data to real industrial data is demanding and should be done with the utmost care.


Introduction
Causality analysis requires a specific approach and designated methods that allow one to determine the internal relations between different variables [1]. Various approaches have been developed in different domains. Currently, there are four leading approaches to detecting causality from historical process data: Granger Causality, Transfer Entropy, Cross-correlation, and Partial Directed Coherence [2]. Cross-correlation is simple, but it assumes linearity [3]. Some of these methods are based on a process model, like the coherence method [4] and Granger causality [5] with its extension, Partial Directed Coherence. Spectral Granger causality uses the Fourier Transform, which limits the method to linear and stationary cases. Others belong to the group of data-driven, model-free methods, like the Transfer Entropy approach [6]. The analysis of complex and varying control systems, for which the exact model is unknown or its estimation is highly uncertain, encourages the use of data-driven methods. This is a great simplification in the field of causality determination as well as fault propagation research. Faults can be observed, among others, as system oscillations.
The industrial demands for causality analysis were defined quite early [7], but research on causality root-cause analysis in the area of process control is relatively new and only a few works have been published. The introductory paper [2] has been followed by comparisons of Granger causality and Transfer Entropy in [8,9]. Transfer Entropy research has continued in [10] with a hybrid combination of TE and a modified conditional mutual information (CMI) approach, which uses multi-valued alarm series instead of data time series. The authors of [11] used an opposite approach: assuming that periodic elements do not bring any value, they apply denoising and periodicity-removing time-delayed convergent cross mapping (TD-CCM), an extension of the original algorithm [12] and its time-delayed version [13].
The Transfer Entropy approach has been addressed by a few works. A systematic Transfer Entropy workflow for oscillation diagnosis is proposed in [14], while the impact of outliers is addressed in [15]. Finally, the TE approach is improved by information granulation in [16].
AI-driven methods, such as neural or Bayesian networks, are uncommon in process industry applications. They are often much harder to train because of the increasing size and complexity of the used datasets [11]. Moreover, labeled causality examples would be needed for training. The authors therefore decided not to use AI and to develop the TE approach instead.
Our approach assumes that periodicity contains important information and propagates through the system. Therefore, the Transfer Entropy approach uses decomposed oscillation elements.
An example of oscillatory behavior propagating through a control system is discussed in [17]. Oscillation diagnosis may allow one to trace the root cause of such faults. It answers the question of which control loop of a given system introduces disturbances that cause improper performance, observed as too aggressive or too sluggish operation [18]. Thus, assuming that the oscillations in a multi-loop control system carry valuable information, it is imperative to conduct research in this direction. In the considered analysis, the innovative application of the Transfer Entropy approach is combined with oscillation decomposition, leading toward time-frequency causality analysis.
The algorithm should work not only in a simulation or laboratory environment but also in industrial reality, which introduces various disturbances and limitations. Non-stationarity and non-Gaussian properties are common; they often manifest as fat tails in the variable distributions and as outliers. One has to be aware that outlying observations are very frequent in industrial practice [19]. They may deteriorate statistical estimation and regression analysis [20], and also oscillation decomposition [21]. This research takes this aspect into account and tries to mitigate the problem through subset decomposition and median filtering. Our contribution consists of two parts. First, we show how the performance of decomposition methods can be extended to very long time series. Second, real-world examples show the inadequacy of existing oscillation decomposition methods for industrial data. This observation opens a new direction for further research.
The combination of the Transfer Entropy method with frequency analysis has not previously been used with either simulation or real data; however, it may greatly simplify the diagnosis of fault propagation based on a data-driven, model-free approach.
Once the causality graph is identified, misbehavior can be traced in multi-loop configurations. Frequently, the poor tuning of a single loop causes low performance in the rest of the process. Thus, the identification of such loops may help control engineers in multi-loop system tuning and fault diagnosis. Moreover, causality graphs may help to properly organize an intelligent alarming system that protects against multiple alarms, making the installation operator's job easier.
The paper is organized as follows: Section 2 contains an overview of the applied methods and algorithms, while Section 3 describes the simulation and industrial case studies. Section 4 presents the obtained results, including causality diagrams reflecting relationships between variables based on oscillation signals. Section 5 concludes the paper and identifies open issues for further research.

Proposed Methodology
It has been shown that the diagnosis of large-scale industrial control systems, especially the creation of a good mathematical model representation, is a tedious process whose accuracy depends on many factors [22]. Depending on the adopted methodology, the analysis can be based on various types of data, ranging from data measured directly from the given control loops (raw datasets), through noise, to oscillations occurring in almost every system [15]. The last-mentioned type of data is strongly related to Time-Frequency (T-F) analysis, a powerful tool that can be successfully used to determine interconnections between variables and units (causality). These relations represent the information flow (caused by control loops) in the process and can be "measured" with the Transfer Entropy (TE) approach. The combination of T-F analysis with TE has not previously been observed in the literature, in particular in industry-related applications.
The proposed median-ensembled version of EEMD (MEEMD), which belongs to the class of noise-assisted EMD methods, helps to decompose signals into the oscillations required for further analysis. This approach is used together with specific statistical data handling. The analysis is applied to two case studies: a known, simulated environment and an unknown industrial case.

Transfer Entropy Approach
Transfer Entropy (TE) is an information-theoretic interpretation of Wiener's causality definition. In practice, it is a measure of information transfer from one variable to another, quantifying the reduction of uncertainty under an assumption of predictability. TE is given by Equation (1), reconstructed here from the definitions that follow:

T_x→y = Σ p(y_{i+h}, Y_i, X_j) · log [ p(y_{i+h} | Y_i, X_j) / p(y_{i+h} | Y_i) ],    (1)
where p denotes the complete or conditional Probability Density Function (PDF), the embedded vectors are Y_i = [y_i, y_{i−τ}, ..., y_{i−(l−1)τ}] and X_j = [x_j, x_{j−τ}, ..., x_{j−(k−1)τ}], τ is the sampling interval, and h is the prediction horizon. In other words, TE is the difference between the information about a future observation of y obtained from the simultaneous observation of past values of both x and y, and the information about the future of y obtained from the past values of y alone. Information transfer in both directions is highly probable; that is why the measure T_x→y = T_x|y − T_y|x is decisive as to both quantity and direction, which constitutes causality. The practical implementation of the Transfer Entropy approach between a pair of variables according to Equation (1) requires its simplification to the form of Equation (2), reconstructed here with single lags:

T_x→y = Σ p(y_{i+h}, y_{i−t}, x_{i−τ}) · log [ p(y_{i+h} | y_{i−t}, x_{i−τ}) / p(y_{i+h} | y_{i−t}) ],    (2)

where p denotes the conditional PDF, and τ and t are the time lags in x and y, respectively. If the time series is short, t is set to 1 under the assumption that the maximum auto-transfer of information occurs from the data point immediately preceding the target value in y.
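As an illustration, the simplified TE of Equation (2) can be estimated with a plug-in histogram approach. The sketch below is not the paper's implementation; the function name, bin count, and estimator choice are our assumptions, and a production version would need bias correction and significance testing.

```python
import numpy as np

def transfer_entropy(x, y, tau=1, t=1, bins=8):
    """Histogram-based estimate of T_{x->y} with prediction horizon h = 1."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    lag, n = max(tau, t), len(x)
    # Aligned samples: future of y, lagged past of y, lagged past of x.
    yf, yp, xp = y[lag:], y[lag - t:n - t], x[lag - tau:n - tau]

    def disc(v):  # discretize into symbols 0 .. bins-1
        return np.digitize(v, np.histogram_bin_edges(v, bins)[1:-1])

    joint = np.zeros((bins, bins, bins))
    for a, b, c in zip(disc(yf), disc(yp), disc(xp)):
        joint[a, b, c] += 1
    p3 = joint / joint.sum()          # p(y_f, y_p, x_p)
    p_yp_xp = p3.sum(axis=0)          # p(y_p, x_p)
    p_yf_yp = p3.sum(axis=2)          # p(y_f, y_p)
    p_yp = p3.sum(axis=(0, 2))        # p(y_p)

    num = p3 * p_yp[None, :, None]
    den = p_yf_yp[:, :, None] * p_yp_xp[None, :, :]
    mask = p3 > 0                     # marginals are positive wherever p3 is
    return float(np.sum(p3[mask] * np.log2(num[mask] / den[mask])))
```

For a pair where y is driven by past values of x, the estimate in the x-to-y direction should clearly exceed the reverse direction.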

Median Ensemble Empirical Mode Decomposition
Median Ensemble Empirical Mode Decomposition (MEEMD) is a variation of the EEMD algorithm that uses the median operator instead of the mean operator to ensemble noisy Intrinsic Mode Function (IMF) trials [23]. The use of this algorithm is a practical extension of the classic EMD and a justified choice for real-world applications. The EMD method was developed so that data can be examined in an adaptive time-frequency-amplitude space for nonlinear and non-stationary signals [24]. It decomposes the input signal into a few Intrinsic Mode Functions and a residue:

I(n) = Σ_{m=1}^{M} IMF_m(n) + Res_M(n),

where I(n) is the multi-component signal, IMF_m(n) is the mth intrinsic mode function, and Res_M(n) represents the residue corresponding to M intrinsic modes. The proposed median EEMD (MEEMD) algorithm defines the median operator over the ordered list IMF(t) of ensemble IMF values at time instant t, obtained from N independent noise realizations: the middle element for odd N, or the mean of the two middle elements for even N. Consider a real-valued signal x(t) and a predefined noise amplitude; the used MEEMD is outlined in Algorithm 1.

Algorithm 1 Algorithm of MEEMD
1. Generate the ensemble y_n(t) = x(t) + ω_n(t) for n = 1, ..., N, where ω_n(t) ∼ N(0, 1);
2. Decompose every member of y_n(t) into M_n IMFs using the standard EMD, to yield the set (d_m^n(t)), m = 1, ..., M_n;
3. Assemble same-index IMFs across the ensemble using the median operator to obtain the final IMFs within MEEMD; for instance, the mth IMF is computed as the pointwise median of {d_m^n(t)} over n = 1, ..., N.

Nevertheless, EMD, and thus its derivative MEEMD, has limitations, in particular the dependency between the length of the analyzed data and the number of IMFs resulting from the decomposition process.
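Step 3 of Algorithm 1, the median ensembling, can be sketched in a few lines, assuming the per-realization IMFs have already been computed by an EMD routine (the EMD step itself is outside this sketch; in practice it would come from an EMD implementation such as the PyEMD package):

```python
import numpy as np

def median_ensemble(imf_trials):
    """Pointwise median across noise realizations (step 3 of Algorithm 1).

    imf_trials: array-like of shape (N realizations, M IMFs, T samples),
    i.e. the IMFs d_m^n(t) produced by running EMD on each noisy copy.
    """
    return np.median(np.asarray(imf_trials), axis=0)
```

Unlike the mean used in plain EEMD, the median is unaffected by a single badly behaved trial in the ensemble.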

Statistical Analysis
One should remember that industrial datasets can include many observations; therefore, the constraint of the oscillation detection algorithm (datasets cannot be too long) might limit the analysis. Splitting long datasets into smaller subsets is only one way to overcome the limitation of the MEEMD method described in Section 2.2. Statistical analysis suggests that it is natural to use the arithmetic mean; however, analyzing dataset/signal values with the arithmetic mean does not always produce accurate results. It is known that the arithmetic mean yields adequate (true for a given sample) information only when the sample comes from a population with a normal distribution or one close to it. If the population distribution is not close to normal, the median is used instead to infer the central value of the sample. If industrial data is considered as a time series, the question arises of whether the generated noise can be smoothed using multi-period smoothing (averaging filters). However, by their construction some information is lost, as a result of which the extrema of the smoothed signal may appear in inappropriate places, contrary to real conditions. The obtained information would then be highly incorrect or even distorted. Previous papers show that raw datasets collected from real-scale industrial processes do not have normal distributions. Considering the above, the use of a median filter is justified.
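The argument for the median over the mean can be illustrated numerically; the sample size and seed below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(42)
gauss = rng.normal(0.0, 1.0, 10_000)
cauchy = rng.standard_cauchy(10_000)   # heavy-tailed: the mean does not exist

# For Gaussian data the mean and median agree; for Cauchy data the
# sample mean is dominated by a few extreme draws, while the sample
# median stays near the true location (0).
print("gauss :", np.mean(gauss), np.median(gauss))
print("cauchy:", np.mean(cauchy), np.median(cauchy))
```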
In the classical median filter, several consecutive samples are taken and their median is determined, yielding a modified sample. Working with datasets of such size is still insufficient for further analysis and inference. Therefore, the n-element dataset is divided into windows whose lengths correspond to natural divisors of n (in this case 10, 25, 50, and 100, respectively).
In each window, the k samples are arranged in non-decreasing order, and for each ordered window the median is determined as [25] (reconstructed from the surrounding definitions):

Me = x_{(k+1)/2} for odd k,    Me = (x_{k/2} + x_{k/2+1}) / 2 for even k,

where x_k is the kth value of the ordered window. The position of the median (i.e., the middle element) in each window is thus set at half the size of the window under consideration. In this way, based on the medians from each of the windows, a shorter, filtered sample is obtained.
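A minimal sketch of this windowed median filtering, assuming the window length divides the series length (as with the natural divisors 10, 25, 50, and 100 mentioned above):

```python
import numpy as np

def window_median_filter(x, window):
    """Shorten a series by replacing each consecutive block of `window`
    samples with its median; `window` is assumed to divide len(x)."""
    x = np.asarray(x, float)
    k = len(x) // window
    return np.median(x[: k * window].reshape(k, window), axis=1)
```

For example, a 59,900-sample signal filtered with a window of 100 is reduced to 599 samples.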

The Method
The methodology of the analysis depends on the type of data, especially on the length of the time series. The simulation data presented in Figure 1a allow a simpler approach, as no additional data preprocessing is required. In our case, they are disturbed by randomly generated Gaussian measurement noise and optionally by a Cauchy disturbance. Oscillations are introduced using an added sinusoidal signal with a frequency of 60 Hz, not exceeding 10% of the mean value of the data. Real process data consist of longer time series; therefore, they require preprocessing, which is proposed in two separate variants, as in Figure 1b. Both approaches can be summarized in a single algorithm. The methodology is described by Algorithm 2. It uses acquired process data representing control errors for M loops of length N observations, X_k ∈ R^{M×N}. In the proposed analysis, the suggested maximum time-series length to work without decomposition is set to N_max = 2000.

Algorithm 2 TE algorithm with oscillation decomposition
1. Acquire data time series
2. Detrend data using polynomial/spline interpolation
3. If N > N_max: select a splitting variant and split the detrended data into n subsets of length N < N_max
4. Decompose each subset with MEEMD into M_k = round(log(N)) IMFs (use Algorithm 1)
5. Calculate TE coefficients d_i for corresponding pairs of IMFs (use Equation (2))
6. Draw causality diagram using the d_i coefficients
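The splitting and IMF-count steps of Algorithm 2 can be sketched as follows; the ceiling-division splitting strategy and the natural-log base in M_k = round(log(N)) are our assumptions, since the text does not fix them:

```python
import numpy as np

N_MAX = 2000  # suggested maximum subset length from the paper

def split_series(x, n_max=N_MAX):
    """Split a long detrended series into consecutive subsets, each no
    longer than n_max samples (one possible splitting variant)."""
    x = np.asarray(x)
    n_sub = -(-len(x) // n_max)          # ceiling division
    return np.array_split(x, n_sub)

def expected_imf_count(n):
    """M_k = round(log(N)); the logarithm base is assumed natural here."""
    return int(round(np.log(n)))
```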

Research Case Studies
Both simulation environment and industrial data are summarized below.

Simulation Data
To generate data for the analysis, an artificial multi-loop simulation model is created in the MATLAB Simulink® environment. The model scheme is presented in Figure 2. The model consists of five control loops: four PID controllers (R1, R2, R3, and R4) and one PI controller (R5). The control errors from the loops, numbered 1 to 5 in the scheme, are the basis for the problem analysis. The model incorporates the possibility of applying noise. In this case, two variants are given: in the first, the simulation data is distorted by Gaussian noise (represented by variable σ_i); in the second, by both Gaussian and Cauchy (represented by variable γ_i) noises. A sinusoidal signal is added to the Cauchy disturbance to simulate potential loop oscillations of known frequency and their propagation. The simulated process transfer functions are linear models, with feedforward filters defined accordingly. The controllers are roughly tuned, with their parameters sketched in Table 1. Disturbance decoupling is done using an industrial design [26], with a block diagram sketched in Figure 3. The generated simulation data, including Gaussian noise, is presented in Figure 4. For each σ_i variable, the noise parameters are selected randomly. In the second variant, both the Gaussian and Cauchy disturbances are randomly selected, and the frequency of the added sinusoidal signal equals 30 Hz. The result of the above assumptions is presented in the form of trends in Figure 5. Next, the control error histograms are evaluated (see Figure 6) to check the statistical properties of the data. It is shown that even in the case of only Gaussian noise, the loop statistical properties are not Gaussian and exhibit fat tails. Therefore, robust statistical estimators should be used to protect against misfitting.
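For illustration only, a disturbance generator along the lines described above might look as follows; all parameter names and default values are hypothetical, not taken from the simulation model:

```python
import numpy as np

def make_disturbance(n, fs, signal_mean, f_osc=30.0,
                     gauss_sigma=0.05, cauchy_gamma=0.0, rng=None):
    """Illustrative generator for the two disturbance variants: Gaussian
    noise only, or Gaussian + Cauchy noise with an added sinusoid whose
    amplitude stays within 10% of the signal mean."""
    rng = rng or np.random.default_rng()
    d = rng.normal(0.0, gauss_sigma, n)               # Gaussian noise
    if cauchy_gamma > 0.0:                            # second variant
        t = np.arange(n) / fs
        d += cauchy_gamma * rng.standard_cauchy(n)    # Cauchy disturbance
        d += 0.1 * signal_mean * np.sin(2 * np.pi * f_osc * t)
    return d
```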
Because the relationships between the signals of the control loops are known, Figure 7 shows the causality diagram for the simulation model. A solid line means a direct relationship; a dashed line means an indirect relationship.

Industrial Data Overview
The industrial data used in this part of the research originate from an ammonia production plant, specifically from the ammonia synthesis loop. The process uses the catalytic reaction of hydrogen, produced in earlier stages of production, with nitrogen obtained from ambient air to produce anhydrous liquid ammonia. The process runs in the so-called synthesis loop, where the synthesis gas circulates and mixes with the fresh gas input. The reaction takes place on a ferric catalyst. The obtained heat heats up the steam and preheats the gas entering the installation. The included time series of control errors describe five PID loops controlling the final stage of ammonia production, i.e., the synthesis loop. The loops denoted u, w, x, and y represent level control in the separators, while loop z is temperature control.
Determining causality between variables of any large-scale industrial process requires proper preparation of data for analysis. As processes are mostly complex and non-linear, the data series obtained from such plants are characterized by non-Gaussian distributions. The data is affected by noise as well as by outliers appearing during measurements or resulting from external actions, for example, human activity. Basic preprocessing of such data consists primarily of removing the trend, as well as separating other components important for the analysis of the given signals.
Causality detection is performed on oscillations obtained through the decomposition of the initially pre-processed dataset, which contains five separate variables designated u, w, x, y, and z. Initial pre-processing consists of trend removal based on the most common polynomial interpolation (shifting the signals by a constant value, i.e., a 0th-order polynomial) for u to y and 2nd-order spline interpolation for z. The detrended data for the analysis is presented in Figure 8. The signals are characterized by fairly high stationarity with a clear presence of outliers. Their impact on the quality of results using the Transfer Entropy approach has been tested [27] and may be ignored for the purposes of this paper. Each variable is a sequence of 59,900 samples, which is crucial information for further calculations. Similar to the simulation case, Figure 9 presents control error histograms, which are not Gaussian either, justifying the use of robust estimators.

Results
The presentation of the results is grouped into two parts: simulated data and real data.

Results for Simulation Data
The simulation data is decomposed into 7 IMFs for both variants (data length set to 2000 samples for each control error). Figure 10a presents the results for variable 3 with Gaussian noise added. Oscillations are barely visible in IMFs d_4, d_5, d_6, and d_7; the IMFs designated d_1, d_2, and d_3 are considered noise. The situation is different in the second variant: the oscillation phenomenon is much more visible with Cauchy noise and the sinusoidal signal (of known frequency) added (Figure 10b). Consequently, the research is conducted on the last four IMF functions. Tables 2 and 3 show the values of the Transfer Entropy coefficients for simulation data with Gaussian noise and with Gaussian and Cauchy noises, respectively. An analysis of the determined TE coefficient values for simulation data with Gaussian noise shows that after the decomposition process mostly NaN or Inf values appear (considered as 0). According to the calculated results, in this case it is impossible to determine the causality graph. There can be many reasons for such poor behavior. The most probable is that the Gaussian disturbance is too weak and does not stimulate the simulation system sufficiently. The oscillations detected in the decomposition process are negligible, as shown in Figure 10a. Therefore, the Transfer Entropy approach is ineffective and the determined coefficient values should be completely rejected. According to the analysis of the TE coefficients for oscillations with Gaussian and Cauchy noises, the results are completely different and the causality graph can be successfully determined; it is presented in Figure 11. For all IMF functions of the individual control errors, the largest Transfer Entropy coefficients appear for the same pairs of variables.
The expected causality diagram for the simulation model (Figure 7) assumes direct and indirect relations, but focusing on causality for the highest TE value in a row, such a complicated graph cannot be determined. However, when comparing the two graphs, there are some subtle differences: the lack of dependence between variables 3 and 1, and the reversed direction of causality between variables 5 and 2. Considering the second largest TE coefficient for the pairs of variable 3 with the other variables, the causality graph would show the relationship between 3 and 1, as originally assumed. The inverse, presumed causality between 5 and 2 should be considered in further research. The correctly obtained results are the effect of stimulating the simulation system not only with additional Cauchy noise but also with a sinusoidal signal of known frequency. The causality graph obtained as a result of the calculations should be considered correct and sufficient.
Figure 11. Causality diagram for simulated data with Gaussian noise and Cauchy disturbance.

Results for Industrial Data
In the case of industrial data, due to the length of the time series, each signal from the detrended dataset can be decomposed into a large number of IMFs. However, this is impractical. In the literature, it can be found that only the first five IMFs are selected for analysis, with the remaining ones summed as non-significant. This assumption is too subjective; it is unjustified to ignore the fact that more IMFs are obtained for such a long time series. Since the application of the MEEMD algorithm is computationally demanding, the following assumptions for the industrial data are stated. In the first variant, the whole dataset is split into 50 (and, respectively, 49) sub-datasets of equal length in two ways (S_1 = [s_11, s_12, ...] and S_2 = [s_21, s_22, ...]), as shown in Figure 12. Such overlapping of the obtained sub-datasets allows checking the potentially emerging frequency bands that are key in the given considerations. The second variant shortens the dataset using a Median Filter (MF), due to the mentioned enormous number of IMFs and the savings in computation time; this operation is described in Section 2.3. This action aims not only to compare variants of oscillations with different frequencies on the same data but also to confront the statement that the use of a Median Filter deletes (filters) higher frequencies so that they are no longer visible. For simulation data the above does not apply, because it is possible to generate time series of the required length. As a result, the industrial datasets are divided into 50 (and 49) overlapping sub-datasets. Next, the MEEMD algorithm is applied to decompose the individual signals from the control loops. Sample data for the 1st variant is presented in Figure 13, which shows the MEEMD results for a randomly selected pair of overlapping sub-datasets (in this case for signal y).
It can be noticed that there are six IMFs (d_1 to d_6) for both sub-dataset lengths in the 1st variant. The first three IMFs are considered noise; the most interesting ones are d_4, d_5, and d_6, for which actual oscillation occurs. The analysis of Figure 13 confirms that emerging frequency bands are not skipped by this operation. Given the number of sub-datasets in both the 1st and 2nd variants, it is impossible to show all the MEEMD results. The use of the median operator is justified by the fact that industrial control error time series are mostly non-Gaussian and fat-tailed [28]; therefore, a robust shift estimator is required. The median is the simplest robust estimator. It has a breakdown point of 50% [20], meaning that half of the points must be outlying observations before the median can be moved outside the inlying range, and thus it is used.
Due to the large amount of data, it is impossible to illustrate the causality in the form of Transfer Entropy coefficients for each dataset. For both variants, the relationship of every single signal (u, w, x, y, z) with the others is determined. It should be noted that no universal threshold method for the Transfer Entropy approach has been developed. Recalling that each signal is decomposed into 6 IMFs (the most interesting are the d_3 to d_6 oscillations) for the 1st variant and into 4 to 7 IMFs for the 2nd variant, it is rational to make the following assumptions: if the TE value comes out as NaN or Inf, it is set to 0; otherwise, the relationship between the variables is determined based on the highest value in a given row. The results presented in Sections 4.2.1 and 4.2.2 are the totals of cases where the TE coefficient in a row for a given pair of variables is the highest. Tables 4 and 5 show these sums for the 50 and 49 overlapping sub-datasets, respectively. An analysis of the determined TE coefficient values shows that after decomposition of the individual signals u to z, NaN or Inf values appear only in the case of d_4. The TE coefficients for pairs from d_4 are also clearly greater than those obtained for d_5 or d_6. This suggests rejecting this dataset, which could otherwise affect the shape of the resulting causality graph.
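The stated decision rule (NaN/Inf values treated as 0, then the highest value in each row) can be sketched as follows; the function name is ours, and self-pairs are assumed to be excluded upstream:

```python
import numpy as np

def strongest_link_per_row(te_matrix):
    """For each source variable (row), return the index of the target
    variable with the highest finite TE value; NaN/Inf count as 0."""
    te = np.asarray(te_matrix, dtype=float)
    te = np.where(np.isfinite(te), te, 0.0)   # NaN or Inf -> 0
    return te.argmax(axis=1)
```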

The 1st Variant
Since it was decided to average the TE coefficient values for d_5 and d_6 and to check the relationships between signals on this basis, including 0 in the mean (as assumed for NaN and Inf) would clearly distort the causality results. Tables 6 and 7 show the results of this operation.
Table 4. Sum of cases where the Transfer Entropy coefficient in a row for a given variables pair is the highest (1st variant, 50 sub-datasets).
Based on the results given in Tables 4-7, a causality graph can be determined for the most frequently repeated pattern; it is shown in Figure 14. The analysis of the tables allows one to state unequivocally that the signals u, w, x, and y show a close causal relationship with the signal z. There are, of course, exceptions, as for the signal u (d_5) (16 causal relationships for both y and z in Table 4 and the same number for w and z in Table 5) or for the signal w (Table 4, d_5), but the general rule is maintained. Doubts arise in the case of signal z: a relationship appears with both signals u and x separately. For the sums of the highest Transfer Entropy coefficients determined on the basis of the average TE values of the sets d_5 and d_6, the results for the 50 sub-datasets unequivocally define the causality of the signals u, w, x, and y with the signal z, and of the signal z with the signal u; however, in the case of 49 sub-datasets the causal relationship is the same for u and y. Since such a relationship does not repeat elsewhere, it can be omitted.

The 2nd Variant
According to Section 2.3, the median filter is used to shorten the data. Operating with an MF window of 10 samples and determining the Transfer Entropy coefficients for the signal pairs of the whole dataset, only NaN or Inf values are obtained. The same situation repeats for a window of 25 samples, so these two variants are completely rejected from the analysis. Correct results in the form of real numbers are obtained for windows of 50 and 100 samples (except for the causality of signal u, where NaN or Inf values yield 0). Tables 8 and 9 show the measure of causal relationships for the oscillations of individual pairs of variables for windows of 50 and 100, respectively. Due to the lack of consistency and repeatability of the results for the 2nd variant, they cannot be generalized to only two versions of the causality graph, as in Section 4.2.1. Figure 15 presents the causality graphs for oscillations from the decomposed detrended signals.
Comparing the results obtained with the dataset division of the 1st variant, especially the causality graphs shown in Figure 14, it is seen that those presented in Figure 15 share no common features. The strong relationship with the variable z does not occur as prominently as in the 1st variant. The results vary significantly not only in this aspect but also in the repeatability of the TE coefficient values for the corresponding IMFs. The situation is much better for the results presented in Table 8, which is reflected in the causality graphs (Figure 15a,b). The graphs are similar to each other to a certain extent; however, they cannot be generalized to a single case. Even worse conclusions are drawn from the analysis of the results presented in Table 9. The highest TE coefficient value in a row occurs for different pairs of variables within one signal (d_3 and d_4). Attention is also drawn to the fact that, for a window of 100 samples, the IMFs marked d_3 give 0 (NaN or Inf) for the pairs of variable u with variables x, y, and z; for d_4 such a problem does not occur. Nevertheless, causality graphs can be determined, and their forms are presented in Figure 15c,d. The graphs are unfortunately so inconsistent with each other that none can be accepted as a correct result. Due to the occurrence of NaNs and Infs adopted as 0, it is impossible to average the entropy coefficient values as was done for the 1st variant (Tables 6 and 7).
As one can see, it is hard to find a single explanation for the above industrial example. The process knowledge shows that inferring causality from the highest TE coefficient for the oscillations is most likely wrong, and the causality graphs do not reflect the correct relationships. The information contained in the oscillations obtained during the decomposition process for the 1st variant may be insufficient for the successful application of the Transfer Entropy method, or it may be lost. The use of the Median Filter has a positive effect on the computational demands. However, further analysis of the oscillations obtained from the shortened datasets does not bring the desired results.
These observations clearly show that the analysis of industrial data differs completely from theoretical analyses. Simple relations and approaches developed in a sterile simulated environment do not necessarily carry over to real applications.

Conclusions and Further Research
The presented approach extends the classical model-free TE approach by changing the domain of causality detection. The original approaches take the process data as they are. We investigate the hypothesis that the identification can be performed by taking into account only the oscillatory component of the data. This modification can be easily combined with the TE method using common oscillation detection methods, for instance, the MEEMD algorithm. It is difficult to compare the TE method with other approaches, as this method is model-free, while the others assume some process models. One has to be aware that such models are rarely known in the process industry, and any misfit in modeling may raise the dilemma of what is wrong: the model or the causality detection. Therefore, the methods requiring model assumptions are not considered. In the case of the industrial data used in the analysis, the modeling effort would unnecessarily increase the complexity without any benefit.
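For reference, the core of the Transfer Entropy computation can be sketched with a simple histogram (plug-in) estimator with history length one. This is an illustrative sketch only, not the estimator used in the paper; the equal-width binning scheme, the number of bins, and the toy driver signal are all assumptions.

```python
from collections import Counter
from math import log2

def transfer_entropy(x, y, bins=4):
    """Plug-in estimate of TE(x -> y) with history length 1,
    using equal-width binning (illustrative sketch only)."""
    def discretize(s):
        lo, hi = min(s), max(s)
        w = (hi - lo) / bins or 1.0
        return [min(int((v - lo) / w), bins - 1) for v in s]
    bx, by = discretize(x), discretize(y)
    n = len(by) - 1
    trip = Counter(zip(by[1:], by[:-1], bx[:-1]))  # (y_next, y, x)
    pair = Counter(zip(by[1:], by[:-1]))           # (y_next, y)
    cond = Counter(zip(by[:-1], bx[:-1]))          # (y, x)
    marg = Counter(by[:-1])                        # (y,)
    te = 0.0
    for (yn, yc, xc), c in trip.items():
        p_joint = c / n
        p_next_given_yx = c / cond[(yc, xc)]
        p_next_given_y = pair[(yn, yc)] / marg[yc]
        te += p_joint * log2(p_next_given_yx / p_next_given_y)
    return te

# Toy check: y is a one-step delayed copy of a pseudo-random
# driver x, so TE(x -> y) should exceed TE(y -> x).
def lcg(n, seed=1):
    out, s = [], seed
    for _ in range(n):
        s = (1103515245 * s + 12345) % 2**31
        out.append(s / 2**31)
    return out

x = lcg(2000)
y = [0.5] + x[:-1]
print(transfer_entropy(x, y) > transfer_entropy(y, x))  # → True
```

Note that such plug-in estimates are biased upwards on finite data, which is one reason small spurious TE values appear even between unrelated signals; on the oscillatory IMFs this effect compounds with the decomposition artifacts discussed above.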
The paper focuses on simulated and industrial data processing used to discover the oscillations existing in control-loop variables. Time series decomposition is performed using the MEEMD algorithm. The disadvantages and limitations of this approach force the authors to prepare specific industrial data for the analysis. Two variants are checked. The key issue is the application of the Transfer Entropy algorithm to the oscillations obtained from the simulated and industrial data and the subsequent generation of the causality graphs.
It is shown that causality analysis is a complex task. There is no single solution that always works. The paper uncovers two realities. The simulation example, despite being non-Gaussian, works perfectly. There is no doubt that the proposed methodology works in that case. Application of the successfully tested method to normal, not specifically treated industrial data opens Pandora's box. The results become counterintuitive, variable, unrepeatable, and difficult to interpret. It appears that long data files pose a challenge to the methods used. The question of why is hard to answer. There might be several reasons, known and unknown: non-stationarity, non-linearity, outliers, data granulation, fractality, human interventions, uncoupled disturbances (weather?), and many others.
In the case of the simulated data, the data are coherent with the applied methods, and the limitations of the algorithms are not violated. However, in the case of industrial data, we are never sure about their actual properties. We do not know whether they are stationary. We do not know what type of non-stationarity they exhibit. We should investigate the outliers. We should identify the oscillations. We also need to analyze the residuum and its statistical properties.
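Such preliminary checks need not be elaborate. As a hedged illustration only (the split-half heuristic and the Tukey fence constant are assumptions, not a formal test procedure), a first screening of an industrial record might look like this:

```python
from statistics import mean, pstdev, quantiles

def split_half_drift(series):
    # Crude non-stationarity indicator: compare the mean and spread
    # of the two halves of the record (not a formal stationarity test).
    half = len(series) // 2
    a, b = series[:half], series[half:]
    return abs(mean(a) - mean(b)), abs(pstdev(a) - pstdev(b))

def iqr_outliers(series, k=1.5):
    # Classic Tukey fence: flag points beyond k * IQR from the quartiles.
    q1, _, q3 = quantiles(series, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in series if v < lo or v > hi]

data = [0.1, 0.2, 0.15, 0.18, 0.12, 9.0]   # one gross outlier
print(iqr_outliers(data))  # → [9.0]
```

Flags raised by such cheap screens would indicate that a formal stationarity test and a dedicated outlier treatment are needed before any entropy-based causality analysis is attempted.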
There are various issues that should be taken into consideration. Once the process data are properly identified, a proper methodology may be selected or adapted. However, research on the internal properties of industrial data and their compliance with the limitations of the methods used is infrequent.
The above observations should not discourage anyone; they represent a research challenge. However, this challenge requires a different approach. Simulations are not sufficient, and an in-depth, step-by-step comparative analysis must be carried out on several industrial sites; real ones, not laboratory ones. Of course, this is difficult and demanding. Only then is it possible to delve deeper into the issue and perhaps, but not certainly, find a solution.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The author declares no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: