# Detection of Causality between Process Variables Based on Industrial Alarm Data Using Transfer Entropy

## Abstract

## 1. Introduction

Method | Authors | Variable Type | Main Characteristic |
---|---|---|---|
signed directed graph | Yang, Shah and Xiao [5,6] | continuous | qualitative model to detect the cause and effect relationship |
event correlation analysis | Noda, Higuchi, Takai and Nishitani [7] | continuous | a data mining method to detect statistical similarities |
Granger causality | Granger [10] | continuous | based on AR models |
extended Granger causality | Ancona, Marinazzo and Stramaglia [11] | continuous | nonlinear extension of Granger causality |
nearest neighbor methods | Bauer, Cox, Caveness, Downs and Thornhill [12] | discrete | data-driven and operating on the process measurements stored in a data historian |
transfer entropy | Schreiber [13] | continuous | based on information theory |
direct transfer entropy | Duan, Yang, Chen and Shah [16] | continuous | extension of TE to detect direct relationship |
transfer zero-entropy | Duan, Yang, Shah and Chen [17] | continuous | avoid estimating high dimensional pdfs using 0-entropy |
symbolic transfer entropy | Staniek, Lehnertz [20] | discrete | avoid estimating high dimensional pdfs using a technique of symbolization |
this paper | | discrete | use natural binary alarm data for causality detection |

## 2. Concept of Transfer Entropy

#### 2.1. Basic Definition

#### 2.2. Discrete Version of TE

#### 2.3. Required Assumptions

## 3. Transfer Entropy Based on Alarm Data

1. Obtain the original data;
2. Preprocess data and obtain series $I$, $J$;
3. Estimate $\mathrm{TE}_{JI} = \mathrm{TE}(I, J, k, l)$;
4. for t = 1:reptime {
5. $J' = \mathrm{surrogate}(J)$;
6. Estimate $\mathrm{TE}_{t} = \mathrm{TE}(I, J', k, l)$;
7. }
8. $\mathit{threshold} = P_{95}(F_{n}(\mathrm{TE}_{t}))$;
9. Compare $\mathrm{TE}_{JI}$ with $\mathit{threshold}$
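The procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function names `transfer_entropy`, `surrogate` and `te_significance` are hypothetical. The TE estimate is assumed to be a plain plug-in (frequency-count) estimate on binary series. The surrogate is assumed to be a random permutation of $J$, which may differ from the surrogate scheme used in the paper. $P_{95}(F_n(\cdot))$ is read here as the 95th percentile of the empirical distribution of the surrogate TE values.

```python
import numpy as np

def transfer_entropy(i, j, k=1, l=1):
    """Plug-in TE estimate (bits) from binary series j to binary series i."""
    i = np.asarray(i, dtype=int)
    j = np.asarray(j, dtype=int)
    m = max(k, l)
    n = len(i) - m
    # Encode the k past values of i and the l past values of j as integers.
    i_past = sum(i[m - a - 1 : m - a - 1 + n] << a for a in range(k))
    j_past = sum(j[m - a - 1 : m - a - 1 + n] << a for a in range(l))
    i_next = i[m : m + n]
    # Joint frequency table over (i_next, i_past, j_past).
    counts = np.zeros((2, 2**k, 2**l))
    np.add.at(counts, (i_next, i_past, j_past), 1)
    p_xyz = counts / n
    p_yz = p_xyz.sum(axis=0, keepdims=True)      # p(i_past, j_past)
    p_xy = p_xyz.sum(axis=2, keepdims=True)      # p(i_next, i_past)
    p_y = p_xyz.sum(axis=(0, 2), keepdims=True)  # p(i_past)
    nz = p_xyz > 0
    ratio = (p_xyz * p_y)[nz] / (p_xy * p_yz)[nz]
    return float(np.sum(p_xyz[nz] * np.log2(ratio)))

def surrogate(j, rng):
    """Assumed surrogate: random permutation, destroying any coupling to i."""
    return rng.permutation(j)

def te_significance(i, j, k=1, l=1, reptime=100, seed=0):
    """Steps 3 to 9: estimate TE, build the surrogate threshold, compare."""
    rng = np.random.default_rng(seed)
    te_ji = transfer_entropy(i, j, k, l)
    te_surr = [transfer_entropy(i, surrogate(j, rng), k, l)
               for _ in range(reptime)]
    threshold = float(np.percentile(te_surr, 95))
    return te_ji, threshold, te_ji > threshold

# Toy usage: i is a one-step delayed copy of j, so j should drive i.
rng = np.random.default_rng(1)
j = (rng.random(2000) < 0.3).astype(int)
i = np.roll(j, 1)
te_ji, threshold, significant = te_significance(i, j)
print(te_ji, threshold, significant)
```

Larger values of `reptime` give a more stable threshold at the cost of more TE evaluations.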

#### 3.1. Alarm Series and Data Collection

**Figure 1.** False alarms and missed alarms generated by noise (a false alarm at $t=7$, a missed alarm at $t=62$).

#### 3.2. Data Preprocessing

- (1) Interpretation of the preprocessing: The motivation for the moving average method is straightforward. The state of a real industrial process, whether normal or abnormal, generally persists for quite a long time. A single alarm with no other alarms before or after it over a long period is therefore more likely to be a false alarm caused by noise, unlike the similar single signals or “spikes” in neuroscience. Conversely, an alarm-free time bin surrounded by alarms is more likely to be a missed alarm. Such false and missed alarms can be regarded as high-frequency components, which the moving average method reduces significantly.
- (2) Determination of the parameters: The window width u must be chosen for the algorithm, and it depends on several factors. The first is the nature of the industrial process itself, that is, its structure and operation mode, and in particular whether it is a fast or a slow process. In a slow process the measured value changes slowly, so both the normal and the abnormal state last longer and the influence of the noise is smaller; u can then be set to a larger value to improve the accuracy. In a fast process, u should be small. The second factor is the sampling interval: if the interval is large, u should be small so that no further information about the process is lost; otherwise u can be large. The total length of the time series should also be considered, and the ISA (International Society of Automation) 18.2 standard [27], which sets the limit of an alarm flood at 10 alarms per 10 min, should be met. Determining u is therefore quite complex.
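As an illustration of the preprocessing idea, the following Python sketch smooths a binary alarm series with a centered moving average of width u and then re-binarizes it. The function name `smooth_alarms`, the restriction to odd u, and the majority-vote threshold of 0.5 are assumptions made for this example; the exact filter and binarization rule may differ from the authors' choice.

```python
import numpy as np

def smooth_alarms(alarms, u):
    """Centered moving average of a 0/1 alarm series, re-binarized at 0.5."""
    alarms = np.asarray(alarms, dtype=float)
    if u < 1 or u % 2 == 0:
        raise ValueError("u is assumed to be a positive odd integer")
    window = np.ones(u) / u
    # mode="same" keeps the series length; the edges are implicitly zero-padded.
    averaged = np.convolve(alarms, window, mode="same")
    # Majority vote inside the window (assumed rule).
    return (averaged > 0.5).astype(int)

# Toy series: an abnormal period over t = 50..79, plus an isolated alarm
# at t = 7 and an alarm-free bin at t = 62 inside the abnormal period.
clean = np.zeros(100, dtype=int)
clean[50:80] = 1
noisy = clean.copy()
noisy[7] = 1    # false alarm
noisy[62] = 0   # missed alarm
filtered = smooth_alarms(noisy, u=5)
print(np.array_equal(filtered, clean))
```

With u = 5 the isolated alarm is removed and the gap is filled, while the start and end of the abnormal period are preserved. A much larger u would start to erase short genuine alarm periods, which is the trade-off described in item (2).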

**Figure 3.** The impact of u of the moving average filters. (**a**) A random time series x; (**b**) the ROC curve of x by using moving average filters with different u.

#### 3.3. Estimation of TE

#### 3.4. Estimation of the Significance Level

## 4. Case Studies

#### 4.1. Stochastic Processes

**Example 1.** The first case is described by the following equations:

**Figure 4.** Measured values of X, Y, Z and the corresponding alarm series based on thresholds represented by green lines.

**Figure 5.** Selection of the embedding dimension of y and x for Example 1. (**a**) Selection of the embedding dimension of y; (**b**) selection of the embedding dimension of x for $\mathrm{TE}_{x\to y}$.

**Table 3.** Estimated TE values and the corresponding thresholds (in parentheses) without and with preprocessing, respectively, for Example 1.

$\mathrm{TE}_{\mathrm{row}\to \mathrm{col}}$ | X | Y | Z |
---|---|---|---|
*Without preprocessing* | | | |
X | N/A | 0.0088(0.0005) | 0.0082(0.0006) |
Y | 0.0979(0.0008) | N/A | 0.0225(0.0007) |
Z | 0.0934(0.0004) | 0.0163(0.0014) | N/A |
*With preprocessing* | | | |
X | N/A | 0.0134(0.0006) | 0.0117(0.0007) |
Y | 0.0004(0.0007) | N/A | 0.0127(0.0007) |
Z | 0.0007(0.0015) | 0.0006(0.0007) | N/A |
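The comparison of each TE value with its threshold can be written as a short Python sketch. It assumes that the last three rows of Table 3 are the results with preprocessing and that the value in parentheses is the corresponding threshold; the dictionary layout and variable names are hypothetical.

```python
# (source, target): (estimated TE, threshold), copied from the last three
# rows of Table 3 (assumed to be the "with preprocessing" block).
te_with_preprocessing = {
    ("X", "Y"): (0.0134, 0.0006),
    ("X", "Z"): (0.0117, 0.0007),
    ("Y", "X"): (0.0004, 0.0007),
    ("Y", "Z"): (0.0127, 0.0007),
    ("Z", "X"): (0.0007, 0.0015),
    ("Z", "Y"): (0.0006, 0.0007),
}

# A pair is flagged when its TE exceeds the threshold.
flagged = [pair for pair, (te, threshold) in te_with_preprocessing.items()
           if te > threshold]

for source, target in flagged:
    print(f"{source} -> {target}")
```

Under these assumptions the rule flags X → Y, X → Z and Y → Z. Whether a flagged pair is a direct or an indirect relationship is not decided by this comparison alone.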

**Example 2.** The second example is described by the following nonlinear equations:

$\mathrm{TE}_{\mathrm{row}\to \mathrm{col}}$ | X | Y | Z |
---|---|---|---|
X | N/A | 0.2926(0.0009) | 0.0923(0.0005) |
Y | 0.0015(0.0014) | N/A | 0.0487(0.0004) |
Z | 0.0006(0.0005) | 0.3163(0.0006) | N/A |

#### 4.2. Simulated Industrial Case

$\mathrm{TE}_{\mathrm{row}\to \mathrm{col}}$ | Stream 1 | Stream 8 | Stream 6 | Stream 9 | Stream 10 | Stream 11 |
---|---|---|---|---|---|---|
Stream 1 | N/A | 0.0010(0.0001) | 0.0014(0.0001) | 0.0010(0.0008) | 0.0003(0.0001) | 0.0055(0.0001) |
Stream 8 | 0.0005(0.0002) | N/A | 0.0087(0.0002) | 0.0012(0.0012) | 0.0085(0.0002) | 0.0094(0.0002) |
Stream 6 | 0.0005(0.0002) | 0.0075(0.0001) | N/A | 0.0003(0.0003) | 0.0082(0.0003) | 0.0098(0.0003) |
Stream 9 | 0.0004(0.0002) | 0.0031(0.0002) | 0.0020(0.0003) | N/A | 0.0144(0.0001) | 0.0004(0.0002) |
Stream 10 | 0.0004(0.0003) | 0.0022(0.0014) | 0.0015(0.0009) | 0.0005(0.0002) | N/A | 0.0013(0.0003) |
Stream 11 | 0.0001(0.0004) | 0.0008(0.0003) | 0.0008(0.0003) | 0.0001(0.0002) | 0.0030(0.0010) | N/A |

**Figure 7.** Schematic illustration obtained from the estimated results. Bold arrows show that there is a true cause-effect relationship between the two variables based on process connectivity. Thin arrows with solid lines mean that the estimated result is consistent with the real situation, and those with broken lines mean that the estimated result is wrong.

Rate (%) | Stream 1 | Stream 8 | Stream 6 | Stream 9 | Stream 10 | Stream 11 |
---|---|---|---|---|---|---|
FPR | 0.03 | 0.44 | 0.45 | 1.95 | 0.81 | 1.13 |
TPR | 99.77 | 96.95 | 97.95 | 99.76 | 81.01 | 45.92 |

## 5. Concluding Remarks

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. King, P.L.; Kroeger, D.R.; Foster, J.B. Making Cereal-Not Cars. Ind. Eng. **2008**, 40, 34–37.
2. Yang, F.; Xiao, D. Research Topics of Intelligent Alarm Management. Comput. Appl. Chem. **2011**, 28, 1485–1491.
3. Izadi, I.; Shah, S.L.; Shook, D. Optimal Alarm Design. In Proceedings of the 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, Barcelona, Spain, 30 June–3 July 2009; pp. 651–656.
4. Folmer, J.; Vogel-Heuser, B. Computing Dependent Industrial Alarms for Alarm Flood Reduction. In Proceedings of the IEEE 9th International Multi-Conference on Systems, Signals and Devices (SSD), Chemnitz, Germany, 20–23 March 2012; pp. 1–6.
5. Yang, F.; Shah, S.L.; Xiao, D. Signed Directed Graph Based Modeling and Its Validation from Process Knowledge and Process Data. Int. J. Appl. Math. Comput. Sci. **2012**, 22, 41–53.
6. Yang, F.; Xiao, D.; Shah, S.L. Signed Directed Graph-based Hierarchical Modelling and Fault Propagation Analysis for Large-scale Systems. IET Control Theory Appl. **2013**, 7, 537–550.
7. Noda, M.; Higuchi, F.; Takai, T.; Nishitani, H. Event Correlation Analysis for Alarm System Rationalization. Asia-Pac. J. Chem. Eng. **2011**, 6, 497–502.
8. Yang, F.; Shah, S.L.; Xiao, D.; Chen, T. Improved Correlation Analysis and Visualization of Industrial Alarm Data. ISA Trans. **2012**, 51, 499–506.
9. Hollender, M.; Beuthel, C. Intelligent Alarming, Effective Alarm Management Improves Safety, Fault Diagnosis and Quality Control. Available online: https://library.e.abb.com/public/0d024150cfb0dfd0c125728b0036f2be/20-23%201M703_ENG72dpi.pdf (accessed on 14 August 2015).
10. Granger, C.W.J. Investigating Causal Relations by Econometric Models and Cross-Spectral Methods. Econometrica **1969**, 37, 424–438.
11. Ancona, N.; Marinazzo, D.; Stramaglia, S. Radial Basis Function Approach to Nonlinear Granger Causality of Time Series. Phys. Rev. E **2004**, 70, 056221:1–056221:7.
12. Bauer, M.; Cox, J.W.; Caveness, M.H.; Downs, J.J.; Thornhill, N.F. Nearest Neighbors Methods for Root Cause Analysis of Plantwide Disturbances. Ind. Eng. Chem. Res. **2007**, 46, 5977–5984.
13. Schreiber, T. Measuring Information Transfer. Phys. Rev. Lett. **2000**, 85, 461.
14. Barnett, L.; Barrett, A.B.; Seth, A.K. Granger Causality and Transfer Entropy Are Equivalent for Gaussian Variables. Phys. Rev. Lett. **2009**, 103, 238701.
15. Bauer, M.; Cox, J.W.; Caveness, M.H.; Downs, J.J. Finding the Direction of Disturbance Propagation in a Chemical Process Using Transfer Entropy. IEEE Trans. Control Syst. Technol. **2007**, 15, 12–21.
16. Duan, P.; Yang, F.; Chen, T.; Shah, S.L. Direct Causality Detection via the Transfer Entropy Approach. IEEE Trans. Control Syst. Technol. **2013**, 21, 2052–2066.
17. Duan, P.; Yang, F.; Shah, S.L.; Chen, T. Transfer Zero-entropy and Its Application for Capturing Cause and Effect Relationship between Variables. IEEE Trans. Control Syst. Technol. **2015**, 23, 855–867.
18. Yang, F.; Duan, P.; Shah, S.L.; Chen, T. Capturing Causality from Process Data. In Capturing Connectivity and Causality in Complex Industrial Processes; Springer: New York, NY, USA, 2014; pp. 57–62.
19. Duan, P.; Chen, T.; Shah, S.L.; Yang, F. Methods for Root Cause Diagnosis of Plant-wide Oscillations. AIChE J. **2014**, 60, 2019–2034.
20. Staniek, M.; Lehnertz, K. Symbolic Transfer Entropy. Phys. Rev. Lett. **2008**, 100, 158101.
21. Silverman, B.W. Chapter 4: The Kernel Method for Univariate Data. In Density Estimation for Statistics and Data Analysis; Chapman&Hall Press: Boca Raton, FL, USA, 1986; pp. 77–78.
22. Girod, B.; Rabenstein, R.; Stenger, A. Chapter 11: Sampling and Periodic Signals. In Signals and Systems; Wiley: Hoboken, NJ, USA, 2001; pp. 261–293.
23. Li, X.R. Probability, Random Signals and Statistics; CRC Press: Boca Raton, FL, USA, 1999.
24. Yang, Z.; Wang, J.; Chen, T. Detection of Correlated Alarms Based on Similarity Coefficients of Binary Data. IEEE Trans. Autom. Sci. Eng. **2013**, 10, 1014–1025.
25. Ito, S.; Hansen, M.E.; Heiland, R.; Lumsdaine, A.; Litke, A.M.; Beggs, J.M. Extending Transfer Entropy Improves Identification of Effective Connectivity in a Spiking Cortical Network Model. PLoS ONE **2011**, 6, e27431.
26. Kondaveeti, S.R.; Izadi, I.; Shah, S.L.; Shook, D.S.; Kadali, R.; Chen, T. Quantification of Alarm Chatter Based on Run Length Distributions. Chem. Eng. Res. Des. **2013**, 91, 2550–2558.
27. ISA. Management of Alarm Systems for the Process Industries, 2nd ed.; The International Society of Automation (ISA): Research Triangle Park, NC, USA, 2009.
28. Bauer, M.; Thornhill, N.F. A Practical Method for Identifying the Propagation Path of Plant-wide Disturbances. J. Process Control **2008**, 18, 707–719.
29. Kantz, H.; Schreiber, T. Nonlinear Time Series Analysis; Cambridge University Press: Cambridge, UK, 1997.
30. Downs, J.J.; Vogel, E.F. A Plant-wide Industrial Process Control Problem. Comput. Chem. Eng. **1993**, 17, 245–255.
31. Ricker, N.L. Decentralized Control of the Tennessee Eastman Challenge Process. J. Process Control **1996**, 6, 205–221.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Yu, W.; Yang, F.
Detection of Causality between Process Variables Based on Industrial Alarm Data Using Transfer Entropy. *Entropy* **2015**, *17*, 5868-5887.
https://doi.org/10.3390/e17085868
