Detecting Enclosed Board Channel of Data Acquisition System Using Probabilistic Neural Network with Null Matrix

The board channel is a connection between a data acquisition system and the sensors of a plant. A flawed channel will bring poor-quality or faulty data that may cause an incorrect control strategy. In this paper, a data-driven approach is proposed to detect the status of an enclosed board channel based on an error time series obtained from multiple excitation signals and internal register values. The critical faulty data, contrary to the known healthy data, are constructed by using a null matrix with maximum projection and are labelled as training examples together with the healthy data. Finally, the status of the enclosed board channel is validated by a well-trained probabilistic neural network. The experimental results demonstrate the effectiveness of the proposed method.


Introduction
Data acquisition systems play a vital role in industrial data collection [1]. Among their components, the board tunnel, which is usually classified into analog input (AI), analog output (AO), digital input (DI), and digital output (DO) modules, is a bridge between the processor and the sensors that ensures data conversion at the physical level [2]. The board tunnel is made up of enclosed circuit boards that, for security reasons, are replaced immediately once any fault is found. In order to detect the internal faults of these circuit boards in time, most well-known products, such as those of Siemens, Honeywell, etc., provide error codes to help operators [3][4][5]. However, these codes are insufficient to meet the requirements of board channel diagnosis in practical complex applications.
Different kinds of methods for fault detection and diagnosis (FDD) have been developed, which are classified into model-based approaches, signal-based approaches, and data-driven approaches [6,7]. In model-based approaches, fault diagnosis algorithms monitor the consistency between the measured outputs of the practical system and the model-predicted outputs, based on an appropriate model, whether a physical model or an equivalent model. Reference [8] proposed a new method by combining the model-based FDD method with the support vector machine (SVM) method. In reference [9], the spindle modes are determined through a three-step procedure in order to overcome the issues of the low number of sensors and the presence of many harmonics in the measured signals and to extract the characteristics of the system. In reference [10], based on the information of a fault-free data series, fault detection was promptly implemented by comparing the model forecast with the real-time process. Signal-based approaches include time-domain analysis, frequency-domain analysis, and combinations of both. Reference [11] proposed a novel "frequency-domain damping design" using a high-pass filter for acceleration-based bilateral control (ABC) based on modal space analysis. In reference [12], a unified measurement model was utilized to simultaneously characterize [...]. (3) The PNN is used to adapt to the law of probability hidden in the time series, and case studies verify the effectiveness.
The remainder of this article is organized as follows. In Section 2, the acquisition of the error time series and the relationship between multiple input signals and the overall performance of the board tunnel are given. Section 3 describes the proposed approach, including the probabilistic neural network, the construction of critical faulty data, the structure, and the workflow. The case studies are illustrated in Section 4, followed by conclusions in Section 5.

The Error Time Series of Board Tunnel
The error between the input signal and the output (memory), mainly affected by internal factors of the board, is regarded as a comprehensive index reflecting the performance of the board tunnel. A single sample is meaningless for evaluating the board performance because it is a single instance and not enough to observe the law of probability. Thus, an error time series is taken as the analysis object of the enclosed board tunnel, and the error time series is obtained as shown in Figure 1. Let the input signal of the board tunnel be {x(k)}_{k=1}^∞ and the value of the corresponding memory be {y(k)}_{k=1}^∞; thus, the error time series is

z(k) = x(k) − y(k), (1)

where k is the sampling time. Formula (1) is abbreviated as Formula (2) by using x, y, z instead of x(k), y(k), z(k):

z = x − y. (2)

Notice that y is the converted data of the input signal x according to the physical meaning of the board channel, and z is regarded as a probability model of noisy influences that follows a normal distribution with the form of Formula (3):

z ∼ N(µ_board, σ²_board), (3)

where µ_board and σ²_board are the expectation and the variance for the board, respectively. It is worth noting that if the board input x is enough to cover all the work conditions and influences of the environment, the expectation µ_board is equal to the mean. Thus, thereafter, we use the mean instead of the expectation.

In fact, different input signals will cause some changes due to the influence of the environment and internal parameters. Each input signal that is long enough will produce its own probability distribution law with the form of

z_i ∼ N(µ_i, σ²_i), (4)

where µ_i and σ²_i are the mean and the variance under the i-th input signal. It is inevitable for some deviations to occur between µ_board and µ_i. From the view of fault detection and diagnosis, the board tunnel is considered to be in a healthy state as long as µ_board is within the allowable range. However, these deviations between µ_board and µ_i will disturb the judgment of healthy states due to the limitation of the sampling data number. In order to establish the relation between the sampling data and the board performance, it is assumed that the mean µ_board is equal to the mean of the different input signals, that is,

µ_board = µ_1 = µ_2 = · · · = µ_m. (5)

Lemma 1: The mean µ_board and variance σ²_board of the sampling data series z satisfying a normal distribution can be replaced by m groups of sub-sampling data whose means are µ_m(i), i = 1, 2, · · · , m and whose variances are σ²_m(i), i = 1, 2, · · · , m.

Proof: For the data series z ∼ N(µ_board, σ²_board) that follows a normal distribution with mean µ_board and variance σ²_board, suppose the data series z has enough data of n samples to reflect the statistical characteristics of the whole. The unbiased estimate of µ_board is z̄, and the unbiased estimate of σ²_board is obtained according to

σ̂²_board = (1/(n − 1)) Σ_{k=1}^{n} (z(k) − z̄)². (8)

Consider the relation of the mean between the whole and the sub-sampling data.
Let the n samples be divided into m groups, with the mean and the length of the k-th group being z̄_m(k) and L_m(k), so that

n = Σ_{k=1}^{m} L_m(k). (9)

Thus, the mean of the whole is

z̄ = (1/n) Σ_{k=1}^{m} L_m(k) z̄_m(k). (10)

Formula (10) shows the unbiased estimate of z̄. Therefore, µ_board can be estimated by the above formula.
For the variance, it is well known from mathematical statistical theory that the sample mean of a normal distribution also obeys a normal distribution. Thus, the mean z̄_i of each group follows

z̄_i ∼ N(µ_board, σ²_board / L_m(i)). (11)

For the m groups, an unbiased estimate of σ²_i is obtained from the sample variance of the group means, and furthermore, the σ²_board of the whole is recovered by combining the group estimates through Formula (11). As a result, the proof is completed. The lemma shows that the performance of the board can be obtained through the combination of different groups. For a board tunnel, this means the total probability model of the healthy state can be combined from different input signals.
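The mean part of Lemma 1 can be checked numerically. The following sketch (NumPy; the distribution parameters and group lengths are hypothetical values for illustration) verifies that the length-weighted combination of group means in Formula (10) reproduces the mean of the whole series:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated error series z ~ N(mu_board, sigma_board^2); the location,
# scale, and group lengths below are hypothetical values for illustration.
z = rng.normal(loc=0.02, scale=0.5, size=60_000)

# Divide the n samples into m groups of (possibly unequal) lengths L_m(k),
# as in the proof of Lemma 1.
lengths = [10_000, 20_000, 30_000]
groups = np.split(z, np.cumsum(lengths)[:-1])

group_means = np.array([g.mean() for g in groups])
group_lens = np.array([len(g) for g in groups])

# Formula (10): the whole-series mean is the length-weighted combination
# of the group means, so mu_board can be estimated from sub-sampled data.
combined_mean = (group_lens * group_means).sum() / group_lens.sum()

assert np.isclose(combined_mean, z.mean())
```

For a board tunnel, each group would correspond to one excitation signal; the weighting by group length is what allows signals of different durations to be pooled.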

Probabilistic Neural Network
The probabilistic neural network (PNN), proposed by D. F. Specht in 1990, is a kind of statistical neural network model based on the Bayesian minimum-risk criterion [24]. It consists of four layers: the input layer, the pattern layer, the summation layer, and the output layer. The input layer is responsible for transmitting the feature vector into the network. The pattern layer is fully connected to the input layer through the connection weights. The pattern layer reflects the spatial distribution of the samples, in which each sample acts in a limited local space, and the whole space constitutes a distributed probability model as a combination of the samples. This structure accurately reflects the probability distribution of the samples over the whole space. The network is usually trained with supervised learning based on the training samples and the corresponding patterns. The distance between the input eigenvector and each trained pattern is used to activate the Gaussian function of the pattern layer. The summation layer is responsible for connecting the outputs of the pattern layer with the pattern units of each class through the score probability. Finally, the output layer outputs the category with the highest total score of the pattern units of each class in the summation layer. In the PNN, the probability density p(x|w_i) is expressed by a radial basis function:

p(x|w_i) = (1 / N_i) Σ_{k=1}^{N_i} (1 / ((2π)^{l/2} σ^l)) exp(−‖x − x_ik‖² / (2σ²)), (14)

where x_ik, N_i, σ, and l are the k-th sample center of class i, the number of samples of class i, the smoothness factor, and the dimension of the sample vector, respectively. The discriminant function g_i(x) is

g_i(x) = p(x|w_i) p(w_i), (15)

where p(w_i) is the probability of w_i occurrence. Additionally, the discrimination rule is

x ∈ w_i if g_i(x) > g_j(x) for all j ≠ i. (16)
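For a shared smoothness factor σ, the four-layer structure above reduces to kernel density scoring per class. A minimal sketch (the toy training vectors and σ value are illustrative, not from the paper):

```python
import numpy as np

def pnn_classify(x, samples, labels, sigma=0.5):
    """Minimal PNN sketch after Specht (1990).

    samples: (N, l) training vectors; labels: (N,) class ids.
    The pattern layer places one Gaussian kernel on each training
    sample; the summation layer averages the kernels per class
    (the class-conditional density, dropping the (2*pi)^(l/2) * sigma^l
    normalizer, which is shared by all classes and cancels in the
    argmax); the output layer returns the class with the highest score.
    """
    x = np.asarray(x, dtype=float)
    d2 = ((samples - x) ** 2).sum(axis=1)        # squared distances to all patterns
    k = np.exp(-d2 / (2.0 * sigma ** 2))         # Gaussian activations (pattern layer)
    classes = np.unique(labels)
    scores = np.array([k[labels == c].mean() for c in classes])  # summation layer
    return classes[int(np.argmax(scores))]       # output layer (Bayesian decision)

# Toy usage with two well-separated classes (hypothetical data, equal priors).
train = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
pred = pnn_classify([0.05, 0.05], train, y)      # lies in the class-0 cluster
```

Averaging per class corresponds to equal priors p(w_i); unequal priors would scale each class score by p(w_i) as in the discriminant function g_i(x).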


The Construction of Critical Faulty Data
The PNN distinguishes the category of input data based on the established relationship between the training examples and the categories they belong to. Different from the weight principle of direct mapping between input and output, the PNN computes the proximity to the different sample data and judges the category according to a posterior probability. In principle, as long as there are faulty data samples and healthy data samples, new data will be classified into healthy states and faulty states, except when the posterior probability is exactly 50%. However, the fault samples in fact lie in a large range, which affects their accuracy as a criterion. The schematic diagram of critical faulty data construction is shown in Figure 3. In Figure 3, the square represents the entire set of healthy and faulty states, which is divided into section I (health), section II (vagueness), and section III (fault). A and B are the observed sets that build the healthy data samples. F1 and F2 are the faulty data samples. Additionally, T1 and T2 are the test sets. For a healthy dataset T1, it is easy to find an observed healthy set A that is close to T1. However, for a faulty dataset T2, if one randomly selects the faulty dataset F1 as the faulty data samples, it will produce the incorrect result that T2 is healthy, because the distance from T2 to B is closer than that from T2 to F1. If the position of F1 moves to the position of F2, which belongs to section III but is close to section II, the previous mistake is avoided. Thus, fault samples at the edge of vagueness and fault are called critical faulty data. Although the critical faulty data cannot distinguish the datasets of all sections (for example, M in section II), they can solve the judgment problem for most of the datasets.
However, the board channel has almost no historical faulty data to be used because the board channel is prohibited from working with faults. This makes it impossible to find the critical faulty data by analyzing historical data. To produce examples of critical faulty data from the healthy data, the null matrix is introduced as a vertical cross mode of the healthy state and the critical faulty data. The null matrix N of a non-full-rank matrix X is defined as a matrix N that satisfies XN = 0 and NᵀN = I [25].
According to the definition of the null matrix, for x_i being a sampling vector of healthy data, there is

N_i x_i = 0, (17)

where N_i is the corresponding null matrix. For another sampling vector x_j (x_j ≠ x_i), there is

N_i x_j = b_ij, (18)

where b_ij is the deviation of x_j under the action of the null matrix N_i. Compute the deviations b_kl of all samples x_l (l = 1, …, n) and null matrices N_k (k = 1, …, n) according to

b_kl = N_k x_l, k = 1, …, n; l = 1, …, n. (19)

Take b = max{b_kl, k = 1, …, n; l = 1, …, n} and obtain the corresponding null matrix N; then, for all healthy data x, inequality (20) is satisfied:

‖N x‖ ≤ ‖b‖. (20)

The corresponding equation reflects the critical state between fault and health:

N x̃ = b. (21)

Move the left-hand side of Formula (21) to the right:

0 = b − N x̃, (22)

and replace I with N⁺N:

x̃ = N⁺ N x̃, (23)

where N⁺ is a pseudoinverse of N. We obtain

x̃ = N⁺ b, (24)

and x̃ is the critical faulty data.
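The construction of Formulas (17)-(24) can be sketched numerically. Here the null matrix of each sampling vector is taken from the SVD (one assumed realization of the definition; the healthy vectors are random placeholders), the maximum-projection deviation b is searched over all pairs, and the critical faulty sample is recovered through the pseudoinverse:

```python
import numpy as np

def null_matrix(x):
    """Null matrix N of a sampling vector x: N @ x = 0 and N @ N.T = I.

    Built from the SVD of x viewed as a 1-row matrix, so the rows of N
    form an orthonormal basis of the orthogonal complement of x. This is
    one concrete realization of the null matrix in Formulas (17)-(24).
    """
    x = np.asarray(x, dtype=float).reshape(1, -1)
    _, _, vt = np.linalg.svd(x)       # full SVD of the 1 x d matrix
    return vt[1:]                     # drop the direction of x itself

# Hypothetical healthy sampling vectors x_l (l = 1..n); placeholders only.
rng = np.random.default_rng(1)
healthy = rng.normal(size=(5, 8))

# Formula (19): deviations b_kl = N_k x_l over all sample/null-matrix pairs;
# keep the pair with the maximum projection.
best_norm, best_N, best_b = -1.0, None, None
for xk in healthy:
    Nk = null_matrix(xk)
    for xl in healthy:
        b = Nk @ xl
        if np.linalg.norm(b) > best_norm:
            best_norm, best_N, best_b = np.linalg.norm(b), Nk, b

# Formula (24): the critical faulty sample via the pseudoinverse of N.
x_crit = np.linalg.pinv(best_N) @ best_b

# x_crit reproduces the maximal deviation under best_N.
assert np.allclose(best_N @ x_crit, best_b)
```

Because the rows of N are orthonormal, the pseudoinverse reduces to the transpose, so x_crit is simply the maximal deviation mapped back into the sample space.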


The Structure and Workflow of Proposed Approach
The proposed method is made up of four parts: the data acquisition, the data processing, the probabilistic neural network, and the diagnostic output. The excitation source acts on the board channel with multiple groups of different kinds of signals in order to expand the detection scope as much as possible. The error time series is built from the excitation signal and the converted data by the OLE for Process Control (OPC) technique [26]. Then, it is transformed into a Hankel matrix by a sliding window in order to adapt to the PNN training. The diagnostic result is output by the PNN. The structure is shown in Figure 4.

The workflow is described as follows:
Step 1: Record the signal generator output and use OPC to obtain the internal memory data of the board tunnel. Thus, the error time series {z(k)}_{k=1}^{j} combined with different input signals is formed according to Formula (1).
Step 2: Suppose the length of the sliding window is T, and construct the Hankel matrix H_L with depth L:

H_L = [ z(1)   z(2)     ⋯  z(T)
        z(2)   z(3)     ⋯  z(T + 1)
        ⋮      ⋮            ⋮
        z(L)   z(L + 1)  ⋯  z(T + L − 1) ].

Step 3: The critical faulty dataset H_LN is constructed from H_L according to Formula (24) of Section 3.2.
Step 4: Construct the sample matrix of the PNN by using the input H = [H_L; H_LN], which stacks the healthy and critical faulty data. Moreover, the corresponding categories are [0 1], where 0 and 1 represent the healthy states and the faulty states, respectively.
Step 5: Build the PNN by following three rules: (1) the number of input-layer nodes is the length of the sliding window (T); (2) the number of neurons in the pattern layer is the number of input sample vectors (L); and (3) the summation layer has two classes, which represent health and fault.
Step 6: The test sequence {T_k}_{k=1}^{j} is converted to the input sample matrix D by row-normalizing its Hankel matrix. The sample reference C is obtained by row-normalizing the training input matrix H:

C = Norm[H],

where Norm[·] is an operator of matrix row normalization.
Step 7: Calculate the Euclidean distance between the input matrix D and the sample reference matrix C according to

E(i, j) = ‖D(i, :) − C(j, :)‖₂. (27)

Step 8: The initial probability matrix P is obtained by activating the Gaussian function of the pattern layer:

P(i, j) = exp(−E(i, j)² / (2σ²)). (28)

Step 9: The probability S that the q samples belong to the two categories (health and fault) is obtained according to Formula (29):

S(i, c) = (1 / N_c) Σ_{j ∈ class c} P(i, j), c ∈ {0, 1}, (29)

where N_c is the number of pattern units of class c.
Step 10: The maximum probability of each row of S is taken as the category according to Bayesian decision theory.
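Steps 2 and 6-10 can be sketched end to end. The error series, window sizes, and smoothing factor below are illustrative placeholders, not the paper's experimental values; the healthy series is given a deterministic sinusoidal component so that the toy example is separable, and labelled healthy/faulty training windows stand in for Steps 3-4:

```python
import numpy as np

def hankel(z, T, L):
    """Step 2: Hankel matrix H_L whose k-th row is the window z[k : k + T]."""
    return np.stack([z[k:k + T] for k in range(L)])

def row_normalize(M):
    """Step 6: the Norm[.] operator -- scale each row to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def pnn_decide(D, C, labels, sigma=0.2):
    """Steps 7-10: distances, Gaussian activation, per-class summation,
    and the Bayesian argmax decision (0 = health, 1 = fault)."""
    E = np.linalg.norm(D[:, None, :] - C[None, :, :], axis=2)        # Step 7
    P = np.exp(-E ** 2 / (2.0 * sigma ** 2))                         # Step 8
    S = np.stack([P[:, labels == c].mean(axis=1) for c in (0, 1)],   # Step 9
                 axis=1)
    return S.argmax(axis=1)                                          # Step 10

T, L = 50, 40
rng = np.random.default_rng(3)
k = np.arange(T + L)

# Hypothetical error series: a healthy board gives a small sinusoidal error,
# a faulty one (shifted reference) a strongly biased error.
healthy = np.sin(0.2 * k) + 0.02 * rng.normal(size=k.size)
faulty = np.sin(0.2 * k) + 1.0 + 0.02 * rng.normal(size=k.size)

# Labelled healthy/faulty training windows stand in for Steps 3-4.
C = row_normalize(np.vstack([hankel(healthy, T, L), hankel(faulty, T, L)]))
labels = np.array([0] * L + [1] * L)

# A fresh healthy-like test sequence, windowed and normalized (Step 6).
test = np.sin(0.2 * (k + 7)) + 0.02 * rng.normal(size=k.size)
pred = pnn_decide(row_normalize(hankel(test, T, L)), C, labels)
```

In the proposed approach, the fault-class windows would instead come from the null-matrix construction of Section 3.2, since no real faulty data are available.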

Case Studies
The experimental platform is a distributed control system with an engineer station. Our goal was to test the performance of the board without any destruction. The input signal of the board tunnel for the test was imposed directly by another board tunnel because the board channel of the laboratory is not loaded. If the board channel is connected to a sensor signal, the input signal is reformed by adding a small series signal source (usually not more than 15% of the normal signal amplitude). This small series signal is used only to detect the performance of the board tunnel and is easily eliminated by software. The central control platform of the laboratory is shown in Figure 5. There were five groups of healthy data with input signals of 5 V with additional pulse voltage, piecewise linear voltage, exponential voltage, thermal noise, and chirp signal. However, it was a challenge to construct any faults of the board tunnel because the board was not to be disassembled or damaged. Lacking real faults induced by changing internal states, the only possibility was to use the calibration function of the control system to change the AD conversion reference signal. Two groups of faulty data were simulated by changing the AD conversion reference signal with the addition of a stochastic disturbance (Fault1) and a periodic voltage signal (Fault2), respectively. The cases of the seven groups are shown in Table 1.


Change the Number of Intermediate Layers of PNN
Eight sliding window lengths from 100 to 20,000 were selected to detect the states of case5 to case7, repeated 1000 times per sliding window length. The training samples were combined from case1, case2, case3, and case4. For each test, the start of the sliding window was randomly selected from the error time series, and the Hankel depth was always kept at 10,000 for simplicity. The results are shown in Table 2. Table 4 shows that most health and fault states can be detected by combining three groups of healthy data as training examples via the proposed method. However, a few healthy cases show poor accuracy because the training examples can only partly cover the information of the other healthy cases. This is further confirmed by reducing the number of groups used for training examples. In the cases of taking two groups as the combination of training examples, the situation is similar. Most health and fault states can be detected correctly, but there are some incorrect detection results for healthy states. For example, taking case1 and case5 as training examples, the results of case2 and case4 are correct, but the results of case3 are all wrong in 1000 tests. These results are not listed here due to space limitations. By analyzing the above situations, we found that incorrect detection is related to certain kinds of healthy data. This is because the training data do not completely cover the characteristics of the test samples. We also notice that the detection results for faulty states are all correct, which shows that the null matrix plays an important role. A conclusion is drawn that the feature coverage of the training samples is more important than the number of groups.

Comparison with LDM
The classical linear discriminative method (LDM) was used to detect the fault of the board channel. A total of 10,000 groups combined from the time series, with a sliding window length of 2000 samples, were selected as training samples, and 1000 random groups of each case were tested to imitate the situation with known historical data. The results are shown in Table 5. The 1000 groups of data from case7 were tested as unknown faulty data, and the results are shown in Table 6. It is seen from Table 5 that for the labeled data, the LDM has a high accuracy of more than 99.3%, and it can divide the data into more detailed categories. However, Table 6 shows that the accuracy of the LDM for a new fault is 70.3%, which is low. Compared with the LDM, the proposed approach, as shown in Table 3, can achieve good results by using only healthy data.

Conclusions
At present, there is no practical method to detect faults of an enclosed board tunnel other than returning it to the factory or relying on an error code display. Failure to find an abnormal board poses a great potential threat to the control system of the plant. This paper proposes an approach for fault detection of an enclosed board channel by using a PNN based on an error time series excited by various external signals. The critical faulty data, contrary to the known healthy data, are constructed by using a null matrix with maximum projection and are labelled as training examples together with the healthy data. This provides the mode criteria for PNN training. Thus, the problem of the PNN lacking faulty data examples is solved to some extent. The proposed approach is a data-driven method that can detect the abnormal state or fault of an enclosed board channel without knowing any internal circuitry of the board channel. It needs only a small number of additional hardware devices and no mechanistic knowledge of the board channel, which greatly reduces the costs and the professional knowledge required of staff. In the future, cases where the output probabilities of the health mode and the fault mode are similar will be studied, which should improve the accuracy of the proposed approach in some special scenarios.