Adaptive-Cognitive Kalman Filter and Neural Network for an Upgraded Nondispersive Thermopile Device to Detect and Analyze Fusarium Spores

Noise such as thermal noise, background noise or burst noise can reduce the reliability and confidence of measurement devices. In this work, a recursive and adaptive Kalman filter is proposed to detect and process burst noise (outliers) and thermal noise, which are common in electrical and electronic devices. The Kalman filter and a neural network are used to preprocess the data of the three detectors of a nondispersive thermopile device, which is used to detect and quantify Fusarium spores. The detectors are broadband (1 µm to 20 µm), λ1 (6.09 ± 0.06 µm) and λ2 (9.49 ± 0.44 µm) thermopiles. Additionally, an artificial neural network (NN) is applied to process background noise effects. The adaptive and cognitive Kalman filter (ACKF) reduces both the training time of the neural network and the absolute error of the thermopile data. Without the Kalman filter, it took 12 min 09 s to train the NN for the λ1 thermopile and reach an absolute error of 2.7453 × 10⁴ (n. u.). With the Kalman filter, it took 46 s to train the NN and reach an absolute error of 1.4374 × 10⁴ (n. u.) for the same thermopile. Similarly, for the λ2 thermopile, the training time improved from 9 min 13 s to 1 min and the absolute error from 2.3999 × 10⁵ (n. u.) to 1.76485 × 10⁵ (n. u.). Adding the broadband thermopile to form a three-thermopile system improved the reliability of Fusarium spore detection. The method developed in this work can be employed for devices that encounter similar noise problems.


Introduction
Fusarium is a hazardous fungus. It can weaken the immune system of hosts such as animals and humans. It can also cause diseases such as onychomycosis or keratitis in humans [1], or meningoencephalitis in dogs [2]. Fusarium also causes many plant diseases, such as Fusarium wilt on watermelon or bean [3,4], Fusarium head blight on wheat [5], Fusarium dry on citrus [6] or Fusarium root rot [7]. According to the Fusarium management guide [8], Fusarium head blight, which is the key cause of Fusarium-damaged kernels on wheat, results in annual losses of hundreds of millions of dollars. Previous studies analyzed and detected Fusarium by applying mass spectroscopy [9], Fourier transform infrared spectroscopy, near-infrared spectroscopy [10,11], polymerase-chain-reaction machines [12], chlorophyll fluorescence imaging [5] or impedance-based gold-electrode sensors [13]. Though these approaches are effective, they have drawbacks: they are expensive, complex to operate and slow to deliver a detection result. Early detection of Fusarium spores therefore helps crops to avoid dangerous fungal diseases and losses. Fusarium spores can spread through water, air and combined media of both.

With a linear system in the state-space model, the discrete Kalman filter can be applied. The discrete-time state evolution of a linear system [37] is defined by Equation (1), in which A is the state transition matrix acting on the state vector at discrete time k − 1, B is the control-input matrix acting on the control vector, and the process noise vector is assumed to be zero-mean Gaussian with the process noise covariance matrix Q. The prediction Equation (1) is accompanied by an observation equation that describes the correlation between the measured value and the prediction at discrete time k:

$Z_k = H X_k + V_k$, (2)

in which $Z_k$ is the observation (measurement) vector, H is the observation matrix and $V_k$ is the observation noise vector with the observation covariance matrix R, $V_k \sim N(0, R)$. A, B, H, Q and R can carry the subscript k if they change with discrete time, yet they are invariant in most problems. Figure 1a illustrates the Kalman algorithm. In the diagram, $P_k^+$ and $P_k^-$ are the updated and predicted state error covariance matrices, respectively [37]. $X_k^-$ is called the a priori prediction. The output of the Kalman filter is the updated $X_k^+$, the a posteriori vector.

Many systems have a nonlinear state equation, so Equation (1) cannot be applied. The state then depends on a nonlinear function f, and the observation equation relating the measurement and prediction vectors is likewise nonlinear. To apply the Kalman filter, the nonlinear problem is approximated by a linear one through a first-order Taylor series, whose derivative terms must be calculated at each discrete time. The extended Kalman filter (EKF) algorithm is shown in Figure 1b. Comparing the two algorithms in Figure 1a,b, the differences are in the prediction equations; the update equations are the same in both algorithms [30,37].
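The predict and update cycle described above can be sketched for a scalar state. This is a minimal illustration, not the authors' implementation: the function name `kalman_1d`, the default values of `a`, `h`, `q`, `r`, `x0` and `p0`, and the omission of the control term are all assumptions made for the example.

```python
def kalman_1d(measurements, a=1.0, h=1.0, q=1e-3, r=0.5, x0=0.0, p0=1.0):
    """Scalar linear Kalman filter (illustrative sketch, no control input).

    a, h : state transition and observation coefficients (scalar A and H)
    q, r : process and observation noise variances (scalar Q and R)
    x0, p0 : assumed initial state estimate and error variance
    """
    x_post, p_post = x0, p0
    estimates = []
    for z in measurements:
        # Prediction: a priori state and error variance
        x_prior = a * x_post
        p_prior = a * p_post * a + q
        # Update: Kalman gain, then a posteriori state and error variance
        gain = p_prior * h / (h * p_prior * h + r)
        x_post = x_prior + gain * (z - h * x_prior)
        p_post = (1.0 - gain * h) * p_prior
        estimates.append(x_post)
    return estimates
```

A larger `r` relative to `q` makes the filter trust its prediction more and smooth the measurements more strongly, which is the trade-off the later sections adjust adaptively.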

Neural Network
In this work, the neural network was applied to correct the data errors caused by the BG noise, so the theory of neural networks is briefly discussed here. An artificial neural network, or neural network (NN), imitates the operation of biological neural systems [38][39][40] and responds to given inputs by providing outputs. An NN can have many layers, and the number of nodes in each layer is arbitrary. For a layer l with K nodes, the output of each node is obtained by applying an activation function σ to the weighted input $z_i^{(l)}$ of that node, where σ can be a linear function, binary step, hyperbolic tangent (tanh), sigmoid function, rectified linear unit (ReLU), softplus or Leaky ReLU [40,41], and the weights and biases are those of layer l. To produce the desired outputs from an input vector, the NN must be trained to find its weights and biases. The training process is an optimization problem of finding the global minimum of a cost function, which is often based on the mean square error (MSE) [38,42,43]. NNs have applications in many areas such as signal processing [44], voice recognition [45], image processing [46] and navigation [47].

The back-propagation algorithm (BPA) is a fundamental algorithm for NNs. It needs a set of training data including an input vector X of N elements {$x_1, x_2, \ldots, x_N$} and an output vector O of M elements {$o_1, o_2, \ldots, o_M$}. A quadratic cost function for stochastic gradient descent (SGD) is defined in Equation (8), from which the BPA updates the weight matrix and the bias vector of a hidden layer l. In these updates, $\Delta W_{k+1}^{(l)}$ and $\Delta B_{k+1}^{(l)}$ are the update matrices for the weights and biases of the hidden layer l, respectively; $A^{(l-1)}$ is the output vector of layer l − 1; $[D^{(l)}]_{K \times 1} = \sigma'([Z^{(l)}]_{K \times 1})$ is the activation derivative matrix, whose argument is the matrix $Z^{(l)}$ of $z_i^{(l)}$; $E^{(l)}$ is the error matrix; and η is the learning rate. If η is too small, it may take a long time to find the global minimum. If η is too large, the search may never reach the global minimum. To overcome this difficulty, the steepest descent algorithm uses a Taylor approximation to find an appropriate η [39]. In this algorithm, g is the gradient vector of E(W) and d is the descent direction. η should be small enough to make E(W + ηd) − E(W) < 0. Since η should not be too small either, η can be chosen to minimize E(W + ηd).

To increase the convergence speed, the Newton algorithm can be used. In its weight update, $H_k^{(l)} = \nabla^2 E(W_k^{(l)})$ is the Hessian matrix. The equation $H_k^{(l)} d_k = -g_k$ is solved to find the descent direction $d_k$ at discrete time k, and Equation (9) is applied to find $\eta_k$ [39]. For least-squares problems, the Hessian matrix can be difficult to calculate, so the Levenberg-Marquardt algorithm (LMA) avoids that calculation through the approximation $H = J^T J$, in which J is the Jacobian matrix of the first derivative $\nabla E(W_k^{(l)})$ [39]. In our NN, the LMA was applied to find the weights and biases.
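To make the $H = J^T J$ approximation concrete, the sketch below applies Levenberg-Marquardt steps to a toy two-parameter least-squares problem (fitting a line) rather than to NN weights. The function name `lm_fit_line`, the damping parameter `mu` and its factor-of-ten adaptation rule are assumptions for illustration; they are a common textbook variant of the algorithm and are not taken from the paper.

```python
def lm_fit_line(xs, ys, w=0.0, b=0.0, mu=1e-2, iters=50):
    """Fit y = w*x + b with Levenberg-Marquardt steps (illustrative only)."""
    def cost(w_, b_):
        return sum((w_ * x + b_ - y) ** 2 for x, y in zip(xs, ys))

    # Entries of J^T J for Jacobian rows [x_i, 1]; constant for a linear model
    sxx = sum(x * x for x in xs)
    sx = sum(xs)
    n = float(len(xs))
    for _ in range(iters):
        res = [w * x + b - y for x, y in zip(xs, ys)]
        # g = J^T r (half of the gradient of the squared-error cost)
        g0 = sum(x * e for x, e in zip(xs, res))
        g1 = sum(res)
        # Solve (J^T J + mu*I) d = -g for the 2x2 case in closed form
        a11, a12, a22 = sxx + mu, sx, n + mu
        det = a11 * a22 - a12 * a12
        dw = (-a22 * g0 + a12 * g1) / det
        db = (a12 * g0 - a11 * g1) / det
        if cost(w + dw, b + db) < cost(w, b):
            # Accept the step and move toward Gauss-Newton behavior
            w, b, mu = w + dw, b + db, mu * 0.1
        else:
            # Reject the step and move toward gradient-descent behavior
            mu *= 10.0
    return w, b
```

No second derivatives are evaluated anywhere: only the residuals and the Jacobian enter the step, which is the point of replacing the Hessian by $J^T J$.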

System
The Fusarium detection device was upgraded from the authors' previous work [20] by removing the reference chamber or the splitting plate so that only one reaction chamber remains. The trap has two silver-coated mirrors at the top and bottom, one IR source, one ZnSe window, a pair of reflective mirrors to direct IR light to the IR thermopiles, an inlet pipe, an outlet pipe and methyl methacrylate plates enclosing the sides. The upgraded device structure is shown in Figure 2. The reference or broadband (BR) thermopile became the third detector, alongside the other two thermopiles, to analyze the incident IR light. The broadband thermopile has an IR spectrum of 1 µm to 20 µm; the λ1 and λ2 thermopiles have very narrow spectra obtained with window filters of 6.09 ± 0.06 µm and 9.49 ± 0.44 µm, respectively. The window filters were supplied by Northumbria Optical [48] and installed into the 2 mm × 2 mm 2M thermopiles supplied by Dexter Research Inc. [49]. The typical internal resistance of these thermopiles is about 10 kΩ, and the responsivity R is 18.9 V/W. From [49], the damage threshold $P_{thres}$ is 0.5 W/cm², so it is not recommended to expose the 2M thermopiles to any IR source stronger than $P_{thres}$. The IR source is a 2.2 mm × 2.2 mm JSIR350-4-AL-C-D3.7-2-A5-I, and its spectrum spans approximately 1 µm to 20 µm [50]. In the measurement, the bias current and the voltage of the IR source were 141.4 mA and 5.65 V, respectively. As the signals from the thermopiles in this research were extremely weak, preamplifiers were necessary. The preamplifiers employed AD8629 integrated circuit (IC) devices because these ICs have low bias current, low offset voltage, a high common-mode rejection ratio and a chopping stabilization circuit [51]. These features help to lower the output noise. The final amplifier is an OPA320 IC [52]. The output of the final amplifier is digitized by a 24-bit LT2400 analog-to-digital converter (ADC) [53]. The setup voltage for the ADC was 4.096 V, so the resolution was 0.488 µV [20]. A vacuum pump was attached to the inlet pipe. The output of a 15 kV high voltage (HV) circuit was connected to one of the silver-coated mirrors. To monitor the operating conditions of the device, a DS18B20 temperature sensor was used, together with a 5 V monitor and a 9 V monitor that track the outputs of the regulator circuits. When the environment temperature or the regulator voltages change, the changes are recorded for use in the data error correction.

The microcontroller (µC) used to operate the system is an Atmel 328p [54]. As shown in Figure 2, the µC first turns on the vacuum pump to deliver the air into the trap chamber. The particles in the air are caught by the electrostatic charges provided by the HV circuit. After turning off the pump and the HV module, the µC collects data using the following measurement procedure:
• Phase 1: Measure the environment temperature T1; then measure the outputs of the 5 V and 9 V regulators, which are V1 and V2, respectively.
• Phase 2: Measure the background data of the BR thermopile for 6 s while the IR source is still OFF; turn the IR source ON for 1.5 s and measure the BR thermopile during this period to obtain the peak data (PD); turn the IR source OFF for 6 s and measure the background data of the BR thermopile again. The data thus comprise background data, peak data (PD) and background data again.
• Phase 3: Measure the λ1 thermopile data in the same way as in phase 2.
• Phase 4: Measure the λ2 thermopile data in the same way as in phase 2.
• Phase 5: Repeat phase 1, naming the temperature T2 and the outputs of the 5 V and 9 V regulators V3 and V4, respectively.
• Phase 6: Send all data to the computer in time order for further processing and analysis.
In the computer, the background data are averaged to obtain the BG mean value. The data order is T1-V1-V2-BG-PD-BG-T2-V3-V4. One run of the measurement procedure yields one data batch. For a more precise analysis, the procedure can be repeated to collect more data batches. The number of measurement batches is arbitrary; five batches are sufficient for a reliable decision [20].
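The batch layout T1-V1-V2-BG-PD-BG-T2-V3-V4 and the BG averaging step can be represented with a small data structure. This is only a sketch: the class names `ThermopileRecord` and `Batch`, their field names and the dictionary keys are hypothetical and do not come from the authors' software.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class ThermopileRecord:
    """One thermopile's samples from phases 2 to 4: BG, PD, then BG again."""
    bg_before: List[float]  # background samples, IR source OFF (first 6 s)
    peak: List[float]       # peak data (PD), IR source ON (1.5 s)
    bg_after: List[float]   # background samples, IR source OFF again (6 s)

    def bg_mean(self) -> float:
        # Average all background samples into a single BG mean value
        return mean(self.bg_before + self.bg_after)


@dataclass
class Batch:
    """One data batch in the order T1-V1-V2-(BG-PD-BG per thermopile)-T2-V3-V4."""
    t1: float
    v1: float
    v2: float
    records: Dict[str, ThermopileRecord]  # hypothetical keys: "BR", "L1", "L2"
    t2: float
    v3: float
    v4: float
```

A list of five such `Batch` objects would correspond to the five repeated measurements that the text considers sufficient.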

Analyzing Method
To detect a sample in the device, it is necessary to find a formula that depends only on the monochromatic absorbance features of the sample. From the Beer-Lambert law, we proposed a group-distinction coefficient equation, which can be applied to distinguish a group of samples in the device [20]. In this equation, $P_{o,\lambda}$ is the IR radiant power of monochromatic light from the IR light source (W/sr); $P_\lambda$ is the IR power of the monochromatic light after passing through a sample (W/sr); and $\varepsilon_\lambda$ is the monochromatic extinction coefficient (1/obj.). In the formula that determines the density of the sample [20], $D_x$ is the unknown density of a sample; D is a known sample density (obj./cm²); S is the area of the sample; and obj. is the studied object caught on the area S. Additionally, from experimental results, the Fusarium curve of $P_{BR}$ versus $D_x$ can be plotted, and in later measurements the values of $P_{BR}$ and $D_x$ can be found. Testing whether the data point ($P_{BR}$, $D_x$) lies on the curve consolidates a detection decision. This additional step helps to eliminate confusion between two samples that have similar group-distinction coefficients η. The third sensor is therefore added to improve the reliability and extend the application areas.

The power of the incident light reaching a thermopile can be calculated from Equation (13), in which $N_{dig}$ is the digital output of the ADC when reading the thermopile data. When the ratio $P_{x,\lambda_1}/P_{xo,\lambda_1}$ is estimated, the resolution, responsivity and gain cancel each other, so the ratio reduces to the ratio of the corresponding digital outputs.
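The cancellation can be checked numerically. The sketch below assumes that the incident power is the ADC count times the resolution, divided by the amplifier gain and the responsivity; this form is an assumption consistent with the quantities named in the text, not a reproduction of Equation (13). The resolution and responsivity values are those quoted in the System section, while the function name and the gain of 500 are hypothetical.

```python
ADC_RESOLUTION_V = 0.488e-6   # volts per count, as quoted in the System section
RESPONSIVITY_V_PER_W = 18.9   # thermopile responsivity, as quoted in the System section


def incident_power(n_dig, gain):
    """Assumed conversion from ADC counts to incident power in watts."""
    return n_dig * ADC_RESOLUTION_V / (gain * RESPONSIVITY_V_PER_W)


# The power ratio equals the count ratio, whatever the (hypothetical) gain is
ratio = incident_power(2000, gain=500.0) / incident_power(1000, gain=500.0)
```

Because the same resolution, gain and responsivity appear in numerator and denominator, only the two digital readings matter for the ratio.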

Adaptive and Cognitive Kalman Filter
In our work, the Kalman filter had two functions: noise filter and outlier reducer for the signal data of each thermopile. As mentioned in Section 2.1, Q is the process noise covariance. Since the Kalman filter processed the signal data of each thermopile, Q is called the process noise error here. The process noise theoretically depends on the working condition at each discrete time, but in many problems this condition is almost unchanged. While the IR source was turned ON, the IR radiation changed the working condition, so the process noise errors in the ON and OFF periods were not the same. The observation error could be determined from the experiments. Section 3.1 described the measurement procedure. In an ON period, the temperature of the IR source increased promptly. Since the IR source is a microelectromechanical system (MEMS) device, its temperature soon reached saturation. As a result, the thermopile signals increased quickly early in the ON period but rose only slightly toward its end. In the OFF period, the IR source temperature quickly decreased until reaching the environment temperature, so the data in this period declined too. In practice, three types of data pulse can be seen, as illustrated in Figure 3.

Figure 3a illustrates a normal pulse, in which the front peak (FP) corresponds to turning ON the IR source and the back peak (BP) corresponds to turning OFF and the START of the temperature balance period. During the measurements, burst or popcorn noise may occur and cause outliers in the background and PD. Figure 3b,c shows two typical pulses with burst noise or outliers. As the front peak data are caused by the reaction of the thermopiles to the incoming IR light reflected from the analyzed samples, they contain useful information about the samples. Moreover, outliers were observed to appear often in the FP range. Therefore, we focused on how to process outliers in the FP range. In the FP range (illustrated in Figure 3a), consider two adjacent points P1 and P2 of a data peak, corresponding to the discrete times k and k + 1. D1 and D2 are the tangent lines through P1 and P2, respectively, and $\alpha_1$ and $\alpha_2$ are the angles between these tangent lines and the horizontal line. For normal pulses, these angles satisfy the condition of Equation (14). Similar conditions hold for Figure 3b,e (Equation (16)) and for Figure 3c,f (Equation (17)). The conditions in Equations (14), (16) and (17) are stated in terms of the derivatives of the signal f(k).
As mentioned above, the process noise and the other parameters of the Kalman filter should be adjusted in the ON and OFF periods. Figure 4 shows the adjustment diagram of the process noise and the recursive coefficients for the ACKF, based on the experiments and the conditions in Equations (14), (16) and (17).

Let $Q_o$ and $R_o$ denote the constant process noise and observation noise errors, respectively. In the discrete-time zones (I) and (III), the data are BG data. In these zones, the process noise error is set to $Q = \beta_1 \times Q_o$ and the observation noise error is $R = R_o$. Note that $R = R_o$ everywhere, and that the values $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$ and $\beta_5$, which are discussed later, are cognitively determined by experiments. Figure 3b shows an example of a background range with an outlier that can be fixed by the Kalman filter if $\beta_1$ is appropriately chosen. In the discrete-time zone (II), the FP range is studied. In the FP range, if the condition $\dot{f}(k) > 0$ is true, the condition in Equation (16) is considered: if $\ddot{f}(k) > 0$, the behavior is abnormal and an outlier appears in the FP range. If the outlier is large, the observation value requires a correction. In our research, the outlier is considered large when the tested quantity falls below 0.5. The observation correction uses the previous normal data points at discrete times k − 1 and k − 2, where $z^+(k)$ is the observation prediction and η is a percentage constant that takes a fraction of the difference z(k − 1) − z(k − 2). After this prediction, we still place more reliability on the process noise error than on the observation noise error. In other words, at discrete time k, the observation noise error should be larger than the process noise error (R > Q, or Q/R < 1, with $R = R_o$). In addition, a recursive mechanism is designed so that the Kalman filter module recalls itself. The number of recalls, N, depends on whether this ratio is small or large: the smaller $\dot{f}(k)$ is, the more often the Kalman module recalls itself.

Even in a normal case, the Kalman filter is called two times.
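The outlier test and the observation correction in the FP range can be sketched as follows. This is a simplified reading of the description, under several assumptions: the derivatives are replaced by finite differences, the "large outlier" threshold and the recall count N are omitted, and the corrected observation is taken as z(k − 1) plus a fraction η of z(k − 1) − z(k − 2). The function name `correct_observation` and the default `eta=0.5` are hypothetical.

```python
def correct_observation(z, k, eta=0.5):
    """Return the observation to feed the filter at index k (requires k >= 2).

    A rising point (first difference > 0) whose slope is still growing
    (second difference > 0) is treated as an outlier in the FP range and is
    replaced by a prediction from the two previous points.
    """
    first = z[k] - z[k - 1]                   # finite-difference stand-in for the first derivative
    second = z[k] - 2 * z[k - 1] + z[k - 2]   # finite-difference stand-in for the second derivative
    if first > 0 and second > 0:
        # Assumed form of the observation prediction z+(k)
        return z[k - 1] + eta * (z[k - 1] - z[k - 2])
    return z[k]
```

For a saturating front peak the slope normally shrinks from sample to sample, so a sudden growth of the slope is what flags the point as abnormal in this sketch.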

Entropy
To evaluate the effectiveness of the filter and outlier-elimination process, the entropies of the raw and preprocessed signals are used (Equation (20)), in which y can be x, the raw data, or z, the processed signal data, and p(i) is the probability of x(i) or z(i) occurring [55][56][57]. The entropy reveals the uncertainty or randomness of the investigated signal. For a raw signal containing much noise, the noise can cover the useful information and show a high disorder, so the entropy of the signal is small. If the outlier-elimination modules work well, much of the noise, including thermal and burst noise, is reduced, and the entropy of the preprocessed signal can be larger than that of the raw signal.
In the NN training, as the BG noise of each thermopile affects the output the most, it is crucial to choose a standard BG (SBG) and, for each SBG, a corresponding standard MP (SMP) based on the measurements of each thermopile. The SBG for each thermopile is chosen based on the appearance frequency of the BG data: the chosen BG should be the one with the highest appearance frequency. We use the absolute-mean error (AME) function as a stop criterion and efficiency coefficient. However, we encountered some cases in which the correction values swing around the SMP. To overcome this problem, applying the entropy of Equation (20) provides a better operational condition. As discussed above, for data with much noise or a fluctuating data vector, the entropy is small. In the training, the program trains the NN and drives the corrected data toward increasing entropy. Therefore, the best entropy leads to the least swinging correction values.
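One way to estimate an entropy from sampled data is sketched below. It is a generic histogram-based Shannon estimate in bits and is an assumption, not a reproduction of Equation (20): the function name `shannon_entropy`, the number of bins and the logarithm base are illustrative choices, and the value obtained (and how it changes after filtering) depends on how p(i) is estimated.

```python
import math
from collections import Counter


def shannon_entropy(signal, bins=16):
    """Histogram-based Shannon entropy of a sequence of numbers, in bits."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no uncertainty
    width = (hi - lo) / bins
    # Assign each sample to a bin; the maximum value goes into the last bin
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in signal)
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

The same function can be applied to the first-order differences of a signal by passing `[b - a for a, b in zip(signal, signal[1:])]`.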

Error Correction by Neural Network
In Section 3.1, the temperature, the 5 V and 9 V monitoring voltages and the raw data were presented. Section 3.3 introduced the data after preprocessing. Although the thermal and burst noises can be mitigated by the ACKF, the BG noise or error still exists in the data. To reduce this noise, an NN was applied. The NN was trained with a set of data collected from the Fusarium detection device. To prepare the training data, some quantities are estimated first: the averaged temperature (T1 + T2)/2 and the averaged monitor voltages (V1 + V3)/2 and (V2 + V4)/2, which are used as T1, V1 and V2, respectively; BG is the average of the background data; STD(BG) is the standard deviation of the background data; MP is the maximum of PD; WP is the mean value of the whole PD; and FP is the mean value of the data points in the FP range. To train the NN precisely, many data batches were recorded, each containing the quantities introduced above. Gathering these quantities over the measured data batches gives the data vectors T1, V1, V2, BG, STD(BG), MP, FP and WP. The MP vector is used to analyze samples in the trap chamber (Section 3.1) and is the $N_{dig}$ data in Equation (13).

From Section 3.1, MP is the digital value of the amplified signal from an input signal X, so MP can be estimated by MP = G·X, where G is the gain of the amplifiers. As an operating condition set (OCS) can affect the gain, G depends on the OCS. Therefore, SMP = $G_S$·X, where $G_S$ is the gain at the standard operating condition set (SOCS). If X is stable, the ratio $r_i$ relating $MP_i$ to SMP is determined by the gains $G_i$ and $G_S$, where $G_i$ depends on the OCS at the i-th measurement. The NN is trained using the OCS of T1, V1, V2, BG, STD(BG), FP, WP and r. The trained NN is then used to determine $r_x$ from the new $OCS_x$ of any new sample under investigation. These procedures are depicted in Figure 5.

From $r_x$, $MP_x$ of the new sample can be corrected to eliminate the effect of $OCS_x$ by applying Equation (22), so that $MP_x$ is adjusted to SMP, the MP value corresponding to the SOCS. In NN training, the input and output data supplied to the NN were recorded for two cases, without sample and with Fusarium sample, and under different operating conditions. In this paper, we mainly focused on the operation of the ACKF and the role of the broadband thermopile in the upgraded nondispersive thermopile device. The collected data were preprocessed by the ACKF and then used to train the NN. To evaluate the effectiveness of the ACKF, the NN was trained with two OCSs: raw data and ACKF-preprocessed data. The effectiveness can be concluded by comparing the errors, entropies and NN training times obtained with the two OCSs. The diagram of using the collected data for NN training is shown in Figure 6.
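The gain-ratio correction can be illustrated with a short numerical sketch. It assumes that the ratio is defined as r = MP/SMP, so that a measured MP is corrected by dividing by the predicted ratio; the direction of the ratio, the function names and the numbers are assumptions for illustration, and the NN that would predict $r_x$ is replaced here by a known value.

```python
def gain_ratio(mp_i, smp):
    """Assumed training target: r_i = MP_i / SMP, which equals G_i / G_S for a stable input X."""
    return mp_i / smp


def correct_mp(mp_x, r_x):
    """Map a measured MP_x back to the standard operating condition using r_x."""
    return mp_x / r_x


# Hypothetical numbers: the same input X read with a gain 20% above the standard gain
smp = 100.0    # reading at the standard operating condition set
mp_x = 120.0   # reading under the new operating condition set
r_x = gain_ratio(mp_x, smp)        # in practice r_x would come from the trained NN
mp_corrected = correct_mp(mp_x, r_x)
```

Because X cancels in the ratio, the NN only has to learn how the operating conditions change the gain, not how the sample changes the signal.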

Figure 6. Estimation of the effectiveness of the ACKF.

Samples
The samples used in the experiments were Fusarium oxysporum [22] and starch, chosen because their group-distinction coefficients are close to each other. In previous work, we also used pollen and turmeric to test the device and the analysis formula, namely the group-distinction coefficient [20]. The F. oxysporum was collected from rotten garlic bulbs and cultured in potato-dextrose-agar Petri dishes by following the instructions in [22]. Collecting Fusarium samples requires at least 4 weeks of culturing. The starch sample came from a local food market. The samples were used to test whether the outlier reduction by the ACKF and the upgraded Fusarium detection device work effectively.

Results and Discussion
In [20], four samples were used to test the Fusarium detection method and device. In that research, the coefficients of Fusarium, pollen, starch and turmeric were 1.144 ± 0.153, 0.136 ± 0.116, 0.939 ± 0.073 and 0.794 ± 0.139, respectively. The Fusarium and starch coefficients were therefore very similar to each other, so a way to further process samples with similar coefficients would be useful. In this work, we mainly focused on using the combination of the BR, λ1 and λ2 thermopiles to distinguish the two samples, Fusarium and starch, which have similar group-distinction coefficients.
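As a quick check of why Fusarium and starch are hard to separate, the sketch below treats each coefficient from [20] as a simple range (mean minus and plus its uncertainty) and tests which ranges overlap that of Fusarium. Treating the ± values as hard ranges is an assumption made for illustration, and the helper names `interval` and `overlaps` are hypothetical rather than part of the original work.

```python
# Group-distinction coefficients reported in [20] as (mean, uncertainty).
coefficients = {
    "Fusarium": (1.144, 0.153),
    "pollen": (0.136, 0.116),
    "starch": (0.939, 0.073),
    "turmeric": (0.794, 0.139),
}


def interval(mean, err):
    """Return the (low, high) range covered by mean ± err."""
    return mean - err, mean + err


def overlaps(a, b):
    """True when the ± ranges of samples a and b share at least one value."""
    lo_a, hi_a = interval(*coefficients[a])
    lo_b, hi_b = interval(*coefficients[b])
    return lo_a <= hi_b and lo_b <= hi_a


for name in ("pollen", "starch", "turmeric"):
    print(name, "overlaps Fusarium:", overlaps("Fusarium", name))
```

Under this assumption only starch overlaps the Fusarium range (0.991 to 1.297 against 0.866 to 1.012), which is the ambiguity the broadband thermopile is meant to resolve.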

Reduction of Thermal and Burst Noises
This section presents the operation of the ACKF. Its results and the raw data are shown in Figure 7. Figure 7a,d shows thermopile signals with noise and outliers; Figure 7a in particular has many outliers. Figure 7b,e shows the signals after the ACKF was applied to filter out the noise and the outliers. Figure 7c,f depicts the entropies of the first-order differentiation of these signals. Each entropy value represents the uncertainty level of a signal. As seen, the ACKF worked well, although a few outliers can still be seen in Figure 7b.
Outliers also appear in Figure 8a,d-f. Among these plots, Figure 8f could not be fixed well, as the signal was affected too strongly by the thermal and burst noises. Figure 8b,c was not much affected by the burst noise, and there the ACKF served to smooth the raw signals.
Table 1 shows the max peak (MP) differences, ∆MP, between the MPs of the ACKF-preprocessed and raw signals of the three thermopiles. It also lists the entropies of the signals for a typical case of the outlier effect. The λ2 raw signal had an outlier at the peak. Thus, the ∆MP of λ2 was very large, while the ∆MPs of BR and λ1 were very small. The last two columns show the entropies of the differentiated raw and preprocessed signals of the three thermopiles. Table 1 shows that the entropies of differentiation of the ACKF-preprocessed data were better than those of the raw data. Thus, the ACKF could process the thermal noise and burst noise well.
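To make the two quantities of Table 1 concrete, the following sketch computes a max-peak difference and an entropy of the first-order differences for a toy signal. It assumes a Shannon entropy over fixed-width bins of the differences; the entropy definition actually used in this work is given in the entropy section and may differ. The signals, the bin width and the function name `diff_entropy` are hypothetical, and the clean pulse only stands in for an ACKF output (the filter itself is not implemented here).

```python
import math
from collections import Counter


def diff_entropy(signal, bin_width):
    """Shannon entropy (bits) of the binned first-order differences of a signal."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    # Assign each difference to a fixed-width bin and count the occupancy.
    bins = Counter(math.floor(d / bin_width) for d in diffs)
    total = len(diffs)
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


# Hypothetical clean pulse standing in for a preprocessed signal.
smooth = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0]
# Same pulse with one burst outlier placed on the peak (stand-in for raw data).
raw = smooth[:]
raw[5] += 7.0

delta_mp = max(raw) - max(smooth)  # max-peak difference caused by the outlier
print("delta MP:", delta_mp)
print("entropy raw:", diff_entropy(raw, 0.5))
print("entropy smooth:", diff_entropy(smooth, 0.5))
```

In this toy case the outlier both inflates the max peak and spreads the differences over more bins, so the two measures move together. How the entropy values should be ranked as better or worse depends on the definition adopted in the entropy section.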

Reduction of Background Noise
For the NN training data, 5422 data batches were recorded consecutively and automatically over many days to mimic normal working conditions. To find an appropriate and adequate NN structure for our application, we started with a single hidden layer of two nodes and then increased the number of nodes, stopping at eight. The training times and absolute errors were noted. Then, we increased the number of hidden layers to two, with m = 2 nodes in the first hidden layer and n = 1 node in the second; m was increased until it reached eight nodes. Then, n was increased to two nodes, and m again started at m = 2 nodes. After m = 8 nodes and n = 2 nodes, we stopped and compared the training times and errors to find the best NN structure.
The best NN structure had two hidden layers, of which the first had three nodes and the second had two nodes. To assess the effectiveness of the ACKF, both the raw data and the preprocessed data were employed. The data set aggregated five different sample cases: no sample, Fusarium samples at different densities and a starch sample. In each case, the power supply for the IR source and the other circuits was unchanged, so the outputs of the three thermopiles were expected to be constant. Additionally, the number of batches in each sample-measurement case was arbitrary. However, the working conditions were probably unstable, and even the power supply could fluctuate to some extent, which could affect the output of the detectors. By using the temperature, the 5 V and 9 V monitors and the BG as inputs to train the NN, we could correct the unstable recorded outputs of these thermopiles and return more stable outputs. Firstly, we checked the efficiency of the ACKF by comparing the training times and the absolute errors for the λ1 and λ2 thermopiles. The results are given in Table 2. In training the NN, both error and entropy criteria were applied. As mentioned in the entropy section, the AME may cause the corrected data to swing even though the AME was optimized through the search for weights and biases. Figure 9 shows the ACKF-preprocessed data and the corrected data of the λ1 thermopile when using error and entropy as operational criteria.
For the preprocessed MP data of the λ1 thermopile, Figure 9a illustrates the results of applying entropy. Figure 9b is a close view of Figure 9a for the four different samples. Similarly, Figure 9c,d shows the results when using the AME criterion. The black lines in the plots show the expected MP values. The expected MPs were chosen by viewing the correlation of the MP data and the SBGs of the three cooperative thermopiles. The close views show the data of the other four different samples. It can be seen that the entropy operating criterion worked better than the error operating criterion.
Figure 9. The ACKF preprocessed (prep.) and corrected max peak (MP) data of the λ1 thermopile using the entropy and absolute-mean error function (AME) criteria, respectively. (a) Full view of the data achieved by the entropy criterion; (b) close view of the data batches from 5001 to 5422 achieved by the entropy criterion; (c) full view of the data achieved by the AME criterion and (d) close view of the MP data from batches 5001 to 5422 achieved by the AME criterion.
Figure 10 shows other views of the operation of these criteria when processing the λ1 thermopile MP data. Figure 10a,b presents the relationship between the training time and the entropy of differentiation, and the error of the corrected data, respectively. The two red dots in Figure 10a,c are the two optimized entropies, which are close to each other. Figure 10c,d shows the results recorded in one searching batch of 1000 loops.
Figure 11 shows the results of processing the BR and λ2 thermopiles, and the differentiation plots of the data. Figure 11a is of the BR thermopile and Figure 11b is of the λ2 thermopile. Note that the entropy was applied to the differentiation of the preprocessed data and the corrected data. The differentiation plots of the two types of data shown in Figure 11 belong to the λ1 thermopile. Table 3 shows the results of applying the entropy and AME operating criteria to the λ1 thermopile. In each method, both the AME and entropy quantities were recorded for investigation. From Table 3, under the entropy operating criterion, when the entropy was optimized, the error was very close to the optimal error of the error operating criterion. However, under the error operating criterion the same did not hold for the entropy: in this case, it was less than 1, which is not good. These observations are further supported by Figures 9 and 10.

Analysis
From the experiments, the group-distinction coefficients, η, of the Fusarium oxysporum chlamydospore [22] and the starch samples were somewhat similar, so this section presents the analysis results for these two samples. Applying the trained NN to these two samples helps to correct or calibrate the data of the three thermopiles. Figure 12 shows the ACKF-preprocessed data and the corrected data of the two samples, which were measured in 50 batches.
From the figure, one can see that the output data of BR were very stable and that the correction process calibrated the data. Applying Equation (11), the group-distinction coefficients of the two samples could be found. Figure 13a plots $\eta_{starch}$ and $\eta_{Fusarium}$, which were very close to each other. To determine the number of Fusarium in the trap, Equation (12) was employed.
As the group-distinction coefficients of Fusarium and starch are close to each other, they can cause confusion at certain times. Table 4 shows the means of the group-distinction coefficients, the absolute errors and the relative errors of Fusarium and starch. From the table, one can see that these values were very close to each other.
To improve the fidelity, the broadband thermopile was used. We investigated other samples of starch and Fusarium whose quantities were unknown in advance. Assuming that all the samples were Fusarium, we could find the sample quantities $N_x$ by substituting $\log(P_{\lambda_1}/P_{0,\lambda_1})$ into $f_{\lambda_1}$ of Equation (24). Figure 13b illustrates the extrapolated and interpolated values of the new samples. The fitted curve in Figure 13b, given by Equation (24), was formed from the data of the Fusarium samples with known quantities (N).
From the data of the BR thermopile and the numbers of known-in-advance Fusarium samples (Fusa. 0), the fitted curve was formed:
$$f_{BR} = \mathrm{fitting}\left(N, \log\left(\frac{P_{BR}}{P_{0,BR}}\right)\right). \quad (25)$$
Additionally, it is necessary to form the lateral fitted curves for the max and min data points, which can be seen from the error boxes. Here, error1 and error2 are the errors of the quantity N of the Fusarium oxysporum chlamydospore and of $\log_{10}(P_{BR}/P_{0,BR})$, respectively. The lateral fitted curves are given by Equations (26) and (27); the one for the min data points is:
$$f_{BR}^{min} = \mathrm{fitting}\left(N - error1, \log\left(\frac{P_{BR}}{P_{0,BR}}\right) - error2\right). \quad (27)$$
The lateral curves create a validation area (VA). When investigating a new measurement, if the point given by the quantity N and $\log(P_{BR}/P_{0,BR})$ is in the VA and the η of the sample is in the range of 1.125 ± 0.110, we can conclude that the sample is Fusarium. The points $(N_x, \log(P_{BR,x}/P_{0,BR}))$ are plotted in Figure 14, which also provides a visual view of the lateral curves and the VA. In Figure 14, the points of the starch samples were outside the VA, so together with the η of starch, we could reach a conclusion with more confidence and reliability. For the other two Fusarium samples, almost all of the measurement points were in the VA, except for a few points on the left of the figure.
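The decision rule described above (a point inside the VA together with η within 1.125 ± 0.110) can be sketched as follows. Several parts are assumptions: a straight-line least-squares fit stands in for the unspecified fitting(...) function, error1 and error2 are taken as scalars, the upper lateral curve is assumed to mirror the lower one with plus signs, and the calibration numbers are made up for illustration. The names `fit_line`, `lateral_lines` and `is_fusarium` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def lateral_lines(ns, logs, error1, error2):
    """Lower and upper lateral fits bounding the validation area (VA)."""
    # Lower curve: fit on (N - error1, log - error2), as in the min-point equation.
    lower = fit_line([n - error1 for n in ns], [y - error2 for y in logs])
    # Upper curve: assumed mirror image with plus signs.
    upper = fit_line([n + error1 for n in ns], [y + error2 for y in logs])
    return lower, upper


def is_fusarium(n_x, log_x, eta, lower, upper, eta_mean=1.125, eta_err=0.110):
    """True when (n_x, log_x) lies in the VA and eta is within eta_mean ± eta_err."""
    y_lo = lower[0] * n_x + lower[1]
    y_hi = upper[0] * n_x + upper[1]
    in_va = min(y_lo, y_hi) <= log_x <= max(y_lo, y_hi)
    in_eta = abs(eta - eta_mean) <= eta_err
    return in_va and in_eta


# Hypothetical calibration data: spore counts and log(P_BR / P_0,BR) values.
ns = [100, 200, 300, 400]
logs = [-0.10, -0.20, -0.30, -0.40]
lower, upper = lateral_lines(ns, logs, error1=10, error2=0.02)

print(is_fusarium(250, -0.25, 1.10, lower, upper))  # inside VA, eta in range
print(is_fusarium(250, -0.10, 0.94, lower, upper))  # outside VA, eta out of range
```

Here N_x is taken as already known; in the procedure described above it would first be obtained from the λ1 fitted curve under the assumption that the sample is Fusarium.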

Discussion
The ACKF reduces thermal noise and burst noise well. To fix the outliers in the BG or peak data, at least some reference data points must be unaffected by the outliers. From these reference data points, the ACKF can eliminate the outliers. In reality, there are cases in which the ACKF cannot fix the erroneous data (Figure 8f) because the outliers occur too close to each other, so the reference data points are covered by the burst noise and the erroneous data cannot be adequately fixed. Besides, from our experiments, we found that the ACKF also helped to reduce the time needed by the NN to search for the global minimum. This can be explained by the fact that the thermal noise and burst noise occurring in the BG noise were filtered very well by the ACKF (Section 4.1), so the NN could reach the global minimum faster. The evidence for this can be seen in Table 2.
Entropy is not only a useful tool for evaluating the work of the Kalman filter, but can also be applied as an operational criterion in place of other criteria such as the mean absolute error. The results in Sections 4.1 and 4.2 show the efficiency of the entropy. From Figure 9c,d and Table 2, although the error was the smallest after 1000 loops, the visual results were not what we expected: the corrected points fluctuated around the expected lines. Entropy was applied to the differentiation of the preprocessed and corrected MP data rather than directly to these data. Loosely speaking, the differentiation step helped to remove the differences in magnitude of these MP data, as we focused only on the BG noise. The information left was mainly the BG noise, which influenced the MP data (Figure 10c). Entropy then reveals how much BG noise is removed by comparing the entropies of the differentiation of the preprocessed and corrected MP data. Figure 12 presents the results when the trained NN was used to correct the erroneous data of Fusarium and starch, which were measured in many batches. In Figure 12a, one can see that the NN adjusted the MP of both samples. In Figure 12b,c, the MP data of starch had the largest errors, ranging from around 0.6 × 10^6 to 1.7 × 10^6 in the case of the λ1 thermopile, and from around 4.3 × 10^6 to 9.2 × 10^6 in the case of the λ2 thermopile.
As the group-distinction coefficients of Fusarium and starch were very similar, the addition of another thermopile detector, the BR thermopile, helped to better distinguish these two samples. Figure 14 shows that some Fusarium points were outside the VA. This can be explained by outliers in the data that the ACKF could not correct. The figure also shows a case in which a starch point was in the VA. However, in general, most of the experimental points were in the VA, so the device could distinguish the Fusarium sample from other samples. With an additional detector, the ability of the device could be expanded to detect groups of many more substances.
The group-distinction coefficient of the starch found in this work was slightly different from the value in [20], 0.9390 ± 0.0732. This can be explained by a difference in moisture between the starch samples used in this work and in [20]. A slight change in the moisture of the starch sample might affect its group-distinction coefficient.

Conclusions
The proposed adaptive-cognitive Kalman filter worked well in reducing the thermal noise and burst noise. The background noise could be mitigated by applying a neural network. The entropy could replace the mean absolute error as an operational criterion. By applying the proposed techniques, the upgraded device increased the reliability and precision of the current Fusarium detection and quantification. Additionally, by adding one more thermopile, the group coefficients of substances became more distinct. This helped the device to distinguish different substances more easily and with higher accuracy than with only two thermopiles.