Comparison of pH Data Measured with a pH Sensor Array Using Different Data Fusion Methods

This paper compares several data fusion methods applied to electrochemical measurements made with a sensor array. Ruthenium dioxide sensing membrane pH electrodes were used to form the sensor array, which was used to detect the pH values of grape wine, generic cola drink and bottled base water. The measured pH data were processed with each data fusion method to increase the reliability of the measured results, and the fusion results of the different methods were compared with one another.


Introduction
Research on data fusion has developed since the 1980s. The United States Department of Defense (DoD) first used data fusion for a military detection and management system [1]. In recent years, data fusion has been applied in various non-military fields, such as robotics, image processing, traffic management and smart transport systems. Sharma and Raju [2] described some characteristics of data fusion as follows: it raises information reliability, reduces uncertainty, improves detection effects and increases practicability. Common approaches include weighted average methods [3][4][5][6], fuzzy fusion and neural network fusion [7]. This paper introduces several data fusion methods in later sections, namely average data fusion, self-adaptive data fusion [3], fuzzy set data fusion [8] and coefficient of variance data fusion [9]. Li et al. [10] stated that sensor networks integrate sensor techniques, nested computation techniques, distributed computation techniques and wireless communication techniques. They can be used for testing, sensing, collecting and processing information about monitored objects and for transferring the processed information to users. Sensor networks represent a new research area of computer science and technology and are expected to find wide application in the future. Both academia and industry are very interested in them. Li et al. introduced the concepts and characteristics of sensor networks and of the data in the networks, discussed the issues of sensor networks and their data management, and presented research advances in both areas. Wang et al. [11] proposed a new mobile-agent-based adaptive data fusion (ADF) algorithm to determine the minimum number of measurements each node required for a perfectly joint reconstruction of multiple signal ensembles.
They theoretically showed that ADF provided the optimal strategy with the minimum total number of measurements possible and hence reduced communication cost and network load.
Xia et al. [12] introduced a novel approach called the linearly constrained least squares (LCLS) method for statistical data fusion. The LCLS method uses only the constrained minimum sample variance of the fused information, and the proposed fusion method can tackle the unknown covariance problem. Wei [13] noted that multi-sensor data fusion technology is one of the main techniques of the modern C3I system and plays a decisive role in C3I system performance. That paper used Visual C++ and MATLAB jointly to design and construct a universal visualization platform for multi-sensor data fusion simulation, which provides researchers with a variety of fusion algorithm simulations and a quantitative assessment of the simulation environment, and supports teaching and scientific research. Recently, Zakaria et al. [14] reported an improved classification of the herb Orthosiphon stamineus using a data fusion technique. Low level fusion was performed by combining the information provided by different sensors in different modalities. Principal component analysis (PCA) and linear discriminant analysis (LDA) were chosen to perform the low level fusion.
The use of data measured with a sensor array in pH sensing studies has gained popularity with recent technological advances. Data fusion can provide a better solution than could otherwise be achieved from single sensor data alone, because it produces an improved model or estimate of a sensing system from a set of independent data sources. In this study, we investigated the feasibility of data fusion for a pH sensor array by applying several data fusion methods to the measured pH data and comparing their results. The primary objective of this study was to select an appropriate data fusion method for electrochemical measurement applications, regardless of whether the pH sensor array contained a failed pH sensor.

Experimental
In this paper, the sensor array for pH measurement was based on the ruthenium dioxide (RuO2) pH electrode. The RuO2 thin film was deposited onto a silicon substrate using a sputtering system. In the experimental process, the sensor array and the Ag/AgCl reference electrode (RE) were immersed in commercial drinks (grape wine, generic cola drink and bottled base water), and the pH readings were obtained with a voltage-time measurement system interfaced with the program LabVIEW. The experiment used a sensor array of eight pH electrodes and repeated the measurements fifteen times [15]. Figure 1 shows the sensor array with eight pH sensors, the reference electrode, the readout circuit and the data acquisition (DAQ) card connected to a personal computer. The readout circuit consists of eight instrumentation amplifiers (IAs) and low pass filters (LPFs). The DAQ card is a National Instruments (NI) product with a universal serial bus (USB) interface. The measured data were fused with four data fusion methods: average data fusion (ADF), self-adaptive data fusion (SADF), fuzzy set data fusion (FSDF) and coefficient of variance data fusion (CVDF).

Figure 1. Experimental structure, including a sensor array with eight pH sensors, a reference electrode, a readout circuit and a data acquisition card connected to a personal computer; the measured data are fused with different data fusion methods.

Average Data Fusion (ADF)
The pH measured data were obtained from a pH sensor array with eight ruthenium dioxide pH electrodes. Each sensor was measured n times during a measurement period, and the mean of the n measurements was calculated for each pH sensor. Let the n measured data of the i th sensor and their mean ($\mu_i$) be as follows [15]:

$$X_i = \{x_i(1), x_i(2), \ldots, x_i(n)\} \tag{1}$$

$$\mu_i = \frac{1}{n}\sum_{k=1}^{n} x_i(k) \tag{2}$$

where i is the index of the sensor and k is the index of the measurement for each pH sensor.

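In this study, the average data fusion of the sensor array with eight pH electrodes uses the same weighted coefficient for every sensor ($w_1 = w_2 = \dots = w_8$). The sum of the weighted coefficients is equal to 1, and the final average data fusion of the sensor array is as follows [15]:

$$\sum_{i=1}^{8} w_i = 1, \qquad w_i = \frac{1}{8} \tag{3}$$

$$\mu = \sum_{i=1}^{8} w_i \mu_i = \frac{1}{8}\sum_{i=1}^{8} \mu_i \tag{4}$$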
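The equal-weight fusion described in this section can be sketched in Python as follows. This is a minimal illustration, not the authors' LabVIEW implementation: the function name `average_data_fusion` and the example readings are hypothetical, and each sensor is assumed to contribute its mean over the repeated measurements.

```python
from statistics import mean


def average_data_fusion(readings):
    """Fuse repeated pH readings with equal weights.

    readings: list with one inner list of repeated measurements per sensor
    (hypothetical layout, chosen for this sketch).
    """
    if not readings or any(len(r) == 0 for r in readings):
        raise ValueError("every sensor needs at least one measurement")
    sensor_means = [mean(r) for r in readings]  # per-sensor mean
    weight = 1.0 / len(sensor_means)            # same weight for each sensor
    return sum(weight * m for m in sensor_means)


# Illustrative values only, not data from the paper.
example_readings = [[3.5, 3.6], [3.6, 3.6], [7.0, 7.2]]
fused_adf = average_data_fusion(example_readings)
```

Because every sensor carries the same weight, a single outlying sensor (the third one in the example) shifts the fused value directly.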

Self-Adaptive Data Fusion (SADF)
The current work obtained pH measured values from eight ruthenium dioxide pH sensors. Each sensor was measured n times during a measurement period. The pH measured data can be pre-processed statistically to obtain the mean ($\bar{y}_i$) and variance ($\sigma_i^2$) of the n measured data from each pH sensor. The mean and variance are expressed in general form as follows [3]:

$$\bar{y}_i = \frac{1}{n}\sum_{j=1}^{n} y_{ij} \tag{5}$$

$$\sigma_i^2 = \frac{1}{n}\sum_{j=1}^{n} \left(y_{ij} - \bar{y}_i\right)^2 \tag{6}$$

where i is the index of the sensor, j is the index of the measurement for each pH sensor, $y_{ij}$ is the j th datum from the i th sensor, and $\bar{y}_i$ and $\sigma_i^2$ are the mean and variance of the i th sensor, respectively.
Here, the measurement data of the sensor array are fused on the basis of the minimum total mean variance. First, we assume that the data from all sensors have the same mean and are mutually independent. A weighted coefficient $w_i$ ($w_1, w_2, \ldots, w_n$) is evaluated for each pH sensor, and the sum of the weighted coefficients is equal to unity. The estimated data fusion value $\mu_y$ can then be described as follows [3]:

$$\mu_y = \sum_{i=1}^{n} w_i y_i \tag{7}$$

$$\sum_{i=1}^{n} w_i = 1 \tag{8}$$

where $w_i$ is the weighted coefficient of the i th sensor, $y_i$ is the measured data of the i th sensor and $\mu_y$ is the final value after data fusion. After data fusion, the total mean variance is as follows [3]:

$$\sigma^2 = \sum_{i=1}^{n} w_i^2 \sigma_i^2 \tag{9}$$

With the per-sensor variances of Equation (6), the total mean variance $\sigma^2$ is a multi-dimensional second order function of the weighted coefficients. According to multi-dimensional function theory, minimizing $\sigma^2$ under the unity constraint leads to the function f of the variables $\lambda$ and $w_i$ [3]:

$$f(w_1, \ldots, w_n, \lambda) = \sum_{i=1}^{n} w_i^2 \sigma_i^2 + \lambda\left(\sum_{i=1}^{n} w_i - 1\right) \tag{10}$$

The Lagrange multiplier method is used to find the minimum of Equation (9). Taking the partial derivatives of f with respect to $\lambda$ and $w_i$, respectively, and setting them to zero gives [3]:

$$\frac{\partial f}{\partial \lambda} = \sum_{i=1}^{n} w_i - 1 = 0 \tag{11}$$

$$\frac{\partial f}{\partial w_i} = 2 w_i \sigma_i^2 + \lambda = 0 \tag{12}$$

Solving Equations (11) and (12) yields the expression for $w_i$ [3]:

$$w_i = \frac{1}{\sigma_i^2 \sum_{j=1}^{n} \dfrac{1}{\sigma_j^2}} \tag{13}$$
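The minimum-variance weighting derived in this section reduces to weights proportional to the reciprocal of each sensor's variance. A minimal Python sketch under stated assumptions: the function names and example readings are hypothetical, the population variance is used (when every sensor has the same number of measurements, the choice between population and sample variance does not change the weights), and each sensor's mean stands in for its measured value.

```python
from statistics import mean, pvariance


def self_adaptive_weights(readings):
    """Weights proportional to 1/variance, normalized to sum to one."""
    variances = [pvariance(r) for r in readings]
    if any(v <= 0 for v in variances):
        # A zero variance would give an infinite weight; this sketch rejects it.
        raise ValueError("each sensor needs a strictly positive variance")
    reciprocals = [1.0 / v for v in variances]
    total = sum(reciprocals)
    return [r / total for r in reciprocals]


def self_adaptive_fusion(readings):
    """Weighted sum of the per-sensor means (assumption of this sketch)."""
    weights = self_adaptive_weights(readings)
    return sum(w * mean(r) for w, r in zip(weights, readings))


# Illustrative values only: the second sensor is noisier, so it weighs less.
example_readings = [[3.0, 5.0], [5.0, 9.0]]
example_weights = self_adaptive_weights(example_readings)
fused_sadf = self_adaptive_fusion(example_readings)
```

A sensor whose readings scatter widely therefore contributes little to the fused value, which is the mechanism that suppresses an unstable sensor.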

Fuzzy Set Data Fusion (FSDF)
There are n sensors in the measurement system, and each sensor is used to determine the analyte. The measured value of the i th sensor at time k is given in [8] as Equation (14). The measured value of each sensor is treated as a fuzzy set. According to fuzzy mathematics, the closeness between two fuzzy sets can be measured.
Definition 1: the approach degree ($\sigma_{ij}$) between the measured values of the i th sensor and the j th sensor at time k is given in [8] as Equation (15).
Definition 2: the approach degree matrix between all sensors at time k is given in [8] as Equation (16).
Definition 3: the consistence measurement ($r_i$) of the measured value between the i th sensor and the other sensors at time k is defined in [8]. The consistence measurements are normalized so that they sum to one [8]. Using the normalized consistence measurements as weights for data fusion, the fused measured value of all sensors at time k is obtained as in [8].
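The general idea of consistency-weighted fusion can be sketched in Python as follows. The formulas of [8] are not reproduced in this text, so this sketch substitutes a common closeness measure, the ratio min/max of two positive readings, and takes each sensor's consistency as the mean of its row in the approach degree matrix. Both choices, the function names and the example values are assumptions made for illustration and may differ from the definitions in [8].

```python
def approach_degree_matrix(values):
    """Pairwise closeness of readings; min/max ratio is an ASSUMED measure."""
    if any(v <= 0 for v in values):
        raise ValueError("this closeness measure needs positive readings")
    return [[min(a, b) / max(a, b) for b in values] for a in values]


def consistency_weights(values):
    """Normalize each sensor's mean closeness to the others into a weight."""
    matrix = approach_degree_matrix(values)
    consistency = [sum(row) / len(row) for row in matrix]
    total = sum(consistency)
    return [c / total for c in consistency]


def fuzzy_set_fusion(values):
    """Weighted sum of the readings taken at one time step."""
    weights = consistency_weights(values)
    return sum(w * v for w, v in zip(weights, values))


# Illustrative values only: the third reading disagrees with the other two.
example_values = [4.0, 4.0, 8.0]
example_weights = consistency_weights(example_values)
fused_fsdf = fuzzy_set_fusion(example_values)
```

In this sketch a reading that disagrees with the majority receives a smaller weight without being discarded.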

Coefficient of Variance Data Fusion (CVDF)
The coefficient of variance (CV), also named the discrete coefficient, is used to compare the dispersion of different sets of measurement data. The CV is the ratio of the standard deviation to the mean value. $CV_i$ denotes the coefficient of variance of the measured data $X_i$, and it is calculated as follows [9]:

$$CV_i = \frac{\sigma_i}{\mu_i} \tag{23}$$

To use the coefficient of variance for the data fusion of the sensor array, the data are processed as follows [9]:
(1) From Equation (23), calculate the coefficient of variance of the measured data of each sensor in the array ($CV_1, CV_2, \ldots, CV_n$).
(2) Calculate the reciprocals of the coefficients of variance ($1/CV_1, 1/CV_2, \ldots, 1/CV_n$).
(3) Normalize the reciprocals of the coefficients of variance to obtain the weighted coefficients of the sensor array:

$$w_i = \frac{1/CV_i}{\sum_{j=1}^{n} 1/CV_j} \tag{24}$$

(4) The result of fusion is described as follows [9]:

$$X = \sum_{i=1}^{n} w_i X_i$$
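The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and example readings are hypothetical, the population standard deviation is used, and each sensor's mean stands in for its measured data in the final weighted sum.

```python
from statistics import mean, pstdev


def cv_weights(readings):
    """Steps (1)-(3): CV per sensor, reciprocals, then normalization."""
    cvs = [pstdev(r) / mean(r) for r in readings]  # standard deviation / mean
    if any(cv <= 0 for cv in cvs):
        # A zero CV would give an infinite weight; this sketch rejects it.
        raise ValueError("each sensor needs a strictly positive CV")
    reciprocals = [1.0 / cv for cv in cvs]
    total = sum(reciprocals)
    return [r / total for r in reciprocals]


def cv_fusion(readings):
    """Step (4): weighted sum of the per-sensor means (sketch assumption)."""
    weights = cv_weights(readings)
    return sum(w * mean(r) for w, r in zip(weights, readings))


# Illustrative values only: the second sensor has twice the relative scatter.
example_readings = [[3.0, 5.0], [4.0, 12.0]]
example_weights = cv_weights(example_readings)
fused_cvdf = cv_fusion(example_readings)
```

Unlike the variance-based weighting, the CV scales the scatter by the mean, so the weights reflect relative rather than absolute dispersion.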

Results and Discussion
The pH measurement data of grape wine, generic cola drink and bottled base water obtained from the ruthenium dioxide sensor array are shown in Table 1. Each measurement of the pH sensor array yielded eight pH data, one per sensor. We applied the different data fusion methods to the measured pH data in Table 1 and compared the fusion results of average data fusion, self-adaptive data fusion, fuzzy set data fusion and coefficient of variance data fusion. The aim was to identify an appropriate data fusion method for electrochemical measurement applications. This study combined the various data fusion methods with the pH sensor array to investigate the reliability of the measured results of the array without removing the measured data of the failed pH sensor.

Average Data Fusion (ADF)
In average data fusion (ADF), every sensor has the same weighted coefficient ($w_1 = w_2 = \dots = w_8$). We used Equation (4) to derive the average data fusion of the measured pH data. The weighted coefficient of each of the eight sensors in the array is 1/8 = 0.125. The data fusion results of grape wine, generic cola drink and bottled base water with average data fusion are 4.04, 5.11 and 7.62, respectively, and are shown in Table 2.

Self-Adaptive Data Fusion (SADF)
The measured pH data of the RuO2 sensor array were fused with self-adaptive data fusion (SADF). We used Equation (13) to obtain the weighted coefficients ($w_i$) of the sensor array. The weighted coefficients and the corresponding fusion results of SADF are shown in Table 3. The fusion results of grape wine, generic cola drink and bottled base water with self-adaptive data fusion are 3.58, 4.67 and 7.44, respectively.

Table 3. The weighted coefficients of self-adaptive data fusion (SADF) for every sensor, and the fusion results obtained from the pH measured data of grape wine, generic cola drink and bottled base water. Columns: sensor (i), grape wine ($w_i$), generic cola drink ($w_i$), bottled base water ($w_i$).

Fuzzy Set Data Fusion (FSDF)
We used Equation (21) to obtain the weighted coefficients (w i ) of the sensor array with fuzzy set data fusion (FSDF). The fusion results and weighted coefficients (w i ) of every sensor are shown in Table 4. The fusion results of grape wine, generic cola drink and bottled base water with fuzzy set data fusion are 3.56, 4.68 and 7.30, respectively.

Coefficient of Variance Data Fusion (CVDF)
We used Equation (24) to obtain the weighted coefficients ($w_i$) of the sensor array with coefficient of variance data fusion (CVDF). Table 5 shows the fusion results and the weighted coefficients ($w_i$) of every sensor. The fusion results of grape wine, generic cola drink and bottled base water with coefficient of variance data fusion are 3.62, 4.79 and 7.46, respectively.

Summary Results
The weighted coefficients of the various data fusion methods were obtained from the measured pH data, from which the mean (μ), standard deviation (σ) and variance (σ²) were computed with statistical functions. In this study, we investigated the various data fusion methods and applied them to the measured pH values of an electrochemical pH sensor array. The fusion techniques use only the measured data and statistical formulas to derive the solution. Table 6 summarizes the pre-calculation, weighted coefficients and computational complexity of the different data fusion methods. Average data fusion requires the mean (μ) of the measured data of the pH sensor array in advance; it uses the same weighted coefficient for every sensor, and its calculation is simple. Self-adaptive data fusion requires the mean (μ) and variance (σ²) beforehand, and the weighted coefficients of the sensor array are obtained from the variance of each pH sensor. The approach degree ($\sigma_{ij}$) and consistence ($r_i$) are used to calculate the weighted coefficients of fuzzy set data fusion. Coefficient of variance data fusion uses the mean (μ) and standard deviation (σ) to derive the weighted coefficients of the pH sensor array. In terms of computational process, fuzzy set data fusion uses variance and matrix operations and is more complex than the others. The computational complexity of self-adaptive and coefficient of variance data fusion is moderate, involving mean, standard deviation and variance operations. Average data fusion is the easiest to compute, since its weighted coefficients follow from the arithmetic average. In this study, we performed a series of trials on commercial drinks using different data fusion methods with a RuO2 pH sensor array. This section summarizes the fusion results of this experiment.
We applied the different data fusion methods to the measured pH data of Table 1. Tables 2-5 present the experimental results. Table 2 shows the final results of the pH measurement of grape wine, generic cola drink and bottled base water with a single sensor. The no. 6 sensor failed, and its measured value is very different from those of the other sensors. The fusion results of the different data fusion methods are shown in Table 7 and compared with the measurement results of a commercial pH meter. From Table 7, we can conclude that the result of average data fusion deviates more from the commercial pH meter than the results of the other data fusion methods. This is because the 6th sensor of the pH sensor array failed: its measured data were incorrect, yet it had the same weighted coefficient as the other sensors. In the other data fusion methods, the 6th sensor has a smaller weighted coefficient, so the fusion results were fairly close to the measured value of the commercial pH meter. According to the experimental results, the fusion results of the self-adaptive, fuzzy set and coefficient of variance methods were superior to those of a single failed pH sensor and of average data fusion.
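The gap between average data fusion and the three weighted methods can be read directly from the fused values quoted in the preceding subsections. The short Python sketch below uses only those reported values; the commercial pH meter readings of Table 7 are not reproduced in this text, so the sketch shows how far ADF sits from the other methods, not the error against the reference meter. The helper name `adf_offset` is hypothetical.

```python
# Fused pH values as reported in the Results subsections above.
reported = {
    "grape wine": {"ADF": 4.04, "SADF": 3.58, "FSDF": 3.56, "CVDF": 3.62},
    "generic cola drink": {"ADF": 5.11, "SADF": 4.67, "FSDF": 4.68, "CVDF": 4.79},
    "bottled base water": {"ADF": 7.62, "SADF": 7.44, "FSDF": 7.30, "CVDF": 7.46},
}


def adf_offset(results):
    """Difference between the ADF result and the mean of the other methods."""
    others = [value for name, value in results.items() if name != "ADF"]
    return results["ADF"] - sum(others) / len(others)


# Offset in pH units for each drink, rounded to two decimals.
offsets = {drink: round(adf_offset(r), 2) for drink, r in reported.items()}
```

For all three drinks the ADF result lies above the weighted methods, which is consistent with the failed sensor pulling the equal-weight average upward.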

Conclusions
This study used ruthenium dioxide pH electrodes to form a sensor array and obtained a set of measured pH data with a voltage-time measurement system. The sensor array was applied to measure the pH of commercial drinks. The measured pH data were processed with different data fusion methods, and we compared the fusion results of the methods and investigated the computational complexity of each one. The fusion results of the self-adaptive, fuzzy set and coefficient of variance methods were clearly superior to those of a single failed sensor and of average data fusion.