An Information Entropy-Based Modeling Method for the Measurement System

Measurement is a key method to obtain information from the real world and is widely used in human life. A unified model of measurement systems is critical to the design and optimization of measurement systems. However, the existing models of measurement systems are too abstract. To a certain extent, this makes it difficult to have a clear overall understanding of measurement systems and of how they implement information acquisition. It also limits the application of these models. Information entropy is a measure of the information or uncertainty of a random variable and has strong representation ability. In this paper, an information entropy-based modeling method for measurement systems is proposed. First, a modeling idea based on the viewpoint of information and uncertainty is described. Second, an entropy balance equation based on the chain rule for entropy is proposed for system modeling. Then, the entropy balance equation is used to establish the information entropy-based model of the measurement system. Finally, three cases of typical measurement units or processes are analyzed using the proposed method. Compared with the existing modeling approaches, the proposed method considers the modeling problem from the perspective of information and uncertainty. It focuses on the information loss of the measurand in the transmission process and on the characterization of the specific role of each measurement unit. The proposed model can intuitively describe the processing and changes of information in the measurement system. It does not conflict with the existing models of measurement systems but complements them, thus further enriching the existing measurement theory.


Introduction
Measurement has been developed through the physical sciences and plays a very important role in industry, commerce, health and safety, and environmental protection [1][2][3][4][5]. A unified model of measurement systems is critical to the design and optimization of measurement systems. However, the existing measurement theory which will be reviewed below is too abstract. To a certain extent, this makes it difficult to have a clear overall understanding of measurement systems and how to obtain information with measurement units during the measurement process at the outset. Therefore, measurement science needs a theoretical framework [2] that can intuitively describe, analyze, and evaluate measurement systems and characterize how measurement units work to obtain the information of the measurand.
Numerous works on the modeling of measurement and measurement systems have been developed and published. Helmholtz and Hoelder developed a theory of measurement based

Information Entropy and Related Concepts
Entropy is a measure of the uncertainty of a random variable. For a discrete random variable $X$ with a finite number of states, the probability of each state $X = x_i$, $i = 1, 2, \cdots, N$, is denoted as $p(x_i) = p(X = x_i)$. For the sake of simplicity, we use $p(x_i)$ to represent the probability instead of $p(X = x_i)$. Similarly, for a discrete random variable $Y$, the probability function is denoted as $p(y_j)$, $j = 1, 2, \cdots, M$. The joint probability function of $X$ and $Y$ is represented by $p(x_i y_j)$.
Definition 1. The information entropy of the discrete random variable $X$ is defined as:
$$H(X) = -\sum_{i=1}^{N} p(x_i) \log p(x_i). \tag{1}$$
If the log is to base 2, the unit of information entropy is bits; if the log is to base e (the natural logarithm), the unit is nats; and if the log is to base 10, the unit is harts. The related measures introduced later have the same units, as do entropy and related measures for continuous random variables.
For a continuous random variable $X$ with probability density function $p(x)$, the information entropy is infinite since the number of its states is infinite. In this case, information entropy is the sum of differential entropy and a constant that tends to infinity. The definition of differential entropy is given as follows:
Definition 2. The differential entropy of the continuous random variable $X$ with probability density function $p(x)$, $x \in \mathbb{R}$, is defined as:
$$h(X) = -\int_{-\infty}^{+\infty} p(x) \log p(x)\,dx. \tag{2}$$
Differential entropy alone therefore cannot represent the uncertainty of a continuous random variable and does not have the connotation of information. However, when discussing mutual information, the two infinite constant terms cancel each other, so differential entropy has the same information characteristics as information entropy.
In this paper, in order to make each item in the model established in Section 3 have the connotation of information, the uncertainty of a random variable is characterized by information entropy, whether the random variable is continuous or discrete. In addition, for continuous cases, mutual information is calculated using differential entropy.
Based on the information entropy, the related concepts and their definitions are introduced below:
Definition 3. The joint information entropy of discrete random variables $X$ and $Y$ is defined as:
$$H(XY) = -\sum_{i=1}^{N}\sum_{j=1}^{M} p(x_i y_j) \log p(x_i y_j). \tag{3}$$
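Definition 4. The conditional entropy of the discrete random variable $X$ given $Y$ is defined as:
$$H(X|Y) = -\sum_{i=1}^{N}\sum_{j=1}^{M} p(x_i y_j) \log p(x_i | y_j). \tag{4}$$
Definition 5. The average mutual information (also referred to as mutual information) between discrete random variables $X$ and $Y$ is defined as:
$$I(X;Y) = \sum_{i=1}^{N}\sum_{j=1}^{M} p(x_i y_j) \log \frac{p(x_i y_j)}{p(x_i)\,p(y_j)}. \tag{5}$$
The relationship between $H(X)$, $H(Y)$, $H(X|Y)$, $H(Y|X)$ and $I(X;Y)$ can be expressed by the Venn diagram shown in Figure 1. Two equations governing this are:
$$H(XY) = H(X) + H(Y|X) = H(Y) + H(X|Y), \tag{6}$$
$$I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X). \tag{7}$$
Entropy 2019, xx, x FOR PEER REVIEW 4 of 14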
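The definitions above can be checked numerically on a small example. The following is a minimal sketch, assuming a hypothetical 2 x 2 joint probability table (the values are illustrative and not taken from this paper); it evaluates the entropy, conditional entropy, and mutual information from their definitions and confirms the two standard relations $H(XY) = H(X) + H(Y|X)$ and $I(X;Y) = H(X) - H(X|Y)$.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities (zero terms are skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint pmf p(x_i y_j): rows index X, columns index Y (illustrative values).
joint = [[0.25, 0.25],
         [0.40, 0.10]]

p_x = [sum(row) for row in joint]        # marginal of X
p_y = [sum(col) for col in zip(*joint)]  # marginal of Y
cells = [(i, j) for i in range(2) for j in range(2) if joint[i][j] > 0]

H_X = entropy(p_x)
H_Y = entropy(p_y)
H_XY = entropy([joint[i][j] for i, j in cells])

# Conditional entropies and mutual information, computed directly from the definitions.
H_X_given_Y = -sum(joint[i][j] * math.log2(joint[i][j] / p_y[j]) for i, j in cells)
H_Y_given_X = -sum(joint[i][j] * math.log2(joint[i][j] / p_x[i]) for i, j in cells)
I_XY = sum(joint[i][j] * math.log2(joint[i][j] / (p_x[i] * p_y[j])) for i, j in cells)

print(f"H(X)={H_X:.4f}  H(Y)={H_Y:.4f}  H(XY)={H_XY:.4f}")
print(f"H(X|Y)={H_X_given_Y:.4f}  H(Y|X)={H_Y_given_X:.4f}  I(X;Y)={I_XY:.4f}")
```

Using base-2 logarithms gives all quantities in bits; replacing `math.log2` with `math.log` would give nats.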


Entropy Balance Equation
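In this part, an extension of the chain rule for joint entropy, called the entropy balance equation (Equation (8)), is developed for system modeling. It is stated and proved below:
Theorem 1. Given random variables $X_1, X_2, \cdots, X_n$ which are drawn according to $p(x_1, x_2, \cdots, x_n)$, then:
$$H(X_1) + \sum_{i=1}^{n-1} H(X_{i+1} | X_i, X_{i-1}, \cdots, X_1) = H(X_n) + \sum_{i=1}^{n-1} H(X_i | X_{i+1}, X_{i+2}, \cdots, X_n). \tag{8}$$
Proof. By the chain rule for entropy [35], we have:
$$H(X_1 X_2 \cdots X_n) = \sum_{i=1}^{n} H(X_i | X_{i-1}, X_{i-2}, \cdots, X_1), \tag{9}$$
where the term for $i = 1$ is $H(X_1)$. Equation (9) can be readily proved with $p(x_1, x_2, \cdots, x_n) = \prod_{i=1}^{n} p(x_i | x_{i-1}, x_{i-2}, \cdots, x_1)$ and the definitions of entropy and conditional entropy. By symmetry, one can write:
$$p(x_1, x_2, \cdots, x_n) = \prod_{i=1}^{n} p(x_i | x_{i+1}, x_{i+2}, \cdots, x_n). \tag{10}$$
Thus:
$$H(X_1 X_2 \cdots X_n) = \sum_{i=1}^{n} H(X_i | X_{i+1}, X_{i+2}, \cdots, X_n), \tag{11}$$
where the term for $i = n$ is $H(X_n)$. Based on Equations (9) and (11), one can obtain the following equality:
$$H(X_1) + \sum_{i=2}^{n} H(X_i | X_{i-1}, X_{i-2}, \cdots, X_1) = H(X_n) + \sum_{i=1}^{n-1} H(X_i | X_{i+1}, X_{i+2}, \cdots, X_n), \tag{12}$$
which is equivalent to Equation (8). □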
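As a numerical sanity check of the entropy balance equation, the sketch below builds a random joint distribution over three binary variables (a hypothetical test case, not from this paper) and evaluates both sides: the forward chain-rule expansion, conditioning on earlier variables, and the backward expansion, conditioning on later variables. Both should equal the joint entropy. Conditional entropies are computed as $H(A|B) = H(AB) - H(B)$.

```python
import itertools
import math
import random

def entropy(pmf):
    """Shannon entropy in bits of a pmf stored as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def marginal(joint, keep):
    """Marginal pmf over the variable positions listed in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[k] for k in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def cond_entropy(joint, target, given):
    """H(X_target | X_given) = H(X_target, X_given) - H(X_given)."""
    given = list(given)
    return entropy(marginal(joint, [target] + given)) - entropy(marginal(joint, given))

# Hypothetical random joint pmf over n = 3 binary variables (seeded for repeatability).
random.seed(0)
n = 3
outcomes = list(itertools.product([0, 1], repeat=n))
weights = [random.random() for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}

# Left side: H(X_1) plus entropies conditioned on all earlier variables.
lhs = sum(cond_entropy(joint, i, range(i)) for i in range(n))
# Right side: H(X_n) plus entropies conditioned on all later variables.
rhs = sum(cond_entropy(joint, i, range(i + 1, n)) for i in range(n))

print(f"forward sum = {lhs:.6f}, backward sum = {rhs:.6f}, joint entropy = {entropy(joint):.6f}")
```

With an empty conditioning set, `cond_entropy` reduces to the plain entropy, so the `i = 0` term of `lhs` is $H(X_1)$ and the `i = n - 1` term of `rhs` is $H(X_n)$.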

Modeling of Measurement Systems
The unified description and modeling of most measurement systems for all measurement applications is one of the key problems in measurement theory. This paper focuses on the traditional measurement system that provides information about the physical values of measurand [36]. The system has three types of components connected in series, including sensor, variable conversion units, and signal processing units. Sometimes the sensor and variable conversion units are combined.

Model of Measurement Unit
A measurement system [3] consists of a finite number of measurement units, as depicted in Figure 2, and is generally a series system. For any unit $i$ of the system ($i = 1, 2, \cdots, n-1$), there are four random variables $X_i$, $E_i$, $N_i$, $X_{i+1}$ associated with it, where $X_i$ is the input, $E_i$ denotes the error, $N_i$ is the noise (this model only considers additive noise), and, under the combined effect of $X_i$, $E_i$ and $N_i$, the output is $X_{i+1}$. Therefore, the unit can be described by the information entropy-based model in the form of the Venn diagram shown in Figure 1 (redrawn here as Figure 3 for convenience), and the entropies of the four variables satisfy:
$$H(X_i) + H(N_i) = H(X_{i+1}) + H(E_i), \tag{13}$$
where $H(X_i)$ denotes the entropy of the unit input $X_i$; $H(X_{i+1})$ represents the entropy of the output $X_{i+1}$; $H(N_i) = H(X_{i+1} | X_i)$ is the noise entropy, which stands for the entropy increase caused by noise, amplification and other effects; and $H(E_i) = H(X_i | X_{i+1})$ is the error entropy, which denotes the information loss of $X_i$, whether passive or active, and can also indicate active denoising by the measurement unit.

In addition, $H(X_i X_{i+1})$ denotes the joint entropy of $X_i$ and $X_{i+1}$, and $I(X_i; X_{i+1})$ is the average mutual information between $X_i$ and $X_{i+1}$, which denotes the amount of information shared by $X_i$ and $X_{i+1}$. The relationships of these entropies satisfy the following equations:
$$H(X_i X_{i+1}) = H(X_i) + H(N_i) = H(X_{i+1}) + H(E_i), \tag{14}$$
$$I(X_i; X_{i+1}) = H(X_i) - H(E_i) = H(X_{i+1}) - H(N_i). \tag{15}$$
The traditional model only considers the noise in the signal and the error between the measurement result and the true value. In contrast, the proposed model of the measurement unit also considers the information loss during transmission through each measurement unit and can describe the denoising and amplification effect of the measurement unit on its input. These functions of measurement units are represented by the error entropy and the noise entropy, which shows that the model describes the measurement unit well.
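To make the roles of the noise entropy and the error entropy concrete, the following sketch evaluates them for a hypothetical measurement unit (an illustration, not a unit from this paper): the input is uniform on four levels, the unit merges them into two levels (a deliberate information loss), and the output is flipped with probability 0.1 (noise). The noise entropy is taken as $H(X_{i+1}|X_i)$ and the error entropy as $H(X_i|X_{i+1})$, as in the text.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical unit: input uniform on {0, 1, 2, 3}; ideal output is x // 2,
# and the output bit is flipped with probability `flip` (illustrative noise level).
p_in = [0.25, 0.25, 0.25, 0.25]
flip = 0.1

joint = [[0.0, 0.0] for _ in p_in]  # joint[x][y] = p(X_i = x, X_{i+1} = y)
for x, px in enumerate(p_in):
    y = x // 2
    joint[x][y] += px * (1 - flip)
    joint[x][1 - y] += px * flip

p_out = [sum(col) for col in zip(*joint)]

H_in = entropy(p_in)                               # H(X_i)
H_out = entropy(p_out)                             # H(X_{i+1})
H_joint = entropy([p for row in joint for p in row])
H_noise = H_joint - H_in                           # H(N_i) = H(X_{i+1} | X_i)
H_error = H_joint - H_out                          # H(E_i) = H(X_i | X_{i+1})
I_io = H_in - H_error                              # I(X_i; X_{i+1})

print(f"H(X_i)={H_in:.4f}  H(X_i+1)={H_out:.4f}")
print(f"H(N_i)={H_noise:.4f}  H(E_i)={H_error:.4f}  I={I_io:.4f}")
```

In this example the error entropy exceeds the noise entropy because merging four levels into two discards one bit of the input in addition to the ambiguity introduced by the flips.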

Information Entropy-Based Model of Measurement System
By repeated application of Equation (13), we have the relations of the entropies of every unit in a measurement system:
$$H(X_i) + H(N_i) = H(X_{i+1}) + H(E_i), \quad i = 1, 2, \cdots, n-1. \tag{16}$$
Then, adding the two sides of these equations and eliminating the common terms, we have:
$$H(X_1) + \sum_{i=1}^{n-1} H(N_i) = H(X_n) + \sum_{i=1}^{n-1} H(E_i), \tag{17}$$
where $H(N_i) = H(X_{i+1} | X_i)$ and $H(E_i) = H(X_i | X_{i+1})$. Equation (17) is equivalent to:
$$H(X_1) + \sum_{i=1}^{n-1} H(X_{i+1} | X_i) = H(X_n) + \sum_{i=1}^{n-1} H(X_i | X_{i+1}). \tag{18}$$
Equation (18) is the information entropy-based model of the measurement system. Notice that Equation (18) is similar to the entropy balance equation (Equation (8)). The reason is that the measurement system shown in Figure 2 has a multi-unit serial structure. For the system input and unit outputs $X_1, X_2, \cdots, X_n$, the random variable $X_{i+1}$ generally depends only on the input $X_i$ of unit $i$ and is not directly related to the previous random variables $X_1, X_2, \cdots, X_{i-1}$. Therefore, $X_1, X_2, \cdots, X_n$ forms a first-order Markov chain, namely $p(x_{i+1} | x_i, x_{i-1}, \cdots, x_1) = p(x_{i+1} | x_i)$, so that:
$$H(X_{i+1} | X_i, X_{i-1}, \cdots, X_1) = H(X_{i+1} | X_i). \tag{19}$$
Since $X_1, X_2, \cdots, X_n$ constitutes a first-order Markov chain, $X_n, X_{n-1}, \cdots, X_1$ is also a first-order Markov chain, that is:
$$H(X_i | X_{i+1}, X_{i+2}, \cdots, X_n) = H(X_i | X_{i+1}). \tag{20}$$
Therefore, the measurement system can also be described by a first-order Markov chain. Figure 4 depicts the Venn diagram of the entropy model of a first-order Markov chain, which has a symmetrical structure. According to the previous discussion, we have:
Corollary 1. For a Markov chain $X_1 \to X_2 \to \cdots \to X_n$, the entropy balance equation can be further written as Equation (18).

Proof. According to Theorem 1 and the Markov property, we have Equations (8), (19), and (20). Substituting Equations (19) and (20) into Equation (8) gives Equation (18). □
From Corollary 1, the entropy balance equation of a Markov chain $X_1 \to X_2 \to \cdots \to X_n$ is the information entropy-based model of the measurement system. It shows that all units of a measurement system are equivalent to a single unit, as displayed in Figure 5, for which the sum of all input entropies is equal to the sum of all output entropies. The information entropy-based model of the measurement system (Equation (18)) not only describes the relationship between the inputs and outputs of the system but also represents the intermediate quantities in the system. That is, the model of the subsystem formed by units $j$ to $k-1$ ($1 \le j < k \le n$) can be expressed as:
$$H(X_j) + \sum_{i=j}^{k-1} H(X_{i+1} | X_i) = H(X_k) + \sum_{i=j}^{k-1} H(X_i | X_{i+1}). \tag{21}$$
If the input entropy (or output entropy) and all conditional entropies associated with the subsystem are known, then the subsystem's output entropy (or input entropy) can be calculated according to Equation (21).
For an ideal source (the system input is without noise), the measurement result can be directly evaluated by the mutual information between the system input and output, $I(X_1; X_n)$. The greater the mutual information, the more accurate the measurement result. The information loss of the measurand can be evaluated by the relative information error (RIE), which is defined as:
$$\mathrm{RIE} = \frac{H(X_1) - I(X_1; X_n)}{H(X_1)}. \tag{22}$$
An ideal measurement system satisfies $I(X_1; X_n) = H(X_1) = H(X_n)$, and the condition is:
$$H(X_1 | X_n) = H(X_n | X_1) = 0, \tag{23}$$
which means that $X_1$ and $X_n$ have the same probability function and the information of the measurand is completely acquired by the measurement system.
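The system-level balance can be illustrated with a small serial chain. The sketch below is a hedged example under stated assumptions: the two transition matrices `T1` and `T2` and the input distribution `p1` are hypothetical, each unit's output depends only on its own input (the first-order Markov assumption of the text), and the relative information error is taken as the ratio $(H(X_1) - I(X_1; X_n)) / H(X_1)$, which is an assumed form of the definition. The code accumulates the per-unit noise and error entropies and compares $H(X_1) + \sum_i H(X_{i+1}|X_i)$ with $H(X_n) + \sum_i H(X_i|X_{i+1})$.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def unit_entropies(p_in, T):
    """For a unit with transition matrix T[a][b] = p(b | a), return
    (H(out | in), H(in | out), output distribution)."""
    joint = [[p_in[a] * T[a][b] for b in range(len(T[a]))] for a in range(len(p_in))]
    p_out = [sum(col) for col in zip(*joint)]
    h_joint = entropy([p for row in joint for p in row])
    return h_joint - entropy(p_in), h_joint - entropy(p_out), p_out

def compose(A, B):
    """Transition matrix of unit A followed by unit B (valid for a Markov chain)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

# Hypothetical chain X1 -> X2 -> X3 (illustrative values).
p1 = [0.5, 0.3, 0.2]
T1 = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]]  # noisy 3-level stage
T2 = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]                  # lossy 3-to-2-level stage

p = p1
noise_sum = error_sum = 0.0
for T in (T1, T2):
    h_noise, h_error, p = unit_entropies(p, T)
    noise_sum += h_noise   # sum of H(X_{i+1} | X_i)
    error_sum += h_error   # sum of H(X_i | X_{i+1})

H_X1, H_Xn = entropy(p1), entropy(p)
lhs = H_X1 + noise_sum
rhs = H_Xn + error_sum

# End-to-end mutual information and the assumed relative information error.
_, h_x1_given_xn, _ = unit_entropies(p1, compose(T1, T2))
I_1n = H_X1 - h_x1_given_xn
rie = (H_X1 - I_1n) / H_X1

print(f"balance: {lhs:.6f} vs {rhs:.6f}")
print(f"I(X1;Xn) = {I_1n:.4f} bits, RIE = {rie:.4f}")
```

Because each unit satisfies its own balance, the intermediate entropies telescope when the per-unit equations are added, which is why only $H(X_1)$ and $H(X_n)$ survive on the two sides.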

Application
To better understand the proposed model, three cases of typical measurement units or processes are discussed in this section.

Case 1: Bandpass Filter
The bandpass filter, which is a typical unit in the measurement system, is analyzed in this section. As shown in Figure 6, the input of the filter $K(\omega)$ is $Y = X + N$, where $X$ is a Gaussian random variable with power $\sigma_x^2$, $N$ is white Gaussian noise with power $\sigma_n^2$, and $X$ and $N$ are independent of each other. The differential entropy of $X$ can be expressed as:
$$h(X) = \frac{1}{2} \log\left(2\pi e \sigma_x^2\right), \tag{24}$$
and the differential entropy of $N$ is denoted by:
$$h(N) = \frac{1}{2} \log\left(2\pi e \sigma_n^2\right). \tag{25}$$

Before passing through the filter, since $X$ and $N$ are independent, the power of $Y$ satisfies $\sigma_y^2 = \sigma_x^2 + \sigma_n^2$. The mutual information between $X$ and $Y$ is:
$$I(X;Y) = h(Y) - h(Y|X) = h(Y) - h(N) = \frac{1}{2} \log\left(1 + \frac{\sigma_x^2}{\sigma_n^2}\right). \tag{26}$$

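After passing through the filter, the mutual information between $X$ and the filter output $Z$ is:
$$I(X;Z) = \frac{1}{2} \log\left(1 + \frac{\sigma_x'^2}{\sigma_n'^2}\right), \tag{27}$$
where $\sigma_x'^2$ and $\sigma_n'^2$ represent the power of $X$ and $N$ after passing through the filter, respectively. The increment of mutual information (IMI) is defined by:
$$\mathrm{IMI} = I(X;Z) - I(X;Y) = \frac{1}{2} \log \frac{1 + \sigma_x'^2/\sigma_n'^2}{1 + \sigma_x^2/\sigma_n^2}. \tag{28}$$
Suppose that the power of the noise $N$ is $\sigma_n^2 = N_0 f/2$, where $f$ is the bandwidth of the noise and $N_0/2$ denotes the two-sided power spectral density of the noise. The filter is an ideal bandpass filter with a bandwidth of $\Delta f$ and a gain of 1 in the passband. After passing through the filter, the power of the noise is $\sigma_n'^2 = N_0 \Delta f/2 = \sigma_n^2 \Delta f / f$, so Equation (28) can be rewritten as:
$$\mathrm{IMI} = \frac{1}{2} \log \frac{1 + \dfrac{f}{\Delta f}\dfrac{\sigma_x'^2}{\sigma_n^2}}{1 + \dfrac{\sigma_x^2}{\sigma_n^2}}. \tag{29}$$
According to the characteristics of the filter, the passband should be consistent with the frequency band of $X$, that is, $\sigma_x'^2 = \sigma_x^2$ and $\sigma_x^2 \gg \sigma_n'^2$; therefore:
$$\mathrm{IMI} = \frac{1}{2} \log \frac{1 + \dfrac{f}{\Delta f}\dfrac{\sigma_x^2}{\sigma_n^2}}{1 + \dfrac{\sigma_x^2}{\sigma_n^2}} \approx \frac{1}{2} \log\left(\frac{f}{\Delta f} \cdot \frac{\sigma_x^2/\sigma_n^2}{1 + \sigma_x^2/\sigma_n^2}\right). \tag{30}$$
Equation (30) shows that the IMI is related to the bandwidth $\Delta f$ and the signal-to-noise ratio (SNR) of the input signal, $\sigma_x^2/\sigma_n^2$. The narrower the bandwidth of the filter, the larger the increment of mutual information. In general, $f/\Delta f \gg 1$, but the SNR of the input signal $\sigma_x^2/\sigma_n^2$ is uncertain. For small signals, the SNR is less than 1 ($\sigma_x^2/\sigma_n^2 < 1$); since $f/\Delta f > 1$, the numerator of the exact ratio in Equation (30) still exceeds its denominator, so the IMI is positive. If $\sigma_x^2/\sigma_n^2 = 1$, then $\mathrm{IMI} \approx \frac{1}{2}\log\frac{f}{2\Delta f}$. For large signals, the SNR is generally much greater than 1 ($\sigma_x^2/\sigma_n^2 \gg 1$), and then $\mathrm{IMI} \approx \frac{1}{2}\log\frac{f}{\Delta f}$. The function of the filter is to filter out the noise contained in the signal. In all three cases the IMI is greater than zero, which means that, at the information level, the role of the filter is to increase the amount of information that can be obtained.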
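The dependence of the IMI on the bandwidth ratio and the input SNR can be tabulated with a few lines of code. This is a minimal sketch under the assumptions used in this case: the Gaussian mutual information $\frac{1}{2}\log(1 + \mathrm{SNR})$, an ideal unit-gain filter that leaves the signal power unchanged, and a noise power reduced by the factor $\Delta f / f$. The function name `imi_bits` and the sample values are hypothetical.

```python
import math

def gaussian_mi_bits(snr):
    """Mutual information in bits between a Gaussian signal and signal + Gaussian noise."""
    return 0.5 * math.log2(1.0 + snr)

def imi_bits(snr_in, bandwidth_ratio):
    """Increment of mutual information for an ideal bandpass filter.

    snr_in          -- input SNR, sigma_x^2 / sigma_n^2
    bandwidth_ratio -- f / delta_f, noise bandwidth over filter bandwidth (>= 1)
    Assumes the filter passes the whole signal, so only the noise power shrinks.
    """
    snr_out = snr_in * bandwidth_ratio
    return gaussian_mi_bits(snr_out) - gaussian_mi_bits(snr_in)

# Illustrative sweep over small, unit, and large input SNR.
for snr in (0.1, 1.0, 100.0):
    row = ", ".join(f"f/df={r}: {imi_bits(snr, r):.3f}" for r in (10, 100, 1000))
    print(f"SNR={snr}: {row}")
```

For a large SNR the printed values approach $\frac{1}{2}\log_2(f/\Delta f)$, while for a small SNR they remain positive but smaller, in line with the three cases discussed above.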

Case 2: Quantization Process
The quantization process is an important step in the measurement process. From the perspective of information acquisition, the quantization process is a process of information loss. For a continuous random variable, it requires infinitely high precision to describe itself in theory, and its information entropy is infinite. After quantization, the continuous random variable is transformed into a discrete random variable with limited precision, and its information entropy is finite.
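Given a continuous random variable $X$ with probability density function $p(x)$, the range of $X$ is evenly divided into intervals of length $\Delta$. Assume that $p(x)$ is continuous within each interval. According to the mean value theorem, there exists $x_i$ within each interval such that:
$$p(x_i)\Delta = \int_{i\Delta}^{(i+1)\Delta} p(x)\,dx. \tag{34}$$
After quantization, the discrete random variable $X_Q$ is obtained and its definition is:
$$X_Q = x_i, \quad \text{if } i\Delta \le X < (i+1)\Delta. \tag{35}$$
Then, the probability of $X_Q = x_i$ is:
$$p_i = \int_{i\Delta}^{(i+1)\Delta} p(x)\,dx = p(x_i)\Delta. \tag{36}$$
Therefore, the information entropy of $X_Q$ is:
$$H(X_Q) = -\sum_i p_i \log p_i = -\sum_i \Delta\, p(x_i) \log p(x_i) - \log \Delta. \tag{37}$$
If the function $p(x) \log p(x)$ is Riemann integrable, the first term in Equation (37) approaches $h(X) = -\int p(x) \log p(x)\,dx$ as $\Delta \to 0$, which means:
$$H(X_Q) + \log \Delta \to h(X), \quad \text{as } \Delta \to 0. \tag{38}$$
Since $\Delta \to 0$ is not achievable in practice, there is information loss in the quantization process. For an $N$-bit quantizer, $\Delta = 2^{-N}$, so with the log to base 2, $H(X_Q) \approx h(X) + N$, and the information loss $H(X|X_Q)$ can be defined as:
$$H(X|X_Q) = H(X) - H(X_Q) \approx H(X) - h(X) - N. \tag{39}$$
The amount of information obtained from $X$ with the quantization process is:
$$I(X; X_Q) = H(X_Q) - H(X_Q|X) = H(X_Q) \approx h(X) + N. \tag{40}$$
Therefore, the quantization process can be illustrated as shown in Figure 7. It can be found from Equations (39) and (40) that the larger $N$ is, the less information is lost and the more information is obtained.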
For example, consider a continuous random variable $X$ with a uniform distribution on $[0, 1]$. It is quantized by an $N$-bit quantizer and the process is simulated with MATLAB R2018b (developed by The MathWorks, Inc., with headquarters in Natick, Massachusetts, USA). $X$ is generated by the unifrnd function with 1,000,000 data points. The first 5000 data points of $X$ are shown in Figure 8a, and the probability density function of $X$ is shown in Figure 8b. It can be found that the simulated data of $X$ is not ideal: its probability density is significantly less than 1 when its value is close to 0 or 1. Here, five $N$-bit quantizers ($N = 8, 9, 10, 11, 12$) are used to quantize $X$, the corresponding information entropies of $X_Q$ are calculated, and the results are shown in Figure 8c. As $h(X) = 0$, according to Equation (37), the information entropy of $X_Q$ is equal to $N$ bits when the log is to base 2 (since $I(X; X_Q) = H(X_Q)$, the mutual information is also $N$ bits). It can be seen from Figure 8c that the simulation results are consistent with the theoretical values within the allowable error. This also shows that the more bits the quantizer has, the more information can be obtained, which is consistent with the theoretical analysis.
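The quantization experiment can be approximated with a short script. The sketch below is not the authors' MATLAB code; it is an independent Python illustration that draws uniform samples on $[0, 1)$ with the standard library generator, applies a uniform $N$-bit quantizer with $\Delta = 2^{-N}$, and estimates $H(X_Q)$ from the empirical bin frequencies. A smaller sample count (200,000 instead of 1,000,000) is assumed here only to keep the run time short, which slightly increases the downward bias of the estimate for large $N$.

```python
import math
import random
from collections import Counter

def quantized_entropy_bits(samples, n_bits):
    """Empirical entropy in bits of samples in [0, 1) after uniform n_bits quantization."""
    levels = 2 ** n_bits
    # Bin index floor(x / delta) with delta = 2**-n_bits; min() guards the upper edge.
    counts = Counter(min(int(x * levels), levels - 1) for x in samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

random.seed(0)  # fixed seed so repeated runs give the same estimates
samples = [random.random() for _ in range(200_000)]

estimates = {n: quantized_entropy_bits(samples, n) for n in (8, 9, 10, 11, 12)}
for n, h in estimates.items():
    print(f"N = {n:2d} bits: estimated H(X_Q) = {h:.4f} bits")
```

Each estimate should sit slightly below $N$ bits, because a finite sample never fills the bins perfectly evenly; the gap grows with the number of bins relative to the number of samples.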

Case 3: Cumulative Averaging Procedure
In some practical measurement applications, the noisy signal is sampled at high speed, then the cumulative averaging procedure is performed to the measured values to filter out the high frequency parts of noise to obtain higher measurement accuracy.
As shown in Figure 9, consider a Gaussian signal $S$ with zero mean and a Gaussian noise $N$ with zero mean, where $S$ and $N$ are independent of each other and $Y = S + N$. In a very short period of time $\Delta t$, the amplitude of the signal can be considered constant, while the amplitude of the noise varies. Therefore, the correlation coefficient between the signal amplitudes at any two moments in $\Delta t$ is 1, and for the noise, the correlation coefficient is zero. Assuming that the number of cumulative averaging times is $n$, and that the power of the signal and noise at each sampling moment $t_i$ is $P_{Si}$ and $P_{Ni}$ ($i = 1, 2, \cdots, n$), then after the cumulative averaging procedure, their powers become:
$$\bar{P}_S = E\left[\bar{A}_S^2\right] = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} A_{Si}\right)^2\right] = P_{Si}, \tag{41}$$
$$\bar{P}_N = E\left[\bar{A}_N^2\right] = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} A_{Ni}\right)^2\right] = \frac{P_{Ni}}{n}, \tag{42}$$
where $A_{Si}$ and $A_{Ni}$ are the amplitudes of the signal and noise at each sampling moment $t_i$, respectively; $\bar{A}_S$ and $\bar{A}_N$ are the average amplitudes of the signal and noise during $\Delta t$, respectively; and $\bar{P}_S$ and $\bar{P}_N$ are the average powers of the signal and noise during $\Delta t$, respectively.
Figure 9. Gaussian random signal with additive Gaussian noise processed by the cumulative averaging procedure.
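After the cumulative averaging procedure, the mutual information that can be obtained from the processed data is:
$$I = \frac{1}{2} \log\left(1 + \frac{\bar{P}_S}{\bar{P}_N}\right) = \frac{1}{2} \log\left(1 + \frac{n P_{Si}}{P_{Ni}}\right), \tag{43}$$
which, for $n > 1$, is greater than the mutual information before the cumulative averaging procedure, that is:
$$\frac{1}{2} \log\left(1 + \frac{n P_{Si}}{P_{Ni}}\right) > \frac{1}{2} \log\left(1 + \frac{P_{Si}}{P_{Ni}}\right). \tag{44}$$
This shows that the cumulative averaging procedure is equivalent to a digital filter, which improves the signal-to-noise ratio and increases the mutual information. It can also be seen from Equation (43) that the mutual information increases as the number of cumulative averaging times $n$ increases.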
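The effect of averaging on the signal and noise powers can be checked with a small Monte Carlo sketch. It assumes, as in the text, that the signal amplitude is constant within each averaging window while the noise samples are independent; the standard deviations, window sizes, and trial count below are hypothetical choices, and the helper names are illustrative.

```python
import math
import random

def averaged_powers(n, sigma_s, sigma_n, trials, rng):
    """Estimate signal and noise power after averaging n samples per window."""
    signal_power = noise_power = 0.0
    for _ in range(trials):
        s = rng.gauss(0.0, sigma_s)  # signal amplitude, constant within the window
        noise_mean = sum(rng.gauss(0.0, sigma_n) for _ in range(n)) / n
        signal_power += s * s        # the mean of n identical signal samples is s
        noise_power += noise_mean * noise_mean
    return signal_power / trials, noise_power / trials

def mutual_info_bits(p_signal, p_noise):
    """Gaussian mutual information in bits for the given signal and noise powers."""
    return 0.5 * math.log2(1.0 + p_signal / p_noise)

rng = random.Random(1)  # fixed seed for repeatability
for n in (1, 4, 16):
    ps, pn = averaged_powers(n, sigma_s=1.0, sigma_n=1.0, trials=5000, rng=rng)
    print(f"n = {n:2d}: signal power {ps:.3f}, noise power {pn:.4f}, "
          f"I = {mutual_info_bits(ps, pn):.3f} bits")
```

The estimated noise power should fall roughly as $1/n$ while the signal power stays near its original value, so the mutual information grows with $n$.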

Conclusions
In this paper, an information entropy-based modeling method for measurement systems is proposed. The modeling idea of the measurement system based on the viewpoints of information acquisition and uncertainty is presented. Based on this idea, the entropy balance equation based on the chain rule for entropy is proposed for system modeling. Then, information entropy-based models of measurement units and measurement systems are established with the entropy balance equation. Finally, three cases of typical measurement units or processes are analyzed using the proposed model. Compared with the existing modeling methods of measurement systems, the proposed method considers the modeling problem from the perspective of information and uncertainty, and focuses on the loss of the measurand information in the transmission process and the representation of the role of the measurement unit, such as filtering, amplification, and introduced noise. From error entropy, noise entropy, and mutual information between input and output of each unit, the changes of information can be intuitively reflected. If the system input is without noise, the mutual information between the input and output of the system directly reflects the amount of information acquired from measurand, which can be directly used as an evaluation index of the performance of the measurement system.
The proposed model can intuitively describe the processing and changes of information in the measurement system. These characteristics make it easier to gain a clear overall understanding of the concept of the measurement system and of how measurement is implemented with measurement units. Note that, although the proposed model has the above advantages, it was not developed from the perspective of metrological analysis. Compared with the existing models of the measurement system, the output of the proposed model cannot be directly applied to represent the measurement results in the traditional sense, and it does not retain the time information of the measurement result. The proposed model does not conflict with the existing models of measurement systems but complements them, thus further enriching the existing measurement theory.
Funding: This research was funded by the National Natural Science Foundation of China (61873101, 61771210).