Air Temperature Error Correction Based on Solar Radiation in an Economical Meteorological Wireless Sensor Network

Air temperature (AT) is a vital factor in meteorology, agriculture, military applications, etc., and is used for the prediction of weather disasters, such as drought, flood and frost. Many efforts have been made to monitor the temperature of the atmosphere, such as automatic weather stations (AWS). Nevertheless, due to the high cost of specialized AT sensors, they cannot be deployed at a high spatial density. A meteorological wireless sensor network relying on low-cost sensing nodes has been proposed to reduce the cost of AT monitoring. However, the temperature sensor on the sensing node is easily influenced by environmental factors. Previous research has confirmed that there is a close relation between AT and solar radiation (SR). Therefore, this paper presents a method to decrease the error of sensed AT by taking SR into consideration. In this work, we analyzed all of the AT and SR data collected in May 2014 and found the numerical correspondence between AT error (ATE) and SR. This correspondence was then used to calculate real-time ATE from real-time SR and to correct the error of AT in other months.

It is reasonable to take SR into account in the research of NodeAT correction. In this work, we found the relation among NodeAT, AwsAT and AwsSR and proposed an original approach to reduce the error of NodeAT based on the value of SR. This work is motivated by our real-time meteorological factor collecting project at NUIST [11]. We launched an ongoing WSN consisting of dozens of sensing nodes that continuously collect scientific data. In this application, the sensing nodes are exposed to solar radiation, because they are arranged in an open area. Therefore, it is urgent to discover the relation among NodeAT, AwsAT and AwsSR and to devise an effective method to correct the value of NodeAT.

Overview
There are two key parts in our methodology: one is data processing and analyzing, and the other one is data correcting.
In the first part, the data of NodeAT, AwsAT and AwsSR are processed and analyzed to get intermediate data, like NodeATinterp (interpolated NodeAT), NodeATinterpshift (shifted NodeATinterp), etc. These intermediate data are used to derive the functional relationship between AT error (ATE) and SR, which eventually gives the ATE-SR function.
Then, in the second part, the errors of NodeAT are corrected by using the ATE-SR function and AwsSR. There are three key correcting procedures: (1) correcting the time coordinates of NodeAT to eliminate the time phase difference between NodeAT and AwsAT; (2) calculating the theoretical AT error through the ATE-SR function according to real-time AwsSR; and (3) subtracting the AT error acquired in (2) from the AT data obtained in (1) to achieve the corrected AT.

Terms
In order to state the method conveniently, several abbreviations of terms are defined in this paper, as presented in Table 1.
Table 1. All terms involved in the manuscript have two forms. One is the generic form, as listed in the column "Abbreviations", which is used generally in the text. The other is the mathematical form, as listed in the column "Mathematical Forms", which is used to state the course of the methodology. Terms with the suffix interp mean that the data are processed by interpolation, and terms with the suffix shift mean that the data are processed by time shifting. NodeAT, AwsAT, AwsSR, NodeATinterp, AwsATinterp, AwsSRinterp, NodeATinterpshift, AwsSRinterpshift, NodeATE, NodeATtime, AwsATtime, AwsSRtime and Interptime are involved in the part for the data preprocessing and analysis. NodeAT, AwsSR, NodeATshift, AwsSRshift, CalcATE and NodeATcorr are used in the part for data correction.

Framework
There are two key steps in the part for the data preprocessing and analysis and three steps in the part for the data correction, as depicted in Figures 2 and 3.
Figure 3. The procedure of data correction.
In Figure 2, cubic spline interpolation is applied to the raw data $T_{node}$, $T_{aws}$ and $R_{aws}$ to get the interpolated data $T^{I}_{node}$, $T^{I}_{aws}$ and $R^{I}_{aws}$. Then, the time coordinates of $T^{I}_{node}$ and $R^{I}_{aws}$ are shifted to get $T^{IS}_{node}$ and $R^{IS}_{aws}$. $T^{IS}_{node}$ and $T^{I}_{aws}$ are used to compute $T^{E}_{node}$. Lastly, an ATE-SR tabular function $Err = A(r)$ is acquired by the correlation analysis of $T^{E}_{node}$ and $R^{IS}_{aws}$.
In Figure 3, $T^{S}_{node}$ is obtained by shifting the time coordinate of $T_{node}$ to decrease the time deviation between $T_{node}$ and $T_{aws}$. $R^{S}_{aws}$ is also obtained by shifting the time coordinate of $R_{aws}$ to fit the time coordinate of $T^{S}_{node}$. Then, $T^{E}_{calc}$ is calculated by using $R^{S}_{aws}$ as the parameter of the ATE-SR function $Err = A(r)$. Lastly, $T^{S}_{node}$ and $T^{E}_{calc}$ are used to calculate $T^{C}_{node}$.

Interpolation
Due to the different sampling frequencies of NodeAT, AwsAT and AwsSR, we use interpolation to bring these data onto the same time sampling points. The principle of interpolation used in our work is described in Appendix B.1. The cubic spline function is chosen as the interpolation function to process the raw data in our experiment. The advantages of the cubic spline function are described in Appendix B.2.
The cubic spline interpolation applied in our practical work is a good method to process the raw data of AT and SR, and it is available in MATLAB through the built-in function interp1 [46]. This function returns interpolated values of a 1D function at specific query points using spline interpolation. Vector x contains the sample points, and v contains the corresponding values v(x). Vector xq contains the coordinates of the query points [46]. For example, we can get $T^{I}_{node}$ by using this function: $T^{I}_{node}$ = interp1(ttnode, $T_{node}$, tinterp), where ttnode holds the sample times of NodeAT and tinterp holds the common query times.
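As a rough illustration of resampling series with different sampling rates onto a common time grid, the following Python sketch implements a natural cubic spline with the standard library only. It is not the MATLAB routine used in this work: MATLAB's spline method uses different end conditions from the natural ones (zero second derivative at both ends) assumed here, and the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a callable that evaluates the natural cubic spline through (xs, ys).

    xs must be strictly increasing. "Natural" end conditions are an assumption
    of this sketch: the second derivative is set to zero at both end points.
    """
    xs, ys = list(xs), list(ys)
    n = len(xs) - 1
    if n < 1:
        raise ValueError("at least two sample points are required")
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the quadratic coefficients c.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]

    # Forward elimination of the tridiagonal system.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]

    # Back substitution gives the coefficients of each cubic piece.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Pick the piece containing x (end pieces are reused outside the range).
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + dx * (b[j] + dx * (c[j] + dx * d[j]))

    return evaluate


# Hypothetical hourly SR samples (time in minutes), resampled every 10 min.
hourly_minutes = [0, 60, 120, 180]
hourly_sr = [0.0, 0.4, 1.1, 1.6]
spline = natural_cubic_spline(hourly_minutes, hourly_sr)
sr_10min = [spline(t) for t in range(0, 181, 10)]
```

The same callable can be built for each of the three raw series and evaluated on one shared list of query times, which is the role the common time coordinate plays in the text.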

Time Shift
Because the nodes are arranged outdoors, they are affected by radiation, including solar (or short-wave) radiation [35], terrestrial (or long-wave) radiation [47], atmospheric radiation [47], etc. In particular, the integrated sensor in the sensing node is significantly influenced by solar radiation. The trend of NodeAT and AwsSR in Figure 1 also shows that NodeAT changes along with AwsSR. However, the specialized temperature sensors in the thermometer screen do not suffer from solar radiation. AwsAT, which is sensed by the specialized sensors, can therefore be regarded as the actual temperature. Moreover, AT is an indicator of the energy of the atmosphere, and the atmospheric energy mostly derives from solar radiation. Because the transmission of energy from solar radiation to the atmosphere takes time, there is a sensing delay between the low-cost sensor and the specialized sensor. In conclusion, the trend of NodeAT is synchronized with AwsSR, but there is a phase difference between NodeAT and AwsAT. Hence, it is necessary to transform the time coordinates of $T^{I}_{node}$ and $R^{I}_{aws}$ to fit $T^{I}_{aws}$.
Suppose that the function $S = Shift(x)$ is used to transform the time coordinate. Applying this transformation to the time coordinate of $T^{I}_{node}$ yields the shifted series $T^{IS}_{node} = F^{S}(t)$, and $R^{IS}_{aws}$ is obtained from $R^{I}_{aws}$ in the same way.

Error Calculating
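The data processing executed above makes it possible to calculate the actual deviation between NodeAT and AwsAT. $T^{E}_{node}$ is obtained as the difference between the shifted, interpolated NodeAT and the interpolated AwsAT:
$$T^{E}_{node} = T^{IS}_{node} - T^{I}_{aws}$$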
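The time shift and the error calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each series is held as a dictionary from a time stamp in minutes to a value on a common grid, the delay is a free parameter (the experiment section reports a shift of 60 min into the future), and the function names are hypothetical.

```python
def shift_series(series, delay_min):
    """Move every sample delay_min minutes into the future."""
    return {t + delay_min: value for t, value in series.items()}


def node_at_error(node_at_interp, aws_at_interp, delay_min):
    """Deviation between the shifted NodeAT and AwsAT at shared time stamps."""
    shifted = shift_series(node_at_interp, delay_min)
    return {
        t: shifted[t] - aws_at_interp[t]
        for t in sorted(shifted)
        if t in aws_at_interp  # only time stamps present in both series
    }


# Hypothetical values: time in minutes, temperature in degrees Celsius.
node_at = {0: 20.0, 60: 25.0}
aws_at = {60: 19.0, 120: 22.0}
ate = node_at_error(node_at, aws_at, delay_min=60)
```

Time stamps that exist in only one of the two series after shifting are dropped, so the samples at the edges of the record do not contribute to the error series.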

Statistical Analysis
Comparing $T^{E}_{node}$ and $R^{IS}_{aws}$, we can find that there is a strong relation between them. Therefore, it is possible to obtain a numerical correspondence between them.
The corresponding relation of ATE and SR is established by using statistical analysis. Theoretically, every value of SR should map to a single value of ATE, but there are several ATE values corresponding to one SR value in the real experiment. Thus, statistical analysis is needed to obtain a single-valued mapping from SR to ATE. We calculated the average of the ATE values corresponding to every possible value of SR (from 0.01 to 3.60, increasing by 0.01) in May and acquired a tabulation between ATE and SR. Then, the ATE-SR function $Err = A(r)$ at $r \in \Psi = \{0.01, 0.02, \ldots, 3.60\}$ is constituted based on this tabulation, where r is every possible value of SR in May and Err is the corresponding ATE. The real-time ATE can be calculated by using real-time SR as the input parameter r in $Err = A(r)$.
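The tabulation step amounts to averaging ATE within each SR bin of width 0.01. The Python sketch below shows one way to do this; it is not the authors' implementation. The function names are hypothetical, and two behaviours are assumptions of the sketch rather than statements from the text: SR values that round to zero return an error of zero, and an SR value whose bin was never observed falls back to the nearest observed bin.

```python
from collections import defaultdict


def build_ate_sr_table(sr_values, ate_values, step=0.01):
    """Average ATE for each SR bin; bins are indexed by round(sr / step)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for sr, ate in zip(sr_values, ate_values):
        bin_index = round(sr / step)  # integer index avoids float dictionary keys
        if bin_index >= 1:            # the tabulation starts at SR = 0.01
            sums[bin_index] += ate
            counts[bin_index] += 1
    return {b: sums[b] / counts[b] for b in sums}


def lookup_ate(table, sr, step=0.01):
    """Evaluate Err = A(r) from the table (fallback rules are assumptions)."""
    bin_index = round(sr / step)
    if bin_index < 1 or not table:
        return 0.0                    # assumed: no radiation, no correction
    if bin_index in table:
        return table[bin_index]
    nearest = min(table, key=lambda b: abs(b - bin_index))
    return table[nearest]             # assumed: nearest observed bin


# Hypothetical paired samples of SR and ATE.
table = build_ate_sr_table([0.5, 0.5, 1.0, 0.0], [1.0, 3.0, 4.0, 9.0])
```

Indexing the bins by an integer rather than by the SR value itself avoids mismatches caused by floating-point representations of multiples of 0.01.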

Time Correction
The time coordinates of $T_{node}$ and $R_{aws}$ can be corrected by using the function $S = Shift(x)$, which gives $T^{S}_{node}$ and $R^{S}_{aws}$. Thus, the trends of $T^{S}_{node}$, $R^{S}_{aws}$ and $T_{aws}$ are synchronized with each other.

Error Calculation
The value of $T^{E}_{calc}$ can be calculated according to the value of $R^{S}_{aws}$, using the ATE-SR function $Err = A(r)$:
$$T^{E}_{calc} = A(R^{S}_{aws})$$

Value Correction
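The corrected AT $T^{C}_{node}$ is obtained by subtracting the calculated error from the time-shifted NodeAT:
$$T^{C}_{node} = T^{S}_{node} - T^{E}_{calc}$$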

Experiment Foundation
We designed the sensing nodes and used them to carry out the ongoing WSN of the meteorological factor monitoring project. Our sensing nodes were designed based on WSN technology, on-board sensors, ZigBee, integrated circuits, etc. The sensing node is also equipped with a solar panel to guarantee the supply of electric power. The sensing node charges its battery pack in the daytime and consumes the stored electric energy in the evening. That is also the reason why we tend to arrange the node in the open air. We have deployed several meteorological WSNs in Beijing, Xi'an, Wuhan, Changsha and Nanjing in China. We can use the on-board temperature and relative humidity sensor integrated in the node to sense real-time AT. Besides the on-board sensor, the node can also be connected to specialized meteorological sensors, like an AT sensor, anemometer, pyranometer, etc., to collect different kinds of meteorological factors. Figure 4a presents the internal circuit structure of the sensing node, which contains the power module, on-board sensor, interface circuits, etc. We adopt the SHT15 as the on-board temperature sensor, which is integrated on the bottom circuit board, as shown in Figure 4b. In Figure 4a,b, the universal interfaces in the node are highlighted by the big ellipses. It is convenient for users to connect external meteorological sensors through these ports. However, to save on the costs of the project, we adopt the on-board sensor to collect AT in most sensing nodes.
The NodeAT employed in this experiment was sensed by the on-board temperature sensor in node No. 105. This node belongs to the wireless meteorological sensing network deployed on the campus of NUIST, as shown in Figure 5a.
The numerical labels marked on the map represent the nodes deployed on the campus. We circle the label of node No. 105 with the small ellipse in Figure 5a. This node was deployed in the AWS of NUIST. Because SR is measured at the AWS, AwsSR can be taken as the solar radiation to which node No. 105 was exposed. Figure 5b shows a single node that collects only AT by using the on-board temperature sensor. The node in Figure 5c is connected to the anemometer, pluviometer and other meteorological sensors to collect many kinds of meteorological factors. This meteorological WSN at NUIST was set up in August 2013 and has been collecting meteorological data continuously since then. There are sufficient data for us to execute the experiment. This research also relies on the standard data supplied by the AWS at NUIST. The AWS at NUIST (depicted in Figure 6a) is a national base station, whose number is 59606. This weather station was founded according to the AWS construction technical standard. Thus, the meteorological data supplied by it can be treated as the standard data. It is equipped with all of the infrastructure demanded by a standard weather station. The meteorological AT sensor used in this AWS is the HMP45D (Figure 6b), which is placed in the thermometer screen to avoid solar radiation. The pyranometer TBQ-2-B (Figure 6c) is applied in the AWS to sense global SR.
In the actual experiment, there are three kinds of raw data acting as the basic data: (1) NodeAT sensed by SHT15 in node No. 105; (2) AwsAT sensed by the HMP45D in the AWS; and (3) AwsSR sensed by the TBQ-2-B in the AWS. We first carried out the course of data preprocessing and analysis in May and obtained ATE-SR tabulation, then accomplished the correction of NodeAT in other months and, lastly, did a performance evaluation to evaluate the efficiency of the method. The analysis work was based on the data of NodeAT, AwsAT and AwsSR collected directly by sensors in May. After this procedure, ATE-SR tabulation was established and was used to correct NodeAT sensed by node No. 105 in June to December, employing AwsSR as an input parameter.

Data Process and Correction
All of the meteorological data involved in this research in May had to be dealt with to fulfill the course of data preprocessing and analysis. However, we only took one day (13 May 2014) to demonstrate the course of data processing.
As illustrated schematically in Figure 7, the sample points of NodeAT, AwsAT and AwsSR are different from each other, so it is difficult to calculate the deviation between NodeAT and AwsAT directly. Moreover, the accumulative time interval of AwsSR is 60 min, and the sample points of AwsSR are too few to accomplish the analysis. In order to carry out the statistical analysis between ATE and SR, we must acquire more sample points of SR. Thus, cubic spline interpolation was used to unify the data resolution and to add sample points of NodeAT, AwsAT and AwsSR. As depicted in Figure 8, NodeATinterp, AwsATinterp and AwsSRinterp were acquired with an identical time coordinate. Then, we obtained the original ATE between NodeATinterp and AwsATinterp (NodeATEori) to find the correlation between ATE and SR. However, as we can see in Figure 8, there is no obvious correlation between NodeATEori and AwsSRinterp. What is more, the values of NodeATEori are not close to zero when AwsSRinterp is zero. Theoretically speaking, the deviations between NodeAT and AwsAT should be zero when the value of SR is zero, for there is no solar radiation in the evening. For this reason, we tried shifting NodeATinterp and AwsSRinterp to the future by 60 min to approach AwsATinterp, as plotted in Figure 9. It is easy to see that the pattern of NodeATinterpshift becomes more correlated with AwsATinterp, and NodeATE becomes more closely related to AwsSRinterpshift.
In order to make the effect of shifting easier to observe, we enlarged the ATE-SR patterns of Figures 8 and 9 by narrowing the limit of the Y-axis, changed the Y-tick [46] of SR and put them into one figure. As displayed in Figure 10, NodeATE, which is obtained after time shifting, visibly changes along with the trend of SR. Then, the relationship between ATE and SR was acquired by calculating the average value of NodeATE in May corresponding to every value of SR (from 0.01 to 3.60, increasing by 0.01) using statistical analysis. The corresponding relation of ATE and SR is presented in Table 2.
Then, the ATE-SR function Err = A(r) was obtained according to the relationship. Therefore, it is feasible to work out CalcATE by using the value of SR as a parameter in Err = A(r).
To correct NodeAT, we followed three steps: (1) correcting the time coordinate of NodeAT by shifting them to the future by 60 min and getting NodeATshift; then (2) calculating CalcATE by using AwsSRshift as an input parameter in the ATE-SR function; and (3) obtaining the corrected data NodeATcorr by subtracting CalcATE from NodeATshift.
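The three correction steps can be put together as in the following Python sketch. It is an illustration under assumptions, not the authors' code: NodeAT and AwsSR are dictionaries keyed by time stamps in minutes, the ATE-SR table is a dictionary keyed by SR rounded to two decimals, SR values missing from the table are treated as zero error, and all names and numbers are hypothetical.

```python
def correct_node_at(node_at, aws_sr, ate_table, delay_min=60):
    """Return NodeATcorr keyed by the shifted time stamps.

    Step 1: shift NodeAT (and the matching AwsSR sample) delay_min minutes
            into the future.
    Step 2: look up CalcATE for the shifted SR value in the ATE-SR table.
    Step 3: subtract CalcATE from the shifted NodeAT.
    """
    corrected = {}
    for t, temperature in node_at.items():
        sr = aws_sr.get(t)
        if sr is None:
            continue  # no SR sample for this time stamp, so no correction
        # Assumption: SR values absent from the table give zero error.
        calc_ate = ate_table.get(round(sr, 2), 0.0)
        corrected[t + delay_min] = temperature - calc_ate
    return corrected


# Hypothetical inputs: time in minutes, AT in degrees Celsius.
node_at = {0: 20.0, 10: 21.0}
aws_sr = {0: 0.0, 10: 0.5}
ate_table = {0.5: 1.5}
node_at_corr = correct_node_at(node_at, aws_sr, ate_table)
```

Because NodeAT and AwsSR are shifted by the same amount, the SR sample taken at the original time stamp is the one paired with each NodeAT sample, and only the output time stamp moves.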
We used this method to correct NodeAT in June to December in different seasons. As is plotted in Figures 11 to 17, NodeATcorr approximates AwsAT very well. Obviously, this method is useful to reduce the data error of NodeAT.

Performance Evaluations
We calculated several kinds of statistical characterizations to evaluate the performance of this correcting method. Maximal and mean error and the standard deviation of error were computed to estimate the correcting efficiency of AT error, and the correlation coefficient was given to show the degree of correlation between NodeAT and AwsAT in different correcting phases.
The error correcting efficiency was also calculated. To make the evaluation more objective, we used the absolute value of the error to calculate the maximal error and mean error, whereas the signed error (including negative values) was used to calculate the standard deviation.
As we can see in Tables 3 and 4, the values of maximal error and mean error decrease progressively along with the correcting course. Comparing these two statistical characterizations before and after shifting, we can find that the process of time shifting reduces part of the error, which means that the time correction is effective. A larger part of the error is removed by the subsequent value correction. Finally, the error is largely reduced by these two correcting processes. Table 5 presents the standard deviation of the error on every whole day. The values of the standard deviation also decrease progressively along with every step of correction. In Table 6, we present the Pearson product-moment correlation coefficient between NodeAT and AwsAT in three different phases. In statistics, the Pearson product-moment correlation coefficient (sometimes referred to as the PPMCC, or PCC, or Pearson's r) is a measure of the linear correlation (dependence) between two variables X and Y, giving a value between +1 and −1 inclusive, where 1 is the total positive correlation, 0 is no correlation and −1 is total negative correlation [48]. It is widely used in the sciences as a measure of the degree of linear dependence between two variables and can be used to measure the correlation between NodeAT and AwsAT. As we can see in the column "Original Data" in Table 6, the correlation coefficients between uncorrected NodeAT and standard AwsAT are not high. This means a low degree of correlation between unprocessed data and standard data. However, the coefficient was improved after time shifting and further improved after the process of value correcting. Eventually, there is a higher correlation coefficient between corrected NodeAT and standard AwsAT.
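The statistics used in this evaluation can be computed as in the Python sketch below. The function names and sample values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was applied.

```python
import math
import statistics


def error_statistics(node_at, aws_at):
    """Maximal and mean absolute error, and standard deviation of signed error."""
    errors = [n - a for n, a in zip(node_at, aws_at)]
    abs_errors = [abs(e) for e in errors]
    return {
        "max_error": max(abs_errors),
        "mean_error": statistics.fmean(abs_errors),
        # Assumption: population standard deviation of the signed error.
        "std_error": statistics.pstdev(errors),
    }


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length series.

    Undefined (division by zero) if either series is constant.
    """
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical daily series on a shared time grid (degrees Celsius).
node_at = [2.0, 4.0, 6.0]
aws_at = [1.0, 2.0, 3.0]
stats = error_statistics(node_at, aws_at)
r = pearson_r(node_at, aws_at)
```

Running the same two functions on the original, the time-shifted and the value-corrected NodeAT gives one row of each evaluation table per phase.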

Conclusions
In this paper, an effective error correcting method for NodeAT is presented. According to the results, more than 60% of the error of NodeAT can be corrected by using this approach, and it can be applied to the real-time AT monitoring system in a practical scenario.
This study has confirmed that SR plays an extremely vital role in the correcting scheme of NodeAT. However, the ATE-SR function, which is based on discrete data, potentially can be perfected. What is more, this method has to rely on the data of SR sensed by a pyranometer. The cost of SR sensing is still high. In order to reduce the expense of our project further, the data of the voltage of the solar cell panel equipped in the sensing node will be considered to replace the data of SR in the future.

unit time and per unit area. Irradiance is the same as radiant flux density or flux. Since irradiance will mean the rate of incident energy, its units will be W·m⁻², and the units of irradiation will be kJ·m⁻²·h⁻¹ or MJ·m⁻²·day⁻¹. Radiation will be employed in a generic sense, and its meaning should be treated as irradiation in this paper. Intensity means irradiance from a particular direction and contained within a unit solid angle. Intensity is expressed in W·m⁻²·sr⁻¹ on an area normal to the direction of radiation. It is pertinent to point out that the term "intensity" is often loosely employed. For example, in meteorology, intensity is used for radiative flux, as well as for the quantity of radiation arriving from all over the sky dome [44].

A.2. Solar Radiation
The Sun is the star closest to Earth, and its radiant energy is practically the only source of energy that influences atmospheric motions and our climate [44]. Because solar radiation is attenuated before reaching the ground, the maximum radiation on the Earth is received under a cloudless and clear sky [44].
When solar radiation enters the Earth's atmosphere, a part of the incident energy is removed by scattering and a part by absorption. The scattered radiation is called diffuse radiation. A portion of this diffuse radiation goes back to space, and a portion reaches the ground. The radiation arriving on the ground directly in line with the solar disk is called direct or beam radiation [44]. The quantity of total direct and diffuse radiation arriving at the Earth's surface is important to the temperature variation [49] on the ground. In order to quantify SR on a horizontal surface, global solar irradiance is used. Global irradiance is the sum of the beam plus diffuse irradiance on a horizontal surface and can be measured by radiometers with hemispherical fields of view, called pyranometers [44]. Additionally, global solar irradiation is the integral of solar irradiance over a given time interval.

B.1. Interpolation
Suppose that the function $y = f(x)$ is known at the $N + 1$ points $(x_0, y_0), \ldots, (x_N, y_N)$, where the values $x_k$ are spread out over the interval $[a, b]$ and satisfy:
$$a \le x_0 < x_1 < \cdots < x_N \le b \quad \text{and} \quad y_k = f(x_k)$$
A polynomial $P(x)$ of degree $N$ will be constructed that passes through these $N + 1$ points. In the construction, only the numerical values $x_k$ and $y_k$ are needed. Hence, the higher-order derivatives are not necessary. The polynomial $P(x)$ can be used to approximate $f(x)$ over the entire interval $[a, b]$ [50].
Situations in statistical and scientific analysis arise where the function $y = f(x)$ is available only at $N + 1$ tabulated points $(x_k, y_k)$, and a method is needed to approximate $f(x)$ at non-tabulated abscissas. When $x_0 < x < x_N$, the approximation $P(x)$ is called an interpolated value. If either $x < x_0$ or $x_N < x$, then $P(x)$ is called an extrapolated value [50].
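One common way to construct a polynomial through the tabulated points is the Lagrange form, sketched below in Python. The appendix does not state which construction is intended, so the choice of the Lagrange form, the function name and the sample points are assumptions made for illustration.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the polynomial of degree <= N through (xs[k], ys[k]).

    Uses the Lagrange form P(x) = sum_k y_k * prod_{j != k} (x - x_j) / (x_k - x_j).
    The abscissas xs must be distinct.
    """
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        term = yk
        for j, xj in enumerate(xs):
            if j != k:
                term *= (x - xj) / (xk - xj)
        total += term
    return total


# Three tabulated points taken from f(x) = x**2; a degree-2 polynomial
# reproduces this f at non-tabulated abscissas as well.
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 4.0]
interpolated = lagrange_interpolate(xs, ys, 1.5)   # inside [x_0, x_N]
extrapolated = lagrange_interpolate(xs, ys, 3.0)   # outside [x_0, x_N]
```

Only the tabulated values enter the computation, which matches the remark that no derivatives of the underlying function are needed.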

B.2. Spline Function
A spline function is a function that consists of polynomial pieces joined together with certain smoothness conditions [51].
Polynomial interpolation for a set of $N + 1$ points $\{(x_k, y_k)\}_{k=0}^{N}$ is frequently unsatisfactory from a practical point of view, and other functions need to be considered [50,51].
Another method is to piece together the graphs of lower-degree polynomials $S_k(x)$ and interpolate between the successive nodes $(x_k, y_k)$ and $(x_{k+1}, y_{k+1})$. The two adjacent portions of the curve $y = S_k(x)$ and $y = S_{k+1}(x)$, which lie above $[x_k, x_{k+1}]$ and $[x_{k+1}, x_{k+2}]$, respectively, pass through the common knot $(x_{k+1}, y_{k+1})$, and the set of functions $\{S_k(x)\}$ forms a piecewise polynomial curve, which is denoted by $S(x)$ [50].
According to the degree of the piecewise polynomial, spline functions can be classified as first-degree splines (whose pieces are linear polynomials joined together to achieve continuity) [51], quadratic splines (which are continuously differentiable piecewise quadratic functions, where quadratic includes all linear combinations of the basic monomials $x \mapsto 1, x, x^2$) [51,52], cubic splines [51,53] and higher-degree splines [51]. Compared with splines of other degrees, the cubic spline function has two continuous derivatives everywhere. At each knot, three continuity conditions are imposed. Since $S$, $S'$ and $S''$ are continuous, the graph of the function will appear smooth to the eye. Discontinuities, of course, may occur in the third derivative, but cannot be easily detected visually, which is one reason for choosing degree three. Experience has shown, moreover, that using splines of a degree greater than three seldom yields any advantage. For technical reasons, odd-degree splines behave better than even-degree splines (when interpolating at the knots) [51].
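Written out for two adjacent pieces, the three continuity conditions at an interior knot can be stated as follows. The coefficient names $a_k$, $b_k$, $c_k$ and $d_k$ are introduced here only for illustration and do not appear in the cited sources' notation as reproduced in this appendix.

```latex
% Cubic piece on [x_k, x_{k+1}] (coefficient names are illustrative)
S_k(x) = a_k + b_k (x - x_k) + c_k (x - x_k)^2 + d_k (x - x_k)^3

% Continuity of S, S' and S'' at the interior knot x_{k+1}
S_k(x_{k+1})   = S_{k+1}(x_{k+1}), \qquad
S_k'(x_{k+1})  = S_{k+1}'(x_{k+1}), \qquad
S_k''(x_{k+1}) = S_{k+1}''(x_{k+1})
```

The third derivative $S_k'''(x) = 6 d_k$ is constant on each piece, so it can jump from one piece to the next, which is the discontinuity mentioned above.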