Novel Method for Measuring the Heat Collection Rate and Heat Loss Coefficient of Water-in-Glass Evacuated Tube Solar Water Heaters Based on Artificial Neural Networks and Support Vector Machine

Energies 2015, 8 (Open Access)

The determination of the heat collection rate and heat loss coefficient is crucial for evaluating in-service water-in-glass evacuated tube solar water heaters. However, direct determination requires complex detection devices and a series of standard experiments, which costs considerable time and manpower. To address this problem, we propose machine learning models, including artificial neural networks (ANNs) and a support vector machine (SVM), to predict the heat collection rate and heat loss coefficient without direct determination. Parameters that can be easily obtained by “portable test instruments” were set as independent variables, including tube length, number of tubes, tube center distance, heat water mass in tank, collector area, final temperature and angle between tubes and ground, while the heat collection rate and heat loss coefficient determined by the detection device were set as dependent variables, respectively. Nine hundred fifteen samples from in-service water-in-glass evacuated tube solar water heaters were used for training and testing the models. Results show that the multilayer feed-forward neural network (MLFN) with 3 nodes is the best model for the prediction of heat collection rate and the general regression neural network (GRNN) is the best model for the prediction of heat loss coefficient, owing to their low root mean square (RMS) errors, short training times, and high prediction accuracies (under the tolerances of 30%, 20%, and 10%, respectively).


Introduction
Solar water heaters (SWHs) are the most popular way to make use of solar energy, a consequence of their technological feasibility and the economic benefits they afford. Typically, the system uses solar collectors and concentrators to gather, store, and use solar radiation to heat air or water in domestic, commercial, or industrial plants [1]. Of the three types of stationary collector [2], evacuated tube solar collectors have substantially lower heat loss coefficient and cost than standard flat plate collectors [3]. In China, all-glass evacuated tubular solar water heaters are widely used due to their excellent thermal performance, convenient installation, and easy transportability [4,5]. A preliminary investigation showed that all-glass evacuated tube solar collectors took an 88% share of the market in 2003 and 95% in 2009 [6]. The annual production of evacuated solar tubes in China, expanding at an annual average growth of 30% in recent years [7], was estimated to be more than 20 million tubes in 2001 and 350 million tubes in 2009 [8].
Many researchers have undertaken significant studies investigating and evaluating the thermal performance of water-in-glass evacuated tube solar water heaters both experimentally and theoretically [9][10][11][12]. Tang et al. [13] developed a detailed mathematical procedure to estimate the daily collectible radiation from a single tube of all-glass evacuated tube solar collectors based on solar geometry and knowledge of two-dimensional radiation transfer, and the results showed that the annual collectible radiation on a tube is affected by many factors such as collector type, central distance between tubes, size of solar tubes, tilt and azimuth angles, and use of a diffuse flat reflector. Wang et al. [14] performed an experiment and simulation study on a new type of all-glass evacuated tubular solar air heater with a simplified compound parabolic concentrator (CPC), and the results showed that the whole system had an outstanding high-temperature collecting performance and that the new simplified simulation model can meet the general requirements of engineering calculations. Çomaklı et al. [11] optimized the size of solar collectors and storage tanks to design more economic and efficient solar water heating systems, according to Turkish conditions and relevant Turkish standards, with experiments and simulations. Porras-Prieto et al. [15] discussed the influence of required tank water temperature on the energy performance and water withdrawal potential of a solar water heating system equipped with a heat pipe evacuated tube collector, and the results indicated that, as the required tank water temperature increases, the net energy that can be stored by the system falls, with differences of over 1000 Wh/m²/day between required tank water temperatures of 40 and 80 °C at a solar radiation input of 8000 Wh/m²/day (system efficiency range 56%-73%). Zhang et al. [16] investigated the higher coefficient of thermal performance for water-in-glass evacuated tube solar water heaters by experimental testing and determined that the optimum ratio of tank volume to collector area for a solar water heater is 57 to 72 L/m².
Artificial neural networks (ANNs) have been used in many renewable energy systems in the last two decades, especially for solar thermal energy systems and solar radiation. In one work by Kalogirou et al. [17], the objective was to train an ANN to predict the useful energy extracted from solar domestic hot water systems and the temperature rise of the stored water with minimal input data. In another study by Kalogirou et al. [18], different ANNs were used to predict the collector parameters describing the instantaneous efficiency, the incidence angle modifier coefficients at longitudinal and transverse directions, the collector time constant, the collector stagnation temperature, and the collector heat capacity. Kalogirou et al. also used ANNs to predict the performance of large solar systems. The ANN method was used to predict the expected daily energy output for typical operating conditions, as well as the temperature level that the storage tank can reach by the end of the daily operation cycle [19]. In addition, Kalogirou et al. used the neural network method in the long-term performance of thermosiphon domestic solar water heating systems [20] and to model the starting-up of a solar steam generator [21]. An application of ANNs to predict the in situ daily performance of solar air collectors was also presented by Lecoeuche et al. In this study, the output of the ANN was the outlet temperature of the collector, and inputs to the network were the solar radiation and the thermal heat loss coefficients [22].
However, the most important coefficients of thermal performance (CTP), the heat collection rate and heat loss coefficient, are very difficult to determine because the test conditions to assess the thermal performance of SWHs should follow GB/T 19141-2011 [23]: (i) The test period is 8 h, including 4 h before solar noon and 4 h after; (ii) The daily solar irradiation shall be higher than 16 MJ/m²; (iii) The daily average surrounding temperature shall be between 8 and 39 °C; (iv) The daily average surrounding air speed shall be less than 4 m/s; (v) The initial temperature in the storage tank shall be 20 ± 1 °C.
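The five test conditions listed above can be read as a simple screening rule for candidate test days. The following is a minimal sketch of such a check, not part of the standard or of the authors' procedure; the record type, field names, and the treatment of boundary values (for example, reading "between 8 and 39 °C" as inclusive) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DayRecord:
    """Conditions recorded on one outdoor test day (hypothetical field names)."""
    hours_before_noon: float   # tested hours before solar noon
    hours_after_noon: float    # tested hours after solar noon
    irradiation_mj_m2: float   # daily solar irradiation, MJ/m^2
    mean_ambient_c: float      # daily average surrounding temperature, deg C
    mean_wind_m_s: float       # daily average surrounding air speed, m/s
    initial_tank_c: float      # initial storage tank temperature, deg C


def failed_conditions(day):
    """Return the labels (i)-(v) of the conditions that the day does not meet."""
    checks = {
        "i": day.hours_before_noon == 4 and day.hours_after_noon == 4,
        "ii": day.irradiation_mj_m2 > 16,
        "iii": 8 <= day.mean_ambient_c <= 39,      # inclusive bounds are an assumption
        "iv": day.mean_wind_m_s < 4,
        "v": abs(day.initial_tank_c - 20) <= 1,    # 20 +/- 1 deg C, inclusive (assumption)
    }
    return [label for label, ok in checks.items() if not ok]
```

A day is usable for the standard test only when the returned list is empty, which illustrates why accumulating enough valid days can take a long time.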
According to the test conditions above, the minimum time required to obtain the heat collection rate of a new solar water heater in Beijing, China is 15 days, which is time-consuming and strenuous [24]. Generally, the detection device is used for the determination. In addition, the 915 water-in-glass evacuated tube solar water heaters studied here come from one company. Because detection is costly, the company entrusted us to develop a model to predict the operating performance of its heaters.
The "portable test instruments" (Table 1) that we employed here are convenient and effective for determining the relevant parameters of water heaters, but they cannot determine the heat collection rate and heat loss coefficient directly. To address these problems, we propose a series of machine learning models based on experimental data to predict the heat collection rate and heat loss coefficient of solar water heaters. We used the "portable test instruments" to measure the independent variables, which are all relevant to the values of heat collection rate and heat loss coefficient: tube length, number of tubes, tube center distance, heat water mass in tank, collector area, final temperature and angle between tubes and ground. We then input these independent variables into the ANNs and obtain the predicted heat loss coefficient and heat collection rate. The heat loss coefficient and the heat collection rate in field measurement were determined by a "PDT2013-1" detection device (as shown in Figure 1), developed by the China Academy of Building Research, one of the organizations cooperating on this paper. The final temperature is defined as the stable final temperature in the tank after the heat loss test. The temperature can be measured by a digital thermoelectric thermometer with a thermocouple, with the sampling point placed at the outlet of the SWH. The heat collection rate and heat loss coefficient obtained from the "PDT2013-1" and relevant equations were set as dependent variables. The solar irradiation and the ambient temperature differ under different climatic conditions, and the final temperature is determined to a great extent by the solar irradiation and the ambient temperature. Therefore, different climatic conditions result in different final temperatures and, in turn, different heat collection rates. This is why we chose only the final temperature as an independent variable, without taking the solar irradiation and the ambient temperature into account in our model. ANNs and a support vector machine (SVM) were developed to "learn" the experimental data and give predictions of the two dependent variables. Comparisons among different models were made to find the most suitable model for the prediction of heat collection rate and heat loss coefficient.

Experimental
Following the determination methods for the independent and dependent variables in this research, 915 water-in-glass evacuated tube solar water heaters (in service for one year) were measured with the "portable test instruments" and the PDT2013-1 (China Academy of Building Research, Beijing, China) detection device developed by the national center for quality supervision and testing of solar heating systems. Forty-eight PDT2013-1 detection devices were employed to measure the heat collection rate and heat loss coefficient (USL) simultaneously. Table 2 shows the statistical results of the experimental data.

Artificial Neural Networks (ANNs)
ANNs [25][26][27] are powerful machine learning approaches that estimate and approximate functions based on input values. Interconnected neural networks are usually made up of neurons that can calculate values from inputs and adapt to different situations. Therefore, ANNs are widely used in numerical prediction and pattern recognition. ANNs have become very popular for inferring a function from observations, especially when the objects of study are too complex to be handled by the human brain. In our study, two kinds of ANNs were used for model development: multilayer feed-forward neural networks (MLFNs) and general regression neural networks (GRNNs).

Multilayer Feed-Forward Neural Networks (MLFNs)
MLFN, trained with a back-propagation (BP) learning algorithm, is one of the most popular ANNs in scientific research [28][29][30]. Figure 2 is the schematic structure of an MLFN, with input, hidden, and output layers. Each neuron interconnects with all neurons in the contiguous layer, and each pair of connected neurons is linked via adaptable synaptic weights (Figure 3). As shown in Figure 3, the connection between the $i$th and $j$th neurons is characterized by the weight coefficient $\omega_{ij}$ and the threshold coefficients $\vartheta_i$ and $\vartheta_j$ [28]. The weight coefficient reflects the degree of importance of the given connection in the neural network. $x_i$ and $x_j$ are the output values of the $i$th and $j$th neurons, respectively. Knowledge is mainly stored as a set of connection weights, corresponding to synapse efficacy in the human brain [31,32]. Training is the modification of connection weights until the network satisfies the user's needs; during the training process, weights are adjusted in order to acquire the desired output [33].
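To make the role of the weight and threshold coefficients concrete, the forward pass of a small MLFN can be sketched as follows. This is an illustrative sketch only: the logistic activation function, the linear output neuron, and all function and parameter names are assumptions, since the text does not specify how the networks were configured internally.

```python
import math


def sigmoid(xi):
    """Logistic activation (assumed; the source does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-xi))


def layer_output(inputs, weights, thresholds):
    """Output of each neuron i: x_i = f(threshold_i + sum_j weight_ij * x_j)."""
    return [
        sigmoid(theta + sum(w * x for w, x in zip(row, inputs)))
        for row, theta in zip(weights, thresholds)
    ]


def mlfn_forward(inputs, hidden_w, hidden_t, out_w, out_t):
    """One hidden layer followed by a single output neuron."""
    hidden = layer_output(inputs, hidden_w, hidden_t)
    # Linear output neuron (assumption) so predictions are not confined to (0, 1).
    return out_t + sum(w * h for w, h in zip(out_w, hidden))
```

With seven inputs and three hidden nodes, `hidden_w` is a 3 × 7 list of lists; back-propagation training would then adjust these weights and thresholds to reduce the error between the returned value and the measured one.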

General Regression Neural Network (GRNN)
GRNN was first designed by Specht [34]. It has a strong capacity for prediction and pattern recognition [35]. The features of the GRNN are fast learning, consistency, and optimal regression with a large number of samples [35]. Like the MLFN, the GRNN consists of a series of interconnected neurons and layers. A typical GRNN has four layers: input, pattern, summation, and output, which are shown in Figure 4

Support Vector Machine (SVM)
A support vector machine (SVM) is a machine learning algorithm based on statistical learning theory [36,37], which mainly uses the central concept called the "kernel" for learning tasks. Kernel machines provide a modular framework that can be adapted to various tasks and domains with the use of different kernel functions (e.g., linear, polynomial, radial basis, or sigmoid) and the base algorithm [38]. Due to its principles, SVM performs well in solving both prediction and classification problems. Figure 5 shows the main structure of SVM. The letter "K" represents kernels [39]. As can be seen from Figure 5, the SVM consists of small subsets extracted from the training data by the relevant algorithm. For prediction and classification, choosing suitable kernel functions and appropriate parameters is important for good prediction accuracy. There are currently many software packages that are helpful for developing the SVM [40,41].
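The four kernel families named above can be written down directly. The sketch below uses a common parametrization with `gamma`, `coef0`, and `degree`; these parameter names and default values are assumptions for illustration and are not the settings used in this study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    """Radial basis kernel: decays with the squared distance between u and v."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)
```

Each function maps a pair of input vectors, such as two sets of the seven measured heater parameters, to a similarity value; swapping the kernel changes the model without changing the base algorithm.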

Model Development
According to the determination capacity of "portable test instruments", for an in-service water-in-glass evacuated tube solar water heater, precise values of tube length, number of tubes, tube center distance, heat water mass in tank, collector area, final temperature and angle between tubes and ground can be easily obtained outdoors, while the heat collection rate and heat loss coefficient can only be determined by the detection device after the heater has been dismantled. To avoid complex disassembly and obtain the heat collection rate and heat loss coefficient in real time, we use machine learning techniques, including ANNs and SVM, to develop a series of prediction models for the heat collection rate and heat loss coefficient. Because of the large number of data groups acquired from experiments, the testing set was large enough to evaluate the performance of the models. Also, if the number of data groups in the training set were not large enough (compared to that of the testing set), the training processes of the models might risk over-fitting. An empirical setting for the testing set proportion is lower than 20%, and we found that setting the testing set to 15% of the total samples ensured that a large number of experimental data groups (778 in total) were used for training while a large number of data groups (137 in total) remained for testing. Therefore, 85% of the data groups, including the independent variables (tube length, number of tubes, tube center distance, heat water mass in tank, collector area, final temperature and angle between tubes and ground) and dependent variables (heat collection rate and heat loss coefficient) measured from 915 samples of in-service water-in-glass evacuated tube solar water heaters, were set as the training set, while the remaining 15% were set as the testing set. The SVM model was developed with Matlab software. The ANN prediction models were constructed with the NeuralTools® software (trial version, Palisade Corporation, New York, NY, USA) [42][43][44]. The GRNN and MLFN were chosen as the learning algorithms of ANNs.
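The 85%/15% division described above can be reproduced with a simple random partition of sample indices. The sketch below is an assumption about how such a split might be drawn; the function name and the seeding scheme are hypothetical, and the authors' software may have partitioned the data differently.

```python
import random


def split_samples(n_samples=915, test_fraction=0.15, seed=0):
    """Randomly partition sample indices into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # 915 * 0.15 = 137.25, which rounds to 137 testing samples.
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_samples()
print(len(train_idx), len(test_idx))
```

Changing `seed` gives a different composition of the two sets while keeping the 778/137 sizes fixed.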
Root mean square (RMS) error, required training time, and prediction accuracy (under the tolerances of 30%, 20%, and 10%, respectively) are used as indicators to measure the performances of the SVM and ANNs. Training times, representing the time required to develop a complete learning machine, were recorded during the training process of each model. The RMS errors were calculated from Equation (1) and the prediction accuracies were calculated from Equation (2):

$$\mathrm{RMS\ error} = \sqrt{\frac{\sum_{i=1}^{n_{tot}} (Z_i - O_i)^2}{n_{tot}}} \qquad (1)$$

$$\mathrm{Prediction\ accuracy} = \frac{n_{good}}{n_{tot}} \times 100\% \qquad (2)$$

where $Z_i$ is the predicted value, $O_i$ is the actual value, $n_{tot}$ is the number of tested samples, and $n_{good}$ is the number of tested samples with good predicted results under a certain tolerance. The number of nodes in the MLFN models was varied from 2 to 39 so that we could observe how the performance of the MLFN models changes with the number of nodes. Each model was trained 20 times using different compositions of the training and testing sets, and the mean RMS errors in testing, mean training times, and mean prediction accuracies were then obtained. Tables 3 and 4 show the development results for the prediction of heat collection rate and heat loss coefficient, respectively, using the mean RMS error in testing, mean training time, and mean prediction accuracy as the indicators of model performance.
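The two indicators defined above can be computed in a few lines. In this sketch a prediction is counted as "good" when its relative error |Z − O| / |O| does not exceed the tolerance; that reading of the tolerance is an assumption, since the text does not state the criterion explicitly, and the function names are hypothetical.

```python
import math


def rms_error(predicted, actual):
    """Root mean square error over all tested samples."""
    n_tot = len(actual)
    return math.sqrt(sum((z - o) ** 2 for z, o in zip(predicted, actual)) / n_tot)


def prediction_accuracy(predicted, actual, tolerance):
    """Percentage of samples whose relative error is within the tolerance.

    The relative-error criterion is an assumption made for illustration.
    """
    n_good = sum(
        1 for z, o in zip(predicted, actual) if abs(z - o) <= tolerance * abs(o)
    )
    return 100.0 * n_good / len(actual)
```

For example, `prediction_accuracy(pred, obs, 0.10)` corresponds to the 10% tolerance column, and averaging both indicators over the 20 repeated runs gives the kind of mean values reported in the tables.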
Table 3 shows that the mean RMS errors of the presented models are very close. The lowest mean RMS error is that of the MLFN with 12 nodes (0.14), and the second lowest is shared by the MLFNs with 3, 4, 7, 9, 10, 11, and 14 nodes, which are all 0.15. The mean RMS errors of the SVM and GRNN (0.29 and 0.33, respectively) are clearly higher than those of the MLFNs presented in Table 3. In terms of prediction accuracy, the mean prediction accuracies of the models are all 100% under the tolerance of 30%. Under the tolerance of 20%, most of the mean prediction accuracies are higher than 99.8%. When the tolerance is decreased to 10%, the MLFNs with 3 and 12 nodes have the highest mean prediction accuracies among all models (98.33% and 98.57%, respectively). However, the MLFN with 3 nodes has a significantly shorter mean training time. Overall, the MLFN with 3 nodes (MLFN-3) can be considered the best model for the prediction of heat collection rate due to its low RMS error, short training time, and high prediction accuracies under different tolerances. Likewise, for the prediction of heat loss coefficient, mean RMS errors are also similar among different models (Table 4). Table 4 shows that although the prediction accuracies of all the models are 100% under the tolerance of 30%, the GRNN has the highest prediction accuracies under the tolerances of 20% and 10%, and it also has the advantages of a comparatively low mean RMS error (0.71) and a short mean training time (8 s). Considering the RMS error, training time, and prediction accuracies under different tolerances, the GRNN is regarded as the best prediction model for heat loss coefficient.

Model Analysis
To analyze the results of the best models for heat collection rate and heat loss coefficient, we should first divide the model development process of an ANN into two parts, the training and testing processes. The training process reveals the fitting results of an ANN, showing whether the training set was fitted properly and, at the same time, the capacity for recall of the model, which acts like the memory function of a human brain; the testing process reveals the prediction results of an ANN after training, showing whether the model can be applied to practical applications. Therefore, both training and testing results cannot be neglected when evaluating the precision and robustness of an ANN model.

The MLFN-3 for Heat Collection Rate
Typical training results (Figure 6) and testing results (Figure 7) of the MLFN-3 for the prediction of heat collection rate are illustrated to show the robustness and precision of the model. After the training process, the predicted values are clearly close to the actual values (Figure 6a), which indicates that the MLFN-3 has a comparatively strong capacity for recalling the data in a training set. Residual values (Figure 6b,c) are also highly concentrated around zero, showing that the non-linear fitting results of the model are reliable.
In terms of the testing results, predicted values are also close to the actual values in the testing set (Figure 7a), indicating that the MLFN-3 has a very strong prediction capacity for heat collection rate. Residual values are also generally close to zero (Figure 7b,c). All these testing results indicate that the MLFN-3 is a powerful tool for predicting the heat collection rate from the independent variables.

The GRNN for Heat Loss Coefficient
For the prediction of heat loss coefficient, the training results (Figure 8) and testing results (Figure 9) are illustrated to show the robustness and precision of the GRNN. It can be seen that after the training process, though there are some deviations, the predicted values generally correspond to the actual values in the training set (Figure 8a). The predicted values of heat loss coefficient are close to the actual values in the range between 10 and 11. The residual values (Figure 8b,c) also show that more than one third of the residuals are very close to zero. In addition, although the rest of the predicted values (approximately two thirds of the values in the training set) deviate somewhat from their actual values, the deviations are all in a controllable range, with residual values ranging only from −1.8 to 2.4. Therefore, the GRNN is also considered a good model within a certain range of heat loss coefficient values.
In the testing set, the predicted values (Figure 9a) are very close to the actual values when the values are in the range between 10 and 11. Above or below this range there may be some deviations. However, residual values (Figure 9b,c) show that the deviations are acceptable, since all residual values range approximately from −1.5 to 2.0. Also, in practical situations, most heat loss coefficients of water-in-glass evacuated tube solar water heaters lie in the range between 10 and 11, and the average of the values is 10 (as Table 2 shows).
The RMS error (0.71) and prediction accuracy (100%) of the GRNN also show that the GRNN is a good predictor for heat loss coefficient. Therefore, in spite of the partial deviation phenomenon in the testing results, the GRNN is still acceptable and can be considered as the best model for the prediction of the heat loss coefficient.

Robustness Analysis
According to the principles of the ANN algorithm, the initial values created in the "hidden box" are chosen at random. Therefore, the reproducibility of an ANN cannot be neglected.

Comparison with Conventional Methods
According to conventional methods for the determination of heat collection rate and heat loss coefficient, technicians must complete a series of steps using the detection devices (Figure 11). The water-in-glass evacuated tube solar water heaters must first be dismantled. However, for in-service heaters, disassembly is highly inconvenient and can damage the instruments. What is worse, a complete conventional determination process requires at least 15 days. After a series of complicated determination processes, the heat collection rate and heat loss coefficient are obtained by Equations (3) and (4), respectively, where $q_s$ is the heat collection rate; $H$ is the amount of solar radiation; $t_a$ is the ambient temperature; $t_i$ is the initial temperature of water in tank; $S$ is the area of tubes and $a_1$, $a_2$ and $a_3$ are the coefficients; $U_s$ is the heat loss coefficient according to the ISO 9459-2 [45]; $\rho_w$ is the water density; $C_{p,w}$ is the specific heat of water; $V_s$ is the heat water mass in the tank; $t_{is}$ is the initial temperature of water; $t_{fs}$ is the final temperature of water; $t_{as(av)}$ is the ambient average temperature; $V$ is the volume of water; and $\Delta\tau$ is the duration time of heat loss coefficient experiments.
To improve on the conventional determination method, the novel method we propose here avoids disassembly of the heaters and saves time. The determination process using our novel method is presented in Figure 12, showing that the independent variables in our study can be input into the ANNs after being obtained by "portable test instruments". The predicted heat collection rate and heat loss coefficient are then obtained from the output of the ANNs. The use of "portable test instruments" in combination with ANNs can save time and give highly precise predicted results. However, this new method is only suitable for water-in-glass evacuated tube solar water heaters from one Chinese company. The model here can be employed for the prediction of the heat collection rate under the conditions of the Chinese standard GB/T 19141 [23].

Conclusions
Here, the best prediction models for heat collection rate and heat loss coefficient of water-in-glass evacuated tube solar water heaters are proposed after detailed model development and analysis.
Results show that the MLFN-3 gives the best prediction results for heat collection rate and the GRNN gives the best prediction for heat loss coefficient, owing to their low RMS errors, short training times, and high prediction accuracies. In practical applications, the determination of heat collection rate and heat loss coefficient can be undertaken outdoors using "portable test instruments". Data for the independent variables obtained by "portable test instruments" can be input into the ANNs, and the predicted values of heat collection rate and heat loss coefficient can be obtained rapidly from the output of the models. Therefore, combining "portable test instruments" with ANNs makes it possible to acquire the heat collection rate and heat loss coefficient easily and quickly. However, ANNs are developed entirely from training on experimental data, which may neglect the underlying principles and theories relating the independent and dependent variables; this is what we call the "black box". Although this is a major advantage of ANNs, it makes it difficult to study the exact causal relationships between the independent and dependent variables, and it may also make it hard to remove unnecessary noise during training. Future studies may find it difficult to optimize the design of assembly conditions of water-in-glass evacuated tube solar water heaters using ANN methods. Fortunately, with a large experimental data set such as the one in this study, noise can be eliminated with a proper training process, ensuring good predicted results. With the help of this study, the determination of heat collection rate and heat loss coefficient no longer needs to be undertaken in a laboratory after dismantling the solar water heater, which avoids probable damage to related instruments and at the same time greatly reduces manpower, experimental time, and unnecessary operations. Further studies will be aimed at two explorations: (i) developing robust software for both personal computer (PC) and mobile phone platforms, to assist practical measurements using the novel method; and (ii) optimizing the assembly conditions of water-in-glass evacuated tube solar water heaters with higher heat collection rates and lower heat loss coefficients, using high-throughput screening based on ANNs. However, the developed models are only suitable for water-in-glass evacuated tube solar water heaters from one Chinese company. The model here cannot be used for the prediction of the heat collection rate under conditions other than those stated in Chinese standard GB/T 19141.

Figure 1. Schematic diagram of the detection device for the determination of water-in-glass evacuated tube solar water heaters.

Figure 3. Connection between neurons i and j.
[35]. The input layer receives the inputs and transfers the input vector x to the pattern layer. The pattern layer contains one neuron for each training datum. Any test input applied to the network is first subtracted from the pattern layer neuron values. Either the squares or the absolute values of the differences are then summed and applied to an exponential activation function. The results are transferred to the summation layer, whose neurons add the dot product of the pattern layer outputs and weights. As can be seen in Figure 4, weights are shown by A and B, and $f(x)K$ denotes the weighted outputs of the pattern layer, where $K$ is a Parzen window associated constant [26]. $Yf(x)K$ denotes the multiplication of pattern layer outputs and training data output $Y$ values. At the output layer, $Yf(x)K$ is divided by $f(x)K$ to estimate the desired $Y$.
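The layer-by-layer description of the GRNN can be condensed into a short sketch in which the estimate is the ratio of the summed Y-weighted pattern outputs to the summed pattern outputs, as in Specht's formulation. The Gaussian form of the exponential activation, the smoothing parameter `sigma`, and the function name are assumptions for illustration; this is not the NeuralTools implementation.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=1.0):
    """Estimate Y for input x as a kernel-weighted average of training outputs."""
    # Pattern layer: one neuron per training datum, exponential of the
    # summed squared differences between the test input and that datum.
    pattern = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        pattern.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    # Summation layer: one sum weighted by the training outputs, one unweighted.
    numerator = sum(p * y for p, y in zip(pattern, train_y))
    denominator = sum(pattern)
    # Output layer: ratio of the two sums.
    return numerator / denominator
```

Because there are no iteratively trained weights, "training" amounts to storing the samples, which is consistent with the fast learning noted for the GRNN. A small `sigma` makes the estimate follow the nearest training samples closely, while a large one smooths it toward the mean of the training outputs.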

Figure 5. Main structure of the support vector machine.

Figure 6. Training results of 778 samples using the MLFN-3 for the prediction of heat collection rate. (a) Predicted values versus actual values; (b) residual values versus actual values; (c) residual values versus predicted values.

Figure 8. Training results of 778 samples using the GRNN for the prediction of heat loss coefficient. (a) Predicted values versus actual values; (b) residual values versus actual values; (c) residual values versus predicted values.

Figure 10. Repeated experiments of (a) the MLFN-3 for the prediction of heat collection rate and (b) the GRNN for the prediction of heat loss coefficient.

Figure 12. Flow chart of the novel method using "portable test instruments" combined with ANNs for determining heat collection rate and heat loss coefficient.

Table 1. "Portable test instruments" for the determination of parameters of water-in-glass evacuated tube solar water heaters.

Table 2. Descriptive statistics of the variables for 915 samples of in-service water-in-glass evacuated tube solar water heaters.

Table 3. Results of prediction models for heat collection rate.

Table 4. Results of prediction models for heat loss coefficient.