Output Temperature Predictions of the Geothermal Heat Pump System Using an Improved Grey Prediction Model

This paper presents the Improved Grey Prediction Model, also called the IGM (1,1) model, which increases the prediction accuracy of the Grey Prediction Model (GM) used to predict the output temperature of Geothermal Heat Pump Systems (GHPS). The improvement is based on correcting the current predicted value by subtracting the error between the previous predicted value and the previous immediate mean of the measured value. The IGM (1,1) model was applied to predict the output temperature of the GHPSs at Oklahoma University, the University Politècnica de València, and Oakland University. For each GHPS, the model uses a small dataset of 24 data points (i.e., 24 h) for training to predict the output temperature eight hours in advance. The proposed model was verified using three different output temperature datasets; these datasets were also used to validate the efficiency of the proposed model. The empirical results show that the proposed IGM (1,1) model significantly improves both the simulation (in-sample) and the prediction (out-of-sample) of the GHPS output temperature through error reduction, thereby enhancing the GM (1,1) model's overall accuracy. When the prediction accuracies were compared, the improved model was found to be more accurate than the GM (1,1) model in both simulation and prediction for all datasets used.


Introduction
Recently, many developed countries have turned to Geothermal Heat Pump Systems (GHPS), an alternative source of renewable energy that is essentially clean, flexible, and freely available. For example, many universities have adopted the GHPS, which uses the difference in temperature between a building and the deeper ground layers to provide heating, cooling, and hot water for the building. The GHPS is environmentally friendly and, once installed either vertically or horizontally, does not require additional gas for operation. The output temperature of the GHPS is nonlinear because of variations in the water flow rate and fluctuations in ground temperature, which varies with depth and season [1][2][3][4][5]. Given this, it can be challenging to obtain consistently accurate readings when engineers are contending with factors outside their control. Such instances are rare, but they do happen. When an anomaly appears in the dataset, it is because the temperature meter produced a bad reading, the heat pump did not operate properly, or the optimal flow rate of water was lost. When these factors are under control, the measurements provide an accurate dataset, allowing prediction models to accurately predict the GHPS's output temperature. Predicting the output temperature of the GHPS is important because it provides information in advance that can be used to optimize temperatures, reduce energy consumption, and improve and maintain the system's efficiency. It is therefore critical to select a prediction model that can handle limited datasets for short-term prediction. The Grey Prediction Model (GM) may be adequate for short-term prediction of the GHPS output temperature.
Numerous prediction techniques have been proposed to predict the output temperature of the GHPS. These include Linear Regression (LR), Autoregressive Integrated Moving Average (ARIMA), Multiple Linear Regression (MLR), Support Vector Regression (SVR), and Learning Algorithm (LM) models [6][7][8].
The literature has explored different techniques to improve the output temperature prediction of the GHPS. Michopoulos et al. [9] proposed an algorithm based on the infinite line source method (LSM) that improved the output temperature prediction of the GHPS, with differences between measured and predicted temperatures of roughly ±2 °C. Wenjie and Jinbo [7] modeled the dynamics of the ground heat exchanger (GHE) using an artificial neural network (ANN) model to predict the exit temperature. The results showed that the LM model was more accurate than the ANN model. Furthermore, Quan et al. [10] improved the Support Vector Machine (M-GASVR) model to predict the vertical water temperature, and the results demonstrated that the improved model predicts more accurately than the SVR and ANN models. By contrast, Thiyagarajan et al. [11] used the ARIMA, Prophet, ETS, and Bagged models to predict sensor measurements of the concrete surface temperature of a sewer pipe over horizons of one week, 24 h, and 12 h. The ARIMA model was found to be more accurate than the other models and was shown to be suitable for one-week prediction of the sewer pipe's concrete surface temperature. Mehrmolaei and Keyvanpour [12] proposed an improved ARIMA model, which applies a mean estimation error to the original ARIMA, to increase the performance of time series prediction. Toharudin et al. [13] used long short-term memory (LSTM) and Facebook Prophet models to predict air temperature. The results demonstrated that the Facebook Prophet model performed better on maximum air temperature, while the LSTM model performed better on minimum air temperature. These models require a sufficiently large training dataset to make accurate predictions. A large dataset increases the performance of the model, and a more accurate prediction is likely to be achieved. Conversely, a small dataset with insufficient information can result in an inaccurate prediction.
Grey System Theory is used to analyze uncertain systems that have small sample sizes and insufficient datasets. The Grey Model (GM) was established from the Grey System Theory by Deng [14]. In the Grey System Theory, the GM (1,1) model is a key model that is commonly used for short-term time series prediction. The GM (1,1) model can produce an accurate prediction for an uncertain system that has a small dataset [15,16]. However, it is appropriate only for prediction problems with complete, non-negative, and non-random datasets. The GM (1,1) model has several advantages: it is simple to calculate and analyze, fast to implement, and effective at predicting when enhanced. The GM (1,1) model has therefore been applied in different research applications, and it has demonstrated satisfactory prediction results when improved [17][18][19][20]. The GM (1,1) model is constructed from the system dataset that serves as the model's input [21]; thus, the dataset needs to be free of negative values, random values, and anomalies. The GM (1,1) model's performance can be improved by further optimizing its parameters. In the rolling mechanism approach [22], only the most recent data are analyzed: the oldest data are removed and new data are added to the data sequence to continue the prediction. In addition, the sample size affects the GM (1,1) model's precision. A grey model established with a smaller sample size yields a more accurate prediction, whereas a large sample size yields an inaccurate prediction [23]. Thus, a small sample size is required to make the prediction more accurate. Zhang et al. [24] used the GM (1,1) model to perform a wind farm prediction in China with a sample size between 4 and 8. The results concluded that the best sample size was 6.
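The rolling mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `rolling_forecasts` and `drift_forecast` are hypothetical names, the newly measured value is assumed to be the one appended to the window, and the simple drift forecaster is a stand-in for a grey model fitted on each window, not the method of [22].

```python
from collections import deque


def rolling_forecasts(series, window, forecast_next):
    """One-step-ahead forecasts using only the most recent `window` points.

    `forecast_next` is any function mapping a list of recent values to a
    forecast of the next value (a grey model could be plugged in here).
    """
    recent = deque(series[:window], maxlen=window)
    forecasts = []
    for measured in series[window:]:
        forecasts.append(forecast_next(list(recent)))
        # Appending to a full deque discards the oldest point automatically,
        # which is the "remove old data, add new data" step.
        recent.append(measured)
    return forecasts


def drift_forecast(recent):
    """Stand-in forecaster (assumption): last value plus the last difference."""
    return recent[-1] + (recent[-1] - recent[-2])


if __name__ == "__main__":
    hourly = [1, 2, 3, 4, 5, 7]
    print(rolling_forecasts(hourly, window=3, forecast_next=drift_forecast))
```

The window length plays the role of the sample size discussed above, so a study such as [24] would correspond to varying `window` between 4 and 8.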
The range of applicability and the prediction accuracy of the GM (1,1) model are limited, and the model requires improvement. Several developments have been proposed to improve its performance. Hsu et al. [25] suggested that the GM (1,1) model could be improved by Bayesian analysis to predict the output of the integrated circuit industry. The resulting improved model was more accurate than the traditional GM (1,1) and LR models. Urrutia et al. [26] combined the GM (1,1) model with a Markov Chain to establish the Markov Chain Grey Model (MCGM); it predicted the energy demand of the Philippines and achieved higher prediction precision. In addition, Wang et al. [27] increased the prediction accuracy of the GM (1,1) model by optimizing its initial condition. Moreover, Jia et al. [28] generated a new data sequence by cubing the original data x (0) (k) = 3 x (0) (k) to establish new methods used to improve the GM (1,1) model and the original dataset. Mahdi and Mohamed [29] further improved the prediction accuracy of the GM (1,1) model by optimizing the background value, which increased precision. In addition, Hsu [30] used an Improved Transformed Grey Model (ITGM (1,1)) with a genetic algorithm and applied it to predict the output of the opto-electronics industry in Taiwan from 1990 to 2008 for verification. Akay and Mehmet [22] proposed a grey prediction with a rolling mechanism (GPRM) to predict the electricity demand of Turkey for the period 1970-2004. The resulting GPRM model was more accurate than the Model of Analysis of the Energy Demand (MAED). These developments were proposed to create a powerful technique that increases the precision of the GM (1,1) model by reducing prediction errors. However, these improved models still result in prediction errors, meaning further improvement is required.
In this paper, we propose an improved model (IGM (1,1)) to increase the accuracy and range of applicability of the GM (1,1) model. The improved method is based on correcting the current predicted value by subtracting the error between the previous predicted value and the previous immediate mean of the measured value. The IGM (1,1) model is then applied to predict the GHPS output temperatures at Oklahoma University, University Politècnica de València, and Oakland University for the next 8 h. The experimental design is implemented using MATLAB. The IGM (1,1) model improves the simulated and predicted output temperatures for all datasets used. The IGM (1,1) model's prediction accuracies are then compared to those of the GM (1,1) model. The performance of the IGM (1,1) model is validated for the three above-mentioned datasets.
The remainder of the paper is arranged as follows: Section 2 contains the materials and methods. Data description and evaluation of the improved model are provided in Section 3. Section 4 presents the results and discussion, and Section 5 provides a brief conclusion.

GM (1,1) Model
This section explains the analysis and modeling procedure for the traditional GM (1,1) model.
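Let $T^{(0)}$ be the original non-negative and non-random data sequence of temperature values, i.e.,

$$T^{(0)} = \left(T^{(0)}(1), T^{(0)}(2), \ldots, T^{(0)}(n)\right), \tag{1}$$

where $n$ is the length of the data sequence. Use the First-order Accumulated Generating Operation (1-AGO) sequence for the general data $T^{(0)}$ as follows:

$$T^{(1)}(k) = \sum_{i=1}^{k} T^{(0)}(i), \quad k = 1, 2, \ldots, n. \tag{2}$$

Define $Z^{(1)}$ as a sequence that is generated based on the neighbor means of $T^{(1)}$. The expression of the immediate mean series $Z^{(1)}$ is as follows:

$$Z^{(1)} = \left(Z^{(1)}(2), Z^{(1)}(3), \ldots, Z^{(1)}(n)\right), \tag{3}$$

where:

$$Z^{(1)}(k) = 0.5\left(T^{(1)}(k) + T^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n. \tag{4}$$

The parameters $\{a, b\}$ are defined by the GM (1,1) model's equation:

$$T^{(0)}(k) + a Z^{(1)}(k) = b. \tag{5}$$

Use the first-order ordinary differential equation of the GM (1,1) model, also called the whitening equation, which is written as follows:

$$\frac{dT^{(1)}}{dt} + a T^{(1)} = b. \tag{6}$$

Here, the model parameters are $\hat{a} = [a, b]^{T}$, where $a$ is the developing coefficient and $b$ is the grey control input. The parameters $a$, $b$ are obtained from the least-squares method as follows:

$$\hat{a} = [a, b]^{T} = \left(B^{T} B\right)^{-1} B^{T} Y, \tag{7}$$

where

$$B = \begin{bmatrix} -Z^{(1)}(2) & 1 \\ -Z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -Z^{(1)}(n) & 1 \end{bmatrix}, \qquad Y = \begin{bmatrix} T^{(0)}(2) \\ T^{(0)}(3) \\ \vdots \\ T^{(0)}(n) \end{bmatrix}, \tag{8}$$

and the initial condition is $\hat{T}^{(1)}(1) = T^{(0)}(1)$. The time response equation of the GM (1,1) model is presented as follows:

$$\hat{T}^{(1)}(k) = \left(T^{(0)}(1) - \frac{b}{a}\right) e^{-a(k-1)} + \frac{b}{a}, \quad k = 1, 2, \ldots \tag{9}$$

Use the Inverse first-order Accumulated Generating Operation (Inverse 1-AGO) to calculate the original predicted sequence $\hat{T}^{(0)}$ as follows:

$$\hat{T}^{(0)}(k) = \hat{T}^{(1)}(k) - \hat{T}^{(1)}(k-1), \quad k = 2, 3, \ldots \tag{10}$$

Subsequently, the fitted values ($k \le n$) and the predicted values ($k > n$) follow by substituting Equation (9) into Equation (10):

$$\hat{T}^{(0)}(k) = \left(1 - e^{a}\right)\left(T^{(0)}(1) - \frac{b}{a}\right) e^{-a(k-1)}.$$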
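The modeling procedure above (1-AGO, immediate means, least-squares estimation of a and b, time response, and inverse 1-AGO) can be sketched in Python as follows. This is a minimal illustration assuming the standard GM (1,1) formulation, not the authors' MATLAB implementation; the function names `gm11_fit`, `gm11_response`, and `gm11_predict` are hypothetical, and the least-squares step is written out in closed form for the two-parameter case rather than with matrix routines.

```python
import math


def gm11_fit(t0):
    """Estimate the GM(1,1) parameters (a, b) by least squares."""
    n = len(t0)
    if n < 4:
        raise ValueError("GM(1,1) is normally fitted on at least four points")
    # 1-AGO: running sum of the original sequence.
    t1, running = [], 0.0
    for value in t0:
        running += value
        t1.append(running)
    # Immediate means Z1(k) for k = 2..n, paired with Y = T0(k).
    z = [0.5 * (t1[i] + t1[i - 1]) for i in range(1, n)]
    y = t0[1:]
    # Closed-form normal equations for the line y = -a * z + b.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b


def gm11_response(first_value, a, b, k):
    """Time response of the accumulated sequence at step k (1-based)."""
    return (first_value - b / a) * math.exp(-a * (k - 1)) + b / a


def gm11_predict(t0, horizon):
    """Return (fitted in-sample values, out-of-sample forecasts)."""
    a, b = gm11_fit(t0)
    n = len(t0)
    t1_hat = [gm11_response(t0[0], a, b, k) for k in range(1, n + horizon + 1)]
    # Inverse 1-AGO: difference consecutive accumulated values.
    t0_hat = [t1_hat[0]] + [t1_hat[i] - t1_hat[i - 1] for i in range(1, len(t1_hat))]
    return t0_hat[:n], t0_hat[n:]


if __name__ == "__main__":
    # Synthetic, smoothly growing series used only for illustration.
    series = [2.0 * 1.1 ** k for k in range(8)]
    fitted, forecast = gm11_predict(series, horizon=3)
    print(forecast)
```

In the setting of this paper, `t0` would hold the 24 hourly training temperatures and `horizon` would be 8.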

Improved IGM (1,1) Model
This section explains the proposed modification of the GM (1,1) model, which improves the model's overall accuracy by reducing the error between the predicted values and the measured values. The improvement is based on correcting the current predicted value by subtracting the error between the previous predicted value and the previous immediate mean of the measured value. The defining equation of the IGM (1,1) model is presented and discussed below.
According to the time response Equation (9), the GM (1,1) model's prediction equation is:

$$\hat{T}^{(1)}(k) = \left(T^{(0)}(1) - \frac{b}{a}\right) e^{-a(k-1)} + \frac{b}{a}.$$

When $Z^{(1)}(k)$ is generated as the immediate mean sequence of the accumulated data $T^{(1)}(k)$, the error increases. The aim is to get the prediction error as close to zero as possible.
A correction is made at the next time step based on the current error. Define $\delta^{(1)}(k)$ as the error between the general response of the grey prediction $\hat{T}^{(1)}$ and the immediate mean $Z^{(1)}$ at the previous step; it has a major role in reducing the prediction error and enhancing the performance of the GM (1,1) model.
The prediction error equation is as follows:

$$\delta^{(1)}(k) = \hat{T}^{(1)}(k-1) - Z^{(1)}(k-1). \tag{11}$$

The GM (1,1) model's prediction equation in Equation (9) results in the prediction error that can be observed in Figures 1-3. However, by subtracting $\delta^{(1)}(k)$ from Equation (9), the prediction error is reduced.
Use the First-order Accumulated Generating Operation (1-AGO) sequence for the general modified equation of $\hat{T}^{(0)}(k)$. Then, substitute Equation (11) into Equation (9) to get the improved model's prediction equation:

$$\hat{T}^{(1)}(k) = \left(T^{(0)}(1) - \frac{b}{a}\right) e^{-a(k-1)} + \frac{b}{a} - \delta^{(1)}(k). \tag{12}$$

The final fitted and predicted values are calculated based on Equation (10).
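The correction step can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: it assumes the error term is the previous uncorrected GM (1,1) response minus the previous immediate mean of the measured data, it covers only the in-sample range where measurements exist, and it starts the correction at k = 3 because the immediate mean is not defined at k = 1. The function name `igm11_corrected_response` and the parameter values in the usage lines are hypothetical.

```python
import math


def igm11_corrected_response(t0, a, b):
    """Return (GM response, corrected response) of the accumulated sequence.

    t0 is the measured sequence; a and b are already-fitted GM(1,1) parameters.
    List index i corresponds to step k = i + 1.
    """
    n = len(t0)
    # 1-AGO of the measured data.
    t1, running = [], 0.0
    for value in t0:
        running += value
        t1.append(running)
    # Immediate means; Z1(1) does not exist, so the first slot is None.
    z1 = [None] + [0.5 * (t1[i] + t1[i - 1]) for i in range(1, n)]
    # Uncorrected GM(1,1) time response.
    scale = t0[0] - b / a
    gm = [scale * math.exp(-a * i) + b / a for i in range(n)]
    # Subtract the previous-step error from each response (assumed form).
    igm = list(gm)
    for i in range(2, n):
        delta = gm[i - 1] - z1[i - 1]
        igm[i] = gm[i] - delta
    return gm, igm


if __name__ == "__main__":
    # Illustrative values only; a and b are not taken from the paper.
    gm, igm = igm11_corrected_response([10.0, 11.0, 12.0, 13.0], a=-0.1, b=9.0)
    print(gm)
    print(igm)
```

Differencing the corrected sequence (the inverse 1-AGO) then gives the corrected temperatures; how the correction is carried beyond the training window depends on which measurements are available at prediction time and is left open here.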

Data Description
Three different GHPS output temperature datasets were used in this study to validate and verify the benefits of the improved model. The datasets were obtained from different locations and different operating times, and they were free of negative values, random values, and anomalies. The first dataset was obtained from Oklahoma University and was used for model training during winter operation from 1 January 2015 to 2 January 2015 [31]. The second dataset was from the University Politècnica de València and was used to train the model during summer operation from 8 May 2002 to 9 May 2002 [32]. The third was obtained through facility management at Oakland University; this dataset, covering 1 July 2019 to 2 July 2019, was used to train the model to perform output temperature predictions. The meter measured and recorded the output and input temperatures and the flow rate of water at separate fifteen-minute intervals [33]. A small dataset of 24 data points (i.e., 24 h) from each GHPS's output temperature dataset was used to train the IGM (1,1) and GM (1,1) models to perform the output temperature prediction.

Model Evaluation
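To evaluate the prediction performance of the improved model, the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE), and the Mean Absolute Percentage Error (MAPE) were used to measure the prediction accuracy of the IGM (1,1) model. MAE, RMSE, and MAPE are written in Equations (13)-(15), respectively:

$$\mathrm{MAE} = \frac{1}{n} \sum_{k=1}^{n} \left| T(k) - \hat{T}(k) \right|, \tag{13}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{k=1}^{n} \left( T(k) - \hat{T}(k) \right)^{2}}, \tag{14}$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{k=1}^{n} \left| \frac{T(k) - \hat{T}(k)}{T(k)} \right|, \tag{15}$$

where $T(k)$ is the actual temperature value, $\hat{T}(k)$ is the predicted temperature value, and $n$ is the predicted length.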
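The three error measures can be computed with a few lines of Python. This sketch assumes their standard definitions, with the function names `mae`, `rmse`, and `mape` chosen here for illustration; the sample values in the usage lines are made up and are not taken from the paper's datasets.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Made-up temperatures, for illustration only.
    measured = [10.0, 20.0]
    forecast = [11.0, 18.0]
    print(mae(measured, forecast), rmse(measured, forecast), mape(measured, forecast))
```

MAPE divides by the measured value, which is one practical reason the grey model's requirement of a non-negative (and here non-zero) dataset is convenient for evaluation as well.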

Result and Discussion
Oklahoma University's GHPS output temperature prediction. The temperature data sequence $T^{(0)}$ was generated from Oklahoma University's GHPS output temperature dataset for 24 h. Thus, $T^{(0)} = (21.9778, 30.1667, \ldots, 37.6889)$. The dataset contains 32 data points (i.e., 32 h), with one point denoting one hour, and was split into a training set and a testing set. The first 24 data points were used to train the models, and the remaining eight data points, for the period from 25 to 32 h, were used to test the models' accuracy. This process was used to establish the improved IGM (1,1) and GM (1,1) models. The models were implemented using MATLAB, and their parameters were optimized with each iteration. After the models were trained on the 24 h of data, the IGM (1,1) and GM (1,1) models were used to predict the output temperatures for the next eight hours. The parameters a, b were calculated from Equation (7) using the least-squares method. The simulation and prediction results of the IGM (1,1) model were calculated based on Equation (12): substituting the parameters a, b into Equation (12) gives the IGM (1,1) model's prediction equation, Equation (16). Substituting k = 2, 3, ..., 24 into Equation (16) gives the simulated (in-sample) values; the predicted values are compared in Table 1. The IGM (1,1) model accurately predicted Oklahoma University's GHPS output temperature for the next eight hours, whereas the GM (1,1) model produced inaccurate predictions. The IGM (1,1) model's prediction accuracy was evaluated and validated, then compared to that of the GM (1,1) model. The IGM (1,1) model was more accurate than the GM (1,1) model: it reduced the prediction error from 4.98% to 0.56% (evaluated using MAPE). In this case, the IGM (1,1) model enhanced the GM (1,1) model's accuracy by 88.76%.

Table 1. Comparison of IGM (1,1) and GM (1,1) models' predicted values for Oklahoma University's GHPS output temperature for the period from 25 to 32 h (out-of-sample).

To further demonstrate the significant enhancement of the improved model and its prediction accuracy, the IGM (1,1) model was applied to predict the GHPS output temperature for the next eight hours at both the University Politècnica de València and Oakland University. The same modeling approach was used as for Oklahoma University's GHPS output temperature prediction. The predicted results for the two GHPSs are depicted in Figures 2 and 3. The eight predicted values of the GHPS output temperature at University Politècnica de València and Oakland University were then compared to the actual temperature values, as shown in Tables 2 and 3. A comparison between the performance of the IGM (1,1) and GM (1,1) models was also conducted, as shown in Table 4. In the GM (1,1) model, the prediction error increased as $Z^{(1)}(k)$ generated the immediate mean sequence of $T^{(1)}$, resulting in an inaccurate prediction. The GM (1,1) model's predicted values differed from the actual temperature values, indicating that the model did not fit the actual temperature values correctly and did not accurately predict the next eight temperature values, as seen in Figures 1-3. The improved model was proposed to overcome this disadvantage of the GM (1,1) model and to increase its prediction accuracy. The prediction error was minimized by subtracting $\delta^{(1)}(k) = \hat{T}^{(1)}(k-1) - Z^{(1)}(k-1)$, the error between the previous predicted value and the previous immediate mean of the measured value. The IGM (1,1) model's prediction accuracy was then compared to that of the GM (1,1) model to illustrate the significant enhancement of the improved model. Overall, the IGM (1,1) model is far more accurate than the GM (1,1) model in its prediction results. Moreover, both the simulation (in-sample) and the prediction (out-of-sample) output temperature results were improved by the IGM (1,1) model through the reduction of the prediction error.
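The 88.76% enhancement reported for the Oklahoma University dataset is consistent with reading it as the relative reduction in MAPE between the two models. That reading is an assumption, since the formula is not stated explicitly, but the arithmetic matches the reported MAPE values:

```latex
\[
\frac{\mathrm{MAPE}_{\mathrm{GM}} - \mathrm{MAPE}_{\mathrm{IGM}}}{\mathrm{MAPE}_{\mathrm{GM}}} \times 100\%
= \frac{4.98 - 0.56}{4.98} \times 100\%
= \frac{4.42}{4.98} \times 100\%
\approx 88.76\%
\]
```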

The prediction accuracies of the models were assessed using RMSE, MAE, and MAPE based on Equations (13)-(15); the results are shown in Table 4. In summary, the IGM (1,1) model predicted the next eight hours of Oklahoma University's GHPS output temperature the best, followed by University Politècnica de València and then Oakland University. Oklahoma University's GHPS output temperature was easier for the model to predict because it was more consistent to begin with. Conversely, the other two universities' temperatures were more varied, making it more difficult for the model to predict the following eight hours.

Conclusions
In this paper, an improved model (IGM (1,1)) was analyzed and implemented to increase the prediction accuracy of the GM (1,1) model. The IGM (1,1) model's prediction is based on correcting the current predicted value by subtracting the error between the previous predicted value and the previous immediate mean of the measured values. The IGM (1,1) model was applied to predict the GHPS output temperatures for eight hours at Oklahoma University, University Politècnica de València, and Oakland University. The experimental design for the time series prediction models was implemented using MATLAB 2018a software. The prediction accuracies were compared, and the improved model was found to be more accurate than the GM (1,1) model for every dataset at each university. The proposed IGM (1,1) model significantly improved the simulated (in-sample) and predicted (out-of-sample) output temperatures of the GHPSs by reducing prediction errors, thereby improving the GM (1,1) model's overall accuracy. The results therefore validate the performance of the IGM (1,1) model for accurate short-term prediction. In the future, we anticipate improving the IGM (1,1) model further to make it more accurate for geothermal heat pump systems over horizons longer than the short term.