Next-Day Prediction of Hourly Solar Irradiance Using Local Weather Forecasts and LSTM Trained with Non-Local Data

Solar irradiance prediction is significant for maximizing energy-saving effects in the predictive control of buildings. Several models for solar irradiance prediction have been developed; however, they require the collection of weather data over a long period in the predicted target region or evaluation of various weather data in real time. In this study, a long short-term memory algorithm–based model is proposed using limited input data and data from other regions. The proposed model can predict solar irradiance using next-day weather forecasts by the Korea Meteorological Administration and daily solar irradiance, and it is possible to build a model with one-time learning using national and international data. The model developed in this study showed excellent predictive performance with a coefficient of variation of the root mean square error of 12% per year even if the learning and forecast regions were different, assuming that the weather forecast was correct.


Introduction
In general, approximately 60% of a building's energy is used for heating, ventilation, and air conditioning operation [1], and energy can be saved by optimally controlling the building's heating and air conditioning systems [2]. An increasing number of studies address model predictive control (MPC), which establishes an optimal control strategy in advance to ensure efficient air conditioner control and system operation [3,4]. Various studies have confirmed that MPC reduces building energy consumption [5][6][7]. The performance of MPC depends on the accuracy of the hourly load prediction of a building, and the load is affected by next-day weather; therefore, most models require weather forecast information [8][9][10]. Typical factors affecting the load are outdoor air temperature and solar irradiance. Although outdoor air temperature is relatively easy to predict because its hourly changes are small, forecasts of actual hourly solar irradiance values are very rare [11][12][13][14]. In previous MPC studies, methods of predicting solar irradiance have rarely been reported; most studies used the data provided by an energy analysis program or assumed that the solar irradiance prediction model predicted solar irradiance perfectly [15].
Solar irradiance prediction models are typically physics-based or data-based [16]. Physical models are generally developed based on solar geometry to construct an empirical correlation between solar irradiance data and meteorological parameters measured in the past in the observation region [17]. In 1956, Black developed a model for predicting solar irradiance by analyzing the correlation between sky cover and solar irradiance data measured over three years in a region [18]. Similarly, Samimi developed a solar irradiance model with high accuracy using Iran's weather data measured over 17 years [19]. Paltridge and Daneshyar [20,21] developed physics-based solar irradiance prediction [...] develop a learning model that reflects the recent solar irradiance characteristics of the surrounding region without long-term accumulated data of the predicted region. Second, we intended to develop a prediction model that can be used without additional updates after a single training run, considering the performance of the embedded central processing unit (CPU) installed in the controller of a small- to medium-sized building. Finally, we used only weather forecast information that could be easily accessed from a mobile device or PC and considered a simple weather forecast to increase the analysis efficiency by using a data collection environment that mirrors the actual environment.

LSTM Networks
In this study, a solar irradiance prediction model was developed using a deep learning structure. In such a structure, the neural network responsible for learning is defined in multiple layers and is typically classified as either a convolutional neural network (CNN) or a recurrent neural network (RNN). The CNN structure exhibits excellent learning ability for information whose order is not important, such as images, whereas the RNN structure has been verified in many studies to learn and predict problems with time-series characteristics or order [34]. In this study, the model learns using an RNN because solar irradiance has a time-series characteristic: the irradiance in the previous period moderately affects the next period. However, as learning proceeds in a conventional RNN, the vanishing and exploding gradient problem [35] occurs and learning performance stops improving. The LSTM network improves on conventional RNNs [36] and consists of one input layer, multiple hidden layers, and one output layer. The input and output layers are constructed in the same form as in a conventional neural network model, with a number of neurons corresponding to the data size of the input and output variables. The main feature of the LSTM model is the hidden layer with memory cells. The structure of the memory cell is shown in Figure 1.
Energies 2020, 13, x FOR PEER REVIEW 3 of 17
Each memory cell maintains or adjusts the cell state through three gates: an input gate ($i_t$), an output gate ($o_t$), and a forget gate ($f_t$). The purpose of each gate is as follows [38].
- Input gate: specifies which information is added to the cell state;
- Output gate: specifies which information from the cell state is used as output;
- Forget gate: defines which information is removed from the cell state.
The following equations (Equations (1)-(4)) represent the process of updating memory cells in the LSTM layer at a given time step, t.
The notation definitions are as follows:
- W: weight matrix
- b: bias vector
- S: candidate cell-state value
- $h_t$: coefficient vector for the outputs of the LSTM layer
- σ: sigmoid function
- $x_t$: input vector at time step t
Finally, the current cell state was determined by the following Equation (5).
Here, • refers to the Hadamard product, that is, the element-wise multiplication of corresponding entries of two matrices of the same size.
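For orientation, the gate and cell-state updates of a memory cell can be written in the widely used textbook form shown below. This is a sketch under stated assumptions, not a transcription of Equations (1)-(5): the recurrent weight matrices $U$, the gate subscripts on $W$ and $b$, the cell-state symbol $c_t$, and the tanh in the candidate value are all assumed here.

```latex
\[
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
S_t &= \tanh\left(W_S x_t + U_S h_{t-1} + b_S\right) && \text{(candidate cell-state value)} \\
c_t &= f_t \bullet c_{t-1} + i_t \bullet S_t && \text{(current cell state)}
\end{aligned}
\]
```

In this form, the forget gate scales the previous cell state and the input gate scales the candidate value before the two are added, which matches the roles of the gates listed above.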
The output of the LSTM layer ($h_t$) was calculated as the Hadamard product of the output gate and the tanh activation function applied to the current cell state. The output $h_t$ was used as a coefficient for prediction, and the output values, $y_t$ and $h_t$, are expressed by Equations (6) and (7).
The LSTM network equations and main contents described in this study were referenced from previous studies [33][34][35][36][37][38][39]. In this study, the LSTM model was built with the Deep Learning Toolbox provided by MATLAB, and the hyperparameters and algorithms that determine learning performance were specified. First, the stochastic gradient descent (SGD) algorithm and the adaptive moment estimation (ADAM) algorithm were considered as the optimization techniques provided by MATLAB. The SGD algorithm is known to have the disadvantage of taking a long time to learn when a sufficient working environment for iterative calculation is not provided [40]. The ADAM algorithm is advantageous for identifying the optimal solution efficiently in a short time by flexibly adjusting the learning rate [41].
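To illustrate what adjusting the learning rate means for ADAM, the following is a minimal sketch of the standard ADAM update applied to a toy one-dimensional objective. It is not the MATLAB implementation used in the study; the function name `adam_minimize`, the toy objective, and the hyperparameter values are assumptions chosen for illustration.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with the ADAM update rule, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g      # running mean of the gradient
        v = beta2 * v + (1.0 - beta2) * g * g  # running mean of the squared gradient
        m_hat = m / (1.0 - beta1 ** t)         # bias correction for early steps
        v_hat = v / (1.0 - beta2 ** t)
        # The step is scaled by the gradient history, so the effective
        # learning rate adapts as training proceeds.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 (x - 3); its minimum is at x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

In a network, the same update is applied to every weight and bias independently, each with its own running averages.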
The mathematical formalism and calculation process of ADAM are detailed in reference [42]. No known learning algorithm performs well in all situations; however, as the purpose of this study is to repeatedly predict daily solar irradiance for MPC application, the ADAM algorithm was selected because of its high calculation speed. No exact guidelines or rules specify the optimal number of hidden layers and hidden units for minimizing the model error; these settings are determined by the user's experience. However, Abhishek et al. [43] compared weather prediction performance according to the number of layers and units of an ANN model and reported that the performance of the model increased as the number of layers and units increased. In this study, a deeper LSTM model was constructed with three hidden layers and 300 hidden units per layer. This is slightly more than 250, which is the default number in the simulation tool, MATLAB. The data used for learning were normalized to prevent divergence during learning, and the calculation was performed in parallel on a graphics processing unit (GPU; RTX 2080Ti, 11 GB). The other settings of the learning model are listed in Table 1; the values were determined based on the recommended values provided by MATLAB [44].
This study aimed to utilize patterns of past solar irradiance to predict next-day solar irradiance using the LSTM network. Because the performance of the LSTM model varies depending on the vector size of the input data, the data must be divided into an appropriate vector size according to the purpose of the study before learning. According to previous studies on LSTM-based solar irradiance prediction, even if the data size is the same, a large vector size is advantageous for long-term predictions, such as seasonal and monthly predictions, whereas appropriately reducing the vector size of the input data benefits learning and prediction for short-term predictions, such as daily and weekly predictions [45].
As the purpose of this study was to develop a daily solar irradiance prediction model for MPC application and the optimal operation of renewable energy systems, learning was performed by grouping the data in units of 24 h, using the periodic characteristics of solar irradiance. The input data of the model were the outdoor air temperature, sky cover, humidity, wind speed, and precipitation provided by the Korea Meteorological Administration, and the output value was the horizontal irradiance, which was the prediction target of the model. The data used in this model were weather data from six major cities in Korea; learning was performed using the data of five of the six cities, and the test was conducted using the data of the remaining city. The reference model used in this study for comparison was similar to the models in previous solar irradiance prediction studies [21][22][23][24][25][26]: it used the forecast information of the next day as the input value and the hourly solar irradiance measured on the next day as the output value to predict the solar irradiance of the next day.
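The 24 h grouping and the five-city/one-city split described above can be sketched as follows. This is an illustrative sketch only: the helper names, the placeholder city names other than Incheon, and the toy data are hypothetical, and each hourly row is assumed to hold the five input parameters followed by the horizontal irradiance.

```python
FEATURES = ("temperature", "sky_cover", "humidity", "wind_speed", "precipitation")


def group_by_day(hourly_rows, hours_per_day=24):
    """Split hourly rows into consecutive 24-row days, dropping an incomplete tail."""
    n_days = len(hourly_rows) // hours_per_day
    return [hourly_rows[d * hours_per_day:(d + 1) * hours_per_day] for d in range(n_days)]


def split_by_city(days_by_city, test_city):
    """Use every city except test_city for learning and test_city for testing."""
    train = [day for city, days in days_by_city.items() if city != test_city for day in days]
    test = list(days_by_city[test_city])
    return train, test


# Hypothetical toy data: 72 hourly rows (3 days) per city,
# each row holding the 5 input parameters plus the irradiance.
cities = ["city_a", "city_b", "city_c", "city_d", "city_e", "Incheon"]
days_by_city = {
    city: group_by_day([[0.0] * (len(FEATURES) + 1) for _ in range(72)])
    for city in cities
}
train_days, test_days = split_by_city(days_by_city, test_city="Incheon")
```

Keeping the test city entirely out of the learning set is what allows the test to measure performance when the learning and forecast regions differ.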

Proposed Model
The proposed model uses weather parameters similar to those used by the reference model; however, it is designed to predict the next day's solar irradiance by adding the previous day's weather information, including the previous day's solar irradiance. To predict the solar irradiance on the next day (Day + 1), existing prediction models use only next-day forecast information (Day + 1) in the prediction stage, and they rarely consider weather data for the previous day (Day) as input values. That is, existing models commonly use field-measured data only as output values in learning and take their inputs only from forecasts. Figure 2 shows the results of analyzing the Pearson correlation between weather parameters and solar irradiance. The data used for the correlation analysis are the annual weather data from the five regions used to train the proposed model; that is, the sample comprises 1825 days (365 × 5). The correlation coefficient ranges between −1 and 1: the closer it is to 1 or −1, the higher the correlation with the solar irradiance pattern of the next day, and the closer it is to 0, the lower the correlation. As the results show, the correlation coefficient of the previous day's (Day) solar irradiance was 0.8, indicating a very high correlation with the solar irradiance of the following day. Most weather parameters except sky cover had very similar correlations in the Day and Day + 1 cases. Furthermore, the precipitation of the previous day was slightly more strongly related than that of the following day to the pattern of solar irradiance on the following day. Figure 3 shows the pattern of solar irradiance during January 1-9 in Incheon, the city used for testing. In the upper graph, solar irradiance exhibits a large deviation from that of the previous day; however, as shown in the lower graph, when the values are normalized so that the maximum solar irradiance during the day is 1, the width and height of the daily curve are maintained, which isolates the solar irradiance pattern. Thus, the solar irradiance data of the previous day can be included. Normalization is a critical process in learning because it allows regional patterns of solar irradiance and time zones with different solar irradiance to be used together.
Therefore, in the proposed model, the weather information of the previous day (Day), including solar irradiance, was used for learning, in addition to the input values used in the reference model. In the proposed solar irradiance model, once learning was complete, only the weather information of the next day was updated in the use stage, under the assumption that data transmission is possible only once every 24 h. The weather regions used for learning and testing were the same as those in the reference model. Table 2 summarizes the relationship between the input data and output data of the reference model and the proposed model.
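The daily-peak normalization and the Pearson correlation discussed for Figures 2 and 3 can be illustrated with a short sketch. The helper names and the hourly values are hypothetical toy data, not the data behind the figures; the example only shows that two days with very different magnitudes share the same normalized pattern.

```python
import math


def normalize_by_daily_peak(day_values):
    """Scale one day's hourly irradiance so that the daily maximum becomes 1."""
    peak = max(day_values)
    if peak <= 0.0:
        return [0.0 for _ in day_values]  # fully dark day: nothing to scale
    return [v / peak for v in day_values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly irradiance (W/m^2) for a bright day and a dim day.
clear_day = [0, 0, 0, 0, 0, 0, 0, 50, 200, 350, 480, 560,
             600, 560, 480, 350, 200, 50, 0, 0, 0, 0, 0, 0]
dim_day = [v * 0.25 for v in clear_day]

clear_norm = normalize_by_daily_peak(clear_day)
dim_norm = normalize_by_daily_peak(dim_day)
r = pearson(clear_day, dim_day)
```

After normalization the two toy days coincide, which is the property that lets the previous day's pattern serve as an input regardless of its absolute level.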
The weather data used in this study were TMY2 data provided by TRNSYS, an energy analysis program. TMY2 provides various types of weather information; however, the purpose of this study was to predict solar irradiance using the limited forecast data provided by the weather forecast system. Therefore, only the forecast information (outdoor air temperature, humidity, wind speed, and precipitation information) obtained from the weather forecast system of Korea, the test region, was collected (Figure 4). In addition, TMY2 provides hourly data. As shown in Table 2 and Figure 4, meteorological parameters other than precipitation are forecast at 3 h intervals on the previous day, whereas hourly solar irradiance information is required for MPC. Therefore, the TMY2 data were averaged at 3 h intervals to generate data resembling the forecast, and hourly data were then regenerated through linear interpolation. For precipitation, hourly data were regenerated in the same manner, considering the forecast interval of 6 h. In addition, Korea's cloud forecast has four categories (mostly cloudy, cloudy, partly cloudy, and clear sky), which is simpler than TMY2, which expresses sky cover with values between 0 and 100. Therefore, the sky cover data of TMY2 were also simplified to four types at 25% intervals, as in Korea's weather forecast system. Because the purpose of this test was to verify the performance of the model using existing data and to compare the performance of the models, it was assumed that the weather forecast would accurately predict the TMY2 data.
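The preprocessing steps above (3 h averaging, linear interpolation back to hourly values, and four-level sky cover) can be sketched as follows. This is a minimal sketch under assumptions: the function names are hypothetical, each 3 h average is placed at the first hour of its block, the final block is held constant, and the sky cover classes are returned as indices 0 to 3 because the mapping of the four forecast labels to the 25% bins is not stated here.

```python
def block_average(hourly, block=3):
    """Average hourly values over consecutive blocks, mimicking a coarse forecast."""
    return [sum(hourly[i:i + block]) / block
            for i in range(0, len(hourly) - block + 1, block)]


def interpolate_to_hourly(block_values, block=3):
    """Linearly interpolate block values back to hourly resolution.

    Assumption of this sketch: each block value sits at the first hour of its
    block, and the hours after the last block value are held constant.
    """
    hourly = []
    for k, value in enumerate(block_values):
        nxt = block_values[k + 1] if k + 1 < len(block_values) else value
        for step in range(block):
            hourly.append(value + (nxt - value) * step / block)
    return hourly


def sky_cover_class(percent):
    """Map sky cover in percent (0-100) to a class index 0-3 at 25% intervals."""
    return min(int(percent // 25), 3)


temperature = [float(h) for h in range(24)]    # hypothetical hourly series
coarse = block_average(temperature)            # 8 values, one per 3 h
regenerated = interpolate_to_hourly(coarse)    # back to 24 hourly values
```

For the 6 h precipitation interval, the same two helpers would be called with `block=6`.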

Proposed Model with Global Data
In the proposed model, major domestic cities with relatively similar weather conditions were used as learning data. We also evaluated the model performance using data from other countries with less similar weather patterns. We refer to this model as the proposed (global) model; it uses the same weather parameters, but solar irradiance and weather information from different countries were used for model learning. The regions used for the prediction model were Cape Town, Canberra, Colorado, and Paris, and solar irradiance data for each region were selected. We used the notations and definitions of the Köppen-Geiger system, a representative climate classification method [46], in which climates are classified in the order of main climate, seasonal precipitation type, and heat level. According to the classification criteria, the Korean cities used as the test region and as input data for the proposed model exhibited similar climatic conditions (Snow/Fully-humid/Hot-summer vs. Snow/Winter/Hot-summer), whereas the climates of the cities used as input to the proposed (global) model were not the same as that of Incheon, the test region. The regions, latitudes, longitudes, and Köppen-Geiger classifications used in the proposed model and the proposed (global) model are summarized in Table 3. Deep learning models generally improve as more data become available for learning; therefore, if the proposed (global) model shows successful predictive performance, available data from different climate zones can be used for learning, and this method can be useful when data in the local and surrounding regions are insufficient.
Figure 5 presents the input/output vectors of the reference model in comparison with those of the proposed model and the proposed (global) model. The inputs consisted mainly of the weather information for Day and Day + 1 forecast by the Meteorological Agency; for solar irradiance, the field-measured value of the target region was appended to the input vector. The optimal weights and biases for deriving the next-day solar irradiance were then learned. All input values were interpolated and normalized according to time. In particular, because the maximum solar irradiance and sunshine duration of the next day are affected by the present day's solar irradiance and sunshine duration together with the other input conditions, it was important to apply the same normalization to successive days. Thus, in the proposed (global) model, even if the model has learned in a region with a different climate, the field value for the present day is used for prediction. It is therefore expected that this will have a correction effect on predictions for regions whose conditions differ from the learning environment.
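The difference between the input vectors of the reference model and the proposed models can be sketched as below. The exact layout is defined by Table 2 and Figure 5; the ordering of the features within a row, the function name `build_samples`, and the toy data are assumptions made for this illustration.

```python
def build_samples(days, use_previous_day=True):
    """Pair each day (Day) with the following day (Day + 1) as one learning sample.

    Each element of `days` is a dict with "weather" (24 rows of forecast
    parameters) and "irradiance" (24 hourly values). Inputs hold one feature
    row per hour; targets are the Day + 1 hourly irradiance.
    """
    inputs, targets = [], []
    for today, tomorrow in zip(days, days[1:]):
        sample = []
        for hour in range(24):
            row = list(tomorrow["weather"][hour])        # Day + 1 forecast
            if use_previous_day:
                row += list(today["weather"][hour])      # Day weather
                row.append(today["irradiance"][hour])    # Day measured irradiance
            sample.append(row)
        inputs.append(sample)
        targets.append(list(tomorrow["irradiance"]))
    return inputs, targets


# Hypothetical toy data: 4 days with 5 weather parameters per hour.
days = [{"weather": [[float(d)] * 5 for _ in range(24)],
         "irradiance": [float(d)] * 24} for d in range(4)]
x_prop, y_prop = build_samples(days, use_previous_day=True)   # proposed-style input
x_ref, y_ref = build_samples(days, use_previous_day=False)    # reference-style input
```

With five weather parameters, the reference-style row has 5 features per hour and the proposed-style row has 11, the extra 6 coming from the previous day.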

Simulation Results and Analysis
Under the assumption that data communication is performed on a daily basis, the input data of the model used for prediction were updated once every 24 h, and the model used each update to predict the 24 hourly solar irradiance values of the next day. Learning followed the same procedure as prediction but aimed to correct the internal coefficients in the direction that minimized the prediction error. The predicted results of the model were evaluated using the root mean square error (RMSE) and the coefficient of variation of the RMSE (CVRMSE) [47], which are general error evaluation methods, as shown in Equations (8) and (9).
Here, $v_{ref,t}$ is the measured value at time t (the TMY2 data), and $v_{test,t}$ denotes the predicted value.
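As an illustration of the two metrics, the sketch below assumes their usual definitions: RMSE is the square root of the mean squared difference between $v_{ref,t}$ and $v_{test,t}$, and CVRMSE is the RMSE divided by the mean of the reference values, expressed in percent. The function names and sample values are hypothetical.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def cvrmse(reference, predicted):
    """RMSE normalized by the mean of the reference values, in percent."""
    mean_ref = sum(reference) / len(reference)
    return 100.0 * rmse(reference, predicted) / mean_ref


v_ref = [0.0, 100.0, 300.0, 400.0]    # hypothetical measured irradiance (W/m^2)
v_test = [0.0, 110.0, 290.0, 420.0]   # hypothetical predicted irradiance (W/m^2)
error = rmse(v_ref, v_test)
error_pct = cvrmse(v_ref, v_test)
```

Because CVRMSE divides by the mean reference value, it allows errors to be compared between periods or regions with different irradiance levels.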

Comparison Between Reference and Proposed Model
First, the proposed LSTM prediction model and the reference model were constructed by learning the connection strengths between neurons and layers that best described the relationship between the patterns of solar irradiance in the five regions. If a model closely reproduced the pattern of solar irradiance in the learning data of the five regions, the prediction error was expected to be small. Therefore, prior to the prediction, the learning performance of each model was analyzed; the results are shown in Figure 6a. The learning error (RMSE) was 39.42 W/m² for the reference model and 26.08 W/m² for the proposed model. In the scatter diagram, the closer the points lie to the diagonal, the more accurate the model is. Most points were distributed around the diagonal line, but the learning performance of the proposed model was improved by approximately 22 W/m² by utilizing more inputs to describe the target data. The learning performance can be increased further by enlarging the hidden layers and increasing the number of learning iterations; however, excessive settings may require significant time and a high-performance learning device or may cause over-fitting. Figure 6b shows the results of predicting the solar irradiance in Incheon, the target region, using the reference model and the proposed model; the proposed model showed a level of error similar to its learning performance.
As shown in Figure 6, the reference model, which uses only forecast information, showed an RMSE of 50.89 W/m² and a CVRMSE of 36%. The proposed model, which additionally learns the weather pattern of the previous day, showed high predictive performance with an RMSE of 18 W/m² and a CVRMSE of 12.9%. Figure 7 compares the predictive performance according to sky type. As shown in Figure 7, the reference model showed the highest error in mostly cloudy conditions with relatively little solar irradiance.
Figure 8 shows the predictive performance of each model at random intervals. As observed in the figure, the reference model also performed well in intervals where the solar irradiance was relatively high; however, it exhibited large errors where the pattern of solar irradiance was not constant and on days when solar irradiance was relatively low. For example, the reference model failed to cope with sudden solar irradiance fluctuations, such as in the 1057-1150 interval, where the change in solar irradiance over time is relatively large owing to sudden weather fluctuations, and it produced solar irradiance patterns with similar errors for four consecutive days.
As the proposed model learned more diverse time-series data patterns, including the weather conditions of the previous day, it depicted patterns of solar irradiance similar to the actual ones in all intervals. As mentioned earlier, it exhibited excellent predictive performance even when the solar irradiance fluctuated suddenly during a day. Figure 9 shows the predictive performance of each model and the weather conditions in the interval of 1033-1153, where the error of the reference model was most pronounced.
As shown in Figure 9, all weather parameters, except sky cover, exhibited a considerably different pattern of occurrence from that in the previous day. That is, if the weather pattern of the previous day was added, as in the proposed model, it was possible to construct a more accurate solar irradiance prediction model even in the same type of learning parameter condition as the reference model.
The error of the proposed prediction model was 18 W/m² on average. Based on the study by Jeon et al. [15], in which an error in horizontal solar irradiance of 79 W/m² in the EnergyPlus single residential building template generated a building load error of approximately 2%, it was determined that the proposed model can provide prediction values suitable for MPC.

Proposed (Global) Model Results
The proposed (global) data model learns using weather data from regions other than the prediction target city. Figure 10 shows the annual predictive performance of the proposed (global) data model. Overall, the model tended to underpredict the actual solar irradiance, but most points are distributed near the diagonal of the scatter diagram.

The model yielded an error of 15.9 W/m² in learning, and its learning performance was similar to that of the proposed model, which learned the data of regions close to the target region. Figure 11 depicts the predictive performance of the proposed (global) model at random intervals. Even though the model learned the solar irradiance data of completely different regions, it reproduced the pattern of solar irradiance in most intervals, with an error of 21.6 W/m². The learning and predictive performance of the various models proposed in this study are summarized in Table 4. These results compare favorably with the RMSE of 76 W/m² reported in a previous similar study [33].
It should be noted that the models in the majority of existing studies are similar to the reference model used in the comparative experiment. However, because typical weather data were used in this study, weather forecast errors were not included in these results.
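For readers who want to reproduce the error measures, the following minimal sketch computes the RMSE and the coefficient of variation of the RMSE, CV(RMSE), using their standard definitions. It assumes two equal-length series of hourly irradiance in W/m². Whether the paper averages over all hours or only daytime hours is not stated here, so the choice of hours passed in is left to the caller.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, in the same unit as the inputs (e.g. W/m^2)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def cv_rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """CV(RMSE) in percent: RMSE divided by the mean of the actual values."""
    mean_actual = sum(actual) / len(actual)
    if mean_actual == 0:
        raise ValueError("mean of actual values is zero; CV(RMSE) is undefined")
    return 100.0 * rmse(actual, predicted) / mean_actual


# Illustrative values only, not data from the study.
measured = [100.0, 200.0, 300.0]
forecast = [110.0, 190.0, 310.0]
print(rmse(measured, forecast), cv_rmse(measured, forecast))
```

Because CV(RMSE) is normalized by the mean measured irradiance, it allows comparison across regions with different irradiance levels, whereas RMSE alone is tied to the absolute scale.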

Conclusions
In this study, an LSTM-based learning model was proposed to predict solar irradiance, which is a main input required for the predictive control of buildings. The proposed model has the following three advantages over existing solar irradiance prediction models. First, the proposed LSTM model mostly uses data provided by the weather forecasting system, which can be easily obtained through an application programming interface (API). Second, long-term measured data in the target region are not required for learning. As confirmed in previous studies, most models require long-term measured weather information in a region to predict local solar irradiance. Third, the proposed deep learning model does not require additional learning to update the model once it is constructed. Because existing deep learning models for solar irradiance prediction use data measured in a specific region, they are typically updated by periodically learning newly measured data to improve model performance.
Owing to limitations such as the lack of measurement equipment and high cost, historical data measured over several years cannot be obtained in a small building or a building where MPC is newly applied. Moreover, a small CPU system installed for the MPC application cannot learn a substantial amount of data; therefore, continuous updating is difficult. In this study, a model was developed that requires learning only once and can continuously predict the solar irradiance in a specific region using existing solar irradiance data from other regions or other countries. Additionally, to improve the performance, a method for learning weather data from the previous day was proposed, unlike the existing method that uses only the next day's forecast information and the corresponding solar irradiance data.
The proposed model was verified by using the weather data of five of the six regions in Korea obtained through TMY2 for learning and the solar irradiance data of the remaining region for prediction. The proposed model showed an RMSE of 17.4 W/m² in learning performance and 18 W/m² in predictive performance. According to previous studies, irradiance errors of this magnitude correspond to an error of 2% or less when applied to a building load model, so the model can be used for predictive control. By learning existing patterns of solar irradiance from other countries, the proposed (global) model identified the patterns of solar irradiance fluctuations in a specific region that lacks accumulated data and provided reliable prediction results. The proposed learning model can exhibit excellent predictive performance with an RMSE of 30 W/m² even when using intercontinental weather data from far outside the prediction region; therefore, in practice, it can predict local solar irradiance using data from a region with a well-equipped database built through long-term measurement. The error reported in this verification may increase depending on the forecast accuracy of the Korea Meteorological Administration. However, because the proposed model predicted solar irradiance more accurately in the same test environment than did the reference model based on the existing learning method, an improved predictive performance is expected in the future when this model is applied to the experimental environment of solar irradiance.
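The verification scheme described above, learning from five regions and predicting the remaining one, can be sketched as a simple hold-out split by region. This is a minimal illustration only: the region keys and the function name `hold_out_region` are hypothetical, and the samples are placeholders rather than TMY2 records.

```python
from typing import Dict, List, Tuple


def hold_out_region(data_by_region: Dict[str, List],
                    target: str) -> Tuple[List, List]:
    """Split samples so that the target region is never seen during learning.

    Returns (training samples from all other regions, test samples from target).
    """
    if target not in data_by_region:
        raise KeyError(f"unknown region: {target}")
    train: List = []
    for region, samples in data_by_region.items():
        if region != target:
            train.extend(samples)
    test = list(data_by_region[target])
    return train, test


# Placeholder data: six hypothetical regions with two samples each.
regions = {f"region_{c}": [f"{c}1", f"{c}2"] for c in "ABCDEF"}
train_set, test_set = hold_out_region(regions, "region_F")
print(len(train_set), len(test_set))
```

Keeping the split at the region level, rather than shuffling all samples together, is what allows the reported error to reflect performance when the learning and prediction regions differ.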