Time Series ARIMA Model for Prediction of Daily and Monthly Average Global Solar Radiation: The Case Study of Seoul, South Korea

Forecasting solar radiation has recently become the focus of numerous researchers due to the growing interest in green energy. This study aims to develop a seasonal auto-regressive integrated moving average (SARIMA) model to predict the daily and monthly solar radiation in Seoul, South Korea, based on the hourly solar radiation data obtained from the Korean Meteorological Administration (KMA) over 37 years (1981–2017). The goodness of fit of the model was tested against standardized residuals, the autocorrelation function (ACF), and the partial autocorrelation function (PACF) for residuals. Then, model performance was compared with Monte Carlo simulations by using root mean square errors (RMSEs) and the coefficient of determination (R²) for evaluation. In addition, forecasting was conducted by using the best models with historical data on average monthly and daily solar radiation. The contributions of this study can be summarized as follows: (i) a time series SARIMA model is implemented to forecast the daily and monthly solar radiation of Seoul, South Korea, in consideration of the accuracy, suitability, adequacy, and timeliness of the collected data; (ii) the reliability, accuracy, suitability, and performance of the model are investigated using established tests, namely standardized residuals, ACF, and PACF, and the results are compared with those forecasted by the Monte Carlo method; and (iii) the trend of monthly solar radiation in Seoul for the coming years is analyzed and compared on the basis of the solar radiation data obtained from the KMA over 37 years. The results indicate that the ARIMA (1,1,2) model can be used to represent daily solar radiation, while the seasonal ARIMA (4,1,1) model with a lag of 12 for both the auto-regressive and moving average parts can be used to represent monthly solar radiation. According to the findings, the expected average monthly solar radiation ranges from 176 to 377 Wh/m².


Motivations of the Study
Climate change, global warming, and increasing energy demands have motivated the Korean government to search for economically and environmentally clean energy options [1,2]. In addition, energy security is a key issue, especially in developed and fuel-importing countries such as South Korea. Renewable energy is considered one of the most promising solutions for achieving sustainable development and energy security. Renewable natural resources such as solar radiation are abundantly available in South Korea; they enable the Korean government to pursue its vision of sustainable development and energy security and to follow policies that combine energy, environment, economy, and society [3]. Thus, the Korean government has sought to diversify its energy sources by increasing its dependence on renewable energy sources to enhance energy security and protect the environment [4,5]. The strategic vision of the energy sector in South Korea aims to increase the contribution of renewable energy to the total generated power. In particular, solar energy is expected to increase to 14.2% of total energy production by 2035 [6].

Problem Statement
The variability of solar radiation is one of the most important challenges limiting the use of solar energy. Because solar radiation is extremely variable, it is often described as an unreliable source of energy. In addition, increases in the risk and uncertainty of generating the expected solar energy could inhibit energy security [7]. Competent forecasting and careful analysis of solar radiation can help reduce risk and enable assets to be operated in the most cost-effective manner. With accurate solar radiation forecasting, solar energy can be scheduled and solar power penetration can be increased. Accurate solar radiation forecasting may also exert a significant economic impact on solar power operations and substantially reduce costs [8]. However, estimating solar radiation is not an easy task due to the influence of several factors, including weather conditions, geographic location, season, and humidity [9]. In addition, the continuous availability of data on solar radiation over past years is necessary for successful simulation with high reliability [10].

Literature Review
Numerous forecasting models have been proposed to find an effective method that can be applied to practical situations [11]. These techniques mostly rely on complex statistics, artificial intelligence techniques, and large amounts of meteorological and topographic data [9]. Ideally, these methods minimize the risk of failure within the energy system and forecast its reliability by modeling or simulating future scenarios. The available prediction models can be classified into three main categories: (i) qualitative techniques, (ii) quantitative techniques, and (iii) artificial neural networks (ANNs).
(i) Qualitative techniques are based on expert opinion and/or personal judgment.
(ii) Quantitative techniques are based on mathematical models, which can be further classified as time series or causal forecasting techniques. Causal forecasting is used to identify relationships between dependent and independent variables. The quality of causal forecasting models depends on the accuracy of the input factors. Due to the high fluctuation of the factors affecting solar radiation, however, the availability and accuracy of these models are questionable. Time series forecasting models collect observations x(t_i) over a designated period of time, where every observation corresponds to a specific time t_i, and then predict future outputs according to previous events. Compared with causal forecasting, time series forecasting is flexible and requires fewer data inputs; thus, the technique is easier and cheaper to implement. However, the main limitation of time series forecasting is the lack of a deterministic cause [11]. To overcome this limitation, model developers usually depend on large numbers of inputs or stochastic events.
(iii) ANNs have been used in several studies [12–14]. In addition, the authors of References [15–18] showed that radial basis function neural networks (RBF-NNs) can be applied to a wide range of nonlinear equation sets. Baghaee et al. [15] proposed an RBF-NN for nonlinear mapping, which is exploited to solve the nonlinear equation set of load flow analysis, while References [16–18] applied the RBF-NN technique to microgrids. Despite the benefits of ANNs, however, a previous study [19] investigated the performance of the time series auto-regressive integrated moving average (ARIMA) model in comparison with ANN models and found that the former generally performs better than the latter due to the effect of weather conditions, such as clouds. Solar radiation is partially dependent on various weather, location, and time factors; thus, it displays a type of serial correlation, which suggests that time series forecasting is appropriate for solar radiation forecasting. The unknown working principle, or "black-box" nature, of neural networks limits their applicability in predicting solar radiation.
Numerical weather prediction (NWP) is widely available in meteorological organizations. However, NWP is highly dependent on air quality and hydrological characteristics, which vary strongly with time and are sensitive to location [20]. The direct implementation of NWP in solar radiation forecasting has been criticized [21]. Therefore, we considered developing a time series forecasting technique in this study because of its convenience, accurate prediction, low data input requirement, and simple computational process. The forecast procedure can provide a rapid and standard way to generate forecasts for many time series in a single step. In the past, hundreds of series were forecasted at a time, with the series organized into separate variables or across groups. ARIMA is regarded as a smoothing technique, and it is applicable when the data series is reasonably long and the correlation between past observations is stable [22]. Several studies in the literature have used ARMA and ARIMA models for solar radiation prediction [23–26]. The ARMA and ARIMA models have also been compared in terms of the goodness-of-fit values produced by the log-likelihood function. As a result, the best statistical models and corresponding parameters for solar radiation prediction can be determined comprehensively.
Many feasible comparisons have been conducted for solar radiation prediction. In previous work, however, the prediction task of many models lacked adequacy and timeliness in terms of data collection. Here, a time series ARIMA model is built to forecast the daily and monthly solar radiation of Seoul, South Korea, in consideration of the accuracy, suitability, adequacy, and timeliness of the collected data, which have been obtained from the KMA over 37 years. The reliability, accuracy, suitability, and performance of the model are investigated using established tests, such as standardized residuals, ACF, and PACF. Finally, the obtained results are compared with those forecasted by the Monte Carlo method.

Contributions of the Study
This study aims to build a time series ARIMA model to forecast daily and monthly solar radiation in Seoul, South Korea, based on the hourly solar radiation data obtained from the Korean Meteorological Administration (KMA) over 37 years. The model performance was tested against established tests, such as standardized residuals, ACF and PACF for residuals, and Monte Carlo simulations to ensure model suitability. ACF and PACF were also used to determine the structure of the seasonal ARIMA (SARIMA) model. The main contributions of this study are as follows:
• A time series ARIMA model is built to forecast the daily and monthly solar radiation of Seoul, South Korea, considering the accuracy, suitability, adequacy, and timeliness of the collected data.
• The reliability, accuracy, suitability, and performance of the model are investigated using established tests, such as standardized residuals, ACF, and PACF, and the results are compared with the results forecasted by the Monte Carlo method.
• The trend of monthly solar radiation in Seoul for the coming years is analyzed and compared based on solar radiation data obtained from the KMA over 37 years.

Paper Organization
This study is organized as follows: Section 2 highlights the case study and describes why Seoul is considered in this work. The first part of Section 3 shows the data set used in this study and discusses the daily and monthly solar radiation in Seoul over 37 years, while the second part of the section highlights the forecasting model approach. The performance of the forecasting model, the simulation results, and a discussion are presented in Section 4. The conclusion is given in Section 5.

Case Study: Seoul, South Korea
Seoul, the capital and largest metropolis of South Korea, lies at 37°34′ N latitude and 126°58′ E longitude. This study highlights Seoul because the city has the largest energy consumption compared with other cities in the country. Seoul's energy consumption covers roughly half of the population, as well as most industrial and commercial areas [27]. Seoul consumes 15,496,000 tons of oil equivalent of energy per year, which is 7.5% of South Korea's national total. Of this energy consumption, 56% is attributed to residential and commercial use. In addition, Seoul is dependent on fossil fuels, with oil and liquefied natural gas accounting for 38.9% and 29.7% of the energy mix, respectively. The city has abundant potential for using solar energy [28]. Byrne et al. [29] described and defined a solar city concept and an evaluation method for estimating solar electric potential on rooftops using distributed photovoltaic (PV) systems with Seoul as a case study, and demonstrated a potential equivalent to almost 30% of the city's annual electricity consumption. According to this methodology, approximately 66% of the daytime electricity demand in Seoul can be served by PV systems on a typical day. However, total renewable energy production contributes merely 1.6% of the city's total energy consumption [30]. Thus, Seoul's municipal government has continuously strived to implement new policies to transform Seoul into a leading solar city, in order to save energy in a city with high energy consumption and reduce greenhouse gas emissions, which are the main cause of global warming. To this end, the municipal government has developed a solar map to show areas suitable for solar PV and quantify the savings rooftop installations can deliver, with the aim of increasing citizen engagement in the program, shifting toward the use of sustainable energy sources, and reducing greenhouse gas emissions [31]. The findings of this study may help policymakers frame strong policies based on their understanding of the overall perspective of solar radiation in the coming years.

Data Collection
Solar radiation data for Seoul covering 37 years were obtained from the KMA [32]. Table 1 presents the basic statistics of the collected data. The collected data were first rearranged and adjusted to eliminate probable errors. Data from weather stations were used to evaluate horizontal solar radiation over 37 years. Figure 1 gives an overview of the monthly average of solar radiation in Seoul. The highest average solar radiation, 338.7 Wh/m², was observed in May; the lower adjacent value was 251.1 Wh/m² and the upper adjacent value was 377 Wh/m². The red crosses in Figure 1 mark outliers, most of which fall below the lower adjacent value. These outliers were therefore excluded to reduce their effect on the collected data. The two lower outliers for May were 209.2 and 224.8 Wh/m², which occurred in May 1990 and May 1981, respectively. The lowest average solar radiation, 166.1 Wh/m², was observed in December. The lower adjacent value was 129 Wh/m² in December 1992, and the upper adjacent value was 207.7 Wh/m² in December 1985. The two lower outliers, 98.3 Wh/m² and 113.4 Wh/m², occurred in December 1990 and December 1991, respectively. The largest variation in solar radiation was observed in September, which may be attributed to seasonal weather fluctuations in this month of the year.
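The box-plot quantities quoted above (adjacent values and outliers) can be illustrated with a short sketch. It assumes the common Tukey convention, in which the fences lie 1.5 interquartile ranges beyond the quartiles and the adjacent values are the most extreme observations inside the fences; the paper does not state which convention Figure 1 uses. The function name `box_plot_summary` and the sample values are hypothetical and are not the KMA record.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the median, adjacent values (whisker ends), and outliers.

    Assumes Tukey fences at 1.5 * IQR beyond the quartiles; this is an
    assumption, not a convention confirmed by the paper.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    return {
        "median": q2,
        "lower_adjacent": min(inside),
        "upper_adjacent": max(inside),
        "outliers": outliers,
    }


# Made-up monthly averages (Wh/m^2) for one calendar month across several
# years; chosen only to resemble the description of May, NOT real data.
may_values = [209.2, 224.8, 251.1, 320.0, 330.5, 338.0,
              341.2, 345.0, 350.3, 355.9, 362.4, 377.0]
summary = box_plot_summary(may_values)
print(summary)
```

Under this convention, values below the lower fence are flagged as outliers and can then be dropped or treated before model fitting.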

ARIMA Forecasting Model
Time series forecasting is a multidisciplinary scientific tool used to solve prediction problems. Its implementation is easy and flexible because it only requires historical observations of the necessary variables [33–35]. ARIMA was first presented by Box and Jenkins in 1976 [33]. The general equation of successive differences at the dth difference of X_t is as follows:
$$\Delta^d X_t = (1 - B)^d X_t \qquad (1)$$
where d is the difference order and is usually 1 or 2, and B is the backshift operator.
The successive difference at one time lag equals
$$\Delta X_t = (1 - B) X_t = X_t - X_{t-1} \qquad (2)$$
In this work, the general ARIMA (p, d, q) is briefly expressed as follows [36]:
$$\Phi_p(B)\, W_t = \theta_q(B)\, e_t \qquad (3)$$
where Φ_p(B) is an auto-regressive operator of order p, θ_q(B) is a moving average operator of order q, W_t = Δ^d X_t, and e_t is the random error term. ARIMA modeling was developed using Matlab R2012a (7.14) software; this software was also used to prepare the data and calculate the ACF and PACF. Data preparation included the removal of outliers (as shown in Figure 1), treatment of zero readings, and interpolation of missing data. Model performance was evaluated using root mean square errors (RMSEs, Equation (4)) and the coefficient of determination (R²) [37].
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( X_{t,i} - X_{o,i} \right)^2} \qquad (4)$$
where X_t is the forecasted observation, X_o is the actual observation, and n is the number of observations.
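The two evaluation metrics named above can be sketched in a few lines. The RMSE follows the usual root-mean-square definition; for R², the sketch assumes the common form 1 − SS_res/SS_tot, which the paper does not spell out. The function names and the sample numbers are hypothetical.

```python
import math


def rmse(forecast, actual):
    """Root mean square error between forecasted and actual observations."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)


def r_squared(forecast, actual):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for f, a in zip(forecast, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only (not measured radiation data).
actual = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 310.0, 390.0]
print(rmse(forecast, actual), r_squared(forecast, actual))
```

RMSE is expressed in the units of the series, so values for the daily and monthly models are not directly comparable unless the two series share the same scale.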
Figure 2 shows that the proposed model used in this study can enhance the modeling process and minimize errors.

Result and Discussion
Stationary time series data are a prerequisite for developing and testing an ARIMA model [38]. Therefore, the collected data were processed, and the first difference was applied to make the data stationary. A Phillips-Perron test was performed to check the stationarity of the first-differenced data by using the Matlab function pptest. This test assesses the null hypothesis of a unit root in a univariate time series y_t. Here, all tests use the model y_t = c + δt + a y_{t−1} + e(t). The null hypothesis restricts a to equal 1. The variants of the test, which are appropriate for series with different growth characteristics, restrict the drift and deterministic trend coefficients (i.e., c and δ, respectively) to 0. The tests use modified Dickey-Fuller statistics to account for serial correlations in the innovations process e(t) [39]. The Phillips-Perron test result of h = 1 rejects the unit-root null hypothesis, which indicates that the first difference is sufficient to make the monthly and daily solar radiation data stationary. Furthermore, Figure 3 shows the stationary time series data for both daily and monthly solar radiation after the first difference was applied. The data clearly fluctuate around zero; thus, they may be considered stationary. This finding indicates that the first difference is sufficient and further data treatments are unnecessary.
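The effect of the first difference described above can be illustrated on a synthetic series. The sketch below builds a random walk (a textbook unit-root process) and shows that differencing once recovers the underlying shocks, which fluctuate around zero. It does not run the Phillips-Perron test itself; the function name `difference`, the starting level, and the noise scale are all hypothetical.

```python
import random
from statistics import mean


def difference(series, d=1):
    """Apply the operator (1 - B)^d, i.e. difference the series d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


# Synthetic random walk: each value is the previous value plus a shock.
random.seed(0)
shocks = [random.gauss(0.0, 1.0) for _ in range(500)]
walk = []
level = 250.0  # hypothetical starting level
for shock in shocks:
    level += shock
    walk.append(level)

# After one difference the walk reduces to its shocks (zero-mean noise).
diffed = difference(walk, d=1)
print("mean of differenced series:", mean(diffed))
```

A unit-root test such as Phillips-Perron would then be applied to `diffed` to confirm stationarity formally.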

ARIMA is based on two parts: an auto-regressive (AR) part and a moving average (MA) part. The model is usually referred to as ARIMA (p, d, q), where p and q are the orders of the AR and MA parts, respectively, and d is the difference order [40,41]. The candidate models here take the form ARIMA (p, 1, q), where typically p = 1–5, q = 1–3, and d is usually 1 [42]. ARIMA model identification was done by considering the ACF and PACF of the stationary time series data. The results of the ACF and PACF tests for daily and monthly solar radiation are shown in Figure 4. The ACF and PACF were tested for 60 lags to investigate seasonality.
Figure 4a shows that, for monthly solar radiation, significant autocorrelations (spikes) are present at lags that are multiples of 12, which indicates a seasonal cycle of 12 months. Potentially significant autocorrelations are also present at smaller lags. The PACF decays after the fourth lag, while the ACF decays after the second lag; thus, the suggested SARIMA model is ARIMA (4,1,1) with a seasonal AR term at a lag of 12 and a seasonal MA term at a lag of 12. All possible combinations were attempted to determine the best model with the smallest RMSE. In addition, for daily solar radiation (Figure 4b), the model showed reasonable (statistically significant) results after the second lag of the ACF, with the PACF decaying at a low lag. Thus, the final model was assumed to be ARIMA (1,1,2). Again, all possible combinations were attempted to determine the best model and simplify the modeling process.
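The way a 12-month cycle appears as spikes at lags that are multiples of 12 can be reproduced with a minimal sample-ACF sketch. The series below is a synthetic sinusoid with a 12-step period, not the Seoul data, and the ±1.96/√n band is the usual large-sample approximation for the 95% limits; both are assumptions made for illustration. The PACF is not computed here.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mu = sum(series) / n
    dev = [x - mu for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Synthetic "monthly" series: hypothetical level and amplitude, 10 years long.
series = [250.0 + 80.0 * math.sin(2 * math.pi * m / 12) for m in range(120)]
acf = sample_acf(series, 60)

# Approximate 95% limits under the white-noise assumption.
bound = 1.96 / math.sqrt(len(series))
seasonal_spikes = [k for k in (12, 24, 36, 48, 60) if acf[k] > bound]
print(seasonal_spikes)
```

For a purely periodic input like this one, the positive peaks sit at multiples of the period and shrink slowly with lag, which is the pattern used above to motivate the seasonal terms at a lag of 12.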
Table 2 summarizes the SARIMA (4,1,1) model of monthly solar radiation with an AR lag of 12 and an MA lag of 12. The developed SARIMA model is expressed by Equation (5), with an RMSE of 33.18 and R² = 79%. Table 3 presents the performance of the daily solar radiation model; here, the suggested model is ARIMA (1,1,2), which is given by Equation (6), with an RMSE of 104.26 and R² = 68%, where y_t is the daily average solar radiation (Wh/m²) on day t.
The standardized residual between the predicted and true values of daily and monthly solar radiation was calculated to test the goodness of the model. Figure 5 shows that most of the standardized residuals for both the daily and monthly solar radiation forecasts are between ±2; thus, the standardized residuals can be considered normally distributed, which supports the goodness of the proposed ARIMA models [43]. The presence of both positive and negative standardized residuals also indicates model goodness, since it shows that the predicted values are sometimes higher and sometimes lower than the original values. Although some residuals fall beyond the ±2 limits, these represent a limited number of readings and are consistent with the accepted 95% confidence interval. The Jarque-Bera test is a goodness-of-fit test that determines whether the skewness and kurtosis of the sample data match those of a normal distribution. The null hypothesis of this test is that the input data come from a normal distribution with an unknown mean and variance [44]. For monthly solar radiation, the Jarque-Bera test does not reject the null hypothesis (p = 0.313), which supports the normal distribution of the standardized residuals and the goodness of fit.
The ACF and PACF of the residuals were investigated to determine whether the residuals behave as white noise. Figure 6 shows the ACF and PACF for 40 lags of the residuals of the daily and monthly solar radiation models. The presented 40 lags are adequate to judge model goodness because most of the spikes are within the confidence limits with a tendency to decay. The lags for both daily and monthly radiation are within the critical values represented by the blue and green lines, and the residuals can be considered uncorrelated, which indicates the goodness of the fitted model. A limited number of ACF and PACF values are beyond the critical limits, but these are relatively few and consistent with the accepted 95% confidence interval.
Long-term forecasting for lead times of 10 years and 1 month was conducted using the best models with historical data of average monthly and average daily solar radiation. Figure 7a presents the expected monthly solar radiation for the next 10 years (120 months). The expected average monthly solar radiation values were within the range of 176–377 Wh/m². The general fluctuation trend was maintained, which can be explained by variations in solar radiation due to changes in weather conditions throughout the year. The general trend of monthly solar radiation increased with time, as shown in Figure 7, which may be expected due to increasing UV radiation levels as a result of climate change and ozone layer depletion [45]. The monthly solar radiation is expected to reach 377 Wh/m². Figure 7b shows the expected average daily radiation for a month in advance. The fluctuation present in the measured average daily radiation is preserved in the forecasted values. The fluctuations of the forecasted average daily radiation seemed larger than those of the measured inputs or of the forecasted monthly solar radiation.
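The Jarque-Bera normality check applied to the standardized residuals earlier in this section can be sketched as follows. The statistic is JB = n/6 · (S² + (K − 3)²/4), with S the sample skewness and K the sample kurtosis. The p-value here uses the asymptotic chi-square distribution with two degrees of freedom, for which the survival function is exp(−JB/2); Matlab's implementation may use tabulated small-sample critical values instead, so results need not match exactly. The function name and both samples are hypothetical.

```python
import math
import random


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a list of numbers."""
    n = len(sample)
    mu = sum(sample) / n
    m2 = sum((x - mu) ** 2 for x in sample) / n
    m3 = sum((x - mu) ** 3 for x in sample) / n
    m4 = sum((x - mu) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square survival function, 2 d.o.f.
    return jb, p_value


random.seed(7)
# Synthetic stand-ins for residuals; neither sample comes from the paper.
normal_like = [random.gauss(0.0, 1.0) for _ in range(500)]
skewed = [random.expovariate(1.0) for _ in range(500)]

jb_normal, p_normal = jarque_bera(normal_like)
jb_skewed, p_skewed = jarque_bera(skewed)
print("normal-like sample: p =", p_normal)
print("skewed sample:      p =", p_skewed)
```

A p-value above the chosen significance level (commonly 0.05) means that normality is not rejected, which is how the reported p = 0.313 is read.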
Finally, comparisons of the ARIMA model with Monte Carlo simulations of monthly and daily solar radiation for the next year and next month are shown in Figure 8. Although the Monte Carlo simulation presents relatively larger fluctuations, the means of this model and the ARIMA model are virtually indistinguishable. The upper and lower lines in Figure 8a,b represent 95% confidence intervals. Slight discrepancies were observed between the theoretical 95% forecast intervals and the simulation-based 95% forecast intervals. As in the real measured data, May showed the highest solar radiation, with an average value of about 346.2 Wh/m². The lowest monthly solar radiation was expected in December (184.3 Wh/m²), which is also the month with the lowest solar radiation in the actual readings. However, the general trend of monthly solar radiation increased with time. For instance, an increase in average solar radiation of about 2.5% was recorded for May, while the increase in average solar radiation for December was about 10%.
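The comparison between a point forecast and Monte Carlo simulation can be sketched generically for an ARIMA (1,1,2) process. The sketch writes the differenced series as w_t = φ w_{t−1} + e_t + θ₁ e_{t−1} + θ₂ e_{t−2} with no constant term, simulates many future paths with Gaussian shocks, and compares the path mean and empirical 95% band with the shock-free recursion. Every number below (coefficients, noise scale, starting state) is hypothetical and is not taken from the fitted model in Table 3; the sign convention for the MA terms is also an assumption.

```python
import random
from statistics import mean

# Hypothetical parameters, for illustration only.
PHI, THETA1, THETA2, SIGMA = 0.5, -0.6, -0.2, 30.0


def simulate_path(y_last, w_last, e_last, e_prev, horizon, rng, noise=True):
    """Generate one future path; with noise=False this is the point forecast."""
    y, w, e1, e2 = y_last, w_last, e_last, e_prev
    path = []
    for _ in range(horizon):
        e = rng.gauss(0.0, SIGMA) if noise else 0.0
        w = PHI * w + e + THETA1 * e1 + THETA2 * e2  # ARMA(1,2) on differences
        y += w                                       # undo the first difference
        path.append(y)
        e1, e2 = e, e1                               # shift the shock history
    return path


rng = random.Random(42)
horizon, n_paths = 30, 2000
state = dict(y_last=300.0, w_last=5.0, e_last=2.0, e_prev=-1.0)

point_forecast = simulate_path(**state, horizon=horizon, rng=rng, noise=False)
paths = [simulate_path(**state, horizon=horizon, rng=rng) for _ in range(n_paths)]

mc_mean, lower, upper = [], [], []
for step in range(horizon):
    values = sorted(p[step] for p in paths)
    mc_mean.append(mean(values))
    lower.append(values[int(0.025 * n_paths)])       # empirical 2.5% quantile
    upper.append(values[int(0.975 * n_paths) - 1])   # empirical 97.5% quantile

print("day 30 point forecast:", point_forecast[-1])
print("day 30 Monte Carlo mean:", mc_mean[-1])
```

Because the model is linear, the simulated mean converges to the point forecast as the number of paths grows, while individual paths fluctuate much more; the empirical band also widens with the horizon.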

Conclusions
Forecasting solar radiation has become a focus of many researchers owing to the growing interest in green energy. Policy makers need forecasts that are as accurate as possible in order to frame sound policies, and they need to understand the strengths, weaknesses, opportunities, and challenges associated with the predictions. The ARIMA model is well suited to predicting daily and monthly solar radiation because it is convenient and accurate, requires little input data, and is computationally simple. In this work, the ARIMA model was used to predict the monthly average solar radiation with a seasonal lag that captures the seasonality and monthly cyclic nature of solar radiation in Seoul, South Korea. A large dataset of solar radiation in Seoul covering the past 37 years was used for training and testing.

To validate the simulation results and check their stability, the goodness of fit of the model was assessed with the RMSE, the coefficient of determination (R², the share of explained variance), the Phillips-Perron test, and the Jarque-Bera test; the standardized residuals indicate that the model residuals are uncorrelated and normally distributed. The model passed these tests, and the results demonstrate that ARIMA can provide accurate monthly and daily solar radiation predictions, especially given the availability of 37 years of solar radiation data. The RMSE was 33.18 and R² was 79% for the monthly solar radiation model, whereas the RMSE was 104.26 and R² was 68% for the daily model. An R² value higher than 50% was taken to indicate excellent performance of the prediction model. The Jarque-Bera test was used to examine the null hypothesis that the standardized residuals are normally distributed. The null hypothesis was not rejected (p-value = 0.313), which is consistent with normally distributed standardized residuals and supports the goodness of fit. The standardized residuals also show that the model can effectively predict solar radiation on a monthly basis.

In addition, the ARIMA model was compared with Monte Carlo simulations of monthly and daily solar radiation. The results show that the average monthly solar radiation fluctuates around approximately 250 Wh/m², which can serve as a reference figure for estimating potential solar power and for feasibility calculations. Furthermore, the expected average monthly solar radiation ranges from 176 Wh/m² (December) to 377 Wh/m² (May), which is consistent with the general trends of the highest and lowest monthly values and daily fluctuations. These findings should be considered in sustainable planning, especially in the field of solar power generation.
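
As an illustration of the goodness-of-fit measures named above, the following minimal sketch computes the RMSE, R², and the Jarque-Bera statistic with NumPy. It is not the authors' code: the function names and the synthetic data are assumptions made for this example, and the p-value uses the asymptotic chi-square distribution with two degrees of freedom, which is only approximate for small samples.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def jarque_bera(residuals):
    """Jarque-Bera statistic and asymptotic p-value.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and
    K the sample kurtosis. Under the null hypothesis of normality, JB is
    asymptotically chi-square with 2 degrees of freedom, whose survival
    function is exp(-x / 2).
    """
    r = np.asarray(residuals, dtype=float)
    n = r.size
    centered = r - r.mean()
    m2 = np.mean(centered ** 2)
    skew = np.mean(centered ** 3) / m2 ** 1.5
    kurt = np.mean(centered ** 4) / m2 ** 2
    stat = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = float(np.exp(-stat / 2.0))
    return float(stat), p_value


if __name__ == "__main__":
    # Synthetic stand-in for observed and fitted monthly values (hypothetical data).
    rng = np.random.default_rng(42)
    months = np.arange(120)
    fitted = 250.0 + 90.0 * np.sin(2.0 * np.pi * months / 12.0)
    observed = fitted + rng.normal(0.0, 30.0, size=months.size)

    print("RMSE:", rmse(observed, fitted))
    print("R^2:", r_squared(observed, fitted))
    print("Jarque-Bera (stat, p):", jarque_bera(observed - fitted))
```

A large p-value means only that normality of the residuals is not rejected at the chosen significance level; it does not prove that the residuals are normal.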

Figure 2. Flowchart of the proposed procedure.

Figure 3. First difference of the monthly and daily solar radiation data. (a) Monthly; (b) daily.

Figure 4. ACF and PACF of the first difference of the monthly and daily solar radiation data. (a) Monthly; (b) daily.

Figure 6. ACF and PACF of the residuals for monthly and daily solar radiation. (a) Monthly; (b) daily.

Figure 7. Solar radiation forecasts for (a) 10 years ahead and (b) 1 month ahead.

Finally, comparisons of the ARIMA model with Monte Carlo simulations of monthly and daily solar radiation for the next year and next month are shown in Figure 8. Although the Monte Carlo simulation presents relatively larger fluctuations, its mean and that of the ARIMA model are virtually indistinguishable. The upper and lower lines in Figure 8a,b represent the 95% confidence intervals. Slight discrepancies were observed between the theoretical 95% forecast intervals and the simulation-based 95% forecast intervals.
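
The comparison between theoretical and simulation-based forecast intervals can be illustrated with a deliberately simplified sketch. It assumes a zero-mean AR(1) process with Gaussian shocks rather than the models fitted in this study; the function names and the parameter values (`phi`, `sigma`, `last_value`) are hypothetical. The theoretical interval uses the closed-form h-step forecast variance, while the Monte Carlo interval uses the 2.5th and 97.5th percentiles of simulated paths.

```python
import numpy as np


def ar1_theoretical_interval(last_value, phi, sigma, horizon, z=1.96):
    """Closed-form forecast mean and 95% interval for x_t = phi * x_{t-1} + e_t.

    Requires |phi| < 1. The h-step forecast variance is
    sigma^2 * (1 - phi^(2h)) / (1 - phi^2).
    """
    steps = np.arange(1, horizon + 1)
    mean = last_value * phi ** steps
    var = sigma ** 2 * (1.0 - phi ** (2 * steps)) / (1.0 - phi ** 2)
    half_width = z * np.sqrt(var)
    return mean, mean - half_width, mean + half_width


def ar1_monte_carlo_interval(last_value, phi, sigma, horizon,
                             n_paths=20000, seed=0):
    """Simulation-based forecast mean and 95% interval for the same AR(1)."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(0.0, sigma, size=(n_paths, horizon))
    paths = np.empty((n_paths, horizon))
    previous = np.full(n_paths, float(last_value))
    for t in range(horizon):
        # Advance every simulated path by one step.
        previous = phi * previous + shocks[:, t]
        paths[:, t] = previous
    lower, upper = np.percentile(paths, [2.5, 97.5], axis=0)
    return paths.mean(axis=0), lower, upper


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    mean_t, lower_t, upper_t = ar1_theoretical_interval(2.0, 0.6, 1.0, 12)
    mean_s, lower_s, upper_s = ar1_monte_carlo_interval(2.0, 0.6, 1.0, 12)
    print("max |mean difference|:", np.max(np.abs(mean_s - mean_t)))
    print("max |lower bound difference|:", np.max(np.abs(lower_s - lower_t)))
    print("max |upper bound difference|:", np.max(np.abs(upper_s - upper_t)))
```

With a finite number of simulated paths, the percentile-based bounds differ slightly from the closed-form bounds, and the difference shrinks as `n_paths` grows.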

Figure 8. Comparison of ARIMA forecasts versus the Monte Carlo simulation model. (a) Monthly; (b) daily.

Table 1. Summary of the descriptive statistics of the solar radiation data from KMA.

Table 2. Model performance in terms of monthly solar radiation.

Table 3. Model performance in terms of daily solar radiation.