# Comprehensive and Comparative Analysis of GAM-Based PV Power Forecasting Models Using Multidimensional Tensor Product Splines against Machine Learning Techniques

## Abstract

## 1. Introduction

## 2. PV Power Forecasting Models based on GAM

#### 2.1. Area-Wide PV Power Generation Forecasting Model

- (i) Using the actual area-wide PV power generation capacity ${W}_{t}$ (observed no more frequently than monthly), the daily increasing trend is estimated with a linear model, yielding the daily forecast of PV power generation capacity ${\widehat{W}}_{t}$ (see Appendix A).
- (ii) The unit power generation ${U}_{t,h}$ is obtained by dividing the measured hourly area-wide PV power generation ${V}_{t,h}$ by ${\widehat{W}}_{t}$ (i.e., ${U}_{t,h}:={V}_{t,h}/{\widehat{W}}_{t}$).
- (iii) For ${U}_{t,h}$, a GAM is built with calendar information, general weather condition forecasts, and maximum/minimum temperature forecasts as explanatory variables.
- (iv) The out-of-sample forecast unit power generation ${\widehat{U}}_{t,h}$ is obtained by substituting the observed explanatory variables into the estimated GAM forecast formula (the predictor part of the GAM).
- (v) The out-of-sample forecast power generation ${\widehat{V}}_{t,h}$ is obtained as the product of the forecast PV power capacity ${\widehat{W}}_{t}$ and the forecast unit power generation ${\widehat{U}}_{t,h}$.
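The five steps above can be sketched end to end. The following is a minimal numpy illustration with hypothetical capacity and generation values; a simple hour-of-day average stands in for the GAM of steps (iii)–(iv), which is not the paper's actual model:

```python
import numpy as np

# Step (i): fit a daily linear trend to monthly capacity observations W_t.
# Day indices and capacities (MW) below are hypothetical illustration values.
obs_days = np.array([0.0, 30.0, 61.0, 91.0])
obs_capacity = np.array([1000.0, 1020.0, 1045.0, 1070.0])
slope, intercept = np.polyfit(obs_days, obs_capacity, 1)

days = np.arange(92)
W_hat = intercept + slope * days          # daily capacity forecast \hat{W}_t

# Step (ii): unit power generation U_{t,h} = V_{t,h} / \hat{W}_t.
rng = np.random.default_rng(0)
hours = np.arange(24)
profile = np.clip(np.sin((hours - 5) * np.pi / 14), 0, None)  # daylight shape
V = W_hat[:, None] * 0.3 * profile[None, :] + rng.normal(0.0, 1.0, (92, 24))
U = V / W_hat[:, None]

# Steps (iii)-(iv): the paper fits a GAM for U on calendar/weather features;
# an hour-of-day average is used here as a minimal placeholder model.
U_hat = U.mean(axis=0)

# Step (v): forecast generation \hat{V}_{t,h} = \hat{W}_t * \hat{U}_{t,h}.
V_hat = W_hat[:, None] * U_hat[None, :]
print(V_hat.shape)
```

Normalizing by capacity in step (ii) is what lets one model absorb the growth in installed capacity; the forecast in step (v) simply scales the per-unit profile back up by the trended capacity.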

#### 2.2. Individual PV Power Generation Forecasting Model

## 3. Machine Learning Methods to Be Compared

**k-nearest neighbor (kNN):** kNN [13] is one of the simplest yet most effective ML algorithms [29]. It can be used for both classification and regression problems. The main idea is to use the proximity of features to predict the value of new data points. In classification, the class of an object is determined by the votes of its neighbors (i.e., the most common class among the $k$ nearest neighbors is assigned to the object). kNN regression, on the other hand, uses the average of the values of the $k$ nearest neighbors, or their inverse-distance-weighted average, as the predicted result. The algorithm measures the distance between the target point and the points in the dataset, usually the Euclidean distance (which is also used by caret’s “knn”); other distances, such as the Manhattan distance, can also be used. A known challenge of kNN methods is their sensitivity to the local structure of the data.

**Artificial Neural Networks (ANN):** An ANN [9] is “a mathematical model or computational model based on biological neural networks; in other words, it is an emulation of a biological neural system” [30]. The perceptron is the starting point of the network-formation procedure: the input is multiplied by a set of weights and passed through an activation function of choice (linear, logistic, hyperbolic tangent, or ReLU) to produce the output. A neural network is a multilayer perceptron model, a cascade of perceptron layers: an input layer, one or more hidden layers, and an output layer. Data are received in the input layer and the final output is generated in the output layer; the hidden layers, located between them, are where the intermediate computations occur.

**Support vector machine (SVM)/support vector regression (SVR):** SVM is considered “one of the most robust and accurate methods among all well-known algorithms” [31]. For classification, SVMs learn the boundary that separates a given training sample with the largest margin (the distance between the boundary and the data). A distinctive feature of SVM is that it can be combined with the kernel method [32] for nonlinear data analysis: by mapping the data into a finite- (or infinite-) dimensional feature space with a kernel function and performing linear separation in that space, the method can be applied to nonlinear classification problems. When SVM is used for regression, known as support vector regression (SVR) and originally proposed in [10], it inherits several properties of the SVM classifier: the problem is solved so that the prediction error and the weights (regression coefficients) of the mapping functions are minimized simultaneously (see [33] for the formula), which prevents overfitting in a manner similar to ridge regression [34]. SVR has a structure similar to that of kernel ridge regression (KRR), but is unique in that the prediction error is evaluated with the “(linear) ε-insensitive loss function” (see, e.g., Figure 1 of [35]); in this respect it differs from KRR, whose loss function is the squared error [36].

**Random Forest (RF):** RF [14] is an ensemble model that combines several prediction models called decision trees. It is called a “forest” because it consists of a large number of decision trees, and “random” because each tree (classification or regression tree) is constructed from $k$ (given in advance) randomly chosen explanatory variables instead of all of them. In random forest regression, when new data are given, each regression tree makes an individual prediction, and these predictions are averaged to produce the final output [14]. The random forest regressor can solve complex problems on a variety of datasets and provides an unbiased estimate of the generalization error; however, it can overfit on some datasets, particularly in noisy classification/regression tasks [37].

## 4. Empirical Analysis

#### 4.1. Area-Wide PV Power Generation Forecasting Model

- PV power generation volume ${V}_{t,h}$ (MW): published by nine electricity power companies (e.g., data for the Tokyo area was downloaded from [39])
- PV power capacity ${W}_{t}$ (MW): month-end results published by the Ministry of Economy, Trade and Industry [40]
- Weather condition dummy ${I}_{.,t,h}$, max (min) temperature $Tma{x}_{t}$ ($Tmi{n}_{t}$) (°C): forecast values (of one major city in each of the nine areas) announced by the JMA on the previous morning [41]

#### 4.1.1. Estimated Trend

#### 4.1.2. Comparison of Forecast Accuracy

#### 4.2. Individual PV Power Generation Forecasting Model

- PV power generation volume ${V}_{t,h}$ (MW): measured value of the household’s solar power system (with the permission of the owner, we use the data of a private roof-mounted power system in Hiroshima city, Japan).
- Solar radiation ${R}_{t,h}$ (MJ/m²): measured solar radiation in Hiroshima City as published by the JMA [45].

#### 4.2.1. Estimated Trend

#### 4.2.2. Comparison of Forecast Accuracy

#### Comparison of Forecast Accuracy among Statistical Models

#### Comparison of Forecasting Accuracy between GAM and ML Methods

## 5. Conclusions

- We constructed different GAM-based forecasting models for area-wide PV power generation and individual PV power generation, and demonstrated the effectiveness of the models by visualizing the estimated trends and providing reasonable interpretations.
- For the individual PV power generation model, we constructed a new forecasting model using 3D tensor product splines and demonstrated its effectiveness. We quantitatively demonstrated that the robustness and forecasting accuracy of the model increased when smoothing (nonlinear) conditions were incorporated in three directions by comparing it with linear models.
- By comparing the proposed GAM-based models with other popular ML methods, such as kNN, ANN, SVR, and RF for each PV power model, it was shown that the GAM-based models have advantages in terms of computational speed and forecast error minimization. Specifically, we have shown that the GAM-based model is highly effective for global nonlinear trend completion.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Nomenclature

Symbol | Description
---|---
${V}_{t,h}$ | measured PV power generation volume at date $t$, hour $h$
${W}_{t}$ | installed PV power capacity at date $t$
${U}_{t,h}$ | unit power generation at date $t$, hour $h$
${I}_{.,t,h}$ | dummy variables, equal to 1 if the forecast general weather condition at date $t$, hour $h$ matches the suffix’s weather condition, and 0 otherwise
$Seasona{l}_{t}$ | yearly cyclical dummy variable $\left(=1,\dots,365~(\text{or }366)\right)$
$Tma{x}_{t}$, $Tmi{n}_{t}$ | previous day’s maximum or minimum temperature forecast for date $t$
${\epsilon}_{tmax,t}$, ${\epsilon}_{tmin,t}$ | maximum or minimum temperature forecast deviation at date $t$ (observed temperature forecast minus its trend)
${u}_{.}\left(t,h\right)$ | 2D tensor product spline functions estimated by the GAM of the area-wide PV power forecast model ($t$ denotes $Seasona{l}_{t}$)
${f}_{.}\left(t\right)$ | univariate spline functions estimated by the GAM of the temperature trend model ($t$ denotes $Seasona{l}_{t}$)
${\eta}_{t,h}$ | residual terms with mean 0
${R}_{t,h}$ | forecast solar radiation at date $t$, hour $h$
$\beta$, $\alpha$ | coefficients and constant terms of the individual PV power generation models when each model is viewed as a linear regression on solar radiation (constants for M0 and M1; variables defined by a 2D tensor product spline function for M2)
$v\left(t,h,{R}_{t,h}\right)$ | 3D tensor product spline function estimated by the GAM of the individual PV power forecast model ($t$ denotes $Seasona{l}_{t}$)
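For reference, the last symbols combine into the individual (M3) forecasting model. The display below is a sketch assembled from the Nomenclature definitions, and may differ in minor notation from the equation given in the body of the paper:

```latex
% Individual PV power forecasting model M3 (3D tensor product spline),
% assembled from the Nomenclature entries; a sketch, not a verbatim quotation.
V_{t,h} = v\!\left(\mathit{Seasonal}_t,\, h,\, R_{t,h}\right) + \eta_{t,h}
```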

## Appendix A. Installed Capacity Trend Estimation for Area PV Generation Forecasting

## Appendix B. Smoothing Spline Functions

## Appendix C. Hyperparameters to be Tuned for the Caret Package

Model | Method Value | Package | Tuning Parameter | Description
---|---|---|---|---
kNN | knn | caret [28] | k | Number of neighbors considered
ANN | nnet | nnet [52] | decay | The parameter for weight decay
 | | | size | Number of units in the hidden layer
SVM (SVR) | svmRadial | kernlab [53] | sigma | The inverse kernel width used by the Gaussian kernel
 | | | C | The cost regularization parameter, which controls the smoothness
RF | rf | randomForest [54] | mtry | Number of variables randomly sampled as candidates at each split
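In the paper these parameters are tuned through caret’s resampling in R. Purely as an illustration of the idea (not the authors’ code), a scikit-learn analog that tunes kNN’s k by 5-fold cross-validation could look like:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic data; the grid over k mirrors caret's tuning of "k" for method "knn".
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (200, 2))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0.0, 0.1, 200)

search = GridSearchCV(KNeighborsRegressor(),
                      param_grid={"n_neighbors": [3, 5, 7, 9, 11]},
                      cv=5, scoring="neg_root_mean_squared_error")
search.fit(X, y)
print(search.best_params_)   # the k with the lowest cross-validated RMSE
```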

## Appendix D. Estimated Trends for Area-Wide PV Power Generation Model

**Figure A1.** Estimated trends for the area-wide PV power generation model (all nine areas). Note: The unit of the vertical axis for “Sunny,” “Cloudy,” “Rainy” and “Snowy” is (%), and that for “Temp_max” and “Temp_min” is (%/°C). “Seasonal” denotes a yearly cyclical dummy variable (see Section 2.1), and “Hour” denotes the time (o’clock).

## Appendix E. Trend Estimation Results for the M3 Model (Estimated from 9 Months of In-Sample Data)

**Figure A2.** Trend estimation results for the M3 model (estimated from 9 months of in-sample data). Note: The units of the vertical axis and the “Radiation” axis are (MW) and (MJ/m²), respectively. “Seasonal” denotes a yearly cyclical dummy variable (see Section 2.1).

## References

1. Matsumoto, T.; Yamada, Y. Prediction method for solar power business based on forecasted general weather conditions and periodic trends by weather. Trans. Mater. Res. Soc. Jpn. **2019**, 62, 1–22. (In Japanese)
2. Antonanzas, J.; Osorio, N.; Escobar, R.; Urraca, R.; Martinez-de-Pison, F.J.; Antonanzas-Torres, F. Review of photovoltaic power forecasting. Sol. Energy **2016**, 136, 78–111.
3. Pelland, S.; Remund, J.; Kleissl, J.; Oozeki, T. Photovoltaic and Solar Forecasting: State of the Art; Report PVPS T14-10; International Energy Agency: Paris, France, 2013.
4. Kato, T. Development of forecasting method of photovoltaic power output. J. IEEJ **2017**, 137, 101–104. (In Japanese)
5. Raza, M.Q.; Nadarajah, M.N.; Ekanayake, C. On recent advances in PV output power forecast. Sol. Energy **2016**, 136, 125–144.
6. Mosavi, A.; Salimi, M.; Faizollahzadeh Ardabili, S.; Rabczuk, T.; Shamshirband, S.; Varkonyi-Koczy, A.R. State of the art of machine learning models in energy systems, a systematic review. Energies **2019**, 12, 1301.
7. Das, U.K.; Tey, K.S.; Seyedmahmoudian, M.; Mekhilef, S.; Idris, M.Y.I.; Van Deventer, W.; Horan, B.; Stojcevski, A. Forecasting of photovoltaic power generation and model optimization: A review. Renew. Sustain. Energy Rev. **2018**, 81, 912–928.
8. Sobri, S.; Koohi-Kamali, S.; Rahim, N.A. Solar photovoltaic generation forecasting methods: A review. Energy Convers. Manag. **2018**, 156, 459–497.
9. Ripley, B.D. Pattern Recognition and Neural Networks; Cambridge University Press: Cambridge, UK, 2007.
10. Drucker, H.; Burges, C.C.; Kaufman, L.; Smola, A.J.; Vapnik, V.N. Support vector regression machines. In Advances in Neural Information Processing Systems 9, NIPS 1997; MIT Press: Cambridge, MA, USA, 1996; pp. 155–161.
11. Maitanova, N.; Telle, J.S.; Hanke, B.; Grottke, M.; Schmidt, T.; Maydell, K.V.; Agert, C. A machine learning approach to low-cost photovoltaic power prediction based on publicly available weather reports. Energies **2020**, 13, 735.
12. Mohammed, A.A.; Aung, Z. Ensemble learning approach for probabilistic forecasting of solar power generation. Energies **2016**, 9, 1017.
13. Altman, N.S. An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. **1992**, 46, 175–185.
14. Breiman, L. Random forests. Mach. Learn. **2001**, 45, 5–32.
15. Das, U.K.; Tey, K.S.; Seyedmahmoudian, M.; Idris, M.Y.I.; Mekhilef, S.; Horan, B.; Stojcevski, A. SVR-based model to forecast PV power generation under different weather conditions. Energies **2017**, 10, 876.
16. Rosato, A.; Altilio, R.; Araneo, R.; Panella, M. Prediction in photovoltaic power by neural networks. Energies **2017**, 10, 1003.
17. Khandakar, A.; Chowdhury, M.E.H.; Khoda Kazi, M.; Benhmed, K.; Touati, F.; Al-Hitmi, M.; Gonzales, J.S. Machine learning based photovoltaics (PV) power prediction using different environmental parameters of Qatar. Energies **2019**, 12, 2782.
18. Nespoli, A.; Ogliari, E.; Leva, S.; Massi Pavan, A.; Mellit, A.; Lughi, V.; Dolara, A. Day-ahead photovoltaic forecasting: A comparison of the most effective techniques. Energies **2019**, 12, 1621.
19. Abdel-Nasser, M.; Mahmoud, K. Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Comput. Appl. **2019**, 31, 2727–2740.
20. Matsumoto, T. Forecast Based Risk Management for Electricity Trading Market. Ph.D. Thesis, University of Tsukuba, Tokyo, Japan, 2020.
21. Hastie, T.; Tibshirani, R. Generalized Additive Models; Chapman and Hall: London, UK, 1990.
22. Matsumoto, T.; Yamada, Y. Construction of forecast model for power demand and PV power generation using tensor product spline function. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2021; Volume 812, p. 012001.
23. Sundararajan, A.; Ollis, B. Regression and generalized additive model to enhance the performance of photovoltaic power ensemble predictors. IEEE Access **2021**, 9, 111899–111914.
24. Gadiwala, M.S.; Usman, A.; Akhtar, M.; Jamil, K. Empirical models for the estimation of global solar radiation with sunshine hours on horizontal surface in various cities of Pakistan. Pak. J. Meteorol. **2013**, 9, 43–55.
25. Wood, S.N. Generalized Additive Models: An Introduction with R; Chapman and Hall: London, UK, 2017.
26. Matsumoto, T.; Yamada, Y. Customized yet Standardized temperature derivatives: A non-parametric approach with suitable basis selection for ensuring robustness. Energies **2021**, 14, 3351.
27. Japan Meteorological Business Support Center. Available online: http://www.jmbsc.or.jp/jp/online/file/f-online10200.html (accessed on 26 October 2021).
28. Kuhn, M.; Wing, J.; Weston, S.; Williams, A.; Keefer, C.; Engelhardt, A.; Cooper, T.; Mayer, Z.; Kenkel, B.; Benesty, M.; et al. Package Caret. Available online: https://cran.r-project.org/web/packages/caret/caret.pdf (accessed on 26 October 2021).
29. Cherkassky, V.; Mulier, F.M. Learning from Data: Concepts, Theory, and Methods; John Wiley & Sons: Hoboken, NJ, USA, 2007.
30. Singh, Y.; Chauhan, A.S. Neural networks in data mining. J. Theor. Appl. Inf. Technol. **2009**, 5, 37–42.
31. Wu, X.; Kumar, V.; Quinlan, J.R.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, G.; Ng, A.; Liu, B.; Yu, P.; et al. Top 10 algorithms in data mining. Knowl. Inf. Syst. **2008**, 14, 1–37.
32. Aizerman, M. Theoretical foundations of the potential function method in pattern recognition learning. Autom. Remote Control **1964**, 25, 821–837.
33. Friedman, J.; Hastie, T.; Tibshirani, R. The Elements of Statistical Learning; Springer Series in Statistics: New York, NY, USA, 2001; Volume 1. Available online: https://web.stanford.edu/~hastie/ElemStatLearn/printings/ESLII_print12_toc.pdf (accessed on 26 October 2021).
34. Marquardt, D.W.; Snee, R.D. Ridge regression in practice. Am. Stat. **1975**, 29, 3–20.
35. Smola, A.J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. **2004**, 14, 199–222.
36. Scikit-learn. Kernel Ridge Regression. 2018. Available online: http://scikit-learn.org/stable/modules/kernel_ridge.html (accessed on 26 October 2021).
37. Dutta, A.; Dureja, A.; Abrol, S.; Dureja, A. Prediction of ticket prices for public transport using linear regression and random forest regression methods: A practical approach using machine learning. In International Conference on Recent Developments in Science, Engineering and Technology; Springer: Singapore, 2019; pp. 140–150.
38. Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. **2008**, 28, 1–26.
39. Tokyo Electric Power Co. “Announcement of Area Supply and Demand Results”. Available online: https://www.tepco.co.jp/forecast/html/area_data-j.html (accessed on 26 October 2021).
40. Ministry of Economy, Trade and Industry. “Feed-In Tariff (FIT) Program Information Release Website”. Available online: https://www.fit-portal.go.jp/PublicInfoSummary (accessed on 26 October 2021).
41. Changes in Weather Forecast. Available online: http://weather-transition.gger.jp/ (accessed on 26 October 2021).
42. Campbell, J.Y.; Thompson, S.B. Predicting excess stock returns out of sample: Can anything beat the historical average? Rev. Financ. Stud. **2008**, 21, 1509–1531.
43. Kuhn, M. Parallel Processing. Available online: https://topepo.github.io/caret/parallel-processing.html (accessed on 27 September 2021).
44. Kaggle, mlcourse.ai. Open Machine Learning Course by OpenDataScience. Available online: https://www.kaggle.com/kashnitsky/topic-5-ensembles-part-2-random-forest (accessed on 26 October 2021).
45. Japan Meteorological Agency. “Historical Weather Data Download”. Available online: https://www.data.jma.go.jp/gmd/risk/obsdl/ (accessed on 26 October 2021).
46. Takuji, M.; Misao, E. One-week-ahead electricity price forecasting using weather forecasts, and its application to arbitrage in the forward market: An empirical study of the Japan electric power exchange. J. Energy Mark. **2021**, 14, 1–26.
47. Smits, G.F.; Jordaan, E.M. Improved SVM regression using mixtures of kernels. In Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN’02), Honolulu, HI, USA, 12–17 May 2002; IEEE: New York, NY, USA, 2002; Volume 3, pp. 2785–2790.
48. Wood, S.N. Package “mgcv”. 2019. Available online: https://cran.r-project.org/web/packages/mgcv/mgcv.pdf (accessed on 15 October 2020).
49. Matsumoto, T.; Yamada, Y. Simultaneous hedging strategy for price and volume risks in electricity businesses using energy and weather derivatives. Energy Econ. **2021**, 95, 105101.
50. Matsumoto, T.; Yamada, Y. Cross hedging using prediction error weather derivatives for loss of solar output prediction errors in electricity market. Asia-Pac. Financ. Mark. **2018**, 26, 211–227.
51. Kuhn, M. The Caret Package. J. Stat. Softw. **2009**, 28. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.216.2142&rep=rep1&type=pdf (accessed on 26 October 2021).
52. Ripley, B.; Venables, W.; Ripley, M.B. Package ‘nnet’. R Package Version. 2016. Available online: https://cran.r-project.org/web/packages/nnet/nnet.pdf (accessed on 26 October 2021).
53. Karatzoglou, A.; Smola, A.; Hornik, K.; Karatzoglou, M.A. Package ‘Kernlab’. CRAN R Project. 2019. Available online: https://cran.r-project.org/web/packages/kernlab/kernlab.pdf (accessed on 26 October 2021).
54. Breiman, L.; Cutler, A. Package ‘RandomForest’; University of California, Berkeley: Berkeley, CA, USA, 2018. Available online: https://cran.r-project.org/web/packages/randomForest/randomForest.pdf (accessed on 26 October 2021).

**Figure 1.** Estimation procedures and data periods in the area-wide PV power generation forecasting model. Note: In our empirical analysis, we use the in-sample period from 1 April 2016 to 31 December 2017, and the out-of-sample period from 1 January to 31 December 2018 (as described in Section 4.1). When using the ML methods introduced in Section 3, the only difference is that each ML model is adopted instead of GAM (1) in steps (iii) and (iv); the other steps are the same.

**Figure 2.** Estimated trends for the area-wide PV power generation model (example of the Tokyo area). Note: The unit of the vertical axis for “Sunny,” “Cloudy” and “Rainy” is (%), and that for “Temp_max” and “Temp_min” is (%/°C). “Seasonal” denotes a yearly cyclical dummy variable (see Section 2.1), and “Hour” denotes the time (o’clock).

**Figure 5.** The relationship between in-sample and out-of-sample periods of each validation case for the individual PV power generation forecasting model. Note: “Default validation case” corresponds to Figure 6, Figure 7 and Table 1; “Additional validation case 1” corresponds to Figure 8 and Figure A2; “Additional validation case 2” corresponds to Figure 9.

**Figure 6.** Trend estimation results for the M3 model (estimated from 5 years of in-sample data). Note: The units of the vertical axis and the “Radiation” axis are (MW) and (MJ/m²), respectively. “Seasonal” denotes a yearly cyclical dummy variable (see Section 2.1).

**Figure 7.** Monthly relative forecast error (MAE and RMSE) of M1 and M2 with respect to M3. Note: The dashed line shows the relative increment of the forecast error with respect to M3 over the whole period (the two out-of-sample MAE lines overlap because they are equal). This is the “Default validation case” in Figure 5.

**Figure 8.** MAE and RMSE of each method on individual PV power generation (change in forecast error by method when the in-sample period is shortened). Note: White dots represent the 5-year in-sample period (corresponding to values in Table 2), and color-filled dots represent the 9-month in-sample period. The out-of-sample period was 2018 in both cases (i.e., this is “Additional validation case 1” in Figure 5).

**Figure 9.** Comparison of forecast errors for the next three months when the model is estimated from nine months of data. Note: The in-sample period is 9 months, from 1 April to 31 December 2017; the out-of-sample period is 3 months, from 1 January to 31 March 2018 (i.e., this is “Additional validation case 2” in Figure 5). GAM-M1 does not appear because it cannot be estimated: the in-sample data lack observations for the same months.

**Table 1.** Forecasting accuracy and computation time of each method for PV power generation in the nine areas.

Metric | Model | In 1 | In 2 | In 3 | In 4 | In 5 | In 6 | In 7 | In 8 | In 9 | Out 1 | Out 2 | Out 3 | Out 4 | Out 5 | Out 6 | Out 7 | Out 8 | Out 9
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
RSQ | GAM | 0.793 | 0.776 | 0.821 | 0.835 | 0.761 | 0.796 | 0.840 | 0.801 | 0.805 | 0.718 | 0.754 | 0.792 | 0.790 | 0.716 | 0.766 | 0.807 | 0.759 | 0.805
 | kNN | 0.900 | 0.910 | 0.916 | 0.923 | 0.889 | 0.910 | 0.927 | 0.915 | 0.914 | 0.620 | 0.621 | 0.712 | 0.716 | 0.644 | 0.684 | 0.735 | 0.664 | 0.705
 | ANN | 0.741 | 0.740 | 0.763 | 0.799 | 0.726 | 0.750 | 0.812 | 0.757 | 0.770 | 0.626 | 0.681 | 0.709 | 0.742 | 0.666 | 0.715 | 0.767 | 0.725 | 0.772
 | SVR | 0.835 | 0.822 | 0.859 | 0.867 | 0.798 | 0.821 | 0.869 | 0.830 | 0.837 | 0.629 | 0.542 | 0.722 | 0.665 | 0.676 | 0.741 | 0.704 | 0.655 | 0.713
 | RF | 0.979 | 0.973 | 0.979 | 0.978 | 0.968 | 0.975 | 0.979 | 0.976 | 0.976 | 0.672 | 0.723 | 0.775 | 0.776 | 0.691 | 0.741 | 0.782 | 0.741 | 0.776
MAE | GAM | 0.227 | 0.264 | 0.233 | 0.217 | 0.275 | 0.242 | 0.220 | 0.240 | 0.253 | 0.298 | 0.295 | 0.238 | 0.226 | 0.323 | 0.309 | 0.233 | 0.250 | 0.243
 | kNN | 0.165 | 0.174 | 0.168 | 0.156 | 0.191 | 0.169 | 0.153 | 0.158 | 0.176 | 0.356 | 0.364 | 0.290 | 0.284 | 0.363 | 0.342 | 0.281 | 0.309 | 0.306
 | ANN | 0.272 | 0.303 | 0.295 | 0.261 | 0.313 | 0.292 | 0.254 | 0.286 | 0.291 | 0.365 | 0.350 | 0.308 | 0.278 | 0.376 | 0.341 | 0.271 | 0.298 | 0.278
 | SVR | 0.189 | 0.219 | 0.195 | 0.179 | 0.230 | 0.207 | 0.182 | 0.197 | 0.212 | 0.329 | 0.356 | 0.264 | 0.282 | 0.332 | 0.318 | 0.272 | 0.297 | 0.280
 | RF | 0.071 | 0.091 | 0.082 | 0.081 | 0.101 | 0.086 | 0.079 | 0.084 | 0.090 | 0.310 | 0.304 | 0.244 | 0.234 | 0.329 | 0.317 | 0.243 | 0.257 | 0.258
RMSE | GAM | 0.318 | 0.373 | 0.336 | 0.311 | 0.386 | 0.353 | 0.314 | 0.345 | 0.368 | 0.415 | 0.422 | 0.348 | 0.345 | 0.450 | 0.430 | 0.336 | 0.379 | 0.363
 | kNN | 0.222 | 0.236 | 0.231 | 0.214 | 0.264 | 0.235 | 0.212 | 0.226 | 0.246 | 0.481 | 0.505 | 0.404 | 0.401 | 0.503 | 0.477 | 0.394 | 0.449 | 0.446
 | ANN | 0.356 | 0.401 | 0.386 | 0.344 | 0.413 | 0.391 | 0.340 | 0.381 | 0.400 | 0.482 | 0.470 | 0.409 | 0.381 | 0.488 | 0.454 | 0.368 | 0.408 | 0.392
 | SVR | 0.284 | 0.332 | 0.298 | 0.281 | 0.355 | 0.330 | 0.284 | 0.319 | 0.338 | 0.474 | 0.551 | 0.399 | 0.437 | 0.480 | 0.438 | 0.415 | 0.453 | 0.440
 | RF | 0.102 | 0.128 | 0.115 | 0.114 | 0.142 | 0.123 | 0.113 | 0.119 | 0.129 | 0.447 | 0.446 | 0.361 | 0.355 | 0.469 | 0.445 | 0.359 | 0.391 | 0.389
Time | GAM | 12.1 | 21.4 | 6.1 | 6.0 | 42.5 | 6.6 | 5.9 | 5.8 | 5.7 | | | | | | | | |
 | kNN | 6.0 | 1.4 | 1.1 | 1.7 | 1.7 | 1.1 | 1.8 | 2.1 | 1.2 | | | | | | | | |
 | ANN | 55.0 | 53.8 | 50.2 | 52.3 | 55.4 | 55.0 | 53.9 | 53.8 | 52.3 | | | | | | | | |
 | SVR | 73.1 | 71.3 | 68.4 | 72.7 | 62.1 | 77.3 | 79.0 | 71.9 | 73.1 | | | | | | | | |
 | RF | 313.3 | 426.5 | 211.6 | 358.7 | 152.6 | 144.1 | 132.5 | 327.5 | 139.9 | | | | | | | | |

**Table 2.** Forecasting accuracy and computation time of each method for individual PV power generation.

Model | Time | In-Sample RSQ | In-Sample MAE | In-Sample RMSE | Out-of-Sample RSQ | Out-of-Sample MAE | Out-of-Sample RMSE
---|---|---|---|---|---|---|---
GAM-M0 | 0.01 | 0.844 | 0.245 | 0.363 | 0.850 | 0.247 | 0.360
GAM-M1 | 4.86 | 0.912 | 0.163 | 0.273 | 0.923 | 0.153 | 0.258
GAM-M2 | 0.41 | 0.910 | 0.167 | 0.275 | 0.924 | 0.153 | 0.256
GAM-M3 | 0.40 | 0.912 | 0.162 | 0.272 | 0.926 | 0.149 | 0.253
kNN | 7.52 | 0.918 | 0.151 | 0.262 | 0.925 | 0.146 | 0.254
ANN | 212.70 | 0.909 | 0.171 | 0.277 | 0.923 | 0.158 | 0.258
SVR | 418.73 | 0.910 | 0.151 | 0.276 | 0.924 | 0.140 | 0.258
RF | 328.24 | 0.975 | 0.084 | 0.146 | 0.919 | 0.151 | 0.264

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Matsumoto, T.; Yamada, Y.
Comprehensive and Comparative Analysis of GAM-Based PV Power Forecasting Models Using Multidimensional Tensor Product Splines against Machine Learning Techniques. *Energies* **2021**, *14*, 7146.
https://doi.org/10.3390/en14217146
