# Spatial and Temporal Wind Power Forecasting by Case-Based Reasoning Using Big-Data


## Abstract


## 1. Introduction

^{2} (i.e., 7.6 km × 7.6 km), which may not be suitable for accurately describing local wind dynamics in complex regions. Moreover, they require very large computational resources and complex, time-consuming solution algorithms, which complicates their deployment in a real grid operation scenario.

## 2. The Role of Physical Downscaling Models in Wind Power Forecasting

## 3. Wind Power Forecasting by Case Based Reasoning

Let ${\mathit{X}}_{B}$ be the input matrix storing n boundary conditions, each described by a vector of m components representing the values of the weather variables on a coarse-resolution spatial mesh, and let ${\mathit{Y}}_{B}$ be the output matrix storing the corresponding n downscaled solutions obtained by the CFD solver, each described by a vector of r components representing the wind speed components at the points of interest, i.e., the wind turbine locations. These input/output matrices represent the knowledge base of the CBR process, since they allow the downscaled solution for a boundary condition described by the query vector ${\mathit{x}}_{q}$ to be computed according to the following procedure:

- Compute the distance between the query point ${\mathit{x}}_{q}$ and each vector of the input matrix ${\mathit{X}}_{B}$:
  $${d}_{j}=\sqrt{\sum_{i=1}^{m}{\left({\mathit{x}}_{q}\left(i\right)-{\mathit{X}}_{B}\left(j,i\right)\right)}^{2}}\quad \forall j\in [1,n]$$
- Compute the similarity degree between the query and the stored vectors:
  $${w}_{j}=\frac{\max_{i}({d}_{i})}{{d}_{j}}\quad \forall j\in [1,n]$$
- The downscaled solution ${\widehat{\mathit{y}}}_{q}$ for the query vector can be approximated by processing the downscaled solutions corresponding to the "most similar input vectors", referred to here as the neighbors, which can be identified by ordering the input vectors according to their similarity degrees. To this end, the following naïve approach can be applied:
  $${\widehat{\mathit{y}}}_{q}\left(i\right)=\frac{\sum_{j\in N}{\mathit{Y}}_{B}\left(j,i\right)\times {w}_{j}}{\sum_{j\in N}{w}_{j}}\quad \forall i\in [1,r]$$
  where N is the set of the neighbors.
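The three steps of the procedure can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `cbr_downscale`, the neighbor count `k`, the guard `eps` against a zero distance, and the toy matrices are all assumptions introduced here.

```python
import numpy as np

def cbr_downscale(x_q, X_B, Y_B, k=3, eps=1e-12):
    """Approximate the downscaled solution for x_q from its k nearest stored cases."""
    # Step 1: Euclidean distance between the query and every row of X_B.
    d = np.sqrt(((X_B - x_q) ** 2).sum(axis=1))
    # Step 2: similarity degree w_j = max_i(d_i) / d_j.
    # eps is an assumption: it avoids dividing by zero when the query
    # coincides with a stored case.
    w = d.max() / np.maximum(d, eps)
    # Step 3: the neighbor set N holds the k most similar (closest) cases.
    N = np.argsort(d)[:k]
    # Similarity-weighted average of the neighbors' downscaled solutions.
    return (w[N, None] * Y_B[N]).sum(axis=0) / w[N].sum()

# Toy knowledge base (hypothetical values): n = 4 cases, m = 2, r = 1.
X_B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
Y_B = np.array([[1.0], [2.0], [3.0], [10.0]])

# The two nearest cases are equidistant from the query, so they get equal weight.
y_hat = cbr_downscale(np.array([0.0, 1.0]), X_B, Y_B, k=2)
print(y_hat)
```

Because the weights are inverse distances, a query that coincides with a stored case is dominated by that case, so the sketch returns (almost exactly) the stored solution.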

${\mathit{X}}_{B}$ and ${\mathit{Y}}_{B}$. In fact, these matrices are characterized by a large number of both rows and columns, depending on the number of available input/output patterns (e.g., on the order of several hundred) and the spatial resolution of the environmental variable profiles (e.g., on the order of several thousand), respectively.

## 4. Enabling Methodologies for Feature Extraction from Massive Wind Data

#### 4.1. PCA: Principal Component Analysis

- ${\overline{\mathit{X}}}_{B}$ and ${\overline{\mathit{Y}}}_{B}$ are the centers of the matrices ${\mathit{X}}_{B}$ and ${\mathit{Y}}_{B}$, respectively;
- ${\mathit{X}}_{Bs}$ and ${\mathit{Y}}_{Bs}$, whose dimensions are $[{n}_{PCx}, n]$ and $[{n}_{PCy}, n]$, respectively (with ${n}_{PCx}\ll m$ and ${n}_{PCy}\ll r$), are the score matrices;
- $\mathit{P}$ and $\mathit{Q}$, whose dimensions are $[n, {n}_{PCx}]$ and $[n, {n}_{PCy}]$, respectively, are the loading matrices;
- ${\mathit{\epsilon}}_{x}$ and ${\mathit{\epsilon}}_{y}$ are the error matrices.

$\mathit{P}$ and $\mathit{Q}$, respectively, and f is the number of principal components, which can be selected by applying the methodologies described in [16].
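To illustrate how a center, a score matrix, a loading matrix, and an error matrix arise from a data matrix, the following sketch computes a PCA through the singular value decomposition. It is a generic textbook construction, not necessarily the procedure used in this work. The sketch assumes one case per row, so its score matrix has shape (n, number of components), which may be the transpose of the orientation listed above. The function name `pca_fit`, the synthetic data, and the fixed component count are assumptions, and the selection rule of [16] is not reproduced.

```python
import numpy as np

def pca_fit(X, n_pc):
    """Return (center, loadings, scores, residual) for X with one case per row."""
    center = X.mean(axis=0)                  # column-wise mean of the data
    Xc = X - center                          # centered data
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_pc].T                   # shape (m, n_pc), orthonormal columns
    scores = Xc @ loadings                   # shape (n, n_pc)
    residual = Xc - scores @ loadings.T      # what the retained components miss
    return center, loadings, scores, residual

# Synthetic data (hypothetical): 50 cases of 200 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
X_B = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 200)) + 3.0

center, P, X_Bs, eps_x = pca_fit(X_B, n_pc=2)
print(X_Bs.shape, P.shape, np.abs(eps_x).max())
```

Since the synthetic data have only two underlying factors, two components reconstruct them up to rounding error. With real weather fields the residual would be nonzero, and the number of retained components trades accuracy against size.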

#### 4.2. Partial Least Squares Regression

${\mathit{Y}}_{B}$ as follows:

#### 4.3. Proposed Method

${\mathit{X}}_{B}$ and ${\mathit{Y}}_{B}$, which are processed by a feature extraction technique based on PCA or PLSR. The corresponding reduced matrices, ${\mathit{X}}_{Bs}$ and ${\mathit{Y}}_{Bs}$, are stored in the database of the historical physical downscaling solutions and used to infer the solutions for the query vectors. The overall forecasting process is summarized in Figure 3.
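A minimal sketch of how such a pipeline could be wired together with PCA as the feature extractor is given below. It reduces both matrices, stores the reduced case base, projects a query into the reduced space, applies the similarity-weighted neighbor average there, and maps the result back to the original output space. All names (`pca_fit`, `build_case_base`, `forecast`), the dictionary layout, the neighbor count `k`, the guard `eps`, and the synthetic data are assumptions for illustration. The PLSR variant and any detail of Figure 3 are not reproduced.

```python
import numpy as np

def pca_fit(A, n_pc):
    """Center A (one case per row) and return (center, loadings, scores)."""
    center = A.mean(axis=0)
    _, _, Vt = np.linalg.svd(A - center, full_matrices=False)
    loadings = Vt[:n_pc].T
    return center, loadings, (A - center) @ loadings

def build_case_base(X_B, Y_B, n_pcx, n_pcy):
    """Reduce both matrices and keep what is needed to answer queries."""
    x_center, P, X_Bs = pca_fit(X_B, n_pcx)
    y_center, Q, Y_Bs = pca_fit(Y_B, n_pcy)
    return dict(x_center=x_center, P=P, X_Bs=X_Bs,
                y_center=y_center, Q=Q, Y_Bs=Y_Bs)

def forecast(cb, x_q, k=3, eps=1e-12):
    """Infer the downscaled solution for x_q from the reduced case base."""
    # Project the query into the reduced input space.
    x_qs = (x_q - cb["x_center"]) @ cb["P"]
    # Distances and similarity degrees, computed on the scores.
    d = np.linalg.norm(cb["X_Bs"] - x_qs, axis=1)
    w = d.max() / np.maximum(d, eps)   # eps guards against a zero distance
    N = np.argsort(d)[:k]              # indices of the k nearest cases
    # Weighted average of the neighbors' reduced outputs.
    y_qs = (w[N, None] * cb["Y_Bs"][N]).sum(axis=0) / w[N].sum()
    # Map the reduced solution back to the original output space.
    return cb["y_center"] + y_qs @ cb["Q"].T

# Synthetic case base (hypothetical): 40 cases, m = 100 inputs, r = 20 outputs.
rng = np.random.default_rng(1)
latent = rng.normal(size=(40, 3))
X_B = latent @ rng.normal(size=(3, 100))
Y_B = latent @ rng.normal(size=(3, 20))

case_base = build_case_base(X_B, Y_B, n_pcx=3, n_pcy=3)
y_hat = forecast(case_base, X_B[7], k=3)
print(y_hat.shape)
```

In this arrangement the distance computation runs on vectors of a few components instead of m components, which is where the reduction in computational cost would come from.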

## 5. Experimental Results

#### 5.1. PCA Results

#### 5.2. PLSR Results

## 6. Conclusions

## Author Contributions

## Conflicts of Interest

## List of Symbols

Symbol | Description |
---|---|
x_{q} | query vector |
X_{B} | input matrix storing the n boundary conditions |
Y_{B} | output matrix storing the downscaled solutions obtained by using the CFD solver |
${\mathit{d}}_{\mathit{j}}$ | distance between the query point x_{q} and the j-th vector of the input matrix X_{B} |
${\mathit{w}}_{\mathit{j}}$ | similarity degree between the query and the j-th stored vector |
${\widehat{\mathit{y}}}_{\mathit{q}}$ | approximated downscaled solution for the query vector obtained with CBR |
n | number of downscaled solutions obtained by the CFD solver |
m | number of components of each boundary condition stored in X_{B} |
r | number of components of each downscaled solution stored in Y_{B} |
$\mathit{\beta}$ | regression matrix |
X_{N} | set of nearest neighbors in X_{B} |
Y_{N} | set of nearest neighbors in Y_{B} corresponding to the X_{N} vectors |
${\overline{\mathit{X}}}_{B}$, ${\overline{\mathit{Y}}}_{B}$ | centers of the matrices X_{B} and Y_{B}, respectively |
X_{Bs}, Y_{Bs} | score matrices of the matrices X_{B} and Y_{B}, respectively |
${n}_{\mathit{P}\mathit{C}\mathit{x}}$, ${n}_{\mathit{P}\mathit{C}\mathit{y}}$ | number of principal components of the matrices X_{B} and Y_{B}, respectively |
$\mathit{P}$, $\mathit{Q}$ | loading matrices |
${\mathit{\epsilon}}_{\mathit{x}}$, ${\mathit{\epsilon}}_{\mathit{y}}$ | error matrices |
$\mathit{\Sigma}$ | covariance matrix |
${\mathit{w}}_{\mathit{k}}$ | example of a column vector of the loading matrix |
${\mathit{p}}_{\mathit{s}}$, ${\mathit{q}}_{\mathit{s}}$ | the s-th column vectors of the loading matrices P and Q of the matrices X_{Bs} and Y_{Bs}, respectively |
${\mathit{\beta}}_{0}$ | intercept matrix |
${\mathit{x}}_{\mathit{q}\mathit{s}}$ | query vector in the new phase space |
${\widehat{\mathit{y}}}_{\mathit{q}\mathit{s}}$ | approximated downscaled solution for the query vector obtained with CBR in the new phase space |
N | set of the neighbors |
${\mathit{k}}_{{\mathit{X}}_{\mathit{B}}}, {\mathit{k}}_{{\mathit{Y}}_{\mathit{B}}}$ | number of principal components of the matrices X_{B} and Y_{B}, respectively |
RS_{PSLR} | Root Square Ratio |

## References

- Lerner, J.; Grundmeyer, M.; Garvert, M. The role of wind forecasting in the successful integration and management of an intermittent energy source. Energy Cent. Wind Power **2009**, 3, 1–6.
- Qu, G.; Mei, J.; He, D. Short-term wind power forecasting based on numerical weather prediction adjustment. In Proceedings of the 11th IEEE International Conference of Industrial Informatics, Bochum, Germany, 29–31 July 2013.
- Terciyanli, E.; Demirci, T.; Kucuk, D.; Sarac, M.; Cadirci, I.; Ermis, M. Enhanced nationwide wind-electric power monitoring and forecast system. IEEE Trans. Ind. Inform. **2014**, 10, 1171–1184.
- Palomares-Salas, J.C.; De la Rosa, J.J.G.; Ramiro, J.G.; Melgar, J.; Agüera, A.; Moreno, A. Comparison of Models for Wind Speed Forecasting. In Proceedings of the ICCS 2009, International Conference on Computational Science, Baton Rouge, LA, USA, 25–27 May 2009.
- Katsigiannis, Y.A.; Tsikalakis, A.G.; Georgilakis, P.S.; Hatziargyriou, N.D. Improved wind power forecasting using a combined neuro-fuzzy and artificial neural network model. In Hellenic Conference on Artificial Intelligence; Springer: Berlin, Germany, 2006.
- Vaccaro, A.; Bontempi, G.; Taieb, S.B.; Villacci, D. Adaptive local learning techniques for multiple-step-ahead wind speed forecasting. Electr. Power Syst. Res. **2012**, 83, 129–135.
- Ozkan, M.B.; Karagoz, P. A novel wind power forecast model: Statistical hybrid wind power forecast technique (SHWIP). IEEE Trans. Ind. Inform. **2015**, 11, 375–387.
- Bellman, R. Dynamic Programming; Princeton University Press: Princeton, NJ, USA, 1957.
- ECMWF. Available online: http://www.ecmwf.int/en/research/modelling-and-prediction/atmospheric-dynamics (accessed on 13 February 2013).
- MeteoSwiss Operational Applications within COSMO. Available online: http://www2.cosmo-model.org/content/tasks/operational/meteoSwiss/#domai (accessed on 13 February 2013).
- Sammut, C.; Webb, G.I. Encyclopedia of Machine Learning; Springer Science & Business Media: New York, NY, USA, 2011; pp. 284–285.
- Skittides, C.; Früh, W.G. Wind forecasting using principal component analysis. Renew. Energy **2014**, 69, 365–374.
- Davò, F.; Alessandrini, S.; Sperati, S. An Application of PCA Based Approach to Large Area Wind Power Forecast. In Proceedings of the EWEA Wind Power Forecasting Technology Workshop, Rotterdam, The Netherlands, 3–4 December 2013.
- Wu, Q.; Peng, C. Wind power generation forecasting using least squares support vector machine combined with ensemble empirical mode decomposition, principal component analysis and a bat algorithm. Energies **2016**, 9, 261.
- Li, S.; Wang, P.; Goel, L. Wind power forecasting using neural network ensembles with feature selection. IEEE Trans. Sustain. Energy **2015**, 6, 1447–1456.
- Hall, S. Implementation and Verification of Robust PLS Regression Algorithm; Chalmers University of Technology: Gothenburg, Sweden, 2014; pp. 5–10.
- Aamodt, A.; Plaza, E. Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Commun. **1994**, 7, 39–59.

**Figure 5.** Comparison between the latent variables extracted from the CFD solver output (green) and those obtained by using the PCA-based CBR for the 2nd forecasting hour.

**Figure 6.** Comparison between the spatial wind speed obtained from the CFD solver output (green) and that obtained by using the PCA-CBR for the 2nd forecasting hour.

**Figure 7.** Comparison between the spatial power output obtained from the CFD solver output (green) and that obtained by using PCA for the 2nd forecasting hour.

**Figure 8.** Comparison between the spatial power output obtained from the CFD solver output (green) and that obtained by using PCA for the 2nd forecasting hour.

**Figure 13.** Scatter plot of the observed versus fitted response as a function of the number of PLS components used in this work.

Case Number | N° xPCs (${\mathit{k}}_{{\mathit{X}}_{\mathit{B}}}$) | N° yPCs (${\mathit{k}}_{{\mathit{Y}}_{\mathit{B}}}$) | Nearest Neighbors (NN) | Normalized Root Mean Square Error (NRMSE) | Normalized Bias (NBIAS) | Normalized Mean Absolute Error (NMAE) |
---|---|---|---|---|---|---|
1 | 5 | 5 | 3 | 0.5204 | 0.2063 | 0.4439 |
2 | 10 | 12 | 3 | 0.3995 | 0.0166 | 0.3403 |
3 | 5 | 5 | 6 | 0.4195 | 0.2599 | 0.3629 |
4 | 5 | 5 | 3 | 0.2994 | −0.0412 | 0.2366 |
5 | 10 | 12 | 3 | 0.3039 | 0.0864 | 0.2594 |

Case | RMSE on Wind Speed (m/s) | RMSE on Wind Direction (°) | Time Required | N° of PCs | Rsquared Ratio |
---|---|---|---|---|---|
1 | 1.7865 | 25.5 | 6.10 s | 24 | 0.8355 |
2 | 2.1861 | 27.8 | 2.87 s | 12 | 0.7581 |
3 | 1.6119 | 24.0 | 11.39 s | 48 | 0.8538 |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

De Caro, F.; Vaccaro, A.; Villacci, D.
Spatial and Temporal Wind Power Forecasting by Case-Based Reasoning Using Big-Data. *Energies* **2017**, *10*, 252.
https://doi.org/10.3390/en10020252
