Parameter Evaluation in Motion Estimation for Forecasting Multiple Photovoltaic Power Generation

: The power-generation capacity of grid-connected photovoltaic (PV) power systems is increasing. As output power forecasting is required by electricity market participants and utility operators for the stable operation of power systems, several methods have been proposed using physical and statistical approaches for various time ranges. A short-term (30 min ahead) forecasting method had been proposed previously for multiple PV systems using motion estimation. This method forecasts the short time ahead PV power generation by estimating the motion between two geographical images of the distributed PV power systems. In this method, the parameter λ , which relates the smoothness of the resulting motion vector ﬁeld and affects the accuracy of the forecasting, is important. This study focuses on the parameter λ and evaluates the effect of changing this parameter on forecasting accuracy. In the periods with drastic power output changes, the forecasting was conducted on 101 PV systems. The results indicate that the absolute mean error of the proposed method with the best parameter is 10.3%, whereas that of the persistence forecasting method is 23.7%. Therefore, the proposed method is effective in forecasting periods when PV output changes drastically within a short time interval.


Introduction
Reducing greenhouse gas emissions caused by power production can effectively address climate changes [1]. Therefore, the installation of photovoltaic (PV) power generation systems is significantly increasing with the decreasing installation costs [2], reaching 39 GW in 2010 and 760 GW in 2020 [3]. However, as opposed to the existing dispatched power generations, the power generation of PV relies on the weather conditions, thus varying drastically and uncertainly. The high penetration of PV power generation makes it difficult to maintain the balance between demand and supply in electric power systems. This uncertain variability in PV generation can significantly impact the stable and economical operation of power systems [4,5]. Therefore, power system operators require forecasting of PV generation output; particularly, short-term forecasting is required to determine the need for demand response, and a quick-start generator is required for the stable operation of the power grid [4]. To date, different forecasting methods have been proposed for various time scales using solar radiation forecasting or direct PV power forecasting. PV forecasting can be classified into persistent forecasting, physical approach, and statistical approach [6,7]. The physical approach forecasts solar irradiance data from a numeric weather predictor atmospheric model. A common approach for regional PV output using numeric weather predictor (NWP) data is to upscale a set of representative regional PV systems. Saint-Drenan et al. [8] proposed a method to probabilistically forecast the PV output of the entire forecast area from the reference PV power and meteorological data obtained from NWP. Ma et al. [9] proposed a method to correct the NWP irradiance using the measured irradiance and optimized the prediction model using the particle swarm optimization algorithm. Statistical approaches include forecasting models based on a learning process that uses variables with historical influence. These forecasting models attempt to reduce errors by using the difference between the model's forecasted and actual data. Therefore, this approach has high-forecasting accuracy for time periods during which the model input patterns are smooth. However, forecast errors increase during periods associated with abrupt changes in weather variables commonly referred to as "drastic change periods", such as solar radiation, temperature, and humidity [6]. Statistical methods can be divided into two categories, namely time-series-based and artificial intelligence forecasting models [6]. Artificial intelligence is a popular technique for forecasting PV output generation. Among them, NN is one of the most used forecasting models. Bacher et al. [10] reported that autoregression model forecasting can be effective for short-term forecasting based on the results obtained using NWP, observed PV data, and data from both sources. Reikard [11] estimated the solar radiation intensity using the autoregressive integrated moving average (ARIMA) method and compared it with that obtained from the artificial neural network (ANN) method. Neural networks use different sets of input data to forecast future output patterns and improve the accuracy of the model by carefully selecting different influential parameters. Pedro and Coimbra [12] implemented an ANN method optimized by a genetic algorithm (GAs/ANN) and compared it with the persistent forecast, ARIMA method, and k-nearest neighbors (kNN) method. They verified that GAs/ANN is superior to the other forecast methods. Zhou et al. [13] used two long short-term memory models to forecast the temperature and PV output and combined them to enhance the forecast accuracy. Behera et al. [14] implemented an optimal design methodology for an extreme learning machine-based forecasting model of a PV system and compared its results with those of existing models, such as the backpropagation model. Alamoudi et al. [15] analyzed the performance of photovoltaic panels using an adaptive neural network-based fuzzy inference system (ANFIS) to determine the self-consumption and sufficiency rates of a PV system, and used response surface methodology (RSM) to determine the optimal operating conditions of the PV panels. They used the ANFIS to analyze the performance of photovoltaic panels to determine the self-consumption and sufficiency rates of PV systems, and a response surface methodology (RSM) to determine the optimal operating conditions of the PV panels. In addition to these approaches, forecasting methods based on image processing techniques have been proposed. Studies have been conducted to estimate solar radiation from satellites and grand-based sensors. Hammer et al. [16] proposed a method for estimating solar radiation from satellite images and a statistical method for estimating the movement of clouds. Perez et al. [17] reported that satellite-derived cloud motion-based forecasts are better models for short-term forecasts than NWP-based solar radiation forecasts. Bright et al. [18] proposed a method for selecting reference PVs to support PV output nowcasts from satellite images. Feng et al. [19] developed a deep convolutional neural network for global horizontal irradiance one hour ahead of the sky image. Chu et al. [20] proposed a reforecasting model to improve the forecasting by reforecasting with an ANN and optimizing with a GA from the sky racking technique, ARIMA, and kNN forecasting. Chow et al. [21] estimated cloud motion from the optical flow method using a sky-imaging system, and the best forecasting may be obtained depending on the smoothness parameter. Table 1 presents a brief description and summary of these forecasts. The effective timescales in Table 1 are summarized in [22].  [8,9] statistical approach This includes forecast by autoregressive and artificial intelligence, which are particularly effective in the short-term forecast.
very short-term, short-term [16][17][18][19][20][21] persistence approach This approach is effective in very short-term forecasting and is used as a benchmark for the forecast.
very short-term [6,12,17] The aforementioned short-term forecasting method tends to yield a high error during periods when PV output changes abruptly. In contrast to these studies, our previous proposal focused on the temporal transition of output power from geographically distributed PV systems [23]. Therefore, this approach aims to reduce forecasting errors during periods of abrupt changes in weather conditions for multiple PV systems at different locations. The output power distribution of irregularly located PV systems was converted into a regularly arranged mesh distribution with the same latitude and longitude intervals. Forecasting was conducted by estimating the motion of mesh distribution. In the future, the smart grid will be upgraded to capture the PV output power; therefore, this method is economical for short-term forecasting, as it allows for predictions without additional equipment.
The existing method uses the optical flow for forecasting, wherein the motion field between two consecutive images is estimated. This motion field is obtained by minimizing the objective function, comprising the data term and regularization term, by using the variational method [24,25]. The parameter λ of the regularization term is a model of the smoothness of the motion field obtained using this function. This parameter is important because the obtained motion field is highly affected by the parameter. Previously [23,26], the mesh size and interpolation method for blank values when converting to a mesh distribution were determined. In [23], the proposed method was evaluated based on the forecasting results by only λ = 1.0. In addition, the mesh size determined in the previous studies was small relative to the geographic transitions of actual weather conditions, and the estimations based on this mesh size can drive the objective function to local minimum. Therefore, there were many periods in which the estimated motion vectors did not match the actual motions in rapidly changing weather conditions. In this study, a multiscale approach was used to address this issue. This makes it possible to obtain a motion vector field that is more appropriate to the actual motions by finding the global minimum of the objective function, and to properly evaluate the effect of changing the parameter λ. The purpose of this study is to examine the effect of changing this parameter λ and to show the improvement of the prediction results from previous studies. In this study, the prediction of 101 PV power systems is executed with several values of the parameter λ, and the prediction error distribution of each PV is evaluated. The PV distribution in Japan (latitude 31.20 • 6 -39 • 80 and longitude 129 • 60 -141 • 60 ) was normalized every 30 min for one year from 2013 to 2014, converted to geographically distributed mesh data, the motion of the NV distributions was estimated, and then the forecast was evaluated. The PV output volatility is important when evaluating a forecast. Certain periods exist, such as clear days, when it is easy to forecast the PV generation. These times reduce the error difference when evaluating the motion estimations of each parameter, making it difficult to evaluate the parameters. Pedro and Coimbra [12] reported that developing seasonal models for each period of variability should improve the evaluation of the forecasting method. The objective of this method is to forecast the PV output during periods when the output of multiple PVs changes. As the changes in the PV output over a wide range differ from place to place, capturing the changes in multiple PV outputs is difficult. Therefore, in this study, the period when the output of PVs in a certain area within a 15-km-radius in Japan changed drastically within 30 min was extracted and evaluated. This paper evaluates the proposed method with the multiple PV output changed periods in a certain area where the output changes are correlated to each other. The contributions of this study can be summarized as follows.

1.
A method is proposed for evaluating multiple PV generation forecasts when the PV output changes drastically.

2.
The parameter that makes the best forecast for motion estimation reduces the error by 56.6% compared to that of the persistence forecasting. It is an effective forecasting method for periods when PV output changes drastically. 3.
The characteristics of forecasting a wide range of PV outputs using a motion estimation method are presented.
The remainder of this paper is organized as follows. Section 2 describes the data preprocessing, estimation, and evaluation methods of the proposed method. Section 3 explains the characteristics of the motion estimation, forecasting results at drastic change periods, and reasons for the errors. Finally, Section 4 concludes the study.

Materials and Methods
This section describes the input data, forecasting methods, and evaluation methods. Figure 1 depicts the flowchart of the proposed method. The forecast method uses the power generation output of the PV and position data as input and performs the forecast using an image processing approach. In this method, prior to motion estimation, the geographically distributed power output of PVs is preprocessed to enable image processing. The power output of each PV system is forecasted based on the estimation results. Additionally, this method changes the smoothness parameter of the motion estimation and evaluates it during periods when the output changes drastically. In this section, first the conditions under which the period was extracted and then the evaluation methods are described.

Dataset
The dataset used in this study is the actual data of the output power from 5097 PV systems measured at 30 min intervals from 1 August 2013, to 31 July 2014, in Japan. Therefore, the forecast timescale ∆t was set to 30 min. Table 2 presents an outline of the dataset.

Preprocessing
The proposed method uses motion estimation, which is an image processing method.
To apply the motion estimation to forecast the output of geographically distributed PV systems with various power ratings and forms, the output power data were normalized and converted to mesh data. The following process was implemented to achieve this.

Normalization
The output power from the PV systems is affected by the panel size, material, azimuth angle, tilt angle, and installation location. Therefore, normalization is necessary to compare numerous geographically distributed PV systems. In this study, NV of the output power is defined as indicated in Equation (1).
where P(t) denotes the actual PV power output at a specific time t, and P 2w (t) represents the highest power output at the same time t in the previous two weeks. Figure 2 depicts the concept of normalization implemented on 14 July 2014. Figure 2a shows the output power P(t) of a PV system on 14 July 2014 (orange line) and that observed on the previous 14 days from 30 June to 13 July 2014 (gray lines); their maximum values P 2w (t) at the corresponding times are also indicated (red line). The two-week period is selected under the assumption that at least one sunny day exists, and the seasonal change in sun elevation is negligible during this period.

Conversion to Mesh Data
In the proposed method, the irregular geographical distribution of NV values is converted to a mesh distribution, which is the distribution data aligned at equal intervals in the east-west and north-south directions to adapt it to image processing methods. The NV value of each parcel in the regularly arranged distribution with equally spaced widths of latitude and longitude were obtained from the average of the NVs of PV systems located in the parcel. In this study, the width of the mesh was set to 0.02 in latitude and longitude, based on the reports of a previous study [22]. Figure 3 depicts the conversion diagram to mesh data; Figure 3a shows the actual locations of the PV systems and their NVs at 11:00 on 14 July 2014, are classified by color, and Figure 3b shows the mesh distribution at that time. In Figure 3b, the parcels with no corresponding PV are considered blank values and are indicated in black. If several blank values exist in the mesh data, the motion estimation cannot be performed. Therefore, we performed interpolation (Figure 3c) based on Delaunay triangulation using "grid data", which is a function of MATLAB 2020a [27].

Motion Estimation
The geographical distribution of NV is assumed to move over time based on the movement of elements affecting weather conditions, such as clouds. The optical flow method, which involves image processing, was used to analyze the motion [25,26]. Assuming that NV is invariant and moves linearly in the time interval ∆t, the NVs expressed as f (x, y, t) during the period from time t − ∆t to t can be expressed as indicated in Equation (2).
where (x, y) denotes the position of NV in the east-west and north-south directions, and u(x, y, t) and v(x, y, t) represent the velocities of NV at position (x, y) and at time t in the east-west and north-south directions, respectively. A regulation term is defined considering that adjacent motion vectors tend to be aligned. The NV data term E D and regularization term E s can be determined using Equations (3) and (4), respectively, in each mesh.
where ∇ = ∂ x , ∂ y T represents the partial differential operator. The energy function is expressed as shown in Equation (5) while introducing a weight coefficient λ, which determines the extent to which u and v are spatially aligned. The value of λ is adjusted to minimize the estimation error. The vector distribution can be obtained by solving the energy function minimization problem, and the solution is expressed in Equations (6) and (7), which are Euler-Lagrange equations.
The detailed solution and algorithm are reported in [24]. As mentioned previously, the width of the mesh is 0.02 , which corresponds to 1.806 km and 2.219 km in the east-west and north-south directions, respectively, at latitude 35 • 78 and longitude 140 • 04 . However, the motion vectors for 30 min may be longer than the mesh width. Sawada and Takahashi [28] analyzed the migration of precipitation areas and reported that the speed was distributed in the range of 2-8 m/s. Bosch and Kleissl [29] proposed a method to estimate the cloud velocity using three reference PV cells and inverter output in a 48 MW PV plant. From their analysis, they reported that the velocity of clouds is distributed in the range of 3-35 m/s based on the analysis.
A multiscale approach was used to avoid the local minima of the energy function indicated in Equation (5). This approach generates a rougher distribution with multiple size scales, seeks the global minimum, and propagates the solution gradually to finer scales. In this study, the Gaussian filter was used to generate a four-fold rougher distribution. Figure 4 illustrates the motion estimation concept. Using the optical flow method, vector AB representing the NV of location A at time t − ∆t and then moved to location B at time t is estimated. Assuming that the movement is a constant velocity linear motion and persists until t + ∆t, vector BC is obtained by moving vector AB in parallel. The forecasting distribution of all mesh at time t + ∆t is obtained by interpolating the location of C from each NV obtained from vector BC.

Error Evaluation Method
In this study, the periods when NV values of more than 80% of the PV systems changed to higher than 0.2 in 30 min are referred to as drastic change periods. Generally, PV power forecasting is difficult when a drastic change occurs. Periods with drastic changes were extracted to evaluate forecast errors under harsh conditions.
The forecast error is evaluated in a certain area, which is shown by the green circle in Figure 5, in the Kanto Region with 101 PV systems while varying λ, which is the parameter in the motion estimation, to minimize the error caused by the interpolation process. Table 3 summarizes the dataset of the evaluated area. The central point of the evaluated area is located at latitude 35 • 78 and longitude140 • 04 . Figure 4 depicts the area, wherein the green circle with a radius of 15 km denotes the evaluated area, and the blue circle with a radius of 65 km indicates the area that has probably been used to estimate NV distribution in the green area. The absolute percent error (APE) is used as an evaluation method to examine the forecast error in periods during which the weather changes drastically, as indicated in Equation (8). Additionally, the definitions of the mean-absolute-percent error (MAPE) and root-mean-square error (RMSE) are indicated in Equations (9) and (10), respectively. (10) where N denotes the number of evaluated time steps, P forecasted represents the forecasted PV output, P obverved denotes the observed PV output, and P maximun indicates the maximum observed PV output for a year.

Comparison with Other Methods
In this study, persistence forecasting and ANN were implemented for comparison. As the influence of the sun elevation in a day is assumed to be represented by multiplying P 2w (t), the PV power output in persistence forecasting can be calculated using Equation (11). P persistence (t + ∆t) = P 2w (t + ∆t) × NV(t) The ANN was used as a machine learning-based forecasting method and consists of an input layer, three hidden layers, and an output layer. The numbers of neurons in the three hidden layers are 768, 384, and 192. The rectified linear unit (ReLU) function was used as the activation function. Cross-validation was performed by dividing the continuous time into four parts to obtain the validation error. Input data are month, day, hour, solar radiation, and temperature obtained from the Japanese Meteorological Agency. Herein, solar radiation represents the accumulated solar radiation from one hour before the forecast time. A detailed description of the model is given in [30]. A comparison between the ANN and the proposed method is shown in Section 3.2.

Results and Discussion
This section presents the forecast results of the motion estimation that were evaluated using 101 PVs in the evaluation area. Initially, the importance of extracting and evaluating the period in which the outputs change drastically was confirmed by comparing the forecasting results of days exhibiting a drastic change in the period with those obtained on a clear day. Subsequently, the error distribution of the persistent forecasting and the nine parameters at the extracted times were evaluated to determine the best parameter. Finally, the forecast results for a certain period were analyzed to determine the cause of the forecast error at the period when the error was reduced by changing the parameters and at periods when the forecast error could not be reduced by any of the parameters. This analysis improves the understanding of the characteristics of the motion estimation method.  Figure 6b shows the error of the total output in the evaluation area. Table 4 summarizes the daily average MAPE and RMSE calculated on 16 March 2014. Figure 7 shows the time variation of the observed NV distribution at the period used to estimate NV motion and the forecasting period, respectively, at 12:00 on 26 March 2014. Figure 8 illustrates the distribution of motion vectors between 11:00 and 12:00 calculated to forecast the NV distribution at 12:00 on 16 March 2014, under clear weather conditions. Figure 7 indicates that the change of NVs is small in the evaluation area; the forecast error of the proposed method is similar to that of the persistence forecasting. On a clear day, motion estimation estimates small movements, as depicted in Figure 8.     Figure 9a,b depict the sum of the power generation of 101 PV systems and the corresponding error of the total output in the evaluation area, respectively. Table 5 summarizes the daily average MAPE and RMSE of the total output calculated on 5 September 2013. Figure 10 shows the time variation of the observed NV distribution at the period used to estimate NV motion and the forecasting period, respectively, at 12:30 on 5 September 2013. Figure 11 depicts the distribution of motion vectors between 11:00 and 11:30 calculated to forecast the NV distribution at 12:30 on 5 September 2013, when the output power changed drastically.     Figure 10 indicates that there was a visible change in the distribution of NVs in the evaluation area; thus, a drastic change occurred in the output power, and the large motion vectors of the NV distribution were calculated, as depicted in Figure 11. In such a period, the forecast error of the proposed method is expected to reduce in comparison with that of the persistence forecasting. Figure 11 indicates that the estimation results differ according to the value of parameter λ.

Characteristics of Motion Estimation Based on Weather
This method is classified as a short-term forecast, and PV has a higher correlation with the previous output as the forecast time-horizon scale is shorter. Additionally, the result of motion estimation is close to that of the persistence forecast when the period NV distribution remains unchanged.
In this study, the proposed method is compared with persistence forecasting and ANN to verify its effectiveness, and parameter λ is evaluated for the drastic change periods.

Evaluation of the Estimation Error
The estimation results of the proposed forecasting evaluated the motion estimation parameter λ and compared it with the persistence forecasting and ANN for periods when the output power varied drastically. Note that in this section, the ANN results are for the 96 units that could be forecasted in the ANN. Figure 12 depicts a box plot of the estimation errors for the periods in the evaluated area, Table 6 shows the mean, median, and standard deviation of the errors. In the case of persistence forecasting, the mean was 23.7%, and the median was 22.2%, and for the ANN, the mean was 13.0% and the median was 10.2% In the case of proposed forecasting with the best motion estimation parameter λ = 0.019, the mean was 10.3%, and the median was 7.20%. In other words, the means of the proposed forecasting with the best parameter reduced in comparison with the persistence forecast (by 56.6%) and the ANN (by 20.8%).  The authors have applied the same forecasting method in other two regions in Japan: 122 PV systems in Kansai region and 26 PV systems in Chubu region. Furthermore, in these two regions, the forecasting errors by the proposed method with the best parameter λ (λ = 0.019 in the case of Kansai region and λ = 0.036 in the case of Chubu region) werẽ 50% lower than those by the persistence forecast [31]. Therefore, it is confirmed that the proposed method is effective for forecasting periods with a drastic change in output power.

Results of Estimation Errors
The estimation errors observed in the proposed method can be attributed to two reasons, namely the pre-processing and motion estimation. During pre-processing, the influence of the differences in tilt angle and installation orientation of the PV arrays is not completely canceled out by the normalization process. Consequently, the representative value of each mesh used in the estimation is the average of all NV values of the corresponding PV systems in the mesh. Furthermore, certain errors are caused by the interpolation of NVs in areas with no data; errors on motion estimation occur when assumptions of NV invariance and NV having a constant velocity linear motion were inappropriate.
First, an example is presented and explained such that the result of the proposed forecasting is substantially better than that of the persistence forecasting. Figure 13 depicts the box plot of the estimation error at 12:00 on 21 March 2014. It indicates that the proposed forecasting with a large parameter λ does not reduce the error compared to persistence forecasting. Parameter λ is associated with the unity of the vector distribution. Figure 14 depicts three NV distributions at the period, which shows a lump of low NVs in the center that expands, and the spread of NV is non-uniform depending on the direction. In this case, the forecast can be improved with a low λ estimation that considers local motion. Figure 15 is the estimated flow visualization with different λ. It shows that the motion vectors which is yielded with high λ values are too short compared with the practical motions of the lump in Figure 14. However, divergence vector is too large and certain vectors cross each other when λ is too small in Figure 15, which is also considered to be unnatural because weather changes caused mainly by cloud motions are not considered to intersect. On the other hand, the motion vector distributions with λ = 0.019-0.036 have few vector crossings and the fine movement, so they seem to be appropriate.   Another example showing that the result of the proposed forecasting is not better than that of the persistence forecasting is explained. Figure 16 depicts the box plot of the estimation error at 13:00 on 6 September 2013. It indicates that motion estimation does not reduce the forecasting error compared to that of persistence forecasting at 12:30 on 6 September 2013. Figure 17 depicts three NV distributions at the period, which shows the movement of the lumps of the NVs may differ at different times of the period, such as between 11:30 and 12:00, and 12:00 and 12:30. The difference is clearer in the comparison between Figures 18 and 19: Figure 18 shows the distribution of motion vectors from 11:30 to 12:00 calculated using the NV distributions at 11:30 and 12:00 with the best parameter λ = 0.036 and 0.019, and Figure 19 depicts the distribution of motion vectors from 12:00 to 12:30. Even if the motion vectors in Figure 18 are moved forward in parallel as explained in Figure 4 (such as from vector AB to vector BC), the moved distribution is different from Figure 19. That is to say, the assumption that the movement from t − ∆t (11:30) to t (12:00) persists in the period from t (12:00) to t + ∆t (12:30) is not suitable for this period.

Conclusions
In this study, a parameter for motion estimation, which was insufficiently discussed in a previous study [22], was investigated for a certain geographic area during drastic change periods. The method proposed in this paper is achieved by normalizing the geographically distributed PV output data, converting them to mesh data, analyzing those motions, generating estimated mesh data from the motions, and converting them into the output of each PV for forecasting. The forecasting was performed using the output data of 5097 PVs located in Japan (31 • 20 -39 • 80 , longitude 129 • 6 -141 • 60 ) for every 30 min from August 2013 to September 2014. The period when output changed drastically in a certain area within a radius of 15 km in the Kanto region (latitude 35 • 78 and longitude140 • 04 ) was extracted to evaluate the forecasting method. The parameters were evaluated from an error distribution of 101 PV units. In comparison with the persistence forecasting, the parameter for motion estimation, which exhibited the smallest error, was able to significantly reduce the error from 23.7% to 10.3%. Motion estimation indicates that the proposed forecasting with a large λ does not reduce the error compared to persistence forecasting. This indicates that low λ estimation improves the prediction by considering local motion when estimating the motion of a wide range of non-uniform NVs.
This result indicates that the parameter presents the best forecast for motion estimation and reduces the error by 56.6% than that of persistence forecasting. The proposed method can effectively forecast for periods when PV output changes drastically.
The following studies should be conducted in the future 1.
Correlation between the number of PV data affecting the evaluation area and forecasting accuracy: The proposed forecasting was evaluated using data obtainable in this case. However, the positional relationship and density of PV output in the evaluation area that affect the forecasting accuracy have yet to be investigated. Therefore, it should be considered in the future.

2.
Consideration of ensemble learning: It has been reported that ensemble learning, which combines different forecasting models, improves the forecasting accuracy. Therefore, we believe that further improvement of accuracy can be achieved by combining ensemble learning with statistical methodology.