A Study on the Power Generation Prediction Model Considering Environmental Characteristics of Floating Photovoltaic System

Abstract: This paper verifies the environmental factors affecting the power generation of floating photovoltaic (PV) systems and presents power generation prediction models that consider these factors, using regression analysis and neural networks, both of which have been studied extensively during the last decade. The study focuses on a comparative analysis of which model is best suited to predicting the power generation of a floating PV system. To compare the power generation characteristics of floating and land-based PV systems, two identical 2.5 kW PV systems were installed, one on the water surface at the Boryeong Dam, Korea, and the other nearby on dry land, and their performances were compared. The solar irradiance of the floating PV system was 1.1% lower than that of the land-based PV system. Nevertheless, the floating PV module temperature was 4.9% lower than that of the land-based PV system, and the floating system generated approximately 3% more power. Using correlation analysis, a data mining technique, the environmental factors affecting the efficiency of the floating PV system were investigated. The correlation coefficient between the module temperature and water temperature was r = 0.6317, which indicates that the high efficiency and low module temperature of the floating PV system, when compared with the land-based PV system, are due to the water evaporation effect. Power generation prediction models based on regression analysis and neural networks that consider the environmental factors are presented, and their accuracies are compared. This comparison confirms that the accuracy of the prediction model using neural networks was approximately 2.59% higher than that of the regression analysis method. By adjusting the number of hidden nodes in the neural network algorithm, it was confirmed that a neural network with ten hidden nodes was the most suitable for calculating the amount of power generation.


Introduction
Recently, the renewable energy market has been growing rapidly, owing to fossil fuel reduction and environmental problems [1,2]. With the expansion of the renewable energy market, the global photovoltaic (PV) capacity is projected to exceed 1 TW by 2022 [3]. In particular, the solar power market for large-scale PV power-generation systems with capacities in the MW range or higher is expanding rapidly. However, it is difficult to secure land for the construction of such large-scale systems; hence, large-scale floating PV facilities have been proposed as high-efficiency alternatives [4]. As of 2018, 1314 MW of floating PV systems were installed worldwide, and it has been predicted that 5,211,086 GWh/y of electricity will be generated if 10% of the man-made reservoir surfaces are used for floating PV systems [5]. The growth of the floating PV market has been accompanied by a variety of research. In Korea, as in many countries around the world, several kinds of research on floating solar systems are being conducted, such as research on the installation of floating PVs in oceans, the grounding of floating PVs, and tracking-type floating PVs that follow the sun [6,7]. The floating PV system is a new concept of power plant, which uses buoyant bodies to float solar power plants on the surfaces of dam reservoirs. The system is designed such that the solar module is fixed to a float that is secured by a mooring device on the water surface [8]. Figure 1 shows the concept of a typical floating PV system. The benefit of a large-scale floating PV system is that it can be installed in the reservoir of a dam without the need for deforestation, which is a common problem encountered while installing land-based PV systems.
The first floating PV system was built in 2007 in Aichi, Japan, following which several other countries, including France, Italy, Korea, Spain, and the United States, installed small-scale systems for research and development (R&D) purposes [5]. Recently, floating PV systems have been widely deployed in many countries, and their capacity has increased to several tens of MW or higher. A 7.5 MW system on the Umenoki reservoir, Japan, and a 6.3 MW system on the Queen Elizabeth II Lake, UK, are currently in operation. In Huainan City, China, a 40 MW PV system is in operation [5,9]. In Korea, a 100 kW demonstration floating PV system was installed, for the first time, on the water surface of the Hapcheon Dam reservoir in October 2011. After the successful installation of this system, a 500 kW commercial floating PV system was installed at another nearby location in 2012. As of 2020, the representative floating PV systems operating in Korea are a 2 MW system on the Boryeong Dam reservoir, a 3 MW system on the Chungju Dam reservoir, and an 18.7 MW system on the lagoon at the Gunsan Industrial Complex. Figure 2 shows the 3 MW floating PV system installed on the reservoir of the Chungju Dam, Korea.
The power generation characteristics of PV modules are heavily influenced by the cell/module temperature [10]. Generally, the PV module output can be estimated as shown in Equation (1); thus, the output of a PV module at an actual site may differ from its rating because the module temperature differs from that of the standard test conditions (STC), under which the solar radiation intensity is 1000 W/m², the module temperature is 25 °C, and the air mass (AM) coefficient is 1.5 [11].
$$P = G_T \, \tau_{pv} \, \eta_{T_{ref}} \, A \, \left[ 1 - 0.0045 \, (T_c - 25) \right] \qquad (1)$$
where $P$ is the output power of the PV module [W], $G_T$ is the solar irradiance on the module plane [W/m²], $\tau_{pv}$ is the transmittance of the PV cells' outside layers, $\eta_{T_{ref}}$ is the electrical efficiency of the module at a specified temperature, $A$ is the surface area of the PV module [m²], and $T_c$ is the module operating temperature [°C]. From Equation (1), it can be expected that the amount of power generation decreases by 0.45% for every 1 °C that the module temperature rises above 25 °C.
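The temperature dependence described above can be illustrated with a short sketch. It assumes the temperature-corrected form P = G_T · τ_pv · η_Tref · A · [1 − 0.0045 (T_c − 25)], which is consistent with the quoted 0.45% loss per °C; the function name and the module parameters are hypothetical values chosen only for illustration, not those of the modules used in this study.

```python
def pv_module_power(g_t, tau_pv, eta_ref, area, t_cell,
                    temp_coeff=0.0045, t_ref=25.0):
    """Module output power [W] with a linear temperature correction.

    g_t     : irradiance on the module plane [W/m^2]
    tau_pv  : transmittance of the outer layers of the PV cells
    eta_ref : electrical efficiency at the reference temperature
    area    : module surface area [m^2]
    t_cell  : module operating temperature [deg C]
    """
    return g_t * tau_pv * eta_ref * area * (1.0 - temp_coeff * (t_cell - t_ref))


# Hypothetical module parameters (assumptions, not measured values).
params = dict(g_t=1000.0, tau_pv=0.9, eta_ref=0.18, area=1.6)

p_25 = pv_module_power(t_cell=25.0, **params)
p_35 = pv_module_power(t_cell=35.0, **params)
relative_loss = (p_25 - p_35) / p_25

print(f"P at 25 C: {p_25:.1f} W, P at 35 C: {p_35:.1f} W, loss: {relative_loss:.1%}")
```

Under these assumptions, a module that runs 10 °C hotter loses 4.5% of its output, which is why even a modest reduction in module temperature on the water surface can offset a slightly lower irradiance.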
Published research has concluded that the power generation of floating PV systems is 10 to 15% greater than that of land-based PV systems because of the cooling effect [12,13]. In one study, this result was obtained with a cooling system that intermittently sprayed water on the modules, and in another, the solar-cell module temperature was presumed to be reduced by the evaporation of water. Moreover, the land-based and floating PV systems compared in those studies differed in several respects, such as the modules, inverters, years of installation, and power-generation capacities. Therefore, there was a limit to the precision with which the power-generation characteristics of the two types of PV systems could be compared. This paper instead investigates PV systems of the same size, installed at nearby locations and without any mechanical cooling system, a configuration rarely reviewed in other papers. In addition, this paper proposes calculating the generated power with a machine learning algorithm, which has seldom been done for floating PV systems.
To develop a large-scale floating PV power plant, the economic feasibility must be secured, and the most important factor in analyzing the economic feasibility is the power-generation prediction. The output forecasting problem for land-based PV systems has been thoroughly studied during the last decade using machine learning and neural networks. However, the installation environment of floating PV systems is different from that of land-based PV systems. Therefore, this paper focuses on selecting the most suitable power generation prediction model, considering the power generation characteristics of the floating PV system. Two identical 2.5 kW PV systems were installed, one on the water surface of the Boryeong Dam reservoir, Korea, and one on the nearby land, to ensure a precise comparison [14]. Based on the operation results over six months, the following facts were confirmed. The solar irradiance of the floating PV system was 1.1% less than that of the land-based PV system. Nevertheless, the floating PV module temperature was 4.9% lower than that of the land-based PV system, and the floating system generated approximately 3% more power. The correlation coefficient between the module temperature and water temperature was r = 0.6317, which indicated that the high efficiency and low module temperature of the floating PV system, when compared with the land-based PV system, were due to the water evaporation effect. Through regression analysis, a power-generation prediction model of a floating PV system considering environmental factors was presented. Recently, neural network analysis has been widely used for estimating power generation because its predictions are more accurate than those of regression analysis [15]. Therefore, a prediction model using neural networks, a representative nonlinear algorithm, was also developed in this study. A hyperbolic tangent was used as the activation function, and prediction models with one to twenty nodes were compared to choose the best model.
In this study, prediction models for floating PV power generation, considering environmental factors, are presented using regression analysis and neural networks. Upon comparing the accuracy, it was confirmed that the prediction model using neural networks was more accurate than the one using regression analysis. Therefore, this study presents more logical and accurate approaches to power generation prediction and economic feasibility analysis, which are essential for the development of large-scale floating PV power plants.

Power Generation Characteristics Comparison System (PGCCS)
To precisely compare and analyze the power generation characteristics of floating and land-based PV systems, depending on the environmental effects, both systems should have the same system specifications and installation conditions, such as PV modules, inverters, the capacity of facilities, installation orientation, and angle [16].
In this study, 2.5 kW power generation characteristics comparison systems (PGCCS) were installed, as shown in Figure 3. The configuration of the power circuit of the 2.5 kW PGCCS is shown in Figure 4. Table 1 lists the specifications of the PV module and Power Conditioning System (PCS) of the PGCCS installed, as shown in Figure 4. The installation direction of PGCCS was south, and the mounting tilt was fixed at 30 degrees.
Appl. Sci. 2020, 10, 4526

To analyze the power generation characteristics, the PV output and meteorological data summarized in Table 2 were collected. Because the PGCCS was installed at the end of June 2017, the analysis is based on the six months of operation from July 1, 2017 to December 31, 2017, and the analysis time was from 7:00 to 19:00, the daytime hours during which solar power is generated. Because the climate in Korea exhibits the same characteristics during spring and autumn, the different seasonal characteristics could be covered within this period of analysis. The analysis was restricted to the daytime because the module temperature is influenced by solar irradiance during the daytime, which directly affects the system efficiency. Night-time was excluded because including it reduced the deviation in the PV module temperature characteristics.

Operating Characteristics
Figures 5-7 show the comparative results of the floating and land-based PV systems in terms of the amount of power generated, the PV module temperatures, and irradiation, respectively.
During the survey period, the total solar irradiances were 622,262 and 615,696 W/m² for the land-based and floating PV systems, respectively, as shown in Figure 7. This means that the solar irradiance of the land-based PV system was 1.1% higher than that of the floating PV system. The slightly lower solar irradiance of the floating PV system was presumed to be caused by the mist that frequently occurred at the water surface. However, as shown in Figure 6, the daytime PV module temperature of the floating PV system was 4.9% lower than that of the land-based PV system. The outputs of the floating and land-based PV systems were 1408 and 1367 kWh, respectively, as shown in Figure 5. Therefore, the output of the floating PV system was 3% higher than that of the land-based PV system.
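The percentage differences quoted above follow directly from the reported totals. The following minimal sketch reproduces the arithmetic; the helper name `percent_higher` is an illustrative assumption, and the numbers are those reported in the text.

```python
def percent_higher(a, b):
    """Return how much higher a is than b, as a percentage of b."""
    return (a - b) / b * 100.0


# Totals reported for the six-month survey period.
irradiance_land, irradiance_floating = 622_262, 615_696  # irradiance totals
energy_floating, energy_land = 1408, 1367                # generated energy [kWh]

irradiance_gap = percent_higher(irradiance_land, irradiance_floating)
energy_gap = percent_higher(energy_floating, energy_land)

print(f"land-based irradiance higher by {irradiance_gap:.1f}%")
print(f"floating PV output higher by {energy_gap:.1f}%")
```

Both values round to the figures stated in the text (1.1% and 3.0%).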

Data Analysis Algorithm
For analyzing the power-generation characteristics of the floating PV system, correlation and regression analyses were used first, following which the neural networks-based analysis was applied as an improved method for more accurate analysis.

Correlation Analysis
Correlation analysis is a method of numerically expressing the correlations of the mutual effects in the data [17]. Pearson's correlation, which is the correlation statistic most widely used to measure the degree of the relationship between linearly related variables, was used to analyze the relationship between the temperatures of the PV modules and the environmental factors, including water temperature and wind speed. Pearson's correlation coefficient, r, is described in Equation (2).
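$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \qquad (2)$$
where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$, $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$, $n$ is the number of samples, and $-1 \le r \le 1$ [18]. The variables $x_i$ and $y_i$, which affect the module temperature of the floating PV, are defined as in Equations (3)-(5).
$$x_i = Tc_{spv} - Tc_{fpv} \qquad (3)$$
$$y_{i1} = T_a - T_w \qquad (4)$$
$$y_{i2} = v_{spv} - v_{fpv} \qquad (5)$$
where $x_i$ is the difference between the temperatures of the land-based PV module, $Tc_{spv}$, and the floating PV module, $Tc_{fpv}$; $y_{i1}$ is the difference between the air temperature at the land-based PV system, $T_a$, and the water surface temperature, $T_w$; and $y_{i2}$ is the difference between the wind speed on land, $v_{spv}$, and that on the water surface, $v_{fpv}$. The variables $x_i$, $y_{i1}$, and $y_{i2}$ are defined as differences between the two sites in order to focus on the precise influence of each variable.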
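As an illustration of how the correlation between the difference variables can be computed, the following sketch implements Pearson's coefficient directly from its definition and applies it to the land-minus-floating differences. The sample values and variable names are hypothetical and serve only to show the calculation; they are not measurements from the PGCCS.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples in deg C (assumed values, for illustration only).
tc_land = [41.0, 45.5, 50.2, 47.8, 39.6]      # land-based module temperature
tc_floating = [39.5, 43.0, 46.9, 45.1, 38.4]  # floating module temperature
t_air = [27.0, 29.5, 31.8, 30.6, 26.2]        # air temperature over land
t_water = [24.5, 25.0, 25.6, 25.4, 24.4]      # water temperature

# Difference variables: module temperature and air-water temperature.
x = [a - b for a, b in zip(tc_land, tc_floating)]
y1 = [a - b for a, b in zip(t_air, t_water)]

r = pearson_r(x, y1)
print(f"r = {r:.4f}")
```

Working with differences rather than raw temperatures removes the variation common to both sites, so the coefficient reflects only how the gap between the modules tracks the gap between air and water.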

Regression Analysis
Regression analysis was used to analyze the power-generation characteristics of the floating PV system according to the change in module temperature. To describe the dependent variable $Y$, the multiple regression model with $k$ independent variables $X_1, \ldots, X_k$ was defined as in Equation (6) [19-21].
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i \qquad (6)$$
where $Y$ is the dependent variable (the PV generated power); $X_k$ are the independent variables, such as the module temperature, slope irradiation, and water temperature; $\beta_0, \beta_1, \ldots, \beta_k$ are the regression coefficients to be estimated; $\varepsilon_i$ is the residual error; and $i$ is the data number.
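The regression coefficients of a multiple regression model are commonly estimated by ordinary least squares. The sketch below solves the normal equations with a small Gaussian elimination routine so that it needs no external library; the function names and the synthetic data are assumptions made for illustration, and the study itself does not state which solver was used.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_multiple_regression(rows, targets):
    """Least-squares estimate of [beta_0, beta_1, ..., beta_k]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic (module temperature, irradiance) samples generated from known
# coefficients; these are not measurements from the floating PV system.
samples = [(30.0, 200.0), (35.0, 400.0), (42.0, 650.0), (48.0, 800.0), (38.0, 950.0)]
true_beta = [0.05, -0.004, 0.0021]
power = [true_beta[0] + true_beta[1] * t + true_beta[2] * g for t, g in samples]

beta = fit_multiple_regression(samples, power)
print([round(b, 4) for b in beta])
```

Because the synthetic targets contain no noise, the fit recovers the generating coefficients; with measured data the residual term absorbs the part of the output that the chosen variables cannot explain.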

Neural Network Analysis
Neural networks, which mimic the information processing functions of the nerves in the brains of organisms, can be classified into supervised and unsupervised learning networks according to the learning method. Figure 8 shows the supervised learning process of neural networks, which uses training data consisting of input values and target values to adjust the weights, that is, the connection strengths, according to certain rules so as to minimize the error between the neural network output value and the target output value. The weights are therefore set such that the correspondence between the inputs and outputs in the given data is reproduced well [22]. In unsupervised learning, on the other hand, no target values corresponding to the input values are provided. In this case, the weights are adjusted for similar input patterns so that each node produces a similar output.
In this study, the multilayer perceptron (MLP) was used, which is the most popular model for supervised learning [23]. The MLP consists of units with many interconnected nodes. Each node function usually consists of a weighted sum followed by a differentiable nonlinear activation function. Figure 9 shows the most commonly used activation functions of the MLP.
In artificial neural networks, the activation function of a node defines the output of that node, given an input or set of inputs. A standard integrated circuit can be expressed as a digital network of activation functions, which can be "ON" (1) or "OFF" (0), depending on the input.
The three activation functions can be expressed as Equations (7)-(9).
Logistic function:
$$f(x) = \frac{1}{1 + e^{-x}} \qquad (7)$$
Hyperbolic tangent function:
$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (8)$$
Identity function:
$$f(x) = x \qquad (9)$$
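To make the role of the activation function concrete, the following sketch implements the logistic, hyperbolic tangent, and identity functions and uses them in the forward pass of an MLP with one hidden layer: each hidden node applies the activation to a weighted sum of the inputs, and the output node uses the identity function, as is usual for regression. The weights, biases, and inputs are hypothetical values, not those of the trained models in this study.

```python
import math


def logistic(x):
    """Logistic (sigmoid) activation, with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hyperbolic_tangent(x):
    """Hyperbolic tangent activation, with output in (-1, 1)."""
    return math.tanh(x)


def identity(x):
    """Identity activation, typically used at a regression output node."""
    return x


def mlp_forward(inputs, hidden_weights, hidden_biases,
                output_weights, output_bias, activation=hyperbolic_tangent):
    """Forward pass of a single-hidden-layer MLP with one output node."""
    hidden = [
        activation(sum(w * v for w, v in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return identity(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical network: two inputs and two hidden nodes (assumed values).
hidden_weights = [[0.5, -0.2], [0.1, 0.3]]
hidden_biases = [0.0, 0.1]
output_weights = [1.2, -0.7]
output_bias = 0.05

prediction = mlp_forward([0.8, 0.4], hidden_weights, hidden_biases,
                         output_weights, output_bias)
print(f"prediction = {prediction:.4f}")
```

The hyperbolic tangent is a rescaled logistic function, tanh(x) = 2·logistic(2x) − 1, so the two differ mainly in output range; the zero-centred output of tanh is one common reason for preferring it in hidden layers.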

Influence Factors
To analyze the environmental factors affecting the floating PV module temperature, the correlations among the module temperature difference, the air-water temperature difference, and the wind speed difference between land and water were analyzed. Figure 10a shows a scatter plot that compares the module temperature difference between the land-based and floating PV modules with the temperature difference between the air over land and the water [19]. The scatter plot has a regular form and a positive correlation: the larger the temperature difference between the air over land and the water, the larger the module temperature difference between the land-based and floating PV modules. The linear regression algorithm shows a high correlation coefficient of r = 0.6317. The degree of correlation corresponding to a given value of r is generally interpreted as in Table 3 [24,25]. Thus, a significant correlation exists between the land-based and floating module temperature difference and the air-water temperature difference.
Figure 10b shows the results of the nonlinear learning method applied to improve on the results of the linear regression analysis. With this analysis, the correlation coefficient r improved to 0.6541, and the relationship between the two variables exhibited nonlinearity. Figure 11 shows a scatter plot of the module temperature difference against the wind speed difference between land and water. The plot shows no special pattern, and the correlation coefficient is very low (0.0032), which indicates that there is no specific correlation. Therefore, the module temperature, which directly affects power generation, has a very low correlation with wind speed, whereas the change in water temperature has a significant correlation with the module temperature.
Table 4 summarizes the correlation analysis for calculating the amount of solar power generated by the floating PV module: solar radiation is the most influential factor, followed by the module temperature and the ambient temperature, in sequence. The wind speed and direction have little effect on power generation. From Table 4, the correlation between the horizontal surface irradiance and slope irradiance, and that between the ambient temperature and module temperature, are 0.97 and 0.89, respectively, which means that the data in each pair are relatively similar. Thus, it is desirable to consider only the slope irradiance and module temperature.

Prediction Model Based on Linear Regression Analysis
Table 5 summarizes the analysis results of the floating PV regression. From this table, the multiple correlation coefficient is 0.9859, which is higher than that of the land-based PV system. In detail, the t-statistic of irradiation is 178.63; therefore, the linear relationship between the independent variable and the dependent variable is very strong. As the p-values of the module temperature and solar irradiation are 0 and that of the y-intercept is 0.01, all of which are less than 0.05, the null hypothesis that there is no linear relationship is rejected. Therefore, the module temperature and solar irradiation can be used as the variables [26]. As a result, the formula for estimating the power generation of the floating PV module using linear regression is described in Equation (10).
$$P_{fpv} = \beta_0 + \beta_1 T_c + \beta_2 G \qquad (10)$$
where $P_{fpv}$ is the floating PV module output (kW), $T_c$ is the module temperature (°C), $G$ is the solar irradiation amount (W/m²), and $\beta_0$, $\beta_1$, and $\beta_2$ are the intercept and coefficients estimated by the regression (Table 5). The floating PV operating results data were split at random: 50% of the data were used for training, and the remaining 50% were used for verification, to confirm whether overfitting occurred. In addition, an error function, the mean absolute percentage error (MAPE), was used to compare the accuracy, as shown in Equation (11) [27].
$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{E_i - D_i}{E_i} \right| \qquad (11)$$
where $D_i$ is the value predicted by the predictive model, $E_i$ is the measured value, and $n$ is the number of datasets. Figure 12 shows the results of the linear regression analysis with the learning data. The linearity can be confirmed by plotting the true output against the predicted output. Expressed as MAPE, the error was 9.61%. Figure 13 shows the results obtained by entering the test data into the learning model generated from the learning data. The MAPE is 11.35%, which is greater by 1.74 percentage points, but it was confirmed that overfitting did not occur.
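The validation procedure described above, a random 50/50 split followed by a MAPE comparison between the learning and test data, can be sketched as follows. The MAPE here is taken relative to the measured value, which is the usual definition; the helper names, the fixed random seed, and the sample records are assumptions for illustration only.

```python
import random


def mape(measured, predicted):
    """Mean absolute percentage error [%], relative to the measured values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((e - d) / e) for e, d in zip(measured, predicted))
    return 100.0 * total / len(measured)


def random_half_split(records, seed=0):
    """Shuffle a copy of the records and split it into two halves."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical (measured kW, predicted kW) pairs, for illustration only.
records = [(1.0, 1.1), (2.0, 1.8), (0.5, 0.55), (1.5, 1.5), (2.2, 2.0), (0.8, 0.9)]
train, test = random_half_split(records)

train_mape = mape([e for e, _ in train], [d for _, d in train])
test_mape = mape([e for e, _ in test], [d for _, d in test])
print(f"training MAPE: {train_mape:.2f}%, test MAPE: {test_mape:.2f}%")
```

A test MAPE that is only slightly higher than the training MAPE suggests that the model generalizes, whereas a large gap points to overfitting. Because the measured value appears in the denominator, samples with zero measured output have to be excluded before the MAPE is computed.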

Prediction Model Based on Neural Network Analysis
Neural networks are used as representative nonlinear algorithms, which can be learned by adjusting the layer, node, and activation functions. In this work, the hyperbolic tangent was used as the activation function, and the numbers of nodes were set as 1, 5, 10, and 20. Before those models were made, the dataset was preprocessed to remove the outlier data to have more than 40% MAPE in the linear model, which was also applied in linear regression. Figure 14 shows the learning result model with one hidden layer node.

Prediction Model based on Neural Network Analysis
Neural networks are representative nonlinear algorithms whose learning behavior can be tuned by adjusting the number of layers, the number of nodes, and the activation function. In this work, the hyperbolic tangent was used as the activation function, and the number of hidden nodes was set to 1, 5, 10, and 20. Before the models were built, the dataset was preprocessed to remove outliers, defined as data with a percentage error of more than 40% in the linear model; the same preprocessing was applied to the linear regression. Figure 14 shows the model with one hidden node.
In Figure 15, the training results are stable. The MAPE of 10.25% is 0.64 percentage points higher than the 9.61% of the linear regression, so the two can be judged to be almost equivalent. Figure 16 shows the result of verifying the accuracy by entering the test data into the trained model. The MAPE was 12.17%, which is similar to that of the training result and indicates no overfitting.
Figure 17 shows a model that predicts the power generation from eight input variables with five hidden nodes. Figure 18 shows that the percentage error is reduced to 8.78% after training with five hidden nodes. Figure 19 presents the results of verifying the accuracy of the model by entering test data into the trained model. The percentage error is 10.36%, an increase of 1.58 percentage points; however, the percentage error remains lower than that obtained in the corresponding training and verification of the linear regression analysis.
Figure 21 shows that the percentage error is reduced to 6.98% after training with ten hidden nodes. Figure 22 presents the results of verifying the accuracy of the model by entering verification data into the trained model. The percentage error is 8.8%, an increase of 1.82 percentage points; however, the percentage error remains lower than that of both the linear regression analysis and the neural network with five hidden nodes.
Figure 24 shows that the percentage error after training with twenty hidden nodes is 7.54%. Figure 25 presents the results of verifying the accuracy of the model by entering verification data into the trained model. The percentage error is 11.35%, an increase of 3.81 percentage points. This confirms that overfitting occurred, with an error increase much larger than that of the linear regression or of the neural networks with 1, 5, and 10 hidden nodes.
Table 6 summarizes the analysis of the linear and nonlinear algorithms reviewed to predict the amount of power generated by the floating PV system. The linear regression prediction model showed a percentage error of 9.61-11.35%, and the neural network prediction model with five hidden nodes showed an error range of 8.78-10.36%. Overall, the neural network prediction model, a nonlinear algorithm, showed 2.59% higher accuracy on average than the linear regression prediction model. The number of hidden nodes was varied from one to twenty; overfitting occurred with twenty hidden nodes, and ten hidden nodes produced the most accurate results.
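The preprocessing and model structure described in this section can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names, the sample data, the linear output node, and the random weight initialization are all assumptions made for illustration, and no training step is included. The sketch shows the MAPE metric, the 40% outlier filter, and the forward pass of a network with one hyperbolic tangent hidden layer and eight inputs.

```python
import math
import random

def percentage_error(actual, predicted):
    """Absolute percentage error (%) of one prediction; undefined if actual == 0."""
    return abs(actual - predicted) / abs(actual) * 100.0

def mape(actuals, predictions):
    """Mean absolute percentage error (%) over paired values."""
    errors = [percentage_error(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)

def drop_outliers(samples, linear_predict, threshold=40.0):
    """Keep (inputs, target) pairs whose linear-model error is <= threshold %."""
    return [(x, y) for x, y in samples
            if percentage_error(y, linear_predict(x)) <= threshold]

def init_network(n_inputs, n_hidden, seed=0):
    """Random weights for one tanh hidden layer and a linear output (assumed)."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": 0.0,
    }

def forward(net, x):
    """Forward pass: tanh hidden layer followed by a linear output node."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]

def n_parameters(net):
    """Total number of trainable weights and biases."""
    return (sum(len(ws) for ws in net["w_hidden"]) + len(net["b_hidden"])
            + len(net["w_out"]) + 1)

# Hypothetical (inputs, target) pairs and a hypothetical linear model.
samples = [([1.0], 10.0), ([2.0], 20.0), ([3.0], 55.0)]
kept = drop_outliers(samples, linear_predict=lambda x: 10.0 * x[0])
print(len(kept), "of", len(samples), "samples kept")

# Eight inputs as in the text; hidden-node counts as studied.
for n_hidden in (1, 5, 10, 20):
    net = init_network(n_inputs=8, n_hidden=n_hidden)
    print(n_hidden, "hidden nodes ->", n_parameters(net), "parameters,",
          "untrained output:", forward(net, [0.1] * 8))
```

With eight inputs, the parameter count of such a network grows from 11 for one hidden node to 201 for twenty, which is one way to see why a larger hidden layer can fit the training data more closely while generalizing worse.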
In conclusion, the power generation should be calculated with a nonlinear model, and the number of hidden nodes should be chosen carefully to avoid overfitting, as summarized in Table 6.
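The comparison can be retraced from the error values quoted in the text. The following Python sketch tabulates the training and verification errors and the gap between them. It assumes that the linear regression range of 9.61-11.35% corresponds to the training and verification errors, respectively, which the text does not state explicitly. Under that assumption, averaging the training and verification differences between the linear regression and the ten-node network gives 2.59 percentage points, matching the figure quoted above; this is an observation about the arithmetic, not a statement of how the authors computed it. The names `reported`, `gap`, and `mean_gain` are illustrative.

```python
# MAPE values (%) quoted in the text, as (training, verification).
# Assumption: for linear regression, 9.61 is the training error and
# 11.35 the verification error (the text gives only the range).
reported = {
    "linear regression": (9.61, 11.35),
    "NN, 1 hidden node": (10.25, 12.17),
    "NN, 5 hidden nodes": (8.78, 10.36),
    "NN, 10 hidden nodes": (6.98, 8.80),
    "NN, 20 hidden nodes": (7.54, 11.35),
}

def gap(train, verify):
    """Increase in error from training to verification, in percentage points."""
    return round(verify - train, 2)

gaps = {name: gap(*errors) for name, errors in reported.items()}

# Model with the lowest verification error.
best = min(reported, key=lambda name: reported[name][1])

# Improvement of the best model over linear regression, averaged over
# the training and verification errors.
lin = reported["linear regression"]
top = reported[best]
mean_gain = round(((lin[0] - top[0]) + (lin[1] - top[1])) / 2, 2)

for name, (train, verify) in reported.items():
    print(f"{name:20s} train={train:5.2f} verify={verify:5.2f} gap={gaps[name]:4.2f}")
print("best:", best, "| mean gain over linear regression:", mean_gain)
```

The gap column makes the overfitting criterion explicit: the twenty-node network has the largest increase from training to verification error, while the ten-node network has the lowest verification error.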

Conclusions
In this study, a power-generation prediction model for a floating PV system using neural networks was compared with one based on regression analysis. The characteristics of the floating PV power generation model verified in this study are summarized as follows:
• The existing knowledge that the cooling effect and efficiency of the floating PV system increase with the evaporation of water from the surface was supplemented by more objective and concrete analyses. Based on the operation of two 2.5 kW PV systems, one land-based and one floating, installed near and on the water surface of the Boryeong Dam, Korea, respectively, the results were analyzed comparatively by applying a statistical probability method.
• During the period of analysis, the amount of solar irradiation for the land-based PV system was approximately 1.1% higher than that of the floating PV system; however, the average module temperature of the floating PV system was 4.9% lower than that of the land-based PV system.

• To analyze why the module temperature of the floating PV system was lower than that of the land-based PV system, the correlations between the module temperature difference and the air-water temperature, and between the module temperature and wind speed, were analyzed. The results showed a positive correlation between the module temperature difference and the land and water temperature difference, with a correlation coefficient of 0.6451. On the other hand, there was no correlation between the module temperature difference and the wind speed, with a correlation coefficient of 0.0032. Based on the operational data, regression analysis yielded a characteristic function that predicts the power generation while taking solar radiation and module temperature into account.

• This study proved that the efficiency of a floating PV system was higher than that of a land-based PV system owing to the cooling effect of water, although the size of the difference depends on the environmental factors of the area. Using the water temperature characteristics, it was possible to predict the amount of electricity generated more accurately.

• In addition, the accuracy of the power generation prediction model using neural networks was approximately 2.59% higher than that of the regression analysis method. By adjusting the number of hidden nodes in the neural network algorithm, it was confirmed that a neural network with ten hidden nodes was most suitable for calculating the amount of power generation.

• This paper is based on 2.5 kW research equipment installed on a dam and in a nearby area in South Korea. Therefore, the results of this study make it possible to predict the power generation of a floating PV system more accurately in environments similar to the study conditions, such as installation location, system scale, and configuration.
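The correlation coefficients cited in the conclusions are of the kind computed by the Pearson formula, sketched below in Python. The text does not name the exact estimator used, so treating it as the Pearson coefficient is an assumption, and the data values in the sketch are hypothetical and serve only to show the calculation; they are not measurements from the Boryeong Dam site.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either sequence is constant.
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only (not measured data).
module_temp_diff = [1.2, 2.5, 3.1, 4.0, 4.4, 5.9]  # land minus floating module, deg C
land_water_diff = [0.8, 2.9, 2.2, 4.5, 3.6, 5.1]   # land minus water temperature, deg C

r = pearson_r(module_temp_diff, land_water_diff)
print("correlation coefficient:", round(r, 4))
```

A value near 1 indicates a strong positive linear relationship, and a value near 0, such as the 0.0032 reported for wind speed, indicates no linear relationship.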