Forecast Research on Multidimensional Influencing Factors of Global Offshore Wind Power Investment Based on Random Forest and Elastic Net

Abstract: In recent years, countries around the world have begun to develop low-carbon energy sources to alleviate energy shortages and cope with climate change, and offshore wind power has become a new direction for clean energy exploration. However, because offshore wind power investment is complex, forecasting it accurately remains an urgent problem. This paper therefore investigates offshore wind power investment with the aim of improving forecasting accuracy. The random forest (RF) algorithm was used to screen out the key factors among the multi-dimensional influences on global offshore wind power investment, and the elastic net (EN), optimized using the alternating direction method of multipliers (ADMM), was used in the investment forecast model. The results show that the random forest algorithm can effectively screen out the key influencing factors of global offshore wind power investment. Water depth, offshore distance and sweeping area have the greatest influence on the investment. Moreover, compared with other models, the elastic net optimized by ADMM better reflects the changing trend of global offshore wind power investment, with smaller errors and higher regression accuracy. The combined RF–EN model can screen out effective factors from complex multi-dimensional influencing factors and perform high-precision regression analysis, which helps to improve the global offshore wind power investment forecast. The conclusions can support more reasonable planning for the future construction of, and investment in, global offshore wind power projects.


Introduction
With rapid economic development and population growth, global energy demand continues to increase. Environmental pressure and energy shortages have caused many countries to increase investment in clean energy fields such as wind power, solar thermal power and biomass power [1]. As a clean, low-carbon type of renewable energy, wind power has become more and more important in the global energy field, and the siting of wind farms has gradually shifted from land to ocean [2]. To date, Europe and China together account for close to 98% of the world's total installed capacity and constitute the main markets for offshore wind power. According to the "Global Wind Energy Report 2019" published by the Global Wind Energy Council in March 2020, the cumulative installed capacity of global offshore wind power exceeded 29 GW by the end of 2019: 21930 MW in Europe, 7204 MW in the Asia-Pacific region, and 30 MW in the Americas. In terms of cumulative installed capacity, Europe is still the world's largest offshore wind power market [3]. However, investment in offshore wind farms remains complex because of uncertain factors and high risks [4], and it is difficult to make investment plans for new offshore wind farms. Only a few studies [5,6] have developed an accurate forecasting method that accounts for multiple influencing factors. Therefore, it is of great significance to comprehensively analyze the key influencing factors of project investment, accurately forecast the investment, and effectively control the cost level, so that more reasonable investment plans can be made for offshore wind power.
At present, the published literature on global offshore wind power projects mainly focuses on project cost and management strategy [7–9], and little research addresses the forecast of overall project investment. Chinmoy et al. (2019) took the comprehensive cost into consideration and built a clean energy collaborative optimization model from an overall perspective [10]. Ge et al. (2020) proposed an optimization model for offshore wind farm maintenance planning that takes the minimum maintenance cost as the constraint [11]. Kucuksari et al. (2019) calculated the energy economic cost indicators of offshore wind power and developed a cost model [12]. Comprehensive cost and energy allocation analysis methods that combine output characteristics [13,14], load forecasting [15] and demand side management [16] have also been established. From the perspective of the value chain, the control of the cost level of offshore wind power projects has been studied as well [17–19]. The most common research method for the investment cost of offshore wind power projects is currently life cycle analysis [20–22]. Maienza et al. (2020) assembled a life cycle cost model for floating offshore wind farms from the latest data and parametric equations in databases and the literature [23]. Ioannou et al. (2018) established a life cycle techno-economic model to analyze the costs and benefits of offshore wind energy [24]. However, the literature above has not established a complete system for forecasting investment in offshore wind power projects that considers multiple influencing factors.
Since the construction of global offshore wind power is a complicated process, its investment is affected by many factors. To forecast the investment level accurately, the factors affecting global offshore wind power investment must therefore be determined accurately. At present, the methods for screening influencing factors mainly include the fishbone analysis method [25], the system dynamics method, traditional econometric methods [26,27], machine learning methods [28,29] and deep learning methods [30,31]. For the analysis of the factors affecting the global offshore wind power industry, current research mainly adopts traditional hierarchical structures. Xu, Y et al. used the interpretive structural model (ISM) to study the relationships between the influencing factors of the offshore wind power industry [32]. However, relatively few studies apply learning-based methods to the analysis of the factors affecting global offshore wind power investment. Among such methods, the random forest algorithm, an ensemble learning algorithm, shows good results in factor selection. Miao, S et al. used the random forest algorithm to study the relationship between the negative air ion concentration (NAIC) in the urban atmosphere and environmental factors such as meteorological factors and air pollutants [33]. Liu, D et al. used the random forest method to study the fluctuations of gold prices and showed that random forest identifies factors well [34].
Because the multidimensional influencing factors of global offshore wind power affect investment to different degrees, regression models should be used to analyze the overall level of global offshore wind power investment. Traditional regression models include multiple linear regression, logistic regression, ridge regression, and the elastic net. Multiple linear regression is mostly used when there is an obvious linear relationship between the influencing factors and the dependent variable [35]. Logistic regression is generally used when the dependent variable is a categorical (binary) variable [36]. Ridge regression is a biased estimation regression method dedicated to the analysis of collinear data; it is essentially an improved least squares estimation method [37]. For the multi-dimensional influencing factors of offshore wind power, the elastic net can achieve a better regression effect because it can rank and select among the different influencing factors. Chen, BZ et al. performed high-dimensional least squares matrix regression through the elastic net to deal with group variables, and showed that the elastic net has a grouping effect [38]. Araveeporn, A compared the lasso and elastic net methods, together with the adaptive lasso and adaptive elastic net methods, for high-dimensional data classification, and verified the effectiveness of the elastic net [39].
The goal of this paper is to analyze the key factors affecting global offshore wind power investment and to set up a multi-stage model that forecasts the investment accurately. To this end, the random forest (RF) algorithm was used to determine the importance of each factor, and the elastic net model was used to forecast the investment in offshore wind power projects.
The key points of this study are as follows.
(1) Unlike other studies, this article does not analyze global offshore wind power investment from the perspective of project cost and management strategy, but from the perspective of overall offshore wind power investment. This paper uses a two-stage model combining random forest and elastic net for the analysis.
(2) The random forest algorithm is used to filter out key factors, and the resulting insights help to identify the key factors affecting global offshore wind power investment.
(3) The elastic net is used for the regression analysis of the multi-dimensional factors influencing global offshore wind power investment and to improve the level of investment forecasting. The forecasting performance of different regression models on small-sample investment data is compared.
(4) Based on the regression coefficients, this study further analyzes how each influencing factor affects the investment.
The remainder of this paper is organized as follows. Section 2 presents the main methods used in this paper. Section 3 describes the design of the global offshore wind power investment regression model; the data and variables are detailed in this section, and the results of the different models are compared. Section 4 analyzes the results obtained in this paper. Section 5 summarizes the main findings, lists the limitations, and proposes further research directions.

Random Forest Algorithm
This study uses the random forest algorithm to screen the influencing factors, reduce the dimension of the input variables and identify the key factors, taking into account the complex influencing factors and many uncertainties of global offshore wind power projects. Many studies have shown that the random forest algorithm is effective at identifying factors [40]. Richmond et al. (2020) used random forest to identify key factors and proposed a random-forest-based method for selecting the input vectors of a wind power forecast model, which has better overall performance [41]. Wei et al. (2020) used the random forest method to filter out important variables in order to forecast and give alerts for different energy efficiency characteristics [42]. Random forest regression has also been applied to spot market clearing price prediction, where it was used to establish a market clearing price forecast model.
The random forest algorithm was proposed by Breiman (2001) and is based on the Bagging algorithm and ensemble learning [43]. It is a combined classification and prediction algorithm: it aggregates the forecasting results of multiple decision trees, each built on a sample obtained by bootstrap resampling, and outputs the combined result. By integrating many randomized decision tree models, it can overcome the over-fitting problem caused by improper classification of the high-dimensional data of global offshore wind power investment, and it can also raise the forecasting and classification accuracy.
The process of using the random forest algorithm to calculate the importance of global offshore wind power influencing factors is as follows.
Once the random forest $\{T_n(x)\}_{n=1}^{ntree}$ has been built, the out-of-bag data corresponding to the $n$-th regression tree $T_n$ in the forest is denoted $OOB_n$. It can be calculated by Equation (1).
When the amount of raw data $N$ is large enough, $k \approx 0.368N$, where $k$ is the number of samples in $OOB_n$. The regression tree $T_n$ is used to forecast $OOB_n$, and the mean square error of $OOB_n$ can be calculated by Equation (2),
where $y_i$ is the $i$-th actual measurement of the response variable in $OOB_n$ and $\hat{y}_i$ is the $i$-th predicted value of the response variable in $OOB_n$. In this way, the mean square errors of the ntree out-of-bag data sets can be obtained.
The data of the global offshore wind power influencing factor $X_j$ ($1 \le j \le p$) are then randomly permuted in the ntree OOB samples while the other columns are kept unchanged, and the mean square errors are recomputed for every factor to form a mean square error matrix. The mean square errors of the unpermuted out-of-bag data are subtracted from the $j$-th row vector of this matrix, and the average of the differences is divided by the standard error to obtain the importance score (increase of mean squared error, IncMSE) of the global offshore wind power influencing factor $X_j$,
where $t_i$ and $\bar{y}$ are, respectively, the $i$-th actual measured value and the mean value of the response variable in the out-of-bag data.
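The importance calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tree interface (`trees[n](row)` returning a prediction), the helper names and the fallback used when all per-tree increases are identical are assumptions made for illustration. The sketch computes the limiting out-of-bag fraction $(1 - 1/N)^N \approx 0.368$ and a permutation-based IncMSE score for one factor.

```python
import random
import statistics


def oob_fraction(n_samples):
    """Probability that one sample is never drawn in a bootstrap sample of size N."""
    return (1.0 - 1.0 / n_samples) ** n_samples


def mse(y_true, y_pred):
    """Mean squared error over one out-of-bag set."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def inc_mse(trees, oob_sets, j, rng):
    """Importance of factor j: mean increase in OOB MSE after permuting
    column j, divided by the standard error of the per-tree increases.

    trees    -- list of callables, trees[n](row) -> prediction (hypothetical interface)
    oob_sets -- list of (X, y) pairs holding the out-of-bag rows of each tree
    """
    increases = []
    for tree, (X, y) in zip(trees, oob_sets):
        base = mse(y, [tree(row) for row in X])
        column = [row[j] for row in X]
        rng.shuffle(column)  # permute factor j only, keep other columns unchanged
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        permuted = mse(y, [tree(row) for row in X_perm])
        increases.append(permuted - base)
    mean_increase = statistics.mean(increases)
    std_err = statistics.stdev(increases) / len(increases) ** 0.5
    # Assumption: when every increase is identical, return the unscaled mean.
    return mean_increase / std_err if std_err > 0 else mean_increase
```

A factor that the trees never use leaves the out-of-bag error unchanged under permutation and therefore receives a score of zero, while a factor that drives the predictions receives a positive score.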

Elastic Net
The elastic net is a mixture of ridge regression and Lasso regression. The ultimate goal of linear regression is to minimize the loss function, and the basic idea is the least squares method (see Equation (7)). Lasso regression and ridge regression add an L1 regularization term and an L2 regularization term, respectively, to the least squares loss. That is, Lasso uses the sum of the absolute values of the coefficients (Equation (8)) as a penalty, while ridge regression uses the sum of the squares of the coefficients (Equation (9)),
where $N$ is the number of samples, $y_i$ is the true value of the dependent variable, $\hat{y}_i$ is the forecast value, $\lambda$ is the penalty parameter, $k$ is the number of features of the sample, and $\beta$ is the vector of regression coefficients.
When it is not clear whether L1 regularization or L2 regularization is better for parameter updating, the two can be combined to form the elastic net model, whose loss function adds both penalty terms to the least squares loss, weighted by the mixing parameter $\alpha$ of the elastic net. When $\alpha$ is 0, the model reduces to ridge regression; when $\alpha$ is 1, it reduces to Lasso regression. $\|\beta\|_{L_2}^2$ is the ridge regression term and $\|\beta\|_{L_1}$ is the Lasso regression term.
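For reference, the four loss functions described in this section are commonly written as shown in the sketch below. This is one standard parameterization and not necessarily the exact form used in this paper: the $1/N$ scaling, the absence of factors of $1/2$, and the symbol $x_i$ (the feature vector of sample $i$) are assumptions. The forms are chosen so that $\alpha = 0$ gives ridge regression and $\alpha = 1$ gives Lasso regression, as stated above.

```latex
\begin{align}
J_{\mathrm{LS}}(\beta) &= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2,
  \qquad \hat{y}_i = x_i^{\top}\beta \\
J_{\mathrm{Lasso}}(\beta) &= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2
  + \lambda\sum_{j=1}^{k}\lvert\beta_j\rvert \\
J_{\mathrm{Ridge}}(\beta) &= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2
  + \lambda\sum_{j=1}^{k}\beta_j^2 \\
J_{\mathrm{EN}}(\beta) &= \frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2
  + \lambda\left(\alpha\lVert\beta\rVert_{L_1}
  + (1-\alpha)\lVert\beta\rVert_{L_2}^2\right)
\end{align}
```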
The strength of the elastic net is that it can not only perform feature selection, which ridge regression cannot do, but also achieve feature grouping, which Lasso cannot do. In elastic net modeling, the penalty parameter $\lambda$ and the mixing parameter $\alpha$ need to be determined. Generally, the criterion for selecting the optimal model is the minimum mean squared error (MSE).
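How $\lambda$ and $\alpha$ can be chosen by minimum MSE is sketched below in plain Python. The objective, $(1/N)\sum_i (y_i - x_i^{\top}\beta)^2 + \lambda(\alpha\|\beta\|_{L_1} + (1-\alpha)\|\beta\|_{L_2}^2)$, is an assumed parameterization, and the solver is ordinary cyclic coordinate descent without an intercept (the inputs are assumed to be centered). The function names are hypothetical, and the grid search over a validation set is one simple selection scheme, not necessarily the procedure used in this paper.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam, alpha, n_iter=500):
    """Cyclic coordinate descent for the assumed elastic net objective."""
    n, k = len(X), len(X[0])
    beta = [0.0] * k
    residual = list(y)  # residual = y - X.beta, and beta starts at zero
    for _ in range(n_iter):
        for j in range(k):
            col = [row[j] for row in X]
            z_j = sum(v * v for v in col) / n
            denom = z_j + lam * (1.0 - alpha)
            if denom == 0.0:
                continue  # all-zero column with no ridge term: leave beta_j at 0
            # covariance of factor j with the residual that excludes factor j
            rho_j = sum(v * (r + v * beta[j]) for v, r in zip(col, residual)) / n
            new_b = soft_threshold(rho_j, lam * alpha / 2.0) / denom
            delta = new_b - beta[j]
            if delta != 0.0:
                residual = [r - v * delta for v, r in zip(col, residual)]
                beta[j] = new_b
    return beta


def mse(X, y, beta):
    """Mean squared error of the linear predictor X.beta."""
    preds = [sum(b * v for b, v in zip(beta, row)) for row in X]
    return sum((yi - p) ** 2 for yi, p in zip(y, preds)) / len(y)


def select_by_mse(X_train, y_train, X_val, y_val, lams, alphas):
    """Grid search: return (mse, lam, alpha, beta) with the smallest validation MSE."""
    best = None
    for lam in lams:
        for alpha in alphas:
            beta = fit_elastic_net(X_train, y_train, lam, alpha)
            score = mse(X_val, y_val, beta)
            if best is None or score < best[0]:
                best = (score, lam, alpha, beta)
    return best
```

With $\lambda = 0$ the fit reduces to ordinary least squares, and a sufficiently large $\lambda$ with $\alpha = 1$ shrinks every coefficient to exactly zero, which is the feature selection behavior described above.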

Global Offshore Wind Farm Data
In this paper, 30 offshore wind farms around the world were selected as the research sample. At present, Europe and Asia are the main markets for offshore wind power, and their total installed capacity accounts for 98% of the world total. Therefore, the offshore wind farms selected in this paper are mainly located in Europe and Asia. The data on the offshore wind farms come from CDM (Clean Development Mechanism) files in Wikipedia and related databases. The distribution and information of the 30 offshore wind farms are shown in Table 1 and Figure 1.

Independent Variables
Independent variables are variables that can directly affect the investment in offshore wind power projects. Many factors affect global offshore wind power investment. A literature search shows that these factors include installed capacity, average annual power generation, number of units, engineering level, commissioning time, area, single unit capacity, sweeping area, unit weight, steel price, wind wheel diameter, submarine cable, hub height, number of blades, water depth, distance from shore, rated wind speed, limit wind speed, etc. [44,45]. To determine the level of investment accurately, the influencing factors must be selected accurately. This article first uses the Delphi method to make a preliminary selection of the factors. We invited 15 experts to score the candidate factors listed above in order to identify the variables that are generally perceived to influence global offshore wind power investment. The influence of a variable on the investment level is divided into five grades, each with its own scoring range: a high degree of influence corresponds to 90-100 points, the next grade to 80-90 points, a general degree of influence to 70-80 points, a low degree to 60-70 points, and a very low degree to less than 60 points. The experts scored the candidate variables against these criteria, and their scores and opinions were summarized. The variables with large differences in scores were identified and returned to the experts for a second round of scoring. The second-round results and comments were then fed back to the experts for a third round of scoring, and the average score was finally used as the scoring result for each influencing factor.
The results of the identification of factors affecting global offshore wind power investment are shown in Table 2.
According to the results of the Delphi method, we retain the first- and second-grade influencing factors for further screening, so as to achieve a better forecast of the investment level. These factors can be divided into the following groups. The factors that affect the total investment include the power generation scale, construction scale, construction site selection and wind speed of offshore wind power projects. The power generation scale includes the installed capacity of the project ($x_1$), the average annual power generation ($x_2$), the number of units ($x_3$), and the single unit capacity ($x_4$). The construction scale factors include the sweeping area ($x_5$), wind wheel diameter ($x_6$), submarine cable length ($x_7$), hub height ($x_8$), and number of blades ($x_9$). The construction site selection factors mainly include the water depth ($x_{10}$) and the distance from shore ($x_{11}$). In terms of wind speed, the rated wind speed ($x_{12}$) of the wind farm is selected as the research object.
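The grading and preliminary selection can be illustrated with a short Python sketch. The expert scores below are hypothetical and are not the data in Table 2. Because the stated ranges share their end points (for example, 90 belongs to both 80-90 and 90-100), the sketch assumes that a boundary score is assigned to the higher grade.

```python
from statistics import mean


def influence_grade(score):
    """Map an average expert score (0-100) to grade 1 (high) through 5 (very low).

    Assumption: a score on a boundary (90, 80, 70, 60) takes the higher grade.
    """
    if score >= 90:
        return 1
    if score >= 80:
        return 2
    if score >= 70:
        return 3
    if score >= 60:
        return 4
    return 5


def preliminary_selection(expert_scores, keep_grades=(1, 2)):
    """Keep the factors whose average final-round score falls in keep_grades."""
    retained = []
    for factor, scores in expert_scores.items():
        if influence_grade(mean(scores)) in keep_grades:
            retained.append(factor)
    return retained


# Hypothetical final-round scores from three experts (illustration only).
scores = {
    "water depth": [95, 92, 90],
    "hub height": [85, 80, 84],
    "steel price": [65, 70, 60],
}
print(preliminary_selection(scores))  # ['water depth', 'hub height']
```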
The influences of the independent variables on the investment in global offshore wind farms are as follows.
$x_1$ represents the installed capacity of the project. When it is large, the corresponding generator installation scale is also expanded, which increases the cost and the total investment of the offshore wind power project.
$x_2$ represents the annual average power generation. The larger the power generation, the higher the corresponding cost and the investment required.
$x_3$ is the number of units. The number of units to be built directly affects the cost of an offshore wind power project and thereby influences its total investment.
$x_4$ is the single unit capacity. It is generally believed that the larger the unit capacity, the fewer units are needed and the smaller the construction scale. As a result, the cost level and the total investment level are reduced.
$x_5$ represents the sweeping area of the offshore wind power project. When the area covered by offshore power generation is larger, the single unit capacity required is smaller, the number of units is larger, and the cost level is higher.
$x_6$ is the diameter of the wind wheel. The larger the diameter of the wind wheel, the higher the construction cost and the investment.
$x_7$ is the length of the submarine cables. Because the cost of submarine cables increases linearly with length, the length of the submarine cable has a direct and important relationship with the investment.
$x_8$ is the hub height. It is an important factor influencing fixed costs and indirectly affects the total investment level.
$x_9$ represents the number of blades. It directly affects the fixed cost of offshore wind power construction and the level of investment.
$x_{10}$ is the water depth, taken as the deepest depth at the site. The deeper the water, the higher the technical requirements for construction, and the corresponding cost also increases significantly.
$x_{11}$ is the offshore distance. The longer the offshore distance, the longer the submarine cable, the greater the difficulty of construction, and the higher the cost.
$x_{12}$ represents the rated wind speed. It determines the power generation efficiency of an offshore wind power project and is also a key factor to be considered when investing in such projects.
Input variables are listed in detail in Table 3. The dependent variable selected in this paper is the investment in a single offshore wind power project. We collected the total investment of the 30 offshore wind power projects from the CDM files in Wikipedia; these data are denoted $y_1$. We also calculated the investment per kilowatt of each project using the project scales collected in Section 3.1.1; these data are denoted $y_2$. Either could serve as the dependent variable for investment forecasting. However, the scale of an offshore wind power project must be taken into account when forecasting investment, and using the unit kilowatt investment avoids the large differences in the magnitude of the investment data caused by differences in project scale. Therefore, this paper selects the unit kilowatt investment $y_2$ as the dependent variable for fitting, in order to obtain more accurate results.
Because the offshore wind farms collected above are located around the world, the investment data must be converted using exchange rates. Therefore, this paper collected the exchange rates of the different countries and converted all investment figures to US dollars.
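The construction of $y_2$ from $y_1$ can be sketched as follows. The sketch assumes that the project scale used for normalization is the installed capacity in megawatts and that the exchange rate is quoted as US dollars per unit of local currency; the function name and the numbers are hypothetical.

```python
def unit_kw_investment(total_investment, rate_to_usd, installed_capacity_mw):
    """Return the investment per kilowatt in US dollars.

    total_investment      -- project investment in local currency (y_1)
    rate_to_usd           -- US dollars per unit of local currency (assumed quoting)
    installed_capacity_mw -- installed capacity in megawatts (assumed scale measure)
    """
    if installed_capacity_mw <= 0:
        raise ValueError("installed capacity must be positive")
    investment_usd = total_investment * rate_to_usd
    return investment_usd / (installed_capacity_mw * 1000.0)


# Hypothetical project: 3.0 billion in local currency, 0.15 USD per unit, 200 MW.
y2 = unit_kw_investment(3.0e9, 0.15, 200.0)
print(round(y2, 2))  # 2250.0
```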
The dependent variables are listed in detail in Table 4. In the first stage of forecasting, we identify the key factors of offshore wind power investment. This paper selects twelve variables that may affect the investment in offshore wind power projects, but such a large number of variable dimensions may cause the model to over-fit. Therefore, this study uses the random forest algorithm to retain the factors with strong influence and to eliminate the variables with weak influence.
First, the out-of-bag data $OOB_n$ corresponding to each regression tree are obtained through Equation (1), and the mean square error of $OOB_n$ is calculated. The same process is then repeated for each of the global offshore wind power influencing factors. From all of these mean square errors, the forecasting mean square error matrix is formed, as represented by Equation (4). The standard error is then calculated from this matrix to obtain the importance score of each influencing factor.
Based on related research, we set 95% as the variable selection criterion. The cumulative influence degree of the global offshore wind power factors is computed from the importance scores obtained above. The factors are ranked by importance and retained, in order, until their cumulative influence degree exceeds 95%; the remaining factors are eliminated. The retained factors are used as the independent (input) variables of the forecasting model.
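The 95% screening rule can be written as a short Python sketch. The importance scores in the example are hypothetical and the function name is an assumption; the rule itself follows the description above: rank the factors by importance and keep them in order until the cumulative share reaches the threshold.

```python
def select_by_cumulative_importance(importance, threshold=0.95):
    """Keep the top-ranked factors until their cumulative share reaches threshold.

    importance -- dict mapping factor name to a non-negative importance score
    Returns (retained factor names in rank order, cumulative share reached).
    """
    total = sum(importance.values())
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    retained, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= threshold:
            break  # threshold already met, eliminate the remaining factors
        retained.append(name)
        cumulative += score / total
    return retained, cumulative


# Hypothetical importance scores (illustration only).
example = {"a": 0.50, "b": 0.30, "c": 0.17, "d": 0.03}
retained, share = select_by_cumulative_importance(example)
print(retained)
```

In this example the first two factors reach only 80% of the total importance, so the third is also retained, and the fourth is eliminated.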

Research on Global Offshore Wind Power Investment
This paper studies global offshore wind power investment on the basis of the key influencing factors selected by the random forest algorithm. The dependent variable is $y_2$, and the independent variables are determined by the results of the random forest screening.
The research method selected in this paper for global offshore wind power investment is the elastic net regression model. The key influencing factors selected by the random forest are used as the input variables of the elastic net, and $y_2$ is used as the output variable. This model can effectively analyze the degree of influence of multiple factors on global offshore wind power investment and thereby forecast the investment in offshore wind power projects accurately. In addition, the elastic net model yields the effect of each key influencing factor, which provides a basis for weighing these factors in future global offshore wind power investment.
Error analysis is then used to verify the accuracy of the combined model. This article selects four indicators for the error analysis of the results: MAE (mean absolute error), MAPE (mean absolute percentage error), RMSE (root mean squared error), and SSE (sum of squared errors).
MAE is the average of the absolute errors and directly reflects the actual size of the error. The calculation equation is as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|$$
MAPE is the average of the absolute values of the relative percentage errors and can be used to measure the quality of the model results. The calculation equation is as follows:
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right|$$
RMSE is the square root of the average of the squared differences between the predicted values and the actual values and can be used to measure the average size of the error. This index is strongly affected by outliers. The calculation equation is as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}$$
SSE reflects the dispersion of the sample observations and is also called the within-group sum of squares or the sum of squares of the error terms. The calculation equation is as follows:
$$\mathrm{SSE} = \sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2$$
where $n$ is the number of samples, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value of the $i$-th sample.
The specific construction steps of the two-stage RF-EN model are shown in Figure 2.
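The four error indicators can be computed with the following minimal Python sketch, which uses their standard definitions. The function name and the example values are hypothetical; MAPE is returned in percent and assumes that no actual value is zero.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, MAPE (in percent), RMSE and SSE for paired observations."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual value, so actual values must be non-zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    sse = sum(e * e for e in errors)
    rmse = math.sqrt(sse / n)
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "SSE": sse}


# Hypothetical actual and forecast unit-kilowatt investments.
metrics = error_metrics([100.0, 200.0], [110.0, 180.0])
print(metrics)
```

Because RMSE squares each error before averaging, the single larger error in this example (20) pulls RMSE above MAE, which illustrates its sensitivity to outliers.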

Key Factors Selecting Results
In this paper, random forest is used to select the key factors affecting offshore wind power investment, and the dimension of the input is reduced according to the influence degree of each factor.
The key factors are retained and the unimportant factors are eliminated. This paper sets 95% as the critical value for screening key influencing factors, and the cumulative influence degree is calculated from the influence degrees of the 12 variables above. The results are shown in Table 5 and Figure 3. In Figure 3, the left vertical axis represents the influence degree of each factor, and the right vertical axis represents the cumulative influence degree. According to the random forest screening, $x_{10}$ has the greatest influence on offshore wind power project investment, with a score of 0.258283. It is followed by $x_{11}$, $x_5$ and $x_1$. The variables with the least influence include $x_7$, $x_8$ and $x_{12}$. Figure 3 shows that the boundary of the retained factors lies between $x_4$ and $x_3$. The final screening results of the influencing factors retained as input variables are shown in Table 4. The cumulative influence of the factors up to $x_4$ is 0.933846, which is below the 95% criterion, and it reaches 0.962316 once $x_3$ is included. Therefore, the factors up to and including $x_3$ should be retained to meet the 95% requirement. The variables to be retained are $x_{10}$, $x_{11}$, $x_5$, $x_1$, $x_9$, $x_2$, $x_4$ and $x_3$, which together explain more than 95% of the investment; the corresponding influencing factors are water depth, distance from shore, sweeping area, installed capacity, number of blades, average annual power generation, single unit capacity and number of units. The variables to be eliminated are $x_6$, $x_7$, $x_8$ and $x_{12}$.
The variables to be eliminated are wind wheel diameter, submarine cable length, hub height, and rated wind speed. Their influence degrees on global offshore wind power investment are 0.023971, 0.007929, 0.005784, and 0, respectively. In the overall selection of key factors, these four variables have little influence and low reference value for the forecast of global offshore wind power investment, so they are eliminated.

Regression Analysis Results
In this article, the elastic net is used for the regression analysis of global offshore wind power investment against the key influencing factors. The variables $x_{10}$, $x_{11}$, $x_5$, $x_1$, $x_9$, $x_2$, $x_3$ and $x_4$ are used as the independent variables and $y_2$ as the dependent variable of the elastic net regression model. The elastic net is trained, and the influence coefficients of the different factors on global offshore wind power investment are obtained. The elastic net is a regression algorithm that combines ridge regression and Lasso regression. Determining the value of $\alpha$ requires a comparative analysis using optimization algorithms. Therefore, this paper uses the coordinate descent method, the alternating direction method of multipliers (ADMM) and the Sklearn algorithm to determine the $\alpha$ coefficient of the elastic net. The coefficient values of the global offshore wind power investment influencing factors under the three optimization algorithms are shown in Table 6. Table 6 shows that the influence coefficient of the independent variable $x_3$ is negative under the coordinate descent and Sklearn algorithms, while all other coefficients are positive. Under the ADMM optimization algorithm, all independent variables have a positive impact on global offshore wind power investment.
$x_3$ is the number of units. Under normal circumstances, the larger the number of units, the greater the investment in the overall project construction. The negative coefficients obtained under the coordinate descent and Sklearn algorithms are therefore abnormal, and the reliability of the elastic net output under these two optimization algorithms is not high. To verify the reliability further, the regression results of the three optimization algorithms need to be compared.
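For readers unfamiliar with ADMM, the sketch below shows a textbook ADMM splitting applied to an elastic net objective, using NumPy. It is not the authors' implementation. The objective, $(1/N)\|y - X\beta\|_2^2 + \lambda(\alpha\|\beta\|_{L_1} + (1-\alpha)\|\beta\|_{L_2}^2)$, the penalty parameter `rho`, the fixed iteration count and the function name are all assumptions. The smooth part (squared error plus ridge term) is handled by a linear solve, and the L1 part by soft thresholding.

```python
import numpy as np


def elastic_net_admm(X, y, lam, alpha, rho=1.0, n_iter=2000):
    """Textbook ADMM for an (assumed) elastic net objective, without intercept.

    The objective is scaled by N/2 so that the smooth part becomes
    0.5*||X b - y||^2 + 0.5*l2*||b||^2 and the L1 part becomes l1*||z||_1,
    subject to the consensus constraint b = z.
    """
    n, k = X.shape
    l1 = n * lam * alpha / 2.0       # weight on ||z||_1 after scaling
    l2 = n * lam * (1.0 - alpha)     # weight on 0.5*||b||^2 after scaling
    A = X.T @ X + (l2 + rho) * np.eye(k)
    Xty = X.T @ y
    z = np.zeros(k)
    u = np.zeros(k)                  # scaled dual variable
    for _ in range(n_iter):
        # b-update: ridge-like linear system
        b = np.linalg.solve(A, Xty + rho * (z - u))
        # z-update: soft thresholding (proximal step of the L1 norm)
        v = b + u
        z = np.sign(v) * np.maximum(np.abs(v) - l1 / rho, 0.0)
        # dual update
        u = u + b - z
    return z
```

Returning the thresholded variable `z` rather than `b` gives coefficients that are exactly zero when the L1 penalty eliminates a factor.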
The initial values of the variables are substituted into the model using the coefficient values of each influencing factor obtained above. The regression value of the global offshore wind power investment is calculated, giving the offshore wind power investment value under the three optimization algorithms, as shown in Figure 4.
Figure 4 shows that the trends of the elastic net regression results obtained by the three optimization algorithms are similar. All three results show a general trend of first rising and then falling, and the highest values are all near the 15th sample. Among them, the regression result under the ADMM optimization algorithm has a higher peak value. The reason may be that, in the output of the elastic net optimized by ADMM, the degree of influence of each variable is more in line with the actual situation and there is no outlier, so the overall output investment is closer to the actual investment level. To assess the regression effect in detail, the actual investment values of the offshore wind power projects must be compared with the regression values through error analysis.

Error Analysis
This article uses multiple linear regression, logistic regression, lasso regression, support vector machine (SVM), and algorithms without elastic net to forecast global offshore wind power investment. They are used as control groups to compare and analyze the results of elastic net regression.
First of all, in the multiple linear regression model, this article still uses x 10 , x 11 , x 5 , x 1 , x 9 , x 2 , x 3 , x 4 as the independent variable and y 2 as the dependent variable. A regression analysis is performed to obtain the coefficient value of the independent variable including the intercept term.
When using the Logistic model for regression analysis, this paper divides the dependent variable y 2 into two categories according to whether the total investment value exceeds 20 million US dollars. The excess part is set to 2, otherwise it is set to 1. This is because the logistic model is suitable for regression analysis of categorical variables. Therefore, a regression analysis is carried out on the probability of whether the total investment exceeds 20 million US dollars.
In the Lasso regression model, we also use three optimization models for Lasso analysis in order to make comparative analysis more meaningful.
When using the SVM algorithm for regression forecast, we use the first 20 input variables and output variables as training variables, and the others as predictor variables. The SVM algorithm cannot directly give the degree of influence of the respective variables, so we directly judge the effect of regression forecasting through the error.
In addition, to verify the effectiveness of the elastic net in the investment level forecast, this paper uses the variable influences output directly by the random forest to compute the global offshore wind power investment level without elastic net analysis, and compares the result with the actual situation.
The degree of influence of the independent variables under each model is shown in Table 7. Except for the negative coefficients in the logistic regression, the results of the other models conform to the meaning of the independent variables. Using these coefficients, the global offshore wind power investment value under each regression model is calculated, and the regression results are compared with the actual investment to calculate the error. The comparison between the 10 forecasting methods and the actual value is shown in Figure 5. The error results of each regression model are shown in Table 8 and Figure 6. The error analysis shows that the elastic net under the ADMM optimization algorithm has the best regression effect: its value is the smallest on all four error evaluation indicators. The output obtained by using only the random forest algorithm, without the elastic net, has a very large error, which verifies the necessity of the elastic net for regression analysis. Among the elastic net results, those obtained with an optimization algorithm have smaller errors. Moreover, the overall regression effect under the three optimization algorithms of the elastic net is better than that of logistic regression and SVM regression forecasting. On the whole, the elastic net has a good regression effect on global offshore wind power investment.
The results of the error analysis verify the effectiveness of the elastic net and of the combined model for global offshore wind power investment forecasts. They also provide new ideas and methods for the regression forecasting of global offshore wind power investment, which benefits future investment planning for global offshore wind power projects.
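The error comparison described in this section can be sketched as follows. The four error evaluation indicators are not restated here, so the sketch assumes four commonly used ones (MAE, MSE, RMSE and MAPE). The function names, model names and numbers are hypothetical and are not the paper's data.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when an actual value is zero.
    mape = 100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


def best_on_all(results):
    """Return the model with the smallest value on every indicator, or None."""
    for name, scores in results.items():
        if all(scores[k] <= other[k] for other in results.values() for k in scores):
            return name
    return None


# Hypothetical actual values and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
forecasts = {
    "model_a": [110.0, 190.0, 380.0],
    "model_b": [130.0, 160.0, 300.0],
}
results = {name: error_metrics(actual, pred) for name, pred in forecasts.items()}
print(best_on_all(results))
```

Requiring the smallest value on every indicator mirrors the criterion used in the text. If no model dominates on all indicators, `best_on_all` returns `None` rather than choosing one arbitrarily.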

Discussion
This section discusses the results above. The ranking of the factors of global offshore wind power investment selected by the random forest is shown in Figure 3. x_10 (water depth) has the greatest correlation with offshore wind power project investment, 0.258283, while x_12 (rated wind speed) has no correlation with it. Figure 3 therefore shows that water depth has the greatest influence on global offshore wind power investment, followed by distance from shore, sweeping area, installed capacity, number of blades, average annual power generation, single unit capacity and number of units, whereas the rated wind speed has no relation to the investment. A possible reason is that water depth is more closely related to the construction scale of the project, which influences both site selection and construction cost. Before investing in a global offshore wind power project, investors also examine the water depth and other construction factors to determine their investment intention. In future global offshore wind power project construction, these factors should therefore be given priority when determining a reasonable construction scale and plan, so as to ensure effective subsequent financing. Although wind speed is a factor that influences the construction of offshore wind power projects, its low correlation with investment may be due to the large range of wind speed changes over the entire construction period, which is strongly influenced by uncertain factors. From the perspective of the entire investment cycle, the wind speed therefore cannot be taken into consideration.
Therefore, this study also provides an effective analysis of the factors influencing the early-stage financing of project construction, and offers ideas for making offshore wind power construction plans more attractive to investors.
In the analysis of the regression results, this paper selects the elastic net under the ADMM algorithm as the optimal regression result. Table 6 shows that each of the variables has a positive influence on global offshore wind power investment. Because offshore wind power is still developing, few data on project investment are currently available. The elastic net's regression forecasting of global offshore wind power investment can therefore address the low accuracy of ordinary forecasting methods on such data. In addition, compared with the direct use of random forests to forecast global offshore wind power investment, the elastic net improves the accuracy of the regression forecasts, which supports more reasonable cost planning.
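How ADMM can be applied to an elastic net is illustrated by the minimal sketch below. It assumes the objective (1/2)||y - X b||^2 + lam1 ||b||_1 + (lam2/2) ||b||^2 with the splitting b = z. The penalty parameters `lam1`, `lam2`, `rho`, the iteration count and all function names are hypothetical, and the paper's exact formulation and settings are not restated in this section.

```python
def soft_threshold(v, k):
    """Soft-thresholding operator, the proximal map of k * |v|."""
    if v > k:
        return v - k
    if v < -k:
        return v + k
    return 0.0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def elastic_net_admm(X, y, lam1, lam2, rho=1.0, iters=500):
    """Assumed ADMM scheme for the elastic net; returns the sparse iterate z."""
    n, p = len(X), len(X[0])
    # Precompute X^T X + (lam2 + rho) I and X^T y.
    A = [[sum(X[i][a] * X[i][b] for i in range(n))
          + ((lam2 + rho) if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    z = [0.0] * p
    u = [0.0] * p
    for _ in range(iters):
        # b-update: ridge-type linear system (smooth part of the objective).
        rhs = [Xty[j] + rho * (z[j] - u[j]) for j in range(p)]
        beta = solve(A, rhs)
        # z-update: soft-thresholding handles the L1 penalty.
        z = [soft_threshold(beta[j] + u[j], lam1 / rho) for j in range(p)]
        # u-update: scaled dual variable accumulates the constraint residual.
        u = [u[j] + beta[j] - z[j] for j in range(p)]
    return z


# Sanity check on a hypothetical identity design, where the assumed objective
# has the closed form soft_threshold(y_j, lam1) / (1 + lam2) per coefficient.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
beta_hat = elastic_net_admm(X, y, lam1=1.0, lam2=1.0)
print(beta_hat)
```

With these inputs the closed form gives coefficients of 1 and 0, so the second variable is removed by the L1 penalty while the first is shrunk by the L2 penalty. This combined behaviour is what makes the elastic net suitable for small samples with correlated factors.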
The analysis of the results shows that the RF-EN model set up in this paper effectively improves the regression forecasting accuracy and reduces the error. In terms of the overall forecasting effect, investment forecasting also performs better when it uses the factors screened out by the random forest, so the selection of dimensional factors is necessary. From the perspective of overall project construction, once the level and scale of the above factors have been analyzed in the early stage of construction, the elastic net model can forecast the amount of investment required for the project and thereby help attract investment accurately.
The results of the error analysis show that, relative to the regression models of the control group, the elastic net regression forecasts under the three optimization algorithms have significantly lower errors. Among the three optimization algorithms, the error of the elastic net regression forecast under ADMM is significantly lower than that of the other two. Compared with the traditional forecasting models, the accuracy has been significantly improved, but it has not yet reached the best forecast error range of 2%. A possible reason is that few data on offshore wind power projects are currently available, and the small number of samples leads to a large error. In this paper, the regression forecasting effect is improved through the construction of the RF-EN model under small-sample conditions. The results can be used for planning and forecasting the investment scale in the early period of construction.
The research and the comparative analysis of the results show that the proposed regression model can improve the investment forecast accuracy. Factors such as water depth, offshore distance and sweeping area should be considered in the early stage of global offshore wind power project construction, and future projects should give priority to the possible impact of the key factors selected by the random forest algorithm. In addition, the elastic net can be used to analyze the specific trends of these factors, so that the construction and investment plans can be adjusted accordingly. With this model, a more reasonable plan for the construction and cost of global offshore wind power, and a reasonable scope for investment and financing, can be set.

Conclusions
Forecasting the investment of offshore wind power projects by regression is a complex process. This paper conducted a preliminary and systematic analysis of the factors that influence global offshore wind power investment. First, the Delphi method and the random forest algorithm were used to screen the key factors affecting global offshore wind power investment, avoiding the problem of over-fitting. Then the elastic net algorithm was used to establish the RF-EN model and improve the accuracy of global offshore wind power investment regression. The main conclusions are as follows.
(1) There are many factors that are related to the investment of offshore wind power projects. Through the screening of the random forest algorithm, the most important factor among them is the depth of sea water, followed by the offshore distance of the construction of offshore substations, the sweeping area of offshore wind farms, and the installed capacity.
(2) For the problem of the small samples of global offshore wind power investment, this paper adopts the regression forecast method of elastic net improved by optimization algorithm. Compared with other traditional regression methods, the application of this method can effectively improve the accuracy of regression forecasting.
(3) In this paper, we have established the RF-EN model to forecast the sub-sequence of unit kilowatt investment, which significantly improves the forecasting accuracy. The study has also proposed a more accurate method for forecasting global offshore wind power investment, which can provide some planning suggestions for the future construction of global offshore wind power projects.
At present, the small sample size of global offshore wind power projects is the key issue affecting forecast accuracy. Moreover, data collection is difficult because global offshore wind power farms are relatively concentrated in their distribution. Future research on forecasting global offshore wind power investment should therefore collect more complete data, and should also focus on improving model accuracy and on methods for dealing with data stability, so that the accuracy of offshore wind power project investment forecasts can be further improved.