1. Introduction
Owing to their mobility, low manufacturing cost, and wide range of working water depths, jack-up platforms are widely used in marine oil and gas exploration and development, offshore construction, island construction, and other fields. As marine resources are increasingly developed, jack-up platforms are continually being advanced for use in deeper waters. The long-term action of wind, wave, and current loads, the complex platform structure, and extreme sea conditions pose a significant challenge to platform reliability. It is therefore of paramount importance to predict and evaluate the operational status and safety of jack-up platforms. A jack-up platform generally consists of a main hull and several supporting pile legs. The pile legs must bear not only the huge weight of the main hull but also wind, wave, current, and other environmental loads, so their health condition strongly influences the safe operation of the whole platform. It is therefore necessary to analyze the dynamic response of the pile leg structure under environmental loads, that is, its force condition. Traditional dynamic response analysis relies mainly on numerical simulation, which often consumes considerable time and computational resources. The development of artificial intelligence has provided new ideas for assessing the operational status of platform pile legs. This paper uses a powerful ensemble learning method, random forest, to predict the dynamic response of jack-up platform pile legs under marine environmental loads.
Numerous scholars have simulated the dynamic response of jack-up pile leg structures numerically using the equivalent beam theory and have demonstrated its accuracy. Bienen et al. [1,2] developed the SOS_3D numerical computation program, which uses the beam–column formulation to establish the stiffness matrix of the jack-up platform. The program can conduct three-dimensional dynamic simulations that incorporate soil–structure interaction and environmental loads, and it can also investigate the overturning response of the platform based on displacement-hardening plasticity theory. Their study demonstrated that the equivalent beam theory yields reliable results for both dynamic response analysis and overturning analysis of jack-up platforms. Cassidy et al. [3] introduced the constrained NewWave method of stochastic wave theory to account for the spectral content of wave loading within the equivalent beam theory. Nonlinear factors, including structural behavior, pile–soil interaction, and wave loading, were considered together to investigate the dynamic response of the platform in extreme sea conditions. Cassidy et al. [4] developed an equivalent beam finite element model for jack-up platforms that fully expresses the load–displacement response of the pile boots in terms of combined forces, allowing full 3D dynamic response analysis. The effects of spatial variations and directional wave diffusion on the response of jack-up rigs were also investigated. Pisano et al. [5] developed a 3D continuum model based on equivalent beam theory to capture nonlinear soil–structure interactions in a jack-up installation. Experiments and numerical simulations showed that 3D continuum modelling is a reliable method for analyzing the stationarity and overall operational performance of jack-up platforms. Wang, Y. et al. [6] combined macro-element models with equivalent beam theory and applied the combination to the structural response analysis of a jack-up platform. The study verified the advantage of the macro-element base model in accurately assessing the capacity of the jack-up under the external loads of ocean wind, waves, and currents. He et al. [7] developed an equivalent model of the whole jack-up platform, which simplified the main hull and pile legs into beam elements that represent the geometric and overall stiffness characteristics of the actual structure. All of the above studies model the platform with the equivalent beam theory and have been shown, through experiments and numerical simulations, to be accurate and reliable. However, the numerical solution requires relatively large computational resources. For example, with the high-fidelity model in this study, computing the dynamic response of the pile leg of a jack-up platform under a single environmental load takes 3 to 4 min on a 28-core CPU. This time lag makes it difficult for the numerical solution to meet the real-time state detection requirements of the platform.
Structural health monitoring (SHM), in conjunction with operational modal analysis (OMA), has been utilized in numerous studies [8,9] to analyze the operational status of structures and to detect and locate structural damage. Nonetheless, these monitoring techniques are heavily reliant on sensors. The intricate structure of the pile legs of jack-up platforms renders sensor deployment infeasible at certain critical points, and maintaining sensor availability over an extended period in the complex marine environment is challenging. The dependence on sensors limits the application of these methods.
The development of artificial intelligence technology offers new ways to overcome the limitations of SHM and of numerical computation. In this study, a mapping between the input operating conditions and the output dynamic response is established with the random forest machine learning algorithm. The trained prediction model returns results within milliseconds, which helps to ensure the reliability of jack-up platform operations, improve economic efficiency, and safeguard the lives of the personnel working on the platform.
A prediction model that combines a jack-up platform simulation database with intelligent optimization algorithms was developed to reflect the platform force field quickly. First, a high-fidelity simulation model of the jack-up platform was established, and the dynamic response of the platform under different combined wind, wave, and current loads was computed to build a database. Next, Bayesian optimization and the random forest algorithm were applied to train a prediction model on this database to predict the response of the jack-up platform under specific loads. Lastly, the impact of sample size and train/test ratio on the performance of the trained model was assessed and compared. The prediction model predicts the dynamic response of the pile leg with high accuracy within a few milliseconds and remains accurate at extreme values. The framework of the prediction model is shown in Figure 1.
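The workflow above (database, hyperparameter tuning, random forest training, evaluation on held-out data) can be sketched as follows. This is only an illustrative outline under stated assumptions: the data are synthetic, the six input columns and three output columns are hypothetical stand-ins for the simulation database, and scikit-learn's `RandomizedSearchCV` is swapped in for Bayesian optimization purely to keep the sketch dependency-free.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in for the simulation database: six load features per
# condition and three response values per condition (synthetic, not real data).
X = rng.uniform(0.0, 1.0, size=(300, 6))
Y = np.column_stack([
    3.0 * X[:, 0] + np.sin(2.0 * np.pi * X[:, 1]),
    X[:, 2] * X[:, 3],
    X[:, 4] - 0.5 * X[:, 5],
])

# 80% of the conditions for training, 20% held out for testing.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)

# Assumption: random search over a small grid stands in for Bayesian
# optimization of the random forest hyperparameters.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={
        "n_estimators": [20, 40, 60],
        "max_depth": [4, 8, None],
        "min_samples_leaf": [1, 2, 4],
    },
    n_iter=5,
    cv=3,
    random_state=0,
)
search.fit(X_train, Y_train)

# Evaluate the tuned model on conditions it has never seen.
model = search.best_estimator_
Y_pred = model.predict(X_test)
test_r2 = model.score(X_test, Y_test)
print("best parameters:", search.best_params_)
print("test R2:", round(test_r2, 3))
```

Once trained, a call to `model.predict` replaces a full numerical simulation for a new load condition, which is where the speed advantage of the approach comes from.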
The remainder of this paper is organized as follows: Section 2 introduces the algorithms used in this article, including principal component analysis, the random forest algorithm, and the Bayesian optimization algorithm. In Section 3, the database is constructed, and the parameters of the algorithm are adjusted to train the prediction model. Section 4 evaluates the performance of the prediction model and compares the effects of the number of samples in the training set and the train/test ratio on its accuracy. Section 5 summarizes the whole study.
  4. Model Evaluation
  4.1. Performance of the Prediction Model
In engineering problems, the relative error δ provides a measure of the confidence level associated with the data. Because of the mechanics of the pile leg structure of the jack-up platform, some of the response values are zero, and zero must be replaced with a very small value before the relative error can be calculated. After this replacement, however, the relative error at these zero-valued points can easily reach 100%, so a weighting factor is introduced into the calculation. The weight is obtained by min-max normalization of the absolute value of each condition in the sample data:

$$w = \frac{|y| - |y|_{\min}}{|y|_{\max} - |y|_{\min}}$$

where $|y|$ is the absolute value of a response value, and the maximum and minimum are taken over each column of data in the response variable, each column being physically a force or moment in one direction. The weighted relative error is then as follows:

$$\delta_w = w \cdot \frac{\Delta}{|y|} \times 100\%$$

where $w$ is the weight and $\Delta$ is the absolute error.
In pile leg structures, the larger the force at a point, the greater the likelihood of damage, so the most heavily loaded points are the most dangerous and the most important to predict accurately. Measuring the accuracy of the prediction model with the weighted relative error therefore not only eliminates the effect of zero values in the response variable but is also physically meaningful, because it places more emphasis on the dangerous points.
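A minimal sketch of such a weighted relative error is given below. It assumes that the weight is the min-max normalized absolute true value within each response column and that the weighted error is the weight multiplied by the ordinary relative error; the function name, the `eps` replacement value, and the sample numbers are illustrative and not taken from the study.

```python
import numpy as np

def weighted_relative_error(y_true, y_pred, eps=1e-12):
    """Return the weighted relative error (in percent) for each entry.

    Rows are data points and columns are response components. The form
    weight * |error| / |true| is an assumption of this sketch.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_true = np.abs(y_true)

    # Min-max normalize |y_true| column by column to obtain the weights.
    col_min = abs_true.min(axis=0)
    col_max = abs_true.max(axis=0)
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    weight = (abs_true - col_min) / span

    # Replace exact zeros by a very small value before dividing.
    safe_true = np.where(abs_true == 0.0, eps, abs_true)
    relative = np.abs(y_pred - y_true) / safe_true
    return 100.0 * weight * relative

# One response column with a zero-valued point and a heavily loaded point.
y_true = np.array([[0.0], [50.0], [100.0]])
y_pred = np.array([[0.2], [51.0], [98.0]])
errors = weighted_relative_error(y_true, y_pred)
print(errors.ravel())
```

In this example the zero-valued point receives zero weight, so its very large raw relative error no longer distorts the statistics, while the most heavily loaded point keeps its full relative error.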
The trained model fits well, with a coefficient of determination R² of 0.935 on the test set. This indicates that the model predicts the test set accurately and captures the patterns and trends of the data well. The mean weighted relative error over all data points is 0.413%, well below 1%, and the maximum weighted relative error is 5.277%, well below 10%, which satisfies the accuracy requirement.
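For reference, the coefficient of determination can be computed directly from its definition, one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up numbers only to show the arithmetic; the function name is illustrative.

```python
import numpy as np

def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    return 1.0 - ss_res / ss_tot

# Illustrative values, not data from the study.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
r2 = r2_score_manual(y_true, y_pred)
print(round(r2, 3))
```

A value of 1 corresponds to a perfect fit, and a value of 0 corresponds to a model that does no better than always predicting the mean of the true values.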
To further analyze the capability of the prediction model, including its accuracy at extreme values, a scatter plot was generated for condition 901, selected from the test set. In condition 901 the wind and wave directions coincide, which makes the platform more dangerous. The scatter plot is shown in Figure 8.
The horizontal axis in Figure 8 is the index of the data points; the red circles represent the calculated true values, and the blue asterisks represent the predicted values. Almost all the points coincide. The model also predicts the force state accurately at extreme values, which indicates that the predicted data are highly usable. This shows that the model can predict the force condition of the pile leg structure under a given working condition, which is the problem this paper sets out to solve.
  4.2. Importance of Features
Feature importance evaluation is built into the random forest algorithm: it averages the contribution of each feature over all trees and ranks the features by comparison. The contribution can be computed from the mean decrease in impurity (MDI) or the mean decrease in accuracy (MDA) [22]. The scikit-learn machine learning library used in this paper compares feature importance with the MDI method.
The features of the prediction model were evaluated and ranked by importance, as shown in Figure 9.
As shown in Figure 9, the most influential feature for the dynamic response of the jack-up platform is the height of the main hull, with an importance score of 0.432, which is much higher than those of the remaining five features. The main hull is very heavy, and its weight is carried by the rods located below it, so the height of the main hull affects the stresses on the entire pile leg structure. Among the remaining five features, the direction of the wind load has the highest importance score, 0.295, and affects the prediction results much more than the other four. The main hull of the target jack-up platform has a huge wind-exposed area, and the four pile leg structures are complex. Among all the environmental loads considered, the wind load exerts a much larger force on the whole structure than the other loads. The aspect ratio of the main hull is as high as 1.8, which makes the dynamic response of the platform very sensitive to the direction of the wind load. Wind direction therefore makes the second largest contribution to the model prediction, which is consistent with the structural characteristics of the jack-up platform.
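As a hedged illustration of how an MDI ranking is obtained in scikit-learn, the sketch below reads the impurity-based `feature_importances_` attribute of a fitted `RandomForestRegressor`. The data are synthetic, the target is deliberately constructed so that the first two features dominate, and all feature names other than hull height and wind direction are hypothetical; the scores it prints do not reproduce the values in Figure 9.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature names; only hull height and wind direction are
# named in the text, the rest are placeholders for illustration.
feature_names = [
    "hull_height", "wind_direction", "wind_speed",
    "wave_height", "wave_direction", "current_speed",
]

rng = np.random.default_rng(1)
X = rng.uniform(size=(400, 6))
# Synthetic response built so that the first two features matter most.
y = 4.0 * X[:, 0] + 2.0 * X[:, 1] + 0.2 * X[:, 2:].sum(axis=1)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based (MDI) importances, normalized by scikit-learn to sum to 1.
importances = forest.feature_importances_
ranking = sorted(
    zip(feature_names, importances), key=lambda item: item[1], reverse=True
)
for name, score in ranking:
    print(f"{name:15s} {score:.3f}")
```

A permutation-based measure in the spirit of MDA is available separately in scikit-learn as `sklearn.inspection.permutation_importance`, which can serve as a cross-check on an MDI ranking.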
  4.3. Effect of Sample Size of Dataset on Precision
The impact of sample size on prediction accuracy was investigated using datasets of 1680, 1512, and 1344 randomly selected sample points. In each case, 80% of the samples were allocated to the training set and 20% to the testing set. Scatter plots of the prediction results for the three scenarios, with the true values on the horizontal axis and the predicted values on the vertical axis, are presented in Figure 10, Figure 11 and Figure 12.
As shown in Figure 10, Figure 11 and Figure 12, when the model is trained with 1680 samples, the points are all closely distributed around the line y = x and the weighted relative errors of all points are within 10%, the best result among the three models. When the sample size is decreased to 1512, some points deviate further from the line y = x, and a minority of points exhibit weighted relative errors exceeding 10%. When the sample size is further reduced to 1344, the points become more dispersed, and more of them exhibit weighted relative errors exceeding 10%. Training the prediction model with 1680 samples therefore yields the most accurate predictions, and the predictions deteriorate as the number of sample points decreases.
To further compare the effect of the number of sample points on the accuracy of the prediction model, the average weighted relative error over all conditions was calculated for each of the three sample sizes, as shown in Table 3.
As can be seen from Table 3, the overall error rises as the number of samples decreases, and the prediction performance of the model drops sharply when the quantity of data is reduced to 1344. This may be attributed to the uniform distribution of the environmental loads considered in the simulation calculations of the forces on the jack-up platform pile leg model. When sample points are randomly selected, this uniform distribution is destroyed, leaving a very sparse distribution of sample points at some environmental loads. Such a non-uniform distribution can make it difficult for the prediction model to capture the trends and details of the data accurately, which reduces its overall fitting ability. When the sample size is decreased to 1344, the unevenness is further exacerbated, resulting in a significant decline in the model's fitting ability.
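The sparsity argument can be illustrated with a small numerical experiment. The sketch below assumes, purely for illustration, that the 1680 conditions are spread evenly over 24 hypothetical load bins (70 per bin) and then counts how many conditions remain in each bin after random subsampling; the bin count and the layout are assumptions, not properties of the actual database.

```python
import numpy as np

rng = np.random.default_rng(42)

n_total = 1680
n_bins = 24  # hypothetical number of load bins, chosen so that 1680 / 24 = 70

# Assign the conditions to bins evenly: exactly 70 conditions per bin.
bin_of_sample = np.arange(n_total) % n_bins

def bin_counts(n_keep):
    """Randomly keep n_keep conditions and count the survivors per bin."""
    keep = rng.choice(n_total, size=n_keep, replace=False)
    return np.bincount(bin_of_sample[keep], minlength=n_bins)

for n_keep in (1680, 1512, 1344):
    counts = bin_counts(n_keep)
    print(n_keep, "min per bin:", counts.min(), "max per bin:", counts.max())
```

With all 1680 conditions every bin holds exactly 70 points; after random subsampling the per-bin counts generally differ from bin to bin, which is the loss of uniformity described above.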
  4.4. Effect of Training/Testing Ratio on Precision
The training-to-testing ratios for the random forest algorithm were set to 60:40, 70:30, 80:20, and 90:10, with all other parameters unchanged, and the accuracies of the resulting models were compared. For each condition and each training-to-testing ratio, the maximum weighted relative error and the average weighted relative error were calculated, and a heat map was plotted, as shown in Figure 13.
To compare the effect of the train/test ratio on the prediction model more fully, the maximum values of the maximum weighted relative error and of the mean weighted relative error were compared, as shown in Table 4.
As can be seen from Table 4, the training/testing ratio has a significant effect on the accuracy of the prediction model. As the train/test ratio increases, the error of the prediction model decreases and the fit improves. When the training ratio is 60%, the maximum weighted relative error over all samples is 11.375%, which exceeds 10%, and the model predicts poorly. When the training ratio was increased to 70%, the average weighted error was reduced by 0.021% and the maximum weighted relative error fell within the acceptable range. When the training ratio was further increased to 80% and then 90%, the average weighted error was reduced by 0.019% and 0.104%, respectively. The accuracy of the prediction model thus continues to improve as the train/test ratio increases, and a training set of 70% or more yields a more accurate model.
When the proportion of the training set was increased from 70% to 80%, the maximum weighted relative error decreased by 1.932%, and the model's ability to fit extreme values was enhanced. When the proportion of training data was further increased to 90%, the model's ability to fit both global and extreme points improved. However, excessive training data may lead to overfitting, thereby reducing the generalization ability of the prediction model. Furthermore, a testing set of only 10% is insufficient for evaluating the model's global fitting ability, so the resulting evaluation has low confidence. Therefore, considering both fitting ability and generalization ability, it is more appropriate to choose 80% of the data as the training set for the target prediction model.
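The comparison of training-to-testing ratios can be sketched as a simple loop. The example below is illustrative only: it uses synthetic data, a fixed random forest configuration, and plain absolute error in place of the weighted relative error, so its numbers are not comparable with Table 4.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the database (six inputs, one response).
X = rng.uniform(size=(400, 6))
y = 4.0 * X[:, 0] + np.sin(2.0 * np.pi * X[:, 1]) + X[:, 2] * X[:, 3]

results = {}
for train_fraction in (0.6, 0.7, 0.8, 0.9):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_fraction, random_state=0
    )
    # All other settings stay fixed; only the split ratio changes.
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X_tr, y_tr)
    abs_err = np.abs(model.predict(X_te) - y_te)
    results[train_fraction] = (len(y_te), abs_err.mean(), abs_err.max())

for fraction, (n_test, mean_err, max_err) in results.items():
    print(f"train={fraction:.0%} n_test={n_test} "
          f"mean_abs_err={mean_err:.3f} max_abs_err={max_err:.3f}")
```

The `n_test` column makes the trade-off visible: at a 90:10 split the error statistics rest on far fewer held-out points than at 60:40, which is why a very small testing set gives a less trustworthy estimate of global accuracy.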
  5. Conclusions
In this paper, a dynamic response prediction model based on a simulation database and the random forest algorithm is established for safety prediction and damage prevention of the pile leg structure of a jack-up platform. After the simulation database is constructed, the machine learning algorithm derives the predicted response under the prevailing environmental load, and this value is compared with the permissible material strength. When a predicted value exceeds the threshold, the system triggers an early warning that prompts immediate inspection of the structure, so that issues can be identified and rectified promptly. The relevant conclusions are summarized as follows:
- A prediction model for the dynamic response of the pile leg of a jack-up platform was developed based on the random forest algorithm. The coefficient of determination R² of the model is 0.935, indicating that the model describes and predicts the data well. Comparison of the predicted values with the accurate values obtained from numerical calculation shows that the model also performs well at extreme points and that its predictions are highly credible.
- The importance of each feature for the prediction model was studied. The height of the main hull was found to have the greatest influence on the prediction results, because the huge weight of the main hull is supported by the pile legs below it. The direction of the wind load had the second largest effect, because the platform is subjected to a large wind force and is very sensitive to the wind direction.
- The effect of sample size on the prediction accuracy of the model was studied. As the number of sample points decreases, the fitting performance of the model gradually deteriorates and the accuracy decreases. This is because randomly drawing new samples from the original samples leaves the sample points sparse in some regions, so the model cannot capture the characteristics of the data well.
- The effect of the training-to-testing ratio on the model's fitting ability was investigated. As the proportion of the training set rises, the predictive ability of the model gradually increases. To save training resources and avoid overfitting, however, too high a proportion of training data should be avoided. For this prediction model, it is more appropriate to choose 80% of the data as the training set.
The findings of this study offer insights into the framework of the prediction model for pile leg structural forces in jack-up platforms, encompassing database establishment, algorithm selection, and parameter tuning. The method can predict the forces acting on the pile leg structures from an established database. The prediction model responds very quickly, deriving the structural stresses for the input load condition within a few milliseconds. The predictions are accurate globally and reliable at the extremes of the forces. The prediction model can be combined with techniques such as digital twins and SHM to obtain better information about the structural forces.
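The early-warning step described at the start of this section, in which predicted responses are compared with a permissible value, could look roughly like the following sketch. The function name, the `safety_factor` parameter, and the numbers are hypothetical and serve only to illustrate the comparison.

```python
def early_warning(predicted_responses, permissible, safety_factor=1.0):
    """Return the indices of points whose predicted magnitude exceeds the limit.

    permissible is the allowed magnitude of the response; safety_factor is a
    hypothetical knob that tightens the limit when it is greater than 1.
    """
    limit = permissible / safety_factor
    return [
        index
        for index, value in enumerate(predicted_responses)
        if abs(value) > limit
    ]

# Illustrative predicted responses at four points, with a limit of 200.
flagged = early_warning([10.0, -250.0, 199.0, 201.0], permissible=200.0)
print("points requiring inspection:", flagged)
```

The magnitude is compared rather than the signed value, so that a large response in either direction triggers the warning.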