Heat Load Forecasting of Marine Diesel Engine Based on Long Short-Term Memory Network

High heat load on diesel engines is a main cause of ship failure, which can lead to ship downtime and pose a risk to personal safety and the environment. As such, predictive detection and maintenance measures are highly important. During the operation of marine diesel engines, operating data present strong dynamic, time-lag, and nonlinear characteristics, and traditional models and prediction methods have difficulty predicting the heat load accurately. Therefore, the prediction of heat load is a challenging and significant task. Continuously developing machine learning technology provides methods and ideas for intelligent detection, diagnosis, and maintenance. In this study, the prediction of diesel engine exhaust temperature using a long short-term memory (LSTM) network is analyzed to determine the diesel engine heat load and to introduce an effective method. The Spearman correlation coefficient method, supplemented with artificial experience, is utilized for feature selection to obtain the optimal input for the LSTM model. The model is validated on data from the ship Shanghai Fuhai, and the results show that the mean absolute percentage error (MAPE) of the model is the lowest, at 0.089. Compared with other models, the constructed prediction model presents higher accuracy and stability, as well as the best evaluation indices. A new idea is thus provided for combining artificial knowledge and experience with data-driven applications in engineering practice.


Introduction
With the increasing severity of the fossil energy crisis and the strict emission requirements for internal combustion engines, the effective use of energy and environmental protection are growing in importance. If combustion in the diesel engine set is insufficient, then the heat generated by the fuel decreases, resources are wasted, and black smoke and a large amount of CO and other harmful gases are discharged; these pollute the environment and harm the human body through direct inhalation [1][2][3]. The diesel engine set is an important power source for ship navigation, and its normal working cycle is a major contributor to efficient sea transportation, energy saving, and emission reduction [4].
Taking the exhaust manifold as an example, Li et al. [5] used the finite element method to verify the effect of thermal load on its fatigue life. Zhang et al. [6] used a high-efficiency heat transfer model to analyze the direct relationship between cylinder head fatigue life and average gas temperature. In addition, a Chaboche model was established to analyze the local deformation and leakage of the cylinder head under a thermal cycle test [7]. EI-Bitar et al. [8] investigated the failure of a ship's exhaust valve and determined that a high-temperature environment leads to the expansion of microcracks and easy fracture. According to the above research, the main equipment of the diesel engine set will be damaged by high heat load and the ship will be stopped, which greatly increases the navigation cost and puts the safety of the ship equipment and the environment at risk.
As such, a hybrid prediction model that combines the artificial empirical Spearman correlation coefficient method (AESR) with a long short-term memory (LSTM) network, the AESR-LSTM model, is proposed in this study to achieve accurate and stable predictions of exhaust temperature. Redundant information is eliminated through the Spearman correlation coefficient method, and the optimal input is derived by retaining the variables with high correlation ratings and adding supplementary variables based on artificial experience. Hyperparameters are usually selected according to experience and then set in combination; here, cross-validation and grid search are combined to avoid blind parameter tuning. The hyperparameter combination of the neural network is thus optimized systematically, which ensures the robustness and accuracy of the prediction model. After the optimal parameter set is selected by grid search and cross-validation, the model is trained again using the optimal parameters.
The trained LSTM model is used to predict the exhaust temperature, and its advantages for data trend prediction are highlighted by comparison with other models. The experimental results of the selected prediction model are consistent with the actual values. The prediction result can be sent to the console as a feedback signal, providing the operator with more convenience and information. The predicted results can also be used to analyze the combustion conditions in the combustion chamber without building complex analytical models, and such signals are difficult to obtain with physical sensors. The predicted trends can be used to analyze the working condition and emissions of diesel engines, to take avoidance measures before a failure occurs, to reduce the risk of accidents, to improve the safety of ship systems, and to prevent serious personal injury and economic loss. AESR-LSTM neural network modeling is simpler than conventional modeling analysis: the workload of heat load research is reduced, more influencing factors are taken into account, complex changes in the combustion chamber are predicted from a small amount of experimental data, and more accurate prediction results are obtained. This study thus provides a new idea for combining artificial experience with data-driven applications in engineering practice.
Accordingly, a method for predicting diesel engine exhaust temperature that integrates feature selection, parameter combination search, and comparative analysis of multiple models is proposed in this study. The remainder of this paper is structured as follows. The methods used and the proposed hybrid prediction model are briefly described in Section 2. In Section 3, the relevant data are collected and analyzed, the results of the proposed system for predicting the thermal load of the combustion chamber of the marine diesel engine set are presented, and these results are compared with those of other models. Finally, the conclusions of this study are drawn in Section 4.

Prediction Method
In this section, the data preprocessing method, network model, and optimization method are introduced, and a method for predicting the heat load of the marine diesel engine combustion chamber is proposed. The resulting AESR-LSTM method mainly consists of the Spearman correlation coefficient method and the LSTM network.

Long Short-Term Memory Network
LSTM is a neural network proposed by Hochreiter and Schmidhuber in 1997 [23,24]. This model has been continuously developed to form a systematic and complete framework [25][26][27]. The LSTM is used in this study to compensate for the limitations of the recurrent neural network (RNN) in dealing with long-distance dependence and to address the enlargement of, and difficulty in updating, the partial derivatives with respect to W during training. The internal structure of the LSTM neural unit is shown in Figure 1. The LSTM adds three gates (thresholds) to the framework of the RNN as three logical control units, and the input and output information of the entire network is controlled and managed by these three gates, which are described as follows:
Input Gate: This gate determines whether the information is stored in the storage unit and is denoted as $i_t$.
Forget Gate: This gate determines whether the information stored in the storage unit at the previous time is kept in the storage unit at the current time and is denoted as $f_t$.
Output Gate: This gate determines whether the information in the storage unit at the current moment enters the hidden state $h_t$ and is denoted as $o_t$.
The storage unit, denoted as $C_t$, is the core of the LSTM unit; historical information can be saved, read, updated, and reset by it.
The LSTM neural network at moment t is expressed by Equations (1)-(8), where $f_t$, $i_t$, $o_t$, and $h_t$ are given by Equations (1), (2), (3), and (6), respectively; $W_f$, $W_i$, $W_o$, and $W_C$ denote the recursive connection weights of the corresponding gates; and σ is the sigmoid function, which, like the tanh function, is used for activation (Equations (7) and (8)).
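For reference, the standard LSTM formulation can be written in a form that matches the numbering described above. This is a sketch rather than a recovery of the original typesetting: the bias terms $b_f$, $b_i$, $b_o$, and $b_C$, the candidate state $\tilde{C}_t$, the concatenation $[h_{t-1}, X_t]$, and the assignment of Equations (4) and (5) to $\tilde{C}_t$ and $C_t$ follow the conventional presentation and are assumptions here.

```latex
\begin{align}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, X_t] + b_f\right) \tag{1} \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, X_t] + b_i\right) \tag{2} \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, X_t] + b_o\right) \tag{3} \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, X_t] + b_C\right) \tag{4} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{5} \\
h_t &= o_t \odot \tanh\left(C_t\right) \tag{6} \\
\sigma(x) &= \frac{1}{1 + e^{-x}} \tag{7} \\
\tanh(x) &= \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{8}
\end{align}
```

Here $\odot$ denotes element-wise multiplication.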
The forget gate determines which part of the state at the previous point in time is discarded and which content remains saved in the memory unit. The sigmoid function is used to decide whether $C_{t-1}$ is retained: it is fully retained when the sigmoid output equals 1 and fully discarded when the output equals 0.
The input gate takes the output $h_{t-1}$ from the previous moment and the input $X_t$ at the current moment, and the sigmoid function is used to control how much is added to $C_t$. A candidate state $\tilde{C}_t$ is also created with the tanh function. The two parts are then multiplied to determine the amount of new information that influences $C_t$, and the contribution of the forget gate is added to obtain the expression for $C_t$.
The output gate is a sigmoid function that determines which parts of $C_t$ need to be output, which gives the expression for $o_t$. $C_t$ is passed through the tanh function and then multiplied by $o_t$ to obtain the final output $h_t$, which ends the work of the LSTM for one moment. How many memory units are forgotten, retained, and output at each moment is thus determined by the three gates, and the result is finally transferred to the state of the current moment.
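The gate logic described above can be illustrated with a single time step of one LSTM unit. The following is a minimal, dependency-free sketch under stated assumptions: the function names (`lstm_step`, `affine`), the parameter dictionary layout, and the bias terms are hypothetical choices for illustration, and the all-zero weights are toy values, not parameters from this study.

```python
import math


def sigmoid(x):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights, z, bias):
    """Return weights @ z + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """Advance one LSTM unit by a single time step and return (h_t, c_t)."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, X_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]  # forget gate
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]  # input gate
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]  # output gate
    c_hat = [math.tanh(a) for a in affine(params["W_C"], z, params["b_C"])]  # candidate state
    # Keep part of the old state (forget gate) and add part of the candidate (input gate).
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # Expose part of the squashed cell state (output gate) as the hidden state.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy check with all-zero weights and biases (hypothetical values):
# every gate evaluates to 0.5 and the candidate state to 0.
hidden, inputs = 1, 1
zero_matrix = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zero_matrix for name in ("W_f", "W_i", "W_o", "W_C")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_o", "b_C")})
h_t, c_t = lstm_step([1.0], [0.0], [2.0], params)
```

With these toy values the forget gate halves the previous cell state and the input gate contributes nothing, which makes the roles of the three gates easy to check by hand.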
The prediction results of the LSTM model are affected by the learning rate, weights, activation function, step size, and number of batches in the network. For example, a learning rate that is set too high causes convergence failure, whereas a learning rate that is set too low consumes a great deal of training time before the optimal value is reached. Problems such as exploding and vanishing gradients can occur when the activation function is poorly chosen. Therefore, the LSTM prediction model needs to be trained with appropriately selected parameters to improve the prediction accuracy.
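The effect of the learning rate can be seen on a deliberately simple problem. The sketch below is an illustration only, not the training procedure of this study: it applies plain gradient descent to the hypothetical objective f(w) = w², and the three learning rates are arbitrary values chosen to show divergence, convergence, and slow progress.

```python
def gradient_descent(lr, steps=50, w0=1.0):
    """Minimise f(w) = w**2 (gradient 2*w) and return the final |w|."""
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * w  # standard gradient step
    return abs(w)


too_high = gradient_descent(lr=1.1)    # each step overshoots: |w| grows
moderate = gradient_descent(lr=0.1)    # |w| shrinks quickly towards 0
too_low = gradient_descent(lr=0.001)   # |w| shrinks, but very slowly
```

After the same number of steps, the run with the excessive learning rate has moved away from the minimum, while the run with the very small learning rate has barely left its starting point.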

Spearman Correlation Coefficient Method
As mentioned above, the factors affecting the exhaust temperature are typically influenced by uncertain, dynamic environmental factors. To identify them, the Spearman correlation analysis method is adopted, which tests the change trend and the strength of correlation between two variables. The method is based on the difference between the ranks of each pair of corresponding values in two columns of paired data. If the correlation coefficient between two variables is close to +1 or −1, then the correlation is strong. The Spearman correlation coefficient $r_p$ can be expressed as follows:
$$r_p = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n\left(n^2 - 1\right)}$$
where n is the sample size and $d_i$ is the difference between the ranks of the ith data pair. The values of $r_p$ lie within the range [−1, 1]. If $r_p$ = 1, then the correlation is perfectly positive; if $r_p$ = −1, then the correlation is perfectly negative. The absolute value of $r_p$ is used as the basis for judging the correlation. The strength of correlation between variables is divided into four categories, as shown in Table 1 [28].
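The rank-difference calculation can be sketched in a few lines. The example below is a minimal illustration that assumes no tied values (with ties, average ranks would be needed); the helper names `ranks` and `spearman` and the sample numbers are hypothetical and are not ship data.

```python
def ranks(values):
    """Rank values from 1 to n (smallest first); assumes no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman coefficient from squared rank differences (no-ties formula)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


x = [10.0, 20.0, 30.0, 40.0, 50.0]
r_mixed = spearman(x, [2.0, 1.0, 4.0, 3.0, 5.0])       # two swapped pairs
r_monotone = spearman(x, [v ** 3 for v in x])          # nonlinear but monotone
r_reversed = spearman(x, [-v for v in x])              # strictly decreasing
```

Because only ranks enter the formula, any strictly increasing relationship, linear or not, yields a coefficient of 1, which is why the method suits nonlinear operating data.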

AESR-LSTM Hybrid Prediction Model
The AESR-LSTM hybrid prediction model combines the Spearman correlation coefficient method with the LSTM network and adds artificial experience to predict the exhaust temperature. First, the sensor data are analyzed to eliminate overlapping features. The Spearman correlation coefficient method is then used to discard redundant information in the original data, because the exhaust temperature is affected by various factors that are themselves correlated. Finally, the variables are supplemented by artificial experience, which improves the efficiency of the algorithm and the accuracy of prediction. Cross-validation and grid search are used to optimize the hyperparameters of the neural network and to obtain the combination of parameters with the maximum prediction accuracy. After the optimal parameter set is selected, the model is trained again using the optimal parameters. The overall framework and partial procedures of AESR-LSTM are shown in Figure 2 and Algorithm 1. The specific modeling steps are presented as follows.
Step 1: The influencing factors related to exhaust temperature are analyzed, and the relevant time series data $X_t$ are collected on the basis of engineering experience.
Step 2: The data are divided into training and test sets at a ratio of 7:3.
Step 3: The data are preprocessed. The Spearman correlation coefficient is used for feature selection on the original data: redundant information is eliminated, highly correlated variables are extracted, and variables are supplemented on the basis of mechanisms and human experience to obtain the best input $X_t^*$.
Step 4: The hyperparameters of the LSTM neural network model are adjusted through iterative optimization that combines cross-validation and grid search, so that the optimal combination of parameters is selected and the prediction accuracy is improved.
Step 5: After the optimal parameter set is selected by grid search and cross-validation, the model is trained again using the optimal parameters.
Step 6: The test set samples are input into the prediction model to predict the combustion chamber exhaust temperature of marine diesel engine sets.
Step 7: The prediction performance of the proposed model is compared with that of other prediction models.
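Steps 2, 4, and 5 can be sketched as a small, self-contained workflow. To keep the example runnable without a deep learning library, the LSTM is replaced by a stand-in model (a one-parameter ridge fit), the data are synthetic, and the split is taken in order rather than at random; all names (`fit_ridge_1d`, `grid_search_cv`, `lam`) are hypothetical. Only the control flow (split, grid search with k-fold cross-validation, retraining with the best parameters) mirrors the steps above.

```python
import itertools
import math


def fit_ridge_1d(xs, ys, lam):
    """Stand-in model: slope w of y = w * x with an L2 penalty lam."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def rmse(xs, ys, w):
    """Root-mean-square error of the stand-in model on (xs, ys)."""
    return math.sqrt(sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs))


def k_fold_indices(n, k):
    """Split range(n) into k contiguous validation folds."""
    bounds = [round(j * n / k) for j in range(k + 1)]
    return [list(range(bounds[j], bounds[j + 1])) for j in range(k)]


def grid_search_cv(xs, ys, grid, k=5):
    """Return the parameter combination with the lowest mean validation RMSE."""
    best_params, best_score = None, float("inf")
    names = list(grid)
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        scores = []
        for val_idx in k_fold_indices(len(xs), k):
            held_out = set(val_idx)
            train_idx = [j for j in range(len(xs)) if j not in held_out]
            w = fit_ridge_1d([xs[j] for j in train_idx], [ys[j] for j in train_idx], **params)
            scores.append(rmse([xs[j] for j in val_idx], [ys[j] for j in val_idx], w))
        mean_score = sum(scores) / k
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Synthetic data (not ship data): y = 2x exactly.
xs = [float(v) for v in range(1, 21)]
ys = [2.0 * v for v in xs]

split = len(xs) * 7 // 10                       # Step 2: 7:3 split
x_train, y_train = xs[:split], ys[:split]
x_test, y_test = xs[split:], ys[split:]

best, cv_rmse = grid_search_cv(x_train, y_train, {"lam": [0.0, 1.0, 10.0]})  # Step 4
w_final = fit_ridge_1d(x_train, y_train, **best)  # Step 5: retrain on the full training set
test_rmse = rmse(x_test, y_test, w_final)         # Step 6: evaluate on the test set
```

The test set is touched only once, after the hyperparameters have been fixed, so the reported test error is not biased by the search.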

Principle Analysis and Data Processing
In a ship, the power source is composed of the main engine and an auxiliary engine. The auxiliary power system is composed of machinery other than the diesel engine (main engine), including the fuel system, lubricating oil system, air system, cooling system, and other mechanical equipment. The main and auxiliary engines work together to propel the ship, and the composition structure is shown in Figure 3.
On the basis of the mechanism and data of the ship, the heat load of the marine diesel engine during operation is accurately reflected by the exhaust temperature. The exhaust temperature reflects the amount, completeness, and timeliness of fuel combustion in the combustion chamber, as well as the high-temperature heating time and brightness of the combustion chamber components. Hence, the exhaust temperature can be used to predict the heat load of the diesel engine set.
The exhaust temperature of a single cylinder is predicted as an example in this study to analyze the trend of heat load variation and the operating performance of the combustion chamber. A high cylinder exhaust temperature is due to poor internal combustion, which is related to the amount of fresh air in the cylinder, the cooling effect of the cooler, the injector atomization quality, the fuel viscosity, and the cylinder compression pressure. Sensors are used to monitor the working condition and to collect factors related to exhaust temperature, including high-temperature cooling, freshwater outlet temperature, cylinder liner cooling water inlet pressure, piston cooling oil outlet temperature, and fuel pressure after the fuel filter. Determining the correlation and dependence among these data is important for predicting the exhaust temperature of marine diesel engine sets.
Sensor monitoring data of the Chinese vessel Shanghai Fuhai, uploaded every 25 min, are used in this study. The sampled data of the initial variables are listed in Table 2. Over two months of field operation, 28,160 records were measured via the ship's sensors; they constitute the data set, which is randomly divided into training and test sets at a ratio of 7:3. The time series data $X_t$ associated with the exhaust temperature are collected from these records.
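The random 7:3 division of the 28,160 records can be sketched as follows. Only the record count and the ratio come from the text; the function name `split_7_3` and the fixed seed are assumptions added so that the split is reproducible.

```python
import random


def split_7_3(n_records, seed=0):
    """Shuffle record indices and return (train_indices, test_indices) at 7:3."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seeded generator, so the split is repeatable
    cut = n_records * 7 // 10             # integer arithmetic avoids float rounding
    return indices[:cut], indices[cut:]


train_idx, test_idx = split_7_3(28160)
```

With 28,160 records this gives 19,712 training records and 8,448 test records.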
The turbocharger turbine is driven by the inertial impulse of the exhaust gas, and fresh air is then pressurized into the cylinder. Thus, the overlapping variables and the supercharger front ($T_{a3}$) and rear ($T_{a4}$) exhaust temperatures must be eliminated to obtain the time series data as follows.
$$X_t = \{T_{a1}, T_{o1}, T_{o2}, P_{o1}, T_{o3}, T_{w1}, P_{w1}, T_{w2}, T_{a2}, T_{f1}, T_{o4}, P_{f1}, P_{f2}, P_{w2}, P_{o2}, N_T, P_{a1}\}$$
The Spearman correlation coefficient method is used for feature selection, and the input of the neural network is determined by the pairwise correlation between factors, as shown in Figure 4.
According to Table 1 and Figure 4, the correlation coefficient between the exhaust temperature and $T_{o2}$ is 0.8997. Hence, the turbocharger lubricating oil outlet temperature is highly relevant to the exhaust temperature. This finding is consistent with the actual scenario: the viscosity of the lubricating oil is affected by its temperature, which in turn increases the exhaust temperature. The correlation of the variable $T_{w2}$ is 0.8639; the outlet temperature of the cylinder liner cooling water determines how much heat is taken away from the combustion chamber, which indicates its sensitivity to changes in the exhaust temperature. The cylinder liner cooling water inlet pressure and the sweep box temperature are also important factors affecting the exhaust temperature. Six variables with correlations higher than 0.5 are derived, and their p-values are below 0.001.
If the temperature of the pressurized air after the cooler is excessively high, then the exhaust temperature rises, because the fresh air entering the diesel engine is pressurized by the turbocharger and then cooled by the cooler before it enters the combustion chamber. The supercharger speed is also relevant: the high-temperature exhaust gas from the combustion chamber flows through the supercharger, so a rise in exhaust temperature increases the exhaust energy and, with it, the supercharger speed. Another factor to be considered is the fuel pressure after the diesel filter, which reflects whether the filter is faulty; a damaged filter can affect fuel quality and exhaust temperature.
The three variables $T_{a2}$, $N_T$, and $P_{f1}$ mentioned above are all significant, with p-values of less than 0.001, and they affect the predicted variable even though their correlations (0.1117, 0.1863, and 0.3574, respectively) are below 0.5. Therefore, these three factors are added to the six highly correlated variables when deriving the final set of input variables for the model.
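The selection rule described above (keep variables whose correlation exceeds 0.5, then add variables justified by experience) can be sketched as follows. The function `select_features` and its arguments are hypothetical names. The dictionary contains only the five coefficients quoted in the text; the remaining coefficients are not reproduced here, so the output is not the complete nine-variable input set.

```python
def select_features(correlations, threshold=0.5, expert_additions=()):
    """Keep variables with |r| above the threshold, then append expert-chosen ones."""
    selected = [name for name, r in correlations.items() if abs(r) > threshold]
    for name in expert_additions:
        if name not in selected:  # avoid duplicates if an expert pick already passed
            selected.append(name)
    return selected


# Coefficients quoted in the text (correlation with the exhaust temperature).
reported = {
    "T_o2": 0.8997,
    "T_w2": 0.8639,
    "T_a2": 0.1117,
    "N_T": 0.1863,
    "P_f1": 0.3574,
}

by_correlation = select_features(reported)
with_experience = select_features(reported, expert_additions=("T_a2", "N_T", "P_f1"))
```

Keeping the expert additions as an explicit argument makes it visible which inputs were chosen by the data and which by engineering judgment.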

Analysis of Modeling and Prediction Results
On the basis of the Spearman correlation analysis, the top nine positively correlated parameters are selected as model inputs for predicting the target output, the exhaust temperature T. Given the impact of data volume on the learning ability of data-driven methods, the data are divided into training and test sets at a ratio of 7:3. A combination of grid search and tenfold cross-validation is applied to improve the prediction performance of the model. For a set of hyperparameters $X = \{X_1, X_2, \ldots, X_n\}$, the number of combinations to be evaluated is $\prod_{i=1}^{n} |h_i|$, where $|h_i|$ is the number of candidate values of the ith hyperparameter (i = 1, 2, ..., n). Five hyperparameters are tuned in this study: the number of hidden layers, the number of hidden units, the number of training rounds, the learning rate, and the batch size of the LSTM prediction network. The change trend of the loss function is affected by these five hyperparameters, which are divided into two groups to show how the loss function changes with the hyperparameters.
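The cost of the search grows multiplicatively with the number of candidate values, which a short calculation makes concrete. In the sketch below the candidate lists are hypothetical: the three learning rates, the value of 100 units, and the value of five layers appear in the text, but the other values are placeholders and do not reproduce the candidate table of this study.

```python
import math


def count_grid_fits(candidates, n_folds=10):
    """Return (number of combinations, number of model fits with k-fold CV)."""
    combos = math.prod(len(values) for values in candidates.values())
    return combos, combos * n_folds


# Hypothetical candidate values for the five tuned hyperparameters.
hypothetical_grid = {
    "hidden_layers": [3, 4, 5],
    "hidden_units": [50, 100, 150],
    "epochs": [50, 100, 200],
    "learning_rate": [0.001, 0.005, 0.01],
    "batch_size": [32, 64],
}

n_combos, n_fits = count_grid_fits(hypothetical_grid)  # 3*3*3*3*2 combinations, times 10 folds
```

Even this small grid requires well over a thousand training runs under tenfold cross-validation, which is why the candidate lists are usually kept short.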
The influence of the number of units and learning rounds of the five-layer neural network on the RMSE is shown in Figure 5. As the number of learning rounds increases, the RMSE first decreases and then increases, and the RMSE with 100 units is generally lower than that with other numbers of units. Figure 6 shows how the loss function is affected by the number of hidden layers. The low learning rate of 0.001 usually yields higher loss values. For learning rates of 0.01 and 0.005, five hidden layers perform better than other numbers of layers.
In the process of hyperparameter optimization, the combination with the lowest RMSE value is selected as the best hyperparameter combination. Some tuning results of the cross-validated grid search are shown in Table 3. After optimization, the hyperparameter combination with the best RMSE is obtained. The candidate and optimal hyperparameter values of the LSTM prediction model are shown in Table 4.
After the optimal parameter combination is selected, the training set is input into the LSTM model for training. At the same time, dropout is introduced to prevent the model from overfitting. The training curve and the scatter diagram of the training relative error are shown in Figures 7 and 8. The figures show that the predicted values basically coincide with the actual values in training and that the training error finally approaches the zero line.
The test set is fed into the trained model for exhaust temperature prediction. The prediction results are illustrated in Figure 9. The consistency between the predicted and measured temperature values reflects the strong generalization ability of the prediction model. The results of the selected forecasting model are subsequently compared with those of traditional forecasting methods, as described in detail below.

Multimodel Comparative Analysis
In this study, the Spearman correlation coefficient method and the LSTM network are combined to predict time series data. The same data set is input into other prediction models, and their results are compared with those of the proposed method for further analysis. The results of each prediction model are shown in Figures 10 and 11.

As shown in Figure 10, the prediction curve (red line) of the AESR-LSTM model with human experience is the closest to the true values (blue line). As can be seen in Figure 11, except for a few outliers, the scatter plot of predicted versus actual values for the proposed model lies closest to the diagonal, which indicates that the difference between the predicted and actual values is the smallest.
In addition, several commonly used evaluation indicators, listed in Table 5, are adopted to further verify the prediction performance of the AESR-LSTM model and to compare the four models.
Table 5. Evaluation indicators.
Indicators: mean absolute error (MAE), mean absolute percentage error (MAPE), and root-mean-square error (RMSE).
N is the number of predicted values, $T_{ri}$ is the original data value, and $T_{pi}$ is the predicted value.
The prediction performance of a model is indicated by its MAPE, MAE, and RMSE values, which were calculated separately for the four models. Figure 12 shows the evaluation index values of the four prediction models; the error bars in the figure represent 95% confidence intervals. The MAPE, MAE, and RMSE of the proposed AESR-LSTM model are 0.089, 10.5403, and 27.5408, respectively, which are the best among the compared prediction models. The feature inputs selected by the improved AESR-LSTM model are better for data trend prediction than those obtained by traditional methods, so the method optimization is effective.
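The three indicators of Table 5 can be computed as in the following sketch, which uses their standard definitions in terms of N, the original values, and the predicted values. Two points are assumptions: MAPE is returned as a fraction rather than multiplied by 100 (the scaling used in this study cannot be confirmed from the text), and the sample temperatures are illustrative numbers, not measurements from the ship.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |T_r - T_p|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction: average of |(T_r - T_p) / T_r|."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root-mean-square error: square root of the average squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Illustrative values only (hypothetical exhaust temperatures).
t_r = [300.0, 320.0, 340.0]  # original values
t_p = [303.0, 316.0, 340.0]  # predicted values

scores = {"MAE": mae(t_r, t_p), "MAPE": mape(t_r, t_p), "RMSE": rmse(t_r, t_p)}
```

RMSE penalizes large individual errors more strongly than MAE, so reporting both gives a view of typical error and of sensitivity to outliers.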

Conclusions
Based on the data set collected from the marine cabin system, an AESR-LSTM data trend prediction model that incorporates artificial experience is constructed in this study. The model can be used for heat load prediction, fault detection, and diagnosis of marine diesel engines. The Spearman correlation coefficient method is applied to the raw data for feature selection, and the optimal input is chosen through artificial experience and a significance check. Cross-validation and grid search are combined to tune the hyperparameters systematically and to avoid dependence on a single, randomly chosen validation set. After the optimal parameter set is selected, the model is retrained with these parameters, and the test set data are input into the trained model to obtain the prediction results. The results are then compared with those of other prediction models.
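The tuning procedure described above (grid search scored by cross-validation, retraining with the best parameters, then prediction on the test set) can be sketched as follows. This is only an illustration under stated assumptions: a one-dimensional ridge regression stands in for the LSTM so that the example stays self-contained, and the data, the parameter grid, and all function names are hypothetical rather than those used in the study.

```python
import itertools
import math


def fit(xs, ys, lam, fit_intercept):
    """Stand-in model (NOT the LSTM): 1-D ridge regression -> (slope, intercept)."""
    mx = sum(xs) / len(xs) if fit_intercept else 0.0
    my = sum(ys) / len(ys) if fit_intercept else 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx


def rmse(model, xs, ys):
    """Root-mean-square error of a fitted (slope, intercept) model."""
    slope, intercept = model
    sq = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))


def cv_score(xs, ys, params, k=4):
    """Mean validation RMSE over k interleaved folds."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        valid = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train], **params)
        scores.append(rmse(model, [xs[i] for i in valid], [ys[i] for i in valid]))
    return sum(scores) / k


def grid_search(xs, ys, grid, k=4):
    """Return the parameter combination with the lowest cross-validated RMSE."""
    names = list(grid)
    candidates = [dict(zip(names, combo))
                  for combo in itertools.product(*grid.values())]
    return min(candidates, key=lambda p: cv_score(xs, ys, p, k))


# Synthetic data for illustration only: a linear trend with a small periodic offset.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 5.0 + 0.1 * ((7 * i) % 5 - 2) for i, x in enumerate(xs)]
x_train, y_train = xs[:16], ys[:16]
x_test, y_test = xs[16:], ys[16:]

# Hypothetical hyperparameter grid.
grid = {"lam": [0.0, 1.0, 10.0], "fit_intercept": [True, False]}

best_params = grid_search(x_train, y_train, grid)
final_model = fit(x_train, y_train, **best_params)  # retrain on the full training set
test_rmse = rmse(final_model, x_test, y_test)       # evaluate once on the held-out test set

print("best parameters:", best_params)
print("test RMSE:", test_rmse)
```

Every candidate is scored on all folds, so the selection does not hinge on one particular validation split. The test set is used only once, after the parameters are fixed. For sequential data such as engine monitoring records, a forward-chaining split may be preferable to the interleaved folds assumed here.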
(1) A Spearman correlation coefficient method incorporating artificial experience was proposed to select features from the operational monitoring data collected by the sensors. The correlation, redundancy, and significance of the variable sets are analyzed separately, and the nine monitoring parameters with the greatest influence on the exhaust temperature are selected. Data-driven analysis and human experience are thus combined to provide optimal input features for the prediction model.
(2) The LSTM prediction model is trained with hyperparameter tuning by cross-validated grid search, and its predictions and evaluation metrics are obtained. The results and indicators of several models were compared. The predicted values of AESR-LSTM are closest to the true values, and its MAPE, MAE, and RMSE are the best among the compared models, at 0.089, 10.5403, and 27.5408, respectively.
(3) The shortcomings of relying on a single method can be overcome by fusing multiple methods, and the data can be screened scientifically and effectively to improve the performance of the model in data prediction and fault diagnosis of marine diesel engines. Thus, the hybrid algorithm model is stable, and the error tolerance of the prediction results is reduced.
(4) The proposed method is based on both the mechanism and the data of the ship. All factors that may cause thermal load failure of the diesel engine are taken into account, so the method can serve as a reference for analyzing the working performance of the marine diesel engine. The predictions enable effective fault detection and maintenance, so that preemptive corrective measures can be taken before a failure occurs. This helps prevent ship downtime due to components damaged by excessive heat load, improves the fuel economy and equipment reliability of ship diesel engines, and reduces economic losses.
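The feature screening summarized in conclusion (1) can be illustrated with a short sketch. It computes Spearman's rank correlation between each candidate parameter and the exhaust temperature, keeps the parameters whose absolute correlation reaches a threshold, and then applies an expert override. The parameter names, sample values, threshold, and override lists are all hypothetical, and the redundancy and significance analyses of the study are not reproduced here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (inputs must not be constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def select_features(features, target, threshold, expert_keep=(), expert_drop=()):
    """Keep features with |rho| >= threshold, then apply the expert override."""
    scores = {name: spearman(vals, target) for name, vals in features.items()}
    chosen = [n for n, r in scores.items()
              if abs(r) >= threshold and n not in expert_drop]
    chosen += [n for n in expert_keep if n in features and n not in chosen]
    return chosen, scores


# Hypothetical monitoring data, for illustration only.
exhaust_temp = [300.0, 310.0, 325.0, 340.0, 360.0, 390.0]
features = {
    "engine_speed": [70.0, 75.0, 80.0, 86.0, 90.0, 95.0],
    "fuel_rack": [30.0, 34.0, 33.0, 40.0, 45.0, 50.0],
    "cabin_humidity": [55.0, 40.0, 60.0, 45.0, 58.0, 50.0],
    "lube_oil_pressure": [4.0, 4.2, 3.9, 4.1, 4.0, 4.1],
}

# Assumed threshold and expert choice: keep lube oil pressure despite weak correlation.
selected, scores = select_features(
    features, exhaust_temp, threshold=0.6, expert_keep=("lube_oil_pressure",)
)
print(selected)
```

Spearman's coefficient depends only on ranks, so it captures monotonic but nonlinear relationships that a linear correlation coefficient could understate. The override step is where operating experience can retain a weakly correlated but physically relevant parameter.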
A novel method combining artificial experience with data-driven analysis is proposed. The selected optimal feature set is input into the model for prediction, and better prediction results are obtained. As such, a feasible extension of machine learning to thermal load prediction and fault diagnosis of marine diesel engines is provided. Future research can focus on optimizing the method, obtaining better combinations of operating parameters through data mining techniques, and developing an independent fault detection system to provide more convenience and information for ship operators.