1. Introduction
As a power output device, the diesel engine can be used not only as the main power device of the ship but also as the power output device of the ship’s power station [
1]. The normal operation of the main engine can guarantee the normal power output of the ship, while the marine power generation diesel engine is an essential part of the ship’s power station, which can ensure the power supply of the ship [
2]. The development of economic globalization cannot be separated from the support of the shipping industry. Ninety percent of the world economy depends on the ocean. However, the carbon emissions from shipping account for about 3% of the total global carbon emissions, of which carbon dioxide is the main emission. With the rapid development of the shipping industry, the tonnage and quantity of ships are increasing, and the emission of carbon dioxide from ship equipment is growing rapidly [
3]. The operating state of a marine engine directly affects its exhaust performance. A fault can degrade the operating conditions of the diesel engine, worsening its emission performance and polluting the marine environment. In serious cases, the diesel engine shuts down, which disrupts the ship's navigation and power supply, damages the ship's equipment, and can even endanger the lives of the people on board [
4]. At the same time, with the development of the Internet of Things, the Internet, and automation and intelligence, the structure of marine diesel engines is becoming more complex and their components more numerous. Because marine diesel engines operate in harsh environments for long periods, the probability of failure is increasing [
5]. The traditional diesel engine monitoring and alarm technology mainly focuses on the thermal parameters of the diesel engine, such as oil temperature, water temperature, etc. These parameters will have obvious abnormalities only when the fault deteriorates to a certain extent [
6]. Therefore, traditional monitoring and alarm technology cannot directly provide early fault warnings. Relatively mature fault diagnosis technologies mostly use intelligent algorithms to learn a large number of diesel engine fault features and classify them in detail. This allows a fault to be located promptly after it occurs, which helps the operation and management personnel repair the equipment. However, such methods cannot alert the relevant personnel before the fault occurs [7]. Studying fault early warning technology for the marine diesel engine and predicting its state therefore makes it possible to raise an alarm at the early stage of a fault. This not only gives the operation and management personnel enough time to check the abnormality but also saves considerable time and cost in subsequent maintenance, which is of great significance for ensuring the normal operation of the diesel engine and improving equipment reliability.
Exhaust gas temperature is an important thermal performance parameter of a marine diesel engine. It carries a large amount of information about the state of the engine and reflects its combustion and dynamic characteristics under operating conditions [8]. The exhaust gas temperature is a slowly varying signal that is relatively insensitive to disturbance and is a strong fault indicator. By monitoring and predicting the exhaust gas temperature, the health status of the marine diesel engine can be reflected in real time [9].
In the existing research, the methods of fault early warning usually include the method based on the physical model and the data-driven method. If the model-based method is adopted, it is necessary to construct an accurate mathematical or physical model to describe the working process of the predicted object [
10]. For marine equipment in harsh and changeable environments, it is often difficult to establish a more accurate model system. The data-driven method does not involve the construction of complex models. It can collect historical operation data from the equipment as the object of study [
11]. By processing and analyzing the data and using relevant algorithms, we can establish a fault early warning model to realize fault early warning. In recent years, with the rapid development of deep learning and the continuous improvement and innovation of various intelligent algorithms, fault warning based on the data-driven method has attracted more and more attention [
12].
Based on the improved random forest algorithm, Li et al. proposed a bus engine failure early warning model, set early warning indicators using rough set theory, and verified the proposed method through experiments [
13]. Liu et al. combined a Bayesian method with a long short-term memory (LSTM) neural network to predict the status of nuclear power plant equipment. A threshold is set on the residual value, and early fault warning is realized by observing the change in the residual value [
14]. Zhang et al. used the multivariate state estimation technique (MSET) to establish a state prediction model for power plant auxiliary equipment and used sliding window similarity instead of the traditional residual threshold method to achieve effective fault prediction and early warning of faults [
15]. Liang et al. used the bidirectional recurrent neural network (BRNN) to establish a wind turbine prediction model and calculated the residual between the predicted value and the actual value. A fault warning can be achieved by using the sliding window to calculate the residual value [
16]. Yuan et al. established the echo state network (ESN) to predict the state of the pitch motor. According to the analysis of the residual value, they used the exponential weighted average algorithm to obtain the alarm threshold of each parameter to realize the early fault warning [
17]. Based on the nonlinear state estimation method, Yan et al. established a gearbox bearing temperature prediction model and analyzed the distribution characteristics of the residual error and the standard deviation of the residual error through the sliding window error statistical analysis method. When the abnormal state exceeds the set threshold, a fault warning can be realized [
18]. Liu et al. used an LSTM model to predict the trend of the exhaust gas temperature of a marine diesel engine [
19]. Desbazeille et al. established a neural network model for cylinder fault prediction in marine diesel engines to predict the vibration value of the crankshaft and determine the severity of the fault [
20]. Cheliotis et al. combined the expected behavior (EB) model with an exponentially weighted moving average to predict the operating state of the main engine. The results show that this method can detect faults early [
21]. Han et al. used the LSTM network to build a fault prediction model to predict the load of the marine diesel engine [
22]. Theodoropoulos et al. used a one-dimensional convolutional neural network (Convolution1d, 1DCNN) to predict and analyze the ship data collected by sensors. The prediction results show that this method can help shipowners to save on operating costs and improve operating efficiency. At the same time, it can also ensure the normal operation of ship equipment [
23].
Che et al. predicted the state of multiple variables of the aircraft system based on the LSTM neural network. They used a deep belief network to evaluate and classify the state of the aircraft system [
24]. Yang et al. conducted fault diagnosis research on robots based on the deep belief network (DBN). The established model can identify the fault state of the machine and give early warning in time [
25]. Karatug et al. analyzed the thermal parameters of the diesel engine through the artificial neural network (ANN), judged the real-time operation status of the diesel engine, and classified the status in detail [
26]. Lazakis et al. monitored the thermal parameters of the diesel engine through fault tree analysis and fault mode response analysis to achieve fault prediction and fault location of the diesel engine [
27]. Vos et al. used the combination of LSTM and a support vector machine (SVM) to detect the abnormality of the reduction gearbox [
28]. Kichan et al. used principal component analysis and a neural network to monitor the vibration signals of the marine diesel engine. This method could realize early fault diagnosis [
29]. Christian established a real-time anomaly monitoring system for marine diesel generators based on a long short-term memory variational autoencoder and multi-level Otsu thresholding. The system can realize early warning and fault diagnosis [
30]. Chernyi et al. studied the condition diagnosis of ladle lining by using a neural network and developed relevant software to evaluate its condition and make auxiliary decisions [
31]. Theodoropoulos et al. used the two-dimensional convolutional neural network (Convolution2d, 2DCNN) to identify the abnormal state of the ship during navigation. The experimental results show that this method can effectively monitor the state of the ship’s equipment and classify it [
32]. Dong et al. used the convolutional neural network to establish a multi-input model for fault diagnosis of rotating machinery and achieved good results [
33].
The existing research on fault early warning can be roughly divided into two categories: state prediction and state classification [
34]. The state classification method requires a large amount of early fault data for model training, which makes it unsuitable for marine diesel engines, for which early fault data are difficult to collect. In addition, there is little research on fault warning for marine diesel engines, even though the marine diesel engine is an important piece of equipment for ensuring the safe navigation of ships and the normal lives of personnel. Therefore, this paper proposes a state prediction model based on CNN-BiGRU to predict the exhaust temperature of the marine diesel engine. The residual between the predicted value and the actual value is then analyzed statistically, and an alarm threshold is set for the residual. At the same time, to prevent false alarms when the prediction for a normal state deviates substantially because of the prediction model itself, a sliding window algorithm is used to calculate the standard deviation of the residual, and an alarm threshold is set for the standard deviation. A fault warning is triggered only when the residual and its standard deviation exceed their thresholds at the same time.
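The dual-threshold rule described above can be illustrated with a minimal sketch. The function name, the window length, and both threshold values are hypothetical; in practice the thresholds would be derived from the residual statistics of fault-free data, which is not shown here.

```python
import numpy as np

def fault_warning(actual, predicted, window, resid_limit, std_limit):
    """Flag samples where BOTH alarm conditions hold.

    resid_limit and std_limit are hypothetical thresholds, not values
    taken from the paper.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = actual - predicted

    # Sliding-window standard deviation of the residual. Samples without a
    # full window keep a value of 0 and therefore can never raise an alarm.
    rolling_std = np.zeros(residual.shape)
    for end in range(window, residual.size + 1):
        rolling_std[end - 1] = residual[end - window:end].std()

    resid_alarm = np.abs(residual) > resid_limit
    std_alarm = rolling_std > std_limit
    return resid_alarm & std_alarm
```

Requiring both conditions means that a large residual alone, or a noisy residual that stays small, does not trigger a warning.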
The rest of this paper is organized as follows. The deep learning theory section introduces the convolutional neural network and the bidirectional gated recurrent unit used in this paper. The forecasting process section introduces the collected data set and the data processing procedure, describes the hyperparameter selection process, establishes the forecasting model, and verifies through experiments the advantages of the proposed model in trend prediction. The fault warning section introduces the method for setting the alarm threshold. The conclusion section summarizes the findings.
2. Deep Learning Theory
2.1. The Convolutional Neural Network
The convolutional neural network (CNN) is a deep feed-forward neural network that processes grid-structured data and has strong feature extraction capabilities [35]. It usually consists of an input layer, a convolutional layer, a pooling layer, and a fully connected layer; its structure is shown in Figure 1 [36]. The convolutional layer is named after the convolution operation in mathematics. It convolves the input data with a convolution kernel of adjustable size, which is the essential step by which the CNN extracts local features. Its calculation is given in Formula (1).
In the formula, $W_{cnn}$ is the weight coefficient of the convolution kernel; $n_t$ is the input variable at time $t$; $*$ represents the convolution operation; $b_{cnn}$ is the bias of the convolution operation; $c_t$ is the output of the convolution layer; and $f$ is the activation function.
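Written out with the symbols defined above, the convolution operation takes the form below. This is a reconstruction from those definitions, assuming the usual convolution-layer expression:

```latex
c_t = f\left(W_{cnn} * n_t + b_{cnn}\right) \tag{1}
```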
The pooling layer is usually located after the convolutional layer. Its main function is to shrink the feature map while preserving the essential feature information, which reduces the amount of computation and helps avoid overfitting. Average pooling and maximum pooling are currently the two most commonly used pooling methods.
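As a minimal illustration of the two operations just described, the following NumPy sketch applies a one-dimensional convolution followed by pooling to a single-channel sequence. The function names are hypothetical, ReLU is an assumed choice for the activation f, and the kernel is slid without flipping, as most deep learning libraries do.

```python
import numpy as np

def conv1d(n, w_cnn, b_cnn):
    """Valid 1D convolution layer with ReLU activation (assumed choice of f)."""
    n = np.asarray(n, dtype=float)
    w_cnn = np.asarray(w_cnn, dtype=float)
    k = w_cnn.size
    # Slide the kernel over the sequence; each position yields one feature.
    out = np.array([np.dot(w_cnn, n[i:i + k]) for i in range(n.size - k + 1)])
    return np.maximum(out + b_cnn, 0.0)

def pool1d(c, size, mode="max"):
    """Non-overlapping pooling; a trailing partial window is dropped."""
    c = np.asarray(c, dtype=float)
    usable = (c.size // size) * size
    windows = c[:usable].reshape(-1, size)
    return windows.max(axis=1) if mode == "max" else windows.mean(axis=1)
```

With a kernel of length 2 and a pooling size of 2, a sequence of 7 samples becomes 6 convolution outputs and then 3 pooled features, which shows how pooling shortens the sequence passed to later layers.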
The fully connected layer will rearrange the features and perform nonlinear calculations to obtain the output results.
CNN is widely used in image processing, where the 2DCNN is the usual form. However, because its dimensionality does not match that of time series data, it cannot be applied directly to time series prediction. To solve this problem, the 1DCNN is often used to process time series data and mine the internal correlations and latent features of the data, improving the efficiency and accuracy of the prediction model [
37].
2.2. The Bidirectional Gated Recurrent Unit
Recurrent neural networks (RNN) and their variants are widely applied in time series prediction. Compared with traditional neural networks, the RNN has a certain memory ability: it carries the influence of historical data into the current training and learning process, which yields a relatively good prediction effect. Because its cell contains only a single tanh unit, vanishing and exploding gradients are easily encountered during training. In response, the long short-term memory (LSTM) network and the gated recurrent unit (GRU) were developed as improvements on the RNN; they largely avoid the vanishing gradient problem and improve the prediction effect of the model.
The LSTM consists mainly of the forget gate, the input gate, and the output gate. The forget gate discards redundant and invalid information. The input gate filters the input information and retains the important features in the data. The output gate passes information on to the next step. The LSTM improves on the RNN by altering the internal structure of its neurons, which alleviates the gradient problem of the RNN model. However, its more complex internal structure increases the number of activation functions and training parameters, which makes training more difficult and slows model prediction. To improve prediction speed while maintaining prediction accuracy, the GRU network was proposed, which simplifies this internal structure as much as possible.
The GRU is also a gating mechanism in the recurrent neural network, which is proposed to solve problems such as long-term memory and gradient in backpropagation [
38]. The GRU is similar to the LSTM but simplifies its internal structure: a single update gate replaces the input gate and the forget gate. Therefore, compared with the LSTM, the GRU network has fewer parameters [
39]. In the GRU network unit, the reset gate is used to combine the previous state information with the current input to form the current state information $\hat{h}_t$. The update gate selectively forgets the hidden state of the previous moment: it takes the hidden state $h_{t-1}$ of the previous moment, the current input $X_t$, and the current state information $\hat{h}_t$, and passes their combined matrix through the nonlinear transformation of the activation function. That is, it forgets some unimportant information in $h_{t-1}$ and selectively remembers some information in $\hat{h}_t$. The basic structure of the GRU network unit is shown in Figure 2 [40], and its calculation is given in Formula (2).
In the formula, $X_t$, $h_{t-1}$, $z_t$, $r_t$, $\hat{h}_t$, and $h_t$ represent, respectively, the input at the current moment, the hidden state at the previous moment, the update gate, the reset gate, the candidate hidden state, and the output of the hidden layer at the current moment. $W_r$, $W_z$, and $W_{\hat{h}}$ are the weight matrices of the GRU neural network, and $\sigma$ is the sigmoid function.
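With the symbols defined above, the GRU update is commonly written as follows. This is a reconstruction under the assumption that the standard bias-free GRU formulation is intended; some references swap the roles of $z_t$ and $1 - z_t$ in the last line.

```latex
\begin{aligned}
r_t &= \sigma\left(W_r \cdot [h_{t-1}, X_t]\right) \\
z_t &= \sigma\left(W_z \cdot [h_{t-1}, X_t]\right) \\
\hat{h}_t &= \tanh\left(W_{\hat{h}} \cdot [r_t \odot h_{t-1}, X_t]\right) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t
\end{aligned} \tag{2}
```

Here $[\cdot,\cdot]$ denotes vector concatenation and $\odot$ element-wise multiplication.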
In the GRU neural network, the current state is always determined by the state at the previous moment; that is, it is output from front to back. However, in actual situations, if the feedback of a certain moment to the state of a future moment is considered, it is more conducive to the extraction of deep features, which requires BiGRU to establish this connection [
41]. BiGRU is a neural network model composed of two unidirectional GRUs running in opposite directions, and its output is determined by the states of both. At each moment, the input is fed to the two GRUs simultaneously, and the output is jointly determined by them. The specific structure of BiGRU is shown in Figure 3 [42]. The output is given in Formula (3).
In the formula, the GRU function represents the nonlinear transformation of the input sequence vector, turning the input at the current moment and the input at the adjacent moment into the corresponding hidden layer state; $x_t$ is the input at time $t$; $h_{t-1}$ is the state of the hidden layer at time $t-1$; $h_{t+1}$ is the state of the hidden layer at time $t+1$; $W$ and $V$ are the weights corresponding to the forward and reverse hidden layer states, respectively; and $b_t$ is the bias corresponding to the hidden layer state at time $t$.
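The bidirectional combination can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' implementation. The cell uses the commonly published bias-free GRU equations, the function names are hypothetical, and a single bias vector b stands in for $b_t$.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU update on the concatenated vector [h_prev, x_t] (biases omitted)."""
    hx = np.concatenate([h_prev, x_t])
    r_t = sigmoid(W_r @ hx)                                  # reset gate
    z_t = sigmoid(W_z @ hx)                                  # update gate
    h_cand = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))
    return (1.0 - z_t) * h_prev + z_t * h_cand

def gru_pass(xs, params, hidden):
    """Run a GRU over xs (steps x features) and return every hidden state."""
    h = np.zeros(hidden)
    states = []
    for x_t in xs:
        h = gru_step(x_t, h, *params)
        states.append(h)
    return np.stack(states)

def bigru(xs, fwd_params, bwd_params, hidden, W, V, b):
    """Combine both directions as h_t = W h_fwd + V h_bwd + b."""
    h_fwd = gru_pass(xs, fwd_params, hidden)
    # The backward GRU reads the sequence in reverse; flipping its output
    # realigns each state with its original time step.
    h_bwd = gru_pass(xs[::-1], bwd_params, hidden)[::-1]
    return h_fwd @ W.T + h_bwd @ V.T + b
```

Because the backward pass starts from the end of the sequence, the state at each time step also reflects later inputs, which a single forward GRU cannot provide.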
2.3. CNN-BiGRU Prediction Model
Compared with traditional neural networks, CNN can efficiently and accurately extract features from data sets. However, when dealing with long time series data, CNN cannot effectively analyze the temporal features in the data, let alone accurately predict the time series. Therefore, this paper adopts the 1DCNN to convert the long time series into shorter ones composed of high-dimensional features, which are then passed to the next layer of the network through a pooling operation. Because diesel engine exhaust temperature prediction follows the continuity principle and the correlation principle, this paper links the exhaust temperature at the previous time, together with its influencing factors at the previous time, to the exhaust temperature prediction. The BiGRU neural network is used to train on the time series in both directions, so that it can learn the complete information of the entire series and effectively extract and use the temporal characteristics in the data.
In summary, this paper proposes a prediction model combining the CNN and the BiGRU, which makes full use of the advantages of the CNN in feature extraction and of the BiGRU in time series prediction, and improves prediction accuracy through this combined neural network. The framework of the marine diesel engine exhaust temperature prediction model is shown in Figure 4. The model is mainly divided into an input layer, a CNN layer, a BiGRU layer, a fully connected layer, and an output layer. Each layer in the model is described as follows:
(1) Input layer. The collected diesel engine exhaust temperature and other thermodynamic parameter data are used as the input of the prediction model, as expressed in Formula (4).
In the formula, xin represents the value of thermal parameter i collected at time n.
(2) CNN layer. This layer extracts the temporal and spatial characteristics of the input historical operation data of the diesel engine. It is composed of a convolution layer, a pooling layer, and a fully connected layer. The data processed by the convolution layer and the pooling layer are mapped to the hidden layer feature space and, after the fully connected layer, are output as the input of the BiGRU layer.
(3) BiGRU layer. A single-layer bidirectional GRU structure learns from the features extracted by the CNN layer, captures how their internal information changes over time, and passes the result to the fully connected layer. To prevent the model from overfitting, Dropout is added to this layer to randomly discard some hidden layer nodes during training.
(4) Fully connected layer. This layer converts the output of the BiGRU into prediction results and passes them to the output layer.
(5) Output layer. This layer outputs the predicted value of the exhaust temperature.
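To make the data flow through these layers concrete, the sketch below traces array shapes only; it performs no learning. All layer sizes are hypothetical. It assumes a stride-1 convolution without padding, non-overlapping pooling, a shape-preserving fully connected step inside the CNN layer, and a BiGRU that returns its final weighted-sum state.

```python
def cnn_bigru_shapes(steps, features, filters, kernel, pool, gru_units):
    """Trace (time steps, channels) through the layers listed above.

    Every size passed in is a hypothetical hyperparameter, not a value
    reported in the paper.
    """
    shapes = [("input", (steps, features))]
    conv_steps = steps - kernel + 1          # no padding, stride 1
    shapes.append(("conv1d", (conv_steps, filters)))
    pool_steps = conv_steps // pool          # partial trailing window dropped
    shapes.append(("pooling", (pool_steps, filters)))
    shapes.append(("bigru", (gru_units,)))   # final combined hidden state
    shapes.append(("fully connected / output", (1,)))
    return shapes
```

For example, a window of 30 time steps with 8 parameters, 16 filters of width 3, and a pooling size of 2 reaches the BiGRU as 14 steps of 16 features each, a shorter sequence with more features per step.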
5. Conclusions
This paper studies the problem of early warning of marine diesel engine faults and proposes a fault early warning method for marine diesel engines based on CNN-BiGRU. The main conclusions are as follows:
Using the Pearson correlation coefficient to select the operating parameters that are highly correlated with the exhaust temperature as the input of the prediction model can effectively reduce the model input dimension, simplify the model structure, reduce the model calculation time, and reduce the model training load.
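The selection step can be sketched as follows. The function name, the column names in the example, and the cut-off value are hypothetical; the threshold actually used is not stated in this section.

```python
import numpy as np

def select_by_pearson(data, names, target, threshold):
    """Return names of columns whose |Pearson r| with the target >= threshold.

    data: 2D array with one column per operating parameter.
    threshold: hypothetical cut-off, not a value taken from the paper.
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)   # column-by-column correlations
    t = names.index(target)
    return [name for j, name in enumerate(names)
            if j != t and abs(corr[t, j]) >= threshold]
```

Only the selected columns, together with the exhaust temperature history, would then be fed to the prediction model, which is what reduces the input dimension.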
For the prediction of the exhaust temperature of a marine power generation diesel engine, this paper proposes a prediction method combining CNN and BiGRU, which extracts the features of multi-dimensional input variables through the CNN and then uses the strengths of the BiGRU in time series prediction to predict the exhaust temperature. The prediction results show a small deviation between the predicted and actual values and a high prediction accuracy. To verify the effectiveness of the proposed model, it is compared with RNN, LSTM, GRU, and BiGRU. First, the proposed model converges significantly faster than the other models during training. Second, the time required by each model to predict one step is compared. The proposed model needs only 0.066 s per step, which is 0.424 s, 0.364 s, 0.342 s, and 0.264 s less than the other four models, respectively, reflecting its advantage in prediction time. Finally, the prediction results of CNN-BiGRU are compared with those of the four models: the MSE is lower by 0.0307, 0.0233, 0.013, and 0.0078, respectively; the MAE by 0.0225, 0.0174, 0.0092, and 0.0102, respectively; and the MAPE by 0.0000477, 0.000037, 0.0000191, and 0.0000236, respectively. These experimental results show that the proposed method achieves higher prediction accuracy, which further demonstrates the advantages of the CNN-BiGRU combined model in time series prediction.
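For reference, the three error metrics used in this comparison follow their standard definitions, sketched below. MAPE is returned here as a fraction rather than a percentage, which is an assumption about the convention used.

```python
import numpy as np

def mse(actual, predicted):
    """Mean squared error."""
    a, p = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.mean((a - p) ** 2))

def mae(actual, predicted):
    """Mean absolute error."""
    a, p = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(a - p)))

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual must be nonzero."""
    a, p = np.asarray(actual, dtype=float), np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((a - p) / a)))
```

Because MAPE divides by the actual value, it stays small when the measured quantity is large relative to the prediction error.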
Through the comparison of the forecast duration and the analysis of the forecast residual value, the forecast duration is determined, and the alarm threshold for the forecast residual value and standard deviation value is set. Through the experimental verification analysis, the method used can detect the abnormal state of the diesel engine in time and provide strong support for the state detection and health management of the marine diesel engine.
In summary, the fault early warning method for the marine diesel engine proposed in this paper not only provides a new reference for the health management of the marine diesel engine but also has reference value for future state prediction and anomaly detection in intelligent marine equipment. However, condition prediction and fault warning have so far been performed only for the marine diesel engine under steady-state operation. In addition, the model can recognize the abnormal operating state of a diesel engine at the early stage of a fault but cannot diagnose the fault. In future work, we will collect operating data under different states to identify the operating state of the diesel engine, and we will collect as much early fault data as possible to achieve fault localization and continue to improve the functions of the model.