A Sparse Autoencoder and Softmax Regression Based Diagnosis Method for the Attachment on the Blades of Marine Current Turbine

The development and application of marine current energy are attracting increasing attention around the world. Because of the harshness of its working environment, fault diagnosis of a marine current generation system is both important and difficult. In this paper, after comparing different sensors, the underwater image is chosen as the fault-diagnosing signal. A diagnosis method based on the sparse autoencoder (SA) and softmax regression (SR) is proposed: the SA is used to extract features and the SR is used to classify them. The images are used to monitor whether a blade is fouled by benthos and to determine the corresponding degree of attachment. The experimental results show that, compared with other methods, the proposed method diagnoses blade attachment with higher accuracy.


Introduction
To date, reducing carbon emissions has become a consensus around the world. It is urgent to adjust the energy structure, reduce the dependence on fossil energy, and increase the use of sustainable energy, which makes wind, solar, and marine current energies [1][2][3] more and more attractive. Wind and solar energy systems are greatly affected by the environment, occupy a lot of land resources, and bring noise and visual pollution to surrounding residents. Marine current energy can avoid these problems. The marine current mainly refers to the steady flow in submarine channels and the regular flow of water caused by the tides [4]. The flow of the marine current is stable, and the flow rate is kept within a certain range all year round [5], so power can be generated continuously [6,7]. Marine current energy is an inexhaustible green energy resource, and the marine current turbine (MCT) is largely independent of weather conditions [8]. However, compared with the terrestrial environment, the undersea working environment is more complex. In addition to the traditional generator faults, the MCT system is also influenced by the marine environment, such as attachment and biofouling [9,10], which affect the normal operation of the electrical equipment. On the other hand, the marine current generation system is affected by solar and lunar gravity and by the surge. The resulting instability of the current flow rate [11,12] makes the MCT work in a complicated environment for a long time, which makes the detection and diagnosis of MCT faults more difficult. The faults can cause great damage to the whole system if not found and dealt with in time. The conventional faults caused by attachment include rotor asymmetries, increased surface roughness and deformation of the blade [13]. In addition, the metal parts are more easily corroded when covered by attachment [8].
When sea creatures attach to the blades, blade imbalance occurs and the hydrodynamic performance degrades. Different degrees of imbalance can be explicitly distinguished under waves, but cannot be distinguished under conditions of turbulence.
In addition to the rotor asymmetries caused by imbalanced attachment, the increased surface roughness and the deformation of the blade are also important; these two kinds of faults are mainly caused by symmetrical or uniform attachment. For example, the output voltage signals are sampled under the healthy condition and under uniform attachment, and FFT (Fast Fourier Transform) is used to analyze the sampled signals. The results are shown in Figure 1. It is difficult to distinguish the healthy condition from the uniform attachment condition by the amplitude and main frequency of the output voltage, which makes an accurate diagnosis based on the electrical signal challenging under increased surface roughness and blade deformation. An acoustic signal has also been used to diagnose increased surface roughness of the blade for wind turbines [13]; however, much of the acoustic signal is lost in the undersea environment [21]. The MCT's image is therefore used as the fault-diagnosing signal in this paper.
The undersea environment is different from that on land, as there is no natural source of light. Underwater imaging systems have to rely on artificial light for illumination, which causes problems of light absorption, reflection, bending, scattering and poor visibility [32]. Therefore, the image feature extraction method is a key point for diagnosing faults based on image classification.
The MCT is salvaged from undersea with a thin attachment [8]. In addition, real biofilms could not be grown on a rotating turbine or tested in the towing tank [33]; in reference [33], the blades were fouled with a 1.1 mm thick layer of lithium grease. The ropes used to simulate attachment in this paper are shown in Figure 2. Marine biofouling is a process from initial attachment to biological reproduction and takes about three weeks [9]. By analyzing the images, the degree of attachment, and consequently the degree of fault, can be estimated in time. This kind of diagnosis method has been applied in cancer-image processing and has achieved promising results, such as in the diagnosis of breast cancer [34].

The Sparse Autoencoder and Softmax Regression Based Diagnosis Method
The diagnosis method proposed in this paper is divided into four steps as shown in Figure 3.
Step 1, preprocessing the unlabeled images to pre-train the convolution kernels; Step 2, convolving the labeled images with the convolution kernels to obtain the convolved features of each image in the labeled samples; Step 3, transforming the convolved features into pooled features by a pooling operation; and finally, Step 4, putting the pooled features into the softmax classifier to diagnose the fault category.


Image Data Preprocessing
The MCT images are used to extract patches for effective feature extraction. We extracted 500 patches of 20 × 20 pixels per channel (3 channels for each patch) from each image as the unlabeled learning samples, which are arranged in a matrix X_unlabel = [x^1_unlabel, ..., x^k_unlabel, ...], where x^k_unlabel is the kth column of X_unlabel. Then the zero-mean and zero-phase component analysis (ZCA) whitening technique [35] is used to calculate the matrix X_whitening. ZCA preprocessing effectively reduces the correlation of the raw MCT images, so that the sparse autoencoder's input has low correlation.
The whitening is computed from the eigendecomposition C_X = U S U^T of the covariance matrix C_X = (1/m) X*_unlabel (X*_unlabel)^T, as

X_whitening = U (S + εI)^(−1/2) U^T X*_unlabel,

where x*^k_unlabel is the kth column of the zero-mean matrix X*_unlabel; C_X is the covariance matrix of X*_unlabel; m is the number of samples; S is the diagonal matrix of eigenvalues and U the matrix of eigenvectors of C_X; and ε is the regularization parameter.
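The ZCA whitening step above can be sketched as follows. This is a minimal NumPy illustration, not the authors' code; the function name `zca_whiten`, the toy data, and the value ε = 0.1 are all assumptions for demonstration.

```python
import numpy as np

def zca_whiten(X, eps=0.1):
    """ZCA-whiten a data matrix X (features x samples).

    eps is the regularization parameter added to the eigenvalues.
    """
    # Zero-mean each feature (row) across the m samples
    X = X - X.mean(axis=1, keepdims=True)
    m = X.shape[1]
    # Covariance matrix C_X and its eigendecomposition C_X = U S U^T
    C = X @ X.T / m
    S, U = np.linalg.eigh(C)
    # X_whitening = U (S + eps I)^(-1/2) U^T X
    return U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T @ X

# Toy usage: 1200-dimensional patches (20 x 20 x 3 pixels), 100 samples
X = np.random.rand(1200, 100)
Xw = zca_whiten(X)
```

Because the rescaled eigen-basis is rotated back by U, ZCA (unlike PCA whitening) keeps the whitened patches in pixel space, which is why the decorrelated patches still look like images.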

Pre-Training Convolutional Kernels Based on Sparse Autoencoder
In classical CNN training, the convolutional kernels and the softmax parameters are trained simultaneously. In this paper, the convolutional kernels are trained before the softmax parameters; since the two are trained asynchronously, SA is used to train the convolutional kernels. Figure 4 shows the structure of the SA neural network. It has three layers: the input layer (L1), the hidden layer (L2) and the output layer (L3), where "+1" is the bias coefficient. SA is an unsupervised learning algorithm because its ideal output equals its input, which means that it can learn features from the training data by itself. Assume the preprocessed input matrix X_whitening = [x^1, ..., x^80000], where x^k is the kth column of X_whitening, x^k ∈ R^n, and n = 1200 is the number of pixels of each patch. W^(1)_ji, for i = 1, ..., s1, j = 1, ..., s2, denotes the weight connecting the ith neuron of the input layer to the jth neuron of the hidden layer, and b^(1) is the bias of the hidden layer. W^(2)_ij, for i = 1, ..., s3, j = 1, ..., s2, denotes the weight connecting the jth neuron of the hidden layer to the ith neuron of the output layer, and b^(2) is the bias of the output layer. Here s1 = 1200 is the number of neurons in the input layer, s2 = 800 the number in the hidden layer, and s3 = 1200 the number in the output layer. W^(1)_ji, W^(2)_ij, b^(1) and b^(2) are trainable parameters, trained by the forward and backward propagation method. The activation function of the hidden layer is the sigmoid function and that of the output layer is a proportional (linear) function. The optimal parameter values are calculated by the L-BFGS algorithm [36]. Finally, the weights of the hidden layer are the learned features: after pre-training based on SA, the weights between the input layer and the hidden layer are reshaped into convolutional kernels for extracting the convolution features.

The forward propagation is

z^(2)_j = Σ_{i=1}^{s1} W^(1)_ji x_i + b^(1)_j,   a^(2)_j = sigmoid(z^(2)_j),
z^(3)_i = Σ_{j=1}^{s2} W^(2)_ij a^(2)_j + b^(2)_i,   a^(3)_i = t · z^(3)_i,

where x_i is the ith component of the vector x; z^(2)_j and a^(2)_j are the input and output of the activation function of the jth neuron of the hidden layer, respectively; z^(3)_i and a^(3)_i are the input and output of the activation function of the ith neuron of the output layer, respectively; and t is the proportionality coefficient.
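The SA pre-training described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sparsity penalty on the hidden activations is omitted for brevity, the layer sizes are shrunk from 1200/800 to 16/8, the weight-decay value `lam` is assumed, and the gradient is left to finite differences rather than analytic backpropagation.

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(theta, s1, s2):
    """Split a flat parameter vector into W1 (s2 x s1), b1, W2 (s1 x s2), b2."""
    i = 0
    W1 = theta[i:i + s2 * s1].reshape(s2, s1); i += s2 * s1
    b1 = theta[i:i + s2]; i += s2
    W2 = theta[i:i + s1 * s2].reshape(s1, s2); i += s1 * s2
    b2 = theta[i:i + s1]
    return W1, b1, W2, b2

def cost(theta, X, s1, s2, t=1.0, lam=1e-4):
    """Reconstruction cost of the autoencoder on X (s1 x m samples)."""
    W1, b1, W2, b2 = unpack(theta, s1, s2)
    a2 = sigmoid(W1 @ X + b1[:, None])   # sigmoid hidden layer
    a3 = t * (W2 @ a2 + b2[:, None])     # proportional (linear) output layer
    m = X.shape[1]
    # Mean squared reconstruction error + weight decay
    return 0.5 / m * np.sum((a3 - X) ** 2) + 0.5 * lam * (np.sum(W1**2) + np.sum(W2**2))

# Tiny example: 16-pixel patches, 8 hidden units
rng = np.random.default_rng(0)
s1, s2 = 16, 8
X = rng.random((s1, 50))
theta0 = rng.normal(scale=0.01, size=2 * s1 * s2 + s1 + s2)
res = minimize(cost, theta0, args=(X, s1, s2), method="L-BFGS-B",
               options={"maxiter": 50})
W1_trained = unpack(res.x, s1, s2)[0]  # rows of W1 are the learned features/kernels
```

After training, each of the s2 rows of W1 would be reshaped back into a 2-D patch and used as a convolution kernel in the next step.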

Features Extraction Based on Convolution and Pooling
Local connection and weight sharing are the characteristics of the convolution layer, so using convolution can reduce the number of parameters and training complexity. In addition, the convolutional and pooling architecture can learn invariant features and reduce over-fitting [37]. Firstly, the convolved features will be extracted from each image, and then the pooled features will be obtained by aggregating the convolved features.
A different feature activation value is obtained at each location in the image by convolving the image with the convolution kernels pre-trained in the previous step. Specifically, if one image has D_image × D_image pixels and a convolution kernel has D_patch × D_patch pixels, the dimension of the convolved features is (D_image − D_patch + 1) × (D_image − D_patch + 1) [30]. Assuming the number of kernels of the hidden layer equals n_h, the dimension of the convolved features is n_h × (D_image − D_patch + 1) × (D_image − D_patch + 1). The pooling operation is then introduced to reduce the dimension of the convolved features, while maintaining the invariant information and reducing over-fitting. Since the features of each category are not complex, mean pooling is used in this paper [30].
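With the paper's dimensions (D_image = 320, D_patch = 20, so the convolved features are 301 × 301), the convolution and mean pooling can be sketched as below. This is a NumPy sketch, not the authors' code; the pooling size 43 is an assumption chosen only because 301 = 7 × 43, as the actual pooling size is not stated here.

```python
import numpy as np

def convolve_valid(image, kernel):
    """'Valid' 2-D convolution (cross-correlation, as usual in CNNs):
    output is (D_image - D_patch + 1) x (D_image - D_patch + 1)."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum('ijkl,kl->ij', windows, kernel)

def mean_pool(feat, p):
    """Mean pooling over non-overlapping p x p regions (edges beyond a
    multiple of p are truncated)."""
    d = feat.shape[0] - feat.shape[0] % p
    return feat[:d, :d].reshape(d // p, p, d // p, p).mean(axis=(1, 3))

image = np.random.rand(320, 320)   # one channel of an MCT image
kernel = np.random.rand(20, 20)    # one pre-trained 20 x 20 kernel
conv = convolve_valid(image, kernel)   # shape (301, 301)
pooled = mean_pool(conv, 43)           # shape (7, 7)
```

Repeating this for each of the n_h kernels and each channel, then flattening the pooled maps, yields the feature vector fed to the softmax classifier.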


Faults Classification Based on Softmax Classifier
After Step 3, the pooled features are obtained for training the classifier. Different categories and labels are set according to the different attachment degrees. The pooled features are the input of the softmax classifier. Let θ be its parameter matrix; the L-BFGS iterative algorithm is used to obtain θ.
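The softmax classification step can be sketched as follows. This is an illustrative sketch with toy two-class data rather than the paper's eight categories; the bias is handled by appending a constant feature, and the weight-decay value `lam` is assumed.

```python
import numpy as np
from scipy.optimize import minimize

def softmax_cost(theta_flat, X, y, k, lam=1e-4):
    """Negative log-likelihood of softmax regression with weight decay.

    X: (n, m) pooled-feature matrix; y: integer labels in 0..k-1.
    """
    n, m = X.shape
    theta = theta_flat.reshape(k, n)
    logits = theta @ X                           # (k, m)
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=0, keepdims=True)
    loglik = np.log(p[y, np.arange(m)] + 1e-12).sum()
    return -loglik / m + 0.5 * lam * np.sum(theta ** 2)

def predict(theta_flat, X, k):
    theta = theta_flat.reshape(k, X.shape[0])
    return np.argmax(theta @ X, axis=0)

# Toy data: two well-separated classes of 10-dimensional "pooled features"
rng = np.random.default_rng(1)
X = np.hstack([rng.normal(0, 1, (10, 40)), rng.normal(3, 1, (10, 40))])
y = np.array([0] * 40 + [1] * 40)
Xa = np.vstack([X, np.ones(X.shape[1])])  # append a constant 1 as the bias feature
res = minimize(softmax_cost, np.zeros(2 * 11), args=(Xa, y, 2), method="L-BFGS-B")
acc = (predict(res.x, Xa, 2) == y).mean()
```

In the paper's setting, k would be 8 (one class per attachment degree) and X would hold the pooled features of the labeled training images.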

Experimental Platform
In order to obtain a rich diversity of samples, each category is sampled from the blade in four different configurations, as shown in Figure 5. In this experiment, the speed of the water current is set to 0.6 m/s. In total, 860 images with RGB channels were collected by the underwater camera, which has 1.2 million pixels; the sampling frequency is 1 Hz and the luminous flux of the fluorescent lamp is 1700 lm. After remote transmission, each channel is represented by a matrix of size 320 × 320. Among them, 160 images are selected as unlabeled pre-training samples, 420 images as labeled training samples, and the remaining 280 images as testing samples. Detailed information is given in Tables 1 and 2.
In this paper, for simplicity and without losing generality, we defined eight categories according to the proportion of the area covered by attachment, as shown in Figure 6.

Figure 7 shows the experimental platform of the MCT, which is a 230 W direct-drive permanent magnet synchronous motor prototype. The whole system mainly consists of three parts: (1) the permanent magnet synchronous generator (PMSG) prototype; (2) the marine current simulation system (flow rate adjustable from 0.2 m/s to 1.5 m/s); and (3) the data monitoring and collection system. This platform can simulate stationary current, waves and turbulence. Table 3 gives the main parameters of the system. Table 2. Detail of dataset.

Dataset's Name                  Number
Unlabeled pre-training sample   160
Labeled training sample         420
Testing sample                  280


Experimental Results and Comparison
Besides using the SA neural network and the softmax classifier for feature extraction and classification, this paper also uses a CNN for feature extraction and classification, as well as the PCA (Principal Component Analysis) algorithm [38,39] for feature extraction combined with a BP neural network for classification [40], and compares the results of the different methods. The PCA algorithm is used to produce kernels from X_whitening and the BP neural network is used to classify the faults, so their combination can both produce kernels and classify faults, as seen in Table 4. Meanwhile, the weights of the proposed method differ from the CNN's, because the CNN trains its kernels and softmax parameters simultaneously. Table 5 shows all the parameters of the SA in the training step and Figure 8 shows a flow chart of all the steps. The parameters of the compared methods are shown in Table 4. Only the training of the softmax parameters and convolution kernels of the CNN varies; its architecture is the same throughout. The diagnosis results are shown in Table 6.
This means that representative characteristics are obtained by the proposed method. As a result, the softmax classifier presents better performance than the BP classifier, and it also shows a more stable diagnosis accuracy. The experimental results also show that the feature extraction ability of SA is better than that of PCA, whatever the value of its CPV (95% or 99%).

Conclusions
Due to the harshness of the MCT's working environment, the underwater image is chosen as the fault-diagnosing signal to classify the different degrees of the MCT's biological attachment. This paper proposes a diagnosis method based on a sparse autoencoder and softmax regression, which consists of four parts: (1) preprocessing the unlabeled images to pre-train the convolution kernels; (2) convolving the labeled images with the convolution kernels to obtain the convolved features of each image in the labeled samples; (3) transforming the convolved features into pooled features by a pooling operation; and (4) putting the pooled features into the softmax classifier to diagnose the fault category. The SA is used to create the kernels and the SR is used to classify the features. The images are used to monitor whether a blade is fouled by benthos and then to determine its corresponding degree of attachment. This paper also compares the simultaneous training method (CNN) with other asynchronous training methods (PCA for kernel production and BP for classification). The experimental results and the comparison with other methods show that the proposed method is useful for classifying the different degrees of biological attachment, and it can also be applied to other fields [41][42][43][44][45][46]. However, only the percentage of the area occupied by attachment is diagnosed in this paper; the types of attachment are not considered. In addition, the training time of the proposed method is too long. In future work, we will consider the color and the thickness of the attachment, and we will simplify the algorithm to speed up the training.
Funding: This paper was supported by Shanghai Natural Science Foundation (16ZR1414300) and National Natural Science Foundation of China (61673260).