Classification of Marine Vessels with Multi-Feature Structure Fusion

Abstract: The classification of marine vessels is an important problem in maritime traffic. To fully exploit the complementarity between different features and to identify marine vessels more effectively, we propose a novel feature structure fusion method based on spectral regression discriminant analysis (SF-SRDA). First, we select the convolutional neural network features that best describe the characteristics of ships and construct a graph-based structure for each feature using a similarity metric. Then we weight and concatenate the features and fuse their structures under a linear relationship assumption. Finally, we construct an optimization formula and solve for the fused features and structure using spectral regression discriminant analysis. Experiments on the VAIS dataset show that the proposed SF-SRDA method reduces the feature dimension from the original 102,400 to 5, and that the daytime classification accuracy reaches 87.60% for visible images and 74.68% for infrared images. The experimental results demonstrate that the proposed method can not only extract the optimal features from the original redundant feature space but also greatly reduce the feature dimension, while achieving promising classification performance.


Introduction
The classification of marine vessels is an important issue in maritime safety and traffic control, with broad applications in both civil and military industries [1]. Compared with other target recognition problems, it is more difficult because of large changes in viewing perspective, illumination conditions, and scale, and because the image background is cluttered [2].
According to the imaging modality, marine vessel images mainly comprise synthetic aperture radar (SAR) images, spaceborne optical images (SOI), visible images, and infrared (IR) images. Because SAR provides all-day, all-weather imaging, Eldhuset et al. [3] developed an automatic ship wake detection system for spaceborne SAR images in 1996. However, the number of SAR sensors is limited, the revisit period is long, and the resolution is low. In 2010, Zhu et al. [4] conducted experiments on higher-resolution SOI image sets captured by optical sensors on multiple satellites; their method effectively distinguished ships from non-ships and obtained satisfactory ship detection performance. Satellite resources, however, are likewise limited, and collecting images with a camera is clearly more convenient. In 2015, Zhang et al. [5] published the world's first marine vessel dataset with paired visible and infrared images, which led to progress in marine vessel classification research. The visible image has sufficient detail and color information, and the infrared image adapts well to the environment, so combining the two modalities can yield higher vessel classification accuracy.
From the perspective of feature extraction, there are two types of methods for classifying marine vessels: methods based on traditional features and methods based on convolutional neural networks (CNNs). Methods based on traditional features rely on hand-designed feature vectors for target recognition and classification [6,7]. Zhang Difei et al. [8] combined the Histogram of Oriented Gradients (HOG) feature with a support vector machine (SVM) to identify and classify infrared ship targets on the sea surface, which overcomes background interference to a certain extent. However, their experiments did not test multiple targets, deformation, illumination changes, or other variations. Feineigle et al. [9] used SIFT descriptors to identify ship targets in port and achieved invariance to illumination and viewing angle by describing and matching local features of the target. Yet extracting target features with a sliding window produced high-dimensional features, which lowered computational efficiency. Sánchez et al. [10] combined the Fisher vector with a Gaussian mixture model to linearly classify large-scale datasets; the targets covered more than one thousand categories, and the best classification accuracy was obtained by optimizing the loss function. The literature [11] observed that different features have their own advantages and disadvantages, and therefore combined three features: multi-scale completed local binary patterns (MS-CLBP), bag of visual words (BOVW), and spatial pyramid matching (SPM). Methods based on CNNs mainly select the features of a certain layer of a convolutional neural network and then apply an SVM, extreme learning machine (ELM), or logistic regression classifier [12][13][14][15][16][17][18].
The literature [5] used the 15th-layer features extracted from a VGG-16 model pre-trained on the ImageNet dataset. Literature [12] adopted AlexNet, VGG-16, and Inception-V3 to extract image features and then normalized the different features into the same feature space. Literature [13] designed a convolutional neural network for feature extraction and combined the traditional Gabor and MS-CLBP features to describe ship targets. Literature [16] trained an extreme learning machine to exploit the correlation of multi-color features. Literature [17] extracted features from a pre-trained 16-layer convolutional neural network (VGG-16) and trained a logistic regression classifier for object recognition. Literature [18] constructed a dual-flow DNN to extract the high-frequency and low-frequency features of ship images, with an ELM finally aggregating the features and making the decision. Literature [19] proposed a classification framework consisting of a multi-feature ensemble based on a convolutional neural network (ME-CNN). Literature [20] introduced a new ELM-based approach to learn discriminative CNN features. Compared with methods based on traditional features, methods based on CNNs show powerful feature extraction capabilities.
Because a single feature may not be comprehensive enough to represent an image, some scholars have further explored feature fusion. Most existing feature fusion methods simply concatenate different features in series [12,13], which does not consider the heterogeneous characteristics of the different features. Sun et al. [21] adopted Canonical Correlation Analysis (CCA) for feature fusion to compress the dimension of the feature vector. Subsequently, KCCA (Kernel CCA) [22] and OCCA (Orthogonal CCA) [23] were presented for feature fusion. Lin et al. [24][25][26][27] proposed a multi-feature structure fusion method that achieved good recognition results in many fields. The method first constructs the internal structure of each feature with a similarity measure and then performs algebraic operations on the corresponding structures and features based on locality preserving projection (LPP) [28], projecting the features from the combined high-dimensional space to a low-dimensional space. The method not only retains the internal structure of the features but also greatly reduces their dimension and improves object classification performance.
Although structure fusion can fuse different features together, it is an unsupervised learning method. Its weakness is that the category information of the features is not integrated into the fusion process, so the natural distribution structure of each class across the multiple features is usually ignored, even though this structural information can enhance discrimination in object classification. To solve this issue, this paper proposes a novel multi-feature structure fusion method based on spectral regression discriminant analysis (SF-SRDA), which combines structure fusion [24] with linear discriminant analysis (LDA) [29,30]. SF-SRDA is a supervised dimension reduction technique, so it can not only preserve the internal structure of the category information during feature fusion but also select a minimal-dimension feature that still describes the target for object recognition and classification. The overall framework of the method is shown in Figure 1 (in this paper, the multi-feature used for structure fusion consists of two feature types), and the focus of this paper, which is detailed in Section 3, is marked with a red box. The main contributions of this paper can be summarized as follows: (1) We propose a feature structure fusion method based on LDA. The method can not only maintain the internal structure of the features but also integrate the category information to improve recognition performance. (2) Because both the category information and the intrinsic structure are considered, the feature dimension can be greatly reduced, from 102,400 to 5 dimensions. (3) The experimental results are promising for marine vessel classification.
Appl. Sci. 2019, 9, x FOR PEER REVIEW
The remainder of this paper is organized as follows. Section 2 presents the selection of multiple features. Section 3 describes the construction of the internal structure of the features and the structure fusion mechanism. Section 4 gives the experimental results and a comparison with other state-of-the-art methods. Finally, Section 5 concludes and summarizes the work.

Feature Selection
To select features that better describe the visual characteristics of vessels, we conducted experiments with typical traditional features and CNN features drawn from the existing literature. The traditional features are HOG [6] and LBP [31]. The CNN features are extracted from layers at different depths of the pre-trained VGG [32], GoogLeNet [33], and ResNet [34] models. Each feature is fed to an SVM classifier. The experiments use the visible and IR vessel images provided in [5]. Table 1 shows the dimensionality of each feature and its classification accuracy on visible and IR images. In Table 1, the relu5-4 layer features of VGG-19 [32] and the pool5 layer features of ResNet-152 [34] achieve higher recognition accuracy and, because the two network structures differ substantially, are more complementary, so we selected these two types of features for the subsequent fusion experiments. Figure 2 shows the structures of VGG-19 and ResNet-152, in which the layer outputs selected as features are marked with a red box.


Multi-feature Structure Fusion Based on Linear Discriminant Analysis
Linear discriminant analysis (LDA) is a popular dimensionality reduction technique that is widely used in pattern recognition. However, LDA does not consider the natural distribution structure of each class across multiple features, so the complementary, class-labeled structure information between the features is lost after dimensionality reduction. Mining this structure information is a key question in vessel target recognition. Among existing methods, structure fusion [24] fuses multiple structures but does not take the category information of the features into account, which leaves feature discrimination insufficient. Building on both linear discriminant analysis and structure fusion, we propose a multi-feature structure fusion method that uses the class labels in a supervised way.

Linear Discriminant Analysis
LDA [29] is a supervised dimensionality reduction technique. Its idea is to minimize the variance within each class while maximizing the variance between classes: after the data are projected into a low-dimensional space, samples of the same category lie as close together as possible and samples of different categories lie as far apart as possible. The optimal projection of LDA can be obtained by eigenvalue decomposition of the divergence matrices of the given training data. A brief introduction to LDA and spectral regression discriminant analysis (SRDA) [29] follows.
Suppose there are $m$ samples $x_1, x_2, \ldots, x_m$; the optimization function of LDA is shown in Equation (1):

$$a^{*} = \arg\max_{a} \frac{a^{\top} S_b a}{a^{\top} S_w a} \tag{1}$$

$$S_b = \sum_{k=1}^{c} m_k \left(\mu^{(k)} - \mu\right)\left(\mu^{(k)} - \mu\right)^{\top} \tag{2}$$

$$S_w = \sum_{k=1}^{c} \sum_{i=1}^{m_k} \left(x_i^{(k)} - \mu^{(k)}\right)\left(x_i^{(k)} - \mu^{(k)}\right)^{\top} \tag{3}$$

where $c$ is the number of classes, $\mu$ is the mean vector of all samples, $m_k$ is the number of samples of the $k$th class, $\mu^{(k)}$ is the mean vector of the $k$th class, $x_i^{(k)}$ is the $i$th sample in the $k$th class, $S_w$ is the intra-class divergence matrix, and $S_b$ is the inter-class divergence matrix.
Define the total divergence matrix $S_t = \sum_{i=1}^{m} (x_i - \mu)(x_i - \mu)^{\top}$. Then $S_t = S_b + S_w$, and the optimization function of LDA in Equation (1) is equivalent to Equation (4):

$$a^{*} = \arg\max_{a} \frac{a^{\top} S_b a}{a^{\top} S_t a} \tag{4}$$

The optimization problem of Equation (4) is equivalent to solving the following generalized eigenvalue problem:

$$S_b a = \lambda S_t a \tag{5}$$

Then, Equation (2) can be converted to Equation (6):

$$S_b = \sum_{k=1}^{c} m_k \left(\frac{1}{m_k}\sum_{i=1}^{m_k} \bar{x}_i^{(k)}\right)\left(\frac{1}{m_k}\sum_{i=1}^{m_k} \bar{x}_i^{(k)}\right)^{\top} = \sum_{k=1}^{c} \bar{X}^{(k)} W^{(k)} \left(\bar{X}^{(k)}\right)^{\top} = \bar{X} W \bar{X}^{\top} \tag{6}$$

where $W^{(k)}$ is an $m_k \times m_k$ matrix in which all elements are $1/m_k$, and $W$ is the $m \times m$ block-diagonal matrix

$$W = \begin{bmatrix} W^{(1)} & 0 & \cdots & 0 \\ 0 & W^{(2)} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & W^{(c)} \end{bmatrix} \tag{7}$$

Here $\bar{x}_i = x_i - \mu$ stands for the centralized data point, $\bar{X}^{(k)} = [\bar{x}_1^{(k)}, \ldots, \bar{x}_{m_k}^{(k)}]$ denotes the centralized data matrix of the $k$th class, and $\bar{X} = [\bar{X}^{(1)}, \ldots, \bar{X}^{(c)}]$ is the centralized data matrix. Since $S_t = \bar{X}\bar{X}^{\top}$, the generalized eigenvalue problem of Equation (5) can be converted as follows:

$$\bar{X} W \bar{X}^{\top} a = \lambda \bar{X} \bar{X}^{\top} a \tag{8}$$

To solve the eigenvector problem of LDA in Equation (8) more effectively, the literature [29] proposed spectral regression discriminant analysis (SRDA). Let $\bar{X}^{\top} a = y$; then Equation (8) can be transformed into:

$$W y = \lambda y \tag{9}$$

Indeed, if $W y = \lambda y$ and $\bar{X}^{\top} a = y$, then $\bar{X} W \bar{X}^{\top} a = \bar{X} W y = \lambda \bar{X} y = \lambda \bar{X}\bar{X}^{\top} a$. The eigenvalues $\lambda$ of Equations (8) and (9) are therefore the same, and a vector $a$ that satisfies $\bar{X}^{\top} a = y$ is an eigenvector of Equation (8). One way to solve $\bar{X}^{\top} a = y$ is the least squares method, as shown in Equation (10):

$$a = \arg\min_{a} \sum_{i=1}^{m} \left(a^{\top} \bar{x}_i - y_i\right)^{2} \tag{10}$$

By solving Equations (9) and (10), the mapping matrix $A = [a_1, a_2, \ldots, a_{c-1}]$ is obtained. The features after dimension reduction are then given by $Y = A^{\top} X$, where each projected sample is a $(c-1)$-dimensional vector.
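The two-step SRDA procedure of Equations (9) and (10) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `block_weight_matrix`, `srda_responses`, and `srda_fit` are hypothetical, the responses are built by orthogonalizing class-indicator vectors against the all-ones vector with a QR factorization, and plain least squares (`numpy.linalg.lstsq`) is used without any regularization.

```python
import numpy as np

def block_weight_matrix(labels):
    """W of Equation (7): entry (i, j) is 1/m_k if samples i and j share class k, else 0."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    counts = same.sum(axis=1)              # m_k for the class of each sample
    return same / counts[:, None]

def srda_responses(labels):
    """Return c - 1 eigenvectors of W (eigenvalue 1) orthogonal to the all-ones vector."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # All-ones vector first, then c - 1 class indicators (the last one is redundant).
    cols = [np.ones(labels.size)]
    cols += [(labels == k).astype(float) for k in classes[:-1]]
    Q, _ = np.linalg.qr(np.column_stack(cols))   # orthogonalization via QR
    # The all-ones direction carries no information for centralized data, so drop it.
    return Q[:, 1:]

def srda_fit(X, labels):
    """X has shape (d, m), one sample per column. Returns A of shape (d, c - 1) and the mean."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu                                   # centralized data matrix
    Y = srda_responses(labels)                    # solutions of Equation (9)
    A, *_ = np.linalg.lstsq(Xc.T, Y, rcond=None)  # least squares of Equation (10)
    return A, mu
```

With `A, mu = srda_fit(X, labels)`, a sample matrix is projected as `A.T @ (X - mu)`, which has `c - 1` rows. For six classes this gives the 5-dimensional features reported in the paper.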

Structure Fusion Mechanism
Structure fusion [24] means that the structures of different features (the internal structure of a feature is represented by a similarity measure) are merged by algebraic optimization. The combined feature is then mapped to a new structure-fusion feature by a mapping matrix that takes the fused structure into account. On this basis, the literature [24] proposed a structure fusion method based on locality-preserving projection (SFLPP).
Suppose $X_1 = [x_{11}, x_{12}, \ldots, x_{1m}]$ and $X_2 = [x_{21}, x_{22}, \ldots, x_{2m}]$ are the high-dimensional feature sets of the multi-feature, where $x_{1i} \in \mathbb{R}^{D_1}$ and $x_{2i} \in \mathbb{R}^{D_2}$ ($i = 1, 2, \ldots, m$). These feature matrices are then combined into the feature $X_3$. The internal structure of each feature, $W_k = [W_{kij}]$ ($k = 1, 2$; $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, m$), is measured by a similarity measure computed from the $\chi^2$ metric distance as in Equation (11); the specific formula of the $\chi^2$ metric is described in the literature [24].

$$W_{kij} = \begin{cases} e^{-d(x_{ki}, x_{kj})/\sigma_k}, & x_{ki} \text{ and } x_{kj} \text{ are neighbors} \\ 0, & x_{ki} \text{ and } x_{kj} \text{ are not neighbors} \end{cases} \tag{11}$$

Here $W_1$ and $W_2$ are the single-feature structures of $X_1$ and $X_2$, respectively. The structure of the combined feature $X_3$ can be represented as $W = W_1 + W_2$. The literature [24] demonstrates that $W$ has the same characteristics as $W_1$ and $W_2$ because of their linear relationship, so $W$ can indirectly represent the internal structure of the combined feature $X_3$. By performing a specific optimization on the combined feature $X_3$ and its internal structure $W$, the combined feature can be mapped to a new structure-fusion feature. More details can be found in [24] and [25]; the former gives the optimization formula based on LPP. Since LPP is an unsupervised method, the category information is not integrated into the feature structure fusion process. Therefore, the recognition performance of SFLPP can be further improved by considering the category information.
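The structure construction of Equation (11) and the fusion $W = W_1 + W_2$ can be illustrated as follows. This is a hedged sketch rather than the method of [24]: the $\chi^2$ distance uses one common form (the exact formula is in [24]), "neighbor" is assumed to mean a k-nearest-neighbor relation, the default bandwidth is assumed to be the mean pairwise distance, and the names `chi2_distance` and `structure_matrix` are hypothetical.

```python
import numpy as np

def chi2_distance(a, b, eps=1e-12):
    """One common chi-square distance for non-negative vectors (assumed form)."""
    return 0.5 * np.sum((a - b) ** 2 / (a + b + eps))

def structure_matrix(X, n_neighbors=3, sigma=None):
    """Similarity structure in the spirit of Equation (11).

    X has shape (d, m), one non-negative feature vector per column.
    Returns a symmetric (m, m) matrix with zeros for non-neighbors.
    """
    X = np.asarray(X, dtype=float)
    m = X.shape[1]
    D = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            D[i, j] = D[j, i] = chi2_distance(X[:, i], X[:, j])
    if sigma is None:
        # Assumed bandwidth: mean of the non-zero pairwise distances.
        sigma = D[D > 0].mean() if np.any(D > 0) else 1.0
    W = np.zeros((m, m))
    for i in range(m):
        order = np.argsort(D[i])
        neighbors = [j for j in order if j != i][:n_neighbors]
        W[i, neighbors] = np.exp(-D[i, neighbors] / sigma)
    # Treat two samples as neighbors if either is among the other's nearest neighbors.
    return np.maximum(W, W.T)

def fuse_structures(W1, W2):
    """Structure of the combined feature: W = W1 + W2."""
    return W1 + W2
```

Because both structures are indexed by the same $m$ samples, they can be added even though the two features have different dimensions $D_1$ and $D_2$.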

Multi-feature Structure Fusion Based on Linear Discriminant Analysis
In LDA, the inter-class matrix $S_b$ in Equation (6) contains a matrix $W$ that carries the class information.
In the final problem of Equation (8), the category information appears only through the weight matrix $W$, whose original form is given in Equation (7). Samples of the same class share the same weight, $1/m_k$, while the weights of different classes differ, so the class information is related only to the number of samples in each class.
To enhance feature discrimination, we incorporate class information into the feature structure fusion process. For this purpose, we propose SF-SRDA, which combines LDA with structure fusion. The schematic diagram of the method is shown in Figure 3. The core of the method is how to construct the weight matrix (which represents the internal structure of the feature) that appears in Equation (7). Our weight matrix contains both the class information and the structural information of the features, and it replaces the original weight matrix with the internal structure of the combined feature. The proposed method has three main parts: the first is the construction of the weight matrix for features of the same kind, which come from the same extraction method; the second is the fusion of the weight matrices of features of different kinds, which are extracted by different methods; and the third is the generation of the weight matrix after feature weighting.

In Equations (12) and (13), $d(a, b)$ denotes the Euclidean distance between vectors $a$ and $b$, and $t$ is set to 0.4. The weight matrices $W_1$ and $W_2$ reflect the relationship between each class center and the others; in other words, they describe the relationships between classes.

Weight Matrix Fusion of Different Kind Features
To fuse the weight matrices of the different feature types, we take the weighted sum of the matrices $W_1$ and $W_2$ built from the ResNet-152 and VGG-19 features: $W_3 = \alpha_1 W_1 + \alpha_2 W_2$, where $\alpha_1 = 0.6$ and $\alpha_2 = 0.4$ are chosen by cross-validation.

Weight Matrix Generation after Feature Weighting
We combine feature $X_1$ and feature $X_2$ by proportionally weighted concatenation, that is, $X_3 = [\beta_1 X_1 \;\; \beta_2 X_2]^{\top}$, where $\beta_1 = 0.6$ and $\beta_2 = 0.4$ are selected by cross-validation.
To match the weighted concatenated feature $X_3$, the weight matrix $W$ is produced by assigning the weights of each class in $W_3$ to all samples of the corresponding class, as shown in Equation (14),
where the operator $\mathrm{repmat}(W_3, k, m_k)$ means that the $k$th row of matrix $W_3$ is copied $m_k$ times to form $m_k$ rows of matrix $W$. With the combined feature $X_3$ and its internal structure $W$ constructed in this way, Equation (8) is reformulated as Equation (15). Solving Equation (15) with the SRDA method proposed in [29] yields the mapping matrix $A = [a_1, a_2, \ldots, a_{c-1}]$, and the feature after structure fusion is $Y = A^{\top} X$.
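The weighting steps described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, features are stored with one sample per column, and `expand_rows` implements only the row replication that the text describes for the repmat operator. Equations (14) and (15) are not reproduced here, so any further handling of the columns of $W$ is left out.

```python
import numpy as np

def fuse_class_weights(W1, W2, alpha1=0.6, alpha2=0.4):
    """W3 = alpha1 * W1 + alpha2 * W2, with the weights reported in the text."""
    return alpha1 * np.asarray(W1, dtype=float) + alpha2 * np.asarray(W2, dtype=float)

def weighted_concat(X1, X2, beta1=0.6, beta2=0.4):
    """Stack beta1 * X1 (D1, m) on top of beta2 * X2 (D2, m) to get X3 (D1 + D2, m)."""
    return np.vstack([beta1 * np.asarray(X1, dtype=float),
                      beta2 * np.asarray(X2, dtype=float)])

def expand_rows(W3, labels, classes):
    """Copy the k-th row of W3 once for every sample that belongs to class k.

    classes lists the class labels in the row order of W3. The result has
    one row per sample and as many columns as W3.
    """
    row_of = {k: r for r, k in enumerate(classes)}
    rows = [row_of[label] for label in labels]
    return np.asarray(W3)[rows, :]
```

Keeping the fusion weights as keyword arguments makes it straightforward to re-run the cross-validation that selected 0.6 and 0.4 on a different dataset.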
For the fused features, we compute the mean of the samples in each class to obtain a per-class feature, and then use the nearest neighbor method to assign each sample to a class.
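The classification rule just described can be sketched as a nearest-class-mean classifier. The sketch assumes Euclidean distance in the fused space, which the text does not specify, and the function names `class_means` and `predict_nearest_mean` are hypothetical.

```python
import numpy as np

def class_means(Y, labels):
    """Y has shape (p, m), one fused feature per column. Returns classes and a (p, c) mean matrix."""
    Y = np.asarray(Y, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.column_stack([Y[:, labels == k].mean(axis=1) for k in classes])
    return classes, means

def predict_nearest_mean(Y, classes, means):
    """Assign every column of Y to the class whose mean is closest (Euclidean, assumed)."""
    Y = np.asarray(Y, dtype=float)
    # Squared distance between each sample and each class mean, shape (m, c).
    d2 = ((Y[:, :, None] - means[:, None, :]) ** 2).sum(axis=0)
    return classes[np.argmin(d2, axis=1)]
```

In the 5-dimensional fused space this rule needs only one mean vector per class, so prediction cost does not grow with the number of training samples.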

Dataset
The experiments use the VAIS dataset [5], presented at the CVPR conference in 2015, which was the first publicly available dataset of paired visible and infrared vessel images. The dataset consists of 1623 visible and 1242 infrared images (2865 images in total), including 1088 visible-infrared pairs. It covers six coarse-grained categories, namely merchant ships, medium-other ships, passenger ships, sailing ships, small boats, and tugboats. These can be subdivided into 15 fine-grained categories; for example, sailing ships are further divided into sailing-large-sails-down, sailing-small-sails-down, and sailing-small-sails-up. Table 2 gives the distribution of the VAIS dataset in the experiments. The distribution of samples across categories is extremely imbalanced, which increases the difficulty of classification: in the coarse-grained training set, some categories have 67 images while others have 499. Figures 4 and 5 show visible and IR samples from each class in the dataset. The ships vary in size, the illumination is uneven, and the background is complex, all of which place very high demands on the discriminative ability of the features.

Experiments
To evaluate the proposed SF-SRDA, we compared it with related methods in three configurations. The first compares SF-SRDA with the baseline methods on the coarse-grained partition. The second compares SF-SRDA with state-of-the-art methods on the coarse-grained partition. The third compares SF-SRDA with the baseline methods on the fine-grained partition.

In Table 3, the features are the VGG-19 (relu5-4) feature, the ResNet-152 (pool5) feature, and the combination of the two. We assessed the baseline methods (VGG-19 (relu5-4) + SVM, ResNet-152 (pool5) + SVM, SFLPP [24], and SRDA [29]) and the proposed SF-SRDA on the visible and IR imagery. Because SFLPP allows the dimension of the fused feature to be chosen freely, we tried many settings and found that the fused feature reaches its highest accuracy at 85 dimensions, so Table 3 reports the SFLPP result at 85 dimensions.
From Tables 3 and 4, it can be observed that: (1) Each single feature has a high dimension, 100,352 for the VGG-19 (relu5-4) feature and 2048 for the ResNet-152 (pool5) feature, yet the features are redundant and the recognition ability of a single feature is insufficient. (2) Compared with SFLPP, our method reduces the dimension of the fused feature much further, to only 5 dimensions, and its recognition rate on visible and IR images is higher, which shows that SF-SRDA improves the discriminative ability of the features. (3) Compared with SRDA, our method achieves a higher recognition rate when the features are reduced to the same dimension, which indicates that structure fusion preserves the structure information between features during dimension reduction and that this benefits target recognition. (4) In terms of training time, our approach is similar to ResNet-152 (pool5) + SVM and SRDA, and much faster than VGG-19 (relu5-4) + SVM and SFLPP. (5) Overall, the proposed SF-SRDA achieves the best results: it greatly reduces the feature dimension and improves the ability of the features to distinguish different targets.
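The dimensions quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes a 224 x 224 network input, which the text does not state. The channel counts and the number of pooling stages are standard properties of VGG-19 and ResNet-152, and the variable names are illustrative.

```python
# Assumed input size: the usual 224 x 224 ImageNet crop (not stated in the source).
INPUT_SIDE = 224

# VGG-19 applies four 2 x 2 max-poolings before relu5-4, which has 512 channels.
vgg_side = INPUT_SIDE // 2 ** 4
vgg_dim = vgg_side * vgg_side * 512       # flattened relu5-4 feature

resnet_dim = 2048                          # ResNet-152 pool5 (global average pooling)
combined_dim = vgg_dim + resnet_dim        # concatenated multi-feature

coarse_classes = 6                         # coarse-grained VAIS categories
fused_dim = coarse_classes - 1             # SRDA yields c - 1 projection directions

print(vgg_dim, combined_dim, fused_dim)    # prints: 100352 102400 5
```

The result agrees with the 100,352-dimensional VGG-19 feature, the 102,400-dimensional combined feature, and the 5-dimensional fused feature reported in the paper.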
The VAIS dataset was proposed in [5], which reports experimental results on daytime visible images, daytime IR images, paired visible and IR images, and nighttime IR images. To compare with the method in [5], we carried out the vessel classification experiment under the coarse-grained condition with the same settings as [5]. In addition to the four test sets mentioned above, we added an all-day IR test set for comparison. We also compared the proposed SF-SRDA with traditional methods (HOG + SVM, LBP + SVM), SFLPP [24], SRDA [29], and other state-of-the-art methods from the literature [11,19,20]. The experimental results are shown in Table 5. With the exception of the nighttime IR results, our method achieved the best recognition results across the different conditions, single-modal images, and multi-modal images, indicating that the feature fusion method of this paper is robust.

Table 5 (partial; only these rows survive in the source, "-" marks a test set with no reported result, "…" marks lost text)
Method                              Accuracy    Other test sets
… [11]                              77.73%      - - - -
MFL (decision-level) + ELM [11]     85.07%      - - - -
MFL (feature-level) + SVM [11]      85.33%      - - - -
HOG + SVM [19]                      71.87%      - - - -
ME-CNN [19]                         87…         …

To further validate the effectiveness of SF-SRDA, we conducted experiments on the fine-grained partition, which has more categories, and compared SF-SRDA with the two baseline methods. As the experimental results in Table 6 show, SF-SRDA achieves better recognition results than SFLPP, with large improvements in all cases. Compared with SRDA, SF-SRDA obtains better recognition results on visible, IR, and paired images during the daytime; only the nighttime IR results are slightly lower than those of SRDA. The main points of discussion about the performance comparisons are the following: (1) For SF-SRDA under the coarse-grained condition and SFLPP under the fine-grained condition, combining the two sensors (visible and IR) improves performance, while in the other cases it degrades performance. The reason is that the IR images have significantly lower resolution and smaller size than the visible images (Figure 6 shows some IR examples), so the IR information contributes little to recognition during the day. (2) The focus of this paper is on feature fusion, and our experimental results show that feature fusion performs better than modal fusion. (3) In nighttime IR image classification, our method may have failed to improve because the category information of the nighttime IR images is very blurred and the images of the fine-grained classes are almost indistinguishable. As shown in Figure 6, there are three subcategories under the sailing category in the fine-grained classification: the first row is the large-sails-down class, the second row is the small-sails-down class, and the third row is the small-sails-up class. The images of the different vessel classes in Figure 6 are almost indistinguishable, which explains why our method, which relies on category information, does not work well in this case.

Conclusions
In this paper, we proposed the SF-SRDA classification method. First, we selected different types of features through experiments and constructed the internal structure of each feature with a similarity measure. Then, the features and their internal structures were combined into an algebraic optimization problem, which is solved with reference to linear discriminant analysis and spectral regression discriminant analysis. Finally, the fused features after dimension reduction were sent to the classifier for marine vessel classification. Experiments on the VAIS dataset show that the proposed method reduces an extremely high-dimensional feature to a very low dimension while improving accuracy. During the daytime, the classification accuracy of our method reaches 87.60% on visible images, which is 5.7% higher than the best result in [5], and 74.68% on infrared images, which is 15.98% higher than the best result in [5]. In general, the proposed method can not only extract the optimal features from the original redundant feature space, but also save a large amount of memory space and greatly improve the classification accuracy.
Author Contributions: E.Z. conceived this study and improved the text of the manuscript. K.W. designed the computational algorithms, wrote the program code, and wrote the manuscript. G.L. offered valuable suggestions and guided the experiments.

Conflicts of Interest: The authors declare no conflicts of interest.