Optimized Variational Mode Decomposition and Permutation Entropy with Their Application in Feature Extraction of Ship-Radiated Noise

The complex and changeable marine environment surrounded by a variety of noise, including sounds of marine animals, industrial noise, traffic noise and the noise formed by molecular movement, not only interferes with the normal life of residents near the port, but also exerts a significant influence on feature extraction of ship-radiated noise (S-RN). In this paper, a novel feature extraction technique for S-RN signals based on optimized variational mode decomposition (OVMD), permutation entropy (PE), and normalized Spearman correlation coefficient (NSCC) is proposed. Firstly, with the mode number determined by reverse weighted permutation entropy (RWPE), OVMD decomposes the target signal into a set of intrinsic mode functions (IMFs). The PE of all the IMFs and SCC between each IMF with the raw signal are then calculated, respectively. Subsequently, feature parameters are extracted through the sum of PE weighted by NSCC for the IMFs. Lastly, the obtained feature vectors are input into the support vector machine multi-class classifier (SVM) to discriminate various types of ships. Experimental results indicate that five kinds of S-RN samples can be accurately identified with a recognition rate of 94% by the proposed scheme, which is higher than other previously published methods. Hence, the proposed method is more advantageous in practical applications.


Introduction
Before the rise of artificial neural networks, the recognition of warships in military confrontations mainly relied on sonar soldiers. The recognition performance would fluctuate with changes in the status, experience, and knowledge reserves of technicians, making the results somewhat unstable. With the continuous development of science and technology, the recognition accuracy of underwater acoustic target has been improved to a great extent with the technicians assisted by machine learning. The line spectrum com-ponents in ship-radiated noise (S-RN) signals carrying rich information that characterizes the target, such as speed, tonnage, and type, provides the main basis for target classification [1]. Nevertheless, attributed to the interference of marine environmental noise, along with the time-varying nature of underwater acoustic channel, feature extraction of S-RN becomes increasingly difficult.
Despite the satisfactory performance of traditional signal analysis methods, they fade under the non-linear, non-stationary and non-Gaussian property of underwater acoustic signals. For example, Fourier transform (FT) [2] provides a global description of the overall singularity of the target signal, but it fails to point out the local contribution to the overall singularity, thereby making it unable to locate the specific moment when the mutation In recent years, VMD combined with various entropy algorithms and its extended version have been utilized in full swing for underwater acoustic signal processing. In [27], based on VMD, the multi-scale permutation entropy (MPE) of IMF with the highest energy was extracted, achieving a classification accuracy of 94%. In [33], a finite number of IMFs were firstly decomposed by VMD of S-RN signals. The IMF with the smallest fluctuation-based dispersion entropy (FDE) difference with the original signal was regarded as the sensitive IMF, whose FDE was input into self-organizing map classifier to reach an identification rate of 97.5%. In [25], the S-RN signals were firstly decomposed by enhanced VMD into several IMFs. Then the WPE of all the IMFs and their Pearson correlation coefficients (PCCs) with the target signal were calculated, respectively. The sensitive IMFs were screened out by the maximum variance of WPE obtained. Finally, the sum of PE weighted by the normalized PCC for the sensitive IMFs was regarded as the feature vector fed into the SVM, which has been demonstrated to yield better recognition performance. Despite the impressive results of the above methods, there still exist some points to be urgently rectified: (1) the mode number of VMD in [27,33] is consistent with the decomposition results of EMD, which does not make sense in theory; (2) in [33], the noise IMF inadequately characterizing the original signal may be locked by means of the selection method for the sensitive IMF. As we know, the S-RN signal contains rich low-frequency line spectrum components in the range of 0-100 Hz, carrying abundant information about ships, and should not be excluded. Thus, there obviously lacks persuasiveness in [33].
To solve these problems, a new feature extraction technique for S-RN signals based on optimized VMD (OVMD), PE, and normalized Spearman CC (NSCC) is put forward in this paper. Firstly, with the mode number determined by RWPE, a set of IMFs are decomposed by OVMD of the target signal. Then, the PE of all the IMFs and SCC between each IMF with the raw signal are calculated, respectively. Finally, the feature vectors extracted through the sum of PE weighted by NSCC for the IMFs are fed into classifier to realize the classification of S-RN samples.
The main innovations and contributions of this paper are summarized as follows: (1) VMD is proposed to conquer the mode number issue for VMD, where RWPE is utilized to lock the mode number. Experimental results on sinusoidal signals have proved the better decomposition performance of OVMD than that of EMD and EEMD. (2) A novel ship-radiated noise feature extraction technique based on OVMD, PE, and NSCC is put forward. The classification results of five kinds of measured S-RN samples indicate that the proposed method is obviously superior to the existing methods with higher recognition rate.
The structure of the paper is organized below: Section 2 is the background. A brief description of the proposed technique is presented in Section 3. The proposed method is utilized for analysis of simulation signals in Section 4. In Section 5, the measured S-RN data are used to test the performance of the proposed technique. Finally, the conclusion is drawn in Section 6.

Variational Mode Decomposition (VMD)
VMD is an adaptive, completely non-recursive mode variational and signal processing method [13]. It overcomes the problem of end effect and mode aliasing in EMD, and has a more solid mathematical theoretical foundation. The essence of VMD is to construct and solve variational problems.
Given target signal f (t), the constraint variational expression is given by where K is the mode number; {u k }, {w k } are the k-th mode and its center frequency, respectively; δ(t) is the Dirac function, ∂(·) is partial derivative, and * means the convolution operation.
To solve the variational problem in Equation (1), the Lagrange multiplication operator λ is introduced and the constrained variational problem is transformed into an unconstrained variational problem. That is, the augmented Lagrange expression is achieved by where α is the quadratic penalty term. Each mode and corresponding center frequency are calculated by employing alternating direction multipliers, combining Paseval's theorem and Fourier transform (FT). {w k }, {u k }, and λ can be finally updated as follows: where the noise tolerance ε meets the fidelity requirements of signal decomposition, and u n+1 , andλ(t), respectively. In summary, the decomposition process of VMD is briefly summarized as follows: Step 1: Initialize u 1 k , w 1 k , λ 1 and maximum number of iterations N; Step 2: Updateû k and w k using Equations (3) and (4); Step 3: Update λ based on Equation (5); Step 4: For convergence accuracy a > 0, if the convergence condition ∑ k û n+1 k −û n k 2 2 / û n k 2 2 < a is not satisfied, go to step 2; otherwise terminate the iteration and outputû k and w k .

Permutation Entropy (PE)
The application of PE in this paper benefits by its being conceptually simple, computationally fast, as well as better stability. For time series {X(i), i = 1, 2, · · · , n}, embedding dimension m, and time delay τ, the calculation steps for PE are epitomized as the steps below [17]: The time series can be reconstructed in phase space as: Arrange the elements in j-th row in ascending order according to their value: In the case of two equal elements: These two elements are rearranged as: Consequently, for each row vector in the reconstruction matrix, a set of symbols can be obtained: S(l) = (j 1 , j 2 , · · · , j m ), l = 1, 2, · · · , k, and k ≤ m!
The probability of each symbol sequence is P 1 , P 2 , · · · , P k , the PE can be finally calculated as: The change in H p reflects and magnifies the minute changes in the time series. The smaller the value of H p , the more regular the time series; on the contrary, the closer the time series is to random.

Reverse Weighted Permutation Entropy (RWPE)
As a fusion of weighted permutation entropy (WPE) and RPE, reverse weighted permutation entropy (RWPE) enjoys a powerful advantage in detection of signal mutation and recognition of noise [26,32]. In the RWPE algorithm, for time series {X(i), i = 1, 2, · · · , n}, given embedding dimension m and time delay τ, the weight w j of the embedding vector X i can be calculated as: Thus, the weighted relative frequency is expressed as: where π m,τ i is one of the m! distinct symbols; 1 I (u) means the indicator function of set I defined as 1 I (u) = 1 if u ∈ I, else 1 I (u) = 0.
RWPE is the pointer to distance with Gaussian white noise. In this case, RWPE is finally defined as: The value range of RWPE is 0 to 1. Contrary to traditional entropy algorithms, a smaller RWPE means a more random time series, and vice versa. In this article, for all calculations involving entropy, as in [17], the embedding dimension and time delay are uniformly set to four and one, respectively.

The Proposed Feature Extraction Technique
This paper proposes a novel feature extraction technique for S-RN signals integrating OVMD, PE, and normalized spearman correlation coefficient (NSCC). The flow chart of the proposed scheme is depicted in Figure 1. The basic steps of the scheme are summarized as follows:

Property Analysis of RWPE
The less sensitivity of PE to data length has been discussed in study [26], we won't explore it in this paper. After this, we will make a comparison for the signal-recognized ability of PE, WPE, AAPE, RPE and RWPE. To this end, we construct a standard Gaussian white noise series with pulse sequence added, and the data length is set to 5000. We then calculate the above five entropy methods using a window with length of 500 and sliding step of 50. Figure 2 are the time-domain waveform of the constructed signal and calculation results. As illustrated in Figure 2, there appear to be no difference in the mutation region with other region for PE and RPE, this can be explained by the ignorance of amplitude,  Step 1: Set the range of mode number K to 3-12 and decompose the target signal; Step 2: Calculate the RWPE of each IMF. Subsequently, count the number of IMFs with RWPE greater than 0.2 after each decomposition termed as n; Step 3: Decompose the target signal using the mode number maximizing n for the first time; Step 4: Extract the PE of all the IMFs and their NSCCs with the raw signal; Step 5: Calculate the sum of PE weighted by NSCC for the IMFs; Step 6: Divide the data set into training set for training the classifier according to a proper proportion, and the remaining for testing it; Step 7: Finally, the testing data are fed into the SVM multi-class classifier to complete recognition of various types of S-RN samples.

Property Analysis of RWPE
The less sensitivity of PE to data length has been discussed in study [26], we won't explore it in this paper. After this, we will make a comparison for the signal-recognized ability of PE, WPE, AAPE, RPE and RWPE. To this end, we construct a standard Gaussian white noise series with pulse sequence added, and the data length is set to 5000. We then calculate the above five entropy methods using a window with length of 500 and sliding step of 50. Figure 2 are the time-domain waveform of the constructed signal and calculation results. As illustrated in Figure 2, there appear to be no difference in the mutation region with other region for PE and RPE, this can be explained by the ignorance of amplitude, but only comparison of ordinal patterns. Rather, attributed to the weight assigned to sensitive patterns, the value of AAPE in the mutation region start to decrease, which is significantly different from other regions. In marked contrast, both WPE and RWPE have shown significant changes in the mutation area, indicating an exceedingly good recognition ability for different signals. but only comparison of ordinal patterns. Rather, attributed to the weight assigned to sensitive patterns, the value of AAPE in the mutation region start to decrease, which is significantly different from other regions. In marked contrast, both WPE and RWPE have shown significant changes in the mutation area, indicating an exceedingly good recognition ability for different signals. For further quantitative comparison, Table 1 presents the ratio of the maximum and minimum values of the average values for the above five entropies in the mutation region and other regions. It can be observed from Table 1 that the value of RWPE in the mutation area changes most significantly with the obviously larger mutation rate than others. Hence, RWPE is chosen for the calculation candidate of mode number for VMD in this article.

OVMD of Sinusoidal Signals
The mode number K set in advance plays a decisive role in influence on the decomposition accuracy of VMD. If K is too small, useful low-frequency components cannot be completely recovered from the signal submerged by noise, that is, the so-called underdecomposition. However, too large K will cause over-decomposition to occur, that is to say, in addition to undesired IMFs being generated, concurrently, the computational complexity will also be dramatically increased. In view of this, the balance between data recovery and time cost is particularly interesting concerning VMD. Study in [27,33], the decomposition result of EMD is used as a reference for VMD, which is obviously short of reasonable mathematical foundation, as well as may result in over-decomposition. Besides, considering the results of VMD, with the increase of K, low-frequency IMFs are gradually being discovered and become cleaner. When K increases to a certain critical point, the low-frequency components have been completely reproduced, and their com- For further quantitative comparison, Table 1 presents the ratio of the maximum and minimum values of the average values for the above five entropies in the mutation region and other regions. It can be observed from Table 1 that the value of RWPE in the mutation area changes most significantly with the obviously larger mutation rate than others. Hence, RWPE is chosen for the calculation candidate of mode number for VMD in this article.

OVMD of Sinusoidal Signals
The mode number K set in advance plays a decisive role in influence on the decomposition accuracy of VMD. If K is too small, useful low-frequency components cannot be completely recovered from the signal submerged by noise, that is, the so-called underdecomposition. However, too large K will cause over-decomposition to occur, that is to say, in addition to undesired IMFs being generated, concurrently, the computational complexity will also be dramatically increased. In view of this, the balance between data recovery and time cost is particularly interesting concerning VMD. Study in [27,33], the decomposition result of EMD is used as a reference for VMD, which is obviously short of reasonable mathematical foundation, as well as may result in over-decomposition. Besides, considering the results of VMD, with the increase of K, low-frequency IMFs are gradually being discovered and become cleaner. When K increases to a certain critical point, the low-frequency components have been completely reproduced, and their complexity will also converge to a certain value. When K continues to increase, the contribution to the decomposition accuracy is negligible except for the significant increase in computational complexity. Inspired by the analysis, given the high sensitivity of RWPE to noise, it is introduced for K selection in VMD (see Section 3).
Subsequently, we will make a specific discussion on how to choose an appropriate RWPE threshold. Due to the single frequency nature of intrinsic mode functions (IMF) by VMD, three sinusoidal signals are utilized for the selection of RWPE threshold. The three sinusoidal signals are given as follows: where the sampling frequency and the data length are set to 5 kHz and 5000, respectively.
The RWPE of them under the signal-to-noise ratio (SNR) range of −15-30 dB is shown in Figure 3. As indicated in Figure 3, when the threshold of RWPE is set to be greater than 0.2, the corresponding SNR of the three signals is greater than 20 dB, and their complexity is drastically reduced. In this situation, the signal can be considered approximately pure with the noise being negligible compared to it. Therefore, as K increases, in the case of all the reproduced low-frequency IMFs whose RWPE greater than 0.2 for the first time, the corresponding K is considered to be the optimal mode number. Subsequently, we will make a specific discussion on how to choose an appropriate RWPE threshold. Due to the single frequency nature of intrinsic mode functions (IMF) by VMD, three sinusoidal signals are utilized for the selection of RWPE threshold. The three sinusoidal signals are given as follows: where the sampling frequency and the data length are set to 5 kHz and 5000, respectively. The RWPE of them under the signal-to-noise ratio (SNR) range of −15-30 dB is shown in Figure 3. As indicated in Figure 3, when the threshold of RWPE is set to be greater than 0.2, the corresponding SNR of the three signals is greater than 20dB, and their complexity is drastically reduced. In this situation, the signal can be considered approximately pure with the noise being negligible compared to it. Therefore, as K increases, in the case of all the reproduced low-frequency IMFs whose RWPE greater than 0.2 for the first time, the corresponding K is considered to be the optimal mode number. As is well known to us, S-RN signals contain rich line spectrum components being one of the key parameters for underwater acoustic target recognition. Accordingly, in this paper, a composite signal composed of several sinusoidal signals is randomly constructed to validate the performance of the proposed method. The simulation signals are as follows: where the sampling frequency and the data length are set to 1 kHz and 5000, respectively.
η represents the standard Gaussian white noise. As the mode number of VMD is less than that of EEMD, we then set the range of K to 3-12, and count the number of IMFs whose RWPE greater than 0.2 after each decomposition. The number of IMFs greater than As is well known to us, S-RN signals contain rich line spectrum components being one of the key parameters for underwater acoustic target recognition. Accordingly, in this paper, a composite signal composed of several sinusoidal signals is randomly constructed to validate the performance of the proposed method. The simulation signals are as follows: where the sampling frequency and the data length are set to 1 kHz and 5000, respectively. η represents the standard Gaussian white noise. As the mode number of VMD is less than that of EEMD, we then set the range of K to 3-12, and count the number of IMFs Entropy 2021, 23, 503 9 of 18 whose RWPE greater than 0.2 after each decomposition. The number of IMFs greater than 0.2 versus the mode number K is presented in Figure 4.      It can be concluded from Figure 4 that the K maximizing n for the first time is 7, thus, 7 is considered as the optimal mode number for VMD. In order to enhance persuasiveness, Tables 2 and 3 list the center frequency and RWPE of IMFs after each decomposition, respectively.  According to Table 2, when K is 3-6, the components corresponding to the simulation signals have not been completely reproduced yet; when K reaches 7, the first three IMFs correspond to the raw signals exactly; when K is more than 8, despite the fully decomposed components, more high-frequency noise is also generated. As for Table 3, when K is 3-6, the number of IMFs with RWPE greater than 0.2 remains 2; when K is 7, the number increases to 3, and the number continues to be 3 when K is more than 7. In this case, 7 is exactly the optimum mode number. Furthermore, with the K increased, the RWPE of useful components are increasing regularly and gradually converging to a fixed value. While the noise IMFs are not the case, thereby scientifically validating the rationality of employing RWPE threshold to lock the mode number for VMD. The time-domain waveforms of simulation signals, along with the decomposition results of EMD, EEMD, and OVMD are shown in Figure 5. According to Table 2, when K is 3-6, the components corresponding to the simulation signals have not been completely reproduced yet; when K reaches 7, the first three IMFs correspond to the raw signals exactly; when K is more than 8, despite the fully decomposed components, more high-frequency noise is also generated. As for Table 3, when K is 3-6, the number of IMFs with RWPE greater than 0.2 remains 2; when K is 7, the number increases to 3, and the number continues to be 3 when K is more than 7. In this case, 7 is exactly the optimum mode number. Furthermore, with the K increased, the RWPE of useful components are increasing regularly and gradually converging to a fixed value. While the noise IMFs are not the case, thereby scientifically validating the rationality of employing RWPE threshold to lock the mode number for VMD. The time-domain waveforms of simulation signals, along with the decomposition results of EMD, EEMD, and OVMD are shown in Figure 5. As illustrated in Figure 5, a certain degree of mode aliasing occurs in EMD and EEMD. In contrast, the decomposition performance of OVMD is better than that of EMD and EEMD without this phenomenon. In order to make the decomposition accuracy of OVMD prominent, the mean absolute error (MAE) between the center frequency of IMFs by EMD, EEMD, and OVMD with that of corresponding simulation signals are listed in As illustrated in Figure 5, a certain degree of mode aliasing occurs in EMD and EEMD. In contrast, the decomposition performance of OVMD is better than that of EMD and EEMD without this phenomenon. In order to make the decomposition accuracy of OVMD prominent, the mean absolute error (MAE) between the center frequency of IMFs by EMD, EEMD, and OVMD with that of corresponding simulation signals are listed in Table 4. Table 4 shows that the maximum MAE of EMD means poor decomposition performance, while the decomposition performance of EEMD has been improved to a certain extent with a smaller MAE than EMD. Compared with EMD and EEMD, OVMD has revealed the best decomposition performance with the smallest MAE. Therefore, OVMD is superior to EMD and EEMD in terms of recovering the desired components from the signal masked by noise.

OVMD of S-RN Signals
In this paper, five types of real measured S-RN samples are randomly selected from a data set in [34], namely, Ship I, Ship II, Ship III, Ship IV, and Ship V. There are 50 randomly selected samples from each category for analysis. The data points are set to 2000. The time-domain waveforms of the five normalized signals are shown in Figures 6 and 7 depicts the results of using the proposed method to lock the mode number of VMD for the five types of signals.  Table 4. Table 4 shows that the maximum MAE of EMD means poor decomposition performance, while the decomposition performance of EEMD has been improved to a certain extent with a smaller MAE than EMD. Compared with EMD and EEMD, OVMD has revealed the best decomposition performance with the smallest MAE. Therefore, OVMD is superior to EMD and EEMD in terms of recovering the desired components from the signal masked by noise.

OVMD of S-RN Signals
In this paper, five types of real measured S-RN samples are randomly selected from a data set in [34], namely, Ship I, Ship II, Ship III, Ship IV, and Ship V. There are 50 randomly selected samples from each category for analysis. The data points are set to 2000. The time-domain waveforms of the five normalized signals are shown in Figures 6 and 7 depicts the results of using the proposed method to lock the mode number of VMD for the five types of signals.  As indicated in Figure 7, for the five types of S-RN signals, the optimal K maximizing the number of IMFs with RWPE greater than 0.2 for the first time is 11,9,9,12, and 9, respectively. The OVMD of the signals are shown in Figure 8.   As indicated in Figure 7, for the five types of S-RN signals, the optimal K maximizing the number of IMFs with RWPE greater than 0.2 for the first time is 11,9,9,12, and 9, respectively. The OVMD of the signals are shown in Figure 8.

Recognition of S-RN Samples
Combined with the time-domain waveforms of the five S-RN signals in Figure 6, due to the pollution of marine environmental noise, the fluctuation of the signal appears to be fairly messy, showing a certain degree of randomness. We can then easily extract the PE

Recognition of S-RN Samples
Combined with the time-domain waveforms of the five S-RN signals in Figure 6, due to the pollution of marine environmental noise, the fluctuation of the signal appears to be fairly messy, showing a certain degree of randomness. We can then easily extract the PE of these samples to quantify the uncertainty. The PE distribution of the five types of S-RN samples is presented in Figure 9. As observed from Figure 9, owing to the disturbance of ocean noise, these samples are mixed together and cannot be recognized at all.

Recognition of S-RN Samples
Combined with the time-domain waveforms of the five S-RN signals in Figure 6, due to the pollution of marine environmental noise, the fluctuation of the signal appears to be fairly messy, showing a certain degree of randomness. We can then easily extract the PE of these samples to quantify the uncertainty. The PE distribution of the five types of S-RN samples is presented in Figure 9. As observed from Figure 9, owing to the disturbance of ocean noise, these samples are mixed together and cannot be recognized at all. For further analysis, Table 5 lists the PE value of the IMF by OVMD with the maximum energy (ME) and that of the IMF with the maximum correlation coefficient (MCC) with the raw S−RN signal. As shown in Table 5, Ships I and III can be well distinguished by these two methods, however, there appear to be relatively close PE values concerning For further analysis, Table 5 lists the PE value of the IMF by OVMD with the maximum energy (ME) and that of the IMF with the maximum correlation coefficient (MCC) with the raw S−RN signal. As shown in Table 5, Ships I and III can be well distinguished by these two methods, however, there appear to be relatively close PE values concerning other three categories, which cannot be separated. Then, we employ NSCC to weigh the PE of the IMFs, and the sum of the weighted PE (SWPE) can be obtained, namely: where K is the mode number of OVMD. For the sake of comparative analysis, the results by EMD-PE-NSCC, EEMD-PE-NSCC, VMD-SIMF-FDE [33], and the proposed OVMD-PE-NSCC are also given in Figure 10.   As in Figure 10a,b, obviously, despite a large number of crossed samples between the other categories, Ship I can be clearly identified, and the overall separation has been improved a little compared with the state without any processing. In terms of VMD-SIMF-FDE, a majority of the samples have become easily recognized with a greater degree of clustering and separation between classes. In marked contrast, intuitively, the proposed OVMD-PE-NSCC outperforms others regardless of the degree of intra-class aggregation and inter-class separation. In order to facilitate the comparison accuracy, 30 randomly selected individuals in each class are utilized for training the SVM multi-class classifier, and the rest are for prediction. The outputs of the classification are displayed in Table 6. Based on Table 6, the proposed method is significantly superior to other algorithms with higher recognition rate. Our method is an effective tool, the low-frequency IMFs are covered, and concurrently, the recognition rate is also enhanced.

Conclusions
In order to extract useful features from S-RN signals, a novel technique fully integrating OVMD, PE, and normalized SCC is put forward in this paper. The main innovations and contributions of this paper are as follows: (1) Compared with PE, AKPE, AAPE, and WPE, the simulation experiments indicate that RWPE, which incorporates amplitude and distance information, is the most competent in distinguishing different signals. Hence, it is innovatively used to lock the mode number for VMD in this paper. This paper successfully conquers the mode number issue for VMD by RWPE, the experimental results on sinusoidal signals have proved the better decomposition performance of OVMD than that of EMD and EEMD. (2) Five types of measured S-RN signals are selected to verify the performance of our method. The classification results have illustrated that our method is obviously superior to others. (3) The proposed method can serve as a supplement to the field of underwater acoustic signal processing, as well as extended to other aspects. In future work, we will try to explore other signal decomposition methods and features to further improve the performance of the system.