Passive Detection of Ship-Radiated Acoustic Signal Using Coherent Integration of Cross-Power Spectrum with Doppler and Time Delay Compensations

Passive sonar is widely used for target detection, identification, and classification based on the target-radiated acoustic signal. Under the influence of Doppler, generated by relative motion between a moving target and the sonar array, the received ship-radiated acoustic signals are non-stationary and time-varying, which degrades target detection and related tasks. To reduce the influence of Doppler and improve detection performance, a coherent integration method based on the cross-power spectrum is proposed in this paper. The frequency shift and phase change in the cross-power spectrum obtained from each pair of data segments can be corrected by compensating for the time scale (Doppler) factor and the time delay, and these two quantities can be estimated from the amplitude and phase of the original cross-power spectrum, respectively. Coherent integration can then be implemented on the compensated cross-power spectra. Simulation and experimental data processing results show that the proposed method provides sufficient processing gain and effectively extracts the discrete spectra for the detection of moving targets.


Introduction
Sonar (sound navigation and ranging) can be regarded as a kind of underwater radar, using sound to interrogate the surroundings. In general, sonar can be grouped into two main categories, active and passive, according to the presence or absence of a sound-projecting transducer in the sonar system [1]. Passive sonar includes only a receiver, without a transmitter, and is used to detect the sound emitted by a target; it has the advantages of high concealment and a long detection range. Therefore, passive sonar is widely used to detect, locate, and classify targets on or under the water surface, such as vessels and submarines [2,3], as illustrated in Figure 1.
In the common scenario of passive target detection shown in Figure 1, the Doppler phenomenon, caused by relative motion between the sonar array and the moving target, gives rise to distortion of the received signals. Under the influence of Doppler, the received acoustic signals become non-stationary and time-varying. If detection or localization of the moving target is performed with methods based on the stationary-signal hypothesis, the estimated results will deviate from the true values. Therefore, it is necessary to use a signal model in which the influence of Doppler is taken into account.

Figure 1. A passive sonar array composed of two hydrophones, suspended in water at a certain depth by a buoy and weights, acquires the acoustic signals of moving targets on or under the water surface.
It is known that deterministic mechanical sounds and random hydrodynamic noise are the main components of underwater acoustic signals radiated by vessels [4]. Generally, a typical ship-radiated acoustic signal consists of a continuous spectrum and narrowband discrete components, which are known simply as lines or tonals, and can be successfully used for ship detection and classification.
For the passive detection of ships, a spectrogram is often used to estimate the parabolic patterns and tonals in ship-radiated acoustic signals [5]. The performance of this kind of classical Fourier-based spectral estimation, including its frequency resolution [6] and processing gain, depends on the length of data used for the estimation. For a non-stationary signal, however, a satisfactory spectral estimate cannot be obtained simply by integrating over a long time.
For non-stationary signal processing, time-frequency analysis [7,8] is widely used to provide the joint distribution information of the time and frequency domains and to describe the relationship between signal frequency and time. In addition, based on the assumption that the signal is stationary over a short time duration, high-resolution spectral estimation methods can be used to obtain a spectrum with high frequency resolution from a short time series. In this way, high-resolution spectral estimation methods based on parametric models, such as autoregressive (AR), autoregressive moving average (ARMA) [9], and Multiple Signal Classification (MUSIC) [10], have been introduced to extract the narrowband characteristics of ship-radiated acoustic signals. In recent years, compressed sensing (CS) [11] has also been considered for the high-resolution, high-precision extraction of the narrowband components of acoustic signals. However, the performance of these parametric spectral estimation techniques is limited by the signal-to-noise ratio (SNR).
An adaptive line enhancer (ALE) [12] is a classical method for enhancing the underwater narrowband discrete components (lines or tonals) in time-varying channels. However, the ability of the ALE to deal with colored Gaussian noise is limited, and its performance deteriorates under low-SNR conditions [13]. To effectively extract or enhance narrowband signals in complex environments, the conventional ALE is therefore usually combined with other methods. For example, the incorporation of an l1-norm sparse penalty into frequency-domain adaptation was proposed in [14] to reduce the misadjustment of the conventional ALE. Coherent integration and high-order cumulants have been combined with a multilevel switching adaptive line enhancer to obtain better performance [15].
Coherent integration [16] is essentially a digital filtering process that filters out much of the wideband noise; it is widely used in radar signal processing as an effective method for signal enhancement. However, the detection performance of coherent integration can be affected significantly by target motion, which also degrades imaging quality, target tracking, and identification [17]. Achieving long-time coherent integration for weak echoes from moving targets has become a prominent topic in radar signal processing. Based on the Fourier transform, the improved axis-rotation discrete chirp-Fourier transform [18] and the modified Radon-Fourier transform [19] were proposed to compensate range and Doppler migrations. In addition, a long-time coherent integration algorithm was proposed to compensate the phase of the original echo signal by extracting the relative acceleration information carried in the echo, so that the compensated signals can be refocused in the frequency domain [20].
Similar to the application of coherent integration in radar signal processing, the SNR improvement that passive sonar signal processing gains from long-time coherent integration is also limited by target motion. To increase the SNR, a coherent integration method for the cross-power spectrum with compensations of Doppler and time delay is proposed in this paper. The proposed method is implemented using the signals received by two hydrophones. Taking into account the Doppler and the time delay between the two hydrophones, the received signal model is established and the cross-power spectrum of the received signals is derived. The Doppler factor and time delay can be estimated from the amplitude and phase of the cross-power spectrum, respectively. The frequency shift and phase change in the cross-power spectrum can then be corrected by compensating for the estimated Doppler factor and time delay. Finally, long-time coherent integration can be achieved over the compensated cross-power spectra of different time intervals. With this method, we can take full advantage of long-time data processing to obtain sufficient SNR gain for the detection of weak, moving targets. Compared with other methods, the proposed method is shown to provide sufficient processing gain and to effectively enhance the narrowband discrete components in a ship-radiated acoustic signal, which is also of great significance for target recognition [21,22] and classification [23].
The paper is organized as follows. Section 2 introduces the signal model and the theory of coherent integration for the cross-power spectrum compensated with time scale factor and time delay. Different algorithms for discrete spectral estimation in simulations and experimental signal processing are analyzed in Sections 3 and 4. Section 5 is exclusively dedicated to the conclusions.

Signal Model and Cross-Power Spectrum Coherent Integration
It was mentioned in the Introduction that the signals received by the passive sonar array are time-varying in the scenario illustrated in Figure 1. Therefore, it is necessary to establish the received signal model first. The signals received by the two hydrophones are denoted r_1(n) and r_2(n), where the subscripts are the hydrophone numbers. When a moving target passes over the hydrophones, the target motion relative to each hydrophone results in a relative time scaling between the data received by the two hydrophones. Over a short time interval, the time scale factor and time delay of the sound source signal received by one hydrophone, relative to the signal received by the other, can be modeled as constants. The received signals r_1(n) and r_2(n) can be written as

r_1(n) = s(n) + n_1(n),
r_2(n) = s((n − τ)/α) + n_2(n),  (1)

where s(n) is the sound source signal, τ is the time delay between the two received signals, and α is the time scale factor of r_2(n) relative to r_1(n). In practice, n_1(n) and n_2(n) consist of the self-noise of the corresponding hydrophone and the complex environmental noise. Moreover, Moschas and Stirios [24] have confirmed that the outputs of hydrophones differ under the influence of different types of dynamic noise, even when they record the same source excitation under the same propagation effects. However, the spectrum level of hydrophone self-noise can be designed to be ~20 dB below the environmental noise for underwater detection [25]. Therefore, n_1(n) and n_2(n) are taken to be only the environmental noise received by each hydrophone, and the self-noise of the hydrophones is not taken into account. In Equation (1), the time scale factor α and the time delay τ change as the sound source moves.
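As a concrete illustration, the received-signal model of Equation (1) can be sketched numerically. This is a minimal sketch: the sampling rate, tone frequency, scale factor, delay, and noise level below are illustrative values, not parameters from the paper.

```python
import numpy as np

# Minimal numerical sketch of the received-signal model of Equation (1).
fs = 10_000.0      # sampling rate (Hz), illustrative
f0 = 140.0         # harmonic source frequency (Hz), illustrative
alpha = 1.01       # time scale (Doppler) factor of r2 relative to r1
tau = 0.003        # time delay (s) between the two receptions
N = 8192
t = np.arange(N) / fs
rng = np.random.default_rng(0)

def s(time):
    """Harmonic sound source signal s(t)."""
    return np.cos(2 * np.pi * f0 * time)

r1 = s(t) + 0.1 * rng.standard_normal(N)                  # r1(n) = s(n) + n1(n)
r2 = s((t - tau) / alpha) + 0.1 * rng.standard_normal(N)  # scaled, delayed copy

# The time scale factor shifts the tone seen by r2 from f0 to f0/alpha.
f = np.fft.rfftfreq(N, d=1 / fs)
peak1 = f[np.argmax(np.abs(np.fft.rfft(r1)))]
peak2 = f[np.argmax(np.abs(np.fft.rfft(r2)))]
```

With these values the two receptions show the Doppler effect directly: the spectral peak of r2 sits about 1.4 Hz below that of r1.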
If the two received signals are divided into M data segments of N points each, the speed of the moving target, the time scale factor α, and the time delay τ are assumed to remain constant within each data segment. The time scale factor and time delay can then be rewritten as a = 1 − 1/α and b = τ/α [26] in each data segment. Not only do the time scales of the signals received by the two hydrophones differ under the influence of Doppler, but the time scale factor of the signal received by each hydrophone also changes as the sound source moves. The first data segment of r_1(n) is set to r_11(n), and its time scale factor is α_11 = 1, i.e., a_11 = 0. The time scale factors of the other data segments are referenced to a_11, and the i-th N-point data segment received by the two hydrophones can be expressed as

r_1i(n) = s_1i(n) + n_1i(n) = s((1 − a_1i)n) + n_1i(n),
r_2i(n) = s_2i(n) + n_2i(n) = s((1 − a_2i)n − b_i) + n_2i(n),  (2)

where a_1i and a_2i are the time scale factors of r_1i(n) and r_2i(n) relative to r_11(n), respectively, and b_i is the time delay between r_1i(n) and r_2i(n). In addition, b_i = τ_i/α_i, where α_i is the time scale factor of r_2i(n) relative to r_1i(n), and n_1i(n) and n_2i(n) are the noises in the i-th data segments received by the two hydrophones. The received signal model of each data segment for coherent integration is expressed in Equation (2). For convenience of derivation and analysis, the sound source is assumed to be a harmonic signal at f_0, and the sampling interval is t_s. The received sound source signals in Equation (2) can then be represented as

s_1i(n) = exp(j2π f_0 (1 − a_1i) n t_s),
s_2i(n) = exp(j2π f_0 ((1 − a_2i) n t_s − b_i)),  (3)

and the N-point cross-power spectrum P_s12i(f) of the i-th data segments s_1i(n) and s_2i(n) can be written as

P_s12i(f) = S_1i(f) S_2i*(f) = A_i(a_1i, a_2i) e^{jψ_i(a_1i, a_2i, b_i)},  (4)

where S_1i(f) and S_2i(f) are the spectra of s_1i(n) and s_2i(n), and * denotes the conjugate operation.
A_i(a_1i, a_2i) and ψ_i(a_1i, a_2i, b_i) are the amplitude and phase of P_s12i(f):

A_i(a_1i, a_2i) = |S_1i(f)| |S_2i(f)|,
ψ_i(a_1i, a_2i, b_i) = arg S_1i(f) − arg S_2i(f).  (5)

It is evident from Equation (5) that the amplitude A_i(a_1i, a_2i) of P_s12i(f) is affected by the frequency shift generated by Doppler (the time scale factor), while the phase ψ_i(a_1i, a_2i, b_i) depends on both the time delay and the time scale factor. Although these changes in amplitude and phase with sound source movement make the direct coherent integration of P_s12i(f) impossible, coherent integration of the cross-power spectrum can be realized by compensating for the time scale factor and the time delay.
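The role of the time delay in the cross-power spectrum phase can be checked with a short sketch. Parameters are illustrative; the scale factor is omitted here to isolate the delay term, and the tone frequency is chosen to fall exactly on an FFT bin so the phase can be read off without leakage effects.

```python
import numpy as np

# Sketch of the cross-power spectrum P_s12(f) = S1(f) S2*(f): a pure delay
# of the second segment appears as a phase of 2*pi*f0*tau at the tone bin.
fs, N = 1024.0, 4096
f0, tau = 140.0, 0.001      # illustrative tone frequency and delay
t = np.arange(N) / fs

s1 = np.cos(2 * np.pi * f0 * t)           # segment at hydrophone 1
s2 = np.cos(2 * np.pi * f0 * (t - tau))   # delayed copy at hydrophone 2

S1, S2 = np.fft.rfft(s1), np.fft.rfft(s2)
P12 = S1 * np.conj(S2)                    # cross-power spectrum

k0 = round(f0 * N / fs)                   # FFT bin of the tone (exactly 560)
phase_at_peak = float(np.angle(P12[k0]))
expected = 2 * np.pi * f0 * tau           # delay shows up as this phase
```

Reading the phase at the peak bin and dividing by 2πf0 recovers the delay, which is the basis of the time delay estimation used later.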

Methods for Doppler Factor and Time Delay Estimations
For the estimation of the time scale (Doppler) factor and the time delay, the joint estimation method based on the cross-ambiguity function [27,28] is widely employed in both the radar and sonar domains. The cross-ambiguity function of the received signals r_1(n) and r_2(n) can be expressed as

W_r2r1(a, b) = r_2^T r_1(a, b),  (6)

where W_r2r1(a, b) is the cross-ambiguity function, a and b are the time scale factor and time delay, r_1(a, b) denotes r_1(n) resampled with scale factor a and delayed by b, and T represents the transpose. With defined search ranges of a and b, a two-dimensional plane of W_r2r1(a, b) can be obtained. The time scale factor and time delay corresponding to the maximum of W_r2r1(a, b) are taken as the estimates, i.e.,

(a, b)_est = argmax_{a, b} |W_r2r1(a, b)|.  (7)

However, the joint estimation method requires a considerable amount of computation to locate the maximum on a two-dimensional plane. In addition, it is difficult to find this maximum when the target moves at low speed, even with a sufficiently fine search step. To estimate the time scale factor and time delay quickly and accurately, the two-step method proposed in our previous work [29] can be used. Instead of a joint search, the time scale factor and time delay are estimated step by step: first, a one-dimensional search is performed to estimate the time scale factor; then, the time delay is obtained from the phase of the cross-power spectrum compensated with the estimated time scale factor. This two-step method reduces the search dimension and thus avoids a large amount of computation.
The detailed principle and implementation of this two-step method can be seen in Appendix A.
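As a rough sketch of the joint search over the (a, b) plane described above, the grid evaluation can look as follows. A known replica signal and illustrative values of a and b are used here rather than the paper's received-data formulation; a pure tone keeps the example short, at the cost of a delay ambiguity of 1/f0, so the delay search range is kept below that.

```python
import numpy as np

# Two-dimensional grid search over a cross-ambiguity surface.
fs, N = 1024.0, 4096
f0 = 100.0
a_true, b_true = 0.004, 0.003   # illustrative scale factor and delay
t = np.arange(N) / fs

r2 = np.cos(2 * np.pi * f0 * ((1 - a_true) * t - b_true))  # received segment

def ambiguity(a, b):
    """|W(a, b)|: correlation of r2 with a scaled-and-delayed replica."""
    replica = np.cos(2 * np.pi * f0 * ((1 - a) * t - b))
    return abs(np.dot(r2, replica))

a_grid = np.linspace(0.0, 0.008, 41)   # scale-factor search range
b_grid = np.linspace(0.0, 0.005, 41)   # delay search range (< 1/f0)
W = np.array([[ambiguity(a, b) for b in b_grid] for a in a_grid])

ia, ib = np.unravel_index(np.argmax(W), W.shape)
a_hat, b_hat = a_grid[ia], b_grid[ib]
```

Even this toy grid requires 41 × 41 correlation evaluations, which illustrates why the one-dimensional two-step search is attractive.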

Compensations of Doppler Factor and Time Delay for Cross-Power Spectrum
Several methods for the estimations of time scale factor and time delay have been mentioned in the previous section. The specific compensation process of time scale factor and time delay, aimed at the cross-power spectrum, is illustrated in Figure 2. The two-step method demonstrated in Appendix A is used to estimate the time scale factor and time delay in Figure 2. In fact, the time scale factor and time delay estimated by any method can be used for the compensations of the cross-power spectrum according to the operations illustrated in the dashed rectangles of Figure 2.
As illustrated in Figure 2, ã denotes the search range of the time scale factor. r′_1i(n) and r′_2i(n) are the results of resampling r_1i(n) and r_2i(n), respectively, over the range ã. The amplitudes Am(R′_1i) and Am(R′_2i) are obtained from R′_1i(f) and R′_2i(f), i.e., the FFT (fast Fourier transform) results of r′_1i(n) and r′_2i(n). The amplitude Am(R_11) of r_11(n) is acquired in the same way. We then calculate the cross-correlation coefficients of Am(R′_1i) and Am(R_11), as well as of Am(R′_2i) and Am(R_11). The time scale factors a_1i and a_2i of r_1i(n) and r_2i(n) relative to r_11(n) can be estimated by searching for the values in ã corresponding to the maxima of the cross-correlation coefficients. r_1i(n) and r_2i(n) are resampled again according to the estimated a_1i and a_2i to acquire r″_1i(n) and r″_2i(n); the time scale factors of r_11(n), r″_1i(n), and r″_2i(n) then become the same. After this compensation of the time scale factor, the sound source frequencies estimated from each cross-power spectrum P_r12i(f) of r″_1i(n) and r″_2i(n) become the same, i.e., the frequency shifts are compensated with the estimated time scale factors.
Because the time scale factors of r_1i(n) and r_2i(n) differ, they should be compensated prior to time delay estimation. As indicated in Figure 2, the time scale factor a_i of r_2i(n) relative to r_1i(n) can be estimated by finding the value in ã corresponding to the maximum of the cross-correlation coefficients between Am(R′_1i) and Am(R′_2i). Then, r″_2i(n), which has the same time scale factor as r_1i(n), can be obtained by resampling r_2i(n) according to the estimated a_i. The phase Ph(P_r12i) can be acquired from the cross-power spectrum P_r12i(f) of r_1i(n) and r″_2i(n). The slope of Ph(P_r12i) divided by 2π gives the time delay estimate b_i. Finally, the time delay of the sound source at f_0 in the cross-power spectrum P_r12i(f) is compensated by multiplying by e^{−j2π f_0 b_i}.
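A compact sketch of the two-step procedure described above follows, using a known pure tone and illustrative parameter values (the paper's version in Appendix A operates on measured segment pairs). Step 1 finds the scale factor by a one-dimensional search that aligns the FFT magnitudes; step 2 reads the delay from the phase of the scale-compensated cross-power spectrum at the tone bin, a simplification of the phase-slope estimate used in the paper.

```python
import numpy as np

# Two-step estimation: (1) 1-D search for the time scale factor by
# correlating FFT magnitudes of resampled candidates; (2) delay from the
# phase of the scale-compensated cross-power spectrum.
fs, N = 1024.0, 4096
f0, a_true, b_true = 100.0, 0.004, 0.003   # illustrative values
t = np.arange(N) / fs
r1 = np.cos(2 * np.pi * f0 * t)
r2 = np.cos(2 * np.pi * f0 * ((1 - a_true) * t - b_true))

M = 3840   # analysis length; trims the tail distorted by edge clamping below

def resample(x, a):
    """Evaluate x at t/(1 - a), undoing a time scale factor a."""
    return np.interp(t / (1 - a), t, x)

# Step 1: search the scale factor that best aligns the magnitude spectra.
A1 = np.abs(np.fft.rfft(r1[:M]))
a_grid = np.linspace(0.0, 0.008, 41)
scores = [np.dot(np.abs(np.fft.rfft(resample(r2, a)[:M])), A1) for a in a_grid]
a_hat = a_grid[int(np.argmax(scores))]

# Step 2: compensate the scale, then read the delay from the cross-power
# spectrum phase at the tone bin and divide by 2*pi*f0.
r2c = resample(r2, a_hat)
P12 = np.fft.rfft(r1[:M]) * np.conj(np.fft.rfft(r2c[:M]))
k0 = round(f0 * M / fs)        # exact bin for the 100 Hz tone
b_hat = float(np.angle(P12[k0])) / (2 * np.pi * f0)
```

The search is one-dimensional (41 candidates here) instead of the 41 × 41 grid a joint ambiguity search would need, which is the computational advantage claimed for the two-step method.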


Implementation of Coherent Integration Using the Compensated Cross-Power Spectrum and Analysis of Integration Gain
After the aforementioned compensations of the time scale factor and time delay highlighted in Figure 2, the frequency and phase of the sound source in each compensated cross-power spectrum P_r12i(f) e^{−j2π f_0 b_i} are the same. The amplitude and phase of the compensated cross-power spectrum of the sound source signal can be written as

A′_i = A_i(a_1i, a_2i), ψ′_i = 0.  (8)

According to Equation (8), the compensated cross-power spectrum of the sound source signal in each data segment can be represented as

P′_s12i(f) = A_i(a_1i, a_2i) e^{j0} = A_i(a_1i, a_2i).  (9)

As is evident from Equation (9), the frequency estimate of each compensated cross-power spectrum is the same, and the phase is reduced to zero; these outcomes satisfy the conditions for coherent integration. The coherent integration result of P_s12i(f) can be written as

P_CI(f) = |Σ_{i=1}^{M} P′_s12i(f)|.  (10)

The incoherent integration obtained from the uncompensated cross-power spectra of the sound source signal in Equation (4) can be expressed as

P_II(f) = |Σ_{i=1}^{M} P_s12i(f)|.  (11)

Compared with the incoherent integration result in Equation (11), the additional integration gain offered by the coherent integration in Equation (10) can be written as

G(f) = 10 log_10 (P_CI(f) / P_II(f)).  (12)

For the harmonic signal at f = f_0, under ideal conditions in which the uncompensated phases are spread so that the sum in Equation (11) does not accumulate, the extra integration gain at f_0 is

G(f_0) = 10 log_10 M.  (13)

These derivations show that the integration gain provided by the compensated cross-power spectra under ideal conditions is 10 log_10 M, proportional to the number of integrations. The performance of incoherent integration is limited by the Doppler and phase differences in each data segment, which confine its gain. In the following sections, the performance of the proposed method and comparisons with other spectral estimation algorithms are elaborated through simulations and experimental data processing.
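The effect of the phase compensation on the integration can be sketched numerically. The tone, the per-segment delays, and the noise level are illustrative; the delays b_i are assumed known here, whereas the method estimates them from the data.

```python
import numpy as np

# Coherent integration of M segment cross-power spectra, with and without
# a per-segment phase compensation exp(-j*2*pi*f0*b_i).
fs, N, M = 1024.0, 1024, 20
f0 = 100.0
t = np.arange(N) / fs
rng = np.random.default_rng(1)
k0 = round(f0 * N / fs)           # tone bin (= 100)

P_comp = np.zeros(N // 2 + 1, dtype=complex)
P_raw = np.zeros(N // 2 + 1, dtype=complex)
for i in range(M):
    b_i = 0.0007 * i              # per-segment delay between the hydrophones
    r1 = np.cos(2 * np.pi * f0 * t) + 0.5 * rng.standard_normal(N)
    r2 = np.cos(2 * np.pi * f0 * (t - b_i)) + 0.5 * rng.standard_normal(N)
    P_i = np.fft.rfft(r1) * np.conj(np.fft.rfft(r2))
    P_raw += P_i                                    # no compensation
    P_comp += P_i * np.exp(-2j * np.pi * f0 * b_i)  # phase-compensated sum

# With compensation, the M segment spectra add in phase at the tone bin.
gain_db = 20 * np.log10(np.abs(P_comp[k0]) / np.abs(P_raw[k0]))
```

With the compensation the segment contributions at the tone bin align, while the uncompensated phases partially cancel, so the compensated sum stands several dB above the uncompensated one.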

Simulated Signals
The method proposed in [30] was applied to simulate a ship-radiated acoustic signal in this section. The characteristics of the signal acquired through this method are fairly close to practical ship properties. The frequencies of the discrete spectra contained in the ship-radiated acoustic signal were set at 134, 146, and 158 Hz, corresponding to power spectrum levels of 130, 128, and 125 dB, respectively, as illustrated in Figure 3a. The continuous spectrum was obtained as the output of white Gaussian noise passed through a bandpass filter, with a sampling frequency of 10 kHz. The level of the continuous spectrum increased at 2.82 dB/oct from 0 to 150 Hz and decreased at 2.10 dB/oct at higher frequencies; its maximum was set at 125 dB at a frequency of 150 Hz, as can be seen in Figure 3b. The SNRs of the discrete spectra in the simulated ship-radiated acoustic signal are 12, 5, and 3 dB.
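The simulated signal described above can be approximated with a short sketch. The tonal amplitudes and the frequency-domain noise shaping are illustrative stand-ins for the paper's bandpass-filter construction; the exponents 0.47 and 0.35 correspond to magnitude slopes of roughly +2.82 and −2.10 dB/oct, since a slope of f^k is 6.02·k dB per octave.

```python
import numpy as np

# Sketch of a simulated ship-radiated signal: three tonals at 134, 146 and
# 158 Hz plus a continuous spectrum rising below 150 Hz and falling above.
fs, dur = 10_000.0, 10.0
N = int(fs * dur)
t = np.arange(N) / fs
rng = np.random.default_rng(2)

tonals = sum(A * np.cos(2 * np.pi * f * t)
             for A, f in [(1.0, 134.0), (0.8, 146.0), (0.6, 158.0)])

# Shape white Gaussian noise in the frequency domain: ~ f^0.47 below 150 Hz
# (about +2.82 dB/oct) and ~ f^-0.35 above it (about -2.10 dB/oct).
W = np.fft.rfft(rng.standard_normal(N))
f = np.fft.rfftfreq(N, d=1 / fs)
fc = 150.0
shape = np.where(f < fc, (f / fc) ** 0.47, (fc / np.maximum(f, fc)) ** 0.35)
noise = np.fft.irfft(W * shape, n=N)

signal = tonals + noise
spectrum = np.abs(np.fft.rfft(signal))
```

The three tonal bins stand far above the shaped continuous spectrum, mirroring the discrete-plus-continuous structure assumed throughout the paper.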
The received signals of the two hydrophones were simulated using the ship-radiated acoustic signal described above. The conditions for array deployment and ship navigation are indicated in Figure 4. The two hydrophones, placed 5 m apart, were assumed to be suspended at a depth of 150 m and floating parallel to the water surface. The target ship sailed from S_1 to S_2 at a speed of 10 m/s. The direct sound signals received by R_1 and R_2 were the simulated signals used for the performance analysis of the different spectral estimation methods; distortions of the received signals due to the hydrophones were not considered in the simulation.

Figure 4. Side view of the geometric conditions for hydrophone deployment and ship track. The target ship sailed from S_1 to S_2 on the water surface. The hydrophones R_1 and R_2 were suspended at the depth DO (the vertical distance between the hydrophones and the water surface) and floated parallel to the water surface for acoustic signal acquisition.

Analysis of Narrowband Discrete Spectral Estimation Results Obtained from Different Methods

In this section, the algorithms listed in Table 1 were applied to the spectral estimation of the signals simulated in Section 3.1. Note that Cross-Power Spectrum Coherent Integration (CPSCI) is the coherent integration of the cross-power spectrum without compensation. The methods listed in Table 1 are appropriate for spectral estimation under the assumption that the background noise is white Gaussian noise. The SNRs of the ship-radiated acoustic signal relative to the background noise were set at 20 dB and −10 dB, respectively. The estimation results of all methods are displayed by low-frequency analysis and recording (LOFAR) in Figures 5 and 6. For the LOFAR processing of the methods listed in Table 1, the length of the sliding window is 30 s and the overlap is 25 s. In addition, the CS method is also used for LOFAR processing with a sliding window of 5 s and an overlap of 2.5 s, as illustrated in Figures 5a and 6a, corresponding to the different SNRs. The specific process of these methods for spectral estimation can be seen in Appendix B.
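The LOFAR display used here is essentially a sliding-window power spectrum. A minimal sketch follows, using the 30 s window and 25 s overlap quoted above; the test signal and sampling rate are illustrative.

```python
import numpy as np

def lofar(x, fs, win_s=30.0, overlap_s=25.0):
    """Sliding-window power spectra (one row per window position)."""
    win = int(win_s * fs)
    hop = int((win_s - overlap_s) * fs)
    rows = []
    for start in range(0, len(x) - win + 1, hop):
        seg = x[start:start + win] * np.hanning(win)
        rows.append(np.abs(np.fft.rfft(seg)) ** 2)
    return np.array(rows)

# Illustrative test signal: a single tonal standing in for a discrete
# spectral component of a ship-radiated signal.
fs = 1000.0
t = np.arange(int(120 * fs)) / fs
x = np.cos(2 * np.pi * 140.0 * t)
gram = lofar(x, fs)
```

Each row is one spectral estimate; stacking them over time produces the LOFAR-gram in which the tonal traces of Figures 5 and 6 appear as ridges.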
As shown in Figure 5a, the CS method can provide very high frequency resolution. In the top subfigure of Figure 5a, CS performs well with a short sliding window, reflecting the frequency changes in the discrete spectra over time caused by the target motion. However, inaccurate estimates of some frequencies lead to the poor continuity of the discrete spectra illustrated by LOFAR. Moreover, the spectral estimation result of CS is not satisfactory with a long sliding window: in the bottom subfigure of Figure 5a, it is difficult to identify the discrete spectra of the moving target from the LOFAR result. This suggests that the frequency changes across a large number of time-varying signal samples should be taken into consideration for a better spectral estimate. Compared with CS, the estimation result of MUSIC in Figure 5b shows that MUSIC has a better tolerance for time-varying signal processing. Except for the weakest discrete spectrum, the time-frequency characteristics of the other two discrete spectra with higher SNRs are reflected correctly in the LOFAR result of MUSIC.
The frequency changes indicated in the estimation results of CPSCI, CPSII, and CCPSCI in Figure 5c-e are not as clear as those of CS, owing to their limited frequency resolution. However, these three methods estimate the frequencies of the discrete spectra stably and represent the frequency variation accurately in comparison with CS and MUSIC. During the period in which the frequencies of the discrete spectra change rapidly with time (~20-50 s), the discrete spectra estimated by CPSCI and CPSII start to disconnect, because the integration gains they provide are insufficient to extract the weak discrete spectra at this time. In contrast to CPSCI and CPSII, CCPSCI (with the compensations of the Doppler factor and time delay) can estimate weak discrete spectra even through abrupt variations in frequency. In this way, the proposed method offers relatively stable and continuous discrete spectral estimates.
The estimation results of these methods at a low SNR of −10 dB are shown in Figure 6. The number of breaks in the discrete spectra estimated by CS in Figure 6a increases so much that the weakest discrete signal is almost undetectable in both subfigures. In addition, the decrease in SNR further increases the frequency errors of the discrete spectra estimated by MUSIC in Figure 6b. As for CPSCI and CPSII in Figure 6c,d, there are obvious gaps in the discrete spectra, which would make target detection and identification almost impossible in real scenarios. Moreover, the performance of CPSCI, which uses traditional coherent integration, deteriorates significantly under a low SNR due to the frequency and phase changes. The comparative outcomes in Figure 6e illustrate that CCPSCI can still provide a spectral estimate with a relatively high degree of continuity and stability, even at a low SNR.
Based on a comprehensive analysis of the above methods, we summarize the following conclusions. Firstly, the performance of the classical CS algorithm for discrete spectral estimation depends, to some extent, on the SNR and on the premise of a stationary signal, and it may not be able to detect the weak discrete spectrum in real scenarios. Similarly, the estimation error of the discrete spectrum obtained from MUSIC increases significantly as the SNR decreases. Although increasing the number of snapshots can improve the performance of the MUSIC algorithm [31], the computation time also increases dramatically [32]. Compared to CS and MUSIC, more stable spectral estimates are provided by CPSCI, CPSII, and CCPSCI, which are based on Fourier theory. However, under the influence of the Doppler phenomenon, the integration gains obtained from CPSCI and CPSII are so limited that the discrete spectra cannot be detected during the period of rapid frequency change, and the resulting discontinuous discrete spectra have a negative impact on target recognition. By contrast, the proposed CCPSCI effectively overcomes the influence of Doppler on the integration gain through the Doppler factor and time delay compensations, which in turn allows the coherent integration of the cross-power spectrum. Consequently, the discrete spectra estimated by CCPSCI are stable, continuous, and accurate throughout the detection process.
It is noted that cross-power spectrum coherent integration (CPSCI) denotes the coherent integration of the cross-power spectrum without compensation. The methods listed in Table 1 are applied to spectral estimation under the assumption that the background noise is white Gaussian noise. The SNRs of the ship-radiated acoustic signal relative to the background noise were set to 20 dB and −10 dB, respectively. The estimation results obtained from the methods above are all displayed by low-frequency analysis and recording (LOFAR) in Figures 5 and 6. For the LOFAR processing of the methods listed in Table 1, the length of the sliding window is 30 s and the overlap is 25 s. In addition, the CS method is also used for LOFAR processing with a sliding window of 5 s and an overlap of 2.5 s, as illustrated in Figures 5a and 6a for the two SNRs. The specific process of these methods for spectral estimation can be found in Appendix B.
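The LOFAR processing described above (a 30 s sliding window with 25 s overlap, i.e. a 5 s hop) can be sketched generically as follows. The function name, the Hanning window, and the simple per-window periodogram are illustrative assumptions, not the paper's exact processing chain.

```python
import numpy as np

def lofar(x, fs, win_s=30.0, overlap_s=25.0):
    """Build a LOFAR matrix (time x frequency, in dB) from one channel by
    taking a windowed periodogram over each sliding window."""
    win = int(win_s * fs)
    hop = int((win_s - overlap_s) * fs)
    rows = []
    for start in range(0, len(x) - win + 1, hop):
        seg = x[start:start + win]
        spec = np.abs(np.fft.rfft(seg * np.hanning(win))) ** 2
        rows.append(10 * np.log10(spec + 1e-12))    # dB scale
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    times = np.arange(len(rows)) * hop / fs
    return np.array(rows), times, freqs
```

Each row of the returned matrix is one spectrum; stacking the rows against `times` gives the waterfall display shown in Figures 5 and 6.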
Figure 6. (c,d) The discontinuity of the discrete spectra estimated by CPSCI and CPSII is obvious, which brings difficulties to target identification and classification. (e) CCPSCI can still provide a spectral estimation result with a relatively high degree of continuity and stability.
In addition to the spectral estimation performance evaluated above, the computation time required for spectral estimation is an important measure of a method's engineering value. Obtaining the spectral estimation results with CS and MUSIC took hours in our simulations, whereas the methods based on classical spectral estimation theory (CPSCI, CPSII, and CCPSCI) require far less computation time. Although the proposed CCPSCI needs some extra time for spectral estimation (several minutes) compared with CPSCI and CPSII (less than 1 s), the computation of CCPSCI can be further reduced with an improved estimation process for the time scale factor in our future research.

Extra Integration Gain
As mentioned above, coherent integration can provide more integration gain than incoherent integration. Under the simulation conditions of Sections 3.1 and 3.2, the extra integration gain provided by CCPSCI compared with CPSII can be calculated according to Equation (13) in Section 2.3, and is illustrated in Figure 7.
The extra integration gains provided by CCPSCI for the discrete spectra at 134, 146, and 158 Hz are shown in Figure 7. The horizontal axis of this figure represents the movement time of the target ship, which corresponds to the vertical (time) axis of the LOFAR displays in Figures 5 and 6. In Figure 7, the extra integration gain varies with the movement time of the target ship because the time scale factor changes with the target motion. From the simulation environment and the spectral estimation results, it can be seen that the frequency shifts caused by Doppler are most obvious during the period of ~20-50 s. During this period, CCPSCI offers ~6-8 dB of extra integration gain compared with CPSII. As shown in Figure 7, the extra integration gains of the three discrete spectra differ slightly: under the same conditions, the frequency shift of a higher-frequency discrete spectrum is larger, so the performance of incoherent integration is worse there. Therefore, the proposed CCPSCI method provides more extra integration gain for the discrete spectrum at a higher frequency.
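The mechanism behind this extra gain can be illustrated with a toy computation. This is not Equation (13) itself, just a sketch under an assumed constant residual phase per segment: an uncorrected per-segment phase rotation collapses a coherent sum, while after ideal compensation the sum grows linearly with the number of integrated segments K.

```python
import numpy as np

K = 20
phase_drift = np.pi / 4                       # assumed residual phase per segment
z = np.exp(1j * phase_drift * np.arange(K))   # uncompensated segment phasors

coh_uncomp = np.abs(z.sum())                  # coherent sum, no compensation
coh_comp = np.abs((z * np.conj(z)).sum())     # after ideal phase compensation: K
extra_gain_db = 20 * np.log10(coh_comp / coh_uncomp)
```

With these assumed numbers the compensated sum reaches the full value K = 20, while the uncompensated sum stays far below it; the dB ratio is the kind of extra gain that Figure 7 quantifies properly via Equation (13).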
The averages of the power spectra estimated by CPSCI, CPSII, and CCPSCI during the period of ~20-50 s were calculated from the normalized spectral estimation results in the corresponding time period of Figures 5 and 6 and are shown in Figure 8. The frequency changes in the discrete spectra were ignored when calculating the averages. The value of the power spectrum corresponding to each discrete spectrum is listed in Table 2.
Through comparison and analysis of the averaged power spectra shown in Figure 8 and Table 2, it can be seen that the performance of CPSCI for signal enhancement is unsatisfactory. The processing gain provided by CPSCI is particularly limited due to the influence of Doppler at low SNR. Compared with CPSII and CCPSCI, the frequency and phase errors in CPSCI cause a loss of integration gain. However, the proposed CCPSCI can reduce this loss by correcting the frequency shifts and phase changes in the cross-power spectrum. As shown in Table 2, CCPSCI provides an additional ~2-4 dB of integration gain compared with the other methods.
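The time averaging used for this comparison can be sketched as follows, assuming a LOFAR matrix in dB with one row per time step and per-row peak normalization (function and variable names, and the choice of normalization, are illustrative):

```python
import numpy as np

def average_power_spectrum(lofar_db, times, t_start=20.0, t_stop=50.0):
    """Average the normalized LOFAR rows over a chosen time period
    (~20-50 s here, matching the comparison in the text)."""
    rows = lofar_db[(times >= t_start) & (times <= t_stop)]
    rows = rows - rows.max(axis=1, keepdims=True)   # normalize each spectrum (dB)
    return rows.mean(axis=0)
```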

Analysis of Experimental Data Processing Using Different Methods
The simulations in Section 3 analyze the discrete spectral estimation performance of CS, MUSIC, CPSCI, CPSII, and CCPSCI. In this section, these methods are applied to experimental data processing and evaluations are based on the estimation outcomes. The experimental data used for the analysis were collected in an oceanic experiment. The passive sonar array was deployed at a depth of 170 m and the element spacing was kept at 0.3 m. The data received by two adjacent hydrophones of that particular sonar array were used for signal processing. In order to compare the performance of these methods conveniently, the data of the target ship cruising in the proximity of the array at a speed of 12 knots (approximately 6 m/s) were selected for discrete spectral estimation.
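A rough order-of-magnitude check of the Doppler effect in this scenario (assuming a nominal underwater sound speed of 1500 m/s, which is not stated in the experiment description, and a directly approaching target as the worst case):

```python
# Worst-case Doppler for the experimental scenario (illustrative numbers).
v = 6.0        # target speed, m/s (12 knots)
c = 1500.0     # assumed sound speed in water, m/s
alpha = 1.0 + v / c             # time scale factor for a directly approaching target
f0 = 251.0                      # one of the observed discrete spectra, Hz
shift_hz = f0 * (alpha - 1.0)   # maximum frequency shift at this line, ~1 Hz
```

A shift of roughly 1 Hz is comparable to the line spacing resolvable in a 30 s window, which is why uncompensated integration smears the discrete spectra.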
A classical spectral estimation method, the periodogram, is used for LOFAR processing of the data from one hydrophone, with a sliding window of 5 s and an overlap of 2.5 s. The LOFAR result estimated by the periodogram is displayed in Figure 9 and serves as a baseline for the comparative analysis of the spectral estimation methods discussed in the previous section.

As indicated in Figure 9, many discrete spectra can be extracted by the periodogram in the current time period and frequency band. However, it is very difficult to determine which discrete spectra belong to the target ship in this situation, where multifarious discrete spectra mingle without any regular features. Therefore, it is necessary to process the received data further. The algorithms listed in Table 1 are applied to the data, and their LOFAR results are illustrated in Figure 10. For all the methods, LOFAR processing is operated with a sliding window of 30 s and an overlap of 25 s. In addition, the LOFAR results obtained by CS using two different sliding windows are shown in Figure 10a. The detailed process of these methods for the spectral estimation of the experimental data can be found in Appendix B.
Figure 10. (a) Compared with the periodogram, a few discrete spectra are enhanced by CS; however, it is difficult to extract the characteristic discrete spectra from the CS results in both subfigures. (b) In the estimation result from MUSIC, the discrete spectra at 225, 238, and 251 Hz have an octave relationship. (c-e) The LOFAR results of CPSCI, CPSII, and CCPSCI are much improved in comparison to CS and MUSIC; the discrete spectra at 199, 212, 225, 238, 251, and 301 Hz obtained from these three methods can be regarded as the characteristic discrete spectra of the target.
Compared with the periodogram, the background noise in the CS estimation result of Figure 10a obtained with the 5 s sliding window is more obvious. In the bottom subfigure of Figure 10a, the background noise is much reduced with a longer sliding window. Although a few discrete spectra are enhanced in comparison with the periodogram, it is still hard to detect and identify characteristic discrete spectra from the CS spectral estimation results.
As for the result estimated by MUSIC in Figure 10b, there are discrete spectra at 225, 238, and 251 Hz in the frequency band of ~220-260 Hz, and these discrete spectra have an octave relationship. However, they have short durations and are simultaneously affected by other spectra. Thus, we can only deduce that these discrete spectra are probably the characteristic tonal spectra of the target ship.
The LOFAR results of CPSCI, CPSII, and CCPSCI in Figure 10c-e are significantly improved in comparison to CS and MUSIC, which is reflected in the reduction in background noise and the enhancement of the discrete spectra. Similar to the simulation results, the integration gains obtained from CPSCI and CPSII decrease during the period in which the frequencies of the discrete spectra change rapidly, which leads to discontinuities in the discrete spectra shown by LOFAR. As a result, CPSCI and CPSII are not suitable for estimating weak discrete spectra severely affected by Doppler. To extract weak discrete spectra under the influence of Doppler, the CCPSCI algorithm, based on the compensations of time scale factor and time delay, can play an important role, because the proposed method provides sufficient integration gain by correcting the frequency shifts and phase changes in the cross-power spectrum. Therefore, the discrete spectra obtained from CCPSCI have good continuity, which helps to track the target ship and aids identification. As shown in Figure 10e, the discrete spectra at 199, 251, and 301 Hz have the strongest energy, the longest duration, and obvious frequency shifts with time. These discrete spectra may be regarded as the characteristic discrete spectra associated with the diesel engine, shaft, and valve piston of the target ship, respectively. Simultaneously, five discrete spectra in the frequency band of ~190-270 Hz show a 13 Hz octave relationship, which represents the ignition frequency characteristic of a diesel engine.
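A simple check for such a 13 Hz comb among detected line frequencies can be sketched as follows. This is a greedy illustration assuming the lowest detected line starts the family; the function and parameter names are ours.

```python
def lines_with_spacing(lines_hz, spacing=13.0, tol=1.0):
    """Collect detected line frequencies whose neighbours sit ~spacing
    apart, as expected for an ignition-rate comb (13 Hz per the analysis)."""
    lines = sorted(lines_hz)
    family = [lines[0]]
    for f in lines[1:]:
        # extend the family only if the gap to the last member matches
        if abs(f - family[-1] - spacing) <= tol:
            family.append(f)
    return family
```

Applied to the lines discussed above, the 199-251 Hz group forms one family while the 301 Hz line falls outside it.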
In Figure 10c-e, the averages of the power spectra estimated by CPSCI, CPSII, and CCPSCI are calculated over the period of ~20-40 s. The averaged power spectra are shown in Figure 11, and the exact averages of the discrete spectra are listed in Table 3.
Figure 11. Averages of power spectra estimated by CPSCI, CPSII, and CCPSCI in experimental data processing. CPSII and CCPSCI improve the SNRs of the discrete spectra much more than CPSCI; however, the discrete spectrum at 262 Hz cannot be detected by CPSII. Compared with CPSCI and CPSII, CCPSCI can provide an additional 2-6 dB of integration gain.
It can be seen in Figure 11 that the interference from undesired discrete spectra is stronger in the spectral estimation result of CPSCI than in those of CPSII and CCPSCI, so the characteristic discrete spectra are easier to identify in the CPSII and CCPSCI results. From the averages of the discrete spectra in Table 3, it can be concluded that CCPSCI provides an additional 2-6 dB of integration gain compared with CPSCI and CPSII. Consequently, the proposed method is appropriate for target detection in time-varying coherent channels.

Conclusions
In order to estimate the narrowband discrete components of ship-radiated acoustic signals for target detection, a coherent integration method for the cross-power spectrum with time scale factor and time delay compensations was proposed in this paper. The data received by two hydrophones were modeled and applied to cross-power spectrum estimation. With the estimates of the time scale factor and time delay, the frequency shifts and phase changes in the cross-power spectra can be compensated. Thus, coherent integration can be performed with these compensated cross-power spectra in order to enhance and extract the discrete spectra in the ship-radiated acoustic signal. The performance of CS, MUSIC, cross-power spectrum coherent integration (CPSCI), cross-power spectrum incoherent integration (CPSII), and the proposed compensated cross-power spectrum coherent integration (CCPSCI) was analyzed in simulations and experimental data processing. The conclusions are summarized as follows.
The classical CS method used in this paper is not appropriate for spectral estimation at a low SNR or over the long duration of a non-stationary signal. The performance of MUSIC can be enhanced by using more snapshots; however, longer data contain more frequency variation, which limits the improvement obtained by increasing the number of snapshots.
When the frequencies of the discrete spectra fluctuate greatly under the influence of Doppler, the integration gain provided by CPSCI and CPSII is too low to estimate the discrete spectra, and the phase error in CPSCI causes an additional loss of integration gain. The proposed method (CCPSCI) corrects the frequency shifts and phase changes in the cross-power spectrum and thereby provides a satisfactory integration gain under the influence of Doppler.
Our proposed method has advantages in spectral estimation quality, computation time, and Doppler compensation. It can be applied to long-duration coherent integration to enhance and estimate weak narrowband components of ship-radiated acoustic signals in a time-varying coherent channel. Moreover, the proposed method provides feasible and convincing estimates of the discrete spectra and can be readily implemented in engineering applications, supporting passive detection and timely warning in real-time scenarios.