Blind Modulation Identiﬁcation of Underwater Acoustic MPSK Using Sparse Bayesian Learning and Expectation Maximization

: This paper presents a likelihood-based algorithm for identifying different phase shift keying (PSK) modulations, i.e., BPSK, QPSK, and 8PSK. This algorithm selects the modulation type that maximizes a loglikelihood function that is based on the known original constellation associated with the constellation of the received signals for the candidate modulation types. However, there are two problems in non-cooperative underwater acoustic Multiple Phase Shift Keying (MPSK) modulation identiﬁcation based on the likelihood method. One is the original constellation, which as prior information is unknown. The other is the underwater acoustic multipath channel makes the constellation distort seriously. In this paper, we solved these problems by combining sparse bayesian learning (SBL) with expectation maximization (EM). The speciﬁc steps are as follows. Firstly, blind channel equalization can be achieved by channel impulse response (CIR), which is estimated by sparse bayesian learning in single input multi output (SIMO) underwater acoustic channel. Subsequently, we used expectation maximization to compensate amplitude attenuation and phase offset, as the original constellation of MPSK is unknown. Finally, modulation can be successfully identiﬁed by the Quasi Hybrid Likelihood Ratio Test (QHLRT). The simulation results show that the channel estimation method based on SBL can eliminate the inﬂuence of channel effectively, and the EM algorithm can make the received constellation converge to the preset constellation in the case of unknown original transmit constellation, which effectively solves these two problems. We use the proposed SBL-EM-QHLRT method to achieve an identiﬁcation rate of more than 95% in underwater acoustic multipath channels with Signal to Noise Ratio (SNR) higher than 15 dB, which provides a new idea for modulation identiﬁcation of non-cooperative underwater acoustic MPSK.


Introduction
This paper deals with the blind modulation identification of signals transmitting through underwater acoustic multipath channel. We assume that the received signal is Multiple Phase Shift Keying (MPSK), i.e., BPSK, QPSK, and 8PSK, but do not assume any prior knowledge of the modulation order.
Underwater acoustic MPSK is widely used in information transmission; it is mainly used in high-speed communication scenarios [1,2]. MPSK can also be used as a part of Direct Sequence Spread Spectrum (DSSS) to achieve long range data transmission [3,4], and has been abundantly used for underwater Orthogonal Frequency Division Multiplexing (OFDM) subcarrier modulation [5][6][7]. With the development of underwater acoustic communication technology and the increasing demand in the field of underwater acoustic signal detection, modulation identification of non-cooperative MPSK has become a research hotspot. MPSK modulation identification is an important basis for parameter estimation, scrambling, and blind demodulation [8][9][10], particularly in military applications. Various authors have discussed the MPSK modulation identification in the literature and have proposed many schemes; however, most of the application scenarios are focused on the radio. These methods can be divided into two different groups: likelihood function-based and cyclic cumulant-based methods.
Regarding the MPSK modulation identification, various methods have been proposed based on higher-order cumulants and cyclic cumulants [11][12][13][14][15][16]. These methods assumed that the signals were in baseband.But in practical application, it is necessary to estimate the parameters of communication system in order to achieve baseband processing. Because the estimated parameters are susceptible to the channel, the higher-order cumulants and cyclic cumulants based method is not effective in practical application.
A different class of approaches to PSK modulation identification that employed the likelihood-based method. In [17,18], the method indicated that the system needed high SNR in order to obtain accurate identification results and this method is only suitable for noise channel. A similar likelihood-based method for identifying MPSK signals was described in [19]. However, this method needs to know the parameters of MPSK modulation. In [20], the authors introduced a likelihoodbased method for BPSK, QPSK, and 8PSK in multipath channels. This method does not take into account that the original constellation is unknown and the frequency-selective channel was slow-varying. In [21], a new MPSK modulation identification method based on likelihood function is proposed. This method identify modulation without knowledge of the symbol rate, carrier frequency and noise power in the received signals. However, this method is suitable for noise channel, its performance is poor in multipath channels. In non-cooperative environment, it is difficult to estimate the channel impulse response (CIR) of multipath channel. Therefore, the likelihood-based modulation identification method is not suitable for multipath channels generally. In [19], the authors concluded that the likelihood-based method performs well in modulation identification when the CIR and noise power are known. Therefore, the likelihood based method draws researchers' attention when the CIR and noise power are unknown in the multipath fading channel. Recently, lots of scholars used the method of expectation maximization (EM) to estimate the CIR and noise power in flat fading channel in view of MPSK and MQAM signals and then they used QHLRT to identify modulation [22][23][24].Through these analysis and studies, we found EM-QHLRT can increase the identification rate effectively and make us come up with some new ideas about modulation identification in fading channel. However, EM-QHLRT is only suitable for flat fading channels, but not for underwater acoustic multipath channels. Consequently, the likelihood based methods need concern how to reduce the impact of multipath fading channels.
In [25,26], researchers used the cross-relations in the multiple reception in order to complete channel estimation in single input multi output (SIMO) channel. In [25], a least square blind channel estimation method is proposed and the necessary condition for channel estimation based on cross-relations is analyzed in SIMO channel. In [26], an adaptive blind channel estimation algorithm is proposed. However, these blind channel estimation methods have huge amount of computation to do and do not consider the sparsity of underwater acoustic channel. In recent years, scholars have done lots of research on channel estimation methods based on sparse bayesian learning (SBL) and have achieved good result [27][28][29]. However, it is difficult to apply the SBL-based channel estimation method in the non-cooperative condition due to the lack of prior information. Moreover, the identification will fail because the original constellation cannot be achieved when the channel estimation result is applied to the likelihood method directly.
In this paper, we consider likelihood-based identification of modulation types including BPSK, QPSK, and 8PSK. Firstly, we use the cross-relations and SBL to estimate the CIR. Then, EM algorithm is used to further process the received signal after channel equalization. Finally, modulation can be identified successfully by likelihood-based method. The proposed method can effectively improve the identification rate in the underwater acoustic multipath channel. The proposed SBL-EM-QHLRT method provide the following contributions in this regard:

•
Aiming at the problem that the traditional likelihood-based modulation identification method cannot achieve the identification in underwater acoustic multipath channel, we propose a blind channel estimation method based on SBL, which can eliminate the impact of multipath on the identification effectively.

•
Because of the error of SBL blind channel estimation and the unknown original constellation mapping of MPSK, we model the signal processed by SBL as Gaussian mixture model (GMM), and use EM in orr to correct the constellation and compensate the error of channel estimation.
The rest of the paper is organized, as follows. Section 2 shows a SIMO underwater acoustic system model, and the cross-relation in SIMO channel is derived. A Novel method named SBL-EM-QHLRT is developed in Section 3. Section 4 discusses the performance of SBL-EM-QHLRT method through numerical simulation. Conclusions are drawn in Section 5.

System Model
The SIMO channel model which is showed in Figure 1 can be described as follows: where * represents convolution, K is the number of outputs, w(n) = [w 1 (n), w 2 (n), . . . , w K (n)] T represents the noise which obey the Gaussian distribution with mean value of zero and variance of σ . s(n) is the transmission signal of length N , x(n) = [x 1 (n), x 2 (n), . . . , x K (n)] T represents the received signal, h(n) = [ h1 (n), h2 (n), . . . , hK (n)] T represents the CIR, We assume that the length of CIR is L. Without considering the influence of noise, the received signals of every output have the following cross-relationship: Equation (2) can be expressed as a matrix form, as shown as: According to Equation (3): Every block of Equation (4) is a matrix of size N × L. We can obtain the following, as shown as: Adding constraints by setting h1 ( L/3 ) = 1, where · represents the operation of rounding down. The purpose of the constraints is to avoid zero solution, which will introduce the delay and gain to the final channel estimation results inevitably. However, in practice, the delay and gain will not affect the performance of the system. Equation (5) can be expressed as: where b is the column L/3 of the matrix X(n) , Ψ is the matrix of all the columns in X(n) except L/3 ,h is all elements except h1 ( L/3 ) = 1 in h . The CIR obtained by Equation (6) needs to meet two identifiable condition, which is described in detail in [25]. One condition is that the transmission function of each subchannel in h(n) = [ h1 (n), h2 (n), . . . , hK (n)] T has no common zero, and the other condition is that the autocorrelation matrix of the transmission signal s(n) is full rank.

Blind Channel Estimation Based on SBL
We can achieve the process of calculating CIR by using the method of SBL [30], which is based on Equation (6), due to the sparsity of underwater acoustic channel. The specific process is as follows. We replace − b with t and assume that the channel has noise that obeys the Gaussian distribution with mean value of 0 and variance of σ , Equation (6) can be expressed as: The likelihood function of t can be expressed as: The maximum a posteriori (MAP) ofh can be expressed as: The first term on the right of Equation (9) comes from the likelihood function and the second term comes from the prior distribution. SBL assumes the prior distribution of parameters firstly and then calculates the maximum posterior estimate. Assuming the prior distribution of CIR, as: where α = [α 1 , α 2 , . . . , α i , . . . , α KL−1 ] is called hyper-parameter and corresponds to the elements inh one by one. The value of element inh corresponding to α is zero When the value of element in α tends to infinity. In order to calculate MAP, we need to calculate p(h, α, σ|t) : The first term on the right of Equation (11) is posterior distribution ofh . According to the Bayesian rule: where If we get the value of α and σ , we can bring them into µ = σ −1 ∑ Ψ T t to get the MAP ofh [31]. The second term on the right of Equation (15) can be expressed as: Assuming α and σ obey the uniform distribution and we can achieve the estimation of α and σ by maximizing p(t|α, σ): The derivatives of p(t|α, σ) with respect to α and σ, while setting the derivatives to zero, we can get the new estimates for α and σ as: where ∑ ii is the ith diagonal element of ∑ , µ i is the ith element of µ . According to the above analysis, SBL sets the initial value of α and σ, threshold and iteration times. The iterative process can be showed as follow. Firstly, we calculate ∑ and µ , and achieve new α and σ. Secondly, we set the corresponding value inh to zero when the value in α is greater than the threshold. Then we repeat the iterative process for set times. Finally,h is only a small number of non-zero values and the channel equalization is realized.

Modulation Identification Method of EM-QHLRT
The performance of likelihood based identification method will be greatly affected because of the error of channel estimation and the unknown original constellation. Therefore, EM is used to compensate the error in channel estimation and the phase. Signal after channel equalization is modelled as GMM. We assume the number of Gaussian models is M and the probability density function of GMM can be expressed as [32]: where h = ae jϕ is channel coefficients, α is amplitude attenuation, ϕ is frequency offset, CN(r|s m , h, σ) is the probability density function of the mth Gaussian model, s m is the symbol of the original constellation.
Step Expectation aims to estimate the likelihood value of symbols regarding to every Gaussian model. Setting the hidden variable z n,m to represent that symbol r n comes from the mth Gaussian model.
Equation (17) can be expressed as: Taking logarithm for Equation (19): We define Q function as: where E(z n,m r n , h i , where h i and σ i represents the channel coefficients and noise power of the previous iteration, assuming the value of Equation (20) is γ n,m . We replace h with Re{h} + jIm{h} and Q function can be expressed as: Step Maximization aims to maximize the Q function. The derivatives of Q(h, σ, h i , σ i ) with respect to the Re{h} , Im{h} and σ, while setting the derivatives to zero, we get the new estimates, as shown as: Repeat Equations (22)- (26) until the end of the iteration. After the channel coefficient and noise power are estimated by EM, the modulation identification based on QHLRT can be expressed as: The result can be achieved by comparing the values of (27) under BPSK, QPSK, and 8PSK: where n {·} is the natural logarithm.

Parameter Setting
The simulation uses the channel model of single input and three outputs, and then we get three different CIR of underwater acoustic channel [33]. The generation process of channel is as follows. We assume the number of paths in three channels is 6, the average multipath delay is 14 ms, and the time delay of adjacent paths obeys the exponential distribution. We can achieve six random variables which are negative exponential distributed, then we calculate the sampling point positions delay_p corresponding to six paths. We also assume the amplitudes of paths are Rayleigh distributed with the average power decreasing exponentially with delay, where the difference between T_guard = 20 ms is p = 20 dB. Therefore, we can conclude that: where α = 1, t − {rms} = T_guard/ log(10 p/10 ). Now, we can achieve random variables that are Rayleigh distributed in each path, and then we assume that the value of random variable is (a + bj).
The amplitude of each path can be expressed as: where || is the modulus of complex number. Finally, the sparse multipath channel is generated by normalizing the amplitudes of all paths [34,35]. The noise included in simulation is the gaussian white noise in band, and a fixed frequency offset of 300 Hz is added in simulation in order to verify the phase compensation capability of the algorithm, while the constellation may rotate by down sampling, which make the receiving signal become the constellation. The sampling frequency of the communication system is 48 kHz, the carrier frequency is 8 kHz, and the bandwidth is 6 kHz. We select 100 groups of signals from BPSK, QPSK, and 8PSK, respectively, to calculate the identification rate. The simulation parameter settings are shown in Table 1.

Performance of SBL Blind Channel Estimation
The performance of blind channel estimation of SBL is verified by simulation, the multipath length of three outputs is assumed to be 900 sampling points. We use normalized mean square error (NMSE) to evaluate the performance of SBL channel estimation. Figure 2 shows the estimated CIR of the three channels. Figure 3 shows the difference between the original CIR and the CIR estimated by SBL. The first 900 sampling points correspond to channel 1, the 901-1800 sampling points correspond to channel 2, and the 1801-2700 sampling points correspond to channel 3. Figure 4 shows the NMSE of three channels. Refer to Figures 2-4, by using SBL, we can achieve the estimation of underwater acoustic sparse channel accurately with the increase of SNR. In addition, we also simulate the performance of channel estimation under time-varying channel. We set the simulation SNR to 20 dB. After each time interval, we can achieve 100 groups of random sparse channels [21], and then we use the proposed method to estimate these channels. Finally, we take the average to be the result as it is shown in Figure 5. From Figure 5, we can find that channel estimation accuracy is low as the channels' NMSE is higher than −10 dB when the time interval is 0.05 s or lower than 0.05 s. By contrast, the channels' NMSE is lower than −20 dB when the time interval is higher than 0.2 s, which shows that the method proposed in this paper is still effective when the time interval is about 0.2 s. Refer to Figures 4 and 5, due to the constraints that we set h1 ( L/3 ) = 1, the gain of channel 1 achieved by channel estimation cannot be further eliminated. Therefore, we can know the estimation of channel 2 and channel 3 are more accurate than channel 1. In this section, we verify the performance of the SBL-based blind channel estimation method. Refer to Figures 2-5, we found that the SBL-based channel estimation method can show good performance. After we obtained the estimated CIR, the channel equalization can be used to eliminate channel interference. However, even the channel interference is eliminated, the likelihood-based method cannot effectively identify MPSK signals in the non-cooperative condition.

Performance of SBL-EM-QHLRT
Two problems need to be considered in the likelihood method. The one is that the original constellation is unknown, and the other one is that the constellation will rotate when obvious frequency offset exist in the underwater acoustic and down sampling will also make the constellation rotate even the original constellation can be known. Obviously, the likelihood method cannot be used to identify modulation directly. Therefore, we use EM to solve these problems. EM sets the original constellation firstly, and then it estimates the amplitude attenuation and phase offset of the channel under all possible modulation modes.
The simulation process is as follows. QPSK modulation is adopted in simulation and the constellation is [0.707 + 0.707j, 0.707 − 0.707j, −0.707 − 0.707j, −0.707 + 0.707j], as it is shown in Figure 6a-c show the constellation of the received signal and the constellation after SBL channel estimation and equalization, respectively. By comparison original constellation and the constellation shown in Figure 6c, we can find that constellation will rotate after SBL channel estimation and equalization. In EM, the constellation of MPSK is arbitrarily determined, we set [1, 1j, −1, −1j] as original constellation in QPSK. We also need to set original constellation in BPSK and 8PSK. Figure 6d-f show the result of constellation compensated by EM under BPSK, QPSK and 8PSK. It can be found that constellation in Figure 6c has converged to constellation in Figure 6e that the influence of phase shift has been eliminated. Furthermore, the influence of the channel is further eliminated, which makes the constellation morepolymerized. Now, these two problems have been solved and the likelihood-based method can be used to identify MPSK modulation successfully. Next, we illustrate the influence of parameters of EM algorithm on its performance.
The influence of different iteration times on the performance of EM method is analysed by this simulation. We set the number of symbols of EM to 300 and the iteration times of EM to 4, 6, 8, and 10 respectively. Figure 7 is the identification rate curve of iteration times under different SNR, and we can find the identification rate is relatively better When iteration times is 8 and 10. Besides, taking the amount of computation into consideration. Therefore, we set iteration times to 8. Based on the above analysis, we can simulate the influence of different number of symbols on EM when the iteration times is 8. We set the number of symbols to 150, 300, and 450, respectively. The Figure 8 showed that the more the symbol number is, the better the identification rate is. When considering the amount of computation, we set the number of symbols to 300 finally.  In the end, we compared the identification rate by using SBL-QHLRT with frequency offset and without frequency offset, also by using EM-QHLRT without SBL and SBL-EM-QHLRT. As shown in the Figure 9 (here we have set EM iteration times to 8 and the number of symbols to 300). SBL-QHLRT with frequency offset cannot identify MPSK modulation successfully because the constellation rotation will make the QHLRT failed. EM-QHLRT also failed to identify MPSK modulation because EM is only suit for flat fading channel. SBL-QHLRT method with known original constellation and have no frequency offset can be useful, but it will be affected by channel estimation error. We can apparently find that SBL-EM-QHLRT can overcome the problem of constellation rotation and reduce the influence of underwater acoustic fading channel. Figure 10 shows the identification rate of BPSK, QPSK, and 8PSK under SBL-EM-QHLRT, of which BPSK has the highest identification rate and 8PSK has the lowest.
The simulation results show that EM can effectively improve the identification rate of underwater acoustic MPSK. When the number of iterations and the number of symbols of EM meet the requirements, EM can further eliminate the influence of channel and solve the problem of unknown original constellation. Therefore, the likelihood-based identification method can successfully identify MPSK modulation without any prior information.

Conclusions
This paper developed a novel likelihood-based modulation identification algorithm for MPSK signals in underwater acoustic multipath channels. The simulation results showed that the proposed method can identify the modulation mode of MPSK well in underwater acoustic multipath channel.
In the aspect of blind channel estimation, the NMSE of SBL based blind channel estimation algorithm is below −20 dB when the SNR is higher than 15 dB. In addition, EM algorithm makes the received constellation more convergent and solves the problem of unknown original constellation. In view of the simulation results, some discussions for the SBL-EM-QHLRT method are given, as follows: In the SBL channel estimation method, the channel 2 and 3 are more accurately estimated than channel 1 due to the setting constraint conditions on channel 1. Adding constraints will lead to gain and delay in the estimated CIR. Channel 2 and channel 3 can be normalized to eliminate the gain, but channel 1 cannot eliminate the gain due to constraints, which results in poor performance of channel 1 as compared with channel 2 and 3. The estimated result of channel 1 does not affect the identification rate, because we only need to take the received signal of one of the paths for processing.
The SBL blind channel estimation method proposed in this paper can also be applied to other MPSK modulation identification methods. After eliminating the channel interference, the features that are used for identification in these methods become more apparent.
From the simulation results, when the iteration times and symbol length of EM increases, the identification rate will be better, but the computational efficiency should be considered in practical use.
Further improvements of the blind likelihood-based modulation identification method for signals in different channel environment, such as Doppler shift and MIMO, is needed in the future.
Author Contributions: T.F. conceived the algorithm and designed the simulations; S.L. and L.Z. analyzed the results; Z.X. and X.W. revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: