Signal Classification Algorithms over Time Selective Channels

Abstract: In this work, we propose a general framework for designing signal classification algorithms over time selective channels for wireless communications applications. We derive an upper bound on the maximum number of observation samples over which the channel response is essentially invariant. The proposed framework divides the received signal into blocks, each shorter than this bound. These blocks are then fed into a number of classifiers in parallel. A final decision is made through a well-designed combiner and detector. As a case study, we apply the proposed framework to a space-time block code classification problem by developing two combiners and detectors. Monte Carlo simulations show that the proposed framework achieves excellent classification performance over time selective channels compared to conventional algorithms.


Introduction
Signal classification was first motivated by its usage in military applications such as electronic warfare, spectrum surveillance, and interference identification. With the growing popularity of reconfigurable radios, it becomes an important technology for commercial applications as well because it enables the optimization of transmission parameters according to their interactions with the environment [1,2].
Modulation classification has recently attracted a great deal of interest from academia, industry, and global standardization bodies [3-5]. Moreover, classification of space-time block coding (STBC) is introduced in [6,7]. In addition, channel coding classification algorithms are proposed in [8,9]. Generally, the existing classification algorithms are divided into two types: likelihood-based and feature-based. The former calculates a likelihood function of the noisy received signal and applies a maximum likelihood (ML) classifier to complete the process [10]. The latter extracts features of the distorted received signal by exploring its characteristics and then makes a decision [11]. Recently, machine learning and neural network algorithms have also been considered for signal classification [12,13].
Designing classification algorithms for wireless applications is typically challenging since classifiers have no (or limited) a priori knowledge of transmission parameters such as channel state information, synchronization parameters, the distribution and power of the noise, and transmitted data symbols. With mobility, channel variation over the interval of observation creates a further issue of concern [14].
The key idea of the previously published works on signal classification over time selective channels for multiple-input multiple-output (MIMO) systems was to blindly separate the MIMO source symbols using constant modulus algorithms and then perform signal classification using intelligent learning algorithms (e.g., [15] and references therein). However, this approach requires offline training, which may not be available in many practical scenarios.
Alternatively, in this paper, we propose a general framework applicable to diverse signal classification scenarios over time selective channels without the need for offline training. The framework relies on a bank of conventional classifiers, each designed to operate in time-invariant channels. Each classifier processes a part of the observation samples whose length is less than or equal to a certain value. Then, a well-designed combiner and detector are used to make a final decision. As a case study, we apply the proposed framework to an STBC classification problem over time selective channels by proposing two novel combiners and detectors.
The paper is organized as follows. In Section 2, we introduce the observation model. The proposed framework and its application on STBC signal classification are provided in Sections 3 and 4, respectively. In Section 5, simulation results are presented. Finally, the work is concluded in Section 6.
Notation: Throughout this paper, $E[\cdot]$ denotes the expectation with respect to all randomness in the argument; $J_0(\cdot)$ refers to the zeroth-order Bessel function of the first kind; $F^{-1}(x)$ is the inverse function of $F(x)$; $\hat{\kappa}$ is the estimated value of the unknown value $\kappa$; $|I|$ is the cardinality of the set $I$; $f_d$ is the maximum Doppler shift; $f_d^n$ is the maximum Doppler shift normalized to the total bandwidth; $T_s$ is the sampling duration; $U$ is the number of observation samples; $\lambda$ is the transmission wavelength; $h_{fi}(g)$ is the $g$th channel coefficient between transmit-antenna $f$ and receive-antenna $i$; $G$ denotes the channel length; $M$ is the number of classifiers; $L$ is the number of received samples that can be handled by a single classifier; and $P_f$ is the predefined probability of false detection.

Observation Model
In this section, we investigate the properties of time selective channels, with the aim of determining an upper bound formula on the number of observation samples such that the channel has an approximate time-invariant behavior over those samples. Later on, we make use of this upper bound to properly design a signal classification algorithm operating over time selective channels.
We consider a narrow-band communication system operating over a time selective channel. The received signal at time instant $t$, $y(t)$, is expressed as

$$y(t) = h(t)\,x(t) + w(t), \tag{1}$$

where $x(t)$, $h(t)$, and $w(t)$ refer to the transmitted signal, the channel gain, and the noise component, respectively. (To simplify the analysis, we assume a single-tap time-varying channel. However, the upper bounds in (3) and (4) are also valid for multipath channels.) Note that Equation (1) is expressed in a general form, which applies to any modulation format with any pulse shaping. The normalized autocorrelation of the fading envelope at time lag $\tau$ is given by [16]

$$R(\tau) = \frac{1}{\sigma_h^2}\,E\left[h(t)\,h^*(t+\tau)\right] = J_0(2\pi f_d \tau), \tag{2}$$

where $E[\cdot]$ and $^*$ stand for the expectation and complex conjugate operators, respectively. Here, $\sigma_h^2$ is the variance of $h(t)$, $J_0(\cdot)$ denotes the zeroth-order Bessel function of the first kind, and $f_d$ is the maximum Doppler shift. Figure 1 shows $R(\tau)$ as a function of the parameter $2\pi f_d \tau$.
The channel is essentially invariant if $R(\tau) \ge 0.9$, which corresponds to $|2\pi f_d \tau| \le 0.64$, as indicated in Figure 1. The maximum time lag satisfying this inequality is

$$\tau_{\max}^{(0.9)} = \frac{0.64}{2\pi |f_d|},$$

where the superscript (0.9) refers to the correlation level of 0.9. We denote $f_d^n = f_d T_s$ as the maximum Doppler shift normalized to the total bandwidth, $1/T_s$, where $T_s$ is the sampling duration. In the discrete time domain, the lag $\tau_{\max}^{(0.9)}$ spans $L_{\max}^{(0.9)}$ samples of duration $T_s$, so that

$$L_{\max}^{(0.9)} = \frac{0.64}{2\pi |f_d^n|}. \tag{3}$$

According to Equation (3), $L_{\max}^{(0.9)}$ gets smaller with increasing $|f_d^n|$, which, in turn, reduces $L$ and negatively affects the final classification performance. To relax this limitation on $L$, we also consider the channel as essentially invariant if $R(\tau) \ge 0.5$. Following a similar approach, one can express the upper bound in this case, $L_{\max}^{(0.5)}$, as in Equation (4).

Both conditions, $R(\tau) \ge 0.9$ and $R(\tau) \ge 0.5$, are widely used in the literature [16] in the definition of the channel coherence time. Note that the coherence time is the interval of $\tau$ over which the function $J_0(2\pi f_d \tau)$ remains almost constant around its maximum. A small coherence time yields a small number of received samples that see an approximately time-invariant channel, and processing so few samples reduces the classification performance. Conversely, a large coherence time yields a large number of received samples that see an approximately time-invariant channel, which improves the classification performance.
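As a numerical sanity check on the 0.64 figure quoted above, the following minimal sketch evaluates $J_0$ through its power series and uses bisection to find the argument at which it falls to a given correlation level. The function names (`bessel_j0`, `max_argument`) are illustrative and not part of the original work; the series truncation and the search interval [0, 2.4], on which $J_0$ is decreasing, are assumptions of this sketch.

```python
def bessel_j0(x, terms=30):
    """Zeroth-order Bessel function of the first kind via its power series.

    J0(x) = sum_k (-1)^k (x/2)^(2k) / (k!)^2; adequate for small |x|.
    """
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Ratio of consecutive series terms: -(x/2)^2 / (k+1)^2
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total


def max_argument(rho, lo=0.0, hi=2.4, iters=60):
    """Largest x in [lo, hi] with J0(x) >= rho, found by bisection.

    Assumes J0 is decreasing on [lo, hi], which holds on [0, 2.4].
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bessel_j0(mid) >= rho:
            lo = mid
        else:
            hi = mid
    return lo


# Argument |2*pi*f_d*tau| at which the correlation drops to 0.9
x_09 = max_argument(0.9)
print(round(x_09, 2))
```

The same routine can be called with another correlation level to explore how the admissible range of $|2\pi f_d \tau|$ widens as the invariance criterion is relaxed.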
It is worth mentioning that most of the classification algorithms reported in the literature fail to achieve satisfactory performance under time selective channels because they rely on observation periods greater than the corresponding upper bound. For example, the classifier reported in [17] performs poorly under a time selective channel with $f_d^n = 10^{-5}$. The reason is that it relies on 69,000 observation samples, whereas the upper bound ($L_{\max}^{(0.5)}$) is 30,000 samples.

Proposed Framework
In the previous section, we showed that the number of observation samples, $L$, should not exceed a certain value (either $L_{\max}^{(0.5)}$ or $L_{\max}^{(0.9)}$) in order to avoid dramatic variations in the fading envelope. However, this constraint degrades the performance of a classifier, especially when $f_d^n$ is relatively high. We propose the following framework to overcome this issue. We denote $U$ as the number of observation samples, where $U$ is greater than $L_{\max}$. These samples are divided into $M$ adjacent blocks, each of length $L \le L_{\max}$. Each block is fed into a signal classifier. The result is a bank of $M$ signal classifiers operating in parallel, each of which sees an almost constant channel coefficient in the time domain. The outputs of these $M$ classifiers need to be appropriately combined to reach a final decision through a well-designed detector. The structure of the framework is shown in Figure 2. The design of the combiner and the detector differs for different applications. The following practical aspects should be taken into account.
• The value $f_d^n$ should be provided as a priori information to determine the upper bound on the number of processed samples per classifier, and then the appropriate length of each block. To this end, we assume that the receiver is equipped with a speed meter to measure the relative velocity between the transmitter and receiver, $v$, from which the maximum Doppler shift follows as $f_d = v/\lambda$.
• We assume that the receiver has a rough estimate of the received signal bandwidth, and hence of $T_s$. Therefore, $f_d^n$ can be computed by using $f_d^n = f_d T_s$.
These assumptions can be readily satisfied in practice.
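The block-partitioning step of the framework can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`normalized_doppler`, `max_block_length`, `split_into_blocks`) are hypothetical, the standard relation $f_d = v/\lambda$ is used for the Doppler shift, the default constant 0.64 is the one quoted in Section 2 for the condition $R(\tau) \ge 0.9$, and the numerical values of the velocity, wavelength, and sampling duration are purely illustrative.

```python
import math


def normalized_doppler(v, wavelength, t_s):
    """Normalized maximum Doppler shift f_d * T_s, with f_d = v / wavelength."""
    return (v / wavelength) * t_s


def max_block_length(f_n_d, x_max=0.64):
    """Largest integer L with 2*pi*|f_n_d|*L <= x_max.

    x_max = 0.64 corresponds to the R(tau) >= 0.9 condition; another
    constant can be passed for a looser invariance criterion.
    """
    if f_n_d == 0:
        raise ValueError("a time-invariant channel imposes no bound")
    return math.floor(x_max / (2.0 * math.pi * abs(f_n_d)))


def split_into_blocks(samples, block_len):
    """Split samples into adjacent blocks; a trailing partial block is dropped."""
    n_blocks = len(samples) // block_len
    return [samples[m * block_len:(m + 1) * block_len] for m in range(n_blocks)]


# Illustrative values only (assumed, not taken from the paper's experiments).
f_n_d = normalized_doppler(v=30.0, wavelength=0.15, t_s=1e-5)  # about 0.002
l_max = max_block_length(f_n_d)
received = list(range(2000))  # stand-in for U = 2000 received samples
blocks = split_into_blocks(received, block_len=l_max)
print(l_max, len(blocks))
```

Each element of `blocks` would then be passed to its own classifier. Dropping the trailing partial block is a design choice of this sketch; it could instead be merged into the last block as long as the bound is respected.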

STBCs Classification
With the growing involvement of MIMO technology in smart wireless applications, the classification of STBCs has gained much attention in the last few years. The previously reported investigations are limited to either frequency flat [18] or frequency selective channels [17,19], ignoring the time-varying nature of wireless channels. However, this should be taken into consideration when designing practical STBCs classifiers. In this section, we show how to implement the proposed framework in one of the reported STBCs classifiers [19]. In addition, we develop and analyze two novel combiners and detectors to compensate for the critical effect of time selective channels.

Preliminaries
We briefly describe the algorithm proposed in [19], which classifies the received signal as an Alamouti (AL) or a spatial multiplexing (SM) signal with the aid of multiple receive antennas. The $k$th sample of the received signal at antenna $i$ is expressed as

$$y_i(k) = \sum_{f} \sum_{g=0}^{G-1} h_{fi}(g)\, x_f(k-g) + w_i(k),$$

where $f$ is the transmit-antenna index and $x_f(k)$ is the $k$th sample transmitted from antenna $f$. Here $h_{fi}(g)$ is the $g$th channel coefficient between transmit-antenna $f$ and receive-antenna $i$, $G$ denotes the channel length, and $w_i(k)$ is the noise contribution. The key principle behind this algorithm is that the cross-correlation functions of the outputs of different receive antennas exhibit peaks at a particular set of time lags for AL and exhibit nulls for SM. This feature is exploited for classification by employing a false alarm rate (FAR) method. A more formal mathematical description follows. The estimated cross-correlation function, $\hat{C}_{i,i'}(\tau)$, between the outputs of the antenna pair $(i, i')$ at time lag $\tau$ can be written as

$$\hat{C}_{i,i'}(\tau) = C_{i,i'}(\tau) + e_{i,i'}(\tau),$$

where $C_{i,i'}(\tau)$ is the true cross-correlation function, the estimate is computed from the $U$ observation samples, and $e_{i,i'}(\tau)$ is the estimation error.
For AL, $C_{i,i'}(\tau)$ has non-zero values for $\tau = -G+1, \ldots, G-1$ and is zero otherwise. However, $C_{i,i'}(\tau)$ equals zero at all values of $\tau$ for SM. We denote $N_r$ as the number of receive antennas, $I$ as the set of receive-antenna pairs $\{(i, i') : i, i' = 0, 1, \ldots, N_r - 1,\ i \ne i',\ \text{and}\ i' > i\}$, and $|I|$ as the cardinality of the set $I$. A vector $\Lambda$ of length $|I|(2G - 1)$ is created by concatenating all values of $|\hat{C}_{i,i'}(\tau)|$ at $\tau = -G+1, \ldots, G-1$, $\forall (i, i') \in I$. We introduce $\alpha$ as the maximum value of the vector $\Lambda$, and $\alpha$ is then compared to a threshold value, $\lambda$. The AL signal is declared present if $\alpha \ge \lambda$; otherwise, the SM signal is chosen.

The threshold value is determined as $\lambda = F^{-1}(1 - P_f)$, where $P_f$ is the predefined probability of falsely detecting the SM signal due to the estimation error and $F^{-1}(\cdot)$ is the inverse of the function $F(\cdot)$. The function $F(\cdot)$ depends on $\hat{\sigma}^2_{i,i'}$, the variance of the estimation error, which is estimated over the lag interval $[S_2, S_3]$, where $S_2$ and $S_3$ are design parameters chosen arbitrarily to be much greater than the expected value of $G$. This guarantees that no peaks occur in the interval $[S_2, S_3]$, where $G \ll S_2 < S_3$. Since $G$ is not accurately known in practice, we replace $G$ with a design parameter, $S_1$, which is arbitrarily chosen to be close to $G$ such that $S_1 < G$. If no knowledge about $G$ is available, we set $S_1$ to 1.

Proposed Classification Algorithm over Time Selective Channels
We apply the proposed framework to the previously mentioned algorithm to obtain better performance over time-varying channels. The observation samples are divided into $M$ blocks, each of length $L \le L_{\max}^{(0.5)}$, and forwarded to $M$ classifiers in parallel. We assume that the outputs of the $m$th classifier are $\alpha_m$ and $\lambda_m$. Hereafter, the index $m$ is attached to the parameters $\alpha$ and $\lambda$ in order to identify the output of each classifier separately. The conceptual block diagram of the proposed algorithm is shown in Figure 3. In the following, we develop two novel combiners and detectors in order to make a final decision.

Combiner and Detector 1
We define the following indicator function for the $m$th classifier:

$$X_m = \begin{cases} 1, & \alpha_m \ge \lambda_m, \\ 0, & \text{otherwise}, \end{cases} \qquad m = 1, \ldots, M,$$

and collect the results in the vector $X = [X_1, \ldots, X_M]$. AL is declared present if the number of ones is greater than or equal to the number of zeros in the vector $X$; otherwise, SM is declared present. Put simply, each classifier makes its own decision independently and the final decision is made by majority vote. The majority-based logic (and, in general, a counting rule) possesses a number of relevant robustness properties, as shown in [20,21].
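The majority rule described above can be sketched in a few lines. This is a minimal illustration, assuming that each classifier's hard decision is 1 when $\alpha_m \ge \lambda_m$ and 0 otherwise, in line with the single-classifier test of the Preliminaries; the function name `majority_vote` and the sample values are hypothetical.

```python
def majority_vote(alphas, thresholds):
    """Hard-decision combiner: return "AL" or "SM" by majority of votes.

    alphas[m] and thresholds[m] are the outputs (alpha_m, lambda_m) of the
    m-th classifier. A tie is resolved in favor of AL, matching the
    "greater than or equal" rule in the text.
    """
    if len(alphas) != len(thresholds):
        raise ValueError("one threshold is required per classifier output")
    # Per-classifier hard decisions (assumed indicator: 1 if alpha_m >= lambda_m)
    x = [1 if a >= lam else 0 for a, lam in zip(alphas, thresholds)]
    ones = sum(x)
    zeros = len(x) - ones
    return "AL" if ones >= zeros else "SM"


# Hypothetical outputs of M = 3 classifiers sharing the same threshold
decision = majority_vote([0.9, 0.2, 0.6], [0.5, 0.5, 0.5])
print(decision)
```

Because only the sign of $\alpha_m - \lambda_m$ is used, a classifier that barely crosses its threshold carries the same weight as one that crosses it by a wide margin.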

Combiner and Detector 2
We define two functions of the classifier outputs $\alpha_m$ and $\lambda_m$, $m = 1, \ldots, M$; AL is declared present when the decision rule built on them is satisfied, and otherwise SM is declared present. For illustration, we assume $M = 3$ with ($\alpha_1 = 0.9$, …).

It is worth mentioning that the terminology of hard and soft decision combiners is widely employed in the literature for different applications with different implementation strategies. One example is their use in classifying mobile application traffic, where the combiners are based on probabilistic models that require training and learning [22]. The combiners proposed in this work relax these requirements, and a comparison with such combiners is not feasible because the proposed combiners do not rely on a data set.

Simulation Results
Monte Carlo simulations were carried out to examine the performance of the proposed framework. Unless otherwise stated, a quadrature phase shift keying (QPSK) modulation scheme was adopted. Each link had four statistically independent taps, each modeled as a zero-mean complex Gaussian random variable with an exponential power delay profile, $\sigma^2(g) = B_h \exp(-g/5)$, where $g = 0, \ldots, 3$ and $B_h$ was chosen such that the average energy was normalized to unity. The time selective nature of each tap in the time domain was described in terms of $f_d^n$ as shown in [23]. The number of receive antennas, $N_r$, was 2, the probability of false alarm, $P_f$, was $10^{-2}$, and the number of observation samples, $U$, was 2000. The peaks were searched using $S_1 = 1$, while the variance of $|\hat{C}_{i,i'}(\tau)|$ was computed over the interval [10, 85], i.e., $S_2 = 10$ and $S_3 = 85$. The probability of correct classification, $P_c(\lambda|\lambda)$, $\lambda$ = AL, SM, i.e., the probability that the code $\lambda$ is classified when it is present, was used as the figure of merit. Each set of simulations was run for 1000 trials. Table 1 summarizes the parameters of the simulation setup.
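The exponential power delay profile above can be reproduced with the following minimal sketch. It assumes that "normalized to unity" means the tap variances sum to one, so that $B_h = 1/\sum_g \exp(-g/5)$; the function names (`power_delay_profile`, `draw_taps`) are hypothetical. The sketch draws one static realization of the taps and does not model their time variation, which the paper describes through $f_d^n$ following [23].

```python
import math
import random


def power_delay_profile(num_taps=4, decay=5.0):
    """Tap variances sigma^2(g) = B_h * exp(-g / decay), scaled to sum to one."""
    raw = [math.exp(-g / decay) for g in range(num_taps)]
    b_h = 1.0 / sum(raw)  # assumed normalization: total average energy = 1
    return [b_h * r for r in raw]


def draw_taps(profile, rng):
    """One realization of independent zero-mean complex Gaussian taps."""
    taps = []
    for sigma2 in profile:
        # Variance sigma2 is split evenly between real and imaginary parts
        s = math.sqrt(sigma2 / 2.0)
        taps.append(complex(rng.gauss(0.0, s), rng.gauss(0.0, s)))
    return taps


profile = power_delay_profile()
taps = draw_taps(profile, random.Random(0))
print([round(p, 3) for p in profile])
```

Seeding the generator, as done here with `random.Random(0)`, keeps individual trials reproducible when a Monte Carlo loop is built around `draw_taps`.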

Table 1. Simulation parameters.

Parameter                                   Value
Modulation                                  QPSK
Number of taps per link                     4
Power delay profile                         $\sigma^2(g) = B_h \exp(-g/5)$, $g = 0, \ldots, 3$
Number of receive antennas, $N_r$           2
Probability of false alarm, $P_f$           $10^{-2}$
Number of observation samples, $U$          2000
$S_1$                                       1
$S_2$                                       10
$S_3$                                       85
Number of trials                            1000

Figure 4 shows the performance of the proposed framework at $f_d^n = 2 \times 10^{-3}$. Equation (4) indicates that, for this value of $f_d^n$, the number of observation samples per classifier should not exceed 119. Therefore, we split the 2000 observation samples into 20 blocks of 100 samples each. For the sake of comparison, we also provide the performance for the case of $f_d^n = 0$ and the performance of the algorithm proposed in [19]. As observed, a significant improvement is achieved when adopting the proposed framework. In addition, the second combiner and detector outperform the first. This is because the first combining scheme relies on the hard-decision outputs of the classifiers, while the second uses soft-decision outputs. Note that the classification performance of the SM code is predetermined by $P_f$, $P_c(\text{SM}|\text{SM}) = 1 - P_f$, which is independent of the SNR. Moreover, the peak detection of the AL code improves with increasing SNR at low and intermediate SNR values. However, at high SNR values, the effect of the estimation error dominates, leading to saturation of the classification performance.

Figures 5 and 6 illustrate the classification performance of the two proposed combiners and detectors for correlated channels at SNR = 12 dB and $f_d^n = 0.001$. The correlation matrices at the transmitter and receiver are given, respectively, as [24]

$$Z_t = \begin{bmatrix} 1 & z_t \\ z_t & 1 \end{bmatrix}, \qquad Z_r = \begin{bmatrix} 1 & z_r \\ z_r & 1 \end{bmatrix},$$

where $z_t$ and $z_r$ are the correlation coefficients between the transmit and receive antenna elements, respectively. The correlated channel matrix for the $l$th tap, $H(l)$, is modeled in terms of $Z_t$, $Z_r$, and $H_o(l)$ as in [24], where the components of $H_o(l)$ are complex Gaussian random variables with zero mean and variance $\sigma_h^2(l)$. The results show that the proposed algorithm provides satisfactory performance for AL codes up to $z_t = 0.4$ and $z_r = 0.4$.
Note that $z_t$ and $z_r$ do not affect the classification performance of the SM code because its performance is predetermined by $1 - P_f$, which is independent of the correlation coefficients.

Figure 7 shows the effect of the time window duration $L$ on the classification performance of the two proposed combiners and detectors at SNR = 12 dB and $f_d^n = 0.002$. As observed, the performance of the proposed combiners and detectors does not depend on $L$ for $L < 150$. However, a significant performance degradation occurs beyond that point. This agrees with the theoretical findings of Section 2: time selective channels degrade performance when the observation period exceeds the corresponding upper bound. Note that the corresponding bound for $f_d^n = 0.002$ is 119 samples.

Conclusions
This paper detailed the design of signal classifiers over time selective channels. An upper bound on the number of observation samples over which the channel response can be considered constant was derived. Adjacent segments of the received signal, each no longer than this bound, were applied to a bank of conventional classifiers. The final decision was made through a combiner and detector. The framework is general in the sense that it can be applied to any signal classifier over time selective channels. As a case study, we addressed STBC classification by proposing two novel combiners and detectors. Simulation results showed the robustness of the suggested framework against time-varying channels.