Article

Differentiated Embedded Pilot Assisted Automatic Modulation Classification for OTFS System: A Multi-Domain Fusion Approach

Ocean College, Jiangsu University of Science and Technology, Zhenjiang 212100, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(14), 4393; https://doi.org/10.3390/s25144393
Submission received: 5 June 2025 / Revised: 3 July 2025 / Accepted: 10 July 2025 / Published: 14 July 2025
(This article belongs to the Section Communications)

Abstract

Orthogonal time–frequency space (OTFS) modulation has emerged as a promising technology to alleviate the effects of the Doppler shifts in high-mobility environments. As a prerequisite to demodulation and signal processing, automatic modulation classification (AMC) is essential for OTFS systems. However, a very limited number of works have focused on this issue. In this paper, we propose a novel AMC approach for OTFS systems. We build a dual-stream convolutional neural network (CNN) model to simultaneously capture multi-domain signal features, which substantially enhances recognition accuracy. Moreover, we propose a differentiated embedded pilot structure that incorporates information about distinct modulation schemes to further improve the separability of modulation types. The results of the extensive experiments carried out show that the proposed approach can achieve high classification accuracy even under low signal-to-noise ratio (SNR) conditions and outperform the state-of-the-art baselines.

1. Introduction

With the rapid development of wireless communication networks toward high-frequency bands and high-mobility scenarios, the bottlenecks of orthogonal frequency division multiplexing (OFDM) brought about by inter-carrier interference and Doppler spread have become increasingly prominent [1]. Orthogonal time–frequency space (OTFS) modulation, proposed as a novel multi-carrier modulation scheme [2] and emerging as a key candidate technology for next-generation 6G mobile communication systems [3,4], can serve as a solution. By mapping information symbols to the delay-Doppler (DD) domain, OTFS systems effectively convert doubly dispersive channels into approximately quasi-static channels in the DD domain, thereby mitigating the time-selective fading caused by Doppler shifts in high-mobility environments [5]. This distinctive advantage gives OTFS remarkable potential in dynamic communication scenarios such as satellite communications (SATCOM) [6,7,8], unmanned aerial vehicle (UAV) communications [9,10], and vehicle-to-everything (V2X) networks [11,12,13].
Automatic modulation classification (AMC), a fundamental technology in wireless communications, aims to identify the modulation types of received signals without prior information, providing essential support for subsequent demodulation and signal processing [14]. Traditional AMC approaches primarily fall into two categories: likelihood-based AMC approaches [15,16] and feature-based AMC approaches [17,18]. While achieving theoretical optimality, maximum likelihood-based AMC can suffer from prohibitive computational complexity in model parameter estimation [19]. In contrast, feature-based AMC recognizes modulation schemes by leveraging characteristics such as high-order spectrum features, frequency spectrum signatures, and power spectrum attributes. In recent years, deep learning has achieved breakthroughs in different areas, e.g., image classification, speech recognition, etc. Driven by these developments, increasing research efforts have focused on integrating deep learning with modulation recognition [20]. Some works leverage convolutional neural networks to extract 2D signal features [21,22], some works design long short-term memory (LSTM)-based networks to capture the temporal features of signals [23,24], and a large portion of works build hybrid CNN–LSTM models that are simultaneously based on features from multiple dimensions [25,26,27,28]. There are other works that develop ResNet-based hybrid neural network models [29,30] and multi-stream/multi-scale fusion models [28,31].
Despite these existing AMC approaches, there are very few studies focusing on AMC for OTFS systems. Additionally, it is infeasible to directly apply the existing AMC approaches, which are primarily designed for OFDM systems, to OTFS systems and achieve satisfactory classification performance [32]. An AMC approach for OTFS systems is proposed in [33] by designing a hybrid CNN and LSTM network with a residual stack to process I/Q symbols. However, this method primarily leverages the time-domain features of the received signal and does not consider the intrinsic DD domain of OTFS signals.
In this paper, we investigate the AMC issue for OTFS systems by simultaneously taking into account features of the signals in both the time domain and the delay-Doppler domain. To further boost the classification accuracy, we leverage embedded pilot patterns in the delay-Doppler plane to enhance the feature discriminability among modulation schemes. Embedded pilot structures have predominantly been adopted for channel estimation in OTFS systems [34,35]. However, to date, no work has harnessed them for modulation recognition/classification. In this work, we develop and incorporate a differentiated pilot insertion scheme in AMC that leverages the inherent properties of OTFS pilot structures and introduces modulation-specific patterns to enhance modulation discriminability. In a nutshell, our contributions are summarized as follows:
  • We propose a multi-domain fusion-based AMC approach for OTFS systems by designing a dual-stream CNN architecture that simultaneously incorporates the time-domain and DD-domain features of OTFS signals.
  • We develop a differentiated embedded pilot insertion scheme which incorporates modulation-related pilot symbols in the DD-plane structure to enhance classification accuracy.
  • We conduct extensive experiments, and the results demonstrate that the proposed approach can achieve high classification accuracy in high-mobility scenarios and under low signal-to-noise ratio (SNR) conditions and outperform the state-of-the-art approaches.
The remaining parts of this paper are organized as follows: Section 2 reviews the related literature and is followed by Section 3, which depicts the system model and describes the problem to be solved. In Section 4, the proposed method is elaborated on. Section 5 showcases and analyzes the numerical results. In Section 6, we conclude the paper with some final remarks.

2. Related Work

Automatic modulation classification (AMC) plays a vital role in radio monitoring and spectrum management and has been widely applied in various communication systems, such as satellite communications and network communications. AMC was first proposed by researchers from Stanford University [36], and it aimed at overcoming the limitations of traditional manual identification methods and achieving high accuracy. Recent years have witnessed the development of different AMC methods, which can be generally categorized into three types: the likelihood-based method, the feature-based method, and the deep learning-based method (as shown in Figure 1).
Likelihood-based method. The modulation recognition method based on likelihood ratio testing (LRT) is carried out on top of composite hypothesis testing. First, the maximum likelihood function of the unknown signal and the optimal decision threshold are determined based on comprehensive analysis of signal characteristics. Then, the extracted statistics are compared with the thresholds to achieve modulation signal classification. The authors in [37] propose a modulation recognition method based on the Average Likelihood Ratio Test (ALRT), which shows potential in distinguishing between binary phase-shift keying (BPSK) and quadrature phase shift keying (QPSK) signals. In [38], a Generalized Likelihood Ratio Test (GLRT)-based AMC approach is proposed by combining power series with likelihood functions. It achieves comparable recognition performance to the average likelihood approach while reducing computational complexity and decreasing implementation difficulties. A Hybrid Likelihood Ratio Test (HLRT) classification approach, proposed in [39], both preserves the advantages of conventional algorithms and suppresses inter-symbol interference while maintaining reasonable computational complexity. Although likelihood-based methods can achieve optimal classification performance in the Bayesian sense, their practical application faces challenges due to the need to account for unknown parameters. As the number of unknown parameters increases, they become increasingly difficult to implement and computationally intensive.
Feature-based method. This type of method extracts key features from modulated signal samples and builds classifiers for modulation classification. As different types of signals are characterized by diverse features, the choice of features can significantly impact signal recognition accuracy. Typically, expertise-based features are categorized into time-domain features, cumulant-based features, spectral features, etc. To name a few, the authors in [40] extract and analyze nine distinct signal features, including amplitude, phase, and instantaneous frequency characteristics, and conduct modulation recognition based on these features. Ref. [41] utilizes fourth-order cumulants as signal features and employs a hierarchical model to progressively classify modulation types, including M-ary phase-shift keying (MPSK), M-ary quadrature amplitude modulation (MQAM), and M-ary pulse-amplitude modulation (MPAM). In [42], a hidden Markov model (HMM) approach is employed to analyze spectral correlation characteristics in the cyclic frequency domain. The feature-based methods can achieve relatively satisfactory recognition performance. However, they often require extensive preprocessing and rely heavily on feature extraction. When the received signal is contaminated by noise, the feature extraction process becomes less effective, leading to degraded overall recognition performance.
Deep learning-based method. While it fundamentally belongs to the category of feature-based methods, we categorize it separately because of its prevalence in the existing work and its significant methodological and conceptual differences from conventional feature-based methods. On the one hand, the existing DL-based methods can be classified into time-domain waveform-based ones and transform-domain image-based ones, according to the types of data fed into the neural networks. On the other hand, they can be classified based on the neural network architectures that they are built on. In particular, long short-term memory (LSTM) networks are employed to extract temporal features from input signals to improve classification accuracy [23,24]. Convolutional neural network (CNN)-based automatic modulation classification approaches are proposed in [21,22]. The authors in [25,27] propose hybrid CNN–LSTM models combining CNN’s spatial feature extraction and RNN’s temporal sequence processing capabilities for time-dependent signals. Ref. [26] adapts a convolutional long short-term deep neural network (CLDNN) to AMC issues, while [28] further proposes a multi-channel CLDNN (MCLDNN) approach. Leveraging the residual network (ResNet), a parallel architecture integrating ResNet modules with gated recurrent units (GRUs) and cascaded LSTM networks with two-layer ResNets is developed in [29,30]. A three-stream deep learning framework is designed in [28], leveraging the independent/joint features of in-phase/quadrature (I/Q) components through multi-channel inputs to enhance spatiotemporal feature fusion. In [31], modulation classification is carried out by adopting a multi-channel input strategy and building a multi-scale neural network (MSNN) with multi-head self-attention and bidirectional GRUs.
The above existing works show good examples of AMC. However, the number of works focusing on AMC for OTFS systems is very limited. In particular, although existing deep learning-based works demonstrate the power of applying deep learning techniques in the field of AMC, determining how to leverage deep neural networks to enhance the accuracy of AMC for OTFS is still an urgent issue to tackle. In [33], an AMC approach for OTFS systems is proposed by designing a hybrid CNN and LSTM network with a residual stack to process I/Q symbols. However, it primarily relies on time-domain features of the received signal, while the intrinsic DD-domain features of OTFS signals remain unexploited. Moreover, although embedded pilots are harnessed in channel estimation in OTFS systems [34,35], as far as we are aware, there is no AMC-related work incorporating them to boost performance. In this paper, we propose an AMC approach for OTFS systems by building a dual-stream CNN-based framework to incorporate both time-domain and DD-domain information and by developing a differentiated embedded pilot insertion scheme. The experimental results demonstrate that the proposed approach can achieve high accuracy even under extreme conditions.

3. System Model

We investigate the modulation recognition of the OTFS transmission system depicted in Figure 2. The information symbols $\{x_{\mathrm{dd}}[k,l],\ k = 0, \ldots, N-1,\ l = 0, \ldots, M-1\}$ are placed on a two-dimensional DD grid of size $N \times M$, where $N$ and $M$ are the numbers of indices along the Doppler and delay axes, respectively. The symbol $x_{\mathrm{dd}}[k,l]$ is transformed to the TF-domain symbol $X_{\mathrm{tf}}[n,m]$, $n = 0, \ldots, N-1$, $m = 0, \ldots, M-1$, through the inverse symplectic finite Fourier transform (ISFFT) [43]:
$$X_{\mathrm{tf}}[n,m] = \frac{1}{\sqrt{NM}} \sum_{k=0}^{N-1} \sum_{l=0}^{M-1} x_{\mathrm{dd}}[k,l]\, e^{j 2\pi \left( \frac{nk}{N} - \frac{ml}{M} \right)}.$$
The signal is transformed from the TF domain to the time domain via the Heisenberg transform [34], and the transmitted time-domain signal is
$$s(t) = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} X_{\mathrm{tf}}[n,m]\, g_{\mathrm{tx}}(t - nT)\, e^{j 2\pi m \Delta f (t - nT)},$$
where $\Delta f$ is the subcarrier spacing, $T$ is the symbol duration, and $g_{\mathrm{tx}}(t)$ denotes the transmit pulse shaping filter.
The signal transmitted through the channel is affected by the channel impulse response $h(\tau, \nu)$ in the DD domain:
$$h(\tau, \nu) = \sum_{i=1}^{P} h_i\, \delta(\tau - \tau_i)\, \delta(\nu - \nu_i),$$
where $P$ is the number of propagation paths, and $h_i$ is the gain of the $i$-th path. $\tau_i$ and $\nu_i$ respectively denote the delay and Doppler shift of the $i$-th path,
$$\tau_i = \frac{l_i}{M \Delta f}, \qquad \nu_i = \frac{k_i}{N T},$$
where $l_i$ is the delay index, and $k_i$ is the Doppler index.
Then, the received signal $r(t)$ is
$$r(t) = \iint h(\tau, \nu)\, s(t - \tau)\, e^{j 2\pi \nu (t - \tau)}\, d\tau\, d\nu + \omega(t),$$
where $\omega(t)$ represents the additive white Gaussian noise.
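The channel relation above can be illustrated with a discrete-time sketch. It assumes sampling at rate $M \Delta f$ with $T \Delta f = 1$, so that $\nu_i (t - \tau_i)$ becomes $k_i (n - l_i)/(MN)$ at sample $n$, and it applies delays as circular shifts. The function name `apply_dd_channel`, the tap format, and the circular-shift simplification are illustrative assumptions, not part of the original system description.

```python
import numpy as np

def apply_dd_channel(s, taps, M, N, snr_db=None, rng=None):
    """Apply a P-path delay-Doppler channel to M*N time-domain samples.

    taps: iterable of (gain h_i, delay index l_i, Doppler index k_i).
    Delays are circular shifts (an illustrative simplification).
    """
    s = np.asarray(s, dtype=complex)
    n = np.arange(M * N)
    r = np.zeros(M * N, dtype=complex)
    for h_i, l_i, k_i in taps:
        delayed = np.roll(s, l_i)  # delayed[n] = s[(n - l_i) mod MN]
        doppler = np.exp(2j * np.pi * k_i * (n - l_i) / (M * N))
        r += h_i * delayed * doppler
    if snr_db is not None:
        # Complex AWGN scaled to the requested SNR relative to the signal power.
        rng = np.random.default_rng() if rng is None else rng
        noise_pow = np.mean(np.abs(r) ** 2) / 10 ** (snr_db / 10)
        noise = rng.normal(size=r.shape) + 1j * rng.normal(size=r.shape)
        r = r + np.sqrt(noise_pow / 2) * noise
    return r
```

With a single tap `(1.0, 0, 0)` and no noise, the function returns its input unchanged, which is a quick sanity check of the indexing.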
The received symbols in the TF domain $Y_{\mathrm{tf}}[n,m]$ can be obtained via the Wigner transform:
$$Y_{\mathrm{tf}}[n,m] = \int r(t)\, g_{\mathrm{rx}}^{*}(t - nT)\, e^{-j 2\pi m \Delta f (t - nT)}\, dt.$$
Here, $g_{\mathrm{rx}}(t)$ denotes the pulse shaping filter at the receiver.
Finally, applying the symplectic finite Fourier transform (SFFT) to the samples yields the symbols in the DD domain [43]:
$$y_{\mathrm{dd}}[k,l] = \frac{1}{\sqrt{NM}} \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} Y_{\mathrm{tf}}[n,m]\, e^{-j 2\pi \left( \frac{nk}{N} - \frac{ml}{M} \right)}.$$
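The ISFFT/SFFT pair can be sketched with NumPy FFTs. The sketch assumes the unitary $1/\sqrt{NM}$ scaling in both directions and arrays indexed as (Doppler, delay) on the DD side and (time, frequency) on the TF side; the function names `isfft`, `sfft`, and `isfft_direct` are illustrative.

```python
import numpy as np

def isfft(x_dd):
    """DD grid (N x M) -> TF grid with unitary scaling.

    The Doppler axis uses the +j kernel (inverse FFT along axis 0);
    the delay axis uses the -j kernel (forward FFT along axis 1).
    """
    return np.fft.ifft(np.fft.fft(x_dd, axis=1, norm="ortho"), axis=0, norm="ortho")

def sfft(y_tf):
    """TF grid -> DD grid; the exact inverse of isfft."""
    return np.fft.fft(np.fft.ifft(y_tf, axis=1, norm="ortho"), axis=0, norm="ortho")

def isfft_direct(x_dd):
    """Literal double sum of the ISFFT, intended only for checking small grids."""
    N, M = x_dd.shape
    n, k = np.arange(N)[:, None], np.arange(N)[None, :]
    m, l = np.arange(M)[:, None], np.arange(M)[None, :]
    A = np.exp(2j * np.pi * n * k / N)    # A[n, k]
    B = np.exp(-2j * np.pi * m * l / M)   # B[m, l]
    return A @ x_dd @ B.T / np.sqrt(N * M)
```

Because both transforms are unitary under this scaling, `sfft(isfft(x))` recovers `x` and the symbol energy is preserved across domains.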
In this paper, we propose an automatic modulation classification framework for the OTFS system described above. It comprises a DD-plane structure based on differentiated embedded pilot insertion, a dataset construction scheme, and a multi-domain fusion approach based on a dual-stream CNN architecture.

4. Proposed Method

In this section, we first develop a differentiated embedded pilot insertion scheme, and on top of that, propose a multi-domain fusion-based automatic modulation classification approach for OTFS systems by combining DD-domain and time-domain signals through a dual-stream CNN-based architecture.

4.1. Differentiated Embedded Pilot Insertion Scheme

We propose a differentiated embedded pilot insertion scheme that places distinct, modulation-related pilot symbols in the DD plane, surrounded by zero-padded guard bands, as illustrated in Figure 3. While preserving the inherent advantage of existing pilot insertion schemes, namely no interference between data and pilot signals [34], it further helps the neural network model designed hereafter to extract modulation-related features from the received signals.
Specifically, we employ differentiated pilot values for different modulation schemes. We define the set of modulation schemes as $\mathcal{M} = \{\mathrm{BPSK}, \mathrm{QPSK}, \mathrm{8PSK}, \mathrm{16QAM}, \mathrm{64QAM}, \mathrm{256QAM}\}$ and the corresponding pilot set as $\mathcal{P} = \{P_{M_i}\}_{M_i \in \mathcal{M}}$, where $P_{M_i}$ is the unique pilot symbol associated with modulation $M_i \in \mathcal{M}$. The specific values for $\mathcal{P}$ depend on the channel types (e.g., PDCCH, PDSCH, PUCCH, PUSCH) and the utility of the reference signals (e.g., CSI-RS, SRS), and can be set according to rules indicated in Release 18 of the 3GPP NR specification [44].
In the embedded pilot design, the pilot symbols (i.e., $P \in \mathcal{P}$) are placed at specific positions in the DD domain grid, and the guard symbols (i.e., 0) are configured around the pilots to isolate them from data symbols (i.e., $D$) [35]. This configuration can be expressed as
$$x_{\mathrm{dd}}[k,l] = \begin{cases} P, & \text{if } (k,l) \in \mathcal{F}_p, \\ 0, & \text{if } (k,l) \in \mathcal{G}_p, \\ D, & \text{otherwise}, \end{cases}$$
where $\mathcal{F}_p$ denotes the set of pilot positions in the DD grid, while $\mathcal{G}_p$ represents the guard symbol region surrounding the pilot, and its size is constrained by the maximum delay and maximum Doppler shift.
The DD-domain structure based on the above differentiated embedded pilots can capture characteristics of different modulation schemes and provide additional information on modulation, thus allowing signals to be well-distinguished from each other. Therefore, the classification accuracy at the receiver can further be improved.
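A minimal sketch of such a frame layout is given below. It assumes a single pilot at the grid center with a rectangular guard region around it. The pilot values in `PILOTS`, the guard half-widths, and the function names are hypothetical placeholders chosen for illustration; the values used in the experiments are those listed in Table 2.

```python
import numpy as np

# Hypothetical placeholder pilot values, one per modulation scheme.
# They are NOT the values of Table 2; they only need to be distinct.
PILOTS = {
    "BPSK": 1.0 + 0.0j, "QPSK": 2.0 + 0.0j, "8PSK": 3.0 + 0.0j,
    "16QAM": 4.0 + 0.0j, "64QAM": 5.0 + 0.0j, "256QAM": 6.0 + 0.0j,
}

def guard_mask(N, M, kp, lp, guard_k, guard_l):
    """Boolean mask covering the pilot cell and its surrounding guard region."""
    mask = np.zeros((N, M), dtype=bool)
    mask[kp - guard_k: kp + guard_k + 1, lp - guard_l: lp + guard_l + 1] = True
    return mask

def build_dd_frame(data, modulation, N=32, M=64, guard_k=2, guard_l=4):
    """Place data symbols, zero guards, and a modulation-specific pilot on the DD grid."""
    kp, lp = N // 2, M // 2                      # assumed pilot position: grid center
    mask = guard_mask(N, M, kp, lp, guard_k, guard_l)
    data = np.asarray(data, dtype=complex)
    if data.size != np.count_nonzero(~mask):
        raise ValueError("data length must match the number of free DD cells")
    x_dd = np.zeros((N, M), dtype=complex)
    x_dd[~mask] = data                           # D: data symbols outside the guard region
    x_dd[kp, lp] = PILOTS[modulation]            # P: pilot tied to the modulation scheme
    return x_dd, mask                            # guard cells stay 0
```

In a real system the guard half-widths would be chosen from the maximum delay and Doppler indices of the channel, so that the spread pilot does not leak into the data region.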

4.2. Dataset Design

In the OTFS system considered, the symbol is transformed from the DD domain to the time domain via ISFFT and Heisenberg transform. After the wireless channel transmission, the received signal is transformed into the DD domain via the Wigner transform and SFFT. We design a data acquisition scheme to extract two distinct data flows in the time and DD domains from the above process and employ them as the input to the neural network, as shown in Figure 2. We consider M subcarriers in the delay dimension and N symbols in the Doppler dimension.
We construct the time-domain data vector $\Theta$ with a dimension of $2 \times MN$, using the in-phase (real) component and quadrature (imaginary) component of $r[l]$ ($l = 0, \ldots, MN-1$), which is a discrete sampled version of the received time-domain signal $r(t)$:
$$\Theta = \begin{bmatrix} [\mathrm{Re}(r[l])]_{l \in [0, MN-1]} \\ [\mathrm{Im}(r[l])]_{l \in [0, MN-1]} \end{bmatrix} \in \mathbb{R}^{2 \times MN}.$$
To generate data for the DD-domain stream, we split the DD-domain symbol grid $y_{\mathrm{dd}}$ into an in-phase (real) component and a quadrature (imaginary) component, each with dimensions $N \times M$, and construct a DD-domain input data matrix $\Gamma$ with a dimension of $2 \times N \times M$:
$$\Gamma = \begin{bmatrix} [\mathrm{Re}(y_{\mathrm{dd}}[k,l])]_{k \in [0, N-1],\, l \in [0, M-1]} \\ [\mathrm{Im}(y_{\mathrm{dd}}[k,l])]_{k \in [0, N-1],\, l \in [0, M-1]} \end{bmatrix} \in \mathbb{R}^{2 \times N \times M}.$$
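The two input tensors can be assembled in a few lines. The sketch below assumes the received samples and the DD grid are already available as complex NumPy arrays; the function name `make_inputs` and the `float32` cast are illustrative choices.

```python
import numpy as np

def make_inputs(r, y_dd):
    """Stack real and imaginary parts into the two network inputs.

    r:    complex array with M*N received time-domain samples
    y_dd: complex array of shape (N, M) on the delay-Doppler grid
    """
    r = np.asarray(r).reshape(-1)
    y_dd = np.asarray(y_dd)
    theta = np.stack([r.real, r.imag], axis=0)        # shape (2, M*N)
    gamma = np.stack([y_dd.real, y_dd.imag], axis=0)  # shape (2, N, M)
    return theta.astype(np.float32), gamma.astype(np.float32)
```

For $N = 32$ and $M = 64$, this yields shapes `(2, 2048)` and `(2, 32, 64)`.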

4.3. Dual-Stream Architecture for Multi-Domain Fusion

We propose a multi-domain fusion approach for modulation classification by developing a dual-stream CNN-based architecture, which is characterized by a 1D-CNN branch and a 2D-CNN branch for processing time-domain data and DD-domain data, respectively, as illustrated in Figure 4 (empirically considering $N = 32$, $M = 64$). Note that the specific values adopted in the neural network architecture, e.g., the dropout rate of 0.6, are based on numerical tests and comparisons carried out in the experiments; please refer to Section 5 for more details.
The 1D-CNN branch processes time-domain data ($\Theta$) with an input shape of $(2, 2048)$. The first block employs 32 kernels of $3 \times 1$ convolutions, followed by batch normalization and ReLU activation, and then performs $2\times$ downsampling ($2048 \to 1024$). The second block contains 64 convolutional kernels with the same processing flow ($1024 \to 512$). The third block uses 128 convolutional kernels and incorporates dropout with a rate of 0.6. Finally, after the features are reduced to a length of 32 via adaptive average pooling and passed through a fully connected layer, the branch outputs a 64-dimensional feature (i.e., $\Psi_1$).
The 2D-CNN branch processes delay-Doppler-domain data ($\Gamma$) with an input shape of $(2, 32, 64)$. The first block uses 32 kernels of $3 \times 3$ convolutions, followed by batch normalization, ReLU activation, and pooling. The second and third blocks employ 64 and 128 convolutional kernels, respectively. Finally, the data passes through a $2 \times 2$ adaptive average pooling layer and a fully connected layer, resulting in an output with a dimension of 64 (i.e., $\Psi_2$).
The features from the 1D-CNN and 2D-CNN branches are combined, resulting in a 128-dimensional $\Psi_{\mathrm{com}}$, which is then fed into an attention module. This module generates adaptive weights that are applied to $\Psi_{\mathrm{com}}$ to emphasize the most significant aspects of the features. The final classifier comprises two fully connected layers, with a dropout rate of 0.6 applied between them to prevent overfitting.
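The fusion step can be sketched as follows. The text does not specify the internal form of the attention module, so this sketch assumes a simple gating design: a two-layer perceptron with a sigmoid output that produces one weight per feature, applied element-wise to the concatenated vector. The hidden width of 32, the random weights, and the function names are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse(psi1, psi2, W1, b1, W2, b2):
    """Concatenate branch features and re-weight them with learned gates.

    psi1, psi2: (B, 64) outputs of the 1D-CNN and 2D-CNN branches
    W1, b1:     first gate layer, (128, H) and (H,)
    W2, b2:     second gate layer, (H, 128) and (128,)
    """
    psi_com = np.concatenate([psi1, psi2], axis=-1)  # (B, 128)
    hidden = np.maximum(psi_com @ W1 + b1, 0.0)      # ReLU
    weights = sigmoid(hidden @ W2 + b2)              # (B, 128), each in (0, 1)
    return psi_com * weights, weights

# Example with random (untrained) parameters and an assumed hidden width of 32.
rng = np.random.default_rng(0)
psi1, psi2 = rng.normal(size=(4, 64)), rng.normal(size=(4, 64))
W1, b1 = 0.1 * rng.normal(size=(128, 32)), np.zeros(32)
W2, b2 = 0.1 * rng.normal(size=(32, 128)), np.zeros(128)
fused, weights = fuse(psi1, psi2, W1, b1, W2, b2)
```

The fused 128-dimensional vector would then be passed to the two-layer classifier described above.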

4.4. Computational Complexity Analysis

The computational complexity of the proposed model is analyzed by examining neural network components. We define $B$ as the batch size, $C_{in1}$ and $C_{in2}$ as the numbers of input channels for the 1D and 2D branches, $L$ as the sequence length in the 1D branch, $H \times W$ as the spatial dimensions in the 2D branch, $K_1, K_2, K_3$ as the kernel sizes, $C_{out1}, C_{out2}, C_{out3}$ as the numbers of output channels of the different stages, $N$ as the number of output classification categories, $S_1$ and $S_2$ as the sizes after 1D-CNN and 2D-CNN feature extraction, $D_{att}$ as the attention dimension, and $D_{fc1}$ as the first fully connected layer dimension. Then, the complexity of the 1D-CNN branch, which processes time-domain signals with input shape $(B, C_{in1}, L)$, is $O_{1D} \approx O(B \cdot L \cdot C_{in1} \cdot C_{out1} \cdot K_1)$, while the complexity of the 2D-CNN, which processes delay-Doppler images with input shape $(B, C_{in2}, H, W)$, is $O_{2D} \approx O(B \cdot H \cdot W \cdot C_{in2} \cdot C_{out1} \cdot K_1^2)$. The complexity of the attention mechanism is $O_{Attention} \approx O(B \cdot C_{out3} \cdot D_{att})$, and the complexity of the classifier module is $O_{Classifier} \approx O(B \cdot C_{out3} \cdot D_{fc1})$. Therefore, the total complexity is $O_{model} = O_{1D} + O_{2D} + O_{Attention} + O_{Classifier}$, which can be approximately represented as $O_{model} \approx O(B \cdot L \cdot C_{in1} \cdot C_{out1} \cdot K_1) + O(B \cdot H \cdot W \cdot C_{in2} \cdot C_{out1} \cdot K_1^2)$, considering $L \gg \max(H, W)$ in typical implementations.
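To give a sense of scale, the two dominant terms can be evaluated for the configuration used in this paper (batch size 400, time-domain length 2048, a 32 × 64 DD grid, 2 input channels, 32 first-stage output channels, kernel size 3). The helper names below are illustrative, and the counts are leading-order multiply-accumulate estimates for the first convolutional stage only, not measured costs.

```python
def conv1d_cost(B, L, c_in, c_out, k):
    """Leading-order cost of a 1D convolution stage: B * L * C_in * C_out * K."""
    return B * L * c_in * c_out * k

def conv2d_cost(B, H, W, c_in, c_out, k):
    """Leading-order cost of a 2D convolution stage: B * H * W * C_in * C_out * K^2."""
    return B * H * W * c_in * c_out * k * k

cost_1d = conv1d_cost(B=400, L=2048, c_in=2, c_out=32, k=3)
cost_2d = conv2d_cost(B=400, H=32, W=64, c_in=2, c_out=32, k=3)
```

Since $L = H \cdot W = 2048$ here, the two terms differ only by the extra kernel factor, so the 2D term is three times the 1D term under these assumptions.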

5. Numerical Results

In this section, we evaluate the proposed differentiated embedded pilot and multi-domain fusion-based OTFS automatic modulation classification approach from diverse aspects. Extensive experiments are carried out to investigate the modulation classification accuracy w.r.t. different SNRs, the effects of pilot symbols, different NN architectures for time-domain and DD-domain data streams, and maximum Doppler shifts. Moreover, the proposed approach is compared with state-of-the-art baselines.

5.1. Experimental Settings and Performance Metric

The dataset consists of signals modulated by BPSK, QPSK, 8PSK, 16QAM, 64QAM, and 256QAM, with SNRs in $\{-5, 0, 5, 10, 15, 20, 25\}$ dB. For each combination of modulation scheme and SNR, 2000 samples are generated; therefore, the complete dataset comprises 84,000 samples in total. To ensure effective model training and evaluation, the dataset is split into training, validation, and test subsets in the proportion 7:2:1. The training phase spans 80 epochs with a batch size of 400 samples per iteration, optimized by the Adam algorithm with a learning rate of 0.0035. More experiment settings are listed in Table 1.
The experiments are conducted on a device with a single NVIDIA RTX 4090 GPU (24 GB), an Intel Xeon Gold 6430 CPU (16 cores), and 128 GB RAM. With the specified configurations, model training takes an average of approximately 7.2 min, with each epoch spanning an average of 5.4 s. With a trained model, it takes 0.054 ms for modulation classification of a signal. According to the above time costs, the proposed model can be implemented in real-time wireless operations. First, once trained, the model can be utilized for an extended period and does not need to be retrained unless the network undergoes significant changes. Second, 0.054 ms is an acceptable operation duration compared to the intervals (e.g., typically 0.1∼10 ms in different cases) between two modulation classifications in real wireless systems.
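The dataset and timing figures quoted above can be cross-checked with simple arithmetic. The sketch assumes the SNR grid runs from −5 dB to 25 dB in 5 dB steps (seven values), which is the grid consistent with the stated total of 84,000 samples; the variable names are illustrative.

```python
import math

modulations = ["BPSK", "QPSK", "8PSK", "16QAM", "64QAM", "256QAM"]
snrs_db = list(range(-5, 30, 5))   # assumed grid: -5, 0, 5, ..., 25 dB
samples_per_case = 2000

total = len(modulations) * len(snrs_db) * samples_per_case
n_train, n_val, n_test = (total * p // 10 for p in (7, 2, 1))   # 7:2:1 split

batch_size, epochs, sec_per_epoch = 400, 80, 5.4
iters_per_epoch = math.ceil(n_train / batch_size)
train_minutes = epochs * sec_per_epoch / 60
```

This gives 58,800 / 16,800 / 8,400 samples for training, validation, and test, 147 iterations per epoch, and a total training time of about 7.2 min, matching the reported figure.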
In the experiments, without loss of generality, the pilot values within the pilot matrix are kept identical for simplification. The values adopted for different modulation types are listed in Table 2.
The performance of modulation classification is evaluated based on the accuracy, precision, recall, and F1-score metrics. Let the entire sample set be divided into four subsets based on the true labels and predicted outcomes, i.e., the sets of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN); then, the four metrics are computed as follows:
The accuracy quantifies the proportion of correctly classified samples in the entire dataset and is defined by
$$\mathrm{Accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}.$$
The precision measures the proportion of correctly predicted positive instances among all the predicted positive instances:
$$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}.$$
The recall measures the proportion of correctly predicted positive instances among all the actual positive instances:
$$\mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}.$$
The F1-score is the harmonic mean of precision and recall, providing a balanced evaluation metric:
$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}.$$
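For a multi-class problem, these quantities are commonly computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below follows that convention, which is an assumption here since the text does not state how the six-class results are aggregated; the function names are illustrative.

```python
import numpy as np

def _safe_div(num, den):
    """Element-wise num / den, returning 0 where den == 0."""
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def one_vs_rest_metrics(conf):
    """Per-class metrics from a confusion matrix.

    conf[i, j] = number of samples with true class i predicted as class j.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf).copy()
    fp = conf.sum(axis=0) - tp          # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp          # belonging to the class, but missed
    tn = conf.sum() - tp - fp - fn
    accuracy = _safe_div(tp + tn, tp + tn + fp + fn)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    return accuracy, precision, recall, f1
```

Averaging the per-class values (macro averaging) gives one number per metric for a given SNR.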

5.2. Performance Analysis

First of all, we evaluate the impact of the dropout rate on the model training in Figure 5. We measure the fitting degree as the average difference between training set accuracy and validation set accuracy over the batches from convergence until the end of training. As can be seen, the model with a dropout rate of 0.6 reaches the best trade-off between classification accuracy and regularization effectiveness; it achieves the highest average accuracy while effectively preventing overfitting, which supports generalization. Therefore, a dropout rate of 0.6 is employed in the neural network model design.
Figure 6 shows the confusion matrices of modulation classification at different SNRs ranging from 0 dB to 15 dB. As can be seen, the classification accuracy is significantly improved as the SNR is raised from 0 dB to 15 dB. Moreover, in Figure 6a,b, it is obvious that even under low SNR conditions (i.e., 0 dB and 5 dB), the confusion matrices exhibit apparent diagonal patterns, with 100% classification accuracy for some modulation schemes. Figure 6c,d further demonstrate that our approach shows excellent classification performance at high SNRs.
In Table 3 and Figure 7, we investigate the effect of the proposed differentiated embedded pilot insertion scheme on the neural network training and modulation classification performance. Compared with data structures using identical pilot symbols, the ones embedded with differentiated pilot values show higher classification accuracy thanks to the feature differences introduced among modulation schemes. Moreover, the number of embedded pilots also has an impact on the classification accuracy.
Furthermore, we conduct experiments for multi-domain and single-domain approaches, whose results are shown in Figure 8 and Table 4. As can be seen, the DD-domain NN branch, especially the one with the 2D-CNN, substantially contributes to a high modulation classification accuracy. Based on that, the incorporation of the time-domain NN branch can significantly boost the performance as the SNRs vary within $[-5, 5]$ dB. This accounts for why the proposed combination of the DD domain and the time domain can achieve high classification accuracy under varying SNR conditions, including even −5 and 0 dB. Note that the structure (i.e., 1D-CNN/2D-CNN) of the time-domain NN branch has limited effects on the performance; therefore, it is sufficient for the time-domain NN branch to adopt the 1D-CNN for computational efficiency.
The impact of maximum Doppler shifts on the classification performance of the proposed approach is investigated in Figure 9 and Table 5. The experiments are conducted with a carrier frequency of 5 GHz and maximum Doppler shifts of 400 Hz, 1000 Hz, 1200 Hz, and 1400 Hz, which correspond to moving speeds of 86 km/h, 216 km/h, 260 km/h, and 302 km/h, covering common speed ranges of transportation. Therefore, the robustness of the proposed approach under various mobility conditions is comprehensively evaluated. Table 5 presents the classification accuracy, precision, recall, and F1-score over SNRs ranging from −5 dB to 25 dB under four different maximum Doppler shift conditions. As we can see, a high classification accuracy (approaching 75%) can be achieved even in a highly dynamic scenario (characterized by a maximum Doppler shift of 1400 Hz) with a low SNR (i.e., −5 dB), demonstrating the robustness of the proposed approach in extreme conditions.
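The correspondence between maximum Doppler shift and speed follows from the standard relation $f_d = v f_c / c$. The short sketch below inverts it for the four cases considered; the function name is illustrative.

```python
C = 299_792_458.0  # speed of light in m/s

def doppler_to_speed_kmh(f_d_hz, f_c_hz):
    """Speed (km/h) that produces a maximum Doppler shift f_d at carrier frequency f_c."""
    return f_d_hz * C / f_c_hz * 3.6

speeds = [doppler_to_speed_kmh(f_d, 5e9) for f_d in (400, 1000, 1200, 1400)]
```

The computed speeds agree with the quoted 86, 216, 260, and 302 km/h to within about 1 km/h.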
As the received signal is inherently in the complex domain, in Figure 10, we compare the proposed approach, i.e., the proposed neural network (NN) architecture with Adam optimizer, with the complex-valued neural network (CVNN) model with different optimizers, including Adam, Adagrad, Adamax, Amagrad, and RMSprop, in terms of validation loss and accuracy. Apparently, when applied to CVNN, the Adagrad optimizer achieves lower validation loss and a faster convergence rate, while the Adam optimizer exhibits smoother curves with greater stability. Last but not least, the proposed approach can achieve the lowest loss and highest accuracy.
In Figure 11, the proposed approach is compared with different state-of-the-art baseline approaches. ResNet [14], originally designed for automatic modulation classification in OFDM systems, is adapted to OTFS systems and evaluated on datasets with differentiated pilots, identical pilots, and no pilots. The baselines CLDNN [26], LSTM [24], CNN_LSTM [33], and CVNN with the Adagrad optimizer [45] adopt differentiated pilot structures. The results for ResNet [14] further demonstrate the advantage of the proposed differentiated pilots. Compared to all the state-of-the-art baselines, the proposed approach, characterized by the dual-stream network architecture, shows superior classification performance (at least 90% accuracy) under both low- and high-SNR conditions. Table 6 records average metrics including accuracy, precision, recall, and F1-score over all the SNR values. The near-identical values of the metrics for the proposed approach indicate that the trained AMC model is not biased toward particular modulation classes.

6. Conclusions

In this paper, we have proposed a novel multi-domain fusion-based automatic modulation classification approach for OTFS systems. We have designed a new dual-stream CNN-based architecture that exploits both the time-domain and DD-domain signal features. Moreover, we have introduced a differentiated embedded pilot structure in the frame design, which incorporates a modulation-related symbol to further improve discriminability among different modulation schemes. Experimental results have demonstrated that the proposed approach can outperform the state-of-the-art baselines and achieve an average classification accuracy of 97.8% across a wide SNR range from −5 dB to 25 dB in high-mobility environments.

Author Contributions

Conceptualization, Z.L. and B.Z.; methodology, Z.L. and B.Z.; software, Z.L.; writing—original draft preparation, Z.L.; writing—review and editing, B.Z., H.L. and H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Dataset available on request from the authors.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wei, Z.; Yuan, W.; Li, S.; Yuan, J.; Bharatula, G.; Hadani, R.; Hanzo, L. Orthogonal time-frequency space modulation: A promising next-generation waveform. IEEE Wirel. Commun. 2021, 28, 136–144.
  2. Shen, W.; Dai, L.; An, J.; Fan, P.; Heath, R.W. Channel estimation for orthogonal time frequency space (OTFS) massive MIMO. IEEE Trans. Signal Process. 2019, 67, 4204–4217.
  3. Hadani, R.; Rakib, S.; Tsatsanis, M.; Monk, A.; Goldsmith, A.J.; Molisch, A.F.; Calderbank, R. Orthogonal time frequency space modulation. In Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), San Francisco, CA, USA, 19–22 March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
  4. Zhang, Z.; Xiao, Y.; Ma, Z.; Xiao, M.; Ding, Z.; Lei, X.; Karagiannidis, G.K.; Fan, P. 6G wireless networks: Vision, requirements, architecture, and key technologies. IEEE Veh. Technol. Mag. 2019, 14, 28–41.
  5. Matz, G.; Hlawatsch, F. Time-varying communication channels: Fundamentals, recent developments, and open problems. In Proceedings of the 2006 14th European Signal Processing Conference, Florence, Italy, 4–8 September 2006; IEEE: Piscataway, NJ, USA, 2006; pp. 1–5.
  6. Zhou, X.; Ying, K.; Gao, Z.; Wu, Y.; Xiao, Z.; Chatzinotas, S.; Yuan, J.; Ottersten, B. Active terminal identification, channel estimation, and signal detection for grant-free NOMA-OTFS in LEO satellite Internet-of-Things. IEEE Trans. Wirel. Commun. 2022, 22, 2847–2866.
  7. Gao, Z.; Zhou, X.; Zhao, J.; Li, J.; Zhu, C.; Hu, C.; Xiao, P.; Chatzinotas, S.; Ng, D.W.K.; Ottersten, B. Grant-free NOMA-OTFS paradigm: Enabling efficient ubiquitous access for LEO satellite Internet-of-Things. IEEE Netw. 2023, 37, 18–26.
  8. Buzzi, S.; Caire, G.; Colavolpe, G.; D’Andrea, C.; Foggi, T.; Piemontese, A.; Ugolini, A. LEO satellite diversity in 6G non-terrestrial networks: OFDM vs. OTFS. IEEE Commun. Lett. 2023, 27, 3013–3017.
  9. Linsalata, F.; Albanese, A.; Sciancalepore, V.; Roveda, F.; Magarini, M.; Costa-Perez, X. OTFS-superimposed PRACH-aided localization for UAV safety applications. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6.
  10. Han, R.; Ma, J.; Bai, L. Trajectory planning for OTFS-based UAV communications. China Commun. 2023, 20, 114–124.
  11. Blazek, T.; Radovic, D. Performance evaluation of OTFS over measured V2V channels at 60 GHz. In Proceedings of the 2020 IEEE MTT-S International Conference on Microwaves for Intelligent Mobility (ICMIM), Linz, Austria, 23 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–4.
  12. Lopez, L.M.W.; Bengtsson, M. Achievable rates of orthogonal time frequency space (OTFS) modulation in high speed railway environments. In Proceedings of the 2022 IEEE 33rd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Kyoto, Japan, 12–15 September 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 982–987.
  13. Ma, Y.; Ma, G.; Ai, B.; Fei, D.; Wang, N.; Zhong, Z.; Yuan, J. Characteristics of channel spreading function and performance of OTFS in high-speed railway. IEEE Trans. Wirel. Commun. 2023, 22, 7038–7054.
  14. Liu, X.; Yang, D.; El Gamal, A. Deep neural network architectures for modulation classification. In Proceedings of the 2017 51st Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 29 October–1 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 915–919.
  15. Wei, W.; Mendel, J.M. Maximum-likelihood classification for digital amplitude-phase modulations. IEEE Trans. Commun. 2000, 48, 189–193.
  16. Su, W.; Xu, J.L.; Zhou, M. Real-time modulation classification based on maximum likelihood. IEEE Commun. Lett. 2008, 12, 801–803.
  17. Liedtke, F. Computer simulation of an automatic classification procedure for digitally modulated communication signals with unknown parameters. Signal Process. 1984, 6, 311–323.
  18. Gardner, W.A.; Spooner, C.M. Cyclic spectral analysis for signal detection and modulation recognition. In Proceedings of the MILCOM 88, 21st Century Military Communications-What’s Possible? Conference Record, Military Communications Conference, San Diego, CA, USA, 23–26 October 1988; IEEE: Piscataway, NJ, USA, 1988; pp. 419–424.
  19. Meng, F.; Chen, P.; Wu, L.; Wang, X. Automatic modulation classification: A deep learning enabled approach. IEEE Trans. Veh. Technol. 2018, 67, 10760–10772.
  20. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  21. Hermawan, A.P.; Ginanjar, R.R.; Kim, D.S.; Lee, J.M. CNN-based automatic modulation classification for beyond 5G communications. IEEE Commun. Lett. 2020, 24, 1038–1041.
  22. Wang, Y.; Liu, M.; Yang, J.; Gui, G. Data-driven deep learning for automatic modulation recognition in cognitive radios. IEEE Trans. Veh. Technol. 2019, 68, 4074–4077.
  23. Chen, Y.; Shao, W.; Liu, J.; Yu, L.; Qian, Z. Automatic modulation classification scheme based on LSTM with random erasing and attention mechanism. IEEE Access 2020, 8, 154290–154300.
  24. Rajendran, S.; Meert, W.; Giustiniano, D.; Lenders, V.; Pollin, S. Deep learning models for wireless signal classification with distributed low-cost spectrum sensors. IEEE Trans. Cogn. Commun. Netw. 2018, 4, 433–445.
  25. Zhang, Z.; Luo, H.; Wang, C.; Gan, C.; Xiang, Y. Automatic modulation classification using CNN-LSTM based dual-stream structure. IEEE Trans. Veh. Technol. 2020, 69, 13521–13531.
  26. West, N.E.; O’Shea, T. Deep architectures for modulation recognition. In Proceedings of the 2017 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), Baltimore, MD, USA, 6–9 March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
  27. Wu, Y.; Li, X.; Fang, J. A deep learning approach for modulation recognition via exploiting temporal correlations. In Proceedings of the 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Kalamata, Greece, 25–28 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–5.
  28. Xu, J.; Luo, C.; Parr, G.; Luo, Y. A spatiotemporal multi-channel learning framework for automatic modulation recognition. IEEE Wirel. Commun. Lett. 2020, 9, 1629–1632.
  29. Li, L.; Zhu, Y.; Zhu, Z. Automatic modulation classification using ResNeXt-GRU with deep feature fusion. IEEE Trans. Instrum. Meas. 2023, 72, 1–10.
  30. Elsagheer, M.M.; Ramzy, S.M. A hybrid model for automatic modulation classification based on residual neural networks and long short term memory. Alex. Eng. J. 2023, 67, 117–128.
  31. Wang, X.; Liu, D.; Zhang, Y.; Li, Y.; Wu, S. A spatiotemporal multi-stream learning framework based on attention mechanism for automatic modulation recognition. Digit. Signal Process. 2022, 130, 103703.
  32. Li, S.; Ding, C.; Xiao, L.; Zhang, X.; Liu, G.; Jiang, T. Expectation propagation aided model driven learning for OTFS signal detection. IEEE Trans. Veh. Technol. 2023, 72, 12407–12412.
  33. Kumar, A.; Manish; Satija, U. Residual stack-aided hybrid CNN-LSTM-based automatic modulation classification for orthogonal time-frequency space system. IEEE Commun. Lett. 2023, 27, 3255–3259.
  34. Raviteja, P.; Phan, K.T.; Hong, Y. Embedded pilot-aided channel estimation for OTFS in delay–Doppler channels. IEEE Trans. Veh. Technol. 2019, 68, 4906–4917.
  35. Guo, L.; Gu, P.; Zou, J.; Liu, G.; Shu, F. DNN-based fractional Doppler channel estimation for OTFS modulation. IEEE Trans. Veh. Technol. 2023, 72, 15062–15067.
  36. Azzouz, E.; Nandi, A. Procedure for automatic recognition of analogue and digital modulations. IEE Proc.-Commun. 1996, 143, 259–266.
  37. Kim, K.; Polydoros, A. Digital modulation classification: The BPSK versus QPSK case. In Proceedings of the MILCOM 88, 21st Century Military Communications-What’s Possible? Conference Record, Military Communications Conference, San Diego, CA, USA, 23–26 October 1988; IEEE: Piscataway, NJ, USA, 1988; pp. 431–436.
  38. Boiteau, D.; Le Martret, C. A general maximum likelihood framework for modulation classification. In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP’98 (Cat. No. 98CH36181), Seattle, WA, USA, 15 May 1998; IEEE: Piscataway, NJ, USA, 1998; Volume 4, pp. 2165–2168.
  39. Panagiotou, P.; Anastasopoulos, A.; Polydoros, A. Likelihood ratio tests for modulation classification. In Proceedings of the MILCOM 2000, 21st Century Military Communications, Architectures and Technologies for Information Superiority (Cat. No. 00CH37155), Los Angeles, CA, USA, 22–25 October 2000; IEEE: Piscataway, NJ, USA, 2000; Volume 2, pp. 670–674.
  40. Azzouz, E.E.; Nandi, A.K. Modulation recognition using artificial neural networks. Autom. Modul. Recognit. Commun. Signals 1996, 132–176.
  41. Swami, A.; Sadler, B.M. Hierarchical digital modulation classification using cumulants. IEEE Trans. Commun. 2000, 48, 416–429.
  42. Kim, K.; Akbar, I.A.; Bae, K.K.; Um, J.S.; Spooner, C.M.; Reed, J.H. Cyclostationary approaches to signal detection and classification in cognitive radio. In Proceedings of the 2007 2nd IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks, Dublin, Ireland, 17–20 April 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 212–215.
  43. Raviteja, P.; Phan, K.T.; Hong, Y.; Viterbo, E. Interference cancellation and iterative detection for orthogonal time frequency space modulation. IEEE Trans. Wirel. Commun. 2018, 17, 6501–6515.
  44. 3GPP. Physical Channels and Modulation, TS 38.211. Available online: https://www.etsi.org/deliver/etsi_ts/138200_138299/138211/16.02.00_60/ts_138211v160200p.pdf (accessed on 11 July 2025).
  45. Lee, C.; Hasegawa, H.; Gao, S. Complex-Valued Neural Networks: A Comprehensive Survey. IEEE/CAA J. Autom. Sin. 2022, 9, 1406–1426.
Figure 1. Automatic modulation classification methods.
Figure 2. System model and acquisition of the time and delay-Doppler domain data.
Figure 3. Differentiated embedded pilot insertion scheme.
Figure 4. Dual-stream neural network architecture for multi-domain fusion.
Figure 5. Accuracy and fitting degree across different dropout rates.
Figure 6. Confusion matrix of modulation classification results.
Figure 7. Comparison in terms of different pilot structures.
Figure 8. Comparison in terms of different NN architectures.
Figure 9. The performance of the proposed approach w.r.t. different maximum Doppler shifts.
Figure 10. Comparison between CVNN models with different optimizers and the proposed approach.
Figure 11. Comparison with state-of-the-art approaches.
Table 1. Simulation parameters.
| Parameter | Value |
| --- | --- |
| Delay-Doppler grid size | N = 32, M = 64 |
| Pilot symbol dimensions | 3 × 3 |
| Guard interval lengths | 2 |
| Sampling rate (kHz) | 100 |
| Maximum Doppler shift (Hz) | 1000 |
| Carrier frequency (GHz) | 5 |
| Channel model | Extended vehicular A model (EVA) |
| Modes of modulation | BPSK, QPSK, 8PSK, 16QAM, 64QAM, 256QAM |
Table 2. Pilot signal configurations.
| Modulation Type | Pilot Type | Value |
| --- | --- | --- |
| BPSK | Real number | 2 |
| QPSK | Complex number | 1 + j |
| 8PSK | Phase rotation | e^{jπ/4} |
| 16QAM | Complex number | 1.5 + 1.5j |
| 64QAM | Complex number | 2 + 2j |
| 256QAM | Real number | 2.5 |
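As a hedged illustration of how the pilot values in Table 2 could be placed in a delay-Doppler frame, the sketch below writes a 3 × 3 pilot block, surrounded by a zero guard band of width 2, into a 32 × 64 grid. The pilot values come from Table 2 and the sizes from Table 1. The central placement, the symmetric guard band, the grid orientation, and the names `PILOT_VALUES` and `embed_pilot` are assumptions made for illustration and may differ from the insertion scheme of Figure 3.

```python
import cmath
import math

# Pilot values copied from Table 2; everything else here is illustrative.
PILOT_VALUES = {
    "BPSK": 2 + 0j,
    "QPSK": 1 + 1j,
    "8PSK": cmath.exp(1j * math.pi / 4),
    "16QAM": 1.5 + 1.5j,
    "64QAM": 2 + 2j,
    "256QAM": 2.5 + 0j,
}


def embed_pilot(grid, modulation, pilot_size=3, guard=2):
    """Return a copy of grid with a centred pilot block and a zero guard band.

    grid is a list of rows of complex data symbols. The centred placement and
    the symmetric guard band are assumptions, not the scheme of Figure 3.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    r0 = (rows - pilot_size) // 2
    c0 = (cols - pilot_size) // 2
    # Clear the pilot region together with its surrounding guard band.
    for r in range(r0 - guard, r0 + pilot_size + guard):
        for c in range(c0 - guard, c0 + pilot_size + guard):
            out[r][c] = 0j
    # Write the modulation-specific pilot value into the pilot block.
    pilot = PILOT_VALUES[modulation]
    for r in range(r0, r0 + pilot_size):
        for c in range(c0, c0 + pilot_size):
            out[r][c] = pilot
    return out


# Placeholder data symbols (all ones) on an N = 32 by M = 64 grid.
data = [[1 + 0j for _ in range(64)] for _ in range(32)]
frame = embed_pilot(data, "QPSK")
```

Because each modulation type maps to a distinct pilot value, the pilot region itself carries a modulation-dependent signature that a classifier can exploit in addition to the data symbols.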
Table 3. Average classification accuracy under different pilot symbol configurations.
| Pilot Symbol Configuration | Average Classification Acc. (%) |
| --- | --- |
| 3 × 3 differentiated pilot symbols | 97.8 |
| 3 × 3 same pilot symbols | 93.7 |
| 2 × 2 differentiated pilot symbols | 92.8 |
| 2 × 2 same pilot symbols | 92.6 |
| 1 × 1 differentiated pilot symbols | 92.4 |
| 1 × 1 same pilot symbols | 81.2 |
| No pilot symbols | 74.6 |
Table 4. Average classification accuracies of hybrid 1D/2D CNN models for time and DD domains.
| Configuration | Average Classification Acc. (%) | Configuration | Average Classification Acc. (%) |
| --- | --- | --- | --- |
| Time Domain + 1D-CNN & DD Domain + 2D-CNN | 97.8 | Time Domain + 1D-CNN | 79.0 |
| Time Domain + 2D-CNN & DD Domain + 1D-CNN | 95.0 | Time Domain + 2D-CNN | 78.6 |
| Time Domain + 2D-CNN & DD Domain + 2D-CNN | 97.7 | DD Domain + 1D-CNN | 93.3 |
| Time Domain + 1D-CNN & DD Domain + 1D-CNN | 95.3 | DD Domain + 2D-CNN | 96.4 |
Table 5. Average classification performance under different maximum Doppler shifts.
| Maximum Doppler Shift (Speed) | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- |
| 400 Hz (86 km/h) | 98.26 | 98.27 | 98.26 | 98.27 |
| 1000 Hz (216 km/h) | 97.81 | 97.81 | 97.81 | 97.81 |
| 1200 Hz (260 km/h) | 97.71 | 97.77 | 97.73 | 97.75 |
| 1400 Hz (302 km/h) | 95.41 | 95.43 | 95.40 | 95.40 |
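The speeds listed next to the Doppler shifts in Table 5 follow from the standard relation v = f_d · c / f_c. The short sketch below evaluates it for the 5 GHz carrier frequency of Table 1. The function name `doppler_to_speed_kmh` is hypothetical, and the relation assumes motion directly along the line of sight.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_to_speed_kmh(max_doppler_hz, carrier_hz=5e9):
    """Relative speed in km/h that yields the given maximum Doppler shift.

    Uses v = f_d * c / f_c and assumes motion along the line of sight.
    """
    speed_m_per_s = max_doppler_hz * SPEED_OF_LIGHT / carrier_hz
    return speed_m_per_s * 3.6


for f_d in (400, 1000, 1200, 1400):
    print(f_d, "Hz ->", round(doppler_to_speed_kmh(f_d)), "km/h")
```

Under these assumptions the computed speeds agree with the table to within about 1 km/h; the 1200 Hz case evaluates to roughly 259 km/h, which the table lists as 260 km/h.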
Table 6. Classification Performance Metrics of Different Approaches.
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- |
| Proposed Method | 97.81 | 97.81 | 97.81 | 97.81 |
| Differentiated pilot symbols + ResNet [14] | 92.50 | 92.57 | 92.52 | 92.54 |
| Same pilot symbols + ResNet [14] | 83.87 | 83.90 | 83.89 | 83.88 |
| No pilot symbols + ResNet [14] | 56.45 | 60.81 | 56.14 | 56.97 |
| CNN_LSTM [33] | 79.09 | 79.06 | 79.14 | 79.09 |
| CLDNN [26] | 87.87 | 88.84 | 87.87 | 87.81 |
| LSTM [24] | 66.60 | 69.30 | 59.50 | 64.00 |
| CVNN + Adagrad [45] | 95.62 | 95.64 | 95.63 | 95.63 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, Z.; Zhang, B.; Luo, H.; He, H. Differentiated Embedded Pilot Assisted Automatic Modulation Classification for OTFS System: A Multi-Domain Fusion Approach. Sensors 2025, 25, 4393. https://doi.org/10.3390/s25144393

