SPARSE-OTFS-Net: A Sparse Robust OTFS Signal Detection Algorithm for 6G Ubiquitous Coverage

Ling, Yunzhi; Xu, Jun

doi:10.3390/electronics14173532


by Yunzhi Ling * and Jun Xu

School of Physics, University of Electronic Science and Technology of China, No. 2006 Xiyuan Avenue, Hi-Tech Zone, Chengdu 611731, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(17), 3532; https://doi.org/10.3390/electronics14173532

Submission received: 2 July 2025 / Revised: 25 August 2025 / Accepted: 30 August 2025 / Published: 4 September 2025

(This article belongs to the Section Microwave and Wireless Communications)


Abstract

With the evolution of 6G technology toward global coverage and multidimensional integration, OTFS modulation has become a research focus due to its advantages in high-mobility scenarios. However, existing OTFS signal detection algorithms face challenges such as pilot contamination, Doppler spread degradation, and diverse interference in complex environments. This paper proposes the SPARSE-OTFS-Net algorithm, which establishes a comprehensive signal detection solution by integrating sparse random pilot design, compressive sensing-based frequency offset estimation with closed-loop cancellation, and joint denoising techniques combining an autoencoder, residual learning, and multi-scale feature fusion. The algorithm employs deep learning to dynamically generate non-uniform pilot distributions, reducing pilot contamination by 60%. Using orthogonal matching pursuit, it achieves super-resolution frequency offset estimation with tracking errors controlled within 20 Hz, effectively addressing Doppler spread degradation. The multi-stage denoising mechanism of deep neural networks suppresses various types of interference while preserving time-frequency domain signal sparsity. Simulation results show that, under large frequency offset, multipath, and low SNR conditions, multi-kernel convolution significantly reduces computational complexity while performing well in tracking error and weak multipath detection. In 1000 km/h high-speed mobility scenarios, Doppler estimation accuracy reaches ±25 Hz (approaching the Cramér-Rao bound), with a BER of 5.0 × 10⁻⁶ (a 7× improvement over the single-Gaussian CNN’s 3.5 × 10⁻⁵). In 1024-user interference scenarios with a BER requirement of 10⁻⁵, the required SNR decreases from 11.4 dB to 9.2 dB (a 2.2 dB reduction), while EVM is maintained at 6.5% under 1024-user concurrency (compared to 16.5% for conventional MMSE), effectively increasing concurrent user capacity in 6G ultra-massive connectivity scenarios. These results validate the superior performance of SPARSE-OTFS-Net in 6G ultra-massive connectivity applications and provide critical technical support for realizing integrated space–air–ground networks.

Keywords:

6G communication; Orthogonal Time Frequency Space (OTFS) modulation; ubiquitous coverage and connectivity; sparse signal processing; deep learning; channel estimation; multi-user interference suppression; high-mobility scenarios

1. Introduction

Driven by the rapid evolution of communication technology, the central application scenarios of 6G are moving beyond the 5G “Enhanced Mobile Broadband (eMBB)/Massive Machine Type Communications (mMTC)/Ultra-Reliable Low-Latency Communications (URLLC)” triangle toward ubiquitous intelligence, multi-dimensional convergence, and the symbiosis of the physical and digital worlds. This shift has led to six flagship scenarios: Ubiquitous-X connectivity, integrated sensing, communication, and computing (ISAC+), immersive extended reality (XR-Pro), autonomous networks 2.0, energy–information cooperative networks, and the digital-twin Earth [1,2].

Within 6G systems, signal-detection technology is critical along three core dimensions:

(1): Ubiquitous Connectivity: Demands seamless air–space–ground–sea coverage. The resulting channels are highly heterogeneous—e.g., high-speed rail channels exhibit Doppler spreads up to 1 kHz, and satellite links experience delay spreads exceeding 10 ms. Traditional detectors struggle to balance dynamics and accuracy; ref. [1] shows that 5G algorithms degrade to a BER of 10⁻² at 1000 km/h.
(2): Ultra-Dense Deployment: The 6G network aims for an area-level deployment target of 10⁷ devices per square kilometer. However, during peak hours, a single base station must simultaneously schedule and transmit data for hundreds of users, leading to significantly intensified pilot contamination and multi-user interference. Field measurement data in [2] demonstrate that when 1024 users access concurrently, the Error Vector Magnitude (EVM) of conventional MMSE detection deteriorates to 18%, far exceeding the 8% threshold typically required by communication standards.
(3): Physical–Digital Symbiosis: Requires joint optimization of physical channels and their digital-twin models, but existing approaches cannot achieve millisecond-level synchronization—ref. [3] shows that latency jitter must remain below 0.1 ms.

Numerous researchers are currently studying OTFS signal detection algorithms [3]. An analysis of the existing research yields the following observations:

(1): The Least Squares (LS)-based channel estimation algorithm [4] achieves rapid estimation by constructing a pilot matrix and directly computing its pseudo-inverse, with a computational complexity of O(P³ + P²MN). It is well-suited for low-interference environments. When the signal-to-noise ratio exceeds 20 dB, the normalized mean square error (NMSE) can reach −15 dB, and detection latency is as low as 0.5 ms. However, the method exhibits several significant drawbacks: pilot contamination degrades the error-vector magnitude (EVM) to 18% in multi-user scenarios; Doppler spreads exceeding 500 Hz increase the estimation error by 10 dB; and multipath components with similar power cannot be resolved, as the resolution is limited to 1/Ts;
(2): The Minimum Mean Square Error (MMSE) algorithm [5,6] performs channel estimation by exploiting the statistical properties of noise and multipath interference, exhibiting excellent noise-robustness with an NMSE as low as −20 dB. Nevertheless, the method suffers from notable limitations: in multipath resolution, its delay resolution can only reach half of the conventional grid spacing, failing to achieve super-resolution estimation; and in dynamic channel tracking, performance degrades by 40% when the frequency offset exceeds 500 Hz;
(3): The 3D Simultaneous Orthogonal Matching Pursuit (3D-SOMP) algorithm [7] significantly enhances OTFS massive MIMO channel-estimation accuracy by leveraging a three-dimensional sparse model spanning delay, Doppler, and angle. With 128 antennas, it reduces pilot overhead by 40% and achieves an NMSE of −22 dB, while regularized screening effectively mitigates fractional-Doppler effects. However, its practical deployment is constrained by limited dynamic adaptability and hardware implementation bottlenecks, restricting its applicability in real-time communication scenarios;
(4): The Sparse Bayesian Learning with damped Least Squares Minimum Residual (SBL-d-LSMR) algorithm [8] achieves high-precision fractional-Doppler estimation by integrating a Basis Expansion Model (BEM) with sparsity-driven optimization. With only 15% pilot overhead, it attains an NMSE of −22 dB and a BER as low as 10⁻⁵ in 120 km/h vehicular networks. However, its performance depends on the channel-sparsity assumption; in dense multipath environments, estimation accuracy decreases by 30%. In addition, 20–30 SBL iterations are required for convergence, resulting in limited real-time capability; dynamic Doppler adaptability is constrained; hardware implementation increases power consumption by 25%; and the algorithm exhibits high parameter sensitivity. These limitations restrict its applicability in more complex scenarios;
(5): Atomic Norm Minimization (ANM) off-grid channel estimation [9] overcomes the limitations of conventional discrete-grid approaches by employing a continuous-domain sparse-recovery framework. The method models delay-Doppler parameters as continuous variables and achieves super-resolution through atomic norm minimization, providing a ten-fold improvement in resolution over traditional techniques and reducing delay error to below 0.1 µs. Nevertheless, it suffers from high computational complexity, limited sensitivity to weak multipath components, and significant performance degradation in multi-user scenarios;
(6): The CNN (Convolutional Neural Network)-based OTFS detector [10] employs multi-scale convolutional kernels to extract local delay-Doppler domain features, enhances sparsity through Rectified Linear Unit (ReLU) activations, and produces symbol estimates via fully connected layers. At 15 dB SNR, it achieves a BER of 3 × 10⁻⁵ with a detection latency of only 1.2 ms, representing a significant improvement over conventional OMP algorithms. Nevertheless, the approach has notable limitations: with a fixed training set, the error-vector magnitude degrades to 18% in multi-user scenarios, indicating insufficient adaptability to dynamic interference patterns;
(7): The GNN (Graph Neural Network)-based OTFS resource-allocation algorithm [11] introduces an innovative dynamic-attributed graph model that maps delay-Doppler space resource units to graph vertices while representing inter-user interference relationships as dynamic edges. In scenarios with 1024 users, it achieves a latency of only 0.6 ms and suppresses multi-user interference by −22 dB. However, the algorithm has two major limitations: it relies on prior channel state information (CSI), with performance decreasing by 30% when CSI error exceeds 10%, and frequent retuning is required under severe Doppler spreads;
(8): The Transformer-OTFS algorithm [12] introduces a novel Gated Multi-Head Self-Attention (G-MHSA) mechanism to capture global context in the delay-Doppler domain. By integrating positional encoding to accurately preserve spatiotemporal signal features, it achieves a bit-error rate of 5 × 10⁻⁷ at 15 dB SNR and suppresses multi-user interference by −22 dB, outperforming existing approaches. However, the algorithm requires 50,000 training samples, which is five times the data demand of CNN-based methods;
(9): Model-Agnostic Meta-Learning (MAML) [13] enables rapid adaptation by creating a dynamic initialization pool that supports instantaneous switching among typical scenarios, such as high-speed rail (500 km/h) and UAV links. It achieves an NMSE of −28 dB with only 50 pilot symbols, reducing pilot overhead to 3%. A notable limitation is that the meta-task generalization capability decreases by 40% under severe Doppler spread.

In response to these issues, this paper conducts research on sparse pilot design, channel error estimation, and real-time signal detection, and proposes an algorithm based on SPARSE-OTFS-Net (Sparse Random Assisted Robust Synchronization and Estimation for OTFS with Network-based Denoising). This algorithm can effectively detect OTFS signals in complex scenarios with large frequency offsets, multipath, low signal-to-noise ratio, high-speed mobility, and co-channel/inter-channel interference during multi-user parallel access. It provides an innovative solution to meet the requirements of OTFS signal detection for ubiquitous connectivity in Ubiquitous-X scenarios.

2. SPARSE-OTFS-Net Model Construction

2.1. Overall Model Structure

The SPARSE-OTFS-Net model achieves high-performance OTFS signal detection through the collaborative optimization of multiple technologies, mainly including sparse random pilot design, compressive sensing-based frequency offset estimation and elimination, autoencoder-based denoising, residual learning-based denoising, and multi-scale feature fusion-based denoising, as shown in Figure 1.

In terms of interference suppression:

The autoencoder module extracts the sparse features of the DD domain signal through a deep convolutional network. This effectively suppresses multipath interference, co-channel/inter-channel interference, and additive white Gaussian noise (AWGN), significantly improving signal quality in low signal-to-noise ratio (SNR) environments;
The residual learning network models the residual features of separable interference through a gating mechanism, focusing on eliminating strong multipath interference and intermittent co-channel interference;
The multi-scale fusion mechanism combines features at different resolutions through a parallel multi-path architecture, significantly enhancing the model’s robustness against mixed interference (including co-channel, inter-channel, multipath, and cross-system interference).

This design comprehensively considers the complex interference characteristics of the Ubiquitous Connectivity (Ubiquitous-X) scenario, ensuring high-precision and low-latency OTFS signal detection in extreme environments.

2.2. SPARSE-OTFS-Net Signal Parsing Algorithm

The SPARSE-OTFS-Net signal parsing algorithm is an integrated signal processing framework designed for Orthogonal Time Frequency Space (OTFS) systems. It combines a sparse random pilot design, a compressive sensing-based frequency offset estimation and closed-loop cancellation system, and denoising based on autoencoders, residual learning, and multi-scale feature fusion to construct a comprehensive signal parsing solution. The algorithm workflow is as follows:

Algorithm 1: SPARSE-OTFS-Net Signal Parsing

Input: Received signal matrix Y ∈ ℂ^(M × N) (M: delay dimension, N: Doppler dimension)
       Dynamic pilot positions S (generated by deep learning)
       Noise power σ²
Output: Parsed clean signal Ŷ_final
1: // Sparse random pilot design
   // Generate pilot symbols
   p_k = complex_pilot_symbol(); // QPSK (Quadrature Phase Shift Keying) / QAM (Quadrature Amplitude Modulation) modulation
   // Lay out pilots dynamically
   for each position (k, l):
       if (k, l) ∈ S: X[k][l] = p_k
       else: X[k][l] = 0
   // Synchronization mechanism
   y_p = matching_pursuit(Y); // Extract pilots with sparse sensing
   compensate_delay(); // Time delay compensation (residual error ≤ 0.1 Ts)
   compensate_doppler(); // Doppler frequency offset compensation (residual error ≤ 20 Hz)
   if (doppler > 500 Hz) increase_pilot_ratio(10%); // Adaptive for high Doppler
2: // Compressive sensing frequency offset estimation
   y = vectorize(Y); // Vectorize received signal
   D = []; // Initialize dictionary matrix
   for (i = 1 to L):
       D.append(SUG(Δf_i) * conjugate_transpose(U) * P); // Build Φ(Δf_i), SUG: Sparse Unitary Group
   // OMP (Orthogonal Matching Pursuit) frequency offset estimation
   r = y_p; // Initialize residual
   I = {}; // Initialize support set
   while (norm(r) ≥ σ_t):
       i_t = argmax_i(|dot_product(D[:,i], r)| / norm(D[:,i])); // Find best match
       I = I ∪ {i_t}; // Update support set
       α = least_squares(D[:,I], y_p); // Solve least squares
       r = y_p - D[:,I] * α; // Update residual
   Δf_hat = Δf_{i_L}; // Estimated frequency offset
3: // Frequency offset compensation
   G = diag_matrix(exp(j*2*π*Δf_hat*(0:N-1)/N)); // Compensation matrix
   Y_comp = inverse(G) * Y; // Compensate frequency offset
4: // Joint denoising mechanism
   // Autoencoder denoising
   Ŷ_ae = autoencoder(Y_comp); // Apply autoencoder
   // Encoder: ReLU(conv_layer(Z, W_e, b_e))
   // Decoder: ReLU(deconv_layer(Ŷ, W_d, b_d))
   // Loss: ‖Y - Ŷ_ae‖²₂
   // Residual learning denoising
   noise_map = resnet(Ŷ_ae); // Learn noise residual
   Ŷ_res = Ŷ_ae - noise_map; // Subtract noise
   // Loss: ‖N - noise_map‖²₂ + λ * ‖noise_map‖₁
   // Multi-scale feature fusion
   F1 = conv3x3(Ŷ_res); // 3 × 3 convolution
   F2 = conv5x5(Ŷ_res); // 5 × 5 convolution
   F3 = conv7x7(Ŷ_res); // 7 × 7 convolution
   Σ = covariance_matrix(F1, F2, F3); // Compute covariance matrix
   α, β, γ = adaptive_weights(Σ); // Compute dynamic weights
   Ŷ_final = α * F1 + β * upsample(F2) + γ * upsample(F3); // Fusion
   return Ŷ_final;

3. Core Module Design

3.1. Sparse Random Pilot Signal Design

In the delay-Doppler (DD) domain, the sparse pilot part of the OTFS transmitted signal matrix $X \in \mathbb{C}^{M \times N}$ (with M as the delay dimension and N as the Doppler dimension) can be expressed as [14]:

$$X_{k,l} = \begin{cases} p_{k}, & (k, l) \in S \\ 0, & \text{otherwise} \end{cases}$$

(1)

where $p_{k}$ represents the pilot symbol, and S is the set of random pilot positions. The sparse pilot distribution in the OTFS system is dynamically generated through deep learning to meet the testing requirements of different scenarios.
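As a rough illustration of Eq. (1), the following sketch fills a delay-Doppler grid with pilots at a random position set S and zeros elsewhere. It is only a stand-in: the paper generates S with a deep-learning model, whereas this sketch draws positions uniformly at random. The function name `sparse_pilot_grid`, the QPSK alphabet, the grid size, and the 10% pilot ratio are illustrative assumptions.

```python
import numpy as np

def sparse_pilot_grid(M, N, pilot_ratio, rng):
    """Build X per Eq. (1): pilot symbol at (k, l) in S, zero otherwise.

    Assumption: S is sampled uniformly at random here; the paper instead
    generates S with a deep-learning model that is not reproduced.
    """
    num_pilots = max(1, int(round(pilot_ratio * M * N)))
    flat_idx = rng.choice(M * N, size=num_pilots, replace=False)
    S = {(int(i // N), int(i % N)) for i in flat_idx}
    # Unit-energy QPSK alphabet (hypothetical choice of pilot symbols)
    qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
    X = np.zeros((M, N), dtype=complex)
    for (k, l) in S:
        X[k, l] = qpsk[rng.integers(4)]
    return X, S

rng = np.random.default_rng(0)
X, S = sparse_pilot_grid(M=16, N=8, pilot_ratio=0.10, rng=rng)
```

Every entry of `X` outside `S` is exactly zero, which is the sparsity that the pilot-extraction stage relies on.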

The synchronization mechanism for the DD-domain sparse random pilot signals $X_{k,l}$ (with dynamically generated pilot location set S) operates as follows:

(1): Pilot detection: At the receiver, compressive sensing matching pursuit is applied to extract pilots from the random locations S. Correlation peak detection, based on the known amplitude/phase signature p₁(k,l), resolves the synchronization challenge caused by non-fixed pilot positions;
(2): Joint timing–frequency calibration: Using the estimated pilot locations $\hat{S}$, an interpolation-based phase-locked loop (PLL) compensates for both timing offset (error ≤ 0.1 Ts) and Doppler shift (error ≤ 20 Hz);
(3): Dynamic feedback optimization: An online learning module continuously adapts the pilot density across the M × N grid, balancing signaling overhead against synchronization robustness. For instance, the pilot ratio is increased to 10% in high-Doppler scenarios.

3.2. Received Signal Model

After passing through the time-varying channel and experiencing a frequency offset $\Delta f$, the received signal in the delay-Doppler (DD) domain can be expressed as:

$$Y = G(\Delta f)\, X\, F_{N}^{H} + W$$

(2)

where:

$G(\Delta f)$ is the diagonal matrix representing the frequency offset, given by $G(\Delta f) = \mathrm{diag}\left[ e^{j 2 \pi \Delta f \frac{n}{N}} \right]_{n = 0}^{N - 1}$;
$F_{N}$ is the N-point Discrete Fourier Transform (DFT) matrix;
W is the additive white Gaussian noise.
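The model in Eq. (2) and its inversion can be sketched numerically as follows. This is a minimal illustration rather than the authors' simulator: it assumes a square grid (M = N) so that the product G X is conformable, treats Δf as a normalized offset, and uses hypothetical values for the grid size, offset, and noise level.

```python
import numpy as np

N = 8            # hypothetical square grid, M = N (assumption)
delta_f = 0.3    # hypothetical normalized frequency offset
rng = np.random.default_rng(0)

def offset_matrix(delta_f, N):
    """G(delta_f) = diag(exp(j*2*pi*delta_f*n/N)), n = 0..N-1."""
    n = np.arange(N)
    return np.diag(np.exp(2j * np.pi * delta_f * n / N))

F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # unitary N-point DFT matrix
X = np.zeros((N, N), dtype=complex)      # sparse pilot grid
X[1, 2] = 1.0
X[5, 6] = 1j

G = offset_matrix(delta_f, N)
W = 0.01 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
Y = G @ X @ F.conj().T + W               # Eq. (2)

# G is diagonal with unit-modulus entries, so its inverse is G^H.
Y_clean = G @ X @ F.conj().T
X_rec = G.conj().T @ Y_clean @ F         # undo offset, then undo F^H
```

Because G(Δf) is unitary, compensation with a correct estimate of Δf costs only a conjugate phase rotation; the residual error is governed by the accuracy of the offset estimate.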

3.3. Compressive Sensing-Based Frequency Offset Estimation

(1) Vectorization Processing

Vectorize the received signal Y to form $y = \mathrm{vec}(Y) \in \mathbb{C}^{MN \times 1}$, i.e.,

$$y = \underbrace{(F_{N} \otimes I_{M})}_{U}\, G(\Delta f)\, \underbrace{(F_{N}^{H} \otimes I_{M})}_{U^{H}}\, H x + w$$

(3)

where H is the channel convolution matrix and x = vec(X) is the sparse vector.

(2) Sparse Sensing Equation

Extract the received signal corresponding to the pilot positions, $y_{p} = \varphi(\Delta f)\, h + w_{p}$, where $\varphi(\Delta f) = S\, U\, G(\Delta f)\, U^{H} P$ (P being the pilot position mapping matrix), and h is the sparse channel vector.

(3) Compressive Sensing Frequency Offset Estimation

For the candidate frequency offset set $\{\Delta f_{1}, \Delta f_{2}, \dots, \Delta f_{L}\}$, construct the dictionary matrix $D = [\varphi(\Delta f_{1})\ \varphi(\Delta f_{2})\ \dots\ \varphi(\Delta f_{L})]$. The sparse recovery problem can be formulated as:

$$y_{p} = D \alpha + w_{p}$$

(4)

where α is the sparse vector.

Steps for Compressive Sensing Frequency Offset Estimation:

(a) Initialization: initial error (residual) $r_{0} = y_{p}$, initial support set $I_{0} = \emptyset$.

(b) Iteration Process (t-th iteration):

Select Atom: $i_{t} = \arg\max_{i} \frac{|d_{i}^{H} r_{t-1}|}{\|d_{i}\|_{2}}$;
Update Support Set: $I_{t} = I_{t-1} \cup \{i_{t}\}$;
Least Squares Estimation: $\alpha_{t} = (D_{I_{t}}^{H} D_{I_{t}})^{-1} D_{I_{t}}^{H} y_{p}$;
Update Estimation Error: $r_{t} = y_{p} - D_{I_{t}} \alpha_{t}$.

Termination Condition: $\|r_{t}\|_{2} < \sigma_{t}$, where $\sigma_{t}$ is the error threshold.

(4) Frequency Offset Estimation:

$\Delta \hat{f} = \Delta f_{i_{L}}$, where $i_{L} = \arg\min_{I_{L}} \| y_{p} - D_{I_{L}} \alpha_{L} \|$
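The OMP steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: each candidate offset contributes a single atom (a complex exponential sampled at a hypothetical period `T_s`) instead of the full block φ(Δf_i), the channel is a single path with gain 0.8, and the offset estimate is read from the first selected atom.

```python
import numpy as np

def omp(D, y, sigma_t, max_iter=None):
    """Orthogonal matching pursuit following steps (a)-(b) above."""
    r = y.copy()                       # (a) initial residual r_0 = y_p
    support = []                       # (a) initial support set I_0 = {}
    alpha = np.zeros(0, dtype=complex)
    max_iter = D.shape[1] if max_iter is None else max_iter
    col_norms = np.linalg.norm(D, axis=0)
    while np.linalg.norm(r) >= sigma_t and len(support) < max_iter:
        corr = np.abs(D.conj().T @ r) / col_norms   # select atom
        if support:
            corr[support] = 0.0                     # never reselect an atom
        support.append(int(np.argmax(corr)))        # update support set
        alpha, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        r = y - D[:, support] @ alpha               # update residual
    return support, alpha, r

# Toy dictionary: one atom per candidate offset (assumption, see lead-in).
T_s = 1e-5                                  # hypothetical sample period [s]
n = np.arange(64)
candidates = np.arange(-1000.0, 1001.0, 100.0)   # candidate offsets [Hz]
D = np.exp(2j * np.pi * np.outer(n * T_s, candidates))

true_idx = 13                               # corresponds to +300 Hz
y_p = 0.8 * D[:, true_idx]                  # noiseless single-path pilots
support, alpha, r = omp(D, y_p, sigma_t=1e-6)
delta_f_hat = candidates[support[0]]
```

In this noiseless toy case one iteration suffices, because the residual falls below the threshold as soon as the correct atom is in the support. With noise, the threshold σ_t would typically be tied to the noise power.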

3.4. Autoencoder Denoising

The autoencoder projects the noisy signal onto the manifold of the clean signal through a nonlinear mapping, thereby suppressing the noise components. The encoder retains the main components of the signal (such as the time–frequency sparsity of OTFS), achieving the denoising function. The error is determined by the noise power [15].

Encoder

The encoder extracts low-dimensional features through multiple layers of convolution and downsampling. The output of the l-th layer is:

$$Z^{(l)} = \mathrm{ReLU}\left( W_{e}^{(l)} * Z^{(l-1)} + b_{e}^{(l)} \right)$$

(5)

where * denotes the convolution operation, $W_{e}^{(l)}$ is the convolution kernel weight, $b_{e}^{(l)}$ is the bias, and $Z^{(0)} = \hat{Y}(k, l)$ is the input to the first layer.

Decoder

The decoder reconstructs the signal through transposed convolution:

$$\hat{Y}^{(l)} = \mathrm{ReLU}\left( W_{d}^{(l)} \otimes \hat{Y}^{(l-1)} + b_{d}^{(l)} \right)$$

(6)

where ⊗ denotes the transposed convolution, $W_{d}^{(l)}$ and $b_{d}^{(l)}$ are the weights and biases of the decoder, $\hat{Y}^{(l-1)}$ is the input to the l-th decoder layer (the encoder output for the first layer), and $\hat{Y}_{\mathrm{final}} = \hat{Y}^{(L)}$ is the final output. The goal is to minimize the reconstruction loss $L_{\mathrm{recon}} = \| Y - \hat{Y}_{\mathrm{final}} \|_{2}^{2}$.
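A single encoder layer and a single decoder layer of this kind can be sketched in plain NumPy. The sketch is deliberately reduced and makes several assumptions: a 1-D real-valued signal instead of the complex M × N grid, one layer each, kernel size 4 with stride 2, and random untrained weights. All function names are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv1d_stride2(x, w, b):
    """Encoder layer: strided (downsampling) valid correlation plus bias."""
    k = len(w)
    n_out = (len(x) - k) // 2 + 1
    return np.array([np.dot(w, x[2 * i:2 * i + k]) for i in range(n_out)]) + b

def deconv1d_stride2(z, w, b):
    """Decoder layer: transposed convolution that restores the input length."""
    k = len(w)
    out = np.zeros(2 * (len(z) - 1) + k)
    for i, z_i in enumerate(z):
        out[2 * i:2 * i + k] += z_i * w   # scatter each latent sample
    return out + b

rng = np.random.default_rng(0)
Y_clean = np.abs(rng.standard_normal(16))            # toy clean signal
Y_noisy = Y_clean + 0.1 * rng.standard_normal(16)    # toy noisy input

W_e, b_e = 0.5 * rng.standard_normal(4), 0.0          # untrained weights
W_d, b_d = 0.5 * rng.standard_normal(4), 0.0

Z = relu(conv1d_stride2(Y_noisy, W_e, b_e))           # Eq. (5), one layer
Y_hat = relu(deconv1d_stride2(Z, W_d, b_d))           # Eq. (6), one layer
L_recon = float(np.sum((Y_clean - Y_hat) ** 2))       # reconstruction loss
```

The stride-2 encoder halves the resolution (16 samples to 7 latent values) and the transposed convolution maps it back to 16 samples, which is the bottleneck structure the denoising argument depends on.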

Assuming the noise suppression capability of the encoder–decoder pair of the autoencoder is ϵ, then:

$$\mathbb{E}\left[ \| Y - \hat{Y}_{\mathrm{final}} \|_{2}^{2} \right] \leq \epsilon \sigma^{2} + O\left( \frac{1}{\sqrt{n}} \right)$$

(7)

The training of the autoencoder aims to learn a mapping function from noisy inputs to clean signals. The learning process involves the following steps:

(1) Data preparation: We assemble a dataset $D = \{(\hat{Y}_{i}, Y_{i})\}_{i = 1}^{n}$, where $\hat{Y}_{i}$ denotes the noisy received signal and $Y_{i}$ the corresponding clean ground-truth signal. The noise power σ² is controlled during data generation.

(2) Loss-function optimization: With the reconstruction loss $L_{\mathrm{recon}}$ as the objective, the training aims to minimize the mean squared error (MSE) by means of a gradient-descent algorithm:

Forward pass: the noisy input $\hat{Y}$ is first fed into the encoder (a stack of convolution layers with ReLU activations), yielding a low-dimensional latent feature $Z^{(l)}$. $Z^{(l)}$ is then passed through the decoder, composed of transposed convolutions and ReLU activations, to produce the reconstructed signal $\hat{Y}_{\mathrm{final}}$;
Compute the loss $L_{\mathrm{recon}}$;
Back-propagation: compute the gradients $\nabla L_{\mathrm{recon}}$ of the loss with respect to the weights $W_{e}^{(l)}$ and biases $b_{e}^{(l)}$ using the chain rule, employing the derivative of the ReLU activations.

(3) Training algorithm: we use either stochastic gradient descent (SGD) or the Adam optimizer. Parameters are updated as

$$W_{e}^{(l)} \leftarrow W_{e}^{(l)} - \eta \frac{\partial L_{\mathrm{recon}}}{\partial W_{e}^{(l)}}, \qquad b_{e}^{(l)} \leftarrow b_{e}^{(l)} - \eta \frac{\partial L_{\mathrm{recon}}}{\partial b_{e}^{(l)}}$$

where $\eta$ is the learning rate. Training continues iteratively until the loss converges or the preset number of epochs is reached.
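The update rule can be illustrated on a toy problem with a single trainable parameter. The sketch below replaces the full autoencoder with a scalar gain w applied to the noisy input, which is an assumption made purely so that the gradient can be written by hand; the learning rate, epoch count, and data are hypothetical.

```python
import numpy as np

def train_scalar_gain(noisy, clean, eta=0.1, epochs=50):
    """Plain gradient descent on L(w) = mean((clean - w * noisy)^2)."""
    w = 0.0
    history = []
    for _ in range(epochs):
        recon = w * noisy
        history.append(float(np.mean((clean - recon) ** 2)))
        grad = -2.0 * np.mean((clean - recon) * noisy)   # dL/dw
        w -= eta * grad                                  # w <- w - eta * dL/dw
    return w, history

rng = np.random.default_rng(0)
clean = rng.standard_normal(256)
noisy = clean + 0.3 * rng.standard_normal(256)

w, history = train_scalar_gain(noisy, clean)
w_opt = float(np.dot(clean, noisy) / np.dot(noisy, noisy))  # closed form
```

The learned gain converges to the closed-form least-squares value, which is slightly below 1: even this one-parameter model shrinks the noisy input toward zero to reduce noise power.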

(4) Hyper-parameter tuning: the threshold ϵ, which governs effective network capacity (e.g., depth L or convolutional kernel size), is selected via cross-validation.

3.5. Residual Learning for Denoising Under Non-Gaussian Noise Challenges

Assuming that the noise N is independent of the signal Y, and $Y'$ is the signal after adding the noise N to the signal Y, the residual network $R(Y')$ estimates the noise. The posterior probability is then given by:

$$p(N \mid Y') \propto p(Y' \mid N)\, p(N) = \delta(Y' - Y - N) \cdot \mathcal{N}(N \mid 0, \sigma^{2})$$

(8)

By using the negative log likelihood, the loss function is transformed into:

$$L_{\mathrm{res}} = \| N - R(Y') \|_{2}^{2} + \lambda \| R(Y') \|_{1}$$

(9)

where $R(Y')$ is the output of the residual network, and the $\| \cdot \|_{1}$ term enforces the sparsity of the noise.
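As a quick numeric check of Eq. (9), the loss can be evaluated directly. The vectors and the value of λ below are hypothetical, and `residual_loss` is an illustrative helper rather than part of the paper's code.

```python
import numpy as np

def residual_loss(noise, noise_est, lam):
    """L_res = ||N - R(Y')||_2^2 + lam * ||R(Y')||_1, as in Eq. (9)."""
    fit = np.sum(np.abs(noise - noise_est) ** 2)
    sparsity = np.sum(np.abs(noise_est))
    return float(fit + lam * sparsity)

N_true = np.array([1.0, -2.0])     # hypothetical true noise
R_out = np.array([0.5, -1.0])      # hypothetical network output R(Y')
loss = residual_loss(N_true, R_out, lam=0.1)   # 1.25 + 0.1 * 1.5 = 1.4
```

The first term penalizes estimation error and the second penalizes the magnitude of the estimate itself, so a larger λ pulls the noise estimate toward zero.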

The estimation error of the noise output by the residual network is:

$$\mathbb{E}\left[ \| N - R(Y') \|_{2}^{2} \right] = \sigma^{2} \left( 1 - \frac{\mathrm{SNR}}{1 + \mathrm{SNR}} \right) + \lambda\, \mathbb{E}\left[ \| R(Y') \|_{1} \right]$$

(10)

where SNR is the signal-to-noise ratio. The sparsity constraint introduces a bias–variance trade-off.
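The first term of Eq. (10) simplifies to σ²/(1 + SNR), which makes the SNR dependence easier to see. The snippet below evaluates both forms for a few hypothetical operating points, with SNR taken as a linear ratio (an assumption, since the text does not state the unit) and the sparsity term set to zero.

```python
def noise_mse(sigma2, snr_linear, lam=0.0, l1_term=0.0):
    """Right-hand side of Eq. (10); snr_linear is a ratio, not dB."""
    return sigma2 * (1.0 - snr_linear / (1.0 + snr_linear)) + lam * l1_term

def db_to_linear(snr_db):
    return 10.0 ** (snr_db / 10.0)

sigma2 = 1.0  # hypothetical noise power
errors = {snr_db: noise_mse(sigma2, db_to_linear(snr_db))
          for snr_db in (0.0, 10.0, 20.0)}
# Equivalent closed form: sigma2 / (1 + SNR)
closed_form = {snr_db: sigma2 / (1.0 + db_to_linear(snr_db))
               for snr_db in (0.0, 10.0, 20.0)}
```

At 0 dB the unregularized term equals σ²/2, and it decays roughly as σ²/SNR at high SNR; the λ-weighted term then sets the floor.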

Although the residual learning in this paper is based on a Gaussian noise assumption, it effectively suppresses non-Gaussian noise in high-speed mobility (Doppler dispersion) and ultra-dense deployment (spatially correlated noise) scenarios through a triple mechanism of an L1 (Least Absolute Deviation) sparsity constraint, multi-scale covariance weighting, and dynamic regularization, thereby further enhancing noise adaptability in 6G full-scenario coverage.

The core idea of residual learning is to train the network to estimate the noise residual directly, rather than the signal itself. The learning procedure is formulated under a Maximum A Posteriori (MAP) framework, and proceeds as follows:

(1) Data preparation: we construct a dataset $D = \{(Y'_{i}, Y_{i})\}_{i = 1}^{n}$, where $Y'_{i} = Y_{i} + N_{i}$ is the noisy observation, $N_{i}$ is the known noise vector, and $Y_{i}$ is the corresponding clean ground-truth signal.

(2) Loss-function optimization: the loss $L_{\mathrm{res}}$ is composed of two terms: an MSE term $\| N - R(Y') \|_{2}^{2}$ and a sparsity-regularization term $\lambda \| R(Y') \|_{1}$. The training procedure includes:

Forward pass: the noisy input Y′ is forwarded through the residual network to yield a noise estimate R(Y′);
Compute the loss $L_{\mathrm{res}}$;
Back-propagation: for gradient computation, the $\| \cdot \|_{2}^{2}$ term employs the standard L2-norm (least-squares) derivative, whereas the $\| \cdot \|_{1}$ term is handled via the sub-gradient method.
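The gradient computation in the back-propagation step can be sketched with respect to the network output R(Y′). The helper below is illustrative and assumes real-valued vectors; it uses sign(·) as the sub-gradient of the L1 term, taking the value 0 at zero, which is one valid choice from the sub-differential [−1, 1].

```python
import numpy as np

def residual_loss_subgradient(noise, noise_est, lam):
    """Sub-gradient of ||N - R||_2^2 + lam * ||R||_1 with respect to R."""
    grad_l2 = 2.0 * (noise_est - noise)      # derivative of the L2 term
    subgrad_l1 = lam * np.sign(noise_est)    # sign(0) = 0 is a valid choice
    return grad_l2 + subgrad_l1

N_true = np.array([1.0, -2.0, 0.0])   # hypothetical true noise
R_out = np.array([0.5, -1.0, 0.0])    # hypothetical network output
g = residual_loss_subgradient(N_true, R_out, lam=0.1)
```

In a full network this vector would be propagated further back through the layers by the chain rule; automatic-differentiation frameworks generally apply the same sign-based convention for the absolute value.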

(3) Training algorithm: we employ the Adam optimizer to update the network weights so as to minimize the loss $L_{\mathrm{res}}$. The training objective is to minimize the expected error $\mathbb{E}\left[ \| N - R(Y') \|_{2}^{2} \right]$.

(4) Convergence and evaluation: during training, the validation-set error is continually monitored; the regularization parameter λ is adjusted on the fly to balance noise-estimation accuracy against sparsity.

3.6. Multi-Scale Feature Fusion Denoising

The main approach is to use convolutional kernels of different scales to cover different frequency bands. The covariance matrix adjusts the weights to enhance the signal’s dominant frequency bands and suppress noise.

Multi-branch Convolution

Different-scale convolutional kernels extract features:

$$F_{1} = \mathrm{Conv}_{3 \times 3}(Y), \quad F_{2} = \mathrm{Conv}_{5 \times 5}(Y), \quad F_{3} = \mathrm{Conv}_{7 \times 7}(Y)$$

(11)

Adaptive Fusion

$$F_{\mathrm{fuse}} = \alpha F_{1} + \beta \cdot \mathrm{Upsample}(F_{2}) + \gamma \cdot \mathrm{Upsample}(F_{3})$$

(12)

where $\alpha$ is the weight for the signal’s dominant frequency band, and $\beta$ and $\gamma$ are the weights for the noise frequency bands. The weights $\alpha$, $\beta$, and $\gamma$ are dynamically adjusted through the covariance matrix in the frequency domain:

$$\Sigma = \frac{1}{K} \sum_{k = 1}^{K} (F_{k} - \mu)(F_{k} - \mu)^{T}$$

(13)
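Equations (11) to (13) can be sketched in NumPy as shown below. Several simplifications are assumed: the branches use fixed box (averaging) kernels instead of learned ones, zero-padded "same" convolutions keep every branch at the input resolution so the Upsample step reduces to the identity, the input is real-valued, and the fusion weights are fixed by hand because the mapping from Σ to (α, β, γ) is not specified in the text.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Zero-padded 'same' 2-D correlation (the deep-learning convention)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def box_kernel(size):
    """Averaging kernel standing in for a learned kernel (assumption)."""
    return np.full((size, size), 1.0 / size ** 2)

def branch_covariance(features):
    """Eq. (13) with each F_k flattened and mu the mean over branches."""
    F = np.stack([f.ravel() for f in features])   # shape K x (M*N)
    centered = F - F.mean(axis=0)
    return centered.T @ centered / len(features)

Y = np.ones((8, 8))                                   # toy input grid
F1, F2, F3 = (conv2d_same(Y, box_kernel(s)) for s in (3, 5, 7))   # Eq. (11)
alpha, beta, gamma = 0.5, 0.3, 0.2                    # hypothetical weights
F_fuse = alpha * F1 + beta * F2 + gamma * F3          # Eq. (12)
Sigma = branch_covariance([F1, F2, F3])               # Eq. (13)
```

The three branches agree in the interior of the grid and differ near the edges, where the larger kernels see more zero padding, so Σ is non-zero only for border positions in this toy case.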

After multi-scale fusion, the signal-to-noise ratio (SNR) is improved:

$$\mathrm{SNR}_{\mathrm{new}} = \frac{\alpha^{2} P_{X}}{\beta^{2} P_{N} + \gamma^{2} P_{N}}$$

(14)

where $P_{X}$ and $P_{N}$ are the powers of the signal and noise, respectively.
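A short numeric example of Eq. (14) shows how the weights translate into an SNR change. The weights and powers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def fused_snr(alpha, beta, gamma, p_signal, p_noise):
    """SNR_new from Eq. (14), returned as a linear ratio."""
    return (alpha ** 2 * p_signal) / ((beta ** 2 + gamma ** 2) * p_noise)

def to_db(ratio):
    return 10.0 * math.log10(ratio)

p_signal, p_noise = 1.0, 0.1              # input SNR = 10 (10 dB)
snr_in = p_signal / p_noise
snr_out = fused_snr(alpha=1.0, beta=0.5, gamma=0.5,
                    p_signal=p_signal, p_noise=p_noise)
gain_db = to_db(snr_out) - to_db(snr_in)  # about 3 dB for these weights
```

Under this model the gain depends only on the ratio α²/(β² + γ²), so fusion improves the SNR exactly when α² exceeds β² + γ².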

Multi-scale feature fusion strengthens signal-related frequency components and suppresses noise by learning adaptive weights across different scales. The learning process centers on feature fusion and weight optimization:

(1): Data preparation: assemble the dataset $D = \{(\hat{Y}_{i}, Y_{i})\}_{i = 1}^{n}$, where $\hat{Y}_{i}$ denotes the noisy input and $Y_{i}$ the corresponding clean target.
(2): Feature extraction and fusion:

Multi-branch convolution: the input $\hat{Y}$ is processed by three parallel convolutional layers to produce feature maps $F_{1}$, $F_{2}$, and $F_{3}$, whose kernel sizes are 3 × 3, 5 × 5, and 7 × 7, respectively, capturing information across different frequency bands;
Feature fusion: $F_{\mathrm{fuse}} = \alpha F_{1} + \beta \cdot F_{2} + \gamma \cdot F_{3}$, where α, β, and γ are learnable weights initialized to 1;
Final output: the fused feature $F_{\mathrm{fuse}}$ is forwarded through additional layers (e.g., fully connected or convolutional) to generate the denoised signal $\hat{Y}$.

(3): Loss function and optimization: the loss is based on the mean squared error, i.e., $L_{\mathrm{fusion}} = \| Y - \hat{Y} \|_{2}^{2}$
(4): Training steps

Forward pass: compute $F_{1}$, $F_{2}$, $F_{3}$, and the fused feature $F_{\mathrm{fuse}}$;
Loss computation: $L_{\mathrm{fusion}} = \frac{1}{n} \sum_{i = 1}^{n} \| Y_{i} - \hat{Y}_{i} \|_{2}^{2}$;
Back-propagation: gradients are applied to update the fusion weights (α, β, γ) and the convolutional kernels. The objective is to maximize the output signal-to-noise ratio; through this learning process, the weights automatically emphasize dominant signal features and suppress noise, thereby improving the overall SNR.

(5): Training algorithm

We employ SGD with momentum and a decaying learning-rate schedule. The initialization of the fusion weights (α, β, γ) strongly influences convergence; therefore, frequency bands with higher signal power are initialized with larger α values. Throughout training, the output $\mathrm{SNR}_{\mathrm{new}}$ is continuously monitored to guide convergence and hyper-parameter tuning.

4. Performance Evaluation and Simulation Analysis

To verify the performance of SPARSE-OTFS-Net in the Ubiquitous Connectivity (Ubiquitous-X) scenario, this study designed simulation tests for three typical scenarios: resource half-allocation detection under large frequency offset, multipath, and low signal-to-noise ratio (SNR) conditions; resource full-allocation detection in a 1000 km/h high-speed mobility scenario; and detection in a co-channel/inter-channel interference scenario with 1024 users accessing in parallel.

4.1. Large Frequency Offset, Multipath, and Low SNR Scenario Test

Simulation parameter settings:

Signal bandwidth: 30.72 MHz;
Subcarrier spacing: 30 kHz;
Delay dimension (M): 1024;
Doppler dimension (N): 16 (Doppler resolution: 1.875 kHz);
Doppler shift: 1 kHz;
Channel environment: 1024 multipath interferences with a maximum delay of 3.1 μs and a maximum multipath gain of −24.2 dB;
Service distribution: 42 resource blocks (RBs) out of a total of 85 RBs.
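The grid quantities implied by these parameters can be checked with a few lines of arithmetic. The relations used (number of delay bins = bandwidth / subcarrier spacing, Doppler resolution = subcarrier spacing / N, frame duration = N / subcarrier spacing) are the standard OTFS grid relations and are assumed here rather than quoted from the paper; cyclic-prefix overhead is ignored.

```python
bandwidth_hz = 30.72e6
subcarrier_spacing_hz = 30e3
N = 16                       # Doppler dimension
doppler_shift_hz = 1e3
allocated_rbs, total_rbs = 42, 85

# Delay bins implied by bandwidth and subcarrier spacing
M_implied = round(bandwidth_hz / subcarrier_spacing_hz)

# Doppler resolution and frame duration (assumed standard OTFS relations)
doppler_resolution_hz = subcarrier_spacing_hz / N
frame_duration_s = N / subcarrier_spacing_hz

# Doppler shift expressed in units of one Doppler bin
normalized_doppler = doppler_shift_hz / doppler_resolution_hz

# Fraction of resource blocks in use ("half allocation")
allocation_ratio = allocated_rbs / total_rbs
```

With these values the 1 kHz Doppler shift is a little over half of one Doppler bin, so it falls between grid points. This fractional-Doppler case is the regime in which off-grid or super-resolution estimation matters.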

The performance comparison analysis of various algorithms is shown in Table 1.

For detailed results, refer to the BER and EVM comparison charts for OTFS signal detection based on each algorithm (Figure 2 and Figure 3).

The simulation results demonstrate that SPARSE-OTFS-Net achieves significantly reduced computational complexity through multi-kernel convolution technology compared to conventional algorithms, while exhibiting superior performance in both tracking error and weak multipath detection. This makes it particularly suitable for typical 6G complex scenarios characterized by large frequency offsets, multipath interference, and low signal-to-noise ratio (SNR) conditions.

4.2. Performance Evaluation in High-Speed Mobility Scenario

In the high-speed mobility scenario at 1000 km/h, the OTFS system parameters are set as follows: bandwidth of 30.72 MHz, subcarrier spacing of 30 kHz, delay dimension of 1024, Doppler dimension of 16, and full resource allocation of 85 resource blocks (RBs) [16]. The performance comparison analysis of various algorithms is shown in Table 2.

For detailed results, refer to the BER and EVM comparison charts for OTFS signal detection based on each algorithm (Figure 4 and Figure 5).

Simulation results demonstrate that SPARSE-OTFS-Net delivers exceptional performance in high-speed mobility scenarios:

Doppler estimation accuracy: ±25 Hz (approaching the Cramér–Rao lower bound);
Bit error rate (BER): 5.0 × 10⁻⁶ (7× improvement over single-Gaussian CNN’s 3.5 × 10⁻⁵).

The algorithm demonstrates significant advantages for 6G ultra-high-speed applications.

4.3. Multi-User Interference Scenario Test

In the scenario with 1024 users (511 co-channel + 512 inter-channel) accessing the system in parallel, the system parameters are set as described above, with one resource block (RB) allocated. The performance comparison analysis of various algorithms is shown in Table 3.

For detailed results, refer to the BER and EVM comparison charts for OTFS signal detection based on each algorithm (Figure 6 and Figure 7).

Simulation results demonstrate that SPARSE-OTFS-Net significantly enhances interference resistance in multi-user interference scenarios through residual learning and multi-scale covariance weighting. For a target BER of 10⁻⁵, the required SNR falls from 11.4 dB (single-kernel CNN) to 9.2 dB, a 2.2 dB improvement, while EVM is held at 6.5% with 1024 concurrent users (compared to 16.5% for traditional MMSE). This effectively increases the number of concurrent users that can be supported in 6G ultra-massive connectivity scenarios.
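To put the 2.2 dB figure in linear terms, the short calculation below converts the SNR reduction into a ratio of required signal powers at a fixed noise level. It uses only the two SNR values quoted above; the variable names are illustrative.

```python
snr_baseline_db = 11.4   # SNR needed for BER = 1e-5 with the baseline (from the text)
snr_proposed_db = 9.2    # SNR needed for BER = 1e-5 with SPARSE-OTFS-Net (from the text)

gain_db = snr_baseline_db - snr_proposed_db
power_ratio = 10 ** (gain_db / 10)        # linear ratio of required signal powers
power_saving = 1.0 - 1.0 / power_ratio    # fraction of signal power saved

print(f"gain = {gain_db:.1f} dB, power ratio = {power_ratio:.2f}, saving = {power_saving:.0%}")
```

A 2.2 dB reduction corresponds to a power ratio of about 1.66, meaning the same target BER is reached with roughly 40% less signal power.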

5. Conclusions

This paper proposes a high-precision OTFS signal detection method based on SPARSE-OTFS-Net for 6G ubiquitous coverage (Ubiquitous-X) scenarios, specifically addressing the challenges of signal detection in complex environments such as high-speed mobility and multi-user interference. By employing sparse random pilot design, compressive sensing-based frequency offset estimation and cancellation, and a joint denoising mechanism combining autoencoders, residual learning, and multi-scale fusion, the proposed method effectively overcomes the limitations of traditional algorithms in Doppler tracking, weak multipath detection, and interference resistance.

Experimental results demonstrate the algorithm’s outstanding performance across three extreme scenarios:

Large frequency offset and multipath environments (Doppler tracking error within 20 Hz);
High-speed mobility at 1000 km/h (performance nearing theoretical optimum);
Parallel access for 1024 users (achieving BER = 10⁻⁵ at SNR = 9.2 dB with multi-kernel fusion, versus 11.4 dB for single-kernel CNN, showing significant interference resistance improvement).

This work provides a reliable signal detection solution for 6G ultra-massive connectivity deployments.

Future research directions will focus on the following aspects:

Exploring the deep integration of OTFS and edge computing [17] to enhance the real-time performance of the algorithm;
Investigating the detection methods for new transmission technologies such as NOMA (Non-Orthogonal Multiple Access)-OTFS [18] and terahertz communication;
Strengthening the robustness verification of the algorithm in typical scenarios such as industrial dense multipath and vehicle-to-everything (V2X) communications [19];
Promoting the standardization of complex channel models for 6G Internet of Things.

These efforts will further improve the performance of 6G communication systems and provide strong support for future industrial development.

Author Contributions

Conceptualization, Y.L.; methodology, Y.L.; software, Y.L.; validation, Y.L.; formal analysis, Y.L.; investigation, Y.L.; resources, J.X.; data curation, Y.L.; writing—original draft, Y.L.; writing—review and editing, J.X.; visualization, Y.L.; supervision, J.X.; project administration, J.X.; funding acquisition, J.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Wei, Z.; Yuan, W.; Li, S.; Yuan, J.; Bharatula, G.; Hadani, R.; Hanzo, L. Orthogonal Time-Frequency Space Modulation: A Promising Next-Generation Waveform. IEEE Wirel. Commun. 2021, 28, 136–144.
2. 3GPP TR 38.901, version 17.0.0. Study on Channel Model for Frequencies from 0.5 to 100 GHz. 3rd Generation Partnership Project: Sophia Antipolis, France, 2023.
3. Li, S.; Yuan, W.; Yuan, J.; Caire, G. On the Potential of Spatially-Spread Orthogonal Time Frequency Space Modulation for ISAC Transmissions. In Proceedings of the ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022.
4. Surabhi, G.D.; Chockalingam, A. Low-Complexity Linear Equalization for OTFS Modulation. IEEE Commun. Lett. 2020, 24, 330–334.
5. Murali, K.R.; Chockalingam, A. On OTFS Modulation for High-Doppler Fading Channels. In Proceedings of the 2018 Information Theory and Applications Workshop (ITA), San Diego, CA, USA, 11–16 February 2018; pp. 11–16.
6. Raviteja, P.; Phan, K.T.; Hong, Y.; Viterbo, E. Interference Cancellation and Iterative Detection for Orthogonal Time Frequency Space Modulation. IEEE Trans. Signal Process. 2020, 68, 258–272.
7. Shen, W.; Dai, L.; Han, S.; Chih-Lin, I.; Heath, R.W. Channel Estimation for Orthogonal Time Frequency Space (OTFS) Massive MIMO. IEEE Trans. Signal Process. 2019, 67, 4204–4217.
8. Yong, L.; Xue, L. Low Complexity Receiver Design for Orthogonal Time Frequency Space Systems. J. Electron. Inf. Technol. 2024, 46, 2418–2424.
9. Tian, Z.; Zhang, Z.; Wang, Y. Low-Complexity Optimization for Two-Dimensional Direction-of-Arrival Estimation via Decoupled Atomic Norm Minimization. In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017.
10. Enku, Y.K.; Bai, B.; Wan, F.; Guyo, C.U.; Tiba, I.N.; Zhang, C.; Li, S. Two-Dimensional Convolutional Neural Network-Based Signal Detection for OTFS Systems. IEEE Wirel. Commun. Lett. 2021, 10, 2514–2518.
11. Zhang, X.; Zhang, S.; Xiao, L.; Li, S.; Jiang, T. Graph Neural Network Assisted Efficient Signal Detection for OTFS Systems. IEEE Commun. Lett. 2023, 27, 2058–2062.
12. Yang, H.; Zhu, J.; Sun, Y.; Ding, Z. A Transformer Based Signal Detection Method for Orthogonal Time Frequency Space Communication Systems. In Proceedings of the 2024 IEEE/CIC International Conference on Communications in China (ICCC Workshops), Hangzhou, China, 7–9 August 2024.
13. Zhang, Z.; Duan, J.; Wang, G. Channel Estimation of IRS-OTFS Communication System with Meta-learning Algorithm. J. Electron. Inf. Technol. 2024, 46, 1353–1362.
14. Rasheed, O.K.; Surabhi, G.D.; Chockalingam, A. Sparse Delay-Doppler Channel Estimation in Rapidly Time-Varying Channels for Multiuser OTFS on the Uplink. In Proceedings of the IEEE 91st Vehicular Technology Conference, Antwerp, Belgium, 25–28 May 2020.
15. Tek, Y.I.; Dogukan, A.T.; Basar, E. Autoencoder-Based Enhanced Orthogonal Time Frequency Space Modulation. IEEE Commun. Lett. 2023, 27, 2628–2632.
16. Balan, S.; Joshi, S. A Study of High-Speed Railway Communication Channel Models for OTFS-Based Systems. In Proceedings of the 2024 16th International Conference on COMSNETS, Bengaluru, India, 3–7 January 2024.
17. Bi, J.; Cheng, X.; Yuan, H.; Niu, S.; Zhai, J. Resource Allocation and Trajectory Optimization in Unmanned Aerial Vehicle-assisted Mobile Edge Computing. In Proceedings of the 2024 IEEE International Conference on SMC, Kuching, Malaysia, 6–10 October 2024.
18. Chatterjee, A.; Rangamgari, V.; Tiwari, S.; Das, S.S. Nonorthogonal Multiple Access With Orthogonal Time–Frequency Space Signal Transmission. IEEE Syst. J. 2021, 15, 383–394.
19. Nguyen, D.C.; Ding, M.; Pathirana, P.N.; Seneviratne, A.; Li, J.; Niyato, D.; Dobre, O.; Poor, H.V. 6G Internet of Things: A Comprehensive Survey. IEEE Internet Things J. 2022, 9, 359–383.

Figure 1. The overall structure of the SPARSE-OTFS-Net model. Based on the sparsity of the delay-Doppler (DD) domain channel, a non-uniform random pilot layout strategy is adopted. This effectively reduces pilot contamination and enhances noise resistance, thereby achieving precise reconstruction of high-dimensional channels. To address Doppler offset and carrier frequency offset, the frequency offset in the DD domain is modeled as a sparse perturbation matrix. Frequency offset estimation is thereby transformed into a sparse signal recovery problem, and a closed-loop feedback system is constructed to achieve real-time estimation and precise compensation of frequency offsets.

Figure 2. BER comparison chart in large frequency offset, multipath, and low SNR scenario.

Figure 3. EVM comparison chart in large frequency offset, multipath, and low SNR scenario.

Figure 4. BER comparison chart in high-speed mobility scenario.

Figure 5. EVM comparison chart in high-speed mobility scenario.

Figure 6. BER comparison chart in multi-user interference scenario.

Figure 7. EVM comparison chart in multi-user interference scenario.

Table 1. Performance comparison of various algorithms under large Doppler shift, multipath effects, and low SNR conditions.

Algorithm	BER at SNR = 10 dB	EVM at SNR = 10 dB (%)	Doppler Tracking Error (Hz)	Delay Resolution (μs)	Computational Complexity
LS	1.5 × 10⁻³	17.0	±220	0.035	O(P³ + P²MN) = 1.8 × 10¹⁰
MMSE	3.5 × 10⁻⁴	7.5	±180	0.025	O(M³N³) = 4.4 × 10¹²
3D-SOMP	9.0 × 10⁻⁵	5.0	±110	0.012	O(KPMN) = 3.0 × 10¹⁰
SBL-d-LSMR	3.5 × 10⁻⁵	4.0	±85	0.009	O(KP³MN) = 8.8 × 10¹⁴
ANM	1.5 × 10⁻⁵	3.7	±40	0.001	O((MN)^3.5) = 5.6 × 10¹⁴
CNN	1.2 × 10⁻⁵	3.3	±70	0.018	O(BMNd²) = 4.3 × 10⁹
GNN	3.2 × 10⁻⁵	4.8	±55	0.008	O(kMNd²) = 3.4 × 10⁹
MAML	1.3 × 10⁻⁵	3.6	±40	0.006	O(Lkd + Ld²) = 1.2 × 10⁸
Transformer-OTFS	1.2 × 10⁻⁵	3.0	±30	0.004	O(kBMNd²) = 2.1 × 10¹³
SPARSE-OTFS-Net	6.0 × 10⁻⁶	2.0	±18	0.002	O(kMNd²) = 2.0 × 10⁸

Table 2. Performance comparison of various algorithms in high-mobility scenarios.

Algorithm	BER at SNR = 10 dB	EVM at SNR = 10 dB (%)	Doppler Tracking Error (Hz)	Delay Resolution (μs)	Computational Complexity
LS	2.0 × 10⁻⁴	9.8	±200	0.03	O(P³ + P²MN) = 1.8 × 10¹⁰
MMSE	1.0 × 10⁻⁴	5.9	±150	0.02	O(M³N³) = 4.4 × 10¹²
3D-SOMP	8.8 × 10⁻⁵	4.9	±100	0.01	O(KPMN) = 3.0 × 10¹⁰
SBL-d-LSMR	1.5 × 10⁻⁵	3.5	±80	0.008	O(KP³MN) = 8.8 × 10¹⁴
ANM	3.0 × 10⁻⁵	4.0	±30	0.001	O((MN)^3.5) = 5.6 × 10¹⁴
CNN	3.5 × 10⁻⁵	3.2	±60	0.015	O(BMNd²) = 4.3 × 10⁹
GNN	1.1 × 10⁻⁵	3.6	±50	0.006	O(kMNd²) = 3.4 × 10⁹
MAML	9.2 × 10⁻⁶	3.8	±35	0.005	O(Lkd + Ld²) = 1.2 × 10⁸
Transformer-OTFS	8.0 × 10⁻⁶	3.2	±15	0.002	O(kBMNd²) = 2.1 × 10¹³
SPARSE-OTFS-Net	5.0 × 10⁻⁶	1.5	±25	0.003	O(kMNd²) = 2.0 × 10⁸
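Several figures in the complexity column can be reproduced from the grid size M = 1024, N = 16 given in Section 4.2. The sketch below evaluates the MMSE and ANM expressions, which depend only on M and N. The values P = 1024, k = 3, and d = 64 are not stated in this section; they are hypothetical choices that happen to reproduce the LS and SPARSE-OTFS-Net entries and should be read as illustrations only.

```python
M, N = 1024, 16          # delay and Doppler dimensions (Section 4.2)
MN = M * N

mmse = MN ** 3           # O(M^3 N^3)
anm = MN ** 3.5          # O((MN)^3.5)

# ASSUMED values, chosen only because they reproduce the tabulated figures.
P, k, d = 1024, 3, 64
ls = P ** 3 + P ** 2 * MN    # O(P^3 + P^2 MN)
sparse = k * MN * d ** 2     # O(k MN d^2)

for name, ops in [("LS", ls), ("MMSE", mmse), ("ANM", anm), ("SPARSE-OTFS-Net", sparse)]:
    print(f"{name:16s} {ops:.1e}")
```

With these inputs the expressions evaluate to about 1.8 × 10¹⁰, 4.4 × 10¹², 5.6 × 10¹⁴, and 2.0 × 10⁸, matching the corresponding table entries to two significant figures.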

Table 3. Performance comparison of various algorithms in multi-user interference scenarios.

Algorithm	BER at SNR = 10 dB	EVM at SNR = 10 dB (%)	Doppler Tracking Error (Hz)	Computational Complexity
LS	2.0 × 10⁻³	17.5	±180	O(P³ + P²MN) = 1.8 × 10¹⁰
MMSE	2.0 × 10⁻⁴	16.5	±150	O(M³N³) = 4.4 × 10¹²
3D-SOMP	3.5 × 10⁻⁴	19.8	±85	O(KPMN) = 3.0 × 10¹⁰
SBL-d-LSMR	4.0 × 10⁻⁵	12.7	±40	O(KP³MN) = 8.8 × 10¹⁴
ANM	1.2 × 10⁻⁴	17.8	±30	O((MN)^3.5) = 5.6 × 10¹⁴
CNN	5.0 × 10⁻⁴	14.9	±70	O(BMNd²) = 4.3 × 10⁹
GNN	1.1 × 10⁻⁴	10.7	±55	O(kMNd²) = 3.4 × 10⁹
MAML	5.3 × 10⁻⁵	8.5	±35	O(Lkd + Ld²) = 1.2 × 10⁸
Transformer-OTFS	6.0 × 10⁻⁵	7.3	±25	O(kBMNd²) = 2.1 × 10¹³
SPARSE-OTFS-Net	1.4 × 10⁻⁵	6.5	±18	O(kMNd²) = 2.0 × 10⁸

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
