Joint Method of Moments (JMoM) and Successive Moment Cancellation (SMC) Multiuser Time Synchronization for ZP-OFDM-Based Waveforms Applicable to Joint Communication and Sensing

It has recently been shown that zero-padding (ZP) orthogonal frequency-division multiplexing (OFDM) is a promising candidate for 6G wireless systems requiring joint communication and sensing. In this paper, we consider a multiuser uplink scenario where users are separated in the power domain, i.e., non-orthogonal multiple access (NOMA), and use ZP-OFDM signals. The uplink transmission is grant-free and users are allowed to transmit asynchronously. In this setup, we address the problem of time synchronization by estimating the timing offset (TO) of all the users. We propose two non-data-aided (NDA) estimators, i.e., the joint method of moments (JMoM) and the successive moment cancellation (SMC), that exploit the periodicity of the second-order moment (SoM) of the received samples for TO estimation. Moreover, coding-assisted (CA) versions of the proposed estimators, i.e., CA-JMoM and CA-SMC, are developed for the case of a small number of observation samples. We also extend the proposed estimators to multiuser multiple-input multiple-output (MIMO) systems. The effectiveness of the proposed estimators is evaluated in terms of lock-in probability under various practical scenarios. Simulation results show that the JMoM estimator can reach a lock-in probability of one for moderate $E_b/N_0$ values. Existing NDA TO estimators in the literature either offer low lock-in probability, have a computational complexity too high for MIMO systems, or are designed for single-user scenarios; the proposed estimators address all of these issues.


Introduction
Integrating sensing with communication is an inevitable feature of the next generation of wireless networks, i.e., 6G. More specifically, integrated sensing and communication (ISAC), also called joint communication and sensing (JCAS), empowers various radar-based applications such as mobile-based medical imaging and powerful location-aware applications [1]. Integrated communication enhances the radar capability of such devices, where each device acts as a node in a distributed radar fusion [2]. On the other hand, integrated sensing could significantly improve beam alignment and therefore considerably reduce co-device interference [3]. Due to the enormous potential of JCAS-enabled systems, both academia and industry have put significant effort in recent years into developing this technology for future wireless networks.
Various waveforms for JCAS systems have been proposed and investigated. However, there is a trade-off between the communication and sensing capability of any proposed technique [4][5][6][7]. The reason is that random signals are required for communication to convey information, whereas deterministic signals are employed for sensing. For example, frequency-modulated continuous wave signals, which are used in radar applications, in combination with quadrature amplitude modulation (QAM) or frequency shift keying offer high sensing abilities with a relatively simple transceiver design but suffer from low spectral efficiency [8][9][10]. On the other hand, 5G's cyclic prefix (CP)-OFDM signals provide high spectral efficiency but exhibit relatively low sensing capabilities compared to frequency-modulated continuous wave signals [3].
Recently, ZP-OFDM-based waveforms have been shown to be promising candidates for ISAC because they offer a trade-off between spectral efficiency for communication and sensing capabilities [3]. More specifically, ZP-OFDM-based waveforms can operate in half-duplex mode, unlike CP-OFDM systems, which can perform JCAS only in full-duplex mode. This is important because practical full-duplex implementations consume a lot of energy and are costly. Moreover, self-interference in full-duplex scenarios such as in CP-OFDM systems is of great concern, as the received power of the echoed signal is orders of magnitude less than that of the transmitted signal. This is due to the fact that the echoed signal travels twice the distance to the target and its power decays with the fourth power of the distance. The problem gets worse given that practical full-duplex implementations offer limited self-interference isolation between the transmitted signal and the received signal. In contrast, ZP-OFDM can take advantage of the silent period, i.e., the guard interval, in the transmitted signal to receive the echo signals and perform sensing tasks. This eliminates the need for a costly full-duplex implementation. In addition, ZP-OFDM-based waveforms offer a higher peak power than CP-OFDM-based waveforms, which makes them more attractive for sensing.
In terms of communication capabilities and features, ZP-OFDM also offers various advantages over CP-OFDM [11], such as enabling finite impulse response equalization of the channels irrespective of channel nulls and improving the BER by guaranteeing symbol recovery regardless of the channel zeros. Moreover, ZP-OFDM makes channel tracking and estimation simpler and exhibits higher power efficiency than CP-OFDM, as it does not resend cyclic data samples [11].
Despite all these benefits, there are practical issues that need to be solved in order to make ZP-OFDM a viable solution for JCAS. One such problem is time synchronization. Given the massive number of antennas, connected devices, and subcarriers in 6G systems, the pilot transmission overhead becomes a bottleneck for both the extremely high spectral efficiency and the ultra-low latency requirements of future wireless systems [12]. Various methods have therefore been proposed to reduce or ideally remove the pilots [12][13][14]. However, reducing (or removing) the pilots makes the time synchronization task in OFDM systems significantly more challenging. Due to the lack of a CP, this becomes a crucial problem specifically for zero-padding-based OFDM waveform candidates for JCAS in 6G, e.g., ZP Dual Index Trimode OFDM-IM [15], ZP-OTFS [16,17], and RP-OTFSM [18], compared to their CP-OFDM counterpart.
Estimating the timing offset for time synchronization without the need for pilots and preambles is called NDA time synchronization. NDA or semi-NDA timing offset estimation is a promising solution for ISAC, since the spectral efficiency of the communication increases when pilots and preambles are reduced or removed. NDA timing offset estimation methods for ZP-based OFDM waveforms, such as ZP Dual Index Trimode OFDM-IM [15], ZP-OTFS [16,17], and RP-OTFSM [18], mostly rely on heuristic techniques such as the one in [19], where a transition metric (TM) is defined. Such metrics usually calculate the ratio of the power of a window of nonzero transmitted samples to the power of a window of zero samples (which are received as noise samples at the receiver). Then, the point where this ratio is maximized is taken as the timing offset. Such techniques, though simple, have a poor probability of correct estimation, called the lock-in probability. On the other hand, a mathematically heavy approach based on the maximum likelihood (ML) technique was proposed in [20]. However, the approach in [20] is highly complex, which hinders its implementation in many practical scenarios where the user has limited computational capacity, e.g., mobile users and sensors. Moreover, the method in [20] cannot be used for signal-to-noise ratios (SNRs) below 5 dB due to the accumulation of numerical errors. In an attempt to address these issues, the authors in [21] proposed two methods based on the method of moments. In such techniques, the timing offset is estimated by equating the theoretical moments of the received samples with their sample moments. The lowest possible moment order was used in [21] in order to keep the computational complexity as low as possible. The advantage of this technique, besides its low complexity compared to that of [20], is that it can be applied over all SNR ranges, especially at very low SNRs.
However, the scenario considered in [21] is single-user transmission; it does not address the case where multiple users are separated in the power domain, i.e., NOMA. NOMA is a major multiple access candidate for 6G wireless systems for providing the required data rates to the ever-increasing number of connected devices [22][23][24][25].
In this paper, we consider a multiuser uplink scenario where users are separated in the power domain, i.e., NOMA. A ZP-OFDM signal is used for data transmission. A grant-free uplink transmission scheme is considered where users transmit asynchronously. The problem of time synchronization is investigated by estimating the timing offset (TO) of all the users. We propose two NDA TO estimators, i.e., the JMoM and the SMC, that utilize the periodicity of the SoM of the received samples in order to estimate the TO. Furthermore, we develop the CA versions of the proposed estimators, called CA-JMoM and CA-SMC, for scenarios where the number of observation samples is small. The proposed estimators are then extended to MIMO systems. Finally, the performance of the proposed estimators, in terms of lock-in probability, is evaluated under different practical scenarios.

Materials and Methods
In this section, we first discuss the system model for NDA time synchronization for ZP-OFDM. Then, two NDA estimators, i.e., the JMoM and SMC estimators, are proposed.

System Model
We consider that users $u_1, u_2, \ldots, u_U$ asynchronously communicate with a single base station (BS) via ZP-OFDM through doubly selective fading channels. It is assumed that both the users and the BS can be mobile, and there are no restrictions on the relative radial velocity of the BS and the users. That is, the users can move at ultrafast speeds while communicating with the BS. Let $p_1, p_2, \ldots, p_U$ and $\tau_1, \tau_2, \ldots, \tau_U$ denote the transmit powers of the $U$ users and the TOs between the users and the BS, respectively. In this paper, we consider TO estimation at the sample level, i.e., $\tau_i = d_i$, $i = 1, 2, \ldots, U$, where $d_i \in \mathbb{N}$. The fractional part of the TO appears as a phase offset at each subcarrier. Hence, its effect is compensated when the carrier frequency offset is estimated [26]. Moreover, sample-level synchronization offers highly accurate sensing for wide-band systems. It is assumed that the BS does not have prior knowledge of the pilots and preambles inserted in the ZP-OFDM signals of the users. Hence, the BS needs to employ NDA estimators for time synchronization and channel estimation to be able to decode the ZP-OFDM symbols.
Let $\{x_{n,k}^{(i)}\}_{k=0}^{n_x-1}$, $\forall i \in \{1, 2, \ldots, U\}$, denote the $n_x$ complex-valued modulated symbols to be transmitted from user $u_i$ to the BS with average power $p_i = E\{|x_{n,k}^{(i)}|^2\} = \sigma_{x_i}^2$. The subscript $n$ denotes the $n$th OFDM symbol, and $k$ denotes the $k$th sample of an OFDM symbol. The $n$th baseband OFDM symbol of user $u_i$ is expressed as [27]

$$x_n^{(i)}(t) = \sum_{k=0}^{n_x-1} x_{n,k}^{(i)} e^{j 2\pi k t / T_x}, \quad 0 \le t \le T_x, \tag{1}$$

where $T_x$ denotes the OFDM symbol duration before zero-padding. Each OFDM symbol is then zero-padded in order to mitigate the effect of inter-symbol interference (ISI). Therefore, one can write the final zero-padded $n$th OFDM symbol as

$$s_n^{(i)}(t) = \begin{cases} x_n^{(i)}(t), & 0 \le t \le T_x, \\ 0, & T_x < t \le T_x + T_z. \end{cases} \tag{2}$$

The signal described in Equation (2) then goes through a multipath wireless channel. The baseband impulse response of the channel can be written as

$$h(t, \tau) = \sum_{l} \alpha_l(t)\, \delta(\tau - \tau_l), \tag{3}$$

where $\alpha_l(t)$ is a complex number and $\tau_l$ denotes the $l$th channel tap's delay. The effect of the transmit and receive filters is captured in $h(t, \tau)$. It should be noted that $t$ in Equation (3) captures the time selectivity of the channel, i.e., a time-varying channel. Moreover, different $\tau_l$ values express the frequency selectivity of the channel, i.e., a frequency-varying channel. Hence, the channel is considered to be doubly selective.
Let $\tau_d$ denote the delay spread of the multipath channel, where $E\{|\alpha_l(t)|^2\} = 0$ for $\tau_l > \tau_d$. The length of the zero-padding guard interval is then chosen such that $T_z \ge \tau_d$. The sampling time for the received signal at the BS is assumed to be $T_{sa} = T_x/n_x$. With the assumption of perfect time synchronization, the complex received baseband signal of user $u_i$ at the BS can be written as where $n_h \triangleq \lfloor \tau_d/T_{sa} \rfloor$ denotes the number of channel taps, and $\lfloor \cdot \rfloor$ denotes the floor function. Wide-sense stationary uncorrelated scattering (WSSUS) is assumed for the channel model, and the channel taps are assumed to be independent random variables. We assume the channel coefficients follow a zero-mean complex Gaussian distribution, i.e., Rayleigh fading is considered, and the power delay profile (PDP) of the channel is modeled as where and $R[k_1 - k_2]$ is an arbitrary function with $R[0] = 1$. As the relative speed of the transmitter and the receiver increases, In this paper, we consider ultrafast speeds; thus, the PDP of the channel is modeled as We assume that the PDP of the channel is known a priori at the BS and has already been estimated during channel sounding, where a known signal is transmitted and the power of the received signal at the receiver is measured and then averaged.
Let $n_s \triangleq n_x + n_z$ denote the number of samples per ZP-OFDM symbol, where $n_z$ denotes the number of zeros padded to each OFDM symbol and is given by $n_z \triangleq T_z/T_{sa}$.
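The framing described above can be illustrated with a short NumPy sketch that builds ZP-OFDM symbols of length $n_s = n_x + n_z$. The function name `zp_ofdm_symbols`, the unitary IFFT scaling, and the complex Gaussian stand-in for the modulated symbols are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def zp_ofdm_symbols(data, n_z):
    """Map (n_sym, n_x) subcarrier symbols to (n_sym, n_x + n_z) time samples.

    Assumption: a unitary IFFT is used, so the total energy of the data part
    equals the total energy of the subcarrier symbols.
    """
    n_sym, n_x = data.shape
    time = np.fft.ifft(data, axis=1) * np.sqrt(n_x)   # OFDM modulation
    guard = np.zeros((n_sym, n_z), dtype=complex)     # zero-padding guard interval
    return np.hstack([time, guard])                   # n_s = n_x + n_z samples per symbol

# Usage with the default simulation values n_x = 128 and n_z = 15.
rng = np.random.default_rng(0)
data = (rng.standard_normal((4, 128)) + 1j * rng.standard_normal((4, 128))) / np.sqrt(2)
symbols = zp_ofdm_symbols(data, n_z=15)
```

Each row of `symbols` is one zero-padded symbol; the last `n_z` entries of every row are exactly zero, which is the silent period that the estimators in the following subsections rely on.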
In the absence of TO, by using (4) and considering the additive white Gaussian noise (AWGN) at the receiver, we can write the received signal at the BS in vector form as where and $\mathbf{I}_m$ and $\sigma_w^2$ denote the $m \times m$ identity matrix and the variance of the noise, respectively. The convolution of the signal and the channel taps is expressed via the multiplication of the signal, $\mathbf{s}_n^{(i)}$, and an $n_s \times n_s$ Toeplitz channel matrix.
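As a small illustration of why a Toeplitz matrix can stand in for the convolution, the sketch below builds a lower-triangular $n_s \times n_s$ Toeplitz matrix from a tap vector and compares the product with a direct linear convolution. It assumes, for simplicity, that the taps stay constant over one symbol, whereas the doubly selective channel of the paper would use different taps in every row. The helper name `toeplitz_channel_matrix` and the tap values are hypothetical.

```python
import numpy as np

def toeplitz_channel_matrix(h, n_s):
    """Lower-triangular n_s x n_s Toeplitz matrix with H[r, c] = h[r - c].

    Assumption: the taps are constant over one symbol (time-invariant sketch).
    """
    H = np.zeros((n_s, n_s), dtype=complex)
    for l, tap in enumerate(h):
        H += tap * np.eye(n_s, k=-l)   # tap l sits on the l-th sub-diagonal
    return H

n_x, n_z = 8, 3
h = np.array([0.8, 0.5 - 0.2j, 0.1j])                  # n_h = 3 taps, so n_z >= n_h - 1
rng = np.random.default_rng(1)
s = np.concatenate([rng.standard_normal(n_x) + 1j * rng.standard_normal(n_x),
                    np.zeros(n_z)])                     # one zero-padded symbol
v = toeplitz_channel_matrix(h, n_x + n_z) @ s          # matrix form of the convolution
full = np.convolve(h, s)                               # linear convolution, n_s + n_h - 1 samples
```

Here `v` matches the first $n_s$ samples of `full`, and the remaining samples of `full` are zero because the guard interval absorbs the channel tail, so no energy leaks into the next symbol.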

JMoM TO Estimator
Let us write the vector of the received samples as where $M$ denotes the total number of received samples and is considered to be a multiple of $n_s$ for simplicity of notation (arbitrary values can be considered for $M$). Now, by using (10), we can write the received vector of length $M$ as where where with $\mathbf{0}_{d_i}$ as the vector of all zeros of length $d_i \in \{0, 1, \ldots, d_{\max}\}$, and for $-d_{\max} \le d_i < 0$. We allow the TO $d_i$ to take both negative and positive values. A positive value of $d_i$ means that the BS starts receiving before the signal of user $u_i$ arrives. On the other hand, a negative TO value means that the BS has missed the first $|d_i|$ samples of user $u_i$.
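We assume that the theoretical SoM of $v_n^{(i)}[j]$, $i = 1, 2, \ldots, U$, $j = 0, 1, \ldots, n_s - 1$, and $n \ge 0$, in (18) and (19) given hypotheses $H_{d_i \ge 0}$ and $H_{d_i < 0}$ is given by Theorem 1 in [21]. By using the vector of theoretical SoMs [21] and the squared absolute values (SAV) of the received samples, we can express the problem of multiuser TO estimation for users $u_1, u_2, \ldots, u_U$ as

$$(\hat{d}_1, \ldots, \hat{d}_U) = \arg\min_{d_1, \ldots, d_U} \Big\| \mathbf{P}_y - \sum_{i=1}^{U} \boldsymbol{\sigma}^2_{v^{(i)}|H_{d_i}} - \sigma_w^2 \mathbf{1}_M \Big\|,$$

where $\|\cdot\|$ and $\mathbf{1}_M$ denote the $\ell_2$ norm and a vector of ones with length $M$, respectively, $\hat{d}_1, \ldots, \hat{d}_U$ denote the estimated TOs of the users $u_1, u_2, \ldots, u_U$, and

$$\mathbf{P}_y \triangleq \big[\, |y[0]|^2, |y[1]|^2, \ldots, |y[M-1]|^2 \,\big]^T, \tag{22}$$

$$\boldsymbol{\sigma}^2_{v^{(i)}|H_{d_i}} \triangleq I_{R^+}(d_i)\, \boldsymbol{\sigma}^2_{v^{(i)}|H_{d_i \ge 0}} + I_{R^-}(d_i)\, \boldsymbol{\sigma}^2_{v^{(i)}|H_{d_i < 0}}, \tag{23}$$

where $I_{R^+}(d_i)$ denotes the indicator function and $R^+$ and $R^-$ denote the ranges $[0, d_{\max}]$ and $[-d_{\max}, 0)$, respectively.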
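The joint moment-matching search can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the theoretical SoM of Theorem 1 in [21] is not reproduced here, so the helper `som_sequence` uses a simplified stand-in (i.i.d. data samples, independent taps, and one PDP shared by all users), and the names `som_sequence`, `jmom`, and all numeric values are hypothetical.

```python
import itertools
import numpy as np

def som_sequence(power, pdp, n_x, n_z, d, M):
    """Assumed theoretical SoM of M noise-free samples of one user at offset d."""
    n_s = n_x + n_z
    tail = np.convolve(np.ones(n_x), pdp)    # per-sample sum of the active tap powers
    base = np.zeros(n_s)
    base[:tail.size] = power * tail          # one period; requires n_z >= n_h - 1
    idx = np.arange(M) - d                   # index relative to the user's first sample
    som = base[idx % n_s]                    # periodic with period n_s
    som[idx < 0] = 0.0                       # d > 0: the user's signal has not arrived yet
    return som

def jmom(y, powers, pdp, n_x, n_z, noise_var, d_max):
    """Exhaustive joint search over all TO combinations in [-d_max, d_max]^U."""
    p_y = np.abs(y) ** 2                     # SAV of the received samples
    cands = range(-d_max, d_max + 1)
    tables = [{d: som_sequence(p, pdp, n_x, n_z, d, y.size) for d in cands}
              for p in powers]
    best, best_cost = None, np.inf
    for combo in itertools.product(cands, repeat=len(powers)):
        model = noise_var + sum(t[d] for t, d in zip(tables, combo))
        cost = np.linalg.norm(p_y - model)   # distance between sample and model moments
        if cost < best_cost:
            best, best_cost = combo, cost
    return best

# Idealized check: feed samples whose SAV equals the model for offsets (3, -5).
pdp = np.array([0.6, 0.3, 0.1])
powers = [0.7, 0.3]
ideal = 0.1 + som_sequence(0.7, pdp, 16, 4, 3, 400) + som_sequence(0.3, pdp, 16, 4, -5, 400)
d_hat = jmom(np.sqrt(ideal), powers, pdp, n_x=16, n_z=4, noise_var=0.1, d_max=8)
```

The loop visits $(2 d_{\max} + 1)^U$ combinations, each costing $O(M)$ operations, which is why the joint search becomes expensive as the number of users grows.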

Successive Moment Cancellation (SMC) TO Estimator
The proposed JMoM multiuser TO estimator suffers from high computational complexity as the number of users increases. In this subsection, we propose the low-complexity SMC multiuser TO estimator inspired by the successive interference cancellation (SIC) algorithm [28]. The main idea behind the proposed SMC is to first estimate the TO of the user with the largest average theoretical SoM, i.e., $\|\boldsymbol{\sigma}^2_{v^{(i)}|H_0}\|/M$, by using the method of moments (MoM). Then, by subtracting the theoretical conditional SoMs of the first user and the vector of noise variance from the SAV of the received samples, the TO of the next user with the largest average theoretical SoM is estimated. This procedure, without subtracting the vector of noise variance again, continues until the TO of the last user is estimated. While a difference in the powers of the users is a necessary condition in power-domain NOMA for communications, NOMA synchronization is feasible for users with equal transmit power, because the only unknown for synchronization is the shift of an a priori known SoM sequence. The proposed SMC multiuser TO estimator is summarized in Algorithm 1. In Algorithm 1, $\delta[\cdot]$ is the Kronecker delta function.

Algorithm 1 SMC
...
10: if $L < L_{\min}$ then
11: $L_{\min} \leftarrow L$
...
14: return $\hat{\mathbf{d}}$
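The successive procedure described above can be sketched as follows. This is an illustrative reading of the prose, not the paper's Algorithm 1: the helper `som_sequence` is a simplified stand-in for the theoretical SoM of [21], a common PDP is assumed so that ordering users by power matches ordering by average SoM, and all names and values are hypothetical.

```python
import numpy as np

def som_sequence(power, pdp, n_x, n_z, d, M):
    """Assumed theoretical SoM of M noise-free samples of one user at offset d."""
    n_s = n_x + n_z
    tail = np.convolve(np.ones(n_x), pdp)
    base = np.zeros(n_s)
    base[:tail.size] = power * tail          # one period; requires n_z >= n_h - 1
    idx = np.arange(M) - d
    som = base[idx % n_s]
    som[idx < 0] = 0.0
    return som

def smc(y, powers, pdp, n_x, n_z, noise_var, d_max):
    """Estimate one TO at a time, cancelling each user's SoM from the residual."""
    residual = np.abs(y) ** 2 - noise_var            # noise variance is removed only once
    d_hat = [0] * len(powers)
    for i in map(int, np.argsort(powers)[::-1]):     # strongest average SoM first
        best_cost = np.inf
        for d in range(-d_max, d_max + 1):
            model = som_sequence(powers[i], pdp, n_x, n_z, d, y.size)
            cost = np.linalg.norm(residual - model)
            if cost < best_cost:                     # plays the role of "if L < L_min"
                best_cost, d_hat[i] = cost, d
        # Successive moment cancellation: remove the estimated user's SoM.
        residual = residual - som_sequence(powers[i], pdp, n_x, n_z, d_hat[i], y.size)
    return tuple(d_hat)

# Idealized check with the SAV set equal to the model for offsets (3, -5).
pdp = np.array([0.6, 0.3, 0.1])
powers = [0.7, 0.3]
ideal = 0.1 + som_sequence(0.7, pdp, 16, 4, 3, 400) + som_sequence(0.3, pdp, 16, 4, -5, 400)
d_hat = smc(np.sqrt(ideal), powers, pdp, n_x=16, n_z=4, noise_var=0.1, d_max=8)
```

Each user requires only $2 d_{\max} + 1$ cost evaluations, so the search grows linearly rather than exponentially with the number of users, at the price of treating the not-yet-cancelled users as a mismatch in the early stages.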

Coding-Assisted (CA) Estimator for Fast TO Estimation
In some highly resource-limited scenarios, the proposed JMoM and SMC multiuser TO estimators may fall short of achieving very high lock-in probabilities. One such scenario is when the users not only use a single antenna for data transmission but also have very limited memory. In such a scenario, a sufficient number of OFDM symbols cannot be loaded into memory and used for TO estimation to achieve the desired accuracy. Hence, the performance of the proposed JMoM and SMC estimators degrades. In order to address such scenarios, we propose the idea of a CA TO estimator that can be employed in combination with the JMoM and SMC algorithms for performance improvement.
Various performance metrics are used for evaluating different TO estimators. The mean-squared error (MSE) and the lock-in probability are the two main metrics used in the literature. The lock-in probability, i.e., $P_l$, is the strictest measure, as an estimate that is off by only one sample before or after the actual TO value counts as an error. On the other hand, the MSE is the most lenient measure: while it gives an overview of the overall performance, it hides important information such as how many times the estimator correctly estimated the actual TO value or exactly how far the estimated TO values are from the actual value. For example, an estimator that always estimates the TO wrongly could have a lower MSE than an estimator that is correct some of the time but whose wrong estimates lie far from the actual TO value. Hence, we define a new metric where an error of up to $n$ samples, i.e., $|\hat{d}_i - d_i| \le n$ for $i = 1, \ldots, U$, is considered as the lock-in region (LR). The probability of correct multiuser time synchronization in the LR is given by

$$P_{lr}(n) = P\Big( \bigcap_{i=1}^{U} \big\{ |\hat{d}_i - d_i| \le n \big\} \Big),$$

where lr stands for lock-in region, $\cap$ is the intersection operator, and $P(|\hat{d}_i - d_i| \le n)$ is the probability of correct synchronization for user $u_i$ in its lock-in region. It is obvious that $P_l = P_{lr}(0)$. In the CA-JMoM and the CA-SMC multiuser TO estimators, a forward error correction (FEC) check, e.g., via a low-density parity-check (LDPC) code or a cyclic redundancy check (CRC) [29], is conducted for the $n^U$ TO combinations in the lock-in region, and the estimated TO vector is the one that passes the parity check and achieves the lowest TO MSE. The proposed CA-SMC multiuser TO estimator is summarized in Algorithm 2.
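The lock-in region metric can be estimated from Monte Carlo trials with a few lines of Python. The function name `lock_in_region_probability` and the toy offsets below are hypothetical and serve only to illustrate the definition.

```python
import numpy as np

def lock_in_region_probability(d_hat, d_true, n=0):
    """Fraction of trials in which every user's TO error is at most n samples."""
    err = np.abs(np.asarray(d_hat) - np.asarray(d_true))   # shape (trials, users)
    return float(np.mean(np.all(err <= n, axis=1)))        # intersection over users

# Four toy trials with two users each.
d_true = [[0, 0], [1, 2], [3, -4], [5, 5]]
d_est = [[0, 0], [1, 3], [2, -4], [9, 5]]
p_lock = lock_in_region_probability(d_est, d_true, n=0)    # strict lock-in probability
p_lr1 = lock_in_region_probability(d_est, d_true, n=1)     # one-sample lock-in region
```

In this toy set only the first trial is exact for both users, while three of the four trials fall within one sample, which shows how widening the region relaxes the strict lock-in criterion without collapsing to an average error as the MSE does.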

Algorithm 2 CA-SMC
Step 1:
1: ... (23) and $\mathbf{P}_y$ in (22)
2: Preprocessing: Sort users and their $\boldsymbol{\sigma}^2$ ...
...
8: for $d := -d_{\max} : d_{\max}$ do
9: ...
10: if $L < L_{\min}$ then
11: $L_{\min} \leftarrow L$
...
Step 2:
14: Derive $\mathrm{LR} = \{\mathbf{d} \in \mathbb{Z}^U \mid |d_i - \hat{d}_i| \le n, \ \forall i \in \{1, \ldots, U\}\}$ based on $n$ and $\hat{\mathbf{d}}$
15: Derive $\mathrm{LR}_{\mathrm{selected}} = \{\mathbf{d} \in \mathrm{LR} \mid \text{pass parity check}\}$
16: ...

Table 1 presents the computational complexity of the proposed multiuser TO estimators. As seen, the proposed SMC estimator offers significantly lower computational complexity than the JMoM estimator at the expense of performance degradation in lock-in probability and MSE. It should be mentioned that the computational complexity of the JMoM and CA-JMoM can be reduced by employing dynamic programming methods, such as the Viterbi algorithm.
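Step 2 of Algorithm 2 can be sketched as a small search over the lock-in region. In this sketch the callables `passes_parity_check` and `cost` are hypothetical placeholders for the decoder check and the moment-matching cost, and falling back to the NDA estimate when no candidate passes is an assumption, since the final line of the algorithm is not available here.

```python
import itertools

def lock_in_region(d_hat, n):
    """All integer TO vectors d with |d_i - d_hat_i| <= n for every user."""
    offsets = range(-n, n + 1)
    return [tuple(d + o for d, o in zip(d_hat, offs))
            for offs in itertools.product(offsets, repeat=len(d_hat))]

def coding_assisted_select(d_hat, n, passes_parity_check, cost):
    """Keep the candidates that pass the parity check and return the cheapest one."""
    selected = [d for d in lock_in_region(d_hat, n) if passes_parity_check(d)]
    # Assumed fallback: keep the NDA estimate if no candidate passes the check.
    return min(selected, key=cost) if selected else tuple(d_hat)

# Toy usage: a stand-in "perfect parity check" that only the true offsets pass.
true_d = (3, -5)
d_nda = (4, -5)                                   # NDA estimate, one sample off for user 1
refined = coding_assisted_select(d_nda, 1, lambda d: d == true_d, cost=lambda d: 0.0)
```

Enumerating the region as defined in line 14 yields $(2n + 1)^U$ candidates in this sketch, so the refinement stays cheap as long as the NDA estimate is already close to the true offsets.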

Extension to Multiple Antennas
Let us now assume that the BS is equipped with $m_r$ receive antennas while the users have a single antenna. With the assumption of $U$ independent single-input multiple-output (SIMO) channels, and independent and identically distributed (i.i.d.) fading between the user antenna and the BS antennas, the JMoM multiuser TO estimation is formulated as where $\mathbf{P}_{y_m}$ denotes the SAV of the received samples at the $m$th receive antenna, and is the sequence of the theoretical SoMs given hypothesis $H_{d_i}$, where the vector is given in (23). Similarly, for the case of multiuser MIMO, we can write where $m_{t_i}$ is the number of antennas at user $u_i$. We can easily show that the computational complexity of the proposed JMoM TO estimator for the case of SIMO and MIMO is $O(M d_{\max}^{U} m_r)$.
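One plausible reading of the SIMO extension is sketched below. It assumes that the per-antenna moment-matching costs are simply added and that, because of the i.i.d. fading, every antenna shares the same theoretical SoM; both points, the simplified `som_sequence` helper, and all names and values are assumptions rather than the paper's formulation.

```python
import itertools
import numpy as np

def som_sequence(power, pdp, n_x, n_z, d, M):
    """Assumed theoretical SoM of M noise-free samples of one user at offset d."""
    n_s = n_x + n_z
    tail = np.convolve(np.ones(n_x), pdp)
    base = np.zeros(n_s)
    base[:tail.size] = power * tail          # one period; requires n_z >= n_h - 1
    idx = np.arange(M) - d
    som = base[idx % n_s]
    som[idx < 0] = 0.0
    return som

def jmom_simo(Y, powers, pdp, n_x, n_z, noise_var, d_max):
    """Y has shape (m_r, M): one row of received samples per BS antenna."""
    P = np.abs(Y) ** 2                       # SAV per receive antenna
    cands = range(-d_max, d_max + 1)
    tables = [{d: som_sequence(p, pdp, n_x, n_z, d, Y.shape[1]) for d in cands}
              for p in powers]
    best, best_cost = None, np.inf
    for combo in itertools.product(cands, repeat=len(powers)):
        model = noise_var + sum(t[d] for t, d in zip(tables, combo))
        cost = np.linalg.norm(P - model, axis=1).sum()   # assumed: antenna costs are added
        if cost < best_cost:
            best, best_cost = combo, cost
    return best

# Idealized check with two antennas observing the same model SAV.
pdp = np.array([0.6, 0.3, 0.1])
powers = [0.7, 0.3]
ideal = 0.1 + som_sequence(0.7, pdp, 16, 4, 3, 400) + som_sequence(0.3, pdp, 16, 4, -5, 400)
Y = np.tile(np.sqrt(ideal), (2, 1))
d_hat = jmom_simo(Y, powers, pdp, n_x=16, n_z=4, noise_var=0.1, d_max=8)
```

Every candidate combination now costs $O(M m_r)$ operations, which is consistent with the stated overall complexity of $O(M d_{\max}^{U} m_r)$.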

Simulations
In this section we first describe the default simulation setup parameters and then investigate the effect of each parameter on the performance of the proposed multiuser TO estimators by changing only one parameter at a time in each experiment.

Default Simulation Setup
Unless otherwise mentioned, the following simulation setup and parameters are used. A ZP-OFDM system with 128-QAM modulation is considered. The channel is set up as a doubly selective Rayleigh multipath fading channel. The number of channel taps is $n_h = 10$ and the channel taps are assumed to be uncorrelated in the delay domain. The maximum delay spread of the channel is set to $\tau_{\max} = 10$ µs. An exponential-decay function, i.e., The sampling time interval of the system at the receiver is set to $T_{sa} = 10^{-6}$ s. In order to avoid ISI, the number of zero samples padded to each user's signal is $n_z = 15$. The number of data subcarriers, $n_x$, for both users is considered to be 128. The total data transmission power, i.e., the sum of the powers of the first and the second user, is set to $\sigma_x^2 = \sigma_{x_1}^2 + \sigma_{x_2}^2 = 1$. The ratio of the user powers is defined as $c = \sigma_{x_1}^2/\sigma_{x_2}^2$. Unless otherwise mentioned, we set $c = 1$. Moreover, a total of 200 OFDM symbols are used for estimating the TO at the receiver. The noise is considered to be AWGN and is set to be a zero-mean complex Gaussian random variable with variance $\sigma_w^2$. The variance of the noise is determined based on the SNR value, i.e., $E_b/N_0 = \sigma_x^2 p_h \log_2(M)/\sigma_w^2$ for 128-QAM modulation. The default SNR value, unless otherwise mentioned, is 5 dB. The TOs of the users are modeled as independent. Each user's TO is an integer random variable that follows a discrete uniform distribution within the range $d_1, d_2 \in [-30, 30]$. The performance of the proposed estimators is evaluated via $10^4$ Monte Carlo realizations for each scenario. For the CA versions of the proposed estimators, a perfect parity check is considered.

Simulation Results
The lock-in probability of the proposed multiuser TO estimators for different values of $E_b/N_0$ is depicted in Figure 1. We have also shown the performance of the extended successive cancellation version of the TM estimator in [19], which we call TM-SC. As seen, the proposed estimators significantly outperform the TM-SC estimator. The main reason is that the original TM estimator relies heavily on noise-only samples, which often do not exist in multiuser scenarios. Since the performance of the TM-SC is poor, we do not report it in the following figures. As $E_b/N_0$ increases, the lock-in probability improves for the JMoM, the CA-JMoM, and the SMC estimators. More specifically, for the CA-JMoM estimator, increasing $E_b/N_0$ from −10 to −5 dB increases the lock-in probability by more than 10%. For the JMoM and the SMC algorithms, increasing $E_b/N_0$ from −10 to 5 dB increases the lock-in probability by about 60%. In order to explain this, without loss of generality, let us assume that $d_2 \ge d_1$. When the signal of the second user arrives, if $E_b/N_0$ is large enough, the jump in the variance of the samples thereafter is large; hence, it is easier to distinguish the TOs than when $E_b/N_0$ is small. The lock-in probability of the CA-SMC algorithm remains relatively constant. Moreover, the CA versions of the proposed estimators outperform their original counterparts. Given the very high lock-in probability of the CA-JMoM, one can decrease the number of OFDM samples used for estimation when the memory is limited.

The effect of the number of channel taps on the performance of the JMoM and the CA-JMoM estimators at 5 dB $E_b/N_0$ is depicted in Figure 2. As seen, the performance of the JMoM degrades as the number of channel taps increases. This is due to the fact that the number of noise-only samples decreases. Such samples play an important role in TO estimation.
On the other hand, disregarding data samples and using only noise-only samples for TO estimation results in poor performance. We also observe that the proposed CA-JMoM estimator can achieve a significantly high lock-in probability.

Figure 6 shows the PMF of the estimation error for the SMC estimator at 5 dB $E_b/N_0$. As seen, unlike for an unbiased estimator, the PMF of the estimation error for the JMoM and SMC estimators is not symmetric around (0, 0). However, the PMF becomes more symmetric as the number of observed OFDM symbols or $E_b/N_0$ increases. This can be seen in Figures 4 and 5.
Since the error is mostly concentrated around the actual TO value, i.e., an error equal to (0, 0), for the JMoM (and to a lesser extent for the SMC), the CA-JMoM offers a significantly high lock-in probability.

Let us now study the effect of the PDP estimation error on the performance of the JMoM and the CA-JMoM TO estimators. As mentioned earlier, the PDP of the channel can be obtained through channel sounding prior to data transmission. However, the PDP estimate is subject to error. Let us model the estimation error of the $k$th channel tap as

The effect of the users' power ratio, i.e., $c = \sigma_{x_1}^2/\sigma_{x_2}^2$, on the performance of the JMoM and the CA-JMoM estimators is shown in Figure 8. In contrast to SIC [28], the highest lock-in probability is achieved when the power is distributed equally between the users, i.e., $c = 1$. The reason is that the signals in this situation are most distinguishable in terms of variance. As the gap between the users' powers increases, the signal with the lower power hides within the signal with the higher power, and hence, the performance degrades.

Figure 9 shows the performance of the JMoM and the CA-JMoM estimators for two users equipped with multiple transmit antennas. The BS also employs multiple receive antennas. The numbers of OFDM symbols used for estimation in the 2 × 2, 2 × 4, and 4 × 2 multiuser MIMO scenarios are 80, 60, and 60, respectively. As seen, the lock-in probability can be improved by employing multiple receive antennas. A higher number of antennas at the receiver achieves higher estimation accuracy because of spatial diversity. On the other hand, a larger number of antennas at the transmitter results in self-interference and thus performance degradation. The higher estimation accuracy with multiple receive antennas allows a lower number of observation samples. We also observe that the gap between the JMoM and the CA-JMoM estimators decreases in the case of multiuser MIMO.

Conclusions
The problem of time synchronization in a multiuser uplink NOMA system where users employ ZP-OFDM signals was investigated. We proposed two low-complexity NDA estimators, i.e., the JMoM and the SMC, for estimating the TO of the users. Moreover, the coding-assisted versions of the proposed estimators, i.e., the CA-JMoM and the CA-SMC, were developed for the case of a small number of observation symbols. We also extended the proposed estimators to the multiuser MIMO scenario. Existing NDA estimators [19,20] either have low lock-in probability, have a computational complexity too high for MIMO systems, or are designed for single-user scenarios. The proposed estimators in this paper address all of these issues. The lock-in probability of the proposed estimators was evaluated under various practical scenarios. Simulation results showed that the JMoM estimator offers high lock-in probability, and the CA-JMoM estimator can reach a lock-in probability of one. Also, the highest lock-in probability for the JMoM and the CA-JMoM estimators is achieved when the power is distributed equally between the users. Future work will further reduce the complexity of the JMoM and the CA-JMoM via dynamic programming techniques, such as the Viterbi algorithm.