A Polynomial Inversion-Based Fast Time-Delay Estimation Method for Wideband Large-Scale Antenna Systems

: This paper proposes a new fast time-delay estimation (TDE) method based on polynomial inversion which addresses the challenges arising from the requirements for high-precision, low-computational-complexity synchronization error estimation in wideband large-scale antenna systems (LSASs). In this work, we use the convex parabolic extreme point equation as the timing error detector (TED), and develop a polynomial inversion-based function to characterize the one-to-one mapping relationship between true time delay (TTD) and TED estimates using the least square (LS) method, then obtain the time-delay difference with a high accuracy and high computational efﬁciency. The results of the performance analysis indicate that the Mean Square Error (MSE) of the proposed algorithm is less than 1 dB away from the Cramer–Rao lower bound (CRLB), which is produced in this paper, while adding a few multipliers compared to the convex parabolic interpolation method. Finally, a further example illustrates that the proposed algorithm can achieve a synchronization error of less than 5 ps between channels based on the NI PXI broadband multichannel acquisition platform.


Introduction
Estimating the delay error between the channels of a sensor array system to realize the angle-of-arrival estimation and beam tracking of the incoming signal is an important task in modern signal processing.This estimation is used extensively in acoustic imaging [1,2], channel sounding [3,4], wireless sensor networks [5], and satellite navigation and communication [6][7][8][9].Efficient and high-precision TDE is critical to prevent the beam 'squint' phenomena of large-scale and ultra-wideband-array antennas Figure 1a, which is an essential component of broadband communication systems such as LEO satellite communication and THz communication systems.By high-precision time-delay estimation and difference compensation (Variable Fractional Delay Filter), the array can operate over a large signal bandwidth without beam squint.Moreover, it is key to solving the problem frequency selective fading in broadband large-scale antenna systems.To obtain a flat pattern over the desired frequency range, the antenna's coherent bandwidth should be greater than the bandwidth of the signal.According to this theory, the system must have the ability to perform delay compensation to coherently combine signals, which is an intrinsic property of TTD beamforming, as shown in Figure 1b.Thus, high-performance delay estimators are extensively highlighted in future ultra-wideband communication system scenarios.In recent years, the main algorithms used in TDE are the cross-correlation method [1,2,11], generalized phase delay estimation [12], adaptive delay estimation [13][14][15], wavelet estimation [16], and compressive sensing [17,18].Generalized phase delay estimation captures a set of frequency points and the associated phase information in the band, then estimates the time delay based on their linear relationship.However, this algorithm is only suitable for the case of noise-free or uncorrelated Gaussian noise interference.The adaptive time-delay estimation algorithm overcomes the deficiencies of the generalized phase delay estimation algorithm, which requires a priori knowledge of the signal and noise; however, in order to achieve unbiased estimation, the algorithm needs to increase the order of the fractional time-delay FIR filter, leading to an increase in the computational complexity, while the improvement is not significant under low signal-to-noise ratio (SNR) conditions.Liu [16] combines wavelet transform with the time-delay estimation approach to obtain more accurate time-delay information; yet, the algorithm requires complex calculations and its response time is limited.In an attempt to reduce the burden of data transmission bandwidth, scientists such as Gedalyahu [17] have proposed to carry out time-delay error estimation based on compressed sensing ideas.Additionally, Gu [18] gave the CRLB performance bound, but the algorithm was only effective for presenting sparsity after changes.Compared with the mentioned algorithms, the cross-correlation algorithm (CC) is currently the most widely used.The CC algorithm can be divided into Basic Cross-Correlation (BCC) and Generalized Cross-Correlation (GCC) based on the presence or absence of frequency-domain whitening filters.The BCC algorithm, by calculating the cross-correlation value R(τ) of the two signals, solves max τ R(τ) to yield τ, which is the relative time-delay estimate of the two signals.The GCC algorithm first performs whitening filtering before calculating the cross-correlation function to eliminate the correlation between channels, thereby reducing the width of the main lobe of the correlation function.It is mostly used in fields such as microphone arrays and ultrasonic positioning.However, the time-delay estimation accuracy of GCC in the wideband sampling system is decreased severely because the extremely sharp peak cannot be used to improve the accuracy of the estimated value by interpolation theory.For this situation, the sampling accuracy can be improved by zero-padding in the frequency domain, but the computational workload will be multiplied.
Based on the aforementioned points, in this paper we focus on the optimization problem of the BCC algorithm-namely, max τ R(τ)-in the context of the research on phased-array antenna technology for satellite terminals.Giunta [11] applies the parabolic interpolation method to the peak calculation of R(τ) to improve the accuracy of the timedelay estimation, but there is a misfit phenomenon.Therefore, Hui [15] uses the interpolation theory to reconstruct R(τ) so as to obtain a performance close to that of CRLB.However, the algorithm has two intrinsic limitations, the computational complexity and the very small grid size, which are key to improving the estimation precision.To cope with these limitations, Wen [19] proposes a new two-step method to obtain coarse time-delay estimates based on [20] first, then obtains a fine solution via parabolic extremization and triangular interpolation.However, the algorithm requires matrix inversion, which makes it insufficient for large-scale sensor arrays.Li [21] designed an estimation method based on the envelope differences of correlation functions (EDCFs) while maintaining a high precision and lower computational complexity; however, the algorithm has a limited range of time-delay estimation.
In summary, this paper proposes a polynomial inversion-based estimation algorithm that takes into account the accuracy and algorithm complexity while extending the range of time-delay estimation.The main contributions of this paper include: (1) Taking the convex parabolic extremum equation determined by the three sets of crosscorrelation values as TED and feeding the result of TED into the polynomial inversion equation, an accurate fractional delay estimate with the desired performance and a lower computational complexity can be obtained.(2) The CRLB is produced to adapt to more scenarios.The bias, variance, and MSE equations of the new algorithm are also derived.Combined with the CRLB and existing approaches, the simulations show that the proposed time-delay estimator is asymptotically unbiased, with its MSE performing on the CRLB.(3) Applying the algorithm to the RF multichannel broadband acquisition system, the synchronization error between the channels is less than 5 ps.
The rest of the paper is organized as follows.In Section 2, we introduce the principle of fast delay estimation based on BCC.Section 3 describes the principle of polynomial inversion-based time-delay estimation.To evaluate the performance of the proposed algorithm, expressions for bias, variance, and MSE are derived, and the performance curves of the algorithm are depicted in conjunction with the produced CRLB in Sections 4 and 5.A design example is given in Section 6.Finally, we draw conclusions in Section 7.
Notation: We use τ, τ, and τ to represent the true time delay, the output of TED, and the optimized estimate of the polynomial, respectively.The subscripts () r and () i denote the imaginary and real parts of a complex variable.The symbol * means conjugate, (•) returns the real part of a complex number, and E stands for the mathematical expectation.R x (•) represents the autocorrelation function of signal x, while R x 1 x 2 (•) denotes the correlation function of x 1 and x 2 .The Rx 1 x 2 (•) stands for the estimate of R x 1 x 2 (•).

Introduction of Fast Delay Estimation Based on BCC 2.1. Signal Model
Assume that the two broadband signals obtained by the acquisition system are expressed as follows: where T denotes the sampling period; s(kT) represents the complex signal with a frequency bandwidth of [−B/2, B/2] Hz and power σ 2 s ; and n 1 (kT), n 2 (kT) represent two independent complex Gaussian noises with zero mean and a variance of σ 2 n that are independent of s(kT).Considering x 1 (kT) as the reference channel signal, I 0 and τ 0 ∈ [−0.5, +0.5] denote the integral period and fractional delay relative to the reference signal.A indicates the channel gain.Suppose that s(kT) is a zero-mean stationary stochastic process consisting of a real part s r (kT) and an imaginary part s i (kT).Its autocorrelation functions (ACFs) can be expressed as R s (τ) = E{ (s * (kT)s(kT + τT))}.Let {n r1 , n r2 } and {n i1 , n i2 } denote the real parts and imaginary parts of {n 1 , n 2 }, respectively.According to the properties of Gaussian noise, we can obtain The time delay IT + τT is estimated according to the correlation function algorithm as follows: where The estimates are obtained by solving Equation (3) for the ideal case Î = I 0 , τ = τ 0 .

Fast Estimators of Time-Delay
The estimation of the integral period is accurate in the general case; without any loss of generality, let I 0 = 0 .To quickly solve arg max τ R x 1 x 2 (τ), suppose that we obtain three sets of data (−θ, Rx 1 x 2 (−θ)), (0, Rx 1 x 2 (0)) and (θ, Rx 1 x 2 (θ)), where θ ≤ 1.The curve of the correlation function can be approximated by a convex parabola, which can be defined as where the unknown variables {a, b, c} can be solved using the three sets of data obtained as follows: Solving Equation ( 5) yields: Solving for the extreme value by the derivation of Equation ( 4) yields an estimate τ0 , as follows: Then, τ0 ≈ τ 0 .However, a convex parabola is used to fit the function R x 1 x 2 (τ), which can lead to the 'misfit' phenomenon, as shown in Figure 2. The result reveals that the estimation performance deteriorates as the bandwidth B increases.

Proposed Fast TDE Algorithm
Based on the above introduction, the conventional algorithm suffers from the 'mismatch' issue.Before proceeding to explore the proposed algorithm, we will initially investigate the mapping relationship between τ 0 and τ0 .If τ 0 and τ0 are mapped strictly one-to-one over an interval [−0.5, 0.5], we can call Equation ( 7) a strictly monotonic function.In mathematics, if a function is strictly monotonic, it must have an inverse function.In this paper, we do not intend to strictly prove the monotonicity of Equation ( 7), and instead aim to verify the invariance of the sign of the first-order derivative of Equation ( 7) with simulation methods.
As shown in Figure 3, the sign of their first-order derivatives is constant, so we can conclude that Equation ( 7) is a strictly monotonic function.Consequently, we propose a polynomial inversion-based TDE to improve the estimation precision, which is described in detail in Section 3.1.

Principle of Polynomial Inversion-Based Time Delay Estimation
For ease of description, we introduce the function TED(τ 0 ) = τ0 by rewriting Equation (7) as follows: The equation above gives: Here, TED −1 ( τ) is the inverse function of TED(τ).To obtain the fractional time-delay inversion function, a set of P-order polynomial coefficients is designed to approximate TED −1 ( τ)-i.e., ∑ P p=0 a p τp 0 ≈ TED −1 ( τ0 ) ≈ τ 0 .To solve for the polynomial coefficients {a 0 , a 1 , • • • , a P }, the fractional delay interval [−0.5, +0.5] is divided into Q + 1 equal parts and Q ≤ P; then, ∆ = 1/Q, τ q = −0.5 + q∆, q = 0, • • • , Q. First, the time-delay estimate τq is calculated as shown in the following equation: here Rr (θ + τ q ), Rr (−θ + τ q ), Rr (τ q ) are defined as: where the r(kT) in the above equation is an arbitrary waveform sequence with zero noise, consistent with the spectral profile of the transmit signal s(kT).We then use the obtained { τ0 , τ1 , • • • , τQ } and the polynomial to establish the equation, which can be expressed as: Using the least square method, we solve for the polynomial coefficients as follows: Then, Equation (8) yields the fractional time-delay estimate, as follows:

Modified Performance Limits
Carter [22] derived the CRLB for the case of equal signal bandwidth and noise bandwidth; however, the signal bandwidth is smaller than the noise bandwidth in most scenarios.Therefore, CRLB expression needs to be adapted to make it suitable for more applications.The Cramer-Rao lower bound (CRLB) is computed as [22]: where H denotes conjugate transpose, X( f ) = [X 1 ( f ), X 2 ( f )] T , which is the Fourier transform vector of [x 1 (kT), x 2 (kT)].Q( f ) is the inverse of the spectral density matrix, which can be expressed as: Here, the G x1x1 ( f ), G x2x2 ( f ) in the above equation denote the power spectral densities (psds) of the signals x 1 (kT), x 2 (kT), respectively.G x1x2 ( f ) represents the cross-correlation power spectral density (psd) of the signals x 1 (kT), x 2 (kT), which can be formatted: where G ss ( f ), G n1n1 ( f ), G n2n2 ( f ) represent the power spectral densities of signals s(kT), n 1 (kT), n 2 (kT), respectively, and τ indicates the time delay of signal x 2 (kT) with regard to the reference signal.
The derivation term of Equation ( 14) can be calculated as: Then, we introduce the normalization factor of the cross-correlation psd C x1x2 ( f ): To facilitate the analysis, assuming that the power spectra of signal is flat over the frequency range [−B/2, B/2] and zero outside this range, Equation ( 14) is simplified as follows: As far as the noise and signal bandwidth are concerned, we set the noise bandwidth to B n = 1/T and the signal bandwidth to B; thus, the power spectral density of x 1 (kT) is obtained: Then, we can obtain: where SNR = σ 2 s /σ 2 n , α = B n /B.By substituting Equation ( 21) into Equation ( 19), we can obtain:

Performance Analysis
To analyze the performance of the improved algorithm proposed in this paper, we investigate it in terms of both bias and MSE [23].Referring to Equation ( 13), the equation for bias is obtained as: Here, the first term on the right side of Equation ( 23) denotes random error, var(•) indicates the variance, and ∂ 2 τ0 /∂(E( τ0 )) 2 = ∑ P p=2 p(p − 1)a p E p−2 ( τ0 ).D accounts for the systematic bias, representing the ∑ P p=0 a p τp misfit, which can be computed as If P → +∞, then D ≈ 0.
The MSE can be expressed as: where var( τ0 ) can be computed as: Here, ∂ τ0 /∂(E( τ0 )) = ∑ P p=1 p × a p E p−1 ( τ0 ).To facilitate the analysis of Equations ( 23) and ( 25), we combine Equation ( 7) to solve for var( τ0 ), as follows: Here, cov(x, y) denotes the covariance of the random variables x and y, X sidering the dependency of covariance on variance, we first address the expression for cov( Rx 1 x 2 (τ 1 ), Rx 1 x 2 (τ 2 )) as follows: Thus, we obtain: To simplify the analysis, we assume that the correlation between the signal (s(kT)), the real part (s r (kT)), and the imaginary part (s i (kT)) is approximately zero-i.e., E((s r (kT) • (s i (kT)) ≈ 0, A = 1.Then: Carrying out a similar manipulation, one can also obtain E{♣ • ♠} and E{♣ • }.By substituting Equations (30) and (31) into Equation (29), and after performing some algebraic manipulations, we can obtain: By substituting Equation (32) into Equation (28), we can find: Taking Equation (33) to Equation (27), we solve for var( τ0 ) (see [11] for details), yielding As far as the MSE is concerned, we find that it consists of two contributions, the covariance var( τ0 ) and the bias E 2 ( τ0 − τ 0 ).Referring to Equation (26) for further analysis, var( τ0 ) is diverse because (∂ τ0 /∂(E( τ0 ))) 2 varies with the time delay, as will be demonstrated in the next section.

var( τ0 ) Simulations
In order to evaluate the variance of the new algorithm, combined with the modified CRLB (Equation ( 22)), MATLAB simulates the variance and CRLB curves for different latency scenarios in different bandwidths.Figure 4a simulates the performance curve for BT = 0.5, τ 0 = {0.1,0.2, 0.3, 0.4}, which shows that the variance is very close to the CRLB curve with an error of less than 0.5 dB for SNR > 15 dB.Additionally, the variance jitter due to ∂ τ0 /∂(E( τ0 )) is within 0.2 dB. Figure 4b shows the performance curve under BT = 0.25, τ 0 = {0.1,0.2, 0.3, 0.4}, where the variance and CRLB error are less than 0.25 dB when SNR > 15 dB.

MSE Evaluations
Considering the bias phenomenon, the performance of the proposed algorithm is evaluated using the MSE criterion (Equation ( 25)).We simulate the scenario with different bandwidths and different fractional delays (Table 1) to obtain the following results, as shown in Figure 5.It should be noted that in this scenario, a variable fractional delay filter (VFD) is implemented, and its delay accuracy will also have an impact on the simulation system.The VFD parameters used in this simulation are: normalized passband frequency 0.9π, time-delay accuracy ≤ 2 × 10 −7 (/T), and frequency response ripple ≤ 0.2 dB.From the simulation, we can see that the MSE of the algorithm has a CRLB error of ≤1 dB when the SNR is ≥15 dB.Table 1.List of simulation parameters.

Comparison with Existing Approaches
The proposed algorithm is compared with the existing approaches in [11] (marked as 'Para'), [6] (noted as 'Tri'), and [19] (marked as 'FIR'), and the VFD parameters included in the simulation procedure are consistent with those of scenario 2. Here, the parameters of 'FIR' are set to M = 10 and τ = 0.1 [19].By comparing it with Figure 6, it can be seen that in terms of estimation, the 'FIR' algorithm outperforms the other three algorithms when the SNR ≤ 10 dB; however, the MSE of the proposed algorithm is closest to CRLB when the SNR > 10 dB.The parabolic algorithm has a larger bias due to the increasing time-delay difference because of the misfit phenomenon, as well as being insufficient for broadband multichannel acquisition systems.Furthermore, Figure 6 obviously shows that the 'Tri' is more suitable for spread-spectrum communications [6] than the scenario described in this paper.As far as the computational complexity is concerned, as presented in Table 2, the 'FIR' method has the heaviest computational load, meaning that it is not appropriate for highly dynamic situations.In contrast, the algorithm proposed in this paper only uses P + 1 more multipliers than the parabolic algorithm to achieve a performance close to that of CRLB.Therefore, it is more suitable for application in wideband large-scale multichannel systems for real-time time-delay estimation.

TDE-Based Application
We give an example to illustrate the functionality of the proposed algorithm in practice.The broadband multichannel acquisition system based on NI PXI devices we delivered is mainly used to measure the functional performance indicators of the broadband digital correlator of the microwave radiometer, which is an embodiment of wideband large-scale antenna arrays [24], including multichannel receiving channel isolation tests, stability tests, and linearity tests.The approach involves using the time-delay difference estimation method proposed in this paper to calculate the multichannel consistency, isolation, correlation calculation accuracy to evaluate the performance of the broadband digital correlator.Meanwhile, this system can also be used to validate the processing algorithm of the integrated aperture receiver.
The hardware used in the system includes an NI PXIe 1095 chassis, PXIe 8880 controller, PXIe 7976R FPGA processor and IF Digitizer Adapter Module NI 5772.The signal sampling rate is 800 Msps.The physical diagram of the system is shown in Figure 7a, and the signal processing block diagram of the acquisition is shown in Figure 8.The signal processing modules are all implemented in the FPGA, including Hilbert transform, digital down conversion, integer cycle delay adjustment, and variable fractional delay filter.
Figure 7b shows the system running UI.This calibration performance from the system shows an absolute error of less than 5 ps between channels.The system can act as a standard device to evaluate the signal quality of broadband multiantenna systems.It also can be used in calibration equipment to gauge synchronization errors in broadband multichannel systems.

Conclusions
To address the challenge of time-delay estimation between channels in multiantenna, wideband receivers, this paper proposes a polynomial inversion-based time-delay estimation method.The new algorithm improves the delay estimation accuracy while adding P + 1 multipliers compared to the convex parabolic interpolation method, as summarized in Table 2, which is a very competitive method in multiantenna real-time communication systems.To evaluate the performance of the new algorithm, we modify the expressions of CRLB and derive equations for bias, variance, and MSE.Then, the effectiveness of the algorithm is verified by simulation in conjunction with CRLB.The results show that the proposed time-delay estimator is asymptotically unbiased with its MSE performing on the CRLB.Finally, the algorithm is applied to the radar radiometer test system, named the wideband LSASs test-bed, and high-accuracy and high-stability parameter test results are achieved.

Table 2 .
Computational complexity comparison with existing approaches.