Thiran Filters for Wideband DSP-Based Multi-Beam True Time Delay RF Sensing Applications

The ability to sense propagating electromagnetic plane waves based on their directions of arrival (DOAs) is fundamental to a range of radio frequency (RF) sensing, communications, and imaging applications. This paper introduces an algorithm for the wideband true time delay digital delay Vandermonde matrix (DVM), utilizing Thiran fractional delays that are useful for realizing RF sensors having multiple look DOA support. The digital DVM algorithm leverages sparse matrix factorization to yield multiple simultaneous RF beams for low-complexity sensing applications. Consequently, the proposed algorithm offers a reduction in circuit complexity for multi-beam digital wideband beamforming systems employing Thiran fractional delays. Unlike finite impulse response filter-based approaches which are wideband but of a high filter order, the Thiran filters trade usable bandwidth in favor of low-complexity circuits. The phase and group delay responses of Thiran filters with delays of a fractional sampling period will be demonstrated. Thiran filters show favorable results for sample delay blocks with a temporal oversampling factor of three. Thiran fractional delays of orders three and four are mapped to Xilinx FPGA RF-SoC technologies for evaluation. The preliminary results of the APF-based Thiran fractional delays on FPGA can potentially be used to realize DVM factorizations using application-specific integrated circuit (ASIC) technology.


Introduction
The directional enhancement of propagating RF plane waves is a key requirement for electromagnetic sensing using antenna array processors.Linear arrays of uniformly spaced wideband antenna elements receive far-field propagating electromagnetic plane waves, which must be directionally enhanced along the desired look direction using a beamformer.A TTD beamformer enhances a wideband plane-wave signal along a particular DOA.To wit, as the instantaneous bandwidth of modern antenna apertures increases, it leads to wideband systems.The underlying beamforming schemes for such an aperture must be realized with support for squint-free operation across the bands of interest [1].The TTD beamformer is the optimal beamformer when the signals of interest are contaminated by wide sense stationary AWGN [2][3][4][5]; the case of the sensing of RF plane waves along multiple directions is a crucially important application for antenna array-based RF sensing and direction-of-arrival estimation [6] with use cases in RF imaging, RF communications, and RF location.
For an aperture with N elements, an N-th order TTD beamformer can be realized using analog, mixed-signal, or direct-digital methods for realizing such a TTD aperture beam.A TTD beamformer can be steered by the appropriate selection of the progressive time delay from one element to the next in a uniform aperture configuration.In cases where a large number of wideband beams are required, a dedicated TTD beam must be created through its own TTD beamsteering network [5].Typically, for N elements, N-independent beams can be realized by considering a short duration time delay τ 0 = T/N, where T is the time taken for the radio signal to propagate between two neighboring elements in the aperture.The parallel realization of N beams using a network of TTDs with durations set to integer multiples of τ 0 leads to an N-beam TTD multi-beam beamformer with an underlying mathematical framework given by the DVM [7][8][9][10][11].The DVM is similar in concept to a spatial DFT; however, unlike the case of the DFT which uses a fixed complex coefficient in the matrix-vector operation, the DVM uses frequency-dependent weights defined as integer multiples of the smallest TTD component τ 0 .

Review: Fractional Delay Filters
Although FFT-type algorithms utilizing Thiran fractional delays are presently not available, one could observe other methods based on fractional delay filters.The authors in [12] proposed different methodologies and parameters for designing allpass fractional delay filters.The authors claimed that the employment of Lagrange FIR and Thiran fractional delay filters are the best choices for achieving desirable phase delay responses.The paper [13] evaluated designs for FIR fractional filters, including sinc, frequency sampling, polynomial interpolation, and least integral-square error.In designing fractional delay filters, the paper [14] addressed the utilization of Taylor series expansions of e −jωτ when τ represents a combination of integer delay and a fractional delay within the interval of [−0.5, 0.5].This approach was utilized to design fractional delay filters and gauge the efficacy of their structure in relation to the Farrow structure, primarily in terms of the storage of filter coefficients in [14].The realization of allpass fractional delay filters was also addressed through the truncation of the power series expansion based on the denominator of the transfer function of the variable z at z −1 = 1 [15].The paper [16] addressed the reconstruction techniques for nonuniformly sampled band-limited signals using digital fractional delay filters using a Vandermonde structured matrix followed by a diagonal scaling.One can also find the factorization of the transfer function of the infinite impulse response (IIR) Simpson integrator for designing fractional delay filters in [17].On the other hand, authors in [18] presented the introduction of the Hilbert transform operator, which was developed using a half-sample delay operator created through the B-spline transform interpolation and decimation.However, there is a lack of computationally efficient algorithms in the literature, despite the fact that the methods for designing Thiran fractional delay filters have been established.

Wideband Beamformer Structure
Figure 1 shows a single narrow-beam phased-array beamforming structure, a DFTbased multi-beam phased-array structure, a single TTD wideband beamformer structure, and an N-beam TTD wideband multi-beam structure, respectively.We first define the DVM by where α = e −jωτ 0 .The term α accounts for the phase rotation associated with the smallest fractional delay τ 0 (necessary for realizing the DVM N-beam system) at frequency f , where the temporal frequency ω = 2π f .Unlike the DFT-based N-beam phased array, which can be realized using self-contained factorization of the DFT followed by the FFT algorithm, the corresponding N-beam TTD structure has no direct equivalent of the FFT.We address this problem using sparse factorization followed by a DVM algorithm while replacing complex coefficients with frequency-dependent weights realized through linear filters.The analog realization of low-complexity and/or radix-2 DVM algorithms was previously discussed in references [7][8][9][10][11].The structure of the DVM offers a framework for analyzing TTD-based wideband multi-beam beamforming [1].The DVM overcomes the limitations associated with employing the DFT matrix.For example, wideband signals applied to DFT-based multi-beams lead to an undesirable "beam squint" effect.The DVM entries, referred to as A N in [7][8][9][10][11], serve as the foundation for a TTD-based beamformer.It is important to note that the DFT matrix equals the matrix A N at a single temporal frequency because the DVM covers a wide range of multiple temporal frequencies.Thus, the DVM is a superclass of the DFT matrix.However, unlike the DFT matrix, the DVM matrix lacks both properties of unitary and periodicity.Therefore, unlike the case of the DFT that can be self-factored to obtain FFT algorithms [19][20][21], we are generally unable to factorize the DVM to obtain a self-contained and radix-2 factorization, leading to an FFT-like algorithm.

Contribution of the Paper
To develop an efficient digital DVM algorithm that incorporates fractional delays, we will utilize the bidiagonal factorization technique in [7].By implementing an array representation of the DVM factorization by a vector, we present a DVM algorithm that efficiently reduces the complexity associated with the matrix-matrix multiplication step of the DVM algorithm described in [7].Furthermore, we provide SFGs specifically for small-array elements, i.e., N = 4, 8 incorporating Thiran fractional delays.Specifically, the fractional part associated with each node in the SFGs is implemented using Thiran fractional delay filters.Additionally, the integer part is realized using FIFO buffers implemented using D-flop pipelines and/or realized as the integer delay component of a Thiran filter.This approach allows for a precise and efficient representation of the SFGs, facilitating enhanced performance and accuracy in TTD-based wideband multi-beam digital beamforming.

Organization of the Paper
We provide an overview of the analog DVM algorithms proposed for TTD wideband multibeam beamforms in Section 2. In Section 3, we give an overview of digital DVM beamfomers using Thiran filters.Next, in Section 4, we recall the DVM factorization in [7] and present an array implementation of the factorization using a novel DVM algorithm.The proposed algorithm is more reliable for large N than that for the matrixmatrix multiplication-based DVM algorithm in [7,8].Section 4.1 shows the arithmetic complexity (quantified using the number of necessary adders and gain-delay blocks) and numerical results for the DVM algorithm, and compares the arithmetic complexity with the matrix-vector product with the previous work.Computing the DVM-vector product with fractional delays has low arithmetic complexity compared to the brute force calculation.Section 5 presents a brief overview of analog and digital DVM beamforming using APFs.Section 6 shows the simulation for the linear phase response of Thiran filters.Next, Section 7 demonstrates Thiran filters for realizing each fractional sample delay based on the proposed DVM algorithm followed by the SFGs.Section 8 discusses future work on utilizing the APF-based Thiran fractional delay units to realize the full DVM factorization.Finally, Section 9 concludes the paper.

Review: Analog DVM Beamformers
In prior work, we proposed a sparse factorization of the N-beam DVM leading to 60% reduction in integrated circuit complexity [7].This low-complexity N-beam DVM algorithm is based on the product of complex 1-band upper and lower matrices.In this paper, we expand on the algorithmic results in [22][23][24] by utilizing complex nodes instead of real nodes.The 1-band upper and lower matrix factorization of A N in [7] is expressed without utilizing quasi separability, displacement equations, and the factorization of Hankel/Toeplitz matrices [25][26][27][28][29][30][31].Moreover, the paper addressed error bounds and stability by filling the gaps in [22][23][24] to compute DVM by a vector.
The paper [10] proposed an exact, efficient (i.e., more efficient than brute force multiplication of the DVM by a vector but not radix-2), and self-recursive DVM algorithm by using a matrix factorization of the DVM and a polynomial evaluation associated with its nodes to analyze multi-beam antenna arrays.The exact calculation of the DVM vector product via the DVM algorithm in [10] can be used to reduce the complexity of RF N-beam analog beamforming systems.Although the algorithms proposed in [8,10] are much more efficient than brute-force calculation, their order of arithmetic complexity is not O(N log N).We recall here that the stable (well conditioned) and O(N log N) algorithms for Vandermonde matrices proposed in [9] can be used for the narrowband communication system.Derivations have been established for radix-2 and split-radix FFT algorithms in [19][20][21][32][33][34][35][36].Despite the derivation of size-N DFT into two size-N 2 DFTs being quite straightforward, its extension to the DVM is cumbersome because useful DFT matrix properties (periodicity and unitary) are not present in the DVM.
To overcome the above challenge, we proposed an O(N log N) DVM algorithm having sparse factors to achieve multi-directional TTD wideband beamforming to solve the longstanding beam squint problem in [11].In [11], the "multiplication counts" in the frequency domain-albeit in the analog domain-are a combined result of analog gains (amplifiers) and time delays in the SFG when realized in a time-domain circuit.

Digital DVM Beamfomers Using Thiran Filters
Figure 2 shows a TTD multi-beam beamforming aperture for receive (left) and transmit (right) applications.In both cases, the beamformers are realized in the digital signal processing (DSP) unit that implements the DVM-vector operation once every sampling period.The TTD digital beamformers can be realized as the DVM-vector product s.t.A N x, where x is the N− point vector of input signals obtained from N receivers (or feeding N transmitters after digital-to-analog conversion).In the DVM A N , the k th -row and l thcolumn element is of the form e −jωτ 0 kl , and hence the corresponding delays take the form T(l) = τ 0 kl, where τ 0 is the smallest fractional delay necessary for realizing the DVM N-beam system, and ω is the temporal frequency.Typically, τ 0 = T s /N.We note here that the factorization of the N-beam DVM leads to a range of delays T i = τ 0 P i , where i is the index to the i th −node of the signal flow graph (which is, in fact, the butterfly diagram for the radix case) corresponding to the factored version of A N and P i ∈ Z + , and P i ≤ (N − 1) 2 are the different time delays across the nodes of the signal flow graph (butterfly for the radix case) pertaining to the proposed fast algorithm for y = A N x.It is important to note that at node i, the delay can be decomposed to the form τ 0 P i = n i T s + τ i , where τ i < T s and where n i ∈ Z + is an integer delay realized through FIFO register buffers located at point i of the signal flow graph (butterfly for the radix case).Each delay operation at node i consists of an integer component and a fractional component.The fractional delays τ i ≤ T s will be realized as fast interpolation filters based on finite impulse response (FIR) and/or all-pass IIR Thiran filters.We start with the n i th -order Thiran fractional delay filter of the form [12]: where Here, each set of coefficients a k is unique to node i; however, we dropped the dependence on i for readability.We obtain an approximated transfer function for the digital all-pass Thiran fractional delay filter.Thus, with this and from the prior work on DVM [7][8][9][10][11], we propose a low-complexity wideband TTD digital DVM algorithm using Thiran fractional delays.

A Fast DVM Algorithm for Thiran Fractional Delays
We begin this section by recalling the sparse factorization of the DVM in [7].Subsequently, an algorithm is presented based on the array implementation as opposed to the matrix-matrix implementation associated with the phase rotation α = e −jωτ 0 followed by the smallest fractional delay τ 0 .We note that the primary objective of this study is to propose a low-complexity DVM algorithm yielding to compute DVM-vector multiplication using the Thiran fractional delays.Our work builds upon the studies [7][8][9][10][11].It is important to note that this paper does not address related issues, such as the computation of Vandermonde structured matrices by a vector, inverse Vandermonde structured matrices (leading to inversion formulas or inversion algorithms), or the solution of the linear systems associated with Vandermonde structured matrices as the coefficient matrices as in [37][38][39][40][41][42][43][44][45][46][47].
Proposition 1 (Recalling from [7]).Let the DVM, denoted by A N , be defined via the nodes {α, α 2 , . . ., α N } ∈ C for integers N s.t.N ≥ 4.Then, the DVM can be factored into the product of bidiagonal matrices s.t. where Remark 1.We note here that the above factorization can be used to propose the ddvm algorithm (Algorithm 1) based on Thiran fractional delays when the nodes of the DVM (i.e., α k ) are associated with the phase rotation α = e −jωτ 0 followed by the smallest fractional delay τ 0 .Additionally, in Section 5, we utilize the above factorization to present the signal flow graphs for y = A N x, where the (k, l) element of the matrix A N is determined by the fractional delay component associated with e −jωτ 0 kl .

Arithmetic Complexity of the DVM Algorithm
We show that the arithmetic complexity of the proposed DVM algorithm is less than the standard quadratic complexity.Furthermore, when the algorithm executes, it uses array implementation to avoid matrix-matrix multiplication in [7].The number of additions and multiplications used to execute the DVM algorithm correspond to the adders and gain delay blocks, respectively.
The digital DVM using Thiran fractional delays describes a DSP system where each fractional true time delay is realized using an IIR all-pass fractional delay filter.The Thiran filter is one type of digital all-pass filter that maintains nearly constant group delay in the frequency domain −π/2 ≤ ω ≤ π/2.The Thiran filters in this work will be realized using the direct-form type-II SFG so that the amount of D-flop FIFO memory is minimized.
), end if.end for, end for.
The number of complex additions (say #a) and the number of complex multiplications (say #m), i.e., adders and gain-delay blocks, respectively, are required to compute the DVM algorithm, i.e., ddvm, stated next.Lemma 1.Let N ≥ 4 be a given integer and x ∈ C N .The DVM algorithm, i.e., ddvm, can be computed with the following arithmetic complexities (respectively, adders and gain-delay blocks): Proof.To pre-calculate the vector u in Step 1 of the algorithm, one has to use gaindelay blocks but no adders.These counts are based on pre-computation and correlated with the design time computations.Also, the aim of Step 1 is to compute the output of the algorithm effectively.In Step 2 of the algorithm, there are no adders or gain-delay counts involved in the if part.For the else part, when (N − k) ≤ i ≤ (N − 1), we have one adder and gain-delay for each k, so in total, we have adders and also gain-delay blocks in Step 2. In Step 3 of the algorithm, there are no adders or gain-delay involved in the if part.For the else part, when (k − N + 1) < i ≤ N, we have two adders and one gain-delay for each k, so in total we have N(N − 1) adders and N(N−1) 2 gain-delay blocks in Step 3. In Step 4 of the algorithm, we extract the last column of A and assign it to the vector y; hence, no cost is involved.Hence, the proposed algorithm requires 3  2 N(N − 1) adders (respectively, complex additions) and (N − 1) 2 gain-delay blocks (respectively, complex multiplications).
Remark 2. We recall here that an array implementation of a DVM algorithm was used to solve the DVM system for the phased array digital receivers in [45], and hence were able to reduce the cost of solving the DVM system from O(N 3 ) to O(N 2 ).Unlike in [45], we do not solve the systems of equations but rather compute the DVM-vector product.

Comparison Results for the Arithmetic Complexities of DVM Algorithms
In this section, we compare the arithmetic complexity of the proposed DVM algorithm with the related work of DVM algorithms.
The gain-delay blocks of the proposed DVM algorithm (i.e., (N − 1) 2 ) show favorable results in comparison to [7] (i.e., ).This advantage is gained based on the precomputation of nodes followed by the array implementation of the proposed algorithm.If we consider the gain-delay blocks in the pre-computation stage of the ddvm algorithm (step 1), the number of gain-delay blocks in the proposed algorithm and the algorithm in [7] is the same, i.e., 3  2 N 2 + O(N).Furthermore, the number of adders remains consistent between the ddvm algorithm and the algorithm in [7].The ddvm algorithm exhibits a higher count of adders and gain-delay blocks compared to the algorithms presented in [10,11].This is because the algorithms in those papers have adders and gain-delay block counts of the order O(N log N), whereas the ddvm algorithm maintains these counts of less than O(N 2 ).But, on the other hand, it is important to note that the paper [10] introduces additional criteria for the arrangement of nodes.Specifically, the nodes are evenly distributed on a circle centered at the origin with a radius r, where r ≥ 1.This criterion is not imposed in this paper.Despite the fact that the algorithms presented in [11] exhibit a complexity of O(N log N), this algorithm does not yield favorable outcomes when applied to antenna arrays with a small number of elements, specifically 4 and 8.However, the DVM algorithm in [11] shows significant efficiency, surpassing 90%, for DVM sizes of N ≥ 256.On the other hand, the gain-delay blocks of the ddvm algorithm consistently exhibit better performance compared to those in [9] for both N = 4 and N = 8 configurations.
However, the objective of this paper is to utilize the DVM factorization, as outlined in Proposition 1 from [7], for the implementation of multi-beam digital wideband beamforming systems that employ Thiran fractional delays.Thus, we will utilize the proposed algorithm to realize the signal flow graphs as shown in the next section.

Thiran Fractional Delays for Twiddle Filter Realization
In this section, we will provide a brief overview and formulation of how analog and digital DVM beamforming can be realized using APFs.

Digital Fractional Delay APFs
Upon digitization, the time domain becomes discrete due to the sampling operation in the ADCs.Hence, the linear phase fractional sample delay blocks corresponding to the (k, l) element of A N , i.e., e −jωτ 0 kl where τ 0 is the smallest fractional delay necessary for realizing the DVM N-beam DSP system.The matrix factorization of A N for digital DVM can be realized using discrete-domain Thiran fractional delay filters while approximating the fractional time delay, i.e., τ i < T s where T s = τ 0 N.

Thiran IIR Filter Blocks
Unlike the analog DVM case, which uses the fact that e −jωτ i ≈ 1−jωτ i /2M 1+jωτ i /2M M , M ∈ Z + and ω is the frequency variable in general, the digital DVM requires Thiran fractional delay approximations given by the digital infinite impulse response (IIR) filter function H i (z), having a filter order n i ∈ Z + .The purely fractional delay component ψ(z) is related to the Thiran filter via H i (z) = z −n i (ψ(z)) n i , and ψ(z) is the fractional delay associated with τ 0 .

Higher-Order Thiran APFs
We note that the fractional delay (ψ(z)) p can be realized either by cascading p units of first-order (or low-order) digital ψ(z) all-pass filters or by designing a single APF with the desired level of fractional delay so that it fits the model of H i (z).The former approach leads to p-times replicated computational hardware structures of a Thiran filter when realized as a VLSI implementation.We adopt the latter approach, where a unique Thiran filter of order n i is realized as a hardware core for the implementation of ψ(z) p .

Simulation of Thiran Fractional Delays
We demonstrate the approximately linear phase response of Thiran filters.We show two examples with fractional delays of D i = 0.1-0.9fractions of the temporal sampling period.We used Thiran IIR APFs of order n i = 3 and n i = 4, respectively, with corresponding integer delay components of three and four clock periods, and the desired fractional delay components.Both filters are usable as approximately linear-phase fractional sample delay blocks better than the first 50% of the Nyquist period corresponding to a temporal oversampling factor of about two.All plots of the phase and group delay vs. frequency are in the domain −π/3 ≤ ω ≤ π/3.In Figure 3, we show the phase response of the Thiran all-pass filter with n i = 3 and D i = 0.1-0.9.The negative linear phase behavior of the filter within the first half of the Nyquist period is clearly apparent from the simulation.The approximately linear phase response results in approximately constant group delays as shown in Figure 4.The experiments are repeated for n i = 4 in Figure 5 for phase responses and Figure 6 for group delay, respectively.

Phase Responses and Group Delay Profiles of Thiran Filters
Figures 3 and 5 illustrate phase responses for order n i = 3 and order n i = 4 for delay variations D i = 0.1-0.9,where D i = τ i /T s respectively.Figures 4 and 6 illustrate the group delay profiles for order n i = 3 and order n i = 4 for delay variations D i = 0.1-0.9,where D i = τ i /T s respectively.
The TTD beam shapes of the multi-beam beamformer remain unchanged regardless of the method used to achieve the TTDs.The main requirement is to have a constant group delay that matches the desired TTD value and a response with a unit magnitude.These conditions are satisfied within the usable frequency range.The group delay of the Thiran filters enables the desired TTD value to be achieved within the operating band, assuming temporal over-sampling up to 3×.If the group delay of the TTD at a specific frequency matches the desired TTD, the aperture will produce the same beam.When there is an insufficient amount of temporal oversampling, the Thiran filters may not perform at their best, making a strong case for deviation.In practice, we can disregard this due to the use of significant over-sampling.Our paper concludes that the Thiran filter-based approach is exclusively suitable for temporal over-sampling factors greater than approximately 3× and summarizes this using Figure 7. )

Comparison between Thiran-TTD and Ideal-TTD Beams
Beam for Thiran-TTD Beam for Ideal-TTD (a) (b) Figure 7. (a) shows a TTD beam for 3× temporal over-sampling (ω = 0.33π) which is the worst-case performance under the recommended scheme.The Thiran filter yields a beam that is identical to the perfect delay case.Furthermore, we show in (b) what would happen when the frequency is out of the recommended range, i.e., when ω = 0.8π, where there is a significant deviation of the beam axis.We only show this for information.In practice, by keeping ω < 0.33π by selecting 3× temporal over-sampling, we can avoid this issue.

SFGs for the Fast Digital DVM Algorithm Using Thiran Fractional Delays
In this section, we will utilize SFGs to elaborate on Thiran fractional delays for twiddle filter realization.Although a direct-form realization of the digital DVM beamformer [50] can be realized using Thiran filters for realizing each fractional sample delay, we proposed to reduce the computational complexity of the multi-beam beamformer by exploiting sparse factorization of the DVM in [7] followed by the fast DVM algorithm, i.e., ddvm.However, the SFGs in [7] are continuous-time and realized as analog microwave circuits.
In this paper, we are operating in the discrete domain, and the resulting SFGs are discrete-time and realizable using DSP systems; to wit, the sparse factorization of the DVM leads to discrete-time SFGs having twiddle filters consisting of digital fractional delay lines that can be approximated using Thiran filter building blocks.

An N = 4 Point DVM with Thiran Twiddle-Filters
Figure 8 shows the N = 4 point DVM factorization, where fractional true time delays of the form ψ p (z) are realized as a single Thiran filter block, having delay pτ where τ = 1/N.Figure 9 shows the direct form II SFG of a typical Thiran all-pass filter that approximates a fractional true time delay within the first 60% of the Nyquist period.A temporal over-sampling factor of two easily allows the signals of interest to be located within the usable band of the Thiran filters of the transfer function order n i = 3 and n i = 4, respectively.The twiddle filter-based DVM factorizations also require several special filters denoted as ϕ k (z), k = 1, 2, ..., 5, where each special filter is realized using ψ(z) Thiran filters as a building block.Therefore, once the Thiran block ψ(z) is available, it can be re-used to create the special twiddle filters ϕ k (z) as shown in Figure 10.
x [1] x [2] x [3] Y U (1)  U (2)  U (3)  L (3)  L (2)  L (1)   ( z ) ( z )  (z) = (z) ( (z) -1) (z)= ( -1)  Figure 11 shows the N = 8 point DVM factorization, where fractional true time delays are of the form ψ p (z) as was the case for N = 4.For N = 8, the twiddle filters require special filters that also use ψ(z) Thiran filters as the primary building block.The twiddle filters ϕ k (z) are realized using ψ(z) Thiran fractional delay filters as the building blocks.Once a reconfigurable Thiran filter is realized in a hardware design, it can be re-used at scale for realizing the SFGs consisting of the twiddle filters ϕ k (z) using cascade and parallel-form filter synthesis as shown in the corresponding design equations for ϕ k (z).The special filters ϕ k (z) are shown as SFGs in Figure 12, where ψ p (z) are direct form II Thiran filter blocks that implement true-time fractional delays having value pτ, p = 1, 2, 3..., 6.

Preliminary FPGA Digital Hardware Architectures
The full realization of the DVM architecture on a digital integrated circuit (IC) is a complex task and is beyond the scope of this paper.Nevertheless, we provide preliminary results toward the eventual realization of the DVM beamformers on digital IC using application-specific integrated circuit (ASIC) technology.

FPGA Realization of Thiran Filter Blocks
We prototyped a single software reconfigurable and delay-tunable digital Thiran filter using field programmable gate array (FPGA) technology.Using Xilinx FPGA design tools, including System Generator and PlanAhead, we realized a Thiran filter as a direct form II hardware architecture.The design, shown in Figure 13, assumes that two's complement the fixed-point computer arithmetic with a system bus size of 14 bits, and quantization of type rounding.The design was simulated for unit impulse response and compared with an ideal unit impulse response obtained from Matlab; the results match the errors associated with a 14-bit finite precision computer arithmetic core.

Synthesis and Mapping to FPGA Fabric
The RF-SoC realization of a Thiran filter for n i = 3, 4 order APFs are given in Table 1.The designs were ported from Matlab System Generator to Xilinx Vivado for synthesis and post place-and-route timing analysis.The design for order n i = 3 filter consumes 418 configurable logic block look-up tables (CLB LUTs), 158 CLB registers, and 51 CARRY8 on the Xilinx FPGA reconfigurable logic fabric.The design for the order n i = 4 Thiran consumes 487 CLB LUTs, 182 CLB registers, and 70 CARRY8 on FPGA logic fabric.The target FPGA device was an RF System-On-Chip (SoC) ZCU-111.
The hardware consumption data in Table 1 indicate the scalability of the Thiran filter building blocks for large apertures having many TTD units in the SFG.Designs were pipelined to obtain a critical path delay (CPD) of T CPD = n i T A/S + T M , where n i ∈ Z + is the order, T A/S is the latency for an adder/subtractor, having 14 bits of precision, and T M is the latency for a multiplier, having a coefficient precision of 14 bits and requantized output precision of 14 bits, selected to match the precision of the analog-todigital converters (ADCs) on the RF-SoC.The maximum clock frequency is F CLK = 1/T CPD .Post place and route timing analysis on Vivado indicates maximum clock speeds of 220 MHz and 213 MHz for Thiran filter orders of n i = 3 and n i = 4, respectively.

Future Work
We will utilize the APF-based Thiran fractional delay units to realize the full DVM factorizations provided in the paper.The twiddle filters will be realized using the Thiran building blocks.Thereafter, the factorized DVMs will be realized following the respective SFGs.The DVM algorithms will be pipelined using clocked D-flop stages inserted within the feed-forward parts of the signal flow.The building block Thiran filters are IIR filters that cannot be pipelined further without specialized computer architecture techniques; for example, we will explore the scattered/structured look-ahead pipelining [51] for digital processors, which are based on complex pole-zero cancellations.The computer architectures arrived at by mapping the SFGs to pipelined parallel processors with Thiran twiddle filters will be implemented on an AMD-Xilinx RF-SoC platform, such as ZCU-1285/1275/111, and operated with RF front ends realized off chip to evaluate the multi-beam beamforming performance in a hardware-in-the-loop emulation [52][53][54].

Conclusions
The ability to directionally enhance a propagating wideband RF plane-wave signal is a crucial requirement for RF sensing applications, as well as RF imaging, RF communications, and RF-based location systems.The application domains of RF sensing include radar, radio astronomy, and joint-communication and sensing applications that are emerging for next-generation wireless systems (e.g., 6G networks).A multi-beam TTD beamformer allows optimal sensing along a set of pre-determined DOAs for wideband waveforms.We proposed an algorithm for the efficient computation of wideband true time delay digital DVM using Thiran fractional delays.By adopting an array implementation of the DVM algorithm followed by the bidiagonal and diagonal matrices, the complexity of the digital DVM is significantly reduced.By applying this algorithm, the digital DVM becomes more efficient and optimized.Consequently, the algorithm yields a reduction in circuit complexity for multi-beam digital wideband beamforming systems that implement Thiran fractional delays.We implemented the DVM algorithm to effectively demonstrate the phase and group delay responses of Thiran filters that possess fractional sampling period delays of both the third and fourth orders.These filters can be integrated into the Xilinx FPGA RF-SoC technologies.To validate the feasibility of implementing the DVM algorithm, we utilized signal flow graphs as the basis for realizing digital application-specific integrated circuits (ASICs).

Figure 1 .
Figure 1.(a) Narrowband phased-array single-beam beamforming structure; (b) narrowband DFTbased multi-beam phased-array beamforming structure corresponding to a matrix-vector operation using the DFT matrix, in turn achieved via FFT based on the self-contained factorization of DFT matrix; (c) wideband single-beam TTD beamformer structure; (d) unsimplified wideband N-beam multi-beam TTD beamforming structure corresponding to a matrix-vector operation using the DVM.The proposed work shows how the above-mentioned architecture corresponds to a sparse factorization of DVM followed by a DVM algorithm that is realized as butterfly-like signal flow graphs (SFGs), where twiddle factors are realized in the discrete-time domain by the proposed twiddle filters based on a Thiran filter building block.

Figure 2 .
Figure 2. Overview architecture.RF antenna array used for optimal electromagnetic sensing of wideband plane waves based on their DOAs.(Left) N-beam TTD receiver aperture using DVM beamforming, and, (right) N-beam TTD transmit aperture using DVM beamforming.

9 Figure 3 .
Figure 3. Phase responses for order n i = 3 and delay variations D i = 0.1-0.9,where D i = τ i /T s .With temporal over-sampling of 3×, the usable range of the Thiran filter response is contained within the normalized −0.33 to 0.33 frequency band.The linear phase response establishes the role of Thiran filters in realizing fractional sample delays.

Figure 4 .
Figure 4. Group delay profiles for order n i = 3 and delay variations D i = 0.1-0.9,where D i = τ i /T s .Constant group delay can be observed over 50% of the Nyquist interval.With temporal over-sampling of 3×, the usable range of the Thiran filter response is contained within the normalized −0.33 to 0.33 frequency band.The constant group delay establishes the role of Thiran filters for realizing fractional sample delays.

9 Figure 5 .
Figure 5. Phase responses for order n i = 4 and delay variations D i = 0.1-0.9,whereD i = τ i /T s .With temporal over-sampling of 3×, the usable range of the Thiran filter response is contained within the normalized −0.33 to 0.33 frequency band.The linear phase response establishes the role of Thiran filters in realizing fractional sample delays.

9 Figure 6 .
Figure 6.Group delay profiles for order n i = 4 and delay variations D i = 0.1-0.9.Constant group delay can be observed over 50% of the Nyquist interval.With temporal over-sampling of 3×, the usable range of the Thiran filter response is contained within the normalized −0.33 to 0.33 frequency band.The constant group delay establishes the role of Thiran filters in realizing fractional sample delays.

Figure 8 .Figure 9 .
Figure 8. SFG for the N = 4-beam wideband beamformer using the DVM factorization followed by the ddvm algorithm, having the fractional delays ψ(z) associated with τ 0 .

Figure 10 .
Figure 10.ϕ k (z) function architectures for the order N = 4 filter.It showcases the reuse of the ψ(z) Thiran block as a fundamental building block for creating these specialized filters.

Figure 11 .
Figure11.SFG for the N = 8−beam wideband beamformer using the DVM factorization followed by the ddvm algorithm having the fractional delays ψ(z) associated with τ 0 .

Figure 12 .
Figure 12. ϕ k (z) function architectures for the order N = 8 filter.The twiddle filters are realized using the ψ(z) Thiran filter hardware core as a building block.

Figure 13 .
Figure 13.(Left) The direct form II computer architecture for realizing the Thiran filter block realizing ψ(z) for the order n i = 3. (Right) The direct form II computer architecture for realizing the Thiran filter block realizing ψ(z) for order n i = 4.The digital FPGA design uses the Xilinx System Generator FPGA blockset, which is a plug-in to Matlab/Simulink that provides a software interface to the Xilinx Vivado toolset.