Direct Wideband Coherent Localization by Distributed Antenna Arrays

We address wideband direct coherent localization of a radio transmitter by a distributed antenna array in a multipath scenario with spatially-coherent line-of-sight (LoS) signal components. Such a signal scenario is realistic in small cells, especially indoors in the mmWave range. The system model considers collocated time and phase synchronized receiving front-ends with antennas distributed in 3D space at known locations connected to the front-ends via calibrated coaxial cables or analog radio frequency over fiber links. The signal model assumes spherical wavefronts. We propose two ML-type algorithms (for known and unknown transmitter waveforms) and a subspace-based SCM-MUSIC algorithm for wideband direct coherent position estimation. We demonstrate the performance of the methods by Monte Carlo simulations. The results show that even in multipath environments, it is possible to achieve localization accuracy that is much better (by two to three orders of magnitude) than the carrier wavelength. They also suggest that the methods that do not exploit knowledge of the waveform have mean-squared errors approaching the Cramér–Rao bound.


Introduction
Localization in cellular systems has been attracting much interest in the research community for many years. The key challenge of localization is to improve the accuracy from hundreds of meters in the second generation (2G) (a list of abbreviations can be found at the end of the paper) to 1 cm or better in 5G [1]. If one can accomplish this in indoor environments, location-based services (LBS) will be improved considerably. Localization can also be used for LAC (location-aided communication), where one takes advantage of the information about terminal locations in different ways to improve the communication links [2]. This paper focuses on methods that achieve accuracy much better than the carrier wavelength, which can be used both for LAC and LBS.
The classical two-step localization methods, based on measuring received signal strength (RSS), time of arrival (ToA), time difference of arrival (TDoA), and angle of arrival (AoA), as well as the one-step method called direct position determination (DPD) [3,4] provide root-mean-squared errors (RMSE) that are greater than the carrier wavelength. The reason for this relatively high inaccuracy is that these methods are adapted to environments/systems where phase differences between distant antennas either carry no transmitter location information (the full spatial coherence across the entire distributed receiving system is not assumed) or that this information is simply ignored. algorithm. We discuss the results of Monte Carlo simulations in Section 4 and provide concluding remarks in Section 5. The ML algorithms are derived in the Appendix A.

System Model
A distributed receiving antenna array that performs the localization has M stationary elements/channels, Rx m , m ∈ {1, 2 . . . M}, where the antenna Rx m is at a known position r m = (x m , y m , z m ); see Figure 1. An active stationary transmitter, Tx, is at an unknown position r = (x, y, z). All receive and transmit antennas are arbitrarily distributed in the 3D area of interest for which we assume spatial coherence of the LoS components across the entire distributed receiving array. The coherence can be established if the carrier phase difference between any two of the receiving antennas, Rx m and Rx l , can be obtained from the difference in the distances between them and the Tx. In mathematical terms, we have: where ϕ cm and ϕ cl are the carrier phases of the m th and l th receiving antennas, respectively; ν c is the carrier frequency in Hz; d m is the distance between Tx and Rx m antennas, d m = d m , d m = r − r m , where · is the Euclidean norm;c is the speed of propagation (c = 3 · 10 8 m/s). The distributed receiving antennas are connected to M collocated receiving channels (front-ends) by calibrated cables (coaxial or fiber optic) for two reasons. The first is that fine time, frequency, and phase synchronization between the receiving channels is easier to achieve with collocated than with distributed front-ends. The second reason it that a centralized (specifically, direct or one-step) localization algorithm is a natural choice in this setup. All antennas act as "isotropic", and their polarizations are aligned. Deviations from this in practice can be taken into account if there are additional attenuations and phase shifts and whose values are determined by the antenna design. The positions of the receiving antennas, r m , are known in the sense that the antennas must be positioned with accuracy greater than the localization accuracy itself (with an error much smaller than the carrier wavelength). In the proposed system model, in practice, some antennas could be used for localization and the others for down-link communication with Tx.

Signal Model
The Tx transmits an RF signal (approximately) band-limited to (ν c − B/2, ν c + B/2), where B is the signal bandwidth in Hz, i.e., the complex form of the transmitted sequence/waveform is band-limited to (−B/2, B/2). The signal received at the Rx m antenna, m ∈ {1, 2 . . . M}, which is an attenuated and delayed version of the transmitted signal, after down-conversion and A/D conversion at the Nyquist rate ν s = B, is represented in a complex form u m (n) as: where n denotes discretized time normalized with 1/ν s , i.e., n ∈ {0, 1 . . . N − 1}; a m is a real-valued attenuation factor for the LoS component; ω c = 2π f c , f c = ν c /B are the normalized angular and natural carrier frequencies, respectively; the unknown value t 0 models the fact that the Tx is not time synchronized with the Rx system (we always use the Rx time axis); the normalized propagation delays from the Tx to the antenna Rx m are τ m = d m /c, where the normalized propagation speed is c =c/ν s ; u NLoSm (n) denotes all of the multipath components of the Tx signal except the LoS; η m (n) is the noise; a m,l is an unknown complex-valued attenuation factor of the l th component received by the m th Rx channel, and τ m,l is an appropriate time delay; the acquired samples are u m (n). Note that t 0 , d m /c, and τ m,l are fractional dimensionless values (the propagation time delays do not have to be integer multiples of the sampling interval) and that the sequence/waveform s(·) is defined in the entire (continuous) range R since it has been transformed into a continuous waveform inside the Tx.
A signal model on a large antenna array can be frequency-and spatial-wideband, and the effects of these phenomena are discussed in detail in [21]. The model has to be spatially wideband, if the propagation time across the aperture of the receiving array is greater than the inverse of the signal bandwidth, 1/B. It models envelope shifts between the signals in the receiving channels in addition to carrier phase shifts (Strictly speaking, the signals have to be modeled as spatially wideband, if the approximation s (n − t 0 − τ m ) ≈ s (n − t 0 − τ l ) does not hold for some m, l ∈ {1, 2 . . . M}. The signal is spatially narrowband if s (n − t 0 − τ m ) ≈ s (n − t 0 − τ l ), (∀m, l ∈ {1, 2 . . . M}).). It is well known that, if the signal model is spatially wideband, then the localization methods have to be formulated in the spectral domain. Our signal model does not limit the signal bandwidth, so, in the context of [21], our model is both frequency-wideband and spatial-wideband (Consider a numerical example in the mmWave band. If a signal with bandwidth B = 100 MHz at a carrier frequency ν c = 60 GHz is transmitted in a conference room equipped with an antenna array with an aperture of, say, 6 m, then the propagation time across the array would be two sampling intervals. This shows the importance of wideband modeling.).
We scale the signal u m (n) in a preprocessing step by 1/a m to obtain u m (n). We note that the signal-to-noise ratios (SNRs) and noise powers are considered known to the receivers. Thereby, the useful signals in all the receiving channels have the same power. Then, we have: where u NLoSm (n) = u NLoSm (n)/a m , η m (n) = η m (n)/a m , η m (n) ∼ CN (0, σ 2 m ) is a circular-complex zero-mean Gaussian random process (the noise) with variance σ 2 m , independent across time samples and channels and independent of the useful signal s(t). The variabilities of the propagation attenuations are modeled by σ 2 m , but the model considers that no information about the transmitter location is contained in the attenuations. We assumed this because, in practice, location estimation based on phases is more robust than estimation based on attenuations. Additionally, this reduces the number of dimensions of the search grid in the proposed algorithms. However, the algorithms use the SNRs for channel weighting. We define the SNR in channel m as SNR m = 1/(Nσ 2 m ) ∑ N−1 n=0 |s(n)| 2 . We implement the time shifting of an RF signal by performing operations in the discrete Fourier transform (DFT) domain on its complex envelope, and therefore, strictly speaking, the signal has to be periodic with period N (the length of the observation interval). However, this does not have to be the case if the error created by a cyclic time shift, compared to the regular time shift, is negligible, which is important for the random sequence scenario.
If the (h, q) th entry of a matrix A is A h,q , define the (h, q) th entry of exp (A) as exp(A h,q ). Furthermore, for any vector v, let Diag {v} be the diagonal matrix whose elements on the main diagonal are the elements of v. Then, the discrete-time matrix form of the signal model (3) is: (·) and (·) H denote transpose and conjugate transpose, respectively. In the known signal scenario, the conditions that the continuous waveform s(t) is band-limited to (−1/2, 1/2) and that it is periodic with period N imply that, given the samples in s, the signal in the entire (continuous) range t ∈ R is In the random signal scenario, s(n), n ∈ Z, is a circular-complex zero-mean Gaussian random process with variance σ 2 s , so that SNR m = σ 2 s /σ 2 m . The goal is to estimate r given the observations u m , m ∈ {1, 2 . . . M}.

Maximum Likelihood Algorithms
For simplicity, we derive the ML algorithms for u NLoSm (n) = 0. We later show that they are applicable in multipath propagation conditions as well. If Q m = F H D t 0 +d m /c F and · F is the Frobenius norm, then the signal in channel m is u m = Q m s + η m and the log likelihood function (without the additive constant) multiplied by −1 becomes: where s and η m are defined in Section 2. According to the ML method, the estimate of the vector of unknowns, α = (t 0 , x, y, z), is the one that minimizes f LL . We formulate two algorithms. ML algorithm for a known sequence (ML-KS): If the sequence s is considered known, the estimate of α is the one that maximizes the criterion function: One possible implementation of the ML-KS is given in Algorithm 1. Reliance on the knowledge of the waveform requires that the local carrier is coupled with the A/D (or D/A) converter clock in each receiving and transmitting channel.
ML algorithm for an unknown sequence (ML-US): If s is considered unknown, then the estimate of α = (x, y, z) (without loss of generality, we can set t 0 = 0 in this case) is the one that maximizes the criterion function: The derivation details are provided in the Appendix A and an implementation in Algorithm 2.
The criterion function f US2 can also be written in the following matrix form. The weighted signals in the DFT domain are given by U m = w m Fu m , where the weighting coefficients are w m = 1/σ 2 m , m ∈ {1, 2, . . . , M}. Let the signals compensated for the propagation delays for a hypothetical Tx location in the search grid, r, be U m,comp = D τ CM −d m /c U m , where τ CM is arbitrarily chosen and constant across the channels. To reduce numerical errors, we define τ CM as the mean (center of mass) of d m /c, m ∈ {1, 2, . . . , M}. If U comp = U 1,comp , U 2,comp , . . . , U M,comp , the steered covariance matrix can be written as where v is a column vector composed of elements equal to one.
Note that the ML-US algorithm cannot estimate t 0 , since this is only possible if it knows the sequence. This, however, reduces the number of dimensions in the search grid by one. The product Fu m is the DFT spectrum of the signal in channel m, and Fu m /σ 2 m can be calculated only once per channel during the preprocessing to optimize the algorithm numerically. If an algorithm does not rely on the knowledge of the waveform, the requirements for the transmitter hardware are relaxed, so that any off-the-shelf transmitter can be localized. Furthermore, the applications of this type of localization also include non-cooperative cases. Calculate D t 0 (j) according to (6) 5. end for 6. for each Rx antenna, m = 1 to M do 7.
Fu m 8. end for 9. for each gridpoint along the spatial dimensions of the grid, i = 1 to N p do 10. for each Rx antenna, m = 1 to M do 11.
Calculate D d m /c according to (6) 13. end for 14. for each gridpoint along the t 0 dimension of the grid, j = 1 to N t 0 do 15. Calculate the criterion function end for 17. end for 18.
Fu m 4. end for 5. for each gridpoint along the spatial dimensions of the grid, i = 1 to N p do 6. for each Rx antenna, m = 1 to M do 7.
Calculate D d m /c according to (6) 9. end for 10. Calculate the criterion function d m /c were used to optimize the algorithms numerically. Furthermore, note that S = Fs is equivalent to the command S = fftshift(fft(s))/sqrt(N) in MATLAB.

Steered Covariance Matrix-Based MUSIC-Type Algorithm
A method for AoA estimation of wideband acoustic signals based on a steered covariance matrix approach was proposed in [23]. The method is more efficient for short observation intervals than the spectral focusing AoA approach proposed in [24]. A method for direct position estimation of baseband UWB signals was proposed in [25]. It is based on the steered covariance matrix approach as well.
In this paper, we generalize the steered covariance matrix approach for direct position estimation of wideband bandpass radio signals by introducing the SCM-MUSIC algorithm. This algorithm uses preprocessed signals defined by U m = w m Fu m , where w m = 1/σ m . This means that, after this step, the noise powers are equal across the channels. Starting from U m , the steered covariance matrix, R ( r), is obtained in the same way as by the ML-US algorithm. The noise subspace matrix E n ( r) ∈ C M×(M−1) is formed from R ( r) in the same way as in [23,26]. Namely, it is a block matrix, where v em is the norm-one eigenvector of R ( r) corresponding to its eigenvalue λ m , where λ 1 ≥ λ 2 ≥ . . . ≥ λ M . The matrix has M − 1 columns when there is a single active transmitter. The estimated location is the one that maximizes the criterion function: where v = [1/σ 1 , 1/σ 2 , . . . , 1/σ M ] . Algorithm 3 provides an implementation of the SCM-MUSIC. Even though the criterion function (12) has a similar form as the original MUSIC in [26], the differences of the SCM-MUSIC algorithm relative to the ones in the literature are as follows. In our approach, the unknown parameters (the location of the Tx) are contained in the covariance matrix (this is the steered covariance matrix approach), whereas in [5,26], they were contained in the steering vector, which limits the application to narrowband signals. We model bandpass signals, unlike [23,25], where the signals are in the baseband ( [23] is for acoustic signals), and our antenna array is distributed, unlike [23,26]. Instead of changing the attenuations over the search grid, we used channel weights based on the received SNRs, unlike [5,25]. U m = w m Fu m 6. end for 7. for each gridpoint along the spatial dimensions of the grid, i = 1 to N p do 8. for each Rx antenna, m = 1 to M do 9.

13.
Form the noise subspace matrix Calculate the criterion function Note that v em is the norm-one eigenvector of R corresponding to its eigenvalue λ m , where

Noncoherent Algorithms
The algorithms proposed in this section are computationally intensive when applied to the entire distributed array, and so, we can adopt a suboptimal method to narrow the search grid first. We applied the idea used in [3] to define the maximum covariance matrix eigenvalue (MCME) algorithm. Its criterion function is equal to the largest eigenvalue of the covariance matrix R ( r) defined for the previous algorithms, as opposed to the criterion function in (11), which is equal to the sum of the elements of R ( r). The MCME algorithm does not use the spatial coherence between the signals received by the distributed antenna array.

Numerical Complexity of the Algorithms
We now address the number of operations, NO, of the proposed algorithms. We write: where NO PP is the number of operations in the preprocessing step, A/Q is the number of points in the search grid, and NO GP is the number of operations for a single grid point. The size of the area that the search grid spans is A, which is either a surface for 2D or a volume for 3D localization. The resolution cell of the grid is Q = ∆x∆y or Q = ∆x∆y∆z (2D or 3D). A coherent algorithm requires a very fine grid, so that ∆x, ∆y, and ∆z are in the order of λ c /1000, whereas a noncoherent algorithm can work with a much coarser grid, where ∆x, ∆y, and ∆z are in the order ofc/B/1000. Thanks to the FFT (fast Fourier transform) algorithm, we have that NO PP = M × ((1 + N/2 log 2 N) m C + N log 2 Na C ), where m C is a single complex multiplication and a C is a single complex addition. The number of operations in the preprocessing step for coherent methods is expected to be negligible compared to the search over the grid, because of a large number of grid points. Further, we have that each grid point requires: operations, where NO MD is the number of operations for the calculation of the main diagonal of the matrix D d m /c , and NO AS is an algorithm-specific value, which for the ML-KS algorithm is: where N t 0 is the number of points along the t 0 dimension of the search grid, for the ML-US algorithm is: and for the SCM-MUSIC algorithm, based on the numerical complexity in [27], is: Note that the two extreme cases for the matrix D d m /c are to either calculate it each time or to read it from a memory, which has been filled in advance with values calculated offline for the entire grid. There are also possibilities in between these two extreme cases. Furthermore, note that, when the signals are narrowband, in the limit, multiplying a signal vector by D τ reduces to multiplying the vector by a simple scalar exp (−jω c τ). In the general wideband case and when the matrix D d m /c is calculated each time, we have that NO MD = Nm C + Ne C , where e C is a single (scalar) complex exponential function.
All in all, the approximate numerical complexity of the ML-KS, ML-US, and SCM-MUSIC algorithms, respectively, is given by: NO ML-US ≈M log 2 N (N/2m C + Na C ) NO SCM-MUSIC ≈M log 2 N (N/2m C + Na C ) Table 1 shows approximate values of the numerical complexity of the proposed methods for a specific example with parameters M = 5, N = 64, ν c = 60 GHz, i.e., λ c = 5 mm; the resolution cell along the t 0 dimension of the grid (for ML-KS) is 1/ (1000ν c ), and the span of these t 0 values is the propagation time across an array aperture of 6 m (an indoor scenario). The numerical complexity of the preprocessing step, which is executed before the grid search (thus, it is independent of the number of points on the grid), is 960 multiplications and 1920 additions. The table shows the numerical complexity of the search for a single grid point, and as expected, the preprocessing step has negligible numerical complexity because the number of points in the search grid is typically large. The table also shows that ML-KS has a much higher numerical complexity than ML-US because of the search for t 0 . Further, note that SCM-MUSIC has a higher complexity than ML-US, but their complexities are still comparable. We can reduce the numerical complexity of ML-KS in the following way. Since t 0 is expected to be much larger than the propagation times, d m /c, we can use the procedure explained in [20] (Equations (20)- (23) and the explanations provided with them) to find the total delay (t 0 + d m /c) of the known sequence relative to the beginning of the observation interval by performing a 1D search for each Rx channel. In this way, we can reduce the span of possible t 0 values from [0, N) (or [−N/2, N/2)) to an interval as short as the propagation time across the aperture of the receiving array.
We point out that there are other more sophisticated methods for numerical optimization of the cost function and, thus, for localization. One such iterative method, which combines the gradient-descent and particle filters, efficiently searches for the maximum of an ML criterion function, but it does not guarantee finding the global maximum [28]. Investigating optimization methods for maximizing the likelihood functions from Section 3.1 is outside the scope of the paper.
Since a coherent algorithm requires a very fine grid, evaluating its criterion function over the entire area A of interest may be prohibitively complex. Using the mentioned idea to narrow down the search grid first by a noncoherent algorithm, we get a reduced number of operations, NO RDC , as: where Q NCOH and Q COH are the resolution cells for the noncoherent and coherent algorithms, respectively, NO GP,NCOH and NO GP,COH are the appropriate numbers of operations for each grid point, and A RDC is, for example, a 0.95-confidence interval around the noncoherent location estimate. A coarse grid spans the entire area A, whereas a fine grid spans a much smaller grid A RDC .

Simulations
This section presents numerical results of Monte Carlo simulations performed using the algorithms explained in Section 3 and three distributed receiver antenna array geometries. Throughout this section, we assume that the power of the LoS components decreases with the squared distance from the transmitter. The carrier frequency is ν c = 60 GHz.

Experimental Setup and Criterion Functions
Geometry G 5 consists of five antennas in the horizontal plane z = 0 (see Figure 2    The criterion function of the MCME method is shown in Figure 2 for G 5 , T 1 = (0, 0, 0), SNR 0 = 10 dB (which means the SNRs in the channels were approximately 1 dB). This particular criterion function is not influenced by carrier phases, has no side lobes, is immune to phase synchronization errors, and varies slowly across space, and so, a coarse grid can be used. This method offers relatively low accuracy (comparable toc/ν s ). It is useful for selecting a smaller and finer grid for the other criterion functions in this paper to reduce their computational complexity. Figures 3 and 4 show the criterion functions for ML-US and SCM-MUSIC, for G 5 , SNR 0 = 10 dB, over a small area around the Tx at two characteristic locations, T 1 , randomly chosen near the "center", and T 2 = (−1.5, 1.5, 0), randomly chosen near the edge of the array, respectively. The main lobes (the lobes where the true locations are) are marked in these figures. The other lobes are side lobes, and they represent the ambiguity problem, which is inherent to coherent methods. The distance between adjacent lobes is in the order of the carrier wavelength, λ c . These two algorithms have lobes at the same locations; however, the SCM-MUSIC criterion function contains narrower lobes and lower side lobes. Note that the array is relatively inexpensive in terms of hardware and signal processing, but it suffers from the ambiguity problem, which can be seen in Figure 4a; the estimated location (denoted by the white square) is at one of the side lobes in this particular Monte Carlo run. On the other hand, the estimates in Figures 3a,b and 4b were within the main lobe. If a side lobe has approximately the same magnitude as the main lobe, we call it a grating lobe. There is a non-negligible probability that an estimate would be within a grating lobe, as a result of the ambiguity problem. Figures 3 and 4 show the criterion functions over small areas (the criterion functions are zoomed-in) so that the lobe structure can be seen. Note that the lobe structure is determined by the geometry of the receiving array.
If the main lobe is at a point A, a grating lobe appears at any other point B that has the same relative (between the Rx antennas) propagation distances modulo λ c as a point A. More formally, if d Am is the distance between A and the Rx m antenna and similarly for d Bm , then a grating lobe appears at B if (d Am − d A1 − d Bm + d B1 ) /λ c ∈ Z (∀m). As a result, the side lobe patterns are different for T 1 and T 2 , as can be seen in Figures 3 and 4.
The geometry G 10 consists of five antenna subarrays of two antennas separated by The aperture of this geometry is large and so is suitable for outdoor environments. We applied coherent methods to each subarray and then summed the criterion functions of each subarray. As we did not require the subarrays to be phase synchronized with each other, we called the resulting method a semicoherent method. Figure 5 shows the subarray criterion functions in (a-e) and their sum for the ML-US algorithm in (f). Similar results, but for the SCM-MUSIC algorithm, are shown in Figure 6. The criterion functions of these methods have no side lobes; they are not influenced by carrier phases or phase synchronization errors between different subarrays; and they vary relatively slowly across space. Consequently, we can use coarse grids in exploring them. The accuracy of the semicoherent variant is comparable to that of the DPD, [3] and is lower than the accuracy of coherent localization. Despite having lower accuracy, semicoherent localization methods can be used to solve the ambiguity problem and to optimize the search grid (to reduce numerical complexity) of the coherent methods (as previously explained for noncoherent methods). Note that the criterion functions of subarrays with collocated antennas show that the localization methods act as AoA estimation methods.  The metric we used to evaluate the accuracy of the algorithms was the mean squared miss distance (the distance between the estimated location and the true location of the transmitter), i.e., MSE. We also used the RMSE. Figure 7 shows the summary distribution of the SNRs at antennas of the array G 5 for all simulated transmitter locations (Tx grid points) for SNR 0 = 10 dB. The figure shows that the actual SNRs in most of the channels were between −5 and 10 dB.

Results
The contour plots in Figures 8-10 were generated over Tx grids of uniformly-spaced points in the plane z = 0. For every Tx grid point, we performed 8192 Monte Carlo runs. The parameters were B = 100 MHz, SNR 0 = 10 dB, and N = 64. The simulations were carried out using realizations of a random Gaussian process or a known deterministic sequence, the first of the modulatable orthogonal sequences proposed in [29] for a given N. However, the results for the deterministic sequence scenario are not shown for brevity, as they were very similar to those for the random Gaussian signal scenario. Figures 8 and 9 show the results for the ML-US algorithm over a Tx grid of 64 × 64 points that covers most of the area inside the array. Figure 8 shows the RMSE (RMS error or the RMS of the distance between the estimated and the true location) values normalized by the carrier wavelength, λ c , for the random Gaussian signal scenario. For this relatively low SNR and fairly short sequence length, an accuracy of two orders of magnitude better than λ c can be achieved. As expected, the accuracy was higher in the inner part of the array aperture and lower near the edge of the aperture. Nevertheless, the accuracy was much better than the carrier wavelength in the entire area in Figure 8. Figure 9 shows the statistical efficiency of the algorithm (the MSE-to-CRB ratio; CRB, Cramér-Rao bound) for the random Gaussian signal scenario. We observed that the efficiency did not depend on the Tx location on the grid and that it was close to one. The figure shows a slight variation over space in the interval [0.995, 1.04], as can be seen from the color bar, but this was due to the randomness in the simulations. The results for the SCM-MUSIC algorithm were very similar to those of the ML-US algorithm.    Figure 10 shows the RMSE values relative to λ c , for the random Gaussian signal scenario for the MCME algorithm, over a 23 × 23 grid that covers a smaller area compared to the one used for the ML and SCM-MUSIC algorithms. This is because MCME starts to behave as an AoA estimation algorithm for some Tx locations near the edge of the array. In that case, the miss distance is not an appropriate metric, and so, we chose not to evaluate it in those areas. The RMSE values inside the array were much greater than for the previous algorithms (they exceeded 27λ c inside the array), because this algorithm did not use the information contained in the carrier phase. However, we can use this criterion function to perform computations over a coarse grid before exploring another criterion function over a finer grid, as explained at the end of the numerical complexity Section 3.4. This accuracy was around 0.04c/ν s , which was similar to the accuracy of the conventional methods (those that did not use the carrier phase differences between the received signals).
The statistical efficiency of the ML-US, averaged over a grid of 32 × 32 uniformly-distributed points covering an area similar to the one in Figure 8, versus SNR 0 , is shown in Figure 11. The curves are shown for different combinations of sequence length N ∈ {64, 256, 1024} and normalized frequency f c ∈ {600, 3000, 12000}, i.e., B ∈ {100, 20, 5} MHz, for both deterministic and random Gaussian signals. For the ML-US algorithm, the statistical efficiency for the random Gaussian scenario was close to one in the observed SNR range, whereas for the deterministic scenario, it started to diverge at lower SNRs in the given range. The results for the SCM-MUSIC algorithm are not shown because its statistical efficiency was similar to that of the ML-US in the observed SNR range, and in the worst case, it was by about 15% higher.  Figure 12 shows the performance of the algorithms for f c = 600, SNR 0 = 10 dB, and different sequence lengths, averaged over the same area as for Figure 11. The curves do not feature a pronounced threshold. This means that even short sequences can be used for localization, which enables spectrum sensing to have a quick response. We finally point out that in the respective experiments, we worked with other Rx configurations that surrounded the Tx by placing the Rx antennas at different locations and obtained results that were no different than the ones already presented.

Experiments with Subarrays of Antennas
As the previous results showed, G 5 suffered from the ambiguity problem (high side lobes), so we introduced a distributed massive-MIMO-like geometry, suited for installation and localization in a single room. Geometry G 90 is a 90-element array formed by replacing every antenna in G 5 with a subarray of an 18-element acoustic camera [30] geometry scaled by a factor of 17/3. Each subarray was rotated so that it was in a vertical plane, displayed by a rectangle in Figure 13, and its broadside direction pointed to the center of the total array. According to the system model, the signals from all the G 90 antennas were processed as if they were a single large array; however, grouping antennas into subarrays can simplify the mechanical design and the installation of the system. We give results for Tx at T 1 = (0, 0, 0) and different scenarios regarding the presence/absence of multipath and interfering signals. The parameters of the LoS component were B = 100 MHz, SNR 0 = 10 dB, and N = 64. The statistical results were averaged over 8192 simulation runs. All of the magnitudes in the figures showing the results are in (dB). Similarly to G 5 , SNR 0 = 10 dB means that the SNRs in the channels were around 1 dB. Figure 13. The Rx array geometry G 90 . Figure 14 shows the criterion functions in a single-user LoS-only scenario for both the ML-US and SCM-MUSIC algorithms. The ML-US side lobes were 8 dB below the main lobe, whereas the SCM-MUSIC side lobes were 33 dB below the main one. We used this scenario as a baseline to compare the results with multipath propagation and multiple transmitters against it.  Figures 15 and 16 show the results for scenarios with a small cluster of scatterers and an ideal reflector, respectively, positioned at (0.025, −0.025, 0) (denoted by a diamond). We modeled the cluster of scatterers as a point scatterer with randomized constant phase terms in the reflected (scattered) signal components. The power of the reflected signal components was pessimistically modeled to be inversely proportional to the squared sum of the distances the signal traveled before and after reflection. The only difference was that the carrier phases were modeled as random (uniformly distributed on [0, 2π]) and deterministic, respectively. In the former scenario (Figure 15), the algorithms did not detect the reflector because the carrier phase of the reflected component was randomized. However, the waveform of the NLoS component was correlated with the useful (LoS) component, so the effect was an increase in side lobe magnitudes. Even though the shape of the criterion function was (qualitatively) similar to the one without multipath propagation as in Figure 14, the suppression of the side lobes was lower; it was roughly 6 dB and 4 dB for the ML-US and SCM-MUSIC algorithms respectively, instead of 8 dB and 33 dB, as in the LoS-only scenario in Figure 14. In the latter scenario (Figure 16), there was no notable increase in side lobes, but the reflector was detected as a separate source. SCM-MUSIC requires knowledge of the number of "sources", so it is assumed known in the simulations. These setups represent extreme cases, and one may expect something in between in practice.   Table 2 shows the statistical performance of the algorithms for these experiments, but since we wanted to emphasize the effects of multipath components, we increased SNR 0 to 20 dB. The results are presented for different power ratios between the NLoS components and the LoS component, denoted δ NLoS . For the sake of comparison, the results for the setting with only the LoS component are shown. The algorithms behaved similarly in these scenarios as their performance deteriorated due to multipath, but the RMSE was still well bellow λ c . For NLoS levels of 10 dB or more below the LoS component, which are realistic for the indoor environments in the mmWave, according to [11], the RMSE was by two orders of magnitude smaller than λ c . This accuracy is important, because it enables the base station to focus energy to the Tx for down-link communication after the Tx is localized. Additionally, we generated the results for different scatterer/reflector positions inside the array (farther away from the Tx), as well as for N = 512, but the results were only slightly different. Table 2 shows that the localization methods proposed in the paper could achieve an accuracy much better than the carrier wavelength. On the other hand, the existing two-step localization methods, such as energy-based methods applied in mmWave massive MIMO systems, have drastically lower accuracies: the error is higher than the carrier wavelength, e.g., it is in the order of a meter [1,31] and 8 mm, [32]. However, the energy-based methods are easier to implement than coherent methods, especially in terms of synchronization (they can work with very rough synchronization).

Experiments with Two Transmitters
We performed also experiments with two transmitters with equal transmit powers and located at (0.025, −0.025, 0), thus separated by 7.07λ c . Figure 17 shows the results, and they suggest that both algorithms distinguish the transmitters. Note that the side lobe suppression is approximately the same as in the LoS-only scenario in Figure 14. In Figure 18, we plot the results of an experiment where the other user is at a (0.008, −0.008, 0) (2.26λ c distance) and with a 10 dB greater transmit power. ML-US could not estimate the position of the first Tx because its main lobe was lost in the side lobes corresponding to the interferer. SCM-MUSIC, however, localized the two sources successfully.   Table 3 shows the statistical performance of the algorithms when the interferer was located at (0.025, −0.025, 0) for different power ratios between the interferer and the Tx, denoted δ IF . We used SNR 0 = 20 dB for the same reason as in the multipath scenarios. For δ IF ≤ 0 dB, both algorithms performed well, with SCM-MUSIC performing almost as if the interfering signal were not present. For δ IF ≥ 10 dB, ML-US failed to localize the Tx (see Figure 18), whereas SCM-MUSIC performed well even when it was 30 dB below the interferer. This is one advantage of SCM-MUSIC over ML-US, which justified its use despite the fact that ML-US was less complex. In other words, the SCM-MUSIC method was more robust in conditions with large differences in transmitted power between different transmitters. Table 3. RMSEs for scenarios with an interferer and SNR 0 = 20 dB.

ML-US SCM-MUSIC
As an alternative to G 90 , we introduced geometry G 18 , which also suppresses side lobes, but is less expensive, both in terms of hardware and processing cost. Geometry G 18 is the acoustic camera geometry [30] scaled by a factor of 12, placed in the plane at z = 0 (see Figure 19a). Figure 19b-d shows the results for G 18 and a source at T 3 = (0.152, 0.033,-1.2), which corresponds to a setup when the array is attached to the ceiling of a room and there is a transmitter 1.2 m below. The SNR 0 was set to 12 dB, which entailed that the SNRs in the channels were between 4 and 10 dB. The search grid was in the plane z = −1.2 m, which contained the true location of the transmitter. The CRB for the Tx location was 2.93 × 10 −9 m 2 , whose square root corresponded to 0.011λ c . The criterion function side lobes were suppressed (at least 43 dB lower than the main lobe).
The proposed methods were based on a model that treats the signal wavefront as spherical (they do not use the planar wave assumption). We now show that they were not only able to estimate the AoA of the received signal, but also to localize the transmitter using a collocated massive antenna array. We performed simulations for a uniform rectangular array with 16 × 16 antennas spaced λ c /2 apart at ν c = 60 GHz, centered at (0, 0, 0) and lying in the yz-plane. The array aperture was then 7.5 cm. We simulated a transmitter on the x-axis at a distance of 1, 2, 4, and 8 apertures away from the array. The criterion functions for these four Tx locations are shown in Figure 20a-d, respectively. The RMSE values along the x-axis for 8192 Monte Carlo runs for these locations were 0.048λ c , 0.18λ c , 0.7λ c , and 2.8λ c , respectively. The RMSE values along the y-axis were 0.0042λ c , 0.008λ c , 0.016λ c , and 0.032λ c , respectively. We concluded that the accuracy along the y-axis was better than the carrier wavelength, whereas along the x-axis, it was worse, but localization was still possible, thanks to the curvature of the wavefront. Furthermore, the RMSE along the x-axis increased with the distance from the array. To make the comparison fair, we chose different SNR 0 for these Tx locations to make the actual SNRs in the receiving channels be around 4.5 dB for each of the locations.

Conclusions
In this paper, we analyzed wideband direct localization of a transmitter in a multipath scenario with spatially-coherent LoS signal components. The signal model in this paper was both spatial-wideband and frequency-wideband, unlike most of the models in the existing literature. The same goes for the proposed localization algorithms. The simulation results for the proposed algorithms showed that they were statistically efficient. Further, they showed that the LoS-only localization RMSE could be two to three orders of magnitude lower than the carrier wavelength (even as accurate as one thousandth of the carrier wavelength) for a reasonable number of receiving antennas and system parameters, such as SNRs, the bandwidth, and the number of samples (such as 16 samples). The simulations also showed that localization RMSEs are two orders of magnitude lower than the carrier wavelength even in multipath and multiuser scenarios. Furthermore, relatively high accuracy can be achieved even for low SNR conditions and short observation intervals. Overall, the proposed methods were still numerically complex, and their further numerical optimization is a part of future research. The SCM-MUSIC algorithm had higher numerical complexity, but it was more robust in conditions with large differences in transmitted signal power from different sources; and it worked even in some cases when the ML-US algorithm did not. If a base station can localize two antennas in LoS conditions as separate sources, it can send two independent data streams toward them in the same time-frequency resource, while reducing interference to other points in the area. These results suggest that the performance (throughput, interference cancellation) of future wireless cellular systems, especially distributed massive MIMO, can dramatically improve. The presented results indicated that such systems have a great potential for location-aided communication, which is not fully recognized in the existing literature.

Acknowledgments:
The authors greatly appreciate the anonymous reviewers and the Academic Editor for their careful comments and valuable suggestions to improve the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.