1. Introduction
Owing to its favorable balance between performance and cost, the hybrid antenna array is regarded as an excellent candidate for millimeter wave (mmWave) communication systems [1,2]. Typically, a hybrid array is composed of multiple analog subarrays with phase-controllable antenna elements. Depending on the topology of the subarrays, there are two conventional configurations: localized and interleaved arrays. As the localized array is easier for schematic design and hardware implementation, it is more suitable for building a massive array. Angle-of-arrival (AoA) acquisition of the incoming mmWave signal is of considerable importance for signal reception, since mmWave channels are dominated by line-of-sight (LOS) propagation. Its applications, ranging from localization and tracking to mmWave communication systems such as 5G mmWave cellular networks [2] and satellite-borne communications [1], have been increasingly studied in recent years.
AoA estimation using a localized array suffers from the phase ambiguity problem, which has been progressively studied in [3,4,5,6,7]. Each of these solutions leverages the cross-correlations between neighbouring subarrays to obtain an AoA estimate. Phase ambiguity stems from an undetermined integer multiple of $2\pi$ between $Nu$ and the argument of the cross-correlations, where $N$ is the number of antennas in a subarray, $u=\frac{2\pi}{\lambda}d\sin\theta$, $\theta$ is the elevation AoA, $\lambda$ is the wavelength, and $d$ is the spacing between adjacent antennas. With identical phase shifts deployed over all subarrays for constructive combination of the cross-correlations, the work in [4] proposed a differential beam search algorithm that goes through all possible beams and selects the $u$ with the largest output power. However, it incurs a long scanning period that increases linearly with the length of a subframe and with $N$. To avoid a long scanning period, the authors in [5] studied a phase shift configuration with different values in different subarrays, which eliminates the ambiguity by directly estimating $u$. Their idea is twofold: (1) $Nu$ is estimated by rectifying the signs of the cross-correlations and then combining them coherently; (2) after calibrating the subarray output signals with the estimated $Nu$, one takes their inverse discrete Fourier transform (IDFT) and calculates the correlations of the Fourier coefficients to uniquely recover $u$. The work in [3,6] further generalized the phase shift design in [5] to narrowband and wideband systems respectively, and revealed that the strongest cross-correlation takes the opposite sign from the remaining cross-correlations. With this finding and the resulting improvement in the calibration accuracy of the cross-correlations, the AoA can be estimated quickly and reliably even in low signal-to-noise ratio (SNR) regimes. In [7], an enhanced AoA estimation for a polarized mmWave signal was studied using a localized hybrid dual-polarized array, where polarization diversity combining was employed to improve the estimation of the phase offset between adjacent subarrays. Building on the cross-correlation based algorithm, a multi-AoA estimation scheme with a combiner design was proposed in [8], where the paths of different users are identified by exploiting the low correlation property of pseudo-random sequences.
With a digital array, MUSIC and ESPRIT [9] are the classical methods used in high-resolution AoA estimation. The work in [10,11] applied them to a localized array. Although accurate estimation can be achieved, the computational complexity incurred by singular value decomposition grows cubically with the total number of antennas [6], which makes these methods impractical for mmWave massive arrays. In [12], an auxiliary beam pair (ABP) design was proposed to provide high-resolution AoA estimation via amplitude comparison within each ABP. It, however, needs to scan all the directions of interest exhaustively, and its resolution is limited by the beamwidth and SNR. In [13], an AoA estimator based on the optimal sum and difference beamformers was constructed by exploiting the ratio of the difference pattern to the sum pattern with two overlapping subarrays, which can achieve the minimum estimation variance under Gaussian noise, regardless of any nulling performed. The work in [14] uses hierarchical search in a designed multi-resolution codebook to promptly identify one single multipath component (MPC) and thus the AoA. A compressed sensing based method was further investigated in [15] to find multiple MPCs, exploiting the sparse nature of mmWave channels. In these codebook-based methods, the beam needs to be narrowed down repeatedly according to the codebook, which incurs additional overhead.
In this paper, we propose a novel phase shift design to enable unambiguous AoA estimation using a localized array. Instead of generating multiple single beams as proposed in [3,4,5,6,7], a difference beam based phase shift configuration is designed to steer each subarray in two directions. By providing better coverage of the directions of interest, this effectively improves both the mean square error (MSE) of the $Nu$ estimate and the detection probability of the expected subarray index. Depending on whether the ratio $N/K$ is even or odd, where $K$ is the number of different phase shifts per symbol, two IDFT-and-correlation based estimation schemes are proposed to directly estimate the AoA. Simulation results show the effectiveness of the proposed approach in terms of estimation accuracy.
2. System Models
As illustrated in Figure 1, we consider a uniform linear localized array composed of $M$ subarrays, each with $N$ evenly spaced phase-tunable antenna elements. Assume that the arriving information-bearing signal $\tilde{s}(t)$ has wavelength $\lambda$ and elevation angle $\theta$. The received signals at the $m$th subarray ($m=0,...,M-1$) are combined after phase shifting, and the analog beamformed signal is then down-converted to baseband. After analog-to-digital conversion, the output signal is given by [6]
$$s_{m}(t)=\tilde{s}(t)P_{m}(u,t)e^{jmNu}+\xi_{m}(t), \qquad (1)$$
where $\xi_{m}(t)$ is the zero-mean additive white Gaussian noise (AWGN) at the output of the $m$th subarray with power $\sigma_{n}^{2}$; $P_{m}(u,t)$ is the radiation pattern of the $m$th subarray at time $t$ given by
$$P_{m}(u,t)=\sum_{n=0}^{N-1}\check{P}_{m}^{n}(u)e^{j[nu+\alpha_{m}^{n}(t)]}, \qquad (2)$$
where $\check{P}_{m}^{n}(u)$ denotes the radiation pattern of the $n$th antenna element ($n=0,...,N-1$) at the $m$th subarray. As in [5,6], we assume $\check{P}_{m}^{n}(u)=1$; $\alpha_{m}^{n}(t)$ represents the phase shift of the corresponding antenna element at time $t$, and $u=\frac{2\pi}{\lambda}d\sin\theta$.
Let $\rho_{m}(t)$ denote the cross-correlation between the output signals of the $m$th and ($m+1$)th subarrays, given by
$$\rho_{m}(t)=s_{m}^{*}(t)s_{m+1}(t)=|\tilde{s}(t)|^{2}P_{m}^{*}(u,t)P_{m+1}(u,t)e^{jNu}+z_{m}(t), \qquad (3)$$
where $(\cdot)^{*}$ and $|(\cdot)|$ represent the conjugate and absolute value of $(\cdot)$, respectively; $z_{m}(t)$ is approximated as an AWGN.
In [4], identical phase shifts are used in all the subarrays, i.e., for any $m$, the values of $\alpha_{m}^{0}(t)$, ..., $\alpha_{m}^{N-1}(t)$ form the same arithmetic progression, such that $Nu$ in (3) can be estimated by taking the argument of $\rho_{m}(t)$. However, since $Nu$ can lie outside the range $[-\pi,\pi)$, determining $u$ from the estimate of $Nu$ ($\widehat{Nu}$) leads to phase ambiguity, i.e., there are $2\lfloor N/2\rfloor+1$ possible estimates of $u$, given by $\widehat{u}(p)=\frac{2\pi p+\widehat{Nu}}{N}$, $p=-\lfloor N/2\rfloor,-\lfloor N/2\rfloor+1,...,\lfloor N/2\rfloor$, where $\lfloor\cdot\rfloor$ denotes the floor function. As a result, all possible directions need to be tested by applying a scanning beam within a long scanning frame in order to find the one with the largest signal power, which incurs excessive delays.
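The ambiguity described above can be illustrated numerically. The following is a minimal sketch, not part of the original work: the function names `wrap_phase` and `candidate_u` are hypothetical, and the sketch only enumerates the candidates $\widehat{u}(p)$ from a wrapped estimate $\widehat{Nu}$, showing that the true $u$ is one of $2\lfloor N/2\rfloor+1$ indistinguishable values.

```python
import cmath
import math


def wrap_phase(x):
    """Wrap an angle to the principal range (-pi, pi]."""
    return cmath.phase(cmath.exp(1j * x))


def candidate_u(Nu_hat, N):
    """All estimates u_hat(p) = (2*pi*p + Nu_hat) / N, p = -floor(N/2)..floor(N/2)."""
    half = N // 2
    return [(2.0 * math.pi * p + Nu_hat) / N for p in range(-half, half + 1)]


if __name__ == "__main__":
    N = 8            # antennas per subarray (illustrative value)
    u_true = 2.0     # illustrative true value of u
    Nu_hat = wrap_phase(N * u_true)   # what arg{rho_m(t)} can reveal at best
    for u_hat in candidate_u(Nu_hat, N):
        print(f"{u_hat:+.4f}")
```

Without further information, every printed value is equally consistent with the measured argument of the cross-correlation, which is why a beam scan over all candidates is needed in [4].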
3. Proposed AoA Estimation Approach
In this section, phase shifts providing difference beams are designed to facilitate the phase offset estimation between adjacent subarrays. Two AoA estimation schemes are proposed for direct AoA acquisition according to the value of $N/K$, where K is the number of different phase shifts for any symbol.
3.1. Phase Shift Design
Let the $n$th phase shift of the $m$th subarray at the $t$th ($t=0,...,T-1$) symbol be $\alpha_{m}^{n}(t)$, given by
$$\alpha_{m}^{n}(t)=\begin{cases}n\alpha_{m}(t), & n=0,...,N/2-1,\\ n\alpha_{m}(t)+\pi, & n=N/2,...,N-1,\end{cases} \qquad (4)$$
where $\alpha_{m}(t)=-2\pi\left(\mathrm{mod}\{m,K\}T+t\right)/L$, $\mathrm{mod}\{\cdot,\cdot\}$ represents the modulo operation, and thus $\mathrm{mod}\{m,K\}$ indicates that $\alpha_{m}^{n}(t)$ varies periodically every $K$ subarrays in one symbol; $K$ takes a value from $(2,M]$ and $N=QK$, where $N$ is assumed to be an even number and $Q$ is an integer; $T$ is the number of training symbols; $L=TK$ is the total number of different phase shifts used in the system. The setting given by (4) enables the array to scan $2L$ potential directions within $[-\pi,\pi)$ across $T$ symbols, ensuring that the AoA is acquired by at least one of the $L$ beams with high gain. Note that the mainlobes of the two difference beams must cover the AoA so that sufficient energy is obtained when computing the cross-correlation to estimate the AoA. According to the sampling theorem, at least $N$ scanning beams are required to cover the AoA range of $[-\pi,\pi)$ given $N$ antennas per subarray. In general, $N$, $K$ and $T$ should be chosen to satisfy $N\le TK$ for good AoA coverage with beamforming gains.
Unlike the phase shift design in [6], which leverages multiple sum beams to steer towards multiple evenly distributed directions within $[-\pi,\pi)$, the proposed one steers each subarray in two directions by exploiting difference beams [12]. Each difference beam steers a null towards the boresight of the corresponding sum beam. An example of normalized beam patterns within the first symbol period is shown in Figure 2, where the red solid and black dotted curves represent the synthesized difference beams and sum beams, respectively. In this example, we adopt $K=M=8$ and $N=24$, and therefore the phase shifter values of subarray $m$ for the difference beams are set to $-\pi mn/4$ for $n=0,...,11$ and $\pi(1-mn/4)$ for $n=12,...,23$, while for the sum beams they are $-\pi mn/4$ for $n=0,...,23$. When multiple training symbols are used, both the null-steering direction and the phase shifts are rotated by $\frac{2\pi}{L}$ between every two consecutive symbols. Although the maximal beamforming gain of a difference beam is 3 dB lower than that of a sum beam, multiple difference beams across multiple training symbols overlap in some directions of interest, which can make up for the beamforming gain loss.
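To make the difference-beam example concrete, the sketch below evaluates the subarray pattern for the quoted setting ($K=M=8$, $N=24$, one symbol). It is an illustration under assumptions: unit element patterns, a pattern of the form $\sum_{n}e^{j[nu+\alpha_{m}^{n}(t)]}$, and phase shifts equal to $n\alpha_{m}(t)$ on the first half of the subarray and $n\alpha_{m}(t)+\pi$ on the second half, with $\alpha_{m}(t)=-2\pi(\mathrm{mod}\{m,K\}T+t)/L$. The function name `subarray_pattern` is hypothetical.

```python
import cmath
import math


def subarray_pattern(u, m, t, N, K, T, difference=True):
    """Complex pattern of subarray m at symbol t for direction variable u.

    difference=True adds a pi phase offset on the second half of the
    elements (difference beam); difference=False gives the plain sum beam.
    """
    L = T * K
    alpha = -2.0 * math.pi * ((m % K) * T + t) / L
    total = 0j
    for n in range(N):
        shift = n * alpha
        if difference and n >= N // 2:
            shift += math.pi
        total += cmath.exp(1j * (n * u + shift))
    return total


if __name__ == "__main__":
    N, K, T = 24, 8, 1
    m, t = 1, 0
    boresight = math.pi / 4   # u at which subarray 1's sum beam peaks
    for label, diff in (("sum", False), ("difference", True)):
        gain = abs(subarray_pattern(boresight, m, t, N, K, T, diff))
        print(f"{label:>10s} beam magnitude at boresight: {gain:.3f}")
```

At the sum-beam boresight the sum pattern reaches its maximum $N$, while the two halves of the difference beam cancel, which is the null mentioned in the text.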
3.2. Estimation of $Nu$
We apply Equation (4) to the estimation of $Nu$, which is then used to suppress $e^{jmNu}$ of $s_{m}(t)$ in (1), followed by the estimation of $u$. Substituting Equation (4) into Equation (2), we have
$$P_{m}(u,t)=\frac{\left(1-e^{jN\omega_{m}(u,t)}\right)^{2}}{1-e^{j2\omega_{m}(u,t)}}=2e^{j\left[(N-1)\omega_{m}(u,t)-\frac{\pi}{2}\right]}\frac{\sin^{2}\left(N\omega_{m}(u,t)/2\right)}{\sin\left(\omega_{m}(u,t)\right)}, \qquad (5)$$
where $\omega_{m}(u,t)=(u+\alpha_{m}(t))/2$. For convenience of illustration, we consider the first $K$ subarrays, although the results also apply to the remaining $M-K$ subarrays. Therefore, $\omega_{m}(u,t)$ simplifies to $\omega_{m}(u,t)=\frac{u}{2}-\pi\left(\frac{m}{K}+\frac{t}{L}\right)$.
Substituting Equation (5) into Equation (3), $\rho_{m}(t)$ can be given by
$$\rho_{m}(t)=(-1)^{Q}|\tilde{s}(t)|^{2}G_{m}(u,t)e^{j\left(Nu+\frac{\pi}{K}\right)}+z_{m}(t), \qquad (6)$$
where
$$G_{m}(u,t)=\frac{4\sin^{2}\left(N\omega_{m}(u,t)/2\right)\sin^{2}\left(N\omega_{m+1}(u,t)/2\right)}{\sin\left(\omega_{m}(u,t)\right)\sin\left(\omega_{m+1}(u,t)\right)}.$$
As specified in Lemma 1 [6], there exists a unique integer $m'\in[0,K-1]$ satisfying $\sin\left(\omega_{m'}(u,t)\right)\sin\left(\omega_{m'+1}(u,t)\right)<0$. Given $m'$, we have
$$\sin\left(\frac{N\omega_{m}(u,t)}{2}\right)=\sin\left(\frac{N\omega_{m'}(u,t)}{2}+\frac{(m'-m)Q\pi}{2}\right)$$
since $\omega_{m}(u,t)=\omega_{m'}(u,t)+\frac{(m'-m)\pi}{K}$. Considering two cases of $Q$, i.e., even and odd, we have
$$\sin^{2}\left(\frac{N\omega_{m}(u,t)}{2}\right)\sin^{2}\left(\frac{N\omega_{m+1}(u,t)}{2}\right)=\begin{cases}\sin^{4}\left(N\omega_{m'}(u,t)/2\right), & Q\ \text{even},\\ \sin^{2}\left(N\omega_{m'}(u,t)/2\right)\cos^{2}\left(N\omega_{m'}(u,t)/2\right), & Q\ \text{odd}.\end{cases}$$
Furthermore, as stated in Lemma 2 [6], $\left|\sin\left(\omega_{m'}(u,t)\right)\sin\left(\omega_{m'+1}(u,t)\right)\right|<\left|\sin\left(\omega_{m}(u,t)\right)\sin\left(\omega_{m+1}(u,t)\right)\right|$, $\forall m\ne m'$. Since the numerator of $G_{m}(u,t)$ does not change with $m$, only $G_{m'}(u,t)$, which has the largest amplitude, takes the opposite sign from the remaining ones. As a result, when the SNR is not very low, $m'$ can be determined as the index of the $\rho_{m}(t)$ with the largest amplitude, i.e., $m'=\underset{m=0:K-1}{\arg\max}\left\{|\rho_{m}(t)|\right\}$. Given $m'$, the signs of $\rho_{m}(t)$ ($m=0,...,K-1$) can be aligned following $\tilde{\rho}_{m}(t)=(-1)^{Q+\mathbb{1}\{m=m'\}}\rho_{m}(t)$ for the estimation of $\widehat{Nu}$, where $\mathbb{1}\{\cdot\}$ is the indicator function. Note that, from the expression of $\alpha_{m}(t)$, we have $\rho_{m}(t)=\rho_{\mathrm{mod}\{m,K\}}(t)$ ($m=0,...,M-2$) when $z_{m}(t)$ in Equation (6) is ignored, and hence the signs of $\rho_{m}(t)$ can be further calibrated following Step 7 in Algorithm 1. As shown in Step 9 of Algorithm 1, $\tilde{\rho}_{m}(t)$ across all subarrays and symbols can be coherently combined to improve the accuracy of $\widehat{Nu}$.
3.3. Estimation of u
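Next, we estimate $u$ separately for the cases of $Q$ being even and odd, as follows.
(I) When $Q$ is even, letting $n=k+qK$, $k=0,...,K-1$, $q=0,...,Q/2-1$, Equation (5) can be written as
$$P_{m}(u,t)=\sum_{k=0}^{K-1}g_{k}(u,t)e^{-j\frac{2\pi mk}{K}},$$
where
$$g_{k}(u,t)=\left(1-e^{j\frac{N}{2}\left(u-\frac{2\pi t}{L}\right)}\right)\sum_{q=0}^{Q/2-1}e^{j(k+qK)\left(u-\frac{2\pi t}{L}\right)}$$
are the Fourier coefficients of $P_{m}(u,t)$.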
Algorithm 1 Estimation of $Nu$
Input: $s_{m}(t)$, $m=0:M-1$, $t=0:T-1$;
1: for $t=0:T-1$ do
2:   Calculate $\rho_{m}(t)$ by (3), $m=0:M-2$;
3:   if $K=M$ then
4:     $\rho_{K-1}(t)\leftarrow s_{K-1}^{*}(t)s_{0}(t)$; // The cross-correlation between the first and the last subarrays
5:   end if
6:   Determine $m'\leftarrow\underset{m=0:K-1}{\arg\max}\left\{|\rho_{m}(t)|\right\}$; // Find the subarray index with the largest amplitude
7:   $\tilde{\rho}_{m}(t)\leftarrow(-1)^{Q+\mathbb{1}\{\mathrm{mod}\{m-m',K\}=0\}}\rho_{m}(t)$, $m=0:M-2$; // Calibrate their signs
8: end for
9: $\widehat{Nu}\leftarrow\arg\left\{e^{-j\frac{\pi}{K}}\sum_{t=0}^{T-1}\sum_{m=0}^{M-2}\tilde{\rho}_{m}(t)\right\}$. // Coherent combination for improving estimation accuracy
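The steps of Algorithm 1 can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: it assumes noiseless subarray outputs of the form $s_{m}(t)=P_{m}(u,t)e^{jmNu}$ with unit training symbols, unit element patterns, and difference-beam phase shifts $n\alpha_{m}(t)$ (first half) and $n\alpha_{m}(t)+\pi$ (second half) with $\alpha_{m}(t)=-2\pi(\mathrm{mod}\{m,K\}T+t)/L$. The names `subarray_outputs` and `estimate_Nu` are hypothetical.

```python
import cmath
import math


def subarray_outputs(u, M, N, K, T):
    """Noiseless s_m(t), indexed as s[t][m], under the assumed signal model."""
    L = T * K
    rows = []
    for t in range(T):
        row = []
        for m in range(M):
            alpha = -2.0 * math.pi * ((m % K) * T + t) / L
            pattern = sum(
                cmath.exp(1j * (n * (u + alpha) + (math.pi if n >= N // 2 else 0.0)))
                for n in range(N)
            )
            row.append(pattern * cmath.exp(1j * m * N * u))
        rows.append(row)
    return rows


def estimate_Nu(s, N, K):
    """Sign-calibrated coherent combination of adjacent cross-correlations."""
    T, M = len(s), len(s[0])
    Q = N // K
    total = 0j
    for t in range(T):
        # Step 2: cross-correlations between adjacent subarrays.
        rho = [s[t][m].conjugate() * s[t][m + 1] for m in range(M - 1)]
        # Steps 3-6: amplitudes of the first K correlations; when K == M the
        # K-th one is the wrap-around term (used only to locate m').
        mags = [abs(r) for r in rho[:K]]
        if M == K:
            mags.append(abs(s[t][K - 1].conjugate() * s[t][0]))
        m_prime = max(range(K), key=mags.__getitem__)
        # Step 7: flip the sign of every correlation whose index matches m'.
        for m in range(M - 1):
            flip = 1 if (m - m_prime) % K == 0 else 0
            total += (-1) ** (Q + flip) * rho[m]
    # Step 9: remove the known pi/K offset and take the argument.
    return cmath.phase(cmath.exp(-1j * math.pi / K) * total)


if __name__ == "__main__":
    u, M, K, T = 0.7, 8, 8, 2
    for N in (16, 24):   # Q = 2 (even) and Q = 3 (odd)
        Nu_hat = estimate_Nu(subarray_outputs(u, M, N, K, T), N, K)
        print(N, Nu_hat, cmath.phase(cmath.exp(1j * N * u)))
```

In this noise-free setting the estimate coincides with $Nu$ wrapped to $(-\pi,\pi]$; with noise, the accuracy depends on how reliably $m'$ is detected, as discussed in the simulation section.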

Given $\widehat{Nu}$, we calibrate $s_{m}(t)$ by multiplying it by $e^{-jm\widehat{Nu}}$, i.e., $s_{m}(t)e^{-jm\widehat{Nu}}$. Provided that $e^{jm(Nu-\widehat{Nu})}\approx 1$, $s_{m}(t)$ can be almost perfectly calibrated. Performing the $K$-point IDFT of $s_{m}(t)e^{-jm\widehat{Nu}}$ produces $\tilde{S}_{k}(t)\approx\tilde{s}(t)g_{k}(u,t)+\Xi_{k}(t)$, where $\Xi_{k}(t)$ is the $K$-point IDFT of $\xi_{m}(t)e^{-jm\widehat{Nu}}$, $m,k=0,...,K-1$.
To obtain an estimate of $u$, $\widehat{u}$, we compute the cross-correlation between any two adjacent IDFT outputs, $\tilde{S}_{k}^{*}(t)\tilde{S}_{k+1}(t)$, denoted by $d_{k}(t)$, $k=0,...,K-2$, given by
$$d_{k}(t)=|\tilde{s}(t)|^{2}g_{k}^{*}(u,t)g_{k+1}(u,t)+\tilde{\Xi}_{k}(t)=|\tilde{s}(t)|^{2}|g_{k}(u,t)|^{2}e^{j\left(u-\frac{2\pi t}{L}\right)}+\tilde{\Xi}_{k}(t), \qquad (10)$$
where $\tilde{\Xi}_{k}(t)$ is approximated as an AWGN. It is observed from Equation (10) that $\widehat{u}$ can be unambiguously obtained by $\widehat{u}=\arg\left\{d_{k}(t)e^{\frac{j2\pi t}{L}}\right\}$. Similarly, $d_{k}(t)$ across all subarrays and symbols can be combined to improve the accuracy of $\widehat{u}$.
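(II) When $Q$ is odd, $K$ must be even since $N$ is even. Letting $n=k+qK/2$, $k=0,...,K/2-1$, $q=0,...,Q-1$, (5) can be written as
$$P_{m}(u,t)=\left(1-(-1)^{m}e^{j\frac{N}{2}\left(u-\frac{2\pi t}{L}\right)}\right)\sum_{k=0}^{K/2-1}\sum_{q=0}^{Q-1}(-1)^{mq}e^{j\left(k+\frac{qK}{2}\right)\left(u-\frac{2\pi t}{L}\right)}e^{-j\frac{2\pi mk}{K}}.$$
Separating $P_{m}(u,t)$ into even and odd samples, we have
$$P_{2l}(u,t)=\sum_{k=0}^{K/2-1}g_{k}^{e}(u,t)e^{-j\frac{2\pi lk}{K/2}},\qquad P_{2l+1}(u,t)=\sum_{k=0}^{K/2-1}g_{k}^{o}(u,t)e^{-j\frac{2\pi lk}{K/2}},$$
where $l=0,...,K/2-1$, and the coefficients satisfy $g_{k+1}^{e}(u,t)=g_{k}^{e}(u,t)e^{j\left(u-\frac{2\pi t}{L}\right)}$ and $g_{k+1}^{o}(u,t)=g_{k}^{o}(u,t)e^{j\left(u-\frac{2\pi t}{L}-\frac{2\pi}{K}\right)}$. Performing the $K/2$-point IDFT of the even and odd samples of $s_{m}(t)e^{-jm\widehat{Nu}}$, respectively, and then calculating the cross-correlation between adjacent IDFT outputs, denoted by $d_{k}^{e}(t)$ and $d_{k}^{o}(t)$, $k=0,...,K/2-2$, we have $\widehat{u}=\arg\left\{d_{k}^{e}(t)e^{j\frac{2\pi t}{L}}+d_{k}^{o}(t)e^{j\left(\frac{2\pi t}{L}+\frac{2\pi}{K}\right)}\right\}$.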
The estimation of $u$ is summarized in Algorithm 2, where $\tilde{\mathbf{S}}_{q}(t)_{(k_{1}:k_{2})}$ denotes a vector consisting of the $k_{1}$th to $k_{2}$th elements of $\tilde{\mathbf{S}}_{q}(t)$ and $(\cdot)^{T}$ stands for the transpose of $(\cdot)$. Note that in Step 12, the samples from the $(M-\lfloor M/K\rfloor K)$th to the $(K-1)$th are concatenated after the samples from the $(\lfloor M/K\rfloor K)$th to the $(M-1)$th to constructively estimate $u$, exploiting the periodicity of the phase shifts designed in (4). As the proposed approach is based on cross-correlation and IDFT computation, its computational complexity is similar to that in [6], given by $\mathcal{O}\left(N(3+\log_{2}M)\right)$, which is much lower than that of the subspace-based methods, e.g., MUSIC or ESPRIT in [10,11], given by $\mathcal{O}\left(M^{3}N^{3}\right)$.
Algorithm 2 Estimation of $u$
Input: $\widehat{Nu}$, $s_{m}(t)$, $m=0:M-1$, $t=0:T-1$;
1: for $t=0:T-1$ do
2:   $\tilde{s}_{m}(t)\leftarrow s_{m}(t)e^{-jm\widehat{Nu}}$, $m=0:M-1$; // Calibrate $s_{m}(t)$
3:   for $q=0:\lfloor M/K\rfloor-1$ do // $\lfloor M/K\rfloor$ non-overlapping groups
4:     if $Q$ is even then
5:       $\tilde{\mathbf{s}}_{q}(t)\leftarrow\left\{\tilde{s}_{qK}(t),\tilde{s}_{qK+1}(t),...,\tilde{s}_{(q+1)K-1}(t)\right\}$, $\tilde{\mathbf{S}}_{q}(t)\leftarrow\mathrm{IDFT}\left\{\tilde{\mathbf{s}}_{q}(t)\right\}$; // $K$-point IDFT
6:     else
7:       $\tilde{\mathbf{s}}_{q}^{e}(t)\leftarrow$ the even samples of $\tilde{\mathbf{s}}_{q}(t)$, $\tilde{\mathbf{S}}_{q}^{e}(t)\leftarrow\mathrm{IDFT}\left\{\tilde{\mathbf{s}}_{q}^{e}(t)\right\}$; // $K/2$-point IDFT
8:       $\tilde{\mathbf{s}}_{q}^{o}(t)\leftarrow$ the odd samples of $\tilde{\mathbf{s}}_{q}(t)$, $\tilde{\mathbf{S}}_{q}^{o}(t)\leftarrow\mathrm{IDFT}\left\{\tilde{\mathbf{s}}_{q}^{o}(t)\right\}$; // $K/2$-point IDFT
9:     end if
10:   end for
11:   if $Q$ is even then
12:     $\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}(t)\leftarrow\left\{\tilde{s}_{\lfloor M/K\rfloor K}(t),...,\tilde{s}_{M-1}(t),\tilde{s}_{M-\lfloor M/K\rfloor K}(t),...,\tilde{s}_{K-1}(t)\right\}$;
13:     $\tilde{\mathbf{S}}_{\lfloor M/K\rfloor}(t)\leftarrow\mathrm{IDFT}\left\{\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}(t)\right\}$;
14:   else
15:     $\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}^{e}(t)\leftarrow$ the even samples of $\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}(t)$, $\tilde{\mathbf{S}}_{\lfloor M/K\rfloor}^{e}(t)\leftarrow\mathrm{IDFT}\left\{\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}^{e}(t)\right\}$;
16:     $\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}^{o}(t)\leftarrow$ the odd samples of $\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}(t)$, $\tilde{\mathbf{S}}_{\lfloor M/K\rfloor}^{o}(t)\leftarrow\mathrm{IDFT}\left\{\tilde{\mathbf{s}}_{\lfloor M/K\rfloor}^{o}(t)\right\}$;
17:   end if
18: end for
19: if $Q$ is even then
20:   $\widehat{u}\leftarrow\arg\left\{\sum_{t=0}^{T-1}e^{j\frac{2\pi t}{L}}\sum_{q=0}^{\lfloor M/K\rfloor}\tilde{\mathbf{S}}_{q}^{*}(t)_{(1:K-1)}\tilde{\mathbf{S}}_{q}^{T}(t)_{(2:K)}\right\}$; // Coherent combination
21: else
22:   $\widehat{u}\leftarrow\arg\left\{\sum_{t=0}^{T-1}e^{j\frac{2\pi t}{L}}\sum_{q=0}^{\lfloor M/K\rfloor}\left(\tilde{\mathbf{S}}_{q}^{e*}(t)_{(1:\frac{K}{2}-1)}\tilde{\mathbf{S}}_{q}^{eT}(t)_{(2:\frac{K}{2})}+e^{j\frac{2\pi}{K}}\tilde{\mathbf{S}}_{q}^{o*}(t)_{(1:\frac{K}{2}-1)}\tilde{\mathbf{S}}_{q}^{oT}(t)_{(2:\frac{K}{2})}\right)\right\}$.
23: end if
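The IDFT-and-correlation step of Algorithm 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: signals are noiseless with unit training symbols, the phase shifts are $n\alpha_{m}(t)$ on the first half of each subarray and $n\alpha_{m}(t)+\pi$ on the second half with $\alpha_{m}(t)=-2\pi(\mathrm{mod}\{m,K\}T+t)/L$, and only the $\lfloor M/K\rfloor$ complete groups are used (the wrapped group of Steps 12 to 17 is omitted). The names `subarray_outputs`, `idft`, `adjacent_corr` and `estimate_u` are hypothetical.

```python
import cmath
import math


def subarray_outputs(u, M, N, K, T):
    """Noiseless s_m(t), indexed as s[t][m], under the assumed signal model."""
    L = T * K
    rows = []
    for t in range(T):
        row = []
        for m in range(M):
            alpha = -2.0 * math.pi * ((m % K) * T + t) / L
            pattern = sum(
                cmath.exp(1j * (n * (u + alpha) + (math.pi if n >= N // 2 else 0.0)))
                for n in range(N)
            )
            row.append(pattern * cmath.exp(1j * m * N * u))
        rows.append(row)
    return rows


def idft(x):
    """IDFT with positive exponent: X[k] = (1/n) * sum_m x[m] e^{+j 2 pi m k / n}."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
        for k in range(n)
    ]


def adjacent_corr(S):
    """Sum of S[k]^* S[k+1] over all adjacent pairs of IDFT outputs."""
    return sum(S[k].conjugate() * S[k + 1] for k in range(len(S) - 1))


def estimate_u(s, Nu_hat, N, K):
    """Estimate u from calibrated subarray outputs (complete groups only)."""
    T, M = len(s), len(s[0])
    L = T * K
    Q = N // K
    total = 0j
    for t in range(T):
        # Step 2: remove the inter-subarray phase progression e^{j m Nu}.
        cal = [s[t][m] * cmath.exp(-1j * m * Nu_hat) for m in range(M)]
        for g in range(M // K):
            block = cal[g * K:(g + 1) * K]
            if Q % 2 == 0:
                d = adjacent_corr(idft(block))             # K-point IDFT
            else:
                d_even = adjacent_corr(idft(block[0::2]))  # K/2-point IDFT
                d_odd = adjacent_corr(idft(block[1::2]))   # K/2-point IDFT
                d = d_even + cmath.exp(2j * math.pi / K) * d_odd
            # Undo the per-symbol rotation before combining coherently.
            total += cmath.exp(2j * math.pi * t / L) * d
    return cmath.phase(total)


if __name__ == "__main__":
    u, M, K, T = 0.7, 8, 8, 2
    for N in (16, 24):   # Q = 2 (even) and Q = 3 (odd)
        Nu_hat = cmath.phase(cmath.exp(1j * N * u))  # ideal calibration value
        print(N, estimate_u(subarray_outputs(u, M, N, K, T), Nu_hat, N, K))
```

The usage example feeds in the ideal wrapped value of $Nu$ to isolate the second stage; in practice $\widehat{Nu}$ would come from Algorithm 1, and any residual calibration error carries over to $\widehat{u}$.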

3.4. Discussion on Extension of the Proposed Approach
The proposed approach can potentially be extended to wideband mmWave systems, where each subcarrier or cluster of subcarriers is assumed to be narrowband, and the proposed approach is performed separately on different subcarriers or clusters. The cross-correlations between subcarriers or clusters can also be exploited to improve the estimation accuracy [3].
The proposed approach can be extended from a linear array to a planar array, where the proposed phase shift design and cross-correlation operation can be applied similarly along the orthogonal dimension. Since the radiation pattern of a planar array can be represented by the product of independent radiation patterns along two orthogonal dimensions, the AoA estimations along the two dimensions can be decoupled from each other.
The proposed approach can potentially be extended to cases with non-line-of-sight (NLOS) paths or interference from other transmitters. Since the NLOS paths are typically much weaker than the LOS path, serial interference cancellation could be performed for sequential AoA estimation with the proposed approach. Once the AoA of the LOS path is estimated, we can steer all subarray beams towards this direction and then evaluate the channel amplitude and phase. By regenerating the LOS signal component and removing it from the received signals in all subarrays, the second strongest path can be estimated. In the same way, the remaining paths can be estimated and subtracted one by one. When there are multiple interferers with similar power from different directions, the proposed approach could be applied with parallel interference cancellation, where multiple AoAs are simultaneously estimated and cancelled.
4. Simulation Results
In this section, we present simulation results to evaluate the proposed approach in comparison with the state of the art [6]. Denote the average received SNR per antenna as $\gamma_{a}$. The training symbols, $\tilde{s}(t)$, are generated following complex Gaussian distributions with zero mean. Assuming that the AoA is uniformly distributed within $[-\pi,\pi]$, simulation results are obtained by averaging over 50,000 trials. Here, we define $P_{d}$ as the probability of correctly finding the index $m'$ at Step 6 of Algorithm 1.
Figure 3 compares the MSEs of $e^{j\widehat{Nu}}$ versus $\gamma_{a}$ with $Q=1$ and 2. As shown in the figure, the proposed phase shift design outperforms that of [6] in terms of the MSE of $e^{j\widehat{Nu}}$, since a higher SNR for $\widehat{Nu}$ can be achieved at Step 9 of Algorithm 1. The MSE curve of $e^{j\widehat{Nu}}$ approaches its asymptotic lower bound as $\gamma_{a}$ increases, where the asymptotic lower bound is the lower bound of the proposed approach obtained under the assumption of $P_{d}=1$. The gain over [6] is larger with $Q=2$ than with $Q=1$, which indicates that the proposed scheme is more applicable to narrow beams, i.e., large $N$. This is because multiple narrower single beams cannot provide desirable AoA coverage, resulting in an estimation performance loss, which, however, can be compensated by our phase shift design.
Figure 4 shows $P_{d}$ versus $\gamma_{a}$ with different values of $Q$. We can see that the proposed scheme generally has better performance. With $Q=1$, it achieves a higher $P_{d}$ at high SNRs, while it is inferior to that of [6] when the SNR is low. Note that, compared with that in [6], our proposed phase shift design improves the capability of identifying the correct $m'$, thus effectively suppressing the noise and indirectly improving the SNR of the estimation. When $\gamma_{a}$ is greater than 5 dB, the proposed design leads to a higher $P_{d}$ with a smaller $Q$. This is because, when the number of beams is fixed within one symbol period, a smaller $N$, and hence a wider beam, leads to better coverage of the directions of interest. Therefore, it is easier to find the correct $m'$.
The MSEs of $\widehat{u}$ are shown as a function of $Q$ in Figure 5. It can be seen that the MSEs generally increase with $Q$, which is attributed to the decreasing number of subarrays. The proposed approach generally achieves better performance than [6] when $Q$ is an even number. When $Q$ is odd, the proposed method results in larger estimation errors since the signals are only averaged over $K-2$ product terms (see Step 22 of Algorithm 2), fewer than the $K-1$ terms in [6]. The corresponding asymptotic lower bounds of the MSEs of $\widehat{u}$ are displayed for comparison. To evaluate the credibility of the estimation errors, we calculate 95% confidence intervals (CIs) for $\widehat{u}$. When $\gamma_{a}=5$ dB and $Q=2$, 4 and 6, the CIs are given by [-0.0207, 0.0111], [-0.0204, 0.0114] and [-0.0244, 0.0075] for the proposed approach, [-0.0219, 0.0099], [-0.0258, 0.0060] and [-0.0224, 0.0096] for [6], and [-0.0205, 0.0113], [-0.0232, 0.0086] and [-0.0213, 0.0104] for the asymptotic lower bound, respectively. The MSEs of $\widehat{u}$ in Rician fading channels [12] are also provided to show the impact of multipath channels on the proposed approach, where the Rician factor is assumed to be 10 dB.
Figure 6 presents the MSE of $\widehat{u}$ versus $\gamma_{a}$ with even values of $Q$. From the figure, we can see that the proposed approach outperforms [6] by 1.4 dB at the MSE of 0.1, 0.4 dB at the MSE of 0.01 and 0.7 dB at the MSE of 0.001, when $Q=6$, 4 and 2, respectively.