3 D-MIMO Channel Estimation under Non-Gaussian Noise with Unknown PDF

Three-dimensional-multiple-input-multiple-output (3D-MIMO) technology has attracted a lot of attention in the field of wireless communication. Most of the research mainly focuses on channel estimation model which is affected by additive-white-Gaussian-noise (AWGN). However, under the influence of some specified factors, such as electronic interference and man-made noise, the noise of the channel does not follow the Gaussian distribution anymore. Sometimes, the probability density function (PDF) of the noise is unknown at the receiver. Based on this reality, this paper tries to address the problem of channel estimation under non-Gaussian noise with unknown PDF. Firstly, the common support of angle domain channel matrix is estimated by compressed sensing (CS) reconstruction algorithm and a decision rule. Secondly, after modeling the received signal as a Gaussian mixture model (GMM), a data pruning algorithm is exerted to calculate the order of GMM. Lastly, an expectation maximization (EM) algorithm for linear regression is implemented to estimate the the channel matrix iteratively. Furthermore, sparsity, not only in the time domain, but in addition in the angle domain, is utilized to improve the channel estimation performance. The simulation results demonstrate the merits of the proposed algorithm compared with the traditional ones.


Introduction
The improvement of data transmission rate in wireless communication is of great meaning.Three-dimensional-multiple-input-multiple-output (3D-MIMO) can improve channel capacity without increasing the transmission power and bandwidth.Therefore, it has been one of the key technologies in the 5th-Generation communication systems.In addition, 3D-MIMO can enhance the spatial available dimension effectively compared with traditional MIMO by which only one horizontal dimension can be provided.Specifically, the traditional MIMO system has fewer ports that can only adjust the beam direction in the horizontal dimension, and cannot concentrate the vertical dimension energy on the receivers, while 3D-MIMO commonly used a two-dimensional antenna array in a large scale such as uniform planar array (UPA) [1] where hundreds of antennas can be placed in a small area.3D-MIMO can flexibly adjust the direction of the beam in both the horizontal and vertical directions, forming a more precise and narrower directional beam, so as to enhance the received signal energy and strengthen the cell coverage greatly.Furthermore, 3D-MIMO can achieve the enhancement of spatial multi-user multiplexing capabilities and better interference suppression through the use of advanced signal processing technology.An important issue for 3D-MIMO utilization is to get the channel impulse response (CIR) accurately.However, with the conventional channel estimation method, massive multi-antenna means the sharp increase of pilot number [2], which is no longer proper for real application.Meanwhile, the MIMO channel is shown to have a specific sparse structure [3] in which the row and column vectors of the angular domain channel matrix are both sparse.Therefore, sparse estimation technology, such as compressed sensing (CS) based reconstruction algorithms, can be used to do channel estimation.
At the same time, most of the existing communication theory assumes that the noise at the receivers follows a Gaussian distribution [3][4][5].These theories are mainly based on two reasons: (1) the Gaussian distribution can be expressed in specific mathematical expressions to facilitate analysis and calculation; (2) this assumption conforms to the central limit theorem.However, recent studies have shown that the noise is not always subjected to the Gaussian distribution in a lot of wireless applications [6][7][8].If the noise is still be treated as a Gaussian distribution, it usually shows poor performance [9].The reason for the non-Gaussian noise may be, for example, an atmospheric noise in radio links, lightning, relay contacts, ambient acoustic noise due to ice cracking in the arctic region in underwater sonar [8] and submarine communications [10,11].Therefore, it is of great importance to address this problem in 3D-MIMO channel estimation.
A structured sparse channel estimation (SSCE) algorithm [12] has been studied to estimate a 3D-MIMO channel, but the influence of non-Gaussian noise needs to be studied further.The authors in [13] propose a channel estimation algorithm by refining the selection of dominant entries iteratively until the terminating condition is met but ignores the influence of non-Gaussian noise either.Meanwhile, channel estimation under non-Gaussian noise can be obtained by the method of the Kernel density, but the accuracy of which is seriously affected by the selection of kernel width [10].In addition, the expectation maximization (EM) algorithm is regarded as an excellent way to solve the problem by modelling non-Gaussian noise as a Gaussian mixture model (GMM).However, most of the published works assume that the order of GMM is known previously [10,14,15], which is not always the case in real systems.Therefore, the order of GMM should be determined before the establishment of a mixture Gaussian function.This paper tries to address the order selection problem by a pruning approach.
On the other hand, the existence of sparsity of the 3D-MIMO channel matrix in the angular domain has been studied in [6], in which the channel matrix in the array domain is converted to the angular domain.In fact, as the distance of each antenna on the array are relatively close, the support set of all elements can be considered as the same [16].Inspired by these works, this paper uses a CS-based reconstruction algorithm to estimate the common support of the angle domain channel matrix and obtain the sparse position of channel matrix firstly.Meanwhile, an order selection method is implemented in the determination of the order of GMM, which is used to approximate the real probability density function (PDF) of the noise.Finally, an EM algorithm for linear regression is used to estimate the channel support matrix.
The contribution of this paper is that it provides a framework of 3D-MIMO sparse channel estimation under non-Gaussian noise with unknown PDF.
The rest of this paper is organized as follows.In Section 2, the sparsity of 3D-MIMO channel matrix is presented.Section 3 gives the channel estimation algorithm under the environment of non-Gaussian noise, and the simulation proves the advantage of this method in Section 4. Section 5 provides the conclusions of this paper.
Notion: Let (•) T and (•) H denote transpose and Hermitian transpose, respectively; let ⊗ denote the Kronecker product; let diag(x) represent a diagonal matrix with the value of vector x in its diagonal; let | • | and • represent the absolute value and 2-norm operation respectively; and let CN (x|m, Σ) indicate the circularly symmetric complex Gaussian random vector with mean m and covariance matrix Σ.Its expression for a d-dimensional vector of x is

Signal Model
Consider an orthogonal frequency division multiplexing access uplink cellular system with K users in a cell.Each user has one transmission antenna.The K users are randomly distributed in the cell.The UPA is installed in the base station which holds N v elements in a vertical direction and N h in a horizontal direction, and N = N h × N v is the total number of antennas.The antenna model of the base station is illustrated in Figure 1.According to [17], the CIR for the k-th user in matrix form can be expressed as where P is the total number of multipaths of this user.In addition, α l,k denotes the fading coefficient.
Furthermore, e(u l ) and e(v l ) denote the steering vector in the vertical direction and horizontal direction, respectively, and they can be expressed as where In addition, θ l and ϕ l represent the pitch direction of arrival (DoA) and azimuth DoA, respectively, for the l-th path as shown in Figure 1, s v and s h represent the minimum distance in the row and column of the adjacent element on the UPA, respectively, and λ is the wavelength of the signal.The channel matrix can be rewritten in array domain form as where h i,k is the vector form of the channel of the i-th path of user k, which can be written as Transforming the channel matrix of the k-th user in Equation ( 4) from an array domain into angle domain yields where U is the conjugation of Kronecker product of spatial base vector, and U h and U v express the mutually independent spatial horizontal base vector and vertical base vector, respectively.These base vectors can be expressed as After transmission, the received signal at the base station can be written as where Y is the received signal, Φ is the gather of the pilot signals, N p is the dimension of the pilot vector, and is the channel matrix of all users.The pilot vector set is where z i is the pilot vector sent by user i, and it is a N p (N p > N) dimensional column vector.In addition, F ∈ C KN p ×KP is composed by K F p matrix that is on the diagonal of F, where F p is the first P columns of N p dimensional Fourier transform matrix with where W N P = e 2πj/N P and j = √ −1.According to Equation ( 6), the received signal can be transformed from array domain into angle domain as shown in Equation (11), where X = ΦF is a known N p × KP dimensional transmission matrix with elements x i,j , Y a , H a , and W a that represent Y, H, and W angular domain transformed matrixes, respectively.The goal of channel estimation is to obtain H a from the known matrix Y a and X under non-Gaussian noise W a .

Description of Joint Sparsity of Angle Domain Channel Matrix
The joint sparsity of 3D-MIMO angle domain channel matrix H a can be used to improve the channel estimation performance.As a matter of fact, the sparsity of the channel matrix does not only lie in the time domain, but also in the angle domain, which results in the sparsity both in columns and rows of angle domain channel matrix.

The Sparsity of the Column Vector
Due to the existence of multipath effect and transmission delay, the arrival time of the same transmitted signal transmission from the transmitter to the receiver is different from each other, as shown in Figure 2.That is to say, the receiver will receive the duplicates of the transmitted signals at different times.In real systems, there is hardly any signal arrives at most of the times during this period.The elements of a column of angle domain channel matrix H a represent the intensity of arrived signal from the transmitters through all of the paths for a corresponding antenna element in UPA.Therefore, the sparsity in the time domain results in the sparsity of the column vector.

The Sparsity of the Row Vector
The sparsity of the row vector is caused by the scatters around the base station.The uplink transmitted signals can be received by a small fraction of spatial reign of the base station because the location of the base station is generally high enough [12].In addition, the signal arrives at the receivers only in some particular angle bin, which leads to the sparsity of transmitted signals in the angle domain [3].Furthermore, it has been demonstrated in [16] that all of the antennas share a common support if the distance d max among the farther antennas of the UPA satisfies the following condition: where c is the speed of light, and BW is the bandwidth of transmitted signal.In general, H a conforms to such a joint sparsity: its row vectors and column vectors are both sparse, and all of the column vectors have the same position.

Channel Estimation under Non-Gaussian Noise
This section proposes a channel estimation algorithm when the noise does not follow Gaussian distribution, and the PDF is unknown either.
In summary, the proposed algorithm estimates the angle domain channel matrix through three steps: (1) Estimate the common support to obtain the sparse position; (2) Approximate the unknown PDF of the noise using mixture Gaussian function; (3) Estimate angle domain channel matrix by linear regression EM algorithms.

The Estimation of the Common Support
To get the common support, a CS reconstruction algorithm is implemented to coarsely estimate angle domain channel matrix, and then the common support of angle domain channel matrix is obtained directed by a decision rule.After applying Equation (11) to CS algorithms, we can preliminarily get the estimation of angle domain channel matrix written as Ĥcs .Then, the common support is denoted as where s i is the i-th element of s, and is the threshold to judge the value of the common support.
In fact, is related to the signal to noise ratio (SNR) of the receivers.The value should be chosen properly; otherwise, it will lead to an inaccurate estimation of sparse positions and channel estimation.Furthermore, in Equation ( 13), is the sum of the absolute value of the i-th row in Ĥcs , and ĥcs i,n is the element of Ĥcs in the i-th row and the n-th column.The i-th element in s is 1 or 0 when the sum of absolute values of the i-th row in Ĥcs is larger or smaller than the threshold , respectively.Figure 3 gives the detailed conversion relationship between Ĥcs and s.Since the column of H a is sparse, most of its row vectors are the zero vector.The removal of these row vectors can save the computational cost and improve the speed of channel estimation without affecting the value of the received signal.In order to conduct the matrix multiplication operation correctly, the column vectors of X at the corresponding positions should also be removed.In other words, the computational cost can be saved by reducing the dimension of H a and X.After dimension reduction operation, the transmission support matrix X s is a submatrix generated by picking up the column of the transmission matrix X, whose column index corresponds to the element 1 in s.

Order Selection
In order to obtain the channel matrix, the PDF of the noise should be cognised firstly.Inspired by [18], the PDF of the unknown noise can be modeled as GMM with zero mean written as where the parameter λ i is the coefficient of component i, with for K s mixture distribution components, Σ i is the covariance matrix for the i-th component, and K s is the order of GMM.Based on Equation ( 14), the noise cognition process can be transformed to the estimation of the key parameters K s and λ i .
It should be noted that the order K s in the classical algorithm is assumed to be known, and it is always chosen from 2 to 4. This assumption is reasonable in the case of a specific non-Gaussian noise.Take impulse noise as an example, in the step of establishing its PDF by GMM, the mainstream approach is to select the order as 2 [19,20], and the results of this supposition are usually satisfactory.However, for more complicated non-Gaussian noise and interference model in real systems, order 2 to 4 may not be appropriate enough.In fact, when the number of the components is not large enough, the mixture model may underfit the data and may not be flexible enough to approximate the true noise.On the other hand, it may overfit the data and yield poor interpretations with too many components.Therefore, in order to describe the unknown distribution using GMM, the proper order should be studied at this step.
Inspired by the pruning algorithm in [21], this paper tries to propose an improved approach to determine the order of GMM.The main idea of the algorithm in [21] is implemented by an iterative manner.After each iteration, a clustering center of the data set is found, and then some data is deleted from the total data set according to this clustering center.Until all of the data has been deleted completely, the iteration is terminated.After the iteration is completed, the number of the clustering center represents the order of GMM.At each iterative step, the clustering centers are determined by the number of data in the neighborhood of a given radius.With the increase of radius, the number of the clustering center will decrease, which leads to different results of cluster numbers.However, the radius of a certain range will lead to the same number and location of the center points.The search procedure is terminated and the order is obtained when one state sustains more than M times.
However, in the communication systems, the characteristics of the Gaussian components of GMM are not always obvious.Under these circumstances, the method mentioned above needs to be improved.Specifically, when the Gaussian components are similar to each other, this method may take the clustering center center points as a cluster, with each point on the edge as a new one, even if there is only one point left in this cluster.This will lead to a large number of clusters, of which the result is unsatisfactory for the order selection.In order to overcome this problem, this paper proposes an improvement measure.In particular, after finding each clustering center and deleting this cluster of points, the variance of the remaining points is calculated.If the variance is smaller than the threshold ρ, the next clustering center will be searched sequentially.Otherwise, the remaining points will be abandoned, and then the neighborhood radius is increased in the next step.Figure 4 illustrates the steps of the proposed method.Therefore, after setting the initial neighborhood radius, the proposed algorithm then gradually increases radius to search for the center cluster point.
The initial radius is set to be where r is the initial radius, x i and x j are arbitrary elements of the data set, and the parameter R is used to control the search accuracy.The larger the parameter R is, the more accurate the result is, while the calculation complexity increases.This order reduction process is shown in Figures 4 and 5.

Estimation of Channel Support Matrix
In this section, we will use the received signal Y a and transmission support matrix X s to estimate the channel support matrix H a s under the influence of non-Gaussian noise W a with independent and identically distributed components with PDF approximated by Equation (1).Without loss of generality, the mean of channel noise is assumed to be m 1 = m 2 = • • • = 0 in Equation (14).In order to estimate the channel support matrix, instead of solving the linear regression problem in vector form in [19], this paper proposes to make an extension by transforming it into the matrix form.Furthermore, after the order has been obtained at the step of order selection, the channel support matrix can be calculated column by column.For the i-th column, its elements and the weight value least squares matrix G(y i ) can be calculated iteratively.Before the start of the algorithm, the i-th column vector ĥ(0) i of channel support matrix, coefficients λ k,i and variances σ 2 k,i should be given initial values.The calculation process of G(y i ) is to perform Equations ( 16)-( 18) one after another: In Equation ( 16), is the element of Y a in the n-th row and the i-th column, x n,m is the element of X s in the n-th row and the m-th column, h m,i is the m-th element in ĥ(t) i , the superscript t is the number of iteration , and λ k,i and σ 2 k,i are coefficients and variances of the k-th component of GMM.The i-th column of channel matrix and its parameters of each component of GMM can be deducted based on g k (n) and G(y i ) as ĥ(t Repeat Equation ( 16)-( 21) for all of the other columns, the whole channel support matrix can be obtained by Ĥ(t) s = ĥ(t) 1 ĥ(t) The iteration is terminated when the estimation value of channel support matrix is relatively stable or the given number of iterations has been done.This condition can be written in Equation (23) as In (23), m is the total number of the 1 elements in s, thr is the threshold, and γ is the given number of iteration number.After the iteration is completed, the final angle domain channel matrix can be achieved based on the channel support matrix Ĥs and common support s.
The reason for common support estimation going before order selection is that a large number of vectors which are 0 or close to 0 have a great impact on the accuracy of estimation of the order of GMM.For example, Figures 6 and 7 show the result of order selection before and after removing the elements which are close to 0, respectively.In Figure 6, due to the presence of large number of elements which are close to 0 s, these elements will automatically generate a new class, so its clustering result is bigger than the data of Figure 7.The main steps of the above algorithm are summarized in Algorithm 1.

Algorithm 1 Non-Gaussian Angular Domain Sparse Channel Estimation
Input: received signal Y a , transmission matrix X, iterative counter t = 0, threshold: , M, R, thr, γ, ρ Output: channel matrix H a 1: Estimate channel matrix by CS reconstruction algorithm, then compute the common support via Equation (13).2: Order K s selection for GMM via pruning algorithm.3: Initialize ĥ(0) 1 , λ k,1 and σ 2 k,1 (k = 1, 2, • • • , K s ).4: Compute the prior probability weight least square matrix via Equations ( 16)- (18).5: Compute ĥ(t) 1 , λ k,1 and σ 2 k,1 based on the result of step 4 via Equations ( 19)- (21).6: Let i equal to the rest label of the column, repeat the step 3-5.7: Determine whether the condition of Equation ( 23) is set up, if it meets to the condition, the next step is carried out, or return to step 4 and t = t + 1.

Simulation
This section demonstrates the simulation results.The frequency selective multipath channel consists of 200 independent Rayleigh multipath with maximum delay spread of 5 µs.The total transmission power is 1 W, the number of subcarriers is 2048, and the total transmission bandwidth is 8 MHz.The power delay is assumed to be an exponentially decaying profile.In the simulation process, the channel noise follows a mixture Gaussian distribution.The simulation parameters are shown in Table 1. Figure 8 shows the comparison between the proposed algorithm and other three algorithms in aspect of the normalized mean square error (NMSE) over SNR, including an SSCE-based [12] algorithm, compressive sampling matching pursuit (CoSaMP) [22] sparse recovery algorithm, and the dual crossing (DC) algorithm [13].From Figure 8, it can be seen that the SSCE algorithm has a comparatively satisfactory performance under Gaussian noise, especially in a relatively low SNR region.However, under the affect of non-Gaussian noise, its performance is not as good as the proposed method, which is reflected in Figure 8.It should be noted that, when the order of GMM is K s = 1, in other words, the noise follows Gaussian distribution, the proposed algorithm is similar to SSCE algorithm.The CoSaMP algorithm which has been used at the step of estimating the support set estimates the channel matrix based on classical CS reconstruction algorithm with the noise distribution being viewed as Gaussian distribution.In fact, the reconstruction performance of CoSaMP outperforms the other CS algorithms.Therefore, it is reasonable to use the CoSaMP algorithm as a reconstruction approach to estimate the support set at step 2 of Algorithm 1.The DC-based algorithm estimates channel by refining the selection of dominant entries iteratively until the terminating condition is met.It can also be seen from Figure 8   Imperfect channel state information (CSI) can cause the reduction of channel capacity [23].Furthermore, if the CSI of the receiver is not perfect, the channel capacity will decrease significantly [24].However, to the best of our knowledge, there is no precise mathematical relationship between the estimation error of CSI and the channel capacity for a MIMO system, which is still an open problem in the field of information theory.Fortunately, as mentioned in [25], the upper and lower bounds of mutual information can be determined according to channel estimation error.Based on [25], Figure 9 shows the comparison of the upper and lower bounds of mutual information between the proposed algorithm and other traditional algorithms versus SNR.It can be seen from this figure that the mutual information of the proposed algorithm is bigger than the mutual information of the traditional channel estimation algorithms when the SNR is relatively small.The difference becomes smaller when the SNR is relatively large because the channel estimation of all of the algorithms becomes more accurate and the NMSE of all of the algorithms becomes smaller.The relationship between the number of antennas and the NMSE is shown in Figure 10.In the simulation of this figure, the pilot length is set to 512.For a given number of the pilots, the estimated error of the channel matrix increases as the number of antennas.In addition, it can be seen that when the number of antennas increases, the slope of the curves becomes smaller.Figure 11 shows the impact of the value of on the results.According to the theoretical analysis, the value of is closely related to the received SNR.When the SNR is large, the value of can be chosen from a relative large range, and vice versa.Figure 12 demonstrates the effect of the choice of order of GMM on the performance channel estimation.In the simulation process, the channel noise is set to follow a GMM with order 4. It can been seen that the Gaussian mixture model with order 4 is superior to orders 2 or 3, which is not correctly estimated.For example, when the NMSE is −40 dB, the SNR gap is about 2.75 dB and 6.15 dB over orders 3 and 2, respectively.Therefore, choosing the appropriate GMM order is important for the system performance.

Conclusions
This paper studies a new algorithm that exploits the sparsity of 3D-MIMO in both time and angular domain to estimate channel matrix under non-Gaussian noise with unknown PDF.The core of the proposed algorithm is to recognize the noise by approximating the unknown PDF with GMM, and the order of GMM is also calculated after the estimation of the common support set of the channel matrix in the angle domain.After that, the channel support matrix is estimated under non-Gaussian noise.The proposed algorithm studies the property of the noise firstly and then estimates the CIR targeted on it; therefore, it outperforms the traditional approaches which view the noise distribution as Gaussian.Theoretical analysis and simulations show that the proposed algorithm is an effective method to solve the channel estimation problem under non-Gaussian noise for 3D-MIMO systems.

Figure 1 .
Figure 1.Received signal model.The number of base station antennas is N = N v × N h , θ is the pitch angle of receiving signal, and ϕ is the azimuth angular.

Figure 2 .
Figure 2. The illustration of the received signal intensity for a given element in uniform planar array.

Figure 3 .
Figure 3. Relationship between Ĥcs and common support s.

Figure 4 . 5 , and in the other four graphs is maxDistance 5 + maxDistance 50 .
Figure 4. Pruning process of order selection under different neighborhood radius.The point p 1 in (a) represents the first cluster center, and p 2 , p 3 , p 4 in (b-d) represent the next cluster center, respectively.After p 4 is determined, the variance of the remaining points is larger than the threshold ρ.The following four pictures (e-h) are the pruning results after the enlargement of the neighborhood radius.Under these two different radius conditions, due to p 1 = p 5 , p 2 = p 6 , p 3 = p 7 , p 4 = p 8 , it can be considered that the result is stable, relatively.Therefore, the order selection result of these data under this algorithm is 4. The neighborhood radius in the four graphs of the above part is maxDistance 5 , and in the other four graphs is maxDistance 5

Figure 6 .
Figure 6.Illustration of order selection under the sparse matrix.

Figure 7 .
Figure 7. Illustration of order selection under channel support matrix.
Figure8shows the comparison between the proposed algorithm and other three algorithms in aspect of the normalized mean square error (NMSE) over SNR, including an SSCE-based[12] algorithm, compressive sampling matching pursuit (CoSaMP)[22] sparse recovery algorithm, and the dual crossing (DC) algorithm[13].From Figure8, it can be seen that the SSCE algorithm has a comparatively satisfactory performance under Gaussian noise, especially in a relatively low SNR region.However, under the affect of non-Gaussian noise, its performance is not as good as the proposed method, which is reflected in Figure8.It should be noted that, when the order of GMM is K s = 1, in other words, the noise follows Gaussian distribution, the proposed algorithm is similar to SSCE algorithm.The CoSaMP algorithm which has been used at the step of estimating the support set estimates the channel matrix based on classical CS reconstruction algorithm with the noise distribution being viewed as Gaussian distribution.In fact, the reconstruction performance of CoSaMP outperforms the other CS algorithms.Therefore, it is reasonable to use the CoSaMP algorithm as a reconstruction approach to estimate the support set at step 2 of Algorithm 1.The DC-based algorithm estimates channel by refining the selection of dominant entries iteratively until the terminating condition is met.It can also be seen from Figure8that the NMSE of the proposed algorithm is obviously superior to the traditional CoSaMP-based algorithm, SSCE-based algorithm and DC-bases algorithm.For example, when the SNR is 10 dB, the NMSE of the proposed algorithm is about −21 dB, and the traditional algorithm of SSCE-based, CoSaMP-based and DC-based is about −19 dB, −17 dB and −15 dB, respectively.When the SNR is 15 dB, the NMSE of the proposed algorithm, CoSaMP-based algorithm, SSCE-based algorithm and the DC-based algorithm is about −26 dB, −23 dB, −21 dB and −17 dB, respectively, thus the proposed algorithm has approximately 13.0%, 23.8%, and 52.9% improvement in terms of NMSE compared with traditional SSCE-based, CoSaMP-based, and DC-based, respectively.

Figure 8 .
Figure 8.Comparison of normalized mean square error under different algorithms.

Figure 9 .
Figure 9. Lower and upper bounds of mutual information versus SNR.

Figure 10 .
Figure 10.The relationship between the number of antennas and NMSE.

Figure 11 .
Figure 11.The effect of on the performance.

Figure 12 .
Figure 12.The effect of K s on the performance.

Table 1 .
Parameters in simulation.