Computationally Efficient Channel Estimation in 5G Massive Multiple-Input Multiple-Output Systems

Traditional channel estimation algorithms such as minimum mean square error (MMSE) are widely used in massive multiple-input multiple-output (MIMO) systems, but require a matrix inversion operation and an enormous amount of computations, which result in high computational complexity and make them impractical to implement. To overcome the matrix inversion problem, we propose a computationally efficient hybrid steepest descent Gauss–Seidel (SDGS) joint detection, which directly estimates the user’s transmitted symbol vector, and can quickly converge to obtain an ideal estimation value with a few simple iterations. Moreover, signal detection performance was further improved by utilizing the bit log-likelihood ratio (LLR) for soft channel decoding. Simulation results showed that the proposed algorithm had better channel estimation performance, which improved the signal detection by 31.68% while the complexity was reduced by 45.72%, compared with the existing algorithms.


Introduction
Multiple-input multiple-output (MIMO) technology is becoming increasingly mature, especially when combined with orthogonal frequency division multiplexing (OFDM) [1][2][3][4][5], and has been successfully applied in wireless communication standards such as Long-Term Evolution (LTE) and LTE-Advanced. However, traditional MIMO technology can only achieve a 4 × 4 or 8 × 8 scale system [6], which makes it difficult to meet the explosive growth in mobile data services. Therefore, in recent years, massive MIMO has been proposed based on traditional MIMO technology [7]. Massive MIMO systems configure up to hundreds of antennas at the base station to serve multiple single-antenna end-users simultaneously [8], which can improve spectrum utilization and power utilization in wireless communications systems by two to three orders of magnitude [9][10][11]. This has become one of the most promising enabling technologies and one of the most active research directions in 5G [12]. The maximum likelihood (ML) algorithm is the optimal massive MIMO detection algorithm, but its computational complexity increases exponentially with the number of system antennas and the modulation order of the baseband signals, which makes a fast and effective practical implementation difficult. Linear detection methods, such as the zero-forcing (ZF) algorithm and the minimum mean square error (MMSE) algorithm, can achieve near-optimal detection performance in massive MIMO systems. Their complexity is greatly reduced compared with that of the ML algorithm, but they introduce a high-dimensional matrix inversion, so a low-cost and efficient engineering implementation is still a problem to be solved. To address this problem, many simplified algorithms based on the MMSE detection scheme have been proposed in recent years. They can be roughly divided into three types: series expansion approximation methods [13,14], iterative approximation methods [15,16], and gradient-based searches for an approximate solution [17][18][19][20]. The authors in [13] proposed using the Neumann series expansion to approximate the inverse of the MMSE filter matrix, but when the number of expansion terms was increased (i > 2), the computational complexity was still high, equal to or even exceeding that of the exact MMSE inversion. Reducing the complexity of the filter matrix inversion in this way also costs a large degree of detection performance. The authors in [14] applied the Newton iteration, derived from the first-order Taylor series expansion (similar to the Neumann series expansion), to massive MIMO signal detection, and used iterations to improve the estimation accuracy of the MMSE filter matrix inverse. However, in terms of detection performance and computational complexity, the Newton-iteration-based algorithm was not superior. Both series expansion-based algorithms above must first approximate the inverse matrix and then use it to estimate the signal vector sent by the user. In contrast, some iterative algorithms based on solving linear equations, such as the Richardson iteration (RI) algorithm [15] and the successive over-relaxation (SOR) algorithm [16], exploit the fact that the MMSE filter matrix is symmetric positive definite. By solving a system of linear equations, they estimate the transmitted vector directly, thus avoiding the inversion of high-dimensional matrices.
The RI and SOR algorithms mentioned above have lower computational complexity at a fixed number of iterations, but RI converges more slowly and requires more iterations to meet a given detection performance requirement. Although the detection performance of SOR is good, its internal iterative structure means the algorithm cannot be implemented in parallel in practical applications. The third type of algorithm is designed around the idea of a gradient search, and includes the conjugate-gradient (CG) method [17] and the steepest descent (SD) method [18]. This type of algorithm searches along the gradient and avoids the high-dimensional matrix inversion problem. Compared with the series expansion methods, the CG and SD algorithms greatly improve detection performance, but calculating the gradient in each iteration also increases complexity.
In this paper, a low-complexity joint detection algorithm is proposed. The SD algorithm has a good convergence direction at the beginning of the iteration, so the low-complexity Gauss-Seidel (GS) algorithm mentioned in [19] is combined with the SD method (the combination is called SDGS). The SD step provides an effective search direction for the GS iterations, which speeds up convergence and improves detection performance. Furthermore, the algorithm is applied to soft-output detection, and an approximate method for calculating the bit log-likelihood ratio (LLR) at the channel decoder input is given. A good compromise between detection performance and computational complexity is achieved.
The rest of this paper is organized as follows: Section 2 discusses the system model and analytical derivations, Section 3 explains signal detection, and Section 4 explains the mixed iterative algorithm and the proposed algorithm. Section 5 provides the simulation results, and Section 6 concludes the paper.

System Model
The system considered in this paper is the uplink of a massive MIMO system consisting of a base station equipped with N antennas and K single-antenna users, where $N \gg K$, as shown in Figure 1. Let $s_c = [s_1, s_2, \ldots, s_K]^T$ denote the K × 1 symbol vector sent by all users simultaneously, where $s_k \in \varepsilon$ is the symbol transmitted by the kth user, and $\varepsilon$ is the modulation symbol set.
Let $H_c \in \mathbb{C}^{N \times K}$ represent the Rayleigh fading channel matrix; then, the N × 1 signal vector received by the base station can be written as:

$$ y_c = H_c s_c + n_c, \tag{1} $$

where $n_c$ is an N × 1 additive white Gaussian noise (AWGN) vector with zero mean and covariance matrix $\sigma^2 I_N$. Converting the complex model of Equation (1) into an equivalent real model gives:

$$ y = H s + n, \tag{2} $$

where $s \in \mathbb{R}^{2K}$, $H \in \mathbb{R}^{2N \times 2K}$, $y \in \mathbb{R}^{2N}$, and $n \in \mathbb{R}^{2N}$ are given by:

$$ s = \begin{bmatrix} \Re(s_c) \\ \Im(s_c) \end{bmatrix}, \quad H = \begin{bmatrix} \Re(H_c) & -\Im(H_c) \\ \Im(H_c) & \Re(H_c) \end{bmatrix}, \quad y = \begin{bmatrix} \Re(y_c) \\ \Im(y_c) \end{bmatrix}, \quad n = \begin{bmatrix} \Re(n_c) \\ \Im(n_c) \end{bmatrix}. \tag{3} $$

Here, $\Re(\cdot)$ and $\Im(\cdot)$ indicate the real part and imaginary part, respectively.
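As an illustration of the real-valued decomposition in Equations (1)-(3), the following minimal sketch stacks real and imaginary parts using plain Python lists. The function name `to_real_model` and the list-of-lists matrix layout are assumptions made for this example and are not taken from the paper.

```python
def to_real_model(Hc, yc):
    """Map a complex model y_c = H_c s_c + n_c to its real-valued form.

    Hc is an N x K list of rows with complex entries, yc a length-N list.
    Returns H (2N x 2K) and y (length 2N) laid out as in Equation (3).
    """
    # Upper block row: [Re(H_c), -Im(H_c)]
    top = [[h.real for h in row] + [-h.imag for h in row] for row in Hc]
    # Lower block row: [Im(H_c), Re(H_c)]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in Hc]
    # Stack real parts over imaginary parts
    y = [v.real for v in yc] + [v.imag for v in yc]
    return top + bottom, y


# Single-antenna, single-user check: (1 + 2j) * (3 - 1j) = 5 + 5j
H, y = to_real_model([[1 + 2j]], [5 + 5j])
s = [3.0, -1.0]  # [Re(s_c), Im(s_c)]
print([sum(h * x for h, x in zip(row, s)) for row in H], y)
```

In the noise-free check above, multiplying the real matrix H by the stacked symbol vector reproduces the stacked received vector, which is the property that lets the detector work entirely with real arithmetic.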

Minimum Mean Square Error Signal Detection
The main task of signal detection is to determine the user transmission vector s at the base station from the received signal vector y. The transmitted signal vector $\hat{s}$ detected by the MMSE algorithm can be expressed as:

$$ \hat{s} = \left(H^H H + \sigma^2 I_{2K}\right)^{-1} H^H y = W^{-1} \hat{y}, \tag{4} $$

where $\hat{y} = H^H y$. The filter matrix W of the MMSE detector can be expressed as:

$$ W = G + \sigma^2 I_{2K}, \tag{5} $$

where $G = H^H H$ is the Gram matrix. In massive MIMO systems, the computational complexity of $W^{-1}$ is $O(K^3)$, which makes the MMSE algorithm costly to implement.

Log Likelihood Ratio Calculation
Various channel coding techniques are commonly employed in wireless communication systems to improve their error performance, since channel reliability information can be used to improve system stability. Conventional MIMO signal detection generally uses a hard decision method that directly makes symbol decisions on the estimated value of the user-transmitted signal vector, i.e., $\hat{s}$ in Equation (4). In order to pass soft information to the back end of the detector, after the MMSE detector estimates $\hat{s}$, the LLR soft information used for channel decoding can be calculated with the following method. First, the estimated $\hat{s}$ and the calculated $W^{-1}$ are restored to the equivalent complex domain to get $\hat{s}_c$ and $W_c^{-1}$. Let $U = W_c^{-1} G_c$, with $G_c = H_c^H H_c$, denote the equalized channel matrix. The equalized signal produced by the MMSE filter matrix follows from Equations (2) and (4):

$$ \hat{s}_c = W_c^{-1} H_c^H y_c = U s_c + W_c^{-1} H_c^H n_c. \tag{6} $$

Then, the estimated value of the symbol transmitted by the ith user is $\hat{s}_{c,i} = \mu_i s_{c,i} + e_i$, where $\mu_i = U_{ii}$ represents the effective channel gain after equalization, and $e_i$ represents the noise plus interference (NPI) term contained in $\hat{s}_{c,i}$. Assuming unit-energy transmitted symbols, the NPI variance is

$$ \nu_i^2 = \sum_{j \neq i} \left|U_{ji}\right|^2 + E_{ii}\sigma^2, $$

where $U_{ji}$ and $E_{ii}$ represent the (j, i)th element of the matrix U and the ith diagonal element of the matrix E, respectively, with $E = W_c^{-1} G_c W_c^{-1}$. Using the max-log approximation given in [11], the LLR $L_{i,b}$ corresponding to the bth bit transmitted by the ith user is expressed as:

$$ L_{i,b} = \gamma_i \left( \min_{a \in O_b^0} \left| \frac{\hat{s}_{c,i}}{\mu_i} - a \right|^2 - \min_{a' \in O_b^1} \left| \frac{\hat{s}_{c,i}}{\mu_i} - a' \right|^2 \right), \tag{7} $$

where $\gamma_i = \mu_i^2 / \nu_i^2$ is the signal-to-interference-plus-noise ratio (SINR), and $O_b^0$ and $O_b^1$ represent the modulation symbol sets with the bth bit being 0 and 1, respectively.
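The max-log LLR computation for a single user can be sketched as follows. This is only an illustration under stated assumptions: the bit-to-symbol labelling in `QPSK` is a hypothetical Gray mapping chosen for the example, and the function name `max_log_llr` and its arguments (the equalized symbol, the effective gain, and the NPI variance) are not taken from the paper.

```python
import math


def max_log_llr(s_hat, mu, nu2, constellation):
    """Max-log LLRs for one user's equalized symbol.

    s_hat: equalized complex symbol, mu: effective channel gain,
    nu2: NPI variance, constellation: dict mapping bit tuples to symbols.
    A positive LLR favours bit value 1 under this sign convention.
    """
    gamma = mu * mu / nu2          # post-equalization SINR
    z = s_hat / mu                 # remove the gain bias before measuring distances
    n_bits = len(next(iter(constellation)))
    llrs = []
    for b in range(n_bits):
        # Squared distance to the nearest symbol whose b-th bit is 0 or 1
        d0 = min(abs(z - a) ** 2 for bits, a in constellation.items() if bits[b] == 0)
        d1 = min(abs(z - a) ** 2 for bits, a in constellation.items() if bits[b] == 1)
        llrs.append(gamma * (d0 - d1))
    return llrs


# Hypothetical Gray-labelled, unit-energy QPSK (labelling assumed for illustration)
QPSK = {
    (0, 0): complex(1, 1) / math.sqrt(2),
    (0, 1): complex(1, -1) / math.sqrt(2),
    (1, 0): complex(-1, 1) / math.sqrt(2),
    (1, 1): complex(-1, -1) / math.sqrt(2),
}

print(max_log_llr(0.8 * QPSK[(1, 1)], 0.8, 0.16, QPSK))
```

With a noise-free observation of the symbol labelled (1, 1), both LLRs come out positive, and their magnitude scales with the SINR, so unreliable users contribute weaker soft information to the decoder.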

Neumann Series Expansion
In a massive MIMO system, the MMSE signal detection algorithm involves a high-dimensional matrix inversion, $W^{-1}$, with a computational complexity of $O(K^3)$. In order to reduce the computational complexity of $W^{-1}$, the authors in [11] proposed using the Neumann series expansion to approximate the matrix inverse. When an invertible matrix X approximates W and satisfies

$$ \lim_{n \to \infty} \left(I - X^{-1} W\right)^n = 0, \tag{8} $$

the Neumann series can be expressed as

$$ W^{-1} = \sum_{n=0}^{\infty} \left(X^{-1}(X - W)\right)^n X^{-1}. \tag{9} $$

Decompose the matrix as W = D + E, where D is the diagonal matrix of W, and E is the hollow matrix corresponding to W. Since the number of antennas at the base station is much larger than the number of single-antenna users ($N \gg K$), the matrix W is diagonally dominant [3]; that is, W ≈ D. Substituting D for X in Equation (9) gives:

$$ W^{-1} = \sum_{n=0}^{\infty} \left(-D^{-1} E\right)^n D^{-1}. \tag{10} $$

When $\lim_{n \to \infty} \left(-D^{-1} E\right)^n = 0$, the series in Equation (10) converges. If only the first i terms of the Neumann series are kept, we get:

$$ W_i^{-1} = \sum_{n=0}^{i-1} \left(-D^{-1} E\right)^n D^{-1}. \tag{11} $$

When the value of i is small, the Neumann series expansion can approximate $W^{-1}$ with lower complexity. For example, when i = 2, $W_2^{-1} = D^{-1} - D^{-1} E D^{-1}$, whose computational complexity is $O(K^2)$.
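The truncated Neumann series with X = D can be sketched in a few lines of plain Python. The function name `neumann_inverse` and the small 2 × 2 test matrix are assumptions made for this illustration; the dense matrix product is written out explicitly only to keep the example free of third-party packages.

```python
def neumann_inverse(W, terms):
    """Approximate W^{-1} by the first `terms` terms of the Neumann series
    sum_n (-D^{-1} E)^n D^{-1}, where D is the diagonal of W and E = W - D."""
    n = len(W)
    d_inv = [1.0 / W[k][k] for k in range(n)]
    # M = -D^{-1} E (zero diagonal because E is hollow)
    M = [[0.0 if r == c else -d_inv[r] * W[r][c] for c in range(n)] for r in range(n)]
    # power holds M^n, starting from the identity (n = 0)
    power = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        # Accumulate M^n D^{-1}: right-multiplying by D^{-1} scales column c
        for r in range(n):
            for c in range(n):
                total[r][c] += power[r][c] * d_inv[c]
        # Advance to the next power of M
        power = [[sum(power[r][k] * M[k][c] for k in range(n)) for c in range(n)]
                 for r in range(n)]
    return total


W = [[4.0, 1.0], [1.0, 3.0]]       # small diagonally dominant example
print(neumann_inverse(W, 2))        # equals D^{-1} - D^{-1} E D^{-1}
print(neumann_inverse(W, 40))       # close to the exact inverse
```

For this example the exact inverse is (1/11) · [[3, -1], [-1, 4]]; the two-term approximation already has the right sign pattern, and the error shrinks geometrically as more terms are added, at the cost of one matrix-matrix product per extra term.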

Gauss-Seidel Algorithm
In the Neumann series expansion algorithm, when the number of expansion terms i ≥ 3, the computational complexity is still $O(K^3)$, which equals or even exceeds the complexity of exactly inverting the MMSE filter matrix. Unlike the Neumann series expansion, which approximates $W^{-1}$, the GS algorithm [19] can solve N-dimensional linear equations of the form Ax = b without inverting the matrix, where A is an N × N symmetric positive definite matrix, x is the N × 1 solution vector, and b is the N × 1 measurement vector. Decomposing A into a diagonal matrix $D_A$, a strictly lower triangular matrix $L_A$, and a strictly upper triangular matrix $L_A^H$, the GS algorithm can estimate x by the following iterative method:

$$ x^{(i)} = \left(D_A + L_A\right)^{-1} \left(b - L_A^H x^{(i-1)}\right), \tag{12} $$

where i = 1, 2, ... is the iteration index of the GS algorithm. In a massive MIMO system, when the number of base station antennas is much larger than the number of single-antenna users ($N \gg K$), the column vectors of the uplink channel matrix H are asymptotically orthogonal [20], and $W = G + \sigma^2 I_{2K}$ is a symmetric positive definite matrix. Similarly, W can be decomposed into:

$$ W = D + L + L^H, \tag{13} $$

where D, L, and $L^H$ are the diagonal part, the strictly lower triangular part, and the strictly upper triangular part of W, respectively. The GS algorithm can therefore avoid inverting the high-dimensional matrix and directly estimate the transmitted signal vector $\hat{s}$:

$$ \hat{s}^{(i)} = (D + L)^{-1} \left(\hat{y} - L^H \hat{s}^{(i-1)}\right), \tag{14} $$

where $\hat{s}^{(0)}$ represents the initial solution and is usually set to a zero vector.
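A minimal sketch of the GS iteration for W ŝ = ŷ is given next. Updating the entries in place, row by row, is equivalent to applying $(D + L)^{-1}$ by forward substitution, so no inverse is ever formed. The function name `gauss_seidel` and the test system are assumptions for illustration only.

```python
def gauss_seidel(W, y_mf, iterations, s0=None):
    """Gauss-Seidel iterations for W s = y_mf with W symmetric positive definite.

    Each sweep uses already-updated entries s[k] for k < m and
    previous-sweep entries for k > m, which realizes
    (D + L) s_new = y_mf - L^H s_old.
    """
    n = len(W)
    s = [0.0] * n if s0 is None else list(s0)   # zero vector by default
    for _ in range(iterations):
        for m in range(n):
            acc = y_mf[m] - sum(W[m][k] * s[k] for k in range(n) if k != m)
            s[m] = acc / W[m][m]
    return s


W = [[4.0, 1.0], [1.0, 3.0]]
y_mf = [1.0, 2.0]
print(gauss_seidel(W, y_mf, 25))   # approaches the exact solution [1/11, 7/11]
```

Because each entry depends on entries updated earlier in the same sweep, the inner loop is inherently sequential, which is the parallelization limitation usually attributed to this family of methods.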

Hybrid Iterative Algorithm Structure
The SD algorithm, which is based on a gradient search, has a good convergence direction at the beginning of the iteration [18], while the GS iterative algorithm has lower complexity. Using these characteristics, this paper proposes a hybrid iteration that joins the SD and GS algorithms. The joint algorithm (called the SDGS algorithm) speeds up convergence without increasing the complexity, and achieves error performance close to that of MMSE detection with exact matrix inversion. The steps of the SDGS algorithm are as follows.

• Step 1: Set up the diagonal approximation for the initial value. Equation (4) can be converted to $W\hat{s} = \hat{y}$; W is a symmetric positive definite and diagonally dominant matrix, so $W^{-1}$ is also a diagonally dominant matrix.

• Step 2: Determine the initial solution using $D^{-1}$ instead of $W^{-1}$:

$$ s^{(0)} = D^{-1} \hat{y}. \tag{15} $$

Since D is a diagonal matrix, calculating $D^{-1}$ requires only low complexity, and $s^{(0)}$ from Equation (15) is used as the initial value of the SD algorithm.

• Step 3: The first of the first two iterations is carried out by the SD algorithm, and the second, a GS iteration, can be expressed as:

$$ s^{(2)} = s^{(1)} + (D + L)^{-1} r^{(1)}, \tag{16} $$

where $r^{(1)} = \hat{y} - W s^{(1)}$ is the residual after the SD step.

• Step 4: Combine the single SD and GS iterations into one hybrid iteration by substituting

$$ s^{(1)} = s^{(0)} + u\, r^{(0)} \tag{17} $$

and $r^{(1)} = r^{(0)} - u\, p^{(0)}$ into Equation (16), where $r^{(0)} = \hat{y} - W s^{(0)}$, $p^{(0)} = W r^{(0)}$, and $u = \left(r^{(0)}\right)^H r^{(0)} / \left(\left(r^{(0)}\right)^H p^{(0)}\right)$ is the SD step size. This represents the first two iterations as

$$ s^{(2)} = s^{(0)} + u\, r^{(0)} + (D + L)^{-1} \left(r^{(0)} - u\, p^{(0)}\right). \tag{18} $$

Update the hybrid iteration value $\hat{s}^{(1)} = s^{(2)}$, and then perform the next GS iteration.

• Step 5: Starting from the (i − 1)th iterate and applying the GS iteration of Equation (14), the estimated value $\hat{s}^{(i)}$ of the transmitted signal vector s is obtained after an appropriate number of iterations, i:

$$ \hat{s}^{(i)} = (D + L)^{-1} \left(\hat{y} - L^H \hat{s}^{(i-1)}\right). \tag{19} $$

Then, $\hat{s}^{(i)}$ is converted back to the complex domain for the subsequent soft decision. The hybrid iterative algorithm can converge very quickly, after a small number of iterations.
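The five steps can be sketched in plain Python as follows. This is one reading of the hybrid iteration, assuming the standard steepest descent step size (squared residual norm divided by the residual's W-weighted norm) and forward substitution for $(D + L)^{-1}$; the names `forward_substitute` and `sdgs_detect` are hypothetical, and `iterations` counts the hybrid iteration as the first one.

```python
def forward_substitute(W, c):
    """Solve (D + L) x = c, where D + L is the lower triangle of W."""
    x = []
    for m in range(len(W)):
        partial = sum(W[m][k] * x[k] for k in range(m))
        x.append((c[m] - partial) / W[m][m])
    return x


def sdgs_detect(W, y_mf, iterations):
    """Hybrid steepest descent / Gauss-Seidel estimate of s in W s = y_mf."""
    n = len(W)

    def matvec(v):
        return [sum(W[r][c] * v[c] for c in range(n)) for r in range(n)]

    # Step 2: diagonal initial solution s0 = D^{-1} y_mf
    s0 = [y_mf[m] / W[m][m] for m in range(n)]
    # Steepest descent quantities: residual r0, p0 = W r0, step size u
    r0 = [y - w for y, w in zip(y_mf, matvec(s0))]
    p0 = matvec(r0)
    denom = sum(a * b for a, b in zip(r0, p0))
    u = sum(a * a for a in r0) / denom if denom > 0.0 else 0.0  # r0 = 0: already exact
    # Steps 3-4: one SD step fused with one GS step
    correction = forward_substitute(W, [a - u * b for a, b in zip(r0, p0)])
    s = [a + u * b + c for a, b, c in zip(s0, r0, correction)]
    # Step 5: remaining plain GS iterations
    for _ in range(iterations - 1):
        rhs = [y_mf[m] - sum(W[m][k] * s[k] for k in range(m + 1, n)) for m in range(n)]
        s = forward_substitute(W, rhs)
    return s


W = [[4.0, 1.0], [1.0, 3.0]]
print(sdgs_detect(W, [1.0, 2.0], 3))   # approaches [1/11, 7/11]
```

The fused update reuses $r^{(0)}$ and $p^{(0)}$, so the residual after the SD step never has to be recomputed with a separate matrix-vector product; this is the main saving of writing the two iterations as a single hybrid step.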

Approximate Log-Likelihood Ratio Calculation
The low-complexity MMSE signal detection algorithms described in [13][14][15][16] directly estimate the transmitted signal vector $\hat{s}$ without calculating $W^{-1}$. The exact calculation of the LLR for the channel decoder input, described in the Log Likelihood Ratio Calculation section above, uses the exact matrix inverse $W^{-1}$. From Equation (7), it is not difficult to see that computing the LLR $L_{i,b}$ of the bth bit transmitted by the ith user requires the inverse $W^{-1}$ of the MMSE detector filter matrix W again, in order to calculate the SINR of the ith user. Instead, consider using the diagonal approximation of W, that is, replacing $W^{-1}$ with $D^{-1}$ and converting it to the complex domain to get $\tilde{W}_c^{-1} = D_c^{-1}$, in order to obtain the approximate channel gain and NPI variance, expressed as:

$$ \tilde{\mu}_i = \tilde{U}_{ii}, \tag{20} $$

$$ \tilde{\nu}_i^2 = \sum_{j \neq i} \left|\tilde{U}_{ji}\right|^2 + \tilde{E}_{ii}\sigma^2, \tag{21} $$

where $\tilde{U} = D_c^{-1} G_c$ and $\tilde{E} = D_c^{-1} G_c D_c^{-1}$, with $G_c = H_c^H H_c$.
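The diagonal approximation of the channel gain and NPI variance can be sketched as below. The sketch assumes that the approximation simply substitutes the inverse of the diagonal of $W_c = G_c + \sigma^2 I$ for $W_c^{-1}$ in the exact expressions, and that transmitted symbols have unit energy; the function name `approx_gain_and_npi` is hypothetical.

```python
def approx_gain_and_npi(Hc, sigma2):
    """Approximate per-user gain and NPI variance using D^{-1} in place of W^{-1}.

    Hc: N x K list of rows with complex channel entries, sigma2: noise variance.
    Returns (mu, nu2), two lists of length K.
    """
    n_rx, n_users = len(Hc), len(Hc[0])
    # Gram matrix G_c = H_c^H H_c
    G = [[sum(Hc[n][r].conjugate() * Hc[n][c] for n in range(n_rx))
          for c in range(n_users)] for r in range(n_users)]
    # Diagonal of W_c = G_c + sigma2 * I is real and positive
    d = [G[k][k].real + sigma2 for k in range(n_users)]
    mu, nu2 = [], []
    for i in range(n_users):
        gain = G[i][i].real / d[i]                      # (D^{-1} G)_ii
        interference = sum(abs(G[j][i] / d[j]) ** 2     # |(D^{-1} G)_ji|^2
                           for j in range(n_users) if j != i)
        e_ii = G[i][i].real / d[i] ** 2                 # (D^{-1} G D^{-1})_ii
        mu.append(gain)
        nu2.append(interference + e_ii * sigma2)
    return mu, nu2


# Orthogonal columns: no inter-user interference, SINR reduces to G_ii / sigma2
mu, nu2 = approx_gain_and_npi([[1j, 0], [0, 1], [1, 0]], 0.5)
print(mu, nu2, [m * m / v for m, v in zip(mu, nu2)])
```

Only element-wise scalings of the Gram matrix are needed here, which is why the approximation adds a number of multiplications that grows quadratically, not cubically, with the number of users.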

Complexity Analysis
The computational complexity of the SDGS detection algorithm proposed in this paper is analyzed in terms of the number of real multiplications required. Since all linear MMSE detection algorithms, including the proposed algorithm, must calculate the filter matrix, $W = G + \sigma^2 I_{2K}$, and the matched filter signal, $\hat{y} = H^H y$, only the remaining parts are analyzed for complexity, which mainly consist of the following three parts.

GS iteration
Equation (19) can be expressed as $(D + L)\hat{s}^{(i)} = \hat{y} - L^H \hat{s}^{(i-1)} = c$. In each iteration, the cost of calculating $\hat{s}^{(i)}$ mainly comes from the following two steps. First, computing c requires multiplying the 2K × 2K strictly upper triangular matrix $L^H$ by the 2K × 1 vector $\hat{s}^{(i-1)}$, which takes $2K^2 - K$ multiplications.
Second, in Equation (19), the mth element, $\hat{s}_m^{(i)}$, can be expressed as:

$$ \hat{s}_m^{(i)} = \frac{1}{L_{mm}} \left( c_m - \sum_{k=1}^{m-1} L_{mk}\, \hat{s}_k^{(i)} \right), $$

where $c_m$ represents the mth element of c, and $L_{mk}$ represents the element in the mth row and kth column of the lower triangular matrix (D + L). When m = 1, it is obvious that $\hat{s}_1^{(i)}$ requires 2K multiplications, and all $\hat{s}_m^{(i)}$ (m = 2, ..., 2K) require $2K^2 - K$ multiplications, so a total of $2K^2$ multiplications are required for each iteration.

LLR calculation
The computational complexity of this part mainly comes from the calculation of the effective channel gain and the NPI variance after equalization. It can be seen from Equations (20) and (21) that all the elements of the matrix U and the diagonal elements of the matrix E need to be calculated. The former requires $2K^2$ multiplications, while the latter only requires 2K multiplications. Therefore, a total of $2K^2 + 2K$ multiplications are required for this step.
In summary, the total complexity required for the joint iterations to be applied to the soft decision is $2K^2(i + 2) + 12Ki$, which reduces the computational complexity by an order of magnitude, compared to the traditional MMSE algorithm. The complexity per iteration is kept at $O(K^2)$. In addition, for application scenarios with hard decision detection, Table 1 compares the computational complexity of the four detection algorithms.
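To make the quadratic growth concrete, the stated total can be evaluated directly. The helper name `sdgs_real_multiplications` is hypothetical; the formula is the one quoted in the text, and the values of K and i are arbitrary examples.

```python
def sdgs_real_multiplications(K, iterations):
    """Total real multiplications for soft-decision SDGS: 2K^2(i + 2) + 12Ki."""
    return 2 * K ** 2 * (iterations + 2) + 12 * K * iterations


# Doubling the number of users roughly quadruples the cost for a fixed i
print(sdgs_real_multiplications(16, 3))   # 3136
print(sdgs_real_multiplications(32, 3))   # 11392
```

The dominant term is $2K^2(i + 2)$, so for a small, fixed number of iterations the cost scales with the square of the number of users.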

Simulation Results
We used Matlab (R2017a, Mathworks, Natick, MA, USA) for analysis and experimentation. In order to verify the soft and hard detection performance of the SDGS algorithm proposed in this paper, this section presents Monte Carlo simulation results based on Matlab. The main simulation parameters are listed in Table 2. Figure 2 compares the bit error rate (BER) of the Neumann series (NS) expansion, the conjugate gradient (CG) detection algorithm, the Gauss-Seidel iterative detection algorithm, the MMSE exact inversion detection algorithm, and the proposed SDGS joint algorithm under different antenna configurations. The decision mode is a hard decision; that is, the estimated signal vector $\hat{s}$ is decided directly. The simulation results showed that the detection performance of all the algorithms improved with the number of iterations or the number of Neumann series terms. For example, with i = 2 iterations, the BER performance of the SDGS algorithm was much better than that of the Neumann series with two terms. Comparing Figure 2a,b shows that as the ratio of the number of base station antennas to the number of users (N/K) increased, the BER performance of all the algorithms improved greatly. For example, to reach a BER of $10^{-3}$, the MMSE algorithm and the proposed algorithm require an SNR of about 13 dB when the antenna configuration is 64 × 16, and only 8 dB when the configuration is 128 × 16.
Figure 3 shows the soft decision simulation results for the two antenna configurations. The BER of the NS, CG, and GS iterative detection algorithms, the MMSE exact inversion detection algorithm, and the SDGS joint algorithm were compared. We set the system's convolutional code rate to 1/2, and the LLR calculation used the approximate calculation method described in this paper. Simulation results showed that, no matter what kind of MMSE receiver was used, the soft decision detection performance was much better than the hard decision performance. For example, to reach a BER of $10^{-4}$ for a given antenna configuration, the MMSE algorithm and the proposed algorithm required an SNR of 10 dB for hard decisions and only 5 dB for soft decisions. In addition, for the same number of iterations, the BER performance of the SDGS algorithm proposed in this paper was better than that of the other three simplified algorithms, and after a few iterations, the detection performance quickly approached that of the ideal MMSE filter matrix inversion.
Figure 4 shows the hard decision BER comparison of the proposed SDGS algorithm with NS, CG, GS, and MMSE under a high fading scenario with a 128 × 16 antenna configuration. As can be seen from Figure 4, the BER of the proposed SDGS algorithm was better than that of the other simplified algorithms and followed the MMSE performance with increasing SNR and number of iterations. Moreover, due to the impact of high fading on the SNR, there was a gap between the proposed SDGS algorithm and the MMSE algorithm at a high SNR level.
Figure 5 shows the BER comparison of the proposed SDGS algorithm with NS, CG, GS, and MMSE under a low fading level and a 128 × 16 antenna configuration. It can be seen from Figure 5 that all the algorithms showed lower BER and better performance compared with the hard decision BER performance in Figures 2 and 4. Therefore, to keep the system performance at a suitable level, the fading level and the number of iterations should be considered, since both have an obvious impact on the system's overall performance. Furthermore, the proposed SDGS algorithm in Figure 5 had a BER performance close to that of MMSE, which indicated that the SDGS algorithm performed better at the low fading level.
Electronics 2018, 7, x FOR PEER REVIEW

Conclusions
Signal detection methods based on MMSE filtering are widely used in massive MIMO systems, but the high complexity of matrix inversion makes them difficult to implement in practical applications. Some approximate inversion methods, such as the Neumann series expansion, reduce the detection complexity but incur a large detection performance loss; others avoid the matrix inversion and directly estimate the signal vector. Although computational complexity is reduced by orders of magnitude, detection performance needs to be improved. Based on the MMSE criterion, this paper proposes a low-complexity, hybrid, iterative SDGS joint detection algorithm, which directly estimates the user's transmitted symbol vector and can quickly converge to an ideal estimate within a few simple iterations. The matrix inversion operation is avoided, and algorithm complexity is kept at $O(K^2)$. In addition, in order to make full use of soft information, the algorithm is applied to the soft decision, and an approximate calculation method of the LLR for channel decoding is given, which further improves the signal detection performance. Theoretical derivation and simulation results show that the SDGS algorithm can be used as one of the most effective solutions for signal detection in massive MIMO systems.

In the LLR expression, the scaling factor is the signal-to-interference-plus-noise ratio (SINR) of the ith user, and $O_b^0$ and $O_b^1$ represent the modulation symbol sets with the bth bit being 0 and 1, respectively.

Figure 4. Hard decision bit error rate (BER) performance comparison with high fading level at 128 × 16 antenna configuration.

Figure 5. Soft decision bit error rate (BER) performance comparison with slow fading level at 128 × 16 antenna configuration.