Saddle Point Approximation of Mutual Information for Finite-Alphabet Inputs over Doubly Correlated MIMO Rayleigh Fading Channels

Abstract: Because the mutual information of finite-alphabet inputs cannot be calculated concisely and accurately over fading channels, this paper proposes a new method to calculate it. First, the applicability of the saddle point method is studied; the mutual information is then estimated by the saddle point approximation for known channel state information. Furthermore, we derive the expectation of the mutual information over doubly correlated multiple-input multiple-output (MIMO) Rayleigh fading channels. The validity of the saddle point approximation is verified by comparing its numerical results with those of the Monte Carlo method under different doubly correlated MIMO fading channel scenarios.


Introduction
Mutual information plays an essential role in the theoretical analysis of communication system performance, including the analysis, evaluation, and optimization of transceiver structures [1], encoding and decoding schemes [2], and bit error rate (BER) performance [3], and it therefore attracts increasing research interest. Channel capacity, defined as the upper bound of mutual information, is achieved with Gaussian inputs over additive white Gaussian noise (AWGN) channels [4]. A large body of theoretical analysis builds on this concept [5][6][7]. In [1], two lower bounds on system capacity are obtained by means of Minkowski's inequality and are used as selection criteria for antenna subsets in spatial multiplexing systems. In [2], the hybrid encoder and combiners are designed under Gaussian inputs by maximizing the achievable spectral efficiency (SE).
However, Gaussian inputs are rarely realized in practice: the unbounded amplitude of the Gaussian distribution may require infinite transmit power, and its continuity makes the signal difficult to detect and decode at the receiver. In practical communication systems, inputs are usually drawn with equal probability from finite-alphabet constellation sets rather than from a Gaussian distribution [8]. Considerable performance gaps exist between Gaussian and finite-alphabet inputs [9][10][11], so strategies designed for Gaussian inputs can deviate from the optimum. For example, the traditional capacity-achieving strategy for Gaussian inputs allocates more power to the sub-channels with a larger signal-to-noise ratio (SNR). In [10], it is demonstrated that this strategy may be quite suboptimal, because the mutual information with finite-alphabet inputs is upper bounded and there is little incentive to allocate more power to sub-channels already close to saturation. At the same time, channel capacity reflects only the upper bound of communication system performance, so the performance of digital communication systems can be evaluated accurately by the mutual information under finite-alphabet inputs and the actual transmission environment [12,13].
Because the direct calculation of mutual information is highly complex, a closed-form solution is almost impossible to obtain. Monte Carlo trials are usually used for direct and accurate calculation [14]. To reduce the computational complexity, a bit-level algorithm that uses the probability density function (PDF) of the log-likelihood ratio (LLR) to calculate mutual information is proposed in [15]. However, the algorithm requires many prior simulations for each modulation mode and is suitable only for specific scenarios. To optimize linear precoding with finite-alphabet inputs, the authors of [16,17] derived closed-form lower and upper bounds on the mutual information as alternatives, which reduce the computational effort by several orders of magnitude compared with calculating the average mutual information directly. The bounds were then applied to multiple antennas, secure cognitive radio networks [18], and relay networks [19]. The study in [20] used the cutoff rate (CR) as an alternative to the mutual information (MI) to design linear precoders. Mutual information was also used in [11] to develop a two-step algorithm that enhances the achievable secrecy rate of cooperative jamming for secure communication with finite-alphabet inputs. However, the gaps between these approximations and the exact mutual information remain unclear, which limits the range where mutual information can be applied. Recently, the authors of [21] approximated the ergodic mutual information by multi-exponential decay curve fitting under M-ary quadrature amplitude modulation (M-QAM) signaling, but other modulation modes were not considered.
This study takes a step toward accurately evaluating the mutual information for finite-alphabet-based transmissions over doubly correlated MIMO fading channels. After discussing the applicability of the saddle point method, we use it to obtain an approximate solution of the mutual information that holds for any known CSI and modulation mode. On this basis, the expectation of the mutual information over doubly correlated MIMO Rayleigh fading channels is further derived. The resulting approximation offers considerable accuracy with radically reduced complexity.
The outline of this paper is as follows. The second section introduces the MIMO transmission model and preliminaries. In the third section, the saddle point approximation method is used to estimate the mutual information, and the expectation of the mutual information over doubly correlated MIMO Rayleigh fading channels is then calculated. The fourth section verifies the validity and accuracy of the proposed method under different doubly correlated MIMO fading channel scenarios. Finally, the fifth section gives conclusions.

Model of MIMO Transmission
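Consider a MIMO system with $N_T$ transmitting antennas and $N_R$ receiving antennas. Let $\mathbf{x} \in \mathbb{C}^{N_T \times 1}$ ($\mathbb{C}^{N \times m}$ denotes the space of $N \times m$ complex matrices) be the transmitted signal vector, satisfying $E_{\mathbf{x}}\{\mathbf{x}\} = \mathbf{0}$ and $E_{\mathbf{x}}\{\mathbf{x}\mathbf{x}^H\} = \mathbf{I}$, where $E_{(\cdot)}\{*\}$ stands for the statistical expectation of the random quantity $*$ with respect to the variable $\cdot$; $\mathbf{I}$ and $\mathbf{0}$ denote an identity matrix and a zero matrix of appropriate dimensions, respectively; and $(\cdot)^H$ is the conjugate transpose. MIMO transmission is generally modeled by

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}, \qquad (1)$$

under the following preliminaries.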

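1. $\mathbf{H} \in \mathbb{C}^{N_R \times N_T}$ is the complex fading channel matrix between the transmitting and receiving antenna arrays. The doubly correlated MIMO Rayleigh fading channel is modeled by $\mathbf{H} = \mathbf{\Psi}_R^{1/2}\mathbf{H}_{WG}\mathbf{\Psi}_T^{1/2} \in \mathbb{C}^{N_R \times N_T}$ [1], where $\mathbf{H}_{WG} \in \mathbb{C}^{N_R \times N_T}$ is a matrix of independent and identically distributed $\mathcal{CN}(0, 1)$ complex Gaussian entries, and $\mathbf{\Psi}_T$ and $\mathbf{\Psi}_R$ are the transmitting and receiving correlation matrices, respectively. They can be expressed as $\mathbf{\Psi}_T = \mathbf{U}_T\mathbf{\Sigma}_T\mathbf{U}_T^H$ and $\mathbf{\Psi}_R = \mathbf{U}_R\mathbf{\Sigma}_R\mathbf{U}_R^H$, where $\mathbf{U}_T$ and $\mathbf{U}_R$ are unitary matrices whose columns are the eigenvectors of $\mathbf{\Psi}_T$ and $\mathbf{\Psi}_R$, and $\mathbf{\Sigma}_T$ and $\mathbf{\Sigma}_R$ are diagonal matrices whose diagonal entries are the eigenvalues of $\mathbf{\Psi}_T$ and $\mathbf{\Psi}_R$, respectively.
2. $\mathbf{w} \in \mathbb{C}^{N_R \times 1}$ stands for the AWGN at the $N_R$ receiving antennas. Its elements are independent and identically distributed complex Gaussian variables, satisfying $E_{\mathbf{w}}\{\mathbf{w}\} = \mathbf{0}$ and $E_{\mathbf{w}}\{\mathbf{w}\mathbf{w}^H\} = \sigma^2\mathbf{I}$.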
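The channel model above can be sketched numerically. The following is a minimal illustration, assuming NumPy; the helper names (`hermitian_sqrt`, `sample_channel`, `transmit`) and the example correlation values are hypothetical and chosen only for this sketch. The matrix square roots are taken through the eigendecomposition of each correlation matrix, mirroring the factorization into eigenvectors and eigenvalues described in item 1.

```python
import numpy as np

def hermitian_sqrt(psi):
    """Hermitian square root of a positive semidefinite matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(psi)       # psi = U diag(eigvals) U^H
    eigvals = np.clip(eigvals, 0.0, None)        # guard against tiny negative round-off
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.conj().T

def sample_channel(psi_r, psi_t, rng):
    """Draw H = psi_r^{1/2} H_WG psi_t^{1/2}, with i.i.d. CN(0, 1) entries in H_WG."""
    n_r, n_t = psi_r.shape[0], psi_t.shape[0]
    h_wg = (rng.standard_normal((n_r, n_t))
            + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
    return hermitian_sqrt(psi_r) @ h_wg @ hermitian_sqrt(psi_t)

def transmit(h, x, sigma2, rng):
    """Return y = H x + w with w ~ CN(0, sigma2 * I)."""
    n_r = h.shape[0]
    w = np.sqrt(sigma2 / 2) * (rng.standard_normal(n_r)
                               + 1j * rng.standard_normal(n_r))
    return h @ x + w

rng = np.random.default_rng(0)
psi_t = np.array([[1.0, 0.5], [0.5, 1.0]])   # illustrative correlation values only
psi_r = np.array([[1.0, 0.3], [0.3, 1.0]])
h = sample_channel(psi_r, psi_t, rng)
x = np.array([1.0, -1.0])                    # one BPSK symbol per transmitting antenna
y = transmit(h, x, sigma2=0.1, rng=rng)
```

Each call to `sample_channel` yields one realization of H for fixed statistical CSI, which is the quantity a Monte Carlo average over the channel would draw repeatedly.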

Mutual Information for Finite-Alphabet Inputs
When a linear unitary transform $\mathbf{U}_R^H$ is applied to the received signal $\mathbf{y}$, the MIMO model in (1) is equivalent to a model with channel matrix $\mathbf{U}_R^H\mathbf{H}$ and noise $\mathbf{U}_R^H\mathbf{w}$ [16]. $N_i$ denotes the number of symbols in the $i$-th discrete constellation $\Omega_i$. The mutual information for finite-alphabet inputs between $\mathbf{x}$ and $\mathbf{y}$, $I(\mathbf{x}; \mathbf{y}) = I(\mathbf{U}_T^H\mathbf{x}; \mathbf{y})$, is given by (4) [22], where $\mathbf{c}_{m,k} = \mathbf{\Sigma}_R^{1/2}\mathbf{H}_{WG}\mathbf{\Sigma}_T^{1/2}\mathbf{U}_T^H\mathbf{d}_{m,k}$ and $\mathbf{d}_{m,k} = \mathbf{q}_m - \mathbf{q}_k$; $\mathbf{q}_m$ and $\mathbf{q}_k$ are the $m$-th and $k$-th points in the constellation of $\mathbf{x}$, and $\|\cdot\|$ stands for the Euclidean norm.
Since the statistical channel state information (CSI) varies much more slowly than the instantaneous $\mathbf{H}$ and can be obtained by channel estimation, this work assumes that the statistical CSI is a perfectly known constant. Consequently, the problem is to calculate the average mutual information for given statistical CSI.
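For reference, the Monte Carlo baseline for one fixed channel realization can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes equiprobable inputs and the expression commonly used in the finite-alphabet literature, $I = \log_2 N - \frac{1}{N}\sum_m E_{\mathbf{w}}\{\log_2 \sum_k \exp[-(\|\mathbf{H}\mathbf{d}_{m,k} + \mathbf{w}\|^2 - \|\mathbf{w}\|^2)/\sigma^2]\}$, which is assumed here to match the form of (4). The function names and the default sample count are hypothetical.

```python
import itertools
import numpy as np

def constellation_vectors(alphabets):
    """All N = prod(N_i) transmit vectors, one per row, from per-antenna alphabets."""
    return np.array(list(itertools.product(*alphabets)), dtype=complex)

def mutual_information_mc(h, points, sigma2, n_w=10_000, seed=0):
    """Monte Carlo estimate (bits per channel use) of I(x; y | H), equiprobable inputs."""
    rng = np.random.default_rng(seed)
    n_points, n_r = points.shape[0], h.shape[0]
    diffs = points[:, None, :] - points[None, :, :]   # d[m, k] = q_m - q_k
    c = diffs @ h.T                                   # c[m, k] = H d[m, k]
    w = np.sqrt(sigma2 / 2) * (rng.standard_normal((n_w, n_r))
                               + 1j * rng.standard_normal((n_w, n_r)))
    w_energy = np.sum(np.abs(w) ** 2, axis=1)         # ||w||^2 for every noise sample
    total = 0.0
    for m in range(n_points):
        # dist[s, k] = ||c[m, k] + w[s]||^2
        dist = np.sum(np.abs(c[m][None, :, :] + w[:, None, :]) ** 2, axis=2)
        exponent = -(dist - w_energy[:, None]) / sigma2
        # log-sum-exp over k, stabilized by subtracting the row maximum
        peak = exponent.max(axis=1, keepdims=True)
        log_sum = peak[:, 0] + np.log(np.sum(np.exp(exponent - peak), axis=1))
        total += np.mean(log_sum) / np.log(2)
    return np.log2(n_points) - total / n_points

# BPSK over a SISO AWGN channel (H = 1)
points = constellation_vectors([[1.0, -1.0]])
h = np.array([[1.0]])
mi = mutual_information_mc(h, points, sigma2=0.5)
```

The cost grows with the number of noise samples times $N^2$, where $N$ is the product of the per-antenna constellation sizes, which illustrates why a direct evaluation becomes expensive for larger constellations and antenna counts.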

Saddle Point Approximation for Mutual Information
Computing the mutual information by (4) requires multiple integrals to evaluate the expectation over $\mathbf{H}_{WG}$ and $\mathbf{w}$. As $N$ increases, the complexity becomes prohibitive and is the most significant obstacle to obtaining accurate mutual information. Therefore, in the spirit of the mean value theorem for integrals, we simplify the multiple integrals by finding an appropriate point. In this section, we explore the saddle point approximation method, which replaces the expectation over all possible samples of the random $\mathbf{H}_{WG}$ and $\mathbf{w}$ with a weighted mean over the constellation set of $\mathbf{x}$.

Saddle Point Approximation
We first consider the expectation over the AWGN vector $\mathbf{w}$ and expand (4) in a Taylor series, which gives (5). Before proceeding to the saddle point approximation, we need to establish the following lemma, which guarantees the existence of the saddle point.
Here, $\rho_{m,k}$ is a positive real number in the open interval $(0, 1)$ and satisfies $\sum_{k=1}^{N}\rho_{m,k} = 1$. The proof of Lemma 1 is shown in Appendix A.
We now perform the saddle point approximation at $\mathbf{w} = \mathbf{w}_0$.
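The idea behind a saddle point (Laplace-type) approximation can be illustrated on a one-dimensional toy integral. This sketch is not the paper's integrand $\sigma_m^{-q}(\mathbf{w})$; it is an assumed stand-in chosen because its exact value is known. The integrand $\exp[g(t)]$ is replaced by a Gaussian fitted at the maximizer $t_0$ of $g$, so that $\int \exp[g(t)]\,dt \approx \exp[g(t_0)]\sqrt{2\pi/(-g''(t_0))}$. The function name `laplace_approximation` is hypothetical.

```python
import math

def laplace_approximation(g, g_second, w0):
    """Approximate the integral of exp(g(w)) by a Gaussian fitted at the maximizer w0.

    g_second is the second derivative of g and must be negative at w0.
    """
    curvature = -g_second(w0)
    if curvature <= 0:
        raise ValueError("w0 is not a strict local maximum of g")
    return math.exp(g(w0)) * math.sqrt(2.0 * math.pi / curvature)

# Toy integrand: the integral over t > 0 of exp(n ln t - t) equals n!,
# and the exponent is maximized at t0 = n.
n = 20
approx = laplace_approximation(
    g=lambda t: n * math.log(t) - t,
    g_second=lambda t: -n / t**2,
    w0=float(n),
)
exact = math.factorial(n)
relative_error = abs(approx - exact) / exact
```

For this toy case the result is Stirling's formula, whose relative error is roughly $1/(12n)$. In the multivariate setting the scalar curvature is replaced by the determinant of the Hessian at the maximizer, which is why the existence of a maximum and the definiteness of the Hessian are established first.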

Proposition 1.
For a non-zero natural number $q$, the integral over the complex AWGN vector $\mathbf{w}$ is approximated by (8). The proof of Proposition 1 is shown in Appendix B.
The mutual information is then approximated by combining (5) and (8), which gives (10). Generally speaking, a closed-form solution can hardly be obtained; that is, the exact expression of $\alpha_{m,k}$ cannot be written down. Instead, we adopt a numerical method and obtain an approximate $\alpha_{m,k}$ from the minimization over $\alpha_{m,k}$, $k = 1, 2$, in (11), where $I(\mathbf{x}; \mathbf{y}|1)$ is computed by the Monte Carlo method, taking BPSK over a single-input single-output (SISO) AWGN channel (that is, $\mathbf{H} = 1$) as an example, and $\alpha_{m,k}$ is fixed at each signal-to-noise ratio (SNR). With this choice, (10) can be written as (12).
Appl. Sci. 2021, 11, 4700

Average Mutual Information over Doubly Correlated MIMO Rayleigh Fading Channels
The average mutual information over doubly correlated MIMO Rayleigh fading channels is computed by (13). It is still quite hard to compute the expectation over $\mathbf{H}_{WG}$; consequently, (13) remains unsuitable for theoretical applications. When the SNR varies from $-\infty$ to $+\infty$, $\alpha_{m,k}$ satisfies $1/3 < \alpha_{m,k} < 1/2$ by (11), so the average mutual information is approximated by (14). To simplify the calculation, the following proposition provides an approximate solution.

Proposition 2.
The average mutual information integral over $\mathbf{H}_{WG}$ is lower bounded by (15), where $\mathbf{P} = \mathbf{U}_T^H$. The proof of Proposition 2 is shown in Appendix C.

Simulation Verification and Result Analysis
This section presents examples that illustrate the accuracy of the saddle point approximation method. We considered an exponential correlation model. According to [23], the elements of the transmitting and receiving correlation matrices are expressed in terms of the correlation coefficients $\rho_T, \rho_R \in [0, 1)$.
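As an illustration, an exponential correlation matrix can be generated as below. This sketch assumes the common real-valued form $[\mathbf{\Psi}]_{i,j} = \rho^{|i-j|}$; the exact expression taken from [23] is not reproduced in this text, so the form and the function name `exponential_correlation` are assumptions.

```python
import numpy as np

def exponential_correlation(n, rho):
    """n x n matrix with entries rho**|i - j| (assumed form), for 0 <= rho < 1."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

psi_t = exponential_correlation(2, 0.5)   # transmit side, rho_T = 0.5 (example value)
psi_r = exponential_correlation(3, 0.8)   # receive side, rho_R = 0.8 (example value)
```

Setting `rho` to 0 gives the identity matrix (uncorrelated antennas), and values closer to 1 give stronger correlation between neighboring antennas.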

Accuracy of Saddle Point Approximation
In Figures 1 and 2, doubly correlated Rayleigh fading and Rician fading channel models were considered, respectively. We compared the average mutual information obtained by the Monte Carlo method with that obtained by the saddle point approximation in (12). Different input types (BPSK, QPSK, QAM, 8PSK, and 16QAM) were assigned to the transmitting antennas to ensure generality. As the SNR increased, the mutual information rose, and it saturated once the SNR exceeded 15 dB. In these cases, (12) offered a very good approximation to the average mutual information for known channel state information.

Conciseness of Saddle Point Approximation
The average mutual information has no closed-form expression, so it is usually calculated by the Monte Carlo method. The more sample points are used, the more accurate the result. We denote the number of sample points by $N_W$. To obtain a relatively accurate value of mutual information, $N_W$ was at least $10^4$. Tables 1 and 2 compare the computational complexity of the Monte Carlo method and the saddle point approximation method in terms of the number of operations and the CPU time for $N_W = 10^4$. The codes for both methods were executed on an Intel Core i5-5200U 2.20 GHz processor. The results showed that the computational complexity of the proposed saddle point approximation method was much lower than that of the traditional Monte Carlo method. For example, as shown in Table 2, when $N_T = N_R = 2$ and both transmitting antennas used 16QAM, the CPU time of the saddle point approximation by (15) was several orders of magnitude less than that of the Monte Carlo method.
The symbol "/" indicates that the CPU time exceeded half an hour.

Conclusions
This paper studied the numerical calculation of mutual information for finite-alphabet-based transmissions over doubly correlated MIMO fading channels. The average mutual information is governed by the statistical CSI, and the main obstacle to computing it is complexity. We first examined the applicability of the saddle point method. The mutual information over any known channel model was then calculated by the saddle point approximation. Furthermore, we derived the expectation of the mutual information over doubly correlated MIMO Rayleigh fading channels. Numerical results for various MIMO scenarios showed the efficacy of the proposed method. Compared with existing results, the proposed approximation estimates the average mutual information with considerable accuracy and radically reduced complexity. Its accuracy and convenience are expected to facilitate the practical application of mutual information.

Acknowledgments:
The authors wish to thank the reviewers for their valuable comments and suggestions concerning this manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
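Define the real-valued vectors $\bar{\mathbf{c}}_{m,k}$ and $\bar{\mathbf{w}}$ for computational simplicity as in (A1), where $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$ stand for the real and imaginary components of a complex quantity. $\bar{\mathbf{c}}_{m,k}$ and $\bar{\mathbf{w}}$ are $2N_R \times 1$ real vectors ($\bar{\mathbf{c}}_{m,k} \in \mathbb{R}^{2N_R \times 1}$ and $\bar{\mathbf{w}} \in \mathbb{R}^{2N_R \times 1}$) that satisfy $\|\bar{\mathbf{c}}_{m,k}\|^2 = \|\mathbf{c}_{m,k}\|^2$ and $\|\bar{\mathbf{w}}\|^2 = \|\mathbf{w}\|^2$. By (6) and (A1), we have
Since $q$ is a positive integer, it is easy to verify that
Therefore, $\sigma_m^{-q}(\bar{\mathbf{w}})$ has a maximum value, which satisfies the conditions of the saddle point approximation. Assume that $\sigma_m^{-q}(\bar{\mathbf{w}})$ achieves its maximum at
where $\mathcal{H}\{\ln[\sigma_m^{-q}(\bar{\mathbf{w}})]\}|_{\bar{\mathbf{w}}_0}$ is the Hessian matrix of $\ln[\sigma_m^{-q}(\bar{\mathbf{w}})]$ at $\bar{\mathbf{w}} = \bar{\mathbf{w}}_0$. By the fact that $\sigma_m(\bar{\mathbf{w}}) > 0$ and $q > 0$, for
and then the Hessian matrix $\mathcal{H}\{\ln[\sigma_m^{-q}(\bar{\mathbf{w}})]\}_{i,j}(\bar{\mathbf{w}})|_{\bar{\mathbf{w}}_0}$ is rewritten as
Recalling (A6), $\mathbf{J}_F|_{(\bar{\mathbf{c}}_{m,1}, \ldots, \bar{\mathbf{c}}_{m,N}, \bar{\mathbf{w}})}$ is a positive definite matrix at $\bar{\mathbf{w}}_0$, so it is invertible. Consequently, $\bar{\mathbf{w}}_0$ can, in principle, be expressed in terms of $\bar{\mathbf{c}}_{m,1}, \bar{\mathbf{c}}_{m,2}, \ldots, \bar{\mathbf{c}}_{m,N}$ by the implicit function theorem. Namely, the maximum of $\sigma_m^{-q}(\bar{\mathbf{w}})$ is achieved on the condition that $\bar{\mathbf{w}}_0$ satisfies (A5). Note that a complex vector is zero if and only if both its real and imaginary parts are zero vectors, so we have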

Appendix B
Lemma 1 states that $\sigma_m^{-q}(\bar{\mathbf{w}})$ is maximized at $\bar{\mathbf{w}}_0$. By (A5), we have
Note that a positive definite matrix $\mathbf{A}$ is invertible, and the determinant of $\mathbf{A}$ can be computed by $\exp[\mathrm{Tr}\ln(\mathbf{A})]$, where $\mathrm{Tr}\,\mathbf{A}$ stands for the trace of $\mathbf{A}$. Recalling (A1) and (A6), the saddle point approximation can be computed by the Gaussian integral
According to (A5), we can derive
Therefore, (A13) and (A14) demonstrate that the Gaussian integral is dominated by $\|\mathbf{c}_{m,k}\|^2/\sigma^2$ in terms of the exponential. Define a multiplier $\alpha_{m,k}$ dominated by $\|\mathbf{c}_{m,k}\|^2/\sigma^2$,