Article

Worst-Case Robust Training Design for Correlated MIMO Channels in the Presence of Colored Interference

1
Department of Artificial Intelligence, Kyungpook National University, Daegu 41566, Republic of Korea
2
Department of Information and Communications Engineering, Pukyong National University, Busan 48513, Republic of Korea
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(13), 2168; https://doi.org/10.3390/math13132168
Submission received: 7 May 2025 / Revised: 12 June 2025 / Accepted: 25 June 2025 / Published: 2 July 2025

Abstract

The covariance information at the transmitter side is often subject to mismatches due to various impairments. This paper considers a training design problem for multiple-input multiple-output (MIMO) systems when both channel and interference covariance matrices are imperfect at the transmitter side. We first derive the structure of the optimal training signal, minimizing the worst-case mean square error (MSE). With the training structure, the original problem becomes a simple power allocation problem. We propose a numerical optimal power allocation scheme and a closed-form suboptimal power allocation scheme. Simulation results show that the proposed schemes considerably outperform the conventional schemes in terms of the worst-case MSE and bit error rate (BER) performances, and the proposed closed-form training scheme has comparable performance to that of the optimal one. For example, the proposed schemes yield more than 2.5 dB signal-to-interference ratio (SIR) gains at a BER of $10^{-4}$.

1. Introduction

In the past decades, it has been shown that the use of multiple antennas at both ends of a communication link, so-called multiple-input multiple-output (MIMO), can provide either diversity gain or multiplexing gain [1,2]. For this reason, MIMO techniques have been employed in many system standards such as IEEE 802.11 for wireless local area networks (WLANs), IEEE 802.16, and 3rd Generation Partnership Project Long Term Evolution Advanced (3GPP LTE-A) for wireless communications, as a solution to demands for high data rates or low error rates [3,4]. To fully exploit the advantages that MIMO systems offer, multiple channel elements need to be accurately estimated. For example, in IEEE 802.11ac [5], accurate channel state information (CSI) at both ends is typically required for both transmit beamforming and coherent detection. Therefore, accurate channel estimation is an important problem for practical MIMO systems.

1.1. Prior Works and Limitations

In most current standards such as WLAN and WiMAX, unknown channel parameters are usually estimated at the receiver by sending a priori known training (or pilot) symbols from the transmitter [5,6]. Even though an arbitrary training signal can be used for channel estimation, it is possible to significantly enhance the estimation performance by optimizing the training symbols based on some prior knowledge on the channel and noise (or possibly interference) statistics. For this reason, training design problems have received considerable interest in recent years [7,8,9,10,11,12]. Most of these works used the mean square error (MSE) of channel estimation as a performance metric to improve the estimation accuracy [7,8,9,10,11]. On the other hand, another criterion, maximization of the conditional channel entropy, was considered in [12]. In the traditional training schemes [7,8,9,10,11,12], the perfect knowledge of both the channel and noise covariance matrices at both ends was commonly required for optimal design.
In practice, the covariance information about the channel and noise should be estimated at the receiver from received samples in recent consecutive training blocks [9,10]. The covariance information at the receiver side is reasonably assumed to be perfect when there is a sufficient number of samples [9,10]. In contrast, the covariance information at the transmitter side is usually acquired by means of limited feedback from the receiver, which is often called covariance feedback [13,14]. Due to feedback-related issues, e.g., quantization, feedback errors, and delay, the covariance information at the transmitter is imperfect in general [15,16,17]. For this reason, it might be more practical to take such imperfection into account for training designs. In the literature, there are few studies concerning this issue [18,19,20].
Except for the work by Shariati et al. [20], all previous works have considered the white noise case only. However, the assumption of white noise is not always true in some practical scenarios. For example, in multi-user or small-cell environments, the noise term is no longer white due to the presence of non-negligible interference [21]. Also, for several applications such as mesh networking in IEEE 802.11s [22] and relay-aided systems such as IEEE 802.11ah [23] and IEEE 802.16j [24], total noise is often colored because the relays broadcast background noise as well as received data streams. Although the work in [20] considered a colored interference scenario, this work assumed full knowledge of the interference covariance at the transmitter side. This assumption is suitable if the covariance mismatch of the interference is sufficiently small compared with that of the channel. When the interference covariance matrix is subject to substantial errors at the transmitter, however, the scheme in [20] may become ineffective since it cannot properly deal with the interference covariance uncertainty. Moreover, since the iterative algorithm in [20] was designed with a heuristic approach, it may require considerable complexity during the channel training phase.

1.2. Motivations and Contributions

Motivated by the aforementioned discussions and to break through the limitations of the existing techniques, in this paper, we develop a novel and high-performing training strategy for the estimation of correlated MIMO channels in the presence of colored interference based on the worst-case robustness philosophy. The detailed technical contributions of our work are described below:
  • We propose a general framework for robust training optimization under imperfect channel and interference covariance information at the transmitter. Particularly, we design an optimal training signal for MIMO systems with interference, taking the imperfection of both the channel and interference covariance into account.
  • In our proposed framework, the worst-case MSE criterion is used as a performance metric similar to previous works [18,19,20]. In contrast to the previous problems, however, the considered design problem is not convex–concave due to the uncertainty in the interference covariance, and consequently the design of the training signal is more complicated. To solve the problem, we take innovative approaches: we initially derive an optimal structure of the training signal, which includes the existing training structures as special cases, and then we solve the training power allocation problem.
  • Two power allocation schemes are proposed. First, an optimal power allocation is determined numerically by finding an optimal solution. Next, to reduce the complexity required for optimal power allocation, a closed-form power allocation scheme is proposed by finding a suboptimal solution.
  • Based on the latter power allocation strategy, we also propose a suboptimal, yet closed-form, training scheme with low complexity.
  • We compare the performance of the proposed schemes with that of the conventional schemes by simulations. Through numerical results, we empirically demonstrate that the proposed schemes substantially surpass the existing schemes with remarkable performance improvements and the proposed suboptimal training scheme provides comparable performance to that of the optimal training scheme.

1.3. Organization and Notation

The organization of this paper is as follows. In Section 2, the system model is described and the optimization problem is formulated. The proposed training schemes are described in Section 3. The simulation results for the performance comparison are presented in Section 4. We conclude the paper in Section 5.
Notation: $(\cdot)^T$, $(\cdot)^*$, $(\cdot)^H$, $\otimes$, $\mathrm{vec}(\cdot)$, and $\mathrm{Tr}(\cdot)$ denote the transpose, conjugate, conjugate transpose, Kronecker product, vectorization, and trace operators, respectively. $\|\cdot\|_F$ denotes the Frobenius norm of a matrix. The notation $\mathbf{A} \succeq \mathbf{0}$ means that a Hermitian matrix $\mathbf{A}$ is positive semi-definite. The Cartesian product of two sets $\mathcal{A}$ and $\mathcal{B}$ is denoted as $\mathcal{A} \times \mathcal{B}$. The notations $\mathrm{Diag}(a_1, \ldots, a_m)$ and $\mathrm{Blkdiag}(\mathbf{A}_1, \ldots, \mathbf{A}_m)$ denote a diagonal matrix whose diagonal elements are given by $a_1, \ldots, a_m$ and a block diagonal matrix whose diagonal blocks are given by the matrices $\mathbf{A}_1, \ldots, \mathbf{A}_m$, respectively. $\nabla_{\mathbf{A}}$ and $\nabla_{\mathbf{A}}^2$ represent the gradient and Hessian operators with respect to (w.r.t.) the variable $\mathbf{A}$. $E(\cdot)$ denotes the expectation operator, and a circularly symmetric Gaussian random vector $\mathbf{a}$ with mean $\bar{\mathbf{a}}$ and covariance matrix $\mathbf{A}$ is denoted by $\mathbf{a} \sim \mathcal{CN}(\bar{\mathbf{a}}, \mathbf{A})$.

2. System Model and Problem Formulation

2.1. System Model

As depicted in Figure 1, we consider a MIMO system consisting of a transmitter equipped with $M_t$ antennas, a receiver equipped with $M_r$ antennas, and a total of $K$ interferers, where the $k$th interferer has $M_k$ antennas, $k = 1, \ldots, K$. It is assumed that the background additive noise is much weaker than the interference, and hence we ignore the additive noise term for convenience (this is valid in practice as the noise power ranges from −192.5 dBm/Hz to −174 dBm/Hz, whereas the interference signal power ranges from −100 dBm to −10 dBm). In the channel training procedure, multiple training symbols are sent from the transmitter during $L$ symbol times. The received signal matrix is then given by
$$\mathbf{Y} = \mathbf{H}\mathbf{P} + \underbrace{\sum_{k=1}^{K} \mathbf{H}_k \mathbf{S}_k}_{=\,\mathbf{N}} = \mathbf{H}\mathbf{P} + \mathbf{N}, \tag{1}$$
where $\mathbf{P} \in \mathbb{C}^{M_t \times L}$ and $\mathbf{S}_k \in \mathbb{C}^{M_k \times L}$ represent the training signal matrix sent from the transmitter and the interfering signal matrix sent from the $k$th interferer, respectively. We simply denote the total interference as $\mathbf{N}$. The channel matrix between the transmitter and receiver is denoted by $\mathbf{H} \in \mathbb{C}^{M_r \times M_t}$ and that between the $k$th interferer and receiver by $\mathbf{H}_k \in \mathbb{C}^{M_r \times M_k}$. Without loss of generality, it is assumed that there exists spatial correlation at all nodes. According to the well-known Kronecker model [25], we represent the channels and interfering signals as follows (the analysis in our work is not confined to a specific distribution of the channel matrices, but is valid for any distribution, as the proposed method requires only the covariance information of the channel matrices, not their probability distribution):
$$\mathbf{H} = \mathbf{R}^{1/2} \mathbf{H}_w \mathbf{T}^{1/2}, \tag{2}$$
$$\mathbf{H}_k = \mathbf{R}^{1/2} \mathbf{H}_{w,k} \mathbf{T}_{I,k}^{1/2}, \quad k = 1, 2, \ldots, K, \tag{3}$$
$$\mathbf{S}_k = \boldsymbol{\Psi}_{s,k}^{1/2} \mathbf{S}_{w,k} \boldsymbol{\Psi}_{\tau,k}^{1/2}, \quad k = 1, 2, \ldots, K, \tag{4}$$
where the elements of the matrices $\mathbf{H}_w$, $\{\mathbf{H}_{w,k}\}_{k=1}^{K}$, and $\{\mathbf{S}_{w,k}\}_{k=1}^{K}$ are independent and identically distributed (i.i.d.) as $\mathcal{CN}(0, 1)$. $\mathbf{T} \succeq \mathbf{0}$, $\mathbf{R} \succeq \mathbf{0}$, and $\mathbf{T}_{I,k} \succeq \mathbf{0}$ are the spatial correlation matrices at the transmitter, receiver, and $k$th interferer, respectively. $\boldsymbol{\Psi}_{\tau,k} \succeq \mathbf{0}$ and $\boldsymbol{\Psi}_{s,k} \succeq \mathbf{0}$, respectively, represent the temporal and spatial correlations of the interfering signal. Taking the vectorizing operation on both sides of (1), the received signal can be rewritten in vector form as
$$\mathrm{vec}(\mathbf{Y}) = \mathbf{y} = \left(\mathbf{P}^T \otimes \mathbf{I}_{M_r}\right) \mathbf{h} + \mathbf{n}, \tag{5}$$
where
$$\mathbf{h} = \mathrm{vec}(\mathbf{H})$$
and
$$\mathbf{n} = \mathrm{vec}(\mathbf{N}) = \sum_{k=1}^{K} \left(\mathbf{I}_L \otimes \mathbf{H}_k\right) \mathrm{vec}(\mathbf{S}_k).$$
For the design purpose, we assume that the channel and interference covariance matrices are perfectly known at the receiver side (The issue with imperfect covariance information at the receiver side was studied in [20]. In [20] (Remark 2), it was shown that when the covariance information is imperfect at the receiver side, the resulting channel estimation MSE has a similar mathematical expression to (12), with an additional (negligible) loss term. In [20], it was also empirically demonstrated that the imperfect covariance information at the receiver side has a negligible impact on the performance, whereas that at the transmitter side has a much more significant impact).
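We consider the linear minimum MSE (LMMSE) channel estimation from the observation vector $\mathbf{y}$ in (5). In classical estimation theory [26], it is well known that the LMMSE estimate of the channel $\mathbf{h}$ is given by
$$\hat{\mathbf{h}} = \mathbf{C}_h \left(\mathbf{P}^* \otimes \mathbf{I}_{M_r}\right) \left( \mathbf{C}_n + \left(\mathbf{P}^T \otimes \mathbf{I}_{M_r}\right) \mathbf{C}_h \left(\mathbf{P}^* \otimes \mathbf{I}_{M_r}\right) \right)^{-1} \mathbf{y}, \tag{6}$$
where
$$\mathbf{C}_h = E[\mathbf{h}\mathbf{h}^H] = \mathbf{T}^T \otimes \mathbf{R}$$
and
$$\mathbf{C}_n = E[\mathbf{n}\mathbf{n}^H] = \mathbf{Q}^T \otimes \mathbf{R}$$
are the channel and interference covariance matrices, respectively. Also, we have
$$\mathbf{Q} = \sum_{k=1}^{K} \mathrm{Tr}\left(\boldsymbol{\Psi}_{s,k} \mathbf{T}_{I,k}\right) \boldsymbol{\Psi}_{\tau,k}$$
to represent the interference correlation at the transmitter side. In practice, the matrix inversion for the LMMSE channel estimation in (6) can be approximated or replaced by several low-complexity alternatives suggested in [27] based on the matrix polynomial expansion with arbitrary degrees of freedom. This approach significantly reduces the computational burden of the matrix inverse operation from cubic to quadratic complexity. The MSE of the LMMSE channel estimate $\hat{\mathbf{h}}$ given the parameters $\mathbf{P}$, $\mathbf{T}$, $\mathbf{R}$, and $\mathbf{Q}$ can be obtained as [26]
$$f(\mathbf{P}, \mathbf{T}, \mathbf{R}, \mathbf{Q}) = E\left[ \|\mathbf{h} - \hat{\mathbf{h}}\|^2 \,\middle|\, \mathbf{P}, \mathbf{T}, \mathbf{R}, \mathbf{Q} \right] = \mathrm{Tr}\left( \left(\mathbf{T}^{-1} + \mathbf{P}\mathbf{Q}^{-1}\mathbf{P}^H\right)^{-1} \otimes \mathbf{R} \right) = \mathrm{Tr}(\mathbf{R}) \cdot \mathrm{Tr}\left( \left(\mathbf{T}^{-1} + \mathbf{P}\mathbf{Q}^{-1}\mathbf{P}^H\right)^{-1} \right). \tag{7}$$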
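The compact MSE expression in (7) can be checked numerically against the full Kronecker-structured LMMSE error covariance. The following is a minimal sketch, assuming NumPy is available; the matrix sizes, the random seed, and the helper `random_psd` are illustrative choices and not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
Mt, Mr, L = 3, 2, 4  # illustrative sizes (assumption)

def random_psd(n):
    """Hypothetical helper: random Hermitian positive-definite matrix."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return A @ A.conj().T + n * np.eye(n)

T, R, Q = random_psd(Mt), random_psd(Mr), random_psd(L)
P = rng.standard_normal((Mt, L)) + 1j * rng.standard_normal((Mt, L))
H = rng.standard_normal((Mr, Mt)) + 1j * rng.standard_normal((Mr, Mt))

# Vectorized model y = A h + n with A = P^T (x) I_{Mr}; vec() stacks columns.
A = np.kron(P.T, np.eye(Mr))
vec_identity_ok = np.allclose((H @ P).flatten(order="F"), A @ H.flatten(order="F"))

C_h = np.kron(T.T, R)  # channel covariance
C_n = np.kron(Q.T, R)  # interference covariance

# LMMSE error covariance: C_h - C_h A^H (A C_h A^H + C_n)^{-1} A C_h
G = C_h @ A.conj().T @ np.linalg.inv(A @ C_h @ A.conj().T + C_n)
mse_direct = np.trace(C_h - G @ A @ C_h).real

# Compact form: Tr(R) * Tr((T^{-1} + P Q^{-1} P^H)^{-1})
inner = np.linalg.inv(T) + P @ np.linalg.inv(Q) @ P.conj().T
mse_compact = (np.trace(R) * np.trace(np.linalg.inv(inner))).real

print(vec_identity_ok, mse_direct, mse_compact)
```

The two MSE values should agree up to floating-point error, which illustrates why the receiver correlation enters the design only through the scalar factor $\mathrm{Tr}(\mathbf{R})$.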
Remark 1.
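When the additive noise term $\mathbf{W} \in \mathbb{C}^{M_r \times L}$ is considered (i.e., not ignored), the total interference-plus-noise term can be written as
$$\mathbf{N} = \sum_{k=1}^{K} \mathbf{H}_k \mathbf{S}_k + \mathbf{W}.$$
Suppose that $\mathbf{S}_k$, $\forall k$, and $\mathbf{W}$ are independent of each other and that the covariance matrix of $\mathbf{W}$ takes the following form:
$$E\left[\mathrm{vec}(\mathbf{W}) \, \mathrm{vec}^H(\mathbf{W})\right] = \boldsymbol{\Phi}^T \otimes \mathbf{R}.$$
Then, it still follows that $\mathbf{C}_n = E[\mathbf{n}\mathbf{n}^H] = \mathbf{Q}^T \otimes \mathbf{R}$, with $\mathbf{Q}$ given by
$$\mathbf{Q} = \sum_{k=1}^{K} \mathrm{Tr}\left(\boldsymbol{\Psi}_{s,k} \mathbf{T}_{I,k}\right) \boldsymbol{\Psi}_{\tau,k} + \boldsymbol{\Phi}.$$
In a similar fashion, it is also possible to cover a multi-user or multi-cell scenario by treating $\mathbf{W}$ as the intra-cell or inter-cell interference, respectively.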

2.2. Problem Formulation

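In practice, the matrices $\mathbf{T}$, $\mathbf{R}$, and $\mathbf{Q}$ are imperfect at the transmitter side due to feedback-related issues [15,16,17]. Considering such imperfection, we mathematically model the correlation uncertainties as follows:
$$\mathbf{T} = \hat{\mathbf{T}} + \tilde{\mathbf{T}}, \ \tilde{\mathbf{T}} \in \mathcal{T}, \quad \mathbf{R} = \hat{\mathbf{R}} + \tilde{\mathbf{R}}, \ \tilde{\mathbf{R}} \in \mathcal{R}, \quad \text{and} \quad \mathbf{Q} = \hat{\mathbf{Q}} + \tilde{\mathbf{Q}}, \ \tilde{\mathbf{Q}} \in \mathcal{Q}, \tag{8}$$
respectively, where $\hat{\mathbf{T}}$, $\hat{\mathbf{R}}$, and $\hat{\mathbf{Q}}$ are imperfect estimates of $\mathbf{T}$, $\mathbf{R}$, and $\mathbf{Q}$, respectively, and $\tilde{\mathbf{T}}$, $\tilde{\mathbf{R}}$, and $\tilde{\mathbf{Q}}$ represent the corresponding correlation error matrices. The uncertainty sets are denoted by $\mathcal{T}$, $\mathcal{R}$, and $\mathcal{Q}$, and they are, respectively, defined as
$$\mathcal{T} = \left\{ \tilde{\mathbf{T}} : \|\tilde{\mathbf{T}}\|_F^2 \le \epsilon_T, \ \hat{\mathbf{T}} + \tilde{\mathbf{T}} \succeq \mathbf{0} \right\}, \tag{9}$$
$$\mathcal{R} = \left\{ \tilde{\mathbf{R}} : \|\tilde{\mathbf{R}}\|_F^2 \le \epsilon_R, \ \hat{\mathbf{R}} + \tilde{\mathbf{R}} \succeq \mathbf{0} \right\}, \tag{10}$$
and
$$\mathcal{Q} = \left\{ \tilde{\mathbf{Q}} : \|\tilde{\mathbf{Q}}\|_F^2 \le \epsilon_Q, \ \hat{\mathbf{Q}} + \tilde{\mathbf{Q}} \succeq \mathbf{0} \right\}, \tag{11}$$
where the parameters $\epsilon_T$, $\epsilon_R$, and $\epsilon_Q$ denote the spherical radii of $\mathcal{T}$, $\mathcal{R}$, and $\mathcal{Q}$, respectively, and they are related to the quantization step size and equal error contour [15,16,17]. To guarantee a certain channel estimation performance for any possible uncertainty in the covariance information, in this paper, we adopt the spherical uncertainty model for the covariance errors, as this model is the most uncertain among the uncertainty models. Even when other uncertainty models (e.g., bounded spectral norms or element-wise errors) are adopted, results and conclusions similar to those for the spherical uncertainty model in this paper can still be drawn.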
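To make the spherical uncertainty sets in (9)–(11) concrete, the sketch below tests whether a candidate error matrix lies in such a set: its squared Frobenius norm must not exceed the radius, and the perturbed estimate must remain positive semi-definite. The function name `in_uncertainty_set`, the tolerance, and the numerical values are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def in_uncertainty_set(est, err, eps, tol=1e-10):
    """Check ||err||_F^2 <= eps and est + err >= 0 (Hermitian inputs assumed)."""
    within_ball = np.linalg.norm(err, "fro") ** 2 <= eps + tol
    is_psd = np.linalg.eigvalsh(est + err).min() >= -tol
    return bool(within_ball and is_psd)

# Illustrative 2x2 estimate with eigenvalues 1.5 and 0.5.
T_hat = np.array([[1.0, 0.5], [0.5, 1.0]])
eps_T = 0.3 * np.linalg.norm(T_hat, "fro") ** 2  # radius 0.75 (assumed relative level 0.3)

small = 0.1 * np.eye(2)      # inside the ball, keeps T_hat + err PSD
large = 2.0 * np.eye(2)      # violates the norm bound
negative = -0.6 * np.eye(2)  # inside the ball, but makes T_hat + err indefinite

results = [in_uncertainty_set(T_hat, E, eps_T) for E in (small, large, negative)]
print(results)
```

The third case shows that the positive semi-definiteness condition is not implied by the norm bound and has to be checked separately.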
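Using (8), we can rewrite (7) as
$$J(\mathbf{P}, \tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}}) = \mathrm{Tr}\left(\hat{\mathbf{R}} + \tilde{\mathbf{R}}\right) \cdot \mathrm{Tr}\left( \left( (\hat{\mathbf{T}} + \tilde{\mathbf{T}})^{-1} + \mathbf{P} (\hat{\mathbf{Q}} + \tilde{\mathbf{Q}})^{-1} \mathbf{P}^H \right)^{-1} \right). \tag{12}$$
Note from (12) that the estimation MSE is a function of the known parameter $\mathbf{P}$ as well as the unknown parameters $(\tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}})$. To deal with the unknown parameters, we follow the widely used concept of worst-case robustness [15,16,17]. Specifically, we use the worst-case MSE,
$$J(\mathbf{P}) = \max_{(\tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}}) \in \mathcal{T} \times \mathcal{R} \times \mathcal{Q}} J(\mathbf{P}, \tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}}), \tag{13}$$
as a design criterion. Considering the training power constraint, an optimization problem of interest can then be formulated as follows:
$$\min_{\mathbf{P}} \max_{\tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}}} \ J(\mathbf{P}, \tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}}) \tag{14a}$$
$$\text{s.t.} \quad \mathrm{Tr}(\mathbf{P}\mathbf{P}^H) \le P_T, \tag{14b}$$
$$\tilde{\mathbf{T}} \in \mathcal{T}, \ \tilde{\mathbf{R}} \in \mathcal{R}, \ \tilde{\mathbf{Q}} \in \mathcal{Q}, \tag{14c}$$
where $P_T$ denotes the total transmit power of the training signal. It can be shown that problem (14) is not convex–concave due to the non-convexity of the worst-case MSE $J$ in (13) w.r.t. $\mathbf{P}$. (The problem
$$\min_{\mathbf{a} \in \mathcal{A}} \max_{\mathbf{b} \in \mathcal{B}} \varphi(\mathbf{a}, \mathbf{b})$$
is convex–concave if the constraint sets $\mathcal{A}$ and $\mathcal{B}$ are both convex, and the objective function $\varphi$ is a convex function of the minimization variable $\mathbf{a}$ and a concave function of the maximization variable $\mathbf{b}$ [28,29].) For this reason, it is more complicated to tackle problem (14) than the problems in [18,19,20], which are all convex–concave. For example, the conventional iterative scheme in [20] does not work well since it may converge to a locally optimal solution.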

3. Training Signal Optimization

In this section, we optimize the training signal by solving problem (14). Specifically, a closed-form structure of the worst-case MSE minimizing training signal is initially obtained by deriving the structure of the optimal matrix solution for the problem (14). From the structure, the original problem (14) involving the complex-valued matrices becomes a simple power allocation problem involving the real-valued scalar variables only. Thereafter, optimal and suboptimal power allocation schemes are proposed. The optimal power allocation is determined numerically and the suboptimal power allocation is obtained in a closed-form with low complexity.

3.1. Worst-Case MSE Minimizing Training Structure

Let $(\mathbf{P}, \tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}})$ be an optimal solution to the minimax problem (14). Then, the structures of the optimal training signal $\mathbf{P}$ and the worst-case correlation errors $(\tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}})$ are derived in the following theorem.
Theorem 1.
Let $\hat{\mathbf{T}} = \mathbf{U}_T \boldsymbol{\Lambda}_T \mathbf{U}_T^H$ and $\hat{\mathbf{Q}} = \mathbf{U}_Q \boldsymbol{\Lambda}_Q \mathbf{U}_Q^H$ be the eigenvalue decompositions (EVDs) of the matrices $\hat{\mathbf{T}}$ and $\hat{\mathbf{Q}}$, respectively, where
$$\boldsymbol{\Lambda}_T = \mathrm{Diag}(\lambda_{T,1}, \ldots, \lambda_{T,M_t})$$
and
$$\boldsymbol{\Lambda}_Q = \mathrm{Diag}(\lambda_{Q,1}, \ldots, \lambda_{Q,L})$$
are diagonal matrices whose diagonal entries are the eigenvalues of $\hat{\mathbf{T}}$ and $\hat{\mathbf{Q}}$, respectively. The columns of the unitary matrices $\mathbf{U}_T$ and $\mathbf{U}_Q$ consist of the eigenvectors of $\hat{\mathbf{T}}$ and $\hat{\mathbf{Q}}$, respectively. For massive MIMO systems, there are several efficient methods or low-complexity approximations to perform the EVDs of large covariance matrices. These include the power iteration, Lanczos algorithm, Arnoldi iteration, randomized EVD, Jacobi–Davidson, and locally optimal block preconditioned conjugate gradient (LOBPCG) methods. Other matrix-free computation approaches based on matrix–vector multiplications can also be used. The optimal training signal $\mathbf{P}$, minimizing the worst-case MSE, has the following structure (this solution generally requires very low signaling overhead for the feedback transmission; specifically, only $M_t$ real numbers and $M_t$ integers need to be fed back to construct the training signal matrix at the transmitter [10]; accordingly, the limited feedback issue may not be a serious design concern in practice):
$$\mathbf{P} = \mathbf{U}_T \mathbf{D}_P \mathbf{U}_Q^H, \tag{15}$$
and the worst-case correlation error matrices $(\tilde{\mathbf{T}}, \tilde{\mathbf{R}}, \tilde{\mathbf{Q}})$, respectively, have the following structures:
$$\tilde{\mathbf{T}} = \mathbf{U}_T \mathbf{D}_T \mathbf{U}_T^H, \quad \tilde{\mathbf{R}} = \sqrt{\frac{\epsilon_R}{M_r}} \, \mathbf{I}_{M_r}, \quad \tilde{\mathbf{Q}} = \mathbf{U}_Q \mathbf{D}_Q \mathbf{U}_Q^H, \tag{16}$$
where $\mathbf{D}_P \in \mathbb{R}^{M_t \times L}$ is a rectangular diagonal matrix containing non-negative elements on its main diagonal, and $\mathbf{D}_T \in \mathbb{R}^{M_t \times M_t}$ and $\mathbf{D}_Q \in \mathbb{R}^{L \times L}$ are real diagonal matrices. It thus can be inferred that the worst case corresponds to a diagonal channel covariance matrix and the best case corresponds to a channel covariance matrix whose main diagonal values are close to zero and other elements are close to each other.
Proof. 
See Appendix A. □
The optimal training structure (15) means that the transmit directions of the training signal should be matched to the eigenvectors of the estimated correlation matrices at the transmitter side, whereas the training power has to be allocated according to the worst-case eigenvalues of the imperfect correlation matrices. This observation results from the fact that the uncertainty sets in (9)–(11) do not contain any directional information. Next, the worst-case value of the receiver correlation error seen by the transmitter is given by the scaled identity matrix, i.e., equal perturbation. This is due to the Kronecker model, in which the correlation information at the transmitter and receiver sides is separable [25]. Finally, it is important to note that the number of real-valued variables to be optimized is reduced from $2(M_t L + M_t^2 + M_r^2 + L^2)$ to $\min\{M_t, L\} + M_t + L$ with the proposed structures in (15) and (16).
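The effect of the eigen-structure in (15) can be illustrated numerically: aligning the training signal with the eigenvectors of the estimated correlation matrices turns the matrix-valued MSE factor into a sum of scalar terms. The sketch below assumes NumPy, zero correlation errors, and arbitrary illustrative values for the sizes and for the power vector `d_P`; the helper `random_psd` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
Mt, L = 3, 4            # illustrative sizes (assumption)
nu = min(Mt, L)         # maximum rank of the training matrix

def random_psd(n):
    """Hypothetical helper: random Hermitian positive-definite matrix."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return A @ A.conj().T + n * np.eye(n)

T_hat, Q_hat = random_psd(Mt), random_psd(L)
lam_T, U_T = np.linalg.eigh(T_hat)   # EVD of the transmit correlation estimate
lam_Q, U_Q = np.linalg.eigh(Q_hat)   # EVD of the interference correlation estimate

# Rectangular diagonal D_P; d_P holds the diagonal of D_P D_P^T (assumed powers).
d_P = np.array([2.0, 1.0, 0.5])
D_P = np.zeros((Mt, L))
D_P[np.arange(nu), np.arange(nu)] = np.sqrt(d_P)

P = U_T @ D_P @ U_Q.conj().T         # structured training signal
power = np.trace(P @ P.conj().T).real

# Matrix form of the MSE factor versus its decoupled scalar form.
M = np.linalg.inv(T_hat) + P @ np.linalg.inv(Q_hat) @ P.conj().T
trace_matrix = np.trace(np.linalg.inv(M)).real
trace_scalar = np.sum(1.0 / (1.0 / lam_T[:nu] + d_P / lam_Q[:nu])) + np.sum(lam_T[nu:])

print(power, trace_matrix, trace_scalar)
```

Here `power` equals the sum of the entries of `d_P`, and the two trace values coincide, so only the per-mode powers remain to be chosen.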
Remark 2.
When the interference is temporally white and its variance information is perfectly known, the result (15) becomes $\mathbf{P} = \mathbf{U}_T \mathbf{D}_P$, or equivalently, $\mathbf{P}\mathbf{P}^H = \mathbf{U}_T \boldsymbol{\Lambda}_P \mathbf{U}_T^H$, where $\boldsymbol{\Lambda}_P = \mathbf{D}_P \mathbf{D}_P^T$, which is consistent with the result derived in [19].
Remark 3.
When $\mathbf{Q}$ is perfect, but $\mathbf{T}$ is imperfect, which is the case considered in [20], the training structure in (15) becomes $\mathbf{P} = \mathbf{U}_T \mathbf{D}_P \mathbf{V}_Q^H$, where the columns of $\mathbf{V}_Q$ consist of the eigenvectors of $\mathbf{Q}$.

3.2. Optimal Power Allocation

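In this section, we determine the matrices $\mathbf{D}_P$, $\mathbf{D}_T$, and $\mathbf{D}_Q$. Substituting the structures $\mathbf{P} = \mathbf{U}_T \mathbf{D}_P \mathbf{U}_Q^H$, $\tilde{\mathbf{T}} = \mathbf{U}_T \mathbf{D}_T \mathbf{U}_T^H$, and $\tilde{\mathbf{Q}} = \mathbf{U}_Q \mathbf{D}_Q \mathbf{U}_Q^H$ into (14), we can rewrite the estimation MSE $J$ in (12) as
$$\zeta(\mathbf{d}_P, \mathbf{d}_T, \mathbf{d}_Q) = \beta \sum_{i=1}^{\nu} \left( \frac{1}{\lambda_{T,i} + d_{T,i}} + \frac{d_{P,i}}{\lambda_{Q,i} + d_{Q,i}} \right)^{-1} + \beta \sum_{i=\nu+1}^{M_t} (\lambda_{T,i} + d_{T,i}), \tag{17}$$
where $\beta = \mathrm{Tr}(\hat{\mathbf{R}}) + \sqrt{M_r \epsilon_R}$ and $\nu = \min\{M_t, L\}$ denotes the maximum rank of $\mathbf{P}$. Also,
$$\mathbf{d}_P = [d_{P,1}, \ldots, d_{P,\nu}]^T,$$
$$\mathbf{d}_T = [d_{T,1}, \ldots, d_{T,M_t}]^T,$$
and
$$\mathbf{d}_Q = [d_{Q,1}, \ldots, d_{Q,L}]^T$$
are the vectors of the diagonal elements of $\mathbf{D}_P \mathbf{D}_P^T$, $\mathbf{D}_T$, and $\mathbf{D}_Q$, respectively. From (17), the optimal training power allocation $\mathbf{d}_P$ can be obtained by solving the following problem:
$$\min_{\mathbf{d}_P} \max_{\mathbf{d}_T, \mathbf{d}_Q} \ \zeta(\mathbf{d}_P, \mathbf{d}_T, \mathbf{d}_Q) \tag{18a}$$
$$\text{s.t.} \quad \mathbf{d}_P \in \mathcal{D}_P, \ \mathbf{d}_T \in \mathcal{D}_T, \ \mathbf{d}_Q \in \mathcal{D}_Q, \tag{18b}$$
where the constraint sets $\mathcal{D}_P$, $\mathcal{D}_T$, and $\mathcal{D}_Q$ are defined as
$$\mathcal{D}_P = \left\{ \mathbf{d}_P : \sum_{i=1}^{\nu} d_{P,i} \le P_T, \ d_{P,i} \ge 0, \ i = 1, \ldots, \nu \right\},$$
$$\mathcal{D}_T = \left\{ \mathbf{d}_T : \|\mathbf{d}_T\|^2 \le \epsilon_T, \ \lambda_{T,i} + d_{T,i} \ge 0, \ i = 1, \ldots, M_t \right\},$$
and
$$\mathcal{D}_Q = \left\{ \mathbf{d}_Q : \|\mathbf{d}_Q\|^2 \le \epsilon_Q, \ \lambda_{Q,i} + d_{Q,i} \ge 0, \ i = 1, \ldots, L \right\},$$
respectively. In the following lemma, we show that the cost function of problem (18) is convex–concave.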
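Definition 1
(Convex–concave function). [28,29] We say the function $\varphi(\mathbf{a}, \mathbf{b}) : \mathcal{A} \times \mathcal{B} \subseteq \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ is convex–concave if $\varphi(\mathbf{a}, \mathbf{b})$ is a convex function of $\mathbf{a} \in \mathcal{A}$ for fixed $\mathbf{b} \in \mathcal{B}$ and a concave function of $\mathbf{b} \in \mathcal{B}$ for fixed $\mathbf{a} \in \mathcal{A}$.
Lemma 1.
The MSE $\zeta$ in (17) is convex in $\mathbf{d}_P \in \mathcal{D}_P$ for fixed $(\mathbf{d}_T, \mathbf{d}_Q) \in \mathcal{D}_T \times \mathcal{D}_Q$, and concave in $\mathbf{d}_T \in \mathcal{D}_T$ and $\mathbf{d}_Q \in \mathcal{D}_Q$ for fixed $\mathbf{d}_P \in \mathcal{D}_P$.
Proof. 
See Appendix B. □
Lemma 1 implies that the power allocation problem (18) is convex–concave since the constraint sets $\mathcal{D}_P$, $\mathcal{D}_T$, and $\mathcal{D}_Q$ are all convex. Unfortunately, it is not possible to find a closed-form solution in general due to the nonlinearity of the objective function and the norm constraints on $\mathbf{d}_T$ and $\mathbf{d}_Q$. However, the optimal power allocation $\mathbf{d}_P$ can be computed numerically by using well-known methods such as the interior point method or barrier method [28,30].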
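The convex–concave property stated in Lemma 1 can be spot-checked with a midpoint test on the scalar MSE in (17). The sketch below is only an illustration in plain Python: the eigenvalues, the scaling `beta`, and the test points are assumed values, and a midpoint check at a few points is a sanity test rather than a proof.

```python
def zeta(d_P, d_T, d_Q, lam_T, lam_Q, beta):
    """Scalar worst-case MSE expression of (17); nu = len(d_P)."""
    nu = len(d_P)
    total = 0.0
    for i in range(nu):
        t = lam_T[i] + d_T[i]          # perturbed channel eigenvalue
        q = lam_Q[i] + d_Q[i]          # perturbed interference eigenvalue
        total += 1.0 / (1.0 / t + d_P[i] / q)
    total += sum(lam_T[i] + d_T[i] for i in range(nu, len(lam_T)))
    return beta * total

def midpoint(a, b):
    return [(x + y) / 2.0 for x, y in zip(a, b)]

# Illustrative data (assumed): M_t = 3, L = 4, so nu = 3.
lam_T, lam_Q, beta = [2.0, 1.0, 0.5], [0.5, 1.0, 1.5, 2.0], 3.0
dP_a, dP_b = [3.0, 1.0, 0.0], [0.0, 1.0, 3.0]
dT_a, dT_b = [0.3, -0.2, 0.1], [-0.3, 0.2, -0.1]
dQ_a, dQ_b = [0.2, -0.3, 0.1, 0.0], [-0.2, 0.3, -0.1, 0.0]

# Convexity in d_P: value at the midpoint is at most the average of the endpoints.
convex_gap = 0.5 * (zeta(dP_a, dT_a, dQ_a, lam_T, lam_Q, beta)
                    + zeta(dP_b, dT_a, dQ_a, lam_T, lam_Q, beta)) \
    - zeta(midpoint(dP_a, dP_b), dT_a, dQ_a, lam_T, lam_Q, beta)

# Concavity in d_T and in d_Q: value at the midpoint is at least the average.
concave_gap_T = zeta(dP_a, midpoint(dT_a, dT_b), dQ_a, lam_T, lam_Q, beta) \
    - 0.5 * (zeta(dP_a, dT_a, dQ_a, lam_T, lam_Q, beta)
             + zeta(dP_a, dT_b, dQ_a, lam_T, lam_Q, beta))
concave_gap_Q = zeta(dP_a, dT_a, midpoint(dQ_a, dQ_b), lam_T, lam_Q, beta) \
    - 0.5 * (zeta(dP_a, dT_a, dQ_a, lam_T, lam_Q, beta)
             + zeta(dP_a, dT_a, dQ_b, lam_T, lam_Q, beta))

print(convex_gap, concave_gap_T, concave_gap_Q)
```

All three gaps should be non-negative for the chosen points, in line with the lemma.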

3.3. Suboptimal Power Allocation in Closed-Form

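The numerical approach for optimal power allocation may be undesirable in real applications due to its non-negligible complexity during the channel training phase. Considering this problem, we propose a suboptimal power allocation in closed form. Since the power allocation problem (18) is convex–concave, we can interchange the minimum and maximum operators of problem (18) according to [31] (Lemma 36.2) as follows:
$$\max_{\mathbf{d}_T, \mathbf{d}_Q} \min_{\mathbf{d}_P} \ \zeta(\mathbf{d}_P, \mathbf{d}_T, \mathbf{d}_Q) \tag{19a}$$
$$\text{s.t.} \quad \mathbf{d}_P \in \mathcal{D}_P, \ \mathbf{d}_T \in \mathcal{D}_T, \ \mathbf{d}_Q \in \mathcal{D}_Q. \tag{19b}$$
Defining
$$\mathbf{x}_T = [\lambda_{T,1} + d_{T,1}, \ldots, \lambda_{T,M_t} + d_{T,M_t}]^T$$
and
$$\mathbf{x}_Q = [\lambda_{Q,1} + d_{Q,1}, \ldots, \lambda_{Q,L} + d_{Q,L}]^T,$$
problem (19) can be equivalently reformulated as
$$\max_{\mathbf{x}_T, \mathbf{x}_Q} \min_{\mathbf{d}_P} \ g(\mathbf{d}_P, \mathbf{x}_T, \mathbf{x}_Q) \tag{20a}$$
$$\text{s.t.} \quad \mathbf{d}_P \in \mathcal{D}_P, \ \mathbf{x}_T \in \mathcal{X}_T, \ \mathbf{x}_Q \in \mathcal{X}_Q, \tag{20b}$$
where the cost function $g$ in (23a) is
$$g(\mathbf{d}_P, \mathbf{x}_T, \mathbf{x}_Q) = \beta \sum_{i=1}^{\nu} \left( \frac{1}{x_{T,i}} + \frac{d_{P,i}}{x_{Q,i}} \right)^{-1} + \beta \sum_{i=\nu+1}^{M_t} x_{T,i},$$
and the constraint sets $\mathcal{X}_T$ and $\mathcal{X}_Q$ are defined as
$$\mathcal{X}_T = \left\{ \mathbf{x}_T : \|\mathbf{x}_T - \boldsymbol{\lambda}_T\|^2 \le \epsilon_T, \ x_{T,i} \ge 0, \ i = 1, \ldots, M_t \right\}$$
and
$$\mathcal{X}_Q = \left\{ \mathbf{x}_Q : \|\mathbf{x}_Q - \boldsymbol{\lambda}_Q\|^2 \le \epsilon_Q, \ x_{Q,i} \ge 0, \ i = 1, \ldots, L \right\},$$
respectively. It is still difficult to obtain the optimal solution to problem (20) in closed form. To overcome this difficulty, in the following, we instead find a suboptimal solution.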
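Definition 2.
[32] For any $\mathbf{a} \in \mathbb{R}^n$, let $a_{[1]} \ge \cdots \ge a_{[n]}$ denote the components of $\mathbf{a}$ in decreasing order.
Definition 3.
[32] (Ch.1.A.1) The vector $\mathbf{a} \in \mathbb{R}^n$ is majorized by $\mathbf{b} \in \mathbb{R}^n$, denoted by $\mathbf{a} \prec \mathbf{b}$, if $\sum_{m=1}^{l} a_{[m]} \le \sum_{m=1}^{l} b_{[m]}$, $1 \le l \le n - 1$, and $\sum_{m=1}^{n} a_{[m]} = \sum_{m=1}^{n} b_{[m]}$.
Definition 4
(Schur-concave function). [32] (Ch.3.A.1) A real-valued function $\varphi : \mathcal{A} \subseteq \mathbb{R}^n \to \mathbb{R}$ is said to be Schur-concave on $\mathcal{A}$ if $\mathbf{a} \prec \mathbf{b} \Rightarrow \varphi(\mathbf{a}) \ge \varphi(\mathbf{b})$.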
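The majorization relation and the Schur-concavity property in these definitions can be made concrete with a small check. The sketch below is plain Python; the function name `is_majorized_by`, the tolerance, and the example vectors are illustrative assumptions. The sum of square roots is used as a standard example of a Schur-concave function.

```python
import math

def is_majorized_by(a, b, tol=1e-12):
    """Return True if a is majorized by b (a ≺ b) in the sense of Definition 3."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    sa, sb = sorted(a, reverse=True), sorted(b, reverse=True)
    partial_a = partial_b = 0.0
    for l in range(len(a) - 1):
        partial_a += sa[l]
        partial_b += sb[l]
        if partial_a > partial_b + tol:   # a partial sum of a exceeds that of b
            return False
    return abs(sum(a) - sum(b)) <= tol    # total sums must coincide

uniform = [2.0, 2.0, 2.0]   # evenly spread vector
spread = [3.0, 2.0, 1.0]    # same total, more spread out

phi = lambda v: sum(math.sqrt(x) for x in v)  # a Schur-concave example function

print(is_majorized_by(uniform, spread), is_majorized_by(spread, uniform))
print(phi(uniform) >= phi(spread))
```

The evenly spread vector is majorized by the more dispersed one, and the Schur-concave function takes its larger value on the evenly spread vector.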
Lemma 2.
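Let $g(\mathbf{x}_T, \mathbf{x}_Q) = \min_{\mathbf{d}_P \in \mathcal{D}_P} g(\mathbf{d}_P, \mathbf{x}_T, \mathbf{x}_Q)$ be the minimum MSE. Then, $g(\mathbf{x}_T, \mathbf{x}_Q)$ is Schur-concave in $\mathbf{x}_T$ and $\mathbf{x}_Q$. In other words, if there exist vectors $\tilde{\mathbf{x}}_T$ and $\tilde{\mathbf{x}}_Q$ such that $\tilde{\mathbf{x}}_T \prec \mathbf{x}_T$ and $\tilde{\mathbf{x}}_Q \prec \mathbf{x}_Q$, then $g(\tilde{\mathbf{x}}_T, \tilde{\mathbf{x}}_Q) \ge g(\mathbf{x}_T, \mathbf{x}_Q)$.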
Proof. 
See Appendix C. □
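From the above lemma, we can find a suboptimal solution to problem (20), with which the value of $g$ increases, but is not maximized, as follows:
$$\tilde{\mathbf{x}}_T = \boldsymbol{\lambda}_T + \delta_T \mathbf{1}_{M_t} \quad \text{and} \quad \tilde{\mathbf{x}}_Q = \boldsymbol{\lambda}_Q + \delta_Q \left[ \mathbf{1}_{\nu}^T, \ \mathbf{0}_{(L-\nu) \times 1}^T \right]^T, \tag{22}$$
where
$$\boldsymbol{\lambda}_T = [\lambda_{T,1}, \ldots, \lambda_{T,M_t}]^T$$
and
$$\boldsymbol{\lambda}_Q = [\lambda_{Q,1}, \ldots, \lambda_{Q,L}]^T.$$
Also, $\mathbf{1}_n$ denotes the $n \times 1$ vector whose elements are all 1, $\delta_T = \frac{1}{M_t} \sum_{j=1}^{M_t} d_{T,j}$, and $\delta_Q = \frac{1}{\nu} \sum_{j=1}^{\nu} d_{Q,j}$. To satisfy the constraints $\tilde{\mathbf{x}}_T \in \mathcal{X}_T$ and $\tilde{\mathbf{x}}_Q \in \mathcal{X}_Q$, the values of $\delta_T$ and $\delta_Q$ can be chosen as $\delta_T = \sqrt{\epsilon_T / M_t}$ and $\delta_Q = \sqrt{\epsilon_Q / \nu}$, respectively. By substituting (22) into (20), an optimization problem for suboptimal training power allocation can be formulated as
$$\min_{\mathbf{d}_P} \ g(\mathbf{d}_P, \tilde{\mathbf{x}}_T, \tilde{\mathbf{x}}_Q) \tag{23a}$$
$$\text{s.t.} \quad \mathbf{d}_P \in \mathcal{D}_P. \tag{23b}$$
It is assumed that the elements of $\boldsymbol{\lambda}_T$ and $\boldsymbol{\lambda}_Q$ are arranged in descending and ascending orders, respectively. Then, the solution to problem (23) can be obtained in closed form by the Lagrange multiplier method [28], as
$$d_{P,i}^{(so)} = \begin{cases} \dfrac{P_T + \sum_{j=1}^{r} \frac{\lambda_{Q,j} + \delta_Q}{\lambda_{T,j} + \delta_T}}{\sum_{j=1}^{r} \sqrt{\lambda_{Q,j} + \delta_Q}} \sqrt{\lambda_{Q,i} + \delta_Q} - \dfrac{\lambda_{Q,i} + \delta_Q}{\lambda_{T,i} + \delta_T}, & i = 1, \ldots, r, \\ 0, & i = r + 1, \ldots, \nu, \end{cases} \tag{24}$$
where $r = \max\{ i \in \{1, \ldots, \nu\} : d_{P,i}^{(so)} > 0 \}$ is the largest $i$ such that $d_{P,i}^{(so)} > 0$. The suboptimal solution in (24) follows the conventional water-filling strategy, i.e., it assigns more training power to the larger eigenvalues of $\hat{\mathbf{T}}$ and smaller eigenvalues of $\hat{\mathbf{Q}}$. One difference is that there exist the equal errors $\delta_T$ and $\delta_Q$ in the eigenvalues of $\hat{\mathbf{T}}$ and $\hat{\mathbf{Q}}$, respectively.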
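The closed-form allocation in (24) can be sketched as a short routine that searches for the number of active modes $r$ and then evaluates the water-filling-type expression. This is a minimal plain-Python illustration under the ordering assumption stated in the text; the function name `suboptimal_power_allocation` and all numerical inputs are assumptions chosen for the example.

```python
import math

def suboptimal_power_allocation(lam_T, lam_Q, delta_T, delta_Q, P_T):
    """Water-filling-type allocation of (24) over nu = min(M_t, L) modes.

    Assumes lam_T is sorted in descending order and lam_Q in ascending order.
    """
    nu = min(len(lam_T), len(lam_Q))
    t = [lam_T[i] + delta_T for i in range(nu)]   # worst-case channel eigenvalues
    q = [lam_Q[i] + delta_Q for i in range(nu)]   # worst-case interference eigenvalues
    for r in range(nu, 0, -1):                    # try the largest active set first
        level = (P_T + sum(q[j] / t[j] for j in range(r))) \
            / sum(math.sqrt(q[j]) for j in range(r))
        d = [level * math.sqrt(q[i]) - q[i] / t[i] for i in range(r)]
        if all(x > 0 for x in d):
            return d + [0.0] * (nu - r)
    return [0.0] * nu                             # no power available

lam_T = [2.0, 1.0, 0.5]      # descending (illustrative)
lam_Q = [0.5, 1.0, 1.5]      # ascending (illustrative)
high = suboptimal_power_allocation(lam_T, lam_Q, 0.1, 0.1, P_T=4.0)
low = suboptimal_power_allocation(lam_T, lam_Q, 0.1, 0.1, P_T=0.1)
print(high, low)
```

With ample power all three modes are active and the allocated powers sum to the budget, whereas with a small budget only the first mode (largest channel eigenvalue, smallest interference eigenvalue) receives power.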
Remark 4.
For the case considered in Remark 2, we obtain the suboptimal power allocation from (24) by setting $\delta_Q = 0$ and $\lambda_{Q,j} = \sigma_Q$, $j = 1, \ldots, L$, where $\sigma_Q$ denotes the interference variance.
Remark 5.
In the case considered in Remark 3, the suboptimal power allocation can be obtained from (24) by setting $\delta_Q = 0$ and $\lambda_{Q,j} = \sigma_{Q,j}$, $j = 1, \ldots, L$, where $\{\sigma_{Q,j}\}_{j=1}^{L}$ denote the eigenvalues of $\mathbf{Q}$.
The suboptimal solution $\mathbf{d}_P^{(so)}$ in (24) becomes optimal if the minimum MSE $g$ is an increasing function of both $\delta_T = \frac{1}{M_t} \sum_{j=1}^{M_t} d_{T,j}$ and $\delta_Q = \frac{1}{\nu} \sum_{j=1}^{\nu} d_{Q,j}$, which, however, does not hold, as can be seen in (A16) in Appendix C. Hence, the suboptimal solution does not guarantee optimal performance in general. However, $g$ is tightly upper bounded by an increasing function of $\delta_T$ and $\delta_Q$. To show this, we consider the simple case of $M_t = L$. Then, we have
$$g(\mathbf{x}_T, \mathbf{x}_Q) = \min_{\mathbf{d}_P \in \mathcal{D}_P} g(\mathbf{d}_P, \mathbf{x}_T, \mathbf{x}_Q) \tag{25a}$$
$$\le \beta \sum_{i=1}^{M_t} \left( \frac{1}{\lambda_{T,i} + d_{T,i}} + \frac{P_T / M_t}{\lambda_{Q,i} + d_{Q,i}} \right)^{-1} \tag{25b}$$
$$\le \beta M_t \left( \frac{1}{\bar{\lambda}_T + \delta_T} + \frac{P_T / M_t}{\bar{\lambda}_Q + \delta_Q} \right)^{-1}, \tag{25c}$$
where $\bar{\lambda}_T = \frac{1}{M_t} \sum_{i=1}^{M_t} \lambda_{T,i}$ and $\bar{\lambda}_Q = \frac{1}{M_t} \sum_{i=1}^{M_t} \lambda_{Q,i}$. The inequality (25b) follows from the fact that $\mathbf{d}_P = (P_T / M_t) \mathbf{1}_{M_t}$ is a feasible solution to the problem in (25a). The inequality (25c) follows from the concavity of each summand in (25b) w.r.t. the pair $(\lambda_{T,i} + d_{T,i}, \lambda_{Q,i} + d_{Q,i})$, i.e., from Jensen's inequality. The upper bound (25) becomes tight for the following cases: (i) $M_t = 1$, (ii) $P_T = 0$, and (iii) $P_T \to \infty$. Therefore, we deduce that the suboptimal scheme may provide near-optimal performance at low and high training powers when the number of transmit antennas is not large. Otherwise, the performance of the suboptimal scheme may degrade because the bound (25) becomes loose. To clarify our discussion, in Figure 2, we plot the value of $g$ and that of its upper bound versus the total training power $P_T$ for various numbers of transmit antennas. In the figure, we set $[\hat{\mathbf{T}}]_{m,n} = [\hat{\mathbf{Q}}]_{m,n} = \rho^{|m-n|}$, $1 \le m, n \le M_t$, $\rho = 0.5$, $\beta = M_t$, $\epsilon_T = 0.3 \|\hat{\mathbf{T}}\|_F^2$, and $\epsilon_Q = 0.3 \|\hat{\mathbf{Q}}\|_F^2$. The solid curve and dash-dot curve indicate the value of $g$ and that of its upper bound, respectively. From the figure, we can observe that the upper bound is tight to the actual value when $P_T$ is sufficiently low or high. Also, for the range of $P_T = 5$ dB to $P_T = 15$ dB, the gap between $g$ and its upper bound becomes larger as the number of transmit antennas increases. In the following section, the performance of the suboptimal training scheme is concretely demonstrated by numerical simulations.
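The chain of bounds in (25) can be illustrated numerically by comparing the equal-power MSE of (25b) with the averaged bound of (25c). The sketch below is plain Python; the function names and the perturbed eigenvalue lists `x_T` and `x_Q` (standing for $\lambda_{T,i} + d_{T,i}$ and $\lambda_{Q,i} + d_{Q,i}$) are illustrative assumptions for the case $M_t = L$.

```python
def equal_power_mse(x_T, x_Q, P_T, beta):
    """Right-hand side of (25b): MSE under equal power P_T / M_t per mode."""
    M = len(x_T)
    p = P_T / M
    return beta * sum(1.0 / (1.0 / t + p / q) for t, q in zip(x_T, x_Q))

def averaged_bound(x_T, x_Q, P_T, beta):
    """Right-hand side of (25c): bound in terms of the average eigenvalues."""
    M = len(x_T)
    p = P_T / M
    t_bar = sum(x_T) / M     # equals mean(lambda_T) + delta_T
    q_bar = sum(x_Q) / M     # equals mean(lambda_Q) + delta_Q
    return beta * M / (1.0 / t_bar + p / q_bar)

x_T = [2.1, 1.1, 0.6, 0.3]   # illustrative perturbed channel eigenvalues
x_Q = [0.6, 1.1, 1.6, 2.1]   # illustrative perturbed interference eigenvalues
beta = 4.0

for P_T in (0.0, 1.0, 10.0, 100.0):
    lhs = equal_power_mse(x_T, x_Q, P_T, beta)
    rhs = averaged_bound(x_T, x_Q, P_T, beta)
    print(P_T, lhs, rhs, lhs <= rhs + 1e-12)
```

At zero training power the two expressions coincide, and for a single antenna they are identical by construction, which matches the tightness cases (i) and (ii) listed above.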

3.4. Complexity Comparison

To validate the efficiency of the proposed designs, in this section, we compare the computational complexity of the proposed schemes with that of the iterative algorithm in [20]. In each iteration of the conventional algorithm, an arithmetic complexity of $O((M_t^{6.5} + L^{6.5}) \log(1/\psi))$ is required for computing $\tilde{\mathbf{T}}$ and $\tilde{\mathbf{Q}}$ for a fixed $\mathbf{P}$ [33], where $\psi$ is the solution accuracy for the interior point method, and an arithmetic complexity of $O(M_t^3 + L^3)$ is needed to compute $\mathbf{P}$ for fixed $\tilde{\mathbf{T}}$ and $\tilde{\mathbf{Q}}$ [34]. Thus, the conventional algorithm in [20] requires the complexity $O(M_t^3 + L^3 + (M_t^{6.5} + L^{6.5}) \log(1/\psi))$ per iteration. Let $N_{\mathrm{iter}}$ denote the total number of iterations required for the algorithm in [20]. Then, it requires the total computational complexity of $O(M_t^3 + L^3 + (M_t^{6.5} + L^{6.5}) N_{\mathrm{iter}} \log(1/\psi))$. Next, we consider the complexity of the proposed training schemes. To implement the training structure in (15), $O(M_t^3 + L^3)$ arithmetic operations are required for computing the EVDs of $\hat{\mathbf{T}}$ and $\hat{\mathbf{Q}}$ [34]. Also, we require the computational complexity of $O((M_t^{3.5} + L^{3.5}) \log(1/\psi))$ to compute the optimal power allocation numerically [33]. Therefore, the overall complexity for the optimal training scheme is $O(M_t^3 + L^3 + (M_t^{3.5} + L^{3.5}) \log(1/\psi))$. On the other hand, since the complexity of computing the suboptimal power allocation in (24) is insignificant compared with the complexity $O(M_t^3 + L^3)$ [34], the overall complexity for the suboptimal training scheme is $O(M_t^3 + L^3)$. In Table 1, we summarize the analytical complexity results and the processing time measured on a 12th Gen Intel(R) Core(TM) i9-12900K CPU when $M_t = M_r = L = 4$.

4. Simulation Results

In this section, the performance of the proposed schemes is illustrated and compared with that of existing schemes by computer simulations.

4.1. Simulation Setup

In the simulations, we generate the matrices $\hat{\mathbf{T}}$ and $\hat{\mathbf{R}}$ according to the inverse Wishart distribution (some discussions on the use of the inverse Wishart distribution can be found in [35,36]; the inverse Wishart distribution is the conjugate prior for the actual correlation matrices $\mathbf{T}$ and $\mathbf{R}$ when $\mathbf{h}$ is Gaussian-distributed; nevertheless, our proposed scheme is applicable to any distributions or types of covariance matrices) as
$$\hat{\mathbf{T}} \sim \mathcal{W}^{-1}\left( (\kappa_T - M_t - 1) \mathbf{C}_T, \kappa_T \right)$$
and
$$\hat{\mathbf{R}} \sim \mathcal{W}^{-1}\left( (\kappa_R - M_r - 1) \mathbf{C}_R, \kappa_R \right),$$
respectively, where $\kappa_T$ and $\kappa_R$ denote the degrees-of-freedom parameters of the inverse Wishart distribution. The values of $\kappa_T$ and $\kappa_R$ are set to $\kappa_T = M_t + 2$ and $\kappa_R = M_r + 2$, respectively. The elements of $\mathbf{C}_T$ and $\mathbf{C}_R$ are generated by the one-ring model [37]:
$$[\mathbf{C}_T]_{m,n} = J_0\left( 2\pi \Theta_T |m - n| s_T / \lambda \right), \quad 1 \le m, n \le M_t$$
and
$$[\mathbf{C}_R]_{m,n} = J_0\left( 2\pi |m - n| s_R / \lambda \right), \quad 1 \le m, n \le M_r,$$
respectively, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind, $\lambda$ the carrier wavelength, $\Theta_T$ the angular spread, and $s_T$ and $s_R$ the transmit and receive antenna spacings, respectively. The values of $s_T / \lambda$ and $s_R / \lambda$ are set to $s_T / \lambda = 0.5$ and $s_R / \lambda = 0.25$, respectively. We set $\hat{\mathbf{Q}} = \sum_{k=1}^{K} \mathrm{Tr}(\boldsymbol{\Psi}_{s,k} \mathbf{T}_{I,k}) \boldsymbol{\Psi}_{\tau,k}$, where the matrices $\mathbf{T}_{I,k}$, $\boldsymbol{\Psi}_{\tau,k}$, and $\boldsymbol{\Psi}_{s,k}$ are, respectively, generated by
$$\mathbf{T}_{I,k} \sim \mathcal{W}^{-1}\left( (\kappa_{I,k} - M_k - 1) \mathbf{C}_{I,k}, \kappa_{I,k} \right),$$
$$\boldsymbol{\Psi}_{\tau,k} \sim \mathcal{W}^{-1}\left( (\kappa_{\tau,k} - L - 1) \mathbf{C}_{\tau,k}, \kappa_{\tau,k} \right),$$
and
$$\boldsymbol{\Psi}_{s,k} \sim \mathcal{W}^{-1}\left( (\kappa_R - M_r - 1) \mathbf{C}_{s,k}, \kappa_R \right)$$
for $k = 1, \ldots, K$. The parameters $\kappa_{I,k}$ and $\kappa_{\tau,k}$ are, respectively, set to $\kappa_{I,k} = M_k + 2$ and $\kappa_{\tau,k} = L + 2$, $k = 1, \ldots, K$. The elements of $\mathbf{C}_{I,k}$ are generated by
$$[\mathbf{C}_{I,k}]_{m,n} = J_0\left( 2\pi \Theta_{I,k} |m - n| s_{I,k} / \lambda \right), \quad 1 \le m, n \le M_k,$$
with the choice of $s_{I,k} / \lambda = 0.5$, $k = 1, \ldots, K$, where $\{\Theta_{I,k}\}_{k=1}^{K}$ denotes the angular spreads. The elements of $\mathbf{C}_{\tau,k}$ and $\mathbf{C}_{s,k}$ are generated by the exponential model [38]:
$$[\mathbf{C}_{\tau,k}]_{m,n} = \rho_{\tau,k}^{|m-n|}, \quad 1 \le m, n \le L$$
and
$$[\mathbf{C}_{s,k}]_{m,n} = \rho_{s,k}^{|m-n|}, \quad 1 \le m, n \le M_r,$$
respectively, for $k = 1, \ldots, K$, where $\rho_{\tau,k}$ is the correlation coefficient of the elements of $\mathbf{C}_{\tau,k}$ and $\rho_{s,k}$ the correlation coefficient of the elements of $\mathbf{C}_{s,k}$. The number of interfering users $K$ is set to $K = 3$ and that of the antennas at the $k$th interferer $M_k$ is set to $M_k = M_t$, $k = 1, \ldots, K$. The relative uncertainty parameters $\alpha_T$, $\alpha_R$, and $\alpha_Q$ are defined such that
$$(\epsilon_T, \epsilon_R, \epsilon_Q) = \left( \alpha_T \|\hat{\mathbf{T}}\|_F^2, \ \alpha_R \|\hat{\mathbf{R}}\|_F^2, \ \alpha_Q \|\hat{\mathbf{Q}}\|_F^2 \right).$$
The system parameters used in the simulations are summarized in Table 2.
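The generation of a nominal covariance matrix and its uncertainty radius can be sketched as follows. This is a minimal real-valued illustration using NumPy only: the helper names are hypothetical, the sampler relies on the standard fact that the inverse of a Wishart matrix with scale $\Psi^{-1}$ is inverse Wishart with scale $\Psi$ (valid here because the degrees of freedom are integers no smaller than the dimension), and only the exponential model is built. A one-ring matrix could be formed in the same way with a Bessel routine such as `scipy.special.j0`, which is left out to keep the sketch dependency-light.

```python
import numpy as np


def exponential_corr(n, rho):
    """Exponential correlation model: entry (m, n) equals rho ** |m - n|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def sample_inverse_wishart(scale, dof, rng):
    """Draw X ~ W^{-1}(scale, dof) for an integer dof >= dimension (real-valued)."""
    p = scale.shape[0]
    # If W ~ Wishart(scale^{-1}, dof), then W^{-1} ~ inverse Wishart(scale, dof).
    chol = np.linalg.cholesky(np.linalg.inv(scale))
    g = rng.standard_normal((dof, p)) @ chol.T  # rows have covariance scale^{-1}
    x = np.linalg.inv(g.T @ g)
    return 0.5 * (x + x.T)  # remove round-off asymmetry


def nominal_covariance(c, rng):
    """Nominal matrix with kappa = dim + 2 and scale (kappa - dim - 1) * C."""
    p = c.shape[0]
    kappa = p + 2
    return sample_inverse_wishart((kappa - p - 1) * c, kappa, rng)


def uncertainty_radius(alpha, nominal):
    """epsilon = alpha * ||nominal||_F^2."""
    return alpha * np.linalg.norm(nominal, "fro") ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # the seed is an arbitrary choice
    psi_tau = nominal_covariance(exponential_corr(4, 0.9), rng)
    print(psi_tau, uncertainty_radius(0.3, psi_tau))
```

With $\kappa = M + 2$ the scale matrix reduces to $C$ itself, so the draw has mean $C$ but a heavy-tailed spread, which is one way to emulate a poorly known covariance.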
We compare the performance of the proposed schemes with that of two different existing schemes. One is the nonrobust scheme considered in the traditional work [9], where the values of ( T ^ , R ^ , Q ^ ) are used as perfect values. The other is the semi-robust scheme considered in the work [20], where only the value of Q ^ is used as the actual value. In the figures, “Proposed optimal”, “Proposed suboptimal”, “Nonrobust”, “Semi-robust”, and “Semi-robust (suboptimal)” indicate the proposed scheme with the optimal power allocation, that with the suboptimal power allocation, the nonrobust scheme, the semi-robust scheme, and the semi-robust scheme with the proposed suboptimal power allocation, respectively.

4.2. Performance Comparison

In Figure 3 and Figure 4, we illustrate the worst-case MSE performance as a function of the effective training signal-to-interference ratio (SIR) for the systems with $M_t = M_r = L = 4$ and $M_t = M_r = L = 8$. The effective training SIR is defined as $\mathrm{SIR} = P_T\,\mathbb{E}[\mathrm{Tr}(\hat{T})] / \mathbb{E}[\mathrm{Tr}(\hat{Q})] = P_T\,\mathrm{Tr}(C_T) / \sum_{k=1}^{3} \mathrm{Tr}(C_{s,k} C_{I,k})\,\mathrm{Tr}(C_{\tau,k})$. In Figure 3, we set $\Theta_T = 10$, $\Theta_{I,k} = 10$, and $\rho_{\tau,k} = \rho_{s,k} = 0.9$, $k = 1, 2, 3$, which represents a strongly correlated environment. On the other hand, in Figure 4, we set $\Theta_T = 30$, $\Theta_{I,k} = 30$, and $\rho_{\tau,k} = \rho_{s,k} = 0.3$, $k = 1, 2, 3$, which represents a weakly correlated environment. The values of $\alpha_T$, $\alpha_R$, and $\alpha_Q$ are set to $\alpha_T = \alpha_R = \alpha_Q = 0.3$. In the figures, the results are averaged over 500 realizations.
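The SIR definition can be inverted to find the training power $P_T$ that corresponds to a target SIR on the horizontal axis. The sketch below is a hypothetical helper, not the authors' code. It reads the denominator as the scalar $\sum_k \mathrm{Tr}(C_{s,k} C_{I,k})\,\mathrm{Tr}(C_{\tau,k})$, which is what $\mathbb{E}[\mathrm{Tr}(\hat{Q})]$ evaluates to if the inverse Wishart draws are independent with means $C_{s,k}$, $C_{I,k}$, and $C_{\tau,k}$; that reading is an assumption.

```python
import numpy as np


def training_power_for_sir(sir_db, c_t, c_s, c_i, c_tau):
    """Return P_T such that P_T * Tr(C_T) / sum_k Tr(C_s,k C_I,k) Tr(C_tau,k) equals sir_db."""
    interference = sum(
        np.trace(cs @ ci) * np.trace(ct) for cs, ci, ct in zip(c_s, c_i, c_tau)
    )
    return 10.0 ** (sir_db / 10.0) * interference / np.trace(c_t)


if __name__ == "__main__":
    eye = np.eye(4)  # uncorrelated placeholders, used only to exercise the formula
    print(training_power_for_sir(10.0, eye, [eye] * 3, [eye] * 3, [eye] * 3))
```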
It can be observed from Figure 3 and Figure 4 that the proposed schemes outperform the other schemes. For example, in Figure 3, the proposed schemes achieve an improvement of roughly 5 dB in terms of the effective training SIR for the system with $M_t = M_r = L = 8$ when the effective training SIR is higher than 10 dB (similar trends can still be observed for other performance metrics, e.g., the ratio of the standard deviation to the mean value).
Even though the semi-robust scheme outperforms the nonrobust one, its performance degrades due to imperfect knowledge of the interference covariance matrix when the effective training SIR increases or the correlation becomes stronger. The nonrobust scheme provides the worst performance due to imperfect knowledge of both the channel and interference covariance matrices. In the proposed schemes, the suboptimal power allocation provides comparable performance to that of the optimal power allocation. The semi-robust scheme with the proposed suboptimal power allocation works well for the weakly correlated case, but it shows noticeable performance loss for the strongly correlated case.
The worst-case MSE performance is compared in Figure 5 and Figure 6 for various values of $\alpha = \alpha_T = \alpha_R = \alpha_Q$ when $\Theta_T = 10$, $\Theta_{I,k} = 10$, and $\rho_{\tau,k} = \rho_{s,k} = 0.9$, $k = 1, 2, 3$, where the parameter $\alpha$ is introduced to set $(\epsilon_T, \epsilon_R, \epsilon_Q) = (\alpha \|\hat{T}\|_F^2, \alpha \|\hat{R}\|_F^2, \alpha \|\hat{Q}\|_F^2)$. Figure 5 and Figure 6 show the performance comparison for the systems with $M_t = M_r = L = 4$ and $M_t = M_r = L = 6$, respectively. When the value of $\alpha$ decreases from 0.6 to 0.1, the performance of all the schemes improves, and the gap between the proposed and semi-robust schemes increases, but the gap between the semi-robust and nonrobust schemes decreases. This means that the performance is dominated by the uncertainty in the interference covariance matrix when $\alpha = 0.1$, i.e., when the error is small, but by the uncertainty in the channel covariance matrix when $\alpha = 0.6$, i.e., when the error is large. The suboptimal power allocation shows almost the same performance as the optimal one in the proposed scheme, but shows a notable performance loss in the semi-robust scheme when $\alpha = 0.6$.
To inspect the effect of the proposed designs on the quality of the communication systems, the bit error rate (BER) performance is compared in Figure 7 and Figure 8 for $3 \times 3$ and $4 \times 4$ MIMO systems, respectively (by convention, we refer to a MIMO system with $M_t$ transmit antennas and $M_r$ receive antennas as an $M_t \times M_r$ MIMO system). The training SIR is set to 15 dB and the uncertainty parameters are chosen as $\alpha_T = \alpha_R = \alpha_Q = 0.3$. The orthogonal space–time block code in [39] is used to encode the QPSK-modulated symbols, and the well-known minimum MSE (MMSE) receiver in [40] is employed to recover the transmitted symbols before detection. For the proposed, nonrobust, and semi-robust training schemes, the imperfect CSI estimated from the training signal is used to implement the MMSE receiver. Additionally, we present the BER performance of the MMSE receiver with perfect CSI as a benchmark, denoted by “Perfect CSI” in the figures. The results are averaged over 500 channel realizations, where $4 \times 10^{6}$ symbols are transmitted for each channel realization. From the figures, it can be observed that the proposed schemes provide better BER performance than the semi-robust and nonrobust schemes. As the symbol SIR increases, the BER performance of the nonrobust scheme saturates due to the uncertainties in both the channel and interference covariance matrices, and the gap between the proposed and semi-robust schemes increases due to the uncertainty in the interference covariance matrix. In both figures, the proposed suboptimal training scheme provides almost the same BER performance as the optimal one. Therefore, the proposed suboptimal training scheme is more practically useful than the optimal training scheme.

5. Conclusions

We designed an optimal training signal for MIMO systems in the presence of colored interference, taking into account the imperfection of both the channel and interference covariance matrices. The derived structure shows that the optimal training strategy is to allocate the training power according to the worst-case eigenvalues of the imperfect covariance matrices. It was also shown that the proposed training structure covers the cases considered in the previous works [18,19,20]. The optimal training power allocation scheme was obtained numerically, and an efficient suboptimal power allocation scheme was proposed in closed form. Simulation results show that the proposed schemes considerably outperform the semi-robust and nonrobust schemes in terms of the worst-case MSE and BER performances. In particular, the proposed closed-form training scheme shows near-optimal performance, and hence it is more practically useful than the optimal training scheme.
An important conclusion from our work is that the performance of training signal design for MIMO systems is sensitive to the channel and interference covariance uncertainties, and thus one should carefully select or determine the covariance uncertainty-based training signals in practice for reliable performance according to the system requirements and operating conditions.

Author Contributions

Methodology, J.-M.K.; writing—review and editing, S.Y.; supervision, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Kyungpook National University Research Fund, 2024, and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2025-00559998).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

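It is well known that the following inequalities hold at an optimum point [29]:
$$J(P^\star, \tilde{T}, \tilde{R}, \tilde{Q}) \le J(P^\star, \tilde{T}^\star, \tilde{R}^\star, \tilde{Q}^\star) \le J(P, \tilde{T}^\star, \tilde{R}^\star, \tilde{Q}^\star) \tag{A1}$$
for all feasible values of $P$, $\tilde{T}$, $\tilde{R}$, and $\tilde{Q}$. In other words, a saddle point of the MSE $J$ in (12) is a solution to the worst-case MSE minimization problem in (14). Let us define the function $\mathcal{L}$ as
$$\begin{aligned} \mathcal{L}(P, \tilde{T}, \tilde{R}, \tilde{Q}, \mu_P, \mu_T, \mu_R, \mu_Q, \Gamma_T, \Gamma_R, \Gamma_Q) = {} & J(P, \tilde{T}, \tilde{R}, \tilde{Q}) + \mu_P \big[\mathrm{Tr}(P P^H) - P_T\big] \\ & - \mu_T \big[\|\tilde{T}\|_F^2 - \epsilon_T\big] + \mathrm{Tr}\big[(\hat{T} + \tilde{T}) \Gamma_T\big] \\ & - \mu_R \big[\|\tilde{R}\|_F^2 - \epsilon_R\big] + \mathrm{Tr}\big[(\hat{R} + \tilde{R}) \Gamma_R\big] \\ & - \mu_Q \big[\|\tilde{Q}\|_F^2 - \epsilon_Q\big] + \mathrm{Tr}\big[(\hat{Q} + \tilde{Q}) \Gamma_Q\big], \end{aligned} \tag{A2}$$
where $\mu_P$, $\mu_T$, $\mu_R$, $\mu_Q$, and the matrices $\Gamma_T$ of size $M_t \times M_t$, $\Gamma_R$ of size $M_r \times M_r$, and $\Gamma_Q$ of size $L \times L$ are dual variables associated with the constraints of the problem (14). Then, the saddle point should satisfy the following KKT optimality conditions [28,29]:
$$\nabla_{P} \mathcal{L} \big|_{P = P^\star} = -Z^{-2} P (\hat{Q} + \tilde{Q})^{-1} + \mu_P P = 0, \tag{A3a}$$
$$\mu_P \ge 0, \quad \mu_P \big[\mathrm{Tr}(P P^H) - P_T\big] = 0, \tag{A3b}$$
$$\nabla_{\tilde{T}} \mathcal{L} \big|_{\tilde{T} = \tilde{T}^\star} = -(\hat{T} + \tilde{T})^{-1} Z^{-2} (\hat{T} + \tilde{T})^{-1} + \mu_T \tilde{T} - \Gamma_T = 0, \tag{A3c}$$
$$\mu_T \ge 0, \quad \mu_T \big[\|\tilde{T}\|_F^2 - \epsilon_T\big] = 0, \tag{A3d}$$
$$\Gamma_T \succeq 0, \quad (\hat{T} + \tilde{T}) \Gamma_T = 0, \tag{A3e}$$
$$\nabla_{\tilde{R}} \mathcal{L} \big|_{\tilde{R} = \tilde{R}^\star} = -\mathrm{Tr}(Z^{-1}) I_{M_r} + \mu_R \tilde{R} - \Gamma_R = 0, \tag{A3f}$$
$$\mu_R \ge 0, \quad \mu_R \big[\|\tilde{R}\|_F^2 - \epsilon_R\big] = 0, \tag{A3g}$$
$$\Gamma_R \succeq 0, \quad (\hat{R} + \tilde{R}) \Gamma_R = 0, \tag{A3h}$$
$$\nabla_{\tilde{Q}} \mathcal{L} \big|_{\tilde{Q} = \tilde{Q}^\star} = -(\hat{Q} + \tilde{Q})^{-1} P^H Z^{-2} P (\hat{Q} + \tilde{Q})^{-1} + \mu_Q \tilde{Q} - \Gamma_Q = 0, \tag{A3i}$$
$$\mu_Q \ge 0, \quad \mu_Q \big[\|\tilde{Q}\|_F^2 - \epsilon_Q\big] = 0, \tag{A3j}$$
$$\Gamma_Q \succeq 0, \quad (\hat{Q} + \tilde{Q}) \Gamma_Q = 0, \tag{A3k}$$
where $Z = (\hat{T} + \tilde{T})^{-1} + P (\hat{Q} + \tilde{Q})^{-1} P^H$. In the following, we derive the structures of $(P, \tilde{T}, \tilde{R}, \tilde{Q})$ using Equations (A3a)–(A3k).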
Multiplying (A3f) from the left by $(\hat{R} + \tilde{R})$ and using the condition $(\hat{R} + \tilde{R}) \Gamma_R = 0$ in (A3h), we obtain
$$(\hat{R} + \tilde{R}) \big[-\mathrm{Tr}(Z^{-1}) I_{M_r} + \mu_R \tilde{R}\big] = 0. \tag{A4}$$
To satisfy the condition (A4), $\tilde{R}$ should have one of the following forms: $\tilde{R} = \frac{\mathrm{Tr}(Z^{-1})}{\mu_R} I_{M_r}$ or $\tilde{R} = -\hat{R}$. The latter choice cannot be a solution since it is infeasible when $\|\hat{R}\|_F^2 > \epsilon_R$. Therefore, we have $\tilde{R} = \frac{\mathrm{Tr}(Z^{-1})}{\mu_R} I_{M_r}$. From the constraint $\|\tilde{R}\|_F^2 \le \epsilon_R$, the optimal value of $\mu_R$ can be computed as $\mu_R = \mathrm{Tr}(Z^{-1}) \sqrt{M_r / \epsilon_R}$. Using this result, we obtain $\tilde{R} = \sqrt{\epsilon_R / M_r}\, I_{M_r}$.
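As a cross-check on this step, the same worst-case perturbation can be recovered without the KKT conditions. The sketch below assumes that $J$ depends on $\tilde{R}$ only through $\mathrm{Tr}(\hat{R} + \tilde{R})$, which is what the gradient term $\mathrm{Tr}(Z^{-1}) I_{M_r}$ in (A3f) suggests. Under that assumption the worst case maximizes $\mathrm{Tr}(\tilde{R})$ over the Frobenius ball, and the Cauchy–Schwarz inequality gives:

```latex
\begin{align}
\mathrm{Tr}(\tilde{R}) = \langle I_{M_r}, \tilde{R} \rangle
  &\le \| I_{M_r} \|_F \, \| \tilde{R} \|_F
   \le \sqrt{M_r \, \epsilon_R}, \\
\text{with equality if and only if} \quad
\tilde{R} &= \sqrt{\epsilon_R / M_r}\; I_{M_r}.
\end{align}
```

For instance, with $M_r = 4$ and $\epsilon_R = 1$ this gives $\tilde{R} = 0.5\, I_4$, whose squared Frobenius norm is $4 \times 0.25 = 1$, so the constraint is met with equality.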
By multiplying (A3i) from the left by $(\hat{Q} + \tilde{Q})$ and using the condition $(\hat{Q} + \tilde{Q}) \Gamma_Q = 0$ in (A3k), we obtain
$$\mu_Q (\hat{Q} + \tilde{Q}) \tilde{Q} = P^H Z^{-2} P (\hat{Q} + \tilde{Q})^{-1}. \tag{A5}$$
From the condition $Z^{-2} P (\hat{Q} + \tilde{Q})^{-1} = \mu_P P$ in (A3a), we can rewrite (A5) as
$$\mu_Q (\hat{Q} + \tilde{Q}) \tilde{Q} = \mu_P P^H P. \tag{A6}$$
Since the right-hand side of (A6) is Hermitian, the left-hand side of (A6) must also be Hermitian. From this, we have $(\hat{Q} + \tilde{Q}) \tilde{Q} = \tilde{Q} (\hat{Q} + \tilde{Q})$, or equivalently, $\tilde{Q} \hat{Q} = \hat{Q} \tilde{Q}$.
Lemma A1 [41]. For $n \times n$ Hermitian matrices $A$ and $B$, $AB = BA$ if and only if $A$ and $B$ share the same eigenvectors.
From Lemma A1, we can conclude that $\hat{Q}$ and $\tilde{Q}$ share the same eigenvectors. Let $\hat{Q} = U_Q \Lambda_Q U_Q^H$ be the EVD of $\hat{Q}$. Then, $\tilde{Q}$ should have the following form: $\tilde{Q} = U_Q D_Q U_Q^H$, where $D_Q$ is an $L \times L$ real diagonal matrix whose diagonal elements are the eigenvalues of $\tilde{Q}$. Using $\tilde{Q} = U_Q D_Q U_Q^H$, (A6) can be rewritten as
$$\mu_Q (\Lambda_Q + D_Q) D_Q = \mu_P U_Q^H P^H P U_Q. \tag{A7}$$
Since the left-hand side of (A7) is diagonal, $U_Q^H P^H P U_Q$ should be diagonal. Let $P = U_P D_P V_P^H$ be the singular-value decomposition (SVD) of $P$, where the main diagonal elements of the rectangular diagonal matrix $D_P$ are the singular values of $P$, and the column vectors of $U_P$ and $V_P$ are the left and right singular vectors of $P$, respectively. Then, the diagonal structure of $U_Q^H P^H P U_Q$ is obtained with the choice $V_P = U_Q$.
Multiplying (A3c) from the left by $(\hat{T} + \tilde{T})$ and using the condition $(\hat{T} + \tilde{T}) \Gamma_T = 0$ in (A3e), we obtain
$$\mu_T (\hat{T} + \tilde{T}) \tilde{T} = Z^{-2} (\hat{T} + \tilde{T})^{-1} = Z^{-1} - Z^{-2} P (\hat{Q} + \tilde{Q})^{-1} P^H, \tag{A8}$$
where the second equality follows from $(\hat{T} + \tilde{T})^{-1} = Z - P (\hat{Q} + \tilde{Q})^{-1} P^H$. Also, multiplying (A3a) from the right by $P^H$, we have
$$\mu_P P P^H = Z^{-2} P (\hat{Q} + \tilde{Q})^{-1} P^H. \tag{A9}$$
By substituting (A9) into (A8), the condition (A8) can be rewritten as
$$\mu_T (\hat{T} + \tilde{T}) \tilde{T} = Z^{-1} - \mu_P P P^H. \tag{A10}$$
Since the right-hand side of (A10) is Hermitian, the left-hand side of (A10) is also Hermitian. From this, we have $(\hat{T} + \tilde{T}) \tilde{T} = \tilde{T} (\hat{T} + \tilde{T})$, or equivalently, $\tilde{T} \hat{T} = \hat{T} \tilde{T}$. From Lemma A1, it follows that $\hat{T}$ and $\tilde{T}$ share the same eigenvectors. Let $\hat{T} = U_T \Lambda_T U_T^H$ be the EVD of $\hat{T}$. Then, the structure of $\tilde{T}$ is obtained as $\tilde{T} = U_T D_T U_T^H$, where $D_T$ is an $M_t \times M_t$ real diagonal matrix whose diagonal elements are the eigenvalues of $\tilde{T}$. By using $\tilde{T} = U_T D_T U_T^H$, (A10) can be rewritten as
$$\mu_T (\Lambda_T + D_T) D_T = U_T^H \big(Z^{-1} - \mu_P P P^H\big) U_T. \tag{A11}$$
Since the left-hand side of (A11) is diagonal, $U_T^H (Z^{-1} - \mu_P P P^H) U_T$ should be diagonal. With some manipulations, it can be shown that this diagonal structure is obtained when $U_T^H U_P D_P$ is a rectangular diagonal matrix. This can be achieved by simply setting $U_P = U_T$. Therefore, the optimal training structure is obtained as $P = U_T D_P U_Q^H$.
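The effect of the structure $P = U_T D_P U_Q^H$ can be checked numerically. The following NumPy sketch uses a hypothetical helper with randomly drawn unitary eigenvector matrices. It builds $\hat{T} + \tilde{T}$ and $\hat{Q} + \tilde{Q}$ from given eigenvalues $x_{T,i}$ and $x_{Q,i}$, forms $P$ with squared singular values $d_{P,i}$, and compares $\mathrm{Tr}(Z^{-1})$ with the eigenvalue-domain sum $\sum_i (x_{T,i}^{-1} + d_{P,i} x_{Q,i}^{-1})^{-1}$. Treating $d_{P,i}$ as a power (a squared singular value) is an assumption inferred from the power constraint $\mathrm{Tr}(P P^H) \le P_T$.

```python
import numpy as np


def random_unitary(n, rng):
    """Unitary matrix from the QR factorization of a complex Gaussian matrix."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(a)
    return q


def check_training_structure(x_t, x_q, d_p, rng):
    """Return Tr(Z^{-1}) from matrices, its scalar form, and U_Q^H P^H P U_Q."""
    x_t, x_q, d_p = (np.asarray(v, dtype=float) for v in (x_t, x_q, d_p))
    m_t, l = x_t.size, x_q.size
    nu = min(m_t, l)  # d_p must contain nu entries
    u_t, u_q = random_unitary(m_t, rng), random_unitary(l, rng)
    t = u_t @ np.diag(x_t) @ u_t.conj().T  # plays the role of T_hat + T_tilde
    q = u_q @ np.diag(x_q) @ u_q.conj().T  # plays the role of Q_hat + Q_tilde
    d_mat = np.zeros((m_t, l))
    d_mat[np.arange(nu), np.arange(nu)] = np.sqrt(d_p)  # singular values of P
    p = u_t @ d_mat @ u_q.conj().T
    z = np.linalg.inv(t) + p @ np.linalg.inv(q) @ p.conj().T
    mse_matrix = np.trace(np.linalg.inv(z)).real
    mse_scalar = np.sum(1.0 / (1.0 / x_t[:nu] + d_p / x_q[:nu])) + np.sum(x_t[nu:])
    gram = u_q.conj().T @ p.conj().T @ p @ u_q
    return mse_matrix, mse_scalar, gram


if __name__ == "__main__":
    rng = np.random.default_rng(1)  # arbitrary seed
    print(check_training_structure([2.0, 1.0, 0.5], [1.0, 1.5, 2.0], [1.0, 0.5, 0.0], rng))
```

The two MSE values agree up to round-off and the returned Gram matrix is diagonal, which is the property required by (A7).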

Appendix B. Proof of Lemma 1

We first prove the convexity. The Hessian matrix of $\zeta$ w.r.t. $d_P$ is computed as
$$\nabla^2_{d_P} \zeta = \mathrm{Diag}\left( \frac{2 x_{Q,1} x_{T,1}^3}{(d_{P,1} x_{T,1} + x_{Q,1})^3}, \ldots, \frac{2 x_{Q,\nu} x_{T,\nu}^3}{(d_{P,\nu} x_{T,\nu} + x_{Q,\nu})^3} \right), \tag{A12}$$
where $x_{T,i} = \lambda_{T,i} + d_{T,i}$, $i = 1, \ldots, M_t$, and $x_{Q,i} = \lambda_{Q,i} + d_{Q,i}$, $i = 1, \ldots, L$. For fixed $(d_T, d_Q) \in \mathcal{D}_T \times \mathcal{D}_Q$, the diagonal elements of the Hessian matrix $\nabla^2_{d_P} \zeta$ in (A12) are non-negative for $d_P \in \mathcal{D}_P$ because the values of $\{d_{P,i}\}$, $\{x_{T,i}\}$, and $\{x_{Q,i}\}$ are all non-negative. This means that $\nabla^2_{d_P} \zeta$ is positive semi-definite. Therefore, the function $\zeta$ in (17) is convex in $d_P \in \mathcal{D}_P$ for fixed $(d_T, d_Q) \in \mathcal{D}_T \times \mathcal{D}_Q$.
Now, we prove the concavity. Let us define the augmented vector $r$ as $r = [d_T^T, d_Q^T]^T$. Then, the Hessian matrix of $\zeta$ w.r.t. $r$ is computed as
$$\nabla^2_{r} \zeta = -\begin{bmatrix} G_1 (\Lambda_Q + D_Q)^2 G_1 & -G_1 (\Lambda_Q + D_Q)(\Lambda_T + D_T) G_2 \\ -G_2 (\Lambda_T + D_T)(\Lambda_Q + D_Q) G_1 & G_2 (\Lambda_T + D_T)^2 G_2 \end{bmatrix} = -\begin{bmatrix} G_1 (\Lambda_Q + D_Q) \\ -G_2 (\Lambda_T + D_T) \end{bmatrix} \begin{bmatrix} G_1 (\Lambda_Q + D_Q) \\ -G_2 (\Lambda_T + D_T) \end{bmatrix}^H, \tag{A13}$$
where
$$G_1 = \mathrm{Blkdiag}\big(G, 0_{(M_t - \nu) \times (M_t - \nu)}\big),$$
$$G_2 = \mathrm{Blkdiag}\big(G, 0_{(L - \nu) \times (L - \nu)}\big),$$
and the matrix $G$ is given by
$$G = \mathrm{Diag}\left( \sqrt{\frac{2 d_{P,1}}{(d_{P,1} x_{T,1} + x_{Q,1})^3}}, \ldots, \sqrt{\frac{2 d_{P,\nu}}{(d_{P,\nu} x_{T,\nu} + x_{Q,\nu})^3}} \right).$$
For fixed $d_P \in \mathcal{D}_P$, it can be shown from (A13) that $a^T \nabla^2_{r} \zeta\, a \le 0$ for any vector $a \in \mathbb{R}^{(M_t + L) \times 1}$. This means that the Hessian matrix $\nabla^2_{r} \zeta$ in (A13) is negative semi-definite. Thus, $\zeta$ is concave in $d_T \in \mathcal{D}_T$ and $d_Q \in \mathcal{D}_Q$ for fixed $d_P \in \mathcal{D}_P$.
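The signs of both Hessians can be verified on a single term of $\zeta$. The worked derivation below is a sketch that assumes each term with $i \le \nu$ has the form $x_T x_Q / (d\, x_T + x_Q)$, up to a positive scaling; this form is inferred from the Hessian entries in (A12) rather than quoted from (17). The index $i$ is dropped for brevity.

```latex
\begin{align}
f(d, x_T, x_Q) &= \frac{x_T x_Q}{d\, x_T + x_Q}, \\
\frac{\partial^2 f}{\partial d^2}
  &= \frac{2 x_Q x_T^3}{(d\, x_T + x_Q)^3} \ge 0, \\
\frac{\partial^2 f}{\partial x_T^2}
  &= -\frac{2 d\, x_Q^2}{(d\, x_T + x_Q)^3}, \qquad
\frac{\partial^2 f}{\partial x_Q^2}
   = -\frac{2 d\, x_T^2}{(d\, x_T + x_Q)^3}, \qquad
\frac{\partial^2 f}{\partial x_T \partial x_Q}
   = \frac{2 d\, x_T x_Q}{(d\, x_T + x_Q)^3}, \\
\begin{bmatrix}
  \partial^2 f / \partial x_T^2 & \partial^2 f / \partial x_T \partial x_Q \\
  \partial^2 f / \partial x_T \partial x_Q & \partial^2 f / \partial x_Q^2
\end{bmatrix}
  &= -\frac{2 d}{(d\, x_T + x_Q)^3}
     \begin{bmatrix} x_Q \\ -x_T \end{bmatrix}
     \begin{bmatrix} x_Q & -x_T \end{bmatrix}
     \preceq 0 .
\end{align}
```

The $2 \times 2$ block is a negative multiple of a rank-one outer product, so each term is convex in $d$ and jointly concave in $(x_T, x_Q)$ whenever $d$, $x_T$, and $x_Q$ are non-negative.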

Appendix C. Proof of Lemma 2

The Schur-concavity can be proved from Schur’s condition, presented in the following lemma.
Lemma A2 (Schur’s condition) [32] (Ch. 3.A.4). Let the function $\varphi(a): \mathcal{A} \subseteq \mathbb{R}^n \to \mathbb{R}$ be continuously differentiable. Then, $\varphi$ is Schur-concave on $\mathcal{A}$ if and only if $\varphi$ is permutation-invariant, i.e., $\varphi(a) = \varphi(\Pi_n a)$, and $(a_m - a_l) \left( \frac{\partial \varphi}{\partial a_m} - \frac{\partial \varphi}{\partial a_l} \right) \le 0$ for all $1 \le m, l \le n$, where $\Pi_n$ is any $n \times n$ permutation matrix.
We first show the permutation-invariant property of $g$, i.e., $g(x_T, x_Q) = g(\Pi_{M_t} x_T, \Pi_L x_Q)$. Let us define the diagonal matrices $X_T$ and $X_Q$ as $X_T = \mathrm{Diag}(x_{T,1}, \ldots, x_{T,M_t})$ and $X_Q = \mathrm{Diag}(x_{Q,1}, \ldots, x_{Q,L})$, respectively. Then, we have
$$g(\Pi_{M_t} x_T, \Pi_L x_Q) = \min_{d_P \in \mathcal{D}_P} g(d_P, \Pi_{M_t} x_T, \Pi_L x_Q) \tag{A14a}$$
$$= \min_{\mathrm{Tr}(D_P D_P^T) \le P_T,\; D_P D_P^T \succeq 0} \beta\, \mathrm{Tr}\Big[\big(\Pi_{M_t} X_T^{-1} \Pi_{M_t}^T + D_P \Pi_L X_Q^{-1} \Pi_L^T D_P^T\big)^{-1}\Big] \tag{A14b}$$
$$= \min_{\mathrm{Tr}(\tilde{D}_P \tilde{D}_P^T) \le P_T,\; \tilde{D}_P \tilde{D}_P^T \succeq 0} \beta\, \mathrm{Tr}\Big[\big(X_T^{-1} + \tilde{D}_P X_Q^{-1} \tilde{D}_P^T\big)^{-1}\Big] \tag{A14c}$$
$$= \min_{\mathrm{Tr}(D_P D_P^T) \le P_T,\; D_P D_P^T \succeq 0} \beta\, \mathrm{Tr}\Big[\big(X_T^{-1} + D_P X_Q^{-1} D_P^T\big)^{-1}\Big] \tag{A14d}$$
$$= \min_{d_P \in \mathcal{D}_P} g(d_P, x_T, x_Q) = g(x_T, x_Q), \tag{A14e}$$
where the equality in (A14b) follows from the equivalence of the problems in (A14a) and (A14b) and the fact that $\mathrm{Diag}(\Pi_{M_t} x_T) = \Pi_{M_t} X_T \Pi_{M_t}^T$ and $\mathrm{Diag}(\Pi_L x_Q) = \Pi_L X_Q \Pi_L^T$. The equality in (A14c) follows from the change of variable from $D_P$ to $\tilde{D}_P = \Pi_{M_t}^T D_P \Pi_L$. The equality in (A14e) follows from the equivalence of the problems in (A14d) and (A14e).
From the above property, without loss of generality, it is assumed that the elements of $x_T$ and $x_Q$ are arranged in decreasing and increasing orders, respectively. Then, for given $x_T$ and $x_Q$, the solution to the inner minimization problem of (20) can be computed from the KKT optimality conditions as
$$d_{P,i} = \begin{cases} \dfrac{\Big(P_T + \sum_{j=1}^{r} \frac{x_{Q,j}}{x_{T,j}}\Big) \sqrt{x_{Q,i}}}{\sum_{j=1}^{r} \sqrt{x_{Q,j}}} - \dfrac{x_{Q,i}}{x_{T,i}}, & i = 1, \ldots, r, \\ 0, & i = r + 1, \ldots, \nu, \end{cases} \tag{A15}$$
where $r = \max\{ i \in \{1, \ldots, \nu\} : d_{P,i} > 0 \}$ denotes the largest $i$ such that $d_{P,i} > 0$. Substituting (A15) into (21), we can write the minimum MSE $g$ as
$$g(x_T, x_Q) = \frac{\beta \Big(\sum_{j=1}^{r} \sqrt{x_{Q,j}}\Big)^2}{P_T + \sum_{j=1}^{r} (x_{Q,j} / x_{T,j})} + \beta \sum_{i=r+1}^{M_t} x_{T,i}. \tag{A16}$$
One can easily show from (A16) that $(x_{T,i} - x_{T,j}) \left( \frac{\partial g}{\partial x_{T,i}} - \frac{\partial g}{\partial x_{T,j}} \right) \le 0$ for all $1 \le i, j \le M_t$ and $(x_{Q,i} - x_{Q,j}) \left( \frac{\partial g}{\partial x_{Q,i}} - \frac{\partial g}{\partial x_{Q,j}} \right) \le 0$ for all $1 \le i, j \le L$. Therefore, the minimum MSE $g$ is Schur-concave in $x_T$ and $x_Q$ according to Lemma A2.
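The closed-form allocation in (A15) and the resulting MSE in (A16) can be evaluated with a short routine. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the square roots follow from solving the KKT conditions of $\min \sum_i x_{Q,i} / (d_{P,i} + x_{Q,i}/x_{T,i})$ under a sum-power constraint, and the inputs are assumed to be positive and already ordered ($x_T$ decreasing, $x_Q$ increasing), so that the active set is a prefix and can be found by shrinking $r$ until the last active entry is positive.

```python
import math


def worst_case_power_allocation(x_t, x_q, p_total):
    """Evaluate (A15); x_t sorted decreasing, x_q sorted increasing, p_total > 0."""
    nu = min(len(x_t), len(x_q))
    for r in range(nu, 0, -1):
        # Common level shared by all active modes (a water-filling-type constant).
        level = (p_total + sum(x_q[j] / x_t[j] for j in range(r))) / sum(
            math.sqrt(x_q[j]) for j in range(r)
        )
        d_p = [level * math.sqrt(x_q[i]) - x_q[i] / x_t[i] for i in range(r)]
        if d_p[-1] > 0.0:  # under the assumed ordering the last entry is the smallest margin
            return d_p + [0.0] * (nu - r), r
    raise ValueError("p_total must be positive")


def minimum_mse(x_t, x_q, p_total, beta=1.0):
    """Evaluate (A16) using the active-set size r found above."""
    _, r = worst_case_power_allocation(x_t, x_q, p_total)
    numerator = sum(math.sqrt(x_q[j]) for j in range(r)) ** 2
    denominator = p_total + sum(x_q[j] / x_t[j] for j in range(r))
    return beta * numerator / denominator + beta * sum(x_t[r:])


if __name__ == "__main__":
    print(worst_case_power_allocation([2.0, 1.0], [1.0, 1.0], 1.0))
    print(minimum_mse([2.0, 1.0], [1.0, 1.0], 1.0))
```

For $x_T = (2, 1)$, $x_Q = (1, 1)$, and $P_T = 1$, both modes are active with $d_P = (0.75, 0.25)$, and summing $x_{Q,i} / (d_{P,i} + x_{Q,i}/x_{T,i})$ directly gives $0.8 + 0.8 = 1.6$, matching (A16) with $\beta = 1$.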

References

1. Foschini, G.J.; Gans, M.J. On limits of wireless communications in a fading environment when using multiple antennas. Wireless Pers. Commun. 1998, 6, 311–335.
2. Telatar, E. Capacity of multi-antenna Gaussian channels. Eur. Trans. Telecommun. 1999, 10, 585–595.
3. Perahia, E. IEEE 802.11n development: History, process, and technology. IEEE Commun. Mag. 2008, 46, 48–55.
4. Li, Q.; Li, G.; Lee, W.; Lee, M.I.; Mazzarese, D.; Clerckx, B.; Li, Z. MIMO techniques in WiMAX and LTE: A future overview. IEEE Commun. Mag. 2010, 48, 86–92.
5. IEEE. Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 4: Enhancement for Very High Throughput for Operations in Bands Below 6 GHz; IEEE P802.11ac/D3.0; IEEE: Piscataway Township, NJ, USA, 2012.
6. Shen, Y.; Martinez, E. WiMAX Channel Estimation: Algorithms and Implementations; Application Note, Freescale; Scientific Research Publishing Inc.: Glendale, CA, USA, 2007.
7. Biguesh, M.; Gershman, A.B. Training based MIMO channel estimation: A study of estimator tradeoffs and optimal training signals. IEEE Trans. Signal Process. 2006, 54, 884–893.
8. Wong, T.F.; Park, B. Training sequence optimization in MIMO systems with colored interference. IEEE Trans. Commun. 2004, 52, 1939–1947.
9. Liu, Y.; Wong, T.; Hager, W. Training signal design for estimation of correlated MIMO channels with colored interference. IEEE Trans. Signal Process. 2007, 55, 1486–1497.
10. Katselis, D.; Kofidis, E.; Theodoridis, S. On training optimization for estimation of correlated MIMO channels in the presence of multiuser interference. IEEE Trans. Signal Process. 2008, 56, 4892–4904.
11. Björnson, E.; Ottersten, B. A framework for training-based estimation in arbitrarily correlated Rician MIMO channels with Rician disturbance. IEEE Trans. Signal Process. 2010, 58, 1807–1820.
12. Biguesh, S.S.M.; Gazor, M. Optimal training sequence for MIMO wireless systems in colored environments. IEEE Trans. Signal Process. 2009, 57, 3144–3153.
13. Love, D.J.; Heath, R.W., Jr.; Santipach, W.; Honig, M.L. What is the value of limited feedback for MIMO channels? IEEE Commun. Mag. 2004, 42, 54–59.
14. Kotecha, J.; Sayeed, A. Transmit signal design for optimal estimation of correlated MIMO channels. IEEE Trans. Signal Process. 2004, 52, 546–557.
15. Pascual-Iserte, A.; Palomar, D.P.; Perez-Neira, A.I.; Lagunas, M.A. A robust maximin approach for MIMO communications with imperfect channel state information based on convex optimization. IEEE Trans. Signal Process. 2006, 54, 346–360.
16. Vucic, N.; Boche, H.; Shi, S. Robust transceiver optimization in downlink multiuser MIMO systems. IEEE Trans. Signal Process. 2009, 57, 3576–3587.
17. Botros, M.; Davidson, T.N. Convex conic formulations of robust downlink precoder designs with quality of service constraints. IEEE Sel. Top. Signal Process. 2007, 1, 714–724.
18. Chiang, C.-T.; Fung, C.C. Robust training sequence design for spatially correlated MIMO channel estimation. IEEE Trans. Veh. Technol. 2011, 60, 2882–2894.
19. Shariati, N.; Wang, J.; Bengtsson, M. Robust Training Sequence Design for Correlated MIMO Channel Estimation. IEEE Trans. Signal Process. 2014, 62, 107–120.
20. Shariati, N.; Bengtsson, M. Robust training sequence design for spatially correlated MIMO channels and arbitrary colored disturbance. In Proceedings of the 2011 IEEE 22nd International Symposium on Personal, Indoor and Mobile Radio Communications, Toronto, ON, Canada, 11–14 September 2011; pp. 1939–1943.
21. Spencer, Q.H.; Peel, C.B.; Swindlehurst, A.L.; Haardt, M. An introduction to the multi-user MIMO downlink. IEEE Commun. Mag. 2004, 42, 60–67.
22. Camp, J.D.; Knightly, E.W. The IEEE 802.11s Extended Service Set Mesh Networking Standard. IEEE Commun. Mag. 2008, 46, 120–126.
23. Sun, W.; Choi, M.; Choi, S. IEEE 802.11 ah: A long range 802.11 WLAN at sub 1 GHz. J. ICT Stand. 2013, 1, 83–107.
24. Peters, S.W.; Heath, R.W. The future of WiMAX: Multihop relaying with IEEE 802.16j. IEEE Commun. Mag. 2009, 47, 104–111.
25. Biglieri, E.; Calderbank, R.; Constantinides, A.; Goldsmith, A.; Paulraj, A.; Poor, H.V. MIMO Wireless Communications; Cambridge University Press: Cambridge, UK, 2007.
26. Kay, S.M. Fundamentals of Statistical Signal Processing, Vol. I: Estimation Theory; Prentice-Hall: Englewood Cliffs, NJ, USA, 1993.
27. Shariati, N.; Bjornson, E.; Bengtsson, M.; Debbah, M. Low-complexity polynomial channel estimation in large-scale MIMO with arbitrary statistics. IEEE J. Sel. Top. Signal Process. 2014, 8, 815–830.
28. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2009.
29. Boyd, S. Lecture Notes for EE364B: Convex Optimization II. 2007. Available online: http://www.stanford.edu/class/ee364b (accessed on 6 May 2025).
30. Matlab Software for Disciplined Convex Programming, Version 2.0; CVX Research, Inc.: Austin, TX, USA, 2012. Available online: http://cvxr.com/cvx (accessed on 6 May 2025).
31. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970.
32. Marshall, A.W.; Olkin, I. Inequalities: Theory of Majorization and Its Applications; Academic: New York, NY, USA, 1979.
33. Luo, Z.-Q.; Yu, W. An introduction to convex optimization for communications and signal processing. IEEE J. Sel. Areas Commun. 2006, 24, 1426–1438.
34. Palomar, D.P.; Fonollosa, R. Practical algorithms for a family of waterfilling solutions. IEEE Trans. Signal Process. 2005, 53, 686–695.
35. Svensson, L.; Lundberg, M. On posterior distributions for signals in Gaussian noise with unknown covariance matrix. IEEE Trans. Signal Process. 2005, 53, 3554–3571.
36. Tiao, G.C.; Zellner, A. On the Bayesian estimation of multivariate regression. J. R. Stat. Soc. Ser. B 1964, 26, 277–285.
37. Shiu, D.; Foschini, G.J.; Gans, M.J.; Kahn, J.M. Fading correlation and its effect on the capacity of multielement antenna systems. IEEE Trans. Commun. 2002, 48, 502–513.
38. Loyka, S.L. Channel capacity of MIMO architecture using the exponential correlation matrix. IEEE Commun. Lett. 2001, 5, 369–371.
39. Tarokh, V.; Jafarkhani, H.; Calderbank, A.R. Space-time block codes from orthogonal designs. IEEE Trans. Inform. Theory 1999, 45, 1456–1467.
40. Tse, D.N.C.; Viswanath, P. Fundamentals of Wireless Communications; Cambridge University Press: Cambridge, UK, 2005.
41. Strang, G. Linear Algebra and Its Applications; Thomson Brooks/Cole: Boston, MA, USA, 2006.
Figure 1. An illustrative example of 3MIMO system under consideration, to which the proposed scheme is applicable.
Figure 2. Minimum MSE $\bar{\zeta}$ and its upper bound versus $P_T$ for various values of $M_t$. The actual values of $\bar{\zeta}$ are indicated by the solid curves and the upper bounds are indicated by the dash-dotted curves.
Figure 3. Worst-case MSE performance comparison of various training schemes for a strongly correlated environment when $\alpha_T = \alpha_R = \alpha_Q = 0.3$. The results are shown for the systems with $M_t = M_r = L = 4$ and $M_t = M_r = L = 8$.
Figure 4. Worst-case MSE performance comparison of various training schemes for a weakly correlated environment when $\alpha_T = \alpha_R = \alpha_Q = 0.3$. The results are shown for the systems with $M_t = M_r = L = 4$ and $M_t = M_r = L = 8$.
Figure 5. Worst-case MSE performance comparison of various training schemes for different values of the uncertainty parameter $\alpha = \alpha_T = \alpha_R = \alpha_Q$ when $M_t = M_r = L = 4$. The results are shown for $\alpha = 0.1$ and $\alpha = 0.6$.
Figure 6. Worst-case MSE performance comparison of various training schemes for different values of the uncertainty parameter $\alpha = \alpha_T = \alpha_R = \alpha_Q$ when $M_t = M_r = L = 6$. The results are shown for $\alpha = 0.1$ and $\alpha = 0.6$.
Figure 7. BER performance comparison of various training schemes for the $3 \times 3$ MIMO system when the training SIR is set to 15 dB and $\alpha_T = \alpha_R = \alpha_Q = 0.3$. The orthogonal space–time code and the MMSE receiver are used to encode and decode the QPSK-modulated symbols, respectively.
Figure 8. BER performance comparison of various training schemes for the $4 \times 4$ MIMO system when the training SIR is set to 15 dB and $\alpha_T = \alpha_R = \alpha_Q = 0.3$. The orthogonal space–time code and the MMSE receiver are used to encode and decode the QPSK-modulated symbols, respectively.
Table 1. Computational complexity comparison.

| Method | Computational complexity | Processing time (s) |
| --- | --- | --- |
| Iterative algorithm in [20] | $O(M_t^3 + L^3 + (M_t^{6.5} + L^{6.5}) N_{\mathrm{iter}} \log(1/\psi))$ | $8.8098 \times 10^{5}$ |
| Proposed optimal training scheme | Total: $O(M_t^3 + L^3 + (M_t^{3.5} + L^{3.5}) \log(1/\psi))$ | 206.4642 |
| Proposed suboptimal training scheme | Total: $O(M_t^3 + L^3)$ | 68.8214 |
Table 2. Simulation parameter setup.

| System parameter | Values |
| --- | --- |
| Number of transmit antennas, $M_t$ | $\{3, 4, 6, 8\}$ |
| Number of receive antennas, $M_r$ | $\{3, 4, 6, 8\}$ |
| Training length, $L$ | $\{3, 4, 6, 8\}$ |
| Number of interferers, $K$ | 3 |
| Number of antennas at the $k$th interferer, $M_k$ | $\{3, 4, 6, 8\}$ |
| Uncertainty parameters, $\alpha_T$, $\alpha_R$, and $\alpha_Q$ | $\{0.1, 0.3, 0.6\}$ |
| Angular spreads, $\Theta_T$ and $\Theta_{I,k}$, for all $k$ | $\{10, 30\}$ |
| Correlation coefficients, $\rho_{\tau,k}$ and $\rho_{s,k}$, for all $k$ | $\{0.3, 0.9\}$ |

Share and Cite

MDPI and ACS Style

Kang, J.-M.; Yun, S. Worst-Case Robust Training Design for Correlated MIMO Channels in the Presence of Colored Interference. Mathematics 2025, 13, 2168. https://doi.org/10.3390/math13132168
