Article Intercept Capacity: Unknown Unitary Transformation

We consider the problem of intercepting communications signals between Multiple-Input Multiple-Output (MIMO) communication systems. To correctly detect a transmitted message it is necessary to know the gain matrix that represents the channel between the transmitter and the receiver. However, even if the receiver has knowledge of the message symbol set, it may not be possible to estimate the channel matrix. Blind Source Separation (BSS) techniques, such as Independent Component Analysis (ICA), can go some way to extracting independent signals from individual transmission antennae, but these may have been preprocessed in a manner unknown to the receiver. In this paper we consider the situation where a communications interception system has prior knowledge of the message symbol set and of the channel matrix between the transmission system and the interception system, and is able to resolve the transmissions from independent antennae. The question then becomes: what is the mutual information available to the interceptor when an unknown unitary transformation matrix is employed by the transmitter?


Introduction
In this paper we are interested in differential entropy and mutual information as they apply to wireless communication systems employing antenna arrays at both the transmission and receiving sites. Systems of this type are more commonly known as Multiple-Input Multiple-Output (MIMO) communication systems. MIMO communication techniques are known to provide increased information capacity over that obtainable via a single transmit antenna to single receive antenna system [1,2]; however, this extra capacity comes at the expense of increased system complexity and additional processing. To correctly receive and detect the transmitted message, the receive system must know the channel, or mixing, matrix as well as the message symbol set being used. The channel matrix may be estimated when a predetermined, known message sequence is incorporated into the transmitted message and the receiver knows where in the message this sequence occurs. However, this training sequence may not always be available, and this presents a blind source estimation problem where neither the message nor the channel matrix is known to the receiver. One possible solution to this problem is to employ a Blind Source Separation (BSS) technique such as Independent Component Analysis (ICA) [3], which can go some way to extracting the signals from individual transmission antennae, with the caveat that all but one of the transmitted signals must have a non-Gaussian probability distribution. In some cases the transmitted signals may have been preprocessed in a manner unknown to the receiver. In this paper we consider the situation where a communications receiving system has prior knowledge of the message symbol set and of the channel matrix between the transmission system and the receiving system, and is able to resolve the transmissions from the (assumed independent) transmitter antennae, but does not know the unitary transformation that has been applied at the transmitter. The question then becomes: what is the mutual information available to the receiver when an unknown unitary transformation matrix is employed by the transmitter?
In the following sections we derive expressions for differential entropy, mutual information and hence capacity for a two-element transmit array to two-element receive array system, which we shall refer to as a 2-Dimensional (2D) system. The 3D case is studied in the appendix, giving a basis for a high SNR approximation for the general N-Dimensional (ND) case. The general SNR, ND case is derived and the resulting intended-receiver and intercept-receiver mutual informations are compared.

Problem and Assumptions
The model that we shall employ for a MIMO system is the simple linear transformation

$$y = Hx + n$$

where $y$ is the received signal vector, $x$ is the transmitted vector, $n$ is additive receiver noise and $H$ is the channel gain or mixing matrix between the transmitter and receiver. The standard MIMO channel model [11] assumes independent identically distributed (i.i.d.), frequency-flat Rayleigh fading between the transmit and receive antennae. Consequently the components $H_{i,j}$ of $H$ are typically modelled with a complex Normal density, i.e. $H_{i,j} \sim \mathcal{CN}(0, 1)$. Here we shall assume $H$ to be constant for both the intended and eavesdropper channels. In [11] the authors show that, for the case where the channel is unknown and with block coding over a coherence time $T$, the signal structure that achieves capacity is formed by the product of an isotropically distributed unitary matrix and an independent real, nonnegative diagonal matrix. For the purpose of this study we shall treat all of $y$, $x$ and $n$ as real random variables. The benefit of this approach will be to simplify the derivations whilst recognising that, if the real and imaginary parts of the variables are independent, the results may be readily extended to the complex case by increasing the dimensionality of the vectors.

Figure 1 illustrates the scenario that we are studying. Employing a well-known cryptographic convention [4], the transmission source array is labelled Alice (A), the intended cooperative receiver array is labelled Bob (B) and the unintended, passive intercept receiver is labelled Eve (E). The lines represent the paths that signals take from transmitter antennae to receiver antennae. Shapes in the signal paths represent objects that cause signal scattering. An important point to realise here is that the paths (channel $H_B$) between A and B are different from those between A and E (channel $H_E$).
The channel matrix can be factorized using the Singular Value Decomposition (SVD) as $H = UDV^\dagger$. If the transmitter precodes the message $x$ with $V$ and the receiver applies $U^\dagger$ to the received vector, we can then use:

$$U^\dagger y = U^\dagger (UDV^\dagger) V x + U^\dagger n = Dx + U^\dagger n.$$

This allows us to view the MIMO system as if it were composed of a set of parallel channels, and the input data vector can be designed with this in mind. Figure 2 shows how this channel, with pre- and postprocessing, may be configured. For such an approach to work the transmitter requires precise knowledge of the channel matrix, and it is then a simple matter for the intended receiver to obtain the (scaled) message, since $D$ is a real diagonal matrix. However, for an unintended receiver with a different (known) channel matrix, an unknown unitary transformation has been applied. In this case we desire to know how the mutual information is affected. We make the following assumptions:
• $y$ is a real $N \times 1$ observation vector.
• $U$ is a real $N \times N$ unitary (orthogonal) matrix.
• $x$ is a real $N \times 1$ random Gaussian signal vector, $x_i \sim \mathcal{N}(0, \sigma_x^2)$.
• $n$ is a real $N \times 1$ random Gaussian noise vector, $n_i \sim \mathcal{N}(0, \sigma_n^2)$.
• the intended channel ($H_B$) is known to both Alice and Bob.
• Eve knows the intercept channel ($H_E$) but not the intended channel.
• the channels $H_B$ and $H_E$ vary slowly with time (or over many symbol periods) and may be assumed constant for the present study.
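The parallel-channel view can be checked numerically with a minimal sketch for the real 2 × 2 case. The helper names (`rot`, `transpose`, `matmul`, `matvec`), the rotation angles and the singular values are illustrative assumptions; the channel is built directly from chosen factors U, D and V rather than decomposed, and noise is omitted so that the identity holds exactly.

```python
import math

def rot(theta):
    """2x2 rotation matrix (a real orthogonal matrix)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

# Hypothetical factors of the intended channel H_B = U D V^T.
U = rot(0.7)
V = rot(-1.2)
D = [[2.0, 0.0], [0.0, 0.5]]
H_B = matmul(matmul(U, D), transpose(V))

x = [1.0, -3.0]                     # message vector
tx = matvec(V, x)                   # Alice precodes the message with V
rx = matvec(H_B, tx)                # noiseless channel output at Bob
bob = matvec(transpose(U), rx)      # Bob post-processes with U^T

expected = [D[0][0] * x[0], D[1][1] * x[1]]   # D x: two scaled parallel channels
print(bob, expected)
```

Each message component reaches Bob scaled only by its own singular value, which is why the intended receiver can recover the message without further unmixing.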
Based on these assumptions, Eve attempts to estimate the signal vector by applying the channel inverse as

$$H_E^{-1} y = H_E^{-1}(H_E V x + n) = Vx + H_E^{-1} n.$$

Eve is therefore unable to directly obtain $x$ due to the unknown unitary matrix $V$. In applying the channel inverse, the noise vector has also been scaled, and the modified noise covariance term $H_E^{-1} \Sigma_n H_E^{-T}$ shows that the intercept receiver may be operating with a different signal-to-noise ratio from that of the intended receiver. This also indicates that Eve could obtain better mutual information with a better channel.
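A small sketch of Eve's position in the real 2 × 2 case is given below. The channel entries, the rotation angle and the noise level are hypothetical values chosen for illustration, and the noise covariance is assumed to be $\sigma_n^2 I$. The sketch shows that inverting the known channel returns the rotated vector $Vx$, which has the same length as $x$ but not the same components, and it evaluates the scaled noise covariance.

```python
import math

def rot(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def inverse2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def norm(v):
    return math.hypot(v[0], v[1])

H_E = [[1.0, 0.4], [-0.3, 0.8]]   # hypothetical intercept channel, known to Eve
V = rot(0.9)                      # precoding rotation, unknown to Eve
x = [1.0, -3.0]                   # message vector
sigma_n = 0.1                     # assumed receiver noise standard deviation

y_noiseless = matvec(H_E, matvec(V, x))
H_inv = inverse2(H_E)
estimate = matvec(H_inv, y_noiseless)     # equals V x, not x

# Covariance of the scaled noise H_E^{-1} n when Sigma_n = sigma_n^2 I.
noise_cov = [[sigma_n ** 2 * e for e in row]
             for row in matmul(H_inv, transpose(H_inv))]

print(estimate, norm(estimate), norm(x))
print(noise_cov)
```

The diagonal of `noise_cov` differs from $\sigma_n^2$ whenever $H_E$ is not orthogonal, which is the change in effective signal-to-noise ratio described in the text.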
Optimal power allocation to the parallel channels between Alice and Bob would typically be implemented via a technique called waterfilling (see [5], chapter 5), and hence lead to optimal system capacity. We have not taken waterfilling into account in this study and simply assume that equal power is assigned to each of the parallel channels.
We could proceed to derive the eavesdropper mutual information in a Cartesian or a polar coordinate system. Of course it does not matter which coordinate system we choose; we should get the same answer. It is well known that differential entropy involves a Jacobian ($J$) in the transformation of coordinates [6], leading to a $\ln \det(J)$ term, but this will cancel in the mutual information calculations because mutual information is a relative entropy, i.e. the difference between two entropies. For the purpose of this study our derivations will be based on a Cartesian coordinate system.

We shall derive differential entropies according to the definitions given by Cover and Thomas in [7], i.e. the differential entropy $h(Y)$ of a continuous random variable $Y$ with a probability density $p(y)$ is defined as

$$h(Y) = -\int_{\mathcal{Y}} p(y) \ln p(y)\, dy$$

where $\mathcal{Y}$ is the support set of the random variable. When we have two random variables $Y, X$ with joint probability density $p(y, x)$, the conditional differential entropy is defined as

$$h(Y|X) = -\int_{\mathcal{X}} \int_{\mathcal{Y}} p(y, x) \ln p(y|x)\, dy\, dx$$

where $\mathcal{X}$ is the support set of the random variable $X$. The Mutual Information (MI) between the two random variables $Y$ and $X$ is defined as

$$I(Y; X) = h(Y) - h(Y|X).$$

The capacity $C$ is then obtained by maximizing the mutual information over all probability distributions for the source, i.e. over $p(x)$:

$$C = \max_{p(x)} I(Y; X).$$

It is well known [7] that a Gaussian source distribution is an entropy maximizer (for a given variance) so that, by treating $x$ as a vector with i.i.d. Gaussian components, the resulting differential entropy expressions will determine the capacity. Since the channels are assumed known we may consider $y = x + n$ to represent the fully informed (unitary transformation known) case and $y = Vx + n$ to represent the partially informed (unitary transformation unknown) case. We can write $x = \frac{x}{\|x\|}\|x\|$ to obtain

$$y = Av + n$$

where $A = \|x\|$ and $v = V\frac{x}{\|x\|}$ is a unit vector for which we may or may not know the rotations. For the random vectors $y$ and $x$ the mutual information for the fully informed model is given by:

$$I(Y; X) = h(Y) - h(Y|X) \qquad (8)$$

and for the partially informed model the mutual information is obtained from:

$$I(Y; A) = h(Y) - h(Y|A) \qquad (9)$$

where the message amplitude $A$ is known but not the rotation angles.
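As a minimal numerical sketch of these definitions, the one-dimensional fully informed case $y = x + n$ can be evaluated by direct integration of $-p \ln p$. The variances, the integration limits and the function names are illustrative assumptions; entropies are in nats.

```python
import math

def gaussian_pdf(y, var):
    return math.exp(-y * y / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def entropy_numeric(var, half_width=12.0, steps=20000):
    """Midpoint-rule estimate of h(Y) = -integral p(y) ln p(y) dy, in nats."""
    sd = math.sqrt(var)
    lo, hi = -half_width * sd, half_width * sd
    dy = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        p = gaussian_pdf(lo + (k + 0.5) * dy, var)
        if p > 0.0:
            total -= p * math.log(p) * dy
    return total

var_x, var_n = 4.0, 1.0                 # assumed source and noise variances
h_y = entropy_numeric(var_x + var_n)    # h(Y) for y = x + n
h_y_given_x = entropy_numeric(var_n)    # h(Y|X) is the entropy of the noise
mi_numeric = h_y - h_y_given_x
mi_closed = 0.5 * math.log(1.0 + var_x / var_n)
print(mi_numeric, mi_closed)
```

The numerical difference of entropies agrees with the familiar closed form $\frac{1}{2}\ln(1 + \sigma_x^2/\sigma_n^2)$ for a Gaussian source in Gaussian noise.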

2D Capacity
To illustrate the consequence of not knowing the rotation imposed by the orthogonal transformation in the 2D case, figure 3 shows a message symbol set where each of the two transmitters can set one of four possible values.Thus a constellation containing 16 points may be realised at the receiver and the density of these points is determined by the additive noise.If the rotation is unknown but the amplitude levels are known then the receiver might obtain a message that looks something like figure 4 where the density of the rings is determined by the additive noise.
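The collapse of the constellation onto rings can be illustrated with a short noiseless sketch. The symbol levels ±1 and ±3 are an assumption (the text only states that each transmitter can set one of four values), and each point is given an independent random rotation to stand in for the unknown orthogonal transformation.

```python
import math
import random

levels = [-3.0, -1.0, 1.0, 3.0]       # assumed 4-level symbol set per antenna
constellation = [(a, b) for a in levels for b in levels]   # 16 points

def rotate(point, theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])

rng = random.Random(0)
radii_sq = set()
for point in constellation:
    theta = rng.uniform(0.0, 2.0 * math.pi)   # rotation unknown to the receiver
    y = rotate(point, theta)                  # noiseless received point
    radii_sq.add(round(y[0] ** 2 + y[1] ** 2, 9))

print(sorted(radii_sq))
```

With these assumed levels the 16 points fall on only three distinct radii, so a receiver that knows the amplitudes but not the rotation can distinguish at most the ring, not the point on it.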

2D Density Function
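We can construct the joint density function beginning with

$$p(y|x) = \frac{1}{2\pi\sigma_n^2} \exp\left(-\frac{(y_1 - x_1)^2 + (y_2 - x_2)^2}{2\sigma_n^2}\right)$$

and then letting $|x|^2 = x_1^2 + x_2^2$, $x_1 = |x| \cos\alpha$ and $x_2 = |x| \sin\alpha$, i.e. $|x|$ is the magnitude of the vector $[x_1\ x_2]^T$ and $\alpha$ is the angle of this vector relative to the $x_1$ axis. Similarly $|y|^2 = y_1^2 + y_2^2$, $y_1 = |y| \cos\phi$ and $y_2 = |y| \sin\phi$, where $|y|$ is the magnitude of the vector $[y_1\ y_2]^T$ and $\phi$ is the angle of this vector relative to the $y_1$ axis, so that

$$p(y|x) = \frac{1}{2\pi\sigma_n^2} \exp\left(-\frac{|y|^2 + |x|^2 - 2|x||y|\cos(\phi - \alpha)}{2\sigma_n^2}\right).$$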

x and V known
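In this case $V$ rotates the original vector $x_o$ through a known angle to a new, known $x$ and we can treat this case with the probability density function (pdf)

$$p(y|x) = \frac{1}{2\pi\sigma_n^2} \exp\left(-\frac{(y_1 - x_1)^2 + (y_2 - x_2)^2}{2\sigma_n^2}\right)$$

and the differential entropy is

$$h(Y|X) = \ln(2\pi e \sigma_n^2).$$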

A known, V unknown
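In this case $V$ rotates the original vector $x_o$ through an unknown angle $\gamma$ so that $x_1 = A \cos\gamma$ and $x_2 = A \sin\gamma$, giving the pdf

$$p(y|A, \gamma) = \frac{1}{2\pi\sigma_n^2} \exp\left(-\frac{|y|^2 + A^2 - 2A|y|\cos(\phi - \gamma)}{2\sigma_n^2}\right).$$

Now, with $\beta \triangleq \phi - \gamma$ and $p(\beta) = 1/(2\pi)$ for $0 \le \beta < 2\pi$, averaging over the unknown angle gives

$$p(y|A) = \frac{1}{2\pi\sigma_n^2} \exp\left(-\frac{|y|^2 + A^2}{2\sigma_n^2}\right) I_0\left(\frac{A|y|}{\sigma_n^2}\right)$$

where $I_0$ is the modified Bessel function of the first kind of order zero. At high enough SNR we may approximate the Bessel function as

$$I_0(z) \approx \frac{e^z}{\sqrt{2\pi z}}$$

so that

$$p(y|A) \approx \frac{1}{2\pi\sqrt{A|y|}} \cdot \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{(|y| - A)^2}{2\sigma_n^2}\right)$$

and the differential entropy is

$$h(Y|A) \approx \ln(2\pi A) + \frac{1}{2}\ln(2\pi e \sigma_n^2).$$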
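The quality of the high SNR approximation can be examined with a numerical sketch. It assumes that the amplitude-known, rotation-unknown density is the ring density $p(y|A) = \frac{1}{2\pi\sigma_n^2}\exp\left(-\frac{|y|^2 + A^2}{2\sigma_n^2}\right) I_0\left(\frac{A|y|}{\sigma_n^2}\right)$ obtained by averaging the 2D Gaussian over a uniform angle, and it compares the numerically integrated entropy with the ring-plus-radial-Gaussian value $\ln(2\pi A) + \frac{1}{2}\ln(2\pi e\sigma_n^2)$. The values of $A$ and $\sigma_n$ and all function names are illustrative; the Bessel function is evaluated from its integral representation because the Python standard library does not provide it.

```python
import math

def log_i0_scaled(z, steps=256):
    """ln(I0(z) * exp(-z)), using I0(z) = (1/pi) * integral_0^pi exp(z cos t) dt."""
    dt = math.pi / steps
    total = sum(math.exp(z * (math.cos((k + 0.5) * dt) - 1.0))
                for k in range(steps))
    return math.log(total * dt / math.pi)

def log_ring_density(r, amp, sigma):
    """ln p(y|A) at radius r = |y| (amplitude known, rotation unknown)."""
    var = sigma * sigma
    return (-math.log(2.0 * math.pi * var)
            - (r - amp) ** 2 / (2.0 * var)
            + log_i0_scaled(amp * r / var))

def ring_entropy(amp, sigma, steps=2000):
    """h(Y|A) = -integral p ln p over the plane, evaluated in polar form (nats)."""
    r_max = amp + 10.0 * sigma
    dr = r_max / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        log_p = log_ring_density(r, amp, sigma)
        total -= 2.0 * math.pi * r * math.exp(log_p) * log_p * dr
    return total

amp, sigma = 10.0, 1.0     # assumed values, SNR = A^2 / sigma^2 = 100 (20 dB)
h_numeric = ring_entropy(amp, sigma)
h_high_snr = (math.log(2.0 * math.pi * amp)
              + 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2))
print(h_numeric, h_high_snr)
```

At this SNR the two values should agree to within a few thousandths of a nat; lowering `amp` shows where the approximation starts to break down.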

x and V unknown
In this case we assume that we only have knowledge of the variance of $x$ and $n$ and hence the variance of $y$. With the components of both $x$ and $n$ treated as zero-mean Gaussian, the components of $y$ will also be zero-mean Gaussian with variance equal to the sum of the variances of $x$ and $n$, i.e. $y_i \sim \mathcal{N}(0, \sigma_y^2)$ where $\sigma_y^2 = \sigma_x^2 + \sigma_n^2$. The joint pdf for $y$ is

$$p(y) = \frac{1}{2\pi\sigma_y^2} \exp\left(-\frac{y_1^2 + y_2^2}{2\sigma_y^2}\right)$$

which leads us to the differential entropy

$$h(Y) = \ln(2\pi e \sigma_y^2).$$
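A Monte Carlo sketch of this case is given below: samples of $y = x + n$ are drawn in two dimensions and the entropy is estimated as the sample mean of $-\ln p(y)$, assuming the zero-mean 2D Gaussian density with per-component variance $\sigma_y^2 = \sigma_x^2 + \sigma_n^2$. The standard deviations, seed and sample count are illustrative choices.

```python
import math
import random

sigma_x, sigma_n = 2.0, 1.0            # assumed source and noise standard deviations
var_y = sigma_x ** 2 + sigma_n ** 2    # variance of each component of y

rng = random.Random(1)
samples = 100_000
acc = 0.0
for _ in range(samples):
    y1 = rng.gauss(0.0, sigma_x) + rng.gauss(0.0, sigma_n)
    y2 = rng.gauss(0.0, sigma_x) + rng.gauss(0.0, sigma_n)
    # ln p(y) for the zero-mean 2D Gaussian with variance var_y per component
    acc += -math.log(2.0 * math.pi * var_y) - (y1 * y1 + y2 * y2) / (2.0 * var_y)

h_monte_carlo = -acc / samples
h_closed = math.log(2.0 * math.pi * math.e * var_y)
print(h_monte_carlo, h_closed)
```

The estimate fluctuates around $\ln(2\pi e\sigma_y^2)$ with a standard error of roughly $1/\sqrt{\text{samples}}$ nats.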

Capacity
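The fully informed mutual information was defined in equation (8) and so when both $x$ and $V$ are given, with Gaussian distributions for the source and noise, we have the fully informed capacity

$$C_{FI} = h(Y) - h(Y|X) = \ln(2\pi e\sigma_y^2) - \ln(2\pi e\sigma_n^2) = \ln\left(1 + \frac{\sigma_x^2}{\sigma_n^2}\right).$$

Similarly, the partially informed mutual information was defined in equation (9) so that, when the rotation matrix is unknown, we obtain the partially informed capacity

$$C_{PI} = h(Y) - h(Y|A) \approx \ln(2\pi e\sigma_y^2) - \ln(2\pi A) - \frac{1}{2}\ln(2\pi e\sigma_n^2).$$

In a similar fashion we may derive the entropies and mutual information for the 3D case. The derivation is given in Appendix A, where we find that

$$C_{FI} = \frac{3}{2}\ln\left(1 + \frac{\sigma_x^2}{\sigma_n^2}\right)$$

and

$$C_{PI} \approx \frac{3}{2}\ln(2\pi e\sigma_y^2) - \ln(4\pi A^2) - \frac{1}{2}\ln(2\pi e\sigma_n^2).$$

ND Capacity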

High SNR Case
At high SNR we found that the partially informed probability density functions factored into two parts. The first part appears to have the form of a uniform density on the surface of an N-dimensional sphere.
The second part appears to represent a Gaussian distribution across an N-dimensional shell. Therefore $p(y|A)$ may be viewed as an N-dimensional, variable density shell with mean radius $A$. From Wikipedia ("Sphere") [8] the general equations for the surface area and volume of an N-dimensional sphere, with radius $A = \sqrt{\sum_i^N x_i^2}$, are given by:

$$\text{Surface Area} = \frac{2\pi^{N/2}}{\Gamma(N/2)} A^{N-1}$$

and

$$\text{Volume} = \frac{2\pi^{N/2}}{N\,\Gamma(N/2)} A^{N}.$$

Thus the required N-D, high SNR entropies may be written as:

$$h(Y) = \frac{N}{2}\ln(2\pi e\sigma_y^2), \qquad h(Y|X) = \frac{N}{2}\ln(2\pi e\sigma_n^2), \qquad h(Y|A) \approx \ln\left(\frac{2\pi^{N/2}}{\Gamma(N/2)} A^{N-1}\right) + \frac{1}{2}\ln(2\pi e\sigma_n^2).$$

The densities $p(y)$ and $p(y|x)$ could be pictured as N-dimensional probability spheres. Hence the fully informed capacity becomes the difference in entropy between an N-dimensional probability sphere, representing the signal plus noise vector distribution, and an N-dimensional sphere, representing the noise vector distribution. In the partially informed case the capacity becomes the difference in entropy between an N-dimensional probability sphere, representing the signal plus noise vector distribution, and an N-dimensional probability shell, representing the amplitude known plus noise distribution. The ND fully informed capacity may be written as

$$C_{FI} = \frac{N}{2}\ln\left(1 + \frac{\sigma_x^2}{\sigma_n^2}\right)$$

and the partially informed capacity may be approximated by

$$C_{PI} \approx \frac{N}{2}\ln(2\pi e\sigma_y^2) - \ln\left(\frac{2\pi^{N/2}}{\Gamma(N/2)} A^{N-1}\right) - \frac{1}{2}\ln(2\pi e\sigma_n^2).$$

Defining the signal-to-noise ratio as $\rho = A^2/\sigma_n^2$ and with x , then the capacities may be expressed as

since $MM^T = MM^{-1} = I$. So we are free to choose any rotation matrix and the integral will be unaffected. Let us choose $M$ such that $My = |y|[1, 0, \ldots, 0] = |y|e$, where $e$ is a unit vector, i.e. the vector $y$ is rotated to lie along the $y_1$ axis. Let $x' = (Mx)$, then we have

Hence, with $|x| = A$,

We may make a change of variable by letting z =

The general form for the density, given $A$, is therefore

The entropy calculation involves a multidimensional integration over the components in $y$:

With the general form for the differential entropies we are now able to derive the capacity for both the fully informed cases and the partially informed (amplitude only) cases. The capacities for dimensions two to five have been calculated for both cases and the results are presented in figures 5 and 6. Comparing the two figures we note that the partially informed curves have a smaller slope than their fully informed counterparts. If both receivers were operating with the same SNR then we could also make the observation that the partially informed values are always less than their fully informed counterparts. However, as indicated in section 2 earlier, due to the channel matrix inversion required by Eve and a possibly different local (local to the receivers) noise environment, this may not be the case.
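The sphere-and-shell picture for the high SNR case can be turned into a short numerical sketch. It is not the exact general-SNR calculation behind figures 5 and 6; it evaluates only the high SNR approximation, taking the shell entropy as the log surface area of an N-dimensional sphere of radius $A$ plus the entropy of a one-dimensional Gaussian across the shell. The relation $\sigma_x^2 = A^2/N$ between amplitude and per-component variance, the SNR value and the function names are assumptions made for illustration.

```python
import math

def log_sphere_surface(n, radius):
    """ln of the surface area 2 pi^(n/2) r^(n-1) / Gamma(n/2) of an n-dimensional sphere."""
    return (math.log(2.0) + 0.5 * n * math.log(math.pi)
            + (n - 1) * math.log(radius) - math.lgamma(0.5 * n))

def capacity_fully_informed(n, var_x, var_n):
    """h(Y) - h(Y|X) for i.i.d. Gaussian components, in nats."""
    return 0.5 * n * math.log(1.0 + var_x / var_n)

def capacity_partially_informed(n, amp, var_x, var_n):
    """High SNR sphere-minus-shell approximation of h(Y) - h(Y|A), in nats."""
    h_y = 0.5 * n * math.log(2.0 * math.pi * math.e * (var_x + var_n))
    h_shell = (log_sphere_surface(n, amp)
               + 0.5 * math.log(2.0 * math.pi * math.e * var_n))
    return h_y - h_shell

var_n = 1.0
rho = 1000.0                        # SNR rho = A^2 / sigma_n^2 (30 dB)
amp = math.sqrt(rho * var_n)
for n in range(2, 6):
    var_x = amp ** 2 / n            # assumption: E[A^2] = n * sigma_x^2
    c_fi = capacity_fully_informed(n, var_x, var_n)
    c_pi = capacity_partially_informed(n, amp, var_x, var_n)
    print(n, round(c_fi, 3), round(c_pi, 3))
```

Under these assumptions the partially informed value grows with $\ln\rho$ at a rate of one half regardless of N, while the fully informed value grows at N/2, which is consistent with the smaller slope noted for the partially informed curves.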

Summary
The problem of determining the information intercept capacity available to a receiving system which knows its channel matrix, but has no prior knowledge of a unitary transformation that has been applied at the transmitter, has been analysed. Entropy derivations were carried out for two dimensions and three dimensions, giving some insight into the general dimensional, high SNR case. The exact capacity for the N-Dimensional case has been obtained but requires numerical integration to derive the differential entropy for the partially informed case. The fully informed capacity has been likened to the difference in entropy between two N-dimensional probability spheres: the larger sphere, representing the distribution of the signal plus noise vector, and the smaller sphere, representing the distribution of the noise vector. At high SNR, the partially informed capacity was found to be equal to the difference in entropy between an N-dimensional probability sphere, representing the distribution of the signal plus noise vector, and an N-dimensional probability shell, representing the distribution of the amplitude plus noise vector.

Figure 2. Converting MIMO channel to parallel channel via SVD.

Figure 4. Received ring distribution caused by unknown rotation on message symbol set.