1. Introduction
In recent years, heterogeneous wireless networks, which can support high-density and high-rate traffic, have attracted much research interest from both academia and industry [1,2,3]. Cooperation between the macro base stations (BSs) and the femto BSs can greatly improve the quality of service (QoS) of the user equipment, as well as the spectrum efficiency and the energy efficiency [4]. Therefore, ultra-dense heterogeneous technology has been adopted as one of the potential solutions for the next-generation wireless sensor network [5].
On the other hand, the architecture of heterogeneous wireless networks poses many challenges to physical-layer security [6,7,8]. Several signal-level techniques, such as beamforming [9,10], artificial noise [11,12] and stochastic geometry approaches [13,14], have been proposed as classical solutions for secure multiple-input multiple-output (MIMO) communication. As an improvement, deep learning-based secure communication has been proposed to protect message transmission to the legitimate users [15]. However, due to noise, signal distortion and the delay of channel estimation, it is difficult to obtain ideal channel state information (CSI) for both the legitimate users and the eavesdroppers in most practical scenarios [16,17,18,19,20,21]. For example, the authors in [22,23,24] derived a new outage probability expression and optimized the transmission capacity under different assumptions.
Deep learning has already shown remarkable capabilities in dealing with complex problems in mobile communications, such as channel estimation, signal detection, modulation recognition and channel equalization. For example, Ref. [25] used a 2D image to represent the time-frequency channel fading matrix. By using a super-resolution network and a denoising network, the proposed scheme produces more accurate channel estimates. Under the constraint of one-bit quantization, the authors in [26] proposed a deep neural network-based auto-encoder for the OFDM receiver. A modulation recognition algorithm was considered in [27], where a convolutional neural network (CNN) followed by a long short-term memory network was adopted as the classifier to improve the robustness of modulation recognition. A CNN-based channel prediction scheme was designed in [28] for massive MIMO systems under channel aging effects, where an autoregressive network was used to model the temporal correlation of the wireless channel. For free-space optical communications, a pilot-independent deep learning-based channel estimator was proposed in [29,30]. Simulation results indicated that the proposed scheme performed close to the perfect channel estimation scheme.
Although there are some works on learning-based channel estimation algorithms, the impact of a CNN-based MIMO detector on physical-layer security performance with imperfect CSI is still an open question. Motivated by this, in this paper we discuss a deep learning-based secure MIMO communication algorithm for heterogeneous networks with imperfect CSI. In heterogeneous networks, there exist both macro BSs and femto BSs, which serve groups of users in different areas. In general, the macro BS provides massive connections and large-scale cell coverage in the hot spots. The femto BS helps extend the wireless coverage and increase the data rates of users at the network edge. In some practical scenarios, secure messages transmitted from the macro BS to the users may be intercepted by the femto BS. In this case, with the help of the zero-forcing algorithm [31], the macro BS chooses null-space eigenvectors to prevent information leakage to the femto BS. Classically, the receiver obtains the CSI through pilot signals transmitted from the base station, and there exists a time difference between the channel estimation and the data packet transmission. Thus, the estimated CSI is an imperfect version of the CSI at the instant of packet transmission, and the zero-forcing secure MIMO algorithm should be re-designed. In this paper, two deep CNN-based detectors are proposed, and a training set with imperfect CSI as well as the original messages or ideal CSI is fed to the CNN model to produce the refined CSI. Simulation results show that the proposed deep learning-based detectors outperform the classical maximum likelihood detector (MLD), especially when the correlation factor is small. The impacts of system parameters, such as the correlation factor and the number of antennas, are evaluated in different setup scenarios.
The main contributions of this paper are as follows:
We employ a deep learning-based technique for secure MIMO communications in heterogeneous networks, which exploits the benefits of the CNN learning model to produce more accurate CSI and to reduce the bit error rate (BER) of the receiver.
We provide a detailed framework of the deep learning-based detectors, where imperfect CSI as well as the original messages or ideal CSI are included in the training set, so that the detectors can be used in different application scenarios.
We present simulation results for the deep learning-based detectors in heterogeneous networks. With the help of the CNN technique, the proposed detectors show an obvious performance gain over the MLD with acceptable computational cost.
Notations: We use to represent the circularly symmetric complex Gaussian random variable with mean and variance , and denote the probability density function (PDF) and cumulative distribution function (CDF) of a random variable x, respectively, is a row vector consisting of all diagonal elements of , is the conjugate transpose of the , and denotes the wireless channel fading matrix from M to F.
2. Related Work
Considering the Gaussian wiretap channel, Fritschek et al. [32] introduced an auto-encoder to model the noisy wireless channel with a novel security loss function. The generative model of the auto-encoder was trained to encode a message such that the eavesdropper cannot decode it correctly. Results show that the proposed scheme learns a trade-off between legitimate communication rate and secrecy capacity. With the help of the channel’s statistical characteristics in relay networks, the authors in [33] proposed a new deep learning-based algorithm to design the secure beamforming vector. Considering visible light communication, Xiao et al. [34] proposed a deep reinforcement learning (DRL)-based secure communication strategy. Since the optimization of the system secrecy rate is non-convex and NP-hard, a suboptimal beamforming vector is obtained by introducing a zero-forcing beamforming gain to the eavesdropper.
With perfect CSI, a three-layer deep feedforward neural network (DFNN) was adopted for a time-slotted wireless powered system in [35]. In the proposed scheme, the system parameters, such as the time allocation factor, the power allocation factor and the rate of the wiretap channel, were produced by the DFNN. During the training phase, the output of the DFNN was compared with the optimal system parameters obtained from exhaustive search, and the mean squared error (MSE) was adopted as the loss function. Numerical results of the DFNN and the optimal parameters were provided to validate the proposed scheme. Using Stackelberg equilibria, the authors in [36] proposed a secure mobile crowd-sensing (MCS) scheme, where the DRL technique was adopted to derive the optimal MCS policy.
3. System Model
Figure 1 depicts the model of secure MIMO communications for heterogeneous networks. The considered system consists of a macro BS, a femto BS and a terminal user. It is assumed that all three types of nodes are equipped with multiple antennas, and the numbers of antennas are denoted as
,
and
, respectively. In the heterogeneous networks, the macro BS and the femto BS work cooperatively to provide wireless coverage. Specifically, the users located in the hot-spot areas are associated with the macro BS, and the users located at the network edge are served by the femto BS. We use
and
to denote the wireless channel fading matrices from the macro BS to the femto BS and to the users, respectively.
In addition, the channel fading matrix is modeled as time-varying and flat fading using the classical Jakes model [37]. Thus, the correlation coefficient between adjacent samples is given as
where
is the normalized Doppler frequency spread and
is the zero-order Bessel function of the first kind.
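As a quick numerical illustration of the relation described above, the following sketch assumes the standard Jakes form, in which the correlation coefficient between adjacent samples is the zero-order Bessel function of the first kind evaluated at 2π times the normalized Doppler frequency spread. The names `bessel_j0`, `jakes_correlation` and `f_d`, as well as the example Doppler values, are illustrative assumptions rather than the authors' notation. The Bessel function is approximated by its power series so that only the standard library is needed.

```python
import math


def bessel_j0(x, terms=30):
    """Zero-order Bessel function of the first kind via its power series.

    J0(x) = sum_{k>=0} (-1)^k / (k!)^2 * (x/2)^(2k). The truncated series is
    adequate for the small arguments produced by normalized Doppler values
    well below 1.
    """
    half_sq = (x / 2.0) ** 2
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -half_sq / (k * k)  # ratio of consecutive series terms
        total += term
    return total


def jakes_correlation(f_d):
    """Correlation between adjacent samples (assumed form: J0(2*pi*f_d))."""
    return bessel_j0(2.0 * math.pi * f_d)


if __name__ == "__main__":
    for f_d in (0.01, 0.05, 0.1):  # hypothetical normalized Doppler values
        print(f"f_d = {f_d:.2f} -> rho = {jakes_correlation(f_d):.4f}")
```

Under this assumed form, a smaller normalized Doppler frequency gives a correlation closer to one, i.e., a more slowly varying channel.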
Thus, the channel fading matrix can be calculated as
where
n denotes the sample time and
denotes the additive white noise matrix with the same size as
. The same equation can be applied to
as follows:
Please note that in the following sections, the sample time n is omitted without loss of generality.
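The time-varying channel described above can be simulated with a short sketch. It assumes a first-order update of the form H(n) = ρ·H(n−1) + sqrt(1−ρ²)·W(n), where W(n) has independent CN(0, 1) entries. The sqrt(1−ρ²) scaling, which keeps the entries at unit variance, and all function names are assumptions made for illustration, since the source only states that a white noise matrix of the same size is added.

```python
import math
import random


def complex_gaussian(rng):
    """Draw one CN(0, 1) sample: real and imaginary parts have variance 1/2."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def random_channel(rows, cols, rng):
    """Matrix (list of rows) with independent CN(0, 1) entries."""
    return [[complex_gaussian(rng) for _ in range(cols)] for _ in range(rows)]


def evolve_channel(h_prev, rho, rng):
    """One time step of the assumed model H(n) = rho*H(n-1) + sqrt(1-rho^2)*W(n)."""
    scale = math.sqrt(1.0 - rho * rho)
    return [[rho * h + scale * complex_gaussian(rng) for h in row] for row in h_prev]


def empirical_correlation(rho, steps=20000, seed=0):
    """Estimate E[h(n) * conj(h(n-1))] over all entries of a 2x2 channel."""
    rng = random.Random(seed)
    h = random_channel(2, 2, rng)
    acc, count = 0.0, 0
    for _ in range(steps):
        h_next = evolve_channel(h, rho, rng)
        for row_new, row_old in zip(h_next, h):
            for a, b in zip(row_new, row_old):
                acc += (a * b.conjugate()).real
                count += 1
        h = h_next
    return acc / count


if __name__ == "__main__":
    print(empirical_correlation(rho=0.9))
```

The empirical correlation between adjacent samples should land close to the chosen ρ, which is a simple sanity check on the assumed update rule.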
It is well known that deep learning networks can effectively capture the correlation features of the training data set. Spatial correlation or antenna correlation, which may enable dimensionality reduction, is an important challenge for MIMO systems [38,39]. It will be a direction of our future work.
Since the time-correlated MIMO channel model is adopted in this paper, we can use a deep CNN (DCNN) to obtain more accurate CSI from the outdated CSI. In fact, considering the phase rotation introduced by the channel matrix, the outdated CSI is necessary to assist the data recovery at the user.
Due to the open nature of the heterogeneous networks, the femto BS may intercept the signal transmitted from the macro BS to the users. To prevent information leakage to the femto BS, the macro BS can zero-force the equivalent channel matrix of the femto BS using the null-space technique. In this case, the macro BS first obtains the CSI
between the macro BS and the femto BS. Then the null-space eigenvectors can be produced by applying eigenvalue decomposition on the autocorrelation matrix, i.e.,
, that is
where
denotes the eigenvalue decomposition,
denotes the eigenvalues in ascending order and
are the corresponding eigenvectors. Considering the size limitations of both the femto BS and the users, it is reasonable to assume that the number of antennas at the macro BS is larger than those at the femto BS and the users, i.e.,
. Thus, the number of zero eigenvalues can be given as
Please note that
is also the number of null-space vectors for
, which is given as
where
denotes the first
column vectors of
. Thus, the beamforming matrix
, which is used by the macro BS to transmit messages to its associated user, lies in the null-space of
. Then the signal received at the terminal user can be expressed as
where
P is the transmission power of the macro BS,
is the original message transmitted from the macro BS with size
and
.
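The null-space step can be illustrated with a small dependency-free sketch. Note one deliberate substitution: instead of the eigenvalue decomposition described in the text, the sketch finds a basis of the null space by Gauss-Jordan elimination, which spans the same subspace but is not orthonormalized. The antenna numbers (four at the macro BS, two at the femto BS) and all names are hypothetical.

```python
import random


def null_space_basis(h, tol=1e-10):
    """Basis of {v : h v = 0} for a complex matrix h given as a list of rows.

    Uses Gauss-Jordan elimination as a stand-in for the eigenvalue
    decomposition in the text; the returned vectors are not orthonormalized.
    """
    rows = [list(r) for r in h]
    n_rows, n_cols = len(rows), len(rows[0])
    pivots = []  # column index of each pivot, in row order
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        p = max(range(r, n_rows), key=lambda i: abs(rows[i][c]))
        if abs(rows[p][c]) < tol:
            continue  # no pivot here: this column is a free column
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [v / piv for v in rows[r]]
        for i in range(n_rows):
            if i != r:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [0j] * n_cols
        v[free] = 1 + 0j
        for row_idx, pc in enumerate(pivots):
            v[pc] = -rows[row_idx][free]
        basis.append(v)
    return basis


def mat_vec(h, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in h]


def random_channel(n_rx, n_tx, rng):
    """n_rx x n_tx matrix with independent complex Gaussian entries."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_tx)]
            for _ in range(n_rx)]


if __name__ == "__main__":
    rng = random.Random(1)
    h_mf = random_channel(2, 4, rng)  # hypothetical: femto BS 2 antennas, macro BS 4
    beams = null_space_basis(h_mf)
    print("null-space dimension:", len(beams))
    print("max leakage:", max(abs(x) for b in beams for x in mat_vec(h_mf, b)))
```

For a full-rank channel, the number of basis vectors equals the number of macro BS antennas minus the number of femto BS antennas, and any beamforming vector built from them produces (numerically) zero signal at the femto BS.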
Since the equivalent channel fading matrix of the femto BS is zero-forced, we only need to observe the signal-to-noise ratio (SNR) of the associated user. We define the average normalized SNR received at the user as
Classically, the receiver obtains the wireless CSI through pilot signals transmitted from the BS, and there exists a time difference between the channel estimation and the data packet transmission. Thus, the estimated CSI is an imperfect version of the CSI at the instant of packet transmission. To reduce the analysis complexity, it is assumed that the estimation of
is perfect, while that of
is imperfect. Specifically, the imperfect CSI is modeled as
where
is the correlation factor of the imperfect version of the channel matrix.
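One common way to write such a correlated-estimate model is sketched below. It is an assumed form for illustration, not necessarily the authors' exact expression: the symbols \hat{\mathbf{H}}_{MU} (estimated CSI), \mathbf{H}_{MU} (true channel from the macro BS to the user), \rho_e (correlation factor) and \mathbf{E} (estimation error matrix) are all hypothetical names.

```latex
\hat{\mathbf{H}}_{MU} = \rho_e\,\mathbf{H}_{MU} + \sqrt{1-\rho_e^{2}}\;\mathbf{E},
\qquad [\mathbf{E}]_{ij} \sim \mathcal{CN}(0,1), \quad 0 \le \rho_e \le 1 .
```

Under this form, \rho_e = 1 corresponds to perfect CSI, while smaller values of \rho_e give an estimate that is less correlated with the true channel.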
The standard maximum likelihood detector (MLD) with imperfect CSI can be employed to detect the original message as
where
is the set of all possible constellation points.
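The detection rule above amounts to an exhaustive search over all candidate messages. A minimal sketch follows, assuming unit-energy QPSK and an effective channel matrix `h_eff` that folds the power scaling, the channel estimate and the beamforming matrix into one matrix. The function names and the example channel values are hypothetical.

```python
import itertools
import math

# Unit-energy QPSK constellation (QPSK is the modulation used in the simulations).
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]


def mat_vec(h, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in h]


def ml_detect(y, h_eff, constellation=QPSK):
    """Return the candidate vector x minimizing ||y - h_eff x||^2.

    h_eff stands for the (possibly imperfect) effective channel seen by the
    user; the search cost grows exponentially with the number of streams.
    """
    n_streams = len(h_eff[0])
    best, best_metric = None, float("inf")
    for cand in itertools.product(constellation, repeat=n_streams):
        r = mat_vec(h_eff, cand)
        metric = sum(abs(yi - ri) ** 2 for yi, ri in zip(y, r))
        if metric < best_metric:
            best, best_metric = cand, metric
    return list(best)


if __name__ == "__main__":
    h_eff = [[1 + 0.5j, 0.2 - 0.1j], [0.3j, 0.9 - 0.4j]]  # hypothetical 2x2 channel
    x = [QPSK[1], QPSK[2]]
    y = mat_vec(h_eff, x)  # noise-free reception
    print(ml_detect(y, h_eff) == x)
```

When the detector is given an outdated `h_eff` instead of the true one, the distance metric is computed against the wrong hypothesis points, which is the source of the BER degradation that the DCNN-based refinement targets.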
4. Deep CNN-Based Detector
The imperfect CSI, which is introduced by the noise or delay of channel estimation, will greatly deteriorate the system performance of standard MLD. To overcome the effects of the imperfect CSI in secure MIMO communications, two types of deep CNN (DCNN)-based detectors are proposed in this section, which can be used in different application scenarios. The deep CNN models are first trained with predefined loss functions and then used to generate the refined CSI
, which can be fed to the MLD to obtain the original message, i.e.
The details of the DCNN model are given in Figure 2, where there are N one-dimensional convolutional layers excluding the input layer. In the input layer, the channel fading matrix
is reshaped as a column vector. Since only real data can be processed in the CNN model, the complex data of
can be treated as two real channels [40]. Please note that each convolutional layer except the output layer is followed by a ReLU activation function. Moreover, in the n-th convolutional layer, there are
feature maps with filter length
. Specifically, in the output layer, there is only one feature map.
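The input preparation described above (reshaping the complex channel matrix into a column and splitting it into two real channels) can be sketched as follows. The row-major ordering and the function names are assumptions made for illustration, since the source does not specify the layout.

```python
def to_two_real_channels(h):
    """Flatten a complex matrix (list of rows) into a column of [real, imag] pairs.

    The result has shape (rows*cols, 2): one position per matrix entry and two
    real-valued channels, a common input layout for a 1-D CNN.
    """
    return [[entry.real, entry.imag] for row in h for entry in row]


def to_complex_matrix(column, n_rows, n_cols):
    """Inverse of to_two_real_channels (row-major order is an assumption)."""
    flat = [complex(re, im) for re, im in column]
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


if __name__ == "__main__":
    h = [[1 + 2j, 3 - 1j], [0.5j, -2 + 0j]]
    col = to_two_real_channels(h)
    print(len(col), "positions x", len(col[0]), "channels")
    print(to_complex_matrix(col, 2, 2) == h)
```

The inverse mapping is what allows a real-valued network output to be turned back into a complex refined channel matrix before it is passed to the MLD.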
As to the detailed architecture of the DCNN, we must find a trade-off between complexity and performance. A fully connected DNN may achieve better performance, but its computational complexity is proportional to the square of the number of nodes. Moreover, both the training data set and the training time required by a fully connected DNN are too large to be practical. There also exist some powerful CNN models, such as VGG [41] and ResNet [42], which improve the detection probability by increasing the depth of the models to 19 and 34 layers, respectively. Specifically, the number of parameters of VGG-19 is up to 144M. Thus, to decrease the computational complexity, we simplify the classical CNN models as follows. The architecture of the DCNN model includes N=4 layers, and is described by
and
as
.
In the following sections, two different methods, denoted as DCNN type-I and type-II, are introduced to train the DCNN model.
4.1. DCNN Type-I: Training with Accurate CSI
Figure 3 shows the first training method of the DCNN-based detector, which uses accurate CSI and is denoted as DCNN type-I. In the training phase, a data set including both the imperfect CSI
and the accurate CSI
is fed into the learning model. The loss function is defined as the mean square error between the output of the model
and the accurate CSI
, i.e.
During the DCNN model training, the loss function is calculated batch by batch and used to optimize the weights and biases of the DCNN model [43]. Please note that DCNN type-I requires the accurate CSI to calculate the model loss function. Thus, the application of DCNN type-I is limited, because in some practical scenarios it is difficult to obtain the accurate CSI, especially in wireless MIMO communications. Therefore, another type of DCNN is proposed to overcome this limitation.
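The type-I loss can be sketched as a mean squared error between the refined and the accurate channel matrices, averaged over a batch. The normalization (averaging over complex entries) and the function names are assumptions, since the source does not state how the error is normalized.

```python
def mse_loss(h_refined, h_accurate):
    """Mean squared error between two complex matrices of equal shape.

    Averages |difference|^2 over all entries; normalizing by the number of
    complex entries is one possible choice, not necessarily the authors'.
    """
    diffs = [abs(a - b) ** 2
             for row_r, row_a in zip(h_refined, h_accurate)
             for a, b in zip(row_r, row_a)]
    return sum(diffs) / len(diffs)


def batch_loss(batch):
    """Average the per-sample loss over a batch of (refined, accurate) pairs."""
    return sum(mse_loss(r, a) for r, a in batch) / len(batch)


if __name__ == "__main__":
    accurate = [[1 + 1j, 2 + 0j], [0j, -1j]]
    refined = [[1 + 1j, 2 + 0j], [0j, 1j]]  # hypothetical network output
    print(batch_loss([(accurate, accurate), (refined, accurate)]))
```

In an actual training loop this scalar would be handed to the optimizer once per batch. The sketch only shows how the loss value is formed.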
4.2. DCNN Type-II: Training with Original Message
The training architecture of DCNN type-II is given in Figure 4, where accurate CSI is not needed. Instead, the output of the DCNN
is used in the MLD to obtain the likelihood of each candidate message as follows:
By using the SoftMax function, the normalized likelihood probability of each candidate message can be given as
We use to denote the correct probability of each candidate message. That is, if the i-th candidate message is correct, otherwise . Inspired by information theory, the cross-entropy can be used to quantify the difference between two probability vectors.
Accordingly, for probability distributions of
and
, we can calculate the cross-entropy as follows:
Then, the cross-entropy can be used as the loss function to train the DCNN model.
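The type-II loss chain (candidate likelihoods, SoftMax normalization, cross-entropy against the correct message) can be sketched as below. Two assumptions are made for illustration: the score of each candidate is taken as its negative squared distance to the received signal, and the target vector is one-hot on the transmitted message. All names and the example distances are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of real scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_probabilities(distances):
    """Map squared distances to probabilities: smaller distance, larger probability.

    Using the negative squared distance as the score is an assumption; it
    matches a Gaussian-noise likelihood up to a scale factor.
    """
    return softmax([-d for d in distances])


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(q * math.log(p + eps) for q, p in zip(target, probs))


if __name__ == "__main__":
    distances = [0.1, 2.0, 2.5, 4.0]  # hypothetical ||y - h x_i||^2 values
    probs = candidate_probabilities(distances)
    print(cross_entropy([1, 0, 0, 0], probs))
```

The loss is small when the refined CSI places the transmitted message closest to the received signal, so minimizing it pushes the DCNN toward channel estimates that make the MLD decide correctly, without ever needing the accurate CSI.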
Compared with the loss function in (12), only the received signal y, the beamforming matrix
and the original message x are needed, which broadens the application scenarios of DCNN type-II. On the other hand, without the help of the accurate CSI, DCNN type-II performs worse than type-I, as validated by the simulation results.
Please note that the DCNN model could also output the ground-truth symbol x directly in a supervised manner, and the outdated MIMO channel matrix could be further employed as side information by feeding it to the DCNN. We use DCNN type-III to denote this new DCNN model. Although the detailed architectures of DCNN type-II and the suggested DCNN type-III are different, they are functionally equivalent when viewed as a black box with DCNN kernels. In other words, the MLD module in DCNN type-II can be seen as part of the functions of DCNN type-III.
5. Simulation Results
In this section, simulation results are provided to verify the proposed DCNN models. The impacts of system parameters, such as the correlation factor of imperfect CSI , the normalized Doppler frequency and the number of antennas, are evaluated in different setup scenarios. Since the equivalent channel fading matrix of the femto BS is zero-forced, the BER of users with different detectors is used to evaluate the system performance.
Specifically, QPSK modulation is employed in all simulation setups. Since the QPSK constellation is a 2-D complex signal, the setup can be generalized to an arbitrary modulation order. We set the data packet length to 600 bits, and each batch consists of 10 data packets. During the training phase, a training data set with 10000 batches and a validation data set with 1000 batches are fed to the DCNN, and a test data set with 1000 batches is used to evaluate the BER performance of the proposed schemes. The popular TensorFlow framework [44] is adopted in our simulations, and the adaptive moment estimation (Adam) optimizer is used to minimize the loss during the training phase. In particular, the optimization parameters are listed as follows: learning rate is
,
,
,
. Since our testbed is under construction at this moment, we will present experimental results on real datasets in future work.
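To make the data set sizes quoted above concrete, the following short calculation converts batches into bits and QPSK symbols. The packet length, batch size and batch counts are taken from the text; the constant and function names are illustrative.

```python
BITS_PER_PACKET = 600      # data packet length, from the text
PACKETS_PER_BATCH = 10     # packets per batch, from the text
BITS_PER_QPSK_SYMBOL = 2   # QPSK carries two bits per symbol


def dataset_size(n_batches):
    """Return (bits, QPSK symbols) contained in n_batches batches."""
    bits = n_batches * PACKETS_PER_BATCH * BITS_PER_PACKET
    return bits, bits // BITS_PER_QPSK_SYMBOL


if __name__ == "__main__":
    for name, n in (("training", 10000), ("validation", 1000), ("test", 1000)):
        bits, symbols = dataset_size(n)
        print(f"{name}: {bits} bits, {symbols} QPSK symbols")
```

The size of the test set bounds the lowest BER that can be estimated reliably, which is worth keeping in mind when reading the high-SNR end of the BER curves.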
Figure 5 depicts the BER performance versus the SNR of the system with MIMO setup as
, the normalized Doppler frequency
, and the correlation factor of imperfect CSI
. The BER of three types of detectors, namely the standard MLD, DCNN type-I and DCNN type-II, is compared. As a benchmark, the BER curve obtained by the MLD with perfect CSI is also presented. We can see from this figure that the outdated CSI has obvious adverse effects on BER performance. In the high SNR region, the DCNN-based detectors show a performance gain of about 4 dB in comparison to the standard MLD. The reason is that the former can refine the imperfect channel matrix and produce more accurate CSI, which improves the BER of the system.
Moreover, the two types of DCNN-based detectors show almost the same performance, with only a slight gap. Similar results can be obtained from Figure 6 and Figure 7, where the correlation factors of imperfect CSI are
and
, respectively. However, DCNN type-I requires the accurate CSI to calculate the model loss function. Thus, the application of DCNN type-I is limited, because in some practical scenarios it is difficult to obtain the accurate CSI, especially in wireless MIMO communications. In contrast, DCNN type-II needs only the received signal y, the beamforming matrix
and the original message x, which broadens its application scenarios. On the other hand, without the help of the accurate CSI, DCNN type-II performs worse than type-I.
Figure 8 depicts the BER performance versus the correlation factor of imperfect CSI
, where the average SNR of the system is 20 dB. The BER of three types of detectors, namely the standard MLD, DCNN type-I and DCNN type-II, is compared. As shown in this figure, the DCNN-based detectors achieve considerable performance gains relative to the standard MLD. This is because the DCNN-based detectors can refine the imperfect channel matrix and produce more accurate CSI, which improves the BER of the system. Moreover, the two types of DCNN-based detectors show almost the same performance.
The effect of the normalized Doppler frequency
is presented in Figure 9, where the MIMO configuration remains the same as in the previous setup. The BER curves of both DCNN training models are provided with
and
, respectively. We can see from this figure that a smaller
produces better performance, with a gain of about 4 dB. The reason is that if
is smaller, the channel fading matrix changes more slowly, and the wireless channel can be learned more efficiently by the DCNN. As a result, more accurate CSI can be produced, which enhances the BER performance.
Figure 10 and Figure 11 depict the effects of the number of antennas on BER performance with the normalized Doppler frequency
, and the correlation factor of imperfect CSI
. Specifically, the antenna configuration in Figure 10 is
, and the number of data streams
. In other words, the spectrum efficiency is higher than in the previous setup. In Figure 11, the antenna configuration is
and
, respectively. We can see from the two figures that the performance gain of the DCNN with a larger number of user antennas is obvious compared with that with a smaller number, especially in the high SNR region. Specifically, with
, the performance gain of the DCNN over the standard MLD is about 6 dB. The performance gain of
is about 8 dB over that of
. The reason is that a larger number of receive antennas introduces more degrees of freedom for spatial diversity, and thus the BER decreases greatly.