Deep Learning Based Successive Interference Cancellation Scheme in Nonorthogonal Multiple Access Downlink Network

Energies 2020, 13, 6237

Abstract: In this paper, a deep learning-based successive interference cancellation (SIC) scheme for use in nonorthogonal multiple access (NOMA) communication systems is investigated. NOMA has become a notable technique in the field of mobile wireless communication because it relaxes the orthogonality requirement of a conventional orthogonal frequency division multiple access (OFDMA) communication system. In NOMA communication systems, SIC is one of the decoding schemes applied at receivers for downlink NOMA transmissions. In this paper, a convolutional neural network (CNN)-based SIC scheme is proposed to improve the performance of the single base station and multiuser NOMA scheme. In contrast to existing SIC schemes, the proposed CNN-based SIC scheme can effectively mitigate losses resulting from imperfections of the SIC. The simulation results indicate that the CNN-based SIC method can successfully relieve conventional SIC impairments and achieve good detection performance. Consequently, a CNN-based SIC scheme can be considered as a potential technique for use in NOMA detection schemes.


Introduction
Since the nonorthogonal multiple access (NOMA) system was proposed, related issues have been actively studied. Through 5G and 6G communication systems, NOMA can improve the quality of wireless mobile systems [1]. In this paper, the power-domain NOMA system is considered with regard to its capacity to distribute signals through power allocation according to channel state information (CSI). The conventional power-domain NOMA communication systems have successive interference cancellation (SIC) schemes for decoding nonorthogonal signals. In the downlink communication of NOMA, the transmitted signal from the base station (BS) is a multiplexed superposition signal with power allocation which depends on CSI. In power-domain NOMA, the BS transmits signals with low power allocation to users with strong channel states and vice versa. The users in the NOMA process select the maximum correlation to decode the strongest signal and subtract it from the received signal, then iterate until the user can decode the desired information [2].
Conventional research in wireless communication systems has focused on minimizing computation costs resulting from limitations in computation [3]. Recently, however, with the improvement of the
1. We design a CNN-based SIC scheme for downlink NOMA communication systems. The proposed scheme can be used instead of implementing a conventional SIC scheme.
2. We apply a CNN to address the imperfections of SIC that are not considered in the conventional NOMA communication system. The proposed SIC scheme can mitigate the losses caused by imperfect SIC and thereby improve the sum rate of the decoded signal.
3. We provide simulation results with various key deep learning parameters. These parameters include the number of users, number of epochs, power allocation, modulation type, learning rate and batch size. The proposed scheme performed better than the conventional SIC scheme in various simulations.
The rest of this paper is organized as follows. Section 2 describes the proposed NOMA system model. In Section 3, the imperfections of conventional SIC in contrast to the CNN-based SIC scheme are indicated. The simulation results and conclusions are presented in Sections 4 and 5, respectively.

System Model
In this section, it is assumed that the channel follows a Rayleigh fading distribution with additive white Gaussian noise (AWGN). NOMA can transmit signals within the same domain (time, frequency) without requiring orthogonality between users. The utilization of power in power-domain NOMA, as considered in this paper, depends on the CSI of users to enhance the sum rate performance of communications. Figure 1 shows the proposed NOMA system architecture. In a conventional downlink system, users must occupy orthogonal resources. When the number of transmit antennas for the BS is greater than the number of receive antennas for users, interuser interference can be ignored [13]. However, in NOMA systems, the number of transmit antennas is always less than the number of receive antennas. In this paper, it is assumed that the proposed NOMA scheme has small user clusters in order to ignore interuser interference.
Figure 2. System block diagram of NOMA with a single base station and two users.
In Figure 2, a block diagram of the proposed NOMA system is presented for the case of a single BS and two users. Based on the distance between the BS and each user, the BS transmits a nonorthogonal signal with weak power to user 1 and a signal with strong power to user 2. User 1 performs the SIC operation to cancel the signal of user 2, then acquires the signal sent to them. User 2 can acquire the signal sent to them because the signal to user 1 appears as noise due to channel attenuation.
It is assumed that the number of users in the system is $K$. The signal sent to user $k$ can be denoted as $s_k(t)$, $(k = 1, 2, \cdots, K)$. At the BS, the power allocated to signal $s_k(t)$ is denoted as $p_k$ [14]. The nonorthogonal signal transmitted by the BS can be expressed as follows:
$$x(t) = \sum_{k=1}^{K} \sqrt{p_k}\, s_k(t) \qquad (1)$$
In addition, the signal received through the Rayleigh fading channel can be expressed as follows:
$$y(t) = h(t)\, x(t) + n_0(t) \qquad (2)$$
where $h(t)$ is the Rayleigh fading channel response and $n_0(t)$ is the AWGN. Conventional NOMA systems usually use an SIC scheme for decoding signals and have been shown to be multiple-access schemes in downlink [2]. In the following section, the characteristics of conventional NOMA with SIC are described, then the proposed CNN-based SIC scheme for NOMA systems is discussed.
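The two-user case of Figure 2 can be illustrated with a minimal sketch. It assumes BPSK symbols, amplitude scaling by the square root of the allocated power, real-valued channel gains known at the receivers, and a small noise sample; the power split, gains and function names are hypothetical values chosen for illustration, not parameters from the paper.

```python
import math
import random

def superpose(symbols, powers):
    """Transmit sample x = sum_k sqrt(p_k) * s_k (assumed amplitude scaling)."""
    return sum(math.sqrt(p) * s for s, p in zip(symbols, powers))

def bpsk_decide(value):
    """Hard decision for BPSK symbols in {-1, +1}."""
    return 1.0 if value >= 0 else -1.0

def sic_user1(y, h, p_strong):
    """User 1: decode user 2's high-power symbol, cancel it, then decode its own."""
    s2_hat = bpsk_decide(y / h)                       # strongest signal first
    residual = y - h * math.sqrt(p_strong) * s2_hat   # subtract regenerated signal
    return bpsk_decide(residual / h), s2_hat

rng = random.Random(0)
p1, p2 = 0.2, 0.8             # hypothetical split: weak power to user 1, strong to user 2
s1, s2 = rng.choice([-1.0, 1.0]), rng.choice([-1.0, 1.0])
x = superpose([s1, s2], [p1, p2])

h1, h2 = 0.9, 0.3             # hypothetical real gains: near user 1, far user 2
y1 = h1 * x + rng.gauss(0.0, 0.01)   # received sample at user 1
y2 = h2 * x + rng.gauss(0.0, 0.01)   # received sample at user 2

s1_hat, s2_hat = sic_user1(y1, h1, p2)   # user 1 applies SIC
s2_direct = bpsk_decide(y2 / h2)         # user 2 treats user 1's signal as noise
```

User 2 never runs the cancellation step: the low-power component intended for user 1 only shifts the received sample slightly, so a direct decision suffices in this setting.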

Conventional NOMA and Successive Interference Cancellation (SIC) Scheme
In Figure 3, a conventional SIC scheme is depicted. At user $k$, the SIC scheme is iterated until the intended signal is decoded. If $p_1$ is larger than the power allocated to the other signals, the signal for user 1 can be decoded directly while the other signals are treated as noise. The throughput of user 1, $R_1$, is given by Equation (3), where $n_0$ is the AWGN and $h_i$ is the channel impulse response of user $i$.
For the user $k \in [1, K]$, if the first $k - 1$ users are perfectly decoded, the throughput for user $k$ is given by Equation (4). Then, the throughput for user $K$ is given by Equation (5).
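For reference, the rate expressions commonly used for power-domain NOMA with SIC are sketched below. They are an assumption consistent with the definitions of $p_k$, $h_k$ and $n_0$ in this section (with $n_0$ read as the noise power and user 1 decoded first), and may differ in detail from Equations (3)-(5).

```latex
\begin{align}
R_1 &= \log_2\!\left(1 + \frac{p_1 |h_1|^2}{\sum_{i=2}^{K} p_i |h_1|^2 + n_0}\right), \\
R_k &= \log_2\!\left(1 + \frac{p_k |h_k|^2}{\sum_{i=k+1}^{K} p_i |h_k|^2 + n_0}\right), \\
R_K &= \log_2\!\left(1 + \frac{p_K |h_K|^2}{n_0}\right).
\end{align}
```

In this form, the interference seen by user $k$ contains only the signals that have not yet been cancelled, which is why the last user $K$ is limited by noise alone under perfect SIC.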

Imperfections of Conventional SIC
In recent studies on NOMA systems, it is usually assumed that the SIC scheme can decode signals perfectly. This assumption may not be practical because the decoded signals may be cancelled incorrectly [15]. Even if the signals are decoded correctly, the signals regenerated by the SIC scheme may not be perfectly matched due to various impairment factors [16]. It is well known that the errors caused by imperfect SIC schemes degrade system performance [17]. The conventional SIC schemes used in these studies do not typically take into consideration the errors that occur when the signal cancellation is repeated. This is a key problem to be resolved in NOMA systems with SIC schemes.
Let $s_k$ denote the signal intended for the user $k$. The BS modulates the signal $s_k$ according to the power allocation defined by the CSI. The signal received by user $k$ from the BS, as expressed in Equation (2), can be redefined as in Equation (6). It is assumed that at the receiver of user $k$, the signals $s_1, \ldots, s_{k-1}$ are not perfectly decoded. The SIC scheme is iterated until the desired signal $s_k$ is decoded. The signal decoded by user $k$ is given by Equation (7), where $\hat{y}_k$ is the signal received by user $k$ without the signals of other users, $\hat{s}_1, \hat{s}_2, \ldots, \hat{s}_{k-1}$.
In this paper, the goal is to optimize the transmission of signals through a deep learning-based SIC scheme for NOMA systems. The purpose of the proposed scheme is to minimize the total mean square error between the transmitted and decoded signals. The optimization problem is written as Equation (8), where $F_k(\cdot)$ indicates the processing functions of the CNN-based SIC scheme. Note that Equation (8) is nonconvex and nonlinear due to the nonconvexity and nonlinearity of the function $F_k(\cdot)$. Therefore, it is very difficult to solve Equation (8) numerically.
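The effect of an incorrect intermediate decision can be made concrete with a small numerical sketch. It assumes, as one plausible reading of the cancellation step, that user $k$ receives $h_k \sum_i \sqrt{p_i}\, s_i$ and subtracts $h_k \sqrt{p_i}\, \hat{s}_i$ for each previously decoded user. Noise is omitted, and the channel gain, power allocation and BPSK symbols are hypothetical values.

```python
import math

def cancel(y, h, powers, decided):
    """Subtract the regenerated signals of already-decoded users from y."""
    return y - h * sum(math.sqrt(p) * s for p, s in zip(powers, decided))

h = 1.0
powers = [0.6, 0.3, 0.1]      # hypothetical allocation, user 1 strongest
symbols = [1.0, -1.0, 1.0]    # transmitted BPSK symbols s_1, s_2, s_3
y = h * sum(math.sqrt(p) * s for p, s in zip(powers, symbols))

# User 3 must remove s_1 and s_2 before decoding its own symbol.
clean = cancel(y, h, powers[:2], symbols[:2])   # perfect SIC
wrong = cancel(y, h, powers[:2], [1.0, 1.0])    # s_2 decided incorrectly

desired = h * math.sqrt(powers[2]) * symbols[2]
residual_error = wrong - desired                # left-over interference
```

With these numbers the residual equals $-2\sqrt{0.3}$, which is larger in magnitude than the desired term $\sqrt{0.1}$, so a single wrong decision on $s_2$ flips the sign seen by user 3. This is the error propagation that the imperfect-SIC discussion refers to.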

Convolutional Neural Network (CNN)-Based SIC Scheme
In the proposed scheme, depicted in Figure 4, a CNN approach is employed as a deep learning approximator solution that can empirically approximate any function through the use of neural networks.
The processing function of user $k$ can be reconstructed as in Equation (9), where $\forall j \le k$, $\forall k$. $\varphi_k^j(\cdot)$, $U_k^j(\cdot)$ and $c_k$ represent the activation function, the weight of node $k$ of layer $j$ and the bias vectors, respectively.
To solve Equation (9), the CNN-based SIC scheme is trained to minimize the loss function $\mathcal{L}(\theta)$ in Equation (10), where $\theta$ indicates the weight and bias sets of the $k$-th SIC scheme and $S_k$ is the set of training samples of the signal.
In the training process, the input and the output are the signal received by user $k$ and the estimated signals of other users, respectively. The testing mode is activated after the CNN has been trained well. The performance of the proposed scheme is evaluated in this step. Training and testing algorithms can be represented as in Algorithms 1 and 2.
5: Return $\mathcal{L}(\theta)$.
The proposed CNN-based SIC scheme is shown in Figure 5. A primary building block of the proposed scheme consists of two convolutional layers followed by a max pooling layer. Next, a set of two convolution and pooling layers are stacked, followed by a set of two dense layers that have 256 and 128 neurons, respectively. A SoftMax classifier is the last layer of the proposed model. The dropout rate is set to 40% at the dense layers in order to reduce overfitting.
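The layer stack described above can be traced with a small framework-free sketch. Only the ordering of the blocks, the dense layer widths (256 and 128), the SoftMax output and the 40% dropout come from the text. The one-dimensional layout, input size, kernel size, filter counts, pooling window, number of classes and all class names are assumptions made for illustration, and the second block is read here as two convolutions followed by one pooling layer.

```python
from dataclasses import dataclass

@dataclass
class Conv1D:
    filters: int
    kernel: int = 3          # assumed kernel size with 'same' padding

@dataclass
class MaxPool1D:
    size: int = 2            # assumed pooling window

@dataclass
class Dense:
    units: int
    dropout: float = 0.0
    activation: str = "hidden"

def build_layers(num_classes):
    """Two conv-conv-pool blocks, dense layers of 256 and 128 units, SoftMax output."""
    block1 = [Conv1D(32), Conv1D(32), MaxPool1D()]
    block2 = [Conv1D(64), Conv1D(64), MaxPool1D()]
    head = [Dense(256, dropout=0.4), Dense(128, dropout=0.4),
            Dense(num_classes, activation="softmax")]
    return block1 + block2 + head

def trace_shapes(layers, length, channels):
    """Return the output shape after every layer for a (length, channels) input."""
    shapes = []
    shape = (length, channels)
    for layer in layers:
        if isinstance(layer, Conv1D):
            shape = (shape[0], layer.filters)           # 'same' padding keeps the length
        elif isinstance(layer, MaxPool1D):
            shape = (shape[0] // layer.size, shape[1])  # pooling shortens the sequence
        else:
            shape = (layer.units,)                      # dense layers act on flattened input
        shapes.append(shape)
    return shapes

layers = build_layers(num_classes=4)                   # e.g. four QPSK symbols (assumed)
shapes = trace_shapes(layers, length=64, channels=2)   # assumed I/Q input of 64 samples
```

Tracing shapes this way is a quick check that each pooling stage halves the sequence length before the dense layers take over; an actual implementation would express the same stack in a deep learning framework.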

Activation Functions for CNN-Based SIC Scheme
The activation function determines the output of each layer in the CNN, and most recent studies use a nonlinear activation function. The shape of the activation function has a significant impact on the computation costs of backpropagation and on feature point extraction. The most used activation functions in recent studies are as follows:
$$\varphi_{\mathrm{sigmoid}}(x) = \frac{1}{1 + e^{-x}} \qquad (11)$$
$$\varphi_{\tanh}(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (12)$$
$$\varphi_{\mathrm{ReLU}}(x) = \max(0, x) \qquad (13)$$
$$\varphi_{\mathrm{LReLU}}(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases} \qquad (14)$$
$$\varphi_{\mathrm{ELU}}(x) = \begin{cases} x, & x > 0 \\ \beta\,(e^{x} - 1), & x \le 0 \end{cases} \qquad (15)$$
where $\varphi_{\mathrm{sigmoid}}(x)$, $\varphi_{\tanh}(x)$, $\varphi_{\mathrm{ReLU}}(x)$, $\varphi_{\mathrm{LReLU}}(x)$ and $\varphi_{\mathrm{ELU}}(x)$ represent the sigmoid, tanh, rectified linear unit (ReLU), Leaky ReLU (LReLU) and exponential linear unit (ELU) functions, respectively. In Equations (14) and (15), $\alpha$ and $\beta$ are predefined parameters that control the slope and the saturation value for negative inputs, respectively.
The sigmoid and tanh functions have an advantage in classification but a disadvantage in complexity caused by deeper layers. The ReLU function has an advantage in computational cost compared to the sigmoid and tanh functions, but it has an inactive neuron problem when the negative bias is updated (i.e., dying ReLU). The LReLU and ELU functions maintain the negative bias values after bias updates but have disadvantages in terms of computation costs. In this paper, the activation functions in Equations (11)-(15) are applied to training and testing processes in order to find the optimal function [18]. To simplify the calculation, it is assumed that the bias vector $c_k$ is 0. Assuming a bias of 0, the function for user $k$ given by Equation (9) can be modified as in Equation (16).
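The five activation functions compared in this section can be written out as a short scalar sketch using their standard definitions. The default values of `alpha` and `beta` are assumptions, since the values used in the simulations are not stated here.

```python
import math

def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Rectified linear unit: zero for negative inputs."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: small slope alpha for negative inputs (alpha is assumed)."""
    return x if x > 0 else alpha * x

def elu(x, beta=1.0):
    """ELU: saturates towards -beta for very negative inputs (beta is assumed)."""
    return x if x > 0 else beta * (math.exp(x) - 1.0)
```

Evaluating these at a negative input shows the difference that matters for the dying ReLU problem: `relu(-2.0)` returns zero, so no gradient flows, while `leaky_relu(-2.0)` and `elu(-2.0)` return small negative values.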

Simulation Results
In this section, we provide details on the simulations of the proposed CNN-based SIC scheme with the parameter settings given in Table 1. In Figure 6, the relationship between bit error ratio (BER) and signal-to-noise ratio (SNR) for the proposed SIC scheme is investigated in a system with four users. In the figure, the perfect SIC curve denotes an ideal case of perfect separation among users. It is confirmed that the BER performance indicated by this latter curve gradually deteriorates with the number of users due to the error propagation of the SIC scheme.
In Figure 7, the relationship between the mean square error (MSE) and SNR for the proposed scheme is investigated for different batch sizes. The learning rate is set at 0.11. The MSE decreases with the SNR, and smaller batch sizes perform better than larger ones in the higher SNR range. However, a small batch size may cause unstable convergence behavior. Hence, the batch size selection may be a crucial issue for the proper operation of the proposed NOMA scheme. For a good trade-off between complexity and convergence, the batch size is set at 20 for the rest of the simulation results.
In Figure 8, the relationship between the SIC loss and the number of epochs is investigated in training and testing processes. After 200 epochs were run, the SIC loss was saturated at about 4% and 2.5% in the training and testing processes, respectively. For the rest of the simulation results in this paper, at least 200 epochs were run to ensure the proper working of the proposed SIC scheme.
In Figure 9, the sum rate and SNR are compared for the conventional and proposed SICs with varying power allocations. In the figure, $p_1$ and $P$ represent the power allocation factor of user 1 and the total power transmitted by the BS, respectively. When $p_1$ was 10% of the total power, the average sum rate of the proposed SIC scheme improved by around 20% compared with the conventional SIC scheme; for $p_1 = 0.3P$, the improvement was around 8%. This result can be interpreted as indicating that an excessive power allocation for a single user may harm the overall operation of the proposed NOMA scheme.
The proposed scheme was also compared with related studies on different SIC decoding schemes, namely perfect SIC decoding [1] and deep learning-based transmission and reception [19]. It is confirmed that the proposed scheme outperforms the schemes in [1,19] in terms of sum rate.
In Figure 10, the relationship between the sum rate and SNR of the proposed CNN-based SIC scheme is investigated with varying learning rates. To determine the optimal value of the learning rate, for each given dataset the learning rate is adjusted in steps of 0.01 in the range of (0.01, 0.2) until the best sum rate is achieved. The data indicate that the maximum sum rate can be achieved for each SNR with a learning rate of 0.11.
In Figure 11, the relationship between the loss function $\mathcal{L}(\theta)$ and the number of epochs of the proposed scheme is investigated when modulation is varied. When 64-QAM modulation is applied, the loss increases if the epoch number exceeds 200. This can be interpreted as an increase in loss due to overfitting. When the number of epochs accumulates sufficiently (more than 300), quadrature phase shift keying (QPSK) can be the most suitable modulation scheme in terms of decoding complexity.
In Figure 12, the relationship between the loss function $\mathcal{L}(\theta)$ and the number of epochs of the proposed scheme is investigated when the activation function on the hidden layer is varied. When $\varphi_{\tanh}(x)$ and $\varphi_{\mathrm{ELU}}(x)$ are applied, it can be confirmed that severe underfitting occurs and the loss remains at a relatively higher level for the range of epochs from 0 to 350. When $\varphi_{\mathrm{ReLU}}(x)$ and $\varphi_{\mathrm{LReLU}}(x)$ are applied, the loss function $\mathcal{L}(\theta)$ increases if the epoch exceeds 250 due to overfitting. Therefore, $\varphi_{\mathrm{sigmoid}}$ can be a viable choice for the activation function of the proposed scheme.
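The learning-rate search described for Figure 10 amounts to a one-dimensional grid search, sketched below. The grid bounds and the step of 0.01 follow the text, with the end points treated as inclusive by assumption. The scoring function `toy_sum_rate` is a hypothetical stand-in for training the CNN and measuring the sum rate, and its shape is chosen purely for illustration rather than taken from the simulation data.

```python
def learning_rate_grid(start=0.01, stop=0.20, step=0.01):
    """Candidate learning rates 0.01, 0.02, ..., 0.20 (rounded to avoid float drift)."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(count)]

def best_learning_rate(evaluate_sum_rate, candidates):
    """Return the candidate with the highest score from the supplied evaluator."""
    return max(candidates, key=evaluate_sum_rate)

def toy_sum_rate(lr):
    # Hypothetical stand-in for "train the CNN, then measure the sum rate".
    # A concave curve peaking at 0.11 is used only so the example has a clear maximum.
    return -(lr - 0.11) ** 2

candidates = learning_rate_grid()
chosen = best_learning_rate(toy_sum_rate, candidates)
```

In practice, `evaluate_sum_rate` would wrap a full training and evaluation run, so each of the 20 candidates costs one complete training cycle per dataset.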

Conclusions
In this paper, the practical issue of imperfect successive interference cancellation was described. The sum rate loss of NOMA-based wireless communication systems caused by imperfect SIC can be mitigated by the proposed CNN-based SIC scheme. The learning performance of the proposed SIC scheme was illustrated in the simulation results with various parameters.
This study confirms that the CNN-based deep learning approach is a promising tool for enhancement of the NOMA detection scheme and that the proposed SIC scheme can achieve higher sum rates compared with the conventional one. The results of this paper can find applications in 5G/6G wireless communication and in wireless sensor networks with improved NOMA and intelligent processing.
