Asynchronous Flipped Grant-Free SCMA for Satellite-Based Internet of Things Communication Networks

Abstract: Sparse code multiple access (SCMA) is a promising code-domain non-orthogonal multiple-access scheme that can support massive connectivity and grant-free transmission in future satellite-based Internet of Things (IoT) communication networks. Traditional grant-free SCMA relies on time synchronization, which is no longer favorable in such satellite communication networks because the amount of signaling needed to keep all transmitters time-synchronized is impractical for large networks. Moreover, without centralized codebook assignment, grant-free SCMA suffers from codebook collisions, in which more than one terminal selects the same codebook and the terminals interfere with each other. Motivated by these issues, a novel uplink grant-free asynchronous flipped SCMA scheme named AF-SCMA is proposed in this paper. With the concept of flipped diversity, a specific SCMA-encoded packet is transmitted together with its flipped replica. A successive interference cancellation (SIC) technique combined with a sliding window is adopted at the gateway station to resolve packet collisions, including codebook collisions. The performance of AF-SCMA is investigated via both mathematical analysis and simulations. Simulation results show that the proposed AF-SCMA provides remarkable performance in terms of throughput and packet loss ratio (PLR), and can benefit from received signal power unbalance.

inefficiently applied to the straightforward SAT-IoT networks. Since MTDs need to send extra signaling to keep time synchronized, the overhead is unaffordable for energy-sensitive devices. Moreover, signaling interaction between the terminals and the satellite increases the access latency, which is intolerable for delay-sensitive traffic. In addition, it is unrealistic to synchronize the whole network, especially in future integrated satellite-terrestrial communication networks. Therefore, it is necessary to study asynchronous grant-free SCMA in SAT-IoT networks. An asynchronous grant-free NOMA protocol was investigated in [28]. The authors considered both time and frequency offsets and exploited packet replicas to trigger the SIC procedure. In [29], a novel asynchronous interference cancellation technique for NOMA named triangular SIC (T-SIC) was introduced. The receiver tries to decode the overlapping signals in a triangular pattern based on a priori information about all the decoded symbols. However, the T-SIC method requires the time offsets to be compensated to within one symbol interval, so it is not a strictly grant-free scheme. In fact, both of the above NOMA schemes operate in the power domain, which is not appropriate for SAT-IoT networks: the distance between the MTDs and the satellite is so long that the differences in received power are not large enough. Moreover, the MTDs are usually energy-sensitive and cannot afford high transmission power. Furthermore, the performance of power-domain NOMA schemes depends to a great extent on the power allocation policy, and the power allocation procedure introduces extra signaling and delay. Therefore, it is meaningful to design an asynchronous SCMA scheme for future SAT-IoT networks.
However, without time synchronization, SCMA packets may be only partially interfered, so the traditional successive joint decoder [26] cannot be applied directly. Without the request-grant procedure, MTDs select their SCMA codebooks randomly from a predefined, limited codebook set. If two or more MTDs choose the same codebook and their packets overlap, a codebook collision occurs, which is destructive for SCMA decoding. SCMA achieves better BER performance because the multidimensional constellations of each codebook provide extra shaping gain [15]. Codebook collisions eliminate this shaping gain, so the SCMA decoder can hardly distinguish the individual codewords. Thus, to design an asynchronous SCMA scheme, we first need to tackle the problem of codebook collisions.
Motivated by these issues, this paper introduces a novel uplink grant-free asynchronous flipped SCMA (AF-SCMA) scheme for future SAT-IoT communication networks. The main contributions of this paper can be summarized as follows:

• We propose a novel uplink grant-free asynchronous flipped SCMA (AF-SCMA) scheme for future SAT-IoT networks. With the concept of flipped diversity, a packet and its flipped replica are transmitted simultaneously. Specifically, the flipped replica is encoded with a different SCMA codebook from the one chosen for the original packet. Different pilot sequences are added before the encoded symbols of each subcarrier; they are used to detect the signal, identify the codebook, and estimate the channel state information. Taking advantage of the asynchronous packet arrivals, SIC combined with the message passing algorithm (MPA) is exploited to decode the packets in a sliding detection window. With the help of the flipped replica, the probability of codebook collisions is reduced, which mitigates their negative effects and improves the system performance.

• We develop an analytical model of the network throughput for the proposed AF-SCMA scheme. We formulate a signal-to-interference-plus-noise ratio (SINR) estimator to identify the power of the received signal in an uplink land mobile satellite (LMS) channel and to estimate the multiple-access interference (MAI). The expressions for the packet loss ratio (PLR) of the network are derived recursively.

• We investigate the performance of the proposed AF-SCMA scheme via both theoretical analysis and extensive simulations. We compare the proposed AF-SCMA with synchronous and asynchronous grant-free SCMA schemes in terms of PLR and throughput. Furthermore, the performance for different SCMA codebook sizes is evaluated. Finally, the design of the AF-SCMA pilot sequence is investigated and the pilot sequence missed detection rate is assessed.
The rest of this paper is organized as follows. Section 2 introduces the system model. The transceiver design of AF-SCMA scheme is introduced in detail in Section 3. Section 4 derives the performance of the proposed AF-SCMA scheme. Simulation results are shown in Section 5. Finally, Section 6 concludes this paper.

System Model
The scenario considered in this paper consists of U wireless MTDs that send data packets asynchronously over a shared wireless channel to a GEO satellite, as shown in Figure 1a. The packets received by the satellite are forwarded transparently to the ground gateway station over independent backhaul channels, which are assumed to be error-free. Considering the relatively high complexity of the SCMA decoding process, signal detection and decoding are performed at the gateway station. The overall system traffic load is assumed to follow a Poisson distribution with parameter λ within one packet transmission duration τ, which is fixed and identical for all transmitted packets. Moreover, an acknowledgment (ACK) mechanism is used to inform each MTD that its transmitted packets have been received correctly. If an MTD fails to receive an ACK message within a certain period of time, it retransmits the packet.
For each MTD, the original data bits are encoded by a powerful FEC coding scheme and then further encoded by the SCMA encoder. Specifically, in the SCMA encoder, every $\log_2 M$ bits are mapped to a codeword according to the codebook, where M is the size of the codebook [15]. Due to the sparsity of the codeword, the number of non-zero entries N is less than the length K of the codeword (i.e., N < K). In a system that adopts a multi-carrier technique such as OFDM, each entry of a codeword is mapped to a subcarrier. Therefore, K subcarriers in the frequency domain make up an SCMA transmission block. In practice, the available bandwidth is divided into several blocks so that the codebooks can be reused. Nevertheless, in this paper we focus on a single SCMA transmission block, and codebook reuse is not taken into consideration.
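As a rough illustration of the SCMA mapping described above, the following Python sketch maps every log2 M bits to a sparse length-K codeword. The constellation points in `POINTS` and the helper names are hypothetical placeholders, not the optimized codebooks of [36]; only the sizes K = 4, N = 2 and M = 4 follow the configuration used in the simulations.

```python
from itertools import combinations

K, N, M = 4, 2, 4  # subcarriers, non-zero entries per codeword, codebook size

# Hypothetical constellation: one complex point per non-zero position and symbol.
# Real SCMA codebooks use optimized multidimensional constellations instead.
POINTS = [(1 + 1j, 1 - 1j), (1 - 1j, -1 - 1j), (-1 + 1j, 1 + 1j), (-1 - 1j, -1 + 1j)]


def make_codebooks():
    """Build one codebook per choice of N non-zero subcarriers out of K."""
    books = []
    for positions in combinations(range(K), N):
        book = []
        for symbol in range(M):
            codeword = [0j] * K
            for pos, point in zip(positions, POINTS[symbol]):
                codeword[pos] = point
            book.append(codeword)
        books.append(book)
    return books


def scma_encode(bits, book):
    """Map every log2(M) bits to one sparse length-K codeword."""
    bits_per_symbol = M.bit_length() - 1  # log2(M) when M is a power of two
    words = []
    for i in range(0, len(bits), bits_per_symbol):
        index = int("".join(str(b) for b in bits[i:i + bits_per_symbol]), 2)
        words.append(book[index])
    return words


codebooks = make_codebooks()
encoded = scma_encode([0, 1, 1, 1], codebooks[0])
```

With K = 4 and N = 2 this construction yields six sparsity patterns, which is consistent with the six codebooks used later in the evaluation.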
To mitigate the negative impact of codebook collisions, the proposed AF-SCMA scheme combines flipped diversity with the SIC technique. As shown in Figure 1b, with the concept of flipped diversity, each MTD randomly selects two different SCMA codebooks from the codebook set of size $N_{CB}$ to encode the data. The original encoded packet is transmitted simultaneously with its flipped replica, which is encoded with the other SCMA codebook, on the corresponding subcarriers. Pilot sequences, which are used for user detection, codebook identification, signal acquisition, and channel estimation, are prepended to the encoded data. The pilot sequences adopted by this system have the constant amplitude zero autocorrelation property, similar to Zadoff-Chu sequences and Gold code sequences. The performance of pilot sequence detection is discussed in Section 5.
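The constant amplitude zero autocorrelation property mentioned above can be checked numerically. The sketch below generates a Zadoff-Chu sequence as one example of such a pilot; the root index 5 and length 31 are arbitrary illustrative choices, not the pilot parameters of the paper.

```python
import cmath


def zadoff_chu(root, length):
    """Zadoff-Chu sequence for an odd `length` and a `root` coprime to it."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def cyclic_autocorrelation(seq, lag):
    """Cyclic autocorrelation of `seq` at the given lag."""
    n = len(seq)
    return sum(seq[i] * seq[(i + lag) % n].conjugate() for i in range(n))


pilot = zadoff_chu(root=5, length=31)
peak = abs(cyclic_autocorrelation(pilot, 0))
sidelobe = max(abs(cyclic_autocorrelation(pilot, lag)) for lag in range(1, 31))
```

Every sample has unit magnitude, the zero-lag peak equals the sequence length, and the cyclic autocorrelation vanishes (up to rounding) at all other lags, which is what makes such sequences convenient for correlation-based detection.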
Packet detection and decoding procedures are performed in a detection window with the size of W at the gateway station. The uplink LMS channel is modeled as a block fading channel, which means the channel conditions will not change during one specific packet propagation time. Channel information is assumed to be always available, which can be acquired by the methods mentioned in [6,30] with the aid of pilot sequences. Moreover, the design of AF-SCMA transmitter and receiver will be introduced in detail in the next section.

Asynchronous Flipped SCMA Scheme
This paper proposes a novel uplink asynchronous multiple-access scheme named AF-SCMA for the mMTC scenario in future SAT-IoT communication networks. The scheme makes SCMA truly asynchronous, which reduces the signaling overhead needed to maintain time synchronization. Moreover, by exploiting flipped diversity and the SIC technique, AF-SCMA mitigates the negative impact of codebook collisions in grant-free SCMA systems, which further improves the network throughput. In this section, we present the design of AF-SCMA in detail.

Codebook Collision Resolution in AF-SCMA
SCMA is one of the most promising NOMA schemes, which allows different users to transmit the encoded codewords on the resource unit in an overlapping way. By designing the multidimensional constellation of SCMA codebook optimally, SCMA can obtain extra shaping gain, which makes SCMA achieve better performance in terms of BER [15]. To guarantee the excellent decoding performance, a codebook is allocated to a dedicated user by the central scheduler in one SCMA block in practice. However, in grant-free SCMA systems, each user chooses a codebook from the codebook set randomly, which makes it possible that two or more users choose the same codebook. If more than one user selects the same codebook and their packets are overlapped in the time domain, this leads to codebook collisions. Codebook collisions will destroy the shaping gain brought by the multidimensional constellations, which is destructive to packet-decoding.
To alleviate the destructive impact of codebook collisions in asynchronous grant-free SCMA system, the proposed scheme exploits flipped diversity and SIC technique. Specifically, each MTD randomly selects two different codebooks from the codebook set to encode the data, and the original packet is transmitted with its flipped replica simultaneously. For packet-decoding, with the information of the small clear chunks (i.e., the chunks which are not interfered with by other chunks encoded with the same codebook) of a packet, the whole packet can be recovered, although the packet collides with other packets encoded with the same codebook.
An example of codebook collision resolution is illustrated in Figure 2. In Figure 2a, three packets select the same codebook and interfere with each other. In this case, the shaping gain of the multidimensional codewords vanishes, so these three overlapping packets can hardly be decoded correctly. Figure 2b shows the procedure for decoding three interfering AF-SCMA packets that are also encoded with the same codebooks. $A_i$, $B_i$ and $C_i$ denote the i-th chunks of packets A, B and C, respectively. The packets are encoded and transmitted with flipped diversity as described above, and the SIC technique is then adopted to recover the interfered packets. As shown in Figure 2b, $C_1$ and $C_6$ are a pair of clear chunks. Although they overlap in the time domain, they are encoded with different codebooks and can therefore still be decoded correctly. The interference caused by $C_1$ and $C_6$ can then be subtracted. Similarly, $C_2$, $C_5$ and $C_3$, $C_4$ are two pairs of clear chunks that can be decoded easily, and their replicas are subtracted to reveal more pairs of clear chunks. By removing the interference generated by chunks $C_3$ and $C_4$, a new pair of clear chunks, $A_1$ and $A_6$, is revealed. The interference cancellation procedure is executed in a zigzag pattern [31], and all three packets can be decoded successfully. In AF-SCMA, we take advantage of the shaping gain provided by different sparse codewords to construct a novel pair of packets (the original packet and its flipped replica). Although the flipped replica appears to add extra interference to the original packet, the performance of the SCMA decoder is still acceptable at moderate SNR. Owing to the asynchronous nature of AF-SCMA, the arriving packets may only partially overlap with each other. Thus, we exploit the small clear chunks and the chunk-bootstrap feature to recover overlapping packets that suffer from codebook collisions.
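The bootstrap effect described above can be mimicked with a slot-level toy model. The sketch below is a simplification under stated assumptions, not the receiver of the paper: packet offsets are whole chunk slots, all packets pick the same pair of codebooks, a chunk copy decodes if and only if no other pending chunk with the same codebook shares its slot, and cancellation is perfect. The flipped replica is modeled as the same chunks sent in reverse time order with the second codebook. The function name and the offsets are hypothetical.

```python
def sic_decode(starts, n_chunks, flipped=True):
    """Return the set of (packet, chunk) pairs recovered by slot-level SIC."""

    def locations(packet, chunk):
        first = starts[packet]
        locs = [("orig", first + chunk)]
        if flipped:
            # The replica carries the same chunks in reverse order and uses
            # the second codebook, so it lives in a separate layer.
            locs.append(("flip", first + n_chunks - 1 - chunk))
        return locs

    pending = {(p, c) for p in range(len(starts)) for c in range(n_chunks)}
    decoded = set()
    while pending:
        occupancy = {}
        for item in pending:
            for loc in locations(*item):
                occupancy.setdefault(loc, []).append(item)
        # A copy is clear when it is alone in its (codebook layer, slot).
        clear = {items[0] for items in occupancy.values() if len(items) == 1}
        if not clear:
            break
        decoded |= clear      # decode the clear chunks ...
        pending -= clear      # ... and cancel both of their copies
    return decoded


with_replica = sic_decode([0, 3, 4], n_chunks=6, flipped=True)
without_replica = sic_decode([0, 3, 4], n_chunks=6, flipped=False)
```

In this toy case, three six-chunk packets starting at slots 0, 3 and 4 are fully recovered (18 chunks) when the flipped replica is present, whereas only the 4 chunks that were never interfered are recovered without it.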
Compared with other diversity transmission schemes, AF-SCMA transmits the packet and its flipped replica simultaneously, and no random intervals are needed between these two packets, which improves the transmission efficiency.

AF-SCMA Transmitter Design
With the codebook collision resolution method described above, we design a novel asynchronous SCMA scheme for future grant-free satellite communication networks. In AF-SCMA, MTDs can transmit their packets as soon as the packets are generated, and they do not need to send request-grant signaling to apply for resources. Therefore, AF-SCMA reduces the complexity of deploying the system and is an efficient uplink multiple-access scheme. The transmitter design of AF-SCMA can be summarized as follows:
(1) The incoming original data are organized into packets of a fixed size predefined by the system.
(2) Each packet is segmented into $N_{chunk}$ chunks of equal duration $\tau/N_{chunk}$. Each chunk carries a few bits of cyclic redundancy check (CRC) information and the codebook information of its flipped replica, both placed at known locations in the packet payload.
(3) Packets are first encoded with a powerful FEC coding scheme with coding rate r. After channel coding, each packet is further encoded with two different SCMA codebooks, which are randomly selected from the codebook set of size $N_{CB}$.
(4) For the two encoded packets (the original one and its flipped replica) of a specific packet, a pilot sequence is prepended to each packet and is used for signal acquisition, user detection, codebook identification and channel estimation. According to the SCMA encoding rule, each encoded packet is mapped to N subcarriers, so the pilot sequences on these N subcarriers are the same for one packet.
(5) For each MTD, once the physical-layer packets are ready, they are transmitted in the pattern shown in Figure 2b. Neither network-wide timing synchronization nor request-grant signaling is required in AF-SCMA.
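The framing part of the transmitter steps above can be sketched as follows. This is only an illustration under assumptions: FEC coding, SCMA mapping and pilot insertion are omitted, an 8-bit truncation of CRC-32 stands in for the "few bits of CRC", the flipped replica is taken to be the chunks in reverse order (as suggested by Figure 2b), and the function and field names are hypothetical.

```python
import random
import zlib


def build_packet_pair(payload, n_chunks, n_codebooks, rng):
    """Segment a packet and prepare its original and flipped chunk sequences."""
    if len(payload) % n_chunks:
        raise ValueError("payload length must be a multiple of n_chunks")
    # Step (3): two different codebooks drawn at random from the codebook set.
    cb_orig, cb_flip = rng.sample(range(n_codebooks), 2)
    size = len(payload) // n_chunks

    def chunk(index, replica_codebook):
        data = payload[index * size:(index + 1) * size]
        return {
            "index": index,
            "data": data,
            "crc": zlib.crc32(data) & 0xFF,        # stand-in for a short CRC
            "replica_codebook": replica_codebook,  # where to find the other copy
        }

    # Step (2): each chunk carries a CRC and the codebook of its replica.
    original = [chunk(i, cb_flip) for i in range(n_chunks)]
    flipped = [chunk(i, cb_orig) for i in reversed(range(n_chunks))]
    return ({"codebook": cb_orig, "chunks": original},
            {"codebook": cb_flip, "chunks": flipped})


original, replica = build_packet_pair(bytes(range(60)), 10, 6, random.Random(0))
```

Because each chunk names the codebook of its counterpart, a receiver that decodes either copy knows how to regenerate and cancel the other one.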

AF-SCMA Receiver Design
Similar to some asynchronous multiple-access schemes [9,10], the AF-SCMA receiver adopts a sliding window of size W at the gateway station to resolve collisions and decode the packets. Packet decoding and interference cancellation (IC) are also executed within the sliding window, following the method described in [32]. The receiver design of AF-SCMA can be summarized as follows:
(1) The received signal is first filtered and sampled, and then stored in memory. The memory must be large enough to store all the filtered samples.
(2) Pilot sequence detection is performed throughout the window memory using a correlator matched to the pilot sequences.
(3) Once a specific pilot is detected, the channel information is estimated from the pilot sequence. In this paper, we assume that channel estimation is perfect; in practice it can be obtained with the algorithms in [6,30]. Furthermore, the packet location, the MTD identity and the codebook used to encode the data can also be obtained from the pilot sequence.
(4) Based on the positions of the received packets within the detection window, the AF-SCMA receiver estimates the interference conditions of a specific packet. Since the lengths of each packet and each chunk are fixed, the decoding process can start with the chunk that has the minimum interference and the strongest power.
(5) For the SCMA decoding process, we first subtract the pilot sequences from the received signals to reduce the interference. We adopt a sub-optimal but efficient algorithm, MPA [15], to approach optimal maximum likelihood (ML) detection. Since the packet arrivals are asynchronous, the successive joint decoding (SJD) method used with traditional MPA is not applicable. Therefore, the AF-SCMA decoder decodes only one chunk at a time. To improve the decoding accuracy, we augment each original codebook $C_j$ with a length-K all-zero codeword $\mathbf{0}$, where j = 0, 1, ..., $N_{CB}$. Thus, the extended codebook becomes $C_j \cup \{\mathbf{0}\}$. For a chunk that is only partially interfered, the clean part can be regarded as the interfering MTDs virtually transmitting the codeword $\mathbf{0}$ (for more details refer to [19]). Moreover, the factor graph used for MPA decoding is obtained from the interference conditions of the chunk being decoded, and the SCMA decoding process follows the MPA decoding rules.
(6) A decoded chunk is declared correctly decoded when it passes the CRC check. The payload of the chunk is then re-encoded using its original codebook. Meanwhile, the AF-SCMA receiver also regenerates the flipped chunk using the codebook information found in the chunk payload. Note that the codebook information in the payload is not the same in the original chunk and in its flipped replica. The channel information can be obtained from the corresponding pilot sequences.
(7) Both the re-encoded chunk and its flipped replica are subtracted from the memory.
(8) When no more chunks can be decoded successfully, this packet-decoding round terminates. The receiver then updates the interference conditions of the remaining signals and starts a new packet-decoding round from the chunk with the minimum interference and the strongest power (step 4), until the maximum number of iterations $N^{max}_{iter}$ is reached.
(9) When the signals in the current window have been processed, the detection window is shifted by ∆W and the receiver continues to decode the packets in the newly formed detection window. If the replica of a decoded chunk is not located within the current window, the receiver temporarily stores the information of the chunk until the replica chunk lies completely within the span of the sliding window. Finally, the data stored in the oldest ∆W of memory are discarded.
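The ordering rule used in steps (4) and (8) can be written as a small selection function. The sketch assumes one possible reading of "minimum interference and strongest in power": candidates are ranked first by the number of interfering chunks and ties are broken by received power. The dictionary fields and sample values are hypothetical.

```python
def pick_next_chunk(candidates):
    """Pick the chunk with the fewest interferers, then the highest power."""
    return min(candidates, key=lambda c: (c["interferers"], -c["power_db"]))


candidates = [
    {"id": "A1", "interferers": 2, "power_db": 14.0},
    {"id": "B4", "interferers": 0, "power_db": 11.5},
    {"id": "C1", "interferers": 0, "power_db": 13.2},
]
first = pick_next_chunk(candidates)
```

Here the two interference-free chunks are preferred over `A1`, and `C1` is chosen because it is the stronger of the two.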

Complexity Analysis of AF-SCMA
Computational complexity is an important metric for evaluating an algorithm. For the proposed AF-SCMA scheme, the computational complexity can be analyzed in terms of pilot sequence detection and the SCMA decoding process. For pilot sequence detection, we adopt the sliding correlation technique to detect the pilot sequence in a sliding window of size W. For each iteration round, the complexity of sliding correlation is about $O\left(\frac{n_p W}{\tau_p} \log_2 \frac{n_p W}{\tau_p}\right)$ when the fast Fourier transform (FFT) is used [33], where $n_p$ is the length of the pilot sequence in bits and $\tau_p$ is the transmission duration of the pilot sequence. For the SCMA decoding process, the computational complexity depends on the decoding algorithm. In this paper, we adopt the typical but efficient MPA to decode the SCMA codewords. Thus, the decoding complexity for a chunk is about $O(M^{d_r})$ [34], where M is the size of the codebook and $d_r$ is the number of interfering codewords occupying the same subcarrier. To decode a packet, the decoding process is repeated $N_{chunk}$ times, where $N_{chunk}$ is a small positive integer in practice (no greater than 10). Because $N_{chunk}$ is a small constant, the decoding complexity for a packet is also about $O(M^{d_r})$. Note that, based on the conventional MPA, several improved algorithms have been proposed to further reduce the SCMA decoding complexity. However, SCMA decoding algorithms with lower complexity are beyond the scope of this paper and are left for future work.
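To give a feel for the orders of magnitude involved, the two complexity terms can be evaluated for sample parameters. The helper names and the input values below are illustrative assumptions; the functions return the leading terms only, without constant factors.

```python
import math


def pilot_search_ops(n_p, window, tau_p):
    """Leading term of FFT-based sliding correlation: n * log2(n), n = n_p * W / tau_p."""
    n = n_p * window / tau_p
    return n * math.log2(n)


def mpa_chunk_ops(codebook_size, d_r):
    """Leading term of MPA decoding for one chunk: M ** d_r."""
    return codebook_size ** d_r


# Hypothetical values: W and tau_p must be expressed in the same time unit.
correlation_ops = pilot_search_ops(n_p=64, window=8.0, tau_p=1.0)
decoding_ops = mpa_chunk_ops(codebook_size=4, d_r=3)
```

For a codebook of size M = 4 with three codewords sharing a subcarrier, the MPA term is 64 per chunk, which illustrates why the exponent $d_r$ dominates the decoding cost.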

AF-SCMA Analytical Performance Derivation
This section derives the performance of AF-SCMA in terms of PLR and network throughput T. First, we analyze the probability distribution of the number of packets interfering with a specific packet. The system traffic load is assumed to follow a Poisson distribution with parameter λ within one packet transmission duration τ. Thus, the probability that k packets arrive within τ can be expressed as
$$f(k;\lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!} \tag{1}$$
where k is a Poisson random variable with intensity λ. In an asynchronous system, collisions occur when two or more packets arrive within ±1 packet duration τ. Therefore, λ is given by where G is the normalized logical traffic load of the medium access control (MAC) layer, measured in bits/symbol/carrier. The AF-SCMA coding gain $G_c$ is defined as $G_c = 1/(r \log_2 M)$, where r is the FEC coding rate and M is the cardinality of the SCMA codebooks. $N_{rep}$ is the number of transmitted replicas, which is fixed at 2 in this paper.
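The Poisson arrival model above is easy to evaluate numerically. The following minimal sketch computes the probability of k arrivals for a given intensity; the intensity value used in the example is arbitrary.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k arrivals for a Poisson variable of intensity lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Probability of 0 to 4 interfering packets for an example intensity of 3.
probabilities = [poisson_pmf(k, 3.0) for k in range(5)]
```

Summing the probabilities over a sufficiently large range of k returns a value very close to 1, which is a quick sanity check when truncating the infinite sums of the analytical model.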
To derive the PLR of AF-SCMA, we also need to analyze the power of the received packets, which directly influences the decoding performance. Since MTDs are widely distributed within a satellite beam, the uplink LMS channel fading conditions differ from one MTD to another. Additionally, the antenna gain of the MTDs or the satellite also fluctuates within a small range due to environmental variations. Therefore, power control mechanisms cannot work perfectly, and the power of each received packet at the gateway station fluctuates around its desired value $[E_b/N_0]_D$. As a result, the actual received power of an individual packet can be expressed as where a is the power amplitude, which is assumed to follow a lognormal distribution L(a) with mean µ (in dB) and standard deviation σ (in dB). Specifically, L(a) is given by [35]: Furthermore, we exploit the discrete lognormal probability density function (pdf) described in [10] to characterize the discrete power amplitude of the received packets, and the lognormal pdf is represented as: where $E_b/N_0(i)$ is a possible received power value, which increases in equal small steps in the logarithmic domain (e.g., 0.1 dB).
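One way to build such a discrete power distribution is sketched below. It relies on the fact that a lognormal variable parameterized in dB is Gaussian in the dB domain, samples that Gaussian on a 0.1 dB grid and normalizes the weights. The truncation at ±5σ and the function name are assumptions made for illustration; the exact discretization of [10] is not reproduced here.

```python
import math


def discrete_lognormal_db(mu_db, sigma_db, step_db=0.1, span_sigmas=5.0):
    """Return (offset_dB, probability) pairs on a uniform dB grid."""
    if sigma_db == 0:
        return [(mu_db, 1.0)]  # no power unbalance: a single power level
    n = int(round(span_sigmas * sigma_db / step_db))
    offsets = [mu_db + j * step_db for j in range(-n, n + 1)]
    weights = [math.exp(-((x - mu_db) ** 2) / (2.0 * sigma_db ** 2)) for x in offsets]
    total = sum(weights)
    return [(x, w / total) for x, w in zip(offsets, weights)]


pmf = discrete_lognormal_db(mu_db=0.0, sigma_db=3.0)
```

Each offset is added (in dB) to the desired $[E_b/N_0]_D$ to obtain one possible received power level together with its probability.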
In AF-SCMA, decoding is performed chunk by chunk, and the detection of different chunks is independent. Therefore, we first derive the chunk loss ratio (CLR) and then obtain the PLR from the CLR. However, because a recursive IC process is used in the sliding window, we need an iterative model for the CLR. At each iteration round $N_{iter} = 0, 1, 2, ..., N^{max}_{iter}$, $CLR_{N_{iter}}$ is defined as the average of the probability $P^{loss}_{c,N_{iter}}(i)$ of erroneously decoding a specific chunk when the received power equals $E_b/N_0(i)$, which can be expressed as Supposing that k chunks collide with the chunk of interest, $P^{loss}_{c,N_{iter}}(i)$ is represented as
$$P^{loss}_{c,N_{iter}}(i) = \sum_{k=0}^{\infty} f(k; \lambda\tau/N_{chunk})\, P^{loss}_{c,N_{iter}}(i|k)$$
where $P^{loss}_{c,N_{iter}}(i|k)$ is the probability of losing the specific chunk at SIC iteration $N_{iter}$ when the received power equals $E_b/N_0(i)$, conditioned on k chunks interfering with the specific chunk. $f(k; \lambda\tau/N_{chunk})$ is the probability of k chunks arriving within ±1 packet duration, which follows the Poisson distribution in (1).
Due to the recursive IC scheme adopted by AF-SCMA, some of the k interfering chunks may have been removed after the previous IC iteration round because their flipped replicas were decoded elsewhere. Moreover, the decoding of different chunks of a packet is considered independent, so the other chunks of the interfering packets, located elsewhere, follow the same $CLR_{N_{iter}}$. The probability that r chunks still overlap with the specific chunk at IC iteration $N_{iter}$ can be represented as
$$f_R(r; k, q) = \binom{k}{r} q^{r} (1-q)^{k-r} \tag{7}$$
where $f_R$ is a binomial distribution and q is the CLR of the previous IC iteration, i.e., $q = CLR_{N_{iter}-1}$. For the first IC iteration round ($N_{iter} = 0$), q is initialized to 1. Furthermore, $P^{loss}_{c,N_{iter}}(i|k)$ can be derived by
$$P^{loss}_{c,N_{iter}}(i|k) = \sum_{r=0}^{k} f_R(r; k, q)\, P^{loss}_{R,N_{iter}}(i|r)$$
where $P^{loss}_{R,N_{iter}}(i|r)$ is the probability of erroneously decoding the specific chunk when there are r interfering chunks and the power of the specific chunk is $E_b/N_0(i)$. Among these r residual interfering chunks, l chunks select the same codebook as the specific chunk, each with probability $p_l = 2/N_{CB}$, and the remaining r − l chunks select different codebooks. Thus, $P^{loss}_{R,N_{iter}}(i|r)$ can be obtained by Similar to Equation (7), $f_L(l; r, p_l)$ is a binomial distribution. Due to the nature of the asynchronous system, the arriving packets only partially interfere with the packet of interest, which slightly influences the performance of SCMA and FEC decoding. Thus, referring to [9], we introduce an empirical compensation factor β = 0.9 to modify the analytical interference model. $I_r(r)$ is the average interference power spectral density contributed by r overlapping packets. Γ(·) is the SCMA packet error rate (PER) as a function of the SINR.
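The two binomial terms in this recursion are straightforward to compute. The sketch below evaluates the distribution of the number of residual interferers for an example previous-round CLR, and shows that initializing q to 1 leaves all k interferers in place at the first round. The numerical values of k and q are arbitrary examples.

```python
import math


def binomial_pmf(r, k, q):
    """Probability that r out of k interfering chunks remain when each
    survives the previous IC round independently with probability q."""
    return math.comb(k, r) * q ** r * (1.0 - q) ** (k - r)


N_CB = 6
p_same = 2 / N_CB  # probability p_l that an interferer shares a codebook

residual = [binomial_pmf(r, 4, 0.2) for r in range(5)]     # q = CLR of previous round
first_round = [binomial_pmf(r, 4, 1.0) for r in range(5)]  # q initialized to 1
```

The same function serves for $f_L(l; r, p_l)$ by passing `p_same` as the success probability.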
For the specific SCMA codebooks (number of subcarriers K = 4 and number of codebooks $N_{CB}$ = 6) [36] and the 3GPP Turbo code with coding rate r = 1/3 used in this paper [37], Γ(·) can be approximated by fitting the simulated PER, which is provided by: where x is the SINR in dB and $P_1$ = −0.05406, $P_2$ = 2.54, $P_3$ = −45.68, $P_4$ = 395.1, $P_5$ = −1651, $P_6$ = 2693, $Q_1$ = −3.342, $Q_2$ = −29.45, $Q_3$ = 424.2, $Q_4$ = −1637 and $Q_5$ = 2701. However, the power of the interfering packets is lognormally distributed and each packet overlaps with the specific packet to a different degree, so it is hard to compute $I_r(r)$ accurately. In this analytical model, we use additive white Gaussian noise (AWGN) to approximate the behavior of MAI in the AF-SCMA scenario. Although this approach is not very accurate, especially when only a few packets collide, its performance is still acceptable, as investigated in [10,29]. Based on this simple but efficient analytical model, we can obtain the expression of $I_r(r)$. Similar to Equation (4), the power of the interfering chunks $P_{I_r}$ is discretely distributed, and the energy per bit to noise power spectral density ratio can be expressed as $E_b/N_0(m) = P_{I_r}(m)/(R_b N_0)$ with probability $p_{E_b/N_0}(m)$ for each possible value of $E_b/N_0(m)$, where and m = 0, 1, 2, ..., ∞; $R_b$ is the bit rate. For each $E_b/N_0(m)$, there are $l\,p_{E_b/N_0}(m)$ chunks selecting the same codebook as the specific chunk, and $(r − l)\,p_{E_b/N_0}(m)$ choosing different codebooks. Thus, the ratio $I_r(r)/N_0$ of the MAI caused by r chunks to the AWGN power spectral density can be approximated as in Equation (11), where α is an interfering factor that approximates the performance of MPA decoding for collided chunks encoded with different codebooks. Thanks to the shaping gain brought by the multidimensional constellations, the interference caused by interferers encoded with different codebooks is not as severe as that caused by interferers encoded with the same codebook.
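A minimal sketch of evaluating Γ(·) from the listed coefficients follows. It assumes, based only on the count of six P and five Q coefficients, that the fit is a ratio of two fifth-degree polynomials in x with a monic denominator; this functional form is an assumption and should be checked against the original expression. The result is clipped to [0, 1] because a fitted rational function is only meaningful inside the SINR range it was fitted on, which is not stated here.

```python
# Coefficients as listed in the text, highest power first.
P = [-0.05406, 2.54, -45.68, 395.1, -1651.0, 2693.0]
Q = [1.0, -3.342, -29.45, 424.2, -1637.0, 2701.0]  # leading 1.0 is assumed


def horner(coeffs, x):
    """Evaluate a polynomial given its coefficients, highest power first."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


def gamma_per(sinr_db):
    """Assumed rational fit of the packet error rate versus SINR in dB."""
    ratio = horner(P, sinr_db) / horner(Q, sinr_db)
    return min(1.0, max(0.0, ratio))


curve = [(x, gamma_per(float(x))) for x in range(0, 7)]
```

Under this assumed form the curve behaves like a typical coded PER waterfall: it is close to 1 at 0 dB and drops to roughly 0.01 at 5 dB.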
Unfortunately, there is no exact formula for calculating the interference power among different codebooks. Therefore, an interfering factor α = 0.3 is adopted empirically for the codebooks we use, to approximate the interference power caused by different codebooks. Finally, for a given traffic load G, the PLR of the system can be obtained from $CLR_{N_{iter}}(G)$, which is given by Thus, the normalized MAC throughput under the traffic load G can be simply obtained by
$$T(G) = G\,[1 - PLR(G)]$$
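The last step of the model can be sketched numerically. The throughput function uses the usual definition of normalized throughput as offered load times the success probability. The mapping from CLR to PLR shown here is only an illustrative assumption (a packet is lost if any of its independently decoded chunks is lost) and may differ from the expression used in the paper.

```python
def packet_loss_ratio(clr, n_chunks):
    """Assumed mapping: the packet survives only if all n_chunks chunks do."""
    return 1.0 - (1.0 - clr) ** n_chunks


def normalized_throughput(load, plr):
    """Normalized MAC throughput: offered load times packet success probability."""
    return load * (1.0 - plr)


plr = packet_loss_ratio(clr=1e-4, n_chunks=10)
throughput = normalized_throughput(load=2.0, plr=plr)
```

At low loss ratios the throughput tracks the offered load almost exactly, which is why the throughput curves rise linearly before collisions take over.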

Simulation Results
This section provides extensive simulation results to evaluate the performance of the proposed AF-SCMA multiple-access scheme. Packets are generated by an infinite population of MTDs according to a Poisson distribution and are transmitted asynchronously. As a baseline, each packet contains 500 information bits, which are encoded by the 3GPP Turbo encoder with coding rate r = 1/3 jointly with the SCMA encoder. For the SCMA decoder, we adopt the common but efficient MPA to decode the packets. Sending an extra flipped packet simultaneously reduces the received power of each packet. The performance of AF-SCMA is investigated under different desired received energy per bit to noise power spectral density ratios $[E_b/N_0]_D$. However, due to the effect of path loss and shadow fading in the satellite return link, the power control mechanism cannot be perfect in practice. Thus, the power levels of the received packets are assumed to follow a lognormal distribution with mean µ = 0 dB and standard deviation σ in dB. AWGN is added to each packet, and the packets are summed together before decoding. Note that the performance of AF-SCMA cannot be further improved by sending more replicas, as in the methods of [6,8], since more replicas would destroy the structure of the transmitted packets. Therefore, we only evaluate the performance of AF-SCMA with 2 replicas (one original version and one flipped version). For the AF-SCMA scheme, each packet is divided into 10 chunks (i.e., $N_{chunk}$ = 10), and the length of the detection sliding window W is assumed to be 10 times the packet length. Moreover, the maximum number of decoding iterations $N^{max}_{iter}$ is set to 10.
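Asynchronous Poisson traffic of this kind can be generated by drawing exponential inter-arrival times. The sketch below is a generic traffic generator, not the simulator used for the results in this section; the function name, seed and parameter values are illustrative.

```python
import random


def poisson_arrivals(lam, tau, horizon, rng):
    """Arrival instants of a Poisson process with a mean of `lam` packets
    per packet duration `tau`, generated up to time `horizon`."""
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(lam / tau)  # exponential inter-arrival time
        if t >= horizon:
            return arrivals
        arrivals.append(t)


arrivals = poisson_arrivals(lam=2.0, tau=1.0, horizon=5000.0, rng=random.Random(1))
empirical_rate = len(arrivals) / 5000.0
```

Because the start times are continuous, packets overlap only partially in general, which is exactly the situation the chunk-level receiver is designed for.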

AF-SCMA MAC Performance Results
We first investigate the performance of AF-SCMA in terms of PLR and throughput, using both simulation and the analytical model (see Section 4), with and without power unbalance. The results are shown in Figure 3. The SCMA codebooks adopted in the simulation are taken from [36] and correspond to a 150% overloaded system. Specifically, the number of subcarriers is K = 4, the codebook size is M = 4, the number of non-zero entries is N = 2, and the number of codebooks is $N_{CB}$ = 6. We use the notation SCMA(K, $N_{CB}$) to reflect the relationship between resources and codebooks. As shown in Figure 3, the throughput of AF-SCMA first increases linearly with the traffic load and then drops due to severe packet collisions. In addition, we notice that AF-SCMA can benefit from power unbalance. For a target PLR = $10^{-3}$, the maximum throughput increases from 1.9 bits/symbol/carrier for σ = 0 dB to 2.4 bits/symbol/carrier for σ = 3 dB. Note that, for the SCMA(4,6) system, the theoretical maximum throughput is 3 bits/symbol/carrier. Nevertheless, without a request-grant procedure or time synchronization, the proposed AF-SCMA achieves a relatively high throughput (2.4 bits/symbol/carrier) at a low PLR while tolerating power unbalance, which is remarkable.
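The overloading figure and the throughput ceiling quoted above follow from simple arithmetic. The sketch assumes that the ceiling is reached when all $N_{CB}$ codebooks are active and each carries $\log_2 M$ bits per codeword spread over K subcarriers; the function names are hypothetical.

```python
import math


def overloading_factor(n_codebooks, n_subcarriers):
    """Ratio of superimposed codebooks (layers) to orthogonal resources."""
    return n_codebooks / n_subcarriers


def max_throughput(n_codebooks, n_subcarriers, codebook_size):
    """Bits per symbol per carrier when every codebook is in use."""
    return n_codebooks * math.log2(codebook_size) / n_subcarriers


overload = overloading_factor(6, 4)   # 1.5, i.e., 150%
ceiling = max_throughput(6, 4, 4)     # 3.0 bits/symbol/carrier
```

Against this ceiling of 3 bits/symbol/carrier, the reported 2.4 bits/symbol/carrier at σ = 3 dB corresponds to 80% of the theoretical maximum.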
Comparing the simulation results with the analytical results, we notice that in the higher traffic load region (G = 2.5~3) there are some differences between the curves, especially for σ = 2 dB and σ = 3 dB. This is probably because we adopt AWGN to approximate the behavior of MAI. Moreover, introducing the empirical compensation factor β [9] and the different-codebook interfering factor α into the analytical model also reduces its accuracy, especially in severe collision scenarios. However, these differences occur in a high traffic load region that corresponds to a very large PLR (beyond the practical target PLR). Therefore, this simple but effective analytical model matches the simulation results well in the region of practical interest.
In addition, we notice an interesting phenomenon in Figure 3b: for large σ, some packets cannot be received correctly even though the traffic load is not very high. This phenomenon is dubbed the floor effect and is discussed in [10]. It arises because, in the presence of power unbalance, the received power of some packets cannot reach the decoding threshold, especially for a large standard deviation. Therefore, these packets cannot be decoded even though the IC procedure is assumed to be perfect. This phenomenon can be mitigated by increasing the transmission power or enlarging the packet size [10].
We also compare the performance of AF-SCMA with asynchronous SCMA (A-SCMA) and slotted SCMA (S-SCMA), with and without power unbalance, in the grant-free scenario. As a baseline, the IC technique is applied to both A-SCMA and S-SCMA. Specifically, if a packet is decoded correctly, it is subtracted from the overlapping signal, and the IC procedure is assumed to be error-free. Note that two packets are transmitted simultaneously in AF-SCMA, which means that the transmission power is twice that of the schemes transmitting only one packet (A-SCMA and S-SCMA). For the sake of fairness, we double the desired receive power for the A-SCMA and S-SCMA schemes in the simulation when comparing them with AF-SCMA at [E_b/N_0]_D = 13 dB. The simulation results are illustrated in Figure 4. For the higher [E_b/N_0]_D (i.e., [E_b/N_0]_D = 13 dB), AF-SCMA outperforms the other schemes at the target PLR both with and without power unbalance. For a target PLR = 10^−3, the maximum throughput of AF-SCMA is 27% higher than that of S-SCMA and 90% higher than that of A-SCMA for σ = 0 dB. When power unbalance exists (σ = 3 dB), the maximum throughput of AF-SCMA is 20% higher than that of S-SCMA and 72% higher than that of A-SCMA for the same PLR target. This is because, under heavy traffic load, packet collisions become severe and codebook collisions therefore occur frequently. Without the help of the flipped replica, codebook collisions have a significant negative impact on SCMA decoding. Since the probability of packet collisions in the asynchronous system is higher than in the synchronous system, S-SCMA performs better than A-SCMA. We notice that for σ = 0 dB the throughput of S-SCMA is higher than that of AF-SCMA in the heavy traffic region, because codebook collisions are then so severe that few packets can be decoded successfully, which makes it hard to trigger the IC process. Although S-SCMA is superior to the proposed AF-SCMA in the heavy load scenario, the PLR of S-SCMA is then already beyond the practical target, which is intolerable in SAT-IoT networks.
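The role of the error-free IC assumption, and the difficulty of triggering IC when few packets are decodable, can be seen in a deliberately simplified toy model. The sketch below is not the SCMA message-passing decoder or the sliding-window procedure of AF-SCMA: it assumes fully overlapping packets, a single per-packet SINR threshold, linear power units, and perfect subtraction of every decoded packet. All names and numbers are hypothetical.

```python
def ideal_sic(powers, noise, sinr_threshold):
    """Return the indices of packets decoded by ideal (error-free) IC.

    The strongest remaining packet is tried first; if it cannot be
    decoded, no weaker packet can, so the procedure stops.
    """
    remaining = dict(enumerate(powers))
    decoded = []
    while remaining:
        idx = max(remaining, key=remaining.get)
        interference = sum(remaining.values()) - remaining[idx]
        sinr = remaining[idx] / (noise + interference)
        if sinr < sinr_threshold:
            break  # IC cannot be triggered (or continued)
        decoded.append(idx)
        del remaining[idx]  # perfect cancellation of the decoded packet
    return decoded


if __name__ == "__main__":
    # Unbalanced powers: each decoded packet unlocks the next one.
    print(ideal_sic([8.0, 2.0, 0.5], noise=0.1, sinr_threshold=2.0))
    # Equal powers: no packet reaches the threshold, IC never starts.
    print(ideal_sic([1.0, 1.0], noise=0.1, sinr_threshold=2.0))
```

In this toy setting, unbalanced received powers let the cancellation chain proceed, whereas an equal-power collision blocks it from the start, which mirrors the qualitative behavior discussed above.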
In fact, sending extra flipped replicas can reduce the throughput in some cases. Transmitting more packets (the original packet and its flipped replica) simultaneously lowers the received power of each packet, so that the power of each packet may not be high enough to resist the interference. Moreover, although the probability of codebook collisions is reduced by introducing flipped replicas, the interference seen by a specific packet also increases because of the extra replicas. Since AF-SCMA decoding performance depends directly on SINR, the scheme requires a higher [E_b/N_0]_D. To better illustrate the influence of received power on the performance of AF-SCMA, we also investigate AF-SCMA with [E_b/N_0]_D = 10 dB. In Figure 4, the throughput of AF-SCMA with [E_b/N_0]_D = 10 dB is lower than that of AF-SCMA with [E_b/N_0]_D = 13 dB. For a target PLR = 10^−3, the maximum throughput of AF-SCMA with [E_b/N_0]_D = 10 dB is about 50% and 30% lower than that of AF-SCMA with the higher [E_b/N_0]_D, for σ = 0 dB and σ = 3 dB respectively. Therefore, the received power of each packet determines the performance of AF-SCMA, and the proposed scheme can provide a remarkable throughput for SAT-IoT networks with acceptable [E_b/N_0]_D.
Furthermore, we investigate the AF-SCMA performance for different codebook cases, and the simulation results are shown in Figure 5. We assume that all codebooks have a fixed size M = 4 and N = 2 non-zero entries. Three codebook cases are considered in the simulation: K = 4 and N_CB = 6 (AF-SCMA (4, 6)), K = 6 and N_CB = 12 (AF-SCMA (6, 12)), and K = 6 and N_CB = 15 (AF-SCMA (6, 15)). Typically, the number of SCMA codebooks is bounded by K and N as N_CB ≤ C(K, N), where C(K, N) = K!/(N!(K − N)!) is the binomial coefficient. Figure 5 shows that the more codebooks are used, the better the achieved performance.
Specifically, for a target PLR = 10^−3, the maximum throughputs of AF-SCMA (4, 6), AF-SCMA (6, 12) and AF-SCMA (6, 15) are 1.9, 2.4 and 2.6 bits/symbol/carrier respectively without power unbalance. For the same PLR target and σ = 3 dB, the maximum throughputs of these three cases are 2.4, 2.8 and 3.2 bits/symbol/carrier respectively. This is because, with more codebooks, different MTDs have more opportunities to choose different codebooks, which decreases the probability of codebook collisions. The theoretical maximum throughputs of AF-SCMA (6, 12) and AF-SCMA (6, 15) are 4 and 5 bits/symbol/carrier, respectively. However, the simulation results indicate that AF-SCMA achieves only about 60%-70% of its best performance in these two cases. This is because the codebooks of AF-SCMA (6, 12) and AF-SCMA (6, 15) used in the simulations are not optimized; they are simply derived from the AF-SCMA (4, 6) codebooks [36]. Thus, the shaping gain brought by the multidimensional constellation is limited. When the traffic load becomes heavy, the SCMA decoder can hardly distinguish the overlapping codewords on a particular subcarrier at a medium [E_b/N_0]_D. Nevertheless, SCMA codebook design is a difficult and complicated issue, which is left for future work. Overall, by increasing the number of subcarriers and codebooks with a proper design, the performance of AF-SCMA can be further improved.
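The link between the size of the codebook set and the codebook collision probability can be made concrete with a small calculation. The sketch below evaluates the binomial bound on N_CB and, under the assumption that each MTD picks its codebook uniformly and independently at random (a simplification that ignores flipped replicas and time offsets), the probability that at least two of n overlapping MTDs select the same codebook. The function names are hypothetical.

```python
import math


def max_codebooks(k: int, n: int) -> int:
    """Upper bound on the number of codebooks: K choose N."""
    return math.comb(k, n)


def codebook_collision_probability(n_users: int, n_cb: int) -> float:
    """P(at least two of n_users pick the same codebook), assuming
    uniform and independent random selection among n_cb codebooks."""
    if n_users > n_cb:
        return 1.0  # pigeonhole: a collision is unavoidable
    p_all_distinct = 1.0
    for i in range(n_users):
        p_all_distinct *= (n_cb - i) / n_cb
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    print(max_codebooks(4, 2), max_codebooks(6, 2))
    for n_cb in (6, 12, 15):
        p = codebook_collision_probability(3, n_cb)
        print(f"N_CB = {n_cb}: P(collision among 3 MTDs) = {p:.3f}")
```

Under this simplified model, K = 4 with N = 2 allows at most 6 codebooks and K = 6 with N = 2 allows at most 15, and the collision probability for a fixed number of overlapping MTDs falls as N_CB grows.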

Pilot Sequence Detection Performance Assessment
Pilot sequence detection is crucial to the AF-SCMA decoding procedure, since it is how the receiver acquires the identity of the MTDs and the codebook information. Figure 6 shows the pilot sequence detection assessment results. Since a false alarm can be recovered during channel estimation [19], we only investigate the probability of missed detection. In the simulation, we adopt a ZC sequence of length N_ZC as the pilot sequence, where N_ZC is an odd prime number [38]. Each MTD randomly chooses a ZC root index u to generate its pilot sequence with cyclic shift mN_CS, where u = 0, 1, 2, ..., N_ZC − 1, N_CS = floor(N_ZC/N_CB), and m is a positive integer used to identify different codebooks, m ≤ N_CB. Here floor(x) returns the largest integer that is less than or equal to x. Note that the pilot sequences of the two replicas of a specific packet use the same ZC root index u but different cyclic shifts, which indicate different codebooks. Additionally, carrier frequency offset (CFO) is not taken into consideration in the simulation, since the main frequency error in a GEO SAT-IoT network scenario is due to the satellite oscillator uncertainty, which can be estimated to within ±1 kHz [10]. Thus, by choosing the subcarrier spacing properly, the negative impact of CFO on ZC sequence detection can be ignored.
We compare the missed detection rates for different lengths of ZC sequences. As shown in Figure 6, pilot signals with longer sequences perform better than those with shorter ones. On the one hand, longer ZC sequences provide better autocorrelation performance, which makes correlation peak detection easier. On the other hand, a larger N_ZC provides more candidate root indices u, which decreases the probability that different pilot sequences with the same root index interfere with each other. Moreover, the missed detection rate rises with the traffic load. As the traffic load becomes heavier, pilot sequences generated from the same root index collide more frequently, which makes it difficult for the detector to find the correlation peak. In Figure 6, the probability of missed detection is below 10^−3 over the entire traffic load region for both N_ZC = 127 and N_ZC = 257, which satisfies the PLR target of 10^−3. Although a longer sequence achieves a lower missed detection rate, it costs too much overhead relative to the useful information in a packet, so it is suitable only for larger packet sizes. Thus, N_ZC = 127 appears to be a reasonable pilot sequence length.
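The pilot construction and correlation-peak detection described above can be sketched as follows. The sketch assumes the common odd-length ZC definition x_u(n) = exp(−jπ·u·n(n+1)/N_ZC), a cyclic shift step N_CS = floor(N_ZC/N_CB), and a single noiseless pilot with no channel; none of these details, nor the function names, are confirmed by the paper, and the root u = 5 is an arbitrary choice.

```python
import cmath


def zc_sequence(u: int, n_zc: int) -> list:
    """Root-u Zadoff-Chu sequence of odd length n_zc."""
    seq = []
    for n in range(n_zc):
        # Reduce the phase index modulo 2*n_zc to keep the argument small.
        k = (u * n * (n + 1)) % (2 * n_zc)
        seq.append(cmath.exp(-1j * cmath.pi * k / n_zc))
    return seq


def cyclic_shift(seq: list, shift: int) -> list:
    """y[n] = seq[(n + shift) mod N]."""
    n = len(seq)
    return [seq[(i + shift) % n] for i in range(n)]


def detect_shift(received: list, root_seq: list) -> tuple:
    """Return (lag, normalized magnitude) of the periodic
    cross-correlation peak between received and root_seq."""
    n = len(root_seq)
    best_lag, best_mag = 0, 0.0
    for lag in range(n):
        acc = sum(received[i] * root_seq[(i + lag) % n].conjugate()
                  for i in range(n))
        mag = abs(acc) / n
        if mag > best_mag:
            best_lag, best_mag = lag, mag
    return best_lag, best_mag


if __name__ == "__main__":
    n_zc, n_cb = 127, 6
    n_cs = n_zc // n_cb              # assumed shift step, floor(N_ZC / N_CB)
    root = zc_sequence(5, n_zc)
    pilot = cyclic_shift(root, 3 * n_cs)   # m = 3 identifies the codebook
    lag, peak = detect_shift(pilot, root)
    print(lag // n_cs, peak)         # recovered m and normalized peak
```

In this idealized setting, the detected lag divided by N_CS recovers the codebook index m. Pilots that share a root index but use different shifts produce separate peaks, which is why collisions on the same root and shift are the main source of missed detection.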

Conclusions
In this paper, a novel uplink grant-free asynchronous SCMA scheme dubbed AF-SCMA has been introduced for satellite-based IoT networks, together with its key characteristics. With the concept of flipped diversity, a packet is transmitted together with its flipped replica. By removing the interference chunks in a zigzag pattern, a specific packet can often be decoded successfully even though it collides with other packets encoded with the same codebook. The theoretical performance of AF-SCMA has been derived in detail in an iterative manner. Simulation results show that the analytical performance of AF-SCMA matches the simulated performance well in terms of throughput and PLR. For the target PLR = 10^−3, AF-SCMA achieves a higher throughput than S-SCMA and A-SCMA (without flipped replica), and it benefits from power unbalance, which further boosts its throughput. Moreover, a larger codebook set also improves the performance of AF-SCMA. However, SCMA codebook design is a difficult and complicated issue, which is left for future research. Finally, the design of the AF-SCMA pilot sequence has been investigated. According to the simulation results, ZC sequences of reasonable length can be detected reliably when used as pilot sequences. Overall, the proposed AF-SCMA reduces the negative effects of SCMA codebook collisions and provides remarkable performance with acceptable [E_b/N_0]_D, which makes it applicable to future satellite-based IoT networks, especially in massive machine-type communication scenarios.
Author Contributions: G.C. proposed the basic framework of the research scenario. P.L. put forward the innovative idea of AF-SCMA and modeled the problem. In addition, P.L. did the simulations and wrote the paper. W.W. gave some suggestions on the mathematical model and formula derivation.

Conflicts of Interest: The authors declare no conflict of interest.