Spectral Coexistence of QoS-Constrained and IoT Traffic in Satellite Systems

The flourishing of Internet of Things (IoT) applications, characterized by vast transmitter populations and the sporadic transmission of small data units, demands innovative solutions for the sharing of the wireless medium. In this context, satellite connectivity is an important enabler for all scenarios in which terminals are under-served by terrestrial communications and are thus fundamental for providing worldwide coverage. In turn, the design of medium access policies that attain efficient use of the scarce spectrum and can cope with flexible yet unpredictable IoT traffic is of the utmost importance. Starting from these remarks, we investigate in this work the coexistence of a quality of service (QoS)-constrained service with IoT traffic in a shared spectrum as alternative to a more traditional orthogonal allocation among the two services, with an eye on satellite applications. Leaning on analytical tools, we provide achievable rate regions, assuming a slotted ALOHA access method for IoT terminals and accounting for practical aspects, such as the transmission of short packets. Interesting trends emerge, showcasing the benefit of an overlay allocation with respect to segregating the resources for the two services.


Introduction
Massive machine-type communications (MMTC) and the Internet of Things (IoT) are attracting steadily growing research and industry attention, emerging as a fundamental component for next-generation wireless systems. Driven by a blooming number of IoT applications, this novel communications paradigm aims at serving vast populations of often low-power, low-complexity terminals that generate sporadic traffic in the form of short data packets. Examples of practical relevance span a wide set of scenarios, ranging from smart agriculture or industry, where sensors may collect data (e.g., temperature, pressure, and presence of chemical substances) and deliver status information to a common gateway, to environmental monitoring or asset tracking [1][2][3].
Support for this multitude of use cases is already provided by terrestrial networks in a number of well-established commercial products [4], e.g., LoRa [5][6][7], SigFox [8], Ingenu [9], as well as by standardized approaches, such as NB-IoT and LTE-M [10]. In parallel to this, satellite-based solutions have recently gained traction as a key enabler to provide global coverage to mMTC services [11]. Favored, among other factors, by the significant reduction in launch cost and by the use of off-the-shelves components, such as reprogrammable software defined radios (SDRs), a revived interest toward the deployment of low-Earth orbit (LEO) satellites has characterized the past few years. From this viewpoint, large constellations such as Starlink, OneWeb or the planned Amazon Kuiper are flanked by a growing number of smaller LEO networks focusing on specialized commercial services, e.g., [12][13][14]. Satellite IoT connectivity is one of the key scenarios included in the nonterrestrial networks (NTN) standardization efforts within the 3GPP ecosystem, and is aimed to become part of the standard already from Release 17.
The increasing interest toward mMTC has spurred significant research efforts to tackle a number of challenges that span the whole communications protocol stack. In particular, the intermittent transmission of short packets from IoT terminals calls for innovative approaches. At the physical layer, for instance, the design of the channel codes that operate efficiently over blocks in the order of a few tens to hundreds of bits and with possibly limited channel state information is fundamental [15]. Moreover, the grant-based medium access control policies encountered in traditional human-centric communication systems is largely ineffective for the traffic profile encountered in machine-type applications. As a matter of fact, the overhead needed to coordinate resource sharing in the presence of a massive population of sporadically active transmitters is highly inefficient. In this perspective, a flourishing line of research is emerging, with the proposal of a number of novel modern random access protocols [16][17][18][19]. These solutions lean on a joint design of coding and medium access, allowing to achieve spectral efficiencies comparable to those of coordinated schemes under a truly grant-free paradigm, and offer a promising way forward for next-generation mMTC systems. At the same time, random access is already at the core of current IoT communications, as variations of the basic ALOHA strategy [20] are employed by many widely used mMTC solutions [5,8,11].
From this standpoint, grant-free and coordinated access policies offer complementary characteristics, and are commonly employed side by side. Indeed, many communications systems host and provide support to a variety of use cases, ranging from sporadic and often lower-priority mMTC traffic to services with more stringent demands in terms of data rate, reliability and quality of service (QoS). This combination is typically achieved by assigning blocks of orthogonal resources (slices)-in time, frequency, or a combination thereof-to applications with distinct requirements. Relevant examples in this direction are the ETSI S-MIM standard for S-band mobile interactive multimedia [21], focusing on the satellite uplink, or the mobile communications standard 5G from 3GPP [22]. In the former case, the transmission of regular or high data rate traffic is served via demand-assignment procedures, whereas delivery of IoT messages can be attempted on dedicated resource blocks via a variation of spread spectrum ALOHA [23]. Similarly, in 5G, a large amount of the spectrum is dedicated to the enhanced mobile broadband (eMBB) traffic, while IoT messages are sent via the NB-IoT waveform in dedicated sub-bands or unused guard bands [24].
By construction, such approaches avoid interference between different services, easing the provision of proper QoS levels and simplifying waveform design. On the other hand, the reservation of orthogonal sets of resources may not be fully efficient in the presence of machine-type communications. In fact, the unpredictable nature of mMTC traffic inherently leads to significant load fluctuations, hindering the precise tuning of the amount of bandwidth to be allocated and often resulting in an either heavily congested or underutilized channel. From a different angle, instead, a controlled level of interference may be tolerated by non-mMTC services without violating the target QoS requirements. Indeed, information theoretic results on the multiple access channel show that spectrum sharing and interference cancellation allow to achieve the corner points of the capacity region, suggesting that the coexistence of the two services over a shared band may be beneficial.
Starting from these remarks, we explore in this work the possibility to serve different types of traffic over the same set of resources concurrently. Specifically, we tackle the need to provide uplink support to traffic with a specific QoS target as well as to enable transmission for a best-effort service to IoT devices. Leaning on information theoretic tools, we derive the maximum aggregate rate that can be granted to mMTC traffic without violating the rate and reliability requirements of the coexisting service. We then compare the performance of such an overlay configuration to a benchmark setup where resources are orthogonally split, highlighting promising gains.

Related Work
The first research direction considered spectrum coexistence for geographically spaced systems [25,26]. More recently, the idea of sharing bandwidth among non-homogeneous services has started to receive some attention in the context of 5G cellular systems, aiming to go beyond the inefficiencies of orthogonal slicing. In [27], the possibility to multiplex eMBB and ultra-reliable low-latency communications (URLLC) was studied following an information theoretic approach, pinpointing the potential of the idea in a multi-cell cloud radio access network (C-RAN). Additional insights are provided in [28], where the two aforementioned services are flanked by mMTC traffic. Among other scenarios, the work studied a heterogeneous non-orthogonal multiple access solution for the OFDM 5G uplink, where resources are shared between eMBB and mMTC with different reliability requirements. Assuming packets that are long enough to justify an asymptotic informationtheoretic analysis, the authors explored via numerical solutions some key trade-offs in terms of rates that can be granted to the distinct services, showing how an overlay allocation can be beneficial in certain regimes. Providing efficient connectivity while guaranteeing the target QoS among several concurrent services was also thoroughly investigated above layer two, e.g., [29]. Finally, a survey on inter-system spectrum sharing focusing on services with equal access rights to the resources can be found in [30].

Main Contributions and Structure of the Paper
Within this line of study, we aim to shed light on the potential of a non-orthogonal resource distribution, as well as to trigger additional research on the topic in a satellite-IoT context. In particular, in our work, we do the following: • We characterize some fundamental trends for the performance of an overlay allocation of a QoS-constrained service and of mMTC traffic, composed of short packets. Based on practical considerations, we assume the receiver to first attempt decoding the former traffic. In the case of success, its contribution is removed (interference cancellation), and retrieval of the underlying mMTC packets is attempted; • In this setup, we derive the exact analytical expressions for the maximum aggregate rate that can be granted to mMTC traffic as a function of the requirements in terms of the rate and packet error rate set for the QoS-constrained service; • Going beyond the approach followed in [28], the analysis relies on information theoretic arguments to capture the impact of short packets transmission, as well as the impact of the length of frames in which the uplink communication is organized.
The remainder of this paper is structured as follows. After introducing the system model in Section 2, we provide in Section 3 the initial insight by comparing the performance of the overlay and orthogonal allocations in a setting characterized by packets long enough to justify the use of asymptotic information theoretic tools, considering both an ergodic and a non-ergodic case. The study is then complemented in Section 4 by exploring the impact of transmission of short packets commonly encountered in mMTC applications, relying on the normal approximation [31,32]. The numerical results are presented and discussed in Section 5, highlighting some fundamental trends and comparing the effectiveness of orthogonal and overlay allocations in all of the considered cases. Finally, Section 6 draws the conclusions, offering some relevant open issues and future research directions.

Notation
We denote random variables (vectors) by uppercase (bold) letters, while we refer to their realizations in lowercase, e.g., X and x; X and x. The probability mass function (PMF) of a discrete r.v. is denoted as p X (x) := P{X = x} and that of a discrete random vector as p X (x) := P{X = x}. Moreover, we indicate the conditional PMFs P{X = x | Y = y} as p X (x|y) and P{X = x | Y = y} as p X (x|y). Finally, the expectation operator is denoted as E[·], while I k is the identity matrix of size k.

System Model and Preliminaries
Throughout our discussion, we focus on the uplink of a wireless satellite system which supports two distinct services, labeled as S a and S b . The former grants specific requirements in terms of data-rate and error probability, and is embodied in our setting by a single user, s a , which sends data to the receiver under an average power constraint, ρ a whenever it is granted access to the medium. On the other hand, S b foresees that a large number of terminals share the wireless channel under a slotted ALOHA contention policy [33] to attempt packet delivery in a best-effort fashion, e.g., for mMTC. Specifically, over any time slot allotted to S b , a variable number of users U access the channel, each transmitting toward the receiver for the whole slot duration with average power ρ b . We note that, while the single user for the transmission of a data packet is subject to an average power constraint ρ b , the overall power injected in the system by service S b varies both with the number of terminals accessing the channel over a single slot and the number of allocated slots. Such a working assumption is representative of the practical implementations of machine-type communications. Following a common approach, U is modeled as a Poisson r.v. of parameter λ, independent and identically distributed across slots, so that the following holds: The assumption is especially representative for traffic generated by a multitude of low duty cycle nodes monitoring and/or reporting data generated by heterogeneous systems and phenomena [34], as often encountered in mMTC satellite systems. Furthermore, we consider for S b no feedback on the outcome of sent packets nor retransmission policies. This approach is in line with the best-effort nature of many IoT applications, where packets losses may be tolerated, awaiting the delivery of a successive update from the monitored devices.
To characterize the performance of the system, we tackle two operation modes, considering an orthogonal and an overlay allocation for S a and S b . In both cases, having in mind the uplink of a satellite system, we assume for the link between a transmitter and the receiver a line-of-sight connection with perfect power control. Accordingly, the channel coefficient at the receiver is a constant and known value for all users, assumed to be unitary for the sake of simplicity, and all users of a given service are received with the same power level.

Orthogonal Allocation
In this configuration, resources are split between services so that no mutual interference between S a and S b arises. Without loss of generality, we assume orthogonality to be achieved in time and focus on the setup exemplified by Figure 1a (the reported analysis and results would hold for an orthogonal allocation in the frequency domain as well). Time is divided into successive frames of equal duration. Within each frame, a fraction α of the time is granted exclusively to s a for data delivery. During this period, the channel output at the receiver over the -th channel use takes the following form: where X ∼ CN (0, ρ a /α) is the complex symbol transmitted by the user, and W ∼ CN (0, σ 2 ) is a zero mean, σ 2 variance, circularly-symmetric complex Gaussian additive noise component. In view of the fraction α of resources available for transmission, the average signal-to-noise ratio (SNR) at the receiver is thus, as follows:  As for users of the best-effort service S b , channel access is only permitted during the remaining fraction (1 − α) of the time. To instantiate a slotted ALOHA policy, the corresponding portion of the frame is assumed to be divided in slots of equal duration, each allowing the transmission of a single packet of S b . Denoting by n s the number of channel uses over a slot, the input-output relation over the -th slot conditioned on having U = u users concurrently transmitting can be expressed as follows: In (2), X (b,j) is the codeword transmitted by the j-th user of S b active in slot , whose n s components are modeled as i.i.d. circularly-symmetric normal r.v. with zero mean and variance ρ b , i.e., X (b,j) ∼ CN (0, ρ b I n s ). In turn, W ∼ CN (0, σ 2 I n s ) is the additive Gaussian noise, leading to the incoming signal vector at the receiver, Y ∈ C n s . Note that when U = 0, no user of S b transmits over the slot, and the summation in (2) brings no contribution so that solely the noise component W is observed. As discussed in Section 1, this event, occurring with probability e −λ , represents a waste of resources, as bandwidth granted to mMTC is not employed to attempt data delivery.

Overlay Allocation
When the system is operated in overlay mode, both services are allowed to access the channel concurrently and without time restrictions, as illustrated in Figure 1b. Such a policy allows S a and S b to enjoy the whole share of resources at the cost of suffering from mutual interference, due to the loss of orthogonality. In this case, the whole frame is assumed to be split in M slots, each consisting of n s channel uses. Using all the resources, s a transmits then a codeword X (a) of size n s M channel uses. In turn, users of service S b can send a packet over any of the M available slots. To properly capture the signal model, although s a encodes its message across all the available channel uses, it is convenient to express X (a) as the concatenation of the M sub-codewords transmitted over the slots as follows: where X (a) ∼ CN (0, ρ a I n s ), = 1, . . . , M. Following this notation, the channel output at the receiver over the -th slot conditioned on having U = u users of S b access the channel takes the following form: A packet of a best-effort user is thus always affected by the interference coming from S a , and possibly by other transmissions of S b . Conversely, any sub-codeword sent by s a is received interference-free with probability e −λ , i.e., if no mMTC user has transmitted over the corresponding slot. It is worth stressing that the number of interfering packets from S b is not known a priori, so the coding rate of s a cannot be dynamically adapted on a slot-by-slot basis.
For both the orthogonal and overlay configurations, we are interested in the maximum sum rate that the system can offer to the best-effort service while granting the QoS requirements of S a . Specifically, denoting by R a the rate in bits per channel use of s a and by R b the average aggregate number of bits per channel use decoded at the receiver for S b , we aim at deriving the rate region given by all pairs (R a , R b ) that allow s a to experience an error probability that is less than or equal to a target value.

Asymptotic Analysis
In order to gather preliminary insight on the performance of the system, we first consider an asymptotic setting in which the codewords transmitted by s a , as well as those by users of S b over a slot, can be made long enough to approach the classical information theoretic results.

Orthogonal Allocation
When resources are orthogonally split among services, s a delivers data over an additive white gaussian noise (AWGN) channel with SNR γ a defined in (1), yet is allowed to transmit only for a fraction α of the time. In this setting, the user can then communicate with vanishing small error probability for any rate as follows: Let us focus instead on the share granted to S b , and indicate as R b, the communication rate achieved by the service over the -th slot. If no user accesses the channel (i.e., U = 0), then clearly R b, = 0. Conversely, if a single transmission is performed (U = 1, with probability λe −λ ), data are sent over an AWGN channel with capacity as follows: Finally, when more than one user becomes active over the slot, a collision takes place, and the channel output at the receiver takes the general form of (2). As will be discussed later (see Remark 1), we assume terminals of S b to encode information at a rate approaching C b , as would be done in the absence of interference. Accordingly, in the event of a collision, the actual channel capacity falls below the employed rate, and we regard all packets sent over the slot of interest to be lost. In other words, following a common modeling approach for random access schemes (e.g., [20]), collisions are of a destructive nature, and R b, = 0 whenever more than one user accesses the channel concurrently (U > 1). In summary, (i) for U = 1, R b, = 0, and (ii) for U = 1, R b, = C b . Combining these observations, the average rate for S b in bits per channel use over a slot readily follows: Equation (6) clarifies how, for any given power level ρ b , the slotted ALOHA best-effort channel shall be operated at a load λ = 1 (pkt/slot) in order to maximize the average aggregate rate. Recalling that only a fraction (1 − α) of the resources can be leveraged, the achievable sum-rates for S b are then characterized as follows: Remark 1. The widely-employed assumption of destructive collisions reflected in (6) is of practical relevance for the considered mMTC setting. Indeed, given that the number of competing terminals over a slot cannot generally be predicted before transmission, a coding scheme that effectively and dynamically takes into account interfering packets may be difficult to develop and goes beyond the complexity of the available terminals. On the other hand, a backoff on the transmission rate to increase resiliency to interference may not be efficient in lightly loaded channels, where sporadic access may lead to having many slots without contention, resulting in significance performance loss for the tight link budget configurations that are typical of mMTC links. Finally, we note that many machine-type applications attempt the delivery of updates that are often repeated or very much correlated in content over time (e.g., reporting of sensed data), positively trading off higher packet loss rates due to interference for a larger aggregate channel sum rate achieved in the absence of coordination among nodes.

Overlay Allocation
When the system is operated in overlay configuration, the incoming signal at the receiver is the superposition of transmissions performed by both services. Throughout our study, we assume decoding to start from the message sent by s a , treating the interference component of S b packets as noise. In the case of successful decoding, ideal interference cancellation is performed, removing the contribution of s a from the overall waveform and presenting to the receiver the M-slot frame populated by packets of S b alone for further processing. Ideal interference cancellation of the packet of user s a can be regarded as a reasonable assumption, especially taking into account that its transmission can enjoy a large number of channel uses, and the data-aided channel estimation can be properly tuned. Moreover, we also note that non-ideal interference cancellation would yield a minor impact on the model by effectively increasing the noise power suffered by the transmissions of service S b since both the signal and the noise are modeled as Gaussian distributed r.vs..

Remark 2.
The assumption on the order of decoding for the supported services stems from two practical considerations. First, as will be further discussed in Sections 4 and 5, mMTC traffic is typically composed of very short packets-often in the order of a few hundreds of bits-confining pilot symbols to few channel uses per transmitted message in order to avoid excessive overhead. Accurate channel estimation needed for decoding packets of S b , especially in the presence of interference from s a , may thus not be viable. On the other hand, the longer data units that characterize S a render the cost of stronger pilot sequences manageable, and, together with its more regular and predictable traffic patterns, can lead to estimates of the channel parameters that are good enough to retrieve information and perform accurate interference cancellation. In addition to this, the choice of initiating decoding from S a is driven by QoS arguments. It is indeed reasonable to overlay best-effort mMTC traffic to an existing service only as long as the latter does not experience a significant loss in performance. Along this line of reasoning, the receiver shall be able to decode S a even in the presence of underlying interference, i.e., treating the whole of S b traffic as noise.
Under this assumption, s a experiences the block-interference channel described in (3), with the interference level remaining constant over a coherence period of one slot duration, and changing to an independent realization during the subsequent n s channel uses. Specifically, conditioned on the number U of users of S b accessing the channel over the -th slot, the r.v. describing the signal-to-interference and noise ratio (SINR) seen at the receiver for s a takes the following form: Within this framework, to better grasp the effect of the bursty overlaid mMTC traffic, we first analyze an ergodic setting, later extending our results to the more practicallyrelevant non-ergodic setup.

Ergodic Case
When the number of slots within a frame is sufficiently large (i.e., M → ∞), the codeword of s a experiences, in the limit, all the possible interference values, making it possible to define the ergodic channel capacity.
In this case, a vanishingly small error probability can still be granted to the user for any rate R a < C (e) a . Clearly, from a system design perspective, the overlay operation mode becomes meaningful-and possibly convenient-if the QoS requirements of S a in terms of data rate are met, even in the presence of interference. We are thus interested in configurations for which the achievable rate in overlay matches the one of an orthogonal allocation, i.e., for which, leaning on (4) and (8), In particular, for any value of α employed to configure a dedicated-resource allocation, (9) imposes a constraint on the tolerable level of interference from S b in overlay mode, limiting the average number of transmissions per slot and thus the intensity of the mMTC traffic. Denoting this value as λ m , the message of s a is retrieved and its contribution on the incoming waveform is perfectly canceled with probability approaching one for any channel load λ < λ m of service S b . We note that, while a closed-form expression of λ m is elusive due to the transcendental nature of (9), its value can easily be determined numerically.
If the system is operated satisfying this constraint, the receiver can then process packets sent by mMTC users via slotted ALOHA after having performed interference cancellation, and the average aggregate rate for S b is once again captured by (6). In order to maximize performance while meeting the requirements of the QoS-driven user, S b shall thus be operated at a channel load as follows: We recall indeed that values of λ larger than 1 [pkt/slot], even if tolerable by the QoS requirements of S a , would be ineffective from a throughput perspective for the mMTC traffic in view of the detrimental effect of destructive collisions.
Combining these remarks, and recalling that both services have now access to the whole set of resources, the rate region (R a , R b ) for the ergodic case finally evaluates to the following:

Non-Ergodic Case
Albeit insightful in pinpointing a fundamental trade-off between the achievable rate for S a and the amount of mMTC traffic that can be served, the ergodic setup fails to capture some important aspects of practical relevance. In particular, the assumption of having an asymptotically large number of slots over which user s a can encode its message can become highly inaccurate for typical systems, characterized by frames of a length of few hundreds of slots, e.g., [21]. In such conditions, each codeword of service S a experiences a finite number M of interference realizations, and a vanishingly small error probability can no longer be granted. The QoS requirements of the service are therefore more appropriately characterized by a rate-reliability pair, where R a is complemented by a tolerable error probability for the sent codeword. In the non-ergodic setting, the latter is captured using information theory tools by the outage probability P out . Specifically, for the block-interference channel under study, the two quantities are related as follows: where C a, denotes the instantaneous capacity for s a over the -th slot, and can be expressed as a function of the number U of S b users transmitting over the slot as follows: Due to the discrete levels of interference that can be experienced, the r.v. C a, takes values in the alphabet as follows: , u ∈ N and its PMF can readily be derived from (11), leading to the following: where the final expression resorts to the ancillary quantity as follows: Leaning on this, let us define for compactness the r.v. as follows: taking values in the set Observing that the instantaneous capacity values over different slots are i.i.d., the PMF of V can simply be computed as the M-fold convolution of (12), to obtain for any v ∈ A V the following: Accordingly, an exact expression of the outage probability in (10) can finally be derived as follows: This result offers a useful system design tool. Indeed, for any pair (R a , P out ), (12) and (13) provide an exact characterization of the maximum channel load λ m that can be granted to mMTC traffic without violating the QoS requirements of S a .
Taking such a constraint into account, let us now focus on S b , aiming to derive the aggregate rate R b that can be achieved for the maximum arrival intensity sustainable by S a . In this perspective, we observe that two conditions have to be met for an mMTC packet to be retrieved over a slot in a non-ergodic setup: (i) the packet of s a is decoded, and (ii) no other user of S b concurrently accesses the channel over the slot of interest. Note that condition (i) is consistent with the system model assumptions. Indeed, should the message of s a not be decoded, its interference contribution cannot be removed, affecting any underlying mMTC packet. Recalling the discussion of Remark 1, this would prevent the successful retrieval of information for users of S b . Furthermore, let us denote by W the r.v. taking values in {0, . . . , M} and counting the number of slots characterized by the transmission of a single user (named also singleton slots) of S b , which, for the Poisson traffic under study, follows a binomial distribution of parameters M and λe −λ . Following this notation, for any arrival intensity λ, the aggregate rate of S b can be conveniently expressed as follows: (14) where, recalling (10), the term within square brackets takes into account how all w singleton slots bring a contribution of log 2 (1 + ρ b /σ 2 ) bits per channel use if, and only if, s a is not in outage. To complete the calculation, we therefore need to derive the outage probability for S a conditioned on the r.v. W. To this aim, it is useful to rearrange the conditional expression as follows: where the first addend on the right-hand side of (15) accounts for the sum of the instantaneous capacity of all w slots with a single transmission of service S b , whereas the summation takes into consideration the contributions of all other M − w slots (i.e., those with U = 1). In turn, the PMF of the instantaneous capacity C a, over a nonsingleton slot can be directly derived from (12), to obtain for any value in the alphabet where the normalization factor readily follows from the Poisson distribution of the mMTC traffic. Recalling now that also the r.v. C a, over non-singleton slots are i.i.d., the PMF of the sum of M − w such values can be once more derived by taking the (M − w)-fold convolution of (17), allowing to compute (16). Leaning on this result, and plugging the binomial distribution of W into (14), R b can be finally computed for any value of λ as follows: The presented approach provides thus the sought pairs of sustainable rates. As discussed, the maximum sustainable arrival intensity λ m for S b can be computed from (13) for any target QoS requirements (R a , P out ). In turn, mMTC traffic can be operated effectively at any channel load lower than λ * = min{1, λ m }, with the corresponding aggregate rate R b obtained by evaluating (18) for λ = λ * .

Finite Blocklength Analysis
Going beyond the asymptotic setting discussed in Section 3, we complement our study by delving into a more practical scenario that closely relates to mMTC applications, where packets transmitted by users can be in the order of few hundreds bits. To capture this aspect we leverage tools of finite-length information theory, and focus in particular on the normal approximation [31,32] to characterize the rates achievable by S a and S b .

Orthogonal Allocation
Let us first focus on the orthogonal configuration, and denote by n the (finite) number of channel uses available over the whole transmission frame. Following the notation of Section 2, the transmission of s a spans n a := αn channel uses, and takes place over an AWGN channel of SNR γ a defined in (1), i.e., we implicitly focus on values of α such that αn ∈ N. In this case, a vanishingly small error probability cannot be granted, even in the absence of interference, and the performance of S a is properly described by the couple (R * a , P e ), where R * a denotes the maximum rate that can be supported for a codeword error probability P e . Specifically, the two quantities are related as follows [31]: where the channel capacity C a and channel dispersion V a are defined by the following expressions: and Q −1 (·) denotes the inverse Q function. If we now recall that the user s a transmits only for a fraction α of the time, and approximates (19) by disregarding the terms of order O(log n a /n a ), the set of rates achievable by S a for a target error probability P * e can be characterized as follows: Consider now service S b . In line with what was discussed in Section 3, collisions are once more regarded as destructive, so slots in which more than one user transmits lead to the loss of all the packets sent. In the setup under study, however, a non-vanishing error probability is to be expected for any transmission rate R b even over a singleton slot, in view of the finite number n s of available channel uses. Accordingly, introducing the channel dispersion, and recalling the channel capacity C b in (5), the maximum rate R * b that can be granted to S b under a codeword error probability can be approximated as follows: In terms of system design, (21) allows then to tune the rate-reliability requirements of S b . From this standpoint, we assume within our study mMTC traffic to be of the best-effort nature, i.e., without stringent requirements in terms of reliability, and select the transmission rate in order to maximize the attained spectral efficiency. Specifically, we consider S b to be operated at an error rate * that maximizes the information bits per channel use retrieved over a singleton slot: Finally, recalling that mMTC traffic can enjoy a share (1 − α) of the resources and that the fraction of singleton slots is bounded by e −1 (achieved for λ = 1), the set of aggregate rates achievable by S b in the orthogonal configuration can be described as follows:

Overlay Allocation
As discussed in Section 2, when the system is operated in overlay mode, the total number of channel uses, n, available over a frame is split into M time slots of n s channel uses each, i.e., n = Mn s . Let us denote by U = [U 1 , . . . , U M ] the random vector whose components describe the number of users of S b that transmit over each of the slots. Recalling the i.i.d. Poisson distribution of the r.v. U , the joint PMF of U can be expressed for any arrival intensity λ as follows: Consider now the transmission of s a , and condition on a specific realization U = u. In this case, each portion of the sent message experiences a distinct (and independent) interference level, allowing to model the problem as the transmission of an n channel uses codeword over M parallel AWGN channels. Following the approach presented in [35] in the context of block-fading, the relation between maximum sustainable rate R * a and codeword error probability P e we can then be approximated as follows: where and we introduce for convenience the SINR experienced by s a over the -th slot as follows: Solving (24) with respect to P e , we can easily obtain the error probability when a given rate R a is employed by s a and a specific realization of the interference pattern U = u is experienced: We now observe that, in view of the random nature of the interference of service S b , it is more meaningful to evaluate the average error probability P e as follows: (26) by employing (23) and (25). The QoS requirements of S a are properly specified in this case by the pair (R a , P e ).
In this perspective, the result in (26) offers a relevant system design tool. Indeed, for any targeted rate R a , an inspection of the equation allows to derive the maximum traffic intensity λ m of S b that can be supported without violating the average error probability P e .
Taking the lead from this, let us then focus on S b for which we need to compute the maximum aggregate rate that can be achieved under the constraint on λ m . In this case, three conditions have to be met for a mMTC message to be retrieved: (i) the message of s a has to be decoded and its interference canceled; (ii) the packet of S b has to be sent over a singleton slot; and (iii) the n s -channel use codeword has to be decoded. To account for all these conditions, we follow an approach similar to the one discussed in Section 3.2.2, and condition on the number of singleton slots W that take place over the observed frame, writing the aggregate average rate achieved for a traffic intensity λ as follows: where we assume that the coding scheme employed by S b has been tuned following (22), so that each of the w singleton slots brings a contribution of R * b ( * )(1 − * ) information bits per channel use. Recalling that W follows a binomial distribution of parameters (M, λe −λ ), the complete evaluation of (27) simply requires to specify the joint PMF of the number of users of S b that transmitted over each of the M slots, conditioned on having W = w singleton ones. The distribution can be computed by considering two cases. First, for any vector u whose number of components with value 1 (i.e., the number of singleton slots in the considered frame realization) is different from w, we clearly have p U (u | w) = 0. For any other u, instead.
where the numerator follows from the i.i.d. Poisson distribution of the number of transmissions over a slot, whereas the normalization factor accounts for the probability of having exactly w singleton slots out of the available M.
In conclusion, the set of admissible rates for S b can be computed by taking into account the constraint λ m imposed by the QoS requirements of S a , and by evaluating (27) for all traffic intensities smaller than λ * = min{1, λ m }.

Numerical Results
In this section, we present and discuss some numerical results obtained from the general framework developed in Sections 3 and 4. Bearing in mind the uplink of a satellite IoT system, we target two relevant scenarios. In the former, packets of both service S a and S b reach the receiver with the same power level so that ρ a /σ 2 = ρ b /σ 2 = 0 dB. This setting reflects the coexistence of two services whose transmitters have a similar hardware (amplifier, antenna) and thus are received with comparable signal strength. In the latter, the QoS-constrained traffic of S a is transmitted with higher power, and we consider the configuration ρ a /σ 2 = 10 dB, ρ b /σ 2 = 0 dB. The second setting, instead, assumes that the QoS-constrained service features transmitting units equipped with a better and possibly more costly hardware, thus resulting in a more favorable link budget. The selected configurations are in line with LEO satellite systems targeting IoT applications, e.g., [36].
As a starting point, let us focus on the asymptotic, ergodic setting introduced in Section 3. In this case, a vanishingly small error probability can always be granted to S a , whose QoS requirements are solely specified in terms of a target data rate R a . The corresponding results are reported in Figure 2. The two solid lines represent the boundary of the rate pairs achievable by orthogonal and overlay allocations respectively, according to (4), (7) and (8). The shadowed light-red area comprises rate pairs achievable only with an overlay allocation, whereas the shadowed light-blue area denotes rates that are nonachievable by any of the two schemes. The two SNR scenarios, i.e., ρ a /σ 2 = ρ b /σ 2 = 0 dB and ρ a /σ 2 = 10 dB, ρ b /σ 2 = 0 dB are depicted in Figure 2a,b, respectively. In both configurations, irrespective of the target rate of user s a , an overlay allocation is always beneficial to service S b in terms of maximum achievable rates. Incidentally, we note that a slightly different trend was observed in [28], where, even in the asymptotic setting, orthogonal allocation may be beneficial for a small rate region. The discrepancy stems from two main factors. First, different channel models, i.e., AWGN with perfect power control vs. Rayleigh fading, are considered. Second, distinct decoding condition on the mMTC service are assumed. Indeed, while in our case we consider destructive collisions, ref. [28] relies on capture effect and interference cancellation (IC) also for service S b data units.
From Figure 2, we can also infer that, when the two services are operated in overlay, S b can achieve its maximum rate of e −1 ∼ = 0.368 (bit/ch. use) up to R a ≤ 0.68 (bit/ch. use) for ρ a /σ 2 = 0 dB, and up to R a ≤ 2.76 (bit/ch. use) for ρ a /σ 2 = 10 dB. In other words, increasing the target rate for the QoS-constrained service S a does not impact service S b in this region. Especially for the setting of Figure 2b, there is a very large range of rates for service S a , where the IoT traffic is limited by the poor performance of a slotted ALOHA access method. Indeed, the channel code protection of the data unit of user s a could allow a larger channel traffic of service S b , beyond λ = 1, but cannot be reaped due to the limitation in the medium access. Such a remark hints at how advanced alternatives relying on packet repetition, e.g., [37], can be beneficial to expand the achievable rate region, fostering the need for additional research in this direction. Finally, for R a > 0.68 (bit/ch. use), or R a > 2.76 (bit/ch. use) the maximum rate achievable for service S b sees a larger and steeper degradation with respect to the orthogonal allocation as R a → log 2 1 + ρ a /σ 2 . As a consequence, more caution on the tuning of the traffic intensity for service S b shall be devoted for such rates.   Figure 2. Asymptotic ergodic rate regions for an orthogonal and overlay allocation of resources of services S a and S b . Two scenarios are considered: ρ a /σ 2 = ρ b /σ 2 = 0 dB and ρ a /σ 2 = 10 dB, Let us now take a further step and consider the practical constraint imposed by having a finite number of slots within the communications frames (Section 3.2.2). Under these conditions, the transmission of service S a fails to experience all possible interference levels from service S b , and a non-ergodic setting has to be considered for the overlay allocation. The corresponding results for a number of time slots in the set M ∈ {25, 100, 200} are shown in Figure 3 together with the ergodic benchmark as reference. The set of time slots considered is in line with the literature on random access targeting satellite uplink scenarios, e.g., [37,38]. For our discussion, we set the target (average) codeword error probability for service S a to 10 −3 , and focus on the ρ a /σ 2 = ρ b /σ 2 = 0 dB scenario. Similar trends were also found for the unbalanced SNR scenario, and are not reported here for the sake of compactness. The impact of finite-length frames is clearly visible already for a moderately large number of slots, and the reduction of the achievable rate region becomes even more pronounced as the system operates with smaller values of M. For example, if user s a targets a rate of R a = 0.8 (bit/ch. use), the maximum achievable rate for service S b is reduced by ∼51% with respect to the ergodic setting, when M = 25. The rate R b contraction becomes even more relevant for larger values of R a . Conversely, increasing the number of time slots to 100 or 200 mitigates the trend. For the same target rate of R a = 0.8 (bit/ch. use), the maximum achievable rate for service S b is reduced by only ∼18% with respect to the asymptotic setting, when 200 slots are considered. On the other hand, it is relevant to remark that, also in the non-ergodic setting, there exists a significant range of rate values for service S a that allows to operate S b at the maximum aggregate rate. Such a result confirms the potential of the overlay approach.  . Asymptotic non-ergodic rate regions for overlay allocation. We fix ρ a /σ 2 = ρ b /σ 2 = 0 dB. The number of time slots are in the set M ∈ {25, 100, 200} and the target outage probability for service S a is P * out = 10 −3 . The asymptotic ergodic rate region for the overlay allocation is also provided for reference.
We also investigate the trends under finite codeword length in Figure 4. In this case, not only the time slots, but also the number of channel uses per time slot are finite and we thus rely on the analysis provided in Section 4. As for the previous scenario, we set the target average codeword error probability of service S a to 10 −3 , and consider ρ a /σ 2 = ρ b /σ 2 = 0 dB. The set of time slots is M ∈ {25, 100, 200} and the set of channel uses per time slot n s ∈ {100, 1000}. The latter choice is representative of practical mMTC applications, e.g., LoRa [5] and SigFox [8], which enable the transmission of payloads of up to 96 and 2000 bits, respectively. Accordingly, the set of total channel uses in the orthogonal allocation evaluates to n ∈ {2500, 10,000, 20,000, 25,000}. In the main plot of Figure 4, the orthogonal allocation and the overlay allocation for M ∈ {25, 100, 200} and n s = 100 channel uses per time slot are compared. As we can observe, increasing the number of time slots has a more beneficial impact on the overlay allocation than on the orthogonal one. In the former case, the increase in the number of time slots allows supporting a larger channel load for service S b for the same target outage probability. In the latter case instead, it only impacts the correction term in (20) with respect to capacity, and thus results in a minor benefit on the achievable rate pairs. It is also worth noting that, as expected, the finite number of channel uses strongly affects the maximum achievable rate of service S b , reducing it by ∼18% with respect to the asymptotic case. More interestingly, in contrast with what was discussed in the asymptotic setting, a region where an orthogonal allocation is superior to the overlay configuration emerges. Focusing on the M = 25 case, such a region is well highlighted in the subplot, where both scenarios with n s ∈ {100, 1000} are depicted. Although rather limited and only for rates of service S a larger than 0.8 (bit/ch. use), such inversion suggests that reserving a portion of time for the IoT service alone is beneficial instead of letting the two services compete completely.  In order to further investigate this aspect, we provide in Figure 5a different angle on the considered results. In particular, we analyze the ratio of the maximum rate achievable by service S b with an overlay allocation to the one obtained in the orthogonal case as a function of R a . Values larger than 1 thus identify regions of R a where an overlay allocation can outperform the orthogonal one, so that a more aggregate mMTC rate is achieved while granting the same performance to the QoS-constrained service. To highlight the impact of short-packet transmissions, we compare the asymptotic, ergodic setting with finite length scenarios. In the latter case, we set the target average codeword error probability of service S a to 10 −3 . Let us focus first on the setting ρ a /σ 2 = ρ b /σ 2 = 0 dB. As already shown in Figure 2, under the asymptotic, ergodic setting, the overlay allocation is always beneficial. However, Figure 5a reveals the presence of an optimal operating point at R a ∼ = 0.92 (bit/ch. use) for which the benefit achieved by the overlay allocation is maximized. A similar trend can also be observed for both M = 100 and M = 25, although for progressively smaller values of rate R a . Moreover, the improvement reduces as well with the number of available time slots, going from a peak 87.5% increase in rate R b for the overlay allocation with respect to the orthogonal one in the asymptotic setting, to a 48.7% improvement for M = 100 and 32.7% for M = 25. The plot also confirms that, in practical finite-length setups, there exist values of R a for which an orthogonal allocation is convenient (i.e., the ratio falls below 1). Nonetheless, such a region drastically reduces by increasing the number of slots available over a frame. Finally, the effect of the number of channel uses appears to have a minor impact. Taking the lead from this, we explore in Figure 5b the configuration ρ a /σ 2 = 10 dB, ρ b /σ 2 = 0 dB, focusing only on n s = 100. Increasing the SNR enjoyed by the QoS-constrained service drastically increases the advantage perceived by the IoT traffic when an overlay allocation is adopted. Rate exceeding a six-fold improvement for the IoT traffic can be achieved in the asymptotic ergodic setting, while, even for as few as 25 slots, more than a two-fold increase in R b is expected when in overlay allocation. Interestingly, the region where the orthogonal allocation is superior for the equal SNR scenario appears to vanish in this scenario. asymptotic ergodic 25 sl. 100 ch. uses 100 sl. 100 ch. uses (b) ρ a /σ 2 = 10 dB, ρ b /σ 2 = 0 dB. Figure 5. Ratio of the maximum rate achievable by service S b with an overlay allocation (R b overlay) over the one with an orthogonal allocation (R b orth.) as a function of R a . Both the ergodic asymptotic setting and the finite length scenarios are presented. In the latter, we set the target average codeword error probability of service S a to 10 −3 for both SNR setups. M = 25 and M = 100 slots together with n s = 100 and n s = 300 channel uses are investigated for ρ a /σ 2 = ρ b /σ 2 = 0 dB, while only n s = 100 channel uses for ρ a /σ 2 = 10 dB, ρ b /σ 2 = 0 dB are shown.

Conclusions and Outlook
In this paper, we investigated the potential of letting two services, a QoS-constrained (S a ) and a mMTC (S b ), share a common spectrum by overlaying their transmissions in an AWGN scenario modeling the uplink of a satellite communication system. The receiver attempts decoding of S a and, if successful, removes its contribution by means of IC, possibly allowing to retrieve data units of the mMTC traffic transmitted with a slotted ALOHA policy. Leveraging analytical tools, we have shown that an overlay allocation is beneficial in most situations, compared to a more traditional orthogonal allocation among the two services. Starting with an asymptotic scenario, where both the codewords and the number of time slots are very large, we delved into a non-ergodic setting (finite number of time slots) and the more practical finite-length regime (both time slots and codewords are finite). Achievable rate regions and expected gains for the IoT aggregate rate when the overlay allocation is adopted are presented for all scenarios. Furthermore, a rate tuple (R a , R b ) maximizing the improvement on the aggregate rate for the mMTC service is identified, showcasing a possible optimal operating point for the overlay system.
The presented work aims at stimulating further research in the context of the spectral coexistence between mMTC and QoS-constrained traffic in satellite scenarios. The possibility to upgrade the medium access policy of the mMTC service to more advanced solutions and exploring the benefits of modern random access schemes is an interesting direction. For example, the use of repetition-based solutions, e.g., [37], may unleash further benefits of the overlay allocation for low enough rates of the QoS-constrained service. In particular, when strong forward error correction is adopted on service S a , the mMTC service is not able to fully exploit it since high channel load values are detrimental in slotted ALOHA (cf. Figures 2 and 4, for example). A repetition-based scheme instead, can reap the rewards of a stronger interference rejection in S a (lower rate) by heavily loading the physical layer with packet copies. Furthermore, we focused in this paper on perfect power control, i.e., all terminals of service S b are received with the same power. In practical scenarios, in turn, fading and topology trigger variability in the received power levels so that the capture of packets is viable at the receiver. The impact of this aspect is also worth exploring, as it may trigger other relevant benefits and trade-offs, all the more so when coupled with IC, becoming especially relevant in the context of modern random access policies for IoT. The decoding order of services-first S a , then IC and subsequently S b -can also be further investigated when power variability is present, along the lines of [28]. Finally, the appli-cability of the overlay allocation shall be investigated in a real-world scenario, entailing details such as the link budget, topology of the transmitters and effective error correcting code, among others.