Spectrum Slicing for Multiple Access Channels with Heterogeneous Services

Wireless mobile networks from the fifth generation (5G) and beyond serve as platforms for flexible support of heterogeneous traffic types with diverse performance requirements. In particular, the broadband services aim for the traditional rate optimization, while the time-sensitive services aim for the optimization of latency and reliability, and of some novel metrics such as Age of Information (AoI). In such settings, the key question is that of spectrum slicing: how these services share the same chunk of available spectrum while meeting the heterogeneous requirements. In this work we investigated the two canonical frameworks for spectrum sharing, Orthogonal Multiple Access (OMA) and Non-Orthogonal Multiple Access (NOMA), in a simple but insightful setup with a single time-slotted shared frequency channel. The setup involves one broadband user, which aims to maximize throughput and uses packet-level coding to protect its transmissions from noise and interference, and several intermittent users, which aim either to improve their latency-reliability performance or to minimize their AoI. We analytically assessed the performance of Time Division Multiple Access (TDMA) and ALOHA-based schemes in both OMA and NOMA frameworks by deriving their Pareto regions and the corresponding optimal values of their parameters. Our results show that NOMA can outperform traditional OMA in latency-reliability oriented systems in most conditions, but OMA performs slightly better in age-oriented systems.


Introduction
The fifth generation of mobile networks (5G) was designed to support three main types of services with widely different requirements: enhanced mobile broadband (eMBB), ultra-reliable low-latency communications (URLLC), and massive machine-type communications (mMTC) [1]. The eMBB category focuses on human-oriented services that transmit large amounts of data and offer higher data rates and increased spectral efficiency when compared to the previous generation. On the other hand, Internet of Things (IoT)-like services, which transmit small amounts of data intermittently (and hence are termed intermittent services throughout the rest of the paper), may fall within either the URLLC or the mMTC category, depending on their latency and reliability requirements and on their processing/computational capabilities. The intermittent services where low latency (in the order of a few milliseconds) must be guaranteed with extremely high reliability (in the order of $1 - 10^{-5}$) belong to the URLLC service type. Conversely, intermittent services with relaxed latency and reliability requirements that incorporate exceedingly large numbers of devices belong to the mMTC service type.
However, such a categorization of IoT services is too simplistic and cannot model a finer gradation of timely data delivery requirements. In particular, there are novel, timeliness-related metrics that may better capture the requirements of some categories of IoT applications. In this respect, Age of Information (AoI) has recently attracted attention due to its ability to measure the freshness of information.
Figure caption: No channel erasures are considered in this example. Collisions among intermittent users cannot be recovered, but SIC can be used to recover collisions between the broadband and intermittent users after decoding the broadband user.
In particular, in this paper we investigate orthogonal and non-orthogonal slicing mechanisms in the case where a broadband user shares a wireless channel with multiple intermittent users. Specifically, we explore the performance of slicing implemented via multiple access schemes commonly used in cellular access, namely Time Division Multiple Access (TDMA) and slotted ALOHA, and a scheme representing their combination. The broadband user implements a K-out-of-N erasure code, which allows it to counteract packet losses due to the channel and, in the case of non-orthogonal slicing, due to collisions with the intermittent users' transmissions. In the latter case, once the block of N packets of the broadband user is decoded, the receiver uses successive interference cancellation (SIC) to attempt recovery of the intermittent users' packets. The performance parameters of interest are the throughput of the broadband user and two timeliness metrics for the intermittent users: the latency-reliability of individual packets and the Peak Age of Information (PAoI).
In our previous works [6,7], we investigated the performance trade-offs of OMA and NOMA in a simple uplink scenario with one broadband user and one intermittent user. The general conclusion was that OMA usually outperforms NOMA when transmissions take place in a collision channel with packet erasures and without capture, which is a rather conservative channel model. However, NOMA schemes achieved a performance similar to that of OMA in extreme cases, when the single objective is to maximize the throughput of the broadband user or to minimize the latency of the intermittent user [6]. We also evaluated how the capture effect and immediate (i.e., intra-collision) SIC at the receiver enhance the performance of NOMA. Under this scenario we observed that important gains can be achieved with NOMA when the intermittent user aims to minimize latency, but the gains are limited when the objective is to minimize AoI [7]. This paper extends that analysis to the case with multiple intermittent users, showing quite different trade-offs. We derive closed-form expressions for the performance parameters and show that, when the intermittent users aim to minimize the PAoI, OMA with TDMA is the best choice, albeit by a small margin. In contrast, when the intermittent users aim to optimize the packet latency, the slicing mechanism must be carefully selected based on the access load and the number of users, as there is no single slicing method that provides the best trade-offs.
In summary, the main contributions of this paper are the following:
• We analyze the trade-offs and regions of operation of OMA and NOMA schemes with a broadband and multiple intermittent users in a collision channel with erasures.
• We investigate the impact of the metrics of interest on the overall system design and on the achievable gains with OMA and NOMA.
• We investigate the impact of the activation probability of intermittent users on the performance of the slicing mechanism.
• We derive Pareto frontiers, which define the best possible trade-offs between throughput of the broadband user and latency/AoI of the intermittent users with the considered schemes.
The rest of the paper is organized as follows. Section 2 presents the literature review. The system model is described in Section 3. The analyses for OMA and NOMA schemes are presented in Sections 4 and 5, respectively. The results are presented in Section 6. Section 7 concludes the paper.

Related Work
Orthogonal slicing has been widely explored and used in commercial systems [9]. It is a straightforward approach where independent resources are allocated to the different services, which allows one to treat them in an isolated manner. Popovski et al. [17] provided one of the first studies that compared orthogonal to non-orthogonal slicing. In particular, it investigated the benefits of OMA and NOMA schemes for the different combinations of 5G services in an uplink scenario: eMBB with URLLC and eMBB with mMTC. In the latter, orthogonal resources were allocated to each eMBB user, mMTC traffic was assumed to be Poisson distributed, and one URLLC user was considered. It was observed that NOMA may offer benefits with respect to OMA depending on the rate of the eMBB users and on the coexisting type of the intermittent traffic: with high URLLC, high data rates at the eMBB user were beneficial for NOMA, whereas the opposite is true with mMTC traffic.
The work presented in [17] was extended to a multi-cell scenario with strict latency guarantees for URLLC traffic [18]. A single URLLC user per cell was considered, and it was observed that NOMA leads to a greater spectral efficiency with respect to OMA. A similar conclusion was drawn by Maatouk et al. [10] in an uplink scenario with two users with the same service type that aimed to minimize the average AoI. It was also observed that a greater spectral efficiency does not directly translate into a lower average AoI. Another scenario that includes power control to simplify the reception of the intermittent packets was studied in [19], which derived analytical formulas for throughput and AoI with those settings.
The selection of the multiple access scheme is essential when considering spectrum slicing with multiple intermittent users [9], and particularly so in multi-user MIMO (MU-MIMO) systems, which can make multi-packet reception (MPR) easier [20]. Slotted ALOHA and TDMA are two basic multiple access schemes that offer widely different benefits. Slotted ALOHA is simple, flexible, and effective for relatively low traffic loads. It is one of the most widely used random access protocols, implemented in a number of variants, e.g., multichannel slotted ALOHA in 5G [21,22]. There is a vast literature on the performance evaluation of ALOHA-based schemes in terms of latency and reliability. For instance, grant-free ALOHA-based access has been studied for URLLC services [23,24]. Besides, latency and reliability can be combined into a single performance indicator termed latency-reliability [25]. On the other hand, it is difficult to derive closed-form expressions of the probability distribution of the AoI. Hence, most papers in the literature examined it in terms of its mean value, in the context of queuing theory, and often in ideal systems with Markovian service [26]. Only a few studies investigated the tail or the full distribution of AoI, even though these provide a clear measure of the reliability and stability of control systems. In particular, these are directly connected to control systems by the survival time, defined as the time that an application may continue to operate without receiving an anticipated message [27]. The distribution of AoI with packet preemption and memoryless servers was investigated in [28]. In [29], the Chernoff bound was used to derive an upper bound of the quantile function of the AoI for two queues in tandem with deterministic arrivals. The peak-age violation probability, defined as the probability of exceeding a pre-defined PAoI threshold, was derived for a single-hop link with fading and retransmissions, in the form of variable-length-stop-feedback [3].
So far, only a few studies considered the impact of physical layer and medium access control on the AoI. Among these, recent works compute the average AoI in Carrier Sense Multiple Access (CSMA) [30], ALOHA [31], and slotted ALOHA [32] networks, considering the impact of the different medium access policies on the age. Of special interest for our study, the AoI with a TDMA-like scheme with perfect feedback and immediate retransmissions was compared to that of ALOHA [33]. It was observed that TDMA with retransmissions greatly reduced the AoI when compared to ALOHA. However, the former scheme assumes that the transmissions from all users after a transmission failure are delayed to allow for a retransmission to occur in the next time slot, which is inefficient, as a separate channel is needed for feedback.
There are only a few studies on heterogeneity in AoI systems. We mention the work presented in [34], which considered different service classes, and modeled the system as an M/G/1/1 queue with hyperexponential service time. However, only the service rate was different among classes. Then, the classes could adapt the arrival rate to minimize the AoI.

System Model
In the following, we denote random variables with capital letters (e.g., $X$) and their values with the corresponding lowercase letters (e.g., $x$). Sets are denoted in calligraphic font (e.g., $\mathcal{U}$), and the corresponding standard capital letters denote their cardinality (e.g., $U$). Vectors are denoted with bold lowercase letters (e.g., $\mathbf{x}$), and matrices with bold capital letters (e.g., $\mathbf{X}$). Probability mass functions (pmfs) are denoted with a lowercase $p$ and Cumulative Distribution Functions (CDFs) with a capital $P$. Table 1 provides a quick reference for the most important notation used in the rest of the paper. We define the outcome of the user's activity in a slot as an event, which happens with probability $p$ and is mutually exclusive with other outcomes. The outcome vector $\mathbf{k}$ then corresponds to the composite event in which the $i$-th outcome is observed $k_i$ times, and the probability vector $\mathbf{p}$ contains the probability of each outcome (these probabilities do not necessarily sum to 1, as none of the outcomes might occur in a slot). We can then define the multinomial function $\mathrm{Mult}(\mathbf{k}; n, \mathbf{p})$, which corresponds to the probability of outcome vector $\mathbf{k}$ being observed over $n$ slots:

$$\mathrm{Mult}(\mathbf{k}; n, \mathbf{p}) = \frac{n!\,\left(1 - \sum_{i=1}^{|\mathbf{p}|} p_i\right)^{n - \sum_{i=1}^{|\mathbf{p}|} k_i}}{\left(n - \sum_{i=1}^{|\mathbf{p}|} k_i\right)!} \prod_{i=1}^{|\mathbf{p}|} \frac{p_i^{k_i}}{k_i!},$$

where $|\mathbf{p}|$ is the length of vector $\mathbf{p}$. The binomial function $\mathrm{Bin}(k; n, p)$ is the special case in which $|\mathbf{k}| = |\mathbf{p}| = 1$. We also define the modulo function, which behaves as expected from integer arithmetic:

$$\mathrm{mod}(m, n) = m - n \left\lfloor \frac{m}{n} \right\rfloor$$

for $m, n \in \mathbb{Z}^+$, where $\mathbb{Z}^+$ is the set of non-negative integers.
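The multinomial, binomial, and modulo functions described above can be sketched in Python as follows. This is only an illustration: it assumes the standard multinomial pmf with a residual "no outcome" category whose probability is one minus the sum of the entries of p, and the function names `mult`, `binom`, and `mod` are hypothetical.

```python
from math import factorial, prod

def mult(k, n, p):
    """Probability that outcome i is observed k[i] times over n slots.

    Assumption: the entries of p may sum to less than 1, and the remainder
    is the probability that none of the outcomes occurs in a slot.
    """
    rest = n - sum(k)                      # slots in which no outcome occurs
    if rest < 0 or any(ki < 0 for ki in k):
        return 0.0
    p_none = 1.0 - sum(p)
    coeff = factorial(n) // (factorial(rest) * prod(factorial(ki) for ki in k))
    return coeff * p_none ** rest * prod(pi ** ki for pi, ki in zip(p, k))

def binom(k, n, p):
    """Binomial special case: a single outcome with probability p."""
    return mult([k], n, [p])

def mod(m, n):
    """Remainder of the integer division of m by n (n > 0)."""
    return m - n * (m // n)
```

For instance, `binom(2, 4, 0.5)` evaluates the probability of exactly two occurrences over four slots.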

Access Model
We consider an uplink scenario with a set of users $\mathcal{U}$ transmitting data to a Base Station (BS) over a single time-slotted multiple access channel. This single channel may consist of a single subcarrier or of multiple subcarriers in an OFDMA system, whose number remains constant throughout the operation of the system. Users can transmit up to one packet per time slot, denoted by the index $t \in \mathbb{Z}$, by occupying the available bandwidth and the entire duration of the slot. This can be achieved by selecting a proper modulation and coding scheme based on the size of the payload to transmit. The study of multi-channel settings is considerably more complex and is left to future work, as having multiple concurrent resources in the frequency domain changes the timing considerations significantly.
The set of users $\mathcal{U}$ is composed of a single broadband user and multiple intermittent users. Specifically, user $u_B$ is the broadband user following the eMBB model: it is a full-buffer user that always has data to transmit and maintains an infinite transmission queue. To counteract potential packet losses due to noise, the broadband user implements a packet-level coding scheme, where blocks of $K$ source packets are encoded to generate a frame of $N$ coded packets of equal length. The basic operation of the broadband user is shown in Figure 1. The coded packets are linearly independent, which can be achieved, for example, with Maximum Distance Separable (MDS) codes or with Random Linear Network Coding (RLNC) in the limit of an infinite Galois-field size. In effect, decoding any subset of $K$ coded packets is sufficient for recovering the original block.
The intermittent users belong to the subset $\mathcal{U}_I = \mathcal{U} \setminus \{u_B\}$, whose cardinality is $U_I = U - 1$. They generate packets in each slot with a probability $\alpha$ (i.e., they experience Bernoulli arrivals with parameter $\alpha$) and maintain a queue of up to $Q$ generated packets. If a new packet is generated when the instantaneous length of the queue is $Q$, these users discard the oldest buffered packet and add the newly generated one at the end of the queue. The choice of discarding the oldest packet in the queue follows a simple rationale: discarding any of the packets has the same effect on the overall reliability, but discarding the oldest one minimizes the latency for the packets that are delivered, as they will spend less time waiting for a slot in which they can be transmitted. In most practical cases, the queue will be set up so as to minimize the probability of discarding packets, but the case with short queues is relevant for low-power IoT devices with limited memory and computational resources. Packets are transmitted from the queue using the First-In First-Out (FIFO) discipline, and the transmissions take place in the allocated slots.
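The queueing behavior of an intermittent user can be illustrated with a short simulation. The sketch below is a simplified stand-in rather than the model analyzed in this paper: it assumes that the user is allocated one slot every `period` slots, that a packet generated in an allocated slot can be sent in that same slot, and that every transmission attempt removes the packet from the queue. The class and function names are hypothetical.

```python
from collections import deque
import random

class IntermittentQueue:
    """FIFO queue of at most `capacity` packets that drops the oldest when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()   # generation slots, oldest first
        self.dropped = 0

    def generate(self, slot):
        if len(self.packets) == self.capacity:
            self.packets.popleft()          # discard the oldest buffered packet
            self.dropped += 1
        self.packets.append(slot)           # newest packet goes to the end

    def transmit(self):
        """Remove and return the oldest packet, or None if the queue is empty."""
        return self.packets.popleft() if self.packets else None

def simulate(alpha, capacity, period, n_slots, seed=0):
    """Return the waiting times of transmitted packets and the number of drops."""
    rng = random.Random(seed)
    queue = IntermittentQueue(capacity)
    waits = []
    for t in range(n_slots):
        if rng.random() < alpha:            # Bernoulli arrival with parameter alpha
            queue.generate(t)
        if t % period == period - 1:        # assumed position of the allocated slot
            g = queue.transmit()
            if g is not None:
                waits.append(t - g)
    return waits, queue.dropped
```

With `capacity=1`, the queue reduces to the preemptive single-packet buffer used for the PAoI-oriented systems.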
We consider a static allocation scheme, in which users are synchronized at the slot level. The set of users that are allocated slot $t$ is denoted by $\mathcal{A}_t$, where $\mathcal{A}_t \subseteq \mathcal{U}$ s.t. $\mathcal{A}_t \neq \emptyset$. We define the following three types of slot allocations.

1. Broadband: The slot is reserved for the broadband user. Hence, $\mathcal{A}_t = \{u_B\}$.

2. Intermittent: The intermittent users are allocated the slot and may use it if there are packets in their queues. Hence, $\mathcal{A}_t \subseteq \mathcal{U}_I$.

3. Mixed: Both types of users are allowed to access the slot. Hence, $\mathcal{A}_t \subseteq \mathcal{U}$ s.t. $u_B \in \mathcal{A}_t$ and $|\mathcal{A}_t| > 1$.

Next, we define the OMA and NOMA slicing based on the resource allocation as follows.

1. OMA: Slots can be allocated either to the broadband user or to the intermittent users; we define $T_{int}$ to be the period between intermittent slots.

2. NOMA: All slots are mixed, so the broadband user and the allocated intermittent users can access every slot.
Finally, based on the allocation in the intermittent and mixed slots, we define the following three subdivisions of OMA and NOMA slicing. We take a slot t in which the intermittent users can transmit, i.e., any slot in NOMA or one of the intermittent slots in OMA.

1. TDMA: The slot is allocated to a single intermittent user, such that $|\mathcal{A}_t \setminus \{u_B\}| = 1$.

2. Grouping: The intermittent users are divided into $G$ groups, with $1 < G < U_I$, and the slot is allocated to the users of one group, such that $|\mathcal{A}_t \setminus \{u_B\}| = U_I / G$. We consider the case where $(U_I \bmod G) = 0$, i.e., we can divide the intermittent users into groups of equal size.

3. ALOHA: All the intermittent users are allowed to transmit in the slot. Hence, $|\mathcal{A}_t \setminus \{u_B\}| = U_I$ for all slots, excluding the broadband slots in OMA.
The frame structures for the six access schemes resulting from the combination of the slicing and allocation methods described above are illustrated in Figure 2: the circles represent the intermittent users that have access in any given slot, and the color of the square represents the type of access in that slot. We also note that the grouping scheme can be easily extended to cover the two extreme cases in which (i) there is only one group comprising all intermittent users (which is equivalent to ALOHA) and (ii) there is one user per group (which is equivalent to TDMA). Thus, it represents a general scheme which we can apply within OMA or NOMA.
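The slot allocations of the six schemes can be generated by a single function of the number of groups. The sketch below makes several assumptions that are not fixed by the text: slots are indexed from 0, groups are contiguous blocks of user indices served in round-robin order, the intermittent slot in OMA is the last of every `t_int` slots, and users are labeled `"uB"`, `"u0"`, `"u1"`, and so on. The function name `allocation` is hypothetical.

```python
def allocation(t, slicing, n_users, n_groups, t_int=None):
    """Return the set of users allocated slot t.

    n_groups = 1 gives ALOHA, n_groups = n_users gives TDMA, and any other
    divisor of n_users gives the grouping scheme.
    """
    if n_users % n_groups != 0:
        raise ValueError("n_users must be a multiple of n_groups")
    size = n_users // n_groups

    def group(g):
        # Assumed grouping: contiguous blocks of user indices.
        return {f"u{i}" for i in range(g * size, (g + 1) * size)}

    if slicing == "NOMA":
        # Every slot is mixed: the broadband user plus one group.
        return {"uB"} | group(t % n_groups)
    if slicing == "OMA":
        if t_int is None:
            raise ValueError("OMA requires the period t_int")
        if t % t_int != t_int - 1:
            return {"uB"}                      # broadband slot
        return group((t // t_int) % n_groups)  # intermittent slot
    raise ValueError("slicing must be 'OMA' or 'NOMA'")
```

For example, `allocation(5, "OMA", 4, 4, t_int=3)` returns the single user served by the second intermittent slot under TDMA.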

Channel Model
We consider a quasi-static block fading channel, where the signal received by the BS in any slot $t$ is given as

$$y_t = \sum_{u \in \mathcal{U}} a_{u,t}\, h_{u,t}\, x_{u,t} + z_t,$$

where $x_{u,t}$ is the signal transmitted by user $u$, $h_{u,t}$ is the random fading coefficient for user $u$ at slot $t$, and $z_t$ is Additive White Gaussian Noise (AWGN) with variance $\sigma^2$. The random variable $a_{u,t} \in \{0, 1\}$ models the user's activity, being equal to 1 if the user is active in that slot and 0 otherwise. A user is active only if it is allowed to transmit, i.e., if $u \in \mathcal{A}_t$, and if its packet queue $q_{u,t}$ is not empty:

$$a_{u,t} = I(u \in \mathcal{A}_t)\, I(q_{u,t} > 0),$$

where $I(\cdot)$ is the indicator function, equal to 1 if the condition is true and 0 otherwise. Let $P_u$ be the fixed transmission power of user $u$, which can be different for each user. The Signal to Noise Ratio (SNR) of user $u$ at time slot $t$ is given by

$$\mathrm{SNR}(u, t) = \frac{a_{u,t}\, |h_{u,t}|^2 P_u}{\sigma^2},$$

whereas the Signal to Interference plus Noise Ratio (SINR) of user $u$ at time slot $t$ is given by

$$\mathrm{SINR}(u, t) = \frac{a_{u,t}\, |h_{u,t}|^2 P_u}{\sigma^2 + \sum_{v \in \mathcal{U} \setminus u} a_{v,t}\, |h_{v,t}|^2 P_v},$$

where $\mathcal{U} \setminus u$ is the set of users except user $u$. We can also divide the numerator and the denominator of the SINR by the noise power $\sigma^2$, giving

$$\mathrm{SINR}(u, t) = \frac{\mathrm{SNR}(u, t)}{1 + \sum_{v \in \mathcal{U} \setminus u} \mathrm{SNR}(v, t)}.$$

Hence, the SINR is equal to the SNR in the absence of interference. Next, we define $\gamma$ as the SNR threshold to decode a packet. That is, $\gamma$ defines the erasure probability of a binary erasure channel (BEC) as

$$\varepsilon_u = \Pr\left[\frac{|h_{u,t}|^2 P_u}{\sigma^2} < \gamma\right].$$

Further, we consider a simple collision model, so that packets cannot be decoded in the presence of interference (i.e., collisions). Hence, a packet from user $u$ can be decoded, with probability $(1 - \varepsilon_u)$, if and only if $\mathrm{SNR}(u, t) = \mathrm{SINR}(u, t)$. This model allows neither for capture nor for the application of SIC within slots containing more than one transmission (i.e., intra-collision SIC), representing the worst-case scenario for schemes that rely on MPR, such as power-domain NOMA. Instead, SIC can only be performed after decoding the broadband user, regenerating all its $N$ coded packets, and removing them from the slots that also contain transmissions from the intermittent users (i.e., extra-collision SIC).
In slots without a collision, we assume a constant erasure probability for each user, denoted as $\varepsilon_B$ for the broadband user and $\varepsilon_I$ for the intermittent users. We assume that the erasure probability after the interference is canceled is the same as for a free channel, which is a simplification. However, the use of parity checks on all the packets in a frame means that the probability of erroneous packet decoding is very low, and modeling the precise performance of SIC schemes is beyond the scope of this paper. This is also a common assumption in the coded slotted ALOHA literature, which considers a similar setting [35]. The model provides a general view of the lower bound on the performance of the OMA and NOMA schemes that is independent of the underlying channel model.
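The decoding rule of the collision channel with erasures, before any SIC is applied, can be sketched as follows. The function name `decode_slot` and the data layout (a set of active users and a dictionary of per-user erasure probabilities) are assumptions made for illustration.

```python
import random

def decode_slot(active, erasure_prob, rng):
    """Return the user decoded in a slot, or None if nothing is decoded.

    active: set of users transmitting in the slot.
    erasure_prob: dict mapping each user to its erasure probability.
    A packet is decoded only if it is alone in the slot (no capture, no
    intra-collision SIC) and it is not erased by the channel.
    """
    if len(active) != 1:
        return None                       # idle slot or collision
    (user,) = active
    erased = rng.random() < erasure_prob[user]
    return None if erased else user
```

Under NOMA, this rule is applied to the slot contents that remain after the broadband packets have been removed by extra-collision SIC.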

Key Performance Indicators
The Key Performance Indicators (KPIs) of interest are described in the following. We first define the AoI $\xi$, which in our case is the number of slots that have passed since the generation of the last correctly received packet. If packet $i$ is generated in slot $g_i$ and decoded by the receiver in slot $d_i$, while packet $i + 1$ is generated in slot $g_{i+1} > g_i$ and decoded by the receiver in slot $d_{i+1} > d_i$, we have:

$$\xi(t) = t - g_i, \quad \forall\, t \in [d_i, d_{i+1}).$$

The PAoI $\Delta$ is then simply defined as the AoI measured at the instant of arrival of a new packet:

$$\Delta_i = d_{i+1} - g_i.$$

The PAoI is the maximum value of the AoI across a cycle, as depicted in Figure 3. The relevant KPI for PAoI-oriented systems is the 90th percentile of the PAoI, denoted by $\Delta_{90}$:

$$\Delta_{90} = \min\{\delta \in \mathbb{Z}^+ : P_\Delta(\delta) \geq 0.9\}, \tag{11}$$

where $P_\Delta$ is the CDF of the PAoI.
Latency and age are expressed in slots. ∆ 90 allows us to assess the tail distribution of the PAoI in a general scenario, and can be used to compare performance with different values of the slot arrival rate α. In contrast, a widely employed metric called PAoI violation probability [3] requires the definition of a specific threshold, either expressed as an absolute time or as a maximum number of slots. Furthermore, it cannot be used to compare the performance under different arrival rates α since the AoI is greatly determined by the latter.
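For latency-oriented systems, we introduce a similar KPI, which is the 90th percentile of the latency-reliability for intermittent users, denoted by $T_{90}$. The distribution of latency-reliability is computed by multiplying the distribution of the latency of successfully received packets by their success probability $p_{s,I}$, so that the percentile is computed on all packets, not just the successfully delivered ones:

$$T_{90} = \min\{t \in \mathbb{Z}^+ : p_{s,I}\, P_T(t) \geq 0.9\}, \tag{12}$$

where $P_T$ is the CDF of the latency of successfully received packets. We can now define the Pareto frontier, which is commonly used in multi-objective optimization.

Definition 1. Let $f : (\mathbb{Z}^+)^2 \to \mathbb{R} \times \mathbb{Z}^+$ and $\mathcal{C}$ be the set of feasible configurations. Next, let

$$f(c) = (S_B(c), \tau(c)), \quad c \in \mathcal{C},$$

where $S_B$ is the throughput of the broadband user and $\tau$ is the timeliness of the intermittent user, i.e., $\Delta_{90}$ or $T_{90}$. The Pareto frontier is the set

$$\left\{c \in \mathcal{C} : \nexists\, c' \in \mathcal{C} \text{ s.t. } S_B(c') \geq S_B(c),\ \tau(c') \leq \tau(c),\ f(c') \neq f(c)\right\}.$$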
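Given the KPIs of a finite set of configurations, the Pareto frontier can be extracted with a direct dominance check. The sketch below assumes the usual notion of dominance for this setting (throughput is maximized, timeliness is minimized, and a dominating point must be strictly better in at least one of the two); the function name and the tuple layout are hypothetical.

```python
def pareto_frontier(points):
    """Return the non-dominated entries of points.

    points: list of (config, throughput, timeliness) tuples, where higher
    throughput and lower timeliness (e.g., a 90th percentile in slots) are better.
    """
    def dominates(a, b):
        no_worse = a[1] >= b[1] and a[2] <= b[2]
        strictly_better = a[1] > b[1] or a[2] < b[2]
        return no_worse and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]
```

The quadratic cost of this check is acceptable for the small configuration spaces considered here, where a configuration is a pair of integer parameters.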

Orthogonal Multiple Access
We first consider algorithms based on OMA, assuming that $U_I > 1$, and, for the sake of simplicity, that all intermittent users have the same slot arrival rate $\alpha$. In an OMA system, the broadband user transmits in frames of $N$ broadband slots, each of which contains an encoded data packet. It is sufficient to decode $K$ of the $N$ packets to recover the whole frame. Reserved slots for the intermittent users are interleaved with the ones for the broadband user: there is one intermittent slot every $T_{int}$ slots, where in general $T_{int} \neq N$, in which one or more intermittent users try to access the channel.

PAoI-Oriented System
In a PAoI-oriented OMA system, the transmission queue size is $Q = 1$ and preemptive scheduling is used, i.e., a newly arrived packet replaces the one stored in the buffer. In this case, the KPIs are given by $\Delta_{90}$ (the 90th percentile of the PAoI) for the intermittent users, and the throughput $S_B$ for the broadband user. We consider the grouping model, in which the $U_I$ intermittent users are divided into $G$ groups. As we mentioned earlier, the scheme applies TDMA between groups. Users in the same group contend for the channel in the same slots. The ALOHA and TDMA systems are extreme cases of the grouping scheme, with $G = 1$ and $G = U_I$, respectively. The OMA grouping scheme is represented in Figure 2, along with the two extreme cases.
Denote the probability of successfully decoding the broadband user's frame (i.e., the $N$ packets contained in it) as $p_{s,B}$. The throughput $S_B$ is

$$S_B = \frac{T_{int} - 1}{T_{int}} \cdot \frac{K}{N}\, p_{s,B}. \tag{14}$$

As the broadband user can only use $T_{int} - 1$ slots out of every $T_{int}$, setting up more frequent transmission opportunities for the intermittent users reduces the broadband user's throughput. Probability $p_{s,B}$ is easy to compute in this case, as orthogonal access prevents collisions with the intermittent users:

$$p_{s,B} = \sum_{k=K}^{N} \mathrm{Bin}(k; N, 1 - \varepsilon_B).$$
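A minimal numerical sketch of the broadband throughput under OMA is given below. It assumes that throughput is measured in source packets per slot, that a frame is recovered when at least K of its N packets survive independent erasures with probability `eps_b`, and that the broadband user occupies `t_int - 1` of every `t_int` slots. The function names are hypothetical.

```python
from math import comb

def frame_success(K, N, eps_b):
    """Probability that at least K of the N coded packets are not erased."""
    return sum(
        comb(N, k) * (1 - eps_b) ** k * eps_b ** (N - k)
        for k in range(K, N + 1)
    )

def oma_throughput(K, N, eps_b, t_int):
    """Broadband throughput in source packets per slot under OMA (assumed form)."""
    share = (t_int - 1) / t_int          # fraction of slots left to the broadband user
    return share * (K / N) * frame_success(K, N, eps_b)
```

Lowering the code rate K/N raises `frame_success` but reduces the useful payload per frame, which is the trade-off that the broadband user's parameters control.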
In order to compute the success probability for intermittent users, we first consider the probability $\rho$ that an intermittent user accesses the channel in the next intermittent slot that is allocated to it, i.e., the probability that at least one packet is generated in a $GT_{int}$ interval:

$$\rho = 1 - (1 - \alpha)^{GT_{int}}.$$

The probability $p_{s,I}$ that a packet from an intermittent user is decoded successfully is then given by two components. First, all other intermittent users in the same group must not have any packets to send in that slot, and second, there must not be a channel erasure:

$$p_{s,I} = (1 - \rho)^{\frac{U_I}{G} - 1} (1 - \varepsilon_I).$$
The PAoI is then ∆ = W + Z, given by the sum of two components: the first, W, is the waiting time between the generation of a packet and its successful transmission. The second, Z, is the inter-transmission time between the slot when the packet is transmitted and the slot in which the next successful packet from the same user is decoded.
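The pmf of the waiting delay $W$ of a successful transmission in PAoI-oriented OMA is then given by:

$$p_W(w) = \frac{\alpha (1 - \alpha)^w}{\rho}, \quad w \in \{0, \ldots, GT_{int} - 1\}.$$

Since transmission opportunities for the intermittent users in a given group are scheduled in one slot every $GT_{int}$, $Z$ is $GT_{int}$ times the number of reserved slots between consecutive successful transmissions. This number is a geometric random variable with parameter $\rho p_{s,I}$. The pmf of $Z$ is then given by

$$p_Z(z) = \rho p_{s,I} (1 - \rho p_{s,I})^{\frac{z}{GT_{int}} - 1}, \quad z \in \{GT_{int}, 2GT_{int}, \ldots\}.$$

The pmf of the PAoI $\Delta$ is now easy to find by convolving the distributions of $W$ and $Z$. Since $W$'s support is $\{0, \ldots, GT_{int} - 1\}$ and $Z$'s support consists of multiples of $GT_{int}$, the convolution reduces to a simple multiplication:

$$p_\Delta(\delta) = p_W(\mathrm{mod}(\delta, GT_{int}))\, p_Z(\delta - \mathrm{mod}(\delta, GT_{int})).$$

We can now easily derive the KPI $\Delta_{90}$ by applying (11).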
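The computation of the PAoI percentile for OMA can be sketched numerically. The expressions used below are assumptions derived from the description above rather than transcriptions of the paper's formulas: the waiting time W follows a geometric law truncated to one period of G·T_int slots, the inter-transmission time Z is a positive multiple of that period with a geometric number of periods of parameter ρ·p_s,I, and the two are independent. The function name is hypothetical.

```python
def oma_paoi_percentile(alpha, eps_i, n_users, n_groups, t_int, q=0.9):
    """Smallest PAoI value (in slots) whose CDF reaches q, under the stated assumptions."""
    period = n_groups * t_int                       # slots between a user's opportunities
    rho = 1 - (1 - alpha) ** period                 # at least one arrival in a period
    p_free = (1 - rho) ** (n_users // n_groups - 1) # no other user in the group is active
    p_s = p_free * (1 - eps_i)
    a = rho * p_s                                   # success probability per opportunity
    if a <= 0:
        raise ValueError("no packet can ever be delivered with these parameters")
    cdf = 0.0
    m = 1                                           # Z = m * period
    while True:
        p_z = a * (1 - a) ** (m - 1)
        for w in range(period):                     # W = w, so PAoI = m * period + w
            p_w = alpha * (1 - alpha) ** w / rho
            cdf += p_w * p_z
            if cdf >= q:
                return m * period + w
        m += 1
```

Because the supports of W and Z do not overlap within a period, the PAoI values are visited in increasing order and the CDF can be accumulated in a single pass.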

Latency-Oriented System
We now examine the relevant KPIs for the latency-oriented case. In this case, intermittent users maintain a queue of up to $Q \geq 1$ packets, discarding the oldest one when a new packet arrives and the queue is already full. As for the PAoI case, we consider the grouping system, in which the $U_I$ intermittent users are placed in $G$ groups. The throughput of the broadband user is the same as in the PAoI-oriented system, given by (14). We now focus on intermittent user $u$: the state of its queue is represented by a Markov chain, whose discrete time instants represent the time just after each slot allocated to it. In the following, we will refer to any slot allocated to the considered user $u$ as an allocated slot. The elements of the state transition probability matrix $\mathbf{P}^{(Q)}$ are given by:
Using basic Markov theory, the steady-state distribution $\pi^{(0)}$ is derived as the left eigenvector of $\mathbf{P}^{(Q)}$ with eigenvalue 1, normalized to sum to 1.
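The steady-state distribution of a finite Markov chain can be obtained numerically without an eigenvalue solver. The sketch below uses power iteration, which is a stand-in for the left-eigenvector computation and assumes an irreducible, aperiodic chain; the transition matrix in the usage note is a toy example, not the queue matrix of this section.

```python
def steady_state(P, tol=1e-12, max_iter=100_000):
    """Left eigenvector of the row-stochastic matrix P with eigenvalue 1, summing to 1."""
    n = len(P)
    pi = [1.0 / n] * n                              # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: pi_next = pi * P (row vector times matrix).
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]              # renormalize against rounding drift
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")
```

For the two-state toy chain `[[0.9, 0.1], [0.5, 0.5]]`, the function converges to the distribution (5/6, 1/6).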
We can now consider the slots between two allocated slots by deriving the steady-state distribution n slots after the last allocated slot, which we denote as π (n) .
At each allocated slot, the oldest packet in the queue is transmitted. If a new packet is generated when the queue is already full, the oldest packet is dropped from the buffer. Consider a specific packet generated in the $n$-th slot after an allocated one: if it finds $q$ packets in the queue when it is generated, it will be transmitted at the $(q + 1)$-th allocated slot after it is generated, unless some packets ahead of it are dropped due to new arrivals. We can then define a generation vector of length $\ell$, whose $i$-th element contains the number of packets generated in the slots between the $(i - 1)$-th and $i$-th allocated slots after the generation of the considered packet. The first element of the vector contains the number of packets generated between the considered packet's generation and the first allocated slot after it. We then define the set $\mathcal{G}_\ell^{(n)}$, which contains all the generation vectors of length $\ell$ for a packet generated in the $n$-th slot after the last one allocated: The probability of each generation vector in the set is then given by: The considered packet is then transmitted by the $\ell$-th allocated slot after its generation if $q + 1 - \ell$ packets ahead of it are either dropped or transmitted at that point. For a given generation vector $\mathbf{g} \in \mathcal{G}_\ell^{(n)}$, we then formulate condition $\psi(\mathbf{g}, q)$: where $\delta(x)$ is the delta function, which is equal to 1 if $x = 0$ and 0 otherwise, and $[x]^+ = \max(x, 0)$. The condition naturally excludes the cases in which the considered packet is dropped, i.e., when a new packet arrives and finds a full queue, with the considered packet being first in line. We can then define the set $\mathcal{H}_\ell^{(n,q)}$, which contains the elements $\mathbf{g} \in \mathcal{G}_\ell^{(n)}$ for which the considered packet is transmitted at the $\ell$-th opportunity: The maximum value of $\ell$ is $q + 1$, as by that point the packet has either been transmitted or dropped.
Consequently, the success probability p s,I (n, q) for an intermittent user arriving n slots after an allocated one and finding a queue of q packets ahead of it is given by The packet can only be received correctly if the channel is free and there are no channel errors, as we assume totally destructive interference. The probability of having a free channel is equal to the probability that none of the other intermittent users in the same group have any packets in their queues: We can then compute the conditioned latency distribution: Knowing that packet generation probability is the same for every slot, we can now use (23) to derive the overall success probability.
In the same way, we derive the latency pmf:

$$p_T(t) = \sum_{n=1}^{GT_{int}} \sum_{q=0}^{Q} \frac{\pi_q^{(n-1)}\, p_T(t; \min(q, Q - 1), n)\, p_{s,I}(n, q)}{GT_{int}\, p_{s,I}}.$$
The 90th percentile of the packet delivery latency T 90 can be derived by applying the definition in (12).

Non-Orthogonal Multiple Access
We now examine the performance of NOMA schemes, in which the intermittent users' packets can collide with the broadband user's packets, and among themselves. If the broadband user's frame (i.e., the $N$ packets contained in it) has been recovered, the receiver performs SIC to remove the broadband user's packets from the slots. In the next step, the receiver attempts to decode the intermittent users' packets that may be contained in the slots affected by SIC. According to the channel model, the decoding succeeds only if there was a single intermittent user transmission (i.e., packet) in a slot, and it was not affected by a channel erasure. As for the OMA case, we consider the grouping case. In this case, each intermittent user can transmit once every $G$ slots, along with the other users in the same group.

PAoI-Oriented System
As in the OMA case, we first consider a PAoI-oriented system, in which $Q = 1$ and preemptive scheduling is used for all intermittent users. Since all intermittent users have the same arrival rate $\alpha$, we can easily compute the success probability of the broadband user: The throughput for the broadband user is then:

We now turn to computing the value of $\Delta_{90}$. We consider a specific intermittent user $u$, whose probability of generating at least one packet before the next allocated slot is $\rho = 1 - (1 - \alpha)^G$. If a packet from an intermittent user is transmitted, the probability of success (without considering the interference from the broadband user) is $(1 - \rho)^{\frac{U_I}{G} - 1}(1 - \varepsilon_I)$. Extending the decomposition used in the OMA case, we can divide the PAoI into three parts:

$$\Delta = W + Y + Z,$$

where, as above, $W$ is the waiting time from the packet generation to its transmission and $Z$ is the inter-transmission time. $Y$ is the decoding latency, i.e., the number of slots from the transmission of the packet until its successful decoding, which is 0 for OMA (in that case, packets are either decoded immediately or lost due to erasure or collision), but can be non-zero for NOMA if the packet is recovered later with SIC. The distribution of $W$ is simple to derive:

We can now compute the pmfs of $Y$ and $Z$, but to do so we first compute some auxiliary functions. We define the offset $o$ as the index of the slot that represents the first allocated slot for the considered user in the frame. Denote by $\mathcal{T}_o(d)$ the set of transmission opportunities for the user from the beginning of the frame to slot $d$, whose first element is $o$: The probability that the user will transmit $m$ packets by slot $d$ for a given offset $o$ is:

We now derive the probability that the first packet from the intermittent user to be decoded in a frame is correctly received in slot $d$. This only happens if three conditions are met:

1. The interference from the broadband user can be successfully removed by SIC; i.e., $K$ packets from it have been received and decoded in the current frame.

2. There is no interference from other intermittent users.

3. There are no channel errors.
The second and third conditions are easy to compute, and are summarized by (36). To account for the first one, we distinguish two cases: the packet is transmitted and decoded in the same slot (case A), or it is decoded in a later slot (case B). In case A, at least K packets from the broadband user have already arrived before d, and SIC is performed immediately; in case B, the intermittent user packet is retroactively decoded when the K-th broadband user packet is decoded.
We start with case A: In case A, the decoding delay is always 0, i.e., Y = 0. In case B, the probability of a packet from the intermittent user being decoded in slot d equals the probability that at least one packet from the user has been transmitted in the frame and that the K-th packet from the broadband user is decoded in slot d.
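As a minimal sketch of how cases A and B determine the decoding slot, the Python function below takes two hypothetical per-slot indicator lists (the names are illustrative and do not come from the paper): `bb_received[d-1]` is true if the broadband packet in slot d was received, and `int_clean[d-1]` is true if slot d carries a packet of the considered user that satisfies conditions 2 and 3. It assumes that slots are numbered from 1 and that SIC becomes available in the slot in which the K-th broadband packet arrives.

```python
def decoding_slots(bb_received, int_clean, K):
    """Map each recoverable intermittent transmission slot to its decoding slot.

    Assumption: SIC is possible from the slot in which the K-th broadband
    packet of the frame is received (condition 1).
    """
    received = 0
    sic_slot = None
    for d, ok in enumerate(bb_received, start=1):
        received += bool(ok)
        if received >= K:
            sic_slot = d  # slot of the K-th broadband packet
            break
    if sic_slot is None:
        return {}  # frame not recovered: no intermittent packet is decoded
    # Case A: transmission after sic_slot, decoded immediately (Y = 0).
    # Case B: transmission before sic_slot, decoded at sic_slot (Y > 0).
    return {s: max(s, sic_slot)
            for s, clean in enumerate(int_clean, start=1) if clean}


# Hypothetical 6-slot frame with K = 3: the user transmits in slots 2 and 6.
slots = decoding_slots([1, 0, 1, 0, 1, 0], [0, 1, 0, 0, 0, 1], K=3)
delays = {s: d - s for s, d in slots.items()}  # decoding delay Y per packet
print(slots, delays)
```

In this toy frame the packet sent in slot 2 falls under case B and the packet sent in slot 6 under case A.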
The pmf of the decoding delay Y in case B is more complicated.
If d is an allocated slot, we have to consider both cases, but if it is not, the only possible case is the first one.
We can then compute the probability that no packets will be delivered in a frame with a given offset.
Naturally, the delay of decoding events that come after the first in the frame is always 0, as SIC can instantly decode the packet from the intermittent user. We can now compute the probability of having a decoding event in a given slot d, given that the first decoding event was in slot f and the offset is o.
We can then uncondition on f and get p R (d): With this, we compute the probability that a decoding event in a given slot is the first in the frame: We can then compute the pmf of the latency T for a decoding in slot d.
The final component of the PAoI is the inter-arrival time, Z. There are two separate cases for this: either the two consecutive decoding events are in the same frame, or the next one is in a future frame. We first find the probability that a given decoding event is the last in the frame: If the next packet from the intermittent user is in the same frame, we have: If the next packet is in a future frame, we need to compute the offset for the next frames. We denote the offset for the i-th frame after the current one, which has offset o, as If the number of groups G is larger than the number of slots in a frame N, there might be no transmission opportunities in a frame; in that case, T ω i (o) (N) = ∅, and the intermittent user will never transmit in that frame. For a given inter-transmission time z, we can then define the number of frames without successfully received intermittent packets as M(z; d, o): We can then give the pmf of the inter-transmission time if the next packet is not in the same frame: By unconditioning over L and d, we get the pmf of the inter-transmission time: We can now join the results in (38), (49), and (55) to get the pmf of the PAoI for a given offset: Finally, we uncondition on the offset o by considering all the possible offsets for a user. We assume that the initial offset is o 0 , and denote the set of reachable offsets from o 0 as The probability of having a random decoded packet be in a frame with offset o is then given by We can now uncondition the PAoI pmf: We remark that the grouping scheme is not necessarily fair to users, as users with a different initial index might have slightly different PAoI distributions.
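The offset bookkeeping used above can be illustrated with a short Python sketch. The indexing convention is an assumption made here for illustration (slots in a frame are numbered 1 to N, and the user's allocated slots recur every G slots across frame boundaries); the function names are hypothetical and the expressions are not copied from the paper's equations.

```python
def opportunities(o, G, N):
    """Allocated slots in a frame of N slots for a user with offset o."""
    return list(range(o, N + 1, G))  # empty if o > N (possible when G > N)


def next_offset(o, G, N, i=1):
    """Offset in the i-th frame after the current one, whose offset is o."""
    # Allocated slots, counted from the start of the current frame, are
    # o, o + G, o + 2G, ...; frame i covers slots i*N + 1 .. (i + 1)*N.
    return (o - 1 - i * N) % G + 1


def reachable_offsets(o0, G, N):
    """Offsets visited in successive frames, starting from offset o0."""
    seen = []
    o = o0
    while o not in seen:
        seen.append(o)
        o = next_offset(o, G, N)
    return seen


print(opportunities(1, 3, 4))      # G = 3 groups, N = 4 slots per frame
print(reachable_offsets(1, 3, 4))
```

With G = 3 and N = 4, a user starting at offset 1 cycles through three distinct offsets, which is why the PAoI distribution has to be unconditioned over the set of reachable offsets.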

Latency-Oriented System
We now derive the distributions of the KPIs in the NOMA latency-oriented case. As for OMA, intermittent users maintain a queue of up to Q packets, and we can define the transition matrix P (Q) of the Markov chain representing the queue state of an intermittent user right after two successive transmission opportunities: Using the same procedure as in (23), we can derive the steady-state distribution π (0) , and then the value of π (n) in intermediate slots. We can then define the success probability and throughput for the broadband user We now analyze the latency for an intermittent user. As in the PAoI case, we consider an offset o, with a set of possible transmissions T o (N) given by (39). Latency is composed of two parts, the waiting time W and the decoding time Y. The waiting time is the time from the generation of the packet until it is transmitted, and the decoding time depends on when the frame from the broadband user is decoded. As it was done for the OMA case, we define the generation set G (n) , which contains the possible numbers of arrivals in each transmission window after the generation of the considered one: The probability of each element in the set is given by: As we did for OMA, we define the set H (n,q) , which contains the elements g ∈ G (n) for which the considered packet is transmitted at the -th opportunity, following the definitions we gave in (26) and (27). We compute the dropping probability for a packet generated in slot n with q packets ahead of it as such: We can now compute p W (w; n, q) In order to compute Y, we need to consider the fact that transmission opportunities before or after the one in which the packet is sent are used by the same user. We consider a packet generated in slot i in a frame with offset o, which finds q packets ahead of it and waits for w slots before being transmitted. 
If the transmission is in the same frame as the packet generation, there might be C transmission opportunities unused by the user before the packet generation, whereas if the transmission is in a subsequent frame, the user is active in all transmission opportunities in the frame before the one in which the packet is transmitted, because it still has packets in the queue. We know that the offset of the frame in which the packet is transmitted is ω i+w N (o), as given by (52). In the following, we will simply refer to this value as ω to simplify the notation. We then have that C = 0 if i + w > N, and in the other case we need to consider the possible events that happened before the generation of the considered packet.
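The queue Markov chain introduced at the start of this subsection can be sketched as follows. This is only an illustration under assumptions that are not taken from the paper: arrivals are Bernoulli with probability α in each slot, there are G slots between two transmission opportunities of the user, packets arriving to a full queue are dropped, and exactly one packet leaves the queue at each opportunity if the queue is not empty. Under these assumptions the state observed right after an opportunity ranges from 0 to Q − 1.

```python
from math import comb


def arrivals_pmf(G, alpha):
    """Binomial pmf of the number of arrivals in G slots."""
    return [comb(G, a) * alpha**a * (1 - alpha)**(G - a) for a in range(G + 1)]


def queue_transition_matrix(Q, G, alpha):
    """Transition matrix of the queue state sampled after each opportunity."""
    pmf = arrivals_pmf(G, alpha)
    P = [[0.0] * Q for _ in range(Q)]
    for q in range(Q):
        for a, p in enumerate(pmf):
            # Cap the queue at Q, then serve one packet if any is present.
            nxt = max(min(q + a, Q) - 1, 0)
            P[q][nxt] += p
    return P


def stationary(P, iters=10_000):
    """Steady-state distribution obtained by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


P = queue_transition_matrix(Q=3, G=5, alpha=0.05)  # illustrative values
print(stationary(P))
```

The paper's matrix P (Q) may order arrivals and departures differently; the sketch only shows the structure of the computation.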
We now compute the pmf of C. There are |T o (i − 1)| transmission opportunities before the generation of the packet. We define n(i; o) as the slots between the last available allocated slot and slot i: We define the generation set J o (i) as Each vector j in the set corresponds to a possible sequence of past events that led to this point. We define the number of queued packets at the i-th allocated slot for the generation vector j for a given starting queue q 0 , denoted as q i (j; q 0 ), as If we condition the set on the fact that the packet generated in slot i finds q packets in the queue, we get For each initial queue q 0 , we can then define a set C (i) o,q 0 (l), which contains the generation vectors that cause exactly l transmission opportunities to be unused.
We then get p C (l; i, q, o).
We now repeat the same consideration for transmission opportunities after the transmission of the considered packet. The number of packets in the queue after the transmission V(g), for a given generation set g, is There are at least V(g) occupied transmission opportunities after the transmission of the packets. We can then define the generation set H (V), which represents the possible new packet arrivals.
The probability of each vector h in the set is given by: We define the number of queued packets at the i-th allocated slot for the generation vector h, denoted by q i (h), as where q 0 = 0. We can then define the set F ( f ), which requires f transmission opportunities to be unused by the considered user.
The probability of having f unused transmission opportunities for the user by the d-th slot in the frame after the packet transmission, given the generation vector g, is then In the following, we denote (1 − ε B ) as p 1 to simplify the notation. We now compute the probability that r packets from the broadband user frame are correctly received by slot d, given that there are f transmission opportunities before it left unused by the considered user: where p free is the same as in (29). We can then define the probability that at least K packets have been received by slot d, P R (d, f , o).
The success probability for a packet i, which finds q packets ahead of it in the queue, in a frame with offset o, is The latency when the decoding delay is 0: where p free is the same as in (29). If the decoding delay is not 0, we need to consider that transmission opportunities after the slot might be free. Furthermore, we define p B (d, i, w, q, g, f , o) as the probability of correctly receiving a packet from the broadband user in slot d: We can now compute the latency and decoding delay joint pmf when the latter is not 0.
If i + w is larger than N, the packet is transmitted in the next frame, and we have Now we uncondition p T,Y (t, y; o, i, q) on i and q and remove Y to get p T (t; o), knowing that the generation probability is the same in all slots: Taking the offset set O(o 0 ) as defined in (57), and using the probabilities in (58), we get: In the same way, we can compute the reliability p (I) s from (81):

Results
In this section, we show some illustrative analytical results for the PAoI-oriented and latency-oriented cases. We first confirm that our theoretical calculations are correct by considering a given scenario and performing a Monte Carlo simulation. We simulate the erasure channel and destructive interference simply by dropping packets from the list, and consider T = 1,000,000 frames. In the scenario we simulate, the broadband user protects its transmission with an erasure code with N = 40 and K = 32, and the arrival rate for each of the U I = 10 intermittent users is α = 0.005 (i.e., U I α = 0.05). The OMA systems use T int = 5. As Figure 4 shows, the theoretical results for both PAoI and latency-reliability, shown here as CDFs, match the simulations perfectly in all cases. Monte Carlo results are not shown for the rest of the section to improve the readability of the plots, but the results still match tightly with the theoretical analysis.
The results are presented in the form of Pareto frontiers, which capture the best trade-offs between the throughput of the broadband user S B and the 90th percentile of the timeliness metric for the intermittent users. The parameter settings are shown in Table 2. With the selected parameters and if only the broadband user is considered, the optimal source and coded block sizes are K = 64 and N = 77, respectively (where K is limited to 64 to make the solution practical), which results in a throughput of S B = 0.8147 packets per slot. The latter corresponds to the upper bound in throughput for both OMA and NOMA systems evaluated in the following. We first consider PAoI-oriented systems, whose performance has a strong dependence on the aggregate arrival rate U I α. We assume that U I = 4, which allows us to explore a wide range of values for α. In this case, the grouping scheme used G = 2, whereas the ALOHA and TDMA cases had the expected G = 1 and G = 4, respectively.
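To illustrate the kind of check described above, the following Python sketch compares an analytical broadband success probability with a small Monte Carlo estimate for NOMA ALOHA in the PAoI-oriented case. It is a sketch under stated assumptions, not the simulator used for Figure 4: the broadband erasure probability `eps_b = 0.1` is a hypothetical value (Table 2 is not reproduced here), each intermittent user is assumed to transmit in a slot with probability α, a broadband packet is assumed lost whenever any intermittent user transmits in its slot, and the throughput is taken as (K/N) times the frame success probability.

```python
import random
from math import comb


def broadband_success_analytical(N, K, eps_b, alpha, n_users):
    """P(at least K of N broadband packets survive erasures and collisions)."""
    p = (1 - eps_b) * (1 - alpha) ** n_users  # per-slot success probability
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(K, N + 1))


def broadband_success_simulated(N, K, eps_b, alpha, n_users, frames, seed=0):
    """Monte Carlo estimate of the same probability by dropping packets."""
    rng = random.Random(seed)
    recovered = 0
    for _ in range(frames):
        received = 0
        for _ in range(N):
            collided = any(rng.random() < alpha for _ in range(n_users))
            erased = rng.random() < eps_b
            if not collided and not erased:
                received += 1
        recovered += received >= K
    return recovered / frames


# N, K, alpha and the number of users follow the validation scenario;
# eps_b is an assumed value chosen only for illustration.
N, K, ALPHA, USERS, EPS_B = 40, 32, 0.005, 10, 0.1
p_s = broadband_success_analytical(N, K, EPS_B, ALPHA, USERS)
print("analytical p_s:", p_s, "throughput:", K / N * p_s)
```

Calling `broadband_success_simulated` with a few thousand frames should give an estimate close to the analytical value; far more frames are needed to resolve the tails of the CDFs.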
When α is very low, the inter-arrival time dominates the PAoI and the impact of the choice of access scheme is negligible. As Figure 5 shows, this is true even for a total arrival rate of U I α = 0.01, which corresponds to an average of one packet every 400 slots from each source: as the inter-arrival times are approximately exponentially distributed, the 90th percentile of the inter-arrival time is 920 slots, and it is impossible to achieve a lower ∆ 90 . In cases with a higher arrival rate, OMA TDMA seems to be the best system, although NOMA ALOHA can achieve a similar performance when PAoI is more important than the broadband user throughput.
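The 920-slot figure can be checked with a short calculation. The sketch below assumes one Bernoulli arrival trial per slot with per-user probability α = 0.01/4 = 0.0025, so that inter-arrival times are geometric; the exponential quantile is shown for comparison. Function names are illustrative.

```python
from math import ceil, log


def geometric_quantile(p, q):
    """Smallest n such that P(inter-arrival time <= n) >= q."""
    return ceil(log(1 - q) / log(1 - p))


def exponential_quantile(rate, q):
    """q-quantile of an exponential distribution with the given rate."""
    return -log(1 - q) / rate


alpha = 0.01 / 4  # aggregate rate 0.01 shared by U_I = 4 users
print(geometric_quantile(alpha, 0.9))
print(exponential_quantile(alpha, 0.9))
```

Both values are close to 400 ln(10), i.e., roughly 2.3 times the mean inter-arrival time.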
Besides the achievable performance trade-offs, it is also important to observe the parameter settings that achieve Pareto efficiency, as shown in Figure 6. The difference between the optimal values of T int in OMA for the three considered schemes is stark, as shown in Figure 6a. This is because collisions are the main factor driving up the age in OMA, making the age for TDMA far lower. The other factor in the age is the waiting time due to the grouping: while TDMA compensates for this by avoiding collisions entirely, the grouping scheme with G = 2 is the worst of both worlds, getting extremely poor performance due to having both a longer interval between allocated slots and the risk of collisions. Therefore, in age-oriented systems where the arrival rate α for each intermittent user needs to be relatively high to achieve the desired AoI, orthogonal slicing among all users (broadband and intermittent) is a good choice, as the alternative will result in a high collision probability.
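For reference, a Pareto frontier of the kind plotted in these figures can be extracted from a set of evaluated configurations as in the sketch below. Each point is assumed to be a pair (broadband throughput, 90th-percentile timeliness metric), where the first is to be maximized and the second minimized; the sample values are invented for illustration and are not results from the paper.

```python
def pareto_frontier(points):
    """Keep the points not dominated in (max throughput, min timeliness)."""
    frontier = []
    best_metric = float("inf")
    # Scan from highest to lowest throughput; a point is kept only if it
    # improves the timeliness metric over every point seen so far.
    for s_b, metric in sorted(points, key=lambda p: (-p[0], p[1])):
        if metric < best_metric:
            frontier.append((s_b, metric))
            best_metric = metric
    return frontier


# Hypothetical (S_B, percentile) pairs for a sweep over coding parameters.
samples = [(0.80, 900), (0.75, 400), (0.70, 450), (0.60, 200), (0.60, 300)]
print(pareto_frontier(samples))
```

Each point on the frontier corresponds to one parameter setting, which is what plots of the Pareto-efficient parameters report.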
Collisions are not so common in NOMA, as the transmissions for the intermittent users can be spread out over all slots, and are not concentrated in some reserved ones. In this case, the specific method used is not very important, as Figure 6b shows: the three schemes have a similar age with very similar coding rates. However, NOMA cannot significantly outperform OMA TDMA, as allowing collisions with the broadband user limits the achievable throughput.
Next, we consider the latency-oriented systems where the 90th percentile of latency-reliability T 90 is the main KPI for intermittent users. For these, we focus on illustrating the impact of the arrival rate U I α and the number of intermittent users U I . Figures 7-9 show the Pareto frontiers for the cases with U I = 4, U I = 10, and U I = 100, respectively. Each of the figures includes the latency and throughput trade-offs for U I α ∈ {0.01, 0.02, 0.05, 0.1}.
For the case with U I = 4, we see an interesting phenomenon in Figure 7a,b: if the arrival rate is low, OMA ALOHA is the optimal choice if the main KPI is the latency-reliability. However, it is not able to achieve a high broadband user throughput S B . Conversely, NOMA, either with ALOHA or grouped access among the intermittent users, can achieve the greatest throughput S B ≈ 0.8. In addition, NOMA ALOHA achieves the lowest latency-reliability with S B > 0.
As the arrival rate increases with U I = 4, NOMA becomes the Pareto efficient choice for all points in the latency-throughput trade-off, albeit with a small margin. This is observed in Figure 7c,d, where the Pareto efficient methods are NOMA ALOHA and NOMA TDMA, respectively, with NOMA grouping achieving a close performance. The reason for the better performance of NOMA with high arrival rates is that it allows the intermittent users to access considerably more resources than OMA, which minimizes collisions between them. These collisions are particularly harmful for the system, as they cannot be resolved. Therefore, OMA ALOHA becomes infeasible with high arrival rates, whereas OMA TDMA may suffer from queue overflows since intermittent slots are spaced by U I T int slots.
Next, Figure 8 shows a similar pattern to Figure 7, but with a much better performance of NOMA with respect to OMA. Specifically, NOMA ALOHA and grouping achieve much better trade-offs when compared to OMA TDMA for the considered arrival rates, with the only exception being that NOMA grouping is not viable for U I α = 0.1. This is also the case with all the ALOHA methods, which fail for the cases with U I α = 0.1 because of the excessive collisions among the intermittent users. Finally, OMA grouping can only achieve the required 90% reliability for the intermittent users with U I α = 0.1 by making S B = 0.
The case with U I = 100, displayed in Figure 9, features more pronounced differences among the access schemes, indicating that the selection of the access scheme and/or its parameters will be even more critical in massive access scenarios with a larger number of users. As in the previous cases, using NOMA becomes more convenient as the total arrival rate increases. OMA ALOHA performs particularly well for low total activation rates, as collisions between intermittent users are rare in this scenario, and in settings that are oriented more towards latency-reliability than broadband user throughput, as increasing the transmission opportunities for the intermittent users can further reduce the probability of collisions between them.
In general, it can be concluded that ALOHA schemes perform better under low arrival rates U I α, whereas TDMA schemes perform better when the aggregate arrival rate increases. This may be expected, in particular as the assumed timeliness parameters of interest are rather stringent. The performance of OMA grouping oftentimes lies between that of OMA ALOHA and TDMA for all values of U I . This showcases its robustness to the arrival rate U I α, but also that it is not an ideal option to optimize performance. Instead, NOMA grouping achieves a remarkable performance, oftentimes matching or even surpassing the performance of NOMA ALOHA and NOMA TDMA, even with very high rates. Depending on the scenario, the number of groups is highly variable: if U I α = 0.1, the grouping scheme uses the largest possible number of groups (i.e., G = 50 with U I = 100), making the scheme closer to TDMA than pure ALOHA. On the other hand, ALOHA is more convenient for lower activation rates, so the best grouping performance will be obtained with G = 2.
Most interestingly, NOMA schemes outperform OMA under most conditions, with the exception of OMA ALOHA for low arrival rates. This behavior is extremely encouraging for the performance of NOMA in realistic systems, as the collision channel we considered is a worst-case scenario for non-orthogonal access.
As observed in our previous work [7], by including the probability of channel capture and intra-collision SIC, the performance of non-orthogonal schemes can only improve. Nevertheless, OMA may also benefit from capture and intra-collision SIC by mitigating collisions between intermittent users.

Conclusions
In this work, we investigated the performance trade-offs with orthogonal and non-orthogonal spectrum slicing in a multiple access system with broadband and intermittent users. We derived closed-form expressions for both PAoI and latency-reliability for the intermittent users, along with throughput for the broadband user, in a time-slotted system in which the users share a single frequency channel.
The results illustrate that, when the broadband user implements an erasure code, the choice between OMA and NOMA depends on the specific features of the considered scenario and on the objectives of the system designer. In particular, the number of intermittent users and their aggregate arrival rate have a major impact on the preferred slicing and access method for latency-oriented systems. In these cases, TDMA was clearly preferable for the higher arrival rates, whereas ALOHA performed remarkably well with low to medium arrival rates. Interestingly, the opposite effect can be seen for the choice of the access scheme, as NOMA outperformed OMA with higher arrival rates, and orthogonal allocation worked better for lower arrival rates. The NOMA ALOHA scheme presents a case of particular interest, as by correctly tuning the coding parameters for the broadband user, it could oftentimes achieve the best performance trade-offs with low to medium arrival rates in the extreme cases, that is, when the intermittent users required the lowest latency and when the broadband user required the highest throughput. On the other hand, NOMA TDMA is clearly the best access method for latency-reliability with high arrival rates. The PAoI results show that the two access methods are almost equivalent, as long as they are configured properly, and the main driver of performance is the packet generation process. However, OMA TDMA does show significant advantages with respect to the other OMA schemes, as it avoids collisions entirely, whereas the other OMA schemes may still have collisions between intermittent users. These results, obtained in the simple collision channel without capture, showcase the potential of NOMA schemes in scenarios with heterogeneous service types, as channel capture and intra-collision SIC are expected to further improve their performance.
Future work on the subject can be oriented in multiple directions: First, analyzing the system with MPR is definitely a priority, as the worst-case analysis has already shown the advantages of NOMA. Secondly, more realistic systems could be investigated, with time-dependent arrival patterns or with multiple frequency channels, which would add an interesting dimension to the problem by providing parallel resources. The possibility of using packet repetition to increase the intermittent users' reliability is another interesting facet that can be examined, although the complexity of the system may grow beyond the possibility of analytical tools, requiring a simulation-based approach.