Performance Model for Video Service in 5G Networks
Future Internet 2020, 12, 99

Abstract: Network slicing allows operators to sell customized slices to various tenants at different prices. To provide better-performing and cost-efficient services, network slicing is looking to intelligent resource management approaches to be aligned to users’ activities per slice. In this article, we propose a radio access network (RAN) slicing design methodology for quality of service (QoS) provisioning, for differentiated services in a 5G network. A performance model is constructed for each service using machine learning (ML)-based approaches, optimized using interference coordination approaches, and used to facilitate service level agreement (SLA) mapping to the radio resource. The optimal bandwidth allocation is dynamically adjusted based on instantaneous network load conditions. We investigate the application of machine learning in solving the radio resource slicing problem and demonstrate the advantage of machine learning through extensive simulations. A case study is presented to demonstrate the effectiveness of the proposed radio resource slicing approach.


Introduction
Fifth generation (5G) cellular networks are required to support cell capacity of multi-gigabit per second (Gbps) and cell edge throughput of tens of megabits per second (Mbps). In addition to the pure performance metrics, such as the rate, latency, reliability, and allowed connections, the scope of 5G incorporates the transformation of the mobile network ecosystem and accommodates heterogeneous services using one infrastructure [1]. To achieve this goal, 5G networks incorporate a technique named network slicing. Network slicing tries to slice the whole network into slices, each of which is tailored to the specific service requirements that are agreed to in a service level agreement (SLA) with customers (also known as tenants). Therefore, network slicing is an emerging business for operators that allows them to sell the customized network slices to various tenants at different prices [1]. As part of network slicing, radio access network (RAN) slicing defines a shared RAN connected to each of the multiple tenants' core networks, with radio resources distributed by the central controller to different tenants, to maximize the radio resource usage.
To provide better-performing and cost-efficient services, RAN slicing is looking to intelligent radio resource management approaches to be aligned with users' activities per slice. For radio access networks, the spectrum is expensive; thus, guaranteeing high spectrum efficiency (SE) is meaningful. SLAs usually impose stringent requirements on the network quality of service (QoS). Therefore, it is critical to investigate how to intelligently respond to the dynamics of the traffic load from mobile users to obtain satisfactory QoS in each slice at an acceptable spectrum cost [1]. In particular, RAN providers must implement new radio resource management approaches to map to perform MU-MIMO transmission toward UEs, to eliminate inter-cell interference and achieve the highest cell capacity [26,27].
Beamforming systems include switched beam systems (SBSs) and adaptive array systems (AASs). A switched beam system uses fixed beams and beam patterns that point in predetermined directions within a cell [28]. An adaptive array system forms a beam pattern for each user; the pattern directs the main lobe to the desired UE and nulls to the interfered UEs for interference suppression [29]. The motivation for an SBS is its lower cost, as an AAS is expensive for commercial mobile networks.

Contribution
In this paper, we present a two-level dynamic interference coordination (IC) approach, operating at the intra-cluster and inter-cluster levels and coordinated in the time and frequency domains, to construct and optimize the QoS performance model, together with an approach for utilizing the constructed QoS performance model for network design and optimization. The approach targets the SBS on the mmWave band, employs MU-MIMO to improve the system capacity, and is designed for delay-sensitive traffic (i.e., video services, which are fixed wireless access (FWA) applications). The main contributions of this paper include the following:

• Our proposed approach targets improvement in performance models that map high-level customer-friendly business requirements to low-level network parameters and achieve QoS assurance for delay-sensitive traffic.

• The data analytics and ML approach is employed to construct the QoS performance model for network design and optimization, to identify the relationships and dependencies between SLA high-level requirements and low-level network attributes.
The paper is organized as follows: the problem is formulated in Section 2, the proposed approach is described in Section 3, the simulation methodology is described in Section 4, the simulation results are presented in Section 5, the complexity analysis is shown in Section 6, and Section 7 concludes the paper.

Problem Description
By balancing the relative importance of resource utilization (i.e., SE) and the quality of experience (QoE) satisfaction ratio, the resource management problem could be formulated as R = α · SE + β · QoE, where α and β represent the weights of SE and QoE [1]. QoE is subjective, so we use QoS instead.
Consider an area with U users, denoted as U = {1, ..., U}, uniformly distributed within the area, supporting S services, denoted as S = {1, ..., S}, and sharing the aggregated bandwidth W. Denote U_s as the users supporting service s. Due to the lack of physical resources, sites are shared among services and are uniformly distributed within the area (with the same inter-site distance (ISD) between the sites). In a real network, the target is not to maximize QoS but to maximize the number of users who satisfy the QoS requirements. The problem is written as Problem SM-P1, which is optimized over w and isd, where T_s is the pre-defined threshold for service s. The condition qos_u(w, isd) > T_s states that the QoS requirement T_s of service s is satisfied for user u under the network attributes, namely the bandwidth distribution w and the site separation isd.
To address the complexity, a sub-optimal solution is proposed that puts higher priority on the QoS requirement, optimizes radio resource utilization using the IC approach, and allocates dedicated bandwidth to each service. The proposed solution includes two steps. First, create a performance model for each service; the problem is thereby transformed into solving a performance model over different w_s and isd. Second, perform an exhaustive search in the ISD space and the W space for all services. The point that satisfies the total bandwidth constraint and fulfills the QoS requirements for all services, with the maximum ISD, is the optimized solution.
The core of the solution is solving the performance model for each service. Data analytics and machine learning approaches based on regression algorithms, which allow automated identification of non-explicit relationships and dependencies, are exploited. With the performance model, the following problems are investigated: (1) optimization of the performance model using IC approaches, (2) network design using the performance models, and (3) network optimization.
In this research, the proposed UIC approaches operate in both the time (TIC) and frequency (FIC) domains. First, we modify the approach we proposed in [29], extend it into the time domain, and apply it to the video service. Assume the mobile network consists of C cells, denoted as C = {1, ..., C}.
This research assumes an SBS system using basic and switched directional antennas. Each user chooses one beam to transmit, referred to as the serving beam, and we define the strongest beam from cell c to user u as m_{u,c}. We also assume a constant transmit power p_0 per sub-band for each cell. If multiple beams are transmitted, the power is distributed evenly among the simultaneously transmitted beams. Figure 1, Figure 2, and Equations (6)-(8) describe these assumptions. Figure 1 shows the radiation pattern of basic and switched directional antennas with half power beam widths (HPBW), denoted as θ_B, of 15°, 30°, and 60°. Equation (6) gives the radiation pattern in decibels [30], where G_{0,dB} is the maximum gain, G_{N,dB} is the averaged gain at the sidelobe, and ∅_{u,c,m} is calculated by a separate expression.
With the above assumptions, the optimization problem is written as a function of I and optimized over I, subject to a set of constraints; one of the terms in the formulation is defined in Equations (6)-(8). In the formulation, w_u is the weight of user u; for video traffic, w_u is calculated using the head-of-line delay of the video traffic. W is the bandwidth. g_{u,c} is the channel gain, which depends on the distance between cell c and user u. M_max is the maximum number of beams per cell. σ_u^b is the thermal noise.
Problem IC-P1 defines the UIC approach. GIC approaches allocate time and frequency resources to UEs so that the interference levels of all UEs are lower than a threshold. This problem is formulated as Problem IC-P2, which maximizes an objective summed over cells c ∈ C, sub-bands b ∈ B, and users u ∈ U_c, optimized over I and subject to a constraint in which SINRThreshold is the pre-defined threshold.
The IC approaches defined in IC-P1 and IC-P2 are used to improve the performance model. After selecting the best interference coordination approach, we construct the performance models for each service to map SLA requirements into network attributes. The performance models are used for network design and optimization. For network design, we propose an approach that learns from the performance models of existing markets and uses the prediction to design the new market. Figure 3 shows the proposed design approach. Because the network attributes are designed for the worst case (i.e., the highest traffic loads), the network optimization process dynamically adjusts the designed network attributes to instantaneous network load conditions to improve the network usage.

Interference Coordination Approach
Large networks are partitioned into coordination clusters for scalability considerations. This causes an inter-cluster interference problem; i.e., within a cluster, the inter-cell interference has been minimized, but between clusters, inter-cluster interference still exists. Therefore, we present a two-level IC approach to improve the performance model, which reduces intra- and inter-cluster interference.

Network Clustering
We used the clustering methodology in [29] to group sectors into clusters. Neighbor sectors with higher interference to each other are more likely to be clustered together. The interference is calculated based on all UEs in the neighbor sectors and their allocated beams, using Equations (6)-(8), and is averaged over all UEs within the sector, as shown in Figure 2. Each cluster starts from a randomly selected sector, and further sectors are added in order of the highest interference to the sectors already in the cluster. The cluster grows up to a maximum cluster size C_max.
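The clustering rule described above can be sketched as follows. This is a minimal illustration and not the procedure of [29]: the function name `build_clusters`, the matrix `interference` (assumed to hold the averaged sector-to-sector interference already computed from Equations (6)-(8)), and the choice to sum interference in both directions are all assumptions made for the example.

```python
import random


def build_clusters(interference, c_max, seed=0):
    """Greedily group sectors into clusters of at most c_max sectors.

    interference[i][j] is a hypothetical averaged interference level that
    sector i causes to sector j (linear units, values are assumptions).
    """
    rng = random.Random(seed)
    unassigned = set(range(len(interference)))
    clusters = []
    while unassigned:
        # Each cluster starts from a randomly selected unassigned sector.
        start = rng.choice(sorted(unassigned))
        cluster = [start]
        unassigned.remove(start)
        while len(cluster) < c_max and unassigned:
            # Add the sector with the highest mutual interference toward
            # the sectors that are already in the cluster.
            best = max(
                sorted(unassigned),
                key=lambda s: sum(
                    interference[s][m] + interference[m][s] for m in cluster
                ),
            )
            cluster.append(best)
            unassigned.remove(best)
        clusters.append(cluster)
    return clusters
```

With a toy matrix in which sectors 0 and 1 interfere strongly with each other, and likewise sectors 2 and 3, the sketch pairs them regardless of the random starting sector.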

Intra-cluster Interference Graph Construction
After we construct the clusters, we build a directed graph G = (N, V), referred to as an intra-cluster interference graph. In the graph, the N nodes represent N UEs and the V edges represent interference between UEs. An edge directed from A to B means that if UE B is transmitting, it causes interference w_{A,B} to UE A, where w_{A,B} is the edge weight. The interference graph is built using Equations (6)-(8) by assuming a fixed transmit power; i.e., one UE transmits at a time. During scheduling, the interference among the UEs is calculated dynamically, using the transmit power from Equation (7), which depends on the number of UEs scheduled simultaneously.
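A minimal sketch of this graph, under stated assumptions, is given below. The callable `interference_fn(a, b)` is a hypothetical stand-in for the fixed-power interference of Equations (6)-(8), and the even power split is modeled by dividing each edge weight by the number of UEs co-scheduled in the interferer's cell, which is a simplification chosen for the example rather than the exact form of Equation (7).

```python
def build_interference_graph(ues, interference_fn):
    """Return graph[a][b] = w_{A,B}: interference UE b causes to UE a.

    interference_fn is a hypothetical callable that evaluates the
    fixed-power interference (one UE transmitting at a time).
    """
    graph = {a: {} for a in ues}
    for a in ues:
        for b in ues:
            if a != b:
                weight = interference_fn(a, b)
                if weight > 0:
                    graph[a][b] = weight
    return graph


def received_interference(graph, victim, scheduled, cell_of):
    """Interference seen by `victim` when the UEs in `scheduled` transmit.

    Assumption: a cell splits its power evenly over its co-scheduled
    beams, so each fixed-power edge weight is scaled by 1 / (number of
    UEs scheduled in the interferer's cell).
    """
    per_cell = {}
    for ue in scheduled:
        per_cell[cell_of[ue]] = per_cell.get(cell_of[ue], 0) + 1
    total = 0.0
    for ue in scheduled:
        if ue != victim:
            total += graph[victim].get(ue, 0.0) / per_cell[cell_of[ue]]
    return total
```

Storing the graph as a dictionary of dictionaries keeps the lookup of w_{A,B} constant-time during scheduling, which matters because the interference is re-evaluated each time a candidate UE is considered.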

Interference Coordination
Interference coordination is performed at the intra- and inter-cluster levels, which are summarized in Table 1.
Table 1. Proposed IC algorithms.

(1) Intra-cluster interference coordination
Within each cluster, up to M_max UEs can be scheduled simultaneously per time-frequency resource (per TTI per sub-band). A greedy algorithm is used for intra-cluster scheduling. The algorithm starts by adding a randomly selected UE; further UEs are added according to the following criteria:
IC-P1: the network utility defined in (9) increases.
IC-P2: the number of scheduled UEs satisfying the constraint defined in (14) increases.
UEs are added until the criterion is violated or the maximum number of users per cluster is reached. In this way, UEs are partitioned into groups, and the groups are assigned different time-frequency resources. A scheduled UE is not scheduled again on a later time-frequency resource in the same intra-cluster coordination period (ICCP), where the ICCP is the number of TTIs within which all UEs are scheduled once. Algorithm 1 shows the pseudo-code.

(2) Inter-cluster interference coordination
Inter-cluster interference coordination restricts the UEs scheduled during intra-cluster scheduling. A cluster-edge UE reports the UEs causing high interference, and by removing some of them, the cluster-edge UE can achieve a higher SINR or a higher total network utility. The forbidden relationship table (Table 2) is built on the serving cell and dynamically updated. Table 1 from [23] shows a forbidden relationship table. Algorithm 2 shows the pseudo-code.
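The inter-cluster step can be illustrated with a short sketch for the SINR-threshold case. It assumes, hypothetically, that the forbidden relationship table entry of a cluster-edge UE is available as a dictionary `interferers` mapping each reported interfering UE to its interference power in linear units. The function names and the strongest-first removal order are assumptions for the example, not a reproduction of Algorithm 2.

```python
def sinr(signal, interferers, noise):
    """Linear SINR given a dict of interferer powers."""
    return signal / (noise + sum(interferers.values()))


def unschedule_interferers(signal, interferers, noise, sinr_threshold):
    """Remove reported interferers, strongest first, until the
    cluster-edge UE reaches the SINR threshold.

    Returns the list of unscheduled UEs and the resulting SINR.
    """
    remaining = dict(interferers)
    removed = []
    # Sorting the reported interferers is the O(N log N) step.
    ranked = sorted(interferers.items(), key=lambda kv: kv[1], reverse=True)
    for ue, _power in ranked:
        if sinr(signal, remaining, noise) >= sinr_threshold:
            break
        del remaining[ue]
        removed.append(ue)
    return removed, sinr(signal, remaining, noise)
```

For a UE with signal power 10, noise power 1, and two reported interferers with powers 4 and 1, a linear threshold of 4 is met after unscheduling only the stronger interferer, because the SINR rises from 10/6 to 10/2.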

Algorithm 1 Intra-cluster scheduling algorithm
For each cluster c
Scheduled UE at this intra-cluster coordination period: U
Initialization: Scheduled UE at this TTI and this sub-band:
Algorithm IC-P2 turns off UEs that cause high interference to neighbor clusters and reschedules them to other time-frequency resources, which causes capacity loss. Algorithm IC-P1 is targeted at maximizing the network capacity while simultaneously reducing the interference. In the pseudo-code, Util are reported by the forbidden relationship table. Figure 4 shows how interference affects both intra-cluster and inter-cluster coordination. In cluster 1, BS1 schedules UE1 and UE2. When BS2 schedules UEs, the intra-cluster algorithm skips UE3, because it causes high interference to UE1, which is already scheduled, and schedules UE4 instead. In cluster 2, BS3 schedules UE5. During inter-cluster coordination, UE2 is found to have a low SINR, and its interferer UE5 is unscheduled to reduce the interference.
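Since only a fragment of the Algorithm 1 pseudo-code survives, the greedy grouping is sketched below for the IC-P2 criterion only. This is a simplified illustration under several assumptions: `graph` is a hypothetical interference graph in the dictionary form graph[a][b], `signal` holds hypothetical received signal powers, the power split among beams is ignored, and a candidate that would push any group member below the threshold is skipped (as UE3 is in the Figure 4 example).

```python
import random


def group_sinr_ok(group, signal, graph, noise, threshold):
    """True if every UE in the group meets the linear SINR threshold."""
    for ue in group:
        interference = sum(graph[ue].get(v, 0.0) for v in group if v != ue)
        if signal[ue] / (noise + interference) < threshold:
            return False
    return True


def schedule_iccp(ues, signal, graph, noise, threshold, m_max, seed=0):
    """Partition the UEs of one cluster into groups for one ICCP.

    Each group shares one time-frequency resource and holds at most
    m_max UEs; every UE is scheduled exactly once per ICCP.
    """
    rng = random.Random(seed)
    pending = list(ues)
    groups = []
    while pending:
        first = rng.choice(pending)  # random starting UE
        group = [first]
        pending.remove(first)
        for candidate in list(pending):
            if len(group) >= m_max:
                break
            # Keep the candidate only if the whole group stays feasible.
            if group_sinr_ok(group + [candidate], signal, graph, noise, threshold):
                group.append(candidate)
                pending.remove(candidate)
        groups.append(group)
    return groups
```

An IC-P1 variant would replace the feasibility check with a test of whether the network utility of Equation (9) increases when the candidate is added.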

Network Design Approach
After selecting the interference coordination approach, we can construct the performance model to map the SLA requirements to the network attributes. For video service, the SLA requirements include the traffic demand (or the number of UEs that fulfill specific service QoS requirements), and the network attributes include the bandwidth allocated among the services and the ISD. Other attributes, such as the average UE distance from the site and the variance in traffic demand among sectors, can be used to fine-tune the performance model. Algorithm 3 is used to create the performance model.

The algorithm maps the input variables (s, u, b, isd, dis, and varT) to the required bandwidth. UETHRESHOLD is defined as the percentage of UEs that satisfy the QoS requirements; here we use 95%. Multiple drops with different random seeds are used to generate the data sets for data analytics and machine learning. For the network design, the worst case is considered. Thus, for each input data point (s, u, b, isd, dis, and varT), the maximum required bandwidth is calculated, and the performance model is represented by maxBW(s, u, b, isd, dis, varT) for each service. The goal is to approximate the mapping function so that the required bandwidth can be predicted for new input data points. See Figure 3 for the approach. We observed that when more drops are used for learning, fewer SLA violations occur.
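The worst-case aggregation and the prediction step can be sketched as follows. This is an illustrative simplification rather than Algorithm 3: the helper names are hypothetical, the predictor is plain one-dimensional linear interpolation along a single input variable (for example the number of UEs, with the other variables held fixed), and values outside the sampled range are clamped to the nearest sample, which is an assumption of the sketch.

```python
from bisect import bisect_left


def build_max_bw_model(samples):
    """Collapse simulation drops into a worst-case model.

    samples: iterable of (point, required_bw) pairs, where point is a
    hashable tuple such as (s, u, b, isd, dis, varT).
    Returns {point: maximum required bandwidth over all drops}.
    """
    model = {}
    for point, bw in samples:
        model[point] = max(bw, model.get(point, 0.0))
    return model


def interpolate_bw(curve, x):
    """Linearly interpolate maxBW along one input variable.

    curve: list of (x, max_bw) pairs sorted by x.
    """
    xs = [p[0] for p in curve]
    if x <= xs[0]:
        return curve[0][1]
    if x >= xs[-1]:
        return curve[-1][1]
    i = bisect_left(xs, x)
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```

Taking the maximum over drops, rather than the mean, is what makes the model conservative: a prediction from it is only violated when a new drop is worse than every drop seen so far for that data point.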

Given a total bandwidth of W and demand d, considering a finite number of services and using interpolation, an exhaustive search can be performed to find the best fit w and corresponding ISD. Algorithm 4 shows the approach.
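A minimal sketch of such an exhaustive search is shown below. It is not Algorithm 4 itself: `models` is a hypothetical mapping from each service to a callable that returns the required bandwidth (already interpolated from the performance model for the given demand) at a candidate ISD, and the linear toy models in the usage note are made-up numbers.

```python
def design_network(models, isd_candidates, total_bw):
    """Return (isd, allocation) for the largest feasible ISD, or None.

    models: {service: callable(isd) -> required bandwidth}, where each
    callable stands in for a per-service performance model.
    An ISD is feasible when the per-service bandwidths fit in total_bw.
    """
    best = None
    for isd in sorted(isd_candidates):
        allocation = {service: model(isd) for service, model in models.items()}
        if sum(allocation.values()) <= total_bw:
            best = (isd, allocation)  # a larger feasible ISD overwrites
    return best
```

For example, with toy models requiring 2.0, 1.0, and 0.5 MHz per meter of ISD for S1, S2, and S3 plus S4, candidate ISDs of 100 to 400 m, and 1000 MHz in total, the search returns an ISD of 200 m, since 300 m would need 1050 MHz. Every candidate is evaluated, so the sketch does not rely on the required bandwidth growing monotonically with the ISD.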

Network Optimization Approach
The design is computed for the worst-case scenario. Because video traffic has a long duration, a dynamic network optimization approach is proposed to adjust the bandwidth and improve resource usage. Algorithm 5 shows the approach.

Simulation Methodology
The simulation network is a 19-cell cellular network using a hexagonal model. Each cell has three sectors served by three directional antennas: one antenna points north, and the other two are each rotated 120 degrees clockwise from the previous one. Of the fifty-seven simulated sectors, the seven cells in the center (21 sectors) were used to evaluate the performance.
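The layout can be generated with a few lines of code. The sketch below is one common way to build a two-ring hexagonal grid using axial coordinates; the function names and the coordinate convention are assumptions for illustration, not taken from the simulator used in the paper.

```python
import math


def hex_sites(isd, rings=2):
    """Site positions (x, y) of a hexagonal grid with `rings` rings
    around a central site; rings=2 gives 19 sites spaced isd apart."""
    sites = []
    for q in range(-rings, rings + 1):
        for r in range(max(-rings, -q - rings), min(rings, -q + rings) + 1):
            x = isd * (q + r / 2.0)
            y = isd * (math.sqrt(3) / 2.0) * r
            sites.append((x, y))
    return sites


def sectors(sites, azimuths_deg=(0.0, 120.0, 240.0)):
    """Three sectors per site; azimuths are measured clockwise from north."""
    return [(x, y, az) for (x, y) in sites for az in azimuths_deg]


def center_sites(sites, isd):
    """The central site and its six neighbors (the evaluation area)."""
    return [(x, y) for (x, y) in sites if math.hypot(x, y) <= 1.01 * isd]
```

With an ISD of 100 × √3 m, `hex_sites` returns 19 sites, `sectors` returns 57 sectors, and `center_sites` keeps the 7 central sites, which correspond to the 21 evaluated sectors.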
Each radio frame is 10 milliseconds long and consists of one hundred 0.1-millisecond time slots. The spectrum is split into resource blocks (RBs) of 900 kHz, and each RB is split into 12 subcarriers. LTE TDD subframe configuration 2 with a 0.75/0.25 downlink/uplink ratio, 64 quadrature amplitude modulation, 2 × 2 MIMO with closed-loop spatial multiplexing (transmission mode 4), control format indicator 1, and the extended pedestrian A 5 Hz multipath model have been assumed. The maximum transmit power per sector is 47 dBm. Figure 5 shows the link curve from [23] based on the above assumptions, assuming a 10% packet error rate.
Four video services (S1: virtual reality, S2: interactive gaming, S3: conversational video, and S4: non-conversational video) were simulated, with different data rates and delay budgets (S1: data rate 25 Mbps, delay budget 20 ms; S2: data rate 5 Mbps, delay budget 100 ms; S3: data rate 5 Mbps, delay budget 150 ms; S4: data rate 5 Mbps, delay budget 300 ms). We assumed the same target packet loss rate of 10^-3 for all services. For S1 and S2, dedicated bandwidth is allocated to each service. For S3 and S4, shared bandwidth is allocated to both services, with the assumption that each service accounts for 50% of the traffic demand.
The 3GPP Urban Macro outdoor propagation loss model [31] has been used. Neglecting UE height, the line-of-sight (LOS) probability is calculated as a function of d_2D, the distance between the BTS and the UE. Assuming an ISD of less than 1 km, the path loss is expressed in terms of PL_UMa-NLOS, where f_c is the carrier frequency of 28 GHz and PL_UMa-NLOS is calculated by:
PL_{UMa-NLOS} = 13.54 + 39.08 log_10(d_2D) + 20 log_10(f_c). (17)
The simulation works as follows. The 19 cells are placed according to the hexagonal model, and UEs are randomly placed in each sector. The network is clustered according to Section 3.1.1, and the interference graph is built according to Section 3.1.2. At each time slot, the unscheduled UEs are searched, and each cluster randomly picks one valid UE within its range to schedule (a valid UE is one that has data in its transmission queue and has not yet been scheduled); other valid UEs are then added according to the intra-cluster interference coordination algorithm.
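As a numeric check of the propagation model, Equation (17) can be evaluated directly. The sketch assumes the usual 3GPP units, with d_2D in meters and f_c in GHz; the text does not state the units explicitly, so this is an assumption, and the function name is hypothetical. The LOS probability and the combined path loss expression are not reproduced here.

```python
import math


def pl_uma_nlos_db(d2d_m, fc_ghz=28.0):
    """Equation (17): UMa NLOS path loss in dB.

    Assumed units: d2d_m in meters, fc_ghz in GHz (28 GHz by default).
    """
    return 13.54 + 39.08 * math.log10(d2d_m) + 20.0 * math.log10(fc_ghz)
```

Under these assumptions, a UE at 100 m sees roughly 120.6 dB of NLOS path loss at 28 GHz, and each doubling of the distance adds about 11.8 dB (39.08 × log_10 2), which is why the required bandwidth is sensitive to the average UE distance from the site.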
After all clusters are scheduled, the SINR is calculated for all UEs according to the allocated beams and UE locations. If a UE has a lower SINR than a pre-defined threshold, the inter-cluster coordination algorithm is run to unschedule some of its interferers, and the SINR is then recalculated. We use the link curve to determine how much data is transmitted for each scheduled UE, and the transmitted data is removed from its queue. For each UE, we record the time taken to receive a video frame; if the frame has still not been completely transmitted after the delay budget (e.g., 100 ms), a packet drop is recorded.

Simulation Results
Figure 6 compares the bandwidth required by the different IC approaches for the virtual reality service (S1), with a 25 Mbps data rate, a 20 ms delay budget (90 frames per second (FPS), with an 11 ms frame delay and a 9 ms transmission delay), and a packet loss rate of 10^-3. This scenario simulates a network with a cell radius of 100 m (ISD = cell radius × √3), 10/20/30 UEs per sector, 4 beams to horizontally cover the sector area, a maximum of 3 sectors per cluster, and a maximum of 4 MU-MIMO users per sector. Ten simulations were run with different random seeds, and the results were averaged. The baseline approach performs no interference coordination. The coordinated beamforming IC approach (BFIC) is a one-level IC approach with intra-cluster interference coordination only. Both the proposed two-level IC approach and the BFIC approach reduce the required bandwidth compared with the baseline approach. The two-level IC-P1 approach achieved the minimum required bandwidth, but the two-level IC-P2 approach was worse than the BFIC-P2 approach because of the capacity loss it caused.
Figure 7 shows a scenario for the interactive gaming service (S2), with a 5 Mbps data rate, a 100 ms delay budget (40 FPS, a 20 ms delay for frame encoding and decoding, and an 80 ms transmission delay), and a packet loss rate of 10^-3. Figure 8 shows a scenario for a mix of the conversational video service (S3) and the non-conversational video service (S4), with 50% loading each. Both services had the same 5 Mbps data rate and a packet loss rate of 10^-3, but different delay budgets (150 ms for S3 with a 130 ms transmission delay, and 300 ms for S4 with a 280 ms transmission delay). Figures 7 and 8 show that the two-level IC-P1 approach achieved the minimum required bandwidth.
[Figure: required bandwidth (MHz) versus number of UEs (10, 20, 30) for the Baseline, BF IC, and Two-Level IC approaches under P1 and P2.]
Selecting the two-level IC-P1 approach, which achieved the minimum required bandwidth, Figure 9 shows the scatter plot of 200 drops with different random seeds. For simplicity, we fixed the cell radius at 100 m, the average number of users per sector at 10, the number of beams covering the sector area at 8, and the service as S2 (interactive gaming service). The first plot shows that as the average distance from the UE to the BTS increases, more bandwidth is required. The second plot shows that as the traffic demand varies among the sectors, the variance of the bandwidth required to fulfill the QoS requirement increases. For network design, as explained in Section 3, the maximum required bandwidth is calculated for each data point (s, u, b, isd, dis, and varT) to create the performance model.
Figure 9. Bandwidth for multiple drops: (a) and (b).
Figures 10-12 show the network design performance model, simplified by fixing the traffic demand variance varT and the number of beams to 8. The number of UEs and the ISD (or the cell radius) were varied over discrete values, and for each data point (u, isd), 50 drops were run. Figure 13 shows an example of a design solution using the performance models in Figures 10-12. The SLA requirements include a varied UE density (we assume the same UE density in all three scenarios, i.e., N UEs per sector for S1, N UEs per sector for S2, and N UEs per sector for S3 plus S4) and a total available bandwidth of 1 GHz. The network design solution is to find an ISD at which the bandwidth is shared among the services. In all three scenarios specified in Figures 6-8, the UEs covered by the sites fulfilled their QoS requirements.
Figure 14 shows the SLA violation rate using a performance model. For simplicity, we fixed the cell radius at 100 m, the average number of users per sector at 10, the number of beams covering the sector area at 8, and the service as S2. The first 1000 drops were used to generate the initial performance model. For each new drop, dis and varT were used to search the performance model for a solution, and we verified whether the solution fulfilled the QoS requirements. If not, a new solution had to be found using Algorithm 4 in Section 3, and the performance model was updated. The figure shows that as more drops were run, the rate of successful prediction increased.
Figure 15 uses the same setup as Figure 14 and shows the network optimization results for 1000 random drops. On average, a bandwidth saving of 7.5% was achieved, which shows that the bandwidth used for the initial design is close to the optimized bandwidth.

Complexity Analysis
For the IC approach, additional channel quality indicator (CQI) feedback is required for cluster-edge UEs. For FWA applications with slowly varying channels, the air-link overhead is small. The 5G network uses a dedicated fiber network to connect the baseband unit (BBU) and the remote radio unit (RRU), and to connect the BTSs that exchange intra-cluster scheduling outputs and forbidden relationship tables, so neither delay nor data volume is of concern.
For computational complexity, multiple SINR values are calculated. On the BBU, intra-cluster scheduling uses a greedy algorithm with a complexity of O(|C| · U_c^2), where |C| is the number of clusters and U_c is the number of UEs in each cluster, which equals C_max × U_s. C_max is the maximum cluster size, and U_s is the number of UEs per sector. For inter-cluster scheduling, only scheduled cluster-edge UEs trigger inter-cluster scheduling, with a complexity of O(N · log(N) · U_e), where U_e is the number of assigned cluster-edge UEs, N is the number of cells, and O(N · log(N)) is the sorting complexity.
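As a rough numeric illustration of the intra-cluster term, take the simulation settings reported earlier (a maximum of 3 sectors per cluster and 10 UEs per sector) and assume, hypothetically, that the 57 simulated sectors form 19 full clusters:

```latex
U_c = C_{\max} \times U_s = 3 \times 10 = 30, \qquad
|C| \cdot U_c^{2} = 19 \times 30^{2} = 17\,100
```

The count is on the order of 10^4 elementary SINR evaluations per scheduling pass under these assumptions, and it grows quadratically with the number of UEs per cluster, which is why the cluster size is capped at C_max.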
For the network design and optimization approach, few messages are transferred between the network and the management unit, and they are not time-critical, so delay and data volume are not of concern. For computational complexity, a search algorithm is used for the network design; it runs offline and has low complexity.

Conclusions
In this paper, we introduced a two-level IC approach to improve performance models that map high-level customer-friendly business requirements to low-level network parameters and achieve QoS assurance for delay-sensitive traffic. We also proposed an approach that uses data analytics and machine learning to automate the identification of non-explicit relationships and dependencies between SLA high-level requirements and low-level network attributes, in order to construct QoS performance models, which are used for network design and optimization.