Degrees-Of-Freedom in Multi-Cloud Based Sectored Cellular Networks

This paper investigates the achievable per-user degrees-of-freedom (DoF) in the uplink of multi-cloud based sectored hexagonal cellular networks (M-CRAN). The network consists of $N$ base stations (BSs) and $K \le N$ base band unit pools (BBUPs), which function as independent cloud centers. BSs and BBUPs communicate over finite-capacity fronthaul links of capacity $C_F = \mu_F \cdot \frac{1}{2}\log(1+P)$, with $P$ denoting the transmit power. In the system model, BBUPs have limited processing capacity $C_{BBU} = \mu_{BBU} \cdot \frac{1}{2}\log(1+P)$. We propose two achievability schemes based on dividing the network into non-interfering parallelogram and hexagonal clusters, respectively. The minimum number of users in a cluster is determined by the ratio of BBUPs to BSs, $r = K/N$. Both the parallelogram and the hexagonal scheme are based on practically implementable beamforming and adapt the way clusters are formed to the sectorization of the cells. The proposed coding schemes improve the sum-rate over naive approaches that ignore cell sectorization, both at finite signal-to-noise ratio (SNR) and in the high-SNR limit. We derive a lower bound on the per-user DoF that is a function of $\mu_{BBU}$, $\mu_F$, and $r$. We show that the cut-set bound is attained in several cases, that the achievability gap between the lower bound and the cut-set bound decreases with the inverse of the BBUP-BS ratio, $1/r$, for $\mu_F \le 2M$ irrespective of $\mu_{BBU}$, and that the per-user DoF achieved through hexagonal clustering cannot exceed the per-user DoF of parallelogram clustering for any value of $\mu_{BBU}$ and $r$ as long as $\mu_F \le 2M$. Since the achievability gap decreases with the inverse of the BBUP-BS ratio for small and moderate fronthaul capacities, the cut-set bound is almost achieved even for small cluster sizes in this range of fronthaul capacities. For higher fronthaul capacities, the achievability gap is not always tight but decreases with the processing capacity. However, the cut-set bound, e.g., at $5M/6$, can be achieved with a moderate cluster size.


Introduction
Interference is one of the fundamental obstacles to high-data-rate communication in current and future cellular networks because it restricts the overall spectral efficiency in bits/sec/Hz/base station. Sectorization, which has been used in 4G networks, is one solution that alleviates intra-cell interference by using multiple antennas at base stations (BSs), resulting in directional beams that each cover an intended sector. In the literature, sectorization is often combined with hexagonal cell models, and each cell is mostly divided into three sectors [1,2]. Here, we follow the works in [3][4][5][6], which completely ignore the interference between the sectors of the same cell. In real systems, this is not the case, since the side lobes of the radiation pattern cause signals to be observed from adjacent inter-cell cluster. This work proposes a distributed iterative solution that achieves the performance of the case in which all BSs are connected to a single BBUP. While the formerly mentioned works assume non-dynamic clustering for each BBUP, the authors of [30] propose and analyze a dynamic clustering approach based on instantaneous CSI, where they also consider the allocation of the computation resources of BBUPs as an optimization parameter.
In the present work, we consider the uplink of an M-CRAN with multiple-antenna mobile users and multiple-antenna BSs. We assume $N \gg 1$ BSs and $K \le N$ BBUPs with limited processing capacity and limited fronthaul capacity. The main interest of this paper is to understand the highest achievable per-user DoF and sum-rate under limited fronthaul and BBUP processing capacities for a given BBUP-BS ratio $K/N$. We propose two coding schemes, in each of which some mobile users are deactivated to decompose the network into isolated parallelogram and hexagonal clusters, respectively. For both clustering types, the minimum number of mobile users/sectors per cluster is determined by the BBUP-BS ratio, owing to the one-to-one association between BBUPs and clusters. Each BBUP collects quantized versions of the received signals of the associated cluster through fronthaul links and decodes them jointly. The considered decoding scheme is thus reminiscent of clustered decoding as performed in [10,31].
The contributions of this paper are:
• We propose a specific non-dynamic way of silencing mobile users in parallelogram clustering. One could attempt to silence entire cells; instead, we find an efficient way of dividing the network into non-interfering parallelogram clusters by silencing mobile users mostly in single sectors of the considered cells;
• We propose achievability schemes for both parallelogram and hexagonal clusterings and derive lower bounds on the per-user DoF for both schemes as a function of the fronthaul and BBUP processing capacities and the BBUP-BS ratio;
• We prove that the performance of parallelogram clustering cannot be worse than that of hexagonal clustering for small and moderate fronthaul capacities;
• We show by simulations that, for high fronthaul capacities, the coding scheme proposed for hexagonal clustering can outperform parallelogram clustering if the processing capacity is large enough for the given BBUP-BS ratio.
The upper bound is obtained through a cut-set argument. In several cases, the upper and lower bounds match. For small and moderate fronthaul capacities, the achievability gap is given as a function of the fronthaul capacity and the BBUP-BS ratio, and it is shown to decrease with the inverse of the BBUP-BS ratio irrespective of the BBUP processing capacity.
In the finite SNR case, we compare the proposed coding schemes with the following schemes:
• Naive versions of both schemes, where all mobile users in certain cells are deactivated;
• Interfering versions of both schemes, where the network is decomposed into non-overlapping but interfering clusters;
• An opportunistic scheme, where each message is decoded based on the received signals of the three neighboring sectors that have the strongest channel gains.
Finite SNR analysis shows that, in the strong interference regime, the proposed schemes outperform all other schemes over almost the entire SNR range in all scenarios except two: for the 3-sector decoding scheme, in the low-SNR range with scarce BBUP capacities; and, for non-interfering schemes, in the moderate-SNR range with high BBUP capacities.
An interesting outcome of the finite SNR analysis is that interfering clustering schemes show either close to or better performance than proposed schemes in the finite SNR range under both weak and strong interference regimes; therefore, the interfering clusterings can be employed at finite SNR values with minor performance losses, since they may be more convenient for practical systems.

Organization
The rest of the paper is organized as follows: This section ends with some remarks on notation. The following Section 2 describes the problem definition. Section 3 presents the main results of the paper. In Sections 4 and 5, we present the coding schemes for the parallelogram and hexagonal clusterings, respectively. Section 6 presents the achievability results for the naive schemes and Section 7 presents simulation results for DoF per-user. In Section 8, we present the results regarding the finite SNR analysis. We conclude the paper with Section 9 and some technical proofs are presented in the appendices.

Notation
We denote the set of all integers by $\mathbb{Z}$, the set of positive integers by $\mathbb{Z}^+$, and the set of real numbers by $\mathbb{R}$. For other sets, we use calligraphic letters, for example, $\mathcal{X}$. We represent random variables by uppercase letters, for example, $X$, and their realizations by lowercase letters, for example, $x$. We use boldface notation for vectors, that is, uppercase boldface letters such as $\mathbf{X}$ for random vectors and lowercase boldface letters such as $\mathbf{x}$ for deterministic vectors. Matrices are depicted with sans serif font, for example, $\mathsf{H}$. We also write $\mathbf{X}^{(n)}$ for the tuple of random vectors $(\mathbf{X}_1, \ldots, \mathbf{X}_n)$.

Network Model
Consider the uplink communication in a cellular network consisting of $N \gg 1$ hexagonal cells as depicted in Figure 1. Each cell contains a base station (BS) equipped with $3M$ directional receive antennas and is divided into three sectors, where each sector is covered by $M$ receive antennas. The use of directional antennas with negligible side-lobe radiation implies that communications in the three sectors of a cell do not interfere with each other. It is assumed that different mobile users in the same sector perform orthogonal multiple access, as is typical for current 4G networks [32]. Thus, the model is restricted to a single mobile user per sector. For simplicity and symmetry, it is supposed that each mobile user is equipped with $M$ transmit antennas. It is assumed that the signal from a mobile user attenuates rapidly enough that it cannot cause interference to the sector receive antennas (Rxs) of non-adjacent sectors. These assumptions lead to the interference graph in Figure 1, where each small circle depicts a mobile user and Rx pair. Solid black lines between any two circles represent symmetric interference between the mobile users and Rxs of adjacent sectors. Let $\mathcal{N} = \{1, \ldots, N\}$ be the index set of all cells and their associated BSs in the network, and let $\mathcal{T} = \{1, \ldots, 3N\}$ be the index set of all sectors and their corresponding users and Rxs. Then, the signal observed at Rx $u \in \mathcal{T}$ is given by the following discrete-time input-output relation:

$$\mathbf{y}_{u,n} = \sum_{v \in \mathcal{T}_u} \mathsf{H}_{u,v}\, \mathbf{x}_{v,n} + \mathbf{z}_{u,n},$$

where
• $n$ denotes the index of the channel use;
• $\mathcal{T}_u$ denotes the index set of mobile users whose transmitted signals are observed by Rx $u$ (including mobile user $u$);
• $\mathbf{x}_{v,n}$ denotes the $M$-dimensional time-$n$ signal sent by mobile user $v$;
• $\mathbf{z}_{u,n}$ denotes the $M$-dimensional i.i.d. standard Gaussian noise vector corrupting the time-$n$ signal at Rx $u$; it is independent of all other noise vectors;
• and $\mathsf{H}_{u,v}$ denotes an $M$-by-$M$ random matrix that models the channel from mobile user $v$ to Rx $u$, with entries drawn independently according to a standard Gaussian distribution.
Channel matrices are randomly drawn but assumed to be constant over the n channel uses employed for the transmission of a message. In other words, the block length of a transmission is assumed shorter than the coherence time of the channel. Realizations of the channel matrices are assumed to be known by corresponding BSs, but not by the mobile users.
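As a minimal sketch of the discrete-time input-output relation of this section, the following pure-Python snippet generates one channel use on a toy interference graph. The neighbor sets in `neighbors`, the function name `received_signal`, the choice M = 2, and the Gaussian inputs are illustrative assumptions; they are not the graph of Figure 1.

```python
import random


def received_signal(u, neighbors, x, H, M, rng):
    """Return the sum over v in T_u of H[(u, v)] times x[v], plus unit-variance noise."""
    # Start from the standard Gaussian noise vector z_u.
    y = [rng.gauss(0.0, 1.0) for _ in range(M)]
    for v in neighbors[u]:
        Huv = H[(u, v)]
        for row in range(M):
            y[row] += sum(Huv[row][col] * x[v][col] for col in range(M))
    return y


rng = random.Random(0)
M = 2  # antennas per sector (small value for illustration)
# Hypothetical neighbor sets T_u for three sectors on a line; each T_u contains u.
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
# Channel matrices with i.i.d. standard Gaussian entries, fixed over the block.
H = {(u, v): [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(M)]
     for u in neighbors for v in neighbors[u]}
# One M-dimensional transmit vector per mobile user.
x = {v: [rng.gauss(0.0, 1.0) for _ in range(M)] for v in neighbors}
y = {u: received_signal(u, neighbors, x, H, M, rng) for u in neighbors}
```

Each entry of `y` is the M-dimensional observation of one Rx; the same `H` would be reused for all channel uses of a block, in line with the block-fading assumption.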

Uplink Communication Model with M-CRAN Architecture
Consider the network model defined in Section 2.1. Assume that the mobile user in sector $u \in \mathcal{T}$ wishes to send its message $W_u$, which is selected at random from the set $\{1, \ldots, 2^{nR_u}\}$, to the BS in which its sector is located. To this end, mobile user $u$ encodes its message with an encoding function $f_u^{(n)}$ as $\mathbf{X}_u^{(n)} = f_u^{(n)}(W_u)$, where $\mathbf{X}_u^{(n)} = (\mathbf{X}_{u,1}, \ldots, \mathbf{X}_{u,n})$ and $\mathbf{X}_{u,n'} \in \mathbb{R}^M$ is a column vector for $n' = 1, \ldots, n$, satisfying the power constraint $\frac{1}{n} \sum_{n'=1}^{n} \|\mathbf{X}_{u,n'}\|^2 \le P$ with probability 1.
We assume that the decoding of the received signals during uplink communication is performed by $K \le N$ BBUPs, and that any BS $j \in \mathcal{N}$ can access any BBUP $k \in \{1, \ldots, K\}$ through a one-hop fronthaul link, which is modeled as noise-free but capacity-limited. Definition 1 (Observation Function). Let $\mathcal{U}_k$ be the index set of BSs communicating with BBUP $k$. Each BS $j \in \mathcal{U}_k$ sends to BBUP $k$ an observation function $\phi$ of its received signals, with $u_{j,1}$, $u_{j,2}$, and $u_{j,3}$ denoting the three sectors of BS $j$.
To account for the capacity limits of the fronthaul links, we require the rates of the observation functions to respect the fronthaul capacity $C_F = \mu_F \cdot \frac{1}{2}\log(1+P)$, where $\mu_F$ is the fronthaul capacity prelog, which is a positive constant. Let $\mathcal{D}_k$ be the index set of sectors whose messages are to be decoded at BBUP $k$. After receiving the observation functions, for each $u \in \mathcal{D}_k$, BBUP $k$ applies a deterministic and invertible function $g_{k,u}$ to the relevant observation functions to obtain an estimate $\hat{W}_u$ of the message $W_u$. Decoding is successful if $\hat{W}_u = W_u$ for all $u \in \mathcal{T}$. Increasing the computational power of a processor leads to an increase in complexity. Hence, to take the computational limitation into consideration, we impose a complexity constraint on the BBUPs in terms of bit processing capacity per channel use. We assume that any BBUP $k$ can implement the decoding process if and only if the sum rate of all observation functions sent to BBUP $k$ does not exceed the processing capacity $C_{BBU} = \mu_{BBU} \cdot \frac{1}{2}\log(1+P)$, where $\mu_{BBU}$ is the processing capacity prelog, which is a positive constant.
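The two capacity definitions can be illustrated with a short sketch. It assumes a base-2 logarithm (the text does not fix the base), and the helper names `capacity` and `bbup_can_decode` as well as the numerical values are hypothetical.

```python
import math


def capacity(prelog, P):
    """Capacity prelog * (1/2) * log2(1 + P), in bits per channel use (base 2 assumed)."""
    return prelog * 0.5 * math.log2(1.0 + P)


def bbup_can_decode(observation_rates, mu_bbu, P):
    """True if the sum rate of the observation functions fits the processing capacity."""
    return sum(observation_rates) <= capacity(mu_bbu, P)


P = 3.0                      # transmit power, chosen so that log2(1 + P) = 2
c_f = capacity(2.0, P)       # fronthaul capacity for mu_F = 2
c_bbu = capacity(5.0, P)     # processing capacity for mu_BBU = 5
fits = bbup_can_decode([c_f, c_f], 5.0, P)             # two full fronthaul links
overloaded = bbup_can_decode([c_f, c_f, c_f], 5.0, P)  # three full fronthaul links
```

With these illustrative values, two fully loaded fronthaul links fit into the processing budget of the BBUP, whereas three do not.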

Capacity and Degrees of Freedom
A rate-tuple $\{R_u\}_{u \in \mathcal{T}}$ is said to be achievable if, for every $\epsilon > 0$ and sufficiently large $n$, there exist encoding, observation, and decoding functions satisfying (3), (6) and (9) such that the probability of a decoding error is at most $\epsilon$. The capacity region $\mathcal{C}(P, \mu_F, \mu_{BBU}, K)$ is the closure of the set of all achievable rate-tuples $\{R_u\}_{u \in \mathcal{T}}$, and the maximum sum-rate is defined as the supremum of $\sum_{u \in \mathcal{T}} R_u$ over all achievable rates $\{R_u\}_{u \in \mathcal{T}} \in \mathcal{C}(P, \mu_F, \mu_{BBU}, K)$.
Definition 2 (Per-User DoF). For any BBUP-BS ratio $r \in (0, 1]$, fronthaul capacity prelog $\mu_F > 0$ and processing capacity prelog $\mu_{BBU} > 0$, the per-user DoF is the prelog of the maximum sum-rate per user in the limit of high power $P$ and large network size $N$. Note that the allowed interval of $r$ guarantees that the system model restriction $K \le N$ is satisfied. In the following, we use the abbreviation DoF to designate the per-user DoF.

Main Results
We derive two lower bounds and an upper bound on the DoF. As we will show, they match in some cases. The first and second lower bounds are achieved by the schemes described in Sections 4 and 5, respectively. Both schemes are based on deactivating a set of mobile users. In the first scheme, the mobile users are deactivated so that the remaining active users form parallelogram-like clusters. In the second, the remaining active users form hexagon-like clusters. We call the two DoF lower bounds the parallelogram bound and the hexagon bound, respectively. Theorem 1 (Lower Bound). For any $\mu_{BBU} > 0$, $\mu_F > 0$, and $0 < r \le 1$, the DoF is lower bounded by the larger of the parallelogram bound and the hexagon bound, where the maximization in the parallelogram bound is over all positive integers $t_1, t_2$ satisfying $t_1 t_2 \ge \frac{1}{r}$, and the maximization in the hexagon bound is over all positive integers $t$ satisfying $t \ge \frac{1}{\sqrt{3r}}$.
Proof. The proof is given in Sections 4 and 5.

Remark 1.
For $\mu_{BBU} > 0$, $0 < r \le 1$ and $\mu_F \le 2M$, the hexagon bound does not exceed the parallelogram bound. Proof. The proof is given in Appendix A.
Theorem 2 (Cut-Set Bound). For any µ BBU > 0, µ F > 0 and 0 < r ≤ 1, the achievable DoF is upper bounded by Proof. The proof is given in Appendix B.
Corollary 1 (Optimality in some special cases). • Proof. The proofs are given in Appendix C.
Proof. The proof is given in Appendix D.

Uplink Scheme with Parallelogram Clustering
In the proposed uplink scheme, we deactivate a subset of mobile users so as to partition the network into non-interfering clusters of active users. These clusters have parallelogram shapes and are parametrized by positive integer pair (t 1 , t 2 ).

Construction of Parallelogram Clusters
For a given pair $(t_1, t_2)$, we define a regular parallelogram grid such that the sides of each parallelogram in the diagonal direction (−30 degrees with respect to the horizontal axis) have a length of $t_1$ cell-hops and the sides in the vertical direction have a length of $t_2$ cell-hops. We then fit this grid onto the network so that its intersections coincide with BSs, which are assumed to be located at the centers of the cells. Subsequently, we deactivate all mobile users coinciding with the sides of the grid. This process divides the network into parallelogram-like non-interfering clusters of active users and their sectors, which we refer to as p-clusters. In Figure 2, we present an example of parallelogram clustering for $(t_1, t_2) = (2, 2)$, where users coinciding with green lines are deactivated. Throughout this section, we refer to active users simply as users. Users of a p-cluster are located in: Single BS with one user.
Therefore, the number of users $n_p$ in a p-cluster is $n_p = 3 t_1 t_2 - (t_1 + t_2)$. Let $\mathcal{K} = \{1, \ldots, K_p\}$, with $K_p \le K$, be the index set of p-clusters. We associate each p-cluster with a single BBUP and denote the associated BBUP by the same index $k \in \mathcal{K}$ as the p-cluster. Let $\mathcal{I}_k$ be the index set of BSs whose users are elements of the $k$th p-cluster. Each BS $j \in \mathcal{I}_k$ sends an observation function to the $k$th BBUP, i.e., $\mathcal{U}_k = \mathcal{I}_k$.
To find the BBUP-BS ratio, we need to partition all BSs equally among the BBUPs. Note that any BS $j \in \mathcal{N}$ with one user or three users is an element of a single index set $\mathcal{I}_k$, $k \in \mathcal{K}$, and any BS $j \in \mathcal{N}$ with two users is an element of two different index sets, i.e., $\mathcal{I}_k$ and $\mathcal{I}_{k_1}$, $k, k_1 \in \mathcal{K}$. Therefore, of the BSs in $\mathcal{I}_k$, we attribute to BBUP $k$ all BSs with one user or three users and half of the BSs with two users. This leads to the BBUP-BS ratio $r_p = \frac{1}{t_1 t_2}$. We can choose any pair $(t_1, t_2)$ of positive integers to construct p-clusters as long as it satisfies $r_p \le r$, i.e.,

$$t_1 t_2 \ge \frac{1}{r}. \tag{25}$$
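The admissible cluster sizes can be enumerated directly. The sketch below takes $1/r_p = t_1 t_2$, as stated in the numerical results section, and lists the pairs with $r_p \le r$; the function name `feasible_p_clusters` and the search limit `t_max` are hypothetical, and exact rational arithmetic is used to avoid rounding issues at equality.

```python
from fractions import Fraction


def feasible_p_clusters(r, t_max):
    """List pairs (t1, t2) with t1 <= t2 <= t_max and r_p = 1/(t1*t2) <= r."""
    pairs = []
    for t1 in range(1, t_max + 1):
        for t2 in range(t1, t_max + 1):
            # Only t1 <= t2 is listed; swapping the sides gives the same r_p.
            if Fraction(1, t1 * t2) <= r:
                pairs.append((t1, t2))
    return pairs


r = Fraction(1, 40)  # BBUP-BS ratio K/N = 0.025
pairs = feasible_p_clusters(r, t_max=10)
smallest = min(pairs, key=lambda p: p[0] * p[1])
```

For r = 0.025, the smallest admissible clusters contain 40 BSs, for example (4, 10) or (5, 8).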

Coding Scheme
Each mobile user $u$ encodes its message $W_u$, which is uniformly distributed over the set $\mathcal{W}_u = \{1, \ldots, 2^{nR_u}\}$, with a multi-antenna Gaussian codebook of power $P$. Since the Rxs of silenced users observe only interference, each BS $j$ generates its observation function for its (active) Rxs through independent quantization codebooks. To generate the quantization codebooks, each BS $j$ applies a point-to-point Gaussian vector quantizer to the receive signal of each Rx such that the noise-level quantization rates imposed in the following are satisfied. Let $\mathcal{J}_k$ denote the sector index set of p-cluster $k$. We choose $\mathcal{D}_k = \mathcal{J}_k$, where $\mathcal{D}_k$ is the index set of sectors whose messages are to be decoded at BBUP $k$. Each BS $j \in \mathcal{I}_k$ with three users transmits a message consisting of the three quantization messages of its Rxs to BBUP $k$, and each BS $j \in \mathcal{I}_k$ with two users transmits only the quantization message of Rx $u$ to BBUP $k$ if $u \in \mathcal{J}_k$. The BS $j \in \mathcal{I}_k$ with a single user transmits the only quantization message of its cell to BBUP $k$. Depending on the prelogs $\mu_{BBU}$ and $\mu_F$, there are three different quantization rates: all BSs with three users quantize each receive signal at the rate $R_{q1} = \mu_{q1} \cdot \frac{1}{2}\log(1+P)$, all BSs with two users quantize each receive signal at the rate $R_{q2} = \mu_{q2} \cdot \frac{1}{2}\log(1+P)$, and all BSs with one active user quantize their receive signals at the rate $R_{q3} = \mu_{q3} \cdot \frac{1}{2}\log(1+P)$. After receiving the quantization messages, each BBUP $k$ reconstructs all observations with a quantization noise term, i.e., $\{\hat{\mathbf{Y}}_u^{(n)}\}_{u \in \mathcal{D}_k}$. The input-output relationship experienced by each BBUP $k$ is a multi-user MIMO-MAC channel ([33], Chapter 9) and [34], where the effective noise is the sum of the channel and quantization noises.
Since the channel matrix from mobile users of D k to Rxs of D k is known by BBUP k and is square and full rank with probability 1, each BBUP k can perform joint decoding with vanishingly small average error probability, which leads to achieving the same DoFs as if each user message is decoded in a point-to-point communication. That is, the prelogs µ q1 , µ q2 and µ q3 are achieved for respective mobile users.
To find the DoF in the asymptotic case $N \to \infty$ (this limit is only needed to eliminate edge effects), we need to partition the deactivated users of the network equally among the p-clusters. Note that the deactivated users around a p-cluster are located on the green lines of four different sides and that each side is on the border of two p-clusters. Therefore, an equal partition is obtained when half of the deactivated users around a p-cluster, i.e., $t_1 + t_2$ users, are associated with the p-cluster itself. Then, the DoF of the scheme can be obtained as

$$\mathrm{DoF}_p = \frac{\text{sum-DoF of a p-cluster}}{n_p + (t_1 + t_2)}, \tag{26}$$

where the numerator is the sum-DoF in a given p-cluster and the denominator is the total number of active and deactivated users for a given p-cluster. In the following, we give a policy for choosing the quantization rates for any $(t_1, t_2)$ satisfying (25).

Case 1: $\mu_{BBU} \ge n_p M$

The DoF of an $M \times M$ MIMO system with independently fading channels, which is our case, is $M$, as given in [35]: the quantization rate $\frac{M}{2}\log(1+P)$ is enough to describe the message set $\mathcal{W}_u$ of any user $u$ asymptotically. Thus, here, we are not restricted by the processing capacity prelog $\mu_{BBU}$, i.e., the only restricting factor is the fronthaul capacity prelog $\mu_F$. The main policy is to distribute the transmission resources equally among the (active) users of any given BS unless the per-sector transmission capacity exceeds the rate providing the maximum DoF $M$, i.e., $\frac{M}{2}\log(1+P)$. To this end, we determine the quantization rates according to $\mu_F$:

• If $\mu_F \le M$, the transmission resource of a fronthaul link is allocated equally among the Rxs of a BS, i.e., $\mu_{q1} = \frac{\mu_F}{3}$, $\mu_{q2} = \frac{\mu_F}{2}$ and $\mu_{q3} = \mu_F$, and the achievable DoF is obtained by substituting these prelogs into (26).
• If $M \le \mu_F \le 2M$, the transmission resource of a fronthaul link is equally allocated among the Rxs of a BS with two or three users, i.e., $\mu_{q1} = \frac{\mu_F}{3}$ and $\mu_{q2} = \frac{\mu_F}{2}$; however, any BS with one user quantizes its received signal at the maximum rate, i.e., $\mu_{q3} = M$, since each fronthaul link has enough capacity to support that communication rate. The achievable DoF again follows from (26).
• If $2M \le \mu_F \le 3M$, the transmission resource of a fronthaul link is equally allocated among the Rxs of a BS with three users, i.e., $\mu_{q1} = \frac{\mu_F}{3}$; however, any BS with one or two users quantizes its receive signals at the maximum rate for each Rx, i.e., $\mu_{q2} = \mu_{q3} = M$, since each fronthaul link has enough capacity to support that communication rate ($M \le \frac{\mu_F}{2}$). The achievable DoF again follows from (26).
• If $3M \le \mu_F$, all BSs quantize their receive signals at the maximum rate at each sector ($M \le \frac{\mu_F}{3}$), i.e., $\mu_{q1} = \mu_{q2} = \mu_{q3} = M$, and the achievable DoF again follows from (26).

Case 2: $\mu_{BBU} \le n_p M$

Under this condition, the achievable sum-DoF of a p-cluster, which is given in the numerator of (26), can be restricted by the processing capacity prelog $\mu_{BBU}$. If $\mu_{BBU}$ is not smaller than the achievable sum-DoF of a p-cluster for the given interval of $\mu_F$, the process implemented in Section 4.2.1 is applied and, hence, the DoF expressions are given as in (27), (28), (29) and (30), respectively. However, if the processing capacity prelog $\mu_{BBU}$ is smaller than the sum-DoF for the given $\mu_F$, we distribute the processing resource of a BBUP equally among the sectors of a cluster, i.e., the quantization prelog at each sector is chosen as $\frac{\mu_{BBU}}{n_p}$, which leads to a sum-DoF of $\mu_{BBU}$ per p-cluster. To provide fairness among the achievable DoFs of users, instances of the proposed scheme are time-shared so that each mobile user takes all relative positions in a p-cluster, which requires different instances.
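The rate policy of this section can be summarized in a few lines of code. The sketch below is not taken from the paper: the numbers of BSs with three, two and one users are inferred here from the grid construction (they are consistent with $n_p + (t_1 + t_2) = 3 t_1 t_2$), and the function name `p_cluster_dof` is hypothetical.

```python
def p_cluster_dof(t1, t2, mu_f, mu_bbu, M):
    """Per-user DoF of a (t1, t2) p-cluster under the assumptions stated above."""
    # Assumed BS counts (inferred from the grid, not quoted from the text):
    n_three = (t1 - 1) * (t2 - 1)  # interior BSs with three active users each
    n_two = 2 * (t1 + t2 - 2)      # border BSs; one of their two users is in this cluster
    n_one = 1                      # single BS with one active user
    # The fronthaul prelog is shared equally among the active Rxs of a BS, capped at M.
    mu_q1 = min(mu_f / 3, M)
    mu_q2 = min(mu_f / 2, M)
    mu_q3 = min(mu_f, M)
    sum_dof = 3 * n_three * mu_q1 + n_two * mu_q2 + n_one * mu_q3
    # If the BBUP cannot process this much, its resource is shared equally instead,
    # so the sum-DoF of the cluster is capped at mu_bbu.
    sum_dof = min(sum_dof, mu_bbu)
    # Normalize by active plus attributed deactivated users: n_p + (t1 + t2) = 3*t1*t2.
    return sum_dof / (3 * t1 * t2)


dof_low = p_cluster_dof(5, 8, mu_f=3, mu_bbu=428, M=4)    # mu_F <= M
dof_high = p_cluster_dof(5, 8, mu_f=12, mu_bbu=428, M=4)  # mu_F >= 3M
```

Under these assumptions, the case $\mu_F \le M$ gives a DoF of $\mu_F/3$ regardless of the cluster size, which agrees with the observation in the numerical results that the clustering size has no effect in that regime.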

Uplink Scheme with Hexagon Clustering
As in the previous section, we deactivate a subset of mobile users so as to partition the network into non-interfering clusters of active users and their sectors. The clusters are hexagonal, and their size is set by a positive integer $t$.

Construction of Hexagon Clusters
For a given design parameter $t$, we choose some BSs as center BSs so as to construct a regular grid of equilateral triangles in which every three closest center BSs are $2t$ cell-hops apart from each other. Therefore, the maximum distance to the closest center BS is $t$ cell-hops, and we call the BSs whose distance to the closest center BS is $\ell$ cell-hops layer-$\ell$ BSs, for $\ell = 1, \ldots, t$. We designate all BSs located $t$ cell-hops above and below any center BS as corner and null BSs, respectively. Then, we draw solid green lines between any closest null and corner BSs ($t$ cell-hops apart from each other), which creates a hexagonal grid over the entire network. Subsequently, we deactivate the mobile users coinciding with the solid green lines. This process divides the network into hexagon-like non-interfering clusters, which we refer to as h-clusters (the hexagonal clustering was first presented in [36]). Figure 3 shows an example of the partition for $t = 3$. In the following, we refer to active users simply as users. In an h-cluster,
Therefore, the number of users in an h-cluster is $n_h = 9t^2 - 3t$. Let $\mathcal{K} = \{1, \ldots, K_h\}$, with $K_h \le K$, be the index set of h-clusters. We associate each h-cluster with a single BBUP and denote the associated BBUP by the same index $k \in \mathcal{K}$ as the h-cluster. Let $\mathcal{I}_k$ be the index set of BSs whose users are elements of the $k$th h-cluster. Each BS $j \in \mathcal{I}_k$ sends an observation function to the $k$th BBUP, i.e., $\mathcal{U}_k = \mathcal{I}_k$. To find the BBUP-BS ratio $r_h$, we need to partition all BSs equally among the BBUPs. Note that each layer-$t$ BS, except those in the corners, belongs to two different index sets, i.e., $\mathcal{I}_k$ and $\mathcal{I}_{k_1}$, $k, k_1 \in \mathcal{K}$. Each corner BS is an element of three different index sets, i.e., $\mathcal{I}_k$, $\mathcal{I}_{k_1}$ and $\mathcal{I}_{k_2}$, $k, k_1, k_2 \in \mathcal{K}$. In addition, each null BS around an h-cluster $k$ is on the border of three different h-clusters. Therefore, of the BSs in $\mathcal{I}_k$ and the null BSs around h-cluster $k$, we attribute to BBUP $k$ all layer-$\ell$ BSs, $\ell = 1, \ldots, t-1$, including the center BS, half of the layer-$t$ BSs other than corner and null BSs, and one third of the corner and null BSs, which leads to $r_h = \frac{1}{3t^2}$. Since the ratio $r$ is given, we can choose any $t \in \mathbb{Z}^+$ such that $r_h \le r$, i.e., $t \ge \frac{1}{\sqrt{3r}}$.
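The sizing rule for h-clusters can be checked numerically. The sketch below uses $1/r_h = 3t^2$ from the numerical results section and the receive-signal counts $9t^2 - 9t + 6$ and $6t - 6$ given in the coding scheme of this section; the function names are hypothetical.

```python
from fractions import Fraction


def smallest_h_cluster(r):
    """Smallest positive integer t with r_h = 1/(3*t*t) <= r."""
    t = 1
    while Fraction(1, 3 * t * t) > r:
        t += 1
    return t


def h_cluster_users(t):
    """Active users of an h-cluster: signals quantized at R_q1 plus those at R_q2."""
    return (9 * t * t - 9 * t + 6) + (6 * t - 6)


r = Fraction(1, 40)            # BBUP-BS ratio K/N = 0.025
t = smallest_h_cluster(r)
r_h = Fraction(1, 3 * t * t)
n_h = h_cluster_users(t)
total_users = n_h + 3 * t      # active users plus attributed deactivated users
```

For r = 0.025, this gives t = 4, so each BBUP serves 48 BSs, and the active plus attributed deactivated users add up to $9t^2$.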

Coding Scheme
Each mobile user $u$ encodes its message $W_u$, which is uniformly distributed over the set $\mathcal{W}_u = \{1, \ldots, 2^{nR_u}\}$, with a multi-antenna Gaussian codebook of power $P$. As in Section 4, after observation at the sector antennas, each BS $j$ generates its observation function for its (active) Rxs through independent quantization codebooks. To generate the quantization codebooks, each BS $j$ applies a point-to-point Gaussian vector quantizer to the received signal of each Rx such that the following noise-level quantization rate constraints are met. Let $\mathcal{J}_k$ denote the sector index set of h-cluster $k$. We choose $\mathcal{D}_k = \mathcal{J}_k$. Each BS $j \in \mathcal{I}_k$ of layer-$\ell$, $\ell = 1, \ldots, t-1$, transmits a message consisting of the three independent quantization messages of its Rxs to BBUP $k$, and each BS $j \in \mathcal{I}_k$ of layer-$t$ transmits only the quantization message of sector $u$ to BBUP $k$ if $u \in \mathcal{J}_k$. Depending on the prelogs $\mu_{BBU}$ and $\mu_F$, there are two different quantization rates: each BS with three users quantizes each receive signal at the rate $R_{q1} = \mu_{q1} \cdot \frac{1}{2}\log(1+P)$, and each BS with two users quantizes each receive signal at the rate $R_{q2} = \mu_{q2} \cdot \frac{1}{2}\log(1+P)$. That is, in h-cluster $k$, the receive signals of all layer-$\ell$ BSs, $\ell = 1, \ldots, t-1$, and of every corner BS are quantized at rate $R_{q1}$ ($9t^2 - 9t + 6$ receive signals), and the receive signals of layer-$t$ BSs other than corner BSs are quantized at rate $R_{q2}$ ($6t - 6$ receive signals). After obtaining the quantization messages, BBUP $k$ reconstructs all $\{\hat{\mathbf{Y}}_u^{(n)}\}_{u \in \mathcal{D}_k}$ with quantization error. The input-output relationship experienced at BBUP $k$ is a multi-user Gaussian MIMO-MAC. Then, each BBUP $k$ performs joint decoding with vanishingly small probability of error, since the channel matrix from the users of $\mathcal{D}_k$ to the Rxs of $\mathcal{D}_k$ is known by BBUP $k$ and is square and full rank with probability 1. This leads to achieving the DoFs $\mu_{q1}$ and $\mu_{q2}$ for the respective mobile users.
To find the DoF in the asymptotic case, i.e., $N \to \infty$, we need to partition the deactivated users of the network equally among the h-clusters. The number of deactivated users around h-cluster $k$ is $6t$. Since each deactivated user is on the border of two h-clusters, we attribute half of them, i.e., $3t$, to h-cluster $k$, which gives the DoF expression

$$\mathrm{DoF}_h = \frac{(9t^2 - 9t + 6)\,\mu_{q1} + (6t - 6)\,\mu_{q2}}{n_h + 3t}. \tag{36}$$

In the following, we give the policy for choosing the quantization rates.

Case 1: µ BBU ≥ n h M
As in Section 4.2.1, $\mu_F$ is the only limiting factor, since the quantization rate $M \cdot \frac{1}{2}\log(1+P)$ is enough to describe the message $W_u$ of any user $u$ in the asymptotic case. The policy is again to distribute the transmission resources equally among the (active) Rxs of any given BS. To this end, we choose the quantization rates according to $\mu_F$:

• If $\mu_F \le 2M$, the transmission resource of a fronthaul link is equally allocated between the Rxs of a BS, i.e., $\mu_{q1} = \frac{\mu_F}{3}$ and $\mu_{q2} = \frac{\mu_F}{2}$, and the achievable DoF is obtained by substituting these prelogs into (36).
• If $2M \le \mu_F \le 3M$, the transmission resource of a fronthaul link is equally allocated among the Rxs of a BS with three users, i.e., $\mu_{q1} = \frac{\mu_F}{3}$; however, any BS with two users quantizes its receive signals at the maximum rate at each Rx, i.e., $\mu_{q2} = M$, since each fronthaul link has enough capacity to support that communication rate ($M \le \frac{\mu_F}{2}$). The achievable DoF again follows from (36).
• If $\mu_F \ge 3M$, all BSs quantize their receive signals at the maximum quantization rate ($M \le \frac{\mu_F}{3}$), i.e., $\mu_{q1} = \mu_{q2} = M$, and the achievable DoF again follows from (36).

Case 2: µ BBU ≤ n h M
Under this condition, depending on $\mu_F$, the achievable sum-DoF of an h-cluster, which is given in the numerator of (36), can be restricted by the processing capacity prelog $\mu_{BBU}$. If $\mu_{BBU}$ is not smaller than the achievable sum-DoF of an h-cluster for the given interval of $\mu_F$, the process implemented in Section 5.2.1 is applied and, hence, the DoF expressions are given as in (38), (40) and (42), respectively. However, if the processing capacity prelog $\mu_{BBU}$ is smaller than the sum-DoF for the given $\mu_F$, we distribute the processing resource of a BBUP equally among the sectors of a cluster, i.e., the quantization prelog at each sector is chosen as $\frac{\mu_{BBU}}{n_h}$, which leads to a sum-DoF of $\mu_{BBU}$ per h-cluster. To provide fairness among the achievable DoFs of users, instances of the proposed scheme are time-shared so that each mobile user takes all relative positions in an h-cluster, which requires different instances.
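Cases 1 and 2 can be combined in a compact sketch. It uses the receive-signal counts from the coding scheme above ($9t^2 - 9t + 6$ signals at prelog $\mu_{q1}$ and $6t - 6$ at prelog $\mu_{q2}$) and caps the sum-DoF at $\mu_{BBU}$; the function name `h_cluster_dof` and the parameter values are illustrative assumptions.

```python
def h_cluster_dof(t, mu_f, mu_bbu, M):
    """Per-user DoF of an h-cluster of size t under the rate policy described above."""
    n_q1 = 9 * t * t - 9 * t + 6   # receive signals of BSs with three users
    n_q2 = 6 * t - 6               # receive signals of BSs with two users
    # The fronthaul prelog is shared equally among the Rxs of a BS, capped at M.
    mu_q1 = min(mu_f / 3, M)
    mu_q2 = min(mu_f / 2, M)
    sum_dof = n_q1 * mu_q1 + n_q2 * mu_q2
    # Case 2: a BBUP with mu_bbu below this sum shares its resource equally,
    # so the sum-DoF of the cluster is capped at mu_bbu.
    sum_dof = min(sum_dof, mu_bbu)
    # Normalize by active plus attributed deactivated users: n_h + 3t = 9*t*t.
    return sum_dof / (9 * t * t)


dof_fronthaul_limited = h_cluster_dof(4, mu_f=9, mu_bbu=428, M=4)
dof_bbup_limited = h_cluster_dof(4, mu_f=12, mu_bbu=428, M=4)
```

In the first call the fronthaul is the bottleneck, whereas in the second call the sum-DoF is capped by the processing capacity.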

DoF without Sectorization
In the two proposed achievability schemes ("p-clustering" and "h-clustering"), we considered three non-interfering sectors in each cell. Now, if we consider cells without sectors, we can naively adapt our clustering by deactivating all users in the border cells of clusters. That is, for p-clustering, it requires deactivation of all users in the cells with one or two active mobile users and, for h-clustering, it requires deactivation of all users in the corner cells and the cells with two active users. This means that the network consists of only cells with three active users and cells with no active users for both schemes. This would again partition the network into non-interfering p-clusters and h-clusters without changing the r p and r h for any given (t 1 , t 2 ) pair or t, respectively.
Following a procedure similar to that of Sections 4 and 5, one obtains the following result by distributing the available transmission resources equally among the three Rxs of a given BS as long as the BBUP capacity suffices and, otherwise, by distributing the BBUP processing resources equally among the Rxs of a p-cluster/h-cluster. This leads to the following lemma: Lemma 1 (DoF for naive scheme). For any $\mu_{BBU} > 0$, $\mu_F > 0$ and $0 < r \le 1$, the achievable DoF in a multi-cloud based non-sectored cellular network is given by the larger of the naive parallelogram bound and the naive hexagon bound, where the maximization in the former is over all positive integers $t_1, t_2$ satisfying $t_1 t_2 \ge \frac{1}{r}$, and the maximization in the latter is over all positive integers $t$ satisfying $t \ge \frac{1}{\sqrt{3r}}$.
Notice that the same cut-set bound, Theorem 2, applies for the naive schemes since the observation functions, Definition 1, are defined not on the sector basis but on the BS basis.
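For the parallelogram case, the naive scheme can be sketched as follows. The count of remaining active users, $3(t_1 - 1)(t_2 - 1)$, is an assumption inferred from the statement that only cells with three active users remain; the function name `naive_p_cluster_dof` is hypothetical, and the expression is not quoted from Lemma 1.

```python
def naive_p_cluster_dof(t1, t2, mu_f, mu_bbu, M):
    """Per-user DoF of the naive p-clustering under the assumptions stated above."""
    # Assumed: only interior cells keep their three users; border cells are silenced.
    active = 3 * (t1 - 1) * (t2 - 1)
    # Fronthaul shared equally among the three Rxs of a BS, capped at M per Rx.
    sum_dof = active * min(mu_f / 3, M)
    # If the BBUP capacity is insufficient, it is shared equally among the Rxs.
    sum_dof = min(sum_dof, mu_bbu)
    # r_p is unchanged, so each cluster still accounts for 3*t1*t2 users.
    return sum_dof / (3 * t1 * t2)


naive = naive_p_cluster_dof(5, 8, mu_f=12, mu_bbu=428, M=4)
```

For the illustrative parameters above, the naive scheme yields a per-user DoF of 2.8.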

Numerical Results and Discussion
In this section, we present simulation results to evaluate the proposed coding schemes for p-clustering and h-clustering. In Figure 4a, we investigate the effect of clustering size on the achievable DoF for several fronthaul capacities $\mu_F \in \{3, 7, 11\}$ and $\mu_{BBU} = 428$. We define the size of a p-cluster as the inverse of $r_p$, i.e., $1/r_p = t_1 t_2$, and we also denote it by the side length pair $(t_1, t_2)$. We define the size of an h-cluster as the inverse of $r_h$, i.e., $1/r_h = 3t^2$, and we also denote it by the parameter $t$. We observe that, for p-clustering, when the fronthaul capacity is small, i.e., $\mu_F \le M$, the clustering size has no effect on the DoF since $\mu_F$ becomes the bottleneck. In general, for both p-clustering and h-clustering, the clustering size giving the highest DoF decreases with $\mu_F$. The figure verifies Remark 1 since, for all $r_p = r_h$, p-clustering outperforms h-clustering for $\mu_F \in \{3, 7\}$. It is also interesting to note that, for p-clustering with $\mu_F = 11$ (i.e., $2M < \mu_F \le 3M$), the achievable DoF neither increases monotonically up to its maximum nor decreases monotonically afterwards, since not only the clustering size but also the side lengths of the p-cluster matter for exploiting interference. For any $r_p$, choosing the $(t_1, t_2)$ pair that minimizes the sum $t_1 + t_2$ gives the maximum DoF since it provides the highest joint processing gain for a p-cluster of the given size $1/r_p$; that is, the closer $t_1$ and $t_2$ are to each other, the more mutual information the clusters have. Therefore, larger p-cluster sizes may not result in higher DoF owing to the side length effect. However, for $\mu_F \le 2M$, the side lengths of a p-cluster have no effect on the achievable DoF for a given cluster size.
Figure 4b shows the effect of clustering size on the DoF for various values of $\mu_{BBU} \in \{100, 300, 500\}$ and $\mu_F = 12$. For each $\mu_{BBU}$, the achievable DoF increases with the cluster size until $\mu_{BBU}$ becomes the bottleneck, i.e., until the term with $\mu_{BBU}$ becomes active in the achievability expression. Accordingly, the results indicate that more processing power permits larger cluster sizes and hence larger DoF.
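The cluster-size definitions used in this discussion can be illustrated with a small sketch. It assumes that the p-cluster size is an exact factorization $t_1 t_2 = 1/r_p$ and reads the h-cluster constraint as $3t^2 \ge 1/r$; the function names `best_p_pair` and `smallest_h_t` are hypothetical helpers introduced only for this illustration and are not part of the paper.

```python
from fractions import Fraction
from math import isqrt


def best_p_pair(size):
    """Return the factor pair (t1, t2) of `size`, with t1 <= t2,
    that has the smallest sum t1 + t2 (the most 'square' p-cluster)."""
    pairs = [(a, size // a) for a in range(1, isqrt(size) + 1) if size % a == 0]
    return min(pairs, key=lambda p: p[0] + p[1])


def smallest_h_t(r):
    """Smallest positive integer t with 3 * t**2 >= 1 / r.
    Assumption: this is how the h-cluster size constraint is read here."""
    t = 1
    while 3 * t * t * r < 1:
        t += 1
    return t


# Parameters quoted in the text: r = 0.025 (cluster size 40) and r = 1/10.
print(best_p_pair(40))                  # side lengths for a p-cluster of size 40
print(smallest_h_t(Fraction(1, 40)))    # h-cluster parameter t for r = 0.025
print(smallest_h_t(Fraction(1, 10)))    # h-cluster parameter t for r = 1/10
```

Under these assumptions the sketch returns the pair $(5, 8)$ for size 40 and $t = 4$ and $t = 2$ for the two ratios, which agrees with the operating points quoted in the text.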
In Figure 5, we plot the achievable DoF and the cut-set bound versus $\mu_F$ for $M = 4$, $r = 0.025$ and $\mu_{BBU} = 428$, which corresponds to the case where the BBUP processing capacity equals the processing capacity required when each receive signal in a p-cluster of size $(t_1, t_2) = (5, 8)$ is quantized at the maximum quantization rate $R_q = \frac{M}{2}\log(1 + P)$. From the figure, we deduce that the upper bound is almost reached for $\mu_F \le 2M$, which means that a DoF of $\frac{2M}{3}$ is almost achievable at $\mu_F = 2M$ given that the processing capacity is high enough. Figure 5 also depicts the operating points of the clustering sizes. For $\mu_F \le 8$, i.e., $\mu_F \le 2M$, any p-clustering with $1/r_p = 40$ gives the highest achievable DoF for the given system parameters. However, for $\mu_F > 8$, there are several different operating points. For example, for $8 < \mu_F \le 9.4$, h-clustering of size $t = 4$ is optimal, which means that, for $\mu_F > 2M$, dividing the network into h-clusters provides higher joint processing gain than p-clustering for the same $r_h = r_p$ if the BBUP processing capacity is sufficient. For the rest of the range, the clustering size decreases with $\mu_F$ because the given BBUP capacity is not enough to handle the quantized data of larger clusters. At the operating point $\mu_F = 12$, which allows the maximum quantization rate for each receive signal, the p-clustering of size $(t_1, t_2) = (5, 8)$ achieves capacity. This shows that the proposed scheme utilizes the system resources optimally at this operating point and that a DoF of almost $\frac{9M}{10}$ is achievable. We also plot the lower bound on the DoF achieved by the naive scheme versus $\mu_F$ for the same parameters. The proposed schemes perform considerably better than the naive schemes owing to the sectorization gain brought by nulling intra-cell interference.
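The value $\mu_{BBU} = 428$ can be checked with a short calculation. The sketch below assumes that a p-cluster of size $(t_1, t_2)$ contains $3 t_1 t_2$ sectors, of which $t_1 + t_2$ belong to deactivated users (inferred from the description of the deactivated users in the finite SNR section), and that each remaining receive signal contributes a prelog of $M$ at the maximum quantization rate. The helper names are hypothetical.

```python
def active_sectors_p_cluster(t1, t2):
    """Number of sector receive signals with an active user in a p-cluster.
    Assumption: 3 sectors per BS, t1 * t2 BSs, and t1 + t2 deactivated users."""
    return 3 * t1 * t2 - (t1 + t2)


def required_mu_bbu(M, t1, t2):
    """Processing prelog needed when every active receive signal is
    quantized at the maximum rate (prelog M per signal)."""
    return M * active_sectors_p_cluster(t1, t2)


M, t1, t2 = 4, 5, 8
mu_bbu = required_mu_bbu(M, t1, t2)
per_user_dof = mu_bbu / (3 * t1 * t2)   # processing-limited term mu_BBU / (3 t1 t2)

print(mu_bbu)                 # 428
print(per_user_dof, 0.9 * M)  # per-user DoF next to the 9M/10 reference value
```

With these assumptions the required prelog is $4 \cdot 107 = 428$, and the resulting per-user DoF $428/120$ lies just below $\frac{9M}{10} = 3.6$, in line with the "almost $\frac{9M}{10}$" statement.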
In Figure 6, we plot the achievable DoF and the cut-set bound as a function of the processing capacity prelog $\mu_{BBU}$, for $r = 0.025$ and $\mu_F = 12$, which means that the fronthaul capacity does not restrict the achievable DoF. The operating points of the clustering sizes with respect to $\mu_{BBU}$ are also presented. The plot indicates that the cut-set bound is achieved up to $\mu_{BBU} = 428$, i.e., the processing resources are used efficiently up to a DoF of $\frac{9M}{10}$. Over the rest of the $\mu_{BBU}$ range, the optimal clustering sizes ($1/r_p$ or $1/r_h$) increase with $\mu_{BBU}$, and for most of $\mu_{BBU} > 428$, h-clustering provides the highest DoF. This indicates the advantage of employing h-clustering when the processing capacity is high enough. For some range of $\mu_{BBU}$, both h-clustering of size $t = 4$ and p-clustering of size $(t_1, t_2) = (8, 8)$ provide the highest DoF, which shows that h-clustering with a smaller clustering size provides higher joint processing gain than p-clustering with larger clustering sizes owing to the clustering geometry. The figure also depicts the lower bound achieved by the naive approach versus $\mu_{BBU}$ for the same parameters, and the gain of sectorization is clearly visible for higher values of processing capacity.

Finite SNR Analysis
In this section, we compare the finite SNR performance of the proposed schemes with that of several other schemes, which are introduced below. For the finite SNR case, the quantization rates for both proposed clusterings are chosen as stated in Sections 4.2 and 5.2, but the conditions regarding the high SNR regime are not applied, i.e., the prelog of any quantization rate is not reduced to the number of antennas $M$. Then, each BBUP performs joint decoding for the users of the associated cluster after reconstructing all sector receive signals of the cluster. For simplicity, we present the comparisons for $M = 1$ throughout the section.
To evaluate the performance of the proposed schemes at finite SNR values, we compare them, in addition to the naive schemes, with three different schemes:
• Scheme 1 is a variation of the proposed p-clustering scheme. In p-clustering, each p-cluster is surrounded by deactivated users located on the sides of a $(t_1, t_2)$-hop parallelogram, where the sides have $t_1$ and $t_2$ deactivated users, respectively. For each p-cluster, we associate all deactivated users on the lower side and the right side of the $(t_1, t_2)$-hop parallelogram with the p-cluster under consideration. Subsequently, we activate all deactivated users and allow each BBUP to collect the quantization messages of the reactivated user sectors associated with its own p-cluster. This process partitions the network into non-overlapping but interfering parallelogram-like clusters, which we call $I_p$-clusters in the following; see Figure 7 for an example with $(t_1, t_2) = (4, 3)$. Note that $I_p$-clustering requires the same BBUP-BS ratio $r_p$ as p-clustering. With the reactivation of all deactivated mobile users, there are $3 t_1 t_2$ active users in each $I_p$-cluster and every cell contains three active users. Therefore, each BS partitions its fronthaul transmission resources equally among its Rxs if the BBU processing resources are sufficient to implement the joint decoding; otherwise, the processing resources are evenly distributed among all Rxs of the $I_p$-cluster, i.e., the quantization rate is chosen as over all positive integer $(t_1, t_2)$ pairs satisfying $t_1 t_2 \ge \frac{1}{r}$. To decode the user messages, each BBUP implements joint decoding by treating out-of-cluster interference as noise.
• Scheme 2 is a variation of the proposed h-clustering scheme. In h-clustering, there are $6t$ deactivated users around a cluster of size $t$. For a specific h-cluster, we associate the deactivated users on the borders with any three adjacent h-clusters, e.g., east, southeast, and southwest, with the h-cluster under consideration. Then, we repeat this process for each h-cluster with the same relative directions of the adjacent h-clusters. Subsequently, we reactivate all deactivated users and allow each BBUP $k$ to collect the quantized received signals of the sectors of the reactivated users associated with its own h-cluster. This process partitions the network into interfering but non-overlapping clusters, which we call $I_h$-clusters in the following; see Figure 8 for $t = 2$. Note that $I_h$-clustering requires the same BBUP-BS ratio as h-clustering. With the reactivation of the deactivated users, there are $9t^2$ active users in each $I_h$-cluster. Therefore, by applying similar arguments as stated above, the quantization rate for $I_h$-clustering is chosen as over positive integers $t$ satisfying $t \ge \sqrt{\frac{1}{3r}}$. To decode the user messages, each BBUP implements joint decoding by treating out-of-cluster interference as noise.
• Scheme 3 is a variation of the practical opportunistic schemes. The decoding depends on the realization of the channel coefficients. With the help of neighbors of the considered BS, the corresponding BBUP identifies for each user in the corresponding cell the three adjacent sectors that give the best joint decoding performance for the corresponding message. To be able to make a fair comparison between the proposed schemes and the 3-sector decoding scheme, we impose the same fronthaul rate constraint on the 3-sector decoding scheme as in the non-interfering clustering scheme (note that there is no silenced user in the 3-sector decoding case) by assuming all processing resources are used. That is, the quantization rates are chosen as Then, the BBUP collects the quantization messages and decodes the corresponding message based on them.
In our numerical comparison, we average the rate over 5000 independent realizations of the channel matrices, where for each realization all channel gains are drawn independently of each other according to a Gaussian distribution, by which we aim to model the random location of a mobile user. The direct channel gains of intra-sector links are drawn with variance 1 and the cross channel gains of inter-sector links are drawn with variance $\alpha^2 < 1$, where $\alpha$ is the channel attenuation coefficient, since a mobile user in an adjacent sector cannot be closer to a sector receiver than the user in the considered sector. Figure 9 compares the performance of the proposed schemes with that of the naive schemes, the $I_p$-clustering and $I_h$-clustering schemes, and the 3-sector decoding scheme versus SNR for $r = \frac{1}{10}$. The simulations are performed for cluster sizes $t_1 t_2 \in \{10, 11, 12\}$ and $t = 2$. However, in each subfigure of Figure 9, we present only the configurations that perform relatively better than the others to keep the presentation clear.
As seen from all the subfigures of Figure 9, the proposed schemes provide higher sum-rates than the naive schemes over the whole SNR range and under all scenarios, e.g., in the strong interference regime at low BBUP processing capacity as in Figure 9b, or in the low interference regime at high BBUP capacity as in Figure 9e.
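The channel model used for the averaging can be sketched as follows. This is only a minimal illustration under stated assumptions: the gains are taken to be real-valued and zero-mean (the text specifies only the variances), and the function names, the `metric` callback and the link counts are hypothetical placeholders rather than the simulation code used for Figure 9.

```python
import random


def draw_channel_gains(n_direct, n_cross, alpha, rng):
    """Draw one realization: direct gains with variance 1 and
    cross gains with variance alpha**2 (zero mean is an assumption)."""
    direct = [rng.gauss(0.0, 1.0) for _ in range(n_direct)]
    cross = [rng.gauss(0.0, alpha) for _ in range(n_cross)]
    return direct, cross


def average_over_realizations(metric, n_realizations, n_direct, n_cross, alpha, seed=0):
    """Average metric(direct, cross) over independent channel realizations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_realizations):
        direct, cross = draw_channel_gains(n_direct, n_cross, alpha, rng)
        total += metric(direct, cross)
    return total / n_realizations


def mean_cross_power(direct, cross):
    """Placeholder metric: empirical power of the cross links."""
    return sum(g * g for g in cross) / len(cross)


# 5000 realizations as in the text; the link counts (3 and 6) are illustrative only.
estimate = average_over_realizations(mean_cross_power, 5000, 3, 6, alpha=0.9, seed=1)
print(estimate)  # should be close to alpha**2 = 0.81
```

In an actual rate simulation, `mean_cross_power` would be replaced by the sum-rate computation of the scheme under study; the placeholder is used here only to confirm that the cross links carry an average power of about $\alpha^2$.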
By comparing the subfigures of Figure 9 for a given $\alpha$, we conclude that the proposed schemes become more efficient as the processing capacity of the BBUPs increases, i.e., as the allowed quantization rate increases. In addition, employing the smallest possible cluster for a given $r$ is more advantageous for small processing capacities. For example, for $\alpha = 0.9$, the p-clustering scheme for $(t_1, t_2) = (5, 2)$ performs better than the other proposed schemes of larger cluster sizes over the whole SNR range for $\mu_{BBU} = 30$, whereas it outperforms the h-clustering for $t = 2$ only at low SNR values for $\mu_{BBU} = 60$, and it does not outperform either h-clustering for $t = 2$ or p-clustering for $(t_1, t_2) = (5, 2)$ at any SNR value for $\mu_{BBU} = 120$.
By comparing the subfigures of Figure 9 for a given $\mu_{BBU}$, we observe that, for each $\mu_{BBU}$ value, the SNR range in which the 3-sector decoding scheme performs better than or close to the proposed schemes shrinks as the channel attenuation coefficient increases. In addition, for $\mu_{BBU} = 60$ and $120$, the SNR range in which the h-clustering for $t = 2$ outperforms the $I_p$-clustering for $(t_1, t_2) = (5, 2)$ and/or the $I_h$-clustering for $t = 2$ grows with the channel attenuation coefficient. We infer that isolated clustering is more advantageous in the strong interference regime.
Another general conclusion that we can draw from the simulation results presented in Figure 9 is that, if the processing capacity is high enough, i.e., the quantization rate is high enough, decomposing the network into hexagonal-type clusters achieves higher rates than parallelogram-type clusters, especially in the moderate and high SNR range, even if $r_h = r_p$. This is due to the geometrical structure of hexagonal-type clustering, which includes more users for both the h-cluster and the $I_h$-cluster and fewer interferers for the $I_h$-cluster in comparison with the parallelogram clusters for the same $r_h = r_p$.
An interesting conclusion from the finite SNR analysis is that the interfering clusterings perform close to the proposed schemes in the finite SNR range; therefore, the interfering clusterings can also be employed at finite SNR values, since they may be more convenient for practical systems.

Conclusions
In this paper, we analyze the uplink per-user DoF of M-CRAN based sectored cellular networks. The main contributions of this paper are the following: it proposes efficient ways of decomposing the network into non-interfering clusters for M-CRAN scenarios, and it characterizes the per-user DoF as a function of the fronthaul and processing capacity prelogs and the BBUP-BS ratio. The lower bound is obtained through two coding schemes based on decomposing the network into non-interfering parallelogram and hexagonal clusters, respectively. In both schemes, the BSs apply point-to-point quantization to the receive signals and send the quantization messages to the associated BBUPs over fronthaul links for joint decoding.
Simulation results show that, for small and moderate fronthaul capacities, the achievability gap between the lower and cut-set bounds decreases with the inverse of the BBUP-BS ratio. Therefore, the cut-set bound is almost achieved even for small cluster sizes in this range of fronthaul capacities. For higher fronthaul capacity prelogs, the achievability gap is not always tight but decreases with the processing capacity prelog.
The finite SNR analysis shows that the proposed schemes outperform (i) the naive schemes over the whole SNR range under all scenarios, (ii) the interfering clustering schemes over the whole SNR range in the strong interference regime when the BBUP processing capacity is scarce or moderate, and (iii) the 3-sector decoding scheme over the whole SNR range in the strong interference regime when the BBUP processing capacity is moderate or high. In the other scenarios for the interfering clustering and 3-sector decoding schemes, the proposed schemes achieve higher sum-rates except at low SNR values.
In general, the results provide valuable insights into appropriate ways of clustering mobile users/sectors, emphasizing the isolation of clusters, particularly if inter-cell interference is highly detrimental.
and, from (15), the achievable DoF for the hexagon scheme can be written as:
Proof. The first part of the proposition, i.e., (A1), is straightforward. For (A2), note that, for a given $\mu_{BBU} > 0$, there is a unique $t^* \in \mathbb{Z}^+$ that satisfies (A3), and the maximization in (A2) comes from the fact that, to implement the hexagonal scheme, we choose the design parameter $t$ that gives the maximum DoF among the minima found for $t \ge \sqrt{\frac{1}{3r}}$. From (15), we infer that the term $\frac{\mu_F (3t^2 - 1)}{9t^2}$ is active in the minimization for $t^* \ge t \ge \sqrt{\frac{1}{3r}}$ since $\mu_{BBU} \ge \mu_F (3t^2 - 1)$, and that the term $\frac{\mu_{BBU}}{9t^2}$ is active in the lower bound for $t > t^*$ since $\mu_{BBU} \le \mu_F \left(3(t^* + 1)^2 - 1\right)$. Therefore, the achievable DoF of the hexagon scheme is given by the second term of (A2) if $\mu_{BBU} \ge \mu_F \left(3{t^*}^2 - 1\right)$.
We now check all possible intervals of µ BBU regarding (A1) and (A2).
This completes the proof for Remark 1.
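The role of $t^*$ in the argument above can be illustrated numerically. The sketch considers only the two terms named in the proof, $\frac{\mu_F(3t^2-1)}{9t^2}$ and $\frac{\mu_{BBU}}{9t^2}$; the full expression (15) is not reproduced in this section and may contain further terms, so this is not the complete lower bound. It also assumes that (A3) characterizes $t^*$ as the largest positive integer $t$ with $\mu_F(3t^2 - 1) \le \mu_{BBU}$. All function names are hypothetical.

```python
def t_star(mu_f, mu_bbu):
    """Largest positive integer t with mu_f * (3 t^2 - 1) <= mu_bbu
    (assumed reading of (A3)); returns 0 if no positive t qualifies."""
    if mu_f <= 0:
        raise ValueError("mu_f must be positive")
    t = 0
    while mu_f * (3 * (t + 1) ** 2 - 1) <= mu_bbu:
        t += 1
    return t


def hex_dof_two_terms(mu_f, mu_bbu, t):
    """Minimum of the fronthaul-limited and processing-limited terms
    for an h-cluster with parameter t (two-term simplification)."""
    return min(mu_f * (3 * t * t - 1), mu_bbu) / (9 * t * t)


def best_t(mu_f, mu_bbu, t_min, t_max):
    """Design parameter t in [t_min, t_max] maximizing the two-term expression."""
    return max(range(t_min, t_max + 1),
               key=lambda t: hex_dof_two_terms(mu_f, mu_bbu, t))


# Illustrative values: mu_F = 3, mu_BBU = 428, and t >= 4 (as for r = 0.025).
ts = t_star(3, 428)
print(ts, best_t(3, 428, t_min=4, t_max=10))
```

In this example the fronthaul term grows with $t$ up to $t^*$ and the processing term shrinks beyond it, so the maximizing $t$ coincides with $t^*$; in general the maximum of the two-term expression is attained at $t^*$ or $t^* + 1$.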

Appendix B. Proof of Theorem 2
For the sake of simplicity, define $Y_{\mathrm{BBU},k}$ as the received signal of BBUP $k$:
We obtain the first two terms of the upper bound by choosing the cut set $S = \{\text{all base stations}, j = 1, \ldots, N\}$, $S^c = \{\text{all BBUPs}, k = 1, \ldots, K\}$ and defining
In that case, for any fixed BBUP to BS association for any given network, the total rate of all users is upper bounded by:
where the second inequality comes from applying (6) and (9) to the received signals of the BBUPs, which gives the first two terms by Definition 2. The third term comes from the fact that, by [35], the DoF of an $M \times M$ MIMO system is upper bounded by $M$.
For $M \le \mu_F \le 2M$, the term with $\mu_{BBU}$ is active in both the upper and lower bounds if $\mu_{BBU} \cdot r \le \mu_F$ and $\mu_{BBU} \le M + \mu_F (t_1 t_2 - 1)$, respectively, where the matching requires $t_1 t_2 = \frac{1}{r}$.
For $2M \le \mu_F \le 3M$, the term with $\mu_{BBU}$ is active in the upper bound and in the parallelogram lower bound if $\mu_{BBU} \cdot r \le \mu_F$ and $\mu_{BBU} \le M (2t_1 + 2t_2 - 3) + \mu_F (t_1 t_2 - t_1 - t_2 + 1)$, respectively, where matching requires $t_1 t_2 = \frac{1}{r}$. If $\frac{1}{r} \in \mathbb{Z}^+$, there is at least one $(t_1, t_2)$ pair that results in $t_1 t_2 = \frac{1}{r}$. However, , the term with $\mu_{BBU}$ cannot be active in the lower bound. This imposes choosing the pair $(t_1, t_2)$ that minimizes $t_1 + t_2$. The term with $\mu_{BBU}$ is active in the upper bound and in the hexagon lower bound if $\mu_{BBU} \cdot r \le \mu_F$ and $\mu_{BBU} \le M (6t \ldots$ where matching requires $t = \sqrt{\frac{1}{3r}}$.
For 3M ≤ µ F , the matching cases can be found by applying similar procedures as in the 2M ≤ µ F ≤ 3M case.

Appendix D. Proof of Theorem 3
Due to Remark 1, we carry out the achievability gap analysis only for parallelogram clustering. We present the analysis for one of the cases that lead to the maximum achievability gap. For the other cases, a similar procedure can be applied.

• If $\mu_F \le M$, the maximum gap occurs when $\frac{\mu_F}{3}$ and $\frac{\mu_{BBU}}{3 t_1 t_2}$ are active in the upper and lower bounds, respectively. Note that this assumption imposes $\frac{\mu_F}{r} \le \mu_{BBU} \le \mu_F \cdot t_1 \cdot t_2$.
where (a) is due to $\max \mu_{BBU} = \mu_F \cdot \frac{1}{r}$ and (b) is due to $\min t_1 \cdot t_2 = \frac{1}{r}$ by (25).