On Information-Theoretic Scaling Laws for Wireless Networks

Xie, Liang-Liang

doi:10.3390/info16090728

by

Liang-Liang Xie

Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada

Information 2025, 16(9), 728; https://doi.org/10.3390/info16090728

Submission received: 27 July 2025 / Revised: 21 August 2025 / Accepted: 21 August 2025 / Published: 25 August 2025

(This article belongs to the Section Wireless Technologies)

Abstract

In the development of large wireless networks, scaling law studies can provide fundamental insights. For example, is it possible to build an arbitrarily large wireless network without a wired infrastructure while maintaining a constant communication rate for each user? This is equivalent to asking if a linear scaling law is achievable for wireless networks. Whether too ambitious a goal or not, this question has attracted intensive research but still remains open. Among many proposals, the hierarchical scheme is impressive in exploiting the MIMO gain with a bootstrapping strategy. In this paper, a careful analysis of the hierarchical scheme exposes the potential influence of the pre-constant in deriving scaling laws. It is found that a modified hierarchical scheme can achieve a throughput up to an arbitrary factor higher than the original one, although it is still short of linear scaling. This study demonstrates the essential importance of the throughput formula itself, rather than the scaling laws consequently derived.

Keywords:

information theory; scaling laws; wireless networks

1. Introduction

The scaling-law study of the capacity of wireless networks is a retreat when the exact characterization is out of reach. Although it aims at lower goals, it opens an avenue for obtaining concrete results. Such results are asymptotic in nature but can be very insightful, especially for networks with a large number of nodes.

Consider a wireless network of $n$ nodes, where each node is an independent source and wants to send information to another node in the network. What are the achievable rates? For this problem, the seminal work in [1] showed that the multi-hop scheme achieves a scaling law of $\Theta(\sqrt{n})$ for the total throughput. That is, on average, each source–destination pair enjoys a rate of $\Theta(\sqrt{n})/n = \Theta(1/\sqrt{n})$, which, unfortunately, tends to zero as $n$ tends to infinity. This was not good news; it implies that no constant rate can be maintained for all source–destination pairs when the network size $n$ grows. Obviously, in order to maintain a constant rate, the linear scaling $\Theta(n)$ of the total throughput has to be achieved.

Although the multi-hop scheme has indeed been the focus of much protocol development, it is well known from multi-user information theory [2] that there are many cooperation schemes that can achieve higher rates, e.g., interference cancellation [3], multiple access [4,5], broadcast [6], and relay [7]. Hence, the question remains: Is linear scaling achievable if based on multi-user cooperation?

Towards this goal, major progress was made via a hierarchical scheme exploiting the MIMO gain [8], where it was shown that for any $\epsilon > 0$, the scaling $\Theta(n^{1-\epsilon})$ is achievable under certain network conditions. This is a significant improvement over the scaling $\Theta(\sqrt{n})$ achieved by the multi-hop scheme since $\epsilon$ can be made arbitrarily small. However, linear scaling is still not achieved because the pre-constant of the scaling is $\epsilon$-dependent and actually decreases to zero as $\epsilon$ decreases to zero.

There is a subtle difference between the scaling law study in [1] and the scaling law study in [8]. In [1], the pre-constant of the scaling is easy to determine due to the fixed link rate in the multi-hop scheme, which does not change even when the network size grows. However, it is not so simple for schemes based on multi-user cooperation, which is the case in [8]. Unfortunately, the pre-constant was not addressed in [8]. Neglecting the pre-constant gives an incomplete picture and can even lead to misleading conclusions.

In [8], different scaling laws were claimed for dense networks (in a fixed area) and extended networks (with a fixed density) as the number of nodes tends to infinity. However, any practical network lies in a fixed area and has a fixed density. It can be embedded either into a series of increasingly denser networks or into a series of increasingly more extended networks. What can the two different scaling laws tell us about the design and operation of this practical network if they contradict each other? Inevitably, the only explanation is that the scaling laws must be irrelevant to the design and operation of any practical network that lies in a fixed area and has a fixed density.

Is there anything wrong with this? Not really, if one takes into account the pre-constant. Consider the following simple equation:

$$c_{1} n^{\gamma_{1}} = c_{2} n^{\gamma_{2}}.$$

Obviously, for any $\gamma_{1} > \gamma_{2}$ and any given $n$, we can always find $c_{1} < c_{2}$ for which the above equation holds. This indicates that the scaling exponent $\gamma$ can be made arbitrarily large if the pre-constant $c$ is not fixed.
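As a small numerical illustration of this ambiguity, the following sketch computes, for a given $n$, the pre-constant $c_1$ that makes a larger exponent match a smaller one. The function name and the numerical values are illustrative assumptions, not taken from the paper.

```python
def matching_preconstant(c2: float, gamma1: float, gamma2: float, n: float) -> float:
    """Pre-constant c1 for which c1 * n**gamma1 equals c2 * n**gamma2 at this n."""
    return c2 * n ** (gamma2 - gamma1)

# Illustrative values (assumptions, not from the paper).
c2, gamma1, gamma2 = 1.0, 0.9, 0.5
for n in (1e2, 1e4, 1e6):
    c1 = matching_preconstant(c2, gamma1, gamma2, n)
    # Both sides agree at this n even though gamma1 > gamma2, because c1 < c2.
    print(f"n={n:.0e}  c1={c1:.4g}  lhs={c1 * n**gamma1:.4g}  rhs={c2 * n**gamma2:.4g}")
```

The required $c_1$ shrinks as $n$ grows, which is the sense in which an exponent alone says little when the pre-constant is free.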

Therefore, without addressing the pre-constant, the scaling laws claimed in [8] are susceptible to the ambiguity indicated above. Indeed, in [8], the way to improve the scaling exponent is by increasing the number of hierarchical layers $h$ such that the corresponding scaling order $\Theta(n^{\frac{h-1}{h}})$ can be arbitrarily close to linear, as $\frac{h-1}{h} \to 1$. However, the unaddressed pre-constant is actually $h$-dependent and decreases to zero as $h$ tends to infinity, as demonstrated in [9]. That is, the correct and complete expression should be $c(h)\, n^{\frac{h-1}{h}}$, with $c(h) \to 0$ as $h \to \infty$, instead of a single $\Theta$, which cannot tell the whole story.

The more careful study [9] of the hierarchical scheme showed that it is not always better to choose a larger $h$ for a fixed $n$. In fact, for any $n$, the optimal $h$ is

$$h^{*}(n) = \sqrt{\log_{\beta}(n/2)} \tag{1}$$

where $\beta$ is a constant depending on the basic SINR (signal-to-interference-plus-noise ratio) in the network. This implies that any larger $h$ will result in a bigger loss in $c(h)$ compared to the gain from $n^{\frac{h-1}{h}}$. It was also shown that with the optimal choice of $h$ and the corresponding optimal cluster sizes, the maximum throughput achievable by the hierarchical scheme is

$$T^{*}(n) = \frac{\beta R}{\sqrt{\log_{\beta}(n/2)}} \, (n/2)^{1 - \frac{2}{\sqrt{\log_{\beta}(n/2)}}} \tag{2}$$

where $R$ is another constant, also depending on the basic SINR in the network. It can easily be determined that

$$\frac{T^{*}(n)}{n} \downarrow 0.$$

That is, compared to linear scaling, the throughput achieved by the hierarchical scheme becomes monotonically worse as $n$ increases, and the average rate per source–destination pair goes to zero.

One might argue that the scaling exponent $1 - \frac{2}{\sqrt{\log_{\beta}(n/2)}}$ in (2) does converge to 1 as $n \to \infty$ and can thus be replaced by $1 - \epsilon$ for arbitrarily small $\epsilon > 0$, the same as the expression in [8]. However, note that this $\epsilon$ is $n$-dependent, and a smaller $\epsilon$ requires a larger $n$, which, in turn, magnifies the importance of $\epsilon$. This is exactly why $T^{*}(n)$ becomes arbitrarily worse than $n$, although the exponent does converge to 1.

But still, does this matter, if it can be claimed that any scaling of $\Theta(n^{1-\epsilon})$ is achievable for any fixed $\epsilon > 0$, although the pre-constant is $\epsilon$-dependent and diminishes to zero as $\epsilon \to 0$? Yes, it matters for the practical design and operation of wireless networks if the scaling law study intends to be insightful or even relevant. First, as explained above, it becomes clear that for any practical network, it is not always better to choose more hierarchical layers. More layers do increase the exponent, but they also introduce more overhead when supporting the hierarchical structure. There will be some point beyond which the overhead outweighs the benefit of adding more layers. As a simple example, for the case where $\beta = 10$, (1) shows that the optimal number of layers for a network of 20,000 nodes is $h^{*} = \sqrt{\log_{10} 10{,}000} = 2$, i.e., the simplest three-phase operation in the hierarchical scheme, and the corresponding throughput is $c(2)\sqrt{20{,}000}$, which is of the same order as that offered by the simple multi-hop scheme. From here, whether to use the hierarchical scheme or the multi-hop scheme is completely determined by the pre-constant.
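The example above can be reproduced with a few lines of Python. This is only a sketch of Equation (1); the function name `optimal_layers` is a hypothetical helper introduced here for illustration.

```python
import math

def optimal_layers(n: float, beta: float) -> float:
    """Equation (1): real-valued optimal number of layers h*(n)."""
    return math.sqrt(math.log(n / 2) / math.log(beta))

h_star = optimal_layers(20_000, 10.0)   # the example from the text
h = round(h_star)                       # the number of layers must be an integer
exponent = (h - 1) / h                  # scaling exponent (h - 1) / h
print(h_star, h, exponent)
```

With two layers the exponent is one half, which is why the comparison with the multi-hop scheme comes down to the pre-constant.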

Moreover, even if concentrating on the limiting behavior as $n \to \infty$, we will show in this paper that a modification of the hierarchical scheme can achieve a throughput $T_{1}^{*}(n)$ up to an arbitrary factor higher than $T^{*}(n)$ in the sense that

$$\frac{T_{1}^{*}(n)}{T^{*}(n)} \to \infty, \quad \text{as } n \to \infty.$$

In fact, a more careful evaluation shows that

$$\frac{T_{1}^{*}(n)}{T^{*}(n) \log_{a} n} \to \infty, \quad \text{for any } a > 1.$$

The potential of discovering such superior schemes may be missed if one overlooks the importance of $\epsilon$ or the pre-constant.

The remainder of the paper is organized as follows. In Section 2, we point out an immediate improvement that can be made on the hierarchical scheme proposed in [8] and introduce a modification. The throughput of the modified scheme is analyzed and optimized in Section 3, where it is also compared to that of the original scheme. In Section 4, we discuss the drawbacks associated with the notions of “dense” and “extended” networks, artificially coined for scaling law studies, and propose a unified and direct way of addressing the real issues. Finally, some concluding remarks are presented in Section 5.

2. Clustering Multiple-Access with Relay

We introduce a simple modification to the hierarchical scheme proposed in [8]. The basic element in the modification is multiple-access; that is, multiple nodes want to send their independent bits to the same node simultaneously. However, instead of accomplishing this in one step, we use a hierarchical structure, where the bits are relayed via multiple levels of clusters until reaching the final destination. Before going into the details, let us first examine the scheme in [8] to see where improvements can be made.

The network under study consists of $n$ nodes. There are $n$ source–destination pairs evenly distributed such that each node is a source for some other node and is also the destination of some other source. For convenience, let us call this the original S-D pair problem. In order to introduce cooperation, the network is first divided into clusters, each of $M_{1}$ nodes.

The basic element in the scheme in [8] is the three-phase operation. That is, first a source node distributes its bits to the other nodes in the same cluster (different bits to different nodes); then, the source cluster sends all these bits to the destination cluster via the virtual MIMO channel; finally, all the nodes in the destination cluster send their quantized observations to the destination node. Since all nodes are sources, the first step needs to be carried out $M_{1}$ times for all the nodes in the source cluster, which constitutes Phase 1; similarly, since all nodes are destinations, the last step also needs to be carried out $M_{1}$ times for all the nodes in the destination cluster, which constitutes Phase 3; moreover, the second step needs to be carried out $n$ times for $n$ S-D pairs, which constitutes Phase 2.

Note that in each cluster of $M_{1}$ nodes, Phase 1 can actually be decomposed into $M_{1} - 1$ original S-D pair problems with non-overlapping destination distributions, and Phase 3 can be similarly decomposed. It is exactly this observation that leads to the hierarchical structure proposed in [8], where both Phase 1 and Phase 3 can be replaced by another three-phase operation with smaller sub-clusters of size $M_{2}$. Then again, Phases 1 and 3 of the sub-clusters can be replaced by another three-phase operation with even smaller sub-sub-clusters. This process is continued, with each Phase 1 or Phase 3 being replaced by a three-phase operation with smaller clusters, and the hierarchy is built.

Our modification arises from a different perspective on Phase 1 and Phase 3. Although they can be decomposed into a sequence of the original S-D problems, they are essentially the problem where every node wants to send every other node an independent message. From the receiver’s point of view, each node sees the other nodes trying to send independent messages to it via a multiple-access channel. Hence, with this new perspective, in a cluster of $M_{i}$ nodes, both Phase 1 and Phase 3 can be carried out by $M_{i}$ multiple-access operations. Since this is a task in which every node is the target of a multiple-access operation, it is convenient to name it the all-way multiple-access problem.

The advantage of this new perspective is that with cluster cooperation, the all-way multiple-access problem can be accomplished in two steps instead of three. That is, first, the nodes in any one cluster send their bits to the destination cluster via the virtual MIMO channel; then, all the nodes in the destination cluster send their quantized observations to the destination node. In other words, the first step of one node distributing its bits is no longer necessary because now, every node has something to transmit to the same destination. Correspondingly, the hierarchy proposed in [8] can be modified, as in Figure 1. Compared to Figure 3 in [8], the difference is the elimination of all the Phase 1s from the hierarchy, except on the top layer, where the problem is still the original S-D pair problem, which cannot be turned into a multiple-access problem.

As stated in [8], the purpose of Phase 1 is for a node to distribute its bits to the other nodes in the cluster in order to establish a virtual multi-antenna transmitter for the MIMO communication in Phase 2. However, in retrospect, since different bits are distributed to different nodes, there is essentially no mutual understanding among these nodes when they are transmitting together to the destination cluster. Therefore, it may be more accurate to think of Phase 2 as a multiple-access communication with a virtual receive cluster. With this in mind, it then becomes obvious that Phase 2 can be directly carried out without the preparation of Phase 1 if the problem is already a multiple-access problem.

The same modification with a multiple-access perspective has also appeared in [10] in the context of minimizing delay. However, the authors of [10] simply claim that the modified scheme achieves the same throughput as the original scheme in [8], largely due to the neglect of $\epsilon$, as we explained in the Introduction. In the next section, we will show that the modified scheme can achieve a throughput that is arbitrarily higher than that of the original scheme as the network size $n$ grows.

3. Analysis of the Scaling Laws

In this section, we analyze the optimal throughput achievable by the modified hierarchical scheme proposed in the last section. The procedure is similar to that used in [9] to analyze the original hierarchical scheme of [8]. It turns out that the improvement factor grows without bound as the network size $n$ grows to infinity, i.e.,

$$\frac{T_{1}^{*}(n)}{T^{*}(n)} \to \infty$$

where $T_{1}^{*}(n)$ is the optimal throughput of the modified scheme, and $T^{*}(n)$ is the optimal throughput of the original scheme. A more careful evaluation even shows that

$$\frac{T_{1}^{*}(n)}{T^{*}(n) \log_{a} n} \to \infty, \quad \text{for any } a > 1.$$

However, the average rate per S-D pair still goes to zero as $n \to \infty$, i.e.,

$$\frac{T_{1}^{*}(n)}{n} \to 0.$$

Since the analysis procedure is similar to that in [9], we only highlight the differences here. Note that the top layer of the hierarchy remains the same. The key objective is to determine the time needed to accomplish the all-way multiple-access problem in Phase 1 and Phase 3 of the top layer.

As defined in the last section, the all-way multiple-access problem under study can be described as follows. Consider a network of size $M_{1}$, where every node wants to send $L$ bits to every other node in the network. The bits differ from pair to pair, i.e., in total, $M_{1}^{2} L$ bits need to be communicated. (The accurate number should be $M_{1}(M_{1} - 1) L$; however, for simplicity and without much loss of accuracy when $M_{1}$ is large, we use $M_{1}^{2} L$ in the calculation; this approximation will not affect the scaling order.) The question is how long it takes to accomplish the task.

We use the modified two-phase operation scheme to accomplish the task. First, we build the hierarchical structure: Divide these $M_{1}$ nodes into clusters of size $M_{2}$; then, divide each cluster of $M_{2}$ nodes into smaller clusters of size $M_{3}$; continue this process $h - 2$ times for some $h \geq 2$; finally, obtain clusters of size $M_{h-1}$. We will determine the optimal value of $h$ at which to stop, i.e., the optimal number of hierarchical layers, and also the optimal cluster sizes $M_{2}$, $M_{3}$, …, $M_{h-1}$, in the sequel.

Obviously, the number of time slots needed to accomplish the all-way multiple-access problem with the above hierarchical structure depends on the parameters $h$, $M_{1}$, $M_{2}$, …, $M_{h-1}$, and $L$, and it is therefore denoted by

$$D_{h-1}(M_{1}, M_{2}, \dots, M_{h-1}, L). \tag{3}$$

We will use a recurrence relation to determine (3). First, note that the all-way multiple-access problem of the network of size $M_{1}$ is accomplished in two phases, i.e., Phase 2 and Phase 3, with clusters of size $M_{2}$. The number of time slots needed for Phase 2 is simply $\frac{M_{1}}{M_{2}} \cdot 2 M_{1} \frac{L}{R}$, as calculated in [9], where $R$ is the basic rate. In Phase 3, we see again the all-way multiple-access problem for networks of smaller size $M_{2}$, but now with $L \frac{Q}{R} \frac{M_{1}}{M_{2}}$ bits to be communicated between each pair of nodes. Hence, we have the following relation:

$$D_{h-1}(M_{1}, M_{2}, \dots, M_{h-1}, L) = \frac{M_{1}}{M_{2}} \cdot 2 M_{1} \frac{L}{R} + 4 D_{h-2}\left(M_{2}, \dots, M_{h-1}, L \frac{Q}{R} \frac{M_{1}}{M_{2}}\right)$$

where the multiplier 4 is needed for time-sharing between neighboring clusters to avoid excessive interference. Note that for clusters formed according to a regular grid, only one in four clusters can transmit at the same time in order to avoid simultaneous transmission by neighboring clusters.

In turn, we have the following relation:

$$D_{h-2}\left(M_{2}, \dots, M_{h-1}, L \frac{Q}{R} \frac{M_{1}}{M_{2}}\right) = \frac{M_{2}}{M_{3}} \cdot 2 M_{2} \frac{L}{R} \frac{Q}{R} \frac{M_{1}}{M_{2}} + 4 D_{h-3}\left(M_{3}, \dots, M_{h-1}, L \frac{Q}{R} \frac{M_{1}}{M_{2}} \frac{Q}{R} \frac{M_{2}}{M_{3}}\right)$$

and similar recursive relations for $D_{h-3}$, $D_{h-4}$, and so on. Hence, recursively,

$$\begin{aligned}
D_{h-1}(M_{1}, M_{2}, \dots, M_{h-1}, L) ={}& \frac{M_{1}}{M_{2}} \cdot 2 M_{1} \frac{L}{R} \\
&+ 4 \, \frac{M_{2}}{M_{3}} \cdot 2 M_{2} \frac{L}{R} \frac{Q}{R} \frac{M_{1}}{M_{2}} \\
&+ 4^{2} \, \frac{M_{3}}{M_{4}} \cdot 2 M_{3} \frac{L}{R} \left(\frac{Q}{R}\right)^{2} \frac{M_{1}}{M_{3}} \\
&+ \dots \\
&+ 4^{h-3} \, \frac{M_{h-2}}{M_{h-1}} \cdot 2 M_{h-2} \frac{L}{R} \left(\frac{Q}{R}\right)^{h-3} \frac{M_{1}}{M_{h-2}} \\
&+ 4^{h-2} D_{1}\left(M_{h-1}, L \left(\frac{Q}{R}\right)^{h-2} \frac{M_{1}}{M_{h-1}}\right).
\end{aligned}$$

For the smallest clusters of size $M_{h-1}$, the all-way multiple-access problem is accomplished directly without the two-phase operation; thus,

$$D_{1}\left(M_{h-1}, L \left(\frac{Q}{R}\right)^{h-2} \frac{M_{1}}{M_{h-1}}\right) = \frac{L}{R} \left(\frac{Q}{R}\right)^{h-2} \frac{M_{1}}{M_{h-1}} M_{h-1}^{2}.$$

Therefore, letting $c = 4 \frac{Q}{R}$,

$$D_{h-1}(M_{1}, M_{2}, \dots, M_{h-1}, L) = 2 M_{1} \frac{L}{R} \left(\frac{M_{1}}{M_{2}} + c \frac{M_{2}}{M_{3}} + c^{2} \frac{M_{3}}{M_{4}} + \dots + c^{h-3} \frac{M_{h-2}}{M_{h-1}} + c^{h-2} \frac{M_{h-1}}{2}\right).$$
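The recurrence and its closed form can be cross-checked numerically. The following is a minimal sketch, assuming the cluster sizes are passed as a list from $M_1$ down to $M_{h-1}$; the function names and the sample values of $L$, $R$, and $Q$ are illustrative assumptions and do not come from the paper.

```python
def slots_recursive(sizes, L, R, Q):
    """Time slots D for the all-way multiple-access problem via the recurrence.

    sizes = [M_1, M_2, ..., M_{h-1}], from the largest to the smallest cluster.
    """
    M = sizes[0]
    if len(sizes) == 1:
        # Smallest clusters: direct transmission of M^2 * L bits at rate R.
        return (L / R) * M * M
    M_next = sizes[1]
    phase2 = (M / M_next) * 2 * M * (L / R)
    # Phase 3 is the same problem on clusters of size M_next with more bits
    # per pair; the factor 4 is the time-sharing between neighboring clusters.
    bits_next = L * (Q / R) * (M / M_next)
    return phase2 + 4 * slots_recursive(sizes[1:], bits_next, R, Q)

def slots_closed_form(sizes, L, R, Q):
    """Closed-form expression with c = 4 Q / R."""
    c = 4 * Q / R
    k = len(sizes)  # k = h - 1
    total = sum(c**i * sizes[i] / sizes[i + 1] for i in range(k - 1))
    total += c ** (k - 1) * sizes[-1] / 2
    return 2 * sizes[0] * (L / R) * total

# Illustrative values only (h = 4 layers).
sizes = [1000.0, 100.0, 10.0]
print(slots_recursive(sizes, 1.0, 1.0, 2.0), slots_closed_form(sizes, 1.0, 1.0, 2.0))
```

Both functions should agree for any list of sizes, which gives a quick sanity check on the unrolled sum.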

For any fixed $M_{1}$, we want to minimize the sum in the parentheses above. Note that the product of its $h - 1$ terms does not depend on $M_{2}, \dots, M_{h-1}$:

$$c^{1 + 2 + \dots + (h-2)} \cdot \frac{M_{1}}{2} = c^{\frac{(h-1)(h-2)}{2}} \cdot \frac{M_{1}}{2}.$$

By the inequality of arithmetic and geometric means, the sum is minimized when all terms are equal, i.e., when every term equals

$$\left(c^{\frac{(h-1)(h-2)}{2}} \cdot \frac{M_{1}}{2}\right)^{\frac{1}{h-1}} = c^{\frac{h-2}{2}} \left(\frac{M_{1}}{2}\right)^{\frac{1}{h-1}}.$$

This leads to the optimal cluster size choices:

$$M_{i} = 2 c^{-\frac{(i-1)(h-i)}{2}} \left(\frac{M_{1}}{2}\right)^{\frac{h-i}{h-1}}, \quad 2 \leq i \leq h-1 \tag{4}$$

and the minimum number of time slots:

$$D_{h-1}^{*}(M_{1}, M_{2}, \dots, M_{h-1}, L) = 2 M_{1} \frac{L}{R} (h-1) c^{\frac{h-2}{2}} \left(\frac{M_{1}}{2}\right)^{\frac{1}{h-1}}.$$
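As a sanity check on (4), the sketch below computes the cluster sizes and verifies that every term of the sum takes the same value. The helper names and the sample values of $M_1$, $h$, and $c$ are assumptions made for illustration.

```python
def optimal_cluster_sizes(M1, h, c):
    """Equation (4) for i = 2..h-1, with i = 1 included (it returns M1 itself)."""
    return [2 * c ** (-(i - 1) * (h - i) / 2) * (M1 / 2) ** ((h - i) / (h - 1))
            for i in range(1, h)]

def sum_terms(sizes, c):
    """The h - 1 terms inside the parentheses of the closed-form expression."""
    k = len(sizes)
    terms = [c**i * sizes[i] / sizes[i + 1] for i in range(k - 1)]
    terms.append(c ** (k - 1) * sizes[-1] / 2)
    return terms

def min_slots(M1, h, c, L, R):
    """Minimum number of time slots D* for the optimal cluster sizes."""
    return 2 * M1 * (L / R) * (h - 1) * c ** ((h - 2) / 2) * (M1 / 2) ** (1 / (h - 1))

# Illustrative values only.
M1, h, c = 1e6, 4, 8.0
sizes = optimal_cluster_sizes(M1, h, c)
target = c ** ((h - 2) / 2) * (M1 / 2) ** (1 / (h - 1))
print(sizes)
print(sum_terms(sizes, c), target)
```

The cluster sizes are real-valued here; in practice they would have to be rounded to integers, which the formulas above do not account for.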

Therefore, on the top layer, the number of time slots needed for Phase 1 is

$$4 \times 2 M_{1} \frac{L}{R} (h-1) c^{\frac{h-2}{2}} \left(\frac{M_{1}}{2}\right)^{\frac{1}{h-1}};$$

the number of time slots needed for Phase 3 is

$$4 \times 2 M_{1} \frac{L}{R} (h-1) c^{\frac{h-2}{2}} \left(\frac{M_{1}}{2}\right)^{\frac{1}{h-1}} \frac{Q}{R};$$

and the number of time slots needed for Phase 2 is still $2 n \frac{L}{R}$, the same as in the original scheme. After these time slots, the number of bits transported for each S-D pair is $M_{1} L$, and the total number of bits transported in the whole network is $n M_{1} L$. Therefore, the throughput is calculated as

$$\frac{n M_{1} L}{16 \frac{L}{R} (h-1) \left(1 + \frac{Q}{R}\right) c^{\frac{h-2}{2}} \left(\frac{M_{1}}{2}\right)^{\frac{h}{h-1}} + 2 n \frac{L}{R}} =: f(M_{1}).$$

It is easy to find the optimal choice of $M_{1}$ by setting $f'(M_{1}) = 0$; we have

$$n = 8 \left(1 + \frac{Q}{R}\right) c^{\frac{h-2}{2}} \left(\frac{M_{1}}{2}\right)^{\frac{h}{h-1}}, \quad \text{or equivalently,} \quad M_{1} = 2 \left[8 \left(1 + \frac{Q}{R}\right) c^{\frac{h-2}{2}}\right]^{-\frac{h-1}{h}} n^{\frac{h-1}{h}} \tag{5}$$

and the corresponding throughput

$$T_{h}^{\mathrm{opt}}(n) = \frac{R}{h \, (1 + R/Q)^{\frac{h-1}{h}} (4Q/R)^{\frac{h-1}{2}}} \, (n/2)^{\frac{h-1}{h}}.$$
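The optimization over $M_1$ can be checked numerically. The sketch below evaluates $f(M_1)$ at the value given by (5) and compares it with the closed-form throughput; the function names and the sample values of $n$, $h$, $R$, and $Q$ are assumptions chosen only for illustration.

```python
def throughput(M1, n, h, R, Q):
    """f(M1) from the text; the bit count L cancels and is omitted."""
    c = 4 * Q / R
    phases_1_and_3 = ((16 / R) * (h - 1) * (1 + Q / R) * c ** ((h - 2) / 2)
                      * (M1 / 2) ** (h / (h - 1)))
    phase_2 = 2 * n / R
    return n * M1 / (phases_1_and_3 + phase_2)

def optimal_top_cluster(n, h, R, Q):
    """Equation (5): the top-layer cluster size M1 that maximizes f."""
    c = 4 * Q / R
    return 2 * (8 * (1 + Q / R) * c ** ((h - 2) / 2)) ** (-(h - 1) / h) * n ** ((h - 1) / h)

def optimal_throughput(n, h, R, Q):
    """Closed-form throughput for a fixed number of layers h >= 2."""
    return (R / (h * (1 + R / Q) ** ((h - 1) / h) * (4 * Q / R) ** ((h - 1) / 2))
            * (n / 2) ** ((h - 1) / h))

# Illustrative values only.
n, h, R, Q = 1e6, 3, 1.0, 2.0
M1 = optimal_top_cluster(n, h, R, Q)
print(M1, throughput(M1, n, h, R, Q), optimal_throughput(n, h, R, Q))
```

At the optimum the denominator of $f$ reduces to $2 n h L / R$, so the throughput equals $M_1 R / (2h)$, which the closed form expresses in terms of $n$.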

For any fixed $n$, we can find the optimal $h$ to maximize $T_{h}^{\mathrm{opt}}(n)$ by setting

$$\frac{d T_{h}^{\mathrm{opt}}(n)}{d h} = 0.$$

This leads to

$$h^{2} \ln\left(2 \sqrt{Q/R}\right) + h - \left[\ln(n/2) - \ln(1 + R/Q)\right] = 0.$$

Hence, the optimal number of layers to choose is

$$h^{*} = \frac{\sqrt{1 + 4 \ln\left(2 \sqrt{Q/R}\right) \left[\ln(n/2) - \ln(1 + R/Q)\right]} - 1}{2 \ln\left(2 \sqrt{Q/R}\right)}.$$

As in [9], in order to obtain a simple formula, we use the approximation

$$h^{*} = \frac{\sqrt{4 \ln\left(2 \sqrt{Q/R}\right) \ln(n/2)}}{2 \ln\left(2 \sqrt{Q/R}\right)} \tag{6}$$

which is very accurate for large $n$ since, in the numerator, we only omitted the terms independent of $n$, which become arbitrarily small relative to the remaining term as $n$ becomes larger. Letting $\beta_{1} = 2 \sqrt{Q/R}$, we have

$$h^{*} = \sqrt{\frac{\ln(n/2)}{\ln\left(2 \sqrt{Q/R}\right)}} = \sqrt{\log_{\beta_{1}}(n/2)}. \tag{7}$$
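To get a feel for the approximation, the exact root of the quadratic can be compared with (7). This is a sketch under assumed values of $R$ and $Q$ (with $Q/R > 1/4$ so that $\beta_1 > 1$); the function names are hypothetical.

```python
import math

def h_exact(n, R, Q):
    """Positive root of h^2 ln(2 sqrt(Q/R)) + h - [ln(n/2) - ln(1 + R/Q)] = 0."""
    a = math.log(2 * math.sqrt(Q / R))
    b = math.log(n / 2) - math.log(1 + R / Q)
    return (math.sqrt(1 + 4 * a * b) - 1) / (2 * a)

def h_approx(n, R, Q):
    """Equation (7): h* = sqrt(log_{beta_1}(n / 2)) with beta_1 = 2 sqrt(Q/R)."""
    beta1 = 2 * math.sqrt(Q / R)
    return math.sqrt(math.log(n / 2) / math.log(beta1))

# Illustrative values only.
R, Q = 1.0, 2.0
for n in (1e6, 1e12, 1e24):
    he, ha = h_exact(n, R, Q), h_approx(n, R, Q)
    print(f"n={n:.0e}  exact={he:.3f}  approx={ha:.3f}  ratio={he / ha:.3f}")
```

The ratio of the two values moves towards 1 as $n$ grows, although only slowly, so for moderate $n$ the integer obtained by rounding may differ between the two expressions.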

Note that

$$\beta_{1}^{h} = \beta_{1}^{\log_{\beta_{1}}(n/2) \cdot \frac{h}{\log_{\beta_{1}}(n/2)}} = (n/2)^{\frac{h}{\log_{\beta_{1}}(n/2)}}.$$

Therefore,

$$\begin{aligned}
T_{h}^{\mathrm{opt}}(n) &= \frac{R}{h \, (1 + R/Q)^{\frac{h-1}{h}} (4Q/R)^{\frac{h-1}{2}}} \, (n/2)^{\frac{h-1}{h}} \\
&= \frac{\beta_{1} R}{h \, (1 + R/Q)^{\frac{h-1}{h}} \beta_{1}^{h}} \, (n/2)^{\frac{h-1}{h}} \\
&= \frac{\beta_{1} R}{h \, (1 + R/Q)^{\frac{h-1}{h}}} \, (n/2)^{1 - \frac{1}{h} - \frac{h}{\log_{\beta_{1}}(n/2)}}
\end{aligned} \tag{8}$$

where letting $h = h^{*} = \sqrt{\log_{\beta_{1}}(n/2)}$, we have the optimal throughput

$$T_{1}^{*}(n) = \frac{\beta_{1} R}{c_{n} \sqrt{\log_{\beta_{1}}(n/2)}} \, (n/2)^{1 - \frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}} \tag{9}$$

where

$$c_{n} = (1 + R/Q)^{1 - \frac{1}{\sqrt{\log_{\beta_{1}}(n/2)}}} \to (1 + R/Q), \quad \text{as } n \to \infty.$$

Obviously, (9) is very accurate for large $n$, although we made some approximations in (6), and $h^{*}$ should always be an integer.

Hence, we arrive at the following theorem.

Theorem 1.

With the modified hierarchical scheme, by choosing the optimal number of layers as (7) and the corresponding optimal cluster sizes as (4) and (5), the optimal throughput is given by (9).
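The following sketch evaluates (9) and, for comparison, searches over integer numbers of layers using the fixed-$h$ throughput from (8). The function names and the values of $n$, $R$, and $Q$ are assumptions for illustration only; the integer search is not part of the paper's derivation.

```python
import math

def t_h_opt(n, h, R, Q):
    """Throughput for a fixed number of layers h (first line of Equation (8))."""
    return (R / (h * (1 + R / Q) ** ((h - 1) / h) * (4 * Q / R) ** ((h - 1) / 2))
            * (n / 2) ** ((h - 1) / h))

def t1_star(n, R, Q):
    """Equation (9): throughput at h = sqrt(log_{beta_1}(n / 2))."""
    beta1 = 2 * math.sqrt(Q / R)
    s = math.sqrt(math.log(n / 2) / math.log(beta1))
    c_n = (1 + R / Q) ** (1 - 1 / s)
    return beta1 * R / (c_n * s) * (n / 2) ** (1 - 2 / s)

# Illustrative values only.
n, R, Q = 1e6, 1.0, 2.0
best_h = max(range(2, 30), key=lambda h: t_h_opt(n, h, R, Q))
print("formula (9):", t1_star(n, R, Q))
print("best integer h:", best_h, "throughput:", t_h_opt(n, best_h, R, Q))
```

Because (9) is evaluated at the approximate, real-valued $h^{*}$, the best integer choice can give a somewhat different value for moderate $n$.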

Without the approximation (6), we can also obtain an exact upper bound of the throughput. By (8),

$$\begin{aligned}
T_{h}^{\mathrm{opt}}(n) &\leq \beta_{1} R \, (n/2)^{1 - \frac{1}{h} - \frac{h}{\log_{\beta_{1}}(n/2)}} \\
&\leq \beta_{1} R \, (n/2)^{1 - \frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}}
\end{aligned}$$

where, in the last inequality, “=” holds if $h = \sqrt{\log_{\beta_{1}}(n/2)}$. It is easy to check that with the modified hierarchical scheme, the average rate per S-D pair still goes to zero as

$$\begin{aligned}
\frac{(n/2)^{1 - \frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}}}{n} &= \frac{1}{2} (n/2)^{-\frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}} \\
&= \frac{1}{2} \left(\beta_{1}^{\log_{\beta_{1}}(n/2)}\right)^{-\frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}} \\
&= \frac{1}{2} \beta_{1}^{-2 \sqrt{\log_{\beta_{1}}(n/2)}} \\
&\to 0.
\end{aligned}$$

However, the modified scheme can be up to an arbitrary factor better than the original one in [8], as can be checked with

$$\begin{aligned}
\frac{T_{1}^{*}(n)}{T^{*}(n)} &= \left[\frac{\beta_{1} R}{c_{n} \sqrt{\log_{\beta_{1}}(n/2)}} \, (n/2)^{1 - \frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}}\right] \bigg/ \left[\frac{\beta R}{\sqrt{\log_{\beta}(n/2)}} \, (n/2)^{1 - \frac{2}{\sqrt{\log_{\beta}(n/2)}}}\right] \\
&= \frac{\beta_{1} \sqrt{\log_{\beta} \beta_{1}}}{c_{n} \beta} \, (n/2)^{\frac{2}{\sqrt{\log_{\beta}(n/2)}} \left(1 - \sqrt{\log_{\beta} \beta_{1}}\right)} \\
&= \frac{\beta_{1} \sqrt{\log_{\beta} \beta_{1}}}{c_{n} \beta} \, \beta^{2 \left(1 - \sqrt{\log_{\beta} \beta_{1}}\right) \sqrt{\log_{\beta}(n/2)}} \\
&\to \infty,
\end{aligned}$$

where $T^{*}(n)$ is the optimal throughput of the original scheme, as calculated in [9], with $\beta = 2 \sqrt{1 + Q/R} > \beta_{1}$; thus, $\log_{\beta} \beta_{1} < 1$. In fact, we can show an even stronger result that

$$\frac{T_{1}^{*}(n)}{T^{*}(n) \log_{a} n} \to \infty, \quad \text{for any } a > 1,$$

since

$$\frac{2 \left(1 - \sqrt{\log_{\beta} \beta_{1}}\right) \sqrt{\log_{\beta}(n/2)}}{\log_{\beta} \log_{a} n} \to \infty.$$
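The growth of the ratio can be observed numerically from formulas (2) and (9). The sketch below uses hypothetical function names and assumed values of $R$ and $Q$; it illustrates the limit and is not a substitute for the argument above.

```python
import math

def t_star(n, R, Q):
    """Equation (2) with beta = 2 sqrt(1 + Q/R): original hierarchical scheme."""
    beta = 2 * math.sqrt(1 + Q / R)
    s = math.sqrt(math.log(n / 2) / math.log(beta))
    return beta * R / s * (n / 2) ** (1 - 2 / s)

def t1_star(n, R, Q):
    """Equation (9) with beta_1 = 2 sqrt(Q/R): modified hierarchical scheme."""
    beta1 = 2 * math.sqrt(Q / R)
    s = math.sqrt(math.log(n / 2) / math.log(beta1))
    c_n = (1 + R / Q) ** (1 - 1 / s)
    return beta1 * R / (c_n * s) * (n / 2) ** (1 - 2 / s)

def ratio(n, R, Q):
    """T_1*(n) / T*(n)."""
    return t1_star(n, R, Q) / t_star(n, R, Q)

# Illustrative values only.
R, Q = 1.0, 2.0
for n in (1e6, 1e12, 1e24):
    print(f"n={n:.0e}  ratio={ratio(n, R, Q):.3f}")
```

For these assumed parameters the ratio increases with $n$ but does so slowly, which is consistent with a factor of the form $\beta^{\kappa \sqrt{\log_{\beta}(n/2)}}$ for a small positive constant $\kappa$.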

4. Dense or Sparse Networks?

The analysis in the last section assumed a fixed basic rate R in order to focus on the scaling in terms of n. While this is the case under some channel gain models for so-called dense networks, where networks are confined in a fixed area even as the number of nodes grows to infinity, it is not so easy to maintain a fixed basic rate for networks with growing areas due to the power path loss.

Therefore, when addressing so-called extended networks, where the node density is fixed while the area grows proportionally to the number of nodes, ref. [8] proposed the trick of concentrating the total transmission power into a small portion of the total transmission time to compensate for the path loss, so that during that portion, the received SINR is maintained at a specific level. Then, with the following power path loss model:

$$P_{r} = P_{t} / d^{\alpha},$$

i.e., the received power $P_{r}$ depends on the transmitted power $P_{t}$ via the transmitter–receiver distance $d$ and the path-loss exponent $\alpha$, the scaling for extended networks readily follows by multiplying all the results above by the factor $n^{1 - \alpha/2}$. That is, to compensate for the power path loss, the transmitted power needs to be $d^{\alpha}$ times larger, i.e., $(\sqrt{n})^{\alpha}$ times larger, considering long-hop distances in an area proportional to $n$. Since the required power level is $P/n$ for dense networks, the hierarchical scheme can only be operated for an $n^{1 - \alpha/2}$ fraction of the time for extended networks to satisfy the total power constraint, which leads to the multiplication by the same factor of all the scaling law results obtained previously.

More generally, as pointed out in [9], the same trick can be played on networks whose area is neither fixed nor linearly growing. That is, a network with area $A$ is classified into one of two categories based on whether

$$A^{\alpha/2} \leq n. \tag{10}$$

In the case where $A^{\alpha/2} \leq n$, the basic SINR can be maintained all the time, and the power-concentration trick is not needed. In the other case, where $A^{\alpha/2} > n$, the power-concentration trick is needed to maintain the basic SINR for an $n / A^{\alpha/2}$ fraction of the time, and all the results correspondingly need to be multiplied by the same factor. For example, Formula (9) should be modified as

$$T_{1}^{*}(n, A) = \min\left\{1, \frac{n}{A^{\alpha/2}}\right\} \frac{\beta_{1} R}{c_{n} \sqrt{\log_{\beta_{1}}(n/2)}} \, (n/2)^{1 - \frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}}. \tag{11}$$

In [9], these two categories are, respectively, named as “dense” and “sparse” networks. Note that this is a notion that can be readily clarified on any specific network based on the relation between the area A and the number of nodes n, different from the previous notion of “dense” and “extended” networks, which is undetermined for any single network. However, one has to realize that this new notion is largely a consequence of the hierarchical scheme, and it is also related to the path-loss exponent.

A follow-up work by the same authors [11] proposed to address the intermediate regime between dense and extended networks by introducing a more general pattern of area scaling as

$$A = n^{\nu} \tag{12}$$

where $\nu$ is a real number, with $\nu = 0$ corresponding to dense networks, and $\nu = 1$ corresponding to extended networks. Although it seems more general owing to the flexibility of choosing different values for $\nu$, it is still artificial to make the network area scale according to the pattern (12). The ambiguity of determining the right embedding process for any specific network still remains, as we pointed out in the Introduction.

Above all, the motivation to study the capacity of wireless networks is clear: to provide insight and guidance with respect to the practical design and operation of such networks. After the explicit determination of the pre-constant, it is clear that throughput formulas such as (11) hold for any finite number $n$, and the scaling laws are just a consequence of letting $n \to \infty$. If the practical problem under study is a specific network with a specific area and a specific number of nodes, then obviously, it is more natural and insightful to apply Formula (11) directly rather than to consult the scaling laws derived thereafter. One should therefore probably be concerned less with the scaling laws than with the exact throughput formula itself.

Note that in Formula (11), the parameters $R$, $Q$, and $\beta_{1}$ also affect the throughput, and they are determined by the basic SINR, which, in turn, is determined by the long-hop path loss. Therefore, for the flexibility of selecting different basic SINRs, the criterion (10) should be modified as follows:

$$A^{\alpha/2} \leq c_{0} n, \tag{13}$$

and the corresponding optimal throughput should be modified as follows:

$$T_{1}^{*}(n, A) = \min\left\{1, \frac{c_{0} n}{A^{\alpha/2}}\right\} \frac{\beta_{1} R}{c_{n} \sqrt{\log_{\beta_{1}}(n/2)}} \, (n/2)^{1 - \frac{2}{\sqrt{\log_{\beta_{1}}(n/2)}}} \tag{14}$$

where $c_{0}$ is a constant, chosen to set the threshold of the basic SINR and thus the values of the parameters $R$, $Q$, and $\beta_{1}$. Generally, a smaller $c_{0}$ leads to a larger basic rate $R$; however, a smaller $c_{0}$ may also cause the condition (13) to be unsatisfied, thus leading to the scale-down factor $c_{0} n / A^{\alpha/2}$ in (14) as a result of the power-concentration trick. Hence, there is a basic tradeoff in choosing $c_{0}$ when maximizing (14). Apparently, the aforementioned notions of “dense” and “sparse” networks derived via criterion (10) are rather arbitrary and more scheme-dependent than fundamental.
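Formula (14) is straightforward to evaluate for a specific network. The sketch below is one way to do so; the function names and all numerical values are assumptions for illustration. In particular, $R$ and $Q$ are passed as independent inputs here, whereas in the text they are tied to the choice of $c_0$ through the basic SINR, a dependence that is not modeled in this sketch.

```python
import math

def t1_star(n, R, Q):
    """Equation (9): throughput when the basic SINR is maintained all the time."""
    beta1 = 2 * math.sqrt(Q / R)
    s = math.sqrt(math.log(n / 2) / math.log(beta1))
    c_n = (1 + R / Q) ** (1 - 1 / s)
    return beta1 * R / (c_n * s) * (n / 2) ** (1 - 2 / s)

def throughput_with_area(n, A, alpha, R, Q, c0=1.0):
    """Equation (14): Equation (9) scaled by min{1, c0 * n / A^(alpha/2)}."""
    scale = min(1.0, c0 * n / A ** (alpha / 2))
    return scale * t1_star(n, R, Q)

# Illustrative values only: the same n in a unit area and in an area equal to n.
n, alpha, R, Q = 1e6, 3.0, 1.0, 2.0
print("A = 1:", throughput_with_area(n, 1.0, alpha, R, Q))
print("A = n:", throughput_with_area(n, n, alpha, R, Q))
```

With $c_0 = 1$ this reduces to Formula (11). Exploring the tradeoff in $c_0$ would additionally require a model of how $R$ and $Q$ vary with the basic SINR, which is outside the scope of this sketch.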

In summary, (14) gives the optimal throughput achievable by the modified hierarchical scheme for a network of n nodes and area A. This is all we need to know: scaling laws of all kinds can be derived from it by taking different limits. The real question is how good (14) is for all possible values of n and A, not just when $n \to \infty$. We have presented a simple example in the Introduction showing that this is a question even in comparison with the multi-hop scheme alone. In general, we note that the upper bounds obtained in [12,13] apply to any finite network with specific n and A and, in fact, encompass more general traffic patterns under the criterion of transport capacity, which allows for unequal rates and uneven S-D distributions.
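To illustrate why the finite-n behavior matters, the sketch below evaluates the exponent of $(n/2)$ in (14), namely $1 - 2/\sqrt{\log_{\beta_1}(n/2)}$, and inverts it to find how large n must be for that exponent to reach a given target. The function names and the value $\beta_1 = 4$ used in the note afterwards are hypothetical, chosen only for illustration, and the pre-factors in (14) are ignored.

```python
import math


def exponent_in_14(n, beta1):
    """Exponent of (n/2) in (14): 1 - 2 / sqrt(log_{beta1}(n/2))."""
    levels = math.log(n / 2.0, beta1)
    if levels <= 0:
        raise ValueError("log_{beta1}(n/2) must be positive")
    return 1.0 - 2.0 / math.sqrt(levels)


def log10_nodes_for_exponent(target, beta1):
    """log10 of the n at which the exponent in (14) equals `target`.

    Solving 1 - 2/sqrt(L) = target gives L = (2 / (1 - target))^2 and
    n = 2 * beta1^L. The logarithm is returned because n itself quickly
    exceeds the floating-point range as target approaches 1.
    """
    if target >= 1.0:
        raise ValueError("target must be below 1 (linear scaling)")
    levels = (2.0 / (1.0 - target)) ** 2
    return math.log10(2.0) + levels * math.log10(beta1)
```

Under the illustrative choice $\beta_1 = 4$, the exponent is 0 at n = 512 and reaches 0.5 only at $n = 2 \cdot 4^{16}$, roughly $8.6 \times 10^{9}$ nodes, which suggests how slowly the power of n in (14) approaches 1.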

5. Conclusions

Caution regarding the pre-constant is needed when deriving scaling laws for wireless networks, especially with multi-user cooperation schemes, where the overhead may not be negligible. Based on an explicit analysis of the pre-constant, we have shown that a modified hierarchical scheme can achieve a throughput up to an arbitrary factor higher than the original one, although it still falls short of linear scaling. This leaves open the question of whether it is possible to maintain a constant rate between each S-D pair as the number of nodes grows to infinity.

On the other hand, we have demonstrated that it is the throughput formula itself, as a function of the network parameters, rather than the scaling laws, that is of pivotal importance. We emphasize that all scaling laws can be derived from this formula and, more importantly, that it is this formula that relates directly to practice.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The author declares no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

MIMO	Multiple-input multiple-output
SINR	Signal-to-interference-plus-noise ratio
S-D	Source–destination

References

1. Gupta, P.; Kumar, P.R. The capacity of wireless networks. IEEE Trans. Inf. Theory 2000, 46, 388–404.
2. Cover, T.; Thomas, J. Elements of Information Theory; Wiley: New York, NY, USA, 1991.
3. Shannon, C.E. Two-way communication channels. In Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 20 June–30 July 1961; pp. 611–644.
4. Ahlswede, R. Multi-way communication channels. In Proceedings of the 2nd International Symposium on Information Theory, Budapest, Hungary, 2–8 September 1971; pp. 23–52.
5. Liao, H. Multiple Access Channels. Ph.D. Dissertation, University of Hawaii, Honolulu, HI, USA, 1972.
6. Cover, T. Broadcast channels. IEEE Trans. Inf. Theory 1972, 18, 2–14.
7. Cover, T.; El Gamal, A.A. Capacity theorems for the relay channel. IEEE Trans. Inf. Theory 1979, 25, 572–584.
8. Ozgur, A.; Leveque, O.; Tse, D. Hierarchical cooperation achieves optimal capacity scaling in ad hoc networks. IEEE Trans. Inf. Theory 2007, 53, 3549–3572.
9. Ghaderi, J.; Xie, L.-L.; Shen, X. Hierarchical cooperation in ad hoc networks: Optimal clustering and achievable throughput. IEEE Trans. Inf. Theory 2009, 55, 3425–3436.
10. Ozgur, A.; Leveque, O. Throughput-delay tradeoff for hierarchical cooperation in ad hoc wireless networks. IEEE Trans. Inf. Theory 2010, 56, 1369–1377.
11. Ozgur, A.; Johari, R.; Tse, D.; Leveque, O. Information-theoretic operating regimes of large wireless networks. IEEE Trans. Inf. Theory 2010, 56, 427–437.
12. Xie, L.-L.; Kumar, P.R. A network information theory for wireless communication: Scaling laws and optimal operation. IEEE Trans. Inf. Theory 2004, 50, 748–767.
13. Xie, L.-L.; Kumar, P.R. On the path-loss attenuation regime for positive cost and linear scaling of transport capacity in wireless networks. IEEE Trans. Inf. Theory 2006, 52, 2313–2328.

Figure 1. A modified hierarchical scheme that can achieve an arbitrarily higher throughput than the original one.


© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
