Next Article in Journal
A Peak Traffic Congestion Prediction Method Based on Bus Driving Time

Entropy 2019, 21(7), 708; https://doi.org/10.3390/e21070708

Article
Gaussian Belief Propagation for Solving Network Utility Maximization with Delivery Contracts
by 1,2,* and
1
National Engineering Center for E-Learning, Huazhong Normal University, Wuhan 430079, China
2
The National Engineering Laboratory for Educational Big Data Technology, Huazhong Normal University, Wuhan 430079, China
3
The National Engineering Laboratory for Big Data Analytics and The School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049, China
*
Author to whom correspondence should be addressed.
Received: 13 June 2019 / Accepted: 17 July 2019 / Published: 19 July 2019

## Abstract

:
Classical network utility maximization (NUM) models fail to capture network dynamics, which are of increasing importance for modeling network behaviors. In this paper, we consider the NUM with delivery contracts, which are constraints to the classical model to describe network dynamics. This paper investigates a method to distributively solve the given problem. We first transform the problem into an equivalent model of linear equations by dual decomposition theory, and then use Gaussian belief propagation algorithm to solve the equivalent issue distributively. The proposed algorithm has faster convergence speed than the existing first-order methods and distributed Newton method. Experimental results have demonstrated the effectiveness of our proposed approach.
Keywords:
network utility maximization; delivery contracts; Gaussian belief propagation; distributed algorithms

## 1. Introduction

Since the publication of the seminal paper  by Kelly et al., the framework of network utility maximization (NUM) has received a great deal of interest in the past two decades, which has been developed into a mathematical theory of network architectures . Many important network design and resource allocation problems can be formulated as a NUM model. The utility concept, originally proposed in economics, is used to measure the satisfaction degree of a consumer for a good or service. In the basic NUM model, the utility of a network user is defined as a function of its data rate. The goal of network system is designed to maximize the overall utility of all the users in the network.
Consider a network with L links and R users, where each link l has a capacity of $c l$ bps. Let $L$ be the set of all links and a route r is a non-empty subset of $L$, let $R$ be the set of possible routes, and associate a route r with a user r (or a data source r), i.e., $L = { 1 , 2 , … , L }$ and $R = { 1 , 2 , … , R }$. Set $A l r = 1$ if $l ∈ r$, so that the link l lies on route r, and set $A l r = 0$ otherwise. This defines a 0-1 routing matrix $A = ( A l r , l ∈ L , r ∈ R )$.
Suppose that if a rate $x r$ is allocated to user r then this has utility $U r ( x r )$ to the user. Assume that the utility $U r ( x r )$ is increasing, strictly concave, and continuously differentiable over the range $x r ≥ 0$. Let $U = ( U r ( x r ) , r ∈ R )$ and $C = ( c l , l ∈ L )$. Under this model, the network seeks a rate allocation $x = ( x r , r ∈ R )$ which solves the following optimization problem .
$max ∑ r ∈ R U r ( x r ) subject to A x ≤ C x ≥ 0$
However, the basic NUM model (1) does not consider the network dynamics such as time-varying link capacities and user demands for quality of service (QoS). In this paper, we investigate a dynamic NUM model with QoS constraints, i.e., the model is time-varying over time t, which takes values in the set of time indices $T = { = 1 , 2 , … , T }$. The model was first introduced in  as following
$max ∑ t = 1 T ∑ r = 1 R U r t ( x r t ) subject to A t x t ≤ C t ∀ t ∈ T B r x r ≥ q r ∀ r ∈ R x t ≥ 0 ∀ t ∈ T$
where $x r t$ denotes the source rate for user r at time step t, $x t = ( x r t , r ∈ R )$ is the rate vector of all users at time step t and $x r = ( x r t , t ∈ T )$ is the source rate allocation for user r. The second constraint in model (2) is the QoS constraints or delivery contracts. For each user, a delivery contract is the required minimal flow to be delivered over some particular time interval. Assume user r has $k r$ delivery contracts and $q r ∈ R k r$ is the associated contract quantity amounts. A contract is active at time index t if t is in the time interval of the contract. We can define a 0-1 matrix $B r ∈ R k r × T$ to represent the delivery contract indicator matrix by setting the $( B r ) k t = 1$ if the kth contract of user r is active at time t, and setting $( B r ) k t = 0$ otherwise. Thus, the QoS constraints that all delivery contracts are met can be given by $B r x r ≥ q r ∀ r ∈ R$, i.e., the second constraint in model (2).
In the dynamic NUM model (2), utility function $U r t$, route matrix $A t$ and link capacity $C t$ are all dependent on the time index t, this means they are all possibly time-varying. There are many different ways to solve the problem (2), such as interior-point methods  and primary-dual algorithms . The interior-point methods are efficient for solving the problem (2); however, they are centralized algorithms. The primary-dual algorithms are decentralized, but they suffer from slow convergence speed. In this paper, we concentrate on the issue of designing a distributed algorithm with fast convergence speed for solving the problem (2).
Only a few studies so far have investigated the problem (2) [4,7]. The goal of these works are similar to us. In , the authors presented a distributed primary-dual algorithm for solving the problem (2) based on dual decomposition and first-order methods. However, their work suffers from slow convergence speed. Of particular relevance to our work is , where a distributed Newton-type algorithm has been developed for solving the dynamic NUM model (2). Unlike these works, we use Gaussian belief propagation (GaBP) algorithm  to compute the Newton step and obtain an efficient distributed algorithm for solving the problem (2).
Our proposed solution is a three-step method for addressing the given issue. The first step is to obtain the optimality conditions for solving the problem (2) by introducing slack variables. The first step is similar to that approach used in the primary-dual algorithms for solving the problem (2). However, we do not adopt the primary-dual algorithms to solve the problem, which suffer from slow convergence speed.
The second step is to transform the obtained optimality conditions into an inference problem in a probabilistic graph model describing a certain Gaussian distribution with unknown parameters, which equal to optimal solutions for the problem (2). The method used in the second step transfers an algebraic problem into a probabilistic inference problem, which was first raised in .
The third step is to use the GaBP algorithm to evaluate distributively the parameter values of the Gaussian distribution. Essentially the GaBP algorithm is used to compute the Newton step in a primary-dual interior-point method . This is similar to the work in . However, this work does not consider the delivery contracts and is the special case of our work.
The outline of this paper is as follows. We first discuss the background and related work in Section 2, then present our method in Section 3. Section 4 provides experimental results and a discussion. Finally, Section 5 concludes this paper.

## 2. Background and Related Work

Before we present our idea, we first introduce a basic distributed optimization algorithm  which solves the model (1). Our work belongs to extensions of their works to dynamic model.

#### 2.1. Basic Primary-Dual Algorithm

Low et al. present in  the following basic distributed optimization algorithm for solving the model (1).
The Lagrangian dual function for problem (1) is
$D ( μ ) = max x r ≥ 0 ∑ r ∈ R U r ( x r ) − μ T r ( A x − C ) = max x r ≥ 0 ∑ r ∈ R U r ( x r ) − x r ∑ l ∈ L A l r μ l + μ T r C$
where $μ = ( μ l , l ∈ L )$ is a vector of Lagrange multipliers and $μ T r$ denotes the transpose of the vector $μ$. Here, the second equality follows due to the definition of the matrix A. Thus, the dual problem for primary problem (1) is
$min μ ≥ 0 D ( μ )$
In the dual formulation, Lagrange multiplier $μ l$ can be interpreted as congestion price on link l. A key observation from Equation (3) is that sources can compute their optimal rate individually, based on the total congestion price $∑ l ∈ L A l r μ l$, using the following source rate algorithm
$x r = arg max x r ≥ 0 ∑ r ∈ R U r ( x r ) − x r ∑ l ∈ L A l r μ l$
To solve the dual problem (4), one can use the following projected gradient method
$μ l ( t + 1 ) = μ l ( t ) − α ( t ) ( c l − ∑ l ∈ L A l r x r ) +$
where $α ( t )$ is a positive scalar stepsize, and $a +$ denotes the projection of a onto the set $R +$ of non-negative real numbers.
According to the duality theory  and the assumption that the utility $U r ( x r )$ is increasing, strictly concave and continuously differentiable over the range $x r ≥ 0$, the optimal solutions to both primary problem (1) and dual problem (4) can be found simultaneously by solving iteratively in Equations (5) and (6), respectively. This suggests treating the network links and the sources as processors in a distributed computation system to solve the primary problem (1) and the dual problem (4). The algorithm (5) and (6) is often referred to as the basic primary-dual algorithm. A large number of studies based on NUM framework belong to extensions of the basic primary-dual algorithm, the interested readers along this line please refer to .

#### 2.2. Related Work

In the basic NUM model (1), the utility $U r ( x r )$ of a network user r is defined as function of its data rate $x r$, this means that all utility functions are separable. Due to the characteristic of separability of utility functions, a basic distributed NUM algorithm is derived to maximize aggregate user utility by the dual decomposition theory . Along this way, so many extended NUM models and resultant distributed algorithms have been proposed for network architectural design, cross-layer optimization and resource allocation in wireless as well as wireline networks [14,15,16,17,18,19]. There are some works which studied the extended NUM models with the non-strictly concave or non-concave utility functions such as in [20,21,22,23]. When the utility functions are not strictly concave, the subgradient method is usually used to solve the dual problems instead of the gradient method in the basic primary-dual algorithm. If the utility functions are not concave, the extented duality method  can be used to construct distributed algorithms.
The design of the utility functions depends on applications of NUM problem. NUM-based approaches have been explored in different applications. Based on NUM, Liu et al.  presented a distributed and adaptive solution that jointly computes the data collection rates for each node and finds the schedule transmissions for rechargeable sensor networks. The concept of Water flow Driven Sensor Networks was introduced for leakage and contamination monitoring based on NUM . Sadagopan et al.  use NUM approach constructing an energy balance tree in sensor networks, where each sensor node’s utility depends on the selection of its parent node. There exist some challenges to define utility functions based on performance metrics of different applications. The relationship between performance metrics and utility functions please refer to .
All the above works considered the static NUM models. Dynamic NUM models also belong to the extension of the basic NUM model. In , the authors presented a dynamic NUM model in adversarial environments. Their work focuses on the tradeoff between total queue length and utility regret. However, in this paper, we concentrate on the issue of devising a distributed algorithm to solve a dynamic NUM. In , the authors proposed a dynamic NUM model with time-varying fading channels. However, the utility functions and route matrix in their work are fixed. Moreover, their work focuses on the convergence behavior and tracking errors of the iterative primary-dual scaled gradient algorithm. Parametric network utility maximization model was presented in . If the parameters are regarded as time steps, their model is equivalent to ours. However, their work concentrates on the tracking of algorithm trajectory by using a pathfollowing method on the parametric optimization problem .
The works in [4,7] are particular relevance to this work. The dynamic NUM with delivery contracts was first proposed in , and the authors presented a distributed primary-dual algorithm to solve the problem. The distributed primary-dual algorithm provided in  is based on dual decomposition theory, which is similar to the basic primary-dual algorithm. These primary-dual algorithms usually suffer from the slow rate of convergence [33,34].
The work in  also investigates the same model with ours in this paper. They proposed a distributed Newton method for solving the given problem. Their distributed algorithm obtained fast convergence speed compared with the distributed primary-dual algorithms. the method in  approximates the Newton direction at each iteration by using the matrix splitting technique. However, our method in this paper uses GaBP algorithm to evaluate the Newton direction.
Our proposed algorithm is a kind of primary-dual interior-point method. The primary-dual interior-point method was used for solving the NUM problem in ; however, the proposed algorithm is not decentralized. The work in  provided a distributed algorithm for solving a NUM by using the GaBP algorithm to compute the Newton direction. This is similar to our work. However, their model is static, and our model includes delivery contracts. Their work can be regarded as a special case of our work.

## 3. Our Method

In this section, we develop a three-step method to solve distributively the dynamic NUM problem (2). Before we present our idea, we first introduce some notations to simplify the model (2). let $x = ( x 1 T r , x 2 T r , … , x R T r ) T r$ and $C = ( C 1 T r , C 2 T r , … , C T T r ) T r$ be the rate vector of users and the capacity vector of links respectively, where $a T r$ is the transpose of the vector a. let matrix A denote the corresponding routing matrix for all time steps given as
$A = A 1 0 … 0 0 A 2 … 0 … … … … 0 0 … A T$
Then we can write down the first constraint of the dynamic NUM problem (2) as $A x ≤ C$.
Similarly, let $q = ( q 1 T r , q 2 T r , … , q R T r ) T r$ be the contract quantity vector of users, and the matrix B denote the delivery contract matrix for all users given as
$B = B 1 0 … 0 0 B 2 … 0 … … … … 0 0 … B R$
Thus, the second constraint of the dynamic NUM problem (2) can be written as $B x ≥ q$. Let X be the rate matrix with entries $( x r t , r ∈ R , t ∈ T )$, define $U ( X ) = ∑ t = 1 T ∑ r = 1 R U r t ( x r t )$
Equivalently, we can transform the dynamic NUM problem (2) as following,
$max U ( X ) subject to A x ≤ C B x ≥ q x ≥ 0$
Next, we will present our method to solve the problem (7).

#### 3.1. Optimality Conditions

The Lagrangian associated with the NUM problem (7) is
$L ( x r t ; λ , μ , α ) = U ( X ) − λ T r ( A x − C ) + μ T r ( B x − q ) + α T r x$
where $λ$, $μ$ and $α$ are Lagrange multiplier vectors which are associated the inequality constraints in the NUM problem (7). Therefore, the dual function is given by
$D ( λ , μ , α ) = max x r ≥ 0 L ( x r t ; λ , μ , α )$
Thus, the dual problem for model (7) is given by
$min λ ≥ 0 , μ ≥ 0 , α ≥ 0 D ( λ , μ , α )$
Assume $x ^ r t$, ($λ ^ , μ ^ , α ^$) are the optimal solutions of the primary problem (7) and dual problem (9), according to the Karush-Kuhn-Tucker (KKT) conditions , we can obtain the optimality conditions as following,
$− ∇ U ( X ^ ) + A T r λ ^ − B T r μ ^ − α ^ = 0 diag ( λ ^ ) ( C − A x ^ ) = 0 diag ( μ ^ ) ( B x ^ − q ) ) = 0 diag ( α ^ ) x ^ = 0$
where diag(·) denotes a diagonal matrix formed from its vector argument.

#### 3.2. Inference Problem

We can modify the optimality conditions (10) and apply the primary-dual interior-point method on the modified optimality conditions for solving the primary problem (7) and dual problem (9) in an iterative manner with the given error of the duality gap . The modification is parametrized by a parameter k as ,
$− ∇ U ( X ) + A T r λ − B T r μ − α = 0 diag ( λ ) ( C − A x ) = ( 1 k ) 1 diag ( μ ) ( B x − q ) ) = ( 1 k ) 1 diag ( α ) x = ( 1 k ) 1$
where $k ≥ 0$ is a parameter. We know from (11) that the modified optimality conditions approximate the optimality conditions as $k → ∞$ and different values of k set the different accuracies of the approximation. We can compactly write the modified optimality conditions as following,
$r t ( x , λ , μ , α ) = − ∇ U ( X ) + A T r λ − B T r μ − α diag ( λ ) ( C − A x ) − ( 1 k ) 1 diag ( μ ) ( B x − q ) ) − ( 1 k ) 1 diag ( α ) x − ( 1 k ) 1 = 0$
The search direction of the primary-dual interior-point method is the Newton step for solving the modified optimality conditions $r t ( x , λ , μ , α ) = 0$. If $y = ( x , λ , μ , α ) T r$ is the current point, the Newton step $Δ y = ( Δ x , Δ λ , Δ μ , Δ α ) T r$, then we have,
$r ( y + Δ y ) ≈ r t ( y ) + r t ′ ( y ) Δ y = 0 ,$
where $r t ′ ( y )$ denotes the derivative of $r t ( y )$. The above equation means
$− r t ( x , λ , μ , α ) = − ∇ 2 U ( X ) A T r − B T r − I − diag ( λ ) A diag ( C − A x ) 0 0 diag ( μ ) B 0 diag ( B x − q ) ) 0 diag ( α ) 0 0 diag ( x ) Δ x Δ λ Δ μ Δ α$
Searching the Newton step by Equation (12) is the main computational bottleneck in the primary-dual interior-point method. However, in this paper, we do not directly calculate Newton’s direction by Equation (12). We transform the problem solving the liner Equation (12) into a probabilistic inference which can be computed based on GaBP. We first transfer the matrix in the right side of Equation (12) into a symmetric matrix by multiplying $r t ( x , λ , μ , α )$ a factor $( 1 , − 1 / λ , − 1 / μ , − 1 / α )$ as following,
$− r ^ t ( x , λ , μ , α ) = − ∇ 2 U ( X ) A T r − B T r − I A − diag ( C − A x ) / λ 0 0 − B 0 − diag ( B x − q ) / μ 0 − I 0 0 − diag ( x ) / α Δ x Δ λ Δ μ Δ α = A Δ x Δ λ Δ μ Δ α$
where $r ^ t ( x , λ , μ , α )$ and $A$ are defined as,
$r ^ t ( x , λ , μ , α ) = r t ( x , λ , μ , α ) · ( 1 , − 1 / λ , − 1 / μ , − 1 / α ) T r = − ∇ U ( X ) + A T r λ − B T r μ − α − ( C − A x ) + ( λ k ) 1 − ( B x − q ) ) + ( μ k ) 1 − x + ( α k ) 1$
and
$A = − ∇ 2 U ( X ) A T r − B T r − I A − diag ( C − A x ) / λ 0 0 − B 0 − diag ( B x − q ) / μ 0 − I 0 0 − diag ( x ) / α$
For notational simplicity, let $b = − r ^ t ( x , λ , μ , α )$, $w = ( Δ x , Δ λ , Δ μ , Δ α ) T r$, we can write the Equation (13) as,
$A w = b$
Therefore, the Equation (14) for computing the Newton step is a system of linear equations with a symmetric coefficient matrix, which can be efficiently solved by using the GaBP algorithm . We can define an undirected probabilistic graphical model $G = ( W , E )$, where $W$ is a set of nodes which consist of the variables of linear Equation (14), and $E$ is a set of edges which are determined by the non-zero entries of the coefficient matrix $A$. Given the matrix $A$ and vector b, we can build up a Gaussian density function $p ( w ) ∼$ exp$( − 1 2 w T r A w + b T r w )$, which corresponds to the probabilistic graph $G$. Let $M = T R + T L + ∑ r = 1 R T k r + T R$ be the dimension of the vector b (or vector w) (refer to the original model (2), we know that the number of the objective functions is $T R$, the capacity constraints are $T L$, the constraints of the delivery contracts are $∑ r = 1 R T k r$, and the non-negativity constraints are $T R$. Therefore, the dimension of the vector b is $T R + T L + ∑ r = 1 R T k r + T R$). The probabilistic graph $G$ has edge potentials (or compatibility functions) $ψ$ and self-potentials (or evidence) $ϕ$. These graph potentials are determined by the following pairwise factorization of Gaussian distribution,
$p ( w ) ∝ ∏ i = 1 M ϕ i ( w i ) ∏ { i , j } ψ i j ( w i , w j ) ,$
resulting in $ϕ i ( w i ) ≐$exp$( − 1 2 A i i w i 2 + b i w i )$ and $ψ i j ( w i , w j ) ≐$exp$( − 1 2 w i A i j w j )$. Using this probabilistic graph, we can transform the problem of solving the linear Equation (14) from the algebraic domain to a parameter estimation problem in the domain of probabilistic inference, as stated in the following theorem .
Theorem 1.
The computation of the solution vector $w * = A − 1 b$ is identical to the inference of the vector marginal means $θ ≐ { θ 1 , … , θ R }$ over the graph $G$ with the associated joint Gaussian probability density function $p ( w ) ∼ N ( θ , A − 1 )$.
Proof.
See Appendix A. □
The above theory shows that if we can distributively evaluate the mean of the Gaussian distribution $p ( w )$, then we can use the primary-dual interior-point method to distributively solve the primary problem (7) and dual problem (9). The next section will present the method for solving the mean of the inference problem (15) based on GaBP algorithm.

#### 3.3. Parameter Evaluation Based on GaBP

Belief propagation is a kind of local message-passing algorithm and has been found to have excellent performance in many applications . GaBP is a special case of the belief propagation algorithm, in which the underlying distributions are Gaussian. According to the statements in above section, in order to solve the linear equation problem (14) we need to infer the marginal densities $p ( w i )$, which must also be Gaussian, i.e.,
$p ( w i ) ∼ N ( θ i = { A − 1 b } i , P i − 1 = { A − 1 } i i )$
where $θ i$ and $P i$ are the marginal mean and inverse variance (also known as the precision), respectively. Let $N ( i )$ be the set of all the nodes neighboring the node i (excluding node i). The set $N ( i ) \ j$ includes all the nodes in the set of $N ( i )$ except node j. The following Algorithm 1 provides the GaBP algorithm update rules for inferring the mean $θ i$.
 Algorithm 1: GaBP Algorithm Step 0 Initialization:Set a convergence threshold $ϵ$, $P k i = 0$ and $θ k i = 0 , ∀ k ∈ N ( i )$. Compute $P i i = A i i$ and $θ i i = b i / A i i$.Step 1 Iteration:Propagate the messages $P k i$ and $θ k i , ∀ k ∈ N ( i )$. Compute $P i j = − A i j 2 / ( P i i + ∑ k ∈ N ( i ) \ j P k i ) , θ i j = ( P i i θ i i + ∑ k ∈ N ( i ) \ j P k i θ k i ) / A i j$.Step 2 Convergence check:If the message $P i j$ and $θ i j$ do not converge, return to Step 1, else, go to Step 3.Step 3 Inference:Compute the marginal means $θ i = ( P i i θ i i + ∑ k ∈ N ( i ) P k i θ k i ) / ( ( P i i + ∑ k ∈ N ( i ) P k i ) )$.Step 4 Output:Output the solution $w i = θ i$.

## 4. Experiments and Analysis

#### 4.1. Experimental Settings

In this section, we justify empirically the effectiveness of the proposed algorithm and compare the performance with the classical primary-dual algorithm and the primary-dual interior-point method. The classical primary-dual algorithm is based on dual decomposition. The primary-dual interior-point method used here is an iterative method for solving approximately a Newton system, which is usually called a truncated Newton primary-dual interior-point method .
We consider a network which has 100 flows and 200 links, and all of the utility functions are set to logarithmic functions, i.e., $U r t ( x r t ) = l o g ( x r t )$, which are the most widely used in NUM problems . The network was randomly generated and similar to that used in [34,35], this means that we need generate the link capacities and the routing matrix. The link capacities are chosen independently from a uniform distribution on [0.1, 1] and all of the required minimal flows to be delivered over different time intervals are set to 0.1, and the elements of the routing matrix A are generated randomly and independently, so that the average route length is 6 links. The time index T and all stepsizes are set to 10 and 0.001, respectively. After the network was generated, our proposed algorithm, primary-dual algorithm, and truncated Newton algorithm are performed once on it, respectively. The experimental results and comparisons are provided in the next section.

#### 4.2. Experimental Results

We first evaluate the convergence of the proposed algorithm and compare with the classical primary-dual algorithm, these two algorithms are all distributed. Figure 1 shows the convergence curves of total utilities and Figure 2 provides the duality gap or corresponding residual values between the primary function and dual function.
We also compare the performance of our proposed method with the truncated Newton method which has achieved a very fast convergence speed and very good accuracy for solving nonlinear equation system. However, the truncated Newton method is centralized. Figure 3 and Figure 4 show the convergence curves of total utilities and the duality gap curves versus iteration number for our proposed method and the truncated Newton method, respectively.
Our proposed method uses GaBP algorithm to compute the Newton and the truncated Newton method adopts the preconditioned conjugate gradient (PCG) algorithm  for computing the Newton step. While the above Figure 3 and Figure 4 provide the performance comparison in term of the Newton steps, we also give the comparison of the iteration count in each Newton step for these two algorithms in Table 1 and Table 2. Table 1 and Table 2 are the experimental results for two networks which have 100 flows and 200 links, and 500 flows and 1000 links, respectively.

#### 4.3.1. Analysis and Comparison with a Distributed Method

We will analyze the experimental results in terms of convergence speed and solution accuracy. From Figure 1, we can see that our proposed algorithm and the dual decomposition algorithm can converge to the optimal value Within a certain range of errors. However, the convergence speed of our method is much faster than the dual decomposition algorithm.
The duality gap between the primary function and the dual function depicts the accuracy of the obtained solution. For a convex optimization problem, the primary variables and dual variables will eventually approach the optimal solution as the duality gap tends to zero. Naturally, we expect that our proposed algorithm will obtain a smaller duality gap. From Figure 2, we can see that the duality gap achieved by our proposed algorithm is smaller than that obtained by the dual decomposition method.

#### 4.3.2. Analysis and Comparison with a Centralized Approach

The truncated Newton method is a centralized approach, which is an efficient primary-dual interior-point method and achieves good performance in many optimization problems . We gave the performance comparison for our proposed method and the truncated Newton method in this section.
From Figure 3, we can see that the convergence speed of our proposed method is very fast, which is slightly faster than the truncated Newton method. This means that both methods had comparable convergence speed. However, as specified before, our proposed approach is distributed, while the truncated Newton method is centralized.
Figure 4 provides the accuracy comparison of the solutions obtained by both methods. From Figure 4, we can see that our proposed method has a smaller duality gap than that achieved by the truncated Newton method. This means that although both methods had comparable convergence rate, the accuracy of the solution obtained by our method is better than that achieved by the truncated Newton method.
As these two methods computed the Newton steps based on GaBP and PCG algorithms respectively, we also compared the iteration count for each Newton step. From Table 1 and Table 2, we can see that the iteration count of GaBP algorithm is smaller than that required by the PCG algorithm except for the first few Newton steps. Moreover, the total iteration count of GaBP algorithm is also smaller than that achieved by the PCG algorithm.
Another advantage for GaBP algorithm is that as the Newton step number increases, the iteration count for each Newton step tends to a stable value. However, the iteration count required by PCG algorithm always increases; moreover, the increased magnitudes grows bigger as the scale of the network grows.

## 5. Conclusions

We propose a three-step method for distributively solving network utility maximization with delivery contracts (or dynamic NUM). This paper first obtained the optimality conditions for solving the dynamic NUM problem by dual decomposition theory. Then we transform the problem for searching the Newton step in solving the optimality conditions into a probabilistic inference. Finally, GaBP algorithm was used to compute the probabilistic inference.
NUM problems are usually solved by means of the classical primary-dual algorithm, which is used as the benchmark algorithm for testing the effectiveness of our proposed method. By comparing and analyzing the experimental results, we can reach a conclusion that the proposed method is effective in convergence speed and solution accuracy compared with the classical primary-dual algorithm.
Our proposed method belongs to distributed primary-dual interior-point methods. Therefore, we also compared the performance of our proposed method with the primary-dual interior-point method based on PCG, which had achieved a very fast convergence speed and very good accuracy for solving nonlinear equation problems. The experimental results also validated the effectiveness of our proposed method.

## Author Contributions

S.L. conceived and designed the experiments; J.S. performed the experiments; S.L. and J.S. wrote the paper.

## Funding

This research was funded by the Fundamental Research Funds for the Central Universities(grant number: CCNU19TS021), and the National Key Technology Research Program of the Ministry of Science and Technology of China (grant number: 2015BAK33B02), and National Natural Science Foundation of China (grant number: 61671483, 61072051).

## Conflicts of Interest

The authors declare no conflict of interest.

## Abbreviations

The following abbreviations are used in this manuscript:
 NUM Network utility maximization GaBP Gaussian belief propagation QoS Quality of service KKT Karush-Kuhn-Tucker PCG preconditioned conjugate gradient $( · ) T r$ The transpose of a matrix or a vector

## Appendix A

Proof of Theorem 1.
Solving the linear equation system $A w = b$ is equivalent to maximizing the quadratic form $− 1 2 w T r A w + b T r w$, which is further equivalent to finding the maximal value of the exponential function exp$( − 1 2 w T r A w + b T r w )$.
Next, we can define the joint Gaussian probability density function as follows:
$p ( w ) ≐ Z − 1 exp ( − 1 2 w T r A w + b T r w )$
where $Z$ is a normalization factor to make $p ( w )$ to be a probability distribution. Let $θ ≐ A − 1 b$, we can rewrite the joint Gaussian probability density function as following,
$p ( w ) = Z − 1 exp ( 1 2 θ T r A θ ) exp ( − 1 2 w T r A w + θ T r A w − 1 2 θ T r A θ ) = ξ exp ( − 1 2 ( w − θ ) T r A ( w − θ ) ) = N ( θ , A − 1 )$
where $ξ = Z − 1 exp ( 1 2 θ T r A θ )$ is the new normalization factor after the distribution $p ( w )$ has been rewritten.
From the above Equation (A2), we see that the mean vector of the Gaussian distribution $p ( w )$ defined by Equation (A1) is $θ$, which is equal to the our target solution $w * = A − 1 b$. This proves Theorem 1. □

## References

1. Kelly, F.P.; Maulloo, A.; Tan, D. Rate control for communication networks: Shadow prices, proportional fairness and stability. J. Oper. Res. Soc. Am. 1998, 49, 237–252. [Google Scholar] [CrossRef]
2. Chiang, M.; Low, S.H.; Calderbank, A.R.; Doyle, J.C. Layering as optimization decomposition: A mathematical theory of network architectures. Proc. IEEE 2007, 95, 255–312. [Google Scholar] [CrossRef]
3. Low, S.H.; Lapsley, D.E. Optimal flow control, I: Basic algorithm and convergence. IEEE/ACM Trans. Netw. 1999, 7, 861–874. [Google Scholar] [CrossRef]
4. Trichakis, N.; Zymnis, A.; Boyd, S. Dynamic Network Utility Maximization with delivery contracts. In Proceedings of the IFAC World Congress, Seoul, Korea, 6–11 July 2008; pp. 2907–2912. [Google Scholar]
5. Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: New York, NY, USA, 1999. [Google Scholar]
6. Bertsekas, D.P. Nonlinear Programming, 2nd ed.; Athena Scientific: Nashua, NH, USA, 1999. [Google Scholar]
7. Weiy, E.; Ozdaglary, A.; Eryilmazz, A.; Jadbabaie, A. A Distributed Newton Method for Dynamic Network Utility Maximization with Delivery Contracts. In Proceedings of the 46th Annual Conference on Information Sciences and Systems (CISS 2012), Princeton, NJ, USA, 21–23 March 2012; pp. 1–6. [Google Scholar]
8. Weiss, Y.; Freeman, W.T. Correctness of Belief Propagation in Gaussian Graphical Models of Arbitrary Topology. Neural Comput. 2001, 13, 2173–2200. [Google Scholar] [CrossRef] [PubMed]
9. Danny, B. Gaussian Belief Propagation: Theory and Application. Ph.D. Thesis, The Hebrew University of Jerusalem, Jerusalem, Israel, 2009. [Google Scholar]
10. Wright, S.J. Primal-Dual Interior-Point Methods; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1997. [Google Scholar]
11. Bickson, D.; Tock, Y.; Zymnis, A.; Boyd, S.P.; Dolev, D. Distributed large scale network utility maximization. In Proceedings of the IEEE International Symposium on Information Theory (ISIT 2009), Seoul, Korea, 28 June–3 July 2009; pp. 829–833. [Google Scholar]
12. Boyd, S.; Vandenberghe, L. Convex Optimization; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
13. Pham, Q.V.; Hwang, W.J. Network utility maximization-based congestion control over Wireless Networks: A survey and potential directives. IEEE Commun. Surv. Tutor. 2017, 9, 1173–1200. [Google Scholar] [CrossRef]
14. Shengbin, L.; Jianhua, H. Design and analysis of distributed utility maximization algorithm for multihop wireless network with inaccurate feedback. Int. J. Commun. Syst. 2014, 27, 4280–4299. [Google Scholar]
15. Jan, V. Dynamic Scoring: Probabilistic Model Selection Based on Utility Maximization. Entropy 2019, 21, 36. [Google Scholar] [CrossRef]
16. Im, Y.; Joe-Wong, C.; Ha, S.; Sen, S.; Kwon, T.; Chiang, M. AMUSE: Empowering users for cost-aware offloading with throughput-delay tradeoff. IEEE Trans. Mob. Comput. 2016, 15, 1062–1076. [Google Scholar] [CrossRef]
17. Merayo, N.; Pavon-Marino, P.; Aguado, J.C.; Duran, R.J.; Burrull, F.; Bueno-Delgaado, V. Fair bandwidth allocation algorithm for PONS based on network utility maximization. J. Opt. Commun. Netw. 2017, 9, 75–86. [Google Scholar] [CrossRef]
18. Abhishek, S.; Eytan, M. Network utility maximization with heterogeneous traffic flows. In Proceedings of the 16th International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt 2018), Shanghai, China, 7–11 May 2018; pp. 1–10. [Google Scholar]
19. Shengbin, L.; Qingfu, Z. A multiutility framework with application for studying tradeoff between utility and lifetime in wireless sensor networks. IEEE Trans. Veh. Technol. 2011, 64, 4701–4711. [Google Scholar]
20. Wang, W.; Palaniswami, M.; Low, S.H. Optimal flow control and routing in multi-path net-works. Perform. Eval. 2003, 52, 119–132. [Google Scholar] [CrossRef]
21. Lin, X.; Shroff, N.B. Utility maximization for communication networks withmultipath routing. IEEE Trans. Autom. Contr. 2006, 51, 766–781. [Google Scholar] [CrossRef]
22. Pal, A.; Kant, K. On the Feasibility of Distributed Sampling Rate Adaptation in Heterogeneous and Collaborative Wireless Sensor Networks. In Proceedings of the 25th International Conference on Computer Communication and Networks (ICCCN 2016), Waikoloa, HI, USA, 1–4 August 2016; pp. 1–9. [Google Scholar]
23. Liao, S.; Sun, J.; Chen, Y.; Wang, Y.; Zhang, P. Distributed power control for wireless networks via the alternating direction method of multipliers. J. Netw. Comput. Appl. 2015, 55, 81–88. [Google Scholar] [CrossRef]
24. Chen, Y.; Chen, M. Extended duality for nonlinear programming. Comput. Optim. Appl. 2010, 47, 33–59. [Google Scholar] [CrossRef]
25. Liu, R.; Sinha, P.; Koksal, C.E. Joint energy management and resource allocation inrechargeable sensor networks. In Proceedings of the 29th Conference on Computer Communications (INFOCOM 2010), San Diego, CA, USA, 15–19 March 2010; pp. 902–910. [Google Scholar]
26. Pal, A.; Kant, K. Water flow driven sensor networks for leakage and contamination monitoring. In Proceedings of the IEEE 16th International Symposium on “A World of Wireless, Mobile and Multimedia Networks” (WoWMoM 2015), Boston, MA, USA, 14–17 June 2015; pp. 1–9. [Google Scholar]
27. Sadagopan, N.; Singh, M.; Krishnamachari, B. Decentralized utility-based sensor network design. Mob. Netw. Appl. 2006, 11, 341–350. [Google Scholar] [CrossRef]
28. Zhao, Y.; Mao, S.; Neel, J.; Reed, J. Performance evaluation of cognitive radios: Metrics, utility functions, and methodology. Proc. IEEE 2009, 97, 642–659. [Google Scholar] [CrossRef]
29. Qingkai, L.; Eytan, M. Network Utility Maximization in Adversarial Environments. In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM 2018), Honolulu, HI, USA, 15–19 April 2018; pp. 1–10. [Google Scholar]
30. Junting, C.; Vincent, K.N.L.; Yong, C. Distributive network utility maximization over time-varying fading channels. IEEE Trans. Signal Process. 2011, 59, 2395–2404. [Google Scholar]
31. Lutbat, Y.; Enkhbat, R.; Suk-Hwan, L.; Won-Joo, H. Parametric network utility maximization problem. Optim. Lett. 2014, 8, 889–901. [Google Scholar]
32. Guddat, J.; Guerra, V.F. Parametric Optimization: Singularities, Pathfollowing and Jumps; Wiley: New York, NY, USA, 1990. [Google Scholar]
33. Ehsan, N.; Tansu, A.; Girish, N.N.; Robin, J.E. Convergence analysis of quantized primal-dual algorithms in network utility maximization problems. IEEE Trans. Control Netw. Syst. 2018, 5, 284–297. [Google Scholar]
34. Michael, Z.; Alejandro, R.; Asuman, O.; Ali, J. Accelerated dual descent for network flow optimization. IEEE Trans. Autom. Control 2014, 59, 905–920. [Google Scholar]
35. Zymnis, A.; Trichakis, N.; Boyd, S.; O’Neill, D. An interior-point method for large scale network utility maximization. In Proceedings of the Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 26–28 September 2007; pp. 877–882. [Google Scholar]
36. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann: San Francisco, CA, USA, 1988. [Google Scholar]
37. Kelley, C.T. Iterative Methods for Linear and Nonlinear Equations; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 1995; ISBN 9780898713527. [Google Scholar]
38. Avriel, M. Nonlinear Programming: Analysis and Methods; Dover Publishing: Mineola, NY, USA, 2003; ISBN 0-486-43227-0. [Google Scholar]
39. Bonnans, J.F.; Gilbert, J.C.; Lemaréchal, C.; Sagastizábal, C.A. Numerical Optimization: Theoretical and Practical Aspects; Springer: Berlin, Germany, 2006; ISBN 3-540-35445-X. [Google Scholar]
Figure 1. The convergence comparison of the proposed method and the dual decomposition algorithm. The green curve denotes the convergence speed of total utility for our proposed algorithm which is a distributed Newton method based on GaBP, and the red curve denotes the convergence speed of total utility for the dual decomposition algorithm which is a distributed primary-dual algorithm.
Figure 1. The convergence comparison of the proposed method and the dual decomposition algorithm. The green curve denotes the convergence speed of total utility for our proposed algorithm which is a distributed Newton method based on GaBP, and the red curve denotes the convergence speed of total utility for the dual decomposition algorithm which is a distributed primary-dual algorithm.
Figure 2. The duality gap comparison of the proposed method and the dual decomposition algorithm. The green curve denotes the estimation errors between the primary function and dual function versus iteration number for our proposed method, and the red curve denotes the estimation errors between the primary function and dual function versus iteration number for the dual decomposition algorithm.
Figure 2. The duality gap comparison of the proposed method and the dual decomposition algorithm. The green curve denotes the estimation errors between the primary function and dual function versus iteration number for our proposed method, and the red curve denotes the estimation errors between the primary function and dual function versus iteration number for the dual decomposition algorithm.
Figure 3. The convergence comparison of the proposed method and the truncated Newton method. The green curve denotes the convergence speed of total utility for our proposed algorithm which is a distributed Newton method based on GaBP, and the blue curve denotes the convergence speed of total utility for the truncated Newton method which is a centralized algorithm.
Figure 3. The convergence comparison of the proposed method and the truncated Newton method. The green curve denotes the convergence speed of total utility for our proposed algorithm which is a distributed Newton method based on GaBP, and the blue curve denotes the convergence speed of total utility for the truncated Newton method which is a centralized algorithm.
Figure 4. The duality gap comparison of the proposed method and the truncated Newton method. The green curve denotes the estimation errors between the primary function and dual function versus iteration number for our proposed method, and the blue curve denotes the estimation errors between the primary function and dual function versus iteration number for the truncated Newton algorithm.
Figure 4. The duality gap comparison of the proposed method and the truncated Newton method. The green curve denotes the estimation errors between the primary function and dual function versus iteration number for our proposed method, and the blue curve denotes the estimation errors between the primary function and dual function versus iteration number for the truncated Newton algorithm.
Table 1. Experimental results of the iteration count for each Newton step in a small network.
Table 1. Experimental results of the iteration count for each Newton step in a small network.
Newton Step NumberGaBPPCG
163
262
362
475
599
61012
71213
81522
91429
101334
11 43
total98174
Table 2. Experimental results of the iteration count for each Newton step in a bigger network.
Table 2. Experimental results of the iteration count for each Newton step in a bigger network.
Newton Step NumberGaBPPCG
162
254
358
463
5716
6720
7736
8764
9 101
total50253