Gaussian Belief Propagation for Solving Network Utility Maximization with Delivery Contracts

Classical network utility maximization (NUM) models fail to capture network dynamics, which are of increasing importance for modeling network behaviors. In this paper, we consider the NUM with delivery contracts, which are constraints to the classical model to describe network dynamics. This paper investigates a method to distributively solve the given problem. We first transform the problem into an equivalent model of linear equations by dual decomposition theory, and then use Gaussian belief propagation algorithm to solve the equivalent issue distributively. The proposed algorithm has faster convergence speed than the existing first-order methods and distributed Newton method. Experimental results have demonstrated the effectiveness of our proposed approach.


Introduction
Since the publication of the seminal paper [1] by Kelly et al., the framework of network utility maximization (NUM) has received a great deal of interest in the past two decades, which has been developed into a mathematical theory of network architectures [2]. Many important network design and resource allocation problems can be formulated as a NUM model. The utility concept, originally proposed in economics, is used to measure the satisfaction degree of a consumer for a good or service. In the basic NUM model, the utility of a network user is defined as a function of its data rate. The goal of network system is designed to maximize the overall utility of all the users in the network.
Consider a network with L links and R users, where each link l has a capacity of c l bps. Let L be the set of all links and a route r is a non-empty subset of L, let R be the set of possible routes, and associate a route r with a user r (or a data source r), i.e., L = {1, 2, . . . , L} and R = {1, 2, . . . , R}. Set A lr = 1 if l ∈ r, so that the link l lies on route r, and set A lr = 0 otherwise. This defines a 0-1 routing matrix A = (A lr , l ∈ L, r ∈ R).
Suppose that if a rate x r is allocated to user r then this has utility U r (x r ) to the user. Assume that the utility U r (x r ) is increasing, strictly concave, and continuously differentiable over the range x r ≥ 0. Let U = (U r (x r ), r ∈ R) and C = (c l , l ∈ L). Under this model, the network seeks a rate allocation x = (x r , r ∈ R) which solves the following optimization problem [3].
However, the basic NUM model (1) does not consider the network dynamics such as time-varying link capacities and user demands for quality of service (QoS). In this paper, we investigate a dynamic NUM model with QoS constraints, i.e., the model is time-varying over time t, which takes values in the set of time indices T = {= 1, 2, . . . , T}. The model was first introduced in [4] as following max ∑ T t=1 ∑ R r=1 U t r (x t r ) subject to A t x t ≤ C t ∀t ∈ T B r x r ≥ q r ∀r ∈ R x t ≥ 0 ∀t ∈ T (2) where x t r denotes the source rate for user r at time step t, x t = (x t r , r ∈ R) is the rate vector of all users at time step t and x r = (x t r , t ∈ T ) is the source rate allocation for user r. The second constraint in model (2) is the QoS constraints or delivery contracts. For each user, a delivery contract is the required minimal flow to be delivered over some particular time interval. Assume user r has k r delivery contracts and q r ∈ R k r is the associated contract quantity amounts. A contract is active at time index t if t is in the time interval of the contract. We can define a 0-1 matrix B r ∈ R k r ×T to represent the delivery contract indicator matrix by setting the (B r ) kt = 1 if the kth contract of user r is active at time t, and setting (B r ) kt = 0 otherwise. Thus, the QoS constraints that all delivery contracts are met can be given by B r x r ≥ q r ∀r ∈ R, i.e., the second constraint in model (2).
In the dynamic NUM model (2), utility function U t r , route matrix A t and link capacity C t are all dependent on the time index t, this means they are all possibly time-varying. There are many different ways to solve the problem (2), such as interior-point methods [5] and primary-dual algorithms [6]. The interior-point methods are efficient for solving the problem (2); however, they are centralized algorithms. The primary-dual algorithms are decentralized, but they suffer from slow convergence speed. In this paper, we concentrate on the issue of designing a distributed algorithm with fast convergence speed for solving the problem (2).
Only a few studies so far have investigated the problem (2) [4,7]. The goal of these works are similar to us. In [4], the authors presented a distributed primary-dual algorithm for solving the problem (2) based on dual decomposition and first-order methods. However, their work suffers from slow convergence speed. Of particular relevance to our work is [7], where a distributed Newton-type algorithm has been developed for solving the dynamic NUM model (2). Unlike these works, we use Gaussian belief propagation (GaBP) algorithm [8] to compute the Newton step and obtain an efficient distributed algorithm for solving the problem (2).
Our proposed solution is a three-step method for addressing the given issue. The first step is to obtain the optimality conditions for solving the problem (2) by introducing slack variables. The first step is similar to that approach used in the primary-dual algorithms for solving the problem (2). However, we do not adopt the primary-dual algorithms to solve the problem, which suffer from slow convergence speed.
The second step is to transform the obtained optimality conditions into an inference problem in a probabilistic graph model describing a certain Gaussian distribution with unknown parameters, which equal to optimal solutions for the problem (2). The method used in the second step transfers an algebraic problem into a probabilistic inference problem, which was first raised in [9].
The third step is to use the GaBP algorithm to evaluate distributively the parameter values of the Gaussian distribution. Essentially the GaBP algorithm is used to compute the Newton step in a primary-dual interior-point method [10]. This is similar to the work in [11]. However, this work does not consider the delivery contracts and is the special case of our work.
The outline of this paper is as follows. We first discuss the background and related work in Section 2, then present our method in Section 3. Section 4 provides experimental results and a discussion. Finally, Section 5 concludes this paper.

Background and Related Work
Before we present our idea, we first introduce a basic distributed optimization algorithm [3] which solves the model (1). Our work belongs to extensions of their works to dynamic model.

Basic Primary-Dual Algorithm
Low et al. present in [3] the following basic distributed optimization algorithm for solving the model (1).
The Lagrangian dual function for problem (1) is where µ = (µ l , l ∈ L) is a vector of Lagrange multipliers and µ Tr denotes the transpose of the vector µ.
Here, the second equality follows due to the definition of the matrix A. Thus, the dual problem for primary problem (1) is min In the dual formulation, Lagrange multiplier µ l can be interpreted as congestion price on link l. A key observation from Equation (3) is that sources can compute their optimal rate individually, based on the total congestion price ∑ l∈L A lr µ l , using the following source rate algorithm To solve the dual problem (4), one can use the following projected gradient method where α(t) is a positive scalar stepsize, and [a] + denotes the projection of a onto the set R + of non-negative real numbers. According to the duality theory [12] and the assumption that the utility U r (x r ) is increasing, strictly concave and continuously differentiable over the range x r ≥ 0, the optimal solutions to both primary problem (1) and dual problem (4) can be found simultaneously by solving iteratively in Equations (5) and (6), respectively. This suggests treating the network links and the sources as processors in a distributed computation system to solve the primary problem (1) and the dual problem (4). The algorithm (5) and (6) is often referred to as the basic primary-dual algorithm. A large number of studies based on NUM framework belong to extensions of the basic primary-dual algorithm, the interested readers along this line please refer to [13].

Related Work
In the basic NUM model (1), the utility U r (x r ) of a network user r is defined as function of its data rate x r , this means that all utility functions are separable. Due to the characteristic of separability of utility functions, a basic distributed NUM algorithm is derived to maximize aggregate user utility by the dual decomposition theory [12]. Along this way, so many extended NUM models and resultant distributed algorithms have been proposed for network architectural design, cross-layer optimization and resource allocation in wireless as well as wireline networks [14][15][16][17][18][19]. There are some works which studied the extended NUM models with the non-strictly concave or non-concave utility functions such as in [20][21][22][23]. When the utility functions are not strictly concave, the subgradient method is usually used to solve the dual problems instead of the gradient method in the basic primary-dual algorithm. If the utility functions are not concave, the extented duality method [24] can be used to construct distributed algorithms.
The design of the utility functions depends on applications of NUM problem. NUM-based approaches have been explored in different applications. Based on NUM, Liu et al. [25] presented a distributed and adaptive solution that jointly computes the data collection rates for each node and finds the schedule transmissions for rechargeable sensor networks. The concept of Water flow Driven Sensor Networks was introduced for leakage and contamination monitoring based on NUM [26]. Sadagopan et al. [27] use NUM approach constructing an energy balance tree in sensor networks, where each sensor node's utility depends on the selection of its parent node. There exist some challenges to define utility functions based on performance metrics of different applications. The relationship between performance metrics and utility functions please refer to [28].
All the above works considered the static NUM models. Dynamic NUM models also belong to the extension of the basic NUM model. In [29], the authors presented a dynamic NUM model in adversarial environments. Their work focuses on the tradeoff between total queue length and utility regret. However, in this paper, we concentrate on the issue of devising a distributed algorithm to solve a dynamic NUM. In [30], the authors proposed a dynamic NUM model with time-varying fading channels. However, the utility functions and route matrix in their work are fixed. Moreover, their work focuses on the convergence behavior and tracking errors of the iterative primary-dual scaled gradient algorithm. Parametric network utility maximization model was presented in [31]. If the parameters are regarded as time steps, their model is equivalent to ours. However, their work concentrates on the tracking of algorithm trajectory by using a pathfollowing method on the parametric optimization problem [32].
The works in [4,7] are particular relevance to this work. The dynamic NUM with delivery contracts was first proposed in [4], and the authors presented a distributed primary-dual algorithm to solve the problem. The distributed primary-dual algorithm provided in [4] is based on dual decomposition theory, which is similar to the basic primary-dual algorithm. These primary-dual algorithms usually suffer from the slow rate of convergence [33,34].
The work in [7] also investigates the same model with ours in this paper. They proposed a distributed Newton method for solving the given problem. Their distributed algorithm obtained fast convergence speed compared with the distributed primary-dual algorithms. the method in [7] approximates the Newton direction at each iteration by using the matrix splitting technique. However, our method in this paper uses GaBP algorithm to evaluate the Newton direction.
Our proposed algorithm is a kind of primary-dual interior-point method. The primary-dual interior-point method was used for solving the NUM problem in [35]; however, the proposed algorithm is not decentralized. The work in [11] provided a distributed algorithm for solving a NUM by using the GaBP algorithm to compute the Newton direction. This is similar to our work. However, their model is static, and our model includes delivery contracts. Their work can be regarded as a special case of our work.

Our Method
In this section, we develop a three-step method to solve distributively the dynamic NUM problem (2). Before we present our idea, we first introduce some notations to simplify the model (2). let x = (x Tr 1 , x Tr 2 , . . . , x Tr R ) Tr and C = (C Tr 1 , C Tr 2 , . . . , C Tr T ) Tr be the rate vector of users and the capacity vector of links respectively, where a Tr is the transpose of the vector a. let matrix A denote the corresponding routing matrix for all time steps given as Then we can write down the first constraint of the dynamic NUM problem (2) as Ax ≤ C.
Similarly, let q = (q Tr 1 , q Tr 2 , . . . , q Tr R ) Tr be the contract quantity vector of users, and the matrix B denote the delivery contract matrix for all users given as Thus, the second constraint of the dynamic NUM problem (2) can be written as Bx ≥ q. Let X be the rate matrix with entries (x t r , r ∈ R, t ∈ T ), define U(X) = ∑ T t=1 ∑ R r=1 U t r (x t r ) Equivalently, we can transform the dynamic NUM problem (2) as following, Next, we will present our method to solve the problem (7).

Optimality Conditions
The Lagrangian associated with the NUM problem (7) is where λ, µ and α are Lagrange multiplier vectors which are associated the inequality constraints in the NUM problem (7). Therefore, the dual function is given by L(x t r ; λ, µ, α) Thus, the dual problem for model (7) is given by min λ≥0,µ≥0,α≥0 Assumex t r , (λ,μ,α) are the optimal solutions of the primary problem (7) and dual problem (9), according to the Karush-Kuhn-Tucker (KKT) conditions [12], we can obtain the optimality conditions as following, where diag(·) denotes a diagonal matrix formed from its vector argument.

Inference Problem
We can modify the optimality conditions (10) and apply the primary-dual interior-point method on the modified optimality conditions for solving the primary problem (7) and dual problem (9) in an iterative manner with the given error of the duality gap [12]. The modification is parametrized by a parameter k as [35], −∇U(X) where k ≥ 0 is a parameter. We know from (11) that the modified optimality conditions approximate the optimality conditions as k → ∞ and different values of k set the different accuracies of the approximation. We can compactly write the modified optimality conditions as following, The search direction of the primary-dual interior-point method is the Newton step for solving the modified optimality conditions r t (x, λ, µ, α) = 0. If y = (x, λ, µ, α) Tr is the current point, the Newton step ∆y = (∆x, ∆λ, ∆µ, ∆α) Tr , then we have, r(y + ∆y) ≈ r t (y) + r t (y)∆y = 0, where r t (y) denotes the derivative of r t (y). The above equation means Searching the Newton step by Equation (12) is the main computational bottleneck in the primary-dual interior-point method. However, in this paper, we do not directly calculate Newton's direction by Equation (12). We transform the problem solving the liner Equation (12) into a probabilistic inference which can be computed based on GaBP. We first transfer the matrix in the right side of Equation (12) into a symmetric matrix by multiplying r t (x, λ, µ, α) a factor (1, −1/λ, −1/µ, −1/α) as following, wherer t (x, λ, µ, α) and A are defined as, For notational simplicity, let b = −r t (x, λ, µ, α), w = (∆x, ∆λ, ∆µ, ∆α) Tr , we can write the Equation (13) Therefore, the Equation (14) for computing the Newton step is a system of linear equations with a symmetric coefficient matrix, which can be efficiently solved by using the GaBP algorithm [9]. We can define an undirected probabilistic graphical model G = (W, E ), where W is a set of nodes which consist of the variables of linear Equation (14), and E is a set of edges which are determined by the non-zero entries of the coefficient matrix A. Given the matrix A and vector b, we can build up a Gaussian density function p(w) ∼ exp(− 1 2 w Tr Aw + b Tr w), which corresponds to the probabilistic graph G. Let M = TR + TL + R ∑ r=1 Tk r + TR be the dimension of the vector b (or vector w) (refer to the original model (2), we know that the number of the objective functions is TR, the capacity constraints are TL, the constraints of the delivery contracts are R ∑ r=1 Tk r , and the non-negativity constraints are TR. Therefore, the dimension of the vector b is TR + TL + R ∑ r=1 Tk r + TR.). The probabilistic graph G has edge potentials (or compatibility functions) ψ and self-potentials (or evidence) φ. These graph potentials are determined by the following pairwise factorization of Gaussian distribution, . =exp(− 1 2 w i A ij w j ). Using this probabilistic graph, we can transform the problem of solving the linear Equation (14) from the algebraic domain to a parameter estimation problem in the domain of probabilistic inference, as stated in the following theorem [9]. = {θ 1 , . . . , θ R } over the graph G with the associated joint Gaussian probability density function p(w) ∼ N (θ, A −1 ).

Proof. See Appendix A.
The above theory shows that if we can distributively evaluate the mean of the Gaussian distribution p(w), then we can use the primary-dual interior-point method to distributively solve the primary problem (7) and dual problem (9). The next section will present the method for solving the mean of the inference problem (15) based on GaBP algorithm.

Parameter Evaluation Based on GaBP
Belief propagation is a kind of local message-passing algorithm and has been found to have excellent performance in many applications [36]. GaBP is a special case of the belief propagation algorithm, in which the underlying distributions are Gaussian. According to the statements in above section, in order to solve the linear equation problem (14) we need to infer the marginal densities p(w i ), which must also be Gaussian, i.e., where θ i and P i are the marginal mean and inverse variance (also known as the precision), respectively. Let N(i) be the set of all the nodes neighboring the node i (excluding node i). The set N(i) \ j includes all the nodes in the set of N(i) except node j. The following Algorithm 1 provides the GaBP algorithm update rules for inferring the mean θ i .

•
Step 0 Initialization: Set a convergence threshold , P ki = 0 and θ ki = 0, ∀k ∈ N(i). Compute P ii = A ii and Step 1 Iteration: Propagate the messages P ki and θ ki , ∀k ∈ N(i).

•
Step 2 Convergence check: If the message P ij and θ ij do not converge, return to Step 1, else, go to Step 3.

Experimental Settings
In this section, we justify empirically the effectiveness of the proposed algorithm and compare the performance with the classical primary-dual algorithm and the primary-dual interior-point method. The classical primary-dual algorithm is based on dual decomposition. The primary-dual interior-point method used here is an iterative method for solving approximately a Newton system, which is usually called a truncated Newton primary-dual interior-point method [37].
We consider a network which has 100 flows and 200 links, and all of the utility functions are set to logarithmic functions, i.e., U t r (x t r ) = log(x t r ), which are the most widely used in NUM problems [2]. The network was randomly generated and similar to that used in [34,35], this means that we need generate the link capacities and the routing matrix. The link capacities are chosen independently from a uniform distribution on [0.1, 1] and all of the required minimal flows to be delivered over different time intervals are set to 0.1, and the elements of the routing matrix A are generated randomly and independently, so that the average route length is 6 links. The time index T and all stepsizes are set to 10 and 0.001, respectively. After the network was generated, our proposed algorithm, primary-dual algorithm, and truncated Newton algorithm are performed once on it, respectively. The experimental results and comparisons are provided in the next section.

Experimental Results
We first evaluate the convergence of the proposed algorithm and compare with the classical primary-dual algorithm, these two algorithms are all distributed. Figure 1 shows the convergence curves of total utilities and Figure 2 provides the duality gap or corresponding residual values between the primary function and dual function. The green curve denotes the convergence speed of total utility for our proposed algorithm which is a distributed Newton method based on GaBP, and the red curve denotes the convergence speed of total utility for the dual decomposition algorithm which is a distributed primary-dual algorithm. We also compare the performance of our proposed method with the truncated Newton method which has achieved a very fast convergence speed and very good accuracy for solving nonlinear equation system. However, the truncated Newton method is centralized. Figures 3 and 4 show the convergence curves of total utilities and the duality gap curves versus iteration number for our proposed method and the truncated Newton method, respectively.  The green curve denotes the convergence speed of total utility for our proposed algorithm which is a distributed Newton method based on GaBP, and the blue curve denotes the convergence speed of total utility for the truncated Newton method which is a centralized algorithm. Our proposed method uses GaBP algorithm to compute the Newton and the truncated Newton method adopts the preconditioned conjugate gradient (PCG) algorithm [38] for computing the Newton step. While the above Figures 3 and 4 provide the performance comparison in term of the Newton steps, we also give the comparison of the iteration count in each Newton step for these two algorithms in Tables 1 and 2. Tables 1 and 2 are the experimental results for two networks which have 100 flows and 200 links, and 500 flows and 1000 links, respectively. Table 1. Experimental results of the iteration count for each Newton step in a small network. 1  6  3  2  6  2  3  6  2  4  7  5  5  9  9  6  10  12  7  12  13  8  15  22  9  14  29  10  13  34  11  43 total 98 174

Analysis and Comparison with a Distributed Method
We will analyze the experimental results in terms of convergence speed and solution accuracy. From Figure 1, we can see that our proposed algorithm and the dual decomposition algorithm can converge to the optimal value Within a certain range of errors. However, the convergence speed of our method is much faster than the dual decomposition algorithm.
The duality gap between the primary function and the dual function depicts the accuracy of the obtained solution. For a convex optimization problem, the primary variables and dual variables will eventually approach the optimal solution as the duality gap tends to zero. Naturally, we expect that our proposed algorithm will obtain a smaller duality gap. From Figure 2, we can see that the duality gap achieved by our proposed algorithm is smaller than that obtained by the dual decomposition method.

Analysis and Comparison with a Centralized Approach
The truncated Newton method is a centralized approach, which is an efficient primary-dual interior-point method and achieves good performance in many optimization problems [39]. We gave the performance comparison for our proposed method and the truncated Newton method in this section.
From Figure 3, we can see that the convergence speed of our proposed method is very fast, which is slightly faster than the truncated Newton method. This means that both methods had comparable convergence speed. However, as specified before, our proposed approach is distributed, while the truncated Newton method is centralized. Figure 4 provides the accuracy comparison of the solutions obtained by both methods. From Figure 4, we can see that our proposed method has a smaller duality gap than that achieved by the truncated Newton method. This means that although both methods had comparable convergence rate, the accuracy of the solution obtained by our method is better than that achieved by the truncated Newton method.
As these two methods computed the Newton steps based on GaBP and PCG algorithms respectively, we also compared the iteration count for each Newton step. From Tables 1 and 2, we can see that the iteration count of GaBP algorithm is smaller than that required by the PCG algorithm except for the first few Newton steps. Moreover, the total iteration count of GaBP algorithm is also smaller than that achieved by the PCG algorithm.
Another advantage for GaBP algorithm is that as the Newton step number increases, the iteration count for each Newton step tends to a stable value. However, the iteration count required by PCG algorithm always increases; moreover, the increased magnitudes grows bigger as the scale of the network grows.

Conclusions
We propose a three-step method for distributively solving network utility maximization with delivery contracts (or dynamic NUM). This paper first obtained the optimality conditions for solving the dynamic NUM problem by dual decomposition theory. Then we transform the problem for searching the Newton step in solving the optimality conditions into a probabilistic inference. Finally, GaBP algorithm was used to compute the probabilistic inference.
NUM problems are usually solved by means of the classical primary-dual algorithm, which is used as the benchmark algorithm for testing the effectiveness of our proposed method. By comparing and analyzing the experimental results, we can reach a conclusion that the proposed method is effective in convergence speed and solution accuracy compared with the classical primary-dual algorithm.
Our proposed method belongs to distributed primary-dual interior-point methods. Therefore, we also compared the performance of our proposed method with the primary-dual interior-point method based on PCG, which had achieved a very fast convergence speed and very good accuracy for solving nonlinear equation problems. The experimental results also validated the effectiveness of our proposed method.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: NUM Network utility maximization GaBP Gaussian belief propagation QoS Quality of service KKT Karush-Kuhn-Tucker PCG preconditioned conjugate gradient (·) Tr The transpose of a matrix or a vector