Efficient Algorithm for the Computation of the Solution to a Sparse Matrix Equation in Distributed Control Theory

In this short communication, an algorithm for efficiently solving a sparse matrix equation, which arises frequently in the field of distributed control and estimation theory, is proposed. The efficient algorithm stems from the fact that the sparse equation at hand can be reduced to a system of linear equations. The proposed algorithm is shown to require significantly fewer floating point operations than the state-of-the-art solution. The proposed solution is applied to a real-life example, which models a wide range of industrial processes. The experimental results show that the solution put forward allows for a significant increase in efficiency in relation to the state-of-the-art solution. The significant increase in efficiency of the presented algorithm allows for a valuable widening of the applications of distributed estimation and control.


Notation
The n × o null matrix and the null vector of dimension n are denoted by 0 n×o and 0 n , respectively. The i-th component of a vector v is denoted by [v] i , and the entry (i, j) of a matrix A is denoted by [A] i,j . The vectorization of a matrix A, denoted herein by vec(A), returns a vector composed of the concatenated columns of A. The Kronecker product of two matrices A and B is denoted by A ⊗ B. The cardinality of a set χ is denoted by |χ|.

Introduction
Sparse matrix equations arise in various real-life problems in a broad range of engineering fields. Thus, given their impact, they have been extensively studied for a long time [1]. In recent developments in the field of distributed estimation and control theory, a sparse matrix equation has arisen, for which, to the best of the knowledge of the authors, an efficient algorithm for its solution has not yet been proposed.
Consider the matrix equation where A ∈ R n×n , B ∈ R o×o , and C ∈ R n×o are known and X ∈ R n×o is the unknown. Let E ∈ R n×o represent a known sparsity pattern, and define the set χ of integer pairs of the form (i, j) to index the nonzero entries of E as Equation (1) arises frequently in the field of distributed control and estimation theory. In fact, it is the backbone of the state-of-the-art methods for distributed filter design [2] and controller synthesis [3] for large-scale dynamical systems. However, the solution to (1), presented in these papers, relies on an inefficient closed-form matrix computation. Define a matrix Z, such that the vector Zvec(E) contains all the nonzero elements of E. The solution to (1) presented in the literature is given by as proven in ( [2], Appendix B). Note that (3) requires the multiplication of a matrix of dimension |χ| × no by other dimension no × no, which requires O(|χ|(no) 2 ) floating point operations. The number of practical applications, where it is convenient to model the interconnection of complex systems as a whole network, has been steadily increasing [4]. They have emerged in a broad range of engineering fields such as irrigation networks [5,6], power systems networks [7,8], traffic networks [9][10][11][12], and industrial processes [13]. Nevertheless, given the substantial increase in memory and processing power needed to control such systems, the classical control solutions cannot be implemented in practice. That is why, over the past decades, an effort has been made towards the development of distributed and computationally efficient algorithms to handle such systems. In this short communication, an algorithm to obtain an efficient and exact solution to (1) is proposed, which significantly increases the applicability of distributed methods that address the control and estimation problems for large-scale dynamical systems.

Materials and Methods
The following result is the basis for the algorithm proposed in this short communication. Theorem 1. The system of Equation (1) can be reduced to a system of linear equations of dimension |χ|, where S ∈ R |χ|×|χ| , r ∈ R |χ| , and x ∈ R |χ| is the vector of the ordered nonzero entries of vec(X).
The LU decomposition of S can be carried out with O(|χ| 3 ) floating point operations, using Gaussian elimination. The type of solution is then assessed according to Corollary 1. If the solution is unique, it follows from Theorem 1 that it is given by (9), which is implemented in Algorithm 1.

Algorithm 1 Algorithm to efficiently compute the exact solution to (1)
Input: A ∈ R n×n , B ∈ R o×o , C ∈ R n×o , and E ∈ R n×o Output: Solution to (1), X ∈ R n×o , if one unique solutions exists. Compute set χ according to (2); // LU decomposition Assess type of solution according to Corollary 1; if Sx = r has a unique solution then Flag type of solution; end Remark 1. It is important to point out that each of the three main steps of Algorithm 1 can be carried out in parallel. The first and last steps for loops of Algorithm 1 (which are commented on) can be carried out in parallel, for each j. In this case, the variable p has to be initialized, for each j, with the number of nonzero entries of E in the first j − 1 columns. Moreover, the LU decomposition can also be carried out in parallel [14,15].

Remark 2.
There are asymptotically faster algorithms for computing the LU decomposition of a square matrix. For instance, the Strassen algorithm [16] can perform the LU decomposition of S in O(|χ| log 2 7 ) floating point operations. Moreover, the Coppersmith-Winograd algorithm [17]

Remark 3.
In the field of distributed estimation and control theory, |χ| is the sum of the number of sensor output signals available to each node of the network, and n is the order of the state-space dynamics of the whole system network. Given that, for most applications, each node has a similar order and a similar number of sensor output signals available, then |χ| ≈ cn, where c ∈ N is a constant. The system detailed in Section 4 illustrates this claim. It, thus, follows that the ratio of the number of floating points operations required by the proposed algorithm and the number required by the state-of-the-art methods is O(o −2 ).

Results
In this section, the solution to the sparse matrix equation (1) is used to design a distributed filter for a large-scale network of N tanks. The computational efficiency of the synthesis with the proposed solution is compared to the state-of-the-art synthesis procedure. This network is widely studied [3,19,20] because it is analogous to a broad range of industrial processes. Consider N interconnected tanks, as shown in Figure 1, where N is an even integer. The water level of tank i is denoted by h i . The network is actuated by N/2 pumps, which are controlled by the lower tanks, whose inputs are denoted by u i for i = 1, . . ., N/2, in accordance with the schematic. Each pump is connected to a three-way valve that regulates the fraction of the flow, held constant, that goes to each of the tanks supplied by the pump. Each tank has a sensor, which measures its water level. Making use of mass balances and Bernoulli's law, the system dynamics, in the absence of noise, are given by where A i and a i are the cross sections of tank i and of its outlet hole, respectively; the constant γ i represents the fraction of the flow that passes through the valve i to the lower tanks; k i is the constant of proportionality between the mass flow and the input of pump i; and g denotes the acceleration of gravity. The values of the constant parameters of the network are presented in Table 1. The dynamics of the network are nonlinear, thus the system is linearized about its operational point, which is selected to be h(i) = 15 cm, i = 1, . . . , N. The distributed filter is synthesized with the finite-horizon method ( [2], Section 5), which is shown to have better state estimation performance in relation to other distributed filter synthesis methods. Each iteration k of this method computes a new gain, K(k) ∈ R n×o , solving where Λ(k + 1) ∈ R n×n , S(k) ∈ R o×o , and P(k|k − 1) ∈ R n×n are defined in [2], with n = o = N. Moreover, C = I N is the output matrix of the network and, given the configuration of the network, the sparsity pattern that defines the set χ is given by E = I N .
Note that (11) has the same form as (1), for which a novel efficient solution is proposed in this short communication.

Constant Value
The goal is to compare the computational efficiency of the gain synthesis using the solution to (11) proposed in this short communication and compare it to the state-of-the-art solution (3). In that sense, the gain synthesis is performed for both alternatives, for a range of N. Figure 2 depicts the elapsed wall-clock time for the distributed filter synthesis, resulting from the average of three simulations of a MATLAB implementation on a single thread of a server with 24GB RAM and 24 Intel Xeon hexa-core CPUs at 2.40GHz. A MATLAB implementation of the distributed filter synthesis, of the application to the N tanks network, and of Algorithm 1, can be found in the DECENTER toolbox available at https://decenter2021.github.io (accessed on 23 June 2021). Figure 2 also depicts the linear regression in the logarithmic scale of the asymptotic evolution, which is considered to be achieved for N > 40. First, if the solution of (11) is the bottleneck of the gain synthesis procedure, one would expect that the synthesis wall-clock time grows asymptotically with O(N 5 ) for the state-of-the-art solution, and O(N 3 ) for the novel solution proposed in this short communication. It is possible to conclude from Figure 2 that this is, in fact, the case. The experimental complexity is slightly lower than what was projected, since MATLAB makes use of efficient LU decomposition algorithms like the ones pointed out in Remark 2. Second, it is also possible to confirm that the use of the solution proposed in this short communication leads to a ratio of wall-clock synthesis time between the proposed solution and the state-of-the-art solution that evolves with O(N −2.035 ). This improvement significantly widens the applicability of various distributed methods to large-scale systems.

Conclusions
In this short communication, an algorithm for efficiently solving the sparse matrix equation (1), which arises frequently in the field of distributed control and estimation theory, is proposed. The state-of-the-art methods for distributed filter design and controller synthesis rely on a very inefficient computation of the solution to (1), requiring O(|χ|(no) 2 ) floating point operations. In this letter, an algorithm is proposed to compute the solution to this equation with O(|χ| 3 ) floating point operations. For distributed estimation and control theory applications, the ratio of the number of floating points operations required by the proposed algorithm and the number required by the state-of-the-art methods is O(o −2 ). Furthermore, the algorithm proposed in this short communication was applied to an example representative of real-life large-scale industrial processes, and was compared to the state-of-the-art solution. It is shown that the solution put forward allows for a decrease of the computational complexity of the gain synthesis from O(N 4.805 ), which is obtained for the state-of-the-art solution, to O(N 2.770 ), where N is the dimension of the network. The significantly greater efficiency achieved with the proposed algorithm allows for the state-of-the-art distributed control and estimation algorithms to be implemented in an online framework and for large-scale dynamical systems, which would otherwise be unfeasible.

Conflicts of Interest:
The authors declare no conflict of interest.