Graph Theory for Modeling and Analysis of the Human Lymphatic System

: The human lymphatic system (HLS) is a complex network of lymphatic organs linked through the lymphatic vessels. We present a graph theory-based approach to model and analyze the human lymphatic network. Two different methods of building a graph are considered: the method using anatomical data directly and the method based on a system of rules derived from structural analysis of HLS. A simple anatomical data-based graph is converted to an oriented graph by quantifying the steady-state ﬂuid balance in the lymphatic network with the use of the Poiseuille equation in vessels and the mass conservation at vessel junctions. A computational algorithm for the generation of the rule-based random graph is developed and implemented. Some fundamental characteristics of the two types of HLS graph models are analyzed using different metrics such as graph energy, clustering, robustness, etc.


Introduction
The human lymphatic system (HLS) is a complex network of lymphatic organs linked through the lymphatic vessels. Its structure and functioning are critical to immunity by transporting the immune cells and antigens [1] and maintaining the fluid balance in tissues [2]. Currently, there are only a few studies considering the spatial structure of the human lymphatic system in terms of the network models [3][4][5] . The mathematical properties of the HLS network remain poorly investigated. The graph theory provides a powerful tool to implement, visualize and analyze the characteristics of the network models. Recently, it has been successfully applied for analysis of the topological properties and robustness of conduit networks in lymphatic nodes [6].
Generally, two complementary approaches to modeling the network structure of the lymphatic system. One is based on employment of available anatomical data [7,8]. The second approach uses some general rules of lymphatic vessels network organization as discussed in [5,9]. In this study, Figure 1. Oriented graph of a simple network model of the human lymphatic system suggested by Reddy et al. [3] with 29 nodes and 28 edges. The black node with out-degree deg + = 0 corresponds to jugular vein (sink).

Plastic Boy-Derived Graph Model of the HLS
We have previously developed a 3D computational geometry model of the HLS based on available anatomical data from the Plasticboy project [5]. This network is a graph G(V, E) containing 996 vertices and 1117 edges as shown in Figure 2. Unlike the Reddy's graph model of the HLS shown in the Figure 1, the above model is an undirected graph. To transform it to an oriented graph, we need to determine the directions of the lymph flow through the graph. To this end, the simulation of a steady lymph flow based on Poiseuille equation is employed.

Undirected Graph of the HLS
The anatomical data-based 3D graph model contains several vertices of degree 2 corresponding to the lymphatic vessels which are composed of several lymphangions. We remove from the graph all vertices of degree 2 corresponding to individual lymphangions to represent the multi-lymphangion vessels as single graph edges. Moreover, we add one edge to correct the connection of right arm to the body (which was missing in the original graph) and three edges to get a single output vertex for the whole HLS graph (modeling the excretory vessels emptying the lymph to the venous system, i.e., the jugular lymphatic trunk, thoracic duct and jugular vein). Overall, the following specifications are made to obtain the HLS presented in Figure 2: 1. coordinates of all vertices in the graph are scaled to correspond to the basal 1750 mm height of a man; 2. the right hand and the head right side vertices are attached to the major part of HLS graph by adding the edge from vertex 987 to the vertex 993; 3. single output node (sink): (1) two vertices are added (numbered as 995 and 996) with vertex 996 being the output of the system, (2) three edges, i.e., between nodes 993-995, 994-995 and 995-996 were added; 4. we iterate over graph vertices and search for irrelevant vertices that meet all the following conditions: (a) their degree is equal to two; (b) they are not lymph nodes; (c) their neighbor vertices v a , v b are not connected with each other by an edge; at each iteration, the irrelevant vertex was removed, and the edge v a − v b was created.
The topological properties of the HLS data-based graph models, i.e., the vertices degree-and edge-length distributions for the Plastic boy data-based graph and Reddy's graph of the HLS are summarized in Figure 3A,B, respectively. For the data-based HLS graph, we additionally show the edge-length distribution and degree distribution of the vertices that represent lymph nodes (LNs) in Figure 3A.

Lymph Flow Analysis for Specifying the Directed Graph
The lymphatic system collects excess of intercellular fluid in the body tissues, drains it through the lymph nodes (LN) and redirects the lymph further along the lymphatic vessels, delivering the collected fluid into the central venous duct. In the LNs, about 90% of the afferent lymph flow takes a peripheral path through the subcapsular and medullary sinuses [10]. The remaining 10% enters interstitium to finally reach the efferent lymphatic vessel or to be absorbed by parenchymal blood vessels of the LN. Earlier experimental data show that about 10% of the lymph is absorbed into the blood vessels that penetrate the lymph nodes [11,12], while the remaining 90% of lymph entering LN is transported further through the lymphatic vessels to higher-level LN.
We study the balance of lymph flow in the undirected anatomical-based graph presented in Figure 2 to obtain the direction of lymph flow through the graph. We use the similar approach as described in [13]. We apply the Poiseuille equation to every internal vessel represented by edge (i, j), where Q ij is the lymph flow rate along the vessel, R ij = R is the vessel lumen radius, η is the dynamic viscosity of the lymph, p i , p j are the pressures at the ends of the vessel, L ij is the length of the vessel.
In the internal nodes, the mass conservation law is imposed: where N inp = 357 is the number of input nodes with zero in-degree that provide the inflow of the lymph Q in = 0.081 mm 3 /s, N out = 1 is the only output node with outflow Q out = N inp Q in = 28.93 mm 3 /s. To close the system (1) and (2) we specify the pressure p out = 1127.73 Pa on the output node and Q in flow rate on the input nodes.
The above set of equations on HLS network in the matrix form to calculate the lymph flows where p = [p 1 , . . . , p N ] T is an unknown pressure vector for all the vertices of the graph G(V, E). Matrix K and vector b are defined as where A is an adjacency matrix of the LS graph; L ij is a distance between vertices i and j; V inp , V int , V out are sets of input, internal and output vertices; δ ij is Kronecker's delta. For the inflow vertices, we assume that all input vertices obtain equal amount of lymph (Q in ). For internal vertices, the conservation of mass requires the sum of incoming and outgoing flows to be zero. For the outflow vertices, we fix the pressure values P out .
To calculate the stationary flows, the following assumptions were made: 1. there are no other sinks of the lymph except for the exit vertex number 996: V out = {996}; 2. all vertices of the graph that have only one connection with other vertices (except for the vertex 996) are the points of lymph entry (collecting lymphatics) into the lymphatic network; 3. the pressure at all inflow vertices is adjusted to provide the lymph flow rate on the input edges to be equal to the output flow rate of the system divided by number of vertices with zero input edges; 4. the radii of all the vessels are set to be same, equal to 1 mm; 5. the constant viscosity is set in the lymphatic network.
The value of the dynamic viscosity of lymph at a temperature of 37 • C (η = 1.8 × 10 −3 Pa·s) is used. The pressure in the central venous duct ranges from 8 to 15 mmHg [2,[14][15][16], and the value of 11.5 mmHg is considered in the calculations. According to existing data, from 2 to 3 L of lymph per day gets into the central venous duct [17,18], hence we take this parameter to be 2.5 L per day. The resulting stationary lymph flow distribution in the HLS is shown in Figure 4. The above analysis enabled us to specify the lymph flow directions in the graph edges, and ultimately, to generate an oriented graph of the human lymphatic system, which is shown in Figure 5A. As the dimension of the graph HLS model is high, it can be visualized in 2D being subdivided into six segments representing specific parts of the human organism: arms, legs, head and body. The resulting oriented subgraphs are presented in the Figure 5B. (B) View of the graph on a plane with no-overlapping layout. The 2D layout was obtained using spring-based model algorithm [19] implemented in package igraph, followed by some manual tweaking to space out the vertices. The colored subgraphs correspond to different parts of the body, i.e., arms (dark and light blue), head/neck (magenta), torso (light green), legs (red and gray). LNs are represented by larger circles.
The adjacency matrix of this graph is presented in Figure 6.  Figure 6. Adjacency matrix of the oriented data-based graph of human lymphatic system shown in Figure 5. Colors represent the lengths L ij of the edges e ij .

Computational Algorithm for Generating a Random Rule-Based Directed Graph of the HLS
A quantitative modeling of the HLS presents a challenge due to enormous morphological complexity and variability in its appearance [5]. Scarcity of available anatomical data calls for the development of the so-called rule-based models of the HLS network structure. General rules of the lymphatic vessels network organization can be specified as follows: (i) no long-distance edges are allowed, (ii) nodes can have multiple inflow connections, (iii) the nodes connections are locally acyclic, (iv) the lymphatic system network is a circular graph [9], and additionally, the links between nodes of the same layer are allowed. To build a random graph of the lymphatic system, we developed a rule-based algorithm that generates a random oriented graph with a predetermined number of layers N l , number of vertices N v , number of sources N inp (input nodes, which should only have out-degree 1) and with one sink (output node). The layers in the algorithm are numerated from the output to the input vertices, from ground layer with index 1 to top layer with index N l . Iterative process on vertices is executed layer by layer starting from the next to the top layer. The rule-based graph HLS model generation consists of three major steps as specified in Algorithm 1 (C++ language realization of the algorithm is provided in supplementary materials, for any questions please contact authors of the article).

Algorithm 1: Generation of a rule-based directed graph of the HLS.
Parameters : N v , N inp , N l , P e , P o ; N v -number of vertices; N inp -number of input nodes; N l -number of layers; P e -probability of additional edge creation; P o -prob. of over-layer edge type; Init array A of size N l to store lists of vertices in each layer; Step 1. Distribute the vertices through the layers.
Set AVL = round(1.1 · (N v − 1)/(N l − 1)); for each layer L from A do add one vertex to the layer L; end while vertices count in A is less than N v do select random layer L from A; if L is a top layer of A and vertices count in layer L is less than min(N inp , AVL) then add one vertex to the layer L; else if L is not top layer of A and vertices count in layer (L + 1) is more than in layer L then add one vertex to the layer L; end end end Step 2. Configure the input vertices.
Init empty list L inp to store the input vertices (which can be spread through the layers); insert all vertices from top layer of A into L inp ; while vertices count in L inp is less than N inp do select random layer L c from A; let n c be the number of vertices in layer L c ; let n inp be the number of vertices in layer L c that are present in L inp ; if n c > n inp + 1 then select random vertex from layer L c that is not in L inp as v candidate ; insert v candidate into L inp ; end end Step 3. Create directed edges between the vertices.

for each vertex v in A do
let L v be the layer which holds the current vertex v; let L f irst be the list of vertices from layer (L v − 1) that do not have inflow connections; let L second be the list of vertices from layer (L v − 1) that do have inflow connections; if L f irst is not empty then select random vertex from L f irst that is not in L inp as v s ; else select random vertex from L second that is not in L inp as v s ; end create directed edge from current vertex v to selected vertex v s ; if v is not in L inp then while P e > random(0,1) do let L tv = L v − 1; while L tv > 1 and P o > random(0,1) do L tv = L tv − 1; end select random vertex from layer L tv that is not in L inp as v s ; create directed edge from current vertex v to selected vertex v s ; end end end The optimal estimation of the algorithm parameters is performed in Section 4. To this end a topological fitness function specifying a mismatch between the anatomy-based graph and the rule-based graph is used. One realization of a random graph constructed using Algorithm 1 with the optimal parameters is presented in Figure 7, its degree distribution is shown in Figure 8, its adjacency matrix-in Figure 9.

Parameters of the Rule-Based Algorithm Providing Best Match to the Anatomy Data Graph
The basic topological properties of the graph are the node (vertex) degree-and the edge (arc) length distributions. These were presented for two types of graph in the previous sections. As the coordinates of nodes (and edge lengths) of the rule-based graph are not specified, we can only obtain statistics on the degree distribution of the vertices, and this indicator varies significantly depending on the parameter settings of the algorithm. We consider some fundamental characteristics of the HLS graph properties to make a deeper comparison of the graph models such as the energy, clustering, robustness, etc. Let G = (V, E) be an undirected version of the graph with n =| V(G) | and m =| E(G) | nodes and edges, respectively. Let A denote the n × n adjacency matrix of G.

•
The number of input nodes N inp , i.e., the number of nodes with degree 1 and out-degree 0. • Maximum degree of graph ∆ G , i.e., the maximum degree of its vertices. • Girth of the graph g, which is the length of the shortest (undirected) cycle in the graph. • Diameter, i.e., the longest geodesic distance (in other terms, maximum eccentricity of any vertex), where d(u, v) is the geodesic distance (shortest directed path connecting vertices u and v), (v) is the eccentricity of vertex v. • Radius of the graph (minimum eccentricity of any vertex), • Average path length (mean geodesic distance), • The energy and the spectral radius of the graph are defined as follows, where λ j stand for the eigenvalues of the adjacency matrix A of the graph. • Edge density of the graph, i.e., the number of edges divided by the number of all possible edges, .
• The clustering coefficient C (transitivity) measures the probability that two neighbors of a vertex are connected. It can be computed as function of adjacency matrix A: • Number of separators n sep , i.e., the vertices removal of which disconnects the graph. • The robustness R of the graph can be defined as the fraction of peripheral vertices that retained the connection with the output vertex after removing 5% of the vertices selected randomly, averaged over k = 1, ..., N r removal realizations. Given adjacency matrixÃ k of the k-th realization of the graph, indices i 1 , ..., i n inp of its input vertices and index o of the output vertex, the robustness is computed as • Topological diversity of the vertices as a function of the Shannon entropy associated with flow rates through the incident edges, where k is the number of v i 's incident edges and p ij is the proportion of the flow between the adjacent v i and v j to the total flow through the edges involving v i . The flow diversity is defined analogous to the social and spatial diversity of networks from [20].
The rule-based graphs generated with Algorithm 1 have five tuning parameters: (1) number of vertices N v , (2) number of input vertices N inp , (3) number of layers N l , (4) probability of new edge creation P e at each step of the algorithm, probability that the created edge connects vertices from different layers P o . The first two parameters are set to match explicitly the properties of the anatomy-based graph: N v = 996, N inp = 357. The number of layers can be estimated as N l = D + 1 = 41, where D is the diameter of the directed target graph, which is increased by one because the output vertex is represented as a separate (ground) layer in Algorithm 1. Given the metrics characterizing the topology of an anatomy-based graph, we search for values of the remaining parameters P e and P o that can produce the graphs with similar topological structure. Specifically, we introduce the following state-vector s(G) = m, g, D, r, l G , En, ρ, ρ d , C, n sep , n deg 1 , n deg 2 , n deg 3 , n deg 4 , n deg 5 , n deg 7 , n deg 8 which describes the topological properties of graph G(V, E), and the objective function which penalizes the topological discrepancies of graph G from the target graph G * and weighs them with (s i (G * )) −2 to bring discrepancies of different components of vector s to a single scale. Here, we denote the number of vertices with degree i as n deg i = ∑ v∈V, deg(v)=i 1, and exclude from the state-vector n deg 6 , which equals zero for an anatomy-based graph, to avoid singularity in (14). First, we investigate the landscape of topological fitness of algorithmically generated graphs over uniform ranges of admissible values of parameters P e and P o . The dependence of the objective function (14) on P e and P o is presented in Figure 10A. The colors correspond to logarithmically transformed mean values of objective function Φ mean over 640 algorithm realizations for each set of parameter values. One can see that there is a narrow region in which the objective function reaches its minimum on the grid cells where P e = 0.035. Next, we examine more closely the dependence of the objective function distribution on parameter P o for fixed P e = 0.035, which is presented in Figure 10B. One can see that for a long range of parameter P o , the objective function values remain low, until they increase for large values P o > 0.8, where the maximum degree of graphs become too large (the dependence of topological metrics on P o is presented in Figure S1). From Figure 10, one can estimate the following ranges of viable parameter values for P e and N l : P e ∈ (0.001, 0.07), P o ∈ (0.01, 0.8). Two examples of rule-based graphs with satisfactory topological fitness generated by the algorithm with the use of optimal parameters P e = 0.035, P o = 0.21 are shown in Figure 7.

Comparative Analysis of the HLS Graph Models
The differences between the anatomy data-based and the rule-based graphs in terms of the metrics introduced in Section 4 are summarized in Tables 1 and 2. The histograms of robustness distributions (R k values) for each type of graph calculated over N r = 100, 000 (N r = 435 for Reddy's graph is the number of random deletions of 5% of nodes corresponding to 29 one node deletions and 406 deletions of randomly chosen pairs of nodes) realizations of random deletion of 5% of vertices are presented in Figure 11. The distributions of flow diversity of anatomy-based graphs are presented in Figure 12. Please note that for data-based graph we use the flows computed in Section 2.2.2. For Reddy's graph we obtain the flows assuming the equidistant inflow along 16 input edges equal to the 1/16-th of the outflow to the blue output node and the conservation law. Table 1. Summary statistics for Reddy's, anatomy-based and rule-based graphs characterizing their topological properties. For algorithmically generated graphs, we present the statistics obtained over 10,000 graphs generated with algorithm parameters P e = 0.035, P o = 0.21, N l = 41. Robustness was calculated over 1000 algorithm realizations, with 1000 removal attempts for each graph.

Lymphatic Vascular
Reddy   Figure 11. Histograms of the robustness of (A) Reddy's graph, (B) anatomical data-based graph and (C) rule-based random graph presented in Figure 7A. Given the algorithmically generated rule-based graph with optimal parameters presented in Figure 7A, we used another method to estimate the level of its resemblance to the anatomy-based graph. Specifically, to evaluate the level of isomorphism (LI) between the data-based and the rule-based graphs we searched for the row and column permutations to find the best match of them. Please note that examination of the graph isomorphism level belongs to the NP complexity calls. For this, we formed an is the adjacency matrix of the data-based graph, and B is the adjacency matrix of the rule-based graph, O is the zero matrix with equal sizes, D max is the initial Hamming distance for the given adjacency matrices. We used the simulated annealing method to minimize the Hamming distance between matrices M 1 , M 2 by performing permutations in the M 2 matrix (based on graph isomorphism): The permutations in the M 2 matrix are performed as follows: a pair of numbers i, j is chosen at random. Then, in the matrix M 2 , columns number i, j are swapped. After that, the same operation is performed with rows i, j. From the point of view of the graph, such transformations mean a permutation of the designations of the vertices i, j. Using this, we carry out a stochastic optimization of the M 2 matrix by the simulated annealing method to minimize (15). After that, we calculate the relative distance between graphs, presented by the adjacency matrices M 1 , M 2 as D rel = . Here, D rel = 1 means a complete mismatch, D rel = 0 means the complete equivalence of the adjacency matrices. By solving the minimization problem for the entire graphs, we achieved the upper bound estimate of the mismatch level of the range LI ≤ 0.49.

Conclusions
The human lymphatic system is a complex system consisting of many components and performing several important functions related to the metabolism, fluid tissue homeostasis and the immune system [21,22]. Currently, there are only a few models addressing the network structure of the human lymphatic system in a systematic way [3][4][5]. In this work, we propose an approach based on the graph theory to modeling the human lymphatic system and analyzing the fundamental mathematical properties of the resulting graph models. Two different types of graph models are developed, the anatomical data-and the rule-based ones. The transformation of the anatomical data-based simple graph into the oriented graph is implemented on the base of the steady-state lymph flow balance in the lymphatic vessel network. This homeostatic balance is quantified by applying the Poiseuille equation to every vessel and the mass conservation law to every vessel junction. The available empirical data on lymph flow through the system are rather scarce and range from 0.004 mm 3 /s in rats to 46.3 mm 3 /s in humans [2]. Our computations consistently suggest the range of lymph flows from 0.001 to 10 mm 3 /s, depending on the location of the respective vessel.
Our study is aimed to understand the fundamental topological properties of the human lymphatic system and to critically analyze the underlying organizational principles (rules) of the LS that have been proposed for modeling the systemic spreading of HIV infection. The combination of the developed HLS data-based graph and the publicly available software SimVascular (https://simvascular.github. io/index.html) designed for blood flow simulations provide a solid basis to further move on to the 1D, 2D-and 3D-simulations of the lymphatic system flows. The knowledge gained on the lymphatic network topology can be useful for systems pharmacokinetics studies and drug therapies design.
Recent progress in tissue bioengineering and 3D bio-printing enables the development of artificial organs including those of the lymphatics system, e.g., the LN-on-a-chip [23]. The artificial LNs can be used as a functional cure for certain pathological conditions like lymphoedema characterized by disruption of the interstitial fluid transport. The graph model of the human lymphatic system provides a platform for an optimal positioning of the artificial LNs and lymphatic vessels to restore the functioning and performance of the HLS.
The lymphatic system (LS) is responsible for maintaining the fluid balance in tissues and the functioning of the immune system. As the complexity of the organization and regulation of the immune system is extremely high, the move towards a mechanistic understanding of the immune system functioning takes the form of specific rules (see [24][25][26][27]). The respective set of organizational principles (rules) of the LS structure was recently proposed and used for modeling the systemic spreading of HIV infection [9]. In our study we analyzed the consistency of the rule-based graph of the human lymphatic system with that based on the anatomical data. To this end, a novel computational algorithm for generation of the rule-based random graph has been developed. Some fundamental characteristics of the two types of HLS graph models are analyzed using metrics such as graph energy, clustering, robustness, etc.
Additionally, we estimated the optimal parameters of the proposed algorithm for generating the rule-based graph model which provide the best-fit to the anatomy data-based graph model with respect to basic topological metrics. Our analysis revealed that the currently available set of rules specifying universal properties of HLS for the rule-based graph modeling requires further maturation to be applied in clinical studies.
Overall, it is the first study in which the full complexity of the HLS network is addressed by methods of the graph theory methods. We probe to elucidate the efficacy of the rule-based approach to generate the HLS models. Further specification of the basic rules underlying the structure and function of the HLS is needed in collaboration with immunologists and physiologists. The graph models which we make publicly available, provide an important step towards a quantitative analysis of transport phenomena in the whole lymphatic system. This is a problem of a paramount importance for systems immunology, pharmacokinetics and medicine [1,2,28].