Article

An Efficient Algorithm to Count Tree-Like Graphs with a Given Number of Vertices and Self-Loops

by Naveed Ahmed Azam *, Aleksandar Shurbevski and Hiroshi Nagamochi
Department of Applied Mathematics and Physics, Kyoto University, Kyoto 606-8502, Japan
* Author to whom correspondence should be addressed.
Entropy 2020, 22(9), 923; https://doi.org/10.3390/e22090923
Submission received: 23 July 2020 / Revised: 14 August 2020 / Accepted: 18 August 2020 / Published: 22 August 2020

Abstract

Graph enumeration with given constraints is an interesting problem considered to be one of the fundamental problems in graph theory, with many applications in natural sciences and engineering such as bio-informatics and computational chemistry. For any two integers $n \geq 1$ and $\Delta \geq 0$, we propose a method to count all non-isomorphic trees with $n$ vertices, $\Delta$ self-loops, and no multi-edges based on dynamic programming. To achieve this goal, we count the number of non-isomorphic rooted trees with $n$ vertices, $\Delta$ self-loops and no multi-edges, in $O(n^2(n + \Delta(n + \Delta \cdot \min\{n, \Delta\})))$ time and $O(n^2(\Delta^2 + 1))$ space, since every tree can be uniquely viewed as a rooted tree by either regarding its unicentroid as the root, or in the case of a bicentroid, by introducing a virtual vertex on the bicentroid and assuming the virtual vertex to be the root. By this result, we get a lower bound and an upper bound on the number of tree-like polymer topologies of chemical compounds with any “cycle rank”.

1. Introduction

Counting and generation of discrete objects are two fundamental problems in combinatorial mathematics and have many applications in the fields of natural science and engineering, such as computational chemistry and bioinformatics. The counting problem asks to count all possible objects under given constraints. On the other hand, the generation problem asks to list all possible objects under given constraints. One of the notable advantages of the counting problem is that we can know the size of the solution space before generating all solutions.
Different kinds of enumeration methods are used to solve counting and generation problems, where branching algorithms and Polya’s enumeration theorem are the two most commonly used methods for these problems. In branching algorithms, the computation is performed by following a computation tree, and the required solutions are attained at the leaves of the computation tree. It is important to mention that the branching algorithms can only count all solutions after generating each one of them, and therefore they are inefficient for the problem where we first want to know the size of the solution space before the generation of solutions.
The well-known Polya’s enumeration theorem [1,2] is used for counting all distinct objects. The idea of this method is to use the cycle index of the group of symmetries of the underlying object to develop a generating function, which is then used to count all possible objects. Note that finding the group of symmetries and its cycle index is a challenging task, which may make the use of Polya’s theorem harder for some problems.
The drawback of branching algorithms discussed above and the difficulty of using Polya’s theorem necessitate the exploration of new enumeration methods to solve counting problems efficiently. For an enumeration method, it is necessary to satisfy the following three conditions:
(i) Consider all solutions: The method does not miss any of the required objects;
(ii) Avoid duplication: The method does not count and generate isomorphic objects; and
(iii) Low computational complexity: The method can count and generate all solutions in low time and space complexity.
Designing such a method is not an easy task, because of the underlying symmetries and the computation difficulty for their detection.
Counting and generation of chemical compounds have a long history and numerous applications in designing novel drugs [3,4,5,6,7,8] and structure elucidation [9]. The problem of counting and generation of chemical compounds can be viewed as the problem of enumerating graphs with given constraints. There are several available chemical compound enumeration tools [10,11,12]. We can divide these tools into two classes. One class of enumeration tools treats general graph structures [10,12]. In the other class, the tools are focused on enumerating some restricted chemical compounds. One such tool is Enumol2 [11]. Enumeration of restricted chemical compounds with specialized tools is more efficient than with the tools which use general graph structures. This led to a new trend of developing efficient enumeration of restricted chemical compounds in the field of chemoinformatics [13].
A polymer is a large molecule with interesting chemical properties consisting of many sub-molecules. From a graph-theoretic perspective, we represent the structure of a polymer with a graph $G$ called polymer topology, possibly with self-loops and multi-edges, such that $G$ is connected and the degree of each vertex in $G$ is at least three [14]. For a chemical graph, we get its polymer topology by repeatedly removing the vertices of degree one and two. For example, the polymer topology of Remdesivir, $\mathrm{C_{27}H_{35}N_{6}O_{8}P}$ (Figure 1a), a potential candidate of treatment for COVID-19, is illustrated in Figure 1b.
Tezuka and Oike [15] pointed out that a classification of polymer topologies will lay a foundation for the elucidation of structural relationships between different macro-chemical molecules and their synthetic pathways. Different kinds of graph-theoretic approaches have been applied to classify and enumerate polymer topologies [16,17]. For a connected graph G, possibly with self-loops and multi-edges, the cycle rank is defined to be the number of edges that must be removed to get a simple spanning tree of G. Recently, Haruna et al. [14] proposed a method to enumerate all polymer topologies with cycle rank up to five.
Notice that trees with no multi-edges but with $\Delta \geq 0$ self-loops have cycle rank $\Delta$ and include all polymer topologies with the said structure. Therefore, it is of interest to count and generate all trees with no multi-edges and a given number of vertices and self-loops.
We use dynamic programming (DP) to count all mutually non-isomorphic trees with n vertices, Δ self-loops and no multi-edges. The basic idea of DP is to partition the original problem into subproblems that satisfy some recursive relations, and the union of their solution sets is equal to the solution set of the original problem. Unlike branching algorithms and Polya’s theorem, the main advantage of using the DP is that we can count all non-isomorphic structures without their generation and calculation of their group of symmetries. As an application of our results, we get lower and upper bounds on the number of tree-like polymer topologies with self-loops of a given cycle rank.
The rest of the paper is organized as follows: Section 2 reviews some notions and results related to graph theory. Section 3 explains our tree counting method. Section 4 makes some concluding remarks.

2. Preliminaries

Throughout this paper, the term graph stands for an undirected graph with no multi-edges and possibly with self-loops unless stated otherwise. Let $G$ be a graph. We denote an edge between two vertices $u$ and $v$ in $G$ by $uv$ $(= vu)$. Let $V(G)$ and $E(G)$ denote the vertex set and edge set of $G$, respectively. Let $s(G)$ denote the number of self-loops in $G$. For a vertex $v \in V(G)$, we denote by $s(v)$ the number of self-loops on the vertex $v$. For a vertex $v$ in $G$, let $N_G(v)$ denote the set of vertices adjacent to $v$ except $v$ itself, and the degree $\deg_G(v)$ of $v$ in $G$ is defined to be $|N_G(v)|$. A graph $H$ with the properties $V(H) \subseteq V(G)$ and $E(H) \subseteq E(G)$ is called a subgraph of $G$. A simple path between two distinct vertices $u, v \in V(G)$ is defined to be a subgraph $P$ of $G$ with vertex set $V(P) = \{u = w_1, w_2, \ldots, v = w_k\}$ and edge set $E(P) = \{w_i w_{i+1} \mid 1 \leq i \leq k - 1\}$. A graph is called a connected graph if there is a path between any two distinct vertices in the graph. A connected component of a graph $G$ is defined to be a maximal connected subgraph $H$ of $G$, i.e., for any vertex $v \in V(G) \setminus V(H)$ it holds that every subgraph with the vertex set $V(H) \cup \{v\}$ is disconnected.
By Jordan [18], any simple tree with $n \geq 1$ vertices has either a unique vertex or a unique edge, the removal of which creates connected components with at most $\lfloor (n-1)/2 \rfloor$ or exactly $n/2$ vertices, respectively. Such a vertex is called the unicentroid, the edge is called the bicentroid, and collectively they are called the centroid of the tree. It is important to note that a bicentroid exists only for trees with an even number of vertices. A tree with a fixed vertex $r$ is called a rooted tree with root $r$. Note that any tree can be uniquely viewed as a rooted tree by either regarding its unicentroid as the root, or in the case of a bicentroid, by introducing a virtual vertex on the bicentroid and assuming the virtual vertex as the root.
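The centroid described above can be located with one traversal that records subtree sizes. The following is a minimal sketch, not taken from the paper: the function name `centroid` and the adjacency-list input format are assumptions made for illustration. It returns one vertex for a unicentroid and the two end vertices of the edge for a bicentroid.

```python
def centroid(adj):
    """Return the centroid vertices of a tree given as adjacency lists.

    adj[v] lists the neighbours of vertex v (vertices are 0..n-1).
    One vertex is returned for a unicentroid, two for a bicentroid.
    """
    n = len(adj)
    order, parent = [0], {0: None}
    for v in order:                      # BFS; the list grows while we iterate
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)
    size = [1] * n
    for v in reversed(order[1:]):        # accumulate subtree sizes bottom-up
        size[parent[v]] += size[v]
    result = []
    for v in range(n):
        heaviest = n - size[v]           # component on the parent's side
        for w in adj[v]:
            if parent.get(w) == v:       # w is a child of v
                heaviest = max(heaviest, size[w])
        if 2 * heaviest <= n:            # no component exceeds n/2 vertices
            result.append(v)
    return result
```

For a path on three vertices the sketch returns the middle vertex, and for a path on four vertices it returns the two middle vertices, which are the ends of the bicentroid.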
Let $H$ be a rooted tree. Let $r_H$ denote the root of $H$. For any two distinct vertices $u, v \in V(H)$, let $P_H(u, v)$ denote the unique simple path between them in $H$. For a vertex $v \in V(H) \setminus \{r_H\}$, we define the ancestors of $v$ to be the vertices on the path $P_H(v, r_H)$ other than $v$. If $u$ is an ancestor of $v$, then we call $v$ a descendant of $u$. For a vertex $v \in V(H) \setminus \{r_H\}$, the parent $p(v)$ of $v$ is defined to be the ancestor $u$ of $v$ such that $u \in N_H(v)$. We call the vertex $v$ a child of $p(v)$. Two vertices with the same parent in $H$ are called siblings. For a vertex $v \in V(H)$, let $H_v$ denote the subtree of $H$ rooted at $v$ induced by $v$ and its descendants.
Two rooted trees $T$ and $H$ are called isomorphic if there exists a bijection $\sigma : V(T) \to V(H)$ such that
(i) $\sigma(r_T) = r_H$;
(ii) for each vertex $v \in V(T)$, it holds that $s(v) = s(\sigma(v))$; and
(iii) for any two vertices $u, v \in V(T)$, it holds that $uv \in E(T)$ if and only if $\sigma(u)\sigma(v) \in E(H)$.
For any two integers $n \geq 1$ and $\Delta \geq 0$, let $\mathcal{H}(n, \Delta)$ denote a maximal set of mutually non-isomorphic rooted trees with $n$ vertices and $\Delta$ self-loops, and we define $h(n, \Delta) \triangleq |\mathcal{H}(n, \Delta)|$.
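The isomorphism conditions above can be tested on small examples by comparing canonical forms. The sketch below is an illustration and is not part of the paper's method: it assumes a rooted tree is given as a `children` mapping plus a `loops` list holding $s(v)$ for each vertex, and the function name `canon` is hypothetical. It builds a nested tuple in the spirit of the classical sorted-subtree encoding of rooted trees.

```python
def canon(children, loops, v):
    """Canonical form of the subtree rooted at v.

    children[v] lists the children of v and loops[v] is the number of
    self-loops on v. Two rooted trees are isomorphic (root to root, with
    equal self-loop counts on matched vertices) exactly when their
    canonical forms are equal.
    """
    sub = sorted(canon(children, loops, c) for c in children[v])
    return (loops[v], tuple(sub))
```

For instance, a root with two leaf children gives the same canonical form whichever leaf carries a single self-loop, but a different form when the self-loop sits on the root.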

3. Counting Tree-Like Graphs with a Given Number of Vertices and Self-Loops

We develop a method to compute, for any two integers $n \geq 1$ and $\Delta \geq 0$, the size $h(n, \Delta)$ of a maximal set $\mathcal{H}(n, \Delta)$ of mutually non-isomorphic rooted trees with $n$ vertices and $\Delta$ self-loops; i.e., we are interested in the following problem:
Counting Problem
Input: Two integers $n \geq 1$ and $\Delta \geq 0$.
Output: $h(n, \Delta)$.
We solve this problem by using dynamic programming based on the information of the number of vertices and self-loops in the subtrees rooted at the children of the root of each tree in $\mathcal{H}(n, \Delta)$. We define the following notions.
Let $n \geq 1$ and $\Delta \geq 0$ be any two integers. For each tree $H \in \mathcal{H}(n, \Delta)$, we define
$$\mathrm{Max}_v(H) \triangleq \max\big(\{|V(H_v)| : v \in N_H(r_H)\} \cup \{0\}\big),$$
$$\mathrm{Max}_s(H) \triangleq \max\big(\{s(H_v) : v \in N_H(r_H),\ |V(H_v)| = \mathrm{Max}_v(H)\} \cup \{0\}\big).$$
Note that for any tree $H \in \mathcal{H}(1, \Delta)$, it holds that $\mathrm{Max}_v(H) = 0$ and $\mathrm{Max}_s(H) = 0$.
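To make the two quantities concrete, the following sketch computes them for an explicitly given rooted tree. The representation (a `children` mapping and a `loops` list) and the function names are assumptions chosen for illustration only; they do not appear in the paper.

```python
def subtree_stats(children, loops, v):
    """Return (number of vertices, number of self-loops) of the subtree at v."""
    size, total_loops = 1, loops[v]
    for c in children[v]:
        c_size, c_loops = subtree_stats(children, loops, c)
        size += c_size
        total_loops += c_loops
    return size, total_loops


def max_v_max_s(children, loops, root):
    """Return (Max_v, Max_s) of the tree rooted at `root`.

    Max_v is the largest size of a subtree rooted at a child of the root;
    Max_s is the largest self-loop count among child subtrees of that size.
    Both are 0 when the root has no children.
    """
    stats = [subtree_stats(children, loops, c) for c in children[root]]
    max_v = max((size for size, _ in stats), default=0)
    max_s = max((s for size, s in stats if size == max_v), default=0)
    return max_v, max_s
```

As a usage example, take a root with children 1, 2, 3, where vertex 1 has child 4 and vertex 2 has child 5, and self-loop counts `[3, 0, 1, 2, 0, 1]`. The child subtrees have (size, self-loops) equal to (2, 0), (2, 2) and (1, 2), so the pair is (2, 2); the two self-loops on the single-vertex subtree do not matter because that subtree is not of maximum size.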
Let $m, d \geq 0$ be any two integers. We define
$$\mathcal{H}(n, \Delta, m, d) \triangleq \{H \in \mathcal{H}(n, \Delta) : \mathrm{Max}_v(H) \leq m,\ \mathrm{Max}_s(H) \leq d\}.$$
Observe that by the definition of $\mathcal{H}(n, \Delta, m, d)$ it holds that
(i) $\mathcal{H}(n, \Delta, m, d) = \mathcal{H}(n, \Delta, n-1, d)$ if $m \geq n$;
(ii) $\mathcal{H}(n, \Delta, m, d) = \mathcal{H}(n, \Delta, m, \Delta)$ if $d \geq \Delta + 1$; and
(iii) $\mathcal{H}(n, \Delta) = \mathcal{H}(n, \Delta, n-1, \Delta)$.
Therefore, from now on, we assume that $m \leq n - 1$ and $d \leq \Delta$. Further, by the definition of $\mathcal{H}(n, \Delta, m, d)$ it holds that $\mathcal{H}(n, \Delta, m, d) \neq \emptyset$ (resp., $\mathcal{H}(n, \Delta, m, d) = \emptyset$) if “$n = 1$” or “$n - 1 \geq m \geq 1$” (resp., otherwise ($n \geq 2$ and $m = 0$)).
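We define
$$\mathcal{H}(n, \Delta, m_{=}, d) \triangleq \{H \in \mathcal{H}(n, \Delta, m, d) : \mathrm{Max}_v(H) = m\}.$$
It follows from the definition of $\mathcal{H}(n, \Delta, m_{=}, d)$ that $\mathcal{H}(n, \Delta, m_{=}, d) \neq \emptyset$ (resp., $\mathcal{H}(n, \Delta, m_{=}, d) = \emptyset$) if “$n = 1$” or “$n - 1 \geq m \geq 1$” (resp., otherwise ($n \geq 2$ and $m = 0$)). Further we have the following relation:
$$\mathcal{H}(n, \Delta, m, d) = \mathcal{H}(n, \Delta, 0_{=}, d) \quad \text{if } m = 0, \tag{1}$$
$$\mathcal{H}(n, \Delta, m, d) = \mathcal{H}(n, \Delta, m-1, d) \cup \mathcal{H}(n, \Delta, m_{=}, d) \quad \text{if } m \geq 1, \tag{2}$$
where $\mathcal{H}(n, \Delta, m-1, d) \cap \mathcal{H}(n, \Delta, m_{=}, d) = \emptyset$ for $m \geq 1$.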
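Next we define
$$\mathcal{H}(n, \Delta, m_{=}, d_{=}) \triangleq \{H \in \mathcal{H}(n, \Delta, m_{=}, d) : \mathrm{Max}_s(H) = d\}.$$
Note that if “$n = 1$ and $d = 0$” or “$n - 1 \geq m \geq 1$” (resp., otherwise (“$n = 1$ and $d \geq 1$” or “$n \geq 2$ and $m = 0$”)), then by the definition of $\mathcal{H}(n, \Delta, m_{=}, d_{=})$ it holds that $\mathcal{H}(n, \Delta, m_{=}, d_{=}) \neq \emptyset$ (resp., $\mathcal{H}(n, \Delta, m_{=}, d_{=}) = \emptyset$). Furthermore, we get the following relation for $\mathcal{H}(n, \Delta, m_{=}, d)$:
$$\mathcal{H}(n, \Delta, m_{=}, d) = \mathcal{H}(n, \Delta, m_{=}, 0_{=}) \quad \text{if } d = 0, \tag{3}$$
$$\mathcal{H}(n, \Delta, m_{=}, d) = \mathcal{H}(n, \Delta, m_{=}, d-1) \cup \mathcal{H}(n, \Delta, m_{=}, d_{=}) \quad \text{if } d \geq 1, \tag{4}$$
where $\mathcal{H}(n, \Delta, m_{=}, d-1) \cap \mathcal{H}(n, \Delta, m_{=}, d_{=}) = \emptyset$ for $d \geq 1$.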
Let $n - 1 \geq m \geq 0$ and $\Delta \geq d \geq 0$ be four integers. Let $h(n, \Delta, m, d)$, $h(n, \Delta, m_{=}, d)$ and $h(n, \Delta, m_{=}, d_{=})$ denote the number of elements in the families $\mathcal{H}(n, \Delta, m, d)$, $\mathcal{H}(n, \Delta, m_{=}, d)$ and $\mathcal{H}(n, \Delta, m_{=}, d_{=})$, respectively. We discuss recursive relations for $h(n, \Delta, m, d)$ and $h(n, \Delta, m_{=}, d)$ in Lemma 1.
Lemma 1.
For any four integers $n - 1 \geq m \geq 0$ and $\Delta \geq d \geq 0$, it holds that
(i) $h(n, \Delta, m, d) = h(n, \Delta, 0_{=}, d)$ if $m = 0$;
(ii) $h(n, \Delta, m, d) = h(n, \Delta, m-1, d) + h(n, \Delta, m_{=}, d)$ if $m \geq 1$;
(iii) $h(n, \Delta, m_{=}, d) = h(n, \Delta, m_{=}, 0_{=})$ if $d = 0$; and
(iv) $h(n, \Delta, m_{=}, d) = h(n, \Delta, m_{=}, d-1) + h(n, \Delta, m_{=}, d_{=})$ if $d \geq 1$.
Proof. 
The case (i) follows by Equation (1). The case (ii) follows by Equation (2) and the fact that for $m \geq 1$ it holds that $\mathcal{H}(n, \Delta, m-1, d) \cap \mathcal{H}(n, \Delta, m_{=}, d) = \emptyset$. By Equation (3) the case (iii) follows. The case (iv) follows by Equation (4) and the fact that for $d \geq 1$ it holds that $\mathcal{H}(n, \Delta, m_{=}, d-1) \cap \mathcal{H}(n, \Delta, m_{=}, d_{=}) = \emptyset$. □
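Applying the four cases of Lemma 1 repeatedly unrolls the cumulative counts into a double sum over the exact counts, which is the form used in the boundary arguments that follow. This is a restatement derived from Lemma 1, written out here as a worked step, with $k$ and $p$ as summation indices:

```latex
\begin{align*}
h(n, \Delta, m, d)
  &= \sum_{k=0}^{m} h(n, \Delta, k_{=}, d)
     && \text{by Lemma 1(i) and (ii)} \\
  &= \sum_{k=0}^{m} \sum_{p=0}^{d} h(n, \Delta, k_{=}, p_{=})
     && \text{by Lemma 1(iii) and (iv)}
\end{align*}
```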
Next we discuss some boundary conditions for our DP to compute $h(n, \Delta)$.
Lemma 2.
For any four integers $n - 1 \geq m \geq 0$ and $\Delta \geq d \geq 0$, it holds that
(i) $h(n, \Delta, 0_{=}, d_{=}) = 1$ (resp., $h(n, \Delta, 0_{=}, d_{=}) = 0$) if $n = 1$ and $d = 0$ (resp., otherwise (“$n = 1$ and $d \geq 1$” or “$n \geq 2$”));
(ii) $h(n, \Delta, 0_{=}, d) = h(n, \Delta, 0, d) = 1$ (resp., $h(n, \Delta, 0_{=}, d) = h(n, \Delta, 0, d) = 0$) if $n = 1$ (resp., otherwise ($n \geq 2$));
(iii) $h(n, \Delta, 1_{=}, d_{=}) = 1$ if “$n = 2$” or “$n \geq 3$ and $d = 0$”; and
(iv) $h(n, \Delta, 1_{=}, d) = h(n, \Delta, 1, d) = d + 1$ if “$n = 2$” or “$n \geq 3$ and $d = 0$”.
Proof. 
(i) The result follows from the definition of $\mathcal{H}(n, \Delta, 0_{=}, d_{=})$, since a tree $H$ with $\mathrm{Max}_v(H) = 0$ exists if and only if $|V(H)| = 1$ and $\mathrm{Max}_s(H) = 0$.
(ii) By Lemma 1(i), (iii) and (iv) it holds that $h(n, \Delta, 0, d) = h(n, \Delta, 0_{=}, d) = \sum_{p=0}^{d} h(n, \Delta, 0_{=}, p_{=})$. This and Lemma 2(i) imply the required result.
(iii) When $n \geq 2$, then for any tree $H \in \mathcal{H}(n, \Delta, 1_{=}, d_{=})$ it holds that $|N_H(r_H)| = n - 1$. Thus for each $v \in N_H(r_H)$ it holds that $|V(H_v)| = 1$ and $s(H_v) = d$ if “$n = 2$” or “$n \geq 3$ and $d = 0$”, i.e., $H_v \in \mathcal{H}(1, d, 0, d)$. But by Lemma 2(ii) it holds that $h(1, d, 0, d) = 1$. Hence we have the required result.
(iv) Let “$n = 2$” or “$n \geq 3$ and $d = 0$”. By Lemma 1(iii) and (iv) it holds that $h(n, \Delta, 1_{=}, d) = \sum_{p=0}^{d} h(n, \Delta, 1_{=}, p_{=})$. This and Lemma 2(iii) imply that
$$h(n, \Delta, 1_{=}, d) = d + 1. \tag{5}$$
Furthermore, by Lemma 1(i) and (ii) it holds that $h(n, \Delta, 1, d) = h(n, \Delta, 0_{=}, d) + h(n, \Delta, 1_{=}, d)$. Since $n \geq 2$, Lemma 2(ii) gives $h(n, \Delta, 0_{=}, d) = 0$, and so $h(n, \Delta, 1, d) = h(n, \Delta, 1_{=}, d)$. Hence the result follows by Equation (5). □
By Lemma 2, we get $h(1, \Delta) = 1$ and $h(2, \Delta) = \Delta + 1$. Furthermore, Lemma 1(i)–(iv) give recursive relations for $h(n, \Delta, m, d)$ and $h(n, \Delta, m_{=}, d)$ which depend on $h(n, \Delta, m_{=}, d_{=})$. Thus for $n \geq 3$, $m \geq 1$, and $\Delta \geq d \geq 0$, our next goal is to develop a recursive relation for $h(n, \Delta, m_{=}, d_{=})$. For any tree $H \in \mathcal{H}(n, \Delta, m_{=}, d_{=})$ and any vertex $v \in N_H(r_H)$, the subtree $H_v$ of $H$ satisfies exactly one of the following three conditions:
(C-1) $|V(H_v)| = m$ and $s(H_v) = d$.
(C-2) $|V(H_v)| = m$ and $0 \leq s(H_v) < d$.
(C-3) $|V(H_v)| < m$ and $0 \leq s(H_v) \leq \Delta$.
For any tree $H \in \mathcal{H}(n, \Delta, m_{=}, d_{=})$, we define the residual tree of $H$ to be the subtree of $H$ rooted at $r_H$ induced by the vertices $V(H) \setminus \bigcup_{v \in N_H(r_H),\ H_v \in \mathcal{H}(m, d, m-1, d)} V(H_v)$. Note that the residual tree of a tree $H$ has at least one vertex, i.e., the root of $H$. We give an illustration of a residual tree in Figure 2.
Lemma 3.
For any four integers $n \geq 3$, $m \geq 1$, and $\Delta \geq d \geq 0$, and a tree $H \in \mathcal{H}(n, \Delta, m_{=}, d_{=})$, let $q = |\{v \in N_H(r_H) : H_v \in \mathcal{H}(m, d, m-1, d)\}|$. Then it holds that
(i) $1 \leq q \leq \lfloor (n-1)/m \rfloor$ with $q \leq \lfloor \Delta/d \rfloor$ when $d \geq 1$.
(ii) The residual tree of $H$ belongs to exactly one of the families $\mathcal{H}(n - qm, \Delta - dq, m_{=}, \min\{\Delta - dq, d - 1\})$ and $\mathcal{H}(n - qm, \Delta - dq, \min\{n - qm - 1, m - 1\}, \Delta - dq)$.
Proof. 
(i) Since $H \in \mathcal{H}(n, \Delta, m_{=}, d_{=})$, there exists at least one vertex $v \in N_H(r_H)$ such that $H_v \in \mathcal{H}(m, d, m-1, d)$. This implies that $q \geq 1$. Also, it holds that $n - 1 \geq mq$ and $\Delta \geq dq$. This implies that $q \leq \lfloor (n-1)/m \rfloor$ with $q \leq \lfloor \Delta/d \rfloor$ when $d \geq 1$.
(ii) Let $K$ denote the residual tree of $H$. By the definition of $K$ it holds that $K \in \mathcal{H}(n - mq, \Delta - dq, n - mq - 1, \Delta - dq)$. Furthermore, for each vertex $v \in N_H(r_H) \cap V(K)$, the tree $H_v$ satisfies exactly one of the conditions (C-2) and (C-3). Now, if there exists a vertex $v \in N_H(r_H) \cap V(K)$ such that $H_v$ satisfies condition (C-2), then $d - 1 \geq 0$, and hence $K \in \mathcal{H}(n - qm, \Delta - dq, m_{=}, \min\{\Delta - dq, d - 1\})$. On the other hand, if condition (C-2) does not hold for any $v \in N_H(r_H) \cap V(K)$, i.e., either $N_H(r_H) \cap V(K) = \emptyset$ or for each $v \in N_H(r_H) \cap V(K)$ it holds that $|V(H_v)| \leq \min\{n - qm - 1, m - 1\}$ and $0 \leq s(H_v) \leq \Delta - dq$, then by the definition of $K$ it holds that $K \in \mathcal{H}(n - qm, \Delta - dq, \min\{n - qm - 1, m - 1\}, \Delta - dq)$. This completes the proof. □
For any five integers $n \geq 3$, $m \geq 1$, $\Delta \geq d \geq 0$, and $t \geq 0$, let $c(m, d; t) \triangleq \binom{h(m, d, m-1, d) + t - 1}{t}$ denote the number of combinations with repetition of $t$ trees from the family $\mathcal{H}(m, d, m-1, d)$. In Lemma 4, we give a recursive relation for $h(n, \Delta, m_{=}, d_{=})$.
Lemma 4.
For any five integers $n \geq 3$, $m \geq 1$, $\Delta \geq d \geq 0$, and $q$, such that $1 \leq q \leq \lfloor (n-1)/m \rfloor$ with $q \leq \lfloor \Delta/d \rfloor$ when $d \geq 1$, it holds that
(i) $h(n, \Delta, m_{=}, d_{=}) = \sum_{q} c(m, d; q)\, h(n - qm, \Delta, \min\{n - qm - 1, m - 1\}, \Delta)$ if $d = 0$;
(ii) $h(n, \Delta, m_{=}, d_{=}) = \sum_{q} c(m, d; q) \big( h(n - qm, \Delta - dq, m_{=}, \min\{\Delta - dq, d - 1\}) + h(n - qm, \Delta - dq, \min\{n - qm - 1, m - 1\}, \Delta - dq) \big)$ if $d \geq 1$;
(iii) $h(n, \Delta, m_{=}, d_{=}) = \sum_{q} c(m, d; q - 1) \cdot \frac{h(m, d, m-1, d) + q - 1}{q} \cdot h(n - qm, \Delta, \min\{n - qm - 1, m - 1\}, \Delta)$ if $d = 0$; and
(iv) $h(n, \Delta, m_{=}, d_{=}) = \sum_{q} c(m, d; q - 1) \cdot \frac{h(m, d, m-1, d) + q - 1}{q} \cdot \big( h(n - qm, \Delta - dq, m_{=}, \min\{\Delta - dq, d - 1\}) + h(n - qm, \Delta - dq, \min\{n - qm - 1, m - 1\}, \Delta - dq) \big)$ if $d \geq 1$.
Proof. 
Let $H$ be a tree in the family $\mathcal{H}(n, \Delta, m_{=}, d_{=})$. By Lemma 3(i), there exists a unique integer $q$, $1 \leq q \leq \lfloor (n-1)/m \rfloor$ with $q \leq \lfloor \Delta/d \rfloor$ when $d \geq 1$, such that there are exactly $q$ subtrees $H_v$ with $v \in N_H(r_H)$ and $H_v \in \mathcal{H}(m, d, m-1, d)$. Further, by Lemma 3(ii) the residual tree of $H$ belongs to the family $\mathcal{H}(n - qm, \Delta, \min\{n - qm - 1, m - 1\}, \Delta)$ (resp., $\mathcal{H}(n - qm, \Delta - dq, m_{=}, \min\{\Delta - dq, d - 1\}) \cup \mathcal{H}(n - qm, \Delta - dq, \min\{n - qm - 1, m - 1\}, \Delta - dq)$) if $d = 0$ (resp., otherwise). Note that $\mathcal{H}(n - qm, \Delta - dq, m_{=}, \min\{\Delta - dq, d - 1\}) \cap \mathcal{H}(n - qm, \Delta - dq, \min\{n - qm - 1, m - 1\}, \Delta - dq) = \emptyset$. This implies that for a fixed integer $q$ in the range given in the lemma, the number of trees $K$ in the family $\mathcal{H}(n, \Delta, m_{=}, d_{=})$ with exactly $q$ subtrees $K_v \in \mathcal{H}(m, d, m-1, d)$, for $v \in N_K(r_K)$, is
(a) $c(m, d; q)\, h(n - qm, \Delta, \min\{n - qm - 1, m - 1\}, \Delta)$ if $d = 0$; and
(b) $c(m, d; q) \big( h(n - qm, \Delta - dq, m_{=}, \min\{\Delta - dq, d - 1\}) + h(n - qm, \Delta - dq, \min\{n - qm - 1, m - 1\}, \Delta - dq) \big)$ if $d \geq 1$.
Note that, for $m = 1$ and $d = 0$, we have $1 \leq q \leq n - 1$, and by Lemma 2(ii) it holds that $h(n - q, \Delta, 0, \Delta) = 0$ (resp., $h(n - q, \Delta, 0, \Delta) = 1$) if $1 \leq q \leq n - 2$ (resp., otherwise (if $q = n - 1$)). This implies that any tree $H \in \mathcal{H}(n, \Delta, 1_{=}, 0_{=})$ has exactly $q = n - 1$ subtrees $H_v \in \mathcal{H}(1, 0, 0, 0)$, for $v \in N_H(r_H)$. However, observe that for each integer $m \geq 2$ or $d \geq 1$, and $q$ satisfying the conditions given in the lemma, there exists at least one tree $H \in \mathcal{H}(n, \Delta, m_{=}, d_{=})$ such that $H$ has exactly $q$ subtrees $H_v \in \mathcal{H}(m, d, m-1, d)$, for $v \in N_H(r_H)$. Hence, this and case (a) (resp., case (b)) imply Lemma 4(i) (resp., Lemma 4(ii)).
Furthermore, it holds that
$$c(m, d; q) = \frac{(h(m, d, m-1, d) + q - 1)!}{(h(m, d, m-1, d) - 1)!\; q!} = \frac{(h(m, d, m-1, d) + q - 2)!}{(h(m, d, m-1, d) - 1)!\; (q - 1)!} \times \frac{h(m, d, m-1, d) + q - 1}{q} = c(m, d; q - 1) \times \frac{h(m, d, m-1, d) + q - 1}{q}.$$
Hence, Lemma 4(iii) and (iv) follow from Lemma 4(i) and (ii), respectively. □
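The incremental update behind Lemma 4(iii) and (iv) can be checked numerically. The sketch below is illustrative only: the function name `multiset_counts` is hypothetical, and the argument `h` stands for the value $h(m, d, m-1, d)$, assumed to be already known. It produces $c(m, d; t)$ for $t = 0, 1, \ldots, q_{\max}$ using one multiplication and one exact integer division per step.

```python
from math import comb


def multiset_counts(h, q_max):
    """Return [c(0), c(1), ..., c(q_max)] with c(t) = C(h + t - 1, t).

    Uses the update c(q) = c(q - 1) * (h + q - 1) / q; the division is
    exact because every intermediate value is a binomial coefficient.
    """
    c = 1                      # c(0) = 1: the empty selection
    counts = [c]
    for q in range(1, q_max + 1):
        c = c * (h + q - 1) // q
        counts.append(c)
    return counts


# Cross-check against the closed form for h = 3.
assert multiset_counts(3, 4) == [comb(3 + t - 1, t) for t in range(5)]
```

Multiplying before dividing keeps the arithmetic in integers, which matters once the counts exceed the range where floating-point values are exact.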
We design a DP algorithm to compute $h(n, \Delta)$ based on the recursive structures of $h(n, \Delta, m, d)$, $h(n, \Delta, m_{=}, d)$ and $h(n, \Delta, m_{=}, d_{=})$, $0 \leq m \leq n - 1$ and $0 \leq d \leq \Delta$, as given in Lemmas 1 and 4, where $h(n, \Delta) = h(n, \Delta, n - 1, \Delta)$ for $n \geq 1$ and $\Delta \geq 0$.
Lemma 5.
For any four integers $n - 1 \geq m \geq 0$ and $\Delta \geq d \geq 0$, $h(n, \Delta, m, d)$ can be obtained in $O(nm(n + \Delta(n + d \cdot \min\{n, \Delta\})))$ time and $O(nm(\Delta(d + 1) + 1))$ space.
The proof of Lemma 5 follows from Algorithm 1 and Lemma 6.
Corollary 1.
For any two integers $n \geq 1$ and $\Delta \geq 0$, $h(n, \Delta, n - 1, \Delta)$ can be obtained in $O(n^2(n + \Delta(n + \Delta \cdot \min\{n, \Delta\})))$ time and $O(n^2(\Delta^2 + 1))$ space.
Next, for any four integers $n - 1 \geq m \geq 0$ and $\Delta \geq d \geq 0$, we present Algorithm 1 for solving the problem of calculating $h(n, \Delta, m, d)$. In this algorithm, for all integers $1 \leq i \leq n$, $0 \leq j \leq \Delta$, $0 \leq k \leq \min\{i, m\}$, and $0 \leq p \leq \min\{j, d\}$, the variables $h[i, j, k, p]$, $h[i, j, k_{=}, p]$, and $h[i, j, k_{=}, p_{=}]$ store the values of $h(i, j, k, p)$, $h(i, j, k_{=}, p)$, and $h(i, j, k_{=}, p_{=})$, respectively.
Lemma 6.
For any four integers $n - 1 \geq m \geq 0$ and $\Delta \geq d \geq 0$, Algorithm 1 outputs $h(n, \Delta, m, d)$ in $O(nm(n + \Delta(n + d \cdot \min\{n, \Delta\})))$ time and $O(nm(\Delta(d + 1) + 1))$ space.
Proof. 
Correctness: For all integers $1 \leq i \leq n$, $0 \leq j \leq \Delta$, $0 \leq k \leq \min\{i, m\}$, and $0 \leq p \leq \min\{j, d\}$, all the substitutions and if-conditions in Algorithm 1 follow from Lemmas 1, 2, 3 and 4. Furthermore, the values $h[i, j, k, p]$, $h[i, j, k_{=}, p]$, and $h[i, j, k_{=}, p_{=}]$ are computed by the recursive relations given in Lemmas 1 and 4. This implies that Algorithm 1 correctly computes the required value $h[n, \Delta, m, d]$.
Complexity analysis: There are three nested loops over the variables $i$, $j$, and $p$ at line 4, which take $O(n(\Delta(d + 1) + 1))$ time. Following these, there are five nested loops over the variables $i$, $j$, $k$, $p$, and $q$ at lines 5, 6, 7, 8, and 18, respectively. The loop at line 5 is of size $O(n)$, while the loop at line 6 is of size $O(\Delta)$. Similarly, the loops at lines 7 and 8 are of size $O(m)$ and $O(d)$, respectively. The fifth nested loop at line 18 is of size $O(n)$ (resp., $O(\min\{n, \Delta\})$) if $p = 0$ (resp., otherwise). Thus from lines 5–36, Algorithm 1 takes $O(n^2 m)$ (resp., $O(nm\Delta(n + d \cdot \min\{n, \Delta\}))$) time if $\Delta = 0$ (resp., otherwise). Therefore, Algorithm 1 takes $O(nm(n + \Delta(n + d \cdot \min\{n, \Delta\})))$ time.
The algorithm stores three four-dimensional arrays. When $\Delta = 0$, for all integers $1 \leq i \leq n$ and $1 \leq k \leq \min\{i, m\}$ we store $h[i, 0, k, 0]$, $h[i, 0, k_{=}, 0]$ and $h[i, 0, k_{=}, 0_{=}]$, taking $O(nm)$ space. When $\Delta \geq 1$, then for all integers $1 \leq i \leq n$, $0 \leq j \leq \Delta$, $1 \leq k \leq \min\{i, m\}$ and $0 \leq p \leq \min\{j, d\}$ we store $h[i, j, k, p]$, $h[i, j, k_{=}, p]$ and $h[i, j, k_{=}, p_{=}]$, taking $O(nm\Delta(d + 1))$ space. Hence, Algorithm 1 takes $O(nm(\Delta(d + 1) + 1))$ space. □
Algorithm 1 DP-based counting algorithm for $h(n, \Delta, m, d)$
Input: Integers $n - 1 \geq m \geq 0$ and $\Delta \geq d \geq 0$.
Output: $h(n, \Delta, m, d)$.
$h[1, j, 0_{=}, 0_{=}] := h[1, j, 0_{=}, p] := h[1, j, 0, p] := 1$;
$h[i, j, 0_{=}, p] := h[i, j, 0, p] := 0$;
$h[2, j, 1_{=}, p_{=}] := 1$; $h[2, j, 1_{=}, p] := h[2, j, 1, p] := p + 1$
  for each $2 \leq i \leq n$, $0 \leq j \leq \Delta$, $0 \leq p \leq \min\{j, d\}$;
for $i := 3, 4, \ldots, n$ do
  for $j := 0, 1, \ldots, \Delta$ do
    for $k := 1, 2, \ldots, \min\{i, m\}$ do
      for $p := 0, 1, \ldots, \min\{j, d\}$ do
        if $p = 0$ and $k = 1$ then
          $h[i, j, 1_{=}, 0_{=}] := h[i, j, 1_{=}, 0] := h[i, j, 1, 0] := 1$
        else /* $p \geq 1$ or $k \geq 2$ */
          $c := 1$; $h[i, j, k_{=}, p_{=}] := 0$; /* Initialization */
          if $p = 0$ then
            $\ell := \lfloor (i - 1)/k \rfloor$
          else /* $p \geq 1$ */
            $\ell := \min\{\lfloor (i - 1)/k \rfloor, \lfloor j/p \rfloor\}$
          end if;
          for $q := 1, 2, \ldots, \ell$ do
            $c := c \cdot (h[k, p, k - 1, p] + q - 1)/q$;
            if $p = 0$ then
              $h[i, j, k_{=}, p_{=}] := h[i, j, k_{=}, p_{=}] + c \cdot h[i - qk, j, \min\{i - qk - 1, k - 1\}, j]$
            else /* $p \geq 1$ */
              $h[i, j, k_{=}, p_{=}] := h[i, j, k_{=}, p_{=}] + c \cdot \big(h[i - qk, j - pq, k_{=}, \min\{j - pq, p - 1\}] + h[i - qk, j - pq, \min\{i - qk - 1, k - 1\}, j - pq]\big)$
            end if
          end for;
          if $p = 0$ then /* $k \geq 2$ */
            $h[i, j, k_{=}, 0] := h[i, j, k_{=}, 0_{=}]$
          else /* $p \geq 1$ */
            $h[i, j, k_{=}, p] := h[i, j, k_{=}, p - 1] + h[i, j, k_{=}, p_{=}]$
          end if;
          $h[i, j, k, p] := h[i, j, k - 1, p] + h[i, j, k_{=}, p]$
        end if
      end for
    end for
  end for
end for;
output $h[n, \Delta, m, d]$ as $h(n, \Delta, m, d)$.
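The recurrences of Lemmas 1 and 4 can also be evaluated top-down with memoization, which is convenient for checking small values. The following is a minimal sketch and not the authors' implementation: the function names `h_eq_eq`, `h_eq_le`, `h_le` and `h` are hypothetical, the table-filling order of Algorithm 1 is replaced by recursion with a cache, and the boundary values of Lemma 2(iii) and (iv) are not special-cased because, in this sketch, they result from the general recurrence.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def h_eq_eq(n, delta, k, p):
    """h(n, delta, k_=, p_=): rooted trees with Max_v = k and Max_s = p."""
    if k == 0:                                  # Lemma 2(i)
        return 1 if (n == 1 and p == 0) else 0
    q_max = (n - 1) // k                        # Lemma 3(i)
    if p >= 1:
        q_max = min(q_max, delta // p)
    base = h_le(k, p, k - 1, p)                 # rooted trees: k vertices, p loops
    total = 0
    for q in range(1, q_max + 1):
        c = comb(base + q - 1, q)               # combinations with repetition
        rn, rd = n - q * k, delta - q * p       # size and loops of residual tree
        rest = h_le(rn, rd, min(rn - 1, k - 1), rd)
        if p >= 1:                              # extra family in Lemma 4(ii)
            rest += h_eq_le(rn, rd, k, min(rd, p - 1))
        total += c * rest
    return total


@lru_cache(maxsize=None)
def h_eq_le(n, delta, k, d):
    """h(n, delta, k_=, d): Max_v = k and Max_s <= d (Lemma 1(iii), (iv))."""
    return sum(h_eq_eq(n, delta, k, p) for p in range(d + 1))


@lru_cache(maxsize=None)
def h_le(n, delta, m, d):
    """h(n, delta, m, d): Max_v <= m and Max_s <= d (Lemma 1(i), (ii))."""
    return sum(h_eq_le(n, delta, k, d) for k in range(m + 1))


def h(n, delta):
    """Number of non-isomorphic rooted trees with n vertices, delta self-loops."""
    return h_le(n, delta, n - 1, delta)
```

With no self-loops the sketch gives 1, 1, 2, 4, 9, 20 for $n = 1, \ldots, 6$, the familiar counts of rooted trees, and it gives `h(2, 5) == 6`, in line with $h(2, \Delta) = \Delta + 1$. The recursion trades the explicit space bound of Algorithm 1 for brevity, so it is intended for small inputs.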
Theorem 1.
For any two integers $n \geq 1$ and $\Delta \geq 0$, the number of non-isomorphic trees with $n$ vertices and $\Delta$ self-loops can be obtained in $O(n^2(n + \Delta(n + \Delta \cdot \min\{n, \Delta\})))$ time and $O(n^2(\Delta^2 + 1))$ space.
Proof. 
By Jordan [18], we can uniquely consider any tree as a rooted tree by either regarding its unicentroid as the root, or in the case of a bicentroid, by introducing a virtual vertex on the bicentroid and assuming the virtual vertex as the root of the tree. By the definition of a unicentroid, the number of mutually non-isomorphic trees with $n$ vertices, $\Delta$ self-loops and a unicentroid is $h(n, \Delta, \lfloor (n-1)/2 \rfloor, \Delta)$. Further, a tree with $n$ vertices can have a bicentroid only if $n$ is even. This implies that the number of mutually non-isomorphic trees with $n$ vertices and $\Delta$ self-loops is $h(n, \Delta, \lfloor (n-1)/2 \rfloor, \Delta)$ when $n$ is odd. Let $n$ be an even integer. Then any tree $H$ with $n$ vertices, $\Delta$ self-loops and a bicentroid has two connected components, $A$ and $B$, obtained by the removal of the bicentroid such that $A \in \mathcal{H}(n/2, i, n/2 - 1, i)$ and $B \in \mathcal{H}(n/2, \Delta - i, n/2 - 1, \Delta - i)$ for some $0 \leq i \leq \lfloor \Delta/2 \rfloor$, where if $\Delta$ is even then for $i = \Delta/2$, both of the components $A$ and $B$ belong to $\mathcal{H}(n/2, \Delta/2, n/2 - 1, \Delta/2)$.
Note that for any $0 \leq i \leq \lfloor (\Delta - 1)/2 \rfloor$, it holds that
$$\mathcal{H}(n/2, i, n/2 - 1, i) \cap \mathcal{H}(n/2, \Delta - i, n/2 - 1, \Delta - i) = \emptyset.$$
Therefore, when $\Delta$ is odd (resp., even), the number of mutually non-isomorphic trees with $n$ vertices, $\Delta$ self-loops, and a bicentroid is
$$\sum_{i=0}^{\lfloor (\Delta - 1)/2 \rfloor} h(n/2, i, n/2 - 1, i)\, h(n/2, \Delta - i, n/2 - 1, \Delta - i) + \alpha \binom{h(n/2, \Delta/2, n/2 - 1, \Delta/2) + 1}{2},$$
such that $\alpha = 0$ (resp., $\alpha = 1$). Thus, for even $n$, the number of mutually non-isomorphic trees with $n$ vertices and $\Delta$ self-loops is
$$h(n, \Delta, \lfloor (n-1)/2 \rfloor, \Delta) + \sum_{i=0}^{\lfloor (\Delta - 1)/2 \rfloor} h(n/2, i, n/2 - 1, i)\, h(n/2, \Delta - i, n/2 - 1, \Delta - i) + \alpha \binom{h(n/2, \Delta/2, n/2 - 1, \Delta/2) + 1}{2}$$
such that $\alpha = 0$ (resp., $\alpha = 1$) when $\Delta$ is odd (resp., even). Moreover, for each $0 \leq i \leq \Delta$, Algorithm 1 also computes and stores $h(n/2, i, n/2 - 1, i)$ during the calculation of $h(n, \Delta, \lfloor (n-1)/2 \rfloor, \Delta)$, and therefore the required result follows from Lemma 6. □
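The counting formula in the proof of Theorem 1 translates directly into a short function once the rooted counts are available. The sketch below is an illustration under assumptions: the names `h_eq`, `h_le` and `trees` are hypothetical, and the rooted counts are recomputed by a compact memoized form of the recurrences in Lemmas 1 and 4 so that the block runs on its own.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def h_eq(n, delta, k, p):
    """Rooted trees with n vertices, delta loops, Max_v = k and Max_s = p."""
    if k == 0:
        return 1 if (n == 1 and p == 0) else 0
    q_max = (n - 1) // k if p == 0 else min((n - 1) // k, delta // p)
    base = h_le(k, p, k - 1, p)
    total = 0
    for q in range(1, q_max + 1):
        rn, rd = n - q * k, delta - q * p
        rest = h_le(rn, rd, min(rn - 1, k - 1), rd)
        if p >= 1:
            rest += sum(h_eq(rn, rd, k, s) for s in range(min(rd, p - 1) + 1))
        total += comb(base + q - 1, q) * rest
    return total


@lru_cache(maxsize=None)
def h_le(n, delta, m, d):
    """Rooted trees with n vertices, delta loops, Max_v <= m and Max_s <= d."""
    return sum(h_eq(n, delta, k, p) for k in range(m + 1) for p in range(d + 1))


def trees(n, delta):
    """Non-isomorphic (unrooted) trees with n vertices and delta self-loops."""
    total = h_le(n, delta, (n - 1) // 2, delta)          # unicentroid case
    if n % 2 == 0:                                       # bicentroid case
        half = n // 2

        def rooted(i):
            return h_le(half, i, half - 1, i)

        # Halves with different loop counts i < delta - i.
        total += sum(rooted(i) * rooted(delta - i)
                     for i in range((delta - 1) // 2 + 1))
        if delta % 2 == 0:                               # alpha = 1: equal halves
            total += comb(rooted(delta // 2) + 1, 2)
    return total
```

Without self-loops the sketch yields 1, 1, 1, 2, 3, 6, 11 for $n = 1, \ldots, 7$, the known numbers of unlabeled trees. As a small case with loops, `trees(2, 2)` is 2: both loops on one end of the edge, or one loop on each end.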
We implemented the proposed DP algorithm and counted trees with a given number of vertices and self-loops. The experimental results in Table 1 show that the proposed method efficiently counts trees with $n$ vertices and $\Delta$ self-loops.
We next give a lower bound and an upper bound on the number of tree-like polymer topologies with self-loops of a given rank. For this we prove the following results.
Lemma 7.
For an integer $n \geq 2$, there exists at least one tree-like polymer with $n$ vertices and $\Delta$ self-loops if $\Delta \geq \lceil n/2 \rceil + 1$.
Proof. 
Consider a tree $T$ of $n$ vertices of diameter $\lfloor n/2 \rfloor$ such that $T$ contains a path of length $\lfloor n/2 \rfloor$, in which each non-end vertex has degree at least 3. Observe that when $n$ is even, the tree $T$ has exactly $n/2 - 1$ vertices of degree 3, and hence $n - n/2 + 1 = n/2 + 1$ vertices of degree less than 3. When $n$ is odd, the tree $T$ has $\lceil n/2 \rceil - 3$ vertices of degree 3 and one vertex of degree 4. Thus, in this case, the number of vertices of degree less than 3 is $n - \lceil n/2 \rceil + 2 = 2\lceil n/2 \rceil - 1 - \lceil n/2 \rceil + 2 = \lceil n/2 \rceil + 1$. This implies that $T$ can be transformed into a polymer with $\lceil n/2 \rceil + 1$ self-loops by assigning a self-loop to each vertex of degree less than 3. Hence, $\lceil n/2 \rceil + 1$ self-loops are sufficient to get a tree-like polymer with $n$ vertices. □
For two integers $n \geq 1$ and $\Delta \geq 0$, let $t(n, \Delta)$ denote the number of trees with $n$ vertices and $\Delta$ self-loops. For $r \geq 1$, let $p(r)$ denote the number of tree-like polymers with self-loops and no multi-edges of rank $r$. Observe that a tree with $n$ vertices and $k$ self-loops at each vertex is a polymer with $n$ vertices of cycle rank $kn$. From this fact and Lemma 7 it holds that
$$\sum_{n, k \in \mathbb{Z}^{+} :\ nk = r} t(n, 0) \ \leq\ p(r) \ \leq \sum_{n \in \mathbb{Z}^{+} :\ \lceil n/2 \rceil + 1 \leq r} t(n, r).$$
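As a small worked instance of the lower bound, take $r = 4$. The lower bound sums $t(n, 0)$ over all factorizations $nk = r$ into positive integers, here $(n, k) \in \{(1, 4), (2, 2), (4, 1)\}$. The values $t(1, 0) = 1$, $t(2, 0) = 1$ and $t(4, 0) = 2$ used below are the standard counts of unlabeled trees on 1, 2 and 4 vertices and are not quoted from the paper's tables.

```latex
\begin{align*}
p(4) \;\geq\; \sum_{\substack{n, k \in \mathbb{Z}^{+} \\ nk = 4}} t(n, 0)
     \;=\; t(1, 0) + t(2, 0) + t(4, 0)
     \;=\; 1 + 1 + 2
     \;=\; 4 .
\end{align*}
```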

4. Conclusions

This paper presented an efficient method to count the number of all mutually non-isomorphic trees with a given number of vertices and self-loops. The proposed method is based on dynamic programming where we count the number of all mutually non-isomorphic rooted trees with a given number $n$ of vertices and $\Delta$ self-loops in $O(n^2(n + \Delta(n + \Delta \cdot \min\{n, \Delta\})))$ time and $O(n^2(\Delta^2 + 1))$ space. As an application of our results, we gave lower and upper bounds on the number of tree-like polymer topologies with a given cycle rank. This is an interesting application of DP to objects such as trees, and offers the advantage of getting the size of the entire solution space at low computational complexity without explicitly generating each object.
An interesting direction for future research is to efficiently generate all mutually non-isomorphic trees with a given number of vertices and self-loops by using the result from the developed counting method. Further, another possible extension of this research is to count and generate all mutually non-isomorphic tree-like polymer topologies with a given number of vertices and self-loops.

Author Contributions

Conceptualization, N.A.A. and H.N.; funding acquisition, N.A.A.; methodology, N.A.A. and H.N.; software, N.A.A.; supervision, H.N.; validation, N.A.A., A.S. and H.N.; writing—original draft, N.A.A.; writing—review and editing, N.A.A. and A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially funded by JSPS KAKENHI Grant Number 18J23484.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Pólya, G. Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta Math. 1937, 68, 145–254.
  2. Pólya, G.; Read, R.C. Combinatorial Enumeration of Groups, Graphs, and Chemical Compounds; Springer Science & Business Media: New York, NY, USA, 2012.
  3. Blum, L.C.; Reymond, J.L. 970 million druglike small molecules for virtual screening in the chemical universe database GDB-13. J. Am. Chem. Soc. 2009, 131, 8732–8733.
  4. Azam, N.A.; Chiewvanichakorn, R.; Zhang, F.; Shurbevski, A.; Nagamochi, H.; Akutsu, T. A method for the inverse QSAR/QSPR based on artificial neural networks and mixed integer linear programming. In Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies—Volume 3: Bioinformatics, Valletta, Malta, 24–26 February 2020.
  5. Ito, R.; Azam, N.A.; Wang, C.; Shurbevski, A.; Nagamochi, H.; Akutsu, T. A novel method for the inverse QSAR/QSPR to monocyclic chemical compounds based on artificial neural networks and integer programming. In Advances in Computer Vision and Computational Biology; Springer Nature-Research Book Series; Springer: Berlin/Heidelberg, Germany, 2020.
  6. Zhu, J.; Wang, C.; Shurbevski, A.; Nagamochi, H.; Akutsu, T. A novel method for inference of chemical compounds of cycle index two with desired properties based on artificial neural networks and integer programming. Algorithms 2020, 13, 124.
  7. Méndez-Lucio, O.; Baillif, B.; Clevert, D.A.; Rouquié, D.; Wichard, J. De novo generation of hit-like molecules from gene expression signatures using artificial intelligence. Nat. Commun. 2020, 11, 10.
  8. Lim, J.; Hwang, S.Y.; Moon, S.; Kim, S.; Kim, W.Y. Scaffold-based molecular design with a graph generative model. Chem. Sci. 2020, 11, 1153–1164.
  9. Meringer, M.; Schymanski, E.L. Small molecule identification with MOLGEN and mass spectrometry. Metabolites 2013, 3, 440–462.
  10. Benecke, C.; Grund, R.; Hohberger, R.; Kerber, A.; Laue, R.; Wieland, T. MOLGEN+, a generator of connectivity isomers and stereoisomers for molecular structure elucidation. Anal. Chim. Acta 1995, 314, 141–147.
  11. Available online: http://sunflower.kuicr.kyoto-u.ac.jp/tools/enumol2/ (accessed on 4 July 2020).
  12. Peironcely, J.E.; Rojas-Chertó, M.; Fichera, D.; Reijmers, T.; Coulier, L.; Faulon, J.L.; Hankemeier, T. OMG: Open molecule generator. J. Cheminf. 2012, 4, 21.
  13. Vogt, M.; Bajorath, J. Chemoinformatics: A view of the field and current trends in method development. Bioorg. Med. Chem. 2012, 20, 5317–5323.
  14. Haruna, T.; Horiyama, T.; Shimokawa, K. On the enumeration of polymer topologies. IPSJ SIG Tech. Rep. 2017, 2017-Al-162, 1–5.
  15. Tezuka, Y.; Oike, H. Topological polymer chemistry. Prog. Polym. Sci. 2002, 27, 1069–1122.
  16. Galina, H.; Sysło, M.M. Some applications of graph theory to the study of polymer configuration. Discret. Appl. Math. 1988, 19, 167–176.
  17. Zimm, B.H.; Stockmayer, W.H. The dimensions of chain molecules containing branches and rings. J. Chem. Phys. 1949, 17, 1301–1314.
  18. Jordan, C. Sur les assemblages de lignes. J. Reine Angew. Math. 1869, 70, 81.

Figure 1. The chemical compound Remdesivir $\mathrm{C_{27}H_{35}N_6O_8P}$ and its polymer topology: (a) chemical structure of Remdesivir $\mathrm{C_{27}H_{35}N_6O_8P}$ obtained from the PubChem database; (b) the polymer topology of Remdesivir with six vertices, two multi-edges of multiplicity 2, one self-loop and cycle rank 4.

Figure 2. An illustration of a residual tree, where $H \in \mathcal{H}(n, \Delta, m_{=}, d_{=})$ and the residual tree of $H$ is shown by dashed lines.

Table 1. Experimental results of the counting method.

| $(n, \Delta)$ | Number of Trees | Time [s] |
|---|---|---|
| $(10, 0)$ | 106 | 0.000173 |
| $(20, 0)$ | 823,065 | 0.00048 |
| $(10, 5)$ | 91,037 | 0.001193 |
| $(10, 30)$ | 6,629,790,712 | 0.00881 |
| $(20, 10)$ | 5,143,681,226,004 | 0.006869 |
| $(30, 10)$ | 2,547,562,522,909,694,331 | 0.015901 |
