# A Set-Theoretic Approach to Modeling Network Structure

## Abstract


`Union` and `meet` are essential binary operators; `contained_in` is the basic relational comparator. The interior $\mathcal{I}$ is shown to have desirable formal properties and to provide an effective way of revealing “communities” in social networks. A series of networks randomly generated from $\mathcal{I}$ is compared with the original network, $\mathcal{N}$.

## 1. Introduction

**nodes** and L is a set of unordered pairs $\{x,y\}\subseteq N$, called **links** (In graph theory, these unordered pairs are called “edges”. This seems to be derived from the edges of the solid “dodecahedron puzzle” of Sir William Hamilton (1857) and retained through inertia. However, since in social networks they connect individuals, it seems more appropriate to call them “links”) [1,2]. However, although textbook network theory is almost always set-based, virtually all computer network algorithms are algebraic [3,4]. Any network can be represented by its adjacency matrix, ${A}_{n,n}$, where ${a}_{i,j}=1$ if $\{i,j\}$ is a link and 0 otherwise. There is an abundance of matrix algorithms one can use, such as eigenvector evaluation [3]. In this paper, we supplement these matrix-based algorithms. The common goal is to describe the nature of a network in terms of fundamental properties. A matrix-based approach yields numeric properties; the set-based approach of this paper yields set-theoretic properties.
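To make the contrast between the two representations concrete, the following is a minimal Python sketch that builds both the adjacency matrix and a set-based neighbor table from the same list of links. The function names `adjacency_matrix` and `neighbor_sets` are illustrative assumptions, not names used in the paper.

```python
def adjacency_matrix(nodes, links):
    """Return A with a[i][j] = 1 if {nodes[i], nodes[j]} is a link, else 0."""
    index = {v: i for i, v in enumerate(nodes)}
    a = [[0] * len(nodes) for _ in nodes]
    for x, y in links:
        a[index[x]][index[y]] = 1
        a[index[y]][index[x]] = 1  # links are unordered pairs
    return a


def neighbor_sets(nodes, links):
    """Return, for each node, the set of nodes it is linked to."""
    nbrs = {v: set() for v in nodes}
    for x, y in links:
        nbrs[x].add(y)
        nbrs[y].add(x)
    return nbrs


nodes = ["a", "b", "c"]
links = [("a", "b"), ("b", "c")]
matrix = adjacency_matrix(nodes, links)
sets = neighbor_sets(nodes, links)
```

The matrix form suits numeric algorithms such as eigenvector computation, while the set form supports the union, meet and containment tests used in the rest of the text.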

## 2. The Interior

**operator** $\tau :{2}^{S}\to {2}^{S}$ is a function which maps subsets of S into subsets of S. We denote operators by Greek letters and use postfix notation, as in $Y.\tau $, where $Y\subseteq S$. An operator $\phi $ is said to be a **closure** operator if, for all $X,Y\subseteq S$, (C1) $Y\subseteq Y.\phi $ (expansive), (C2) $X\subseteq Y$ implies $X.\phi \subseteq Y.\phi $ (monotone) and (C3) $Y.\phi .\phi =Y.\phi $ (idempotent). Closure operators are a staple of topological mathematics.
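The three closure axioms can be checked by brute force on a small ground set. The sketch below is only an illustration: `is_closure_operator` and the example operator `interval_closure` (which fills in every integer between the minimum and maximum of a set) are hypothetical names chosen here, not operators from the paper.

```python
from itertools import combinations


def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_closure_operator(op, universe):
    """Test (C1) expansive, (C2) monotone and (C3) idempotent exhaustively."""
    subsets = powerset(universe)
    for y in subsets:
        if not y <= op(y):           # C1: Y is contained in Y.phi
            return False
        if op(op(y)) != op(y):       # C3: Y.phi.phi == Y.phi
            return False
    for x in subsets:
        for y in subsets:
            if x <= y and not op(x) <= op(y):  # C2: monotone
                return False
    return True


def interval_closure(y):
    """Hypothetical example: fill in all integers between min(y) and max(y)."""
    return frozenset(range(min(y), max(y) + 1)) if y else frozenset()


S = frozenset(range(4))
ok = is_closure_operator(interval_closure, S)          # satisfies C1-C3
bad = is_closure_operator(lambda y: S - y, S)          # complement is not expansive
```

The exhaustive check is exponential in the size of the ground set, so it is only useful for sanity-testing a candidate operator on toy examples.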

**interior** operator. We use $\iota $ to denote interior operators and $\phi $ to denote closure operators; they are similar, except that one is contractive while the other is expansive.

**neighborhood** of Y is $Y.\eta =\{z \mid \exists y\in Y,\{y,z\}\in L\}\cup Y$. (In graph theory, $Y.\eta $ is sometimes called the “closed neighborhood” of Y, and denoted $N[Y]$, while $N(Y)=Y.\eta -Y$ is called the “open neighborhood” [1,2]). Finally, since all operators map sets into sets, even when we are talking about the neighborhood of a single node, for example z in (1) below, we express it as $\{z\}.\eta $. A **neighborhood closure** operator, ${\phi}_{\eta}$, on $\mathcal{N}$ can be defined by

#### 2.1. The Network Interior

**irreducible** if $\{y\}.{\phi}_{\eta}=\{y\}$. A sub-network, $\mathcal{I}\subseteq \mathcal{N}$, of irreducible nodes is called the network’s **interior**. In the remainder of this section we define an operator, $\omega $, which reduces any network to its irreducible core, and prove that it is almost an interior operator.

**belongs** to y. We can remove z from N, together with all its connections, and add z to $\{y\}.\beta $, the set of all nodes belonging to $\{y\}$. This set $\{y\}.\beta $ is called its **$\beta $-set**. Of course, $y\in \{y\}.\beta $. The cardinality $|\{y\}.\beta |$ is called its **$\beta $-count**.

`reduce`, given in Pseudocode I, was used to implement a process $\omega $ that reduces any network $\mathcal{N}$ to its irreducible core, $\mathcal{I}=\mathcal{N}.\omega $.

    while there exist reducible nodes {
        reducible = 0
        for_each {y} in N {
            for_each {z} in {y}.nbhd - {y} {
                if ({z}.nbhd contained_in {y}.nbhd) {
                    // z is subsumed by y
                    remove z from network
                    {y}.beta = {y}.beta union {z}.beta
                    reducible = 1 } } } }

Pseudocode I, $\omega $, reduce_network
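Pseudocode I can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: it assumes that `nbhd` denotes the closed neighborhood (the node plus everything linked to it), that node labels are sortable, and that nodes are scanned in sorted order. The names `closed_nbhd` and `reduce_network` are chosen here for the example.

```python
def closed_nbhd(adj, y):
    """{y}.eta: y together with every node linked to y."""
    return adj[y] | {y}


def reduce_network(links):
    """Remove every node z whose closed neighborhood is contained in that
    of a neighbor y, folding z's beta-set into y's, until none remain."""
    adj = {}
    for x, z in links:
        adj.setdefault(x, set()).add(z)
        adj.setdefault(z, set()).add(x)
    beta = {y: {y} for y in adj}          # each node initially belongs to itself

    reducible = True
    while reducible:
        reducible = False
        for y in sorted(adj):             # ASSUMPTION: a fixed, sorted scan order
            if y not in adj:              # y was removed earlier in this pass
                continue
            for z in sorted(adj[y]):      # snapshot of y's current neighbors
                if closed_nbhd(adj, z) <= closed_nbhd(adj, y):
                    # z is subsumed by y: unlink it and move its beta-set to y
                    for w in adj.pop(z):
                        adj[w].discard(z)
                    beta[y] |= beta.pop(z)
                    reducible = True
    return adj, beta


# A chordless 4-cycle with one pendant node attached to node 1.
interior, beta_sets = reduce_network([(1, 2), (2, 3), (3, 4), (4, 1), (1, 5)])
```

In this example the pendant node 5 is subsumed by node 1, while the chordless 4-cycle cannot be reduced further. Because the outcome of a single pass depends on which subsumed node is met first, a different scan order may remove different (though structurally equivalent) nodes.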

**Proposition 1.**

**Proof.**

**Proposition 2.**

**Proof.**

**path** $\rho ({y}_{0},{y}_{n})$ of length n. It is often easier to describe a path in terms of its nodes, $\dot{\rho}$ rather than $\overline{\rho}$, which is more precise. By $|\rho (x,z)|$, we mean the length of the path independent of whether we are counting nodes or links.

**bridge** if there exists a path $\overline{\rho}({y}_{i},{y}_{k})\in L$ where $(k-i) \bmod n\ne 1$ [2]. If the path consists of a single link, it is called a **chord**. If C has no such chords, it is said to be a **chordless** cycle. (Graphs in which every cycle of length $\ge 4$ must have a chord are called “chordal graphs” [7].)
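As a small illustration of the chord definition, the sketch below lists the links that join two nodes of a cycle which are not consecutive on it. It assumes the cycle is given as a list of its nodes in order; the function name `chords` is hypothetical.

```python
def chords(cycle, links):
    """Return the links joining two non-consecutive nodes of the cycle."""
    n = len(cycle)
    link_set = {frozenset(link) for link in links}
    found = []
    for i in range(n):
        for k in range(i + 1, n):
            # positions i and k are consecutive if they differ by 1 around the cycle
            consecutive = (k - i) % n in (1, n - 1)
            if not consecutive and frozenset((cycle[i], cycle[k])) in link_set:
                found.append((cycle[i], cycle[k]))
    return found


square = ["a", "b", "c", "d"]
square_links = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
no_chord = chords(square, square_links)                  # chordless cycle
one_chord = chords(square, square_links + [("a", "c")])  # a-c cuts across the cycle
```

A cycle is chordless exactly when this list is empty.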

**Proposition 3.**

**Proof.**

**Proposition 4.**

**Proof.**

**Corollary 1.**

**rank** [9]. If the network is projected onto a planar representation, then counting those cycles without a bridge yields the rank.

#### 2.2. Reduction Performance

## 3. Network Properties

`count_triangles`, whose code is given in Pseudocode II.

    k_total = 0
    for_each link {x, z} in L {
        MEET = {x}.nbhd meet {z}.nbhd
        {x, z}.k_count = cardinality_of(MEET)
        k_total = k_total + {x, z}.k_count }
    n_triangles = k_total/3

Pseudocode II, count_triangles
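A Python sketch of Pseudocode II is given below. One assumption is made explicit: the meet is taken over the neighbors of x and z excluding x and z themselves, so that each common neighbor corresponds to exactly one triangle through the link and dividing the total by 3 gives the triangle count. The input is assumed to list each link once.

```python
def count_triangles(links):
    """Return (n_triangles, k_count), where k_count maps each link to the
    number of nodes adjacent to both of its end points."""
    adj = {}
    for x, z in links:
        adj.setdefault(x, set()).add(z)
        adj.setdefault(z, set()).add(x)

    k_count = {}
    for x, z in links:
        # ASSUMPTION: open neighborhoods, so x and z are not counted in the meet
        meet = adj[x] & adj[z]
        k_count[frozenset((x, z))] = len(meet)

    k_total = sum(k_count.values())
    # every triangle is seen once from each of its three links
    return k_total // 3, k_count


# Complete graph on four nodes: every link lies on two triangles.
k4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
n_triangles, k_count = count_triangles(k4)
```

Each link is visited once; the cost of the set intersection depends on the sizes of the two neighborhoods involved.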

`count_triangles` is linear, or $O(|L|)$.

**shortest path**(s) between x and z. The path length $|\sigma (x,z)|$ is also known as the **distance**, $d(x,z)$, between x and z [1,2]. The *diameter* of the network, $diameter(\mathcal{N})$, is the maximal distance $d(x,z)$ over all $x,z\in N$. The *eccentricity* of a node x is $e(x)=\max_{z\in N} d(x,z)$. The *radius*, $r(\mathcal{N})$, of the network is the minimum eccentricity over all nodes [2].
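These definitions translate directly into a breadth-first search. The sketch below assumes a connected network stored as a dictionary of neighbor sets; the function names are illustrative only.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search: d(source, z) for every z reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for z in adj[x]:
            if z not in dist:
                dist[z] = dist[x] + 1
                queue.append(z)
    return dist


def eccentricities(adj):
    """e(x) = max over z of d(x, z), assuming the network is connected."""
    return {x: max(distances_from(adj, x).values()) for x in adj}


def diameter(adj):
    return max(eccentricities(adj).values())


def radius(adj):
    return min(eccentricities(adj).values())


# A path a - b - c - d.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
ecc = eccentricities(path)
```

For the four-node path, the two end nodes have eccentricity 3 and the two inner nodes have eccentricity 2, so the diameter is 3 and the radius is 2.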

#### 3.1. Communities

**communities**, arise from the social phenomenon called **triadic closure** [14]. It is known that, in many social contexts, if x is connected to y and y is connected to z, then x is likely to be connected to z. Even though triadic closure is not really a closure operator, its principle has been identified on many repeated occasions [11,15]. (As normally encountered, triadic closure is not idempotent. Applied literally, the triadic closure of any network would be the complete graph/network on its n nodes).

**k-truss** [18]. A connected subset of triangles could be tree-structured, so it is common to specify that a k-truss is a connected collection of links with a $k\_count\ge k$, where the $k\_count$ of a link $\{x,z\}$ is $|\{x\}.\eta \cap \{z\}.\eta |$ as in Pseudocode II. If $k\_count=2$, the Karate network of Figure 1 has just one 2-truss, consisting of links connecting the nodes {1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 14, 30, 31, 33, 34} or just less than half the network. It has two 3-trusses connecting the nodes {1, 2, 3, 4, 8, 14} and {9, 24, 33, 34}. The small network of Figure 3 has two 2-trusses of links connecting the nodes {a, e, d, g, h, l, m, p, q, r, s} and {b, f, j, k, o} and four small 3-trusses, which are {b, j}, {d, g, l, m}, {p, q} and {r, s}. There are 23 2-trusses in the Newman network and each is large; however, there are only three 8-trusses. They are {Arenas, Dido-Guilera}, {Mano, Occaletti} and {Barabasi, Jeong, Oltavi, Raven, Schubert}; however, Arenas, Mano, Oltavi, Raven, and Schubert are not elements of the interior and thus not shown in Figure 5.
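One way to extract such trusses is sketched below. It rests on two assumptions that the text does not spell out: a k-truss is read as a connected set of links whose $k\_count$ is at least k, and the $k\_count$ of a link is the number of nodes adjacent to both end points (the end points themselves excluded). Links are treated as connected when they share a node. The function name `k_trusses` is hypothetical.

```python
def k_trusses(links, k):
    """Return the node sets of connected groups of links with k_count >= k."""
    adj = {}
    for x, z in links:
        adj.setdefault(x, set()).add(z)
        adj.setdefault(z, set()).add(x)

    # ASSUMPTION: k_count = number of common neighbors of the two end points
    strong = [(x, z) for x, z in links if len(adj[x] & adj[z]) >= k]

    # union-find over the end points of the retained links
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for x, z in strong:
        parent.setdefault(x, x)
        parent.setdefault(z, z)
        parent[find(x)] = find(z)

    groups = {}
    for v in parent:
        groups.setdefault(find(v), set()).add(v)
    return sorted(groups.values(), key=sorted)


# Two complete graphs on four nodes joined by a single link (4, 5).
k4a = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
k4b = [(5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]
trusses = k_trusses(k4a + k4b + [(4, 5)], 2)
```

In the example, the joining link lies on no triangle, so for k = 2 the two complete subgraphs come out as separate trusses.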

#### 3.2. Important Nodes

**center** of $\mathcal{N}$ [1,2]; they are “closest” to all other nodes. It is well known that this subset of nodes must be edge connected. One may assume that these nodes in the “center” of a network are “important” nodes.

#### 3.3. Network Properties Preserved by the Interior

**Lemma 1.**

**Proof.**

**Lemma 2.**

**Proof.**

**Lemma 3.**

**Proof.**

**$\beta $-connected** if there exist $x,y\ne {x}_{0},{y}_{0}$ where $x\in \{{x}_{0}\}.\beta $, $y\in \{{y}_{0}\}.\beta $ and $\{x,y\}\in L$. The preceding lemmas describe links that must exist if $\beta $-sets are connected. These are illustrated in Figure 6.

`reduce` can be very dependent on this order.

**Proposition 5.**

**Proof.**

#### 3.4. Network Centrality

**Proposition 6.**

**Proof.**

**Proposition 7.**

**Proof.**

## 4. Network Generation by Expansion

`expand` to implement an operator $\epsilon $ that generates new nodes relative to the interior is given in Pseudocode III. ($\epsilon $, as shown here, is a round-robin procedure expanding one node in a $\beta $-set at a time. An alternative, and slightly faster, process can be found in [22]).

    still_expanding = 1
    while still_expanding {
        still_expanding = 0
        for_each y in NODES {
            if (y.beta_count > 1) {
                z = new_node()
                add z to NODES
                chosen = choose_subset(y.nbhd)
                n_chosen = cardinality_of(chosen)
                // distribute some of y.beta_count to z
                increment = y.beta_count/(n_chosen+1)
                y.beta_count = y.beta_count - increment
                z.beta_count = 1 + increment
                add (y, z) to LINKS
                // link z to chosen nodes in y.nbhd
                for_each x in chosen {
                    add (x, z) to LINKS }
                still_expanding = 1 } } }

Pseudocode III, $\epsilon $, expand_network
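The round-robin expansion can be sketched in Python as below. This is a sketch under stated assumptions rather than a transcription: `choose_subset` is taken to be an independent coin flip over y's current neighbors, new nodes are labeled `("new", i)`, and the $\beta$-count bookkeeping is adjusted so that the sum of all $\beta$-counts is conserved (the new node takes 1 for itself plus a share, and y always keeps at least 1). That adjustment is made here so the loop provably terminates; the text does not confirm it.

```python
import random


def expand_network(links, beta_count, rng=None):
    """Grow the network until every node has beta-count 1.

    links      -- iterable of (x, z) pairs
    beta_count -- dict node -> beta-count (nodes not listed default to 1)
    rng        -- optional random.Random; seeded by default for repeatability
    """
    rng = rng or random.Random(0)
    adj = {}
    for x, z in links:
        adj.setdefault(x, set()).add(z)
        adj.setdefault(z, set()).add(x)
    for y in beta_count:
        adj.setdefault(y, set())
    beta = {y: beta_count.get(y, 1) for y in adj}

    fresh = 0
    still_expanding = True
    while still_expanding:
        still_expanding = False
        for y in list(adj):                      # nodes present at the start of this pass
            if beta[y] > 1:
                z = ("new", fresh)               # hypothetical label for the new node
                fresh += 1
                # ASSUMPTION: choose_subset flips a fair coin for each neighbor of y
                chosen = [x for x in sorted(adj[y], key=repr) if rng.random() < 0.5]
                # ASSUMPTION: conserve the total beta-count; y keeps at least 1
                increment = (beta[y] - 2) // (len(chosen) + 1)
                beta[z] = 1 + increment
                beta[y] -= beta[z]
                # link z to y and to the chosen nodes in y's neighborhood
                adj[z] = {y, *chosen}
                adj[y].add(z)
                for x in chosen:
                    adj[x].add(z)
                still_expanding = True
    return adj, beta


# A chordless 4-cycle whose nodes 1 and 3 stand for 3 and 2 nodes respectively.
expanded, counts = expand_network([(1, 2), (2, 3), (3, 4), (4, 1)], {1: 3, 3: 2})
```

Because every new node is linked only to y and to nodes already in y's neighborhood, its closed neighborhood is contained in that of y at the moment it is created, which is exactly the condition under which `reduce` would remove it again.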

**Proposition 8.**

**Proof.**

## 5. Observations

`reduce`, `count_triangles` and `expand` was more important. Programming with set operators is not widespread. However, these set-theoretic procedures appear to be fast and quite scalable. The reduction, $\omega $, of the Newman co-authorship network to Figure 5 took 0.008 s; reduction of the smaller networks (Figure 1 and Figure 3) took less than 0.001 s each. Calculation of the eigenvectors of Figure 5 exceeded 5 s. Such anecdotal evidence is suggestive, but far from definitive.

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Agnarsson, G.; Greenlaw, R. Graph Theory: Modeling, Applications and Algorithms; Prentice Hall: Upper Saddle River, NJ, USA, 2007.
2. Harary, F. Graph Theory; Addison-Wesley: Reading, MA, USA, 1969.
3. Newman, M.E.J. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E **2006**, 74, 1–22.
4. Newman, M. Networks; Oxford University Press: Oxford, UK, 2018.
5. Orlandic, R.; Pfaltz, J.L.; Taylor, C. A Functional Database Representation of Large Sets of Objects. In Proceedings of the 25th Australasian Database Conference (ADC 2014), Brisbane, Australia, 14–16 July 2014; Wang, H., Saraf, M.A., Eds.; Springer: Cham, Switzerland, 2014; pp. 189–196.
6. Zachary, W.W. An Information Flow Model for Conflict and Fission in Small groups. J. Anthropol. Res. **1977**, 33, 452–473.
7. McKee, T.A. How Chordal Graphs Work. Bull. ICA **1993**, 9, 27–39.
8. White, N. Theory of Matroids; Cambridge University Press: Cambridge, UK, 1986.
9. Pfaltz, J.L. Cycle Systems. Math. Appl. **2020**, 9, 55–66.
10. Girvan, M.; Newman, M.E.J. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA **2002**, 99, 7821–7826.
11. Newman, M.E.J. Detecting community structure in networks. Eur. Phys. J. B **2004**, 38, 321–330.
12. Newman, M.E.J. The structure and function of complex networks. SIAM Rev. **2003**, 45, 167–256.
13. Tsourakakis, C.E.; Drineas, P.; Michelakis, E.; Koutis, I.; Faloutsos, C. Spectral counting of triangles via element-wise sparsification and triangle-based link recommendation. Soc. Netw. Anal. Min. **2011**, 1, 75–81.
14. Mollenhorst, G.; Völker, B.; Flap, H. Shared contexts and triadic closure in core discussion networks. Soc. Netw. **2012**, 34, 292–302.
15. Granovetter, M.S. The Strength of Weak Ties. Am. J. Sociol. **1973**, 78, 1360–1380.
16. Newman, M.E.J. Modularity and community structure in networks. Proc. Natl. Acad. Sci. USA **2006**, 103, 8577–8582.
17. Fiedler, M. Algebraic Connectivity of Graphs. Czechoslovak Math. J. **1973**, 23, 298–305.
18. McCulloh, I.; Savas, O. k-Truss Network Community Detection. In Proceedings of the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), The Hague, The Netherlands, 7–10 December 2020.
19. Freeman, L.C. Centrality in Social Networks, Conceptual Clarification. Soc. Netw. **1978**, 1, 215–239.
20. Newman, M.E.J. A measure of betweenness centrality based on random walks. Soc. Netw. **2005**, 27, 39–45.
21. Brandes, U. A Faster Algorithm for Betweenness Centrality. J. Math. Sociol. **2001**, 25, 163–177.
22. Pfaltz, J.L. Computational Processes that Appear to Model Human Memory. In Proceedings of the 4th International Conference, Algorithms for Computational Biology (AlCoB 2017), Aveiro, Portugal, 5–6 June 2017; pp. 85–99.
23. Pfaltz, J.; Šlapal, J. Transformations of discrete closure systems. Acta Math. Hung. **2013**, 138, 386–405.
24. Kempner, Y.; Levit, V.E. Violator spaces vs closure spaces. Eur. J. Comb. **2019**, 80, 203–213.
25. Seierstad, C.; Opsahl, T. For the few not the many? The effects of affirmative action on presence, prominence, and social capital of female directors in Norway. Scand. J. Manag. **2011**, 27, 44–54.

**Figure 1.** The interior $\mathcal{I}$ of ${\mathcal{N}}_{1}$, the Karate network, is shown with bolder links.

**Figure 3.** A small network, ${\mathcal{N}}_{2}$, of 21 nodes. Interior links are bolder. $\beta $-sets are dotted.

**Figure 5.** The interior $\mathcal{I}$ of ${\mathcal{N}}_{3}$, the 363 node co-authorship network of Newman [12].

| | $\lvert N\rvert$ | $\lvert L\rvert$ | Density | Triangles | 2-Trusses | 3-Trusses |
|---|---|---|---|---|---|---|
| ${\mathcal{N}}_{2}$ | 21 | 44 | 2.095 | 21 | 2 | 4 |
| exp.1 | 21 | 49 | 2.333 | 31 | 1 | 3 |
| exp.2 | 21 | 46 | 2.190 | 25 | 2 | 3 |
| exp.3 | 21 | 37 | 1.762 | 13 | 2 | 2 |

| | a | b | c | d | e | f | g | h | i | j | k |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $\mathcal{N}$ | 0.179 | 0.182 | 0.123 | 0.350 | 0.293 | 0.155 | 0.226 | 0.234 | 0.194 | 0.231 | 0.120 |
| | A0 | B0 | C0 | D0 | e | f | E0 | h | i | j | F0 |
| exp.1 | 0.170 | 0.295 | 0.225 | 0.033 | 0.355 | 0.129 | 0.053 | 0.306 | 0.202 | 0.265 | 0.162 |
| exp.2 | 0.048 | 0.095 | 0.203 | 0.021 | 0.262 | 0.183 | 0.026 | 0.192 | 0.254 | 0.285 | 0.303 |
| exp.3 | 0.125 | 0.212 | 0.056 | 0.187 | 0.265 | 0.120 | 0.093 | 0.353 | 0.270 | 0.243 | 0.195 |

| | l | m | n | o | p | q | r | s | t | u |
|---|---|---|---|---|---|---|---|---|---|---|
| $\mathcal{N}$ | 0.291 | 0.293 | 0.159 | 0.174 | 0.271 | 0.220 | 0.280 | 0.187 | 0.104 | 0.022 |
| | G0 | m | n | o | p | H0 | r | I0 | J0 | K0 |
| exp.1 | 0.054 | 0.190 | 0.387 | 0.133 | 0.112 | 0.265 | 0.224 | 0.164 | 0.104 | 0.272 |
| exp.2 | 0.192 | 0.118 | 0.379 | 0.271 | 0.142 | 0.253 | 0.325 | 0.307 | 0.017 | 0.115 |
| exp.3 | 0.144 | 0.236 | 0.336 | 0.208 | 0.163 | 0.056 | 0.369 | 0.276 | 0.132 | 0.132 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Pfaltz, J.L. A Set-Theoretic Approach to Modeling Network Structure. *Algorithms* **2021**, *14*, 153.
https://doi.org/10.3390/a14050153
