Exploring the Entropy Complex Networks with Latent Interaction

Centeno Mejia, Alex Arturo; Bravo Gaete, Moisés Felipe

doi:10.3390/e25111535

by Alex Arturo Centeno Mejia ¹,* and Moisés Felipe Bravo Gaete ²

¹ Doctorado en Modelamiento Matemático Aplicado, Universidad Católica del Maule, Avenida San Miguel, Talca 3605, Chile

² Departamento de Matemáticas, Física y Estadística, Facultad de Ciencias Básicas, Universidad Católica del Maule, Avenida San Miguel, Talca 3605, Chile

* Author to whom correspondence should be addressed.

Entropy 2023, 25(11), 1535; https://doi.org/10.3390/e25111535

Submission received: 16 August 2023 / Revised: 16 October 2023 / Accepted: 6 November 2023 / Published: 11 November 2023

(This article belongs to the Section Complexity)

Abstract

In the present work, we study the introduction of a latent interaction index, examining its impact on the formation and development of complex networks. This index takes into account both observed and unobserved heterogeneity per node in order to overcome the limitations of traditional compositional similarity indices, particularly when dealing with large networks comprising numerous nodes. In this way, it effectively captures specific information about participating nodes while mitigating estimation problems based on network structures. Furthermore, we develop a Shannon-type entropy function to characterize the density of networks and establish optimal bounds for this estimation by leveraging the network topology. Additionally, we demonstrate some asymptotic properties of pointwise estimation using this function. Through this approach, we analyze the compositional structural dynamics, providing valuable insights into the complex interactions within the network. Our proposed method offers a promising tool for studying and understanding the intricate relationships within complex networks and their implications under parameter specification. We perform simulations and comparisons with the formation of Erdős–Rényi and Barabási–Albert-type networks and with Erdős–Rényi and Shannon-type entropy. Finally, we apply our models to the detection of microbial communities.

Keywords: entropy; complex networks; latent interaction index; estimation

1. Introduction

Complex Network Analysis (CNA) is a crucial field that spans various disciplines, addressing network dynamics [1,2,3]. In this context, the focus is on unraveling the complexities of network structure, specifically in the dynamics of link formation. The study delves into three fundamental network attributes: the homophily effect, unobserved heterogeneity, and persistence measures. Homophily, which denotes the tendency of nodes to connect with similar nodes, is a well-established phenomenon in real-world social networks [4]. However, many node characteristics influencing linking decisions remain unobservable, outnumbering observable ones. To address this challenge, a fixed effect approach to account for unobserved heterogeneity is introduced [5,6,7], as well as persistence measures as tools for quantifying time series data dependence [8]. These measures hold significant implications for various processes, for example, in information diffusion and ecological networks (see refs. [1,9,10,11]).

While complex networks offer powerful modeling capabilities, they also present significant challenges. One major hurdle is the lack of a comprehensive metric to effectively measure unobserved heterogeneity, especially in understanding interconnected components [12]. This metric should consider interaction abundance and latent nature, aligning with existing frameworks [13,14]. Furthermore, metrics assessing heterogeneity, both observed and unobserved, are intricately linked to the ways in which nodes are aggregated. From the above, an acute dilemma arises when unobserved heterogeneity is treated as an incidental parameter, independent of node aggregation. In such cases, the parameter vector dimension grows with network size, leading to non-standard estimation challenges, where classical results regarding the properties of maximum likelihood estimates (MLEs) no longer apply [15]. Additionally, certain models (see, for example, refs. [7,16,17]) disregard interdependencies in network formation. These limitations prompt essential questions: Can we devise a test for evaluating the link formation interdependence hypothesis? Is it feasible to extend the model’s scope to incorporate these interdependencies? How can we address the challenges posed by complex network structures and their inherent uncertainties for a deeper understanding of link formation dynamics?

In this work, we introduce a novel framework to tackle these complex network analysis challenges, where our approach (i) incorporates a discrete latent interaction index that integrates parametric and semiparametric components, shedding light on network formation dynamics.

The realm of network models is diverse, ranging from classical ones [18,19,20] to the more recent advancements [6,7,16]. In the first group of models (I), random networks [18] aim to probabilistically study graph properties as the number of random connections increases, reflecting the disordered nature of link arrangements between different nodes. We start with the hypothesis that the proposed latent interaction index displaces the possibility of randomness in link formation. We conduct statistical significance tests based on this hypothesis. Additionally, the Watts–Strogatz model [20] presents a rewiring model that often exhibits high clustering coefficients in “small” networks. On the other hand, the Barabási–Albert model [19] relies on two ingredients: growth and preferential attachment. The idea is that by mimicking the dynamic mechanisms that assemble the network, we can reproduce the system’s topological properties as observed here. The second group of models (II) has been limited to studying static nonlinear dyadic models and their asymptotic properties. Because the number of individual parameters is proportional to the number of nodes, a problem of incidental parameters results in asymptotic bias [6]. While the estimator is consistent, asymptotic bias is relevant for inference. We provide a model test based on the prevalence of transitive triads (i.e., node triples where links are transitive). Observed heterogeneity has also been incorporated through dyadic models that expand on this model, just as a probit or logit model generalizes a simple Bernoulli statistical model, which can be used in directed or undirected settings [21]. It is possible to extend the Erdös–Rényi model to incorporate other features [5].

Our proposed model seeks to bridge these two groups (I and II), offering a comprehensive approach to network analysis by incorporating the strengths of both.

Additionally, (ii) we present an entropy function that dynamically accounts for these components, providing insights into parameters related to persistence and homophily. The estimates derived from this entropy function help characterize these parameters. This framework enhances our comprehension of link formation within dynamic networks, enabling us to explore the influence of these components on network formation and evolution [22,23,24].

To provide a comprehensive view, it is important to note that each entropy metric used in network analysis offers unique insights into network characteristics and its various components. However, it is widely acknowledged within the field that not all of these metrics can be universally applied to all categories of networks. In fact, this wealth of research is dispersed across numerous disciplines [1,17,22,23,25,26], making it challenging to identify the available metrics and understand the specific contexts in which they are applicable. Additionally, this dispersion complicates our ability to determine areas in need of further development.

These entropy metrics often depend on probability distributions based on various factors, such as node degrees [27,28], the degree and strength of node neighbors [23,29,30], or degrees associated with subgraphs of nodes [31]. Path-based metrics, considering sequences of linked nodes and repetitions of nodes and edges, are also common [32,33,34]. Moreover, entropy metrics explore other factors like closeness and information functionals [35,36]. Some metrics rely on probability distribution, including Bayes posterior probability, although specific calculation methods may not always be clear [37]. Notably, Wang et al. [38] introduced a combined metric, where the first part is calculated as the sum of closeness centrality and the clustering coefficient.

Ecological research has a long-standing tradition of studying co-occurrence and co-abundance patterns. These patterns often signify non-random species co-occurrence, indicating that interactions play a significant role in community structure—either by fostering aggregation or promoting avoidance/exclusion—thus influencing the overall community dynamics. Macro-ecological interaction networks illustrate that such patterns bolster community robustness and functionality, crucial for comprehending community dynamics and productivity [25,39]. Microorganisms engage in diverse relationships, encompassing both antagonistic and cooperative interactions. With the advancements in sequencing technologies, we now have access to substantial datasets for analysis. This allows for the construction of co-occurrence networks using correlation coefficients or similar metrics. However, interpreting these networks, especially in microbial surveys with poorly understood organism behaviors, presents significant challenges [11,17,40].

The complexity of microbial communities makes it challenging to validate community-wide interactions due to the multitude of species and limited experimental approaches. Consequently, modeling microbial populations using simplified growth and interaction rules offers an alternative approach to simulate the dynamics of these intricate multispecies communities. In this study, we consider the proposed model as an application for identifying microbial networks. Concretely, we apply our dynamic network formation model to an 18S rRNA gene amplicon dataset. The original dataset comprises 19 samples, and we observe a total of 3831 OTU (Operational Taxonomic Unit) entries. These observations were obtained through Lagrangian sampling as part of a study conducted by Hu et al. [40].

This work starts by introducing our notation, delineating the symbols and conventions used throughout this study. The organization of this paper is as follows: In Section 2, we present the proposed model; then, in Section 3, we introduce the entropy function. In Section 4, simulation results are presented, and in Section 5 we apply the model to microbial network identification. Section 6 provides the conclusions and discussion, while all proofs of the theorems and the elimination of fixed effects are presented in Appendix A.

Notation 1.

Network $G = (V, E)$ is an ordered pair of sets V and E, where V is a finite nonempty set of elements named nodes, and the set E is composed of two-element subsets $\{ij\}$ of V named edges. If i and j are connected, $\{ij\}$ constitutes a dyad, and j is a neighbor of i. Throughout this work, we use the notation $\prod_{i < j}$ to indicate $\prod_{i = 1}^{N} \prod_{j = i + 1}^{N}$, and similarly $\sum_{i < j}$ to indicate $\sum_{i = 1}^{N - 1} \sum_{j = i + 1}^{N}$.

2. Structural Model

We consider a dynamic group interaction scenario consisting of a large population of connected nodes. We let $i = 1, \dots, N$ index a random sample of size N from this population at time $t = 1, \dots, T$. Each node i has a profile defined as $(X_{i,t}^{\top}, A_{i})^{\top}$, where $X_{i,t}$ is an aggregated vector of the observed time-varying characteristics and $A_{i}$ contains unobserved information that is assumed to be time-invariant. We let $\mathrm{Supp}(X_{i,t})$ be a compact subset of $\mathbb{R}^{\dim(X_{i,t})}$, and $A_{i}$ is distributed compactly and continuously on the same support, conditional on $X_{i,t} = x$, i.e., for all $x \in \mathrm{Supp}(X_{i,t})$, $\mathrm{Supp}(A_{i} \mid X_{i,t} = x) = \mathrm{Supp}(A_{i})$ is a compact subset of $\mathbb{R}$.

Linking decisions are a binary choice that depends solely on the characteristics of the two nodes connected by the link. We observe relationships between nodes through the indicator variable $C_{ij,t} \sim \mathrm{Bernoulli}(p_{ij,t})$, where $C_{ij,t} = 1$ if node j interacts (success) with node i at time t and $C_{ij,t} = 0$ (failure) otherwise. Parameter $p_{ij,t}$ can be interpreted as the detection rate of the interaction between nodes i and j. Connections are undirected (i.e., $C_{ij,t} = C_{ji,t}$), and self-ties are ruled out (i.e., $C_{ii,t} = 0$ for all t). For each $t = 1, \dots, T$, there is a corresponding $N \times N$ socio-matrix $C = (C_{ij,t})_{i \neq j}$ that captures the interaction dynamics between nodes i and j across all time steps.

We parameterize the latent interaction structure according to the probability of each link $C_{ij,t}$:

$$C_{ij,t} = \mathbf{1}\Big\{\sum_{p = 1}^{q} \alpha_{p}^{0} C_{ij,t-p} + \beta^{0} X_{ij,t}^{\top} + D_{ij,t} + A_{ij} - \epsilon_{ij,t} > 0\Big\}, \tag{1}$$

where $\mathbf{1}\{\cdot\}$ denotes the indicator function. The q-dimensional vector $\alpha^{0} = (\alpha_{1}^{0}, \dots, \alpha_{q}^{0})^{\top}$ with $\|\alpha\| < 1$ and $1 \leq q \leq t - 1$ captures the autocorrelation or cumulative nonlinear persistence of the time series [8]. Variable $X_{ij,t} : \mathrm{Supp}(X_{i,t}) \times \mathrm{Supp}(X_{j,t}) \to \mathbb{R}^{\dim(\beta)}$ is a known transformation of $(X_{i,t}^{\top}, X_{j,t}^{\top})^{\top}$. This function is symmetric, so that $X_{ij,t} = X_{ji,t}$. For example, if $X_{i,t}$ and $X_{j,t}$ are location coordinates, $X_{ij,t}$ is equal to the “distance” between i and j. This choice was implemented under the consideration that nodes only form connections if they are close enough [7,16,21]. Vector $\beta^{0}$ is an unknown model parameter that parameterizes homophily preferences. The parameter vector is $\theta^{0} = ((\alpha^{0})^{\top}, (\beta^{0})^{\top})^{\top} \in \mathrm{int}(\Theta)$, with $\Theta$ being a compact subset of $\mathbb{R}^{q + \dim(\beta)}$. Variable $D_{ij,t} = \sum_{t' \leq t} \sum_{k = 1}^{N} C_{ik,t'} C_{jk,t'}$ denotes the memory effect of connections that nodes i and j have had in common up to time t. Variable $A_{ij}$ is a component that varies with unobserved attributes by node pairs as in Graham’s model [7], and $\epsilon_{ij,t}$ represents an idiosyncratic component that is assumed to be independent and identically distributed over time. Moreover, this component is assumed to be independent across pairs, although not necessarily identically distributed; that is,

$$F(\epsilon_{12,1}, \dots, \epsilon_{12,T}, \dots, \epsilon_{(N-1)N,1}, \dots, \epsilon_{(N-1)N,T}) = \prod_{i < j} \prod_{t = 1}^{T} F_{\epsilon_{ij,t}}. \tag{2}$$

It is important to note that Equation (1) captures in a parsimonious way three forces that researchers consider important for link formation [41]. First, linkages are state dependent; that is, the returns to a link between i and j are higher in the current period if they were also connected in previous periods. Second, there are returns to “triadic closure”: returns are higher if transitive aspects are considered in the interaction between nodes. In addition, Rule (1) is more general than the specification with $D_{ij,t} = 0$ and $\alpha_{p}^{0} = 0$ for all $p = 1, \dots, q$, which would imply that only direct entailments are important, not autocorrelation and particular incentives for interaction.
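As an illustrative sketch of the link rule in Equation (1), the following Python snippet draws one period of undirected links for $q = 1$ and a scalar dyadic covariate. The function name `simulate_step`, the toy inputs, and the choice to pass $D_{ij,t}$ and $A_{ij}$ as precomputed symmetric matrices are assumptions made here for illustration only. The logistic draw for $\epsilon_{ij,t}$ follows the distributional choice stated in Assumption 1.

```python
import numpy as np

def simulate_step(C_prev, X, D, A, alpha, beta, rng):
    """Draw one period of undirected links from the threshold rule in Eq. (1), q = 1.

    C_prev, X, D, A are symmetric N x N arrays (lagged links, dyadic covariate,
    memory term, pairwise heterogeneity); alpha and beta are scalars here.
    """
    N = C_prev.shape[0]
    index = alpha * C_prev + beta * X + D + A       # observed plus latent part
    eps = rng.logistic(size=(N, N))                  # idiosyncratic shocks
    iu = np.triu_indices(N, k=1)                     # dyads with i < j only
    C = np.zeros((N, N), dtype=int)
    C[iu] = (index[iu] - eps[iu] > 0).astype(int)    # indicator in Eq. (1)
    return C + C.T                                   # symmetric, zero diagonal

# Toy inputs (hypothetical values, not taken from the paper)
rng = np.random.default_rng(0)
N = 6
X = -np.abs(rng.normal(size=(N, N)))
X = (X + X.T) / 2                                    # symmetric "distance"-type covariate
A = np.zeros((N, N))
D = np.zeros((N, N))
C0 = np.zeros((N, N), dtype=int)
C1 = simulate_step(C0, X, D, A, alpha=0.5, beta=1.0, rng=rng)
```

Only the upper triangle is drawn and then mirrored, so that $C_{ij,t} = C_{ji,t}$ and $C_{ii,t} = 0$ hold by construction.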

The degree of a node is defined as the number of links it possesses, which can be represented as the sum of the connections it has with other nodes, denoted as $C_{i+,t} = \sum_{j \neq i} C_{ij,t}$. The network’s degree sequence is obtained by summing the rows (or columns) of the adjacency matrix, resulting in an $N \times 1$ vector $C_{+} = (C_{1+,t}, \dots, C_{N+,t})^{\top}$.
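The degree sequence and the memory term $D_{ij,t}$ can both be read directly off the adjacency matrices. The sketch below is a minimal illustration: the function names, the `(t, N, N)` array layout, and the convention of zeroing the diagonal of $D$ are assumptions introduced here.

```python
import numpy as np

def degree_sequence(C_t):
    """Row sums of a symmetric 0/1 adjacency matrix with zero diagonal."""
    return C_t.sum(axis=1)

def memory_effect(C_hist):
    """D_{ij,t}: neighbours shared by i and j, accumulated over the periods in C_hist.

    C_hist has shape (t, N, N); for each period, sum_k C_ik * C_jk counts the
    nodes k linked to both i and j.
    """
    D = np.einsum("tik,tjk->ij", C_hist, C_hist)
    np.fill_diagonal(D, 0)   # assumed convention: self-dyads are not used
    return D

# Toy path graph 0 - 1 - 2, observed unchanged for two periods
C = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
C_hist = np.stack([C, C])
deg = degree_sequence(C)
D = memory_effect(C_hist)
```

In this toy example, nodes 0 and 2 are not linked to each other but share neighbour 1 in both periods, so their memory term accumulates across time.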

We denote

$$z_{ij,t}(\theta, A_{ij}) = \exp\Big(\sum_{p = 1}^{q} \alpha_{p} C_{ij,t-p} + \beta X_{ij,t}^{\top} + D_{ij,t} + A_{ij}\Big).$$

For parameter values $\theta \in \mathrm{int}(\Theta)$ and $A = ((A_{i})_{i = 1, \dots, N}, (A_{j})_{j = 1, \dots, N})^{\top} \in \mathrm{Supp}(A)$, we define the link probability $p_{ij,t}(\theta, A_{ij}) = \frac{z_{ij,t}(\theta, A_{ij})}{1 + z_{ij,t}(\theta, A_{ij})}$.
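For concreteness, the link probability can be evaluated as a logistic transform of the linear index. The following is a minimal sketch that assumes a scalar dyadic covariate and array inputs; the function name `link_probability` and the argument layout are hypothetical.

```python
import numpy as np

def link_probability(C_lags, X, D, A, alpha, beta):
    """p_{ij,t} = z / (1 + z), z = exp(sum_p alpha_p C_{ij,t-p} + beta X + D + A).

    C_lags: (q, N, N) lagged adjacency matrices; alpha: length-q vector;
    X, D, A: (N, N) arrays; beta: scalar (scalar covariate assumed).
    """
    index = np.tensordot(alpha, C_lags, axes=1) + beta * X + D + A
    # 1 / (1 + exp(-index)) equals z / (1 + z) and avoids overflow in exp(index)
    return 1.0 / (1.0 + np.exp(-index))

# With a zero index every dyad has probability 1/2
q, N = 2, 3
zeros = np.zeros((N, N))
p = link_probability(np.zeros((q, N, N)), zeros, zeros, zeros,
                     alpha=np.array([0.3, 0.1]), beta=1.0)
```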

With the information presented above, we are now able to outline the principal assumptions that significantly influence our work:

Assumption 1.

Equations (1) and (2) specify a dynamic model of node interactions. The conditional likelihood of link $C_{ij,t} = c_{ij,t}$ is given by

$$\Pr(C_{ij,t} = c_{ij,t} \mid X, D, A_{0}) = \prod_{i < j} \prod_{t = 1}^{T} \frac{z_{ij,t}^{c_{ij,t}}(\theta^{0}, A_{ij_{0}})}{1 + z_{ij,t}(\theta^{0}, A_{ij_{0}})}. \tag{3}$$

Here, Assumption 1 implies that the idiosyncratic component of link surplus, $\epsilon_{ij,t}$, is a standard logistic random variable that is independently and identically distributed across pairs of nodes. The assumption that links are formed independently of each other based on agent attributes may hold in some situations but not in others. Specifically, Equation (1) and Assumption 1 are suitable for scenarios where link formation is predominantly bilateral. This is particularly relevant in certain types of friendship and trade networks, as well as in models of specific types of conflicts between nation-states [42,43]. In these contexts, the incorporation of unobserved node characteristics into the link formation model represents a significant and useful generalization relative to many commonly used models.

The objective pursued here is to study the identification and estimation problems posed by the model defined by Equation (1) and Assumption 1. This setup encompasses a useful class of empirical examples and represents a natural starting point for a formal statistical analysis. In this context, early methodological work focused on introducing unobserved correlated heterogeneity into static choice models [44,45]. Subsequent work allowed for state dependence in choice [46].

The estimated values of the parameters, denoted by

$$\hat{\theta} = (\hat{\alpha}^{\top}, \hat{\beta}^{\top})^{\top}, \tag{4}$$

$$\hat{a}_{N} = (\hat{A}_{1}, \dots, \hat{A}_{N})^{\top}, \tag{5}$$

are the solution to the population conditional maximum likelihood problem

$$\max_{(\theta, a_{N}) \in \mathbb{R}^{\dim(\theta) + N}} E_{a}[L_{NT}(\theta, a_{N})], \qquad L_{NT}(\theta, a_{N}) := (NT)^{-\frac{1}{2}} \Big\{\sum_{i < j} \sum_{t = 1}^{T} \big[C_{ij,t} \log(p_{ij,t}(\theta, A_{ij})) + (1 - C_{ij,t}) \log(1 - p_{ij,t}(\theta, A_{ij}))\big]\Big\}, \tag{6}$$

for every $N, T$. Here, $E_{a}$ denotes the expectation with respect to the distribution of the data conditional on the unobserved effects.
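The sample objective $L_{NT}$ in Equation (6) is a scaled Bernoulli log-likelihood over dyads and periods. The sketch below only evaluates it for given link probabilities and does not perform the maximization. The function name and the `(T, N, N)` array layout are assumptions made for illustration.

```python
import numpy as np

def log_likelihood(C, P):
    """L_NT from Eq. (6), given links C and probabilities P of shape (T, N, N).

    Only dyads with i < j enter the sum; the (N T)^(-1/2) factor follows Eq. (6).
    P is assumed to lie strictly inside (0, 1).
    """
    T, N, _ = C.shape
    iu = np.triu_indices(N, k=1)
    c = C[:, iu[0], iu[1]]                  # (T, number of dyads)
    p = P[:, iu[0], iu[1]]
    total = np.sum(c * np.log(p) + (1 - c) * np.log(1 - p))
    return total / np.sqrt(N * T)

# Two nodes, one period, no link, p = 1/2 for the single dyad
C_toy = np.zeros((1, 2, 2))
P_toy = np.full((1, 2, 2), 0.5)
value = log_likelihood(C_toy, P_toy)
```

In practice, `P` would be produced from $\theta$ and the fixed effects through the link probability $p_{ij,t}(\theta, A_{ij})$, and the resulting function handed to a numerical optimizer.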

Assumption 2.

(i): Asymptotics: We consider limits of sequences where $N / T$ approaches a constant c as both N and T tend to infinity, where c is a finite number greater than zero.
(ii): Sampling: Conditional on $a_{N} = (A_{1}, \dots, A_{N})^{\top}$, $\{(C_{ij,t}, X_{ij,t}) : 1 \leq i, j \leq N, 1 \leq t \leq T\}$ is independent across dyads, and for $Y_{ij,t} = (C_{ij,t}, X_{ij,t})$, $\mathcal{A}$ is the σ-field generated by $(Y_{ij,t}, Y_{ij,t-1}, \dots)^{\top}$, and $\mathcal{B}$ is the σ-field generated by $(Y_{ij,t}, Y_{ij,t+1}, \dots)^{\top}$.
(iii): Compact support: The support of $X_{ij,t}$ is a compact subset of $\mathbb{R}^{\dim(\beta)}$.
(iv): Concavity: For all $N, T$, $(\theta, a_{N}) \mapsto L_{NT}(\theta, a_{N})$ is strictly concave over $\mathbb{R}^{\dim(\theta) + N}$.

Just for completeness, Assumption 2 (i) defines the large-T asymptotic framework and is the same as in Hahn and Kuersteiner [47]. The relative rate exactly balances the order of the bias and variance, producing a non-degenerate asymptotic distribution. Assumption 2 (ii) imposes neither identical distribution nor stationarity over the time series dimension, conditional on the unobserved effects, unlike most of the large-T panel literature [47]. Additionally, it is used to bound covariances and moments in the application of the Law of Large Numbers (LLN); as we see below, it could be replaced by other conditions that guarantee the applicability of these results. Assumption 2 (iii) is standard in the context of nonlinear estimation problems [48]. It implies that the observed component of link surplus, $\sum_{p = 1}^{q} \alpha_{p} c_{ij,t-p} + \beta x_{ij,t}^{\top} + d_{ij,t}$, has bounded support. This simplifies the proofs of the main theorems, especially those concerning the ML estimator. Furthermore, Assumption 2 (iv) imposes smoothness and moment conditions on the log-likelihood function and its derivatives. These conditions guarantee that the higher-order stochastic expansions of the fixed effect estimator that we use to characterize the asymptotic bias are well-defined, and that the remaining terms of these expansions are bounded. In addition, this guarantees that all the elements of $X_{ij,t}$ have cross-sectional and time series variation. It also guarantees that $\hat{\theta}$ is the unique solution to the population problem (given by Equation (6)); that is, all the parameters are point identified. The existence and uniqueness of the solution to the population problem are guaranteed by Assumption 2, including the concavity of the objective function in all parameters.

Together with the above, and denoting $p_{ij,t} = p_{ij,t}(\theta^{0}, a_{N}^{0})$, Parts (iii) and (iv) of Assumption 2, in combination with $\mathrm{Supp}(A_{i})$ being a compact subset of $\mathbb{R}$, imply that $p_{ij,t}(\theta, a_{N}) \in (\kappa, 1 - \kappa)$ for some $0 < \kappa < 1$ and for all $\theta$ and $a_{N} \in \mathrm{Supp}(A)$. An implication of this fact is that $(C_{ij,t} - p_{ij,t}) \log(p_{ij,t}(\theta, a_{N}))$ is a bounded random variable. A more involved argument shows that it is possible to estimate the difference between $C_{ij,t}$ and $p_{ij,t}$ with uniform accuracy.

With the aforementioned assumptions in place, we can now state the main theorems used throughout this work:

Theorem 1.

Under Assumptions 1 and 2,

$$\sup_{1 \leq i, j \leq N} \sup_{1 \leq t \leq T} \Big|\frac{1}{(N - 1) T} \sum_{i < j} \sum_{t = 1}^{T} (C_{ij,t} - p_{ij,t})\Big| < \sqrt{\ln(NT)} \tag{7}$$

with probability $1 - O((NT)^{-2})$.

Theorem 1 suggests that as more data are collected (increasing N) and a broader time horizon is considered (increasing T), the difference between latent variables and observed probabilities becomes relatively small and tends to be more bounded. This interpretation may be relevant for assessing the accuracy or validity of a latent model in relation to real observations within a network. The term $\ln(NT)$ in the upper bound can be interpreted as a measure of the uncertainty associated with the difference between latent variables and observations. As N and T grow, uncertainty decreases.

The following theorem is related to a generalized form of the Law of Large Numbers (LLN) adapted to the context of complex networks.

Theorem 2.

Under Assumptions 1 and 2, we assume that $E\big(\sup_{\theta \in \Theta} \big|\log\big(\frac{z_{ij,t}(\theta, a_{N})}{1 + z_{ij,t}(\theta, a_{N})}\big)\big|\big)$ is finite for all t and $\mathcal{F} = \{c_{ij,t'} : t' < t\}$ is a filtration with respect to $\mathcal{A}$; then,

$$E_{a}\Big(\sum_{i < j} l_{ij,t}^{\theta, a_{N}} \,\Big|\, \mathcal{F}\Big) \overset{\Pr}{\to} \binom{N}{2}^{-1} \sum_{i < j} E\Big[\log\Big(\frac{z_{ij,t}(\theta^{0}, a_{N}^{0})}{1 + z_{ij,t}(\theta^{0}, a_{N}^{0})}\Big)\Big]$$

uniformly in $\theta \in \Theta$.

In the LLN, the average of random variables is expected to converge to the expected value as the sample size grows. In this case, the sum of certain probability functions $l_{ij,t}^{\theta, a_{N}}$ for all dyads in the network converges in probability towards a sum of probabilities associated with the dyads. Convergence in probability implies that as the network size (or the number of dyads) grows, the conditional expectation of the discrete choice probabilities approaches the expected value of those probabilities for all dyads. This can have significant implications in the theory of complex networks. One example is the stability of emergent patterns: if the result holds, it implies that as the network grows, emergent patterns in discrete choices may become more stable and predictable, providing a deeper understanding of collective behavior in the network [49].

3. Exploring the Entropy

Under Assumption 1 and conditional on $X_{ij,t}$, $D_{ij,t}$ and $A_{ij}$, we write

$$l_{ij,t}^{\theta, a_{N}} = c_{ij,t} \log(z_{ij,t}(\theta, a_{N})) - \log(1 + z_{ij,t}(\theta, a_{N})) \tag{8}$$

for the log-likelihood contribution of link $\{ij\}$. Since entropy characterizes the logarithm of the number of different nodes that can be separated in the stochastic dynamics of the network [37,50], we use Equation (8) to provide a new node interaction detection rate. We note that by the asymptotic equipartition property (AEP) (see, e.g., ref. [51]), we have $-\frac{1}{T N^{2}} l_{ij,t}^{\theta, a_{N}}$ converging in probability to the entropy of C, denoted as $H(C)$, where C represents the socio-matrix of the network. Formally,

$$H(C) = -\sum_{i < j} \sum_{t = 1}^{T} l_{ij,t}^{\theta, a_{N}} \log(l_{ij,t}^{\theta, a_{N}}) = \sum_{i < j} \sum_{t = 1}^{T} \sum_{k = 1}^{+\infty} \frac{l_{ij,t}^{\theta, a_{N}} (1 - l_{ij,t}^{\theta, a_{N}})^{k}}{k}, \tag{9}$$

where the variable k ranges from 1 to +∞, indicating that all possible configurations of connections that do not exist between nodes i and j are considered. Expression $[l_{ij,t}^{\theta, a_{N}} (1 - l_{ij,t}^{\theta, a_{N}})^{k}] / k$ represents the probability of there not being a connection between nodes i and j at time step t. Therefore, Equation (9) combines the influences of both existing and non-existing connections at each time step to compute the entropy of the dynamic network. It is crucial to note that Equation (9) comprehensively encompasses the charging capability of the logistic distribution, a facet that some propositions tend to disregard [52]. For the sake of completeness, Figure 1 shows the behavior of the entropy $H(C)$ for values of N nodes.
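The second equality in Equation (9) rests on the power series $-\log x = \sum_{k \geq 1} (1 - x)^{k} / k$, which gives $-x \log x = \sum_{k \geq 1} x (1 - x)^{k} / k$. The short numerical check below treats the summand as a generic number $x \in (0, 1]$, where the series converges. That range, the function names, and the truncation at 200 terms are assumptions of this sketch; it does not address how $l_{ij,t}^{\theta, a_{N}}$ is scaled into that range.

```python
import math

def neg_x_log_x(x):
    """Closed form -x log x for x in (0, 1]."""
    return -x * math.log(x)

def series_form(x, terms=200):
    """Partial sum of sum_{k>=1} x (1 - x)^k / k, assuming 0 < x <= 1."""
    return sum(x * (1 - x) ** k / k for k in range(1, terms + 1))

# Compare both sides at a few points of (0, 1]
gaps = {x: abs(neg_x_log_x(x) - series_form(x)) for x in (0.2, 0.5, 0.9)}
```

The truncation error shrinks geometrically in the number of terms at rate $(1 - x)$, so values of $x$ close to zero need many more terms than values close to one.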

The following theorem establishes consistency of $\hat{\theta}$ (Equation (4)):

Theorem 3.

Under Assumptions 1 and 2, we have that

$$\|\hat{\theta} - \theta^{0}\| < O((NT)^{-1/2}). \tag{10}$$

Theorem 3 provides a foundation for drawing inferences about the parameter vector encompassing homophily and nonlinear persistence. However, asymptotic normality cannot be guaranteed, for reasons we elaborate on below. The consistency argument for models with only individual effects is based on partitioning the log-likelihood into the sum of individual log-likelihoods that depend on a fixed number of parameters: the model parameter and the corresponding individual effect. The individual log-likelihood maximizers are then consistent estimators of all parameters as the time dimension becomes large, according to standard arguments. This approach does not work for network structures because there is no partition of the data that is affected by only a fixed number of parameters and whose size grows with the sample size [6].

To achieve asymptotic normality for the parameters on the observed components, we first need to control for the unobserved heterogeneity and then establish consistency of the estimated entropy function, which depends on both components.

We assess node performance and select a group of exogenous nodes to serve as a “testing ground”. To achieve this, we examine the conditional expectation of $C_{ik,t}$ and $C_{jk,t}$, conditioning on the observable characteristics of node k and on the characteristics of nodes i and j based on $X_{ij,t}$ and $A_{ij}$. We denote $H_{ij,t}(x_{k,t}, a_{ij})$ as the expected value of $(C_{ik,t} - C_{jk,t} \mid X_{k,t} = x_{k,t}, A_{ij} = a_{ij}, X_{ij,t} = x_{ij})$ and $\delta_{ij,t}(X_{k}) = H_{ij,t}(X_{k,t}, A_{ij})$. Following Parzen’s estimation [53] and Rosenblatt’s remarks [54], we define the dyadic extension for monadic data by

$$\hat{\delta}_{ij}(x) := \frac{1}{NT} \sum_{l = 1}^{N} \sum_{t = 1}^{T} (C_{il,t} - C_{jl,t}) \tilde{K}(x - X_{l,t}); \qquad \tilde{K}(x - X_{l,t}) := \frac{K\big(\frac{x - X_{l,t}}{h(N)}\big)}{\sum_{l = 1}^{N} K\big(\frac{x - X_{l,t}}{h(N)}\big)}. \tag{11}$$

Here, $K(x)$ is a density function satisfying the following conditions: (i) $K(x) < \infty$ for all x; (ii) it is symmetric around zero ($K(-x) = K(x)$); (iii) $K(x) = 0$ if $|x| > \bar{x}$; and (iv) it integrates to one ($\int K(x) \, dx = 1$). Bandwidth $h(N)$ is assumed to be a positive, deterministic sequence that tends to zero as $N \to \infty$.
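A minimal sketch of the kernel-weighted estimator in Equation (11) is given below for scalar node covariates. The Epanechnikov kernel is used as one example of a bounded, symmetric, compactly supported density. The kernel choice, the function names, and the array layout are assumptions introduced for illustration, not the authors' implementation.

```python
import numpy as np

def epanechnikov(u):
    """Bounded, symmetric density supported on [-1, 1] that integrates to one."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u ** 2), 0.0)

def delta_hat(C, X, i, j, x, h):
    """Kernel estimate in the spirit of Eq. (11) for scalar covariates.

    C: (T, N, N) links; X: (T, N) float node covariates; h: bandwidth h(N).
    Weights are normalised across nodes l within each period t.
    """
    T, N, _ = C.shape
    K = epanechnikov((x - X) / h)                    # (T, N) raw kernel values
    denom = K.sum(axis=1, keepdims=True)             # sum over l for each t
    W = np.divide(K, denom, out=np.zeros_like(K), where=denom > 0)
    diff = C[:, i, :] - C[:, j, :]                   # C_{il,t} - C_{jl,t}
    return (diff * W).sum() / (N * T)

# Toy example: one period, three nodes, all covariates equal to the evaluation point
C_toy = np.array([[[0, 1, 1],
                   [1, 0, 0],
                   [1, 0, 0]]])
X_toy = np.zeros((1, 3))
d01 = delta_hat(C_toy, X_toy, i=0, j=1, x=0.0, h=0.5)
```

When all covariates coincide with the evaluation point, every node receives weight 1/3, so the estimate reduces to a scaled difference between the link profiles of nodes i and j.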

Lemma 1.

Under Assumptions 1 and 2, we have $\sup_{i, j} |\hat{\delta}_{ij}(x) - \delta_{ij}(x)| = O_{p}((NT)^{-1})$.

There are at least two approaches to the estimation of unobserved heterogeneity (fixed effects). The first takes a computational perspective [6,55]. For these purposes, the solution of program (6) for $\theta$ is the same as the solution of the program that imposes $\iota_{N}^{\top} a_{N} = 0$, with $\iota_{N}$ a vector of N ones, directly as a constraint on the optimization, which is invariant to normalization. This constrained program has good computational properties because its objective function is concave and smooth in all the parameters. The second alternative arises from Parzen’s estimation of a density function [53]. This alternative is also efficient for the estimation of unobserved heterogeneity. The problem of estimating a probability density function over the unobserved is sometimes similar to the problem of estimating maximum likelihood parameters. However, in a network setting, it is more similar to estimating the spectral density function of a stationary process [53]. Focusing on the second alternative, the following argument shows that it is possible to estimate unobserved heterogeneity with a given probability of occurrence. We consider $\iota_{\dim(\theta)}$ as a vector of $\dim(\theta)$ ones. We let $L : \mathbb{R} \to \mathbb{R}$ be a differentiable, Lipschitz, symmetric kernel function, and $\hat{\theta}$ be as in Theorem 3.

Theorem 4.

Under Lemma 1, we define

$$\hat{A}_{l}(\theta) = \frac{1}{N} \cdot \frac{\sum_{i < j} L\big(\frac{\hat{\delta}_{ij}(x_{l})}{\sigma_{N}}\big) X_{il,t} X_{jl,t}^{\top} \theta \, \iota_{\dim(\theta)}^{\top}}{\sum_{i < j} L\big(\frac{\hat{\delta}_{ij}(x_{l})}{\sigma_{N}}\big)}$$

for all $l \neq i, j$, with $\sigma_{N}$ being the bandwidth. Then, $|\hat{A}_{l}(\hat{\theta}) - A_{l}(\theta)| = O_{p}\big(\max\{(NT\sigma_{N})^{-1}, \sigma_{N} (NT)^{-1}\}\big)$.

Chatterjee, Diaconis, and Sly [56] demonstrated the uniform consistency of estimator $\hat{A}_{l}(\theta)$ in the model that does not incorporate dyad-level covariates. The key to this theorem is the following: In sparse network sequences, we effectively witness $N - 1$ linking decisions made by each node, which means that we observe whether node i links to every other node j. This unique feature of the problem allows for consistent estimation of $\hat{A}_{l}(\theta)$ for each node. The argument becomes tedious because of the interdependence of the linking decisions in the sequences of nodes i and j. However, this dependence is weak, only arising via the presence of $C_{ij,t}$ in both link sequences. Establishing asymptotic normality of $\hat{\theta}$ is also involved. This is because the sampling properties of $\hat{\theta}$ are influenced by the estimation error in $\hat{A}_{l}(\theta)$. This influence generates a bias in the limit distribution of $\hat{\theta}$. This bias is similar to that which arises in large-N, large-T joint fixed effects estimation of nonlinear panel data models [47].

To state the form of the limit distribution, we let

\hat{H} (C)

and

H_{0} (C)

be the entropy computed over the parameter vector

\hat{θ}

and

θ^{0}

, respectively. Our objective is to estimate the quantity

H (C)

within the family of networks

C

that contains nodes i and j. Our estimator is expected to provide a reliable estimate of

H (C)

. Here, we state the following result:

Theorem 5.

Under Assumptions 1 and 2,

\sup_{C \in C} | \hat{H} (C) - H_{0} (C) | < \frac{1}{N T} \sqrt{\log N T}

(12)

with probability

1 - O ({(N T)}^{- 1})

.

This inequality demonstrates that our estimator

\hat{H} (C)

enjoys uniform consistency within class

C

. In simpler terms, it implies that, as our sample size N and time period T increase, the maximum absolute difference between our estimator and the true value

H (C)

across all sets

C \in C

becomes small. The probability that the bound

\frac{1}{N T} \sqrt{\log N T}

holds is stated to be

1 - O ({(N T)}^{- 1})

, meaning that it holds with high probability as the size of the network and the number of time steps grow large. This result provides an upper bound on the discrepancy between the estimated and true entropy, ensuring the reliability of the estimation in the context of the class of networks

C

.
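To give a sense of scale, the bound in Theorem 5 can be evaluated for concrete network sizes. The sketch below is an illustration rather than part of the original analysis. It uses the natural logarithm and borrows the network sizes and the 15 time steps from the simulation design in Section 4; the function name `entropy_error_bound` is hypothetical.

```python
import math


def entropy_error_bound(n_nodes, n_periods):
    """Right-hand side of (12): sqrt(log(NT)) / (NT), natural logarithm assumed."""
    nt = n_nodes * n_periods
    return math.sqrt(math.log(nt)) / nt


sizes = [100, 150, 200, 250, 500]
bounds = {n: entropy_error_bound(n, 15) for n in sizes}
for n in sizes:
    print(f"N = {n:3d}, T = 15: bound = {bounds[n]:.6f}")
```

The bound shrinks almost in proportion to 1/(NT), since the square-root-log factor grows very slowly.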

Now, with the definition

J_{0} (θ) = \lim_{N, T \to \infty} - {((\binom{N}{2}) T)}^{- 1} \frac{\partial^{2} E (L_{N T} (θ^{0}, \hat{a} (θ^{0})))}{\partial θ \partial θ^{⊤}},

(13)

we are in a position to state the following result:

Theorem 6.

Under Assumptions 1 and 2,

\frac{\sqrt{N T} (\hat{θ} - θ^{0}) - J_{0}^{- 1} (θ)}{J_{0} {(θ)}^{1 / 2}} \overset{D}{\to} N (0, I_{\dim (θ)}) .

(14)

To converge to a normal distribution, the difference between estimator

\hat{θ}

and true value

θ^{0}

has to be bias-corrected and rescaled at a rate determined by the number of nodes N and the number of time periods T. In the dense network setting considered here,

θ^{0}

is estimated based on the observed linking decisions about

N (N - 1)

potential links. Therefore, the rate of convergence

\sqrt{N T}

is the conventional parametric rate corresponding to the sample size [5,7].

We conclude this section by stating some functional bounds on the entropy function, given by

Theorem 7.

Under Assumptions 1 and 2, we have that:

(i): $H (C) \leq \frac{ρ_{k l, N}}{K} \log (\frac{K (N - 2)}{ρ_{k l, N}}) - \sum_{i = 1}^{N - 1} \frac{l_{i (i + 1), t}^{θ, a_{N}}}{K} \log (\frac{l_{i (i + 1), t}^{θ, a_{N}}}{K}) - \sum_{i = 1}^{N - 1} \frac{l_{i N, t}^{θ, a_{N}}}{K} \log (\frac{l_{i N, t}^{θ, a_{N}}}{K}),$

where

$ρ_{k l, N} = \sum_{k = 1}^{N - 1} \sum_{l = k + 1}^{N} l_{k l, t} - \sum_{k = 1}^{N - 1} (l_{k (k + 1), t}^{θ, a_{N}} + l_{k N, t}^{θ, a_{N}})$

and $K = \sum_{i < j} \sum_{t = 1}^{T} \log (\frac{z_{i j, t} (θ, a_{N})}{{(1 + z_{i j, t} (θ, a_{N}))}^{2}}) .$
(ii): If $F = \{c_{i j, t^{'}} : t^{'} < t\}$ and $F^{'} = \{c_{i j, t^{'}}^{'} : t^{'} < t\}$ are two filtrations with respect to $A$, then

$H (C) \geq - \sum_{i < j} ϱ_{i j, F} \log (ϱ_{i j, F^{'}}) - \frac{1}{\ln (2)} \sum_{i < j} \frac{{(ϱ_{i j, F})}^{2} - ϱ_{i j, F} ϱ_{i j, F^{'}}}{ϱ_{i j, F^{'}}},$

(15)

where $ϱ_{i j, F} = \frac{l_{i j, F}^{θ, a_{N}}}{\sum_{i < j} l_{i j, F}^{θ, a_{N}}}$ and $ϱ_{i j, F^{'}} = \frac{l_{i j, F^{'}}^{θ, a_{N}}}{\sum_{i < j} l_{i j, F^{'}}^{θ, a_{N}}} .$

Theorem 7 states that the entropy

H (C)

of the dynamic network C is bounded by the mutual information between successive states of the filtrations

F

and

F^{'}

. This means that as the states of the network become more predictable and related to each other, the entropy decreases, implying greater structure and order in the network. Conversely, if the states are more independent and random, the entropy increases, reflecting a more chaotic and less predictable structure in the network.
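The lower bound in part (ii) of Theorem 7 can be checked numerically on toy data. The sketch below treats ϱ_F and ϱ_{F'} as two probability vectors over dyads and reads H(C) as the base-2 Shannon entropy of ϱ_F; that reading, the toy weights, and the function names are assumptions made for illustration. Under it, the right-hand side of (15) is a cross-entropy between the two vectors minus a chi-square-type correction scaled by 1/ln(2).

```python
import math


def normalize(values):
    """Turn positive dyad contributions into a probability vector."""
    total = sum(values)
    return [v / total for v in values]


def shannon_entropy_bits(p):
    """Base-2 Shannon entropy of a probability vector."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def lower_bound_bits(p, q):
    """Right-hand side of (15) with p = rho_F and q = rho_F' (all q_i > 0)."""
    cross_entropy = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))
    correction = sum((pi * pi - pi * qi) / qi for pi, qi in zip(p, q))
    return cross_entropy - correction / math.log(2)


# Hypothetical dyad contributions under two filtrations.
rho_f = normalize([4.0, 3.0, 2.0, 1.0])
rho_f_prime = normalize([1.0, 2.0, 3.0, 4.0])

entropy = shannon_entropy_bits(rho_f)
bound = lower_bound_bits(rho_f, rho_f_prime)
print(f"H = {entropy:.4f} bits, lower bound = {bound:.4f} bits")
```

When the two vectors coincide, the correction term vanishes and the bound equals the entropy itself, so the gap measures how far the two filtrations disagree about the dyad contributions.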

4. Benchmark and Simulations

In this section, we study the finite-sample performance of our procedures through Monte Carlo simulations implemented in Matlab. We compared the development and robustness of our network formation model with Erdös–Rényi [18] and Barabási–Albert [19]-type networks. The Barabási–Albert network was generated with a connection probability of

0.5

and five new links added in each period. The comparisons used the degree distribution, clustering coefficient, and entropy value as metrics. The experiment was based on the latent index formation rule with specification

C_{i j, t} = 1 \{\sum_{p = 1}^{10} α_{p} C_{i j, t - p} + X_{i t} X_{j t}^{⊤} β + A_{i j} - ϵ_{i j, t} \geq 0\}

Here,

β^{0} = 0.5

,

α^{0}

is a random vector with a norm of less than one and

X_{i} \in \{- 1, 1\}

,

i = 1, \dots, N

being independent and identically distributed random variables simulated by

X_{i} = 1 - 2 \cdot 1 \{i \text{ is even}\}

, with network sizes of 100, 150, 200, 250, and 500. For larger sample sizes, the behavior of the entropy function is, on average, similar. With this specification, nodes with an even index prefer links to nodes with an even index over links to nodes with an odd index, and vice versa for nodes with an odd index. We repeated the experiment 1000 times to assess the reproducibility and dynamics of the constructed networks, and each experiment ran for 15 time steps. In addition,

A_{i j} = | A_{i} - A_{j} |

with

A_{i} = (\frac{N - i}{N - 1}) \log N

for all

i = 1, \dots, N

. The descriptive characteristics of the network formation are shown in Table 1. Based on Table 1, we can perform a comparative analysis between the three generated networks.

Mean Degree: The mean degree represents the average number of connections that nodes have in the network. In the simulated network, the mean degree decreases as the network size increases, suggesting that nodes tend to be less connected to each other. This could be influenced by the parameters of the network generation model, such as $α$ , $β$ , and p, which affect the probability of forming new connections at each time step. On the other hand, the Erdös–Rényi and Barabási–Albert networks maintain their mean degree relatively constant, indicating that their connection generation process is not strongly influenced by network size.
Standard Deviation of Degree: The standard deviation of the degree measures the variability in the number of connections that nodes have in the network. In the simulated network, the standard deviation of the degree tends to decrease as the size of the network increases, implying that node degrees become more homogeneous. This could be a desirable feature in some contexts, as it indicates that the simulated network tends to have a more uniform degree distribution, which is associated with greater robustness and stability in its structure.
Clustering Coefficient: The clustering coefficient measures the proportion of connections that exist between the neighbors of a given node. In the simulated and Erdös–Rényi networks, the clustering coefficient tends to decrease as the size of the network increases. This suggests that nodes tend to be less interconnected compared to smaller networks. On the other hand, in the Barabási–Albert network, the clustering coefficient remains at one, indicating that neighboring nodes are highly connected. This result is characteristic of Barabási–Albert scale-free networks, where new nodes tend to preferentially connect to existing nodes with higher degrees, resulting in high clustering among the neighbors of each node.
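The three descriptive statistics above can be computed directly from an adjacency structure. The following is a small sketch, not the Matlab code used for Table 1. It assumes an undirected graph stored as a dictionary of neighbor sets, uses the population standard deviation, and assigns a local clustering coefficient of zero to nodes with fewer than two neighbors.

```python
import math


def degree_stats(adj):
    """Mean and (population) standard deviation of node degrees."""
    degrees = [len(neighbors) for neighbors in adj.values()]
    mean = sum(degrees) / len(degrees)
    variance = sum((d - mean) ** 2 for d in degrees) / len(degrees)
    return mean, math.sqrt(variance)


def average_clustering(adj):
    """Average over nodes of the share of neighbor pairs that are linked."""
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue  # fewer than two neighbors: local coefficient taken as 0
        nbrs = list(neighbors)
        linked = sum(
            1
            for a in range(k)
            for b in range(a + 1, k)
            if nbrs[b] in adj[nbrs[a]]
        )
        total += 2.0 * linked / (k * (k - 1))
    return total / len(adj)


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
triangle_with_tail = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
mean_degree, std_degree = degree_stats(triangle_with_tail)
clustering = average_clustering(triangle_with_tail)
print(mean_degree, round(std_degree, 4), round(clustering, 4))
```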

Regarding the convergence order, it is observed that the simulated network exhibits an intermediate behavior between Erdös–Rényi and Barabási–Albert networks in terms of mean degree and clustering coefficient. While Erdös–Rényi networks are more homogeneous and less clustered, and Barabási–Albert networks are more heterogeneous and highly clustered, the simulated network shows intermediate characteristics, making it suitable for representing systems that contain elements of both tendencies.
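The latent index formation rule specified at the start of this section can be sketched in a few lines. This is an illustrative Python reimplementation, not the authors' Matlab code. Several details are assumptions because the text does not fix them: the shocks ϵ_{ij,t} are drawn from a standard logistic distribution, all links before the first period are set to zero, and α^0 is a Gaussian vector rescaled to a random norm below one.

```python
import math
import random


def simulate_network(n_nodes=100, n_periods=15, n_lags=10, beta=0.5, seed=0):
    """Simulate C_ij,t = 1{sum_p alpha_p C_ij,t-p + X_i X_j beta + A_ij - eps >= 0}."""
    rng = random.Random(seed)
    # alpha^0: random vector rescaled so that its Euclidean norm is below one.
    raw = [rng.gauss(0.0, 1.0) for _ in range(n_lags)]
    norm = math.sqrt(sum(a * a for a in raw))
    scale = rng.uniform(0.1, 0.9) / norm
    alpha = [a * scale for a in raw]
    # X_i = 1 - 2 * 1{i is even}, with nodes indexed 1..N.
    x = {i: (-1 if i % 2 == 0 else 1) for i in range(1, n_nodes + 1)}
    # A_i = ((N - i) / (N - 1)) log N and A_ij = |A_i - A_j|.
    a_node = {i: (n_nodes - i) / (n_nodes - 1) * math.log(n_nodes) for i in x}
    dyads = [(i, j) for i in range(1, n_nodes + 1) for j in range(i + 1, n_nodes + 1)]
    history = {d: [0] * n_lags for d in dyads}  # assumed: no links before t = 1
    snapshots = []
    for _ in range(n_periods):
        links = set()
        for i, j in dyads:
            past = history[(i, j)]  # past[0] is the most recent period
            lagged = sum(alpha[p] * past[p] for p in range(n_lags))
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            eps = math.log(u / (1.0 - u))  # standard logistic shock (assumed)
            index = lagged + x[i] * x[j] * beta + abs(a_node[i] - a_node[j]) - eps
            c = 1 if index >= 0 else 0
            if c:
                links.add((i, j))
            history[(i, j)] = [c] + past[:-1]
        snapshots.append(links)
    return snapshots


# Small demonstration run; the paper uses N between 100 and 500.
snapshots = simulate_network(n_nodes=30, n_periods=15, seed=1)
for t, links in enumerate(snapshots, start=1):
    print(f"t = {t:2d}: mean degree = {2 * len(links) / 30:.2f}")
```

Each snapshot is a set of linked dyads (i, j) with i < j, so repeating the call with different seeds gives the Monte Carlo replications.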

Concerning entropy, we validated the development of entropy

H (C)

across the same network sizes over three time periods. We compared the results with Erdös–Rényi entropy [57] and Shannon entropy [58]. Table 2 summarizes the results obtained from 1000 simulations. The analysis shows that entropy

H (C)

performs consistently well across various network sizes and time periods. It demonstrates competitive values compared to Shannon entropy and outperforms Erdös–Rényi entropy significantly. The results indicate that

H (C)

is a reliable and effective measure to capture the information flow in network dynamics. The lower values obtained by

H (C)

compared to Shannon entropy suggest that it provides a more informative representation of the network’s complexity. Furthermore, the increasing trend of

H (C)

with network size indicates that it effectively captures the growing complexity of larger networks, indicating that larger networks tend to have more structure and order. Overall, these findings support the usefulness of

H (C)

as an entropy measure for analyzing network dynamics and information flow. Higher Shannon entropy values indicate greater diversity or complexity within the networks. In this context, Shannon entropy decreases as the size of the network increases, which implies greater self-organization and less uncertainty within larger networks.

5. Empirical Application

In this section, we apply our dynamic network formation model (Equation (1)) to the 18S rRNA gene amplicon dataset from a study by Hu et al. [40]. This application focuses on the identification of microbial networks. Seawater samples were collected from a depth of 15 m every 4 h following a Lagrangian sampling schematic in an anticyclonic eddy in the North Pacific Subtropical Gyre, as a part of the Simons Collaboration on Ocean Processes and Ecology (SCOPE, http://scope.soest.hawaii.edu/) cruise efforts in July 2015. Some species with taxonomical classification of RNA OTUs are shown in Table 3.

5.1. Analytical Processes

We examined the influence of species richness, focusing on the relative rather than the absolute frequency of OTUs. This simple measure forms the primary homophilic structure governing interactions among species taxa in this microbial context, where species engage based on their relative abundances. Subsequently, we applied the Community Louvain algorithm to identify the microbial communities participating in various interactions during each sampling period. To validate the algorithm’s findings, we conducted null modularity calculations with 1000 replicates to assess the statistical significance and distinctiveness of the identified communities within the networks. Additionally, we considered community uniformity and similarity across the sampling periods. To confirm sample dissimilarity, we conducted multiple ANOVA tests and employed the Jaccard test. Our analysis encompassed sensitivity, interaction intensity, and the effect of observable and non-observable parameters on microbial diversity. Computational cost limited our evaluation to six samples. Samples were collected using 10 L Niskin bottles mounted on a CTD rosette at 6 a.m., 10 a.m., 2 p.m., 6 p.m., 10 p.m., and 2 a.m. Corresponding temperature, salinity, dissolved oxygen, and chlorophyll a data were derived from the same CTD casts. The input data are presented in the form of sequential count tables, where each column represents a sample, and each row represents a taxonomic designation (OTU or transcription ID) with sequence count or read coverage abundance per taxon. Global singletons (OTUs that appear with a frequency of 1 in the entire dataset) are removed. Out of a total of 3831 taxa observed, 1779 are eliminated.

5.2. Results

Incorporating the details outlined above, along with the dynamic network formation model (1), we present the following results.

5.2.1. Calculating Sensitivity and Specificity, Effect of Interaction Intensity

The interaction network and co-occurrence network were compared to each other to determine the sensitivity and specificity of the constructed co-occurrence network in detecting direct (first-order) interactions [25]. For this calculation, a true positive (TP) was indicated by the presence of an edge in the co-occurrence network that had the same sign as in the interaction network (when using association metrics with sign). A false positive (FP) represented an edge in the co-occurrence network that was not present in the interaction network. A false negative (FN) denoted an edge in the interaction network that was absent in the co-occurrence network. A true negative (TN) was the absence of an edge in both the interaction and co-occurrence networks. Sensitivity was defined as TP/(TP + FN), and specificity was defined as TN/(TN + FP). In cases where two species interacted with each other with different signs, the sign of the interaction with the larger absolute value was taken as the sign of the net interaction. In addition, we calculated precision as TP/(TP + FP), and the F1 score as

2 \times (\text{precision} \times \text{sensitivity}) / (\text{precision} + \text{sensitivity})

.
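The definitions above translate into a short routine. The sketch below assumes that both networks are given as symmetric matrices of signed weights, with zero meaning no edge. One case is not covered by the definitions in the text: an edge present in both networks with opposite signs. Here it is counted as a false positive, which is an assumption of this sketch; the function names are hypothetical.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def edge_detection_scores(interaction, cooccurrence):
    """Compare a co-occurrence network with the reference interaction network."""
    n = len(interaction)
    tp = fp = fn = tn = 0
    for i in range(n):
        for j in range(i + 1, n):
            true_sign = sign(interaction[i][j])
            found_sign = sign(cooccurrence[i][j])
            if found_sign != 0 and found_sign == true_sign:
                tp += 1
            elif found_sign != 0:
                fp += 1  # also covers sign mismatches (assumption of this sketch)
            elif true_sign != 0:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "sensitivity": sensitivity, "specificity": specificity,
        "precision": precision, "F1": f1,
    }


# Toy example with four species.
interaction = [
    [0, 1, -1, 0],
    [1, 0, 1, 0],
    [-1, 1, 0, 0],
    [0, 0, 0, 0],
]
cooccurrence = [
    [0, 0.8, -0.5, 0],
    [0.8, 0, 0, 0],
    [-0.5, 0, 0, 0.3],
    [0, 0, 0.3, 0],
]
scores = edge_detection_scores(interaction, cooccurrence)
print(scores)
```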

The similarity of species had a large effect on network sensitivity (see Table 4). Though specificity remained high at similarities ranging from 89% to 90%, the sensitivity increased through this range with increasing similarity. Samples with relatively high similarity in species membership were therefore useful for constructing sensitive networks. Many real microbial communities have a lower percentage of shared taxa, but this is largely due to undersampling of rare species [59]. The F1-score, which reflects the balance between precision and recall in measuring species interaction or co-occurrence, consistently indicates strong performance throughout the day. An F1-score of 0.637 indicates a good balance between precision and recall for species interaction or co-occurrence at 6 a.m. This means that the model or method used to measure species interaction performs well in identifying both positive (species interactions) and negative (absence of interactions) cases at this time. At other time points, including 10 a.m., 2 p.m., 6 p.m., 10 p.m., and 2 a.m., the F1-scores range from 0.635 to 0.644. These values suggest that the employed method effectively identifies species interactions, with a particularly noteworthy performance during the nighttime hours at 2 a.m. Overall, the F1-score results highlight the method’s robustness in assessing species interactions across different times of the day.

5.2.2. Effect of Interaction Intensity in the Communities

Once the co-occurrence networks between species are constructed, we investigate the community structure that these interactions generate. In each sampling instance, we identify microbial communities based on the interaction of the corresponding species. These interaction networks of communities evolve with each sampling, both in terms of the number of communities and the composition of these communities. The depth of this identification is carried out at seven taxonomic levels. The original dataset comprises eight taxonomic levels, as described in Table 3. The sampling time reveals preferences in the interactions among certain communities. For instance, some of the microbial communities tend to be more inclined to interact during the day, likely due to the increased presence of the 18S rRNA gene within their taxonomy. Tukey–Kramer tests were conducted in this sampling. All tests resulted in

p < 0.001

in favor of rejecting the null hypothesis that there is no statistically significant difference in the means of the compared communities. The randomness test is performed on the degree distribution at all sampling points. In all of these, we find a p-value

< 0.001

, indicating that the biological network formation structure does not follow a random structure. The modularity test based on 1000 permutations yields a p-value

< 0.001

. This indicates that the formation of these communities is robust and the interactions are strongly cohesive at each sampling.
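A permutation test of modularity can be sketched as follows. The text does not state which null model was used, so the choice here, shuffling community labels across nodes while keeping the graph fixed, is an assumption, as are the function names. The partition is taken as given rather than produced by the Louvain algorithm.

```python
import random


def modularity(adj, labels):
    """Newman modularity Q of an undirected graph {node: set(neighbors)}."""
    two_m = sum(len(neighbors) for neighbors in adj.values())
    if two_m == 0:
        return 0.0
    q = 0.0
    for u in adj:
        for v in adj:
            if labels[u] != labels[v]:
                continue
            a_uv = 1.0 if v in adj[u] else 0.0
            q += a_uv - len(adj[u]) * len(adj[v]) / two_m
    return q / two_m


def permutation_p_value(adj, labels, n_perm=1000, seed=0):
    """Share of label shuffles whose modularity reaches the observed value."""
    rng = random.Random(seed)
    observed = modularity(adj, labels)
    nodes = list(adj)
    values = [labels[u] for u in nodes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        if modularity(adj, dict(zip(nodes, values))) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy graph: two cliques of six nodes joined by a single edge.
adj = {u: set() for u in range(12)}
for block in (range(0, 6), range(6, 12)):
    for u in block:
        for v in block:
            if u != v:
                adj[u].add(v)
adj[5].add(6)
adj[6].add(5)
labels = {u: 0 if u < 6 else 1 for u in adj}

q_obs, p_value = permutation_p_value(adj, labels, n_perm=500)
print(f"Q = {q_obs:.4f}, permutation p-value = {p_value:.4f}")
```

The p-value uses the add-one correction so that it is never exactly zero with a finite number of permutations.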

Figure 2 shows the different interaction networks of microbial communities. The relative frequency of the communities is described by the size of their respective node. In this work, notation DSDGIIC16DGIIC16DGIIC16XspSDGII refers to the microbial community Dinophyta-Syndiniales-Dino-Group-II-Clade-16-Dino-Group-II-Clade16Xsp.-Syndiniales-Dino-Group-II, nomenclature DSDGIIC16DGIIIC16XspSDGII refers to Dinophyta-Syndiniales-Dino-Group-II-Clade-16-Dino-Group-III-Clade16Xsp-Syndiniales-Dino-Group-II, nomenclature DSDGIIC1011DGIIC1011XspSDGII refers to Dinophyta-Syndiniales-Dino-Group-II -Clade-10-11-Dino-Group-II-Clade-10-11Xsp-Syndiniales-Dino-Group-II, MCMOPCCX to Metazoa-Craniata-Mammalia-Ochotona-princeps-Craniata-CraniataX, DDSDKspDS to Dinophyta-Dinophyceae-SuessialesX-Karlodiniumsp-Dinophyceae-Suessiales and DDSXSspDS to Dinophyta-Dinophyceae-SuessialesX-Symbiodiniumsp-Dinophyceae-Suessiales. Unlabeled nodes refer to unidentified communities according to the seven sequencing depth levels.

5.2.3. Effects of Parameters on Microbial Diversity

Microbial communities in different environments can vary widely in their composition and structure. Though the experimenter cannot necessarily influence ecological parameters, it is valuable to know which factors may cause problems in co-occurrence network inference. We considered the effect of species richness, community evenness, and similarity of communities across sampling sites.

Our analysis suggests that community evenness does not directly affect co-occurrence network sensitivity and specificity. However, it may have an indirect effect because uneven communities require increased sampling depth in order to detect the real species richness, and if this is inadequate, then the number of detected species (i.e., the effective richness) is reduced. The diversity of communities between different sites can be calculated via a variety of metrics [60]. We used a simple and intuitive metric to quantify the similarity of communities at different sampling sites: the average percentage of species shared between any two sites (i.e., the Jaccard similarity). The similarity of communities had a large effect on network sensitivity. The Jaccard index for all community networks is 0.017, indicating a dynamic configuration in the networks and thus in the microbial structure. This is of utmost importance due to the intrinsic biological complexity of genomic structure, considering that some of the taxonomic properties of the 18S rRNA gene are more expressive at certain times of the day.
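The similarity metric described above, the average share of species common to any two sites, can be written compactly. The sketch below uses made-up sample names and OTU labels purely for illustration, and it treats two empty samples as identical, which is a convention chosen here.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of taxa."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(samples):
    """Average Jaccard similarity over all unordered pairs of samples."""
    pairs = list(combinations(samples, 2))
    return sum(jaccard(samples[s], samples[t]) for s, t in pairs) / len(pairs)


# Hypothetical presence sets for three sampling times.
samples = {
    "6am": {"otu1", "otu2", "otu3"},
    "10am": {"otu2", "otu3", "otu4"},
    "2pm": {"otu5"},
}
similarity = mean_pairwise_jaccard(samples)
print(f"mean pairwise Jaccard similarity: {similarity:.4f}")
```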

5.2.4. Effect of the Non-Observable

The communities evaluated so far have not been in a steady state, representative of many complex communities [61,62]. Therefore, we investigated the ways in which the variability in unobservable site properties influences the inference of each network of communities. To achieve this, we introduced random variations in the carrying capacity of each species at each site, which can be interpreted as an introduction of between-site heterogeneity. This addition of inter-site heterogeneity, where each species has varying advantages, introduced “noise” to the dataset. Nevertheless, we mitigated the impact of the unobservable factors using Theorem 4.

Table 5 shows the variation of network statistics as the level of heterogeneity changes. We observed that the number of microbial communities varies depending on the time of day and the formula used for unobserved heterogeneity. At 6 a.m. and 10 p.m., the number of communities was lower when the formula

\frac{N - i}{N - 1} \log (\log (N))

was applied, which could indicate a higher cohesion among communities at those hours. In contrast, at 2 p.m., regardless of the formula, a constant number of communities was maintained, suggesting a more robust structure. Regarding the average node degree in the networks, there was no clear pattern of increase or decrease based on the time of day or the formulation of unobserved heterogeneity. The values fluctuated under all conditions, implying natural variability in microbial interactions. Finally, the density of the networks showed significant variations. For example, at 10 a.m. and 2 p.m., the density was relatively low, implying a lower proportion of possible connections in these networks. In contrast, at 6 p.m., a higher density was observed, suggesting greater interconnection among microbial species at that time.

6. Discussion and Conclusions

Motivated to explore the field of CNA, we study the introduction of a latent interaction index, addressing the limitations inherent in traditional compositional similarity indices, taking into account both observed and unobserved heterogeneity per node, particularly in the context of large and complex networks.

This index addresses a limitation in network formation, namely interdependence. The study of complex network formation in the presence of interdependencies is one of the focal points of recent theoretical and empirical research on networks [5,7,16]. However, with the exception of Graham’s [7] and Dzemski’s [5] models, none of these papers incorporate unobserved correlated heterogeneity within the modeling framework, unlike the approach used here. The results obtained through the development of this index (Theorems 1 and 2) demonstrate uniform consistency with respect to the homophily parameter vector and fixed effects. This assures us that the proposed index yields statistically replicable results, in line with the principles of the law of large numbers and its applicability across various domains [2,25].

Together with the above, we formulate a Shannon-type entropy measure to quantify network density. We further establish optimal bounds for this measure by using insights from network topology. Additionally, we present asymptotic properties of pointwise estimation using this entropy function. This analytical approach allows us to scrutinize the structural dynamics of composition, offering valuable insights into the intricate interactions within the network. Here, it is important to note the relevance of the dyads contributing to this measure, as a consequence of both observed and unobserved factors. Since some previously studied entropy measures do not take these characteristics into account [23,27,29,34,36], future research in various fields could usefully explore which other factors and dimensions influence the contributions of dyads in the network and, consequently, network entropy.

The results indicate that as network states become more predictable and interconnected, network entropy decreases. This decrease in entropy signifies a greater degree of structure and order within the network. Conversely, when network states exhibit greater independence and randomness, entropy increases, reflecting a more chaotic and less predictable network structure. These findings align with previous research on the interplay between network structure and entropy [13,14,63].

The application of the Shannon-type entropy function provides a robust measure for quantifying network complexity. By establishing optimal bounds for entropy estimation based on network topology, we ensure the accuracy of our analysis and enhance our ability to distinguish networks with varying complexity levels. This contributes to a more nuanced understanding of network dynamics and interactions. Simulations and comparisons with Erdös–Rényi and Barabási–Albert-type networks, in addition to the utilization of Erdös–Rényi and Shannon-type entropy, further validate the effectiveness of our proposed method. Our results demonstrate that the proposed index successfully distinguishes between networks with different degrees of complexity, even outperforming classical models in certain cases [18,19].

Despite the inherent complexity of microbiological data [40,61], our method offers a promising avenue for studying and comprehending the intricate relationships within these interaction networks and their implications under various parameter specifications. The ecological results presented here are currently under discussion with experts in the field. However, we acknowledge the possibility of simplifications and extensions of the model proposed here.

The theoretical results presented in this article allow us to formulate two statements. First, the interdependence structure in forming complex networks should not be independent of the objective parameters and unobservable node effects. This would enable researchers to discover causal relationships based on these parameters and the network formation itself, complementing some of the discussed network models [37,38]. Second, entropy measures on network structures could be more robust and consistent if only the dyads influencing their structure were considered. It is well-known that biased estimates in entropy measures of networks arise from the influence of false dyads on the system [64]. The entropy metric presented here is based solely on the contributing dyads of the network.

In conclusion, this approach enables us to capture both observed and unobserved heterogeneity per node, providing a more comprehensive understanding of interactions within ecological communities and other intricate networks. The proposed latent interaction index proves to be an invaluable tool for characterizing the structural dynamics of networks. Additionally, it is feasible to design a test to evaluate interdependencies in link formation. It is more plausible to assume that these interdependencies establish a bounded degree between pairwise interactions [21,24]. The proposed model provides feasibility and evidence of how to incorporate these interdependencies, which in many cases are probabilities conditioned on triads (groups of three nodes) [5,7]. It is worth noting that these probabilities introduce a bias in the linkage decision [6]. While work has been extensive in reducing this bias in mono-nodal estimation [16,46,47], little is known about multi-nodal structures. This inherent uncertainty led to the introduction of the entropy function studied here. It possesses the property of reflecting parameter estimates as a function of the true parameters, meaning that the estimated entropy converges to the true entropy. This finding could be a valuable contribution to the challenge of multinodal estimates.

Author Contributions

Conceptualization, methodology, and formal analysis, A.A.C.M.; validation, M.F.B.G. All authors have read and agreed to the published version of the manuscript.

Funding

A. Centeno would like to thank the Beca Doctoral UCM of Vicerrectoría de Investigación y Postgrado (VRIP) at Universidad Católica del Maule, Chile. M. Bravo-Gaete is supported by Proyecto Interno UCM-IN 22204, Linea Regular.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

We thank Doctorado en Modelamiento Matemático Aplicado, which is part of the Universidad Católica del Maule.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Proofs

To ensure the comprehensiveness of the present work, this part is reserved for the proofs for Lemma 1 and Theorems 1–7.

Proof of Theorem 1.

From Hoeffding’s inequality,

\forall ϵ > 0

,

\Pr (|\sum_{i < j} \sum_{t = 1}^{T} (C_{i j, t} - p_{i j, t})| \geq ϵ) \leq 2 \exp (\frac{- 2 ϵ^{2}}{{(1 - 2 κ)}^{2}}),

where

κ \in (0, 1)

such that

p_{i j, t} (θ, a_{N}) \in (κ, 1 - κ)

. Setting

ϵ = \sqrt{\ln (N T)}

, we have

\Pr (|\sum_{i < j} \sum_{t = 1}^{T} (C_{i j, t} - p_{i j, t})| \geq \sqrt{\ln (N T)}) \leq 2 \exp (\frac{- 2 \ln (N T)}{{(1 - 2 κ)}^{2}}) .

Here,

2 \exp (\frac{\ln (\frac{1}{{(N T)}^{2}})}{{(1 - 2 κ)}^{2}}) = 2 {(N T)}^{- 2 / {(1 - 2 κ)}^{2}} \leq 2 {(N T)}^{- 2} = O ({(N T)}^{- 2})

. Therefore,

\sup_{1 \leq i, j \leq N} \sup_{1 \leq t \leq T} |\sum_{i < j} \sum_{t = 1}^{T} (C_{i j, t} - p_{i j, t})| \leq ϵ

with probability

1 - O ({(N T)}^{- 2})

. □

Proof of Theorem 2.

We note that

E (\sum_{i < j} l_{i j, t}^{θ, a_{N}} | F) = \frac{{(\binom{N}{2})}^{- 1}}{T} [\sum_{i < j} \sum_{t = 1}^{T} c_{i j, t} \log (z_{i j, t} (θ, a_{N})) - \log (1 + z_{i j, t} (θ, a_{N})) | F] .

The last equation can be written as

\begin{matrix} E (\sum_{i < j} l_{i j, t}^{θ, a_{N}} | F) & = \frac{{(\binom{N}{2})}^{- 1}}{T} \sum_{i < j} \sum_{t = 1}^{T} (\log (\frac{z_{i j, t} (θ, a_{N})}{1 + z_{i j, t} (θ, a_{N})})) . \end{matrix}

Via Theorem 4.2.2 from [48], we have

\frac{1}{T} \sum_{t = 1}^{T} \log (\frac{z_{i j, t} (θ, a_{N})}{1 + z_{i j, t} (θ, a_{N})}) \overset{\Pr}{\to} E [\log (\frac{z_{i j, t} (θ^{0}, a_{N}^{0})}{1 + z_{i j, t} (θ^{0}, a_{N}^{0})})],

uniformly in

θ \in Θ

when

E (\sup_{θ \in Θ} | \log (\frac{z_{i j, t} (θ, a_{N})}{1 + z_{i j, t} (θ, a_{N})}) |)

is finite. Then,

E (\sum_{i < j} l_{i j, t}^{θ, a_{N}} | F) \overset{\Pr}{\to} {(\binom{N}{2})}^{- 1} \sum_{i < j} E [\log (\frac{z_{i j, t} (θ^{0}, a_{N}^{0})}{1 + z_{i j, t} (θ^{0}, a_{N}^{0})})],

uniformly in

θ \in Θ

. □

Proof of Theorem 3.

Since

\hat{θ}

is based on contribution

l_{i j, t}^{\hat{θ}, a_{N}}

, which, by Theorem 2, converges in probability to a monotonic transformation of vector

θ^{0}

, from Theorem 4.2.1 in [48], this implies that

\lim_{N, T \to + \infty} \hat{θ} = θ^{0}

and, therefore, the variance of

∥ \hat{θ} - θ^{0} ∥

tends to zero as

N, T

approach positive infinity. We let

δ > 0

be a fixed small constant. Then, via Chebyshev’s inequality, we have

\lim_{N, T \to + \infty} \Pr (∥ \hat{θ} - θ^{0} ∥ \geq δ) \leq \lim_{N, T \to + \infty} \frac{Var (∥ \hat{θ} - θ^{0} ∥)}{δ^{2}} = 0 .

This means that the probability of

\hat{θ}

being within a small neighborhood of

θ^{0}

approaches 1 as

N, T

become large. To determine the rate of convergence, we can express this difference as

∥ \hat{θ} - θ^{0} ∥ = O ({(N T)}^{- 1 / 2})

. This indicates that the convergence rate is at least

O ({(N T)}^{- 1 / 2})

. □

Proof of Lemma 1.

Let us calculate the difference

\begin{matrix} {\hat{δ}}_{i j} (x) - δ_{i j} (x) & = \frac{1}{N T} \sum_{l = 1}^{N} \sum_{t = 1}^{T} ((C_{i l, t} - C_{j l, t}) \tilde{K} (\frac{x - X_{l}}{h (N)}) - E (C_{i l, t} - C_{j l, t} | X, A, D)) \\ & = \frac{1}{N T} \sum_{l = 1}^{N} \sum_{t = 1}^{T} ((C_{i l, t} \tilde{K} (\frac{x - X_{l}}{h (N)}) - E (C_{i l, t} | X, A, D)) - \\ & - (C_{j l, t} \tilde{K} (\frac{x - X_{l}}{h (N)}) - E (C_{j l, t} | X, A, D))) \end{matrix}

. Applying Hoeffding’s inequality twice, we obtain

\begin{matrix} | {\hat{δ}}_{i j} (x) - δ_{i j} (x) | & \leq \frac{1}{N T} (| \sum_{l = 1}^{N} \sum_{t = 1}^{T} (C_{i l, t} \tilde{K} (\frac{x - X_{l}}{h (N)}) - E (C_{i l, t} | X, A, D)) | + \\ & + | \sum_{l = 1}^{N} \sum_{t = 1}^{T} (C_{j l, t} \tilde{K} (\frac{x - X_{l}}{h (N)}) - E (C_{j l, t} | X, A, D)) |) = O_{p} ({(N T)}^{- 1}) . \end{matrix}

□

Proof of Theorem 4.

We denote

{\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) = \frac{ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))}{\sum_{i < j} ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))}

with

ℓ_{N T} (x) = L (\frac{x}{σ_{N}})

.

First, we note that

\begin{matrix} {\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) X_{i l, t} X_{j l, t}^{⊤} \hat{θ} & = {\tilde{L}}_{N T} (δ_{i j} (x_{l})) X_{i l, t} X_{j l, t}^{⊤} θ + X_{i l, t} X_{j l, t}^{⊤} (\hat{θ} - θ) {\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) + \\ & + X_{i l, t} X_{j l, t}^{⊤} θ ({\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) - {\tilde{L}}_{N T} (δ_{i j} (x_{l}))) . \end{matrix}

Since

X_{i l, t}

,

X_{j l, t}

, and

δ_{i j} (x_{l})

are bounded for all

l = 1, \dots, N

, by mean value expansion of

ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))

around

ℓ_{N T} (δ_{i j} (x_{l}))

, we have

ℓ_{N T} ({\hat{δ}}_{i j} (x_{l})) - ℓ_{N T} (δ_{i j} (x_{l})) = ℓ_{N T}^{'} (ξ) ({\hat{δ}}_{i j} (x_{l}) - δ_{i j} (x_{l}))

. Here,

| ℓ_{N T} ({\hat{δ}}_{i j} (x_{l})) - ℓ_{N T} (δ_{i j} (x_{l})) | = O_{p} (\sup | {\hat{δ}}_{i j} (x_{l}) - δ_{i j} (x_{l}) |)

. By Lemma 1,

| ℓ_{N T} ({\hat{δ}}_{i j} (x_{l})) - ℓ_{N T} (δ_{i j} (x_{l})) | = O_{p} ({(N T)}^{- 1})

. By Theorem 1 and Chebyshev's inequality,

\Pr (| X_{i l, t} X_{j l, t}^{⊤} (\hat{θ} - θ) {\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{l})) | > ϵ) \leq E (\frac{1}{ϵ^{2}} {(X_{i l, t} X_{j l, t}^{⊤} (\hat{θ} - θ) {\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{l})))}^{2}),

for all

ϵ > 0, \forall θ, \hat{θ} \in Θ .

Now, from the Cauchy–Schwarz inequality,

\begin{matrix} E {(X_{i l, t} X_{j l, t}^{⊤} (\hat{θ} - θ) {\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{l})))}^{2} & \leq E {(X_{i l, t} X_{j l, t}^{⊤} (\hat{θ} - θ) {\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{l})))}^{2} \cdot \\ \cdot E {(X_{i k, t} X_{j k, t}^{⊤} (\hat{θ} - θ) {\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{k})))}^{2} . \end{matrix}

(A1)

In addition,

E {(X_{i k, t} X_{j k, t}^{⊤} (\hat{θ} - θ) {\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{k})))}^{2} = O_{p} (σ_{n} {(N T)}^{- 1})

. Second, we note that

\begin{matrix} \sum_{i < j} \sum_{t} X_{i l, t} X_{j l, t}^{⊤} θ ({\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{l})) - {\tilde{L}}_{n t} (δ_{i j} (x_{l}))) = & \sum_{i < j} \sum_{t} X_{i l, t} X_{j l, t}^{⊤} θ \times \\ \times [\frac{ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))}{\sum_{i < j} ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))} - \frac{ℓ_{N T} (δ_{i j} (x_{l}))}{\sum_{i < j} ℓ_{N T} (δ_{i j} (x_{l}))}] \\ \leq \frac{\sum_{i < j} \sum_{t} ℓ_{N T} ({\hat{δ}}_{i j} (x_{l})) X_{i l, t} X_{j l, t}^{⊤} θ}{\sum_{i < j} ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))} \cdot \\ \cdot \frac{\sum_{i < j} \sum_{t} ℓ_{N T} (δ_{i j} (x_{l})) - ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))}{\sum_{i < j} ℓ_{N T} (δ_{i j} (x_{l}))} + \\ + \frac{\sum_{i < j} \sum_{t} ℓ_{N T} (δ_{i j} (x_{l})) - ℓ_{N T} ({\hat{δ}}_{i j} (x_{l})) X_{i l, t} X_{j l, t}^{⊤} θ}{\sum_{l} ℓ_{N T} (δ_{i j} (x_{l}))} \\ = O_{p} ({(N T)}^{- 1}) + \\ + \frac{\sum_{i < j} \sum_{t} (ℓ_{N T} (δ_{i j} (x_{l})) - ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))) \bar{X (θ)}}{\sum_{i < j} ℓ_{N T} (δ_{i j} (x_{l}))} \end{matrix}

with

\sup_{X, t} X_{i l, t} X_{j l, t}^{⊤} θ = \bar{X (θ)}

.

On the other hand,

| ({\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) - {\tilde{L}}_{N T} (δ_{i j} (x_{l}))) X_{i l, t} X_{j l, t}^{⊤} θ | \leq 2 \frac{\sum_{i < j} \sum_{t} [ℓ_{N T} (δ_{i j} (x_{l})) - ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))] \bar{X (θ)}}{\sum_{i < j} ℓ_{N T} (δ_{i j} (x_{l}))} .

Now, we note that

\frac{1}{N} \sum_{i < j} | ℓ_{N T} (δ_{i j} (x_{l})) - ℓ_{N T} ({\hat{δ}}_{i j} (x_{l})) | = O_{p} ({(N T σ_{n})}^{- 1})

for all

l \neq i, j

, and

ℓ_{N T} (δ_{i j} (x_{l})) = E (ℓ_{N T} (δ_{i j} (x_{l})) | X, A, D) + [ℓ_{N T} (δ_{i j} (x_{l})) - E (ℓ_{N T} (δ_{i j} (x_{l})) | X, A, D)] .

(A2)

We note that

E (ℓ_{N T} (δ_{i j} (x_{l})) | X, A, D)

tends to zero when

N \to \infty

. Using Hoeffding’s inequality

\Pr (| ℓ_{N T} (δ_{i j} (x_{l})) - E (ℓ_{N T} (δ_{i j} (x_{l})) | X, A, D) | \geq ϵ_{0}) < O ({(N T)}^{- 1 / 2})

\forall ϵ_{0} > 0

. Therefore,

| \sum_{i < j} \sum_{t} X_{i l, t} X_{j l, t}^{⊤} θ ({\tilde{L}}_{n t} ({\hat{δ}}_{i j} (x_{l}))) - {\tilde{L}}_{n t} (δ_{i j} (x_{l})) | = O_{p} ({(N T σ_{n})}^{- 1}) \forall l \neq i, j .

Finally, by a mean-value expansion of the logit distribution

Λ

, we have

Λ (z_{i l, t} (θ, A_{i l})) - Λ (z_{j l, t} (θ, A_{j l})) = Λ^{'} (η) (\sum_{p = 1}^{q} α_{p} C_{i l, t - p} - \sum_{p = 1}^{q} α_{p} C_{j l, t - p} + X_{i l, t} X_{j l, t}^{⊤} θ + A_{i j});

then, there exist nonzero

K_{1}, K_{2}, K_{3}

such that

K_{1} | Λ (z_{i l, t} (θ, A_{i l})) - Λ (z_{j l, t} (θ, A_{j l})) | \leq K_{2} | Λ (z_{i l, t} (θ, A_{i l})) - Λ (z_{j l, t} (θ, A_{j l})) |,

while

| {(Λ^{'})}^{- 1} (η) | \leq K_{3}

. Here,

\begin{matrix} | {\hat{A}}_{l} (θ) - A_{l} (θ) | = & \frac{1}{N} | \sum_{i < j} {\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) (X_{i l, t} X_{j l, t}^{⊤} - A_{l}) | \\ = & \frac{1}{N} | \sum_{i < j} {\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) [Λ (z_{i l, t} (θ, A_{i l})) - Λ (z_{j l, t} (θ, A_{j l})) - Λ^{'} (η) \cdot \\ \cdot (\sum_{p = 1}^{q} α_{p} C_{i l, t - p} - \sum_{p = 1}^{q} α_{p} C_{j l, t - p})] {(Λ^{'})}^{- 1} (η) | \\ \leq \frac{1}{N} | \sum_{i < j} {\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) [δ_{i j} (x_{l}) {(Λ^{'})}^{- 1} (η) - (\sum_{p = 1}^{q} α_{p} C_{i l, t - p} - \sum_{p = 1}^{q} α_{p} C_{j l, t - p})] | \\ \leq \frac{1}{N} | \sum_{i < j} {\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) K_{2} K_{3} δ_{i j} (x_{l}) | + | (\sum_{p = 1}^{q} α_{p} C_{i l, t - p} - \sum_{p = 1}^{q} α_{p} C_{j l, t - p}) | \\ \leq \frac{1}{N} | \sum_{i < j} {\tilde{L}}_{N T} ({\hat{δ}}_{i j} (x_{l})) K_{2} K_{3} δ_{i j} (x_{l}) | + O_{p} (1) \\ = \frac{1}{N} | \sum_{i < j} \frac{ℓ_{N T} ({\hat{δ}}_{i j} (x_{l})) K_{2} K_{3} δ_{i j} (x_{l})}{\sum_{i < j} ℓ_{N T} ({\hat{δ}}_{i j} (x_{l}))} | + O_{p} (1) \\ = O_{p} (\max \{{(N T σ_{N})}^{- 1}, σ_{N} {(N T)}^{- 1}\}), \end{matrix}

where in the last equality we use Equation (A2). □
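The normalized weights introduced at the start of this proof, namely a kernel evaluated at the scaled distance and divided by its sum over dyads, can be sketched numerically as follows. This is only an illustration: the Gaussian kernel, the function names, and the distance values are assumptions, since the proof does not fix a particular kernel.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used here as an assumed choice for L."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def normalized_weights(delta_hat, sigma, kernel=gaussian_kernel):
    """Weights kernel(delta / sigma) divided by their sum over all dyads."""
    raw = kernel(np.asarray(delta_hat, dtype=float) / sigma)
    return raw / raw.sum()


if __name__ == "__main__":
    delta_hat = [0.0, 0.1, 1.0, 3.0]  # hypothetical estimated distances
    for sigma in (0.5, 0.05):
        print(sigma, normalized_weights(delta_hat, sigma))
```

As the bandwidth shrinks, the weight concentrates on the dyads with the smallest estimated distance, which is the mechanism the proof relies on.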

Proof of Theorem 5.

By Theorem 2 (Law of Large Numbers) and Theorem 3, we have

\begin{matrix} E_{a} (\hat{H} (C)) & = - \sum_{i < j} \sum_{t = 1}^{T} E_{a} (l_{i j, t}^{\hat{θ}, {\hat{a}}_{N}} \log (l_{i j, t}^{\hat{θ}, {\hat{a}}_{N}}) | F) \\ \overset{\Pr}{\to} - T {(\binom{N}{2})}^{- 1} \sum_{i < j} E (\log (\frac{z_{i j, t} (θ^{0}, a_{N}^{0})}{1 + z_{i j, t} (θ^{0}, a_{N}^{0})}) \log (l_{i j, t}^{θ^{0}, a_{N}^{0}})) \\ = - T \sum_{i < j} E (l_{i j, t}^{θ^{0}, a_{N}^{0}} \log (l_{i j, t}^{θ^{0}, a_{N}^{0}})) \\ = - \sum_{i < j} \sum_{t = 1}^{T} l_{i j, t}^{θ^{0}, a_{N}^{0}} \log (l_{i j, t}^{θ^{0}, a_{N}^{0}}) \\ = H_{0} (C) . \end{matrix}

Then, for all

ϵ > 0

,

| \hat{H} (C) - H_{0} (C) | < ϵ

. Let us set

ϵ = \frac{1}{N T} \sqrt{\log (N T)}

; then,

\Pr (| \hat{H} (C) - H_{0} (C) | \geq \frac{1}{N T} \sqrt{\log (N T)}) \leq \frac{\log (N T)}{{(\frac{1}{N T} \sqrt{\log (N T)})}^{2} N T} = O ({(N T)}^{- 1}) .

(A3)

Hence, with probability of at least

1 - O ({(N T)}^{- 1})

, we have

| \hat{H} (C) - H_{0} (C) | < \frac{1}{N T} \sqrt{\log (N T)}

for any network C containing the pair of nodes

i, j

. Taking the supremum over all

C \in \mathcal{C}

, we have

\sup_{C \in \mathcal{C}} | \hat{H} (C) - H_{0} (C) | < \frac{1}{N T} \sqrt{\log (N T)},

(A4)

with probability

1 - O ({(N T)}^{- 1})

. □
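The plug-in idea behind this result can be illustrated with a small numerical sketch: the Shannon-type entropy is evaluated at perturbed link probabilities and compared with its value at the reference probabilities. The logistic link, the array sizes, and the additive perturbation of the linear index are assumptions made for illustration only; they do not reproduce the estimator of the main text.

```python
import numpy as np


def logistic(z):
    """Logistic link mapping a linear index to a link probability."""
    return 1.0 / (1.0 + np.exp(-z))


def network_entropy(link_probs):
    """Shannon-type entropy -sum(l * log(l)) over dyads and periods."""
    p = np.asarray(link_probs, dtype=float)
    # Convention 0 * log(0) = 0: replace zeros inside the log by 1.
    safe = np.where(p > 0, p, 1.0)
    return float(-np.sum(p * np.log(safe)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z0 = rng.normal(size=(45, 5))        # hypothetical indices: 45 dyads, 5 periods
    noise = rng.normal(size=z0.shape)    # hypothetical estimation error direction
    h0 = network_entropy(logistic(z0))
    for scale in (0.5, 0.05, 0.005):
        h_hat = network_entropy(logistic(z0 + scale * noise))
        print(scale, abs(h_hat - h0))
```

Because the entropy is a smooth function of the linear index, the gap between the plug-in value and the reference value is controlled by the size of the perturbation.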

Proof of Theorem 6.

From Theorem 3 and the first-order condition associated with the concentrated log-likelihood, a mean value expansion yields

\sqrt{N T} (\hat{θ} - θ^{0}) = - {[\frac{1}{N T} \sum_{i < j} \sum_{t = 1}^{T} \frac{\partial s_{i j t, θ} (\bar{θ}, {\hat{a}}_{N} (\bar{θ}))}{\partial θ}]}^{- 1} [\frac{1}{\sqrt{N T}} \sum_{j < i} \sum_{t = 1}^{T} \frac{\partial s_{i j t, θ} (θ^{0}, {\hat{a}}_{N} (θ^{0}))}{\partial θ}],

where

s_{i j t, θ} (θ, a_{N})

denotes the

{i j}

th dyad’s contribution to the score of the maximum likelihood estimator associated with the vector

θ

. After applying the result for the Hessian of the concentrated log-likelihood derived immediately above, we obtain

\sqrt{N T} (\hat{θ} - θ^{0}) = J_{0}^{- 1} (θ) \times [\frac{1}{\sqrt{N T}} \sum_{j < i} \sum_{t = 1}^{T} \frac{\partial E (s_{i j t, θ} (θ^{0}, {\hat{a}}_{N} (θ^{0})))}{\partial θ}] + o_{p} (1)

(A5)

since

\frac{1}{N T} \sum_{i < j} \sum_{t = 1}^{T} \frac{\partial E (s_{i j t, θ} (\bar{θ}, {\hat{a}}_{N} (\bar{θ})))}{\partial θ} \overset{\Pr}{\to} - J_{0} (θ)

. Tedious calculations, along with the calculations immediately above, produce

\frac{1}{\sqrt{N T}} \sum_{j < i} \sum_{t = 1}^{T} \frac{\partial E (s_{i j t, θ^{0}} (θ^{0}, {\hat{a}}_{N} (θ^{0})))}{\partial θ} = \frac{1}{\sqrt{N T}} \sum_{j < i} \sum_{t = 1}^{T} \frac{\partial E ({\hat{s}}_{i j t, θ^{0}} (θ^{0}, {\hat{a}}_{N}^{0}))}{\partial θ} + o_{p} (1),

(A6)

where

{\hat{s}}_{i j t, \hat{θ}} (\hat{θ}, {\hat{a}}_{N}) = s_{i j t, \hat{θ}} (\hat{θ}, {\hat{a}}_{N}) - H_{θ a_{N}^{⊤}} (C) H_{a_{N} a_{N}^{⊤}} (C) \cdot s_{i j t, a_{N}} (\hat{θ}, {\hat{a}}_{N})

. Substituting Equation (A6) into Equation (A5) results in

\sqrt{N T} (\hat{θ} - θ^{0}) = J_{0}^{- 1} (θ) + J_{0}^{- 1} (θ) \frac{1}{\sqrt{N T}} \sum_{j < i} \sum_{t = 1}^{T} \frac{\partial E ({\hat{s}}_{i j t, θ^{0}} (θ^{0}, {\hat{a}}_{N}))}{\partial θ} + o_{p} (1)

(A7)

Applying the Central Limit Theorem to the second addend, we have

\sqrt{N T} (\hat{θ} - θ^{0}) \overset{D}{\to} N (J_{0}^{- 1} (θ), J_{0} {(θ)}^{1 / 2})

. □

Proof of Theorem 7.

(i) Here, we note that

\begin{matrix} H (C) = & - \sum_{i < j} \sum_{t = 1}^{T} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) = - \sum_{i = 1}^{N - 1} \sum_{j = i + 1}^{N} \sum_{t = 1}^{T} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) \\ = - \sum_{t = 1}^{T} [\sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) - \sum_{i = 1}^{N - 1} \frac{l_{i (i + 1), t}^{θ, a_{N}}}{K} \log (\frac{l_{i (i + 1), t}^{θ, a_{N}}}{K}) \\ - \sum_{i = 1}^{N - 1} \frac{l_{i N, t}^{θ, a_{N}}}{K} \log (\frac{l_{i N, t}^{θ, a_{N}}}{K})], \end{matrix}

where we can write

\sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} l_{i j, t}^{θ, a_{N}} = \sum_{i = 1}^{N - 1} \sum_{j = i + 1}^{N} l_{i j, t}^{θ, a_{N}} - \sum_{i = 1}^{N - 1} (l_{i (i + 1), t}^{θ, a_{N}} - l_{i N, t}^{θ, a_{N}}),

which allows us to define a new vector over the parameters, i.e.,

q_{t} = {(q_{i j, t})}_{i = 1, j = i + 1}^{N - 1, N}

as

q_{i j, t} = \frac{l_{i j, t}^{θ, a_{N}}}{\sum_{k = 1}^{N - 1} \sum_{l = k + 1}^{N} l_{k l, t} - \sum_{k = 1}^{N - 1} (l_{k (k + 1), t}^{θ, a_{N}} - l_{k N, t}^{θ, a_{N}})} = \frac{l_{i j, t}^{θ, a_{N}}}{ρ_{k l, N}} .

From this definition arises the fact that

0 \leq q_{i j, t} \leq 1

; then,

\begin{matrix} H (q_{t}) = & - \sum_{i = 1}^{N - 1} \sum_{j = i + 1}^{N} q_{i j, t} \log (q_{i j, t}) \\ = - \sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{ρ_{k l, N}} \log (\frac{l_{i j, t}^{θ, a_{N}}}{ρ_{k l, N}}) \\ = - \sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{ρ_{k l, N}} [\log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) + \log (\frac{K}{ρ_{k l, N}})] \\ = - \frac{K}{ρ_{k l, N}} \sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) - \sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N} \frac{l_{i j, t}^{θ, a_{N}}}{ρ_{k l, N}} \log (\frac{K}{ρ_{k l, N}}) \\ = - \frac{K}{ρ_{k l, N}} \sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) - \log (\frac{K}{ρ_{k l, N}}) . \end{matrix}

Returning to the entropy with respect to vector

q_{t}

, we have

- \sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) = \frac{ρ_{k l, N}}{K} [H (q_{t}) + \log (\frac{K}{ρ_{k l, N}})],

and via Theorem 2.6.4 from [51],

H (q_{t}) \leq \log (N (N - 1)) .

Then,

- \sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) \leq \frac{ρ_{k l, N}}{K} \log (\frac{K (N (N - 1))}{ρ_{k l, N}}) .

Therefore, the entropy satisfies

\begin{matrix} H (C) & = - \sum_{t = 1}^{T} [\sum_{i = 1}^{N - 1} \sum_{j = i + 2}^{N - 1} \frac{l_{i j, t}^{θ, a_{N}}}{K} \log (\frac{l_{i j, t}^{θ, a_{N}}}{K}) - \sum_{i = 1}^{N - 1} \frac{l_{i (i + 1), t}^{θ, a_{N}}}{K} \log (\frac{l_{i (i + 1), t}^{θ, a_{N}}}{K}) \\ - \sum_{i = 1}^{N - 1} \frac{l_{i N, t}^{θ, a_{N}}}{K} \log (\frac{l_{i N, t}^{θ, a_{N}}}{K})] \\ \leq \frac{ρ_{k l, N}}{K} \log (\frac{K (N (N - 1))}{ρ_{k l, N}}) - \sum_{i = 1}^{N - 1} \frac{l_{i (i + 1), t}^{θ, a_{N}}}{K} \log (\frac{l_{i (i + 1), t}^{θ, a_{N}}}{K}) \\ - \sum_{i = 1}^{N - 1} \frac{l_{i N, t}^{θ, a_{N}}}{K} \log (\frac{l_{i N, t}^{θ, a_{N}}}{K}) \end{matrix}

\begin{matrix} \leq \frac{ρ_{k l, N}}{K} \log (\frac{K (N (N - 1))}{ρ_{k l, N}}) - \frac{1}{K} [(\sum_{i = 1}^{N - 1} l_{i (i + 1), t}^{θ, a_{N}}) \log (\frac{\sum_{i = 1}^{N - 1} l_{i (i + 1), t}^{θ, a_{N}}}{(N - 1) K}) - \\ - (\sum_{i = 1}^{N - 1} l_{i N, t}^{θ, a_{N}}) \log (\frac{\sum_{i = 1}^{N - 1} l_{i N, t}^{θ, a_{N}}}{(N - 1) K})] . \end{matrix}
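The step in part (i) that bounds the entropy of the normalized vector by a logarithm of the number of dyads can be checked numerically. The sketch below assumes one hypothetical link probability per unordered pair of nodes; it verifies that the entropy of the normalized vector never exceeds the logarithm of the number of pairs, which in turn is below the looser bound log(N(N - 1)) used in the proof.

```python
import numpy as np


def normalize(values):
    """Rescale nonnegative values so that they sum to one."""
    v = np.asarray(values, dtype=float)
    return v / v.sum()


def entropy(q):
    """Shannon entropy (natural logarithm) of a probability vector."""
    q = np.asarray(q, dtype=float)
    q = q[q > 0]  # convention 0 * log(0) = 0
    return float(-np.sum(q * np.log(q)))


def max_entropy_bound(n_nodes):
    """The bound log(N (N - 1)) appearing in the proof."""
    return float(np.log(n_nodes * (n_nodes - 1)))


if __name__ == "__main__":
    n_nodes = 6
    rng = np.random.default_rng(1)
    # Hypothetical link probabilities, one per dyad i < j.
    links = rng.uniform(0.05, 0.95, size=n_nodes * (n_nodes - 1) // 2)
    q = normalize(links)
    print(entropy(q), np.log(q.size), max_entropy_bound(n_nodes))
```

The uniform vector attains the logarithm of the number of pairs, so the bound in the proof is never tight for N greater than 2 but is sufficient for the argument.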

(ii) This proof is based on the fact that logarithms are concave functions. We know that

\log_{b} (x) - \log_{b} (y) \leq \frac{1}{\ln (b)} (\frac{x - y}{y})

,

\forall x, y > 0

. Then,

\log (ϱ_{i j, F}) - \log (ϱ_{i j, F^{'}}) \leq \frac{1}{\ln (2)} (\frac{ϱ_{i j, F} - ϱ_{i j, F^{'}}}{ϱ_{i j, F^{'}}}),

which allows us to construct the following inequality:

ϱ_{i j, F} \log (ϱ_{i j, F}) - ϱ_{i j, F} \log (ϱ_{i j, F^{'}}) \leq \frac{1}{\ln (2)} (\frac{{(ϱ_{i j, F})}^{2} - ϱ_{i j, F} ϱ_{i j, F^{'}}}{ϱ_{i j, F^{'}}}) .

Then,

- ϱ_{i j, F} \log (ϱ_{i j, F}) \geq - \frac{1}{\ln (2)} (\frac{{(ϱ_{i j, F})}^{2} - ϱ_{i j, F} ϱ_{i j, F^{'}}}{ϱ_{i j, F^{'}}}) - ϱ_{i j, F} \log (ϱ_{i j, F^{'}}) .

Finally, considering the sum over i and over the corresponding j, we have the desired result. □
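The termwise inequality derived in part (ii) can be verified numerically on a grid. The following sketch assumes base-2 logarithms, as in the proof, and uses hypothetical function names; it checks that each entropy term is bounded below by the expression obtained from the concavity of the logarithm, with equality when the two arguments coincide.

```python
import math


def entropy_term(rho):
    """Left-hand side: -rho * log2(rho) for 0 < rho <= 1."""
    return -rho * math.log(rho, 2.0)


def lower_bound(rho, rho_prime):
    """Right-hand side of the final inequality in part (ii):
    -(rho^2 - rho * rho') / (rho' * ln 2) - rho * log2(rho')."""
    tangent = (rho**2 - rho * rho_prime) / (rho_prime * math.log(2.0))
    return -tangent - rho * math.log(rho_prime, 2.0)


if __name__ == "__main__":
    grid = [k / 20.0 for k in range(1, 20)]
    worst = min(entropy_term(r) - lower_bound(r, rp) for r in grid for rp in grid)
    print(worst)  # nonnegative up to rounding error
```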

Appendix A.2. Weighted Likelihood

We start by investigating a panel data logit model with fixed effects, a p-lagged dependent variable, and a set of strictly exogenous explanatory variables. We let

C_{i j}^{T} = (C_{i j, t - 1}, C_{i j, t - 2}, \dots) .

The logit model can be written as

\Pr (C_{i j, t} = 1 | C^{T - 1}, X, D, A) = \frac{\exp (\sum_{l = 1}^{p} α_{l} C_{i j, t - l} + β X_{i j, t}^{⊤} + D_{i j, t} + A_{i j})}{1 + \exp (\sum_{l = 1}^{p} α_{l} C_{i j, t - l} + β X_{i j, t}^{⊤} + D_{i j, t} + A_{i j})},

(A8)

with

i, j \in \{1, \dots, N\}

and

t \in \{1, \dots, T\}

. We assume that the autoregressive order

p \in \{1, 2, \dots\}

is known, and the outcomes

C_{i j, t}

are observed for time

t = t_{0}, \dots, T

. The total number of time periods for which outcomes are observed is

T_{obs} = T + p

. All probabilistic statements are for the model distribution generated under the true model parameters

α^{0}

and

β^{0}

. For example, if

C_{i j} = {(C_{i j, 1}, \dots, C_{i j, T})}^{⊤}

,

X_{i j} = {(X_{i j, 1}, \dots, X_{i j, T})}^{⊤}

, and

D_{i j} = {(D_{i j, 1}, \dots, D_{i j, T})}^{⊤}

, then

\Pr (C_{i j, t} = 1 | C_{i j, 0} = c_{0}, X_{i j} = x, D_{i j} = d, A_{i j} = a) = p_{c_{0}} (α^{0}, β^{0}, c, x, d, a)

, where

\begin{matrix} p_{c_{0}} (α, β, c, x, d, a) & : = & \prod_{i < j} \prod_{t = 1}^{T} {[\frac{1}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{i j, t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})}]}^{1 - c_{i j, t}} \\ \times & {[\frac{\exp (\sum_{l = 1}^{p} α_{l} c_{i j, t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{i j, t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})}]}^{c_{i j, t}} . \end{matrix}

(A9)
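To make Equations (A8) and (A9) concrete, the following minimal sketch evaluates the link probability for one dyad and period and accumulates the log of that dyad's factor in the likelihood. The function names, argument layout, and the convention that the first p entries of the outcome sequence are initial conditions are assumptions made for illustration.

```python
import numpy as np


def link_probability(c_lags, x, alpha, beta, d=0.0, a=0.0):
    """Logit probability of a link, as in (A8), for one dyad and one period.
    c_lags holds the lagged outcomes ordered from lag 1 to lag p."""
    z = np.dot(alpha, c_lags) + np.dot(beta, x) + d + a
    return 1.0 / (1.0 + np.exp(-z))


def dyad_log_likelihood(c, x, alpha, beta, d, a):
    """Log of one dyad's factor in (A9).
    c has length p + T (first p entries are initial conditions),
    x has shape (T, k), d has length T, and a is a scalar fixed effect."""
    p = len(alpha)
    periods = len(c) - p
    total = 0.0
    for t in range(periods):
        lags = c[t:t + p][::-1]  # outcomes at lags 1, ..., p for this period
        prob = link_probability(lags, x[t], alpha, beta, d[t], a)
        y = c[p + t]
        total += y * np.log(prob) + (1 - y) * np.log(1.0 - prob)
    return total


if __name__ == "__main__":
    c = [1, 0, 1, 1]                 # hypothetical outcomes with p = 1
    x = np.array([[0.2], [-0.1], [0.4]])
    d = np.zeros(3)
    print(dyad_log_likelihood(c, x, alpha=[0.8], beta=[0.5], d=d, a=-0.3))
```

Summing this quantity over all dyads gives the logarithm of the full product in (A9).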

We drop index

i j

until we discuss estimation, that is, instead of

C_{i j, t}

,

X_{i j, t}

,

D_{i j, t}

, and

A_{i j}

, we simply write C, X, D, and A for the corresponding random variables. Our first goal is to obtain a conditional likelihood, under some considerations in Equation (1), that is valid regardless of the realization of the fixed effects A. Following [44], we consider the events

\begin{matrix} B_{1} = {c_{i j, 0} = c_{0}, \dots, c_{i j, t - 1} = c_{t - 1}, c_{i j, t} = 0, c_{i j, t + 1} = c_{t + 1}, \dots, \\ c_{i j, s - 1} = c_{s - 1}, c_{i j, s} = 1, c_{i j, s + 1} = c_{s + 1}, \dots, c_{i j, T} = c_{T}}, \end{matrix}

\begin{matrix} B_{2} = {c_{i j, 0} = c_{0}, \dots, c_{i j, t - 1} = c_{t - 1}, c_{i j, t} = 1, c_{i j, t + 1} = c_{t + 1}, \dots, \\ c_{i j, s - 1} = c_{s - 1}, c_{i j, s} = 0, c_{i j, s + 1} = c_{s + 1}, \dots, c_{i j, T} = c_{T}}, \end{matrix}

where

1 \leq t < s < T - 1

and

c_{0}, c_{1}, c_{2}, \dots, c_{T}

are either 0 or 1. A simple calculation shows that

\begin{matrix} \Pr (B_{1} | c, x, d, a) & = & \frac{1}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})} \\ \times & \frac{\exp {(\sum_{l = 2}^{p} α_{l} c_{t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})}^{c_{t + 1}}}{1 + \exp (\sum_{l = 2}^{p} α_{l} c_{t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})} \\ \times & \frac{\exp (\sum_{l = 1}^{p} α_{l} c_{s - l} + β x_{i j, s}^{⊤} + d_{i j, s} + a_{i j})}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{s - l} + β x_{i j, s}^{⊤} + d_{i j, s} + a_{i j})} \\ \times & \frac{\exp {(\sum_{l = 1}^{p} α_{l} c_{s - l} + β x_{i j, s + 1}^{⊤} + d_{i j, s + 1} + a_{i j})}^{c_{s + 1}}}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{s - l} + β x_{i j, s + 1}^{⊤} + d_{i j, s + 1} + a_{i j})}, \end{matrix}

and

\begin{matrix} \Pr (B_{2} | c, x, d, a) & = & \frac{\exp (\sum_{l = 1}^{p} α_{l} c_{t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})} \\ \times & \frac{\exp {(\sum_{l = 1}^{p} α_{l} c_{t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})}^{c_{t + 1}}}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{t - l} + β x_{i j, t}^{⊤} + d_{i j, t} + a_{i j})} \\ \times & \frac{1}{1 + \exp (\sum_{l = 1}^{p} α_{l} c_{s - l} + β x_{i j, s}^{⊤} + d_{i j, s} + a_{i j})} \\ \times & \frac{\exp {(\sum_{l = 2}^{p} α_{l} c_{s - l} + β x_{i j, s + 1}^{⊤} + d_{i j, s + 1} + a_{i j})}^{c_{s + 1}}}{1 + \exp (\sum_{l = 2}^{p} α_{l} c_{s - l} + β x_{i j, s + 1}^{⊤} + d_{i j, s + 1} + a_{i j})} . \end{matrix}

Then,

\begin{matrix} \Pr (B_{1} | c, x, d, a, B_{1} \cup B_{2}, x_{i j, t + 1} = x_{i j, s + 1}) = \\ \frac{1}{1 + \exp ((x_{t} - x_{s}) β + α_{1} (c_{t + 1} - c_{s + 1}) + \sum_{l = 1}^{p} α_{l} (c_{t - l} - c_{s - l}) 1 {s - t > 1})}, \end{matrix}

(A10)

and

\begin{matrix} \Pr (B_{2} | c, x, d, a, B_{1} \cup B_{2}, x_{i j, t + 1} = x_{i j, s + 1}) = \\ \frac{\exp ((x_{t} - x_{s}) β + α_{1} (c_{t + 1} - c_{s + 1}) + \sum_{l = 1}^{p} α_{l} (c_{t - l} - c_{s - l}) 1 {s - t > 1})}{1 + \exp ((x_{t} - x_{s}) β + α_{1} (c_{t + 1} - c_{s + 1}) + \sum_{l = 1}^{p} α_{l} (c_{t - l} - c_{s - l}) 1 {s - t > 1})}, \end{matrix}

(A11)

which does not depend on

a_{i j}

. In the special case where all the explanatory variables and the stochastic process

\{x_{i j, t}\}_{t \in T}

satisfy

\Pr (x_{i j, t + 1} - x_{i j, s + 1} = 0) > 0

, we can use Equations (A10) and (A11) to make inference about

α

and

β

. In particular,

\begin{matrix} (\hat{α}, \hat{β}) = \arg \max_{a, b} \frac{1}{T} {(\binom{N}{2})}^{- 1} \sum_{i < j} \sum_{2 \leq t < s \leq T - 1} 1 \{c_{i j, t} + c_{i j, s} = 1\} K (\frac{x_{i j, t + 1} - x_{i j, s + 1}}{σ_{n}}) \\ \times \log [\frac{\exp {((x_{t} - x_{s}) b + a_{1} (c_{t + 1} - c_{s + 1}) + \sum_{l = 1}^{p} a_{l} (c_{t - l} - c_{s - l}) 1 \{s - t > 1\})}^{c_{i j, t}}}{1 + \exp ((x_{t} - x_{s}) b + a_{1} (c_{t + 1} - c_{s + 1}) + \sum_{l = 1}^{p} a_{l} (c_{t - l} - c_{s - l}) 1 \{s - t > 1\})}] . \end{matrix}

(A12)

Here,

K (\cdot)

is a kernel density function that gives appropriate weight to link

{i j}

, while

σ_{n}

is a bandwidth that shrinks as n increases. The asymptotic theory requires that the kernel density be chosen so that a number of regularity conditions are satisfied. If

\Pr (X_{i j, t + 1} = X_{i j, s + 1}) > 0

(e.g., discrete covariates or controlled experiments) and

X_{i j, t} - X_{i j, s}

has sufficient variation conditional on

X_{i j, t + 1} = X_{i j, s + 1}

, then the

K (\cdot)

function can be replaced by a

1 (X_{i j, t + 1} - X_{i j, s + 1} = 0)

indicator function, and the resulting estimator has the usual

{(N T)}^{- 1 / 2}

rate of convergence. However, if the regressors are continuous or have high dimensions, then the estimator, while still consistent and asymptotically normal, has a convergence rate slower than

{(N T)}^{- 1 / 2}

. Also, this rate falls as the number of covariates increases.
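The reason Equations (A10) and (A11) do not depend on the fixed effect can be seen in a simplified setting. The sketch below assumes a static logit with no lagged outcomes, a scalar covariate, and two periods t and s; it is not the dynamic model of this appendix, only an illustration of the conditioning argument. Conditional on exactly one link across the two periods, the probability of the ordering (0, 1) reduces to a logistic function of the covariate difference, whatever the value of the fixed effect.

```python
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def conditional_prob_b1(x_t, x_s, beta, a):
    """Pr(C_t = 0, C_s = 1 | C_t + C_s = 1) in a static logit with fixed effect a."""
    p_t = sigmoid(x_t * beta + a)
    p_s = sigmoid(x_s * beta + a)
    b1 = (1.0 - p_t) * p_s   # no link at t, link at s
    b2 = p_t * (1.0 - p_s)   # link at t, no link at s
    return b1 / (b1 + b2)


def closed_form(x_t, x_s, beta):
    """Closed form 1 / (1 + exp((x_t - x_s) * beta)), free of the fixed effect."""
    return 1.0 / (1.0 + np.exp((x_t - x_s) * beta))


if __name__ == "__main__":
    for a in (-3.0, 0.0, 2.5):  # hypothetical fixed-effect values
        print(a, conditional_prob_b1(0.4, -0.2, 1.5, a), closed_form(0.4, -0.2, 1.5))
```

The printed conditional probability is the same for every fixed-effect value, which is what makes a conditional likelihood in the differenced covariates feasible.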

References

1. Chao, A.; Gotelli, N.J.; Hsieh, T.; Sander, E.L.; Ma, K.; Colwell, R.K.; Ellison, A.M. Rarefaction and extrapolation with Hill numbers: A framework for sampling and estimation in species diversity studies. Ecol. Monogr. 2014, 84, 45–67.
2. Jackson, M.O.; Rogers, B.; Zenou, Y. DP10406 The Economic Consequences of Social Network Structure. 2015. Available online: https://cepr.org/publications/dp10406 (accessed on 8 February 2015).
3. Magurran, A.E. Measuring biological diversity. Curr. Biol. 2021, 31, R1174–R1177.
4. McPherson, M.; Smith-Lovin, L.; Cook, J.M. Birds of a feather: Homophily in social networks. Annu. Rev. Sociol. 2001, 27, 415–444.
5. Dzemski, A. An empirical model of dyadic link formation in a network with unobserved heterogeneity. Rev. Econ. Stat. 2019, 101, 763–776.
6. Fernández-Val, I.; Weidner, M. Individual and time effects in nonlinear panel models with large N, T. J. Econom. 2016, 192, 291–312.
7. Graham, B.S. An econometric model of network formation with degree heterogeneity. Econometrica 2017, 85, 1033–1063.
8. Engle, R.F.; Bollerslev, T. Modelling the persistence of conditional variances. Econom. Rev. 1986, 5, 1–50.
9. Li, Y.; Wu, C.; Luo, P.; Zhang, W. Exploring the characteristics of innovation adoption in social networks: Structure, homophily, and strategy. Entropy 2013, 15, 2662–2678.
10. Mueller-Frank, M. A general framework for rational learning in social networks. Theor. Econ. 2013, 8, 1–40.
11. Sole, R.V.; Montoya, M. Complexity and fragility in ecological networks. Proc. R. Soc. Lond. Ser. B Biol. Sci. 2001, 268, 2039–2045.
12. Yan, T.; Leng, C.; Zhu, J. Asymptotics in directed exponential random graph models with an increasing bi-degree sequence. Ann. Stat. 2016, 44, 31–57.
13. Hubbell, S.P. The unified neutral theory of biodiversity and biogeography (MPB-32). In The Unified Neutral Theory of Biodiversity and Biogeography (MPB-32); Princeton University Press: Princeton, NJ, USA, 2011.
14. Pellissier, L.; Albouy, C.; Bascompte, J.; Farwig, N.; Graham, C.; Loreau, M.; Maglianesi, M.A.; Melián, C.J.; Pitteloud, C.; Roslin, T.; et al. Comparing species interaction networks along environmental gradients. Biol. Rev. 2018, 93, 785–800.
15. Neyman, J.; Scott, E.L. Consistent estimates based on partially consistent observations. Econom. J. Econom. Soc. 1948, 16, 1–32.
16. Kuersteiner, G.M.; Prucha, I.R. Dynamic spatial panel models: Networks, common shocks, and sequential exogeneity. Econometrica 2020, 88, 2109–2146.
17. Viol, A.; Palhano-Fontes, F.; Onias, H.; de Araujo, D.B.; Hövel, P.; Viswanathan, G.M. Characterizing complex networks using entropy-degree diagrams: Unveiling changes in functional brain connectivity induced by Ayahuasca. Entropy 2019, 21, 128.
18. Erdős, P.; Rényi, A. On random graphs I. Publ. Math. 1959, 6, 290–297.
19. Barabási, A.L.; Albert, R. Emergence of scaling in random networks. Science 1999, 286, 509–512.
20. Watts, D.J.; Strogatz, S.H. Collective dynamics of ‘small-world’ networks. Nature 1998, 393, 440–442.
21. De Paula, Á. Econometric models of network formation. Annu. Rev. Econ. 2020, 12, 775–799.
22. Cai, M.; Cui, Y.; Stanley, H.E. Analysis and evaluation of the entropy indices of a static network structure. Sci. Rep. 2017, 7, 9340.
23. Guo, C.; Yang, L.; Chen, X.; Chen, D.; Gao, H.; Ma, J. Influential nodes identification in complex networks via information entropy. Entropy 2020, 22, 242.
24. Sheng, S. A structural econometric analysis of network formation games through subnetworks. Econometrica 2020, 88, 1829–1858.
25. Chao, A.; Jost, L. Estimating diversity and entropy profiles via discovery rates of new species. Methods Ecol. Evol. 2015, 6, 873–882.
26. Cushman, S.A. Entropy in landscape ecology: A quantitative textual multivariate review. Entropy 2021, 23, 1425.
27. Ai, X. Node importance ranking of complex networks with entropy variation. Entropy 2017, 19, 303.
28. Lu, G.; Li, B.; Wang, L. Some new properties for degree-based graph entropies. Entropy 2015, 17, 8217–8227.
29. Li, Y.; Cai, W.; Li, Y.; Du, X. Key node ranking in complex networks: A novel entropy and mutual information-based approach. Entropy 2019, 22, 52.
30. Nie, T.; Guo, Z.; Zhao, K.; Lu, Z.M. Using mapping entropy to identify node centrality in complex networks. Phys. A Stat. Mech. Its Appl. 2016, 453, 290–297.
31. Qiao, T.; Shan, W.; Zhou, C. How to identify the most powerful node in complex networks? A novel entropy centrality approach. Entropy 2017, 19, 614.
32. Delvenne, J.C.; Libert, A.S. Centrality measures and thermodynamic formalism for complex networks. Phys. Rev. E 2011, 83, 046117.
33. Tutzauer, F. Entropy as a measure of centrality in networks characterized by path-transfer flow. Soc. Netw. 2007, 29, 249–265.
34. Estrada, E.; de la Peña, J.A.; Hatano, N. Walk entropies in graphs. Linear Algebra Its Appl. 2014, 443, 235–244.
35. Dehmer, M.; Sivakumar, L. Recent developments in quantitative graph theory: Information inequalities for networks. PLoS ONE 2012, 7, e31395.
36. Zarghami, S.A.; Gunawan, I.; Schultmann, F. Entropy of centrality values for topological vulnerability analysis of water distribution networks. Built Environ. Proj. Asset Manag. 2019, 9, 412–425.
37. Omar, Y.M.; Plapper, P. A survey of information entropy metrics for complex networks. Entropy 2020, 22, 1417.
38. Wang, Q.; Zeng, G.; Tu, X. Information technology project portfolio implementation process optimization based on complex network theory and entropy. Entropy 2017, 19, 287.
39. Faust, K.; Lahti, L.; Gonze, D.; De Vos, W.M.; Raes, J. Metagenomics meets time series analysis: Unraveling microbial community dynamics. Curr. Opin. Microbiol. 2015, 25, 56–66.
40. Hu, S.K.; Connell, P.E.; Mesrop, L.Y.; Caron, D.A. A hard day’s night: Diel shifts in microbial eukaryotic activity in the north pacific subtropical gyre. Front. Mar. Sci. 2018, 5, 351.
41. Snijders, T.A. Statistical models for social networks. Annu. Rev. Sociol. 2011, 37, 131–153.
42. Fafchamps, M.; Lund, S. Risk-sharing networks in rural Philippines. J. Dev. Econ. 2003, 71, 261–287.
43. Silva, J.S.; Tenreyro, S. The log of gravity. Rev. Econ. Stat. 2006, 88, 641–658.
44. Chamberlain, G. Panel data. Handb. Econom. 1984, 2, 1247–1318.
45. Manski, C.F. Identification of endogenous social effects: The reflection problem. Rev. Econ. Stud. 1993, 60, 531–542.
46. Honoré, B.E.; Kyriazidou, E. Panel data discrete choice models with lagged dependent variables. Econometrica 2000, 68, 839–874.
47. Hahn, J.; Kuersteiner, G. Bias reduction for dynamic nonlinear panel models with fixed effects. Econom. Theory 2011, 27, 1152–1191.
48. Amemiya, T. Advanced Econometrics; Harvard University Press: Cambridge, MA, USA, 1985.
49. Helgeson, E.S.; Liu, Q.; Chen, G.; Kosorok, M.R.; Bair, E. Biclustering via sparse clustering. Biometrics 2020, 76, 348–358.
50. Han, W.; Feng, Y.; Qian, X.; Yang, Q.; Huang, C. Clusters and the entropy in opinion dynamics on complex networks. Phys. A Stat. Mech. Its Appl. 2020, 559, 125033.
51. Cover, T.M. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 1999.
52. Barucca, P.; Caldarelli, G.; Squartini, T. Tackling information asymmetry in networks: A new entropy-based ranking index. J. Stat. Phys. 2018, 173, 1028–1044.
53. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 1962, 33, 1065–1076.
54. Rosenblatt, M. Remarks on some nonparametric estimates of a density function. Ann. Math. Stat. 1956, 27, 832–837.
55. Enea, M. Speedglm: Fitting Linear and Generalized Linear Models to large data sets, R Package Version 0.1. 2012. Available online: https://rdrr.io/cran/speedglm/man/speedglm.html (accessed on 15 August 2023).
56. Chatterjee, S.; Diaconis, P.; Sly, A. Random graphs with a given degree sequence. Ann. Appl. Probab. 2011, 21, 1400–1435.
57. Reny, P.J. On the existence of pure and mixed strategy Nash equilibria in discontinuous games. Econometrica 1999, 67, 1029–1056.
58. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
59. Zinger, L.; Amaral-Zettler, L.A.; Fuhrman, J.A.; Horner-Devine, M.C.; Huse, S.M.; Welch, D.B.M.; Martiny, J.B.; Sogin, M.; Boetius, A.; Ramette, A. Global patterns of bacterial beta-diversity in seafloor and seawater ecosystems. PLoS ONE 2011, 6, e24570.
60. Lozupone, C.A.; Hamady, M.; Kelley, S.T.; Knight, R. Quantitative and qualitative β diversity measures lead to different insights into factors that structure microbial communities. Appl. Environ. Microbiol. 2007, 73, 1576–1585.
61. Coenen, A.R.; Hu, S.K.; Luo, E.; Muratore, D.; Weitz, J.S. A primer for microbiome time-series analysis. Front. Genet. 2020, 11, 310.
62. Shade, A.; Gregory Caporaso, J.; Handelsman, J.; Knight, R.; Fierer, N. A meta-analysis of changes in bacterial and archaeal communities with time. ISME J. 2013, 7, 1493–1506.
63. Vellend, M. Conceptual synthesis in community ecology. Q. Rev. Biol. 2010, 85, 183–206.
64. Hausser, J.; Strimmer, K. Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J. Mach. Learn. Res. 2009, 10, 1469–1484.

Figure 1. Entropy function

H (C)

for values of N nodes. Here,

β^{0}

is a scalar equal to

0.5

and

α^{0}

is a random vector of length 10 with a norm of less than 1.

Figure 2. Identifying interaction in communities in sampling networks.

Table 1. Network statistics.

Network	Network Size	Mean Degree	Standard Deviation (Degree)	Cluster Coefficient	Standard Deviation (Cluster Coefficient)
Simulate	100	0.50	3.01	0.91	0.15
Erdös–Rényi	100	0.10	1.40	0.11	0.01
Barabási–Albert	100	0.28	0.25	1.01	0.24
Simulate	150	0.33	2.19	0.97	0.18
Erdös–Rényi	150	0.05	1.78	0.11	0.02
Barabási–Albert	150	0.10	0.15	1.02	0.25
Simulate	200	0.27	1.73	0.92	0.19
Erdös–Rényi	200	0.06	0.91	0.08	0.01
Barabási–Albert	200	0.12	0.08	0.56	0.14
Simulate	250	0.22	1.44	0.94	0.16
Erdös–Rényi	250	0.04	0.44	0.08	0.05
Barabási–Albert	250	0.20	0.06	0.01	0.16
Simulate	500	0.20	0.82	0.98	0.14
Erdös–Rényi	500	0.09	0.44	0.09	0.008
Barabási–Albert	500	0.13	0.01	0.01	0.13

Table 2. Network entropy values.

Network	Size 100	Size 200	Size 250	Size 500
$H (C)$	4.27	4.36	4.05	3.33
Shannon Entropy	10.11	11.50	11.94	14.34
Erdös–Rényi Entropy	401.3	523.1	559.4	689.9

Table 3. Taxonomic identities.

Taxonomic Group		Taxonomic Detail of RNA OTUs
Alveolates	Ciliates	Phyllopharyngea, Spirotrichea, Litostomatea, Prostomatea, Oligohymenophorea, and Colpodea
	Dinoflagellates	Symbiodinium, Gyrodinium, Protoperidinium, Prorocentrum, Dinophysis, Gymnodinium, Heterocapsa, Apicoporus, Suessiales, Azadinium, Blastodinium, Chytriodinium, Peridinium, Amphisolenia, Phalacroma, Amphidinium, and unclassified Dinophyceae
	Syndiniales	Dino-Group-I, Dino-Group-II, Dino-Group-III, and Dino-Group-V
Archaeplastids	Chlorophytes	Chlorodendrophyceae, Pyramimonadales, and Prasino-Clade-VII
	Other	Heliconia
Rhizaria	Acantharia	Hexaconus, Chaunacanthida, Acantharea, Amphilonche, Staurolithium, Acanthocolla, and Heteracon
	Cercozoa	Protaspa
Opisthokont	Fungi	Other-unclassified
	Metazoa	Arthropoda, Mollusca, Annelida, and Urochordata

Table 4. Metric values of the sample of the network.

Network Time	Entropy	Transitivity	Mean Degree	Sensitivity	Specificity	Precision	F1-Score
6 a.m.	$1.19 \times 10^{8}$	0.172	49.4	0.905	0.098	0.493	0.637
10 a.m.	$1.23 \times 10^{8}$	0.171	50.1	0.895	0.101	0.503	0.644
2 p.m.	$1.22 \times 10^{8}$	0.176	40.5	0.896	0.096	0.501	0.643
6 p.m.	$1.21 \times 10^{8}$	0.173	25.5	0.900	0.100	0.490	0.635
10 p.m.	$1.18 \times 10^{8}$	0.167	24.6	0.891	0.103	0.494	0.636
2 a.m.	$1.24 \times 10^{8}$	0.173	16.8	0.906	0.102	0.496	0.641
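
As a consistency check on Table 4, the F1 column can be recomputed from the precision and sensitivity columns with the standard definition F1 = 2PR/(P + R), where R is the sensitivity (recall). The sketch below assumes that this standard definition is the one used in the table; the names `TABLE4` and `f1_score` are introduced here for illustration only.

```python
# Rows of Table 4: (network time, sensitivity, precision, reported F1).
TABLE4 = [
    ("6 a.m.", 0.905, 0.493, 0.637),
    ("10 a.m.", 0.895, 0.503, 0.644),
    ("2 p.m.", 0.896, 0.501, 0.643),
    ("6 p.m.", 0.900, 0.490, 0.635),
    ("10 p.m.", 0.891, 0.494, 0.636),
    ("2 a.m.", 0.906, 0.496, 0.641),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2.0 * precision * recall / (precision + recall)


# Absolute gap between the recomputed and the reported F1 for each row.
gaps = {}
for time, sensitivity, precision, reported in TABLE4:
    recomputed = f1_score(precision, sensitivity)
    gaps[time] = abs(recomputed - reported)
    print(f"{time:>8}  recomputed={recomputed:.3f}  reported={reported:.3f}")
```

Under this assumption, every recomputed value agrees with the reported one to within 0.002, which is compatible with rounding of the tabulated inputs.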

Table 5. Effect of the non-observable on microbial communities.

Network Time	Statistic	$A_{i} = \frac{N - i}{N - 1} \log (\log (N))$	$A_{i} = \frac{N - i}{N - 1} \log N^{1 / 2}$	$A_{i} = \frac{N - i}{N - 1} \log N$
6 a.m.	Communities	21	15	15
	Mean degree	2.86	2.67	2.67
	Density	0.14	0.19	0.19
10 a.m.	Communities	28	15	21
	Mean degree	3	2.67	2.86
	Density	0.11	0.19	0.14
2 p.m.	Communities	21	21	21
	Mean degree	2.86	2.86	2.86
	Density	0.14	0.14	0.14
6 p.m.	Communities	15	21	15
	Mean degree	2.67	2.86	2.67
	Density	0.19	0.14	0.19
10 p.m.	Communities	15	15	15
	Mean degree	2.67	2.67	2.67
	Density	0.19	0.19	0.19
2 a.m.	Communities	28	21	15
	Mean degree	3	2.86	2.67
	Density	0.11	0.14	0.19
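
The entries of Table 5 take only three distinct combinations of community count, mean degree, and density. For a simple undirected graph with n nodes and E edges, the density is 2E / (n(n - 1)), which equals the mean degree divided by n - 1. The sketch below checks the tabulated values against this identity, under the assumption (not stated in the table) that the community count plays the role of n; `TABLE5_TRIPLES` and `density_from_mean_degree` are hypothetical names.

```python
# Distinct (communities, mean degree, reported density) triples in Table 5.
TABLE5_TRIPLES = [
    (21, 2.86, 0.14),
    (15, 2.67, 0.19),
    (28, 3.00, 0.11),
]


def density_from_mean_degree(n_nodes, mean_degree):
    """Density of a simple undirected graph: mean degree / (n - 1)."""
    return mean_degree / (n_nodes - 1)


for n, k, reported in TABLE5_TRIPLES:
    implied = density_from_mean_degree(n, k)
    print(f"n={n:2d}  mean degree={k:.2f}  implied={implied:.3f}  reported={reported:.2f}")
```

With this reading, each implied density matches the reported value after rounding to two decimals.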

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
