Testing goodness-of-fit of random graph models

Random graphs are matrices with independent 0-1 elements whose probabilities are determined by a small number of parameters. One of the oldest models is the Rasch model, where the odds are ratios of positive numbers scaling the rows and columns. Later Persi Diaconis and his coworkers rediscovered the model for symmetric matrices and called it the beta model. Here we give goodness-of-fit tests for the model and extend it to a version of the block model introduced by Holland, Laskey, and Leinhardt.


Introduction
Let n be a positive integer, 1 ≤ i, j ≤ n, and let ε(i, j) be independent random variables such that ε(i, j) = ε(j, i) and ε(i, i) = 0; furthermore P(ε(i, j) = 1) = p_{i,j} = p + p_i + p_j, 1 ≤ i < j ≤ n, where the sum of the p_i is zero. The least squares estimate p̂ of p is the average of the ε(i, j), and the least squares estimate of p_i is the average of the differences ε(i, j) − p̂. The modification of the model for non-symmetric matrices is straightforward, and in that case the statistical inference is practically a two-way analysis of variance. Perhaps this is the simplest random graph model, but it shares the inconvenient property of many other random graph models that it is hard to ensure that the edge probabilities remain in the interval (0, 1). If we use the odds r_{i,j} instead of the probabilities, then it is enough to ensure the positivity of the r_{i,j}. This is the case in the model introduced by George Rasch [31]. Historically the odds were defined as the ratios of scaling factors for rows and columns, but we prefer the multiplicative form for both the non-symmetric and the symmetric case. Statistical investigation of the model started with Andersen [1] (see also [21,30,33]), and later Persi Diaconis and his coworkers rediscovered the model and introduced the name beta-model for its parameter. The model has many attractive properties (see [2,4,5,6,8,28]):
- degree sequences are sufficient statistics;
- the model covers practically all possible expected degree sequences;
- the conditional distribution of the graphs, given a prescribed degree sequence, is uniform on the set of all graphs with that degree sequence.
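The sufficiency of the degree sequence can be checked numerically. The following is a minimal sketch, assuming the symmetric multiplicative form in which the odds of edge (i, j) are the product beta_i * beta_j; the names `beta`, `sample_beta_graph` and `log_likelihood` are illustrative and do not come from the text.

```python
import math
import random

def beta_edge_probability(beta_i, beta_j):
    """Edge probability when the odds are the product beta_i * beta_j (assumed form)."""
    odds = beta_i * beta_j
    return odds / (1.0 + odds)

def sample_beta_graph(beta, rng):
    """Draw a symmetric 0-1 adjacency matrix with zero diagonal and independent edges."""
    n = len(beta)
    eps = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < beta_edge_probability(beta[i], beta[j]):
                eps[i][j] = eps[j][i] = 1
    return eps

def degree_sequence(eps):
    """Row sums of the adjacency matrix."""
    return [sum(row) for row in eps]

def log_likelihood(eps, beta):
    """Log-likelihood of a graph; it depends on the graph only through its degrees."""
    n = len(beta)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            p = beta_edge_probability(beta[i], beta[j])
            total += math.log(p) if eps[i][j] else math.log(1.0 - p)
    return total

rng = random.Random(0)
beta = [0.5, 1.0, 2.0, 4.0]          # illustrative positive parameters
eps = sample_beta_graph(beta, rng)
degrees = degree_sequence(eps)
```

Two graphs with the same degree sequence receive the same likelihood under any parameter vector, which is why the conditional distribution given the degrees is uniform.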
Statistical inference emerged from the Gaussian distribution and was later extended to random variables in Euclidean spaces, but statistical inference on discrete structures is rather sparse ([7,15,16,19,26]). The mathematical investigation of graphs has its own history. Nowadays, instead of graphs, we speak of networks ([27]), where the most investigated model is the stochastic block model introduced by Holland, Laskey, and Leinhardt ([18]). Here the vertices are labeled by small numbers or colors, and edge probabilities depend only on the labels ([3,17]). With an eye on preferential attachment, where degree sequences follow a scale-free power law, the block model was criticized for its limited flexibility in degree sequences. Chung, Lu, and Vu [14] introduced a model with independent vertices, and Chaudhuri, Chung, and Tsiatas ([10]) introduced the planted partition model (see also [25]). Karrer and Newman [20] proposed another extension of the block model. A natural extension of these models is the unification of the beta and block models: where b(., .) is a positive matrix with n rows and k columns, and c(i) is the label of the i-th vertex, i.e. an integer between 1 and k. We call this the k-beta model. The estimation of the labels in block models is possible by the spectral method ([32]). It is generally believed that the eigenvectors and eigenvalues of the matrix ε(i, j) tell everything about the structure of the graph ([10,12,13,22,23,24]), while there are many attempts to provide more flexible models ([9,29]).
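To make the roles of b(., .) and c(.) concrete, here is a minimal sampling sketch. It rests on an assumption that is not stated in the surviving text: that the odds of edge (i, j) are b(i, c(j)) * b(j, c(i)), which is symmetric in i and j and reduces to the beta model when k = 1. Labels are 0-based here for convenience, and all function names are hypothetical.

```python
import random

def k_beta_probability(b, c, i, j):
    """Edge probability under the ASSUMED odds b[i][c[j]] * b[j][c[i]]."""
    odds = b[i][c[j]] * b[j][c[i]]
    return odds / (1.0 + odds)

def sample_k_beta_graph(b, c, rng):
    """Draw a symmetric 0-1 adjacency matrix with zero diagonal."""
    n = len(c)
    eps = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < k_beta_probability(b, c, i, j):
                eps[i][j] = eps[j][i] = 1
    return eps

# Illustrative parameters: n = 3 vertices, k = 2 labels.
b = [[1.0, 2.0],
     [0.5, 3.0],
     [2.0, 1.0]]          # positive matrix with n rows and k columns
c = [0, 1, 1]             # label of each vertex, 0-based
eps = sample_k_beta_graph(b, c, random.Random(0))
```

Under this assumed form, a block model is recovered when each row b(i, .) depends on i only through the label c(i).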

Goodness-of-fit
We cannot test edge independence on a single graph. While an i.i.d. sample is common in statistical inference, in the case of graphs the sample generally means a single copy of a graph. Perhaps the number one question in statistical inference is the following. Let p_1, . . . , p_n be an arbitrary given sequence of probabilities, and let ε_1, . . . , ε_n be independent 0-1 variables such that P(ε_i = 1) = p_i. Can we test the model? A randomized answer is the following. Let u_1, . . . , u_n be independent and uniformly distributed in (0, 1). Then suitable randomized transforms of the ε_i by the u_i are independent and uniformly distributed in (0, 1), which we can test. Another, more practical solution is to order the pairs (p_i, ε_i) according to the p_i in increasing order and compare their partial sums. Or we can clump them into a small number of blocks and again compare the sums. All these possibilities hold for graphs with estimated edge probabilities. Let us partition the edges of the complete graph according to the blocks formed with respect to the edge probabilities. In each portion the edge probabilities are close to each other, whence the ε_{i,j} corresponding to that portion behave like a pure random graph, which we can again test, e.g. by their sums on subsets of vertices. Blitzstein and Diaconis ([6,11]) propose the following general procedure for testing the beta model. Let us choose any graph statistic and compute it on our graph. Let us generate, according to the uniform distribution, as many graphs as we can with the same degree sequence as the investigated graph, and let us calculate the chosen statistic on each. If the value for the sample graph lies inside the range of the generated values, we accept the beta model; otherwise we reject it. One can ask: does the choice of the statistic affect the power of the test?
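The randomized answer and the partial-sum comparison can be sketched as follows. The specific transform used here (v_i = u_i p_i when ε_i = 1, and v_i = p_i + u_i (1 − p_i) when ε_i = 0) is one standard choice that yields uniform variables under the model; it is an assumption and not necessarily the formula intended in the text. The function names are illustrative.

```python
import random

def randomized_uniforms(p, eps, rng):
    """Map each pair (p_i, eps_i) to a value that is Uniform(0, 1) under the model.

    ASSUMED transform: uniform on (0, p_i) if eps_i == 1, uniform on (p_i, 1) otherwise.
    """
    out = []
    for p_i, e_i in zip(p, eps):
        u = rng.random()
        out.append(u * p_i if e_i == 1 else p_i + u * (1.0 - p_i))
    return out

def ks_distance_from_uniform(values):
    """Kolmogorov-Smirnov distance between the sample and Uniform(0, 1)."""
    xs = sorted(values)
    n = len(xs)
    return max(max((k + 1) / n - x, x - k / n) for k, x in enumerate(xs))

def partial_sum_gaps(p, eps):
    """Order pairs by p_i and return the running differences sum(eps) - sum(p)."""
    gaps, sum_p, sum_eps = [], 0.0, 0
    for p_i, e_i in sorted(zip(p, eps)):
        sum_p += p_i
        sum_eps += e_i
        gaps.append(sum_eps - sum_p)
    return gaps

rng = random.Random(1)
n = 2000
p = [rng.random() for _ in range(n)]                  # illustrative probabilities
eps = [1 if rng.random() < p_i else 0 for p_i in p]   # data drawn from the model
d_model = ks_distance_from_uniform(randomized_uniforms(p, eps, rng))
d_wrong = ks_distance_from_uniform(randomized_uniforms(p, [1] * n, rng))
gaps = partial_sum_gaps(p, eps)
```

When the data follow the model, `d_model` is of order 1/√n, while data that ignore the p_i (here, all ones) give a distance that does not shrink with n.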
We have found by computer simulations that graphs generated by the beta model have only one eigenvalue proportional to n; all the others are of order √n. We think that this is a characteristic property of beta graphs. One wonders: if the beta model covers all possible degree sequences and the conditional distribution is uniform over graphs sharing the same degree sequence, then how is it possible that a graph behaves differently from typical graphs generated by the beta model? Of course there are graphs having many large eigenvalues. But where do they come from, once the beta model can generate all graphs? A possible resolution of this puzzle is the following.
Let us form a meta graph from the graphs sharing the same degree sequence. Let us say that neighborhood in this meta graph is given by one single swap. If we have four vertices A, B, C, D in a graph such that AC and BD are edges but AD and BC are not, then deleting AC and BD and inserting AD and BC yields a new graph with the same degree sequence. The degree of a graph in this meta graph moves in parallel with the second largest eigenvalue: typical beta model graphs have minimal degree, and any increase in their degree results in a more complicated eigenvalue structure. Perhaps the degree in the meta graph is the most characteristic statistic for the beta model.
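The swap and the degree of a graph in the meta graph can be illustrated with a brute-force sketch. It enumerates every four-vertex set, so it is only practical for small graphs; the function names are hypothetical, and the counting convention (one neighbor per pair of a fully present matching and a fully absent matching on the same four vertices) is an assumption about how the meta-graph degree is meant.

```python
from itertools import combinations

def degree_sequence(adj):
    """Row sums of the adjacency matrix."""
    return [sum(row) for row in adj]

def swap_moves(adj):
    """List all single swaps: two present edges replaced by two absent ones
    on the same four vertices, which keeps every degree unchanged."""
    moves = []
    for w, x, y, z in combinations(range(len(adj)), 4):
        # The three perfect matchings of the four vertices.
        matchings = [((w, x), (y, z)), ((w, y), (x, z)), ((w, z), (x, y))]
        for present in matchings:
            if not all(adj[u][v] for u, v in present):
                continue
            for absent in matchings:
                if not any(adj[u][v] for u, v in absent):
                    moves.append((present, absent))
    return moves

def apply_swap(adj, move):
    """Return a new adjacency matrix with the swap applied."""
    present, absent = move
    new = [row[:] for row in adj]
    for u, v in present:
        new[u][v] = new[v][u] = 0
    for u, v in absent:
        new[u][v] = new[v][u] = 1
    return new

def meta_degree(adj):
    """Number of distinct graphs reachable by one swap."""
    return len(swap_moves(adj))

# Illustrative example: the 4-cycle 0-1-2-3-0.
cycle = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0]]
neighbors = [apply_swap(cycle, m) for m in swap_moves(cycle)]
```

Each move removes one matching and inserts another, so distinct moves lead to distinct neighbor graphs.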

The k-beta model
The maximum likelihood equations for the parameters b(., .) in (5) say that the expected values of the degrees inside every subgraph with a given pair of labels should be the same as in the given graph. This is the case when the labels are known. With unknown labels we can form a two-level optimization: for each label set, first determine the parameters b(., .), then change a small number of labels and repeat the calculation of the parameters. But the procedure is slow even for graphs of moderate size. Spectral methods available for block models fail for coloring k-beta models because the model loses the well-pronounced checkerboard character of block models. It is ANOVA that offers an applicable algorithm. For any set C of labels c(.) let us calculate the statistic where u(s, t) = and Q(C) is the sum of the two-way ANOVA sums of squares calculated independently for the subgraphs defined by pairs of labels. Starting from a uniform random set C of labels on the vertices and perturbing a small number of labels in each step, a simple greedy optimization results in a good set of labels, which is close to the original (true) labels.
For evaluating the character of a random graph we use a number that we call delogarithmed average entropy, or DAE. This is a number between 1 and 2. If it is close to 1, the graph is almost deterministic: the probabilities are close to 0 or 1. In checkerboard block models this means that empty and full subgraphs are amalgamated together. If DAE is close to 2, then the graph has no structure at all. DAE depends on edge density, too. The above tendency is valid for edge density 1/2; for other edge densities the cut point is closer to 1. According to our experience, if DAE is smaller than 1.9 while the edge density is 1/2, then we are able to reconstruct the original labels. For these graphs the number of non-trivial eigenvalues is 2k − 1, thus the spectrum determines the number of different labels.
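A minimal sketch of a DAE computation follows. The defining formula is not available in the text, so the code assumes the reading suggested by the name and by the stated range: the exponential of the average binary entropy (natural logarithm) of the edge probabilities, which equals 1 for deterministic edges and 2 when every probability is 1/2. The function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy of a 0-1 variable with success probability p, in nats."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def dae(probabilities):
    """ASSUMED definition: exp of the mean binary entropy of the edge probabilities."""
    mean_entropy = sum(binary_entropy(p) for p in probabilities) / len(probabilities)
    return math.exp(mean_entropy)

# Illustrative edge probabilities for the pairs i < j of a small graph.
edge_probabilities = [0.1, 0.9, 0.5, 0.3, 0.7, 0.5]
value = dae(edge_probabilities)
```

With this reading, a mean edge probability away from 1/2 caps the attainable value below 2, consistent with the remark that DAE depends on edge density.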
The k-beta model has a sister model, which we call the small odds rank model. Strictly speaking we ought to redefine the diagonal of the odds matrix, but perhaps the name is permissible without doing so. The maximum likelihood estimation of the parameters in small odds rank models is straightforward, and the block structure is detectable in the estimated parameters. Actually the block model is in the intersection of the k-beta and small odds rank models, thus if there is any block structure in the graph, it is detectable even when fitting the k-beta model to the graph. But if there is no block structure and we try to use ANOVA coloring for a small odds rank graph, then the algorithm is no longer stable: it results in a different local minimum in each run.