High Dimensional Hyperbolic Geometry of Complex Networks

High dimensional embeddings of graph data into hyperbolic space have recently been shown to have great value in encoding hierarchical structures, especially in the areas of natural language processing, named entity recognition, and machine generation of ontologies. Given the striking success of these approaches, we extend the famous hyperbolic geometric random graph models of Krioukov et al. to arbitrary dimension, providing a detailed analysis of the degree distribution behavior of the model in an expanded portion of the parameter space, considering several regimes which have yet to be considered. Our analysis includes a study of the asymptotic correlations of degree in the network, revealing a non-trivial dependence on the dimension and power law exponent. These results pave the way to using hyperbolic geometric random graph models in high dimensional contexts, which may provide a new window into the internal states of network nodes, manifested only by their external interconnectivity.


Introduction
Many complex networks, which arise from extremely diverse areas of study, surprisingly share a number of common properties. They are sparse, in the sense that the number of edges scales only linearly with the number of nodes; they have small radius; connected nodes tend to have many of their neighbors in common (beyond what would be expected from, for example, random graphs drawn from a distribution with a fixed degree sequence); and their degree sequence has a power law tail, generally with an exponent less than 3 [1][2][3][4]. One of the major efforts of network science is to devise models which are able to simultaneously capture these common properties.
Many models of complex networks have been proposed which are able to capture some of the common properties above. Perhaps the most well known is the Barabási-Albert model of preferential attachment [5]. This model is described in terms of a growing network, in which nodes of high degree are more likely to attract connections from newly born nodes. It is effective in capturing the power law degree sequence, but fails to generate graphs in which connected nodes tend to have a large fraction of their neighbors in common (often called 'strong clustering'). There are growing indications that strong clustering, the tendency for affiliation of neighbors of connected nodes, is a manifestation of the network having a geometric structure [6], based upon the notion of a geometric random graph. However, geometric random graphs based upon a flat, Euclidean geometry do not give rise to power law degree distributions. Curved space, on the other hand, in particular the negatively curved space of hyperbolic geometry, does give rise to such a heterogeneous distribution of degrees, as shown by Krioukov et al. [7]. Random geometric graphs in hyperbolic geometry therefore provide extremely promising models for the structure of complex networks. Besides curvature, an arguably even more fundamental aspect of a geometric space is its dimension. After all, many more possibilities open up when one attempts to navigate a space of higher dimension. Can random geometric graph models based on higher dimensional geometry account for structures in graphs which have so far eluded description within lower dimensional models?
Recent developments in machine learning for natural language processing suggest that this is likely the case, as higher dimensional hyperbolic geometries prove more effective at link prediction in author collaboration networks than do embeddings into lower dimensional hyperbolic space [8]. High dimensional hyperbolic embeddings are also proving fruitful in representing taxonomic information within machine generated knowledge bases [9,10]. It may be worth noting that, in these machine learning contexts, the hyperbolic geometry generally plays a slightly different role. There one often seeks an isometric embedding of some structured data, such as word proximity graphs in natural language texts, into hyperbolic space of some dimension. In our approach based upon random geometric graphs, we are taking the geometric description more fundamentally, as we seek an embedding which covers the entire space uniformly, rather than covering merely an isometric subset of the space. The idea, ultimately, is to associate data with a dynamical description of geometry, for example, as is done in gravitational physics [11].
We consider statistical mechanical models of complex networks as introduced in [12], and, in particular, their application to random geometric graphs in hyperbolic space [7]. Fountoulakis has computed the asymptotic distribution of the degree of an arbitrary vertex in the hyperbolic model, and shown that it gives rise to a degree sequence which follows a power law [13]. More generally, it is known that the hyperbolic models are effective in capturing all of the above properties, including the strong clustering behavior which typically evades non-geometric models. Fountoulakis also proves that the degrees of any finite collection of vertices are asymptotically independent, so that in this sense the correlations that one might expect from spatial proximity vanish in the N → ∞ and R → ∞ limit. Furthermore, it is possible to assign an effective meaning to the dimensions of the hyperbolic embedding space: the radial dimension can be regarded as a measure of popularity of a network node, while the angular dimension acts as a space in which the similarity of nodes can be represented, such that nearby nodes are similar to each other, while distant nodes are not [14].
As far as we are aware, to date, all attention on this model has been focused on two-dimensional hyperbolic space. Generalization to higher dimension does appear in an as yet unpublished manuscript [15], which extends the hyperbolic geometric models to arbitrary dimension; however, explicit proofs regarding the degree distribution are not provided there. High dimensional hyperbolic embeddings have also been explored from the machine learning perspective, but not as an explicit model of random geometric graphs as considered here [8]. It is reasonable to expect that, in order to capture the behavior of real world complex networks, it will be necessary to allow for similarity spaces of larger than one dimension. We therefore construct a model of random geometric graphs in a ball of H^{d+1}, in direct analogy to the d = 1 construction of [7]. In this model, the radial coordinate retains its effective meaning as a measure of the popularity of a node, while the similarity space becomes a full d-sphere, allowing for a much richer characterization of the interests of a person in a social network, or the role of an entity in some complex organizational structure. We perform a careful asymptotic analysis, yielding precise results which help clarify various aspects of the model.
In particular, we compute the asymptotic degree distribution in five regions of the parameter space, for arbitrary dimension, only one region of which (τ < 1 and 2σ > 1) was considered in Fountoulakis' treatment of d = 1 [13]. For d = 1, the angular probability distribution is simply the constant 1/(2π), while for d > 1 it is a dimension-dependent power of the sine of the angular separation between two nodes, so we cannot expect to straightforwardly generalize the prior d = 1 results to higher dimension. In fact, the angular integrals are tedious and not easy to compute. We use a series expansion to decompose the integrand, obtaining a fine asymptotic estimate of the angular integral, which is the key step in performing the high dimensional analysis.
For d = 1 Fountoulakis also computes the correlation of node degrees in the model, and shows that it goes to zero in the asymptotic limit [13]. We generalize this result to arbitrary dimension, finding that the asymptotic independence of degree requires a steeper fall off for the degree distribution (governed by the parameter σ) at larger dimension d. This dimensional dependence is reasonable, in the sense that, in higher dimensions, there are more directions in which nodes can interact. When computing clustering for the high dimensional model, the non-trivial angular dependence will pose an even greater challenge, since for three nodes there is much less angular symmetry. Hence our present analysis paves the way for future research in high dimensional hyperbolic space.
Our model employs a random mapping of graph nodes to points in a hyperbolic ball, giving these nodes hyperbolic coordinates, and connecting pairs of nodes if they are nearby in the hyperbolic geometry. Given this close relationship with hyperbolic geometry, one might ask whether the resulting graphs are intrinsically hyperbolic, in the sense of Gromov's notion of δ-hyperbolicity. This is an important question, in that it would connect to a number of important results on hyperbolicity of graphs [16], and may cast an important light on the relevance of these hyperbolic geometric models to real world networks [17]. We do not expect our model to generate δ-hyperbolic networks for the entire range of its parameter space because, for example, one can effectively send the curvature to zero by choosing R ≪ 1 (with ζ ≡ 1). However, we would expect it to be the case when the effect of curvature is large, such as when R grows rapidly with respect to N. We do not know for certain whether the particular ranges of parameters that we explore in this paper generate δ-hyperbolic graphs for δ > 0, almost surely, but consider it likely to be the case.

Model Introduction
We consider a class of exponential random graph models, which are Gibbs (or Boltzmann) distributions on random labeled graphs [12,18]. These models define a probability distribution on graphs in terms of a Hamiltonian 'energy functional', for which the probability of a graph G is Pr(G) = e^{-H(G)/T} / Z(T). Here T is a 'temperature', which controls the relevance of the Hamiltonian H(G), and Z(T) = Σ_G e^{-H(G)/T} is a normalizing factor, often called the 'partition function'. Note that the probability distribution becomes uniform in the limit T → ∞. The Hamiltonian, which encodes the 'energy' of the graph, consists of a sum of 'observables', each multiplied by a corresponding Lagrange multiplier, which controls the relevance of that observable's value in the probability of a given graph. We begin with the most general model for which the probability of each edge is independent, wherein each probability is governed by its own Lagrange multiplier ω_{u,v}. Thus, our Hamiltonian is H(G) = Σ_{u<v} ω_{u,v} a_{u,v}, where a_{u,v} = 1 if u and v are connected by an edge in G, and 0 otherwise. The partition function is Z(T) = Π_{u<v} (1 + e^{-ω_{u,v}/T}), and therefore the probability of occurrence of an edge between vertices u and v is p_{u,v} = 1/(1 + e^{ω_{u,v}/T}). Following [7], we embed each of N nodes into a hyperbolic ball of finite radius R in H^{d+1}, and set the 'link energies' ω_{u,v} to be based on the distances d_{u,v} between the embedded locations of the vertices u and v, though here we allow all integer values of d ≥ 1. In particular, we set ω_{u,v} = d_{u,v} - µ, where µ is a connectivity distance threshold. Note that each pair of vertices for which d_{u,v} < µ contributes negatively to the 'total energy' whenever a link between them is present, so that the 'ground state' (minimum energy, highest probability) graph has links between every pair of vertices u, v for which d_{u,v} < µ, and no links with d_{u,v} > µ.
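The resulting edge probability takes the Fermi-Dirac form p_{u,v} = 1/(1 + e^{(d_{u,v} - µ)/T}). A minimal numerical sketch (the helper name is ours, not from the paper's toolkit):

```python
import math

def connection_probability(dist, mu, T):
    """Fermi-Dirac connection probability p = 1 / (1 + exp((dist - mu)/T)).

    As T -> 0 this approaches a step function: 1 when dist < mu,
    0 when dist > mu, recovering the 'ground state' graph.
    """
    x = (dist - mu) / T
    if x > 700:
        # Guard against float overflow in exp(); the probability is ~0 here.
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

# At the threshold the probability is exactly 1/2, independent of T.
print(connection_probability(10.0, 10.0, 0.5))  # -> 0.5
# Well inside the threshold at low temperature, the link is almost certain.
print(connection_probability(5.0, 10.0, 0.1))   # -> 1.0
```

In the T → ∞ limit every pair connects with probability 1/2, reflecting the uniform distribution over graphs mentioned above.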
Note that in the limit as T → 0, the probability of the ground state graph goes to 1, and that of every other graph vanishes. In a real world network, we may expect the largest node degree to be comparable with N itself, and the lowest degree nodes to have degree of order 1. To maintain this behavior, following [7], we restrict attention to the submanifold of models for which µ ≡ R. (If µ = R, then k̄(0), the expected degree of a node at the origin, is comparable to N - 1.) Additionally, for ease of comparison with numerous results in the literature, and to keep track of 'distance units', we follow [7] in retaining the parameter ζ > 0, which governs the curvature K = -ζ² of the hyperbolic geometry, even though modifying it adds nothing to the space of models which cannot be achieved through a tuning of R. (Changing ζ has the sole effect of scaling all radial coordinates, which can equivalently be achieved by scaling R.) We will use spherical coordinates on H^{d+1}, so that each point x has coordinates x = (r, θ), with the usual angular coordinates θ = (θ_1, …, θ_d), where θ_n ∈ [0, π] for n < d and θ_d ∈ [0, 2π). In these coordinates the distance d_{u,v} between two points u = (r_u, θ_u) and v = (r_v, θ_v) is given by the hyperbolic law of cosines, cosh(ζ d_{u,v}) = cosh(ζ r_u) cosh(ζ r_v) - sinh(ζ r_u) sinh(ζ r_v) cos θ_{u,v}, where θ_{u,v} is the angular distance between θ_u and θ_v on S^d. In these coordinates the uniform probability density function, from the volume element in H^{d+1}, factorizes into a product of functions of the separate radial and spherical coordinates. With the usual spherical coordinates on S^d, the uniform distribution arises from ρ_n(θ_n) = sin^{d-n} θ_n / I_{d,n}, with I_{d,n} = ∫_0^π sin^{d-n} θ dθ for n < d and I_{d,d} = 2π. In place of a uniform distribution of points, we use a spherically symmetric distribution whose radial coordinates are sampled from the probability density ρ(r) = sinh^d(σζ r)/C_d, where σ > 0 is a real parameter which controls the radial density profile, and the normalizing factor C_d = ∫_0^R sinh^d(σζ r) dr.
Note that σ = 1 corresponds to a uniform density with respect to the hyperbolic geometry. Thus, when σ > 1, the nodes are more likely to lie at large radius than they would with a uniform embedding in hyperbolic space, and when σ < 1 they are more likely to lie nearer the origin than in hyperbolic geometry. We will see later that the degree of a node diminishes exponentially with its distance from the origin, and that with larger values of σ the model tends to produce networks with a degree distribution which falls off more rapidly at large degree, while with smaller σ it produces networks whose degree distribution possesses a 'fatter tail' at large degree.
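A radial coordinate with density proportional to sinh^d(σζ r) on [0, R] can be drawn by rejection sampling against a truncated exponential envelope, since sinh(x)^d = (e^x (1 - e^{-2x})/2)^d. A minimal sketch (the function name is ours):

```python
import math, random

def sample_radius(d, sigma, zeta, R, rng=random):
    """Sample r in [0, R] with density proportional to sinh(sigma*zeta*r)**d.

    Envelope: a truncated exponential with rate lam = d*sigma*zeta, sampled
    by inverse CDF; the acceptance probability is (1 - exp(-2*sigma*zeta*r))**d,
    which is at most 1 and close to 1 near the boundary r = R.
    """
    lam = d * sigma * zeta
    while True:
        u = rng.random()
        # Inverse-CDF sample from the density ~ exp(lam * r) on [0, R].
        r = math.log(1.0 + u * (math.exp(lam * R) - 1.0)) / lam
        if rng.random() < (1.0 - math.exp(-2.0 * sigma * zeta * r)) ** d:
            return r
```

Most samples land within a distance of order 1/(σζ) of the boundary r = R, which is the geometric origin of the heterogeneous degrees: a few nodes near the origin acquire very high degree, while the bulk near the boundary have low degree.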
We would like to compute the expected degree sequence of the model, in a limit where N → ∞ and R = R(N) → ∞. The model is label independent, meaning that, before embedding the vertices into H^{d+1}, each vertex is equivalent. To determine the expected degree sequence, it is therefore sufficient to compute the degree distribution of a single vertex, as a function of its embedding coordinates, and then integrate over the hyperbolic ball.
Given that each of the other N - 1 vertices is randomly distributed in H^{d+1} according to the probability density (3), the expected degree of u equals the connection probability p_{u,v}, integrated over H^{d+1}, for each of the other N - 1 vertices. Since the geometry is independent of angular position, without loss of generality we can take the vertex u to be embedded at the 'north pole' θ_1 = 0, so the relative angle θ_{u,v} between u and any other embedded vertex v is just the angle θ_1 of v. (The full hyperbolic geometry is of course completely homogeneous, and thus independent of the radial coordinate as well; however, the presence of the boundary at r = R breaks the radial homogeneity.) Here r_u, r_v are the radial coordinates of the vertices u, v; θ_{u,v} is the angle between them in the θ_1 direction, and ρ_1(θ) = sin^{d-1} θ / I_{d,1}. To simplify the notation below, we use θ in place of θ_{u,v}.
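The expected-degree integral just described can be evaluated numerically at small parameter values, as a sanity check on the asymptotics that follow. A rough midpoint-rule sketch (ours, not the paper's Cactus code; by symmetry the angular integral is folded onto [0, π]):

```python
import math

def expected_degree(r_u, N, d, sigma, zeta, R, T, steps=200):
    """Estimate the expected degree of a node at radius r_u:
    (N-1) * integral over r_v in [0,R], theta in [0,pi] of
    p(d_{u,v}) * rho(r_v) * rho_1(theta), by a simple midpoint rule.
    The model's connection threshold is mu = R.
    """
    mu = R
    # Normalizing constants for the radial and angular densities.
    C = sum(math.sinh(sigma*zeta*((i+0.5)*R/steps))**d for i in range(steps)) * R/steps
    I = sum(math.sin((j+0.5)*math.pi/steps)**(d-1) for j in range(steps)) * math.pi/steps
    total = 0.0
    for i in range(steps):
        r_v = (i + 0.5) * R / steps
        for j in range(steps):
            th = (j + 0.5) * math.pi / steps
            # Hyperbolic law of cosines for the pair distance.
            c = (math.cosh(zeta*r_u)*math.cosh(zeta*r_v)
                 - math.sinh(zeta*r_u)*math.sinh(zeta*r_v)*math.cos(th))
            dist = math.acosh(max(c, 1.0)) / zeta
            p = 1.0 / (1.0 + math.exp(min((dist - mu)/T, 700.0)))
            w = (math.sinh(sigma*zeta*r_v)**d / C) * (math.sin(th)**(d-1) / I)
            total += p * w * (R/steps) * (math.pi/steps)
    return (N - 1) * total
```

A node near the origin connects to almost everything, while a node near the boundary connects only to the small angular window around it, so the returned value falls steeply with r_u.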
In the case of r_u = 0, we have d_{u,v} = r_v, and the angular integral becomes trivial.

Main Results
Let V_N be the vertex set of a random graph G with N vertices, whose elements are randomly distributed, with a radially dependent density, in a ball of radius R in H^{d+1}, d ≥ 1. Let D_u denote the degree of the vertex u. Throughout, o(1) → 0 as N → ∞, unless otherwise indicated.
Theorem 4. Let τ < 1, for dimension d. For any fixed integer m ≥ 2 and any collection of m pairwise distinct vertices v_i, i = 1, …, m, their degrees D_{v_1}, …, D_{v_m} are asymptotically independent in the sense that, for any non-negative integers k_i, i = 1, …, m, Pr[D_{v_1} = k_1, …, D_{v_m} = k_m] = (1 + o(1)) Π_{i=1}^m Pr[D_{v_i} = k_i].

The paper is organized as follows. In Section 1.1 above, we introduce the model: we explain the exponential random graph model in hyperbolic space, explain the roles of the parameters, and provide some basic formulas for the mean degree. In Section 1.2 above, we state our main theorems. Theorems 1 and 2 state that the probability of any node having degree k, in the limit of large N, takes the form of a power law. Theorem 4 states that the degrees of any fixed collection of nodes are asymptotically independent as N → ∞. In Section 2, we provide some useful preliminary results; in particular, we give an explicit approximation of the hyperbolic distance formula for two points under certain conditions, which improves upon previous such expressions. In Section 3, we compute the angular integral, which is the key obstacle to overcome in extending the model to high dimensional hyperbolic space. We use an analytic series expansion to perform a fine asymptotic estimation of the angular integral, which could not be obtained by computer algebra; we place these computations in the Appendix. Our main proofs lie in Sections 4-7. We compute the expected degree of a given vertex as a function of its radial coordinate, the mean degree of the network, and the asymptotic distribution of degree for the cases τ < 1, τ = 1, and τ > 1, respectively, in Sections 4-6. The proof of Theorem 1 lies in Section 4.1, and the proof of Theorem 2 in Section 6.1. In Section 7, we analyze the asymptotic correlations of degree for τ < 1 and finally prove Theorem 4.

Simulations
The Cactus High Performance Computing Framework [19] defines a paradigm for scientific computing in which software is specified in terms of abstract APIs, allowing scientists to construct interoperable modules without having to know the details of the modules' function argument specifications at the outset. One can write toolkits within the framework, which are collections of modules performing computations in some scientific domain. For the simulation results shown in this paper, we have used a ComplexNetworks toolkit within Cactus, which we hope to make available soon under a free, open source license.
The ComplexNetworks toolkit is an extension of the CausalSets toolkit [20], which allows many computations involving complex networks to be run easily on distributed memory supercomputers, though the simulations we perform here are small enough to fit on a single 12 core Xeon workstation. The model is implemented by a module which allows direct control of a number of parameters, including σ, R_H, T and µ of (1), and N.

Preliminaries
We begin with some basic calculations, before attempting to directly evaluate (4).

Radial Density
Let ρ(r) = sinh^d(σζ r)/C_d, as in (3). Below are some basic calculations which will be very useful in the sequel. If we set R = (2/(dζ)) ln(N/ν), for some ν > 0, then ν provides some control over the effective density of nodes with respect to the hyperbolic volume measure, though subject to the asymptotic constraint that R ∝ ln N. Furthermore, the corresponding estimate holds uniformly in r. The following Lemma follows immediately from the above.

Lemma 1.
Let r_v be the radial coordinate of the vertex v ∈ V_N; we have the following bounds. Furthermore, letting r_0 = (1/(2σ))R + ω(N), with 2σ > 1, the corresponding estimate holds a.a.s. (Throughout, a.a.s. stands for asymptotically almost surely, i.e., with a probability that tends to 1 as N → ∞.)

Distance
As mentioned in Section 1.1, the distance d_{u,v} between two vertices u and v is given by the hyperbolic law of cosines, cosh(ζ d_{u,v}) = cosh(ζ r_u) cosh(ζ r_v) - sinh(ζ r_u) sinh(ζ r_v) cos θ_{u,v}. This hyperbolic distance can be approximated according to the following lemma.
we have the stated approximation, uniformly for all u, v satisfying the above condition. Note that we can take h_1(N) = h(N), for example.
Next, we prove (13). We bound the right-hand side of (12) by expanding the hyperbolic cosines, and make use of Young's inequality; thus, from (12), we obtain the first estimate. Furthermore, from (12) and (14), together with (15), we obtain the remaining bounds. Notice that h(N) should grow more slowly than R as N → ∞.

Angular Integral
The conditional probability p̄_{u,v} that a node u at fixed radial coordinate r_u connects to a node v at fixed radius r_v is the connection probability (2), integrated over their possible relative angular separations θ_1. We call this the angular integral, and, to simplify the notation, we rescale variables as follows.
Thus, for r_u + r_v - R ≥ ω(N), we divide the integral (16) into two parts. We begin with the first part of the integral, and then estimate the second part. Using the estimate of the integral I established in Appendix A, and taking account of (18), we obtain the estimate in each case, with a separate expression for τ > 1. Finally, we arrive at the angular integral estimation.
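Before any asymptotics, the angular integral (16) can be checked by direct quadrature at fixed radii. A numerical sketch (the function name is ours; by symmetry the angle is integrated over [0, π] against the normalized density sin^{d-1}θ):

```python
import math

def angular_integral(r_u, r_v, d, zeta, R, T, steps=2000):
    """Numerically evaluate the connection probability (2), integrated over
    the relative angle theta in [0, pi] against rho_1(theta) ~ sin(theta)**(d-1).
    A quadrature sanity check, not the paper's asymptotic estimate.
    The model's connection threshold is mu = R.
    """
    mu = R
    # Normalizing constant of the angular density on [0, pi].
    I = sum(math.sin((j + 0.5) * math.pi / steps) ** (d - 1)
            for j in range(steps)) * math.pi / steps
    total = 0.0
    for j in range(steps):
        th = (j + 0.5) * math.pi / steps
        # Hyperbolic law of cosines for the distance at relative angle th.
        c = (math.cosh(zeta*r_u)*math.cosh(zeta*r_v)
             - math.sinh(zeta*r_u)*math.sinh(zeta*r_v)*math.cos(th))
        dist = math.acosh(max(c, 1.0)) / zeta
        p = 1.0 / (1.0 + math.exp(min((dist - mu)/T, 700.0)))
        total += p * (math.sin(th) ** (d - 1) / I) * (math.pi / steps)
    return total

# Two nodes at small radii are within the threshold at every angle, so the
# integral is near 1; two boundary nodes connect only within a tiny angular
# window, so the integral is small.
```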

Expected Degree Distribution for τ < 1
We are now in a position to compute the expected degree of node u. Let I_{u,v} be an indicator random variable which is 1 when there is an edge between nodes u and v in the graph, and 0 otherwise. Recall that the coordinates of u are (r_u, θ_u); below, we assume r_u > ω(N). Notice that Pr[I_{u,v} = 1 | r_u, θ_u] depends only on the radius r_u (or the rescaled radial coordinate η_u), as one would expect from spherical symmetry; we therefore omit the θ_u from the conditioning. We estimate the first part of the integral (20) from (7) and (8). Next, we estimate the second part of the integral (20): from Lemma 3 and (11), we obtain the corresponding bound, where ρ̃(η) = (2/(dζ)) ρ((2/(dζ)) η).
Thus, we have

Asymptotic Degree Distribution
Now, we compute the asymptotic degree distribution. When r_u ≥ ω(N), we have (21) and (22); these yield Equation (29), which is compared with simulation in Figure 1.

Figure 1. Comparison of (the asymptotic approximation to) Equation (29) (the two straight lines) with two sample N = 2^17 node networks generated from the model. The degree of each node u is plotted as a function of its (rescaled) radial coordinate η_u, for a network sprinkled into H^{d+1}. The lower data points, in red, come from a network with parameters d = 1, 2σ = 9/5 and τ = 1/100, while the upper data points, in black, arise from a sprinkling with d = 3, 2σ = 7/5 and τ = 1/40. In each case, we sprinkle into a ball of radius R_H = ln N ≈ 11.78 (i.e., ν ≡ 1). Note that the vertical axis is in log scale, so that the exponential decay of node degree with distance η appears as straight lines. The simulation results appear to be consistent with Equation (29).
The degree D_u = Σ_{v∈V_N^u} I_{u,v} is the number of connections to the vertex u. In general, it depends on the coordinates of u, r_u and θ_u; however, due to the angular symmetry, only the r_u dependence remains. The quantities {I_{u,v}}_{v∈V_N^u} are a family of independent and identically distributed indicator random variables, whose expectation values are given by (28). Let D̃_u be a Poisson random variable with parameter T_u := Σ_{v∈V_N^u} Pr[I_{u,v} = 1 | r_u], so that T_u = k̄(r_u) from (29). Recall that the total variation distance d_TV(X_1, X_2) between two non-negative discrete random variables is defined as d_TV(X_1, X_2) = (1/2) Σ_{k≥0} |Pr[X_1 = k] - Pr[X_2 = k]|. The bound below holds uniformly for all r_u ≥ R/2 + ω(N).
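The Poisson approximation for D_u, a sum of independent indicators, can be illustrated numerically: Le Cam's inequality bounds the total variation distance between a Binomial(n, p) law and the Poisson law of the same mean by np². This is an illustration of the same phenomenon, not the specific bound of Theorem 2.10 in [21]:

```python
import math

def tv_binomial_poisson(n, p, kmax=200):
    """Total variation distance between Binomial(n, p) and Poisson(n*p),
    truncated at kmax (both tails beyond kmax are negligible here).
    Le Cam's inequality guarantees the result is at most n * p**2.
    """
    lam = n * p
    tv = 0.0
    for k in range(kmax + 1):
        if k <= n:
            # log of the binomial pmf, via lgamma for numerical stability
            log_b = (math.lgamma(n+1) - math.lgamma(k+1) - math.lgamma(n-k+1)
                     + k*math.log(p) + (n-k)*math.log(1.0-p))
            b = math.exp(log_b)
        else:
            b = 0.0
        po = math.exp(-lam + k*math.log(lam) - math.lgamma(k+1))
        tv += abs(b - po)
    return 0.5 * tv

# With the mean n*p held fixed, the distance shrinks as n grows and p falls,
# which is exactly the regime of the degree D_u as N -> infinity.
```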
Proof. This follows from Theorem 2.10 in [21].
Using the above claim, we have, for any k > 0, that the probability of the vertex u having k connections follows from (7) and (8). Taking the specific expression of the Poisson distribution into (30) and rescaling the integral, we apply (11). Changing the integration variable to x = K(N - 1)e^{-η_u}, we obtain the stated power law, where o_{k,N}(1) → 0 as k, N → ∞, γ = 2σ + 1, and we used that Γ(k - 2σ)/k! ∼ k^{-γ} as k → ∞. Above, we made use of the mean value theorem to justify the final estimate.

τ < 1 and 2σ = 1

Asymptotic Degree Distribution
Next, we compute the asymptotic degree distribution. When η_u ≥ (2/3)R_H + (dζ/2)ω(N), we have (22); thus, the estimate holds uniformly for all η_u ≥ (2/3)R_H + (dζ/2)ω(N). Similarly to Claim 2, we obtain the corresponding Poisson approximation. Figure 3 shows the trend of Pr[D_u = k] as N increases. If we choose R_H = ln(N/ν), then, since e^{-x} x^k / k! decreases to 0 when x > k, for any ε > 0, if N is large enough, the k-degree probability Pr[D_u = k] will be very small, for any fixed k ≥ 0. Note that, comparing the k-degree probability with the mean degree, we see that the rates at which the k-degree probabilities go to zero as N → ∞ cannot be uniform. It would be interesting to study the behavior of the limiting degree distribution as k → ∞.
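The monotonicity fact invoked here, that e^{-x}x^k/k! decreases to 0 for x > k, follows from a single differentiation:

```latex
\frac{d}{dx}\left(\frac{e^{-x}x^{k}}{k!}\right)
  = \frac{e^{-x}x^{k-1}\,(k-x)}{k!} \;<\; 0
  \qquad \text{for } x > k ,
```

and since e^{-x}x^k → 0 as x → ∞, the function decreases monotonically to 0 on (k, ∞).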
As with Claim 2, we compute the asymptotic behavior of D_u = Σ_{v∈V_N^u} I_{u,v} and D̃_u. By (39), we have the corresponding bound uniformly for all r_u ≥ R/2 + ω(N).
Proof.
Thus, as with (34), we have, for any k > 0, the stated estimate. First, we compute the leading term, and notice that, for N large enough, the remaining terms are negligible. Altogether, we have the estimate for any k > 0. Figure 4 shows that Pr[D_u = k] decreases as N increases, for k > 0.

Asymptotic Degree Distribution
If 2σ = 1/τ, we consider η_u ≥ (2/3)R_H + (dζ/2)ω(N). We still obtain the bound η_u e^{-η_u/τ} from (41), uniformly for all η_u ≥ (2/3)R_H + (dζ/2)ω(N); we then take R_H = τ ln(N/ν). Furthermore, as with (30), the bound η_u e^{-η_u/τ} holds uniformly for all (2/3)R_H + (dζ/2)ω(N) < η_u < R_H, and, since the function e^{-x} x^k / k! decreases to 0 when x > k, for any ε > 0, when N is large enough, the k-degree probability Pr[D_u = k] will be very small, for any fixed k ≥ 0.

Mean Degree
If we choose R_H = τ ln(N/ν), then, from (25) and (41), and choosing ω(N) to satisfy e^{(dζ/2τ)ω(N)} = o(R), from (24) and (26), we have the mean degree, which goes to infinity as N → ∞. Similarly, comparing the probability of the degree being equal to k with the mean degree, we see that the rates at which the k-degree probabilities go to zero as N → ∞ cannot be uniform. Note: for the parameter regions τ < 1 and 2σ < 1; τ = 1 and 2σ ≤ 1; and τ > 1 and 2σ < 1/τ, we cannot perform similar asymptotic analyses of degree because, in those cases, we cannot neglect (21) relative to (22), (38) and (41).

Asymptotic Correlations of Degree
We extend Theorem 4.1 of [13], governing the independence of degrees of vertices at separate angular positions in a hyperbolic ball, to arbitrary dimension d ≥ 1, i.e., to a ball in H^{d+1}.
For dimension d, any fixed integer m ≥ 2, and any collection of m pairwise distinct vertices v_i, i = 1, …, m, their degrees D_{v_1}, …, D_{v_m} are asymptotically independent in the sense that, for any non-negative integers k_i, i = 1, …, m, Pr[D_{v_1} = k_1, …, D_{v_m} = k_m] = (1 + o(1)) Π_{i=1}^m Pr[D_{v_i} = k_i]. It is important to note that m is held constant while N → ∞, so the collection of m vertices of Theorem 4 forms a vanishingly small fraction of all vertices. Furthermore, since each of these m vertices has finite degree, each of their neighbors will almost surely not be found among the other m - 1 vertices.
For the purposes of this section, we choose (dζ/2) R = ln(N/ν) and 2σ > 1.

Definition 1.
For a vertex v ∈ V_N, define A_v to be the set of points {w ∈ H^{d+1} : r_w ≥ R - r_0, θ_{v,w} ≤ min{π, θ̃_{v,w}}}, where θ_{v,w} is the angle between the points v and w in H^{d+1}, and θ̃_{v,w} is an angular cutoff defined in terms of ω(N). A_v is called the vital region in [13]. (Note that A_v might not be a topological neighborhood of the vertex v, because v might lie outside of its vital region; r_0 is defined at the end of Section 2.1, and A_{v,w} at the beginning of Section 3.) Similarly, we write u ∈ A_v to indicate that the embedded location of vertex u lies in the vital region A_v.
We will prove that the vital regions A_{v_i}, i = 1, …, m are mutually disjoint with high probability. Let E_1 be this event, i.e., that A_{v_i} ∩ A_{v_j} = ∅ for all i ≠ j, i, j = 1, …, m.
Proof. Assume r_{v_i} > R/(2σ) + 2ω(N). For any point w ∈ A_{v_i}, the parameter θ̃_{v_i,w} is maximized when r_w = R - r_0 = R - ((1/(2σ))R + ω(N)); let θ̃_{v_i} be this maximum. Thus, for i ≠ j, the vital regions can overlap only when the angle between v_i and v_j is small, and the probability for this to occur is bounded as follows, for sufficiently large N. Making use of the angular symmetry, and, for a vertex v, the estimates (7) and (8), we find that, since 2σ > 1, we may choose ω(N) = o(R) small enough to satisfy the required bound on e^{dζω(N)}.

Definition 2.
Define by T_1 the event that R - r_{v_i} ≤ r_0, for i = 1, …, m. For a given vertex w ∈ V_N \ {v_1, …, v_m}, define by A^w_{v_i} the event that w is connected to v_i and w ∈ A_{v_i}. Define by Ã^w_{v_i} the event that the vertex w is outside of A_{v_i} but is connected to v_i. Furthermore, define A(k_1, …, k_m) as the event that exactly k_i vertices satisfy the event A^w_{v_i}, i.e., are vertices connected to vertex v_i and lying within its vital region, for i = 1, …, m, whereas all other vertices do not. Lastly, let B_1 = ⋃_{i,w} Ã^w_{v_i}; thus, B_1 is the event that at least one of the vertices v_i has a neighbor outside of its vital region.
Note that, from Corollary 1, Pr[T 1 ] = 1 − o(1). Assuming E 1 ∩ T 1 , if B 1 is not realized, then the event that vertex v i has degree k i , for all i = 1, · · · , m, is realized if and only if A (k 1 , · · · , k m ) is realized.

Proof. The proof appears in Appendix B.
Thus, we first calculate the probability of the event A^w_{v_i}: that the vertex w ∈ V_N \ {v_1, …, v_m} is located in A_{v_i} and is linked to v_i. Notice that p_{v_i,w} does not depend directly on θ_{v_i}, but only on the angle between v_i and w; furthermore, the volume element in H^{d+1} is rotationally invariant when averaging over the position of w. Thus, by the angular symmetry, we have the below lemma.

Comment:
For τ < 1 and 2σ > 1, we found the above bound on the probability. This is important because we can see that, for the vertex w, the radius r_w is likely larger than R - r_0 when the event A^w_{v_i} occurs; so we can expect that the vertex w has a very small relative angle with v_i, and lies only in A_{v_i}, and not in any other vital region A_{v_j}, j ≠ i. Thus, we expect that the events D_{v_1} = k_1, …, D_{v_m} = k_m are asymptotically independent. For τ > 1 and 2σ > 1/τ, the above analysis does not go through, so the same result may not hold in that case.

Lemma 6. Let τ < 1 and 2σ > 1. For fixed m ≥ 2, let v_i, i = 1, …, m be vertices in V_N, and let k_1, …, k_m ≥ 0 be integers; then the conditional probability Pr[A(k_1, …, k_m) | E_1, T_1] factorizes asymptotically into the product of the individual degree probabilities.

Proof. The proof is lengthy, so we defer it to Appendix C.

Summary and Conclusions
Models of complex networks based upon hyperbolic geometry are proving to be effective in their ability to capture many important characteristics of real world complex networks, including a power law degree distribution and strong clustering. As explicated in Ref. [14], the radial dimension can be regarded as a measure of the popularity of a network node, while the angular dimension provides a space in which to summarize the internal characteristics of a node. Prior to this work, all hyperbolic models of this kind have been two-dimensional, allowing only a one-dimensional representation of these internal characteristics. However, it is reasonable to expect that, in order to capture the behavior of real world complex networks, one must allow for similarity spaces of more than one dimension. One place to see this need is in networks embedded in space, such as transportation networks at intracity and intercity scales, or galaxies arranged in clusters and superclusters, which naturally live in two and three dimensions, respectively. However, the greater need may in fact be in information-based networks, such as those which represent how data are organized, for example in the realm of natural language processing; see, for example, Refs. [8,10,22-25]. We have thus generalized the hyperbolic model to allow for arbitrarily large dimensional similarity spaces.
Specifically, we have computed the exact asymptotic degree distribution for a generalization of hyperbolic geometric random graphs to arbitrary dimensional hyperbolic space. We considered five regions of the parameter space, as depicted in Figure 5: one for temperature τ < 1 and 2σ > 1, another at τ < 1 and 2σ = 1, a third at the critical temperature τ = 1 with 2σ > 1, and the last two in the high temperature regime τ > 1: one with 2σ > 1/τ, and a second with 2σ = 1/τ. For two of the regions, we found a power law expected degree distribution whose exponent is governed by the parameter σ, which controls the radial concentration of nodes in the geometric embedding. When τ = 1, we find that the degree distribution degenerates, such that only zero degree nodes have non-vanishing probability, i.e., almost every node is disconnected from the rest of the network. For the remaining two regions, with 2σ = 1 or 2σ = 1/τ, we find that the degree distribution 'runs off to infinity', such that the probability of any node having finite degree goes to zero in the asymptotic limit.
We have also proved a generalization of Fountoulakis's theorem governing correlations in degree to arbitrary dimension, and discovered a non-trivial dependence on the dimension and the exponent of the power law in these correlations.
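Such degree correlations can be probed numerically with Newman's assortativity coefficient, the Pearson correlation of the degrees at the two endpoints of each edge; the sketch below is a generic measurement tool, not the generalized theorem itself.

```python
import numpy as np

def degree_assortativity(adj):
    """Newman's degree assortativity coefficient: the Pearson correlation
    of the degrees found at the two ends of each edge of an undirected
    graph given by a boolean adjacency matrix."""
    adj = np.asarray(adj, dtype=bool)
    deg = adj.sum(axis=1)
    # each undirected edge contributes both orientations (i, j) and (j, i)
    i, j = np.nonzero(np.triu(adj, 1))
    x = np.concatenate([deg[i], deg[j]]).astype(float)
    y = np.concatenate([deg[j], deg[i]]).astype(float)
    return np.corrcoef(x, y)[0, 1]
```

Negative values indicate disassortative mixing (hubs preferentially attached to low-degree nodes), which is the typical situation in uncorrected hyperbolic-geometric ensembles.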
It is important to be able to model somewhat denser complex networks with 'fat tailed' degree distributions, for example those whose degree sequence is still a power law Pr(k) ∼ 1/k^γ, but with γ ≤ 2. We have made an important first step in this direction by exploring three parameter regimes with 2σ ≤ 1. One of them, at τ > 1, manifests a power law degree distribution, while, for the other two, the degree distribution 'runs off to infinity', which is not necessarily unexpected in the context of a fat tailed distribution. It would be instructive to understand this denser regime of complex networks in more detail, and to provide models which can help predict the behavior of these networks. Another significant step is to explore the clustering behavior of these higher dimensional models. Krioukov has shown that clustering can be an important herald of geometric structure in networks, and it is common in real-world complex networks [6]. Does the clustering behavior of the two-dimensional hyperbolic models generalize to arbitrary dimension?
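As a starting point for such an exploration, the average local clustering coefficient of a sampled graph can be computed directly from its adjacency matrix; the following is a standard measurement sketch, independent of the model details.

```python
import numpy as np

def average_clustering(adj):
    """Mean local clustering coefficient of an undirected graph.

    For each node of degree k >= 2, count the edges among its neighbors
    and divide by the k*(k-1)/2 possible ones; nodes of degree < 2
    contribute zero, following the usual convention."""
    adj = np.asarray(adj, dtype=bool)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    coefs = np.zeros(n)
    for i in range(n):
        if deg[i] < 2:
            continue
        nbrs = np.flatnonzero(adj[i])
        # symmetric submatrix counts each neighbor-neighbor edge twice
        links = adj[np.ix_(nbrs, nbrs)].sum() / 2
        coefs[i] = links / (deg[i] * (deg[i] - 1) / 2)
    return coefs.mean()
```

Comparing this statistic across dimensions, at matched degree sequences, is one concrete way to address the closing question above.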
One can also study more carefully the effect of the constraints we impose on the growth of R with N, such as the choice R = ln(N/ν). Might some other characterization of the ball radius as a function of network size be more effective, in a wider region of the parameter space? We have already seen hints to this effect in Sections 5 and 6.
Another important step is to extend the model to allow for bipartite networks, by assigning to each node one of two possible 'types' and allowing only nodes of different types to connect. This would generalize the approach of [26] to arbitrary dimensional hyperbolic space.

Figure 5: Regions of parameter space explored in this paper, with the respective subsections of the paper indicated for each. The temperature parameter τ = dζ/(2T) increases in the downward direction (as it generally does on the Earth), and the parameter 2σ, which often controls the exponent of the power law degree distribution, increases to the right. Since larger σ generally means a faster fall-off of the degree distribution, we can imagine fat or long tails to the left and more truncated tails to the right. We can think of increased temperature as promoting noise which shifts the model in the direction of a uniform distribution on graphs. The two 'generic' regions in blue both manifest power law degree distributions. The three other 'measure zero' regions do not yield non-trivial degree distributions at finite degree, using the growth rate of R_H with N that we have chosen for them. Only the upper right region, τ < 1 and 2σ > 1, has appeared in the previous literature.
Of course, it is arguably of greatest interest to apply the model to real world networks, by embedding such networks into higher dimensional hyperbolic space. To do so, one would need to estimate the model parameters most closely associated with a given network. The ideal process for doing so is Bayesian parameter estimation; however, there are a number of techniques, such as measuring the network's degree sequence, which can serve as efficient proxies in place of a full Bayesian statistical analysis. Some steps in this direction for a somewhat similar (d = 1) model, along with some initial results for link prediction and soft community detection, are given in [27]. It will be of great interest to see whether these higher dimensional models provide more effective predictions for many real world networks, such as the information-based networks mentioned above, or social or biological networks, where we may expect that a one-dimensional similarity space lacks the depth and sophistication necessary to effectively represent the behavior of these complicated entities.

Acknowledgments:
D.R. is extremely grateful to Maksim Kitsak for sharing the unpublished manuscript, "Lorentz-invariant maximum-entropy network ensembles". We appreciate the assistance of the anonymous referees for many very useful comments and suggestions. We thank the San Diego Supercomputing Center for hosting us while these computations were performed.

Conflicts of Interest:
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. Angular Integrals
Consider the integral , where δ is a small constant. Define θ_δ by . Split I into two pieces, I = I_1 + I_2, with  (where, by construction, we keep the second term in the denominator smaller than the first) and .
We evaluate the integrals I_1 and I_2 for general dimension d.
Note that, for 1/τ ≥ 1, the last integral will be singular.