
Monte Carlo Based Techniques for Quantum Magnets with Long-Range Interactions

Patrick Adelhardt
Jan A. Koziol
Anja Langheld
Kai P. Schmidt
Department of Physics, Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), 91058 Erlangen, Germany
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Entropy 2024, 26(5), 401;
Submission received: 5 March 2024 / Revised: 17 April 2024 / Accepted: 19 April 2024 / Published: 1 May 2024


Long-range interactions are relevant for a large variety of quantum systems in quantum optics and condensed matter physics. In particular, the control of quantum–optical platforms promises to yield deep insights into the quantum-critical properties induced by the long-range nature of interactions. From a theoretical perspective, long-range interactions are notoriously complicated to treat. Here, we give an overview of recent advancements in the investigation of quantum magnets with long-range interactions, focusing on two techniques based on Monte Carlo integration. The first is the method of perturbative continuous unitary transformations, in which classical Monte Carlo integration is applied within the embedding scheme of white graphs. This linked-cluster expansion allows the extraction of high-order series expansions of energies and observables in the thermodynamic limit. The second is stochastic series expansion quantum Monte Carlo integration, which enables calculations on large finite systems. Finite-size scaling can then be used to determine the physical properties of the infinite system. In recent years, both techniques have been applied successfully to one- and two-dimensional quantum magnets involving long-range Ising, XY, and Heisenberg interactions on various bipartite and non-bipartite lattices. Here, we summarise the obtained quantum-critical properties, including critical exponents, for all these systems in a coherent way. Further, we review how long-range interactions are used to study quantum phase transitions above the upper critical dimension and the scaling techniques used to extract these quantum-critical properties from the numerical calculations.

1. Introduction

Since the advent of the theoretical description of classical and quantum phase transitions (QPTs), long-range interactions between degrees of freedom challenged the established concepts and propelled the development of new ideas in the field [1,2,3,4,5]. It is remarkable that, only a few years after the introduction of the renormalisation group (RG) theory by K.G. Wilson in 1971 as a tool to study phase transitions and as an explanation for universality classes [6,7,8,9,10,11], it was used to investigate ordering phase transitions with long-range interactions. These studies found that the criticality depends on the decay strength of the interaction [1,2,3]. It then took two decades to develop numerical Monte Carlo (MC) tools capable of simulating basic magnetic long-range models with thermal phase transitions following the behaviour predicted by the RG theory [12,13]. The results of these simulations sparked a renewed interest in finite-size scaling above the upper critical dimension [12,14,15,16,17,18,19] since “hyperscaling is violated” [13] for long-range interactions that decay slowly enough. In this regime, the treatment of dangerous irrelevant variables (DIVs) in the scaling forms is required to extract critical exponents from finite systems.
Meanwhile, a similar historical development took place regarding the study of QPTs under the influence of long-range interactions. Pioneering RG studies [20,21] triggered the numerical investigation of long-range interacting magnetic systems [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38]. In particular, Monte Carlo-based techniques became a popular tool to gain quantitative insight into these long-range interacting quantum magnets [22,25,26,29,30,31,32,33,34,35,36,37,38,39,40]. On the one hand, this includes high-order series expansion techniques, in which classical Monte Carlo integration is applied for the graph embedding scheme, allowing energies and observables to be extracted in the thermodynamic limit [25,29]. On the other hand, there is stochastic series expansion quantum Monte Carlo [39], which enables calculations on large finite systems. To determine the physical properties of the infinite system, finite-size scaling is performed on the results of these computations. Inspired by recent developments for classical phase transitions [15,16,17,18,19,41], a theory of finite-size scaling above the upper critical dimension for QPTs was introduced [32,34].
When investigating algebraically decaying long-range interactions r^(−(d+σ)) with the distance r and the dimension d of the system, there are two distinct regimes: one for σ ≤ 0 (strong long-range interactions) and another one for σ > 0 (weak long-range interactions) [5,42,43,44,45]. In the case of strong long-range interactions, common definitions of internal energy and entropy in the thermodynamic limit are not applicable and standard thermodynamics breaks down [5,42,43,44,45]. We will not focus on this regime in this review. For details specific to strong long-range interactions, we refer to other review articles such as Refs. [5,42,43,44,45]. For the sake of this work, we restrict the discussion to weak long-range interactions or competing antiferromagnetic strong long-range interactions, for which an extensive ground-state energy can be defined without rescaling of the coupling constant [5].
The interest in quantum magnets with long-range interactions is further fuelled by the relevance of these models in state-of-the-art quantum–optical platforms [5,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87]. To realise long-range interacting quantum lattice models with a tunable algebraic decay exponent, one can use trapped ions, which are coupled off-resonantly to motional degrees of freedom [5,81,82,83,84,85,88]. Another possibility is to couple trapped neutral atoms to photonic modes of a cavity [5,86,87]. Alternatively, one can realise long-range interactions decaying with a fixed algebraic decay exponent of six or three using Rydberg atom quantum simulators [46,47,48,49,50,51,52,53,54,55] or ultracold dipolar quantum atomic or molecular gases in optical lattices [56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73]. Note that, in many of the above-listed cases, it is possible to map the long-range interacting atomic degrees of freedom onto quantum spin models [5,52,89]. Therefore, they can be exploited as analogue quantum simulators for long-range interacting quantum magnets, and the relevance of the theoretical concepts transcends the boundary between the fields.
From the perspective of condensed matter physics, there are multiple materials with relevant long-range interactions [90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106]. The compound LiHoF4 in an external field realises an Ising magnet in a transverse magnetic field [102,103,104,105]. A recent experiment with the two-dimensional Heisenberg ferromagnet Fe3GeTe2 demonstrates that phase transitions and continuous symmetry breaking can be implemented by circumventing the Hohenberg–Mermin–Wagner theorem with long-range interactions [106]. This material is in the recently discovered material class of 2D magnetic van der Waals systems [107,108]. Further, dipolar interactions play a crucial role in the spin ice state in the frustrated magnetic pyrochlore materials Ho2Ti2O7 and Dy2Ti2O7 [90,91,92,93,94,95,96,97,98,99,100,101].
In this review, we are interested in physical systems described by quantum spin models, where the magnetic degrees of freedom are located on the sites of a lattice. We concentrate on the following three paradigmatic types of magnetic interactions between lattice sites: first, Ising interactions, where the magnetic interaction is oriented only in the direction of one quantisation axis; second, XY interactions with a U(1)-symmetric magnetic interaction invariant under planar rotations; and third, Heisenberg interactions with an SU(2)-symmetric magnetic interaction invariant under rotations in 3D spin space. In the microscopic models of interest, a competition between magnetic ordering and trivial product states, external fields, or quasi-long-range order leads to QPTs.
In this context, the primary research pursuit revolves around how the properties of the QPT depend on the long-range interaction. The upper critical dimension of a QPT in magnetic models with non-competing algebraically decaying long-range interactions is known to depend on the decay exponent of the interaction for small enough exponents, decreasing as the decay exponent decreases [20,21]. If the dimension of a system is equal to or exceeds the upper critical dimension, the QPT displays mean-field critical behaviour. At the same time, standard finite-size scaling as well as standard hyperscaling relations are no longer applicable. Therefore, these systems are primary workhorse models to study finite-size scaling above the upper critical dimension. In this case, the numerical simulation of these systems is crucial in order to gauge novel theoretical developments. Further, QPTs in systems with competing long-range interactions tend not to depend on the long-range nature of the interaction [23,24,25,26,29,30,32]. In several cases, long-range interactions then lead to the emergence of ground states and QPTs that are not present in the corresponding short-range interacting models [27,30,54,55,109,110,111,112].
In this review, we are mainly interested in the description and discussion of two Monte Carlo-based numerical techniques, which have been successfully used to study the low-energy physics of long-range interacting quantum magnets, in particular with respect to the quantitative investigation of QPTs [22,25,29,30,31,32,34,35,36,37,38,40]. The success of Monte Carlo techniques in this field is due to the occurrence of high-dimensional sums and integrals that commonly arise in the formulation of many-particle statistics. In contrast to many deterministic integration techniques, whose cost for a fixed accuracy grows exponentially with the dimension of the underlying integral, the standard error of an integral calculated with Monte Carlo integration does not scale with the dimension of the underlying integral. We further chose to review this topic due to our personal involvement in the application and development of these methods [25,29,30,31,32,34,35]. On the one hand, we explain in detail how classical Monte Carlo integration can enhance the capabilities of linked-cluster expansions (LCEs) with the pCUT+MC approach, a combination of perturbative continuous unitary transformations (pCUT) and Monte Carlo (MC) embedding. On the other hand, we describe how stochastic series expansion (SSE) quantum Monte Carlo (QMC) integration is used to directly sample the thermodynamic properties of suitable long-range quantum magnets on finite systems.
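The dimension independence of the Monte Carlo error can be illustrated with a minimal sketch in plain Python (the integrand below is a hypothetical example of ours, not taken from the pCUT+MC or SSE codes): the standard error of the estimate decreases as the inverse square root of the number of samples, regardless of the dimension of the integration domain.

```python
import math
import random

def mc_integrate(f, dim, n_samples, seed=0):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]^dim.

    Returns (estimate, standard_error). The error scales as 1/sqrt(n_samples),
    independent of dim, in contrast to deterministic grid-based quadrature.
    """
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(n_samples):
        fx = f([rng.random() for _ in range(dim)])
        total += fx
        total_sq += fx * fx
    mean = total / n_samples
    variance = total_sq / n_samples - mean * mean
    return mean, math.sqrt(variance / n_samples)

# Toy integrand: f(x) = sum_i x_i, whose exact integral over [0,1]^dim is dim/2.
for dim in (1, 10, 100):
    est, err = mc_integrate(sum, dim, 10_000)
    print(f"dim={dim:3d}: estimate={est:8.3f} (exact {dim / 2}), std.err={err:.3f}")
```

Even at dim = 100, where any product quadrature grid would be hopeless, the estimate stays within a few standard errors of the exact value.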
This review is structured as follows. In Section 2, we review the basic concept of a QPT in a condensed way, focusing on the details relevant for this review. We define the quantum-critical exponents and the relations between them in Section 2.1. Here, we also have the first encounter with the generalised hyperscaling relation, which is also valid above the upper critical dimension where conventional hyperscaling breaks down. As the SSE QMC method discussed in this review is a finite-system simulation, we discuss the conventional finite-size scaling below the upper critical dimension in Section 2.2 and the peculiarities of finite-size scaling above the upper critical dimension in Section 2.3. In Section 3, we summarise the basic concepts of Markov chain Monte Carlo integration: Monte Carlo sampling, Markovian random walks, stationary distributions, the detailed balance condition, and the Metropolis–Hastings algorithm. We continue by introducing the series-expansion Monte Carlo embedding method pCUT+MC in Section 4. We start with the basic concepts of a graph expansion in Section 4.1 and introduce the perturbative method of our choice, the perturbative continuous unitary transformation method, in Section 4.2. We introduce the theoretical concepts for setting up a linked-cluster expansion as a full graph decomposition in Section 4.3 and, subsequently, discuss how to practically calculate perturbative contributions in Section 4.4 and Section 4.5. We prepare the discussion of the white graph decomposition in Section 4.6 with an interlude on the relevant graph theory in Section 4.6.1 and Section 4.6.2 and the important concept of white graphs in Section 4.6.3. Further, in Section 4.7, we discuss the embedding problem for the white graph contributions. 
Starting from the nearest-neighbour embedding problem in Section 4.7.1, we generalise it to the long-range case in Section 4.7.2 and then introduce a classical Monte Carlo algorithm to calculate the resulting high-dimensional sums in Section 4.7.3. This is followed by some technical aspects of series extrapolations in Section 4.8 and a summary of the entire workflow in Section 4.9. In the next section, the topic changes towards the review of the SSE QMC method, which is an approach to simulate thermodynamic properties of suitable quantum many-body systems on finite systems at a finite temperature. First, we discuss the general concepts of the method in Section 5. We review the algorithm to simulate arbitrary transverse-field Ising models introduced by A. Sandvik [39] in Section 5.1. We then review an algorithm used to simulate non-frustrated Heisenberg models in Section 5.2. After the introduction to the algorithms, we summarise techniques on how to measure common observables in the SSE QMC scheme in Section 5.3. Since the SSE QMC method is a finite-temperature method, we discuss how to rigorously use this scheme to perform simulations at effective zero temperature in Section 5.4. We conclude this section with a brief summary of path integral Monte Carlo techniques used for systems with long-range interactions (see Section 5.5). To maintain the balance between algorithmic aspects and their physical relevance, we summarise several theoretical and numerical results for quantum phase transitions in basic long-range interacting quantum spin models, for which the discussed Monte Carlo-based techniques provided significant results. First, we discuss long-range interacting transverse-field Ising models in Section 6.
For ferromagnetic interactions, this model displays three regimes of universality: a long-range mean-field regime for slowly decaying long-range interactions, an intermediate long-range non-trivial regime, and a regime of short-range universality for rapidly decaying long-range interactions. We discuss the theoretical origins of this behaviour in Section 6.1.1 and numerical results for quantum critical exponents in Section 6.1.2. Since this model is a prime example to study scaling above the upper critical dimension in the long-range mean-field regime, we emphasise these aspects in Section 6.1.3. Further, we discuss the antiferromagnetic long-range transverse-field Ising model on bipartite lattices in Section 6.2 and on non-bipartite lattices in Section 6.3. The next obvious step is to change the symmetry of the magnetic interactions. Therefore, we turn to long-range interacting XY models in Section 7 and Heisenberg models in Section 8. We discuss the long-range interacting transverse-field XY chain in Section 7, starting with the U(1)-symmetric isotropic case in Section 7.1, followed by the anisotropic case for ferromagnetic (see Section 7.2) and antiferromagnetic (see Section 7.3) interactions, which display similar behaviour to the long-range transverse-field Ising model on the chain discussed in Section 6. We conclude the discussion of the results with unfrustrated long-range Heisenberg models in Section 8. We focus on the staggered antiferromagnetic long-range Heisenberg square lattice bilayer model in Section 8.1, followed by long-range Heisenberg ladders in Section 8.2 and the long-range Heisenberg chain in Section 8.3. We conclude in Section 9 with a brief summary and with some comments on the next possible steps in the field.

2. Quantum Phase Transitions

This review is part of the Special Issue with the topic “Violations of Hyperscaling in Phase Transitions and Critical Phenomena”. In this work, we summarise investigations of low-dimensional quantum magnets with long-range interactions targeting, in particular, quantum phase transitions (QPTs) above the upper critical dimension, where the naive hyperscaling relation is no longer applicable. In this section, we recapitulate the relevant aspects of QPTs needed to discuss the results of the Monte Carlo-based numerical approaches. First, we give a general introduction to QPTs. After that, we discuss in detail the definition of critical exponents and the relations among them in Section 2.1, as well as the scaling below (see Section 2.2) and above (see Section 2.3) the upper critical dimension.
Any non-analytic point of the ground-state energy of an infinite quantum system as a function of a tuning parameter λ is identified with a QPT [113]. This tuning parameter can, for instance, be a magnetic field or pressure, but not the temperature. Quantum phase transitions are a concept of zero temperature: there are no thermal fluctuations, and all excited states are suppressed infinitely strongly such that the system remains in its ground state. There are two scenarios for how a non-analytic point in the ground-state energy can emerge [113]: first, an actual (sharp) level crossing between the ground-state energy and another energy level; second, a non-analytic point arising as the limiting case of an avoided level crossing. Historically, phase transitions are classified by the lowest-order derivative of the free energy that is discontinuous [113,114]. Therefore, a first-order phase transition is discontinuous in the order parameter (first derivative) and a second-order phase transition is discontinuous in the response functions (second derivative). Since, in second-order phase transitions, the order parameter is still continuous across the phase transition, we use the term “continuous phase transition” as an equivalent for “second-order phase transition”.
In this review, we are interested in second-order QPTs, which fall into the second scenario. At a second-order QPT, the relevant elementary excitations condense into a novel ground state, while the characteristic length and time scales diverge. Apart from topological QPTs involving a long-range entangled topological phase, such continuous transitions are described by the concept of spontaneous symmetry breaking. On one side of the QPT, the ground state obeys a symmetry of the Hamiltonian, while on the other side, this symmetry is broken in the ground state and a ground-state degeneracy arises.
Following the idea of the quantum-to-classical mapping [113,115], d-dimensional quantum systems can be mapped in the vicinity of a second-order QPT to models of statistical mechanics with a classical (thermal) second-order phase transition in d + 1 dimensions. In many cases, the models obtained from a quantum-to-classical mapping are rather artificial [113]. However, such mappings often allow categorising QPTs in terms of universality classes and associated critical exponents by the non-analytic behaviour of the classical counterparts [10,11,113,116,117]. The mapping further illustrates that the renormalisation group (RG) theory is also applicable to describe QPTs [10,11,113,116,117].
In the RG theory, each QPT belongs to a non-trivial fixed point of the RG transformation [10,11], whereas a trivial fixed point would, for instance, be a fully ordered state with maximal correlation or a fully disordered state with no correlation at all. Critical exponents are connected to the RG flow in the immediate vicinity of these non-trivial fixed points [10,11,113]. The concept of universality classes arises from the fact that different microscopic Hamiltonians can have a quantum critical point that is attracted by the same non-trivial fixed point under successive application of the RG transformation [10,11]. Due to this, the QPTs in these models have the same critical exponents.
Another remarkable result of the RG theory is the scaling of observables in the vicinity of phase transitions. Historically, the theory of scaling at phase transitions was heuristically introduced before the RG approach [118,119,120,121,122,123,124]. The latter provided the theoretical foundation for the scaling hypothesis [6,7]. The main statement of the scaling theory is that the non-analytic contributions to the free energy and correlation functions are mathematically described by generalised homogeneous functions (GHFs) [124]. A function f(x_1, x_2, …, x_n) of n variables is called a GHF if there exist a_1, a_2, …, a_n ∈ ℝ, with at least one being non-zero, and a_f ∈ ℝ such that, for all b ∈ ℝ⁺,

f(b^(a_1) x_1, b^(a_2) x_2, …, b^(a_n) x_n) = b^(a_f) f(x_1, x_2, …, x_n). (1)
The exponents a 1 , a 2 , , a n are the scaling powers of the variables, and a f is the scaling power of the function f itself. An in-depth summary of the mathematical properties of GHFs can be found in Appendix B. The most important properties of GHFs are that their derivatives, Legendre transforms, and Fourier transforms are also GHFs. As we will outline in Section 2.2, the theory of finite-size scaling is formulated in terms of GHFs and relates the non-analytic behaviour at QPTs in the thermodynamic limit with the scaling of observables for different finite system sizes. In this, the variables x i are related to physical parameters like the temperature T, control parameter λ , symmetry-breaking field H, and also irrelevant, more abstract parameters that parameterise the microscopic details of the model like the lattice spacing. Later in this section, we will define irrelevant variables in the context of the RG and GHFs.
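The defining identity is straightforward to check numerically for a toy function (a sketch for illustration only; the function below is ours and carries no physical meaning): f(x, y) = x⁴ + y² is a GHF with scaling powers a₁ = 1, a₂ = 2, and a_f = 4.

```python
# Toy generalised homogeneous function: f(x, y) = x**4 + y**2 satisfies
# f(b**a1 * x, b**a2 * y) = b**af * f(x, y) with a1 = 1, a2 = 2, af = 4.
def f(x, y):
    return x**4 + y**2

a1, a2, af = 1, 2, 4
for b in (0.5, 2.0, 10.0):
    for x, y in [(1.3, -0.7), (0.2, 4.0)]:
        lhs = f(b**a1 * x, b**a2 * y)
        rhs = b**af * f(x, y)
        assert abs(lhs - rhs) < 1e-9 * abs(rhs)   # GHF identity holds
```

The same kind of check confirms the closure properties mentioned above, e.g., that the partial derivative ∂f/∂y is again a GHF, with a_f replaced by a_f − a₂.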
Another aspect relevant for this work is that quantum fluctuations are the driving factor behind QPTs [113]. In general, fluctuations are more important in low dimensions [117]. The universality class of QPTs for a certain symmetry breaking depends on the dimensionality of the system.
An important aspect regarding this review is the so-called upper critical dimension d_uc. The upper critical dimension is defined as a dimensional boundary such that, for systems with dimension d ≥ d_uc, the critical exponents are those obtained from mean-field considerations. The upper critical dimension is of particular importance for QPTs in systems with non-competing long-range interactions. For sufficiently small decay exponents of an algebraically decaying long-range interaction 1/r^(d+σ), the upper critical dimension starts to decrease as a function of the decay exponent σ [20,32,34]. In the limiting case of a completely flat decay (σ = −d) of the long-range interaction, resulting in an all-to-all coupling, the model is intrinsically of the mean-field type and mean-field considerations become exact. For a certain value of the decay exponent, the upper critical dimension becomes equal to the fixed spatial dimension, and for decay exponents below this value, the dimension of the system is above the upper critical dimension of the transition [20,32,34]. This makes long-range interacting systems an ideal test bed for studying phase transitions above the upper critical dimension in low-dimensional systems. In particular, long-range interactions can make the upper critical dimension accessible in real-world experiments as the upper critical dimension of short-range models is usually not below three.
Although phase transitions above the upper critical dimension display mean-field criticality, they are still a matter worth studying, since naive scaling theory describing the behaviour of finite systems close to a phase transition (see Section 2.2) is no longer applicable [16,19,125]. Moreover, the naive versions of some relations between critical exponents, as discussed in Section 2.1, no longer hold [15,16]. The reason for this is the presence of dangerous irrelevant variables (DIVs) in the RG framework [126,127,128]. During the application of the RG transformation, the original Hamiltonian is successively mapped to other Hamiltonians, which can have infinitely many couplings. All these couplings, in principle, enter the GHFs. In practice, all but a finite number of these couplings are set to zero since their scaling powers are negative, which means they flow to zero under renormalisation. These couplings are, therefore, called irrelevant. This approach of setting irrelevant couplings to zero can be used to derive the finite-size scaling behaviour as described in Section 2.2. However, above the upper critical dimension, this approach breaks down because it is only possible to set irrelevant variables to zero if the GHF does not have a singularity in this limit [126]. Above the upper critical dimension, such singularities in irrelevant parameters exist, which makes them DIVs [127]. We explain the effect of DIVs on scaling in Section 2.3.

2.1. Critical Exponents in the Thermodynamic Limit

As outlined above, a second-order QPT comes with a singularity in the free energy density. In fact, other observables also experience singular behaviour at the critical point in the form of power-law singularities. For instance, the order parameter m as a function of the control parameter λ behaves as

m(λ → λ_c⁻) ∝ |λ − λ_c|^β (2)

in the ordered phase. Without loss of generality, the system is taken to be in the ordered phase for λ < λ_c, and the notation λ → λ_c⁻ means that λ approaches λ_c from below, i.e., from within the ordered phase. In the disordered phase λ > λ_c, the order parameter by definition vanishes such that m(λ → λ_c⁺) = 0. The observables with their respective power-law singular behaviour, which is characterised by the critical exponents α, β, γ, δ, η, ν, and z, are summarised in Table 1 together with how they are commonly defined in terms of the free energy density f, the symmetry-breaking field H, which couples to the order parameter, and the reduced control parameter r = (λ − λ_c)/λ_c.
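As an illustration of how such a critical exponent is read off in practice, the following sketch extracts β as the slope of log m versus log |λ − λ_c| for synthetic, perfectly power-law data (λ_c and β are chosen by hand here; real simulation data would carry statistical errors and corrections to scaling):

```python
import math

# Synthetic order-parameter data m = |lambda - lambda_c|**beta in the ordered
# phase, with the 2D-Ising value beta = 1/8 chosen purely for illustration.
lam_c, beta_true = 1.0, 0.125
lams = [lam_c - 10.0**(-k) for k in range(1, 6)]   # approach lambda_c from below
ms = [abs(lam - lam_c)**beta_true for lam in lams]

# beta is the slope of log m versus log |lambda - lambda_c| (least squares).
xs = [math.log(abs(lam - lam_c)) for lam in lams]
ys = [math.log(m) for m in ms]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx)**2 for x in xs)
print(f"fitted beta = {slope:.4f}")   # → 0.1250 for this clean power-law data
```

For noisy data, the same log-log fit is typically restricted to a window close to λ_c, where the leading power law dominates.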
One usually defines reduced parameters like r that vanish at the critical point not only to shorten the notation, but also to express the power-law singularities independent of the microscopic details of the specific model one is looking at. While the value of λ_c depends on these details, the power-law singularities are empirically known to not depend on the microscopic details, but only on more general properties like the dimensionality, the symmetry that is being broken, and, with particular emphasis due to the focus of this review, the range of the interaction. It is, therefore, common to classify continuous phase transitions in terms of universality classes. These universality classes share the same set of critical exponents. In terms of the RG, this behaviour is understood as distinct critical points of microscopically different models flowing to the same renormalisation group fixed point, which determines the criticality of the system [6,7,10]. Prominent examples for universality classes of magnets are the 2D and 3D Ising (Z_2 symmetry), 3D XY (O(2) symmetry), and 3D Heisenberg (O(3) symmetry) universality classes [113]. It is important to mention that the dimension in these classifications refers to classical and not quantum systems, and the two should not be confused with each other. In fact, the universality class of a short-range interacting non-frustrated quantum Ising model of dimension d is that of the classical (d+1)-dimensional Ising model.
There are only a few dimensions for which a separate universality class is defined for the different models. For lower dimensions, the fluctuations are too strong for a spontaneous symmetry breaking to occur. In the case of the classical Ising model, there is no phase transition in 1D, while for the classical XY and Heisenberg models with continuous symmetries, there is not even a phase transition in 2D due to the Hohenberg–Mermin–Wagner (HMW) theorem [130,131]. This dimensional boundary is referred to as the lower critical dimension d_lc. The lower critical dimension is the highest dimension for which no transition occurs, i.e., d_lc = 1 for the Ising model and d_lc = 2 for the XY and Heisenberg models. For higher dimensions d ≥ 4, the critical exponents of the mentioned models do not depend on the dimensionality any longer, and they take on the mean-field critical exponents in all dimensions. The underlying reason is that, with increasing dimension, the local fluctuations become smaller due to the higher connectivity of the system [132]. This has also been exploited in diagrammatic and series expansions in 1/d [133,134,135]. This dimensional boundary, at which the criticality becomes mean-field, is called the upper critical dimension d_uc. Usually, the upper critical dimension is too large to realise a system above its upper critical dimension in the real world. However, long-range interactions can increase the connectivity of a system in a similar sense as the dimensionality. A sufficiently long-range interaction can, therefore, lower the upper critical dimension to a value that is accessible in experiments.
Finally, it is worth mentioning that the critical exponents are not independent of each other, but obey certain relations [129], namely
2 − α = (d + z) ν, (3)
2 − α = 2β + γ, (4)
γ = β(δ − 1), (5)
γ = (2 − η) ν. (6)
The first relation in Equation (3) is the so-called hyperscaling relation, whose classical analogue (without z) was introduced by Widom [10,136]. The Essam–Fisher relation in Equation (4) [137,138] is reminiscent of a similar inequality, which was proven rigorously by Rushbrooke using thermodynamic stability arguments. Equation (5) is called the Widom relation. The last relation in Equation (6) is the Fisher scaling relation, which can be derived using the fluctuation–dissipation theorem [10,129,138]. These relations were originally obtained from scaling assumptions of observables close to the critical point and were only later derived rigorously when the RG formalism was introduced to critical phenomena [10,129]. Due to these relations, it is sufficient to calculate three, albeit not arbitrary, exponents to obtain the full set of critical exponents.
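The four relations can be verified exactly for any known exponent set; the sketch below uses the exponents of the transverse-field Ising chain (d = 1, z = 1, in the 2D classical Ising universality class), written as exact fractions so the checks involve no floating-point tolerance:

```python
from fractions import Fraction as F

# Exponents of the 1D transverse-field Ising chain (2D classical Ising
# universality class): d = 1, z = 1.
d, z = 1, 1
alpha, beta, gamma = F(0), F(1, 8), F(7, 4)
delta, eta, nu = F(15), F(1, 4), F(1)

assert 2 - alpha == (d + z) * nu        # hyperscaling, Equation (3)
assert 2 - alpha == 2 * beta + gamma    # Essam-Fisher, Equation (4)
assert gamma == beta * (delta - 1)      # Widom, Equation (5)
assert gamma == (2 - eta) * nu          # Fisher, Equation (6)
print("all four scaling relations hold exactly")
```

Knowing any three independent exponents (together with d and z), the remaining ones follow from these relations.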
The hyperscaling relation Equation (3) is the only relation containing the dimension of the system and is, therefore, often said to break down above the upper critical dimension, where one expects the same mean-field critical exponents independent of the dimension d [10]. It, therefore, deserves special focus in this review since the long-range models discussed will be above the upper critical dimension in certain parameter regimes. Personally, we would not agree that the hyperscaling relation breaks down above the upper critical dimension, but we would rather call Equation (3) a special case of a more general hyperscaling relation:
2 − α = (d/ϙ + z) ν, (7)
with the pseudo-critical exponent ϙ (“koppa”) [34]:
ϙ = 1 for d ≤ d_uc,   ϙ = d/d_uc for d > d_uc. (8)
Below the upper critical dimension, the general hyperscaling relation, therefore, relaxes to Equation (3). Above the upper critical dimension, the relation becomes
2 - α = (d_uc + z) ν,   (9)
which is independent of the dimension of the system. For the derivation of this generalised version of the hyperscaling relation for QPTs, see Section 2.3 or Ref. [34]. The derivation of the classical counterpart can be found in Ref. [15] and is reviewed in Ref. [41].

2.2. Finite-Size Scaling below the Upper Critical Dimension

Even though the singular behaviour of observables at the critical point is only present in the thermodynamic limit, it is possible to study the criticality of an infinite system by investigating its finite counterparts. In finite systems, the power-law singularities of the infinite system are rounded and shifted with respect to the critical point, e.g., the susceptibility with its characteristic divergence at the critical point λ_c is deformed to a broadened peak of finite height. The peak's position r_L = (λ_L - λ_c)/λ_c is shifted with respect to the critical point r = 0. A possible definition of a pseudo-critical point of a finite system is the peak position λ_L. As the characteristic length scale of fluctuations ξ diverges at the critical point, the finite system approaching the critical point will at some point begin to "feel" its finite extent, and the observables start to deviate from the ones in the thermodynamic limit. As ξ diverges at the critical point with the exponent ν like ξ ∼ |r|^{-ν}, the extent of the rounding in a finite system is related to the value of ν. Similarly, the peak magnitude of finite-size observables at the pseudo-critical point will depend on how strong the singularity in the thermodynamic limit is, which means it depends on the respective critical exponents α, β, γ, and δ. The shifting, rounding, and varying peak magnitude are shown for the susceptibility of the long-range transverse-field Ising model in Figure 1. This dependence of observables in finite systems on the criticality of the infinite system is the basis of finite-size scaling.
In a more mathematical sense, the relation between critical exponents and finite-size observables has its origin in the renormalisation group (RG) flow close to the corresponding RG fixed point that determines the criticality [139]. Close to this fixed point, one can linearise the RG flow so that the free energy density and the characteristic length ξ become generalised homogeneous functions (GHFs) in their parameters [124,128,129,140,141]. For a thorough discussion of the mathematical properties of GHFs, we refer to Ref. [124] and Appendix B. This means that the free energy density f and characteristic length scale ξ as functions of the couplings r , H , T , u and the inverse system length L 1 obey the relations:
f(r, H, T, L^{-1}, u) = b^{-(d+z)} f(b^{y_r} r, b^{y_H} H, b^{z} T, b L^{-1}, b^{y_u} u)   (10)
ξ(r, H, T, L^{-1}, u) = b ξ(b^{y_r} r, b^{y_H} H, b^{z} T, b L^{-1}, b^{y_u} u)   (11)
with the respective scaling dimensions y_r, y_H, z > 0, y_L = 1, and y_u < 0 governing the linearised RG flow with spatial rescaling factor b > 1 around the RG fixed point, at which all couplings vanish by definition. All of those couplings are relevant except for u, which denotes the leading irrelevant coupling [10,113]. Relevant couplings are related to real-world parameters that can be used to tune the system away from the critical point like the temperature T, a symmetry-breaking field H, or simply the control parameter r. The irrelevant coupling u does not per se vanish at the critical point like the relevant ones do. However, it flows to zero under the RG transformation and is commonly set to zero in the scaling laws:
f(r, H, T, L^{-1}) = b^{-(d+z)} f(b^{y_r} r, b^{y_H} H, b^{z} T, b L^{-1})   (12)
ξ(r, H, T, L^{-1}) = b ξ(b^{y_r} r, b^{y_H} H, b^{z} T, b L^{-1})   (13)
by assuming analyticity in these parameters. The generalised homogeneity of thermodynamic observables can be derived from the one of the free energy density f. For example, the generalised homogeneity of the magnetisation:
m(r, H, T, L^{-1}) = b^{-(d+z)+y_H} m(b^{y_r} r, b^{y_H} H, b^{z} T, b L^{-1})   (14)
can be derived by taking the derivative of f with respect to the symmetry-breaking field H.
By investigating the singularity of ξ(r) = ξ(r, H = 0, T = 0, L^{-1} = 0) in r via Equation (13), one can show that the scaling power y_r of the control parameter r is related to the critical exponent ν by y_r = 1/ν [113]. For this, one fixes the value b^{y_r} r of the first argument on the right-hand side of Equation (13) to ±1 by setting b = |r|^{-1/y_r} such that
ξ(r) = |r|^{-1/y_r} ξ(±1) ∼ |r|^{-ν}.   (15)
Analogously, further relations between the scaling powers and other critical exponents can be derived by looking at the singular behaviour of the respective observables in the corresponding parameters. Overall, one further obtains
α = [2y_r - (d+z)]/y_r,  β = [(d+z) - y_H]/y_r,  δ = y_H/[(d+z) - y_H],  γ = [2y_H - (d+z)]/y_r.   (16)
From these equations, one can already tell that the critical exponents are not independent of each other. In fact, the scaling relations 2 - α = (d + z)ν, 2 - α = 2β + γ, and γ = β(δ - 1) (see Equations (3)–(5)) can be derived from Equation (16) and y_r^{-1} = ν. By expressing the RG scaling powers y_x in terms of critical exponents, the homogeneity law for an observable O with a bulk divergence O(r, 0, 0, 0) ∼ |r|^{-ω} is given by
O(r, H, T, L^{-1}) = b^{ω y_r} O(b^{y_r} r, b^{y_H} H, b^{z} T, b L^{-1})   (17)
= b^{ω/ν} O(b^{1/ν} r, b^{(β+γ)/ν} H, b^{z} T, b L^{-1}).   (18)
In order to investigate the dependence on the linear system size L, the last argument in the homogeneity law is fixed to b L^{-1} = 1 by inserting b = L. This readily gives the finite-size scaling form
O_L(r, H, T) = L^{ω/ν} Ψ(L^{1/ν} r, L^{(β+γ)/ν} H, L^{z} T)   (19)
with Ψ being the universal scaling function of the observable O. The scaling function Ψ itself does not depend on L any longer, but in order to compare different linear system sizes, one has to rescale its arguments. To extract the critical exponents from finite systems, the observable O_L(r, H, T) is measured for different system sizes L and parameters (r, H, T) close to the critical point (r, H, T) = (0, 0, 0). The L-dependence according to Equation (19) is then fitted with the critical exponents ω, ν, β + γ, and z, as well as the critical point λ_c, which is hidden in the definition of r, as free parameters. It is advisable to fix two of the three parameters r, H, T to their critical values in order to minimise the number of free parameters in the fit. For example, with H = T = 0 and only r ≠ 0, one can extract the two critical exponents ω and ν alongside λ_c. For further details on a fitting procedure, we refer to Ref. [32]. If one knows the critical point, one can also set (r, H, T) = (0, 0, 0) and look at the L-dependent scaling O_L ∝ L^{ω/ν} directly at the critical point to extract the exponent ratio ω/ν. There are many more possible approaches to extract critical exponents from the FSS law in Equation (19) [142,143,144]. For relatively small system sizes, it might be required to take corrections to scaling into account [143,144].
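The simplest of these variants, extracting the exponent ratio ω/ν from the L-dependence directly at the critical point, can be sketched on synthetic data (the exponent ratio, prefactor, and noise level below are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic finite-size data at criticality: O_L ~ L^(omega/nu), with 1%
# multiplicative noise mimicking statistical errors (values illustrative)
ratio_true = 1.75                      # hypothetical exponent ratio omega/nu
L = np.array([16, 32, 64, 128, 256])
O_L = 0.7 * L**ratio_true * (1 + 0.01 * rng.standard_normal(L.size))

# A linear fit in the log-log representation estimates omega/nu as the slope
slope, intercept = np.polyfit(np.log(L), np.log(O_L), 1)
print(slope)
```

In practice, the fit quality over different fit windows in L and the drift of the estimate with the smallest included size indicate whether corrections to scaling are needed.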

2.3. Finite-Size Scaling above the Upper Critical Dimension

In the derivation of finite-size scaling below the upper critical dimension, it was assumed that the free energy density is an analytic function of the leading irrelevant coupling u and that, therefore, one can set u to zero. However, this is no longer the case above the upper critical dimension, where the free energy density f is singular at u = 0. Due to this singular behaviour, u is referred to as a dangerous irrelevant variable (DIV).
As a consequence, one has to take the scaling of u close to the RG fixed point into account. This is achieved by absorbing the scaling of f in u for small u into the scaling of the other variables [128]:
f(r, H, T, L^{-1}, u) = u^{p_{(d+z)}} f̄(u^{p_r} r, u^{p_H} H, u^{p_T} T, u^{p_L} L^{-1}),   (20)
up to a global power p_{(d+z)} of u. This leads to a modification of the scaling powers in the homogeneity law for the free energy density [128]:
f(r, H, T, L^{-1}) = b^{-(d+z)^*} f(b^{y_r^*} r, b^{y_H^*} H, b^{z^*} T, b^{y_L^*} L^{-1})   (21)
= L^{-(d+z)^*/y_L^*} F(L^{y_r^*/y_L^*} r, L^{y_H^*/y_L^*} H, L^{z^*/y_L^*} T)   (22)
with the modified scaling powers [34,128]:
(d+z)^* = (d+z) - p_{(d+z)} y_u,  y_r^* = y_r + p_r y_u,  y_H^* = y_H + p_H y_u,  z^* = z + p_T y_u,  y_L^* = 1 + p_L y_u.   (23)
In the classical case [128], y_L^* was commonly set to 1 by choice. This is justified because the scaling powers of a GHF are only determined up to a common non-zero factor [124]. However, for the quantum case [34], y_L^* was kept general as it has no impact on the FSS.
As the predictions from the Gaussian field theory and mean field differed for the critical exponents α, β, and δ, but not for the "correlation" critical exponents ν, z, η, and γ [145], the correlation sector was at first thought not to be affected by DIVs [128,142,145]. Later, the Q-FSS, another approach to FSS above the upper critical dimension, pioneered by Ralph Kenna and his colleagues, was developed for classical [15,19,125], as well as for quantum systems [34], which explicitly allows the correlation sector to also be affected by the DIV. In analogy to the free energy density, the homogeneity law of the characteristic length scale is then also modified to
ξ(r, H, T, L^{-1}) = b^{y_ξ^*} ξ(b^{y_r^*} r, b^{y_H^*} H, b^{z^*} T, b^{y_L^*} L^{-1})   (24)
= L^{ϙ} Ξ(L^{y_r^*/y_L^*} r, L^{y_H^*/y_L^*} H, L^{z^*/y_L^*} T)   (25)
with y_ξ^* = 1 - p_ξ y_u = y_r^*/y_r in order to reproduce the correct bulk singularity ξ ∼ |r|^{-ν}. A new pseudo-critical exponent ϙ ("koppa"):
ϙ = y_ξ^*/y_L^* = y_r^*/(y_r y_L^*) = ν y_r^*/y_L^*   (26)
is introduced. This exponent describes the scaling of the characteristic length scale with the linear system size. This non-linear scaling of ξ with L is one of the key differences to the previous treatments above the upper critical dimension in Ref. [128].
Analogous to the case below the upper critical dimension, the modified scaling powers y x * can be related to the critical exponents:
α = [2y_r^* - (d+z)^*]/y_r^*,   (27)
β = [(d+z)^* - y_H^*]/y_r^*,   (28)
δ = y_H^*/[(d+z)^* - y_H^*],   (29)
γ = [2y_H^* - (d+z)^*]/y_r^*.   (30)
By using the mean-field critical exponents for the O ( n ) quantum rotor model, one obtains restrictions for the ratios of modified scaling powers:
y_r^* = (d+z)^*/2,  y_H^* = 3(d+z)^*/4.   (31)
Furthermore, one can link the bulk scaling powers y r * , y H * , and ( d + z ) * to the scaling power y L * of the inverse linear system size [34]:
(d+z)^* = y_L^* d + (y_r^*/y_r) z,   (32)
by looking at the scaling of the susceptibility in a finite system [34,128]. This relation is crucial for deriving an FSS form above the upper critical dimension as the modified scaling power y L * , or rather, its relation to the other scaling powers determines the scaling with the linear system size L. For details on the derivation, we refer to Ref. [34]. We want to stress again that the scaling powers of GHFs are only determined up to a common non-zero factor [124]. Therefore, it is evident that one can only determine the ratios of the modified scaling powers, but not their absolute value. The absolute values are subject to choice. Different choices were discussed in Ref. [34], but these choices rather correspond to taking on different perspectives and have no impact on the FSS nor the physics.
From Equation (32) together with Equations (26) and (27), a generalised hyperscaling relation:
2 - α = (d/ϙ + z) ν,   (33)
can be derived. This also determines the pseudo-critical exponent:
ϙ = d/d_uc for d > d_uc.   (34)
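The algebra leading from the definition of ϙ in Equation (26) and the exponent relation in Equation (27) via Equation (32) to the generalised hyperscaling relation can be verified symbolically, e.g., with sympy:

```python
import sympy as sp

# Free positive symbols for the scaling powers (a derivation check only)
d, z, y_r, y_r_star, y_L_star = sp.symbols('d z y_r y_rs y_Ls', positive=True)

nu = 1 / y_r                                    # y_r = 1/nu
koppa = nu * y_r_star / y_L_star                # definition of koppa, Eq. (26)
dz_star = y_L_star * d + (y_r_star / y_r) * z   # Eq. (32)
alpha = (2 * y_r_star - dz_star) / y_r_star     # Eq. (27)

# Generalised hyperscaling: 2 - alpha = (d/koppa + z) nu
assert sp.simplify((2 - alpha) - (d / koppa + z) * nu) == 0
```

The check holds for arbitrary scaling powers, reflecting that the relation only uses their ratios.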
Finally, we can express the modified scaling powers in the FSS law for an observable O with power-law singularity O(r, 0, 0, 0) ∼ |r|^{-ω}:
O(r, H, T, L^{-1}) = b^{ω y_r^*} O(b^{y_r^*} r, b^{(β+γ) y_r^*} H, b^{z^*} T, b^{y_L^*} L^{-1})   (35)
= L^{ω ϙ/ν} Ψ(L^{ϙ/ν} r, L^{(β+γ) ϙ/ν} H, L^{z^*/y_L^*} T).   (36)
For ϙ = 1 below the upper critical dimension, Equation (36) relaxes to the standard FSS law of Equation (19). The scaling in temperature has not yet been studied for finite quantum systems above the upper critical dimension. However, in Ref. [34], it was conjectured that z^* = (y_r^*/y_r) z based on Equation (32), which is also in agreement with z being the dynamical critical exponent that determines the space–time anisotropy ξ_τ ∼ ξ^{z}, as we will shortly see. This means that the finite-size gap scales as Δ_L ∼ L^{-z^*/y_L^*} = L^{-ϙz} with the system size [34]. Of particular interest is the scaling of the characteristic length scale above the upper critical dimension, for which the modified scaling law of Equation (36) also holds with ω = ν. Hence, the characteristic length scale as a function of the control parameter r scales like
ξ_L(r) = L^{ϙ} Ξ(L^{ϙ/ν} r)   (37)
with the scaling function Ξ. Directly at the critical point r = 0, this leads to ξ_L ∝ L^{ϙ}. Comparing this with the scaling of the inverse finite-size gap Δ_L^{-1} ∼ ξ_{L,τ} ∼ ξ_L^{z} verifies that z still determines the space–time anisotropy. Prior to the Q-FSS [15,17], the characteristic length scale was thought to be bounded by the linear system size L [128]. However, this was shown not to be the case by measurements of the characteristic length scale for the classical five-dimensional Ising model [14] and for long-range transverse-field Ising models [34].
For the latter, the data collapse of the correlation length according to Equation (37) is shown in Figure 2 as an example.
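The mechanics of such a data collapse can be illustrated on synthetic data generated from the scaling form itself (the exponents and scaling function below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Illustrative values only: nu = 1/2 (mean field) and a koppa > 1 above d_uc
nu, koppa = 0.5, 1.5

def xi_L(r, L):
    """Synthetic correlation length obeying the Q-FSS form of Equation (37)."""
    Xi = lambda x: np.exp(-x**2)   # hypothetical scaling function
    return L**koppa * Xi(L**(koppa / nu) * r)

# Rescaling xi_L by L^(-koppa) and r by L^(koppa/nu) collapses all sizes
x = np.linspace(-2.0, 2.0, 9)      # values of the scaling variable
curves = [xi_L(x / L**(koppa / nu), L) / L**koppa for L in (8, 16, 32)]
assert all(np.allclose(c, curves[0]) for c in curves)
```

For real data, ϙ and ν are instead treated as fit parameters that optimise the quality of the collapse.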

3. Monte Carlo Integration

In this section, we provide a brief introduction to Monte Carlo integration (MCI). We focus on the aspects of Markov chain MCI as the basis to formulate the white graph Monte Carlo embedding scheme of the pCUT+MC method in Section 4 and the stochastic series expansion (SSE) quantum Monte Carlo (QMC) algorithm in Section 5 in a self-contained fashion. MCI is the foundation for countless numerical applications that require integration over high-dimensional integration spaces. As this review has a focus on "Monte Carlo-based techniques for quantum magnets with long-range interactions", we refer readers with a deeper interest in the fundamental aspects of MCI and Markov chains to Refs. [146,147,148].
MCI summarises numerical techniques to find estimators for integrals of functions f : C → ℝ over an integration space C using random numbers. The underlying idea behind MCI is to estimate the integral, or the sum in the case of discrete variables, of the function f over the configuration space by an expectation value:
I = ∫_C dω f(ω) = ∫_C dω P(ω) [f(ω)/P(ω)] = ∫_C dω P(ω) f̃(ω) = lim_{S→∞} (1/S) Σ_{i=1}^{S} f̃(ω_i)   (38)
with ω_i ∈ C sampled according to a probability density function (PDF) P : C → ℝ_{≥0} and the function f̃(ω) = f(ω)/P(ω) reweighted by the PDF. A famous direct application of this idea is the calculation of the number π, which is discussed in great detail in Ref. [148].
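A minimal sketch of this textbook example, sampling the unit square uniformly and counting hits inside the quarter circle of area π/4:

```python
import random

random.seed(1)
S = 100_000
# Uniform samples in the unit square; a point lies in the quarter circle
# with probability pi/4, so the hit fraction estimates pi/4
hits = sum(1 for _ in range(S)
           if random.random()**2 + random.random()**2 < 1.0)
pi_estimate = 4 * hits / S
print(pi_estimate)  # converges to pi as S grows
```

Here the PDF is uniform, so no reweighting is necessary; the statistical error decreases as 1/√S.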
In this review, MCI is used for the embedding of white graphs on a lattice to evaluate high orders of a perturbative series expansion or to calculate thermodynamic observables using the SSE framework. In both cases, non-normalised relative weights π(ω) within a configuration space C arise, which define the PDF P:
P(ω) = π(ω) / ∫_C dω' π(ω'),   (39)
whose normalisation is oftentimes not directly accessible. In the context of statistical physics, π(ω) is often chosen to be the relative Boltzmann weight e^{-βE(ω)} of each configuration ω. While this relative Boltzmann weight is accessible as long as E(ω) is known, the full partition function needed to normalise the weights is, in general, not.
In order to efficiently sample the configuration space C according to the relative weights, the methods in this review use a Markovian random walk to generate {ω_1, …, ω_m}. Let ω_n be the random state of the random walk at a discrete step n. The state ω_{n+1} at the next step is randomly determined according to the conditional probabilities T(ω_n → ω_{n+1}) (transition probabilities). These transition probabilities are normalised by
∫_C dω' T(ω → ω') = 1.   (40)
Markovian random walks obey the Markov property, which means the random walk is memory-free and the transition probability for multiple steps factorises into a product over all time steps:
T(ω^{(0)} → ω^{(1)} → ⋯ → ω^{(m-1)} → ω^{(m)}) = ∏_{i=0}^{m-1} T(ω^{(i)} → ω^{(i+1)})   (41)
with ω^{(0)} the start configuration. We require the Markovian random walk to fulfil the following conditions: First, the random walk should have the PDF P(ω) defined by the weights π(ω) in Equation (39) as a stationary distribution. By definition, P(ω) is a stationary distribution of the Markov chain if it satisfies the global balance condition:
∫_C dω P(ω) T(ω → ω') = P(ω').   (42)
Second, we require the random walk to be irreducible, which means that the transition graph must be connected and every configuration ω' ∈ C can be reached from any configuration ω ∈ C in a finite number of steps. This property is necessary for the uniqueness of the stationary distribution [147]. Lastly, we require the random walk to be aperiodic (see Ref. [147] for a rigorous definition). Together with the irreducibility condition, this ensures convergence to the stationary distribution [147].
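For a finite configuration space, these notions are easily checked numerically; a sketch for a hypothetical three-state chain whose transition matrix is row stochastic, irreducible, and aperiodic:

```python
import numpy as np

# A small irreducible and aperiodic Markov chain on three states;
# entry T[i, j] is the transition probability from state i to state j
T = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
assert np.allclose(T.sum(axis=1), 1.0)   # normalisation of each row

# Iterating the chain from an arbitrary start distribution converges to
# the unique stationary distribution P, which satisfies global balance,
# Equation (42): P T = P
P = np.array([1.0, 0.0, 0.0])
for _ in range(500):
    P = P @ T
print(P)                                  # stationary distribution
assert np.allclose(P @ T, P)
```

Starting from any other initial distribution yields the same fixed point, illustrating uniqueness.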
There are several possibilities to design a Markov chain with a desired stationary distribution [146,147,148,149,150,151]. Commonly, the Markov chain is constructed to be reversible. This means that it satisfies the detailed balance condition:
P(ω) T(ω → ω') = P(ω') T(ω' → ω),   (43)
which is a stronger condition for the stationarity of P than the global balance condition in Equation (42). One popular choice for the transition probabilities T(ω → ω') that satisfies the detailed balance condition is given by the Metropolis–Hastings algorithm [149,150], on which most applications of MCI reviewed in this work are based. In this approach, the transition probabilities T(ω → ω') are decomposed into proposition probabilities T̃(ω → ω') and acceptance probabilities p_acc(ω → ω') as follows:
T(ω → ω') = T̃(ω → ω') p_acc(ω → ω').   (44)
The proposition probabilities T̃(ω → ω') can be those of any random walk satisfying the irreducibility and aperiodicity conditions. By inserting the decomposition of the transition probabilities in Equation (44) into the detailed balance condition in Equation (43), one obtains for the acceptance probabilities:
p_acc(ω → ω') / p_acc(ω' → ω) = [P(ω') T̃(ω' → ω)] / [P(ω) T̃(ω → ω')] = [π(ω') T̃(ω' → ω)] / [π(ω) T̃(ω → ω')],   (45)
where, in the last step, it was used that the unknown normalisation factors (see Equation (39)) of the PDF cancel. The condition in Equation (45) is fulfilled by the Metropolis–Hastings acceptance probabilities [150]:
p_acc(ω → ω') = min{1, [π(ω') T̃(ω' → ω)] / [π(ω) T̃(ω → ω')]}.   (46)
For the special case of symmetric proposition probabilities, T̃(ω → ω') = T̃(ω' → ω), Equation (46) reduces to the Metropolis acceptance probabilities:
p_acc(ω → ω') = min{1, π(ω')/π(ω)}.   (47)
As an example, consider a classical thermodynamic system with Boltzmann weights e^{-βE(ω)} given by the energies of the configurations and the inverse temperature β; this provides an intuitive interpretation of the Metropolis acceptance probabilities in Equation (47). A proposed move from a configuration ω to a configuration ω' with a smaller energy E(ω') < E(ω) is always accepted, independent of the temperature. On the other hand, a proposed move to a configuration ω' with a larger energy than ω is only accepted with a probability depending on the ratio of the Boltzmann weights. At higher temperatures, it is more likely to move to states with a larger energy. This reflects the physics of the system in the algorithm: the sampling focuses on the low-energy states at low temperatures and approaches the maximum-entropy state at high temperatures.
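This interpretation can be illustrated by sampling a hypothetical two-level system with the Metropolis acceptance probabilities of Equation (47):

```python
import math
import random

random.seed(2)
beta, dE = 1.0, 1.0          # inverse temperature and energy splitting
E = [0.0, dE]                # hypothetical two-level system

state, counts = 0, [0, 0]
for _ in range(200_000):
    proposal = 1 - state     # symmetric proposition: suggest the other state
    # Metropolis acceptance of Eq. (47) with pi(w) = exp(-beta E(w));
    # moves lowering the energy are always accepted
    if random.random() < min(1.0, math.exp(-beta * (E[proposal] - E[state]))):
        state = proposal
    counts[state] += 1

ratio = counts[1] / counts[0]
print(ratio)                 # approaches the Boltzmann ratio exp(-beta dE)
```

Increasing β shifts the sampled occupations towards the low-energy state, while β → 0 yields equal occupations, in line with the discussion above.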

4. Series-Expansion Monte Carlo Embedding

In this section, we provide a self-contained and comprehensive overview of linked-cluster expansions for long-range interacting systems [25,29,30,31,34,35] using white graphs [152] in combination with Monte Carlo integration for the graph embedding. First, we introduce linked-cluster expansions (LCEs) and discuss perturbative continuous unitary transformations (pCUTs) [153,154] as a suitable high-order series expansion method. We then establish an adequate formalism for setting up LCEs and discuss the calculation of suitable physical quantities in practice. With the help of white graphs, we can employ LCEs for models with long-range interactions and use the resulting contributions in a Monte Carlo algorithm to deal with the embedding problem posed by long-range interactions. We dub this approach pCUT+MC.

4.1. Motivation and Basic Concepts

The goal of all our efforts is to calculate physical observables in the thermodynamic limit, i.e., for an infinite lattice L , using high-order series expansions. The starting point of every perturbative problem is
H = H_0 + λ V,   (48)
where the Hamiltonian H describing the full physical problem can be split into an unperturbed part H_0, which is readily diagonalisable, and a perturbation V associated with a perturbation parameter λ that is small compared to the energy scales of H_0. We aim to obtain a power series up to a maximal reachable order o_max as an approximation of a desired physical quantity:
f(λ) ≈ p_0 + p_1 λ + p_2 λ^2 + ⋯ + p_{o_max} λ^{o_max},   (49)
where the coefficients p_i are to be determined by the series expansion. We want to use the information contained in the power series to infer the properties of the approximated function f(λ) [155]. Determining the coefficients comes at the cost of an exponential growth in complexity with increasing order [155]. Hence, calculations are performed with the help of a computer programme. Obviously, the computer cannot deal with an infinitely large lattice. Instead, we must look at finite cut-outs consisting of a finite set of lattice sites that are connected by bonds (or links) symbolising the interactions of the Hamiltonian on the lattice. We term these cut-outs clusters. If two clusters A and B do not share a common site or, equivalently, do not have a link that connects any site of A and B with each other (A ∩ B = ∅), then the cluster C = A ∪ B is called a disconnected cluster. Otherwise, if no such partition into disconnected clusters A and B exists (A ∩ B ≠ ∅), the cluster C is called connected. We can define quantum operators M (e.g., a physical observable) on these clusters just as on the infinite lattice.
There are essentially two ways of performing high-order series expansions. The first one is the naive approach of taking a single finite cluster C ⊂ L [153,154,156,157,158] and designing it such that the contribution M(C) coincides with the contribution on the infinite lattice, M(L) = M(C), up to the considered order in the perturbation parameter. The cluster needs to be chosen large enough such that the perturbative calculations up to the considered order are not affected by the boundaries of the cluster. Another way of performing calculations is to construct the operator contribution on a cluster—coinciding with the infinite-lattice contribution up to a given order—by decomposing it into all possible contributions on smaller clusters [155,159,160,161,162,163,164,165,166,167,168,169,170]. Now, the contributions on many, but smaller, clusters must be determined and added up to obtain the contribution on the infinite lattice:
M(L) = M(C) = Σ_{C' ⊆ C} M(C').   (50)
In contrast to the previous approach, we willingly accept boundary effects for the many subclusters. Such a cluster decomposition is known to be computationally more efficient: it suffices to calculate the contributions on the Hilbert spaces of the smaller clusters, reducing the overhead of non-contributing processes, and only a few clusters need to be evaluated explicitly since many give identical contributions due to symmetries. This can significantly reduce the computational effort.
However, there are subtleties regarding the validity of performing calculations on finite clusters. For example, when setting up a linked-cluster expansion ("linked-cluster" meaning that only connected clusters contribute), the operator M must satisfy a certain property, namely cluster additivity. The quantity M is called cluster additive if and only if its contribution on a disconnected cluster A ∪ B comes solely from the contributions on its constituent connected clusters A and B. This means we can simply add the contributions of A and B from the smaller connected clusters to obtain the one for the disconnected cluster, i.e.,
M(A ∪ B) = M(A) + M(B),   (51)
as illustrated schematically in Figure 3.
We can also understand cluster additivity in the language of perturbation theory where acting on bonds in every order forms a cluster of perturbatively active bonds. If such a cluster is connected, we call these processes linked. So, cluster additivity simultaneously means that only linked processes will contribute. Cluster additivity is at the heart of the linked-cluster theorem, which states that only linked processes will contribute to the overall contribution in the thermodynamic limit. To set up a linked-cluster expansion, we want to exploit cluster additivity and the linked-cluster theorem so that we can “simply” add up the contributions from individual connected clusters to obtain the desired power series in the thermodynamic limit.
An example of a cluster-additive quantity is the ground-state energy E_0. Imagine we want to calculate the ground-state energy of a non-degenerate ground-state subspace; then, cluster additivity is naturally fulfilled:
E_0(A ∪ B) = E_0(A) + E_0(B),   (52)
and we can calculate the ground-state energy on A ∪ B from its individual parts A and B. However, cluster additivity is not satisfied in general. We can construct a counterexample by considering the first excited state with energy E_1. For example, consider the first excitation above the ferromagnetic ground state of the transverse-field Ising model in the low-field limit, which is a single spin flip dressed with quantum fluctuations induced by the transverse field [113]. We usually refer to such excitations as quasiparticles (qp). Here, we cannot add the contributions on clusters A and B to obtain the excitation energy on cluster A ∪ B:
E_1(A ∪ B) ≠ E_1(A) + E_1(B).   (53)
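Both statements are easily verified by exact diagonalisation of small clusters; a sketch for two disconnected two-site transverse-field Ising clusters (couplings chosen for illustration only):

```python
import numpy as np

def tfim_chain(n, h, J=1.0):
    """Dense TFIM Hamiltonian H = -J sum sz.sz - h sum sx on an open chain."""
    sx = np.array([[0.0, 1.0], [1.0, 0.0]])
    sz = np.array([[1.0, 0.0], [0.0, -1.0]])
    I = np.eye(2)
    def chain_op(ops):                      # Kronecker product over all sites
        out = np.array([[1.0]])
        for o in ops:
            out = np.kron(out, o)
        return out
    H = np.zeros((2**n, 2**n))
    for i in range(n - 1):                  # Ising bonds
        H -= J * chain_op([sz if j in (i, i + 1) else I for j in range(n)])
    for i in range(n):                      # transverse field
        H -= h * chain_op([sx if j == i else I for j in range(n)])
    return H

# Two identical, disconnected two-site clusters A and B
H_A = tfim_chain(2, h=0.5)
E_A = np.linalg.eigvalsh(H_A)
H_AB = np.kron(H_A, np.eye(4)) + np.kron(np.eye(4), H_A)  # H(A u B)
E_AB = np.linalg.eigvalsh(H_AB)

assert np.isclose(E_AB[0], 2 * E_A[0])           # E0 is cluster additive
assert np.isclose(E_AB[1], E_A[0] + E_A[1])      # excitation lives on one part
assert not np.isclose(E_AB[1], E_A[1] + E_A[1])  # E1 is not cluster additive
```

The second assertion makes the failure explicit: the lowest excitation of A ∪ B is one excitation on one part plus the ground-state energy of the other, which is exactly what motivates subtracting the ground-state energy to restore additivity.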
How to set up a linked-cluster expansion for intensive properties is not obvious, and it seemed out of reach after the introduction of linked-cluster expansions in the 1980s [159,160,161]. Only several years later, it was noticed by Gelfand [162] that additivity can be restored for excited states when properly subtracting the ground-state energy. This approach was later generalised to multiparticle excitations [154,156,164,165] and observables [154,163].
In the following, we first introduce a perturbation theory method that maps the original problem in Equation (48) to an effective one. We will show that the derived effective Hamiltonian and observables satisfy cluster additivity. In the subsequent section, we make use of the property and show how we can set up a linked-cluster expansion for energies of excited states and observables by properly subtracting contributions from lower energy states.

4.2. Perturbation Method: Perturbative Continuous Unitary Transformations

The first step towards setting up a linked-cluster expansion is to find a perturbation method that satisfies cluster additivity, which is generically not given and a non-trivial task [171]. Here, we use perturbative continuous unitary transformations (pCUTs) [153,154] that transform the original Hamiltonian perturbatively order by order into a quasiparticle-conserving Hamiltonian, reducing the original many-body problem to an effective few-body problem. We start discussing how to solve the flow equation to obtain the pCUT method and show afterwards how the Hamiltonian decomposes into additive parts that can be used for a linked-cluster expansion.
We strive to solve the usual problem of perturbation theory of Equation (48). The unperturbed part H 0 can be easily diagonalised exactly with a spectrum that has to be equidistant and bounded from below. Additionally, the perturbation V must be a sum of operators T n :
V = Σ_{n=-N}^{N} T_n,   (54)
containing all processes changing the energy by n quanta and—if properly rescaled—corresponding to a change in the quasiparticle number by n. The goal of the pCUT method is to find an optimal basis in which the many-body problem of the original Hamiltonian reduces to an effective few-body problem. For that, we introduce a unitary transformation depending on a continuous flow parameter ℓ and define
H(ℓ) = U^†(ℓ) H U(ℓ).   (55)
In the limiting case ℓ = 0, we require H(0) = H to recover the original Hamiltonian, and for ℓ → ∞, we require lim_{ℓ→∞} H(ℓ) = H_eff so that the unitary transformation maps the original to the desired effective Hamiltonian. We can rewrite the unitary transformation as
U(ℓ) = T_ℓ exp(-∫_0^ℓ η(ℓ') dℓ'),   (56)
where η(ℓ) is the anti-Hermitian generator of the unitary transformation and T_ℓ the ordering operator for the flow parameter. Taking the derivatives of Equations (55) and (56) with respect to ℓ, we eventually arrive at the flow equation:
dH(ℓ)/dℓ = [η(ℓ), H(ℓ)].   (57)
Flow equations have been studied for quite some time in mathematics and physics with a variety of applications [172,173,174,175,176,177,178,179,180]. It was Knetter and Uhrig [153] who proposed a perturbative ansatz for the generator of continuous unitary transformations along the lines of Mielke [180], introducing the quasiparticle generator (also known as the “MKU generator”) for the pCUT method:
η_qp(ℓ)|_{i,j} := sgn(H_0|_{i,i} - H_0|_{j,j}) H(ℓ)|_{i,j},   (58)
where the indices i, j refer to blocks of the Hamiltonian labelling the quasiparticle number. Diagonal blocks H|_{i,i} contain all processes conserving the number of quasiparticles i, while off-diagonal blocks H|_{i,j} contain all processes changing the quasiparticle number from j to i. The reasoning behind this ansatz can be understood by looking at the sign function sgn(H_0|_{i,i} - H_0|_{j,j}): processes in H|_{i,j} are assigned the opposite sign of the inverse processes in H|_{j,i}. The idea is, therefore, to "rotate away" the off-diagonal blocks by the unitary transformation during the flow in ℓ, while processes that do not change the quasiparticle number are not transformed away due to sgn(0) = 0, but only get renormalised during the flow. Consequently, in the limit ℓ → ∞, we obtain an effective Hamiltonian H_eff that is block-diagonal in the quasiparticle number n. This idea is depicted in Figure 4.
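The effect of the quasiparticle generator can be illustrated by integrating the flow Equation (57) numerically for a toy two-level problem (a forward-Euler sketch with illustrative numbers; the pCUT method itself proceeds perturbatively rather than numerically):

```python
import numpy as np

# Toy two-block problem: H0 = diag(0, 1) labels the qp = 0 and qp = 1
# blocks, and a perturbation couples them (numbers for illustration only)
H0 = np.diag([0.0, 1.0])
H = H0 + 0.3 * np.array([[0.0, 1.0], [1.0, 0.0]])
exact = np.linalg.eigvalsh(H)

# sgn(H0|ii - H0|jj) entering the quasiparticle generator of Eq. (58)
sign = np.sign(np.diag(H0)[:, None] - np.diag(H0)[None, :])

dl = 1e-3
for _ in range(20_000):          # forward-Euler integration of Eq. (57)
    eta = sign * H               # eta_qp(l)|ij = sgn(...) H(l)|ij
    H = H + dl * (eta @ H - H @ eta)

print(H[0, 1])                   # the off-diagonal block is rotated away
print(np.diag(H), exact)         # the diagonal approaches the exact spectrum
```

The off-diagonal coupling decays exponentially in ℓ while the spectrum is (up to the Euler discretisation error) preserved, which is the mechanism behind the block diagonalisation sketched in Figure 4.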
Next, we make a perturbative ansatz for the Hamiltonian during the flow:
H(ℓ) = H_0 + Σ_{k=1}^{∞} Σ_{Σ_j n_j = k} λ_1^{n_1} ⋯ λ_{N_λ}^{n_{N_λ}} Σ_{dim(m) = k} F(ℓ; m) T(m),   (59)
with the notation
m = (m_1, m_2, m_3, …, m_k),   (60)
m_i ∈ {0, ±1, ±2, …, ±N},   (61)
dim(m) = k,   (62)
T(m) = T_{m_1} T_{m_2} T_{m_3} ⋯ T_{m_k},   (63)
with F(ℓ; m) being undetermined real functions. We introduce N_λ distinct expansion parameters instead of just a single λ to keep the notation as general as possible, because in Section 4.6.3, about white graphs, we will need multiple expansion parameters to encode additional information. Inserting Equations (58) and (59) into the flow Equation (57), we can solve the equation perturbatively order by order, as we obtain a recursive set of differential equations for F(ℓ; m).
To recover the original Hamiltonian H, we have to demand the correct initial conditions F(0; m) = 1 for |m| = 1 and F(0; m) = 0 for |m| > 1. We can solve the differential equations (cf. Ref. [153]) exactly in the limit ℓ → ∞, yielding
H_eff = H_0 + Σ_{k=1}^{∞} Σ_{Σ_j n_j = k} λ_1^{n_1} ⋯ λ_{N_λ}^{n_{N_λ}} Σ_{dim(m) = k, M(m) = 0} C(m) T(m),   (64)
with F(ℓ → ∞; m) = C(m) ∈ ℚ being exact rational coefficients and the restriction M(m) := Σ_{i=1}^{k} m_i = 0 making the products T(m) quasiparticle-conserving [153]. Hence, the effective Hamiltonian commutes with the unperturbed diagonal part of the original Hamiltonian ([H_eff, H_0] = 0). Note that, so far, the effective Hamiltonian (64) is model-independent: it only depends on the overall structure of Equation (54). The generic form of the Hamiltonian comes at the cost of an additional normal ordering, usually performed by applying the Hamiltonian to a cluster. Of course, it could also be performed explicitly using the hard-core bosonic commutation relations, but the former approach is much easier to handle in a computer programme. Yet, we achieved our goal of obtaining a block-diagonal Hamiltonian:
$$H_{\mathrm{eff}} = \sum_{n=0}^{\infty} H_{\mathrm{eff}}^{n},$$
where $H_{\mathrm{eff}}^{n}$ is the effective irreducible Hamiltonian of $n$-quasiparticle processes (see also Section 4.3). Let us emphasise again that this block-diagonal structure allows us to solve the $n$-quasiparticle blocks individually, which significantly reduces the complexity of the original many-body problem by mapping it to an effective one that is block-diagonal in the quasiparticle number $n$.
If we want to calculate an effective observable, we can make an ansatz along the same lines [154]. We insert the perturbative ansatz:
$$\mathcal{O}(\ell) = \sum_{k=1}^{\infty} \sum_{\sum_j n_j = k} \lambda_1^{n_1} \cdots \lambda_{N_\lambda}^{n_{N_\lambda}} \sum_{i=1}^{k+1} \sum_{\dim(\boldsymbol{m}) = k} G(\ell; \boldsymbol{m}; i)\, \mathcal{O}(\boldsymbol{m}; i) \tag{66}$$
with undetermined functions $G(\ell; \boldsymbol{m}; i)$. The operator product is defined as
$$\mathcal{O}(\boldsymbol{m}; i) = T_{m_1} \cdots T_{m_{i-1}}\, \mathcal{O}\, T_{m_i} \cdots T_{m_k}.$$
Inserting exactly the same generator (58) and the ansatz for the observable in Equation (66) instead of the Hamiltonian into the flow Equation (57), we arrive at
$$\mathcal{O}_{\mathrm{eff}} = \sum_{k=1}^{\infty} \sum_{\sum_j n_j = k} \lambda_1^{n_1} \cdots \lambda_{N_\lambda}^{n_{N_\lambda}} \sum_{i=1}^{k+1} \sum_{\dim(\boldsymbol{m}) = k} \tilde{C}(\boldsymbol{m}; i)\, \mathcal{O}(\boldsymbol{m}; i) \tag{68}$$
with $\tilde{C}(\boldsymbol{m}; i) = G(\infty; \boldsymbol{m}; i) \in \mathbb{Q}$, obtained by solving the resulting set of differential equations in the limit $\ell \to \infty$ [154]. Note that the last sum does not contain a restriction $M(\boldsymbol{m}) = 0$, and therefore—in contrast to the effective Hamiltonian—effective observables are not (necessarily) quasiparticle-conserving.
We have just derived the effective form of the Hamiltonian and observables in the pCUT method, which is very generic and depends only on the structure of the perturbation in Equation (54). As already stated, the model dependence of our approach comes into play when performing a linked-cluster expansion by applying the effective Hamiltonian or observable to finite clusters. But how do we know that the effective quantities are cluster-additive? We follow the argument of Refs. [181,182] by looking at the original Hamiltonian (48), which trivially satisfies cluster additivity as long as all bonds represent a non-vanishing term in $V$ between sites (“bond equals interaction”). Thus, the Hamiltonian on a disconnected cluster $A \cup B$:
$$H|_{A \cup B} = H|_{A} + H|_{B}$$
is cluster-additive because $H|_{A}$ and $H|_{B}$ are non-interacting. Here, we denote the restriction of the Hamiltonian $H$ to a cluster $C$ as $H|_{C}$. We can further insert this property into the flow equation:
$$\frac{\mathrm{d} H(\ell)|_{A \cup B}}{\mathrm{d}\ell} = \big[ \eta(\ell)|_{A \cup B},\, H(\ell)|_{A \cup B} \big] \;\;\Longrightarrow\;\; \frac{\mathrm{d} H(\ell)|_{A}}{\mathrm{d}\ell} + \frac{\mathrm{d} H(\ell)|_{B}}{\mathrm{d}\ell} = \big[ \eta(\ell)|_{A} + \eta(\ell)|_{B},\, H(\ell)|_{A} + H(\ell)|_{B} \big] = \big[ \eta(\ell)|_{A}, H(\ell)|_{A} \big] + \big[ \eta(\ell)|_{B}, H(\ell)|_{B} \big].$$
Here, we used the fact that operators acting on disconnected clusters commute and that the Hamiltonian is continuously transformed during the flow starting from $\ell = 0$. Therefore, the derivative and the commutator can be split up, acting on each cluster individually and preserving cluster additivity during the flow. Consequently, the effective Hamiltonian:
$$H_{\mathrm{eff}}|_{A \cup B} = H_{\mathrm{eff}}|_{A} + H_{\mathrm{eff}}|_{B}$$
in the limit $\ell \to \infty$ is cluster-additive as well. The same proof holds for effective observables. Another, more physical argument is that the effective pCUT Hamiltonian can be written as a sum of nested commutators of $T$-operators [152,167]. For instance, considering the perturbation $V = T_{-1} + T_0 + T_{+1}$, the effective Hamiltonian reads
$$H_{\mathrm{eff}} = H_0 + \lambda T_0 + \lambda^2 \left[ T_{+1}, T_{-1} \right] + \frac{\lambda^3}{2} \left( \big[ [T_{+1}, T_0], T_{-1} \big] + \big[ T_{+1}, [T_0, T_{-1}] \big] \right) + \ldots
$$
Splitting up $T_n = \sum_l \tau_{n,l}$ into local operators acting on bonds $l$, the nested commutators vanish for processes that are not linked. Hence, the linked-cluster theorem is fulfilled, and the effective Hamiltonian is cluster-additive. To emphasise the linked-cluster property, the generic effective Hamiltonian is often written as
$$H_{\mathrm{eff}} = H_0 + \sum_{k=1}^{\infty} \sum_{\sum_j n_j = k} \lambda_1^{n_1} \cdots \lambda_{N_\lambda}^{n_{N_\lambda}} \sum_{\substack{\dim(\boldsymbol{m}) = k\\ M(\boldsymbol{m}) = 0}} \sum_{\substack{C\\ |E_C| \le k}} C(\boldsymbol{m}) \sum_{\substack{l_1, \ldots, l_k\\ \bigcup_{i=1}^{k} l_i = C}} \tau_{m_1, l_1} \cdots \tau_{m_k, l_k},$$
where the sum over $C$ runs over all possible connected clusters with at most $k$ bonds ($|E_C| \le k$) [152]. The notation $E_C$ will be clarified in Section 4.6.2, where graphs are formally introduced. In this context, it is simply the set of bonds of a connected cluster $C$, and $|E_C|$ is the number of bonds in this set. The condition $\bigcup_{i=1}^{k} l_i = C$ arising from the linked-cluster theorem ensures that the cluster consisting of active links and sites during a process must match the bonds and sites of the connected cluster $C$. For observables, the generalised condition $\bigcup_{i=1}^{k} l_i \cup x = C$ holds, where the index $x$ can either refer to a site (local observable) or a link (non-local observable), and we have
$$\mathcal{O}_{\mathrm{eff}} = \sum_{k=1}^{\infty} \sum_{\sum_j n_j = k} \lambda_1^{n_1} \cdots \lambda_{N_\lambda}^{n_{N_\lambda}} \sum_{i=1}^{k+1} \sum_{\dim(\boldsymbol{m}) = k} \sum_{\substack{C\\ |E_C| \le k}} \tilde{C}(\boldsymbol{m}; i) \sum_{\substack{l_1, \ldots, l_k\\ \bigcup_{j=1}^{k} l_j \,\cup\, x = C}} \tau_{m_1, l_1} \cdots \tau_{m_{i-1}, l_{i-1}}\, \mathcal{O}_x\, \tau_{m_i, l_i} \cdots \tau_{m_k, l_k}.$$
Although we showed that the effective Hamiltonian and observables are cluster-additive and, therefore, fulfil the linked-cluster theorem, important subtleties remain when we restrict them to the quasiparticle basis. We need to address these before we can discuss how to perform the calculations in practice.

4.3. Unravelling Cluster Additivity

In this subsection, we clarify how the cluster-additive property of the effective pCUT Hamiltonian and observables can be used to set up a linked-cluster expansion not only for the ground-state energy, but also for the energies of excited states. Many aspects of this section are based on the original work of Ref. [154], in which a general formalism was developed for deriving suitable quantities for the calculation of multiparticle excitations and observables. We further develop this formalism by transferring the concept of cluster additivity to the quasiparticle basis, introducing the notion of particle additivity. The term “additivity” in this context was recently introduced by Ref. [171].
We start by recalling that the effective Hamiltonian is block-diagonal, and we can write the Hamiltonian operator as a sum of irreducible operators of n quasiparticle processes:
$$H_{\mathrm{eff}} = \sum_{n=0}^{\infty} H_{\mathrm{eff}}^{n}. \tag{75}$$
We can express the $n$-quasiparticle processes in second quantisation in terms of local (hard-core) bosonic operators $b_i^{\dagger}$ creating and $b_i$ annihilating a quasiparticle at site $i$. When considering quantum magnets like we do in this review, a hard-core repulsion comes into play allowing only a single quasiparticle at a given site [154]. For instance, in the ferromagnetic ground state of the 2D Ising model, an elementary excitation is given by a single spin flip, which can be interpreted as a quasiparticle excitation [113]. Obviously, at most one excitation is allowed on the same site. Different particle flavours $\tau$ can also be accounted for by incorporating an additional index $\tau$ in the operators $b_{i,\tau}^{(\dagger)}$. To keep the notation simple, we will drop this additional index in the following. The irreducible operators in second quantisation and normal-ordered form then read
$$H_{\mathrm{eff}}^{0} = \epsilon_0 \mathbb{1}, \qquad H_{\mathrm{eff}}^{1} = \sum_{i,j} t_{i;j}\, b_j^{\dagger} b_i, \qquad H_{\mathrm{eff}}^{2} = \sum_{\substack{i_1, i_2\\ j_1, j_2}} t_{i_1, i_2; j_1, j_2}\, b_{j_2}^{\dagger} b_{j_1}^{\dagger} b_{i_2} b_{i_1}, \qquad \ldots, \qquad H_{\mathrm{eff}}^{n} = \sum_{\substack{i_1, \ldots, i_n\\ j_1, \ldots, j_n}} t_{i_1, \ldots, i_n; j_1, \ldots, j_n}\, b_{j_n}^{\dagger} \cdots b_{j_1}^{\dagger} b_{i_n} \cdots b_{i_1}. \tag{76}$$
Written in normal order, the meaning of these processes is directly clear when acting on states in the quasiparticle basis. The prefactors $t_{i_1, \ldots, i_n; j_1, \ldots, j_n}$ are to be determined by applying the effective pCUT Hamiltonian (64) to an appropriately designed cluster $C$. Let us consider the quasiparticle basis on a connected cluster $C$:
$$\{ |0\rangle_C, |1\rangle_C, |2\rangle_C, \ldots, |n\rangle_C \} = \{ |0\rangle_C, |1; i\rangle_C, |2; i_1, i_2\rangle_C, \ldots, |n; i_1, i_2, \ldots, i_n\rangle_C \},$$
where the number $n$ specifies the number of particle excitations and the indices $i_j$ denote the positions of the $n$ (local) excitations. The effective pCUT Hamiltonian is quasiparticle-conserving, so we can restrict it to $N$-particle states, for instance when evaluating its matrix elements in this basis. If we act on a state with fewer particles than are involved in the process, the irreducible operator annihilates more particles than there are, and the contribution is zero, that is, $H_{\mathrm{eff}}^{n}|_N = 0$ for $N < n$. Determining the action of $H_{\mathrm{eff}}^{n}|_n$ for $N = n$ yields all prefactors $t_{i_1, \ldots, i_n; j_1, \ldots, j_n}$, which define the action of $H_{\mathrm{eff}}^{n}$ on the entire unrestricted Hilbert space: the second quantisation presents a natural generalisation of the Hamiltonian restricted to a finite number of particles to an arbitrary number of particles [154]. In particular, we can construct $H_{\mathrm{eff}}^{n}|_N$ for $N > n$ from $H_{\mathrm{eff}}^{n}|_n$, since the latter completely defines the action of $H_{\mathrm{eff}}^{n}$ on the entire Hilbert space.
Although everything seems fine so far to set up a linked-cluster expansion, it is not. We finished the motivation in Section 4.1 with the statement that we cannot simply add up contributions for energies of excited states (cf. Equation (53)). The reason is that, although we showed that $H_{\mathrm{eff}}$ is cluster-additive, the irreducible operators restricted to the $N$-particle basis $H_{\mathrm{eff}}^{n}|_N$ are in fact not. To gain a better understanding of the abstract concept of cluster additivity and why setting up a linked-cluster expansion for higher particle channels usually fails, let us consider the following basis on a disconnected cluster $A \cup B$:
$$\{ |0\rangle_{A \cup B}, |1\rangle_{A \cup B}, |2\rangle_{A \cup B}, \ldots \} = \{ |0\rangle_A |0\rangle_B,\; |0\rangle_A |1\rangle_B,\; |1\rangle_A |0\rangle_B,\; |0\rangle_A |2\rangle_B,\; |1\rangle_A |1\rangle_B,\; |2\rangle_A |0\rangle_B,\; \ldots \},$$
where $|n\rangle_C$ represents all possible $n$-particle states living on a cluster $C$. While there is only one way to decompose the zero-particle state $|0\rangle_{A \cup B}$ on the disconnected cluster $A \cup B$, one-particle states $|1\rangle_{A \cup B}$ decompose into two sets of states with a particle on cluster $A$ ($|1\rangle_A |0\rangle_B$) and a particle on $B$ ($|0\rangle_A |1\rangle_B$). For two-particle excitations $|2\rangle_{A \cup B}$, there are three possibilities to distribute the particles. In general, $N$-particle states have the form $|k\rangle_A |N-k\rangle_B$ with $k \in \{0, 1, \ldots, N\}$, and there are $N+1$ possibilities to decompose the states.
When restricting Equation (51) to these N-quasiparticle states, a cluster-additive Hamiltonian must decompose as
$$H_{\mathrm{eff}}^{\mathrm{cl.add.}}\big|_{A \cup B}^{N} = \bigoplus_{k=0}^{N} \left( H_{\mathrm{eff}}\big|_{A}^{k} + H_{\mathrm{eff}}\big|_{B}^{N-k} \right), \tag{79}$$
where we introduce the notation $|_C^n$ for restricting the Hamiltonian to all $n$-quasiparticle states on a cluster $C$. The direct sum in Equation (79) is introduced to emphasise that, for a cluster-additive Hamiltonian, there must not be any particle processes between the two disconnected clusters $A$ and $B$. The Hilbert space on the disconnected cluster $A \cup B$ can be seen as the natural extension of the Hilbert spaces on clusters $A$ and $B$, and we can define the operators on the clusters $A$ and $B$ in terms of the Hilbert space on $A \cup B$ as a tensor product:
$$H_{\mathrm{eff}}\big|_{A}^{k} := H_{\mathrm{eff}}\big|^{k} \otimes \mathbb{1}\big|^{N-k}, \qquad H_{\mathrm{eff}}\big|_{B}^{k} := \mathbb{1}\big|^{N-k} \otimes H_{\mathrm{eff}}\big|^{k},$$
where operators on the left of $\otimes$ act on the Hilbert space of $A$ and operators on the right on the Hilbert space of $B$. The issue with Equation (79) is that, when we restrict the particle basis to $N$ particles on the disconnected cluster $A \cup B$, there are contributions from lower particle channels coming from the $N+1$ possibilities to distribute the $N$ particles on the two clusters. For example, if we look at the one-particle space:
$$H_{\mathrm{eff}}^{\mathrm{cl.add.}}\big|_{A \cup B}^{1} = \left( H_{\mathrm{eff}}\big|_{A}^{1} + H_{\mathrm{eff}}\big|_{B}^{0} \right) \oplus \left( H_{\mathrm{eff}}\big|_{A}^{0} + H_{\mathrm{eff}}\big|_{B}^{1} \right),$$
we see that, in addition to the one-particle contributions, we obtain additional zero-particle contributions. The left part of the direct sum stems from acting on $|1\rangle_A |0\rangle_B$ and the right part from acting on $|0\rangle_A |1\rangle_B$. There are always two possibilities to distribute the one-particle excitation on the two clusters, where the other cluster is always unoccupied, which gives an additional zero-particle contribution. This fact is schematically illustrated in Figure 5a.
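The appearance of these admixed zero-particle contributions can be made concrete with a minimal numerical sketch. We assume two disconnected one-site clusters of hard-core bosons with freely chosen, purely illustrative energies (the variable names and numbers below are ours, not from the text):

```python
import numpy as np

# Hard-core boson on a single site: basis {|0>, |1>}.
# Toy "effective" Hamiltonians on two disconnected one-site clusters A and B,
# with assumed ground-state energies e0 and one-particle energies e1.
def cluster_hamiltonian(e0, e1):
    return np.diag([e0, e1])

H_A = cluster_hamiltonian(-0.2, 0.8)   # illustrative numbers only
H_B = cluster_hamiltonian(-0.3, 0.9)

# Cluster additivity: H_eff on the disconnected cluster A u B is the Kronecker sum.
H_AB = np.kron(H_A, np.eye(2)) + np.kron(np.eye(2), H_B)

# One-particle sector of A u B: states |1>_A|0>_B and |0>_A|1>_B
# (indices 2 and 1 in the tensor-product basis |n_A n_B> = |00>,|01>,|10>,|11>).
one_particle = H_AB[np.ix_([2, 1], [2, 1])]

# The diagonal entries are e1_A + e0_B and e0_A + e1_B: the one-particle block
# contains zero-particle (ground-state) contributions of the other cluster,
# so H_eff restricted to the one-particle basis is NOT particle-additive.
print(np.diag(one_particle))           # [e1_A + e0_B, e0_A + e1_B]

# Subtracting the ground-state energy E0(A u B) = e0_A + e0_B recovers the
# irreducible one-particle energies, which do add up cluster by cluster.
E0 = H_A[0, 0] + H_B[0, 0]
print(np.diag(one_particle) - E0)      # [e1_A - e0_A, e1_B - e0_B]
```

The explicit subtraction in the last step is precisely what the particle-additive irreducible contributions formalise.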
This is not the desired behaviour for a linked-cluster expansion. We would like the general notion of Equation (51) to directly translate to the particle-restricted basis, which is illustrated in Figure 5b. In other words, our goal is to find a notion of (cluster) additivity in the restricted particle basis such that we can simply add up the N-particle contributions of individual clusters without caring about lower particle channels. We define particle additivity as
$$H_{\mathrm{eff}}^{\mathrm{add.}}\big|_{A \cup B}^{N} := H_{\mathrm{eff}}\big|_{A}^{N} \oplus H_{\mathrm{eff}}\big|_{B}^{N}, \tag{82}$$
where we demand the other contributions from Equation (79) to vanish. The crucial thing to notice is that irreducible operators (76) in the particle basis:
$$H_{\mathrm{eff}}^{N}\big|_{A \cup B}^{N} \equiv H_{\mathrm{eff}}^{\mathrm{add.}}\big|_{A \cup B}^{N} \tag{83}$$
are, in fact, particle-additive [154,171]. In the following, we will show that this is indeed the case. First, we remember that, for the ground-state energy, we can trivially add up the contributions. Starting from the definition for cluster additivity (79), we have
$$H_{\mathrm{eff}}^{\mathrm{cl.add.}}\big|_{A \cup B}^{0} = H_{\mathrm{eff}}\big|_{A}^{0} + H_{\mathrm{eff}}\big|_{B}^{0} = H_{\mathrm{eff}}^{0}\big|_{A}^{0} + H_{\mathrm{eff}}^{0}\big|_{B}^{0} \equiv H_{\mathrm{eff}}^{\mathrm{add.}}\big|_{A \cup B}^{0}.$$
Second, from restricting the decomposition of Equation (75) to the N = 1 particle channel, we can express the irreducible one-particle operator as
$$H_{\mathrm{eff}}^{1}\big|_{A \cup B}^{1} = H_{\mathrm{eff}}^{\mathrm{cl.add.}}\big|_{A \cup B}^{1} - H_{\mathrm{eff}}^{0}\big|_{A \cup B}^{1}.$$
We recall that, by calculating $H_{\mathrm{eff}}^{0}\big|^{0}$, we automatically obtain $H_{\mathrm{eff}}^{0}$ on the entire Hilbert space, which subsequently defines $H_{\mathrm{eff}}^{0}\big|^{1}$. Therefore, by inserting the definition for cluster additivity (79), we obtain
$$\begin{aligned} H_{\mathrm{eff}}^{\mathrm{cl.add.}}\big|_{A \cup B}^{1} - H_{\mathrm{eff}}^{0}\big|_{A \cup B}^{1} &= \left( H_{\mathrm{eff}}\big|_{A}^{1} + H_{\mathrm{eff}}\big|_{B}^{0} \right) \oplus \left( H_{\mathrm{eff}}\big|_{A}^{0} + H_{\mathrm{eff}}\big|_{B}^{1} \right) - \left( H_{\mathrm{eff}}^{0}\big|_{A}^{1} + H_{\mathrm{eff}}^{0}\big|_{B}^{0} \right) \oplus \left( H_{\mathrm{eff}}^{0}\big|_{A}^{0} + H_{\mathrm{eff}}^{0}\big|_{B}^{1} \right) \\ &= \left( H_{\mathrm{eff}}\big|_{A}^{1} - H_{\mathrm{eff}}^{0}\big|_{A}^{1} \right) \oplus \left( H_{\mathrm{eff}}\big|_{B}^{1} - H_{\mathrm{eff}}^{0}\big|_{B}^{1} \right) \\ &= H_{\mathrm{eff}}^{1}\big|_{A}^{1} \oplus H_{\mathrm{eff}}^{1}\big|_{B}^{1} \equiv H_{\mathrm{eff}}^{\mathrm{add.}}\big|_{A \cup B}^{1}, \end{aligned}$$
where we used the definition of Equation (82) in the last line. Hence, we have proven Equation (83) for $N < 2$. The above proof can be readily extended to $N \ge 2$.
We achieved our goal of finding a notion of cluster additivity in the particle basis, which we termed particle additivity. We can determine the desired particle-additive quantities by using the subtraction scheme:
$$H_{\mathrm{eff}}^{\mathrm{add.}}\big|^{N} \equiv H_{\mathrm{eff}}^{N}\big|^{N} = H_{\mathrm{eff}}\big|^{N} - \sum_{n=0}^{N-1} H_{\mathrm{eff}}^{n}\big|^{N}, \tag{87}$$
which follows from Equation (75) by restricting it to an $N$-particle basis. This is an inductive scheme: starting from $N = 0$, we calculate the irreducible additive quantity $H_{\mathrm{eff}}^{0}\big|^{0}$. This result can be used to calculate the subsequent irreducible additive quantity $H_{\mathrm{eff}}^{1}\big|^{1}$ for $N = 1$. Then, for $N = 2$, we use the results from $N = 0$ and $N = 1$ to calculate $H_{\mathrm{eff}}^{2}\big|^{2}$, and so on. Again, it is important that $H_{\mathrm{eff}}^{n}\big|^{n}$ completely defines the operator $H_{\mathrm{eff}}^{n}$ and, therefore, any $H_{\mathrm{eff}}^{n}\big|^{m}$.
When considering effective observables, the particle number is no longer conserved and more types of processes are allowed. We need to generalise Equation (75) for effective observables by introducing an additional sum over $d$, the change in the quasiparticle number. An effective observable, thus, decomposes as
$$\mathcal{O}_{\mathrm{eff}} = \sum_{n=0}^{\infty} \sum_{d \ge -n} \mathcal{O}_{\mathrm{eff}}^{n,d},$$
where O eff n , d are irreducible contributions [154]. When writing them in second quantisation, we have
$$\mathcal{O}_{\mathrm{eff}}^{n,d} = \sum_{\substack{i_1, \ldots, i_n\\ j_1, \ldots, j_{n+d}}} \tilde{t}_{i_1, \ldots, i_n; j_1, \ldots, j_{n+d}}\, b_{j_{n+d}}^{\dagger} \cdots b_{j_1}^{\dagger} b_{i_n} \cdots b_{i_1}.$$
We can directly see that $d$ quasiparticles are created because there are $d$ additional creation operators; when $d$ is negative, $|d|$ quasiparticles are annihilated. We can infer notions of cluster additivity and particle additivity along the same lines:
$$\mathcal{O}_{\mathrm{eff}}^{\mathrm{cl.add.}}\big|_{A \cup B}^{N \to N+d} = \bigoplus_{k=0}^{N} \left( \mathcal{O}_{\mathrm{eff}}\big|_{A}^{k \to k+d} + \mathcal{O}_{\mathrm{eff}}\big|_{B}^{N-k \to N-k+d} \right),$$
$$\mathcal{O}_{\mathrm{eff}}^{\mathrm{add.}}\big|_{A \cup B}^{N \to N+d} = \mathcal{O}_{\mathrm{eff}}\big|_{A}^{N \to N+d} \oplus \mathcal{O}_{\mathrm{eff}}\big|_{B}^{N \to N+d},$$
where $|_C^{N \to N+d}$ denotes the restriction to processes mapping $N$-particle states to $(N+d)$-particle states on cluster $C$. To determine the particle-additive parts, we can use a subtraction scheme analogous to Equation (87), which can be denoted as
$$\mathcal{O}_{\mathrm{eff}}^{N,d}\big|^{N \to N+d} = \mathcal{O}_{\mathrm{eff}}\big|^{N \to N+d} - \sum_{n=0}^{N-1} \mathcal{O}_{\mathrm{eff}}^{n,d}\big|^{N \to N+d}. \tag{92}$$
If we want to calculate $\mathcal{O}_{\mathrm{eff}}^{N,d}\big|^{N \to N+d}$, we have to apply Equation (92) inductively.
There are several things to be noted at this point. First, not all perturbation theory methods satisfy cluster additivity (79), and in this case, we can no longer write operators as a direct sum. There will be quasiparticle processes between one cluster and the other changing the number of particles on each cluster [155,162]. This is sketched in Figure 6 by the presence of an additional term.
When such a method is nonetheless used for a linked-cluster expansion, the breakdown can be noticed immediately. A symptom of non-cluster additivity is the presence of contributions of lower orders than expected from the number of edges of the graph [155]. When calculating reduced contributions in a linked-cluster expansion, we subtract only contributions of connected subgraphs, which leaves non-zero contributions of disconnected clusters when the perturbation theory method is not cluster-additive. However, there are notable exceptions for which a linked-cluster expansion for energies of excited states is still correct even though the perturbation theory method is not cluster-additive [163,164,165,183,184,185,186]. This is only possible when the considered excitation does not couple to a lower particle channel, i.e., lower-lying states are described by a distinct set of quantum numbers lying in another symmetry sector [155,165,187]. For instance, consider the elementary quasiparticle excitation in a high-field expansion for the transverse-field Ising model (TFIM): the structure of the perturbation is $T_{-2} + T_0 + T_{+2}$, and therefore, the first excited state does not couple directly to the ground state (there is no $T_{\pm 1}$, which is due to symmetry). To draw a comparison, this is similar to calculating excitations in the density matrix renormalisation group: it is no problem to target an excited state if it is in a different symmetry sector than the ground state, but if it is in the same symmetry sector described by the same set of quantum numbers, the Hamiltonian needs to be modified to project out the ground state [188].
Recently, a minimal transformation to an effective Hamiltonian was discovered that preserves cluster additivity. This method, called “projective cluster-additive transformation” [171], can be used analogously and is even more efficient for the calculation of high-order perturbative series. In this review, however, we stick to the well-established pCUT method.

4.4. Calculating Cluster Contributions

At this point, we may ask how to evaluate physical quantities on finite clusters in practice. These quantities are evaluated in the quasiparticle basis. In general, when setting up a cluster expansion, be it non-linked or linked, it is important to subtract the contributions from all possible subclusters to prevent over-counting. Mathematically, for a quantity $M$ ($H_{\mathrm{eff}}$ or $\mathcal{O}_{\mathrm{eff}}$), this can be written as
$$\bar{M}\big|_{C} = M\big|_{C} - \sum_{C' \subset C} \bar{M}\big|_{C'},$$
where the sum runs over all proper subclusters $C'$ of the cluster $C$, and we call the resulting quantity $\bar{M}\big|_{C}$ reduced. Starting from the smallest possible cluster (e.g., a single bond between two sites), this formula can be applied inductively to determine the reduced quantity on increasingly large clusters. An essential observation is that reduced operators vanish on disconnected clusters by construction if the operator $M$ is additive, since we subtract all contributions from individual subclusters. As the linked-cluster theorem applies, we can set up a cluster expansion:
$$M(\mathcal{L}) = \sum_{C \subseteq \mathcal{L}} \bar{M}\big|_{C}$$
over connected clusters $C$ of the lattice $\mathcal{L}$, where we need to consider reduced quantities $\bar{M}$ to prevent over-counting. For a light notation, we will drop the bar in the sections below, as we will only consider reduced contributions on graphs anyway.
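The inductive reduction can be sketched generically. In the sketch below, a cluster is modelled simply as a frozen set of bonds, every subset of bonds is treated as a subcluster for brevity (in an actual linked-cluster expansion, only connected subclusters and their embeddings enter), and the quantity $M$ is a user-supplied callable; all names and numbers are illustrative:

```python
from itertools import combinations

def reduced(M, cluster, cache=None):
    """Reduced contribution: M_bar|_C = M|_C minus all proper-subcluster reductions."""
    if cache is None:
        cache = {}
    if cluster not in cache:
        value = M(cluster)
        for r in range(1, len(cluster)):          # all proper subclusters
            for sub in combinations(sorted(cluster), r):
                value -= reduced(M, frozenset(sub), cache)
        cache[cluster] = value
    return cache[cluster]

# A purely bond-additive quantity has no genuine multi-bond contribution,
# so its reduced value vanishes on every cluster with more than one bond.
weights = {1: 3, 2: 5, 3: 2}                      # illustrative bond weights
M_additive = lambda c: sum(weights[b] for b in c)
print(reduced(M_additive, frozenset({1})))        # 3
print(reduced(M_additive, frozenset({1, 2, 3}))) # 0
```

This makes the statement above explicit: for an additive $M$, all reduced contributions beyond single bonds cancel, and only genuinely connected fluctuations survive the subtraction.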
Now, we are ready to look at the problem from a more practical point of view. From the previous subsection, we know how cluster additivity translates into the particle basis and how to construct particle-additive parts, namely the irreducible quasiparticle contributions. We decompose the effective Hamiltonian into its irreducible contributions by explicitly calculating:
$$H_{\mathrm{eff}}^{0}\big|^{0} = H_{\mathrm{eff}}\big|^{0}, \tag{95}$$
$$H_{\mathrm{eff}}^{1}\big|^{1} = H_{\mathrm{eff}}\big|^{1} - H_{\mathrm{eff}}^{0}\big|^{1}, \tag{96}$$
$$H_{\mathrm{eff}}^{2}\big|^{2} = H_{\mathrm{eff}}\big|^{2} - H_{\mathrm{eff}}^{1}\big|^{2} - H_{\mathrm{eff}}^{0}\big|^{2}, \tag{97}$$
$$\vdots$$
$$H_{\mathrm{eff}}^{N}\big|^{N} = H_{\mathrm{eff}}\big|^{N} - \sum_{n=0}^{N-1} H_{\mathrm{eff}}^{n}\big|^{N}.$$
Again, consider the effective Hamiltonian in second quantisation made up of hard-core bosonic operators $b_i^{(\dagger)}$ annihilating (creating) quasiparticles and the quasiparticle counting operator $n_i = b_i^{\dagger} b_i$ occurring in the unperturbed Hamiltonian $H_0$. We also consider a connected cluster $C$, and we denote $n$-quasiparticle states on this cluster as $|n; i_1, \ldots, i_n\rangle_C$ with the quasiparticles on the sites $i_1$ to $i_n$. Note that, for multiple quasiparticle flavours or multiple sites within a lattice unit cell, this notation can be generalised to $|n; i_1, \ldots, i_n, \tau_1, \ldots, \tau_n\rangle_C$ by introducing additional indices $\tau_i$. To lighten the notation, in the following, we stick to the former case. Let us consider the three lowest particle channels:
n = 0
We can directly calculate the ground-state energy $E_0(C)$ on a cluster $C$ as it is already additive:
$$E_0(C) = \langle 0 | H_{\mathrm{eff}} | 0 \rangle_C, \tag{100}$$
as can be seen from Equation (95).
n = 1
To calculate the irreducible amplitudes $t_{i;j}^{(1)}(C)$ associated with the hopping process $b_j^{\dagger} b_i$ in $H_{\mathrm{eff}}^{1}$, we need to subtract the zero-particle channel, as can be seen from Equation (96). However, we only need to subtract the ground-state energy if the hopping process is local, $b_i^{\dagger} b_i$, since the ground-state energy only contributes to diagonal processes. Thus, we calculate
$$t_{i;j}^{(1)}(C) = \langle 1; j | H_{\mathrm{eff}} | 1; i \rangle_C \quad \text{if } i \ne j, \qquad t_{i;i}^{(1)}(C) = \langle 1; i | H_{\mathrm{eff}} | 1; i \rangle_C - E_0(C) \quad \text{else}. \tag{101}$$
n = 2
In the two-particle case, we have to distinguish between three processes: pair hoppings ($t_{i,j;k,l}^{(2)}(C)\, b_l^{\dagger} b_k^{\dagger} b_j b_i$ with four distinct indices), correlated hoppings ($t_{i,j;i,k}^{(2)}(C)\, b_k^{\dagger} b_j n_i$), and density–density interactions ($t_{i,j;i,j}^{(2)}(C)\, n_j n_i$). The free quasiparticle pair hopping is already irreducible, and nothing has to be done, but for the correlated hopping contribution, we have to subtract the free one-particle hopping. In the case of the two-particle density–density interactions, we need to subtract the local one-particle hoppings, as well as the ground-state energy, as this process is diagonal (cf. Equation (97)). Therefore, we calculate
$$t_{i,j;k,l}^{(2)}(C) = \langle 2; k,l | H_{\mathrm{eff}} | 2; i,j \rangle_C \quad \text{if } i \ne j \ne k \ne l,$$
$$t_{i,j;i,k}^{(2)}(C) = \langle 2; i,k | H_{\mathrm{eff}} | 2; i,j \rangle_C - t_{j;k}^{(1)}(C) \quad \text{if } i \ne j \ne k,$$
$$t_{i,j;i,j}^{(2)}(C) = \langle 2; i,j | H_{\mathrm{eff}} | 2; i,j \rangle_C - t_{i;i}^{(1)}(C) - t_{j;j}^{(1)}(C) - E_0(C) \quad \text{if } i \ne j.$$
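These subtraction steps for the three lowest channels can be written down compactly. The sketch below assumes a routine `matel(bra, ket)` returning the cluster matrix elements $\langle \text{bra} | H_{\mathrm{eff}} | \text{ket} \rangle_C$; the function names, the toy matrix elements, and the numbers are purely illustrative:

```python
def ground_state_energy(matel):
    return matel((), ())                       # E_0(C) = <0|H_eff|0>_C

def t1(matel, i, j):
    amp = matel((j,), (i,))                    # <1;j|H_eff|1;i>_C
    if i == j:                                 # diagonal process: subtract the 0qp channel
        amp -= ground_state_energy(matel)
    return amp

def t2(matel, i, j, k, l):
    amp = matel((k, l), (i, j))                # <2;k,l|H_eff|2;i,j>_C
    if (i, j) == (k, l):                       # density-density: subtract local 1qp and 0qp
        amp -= t1(matel, i, i) + t1(matel, j, j) + ground_state_energy(matel)
    elif i == k:                               # correlated hopping: subtract free 1qp hopping
        amp -= t1(matel, j, l)
    return amp

# Toy diagonal matrix elements: E0 plus local energies plus a 2-particle interaction V.
E0, eps, V = -1.0, {0: 0.5, 1: 0.7}, 0.2
def matel(bra, ket):
    if bra != ket:
        return 0.0
    return E0 + sum(eps[s] for s in bra) + (V if len(bra) == 2 else 0.0)

print(t2(matel, 0, 1, 0, 1))   # isolates the interaction V (0.2 up to rounding)
```

The inductive structure is visible directly in the code: `t2` calls `t1`, which in turn calls `ground_state_energy`, mirroring Equations (95)–(97).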
An analogous procedure can be applied for effective observables. Here, we need to determine the irreducible contributions for a fixed $d$. The subtraction scheme is given by
$$\mathcal{O}_{\mathrm{eff}}^{0,d}\big|^{0 \to d} = \mathcal{O}_{\mathrm{eff}}\big|^{0 \to d},$$
$$\mathcal{O}_{\mathrm{eff}}^{1,d}\big|^{1 \to 1+d} = \mathcal{O}_{\mathrm{eff}}\big|^{1 \to 1+d} - \mathcal{O}_{\mathrm{eff}}^{0,d}\big|^{1 \to 1+d},$$
$$\mathcal{O}_{\mathrm{eff}}^{2,d}\big|^{2 \to 2+d} = \mathcal{O}_{\mathrm{eff}}\big|^{2 \to 2+d} - \mathcal{O}_{\mathrm{eff}}^{1,d}\big|^{2 \to 2+d} - \mathcal{O}_{\mathrm{eff}}^{0,d}\big|^{2 \to 2+d},$$
$$\vdots$$
$$\mathcal{O}_{\mathrm{eff}}^{N,d}\big|^{N \to N+d} = \mathcal{O}_{\mathrm{eff}}\big|^{N \to N+d} - \sum_{n=0}^{N-1} \mathcal{O}_{\mathrm{eff}}^{n,d}\big|^{N \to N+d}.$$
For $d = 0$, we recover exactly the same subtraction procedure as before, and it is straightforward to generalise this procedure to $d \ne 0$. Let us specifically consider the example we will encounter in the next section when calculating correlations for the spectral weight. The effective observable is applied to the unperturbed ground state $|0\rangle$. Hence, there are only contributions out of the ground state ($N = 0$), and the effective observable decomposes into
$$\mathcal{O}_{\mathrm{eff}} = \mathcal{O}_{\mathrm{eff}}^{0,0} + \mathcal{O}_{\mathrm{eff}}^{0,1} + \mathcal{O}_{\mathrm{eff}}^{0,2} + \ldots
$$
Since only $N = 0$ processes contribute, nothing needs to be subtracted, and the effective observable $\mathcal{O}_{\mathrm{eff}}\big|^{0 \to d} = \mathcal{O}_{\mathrm{eff}}^{0,d}\big|^{0 \to d}$ is irreducible and already particle-additive.
Although the lowest orders of these calculations can be determined analytically by hand, they usually become cumbersome quickly and must be performed by a computer programme to push to higher perturbative orders (in most cases, a maximal order ranging from 8 to 20 is achievable). Such a programme reads the information of the cluster (site and bond information), the “bra” and “ket” states, the coefficient lists $C(\boldsymbol{m})$ or $\tilde{C}(\boldsymbol{m}; i)$ from Equations (64) and (68) up to the desired order, the structure of the Hamiltonian $H$, the local $\tau$-operators from the perturbation $V$, and, if necessary, the observable. The input states, as well as the $\tau$-operators, should be efficiently implemented in the eigenbasis of $H_0$, encoding the information bitwise as known, for instance, from exact diagonalisation. If possible, the calculation should be performed with rational coefficients for the exact representation of the perturbative series up to the desired order. The routine of the programme is then to iterate through the operator sequences from the coefficient list $C$ or $\tilde{C}$ and to consecutively apply the $\tau$-operators by systematically iterating over all bonds of the cluster and calculating the action of the operator, saving the intermediate states for the action of the next operator in the sequence. As intermediate states are superpositions of basis states, they are saved in associative containers:
$$|\psi\rangle = \sum_j c_j |j\rangle,$$
where $|j\rangle$ is the bit representation of a basis state. The key of the associative container is the basis state $|j\rangle$ and the associated value the prefactor $c_j$. The bitwise representation of the basis states $|j\rangle$, as well as of the $\tau$-operators, allows for fast access and modification during the calculation.
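A minimal sketch of this state handling follows, assuming hard-core bosons with the occupation of site $i$ encoded in bit $i$ of an integer key and rational prefactors. The operator chosen as an example is a hypothetical local creation operator standing in for a $\tau$-operator:

```python
from fractions import Fraction

def apply_bdagger(state, i):
    """Apply b_i^dagger to a superposition stored as {basis_int: prefactor}."""
    result = {}
    for basis, c in state.items():
        if not (basis >> i) & 1:              # hard-core constraint: site must be empty
            new_basis = basis | (1 << i)      # set bit i -> create a quasiparticle
            result[new_basis] = result.get(new_basis, Fraction(0)) + c
    return result

# |psi> = 1/2 |000> + 1/2 |001>  (three sites, rational coefficients)
psi = {0b000: Fraction(1, 2), 0b001: Fraction(1, 2)}
psi = apply_bdagger(psi, 0)                   # b_0^dagger annihilates the |001> term
print(psi)                                    # one term left: |001> with prefactor 1/2
```

Consecutively applying such routines for every operator in a sequence, while merging coefficients of identical keys, is precisely the inner loop of the programme described above.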
So far, we have introduced the pCUT method for calculating the perturbative contributions on clusters. We demonstrated that the resulting effective quasiparticle-conserving Hamiltonian is cluster-additive and showed how to extract particle-additive irreducible contributions. We introduced a subtraction scheme for the effective Hamiltonian and observables that can be easily applied to calculate additive quantities to set up a linked-cluster expansion. Finally, we briefly discussed an efficient implementation of the pCUT method.

4.5. Energy Spectrum and Observables

Having established the basic theoretical framework for the pCUT method, we want to give a short overview of the physical quantities that are most frequently calculated with this approach. To this end, we assume that we consider a single suitably designed cluster for the calculations instead of setting up an LCE as a full graph decomposition. We will see how we can calculate the desired quantities without thinking about the abstract concepts necessary for linked-cluster expansions, and with the insights from this section, it will be easier to recognise what we are aiming at. Here, we first consider the energy spectrum of the Hamiltonian. We derive both the control-parameter susceptibility as the second derivative of the ground-state energy, as well as the elementary excitation gap from the effective one-quasiparticle Hamiltonian. Second, we consider observables. In the pCUT approach, often, spectral weights are calculated, which are of great importance for inelastic neutron scattering experiments.

4.5.1. Ground-State Energy and Elementary Excitation Gap

Following the above-described recursive scheme, we start with the zero-quasiparticle channel assuming a non-degenerate unperturbed ground state, which is the situation in all applications discussed in this review. The ground-state energy can be directly calculated from the cluster as in Equation (100). We consider a suitably designed cluster that is large enough to accommodate all fluctuations of a given maximal order $o_{\max}$ and has periodic boundary conditions to correctly account for translational invariance. We calculate $E_0 = \langle 0 | H_{\mathrm{eff}} | 0 \rangle$ and obtain a high-order perturbative series:
$$E_0 = \sum_{o=0}^{o_{\max}} p_o \lambda^o$$
in the expansion parameter $\lambda$, which is valid in the thermodynamic limit up to the given maximal order $o_{\max}$. We can extract the ground-state energy per site $\epsilon_0 = E_0 / N$ by dividing $E_0$ by the number of sites $N$ of the cluster. By taking the second derivative, we obtain the control-parameter susceptibility:
$$\chi = \frac{\mathrm{d}^2 \epsilon_0}{\mathrm{d} \lambda^2}.$$
We are usually interested in the quantum-critical properties of the model and, therefore, analyse the behaviour close to the quantum-critical point $\lambda_c$. The control-parameter susceptibility shows the diverging power-law behaviour:
$$\chi \propto |\lambda - \lambda_c|^{-\alpha}$$
with the associated critical exponent $\alpha$, as we know from Table 1.
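The step from the energy series to the susceptibility is just term-wise differentiation of the truncated polynomial, which can be sketched in a few lines; the coefficients below are placeholders, not results for any specific model:

```python
# e0(lam) = sum_o p[o] * lam**o, truncated at order 4 (placeholder coefficients)
p = [-0.5, 0.0, -0.125, 0.0, -0.0078125]

# chi = d^2 e0 / d lam^2 as a series: the coefficient of lam**(o-2) is o*(o-1)*p[o]
chi = [o * (o - 1) * p[o] for o in range(2, len(p))]
print(chi)   # [-0.25, 0.0, -0.09375]

def chi_eval(lam):
    """Evaluate the truncated susceptibility series at a given lambda."""
    return sum(c * lam**n for n, c in enumerate(chi))
```

In practice, the bare truncated series is of course not evaluated directly near $\lambda_c$; extrapolation techniques (discussed elsewhere in this review) are applied to the series coefficients to extract the critical exponent.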
Turning to the one-quasiparticle channel, we calculate the hopping amplitudes following Equation (101). Here, we use open boundary conditions, again with a cluster large enough to accommodate all fluctuations contributing to the hopping process. Note that, in our notation, we use $t_{i;j}$ for hopping amplitudes on a graph or cluster level and the character $a$ for processes in the thermodynamic limit. As for the ground-state energy, we can directly infer the contribution in the thermodynamic limit if the contributing fluctuations do not feel finite-size effects, and thus, we use $a(j - i)$ in the following. Further, we can generalise our notation to multiple particle types or larger unit cells containing multiple sites, as mentioned earlier, by introducing additional indices $\xi, \tau$. We denote a hopping from unit cell $i$ to $j$ by $\delta = j - i$ and within the unit cells from $\xi$ to $\tau$. Due to translational symmetry, we can fix $j$ and calculate $a_{\xi,\tau}(\delta) = \langle 1; j, \tau | H_{\mathrm{eff}} | 1; j - \delta, \xi \rangle$. The effective one-quasiparticle Hamiltonian in second quantisation then is
$$H_{\mathrm{eff}}^{1\mathrm{qp}} := H_{\mathrm{eff}}\big|^{1} = H_{\mathrm{eff}}^{0}\big|^{1} + H_{\mathrm{eff}}^{1}\big|^{1} = \epsilon_0 N + \sum_{j, \delta, \xi, \tau} a_{\xi,\tau}(\delta)\, b_{j,\tau}^{\dagger} b_{j-\delta,\xi}.$$
Applying the Fourier transform for a discrete lattice:
$$b_{j,\tau}^{\dagger} = \frac{1}{\sqrt{N}} \sum_{k} b_{k,\tau}^{\dagger}\, \mathrm{e}^{\mathrm{i} k j}, \qquad b_{j,\tau} = \frac{1}{\sqrt{N}} \sum_{k} b_{k,\tau}\, \mathrm{e}^{-\mathrm{i} k j},$$
we can diagonalise the resulting Hamiltonian in momentum space:
$$\mathcal{F}\, H_{\mathrm{eff}}^{1\mathrm{qp}} = \epsilon_0 N + \sum_{k} \boldsymbol{b}_{k}^{\dagger}\, \Omega(k)\, \boldsymbol{b}_{k} = \epsilon_0 N + \sum_{k} \sum_{m,n} \Omega_{m,n}(k)\, b_{k,m}^{\dagger} b_{k,n} = \epsilon_0 N + \sum_{k} \sum_{\nu} \omega_\nu(k)\, \beta_{k,\nu}^{\dagger} \beta_{k,\nu},$$
introducing the operators $\beta_{k,\nu}^{(\dagger)}$ that diagonalise the matrix $\Omega_{m,n}(k)$. The eigenenergies $\omega_\nu(k)$ are the associated bands of the one-quasiparticle dispersion. In the case of a trivial unit cell or a single particle flavour, the dispersion matrix becomes a scalar, and we can directly express the single-banded dispersion as
$$\omega(k) = a(0) + 2 \sum_{\delta} a(\delta) \cos(k \delta),$$
where the sum over $\delta$ is restricted to symmetry representatives and we assumed real hopping amplitudes $a(\delta)$. We determine the hopping amplitudes $a(\delta)$ as a perturbative series by performing the calculations on the properly designed cluster. Note that, even for a single cluster, we need to perform a subtraction scheme because, in order to obtain the irreducible contribution $a(0)$ explicitly, we need to subtract the ground-state energy from $\langle 1; i | H_{\mathrm{eff}} | 1; i \rangle$. When we determine the elementary excitation gap at the minimum of the lowest band:
$$\Delta = \min_{k,\nu} \omega_\nu(k) = \sum_{o=0}^{o_{\max}} p_o \lambda^o,$$
we can as well extract the gap directly as such a series. The gap closing shows a power-law behaviour:
$$\Delta \propto |\lambda - \lambda_c|^{z\nu}$$
about the critical point with the critical exponent $z\nu$.
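Given hopping amplitudes $a(\delta)$, for instance, the perturbative series evaluated at a fixed $\lambda$, the single-band dispersion and the gap follow directly. The amplitudes below are illustrative placeholders for a one-dimensional chain:

```python
import numpy as np

a = {0: 1.0, 1: -0.4, 2: -0.05}   # assumed amplitudes a(delta) for delta = 0, 1, 2

def omega(k):
    """Single-band dispersion w(k) = a(0) + 2 * sum_delta a(delta) * cos(k*delta)."""
    return a[0] + 2 * sum(a[d] * np.cos(k * d) for d in (1, 2))

ks = np.linspace(-np.pi, np.pi, 2001)         # momentum grid over the Brillouin zone
gap = float(np.min(omega(ks)))                # elementary excitation gap Delta
k_min = float(ks[np.argmin(omega(ks))])       # momentum of the band minimum
print(gap, k_min)                             # here: gap of 0.1 at k = 0
```

For series analysis, one would instead keep $a(\delta)$ as polynomials in $\lambda$ and extract the gap series at the (usually symmetry-determined) minimum momentum before extrapolating its closing.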

4.5.2. Spectral Properties

Neutron scattering is a powerful method for resolving spatial and dynamical structures in condensed matter physics, since thermal neutrons have a de Broglie wavelength on the same length scale as interatomic distances and an energy of the same order of magnitude as typical excitations [189]. By measuring the change in momentum and kinetic energy in inelastic neutron scattering experiments determining the dynamic response, not only static properties like magnetic order can be resolved, but also dynamic properties like spin–spin correlations [190]. The dynamic response $S^{\alpha\beta}(\boldsymbol{k}, \omega)$ can be determined as it is proportional to the cross-section:
$$\frac{\mathrm{d}^2 \sigma}{\mathrm{d}\Omega\, \mathrm{d}\omega} \propto \sum_{\alpha,\beta} S^{\alpha\beta}(\boldsymbol{k}, \omega)$$
of inelastic neutron scattering [189,190]. We follow the derivations in Refs. [155,191,192] and start with the definition of the dynamic response:
$$S^{\alpha\beta}(\boldsymbol{k}, \omega) = \frac{1}{2\pi N} \sum_{i,j} \int_{-\infty}^{\infty} \mathrm{d}t\, \exp\{\mathrm{i} [\omega t - \boldsymbol{k} \cdot (\boldsymbol{j} - \boldsymbol{i})]\}\, \langle S_j^{\alpha}(t)\, S_i^{\beta}(0) \rangle_T,$$
which is the space and time Fourier transform of the spin correlation function $\langle S_j^{\alpha}(t)\, S_i^{\beta}(0) \rangle_T$ with $\alpha, \beta \in \{x, y, z, +, -\}$ and $\langle \cdot \rangle_T$ referring to the thermal expectation value. In the limit of vanishing temperature $T = 0$, the expectation value simplifies to $\langle \cdot \rangle = \langle \psi_0 | \cdot | \psi_0 \rangle$ with $|\psi_0\rangle$ being the ground state. Then, we call $S^{\alpha\beta}(\boldsymbol{k}, \omega)$ the dynamic structure factor. We introduce a complete set of energy eigenstates $\{ |\psi_\Lambda\rangle \}$, where $\Lambda$ denotes a set of quantum numbers, for instance $n, \boldsymbol{k}$, where $n$ is the number of quasiparticles and $\boldsymbol{k}$ the lattice momentum. Writing the dynamic structure factor in terms of these energy eigenstates yields its spectral form as a sum:
S^{\alpha\beta}(k, \omega) = \sum_\Lambda S^{\alpha\beta}_\Lambda(k, \omega) ,
where S^{αβ}_Λ(k, ω) is called the exclusive structure factor or spectral weight associated with the quantum numbers Λ. We insert the identity 1 = \sum_\Lambda | \psi_\Lambda \rangle \langle \psi_\Lambda | into the correlation function and switch to the Heisenberg picture, where S_j^\alpha(t) = e^{iHt} S_j^\alpha(0) e^{-iHt}. The spectral weight then reads
S^{\alpha\beta}_\Lambda(k, \omega) = \frac{1}{2\pi N} \sum_{i,j} \int \mathrm{d}t\, \exp[ i (\omega - E_\Lambda + E_0)\, t ]\, \exp[ -i k\,(j - i) ] \times \langle \psi_0 | S_j^\alpha(0) | \psi_\Lambda \rangle \langle \psi_\Lambda | S_i^\beta(0) | \psi_0 \rangle ,
which, after integrating over time t, yields
S^{\alpha\beta}_\Lambda(k, \omega) = \delta(\omega - E_\Lambda + E_0)\, \frac{1}{N} \sum_{i,j} \langle \psi_0 | S_j^\alpha(0) | \psi_\Lambda \rangle \langle \psi_\Lambda | S_i^\beta(0) | \psi_0 \rangle \exp[ -i k\,(j - i) ] .
If we consider the case (S_i^\alpha)^\dagger = S_i^\beta, or some observable with (O_i)^\dagger = O_i, which could be a linear combination of spin operators, then the above expression further simplifies to
S_\Lambda(k, \omega) = \delta(\omega - E_\Lambda + E_0)\, \frac{1}{N} \left| \sum_i \langle \psi_\Lambda | O_i | \psi_0 \rangle\, e^{i k i} \right|^2 .
By the above equations, we can identify the spectral weights S Λ ( k ) as
S_\Lambda(k) = \frac{1}{N} \left| \sum_i \langle \psi_\Lambda | O_i | \psi_0 \rangle\, e^{i k i} \right|^2 .
These quantities are usually visualised as a heat map where the dispersion ω(k) is plotted against the momentum k, and the value S_Λ(k), that is, the intensity of the scattering signal associated with |ψ_Λ⟩, is colour-coded.
In the pCUT approach, we want to reformulate the observable in terms of an effective one, O_eff. Here, we restrict ourselves to the one-quasiparticle spectral weight; a formulation for the two-quasiparticle case can be found in Ref. [191]. For the 1qp spectral weight, Λ are the quantum numbers defining one-quasiparticle states, and we denote S_\Lambda(k) \equiv S_\tau^{1\mathrm{qp}}(k) in the following. Since the pCUT method is a perturbative approach, we want to reformulate the problem in the language of H_0 states:
| \psi_0 \rangle = U | 0 \rangle ,
| \psi_{1\mathrm{qp}} \rangle = U | 1; k, \tau \rangle ,
where we introduce the momentum states | 1 ; k , τ with additional index τ denoting a quantum number like a flavour of the excitation or denoting a site in a unit cell. The momentum states are defined via the Fourier transform:
| 1; k, \tau \rangle = \frac{1}{\sqrt{N}} \sum_j \exp(i k j)\, | 1; j, \tau \rangle .
Inserting these identities, we obtain
S_\tau^{1\mathrm{qp}}(k) = \frac{1}{N} \left| \sum_i \langle \psi_{1\mathrm{qp}} | O_i | \psi_0 \rangle \exp(i k i) \right|^2 = \frac{1}{N} \left| \sum_i \langle 1; k, \tau | U^\dagger O_i U | 0 \rangle \exp(i k i) \right|^2 = \frac{1}{N^2} \left| \sum_{i,j} \langle 1; j, \tau | O_{\mathrm{eff},i} | 0 \rangle \exp[ -i k\,(j - i) ] \right|^2 ,
where we defined O_{\mathrm{eff},i} = U^\dagger O_i U. For a problem with translational invariance, we can fix the site i of the observable and introduce δ = j − i, which yields
S_\tau^{1\mathrm{qp}}(k) = \frac{1}{N^2} \left| \sum_{i,\delta} \langle 1; i+\delta, \tau | O_{\mathrm{eff},i} | 0 \rangle \exp(-i k \delta) \right|^2 = \left| \sum_\delta \tilde{a}_\tau(\delta) \exp(-i k \delta) \right|^2 ,
where we used \tilde{a}_\tau(\delta) = \langle 1; i+\delta, \tau | O_{\mathrm{eff},i} | 0 \rangle. Note that the irreducible contribution of the effective observable takes the form O_\mathrm{eff}^{0,1} = \sum_\delta \tilde{a}_\tau(\delta)\, b^\dagger_{\delta,\tau}, exactly as denoted above in Equation (108). As long as the processes \tilde{a}(\delta) and \tilde{a}(-\delta) are equivalent by means of the model symmetries, we can use \tilde{a}(\delta) = \tilde{a}(-\delta) and further simplify S_\tau^{1\mathrm{qp}}(k) to
S_\tau^{1\mathrm{qp}}(k) = \left| \tilde{a}_\tau(0) + 2 \sum_{\delta} \tilde{a}_\tau(\delta) \cos(k \delta) \right|^2 \equiv | s_\tau(k) |^2 .
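The evaluation of this expression can be sketched numerically with made-up amplitudes ã_τ(δ) already evaluated at a fixed coupling (the numbers are illustrative, not from any model discussed here):

```python
import math

# Minimal sketch with hypothetical 1D amplitudes a~_tau(delta) at a
# fixed coupling lambda; in the pCUT method these are series in lambda.
a_tilde = {0: 0.8, 1: 0.3, 2: 0.05}

def spectral_weight(k):
    """S(k) = |a~(0) + 2 * sum_{delta > 0} a~(delta) cos(k delta)|^2."""
    s = a_tilde[0] + 2 * sum(a_tilde[d] * math.cos(k * d)
                             for d in a_tilde if d > 0)
    return abs(s) ** 2
```

Evaluating `spectral_weight` over the Brillouin zone gives the intensity profile that would be colour-coded in the heat maps mentioned above.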
So, in the pCUT method, we determine \tilde{a}_\tau(\delta) as a perturbative series, from which we can determine S_\tau^{1\mathrm{qp}}(k). Note that this observable is already an irreducible contribution according to Equation (92), and no subtraction scheme needs to be performed: because we consider the contribution of O_\mathrm{eff} acting on |0\rangle in the one-quasiparticle channel, the initial particle number is n = 0, and nothing needs to be subtracted. To extract the quantum-critical behaviour, we evaluate the expression at the critical momentum k_c as
S_\tau^{1\mathrm{qp}}(k_c) = \sum_{o=0}^{o_\mathrm{max}} p_o \lambda^o ,
yielding a perturbative series for this quantity as well. It shows a diverging critical behaviour that goes as
S_\tau^{1\mathrm{qp}}(k_c) \propto | \lambda - \lambda_c |^{-(2 - z - \eta)\nu}
with the associated critical exponent (2 − z − η)ν. After determining the three quantities χ, Δ, and S_\tau^{1\mathrm{qp}}(k_c) described above, we can use the extrapolation techniques described in Section 4.8 to extract estimates for the critical point λ_c and the associated critical exponents. First, however, we continue with the description of a linked-cluster expansion as a full graph decomposition for long-range interacting systems. The next step on the way is to formally introduce graphs, discuss their generation, and present the associated concept of white graphs.

4.6. White Graph Decomposition

In this section, we first give a brief introduction to graph theory, which forms the basis for understanding how linked-cluster expansions as a graph decomposition work. Then, we discuss how to generate all topologically distinct graphs up to a certain number of edges corresponding to the maximal number of active edges at a given perturbative order. We conclude this section by explaining the concept of white graphs where edge colours are ignored as a topological attribute in the classification of graphs. White graphs are essential for tackling long-range interactions, and therefore, every graph decomposition in this review is, in fact, a white graph decomposition.

4.6.1. Graph Theory

So far, we have defined clusters as a cut-out of the infinite lattice with a finite set of sites and a set of bonds connecting those sites. More generally, considering only the topology of clusters without restricting them to the geometry of lattices, we can define a graph as a tuple G = (V_G, E_G) consisting of a (finite) set of vertices V_G and a (finite) set of edges E_G. An edge e ∈ E_G consists of a pair of vertices {μ, ν}, and these vertices μ, ν ∈ V_G are called adjacent. The degree of a vertex is the number of edges connecting it to other vertices of the graph. In the following, we only consider undirected, simple, and connected graphs, which means there are neither directed edges, multiple edges between two vertices, nor loops (no edge joining a vertex to itself), and there always exists a path of edges connecting any two vertices of the graph [155,193,194]. As an example, we depict a graph G = (V_G, E_G) with V_G = {0, 1, 2, 3, 4, 5} and E_G = {{0,1}, {0,2}, {0,4}, {0,5}, {1,3}, {2,3}} in Figure 7.
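These definitions translate directly into code. The following plain-Python sketch encodes the example graph of Figure 7 as vertex and edge sets, computes the vertex degrees, and checks connectedness with a breadth-first search:

```python
# The graph G = (V_G, E_G) from Figure 7; edges as frozensets since
# the graph is undirected.
V_G = {0, 1, 2, 3, 4, 5}
E_G = {frozenset(e) for e in [(0, 1), (0, 2), (0, 4), (0, 5), (1, 3), (2, 3)]}

# Degree of a vertex = number of edges attached to it.
degree = {v: sum(1 for e in E_G if v in e) for v in V_G}

# Connectedness check: breadth-first search starting from vertex 0.
seen, frontier = {0}, [0]
while frontier:
    v = frontier.pop()
    for e in E_G:
        if v in e:
            (w,) = e - {v}
            if w not in seen:
                seen.add(w)
                frontier.append(w)
connected = seen == V_G
```

In practice, such bookkeeping is better left to the graph libraries recommended in Section 4.6.2; this sketch only makes the definitions concrete.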
A subgraph G' of a graph G (we write G' ⊆ G) is defined by subsets V_{G'} ⊆ V_G and E_{G'} ⊆ E_G [193,194,195]. We call G' a proper subgraph if V_{G'} and E_{G'} are proper subsets of V_G and E_G.
To set up a full graph decomposition, it is essential to define how to distinguish different graphs. If there exists an isomorphism between two graphs, we call them topologically equivalent; otherwise, they are topologically distinct. A graph isomorphism  Iso ( G 1 , G 2 ) is a bijective map φ Iso between the vertex sets of two graphs, such that
\varphi_\mathrm{Iso} : V_{G_1} \to V_{G_2} \quad \text{with} \quad \{\mu, \nu\} \in E_{G_1} \Leftrightarrow \{\varphi_\mathrm{Iso}(\mu), \varphi_\mathrm{Iso}(\nu)\} \in E_{G_2} .
So, an isomorphism preserves adjacency and non-adjacency, i.e., μ and ν are adjacent if and only if the vertices φ_Iso(μ) and φ_Iso(ν) are adjacent, for any vertices μ, ν ∈ V_{G_1} [193,194,196]. A special case are graph automorphisms Auto(G), which are maps of a graph onto itself, so we have
\varphi_\mathrm{Auto} : V_G \to V_G \quad \text{with} \quad \{\mu, \nu\} \in E_G \Leftrightarrow \{\varphi_\mathrm{Auto}(\mu), \varphi_\mathrm{Auto}(\nu)\} \in E_G .
In other words, a graph automorphism is a permutation of the vertex set preserving adjacency [194,196]. The number of graph automorphisms |Auto(G)| of a given graph gives the number of its symmetries, and we call it the symmetry number s_G = |Auto(G)| of the graph [155,195]. Examples of a graph isomorphism and automorphism are depicted in Figure 7. Further, if G_1 ⊆ G_2, the mapping (134) is injective instead of bijective such that
\varphi_\mathrm{Mono} : V_{G_1} \to V_{G_2} \quad \text{with} \quad \{\mu, \nu\} \in E_{G_1} \Rightarrow \{\varphi_\mathrm{Mono}(\mu), \varphi_\mathrm{Mono}(\nu)\} \in E_{G_2} ,
and we call it a subgraph isomorphism or monomorphism  Mono ( G 1 , G 2 ) [195,197,198].
An example of a monomorphism is depicted in Figure 8a. Monomorphisms will later become of utmost importance as they give the number of embeddings of a subgraph onto a larger graph, in our case the infinite lattice.
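For the small graphs relevant in a high-order series expansion, the symmetry number s_G = |Auto(G)| can be obtained by brute force, testing every vertex permutation for adjacency preservation. A minimal sketch for a four-site path graph (our own illustrative example, whose only automorphisms are the identity and the reflection):

```python
from itertools import permutations

# Brute-force automorphism count for a small graph. Suitable only for
# the few-edge graphs of a linked-cluster expansion, not large graphs.
V_G = [0, 1, 2, 3]
E_G = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}  # 4-site path

def is_automorphism(perm):
    """Check whether the permutation preserves the edge set."""
    phi = dict(zip(V_G, perm))
    return {frozenset({phi[mu], phi[nu]}) for mu, nu in E_G} == E_G

s_G = sum(1 for perm in permutations(V_G) if is_automorphism(perm))
```

Dedicated tools like "nauty" compute the same quantity far more efficiently via canonical labelling, which is what one would use in production code.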
To account for hopping processes during the embedding of graphs, we need to assign additional attributes to graphs, like colouring their vertices [195]. A coloured graph G_c is then a tuple (V_G, E_G, A_V), where A_V is a set of pairs {μ, a} with μ ∈ V_G and a the colour attribute. In Figure 8b, an example is depicted with A_V = {{0, green}, {2, green}}. We can extend the above definitions of isomorphisms and automorphisms and demand that they preserve the vertex colour, i.e., only vertices of the same colour can be mapped onto each other. Of course, this reduces the cardinality |Auto(G_c)| and, therefore, the symmetry number s_{G_c} associated with the coloured graph. We can later exploit the colour information for matching hopping vertices of the graph with the actual hopping of the quasiparticle on the lattice. Note that colouring the edges of graphs with attributes A_E is also useful to distinguish different types of interactions of a Hamiltonian on the graph level, and very similar properties as for coloured vertices hold. As stated above, two graphs are topologically distinct if there exists no graph isomorphism between them. Thus, for coloured graphs, the colour attribute serves as another topological attribute. The importance of this will become apparent in Section 4.6.3. For a more elaborate overview of graph theory in the context of linked-cluster expansions, we recommend Ref. [195].

4.6.2. Graph Generation

For the generation of (undirected) simple, connected graphs, we need to define an ordering between all graph isomorphs such that it is possible to pick a unique representative of all graph isomorphs, called the canonical representative [199]. One challenge lies in generating graphs efficiently: the number of connected graphs grows exponentially with the number of edges, so the idea behind every algorithm must be to restrict the number of possible graph-isomorph candidates when generating new edge sets by permutation. A well-known algorithm is McKay's algorithm [199,200], which exploits the degree information of the vertices, sets up a search tree for vertices with the same degree, and uses the ordering to check if the canonical representative of the current graph already exists. We recommend using open-source libraries for the graph generation and the calculation of additional symmetries, for instance, "nauty" [201], "networkX" [202], or the "Boost Graph Library" [203].
There are various conventions for how to write the graph information to a file. One could simply save the vertex and edge sets as lists or save the adjacency matrix, where the rows and columns refer to the sites and a non-zero entry in the corresponding matrix element marks the existence of an edge between the sites [155,193,194]. Here, we suggest simply using a bond list. Each entry in the list contains the edge information of the graph, denoted as n_e, μ, ν, where e ∈ E_G with μ, ν ∈ V_G adjacent to e, and n_e is a number associated with the edge e. The number n_e can be interpreted as a specific expansion parameter corresponding to this bond. In the simplest case, there is just one expansion parameter and, therefore, a single number n_e for all edges. Assigning multiple expansion parameters becomes especially important in the next Section 4.6.3. Usually, the symmetry number s_G of a graph is calculated on the fly when generating a set of graphs and should be saved as well. In our workflow, we save the generated graphs into bond lists and create a list containing all connected graphs (e.g., the file names) along with their symmetry numbers s_G. These types of lists suffice for the calculation of the ground-state energy. When calculating 1qp irreducible processes for the dispersion or for the spectral weight observable, we can think of these processes as breaking the graph symmetry. Therefore, after the graph generation, we consider all possible processes on a graph and assign colour attributes to the start and end vertices. Due to the symmetry of the processes, we assign the same colour for hoppings and distinct colours for processes of the spectral weight (when calculating the hopping processes of the effective 1qp Hamiltonian from vertex μ to ν, the Hermiticity of the Hamiltonian implies the symmetry t_{μ;ν} = t_{ν;μ}, with the inverse process giving the same contributions as long as the prefactors are real.
Thus, we can use the same colour for start and end vertex. The 1qp process for the spectral weight, t̃_{μ;ν}, is different because a quasiparticle is created at vertex μ and then subsequently hops to vertex ν. Therefore, t̃_{μ;ν} and t̃_{ν;μ} are, in general, not equivalent processes, t̃_{μ;ν} ≠ t̃_{ν;μ}, and we have to use distinct colours for the start and end vertices. But, as mentioned in Section 4.5.2, the processes can become equivalent by means of general model symmetries, for instance due to underlying graph symmetries). We calculate the symmetry number s_{G_c} associated with the coloured graph. In the end, we create a list where each entry contains the graph, its symmetry number s_G, a representative process (start and end vertex), and the associated symmetry number of the coloured graph s_{G_c}, counting the number of processes that give the same contributions as the representative process due to symmetries.
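The bond-list convention suggested above can be sketched as follows; the file name and exact layout are an illustrative choice, not a fixed standard:

```python
import os
import tempfile

# Sketch of a bond-list file: one line "n_e mu nu" per edge, where n_e
# indexes the expansion parameter of edge e = {mu, nu}. In the simplest
# case of a single expansion parameter, n_e is the same for all edges.
bond_list = [(0, 0, 1), (0, 1, 2)]  # a two-edge graph, one parameter type

path = os.path.join(tempfile.gettempdir(), "graph_0001.txt")
with open(path, "w") as fh:
    for n_e, mu, nu in bond_list:
        fh.write(f"{n_e} {mu} {nu}\n")

# Reading the file back recovers the edge information.
with open(path) as fh:
    edges = [tuple(map(int, line.split())) for line in fh]
```

In a full workflow, an additional index file would map such file names to the symmetry numbers s_G and the representative processes described above.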
After generating the graphs and the lists, we can employ perturbation theory on the graph level, viewing graphs as abstract clusters with vertices as lattice sites connected by bonds. A programme as described above reads in the graph and process information and repeatedly performs the pCUT calculation on every graph. The resulting graph contributions must be added up in such a way that the information of the lattice geometry is restored by weighting the graph contributions with embedding factors, that is, the number of ways a graph can be fitted onto the lattice apart from translational invariance (cf. Ref. [155]). For the conventional embedding of contributions from models with just nearest-neighbour interactions, the number of graphs can be reduced as the lattice geometry puts a restriction on the graphs. For example, graphs containing cycles of odd length cannot be embedded on the square lattice. In contrast, for long-range interactions, no such restriction exists, and the lattice can be seen as a single fully connected graph where every site interacts with every other site. Hence, we have to generate every simple connected graph up to a given number of edges, as all graphs contribute. Before we turn to the embedding problem in more detail, we will first deal with the challenges long-range interactions pose and address them by using white graphs.

4.6.3. White Graphs for Long-Range Interactions

In many cases in perturbation theory, there may be several types of expansion parameters, e.g., due to the geometry of the lattice, as can be seen in the n-leg Heisenberg ladder [152]:
H = H_0 + \lambda_\perp V_\perp + \lambda_\parallel V_\parallel
with an expansion parameter λ_⊥ associated with the rungs and another one λ_∥ associated with the legs. The perturbations are given by XY interactions, and the series expansion is performed about the Ising limit H_0. Setting up a graph decomposition for this model, the canonical approach would be to associate each expansion parameter with a distinct edge colour, blue for λ_⊥ and purple for λ_∥.
In Figure 9a (left), all graphs with two edges and two colours (one for λ_⊥ and one for λ_∥) are depicted. It is necessary to incorporate the edge colour information as the graphs can only be embedded correctly on the infinite lattice when the edge colour matches the bond type of the lattice. Another common type of perturbation problem is one where the perturbation splits into different types of interactions, although associated with the same perturbation parameter. For instance, the problem can look like
H = H_0 + \lambda V = H_0 + \lambda \left( V^x + V^y + V^z \right) ,
which is essentially the form of the Hamiltonian when performing a high-field series expansion [204] for Kitaev's honeycomb model [205], where each site has one x-, one y-, and one z-type Ising interaction bond to its neighbours on the honeycomb lattice. Here, we associate the three different interaction types with three edge colours: blue, green, and purple for x-, y-, and z-bonds. See Figure 9b (left) for an illustration of all three graphs with two edges and three colours (there are only three possibilities because the colours must alternate due to the constraints posed by Kitaev's honeycomb model). In both cases, the edge colour is an additional topological attribute of the graph, leading to exponentially more graphs with the number of colours, which becomes relevant when pushing to high perturbative orders.
In the case of long-range interactions, such an approach becomes unfeasible. A Hamiltonian with long-range interactions is of the form
H = H_0 + \lambda \sum_\delta \frac{1}{|\delta|^{d+\sigma}} V_\delta = H_0 + \sum_\delta \lambda_\delta V_\delta ,
where δ is the distance between interacting sites, d the dimension of the lattice, σ the long-range decay exponent, and λ_δ = λ|δ|^{−(d+σ)}. Applying the conventional approach from above, we would have to associate each of the infinitely many perturbations V_δ with its own edge colour. The only obvious way to resolve this problem would be to truncate the sum over δ, considering only very small distances. Instead, the use of white graphs [152] is a solution to problems of this kind [25,29]. The idea is to change our view of how to tackle these Hamiltonians. We ignore the edge colours of the graph (significantly reducing the number of graphs at a given order) and instead encode the colour information in the expansion parameters on the graph level in a more abstract way. This is performed by associating each edge e of the graph G with a different "abstract" perturbation parameter λ_e such that we can track how often τ-operators acted on each edge e, yielding a multivariable polynomial:
P_G(\{\lambda_e\}) = \sum_m v_m(G)\, M_{G,m} = \sum_m v_m(G) \prod_{e \in E_G} \lambda_e^{n_{e,m}} ,
with the sum running over all monomial contributions. Each monomial contribution consists of a prefactor v_m(G) and its monomial dependency M_{G,m}. The latter comprises a product of expansion parameters λ_e associated with the edges e and their respective integer powers n_{e,m} ≥ 1, tracking how often each bond was active during the calculation. Let us emphasise that we simply wrote the white-graph contribution for ϵ_0, t_{μ;ν}, or t̃_{μ;ν} explicitly as a polynomial P_G({λ_e}). It is only later, during the embedding of these abstract generalised contributions, that the proper link colour is reintroduced by substituting the abstract expansion parameters with the actual expansion parameters for each realisation on the lattice. This is the origin of the name "white graphs": during the calculation on the graph, the colour of the links is unspecified (but encoded in the multivariable polynomial), and only during the embedding is the colour of the edges reintroduced. In Figure 9 (middle), we illustrate the white-graph concept. On the right, we depict how to recover the polynomial of the conventional graph contribution from the abstract white-graph contribution for the models in Equations (137) and (138) by correctly substituting the abstract expansion parameters. The main difference between the model (138) and the model (137) is that, for the different interaction types, multiple parameters are associated with each edge. To account for three types of interaction flavours, we have to consider three expansion parameters λ_{e,f} with f ∈ {x, y, z} per edge e.
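A minimal sketch of such a multivariable polynomial as a container, mapping exponent tuples (n_{e,m}) to prefactors v_m(G); the numbers are illustrative, not an actual white-graph contribution:

```python
# White-graph polynomial sketch for a two-edge graph: keys are the
# per-edge powers (n_e1, n_e2), values the prefactors v_m(G).
P_G = {
    (1, 1): 0.5,    # 0.5   * lambda_e1   * lambda_e2
    (2, 1): -0.25,  # -0.25 * lambda_e1^2 * lambda_e2
}

def evaluate(P, lambdas):
    """Substitute concrete expansion parameters lambdas[e] into P."""
    total = 0.0
    for powers, v in P.items():
        term = v
        for lam, n in zip(lambdas, powers):
            term *= lam ** n
        total += term
    return total
```

The substitution step of the embedding then amounts to calling `evaluate` with the physical parameters of each concrete realisation on the lattice.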
For long-range interactions, we can straightforwardly apply the exact same white graph scheme. We substitute the abstract expansion parameter with the correct algebraically decaying interaction strength depending on the distance between the interacting sites on the lattice [25,29]. For this, we have to use the substitution:
\lambda_e \to \frac{1}{|\delta|^{d+\sigma}}
with δ = i_ν − i_μ and e = {μ, ν} ∈ E_G. It is also possible to incorporate multiple interaction types, but the substitution for the different interaction types must then be performed before the embedding, as was done in Refs. [31,35,206].
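This substitution can be sketched for one concrete embedding of a two-edge white graph on a chain (d = 1); the decay exponent and the embedded positions are illustrative:

```python
# Substitution lambda_e -> 1 / |delta|^(d + sigma) for one concrete
# embedding on a 1D chain; sigma and the positions are made up.
d, sigma = 1, 2.0
positions = {0: 0, 1: 3, 2: 5}   # white-graph vertex -> lattice site
edges = [(0, 1), (1, 2)]         # white-graph edges e = {mu, nu}

lambdas = [1.0 / abs(positions[nu] - positions[mu]) ** (d + sigma)
           for mu, nu in edges]  # one parameter per edge
```

Summing the substituted polynomial over all embeddings is exactly the step that the Monte Carlo integration of Section 4.7 performs stochastically.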
So far, we have explained how to resolve the problem of infinitely many perturbation parameters by introducing white graphs. We managed to reduce the number of expansion parameters from infinitely many to the number of edges, i.e., the order of the perturbation. Yet, the polynomial in Equation (140) still grows exponentially with the number of expansion parameters. Following the description in Ref. [152], we can mitigate this issue further by an efficient representation of white-graph contributions. The abstract concept is to track the relevant information in a quantity M(n) as a generalised monomial with the property M(n_1 + n_2) = M(n_1) M(n_2) and to use n_i as an abstract parameter that encodes the tracked information. In Equation (140), the monomial quantity M(n) is just M_{G,m}, tracking how often each edge was active during the calculation by associating each edge with its own abstract expansion parameter. We can generalise the expression of Equation (109) for states comprising additional information:
| \psi \rangle = \sum_{i,j} c_{i,j}\, M(n_i)\, | j \rangle = \sum_j | j \rangle \sum_i c_{i,j}\, M(n_i) .
Instead of a simple superposition of states |j⟩ with associated prefactors c_j as in Equation (109), we have an additional superposition over all monomials M(n_i), comprising the information of all the distinct processes encoded in M(n_i) that lead to the state |j⟩. We can make use of this factorisation property by using nested containers: the key of the outer container is the basis state |j⟩, and its value is an inner container with the monomial M(n_i) as the key and the prefactor c_{i,j} as the value. Thus, the action of a τ-operator on a state |j⟩ can be calculated independently of its action on the monomials M(n_i); for a flat container, we would have to calculate the action on the same state |j⟩ multiple times. To further improve the efficiency of the calculation, we can directly encode in the programme which edge of a graph was active using a bitwise representation. Using the information on the number of edges of a graph and tracking the applied perturbative order during the calculation, we can neglect subcluster contributions on the fly and reduce the computational overhead even further. Therefore, we can directly calculate the reduced contribution on the graph without the need for explicitly subtracting subcluster contributions.
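A minimal sketch of such a nested container and of applying a τ-operator-like map to it; the basis-state labels, amplitudes, and rule are all illustrative:

```python
from collections import defaultdict

# Nested-container sketch: outer key = basis state |j>, inner key =
# monomial M(n_i) as an exponent tuple (exponents add under
# multiplication, M(n1 + n2) = M(n1) M(n2)), value = prefactor c_{i,j}.
state = {"j1": {(1, 0): 0.5, (0, 1): 0.25}}

def apply_op(state, rule):
    """Apply a tau-operator-like map: rule(j) yields triples
    (new_state, amplitude, extra_powers). The action on |j> is computed
    once and reused for every monomial attached to |j>."""
    out = defaultdict(lambda: defaultdict(float))
    for j, monomials in state.items():
        for new_j, amp, extra in rule(j):
            for powers, c in monomials.items():
                key = tuple(p + q for p, q in zip(powers, extra))
                out[new_j][key] += amp * c
    return {j: dict(m) for j, m in out.items()}

# Toy rule: a perturbation active on the first edge maps |j1> -> |j2>.
rule = lambda j: [("j2", 2.0, (1, 0))] if j == "j1" else []
new_state = apply_op(state, rule)  # {"j2": {(2, 0): 1.0, (1, 1): 0.5}}
```

The bitwise edge encoding mentioned above would replace the exponent tuples by integer bit masks, but the container logic stays the same.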
Wrapping things up, we explained how the use of white graphs can be applied to models with long-range interactions resolving the issue of infinitely many graphs or expansion parameters at any finite perturbative order. Instead of treating each perturbation of the long-range interaction as another topological attribute in the classification of graphs, we associate an abstract expansion parameter with each edge of a graph, and only during the embedding on the lattice, we substitute these expansion parameters with the actual bond information of long-range interacting sites. An efficient representation of white graphs can further help to reduce the computational overhead.

4.7. Monte Carlo Embedding of White Graphs

We discussed ways to set up a linked-cluster expansion, either by designing a single appropriately large cluster hosting all relevant fluctuations up to a given perturbative order or by setting up a linked-cluster expansion as a full graph decomposition, where the latter is the more efficient way to perform high-order series expansions. Of course, a quantity M must be cluster-additive in the first place. To decompose M ( L ) on the lattice L into many smaller subcluster contributions, we can add up the contributions:
M(\mathcal{L}) = \sum_{C \subseteq \mathcal{L}} M(C) ,
where the M(C) are reduced contributions on a cluster C to prevent overcounting. It is not necessary to calculate the contribution of every possible cluster since many clusters have the same contribution. It suffices to calculate the contribution of a representative cluster containing only the relevant topological information, that is, a graph defined by its vertex set V_G and its edge set E_G. We can see a graph G as representing an equivalence class [G] whose elements are all clusters C realising all possible embeddings of the graph on the lattice [195]. Figuratively, all elements of the equivalence class are related by translational and rotational symmetry or are different geometric realisations on the lattice. We can now split the sum over all clusters into one sum over all possible graphs on the lattice and another over all elements of the equivalence class:
M(\mathcal{L}) = \sum_{C \subseteq \mathcal{L}} M(C) = \sum_{G \subseteq \mathcal{L}} \sum_{C \in [G]} M(C) ,
where it suffices to calculate M ( C ) once for all C [ G ] [195]. We are left counting the number of elements C in the equivalence class such that we can write
M(\mathcal{L}) = \sum_{G \subseteq \mathcal{L}} W(G, \mathcal{L})\, M(G) ,
where the embedding factor W(G, L) is simply the number of embeddings [195]. The important point in our line of thought is that we calculate the quantity M(G) only once on the graph level, multiply the contribution with a weight, namely the embedding factor W, and sum up the resulting contributions for all graphs to obtain the desired quantity M(L) in the thermodynamic limit. We are essentially left with determining the embedding factors after calculating the graph contributions M(G). Depending on the topology of a graph, the number of possible embeddings and, therefore, the embedding factor differ. When calculating one-quasiparticle processes for the 1qp dispersion or the 1qp spectral weight, it is important that we account for the colour attributes of the graph vertices in the definition of the equivalence class. If conventional graph contributions are considered, we directly obtain the correct contribution M(G). If there are multiple physical parameters or different interaction flavours, the graph edges have to be colour-matched with the bonds of the lattice. If white graphs are used, no colours have to be matched; however, the white-graph contributions have to be evaluated by appropriately substituting the abstract expansion parameters with the correct physical parameters for each embedding. For further reading on the embedding, we point to the standard literature on linked-cluster expansions [152,155,187], as well as Ref. [195]. Nonetheless, we show in the following how the embedding procedure works for models with nearest-neighbour interactions, specifically on the example of the ground-state energy and the one-quasiparticle processes yielding the 1qp dispersion of the Hamiltonian. Likewise, we derive the 1qp spectral weight.
With these findings, we can turn to the embedding problem of long-range interacting models and eventually describe the Monte Carlo algorithm as a solution to this problem.

4.7.1. Conventional Nearest-Neighbour Embedding

Ground-state energy: We start with the simplest case of calculating the ground-state energy on the infinite lattice L . We know that H eff is cluster additive and that the ground-state energy is an extensive quantity, so we can directly calculate
E_0 = \sum_{G \subseteq \mathcal{L}} W(G, \mathcal{L})\, E_0(G) .
In other words, this means we have to multiply the ground-state energy contributions on the graphs with the correct embedding factor and add up the resulting weighted contributions for every graph. For the definition of the embedding factor, we follow the formalism introduced by Ref. [195] and write
W(G, \mathcal{L}) = \frac{|\mathrm{Mono}(G, \mathcal{L})|}{|\mathrm{Auto}(G)|} = \frac{1}{s_G} |\mathrm{Mono}(G, \mathcal{L})| .
The embedding factor W is the number of subgraph isomorphisms (monomorphisms) of the graph G on the lattice L divided by the number of graph automorphisms, i.e., the symmetry number of the graph. It is necessary to divide by the symmetry number s_G because the number of monomorphisms is in general not equal to the number of subgraphs: multiple monomorphisms may map to the identical subgraph [195]. To properly account for this, we divide by the number of automorphisms.
Figure 10 shows an example of the embedding problem. The presence of multiple monomorphisms is indicated by arrows illustrating the ambiguity of mapping onto a specific subgraph embedding. Further, we always have to consider reduced graph contributions, subtracting all subgraph contributions as
E_0(G) = \langle 0 | H_\mathrm{eff} | 0 \rangle - \sum_{G' \subset G} W(G', G)\, E_0(G') .
Here, W(G', G) is the embedding factor for subgraphs G' with respect to the considered graph G. Note again that, with an efficient white-graph implementation, reduced contributions are calculated on the fly without the need for explicit subtraction; we state the subtraction scheme for completeness. Going back to the embedding problem for the infinite lattice L, the embedding factor will be extensive, as the ground-state energy is extensive as well. Usually, intensive quantities such as the energy per site ϵ_0 = E_0/N are calculated, where N is the number of sites, and an arbitrary edge of the graph is fixed on the lattice. This then gives
\epsilon_0 = \sum_{G \subseteq \mathcal{L}} w(G, \mathcal{L})\, E_0(G) ,
where the normalised embedding factor is
w(G, \mathcal{L}) = \frac{W(G, \mathcal{L})}{N} = \frac{q}{s_G} |\mathrm{Mono}(G_c, \mathcal{L}_c)| ,
denoted with a lower-case w, where q is the coordination number of the lattice (the number of neighbours of any site) [155,195]. One can think of fixing sites as colouring two adjacent vertices on the graph and two adjacent sites on the lattice and considering the monomorphisms with respect to the additional colour attributes (for instance, A_V = {{μ, yellow}, {ν, blue}}) of the coloured graph G_c and lattice L_c. In Figure 10, we depict the number of embeddings for a given graph on the square lattice and show how the embedding factor is calculated for this example. Note that it would be equally valid to fix just a single site. Then, one would not have to account for the coordination number q, but the number of monomorphisms would be larger by a factor of q, making it computationally more expensive. In the end, we obtain the ground-state energy (per site) as a high-order perturbative series in an expansion parameter λ up to a computationally feasible maximal order o_max as in Equation (110).
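The counting behind the embedding factor can be sketched by brute force on a small periodic patch of the square lattice standing in for the infinite lattice; the two-edge path graph and the 4 × 4 patch are an illustrative choice, not code from the literature:

```python
from itertools import permutations

# Brute-force sketch of W(G, L) = |Mono(G, L)| / |Auto(G)| for a
# two-edge path graph G on a 4 x 4 square lattice with periodic
# boundaries (a finite stand-in for the infinite lattice).
Lx = Ly = 4
sites = [(x, y) for x in range(Lx) for y in range(Ly)]

def neighbours(s):
    x, y = s
    return {((x + 1) % Lx, y), ((x - 1) % Lx, y),
            (x, (y + 1) % Ly), (x, (y - 1) % Ly)}

V_G = [0, 1, 2]
E_G = [(0, 1), (1, 2)]  # path graph with vertex 1 in the middle

mono = 0
for image in permutations(sites, len(V_G)):  # injective vertex maps
    phi = dict(zip(V_G, image))
    if all(phi[nu] in neighbours(phi[mu]) for mu, nu in E_G):
        mono += 1

s_G = 2                       # automorphisms of the path: identity, reflection
W = mono // s_G               # embedding factor on the finite patch
w_per_site = W // len(sites)  # per-site factor, removing translations
```

The per-site result of 6 matches the number of distinct two-bond paths through any site of the square lattice; production codes obtain such counts from canonical graph tooling rather than brute force.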
1qp dispersion: We now turn to the one-quasiparticle channel and calculate the hopping amplitudes. For a quasiparticle hopping by δ on the lattice with an additional hop within the unit cell from ξ to τ, or a quasiparticle changing its flavour from ξ to τ, we need to calculate
a_{\xi,\tau}(\delta) = \langle 1; i+\delta, \tau | H_\mathrm{eff} | 1; i, \xi \rangle = \sum_{G \subseteq \mathcal{L}} \sum_{(\mu,\nu) \in \mathcal{P}} w(G_c, \mathcal{L}_c)\, t_{\mu;\nu}(G) ,
where we fix the initial vertex μ and the end vertex ν of the quasiparticle process on the graph to the initial site i and the end site i + δ in real space. We can choose an arbitrary site i without loss of generality due to the translational symmetry of the models considered. (If we consider larger unit cells, we also need to fix the sites ξ, τ within the unit cells at i and i + δ.) The fixing of sites can be formally achieved by assigning colours to the initial and end sites on the graph and on the lattice. The second sum runs over all representative hopping processes P, as described above in Section 4.6.2. The embedding factor here is similar to Equation (150) and reads
w(G_c, \mathcal{L}_c) = \frac{s_{G_c}}{s_G} |\mathrm{Mono}(G_c, \mathcal{L}_c)| ,
where the colour attribute comes from fixing the sites to the hopping. We account for the reduced symmetry of coloured graphs stemming from the representative hopping by multiplying by the symmetry number of the coloured graph s G c .
The embedding factor is then calculated with respect to the coloured graph G c and lattice L c . Again, we need to be careful as we have to consider reduced and additive contributions, i.e., we have to determine
t_{\mu;\nu}(G) = \langle 1; \nu | H_\mathrm{eff} | 1; \mu \rangle - \delta_{\mu,\nu}\, E_0(G) - \sum_{G' \subset G} \sum_{(\mu',\nu') \in \mathcal{P}'} W(G'_c, G_c)\, t_{\mu';\nu'}(G') ,
for each contribution on a graph G . After having determined the ground-state energy per site ϵ 0 and the hopping amplitudes a ξ , τ ( δ ) on the lattice L , we can derive the effective one-quasiparticle Hamiltonian (113). As we have seen in Section 4.5, we can readily derive the one-quasiparticle gap as a series in the perturbation parameter λ as in Equation (117).
1qp spectral weight: Lastly, we can do the same for one-quasiparticle spectral weights. We calculate the process amplitudes:
$$ \tilde{a}_{\tau}(\delta) = \langle 1; \delta, \tau | O_{\mathrm{eff},i} | 0 \rangle = \sum_{G \subseteq L} \sum_{(\mu,\nu) \in P} w(G_c, L_c)\, \tilde{t}_{\mu;\nu}(G) . $$
Note that a process creating a quasiparticle at $\mu$ that subsequently hops to $\nu$, with the contribution $\tilde{t}_{\mu;\nu}(G)$, is in general distinct from the inverse process, i.e., $\tilde{t}_{\mu;\nu}(G) \neq \tilde{t}_{\nu;\mu}(G)$. On the other hand, the 1qp hopping processes of the Hamiltonian fulfil $t_{\mu;\nu}(G) = t_{\nu;\mu}(G)$ as long as the hopping amplitudes are real. While for hopping processes of the Hamiltonian we can use the same colour for the start and end vertex as a topological attribute, here we must use two different colours, leading to a smaller symmetry number of the coloured graphs. On the graph level, only subgraph contributions must be subtracted:
$$ \tilde{t}_{\mu;\nu}(G) = \langle 1;\nu | O_{\mathrm{eff},\mu} | 0 \rangle - \sum_{G' \subset G} \sum_{(\mu,\nu) \in P} W(G'_c, G_c)\, \tilde{t}_{\mu;\nu}(G') . $$
With the contributions $\tilde{a}_{\tau}(\delta)$, the spectral weight can be determined with Equation (131), and by evaluating this quantity, for example at the critical momentum $k_c$, we again obtain a series (132) in the perturbation parameter $\lambda$.
For the conventional embedding problem, we can also consider several generalisations, which we briefly mention. First, we could consider quasiparticle processes between different particle types. In a 1qp process, a particle flavour could change from $\xi$ to $\tau$, and the graph contributions would be denoted by including the additional indices, $t_{\mu,\xi;\nu,\tau}$ and $\tilde{t}_{\mu;\nu,\tau}$. The rest of the formalism is identical. Second, we may want to consider different interaction types as in Equation (138) or more than one expansion parameter as in Equation (137). In such a case, when considering coloured graphs, we also have to consider their coloured edges, which must be matched with the appropriate bonds on the lattice. The coloured graphs $G_c$ are then given by the tuple $(V_G, E_G, A_E, A_V)$, where $A_E$ are the edge colour attributes and $A_V$ the vertex colour attributes. Now, every embedding factor $w$ needs to be determined using coloured graphs with respect to $A_E$ and $A_V$. When using white graphs, the additional edge colour can be ignored and the embedding proceeds as before; however, we need to appropriately substitute the abstract expansion parameters with the actual physical expansion parameters.
By now, we have everything together to calculate the ground-state energy, the 1qp dispersion, and the 1qp spectral weight in the thermodynamic limit from a linked-cluster expansion set up as a full graph decomposition. To this end, we need to calculate the embedding factors, multiply them with the associated graph contributions, and add up the resulting weighted contributions. The embedding factors can be determined using available graph libraries like the Boost Graph Library [203], as only automorphisms and monomorphisms (with colours) need to be determined. For long-range interactions, however, we cannot simply calculate the embedding factor because, for every graph, there are infinitely many possible embeddings, even when accounting for translational invariance.

4.7.2. Embedding for Models with Long-Range Interactions

In this section, we restrict ourselves to a single physical perturbation parameter. If we had more than one perturbation parameter, this would introduce an additional functional dependency, which we want to avoid as we perform the Monte Carlo summation. A convenient way around this is to sample the parameters and perform the Monte Carlo summation for each parameter ratio, as was done in [31,35,206] for anisotropic XY interactions, distinct ladder interactions, and XXZ interactions. In the following, we restrict ourselves to a single quasiparticle flavour for simplicity and to a trivial unit cell, as we have not yet generalised the algorithm to larger unit cells. The starting point to describe the embedding problem for long-range interactions is the embedding formulas given in the previous section in Equations (149), (151), and (154) for the ground-state energy per site, the 1qp hopping processes, and the 1qp spectral weight, respectively, which we have to rewrite and adapt for long-range interactions.
Ground-state energy: Starting with the embedding of the ground-state energy per site (149), we can write
$$ \epsilon_0 = \sum_{G} w(G, L)\, E_0(G) = \sum_{o=2}^{o_\mathrm{max}} \sum_{G} w(G, L)\, E_0^{(o)}(G) = \sum_{o=2}^{o_\mathrm{max}} \sum_{n_s=2}^{o} \sum_{\substack{G \\ |V_G| = n_s}} w(G, L)\, E_0^{(o)}(G) . $$

We have replaced $G \subseteq L$ with $G$ in the sum to emphasise that the graphs are no longer restricted to the nearest-neighbour lattice geometry due to the long-range interactions, which make the lattice a fully connected graph. From the first to the second expression, we decomposed the ground-state energy contribution of each graph into contributions of individual orders, $E_0(G) = \sum_{o=2}^{o_\mathrm{max}} E_0^{(o)}(G)$. Note that the minimum order of the ground-state energy is $o = 2$. From the second to the third expression, we introduced a second sum over the number of vertices $n_s$ and restricted the sum over all graphs to a sum over graphs with a fixed number of vertices $n_s$. Next, we can reformulate the embedding factor in Equation (147) as
$$ w(G, L) = \sum_{c \in C} \frac{1}{s_G} , $$
where we replaced the expression for the number of monomorphisms (divided by $N$) with a sum over all possible configurations $C$. A configuration is nothing else than the current embedding of the graph; we think of it as the current set of vertex positions on the lattice (we use “site” also as a synonym for “vertex position”). As we will see later, we can calculate the contributions of multiple graphs simultaneously because the Monte Carlo sum only depends on the number of sites. The sum over all configurations comprises individual sums for each vertex over all lattice sites, excluding configurations where vertices overlap, as shown in the next subsection in Equation (180). One may falsely conclude that a sum over all configurations should result in the un-normalised embedding factor $W$; however, after substituting the abstract expansion parameters with the physical long-range interactions, only the relative distances between sites are relevant for the contribution in the summand, irrespective of the absolute positions of the sites on the lattice. This is also the reason for splitting the graph set into sets of graphs with a fixed number of vertices: we can group all graph contributions with a given number of sites into a single integrand, because for long-range interactions there are no constraints on the embeddings (except for overlaps) and the integrand only depends on the (relative) positions of the vertices. Further, the white graph contributions $E_0^{(o)}(G)$ still need to be given the correct colour, i.e., the general expansion parameters need to be substituted with the algebraically decaying long-range interaction depending on the current configuration. Hence, the contribution $E_0^{(o)}(G, c)$ depends on the current configuration $c$, and replacing the embedding factor with an explicit sum is necessary because the contribution differs for each configuration, so $w(G, L)$ cannot just be a number. The substitution reads
$$ E_0^{(o)}(G, c) := E_0^{(o)}\big(G; \{\lambda_e \to |\delta|^{-(d+\sigma)}\}\big) = \sum_{m} v_m(G) \prod_{e \in E_G} \frac{1}{|\delta|^{n_{e,m}(d+\sigma)}} , $$
where the index $m$ of the sum runs over all monomials of the contribution $E_0^{(o)}(G, c)$ and the product runs over all edges $e = \{\mu, \nu\} \in E_G$ of the graph $G$ with adjacent vertices $\mu, \nu \in V_G$. The power law in the product arises from substituting the expansion parameters $\lambda_e \to |\delta|^{-\alpha}$ on the edges $e$ with the appropriate algebraically decaying long-range interaction of the current embedding (cf. Equation (140)). The adjacent vertices $\mu, \nu$ are embedded on the lattice sites $i_\mu$ and $i_\nu$ with the distance $\delta = i_\nu - i_\mu$, and the multiplicity $n_{e,m} \in \mathbb{N}$ comes from the power $n_e$ of the associated expansion parameter $\lambda_e$. This way, we can reduce the many expansion parameters of the white graph contribution to a single physical perturbation parameter $\lambda$, and by reordering the expression (156) for the ground-state energy, we have
$$ \epsilon_0 = \sum_{o=2}^{o_\mathrm{max}} \sum_{n_s=2}^{o} \sum_{c \in C} \sum_{\substack{G \\ |V_G| = n_s}} \frac{1}{s_G}\, E_0^{(o)}(G, c)\, \lambda^o . $$
We can define
$$ f_{n_s}^{(o)}(c) := \sum_{\substack{G \\ |V_G| = n_s}} \frac{1}{s_G}\, E_0^{(o)}(G, c) , $$
$$ S[f_{n_s}^{(o)}] := \sum_{c \in C} f_{n_s}^{(o)}(c) , $$
where f n s ( o ) ( c ) is the integrand function and S [ · ] denotes the associated sum over all possible embeddings on the lattice that will be evaluated using a classical Monte Carlo algorithm. Since Monte Carlo runs are usually performed for a batch of different seeds, we introduce an additional sum over seeds averaging the Monte Carlo runs, which we denote as
$$ \bar{S}[\,\cdot\,] = \frac{1}{N_\mathrm{seeds}} \sum_{s=1}^{N_\mathrm{seeds}} S[\,\cdot\,] . $$
Eventually, this yields the expression:
$$ \epsilon_0 = \sum_{o=2}^{o_\mathrm{max}} \sum_{n_s=2}^{o} \bar{S}[f_{n_s}^{(o)}]\, \lambda^o . $$
To express the ground-state energy ϵ 0 as a perturbative series, we write
$$ \epsilon_0 = p_0 + \sum_{o=2}^{o_\mathrm{max}} p_o \lambda^o \quad \text{with} \quad p_o = \sum_{n_s=2}^{o} \bar{S}[f_{n_s}^{(o)}] , $$

where we have to sum up the contributions of multiple Monte Carlo runs to obtain the series prefactors $p_o$ for $o > 0$. The zeroth prefactor $p_0$ is simply given by the ground-state energy of $H_0$. As a result of the Monte Carlo summation, these prefactors $p_o$ carry an uncertainty that can be estimated from the standard deviation obtained by averaging over the results for all seeds.
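As a minimal illustration of this last step, the seed average $\bar{S}[\,\cdot\,]$ with its uncertainty and the sum over $n_s$ entering a prefactor could be computed as follows. This is only a sketch under the assumption that the individual run results are already available as plain numbers; the function names are ours and not part of any published implementation:

```python
import statistics

def seed_average(run_results):
    """Average the Monte Carlo estimates S[f] of independent seeds.

    Returns the mean estimate and the standard error of the mean,
    which serves as the uncertainty of this contribution."""
    n = len(run_results)
    mean = statistics.fmean(run_results)
    stderr = statistics.stdev(run_results) / n ** 0.5 if n > 1 else float("inf")
    return mean, stderr

def series_prefactor(per_ns_results):
    """Sum the seed-averaged sums over the number of sites n_s to obtain
    one series prefactor; the seed uncertainties add in quadrature."""
    means, errors = zip(*(seed_average(runs) for runs in per_ns_results))
    return sum(means), sum(e * e for e in errors) ** 0.5
```
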
1qp dispersion: We now turn to extracting the one-quasiparticle dispersion. In Equation (116), we have seen that the dispersion can be determined analytically in terms of the hopping amplitudes of Equation (151) up to some perturbative order. For nearest-neighbour interactions, it is an analytic function in $k$; for long-range interactions, however, there are infinitely many possible hoppings at any order, so we can neither explicitly determine the hopping amplitudes $a(\delta)$, nor is it possible to obtain the dispersion as an analytical function in $k$, as this would introduce a functional dependence in the integrand that we cannot sample. Instead, we have to evaluate the dispersion at certain momenta $k'$ to obtain an explicit series in the perturbation parameter $\lambda$. Evaluating Equation (116) at $k = k'$ and inserting Equation (151), we can write

$$ \omega(k = k') = a(0) + 2 \sum_{\delta > 0} a(\delta) \cos(k' \delta) = \sum_{\delta} \Xi(k', \delta)\, a(\delta) = \sum_{\delta} \Xi(k', \delta) \sum_{G} \sum_{(\mu,\nu) \in P} w(G_c, L_c)\, t_{\mu;\nu}(G) , $$
where we rewrote the sum over δ by introducing the function:
$$ \Xi(k', \delta) = \begin{cases} 1 & \delta = 0 , \\ 2 \cos(k' \delta) & \delta \neq 0 . \end{cases} $$
Again, we can split $t_{\mu;\nu}$ into the contributions of individual orders $t_{\mu;\nu}^{(o)}$ and split the graph set into subsets with a fixed number of sites $n_s = |V_G|$, yielding

$$ \omega(k') = \sum_{o=1}^{o_\mathrm{max}} \sum_{n_s=2}^{o+1} \sum_{\substack{G \\ |V_G| = n_s}} \sum_{(\mu,\nu) \in P} \sum_{\delta} \Xi(k', \delta)\, w(G_c, L_c)\, t_{\mu;\nu}^{(o)}(G) . $$
Note here that the second sum now runs up to $o + 1$, while for the ground-state energy, it runs only up to $o$. The maximal number of sites at a given order is $o + 1$ because graphs with $o$ edges can have at most $o + 1$ sites. For the ground-state energy fluctuations, every quasiparticle that is created has to be annihilated again, so at order $o$, a process can only touch at most $o - 1$ edges, which restricts the sum to $o$ sites.
Now, we argue that we can drop the sum over $\delta$ by thinking differently about this embedding problem for the dispersion. The information about the start and end vertex of the hopping process is encoded in vertex colours, and when finding the subgraph monomorphisms for the embedding on the infinite lattice $L$, the colours of the vertices must match the coloured sites on the lattice, i.e., the hopping vertices are fixed to the hopping sites on the lattice. Since the long-range interactions allow hopping over any distance at any order, it is not useful to think in this picture. Instead, we should think about the embedding problem analogously to the one for the ground-state energy, where no such hopping constraint exists and the embedding factor $W$ is simply proportional to a sum over all configurations. This is valid as we let the sum run over all lattice sites and account for constraints on $\delta$ by multiplying with the symmetry number of the coloured graph $s_{G_c}$. The relevant hopping information of the vertices, which was previously fixed by coloured vertices, is encoded in the cosine terms anyway. Hence, we can make the substitution:

$$ \sum_{\delta} \Xi(k', \delta)\, w(G_c, L_c) = \sum_{c \in C} \frac{s_{G_c}}{s_G} \cos(k' \delta) , $$
where we account for the reduced symmetry of the graph due to the hopping by multiplying with the symmetry number of the coloured graph $s_{G_c}$. As before, we need to substitute the general white graph contribution with the actual algebraically decaying long-range interactions of the current embedding:
$$ t_{\mu;\nu}^{(o)}(G, c) := t_{\mu;\nu}^{(o)}\big(G; \{\lambda_e \to |\delta|^{-\alpha}\}\big) = \sum_{m} v_m(G) \prod_{e \in E_G} \frac{1}{|\delta|^{n_{e,m}\alpha}} . $$
Inserting Equation (168) into Equation (167), we end up with the expression:
$$ \omega(k') = \sum_{o=1}^{o_\mathrm{max}} \sum_{n_s=2}^{o+1} \sum_{c \in C} \sum_{\substack{G \\ |V_G| = n_s}} \sum_{(\mu,\nu) \in P} \frac{s_{G_c}}{s_G}\, t_{\mu;\nu}^{(o)}(G, c) \cos(k' \delta)\, \lambda^o . $$
For a lighter notation, we again define the integrand function and the Monte Carlo sum:
$$ f_{n_s,k'}^{(o)}(c) := \sum_{\substack{G \\ |V_G| = n_s}} \sum_{(\mu,\nu) \in P} \frac{s_{G_c}}{s_G}\, t_{\mu;\nu}^{(o)}(G, c) \cos(k' \delta) , $$
$$ S[f_{n_s,k'}^{(o)}] := \sum_{c \in C} f_{n_s,k'}^{(o)}(c) . $$
Introducing an average over a batch of seeds for the MC sum S ¯ [ · ] , we obtain
$$ \omega(k') = \sum_{o=1}^{o_\mathrm{max}} \sum_{n_s=2}^{o+1} \bar{S}[f_{n_s,k'}^{(o)}]\, \lambda^o . $$
The perturbative series of the dispersion evaluated at $k = k'$ can then be expressed as

$$ \omega(k') = p_0 + \sum_{o=1}^{o_\mathrm{max}} p_o \lambda^o \quad \text{with} \quad p_o = \sum_{n_s=2}^{o+1} \bar{S}[f_{n_s,k'}^{(o)}] . $$

The series prefactors $p_o$ for $o > 0$ can be determined by summing up the individual contributions from the Monte Carlo runs, and the prefactor $p_0$ is given by the energy gap of $H_0$.
1qp spectral weight: Lastly, the evaluation for the spectral weight observable is analogous to the 1qp dispersion. The integrand and Monte Carlo sum are defined as
$$ f_{n_s,k'}^{(o)}(c) := \sum_{\substack{G \\ |V_G| = n_s}} \sum_{(\mu,\nu) \in P} \frac{s_{G_c}}{s_G}\, \tilde{t}_{\mu;\nu}^{(o)}(G, c) \cos(k' \delta) , $$
$$ S[f_{n_s,k'}^{(o)}] := \sum_{c \in C} f_{n_s,k'}^{(o)}(c) , $$
which we use to calculate
$$ s(k') = p_0 + \sum_{o=1}^{o_\mathrm{max}} p_o \lambda^o \quad \text{with} \quad p_o = \sum_{n_s=2}^{o+1} \bar{S}[f_{n_s,k'}^{(o)}] . $$

We determine $s(k')$ by again calculating the series prefactors $p_o$ for $o > 0$ and then obtain the 1qp spectral weight as $S_{1\mathrm{qp}}(k') = |s(k')|^2$.
It should be noted that we have to perform

$$ n = \frac{o_\mathrm{max}(o_\mathrm{max} - 1)}{2} \times N_\mathrm{seeds} $$

Monte Carlo runs for a series of order $o_\mathrm{max}$ with $N_\mathrm{seeds}$ seeds. This means that the number of runs grows quadratically with the maximal order $o_\mathrm{max}$.
So far, we have derived the necessary formalism for how to express the embedding problem of models with long-range interactions, but we have not talked about how to evaluate the Monte Carlo sums S [ · ] for the integrand functions f. In the next section, we investigate how to evaluate such sums by introducing a suitable Monte Carlo algorithm.

4.7.3. Monte Carlo Algorithm for the Long-Range Embedding Problem

We are left with evaluating the Monte Carlo sum $S[\,\cdot\,]$, which runs over all configurations $C$ of graphs. The embeddings on the lattice depend only on the number of vertices of a graph $G$; there is no constraint from the edge set as in the nearest-neighbour case, because every site of the lattice interacts with every other site, making it a fully connected graph. The only restriction for models with long-range interactions is that the vertices of a graph are not allowed to overlap, as sites do not self-interact and an overlap would imply an infinite interaction strength resulting from the term $|\delta|^{-(d+\sigma)}$. Consequently, we can write the sum over all possible configurations as number-of-sites-many sums over all possible lattice positions:
$$ S[f_{n_s}^{(o)}] = \sum_{c \in C} f_{n_s}^{(o)}(c) = {\sum_{i_1}}' \cdots {\sum_{i_{n_s}}}' f_{n_s}^{(o)}(i_1, \ldots, i_{n_s}) , $$
$$ S[f_{n_s,k'}^{(o)}] = \sum_{c \in C} f_{n_s,k'}^{(o)}(c) = {\sum_{i_1}}' \cdots {\sum_{i_{n_s}}}' f_{n_s,k'}^{(o)}(i_1, \ldots, i_{n_s}) , $$

where the primed sum ${\sum_i}'$ over the vertex position $i$ is a short notation for excluding overlaps with any other vertex position. For example, for a ground-state contribution with three sites on a one-dimensional chain, this sum reads

$$ S[f_{n_s}^{(o)}] = \sum_{k=-\infty}^{\infty} \sum_{\substack{j=-\infty \\ j \neq k}}^{\infty} \sum_{\substack{i=-\infty \\ i \neq j,\, i \neq k}}^{\infty} f_{n_s}^{(o)}(i, j, k) . $$
Due to the overlap constraint, these are nested high-dimensional sums over the integrand functions $f_{n_s}^{(o)}$, which are in general hard to evaluate. The dimensionality of the MC problem is given by $d_\mathrm{sum} = n_s \cdot d$ because higher dimensions $d$ of the system introduce additional sums for each component. If we wanted to evaluate the Monte Carlo sum in two dimensions for contributions with eight sites, which already occur in eighth-order perturbation theory, we would have to determine the value of 16 nested sums over the integrand function $f_8^{(8)}$. This makes it clear that the evaluation of such sums becomes challenging very quickly. In the following, we use the short notation $f_{n_s}$ interchangeably for both the ground-state energy integrand $f_{n_s}^{(o)}$ and the 1qp process integrands $f_{n_s,k'}^{(o)}$.
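To make the structure of these primed sums concrete, the following brute-force sketch evaluates a truncated version of such a nested sum in one dimension, skipping overlapping configurations. The integrand and the cutoff are purely illustrative assumptions of ours; in practice, the untruncated sums are exactly what the Monte Carlo algorithm below is needed for:

```python
from itertools import product

def nested_primed_sum(f, n_s, cutoff):
    """Truncated brute-force evaluation of the nested primed sums:
    every vertex position runs over [-cutoff, cutoff], and configurations
    in which two vertices overlap are excluded (the 'prime')."""
    total = 0.0
    sites = range(-cutoff, cutoff + 1)
    for config in product(sites, repeat=n_s):
        if len(set(config)) < n_s:  # overlap: excluded by the primed sum
            continue
        total += f(config)
    return total

# Illustrative three-site integrand decaying with the distances between
# neighbouring (sorted) positions, mimicking a chain-graph contribution.
def f_chain(config, d_plus_sigma=3.0):
    i, j, k = sorted(config)
    return 1.0 / ((j - i) ** d_plus_sigma * (k - j) ** d_plus_sigma)
```

The cost grows as $(2\,\mathrm{cutoff}+1)^{n_s \cdot d}$, which is precisely what motivates Monte Carlo sampling.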
The first approach to tackle the problem of evaluating these sums using conventional numerical integration techniques was pioneered by S. Fey and K.P. Schmidt in 2016 [25]. They successfully determined the phase diagram and critical exponents from the closing of the gap of the long-range transverse-field Ising model on a one-dimensional chain with ferro- and antiferromagnetic interactions. While their approach was successful over a large range of decay exponents in one dimension, the extraction of the critical properties for small decay exponents was challenging. The two-dimensional problem was out of reach with this approach, as the number of nested sums doubles and the sums converge significantly more slowly. Here, Monte Carlo integration comes into play, as it is known to be a powerful integration technique for high-dimensional problems where conventional integration fails. The reason behind the slow convergence of such high-dimensional sums is often that the part of configuration space where the integrand contributes significantly to the integral is much smaller than the entire configuration space. In 2019, Fey et al. [29] introduced a Markov chain Monte Carlo algorithm to efficiently sample the relevant configuration space. They were able to determine the quantum-critical properties of the long-range TFIM on two-dimensional lattices to even higher precision than previously achieved for the one-dimensional chain, extending the accessible range of decay exponents without having to forfeit perturbative orders in higher dimensions.
In the following, we describe the Markov chain Monte Carlo algorithm introduced in Ref. [29] to evaluate the high-dimensional nested sums. To sample the relevant configuration space efficiently, we use importance sampling with a convenient probability weight $\pi(c)$ for a configuration $c$ and the associated partition function $Z = \sum_c \pi(c)$. We can insert an identity into Equation (179) or Equation (180) and rewrite it as

$$ S[f_{n_s}] = \sum_{c \in C} \frac{\pi(c)}{Z}\, \frac{Z}{\pi(c)}\, f_{n_s}(c) = Z \left\langle \frac{f_{n_s}(c)}{\pi(c)} \right\rangle_{\pi} , $$
where $\pi(c)/Z$ can be interpreted as the probability of being in configuration $c$. The integrand now reads as the contribution $f_{n_s}$ (we dropped the order $o$ and momentum $k'$ as indices to lighten the notation) of configuration $c$ multiplied by its probability, which allows us to write the sum as $Z$ times the expectation value $\langle \cdot \rangle_{\pi}$ of $f_{n_s}(c)/\pi(c)$ with respect to the weight $\pi$. We call this sum the “target sum”. Since the partition function $Z$ is not known a priori, we also introduce a “reference sum”:

$$ S[f_{n_s}^\mathrm{ref}] = \sum_{c \in C} f_{n_s}^\mathrm{ref}(c) = Z \left\langle \frac{f_{n_s}^\mathrm{ref}(c)}{\pi(c)} \right\rangle_{\pi} $$
over a reference function $f_{n_s}^\mathrm{ref}$. We require this sum to be analytically solvable to avoid introducing an additional source of error; we denote its analytical value by $S_{n_s}^\mathrm{ref}$. The reference function $f_{n_s}^\mathrm{ref}$ should behave similarly to the integrand function of interest $f_{n_s}$: the reference and target sums should have considerable contributions in the same region of configuration space, and their asymptotic behaviour should be similar as well. Although we could, in principle, make an arbitrary choice for the reference function, these properties ensure good convergence. In one dimension, we choose the reference integrand

$$ f_{n_s}^\mathrm{ref}(c) = \prod_{n=1}^{n_s - 1} \frac{1}{|i_{n+1} - i_n|^{\rho}} = \prod_{n=1}^{n_s - 1} \frac{1}{|\delta_n|^{\rho}} $$
with $\delta_n = i_{n+1} - i_n$ and the reference integrand exponent $\rho$, which is a free simulation parameter. We can solve the reference sum as follows:

$$ S[f_{n_s}^\mathrm{ref}] = \sum_{c \in C} \prod_{n=1}^{n_s-1} \frac{1}{|i_{n+1} - i_n|^{\rho}} = {\sum_{i_1}}' \cdots {\sum_{i_{n_s}}}' \frac{1}{|i_2 - i_1|^{\rho}} \cdots \frac{1}{|i_{n_s} - i_{n_s-1}|^{\rho}} = n_s! \sum_{i_1 < i_2} \cdots \sum_{i_{n_s-1} < i_{n_s}} \frac{1}{|\delta_1|^{\rho}} \cdots \frac{1}{|\delta_{n_s-1}|^{\rho}} = n_s! \prod_{n=1}^{n_s-1} \sum_{\delta=1}^{\infty} \frac{1}{\delta^{\rho}} = n_s!\, \zeta(\rho)^{n_s-1} , $$

where $\zeta(\rho)$ is the Riemann $\zeta$ function and the factor $n_s!$ accounts for the number of possibilities to embed the vertices once the indices of the sums are ordered. One major difference between the reference and the target sum is that many different graph contributions enter the target sum. In fact, the reference sum above is exactly the contribution of order $o = n_s - 1$ of a chain graph with $n_s$ vertices, and the contribution of the associated target sum is the same up to a linear factor. In higher dimensions, we can no longer choose a contribution proportional to that of a chain graph, since it cannot be solved analytically. Instead, we simplify the reference sum, requiring that it still captures the same properties as the target sum. We choose to decouple the dimensions in the reference integrand:
$$ f_{n_s}^\mathrm{ref}(c) = \prod_{n=1}^{d} \prod_{\nu=1}^{n_s-1} \frac{1}{(1 + |i_{\nu+1,n} - i_{\nu,n}|)^{\rho}} = \prod_{n=1}^{d} \prod_{\nu=1}^{n_s-1} \frac{1}{(1 + |\delta_{\nu,n}|)^{\rho}} $$
and explicitly allow overlaps in the reference sum, such that it can be solved analytically as follows:
$$ S[f_{n_s}^\mathrm{ref}] = \sum_{i_{1,1} = -\infty}^{\infty} \cdots \sum_{i_{n_s,d} = -\infty}^{\infty} \frac{1}{(1 + |i_{2,1} - i_{1,1}|)^{\rho}} \cdots \frac{1}{(1 + |i_{n_s,d} - i_{n_s-1,d}|)^{\rho}} = \prod_{n=1}^{d(n_s-1)} \sum_{\delta=-\infty}^{\infty} \frac{1}{(1 + |\delta|)^{\rho}} = \prod_{n=1}^{d(n_s-1)} \left( 2 \sum_{\delta=0}^{\infty} \frac{1}{(1 + \delta)^{\rho}} - 1 \right) = \big( 2 \zeta(\rho) - 1 \big)^{d(n_s-1)} . $$
Although the exponent $\rho$ can be chosen freely, we want to achieve an asymptotic behaviour similar to that of the target integrand; therefore, we choose $\rho = 1 + \sigma$ in one dimension, where the reference sum behaves exactly like the target integrand of a chain graph. In two dimensions, where we made the above simplifications to the reference sum, we have to adapt the parameter to $\rho = (d + \sigma)/2$ for $\sigma < 5$, $\rho = 3$ for $5 \leq \sigma < 7$, and $\rho = 3.5$ for $\sigma \geq 7$. This is by no means a rigorous choice, but it has empirically proven to produce good convergence [29,207]. Solving Equation (183) for $Z$ and inserting it into Equation (182), we obtain
$$ S[f_{n_s}] = \frac{\left\langle \frac{f_{n_s}(c)}{\pi(c)} \right\rangle_{\pi}}{\left\langle \frac{f_{n_s}^\mathrm{ref}(c)}{\pi(c)} \right\rangle_{\pi}}\; S_{n_s}^\mathrm{ref} . $$
We use the analytic expression of Equation (185) in 1D or Equation (187) in 2D for the reference sum $S_{n_s}^\mathrm{ref}$. The partition function $Z$ has dropped out, and we can now use this expression in our Monte Carlo run to determine the sum $S[\,\cdot\,]$ from the analytic value of the reference sum $S_{n_s}^\mathrm{ref}$ while tracking the running averages of the numerator and denominator.
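The analytic reference values entering this estimator can be cross-checked numerically. The sketch below verifies the building block of Equation (187), $\sum_{\delta=-\infty}^{\infty} (1+|\delta|)^{-\rho} = 2\zeta(\rho) - 1$, using a truncated $\zeta$ function with an integral tail correction; the truncation parameters are our choice and not part of the original implementation:

```python
def zeta(s, terms=200_000):
    """Truncated Riemann zeta function with an integral tail correction:
    zeta(s) ~ sum_{n<=N} n^-s + N^(1-s)/(s-1)."""
    partial = sum(n ** -s for n in range(1, terms + 1))
    return partial + terms ** (1 - s) / (s - 1)

def double_sided_sum(rho, cutoff=200_000):
    """Direct truncated evaluation of sum over all integers delta of
    (1 + |delta|)^-rho, the per-component factor of the 2D reference sum."""
    return 1.0 + 2.0 * sum((1 + d) ** -rho for d in range(1, cutoff + 1))

def reference_sum_higher_d(rho, n_s, d=2):
    """Closed form (2 zeta(rho) - 1)^(d (n_s - 1)) of Equation (187)."""
    return (2.0 * zeta(rho) - 1.0) ** (d * (n_s - 1))
```
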
We are left with just one missing ingredient, the choice of the probability weight $\pi(c)$. For $\pi$ to be a probability weight, it must fulfil $\pi(c) \geq 0$, and we want the weight to be largest where both the reference and the target integrand contribute the most. An obvious choice may be the quadratic mean:

$$ \pi(c) = \sqrt{f_{n_s}^\mathrm{ref}(c)^2 + f_{n_s}(c)^2} , $$

which is always $\geq 0$ and rotationally invariant in $f_{n_s}^\mathrm{ref}$ and $f_{n_s}$. However, we also want both quantities to contribute equally to the probability weight on average over all configurations. As the contributions of the target and reference sum may differ significantly, we introduce a rescaling factor:
$$ R = \frac{S[f_{n_s}^\mathrm{ref}]}{S[f_{n_s}]} , $$
which can be estimated in an in-advance calibration run. We then use an adjusted probability weight:
$$ \pi(c) = \sqrt{f_{n_s}^\mathrm{ref}(c)^2 + R^2 f_{n_s}(c)^2} $$
for the actual Monte Carlo run. The weight needs to be evaluated at every Monte Carlo step to track the running averages of the numerator and denominator in Equation (188).
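A hedged sketch of these two ingredients, the weight $\pi(c)$ of Equation (191) and a rough calibration of the rescaling factor $R$, might look as follows; estimating $R$ from raw sampled values of a short pre-run is our simplification of the calibration run described in the text:

```python
import math

def weight(f_ref_value, f_value, R=1.0):
    """Probability weight pi(c) = sqrt(f_ref(c)^2 + R^2 f(c)^2)."""
    return math.hypot(f_ref_value, R * f_value)

def calibrate_rescaling(ref_samples, target_samples):
    """Crude estimate of R = S[f_ref]/S[f] from pre-run samples
    (a simplification of the in-advance calibration run)."""
    return sum(ref_samples) / sum(target_samples)
```
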
Now, we have everything together to describe the actual Monte Carlo algorithm. We employ a Markov chain Monte Carlo algorithm, sampling the configuration space $C$ according to the probability weight $\pi$ in Equation (191). Every configuration with a non-zero weight must in principle be accessible by the Markov chain. On the one hand, we want a high acceptance rate of the Monte Carlo steps to sample the large configuration space efficiently, without staying in the same configuration for too long. On the other hand, we want to sample closely confined configurations, with rather small distances between vertices, more often than configurations in which the vertices are far apart, such that we capture the asymptotic behaviour of the model. The interaction strength decays algebraically with the distance between vertices, leading to smaller contributions for configurations in which sites are far apart. What we call a confined configuration therefore depends on the decay exponent $\sigma$ of the long-range interaction. In the algorithm, we have the free exponent parameters $\rho$ (for the reference sum) and $\gamma$ (for the proposal probability distributions), which can be tuned to tweak this behaviour, but are usually chosen similar or equal to $d + \sigma$. In two dimensions, we had to adapt the value of $\rho$ to obtain a similar asymptotic behaviour of the reference and target sum due to the approximations we made. An optimal choice of these parameters ensures fast convergence of the Monte Carlo sum. The current embedding configuration should be represented as an array container whose entries are the positions of the graph vertices. In one dimension, the entries are simple integers, while in higher dimensions, each position needs to be represented as a mathematical vector. For small decay exponents $\sigma$, very large distances can occasionally occur between vertices, which need to be squared when calculating the absolute distance. This can lead to an integer overflow, and therefore, the use of a 128-bit integer may be considered. Further, we define functions in the programme for the target integrand $f_{n_s}$ and for the reference integrand $f_{n_s}^\mathrm{ref}$, where the current configuration is passed as a parameter and the function returns the contribution of the integrand evaluated for this configuration.
Turning back to the sampling scheme, the idea of the Markov chain is to interpret the vertex positions on the lattices as random walkers. We randomly select a graph vertex and then draw a new position from a probability distribution such that the move fulfils the detailed balance condition. In each Monte Carlo step, we perform the following two moves:
Shift   move :
This Monte Carlo move introduces confined random fluctuations to the current configuration, independent of the strength of the algebraically decaying long-range interactions. It is especially important for larger decay exponents $\sigma$, when the configurations are much more likely to be confined. First, we randomly select a vertex $n_\mathrm{sel} \in \{1, \ldots, n_s\}$ from a discrete uniform distribution with $p_\mathrm{sel} = 1/n_s$. Second, for the fluctuation, we draw a shift value $d^\mathrm{prop} \in \{-n_s, \ldots, n_s\}$ from a discrete uniform distribution with $p_\mathrm{shift} = 1/(2 n_s + 1)$. In one dimension, we draw a single time, and in higher dimensions, we draw repeatedly for each component. Subsequently, we add the shift to the position of the selected vertex and propose the position:

$$ i^\mathrm{prop} = i_{n_\mathrm{sel}} + d^\mathrm{prop} . $$
We might have proposed a position that is already occupied by another vertex, so we have to check for overlaps. In one dimension, we reset the proposed position to the original one if there is an overlap, while in higher dimensions, we explicitly allow overlaps. As noted above, this distinction is also present in the reference sums, cf. the one-dimensional Equation (185) and the higher-dimensional Equation (187). If an overlap occurs in dimensions higher than one, the target summand is explicitly set to zero such that these configurations cannot contribute (otherwise, the sum would diverge). Then, we calculate the Metropolis acceptance probability:

$$ p_\mathrm{acc}^\mathrm{shift} = \min\left( 1, \frac{\pi(c^\mathrm{prop})}{\pi(c^\mathrm{curr})} \right) = \min\left( 1, \sqrt{\frac{ \big(f_{n_s}^\mathrm{ref}(c^\mathrm{prop})\big)^2 + R^2 \big(f_{n_s}(c^\mathrm{prop})\big)^2 }{ \big(f_{n_s}^\mathrm{ref}(c^\mathrm{curr})\big)^2 + R^2 \big(f_{n_s}(c^\mathrm{curr})\big)^2 }} \right) $$

by determining the probability weights $\pi$ of the current and the proposed configuration. The results of the target and reference function calls should be saved into variables to prevent redundant and expensive function calls at each Monte Carlo step. Note that the transition weights $\tilde{T}(c^\mathrm{curr} \to c^\mathrm{prop}) = p_\mathrm{sel} \times p_\mathrm{shift}$ cancel out as we draw only from uniform distributions. Lastly, the minimum function is implemented by drawing a random number $y \in [0, 1)$, and we accept the proposed move and update the current configuration if $y < p_\mathrm{acc}^\mathrm{shift}$. An example of such a shift move is depicted in Figure 11a.
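In code, a one-dimensional shift move could be sketched roughly as below. The configuration is a list of integer positions, `f` and `f_ref` are the target and reference integrand functions, and the overlap handling follows the one-dimensional convention of resetting the proposal; all names are ours, not from the published implementation:

```python
import math
import random

def shift_move(config, f, f_ref, R, rng):
    """One Metropolis shift move on a 1D configuration (list of int positions).

    A vertex and a uniform shift in {-n_s, ..., n_s} are drawn; in 1D,
    a proposal overlapping another vertex is reset to the original position."""
    n_s = len(config)
    sel = rng.randrange(n_s)
    proposal = config[sel] + rng.randint(-n_s, n_s)
    if proposal in config:  # overlap (or zero shift): keep current configuration
        return config
    new = config.copy()
    new[sel] = proposal
    pi_curr = math.hypot(f_ref(config), R * f(config))
    pi_prop = math.hypot(f_ref(new), R * f(new))
    # uniform proposal probabilities cancel in the Metropolis ratio
    if rng.random() < min(1.0, pi_prop / pi_curr):
        return new
    return config
```
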
Rift   move :
In contrast to the previous move, which introduces fluctuations independent of the current configuration and of the long-range interaction strength, “rift moves” are introduced to better capture the correct asymptotic behaviour induced by the algebraically decaying interactions. These moves can propose very large distances between vertices, but they can also do the opposite, closing the “rift” between vertices when the configuration is split into essentially two clusters. At first, we select a vertex $n_\mathrm{sel} \in \{1, \ldots, n_s - 1\}$ from the vertex set with discrete uniform probability $p_\mathrm{sel} = 1/(n_s - 1)$, explicitly excluding the last vertex. In one dimension, we can order the vertex set such that the first vertex is the one with the smallest positional value and the last the one with the largest value, i.e., we order by $i_n < i_m$, where $n, m$ are vertex indices and $i_n, i_m$ the associated sites on the lattice. The same ordering was also used when we solved the reference sum in Equation (185). In higher dimensions, a similar ordering comes at a much higher computational cost, so we stick to the vertex numbering given by the array indices, i.e., the order is $n < m$. Here, it is also important that the vertex labelling of the reference sum coincides with the labelling of the chain graph. To capture the physical asymptotics of the system, we draw random values from a $\zeta$-function distribution. In one dimension, we draw from

$$ p^\mathrm{rift}(r^\mathrm{prop}) = \frac{(r^\mathrm{prop})^{-\gamma}}{\zeta(\gamma)} , $$

a power-law distribution for $r^\mathrm{prop} > 0$ with the free exponent parameter $\gamma$, which we choose as $\gamma = d + \sigma$ for obvious reasons. The distance to the next vertex is given by $r^\mathrm{curr} = i_{n_\mathrm{sel}+1} - i_{n_\mathrm{sel}}$, and $r^\mathrm{prop}$ is the proposed distance drawn from the $\zeta$ distribution. Since we ordered by position and only select vertices in $\{1, \ldots, n_s - 1\}$, it is sufficient to draw positive values only. We shift all positions $i_n > i_{n_\mathrm{sel}}$ according to

$$ i_n^\mathrm{prop} = i_n + (r^\mathrm{prop} - r^\mathrm{curr}) . $$
In higher dimensions, we have no such ordering and, therefore, extend such a distribution to negative values (we refer to it as a “double-sided” ζ -function distribution) and draw random values from
$$ p^\mathrm{rift}(r^\mathrm{prop}) = \frac{(1 + |r^\mathrm{prop}|)^{-\gamma}}{2 \zeta(\gamma) - 1} $$
for each component. Note that the additional one is introduced to prevent divergence when sites overlap. After drawing the new distance r prop , we shift all vertices componentwise with n > n sel according to
$$ i_{n > n_\mathrm{sel}}^\mathrm{prop} = i_{n > n_\mathrm{sel}} + (r^\mathrm{prop} - r^\mathrm{curr}) . $$
The underlying idea is that, if there is a large distance between two vertices i n sel and i n sel + 1 , we can close the “rift” of the entire configuration instead of introducing a new one between i n sel + 1 and i n sel + 2 . The transition weights for this move are given by
$$\tilde{T}(c^{\mathrm{curr}} \to c^{\mathrm{prop}}) = p_{\mathrm{sel}} \, p(r^{\mathrm{prop}}), \qquad \tilde{T}(c^{\mathrm{prop}} \to c^{\mathrm{curr}}) = p_{\mathrm{sel}} \, p(r^{\mathrm{curr}}).$$
With these, we can calculate the Metropolis–Hastings acceptance probability in one dimension:
$$p_{\mathrm{acc}}^{\mathrm{rift}} = \min\left(1, \frac{\pi(c^{\mathrm{prop}})}{\pi(c^{\mathrm{curr}})} \, \frac{\tilde{T}(c^{\mathrm{prop}} \to c^{\mathrm{curr}})}{\tilde{T}(c^{\mathrm{curr}} \to c^{\mathrm{prop}})}\right) = \min\left(1, \frac{\left(f_{n_s}^{\mathrm{ref}}(c^{\mathrm{prop}})\right)^2 + R^2 \left(f_{n_s}(c^{\mathrm{prop}})\right)^2}{\left(f_{n_s}^{\mathrm{ref}}(c^{\mathrm{curr}})\right)^2 + R^2 \left(f_{n_s}(c^{\mathrm{curr}})\right)^2} \, \frac{(r^{\mathrm{prop}})^{\gamma}}{(r^{\mathrm{curr}})^{\gamma}}\right)$$
and, likewise, in higher dimensions:
$$p_{\mathrm{acc}}^{\mathrm{rift}} = \min\left(1, \frac{\left(f_{n_s}^{\mathrm{ref}}(c^{\mathrm{prop}})\right)^2 + R^2 \left(f_{n_s}(c^{\mathrm{prop}})\right)^2}{\left(f_{n_s}^{\mathrm{ref}}(c^{\mathrm{curr}})\right)^2 + R^2 \left(f_{n_s}(c^{\mathrm{curr}})\right)^2} \, \frac{\prod_{n=1}^{d} \left(1 + |r_n^{\mathrm{prop}}|\right)^{\gamma}}{\prod_{n=1}^{d} \left(1 + |r_n^{\mathrm{curr}}|\right)^{\gamma}}\right).$$
As above, we draw a random number $y \in [0, 1)$, accept if $y < p_{\mathrm{acc}}^{\mathrm{rift}}$, and update the current configuration if the proposed configuration is accepted. A typical rift move is illustrated in Figure 11b.
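In code, the accept/reject step for a one-dimensional rift move reduces to a few lines. The sketch below is illustrative: the integrand values $f_{n_s}^{\mathrm{ref}}$, $f_{n_s}$ and the balancing parameter $R$ are passed in as plain numbers, and all function and argument names are our own choices.

```python
import random

def rift_acceptance_1d(f_ref_prop, f_prop, f_ref_curr, f_curr, R, r_prop, r_curr, gamma):
    """Metropolis-Hastings acceptance probability for a 1D rift move.

    Combines the ratio of stationary weights
    pi(c) = f_ref(c)**2 + R**2 * f(c)**2 with the proposal ratio
    (r_prop / r_curr)**gamma from the zeta-function distribution."""
    pi_ratio = (f_ref_prop ** 2 + R ** 2 * f_prop ** 2) / (
        f_ref_curr ** 2 + R ** 2 * f_curr ** 2
    )
    proposal_ratio = (r_prop / r_curr) ** gamma
    return min(1.0, pi_ratio * proposal_ratio)

def accept(p_acc, rng):
    """Draw y in [0, 1) and accept the proposal if y < p_acc."""
    return rng.random() < p_acc
```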
To implement the Monte Carlo algorithm, we simply introduce a loop in which each iteration performs a Monte Carlo step consisting of these two moves.
To ensure that fluctuations to the current embedding introduce enough new configurations (shift moves), while preventing the Monte Carlo algorithm from spending too much time in physically unlikely configurations, the shift-move probability was set to $p_{\mathrm{shift}} = 0.7$ and the rift-move probability to $p_{\mathrm{rift}} = 0.3$ [29,207] (the “single-site” rift move used there was later completely replaced by the “multi-site” rift move presented in this review). This choice proved to produce good convergence behaviour over a large range of decay exponents and different models. After performing a move, whether accepted or not, we update the estimates for the target and reference sums in the numerator and denominator of Equation (188), respectively, and keep track of statistics such as the variance of the target and reference sums. To determine the value $S[f_{n_s}]$ of the Monte Carlo run, the ratio of both quantities has to be multiplied by the analytical value of the reference sum as in Equation (188).
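Put together, one Monte Carlo step draws the move type with the probabilities above and then updates the running estimators. The loop below is a minimal sketch in which the move functions and integrands are abstract callables supplied by the caller; all names are illustrative and not part of the published implementation.

```python
import random

P_SHIFT, P_RIFT = 0.7, 0.3  # move mix quoted in the text

def estimate_sum(n_steps, state, shift_move, rift_move, f_target, f_ref,
                 s_ref_exact, seed=0):
    """Estimate S[f] = (target sum / reference sum) * analytic reference sum.

    `shift_move` and `rift_move` propose and (possibly) apply a move to
    `state` in place; `f_target` and `f_ref` evaluate the target and
    reference integrands on the current configuration. The estimators
    are updated after every move, accepted or not, as described in the
    text, and the final ratio is rescaled as in Equation (188)."""
    rng = random.Random(seed)
    target_acc = 0.0
    ref_acc = 0.0
    for _ in range(n_steps):
        if rng.random() < P_SHIFT:
            shift_move(state, rng)
        else:  # with probability P_RIFT = 1 - P_SHIFT
            rift_move(state, rng)
        target_acc += f_target(state)
        ref_acc += f_ref(state)
    return (target_acc / ref_acc) * s_ref_exact
```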
We can test the convergence of the MC algorithm by considering the 1qp gap $\Delta = \omega(k=0)$ of the one-dimensional ferromagnetic long-range transverse-field Ising model (LRTFIM). We benchmark the second-order series coefficient $p_2$. According to Equation (174), the coefficient from the MC approach is $p_2^{\mathrm{MC}} = \bar{S}[f_{2,0}^{(2)}] + \bar{S}[f_{3,0}^{(2)}]$, while the exact value is given by $p_2^{\mathrm{exact}} = 2\left(\zeta(2(1+\sigma)) - \zeta(1+\sigma)^2\right)$ (see Ref. [25]).
In Figure 12, we show the MC error:
$$\varepsilon_{\mathrm{MC}} = \left| \frac{p_2^{\mathrm{exact}} - p_2^{\mathrm{MC}}}{p_2^{\mathrm{exact}}} \right|$$
as a function of the number of steps $N_{\mathrm{steps}}$ for one hundred MC runs, each with a distinct seed. As can be seen from the figure, the MC error decreases as $N_{\mathrm{steps}}^{-1/2}$, as generally expected for MC simulations.
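The $N_{\mathrm{steps}}^{-1/2}$ scaling can be checked on any toy estimator. The snippet below is our own illustration, not tied to the pCUT integrands: it measures the root-mean-square error of a plain Monte Carlo mean over many seeded runs.

```python
import random
from statistics import mean

def rms_error(n_steps, n_runs=200, true_value=0.5):
    """RMS deviation of the MC estimate of E[U(0,1)] = 0.5 over seeded runs."""
    sq_errs = []
    for seed in range(n_runs):
        rng = random.Random(seed)
        estimate = mean(rng.random() for _ in range(n_steps))
        sq_errs.append((estimate - true_value) ** 2)
    return mean(sq_errs) ** 0.5
```

Comparing `rms_error(100)` and `rms_error(400)` gives errors whose ratio is close to $\sqrt{4} = 2$, mirroring the behaviour in Figure 12.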
Further improvements to the algorithm can be made by recentring the configuration at the origin, since the graphs may drift away on the lattice due to the random walk. Also, for the integer powers occurring in the target integrand from powers of the expansion parameters, plain multiplication can be faster than generic integer power functions. Most importantly, lookup tables should be used for the non-integer powers stemming from the algebraically decaying interaction strengths at the distances $|i_1 - i_2| < \delta_{\max}$ within a cutoff $\delta_{\max}$ that occur most commonly during the MC run. The integrand function also depends on $k$, as cosine terms are present. It is useful to compile the code for a desired $k$ value so that lookup tables can be defined at compile time, exploiting the periodicity of the cosine. For instance, in the simplest case $k = 0$, the cosine is always $1$, and for $k = \pi$ the cosine only takes the values $+1$ and $-1$.
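A lookup table for the algebraically decaying couplings might look as follows. The class is a simple sketch under our own assumptions: the name, the default cutoff, and the runtime fallback beyond the cutoff are illustrative choices, not the published implementation.

```python
class CouplingTable:
    """Cache r**(-exponent) for the short distances that dominate the run.

    Distances below the cutoff delta_max are read from a precomputed
    table; rarer, larger distances fall back to a direct (slow) power
    evaluation. For the LRTFIM-type couplings, exponent = d + sigma."""

    def __init__(self, exponent, delta_max=4096):
        self.exponent = exponent
        self.delta_max = delta_max
        # index 0 is unused (distinct vertices never overlap); placeholder
        self.table = [0.0] + [r ** (-exponent) for r in range(1, delta_max)]

    def __call__(self, distance):
        r = abs(distance)
        if r < self.delta_max:
            return self.table[r]
        return r ** (-self.exponent)
```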
Let us emphasise that, for the pCUT+MC approach, we need to perform individual runs for every perturbative order, for every possible number of sites at a given order, for different values of the momentum $k$ if necessary (e.g., for a dispersion), and for different seeds so that an average value can be calculated. Typical runs are performed with $N_{\mathrm{seeds}} = 5$–$20$ seeds for 6–24 h. For instance, for the long-range transverse-field Ising model, we were able to extract perturbative series up to order 10 for the 1qp spectral weight, order 11 for the 1qp gap, and order 13 for the ground-state energy. Much higher orders in perturbation theory are likely not feasible with the current implementation, as the number of Monte Carlo runs scales quadratically with the order. In the future, to further improve the efficiency of the approach, one may devise additional Monte Carlo moves for the Markov chain that change the number of sites. As a result, the algorithm would scale only linearly with the perturbative order, though potentially at the cost of slower convergence. A compromise could be favourable, where the lower orders, which converge faster, are computed using an algorithm with the additional move, and the higher orders with the algorithm presented above.
Another important issue worth mentioning is the fact that, so far, the above algorithm can only be applied to lattices with a trivial single-site unit cell. To generalise the algorithm to arbitrary lattice geometries in the future, we would need to introduce another Monte Carlo move. We would keep the moves introduced above for moving the vertices along the underlying Bravais lattice. A new move then changes the position within the unit cell. However, new subtleties emerge from introducing a larger unit cell, as we would have to fix two vertices within the unit cells for hopping processes, while the remaining vertices can be moved freely. A larger unit cell also means that we have to calculate the entire matrix of the dispersion, as can be seen in Equation (115). Entries of the matrix are in general complex-valued, which needs to be accounted for as well. Lastly, the behaviour of the Monte Carlo algorithm for the system is altered due to the additional Monte Carlo move, which may impact the convergence. So, the choice of the reference sum should probably be adapted as well to achieve the desired convergence. See also the discussion in Ref. [207].

4.8. Series Extrapolation

We are interested in extracting the quantum-critical properties from the perturbative series obtained from the Monte Carlo embedding. DlogPadé extrapolations are an established and powerful method that allows us to extrapolate high-order series even beyond the radius of convergence and to determine the quantum-critical point and the associated critical exponents. A more elaborate description of DlogPadés and their application to critical phenomena can be found in Refs. [208,209].
Suppose we are given a high-order perturbative series of a physical quantity $\kappa(\lambda)$ in the perturbation parameter $\lambda$ (see Section 4.5 for typical quantities of interest). A Padé extrapolant is defined as
$$P[L, M]_{\kappa} = \frac{P_L(\lambda)}{Q_M(\lambda)} = \frac{p_0 + p_1 \lambda + \dots + p_L \lambda^L}{1 + q_1 \lambda + \dots + q_M \lambda^M},$$
where $p_i, q_i \in \mathbb{R}$ and the degrees $L$, $M$ of the numerator polynomial $P_L(\lambda)$ and denominator polynomial $Q_M(\lambda)$ are restricted by $L + M \le o_{\max}$, where $o_{\max}$ is the maximal perturbative order. The coefficients $p_i$ and $q_i$ are fixed by a set of linear equations obtained by cross-multiplying Equation (203) with $Q_M(\lambda)$ and requiring that the Taylor coefficients match up to order $L + M$, i.e., that all higher-order terms vanish on the left-hand side of the equation [208]. We introduce
$$P[L, M]_{\mathcal{D}} \quad \text{with} \quad \mathcal{D}(\lambda) = \frac{\mathrm{d}}{\mathrm{d}\lambda} \ln\left(\kappa(\lambda)\right)$$
as the Padé extrapolant of the logarithmic derivative of $\kappa$, which must satisfy $L + M \le o_{\max} - 1$, as we lose one perturbative order due to differentiation. The DlogPadé extrapolant of $\kappa$ can now be defined as
$$\mathrm{d}P[L, M]_{\kappa} = \exp\left( \int_0^{\lambda} P[L, M]_{\mathcal{D}} \, \mathrm{d}\lambda' \right).$$
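As a toy illustration of this construction, the snippet below computes the series of the logarithmic derivative $\mathcal{D} = \kappa'/\kappa$ from the coefficients of $\kappa$ and builds the lowest non-trivial extrapolant, a $[1/1]$ Padé of $\mathcal{D}$, in closed form; the critical point is then the root of the denominator and the exponent the residue at that pole. This is our own minimal sketch; real applications solve the linear system for general $[L/M]$.

```python
def dlog_series(kappa, order):
    """Taylor coefficients of D(λ) = κ'(λ)/κ(λ) up to the given order.

    Matching powers of λ in κ' = D * κ gives
    d_m = ((m + 1) * κ_{m+1} - sum_{j<m} d_j * κ_{m-j}) / κ_0."""
    d = []
    for m in range(order + 1):
        rhs = (m + 1) * kappa[m + 1] - sum(d[j] * kappa[m - j] for j in range(m))
        d.append(rhs / kappa[0])
    return d

def dlog_pade_11(kappa):
    """Critical point and exponent from the [1/1] DlogPadé of κ(λ).

    For D ≈ (p0 + p1*λ) / (1 + q1*λ), the physical pole sits at
    λ_c = -1/q1, and the dominant exponent of κ ~ |λ - λ_c|**θ is the
    residue θ = (p0 + p1*λ_c) / q1."""
    d0, d1, d2 = dlog_series(kappa, 2)
    q1 = -d2 / d1                 # from matching the λ² coefficient
    p0, p1 = d0, d1 + q1 * d0     # from matching λ⁰ and λ¹
    lam_c = -1.0 / q1
    theta = (p0 + p1 * lam_c) / q1
    return lam_c, theta
```

For the test function $\kappa = (1 - 2\lambda)^{1/2}$, with series $1 - \lambda - \lambda^2/2 - \lambda^3/2 + \dots$, the routine recovers $\lambda_c = 1/2$ and $\theta = 1/2$ exactly, since the logarithmic derivative $-1/(1 - 2\lambda)$ is itself a rational function.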
Given that the quantity of interest $\kappa$ shows a second-order phase transition with a dominant power law $\kappa \sim |\lambda - \lambda_c|^{\theta}$ about the critical point $\lambda_c$, we can extract the critical point $\lambda_c$ and the associated critical exponent