Stochastic Gravitational-Wave Backgrounds: Current Detection Efforts and Future Prospects

The collection of individually resolvable gravitational wave (GW) events makes up a tiny fraction of all GW signals which reach our detectors, while most lie below the confusion limit and go undetected. Like voices in a crowded room, the collection of unresolved signals gives rise to a background which is well-described via stochastic variables, and hence referred to as the stochastic GW background (SGWB). In this review, we provide an overview of stochastic GW signals, and characterise them based on features of interest such as generation processes and observational properties. We then review the current detection strategies for stochastic backgrounds, offering a ready-to-use manual for stochastic GW searches in real data. In the process, we distinguish between interferometric measurements of GWs, either by ground-based or space-based laser interferometers, and timing-residuals analyses with pulsar timing arrays (PTAs). These detection methods have been applied to real data both by the large GW collaborations and smaller research groups, and the most recent and instructive results are reported here. We close this review with an outlook on future observations with third generation detectors, space-based interferometers, and potential non-interferometric detection methods proposed in the literature.


Introduction
Gravitational waves (GWs) are perturbations of the spacetime metric caused by extremely energetic events throughout the Universe. Up until now, direct detections of GWs have been coherent measurements of resolved waveforms in detector datastreams which may be traced back to single, point-like sources. These make up a tiny fraction of the gravitational-wave sky: The vast collection of unresolved signals corresponding to multiple point sources or extended sources adds up incoherently, giving rise to gravitational-wave backgrounds (GWBs). A variety of different backgrounds is expected given the range of GW sources in the Universe; however, regardless of their origin, most of these are treated as stochastic, as they may be described by a non-deterministic strain signal and are, hence, referred to as stochastic gravitational-wave backgrounds (SGWBs). Some backgrounds are stochastic by generation processes (e.g., inflationary tensor modes), whereas others are stochastic due to the characteristics or limitations of the specific detector used to observe them (e.g., the cumulative signal from many binary black hole coalescences). An SGWB of this latter nature is by definition at the threshold of detection, making it effectively today and those that will probe them in the foreseeable future. We offer a detailed explanation of stochastic search methods in Section 4, differentiating between different observing strategies and detection regimes. In Section 5, we summarise current detection results, reporting the most stringent upper limits yet on SGWBs while highlighting how close we are to detection. Finally, we close this review in Section 6 with an overview of future prospects with the upcoming GW detectors.

Theory of Stochastic Backgrounds
In this section, we provide the theoretical foundations of GW science essential to the description of generation and propagation of GWs in the Universe. First, we review the mathematical description of stochastic backgrounds in a common framework, regardless of their astrophysical or cosmological origin, which lies at the basis of (almost) all stochastic search methods reviewed in Section 4. Then, we introduce the fractional energy density of GWs, which is the main observable targeted by stochastic searches.

Gravitational-Wave Strain and Stokes Parameters
We work in the linearised regime of general relativity, such that the spacetime metric is close to that of flat Minkowski space, with small perturbations that encode GWs, g µν = η µν + h µν . Neglecting terms of order h 2 , the vacuum Einstein field equation can then be written as a wave equation in the De Donder gauge [30], where the box operator is the D'Alembert operator, defined as = ∂ µ ∂ µ . Solutions to this equation may be constructed as linear superpositions of plane waves, where e µν is the normalised polarisation tensor, and k α is the wave four-vector. By inserting this solution in the equation above, one recovers the dispersion relation k α k α = 0. This implies that the waves travel at the speed of light in the linearised regime. We may then pick the transverse-traceless (TT) gauge and reduce h µν to a purely spatial perturbation, h ij , which carries two degrees of freedom or equivalently two independent, spin-2, polarisation states. Generalising Equation (2) as in [31], for example, the GW strain at time t and position vector x given by an infinite superposition of plane waves incoming from all directions on the skyn,n = (sin θ cos φ, sin θ sin φ, cos θ) , in standard angular coordinates (θ, φ) on the 2-sphere may be written as The spatial wave vector is written explicitly as k = 2π fn (we will use c = 1 throughout unless otherwise noted), P labels the polarisation states, and h P are the respective amplitudes assigned to each state. Choosing the linear polarisation basis {+, ×}, the orthogonal polarisation base tensors P may be written as with e θ = (cos θ cos φ, cos θ sin φ, − sin θ) , (7) e φ = (− sin φ, cos φ, 0) , (8) as defined in [32]. For a stochastic signal, h P represents a random complex number. Most often, these are assumed to be drawn from a Gaussian probability distribution. This is likely a good approximation for a cosmological GWB, as we will see in Section 3. For a stochastic GWB of astrophysical sources, this assumption may break down but the central limit theorem will guarantee that the statistics approach with respect to a Gaussian random field if the signal is sourced by a sufficiently large number of independent events and that any high signal-to-noise outliers have been subtracted from the detector's timestreams. Under the Gaussian assumption, the statistical properties of the amplitudes are then characterised solely by the second moments of the distribution h P ( f ,n)h P ( f ,n ) , which, assuming statistical homogeneity, correspond to ensemble averages as [33] h where the Stokes parameters I, the intensity, Q and U, giving the linear polarisation, and V, the circular polarisation, have been introduced. The four Stokes parameters completely describe the polarisation of the observed signal in analogy with electromagnetic Stokes parameters for the photon [34]. The difference here is that whilst the electromagnetic Q and U Stokes parameters transform as spin-2 quantities with respect to rotations, their GW strain counterparts transform as spin-4. In both cases, intensity I behaves as a scalar under rotations, while V transforms as a pseudo-scalar. More details on these and their spin-weighted behaviour may be found in [35].

The Energy Density of Gravitational Waves
The evolution of our Universe is described in terms of its expansion rate, also referred to as the Hubble rate, where a(t) is the cosmological scale factor. Using general relativity, we can write this rate in terms of the different forms of energy density present in the Universe (this is most conveniently performed as a function of redshift z), where H 0 ≈ 70 km s −1 Mpc −1 is the Hubble rate today. In Equation (11), Ω i = ρ i /ρ c are the density parameters of each energy component of the Universe: radiation R, which includes photons and relativistic neutrinos; matter M, which includes baryons and cold dark matter; curvature k; and the cosmological constant Λ. The critical energy density ρ c is a normalisation that can be interpreted as the energy density required to close our (flat) Universe today, which is explicitly where G = 6.67 × 10 −11 m 3 kg −1 s −1 is the universal gravitational constant.
At this stage, one may ask the following: where do GWs fit in? In fact, GWs feature in Equation (11), as a part of the radiation energy density, as they are considered to be carried by massless particles propagating at the speed of light 1 . This allows us to write with the terms in the sum representing the energy density in photons, neutrinos, and GWs, respectively. While the present day value of the radiation density is very small (Ω R ∼ 10 −4 ), we see from Equation (11) that at very high redshift, z O 10 3 , it dominated the Universe's expansion. As a result, any additional contributions to the radiation density, including those from primordial GWs, will leave an imprint on cosmological observables generated at these early epochs: principally, the cosmic microwave background (CMB) and the light element abundances predicted by Big Bang nucleosynthesis (BBN) [37]. These imprints are identical for all forms of additional radiation density; thus, by convention they are parameterised in terms of the effective number of neutrino species, N eff . Constraints on N eff can straightforwardly be converted into constraints on GWB energy density. The most up-to-date analysis combines CMB and BBN data with external datasets such as measurements of the baryon acoustic oscillation scale, producing [38] Given these strong constraints on Ω GW , it is reasonable to consider GWs as perturbations in the Universe, which do not influence the bigger picture but obey the propagation rules, the structure, and the energy content of the Universe that have been set out.
The density parameter appearing in Equation (14) encapsulates the total energy density of GWs at all frequencies 2 , as this is what determines the Hubble expansion rate. However, all other searches for gravitational waves focus on particular frequency bands; thus, it is convenient to define the GW energy density frequency spectrum [39], note that the frequency f here is the measured frequency, assuming an observer and the intervention of an instrument which mediates the true signal and the measurement. Furthermore, beyond the isotropic value of GWB over the entire sky it is possible to include any anisotropy by adding a directional dependence Ω GW ( f ) → Ω GW ( f ,n) wheren is the unit vector of a line of sight.
It is now possible to recover the cosmological information contained in the Ω GW parameter, which is effectively the cumulative GW energy history weighted per redshifted frequency bin, as presented in [40]. This may be observed by noting that homogeneity and isotropy throughout the universe imply 1 There are several theories that postulate modifications to this statement, but these modifications are strongly constrained by multimessenger observations of the binary neutron star merger GW170817 [36]. 2 Strictly speaking, GWs with wavelengths larger than the size of the cosmological horizon are frozen out by Hubble friction and, thus, do not contribute to the effective energy density. As a result, the density parameter appearing on the left-hand side of Equation (14) should be interpreted as an integral over Ω GW ( f ) from a minimum frequency of f ∼ 10 −15 Hz, which corresponds to the size of the horizon at the epoch when the CMB photons were emitted.
where z is redshift, N(z) is the number of GW emitters as a function of redshift, and f r is the emission frequency of GWs in the rest frame of the emitter, f r = f (1 + z). dE GW /d f r is then the spectral energy density of the source population. Equation (15) may be rewritten as hence, equating Equations (16) and (15) frequency by frequency yields Equation (18) implies that the fractional energy density of GWs per measured logarithmic frequency intervals at redshift zero, Ω GW , is proportional to the integral over the cosmic history of the comoving number density of the sources multiplied by the emitted energy fraction of each source, in the appropriately redshifted frequency bin. This directly relates Ω GW to the source event rate as a function of redshift, N(z). In the case of different types of sources, this may be extended to a sum over all contributors, Ω GW ( f ) may also be re-written in terms of the intensity parameter I defined in Equation (9) [32].
We can derive this relationship using the Isaacson formula for the energy density of the GWB [41,42], where the angle brackets indicate an average over many wavelengths. Inserting the plane-wave decomposition in Equation (4) and rewriting the two-point statistics of each plane wave in terms of the Stokes parameters in Equation (9), we obtain Note that we have assumed ergodicity here such that the spatial average in Equation (21) is equivalent to the ensemble average in Equation (9). If we treat intensity as isotropic, I( f ,n) → I( f ), then we immediately recover Equation (20). This fundamental relation allows us to connect GWB observations to cosmological implications of the background and may easily be extended to a directional statement as Note that this implies however, one must take care of factors of 4π as other conventions are sometimes used. Note that we are using the conventions as in [32], where all power spectra are one-sided.
As for the other Stokes' parameters, these do not contribute to Ω GW and are equal to zero in the case of an unpolarised background. This may be easily gleaned by their definitions, in Equation (9), as the second moments of the h + and h × fields should be equal and the expectation value of any cross correlations between the two should vanish. Ω GW , thus, inherits the spectral properties from the GWB intensity as per Equation (20). As will emerge in the following sections, the spectral dependence is often reduced to a power law, in which case it is common practice to write 3 (25) where α is the spectral index of the signal, and f ref is a reference frequency. This assumption is often extended to be valid in each direction independently, such that the spectral and directional dependence of the GW signal factor out as This is particularly useful when performing multiparameter estimations such as map-making as it allows one to integrate out the spectral dependence of the reconstructed signal and solve for an intensity sky map. Note that in the ensemble, averaging is required to recover the Stokes' parameters, and all temporal phase information in the GWs is discarded; as argued in [43], no physical information may be recovered from the coherent phase component.

Sources of Stochastic Backgrounds
GWBs fall broadly into two distinct categories based on the underlying generation mechanism. The first is a superposition of signals of astrophysical origin, galactic and/or extragalactic, that are not individually detected or resolved. This type of background will, thus, be unresolved and incoherent in that the temporal phase of the signal is expected to be random and is expected to be stochastic in the limit where the number of sources is very large. The second category is cosmological or primordial backgrounds. These include GWBs generated by the evolution of vacuum fluctuations during an inflationary epoch that end up as superhorizon tensor modes at the end of inflation [44,45]. It also includes GWBs generated by nonlinear phenomena at early times such as phase transitions or via emission by topological defects [46][47][48]. For the most part, primordial backgrounds are formed via stochastic processes, resulting in stochastic signals; on top of this, propagation effects contribute to isotropising a primordial signal, in most cases erasing any initial anisotropies [49,50]. In particular, stochasticity here implies no initial information is retained by the temporal phase of the background [51]. Cosmological GWBs can also be stochastic due to many overlapping signals, for example, in the case of cosmic strings. Assuming that the orientation of individual astrophysical sources is random on the sky, these are in general expected to be unpolarised, as an equal distribution in the polarisation modes will cancel out any local asymmetry. Primordial signals are also mostly assumed to be unpolarised, as the underlying stochastic process is not expected to select a preferential frame, with some exceptions as we explain below. What then characterises backgrounds of different origin is the spectral dependence of the source energy density in Equation (18): the power spectrum of the signal depends on the specific GW emission mechanism, SMBBHs Cosmic strings FOPTs Stellar-mass CBCs Inflation Figure 1. An overview of potential GWB signals across the frequency spectrum. The light blue curve shows the prediction for single-field slow-roll inflation with a canonical kinetic term, with tensorto-scalar ratio r 0.002 = 0.1 [52]. The pink curve shows a GWB from Nambu-Goto cosmic strings, using "model 2" of the loop network, with a dimensionless string tension of Gµ = 10 −11 [53]. The brown curve shows a GWB from inspiralling supermassive BBHs, with the amplitude and shaded region shown here corresponding to the common noise process in the NANOGrav 12.5-year data set [54]. The two grey curves show GWBs generated by first-order phase transitions at the electroweak scale (∼200 GeV) and the QCD scale (∼200 MeV), respectively [55]. The yellow curve shows a GWB generated by stellar-mass compact binaries, based on the mass distributions and local merger rates inferred by LVK detections [56]. The dashed curves show various observational constraints, as described further in Section 5 (this includes the PPTA constraint, which intersects the possible NANOGrav SMBBH signal); the dotted curve shows the integrated constraint from measurements of N eff , which cannot be directly compared with the frequency-dependent constraint curves but is shown here for indicative purposes.
which is imprinted in the measured strain. Note that this measurement includes non-negligible selection effects, as qualitatively different backgrounds contribute from different redshift shells and from different directions. In this section, we review both astrophysical and cosmological GWBs, providing the necessary background for the targeted searches discussed in Section 5. We also comment on the observational properties of the signal which are essential to understand when building an optimal search method. The various sources are also summarised in Figure 1, which includes the sensitivity of several GW detection efforts for reference.

Astrophysical Backgrounds
Astrophysical GWBs are the collection of all GWs generated by astrophysical processes which are individually unresolved by your GW detector. These can be either individual subthreshold signals, or they can be so numerous that they add up incoherently and form a continuous signal in the timestream.
Perhaps the most studied signal in the literature is a background sourced by a collection of inspiralling and merging compact binary systems. These include black hole binaries, neutron star binaries, white dwarf binaries, and systems counting a mixed pair of these objects. Black hole binaries in particular are a vast category of sources, as the mass of each black hole in the binary ranges between 5 and 10 10 M . The lower mass of 5M here refers to the lack of black holes between the Tolman-Oppenheimer-Volkoff limit (which sets the maximum mass of a neutron star to 2.9M [57]) and 5M , as seen in low-mass X-ray binary observations [58,59] 4 . Specifically, a BH is considered massive (M BH ∼ 10 5 -10 10 M ), intermediate (M BH ∼ 10 2 -10 5 M ), and stellar mass (M BH ∼ 5-10 2 M ) [61]. There are binaries that have a mass ratio close to one, where the two black holes in the binary fall within the same mass category, and there are also so-called extreme mass ratio (EMR) binaries where the two objects belong to different families, including extreme mixed binaries such as massive black hole-neutron star pairs. There is a direct relation between binary masses and GW emission frequencies; however, overall, the GW power spectra have a similar shape. In particular, in the early emission stage of the binary, i.e., the inspiral phase when the energy emission increases adiabatically as the binary's orbit shrinks very gradually, energy emission can be described by very few parameters and is well-approximated by a power law as explained below. The masses also impact the characteristic amplitude of a GW signal, which in turn may be linked to the detection threshold via the signal-to-noise ratio (SNR) of each signal. The latter sets the maximum expected redshift depth at which each type of signal may be observed by the detectors probing that frequency range. A good example of this is the background generated by double white dwarfs, which is expected to be observable by the LISA detector only at very low redshift [62] such that LISA will only be sensitive to white dwarfs from the Milky Way and, perhaps, to those in the local group [63].
Other astrophysical sources of GWs include asymmetric supernovae explosions and rotating neutron stars which are not spherically symmetric and may produce triaxial emission [64]. These are considered minor sources of GWs compared to compact binaries as their characteristic signal is weaker. Furthermore, they are also considerably harder to model; hence, the discussion that follows is mainly focused on the former; nevertheless, many equations presented are general and may be applied to these secondary sources as well. In principle, the fact that these are harder to model also implies that they will necessarily be targets of stochastic searches, as the latter may be considered as generic, "un-modelled" searches of GW power.
It may seem odd to use the fractional parameter Ω GW to describe GW energy from astrophysical events, as historically this is considered a cosmological quantity; however, several authors and collaborations have deemed it a useful convention allowing scientists from different fields to easily compare results. With this in mind, it may be considered as a compact measure of the GW history of the low-redshift Universe, after galaxy and star formation have taken place. To see this in more detail, following [40], let N be the number of events of a particular signal type in a comoving volume. This may be decomposed in event rateṄ multiplied by cosmic time slice dt r /dz, Then, as in [65], the fractional energy density may be rewritten as where the angle brackets indicate an averaging over particular source population parameters θ, , first presented in [74]. On the right, the total expected background is compared to the integrated sensitivity of current and future configurations of the 2G detector network. This plot was obtained using open data published in [75]. and p(θ) is the probability distribution of source parameters. In the case of a stochastic background, one would expect the parameter space to be fully sampled by the population; however, note that selection effects imposed by the detector may break this limit. One would expect the event rateṄ to locally increase with redshift, as the volume of each redshift shell increases. In fact, the event rate for stellar mass compact binary coalescences has been found to increase with redshift when analysing the direct detections reported by the LVK [66]. However, this rate cannot increase monotonically out to infinity and must have a turnaround point at some peak redshiftẑ and then decay to zero as star formation ceases in the early Universe. In general, this may be modelled phenomenologically with [67] such that at low redshift, the rate increases asṄ ∝ (1 + z) α r , then reaches a maximum at z ∼ẑ, and finally decreases at high redshift asṄ ∝ (1 + z) −β r . C is the normalisation constant C(α r , β r ,ẑ) = 1 + (1 +ẑ) −(α r +β r ) which ensuresṄ(z = 0) =Ṅ 0 . Constrainingẑ, α r and β r provides insight on binary formation history and, consequently, star formation history as well. In practice, however, stochastic searches probe the redshift-integrated observable Ω GW ( f ) only. The redshift history may be reconstructed with careful modelling and cross-correlation with direct measurements of individual events for which redshift is estimated; see the approach presented in [65]. Note that selection effects due to the mass distribution of sources need to be taken into account in this sort of analysis. Alternatively to this phenomenological approach, the redshift-dependent event rate may also be probed by conducting a careful study of the specific astrophysical conditions which favour compact binary inspirals and mergers; for example, the evolution of galaxy star formation and chemical enrichment can be combined with prescriptions for merging of compact binaries obtained via stellar evolution simulations [68,69].
To study the spectral dependence of different backgrounds, we can plug in different source models in Equation (18) and calculate the spectral shape of Ω GW . This is particularly useful in the case of an astrophysical background as N and E GW are directly related to the star formation history of the Universe, and astrophysical GWs from different source populations will contribute to Ω GW in different amounts at different times. This may result in a method that differentiates between astrophysical backgrounds and also test different stellar and galaxy evolution models [70].
An example of astrophysical background with a simple expected spectral dependence is that of compact binaries in the adiabatic inspiral phase [71]. In this case, it is sufficient to model GW radiation at leading quadrupole order [72], where M is the chirp mass, given by where m 1 and m 2 are the masses of the two compact objects in the binary. The co-moving number density will depend on chirp masses in the binary distribution and may be rewritten as an integral over M, where the integration extremes will depend on source population characteristics. Rearranging the terms in Equation (28), one obtains where the average over the population parameters is simply reduced to the integral over all possible chirp masses. Here, the superscript CB labels compact binaries generically. This provides a simple spectral dependence for Ω CB GW , which may be condensed as 5 where Ω CB GW ( f ref ) is the appropriately normalised energy density calculated at a reference frequency f ref .
This simple model may be extended to all general inspiral-dominated GWBs as the same principles apply and has been reprised all over the literature and used in almost every GWB search with data available. The same principle, for example, has been used in the estimation of the expected background from stellar mass black hole and neutron star binaries in the LIGO detector frequency band [73], with some tweaks to account for the chirp signal from merging black holes which has a substantially different energy output per frequency. This result is shown in Figure 2.

Primordial Backgrounds
There are a range of potential physical processes occurring in the primordial Universe which would result in a rich variety of GW signals. We report the most noteworthy here; however, this is a nonexhaustive list of all sources which have been studied in the literature. We focus especially on sources which have corresponding search pipelines and for which constraints have been derived with existing GW data. For a more in-depth review of the subject, see [76].
First and foremost, all inflationary models include an irreducible GWB as a byproduct of inflation [76,77]. Any inflationary process results in the stretching of quantum fluctuations which are parametrically amplified into classical density perturbations, producing particularly tensor perturbations which we identify as GWs. The amplification process causes perturbation wavelengths to expand until they reach the size of the horizon, when they exit causal contact and their evolution remains frozen until the Universe's horizon returns to their scale in the post-inflationary era and they may re-enter. At time of re-entry, the GWB is formed by standing waves, and their phases are correlated between modes with opposite momenta [78]. In most models, the perturbations are adiabatic, Gaussian, and approximately scale-invariant, resulting in a similarly defined GWB. In this case, the fractional energy density will be approximately frequency independent at frequencies f H 0 ∼ 10 −18 Hz, and will be primarily characterised by its amplitude. The perturbations are also expected to imprint a pattern of B-modes in the polarization of the CMB [79], which has not been detected yet and is the aim of numerous CMB experiments [80][81][82][83]. This nondetection imposes an upper bound on the inflationary GWB amplitude, setting it well below the reach of present or planned GW detectors. The prevalent opinion in the community is that precision measurements of the CMB are still the best chance to garner decisive evidence of inflation and discriminate between models.
Beyond the irreducible background, however, the inflationary era may also source a second GWB which in certain cases dominates over the irreducible component, often presenting strongly scale-dependent features or characteristic peaks in the power spectrum [76]. For example, the presence of additional fields during the inflationary epoch other than the inflaton could have an effect on GW emission, as these may have interactions resulting in strong particle production or may act as spectator fields and exhibit a sub-luminal propagation speed [84,85]. The presence of additional symmetries in the inflationary sector would also have an impact, as these would result in breaking of space reparametrizations, allowing the graviton to acquire a mass. Alternative theories of gravity (other than general relativity) governing the inflationary era would have an impact on all aspects of primordial GWs. The scalar density perturbations generated during inflation may also source GWs [86]; this may occur either at second order in perturbation theory [87] or by acting as seeds for primordial black holes [88,89]. Finally, GWs can also be produced during preheating [90][91][92][93] due to the violent transfer of energy from the inflationary sector to Standard Model plasma. All these possible inflationary scenarios are effectively extremely hard to probe with electromagnetic observations; hence, GWs present a unique window into this epoch. See for example [94] for a discussion of inflationary backgrounds in the LISA band and how a positive detection would revolutionise the current science of inflation. For more details and references to the specific models that may give rise to the effects discussed above, refer to the review [76].
After inflation, primordial GWs may have been generated as a consequence of phase transitions which may have occurred as the primordial Universe cooled down in its postinflationary adiabatic expansion. First-order phase transitions in particular would proceed via the nucleation of true-vacuum bubbles, sourcing GWs through the collision of these bubbles [44,46], as well as through the resulting acoustic and/or turbulent motion of the primordial plasma [55]. The characteristic power spectrum of a background formed by bubble collisions presents a sharp peak at the frequency related to the energy scale at which the transition occurred, T * , where β/H * is the rate of the transition in units of the Hubble rate at the transition epoch. For typical values β/H * ∼ 10 3 , we see that this peak lies in the megahertz (mHz) band probed by LISA for transitions at the electroweak scale T * ∼ 200 GeV, allowing us to test a myriad of particle physics models which predict such signals. Phase transitions may also give rise to stable topological defects, the most prominent example of which are cosmic strings. These line-like topological defects are generated through spontaneous symmetry breaking in the early Universe [47,48] and are a generic prediction of many theories beyond the Standard Model of particle physics [95]. On macroscopic scales, cosmic strings are effectively described by a single parameter, the dimensionless string tension Gµ, which is determined by the energy scale at which they were formed. Once formed, dynamical evolution results in a dense network of cosmic string loops which oscillate at relativistic speeds, producing copious GWs which combine to produce a strong SGWB signal for which its amplitude and spectral shape are set by Gµ.
As mentioned above, a potential GW signal of cosmological origin is the SGWB from primordial black holes (PBHs); i.e., black holes formed in the early Universe rather than through stellar evolution. Since black holes are massive and non-baryonic and interact only through gravity, PBHs are very natural and well-motivated cold dark matter (CDM) candidates. The cosmic mass density of PBHs is, therefore, usually expressed as a fraction of the total CDM mass density, f PBH ≡ ρ PBH /ρ CDM . Similarly to stellar-origin black holes, PBHs can form binaries (either by forming in close proximity to each other or through dynamical encounters) which then inspiral and emit GWs or they can emit GWs through close hyperbolic encounters [96]. The cumulative signal from many such binaries produces a SGWB spectrum which is determined by the PBH mass spectrum and DM mass fraction f PBH .
In general, GWBs emitted in the primordial Universe are considered stochastic under the ergodic assumption, which equates the ensemble average of an observable with its time and/or spatial average. Essentially, we assume that by observing large enough regions of the Universe today, or a given region for long enough time, we have access to many independent realisations of the system. This is realistic given two fundamental premises of cosmology, namely that the Universe is statistically homogeneous and isotropic, such that the initial conditions of the GW-generating process may be considered the same at every point in space and that the GW source fulfills causality, operating at a time when the typical size of a region of causal contact in the Universe was smaller than today. It can be shown [76] that the correlation scale today of a GW signal from the early Universe is tiny comparable to the present Hubble scale, verifying this requirement.
Stochasticity would also imply the lack of a preferred polarisation mode in the signal, as all mode configurations should be fully sampled in the observation set. While this is intuitively true for astrophysical backgrounds, the argument for primordial GWs is more delicate, and while a background may be stochastic for the reasons described above, there may be specific formation channels which will violate parity and induce an asymmetry in the polarisation modes, effectively generating circularly polarised waves. This effect may be found in alternative theories of gravity, such as Chern-Simons gravity [97] or certain quantum gravity models [98]. The study of parity-violating theories of gravity is inspired by grand unification, as the electro-weak sector of the standard model is chiral and, thus, maximally violates parity [99,100].

Anisotropies in Stochastic Backgrounds
We can distinguish two root causes of anisotropy in stochastic signals. Firstly, there may be inhomogeneities in the GW source mechanisms, such as a particular distribution of the sources on the sky, which necessarily produces an anisotropic signal. Secondly, as GWs propagate, they accumulate line-of-sight effects [101], crossing different matter density fields which are (if mildly) inhomoheneously distributed in our Universe. For astrophysical backgrounds, the spatial distribution of GW events is a biased tracer of the underlying light matter distribution, while primordial backgrounds will present anisotropies linked to the particular physical processes that occurred in the early Universe. In order to investigate these effects, it is useful to define the fractional GW energy density per solid angle dn, such that as in [102,103]. Ω GW ( f ) is the GWB monopole, where the bar is introduced to highlight the fact that it is an isotropic all-sky average over the directional fluctuations. The latter may be described by the GW density contrast, in analogy with the formalism used to quantify the level of anisotropy in CMB temperature [104] and in galaxy clustering [105]. This dimensionless parameter provides an unambiguous metric to compare different results. At first order in the perturbations, the density contrast may be decomposed into three distinct contributions related to the origin of the anisotropies, Here, δ s GW is the source term resulting from anisotropies at the GW source, for example, due to the specific distribution of events for astrophysical backgrounds, while δ los GW is the propagation term [101], which encapsulates accumulated line-of-sight effects. The final term is the dipole of magnitude D induced by the observer's peculiar velocity v obs . Note that the propagation term includes contributions similar to those encountered in CMB anisotropy calculations: the Sachs-Wolfe effect, due to the local curvature at emission, and the integrated Sachs-Wolfe effect, due to the curvature perturbations encountered along the line-of-sight [106]; a Doppler term due to the source's peculiar velocity; and higher order effects such as gravitational lensing and redshift-space distortions [107]. All these terms have been the subject of intense study in recent years and both astrophysical and primordial contributions have been modelled extensively (see for example [108][109][110] for independent calculations of astrophysical background anisotropies and [50,103,111] for examples of primordial ones). The propagation term is often assumed to be negligible in the case of astrophysical backgrounds or it is directly included as part of the source term; in [112,113], it is expressly estimated and shown to account for at most ∼10% of the total anisotropy. In the case of a truly primordial background dating back as far as or before the CMB, however, this term will be considerably larger and will be comparable to the intrinsic anisotropies at large scales as shown in [101].
We analyse here the general scenario where the density contrast at the source behaves as a Gaussian random field, as this is mostly assumed in the literature as well. In this case, δ s GW is characterised by its two-point correlation function, in analogy with CMB anisotropies, as the first moment δ s GW is zero by definition, and all higher order moments will either vanish or may be expressed in terms of C GW [103]. Here, cos θ =n ·n and the angular brackets represent an averaging over alln,n direction pairs of aperture θ. C GW ( f , cos θ) will then be different from zero where there are consistent spatial correlations on the sky associated to θ, at frequency f . To observe this formally, we expand the two-point function in spherical harmonic space as where P is the th Legendre polynomial. By inverting this equation, we obtain the angular power spectrum C δ ( f ) for the GW energy density contrast, which may be estimated from observations. Note that quantity ( + 1)C /2π is roughly the contribution per logarithmic bin to the variance of δ s ; hence, this will often be the quantity shown in plots quantifying the C δ power spectrum from different signal models and in GW data sets. Alternatively, we can also calculate the angular power spectrum of the Ω GW parameter itself, first by decomposing it into spherical harmonic components, and then by defining where angle brackets indicate an ensemble average over random realizations of Ω GW . The information encapsulated in C Ω and C δ is the same, and they are both dimensionless quantities; however, note that the numerical scale of C Ω is roughly that of Ω 2 GW , whereas C δ shows the fractional contribution of each mode to the power spectrum with respect to the monopole Ω GW .
To quote the expected levels of anisotropy for astrophysical backgrounds from calculations in the references mentioned above, one may compare Figure 2 in [108] and Figure 4 in [109]. It has been pointed out in the literature [103,[114][115][116] that there is a tension between these two predictions; the difference in the estimates highlighted here stems probably from different treatments of the source effects at very low redshift, where the clustering at small scales induces non-linear effects in the field which the overall anisotropy level is very sensitive to. Cosmological backgrounds are expected to present CMB-like anisotropies of δ s GW ∼ 10 −5 [101]. A comparison between these estimates and the expected sensitivity-per-multipole of various GW detectors may be found in [117].
Finally, note that to properly assess the sourced GW anisotropies from direct measurements, it is necessary to subtract out the kinematic dipole term induced by the observer's peculiar velocity as seen in Equation (42). This may be performed similarly to CMB analysis, potentially using the independent estimate of v obs from CMB data themselves.

Observational Properties of Stochastic Backgrounds
The observational properties of the SGWB such as time-domain characteristics and selection effects imposed by the observation method merit special attention as these are crucial for developing effective search methods. For the sake of simplicity, this discussion is focused on astrophysical backgrounds where individual events are clearly defined and are not extremely model dependent. However, most of the following considerations generally apply to primordial backgrounds as well and have intuitive extension to backgrounds from topological defects and primordial black holes for example. Three main regimes are identified in the literature based on the frequency of occurrence of the events in time. These are referred to as Poisson noise regime, where GWs appear as isolated bursts in the timestream, Gaussian regime, where the timestream is packed with numerous overlapping signals and the central limit theorem applies, and popcorn or intermittent regime, which lies somewhere in between. A visualisation of these is provided in Figure 3, generated using mock-data generation open source package presented in [118,119]. The discriminating factor between regimes is encapsulated in the duty cycle, which is the mean ratio between event duration and the time interval between consecutive events. This will also be equal to the average number of events present at the detector at a given observation time. While it is intuitively clear that the Gaussian background in Figure 3 is stochastic, this is not obvious of the other regimes. Operationally, a signal may be considered stochastic if it is more effective, in the Bayesian sense of the word, to search for it in the data using a stochastic model for the waveform as opposed to a deterministic one [120]. Furthermore, a signal is defined as resolvable if it can be decomposed into non-overlapping and individually detectable signals, in either the time or frequency domain. With this in mind, the only way to know whether a signal is stochastic, is to search for it and see what method yields the best result or more stringent upper limit. In the case of astrophysical GW signals, there is a continuous distribution of events where the highest SNRs are expected to be resolved, and beneath a certain detector-imposed cutoff, the events accumulate to form the background signal. The boundary between the resolved and unresolved population is not obvious, and a few bright signals stand out above the confusion background. When these are resolvable, deterministic signals, they should be subtracted from data, leaving a residual background that may be analysed with stochastic tools. In [120], the authors present simulations designed to establish whether certain signals are best searched for with stochastic or deterministic methods. They find, similarly to [121] and somewhat unsurprisingly, that deterministic signal models are favoured whenever the background is made up of a small number of sources, and stochastic models are preferred for larger source numbers. The boundary between regimes however is not well defined, and hybrid models comprised of a deterministic signal model for individual bright sources and a Gaussian-stochastic signal model for the confusion background are generally most effective to extract the population parameters.
These analyses highlight the distinction between event rate and detection rate. The event rate is the actual rate at which binaries are merging in the visible Universe, while the detection rate is the rate with which we can detect them, which is plagued by selection effects depending on both the binary's intrinsic and extrinsic parameters and the detector's characteristics.
There is a further degree of complication caused by these unresolved, bright signals in the timestream which behave as shot noise and may hinder the correct interpretation of a detection. These signals represent random, Poissonian fluctuations in the finite population of sources and will induce a white noise component in the angular power spectrum. The observational effect that arises is induced by the fact that nearby, bright sources approximately randomly distributed around the observer will mask the underlying structure of the Universe. This is analysed and quantified in [122] and [123,124] for the monopole and anisotropies of an astrophysical background, respectively. Here, Poisson white noise is modelled with a free parameter representing the cutoff distance from Earth beyond which the GW signal may be considered a stochastic background by the definition given above. The cutoff naturally depends on the sensitivity of the detector array under consideration. This is analysed in [123,124] for the anisotropic background, and it is found that the white noise component dominates current LIGO and Virgo detectors; however, the planned upgrades of these detectors and the future 3G detectors may be sufficient to curb this effect. As discussed in [125,126], another strategy to mitigate the shot noise is to cross-correlate the GW signal with a galaxy catalogue, as this cross-correlation spectrum has a much lower level of shot noise. Furthermore, as GW sources are finite in time, then Poisson statistics dictate that this noise decays as the inverse of the observation time; hence, long observing runs will progressively mitigate this effect. Quantifying this distinction is an evergrowing effort. The current event rate estimates based on astrophysical models informed by the LVK detections show that there is approximately one binary neutron star merger every 13 s and one binary black hole merger every 223 s [73]. Clearly, the vast majority of these lie well beneath the threshold for detection. This is quantified for example in [127] where the detection rate per year is encoded as a function of the merging binary masses. It is shown that there are several astrophysical models for compact binaries which fit the detection rate of ∼10 events per year from the aLIGO O1 and O2 data runs. It is clear that the rate of "intermediate" subthreshold events from an astrophysical population following these models is low and remains far from the stochastic limit as defined above.
It is worth pointing out that these observational features of astrophysical backgrounds discussed here may themselves provide insight in compact binary population parameters such as binary coalescence rates for different binary species, as detailed in [128]. Furthermore, the time dependence of astrophysical backgrounds in general may be a powerful discriminating factor between these and backgrounds of primordial origin.

Detection Approaches and Methodologies
Given its stochastic nature, the main challenge in measuring an SGWB is confidently distinguishing signal from the (stochastic) noise in GW detectors. The interplay between signal and noise bears a central role in stochastic search methods, and in particular the validity of a given method may depend on the expected signal-to-noise regime or ratio (SNR): whether the noise dominates the measurement, i.e., the detector operates in the low signal regime, or whether the signal is comparable to or stronger than the detector noise, i.e., high signal regime. Here, we classify search methods based on assumptions made primarily on the signal, such as Gaussianity and isotropy, starting from minimal assumptions on SNR, and we highlight how the latter then impacts the formulation of the optimal estimators.
Cross-correlation searches have cast a wide net, targeting a variety of stochastic backgrounds within the same style of search and also refining methods to extract specific signals from the data-see, e.g., the dark photon search (e.g., [129]) or the cosmic string search (e.g., [130]). We divide searches into three broad categories. We present the methodology to search for an isotropic stochastic background, which relies on the isotropic assumption to simplify the estimator considerably. Then, we show what happens when this assumption is relaxed in searches for anisotropic backgrounds. Finally, we review special searches which target elusive signatures which may imply deviation from general relativity and/or the standard model. However, first, let us review the common denominators in stochastic searches, namely the cross-correlation statistic and the correlated detector response.

Interdetector and Spatial Correlations
The first step in most stochastic searches is to consider the auto-and cross-correlation of datastreams from one or more detectors, as this operation discards all phase information and targets the signal power directly. To then deconvolve the response of the detectors to the signal, many searches rely on an overlap function, which determines the cross-power response of a pair of detectors to a stochastic background. The most common approach is to estimate the significance of excess correlated power in multiple detectors, integrated over all available observing time, often assuming noise in different detectors not to be correlated 6 . Throughout this section, these methodologies will be reviewed in detail; first, however, let us formally introduce the cross-correlation statistic and derive the overlap functions for different detector setups as these will be the building blocks for the rest of this review.
To start, consider a Gaussian, stationary GWB observed through two distinct detectors, 1 and 2. Each detector timestream may be decomposed into signal and noise components as where R 1 and R 2 represent the individual detector response functions to GW signals, and " * " represents a convolution. These are characterised by functional dependencies on intrinsic and extrinsic parameters of the GWs such as polarisation and incoming direction; however, we omit these now as they may be easily included at a later stage. We consider both h and n fields in Equation (48) as mean zero Gaussian variates; hence, these are fully described by their second-order moments. Let us work in the Fourier domain, where the assumption of stationarity implies that the noise covariance is diagonal, i.e., the noise in each frequency bin is uncorrelated 7 , 6 There are, in fact, sources of spatially correlated noise. They can be distinguished from SGWBs by either a deterministic component or by an overlap reduction function different to those of SGWBs. We provide more details in Section 5. 7 Time segments with non-stationary noise are removed from analyses of real data [56], the stationarity is usually determined empirically. Note that when applying a non-trivial windowing function to time-domain data, e.g., for computing a discrete Fourier transform, we introduce correlations between frequency bins. This effect and the methods to mitigate it are described in [131].
with P i as the noise power spectrum in detector i. We define the discrete Fourier transforms of the data streams starting at timestamp τ asd τ 1 ,d τ 2 , and drop the time label immediately for the sake of conciseness. We then take the cross-correlation of these quantities as which, assuming uncorrelated noise between detectors, has expectation value of Γ 12 is the overlap reduction function, where the reduction comes from the fact that it is a skyintegrated quantity (more details to follow), and I GW ( f ) is the sky-integrated power spectrum of the background. Note the normalisation T obs as we assume the observation lasts a finite amount of time. The covariance of the correlated measurement is as taking the weak signal limit in the final equality. In this case, the cross-correlation statistic has been shown to be a sufficient statistic [132], making it a good choice for stochastic searches. This is because there is little information to be drawn out from the auto-correlated measurements, as these are effectively measurements of the detector noise power; conversely, to use autocorrelations to estimate the signal, one would need an independent and accurate measurement of the noise power to model and subtract out the noise. This approach would require an iterative estimator where both signal and noise are included in the fit [133]. Note that using the cross-correlation component only of the data and, thus, ignoring the auto-correlations allows for considerable simplification of the optimal estimators and computations necessary in data analysis.
The functional form of the overlap function of a detector network can be derived from Equation (51) by appropriately combining the individual response functions of each detector. In the case of laser interferometers, the basic response function is the arm response function, and arms are then combined to form the detector. This may be performed physically, by using laser interferometry as is performed in LIGO/Virgo [134], or in post-processing, with time-delayed interferometry (TDI) as in the case of LISA [135,136]. Considering arm u PS spanning from point P to point S, the one-way arm response − → u : P → S in the frequency domain is described as [33] with the one-way timing transfer function relative to a photon travelling along u PS , T u , as In the case of LIGO-like detectors, which are 90 • -arm, Michelson-Morley style interferometers, a laser photon performs a round trip along one arm before interfering with the beam in the other arm; hence, the relevant response function is the two-way arm response function along ← → u : P ↔ S, which we can write in terms of the one-way timing transfer function as note the extra −L term which accounts for the extra phaseshift incurred when travelling over the same distance twice. Then, assuming that the detector is not moving during the measurement, we can substitute in x S = x P + u, and rewrite the above as where the two-way timing transfer function along the same arm is derived as Note that in the small antenna limit, i.e., when the scale of the detector is much smaller than the wavelength probed, L c/ f GW , we have that |T← → u | −→ 1, implying that all the frequency dependence of the detector response is encoded in the phase term of Equation (56). Consider now detector 1 which is made up of arms u and v, joined at point P; its polarisation response in frequency space, in the small antenna limit, will simply be where we have substituted in x 1 for x P as we consider it to be the location of the detector itself. Finally, the unpolarised overlap function 8 of a detector pair is given by combining the response functions of the two detectors [33], while the overlap reduction function Γ is obtained by sky-integrating the above, Note that polarisation F functions here defined specifically for ground-based interferometers correspond to the individual detector response functions seen in Equation (51). We have chosen to incorporate the frequency-dependent phase terms into the response functions directly; however, in the literature, these are sometimes left out of F functions to underline the fact that, in the small antenna limit, there is no interaction between GW frequency and detector geometry [33]. In the overlap function, the phase terms add, giving rise to a time delay term representing the frequency-dependent phase shift between the two detectors; explicitly, whereγ is the geometric component of the overlap function, and b is the baseline vector that points from detector 1 to detector 2. The numerical values ofγ as a function of direction then depend specifically on the coordinate basis chosen to represent the function; often, the choice is to represent it in an Earth-centered celestial frame, where the function will be time-dependent, as the arms sweep round and round on the sky following the Earth's rotation on its axis. This behaviour gives rise to the specific scan strategy of the detector array, which is essential  for directional stochastic searches. On the other hand, the frequency-dependent phase term highlights the role of the baseline length in a stochastic search: the ratio between |b| and GW wavelength will determine what frequencies will be suppressed in the search; the longer the baseline, the higher the frequencies accessible will be. The overlap reduction functions for the LIGO Hanford and Livingston detector pair (L-H), LIGO Hanford and Virgo (H-V), and LIGO Hanford and Kagra (L-K) are shown in Figure 4a, while the respective (instantaneous) overlap functions are plotted on the sky in Figure 5 at a particular common time.
As mentioned above, long arm length, spaceborne detectors such as the future LISA three-satellite constellation, oriented in roughly an equilateral triangle, will measure GWs by combining the one-way arm responses (53) of several arms at different times composing TDI configurations or channels 9 . The most basic configuration can be thought of as a TDI-Michelson-Morley interferometer, where one constructs a response such as that in Equation (58). However, the great advantage of TDI is that it is possible to construct channels that suppress detector noise and enhance the signal [139]; this is a very active area of study [140][141][142][143][144][145]. For the purposes of this review, let us cite the TDI {X, Y, Z} set of channels, which are based on the Michelson-Morley style measurement in Equation (58), where nominally X is centered around spacecraft 1, referred to hereafter as S 1 ; Y around spacecraft 2, S 2 ; and Z around spacecraft 3, S 3 . However, to minimise noise, each light path is "reflected" back and forth between spacecrafts multiple times such that the noise interferes destructively. The total number of reflections is then linked to a version number for the TDI channel; e.g., "TDI 1 X" refers to a channel constructed by "interfering" the following light paths: For more details on this calculation, see also official LISA documentation [146]. Another useful set of channels is the orthonormal set which approximately diagonalise the X, Y, Z noise-correlation matrix, A, E, T [147]. This set is often used for simplicity as one may consider the auto-correlations of A and E only, treating them as separate measurements, and use T as a Sagnac channel which is mostly sensitive to detector noise and blind to the signal [148,149]. However, in practice diagonalizing X, Y, Z is not a simple task as the detectors continuously move with respect to the measurement. The auto-and cross-correlated overlap functions for LISA are constructed as in Equation (59), where now the polarisation responses are referred to TDI channels. Note that LISA does not operate in the small antenna limit, hence the timing transfer function remains frequency dependent, inducing oscillations in TDI responses at high frequency, as observed in Figure 4b.
In the case of pulsar timing arrays, we can use the one-way transfer function defined in Equation (54), as in this case the arm vectorû, points from the solar system barycenter to the pulsar, and the arm length L is the distance from the solar system barycenter to the pulsar. One can expand the sinc function using Euler's identity and isolate the exponentials with arguments that depend on f L/c. Those exponential terms vary rapidly with sky direction for f L/c 1 averaging down to zero in Equation (60) for Earth-pulsar baselines greater than 100 parsec in the nanohertz band (e.g., Figure 1 in [150]). From here, it is straightforward to calculate the (now frequency-independent) overlap reduction function by considering the cross-correlation of data from two pulsars, which serve as one-way timing beams (see, e.g., [33,150,151] for a more complete discussion). The overlap reduction function for a pair of pulsars a and b now depends on the angle between Earth-pulsar baselines, ζ ab [151]: where Γ PTA is commonly referred to as the Hellings-Downs curve and may be observed in Figure 6.

Isotropic Background Search Methods
In this section, we review the case of isotropic backgrounds searches. We start by presenting the approach for ground-based interferometers, which may be considered the simplest case as these operate in the low-signal regime. We then lay out the detection strategy of PTAs. The application of these methods on real data and corresponding search results are reviewed in Section 5.

Ground-Based Detectors
The flagship LVK stochastic search targets a Gaussian, isotropic, stationary, and unpolarised background [56]. This set of assumptions results in a simple, ready-to-use estimator which results in a relatively rapid analysis of LIGO-Virgo data. All other collaboration searches use, as a starting point, the breakdown of one of these assumptions, and most extend the estimator to a particular case.
As detailed in Section 3, a stochastic signal which satisfies all assumptions listed above is fully described by its spectral energy density, which may be expressed in terms of the fractional energy density Ω GW ( f ). Under the assumption the latter presents a power-law spectrum as in  (63). ζ is the angle between Earth-pulsars baselines, and Γ measures the spatial correlation. At ζ = 0, the function assumes the pulsars located at different distances. When distances are the same, Γ(0) = 1. Shaded regions correspond to constraints on spatial correlations obtained with the second data release of the Parkes Pulsar Timing Array except PSR J0437-4715 [26].
Equation (25), we can use the cross-correlation statistic presented in Equations (51) and (52) to write down an optimal filter to apply to the data and draw out an estimate for the amplitude To start, we remind the reader of Equation (20) relating the measured GWB intensity to the energy density, which we may rewritten as with By inspection, we already oberve that H( f , f ref ) is an unbiased filter to use on the data to derive an estimate of the background, in the case where the GWB obeys Equations (20) and (51). In general, there are several methods to show that this filter is optimal and write an estimator for Ω GW , and different assumptions on the signal and noise components result in distinct solutions. We report the derivation shown in [152]. The approach here is to find the function which maximises the SNR of the estimated GWB signal, defined as where Re-writing the expectation value for the data correlation C( f ) in terms of Ω GW ( f ref ) using Equations (20) and (51), and assuming the low-signal limit following Allen and Romano [152], one obtainŝ where F f ref may be thought of as a normalisation which depends on the chosen reference frequency, In fact, in deriving the variance σ 2 Ω as formulated in Equation (67), it turns out that F f ref is equal to the inverse variance of the estimator, up to a factor of 2, where P 1 and P 2 are the noise power spectra in detectors 1 and 2, respectively, as defined in Equation (49). In principle, these are not known and need to be estimated carefully so as not to bias the search. When operating in the weak signal, it is reasonable to calculate the noise power spectra from the data directly; however, one should be cautious not to use these estimates on the very same data stretch they have been estimated from. In practice, for stochastic searches using ground-based interferometers, the data are segmented into short time segments and the full estimator takes the average of Ω GW over the entire dataset. To estimate Ω GW in a single data segment τ, one may use noise power spectra estimated from other segments within a time window over which detector noise is stationary; for example, in the LVK stochastic searches, the average of P i in the adjacent segments is taken. Otherwise, it is also possible to estimate the noise power spectrum using a parametric fit, as in [153][154][155], or by using a wavelet expansion using methods as shown for example in [156]. Another approach to construct an estimator for the SGWB is to construct a likelihood function for the signal given the data, and maximise it to obtain a maximum likelihood detection statistic. Let us switch to a compact vector notation, where the data vector d = (d 1 , d 2 , ..., d N ) spans the detector space comprising N detectors, and data covariance is the generalised N × N correlation matrix C, where each entry is C ij = d i d j . Starting with a zero-mean Gaussian assumption for both the signal h and noise n terms in the data, we can write where all observed time segments τ and frequency modes f are considered independent. A method based on this likelihood employs both auto-correlations and cross-correlations of the datastreams to reconstruct SGWB and is discussed in the context of ground-based detectors in [157]. Using auto-correlations is not recommended in scenarios where the detector noise terms are not independently measured, as these will be inextricable from the auto-power spectra. This likelihood is in fact proposed for future LISA data analysis in [148,149], making full use of the independent estimate of the noise from GW-insensitive channels (see details on Sagnac channels here [158]). A more simple, alternative route can be taken by noting that, in the low-signal approximation, the Gaussian assumption can actually be applied to the average of the two-detector residual C ij − C ij , leading to a different formulation, This holds in the limit where data are averaged over many segments such that the distribution of the residuals approximates a Gaussian; otherwise, a single residual will not follow this distribution [159]. In [132], Matas and Romano show that likelihood (72) is well approximated by (73) in the low-signal limit. In this case, maximising either likelihood with respect to Ω GW ( f ref ) yields the same solution as above. The extension to include multiple baselines is straightforward, assuming each C ij may be considered as an independent measurement. The full likelihood L(C) becomes the product over the individual baseline likelihoods as where the baseline index b cycles over the detector pairs ij in the set, without double counting baselines.
The approach can be taken one step further to formulate a hybrid frequentist-Bayesian analysis as proposed in [160] by re-expressing the likelihood in terms of Ω and the model parameters we would like to constrain, where Ω M is the fractional energy density of the model M which is being tested, Θ are the parameters on which the model depends, andΩ b GW ( f ) is an estimator for the strength of the SGWB in a single frequency bin (see, e.g., [161]).
Equation (75) is used as the basis for parameter estimation of models for the GWB, and Bayesian model selection [160]. In the case of parameter estimation, one estimates the posterior distribution of the parameters of the model, Θ, where now we have specified model M explicitly, p(Θ) is the prior on the model parameters, and is the model evidence or "marginal likelihood". The posterior is typically estimated either by brute force estimation or, in high dimensional cases, by MCMC methods. In addition to estimating the posterior in Equation (76), one can perform Bayesian model selection by comparing the marginal likelihood for two separate models. This is used to construct an odds ratio between two models A and B, The first term on the right-hand side is the prior odds, where one specifies any prior information about the preference for one model over the other. The second term is simply the ratio of evidences. Odds ratios, or Bayes factors (simply the second term on the right-hand side, i.e., equal prior odds), can be used to distinguish between models for the GWB. A general heuristic guide for their use is discussed in Chapter 3 of [33]. Bayes factors have been used recently as the detection statistic in searches that seek to measure alternative polarizations of GWs [56,161,162], in searching for a background from superradiant instabilities around black holes [163,164], as well as searching for models of cosmological backgrounds [165][166][167]. This hybrid-Bayesian methodology has also been proposed in attempting to distinguish between a GWB and globally correlated magnetic noise [168], and simultaneously search for multiple GWB models contributing at once [169]. We discuss the problem of globally correlated magnetic noise for current and future detectors in more detail in Section 5. In addition, a Fisher matrix formalism has also been proposed to simultaneously search for multiple GWB contributions [170]. The detection approaches outlined above have been validated both through mock data challenges [171] and hardware injection campaigns [172]. The latter consisted in generating an isotropic Gaussian stochastic signal by manipulating the test masses at both LIGO sites and successfully recovering the injection via a cross-correlation stochastic search. However, the mock data challenge in [171] considers the case of a semirealistic astrophysical background of binary black holes and neutron stars. Here, the authors prove that the cross-correlation statistical approach assuming a Gaussian background remains valid even in the case of an intermittent astrophysical signal, although it is suboptimal. The fully optimal search methods for this sort of signal are described in Section 4.4.

Pulsar Timing Arrays
In describing the typical Bayesian analysis that is currently performed by most PTAs to search for an isotropic GWB, we follow the methods laid out in, e.g., [173][174][175][176][177], and refer to more specific studies throughout our discussion. Unlike for ground-based interferometers where the detector strain time series is uniformly sampled with equivalent time stamps at each detector, the data collected by PTAs are the arrival times of pulses from an ensemble of pulsars on the sky. Hence, it is preferable to work in the time domain directly, and the typical starting point is the timing residuals of all pulsars, δt, which are left over after subtracting off the best-fit timing model constructed using deterministic parameters of the pulsars (e.g., rotation frequency, spin-down, binary parameters, sky position, etc.). The likelihood of obtaining n timing residuals given model parameters θ for a given pulsar is a multivariate Gaussian as where s includes contributions from deterministic signals with explicitly modelled time dependence, including the evolution of arrival times as described by the timing model. The covariance matrix C represents contributions from stochastic processes. Note that while we have used "∝", the specific normalization does matter in this analysis in general, as the spectra of some of the stochastic processes that contribute to C are themselves described by a set of free parameters. The diagonal elements of C yield "white" noise that is uncorrelated in time whereas off-diagonal elements of C manifest as time-correlated "red" processes, including the GWB. Red processes are modeled in the frequency domain with a power-law for the power spectral density of timing residuals where f is the GW or noise frequency, and A and −γ are the power-law amplitude and spectral index, respectively. We discuss sources of red noise that affect pulsars individually in more detail in Section 5.2.2. For the GW case, amplitude A corresponds to the GW strain amplitude at the reference frequency f ref = yr −1 and γ corresponds to α in Equation (25) and α explained in footnote 3: γ = 3 − 2α = 5 − α = 13/3. The lowest frequency and the size of the frequency bin ∆ f are usually selected to be equal to the inverse of the total observation time T obs , which is typically more than a decade. The data are collected every few weeks. The observed power spectral density of timing residuals P( f ) [s 3 ] is derived from the GW strain power spectral density S( f ) (s) in Section A4 in [178]. Neglecting frequencies f other than multiples of T −1 obs , the densities and strain are related as With each pulsar modelled by tens of timing model parameters and the necessity to invert the covariance matrix that models both temporal and spatial correlations for every likelihood evaluation, in practice it is necessary to marginalise the likelihood over nuisance parameters, which include timing model parameters and Gaussian coefficients that yield a specific realization of red noise. Below, we describe the marginalization procedure [179,180] that is employed by recent searches (e.g., [26,54,181]). First, the likelihood is rewritten as where The covariance matrix N is now diagonal and, thus, represents the white noise associated with each observation; represents deterministic timing model parameters and the "design matrix" M maps them to the timing residuals; Fa represents low frequency "red" noise. The latter is modelled by a Fourier series with a Fourier amplitudes vector a and a matrix F of unit-norm frequency Fourier modes. Typically, the frequencies of the Fourier series are multiples of 1/T obs [175]. η represents hyperparameters that govern the shape of the red noise spectrum defined by F and a, and s denotes deterministic terms that are not marginalised over. To be precise, the matrix N is block-diagonal, consisting of the nominal error in pulse arrival time, but parameterised by "EFAC," which multiplies arrival time errors; "EQUAD", which is added to the arrival time errors in quadrature; and "ECORR", which accounts for correlated measurement uncertainties in contemporaneous, multi-band observations and contributions to the off-diagonal terms [54,180,181]. Further compactifying the equations, we express Gaussian priors are then placed on b, with covariance B, for which its parameters have previously been referred to as η, where we have left the prior on the timing model parameters unconstrained (the upper left corner indicates a block with diagonal entries of infinity). The posterior is then proportional to the original multivariate likelihood described by Equation (79) times the prior described by Equation (86), The initial timing model parameters and the design matrix are obtained with least-squares fitting. The matrix ϕ contains information about the spectrum of low frequency noise, including intrinsic red noise for each pulsar and the GWB. If we label pulsars as a and b and frequency bins (multiples of 1/T obs ) as i and j, then the elements of this matrix are where ρ i is the spectrum of the GWB (and is, therefore, present in all cross-correlations and auto-correlations), and η ai is the spectrum of intrinsic red noise of pulsar a in frequency bin i (and is, therefore, only present on the diagonals). The vectors of maximum-likelihood values of b,b, are obtained by solvingb Alternatively, one can explicitly marginalise over b as where C = N + T BT T . C can be efficiently inverted using the Woodbury formula to increase computational efficiency. In principle, ϕ is block diagonal with independent frequency bins but correlated noise between pulsars when a GWB is present. However, a GWB signal is expected to reveal itself as a strong common red noise process before cross-correlations are evident. Recent analyses have, therefore, first evaluated the term in Equation (89) with Γ ab = δ ab [179] as the null hypothesis and Γ ab given by Equation (63) as the signal hypothesis. As we discuss in Section 5.2, there is strong evidence for a common spectrum process (Γ ab = δ ab ) compared to the intrinsic red-noise only hypothesis (Γ ab = 0), but little compelling evidence for cross-correlations compared to a common spectrum process.
The η ai and ρ i spectra are modeled as a power law, and so instead of having N freq parameters per pulsar (plus N freq more parameters for the GWB), one has 2(N pulsar + 1) free hyper-parameters that specify the shape of noise processes. White noise parameters, such as the typical EFAC, EQUAD, and ECORR parameters are measured on a per-pulsar (and per-observatory or per-observing back-end) basis. Typical pulsar datasets are modelled by tens of such parameters; thus, white noise parameters are often fixed to their maximum likelihood values that were calculated by analyzing pulsars individually. Bayesian inference with the marginalised likelihood typically yields posterior samples for the 2(N pulsar + 1) hyperparameters, and the Bayesian evidence, which is the integral of the likelihood over the prior. The evidence is used to select a model that best describes the data. In some recent analyses, explicit Bayes factors (the ratio of evidences between models) are calculated using a "product-space" approach that foregoes individual evidence calculations in favor of sampling from multiple models simultaneously [54,181].
It is also common to consider an optimal statistic which is built similar to the one presented for LVK searches [150,182,183]. We follow the conventions and methods in [183], which were implemented in [54]. We start from the auto-covariance and cross-covariance matrices calculated from the timing residuals, For a GWB with PSD given by P GW ( f ), the cross-covariance matrix is given by Specifically, for a background dominated by supermassive black hole binaries that evolve solely due to GW emission, γ = 13/3. The optimal statistic,Â 2 , is then given by [150,182,183] with In the second line, we define the amplitude-independent cross-correlation matrix, which makes Â2 = A 2 GW . If A GW = 0, then the variance of the optimal statistic is given by which allows us to construct an SNR ρ =Â 2 /σ 0 . The SNR significance can be found by estimating its noise-only distribution using simulations of the data with pulsar sky positions randomly assigned or random phase shifts applied to the data to remove a signal [180,184]. The implementation of the optimal statistic requires evaluating C a and S ab for a given choice of hyperparameters that define a, which are amplitudes of sinusoids on which the red noise is decomposed. One typically uses hyperparameters that correspond tob in Equation (92). One can also use the maximum a posteriori estimates of the red noise parameters from the Bayesian analysis, A max red,a , γ max red,a , A max GW , and γ max GW , but this can result in a large bias in the estimate ofÂ 2 because the individual red noise parameters and the common red noise parameters are highly correlated. In practice, the red noise parameters are drawn from the posterior samples from the Bayesian analysis and for each draw the optimal statistic and its SNR are calculated. One can estimate a value ofÂ 2 from this distribution, which is significantly less biased than if one calculates the optimal statistic once from the maximum a posteriori estimates of the noise parameters [183].
While our focus in this section has been on methods for detecting an isotropic GWB with PTAs, note that there is also a recent paper that aims to measure both continuous sources from individual supermassive black hole binaries as well as an isotropic confusion background [185].

Anisotropic Background Detection Methods
Mapping an SGWB is a distinct problem from that of tracing back an individual source position. Individual source mapping leverages the coherence of radiation arriving from the source, using the phase information of the signal to estimate both angular and frequencydependent phase along with other intrinsic and extrinsic source parameters. In either case, however, a basic requirement for directional analyses is multiple observations of the same signal. These multiple observations can be made in different ways. Multiple detectors at different locations will impose a known phase shift on a coherent signal, which can be used to constrain the position on the sky. In the case of a persistent, coherent source, unequal-time auto-correlation of a single detector signal can yield angular phase information, and in the case of some sources of GWs, matched filtering can be used to pinpoint the location on the sky with a significantly larger effective baseline [186]. For the case of a persistent and incoherent source, such as an SGWB, one may use equal-time correlations of data from different detectors and the motion of individual detectors with respect to the frame in which the signal is stationary to reconstruct angular information. We focus on this approach here.
Let us start by discussing the weak signal limit, which applies to the 2G ground-based detector network (e.g., LIGO and Virgo). As we did for the isotropic case, we can assume that the residuals for the data cross-correlation spectra and our model are Gaussian, similarly to Equation (73). The log-likelihood may be written as a function of an anisotropic signal I explicitly as where we generically model the correlated data as The operator A projects the sky signal I into the datastream, and contains all the necessary correlated response information of the detectors. For ground-and space-based detectors, A is given by Equation (59) up to appropriate normalisation factors. We have reintroduced the timestamp label τ here as it serves an important purpose: tracking the time-dependence of the detector response on the sky; the sky signal I is considered absolutely stationary 10 and, thus, τ-independent. For ground-based interferometric detectors, operator A depends only upon the sidereal time, which means the data stream can be folded on the sidereal day before we invert Equation (103) to solve for I [187][188][189]. To probe the effects of the projection operator A on the GW observations, it is useful to simulate a signal as a single point source on the sky and "project" it analytically (or numerically) into cross-correlated mock data, which produces the point-spread function of the detector pair. This has often been studied to glean the effective angular resolution of the detectors to a stochastic source-we refer the reader to [190] for a complete presentation of the subject.
We can now derive a maximum-likelihood mapping solution by maximising the likelihood with respect to the sky map. To best illustrate the structure of the solution, let us use a compact matrix notation where operators in bold generally span multiple dimensions, i.e., time, frequency, direction, and baseline, such as A τ b ( f ,n) → A. These are contracted appropriately to achieve the desired result,Ĩ In the simple case of a single frequency and single time segment of data, we can think of C as a vector containing cross-correlations between different baselines, Σ is the noise covariance of the cross-correlation of the detector pairs, and A is now a matrix that contains geometric factors that project different sky directions onto correlations between different detectors. Explicitly, Σ entries are of the form where i, j label the two detectors in the baseline b. More details on the calculation may be found in [35,190,191].
In practice, this solution may need to be implemented iteratively as the noise covariance Σ may not be known a priori and, hence, may need to be estimated from the data themselves. In the low signal limit however, as is the case with current ground-based networks, the noise covariance is estimated independently of the signal and Equation (104) is, thus, a closed-form mapping solution. The first term on the right-hand side is the Fisher information matrix, which is inverted and applied to the projection map, to extract the maximum-likelihood solutionĨ. z is often referred to as the dirty map in the literature, as it effectively represents a first, naïve representation of the cross-spectral density on the sky, whileĨ is considered the clean map as it is the result of a deconvolution. The Fisher matrix contains the pixel-pixel covariance information necessary to deconvolve the detector response from the dirty map. In principle, this method can also be employed for anisotropic searches in a strong signal regime; in this case, if the signal is Gaussian, the map point-estimate remains accurate; however, covariance Σ requires extra terms as the signal contribution may not be ignored. However, in the presence of a strong signal term, it is preferable to include auto-correlation terms in the likelihood, as these contain precious information. This is the case for, e.g., LISA and PTAs. Hence, we consider again the Gaussian likelihood (72), and as in [192][193][194][195], we extend it to include directional information, where the correlation matrix C τ ( f ) (re-introducing the time and frequency dependence explicitly here for clarity) may be estimated as with N as the noise covariance. Note that A is an N × N matrix representing the correlated response of the full network. The above notation implies that we are discussing LISA because we have specified the frequency domain and that A depends on τ. In principle, PTA analysis works analogouslynote the similarity of Equation (109) to Equation (89), with the added complication of a dependence on direction. However, it is worth briefly reiterating specific differences. First, the "detector response" is time-independent for PTAs, and in general we assume it is frequency independent. In addition, PTAs leave data in the time domain but expand the GW signal and intrinsic red noise in the frequency domain using a set of Fourier basis functions with an appropriate prior to impose a power-law spectrum and correlations between pulsars (see Equations (87)-(92)). It is the correlations between pulsars that are directly affected, with the anisotropy of the background causing a deviation from the Hellings-Downs curve that depends upon the specific realization of the background. This means that Equation (89) changes: with i once again specifying the frequency index, a and b specifying pulsars we are correlating, and ρ i (n) specifying the GWB amplitude at frequency index i in directionn. In this case, Γ ab (n) can be considered a direct analogue to A( f ,n).
In [194], the authors show that extracting the maximum-likelihood map solution from Equation (108) where F is the Fisher information matrix. This solution is in fact structurally identical to the Bond-Jaffe-Knox mapping procedure in CMB [196,197] and requires an iterative implementation: first, a "guess" map is injected in C (Equation (109)), then a first solutionĨ is calculated using Equation (111), which updates the estimate of C and so on until convergence. Beyond SNR considerations, this iterative method is also useful in the case of correlated noise terms as these may be included in the noise model N, whereas solution (104) rests upon the assumption that the noise is completely uncorrelated between different detectors/measurements. Alternatively, in the strong or intermediate signal regime one can employ a fully Bayesian approach to the problem as presented in, e.g., [193,195,198]. In [193], the authors develop a fully Bayesian search pipeline to use PTAs to map the GW sky using the formalism detailed in [192]. In [195], the authors present a Bayesian search for anisotropies in the GWB using LISA, where they decompose the sky onto the first three spherical harmonics, parameterize the noise covariance curve, and use MCMC sampling to map the diffuse sky and estimate the noise properties of the detector simultaneously.
Several different approaches can be taken when applying these map-making methods to real data [158,187,[190][191][192][193][198][199][200][201][202][203][204][205] based on specific necessities. The spectral dependence of the map is often factored out (although not always, see, e.g., [155,170,203,206]), under the assumption in Equation (26) and integrated over to obtain broadband maps of the GW sky. Then, the direction-dependent component of the signal may be expanded and mapped in spherical harmonics, (113) or discretized directly on the sky in pixel space, where Y m are the standard spherical harmonics, and we have used p and p to enumerate pixels on the sky that are specified using some choice of pixelization (e.g., HealPix [207]) 11 . Other works have argued for the use of curl and gradient spherical harmonics, as in CMB analyses [200][201][202], or bipolar spherical harmonics [208], although to our knowledge these have not yet been employed on real data. In both the pixel and spherical harmonic bases, the Fisher matrix F is usually very poorly conditioned for a single baseline, as the sky coverage is incomplete; this is the case even after time-integration, which takes into account the sky-scanning strategy of the baseline. Regularisation techniques are, therefore, needed to estimate maximum likelihood solutions such as those presented in Equations (112) and (106). Examples of regularization techniques include cutting off the sum in (113) a priori at some maximum [191], singular value decomposition (SVD) (see, e.g., [209]), and the drastic approach of limiting the calculation to the diagonal of F, ignoring all off-diagonal correlations [199]. In some cases, combinations of these approaches are used.
The spherical harmonic basis has mostly been employed when searching for a diffuse background, which dominates the low-modes, while pixel space has been employed when searching for point-like stochastic sources, although it has been shown that there is a one-to-one mapping between the two [210] (see, e.g., Figures 1 and 2 of [193] for an illustrative example of how specific anisotropic maps correspond to different spherical harmonic decompositions). There may however be subtle differences between the conditioning in spherical harmonic and pixel space; in interferometry in general, the choice has often been to obtain a spherical harmonic domain solution since the measurements directly constrain this domain (see e.g., [211] for an application to CMB interferometry). The reconstruction of the actual "image" is then left as a separate and straightforward conversion problem.
However, as discussed in [153], in the case of GW detectors, there is a non-trivial coupling of spherical modes in the sky response due to the non-compactness of the beam, indicating that adopting this approach may have significant drawbacks. In particular, there is no isotropic kernel for a primary beam describing the coupling of individual pairs of directions, hence limiting the number of spherical modes in the solution in a piece-wise fashion could make the problem more ill-conditioned than it actually is. Finally, in the case of PTAs it has been shown that, when working in the spherical harmonics basis, the non-uniformity of pulsars on the sky makes it impossible to decouple the monopole from other multipole moments, making the search less sensitive to isotropic backgrounds [204,205]. In those studies, the authors propose using a basis formed with principal component analysis of the direction-dependent sensitivity of the specific PTA.
In fully Bayesian examples, using too many spherical harmonics terms or pixels on the sky can make the problem intractable or, in a detection case, could widen the parameter space beyond what is possible to resolve. Techniques that can be used to make the full Bayesian problem more tractable might be to only search over the first few spherical harmonics, as in, e.g., [193,195]. Other techniques have also been proposed, one of which constructs a set of basis "sky vectors" from only the columns of the SVD matrices that are associated with non-zero singular values of the Fisher matrix. One then samples over their coefficients to reconstruct the map using a lower-dimensional representation of the sky [198,202].
Although notably more challenging when attempting map-making with a singular F, the spectral dependence may be solved for as well, resulting in narrowband directional searches. These have been carried out by the LVK collaboration targeting point-like sources which may have an interesting spectral emission (e.g., asymmetric neutron stars, supernova remnants, the galactic bulge) and have been performed in pixel space [188,189,212]. As point sources are not correlated with other points on the sky, this type of search ignores the off-diagonal correlations in the Fisher matrix to avoid the inversion problem. Another approach employed on LIGO data in [155] is that of frequency-banding, where one solves for maps in separate frequency bands spanning several Hz, making minimal assumptions about the spectrum of the signal in each band. This allows both inverting the full Fisher matrix and obtaining full sky maps in each band and attempt a model-free reconstruction of the frequency spectrum. In the case of pulsar timing arrays, the signal is dominated by a small ( 10) number of frequencies, and so in the fully Bayesian analysis in [203], they estimate different spherical harmonic coefficients at different frequencies to allow for maps that are dominated by point sources in different directions at different frequencies.

The Approach towards Non-Gaussian Backgrounds
The cross-correlation searches reviewed above have been shown to be optimal in the case of a continuous background, which, for example, presents a timestream as in the third row of Figure 3; however, these will be suboptimal in the case of a background which is intermittent, such as the first row of Figure 3. As argued in [213], an intermittent background will not satisfy the central limit theorem; thus, searching for such a signal under the Gaussian assumption will reduce the sensitivity of the search, resulting in delays until detection (although it will not bias the point estimate). Given the observations of GWs accumulated up until now [214] and the lack of a detection of a Gaussian stochastic background, it is clear that a signal of sub-threshold GW bursts from compact binary coalescences is present in LIGO and Virgo data and may be the first stochastic signal that can be detected with ground-based detectors.
A few optimised search methods for non-Gaussian backgrounds have been proposed: a maximum likelihood approach [213] where each isolated GW making up the background is modelled as a "stochastic burst" of GW strain, correlated between detectors; a Bayesian search method which leverages deterministic parameter estimation [215,216], fitting each burst with a compact binary waveform; and, soon, a fully Bayesian stochastic search [217] which extends the formalism of [213]. This method has also been extended to estimate the anisotropic properties of the subthreshold CBC population [218].
While substantially different in formulation and implementation, these approaches share the premise of re-parametrisation of the background signal based on a Gaussianity parameter, ξ, which quantifies the degree of non-Gaussianity. Effectively, this is the same as the duty cycle introduced in Section 3.4 and is linked to the duration of each individual GW burst and the frequency with which they appear in the detector. If normalised, 0 ≤ ξ ≤ 1 is the probability that a burst is present in the detector at any given time, and if ξ = 1, the background is Gaussian. The detector noise, on the other hand, is usually still assumed to be Gaussian 12 .
Then, that assuming we segment data into short segments labelled by i, the likelihood function for the data vector d i (for a set of detectors) given the data covariance C and signal model h is For an intermittent background, model h i in the likelihood will be either a GW burst or a CBC waveform, if the data segment d i has a signal, or simply equal to zero if it does not [216]. The full likelihood L full of the data with an intermittent signal may then be written as the mixture of a signal-and-noise likelihood L s and a noise-only likelihood L n as featuring the duty cycle ξ as a parameter. Maximising this likelihood (with any of these approaches) results in posterior distributions for both the signal parameters that construct h i , and ξ; furthermore, it is also possible to include noise parameters here such that these also are estimated.

Current Detection Efforts of SGWBs
A variety of searches based on the methods described in Section 4 have been implemented in interferometric and timing data, in the case of ground-based detectors and pulsar timing arrays (PTA), respectively, to either detect or constrain a wide range of stochastic backgrounds. Beyond these, many investigations have been carried out to assess the detection potential of future detectors on the ground and in space and their ability to distinguish different background sources.
In this Section, we start by discussing the most recent analysis results from the groundbased detector LIGO-Virgo-KAGRA (LVK) collaboration, obtained with the third observing run (O3) of the 2G detectors: the Advanced LIGO and the Advanced Virgo. These results build on the previous data runs O1 and O2 of the 2G detectors. The overall sensitivity of the 2G detector network may be observed in Figure 7. We then discuss future detection prospects for the 2G network, which should reach design sensitivity with the next few upgrades of the instruments [219]. When representing the sensitivity to a GWB, we use "power-law integrated" sensitivity curves, which show the sensitivity to a GWB at each frequency (at a chosen SNR level) with a power law spectral shape that is tangent to the curve at that frequency [220].
For PTAs, we provide details about existing and proposed collaborations across the world and report on their searches for nanohertz GWBs. We review the upper limits on the amplitude of the isotropic GWB from circular supermassive black hole binaries. Next, we briefly outline main sources of noise that limit the searches and discuss the future of timing array experiments. We close by briefly discussing other interesting GW searches, such as resonant bar experiments and the Moon-based proposals, and highlight their bearing on stochastic searches.

Searches with Ground-Based Laser Interferometers
The ground-based detector communities have put in a continued, concerted effort towards detecting a stochastic GW background between ∼10-10 3 Hz. The principal candidate for a stochastic background in the ground-based detector band is one given by inspiralling and merging binary black holes and neutron stars. This is clear given the sensitivity of the LIGO-Virgo network, as shown for example in Figure 7, and the fact that these detectors have already detected a large set of CBCs [74]. This background has been exclusively searched for via a cross-correlation search, tailored to the astrophysical background in particular by reweighting the data in frequency according to the expected power-law spectral dependence as derived in Equation (36). First efforts of isotropic stochastic searches date back to 2003 and the first Solid curves indicate existing results from LIGO/Virgo's first three observing runs [56], monitoring of the Earth's normal modes [221], Doppler tracking of the Cassini satellite [222], pulsar timing observations by the PPTA [223], and CMB temperature and polarisation spectra measured by Planck [223,224]. Dashed curves are forecast constraints for LIGO at A+ sensitivity, Einstein Telescope [14], AION-km [225], LISA [29], binary resonance searches [226,227], and pulsar timing with the Square Kilometre Array [228]. The dotted curve indicates the level of the integrated constraint from measurements of N eff [38]; note that this is a constraint on the total GW energy density over a broad frequency range and cannot be directly compared to the other constraints. Note also that both the Planck and N eff constraints apply only to primordial GWs emitted before the epoch of BBN. See Figure 1 for various GWB signal predictions in relation to these constraints.
science run [229], followed by several other science runs of the LIGO detectors [16,[230][231][232], and the Advanced detector era since 2016 [162,233], and, finally, the inclusion of Virgo [56]. Beyond these, several other searches have been carried out by targeting specific GWBs; notably, searches which allow for anisotropy in the signal have regularly been carried out [212,[234][235][236][237], as well as searches for cosmic string networks [130,238,239]. Other targeted searches include searches for non-GR polarisation modes [56,162,240]. Furthermore, several stochastic search efforts have been carried out by small teams outside the LVK collaboration; let us cite here a set of directional searches complementary to the LVK ones [154,155,241], a search for correlations between the anisotropic GWB and galaxy catalogues [242], searches for ultralight vector bosons [163,164], a search for a primordial inflationary background [243], and a search for parity-violating stochastic signals [167]. We focus on current search results in this section, detailing the detector characterisation issues that the Advanced detectors have faced up to now. We discuss future challenges for ground-based detectors in Section 6.1, where we explore SGWB detection strategies with third generation interferometers.

Search Results for an Isotropic Background by LVK
Applying the cross-correlation recipe described in Section 4 to the real LIGO-Virgo datasets requires firstly identifying the valid cross-correlation times at which different detectors are simultaneously online and fully operational and, subsequently, Fourier transforming the measured timestreams to the frequency domain. In practice, it is convenient to divide these into smaller time segments and fast-Fourier transform (FFT) each segment, which is treated as an independent measurement. This reduces the cost of Fourier transforming, excludes frequencies that are far into the 1/ f tail, and ensures that noise properties are stationary within each segment.
In the most recent isotropic stochastic analysis [56], raw data are segmented into 192 s segments which are then Hann-windowed and FFTed. The data are downsampled from the native sampling rate of 16 kHz to 4096 Hz, and the full frequency band included in the analyses spans 20-1726 Hz; this choice is made so as to exclude the highest and lowest frequencies available where noise becomes prohibitively large. Data quality cuts are applied based on a measure of noise stationarity between segments or when there are known detector artifacts in the data, and these cuts reduce viable data by ∼20% in O3. In addition, frequencies where narrow spectral artifacts ("lines") occur and are caused by known instrumental disturbances or show coherence between detectors are removed [244] and typically cut ∼15-20% of the frequency band. More details on the methodology, exact values of data excised for each baseline in O3, and implementation of the quality checks can be found in [56,244]. Note that this is the first advanced-detector era stochastic analysis to include multiple baselines 13 ; we highlight in this section the important implications of adding two long baselines to the network. Specifically, O3 analysis spans Hanford-Livingston (H-L), Hanford-Virgo (H-V), and Livingston-Virgo (L-V).
Several new analysis features were implemented in [56], as the LVK collaboration gears up for a possible SGWB detection. The results are presented by using a log-uniform prior, assigning equal weight to each order of magnitude in Ω GW , in line with our current state of knowledge. Furthermore, a correlated noise analysis targeting magnetic noise resonances was carried out, using the framework set up by the likelihood (75). The analysis confirmed the lack of correlated magnetic noise across the interferometer network and laid out a clear methodology, which will prove essential for similar analyses with future, more sensitive detectors (e.g., 3G). For the first time, short gates (∼1 s) were employed to remove frequent, loud glitches which populated the LIGO detector O3 data [245,246]. These were not as frequent nor problematic in previous data runs, and investigations are still under way to determine the sources of these noise artifacts. Without their removal, loud glitches would bias the PSD of a 192-second segment, which would then not pass the stationarity cut, reducing the effective live time of the LIGO detectors to less than 50%, whereas gated datasets are reduced by about 1% of the original total length.
The H-L baseline contributes the most to overall network sensitivity, while Virgo baselines remain important for higher frequencies and, hence, especially for searches employing higher values of the spectral index α. The combined H-L-V spectrum for all three data runs shown in Figure 8 is consistent with Gaussian noise, with symmetric fluctuations around zero. Note that the Virgo baselines contribute information around 64 Hz, where the overlap reduction function of H-L is null. In general, the three baseline combinations are more sensitive given that each detector has different consistent monochromatic noise lines (e.g., the harmonics of the suspension mechanism) which are excluded from analyses, inducing gaps in the frequency spectra.
As there is no clear evidence of a GWB, as LVK places upper limits on the three fixed power law models: α = 0, typically associated with a scale-invariant, cosmological model [76]; α = 2/3, which describes a population of inspiralling compact binaries [64]; and α = 3, which corresponds to a flat strain power [152]. These are astrophysically motivated further in Section 3. With 95% confidence, it was found that Ω GW | 25Hz < 0.6 ± 1.7 × 10 −8 (α = 0), 0.3 ± 1.2 × 10 −8 (α = 2/3), and 0.4 ± 1.3 × 10 −9 (α = 3). In recent analyses, α is also allowed to vary as a parameter of the GWB model in likelihood (75). In this case, a mean-zero Gaussian Combined cross-correlation spectra for the three O3 baselines. The grey lines here mark 1σ contours. The data, by eye, appears to follow the Gaussian distribution. Note that adding the Virgo baselines has filled in the gap in the H-L data present around the 60 Hz power line (this broad line is present in both USA-based detectors, but not in Italy). The original version of this plot is presented in [56]; the one shown here was obtained using open data published in [247].
prior is assumed, with standard deviation 3.5. Marginalising over α with this method yields a 95% confidence upper limit of Ω GW | 25Hz < 0.7 ± 2.7 × 10 −8 . When comparing the hypotheses of signal and noise to noise-only, the log 10 Bayes Factor is found to be −0.3. The posteriors for the joint fit of α and Ω GW | 25Hz are shown here in Figure 9.
The results discussed so far assume that the stochastic GW signal is entirely composed of the two transverse-traceless (TT) polarisation modes predicted by GR. However, modified gravity theories generally contain additional polarisations, up to a maximum of six: the two TT modes of GR, one scalar "breathing" mode, and three longitudinal modes-one scalar and two vector [248]. A detection of non-TT polarisation content would provide compelling evidence for the breakdown of GR. Using the method first developed in Ref. [161], LVKs have conducted searches for non-GR polarisations in GWB, using the distinct overlap reduction functions associated with each different set of modes (scalar, vector, and tensor) and marginalising over the power-law slope α as described above [56,162,240]. The resulting upper limits on the GWB intensity are broadly similar in each case, giving Ω T < 6.4 × 10 −9 for tensors, Ω V < 7.9 × 10 −9 for vectors, and Ω S < 2.1 × 10 −8 for scalars with the O3 dataset [56].
LVK assessed future detection prospects by projecting the expected astrophysical background signal, given the current GW detections, onto the expected sensitivities of upcoming detector networks. This comparison is present in Figure 2 in Section 3.1, where we discuss the different contributions to the stochastic background. The expected signal includes contributions from BBHs, BNSs, and BHNSs, all of which have representative direct detections in the advanced LIGO-Virgo runs. We discuss details of the astrophysical background model in Section 3.1; in essence, the total GWB at the selected reference frequency is expected to be Ω CBC GW | 25Hz < 1.9 × 10 −9 , which is about an order of magnitude away from the upper limit set with α = 2/3. Taking Ω CBC GW as an upper limit, it is compared with the 2σ integrated sensitivity curve for O3,2-year projections for the Advanced LIGO-Virgo network operating at design sensitivity [219], and a 2-year, 50% duty cycle A+ network at design sensitivity [249]. Figure 2 shown in Section 3 suggests that by the time design sensitivity is reached for Advanced LIGO-Virgo detectors, these will be marginally sensitive to the background. By the A+ phase, the vast majority of the background signal should be within reach of detection.
Both the astrophysical GWB spectrum and the set of individual BBH signals resolved by LIGO/Virgo are determined by the cosmic BBH merger rate historyṄ(z). The non-observation  . Posterior distributions for the GWB energy density Ω GW at reference frequency f = 25 Hz and spectral index α. The dashed grey lines show the priors used in parameter estimation. Note that a log-uniform prior was chosen here on Ω GW , and a mean-zero Gaussian was chosen for α as the detector noise PSD is approximately flat. The original version of this plot is presented in [56]; the one shown here was obtained using the open data published in [247]. of the GWB from this category of sources sets an upper bound on this function, while the set of resolved signals provides a lower bound at low redshift. By combining both pieces of information, we can, therefore, infer something about this merger history [56,65], potentially revealing interesting astrophysical information about BBH abundances and formation channels, such as whether BBHs are predominantly formed through isolated binary evolution or through dynamical assembly in dense stellar environments. As was recently pointed out in Refs. [250][251][252], these inferences about the BBH merger rate at high redshift can also be used to set an upper bound on the fraction of individual detections that are expected to undergo strong gravitational lensing.At present (i.e., post-O3), the uncertainty onṄ(z) is still large, spanning several orders of magnitude at redshifts z 1 [56], but we can look forward to significant improvements with future LVK observing runs. The inferred redshift shape ofṄ(z) is currently consistent with the BBH merger rate directly tracking the cosmic star formation rate (see Figure 10), but there are some tentative hints of a shallower growth at low redshift, as one would expect if there is a non-zero delay time between star formation and the eventual merger event. Note that the star formation rate reported here is arbitrarily normalised.
LVK stochastic searches have also been used to search for various cosmological sources of GWs that have been proposed in the literature, which we reviewed in Section 3. Among the most promising of these sources are cosmic strings, and LVK stochastic searches can be used in particular to place upper limits on the cosmic string tension Gµ. The precise constraint depends on the model adopted for the string loop network, but the strongest constraint from the first three observing runs is Gµ < 4 × 10 −15 [130]. This is the strongest existing constraint on a cosmic string model from any experiment; for comparison, CMB constraints are typically at the level of Gµ 10 −7 [254]. Another potential signal of cosmological origin is SGWB from PBHs. The non-observation of this signal by LIGO/Virgo implies that PBHs in the mass range 2 M -200 M cannot make up the entirety of the cosmic CDM budget [255,256], f PBH 1; by the time LIGO/Virgo reach design sensitivity, this constraint is expected to improve to f PBH 10 −3 . Searches have also been conducted for the "scalar-induced" SGWB generated by Median Estimate 90% Credible Bounds Vangioni+ SFR Figure 10. Collection of all posterior draws for the merger rate as a function of redshiftṄ BBH (z) inferred from the combination of the resolved BBH detections in the GWTC-2 catalog [253] and the upper limits on the stochastic background obtained with LIGO/Virgo O3 data. The original version of this plot is presented in [56]; the one shown here was obtained using open data published in [247].
the collapsing overdensities during the process of PBH formation [166]. For GWs in the LVK frequency band, this corresponds to the formation of very light PBHs, with masses 10 13 kg.
With future third-generation interferometers, these searches are expected to dominate over existing constraints coming from electromagnetic signatures of PBH evaporation. We have aimed here to provide only a taster of some of the numerous cosmological GW sources that have been probed with LVK stochastic searches. Many other examples exist, including first-order phase transitions [165], superradiant bosonic condensates around black holes [163,164], and even sources which are not associated with GWs, but which couple directly to the LIGO/Virgo test masses to mimic a SGWB signal, such as dark photons [129].

Search Results for an Anisotropic Background by LVK
The anisotropic search methods outlined in Section 4.3 were most recently applied to LIGO and Virgo detector data in [212]. For the first time, data from Virgo was included in the search, leading to a three-baseline network, similarly to the isotropic search presented in Section 4.2.1. The data were pre-processed as in the latter, starting from the gated LIGO data-sets. To increase the efficiency of the search, the data were folded to one sidereal day before analysis [187]; this effectively reduces the cost of the search by a factor equal to the number of days in the dataset, with extremely little loss of information.
Three different, complementary approaches were employed in the search: a broadband, pixel-based analysis targeting point sources, referenced as the broad-band radiometer (BBR) analysis; a broadband, spherical harmonic decomposition (SHD) analysis, sensitive to extended stochastic sources [191]; and, finally, a narrowband analysis targetting three specific sky locations where a GWB signal may be expected, dubbed the narrowband radiometer (NBR) search. Notably, the so-called "radiometer" searches are both carried out in pixel space and ignore off-diagonal correlations present in the Fisher information matrix (Equation (106)) under the assumption that each direction is fully uncorrelated. The SHD search on the other hand is aimed at diffuse sources which span multiple directions and, thus, requires tackling the Fisher matrix inversion problem. The spherical harmonic domain is chosen here to reduce the effective number of degrees of freedom to a small number of modes on the sky, assuming the measurement is diffraction-limited, i.e., the maximum resolvable scale on the sky is directly related to frequency f and aperture D of an interferometric measurement by the following (see, e.g., [236] for a more detailed discussion), In a cross-correlation GW analysis with pairs of detectors, D is the length of the baseline for each detector pair. To obtain an estimate for a sensible max to use in the search, frequency f is assumed to be the frequency at which the detectors are most sensitive to the stochastic signal; in practice, three different spectral indices were employed in the SHD search in [212], which approach LIGO/Virgo sensitivity curves at different "peak" frequencies. These suggest a band-limiting of max = 3, 4, and 16 for α = 0, 2/3, and 3, respectively. The BBR analysis yields sky maps consistent with mean zero Gaussian noise and, thus, provides directional upper limits, as seen in the signal-to-noise ratio (SNR) maps presented in Figure 11. Note that these are obtained by assuming each direction is fully independent and, thus, should not be read as maps at all, but rather each pixel value should be interpreted as the upper limit on stochastic GWs in that particular direction in the case where that direction is the only one contributing to a stochastic signal. Similarly, SHD analysis produces spherical harmonic coefficients for GWB, band-limited according to the prescription above, which are then converted to the pixel domain yielding maps consistent with noise. The angular power spectrum C GWB as defined by Equation (47) can be directly estimated from the coefficients; however, as the measurement is noise-dominated, the unbiased estimator is [155,191] Here,Ĉ GWB is the naive angular power spectrum estimator calculated as in Equation (47) using the spherical harmonic coefficients obtained from the data. N is an estimate of the noise covariance of the GW measurement, containing the appropriately normalised inverse Fisher matrix F calculated in spherical harmonic coordinates and is taken to be a measure of the uncertainty. Employing the unbiased estimator results in an angular power spectrum estimate fully consistent with zero; thus 95%, upper limits are placed on the spectrum as shown in Figure 12.  Figure 11. SNR maps obtained with the "broad-band radiometer" search method applied to O1, O2, and O3 data, for different fixed values of the spectral index α, as explained in the text. It is clear there is no significant high SNR signal in the maps. The original version of this plot is presented in [212]; the one shown here was obtained using open data published in [257].  [212]; the one shown here was obtained by using open data published in [257].
The NBR analysis described in [212] sets upper limits on frequency spectra from three astrophysically relevant sky locations: the direction of the X-ray emitting binary system Scorpius X-1 [258]; the direction of the remnant of Supernova 1987A [259]; and the center of our Galaxy, which may contain relevant GW sources [260]. No overdensity of GWs is found in any of these directions; hence, narrowband upper limits are placed in each frequency bin. Recently, LVK extended this type of search to an "All-Sky, All-Frequency" (ASAF) search [261], which produces sky maps for each frequency bin, in the pixel domain and at high pixelisation, assuming each pixel is uncorrelated. This type of search is best suited to quickly identify candidates for continuous, quasi-monochromatic GW wave sources coming from point sources on the sky, e.g., rotating neutron stars. The results of this search found a small set of outliers which may be investigated with continuous waves searches, for example, with methods proposed in [262].

Stochastic Searches with Pulsar Timing Arrays
Unlike ground-based interferometers, the first astrophysical signal observed with PTAs is expected to be a stochastic signal [121] from supermassive black hole binaries. For an insightful review of the astrophysics that can be performed with PTAs, see [263], and for a broader overview of the field, see [264]. The concept of such experiments was proposed by Sazhin [265] and further discussed by Detweiler [266]. To reach nanohertz frequencies, PTAs perform arrival time measurements from an ensemble of millisecond pulsars for a minimum of several years. The first three PTAs are the EPTA [21], NANOGrav [19], and PPTA [20]. The dependence of the Hellings-Downs curve in Equation (63) on the angular coordinates of pulsars implies that timing arrays benefit from observing pulsar pairs across the entire sky. Specifically, it was shown by Siemens et al. [267] that it is important to increase the number of observed pulsar pairs to resolve Earth-term spatial correlations of the background from the spatially uncorrelated pulsar-term component of the background, which manifests as time-correlated noise.
While EPTA and NANOGrav mostly observe pulsars from the northern hemisphere, PPTA has access to pulsars from the southern hemisphere. Recently, two more efforts have emerged: the Indian Pulsar Timing Array (InPTA, [25], which is the newest member of IPTA); and the MeerTime project [268], which is based on the MeerKAT [269] facility in South Africa, a precursor of SKA [27]. A recent proposal sees the new telescope FAST [28], the largest "dish" radio telescope in the world, laying the foundations for a PTA collaboration in China. The IPTA [23] is the global effort to accelerate the detection of the nanohertz GWB by combining data sets across the previously mentioned member collaborations.
As discussed in Section 4.2.2, it is expected that, prior to the detection of Hellings-Downs spatial correlations, temporal correlations with the same spectra will be observed in pulsar data sets. This phenomenon is now known as the "common-spectrum process", and it was first reported in the search for the background with the NANOGrav 12.5-year data set [54], where the authors for the first time did not place limits on the background amplitude but instead provided the measurement of the amplitude of the common-spectrum process.
After that, the PPTA collaboration reported [26] statistical evidence for the commonspectrum process in a similar fashion, while also suggesting that current analyses do not consider a model where spectra are similar but not common. Furthermore, the authors simulated data set showed evidence for the common-spectrum process, whereas amplitudes of the injected noise power spectra were different by several orders of magnitude. Thus, it was shown that the evidence for the common-spectrum process might arise from pulsar noise described by parameters in a similar range or by model mis-specification.
On the other hand, compared with NANOGrav results [54], PPTA [26] measured the spectral slope −γ of the common-spectrum process to be even more consistent with the theoretical prediction for the GWB from supermassive black hole binaries, γ = 13/3. The relation between this value and the strain spectral index α = −2/3 is explained in Section 4.2.2. Furthermore, under certain analysis conditions, constraints on spatial correlations as a function of pulsar-Earth baselines obtained by the PPTA seem to be even in a better agreement with the Hellings-Down curve. The PPTA measurement is presented here in Figure 6 and the NANOGrav measurement may be seen in Figure 7 of [54]. The PPTA measurement is affected by the pulsar with strong unmodelled excess noise [26]. The EPTA collaboration has also reported evidence for the common-spectrum process with a 24-year-long data set from an observation of six pulsars [276]. Measurements of power-law parameters of the commonspectrum process by NANOGrav, PPTA, and EPTA are presented in Figure 13.
The results of searches for the isotropic background are summarised in Table 1. The observation time span T obs does not directly increase with publication year due to the inclusion or noninclusion of older data where the limited radio frequency band is not sufficient to model "chromatic" red noise arising from pulse propagation effects in the interstellar medium. One also might notice that a few upper limits on the GW strain amplitude are below the amplitude of the common-spectrum process. This is partly caused by the adoption of the older Solar System ephemerides in earlier publications, which were known to contain systematic errors. The origin of the common-spectrum process is to be clarified in future work through better constraints on spatial and temporal correlations. For a more thorough discussion about the noise with references please refer to the following subsections.  (80)), of the common component of power spectral density of timing residuals δt as reported NANOGrav (orange, [54]), PPTA (blue, [26]), and EPTA (green, [276]).

Challenges in GWB Searches with PTAs
PTAs, as galactic-scale gravitational wave detectors, are affected by a wide variety of noise sources. These include instrumental noise artifacts from the instruments and the hardware that processes raw telescope data; effects induced by the Solar System planets and/or the interstellar medium; and jittering of the neutron stars themselves-sources of radio waves-and their gravitationally bound companions such as other neutron stars, white dwarfs, or black holes. Both instrumental and environmental noise sources play an important role. As discussed in Section 4.2.2, stochastic processes in PTAs can be uncorrelated in time, i.e., white, and timecorrelated, i.e., red. Some noise processes introduce spatial correlations between pulse arrival times in pulsars across the sky. Below we briefly discuss prospects for future PTAs given the current knowledge of the noise.
Environmental white noise associated with pulse shape changes is referred to as jitter (e.g., [278,279]). Without jitter noise, measurement errors are limited by the radio pulse signal-tonoise ratio. New scientific instruments such as FAST and SKA will significantly improve upon previous radiometer (white instrumental) noise levels, thus expanding the number of pulsars in the array and improving noise power spectra of the observed pulsars. Therefore, the timing precision of the brightest pulsars will be limited by environmental properties. In particular, MPTA reported [268] an extraordinary timing precision for some of the millisecond pulsars previously observed by the PPTA. PSR J1909-3744 showed the root-mean-square of timing residuals (rms 14 ) of 66 ns with 11 months of observations, whereas in the second data release of PPTA [280], this pulsar showed rms residuals of 240 ns. Pulse integration times may be increased in the future to further lower white noise levels. Following Equation (81), a pulsar with an achievable timing precision of 10 ns observed biweekly, in 10 years, would be able Table 1. Results of searches for the nanohertz-frequency isotropic GWB with the amplitude A and the energy-density spectral index α = 2/3 in chronological order. The first three columns list details of corresponding publications. Publications that adopted alphabetical author list are denoted by asterisks. The fourth column contains 95% upper limits on A for the GWB and measurements of A of the commonspectrum process (CP) with credible levels of 1σ in [26] and 5-95% in [54,276,277]. The values in the last two columns are the characteristics of the data set: the total observation time T obs and the number of pulsars in the array N psr .

Publication
Collaboration Year to probe a GW strain of 5 × 10 −17 at 3 nHz. In reality, however, the detection of a nanohertz stochastic background requires timing an ensemble of pulsars, and there are sources of red noise that limit timing arrays at low frequencies.
Red noise is resolved in most pulsars after they are observed for a couple of years. A comprehensive study of this topic for the first data release of the IPTA can be found in [281]. Normally, the strongest red noise in recycled pulsars appears at low radio frequencies ν from variations in the dispersion measure, electron column density towards the line of sight. Due to dependence P( f ) ∝ ν −2 , this noise is easy to model and it is straightforward to separate its contribution from the one of the GWB. Some red noise was found to be affecting only certain radio frequency bands or back-end observing systems. Independent of ν, "achromatic" red noise is also referred to as spin noise as it is associated with pulsar rotational irregularities. Spin noise with a power spectral density which continuously increases towards low frequencies with a power-law might eventually turn over, but no such spectra were found in millisecond pulsars [282,283]. One may find the term "timing noise" in the literature, which is used to denote spin noise and glitches, i.e., sudden changes in pulsar spin periods followed by an exponential relaxation (e.g., [284]). GWBs are separated from spin noise through spatial correlations, although incorrect red noise models may result in incorrect inferences of the GWB [285]. For the EPTA second data release, it was shown [283] that tailored pulsar red noise models with band-and system-related noise marginally increase evidence for Hellings-Downs correlations, whereas PPTA showed that the inclusion of the brightest PSR J0437-4715 (which shows significant red noise and residual excess noise [286]) results in an increase in the inferred correlation coefficients Γ(ζ ab ) [26]. Before spatial correlations are resolved, both the GWB and spin noise can yield evidence for the common-spectrum process [26]. Inferences of the GWB based on the common-spectrum process at this point in time may be treated as speculative, especially given that spin noise in pulsars may yield a power-law P( f ) with γ = 4 [287], which is close to γ = 13/3 = 4.3(3) for the GWB. Some low-frequency noise can also introduce spatial correlations which can still be distinguished from the Hellings-Downs prediction [288]. Time-correlated errors in terrestrial time standards, also known as clock errors,  Figure 14. Power-law-integrated sensitivity [291,292] at the level of SNR = 3 of simulated PTAs to the SGWB strain h(t). We demonstrate the effects of timing precision, red noise, and a number of array pulsars. Pulsars are distributed isotropically across the sky and observed biweekly. Orange lines represent variations of the 15-year-long PPTA DR2, with real pulsar T obs , white noise levels based on the rms residuals ranging from 240 ns to 24,050 ns [280] and best-fit red noise parameters from [286]. The dashed line represents PPTA DR2 extended by 5 more years of observation, whereas the dotted line represents PPTA DR2 with a doubled number of pulsars with the same noise properties. Red line is an example of the future IPTA with 20 years of observations of 100 pulsars with white noise rms drawn uniformly between 20 and 500 ns. Red noise log 10 A is drawn uniformly from [−14, −20] and γ drawn uniformly from [1,7], a fiducial choice based on [286]. Green lines represent SKA observing 50 pulsars with white noise rms of 30 ns for 15 years, with (solid line) and without (dashed line) red noise assumed for the IPTA.
can yield monopolar spatial correlations between pulsars a and b, Γ(ζ ab ) = 1. Pulse arrival times are referenced to the position of the Solar System barycenter; thus, systematic errors in the barycenter position can yield red noise with dipolar spatial correlations Γ(ζ ab ) = cos(ζ ab ). This noise can be mitigated byt improving Solar System ephemerides, and when improvements are not available, it can be modelled as a purely deterministic process [289,290]. In an analysis with the NANOGrav 11 year data set [181], it was demonstrated that the results are affected by the choice of Solar System ephemeris, whereas deterministic modeling of systematic errors in ephemerides renders consistent results. The good news is that the contributions of ephemeris noise are found to be rather weak for current data sets and ephemerides. In particular, in the search for a stochastic background in the second data release of the PPTA [26], the authors computed Bayesian evidence for errors in every parameter of the deterministic ephemeris model: masses and orbital Keplerian parameters of planets from Mars to Saturn, with orbital frequencies in the PTA band. They found marginal evidence for systematic errors in older ephemerides and no evidence for errors in present-day ephemerides used for GW analyses between 2020 and 2021. Red noise is mostly environmental; thus, it might be worth allocating more time to fainter pulsars that do not show evidence of such noise.
With the above in mind, we provide sensitivity projections for a few scenarios of future PTAs in Figure 14. When evaluating the performance of future PTAs, several simplifications are usually made. For example, often, red noise and effects of fitting for the timing model are neglected, as in Figures 1 and 7. We addressed these points in Figure 14 by adopting the formalism from [291] and the relevant code [292]. The sensitivity estimates can be further refined based on more realistic noise properties and observing scenarios-a subject of future work. For a more detailed review of measurement errors in pulsar timing arrays please refer to [293]. Challenges for GW detection with pulsar timing arrays are also discussed in [294].

Search Results for an Anisotropic Nanohertz Background
To date, one search for an anisotropic GWB with PTAs has been performed [203]. In the latter, the authors use six pulsars from the EPTA with a timing baseline of T obs = 17.7 years. While the results indicate that the data are unable to improve upon a physically motivated prior, the search setup, motivation, and results are useful to review.
The authors use a spherical harmonic decomposition with max = 4 based on analytical arguments of the PTA resolution established in [295]. They perform three parameterizations and one independent analysis. First, they use a single set of anisotropy coefficients across the entire frequency band (lowest five harmonics of 1/T obs ) and chose a fixed spectral index consistent with a population of supermassive black hole binaries. Second, they allow different anisotropy coefficients at each of the lowest five harmonics of 1/T obs . Third, they allow different spherical harmonic coefficients for each of the lowest four harmonics of 1/T obs and use a single set of spherical harmonics coefficients to describe up to the fiftieth harmonic. In all cases, they impose that the power on the sky in each pixel is positive, as negative power would be unphysical. Fourth, they independently estimate the cross-correlations between individual pulsars and then map these onto a chosen basis.
The results yielded no detection of a GWB, and limits were set on the angular power spectrum C for each of the individual parameterizations (and for each individual frequency, where appropriate). The limits on strain amplitude in multipoles with > 0 are 40% of the monopole. However, these limits do not improve upon the physically motivated prior. Current astrophysical estimates suggest that the strain amplitude in multipoles with > 0 is around 20% of the monopole [296]. In the case of the fourth analysis, the physical prior is not applied, and the authors observed that sensitivity to the dipole is reduced due to the distribution of the six pulsars on the sky, highlighting the desire for using a large number of pulsars that are uniformly distributed on the sky.

Search Results for a Nanohertz Background Not Related to Supermassive Black Hole Binaries
While quite a few sources of a GWB that can be measured by PTAs have been proposed, outside of the generic isotropic background, which is generally assumed to be dominated by unresolved supermassive blackhole binaries, only a few explicit searches have been carried out.
Recently, both the PPTA and NANOGrav have placed constraints on first-order phase transitions in the early Universe [297,298]. In [297], the authors use 45 pulsars to search for a signal from first-order phase transitions using three different models. The common spectrum process first reported in NANOGrav data in [54] is also consistent with a first-order phase transition at temperatures below the electroweak scale, with T 10 MeV at the 1σ. They calculate posterior distributions on the temperature and bubble nucleation rate when phase transition occurs, as well as the strength of the phase transition and the friction between bubble walls and the plasma. However, it is important to note that these posteriors assume only a GWB from phase transitions is present, while the data cannot distinguish between this model and one described by a GWB from supermassive binary black holes. In addition, no spatial correlations between pulsars are found; hence, no detection of a GWB is claimed. The results of [298] are similar-no evidence of spatial correlations between pulsars is reported, and constraints on the temperature of a potential phase transition are found to be 1 MeV T 100 MeV. For a recent discussion on separating a background from supermassive black hole binaries and first order phase transitions (or other potential sources of a nHz GWB) using a multi-messenger approach, see [299].
Several groups have recently searched NANOGrav, PPTA, and IPTA data sets for evidence of alternative polarizations of GWs. In [300,301], the authors claim a Bayes factor of ∼100 in preference of background from spatially correlated scalar-transverse gravitational waves compared to a simple common spectrum red noise process using both NANOGrav and IPTA data. They claim similar significance when comparing a scalar transverse background to a background from tensor transverse gravitational waves. However, they found no such significant evidence when searching PPTA data [302]. In [303], the authors find a preference for a scalar transverse background, but with less significant evidence. However, this significance goes away when the authors account for uncertainties in the solar system ephemeris using [290] or when removing one pulsar, PSR J0030+0451.

Other Stochastic Background Searches
There are many GW experiments that we have omitted from the discussion above. In this section, we very briefly highlight a few that are of particular interest, some of which are shown in Figure 7. Our focus here is on the nanohertz-kilohertz frequency band; however, see the review article [304] and references therein for a summary of detection methods at MHz frequencies and above.
The oldest method of GW detection, pioneered by Weber [305,306] in the 1960s with his "resonant bar" experiments, is to monitor the resonant frequencies of some macroscopic test object. When acted upon by a GW with a frequency matching one of the object's normal modes, the stretching and squeezing action they induce excites vibrations in these modes which can be amplified and detected. Numerous such resonant-mass experiments have been operated since the 1960s [307], but none has had sufficient sensitivity to place meaningful bounds on the GWB below the Ω( f ) < 1 level required to prevent GWs from over-closing the Universe. However, the same idea has been applied with success on a much larger scale, using the Earth itself as a resonant mass. By monitoring the Earth's normal mode frequencies using gravimeter data, Coughlin and Harms [221] placed an upper limit on the GWB in the megahertz frequency band. The corresponding PI curve has a minimum of Ω( f ) ≈ 1.3 × 10 −2 at f ≈ 1.6 mHz. Recently, a trio of similar experiments have been proposed in which gravimeters installed on the Lunar surface could be used to monitor the Moon's normal modes, providing a cleaner probe of GWs in the megahertz frequency band [308,309].
Another GW experiment which has successfully constrained the GWB below the Ω( f ) < 1 level is the work of Armstrong et al. [222], who carried out Doppler-tracking observations of the Cassini spacecraft. The principle here is very similar to IFOs and PTAs in that it tracks the trajectories of photons in the presence of GWs; however, rather than measuring changes in the light-travel time, one attempts to measure changes in the photon frequency due to GW-induced Doppler shifts between the two test masses (this is essentially the time derivative of the perturbation to the light-travel time). This analysis probed GWs in the 1-100 µHz band, with the corresponding PI curve reaching a minimum of Ω( f ) ≈ 9.9 × 10 −3 at f ≈ 3.9 µHz. This is a particularly interesting frequency range, as it straddles a significant gap between PTA constraints at nanohertz frequencies and future constraints from LISA and other space-based interferometers in the megahertz band; however, the Cassini constraint in this "µHz gap" is not strong enough to be of astrophysical or cosmological interest. Recently, it has been pointed out that much more competitive GWB constraints could be obtained in this frequency band using high-precision observations of binary systems (in particular, binary pulsars and the Earth-Moon system), by searching for the imprints of stochastic GWs on their orbits [226,227]. Other proposals for accessing these frequencies include an extremely long-baseline space-based interferometer such as the µAres concept [310].
In addition to changing the frequency and/or time-of-arrival of photons, GWs can also cause angular deflections in photon trajectories, such that distant astronomical objects appear displaced from their true positions on the sky. These GW-induced deflections have a distinctive geometric pattern across the sky, allowing one to search for a GWB signal with high-precision astrometry from observatories such as GAIA and THEIA [311][312][313]. The resulting PI curve lies in roughly in the same nanohertz frequency band probed by PTAs, albeit with improved high-frequency sensitivity. Forecasts in Ref. [313] indicate that next-generation astrometry with an observatory such as THEIA may be sensitive to Ω( f ) ∼ 10 −16 at nanohertz frequencies, surpassing even the reach of pulsar-timing observations with SKA.
Finally, a GW detection technique which is subject to growing interest in the community is atom interferometry, in which, rather than using interference between photons to measure the phase difference between two spatial paths, one exploits the wave-like behaviour of matter on the quantum scale to measure interference between atoms. While first-generation atom interferometry experiments such as AION [225] and MAGIS [314,315] are in their early stages and are not expected to have significant sensitivity to GWs, proposals for future kilometerscale experiments are forecast to provide extremely useful GWB constraints in the 0.1-1 Hz band. In particular, the AION-km proposal has a forecast sensitivity of Ω( f ) ≈ 2.7 × 10 −12 at f ≈ 0.12 Hz.

Stochastic Background Detection Prospects with Future Gravitational-Wave Detectors
We close this review with an outlook on the detection capabilities of upcoming GW detectors. We first discuss the potential of 3G detectors, which will be considerably more sensitive to GWs in the same frequency bands as LVK instruments, as may be observed in Figure 7, and specifically prospects of performing component separation with ET and CE. We then review the upcoming LISA mission and lay out different stochastic search methods which may be employed with such a detector.

Stochastic Searches with Third Generation Interferometers
By the mid-late 2030s, it is expected that ground-based gravitational wave detectors will evolve to the third generation of laser interferometers comprising ET and CE. These observatories are expected to outperform current 2G network sensitivity by several orders of magnitude in strain, extending the BBH detection horizon out to the earliest epochs of star formation at z ≈ 20 and beyond [52,316]. This would allow 3G detectors to individually resolve more than 99.9% of all stellar origin BBH signals in the Universe [317]. This eliminates the motivation to search for the background of stellar origin BBHs with standard cross-correlation methods described in Section 4 because only very few of such signals will remain unresolved, whereas searches for the non-Gaussian background outlined in Section 4.4 might be more appropriate. In between 2G and 3G detectors, there is also the proposed "2.5G" Australian detector NEMO [318], which will use very high laser power to target high frequency signals from tidal effects in the late stages of binary neutron star inspirals.
However, cross-correlation based searches might still be relevant for BNS and BHNS backgrounds because individual events will only be resolved up to redshifts 1 [319]. The approach will be then to subtract out all the resolved sources from the timestream and run a stochastic search on the remaining data. This may be performed iteratively and/or assuming different contributions relative to the signal to perform component separation [320].
Furthermore, cross-correlation based stochastic searches are suitable for searches for cosmological gravitational wave backgrounds, following a subtraction of individual compact binary coalescence signals [317,321]. The noise formed by residuals from the subtraction of best-fit compact binary coalescence waveforms can be reduced to negligible levels following the demonstration by [322].
Third-generation detectors also present a unique detection opportunity for primordial black holes. The merger rate of primordial black hole binaries per unit volume is predicted to grow continuously towards redshifts of 50 and beyond [323], producing a high redshift source of stochastic GWs which could be probed with 3G cross-correlation searches. Note that this remains subject to the theoretically uncertain rate of such events that the nature provided us with; let us cite several investigations on the matter [255,[324][325][326]. Attribution of a merged black hole binary to primordial and not stellar origin based on redshift alone will not be statistically strong even with 40-kilometer Cosmic Explorer, 20-kilometer Cosmic Explorer South, and Einstein Telescope [327]. This means that even in the scenario where primordial black holes are abundant, studies of these events will require a combination of fitting for their ensemble properties as well as cross-correlation-based searches for the stochastic background that are sensitive to subthreshold signals. In the case of competing stochastic backgrounds, studies have been performed to carry out simultaneous Bayesian fitting of both astrophysical and primordial components [169,328], and methods have been proposed to distinguish primordial black holes from astrophysical ones [329,330].
Correlating anisotropies in the GWB which electromagnetic observations has also been proposed as a target for future detectors. In some cases, this means using galaxy catalogues to guide the model used for anisotropic GWB searches [242], while in other cases, the proposal is to correlate sub-threshold GW signals with EM transients to probe fundamental physics [128,331].
There are quite a few technological challenges in the development of 3G detectors that are directly relevant for GWB searches, for which its sensitivity to Ω gw scales with f −3 and, therefore, benefit from improved low frequency sensitivity. Below, we discuss two important and probable sources of low frequency noise that are important for GWB searches.
First, globally correlated magnetic noise from, e.g., Schumann resonances [332,333] will, if not subtracted or dealt with in some manner, provide a lower limit on our ability to measure a GWB in the frequency band from ∼8-40 Hz [168,[334][335][336][337][338][339][340][341][342], and could even be an issue at higher frequencies [342]. The assumption that a GWB will look like correlated noise between multiple detectors means that correlated magnetic noise could masquerade as a GWB if we are not careful. There are several avenues we can take to avoid this issue. One of the most promising is Wiener filtering using low noise magnetometers near the detectors [335][336][337]339]. Another option is spectral separation using the hybrid-Bayesian method discussed at the end of Section 4.2, where one models the contribution of the correlated magnetic noise to the cross-correlation statistic and includes it with a model for the GWB [168]. Finally, reducing the coupling of magnetic fields into the strain channel of the detectors will also be vital [342]. In the end, some combination of all three of these ideas will likely be needed.
Second, low frequency noise due to gravity gradients caused by temperature fluctuations in the atmosphere (infrasound) or seismic waves, known as "Newtonian noise," is likely a limiting noise source at frequencies up to ∼30 Hz [14,18,[343][344][345][346]. For a complete review, see [345]. One cannot shield against Newtonian noise and, therefore, need to either subtract [347][348][349][350][351] it or cleverly design the local environment of the detectors to reduce surface seismic waves and atmospheric perturbations [352]. Estimating the specific contribution of Newtonian noise to the detectors from, e.g., seismic waves, is calculable based on a realisation of the seismic wavefield [345], but is very difficult to achieve in practice. Proof of principle tests of using an array of seismometers or geophones to create Wiener filters or use machine learning to clean a "target" seismometer channel are promising, and work on optimal design of local sensor arrays is mature [347][348][349][350][351]353]. However, cleaning a seismometer is quite different from cleaning Newtonian noise from a gravitational wave detector. In the end, it is likely that for 3G detectors, an array of geophones, broadband seismometers, and tiltmeters (which can be used to separate seismic tilt from seismic waves) will be needed to subtract Newtonian noise due to seismic waves [345,354]. Meanwhile, an array of infrasound microphones or LIDAR detectors, along with potentially using wind fences around facilities, will be needed to reduce and cancel Newtonian noise due to atmospheric fluctuations [355,356].

Stochastic Searches with the Laser Interferometer Space Antenna
Stochastic searches with LISA present distinct challenges from those discussed for PTAs and ground-based interferometer networks, primarily due to the fact that LISA is considered to be a single detector; hence, it is not possible to use a cross-correlation search with a second detector which is not co-located with LISA. It is, however, possible to auto-and cross-correlate TDI channels, as we explain here.
The antenna itself will be comprises three spacecrafts which will be positioned on three heliocentric orbits and will specify a plane at a 60 • angle with the ecliptic, in an equilateral triangle configuration which will be maintained throughout the entire duration of the mission (4+ years) [29]. Each spacecraft will send a laser beam to each other spacecraft such that, in total, six laser measurements will be made, comprising the links of the detector. The links are then combined to form TDI channels, as detailed in Section 4.1. Basic LISA studies start from considering sets of three channels, where each channel is centered around one spacecraft; however, more sophisticated studies are already under way [139,141,357]. These channels produce a set of observations, which we can treat as timestreams for the purposes of this discussion (although these are effectively combined in post-processing and are not the read-out of the detector), X(t), Y(t), Z(t), as introduced in Section 4.1. Taking the same approach as for real interferometer timestreams, we can decompose each of these in their signal and noise components, where R x is the specific response of the X channel, and n X is its noise term. The noise terms for the three channels are not independent, as each shares (possibly multiple) links; hence, a residuals analysis such as that based on Likelihood (73) with T = (X, Y, Z) T , and C is the full covariance matrix of the data, C = T ⊗ T . The key to detecting a stochastic background with LISA is then understanding how instrument noise and GW signals manifest in different TDI channels and finding an accurate modelling approach to plug into Likelihood (120). The first step consists in picking the specific TDI channels and assumptions to work in-there are several options provided in the literature [135,136,147,[358][359][360]. Both maximum-likelihood methods and Bayesian approaches which compute the full posteriors have been proposed to extract and characterise an SGWB from LISA data. In the first case, it is possible to produce a maximum-likelihood solution by fixing a noise model and iterating over the data until the signal converges to the maximum-likelihood estimate; this is essentially the same approach as that laid out in Section 4.3 for anisotropic searches, based on [194]. Otherwise, one may take a full Bayesian approach, parametrizing signal and noise components in the data and assuming appropriate priors for the two sets of parameters, then sampling the Likelihood to obtain informative posteriors. This is performed for example in [148] for an isotropic stochastic background, where the data were given by the LISA Data Challenge (LDC). Here, the authors show that, using the noise orthogonal A, E, T set of TDI channels, they are able to recover the stochastic signal and independently measure the relevant LISA noise parameters for each arm of the interferometer. Note, however, that this is performed in the absence of competing signals.
In fact, component separation will be a major challenge for LISA data analysis, as a variety of signals will contribute to the same frequency band and overlap in time [361]. In the case of stochastic backgrounds, as detailed in Section 3, LISA will be sensitive to an anisotropic galactic white dwarf binary background which traces the shape of the Milky Way, a mostly isotropic background of either primordial or stellar-origin black hole and neutron star binaries [362], and also potentially other primordial backgrounds, such as those arising from inflation [94], first-order phase transitions [55,363], and cosmic strings [53,364], all of which may be either isotropic or anisotropic. Galactic binaries will by far dominate the measurement, and in fact several approaches have been proposed towards component separation, relying on characteristic, observable differences of each component such as their spectral shape [149,365], or their distribution on the sky, as seen in [194,195]. In any case, it will first be necessary to disentangle the resolvable white dwarf binaries from the unresolvable white dwarf binary background in the data streams, as detailed in [133].
Anisotropic stochastic searches with LISA leverage the motion of the LISA constellation around the sun to map the signal in the solar system barycenter frame. The method proposed in [194] fixes the noise model, simulates mock data assuming the LISA response functions in [366] and uses the Bond-Jaffe-Knox maximum-likelihood solution in Equation (111) to test the mapping capabilities of the detector. It was found that the angular resolution of LISA for stochastic signals highly depends on their SNR as a function of frequency, as the detector response varies greatly as a function of the latter. The stochastic signals compete with the detector noise, which at low SNR dominates the modes of the Fisher information matrix (112). Beyond this, the angular resolution at which LISA "sees" the signal is diffraction-limited; hence, the higher the frequency, the better the angular resolution will be. In the best-case scenario of a high SNR signal dominating at frequencies ∼0.1 Hz, the angular resolution will be of the order max ∼ 16-see Figure 6 of [194]. Another approach proposed in [195] tackles the mapping problem by setting priors on the spherical harmonic modes of the signal and solving for their amplitudes in a Bayesian framework, using a single TDI channel. This is especially useful when the sky distribution of the signal is well known, as is the case for the galactic white dwarf binary background. The technique proposed here is based on the Clebsch-Gordan expansion, which allows one to constrain the spherical harmonics with a non-negative distribution. This method has proven to be effective for the recovery of a mock galactic background and yields an informative set of posteriors on the coefficients of the signal-see Figure 3 of [195]. However, it has only been tested at the low resolution of max = 4 as the solver scales significantly in computational cost with the number of modes one tries to recover.   [247,257], and publicly available PTA results presented in [26,54,276,286].