One-point statistics matter in extended cosmologies

The late universe contains a wealth of information about fundamental physics and gravity, wrapped up in non-Gaussian fields. To make use of as much of this information as possible, it is necessary to go beyond two-point statistics. Rather than going to higher-order N-point correlation functions, we demonstrate that the probability distribution function (PDF) of the matter density in spheres (a one-point function) already contains a significant amount of this non-Gaussian information. The matter PDF dissects different density environments which are lumped together in two-point statistics, making it particularly useful for probing modifications of gravity or expansion history. Our approach in Cataneo et al. 2021 extends the success of Large Deviation Theory for predicting the matter PDF in $\Lambda$CDM to these "extended" cosmologies. A Fisher forecast demonstrates the information content in the matter PDF via constraints for a Euclid-like survey volume, combining the 3D matter PDF with the 3D matter power spectrum. Adding the matter PDF halves the uncertainties on parameters in an evolving dark energy model, relative to the power spectrum alone. Additionally, the matter PDF contains enough non-linear information to substantially increase the detection significance of departures from General Relativity, with improvements of up to six times over the power spectrum alone. This analysis demonstrates that the matter PDF is a promising non-Gaussian statistic for extracting cosmological information, particularly for beyond-$\Lambda$CDM models.


Introduction
In the past several decades cosmology has become a solidly data-driven science. The current standard model of cosmology, ΛCDM, has a cosmological constant (Λ) as its dark energy component and cold (non-relativistic) dark matter (CDM) as its principal components.
The parameters of the ΛCDM model are most tightly constrained currently by experiments measuring the temperature anisotropies and polarisation in the cosmic microwave background (CMB) (for example the Planck measurements in [1]). However, while CMB data is very valuable in extracting cosmological information, in the push to sub-percent measurements of standard cosmological parameters, and in testing non-standard cosmologies, the large-scale structure (LSS) of the universe is the most promising complementary tool.
The principal advantage of LSS data is that it is three-dimensional, tracing how cosmic structure has evolved over time since the snapshot provided by the CMB. By counting the Fourier modes available, one can expect an improvement of 1-2 orders of magnitude on constraints from LSS data (see e.g. [2]). In particular, the large-scale structure provides a window on the expansion history of the universe, which makes it an exciting probe of dynamical dark energy and modifications to gravity. These extensions to the standard cosmology are among the principal science goals of current and upcoming missions like Euclid [3], LSST [4], and DESI [5]. They would represent new fundamental physics, and could resolve certain current observational tensions which ΛCDM is strained to explain (recently reviewed in e.g. [6][7][8]).
However, extracting information from the late universe via large-scale structure is nontrivial for several reasons. The first and most relevant for this work is that the late universe is statistically much more complex than the universe at the time of the CMB. Extracting cosmological information is always done on a statistical basis, treating observables such as a field of galaxy positions or shear lensing maps as single realisations of a random field. Gaussian random fields are completely characterised by their two-point correlation function (or its Fourier counterpart, the power spectrum), as all higher-order correlation functions either vanish or can be written as sums of products of the two-point function via Wick's/Isserlis' theorem. The CMB has been measured to be a near-perfect Gaussian random field [9], and so measurement of the power spectrum is sufficient to quantify all information content in the CMB.
However, as gravitational collapse is a non-linear process, the statistics of the late-time density are no longer Gaussian: the non-linear mapping from initial to final densities sources non-trivial higher-order statistics as information "leaks out" of the power spectrum. It is therefore crucial to determine statistics beyond the power spectrum which can recapture this non-linear information in an efficient and theoretically tractable way. Other reasons why extracting information from LSS data is difficult come down to issues of modelling dynamics on non-linear scales (often side-stepped by running N-body simulations, which are expensive and have their own host of non-trivialities), and to a variety of systematic effects in observations.

The matter PDF in spheres from large deviation theory
This work focuses on a simple choice of non-Gaussian statistic, namely the probability distribution function (PDF) of matter density in spheres. The matter PDF can be straightforwardly calculated from the density field of a cosmological N-body simulation by looking at the distribution of the matter field smoothed with a spherical top-hat filter on the scale of interest.
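To make the measurement described above concrete, the following is a minimal Python sketch of smoothing a gridded density field with a spherical top-hat and histogramming the result. It is an illustration under stated assumptions, not the pipeline used for the simulations in this work: the function names, the grid and box sizes, and the log-normal toy field standing in for an N-body density grid are all hypothetical.

```python
import numpy as np

def tophat_window(x):
    """Fourier transform of a normalised 3D spherical top-hat, with x = k * R."""
    x = np.asarray(x, dtype=float)
    small = x < 1e-3
    xs = np.where(small, 1.0, x)  # placeholder avoids 0/0 in the general formula
    w = 3.0 * (np.sin(xs) - xs * np.cos(xs)) / xs**3
    return np.where(small, 1.0 - x**2 / 10.0, w)  # series expansion near x = 0

def smooth_in_spheres(rho, box_size, radius):
    """Top-hat smooth a periodic cubic grid `rho` by multiplying in Fourier space."""
    n = rho.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k = np.sqrt(kx**2 + ky**2 + kz**2)
    return np.fft.ifftn(np.fft.fftn(rho) * tophat_window(k * radius)).real

# Toy field (hypothetical): uncorrelated log-normal densities, normalised to mean 1.
rng = np.random.default_rng(0)
rho = np.exp(rng.normal(0.0, 0.5, size=(32, 32, 32)))
rho /= rho.mean()

# Smooth on R = 20 Mpc/h in a 320 Mpc/h box, then estimate the one-point PDF.
rho_smooth = smooth_in_spheres(rho, box_size=320.0, radius=20.0)
bin_edges = np.linspace(0.5, 1.5, 41)
# density=True normalises over the samples that fall inside the bin range.
pdf, _ = np.histogram(rho_smooth, bins=bin_edges, density=True)
bin_centres = 0.5 * (bin_edges[1:] + bin_edges[:-1])
```

Because the window equals one at k = 0, the smoothing preserves the mean density, while the variance of the smoothed field shrinks as the radius grows.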
Several papers [10][11][12][13][14] provide an analytic framework for predicting the matter PDF in spheres in the mildly non-linear regime. This method relies on large deviation theory (LDT), where the driving parameter is the non-linear variance, $\sigma^2_{\rm NL}$, of the matter field. The formalism therefore remains valid at redshifts $z$ and for spheres of radius $R$ where $\sigma^2_{\rm NL}(z, R) < 1$.
For Gaussian initial conditions, the PDF, $P_{\rm lin}(\delta_L)$, of the linear matter density contrast, $\delta_L$, in a sphere of radius $r$ is a Gaussian distribution with width given by the linear variance at scale $r$ and redshift $z$,
$$P_{\rm lin}(\delta_L) = \frac{1}{\sqrt{2\pi\sigma_L^2(r,z)}} \exp\left[-\Psi_{\rm lin}(\delta_L)\right], \qquad \Psi_{\rm lin}(\delta_L) = \frac{\delta_L^2}{2\sigma_L^2(r,z)}. \tag{1}$$
The linear variance on scale $r$ is given by an integral over the linear power spectrum $P_L$ with a spherical top-hat filter in position space,
$$\sigma_L^2(r,z) = \int \frac{\mathrm{d}k}{2\pi^2}\, k^2 P_L(k,z)\, W_{\rm 3D}^2(kr), \qquad W_{\rm 3D}(k) = 3\sqrt{\frac{\pi}{2}}\, \frac{J_{3/2}(k)}{k^{3/2}}, \tag{2}$$
where $W_{\rm 3D}(k)$ is the Fourier transform of the 3D spherical top-hat filter, and $J_{3/2}(k)$ is the Bessel function of the first kind of order 3/2. The function $\Psi$ in equation 1 is related to the rate function in the context of LDT.
The key result from the LDT formalism allows us to relate the rate function of the linear density to that of the non-linear density, which provides the exponential dependence of the final PDF. This LDT result is generally called the contraction principle, and relates the rate functions of different random variables. In the cosmological case, since large deviations are exponentially unlikely and the matter PDF is computed for spherically symmetric cells, the most likely mapping from linear to non-linear densities should be dominated by spherical collapse, $\delta_L \to \rho_{\rm SC}(\delta_L)$. Combining this with mass conservation in spheres (which relates the initial scale, $r$, to the final scale, $R$, by $r = R\rho^{1/3}$) leads to the final decay-rate function $\Psi_{R,z}$ of the non-linear density,
$$\Psi_{R,z}(\rho) = \frac{\left[\delta_L^{\rm SC}(\rho)\right]^2}{2\sigma_L^2(R\rho^{1/3}, z)}. \tag{3}$$
The LDT model then predicts that the matter PDF in spheres is given by
$$P_{R,z}(\rho) \propto \exp\left[-\Psi_{R,z}(\rho)\right], \tag{4}$$
where the precise prefactor can be determined by a more detailed analysis (see equations 5a and 5b from [15]).
For a standard ΛCDM universe, there are only three quantities needed for this theoretical model of the matter PDF (through the decay-rate function in equation 3): (i) the time- and scale-dependence of the linear variance $\sigma_L^2(r, z)$; (ii) the non-linear variance of the log-density; and (iii) the mapping between linear and final densities in spheres, which is taken to be spherical collapse, $\delta_L \to \rho_{\rm SC}(\delta_L)$ (or its inverse, $\delta_L^{\rm SC}(\rho)$).
Figure 1 shows the success of this LDT model against measured PDFs from the Quijote simulations [16], as well as in comparison to a common log-normal phenomenological model (dashed lines).
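As a rough numerical illustration of the exponential behaviour described in this section, the sketch below evaluates a leading-order decay-rate function of the form $\delta_L^{\rm SC}(\rho)^2 / [2\sigma_L^2(R\rho^{1/3})]$ and the corresponding unnormalised PDF. Several ingredients are assumptions made only for this example and are not taken from the text above: the spherical collapse mapping is replaced by the commonly used fit $\rho_{\rm SC}(\delta_L) = (1 - \delta_L/\nu)^{-\nu}$ with $\nu = 21/13$, the linear variance is a toy power law with hypothetical amplitude and slope, and the prefactor and non-linear variance rescaling are omitted entirely. It is not a substitute for pyLDT.

```python
import numpy as np

NU = 21.0 / 13.0  # assumption: parameter of a common spherical-collapse fit

def rho_sc(delta_lin):
    """Approximate spherical collapse: final density from linear contrast."""
    return (1.0 - delta_lin / NU) ** (-NU)

def delta_lin_sc(rho):
    """Inverse mapping: linear contrast that evolves into final density rho."""
    return NU * (1.0 - rho ** (-1.0 / NU))

def sigma2_lin(r, sigma2_pivot=0.25, r_pivot=10.0, n_eff=-1.5):
    """Toy linear variance for a power-law spectrum (all values hypothetical)."""
    return sigma2_pivot * (r / r_pivot) ** (-(n_eff + 3.0))

def rate_function(rho, radius):
    """Leading-order decay-rate function, evaluated at initial radius R * rho^(1/3)."""
    initial_radius = radius * rho ** (1.0 / 3.0)
    return delta_lin_sc(rho) ** 2 / (2.0 * sigma2_lin(initial_radius))

rho = np.linspace(0.05, 6.0, 2000)
psi = rate_function(rho, radius=10.0)

# Exponential part only, normalised numerically with the trapezoid rule.
unnormalised = np.exp(-psi)
norm = np.sum(0.5 * (unnormalised[1:] + unnormalised[:-1]) * np.diff(rho))
pdf = unnormalised / norm
```

Even in this stripped-down form, the rate function is asymmetric about $\rho = 1$, which is what gives the predicted PDF its skewness.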

Extended cosmologies
Due to the significant non-Gaussian information in the matter PDF and the success of this LDT formalism in ΛCDM, we modify this framework to analyse cosmologies with non-GR theories of gravity or dynamical theories of dark energy beyond a cosmological constant.
Collectively, we refer to modified gravity (MG) and dark energy (DE) models as extended cosmologies. The dark energy model considered is a simple parametrisation of evolving dark energy, called the $w_0w_a$CDM model. This cosmology is still described by a smooth dark energy and General Relativity (GR), but with a dark energy equation of state given by [17,18]
$$w(a) = w_0 + w_a(1 - a), \tag{5}$$
where $a$ is the scale factor and $\{w_0, w_a\}$ are new phenomenological parameters, with $w_0 = -1$, $w_a = 0$ corresponding to the cosmological constant.
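For orientation, the short Python sketch below shows how such an equation of state changes the background expansion, assuming the commonly used linear-in-scale-factor form $w(a) = w_0 + w_a(1 - a)$. The function names are hypothetical, and the expansion rate assumes a flat universe containing only matter and dark energy (radiation neglected), which is a simplification made for this example.

```python
import math

def w_de(a, w0=-1.0, wa=0.0):
    """Dark energy equation of state, linear in the scale factor a."""
    return w0 + wa * (1.0 - a)

def de_density_ratio(a, w0=-1.0, wa=0.0):
    """rho_DE(a) / rho_DE(a=1), from integrating d ln(rho) / d ln(a) = -3 (1 + w)."""
    return a ** (-3.0 * (1.0 + w0 + wa)) * math.exp(-3.0 * wa * (1.0 - a))

def hubble_ratio(a, omega_m, w0=-1.0, wa=0.0):
    """H(a) / H0 for a flat universe with matter and dark energy only (assumption)."""
    matter = omega_m * a ** -3
    dark_energy = (1.0 - omega_m) * de_density_ratio(a, w0, wa)
    return math.sqrt(matter + dark_energy)

# Compare a cosmological constant with an evolving model at a = 0.5 (z = 1).
h_lcdm = hubble_ratio(0.5, omega_m=0.3)
h_w0wa = hubble_ratio(0.5, omega_m=0.3, w0=-0.9, wa=0.2)
```

With $w_0 = -1$ and $w_a = 0$ the dark energy density is constant, so the ΛCDM expansion history is recovered.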
For the theories of modified gravity, we considered Hu-Sawicki $f(R)$ gravity [19] and the normal branch of DGP braneworld gravity which acts as an additional smooth dark energy component [20]. The strength of the deviation from GR in these theories is quantified by the parameters $f_{R0}$ and $\Omega_{rc}$ respectively. In moving to an extended cosmology, all three of the ingredients outlined in the previous section in principle need updating: the linear variance, the non-linear variance of the log-density, and the mapping between initial and final densities.
Updating the linear variance is straightforward: it simply requires integrating the modified linear power spectrum, which can be obtained using the ratio of linear growth factors. We do this using a novel technique for emulating the response of the ΛCDM power spectrum to MG/DE effects, as described in [21]. The non-linear variance can either be measured from a set of simulations in the extended cosmology, or be well approximated by a phenomenological log-normal rescaling (equation 14 from [22]) applied to a reference cosmology.
The mapping from initial to final densities is the trickiest ingredient for extended cosmologies, especially in the case of scale-dependent modified gravity. However, in [15] we show that when restricted to mildly non-linear scales ($R \gtrsim 10$ Mpc/$h$) we can use the spherical collapse mapping for an Einstein-de Sitter cosmology, modified just by the difference in linear growth factors between the ΛCDM and extended cosmology.

Simulations and model validation
In [15] we validated the predicted matter PDFs against a suite of modified gravity and dark energy simulations, and found them to be accurate to 2% over the range of densities used in the final analysis. All PDFs and the analysis presented here are based on the LDT model, calculated using pyLDT, a modularised and user-friendly Python code that takes advantage of the PyJulia interface for computationally intensive tasks. Figure 2 shows the matter PDF in $10\,h^{-1}$ Mpc spheres for the three different theories of gravity considered: ΛCDM, $f(R)$, and DGP. Generically, introducing modified gravity will change both the width and the shape of the PDF (cf. Figure 3 of [15]). Since $\sigma_8$ sets the width of the PDF, normalising the cosmologies to have the same $\sigma_8$ at redshift 0 allows us to isolate the distinct features of modified gravity on the PDF, as done in Figure 2.

Matter PDFs in extended cosmologies
By normalising the width of the PDF at redshift 0, we see a residual difference in the shape of the matter PDF, as well as a distinct redshift dependence. These differences in shape and redshift dependence (as well as a difference in scale dependence not shown in Figure 2) are what allow the PDF to break degeneracies between standard cosmological parameters and modified gravity. The effect of an evolving dark energy on the matter PDF is similar to that of DGP gravity, entering mostly by modifying the expansion history.

Forecasting constraining power with the Fisher formalism
To forecast errors on a set of cosmological parameters, θ, we make use of the Fisher matrix formalism. This formalism provides constraints on θ under the assumption that the likelihood is approximated by a multivariate Gaussian distribution.
Given a set of summary statistics arranged in a data vector, $S$ with components $S_\alpha$, and the covariance matrix between those summary statistics, $C_d$, the components of the Fisher matrix, $F$, are defined as (assuming that the data covariance matrix is independent of cosmological parameters)
$$F_{ij} = \sum_{\alpha,\beta} \frac{\partial S_\alpha}{\partial\theta_i} \left(C_d^{-1}\right)_{\alpha\beta} \frac{\partial S_\beta}{\partial\theta_j}. \tag{6}$$
Assuming the likelihood of our observed statistics given our cosmological parameters is well approximated by a multivariate Gaussian, the parameter covariance matrix $C_p(\theta)$ (encoding the expected parameter constraints along with parameter degeneracies) is given by the inverse of the Fisher matrix. To obtain constraints marginalised over a subset of parameters, one can simply select the appropriate elements of the parameter covariance matrix. In particular, this implies that the fully marginalised constraint on a single parameter, $\theta_i$, is given by
$$\delta\theta_i \geq \sqrt{\left(F^{-1}\right)_{ii}}. \tag{7}$$
For this analysis we used three different data vectors: the PDF alone, the matter power spectrum alone, and a stacked data vector which combines both the PDF and the matter power spectrum. We combined information from three redshifts, $z = 0, 0.5, 1$. The data covariance was measured from the fiducial ΛCDM runs of the Quijote suite of simulations [16] and subsequently rescaled to correspond to a Euclid-like survey volume of $20\,({\rm Gpc}/h)^3$. For the matter PDF we combined three sizes of spheres (10, 15, 20 Mpc/$h$), while for the matter power spectrum we included information up to $k_{\rm max} = 0.2\,h$/Mpc.
Figures 3 and 4 illustrate how the matter PDF depends on standard parameters related to structure formation, $\Omega_m$ and $\sigma_8$, as well as on the parameters extending ΛCDM, namely $|f_{R0}|$, $w_0$, and $w_a$ ($\Omega_{rc}$ for DGP gravity is omitted here to avoid cluttering the plots). These derivatives show more directly how well degeneracies between these parameters can be broken, quantifying the heuristic understanding gained from Figure 2, and they enter the Fisher constraints directly via equation 6.
Notice that the effect of $\Omega_m$ on the matter PDF can easily be disentangled from the other parameters in both the DE and MG case, as the matter PDF is sensitive to $\Omega_m$ only through its skewness and the linear growth factor, $D(z)$ [22].
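The Fisher formalism of this section can be sketched in a few lines of Python: the Fisher matrix is built from the derivatives of the data vector and the inverse data covariance, and marginalised errors follow from the diagonal of its inverse. All numbers below are hypothetical toy values, not the derivatives or covariances used in this work. Two further simplifications are assumptions of the example: the covariance is taken to scale inversely with volume, and the stacked data vector ignores any cross-covariance between the two probes.

```python
import numpy as np

def fisher_matrix(derivs, data_cov):
    """F_ij = sum_ab dS_a/dtheta_i (C_d^-1)_ab dS_b/dtheta_j.

    derivs has shape (n_params, n_data); data_cov has shape (n_data, n_data).
    """
    return derivs @ np.linalg.solve(data_cov, derivs.T)

def marginalised_errors(fisher):
    """Fully marginalised 1-sigma errors: sqrt of the diagonal of F^-1."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

def rescale_covariance(cov_sim, v_sim, v_survey):
    """Rescale a simulation covariance to a survey volume (assumes cov ~ 1/V)."""
    return cov_sim * (v_sim / v_survey)

# Toy derivatives for two parameters (rows) in two probes (hypothetical values).
derivs_pdf = np.array([[1.0, 0.8, 0.2], [0.9, 0.9, 0.1]])
derivs_pk = np.array([[1.0, 0.5], [-0.5, 1.0]])
cov_pdf = 0.01 * np.eye(3)
cov_pk = 0.01 * np.eye(2)

# Stacked data vector with a block-diagonal covariance (no cross-covariance).
derivs_all = np.hstack([derivs_pdf, derivs_pk])
cov_all = np.zeros((5, 5))
cov_all[:3, :3] = cov_pdf
cov_all[3:, 3:] = cov_pk

err_pdf = marginalised_errors(fisher_matrix(derivs_pdf, cov_pdf))
err_pk = marginalised_errors(fisher_matrix(derivs_pk, cov_pk))
err_all = marginalised_errors(fisher_matrix(derivs_all, cov_all))
```

In this toy setup the two rows of `derivs_pdf` are nearly parallel, so the PDF alone leaves the two parameters strongly degenerate, and stacking it with a probe whose derivatives point in a different direction tightens both marginalised errors.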

Response of the PDF to changes in cosmological parameters
In $f(R)$ gravity, we can expect the most degeneracy breaking, and therefore better constraints on $f_{R0}$ than on $\Omega_{rc}$ or $\{w_0, w_a\}$, for two main reasons which can be seen in Figure 3. The first is that the $f_{R0}$ derivative has a different shape from the $\sigma_8$ derivative, showing up as an additional skewness owing to the scale-dependent fifth force. This, combined with the fact that the $f_{R0}$ derivatives are non-zero at $z = 0$ (unlike in DGP and $w_0w_a$CDM), allows more information to be extracted from the non-linear regime. While DGP gravity is not shown in Figure 3, its effect is very similar to that of dark energy shown in Figure 4 (as can be seen in Figure 11 from [15]). Figure 4 shows that at fixed scale and redshift, the response of the PDF to $w_0$ or $w_a$ is very similar in shape to the response to $\sigma_8$. For this reason, in the cases of dark energy and DGP gravity, the degeneracy is mainly broken by a difference in the redshift (and scale) dependence.
Figure 4. Derivatives of the matter PDF in an evolving dark energy universe, plotted as $\Delta P(\rho)/\sigma(P(\rho))$ at 10 Mpc/$h$. The dependence of the matter PDF on $\Omega_m$ is easily distinguished from the others by its distinct skewness (see Figure 3) and hence not shown here. The $\sigma_8$, $w_0$, and $w_a$ derivatives are similar in shape, but have different redshift evolutions, which allows for degeneracy breaking.
Figures 5 and 6 show the Fisher forecast constraints for $f(R)$ gravity with $|f_{R0}| = 10^{-6}$ and for $w_0w_a$CDM about a ΛCDM fiducial (forecasts for DGP gravity can be found in [15]). In both the modified gravity and the dark energy extended cosmologies, the matter PDF alone provides constraints competitive with the matter power spectrum. More importantly, however, the two probes provide complementary information, as demonstrated by their different degeneracy directions. This indicates that the matter PDF is recovering independent non-Gaussian information beyond the power spectrum.

Fisher forecasts for modified gravity detection and dark energy constraints
Combining the PDF and the power spectrum allows for a 5σ detection of both $f(R)$ and DGP gravity (see Table 1), and at least doubles the constraining power on other parameters such as $\sigma_8$ compared to the power spectrum alone. For evolving dark energy, the improvement is quantified by the dark energy Figure of Merit (FoM), equal to the inverse area of the contour in the $w_0$-$w_a$ plane. Adding PDF information to the power spectrum increases the FoM by a factor of 5 (summarised in Table 2). The resulting FoM is in the range expected to be reached by Euclid in combining galaxy clustering and weak lensing [23].
Probe                                | f(R) detection | DGP detection
PDF, 3 scales + prior                | 5.15σ          | 1.17σ
P(k), k_max = 0.2 h/Mpc + prior      | 2.01σ          | 2.42σ
PDF + P(k) + prior                   | 13.40σ         | 5.19σ
Table 1: Detection significance for a fiducial $f(R)$ model with $|f_{R0}| = 10^{-6}$ and a DGP model with $\Omega_{rc} = 0.0625$. The stronger $f(R)$ constraints are expected from the additional skewness in the PDF response to $|f_{R0}|$ as seen in Figure 3.
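As a hedged illustration of the two summary numbers used in this section, the Python sketch below computes a Figure of Merit and a detection significance from a parameter covariance matrix. Both definitions are assumptions made for the example: the FoM is taken as the inverse square root of the determinant of the marginalised two-parameter covariance (which is proportional to the inverse area of the error ellipse), and the detection significance is taken as the fiducial parameter value divided by its marginalised 1σ error. The exact conventions used for Tables 1 and 2 are not specified here, and the numbers are hypothetical.

```python
import numpy as np

def figure_of_merit(param_cov, i, j):
    """1 / sqrt(det) of the marginalised 2x2 covariance of parameters i and j."""
    sub_cov = param_cov[np.ix_([i, j], [i, j])]
    return 1.0 / np.sqrt(np.linalg.det(sub_cov))

def detection_significance(fiducial_value, sigma):
    """Number of sigma by which a non-zero fiducial value differs from zero."""
    return abs(fiducial_value) / sigma

# Hypothetical parameter covariance for (sigma_8, w_0, w_a), in that order.
param_cov = np.array([
    [0.0004, 0.0010, -0.0030],
    [0.0010, 0.0100, -0.0350],
    [-0.0030, -0.0350, 0.1600],
])

fom_w0wa = figure_of_merit(param_cov, 1, 2)
n_sigma = detection_significance(1e-6, 2e-7)  # hypothetical error on |f_R0|
```

Selecting the sub-block of the parameter covariance, rather than of the Fisher matrix, is what makes the result marginalised over the remaining parameters.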

Conclusions
Standard two-point statistics are not sufficient to make full use of the information content in the cosmic large-scale structure, and would leave large amounts of data from current and upcoming galaxy surveys under-utilised. The full shape of the matter density PDF in spheres has been shown to provide great complementarity to the standard two-point statistics, and allows extraction of information from the non-linear regime. The analytic framework described here has been successfully applied to ΛCDM universes along with extensions including primordial non-Gaussianity [24] and massive neutrinos [22]. This work demonstrates that the LDT formalism continues to work in modified gravity and dark energy scenarios, providing a powerful non-Gaussian probe of fundamental physics complementary to two-point statistics.
Figure 5. Forecast constraints on $f(R)$ gravity with $|f_{R0}| = 10^{-6}$ using a Euclid-like volume, $V_{\rm tot} = 20\,({\rm Gpc}/h)^3$, combining $z = 0, 0.5, 1$. Legend: PDF ($R = 10, 15, 20$ Mpc/$h$) + priors; $P(k)$ ($k_{\rm max} = 0.2\,h$/Mpc) + priors; PDF + $P(k)$ + priors. These are marginalised over all other ΛCDM parameters, and include a prior on $\Omega_b$ and $n_s$ described in [15].
Table 2: Constraints from mildly non-linear scales on $\sigma_8$, $w_0$, and $w_a$ as well as the dark energy Figure of Merit (FoM) coming from the matter PDF, power spectrum, and their combination.
While the analysis presented here is idealised in that it relies on knowledge of the true matter distribution, it is encouraging for realistic scenarios. In the case of ΛCDM cosmologies, the LDT approach has been translated into several observable quantities, including weak lensing [25][26][27], galaxy clustering [28,29], and density-split statistics [30,31]. Given the theoretical information content in the matter PDF demonstrated here, extending the LDT framework to observables in the context of modified gravity would be a worthwhile endeavour for constraining both astrophysical (e.g. baryonic feedback, intrinsic alignment, galaxy bias) and cosmological parameters to complement two-point statistics.
Figure 6. Forecast constraints on $w_0w_a$CDM dark energy using a Euclid-like volume. These are marginalised over all other ΛCDM parameters, and include a prior on $\Omega_b$ and $n_s$ described in [15].
Data Availability Statement: Our code to compute the matter PDF predictions is publicly available at https://github.com/mcataneo/pyLDT-cosmo. The matter PDFs measured from the Quijote simulations are publicly available at https://quijote-simulations.readthedocs.io/en/latest/. The matter PDF measurements for the dark energy cosmologies are publicly available at https://astro.kias.re.kr/jhshin/. Availability of the $f(R)$ simulation can be found in [15].
Acknowledgments: AG is supported by an EPSRC studentship under Project 2441314 from UK Research & Innovation. The figures in this work were created with MATPLOTLIB [32] and CHAINCONSUMER [33], making use of the NUMPY [34] and SCIPY [35] Python libraries.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: