1. Introduction
Underpinning theories of random functions is Kolmogorov’s theorem for the existence of stochastic processes. Given a (e.g., Polish) domain
for real valued functions
f, we depart from the collection of projections
(where
S is any finite subset of
) and, for every
S, we provide a probability distribution
for the following projection:
Consistency among projections dictates the necessary condition where implies that is marginal to . Kolmogorov’s theorem says that this consistency condition is also sufficient: if the chosen are consistent, there exists a probability distribution for a random function f such that the projected random are distributed according to for all finite .
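The consistency condition can be illustrated numerically for Gaussian finite-dimensional distributions. The sketch below is only an illustration under stated assumptions: the squared-exponential `kernel` and the point sets are hypothetical choices and are not taken from the text. It checks that the law assigned to a smaller finite set agrees with the marginal of the law assigned to a larger one.

```python
import numpy as np

def kernel(s, t):
    """Squared-exponential covariance; a hypothetical choice for illustration."""
    return np.exp(-0.5 * (s - t) ** 2)

def finite_dim_cov(points):
    """Covariance matrix of the centred Gaussian law assigned to a finite set of points."""
    pts = np.asarray(points, dtype=float)
    return kernel(pts[:, None], pts[None, :])

S_big = [0.0, 0.3, 0.7, 1.5]        # a finite subset of the domain (assumed values)
keep = [0, 2]                       # indices selecting the smaller subset
S_small = [S_big[i] for i in keep]

cov_big = finite_dim_cov(S_big)
# Marginalising a centred Gaussian vector onto some coordinates keeps the
# corresponding rows and columns of its covariance matrix.
cov_marginal = cov_big[np.ix_(keep, keep)]
cov_small = finite_dim_cov(S_small)

consistent = np.allclose(cov_marginal, cov_small)
```

Because the covariance is evaluated pointwise, consistency holds here by construction; the example only makes the marginalization step explicit.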
Kolmogorov’s theorem has advantages and disadvantages: On the one hand, mere consistency is enough without further conditions, making existence remarkably easy to prove. This enhances applicability greatly, as does the simplicity of the approach: properties of derive from those of the (user-defined) , and calculations for random are feasible because they take place in finite-dimensional probability spaces. On the other hand, the limiting is a Borel measure for the product topology on , which dissociates from some interesting other (e.g., metrizable) topologies, while the indistinct nature of forces the imposition of extra conditions on the choices of the to induce properties like (-almost-sure) continuity, differentiability, or integrability.
Probability measures on spaces of measures are less common but are also of interest in various parts of science. For example, in non-parametric statistics [1], particularly of the Bayesian type [2,3], probability distributions on spaces of probability measures play a central role; machine learning shares much with statistics, and random (probability) measures also feature prominently there.
In those disciplines, the most straightforward approach is often to define random measures by the mapping of random functions: it is commonplace, for example, to normalize an integrable positive random function in order to define a random probability density function. But it remains attractive to think about constructions of random measures that take place directly on the space of measures in its full generality. In this sense, based on the Poisson-type family of completely random measures [4,5], a well-developed theory of point processes (e.g., the family of Dirichlet random probability measures) exists (for an overview, see [6,7]) and is applied widely. Further examples exist, but a comprehensive mathematical theory for the construction and study of probability measures on measure spaces is lacking to date.
Ideally, such a theory of random measures would be based on an existence theorem like Kolmogorov’s theorem for random functions. In this paper, we formulate several new existence theorems of this kind. For a directed set
of finite, measurable partitions
of
, we choose distributions
and define, for all restrictions of
to
,
These projections
are called
random histograms in what follows, and their consistency follows from the additivity of
: if
refines
, then, for any
,
when
. Under which conditions does such a system of random histograms have a (unique)
histogram limit ? (By which we mean a probability measure
on the space of measures with
-restrictions that match the
.)
This question is, of course, not new: both Bayesian non-parametric statistics and stochastic analysis have formulated a wide variety of conditions for existence, more or less independently. First explorations of the subject in stochastic analysis date back to the studies of [8,9]. The authors of these studies formulated the classical Bochner–Kolmogorov conditions for the existence of a random distribution function on
. Other approaches based on inner regularity are considered in [10,11,12] and discussed comprehensively in [13,14]. Definitions of measure-theoretic inverse limits are presented in [15,16,17,18]. Limits of random histogram systems in the Bayesian non-parametric literature were first discussed in [19], which introduced the Pólya-tree family of histogram systems. Many further developments were based on Kingman’s completely random measures, most prominently in the form of the Dirichlet process [20,21]. For overviews of these and further developments on non-parametric Bayesian priors, see [2,22]. Regarding existence, most noteworthy is [23], which formulated the so-called Mean-measure condition for the existence of a limit for a system of random probability histograms
: Orbanz requires that there exists a Borel probability measure
G on
with histogram projections
that match histogram expectations:
for all partitions
and all
.
In this paper we prove and apply several existence theorems for limits of random histogram systems and analyze the variety of ways in which the corresponding random measures manifest in theory and examples. After introductory remarks and a discussion of the Bourbaki–Prokhorov–Schwartz theorem in Section 2, we consider spaces of probability measures
with the tight, weak or total-variational topology, and we derive conditions that guarantee the existence and uniqueness of a limiting Radon probability measure
in those cases in Section 3 and Section 4. As it turns out, the manifestations of their respective random probability measures are quite different: a limit
that is Radon for the weak or total-variational topology is supported by the subset of
of measures dominated by
G, while a
that is Radon for the tight topology is supported by the subset of
of measures with support contained in that of
G. Combined with the Poisson-like manifestation of completely random measures, in Section 5, we distinguish four phases for random probability histogram limits: absolutely-continuous, fixed-atomic, continuous-singular and random-atomic. The results are applied to known examples like the Dirichlet (in Section 6) and Pólya-tree (in Section 7) families, and we re-derive and sharpen some of the existing results.
In Section 8, we consider spaces of signed measures with the vague, tight and weak topologies and derive conditions that guarantee the existence and uniqueness of Radon histogram limits
. The generalization to signed measures accommodates a new family of Gaussian probability distributions, defined as limits of random histogram systems of the form,
where
N denotes a multivariate normal distribution with (suitably defined) expectation
and covariance
. All four phases found for probability histogram limits are also realized in this setting, so Gaussian histogram limits exist with the same wide range of diffuse and point-like manifestations. We argue that Gaussian histogram limits based on Green’s functions for the harmonic operator generalize the well-known two-dimensional Gaussian free field [24] to higher dimensions, suggesting a potential role in four-dimensional Euclidean quantum field theory.
To conclude, we emphasize the constructive nature of the existence theorems provided: random histogram systems not only define but also approximate random measures. The approximative property has two large advantages, one computational and one analytic. First, histogram systems consist of finite-dimensional probability distributions, which we can simulate. The Dirichlet process, for example, derives much of its immense popularity from its ease of numerical implementation and use, and this considerable advantage extends to all histogram methods. The second advantage lies in mathematical accessibility. The analyses of example histogram limits in Section 6, Section 7 and Section 8 are possible only because calculations with finite-dimensional random histograms are feasible, and limits of the results correspond to properties of the infinite-dimensional histogram limits.
6. Existence and Phases of Dirichlet Histogram Limits
The best-known family of histogram limits is the Dirichlet family; its definition is based most conveniently on the observation that if are independent and distributed according to Gamma distributions , then is distributed according to . (Below, we use the convention that is a single atom of mass one located at zero.)
Definition 8. Let ν be a non-zero, bounded, positive Borel measure on a Polish space and define, for every Borel-measurable partition α,where . The histogram distributions on are those of the normalized positive random elements , where , () and . Together, the distributions are coherent and form the Dirichlet histogram system with base measure ν. It is clear that the Gamma process, defined by the positive random vectors , is completely random and that Dirichlet histogram systems are normalized completely random. Limits of Dirichlet histogram systems therefore describe random probability measures in one of the two atomic phases.
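The construction in Definition 8 can be simulated directly, which also makes the coherence under coarsening visible. The following is a minimal sketch, assuming NumPy and hypothetical base-measure masses `fine_masses`; the helper names are illustrative and not taken from the text. Independent Gamma variables with shapes given by the cell masses are normalized, and cells of zero mass receive the atom at zero, following the stated convention.

```python
import numpy as np

def sample_dirichlet_histogram(masses, rng, size=1):
    """Draw normalised Gamma vectors for a partition with base-measure masses nu(A_i).

    Cells with zero mass get the atom at zero, matching the convention that
    a Gamma distribution with shape zero is a point mass at zero.
    """
    masses = np.asarray(masses, dtype=float)
    gammas = np.zeros((size, masses.size))
    positive = masses > 0
    gammas[:, positive] = rng.gamma(shape=masses[positive], scale=1.0,
                                    size=(size, int(positive.sum())))
    return gammas / gammas.sum(axis=1, keepdims=True)

def coarsen(hist, groups):
    """Add up fine cells into coarse cells; groups[j] lists the fine indices in coarse cell j."""
    return np.stack([hist[:, g].sum(axis=1) for g in groups], axis=1)

rng = np.random.default_rng(0)
fine_masses = [0.5, 1.5, 0.0, 2.0]      # hypothetical values nu(A_1), ..., nu(A_4)
groups = [[0, 1], [2, 3]]               # coarser partition merging cells pairwise

fine = sample_dirichlet_histogram(fine_masses, rng, size=20000)
coarse = coarsen(fine, groups)
```

With these assumed masses the coarse histogram should behave like a Dirichlet vector with parameters (2, 2), so each component has mean 1/2 and variance 1/20, which can be compared with the sample moments of `coarse`.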
A second immediate observation is that coherence of the histogram system could have been guaranteed based on parametrization in terms of a finitely additive base measure
. The well-known Mean-measure condition [23] requires
to be countably additive to guarantee the existence of a unique histogram limit with respect to the tight topology on
. We come back to the Mean-measure condition below.
6.1. Tight Limits of Dirichlet Histogram Systems
The following theorem is the (by now classical, see [2]) existence result for Dirichlet histogram limits, with a new proof in terms of condition (16).
Theorem 6. Let be a Polish space, endow with the tight topology and let ν be a non-zero, bounded, positive Borel measure on . There exists a unique Radon probability measure on projecting to the Dirichlet histogram distributions (24), describing a random probability measure in the random-atomic phase.
Proof. Let be a countable basis for and let be a refining sequence of partitions, generated by , that resolves . By assumption, there exist distributions for the random histograms , (). As noted, the coherence of the inverse system follows from finite additivity of the measure .
To prove condition (16), let
be given. According to Proposition 8,
defines a bounded positive Borel measure on
and, according to Proposition 7,
is Polish, so
is a Radon measure on
. Hence, there exists a compact
in
such that,
Let
be given. By Markov’s inequality and the fact that under
,
for any
, we have,
by Markov’s inequality, the fact that the
are proportional to
and the fact that
. Conclude that there exists a unique histogram limit
, a Radon probability measure on
with the tight topology. Because the histogram system is normalized completely random, the limiting random element
P is in the random-atomic phase. □
To conclude, two remarks are in order: Firstly, coming back to the Mean-measure condition, it is noted that the above proof relies on being not just finitely, but countably additive, to imply the Radon property. Secondly, we note that restriction to with partitions generated by the basis may be confusing since the most common definition of the Dirichlet histogram system involves all Borel-measurable partitions, . We argue that this distinction expresses the difference between the roles that plays in Theorem 6 and Proposition 2: to define , we are restricted to directed sets of a special form, while, after proving existence, we may use histograms associated with all .
6.2. Weak Limits of Dirichlet Histogram Systems
Whether
is a Radon measure with respect to the weak topology as well depends on the base measure
. To make a preliminary assessment, note that, given
and
, for any
,
Based on (
25), we see that for every
,
Now, let
be given. Due to the bound (
26), for any
and any
, Markov’s inequality gives,
for all
. Since
, as
refines (unless
is finite), this shows that the most obvious upper bound to imply uniform integrability does not lead to a useful argument. However, we show the following.
Theorem 7. Let , and satisfy the minimal conditions and consider with the weak topology. Let ν be a non-zero, bounded, positive, purely atomic measure on . Then, there exists a unique Radon probability measure on with the weak topology, projecting to for all . In that case, describes a normalized completely random measure in the fixed-atomic phase.
Proof. First, consider a countable set D with the discrete topology (which is a Polish space), with a bounded, positive Borel measure on D. According to Theorem 6, the Dirichlet histogram system with base measure has a Radon histogram limit on with the tight topology. Since any bounded is continuous, the tight and weak topologies are equal. Therefore is also Radon with respect to the weak topology on by default.
Now, let
be Polish and let
D denote the set
. Let
denote the set of all finite partitions of
D and let
denote the set of all finite, Borel-measurable partitions of
. Define
to contain all partitions
that combine a partition
from
and a partition
from
, to partition the whole space
. Note that
resolves
, and
is directed and co-final in
. For any
, the Dirichlet histogram distribution
is such that for the (
-measurable) subset
,
So,
with
-probability one. The projections of
onto
give rise to a Dirichlet histogram system with base measure
, the restriction of
to subsets of
D. As argued above, the limit
is a Radon probability measure on
with the weak topology. The space
is weak-to-weak homeomorphic to the weakly closed subspace
M of all
such that
, through the mapping
,
for all
. Conclude that the histogram system based on partitions in
has a histogram limit
that is Radon on
with the weak topology. □
7. Existence and Phases of Pólya-Tree Histogram Limits
Here, we give only a very brief introduction to Pólya-tree distributions; for much more, see [19,35,36,37] and the overviews in [2,27].
The Pólya-tree distribution is defined through a sequence of refining partitions of a Polish space (usually or the interval ), where, in each step, every set in the preceding partition is split into two subsets. To describe the resulting tree of refinements, we define the following: For every , we denote by the set of all binary sequences of length m (and we denote the empty binary sequence formally as , forming the only element of the set denoted ). We also define the set of all finite binary sequences (including the empty one). For any two binary sequences , , we write for the concatenation in . In particular, for any , () in appends a zero (one) to . Also note that for all . We write out as and use the notation for the projections onto the first binary digits. We also define for any with , with the last digit flipped: .
We use
to organize a refining sequence
of partitions,
,
,
, etc., into a
dyadic tree, defining
and, for all
,
Mostly, we shall look at refinement through intersection with basis sets and their complements, i.e., for every , either or equals for some element U in a basis for . Note that in the case of a countable basis , iterative application of the above construction gives rise to a countable that resolves .
Example 4. A typical example is a dyadic tree of partitions of (or ), constructed by iteratively bisecting every interval at the mid-point. This leads to a sequence of refining partitions , , consisting of intervals of the forms where and , , which is generated by a basis and which resolves . (In case , we add to every partition the singleton ).
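The bisection scheme of Example 4 can be made concrete by mapping binary sequences to interval endpoints. The sketch below is an illustration under assumptions: it works on the unit interval, takes bit 0 to select the left half and bit 1 the right half, and returns only endpoints, so it does not fix which endpoint of each half-open interval is included. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def dyadic_interval(eps):
    """Endpoints of the dyadic subinterval of the unit interval labelled by the bits eps."""
    left, width = Fraction(0), Fraction(1)
    for bit in eps:
        width /= 2                 # bisect the current interval at its mid-point
        if bit == 1:
            left += width          # bit 1 selects the right half, bit 0 the left half
    return left, left + width

def dyadic_partition(m):
    """All level-m intervals, ordered from left to right."""
    return [dyadic_interval(eps) for eps in product((0, 1), repeat=m)]

level2 = dyadic_partition(2)
```

Exact rational arithmetic is used so that endpoints of neighbouring intervals can be compared without rounding error.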
To arrive at random histogram distributions for the Pólya-tree, we define for every a so-called splitting variable (and ), taking values in such that
- (i.)
for any such that , is independent of ;
- (ii.)
for every , there exist such that has a distribution.
(In case , we assign a separately chosen, fixed probability to with -probability one for all . As a default, we choose .)
Remark 2. Here and below, we extend the usual family of Beta-distributions somewhat: we consider and and define for all , for all and .
The splitting variables
are interpreted as random fractions that determine how much of the probability mass of
goes to
and how much remains for
, in accordance with (27):
Consequently, for every
,
, the random probability for
can be written as a product of independent fractions:
which fixes the histogram probability measures
on
for all
,
By construction, the
are such that the refinement and coarsening of partitions (corresponding to relations of type (
1)) are accommodated coherently.
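The product-of-fractions construction lends itself to simulation level by level. The following is a minimal sketch, assuming NumPy; the function name, the simplification that Beta parameters depend only on the level m, and the particular choice a(m) = (m, m) are all assumptions made for illustration. Each parent cell passes a Beta-distributed fraction of its mass to the child with a zero appended and the remainder to the child with a one appended.

```python
import numpy as np

def sample_polya_tree_levels(depth, a, rng):
    """Return [p_0, ..., p_depth], where p_m holds the 2**m random cell probabilities.

    a(m) gives the Beta parameters (a0, a1) of the splitting variables that divide
    level-(m-1) cells into level-m cells; depending only on m is a simplification.
    """
    levels = [np.ones(1)]
    for m in range(1, depth + 1):
        parent = levels[-1]
        a0, a1 = a(m)
        v = rng.beta(a0, a1, size=parent.size)   # one independent fraction per parent cell
        child = np.empty(2 * parent.size)
        child[0::2] = parent * v                 # mass passed to the child with bit 0 appended
        child[1::2] = parent * (1.0 - v)         # remaining mass goes to the child with bit 1
        levels.append(child)
    return levels

rng = np.random.default_rng(1)
levels = sample_polya_tree_levels(depth=5, a=lambda m: (float(m), float(m)), rng=rng)
```

Summing adjacent pairs of a level recovers the preceding level exactly, which is the coherence under coarsening noted above.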
For later reference, we note the first two moments of the random variables
: for every
and every
, the mean measure equals,
by independence of the variables
and expectations of the
-distributions. Expressed in terms of the parameters
, the second moment of
takes the form,
based on the independence of the
, the variances of the corresponding
-distributions and Equation (29).
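Since the cell probabilities are products of independent Beta fractions, their first two moments are products of Beta moments. The sketch below uses only the standard moments of a Beta(a0, a1) variable; the function names and the way parameters are listed along a path are assumptions for illustration and need not match the parametrization in the lost displays.

```python
from math import prod

def beta_moments(a0, a1):
    """First and second moment of a Beta(a0, a1) variable."""
    mean = a0 / (a0 + a1)
    second = a0 * (a0 + 1.0) / ((a0 + a1) * (a0 + a1 + 1.0))
    return mean, second

def cell_moments(params):
    """Moments of a product of independent Beta fractions.

    params lists the (a0, a1) pairs of the fractions along the path to the cell,
    each oriented so that a0 belongs to the branch that is actually taken.
    """
    moments = [beta_moments(a0, a1) for a0, a1 in params]
    return prod(m for m, _ in moments), prod(s for _, s in moments)

mean, second = cell_moments([(1.0, 1.0), (1.0, 1.0)])   # two uniform splits
```

For two uniform splits the mean is 1/4 and the second moment is (1/3)(1/3) = 1/9.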
To have a sub-class of relatively simple examples, we define so-called homogeneous Pólya-tree systems.
Definition 9. Let denote a dyadic tree of partitions of (or ), as in Example 4. A Pólya-tree system is called homogeneous if we choose for all and set for all .
Accordingly, in a homogeneous Pólya-tree system, splitting variables are distributed symmetrically around 1/2, and the mean measure G for any homogeneous Pólya-tree system with a limit is Lebesgue measure.
7.1. Tight Limits of Pólya-Tree Histogram Systems
First, the general case of the Pólya-tree histogram system is analyzed with Corollary 2: here, the particulars of the partition play a role in the formulation of the condition, so we have to be specific regarding and its partitioning. In this subsection, we specify that (or ), with a dyadic tree of partitions. We use the following notation: for all , and .
Theorem 8. Let and let be the dyadic tree of Example 4. Let be a coherent inverse system of Pólya-tree measures (with parameter ) on the inverse system . Then, there exists a unique probability measure on that is Radon with respect to the tight topology and projects to the Pólya-tree histograms parametrized by , if and only if, for every , and the resulting random element P of is in the continuous-singular phase.
Proof. Given that
, the partition
consists of
intervals of the forms
, where
and
,
, which is generated by a basis for the standard topology on
. The well-ordered set of partitions
resolves
. For the given
, we consider
. Let also
be given. If
,
with
-probability one and any compact
satisfies property (
17). Assuming that
, we write
for certain fixed
, like above, and consider the sequence of half-open intervals
in
, defined by,
Assuming that (
30) holds, choose
large enough such that,
Note that for all
,
, while, for any
,
. Defining
K to be the closure of
, by Markov’s inequality, we have,
which shows that property (
17) holds.
Conversely, suppose that there exists a
, such that,
Then,
while the sequence
decreases to
⌀. Hence, the mean measures
do not define a measure (on the ring that is formed by the union of all
,
), which precludes the existence of a Borel probability measure
on
with the tight topology (if
would exist,
would define a Borel mean measure). □
Remark 3. The above applies to examples with as well, but, in that case, we have to require, in addition to (30), that, because aside from the open, left-sided boundaries of half-open intervals , there are directions towards where mass can ‘leak away’ in the limit.
Example 5. It is well known [36] that a Pólya-tree histogram system with defining parameters that satisfy, for all coincides with a Dirichlet histogram system (not on all of , but on a smaller set of dyadic partitions that resolves , generated by a basis). Accordingly, such Dirichlet–Pólya-tree histogram systems have limits that are Radon probability measures on with the tight topology, and the resulting random element P of is in the random-atomic phase.
In the example below, we make a choice for the parameters that gives rise to a coherent histogram system without a tight limit. This choice is not singular by construction in the sense that parameters either grow very large or vanish in the limit: for all , we have . To introduce the example, we define the following function on .
Definition 10. In the standard construction of Cantor space as a subspace of by successive deletions of open mid-sections of intervals, we define the Cantor mid-point function x that parametrizes the set of all mid-points of deleted intervals in terms of finite binary sequences: maps to the midpoint of the interval that is deleted in the m-th transition in the construction of the set : for example, in , , in , , , , in , etc.
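The mid-points of the deleted intervals in the middle-thirds construction can be computed exactly. The sketch below is an illustration under assumptions: the indexing convention (the empty sequence labels the first deleted interval, bit 0 the left retained third and bit 1 the right one) is a guess at the convention of Definition 10, and the function name is hypothetical.

```python
from fractions import Fraction

def cantor_midpoint(eps):
    """Mid-point of the open middle third removed from the Cantor interval labelled by eps.

    eps is a tuple of bits; bit 0 selects the left retained third, bit 1 the right one.
    """
    left = sum(Fraction(2 * bit, 3 ** (i + 1)) for i, bit in enumerate(eps))
    length = Fraction(1, 3 ** len(eps))      # length of the retained interval
    return left + length / 2

first = cantor_midpoint(())                          # mid-point of (1/3, 2/3)
second = [cantor_midpoint((b,)) for b in (0, 1)]     # mid-points of (1/9, 2/9) and (7/9, 8/9)
```

Under this convention the first deletion has mid-point 1/2 and the second stage has mid-points 1/6 and 5/6.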
Example 6. Take with a dyadic tree of partitions as defined in Example 2, and, for all , , Note that It is noted that and Similarly, Since for all , Conclude that, which implies that the Pólya-tree random histograms defined in (33) form a coherent system that does not lead to a limiting probability measure on with the tight topology.
7.2. Weak Limits of Pólya-Tree Histogram Systems
Second, we formulate a sufficient condition for the parameters
such that the corresponding Pólya-tree histogram system has a limit
that is a Radon probability measure on
with the weak topology. Based on this condition, it is demonstrated that homogeneous Pólya-tree systems with
give rise to such weak histogram limits. This rate of growth is lower than that required in the sufficient condition of [19], which is elaborated upon in [21,35,36] and re-visited in [2].
Theorem 9. Let be a second countable metrizable space with countable basis , with a corresponding dyadic tree of partitions , , generated by the basis. Let be a coherent inverse system of Pólya-tree measures (with parameter ) on the inverse system . Assume also that condition (16) holds. Then, there exists a unique Radon probability measure Π on with the weak topology, projecting to for all if, The resulting random element P of is in the absolutely-continuous phase.
Proof. Condition (16) implies the existence of a tightly-Borel probability measure
on
and a corresponding mean measure
, which serves as our choice of
Q in the proof for property (
12). Let
be given. For any
and every
, Markov’s inequality gives,
for all
, where
K denotes the value of the supremum in Condition (
34). Consequently, condition (
12) is satisfied and Theorem 2 asserts that there exists a unique weakly-Radon probability measure
on
that projects to
for all
. □
Corollary 3. Assume the conditions of Theorem 9 and let a sequence , () be given. If the grows like m or faster, , there exists a unique Radon probability measure Π on with the weak topology, projecting to the associated homogeneous Pólya-tree histogram system.
Proof. Substituting
in Condition (
34), we find, for every
,
which behaves like
in the limit
. Since
by assumption, the right-hand side stays bounded and Property (
34) is satisfied. □
Note that the sufficient condition of [19] (see also [2]) suggests that the absolute continuity of homogeneous Pólya-tree limits sets in when
grows as
or faster; here, it is shown that absolute continuity is already obtained with
that grows more slowly, like
, or faster.
8. Existence and Phases of Gaussian Histogram Limits
Most known examples of random histogram systems with a limit are of the (normalized) completely random type [7]. The reason for the preference for systems with independent components is coherence, cf. (1) or (9), which is analyzed most conveniently with infinite divisibility, requiring independence between summands. The consequence is that most known histogram limits are in one of the atomic phases of Theorem 5. In this section we introduce the family of Gaussian random measures, random
signed measures on the space
with components that display dependence generically, manifesting in one of the non-atomic phases of Theorem 5.
8.1. Random Histogram Limits with Signed Measures
To arrive at a proof of existence for Gaussian histogram limits, we have to generalize the approaches of Section 3.2 and Section 4.1. Consider the case of a locally compact Polish space
. The most natural generalization of our random histogram question calls for construction of Radon probability measures on
, the space of all signed (and potentially unbounded) Radon measures on
, with the vague topology (see [38], Ch. III, § 1, No. 9), rather than
with the tight topology as in
Section 4. However, to make histogram projections continuous, transition to a zero-dimensional refinement
(as in
Section 4.1.1) is still a necessary step, which does not combine well with the vague topology (test functions
for the vague topology on
remain continuous when viewed as
, but the compactness of their supports in
is lost in general).
However, since with the vague topology is the inverse limit of the spaces for compact , we may also limit attention to compact subsets initially and then use Theorem 1 (with a directed set of compact labeling a coherent inverse system of histogram limits ) to define a limiting Radon probability measure on with the vague topology.
We defer proof of existence for such a ‘vague inverse limit of histogram limits on compacta’ to future work and focus here on the case where
itself is a compact Polish space. Then,
and the vague and tight topologies coincide. Although
is not compact in general,
with the tight topology still stands in continuous bijective correspondence with
(as in (the first part of) Proposition 9), and the histogram projections
of the form (
15) are continuous. This enables the use of Theorem 1 to prove the existence of histogram limits
that are Radon probability measures on
with the vague/tight topology.
Theorem 10. Let be a compact Polish space and let be a directed set of partitions that resolves , generated by a basis that gives rise to a zero-dimensional . Consider with the tight topology and a coherent random histogram system . If,
- (i.)
for every , there is a constant such that for all , - (ii.)
and, for every there is a compact such that for all ,
then there exists a unique Radon probability distribution Π on projecting to for all .
Proof. According to Proposition 8, the Borel sets on and , as well as (bounded) signed Borel set functions and measures, are the same. The first part of Proposition 9 remains true, (the set-theoretic identity mapping is a tight-to-tight continuous bijection), but the second part fails because and are not necessarily Polish spaces (see Remark 4). Like before, the mappings form a coherent and separating family of continuous mappings on . Trivially, the linear spaces are finite-dimensional normed spaces (the vague, tight and total-variational topology are all equivalent) and the mappings are surjective and continuous, like in Proposition 3. Accordingly, forms an inverse system.
To show that condition (
10) holds, let
be given and let
be a constant such that property (
35) is satisfied for every
. Define,
since
, cf. Proposition 1.
Let
be a compact subset of
. For any
,
with
as in the proof of Theorem 3 and the corresponding compacta
in
cf. property (
36), we also define,
Following steps analogous to those in the proof of Theorem 3, one then finds that
satisfies Prokhorov’s conditions (see (A2)), so that the closure
forms a compact subset of
and, for any
, we have (by monotonicity of
for any
, like in the proofs of Theorems 2 and 3),
which shows that condition (
10) of Theorem 1 is satisfied. Conclude that there exists a unique Radon probability measure
on
that projects to
for all
. The continuous mapping
serves to define
, a Radon probability measure on
, and
still projects to
for all
. □
As it stands, property (36) is somewhat unwieldy due to the occurrence of compacta in
. Analogous to Corollary 2, we also provide a version that refers only to compacta in
. For brevity’s sake, we omit the proof (which follows precisely the same steps as the proof of Corollary 2): if, for all
, all
and all
, there is a
, compact in
, such that,
for all
such that
, then property (
36) is satisfied.
Remark 4. Regarding the pair of properties (35) and (36), we remark that, unlike earlier applications of Theorem 1, our conditions are sufficient but (perhaps) not necessary for the existence of a histogram limit: note that is not necessarily complete and not a Polish space generically (see [38], Ch. III, § 1, No. 9, Proposition 14), so that Borel measurability of the inverse mapping can no longer be guaranteed. Accordingly, not every Radon probability measure on can be extended to a Borel probability measure on canonically, and there may exist coherent histogram systems with a tight inverse limit Π on , for which the stated conditions do not hold.
8.2. Existence of Tight Gaussian Histogram Limits
For the definition of Gaussian histogram systems, location and covariance parameters are defined in a way comparable to that of the base measure of the Dirichlet family.
Definition 11. Let be a compact Polish space and let λ be a signed Radon measure on . For any Borel-measurable partition α, let denote the α-histogram projection of λ, in . Let Σ be a signed, symmetric Radon measure on (symmetric meaning that for all ). Assume that for every , the -matrix , with entries, () is positive semi-definite. We refer to λ as the center measure, Σ as the covariance measure and as Gaussian parameters.
The measure may be viewed equivalently as a linear mapping that takes continuous functions on into signed Radon measures on , or as a symmetric bi-linear form. To give examples, we may turn to the theory of reproducing kernel Hilbert spaces.
Example 7. For , let be a compact subset of , and consider a so-called positive-definite symmetric kernel function ; cf. the Moore–Aronszajn theorem. Every such kernel function is the reproducing kernel for a unique Hilbert space of functions on . We use k to define, and note that such a is a covariance measure in the sense of Definition 11. Mercer’s theorem formulates the associated spectral theory, with viewed as a (compact, self-adjoint, positive) integral operator . Indeed, we may define a kernel by choice of a countable orthonormal subset of continuous functions , and non-negative , to define .
To extend the previous example for general covariance measures
, note that if
consists of partitions generated by a basis for
, then, for any continuous
,
Polarization then defines a positive semi-definite bilinear form on
. Within
, there is a linear space
of functions
f that are
-almost-surely equal to zero (
if
;
if
.), and the quotient space
is a real pre-Hilbert space (see [
39] (Definition 7.5)), with Hilbert space completion denoted as
, which generalizes the Moore–Aronszajn Hilbert spaces associated with reproducing kernel functions.
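The Mercer-type construction of a kernel from orthonormal functions and non-negative weights, as described in Example 7, can be sketched numerically. The code below is an illustration under assumptions: it works on the unit interval, uses the cosine system 1, √2 cos(nπx) as the orthonormal family, and takes a short hypothetical list of weights; none of these choices comes from the text. It checks that the resulting Gram matrix on a grid is symmetric and positive semi-definite, as a covariance requires.

```python
import numpy as np

def mercer_kernel(x, y, weights):
    """k(x, y) = sum_i w_i f_i(x) f_i(y) with a cosine system on [0, 1]."""
    def features(t):
        t = np.asarray(t, dtype=float)
        cols = [np.ones_like(t)]
        cols += [np.sqrt(2.0) * np.cos(n * np.pi * t) for n in range(1, len(weights))]
        return np.stack(cols, axis=-1)           # shape (len(t), len(weights))
    return (features(x) * np.asarray(weights)) @ features(y).T

grid = np.linspace(0.0, 1.0, 25)
weights = [1.0, 0.5, 0.25, 0.125]                # hypothetical non-negative eigenvalues
gram = mercer_kernel(grid, grid, weights)
eigenvalues = np.linalg.eigvalsh(gram)
```

The Gram matrix has rank at most the number of weights, so all but a few eigenvalues are numerically zero and none is significantly negative.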
Definition 12. Given Gaussian parameters , we define a Gaussian histogram system by choosing, for all , normal probability distributions for random signed histograms , as follows: where denotes the multivariate normal distribution on with expectation and covariance matrix . When , we speak of a centered Gaussian histogram distribution, denoted as .
For partitions
, where
refines
, let
be as in (
3), the mapping that expresses finite additivity. Below, we show that for any
and any Gaussian parameters
, the above histogram distributions define a coherent system, referred to as the
Gaussian histogram system associated with the parameters
.
The inclusion of a center measure
does not affect the existence of Gaussian histogram limits: for all
and all Borel sets
B in
,
and, hence,
exists if and only if
exists. The existence of histogram limits therefore only concerns the
-parameter. The existence conditions of Theorem 10 can be dominated by uniform bounds on mixed second moments of the absolute histogram components
, which we denote by,
where
is a constant that depends only on the correlation coefficient between
and
(see [40], p. 933).
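For a centred bivariate normal pair, the mixed absolute moment has a standard closed form in which the dependence on the correlation coefficient enters through a single factor. The sketch below evaluates that form; whether its normalization matches the constant in the display above cannot be recovered from the text, so the identification is an assumption, and the function name is hypothetical.

```python
import math

def abs_product_moment(sigma_x, sigma_y, rho):
    """E|XY| for a centred bivariate normal pair with correlation rho."""
    c = (2.0 / math.pi) * (math.sqrt(1.0 - rho * rho) + rho * math.asin(rho))
    return sigma_x * sigma_y * c

independent = abs_product_moment(1.0, 1.0, 0.0)   # equals (E|X|)^2 = 2/pi
identical = abs_product_moment(1.0, 1.0, 1.0)     # equals E[X^2] = 1
```

The two boundary cases serve as sanity checks: independence gives the square of the mean absolute value of a standard normal, and perfect correlation gives the variance.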
Corollary 4. Let be a compact Polish space and let be a directed set of partitions generated by a basis, as in Example 1. Let Σ be a covariance measure on . Consider with the tight topology and the centered Gaussian histogram system . If the covariance measure Σ is such that,
- (i.)
- (ii.)
and, for any open that decrease to ⌀
,
then there exists a unique Radon probability distribution on projecting to for all .
Proof. To use Theorem 10, we first verify the coherence of Gaussian histogram systems. (We do this first step of the proof generically, that is, with
.) If
,
, and we write
, then,
for
. This can be expressed in terms of a linear mapping
such that,
(where
denotes the matrix transpose of
). Recall that for any finite
, any linear
and a random variable
distributed
, the random variable
is distributed
. So
has the same distribution as
for
,
for all
. This verifies the coherence of the histogram system
.
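The coherence argument above can be checked in a small finite setting. The sketch below is illustrative only and rests on assumptions not taken from the text: the base space is the finite set {0, ..., 5}, the covariance measure is induced by a positive semi-definite matrix `K` through Σ(A × B) = Σ_{x∈A, y∈B} K[x][y], and the histogram covariance matrix of a partition is assumed to have entries Σ(A_i × A_j). Under these assumptions, pushing the covariance matrix of a finer partition through the 0/1 merging matrix should reproduce the covariance matrix of the coarser partition.

```python
def covariance_matrix(K, partition):
    """Entries Sigma(A_i x A_j) for the cells A_i of a partition."""
    return [[sum(K[x][y] for x in A for y in B) for B in partition]
            for A in partition]

def merge_matrix(coarse, fine):
    """0/1 matrix M with M[i][j] = 1 when fine cell j lies in coarse cell i."""
    return [[1 if set(b) <= set(a) else 0 for b in fine] for a in coarse]

def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(row) for row in zip(*P)]

# Hypothetical p.s.d. matrix: K[x][y] = min(x + 1, y + 1).
K = [[min(x, y) + 1 for y in range(6)] for x in range(6)]
fine = [(0,), (1, 2), (3,), (4, 5)]
coarse = [(0, 1, 2), (3, 4, 5)]

M = merge_matrix(coarse, fine)
# Covariance of the merged (coarser) histogram: M Sigma_fine M^T.
pushed_forward = matmul(matmul(M, covariance_matrix(K, fine)), transpose(M))
# Covariance assigned directly to the coarser partition.
direct = covariance_matrix(K, coarse)
coherent = pushed_forward == direct
```

Because all entries are integers, the comparison is exact; the identity itself is just bilinearity of the sum defining Σ.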
In the rest of the proof, we assume that the histogram system is centered:
. To show that property (
35) holds, we use Chebyshev’s inequality to upper-bound its left-hand side, for every
and
:
Assuming Condition (
40) and choosing
M large enough, we see that property (
35) is satisfied for every
and all
.
Let
and
be given. Because
is generated by a basis,
A is the intersection
of an (open) finite intersection
U of basis sets and a (closed) finite intersection
C of complements of basis sets. Because
is a Polish space,
U is
, i.e.,
U is equal to a countable union of closed sets
. Then, the closed sets
increase to
A as
. In the open complements of
in
A, there exists a decreasing sequence of basis elements
, and we may define closed sets
. The open sets
decrease to
⌀ and the sets
are closed as subsets of
and therefore compact. By assumption (see Example 1),
is such that all sets in the basis occur as elements of
for some
. So, for every
and every
, there exists a
such that the decomposition
is such that
.
Let
be given. Based on Assumption (
41), choose
large enough such that,
Let
be given. Since
,
showing that property (
37) holds. □
In the above proof, we satisfy property (
37) by roughly following the proof of Theorem 8, but with a different construction of the compacta
, which is more generic and based on Example 1. In applications, control over the choice of
allows for convenient constructions. Where the proof of Theorem 8 depends on the Carathéodory-like condition that
for (specific) sequences
in the generating ring that decrease to ⌀, here, the second-absolute-moment set functions of Condition (
41) are required to go to zero.
Based on Condition (
40), we briefly come back to the space
and indicate how it is related to the covariance structure of a centered Gaussian histogram limit
. To this end, we define the real-valued stochastic integrals
for
and consider the linear space of real-valued random variables
that they span. Assuming integrability, on
L, we define the bilinear form,
With , the quotient space with as an inner product is a real pre-Hilbert space, with Hilbert space completion denoted by . The following proposition involves histogram approximations of continuous functions: for any and any partition generated by a basis, let , be real numbers such that and let , noting that, for all , as refines within an that resolves .
Proposition 12. Let be a compact Polish space and let be a directed set of partitions generated by a basis, as in Example 1. Let Σ be a covariance measure for which Conditions (40) and (41) hold. Then, for all , and the mapping extends to an isometric isomorphism of Hilbert spaces .
Proof. We first show that the (semi-definite) inner-product spaces
and
L are isometrically isomorphic. Let
be continuous and denote their supremum norms by
. By the continuity of
,
and the last step holds by Lebesgue’s dominated convergence, based on the facts that
and
. Using the definition of
, we may then write,
To complete the argument, note that,
and the right-hand side is monotone-increasing in
. By monotone convergence,
so that Condition (
40) asserts the integrability of the function
. So, using again the continuity of
and dominated convergence,
This implies that for any
,
if and only if
. Consequently, the pre-Hilbert spaces
and
are in isometrically isomorphic correspondence, and so are their (unique) Hilbert space completions. □
To conclude this subsection, we briefly consider two statistical perspectives: the frequentist question of estimation of the covariance measure and the Bayesian question of using Gaussian histogram limits to define priors on spaces of probability measures.
Remark 5. Consider the statistical estimation of the covariance measure Σ from observed data. Assume independent and identically distributed observations , each distributed marginally according to the Gaussian histogram limit for some fixed covariance measure Σ. The idealized, direct question of how to estimate Σ from the observed sample is difficult because the data is of a functional nature: in any practical way, observing points in a space of measures amounts to the observation of some approximation or projection. In the present context, we interpret the data points as histograms: for some sample size and some , we observe independent and identically distributed . The question of estimating is then the textbook question of estimating the covariance matrix based on an independent and identically distributed sample from a multivariate normal distribution. This is a smooth parametric estimation problem, with best-regular estimators displaying -convergence, optimal asymptotic covariance and optimal Wald-type confidence sets.
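As a rough illustration of the textbook estimation problem described above, the sketch below simulates centered bivariate normal histograms and estimates their covariance matrix by the average of outer products (the maximum-likelihood estimator when the mean is known to be zero). The factor `L`, the sample size and the function names are hypothetical choices made for the example, not quantities from the text.

```python
import random

def sample_centered_normal(L, rng):
    """Draw x = L z with z standard normal, so that Cov(x) = L L^T."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(L))]

def estimate_covariance(samples):
    """Average of outer products x x^T (the mean is known to be zero)."""
    n, d = len(samples), len(samples[0])
    return [[sum(x[i] * x[j] for x in samples) / n for j in range(d)]
            for i in range(d)]

# Hypothetical lower-triangular factor; true covariance is
# [[1.0, 0.5], [0.5, 1.25]].
L = [[1.0, 0.0], [0.5, 1.0]]
rng = random.Random(0)
samples = [sample_centered_normal(L, rng) for _ in range(20000)]
estimate = estimate_covariance(samples)
```

With a fixed partition, the estimation error shrinks at the usual parametric rate as the sample size grows; the sketch does not address how to refine the partition with the sample size.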
If the functional data is expressed through any
(that is, if the statistician can choose the partition before seeing the data), it is important to note that, like the approximation of probability densities in Lemma 1, it is possible to approximate Σ
by histograms in total variation: (as ). In such a setting, we may refine partitions as the sample size n increases, ideally in such a way that the estimation error, (as ) and the histogram approximation error are of comparable order.
Remark 6. To illustrate the statistical possibilities also from the Bayesian perspective, consider the following family of non-parametric priors for measure spaces: since Gaussian histogram limits describe random signed
measures, there is no direct Bayesian interpretation for a Gaussian histogram limit as a prior on a statistical model. However, a Gaussian random measure Φ
can be conditioned to be positive, and if,the conditioned random element can be normalized to a random probability measure, analogous to the (Bayesian) normalization of positive completely random measures (e.g., the Gamma process of Example 8). The resulting class of normalized conditionally positive Gaussian priors
enables the novel option of describing continuous-singular random probability measures with histograms, rather than only random discrete probability measures (e.g., the Dirichlet random measures of Section 6).
8.3. Existence of Weak Gaussian Histogram Limits
In
Section 8.1, we saw that for the existence of tight histogram limits, condition (
35) (which is close to necessary, cf. Remark 4) says that the limiting random
has a norm
that is a tight real-valued random variable. We shall see in the present subsection that for the existence of weak Gaussian histogram limits, it is sufficient that
is an integrable random variable.
Let
be a compact Hausdorff space, fix the topology on
to be the weak topology and choose a directed set
of finite partitions in non-empty Borel sets. Then,
,
and
satisfy the minimal conditions of Definition 1. The compactness of a subset of
is still characterized by the Dunford–Pettis–Grothendieck condition, but with
P replaced by the positive measure
: a subset
H of
is relatively compact in the weak topology if and only if for some positive, bounded measure
,
as
. The proof of the existence theorem for histogram limits that are Radon with respect to the weak topology does not differ substantially from that of Theorem 2, so we omit it.
Theorem 11. Let be a compact Hausdorff space, consider with the weak topology and choose a directed set of finite partitions in non-empty Borel sets that resolves . Let be a coherent system of Borel probability measures on the inverse system . There exists a unique weakly-Radon probability measure Π
on projecting to for all , if and only if, there is a such that for every , there is a such that, for all .
Given a Radon probability measure
on
, the role of the mean measure
G of Definition 5 as the dominating measure is taken over by the positive measure,
for
, and
implies
(as in Lemma 3). Proposition 4 can be adapted to the signed case as well:
is closed in
and,
Below, we apply Theorem 11 to Gaussian histogram systems. To prepare, it is noted that for all
and all
,
Clearly, the resulting set functions are not the -restrictions of the measure Q (contrary to the cases of positive or probability histogram limits, where all are restrictions of the mean measure G). For every and all , , and, hence, . If we assume that resolves , the positive measures (and the total-variational norms ) increase to (and the total-variational norm ).
Corollary 5. Let be a compact Hausdorff space and let be a directed set of partitions generated by a basis that resolves . Let Σ be a covariance measure on . Consider with the weak topology and the centered Gaussian histogram system . If, then there exists a unique weakly Radon probability distribution on projecting to for all .
Proof. Let
be given and choose
, where
denotes any upper-bound for the left-hand side of Condition (
44). Due to the bound (
26) and Markov’s inequality, we have for any
,
where we use the conditions that for every
and all
,
and (
43). □
Two remarks are in order: first, we relate Condition (
44) for the existence of weak Gaussian limits to Condition (
40) for the existence of tight Gaussian limits by the Cauchy–Schwarz inequality:
showing that (
44) implies (
40). Second, we note that Corollary 1 stays valid in the signed case, so if Condition (
44) holds, there exists a unique Radon probability measure
on
with the total-variational topology, projecting to
for all
.
Example 8. Let be a compact subset of . Consider Example 7 with a kernel function that is constant: for some . If we let consist of partitions α with the property that for all , there exists a translation vector x in such that the Lebesgue measure of is zero, then, for every α, is the -matrix with all entries equal to, (where for any ). Clearly, the corresponding covariance measure Σ
gives rise to positive semi-definite covariance matrices , with random histogram components that are highly dependent: in fact, the linear space of all with components that sum to zero forms the kernel of :and a centered multivariate normal distribution is supported on the range of its covariance matrix. This means that lies on the diagonal of with probability one:Note thatso, according to Corollary 5, there exists a weakly-Radon probability measure on projecting to for all . Additionally, a moment’s thought shows that the above example serves as a bound for a host of examples based on the reproducing kernels of Example 7.
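The degenerate structure of Example 8 can be made concrete in a few lines. The sketch below assumes, for illustration only, a constant kernel value `c` and a partition into `n_cells` cells of equal Lebesgue measure `cell_measure`, so that every entry of the covariance matrix equals `c * cell_measure ** 2`; these names and the numerical values are hypothetical. It checks that vectors whose components sum to zero lie in the kernel of the matrix, and that a sample built from a single standard normal draw has all components equal.

```python
import math
import random

def constant_kernel_covariance(c, cell_measure, n_cells):
    """Covariance matrix with every entry equal to c * cell_measure**2."""
    value = c * cell_measure ** 2
    return [[value] * n_cells for _ in range(n_cells)]

def apply(S, v):
    """Matrix-vector product S v."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]

def sample_histogram(c, cell_measure, n_cells, rng):
    """One draw from the centered normal law with the matrix above.

    The matrix has rank one, so the draw is a single standard normal
    variable scaled by sqrt(c) * cell_measure and copied to every cell.
    """
    z = rng.gauss(0.0, 1.0)
    return [math.sqrt(c) * cell_measure * z] * n_cells

S = constant_kernel_covariance(c=2.0, cell_measure=0.25, n_cells=4)
v = [1.0, -3.0, 0.5, 1.5]   # components sum to zero
image = apply(S, v)         # should be the zero vector
phi = sample_histogram(2.0, 0.25, 4, random.Random(0))
```

All components of `phi` coincide, which is the statement that the random histogram lies on the diagonal with probability one.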
Corollary 6. Let be a compact Hausdorff space and let be a directed set of partitions generated by a basis that resolves . Let Σ be a covariance measure based on a bounded kernel function . Then, there exists a unique weakly-Radon probability distribution on projecting to for all .
Proof. Assume first that
is as in Example 8. Since, for some
and all
,
for all
and all
,
. Summability then follows as in Inequality (
46), proving the existence of a weakly-Radon histogram limit
. For any Borel-measurable partition
of
, the weakly continuous mapping
induces the Gaussian histogram distribution
on
. The uniqueness of the limit proves the assertion. □
Over the last two decades there has been considerable interest in the so-called Gaussian free field (see, e.g., refs. [
24,
41]); below, we use Green’s functions for the harmonic operators in
as covariance kernels to define Gaussian histogram systems in the closure
of a non-empty, bounded, open subset of
.
Example 9. We consider the existence question for the Gaussian free field first in : the Green’s function is of the form,where is harmonic in x for every y. Define,(where we choose f such that is symmetric and positive-definite). Choose a directed set of partitions generated by a basis that resolves . Based on Corollary 6, we see immediately that the associated centered Gaussian histogram system with histogram distributions has a limit that is a weakly Radon probability measure on . Then, , (), is a multiple of the Lebesgue measure due to translation invariance and,implying that the random element is of the form,for all , where ϕ is a random Radon–Nikodym density function in , the Banach space of Lebesgue integrable functions on . For , the situation changes drastically (see Figure 1 and Figure 2): Green’s functions are unbounded and display singular behavior in neighborhoods of the diagonal, namely, and for . To apply Corollary 6, we modify for small length scales to regularize the singular behavior near the diagonal (e.g., for some small , replace by , which replaces the pole for with an upper bound ). The modified are bounded kernel functions, and Corollary 6 guarantees that for every , there exists a weak histogram limit . One may hope that the limit for as (which may exist only as a tightly Radon probability measure) describes the so-called Gaussian free field in d Euclidean dimensions. In light of earlier explorations (see, for example, ref. [24] for a detailed overview of (mostly) the case), it is also possible that the limit exists only if we embed the space of Radon measures on , in spaces of distributions on . The limiting probability distribution would then describe a random generalized function of the type discussed in [12] (Ch. IX, §6, No. 10) and [42].
8.4. Completely Random Gaussian Histogram Limits
The class of (centered) Gaussian histogram limits has a non-empty intersection with the class of completely random measures, characterized by covariance measures that place all mass on the diagonal. Completely random Gaussian histogram limits exist in the fixed or random atomic phase.
Definition 13. Let Σ be a covariance measure on . We say that Σ is diagonal if for all , .
Note that with a diagonal , the set function defined by for all is a positive Radon measure on , and the Hilbert space is isometrically isomorphic to , the usual space of -square-integrable functions on .
A diagonal covariance measure leaves histogram components independent and leads to a completely random limit, which is of a fixed- or random-atomic nature also in the case of a signed completely random measure. If a Gaussian histogram system with a diagonal covariance measure has a tight limit and
distributes its mass in a uniformly asymptotically negligible way (see (Section XVII.7) of [
43]), infinite divisibility of the distribution of the random variable
is implied.
Corollary 7. Let be a compact Polish space and let be a directed set of partitions that resolves and is generated by a basis. Assume that Σ
is a diagonal covariance measure. Then, the centered Gaussian histogram system has a unique tightly-Radon probability distribution on projecting to for all . If, in addition, then the total-variational norm has a probability distribution that is infinitely divisible.
Proof. For a diagonal covariance measure
, any
and any
, the restrictions of cumulant measures
of Definition 7 to the
-algebras
are given by,
for any
. By the Carathéodory extension, all cumulants are therefore finite positive measures and, hence, [
44] (Theorem 4.4), there exists a tight completely random limit described by a marked Poisson process [
7] (Ch. 9–10) (with marks in
rather than
), of the form (
21).
With a diagonal covariance measure
, the components of
are independent random variables for every
. Therefore, the norms of the random histograms
,
are sums of independent terms in a triangular array, and uniform asymptotic negligibility, as assumed in (
47), is sufficient for tight convergence to an infinitely divisible limiting probability distribution [
43] (Section XVII.7). Since the total-variational norm
is the monotone limit of the norms
(Proposition 1), its probability distribution is tight and infinitely divisible. □
Example 10. Let , let consist of partitions in half-open intervals, generated by a basis and collectively fine enough to resolve . Consider a centered Gaussian histogram system with diagonal covariance measure Σ
defined by choosing τ equal to the Lebesgue measure. With sets of the form (and the singleton , for which we set for all ), the histogram system,describes the independent, normally distributed increments of Brownian motion started from , so the (random) Stieltjes function for the measure Φ
, is a version of the sample path of Brownian motion on . Since resolves , the Lebesgue measures of all intervals go to zero as α increases in , and Condition (47) is satisfied. (By extension, if we replace normal distributions by stable distributions in this construction appropriately, infinite divisibility preserves the coherence, and the existence of Φ
, cf. Theorem 10, implies random Stieltjes functions corresponding to right-continuous versions of Lévy sample paths on .) Weak histogram limits with diagonal covariance measures display a limitation similar to that of Theorem 7. To appreciate the problem, note that for a diagonal covariance measure with positive
dominated by the Lebesgue measure, Condition (
44) cannot be satisfied. Purely atomic measures
, however, lead to Gaussian histogram systems with weak limits.
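The Brownian construction of Example 10 lends itself to a short simulation. The sketch below is an illustration under assumptions: the partition of (0, 1] into eight half-open intervals of equal length, the seed, and the function names are hypothetical. Each cell (a, b] receives an independent centered normal mass with variance b - a, the singleton {0} carries mass zero, and the cumulative sums give the Stieltjes function at the partition points. Merging adjacent cells by adding their masses produces the coarser histogram, in line with finite additivity.

```python
import random

def brownian_histogram(breakpoints, rng):
    """Independent N(0, b - a) masses for the cells (a, b] of a partition."""
    return [rng.gauss(0.0, (b - a) ** 0.5)
            for a, b in zip(breakpoints, breakpoints[1:])]

def stieltjes_path(masses):
    """Cumulative sums of the cell masses, started from zero at t = 0."""
    path, total = [0.0], 0.0
    for m in masses:
        total += m
        path.append(total)
    return path

def coarsen(masses, group_size):
    """Merge consecutive cells by adding their masses (finite additivity)."""
    return [sum(masses[i:i + group_size])
            for i in range(0, len(masses), group_size)]

rng = random.Random(1)
fine_breaks = [i / 8 for i in range(9)]      # hypothetical partition of (0, 1]
fine_masses = brownian_histogram(fine_breaks, rng)
path = stieltjes_path(fine_masses)           # path values at the breakpoints
coarse_masses = coarsen(fine_masses, 2)      # histogram on four cells
```

Since variances of independent normal masses add, each merged mass is again centered normal with variance equal to the length of the merged interval, which is the coherence property in this special case.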
Corollary 7 and Example 8 form two extremes: in diagonal cases, Gaussian histogram limits manifest in the fixed- or random-atomic phase, while for covariance measures that spread their mass more homogeneously, dependence between histogram components introduces a degree of smoothness, a situation that we have seen in its most extreme form in (the highly non-diagonal) Example 8. Gaussian histogram limits for other covariance measures are somewhere in between: depending on the degree to which Σ-mass is located away from the diagonal, corresponding to the degree of dependence between Gaussian histogram components, the histogram limit may manifest in close-to-atomic (i.e., highly concentrated) or smooth/closer-to-constant form.
To demonstrate the explanatory value of the phase structure of Gaussian histogram limits described above, the last example, which is analyzed more comprehensively in forthcoming work, suggests the applicability of Gaussian histogram limits in (Euclidean) quantum field theory [
45].
Example 11. Let be given and consider for some constant , and Σ
diagonal with,for all . The space plays the role of d-dimensional Euclidean ‘momentum space’ (and the constant Λ that makes compact is known as the UV cutoff scale in physics). The kernel defining Σ
is interpreted as the (unregularized, Euclidean) ‘propagator’ of the massless scalar field (roughly, the Green’s function for the Laplace operator, which is represented by the convolution kernel in momentum space). The diagonal Gaussian histogram limit exists, cf. Corollary 7. We point out the following consequence of the phase of this Gaussian histogram limit. In this case, ‘quantization’, a description of the field in terms of particles, emerges as a consequence of complete randomness: the Gaussian histogram limit is in the random-atomic phase and manifests as a random sum of discrete point masses in Euclidean momentum space. Such configurations have an immediate physical interpretation, as states describing (off-shell) particles, point-like quanta of momentum. It is noted that the emergence of quantization is not a feature of (second-quantized) quantum field theory: in the physical theory of quantum fields, particles are axiomatic and introduced by hand with the formal introduction of a Fock space to describe quantum states of the field (see the classical work [45] (p. 106)).