1. Introduction
The study of estimation efficiency within the framework of information geometry has evolved significantly since the pioneering work of Rao [1] and the subsequent works of others—see [2,3,4]—and may be found in more recent papers and books such as [5,6]. The Fisher information metric, providing a canonical Riemannian structure on parametric statistical models, allows an intrinsic quantification of statistical distinguishability and the derivation of sharp risk bounds. This was developed in [7], where intrinsic versions of classical results, such as the Cramér–Rao inequality, were established under regularity conditions; in that work, we developed what can be termed an intrinsic approach to the analysis of point estimation. Given a statistical model—that is, after fixing the collection of possible stochastic mechanisms assumed to generate the observed sample—the term intrinsic refers to properties that are inherent to the estimator itself rather than to the particular parameterization used to describe the model. Intrinsic properties, in contrast to classical ones, are invariant under reparameterizations of the model, which may be interpreted as changes of coordinates in the space of probabilistic mechanisms under consideration; see also [8].
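To make this invariance tangible, the following minimal sketch (ours, not from the paper) verifies symbolically that the Fisher information of an exponential model transforms as a metric tensor under the reparameterization from the rate to the mean, so that Fisher-based quantities such as the Rao distance do not depend on the chosen coordinates.

```python
# Sketch: Fisher information transforms tensorially under reparameterization.
# Exponential model, rate parameter l versus mean parameter t = 1/l.
import sympy as sp

x = sp.symbols('x', positive=True)
l, t = sp.symbols('l t', positive=True)

def fisher_info(logpdf, param):
    """One-parameter Fisher information: expected squared score."""
    score = sp.diff(logpdf, param)
    return sp.simplify(sp.integrate(score**2 * sp.exp(logpdf), (x, 0, sp.oo)))

I_rate = fisher_info(sp.log(l) - l*x, l)      # I(l) = 1/l**2
I_mean = fisher_info(-sp.log(t) - x/t, t)     # I(t) = 1/t**2

# Tensor transformation rule: I(t) = I(l) * (dl/dt)**2, evaluated at l = 1/t.
assert sp.simplify((I_rate * sp.diff(1/t, t)**2).subs(l, 1/t) - I_mean) == 0
```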
However, whether one adopts a classical or an intrinsic perspective, the risk functions of different estimators often intersect. As a consequence, the comparison of estimators cannot, in general, be based on pointwise risk criteria alone, unless additional structural properties are imposed, such as unbiasedness or equivariance in the case of families that are invariant under the action of a group (see [9]). One natural way to address this issue is to assess the performance of an estimator—intrinsic or classical—over an entire region of the parameter space, for example, by integrating its risk with respect to the Riemannian volume measure or by considering the supremum of the risk over that region. This work extends those findings through indices that quantify estimator performance over regions of the parameter space rather than at single points. Specifically, the aim of this paper is to derive lower bounds for two global risk measures of an estimator over a subset of the parameter space under the intrinsic geometry induced by the Fisher information: the average risk and the maximum risk. In the next section, we outline the setting of the problem and recall some results on local risk bounds, which will later be applied to obtain global bounds. Related contributions can be found in [2], where the analysis is carried out in a classical unidimensional framework, and in [10], which develops a classical, non-intrinsic perspective.
Building on these foundations, the notion of global efficiency has recently attracted renewed attention, with emphasis on the behavior of estimators not only locally but across entire regions of the parameter space. This is particularly relevant given that the interplay between geometry and physics has been further enriched by applications of Fisher information to variational principles in classical and quantum mechanics; see [11,12,13,14,15].
2. The Intrinsic Analysis Framework
Let $\mathcal{X}$ be a sample space, $\mathfrak{A}$ a $\sigma$-algebra of subsets of $\mathcal{X}$, and $\mu$ a $\sigma$-finite positive measure on $(\mathcal{X}, \mathfrak{A})$. A parametric statistical model is defined as the triple $(\mathcal{X}, f, \Theta)$, where $(\mathcal{X}, \mathfrak{A}, \mu)$ is a measure space, $\Theta$ is a smooth real manifold, known as the parameter space, and $f$ is a non-negative measurable map, $f: \mathcal{X} \times \Theta \to \mathbb{R}$, such that $dP_\theta = f(\cdot, \theta)\, d\mu$ defines a probability measure on the measurable space $(\mathcal{X}, \mathfrak{A})$ for every $\theta \in \Theta$. Here, $\mu$ is referred to as the reference measure and $f$ as the model function.
For simplicity, in this paper, we shall focus on the case in which $\Theta$ is an open, connected subset of $\mathbb{R}^n$. In this setting, it is customary to use the same symbol to denote both the points of $\Theta$ and their coordinate representations. Adopting this convention, the results are presented in this familiar form hereafter, even though the statements can be formulated in greater generality.
Additionally, it will be assumed that the model function $f$ satisfies the following regularity conditions:
(i) For fixed $x$, the real function $\theta \mapsto f(x, \theta)$ is a smooth function on the manifold $\Theta$.
(ii) The score functions of $x$, $\partial \ln f(x, \theta)/\partial \theta^i$, are linearly independent and have finite moments of order $r$ for a convenient $r$.
(iii) Partial derivatives of the required orders and integration of $f$ with respect to $\mu$ can always be interchanged.
(iv) The model is identifiable: the map $\theta \mapsto P_\theta$, with $dP_\theta = f(\cdot, \theta)\, d\mu$, is one-to-one.
Within this framework, the probabilistic mechanism that generates the data under analysis can be equivalently represented by a probability measure, a density function, or a parameter, that is, by a point in the parametric manifold $\Theta$. When these conditions are satisfied, the parametric statistical model is said to be regular. Initially, $\Theta$ is regarded as a Riemannian manifold endowed with an arbitrary fundamental tensor $h$ on $\Theta$, whose components are denoted by $h_{ij}$. Nevertheless, it is well known that the parameter space admits a natural Riemannian structure induced by the probability measures, referred to as the information metric, whose fundamental tensor components $g_{ij}$ coincide with those of the Fisher information matrix. For further details, see [1,3,4,6,7], among many others.
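As a concrete illustration (a minimal sketch, not taken from the paper), the components $g_{ij}$ of the information metric for the univariate normal model can be computed symbolically, giving the familiar Fisher matrix $\operatorname{diag}(1/\sigma^2, 2/\sigma^2)$ in the coordinates $(\mu, \sigma)$:

```python
# Fisher information matrix g_ij = E[(d log f/d theta_i)(d log f/d theta_j)]
# for the univariate normal model f(x; mu, sigma).
import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)

logf = -sp.log(sigma) - sp.log(2*sp.pi)/2 - (x - mu)**2/(2*sigma**2)
pdf = sp.exp(logf)
params = (mu, sigma)

g = sp.Matrix(2, 2, lambda i, j: sp.simplify(sp.integrate(
    sp.diff(logf, params[i]) * sp.diff(logf, params[j]) * pdf,
    (x, -sp.oo, sp.oo))))

print(g)  # Matrix([[sigma**(-2), 0], [0, 2/sigma**2]])
```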
In this context, for a given sample size $k$, an estimator of the true parameter $\theta$—that is, the parameter associated with the true probabilistic mechanism generating the observed sample—is defined as a measurable map $U: \mathcal{X}^k \to \Theta$, under the assumption that the probability measure on $\mathcal{X}^k$ is the corresponding $k$-fold product measure determined by $P_\theta$.
2.1. Local Bounds
Let $h_{ij}$ denote the components of the metric tensor associated with the Riemannian metric on $\Theta$, and let $g_{ij}$ denote the components of the information metric on $\Theta$. Consider the Levi–Civita connection corresponding to this metric, and define
$$A = \exp_\theta^{-1}(U), \qquad B = E_\theta[A],$$
where $\exp_\theta^{-1}$ is the inverse of the exponential map induced by this connection (see Appendix A). Observe that $A$ encodes the deviation between the true parameter $\theta$ and its estimate $U$, quantified by the tangent vector at $\theta$ of the geodesic connecting both points, whose length equals the corresponding Riemannian distance. The term $B$ is the expectation of this deviation vector. For simplicity, it is assumed that the estimators $U$ are such that $A$ is defined almost everywhere with respect to the sampling distribution, and $B$ is a smooth vector field on $\Theta$. The existence of such a field is ensured whenever the mean square Riemannian distance exists.
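For intuition about the fields $A$ and $B$, the following small numerical sketch (ours; the unit sphere stands in for a curved parameter space) implements the exponential map and its inverse, checking that the deviation vector $A = \exp_p^{-1}(q)$ has norm equal to the geodesic distance between $p$ and $q$:

```python
# Sketch on the unit sphere S^2 (constant positive curvature), using the
# standard closed-form exp and log maps; notation is illustrative.
import numpy as np

def exp_map(p, v):
    """Riemannian exponential at p applied to tangent vector v."""
    nv = np.linalg.norm(v)
    if nv < 1e-15:
        return p
    return np.cos(nv)*p + np.sin(nv)*v/nv

def log_map(p, q):
    """Inverse exponential: tangent vector at p pointing to q (q != -p)."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(c)                    # geodesic distance d(p, q)
    if theta < 1e-15:
        return np.zeros_like(p)
    return theta*(q - c*p)/np.linalg.norm(q - c*p)

p = np.array([0.0, 0.0, 1.0])
q = np.array([1.0, 0.0, 0.0])
A = log_map(p, q)                           # deviation vector at p
print(np.linalg.norm(A), np.arccos(np.dot(p, q)))   # both equal pi/2
print(np.allclose(exp_map(p, A), q))                # True
```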
Let $T_\theta\Theta$ denote the tangent space at $\theta$. For each unit vector $\xi \in T_\theta\Theta$, define
$$c(\xi) = \sup\{\, t > 0 : d(\theta, \gamma_\xi(t)) = t \,\},$$
where $d$ denotes the Riemannian distance and $\gamma_\xi$ is a geodesic defined on an open interval containing zero, satisfying $\gamma_\xi(0) = \theta$ and $\dot\gamma_\xi(0) = \xi$, that is, the tangent vector at $\theta$ is equal to $\xi$. Define
$$\mathcal{D}_\theta = \{\, t\,\xi : \xi \in T_\theta\Theta,\ \|\xi\| = 1,\ 0 \le t < c(\xi) \,\}.$$
It is known that $\exp_\theta$ is a diffeomorphism mapping $\mathcal{D}_\theta$ onto its image (see Hicks [16]). An intrinsic extension of the Cramér–Rao bound is obtained, generalizing the formulation of [7]. The previous result relied exclusively on the information metric, whereas the current framework admits an arbitrary Riemannian metric for the quantification of estimator loss.
Theorem 1 (Riemannian Cramér–Rao lower bound). Let $U$ be an estimator based on a sample of size $k$, corresponding to an $n$-dimensional regular parametric family of density functions. Assume that the parameter manifold $\Theta$ is simply connected and that $U \in \exp_\theta(\mathcal{D}_\theta)$ almost surely, so that the estimator takes values in a normal neighborhood of $\theta$. Suppose further that the mean squared Riemannian distance, with respect to the metric $h$, between the true parameter and the estimator exists for all $\theta$, and that the covariant derivative of the bias field $B$ may be computed by differentiating under the integral sign. Then the lower bound (2) holds, where $\operatorname{div}$ represents the divergence operator. Observe that the divergence of a vector field on a Riemannian manifold is the scalar function that quantifies the net rate at which the vector field flows outward from (or inward toward) a point.
Proof. Let $C$ be any vector field. Applying the Cauchy–Schwarz inequality twice yields an upper bound for the expectation of $\langle A, C\rangle$ in terms of the mean squared norm of $A$, where $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ denote, respectively, the inner product and the norm defined on each tangent space. Let $C$ be a suitable gradient field, where $\operatorname{grad}$ denotes the gradient operator. Taking expectations and using the repeated index convention, the expectation of $\langle A, C\rangle$ can be expressed through the divergence of the bias field $B$. Combining the resulting identities, the theorem follows. □
Remark 1. We can choose a geodesic spherical coordinate system with origin $\theta$; under this coordinate system, the volume element admits a radial expression, where $g$ is the determinant of the metric tensor. Bishop's comparison theorems (see [17], pp. 71–73) can then be used to estimate the expected divergence of $A$. In the Euclidean case, this divergence equals $n$. When the sectional curvatures are non-positive, the comparison yields a value of at least $n$; finally, when the supremum of the sectional curvatures is positive and the diameter of the manifold satisfies the corresponding restriction, it yields a value of at most $n$. In any case, the estimate holds with a constant $n^*$ in place of $n$, with $n^* \ge n$ or $n^* \le n$ depending on the sign of the sectional curvature.
Corollary 1. Suppose that there is a global chart in which the metric components are those of the Euclidean metric. Identifying the points with their coordinates, the bound of Theorem 1 takes the classical form, where MSE and Bias are the ordinary mean squared error and bias under the assumed global chart, and we use the repeated index summation convention. Proof. It follows straightforwardly from the previous theorem and the facts that $d$ is the Euclidean distance, $A = U - \theta$, and $B$ is the ordinary bias vector. □
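A familiar one-dimensional instance of such bias-corrected bounds is $\mathrm{MSE}(\theta) \ge (1 + b'(\theta))^2/(k\, I(\theta)) + b(\theta)^2$. The following Monte Carlo sketch (the shrinkage estimator $c\bar{X}$ and all constants are our illustrative choices, not the paper's) checks this for the normal mean model, where the linear estimator attains the bound exactly:

```python
# Monte Carlo check of the classical biased Cramer-Rao bound
#   MSE(theta) >= (1 + b'(theta))**2 / (k*I(theta)) + b(theta)**2
# for the estimator c*mean(X) under N(theta, 1), where I(theta) = 1
# and b(theta) = (c - 1)*theta. This linear estimator attains the bound.
import numpy as np

rng = np.random.default_rng(0)
theta, k, c, reps = 1.5, 20, 0.8, 200_000

xbar = rng.normal(theta, 1.0, size=(reps, k)).mean(axis=1)
mse = np.mean((c*xbar - theta)**2)

bias, dbias = (c - 1)*theta, (c - 1)      # b(theta) and b'(theta)
bound = (1 + dbias)**2 / k + bias**2      # = c**2/k + (c-1)**2*theta**2

print(mse, bound)   # both close to 0.8**2/20 + 0.04*2.25 = 0.122
```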
Corollary 2 (Intrinsic Cramér–Rao lower bound). If the loss metric coincides with the information metric, the bound holds with $\rho$, the Rao distance, that is, the Riemannian distance induced by the information metric. In particular, if all the sectional Riemannian curvatures $K$ are bounded from above by a non-positive constant, the bound (5) holds, and if all sectional Riemannian curvatures $K$ are bounded from above by a positive constant and the diameter of the manifold is suitably bounded, the bound (6) holds. Proof. If the Riemannian metric is the Fisher metric, the induced distance is the Rao distance. To prove (5) and (6), see [7]. □
Note that the geometry of the model influences the lower bounds of the Riemannian risk. This influence is intricate, as it depends not only on the Riemannian structure of the parameter space but also on the probability distribution that the estimator induces in that space. For a given bias structure, specified by the bias vector field $B$ and its divergence, the behavior of the lower bounds of the Riemannian risk is determined by the curvature. When the curvature is negative, these bounds tend to increase as the curvature bound decreases (see (5)), while for positive curvature, they tend to decrease as the curvature bound increases, with the diameter of the manifold $\Theta$ playing a significant role in the calculation of this lower bound.
Intuitively, geodesics are the straightest possible paths in a curved space. When sectional curvatures are positive, geodesics starting from a single point initially spread out but eventually begin to converge: positive curvature pulls geodesics together. When the curvature is zero, geodesics starting from a point spread uniformly in all directions; the space is flat, and the geodesics neither attract nor repel each other. When the curvature is negative, geodesics starting from a point diverge from one another: even if they start close together, they spread apart rapidly as one moves along them, since negative curvature pushes geodesics away from each other. This behavior has consequences for the intrinsic risk. When the model has positive sectional curvatures, the risk can decrease, since the geodesics bend together and estimators might behave more similarly across nearby parameters, while when the sectional curvatures are negative, the risk can increase, since the geodesics spread apart and estimators may differ more across the space.
2.2. Global Bounds
It is well known that, for a general loss function, there is no estimator whose risk function is uniformly smaller than that of every other estimator. Consequently, given a particular estimator, it is natural to assess its performance over a specified region of the statistical model by integrating its risk function over that region and normalizing the result by the corresponding Riemannian volume. In what follows, the square of the Rao distance is adopted as the loss function and the Riemannian metric is taken to be the Fisher information metric. This setting corresponds to the intrinsic analysis framework developed in [7].
Let $W \subseteq \Theta$ be a measurable subset satisfying $0 < V(W) < \infty$, where $V$ denotes the Riemannian measure. The Riemannian average of the mean squared Rao distance is defined as
$$\frac{1}{V(W)} \int_W E_\theta\!\left[\rho^2(U, \theta)\right] dV(\theta).$$
The resulting performance index represents a weighted average of the mean squared Rao distance. This formulation is compatible with a Bayesian perspective: a uniform prior with respect to the Riemannian volume can be regarded as a noninformative prior; see [18]. Furthermore, as shown in [19], when the parameter space is a locally compact topological group, the corresponding Riemannian volume coincides, up to a multiplicative constant, with a left-invariant Haar measure. In general, this volume is invariant under any group that leaves the parametric family of densities unchanged.
In the first part of the article, lower bounds for this global index are derived on geodesic balls of radius $R$ centered at a point $\gamma$.
An alternative measure of global estimator performance is given by the maximum risk over a region of the parameter space,
$$\sup_{\theta \in W} E_\theta\!\left[\rho^2(U, \theta)\right],$$
corresponding to the minimax approach. The final part of the paper is devoted to the derivation of lower bounds for this maximum risk.
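As a toy illustration of the two indices (the model, region, and estimator family are our illustrative choices, not the paper's), consider the normal mean model $N(\theta, 1)$ with the Euclidean Fisher geometry, the region $W = [-a, a]$, and the shrinkage family $c\bar{X}$, whose risk $c^2/k + (1-c)^2\theta^2$ gives both indices in closed form:

```python
# Average and maximum risk over W = [-a, a] for the estimator c*mean(X)
# under N(theta, 1): risk(theta) = c**2/k + (1 - c)**2 * theta**2.
# Averaging theta**2 over [-a, a] gives a**2/3; its maximum is a**2.
k, a = 10, 0.5

def avg_risk(c):
    return c**2/k + (1 - c)**2 * a**2/3

def max_risk(c):
    return c**2/k + (1 - c)**2 * a**2

for c in (1.0, 0.9, 0.8):          # c = 1 is the unbiased sample mean
    print(c, avg_risk(c), max_risk(c))
# For a small enough region, some c < 1 beats the unbiased estimator
# on both the average and the maximum risk over W.
```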
3. Variational Methods to Obtain Global Bounds
The local bounds established in Corollary 2 indicate that the expected squared Rao distance between the true probabilistic mechanism generating the sample and its corresponding estimates is bounded from below by a quantity depending on the intrinsic bias structure of the estimator.
Global bounds can be obtained by using variational methods. A study in this direction was previously conducted in [2]. The approach consists in integrating the local bounds for the mean squared Rao distance, as derived above, under the assumption that the Riemannian metric coincides with the Fisher information metric, over a submanifold $W$ with boundary $\partial W$. Specifically, this integration produces a functional of the bias field involving a comparison constant $n^*$, where $n^* = n$ if the sectional curvatures are nonpositive and $n^*$ depends on the curvature bound otherwise. The functional depends solely on the vector field $B$, and the problem reduces to finding the vector field $B$ that minimizes it. Since the minimization is performed over a class of vector fields larger than that of smooth bias fields, the resulting minimum provides a lower bound for the average of the mean squared Rao distance.
Observe that the expression (2) yields a pointwise lower bound for the intrinsic risk, whose dependence on the estimator's bias is immediately evident. Allowing for a non-negligible bias may lead to an artificial reduction of the risk—whether classical or intrinsic—but only at the expense of increasing the bias itself. This trade-off would at best indicate satisfactory performance for a specific probabilistic mechanism, corresponding to a single point in the parameter space $\Theta$, while typically resulting in poor performance over a substantial region of the parameter space, primarily due to the growth of the bias. The minimization of (2) thus emerges naturally when considering this pointwise intrinsic bound. To assess the performance of an estimator over an entire region of the parameter space, it is therefore reasonable to consider estimators with various bias structures, since, in principle, biased estimators may outperform unbiased ones when evaluated over a given region. The problem may then be formulated as follows: among all estimators exhibiting a prescribed form of bias, determine the bias structure that minimizes a lower bound on the risk over the region of interest. In particular, this formulation leads to a relatively simple variational problem, considerably more tractable than the one posed directly in terms of the field $A$.
In applications, the integration region $W$ should be selected as the subset of $\Theta$ within which the true probabilistic mechanism that generates the data is expected to lie. Consequently, since this region can be chosen arbitrarily, it will be chosen so that any well-behaved estimator is expected to exhibit a low or vanishing risk on the boundary $\partial W$, and consequently a small or negligible squared norm of the bias vector, since the squared norm of the bias is dominated by the risk. Hence, $W$ is chosen so that the bias values on its boundary can be considered zero or negligible. These assumptions can reasonably be made within a broad class of statistical models. In situations where they fail to hold—typically because the true parameter value lies on the boundary of $\Theta$—the boundary conditions must be modified appropriately to reflect the specific structure of the model under consideration.
Lemma 1. The field $B$ minimizes the functional if and only if it satisfies condition (9), and the minimum value is given by (10), where $B$ satisfies (9) and $d\sigma$ denotes the element of the induced surface area on $\partial W$.

Proof. Consider the first variation of the functional at $B$ in the direction of an arbitrary smooth vector field $Y$. A direct computation yields the Gâteaux variations of the functional, which show that the functional is minimized at any point $B$ for which these variations vanish. In addition, since the integral term in (11) is strictly positive for every nonzero smooth vector field $Y$, the functional is strictly convex; see, for example, [20]. Consequently, every stationary point is necessarily a global minimizer. Using the identity (12), the stationarity condition can be rewritten and, by the Gauss divergence theorem, expressed as the vanishing of an interior integral together with a boundary integral, where $d\sigma$ denotes the Riemannian measure induced on $\partial W$ and $\nu$ is the outward unit normal vector field on $\partial W$. Equation (9) follows from the fact that the preceding equality holds for all $Y$.

For the second part of the proposition, applying condition (9) together with (12) and substituting into the functional yields the first equality in (10). From the second stationarity condition in (9), the boundary contribution vanishes, and another application of the Gauss divergence theorem gives the second equality in (10). □
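Since the proof invokes the Gauss divergence theorem twice, a quick numerical sanity check of that identity on the Euclidean unit disk may be useful (the vector field is an arbitrary illustrative choice):

```python
# Numerical check of the Gauss divergence theorem on the Euclidean unit disk:
#   integral_W div(X) dV = integral over the boundary of <X, nu> dsigma,
# with the illustrative field X(x, y) = (x**2, y), so div X = 2*x + 1.
import numpy as np
from scipy import integrate

# Volume side: integrate div X over the disk in polar coordinates.
vol, _ = integrate.dblquad(
    lambda r, t: (2*r*np.cos(t) + 1) * r,   # div X times the area element r
    0, 2*np.pi, 0, 1)

# Boundary side: the outward normal on the unit circle is (cos t, sin t).
flux, _ = integrate.quad(
    lambda t: np.cos(t)**3 + np.sin(t)**2, 0, 2*np.pi)

print(vol, flux)   # both equal pi
```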
Remark 2. The minimal value of the functional depends solely on the divergence of the optimal field, $\operatorname{div} B$. Let $\psi = \operatorname{div} B$. It follows from condition (9) that $\psi$ satisfies a boundary value problem, where $\Delta$ denotes the Laplace–Beltrami operator associated with the Riemannian metric on $W$. We have obtained an explicit solution to this problem in the case where $W$ is the geodesic ball of radius $R$, under the assumption of constant sectional curvature $K$.

Theorem 2. Let the parametric statistical model be a Riemannian manifold with constant sectional curvature $K$. Then, for geodesic balls centered at $\gamma$ of radius $R$ less than the injectivity radius at $\gamma$, and satisfying the convergence condition appearing in the proof, the average of the mean squared Rao distance satisfies the lower bound (15), with the constants and the comparison function defined in (16).

Proof. By symmetry and uniqueness, the solution of the boundary value problem in the geodesic ball depends only on the geodesic distance to its center. Using geodesic spherical coordinates with origin at the center of the geodesic ball, the Riemannian volume element factorizes radially (see Appendix A). Hence the boundary value problem reduces to an ordinary differential equation in the radial variable. After a change of variable, the equation takes a form amenable to a power series treatment. Assuming a power series expansion $\psi(r) = \sum_{m \ge 0} a_m r^{2m}$, with $a_0 \ne 0$, and substituting into the equation, we obtain a recurrence relation for the coefficients. For $m \ge 1$, this recurrence determines $a_m$ in terms of $a_{m-1}$, and consequently the full series in terms of $a_0$, which is determined from the boundary condition at $r = R$. It is straightforward to verify that this series converges under the radius condition of the statement, which holds automatically for nonnegative sectional curvature.

To compute the minimal value, note that in spherical coordinates (see Appendix A) the integral over the ball reduces to a one-dimensional radial integral involving the area of the $n$-dimensional unit radius sphere $S$. Evaluating this integral with the series solution yields the stated bound. □
Corollary 3. When the parametric statistical model is a Euclidean manifold, the lower bound (18) holds for the Riemannian average of the mean squared Rao distance on a ball centered at $\gamma$ of radius $R$ less than the injectivity radius at $\gamma$, where ${}_0F_1$ denotes the confluent hypergeometric limit function; see (A2) in Appendix A. Moreover, if the Euclidean manifold $\Theta$ is complete and simply connected, then the lower bound (19) holds globally. Proof. The result follows directly as a particular case of Equation (15) with constant sectional curvature $K = 0$. The second assertion is obtained by taking the limit $R \to \infty$ in Equation (18). □
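Our reading of the series recurrence in the proof of Theorem 2 (stated here as an assumption, since the displayed equations are omitted) is that in the Euclidean case the radial problem is $\psi'' + \frac{n-1}{r}\psi' = \lambda\psi$, whose solution regular at the origin is $\psi(r) = {}_0F_1(; n/2; \lambda r^2/4)$, consistent with the ${}_0F_1$ appearing in the corollary. A quick numerical check:

```python
# Check that psi(r) = 0F1(; n/2; lam*r**2/4) satisfies the radial equation
#   psi'' + (n - 1)/r * psi' = lam * psi
# (the Euclidean radial form we assume underlies the 0F1 in Corollary 3).
import mpmath as mp

n, lam = 3, 5.0
psi = lambda r: mp.hyp0f1(n/2, lam*r**2/4)

r = mp.mpf('0.7')
lhs = mp.diff(psi, r, 2) + (n - 1)/r * mp.diff(psi, r)
rhs = lam * psi(r)
print(lhs, rhs)   # agree to high precision
```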
Note again that the geometry of the model influences the lower bounds of the Riemannian risk—both in its local and integral forms—when these bounds are extended over a region of the parameter space, and that it does so in a non-trivial manner, a fact that merits further investigation.
Example 1. Consider the $n$-variate normal distribution with known covariance matrix $\Sigma$. For a sample of size $k$, the Riemannian risk—measured as the mean squared Rao distance—associated with the sample mean coincides with the lower bound (19) derived in the preceding corollary. Observe that in this case the parameter space is $\mathbb{R}^n$. It is well known that this model is invariant under the action of a subgroup of the affine group that leaves the sample variance–covariance matrix unchanged. In this setting, the unique equivariant estimator throughout the parameter space is the sample mean $\bar{X}$. The induced group acting on the parameter space is, in this case, transitive and commutative, implying that the estimator is intrinsically unbiased; see [21], and, in the context of intrinsic analysis, [22]. The intrinsic risk of $\bar{X}$ is $n/k$, which coincides with the intrinsic Cramér–Rao bound [7]. However, in a geodesic ball of radius $R$, an estimator may achieve an integrated intrinsic risk strictly smaller than this value, provided that estimators with appropriate bias are allowed, although this quantity tends to the unrestricted bound as $R \to \infty$, as we naturally expect. This is consistent with the existence of shrinkage estimators of the James–Stein type; see [23]. In the present example, though, such estimators cannot be equivariant under the group action. Figure 1a,b illustrate these phenomena in two representative cases. The methods presented above apply to any statistical model satisfying standard regularity conditions, independently of the particular Riemannian geometry induced by the information metric. Moreover, they remain valid regardless of the probability distribution that the estimator induces on the parameter space. The only aspect that may vary is the complexity of the resulting calculations.
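The risk value $n/k$ can be verified by simulation (a sketch; $\Sigma$, $\mu$, $n$, and $k$ are arbitrary illustrative choices): with the per-observation Fisher metric, the squared Rao distance between two mean vectors is $(\mu_1 - \mu_2)^{\top}\Sigma^{-1}(\mu_1 - \mu_2)$, and the sample mean has intrinsic risk $n/k$.

```python
# Monte Carlo check that the intrinsic risk of the sample mean of an
# n-variate normal with known covariance Sigma equals n/k, where
#   rho(mu1, mu2)**2 = (mu1 - mu2)' Sigma^{-1} (mu1 - mu2).
import numpy as np

rng = np.random.default_rng(1)
n, k, reps = 3, 25, 100_000

A = rng.normal(size=(n, n))
Sigma = A @ A.T + n*np.eye(n)     # an arbitrary positive definite covariance
Sinv = np.linalg.inv(Sigma)
mu = rng.normal(size=n)

xbar = rng.multivariate_normal(mu, Sigma/k, size=reps)   # law of the mean
diff = xbar - mu
rho2 = np.einsum('ri,ij,rj->r', diff, Sinv, diff)

print(rho2.mean(), n/k)   # both close to 0.12
```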
In the univariate case ($n = 1$), the parameter manifold is Euclidean, and the corresponding expression simplifies to a closed form which agrees with the result originally obtained by Chentsov [2].
We now examine the Euclidean case in Cartesian coordinates. Let us fix a coordinate system with origin at an arbitrary point and consider a cube centered at the origin. In this setting, the corresponding variational problem reduces to solving a Dirichlet boundary value problem on the cube. Looking for a solution in separated form, with real-valued functions each depending on a single coordinate, the problem reduces to a one-dimensional equation. A convenient particular solution is built from a function $g$ satisfying a linear second-order ordinary differential equation with constant coefficients, whose unique solution is given by a hyperbolic-cosine profile. Substituting this expression into the definition of the functional yields a lower bound which constitutes an improvement over the result obtained by Chentsov [2].
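The one-dimensional reduction indicated above is a constant-coefficient two-point problem. A symbolic sketch of its hyperbolic-cosine solution (the constants $c$, $a$, and the boundary value $g_0$ are placeholders, since the exact boundary data come from the omitted displays) is:

```python
# Illustrative solution of the 1D reduction g'' = c*g on [-a, a] with
# symmetric boundary values g(+-a) = g0 (all constants are placeholders):
#   g(x) = g0 * cosh(sqrt(c)*x) / cosh(sqrt(c)*a).
import sympy as sp

x = sp.symbols('x', real=True)
c, a, g0 = sp.symbols('c a g0', positive=True)

g = g0 * sp.cosh(sp.sqrt(c)*x) / sp.cosh(sp.sqrt(c)*a)

assert sp.simplify(sp.diff(g, x, 2) - c*g) == 0   # satisfies the ODE
assert sp.simplify(g.subs(x, a) - g0) == 0        # boundary condition at +a
assert sp.simplify(g.subs(x, -a) - g0) == 0       # boundary condition at -a
```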
Furthermore, by Corollary 1, a similar inequality can be established in the general non-Euclidean case (with a fixed coordinate system). Specifically, the averaged mean squared error satisfies an analogous bound, in which the constant involved denotes an upper bound of the corresponding metric coefficients within the region.
Analogous lower bounds can also be obtained for more general settings.
Theorem 3. Let the parametric statistical model be represented by a Riemannian manifold whose sectional curvatures are bounded from above by a constant $\lambda$. Then the average of the mean squared Rao distance satisfies a lower bound expressed in terms of the area of the boundary of an $n$-dimensional geodesic ball centered at $\gamma$ of radius $R$ less than the injectivity radius at $\gamma$, its corresponding volume, and the solution of the boundary value problem (17) on a manifold of constant sectional curvature $\lambda$.

Proof. Using geodesic spherical coordinates, let $\psi$ denote the solution to the boundary value problem (17) on the manifold with sectional curvatures bounded above by $\lambda$, and let $\psi_\lambda$ denote the solution to the same problem on a manifold of constant sectional curvature $\lambda$. By Bishop's comparison theorem, the radial volume densities of the two manifolds are comparable, which yields a differential inequality relating $\psi$ to the Laplace–Beltrami operator $\Delta_\lambda$ of the constant-curvature manifold. Since this inequality holds for all radii $r < R$, the comparison theorem for elliptic differential equations (see [24], Theorem 6, p. 243) provides a pointwise comparison between $\psi$ and $\psi_\lambda$, with equality on the boundary. Using (9) and (13), we then obtain the claimed bound. □
Remark 3. Estimates for the volumes of geodesic balls provided in the Appendix are instrumental in obtaining explicit expressions for the lower bounds derived above. In particular, if the sectional curvatures of the parametric manifold are bounded from below by $\kappa$ and from above by $\lambda$, then, according to Proposition A3, the ratio between the area and the volume of the geodesic ball of radius $R$ satisfies two-sided estimates in terms of the comparison function associated with the curvature bounds, as defined in (16).

4. Lower Bounds for the Maximum Risk
Although one may employ the Riemannian average of the risk to obtain bounds on the maximum risk, alternative minimax bounds can be derived by a more direct argument, as shown below.
Lemma 2. Let $X$ be a smooth vector field on the parameter manifold $\Theta$ satisfying a suitable normalization, let $f$ be a nonnegative smooth function on $\Theta$, and let $W \subseteq \Theta$ be a submanifold with a smooth boundary $\partial W$. Then inequality (21) holds, where $d\sigma$ denotes the element of the induced surface area on $\partial W$.

Theorem 4. Let $U$ be an estimator taking values in a submanifold $W$ with smooth boundary. Then the maximum risk of $U$ over $W$ satisfies inequality (22), and therefore the right-hand side of (22) is a lower bound for the risk of the local minimax estimator on $W$.

Proof. Integrating inequality (21) with respect to the Riemannian measure and applying Fubini's theorem yields the stated bound. □
Corollary 4. When the parameter manifold is Euclidean, given a geodesic ball centered at $\gamma$ of radius $R$ less than the injectivity radius at $\gamma$, the lower bound (23) holds for the local minimax risk. If the Euclidean manifold $\Theta$ is complete and simply connected, a corresponding global lower bound follows. Proof. Computing the boundary-area and volume terms of the Euclidean geodesic ball and substituting them into Theorem 4 yields the stated inequality. The global bound follows by taking the limit $R \to \infty$. □
The lower bounds for the local minimax risk in geodesic balls of radius $R$ in Example 1 are calculated using (23) and displayed graphically in Figure 2a,b.
Another lower bound for the integrated Riemannian risk is stated in the following theorem.
Theorem 5. The following lower bound holds for the Riemannian average of the mean squared Rao distance over the geodesic ball centered at $\gamma$ of radius $R$ less than the injectivity radius at $\gamma$: inequality (24), where the comparison constant equals $n$ when the sectional curvatures are nonpositive and takes a curvature-dependent value when the sectional curvatures are bounded above by a positive constant.

Proof. Consider inequality (22) with the mean squared Rao distance as the function and the geodesic ball as the region, and integrate with respect to the Riemannian measure on the ball. The volume of the geodesic ball, as a function of its radius, is positive and increases monotonically. Using this monotonicity, we obtain the stated bound, which completes the proof. □
Corollary 5. When the parametric statistical model is a Euclidean manifold, the Riemannian average of the mean squared Rao distance over the geodesic ball centered at $\gamma$ of radius $R$ less than the injectivity radius at $\gamma$ satisfies an explicit lower bound; if the Euclidean manifold $\Theta$ is complete and simply connected, a corresponding lower bound holds over the entire manifold.

Proof. For a Euclidean manifold of dimension $n$, the volume of the geodesic ball of radius $r$ is
$$V(r) = \frac{\pi^{n/2}}{\Gamma\!\left(\frac{n}{2} + 1\right)}\, r^n.$$
Substituting this expression into inequality (24) yields the first bound. The second assertion follows by taking the limit $R \to \infty$. □
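For reference, the ball volume used in the proof and the resulting area-to-volume ratio $A(r)/V(r) = n/r$ (the Euclidean case of the estimate in Remark 3) can be tabulated with a few lines (a standard computation, not code from the paper):

```python
# Volume V(r) = pi**(n/2) * r**n / Gamma(n/2 + 1) of the Euclidean n-ball
# and the boundary-area-to-volume ratio A(r)/V(r) = V'(r)/V(r) = n/r.
from math import pi, gamma

def ball_volume(n, r):
    return pi**(n/2) * r**n / gamma(n/2 + 1)

def area_volume_ratio(n, r):
    return n / r      # since A(r) = dV/dr = n*V(r)/r in Euclidean space

r = 2.0
for n in (1, 2, 3, 5):
    print(n, ball_volume(n, r), area_volume_ratio(n, r))
# n = 2, r = 2: V = 4*pi and A/V = 1 (circumference 4*pi over area 4*pi).
```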
5. Concluding Remarks
This article introduces two performance indices for statistical estimators restricted to a prescribed region of the parameter space. These indices are based on the intrinsic risk function developed in [7]. Specifically, for a bounded region $W$ of the parameter space, we consider (i) the integral of the intrinsic risk over $W$ with respect to the associated Riemannian volume measure, and (ii) the maximum value of the intrinsic risk attained within $W$. Both criteria are compatible with Bayesian methodologies.
In this framework, if it is known a priori that the true parameter belongs to a given region W, biased estimators can be constructed that outperform unbiased estimators in terms of either criterion. When there is strong practical evidence that the parameter lies within a sufficiently small region, allowing controlled bias may be preferable whenever it reduces the average Riemannian risk or the worst-case risk over W.
A representative example is examined in which the statistical model is the multivariate normal distribution with known covariance matrix and a simple random sample of size $k$. Numerical illustrations (Figure 1 and Figure 2) show that restricting attention to a region $W$ allows the construction of estimators with strictly smaller integrated risk or smaller maximal risk than classical unbiased estimators, at the cost of introducing a region-dependent bias term. The magnitude and behavior of these improvements depend on the underlying Riemannian geometry induced by the statistical model, suggesting several directions for further investigation.
When the model is invariant under the action of a transformation group $G$, coherence considerations require restricting attention to equivariant estimators. In this setting, the corresponding intrinsic risk must remain constant along the orbits of the induced action in the parameter space. As shown in [21], if an intrinsically unbiased estimator exists that uniformly minimizes the Riemannian risk, then this estimator must be equivariant. The converse does not hold: equivariance, together with uniform optimality, does not, in general, imply intrinsic unbiasedness. Additional structural assumptions—such as the transitivity of the $G$-action and the commutativity of the induced action on the parameter space—are required to recover this implication. In that case, the intrinsically unbiased estimator that uniformly minimizes the Riemannian risk is preferable.