Abstract
With this follow-up paper, we continue developing a mathematical framework based on information geometry for representing physical objects. The long-term goal is to lay down informational foundations for physics, especially quantum physics. We now model information sources as univariate normal probability distributions N(μ, σ²), as before, but with σ a constant not necessarily equal to 1. We also relax the independence condition when modeling m sources of information: we model m sources with a multivariate normal probability distribution with a constant variance–covariance matrix that is not necessarily diagonal, i.e., with covariance values different from 0, which leads to the concept of modes rather than sources. Invoking Schrödinger’s equation, we can still break the information into m quantum harmonic oscillators, one for each mode, with energy levels independent of the values of σ, altogether leading to the concept of “intrinsic”. Similarly to our previous work with the estimator’s variance, we found that the expectation of the quadratic Mahalanobis distance to the sample mean equals the energy levels of the quantum harmonic oscillator, with the minimum quadratic Mahalanobis distance attained at the minimum energy level of the oscillator, reaching the “intrinsic” Cramér–Rao lower bound at the lowest energy level. We also demonstrate that the global probability density function of the collective mode of a set of m quantum harmonic oscillators at the lowest energy level still equals the posterior probability distribution calculated using Bayes’ theorem from the sources of information for all data values, taking as a prior the Riemannian volume of the informative metric. While these new assumptions certainly add complexity to the mathematical framework, the results proven are invariant under transformations, leading to the concept of “intrinsic” information-theoretic models, which are essential for developing physics.
1. Introduction
In this work, we continue developing the mathematical framework introduced in [1] by implementing some variations to better account for reality. In particular, we model information sources as univariate normal probability distributions N(μ, σ²), as before, but with σ a constant not necessarily equal to 1. We also relax the independence condition when modeling m sources of information. Thus, we model m dependent sources with a multivariate normal probability distribution N(μ, Σ), with a constant variance–covariance matrix Σ not necessarily diagonal, i.e., with covariance values different from 0, which leads to the concept of modes rather than sources when finding the solutions.
As in our initial work, the mathematical approach departs from the supposition that physical objects are information-theoretic in origin, an idea that has recurrently appeared in physics. In the following mathematical developments, we discover that the approach is fundamentally “intrinsic”, which gives rise to the paper’s title and is the main feature we want to emphasize in this study. In other words, regardless of how we parametrize the modeling, the approach’s inherent properties, for example, the energy levels, remain the same, even after updating the framework with the above modifications.
This entire work builds upon this finding, which makes the framework ideal for studying the properties of information representation and developing physics; our modifications can further improve its accuracy and applicability to real-world scenarios. The long-term goal is to provide models to explain the “pre-physics” stage from which everything may emerge. By this, we refer to the initial preprocessing of source information that is performed, in principle, by our sensory systems or organs. Therefore, the research in this follow-up paper may significantly contribute to the field and potentially guide future work in the area.
2. Mathematical Framework
The plan of this section, which we divide into eleven subsections for didactic purposes, is the following. In Section 2.1, we outline modeling a single source with a single sample and the derivation of Fisher’s information and the Riemannian manifold. In Section 2.2, we describe modeling a single source with n samples. Section 2.3 is devoted to analyzing the stationary states of a single source with n samples in the Riemannian manifold. In Section 2.4, we present the solutions of the stationary states in our formalism. In Section 2.5, we compute the probability density function, the mean quadratic Mahalanobis distance, and the “intrinsic” Cramér–Rao lower bound for a single source with n samples. An extension of this approach to m independent sources is conducted in Section 2.6 to compute the global probability density function at the ground-state level. In Section 2.7, we outline modeling m-dependent sources of a single sample, Fisher’s information, and the Riemannian manifold. Section 2.8 describes m-dependent sources of n samples. In Section 2.9, we analyze the stationary states of m-dependent sources of n samples in the Riemannian manifold. Section 2.10 is devoted to finding the solutions. Finally, in Section 2.11, we use Bayes’ theorem to obtain the posterior probability density function.
2.1. A Single Source with a Single Sample: The Fisher’s Information, the Riemannian Manifold, and the Quadratic Mahalanobis Distance
We start our mathematical description by modeling a single source with a univariate normal probability distribution N(μ, σ²), where σ > 0 is a known constant. This is a well-known parametric statistical model whose unidimensional parameter space may be identified with the real line, i.e., Θ = ℝ. We can compute all the quantities relevant to our purpose. For a single sample, the univariate normal density (with respect to the Lebesgue measure), its natural logarithm, and the partial derivative with respect to the parameter μ are given by
From Equation (3), which is also called the score function, we can calculate Fisher’s information [2] for a single sample as
since, with the change of variable z = (x − μ)/σ, we have that E[z²] = 1.
The Riemannian metric [3] from a single source with a single sample derived from the Fisher’s information (4) is a metric tensor whose covariant component, contravariant component, and determinant, respectively, are
The corresponding square of the Riemannian distance induced in the parametric space is the well-known quadratic Mahalanobis distance [4], i.e.,
The quadratic Mahalanobis distance will play a critical role in the next sections.
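As a numerical sketch of this subsection (illustrative only, not taken from the paper; the function names are ours), one can check by Monte Carlo that Fisher’s information of a single N(μ, σ²) sample, E[score²], equals 1/σ² regardless of the value of μ:

```python
import numpy as np

def score(x, mu, sigma):
    """Score function: derivative of the log-density of N(mu, sigma^2) w.r.t. mu."""
    return (x - mu) / sigma**2

def fisher_information_mc(mu, sigma, n_draws=200_000, seed=0):
    """Monte Carlo estimate of Fisher's information E[score^2] for one sample."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=n_draws)
    return np.mean(score(x, mu, sigma) ** 2)

# The estimate should be close to the closed form 1/sigma^2 = 0.25 for any mu,
# illustrating that the information does not depend on the parameter itself.
I_hat = fisher_information_mc(mu=3.0, sigma=2.0)
```

Repeating the call with a different `mu` leaves the estimate unchanged, which is the one-sample version of the parametrization-independence the paper emphasizes.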
2.2. A Single Source with n Samples: The Fisher’s Information, the Riemannian Manifold, and the Square of the Riemannian Distance
If the source generates n samples, x₁, …, xₙ, drawn independently from a univariate normal probability distribution (1), the likelihood of this n-variate sample, its log-likelihood, and the scores, respectively, are
Likewise, from Equation (11), we can calculate the Fisher information corresponding to a sample of size n as
since the sample mean x̄ follows a univariate normal distribution with mean equal to μ and variance equal to σ²/n.
In other words
which shows the well-known additive property of the Fisher information for independent samples.
The Riemannian metric [3] from a single source with n samples derived from the Fisher’s information (12) is a metric tensor whose covariant component, contravariant component, and determinant, respectively, are
The square of the Riemannian distance, , induced by the Fisher information matrix corresponding to a sample of arbitrary size n will be equal to n times the quadratic Mahalanobis distance (8), i.e.,
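A short numerical check of the additivity property and of the induced squared distance may help here; both are standard facts, and the helper names below are our own illustrative choices:

```python
import numpy as np

def fisher_information_n(sigma, n):
    # Additivity for n independent samples: I_n(mu) = n / sigma^2.
    return n / sigma**2

def sq_riemannian_distance(mu1, mu2, sigma, n):
    # Squared Riemannian (Rao) distance induced by I_n: n times the
    # quadratic Mahalanobis distance (mu1 - mu2)^2 / sigma^2.
    return n * (mu1 - mu2) ** 2 / sigma**2

# Monte Carlo check of additivity via the score of the full n-sample:
rng = np.random.default_rng(1)
mu, sigma, n = 1.5, 2.0, 10
x = rng.normal(mu, sigma, size=(100_000, n))
scores = np.sum((x - mu) / sigma**2, axis=1)  # score of the n-sample likelihood
I_mc = np.mean(scores ** 2)                   # should approach n / sigma^2 = 2.5
```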
2.3. Stationary States of a Single Source of n Samples in the Riemannian Manifold
To calculate the stationary states of a single source of n samples, we can invoke the principle of minimum Fisher information [5] or use the time-independent non-relativistic Schrödinger equation [6]. The two approaches have been demonstrated to be equivalent elsewhere [5]. The equation reads as follows
where V(μ) is a potential energy term and E is the energy. The solution ψ must also vanish at infinity and have unit norm. For simplicity, we will write ψ instead of ψ(μ).
We can use the modulus square of the score function (11) as the potential energy, except for a constant term
Alternatively, we can use as the potential energy the difference between the maximum of the log-likelihood attained by the sample, , minus the log-likelihood at an arbitrary point , up to a proportionality constant. Since the likelihood is given by (9), we can rewrite it as
where x̄ denotes the sample mean. The supremum of the likelihood is obviously attained when μ = x̄; then, the previously mentioned potential will be
This expression is, up to a proportionality constant, equal to (19). Thus, we may choose it as the potential energy, and Equation (18) reads as:
We compute the Laplacian in Equation (22) as:
Inserting Equation (23) into Equation (22), we obtain:
which is Schrödinger’s equation of the quantum harmonic oscillator [7].
2.4. Solutions of a Single Quantum Harmonic Oscillator in the Riemannian Manifold
Some steps may now seem obvious to those used to quantum mechanics. Considering that ψ has the following form:
Equation (24) becomes:
Assuming a solution for with the form
Inserting this expression into Equation (26) gives
which implies that
In other words, these constants cannot be chosen arbitrarily because they have to satisfy these equations. For example, we can choose and , which forces , and . Therefore, we can write
whose solution is given by
With this configuration, we compute the normalization constant
where we used a first change of variable. Now, using a second change of variable, Equation (32) reads as
Isolating the normalization constant from Equation (33) and substituting it back, Equation (31) reads as
which is the ground-state solution of the quantum harmonic oscillator problem, i.e., the wave function for the ground state. The solutions of the quantum harmonic oscillator involve Hermite polynomials, which were introduced elsewhere [8,9]. In this way, we can prove, after some tedious but straightforward computations, that the wave function:
is also a solution of
where is the energy of the first excited state, and is the normalization constant. With this representation, the energy levels are given by
Looking closely at Equation (37), we appreciate that the energy levels depend on two numbers: the excitation number and the sample size n. The ground state has a finite energy, which can become arbitrarily close to zero through massive sampling. Notably, the energy levels are independent of the informative parameter. In other words, they do not depend on the informative parameters, leading to the concept of “intrinsic” information-theoretic models, which will be discussed in greater detail later.
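The even spacing of the oscillator levels and their independence from the location parameter can be illustrated numerically. The sketch below is ours, in illustrative units where the exact eigenvalues of −ψ″ + x²ψ are 2k + 1 (i.e., without the paper’s sample-size scaling): it diagonalizes a finite-difference Hamiltonian and shows that shifting the potential’s center leaves the spectrum untouched.

```python
import numpy as np

def oscillator_levels(center, n_levels=4, half_width=12.0, n_points=1200):
    """Lowest eigenvalues of -psi'' + (x - center)^2 psi on a large interval.

    The spectrum does not depend on `center`, mirroring the 'intrinsic'
    property that the energy levels do not depend on the location parameter.
    """
    x = np.linspace(center - half_width, center + half_width, n_points)
    h = x[1] - x[0]
    # Three-point finite-difference Laplacian plus the potential on the diagonal.
    main = 2.0 / h**2 + (x - center) ** 2
    off = -np.ones(n_points - 1) / h**2
    H = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)[:n_levels]

levels_at_0 = oscillator_levels(center=0.0)  # close to [1, 3, 5, 7]
levels_at_5 = oscillator_levels(center=5.0)  # same spectrum, shifted center
```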
2.5. Probability Density Function of a Single Source of n Samples, Mean Quadratic Mahalanobis Distance, and Intrinsic Cramér–Rao Lower Bound
Assuming that the square modulus of the wave function can be interpreted as the probability density function:
we can compute the performance of the estimates of μ. For instance, we can calculate the expectation of the quadratic Mahalanobis distance (8) to the sample mean at the ground state (34), obtaining
Likewise, we can compute the expectation of the quadratic Mahalanobis distance (8) to the sample mean at the first excited state, obtaining
The expectation of the quadratic Mahalanobis distance to the sample mean at the different states equals the quantum harmonic oscillator’s energy levels, i.e., this quantity is definitively quantized. Interestingly, the expectation of the quadratic Mahalanobis distance to the sample mean at the ground state (39) equals the intrinsic Cramér–Rao lower bound (ICRLB) for unbiased estimators
considering that we are modeling a single source of n samples with a single informative parameter , i.e., . For further details, see [10].
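As a complementary numerical check (classical rather than intrinsic, so it is not the paper’s own computation), one can verify that the sample mean attains the Cramér–Rao bound σ²/n and that, over the sampling distribution of x̄, the expected quadratic Mahalanobis distance to μ is exactly 1:

```python
import numpy as np

# Hedged illustration with arbitrary numbers: the sample mean is unbiased
# for mu with variance sigma^2 / n, so it attains the classical Cramér–Rao
# bound 1 / I_n(mu); equivalently, E[n (xbar - mu)^2 / sigma^2] = 1.
rng = np.random.default_rng(2)
mu, sigma, n = 0.7, 1.3, 25
xbar = rng.normal(mu, sigma, size=(200_000, n)).mean(axis=1)

var_xbar = xbar.var()                 # approx sigma^2 / n
crlb = sigma**2 / n                   # 1 / I_n(mu)
mahalanobis_expect = np.mean(n * (xbar - mu) ** 2 / sigma**2)  # approx 1
```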
2.6. m Independent Sources of n Samples and Global Probability Density Function
With m independent sources, each generating n samples, a finite set of m quantum harmonic oscillators may represent reality. Presuming independence of the sources of information, the “global” wave function (also called the collective wave function) factors as the product of single wave functions. We can write the global wave function as
It constitutes a many-body system, and we may refer to the vector as the field.
For example, in the case of modeling two independent sources, the global wave function at the ground state will be the product of single wave functions, each of them at the ground state
We can generalize Equation (43) for having m independent sources. The global wave function is written as
Using Equation (38), the probability density function is:
2.7. m Dependent Sources of a Single Sample, the Fisher’s Information, the Riemannian Manifold, and the Quadratic Mahalanobis Distance
Consider now m possibly dependent sources generating a multivariate sample of size 1, x ∈ ℝᵐ. Although we still have a finite set of m univariate quantum harmonic oscillators that can represent reality, since these sources may be dependent, it is convenient to model this situation as a single m-variate source, with an m-variate random vector x following an m-variate normal probability distribution N(μ, Σ), where μ ∈ ℝᵐ and Σ is a known, constant, strictly positive-definite matrix, the covariance matrix of the random vector x. This is a well-known parametric statistical model whose m-dimensional parameter space may be identified with ℝᵐ; for further details, see [11]. Identifying, as is customary, the points of the manifold with their coordinates μ, we can compute all the quantities relevant to our purpose. For a single sample, the m-variate normal density (with respect to the Lebesgue measure), its natural logarithm, and the partial derivatives with respect to μ are given by
where, following standard matrix calculus notation, σⁱʲ is the element located in row i and column j of the inverse covariance matrix Σ⁻¹.
The Fisher information matrix is an m × m matrix whose elements are
where we have taken into account the symmetry of Σ⁻¹ and that E[(x − μ)(x − μ)ᵀ] = Σ; in matrix form, I(μ) = Σ⁻¹.
It is well known that the Fisher information matrix is a second-order covariant tensor of the parameter space. It is positive definite and may be considered the metric tensor of this manifold, giving it the structure of a Riemannian manifold. To avoid confusion, we must emphasize that the subscripts or superscripts used to reference the variance–covariance matrix or its inverse do not have a tensor character. That is, the components of the metric tensor are those of a covariant tensor; in tensor notation, they are written with subscripts and are equal to the components of the inverse variance–covariance matrix given in (49).
The Riemannian geometry induced by the Fisher information matrix in the parameter space is, in this case, Euclidean, and the square of the Riemannian distance, also known as the Rao distance, is the quadratic Mahalanobis distance given by
In this expression, the parameter space points are identified with their coordinates and written in matrix notation as column vectors.
All these results correspond to a multivariate normal model with a sample of size n = 1.
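The identity I(μ) = Σ⁻¹ for the multivariate normal mean can likewise be verified by simulation; the covariance matrix below is an arbitrary illustrative choice of ours:

```python
import numpy as np

def mv_score(x, mu, Sigma_inv):
    """Gradient of the log-density of N(mu, Sigma) with respect to mu,
    applied row-wise: Sigma^{-1} (x - mu)."""
    return (x - mu) @ Sigma_inv      # Sigma_inv is symmetric

rng = np.random.default_rng(3)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

x = rng.multivariate_normal(mu, Sigma, size=300_000)
s = mv_score(x, mu, Sigma_inv)
I_mc = s.T @ s / len(s)              # Monte Carlo E[score score^T] -> Sigma^{-1}
```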
2.8. m Dependent Sources of n Samples, the Fisher’s Information, the Riemannian Manifold, and the Square of the Riemannian Distance
If each of the m dependent sources generates n samples, x₁, …, xₙ, drawn independently from the multivariate normal probability distribution (46), the likelihood of this sample is
The summation term within the exponential function can be decomposed into two terms
where x̄ and S denote the sample mean and the sample covariance matrix corresponding to this random sample, and the first term, n(x̄ − μ)ᵀΣ⁻¹(x̄ − μ), is the quadratic Mahalanobis distance to the sample mean.
The log-likelihood distribution is
and the partial derivative of the log-likelihood with respect to each coordinate of μ, using standard classical notation for covariant derivatives and the repeated-index summation convention, is
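The decomposition of the summed quadratic form into a term around the sample mean plus a within-sample term is a standard algebraic identity; it can be sketched and verified numerically as follows (the data below are arbitrary):

```python
import numpy as np

def mahalanobis_decomposition(x, mu, Sigma_inv):
    """Check sum_i (x_i - mu)' S^{-1} (x_i - mu)
       = n (xbar - mu)' S^{-1} (xbar - mu) + sum_i (x_i - xbar)' S^{-1} (x_i - xbar)."""
    n = len(x)
    xbar = x.mean(axis=0)
    total = np.einsum('ij,jk,ik->', x - mu, Sigma_inv, x - mu)
    around_mean = n * (xbar - mu) @ Sigma_inv @ (xbar - mu)
    within = np.einsum('ij,jk,ik->', x - xbar, Sigma_inv, x - xbar)
    return total, around_mean + within

rng = np.random.default_rng(4)
mu = np.array([0.5, -1.0, 2.0])
x = rng.normal(size=(50, 3))
Sigma_inv = np.linalg.inv(np.array([[2.0, 0.3, 0.0],
                                    [0.3, 1.0, 0.2],
                                    [0.0, 0.2, 1.5]]))
lhs, rhs = mahalanobis_decomposition(x, mu, Sigma_inv)
```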
2.9. Stationary States of m-Dependent Sources of n Samples in the Riemannian Manifold
To calculate the stationary states, we can invoke the time-independent non-relativistic Schrödinger equation [6] as above. In the multivariate case, the wave equation reads as follows
where V(μ) is the potential energy and E the energy. The solution ψ must also vanish at infinity and have unit norm. For simplicity, we will write ψ instead of ψ(μ).
We can use the square of the norm of the log-likelihood gradient as the potential energy, except for a constant term. Observing that the inverse of the metric tensor corresponding to a sample of size n has components equal to those of Σ/n, since the product of the metric and its inverse is the identity matrix, whose entries are Kronecker deltas, the components of the gradient of the log-likelihood, a contravariant vector field, will be given in classical notation by
and, therefore, the square of the norm of the log-likelihood gradient will be
Alternatively, we can use as the potential the difference between the maximum of the log-likelihood attained by the sample and the log-likelihood at an arbitrary point μ
which is, up to a proportionality constant, equal to (58). Thus, Equations (58) and (59) suggest taking this quantity as the potential energy. In this way, Equation (56) reads as
To proceed, we must compute the Laplacian in Equation (60). Let G denote the determinant corresponding to the tensor of the information metric for samples of size 1, defined from (49); for a sample of arbitrary size n, that determinant will be equal to nᵐ G. In this way, the Laplacian of a function will be given by
where we have used the repeated-index summation convention. For further details about this formula, see, for instance, [12]. Notice that n times the contravariant metric tensor equals the variance–covariance matrix Σ. Moreover, defining the matrix of second partial derivatives of ψ, i.e., the Hessian matrix of ψ under the coordinates μ, Equation (61) can be written as
Inserting Equation (62) into (60), we obtain
which is Schrödinger’s equation of m-coupled quantum harmonic oscillators.
2.10. Solutions of m-Coupled Quantum Harmonic Oscillators in the Riemannian Manifold
We observe that both (58) and (61) are invariant under coordinate changes on the parameter space. Therefore, Equation (63) will remain invariant under these changes, particularly linear ones.
Since Σ^{1/2} is a symmetric matrix that diagonalizes on an orthonormal basis, it can be written as Σ^{1/2} = UΛUᵀ, where U is an orthogonal matrix and Λ is a diagonal matrix whose diagonal entries λ₁, …, λ_m are the eigenvalues of the square root of the variance–covariance matrix Σ.
Thus, by introducing the corresponding change of coordinates, the metric tensor becomes diagonal. Taking this coordinate change into account, Equation (58) becomes
Moreover, if we define the symmetric matrix given by the Hessian matrix of ψ under the new coordinates, Equation (61) becomes
Making use of Equation (64) and Equation (65) in Equation (63), we obtain
which is Schrödinger’s equation of the m-decoupled quantum harmonic oscillators. If we choose and , Equation (66) can be written as
Additionally, if we define with , we may write
Note that if we have a non-trivial solution of
then, Equation (66) admits a solution of the form
Each of the equations in (69) admits infinitely many solutions for different values of the energy, as in our previous work [1]. More specifically, it admits solutions for
In particular, for , we have and the wave function for the ground-state is
Then, the global wave function at the ground state will be
with , which is the intrinsic Cramér–Rao lower bound (ICRLB) for m sources of information with n samples each, each source being modeled with one informative parameter, i.e., a total of m informative parameters. For further details, see [10]. The global probability density function at the ground state can be written as
Since and , where U is an orthogonal matrix and, therefore, U⁻¹ = Uᵀ, we can express Equation (74) as a function of the original coordinates
However, there are many other solutions in (69) considering different values of in (71). It is well known that the solutions of the quantum harmonic oscillator involve Hermite polynomials, which were introduced elsewhere [8,9]. In particular, for , we will have and the wave function at the first excited state will be
We can obtain other solutions via the Hermite polynomials, representing excited states of the quantum harmonic oscillator. For instance, we may obtain the solution for each of the modes and for each energy level. Combining the m modes with the energy levels, we can build up all possible solutions and, therefore, obtain up to different solutions
where and , the total energy of m oscillators, such that .
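The decoupling step of this subsection rests on the spectral factorization Σ^{1/2} = UΛUᵀ. The sketch below (with a randomly generated positive-definite Σ of ours) verifies that the associated linear change of coordinates turns the quadratic form of the metric into the identity, i.e., the modes decouple:

```python
import numpy as np

rng = np.random.default_rng(5)
# A symmetric positive-definite covariance matrix (coupled modes).
A = rng.normal(size=(4, 4))
Sigma = A @ A.T + 4 * np.eye(4)

# Sigma^{1/2} = U Lam U^T with U orthogonal and Lam diagonal, where the
# diagonal of Lam holds the eigenvalues of the square root of Sigma.
eigvals, U = np.linalg.eigh(Sigma)
Lam = np.diag(np.sqrt(eigvals))
sqrt_Sigma = U @ Lam @ U.T

# Transforming with W = Lam^{-1} U^T sends the quadratic form of Sigma to
# the identity: in the new coordinates, the metric is (a multiple of) I.
W = np.linalg.inv(Lam) @ U.T
decoupled_metric = W @ Sigma @ W.T   # should equal the identity matrix
```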
2.11. Bayesian Framework and Posterior Probability Density Function
Regardless of having independent or dependent sources of information, we can compute the posterior probability distribution calculated from the sources of information for all data values using Bayes’ theorem [13], taking the Riemannian volume of the metric as a prior. This measure is Jeffreys’ prior distribution on the parameter, and it can be considered in some way an objective, or at least a reference, choice for a prior distribution [14].
Considering Equation (49), the Riemannian volume is constant in μ. Then, taking into account the likelihood probability distribution (51), the posterior probability distribution based on Jeffreys’ prior is equal to
This posterior probability density function coincides with the global probability density function at the ground state (75). More precisely, the probability density function of m quantum harmonic oscillators at the ground state, given by the square of the wave function, coincides with the Bayesian posterior obtained from m sources of information for all data values using the improper Jeffreys’ prior. This unexpected and exciting result reveals a plausible relationship between energy and Bayes’ theorem.
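This Bayesian computation can be sketched numerically in the univariate case (illustrative numbers of ours): under a constant (Jeffreys-type) prior on μ, the grid-normalized posterior matches the closed form N(x̄, σ²/n), which is the density the text identifies with the ground state:

```python
import numpy as np

rng = np.random.default_rng(6)
sigma, n = 1.5, 40
x = rng.normal(2.0, sigma, size=n)
xbar = x.mean()

# Flat (Jeffreys) prior: the posterior is proportional to the likelihood.
mu_grid = np.linspace(xbar - 2, xbar + 2, 4001)
log_lik = -0.5 * ((x[None, :] - mu_grid[:, None]) / sigma) ** 2
log_post = log_lik.sum(axis=1)
post = np.exp(log_post - log_post.max())
post /= post.sum() * (mu_grid[1] - mu_grid[0])   # grid-normalized density

# Closed-form posterior N(xbar, sigma^2 / n) for comparison.
analytic = np.sqrt(n / (2 * np.pi)) / sigma \
    * np.exp(-0.5 * n * (mu_grid - xbar) ** 2 / sigma**2)
```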
3. Discussion
This paper aimed to expand and refine the mathematical framework initially presented in [1]. We made specific adjustments to the approach, enabling us to consider real-world scenarios more thoroughly. As we continued with our work, we came to appreciate the “intrinsic” nature of the modeling, which we believe is a crucial aspect of our study. Our ultimate objective was to improve upon the foundation established in the previous paper and create an even more robust and accurate framework.
First, we extended the approach by modeling a single source of information with a univariate normal probability distribution N(μ, σ²), as before, but with σ a constant not necessarily equal to 1. We calculated the stationary states in the Riemannian manifold by invoking Schrödinger’s equation to discover that the information could be broken into quantum harmonic oscillators as before, but with the energy levels being independent of σ, an unexpected but relevant result that motivated us to continue exploring this field.
This primitive result led us to title the work “Intrinsic information-theoretic models”, which asserts that the critical features of our modeling process, such as the energy levels, remain independent of the parametrization used and invariant under coordinate changes. This notion of invariance is significant because it implies that the same model can be applied across different parameterizations, allowing for greater consistency and generalizability. Furthermore, this approach can lead to a more robust and reliable modeling process, as it reduces the impact of specific parameter choices on the final model output. As such, the notion of “intrinsic” information-theoretic models has the potential to improve modeling accuracy and reliability significantly.
Similar to our previous study [1], we evaluated the performance of the estimation of the parameter. Instead of calculating the estimator’s variance, we used the expectation of the quadratic Mahalanobis distance to the sample mean and discovered that it equals the energy levels of the quantum harmonic oscillator, with the minimum quadratic Mahalanobis distance attained at the minimum energy level of the oscillator. Interestingly, we demonstrated that quantum harmonic oscillators reach the “intrinsic” Cramér–Rao lower bound on the quadratic Mahalanobis distance at the lowest energy level.
Then, we modeled m independent sources of information and computed the global density function at the ground state as an example. Essentially, we modeled the m sources with a multivariate normal probability distribution N(μ, Σ), with a variance–covariance matrix Σ different from the m-dimensional identity matrix, but initially diagonal to describe the independence of the sources of information.
We advanced the mathematical approach by modeling m dependent sources of information with a variance–covariance matrix not necessarily diagonal, depicting dependent sources. This resulted in Schrödinger’s equation of m-coupled quantum harmonic oscillators. We could effectively decouple the oscillators through a coordinate transformation, thereby partitioning the information into independent modes. This enabled us to obtain the same energy levels, albeit now with respect to the modes, which further proves the “intrinsic” property of the mathematical framework.
Finally, as in our previous study, we showed that the global probability density function of a set of m quantum harmonic oscillators at the lowest energy level, calculated as the square modulus of the global wave function at the ground state, equals the posterior probability distribution calculated using Bayes’ theorem from the m sources of information for all data values, taking as a prior the Riemannian volume of the informative metric. This is true regardless of whether the sources are independent or dependent.
Apart from the mathematical discoveries detailed in this paper, this framework offers multiple alternatives that we are currently exploring. For example, the informational representation of statistical manifolds with is unknown. Also, this approach can be generalized by exploring other statistical manifolds and depicting how physical observables such as space and time may emerge from linear and nonlinear transformations of a set of parameters of a specific statistical manifold. This way, the laws of physics, including time’s arrow, will appear afterward.
Moreover, several fascinating inquiries warrant further investigation. These involve delving into the relationship between energy and information already highlighted in our initial work. In addition, the very plausible connection between energy and Bayes’ theorem also deserves further exploration. By delving deeper into these topics, we may unlock even more insights into the universe’s fundamental nature and mathematical laws.
The updated framework presented in this study offers a more realistic approach by allowing the modeling of m-dependent sources. In real-world scenarios, information is often distributed over multiple sources that may not be entirely independent. By formulating the problem in terms of modes, we can obtain a solution or set of solutions for the proposed framework. This approach provides a valuable tool for solving complex problems that require a deeper understanding of the underlying dynamics.
Author Contributions
Conceptualization, D.B.-C. and J.M.O.; writing—original draft preparation, D.B.-C. and J.M.O.; writing—review and editing, D.B.-C. and J.M.O. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
The study did not require ethical approval.
Data Availability Statement
No new data were created.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Bernal-Casas, D.; Oller, J.M. Information-Theoretic Models for Physical Observables. Entropy 2023, 25, 1448.
- Fisher, R. On the mathematical foundations of theoretical statistics. Philos. Trans. R. Soc. Lond. Ser. A 1922, 222, 309–368.
- Riemann, B. Über die Hypothesen, Welche der Geometrie zu Grunde Liegen. (Mitgetheilt durch R. Dedekind). 1868. Available online: https://eudml.org/doc/135760 (accessed on 15 July 2023).
- Mahalanobis, P. On the generalized distance in Statistics. Proc. Nat. Inst. Sci. India 1936, 2, 49–55.
- Frieden, B. Science from Fisher Information: A Unification, 2nd ed.; Cambridge University Press: Cambridge, UK, 2004.
- Schrödinger, E. An Undulatory Theory of the Mechanics of Atoms and Molecules. Phys. Rev. 1926, 28, 1049–1070.
- Schrödinger, E. Quantisierung als Eigenwertproblem. II. Ann. Phys. 1926, 79, 489–527.
- Laplace, P. Mémoire sur les Intégrales Définies et leur Application aux Probabilités, et Spécialement a la Recherche du Milieu Qu’il Faut Choisir Entre les Resultats des Observations. In Mémoires de la Classe des Sciences Mathématiques et Physiques de L’institut Impérial de France; Institut de France: Paris, France, 1811; pp. 297–347.
- Hermite, C. Sur un Nouveau Développement en Série de Fonctions; Académie des Sciences and Centre National de la Recherche Scientifique de France: Paris, France, 1864.
- Oller, J.M.; Corcuera, J.M. Intrinsic Analysis of Statistical Estimation. Ann. Stat. 1995, 23, 1562–1581.
- Muirhead, R. Aspects of Multivariate Statistical Theory; Wiley: Hoboken, NJ, USA, 1982.
- Chavel, I. Eigenvalues in Riemannian Geometry; Elsevier: Philadelphia, PA, USA, 1984.
- Bayes, T. An essay towards solving a problem in the doctrine of chances. Phil. Trans. R. Soc. Lond. 1763, 53, 370–418.
- Jeffreys, H. An invariant form for the prior probability in estimation problems. Proc. R. Soc. Lond. Ser. A Math. Phys. Sci. 1946, 186, 453–461.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).