On a Generalized Entropy Measure Leading to the Pathway Model: with a preliminary application to solar neutrino data

An entropy measure for the scalar variable case, parallel to the Havrda-Charvat entropy, was introduced by the first author, and its properties and its connections to Tsallis non-extensive statistical mechanics and the Mathai pathway model were examined by the authors in previous papers. In the current paper we extend the entropy measure to cover the scalar, multivariable, and matrix-variate cases. This measure is then optimized under different types of restrictions, and a number of models in the multivariable and matrix-variate cases are obtained. Connections of these models to problems in the statistical, physical, and engineering sciences are pointed out. An application of the simplest case of the pathway model to the interpretation of solar neutrino data is provided.

Classical Shannon entropy has been generalized in many directions [11]. An α-generalized entropy, parallel to the Havrda-Charvat entropy, introduced by the first author, is found to be quite useful in deriving pathway models [6], including Tsallis statistics [10] and superstatistics [1,2]. It is also connected to Kerridge's measure of inaccuracy [9]. For the continuous case, let f(X) be a density function associated with a random variable X, where X could be a real or complex scalar, vector or matrix variable. In the present paper we consider only the real cases for convenience. The generalized entropy measure is

M_α(f) = [∫_X [f(X)]^{2−α} dX − 1] / (α − 1), α ≠ 1, α < 2.   (1.1)

When α → 1, (1.1) goes to Shannon's entropy S(f) = −∫_X f(X) ln f(X) dX [9], and in this sense (1.1) is an α-generalized entropy measure. The corresponding discrete case is available as

M_α(P) = [∑_{i=1}^{k} p_i^{2−α} − 1] / (α − 1), p_i > 0, i = 1, ..., k, p_1 + ... + p_k = 1, α ≠ 1.
Characterization properties and applications of (1.1) may be seen from [9]. Note that

M_α(f) = [∫_x f(x) [f(x)]^{1−α} dx − 1] / (α − 1).

Thus there is a parallelism with Kerridge's measure of inaccuracy. The α-generalized Kerridge's measure of inaccuracy [9] is given by

H_α(P, Q) = [∫_x P(x) [Q(x)]^{1−α} dx − 1] / (α − 1), α ≠ 1.   (1.2)

When α → 1, eq. (1.2) goes to Kerridge's measure of inaccuracy, given by

H(P, Q) = −∫_x P(x) ln Q(x) dx,   (1.3)

where x is a scalar variable, P(x) is the true density and Q(x) is a hypothesized or assigned density for the true density P(x). Then a measure of inaccuracy in taking Q(x) for the true density P(x) is given by (1.3), and its α-generalized form is given by (1.2).
Earlier works on Shannon's measure of entropy, measures of directed divergence, measures of inaccuracy and related items, and applications in the natural sciences may be seen from [9] and the references therein. A measure of entropy, parallel to the Havrda-Charvat entropy, was introduced by Tsallis in 1988 [10,12,13], given by

T_α(f) = [∫_x [f(x)]^α dx − 1] / (1 − α), α ≠ 1.   (1.4)

Tsallis statistics, or non-extensive statistical mechanics, is derived by optimizing (1.4) under restrictions on an escort density associated with the f(x) of (1.4). Let

g(x) = [f(x)]^α / ∫_x [f(x)]^α dx

be the escort density. If (1.4) is optimized over all non-negative functions f, subject to the conditions that f(x) is a density and that the expected value of x in the escort density is a given quantity, that is, ∫_x x g(x) dx = a given quantity, then the Euler equation to be considered, if we optimize by using calculus of variations, is

∂/∂f [f^α − λ_1 f + λ_2 x f^α] = 0,

where λ_1 and λ_2 are Lagrangian multipliers. That is,

α f^{α−1}(1 + λ_2 x) = λ_1 ⇒ f(x) = c_1 [1 + λ_2 x]^{−1/(α−1)}.

Taking λ_2 = a(α − 1) for α > 1, a > 0, we have Tsallis statistics as

f_1(x) = c_1 [1 + a(α − 1)x]^{−1/(α−1)}, x > 0,   (1.5)

and c_1 can act as a normalizing constant if f_1(x) is to be taken as a statistical density. Tsallis statistics in (1.5) led to the development of non-extensive statistical mechanics. We will show that (1.5) comes directly from the entropy of (1.1) without going through any escort density. Let us optimize (1.1) subject to the conditions that f(x) is a density, ∫_x f(x) dx = 1, and that the expected value of x in f(x) is a given quantity, that is, ∫_x x f(x) dx = a given quantity. Then, if we use calculus of variations, the Euler equation is of the form

∂/∂f [f^{2−α} − λ_1 f + λ_2 x f] = 0,

where λ_1 and λ_2 are Lagrangian multipliers. Then, by taking λ_2/λ_1 = a(1 − α), a > 0, α < 1, we have

f_1(x) = c_1 [1 − a(1 − α)x]^{1/(1−α)},   (1.6)

where c_1 is the corresponding normalizing constant to make f_1(x) a statistical density. Now, for α > 1, write 1 − α = −(α − 1); then directly from (1.6), without going through any escort density, we have

f_1(x) = c_1 [1 + a(α − 1)x]^{−1/(α−1)},

which is Tsallis statistics for α > 1. Thus, both the cases α < 1 and α > 1 follow directly from (1.1).
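As a numerical aside (not part of the original derivation), the limit M_α(f) → S(f) as α → 1 can be checked for a standard exponential density, for which M_α(f) = 1/(2 − α) and S(f) = 1 exactly; the grid and function names below are illustrative choices, not the authors' notation:

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def mathai_entropy(f, x, alpha):
    """M_alpha(f) = (∫ f^(2-alpha) dx - 1)/(alpha - 1), eq. (1.1)."""
    return (trapezoid(f ** (2.0 - alpha), x) - 1.0) / (alpha - 1.0)

def shannon_entropy(f, x):
    """S(f) = -∫ f ln f dx, with 0 ln 0 taken as 0."""
    safe = np.where(f > 0, f, 1.0)
    return -trapezoid(np.where(f > 0, f * np.log(safe), 0.0), x)

x = np.linspace(0.0, 50.0, 200_001)
f = np.exp(-x)                      # standard exponential density

for alpha in (0.9, 0.99, 0.999):
    print(alpha, mathai_entropy(f, x, alpha))   # analytically 1/(2 - alpha)
print("Shannon:", shannon_entropy(f, x))        # analytically 1
```

The printed values approach the Shannon value 1 as α → 1, as the text asserts.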

A Generalized Measure of Entropy
Let X be a scalar, a p × 1 vector of scalar random variables, or a p × n, p ≥ n matrix of rank n of scalar random variables, and let f(X) be a real-valued scalar function such that f(X) ≥ 0 for all X and ∫_X f(X) dX = 1, where dX stands for the wedge product of the differentials in X. For example, if X is m × n, X = (x_ij), then

dX = ∧_{i=1}^{m} ∧_{j=1}^{n} dx_ij,

where ∧ stands for the wedge product of differentials, dx ∧ dy = −dy ∧ dx ⇒ dx ∧ dx = 0. Then f(X) is a density of X. When X is p × n, p ≥ n, we have a rectangular matrix-variate density. For convenience we have taken X of full rank n ≤ p. When n = 1 we have a multivariate density, and when n = 1, p = 1 we have a univariate density. Consider the generalized entropy of (1.1) for this matrix-variate density, denoted by f(X):

M_α(f) = [∫_X [f(X)]^{2−α} dX − 1] / (α − 1), α ≠ 1.   (2.1)

Let n = 1, so that X is a p × 1 vector random variable, and let us consider the situation where the ellipsoid of concentration is a preassigned quantity.
For a p × 1 vector random variable X with mean value μ and positive definite parameter matrix V > O,

(X − μ)'V^{−1}(X − μ) = c > 0

is the ellipsoid of concentration. Let us optimize (2.1) subject to the constraints that f(X) ≥ 0 is a density and that a moment of the ellipsoid of concentration, over all functional f, is a constant, that is,

∫_X [(X − μ)'V^{−1}(X − μ)]^δ f(X) dX = a fixed parameter, δ > 0.

If we are using calculus of variations then the Euler equation is given by

∂/∂f [f^{2−α} − λ_1 f + λ_2 [(X − μ)'V^{−1}(X − μ)]^δ f] = 0,

where λ_1 and λ_2 are Lagrangian multipliers. Solving the above equation, and taking λ_2/λ_1 = a(1 − α), a > 0, α < 1, we have

f_1(X) = C_1 [1 − a(1 − α)[(X − μ)'V^{−1}(X − μ)]^δ]^{1/(1−α)}.   (2.2)

This C_1 can act as the normalizing constant to make f_1(X) in (2.2) a statistical density. Note that for α > 1 we have, from (2.2),

f_2(X) = C_2 [1 + a(α − 1)[(X − μ)'V^{−1}(X − μ)]^δ]^{−1/(α−1)},

and when α → 1, f_1 and f_2 go to

f_3(X) = C_3 e^{−a[(X − μ)'V^{−1}(X − μ)]^δ}.   (2.3)

Let Y = V^{−1/2}(X − μ), where V^{−1/2} is the positive definite square root of the positive definite matrix V^{−1}; then dY = |V|^{−1/2} dX and the density of Y, denoted by g(Y), is given by

g(Y) = C_4 e^{−a(Y'Y)^δ},

where C_4 is the normalizing constant. This normalizing constant can be evaluated in two different ways. One method is to use a polar coordinate transformation, see Theorem 1.25 of [3]. Represent y_1, ..., y_p in polar coordinates r, θ_1, ..., θ_{p−1}, where r > 0, 0 < θ_j ≤ π, j = 1, ..., p − 2, 0 < θ_{p−1} ≤ 2π, and the Jacobian is such that

dY = r^{p−1} { ∏_{j=1}^{p−2} (sin θ_j)^{p−1−j} } dr dθ_1 ⋯ dθ_{p−1}.

Under this transformation the exponent (y_1^2 + ... + y_p^2)^δ = (r^2)^δ. Hence we integrate out the sine functions. The integral over θ_{p−1} goes from 0 to 2π and gives the value 2π, and the others go from 0 to π. These, in general, can be evaluated by using type-1 beta integrals by putting sin θ = u and u^2 = v. That is,

∫_0^π sin^k θ dθ = 2 ∫_0^1 u^k (1 − u^2)^{−1/2} du = ∫_0^1 v^{(k+1)/2 − 1} (1 − v)^{1/2 − 1} dv = Γ((k+1)/2) Γ(1/2) / Γ((k+2)/2).
Taking the product of all these angular factors we have

2π ∏_{k=1}^{p−2} Γ((k+1)/2) Γ(1/2) / Γ((k+2)/2) = 2π^{p/2} / Γ(p/2).

Hence the total integral is equal to

∫_Y e^{−a(Y'Y)^δ} dY = [2π^{p/2}/Γ(p/2)] ∫_0^∞ r^{p−1} e^{−a r^{2δ}} dr.

Put x = a r^{2δ} and integrate out by using a gamma integral to get

∫_Y e^{−a(Y'Y)^δ} dY = π^{p/2} Γ(p/(2δ)) / [δ a^{p/(2δ)} Γ(p/2)].   (2.7)
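As a quick numerical sanity check (not part of the original paper), the type-1 beta evaluation of the angular integrals can be verified directly; the helper name and grid size below are illustrative assumptions:

```python
import math
import numpy as np

def sine_power_integral(k, m=200_001):
    """Trapezoidal evaluation of the angular integral ∫_0^π sin^k(θ) dθ."""
    theta = np.linspace(0.0, math.pi, m)
    y = np.sin(theta) ** k
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(theta)))

# Closed form via the type-1 beta integral: Γ((k+1)/2) Γ(1/2) / Γ((k+2)/2)
for k in range(1, 7):
    closed = math.gamma((k + 1) / 2) * math.gamma(0.5) / math.gamma((k + 2) / 2)
    print(k, sine_power_integral(k), closed)
```

For each k the two columns agree, e.g. k = 2 gives π/2 and k = 3 gives 4/3.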
That is, the density of Y is given by

g(Y) = [δ a^{p/(2δ)} Γ(p/2) / (π^{p/2} Γ(p/(2δ)))] e^{−a(Y'Y)^δ}.   (2.8)

From the above steps the following items are available. The density of u = Y'Y = y_1^2 + ... + y_p^2, denoted by g_1(u), is given by

g_1(u) = [δ a^{p/(2δ)} / Γ(p/(2δ))] u^{p/2 − 1} e^{−a u^δ}, u > 0,   (2.9)

and the density of r > 0, where r^2 = u = Y'Y, denoted by g_2(r), is given by

g_2(r) = [2δ a^{p/(2δ)} / Γ(p/(2δ))] r^{p−1} e^{−a r^{2δ}}, r > 0.   (2.10)

The second method uses the result that, for an n × p matrix X of full rank p and S = X'X,

dX = [π^{np/2} / Γ_p(n/2)] |S|^{(n − p − 1)/2} dS,   (2.11)

where |S| denotes the determinant of S and Γ_p(α) is the real matrix-variate gamma given by

Γ_p(α) = π^{p(p−1)/4} Γ(α) Γ(α − 1/2) ⋯ Γ(α − (p − 1)/2), ℜ(α) > (p − 1)/2.

Applications of the above result in various disciplines may be seen from [5,6,7,8]. In our problem we can connect dY of (2.8) to du of (2.9) with the help of (2.11), by replacing n by p and p by 1 in the n × p matrix. That is, from (2.11),

dY = [π^{p/2} / Γ(p/2)] u^{p/2 − 1} du.

The total integral of f_3(X) of (2.3) is then given by

∫_X f_3(X) dX = C_3 |V|^{1/2} [π^{p/2}/Γ(p/2)] ∫_0^∞ u^{p/2 − 1} e^{−a u^δ} du = 1.

Put v = a u^δ and integrate out by using a gamma integral to get

C_3 = δ a^{p/(2δ)} Γ(p/2) / [|V|^{1/2} π^{p/2} Γ(p/(2δ))],

and we get the same result as in (2.7), thereby the same expressions for g(Y) in (2.8), g_1(u) in (2.9) and g_2(r) in (2.10).
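The gamma-integral step used for the normalizing constant can likewise be checked numerically; the following sketch (function names and truncation point are assumptions for illustration) compares the truncated numerical integral ∫_0^∞ u^{p/2−1} e^{−a u^δ} du with its closed form Γ(p/(2δ))/(δ a^{p/(2δ)}):

```python
import math
import numpy as np

def radial_integral(p, a, delta, upper=60.0, m=600_001):
    """Trapezoidal evaluation of ∫_0^upper u^{p/2-1} e^{-a u^δ} du."""
    u = np.linspace(0.0, upper, m)[1:]       # skip u = 0 (possible 0**negative)
    y = u ** (p / 2 - 1) * np.exp(-a * u ** delta)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(u)))

def closed_form(p, a, delta):
    """Gamma-integral value obtained by putting v = a u^δ."""
    return math.gamma(p / (2 * delta)) / (delta * a ** (p / (2 * delta)))

for (p, a, delta) in [(2, 1.0, 1.0), (3, 1.5, 0.75), (5, 0.5, 2.0)]:
    print(p, a, delta, radial_integral(p, a, delta), closed_form(p, a, delta))
```

The agreement for several (p, a, δ) triples supports the expression for C_3 above.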

A Generalized Model
If we optimize (2.1) over all integrable functions f(X) ≥ 0 for all X, subject to the two moment-like restrictions

∫_X u^{γ(1−α)} f(X) dX = a fixed quantity and ∫_X u^{γ(1−α)+δ} f(X) dX = a fixed quantity, u = (X − μ)'V^{−1}(X − μ),

then the corresponding Euler equation becomes

∂/∂f [f^{2−α} − λ_1 u^{γ(1−α)} f + λ_2 u^{γ(1−α)+δ} f] = 0,

and the solution is available as

f(X) = C* u^γ [1 − a(1 − α) u^δ]^{1/(1−α)}, u = (X − μ)'V^{−1}(X − μ),   (3.1)

for α < 1, a > 0, V > O, δ > 0, γ > 0, where for convenience we have taken λ_2/λ_1 = a(1 − α), a > 0, α < 1, and where C* can act as the normalizing constant if f(X) is to be treated as a statistical density. Otherwise f(X) can be a very versatile model in model-building situations. If C* is the normalizing constant then it can be evaluated by using the following procedure. Going to Y = V^{−1/2}(X − μ) and then to u = Y'Y as before, the total integral reduces to

C* |V|^{1/2} [π^{p/2}/Γ(p/2)] ∫_0^{[a(1−α)]^{−1/δ}} u^{γ + p/2 − 1} [1 − a(1 − α)u^δ]^{1/(1−α)} du = 1.

Then for a > 0, α < 1, δ > 0 we can integrate out by using a type-1 beta integral by putting z = a(1 − α)u^δ for α < 1. Then the normalizing constant, denoted by C*_1, is available as

C*_1 = δ [a(1 − α)]^{(γ + p/2)/δ} Γ(p/2) Γ(1/(1−α) + 1 + (γ + p/2)/δ) / { |V|^{1/2} π^{p/2} Γ((γ + p/2)/δ) Γ(1/(1−α) + 1) },   (3.2)

for δ > 0, γ + p/2 > 0. Hence the density of the p × 1 vector X is given by

f_1(X) = C*_1 [(X − μ)'V^{−1}(X − μ)]^γ [1 − a(1 − α)((X − μ)'V^{−1}(X − μ))^δ]^{1/(1−α)},   (3.3)

for a > 0, α < 1, 1 − a(1 − α)[(X − μ)'V^{−1}(X − μ)]^δ > 0, −∞ < x_j < ∞, −∞ < μ_j < ∞, j = 1, ..., p. For α < 1 we may say that f_1(X) in (3.3) is a generalized type-1 beta form. The density of Y = V^{−1/2}(X − μ), denoted by g(Y), is then given by

g(Y) = C*_1 |V|^{1/2} (Y'Y)^γ [1 − a(1 − α)(Y'Y)^δ]^{1/(1−α)},   (3.4)

for a > 0, α < 1, where C*_1 is defined in (3.2). Note that the density of u = Y'Y, denoted by g_1(u), is available as

g_1(u) = C*_1 |V|^{1/2} [π^{p/2}/Γ(p/2)] u^{γ + p/2 − 1} [1 − a(1 − α)u^δ]^{1/(1−α)}, 0 ≤ u ≤ [a(1 − α)]^{−1/δ},

for δ > 0, γ + p/2 > 0. Note that for α > 1 the model in (3.1) switches into a generalized type-2 beta form. Write 1 − α = −(α − 1) for α > 1. Then the model in (3.3) switches into the following form:

f_2(X) = C*_2 [(X − μ)'V^{−1}(X − μ)]^γ [1 + a(α − 1)((X − μ)'V^{−1}(X − μ))^δ]^{−1/(α−1)},   (3.5)

for δ > 0, a > 0, V > O, α > 1. The normalizing constant C*_2 can be computed by using the following procedure. Put z = a(α − 1)u^δ, δ > 0, α > 1. Then integrate out by using a type-2 beta integral to get

C*_2 = δ [a(α − 1)]^{(γ + p/2)/δ} Γ(p/2) Γ(1/(α−1)) / { |V|^{1/2} π^{p/2} Γ((γ + p/2)/δ) Γ(1/(α−1) − (γ + p/2)/δ) }, 1/(α−1) − (γ + p/2)/δ > 0.

When α → 1, both f_1(X) of (3.3) and f_2(X) of (3.5) go to the generalized gamma model given by

f_3(X) = C*_3 [(X − μ)'V^{−1}(X − μ)]^γ e^{−a[(X − μ)'V^{−1}(X − μ)]^δ}, C*_3 = δ a^{(γ + p/2)/δ} Γ(p/2) / { |V|^{1/2} π^{p/2} Γ((γ + p/2)/δ) }.

It is not difficult to show that when α → 1, both C*_1 → C*_3 and C*_2 → C*_3. This can be seen by using Stirling's formula

Γ(z + η) ≈ (2π)^{1/2} z^{z + η − 1/2} e^{−z}, |z| → ∞, η a bounded quantity.

Observe that the gamma ratios in C*_1 and C*_2 involve 1/(1 − α) and 1/(α − 1) respectively, and we can apply Stirling's formula by taking z = 1/(1−α) in one case and z = 1/(α−1) in the other case. Thus, from f_1(X) we can switch to f_2(X) and to f_3(X); that is, through the same model we can go to three different families of functions through the parameter α, and hence α is called the pathway parameter and the model above belongs to the pathway model in [4].
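The pathway behaviour described here, with the type-1 form for α < 1 and the type-2 form for α > 1 both tending to the generalized gamma form as α → 1, can be illustrated numerically (a minimal sketch under illustrative parameter choices, not the authors' code):

```python
import numpy as np

def pathway_kernel(u, a, alpha, delta):
    """[1 - a(1-α)u^δ]^{1/(1-α)}: a type-1 form for α < 1 (zero where the
    base is negative), a type-2 form for α > 1, and e^{-a u^δ} as α → 1."""
    base = 1.0 - a * (1.0 - alpha) * u ** delta
    return np.where(base > 0.0, base, 0.0) ** (1.0 / (1.0 - alpha))

u = np.linspace(0.0, 3.0, 13)
a, delta = 1.0, 2.0
limit = np.exp(-a * u ** delta)        # the generalized gamma form
for alpha in (0.5, 0.9, 0.99, 1.01, 1.1, 1.5):
    err = float(np.max(np.abs(pathway_kernel(u, a, alpha, delta) - limit)))
    print(alpha, err)                  # err shrinks as alpha -> 1, either side
```

The printed deviations decrease as α approaches 1 from below and from above, in line with the limit f_1, f_2 → f_3.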

Generalization to the Matrix Case
Let X be a p × n, n ≥ p rectangular matrix of full rank p. Let A > O be p × p and B > O be n × n positive definite constant matrices. Let A^{1/2} and B^{1/2} denote the positive definite square roots of A and B respectively. Consider the matrix

I − a(1 − α)A^{1/2}XBX'A^{1/2}, a > 0, α < 1.

Let f(X) be a real-valued scalar function of X such that f(X) ≥ 0 for all X and f(X) is integrable, ∫_X f(X) dX < ∞. If we assume that the expected value of the determinant of the above matrix is fixed over all functional f, that is,

∫_X |I − a(1 − α)A^{1/2}XBX'A^{1/2}| f(X) dX = a fixed quantity,   (4.1)

then an equation such as the one in (4.1) can be connected to the volume of a certain parallelotope or to random geometrical objects. Solving the resulting Euler equation we have

f(X) = Ĉ |I − a(1 − α)A^{1/2}XBX'A^{1/2}|^{1/(1−α)},   (4.2)

where Ĉ is a constant. A more general form is obtained by putting a restriction of the form that the expected value of |A^{1/2}XBX'A^{1/2}|^{γ(1−α)} |I − a(1 − α)A^{1/2}XBX'A^{1/2}| is a fixed quantity, which leads to

f(X) = C |A^{1/2}XBX'A^{1/2}|^γ |I − a(1 − α)A^{1/2}XBX'A^{1/2}|^{1/(1−α)},   (4.3)

for α < 1, a > 0, A > O, B > O, and X a p × n, n ≥ p matrix of full rank p, where a prime denotes the transpose. The model in (4.3) can switch around to three functional forms: one family for α < 1, a second family for α > 1 and a third family for α → 1. In fact, (4.3) contains all matrix-variate statistical densities in current use in physical and engineering sciences. For evaluating the normalizing constants for all three cases, the first step is to make the transformation

Y = A^{1/2}XB^{1/2} ⇒ dY = |A|^{n/2}|B|^{p/2} dX,   (4.4)

see [3] for the Jacobian of this transformation. After this stage, all the steps in the previous sections are applicable, and we use matrix-variate type-1 beta, type-2 beta, and gamma integrals to do the final evaluation of the normalizing constants. Since the steps are parallel, the details are omitted here.
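The Jacobian in (4.4) can be checked numerically: the map X ↦ A^{1/2}XB^{1/2} is linear, so its Jacobian determinant is the determinant of the Kronecker-product matrix representing it on vec(X). The sketch below (random test matrices; helper names are illustrative) confirms the factor |A|^{n/2}|B|^{p/2}:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 5

def spd(k):
    """Random symmetric positive definite k x k matrix."""
    M = rng.standard_normal((k, k))
    return M @ M.T + k * np.eye(k)

def sqrtm_spd(M):
    """Positive definite square root via the eigendecomposition."""
    w, Q = np.linalg.eigh(M)
    return Q @ np.diag(np.sqrt(w)) @ Q.T

A, B = spd(p), spd(n)                    # A is p x p, B is n x n
Ah, Bh = sqrtm_spd(A), sqrtm_spd(B)

# On column-major vec(X), the map X -> A^{1/2} X B^{1/2} is B^{1/2} ⊗ A^{1/2}.
J = np.kron(Bh, Ah)                      # (pn) x (pn) Jacobian matrix
lhs = np.linalg.det(J)
rhs = np.linalg.det(A) ** (n / 2) * np.linalg.det(B) ** (p / 2)
print(lhs, rhs)                          # the two sides agree
```

This reflects the identity det(B^{1/2} ⊗ A^{1/2}) = det(A^{1/2})^n det(B^{1/2})^p.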

Standard Deviation and Diffusion Entropy Analysis
Scale invariance has been found to hold for complex systems, and the correct evaluation of the scaling exponents is of fundamental importance for assessing whether universality classes exist. Diffusion is typically quantified in terms of a relationship between the fluctuation of a variable x and time t. A widely used method of analysis of complexity rests on the assessment of the scaling exponent of the diffusion process generated by a time series. According to the prescription of Peng et al. [14], the numbers of a time series are interpreted as generating diffusion fluctuations, and one shifts the attention from the time series to the probability density function (pdf) p(x, t), where x denotes the variable collecting the fluctuations and t is the diffusion time. In this case, if the time series is stationary, the scaling property of the pdf of the diffusion process takes the form

p(x, t) = (1/t^δ) F(x/t^δ),   (5.1)

where δ is a scaling exponent. Diffusion may scale linearly with time, leading to ordinary diffusion, or it may scale nonlinearly with time, leading to anomalous diffusion. Anomalous diffusion processes can be classified as Gaussian or Lévy, depending on whether the central limit theorem (CLT) holds. The CLT entails ordinary statistical mechanics; that is, it entails a Gaussian form for F in (5.1), the diffusion process composing a random walk without temporal correlations (i.e., δ = 0.5). Due to the CLT, the probability density function p(x, t) describing the probabilities of x(t) has a finite second moment ⟨x²⟩; when the second moment diverges, x(t) no longer falls under the CLT and instead the generalized central limit theorem applies.
Failures of the CLT mean that, instead of ordinary statistical mechanics, non-extensive statistical mechanics may be utilized [12,13]. Scafetta and Grigolini [15] established that Diffusion Entropy Analysis (DEA), a method of statistical analysis based on the Shannon entropy (the limiting case α → 1 of eq. (1.1)) of the diffusion process, determines the correct scaling exponent δ even when the statistical properties, as well as the dynamic properties, are anomalous. The other methods usually adopted to detect scaling, for example Standard Deviation Analysis (SDA), are based on the numerical evaluation of the variance. Consequently, these methods detect a power index, denoted H by Mandelbrot [16] in honor of Hurst, which might depart from the scaling δ of eq. (5.1). These variance methods (cf. Fourier analysis and wavelet analysis; see [17,18]) produce correct results in the Gaussian case, where H = δ, but fail to detect the correct scaling of the pdf, for example, in the case of Lévy flight, where the variance diverges, or in the case of Lévy walk, where δ and H do not coincide, being related by δ = 1/(3 − 2H). The case H = δ = 0.5 is that of a completely uncorrelated random process. The case δ = 1 is that of a completely regular process undergoing ballistic motion. Figs. 1 to 4 clearly show that the development of the diffusion entropy over time for solar neutrinos meets neither the first nor the latter case. The Shannon entropy for the diffusion process at time t is defined by

S(t) = −∫ p(x, t) ln p(x, t) dx.   (5.2)

If the scaling condition of eq. (5.1) holds true, it is easy to prove that

S(t) = A + δ ln t, A = −∫ F(y) ln F(y) dy,   (5.3)

so that the scaling exponent δ appears as the slope of S(t) against ln t.
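As an illustration of the DEA prescription (a minimal sketch on a surrogate series of uncorrelated Gaussian noise, not the solar neutrino data; series length and bin count are arbitrary choices), the scaling exponent δ is recovered as the slope of S(t) against ln t:

```python
import numpy as np

rng = np.random.default_rng(42)
xi = rng.standard_normal(200_000)           # surrogate uncorrelated series
c = np.concatenate(([0.0], np.cumsum(xi)))  # partial sums for fast windowing

def diffusion_entropy(t, bins=80):
    """Histogram estimate of S(t) = -∫ p(x,t) ln p(x,t) dx, where x runs
    over all overlapping window sums of length t (the diffusion variable)."""
    x = c[t:] - c[:-t]
    hist, edges = np.histogram(x, bins=bins)
    prob = hist / hist.sum()
    h = edges[1] - edges[0]
    prob = prob[prob > 0]
    return float(-(prob * np.log(prob)).sum() + np.log(h))

ts = np.array([16, 32, 64, 128, 256, 512])
S = np.array([diffusion_entropy(t) for t in ts])
delta = float(np.polyfit(np.log(ts), S, 1)[0])  # slope of S(t) vs ln t
print("estimated delta:", delta)                # ~0.5 for uncorrelated noise
```

For this uncorrelated surrogate the fitted slope is close to δ = 0.5, the completely random case mentioned above; correlated or Lévy-type series would yield other values of δ.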