1. Introduction
The motivation for this work stems from the need for flexible and analytically tractable matrix-variate distributions capable of modeling scatter matrices that may deviate from classical Wishart behavior, a challenge encountered in fields such as neuroimaging, seismology, and climatology. This paper introduces a matrix-variate generalized gamma distribution that broadens the modeling capacity of existing families. Its novel contributions include the derivation of the full density function via Jacobians of matrix transformations and explicit representations of several associated statistical functions. The framework unifies important special cases, accommodates both the real and complex settings, and demonstrates that the generalized gamma model consistently attains higher modeling accuracy than competing distributions.
A submodel of the proposed distribution is the matrix-variate Wishart–Kotz distribution as defined, for instance, in the first part of Theorem 3 of [1], whose density function is specified in Equation (3). Its multivariate version, the so-called Kotz-type distribution, which was proposed in [2], is discussed in [3], where several applications arising in ecology, discriminant analysis, mathematical finance, repeated measurements, shape theory, and signal processing are pointed out. Ref. [4] made use of the log-likelihood of a Kotz-type model to estimate a certain intraclass correlation. The density function of the ratio of bivariate Kotz-type vectors is derived in [5], who applied his result to transformed stock prices.
Ref. [6] expressed a matrix-variate Kotz-type distribution as an elliptically contoured distribution and derived some related statistical properties. As mentioned in Section 3.2.2 of [7], the Wishart–Kotz model is suitable for fitting high-resolution imaging data, which generally exhibit heavy tails. Ref. [8] discusses the estimation of the covariance parameter of the Kotz–Wishart distribution. The latter is, of course, a generalization of the well-known Wishart distribution, which continues to elicit much interest. For instance, the recent paper [9] introduces a novel method for constructing random matrices that follow a Wishart distribution, expressing dependent elements as algebraic functions of independent random variables having specified densities.
As complex variables are increasingly utilized in engineering and the physical sciences, many of the results will be derived in both the real and complex domains. For instance, a complex matrix-variate Kotz-type distribution is utilized in [10] as a classifier for multilook polarimetric synthetic aperture radar data.
This paper is organized as follows.
Section 2 makes use of Jacobians of matrix transformations to derive the density function of a matrix-variate generalized gamma distribution from a symmetric Kotz-type model, yielding an explicit expression for its normalizing constant. This construction introduces a shape parameter whose role is examined in detail. Both the real and complex cases are considered.
Section 3 develops several distributional properties, including the characteristic function of the matrix-variate random variable, its first two moments, and the exact density functions of the trace and the determinant. These results rely primarily on the inverse Mellin transform and on key properties of elliptical Wishart distributions. While
Section 4 outlines a number of applications,
Section 5 presents a simulation study as well as data modeling numerical examples.
Section 6 offers some concluding remarks. The reader is referred to
Appendix A for the notation being used.
2. The Matrix-Variate Generalized Gamma Density Function
The derivation of the matrix-variate generalized gamma distribution is most efficiently carried out through a sequence of structured matrix transformations, each accompanied by its corresponding Jacobian. Beginning with the Kotz-type distribution defined on a
rectangular matrix
X, we proceed to obtain the distribution of
, then introduce an additional shape parameter to obtain
, and finally apply a scaling transformation to arrive at
. This stepwise approach is conceptually transparent and economical: each density follows directly from the previous one by a simple transformation of variables and the application of Jacobian results, which are stated as lemmas. The real and complex cases proceed in parallel, and the resulting densities (3)–(8) naturally reflect this progression. The transformation chain can be represented as follows:
This sequence applies equally in the complex setting, with tildes marking the analogous complex-valued random matrices. Letting
, the density function of
Y is also determined, along with its complex counterpart.
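The structural steps of this chain can be mimicked numerically. The step introducing the additional shape parameter modifies the density rather than the sample path, so the sketch below, which assumes the forms U = X'X and W = A^{1/2} U A^{1/2} for the first and last transformations, illustrates only the deterministic parts of the chain:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 3, 10

# Step 1: a q x p real matrix X with independent standard normal entries.
X = rng.standard_normal((q, p))

# Step 2: U = X'X is symmetric and (a.s.) positive definite when q >= p.
U = X.T @ X
assert np.allclose(U, U.T)
assert np.all(np.linalg.eigvalsh(U) > 0)

# Step 3: scaling by a positive definite matrix A: W = A^{1/2} U A^{1/2}.
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.5]])
# Symmetric square root of A via its eigendecomposition.
vals, vecs = np.linalg.eigh(A)
A_half = vecs @ np.diag(np.sqrt(vals)) @ vecs.T
W = A_half @ U @ A_half
assert np.all(np.linalg.eigvalsh(W) > 0)
```

Positive definiteness is preserved at each step, which is why the resulting densities live on the cone of positive definite matrices.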
Let
X be a
real matrix (
) of rank
p. The matrix-variate Kotz-type distribution with parameters
,
, and
has density
as specified for instance in [11].
The corresponding complex density, which can be determined from [12], is
whenever
and
with the
Hermitian matrix
being of rank
p.
The density functions of
will be determined by applying the next two results, which respectively hold in the real and complex domains. All the lemmas stated in this section are derived in [13]. The subsequent results rely on the matrix-variate transformation-of-variables technique, as detailed for instance in [14] for both the real and complex cases.
Lemma 1. Let X be a , , full-rank real matrix whose elements are all distinct, and let the positive definite symmetric matrix . Then, where with , is referred to as the real matrix-variate gamma function, with denoting the real part of a.

Lemma 2. Let the complex matrix comprising distinct elements with be of rank p. Let the Hermitian positive definite matrix , denoting the conjugate transpose of . Then, where with denoting the complex matrix-variate gamma function.

The
real matrix random variable
has a (type I) Wishart–Kotz (W-K) distribution, as defined in Díaz-García and Gutiérrez-Jáimez (2010) [1]. On applying the transformation of variables technique, that is, expressing the density function of
X as specified in (1) in terms of
U and then multiplying it by the Jacobian provided in Lemma 1, one obtains the following density function for
U:
for
and
Parenthetically, the derivation of this density function proceeds in the same way as in the univariate setting: consider, for instance, a density where (analogously, the density (1) can be expressed as ) and the transformation ( in the matrix-variate case); denoting by J the derivative (Jacobian) of the inverse transformation, in the univariate case, one obtains the density of u as and, in the matrix-variate case, which is precisely the density given in (3) on noting that, in this case, wherein is expressed as U and that, in view of Lemma 1, .
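A concrete univariate instance of this change-of-variables argument (an illustration, not the paper's notation): if x follows a standard normal law and u = x^2, the resulting density of u is chi-square with one degree of freedom, which a Kolmogorov-Smirnov comparison confirms on simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Univariate analogue of the matrix transformation U = X'X:
# if x ~ N(0, 1) and u = x^2, the change-of-variables formula
# (density of x times the derivative of the inverse map) yields
# the chi-square(1) density for u.
x = rng.standard_normal(5000)
u = x ** 2
stat, pvalue = stats.kstest(u, stats.chi2(df=1).cdf)
assert stat < 0.05  # the transformed sample is consistent with chi-square(1)
```

The matrix-variate derivation follows the same pattern, with Lemma 1 supplying the Jacobian in place of the scalar derivative.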
Similarly, the complex counterpart of this density function is that associated with the
complex-valued matrix random variable
. It is given by
whenever
and
It should be noted that the determinants appearing in the real and complex Wishart–Kotz density functions (3) and (4) arise solely from the Jacobians of the transformations.
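The normalizing constants of these densities involve the real matrix-variate gamma function introduced in Lemma 1. As a quick numerical sanity check (a sketch using the standard product formula, which is assumed to coincide with the paper's definition), a direct implementation can be compared against SciPy's `multigammaln`:

```python
import numpy as np
from scipy.special import gammaln, multigammaln

# Real matrix-variate gamma function (standard form, assumed here):
# Gamma_p(a) = pi^{p(p-1)/4} * prod_{j=1}^{p} Gamma(a - (j-1)/2),
# valid for Re(a) > (p-1)/2.  We work on the log scale and compare
# against SciPy's implementation of the same quantity.
def log_gamma_p(a, p):
    return (p * (p - 1) / 4) * np.log(np.pi) + sum(
        gammaln(a - (j - 1) / 2) for j in range(1, p + 1))

for p in (1, 2, 3, 5):
    a = (p - 1) / 2 + 1.7   # any a beyond the convergence threshold
    assert np.isclose(log_gamma_p(a, p), multigammaln(a, p))
```

For p = 1 the function reduces to the ordinary gamma function, matching the univariate setting.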
A generalized gamma family is then obtained by replacing the shape parameters
in the density functions of
U and
, respectively specified in (3) and (4), yielding
for
and
and
whenever
and
The density function specified in Equation (5) shall henceforth be referred to as that of a (real unscaled) matrix-variate generalized gamma () distribution. The introduction of the parameter can be justified as follows. In the Wishart–Kotz densities (3) and (4), the exponent of the determinant term is fixed by the geometry of the transformation . By replacing we free this exponent and allow it to vary continuously through . This produces a generalized gamma-type determinant factor, which controls the behavior of the density near the boundary of the cone of positive definite matrices. Thus, acts as a shape parameter governing how much mass the distribution places on matrices with small or large determinants. It increases model flexibility by allowing heavier or lighter tails in the space of positive definite matrices. Furthermore, it should be noted that, in the univariate setting, the generalized gamma family is obtained by introducing an additional shape parameter that modifies the power of the argument. The substitutions above play exactly the same role in the matrix setting: adjusts the determinant term in the same way the univariate shape parameter adjusts the power of the scalar variable. Consequently, is the key ingredient that lifts the Wishart–Kotz distribution to a matrix-variate generalized gamma (MGG) distribution, and the former is readily recovered from the latter by setting .
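The univariate analogy invoked above can be made concrete with SciPy's generalized gamma family; its parameterization f(x) proportional to x^{ca-1} exp(-x^c) is SciPy's own and is used here purely for illustration:

```python
import numpy as np
from scipy import stats

# In the univariate setting, the generalized gamma density reduces to
# the ordinary gamma density when the extra shape parameter c equals 1,
# mirroring how fixing the matrix shape parameter at its baseline value
# recovers the Wishart-Kotz model from the MGG distribution.
x = np.linspace(0.1, 10, 50)
a = 2.5
assert np.allclose(stats.gengamma(a, c=1).pdf(x), stats.gamma(a).pdf(x))

# Varying c redistributes mass near 0 and in the tail, just as the
# matrix shape parameter governs behavior near the boundary of the cone.
small = stats.gengamma(a, c=0.5).pdf(x)
large = stats.gengamma(a, c=2.0).pdf(x)
assert not np.allclose(small, large)
```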
Next, on letting
and applying Lemmas 1 and 2, one obtains the following density functions of
(the set of
real matrices) and
(the set of
complex matrices) which are utilized in the next section:
(derived from (5) where
is replaced with
and multiplying the result by
, which follows from the transformation of variables technique) and, similarly,
Now, let the real-valued matrix-variate random variable , where the density function of is given in (5) and is a scaling matrix, and let the complex-valued matrix-variate random variable , where the density function of is given in (6).
We appeal to the following lemmas to secure the density functions of the matrix-variate generalized gamma distributions of W and in the real and complex domains.
Lemma 3. When the matrix U is symmetric, the Jacobian of the transformation where A is a symmetric matrix, is given by so that the Jacobian of the inverse transformation is such that .
Lemma 4. In the complex domain, on letting , the Jacobian of the inverse transformation is such that
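Lemma 3 can be checked numerically in the real case. Representing the linear map V -> A^{1/2} V A^{1/2} in a basis of the p(p+1)/2-dimensional space of symmetric matrices, its determinant should equal |A|^{(p+1)/2} (the standard form of this Jacobian, assumed to be the one stated in the lemma):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 3

# Build a positive definite scale matrix A and its symmetric square root.
M = rng.standard_normal((p, p))
A = M @ M.T + p * np.eye(p)
vals, vecs = np.linalg.eigh(A)
B = vecs @ np.diag(np.sqrt(vals)) @ vecs.T   # B = A^{1/2}

# Basis E_ij (i <= j) of the space of symmetric p x p matrices; the
# coordinates of a symmetric S in this basis are its upper-triangle entries.
idx = [(i, j) for i in range(p) for j in range(i, p)]
def half_vec(S):
    return np.array([S[i, j] for (i, j) in idx])

# Matrix of the linear map S -> B S B in that basis; its determinant is
# the Jacobian of the transformation.
cols = []
for (i, j) in idx:
    E = np.zeros((p, p))
    E[i, j] = E[j, i] = 1.0
    cols.append(half_vec(B @ E @ B))
J = np.abs(np.linalg.det(np.column_stack(cols)))
assert np.isclose(J, np.linalg.det(A) ** ((p + 1) / 2))
```

The exponent (p+1)/2, rather than p, reflects the restriction of the map to symmetric matrices.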
On letting
with
or equivalently
since
one obtains the real scaled matrix-variate generalized gamma density
for
and
(multiplying the density (5) where
by the Jacobian given in Lemma 3, that is,
and simplifying) and the complex counterpart, that is, that associated with the
complex-valued positive definite matrix random variable
is given by
whenever
and
The scaled matrix-variate generalized gamma model whose density is given in Equation (9) can be interpreted as a generalized Wishart–Kotz distribution and is suitable for modeling random covariance matrices in settings where the classical Wishart assumptions are too restrictive. Specific applications are pointed out in
Section 4.
The derivations above demonstrate that the scaled matrix-variate generalized gamma distribution arises naturally from a sequence of simple and well-structured transformations. The resulting family of distributions is therefore both mathematically coherent and practically flexible, providing a unified framework that encompasses the Wishart, Wishart–Kotz, matrix-variate gamma, and their scaled counterparts as special cases.
4. Illustrative Applications
The real matrix-variate generalized gamma random variable—whose density function is specified in Equation (9) in the scaled case and Equation (5) in the unscaled case—defines a probability distribution on the cone of symmetric positive definite matrices. This distribution is naturally suited to the modeling of covariance (or scatter) matrices whose behavior may deviate from that implied by the classical Wishart model.
We consider two types of data sets for which the distribution provides an appropriate modeling framework, and we also outline additional contexts in which its application is likely to be beneficial.
4.1. Sample Covariance Matrices of Financial Returns
Let be p-dimensional intraday or daily log-return vectors for a set of p assets (e.g., sector indices or large-cap stocks), observed over many non-overlapping windows of length q (e.g., rolling q-day windows).
For each window
w, the sample covariance matrix
is computed. This yields a collection
of observed
positive definite matrices. Empirically, the distribution of these covariance matrices is often heavier-tailed than the Gaussian Wishart model suggests. The determinant and trace may display extra variability, and the distribution of
can exhibit heavier or lighter tails depending on market conditions.
The model (with appropriate choices of ) allows one to flexibly capture such features: and modify the radial (trace) tail behavior, while adjusts the determinant component, giving a more realistic fit to empirical distributions of realized covariance matrices than the standard Wishart or the Wishart–Kotz () distribution—whose density function is given in Equation (3)—due to the presence of the additional shape parameter .
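A minimal sketch of these diagnostics, with simulated returns standing in for asset data: compute each window's sample covariance matrix and record its trace and log-determinant, the two summaries whose empirical dispersion motivates the extra shape parameters:

```python
import numpy as np

rng = np.random.default_rng(4)
p, q, n_windows = 3, 50, 25

# Simulated return vectors stand in for the p-dimensional log-returns;
# each non-overlapping window of q observations yields one sample
# covariance matrix, whose trace and log-determinant are recorded.
returns = rng.standard_normal((n_windows * q, p))
traces, logdets, mats = [], [], []
for w in range(n_windows):
    R = returns[w * q:(w + 1) * q]
    S = np.cov(R, rowvar=False)      # q x p block -> p x p covariance
    mats.append(S)
    traces.append(np.trace(S))
    sign, ld = np.linalg.slogdet(S)
    assert sign > 0                  # each S is positive definite
    logdets.append(ld)

assert len(mats) == n_windows
```

Heavier-than-Wishart dispersion in the resulting trace and log-determinant samples is the kind of departure the model is designed to absorb.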
4.2. Longitudinal Multivariate Biomedical Measurements
A second context where the model can be useful is longitudinal multivariate biomedical measurements, such as subject-level covariance matrices of repeated blood pressure, heart rate, and respiration measurements in an intensive-care setting.
Suppose that for each patient
we observe
q repeated three-dimensional measurements
within a fixed monitoring window.
For patient
i, we form the sample covariance matrix
The collection
then consists of
symmetric positive definite matrices.
In such physiological data, there is typically substantial heterogeneity across patients, which results in heavy tails in the distribution of , and in skewed or overdispersed behavior in compared with a classical Wishart model. The distribution naturally accommodates non-Wishart determinant behavior through , allowing heavier or lighter tails in than under standard Wishart or Wishart–Kotz distributions, as well as flexible radial (trace) tails via : for instance, yields heavier-than-exponential tails in , as observed in the case of unstable measurements.
Accordingly, the proposed model is well positioned to yield a superior fit relative to submodels by capturing the determinant and trace departures arising from heterogeneous covariance matrices. In both examples—high-frequency financial covariance matrices and patient-level physiological covariance matrices—the empirical behavior of the random matrices typically departs substantially from the assumptions implicit in the Wishart–Kotz family.
4.3. Additional Applications
Beyond the examples previously discussed, there are several contexts in which modeling the distribution of covariance matrices is useful. These include multivariate Bayesian analysis, diffusion tensor imaging in neuroimaging, high-dimensional covariance estimation, multiple-input multiple-output channel modeling in wireless communication systems, random effects modeling, studies involving multivariate environmental or climatological measurements across spatial locations, and multivariate quality control and reliability studies.
In each of these settings, the covariance matrix itself is a random quantity whose distribution carries meaningful structural information, and generalized gamma-type matrix-variate models constitute a flexible and analytically tractable choice.
5. Simulation Results and Empirical Data Modeling
The simulation study conducted here reveals that the distribution provides improved goodness-of-fit relative to non-nested competitors also defined on the space of symmetric positive definite matrices. This advantage is further supported by its performance on two empirical data sets, where it again emerges as the best-fitting model.
We begin by recalling a standard result—see, for instance, [18]—that clarifies why submodels are excluded from our model comparisons:
Let and represent two statistical models, with parameter spaces and , where , and assume that is a submodel of in the sense that and that both models yield identical likelihood values for all . Letting and denote the corresponding log-likelihoods, and and their respective maximum likelihood estimators, the result states that the maximized log-likelihood under the larger model cannot be smaller than that under the submodel, i.e., . Now, noting that the Wishart–Kotz (
) model is a submodel of the
model where
, that the Wishart (
) distribution is a submodel of the Wishart–Kotz model where
and
, and that the matrix-variate gamma distribution as defined in [19] is a particular case of the
distribution where
and
, and assuming the regularity conditions required for valid maximum-likelihood estimation hold, for any given sample, the log-likelihood functions will, for example, obey the following inequalities:
where
and
.
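The nesting inequality recalled above can be illustrated in the univariate analogue, where the gamma family (extra shape parameter fixed at 1) sits inside SciPy's generalized gamma family:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# The gamma family is the c = 1 slice of the generalized gamma family,
# so the maximized log-likelihood of the larger family can never fall
# below that of the submodel.
data = stats.gamma(a=2.0).rvs(size=400, random_state=rng)

a_hat, _, scale_hat = stats.gamma.fit(data, floc=0)
ll_sub = np.sum(stats.gamma.logpdf(data, a_hat, scale=scale_hat))

# The submodel optimum is attainable inside the larger family at c = 1.
ll_embed = np.sum(stats.gengamma.logpdf(data, a_hat, 1.0, scale=scale_hat))
assert np.isclose(ll_embed, ll_sub)

# Taking the better of the optimizer's solution and the embedded submodel
# optimum guarantees the inequality holds numerically as well.
params = stats.gengamma.fit(data, floc=0)
ll_full = max(np.sum(stats.gengamma.logpdf(data, *params)), ll_embed)
assert ll_full >= ll_sub - 1e-9
```

This is precisely why nested competitors are uninformative for model comparison unless a complexity penalty such as BIC is applied.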
5.1. Simulation Experiment
A simulation study was conducted. Several candidate distributions—including the Wishart, Wishart–Kotz, and matrix-variate gamma distributions—were excluded because they arise as submodels of the matrix-variate generalized gamma () distribution and consequently cannot provide an independent basis for assessing model accuracy or comparative superiority. This leaves only a small number of suitable distributions defined on the cone of positive definite matrices, notably the inverse Wishart (), the matrix-variate Beta type II (), and the matrix-variate generalized inverse Gaussian () distributions. We note that the matrix-variate F distribution is merely a reparameterization of the Beta type II distribution, while the Beta type I distribution is inappropriate in this context because its density is defined only for positive definite matrices W satisfying .
Below we list the density functions of the models to be compared with the distribution.
Inverse Wishart distribution:
For
, the set of
positive definite matrices, and
, the density function is
Matrix-variate Generalized Inverse Gaussian distribution:
For
and
where
and
denotes the matrix-argument modified Bessel function of the second kind.
Matrix-variate Beta type II distribution:
For
and parameters
,
The adequacy of each of the four models can be assessed by evaluating the maximized log-likelihood, , the AIC statistic, or the BIC statistic, , where denotes the vector of MLEs for a given model, and k and N denote the number of model parameters and the sample size, respectively. By accounting for model complexity, the BIC statistic enables comparisons that remain meaningful across competing models. Accordingly, we report only the per-observation BIC values, that is, the BIC statistics divided by the sample size N, since the remaining criteria can readily be recovered from them, if required.
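Under the standard definitions (assumed here) BIC = k ln N - 2l and AIC = 2k - 2l, where l is the maximized log-likelihood, the per-observation BIC reported in the tables is simply BIC divided by N:

```python
import numpy as np

# Per-observation BIC under the standard definition (an assumption, as
# the paper's exact formula is not reproduced here): with maximized
# log-likelihood ll, k estimated parameters and sample size N,
# BIC = k * ln(N) - 2 * ll, reported divided by N.
def per_obs_bic(ll, k, N):
    return (k * np.log(N) - 2.0 * ll) / N

# AIC is recovered by replacing the k * ln(N) penalty with 2 * k.
def aic(ll, k):
    return 2.0 * k - 2.0 * ll

# A smaller (per-observation) BIC indicates a better fit: here the two
# extra parameters of model b do not pay for their likelihood gain.
bic_a = per_obs_bic(ll=-120.0, k=3, N=50)
bic_b = per_obs_bic(ll=-118.0, k=5, N=50)
assert bic_a < bic_b
```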
A collection of
N matrices
of dimension
was initially generated, with all entries drawn independently from the standard normal distribution. (All computations were carried out using a common random seed.) For each
, the corresponding
matrix
was formed and utilized in its standardized form to compute the maximum likelihood estimates of the parameters under the
,
,
, and
models. The log-likelihood function of each of the four models was evaluated at the corresponding MLEs, and the per-observation BIC values were computed for several combinations of
p and
q, and
and 1000. These values are reported in
Table 1.
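For illustration, the log-likelihood of the inverse Wishart competitor can be evaluated with SciPy; its (df, scale) parameterization is SciPy's own and may differ from the density given above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

# Draw inverse Wishart matrices and evaluate the log-density, as one
# would when computing a candidate model's log-likelihood; the Monte
# Carlo mean is checked against the known value Psi / (df - p - 1),
# which is valid for df > p + 1.
p, df = 3, 7
Psi = np.eye(p)
iw = stats.invwishart(df=df, scale=Psi)

W = iw.rvs(size=4000, random_state=rng)
ll = float(np.sum([iw.logpdf(W[i]) for i in range(5)]))
assert np.isfinite(ll)
assert np.allclose(W.mean(axis=0), Psi / (df - p - 1), atol=0.05)
```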
For every (p, q, N) configuration considered, the model attains a smaller BIC value than that of each of the three competing models, indicating that it provides the best overall fit among the models examined.
The simulations were then repeated 1000 times for a moderate sample size of . Only the two most relevant competing models—the and distributions—were retained for comparison, since the remaining two candidates exhibit mean BIC values so distant from those of the proposed model that overlap is virtually impossible.
For the four combinations of p and q considered, the proportion of simulation instances where the mean BIC values evaluated from the model were smaller than those secured for the model was determined. The proportions so obtained are as follows:
Taken together, these outcomes underscore the modeling effectiveness of the distribution and leave little room for doubt regarding its comparative performance.
5.2. Modeling Empirical Covariance Matrices
Two data sets are analyzed in this study. The first consists of a sequence of 25 empirical covariance matrices constructed from 100-day rolling-window log-returns of the S&P 500, FTSE 100, and Nikkei 225 equity indices, using daily observations available from January 2018 onward. The second data set is the classical Iris data, from which a collection of 20 empirical covariance matrices was obtained.
5.2.1. Rolling Covariance Matrices from Global Equity Returns
The data set consists of a sequence of empirical
covariance matrices constructed from real-world financial data. The underlying observations are daily log-returns of three major equity indices: the S&P 500 (US), FTSE 100 (UK), and Nikkei 225 (Japan). Adjusted daily closing prices for all indices were obtained from the permanent public archive of
Yahoo Finance:
https://finance.yahoo.com. For each index
k, the daily log-return is defined as
where
denotes the adjusted closing price on day
t. Using these returns, a sequence of empirical covariance matrices is estimated via a rolling window of
For window index
m, the empirical covariance matrix is
where
The raw price data start on 1 January 2018 and comprise roughly 500 trading days. From this data set, rolling windows are constructed, indexed by . Window m uses days , so all 25 covariance matrices describe the same joint return process over sliding 100-day windows during early 2018. The resulting data set is the collection of symmetric positive definite matrices. The standardized covariance matrices, that is, , , are utilized in the calculations.
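The construction can be sketched as follows, with simulated prices standing in for the index data; the stride between successive windows and the trace standardization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)

# Simulated prices stand in for the three index series; window length
# follows the text, while the stride and the standardization by the
# trace are assumptions made for this sketch.
T, p, window, stride = 500, 3, 100, 16
prices = np.exp(np.cumsum(0.01 * rng.standard_normal((T + 1, p)), axis=0))

# Daily log-returns r_t = log(P_t / P_{t-1}).
logret = np.diff(np.log(prices), axis=0)

mats = []
m = 0
while m * stride + window <= T:
    R = logret[m * stride:m * stride + window]
    S = np.cov(R, rowvar=False)
    mats.append(S / np.trace(S))   # one possible standardization (assumed)
    m += 1

assert all(np.all(np.linalg.eigvalsh(S) > 0) for S in mats)
```

Each window produces a symmetric positive definite matrix, so the collection lies in the support of the competing models.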
The per-observation BIC values are reported in
Table 2 for the four models under consideration. Once again, the
distribution provides the best fit.
5.2.2. Covariance Matrices Associated with the Iris Data Set
The covariance matrices analyzed in this study are derived from the classical Iris data set, which is hosted by the UCI Machine Learning Repository. This data set is freely available from
https://archive.ics.uci.edu/ml/datasets/iris (accessed on 20 January 2026).
For the purposes of this study, the first 140 of the 150 available observations on three morphological measurements—sepal width (cm), petal length (cm), and petal width (cm)—were retained. These 140 observation vectors were partitioned into 20 consecutive blocks of dimension . For each block, a sample covariance matrix was computed using the usual unbiased estimator. The resulting collection of standardized empirical covariance matrices constitutes the input set for the matrix-variate modeling.
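A sketch of this block construction, with random data standing in for the 140 retained Iris measurement vectors:

```python
import numpy as np

rng = np.random.default_rng(7)

# Random data stand in for the 140 retained three-dimensional Iris
# vectors: partition the 140 x 3 array into 20 consecutive blocks of
# 7 rows and compute the usual unbiased sample covariance of each.
data = rng.standard_normal((140, 3))
blocks = data.reshape(20, 7, 3)

covs = [np.cov(block, rowvar=False) for block in blocks]  # divisor n - 1

assert len(covs) == 20
assert all(C.shape == (3, 3) for C in covs)
# With 7 observations in R^3, each covariance matrix is a.s. nonsingular.
assert all(np.all(np.linalg.eigvalsh(C) > 0) for C in covs)
```

Since each block contains more observations than variables, every block covariance matrix is positive definite and can be fed directly to the matrix-variate likelihoods.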
The competing models considered are
,
,
, and
.
Table 3 indicates that, yet again, the
model yields the best fit.
6. Conclusions
The derivation of the matrix-variate generalized gamma (
) density and its normalizing constant, in both real and complex settings, relies on appropriate Jacobians of matrix transformations and introduces an additional parameter that enhances the model’s flexibility. Equations (3)–(10) arise naturally within the transformation sequence
corresponding to the Kotz-type model on
X, the Wishart–Kotz distribution on
U, the generalized gamma distribution on
, and its scaled version on
W, respectively.
may therefore be regarded as a generalized Wishart–Kotz matrix random variable.
The shape parameter modifies the exponent of the determinant term and thereby governs the behavior of the density near the boundary of the cone of positive definite matrices. Larger values of shift probability mass away from nearly singular matrices, whereas smaller values allow heavier tails. Setting recovers the Wishart–Kotz distribution, while yields a richer family that parallels the extension from the gamma distribution to the univariate generalized gamma. This additional flexibility is particularly valuable for modeling high-dimensional or non-Gaussian matrix-valued data, where classical Wishart assumptions may prove inadequate. Importantly, the introduction of preserves analytical tractability, as it enters the transformation chain through a simple modification of the determinant exponent.
We derived explicit forms for a range of statistical functions of the distribution and its submodels, notably the characteristic functions and the density functions of moments of traces and determinants. Contexts in which the model is especially useful—for example, in applications involving random covariance matrices that deviate from Wishart-type behavior, such as those arising in high-frequency financial covariance estimation or longitudinal multivariate biomedical measurements—are pointed out.
A simulation study demonstrates that the distribution provides a more accurate fit than other distributions on symmetric matrices that are not its submodels. It also exhibits superior performance when applied to empirical data.
All computations were performed using the symbolic software package Mathematica (Version 14.3); the corresponding code is available from the second author upon request.
The matrix-variate generalized gamma distribution developed in this paper enriches the toolkit for modeling the randomness inherent in multivariate phenomena, and further applications beyond those already identified are likely to emerge.