Monte Carlo Simulation of a Modified Chi Distribution with Unequal Variances in the Generating Gaussians. A Discrete Methodology to Study Collective Response Times

The Chi distribution is a continuous probability distribution of a random variable obtained from the positive square root of the sum of k squared variables, each coming from a standard Normal distribution (mean = 0 and variance = 1). The variable k indicates the degrees of freedom. The usual expression for the Chi distribution can be generalised to include a parameter which is the variance (which can take any value) of the generating Gaussians. For instance, for k = 3, we have the case of the Maxwell-Boltzmann (MB) distribution of the particle velocities in the Ideal Gas model of Physics. In this work, we analyse the case of unequal variances in the generating Gaussians whose distribution we will still represent approximately in terms of a Chi distribution. We perform a Monte Carlo simulation to generate a random variable which is obtained from the positive square root of the sum of k squared variables, but this time coming from non-standard Normal distributions, where the variances can take any positive value. Then, we determine the boundaries of what to expect when we start from a set of unequal variances in the generating Gaussians. In the second part of the article, we present a discrete model to calculate the parameter of the Chi distribution in an approximate way for this case (unequal variances). We also comment on the application of this simple discrete model to calculate the parameter of the MB distribution (Chi of k = 3) when it is used to represent the reaction times to visual stimuli of a collective of individuals in the framework of a Physics inspired model we have published in a previous work.


Introduction
The Chi distribution is a widely known continuous probability distribution in Statistics. It represents the distribution of a random variable which is calculated as the Euclidean norm over k variables each following a standard Normal distribution [1], where k specifies the degrees of freedom. This is the case of two familiar examples in Physics, namely, the Rayleigh distribution (Chi of k = 2 degrees of freedom) and the Maxwell-Boltzmann distribution (MB) (Chi of k = 3 degrees of freedom). The first appears when the wind velocity is analysed in two dimensions each normally distributed with equal variance, and zero mean. The second represents the distribution of the particle velocities in the Ideal Gas model of Physics, an ideal system of independent particles in thermal equilibrium with a thermostat [2].
In this work, we first generalise the expressions of the Chi distribution to the case of equal variances (different from 1) in the generating Gaussians. We also analyse the case of unequal variances in the generating Gaussians, whose distribution can still be approximated by a Chi distribution. For this purpose, we carry out Monte Carlo simulations of the generating Gaussians with unequal variances. Then, by means of non-linear fittings, we obtain the parameter of the Chi distribution for a given number of degrees of freedom, k. The simulations are repeated starting from different sets of unequal variances in the generating Gaussians.
In the second part of the article, we present a discrete model to calculate the parameter of the Chi distribution in an approximate way for the case of unequal variances. We also comment on the application of this simple discrete model to calculate the parameter of the MB distribution when it is used to represent the reaction times to visual stimuli of a collective of individuals. Reaction time (RT) is a key variable in many psychological experiments as, for instance, it correlates with cognitive disorders [3]. RTs are commonly represented in Experimental Psychology by means of the ex-Gaussian probability distribution [3,4], which provides three interpretable parameters, denoted as μ, σ, and τ. These parameters represent the mean and standard deviation (μ, σ) of the Gaussian function, and the decay constant (τ) of the exponential function, which are convolved to give rise to the ex-Gaussian function. RT distributions are asymmetric functions [4,5], which cannot be fully described by the mean and standard deviation [6]. The ex-Gaussian parameters obtained from fitting reaction time distributions to visual stimuli are interpretable in relation to cognitive disorders, as correlations between them have been reported [7][8][9][10][11][12][13]. This is the case of Attention Deficit and Hyperactivity Disorder (ADHD), the most commonly diagnosed cognitive disorder in school-aged children.
In our previous work [14,15], we integrated the response times of a group of individuals of similar age in a Maxwell-Boltzmann (MB) distribution in order to model the behaviour of a collective of individuals. Relevant correlation has been found between the RTs series to visual stimuli of individuals who perform experiments in a short time scale (hundreds of milliseconds) when cultural elements are not expected to have influence and without exchanging information. Similarly, relevant frequencies are found when the response times along the sequence of stimuli are analysed by Fast Fourier Transform [16,17]. These facts support the existence of a collective behaviour rather than individuals acting separately. The MB is a very simple distribution depending on one parameter only, which is proportional to the variance. One major advantage of having a probability distribution of a collective is that it allows for a rigorous and more natural way of classifying the individuals in relation to the reference group they belong to and without need for an external reference. With our discrete model, the MB distribution parameter (case of a Chi distribution of k = 3 degrees of freedom and unequal variances in the generating Gaussians) can be calculated in a simple and practical way. The results of this work are particularly relevant for the interpretation of reaction times in different contexts.
The outline of the article is as follows. In Section 2, the general aspects of the Chi distribution and its generalization for variances taking any positive value in the generating Gaussians are discussed. The latter includes the case of the MB distribution (Chi of k = 3) of the Ideal Gas particles in Physics. In Section 3, we carry out Monte Carlo simulations of an approximate variant of the Chi distribution, namely, when the generating Gaussians have unequal variances. In Section 4, a discrete model is proposed to calculate the parameter, B, of the MB distribution when used to represent response time data of a collective of individuals.

The χ Distribution and Its Generalization for Variances Different from 1
The Chi distribution (χ) is a continuous probability distribution of a random variable obtained as:

$$\chi = \sqrt{\sum_{i=1}^{k} X_i^2},$$

where $X_i$ are random variables each following a standard Normal distribution (mean = 0 and variance = 1). The resulting probability density is expressed as:

$$f(x; k) = \frac{x^{k-1}\, e^{-x^2/2}}{2^{k/2-1}\,\Gamma(k/2)}, \qquad x \ge 0.$$

If the variances of the generating Gaussians can take any positive value, $f(x; k)$ can be generalised to a density $f(x; k, B)$ [6], where k is the number of degrees of freedom of the distribution, and B a parameter related to its variance. The corresponding cumulative distribution function (CDF) is expressed in terms of Γ(·,·), the upper incomplete gamma function, and Γ(·), the gamma function.
The χ distribution as written in Equation (1) describes a multidimensional ideal gas of particles (where the k variances of the Gaussian components are all equal).
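The construction above can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the function name `sample_chi`, the sample size, the seed and the value of the common standard deviation are illustrative choices, not taken from the article. It draws k zero-mean Gaussians with a common standard deviation, takes the Euclidean norm, and compares the result with SciPy's Chi distribution, whose `scale` argument plays the role of the common standard deviation.

```python
import numpy as np
from scipy import stats


def sample_chi(k, sigma=1.0, n=200_000, seed=0):
    """Euclidean norm of k independent N(0, sigma^2) variables, n times."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma, size=(n, k))
    return np.sqrt((x ** 2).sum(axis=1))


k, sigma = 3, 2.5            # illustrative values only
samples = sample_chi(k, sigma)

# Reference: Chi distribution with k degrees of freedom and scale sigma
ref = stats.chi(df=k, scale=sigma)
ks = stats.kstest(samples, ref.cdf)

print("empirical mean:", samples.mean())
print("reference mean:", ref.mean())
print("KS statistic  :", ks.statistic)
```

With equal variances the empirical and reference means should agree up to sampling noise, and the Kolmogorov-Smirnov statistic should be small.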

Case of the Maxwell-Boltzmann Distribution (Chi of k = 3)
In Physics, the Maxwell-Boltzmann (MB) distribution represents the probability density of the moduli of the velocities ($|\vec{v}|$) of the independent particles forming the ideal gas. This situation arises when the components of the velocities follow Gaussian distributions with the same variance and centred around zero. The MB distribution is just a Chi distribution of k = 3, where the variance in the velocity distributions is proportional to the temperature. In Physics, Equation (3) would represent the case of a multidimensional Ideal Gas.
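As a quick numerical illustration of the statement that the MB distribution is a Chi distribution of k = 3, the sketch below compares SciPy's `maxwell` and `chi` densities with the closed form written in terms of the common standard deviation a of the three Gaussian components. The value of `a` and the grid are arbitrary, and the parametrisation in terms of a (rather than the parameter B used in this article) is an assumption made only for this example.

```python
import numpy as np
from scipy import stats

a = 1.7                                # hypothetical common standard deviation
x = np.linspace(0.01, 10.0, 500)       # evaluation grid (arbitrary)

mb = stats.maxwell(scale=a).pdf(x)     # Maxwell-Boltzmann density
chi3 = stats.chi(df=3, scale=a).pdf(x)  # Chi density with k = 3

# Closed form of the same density in terms of a
explicit = np.sqrt(2.0 / np.pi) * x ** 2 * np.exp(-x ** 2 / (2 * a ** 2)) / a ** 3

max_gap = np.max(np.abs(mb - chi3))
print("largest difference between the two densities:", max_gap)
```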

Monte Carlo Simulations of the Case of Unequal Variances in the Generating Gaussians
In what follows, we perform Monte Carlo simulations to generate a random variable which is obtained as the positive square root of the sum of k squared variables, each coming from non-standard Normal distributions, where the variances can take any positive value.
First, we simulate the case of 3 generating Gaussians with zero mean and standard deviations σ1, σ2, and σ3. Then, we evaluate the parameter B by fitting the generated data to the MB function in Equation (5) by using the non-linear fitting algorithm of Levenberg-Marquardt [18,19]. The quality of the fit is indicated by the resulting coefficient of determination R².
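A minimal sketch of this procedure is given next, assuming NumPy and SciPy. The function names, sample size, number of histogram bins and the three standard deviations are illustrative choices, not the settings used for the tables. The fitted quantity is the common scale s of a Chi density, used here as a stand-in for B; the exact relation between s and the parameter B of Equation (5) is not reproduced in this sketch. `curve_fit` with `method="lm"` selects SciPy's Levenberg-Marquardt implementation.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import gamma


def chi_pdf(x, s, k):
    """Chi density with k degrees of freedom and common scale s."""
    return x ** (k - 1) * np.exp(-x ** 2 / (2 * s ** 2)) / (
        2 ** (k / 2 - 1) * s ** k * gamma(k / 2))


def simulate_norm(sigmas, n=200_000, seed=0):
    """Norm of independent zero-mean Gaussians with the given std devs."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigmas, size=(n, len(sigmas)))
    return np.sqrt((x ** 2).sum(axis=1))


def fit_scale(samples, k, bins=60):
    """Fit the scale to the histogram density; return (scale, R^2)."""
    density, edges = np.histogram(samples, bins=bins, density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    p0 = [np.sqrt(np.mean(samples ** 2) / k)]      # moment-based start
    popt, _ = curve_fit(lambda x, s: chi_pdf(x, s, k),
                        centres, density, p0=p0, method="lm")
    fitted = chi_pdf(centres, popt[0], k)
    ss_res = np.sum((density - fitted) ** 2)
    ss_tot = np.sum((density - density.mean()) ** 2)
    return popt[0], 1.0 - ss_res / ss_tot


sigmas = np.array([1.0, 1.2, 1.4])   # hypothetical unequal std devs
samples = simulate_norm(sigmas)
s_fit, r2 = fit_scale(samples, k=len(sigmas))

print("fitted scale        :", s_fit)
print("mean of the std devs:", sigmas.mean())
print("R^2                 :", r2)
```

For moderately unequal standard deviations such as these, the fitted scale is expected to lie close to the mean of the three standard deviations, with R² close to 1.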
In Table 1, we present the details of the numerical experiments, where 2 × 100% × (σ_min − σ_max)/(σ_min + σ_max) is the percentage difference between the minimum and maximum standard deviations of the generating Gaussians. The parameter B_fit resulting from the fit and the corresponding coefficient of determination, R², are also included. The table includes the values of B_calc ∝ (σ1 + σ2 + σ3)², which proves to be a very good approximation of B_fit, the percentage difference between B_fit and B_calc, its mean value over the three sets of standard deviations taken for the same percentage difference between σ_min and σ_max, and the corresponding standard deviation (SD). Similarly, in Table 2 we present the details of the numerical experiments and the relative errors between both evaluations of B for k = 5. In Figure 1, the relative error between the fitted and the calculated parameters is represented as a function of the percentage difference between σ_min and σ_max. The error bars in the figure indicate the standard deviation of this error when different possibilities are chosen for the standard deviations within the range σ_min < σ < σ_max. Results are shown for 3 and 5 Gaussian components (Tables 1 and 2). Figure 2 depicts the same results, but for the coefficient of determination R².
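The quantities reported in the tables can be computed as in the short sketch below. Several points are assumptions: the percentage difference is taken in absolute value, the error between the fitted and calculated parameters is normalised by the fitted value, and all numbers are placeholders chosen for illustration (they are not values from Tables 1 and 2).

```python
import numpy as np


def percent_difference(sigmas):
    """2 x 100% x |min - max| / (min + max) for a set of std devs."""
    lo, hi = np.min(sigmas), np.max(sigmas)
    return 200.0 * abs(lo - hi) / (lo + hi)


def percent_error(b_fit, b_calc):
    """Assumed definition: gap between both values relative to b_fit."""
    return 100.0 * abs(b_fit - b_calc) / abs(b_fit)


# Three hypothetical sets sharing the same extremes, hence the same
# percentage difference; only the intermediate value changes.
sets = [(1.0, 1.1, 1.5), (1.0, 1.25, 1.5), (1.0, 1.4, 1.5)]
deltas = [percent_difference(s) for s in sets]

# Placeholder (fitted, calculated) pairs standing in for real fit output
pairs = [(2.0, 1.9), (2.0, 1.96), (2.0, 2.02)]
errors = np.array([percent_error(f, c) for f, c in pairs])

mean_error = errors.mean()        # mean over the three sets
sd_error = errors.std(ddof=1)     # sample standard deviation (SD)

print("percentage differences:", deltas)
print("mean error (%):", mean_error, " SD:", sd_error)
```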

Discrete Methodology for the Calculation of the MB Parameter when Used to Represent the Reaction Times of a Collective of Individuals
In previous works [14,15], we have developed a methodology to analyse the response times to visual stimuli of a group of individuals which is based on the use of the MB distribution [2] instead of the ex-Gaussian distribution [3]. The MB distribution emerges when the reaction time distributions of single individuals are combined at a higher level of hierarchy, which allows us to capture the collective behaviour in a single distribution [14,15]. The details of the computer-based experiments and of the design of the visual stimuli can be found in our previous works [4,14,15]. Although the methodology for the analysis of the reaction time data can also be found therein, we recall in what follows the most relevant expressions.
From the first three moments (mean, standard deviation, and skewness) of the reaction time distribution of each participant (denoted by the index i), a vector $\vec{r}_i$ can be defined, in which 〈·〉 denotes the mean value of a moment over the sample. Then, the Euclidean norm of the vector, $|\vec{r}_i|$, is calculated. Each participant can now be represented by a single scalar, $|\vec{r}_i|$, which we have proven [1] to be distributed as a MB distribution. The MB expression is used to fit the experimental data, where B is the only free parameter (a measure of the dispersion of the distribution). This expression gives the probability of each individual's response time in relation to the reference group the individual belongs to and, therefore, it allows for a classification methodology of individuals based on their reaction times to visual stimuli [14]. Because the methodology rests on a model for a collective of individuals characterised by a given probability distribution, individuals are classified in relation to the group they belong to, and there is no need for an external reference. In what follows, we introduce a discrete model to evaluate the parameter B, which is consistent with the model developed in [14] and suitable for unequal variances of the distributions of the three components of the vector $\vec{r}_i$. The methodology to estimate B develops as follows:

• Step 1. Calculation of the moments for each participant of index i: the mean, the standard deviation, and the skewness of the response times $t_{ij}$ (j = 1, …, n) of each participant (i = 1, …, N). The three moments have time units.

• Step 2. Evaluation of the mean values of the moments over the sample, and calculation of the mean values of the squared moments.

• Step 3. Evaluation of the variances of the dimensionless and normalised moments. First, we make the moments in Equations (9)-(11) dimensionless and normalised. Secondly, we evaluate the mean values of the squared dimensionless and normalised moments (D.N.M.), which can be written by means of the definitions (12) and (13).

The participants are then represented by a single scalar $|\vec{r}_i|$ distributed as a MB distribution. We evaluate its variance, which involves the mean value of the distribution of $|\vec{r}_i|$, and rewrite it using (7). It is well known that the MB distribution corresponds to the distribution of the positive square root of the sum of the squares of three independent Gaussian distributed random variables, each one with zero mean and equal standard deviations ($\sigma_1^* = \sigma_2^* = \sigma_3^* \equiv \sigma$). In that case, the mean value of the MB distribution (3) is $2\sigma\sqrt{2/\pi}$.

Even for the general case where the three variances of the components are not necessarily equal, we can still model the response time data by a MB distribution. We propose to take the parameter B from the average value of the three standard deviations (instead of taking B directly from the average value of the three variances):

$$B \propto \left(\sigma_1^* + \sigma_2^* + \sigma_3^*\right)^2 = \sigma_1^{*2} + \sigma_2^{*2} + \sigma_3^{*2} + 2\sigma_1^*\sigma_2^* + 2\sigma_1^*\sigma_3^* + 2\sigma_2^*\sigma_3^*. \qquad (26)$$

In Figure 3, we summarise the main steps of the proposed discrete methodology to evaluate the parameter B. We can extend the previous procedure to the case of the distribution of the positive square roots of the sum of squares of a set of k independent random variables, each one following a Gaussian distribution with variance $\sigma_j^2$ (j = 1, 2, …, k). In this case, Equation (26) generalises to taking B from the average of the k standard deviations:

$$B \propto \left(\frac{1}{k}\sum_{j=1}^{k} \sigma_j\right)^2.$$
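The steps of the discrete methodology can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the skewness is taken as the cube root of the third central moment (so that it has time units), each moment is made dimensionless and normalised as m/〈m〉 − 1, the response times are synthetic (a Gaussian plus an exponential component with made-up parameters), and the result is reported as the average standard deviation of the three components, whose square is proportional to B.

```python
import numpy as np


def participant_moments(rt):
    """Step 1: mean, std dev and (assumed) cube-root skewness per row.

    rt has shape (participants, trials); all outputs have time units.
    """
    mean = rt.mean(axis=1)
    centred = rt - mean[:, None]
    std = np.sqrt((centred ** 2).mean(axis=1))
    skew = np.cbrt((centred ** 3).mean(axis=1))
    return np.column_stack([mean, std, skew])


def normalised_component_stds(moments):
    """Steps 2-3: std dev of each moment after the assumed normalisation."""
    first = moments.mean(axis=0)            # mean values of the moments
    second = (moments ** 2).mean(axis=0)    # mean values of squared moments
    # Variance of m/<m> - 1, written with the two averages above
    var_star = second / first ** 2 - 1.0
    return np.sqrt(var_star)


# Synthetic response times in ms for 100 participants and 120 trials
rng = np.random.default_rng(1)
mu = rng.normal(500.0, 40.0, size=(100, 1))
tau = rng.uniform(100.0, 200.0, size=(100, 1))
rt = rng.normal(mu, 50.0, size=(100, 120)) + rng.exponential(tau, size=(100, 120))

moments = participant_moments(rt)
stds = normalised_component_stds(moments)
sigma_bar = stds.mean()       # average of the three standard deviations

print("component std devs:", stds)
print("average std dev   :", sigma_bar)
print("its square (proportional to B):", sigma_bar ** 2)
```

Averaging the standard deviations, rather than the variances, mirrors the proposal in Equation (26); the proportionality constant between the squared average and B depends on the parametrisation of the MB distribution and is left out here.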

Conclusions
Monte Carlo simulations and non-linear fittings have been carried out to study the Chi distribution when it is approximated by the square root of the sum of squared variables following non-standard Normal distributions with different values of the variance. We establish the boundaries of what to expect when we start from a set of unequal variances in the generating Gaussians. We also present a discrete model to calculate the parameter, B, of the Chi distribution in an approximate way for the case of unequal variances in the generating Gaussians. We illustrate the application of this simple discrete model to a problem within Experimental Psychology, that is, the calculation of the parameter of the MB distribution when it is used to represent the reaction times to visual stimuli of a collective of individuals in the framework of the Physics inspired model we have published in a previous work. We address two cases that involve the Chi distribution, namely, the case when the variances in the generating Gaussians are equal but different from 1 (multidimensional ideal gas in Physics), and the case when the variances are unequal (multidimensional and anisotropic ideal gas). The results from this work can find multiple applications in social sciences, where knowing the behaviour of the collective, and the behaviour of an individual relative to the collective it belongs to, is of great importance. For instance, the probability distribution of the collective, represented in terms of a Maxwell-Boltzmann-like distribution, provides a more natural classification methodology of individuals.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.