A Bivariate Beta from Gamma Ratios for Determining a Potential Variance Change Point: Inspired by a Process Control Scenario

Within statistical process control (SPC), normality is often assumed as the underlying probabilistic generator, with the process variance assumed equal for all rational subgroups. The parameters of the underlying process are usually assumed to be known; when they are not, estimating them in the SPC environment poses challenges, especially when few observations are available. This paper proposes a bivariate beta type distribution to guide the user in detecting a permanent upward or downward step shift in the process variance. The approach does not rely directly on parameter estimates, and as such presents an attractive and intuitive way not only to identify the magnitude of the shift, but also the position in time where the shift is most likely to have occurred. Certain statistical properties of this distribution are derived, and simulation illustrates the theoretical results. In particular, insights are gained by comparing the newly proposed model's performance with an existing approach. A multivariate extension is described, and useful relationships between the derived model and other bivariate beta distributions are also included.


Problem Contextualisation
The monitoring of the variance of independent and identically distributed (i.i.d.) normal random variables, by taking successive, independent samples of measurements over time, remains an interesting and valuable research consideration within the quality control environment. In this case, when the variance σ² changes to σ₁² = λσ² for some λ ≠ 1, the practitioner needs to investigate the scope of such a change (the value of λ) and, ideally, the position within the successive measurements where such a change could have taken place. Suppose that X_ij ∼ i.i.d. N(µ, σ²) for i = 0, 1, 2, ⋯, κ − 1, and X_ij ∼ i.i.d. N(µ, σ₁² = λσ²) for i = κ, κ + 1, ⋯, m, where j = 1, 2, ⋯, n_i ≥ 2 and λ > 0, as outlined in Figure 1. The values of κ and λ are assumed to be unknown, but deterministic in nature. The order of these samples is important and they cannot be re-ordered; in other words, the samples have a set sequence corresponding to the order in which they were obtained.
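As a concrete illustration of this sampling scheme, the data-generating process can be sketched in a few lines (the function name and default arguments are illustrative, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_process(m, n, kappa, lam, mu=0.0, sigma2=1.0):
    """Draw m + 1 subgroups of size n: subgroups 0..kappa-1 from
    N(mu, sigma2) and subgroups kappa..m from N(mu, lam * sigma2)."""
    sd = np.where(np.arange(m + 1) < kappa,
                  np.sqrt(sigma2), np.sqrt(lam * sigma2))
    return mu + sd[:, None] * rng.standard_normal((m + 1, n))

X = simulate_process(m=10, n=5, kappa=6, lam=4.0)
S2 = X.var(axis=1, ddof=1)   # subgroup sample variances S_i^2
```

For a large upward shift (λ = 4 above), the post-shift sample variances S²_κ, ⋯, S²_m will, on average, be noticeably larger than the pre-shift ones, which is exactly the disparity the proposed statistics are designed to quantify.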
Thus, inspired by a practical objective, this paper aims to present a theoretically motivated framework to:
1. present and contextualise this problem within the quality control environment;
2. follow a systematic approach to build up the distributional foundations from this practical perspective;
3. exploratively focus on the development of the (new) resulting bivariate beta distribution; and
4. compare this model with an existing approach.
Assume that both the process mean (µ) and variance σ² are unknown, and that they are estimated by their respective minimum variance unbiased estimators (MVUE), given in (1) and (2), where X̄_i and S²_i denote the mean and variance of sample i, respectively. Some particular notes on Equations (1) and (2) include:
• The index variable i ranges from 0 to m: a total of m + 1 independent rational subgroups or samples.
• At least two samples are needed for a potential shift between them to be possible; therefore we assume m ≥ 1.

• The sample size n_i can vary between different samples.
• n_i ≥ 2 is necessary since the process mean and variance are both assumed to be unknown and have to be estimated.
• The pooling approach here is to use m − r + 1 and r sample means and variances in the construction of the test statistic in Section 1.2. Alternatively, one can consider a single mean/variance in this construction, which would result in the additional information ∑_{i=r}^{m}(n_i − 1) and ∑_{i=0}^{r−1}(n_i − 1), such that n_i ≥ 1 and the probability density functions remain valid. In this case, the approach would reduce to a two-sample comparison testing for a change in the variance.
The problem of determining whether a shift in the process variance has occurred can be divided into two stages, namely before the potential shift and after, as indicated in (3) and (4) (Gamma(·, ·) denotes the usual gamma distribution with suitable shape and scale parameters [1]).
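The two-stage gamma representation can be checked empirically: for a subgroup of i.i.d. N(µ, σ²) observations, W = (n − 1)S²/σ² is chi-square distributed with n − 1 degrees of freedom, that is, Gamma((n − 1)/2, 2). A minimal Monte Carlo sketch (sample size, variance, and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma2, reps = 8, 2.5, 200_000

X = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
W = (n - 1) * X.var(axis=1, ddof=1) / sigma2   # candidate Gamma((n-1)/2, 2)

# Gamma(shape=(n-1)/2, scale=2) has mean n - 1 and variance 2(n - 1)
print(W.mean(), W.var())   # ≈ 7 and ≈ 14
```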

A Solution: Sequential Statistic Framework
The proposed distribution compares all the sample variances before a certain point (where the potential shift occurs) with all the sample variances after the time of the shift. In essence, the multi-sample hypothesis testing problem is approached by using m sequential two-sample tests, described in (5): first S²_0 is compared with S²_1, S²_2, ⋯, S²_m; then the pooled S²_0, S²_1 is compared with S²_2, S²_3, ⋯, S²_m; and so forth, until S²_0, S²_1, ⋯, S²_{m−1} is compared with S²_m. Note that, due to the assumption that we have m + 1 samples, the procedure to identify the possible change in the variance requires m comparisons. Of these m comparisons, the one that leads to the largest disparity between the sample variances on the left and the right of (5) will indicate the most likely position where the process experienced a change in variance. In essence, what is needed is to quantify the difference between the sample variances on the left and right of (5), then determine which of the m different comparisons has the largest difference, and finally use some measure to determine whether this maximum difference is within some set tolerance range. Our proposed mathematical construct for achieving this is discussed for the remainder of this introduction. Assuming that no shift in the process variance has occurred, it is possible to construct a series of two-sample statistics that correspond to the general procedure described in (5). Each statistic corresponds to whether, at sample r = κ, the two independent samples (the sample variances before time r and the sample variances after and including time r) are from normal distributions with the same unknown variance σ². This can alternatively be viewed as testing whether σ² = σ₁², which is equivalent to testing λ = 1.
As such, it follows that detecting a shift in the process variance can be reduced to a hypothesis test of λ = 1 against λ ≠ 1. Suppose that a shift of size λ occurs in the process variance; then the corresponding random variables after the shift would be distributed W_i ∼ Gamma((n_i − 1)/2, 2λ), i = κ, κ + 1, ⋯, m (see (4)). If there is no shift in the variance, λ = 1, and it follows that W_i ∼ Gamma((n_i − 1)/2, 2), i = 0, 1, ⋯, m. Thus the hypothesis being investigated can be changed depending on the choice of the scale parameter of the specific gamma random variables. From (5) it follows that the series of statistics that forms the building blocks of the process is given by (6), where W_i ∼ Gamma((n_i − 1)/2, 2) for i = 0, 1, ⋯, r − 1 and W_i ∼ Gamma((n_i − 1)/2, 2λ) for i = r, r + 1, ⋯, m.
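Under a shift of size λ, the post-shift variables pick up the factor λ in the gamma scale. This is straightforward to verify by simulation (the parameter values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(10)
n, lam, sigma2, reps = 6, 3.0, 1.0, 300_000

# After the shift, observations come from N(mu, lam * sigma2), so
# W = (n - 1) S^2 / sigma2 should be Gamma((n - 1)/2, scale = 2*lam)
X = rng.normal(0.0, np.sqrt(lam * sigma2), size=(reps, n))
W = (n - 1) * X.var(axis=1, ddof=1) / sigma2

# Gamma((n-1)/2, 2*lam) has mean (n-1)*lam = 15 and variance 2(n-1)*lam^2 = 90
print(W.mean(), W.var())
```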
In essence, (6) implies that the sample variances before the potential shift are pooled together, and the sample variances after the potential shift are pooled together. The numerator of the statistic at time r is the average, weighted by each sample's degrees of freedom, of all the sample variances between and including samples r and m, while the denominator is the corresponding weighted average of all the sample variances between and including samples 0 and r − 1; this is presented graphically in Figure 2. Thus U*_r is the test statistic that is typically used to test the equality of two population variances from normal distributions. From (3), (4) and (6), it follows that if no shift has occurred in the process variance, i.e., λ = 1, each statistic U*_r is univariate F distributed with ∑_{i=r}^{m}(n_i − 1) and ∑_{i=0}^{r−1}(n_i − 1) degrees of freedom, respectively.
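The F distribution of U*_r under λ = 1 can be verified by simulation; the sketch below uses equal subgroup sizes and one arbitrary choice of r, comparing the Monte Carlo mean against the theoretical F mean d₂/(d₂ − 2):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r, reps = 10, 5, 6, 200_000              # equal subgroup sizes n_i = n

W = rng.chisquare(n - 1, size=(reps, m + 1))   # W_i = (n_i - 1)S_i^2/sigma^2 under H0
d1 = (m - r + 1) * (n - 1)                     # numerator degrees of freedom
d2 = r * (n - 1)                               # denominator degrees of freedom
U_star = (W[:, r:].sum(axis=1) / d1) / (W[:, :r].sum(axis=1) / d2)

print(U_star.mean())   # ≈ d2/(d2 - 2) = 24/22 for an F(d1, d2) variable
```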
The reasoning behind the critical values that would indicate whether λ = 1 is justified by inspecting the sequence of statistics in (6). Suppose that an increase (λ > 1) in the process variance has occurred from sample r = κ onward. Then:
• The numerator of the statistic U*_r will contain only sample variances that come from a N(µ, λσ²), λ > 1, distribution, whereas the denominator will contain only sample variances that come from a N(µ, σ²) distribution.
• If k_1 is some integer value such that 1 ≤ r − k_1 < r, then the statistic U*_{r−k_1} will contain k_1 sample variances in its numerator that are from a N(µ, σ²) distribution. This will reduce the weighted average of the sample variances in the numerator of U*_{r−k_1} in comparison to the numerator of U*_r.
• Similarly, if k_2 is some integer value such that r < r + k_2 ≤ m, then the statistic U*_{r+k_2} will contain k_2 sample variances in its denominator that are from a N(µ, λσ²) distribution. This will increase the weighted average of the sample variances in the denominator of U*_{r+k_2} in comparison to the denominator of U*_r.
• Thus, any statistic other than the one immediately following the shift in the process variance will contain either smaller sample variances in its numerator, or larger sample variances in its denominator (on average). Either of these scenarios results in a high probability that all other statistics are smaller relative to U*_r.
• This leads to the conclusion that the most probable place where an upward shift in the process variance will be detected is at the statistic immediately following the shift.
The value that this statistic assumes also has a high likelihood of being the maximum value of all the statistics U*_r, r = 1, 2, ⋯, m.
• As such, the most reasonable method of calculating the critical value to detect an upward shift in the process variance is to calculate the maximum order statistic of the charting statistics U*_r, r = 1, 2, ⋯, m (under the null hypothesis), and to set the critical value equal to some percentile of the distribution of the maximum order statistic.
Using a similar but inverted argument, it can be justified that the critical value of the control chart should be set equal to some percentile of the minimum order statistic of the charting statistics, under the null hypothesis of no shift having occurred, if the detection of a downward shift in the process variance is of concern. Due to space limitations, the minimum order statistics are not presented in this article.
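The argument above suggests that, under an upward shift at sample κ, the index at which the maximum of U*_1, ⋯, U*_m occurs should concentrate at r = κ. A small Monte Carlo sketch of this behaviour (all parameter choices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, kappa, lam, reps = 10, 11, 6, 4.0, 20_000
df = n - 1

# W_i ~ Gamma(df/2, 2) before the shift and Gamma(df/2, 2*lam) from kappa on
scales = np.where(np.arange(m + 1) < kappa, 2.0, 2.0 * lam)
W = rng.gamma(df / 2.0, scales, size=(reps, m + 1))

tail = W[:, ::-1].cumsum(axis=1)[:, ::-1]   # tail[:, r] = W_r + ... + W_m
head = W.cumsum(axis=1)                     # head[:, r-1] = W_0 + ... + W_{r-1}
r = np.arange(1, m + 1)
U_star = (tail[:, 1:] / ((m - r + 1) * df)) / (head[:, :-1] / (r * df))

counts = np.bincount(1 + U_star.argmax(axis=1), minlength=m + 1)
print(counts.argmax())   # modal location of the maximum; kappa = 6 should dominate
```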
To simplify matters going forward, and for notational purposes, we omit the factors ∑_{i=r}^{m}(n_i − 1) and ∑_{i=0}^{r−1}(n_i − 1) in (6) and drop the superscript *; the statistics of interest therefore become those in (7), where W_i ∼ Gamma(α_i > 0, β_i > 0), i = 0, 1, ⋯, m, are independent. Since α_i = (n_i − 1)/2 and β_i = 2λ, the shape parameter is related to the sample size of the i-th sample, and the scale parameter is related to the underlying distribution's variance. The theoretical focus here is on the statistics in (7), whereas the statistics in (6) are those that are practically applicable and form the basis of the simulation study in Section 3.
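With the constants dropped, the statistics in (7) are ratios of partial sums of independent gamma variables; a minimal construction might look as follows (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(11)
m, n, lam = 4, 5, 1.0
alpha, beta = (n - 1) / 2.0, 2.0 * lam    # alpha_i = (n_i - 1)/2, beta_i = 2*lambda

W = rng.gamma(alpha, beta, size=m + 1)    # independent W_0, ..., W_m
U = np.array([W[r:].sum() / W[:r].sum() for r in range(1, m + 1)])
```

By construction the sequence U_1, ⋯, U_m is strictly positive and strictly decreasing, since the numerator sum loses the term W_r while the denominator gains it as r increases; this is the ordering u_1 > u_2 > ⋯ > u_m > 0 that appears in the support of the joint density.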
Constructing statistics using ratios of random variables as in (7) is of practical interest in many areas of science. Ref. [2] studied and derived the joint density functions of ratios of Rayleigh, Rician, Nakagami-m, and Weibull random variables; [3] approached the ratios of generalised gamma variables via exact and near-exact solutions; and [4] derived closed-form expressions for the ratio of independent non-identically distributed variables from an α-µ distribution, which have applications in the performance analysis of wireless communication systems.
The proposed model in this paper will be compared to the model of [5] in Section 3; the latter is described here for the convenience of the reader. If r = 2, the bivariate joint probability density function of the statistics in (10) follows, where Γ(·) is the gamma function [6].
Refs. [5,7] proposed a beta distribution that is used to detect a shift in the process variance, based on the Q chart developed by [8]; the series of comparisons of sample statistics they made is given in (9). Using a similar approach to the method described earlier, the statistics can be written as in (10), which is presented graphically in Figure 3, with W_i defined in (3) and (4). Based on the work of [5], Ref. [9] provided insight for detecting the change in the parameter structure if the underlying process is multivariate normal.

Outline of Paper
In Section 2, the bivariate joint probability density function which emanates from (7) is derived, accompanied by an exploratory shape analysis. This is followed by the derivation of the marginal probability density functions, the product moment, as well as the maximum order statistic of the distribution. In Section 3, the performance of the model proposed in this article is compared to the Q chart studied by [5] through a simulation study. Tables of simulated values for the 95th percentiles of the maximum order statistics of the sets of random variables in (6) and (7) are provided in Section 3, for varying parameter choices, to enable practical application of the proposed model. The values in these tables are corroborated through numerical integration of the derived expressions for the maximum order statistics. Appendices A and B contain proofs of the obtained results as well as the positioning of this distribution among other often-considered bivariate beta models.

Proposed Model
In this section, the joint probability density function of the random variables U_1 and U_2 (see (7) when r = 2) is derived, followed by a shape analysis, the derivation of the marginal probability density functions, the product moment, as well as the maximum order statistic. A brief review of this new candidate relative to related bivariate beta models is provided in Appendices A and B, which offer additional modelling insight as well as closed-form expressions.

Bivariate Probability Density Function
Theorem 1. Let W_i ∼ Gamma(α_i > 0, β_i > 0), i = 0, 1, 2, be independent, and let U_1 = (W_1 + W_2)/W_0 and U_2 = W_2/(W_0 + W_1) as in (7); then the joint probability density function of U_1 and U_2 is given in (11), where u_1 > u_2 > 0.

Shape Analysis
In this section a shape analysis is conducted for the joint probability density function (11). A standard set of parameters is chosen as a baseline: α_0 = α_1 = α_2 = 5 and β_0 = β_1 = β_2 = 2; in other words, a process where all three samples consist of 5 × 2 + 1 = 11 observations, and where no shift has occurred in the process variance. Some of the parameters are then varied from this baseline in order to investigate the effect that a change in the specific parameters has on the general shape of the joint probability density function. Note that the change in some parameters will be large, so large that they lose practical realism; this is done to emphasise and investigate the general change in the shape, and is not meant to be an indication of the practical applications of the joint probability density function. The functions are plotted only over a restricted domain. Figure 4 shows the effect that increasing the sample sizes (while keeping them equal) has on the joint probability density function: increasing the sample sizes increases the height of the peak of the probability density function. Larger sample sizes also shrink the length and width of the "tails" of the joint probability density function. In essence, the larger the sample sizes (n_i = 11, n_i = 51, n_i = 101), the smaller the domain on which the function has non-trivial values. Figure 5 demonstrates that a sustained increase in the process variance, irrespective of size, minimally affects the general shape and location of the joint probability density function, but does affect the height of the probability density function's peak. In this example, the shift in the process variance occurs at time 1, and, as one would hope and expect, the joint probability density function relies heavily on the value of the statistic at time 1, U_1.
A similar effect, where the joint probability density function relies heavily on the value of U_2, is seen when the shift occurs at time 2.

Marginal Probability Density Functions
Theorem 2. Assume that (U_1, U_2) has the joint probability density function in (11); then the marginal probability density function of U_1 is given by (12), and the marginal probability density function of U_2 is given by (13).
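As a quick sanity check on the marginal distribution of U_1 = (W_1 + W_2)/W_0, independence gives E[U_1] = E[W_1 + W_2]E[1/W_0] = (α_1 + α_2)/(α_0 − 1) for α_0 > 1, free of the common scale parameter. A Monte Carlo sketch with the baseline parameters used in the shape analysis:

```python
import numpy as np

rng = np.random.default_rng(6)
a0, a1, a2, beta, reps = 5.0, 5.0, 5.0, 2.0, 400_000

W = rng.gamma([a0, a1, a2], beta, size=(reps, 3))
U1 = (W[:, 1] + W[:, 2]) / W[:, 0]

# E[U1] = (a1 + a2)/(a0 - 1), independent of the common scale beta
print(U1.mean())   # ≈ (5 + 5)/(5 - 1) = 2.5
```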
The maximum order statistics are of importance in detecting whether a shift in the process variance has indeed occurred, as was discussed in Section 1 and will be demonstrated in Section 3. Although a closed-form expression for the maximum order statistic is not tractable, an expression is provided that can be implemented to calculate numerical values.
In Section 3.1, properties of (7) are also studied when no shift in the variance occurs. This is imperative, since the control limits of a control chart are constructed under the null hypothesis that no shift has occurred, or, alternatively, given that the process is in control. The in-control property that is investigated for both the Q chart and the newly proposed model (6) is where the maximum order statistic is most likely to occur when no shift has occurred in the process variance. Practically, this is a very important question since, if the maximum order statistic consistently occurs at roughly the same place in the sequence of samples, it implies that the distribution should be treated with added suspicion if it indicates that a shift in the process occurs at this location.
In Section 3.2, the out of control performance of the newly proposed distribution is compared with the Q chart model investigated by [5]. The probability that the respective control charts will detect a shift in the process variance is investigated for differing sizes of shifts in the process variance. The comparison is made for varying numbers of samples as well as sample sizes.

Comparison When the Process Is in Control
The location of the maximum order statistic when the process is in control is investigated using graphs. For brevity's sake only one possible combination (m = 29, n = 15) of the number of samples and sample sizes mentioned above is included in this paper. All of them, however, lead to the same general conclusion.
As can be seen from Figure 6, the maximum order statistic of the [5] distribution occurs most often at the first statistic, with the probability of the maximum occurring at subsequent statistics steadily decreasing. This implies that the Q chart becomes more stable as the process progresses, which makes practical sense since each subsequent statistic includes more of the sample data. The newly proposed distribution's maximum order statistic occurs most often at the first statistic, and second-most often at the last statistic. This is due to the way in which the statistics of the distribution are constructed (see (6)). It implies that while our proposed model may detect shifts at the ends of the samples, signals received at these locations should be treated with some skepticism. In Table 1 the 95th percentiles of the maximum order statistics of the newly proposed distribution are simulated (to the third decimal) using Monte Carlo simulation; that is, the values z such that P(max(U*_1, U*_2, ⋯, U*_m) < z) = 0.95. Similarly, the 95th percentiles of the maximum order statistics of (7) are simulated in Table 2. Note that:
• Since the critical values of the distribution are derived under the assumption of the null hypothesis, the equations for the maximum order statistic can be simplified to be constructed from chi-square random variables instead of the more complex gamma case.
• When the sample sizes of all the samples are equal (n_i = n for i = 1, ⋯, m), the removal of the constant terms in (7) is superfluous, since the series of statistics in (6) and (7) are merely scalar multiples of each other. This can be seen in Tables 1 and 2, where P(max(U*_1, U*_2, ⋯, U*_m) < z) ≈ P(max(U_1, U_2, ⋯, U_m) < mz).

• Instead of using simulation, the maximum order statistics in Table 2 could have been calculated using (15). Table 3 demonstrates their equivalence for the bivariate case by numerically integrating (15).
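A Table 1-style critical value can be reproduced with a short Monte Carlo routine; as noted above, chi-square random variables suffice under the null hypothesis. The (m, n) combination below is illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, reps = 9, 5, 200_000                  # m + 1 = 10 subgroups of equal size n
df = n - 1

W = rng.chisquare(df, size=(reps, m + 1))   # chi-square draws suffice under H0
tail = W[:, ::-1].cumsum(axis=1)[:, ::-1]   # tail[:, r] = W_r + ... + W_m
head = W.cumsum(axis=1)                     # head[:, r-1] = W_0 + ... + W_{r-1}
r = np.arange(1, m + 1)
U_star = (tail[:, 1:] / ((m - r + 1) * df)) / (head[:, :-1] / (r * df))

z = np.quantile(U_star.max(axis=1), 0.95)   # critical value for the maximum order statistic
print(round(z, 3))
```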

Comparisons When the Process Is Out of Control
In this section, the proposed model's potential to detect shifts is compared with that of the Q chart form investigated by [5]. The probability of signaling that a shift in the process variance has occurred depends on a few variables. In this paper, these variables are: the number of samples, the sample size of the samples, where in the process the shift in the process variance occurs, and the size of the shift. The figures in this section take all of these parameters into account.
In each of the following figures, the probability of signaling a shift in the variance is displayed as a function of the size of the shift, where the shift size λ, ranges from λ = 1 (no shift) to λ = 5 (a 500% increase in the process variance). Many different combinations of the number of samples, the sample sizes, as well as the locations of the shift were tested; however, only the graphs that illustrate key findings are included in this paper. The chosen parameters that were simulated are: number of samples (m) equal to 10, 20 and 30, the sample sizes (n) equal to 2, 5, 10 and 20, and the location of the shift in the process variance (κ) occurring at (roughly, due to integer sample numbers) 25%, 50% and 75% of the way through the samples.
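The simulation just described can be sketched as follows; this simplified version uses a single (m, n, κ) combination and far fewer replications than the 500,000 used for the paper's figures, so the printed powers are rough estimates:

```python
import numpy as np

rng = np.random.default_rng(8)
m, n, kappa, reps = 9, 10, 5, 20_000
df = n - 1

def max_stat(W):
    """Maximum of the charting statistics U*_1, ..., U*_m for each row of W."""
    tail = W[:, ::-1].cumsum(axis=1)[:, ::-1]
    head = W.cumsum(axis=1)
    r = np.arange(1, W.shape[1])
    U = (tail[:, 1:] / ((W.shape[1] - r) * df)) / (head[:, :-1] / (r * df))
    return U.max(axis=1)

# critical value under H0 (lambda = 1)
z = np.quantile(max_stat(rng.chisquare(df, size=(reps, m + 1))), 0.95)

power = {}
for lam in (1.0, 2.0, 4.0):
    scales = np.where(np.arange(m + 1) < kappa, 2.0, 2.0 * lam)
    W = rng.gamma(df / 2.0, scales, size=(reps, m + 1))
    power[lam] = (max_stat(W) > z).mean()
    print(lam, round(power[lam], 3))   # roughly 0.05 at lambda = 1, rising with lambda
```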
By simulating graphs such as those in Figures 7-9, certain conclusions can be reached about the properties and efficacy of the two competing models ((5) and (9)):
• When n = 2, irrespective of the number of samples, the newly proposed model outperforms the Q chart. There are, however, some caveats that should be noted:
1. In the above graphs, each plotted point was simulated 500,000 times during the Monte Carlo process. For all the n = 2 graphs, the points vary erratically between each 0.05 increase in the shift size, and thus even for a large number of simulations the process cannot be described as "stable".
2. The Q chart seems to be completely incapable of detecting an increase in the process variance when n = 2, with the probability of detecting a shift remaining at roughly 5%, irrespective of the size of the shift.
3. While the new model's probability of detecting a shift does increase as the size of the shift increases, it remains relatively low, at roughly 7% to 10%, just marginally higher than the 5% chance when the process is actually in control. This implies that while it might be theoretically possible to implement the new model for sample sizes of 2, it would likely not be a practically useful technique.
4. The new model's probability of detecting a shift does not increase as the number of samples increases, as would be expected (and as is the case for the other choices of n).
• From the points above, it can be concluded that using a sample size of 2 does not lead to an effective control chart.
• For a small number of samples (m = 9), the newly proposed model outperforms the Q chart for all simulated sample sizes as well as locations of shifts (for all shift sizes).
• When there are 20 samples (m = 19), the newly proposed model outperforms the Q chart in nearly all situations. The Q chart has a higher probability of detecting a shift in the process variance only when the sample sizes are small (n = 5) and the shift occurs relatively late in the process (κ = 15), for shifts in the process variance between λ = 3 and λ = 4.75. Since a 300% to 475% increase in the process variance is unlikely to occur in practice, the newly proposed model would likely be more effective for m = 19.
• For m = 29, sweeping statements about the performances of the two methods are more difficult to make, since the plotted percentage lines cross often. However, it can be said that:
1. For small sample sizes (n = 5), the proposed model outperforms the Q chart for small shifts in the process variance, whereas the Q chart performs better for larger shifts.
2. The Q chart performs at its best when the shift in the process variance occurs late in the series of samples.
3. For larger sample sizes (n = 20), the proposed model outperforms the Q chart when the shift in the process variance occurs early; when the shift occurs roughly half way through the series of samples, or later, the performance of the two methods is very similar.
(Figure panels: m = 9, n = 2, κ = 7 and m = 29, n = 2, κ = 15.)

Concluding Remarks
In this paper a new bivariate beta type distribution is proposed that can be utilised to detect a shift in a process's variance when the underlying process follows a normal distribution. The proposed model compares favourably with the Q chart model studied by [5] in most situations, especially when the number of samples is small and when the process variance experiences a change early on in the series of samples. Future work can include a focus on (i) the case when a shift occurs within a sample, (ii) expanding the underlying distributional assumptions to the class of scale mixtures to consider departures from normality (see [11]), and (iii) the multivariate setup, for which we propose the probability density function in the following theorem, with a proof outlined in Appendices A and B.
Theorem 4. Let W_i be independent gamma random variables with parameters (α_i > 0, β_i > 0) for i = 0, 1, 2, ⋯, m. Let U_r = ∑_{i=r}^{m} W_i / ∑_{i=0}^{r−1} W_i, r = 1, 2, ⋯, m; then the joint probability density function of U_1, U_2, ⋯, U_m is given on the support u_1 > u_2 > ⋯ > u_m > 0.
• If V_1 = X_1/X_3 and V_2 = X_2/X_3, then (V_1, V_2) has a bivariate beta type II distribution [14].
• If W_1 and W_2 are constructed as ratios of the Y_i with common denominator Y_1 + Y_2 + 2Y_3, then (W_1, W_2) has a bivariate beta type III distribution. The multivariate generalisation was derived and studied by [15]; [16] considered a case which is a generalisation of the above type III distribution.
• A related ratio construction yields a bivariate beta type VI distribution. This joint probability density function has not yet been derived in the literature and could potentially be applied to detecting shifts in a process variance.
• A further ratio construction yields a bivariate beta type VIII distribution. This is the model that this article proposes in terms of gamma variables in Section 2; in a special case it can be reduced to a construction from chi-square variables.
Relationships between these models are graphically represented in Figure A1.