On the Reversible Jump Markov Chain Monte Carlo (RJMCMC) Algorithm for Extreme Value Mixture Distribution as a Location-Scale Transformation of the Weibull Distribution

Abstract: Data with a multimodal pattern can be analyzed using a mixture model. In a mixture model, the most important step is the determination of the number of mixture components, because finding the correct number of mixture components reduces the error of the resulting model. In a Bayesian analysis, one method that can be used to determine the number of mixture components is the reversible jump Markov chain Monte Carlo (RJMCMC). The RJMCMC is used for distributions with location and scale parameters, i.e., location-scale distributions, such as the Gaussian distribution family. In this research, we added an important step before applying the RJMCMC method, namely the modification of the analyzed distribution into a location-scale distribution. We call this the non-Gaussian RJMCMC (NG-RJMCMC) algorithm; the subsequent steps are the same as for the standard RJMCMC. In this study, we applied it to the Weibull distribution. This will help many researchers in the field of survival analysis, since survival times often follow a Weibull distribution. We transformed the Weibull distribution into a location-scale distribution, namely the extreme value (EV) type 1 (Gumbel-type for minima) distribution; for the mixture analysis, we therefore call this the EV-I mixture distribution. Based on the simulation results, we conclude that the accuracy level is at least 95%. We also applied the EV-I mixture distribution and compared it with the Gaussian mixture distribution for the enzyme, acidity, and galaxy datasets. Based on the Kullback–Leibler divergence (KLD) and visual observation, the EV-I mixture distribution provides better coverage than the Gaussian mixture distribution. We also applied it to our dengue hemorrhagic fever (DHF) data from eastern Surabaya, East Java, Indonesia. The estimation results show that the number of mixture components in the data is four; we also obtained the estimates of the other parameters and the labels for each observation. Based on the KLD and visual observation, for our data, the EV-I mixture distribution again offers better coverage than the Gaussian mixture distribution.


Introduction
Understanding the type of distribution of the data is the first step in a data-driven statistical analysis, especially in a Bayesian analysis. This is very important because the chosen distribution should cover the observed data as well as possible. By knowing the distribution of the data, the error in the model can be minimized. However, it is not rare for the data to exhibit a multimodal pattern. A model for data with a multimodal pattern becomes imprecise when it is analyzed using a single-mode distribution; such data are best modeled using mixture analysis. The most important step in mixture analysis is determining the number of mixture components. In general, our algorithm can be applied to any mixture distribution whose original distribution can be converted into a location-scale family.
Several studies have applied the RJMCMC method to the Weibull distribution. First, Newcombe et al. [52] used RJMCMC to implement Bayesian variable selection in a Weibull regression model for breast cancer survival. Denis and Molinari [53] also used RJMCMC, for covariate selection under the Weibull distribution in two datasets, namely the Stanford heart transplant and lung cancer survival data. Mallet et al. [54] used RJMCMC to search for the best configuration of functions for Lidar waveforms; their library of modeling functions includes the generalized Gaussian, Weibull, Nakagami, and Burr distributions, and their analysis models the Lidar waveform as a combination of these distributions. In our research, we use the RJMCMC method on the Weibull distribution from a different perspective, namely identifying multimodal data by determining the number of components, then determining the membership of each mixture component and obtaining the parameter estimates.
This paper is organized as follows. Section 2 introduces the basic formulation of the Bayesian mixture model and the hierarchical model in general. Section 3 describes location-scale distributions and the NG-RJMCMC algorithm. Section 4 describes the transformation of the Weibull distribution into a location-scale distribution, the choice of prior distributions, and the move types in the RJMCMC method. Section 5 contains the simulation study. Section 6 provides misspecification cases, as the counterpart of the simulation study, to strengthen the proposed method. Section 7 provides analysis results for the enzyme, acidity, and galaxy datasets, as well as our own data on dengue hemorrhagic fever (DHF) in eastern Surabaya, East Java, Indonesia. Conclusions are given in Section 8.

Basic Formulation
For independent scalar or vector observations t_i, the basic mixture model can be written as in Equation (1):

t_i ∼ ∑_{j=1}^{k} w_j f(·|θ_j), independently for i = 1, 2, · · · , n,   (1)

where f(·|θ) is a given parametric family of densities indexed by a scalar or vector parameter θ [16]. The purpose of this analysis is to infer the unknowns: the number of components (k), the weights of the components (w_j), and the parameters of the components (θ_j). Suppose a heterogeneous population consists of groups j = 1, 2, · · · , k in proportions w_j. The identity or label of the group is unknown for every observation. In this context, it is natural to create a group label z_i for the i-th observation as a latent allocation variable. The unobserved vector z = (z_1, z_2, · · · , z_n) is usually known as the "membership vector" of the mixture model [29]. Then, the z_i are assumed to be drawn independently from the distributions

Pr(z_i = j) = w_j, for j = 1, 2, · · · , k.   (2)
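The generative process in Equations (1) and (2) can be sketched in a few lines of Python: first draw the latent label z_i from the categorical distribution with weights w, then draw t_i from the corresponding component. The two-component Weibull mixture below uses hypothetical parameters chosen only for illustration, not values from the paper.

```python
import math
import random

def sample_mixture(n, weights, samplers, seed=0):
    """Draw n observations from a finite mixture: z_i ~ Categorical(w),
    then t_i ~ f(.|theta_{z_i}), as in Equations (1)-(2)."""
    rng = random.Random(seed)
    labels, data = [], []
    k = len(weights)
    for _ in range(n):
        z = rng.choices(range(k), weights=weights)[0]  # Pr(z_i = j) = w_j
        labels.append(z)
        data.append(samplers[z](rng))
    return data, labels

def weibull_sampler(eta, lam):
    """Inverse-CDF sampler for Weibull(eta, lam): t = lam * (-ln U)^(1/eta)."""
    return lambda rng: lam * (-math.log(1.0 - rng.random())) ** (1.0 / eta)

# Hypothetical two-component mixture with weights (0.4, 0.6).
data, labels = sample_mixture(
    500, [0.4, 0.6], [weibull_sampler(2.0, 1.0), weibull_sampler(3.0, 5.0)])
```

The latent labels generated here play the role of the membership vector z; in the Bayesian analysis they are unknown and must be inferred.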

The Family of Location-Scale Distributions
A random variable T is defined as belonging to the location-scale family when its cumulative distribution function (CDF) is a function only of (t − µ)/σ, as in Equation (6) [55]:

F_T(t|µ, σ) = F((t − µ)/σ),   (6)

where F(·) is a distribution function without other parameters. The two-dimensional parameter (µ, σ) is called the location-scale parameter, with µ being the location parameter and σ being the scale parameter. For fixed σ = 1, we have a subfamily that is a location family with parameter µ, and for fixed µ = 0, we have a scale family with parameter σ. If T is continuous with probability density function (p.d.f.) f_T(t|µ, σ), then (µ, σ) is a location-scale parameter for T if (and only if)

f_T(t|µ, σ) = (1/σ) g((t − µ)/σ),   (7)

where the functional form g is completely specified, but the location and scale parameters µ and σ of f_T(t|µ, σ) are unknown, and g(·) is the standard form of the density f_T(t|µ, σ) [56,57].
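The defining property in Equation (6) is easy to verify numerically for the EV-I (Gumbel-for-minima) distribution used later in the paper: the CDF under any (µ, σ) equals the standard, parameter-free CDF evaluated at the standardized point. The parameter values below are arbitrary illustrations.

```python
import math

def ev1_cdf(t, mu=0.0, sigma=1.0):
    """CDF of the EV-I (Gumbel-for-minima) distribution, a location-scale
    family: it depends on t only through (t - mu) / sigma."""
    return 1.0 - math.exp(-math.exp((t - mu) / sigma))

# Equation (6): F_T(t | mu, sigma) = F((t - mu) / sigma), where F is the
# standard form obtained with mu = 0 and sigma = 1.
mu, sigma, t = 2.0, 0.5, 2.7
lhs = ev1_cdf(t, mu, sigma)
rhs = ev1_cdf((t - mu) / sigma)  # standard (parameter-free) form
```

The same check fails for a distribution that is only a scale family, such as the Weibull in its original parameterization, which motivates the transformation in Section 4.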

NG-RJMCMC Algorithm
One well-known method in mixture model analysis is the reversible jump Markov chain Monte Carlo (RJMCMC). The RJMCMC method can be used to estimate unknown quantities such as the number of mixture components, the weights of the mixture components, and the distribution parameters of the mixture components. Owing to its usefulness in determining the number of mixture components in data with an indicated multimodal pattern, RJMCMC has been used extensively. For the Gaussian distribution, this was first carried out by Richardson and Green [16]. Over time, RJMCMC has been extended to distributions other than the Gaussian. RJMCMC for the general beta distribution was carried out in two studies by Bouguila and Elguebaly [29,58]. In their research, the general beta distribution has four parameters, namely the lower limit, the upper limit, and two shape parameters; to use the RJMCMC algorithm, they obtained a location-scale parameterization for this distribution. Another study that followed the location-scale parameterization for the RJMCMC method concerned the symmetric gamma distribution [59].
The location-scale parameterization is used in mixture analysis not only with the RJMCMC method but also with other methods. First, research on the exponential and Gaussian distributions using the Dirichlet process mixture was carried out by Jo et al. [60]. Second, research on the asymmetric Laplace error distribution using a likelihood-based approach was carried out by Kobayashi and Kozumi [61]. Finally, research on the exponential distribution using the Gibbs sampler was carried out by Gruet et al. [62]. Based on the studies mentioned above, mixture analysis with any method becomes easier if the distribution under study follows a location-scale parameterization. Thus, we give Algorithm 1 as a modification of the RJMCMC algorithm.
Algorithm 1 begins by modifying the distribution to be analyzed into a member of the location-scale family; the remaining steps follow the standard RJMCMC. Letting ∆ denote the state variable (in this study, ∆ is the complete set of unknowns (µ, σ, k, w, z)) and p(∆) the target probability measure (the posterior distribution), we consider a countable family of move types, indexed by m = 1, 2, · · · . When the current state is ∆, a move of type m and a destination ∆* are proposed, with joint distribution q_m(∆, ∆*). The move is accepted with probability

min{1, [p(∆*) q_m(∆*, ∆)] / [p(∆) q_m(∆, ∆*)]}.   (8)

If ∆* lies in a higher-dimensional space than ∆, it is possible to create a vector of continuous random variables u, independent of ∆ [16]. The new state ∆* is then set by an invertible deterministic function f(∆, u) of ∆ and u. The acceptance probability in Equation (8) can then be rewritten as in Equation (9):

min{1, A}, with A = [p(∆*) r_m(∆*)] / [p(∆) r_m(∆) q(u)] · |∂∆*/∂(∆, u)|,   (9)

where r_m(∆) is the probability of choosing move type m when in state ∆, and q(u) is the p.d.f. of u. The last term, |∂∆*/∂(∆, u)|, is the determinant of the Jacobian matrix arising from the change of variable from (∆, u) to ∆*.

Bayesian Analysis of Weibull
If the random variable T follows the Weibull distribution with shape parameter η > 0 and scale parameter λ > 0, T ∼ Weibull(η, λ), then the p.d.f. is given by Equation (10) [63]:

f_T(t|η, λ) = (η/λ) (t/λ)^(η−1) exp{−(t/λ)^η}, t > 0,   (10)

and the CDF is given by Equation (11) [64]:

F_T(t|η, λ) = 1 − exp{−(t/λ)^η}, t > 0.   (11)

Equation (11) can be rewritten as Equation (12):

F_T(t|η, λ) = F(t/λ), with F(x) = 1 − exp{−x^η}.   (12)

Based on Equation (12) and the explanation in Section 3.1, it can be concluded that the Weibull distribution is a member of the scale family. To facilitate the mixture analysis of the Weibull distribution, it is necessary to transform it into a location-scale distribution.
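The required transformation is the log transform: if T ∼ Weibull(η, λ), then Y = ln T follows the EV-I distribution with µ = ln λ and σ = 1/η (derived in Appendix A). This identity can be checked numerically, as in the sketch below; the parameter values are illustrative only.

```python
import math

def weibull_cdf(t, eta, lam):
    """Weibull CDF, Equation (11): depends on t only through t / lam."""
    return 1.0 - math.exp(-((t / lam) ** eta))

def ev1_cdf(y, mu, sigma):
    """EV-I (Gumbel-for-minima) CDF in location-scale form."""
    return 1.0 - math.exp(-math.exp((y - mu) / sigma))

# If T ~ Weibull(eta, lam), then Y = ln T ~ EV-I(mu = ln lam, sigma = 1/eta),
# so the two CDFs must agree at matching points t and ln t.
eta, lam = 2.5, 3.0
pairs = [(weibull_cdf(t, eta, lam),
          ev1_cdf(math.log(t), math.log(lam), 1.0 / eta))
         for t in (0.5, 1.0, 2.0, 4.0)]
```

Because the transformation is monotone, mixture component memberships inferred on the log scale carry over directly to the original data.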

Finite EV-I Mixture Distribution
Based on the explanation provided in the previous subsection, T ∼ Weibull(η, λ), a member of the scale family, can be transformed into a member of the location-scale family via Y = ln T, where Y ∼ EV-I(µ = ln λ, σ = 1/η); the two are equivalent models [68]. As Y ∼ EV-I(µ = ln λ, σ = 1/η) belongs to the location-scale family, it is sometimes easier to work with Y than with T ∼ Weibull(η, λ) [68], especially in the analysis of the mixture model. Consequently, the subsequent analysis uses the variable Y. The EV-I mixture distribution with k components is defined as in Equation (15):

p(y|Θ, k) = ∑_{j=1}^{k} w_j f(y|µ_j, σ_j),   (15)

where Θ = (θ, w) refers to the complete set of parameters to be estimated, with θ = (µ_1, σ_1; µ_2, σ_2; · · · ; µ_k, σ_k).
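The mixture density in Equation (15) can be evaluated directly as a weighted sum of EV-I densities. The sketch below uses hypothetical parameters and checks, via a crude Riemann sum, that the resulting density integrates to one.

```python
import math

def ev1_pdf(y, mu, sigma):
    """EV-I density: (1/sigma) * exp(z - exp(z)) with z = (y - mu) / sigma."""
    z = (y - mu) / sigma
    return math.exp(z - math.exp(z)) / sigma

def ev1_mixture_pdf(y, weights, mus, sigmas):
    """k-component EV-I mixture density as in Equation (15)."""
    return sum(w * ev1_pdf(y, m, s)
               for w, m, s in zip(weights, mus, sigmas))

# Hypothetical two-component mixture; integrate numerically over a wide
# interval (the EV-I tails decay fast enough for this to be accurate).
weights, mus, sigmas = [0.3, 0.7], [0.0, 2.0], [1.0, 0.5]
dy = 0.001
total = sum(ev1_mixture_pdf(-25.0 + i * dy, weights, mus, sigmas) * dy
            for i in range(int(33 / dy)))
```

Since the weights sum to one and each component density integrates to one, `total` should be close to 1.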
Then, the final form of the joint distribution can be written as in Equation (17):

p(y, z, θ, w, k|ξ) = p(k) p(w|k) p(z|w, k) p(θ|k, ξ) ∏_{i=1}^{n} Pr(y_i|θ_{z_i}).   (17)

Priors and Posteriors
In this section, we define the priors. In the hierarchical model in Equation (16), we assume the priors for each parameter are drawn independently. Based on research conducted by Yoon et al. [69], Coles and Tawn [70], and Tancredi et al. [71], the priors for the location and scale parameters in the extreme value distribution are flat. Yoon et al. [69] adopted near-flat priors for the location and scale parameters. In Coles and Tawn [70], the location and scale parameter priors are almost noninformative: the prior for µ is extremely flat, while that for σ resembles 1/σ. Tancredi et al. [71] chose a uniform distribution for the location and scale parameters. Therefore, in this study, a Gaussian distribution with a large variance, mean ε, and variance ζ² was selected as the prior for the location parameter µ. Thus, µ_j for each component is given by

µ_j ∼ N(ε, ζ²).   (18)

Since the scale parameter σ controls the dispersion of the distribution, an appropriate prior is an inverse gamma distribution with shape parameter ϑ and scale parameter β. This prior selection is supported by Richardson and Green [16] and Bouguila and Elguebaly [29]. Thus, σ_j for each component is given by

σ_j ∼ Inv-Gamma(ϑ, β),   (19)

where β ∼ Gamma(g, h). Using Equations (18) and (19), we obtain the joint prior

p(µ_j, σ_j|ε, ζ², ϑ, β) = p(µ_j) p(σ_j).   (20)

Therefore, the hyperparameter ξ in Equation (17) is actually (ε, ζ², ϑ, β). Thus, according to Equation (20) and the joint distribution in Equation (17), the fully conditional posterior distributions for µ_j and σ_j are

p(µ_j|· · · ) ∝ p(µ_j) ∏_{i: z_i = j} f(y_i|µ_j, σ_j)   (21)

and

p(σ_j|· · · ) ∝ p(σ_j) ∏_{i: z_i = j} f(y_i|µ_j, σ_j),   (22)

where n_j = #{i : z_i = j} represents the number of observations in cluster j, and we use '|· · ·' to designate conditioning on all other variables. As the weights of the components w = (w_j)_{j=1,2,··· ,k} are defined on the simplex {(w_1, w_2, · · · , w_k) : w_j ≥ 0, ∑_j w_j = 1}, the appropriate prior for the weights of the components is a Dirichlet distribution with parameters δ = (δ_1, δ_2, · · · , δ_k) [72], with the p.d.f. as in Equation (23):

p(w|k, δ) = [Γ(∑_{j=1}^{k} δ_j) / ∏_{j=1}^{k} Γ(δ_j)] ∏_{j=1}^{k} w_j^{δ_j − 1}.   (23)

According to Equation (2), we also have

p(z|w, k) = ∏_{i=1}^{n} w_{z_i} = ∏_{j=1}^{k} w_j^{n_j}.   (24)

Using Equations (23) and (24) and our joint distribution from Equation (17), we obtain

p(w|· · · ) ∝ ∏_{j=1}^{k} w_j^{δ_j + n_j − 1},   (25)

which is in fact proportional to a Dirichlet distribution with parameters (δ_1 + n_1, δ_2 + n_2, · · · , δ_k + n_k). Using Equations (2) and (17), we obtain the posterior for the allocation variables:

Pr(z_i = j|· · · ) ∝ w_j f(y_i|µ_j, σ_j).   (26)

Lastly, a proper prior for k is the Poisson distribution with hyperparameter γ [16]; the p.d.f. for k can then be written as in Equation (27):

p(k) ∝ γ^k / k!, for k = 1, 2, · · · , k_max.   (27)

Our hierarchical model can be displayed as a directed acyclic graph (DAG), as shown in Figure 1.
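The conjugate updates in Equations (25) and (26) can be sketched as Gibbs steps in a few lines of Python. This is an illustration with toy data and hypothetical parameters, not the authors' implementation; the Dirichlet draw uses the standard normalized-Gamma construction.

```python
import math
import random

def ev1_pdf(y, mu, sigma):
    """EV-I component density f(y | mu_j, sigma_j)."""
    z = (y - mu) / sigma
    return math.exp(z - math.exp(z)) / sigma

def update_allocations(y, weights, mus, sigmas, rng):
    """Gibbs step for z: Pr(z_i = j | ...) is proportional to
    w_j * f(y_i | mu_j, sigma_j), as in Equation (26)."""
    k = len(weights)
    return [rng.choices(range(k),
                        weights=[w * ev1_pdf(yi, m, s)
                                 for w, m, s in zip(weights, mus, sigmas)])[0]
            for yi in y]

def update_weights(z, k, delta, rng):
    """Gibbs step for w: Dirichlet(delta + n_1, ..., delta + n_k), as in
    Equation (25), sampled via normalized independent Gamma draws."""
    n = [sum(1 for zi in z if zi == j) for j in range(k)]
    g = [rng.gammavariate(delta + nj, 1.0) for nj in n]
    total = sum(g)
    return [gi / total for gi in g]

rng = random.Random(1)
y = [0.1, 0.3, 1.9, 2.2, 2.4]  # toy data (hypothetical)
z = update_allocations(y, [0.5, 0.5], [0.0, 2.0], [0.5, 0.5], rng)
w = update_weights(z, 2, 1.0, rng)
```

In the full sampler these two steps alternate with the updates for µ_j and σ_j in Equations (21) and (22), which have no closed form for the EV-I likelihood and require Metropolis-Hastings steps.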


RJMCMC Move Types for EV-I Mixture Distribution
It was mentioned in Section 3.2 that the Algorithm 1, which moves (a), (b), (c), and (d), can be run in parallel. This section will explain in more detail moves (e) and (f), namely split and combine moves, and birth and death moves.

Split and Combine Moves
For move (e), we choose between split or combine, with probabilities b_k and d_k = 1 − b_k, respectively, depending on k. Note that d_1 = 0 and b_{k_max} = 0, where k_max is the maximum value for k; otherwise, we choose b_k = d_k = 0.5, for k = 2, · · · , k_max − 1. The combining proposal works as follows: choose two components j_1 and j_2, where µ_{j_1} < µ_{j_2} with no other µ_j ∈ [µ_{j_1}, µ_{j_2}]. If these components are combined, we reduce k by 1, which forms a new component j* containing all the observations previously allocated to j_1 and j_2, and then creates values for w_{j*}, µ_{j*}, and σ_{j*} by preserving the first two moments, as follows:

w_{j*} = w_{j_1} + w_{j_2},
w_{j*} µ_{j*} = w_{j_1} µ_{j_1} + w_{j_2} µ_{j_2},
w_{j*} (µ_{j*}² + σ_{j*}²) = w_{j_1} (µ_{j_1}² + σ_{j_1}²) + w_{j_2} (µ_{j_2}² + σ_{j_2}²).   (28)

The splitting proposal works as follows: a random component j* is selected and then split into two new components, j_1 and j_2, with weights and parameters (w_{j_1}, µ_{j_1}, σ_{j_1}) and (w_{j_2}, µ_{j_2}, σ_{j_2}), respectively, conforming to Equation (28). Based on this information, we have three degrees of freedom, so we generate three random numbers u = (u_1, u_2, u_3), where u_1 ∼ Beta(2, 2), u_2 ∼ Beta(2, 2), and u_3 ∼ Beta(1, 1) [16]. The split transformations are then defined as follows:

w_{j_1} = w_{j*} u_1, w_{j_2} = w_{j*} (1 − u_1),
µ_{j_1} = µ_{j*} − u_2 σ_{j*} √(w_{j_2}/w_{j_1}), µ_{j_2} = µ_{j*} + u_2 σ_{j*} √(w_{j_1}/w_{j_2}),
σ_{j_1}² = u_3 (1 − u_2²) σ_{j*}² w_{j*}/w_{j_1}, σ_{j_2}² = (1 − u_3)(1 − u_2²) σ_{j*}² w_{j*}/w_{j_2}.   (29)

Then, we compute the acceptance probabilities of the split and combine moves: min{1, A} and min{1, A⁻¹}, respectively. According to Equation (9), we obtain A as in Equation (30), where k is the number of components before the split, l_1 and l_2 are the numbers of observations proposed to be assigned to j_1 and j_2, B(·, ·) is the beta function, p_alloc is the probability that this particular allocation is made, g_{p,q} is the Beta(p, q) density, the (k + 1)-factor in the second line is the ratio of the order-statistics densities for the location-scale parameters (µ, σ) [16], and the other terms are fully explained in Appendices B and C.
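The moment-matching pair in Equations (28) and (29) follows Richardson and Green [16]; by construction, a split followed by the matching combine recovers the original component exactly. The following sketch (with arbitrary illustrative values of w, µ, σ, and the u's) demonstrates this round trip.

```python
import math

def combine_components(w1, mu1, s1, w2, mu2, s2):
    """Combine move: merge two components while preserving the zeroth,
    first, and second moments (Equation (28))."""
    w = w1 + w2
    mu = (w1 * mu1 + w2 * mu2) / w
    var = (w1 * (mu1 ** 2 + s1 ** 2) + w2 * (mu2 ** 2 + s2 ** 2)) / w - mu ** 2
    return w, mu, math.sqrt(var)

def split_component(w, mu, s, u1, u2, u3):
    """Split move (Equation (29)); in the sampler u1, u2 ~ Beta(2, 2)
    and u3 ~ Beta(1, 1), here fixed values are passed in directly."""
    w1, w2 = w * u1, w * (1.0 - u1)
    mu1 = mu - u2 * s * math.sqrt(w2 / w1)
    mu2 = mu + u2 * s * math.sqrt(w1 / w2)
    s1 = math.sqrt(u3 * (1.0 - u2 ** 2) * s ** 2 * w / w1)
    s2 = math.sqrt((1.0 - u3) * (1.0 - u2 ** 2) * s ** 2 * w / w2)
    return (w1, mu1, s1), (w2, mu2, s2)

# Split then combine: the original (w, mu, s) is recovered.
c1, c2 = split_component(0.5, 1.0, 2.0, 0.3, 0.5, 0.6)
w, mu, s = combine_components(*c1, *c2)
```

Note the split proposal automatically satisfies µ_{j_1} < µ_{j_2}, matching the adjacency condition used when choosing components to combine.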

Birth and Death Moves
The following is an explanation of move (f), namely the birth and death moves. These moves are simpler than the split and combine moves [16]. The first step consists of making a random choice between birth and death, with the same probabilities b_k and d_k as stated above. For birth, the proposed new component has parameters µ_{j*} and σ_{j*}, which are generated from the associated prior distributions shown in Equations (18) and (19), respectively. The weight of the new component, w_{j*}, follows a beta distribution, w_{j*} ∼ Beta(1, k). For the constraint ∑_{j=1}^{k} w_j + w_{j*} = 1 to remain valid, the previous weights w_j for j = 1, 2, · · · , k must be rescaled by multiplying each by 1 − w_{j*}. Therefore, (1 − w_{j*})^k is the determinant of the Jacobian matrix corresponding to the birth move. For the opposite move, namely the death move, we randomly choose any empty component to remove. This step always respects the constraint that the remaining weights are rescaled to sum to 1. The acceptance probabilities of the birth and death moves are min{1, A} and min{1, A⁻¹}, respectively. According to Equation (9), we obtain A as in Equation (31), where k_0 is the number of empty components before the birth, and B(·, ·) is the beta function.
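The birth-move bookkeeping for the weights can be sketched directly; this is an illustrative fragment, not the authors' code, and the input weights are hypothetical.

```python
import random

def birth_move_weights(weights, rng):
    """Birth move: draw w* ~ Beta(1, k) for the new (empty) component and
    rescale the existing k weights by (1 - w*), so all weights still sum
    to one. The Jacobian determinant of this change is (1 - w*)**k."""
    k = len(weights)
    w_new = rng.betavariate(1.0, k)
    rescaled = [w * (1.0 - w_new) for w in weights]
    return rescaled + [w_new], (1.0 - w_new) ** k

rng = random.Random(7)
new_weights, jacobian = birth_move_weights([0.2, 0.3, 0.5], rng)
```

The death move simply inverts this: the removed component's weight is redistributed by dividing the survivors by their sum.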

Simulation Study
In this section, we consider 16 scenarios, namely Weibull mixture distributions with two, three, four, and five components, each generated with samples of 125, 250, 500, and 1000 per component. Detailed descriptions of each scenario are given in Table 1, where the "Parameter of EV-I distribution" column is obtained by transforming the "Parameter of Weibull distribution" column. Table 1. Sixteen scenarios of the Weibull mixture distribution and their transformation into the EV-I mixture distribution.

(Table 1 columns: Scenario; Number of Components; Component; Number of Generated Data; Parameter of Weibull Distribution; Parameter of EV-I Distribution.)

In these scenarios, our specific choices for the hyperparameters were ζ = R, ϑ = 2, g = 0.2, h = 10/R², δ = 1, and k_max = 30, where R and ε are the length and midpoint (median) of the observed data interval, respectively (see Richardson and Green [16]). Based on this selection of hyperparameters, we ran the analysis for 200,000 sweeps, yielding 200,000 sampled values of k, from which we took the most frequently occurring value (the mode) of k. We then replicated this procedure 500 times, giving one modal k per replication; finally, we summarized the 500 modal values of k as percentages. The results of grouping the mixture components can be seen in Table 2, and the parameter estimation results in Table 3. Based on Table 2, each scenario provides a grouping with an accuracy level of at least 95%; the accuracy falls below 100% only when the sample size is 125 per component. Based on Table 3, the estimated parameters are close to their true values for all scenarios. Note: details of the computer and the time required to run the simulation study are given in Appendix D. Besides the grouping results in Table 2 and the parameter estimates in Table 3, we also provide a histogram and predictive density for each scenario: the first to fourth scenarios in Figure 2a–d, the fifth to eighth in Figure 3a–d, the ninth to twelfth in Figure 4a–d, and the thirteenth to sixteenth in Figure 5a–d.
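The sweep-and-replication protocol above (mode of k per replication, then a percentage across replications) can be sketched as follows; the lists of k values are toy stand-ins, not the paper's results.

```python
from collections import Counter

def modal_k(k_draws):
    """Most frequently visited number of components across the sweeps
    of one replication."""
    return Counter(k_draws).most_common(1)[0][0]

def accuracy(modal_ks, true_k):
    """Fraction of replications whose modal k matches the true k."""
    return sum(1 for m in modal_ks if m == true_k) / len(modal_ks)

# Toy illustration: three replications whose sweeps mostly visit k = 2.
modes = [modal_k(draws)
         for draws in ([2, 2, 3, 2], [2, 1, 2, 2], [3, 2, 2, 2])]
```

In the actual study each replication contributes 200,000 draws of k rather than four, and 500 replications are summarized.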

Misspecification Cases
For the simulation study, we provided 16 scenarios to validate our proposed algorithm. In this section, we do the opposite: we intentionally generate data that are not derived from the Weibull distribution and then analyze them using our proposed algorithm. We generate two datasets from different distributions. The first dataset is taken from a double-exponential distribution with location and scale parameters of 0 and 1, respectively, and the second dataset is taken from a logistic distribution with location and scale parameters of 2 and 0.4, respectively. Each of these datasets contains 1000 data points.
We used the EV-I mixture and Gaussian mixture distributions to analyze the data described above. We used the same hyperparameters as in the simulation study section because the double-exponential and logistic distributions both have location and scale parameters. To compare the performance of the EV-I mixture and Gaussian mixture distributions, we used the Kullback–Leibler divergence; a complete explanation and formula for the Kullback–Leibler divergence (KLD) can be found in Van Erven and Harremos [73]. In the simple case, a KLD value of 0 indicates that the true and fitted densities carry identical information. Thus, the smaller the KLD value, the closer the fitted density is to the true one.
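For densities known only pointwise, the KLD can be approximated by discretizing the integral KL(p‖q) = ∫ p(x) log(p(x)/q(x)) dx. The sketch below is a generic numerical illustration (checked against the closed form for two unit-variance Gaussians), not the comparison procedure used for the tables.

```python
import math

def kld(p, q, lo, hi, n=200_000):
    """Midpoint-rule approximation of KL(p || q) on the interval [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        px, qx = p(x), q(x)
        if px > 0.0 and qx > 0.0:
            total += px * math.log(px / qx) * dx
    return total

# Sanity checks: KL(p || p) = 0, and for unit-variance Gaussians
# KL(N(0,1) || N(m,1)) = m**2 / 2.
norm = lambda m: (lambda x: math.exp(-0.5 * (x - m) ** 2)
                  / math.sqrt(2.0 * math.pi))
d_self = kld(norm(0.0), norm(0.0), -10.0, 10.0)
d_shift = kld(norm(0.0), norm(1.0), -10.0, 11.0)
```

The interval must be wide enough that the truncated tail mass of p is negligible, otherwise the approximation is biased downward.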
The posterior distribution of k for the misspecification-case data can be seen in Table 4. Based on Table 4, the data we generated are detected as multimodal, even though they were generated as unimodal. Table 5 then shows the comparison of the KLD for the EV-I mixture and Gaussian mixture distributions. Based on Table 5, it can be concluded that the EV-I mixture distribution covers these data better than the Gaussian mixture distribution. Table 4. Posterior distribution of k for misspecification-case data based on the mixture model using the EV-I distribution. Table 5. Comparison of Kullback–Leibler divergence using EV-I mixture and Gaussian mixture distributions for misspecification-case data.


Enzyme, Acidity, and Galaxy Datasets
In this section, we analyze the enzyme, acidity, and galaxy datasets, as in Richardson and Green [16] (these three datasets can be obtained from https://people.maths.bris.ac.uk/~mapjg/mixdata, accessed on 22 March 2021). We analyzed the datasets using the EV-I mixture distribution, with the hyperparameters for the enzyme data being R = 2.86, ε = 1.45, ζ = 2.86, ϑ = 2, g = 0.2, h = 1.22, δ = 1; for the acidity data, R = 4.18, ε = 5.02, ζ = 4.18, ϑ = 2, g = 0.2, h = 0.573, δ = 1; and for the galaxy data, R = 25.11, ε = 21.73, ζ = 25.11, ϑ = 2, g = 0.2, h = 0.016, δ = 1; for all three datasets, we used k_max = 30. The provisions for selecting these hyperparameters are explained in Section 5. The posterior distribution of k for all three datasets can be seen in Table 6. Then, we compared the predictive densities of the enzyme, acidity, and galaxy datasets using the EV-I mixture and Gaussian mixture distributions, as shown in Figures 6–8. Visually, based on Figures 6–8, the EV-I mixture distribution has better coverage than the Gaussian mixture distribution. The KLD comparison in Table 7 likewise shows that the EV-I mixture distribution covers the data better than the Gaussian mixture distribution. Table 6. Posterior distribution of k for enzyme, acidity, and galaxy datasets based on the mixture model using the EV-I distribution.

Dengue Hemorrhagic Fever (DHF) in Eastern Surabaya, East Java, Indonesia
In this section, we apply the EV-I mixture distribution using RJMCMC to a real dataset. These data are the time until patient recovery from dengue hemorrhagic fever (DHF). We obtained the secondary data from medical records from Dr. Soetomo Hospital, Surabaya, East Java, Indonesia. The data concern patients in eastern Surabaya, which consists of seven subdistricts. Our data consist of 21 cases, with each case widespread over each subdistrict. The histogram of the spread of DHF in each subdistrict can be seen in the research conducted by Rantini et al. [36]. It was explained in their study that the data have a Weibull distribution. Whether the data are multimodal or not is unknown. The histogram of our original data is shown in Figure 9a.

To determine the number of mixture components in our data, we applied the NG-RJMCMC algorithm. The first step was to transform the original data into a location-scale family, as shown in Figure 9b. Then, for our transformed data, we used the hyperparameters R = 1.9459, ε = 1.3863, ζ = 1.9459, ϑ = 2, g = 0.2, h = 2.6409, δ = 1, and k_max = 30. We then ran all six move types on the data with 200,000 sweeps. The results of the grouping are shown in Table 8. Based on Table 8, the DHF data in eastern Surabaya have a multimodal pattern, with the highest probability of having four components. Table 8. Summary of the results of grouping for DHF data in eastern Surabaya using the EV-I mixture distribution.
Using the EV-I mixture distribution with four components, the results of the parameter estimation for each component are shown in Table 9. The membership label of each observation in each mixture component is shown in Figure 10. Finally, we compared the four-component EV-I mixture distribution with the four-component Gaussian mixture distribution, as shown in Figure 11. According to Table 10 and Figure 11, our data are better covered by the four-component EV-I mixture distribution. Table 10. Kullback–Leibler divergence using EV-I mixture and Gaussian mixture distributions for the DHF data in eastern Surabaya.

Conclusions
We provided an algorithm in the Bayesian mixture analysis. We called it non-Gaussian reversible jump Markov chain Monte Carlo (NG-RJMCMC). Our algorithm is a modification of RJMCMC, where there is a difference in the initial steps, namely changing the original distribution into a location-scale family. This step facilitates the grouping of each observation into the mixture components. Our algorithm allows researchers to easily analyze data that are not from the Gaussian family. In our study, we used Weibull distribution, then transformed it into the EV-I distribution.
To validate our algorithm, we performed 16 scenarios for the EV-I mixture distribution simulation study. The first to fourth scenarios had two components, the fifth to eighth scenarios had three components, the ninth to twelfth scenarios had four components, and the thirteenth to sixteenth scenarios had five components. We generated data in different sizes, ranging from 125 to 1000 samples per mixture component. Next, we analyzed them using a Bayesian analysis with the appropriate prior distributions. We used 200,000 sweeps per scenario and replicated them 500 times. The results of this simulation indicate that each scenario provides a minimum level of accuracy of 95%. Moreover, the estimated parameters come close to the real parameters for all scenarios.
To strengthen the proposed method, we provided misspecification cases. We deliberately generated unimodal data with double-exponential and logistic distributions, then estimated them using the EV-I mixture distribution and Gaussian mixture distribution. The results indicated that the data we generated are multimodally detected. Based on the KLD, the EV-I mixture distribution has better coverage than the Gaussian mixture distribution.
We also implemented our algorithm for real datasets, namely enzyme, acidity, and galaxy datasets. We compared the EV-I mixture distribution with the Gaussian mixture distribution for all three datasets. Based on the KLD, we found that the EV-I mixture distribution has better coverage than the Gaussian mixture distribution. Then, visually, the results also show that the EV-I mixture distribution has better coverage. We also compared the EV-I mixture distribution with the Gaussian mixture distribution for the DHF data in eastern Surabaya. In our previous research, we analyzed the data using the Weibull distribution. We do not know whether the data were identified as multimodal or not. Using our algorithm, we found that the data are multimodal with four components. We also compared the EV-I mixture distribution and the Gaussian mixture distribution. Again, the EV-I mixture distribution indicated better coverage, seen both through the KLD and visually.
Author Contributions: D.R., N.I., and I. designed the research; D.R. collected and analyzed the data and drafted the paper. All authors have critically read and revised the draft and approved the final paper. All authors have read and agreed to the published version of the manuscript.
Acknowledgments: The authors thank the referees for their helpful comments.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Let T ∼ Weibull(η, λ) with the CDF

F_T(t) = 1 − exp{−(t/λ)^η}, t > 0.

Define a new variable Y = ln T; its CDF is

F_Y(y) = Pr(ln T ≤ y) = F_T(e^y) = 1 − exp{−(e^y/λ)^η} = 1 − exp{−exp((y − ln λ)/(1/η))}, y ∈ ℝ,

and its p.d.f. is

f_Y(y) = (1/σ) exp{(y − µ)/σ − exp((y − µ)/σ)}, with µ = ln λ and σ = 1/η.

Based on this CDF and p.d.f., it can be seen that Y ∼ EV-I(µ, σ), where µ = ln λ and σ = 1/η. The appropriate support is then as follows: y ∈ ℝ, µ ∈ ℝ, and σ > 0.