Article

Robust Mixture Modeling Based on Two-Piece Scale Mixtures of Normal Family

by Mohsen Maleki 1, Javier E. Contreras-Reyes 2,* and Mohammad R. Mahmoudi 3
1 Department of Statistics, College of Sciences, Shiraz University, Shiraz 71946 85115, Iran
2 Departamento de Estadística, Facultad de Ciencias, Universidad del Bío-Bío, Concepción 4081112, Chile
3 Department of Statistics, Faculty of Science, Fasa University, Fasa 74616 86131, Iran
* Author to whom correspondence should be addressed.
Axioms 2019, 8(2), 38; https://doi.org/10.3390/axioms8020038
Submission received: 18 February 2019 / Revised: 15 March 2019 / Accepted: 27 March 2019 / Published: 1 April 2019

Abstract

In this paper, we examine the finite mixture (FM) model with a flexible class of two-piece distributions based on the scale mixtures of normal (TP-SMN) family as components. This family allows a robust estimation of FM models. The TP-SMN is a rich class of distributions that covers symmetric/asymmetric and light/heavy-tailed distributions. It represents an alternative to the well-known scale mixtures of skew-normal (SMSN) family studied by Branco and Dey (2001). The TP-SMN covers the SMN distributions (normal, t, slash, and contaminated normal) as its symmetric members and their two-piece versions as its asymmetric members. A key feature of this study is the use of a suitable hierarchical representation of the family to obtain maximum likelihood estimates of the model parameters via an EM-type algorithm. The performance of the proposed robust model is demonstrated using simulated and real data and compared to that of other finite mixtures of SMSN models.

1. Introduction

Finite mixture models are in high demand in machine-learning analysis due to their properties, computational tractability, and because they provide a good approximation for continuous densities [1]. They are also an important statistical tool for many applications in clustering, discriminant analysis, image processing and satellite imaging [2]. Beyond the results already available for the finite mixture of normal distributions (FM-NOR) model in the literature [1], recent developments cover symmetric/asymmetric and light/heavy-tailed distributions. One of these is the class of finite mixture of multivariate skew-normal (FM-SN) models [3,4], which provides some advantages over normal mixtures: normal components allow an arbitrarily close approximation of any distribution by increasing the number of components but, in the context of supervised learning, groups of observations represented by asymmetrically distributed data can lead to wrong classifications. The components of skew-normal mixture models, in contrast, capture skewness due to their flexibility [1]. In addition, a robust extension of the FM-SN model to the finite mixture of skew-t (FM-ST) has been developed in the influential works of [3,5,6,7]. The FM-ST components capture both skewness and extreme observations due to their flexibility [8].
The SMSN family is a rich and very flexible class of distributions which covers light/heavy-tailed distributions, e.g., the skew-normal (SN), skew-t (ST), skew-slash (SSL) and skew contaminated-normal (SCN) distributions, and has been widely considered in many statistical models, especially FM models (see e.g., [5,9,10,11,12,13,14,15]). The SMSN family is a skewed extension of the well-known symmetric scale mixtures of normal (SMN) family, which contains the light/heavy-tailed members: the normal (N), t (T), slash (SL) and contaminated-normal (CN) distributions [16]. Lange et al. [17], Lange and Sinsheimer [18], and Maleki and Nematollahi [19] used the SMN family in applications of robust statistical modeling. A two-piece distribution based on symmetric distributions with different scales is an alternative approach to model atypical data (see e.g., [10,20,21,22,23,24]). In our approach, we use two-piece distributions based on the SMN family. This family, called the two-piece scale mixtures of normal (TP-SMN) family and an analog of the SMSN family, contains the light/heavy-tailed members: the two-piece normal (TP-N), two-piece t (TP-T), two-piece slash (TP-SL) and two-piece contaminated-normal (TP-CN) distributions.
In this paper, we consider the TP-SMN family of distributions as a two-component mixture of truncated SMN distributions on a special two-set partition of the real line (ℝ), and then propose the finite mixture of this family, called the FM-TP-SMN model. It represents an alternative family to the well-known scale mixtures of skew normal (SMSN) family studied by [25]. We also use a hierarchical representation of the FM-TP-SMN model and implement an expectation-maximization (EM)-type algorithm for finding the maximum likelihood (ML) estimates of the proposed model. Studies [21,23] show that truncating the distribution over the two partitions makes it possible to obtain a better fit to the empirical distribution, because the underlying process of the complete likelihood is modeled. In this way, the "two-piece" modeling is a direct competitor of the FM-SMSN family of distributions [21].
The rest of this paper is organized as follows. In Section 2, we review some main properties of the TP-SMN family and represent this family as a two-component mixture of the truncated SMN distributions. In Section 3, the FM-TP-SMN model is introduced and the ML estimates of the proposed model parameters via an EM-type algorithm are provided. In Section 4, numerical studies with an application of the proposed models and estimates are considered. Some conclusions and ideas for future research are offered in Section 5.

2. The Two-Piece Scale Mixtures of Normal Distributions

In this section, we analyze some necessary properties of the TP-SMN family of distributions for our proposed FM model.
The well-known SMN family introduced by [16] (the basis of the robust asymmetric TP-SMN family) has the following probability density function (PDF) and stochastic representation. Let X ∼ SMN(μ, σ, ν); then its PDF is
f_{\mathrm{SMN}}(x \mid \mu, \sigma, \nu) = \int_0^{\infty} \phi(x \mid \mu, u^{-1}\sigma^2) \, dH(u \mid \nu), \quad x \in \mathbb{R},
and its stochastic representation is
X = \mu + \sigma U^{-1/2} W,
where ϕ ( · | μ , σ 2 ) represents the density of N ( μ , σ 2 ) distribution, H ( · | ν ) is the cumulative distribution function (CDF) of the scale mixing random variable U, which can be indexed by a scalar or vector of parameters ν , and W is a standard normal random variable that is independent of U.
The TP-SMN is a rich family of distributions that covers the asymmetric light-tailed TP-N (also called the epsilon-skew-normal; [26]), the asymmetric heavy-tailed TP-T, TP-SL and TP-CN distributions, and their corresponding symmetric members. Note that the symmetric members of the TP-SMN and SMSN classes are the SMN family. In terms of density, for y ∈ ℝ this family can be represented as
g(y \mid \mu, \sigma, \gamma, \nu) = \begin{cases} 2(1-\gamma) \, f_{\mathrm{SMN}}(y \mid \mu, \sigma(1-\gamma), \nu), & y \le \mu, \\ 2\gamma \, f_{\mathrm{SMN}}(y \mid \mu, \sigma\gamma, \nu), & y > \mu, \end{cases}
where 0 < γ < 1 is the slant parameter and f_{\mathrm{SMN}}(\cdot \mid \mu, \sigma, \nu) is given by (1). This is denoted by Y \sim \mathrm{TP\text{-}SMN}(\mu, \sigma, \gamma, \nu), with E(Y) = \mu - b\sigma(1 - 2\gamma) and \mathrm{Var}(Y) = \sigma^2 [c_2 k_2(\nu) - b^2 c_1^2], for which b = \sqrt{2/\pi} \, k_1(\nu), c_r = \gamma^{r+1} + (-1)^r (1-\gamma)^{r+1}, k_r(\nu) = E(U^{-r/2}), and U is the scale mixing random variable in (2).
Different TP-SMN members in (3) are obtained from different distributions for the scale mixing random variable U in (2), as follows:
  • Two-piece normal (TP-N): U = 1 with probability one,
  • Two-piece t (TP-T): U G a m m a ( ν / 2 , ν / 2 ) , i.e., ν = ν ,
  • Two-piece slash (TP-SL): U B e t a ( ν , 1 ) , i.e., ν = ν ,
  • Two-piece contaminated normal (TP-CN): h(u | ν) = ν I(u = τ) + (1 − ν) I(u = 1), i.e., ν = (ν, τ).
For more details and statistical properties of the TP-SMN family, see [20,23].
Further, the two-piece distributions can be represented as a two-component mixture with separated supports, i.e., left and right half basic distributions ([20], Equation (4)). In particular, when Y ∼ TP-SMN(μ, σ, γ, ν) with the PDF given in (3), it is a two-component mixture of left and right half SMN distributions with special component probabilities, as follows:
g(y \mid \mu, \sigma_1, \sigma_2, \nu) = \frac{2\sigma_1}{\sigma_1+\sigma_2} f_{\mathrm{SMN}}(y \mid \mu, \sigma_1, \nu) \, I_{(-\infty,\mu]}(y) + \frac{2\sigma_2}{\sigma_1+\sigma_2} f_{\mathrm{SMN}}(y \mid \mu, \sigma_2, \nu) \, I_{(\mu,+\infty)}(y).
Note in Equation (4) that the scale parameter σ and the slant parameter γ of Equation (3) are recovered as σ = σ₁ + σ₂ and γ = σ₂/(σ₁ + σ₂).
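To make the construction concrete, the following R sketch evaluates the TP-SMN density for the TP-N and TP-T members in the (μ, σ, γ) parameterization of (3); the function and argument names are illustrative (not the authors' code), and the (σ₁, σ₂) parameterization of (4) is recovered via σ₁ = σ(1 − γ) and σ₂ = σγ.

```r
# Illustrative sketch: TP-SMN density for the TP-N and TP-T members,
# written in the (mu, sigma, gamma) parameterization.
dtpsmn <- function(y, mu, sigma, gamma, nu = NULL, family = c("TP-N", "TP-T")) {
  family <- match.arg(family)
  f_smn <- function(y, mu, s) {            # f_SMN(y | mu, s, nu)
    if (family == "TP-N") dnorm(y, mean = mu, sd = s)
    else dt((y - mu) / s, df = nu) / s     # TP-T: scaled Student-t kernel
  }
  ifelse(y <= mu,
         2 * (1 - gamma) * f_smn(y, mu, sigma * (1 - gamma)),   # left piece, scale sigma*(1 - gamma)
         2 * gamma       * f_smn(y, mu, sigma * gamma))         # right piece, scale sigma*gamma
}

# Sanity check: the density integrates to one.
integrate(dtpsmn, -Inf, Inf, mu = 1, sigma = 2, gamma = 0.3, nu = 4,
          family = "TP-T")$value
```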
By using auxiliary (latent) variables S j , j = 1 , 2 ; in terms of the components of the mixture in Equation (4), the TP-SMN random variable can have the following stochastic representation
Y \mid S_1 = 1 \sim \mathrm{SMN}(\mu, \sigma_1, \nu) \, I_A(y), \qquad Y \mid S_2 = 1 \sim \mathrm{SMN}(\mu, \sigma_2, \nu) \, I_{A^c}(y),
where A = (−∞, μ] and SMN(·) I_A(·) denotes the truncated SMN distribution on the interval A, and S = (S₁, S₂) has a multinomial distribution with the following probability mass function (PMF):
P(S = s) = \left( \frac{\sigma_1}{\sigma_1+\sigma_2} \right)^{s_1} \left( \frac{\sigma_2}{\sigma_1+\sigma_2} \right)^{s_2}, \quad s_1, s_2 = 0, 1,
and is denoted by S ∼ M(1, σ₁/(σ₁ + σ₂), σ₂/(σ₁ + σ₂)). Note that each component label is a Bernoulli random variable, S_k ∼ Binomial(1, σ_k/(σ₁ + σ₂)), k = 1, 2, such that S₁ + S₂ = 1.
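This representation also gives a direct simulation recipe: draw the side indicator S with the probabilities above, then a half SMN variate on the corresponding support. A minimal R sketch (illustrative names; TP-N and TP-T members only):

```r
# Illustrative sketch: draw from TP-SMN(mu, sigma1, sigma2, nu) via the
# two-component mixture representation with separated supports.
rtpsmn <- function(n, mu, sigma1, sigma2, nu = NULL, family = c("TP-N", "TP-T")) {
  family <- match.arg(family)
  # U^{-1/2} factor: U = 1 for TP-N, U ~ Gamma(nu/2, nu/2) for TP-T
  u_half <- if (family == "TP-N") rep(1, n) else 1 / sqrt(rgamma(n, nu / 2, nu / 2))
  s1 <- rbinom(n, 1, sigma1 / (sigma1 + sigma2))   # S_1 = 1: left component
  w  <- abs(rnorm(n))                              # |W|, half standard normal
  ifelse(s1 == 1,
         mu - sigma1 * u_half * w,                 # left half SMN on (-Inf, mu]
         mu + sigma2 * u_half * w)                 # right half SMN on (mu, +Inf)
}

set.seed(1)
y <- rtpsmn(10000, mu = 0, sigma1 = 1, sigma2 = 3, nu = 4, family = "TP-T")
mean(y > 0)   # approximately sigma2 / (sigma1 + sigma2) = 0.75
```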

3. Finite Mixtures of TP-SMN Distributions

In this section, we introduce the finite mixture of TP-SMN (FM-TP-SMN) model and obtain the ML estimates of this model’s parameters.

3.1. FM-TP-SMN Model

Here, we consider a distribution represented as a g-component mixture of TP-SMN distributions. In terms of density, this mixture distribution is characterized by the following density:
f(y \mid \Theta) = \sum_{j=1}^{g} \pi_j \, g(y \mid \mu_j, \sigma_j, \gamma_j, \nu_j), \quad y \in \mathbb{R},
where Θ = (π₁, …, π_g, μ₁, …, μ_g, σ₁, …, σ_g, γ₁, …, γ_g, ν₁, …, ν_g), in which π = (π₁, …, π_g) with π_j > 0, j = 1, …, g, and Σ_{j=1}^{g} π_j = 1, and, for j = 1, …, g, g(y | μ_j, σ_j, γ_j, ν_j) is a TP-SMN(μ_j, σ_j, γ_j, ν_j) component density as defined in (3). Also, we write Y ∼ FM-TP-SMN(Θ) to say that a random variable Y has an FM-TP-SMN distribution as defined by (7).
Concerning the parameter ν_j of the mixing distribution H(· | ν_j), for j = 1, …, g, it is worth noting that it can be a vector of parameters, e.g., for the contaminated normal distribution. Thus, for computational convenience, we assume that ν₁ = … = ν_g = ν (see also [5]).
In terms of the components of the mixtures, Equation (7) can be equivalently obtained by
Y \mid Z_j = 1 \sim \mathrm{TP\text{-}SMN}(\mu_j, \sigma_j, \gamma_j, \nu), \quad j = 1, \ldots, g,
where Z = (Z₁, …, Z_g) ∼ Multinomial(1, π₁, …, π_g) is a multinomial (component-label) vector with probability mass function P(Z₁ = z₁, …, Z_g = z_g) = π₁^{z₁} π₂^{z₂} ⋯ π_g^{z_g}, z_j = 0, 1, j = 1, …, g, Σ_{j=1}^{g} z_j = 1.
Since only one component of Z can be equal to one (the remaining ones are zero), the events {Z_j = 1} and {Z_j = 1, Z_r = 0; j ≠ r} are equivalent, indicating that the distribution of Y corresponds to the j-th component of the mixture; for further details, see e.g., [1].
Remark 1.
Let Y ∼ FM-TP-SMN(Θ); then the mean and variance of Y are, respectively, given by
E[Y] = \sum_{j=1}^{g} \pi_j \, E[Y \mid Z_j = 1] = \sum_{j=1}^{g} \pi_j \, E[X_j],
and
\mathrm{Var}[Y] = \sum_{j=1}^{g} \pi_j \left\{ \mathrm{Var}[Y \mid Z_j = 1] + \left( E[Y \mid Z_j = 1] - E[Y] \right)^2 \right\} = \sum_{j=1}^{g} \pi_j \left\{ \mathrm{Var}[X_j] + \left( E[X_j] - \bar{\mu} \right)^2 \right\},
where X_j \sim \mathrm{TP\text{-}SMN}(\mu_j, \sigma_j, \gamma_j, \nu), j = 1, \ldots, g, and \bar{\mu} = \sum_{j=1}^{g} \pi_j E[X_j] (see e.g., [2]).
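For illustration, a small R sketch (illustrative helper name) evaluates these two expressions for TP-N components, for which k₁(ν) = k₂(ν) = 1 in the moment formulas of Section 2; the parameter values in the call below are those of the clustering simulation of Section 4.

```r
# Illustrative sketch: mean and variance of an FM-TP-SMN mixture (Remark 1)
# with TP-N components, using E[X_j] = mu_j - b*sigma_j*(1 - 2*gamma_j) and
# Var[X_j] = sigma_j^2 * (c2 - b^2 * c1^2), with b = sqrt(2/pi).
fm_tpn_moments <- function(p, mu, sigma, gamma) {
  b  <- sqrt(2 / pi)
  c1 <- gamma^2 - (1 - gamma)^2              # c_r = gamma^(r+1) + (-1)^r (1-gamma)^(r+1)
  c2 <- gamma^3 + (1 - gamma)^3
  m  <- mu - b * sigma * (1 - 2 * gamma)     # component means E[X_j]
  v  <- sigma^2 * (c2 - b^2 * c1^2)          # component variances Var[X_j]
  mean_y <- sum(p * m)
  c(mean = mean_y, var = sum(p * (v + (m - mean_y)^2)))
}

fm_tpn_moments(p = c(0.75, 0.25), mu = c(10, 15), sigma = c(2, 4),
               gamma = c(0.1, 0.7))
```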
The FM-TP-SMN densities in (7) form an extremely flexible class which includes the finite mixtures of SMN densities as a special case, obtained when γ_j = 1/2, j = 1, …, g.
For an i.i.d. sample Y = (Y₁, …, Y_n) from the PDF in (7), the log-likelihood function is
\ell(\Theta \mid y) = \sum_{i=1}^{n} \log \sum_{j=1}^{g} \pi_j \, g(y_i \mid \mu_j, \sigma_j, \gamma_j, \nu).
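A short R sketch of the mixture density (7) and the log-likelihood (9), reusing the dtpsmn() helper sketched in Section 2 (names are illustrative):

```r
# Illustrative sketch: FM-TP-SMN density and observed-data log-likelihood,
# with mixing weights p, component parameters mu, sigma, gamma and a common nu.
dfm_tpsmn <- function(y, p, mu, sigma, gamma, nu = NULL, family = "TP-T") {
  dens <- vapply(seq_along(p), function(j)
    p[j] * dtpsmn(y, mu[j], sigma[j], gamma[j], nu, family),
    numeric(length(y)))
  rowSums(matrix(dens, nrow = length(y)))
}
loglik_fm <- function(y, p, mu, sigma, gamma, nu = NULL, family = "TP-T")
  sum(log(dfm_tpsmn(y, p, mu, sigma, gamma, nu, family)))
```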

3.2. ML Estimates of Model Parameters

We can use latent indicator (allocation) variables Z_i = (Z_{i1}, …, Z_{ig}), i = 1, …, n, to assign observations to the different components of the mixture (j = 1, …, g), so that, in terms of Z_{ij}, we can write
Y_i \mid Z_{ij} = 1 \overset{\mathrm{ind.}}{\sim} \mathrm{TP\text{-}SMN}(\mu_j, \sigma_j, \gamma_j, \nu), \qquad P(Z_{ij} = 1) = \pi_j; \quad i = 1, \ldots, n, \; j = 1, \ldots, g,
and so using Equations (2) and (5) with S i j = ( S i j 1 , S i j 2 ) , i = 1 , , n , we have that
Y_i \mid U_{ij}, Z_{ij} = 1, S_{ijk} = 1 \overset{\mathrm{ind.}}{\sim} N(\mu_j, u_{ij}^{-1}\sigma_{kj}^2) \, I_{A_j}(y_i)^{2-k} I_{A_j^c}(y_i)^{k-1},
U_{ij} \mid Z_{ij} = 1, S_{ijk} = 1 \overset{\mathrm{ind.}}{\sim} H(u_{ij} \mid \nu),
S_{ij} \mid Z_{ij} = 1 \overset{\mathrm{iid.}}{\sim} M\!\left(1, \frac{\sigma_{1j}}{\sigma_{1j}+\sigma_{2j}}, \frac{\sigma_{2j}}{\sigma_{1j}+\sigma_{2j}}\right),
Z_i \overset{\mathrm{iid.}}{\sim} M(1, \pi_1, \ldots, \pi_g),
for i = 1 , , n ; j = 1 , , g ; k = 1 , 2 , A j = ( , μ j ] and N ( · ) I A ( · ) denotes the truncated normal distribution on the interval A.
The above hierarchical representation of the FM-TP-SMN model will be used to obtain the ML estimates via an ECME-algorithm. This algorithm is a generalization of the ECM-algorithm introduced by [27], which is an extension of the EM-algorithm [28]. It can be obtained by replacing some CM-steps, which maximize the constrained expected complete-data log-likelihood function, with steps that maximize the corresponding constrained actual likelihood function. As [27,29] indicated, the joint ML estimates obtained by ECME-algorithms are much more efficient than other EM-type algorithms.
Let C = {Y, S, Z} denote the complete data, where Y = (Y₁, …, Y_n) is the observed sample, and S = (S₁, …, S_n) and Z = (Z₁, …, Z_n) are the latent (unobserved) variables from the FM-TP-SMN model with parameter vector Θ = (π₁, …, π_g, μ₁, …, μ_g, σ_{11}, …, σ_{1g}, σ_{21}, …, σ_{2g}, ν). Considering the hierarchical representation (10), the complete (augmented) likelihood function is given by
L_C(\Theta) = \prod_{i=1}^{n} \prod_{j=1}^{g} \prod_{k=1}^{2} \left[ \phi(y_i \mid \mu_j, u_{ijk}^{-1}\sigma_{kj}^2) \, h(u_{ijk} \mid \nu) \, p(z_i \mid \pi_1, \ldots, \pi_g) \, p(s_{ij} \mid \sigma_{1j}, \sigma_{2j}) \, I_{A_j}(y_i)^{2-k} I_{A_j^c}(y_i)^{k-1} \right]^{z_{ij} s_{ijk}},
where A_j = (−∞, μ_j]. After ignoring constants, and using the auxiliary (latent) variables, the complete log-likelihood function takes the form:
\ell_c(\Theta) = -\sum_{i=1}^{n} \sum_{j=1}^{g} z_{ij} \log(\sigma_{1j}+\sigma_{2j}) - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{g} \sum_{k=1}^{2} z_{ij} s_{ijk} \frac{u_{ijk}}{\sigma_{kj}^2} (y_i - \mu_j)^2 + \sum_{i=1}^{n} \sum_{j=1}^{g} \sum_{k=1}^{2} z_{ij} s_{ijk} \log h(u_{ijk} \mid \nu).
The quantities ẑ_{ij} = E[Z_{ij} | Θ̂, y_i], ŝ_{ijk} = E[S_{ijk} | Θ̂, y_i, Z_{ij} = 1] and ŵ_{ijk} = E[Z_{ij} S_{ijk} U_{ijk} | Θ̂, y_i] must be computed; using known properties of conditional expectation and the PDF in (4), we obtain ŵ_{ijk} = ẑ_{ij} ŝ_{ijk} κ̂_{ijk}, where κ̂_{ijk} = E[U_{ijk} | Θ̂, y_i, Z_{ij} = 1, S_{ijk} = 1], for i = 1, …, n, j = 1, …, g, k = 1, 2, and
\hat{z}_{ij} = \frac{\hat{\pi}_j \, g(y_i \mid \hat{\mu}_j, \hat{\sigma}_{1j}, \hat{\sigma}_{2j}, \hat{\nu})}{\sum_{l=1}^{g} \hat{\pi}_l \, g(y_i \mid \hat{\mu}_l, \hat{\sigma}_{1l}, \hat{\sigma}_{2l}, \hat{\nu})}; \quad i = 1, \ldots, n, \; j = 1, \ldots, g,
\hat{s}_{ij1} = \frac{\frac{2\hat{\sigma}_{1j}}{\hat{\sigma}_{1j}+\hat{\sigma}_{2j}} f_{\mathrm{SMN}}(y_i \mid \hat{\mu}_j, \hat{\sigma}_{1j}, \hat{\nu}) \, I_{(-\infty, \hat{\mu}_j]}(y_i)}{g(y_i \mid \hat{\mu}_j, \hat{\sigma}_{1j}, \hat{\sigma}_{2j}, \hat{\nu})} = I_{(-\infty, \hat{\mu}_j]}(y_i),
where g(· | ·) is the TP-SMN PDF defined in Equation (4) and ŝ_{ij2} = 1 − ŝ_{ij1}. The conditional expectations κ̂_{ijk} for the TP-SMN distribution members are given by:
  • Two-piece normal (TP-N): κ ^ i j k = 1 ,
  • Two-piece t (TP-T): \hat{\kappa}_{ijk} = \frac{\hat{\nu}+1}{\hat{\nu}+d_{ijk}},
  • Two-piece slash (TP-SL): \hat{\kappa}_{ijk} = \frac{2\hat{\nu}+1}{d_{ijk}} \cdot \frac{P_1(\hat{\nu}+3/2, \, d_{ijk}/2)}{P_1(\hat{\nu}+1/2, \, d_{ijk}/2)},
  • Two-piece contaminated normal (TP-CN): \hat{\kappa}_{ijk} = \frac{\hat{\tau}^{3/2}\hat{\nu}\, e^{-\hat{\tau} d_{ijk}/2} + (1-\hat{\nu})\, e^{-d_{ijk}/2}}{\hat{\tau}^{1/2}\hat{\nu}\, e^{-\hat{\tau} d_{ijk}/2} + (1-\hat{\nu})\, e^{-d_{ijk}/2}},
where d_{ijk} = \left( \frac{y_i - \hat{\mu}_j}{\hat{\sigma}_{kj}} \right)^2 and P_x(a, b) denotes the distribution function of the Gamma(a, b) distribution evaluated at x.
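To show how these quantities are combined in practice, the following R sketch (illustrative names; TP-T member only, with dtpsmn() as sketched in Section 2) computes ẑ_ij, ŝ_ij1 and the working weights ŵ_ijk = ẑ_ij ŝ_ijk κ̂_ijk for a sample vector y:

```r
# Illustrative sketch: E-step quantities for a TP-T mixture in the
# (mu_j, sigma_1j, sigma_2j) parameterization.
estep_tpt <- function(y, p, mu, sig1, sig2, nu) {
  g <- length(p)
  dens <- sapply(1:g, function(j)
    dtpsmn(y, mu[j], sig1[j] + sig2[j], sig2[j] / (sig1[j] + sig2[j]),
           nu, family = "TP-T"))                    # n x g component densities
  zhat <- sweep(dens, 2, p, "*")
  zhat <- zhat / rowSums(zhat)                      # posterior probabilities z_hat_ij
  s1hat <- sapply(1:g, function(j) as.numeric(y <= mu[j]))        # side indicator s_hat_ij1
  sigk  <- sapply(1:g, function(j) ifelse(y <= mu[j], sig1[j], sig2[j]))
  d     <- sapply(1:g, function(j) ((y - mu[j]) / sigk[, j])^2)   # d_ijk on the active side
  kappa <- (nu + 1) / (nu + d)                      # TP-T conditional expectation kappa_hat
  # w_hat_ijk = z_hat_ij * s_hat_ijk * kappa_hat_ijk; only the active side is nonzero
  list(zhat = zhat, s1hat = s1hat, what = zhat * kappa)
}
```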
Now, the expectation step (E-step) at the (r + 1)-th iteration of the ECME-algorithm requires the calculation of Q(\Theta \mid \hat{\Theta}^{(r)}) = E[\ell_c(\Theta) \mid \hat{\Theta}^{(r)}, y]. So,
E-step.
Q(\Theta \mid \hat{\Theta}^{(r)}) = -\sum_{i=1}^{n} \sum_{j=1}^{g} \hat{z}_{ij}^{(r)} \log(\sigma_{1j}+\sigma_{2j}) - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{g} \sum_{k=1}^{2} \frac{\hat{w}_{ijk}^{(r)}}{\sigma_{kj}^2} (y_i - \mu_j)^2 + \sum_{i=1}^{n} \sum_{j=1}^{g} \sum_{k=1}^{2} E\left[ Z_{ij} S_{ijk} \log h(U_{ijk} \mid \nu) \mid \hat{\Theta}^{(r)}, y_i \right].
For the conditionally maximizing steps (CM-steps) at the ( r + 1 ) -th iteration of the ECME-algorithm we have:
CM-steps.
Update π j , j = 1 , , g , as:
\pi_j^{(r+1)} = \frac{1}{n} \sum_{i=1}^{n} \hat{z}_{ij}^{(r)}.
Update μ j , j = 1 , , g , as:
\mu_j^{(r+1)} = \frac{\sum_{i=1}^{n} \hat{\alpha}_{ij}^{(r)} y_i}{\sum_{i=1}^{n} \hat{\alpha}_{ij}^{(r)}},
where α ^ i j ( r ) = k = 1 2 w ^ i j k ( r ) / σ k j 2 ( r ) .
Update σ_{kj}^{(r+1)}, k = 1, 2, j = 1, …, g, by solving the following depressed cubic equations
σ k j 3 + p σ k j + q = 0 ; k = 1 , 2 ,
where p = -\sum_{i=1}^{n} \hat{w}_{ijk}^{(r)} (y_i - \mu_j^{(r+1)})^2 \Big/ \sum_{i=1}^{n} \hat{z}_{ij}^{(r)}, for which q = p \, \sigma_{2j} I(k=1) + p \, \sigma_{1j} I(k=2). Note that p, q < 0, so this cubic equation has a unique root in the (0, +∞) interval.
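The positive root can be obtained directly with the base R polynomial solver; a minimal sketch (illustrative name):

```r
# Illustrative sketch: unique positive root of sigma^3 + p*sigma + q = 0
# with p, q < 0, as required by the CM-step update of sigma_kj.
solve_sigma <- function(p, q) {
  roots <- polyroot(c(q, p, 0, 1))            # roots of q + p*x + 0*x^2 + x^3
  max(Re(roots[abs(Im(roots)) < 1e-8]))       # the single root in (0, +Inf)
}
solve_sigma(p = -3, q = -2)   # x^3 - 3x - 2 = (x - 2)(x + 1)^2, so this returns 2
```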
CML-step of the ECME-algorithm.
\nu^{(r+1)} = \arg\max_{\nu} \, \ell(\hat{\Theta}_{\nu}^{(r+1)}, \nu \mid Y),
where ( · | Y ) is the log-likelihood function given in (9) and Θ ^ ν ( r + 1 ) denotes the ( r + 1 ) -th update of Θ ^ except ν .
The ECME-algorithm iterates until a convergence rule is satisfied, e.g., until |ℓ(Θ̂^{(r+1)} | y) / ℓ(Θ̂^{(r)} | y) − 1| < ϵ for a predetermined tolerance ϵ.
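Putting the steps together, a schematic R sketch of one possible implementation of the whole iteration is given below. It is illustrative only and built from the helper sketches above (not the authors' package code); the CML-step maximizes the actual log-likelihood over ν with a one-dimensional optimizer.

```r
# Schematic ECME iteration for a two-piece t (FM-TP-T) mixture.
fit_fm_tpt <- function(y, theta, tol = 1e-5, max_iter = 500) {
  gam    <- function(th) th$sig2 / (th$sig1 + th$sig2)
  loglik <- function(th) loglik_fm(y, th$p, th$mu, th$sig1 + th$sig2, gam(th), th$nu)
  ll_old <- loglik(theta)
  for (r in seq_len(max_iter)) {
    e  <- estep_tpt(y, theta$p, theta$mu, theta$sig1, theta$sig2, theta$nu)   # E-step
    w1 <- e$what * e$s1hat                                                    # w_hat_ij1
    w2 <- e$what * (1 - e$s1hat)                                              # w_hat_ij2
    theta$p <- colMeans(e$zhat)                                               # CM-step: pi_j
    sigk2   <- sapply(seq_along(theta$p), function(j)
      ifelse(e$s1hat[, j] == 1, theta$sig1[j], theta$sig2[j])^2)
    alpha   <- e$what / sigk2                                                 # alpha_hat_ij
    theta$mu <- colSums(alpha * y) / colSums(alpha)                           # CM-step: mu_j
    for (j in seq_along(theta$p)) {                                           # CM-step: sigma_kj
      nzj <- sum(e$zhat[, j])
      p1  <- -sum(w1[, j] * (y - theta$mu[j])^2) / nzj
      p2  <- -sum(w2[, j] * (y - theta$mu[j])^2) / nzj
      theta$sig1[j] <- solve_sigma(p1, p1 * theta$sig2[j])
      theta$sig2[j] <- solve_sigma(p2, p2 * theta$sig1[j])
    }
    theta$nu <- optimize(function(nu)                                         # CML-step: nu
      loglik_fm(y, theta$p, theta$mu, theta$sig1 + theta$sig2, gam(theta), nu),
      interval = c(2.1, 100), maximum = TRUE)$maximum
    ll_new <- loglik(theta)
    if (abs(ll_new / ll_old - 1) < tol) break                                 # stopping rule
    ll_old <- ll_new
  }
  theta
}
```

For example, simple starting values such as theta = list(p = rep(1/2, 2), mu = as.numeric(quantile(y, c(0.25, 0.75))), sig1 = rep(sd(y), 2), sig2 = rep(sd(y), 2), nu = 4) could be passed to fit_fm_tpt(y, theta).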

4. Numerical Studies

In this section, we assess the performance of the proposed FM model using simulated and real datasets. The implementations of the algorithms were based on the R software [30], version 3.5.1, on a Core i7 760 processor (2.8 GHz), and a relative tolerance of 10^{−5} was used for convergence of the ECME-algorithms. A copy of the R code is available upon request from the authors and will be made available in an R package specialized to the proposed model.

4.1. Simulations

In this section, we present three simulations. In the first, we show the robustness of the FM-TP-SMN models for classifying heterogeneous data; in the second, we study the behavior of the proposed FM-TP-SMN models under misspecification; and in the third, we examine the asymptotic properties of the proposed model estimates.

4.1.1. Clustering

FM models are useful for clustering observations by allocating them into groups of observations that are similar in some sense. In fact, by considering the estimated (posterior) probabilities, we can assign observation points to given groups. However, atypical data can have an undesirable effect on clustering (see e.g., [1,2,8]). In our models, we account for skewness and use it as a basis for the clustering, in order to show robustness to atypical data within components. We generated 1000 samples from two-component FM-TP-SMN models and, for each sample, considered the k-means clustering while ignoring the true classification of the observations.
We simulated 1000 samples of sizes n = 100, 350, 800 from the FM-TP-SMN models with parameters π₁ = 0.75, μ₁ = 10, μ₂ = 15, σ₁ = 2, σ₂ = 4, γ₁ = 0.1, γ₂ = 0.7, and γ₁ = γ₂ = 0.5 (FM-Normal model), with ν = 4 for TP-T and TP-SL, and ν = (0.3, 0.3) for TP-CN. According to the FM-TP-SMN estimated (posterior) probabilities given in (12) and the threshold value 0.5, we allocated the observations to a specific component. The mean rate of correct allocations over the samples t = 1, …, 1000 is given in Table 1, which shows that, in the presence of atypical data, clustering based on the FM-TP-T, FM-TP-SL and FM-TP-CN models is more reasonable than clustering based on the ordinary FM-Normal model. Also note that, in the case of the true model FM-TP-T, the FM-TP-CN also outperforms the other models.
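In practice, the allocation step amounts to thresholding the estimated posterior probabilities; a small R sketch (illustrative names, reusing estep_tpt(); the true simulation labels are used only for scoring):

```r
# Illustrative sketch: allocate observations with the 0.5 threshold on the
# posterior probabilities and compute the rate of correct allocations.
allocate_and_score <- function(y, true_label, fit) {
  zhat  <- estep_tpt(y, fit$p, fit$mu, fit$sig1, fit$sig2, fit$nu)$zhat
  label <- ifelse(zhat[, 1] > 0.5, 1, 2)
  # guard against label switching in the two-component case
  max(mean(label == true_label), mean(3 - label == true_label))
}
```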

4.1.2. Misspecification

For this simulation, we generated 2000 samples of length n = 150 from the FM-SN (asymmetric and light-tailed components) and FM-ST (asymmetric and heavy-tailed components) models separately, with the parameters of the previous simulation structure and with (λ₁, λ₂) = (2, 3). Then, we fitted the various proposed FM-TP-SMN models to these data. In Table 2, the FM-TP-SMN models are first compared with the ordinary FM-NOR model (symmetric and light-tailed components) and then with various competitors within the FM-TP-SMN class (asymmetric components). The results in the first four rows of Table 2 show that the preferred model usually belongs to the class of FM-TP-SMN models rather than the FM-NOR model. The model most often preferred for the FM-SN data is the FM-TP-N, and the other preferred models are those similar to it (for example, the FM-TP-T with large values of the degrees of freedom ν); that is, the preferred fits to the FM-SN data, which have asymmetric and light-tailed components, are FM-TP-SMN models with light-tailed components. In the case of the FM-ST data, with asymmetric and heavy-tailed components, the FM-TP-SMN models with heavy-tailed components were preferred. In this part and in the real applications, the model selection criteria used to choose the best model are: the logarithm of the maximized likelihood function (log-like), ℓ(Θ̂ | y); the Akaike information criterion (AIC) [31]; and the Bayesian information criterion (BIC) [32], in the form of
AIC = 2k - 2\ell(\hat{\Theta} \mid y) \quad \text{and} \quad BIC = k \log n - 2\ell(\hat{\Theta} \mid y),
respectively, where k is the number of the model parameters.
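Both criteria are immediate once the maximized log-likelihood is available; a one-line R sketch (the call below uses, for concreteness, the FM-TP-T log-likelihood later reported for the UScrime data in Table 8, assuming k = 8 free parameters for a two-component FM-TP-T model):

```r
# Illustrative sketch: AIC and BIC from the maximized log-likelihood.
ic <- function(loglik, k, n) c(AIC = 2 * k - 2 * loglik, BIC = k * log(n) - 2 * loglik)
ic(loglik = -228.865, k = 8, n = 47)   # roughly reproduces the FM-TP-T values in Table 8
```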

4.1.3. Asymptotic Results

For this simulation, we generated 400 samples, each with sample sizes n = 150, 600, 1000, 2000, 4000, from FM-TP-T models with two components that are weakly separated (WS), medium separated (MS) and strongly separated (SS), i.e., with large, medium, and small overlap of the components, respectively (see Figure 1), with π₁ = 0.4, μ₁ = 10, μ₂ = 15, σ₁ = 1, σ₂ = 1, γ₁ = 0.65, γ₂ = 0.35, ν = 4, and (μ₁, μ₂) = (0.2, 0.2) for the WS data, (μ₁, μ₂) = (1.5, 1.5) for the MS data, and (μ₁, μ₂) = (5, 5) for the SS data.
Using the proposed ECME algorithm to find the ML estimates, we focus on the Monte-Carlo average bias (MC-bias) and mean squared error (MSE) of the ML estimates over the 400 samples, reported in Table 3, Table 4 and Table 5 and defined as
\mathrm{MC\text{-}Bias}(\xi) = \frac{1}{400} \sum_{j=1}^{400} \left( \hat{\xi}^{(j)} - \xi \right), \qquad \mathrm{MSE}(\xi) = \frac{1}{400} \sum_{j=1}^{400} \left( \hat{\xi}^{(j)} - \xi \right)^2,
where ξ̂^{(j)} is the ML estimate of the parameter ξ in the j-th sample.
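Both summaries are plain averages over the 400 replicates; in R (illustrative sketch, with estimates holding the 400 ML estimates of a single parameter and truth its true value):

```r
# Illustrative sketch: Monte-Carlo bias and MSE over replicated ML estimates.
mc_summary <- function(estimates, truth)
  c(MC_bias = mean(estimates - truth), MSE = mean((estimates - truth)^2))
```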
The results in Table 3, Table 4 and Table 5 were obtained from the different fitted FM-TP-SMN models and show the good performance of the proposed models and of their parameter estimates. As the sample size increases, the Monte-Carlo average bias and MSE of the ML estimates tend toward zero, as expected.

4.2. Applications

In this section, we apply the FM-TP-SMN models to real data sets to show the performance of the proposed models and estimates in applications.

4.2.1. BMI Data

We considered the body mass index (BMI) data set collected for men aged between 18 and 80 years. The BMI data set was gathered within the National Health and Nutrition Examination Survey of the US National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC). The strong relationship between obesity and many chronic diseases has attracted attention in recent years; that is, most people with obesity will develop chronic diseases. The ratio of body weight in kilograms to squared height in meters (BMI) is a measure used to determine overweight and obesity: a person with BMI > 25 is considered overweight, while BMI > 30 indicates obesity.
This data set had 4579 participants with BMI records, but for modeling with finite mixture models, participants with weights within 39.50–70.00 kg and 95.01–196.80 kg (1069 and 1054 participants, respectively) were considered as the first and second subgroups. Lin et al. [7] first analyzed this data set, considering the reports of 1999–2000 and 2001–2002, and fitted the FM-normal, FM-T, FM-SN and FM-ST models, always with two components; later, [5,13] fitted the FM-SMSN models to this data set. The results obtained by [13] are more general and include those of [5,7]. We therefore fitted the proposed FM-TP-SMN models to this data set and compared the results obtained with those in [13].
Table 6 contains the ML estimates of the FM-TP-SMN models with two components; the log-likelihood, AIC and BIC criteria of the proposed FM-TP-SMN models and of the FM-SMSN models (taken from Table 1 of [13]) appear in Table 7.
As noted by Lin et al. [4] and Prates et al. [13], the criteria values in Table 7 indicate that the heavy-tailed FM-SMSN models (FM-ST, FM-SCN and FM-SSL) had a better fit than the ordinary FM-NOR and FM-SN models, and the FM-SSL and FM-ST were the best fitted models among them. Similar results hold for the FM-TP-SMN models relative to their FM-SMSN counterparts, and the FM-TP-SMN models were more reasonable than the FM-SMSN models; the FM-TP-SL and FM-TP-T were the best models overall. In Figure 2, we plot the fitted FM-TP-T and FM-ST density curves over the histogram of the BMI data.

4.2.2. UScrime Data

As a further application of the FM-TP-SMN models and the proposed methodology, we consider the effect of punishment regimes on crime rates [33,34], which is of high interest to criminologists. This has been studied using aggregate data on 47 US states for 1960, and we consider the 13th column of this data frame, which corresponds to income inequality. The data are available as the UScrime data set in the MASS R package.
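The variable can be pulled directly from the MASS package; a short R sketch (the column name Ineq follows the package documentation):

```r
# Illustrative sketch: income inequality column (13th) of the UScrime data.
library(MASS)
ineq <- UScrime$Ineq   # income inequality for the 47 US states (1960)
length(ineq)           # 47 observations
hist(ineq, breaks = 15, main = "UScrime: income inequality")
```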
Table 8 contains the ML estimates of the FM-TP-SMN models with two components, together with the log-likelihood, AIC and BIC criteria of the proposed FM-TP-SMN and FM-SMSN models.
The log-likelihood values in Table 8 indicate that the FM-TP-N and FM-TP-T are the best models within the FM-TP-SMN class, while the FM-SCN is the best model within the FM-SMSN class. The AIC and BIC criteria choose the FM-TP-N model (asymmetric components) within the FM-TP-SMN class, while they choose the FM-NOR model (symmetric components) within the FM-SMSN class. Among all competitors, the criteria choose the FM-TP-N model, which belongs to the proposed FM-TP-SMN class and is more reasonable. Figure 3 plots the histogram of the UScrime data with the fitted FM-TP-SMN and FM-SMSN density curves. These graphical visualizations show the suitability of the asymmetric components and of the proposed FM-TP-SMN models.

5. Conclusions

We have proposed the flexible family of TP-SMN distributions for application in clustering problems. The TP-SMN family is capable of representing symmetric/asymmetric and light/heavy-tailed distributions; it contains the well-known symmetric SMN family as a special case and is a reasonable competitor of the asymmetric SMSN family. Estimation of the FM-TP-SMN parameters is relatively straightforward via the ECME algorithm, which converges quickly (in a few iterations). Considering a Bayesian approach, and using the flexible TP-SMN family in the autoregressive and ARMA processes of [35,36], are further topics for research.

Author Contributions

M.M., J.E.C.-R. and M.R.M. wrote the paper and contributed the reagents/analysis/materials tools; M.M. conceived, designed, and performed the experiments, and analyzed the data. All authors have read and approved the final manuscript.

Acknowledgments

The authors thank the editor and three anonymous referees for their helpful comments and suggestions.

Conflicts of Interest

The authors declare that there is no conflict of interest in the publication of this paper.

References

  1. McLachlan, G.; Peel, D. Finite Mixture Models; John Wiley and Sons: New York, NY, USA, 2000.
  2. Contreras-Reyes, J.E.; Cortés, D.D. Bounds on Rényi and Shannon Entropies for Finite Mixtures of Multivariate Skew-Normal Distributions: Application to Swordfish (Xiphias gladius Linnaeus). Entropy 2016, 18, 382.
  3. Frühwirth-Schnatter, S.; Pyne, S. Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions. Biostatistics 2010, 11, 317–336.
  4. Lin, T.I.; Lee, J.C.; Yen, S.Y. Finite mixture modelling using the skew normal distribution. Stat. Sin. 2007, 17, 909–927.
  5. Basso, R.M.; Lachos, V.H.; Cabral, C.R.B.; Ghosh, P. Robust mixture modeling based on scale mixtures of skew-normal distributions. Comput. Stat. Data Anal. 2010, 54, 2926–2941.
  6. Lin, T.I. Robust mixture modeling using multivariate skew t distributions. Stat. Comput. 2009, 20, 343–356.
  7. Lin, T.I.; Lee, J.C.; Hsieh, W.J. Robust Mixture Modelling Using the Skew t Distribution. Stat. Comput. 2007, 17, 81–92.
  8. Contreras-Reyes, J.E.; López Quintero, F.O.; Yáñez, A.A. Towards Age Determination of Southern King Crab (Lithodes santolla) Off Southern Chile Using Flexible Mixture Modeling. J. Mar. Sci. Eng. 2018, 6, 157.
  9. Maleki, M.; Arellano-Valle, R.B. Maximum a-posteriori estimation of autoregressive processes based on finite mixtures of scale-mixtures of skew-normal distributions. J. Stat. Comput. Sim. 2017, 87, 1061–1083.
  10. Maleki, M.; Arellano-Valle, R.B.; Dey, D.K.; Mahmoudi, M.R.; Jalali, S.M.J. A Bayesian Approach to Robust Skewed Autoregressive Processes. Calcutta Stat. Assoc. Bull. 2017, 69, 165–182.
  11. Maleki, M.; Wraith, D.; Arellano-Valle, R.B. Robust finite mixture modeling of multivariate unrestricted skew-normal generalized hyperbolic distributions. Stat. Comput. 2018, in press.
  12. Maleki, M.; Wraith, D.; Arellano-Valle, R.B. A flexible class of parametric distributions for Bayesian linear mixed models. Test 2018, in press.
  13. Prates, M.O.; Lachos, V.H.; Cabral, C. mixsmsn: Fitting finite mixture of scale mixture of skew-normal distributions. J. Stat. Soft. 2013, 54, 1–20.
  14. Hajrajabi, A.; Maleki, M. Nonlinear semiparametric autoregressive model with finite mixtures of scale mixtures of skew normal innovations. J. Appl. Stat. 2019, in press.
  15. Maleki, M.; Wraith, D. Mixtures of multivariate restricted skew-normal factor analyzer models in a Bayesian framework. Comput. Stat. 2019, in press.
  16. Andrews, D.F.; Mallows, C.L. Scale mixtures of normal distributions. J. R. Stat. Soc. Ser. B 1974, 36, 99–102.
  17. Lange, K.L.; Little, R.; Taylor, J. Robust statistical modeling using the t distribution. J. Am. Stat. Assoc. 1989, 84, 881–896.
  18. Lange, K.L.; Sinsheimer, J.S. Normal/independent distributions and their applications in robust regression. J. Comput. Graph. Stat. 1993, 2, 175–198.
  19. Maleki, M.; Nematollahi, A.R. Autoregressive Models with Mixture of Scale Mixtures of Gaussian innovations. Iranian J. Sci. Technol. Trans. A 2017, 41, 1099–1107.
  20. Arellano-Valle, R.B.; Gómez, H.; Quintana, F.A. Statistical inference for a general class of asymmetric distributions. J. Stat. Plan. Inf. 2005, 128, 427–443.
  21. Hoseinzadeh, A.; Maleki, M.; Khodadadi, Z.; Contreras-Reyes, J.E. The Skew-Reflected-Gompertz distribution for analyzing symmetric and asymmetric data. J. Comput. Appl. Math. 2019, 349, 132–141.
  22. Moravveji, B.; Khodadai, Z.; Maleki, M. A Bayesian Analysis of Two-Piece distributions based on the Scale Mixtures of Normal Family. Iranian J. Sci. Technol. Trans. A 2018, in press.
  23. Maleki, M.; Mahmoudi, M.R. Two-Piece Location-Scale Distributions based on Scale Mixtures of Normal family. Commun. Stat. Theor. Meth. 2017, 46, 12356–12369.
  24. Rubio, F.J.; Steel, M.F.G. Inference in Two-Piece Location-Scale Models with Jeffreys Priors. Bayesian Anal. 2014, 9, 1–22.
  25. Branco, M.D.; Dey, D.K. A general class of multivariate skew-elliptical distributions. J. Multivar. Anal. 2001, 79, 99–113.
  26. Mudholkar, G.S.; Hutson, A.D. The epsilon-skew-normal distribution for analyzing near-normal data. J. Stat. Plan. Inf. 2000, 83, 291–309.
  27. Meng, X.; Rubin, D.B. Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 1993, 80, 267–278.
  28. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39, 1–38.
  29. Liu, C.; Rubin, D.B. The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence. Biometrika 1994, 81, 633–648.
  30. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018; ISBN 3-900051-07-0. Available online: http://www.R-project.org (accessed on 12 December 2018).
  31. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
  32. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
  33. Ehrlich, I. Participation in illegitimate activities: A theoretical and empirical investigation. J. Political Econ. 1973, 81, 521–565.
  34. Vandaele, W. Participation in illegitimate activities: Ehrlich revisited. In Deterrence and Incapacitation; Blumstein, A., Cohen, J., Nagin, D., Eds.; US National Academy of Sciences: Washington, DC, USA, 1978; pp. 270–335.
  35. Ghasami, S.; Khodadadi, Z.; Maleki, M. Autoregressive Processes with Generalized Hyperbolic Innovations. Commun. Stat. Comput. Sim. 2019, in press.
  36. Zarrin, P.; Maleki, M.; Khodadadi, Z.; Arellano-Valle, R.B. Time series process based on the unrestricted skew normal process. J. Stat. Comput. Sim. 2019, 89, 38–51.
Figure 1. Artificial simulated finite mixture of two-piece scale mixtures of normal (FM-TP-SMN) data of length n = 400 with two components: weakly separated (WS), medium separated (MS), and strongly separated (SS) components, together with the probability density function (PDF) curves from which the datasets were generated.
Figure 2. Histogram of body mass index (BMI) data with fitted FM-TP-T (left) and FM-ST (right) models with two components.
Figure 3. Histogram of UScrime data with fitted FM-TP-SMN and FM-SMSN models with two components.
Table 1. Mean rate of correct allocations for the fitted finite mixture of two-piece scale mixtures of normal (FM-TP-SMN) models.
True Model | Sample Size | FM-Normal | FM-TP-N | FM-TP-T | FM-TP-CN | FM-TP-SL
FM-TP-T | 100 | 0.3745 | 0.7026 | 0.7674 | 0.8026 | 0.7902
FM-TP-T | 350 | 0.2836 | 0.7693 | 0.8372 | 0.8462 | 0.8450
FM-TP-T | 800 | 0.2231 | 0.7990 | 0.8418 | 0.8490 | 0.8469
FM-TP-CN | 100 | 0.6023 | 0.6512 | 0.7835 | 0.7941 | 0.7829
FM-TP-CN | 350 | 0.6258 | 0.7654 | 0.8542 | 0.8622 | 0.8510
FM-TP-CN | 800 | 0.6395 | 0.7810 | 0.8599 | 0.8665 | 0.8563
FM-TP-SL | 100 | 0.5374 | 0.7735 | 0.7810 | 0.7747 | 0.7840
FM-TP-SL | 350 | 0.5903 | 0.8235 | 0.8420 | 0.8250 | 0.8491
FM-TP-SL | 800 | 0.6034 | 0.8264 | 0.8437 | 0.8281 | 0.8507
Table 2. The number of times (out of 2000) the true FM models chosen under seven proposed hypotheses.
Condition Examined | FM-SN (AIC) | FM-SN (BIC) | FM-ST (AIC) | FM-ST (BIC)
FM-TP-N vs. FM-Normal | 1934 | 1958 | 1986 | 1975
FM-TP-T vs. FM-Normal | 1511 | 1645 | 2000 | 2000
FM-TP-SL vs. FM-Normal | 1387 | 1410 | 1823 | 1865
FM-TP-CN vs. FM-Normal | 1503 | 1532 | 1746 | 1732
FM-TP-T vs. FM-TP-N | 83 | 92 | 1861 | 1883
FM-TP-SL vs. FM-TP-N | 112 | 123 | 1821 | 1793
FM-TP-CN vs. FM-TP-N | 143 | 196 | 1732 | 1746
Table 3. Monte-Carlo average bias (MC-bias) and mean squared error (MSE) of the maximum likelihood (ML) estimates in the weakly separated components (WS) FM-TP-T model.
MeasureParameterSample Size
n = 150 n = 600 n = 1000 n = 2000 n = 4000
MC-Bias π 1 −1.64533 × 10 2 2.50755 × 10 3 −1.849 × 10 5 8.427 × 10 6 3.452 × 10 7
μ 1 3.48475 × 10 2 −9.48375 × 10 4 −2.287 × 10 3 7.427 × 10 4 6.584 × 10 4
μ 2 3.93636 × 10 1 −1.98365 × 10 2 −3.276 × 10 3 9.4978 × 10 4 −6.436 × 10 4
σ 1 −4.98656 × 10 1 3.44827 × 10 2 6.487 × 10 3 −6.775 × 10 3 −7.864 × 10 3
σ 2 4.57463 × 10 1 4.03741 × 10 2 −6.284 × 10 3 −6.103 × 10 3 9.903 × 10 4
γ 1 1.57363 × 10 2 1.93845 × 10 3 1.574 × 10 6 1.427 × 10 6 5.047 × 10 7
γ 2 1.40384 × 10 2 1.73644 × 10 3 2.037 × 10 6 6.948 × 10 7 4.765 × 10 7
ν 1.14024 × 10 2 1.36253 × 10 1 −6.017 × 10 2 1.284 × 10 2 −1.201 × 10 2
MSE π 1 7.15248 × 10 3 7.35733 × 10 4 3.854 × 10 4 6.036 × 10 5 6.729 × 10 5
μ 1 1.10375 × 10 1 8.99164 × 10 2 3.927 × 10 3 2.889 × 10 4 2.960 × 10 4
μ 2 1.97364 × 10 0 1.23473 × 10 1 4.920 × 10 2 3.276 × 10 3 3.328 × 10 3
σ 1 1.69475 × 10 0 1.11763 × 10 0 2.118 × 10 1 3.2849 × 10 3 3.453 × 10 3
σ 2 1.67568 × 10 1 1.43855 × 10 1 4.548 × 10 1 6.786 × 10 2 6.903 × 10 2
γ 1 8.00264 × 10 3 3.49566 × 10 4 2.801 × 10 4 6.104 × 10 5 6.003 × 10 5
γ 2 1.98374 × 10 2 8.20183 × 10 3 2.102 × 10 4 6.352 × 10 5 6.102 × 10 5
ν 0.97803 × 10 2 5.46093 × 10 1 1.112 × 10 1 1.684 × 10 2 1.521 × 10 2
Table 4. MC-Bias and MSE for ML estimates in the medium separated components (MS) FM-TP-T model.
MeasureParameterSample Size
n = 150 n = 600 n = 1000 n = 2000 n = 4000
MC-Bias π 1 5.03746 × 10 3 −4.21374 × 10 3 −1.40712 × 10 3 −5.83744 × 10 4 6.10927 × 10 5
μ 1 2.18723 × 10 2 −1.72451 × 10 3 −6.28474 × 10 4 2.99837 × 10 5 −2.48576 × 10 5
μ 2 −1.30284 × 10 2 8.29374 × 10 4 6.99386 × 10 4 3.95645 × 10 5 −3.43927 × 10 5
σ 1 1.25344 × 10 2 1.28374 × 10 4 −5.98375 × 10 5 −4.99380 × 10 5 −5.99837 × 10 6
σ 2 1.27364 × 10 2 −2.04634 × 10 3 −1.47364 × 10 3 2.98476 × 10 4 6.97484 × 10 5
γ 1 −1.10264 × 10 2 −1.02172 × 10 3 −4.93846 × 10 5 −6.10283 × 10 7 7.98375 × 10 7
γ 2 −0.78725 × 10 2 −0.90273 × 10 3 5.97367 × 10 5 5.83753 × 10 7 9.01274 × 10 7
ν 2.40926 × 10 0 1.12027 × 10 1 3.98365 × 10 2 −2.48462 × 10 3 −2.83765 × 10 4
MSE π 1 5.85644 × 10 3 6.45364 × 10 4 3.57464 × 10 5 6.20183 × 10 5 7.10993 × 10 5
μ 1 1.08744 × 10 1 9.37436 × 10 1 1.03764 × 10 2 2.57463 × 10 4 2.03464 × 10 4
μ 2 2.03937 × 10 1 2.00713 × 10 1 4.67483 × 10 2 3.47367 × 10 3 3.24536 × 10 3
σ 1 3.46357 × 10 1 2.47464 × 10 1 2.23433 × 10 1 3.39283 × 10 3 6.58475 × 10 4
σ 2 8.38474 × 10 1 6.37364 × 10 1 4.38475 × 10 1 6.87364 × 10 2 6.00836 × 10 2
γ 1 1.84746 × 10 2 4.03847 × 10 4 3.11972 × 10 4 7.00374 × 10 5 6.21002 × 10 5
γ 2 2.04464 × 10 2 7.64533 × 10 3 2.30283 × 10 4 5.89472 × 10 5 7.00353 × 10 5
ν 6.47465 × 10 1 5.87957 × 10 1 1.93845 × 10 2 1.74534 × 10 2 1.48375 × 10 2
Table 5. MC-Bias and MSE of the ML estimates in the strongly separated components (SS) FM-TP-T model.
MeasureParameterSample Size
n = 150 n = 600 n = 1000 n = 2000 n = 4000
MC-Bias π 1 5.13426 × 10 3 5.60483 × 10 3 −1.40712 × 10 3 −6.26826 × 10 4 −6.95662 × 10 5
μ 1 −1.90273 × 10 2 −1.24751 × 10 3 1.20183 × 10 3 3.45342 × 10 5 2.74634 × 10 5
μ 2 1.15379 × 10 2 −3.06344 × 10 3 1.47333 × 10 3 4.28144 × 10 5 3.35242 × 10 5
σ 1 −1.03046 × 10 2 -8.34452 × 10 5 6.73645 × 10 5 −4.87354 × 10 5 −7.28374 × 10 6
σ 2 −1.38724 × 10 2 −1.27844 × 10 3 −1.00904 × 10 3 3.49383 × 10 4 9.88365 × 10 5
γ 1 −0.58746 × 10 2 0.34452 × 10 3 5.09847 × 10 5 −5.46353 × 10 7 8.73645 × 10 7
γ 2 −0.60273 × 10 2 −0.24533 × 10 3 6.08422 × 10 5 5.27363 × 10 7 8.69384 × 10 7
ν 2.37264 × 10 0 0.99775 × 10 1 4.65635 × 10 2 −1.74632 × 10 2 −2.46354 × 10 3
MSE π 1 2.94256 × 10 3 3.83748 × 10 4 2.67464 × 10 4 7.78374 × 10 5 9.73646 × 10 5
μ 1 5.01324 × 10 3 4.11293 × 10 4 1.69837 × 10 4 8.32847 × 10 6 2.37462 × 10 6
μ 2 5.73648 × 10 3 5.03744 × 10 4 8.98474 × 10 5 6.26353 × 10 6 3.03733 × 10 6
σ 1 3.48373 × 10 2 1.20385 × 10 3 1.65464 × 10 4 1.92736 × 10 6 1.82037 × 10 6
σ 2 7.92746 × 10 2 5.38474 × 10 3 1.16354 × 10 3 1.98274 × 10 5 1.48263 × 10 5
γ 1 3.26354 × 10 3 4.24846 × 10 4 2.63846 × 10 4 6.03947 × 10 5 8.48375 × 10 5
γ 2 2.73145 × 10 3 3.38475 × 10 4 9.83746 × 10 5 5.64544 × 10 5 8.66555 × 10 5
ν 1.28736 × 10 2 3.01763 × 10 1 9.02635 × 10 2 7.27647 × 10 3 8.83645 × 10 3
Table 6. ML estimation results for fitting FM-TP-SMN models to the body mass index (BMI) data.
Parameter | FM-Normal | FM-TP-N | FM-TP-T | FM-TP-SL | FM-TP-CN
π1 | 0.391 | 0.5400 | 0.4600 | 0.5374 | 0.5282
μ1 | 21.412 | 20.7520 | 20.744 | 20.7982 | 21.0629
μ2 | 32.548 | 30.1324 | 30.155 | 30.0169 | 29.9851
σ1 | 2.0176 | 5.0343 | 5.0303 | 4.2785 | 1.5215
σ2 | 6.4180 | 7.4040 | 7.4038 | 6.3213 | 2.4601
γ1 | — | 0.6798 | 0.6806 | 0.6709 | 0.6302
γ2 | — | 0.8695 | 0.8673 | 0.8771 | 0.8785
ν | — | — | 7.9939 | 2.1950 | (0.971, 0.081)
Table 7. Model selection criteria for fitting FM-TP-SMN and FM-scale mixtures of the skew normal (SMSN) models to the BMI data. The best values are marked in bold.
Criterion | FM-Normal | FM-TP-N | FM-SN | FM-TP-T | FM-ST | FM-TP-SL | FM-SSL | FM-TP-CN | FM-SCN
Log-like | −6911.76 | −6870.30 | −6979.47 | −6856.65 | −6869.03 | −6857.14 | −6867.99 | −6871.65 | −6865.25
AIC | 13833.35 | 13754.72 | 13972.95 | 13729.41 | 13754.06 | 13730.41 | 13751.98 | 13812.31 | 13748.51
BIC | 13961.61 | 13763.79 | 13821.22 | 13737.61 | 13782.33 | 13735.85 | 13780.24 | 13775.89 | 13776.77
Table 8. ML estimation results and model selection criteria for fitting FM-TP-SMN models to the UScrime data. The best values are marked in bold.
Parameter / Criterion | FM-Normal | FM-TP-N | FM-SN | FM-TP-T | FM-ST | FM-TP-SL | FM-SSL | FM-TP-CN | FM-SCN
π1 | 0.615 | 0.463 | 0.607 | 0.462 | 0.607 | 0.463 | 0.607 | 0.509 | 0.582
μ1 | 166.674 | 173.927 | 175.448 | 173.924 | 174.558 | 173.931 | 175.350 | 170.931 | 175.390
μ2 | 237.632 | 246.160 | 227.951 | 246.060 | 228.018 | 245.998 | 227.888 | 204.645 | 224.584
σ1 | 18.283 | 20.399 | 20.170 | 20.263 | 19.597 | 20.294 | 19.835 | 18.626 | 16.362
σ2 | 20.455 | 59.464 | 22.778 | 59.268 | 22.634 | 58.813 | 22.478 | 42.728 | 20.201
γ1 (λ1) | — | 0.071 | −0.687 | 0.071 | −0.613 | 0.071 | −0.679 | 0.176 | −0.872
γ2 (λ2) | — | 0.257 | 0.555 | 0.258 | 0.548 | 0.259 | 0.560 | 0.782 | 0.579
ν | — | — | — | 39.934 | 100 | 39.726 | 37.590 | 0.745 | 0.672
τ | — | — | — | — | — | — | — | 0.999 | 0.990
Log-like | −232.226 | −228.215 | −232.274 | −228.865 | −231.275 | −228.211 | −231.265 | −229.864 | −230.281
AIC | 474.452 | 470.437 | 478.548 | 473.734 | 478.549 | 472.426 | 478.531 | 477.748 | 478.562
BIC | 483.703 | 483.386 | 491.499 | 488.535 | 491.500 | 487.228 | 491.482 | 494.399 | 491.513
