Stochastic Comparisons of Weighted Distributions and Their Mixtures

In this paper, various stochastic ordering properties of a parametric family of weighted distributions and the associated mixture model are developed. The effect of stochastic variation of the output random variable with respect to the parameter and/or the underlying random variable is specifically investigated. Special weighted distributions are considered to scrutinize the consistency as well as the usefulness of the results. Stochastic comparisons of coherent systems made of identical but dependent components are made and also a result for comparison of Shannon entropies of weighted distributions is developed.


Introduction
Weighted distributions have been used extensively in the literature to model data arising in nature, since they account for the way observations are recorded under a variety of sampling schemes and thereby yield more adequate models (cf. Rao [1], Patil and Rao [2] and Patil [3]). Let $X$ be a random variable with cumulative distribution function (cdf) $F$ and probability density function (pdf) $f$, and let $w(\cdot,\theta)$ be a non-negative function such that $E(w(X,\theta))$ exists and is finite for all $\theta\in\chi$, where $\chi$ is an arbitrary subset of $\mathbb{R}$. Then $X_w$ is taken to be a random variable with the weighted distribution associated with $f$, having pdf
$$f_w(x,\theta)=\frac{w(x,\theta)\,f(x)}{E(w(X,\theta))}. \tag{1}$$
Many families of statistical distributions belong to the family of weighted distributions in (1) (see, e.g., the typical weighted distributions in Sections 3.1 and 3.2). Suppose that the hazard rate function $h$ corresponds to the pdf $f$, so that $h(x)=f(x)/\bar F(x)$, where $\bar F\equiv 1-F$ is the survival function of $X$. In the spirit of Jain et al. [4], the hazard rate and the reversed hazard rate of $X_w$ are characterized by
$$h_w(x,\theta)=\frac{w(x,\theta)\,h(x)}{B(x,\theta)} \tag{2}$$
and
$$\tilde h_w(x,\theta)=\frac{w(x,\theta)\,\tilde h(x)}{A(x,\theta)}, \tag{3}$$
where $B(x,\theta)=E[w(X,\theta)\mid X>x]$, $A(x,\theta)=E[w(X,\theta)\mid X\le x]$ and $\tilde h(x)=f(x)/F(x)$ is the reversed hazard rate function of $X$.
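As a numerical illustration of the weighted density and its hazard rate, the following minimal sketch builds $f_w = w f / E(w(X,\theta))$ and evaluates the hazard rate through $w\,h/B$ with $B(x,\theta)=E[w(X,\theta)\mid X>x]$. The standard exponential baseline, the size-biased weight $w(x,\theta)=x^{\theta}$, the truncation point `UPPER` and all function names are assumptions made for this example only; they are not taken from the paper.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * step)
    return total * step / 3

UPPER = 60.0  # truncation point standing in for +infinity (assumption)

def f(x):
    """Baseline pdf: standard exponential (assumption)."""
    return math.exp(-x)

def sf(x):
    """Baseline survival function of the standard exponential."""
    return math.exp(-x)

def w(x, theta):
    """Illustrative size-biased weight of order theta (assumption)."""
    return x ** theta

def mu(theta):
    """Normalising constant E[w(X, theta)], computed numerically."""
    return simpson(lambda x: w(x, theta) * f(x), 0.0, UPPER)

def weighted_pdf(theta):
    """Return the weighted pdf x -> w(x, theta) f(x) / mu(theta)."""
    m = mu(theta)
    return lambda x: w(x, theta) * f(x) / m

def B(x, theta):
    """Conditional mean E[w(X, theta) | X > x]."""
    return simpson(lambda u: w(u, theta) * f(u), x, UPPER) / sf(x)

def hazard_w(x, theta):
    """Hazard rate of X_w via w(x, theta) * h(x) / B(x, theta)."""
    return w(x, theta) * (f(x) / sf(x)) / B(x, theta)

theta = 1.0
total_mass = simpson(weighted_pdf(theta), 0.0, UPPER)
print(total_mass, hazard_w(2.0, theta))
```

For $\theta = 1$ the weighted law is a gamma distribution with shape 2, whose hazard rate is $x/(1+x)$, so the value printed at $x=2$ can be compared with $2/3$.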
The density in (1) may be used to model data randomly drawn from population at a certain level θ of some quantity of interest. For example, θ could be a particular age for an individual, a certain time point or a given threshold with a specific amount. In many realistic circumstances it is acknowledged that the parameter θ may not be constant so that the occurrence of heterogeneity is sometimes incalculable and unexplained. In addition, it often takes place in practical situations where data from several populations is mixed. To model such data sets mixture models are used. For example, the measurements on the life lengths of a device may be gathered regardless of the manufacturer, or data may be gathered on humans without regard, say, to blood type. If the ignored variable has a bearing on the characteristic which is being measured, then the data follow a mixture model. To all intents and purposes, it is hard to find data that are not some kind of a mixture, because there is almost always some relevant covariate that is not observed.
The study of reliability properties of various mixture models has recently received much attention in the literature. When a mixture model is fitted to survival data, the mixing operation can change the aging pattern of the lifetime unit under consideration, in some cases favorably (see, for example, Finkelstein and Esaulova [6], Alves and Dias [7], Arbel et al. [8], Cole and Bauer [9], Bordes and Chauveau [10], Li and Liu [11], Amini-Seresht and Zhang [12], Misra and Naqvi [13] and Badía and Lee [14]).
Mixture models capture heterogeneity in data by decomposing the population into latent subgroups, each of which is governed by its own subgroup-specific set of parameters. To give a general formulation of the mixture model in the setting of our study, consider the density associated with (1),
$$f^*(x)=\int_\chi f_w(x,\theta)\,dG(\theta)=f(x)\int_\chi\frac{w(x,\theta)}{\mu(\theta)}\,dG(\theta), \tag{4}$$
where $\mu(\theta)=E(w(X,\theta))$ and $G$ is the cdf of the random variable $\Theta$. Writing $v(x)=\int_\chi w(x,\theta)/\mu(\theta)\,dG(\theta)$, the function $v$ plays the role of the weight function through which $f$ is altered to $f^*$. This signifies that the mixture density in (4) can be thought of as the density of a weighted distribution with weight function $v$ for which $E(v(X))=1$. In situations where $\Theta$ is a discrete random variable, a finite mixture model is considered. In this case the model (4) takes the form
$$f^*(x)=f(x)\sum_{i}\frac{w(x,\theta_i)}{\mu(\theta_i)}\,g(\theta_i),$$
where $g(\theta_i)$ represents the value of the probability mass function (pmf) of $\Theta$ at $\theta_i$ for $i=1,2,\ldots$. Throughout the paper, it is assumed that the output random variables following the mixture weighted distribution (4) have absolutely continuous distribution functions.
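The remark that a mixture of weighted distributions is itself a weighted distribution, with a weight function of unit mean, can be checked numerically. The sketch below assumes a standard exponential baseline, the weight $w(x,\theta)=e^{-\theta x}$ (for which $\mu(\theta)=1/(1+\theta)$ in closed form) and a three-point mixing distribution; these choices and all names are hypothetical and serve only as an illustration.

```python
import math

def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * step)
    return total * step / 3

thetas = [0.5, 1.0, 3.0]   # hypothetical support points of Theta
probs = [0.2, 0.5, 0.3]    # hypothetical pmf values g(theta_i), summing to 1

def f(x):
    """Baseline pdf: standard exponential (assumption)."""
    return math.exp(-x)

def w(x, theta):
    """Illustrative weight function exp(-theta * x) (assumption)."""
    return math.exp(-theta * x)

def mu(theta):
    """E[exp(-theta X)] for X ~ Exp(1), available in closed form."""
    return 1.0 / (1.0 + theta)

def v(x):
    """Mixture weight: sum over i of g(theta_i) * w(x, theta_i) / mu(theta_i)."""
    return sum(g * w(x, t) / mu(t) for t, g in zip(thetas, probs))

def f_star(x):
    """Mixture density written as a weighted density v(x) * f(x)."""
    return v(x) * f(x)

# E[v(X)] equals the total mass of f_star and should be 1 up to quadrature error.
mean_v = simpson(f_star, 0.0, 60.0)
print(mean_v)
```

Because each component density integrates to one and the mixing probabilities sum to one, the printed value should agree with 1 up to quadrature error.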
To the best of our knowledge, no work in the literature has treated the stochastic properties of parametric weighted distributions and their mixtures in a generality that would make them accessible to a broader audience. There is a need for a systematic study in this direction. The main objective of this paper is to initiate such a study by investigating how the dependence of the model on a parameter affects some general stochastic aspects of the model. The rest of the paper is organized as follows. In Section 2, some useful notions of stochastic orders and some further stochastic properties are presented. In Section 3, some special applied weighted distributions are introduced. Section 4 contains the main results: in Section 4.1, preservation of several ordinary as well as relative stochastic orderings is studied; in Section 4.2, preservation properties of some stochastic orders in the extended mixture model of weighted distributions are established; and finally, in Section 4.3, a link to information theory is provided.

Preliminaries
Assume that the random variables $X$ and $Y$ have distribution functions $F$ and $G$, survival functions $\bar F=1-F$ and $\bar G=1-G$, density functions $f$ and $g$, hazard rate functions $h_X=f/\bar F$ and $h_Y=g/\bar G$, and reversed hazard rate functions $\tilde h_X=f/F$ and $\tilde h_Y=g/G$, respectively. To compare the magnitude of random variables, some notions of stochastic orders are introduced below.
Definition 1. The random variable $X$ is said to be smaller than the random variable $Y$ in the
It is known that the following implications hold:
The notions of totally positive of order 2 (TP$_2$) and reverse regular of order 2 (RR$_2$) are defined as follows.
A function $h(x,y)$ is said to be sign-regular of order 2 if $\varepsilon_1 h(x,y)\ge 0$ and
$$\varepsilon_2\,\bigl[h(x_1,y_1)\,h(x_2,y_2)-h(x_1,y_2)\,h(x_2,y_1)\bigr]\ge 0$$
for all $x_1\le x_2$ and for all $y_1\le y_2$, with $\varepsilon_1$ and $\varepsilon_2$ equal to $+1$ or $-1$.
If $\varepsilon_1=+1$ and $\varepsilon_2=+1$, then $h$ is said to be TP$_2$. If $\varepsilon_1=+1$ and $\varepsilon_2=-1$, then $h$ is said to be RR$_2$. The TP$_2$ [RR$_2$] property of $h(t,x)$ is equivalent to saying that $h(t,x_2)/h(t,x_1)$ is non-decreasing [non-increasing] in $t$ whenever $x_1\le x_2$, under the conventions that $a/0=+\infty$ when $a>0$ and $a/0=0$ if $a=0$. In view of the foregoing statements, and writing $h_Y=h_2$, $h_X=h_1$, $\tilde h_Y=\tilde h_2$ and $\tilde h_X=\tilde h_1$, one observes that $X\le_{rhr}Y$ holds if, and only if, $h_i(x)$ is RR$_2$ as a function of $(i,x)\in\{1,2\}\times\zeta$, where $\zeta$ is the common support of $X$ and $Y$. In a similar manner we can establish that

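The TP$_2$ and RR$_2$ properties can be explored numerically by checking the sign of the $2\times 2$ determinant $h(x_1,y_1)h(x_2,y_2)-h(x_1,y_2)h(x_2,y_1)$ over all ordered pairs of grid points. The sketch below is only a finite-grid check, not a proof; the helper name `sr2_sign`, the grid and the two test functions are assumptions chosen for illustration.

```python
import math
from itertools import combinations

def sr2_sign(h, xs, ys, tol=1e-12):
    """Classify a non-negative function h on a finite grid.

    Returns +1 if every 2x2 determinant is non-negative (consistent with TP2),
    -1 if every determinant is non-positive (consistent with RR2), 0 otherwise.
    A function whose determinants all vanish is reported as +1.
    """
    dets = []
    for x1, x2 in combinations(sorted(xs), 2):      # pairs with x1 < x2
        for y1, y2 in combinations(sorted(ys), 2):  # pairs with y1 < y2
            dets.append(h(x1, y1) * h(x2, y2) - h(x1, y2) * h(x2, y1))
    if all(d >= -tol for d in dets):
        return 1
    if all(d <= tol for d in dets):
        return -1
    return 0

grid = [0.5, 1.0, 1.5, 2.0]
tp2_example = sr2_sign(lambda t, x: math.exp(t * x), grid, grid)    # exp(tx) is TP2
rr2_example = sr2_sign(lambda t, x: math.exp(-t * x), grid, grid)   # exp(-tx) is RR2
print(tp2_example, rr2_example)
```

The same helper can be applied to $(i,x)\mapsto h_i(x)$ with `xs = [1, 2]` to examine whether a pair of hazard rates is compatible with a relative hazard rate ordering on a given grid.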
Special Weighted Distributions
In this section, several special parametric weight functions are presented in order to make the investigation of the main model (4) more concrete. First, some general forms of the weight function are considered, which cover many important families of weighted distributions. In all cases we assume that the weight function has a finite mean with respect to the underlying distribution.

Distribution-Free Weight Functions
Here, several weight functions which do not depend on the underlying distribution are given. Suppose that $w_i$, $i=1,2$, are two non-negative functions of $x$ and that $k_i$, $i=1,2$, are two suitable functions of $\theta$, chosen so that the following weight functions satisfy the requirement $\mu(\theta)<\infty$. Substituting any of these weight functions in the density (4) leads to a particular model that may be of interest in a given context.

Semiparametric Models
Models in which the parameters of interest are finite-dimensional and the nuisance parameters are infinite-dimensional are called semiparametric models. Some choices of the weight function $w(x,\theta)$ are functionals of the underlying distribution function $F$ and also involve the parameter $\theta$. Below, we list some choices of this kind, in which the weight function depends on the underlying distribution.

Stochastic Orderings
In this section, preservation properties of some stochastic orders under the formation of the weighted model in the fixed as well as the random levels of the parameter θ are studied.

Weighted Distribution with Specific Parameter
Here, in the same vein as Misra et al. [16], several preservation properties of the likelihood ratio, hazard rate and reversed hazard rate orders are established for the model (1). Suppose that $X_i$ is a random variable with pdf $f_i$ and cdf $F_i$, for $i=1,2$, and assume that $X_{iw_i}$ follows the weighted distribution of $X_i$ with weight function $w_i(x)=w(x,\theta_i)$, having pdf
$$f_{iw_i}(x)=\frac{w_i(x)\,f_i(x)}{E(w_i(X_i))},$$
where $\theta_1$ and $\theta_2$ are two fixed numbers in $\chi$. In what follows, conditions are obtained under which a stochastic order between $X_1$ and $X_2$ is inherited, in the same form, by $X_{1w_1}$ and $X_{2w_2}$.
The following proposition is a direct consequence of Theorem 3.2 in Misra et al. [16].
Preservation properties of the stochastic orders considered in Proposition 1 have been obtained for some special weighted distributions by Izadkhah et al. [17], including the models of proportional (reversed) hazard rates, upper (lower) records, right (left) truncation, moment generating and size-biased distributions. Izadkhah et al. [18] obtained sufficient conditions for preservation of the reversed mean residual life order, and Izadkhah et al. [19] presented some conditions under which the mean residual life order is preserved under weighting. For the sake of completeness, the preservation properties of the likelihood ratio, the hazard rate and the reversed hazard rate orders are studied for some of the parametric weighted distributions considered in Sections 3.1 and 3.2. Suppose that $X_1$ and $X_2$ are two non-negative random variables with distribution functions $F_1$ and $F_2$, survival functions $\bar F_1=1-F_1$ and $\bar F_2=1-F_2$, and density functions $f_1$ and $f_2$, respectively.
Suppose that $w_2$ and $k_1$ are both non-decreasing (or both non-increasing) functions. By Proposition 1(i), if $X_1\le_{lr}X_2$ then $X_{1w_1}\le_{lr}X_{2w_2}$. Let us further assume that $k_1(\theta_i)>0$ for $i=1,2$. Then, by Proposition 1(ii), $X_1\le_{hr}X_2$ implies $X_{1w_1}\le_{hr}X_{2w_2}$ provided that $w_1$ and $w_2$ are both non-decreasing. In parallel, if $w_1$ and $w_2$ are both non-increasing, then by Proposition 1(iii), $X_1\le_{rh}X_2$ implies $X_{1w_1}\le_{rh}X_{2w_2}$.
Let us suppose that $w_2$ and $k_1$ are both non-decreasing (or both non-increasing) functions. By Proposition 1(i), $X_1\le_{lr}X_2$ implies $X_{1w_1}\le_{lr}X_{2w_2}$. If $w_1$, $w_2$ and $k_1$ are all non-decreasing functions, then Proposition 1(ii) establishes that $X_1\le_{hr}X_2$ implies $X_{1w_1}\le_{hr}X_{2w_2}$. In a similar manner, if $w_1$, $w_2$ and $k_1$ are all non-increasing functions, then from Proposition 1(iii) it is deduced that $X_1\le_{rh}X_2$ implies $X_{1w_1}\le_{rh}X_{2w_2}$.
We assume that $w_2$ and $k_1$ are both non-decreasing (or both non-increasing) functions. Proposition 1(i) guarantees that $X_1\le_{lr}X_2$ implies $X_{1w_1}\le_{lr}X_{2w_2}$. If $w_1$ is non-decreasing and $w_2$ and $k_1$ are non-increasing functions, then by Proposition 1(ii), $X_1\le_{hr}X_2$ yields $X_{1w_1}\le_{hr}X_{2w_2}$. In the dual case, if $w_1$ is non-increasing and $w_2$ and $k_1$ are both non-decreasing functions, then by Proposition 1(iii), $X_1\le_{rh}X_2$ gives $X_{1w_1}\le_{rh}X_{2w_2}$.
Some relative stochastic orders, including the relative (reversed) hazard rate and relative mean residual life orders, have attracted the attention of researchers in the last decade (cf. Di-Crescenzo and Longobardi [20], Kayid et al. [21], Misra and Francis [22], Misra et al. [23], Ding et al. [24], Ding and Zhang [25], Misra and Francis [26] and Misra and Francis [27]). We recall the definition of these orders from Rezaei et al. [28] and Kayid et al. [21] [see, for example, Definition 1(v) and (vi)]. In the next theorem, the study of preservation of the relative hazard rate and the relative reversed hazard rate orders is initiated for a well-known class of semiparametric distributions. For $i=1,2$, denote by $h_{iw_i}(t,\theta_i)$ ($\tilde h_{iw_i}(t,\theta_i)$) the hazard rate (resp. the reversed hazard rate) of $X_{iw_i}$, where $w_i$ and is supposed to be valid as a weight function. Before stating the result, we introduce some notation.
The symbol $\overset{\text{sign}}{=}$ is used to denote equality in sign.
Thus, for all $t>0$, we have: By assumption, $h_2(t)/h_1(t)$ is non-increasing in $t>0$. It suffices only to prove that: The assumption $X_1\le_{hr}X_2$ yields $h_1(t)\ge h_2(t)$, for all $t>0$, which further implies that $\bar F_1(t)\le\bar F_2(t)$, for all $t>0$. Therefore, is non-positive (resp. non-negative) for all $t>0$, if, and only if, for all $x_1\le x_2$ and for all $\theta_1\le\theta_2$ it holds that which is validated by assumption.
To present the result about the preservation of the relative reversed hazard rate order, we introduce some further notation. Let us define for $x\in[0,1]$, is non-decreasing (resp. non-increasing) in $x$, for all $\theta$, and non-decreasing (resp. non-increasing) in $\theta$, for all $x$, in which $\xi^*(x,\theta)$ is non-decreasing (non-increasing) in $x$ for all $\theta\in\chi$, then $X_1\le_{rrh}X_2$ implies that $X_{1w_1}\le_{rrh}X_{2w_2}$.
The weight functions considered in Theorems 1 and 2 encompass some particular cases which may be of independent interest. In that regard, we obtain the following corollary.
x v i (u, θ i ) du for $i=1,2$. Then (i) $X_{1w_1}\le_{rhr}X_{2w_2}$ if, and only if, $\xi_2(x,\theta_2)/\xi_1(x,\theta_1)$ is non-increasing in $x$. (ii) $X_{1w_1}\le_{rrh}X_{2w_2}$ if, and only if,
Proof. We only prove assertion (i), as the proof of (ii) is similar. Note that, analogously to the proof of Theorem 1, we can get It can be seen that, for all $t\ge 0$, $\frac{d}{dt}$ which is non-positive if, and only if, or equivalently if $\xi_2(x,\theta_2)/\xi_1(x,\theta_1)$ is non-increasing in $x\in[0,1]$, according to which the ratio $h_{2w_2}(t,\theta_2)/h_{1w_1}(t,\theta_1)$ is also non-increasing in $t>0$, that is, $X_{1w_1}\le_{rhr}X_{2w_2}$. The proof is complete.
The following corollary is a useful observation in the context of Theorem 3 as it illustrates that a typical parametric family of weighted distributions enjoys the relative hazard rate and the relative reversed hazard rate ordering properties in some cases.

Corollary 2.
Suppose that the random variables $X(\theta_1)$ and $X(\theta_2)$, for $\theta_1,\theta_2\in\chi$, have density functions x v(u, θ) du, we have:
In reliability and survival theory, the ordering of lifetimes of coherent systems is a relevant subject of study. To this end, Navarro et al. [29] obtained a representation of the system reliability $\bar F_{Sys}$ as a distorted function of the common component reliability $\bar F$, such that $\bar F_{Sys}(t)=h(\bar F(t))$, where $h$ is a non-decreasing function depending on the structure of the underlying system and the survival copula of the joint distribution of the component lifetimes. In this context, they have shown that the reliability function of a coherent system with dependent identically distributed (DID) components can be written as a distorted function of the common component reliability function. The following lemma is due to Navarro et al. [29].
Lemma 1. Let $\tau(X)$ be the lifetime of a coherent system formed by $n$ DID components with the vector of random lifetimes $X=(X_1,X_2,\ldots,X_n)$ with common survival function $\bar F$. Then the reliability function of $\tau(X)$ can be written as $\bar F_{\tau}(t)=h(\bar F(t))$, where $h:[0,1]\to[0,1]$ is a non-decreasing continuous function such that $h(0)=0$ and $h(1)=1$. The function $h$ is called the domination (or distortion) function; it depends on the structure function $\phi(\cdot)$ of the system (see, e.g., Barlow and Proschan [30]) and on the survival copula $C$ of $X_1,X_2,\ldots,X_n$.
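To make the domination function concrete, the sketch below computes $h(u)$ from a structure function by summing over all component states. It assumes independent components, which corresponds to the special case of the product survival copula and is therefore only one instance of the DID setting of Lemma 1. The 2-out-of-3 system and the function names are chosen for illustration and are not taken from the paper.

```python
from itertools import product

def distortion(phi, n, u):
    """Domination function h(u) for n independent components.

    phi maps a tuple of 0/1 component states to True when the system works;
    u is the common probability that a single component is still working.
    """
    total = 0.0
    for state in product((0, 1), repeat=n):
        if phi(state):
            k = sum(state)  # number of working components in this state
            total += u ** k * (1.0 - u) ** (n - k)
    return total

def two_out_of_three(state):
    """Structure function of a 2-out-of-3 system: works if at least 2 components work."""
    return sum(state) >= 2

grid = [i / 100 for i in range(101)]
values = [distortion(two_out_of_three, 3, u) for u in grid]

# h should start at 0, end at 1 and be non-decreasing on [0, 1].
is_monotone = all(a <= b + 1e-12 for a, b in zip(values, values[1:]))
print(values[0], values[50], values[-1], is_monotone)
```

For this system the sum reduces to the polynomial $h(u)=3u^2-2u^3$, so the system survival function is obtained as $h(\bar F(t))$ for any common component survival function $\bar F$.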
In the setup of the particular weighted distributions given in Section 3.2, the survival function of the resulting weighted distribution can be expressed as a distorted survival function, as specified in Lemma 1, for which the domination function is determined by the associated weight function. To this end, consider the weight function $w(x,\theta)=v(\bar F(x),\theta)$ and notice that in this case $X_w$ has the survival function
$$\bar F_w(t,\theta)=h_\theta(\bar F(t)),\qquad h_\theta(u)=\frac{\int_0^u v(s,\theta)\,ds}{\int_0^1 v(s,\theta)\,ds},\quad u\in[0,1],$$
where $h_\theta$ plays the role of a parametric domination function. Note that $h_\theta(\cdot):[0,1]\to[0,1]$ is a non-decreasing continuous function with $h_\theta(0)=0$ and $h_\theta(1)=1$. Conversely, if $h_\theta$ is a distortion (domination) function and $v(x,\theta)=h'_\theta(x)$ for any $x\in[0,1]$, then $\int_0^1 v(x,\theta)\,dx=1$ and thus $\bar F_w(t,\theta)=h_\theta(\bar F(t))$. Therefore, there is a one-to-one relationship between $v(\cdot,\theta)$ and $h_\theta(\cdot)$; that is, the study of weighted distributions in the context of semiparametric models covers the study of distorted survival functions and vice versa. The parameter $\theta$ may be any quantity that affects the magnitude of the system's lifetime. When the system is built from DID components, $\theta$ may be related to the dependence among the component lifetimes, in such a way that the survival copula in Lemma 1 depends on $\theta$. This is the case, for instance, when an Archimedean copula or the FGM copula is adopted to model the association between the lifetimes of components in a coherent system. The following results are useful for analysing relative ordering properties of coherent systems; to the best of our knowledge, such a study has not been developed in the literature thus far. The following proposition is a direct consequence of Theorem 3.
where $h_1$ and $h_2$ are two domination functions. Let $T_1$ and $T_2$ have respective survival functions $h_1(\bar F)$ and $h_2(\bar F)$. Then,
The following example illustrates an application of Proposition 2.
In the following example, we show that Proposition 2 can also be applied to systems with DID components.
The following example reveals a relative ordering property in the Marshall-Olkin family of distributions.

Example 8. Suppose that the incorporated weight function is
where $\bar\theta=1-\theta$ and $\theta>0$. The random variable $X_w$ has survival function
$$\bar F_w(t,\theta)=\frac{\theta\,\bar F(t)}{1-\bar\theta\,\bar F(t)},$$
so that $h_\theta(u)=\theta u/(1-\bar\theta u)$ is the relevant domination function. Note that the family of distributions characterized via (8) is called the proportional odds family of distributions, which is due to Marshall and Olkin [31]. Let $T_1$ and $T_2$ be two random variables with respective survival functions $h_{\theta_1}(\bar F)$ and $h_{\theta_2}(\bar F)$ such that $\theta_1>\theta_2$. It can be seen that It follows that $\frac{d}{dx}$ that is, $\xi_i(x)$ is RR$_2$ in $i=1,2$ and $x>0$. Thus, according to Proposition 2(i), we deduce that $T_1\le_{rhr}T_2$.
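The conclusion of this example can be examined numerically. The sketch below evaluates the hazard rates of $T_1$ and $T_2$ from the domination function $h_\theta(u)=\theta u/(1-\bar\theta u)$ and checks on a grid that the ratio of the hazard rate of $T_2$ to that of $T_1$ is non-increasing in $t$, which is the monotonicity associated with $T_1\le_{rhr}T_2$. The standard exponential baseline, the values $\theta_1=2$ and $\theta_2=0.5$, and the function names are assumptions made for illustration.

```python
import math

def h_theta(u, theta):
    """Marshall-Olkin domination function theta*u / (1 - (1 - theta)*u)."""
    return theta * u / (1.0 - (1.0 - theta) * u)

def h_theta_prime(u, theta):
    """Derivative of h_theta with respect to u."""
    return theta / (1.0 - (1.0 - theta) * u) ** 2

def sf(t):
    """Baseline survival function: standard exponential (assumption)."""
    return math.exp(-t)

def pdf(t):
    """Baseline pdf: standard exponential (assumption)."""
    return math.exp(-t)

def hazard_T(t, theta):
    """Hazard rate of T with survival function h_theta(sf(t))."""
    u = sf(t)
    return pdf(t) * h_theta_prime(u, theta) / h_theta(u, theta)

theta1, theta2 = 2.0, 0.5                      # illustrative values with theta1 > theta2
ts = [0.05 * k for k in range(201)]            # grid on [0, 10]
ratios = [hazard_T(t, theta2) / hazard_T(t, theta1) for t in ts]
non_increasing = all(a >= b - 1e-12 for a, b in zip(ratios, ratios[1:]))
print(ratios[0], ratios[-1], non_increasing)
```

With these choices the ratio simplifies to $(1+u)/(1-u/2)$ with $u=e^{-t}$, which decreases from 4 at $t=0$ towards 1 as $t$ grows.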

Comparisons of Mixture Weighted Distribution
In this subsection, the preservation of a number of stochastic orderings in the mixture weighted model is investigated. The study is carried out in two different settings: in the first, the random parameter varies in distribution while the underlying distribution remains unchanged; in the second, the underlying distribution changes while the distribution of the random parameter is fixed. The results obtained by Kayid et al. [32] are extended to cover more dynamic weighted distributions.
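It is shown that some stochastic orders of the random parameters, as well as of the underlying random variables, are transmitted to the random variables with the associated mixture weighted distribution. Let $\Theta_i$ be a random variable with pdf $g_i$, cdf $G_i$ and sf $\bar G_i=1-G_i$, for $i=1,2$. Consider the random variable $X_i^*$, $i=1,2$, having pdf
$$f_i^*(x)=f(x)\int_\chi\frac{w(x,\theta)}{\mu(\theta)}\,dG_i(\theta), \tag{9}$$
from which the cdf $F_i^*$ and the sf $\bar F_i^*=1-F_i^*$ of $X_i^*$ are obtained, after some straightforward algebra, respectively, by
$$F_i^*(x)=F(x)\int_\chi\frac{A(x,\theta)}{\mu(\theta)}\,dG_i(\theta),\qquad \bar F_i^*(x)=\bar F(x)\int_\chi\frac{B(x,\theta)}{\mu(\theta)}\,dG_i(\theta),$$
where the bivariate functions $A(x,\theta)=E[w(X,\theta)\mid X\le x]$ and $B(x,\theta)=E[w(X,\theta)\mid X>x]$ and the function $\mu$ are all determined as earlier in Section 1.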
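In the rest of the paper, it is assumed that the random variables $\Theta_1$ and $\Theta_2$ are independent. Denote by $h_i^*$ and $\tilde h_i^*$ the hazard rate and the reversed hazard rate of $X_i^*$, respectively. It can be seen, after some integral calculation, that
$$h_i^*(x)=h(x)\,\frac{\int_\chi \frac{w(x,\theta)}{\mu(\theta)}\,dG_i(\theta)}{\int_\chi \frac{B(x,\theta)}{\mu(\theta)}\,dG_i(\theta)}$$
and
$$\tilde h_i^*(x)=\tilde h(x)\,\frac{\int_\chi \frac{w(x,\theta)}{\mu(\theta)}\,dG_i(\theta)}{\int_\chi \frac{A(x,\theta)}{\mu(\theta)}\,dG_i(\theta)}.$$
The following result demonstrates the likelihood ratio order preservation in the model (9).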
Proof. It is easy to see that $X_1^*\le_{lr}X_2^*$ if, and only if, In view of (9), one gets By the assumption $\Theta_1\le_{lr}\Theta_2$ we can rely on the fact that It is also obvious that The general composition theorem of Karlin [15] yields the desired result.
In the setup of the model (9), the reversed hazard rate order of the random parameters is transferred to the overall random variables.
The last result establishes the reversed hazard rate ordering preservation in the baseline-varied mixture weighted model of (13).
Proof. First, we denote by $\tilde h_i$ and $\tilde h_i^{**}$ the reversed hazard rate functions of $X_i$ and $X_i^{**}$, respectively, for $i=1,2$. For all $x>0$, for all $x>0$, and $\theta\in\chi$. Thus It can be seen that $w(x,\theta)/A_1(x,\theta)$ is non-decreasing in $\theta\in\chi$. By assumption, for all $T\ge 0$. Lemma 7.1(a) of Barlow and Proschan [30] is applicable in (15) and provides the proof.

A Link to Information Theory
The concept of entropy in information theory has played a prominent role in a broad area of science including statistical thermodynamics, urban and regional planning, business, economics, finance, operations research, queueing theory, spectral analysis, image reconstruction, biology and manufacturing (see, for example, El Gamal and Kim [35], Brillouin [36], Khinchin [37] and Grant [38]).
Before closing the paper, we present a stochastic ordering property that leads to an ordering of the entropies of weighted distributions with weight functions given in Section 3.2. The extension of the Shannon entropy from the discrete case to the absolutely continuous case, when dealing with lifetime random variables, is defined by
$$H(f)=-\int_0^\infty f(x)\,\log f(x)\,dx,$$
where $f$ is the pdf of a non-negative random variable $X$ with an absolutely continuous distribution function. Note that $\log$ stands for the natural logarithm, with the convention $0\log(0)=0$.
The entropy is known to be related to the dispersion of random variables. In view of this fact, it is useful to concentrate on dispersion measures of probability distributions as well as on the related stochastic dispersion orderings.
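As a small numerical check of the differential Shannon entropy $-\int_0^\infty f\log f$, the sketch below evaluates it by quadrature for an exponential density with rate 2, for which the closed form $1-\log 2$ is available. The choice of density, the truncation point and the helper names are assumptions made for illustration.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * step)
    return total * step / 3

def shannon_entropy(pdf, upper, n=20000):
    """Approximate -integral of pdf * log(pdf) over [0, upper], with 0*log(0) = 0."""
    def integrand(x):
        fx = pdf(x)
        return 0.0 if fx <= 0.0 else -fx * math.log(fx)
    return simpson(integrand, 0.0, upper, n)

rate = 2.0  # illustrative exponential rate (assumption)
entropy = shannon_entropy(lambda x: rate * math.exp(-rate * x), upper=40.0)
closed_form = 1.0 - math.log(rate)  # entropy of an exponential law with this rate
print(entropy, closed_form)
```

The same helper can be applied to a weighted density, for example one built from a weight function of the kind discussed in Section 3.2, to compare entropies numerically.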
In the spirit of Theorem 3.B.20(a) and Theorem 3.B.20(b) in Shaked and Shanthikumar [34], if $X$ or $Y$ has an increasing hazard rate function, then
$$X\le_{disp}Y\implies X\le_{hr}Y,$$
and if $X$ or $Y$ has a decreasing hazard rate function, then
$$X\le_{hr}Y\implies X\le_{disp}Y.$$
In accordance with Corollary 4.4 in Bartoszewicz [45], if $X$ or $Y$ has a decreasing reversed hazard rate function, then $X\le_{disp}Y\implies X\le_{rh}Y$, and if $X$ or $Y$ has an increasing reversed hazard rate function, then
If $X$ and $Y$ are two random variables with supports $S_X=(l_X,u_X)$ and $S_Y=(l_Y,u_Y)$, respectively, then according to Theorem 3.B.13(a) in Shaked and Shanthikumar [34], when $l_X=l_Y>-\infty$,
$$X\le_{disp}Y\implies X\le_{st}Y,$$
and also according to Theorem 3.B.13(b) in Shaked and Shanthikumar [34], when $u_X=u_Y<\infty$,
$$X\le_{disp}Y\implies X\ge_{st}Y.$$
The weight functions $w_\theta(x)$ and $v_\theta(x)$ considered in the following theorem depend on $x$ only through $F(x)$ and $G(x)$, respectively.
Theorem 9. Let $X_{w_\theta}$ and $Y_{v_\theta}$ be the weighted versions of $X$ and $Y$ with weight functions $w_\theta(x)=d_\theta(F(x))$ and $v_\theta(x)=d_\theta(G(x))$, respectively, where $\theta\in\chi$ is a parameter and $d_\theta$ is a non-negative function so that