Article

Stratified Median Estimation Using Auxiliary Transformations: A Robust and Efficient Approach in Asymmetric Populations

by Abdulaziz S. Alghamdi 1 and Fatimah A. Almulhim 2,*
1 Department of Mathematics, College of Science & Arts, King Abdulaziz University, P.O. Box 344, Rabigh 21911, Saudi Arabia
2 Department of Mathematical Sciences, College of Science, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
* Author to whom correspondence should be addressed.
Symmetry 2025, 17(7), 1127; https://doi.org/10.3390/sym17071127
Submission received: 19 May 2025 / Revised: 5 July 2025 / Accepted: 8 July 2025 / Published: 14 July 2025
(This article belongs to the Section Mathematics)

Abstract

This study estimates the population median through stratified random sampling, which enhances accuracy by ensuring the proper representation of key population groups. The proposed class of estimators based on transformations effectively handles data variability and enhances estimation efficiency. We examine bias and mean square error expressions up to the first-order approximation for both existing and newly introduced estimators, establishing theoretical conditions for their applicability. Moreover, to assess the effectiveness of the suggested estimators, five simulated datasets derived from distinct asymmetric distributions (gamma, log-normal, Cauchy, uniform, and exponential), along with actual datasets, are used for numerical analysis. These estimators are designed to significantly enhance the precision and effectiveness of median estimation, resulting in more reliable and consistent outcomes. Comparative analysis using percent relative efficiency (PRE) reveals that the proposed estimators perform better than conventional approaches.

1. Introduction

In modern surveys, stratification is a standard technique for improving estimation accuracy. In a stratified design, the population is divided into strata that are internally homogeneous, and a sample is drawn from each stratum, usually by simple random sampling. The finite population median is particularly useful in stratified random sampling because it summarizes central tendency well in skewed datasets across strata. For instance, in income distribution studies, stratifying by socioeconomic category reduces the impact of extreme values and enables a more accurate assessment of median income. In healthcare research, the median recovery time is likewise estimated by stratifying individuals by age or illness severity, providing a reliable indicator of typical patient outcomes. Combined with stratification, the median therefore gives a more accurate picture of the central value within each subgroup.
Estimating the finite population median has received comparatively less attention than the vast number of studies devoted to estimating finite population parameters like the mean, variance, proportion, and total [1,2,3,4]. When describing central tendency in highly skewed distributions, including those pertaining to income, spending, taxes, and output, the median is especially helpful since it offers a more reliable measure than the mean in these situations. The need for more research to improve estimation approaches employing auxiliary variables is highlighted by the fact that, despite its importance, the development of efficient methodologies for median estimation in finite populations is still an understudied area. More details about the utilization of information on unknown population parameters can be found in [5,6,7,8,9].
The estimation of the population median using auxiliary information has been significantly advanced by foundational studies [1,2,3,4], which have provided the basis for further research in this area. To estimate the finite population median using various sampling approaches, several estimators have been produced over time [4]. Notably, Ref. [10] presented novel techniques to improve regression and ratio estimators for estimating the median. A generalized family of estimators using two auxiliary variables within the same framework was suggested by [11], while Ref. [12] investigated double sampling-based strategies to increase accuracy. Furthermore, by adding the known median of an auxiliary variable, Ref. [13] defined a minimal unbiased estimator. Considerable progress was achieved by [14], who investigated median estimation in two-phase sampling with two auxiliary variables. To enhance the median approximation under stratified random sampling and simple random sampling, better estimators were proposed by [15,16]. More recently, different estimators were designed by [17,18,19], which included auxiliary data for a variety of population parameters under various sampling strategies. Ref. [20] proposed certain estimators and conducted a simulation-based evaluation of robust transformation techniques for median estimation under simple random sampling. A new transformation-based estimator for population median estimation using auxiliary variables was proposed by [21], supported by a simulation study with real data across various sample sizes and parameter settings. The estimation of the population median using bivariate auxiliary information was studied under simple random sampling by [22]. There have been several developments in recent years as a result of the increased interest in developing more effective estimators for median estimation. Readers can consult [23,24,25,26,27,28,29,30] and the references therein for a thorough understanding of this topic.

2. Motivation and Research Gap

While extensive research has been conducted on the estimation of finite population parameters such as the mean, variance, and proportion, comparatively limited attention has been paid to the estimation of the population median, particularly within the context of stratified sampling. Most existing approaches rely on auxiliary variables under the assumption of symmetric distributions, which often do not reflect practical conditions where data can be skewed or contain outliers. Furthermore, conventional estimators typically use standard forms of auxiliary information without applying transformation-based or robust techniques, limiting their effectiveness in non-normal settings.
In the presence of extreme values or heavy-tailed distributions, the accuracy of traditional estimators (such as ratio, product, regression, and exponential estimators) often deteriorates, producing unreliable results. This limitation stems from their dependence on classical auxiliary variables, which are sensitive to outliers. To address these issues, this study proposes a new class of ratio product estimators that apply transformation strategies and robust statistical measures. By combining the trimean and decile mean with five alternative auxiliary variables, including the interquartile range, midrange, quartile average, and quartile deviation, the proposed estimators are designed to improve the precision and reliability of median estimates under stratified sampling. These enhancements make them particularly suitable for skewed data, small sample sizes, and applications in fields such as environmental research, healthcare, and economics.
To summarize, the main contributions of this paper are as follows:
  • Methodological contribution: A new class of transformation-based ratio product estimators is proposed for the stratified estimation of the population median using robust and nontraditional auxiliary variables.
  • Improved stability across strata: Through the use of transformation techniques within each stratum, the estimator maintains consistent performance and minimizes the effect of internal variation.
  • Theoretical contribution: Bias and mean-squared error (MSE) expressions are derived using first-order approximations, and mathematical conditions for superior performance are established.
  • Empirical contribution: The proposed estimators are evaluated through comprehensive simulation studies using five asymmetric distributions and validated using three real-world datasets, with performance assessed via percent relative efficiency (PRE).
  • Effective with small sample sizes: It delivers accurate median estimates even when sample sizes within strata are limited, making it practical for studies with restricted data availability.
  • Resistant to outliers: Its design naturally reduces the impact of extreme values, ensuring more dependable and stable median estimates.
  • Effective in applied fields: The estimator is particularly useful in fields such as health, education, and economics, where data often show departures from normality and are grouped into strata.
The remainder of the paper is organized as follows. Section 3 explains the basic notation and concepts used in this study and summarizes the existing estimators. Section 4 presents the proposed class of estimators. Section 5 gives a mathematical comparison of the existing and newly proposed estimators. Section 6 describes a simulation study, based on populations generated from various distributions, that evaluates the theoretical conclusions of Section 5; it also contains numerical examples on real data that illustrate the practical value of the theoretical results. Finally, Section 7 summarizes the overall conclusions and suggests directions for future research.

3. Concepts and Existing Estimators

A finite population $\delta = (\delta_1, \delta_2, \ldots, \delta_N)$ of size $N$ is partitioned into $L$ distinct strata. The $h$th stratum contains $N_h$ units, where $h = 1, 2, \ldots, L$, so that the total number of units across all strata satisfies
$$\sum_{h=1}^{L} N_h = N.$$
Within the $h$th stratum, let $Y$ denote the primary variable of interest and $X$ the auxiliary variable. The values of $Y$ and $X$ for the $i$th unit $(i = 1, 2, \ldots, N_h)$ are denoted by $y_{hi}$ and $x_{hi}$, respectively. From the $h$th stratum, a random sample of size $n_h$ is chosen without replacement, so that the overall sample size satisfies
$$\sum_{h=1}^{L} n_h = n.$$
The population medians of the study and auxiliary variables in the $h$th stratum are denoted by $\Omega_{y_h}$ and $\Omega_{x_h}$, and the corresponding sample medians by $\hat{\Omega}_{y_h}$ and $\hat{\Omega}_{x_h}$. The associated probability density functions, evaluated at the medians, are $f_{y_h}(\Omega_{y_h})$ and $f_{x_h}(\Omega_{x_h})$. The correlation between the medians of the study variable and the auxiliary variable within the $h$th stratum is denoted by $\rho_{y_h x_h}$ and is expressed as
$$\rho_{y_h x_h} = \rho(\Omega_{y_h}, \Omega_{x_h}) = 4 P_{11}(y_h, x_h) - 1,$$
where $P_{11}$ is the joint probability
$$P_{11} = P(y_h \le \Omega_{y_h} \cap x_h \le \Omega_{x_h}).$$
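As a rough illustration of the relation $\rho = 4P_{11} - 1$, the following Python sketch estimates $P_{11}$ empirically as the share of units lying at or below both medians. The function name `median_correlation` and the toy data are assumptions made for illustration only; they do not come from the paper.

```python
from statistics import median

def median_correlation(y, x):
    """Empirical version of rho = 4 * P11 - 1 for paired observations."""
    if len(y) != len(x) or not y:
        raise ValueError("y and x must be non-empty and of equal length")
    med_y, med_x = median(y), median(x)
    # P11: proportion of units with y <= median(y) and x <= median(x)
    p11 = sum(1 for yi, xi in zip(y, x) if yi <= med_y and xi <= med_x) / len(y)
    return 4.0 * p11 - 1.0

# Toy data (hypothetical): y tends to increase with x.
x_demo = [1, 2, 3, 4, 5, 6, 7, 8]
y_demo = [2, 1, 4, 3, 6, 5, 8, 7]
rho_demo = median_correlation(y_demo, x_demo)
```

When the two variables are independent, about a quarter of the units fall below both medians, so the measure is close to zero; it approaches 1 or -1 as the variables move together or in opposite directions.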
To determine the mathematical properties of the estimators, such as their biases, mean-squared errors (MSEs), and minimum MSEs, the following relative error terms are employed:
$$\xi_{0h} = \frac{\hat{\Omega}_{y_h} - \Omega_{y_h}}{\Omega_{y_h}} \quad \text{and} \quad \xi_{1h} = \frac{\hat{\Omega}_{x_h} - \Omega_{x_h}}{\Omega_{x_h}},$$
such that $E(\xi_{ih}) = 0$ for $i = 0, 1$, and
$$E(\xi_{0h}^2) = \theta_h \Gamma_{y_h}^2, \qquad E(\xi_{1h}^2) = \theta_h \Gamma_{x_h}^2, \qquad E(\xi_{0h}\xi_{1h}) = \theta_h \Gamma_{y_h x_h}, \quad \text{with } \Gamma_{y_h x_h} = \rho_{y_h x_h}\Gamma_{y_h}\Gamma_{x_h},$$
where
$$\Gamma_{y_h} = \frac{1}{\Omega_{y_h} f_{y_h}(\Omega_{y_h})}, \qquad \Gamma_{x_h} = \frac{1}{\Omega_{x_h} f_{x_h}(\Omega_{x_h})}$$
represent the $h$th stratum coefficients of variation for the population values of the variables $Y$ and $X$, and
$$\theta_h = \frac{1}{4}\left(\frac{1}{n_h} - \frac{1}{N_h}\right)$$
is the finite population correction factor. For more details about the symbols and notation, see Table 1.
In survey sampling, classical estimators such as ratio, difference, and exponential-type estimators are frequently used to improve the efficiency of point estimates by incorporating auxiliary information. These estimators are traditionally applied to population mean estimation but have also been successfully extended to median estimation. The ratio estimator tends to perform well when the study variable and auxiliary variable are positively correlated, leveraging the auxiliary variable as a proxy to adjust the estimate. The difference estimator is effective when there is a linear relationship with a non-zero intercept between the two variables, allowing correction through additive adjustment. Exponential-type estimators offer enhanced flexibility, especially in modeling nonlinear or multiplicative relationships. When adapted for median estimation, these techniques help reduce the mean-squared error (MSE) of the estimator by stabilizing variation across strata and making better use of known auxiliary characteristics. This rationale provides the foundation for the classical and extended estimators considered in this study.
In the following part, we explore the biases and mean-squared errors (MSEs) of the traditional stratified estimators designed to estimate the population median. We then proceed to compare these results with those from our newly proposed estimators to assess potential areas of improvement.
The standard unbiased estimator of the stratified population median is expressed as follows:
$$\hat{\Omega}_{yst} = \sum_{h=1}^{L} W_h \hat{\Omega}_{y_h}. \tag{1}$$
The variance of $\hat{\Omega}_{yst}$ is
$$V(\hat{\Omega}_{yst}) = \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2 \Gamma_{y_h}^2, \tag{2}$$
where $W_h = N_h / N$ is the known weight of the $h$th stratum.
A ratio-type median estimator was used in [4]. Its stratified version, denoted by $\hat{\Omega}_{Rst}$, is
$$\hat{\Omega}_{Rst} = \sum_{h=1}^{L} W_h \frac{\hat{\Omega}_{y_h}}{\hat{\Omega}_{x_h}} \Omega_{x_h}. \tag{3}$$
The bias and mean-squared error (MSE) of $\hat{\Omega}_{Rst}$ are
$$\mathrm{Bias}(\hat{\Omega}_{Rst}) \cong \sum_{h=1}^{L} \theta_h W_h \Omega_{y_h}\left(\Gamma_{x_h}^2 - \Gamma_{y_h x_h}\right) \tag{4}$$
and
$$\mathrm{MSE}(\hat{\Omega}_{Rst}) \cong \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2\left(\Gamma_{y_h}^2 + \Gamma_{x_h}^2 - 2\Gamma_{y_h x_h}\right). \tag{5}$$
A difference estimator was introduced in [13]. Its stratified version, denoted by $\hat{\Omega}_{Dst}$, is
$$\hat{\Omega}_{Dst} = \sum_{h=1}^{L} W_h \left[\hat{\Omega}_{y_h} + d_h\left(\Omega_{x_h} - \hat{\Omega}_{x_h}\right)\right], \tag{6}$$
where $d_h$ is an unknown constant whose optimum value for the $h$th stratum is
$$d_{h(\min)} = \rho_{y_h x_h} \frac{\Omega_{y_h}\Gamma_{y_h}}{\Omega_{x_h}\Gamma_{x_h}}.$$
The minimum mean-squared error of $\hat{\Omega}_{Dst}$ is
$$\mathrm{MSE}(\hat{\Omega}_{Dst})_{\min} \cong \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2 \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right). \tag{7}$$
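Using the concept proposed by [31], we write the following exponential-type estimators for median estimation under stratified sampling:
$$\hat{\Omega}_{Rest} = \sum_{h=1}^{L} W_h \hat{\Omega}_{y_h} \exp\left(\frac{\Omega_{x_h} - \hat{\Omega}_{x_h}}{\Omega_{x_h} + \hat{\Omega}_{x_h}}\right) \tag{8}$$
and
$$\hat{\Omega}_{Pest} = \sum_{h=1}^{L} W_h \hat{\Omega}_{y_h} \exp\left(\frac{\hat{\Omega}_{x_h} - \Omega_{x_h}}{\Omega_{x_h} + \hat{\Omega}_{x_h}}\right). \tag{9}$$
The biases and MSEs of $\hat{\Omega}_{Rest}$ and $\hat{\Omega}_{Pest}$ are
$$\mathrm{Bias}(\hat{\Omega}_{Rest}) \cong \sum_{h=1}^{L} \theta_h W_h \Omega_{y_h}\left(\frac{3}{8}\Gamma_{x_h}^2 - \frac{1}{2}\Gamma_{y_h x_h}\right), \tag{10}$$
$$\mathrm{Bias}(\hat{\Omega}_{Pest}) \cong \sum_{h=1}^{L} \theta_h W_h \Omega_{y_h}\left(\frac{1}{2}\Gamma_{y_h x_h} - \frac{3}{8}\Gamma_{x_h}^2\right), \tag{11}$$
$$\mathrm{MSE}(\hat{\Omega}_{Rest}) \cong \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2\left(\Gamma_{y_h}^2 + \frac{1}{4}\Gamma_{x_h}^2 - \Gamma_{y_h x_h}\right) \tag{12}$$
and
$$\mathrm{MSE}(\hat{\Omega}_{Pest}) \cong \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2\left(\Gamma_{y_h}^2 + \frac{1}{4}\Gamma_{x_h}^2 + \Gamma_{y_h x_h}\right). \tag{13}$$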
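To make the first-order formulas above concrete, the following Python sketch evaluates the variance in (2) and the MSEs in (5), (7), (12), and (13) from per-stratum summaries. The function name `classical_mses`, the dictionary keys, and the single-stratum input values are hypothetical choices for illustration, not part of the paper.

```python
def classical_mses(strata):
    """First-order variance/MSEs of the baseline, ratio, difference, and
    exponential estimators, summed over strata.

    Each stratum is a dict with keys N, n, med_y, med_x, f_y, f_x, rho,
    where f_y and f_x are the densities evaluated at the medians.
    """
    n_total = sum(s["N"] for s in strata)
    out = {"yst": 0.0, "Rst": 0.0, "Dst": 0.0, "Rest": 0.0, "Pest": 0.0}
    for s in strata:
        w = s["N"] / n_total                              # stratum weight W_h
        theta = 0.25 * (1.0 / s["n"] - 1.0 / s["N"])      # theta_h
        g_y = 1.0 / (s["med_y"] * s["f_y"])               # Gamma_yh
        g_x = 1.0 / (s["med_x"] * s["f_x"])               # Gamma_xh
        g_yx = s["rho"] * g_y * g_x                       # Gamma_yhxh
        base = theta * w ** 2 * s["med_y"] ** 2
        out["yst"] += base * g_y ** 2
        out["Rst"] += base * (g_y ** 2 + g_x ** 2 - 2.0 * g_yx)
        out["Dst"] += base * g_y ** 2 * (1.0 - s["rho"] ** 2)
        out["Rest"] += base * (g_y ** 2 + 0.25 * g_x ** 2 - g_yx)
        out["Pest"] += base * (g_y ** 2 + 0.25 * g_x ** 2 + g_yx)
    return out

# Hypothetical single-stratum summary, chosen only to keep the arithmetic simple.
demo = [{"N": 100, "n": 20, "med_y": 10.0, "med_x": 5.0,
         "f_y": 0.05, "f_x": 0.1, "rho": 0.5}]
mses = classical_mses(demo)
```

With a positive correlation, the product-type exponential estimator has the largest MSE in this toy setting, while the difference estimator at its optimum constant has the smallest, which matches the ordering implied by the formulas.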
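The following difference-type estimators for the median, introduced by [10,14], can be written under stratified random sampling as
$$\hat{\Omega}_{D1st} = \sum_{h=1}^{L} W_h \left[d_{1h}\hat{\Omega}_{y_h} + d_{2h}\left(\Omega_{x_h} - \hat{\Omega}_{x_h}\right)\right], \tag{14}$$
$$\hat{\Omega}_{D2st} = \sum_{h=1}^{L} W_h \left[d_{3h}\hat{\Omega}_{y_h} + d_{4h}\left(\Omega_{x_h} - \hat{\Omega}_{x_h}\right)\right]\frac{\Omega_{x_h}}{\hat{\Omega}_{x_h}}, \tag{15}$$
$$\hat{\Omega}_{D3st} = \sum_{h=1}^{L} W_h \left[d_{5h}\hat{\Omega}_{y_h} + d_{6h}\left(\Omega_{x_h} - \hat{\Omega}_{x_h}\right)\right]\exp\left(\frac{\Omega_{x_h} - \hat{\Omega}_{x_h}}{\Omega_{x_h} + \hat{\Omega}_{x_h}}\right). \tag{16}$$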
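The optimum values of the constants $d_{ih}$ $(i = 1, 2, \ldots, 6)$ are
$$d_{1h(opt)} = \frac{1}{1 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)},$$
$$d_{2h(opt)} = \frac{\Omega_{y_h}}{\Omega_{x_h}}\left[\frac{\rho_{y_h x_h}\Gamma_{y_h}}{1 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right],$$
$$d_{3h(opt)} = \frac{1 - \theta_h \Gamma_{y_h}^2}{1 - \theta_h \Gamma_{y_h}^2 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)},$$
$$d_{4h(opt)} = \frac{\Omega_{y_h}}{\Omega_{x_h}}\left[1 + d_{3h(opt)}\left(\frac{\rho_{y_h x_h}\Gamma_{y_h}}{\Gamma_{x_h}} - 2\right)\right],$$
$$d_{5h(opt)} = \frac{1}{8}\left[\frac{8 - \theta_h \Gamma_{x_h}^2}{1 + \theta_h \Gamma_{x_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right]$$
and
$$d_{6h(opt)} = \frac{\Omega_{y_h}}{\Omega_{x_h}}\left[\frac{1}{2} + d_{5h(opt)}\left(\frac{\rho_{y_h x_h}\Gamma_{y_h}}{\Gamma_{x_h}} - 1\right)\right].$$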
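Using the optimum values of $d_{ih}$ $(i = 1, 2, \ldots, 6)$ for the $h$th stratum, the biases and minimum mean-squared errors are
$$\mathrm{Bias}(\hat{\Omega}_{D1st}) \cong \sum_{h=1}^{L} \theta_h W_h \Omega_{y_h}\left(d_{1h} - 1\right), \tag{17}$$
$$\mathrm{Bias}(\hat{\Omega}_{D2st}) \cong \sum_{h=1}^{L} \theta_h W_h \left[\Omega_{y_h}\left(d_{3h} - 1\right) + \theta_h d_{3h}\Omega_{y_h}\left(\Gamma_{x_h}^2 - \Gamma_{y_h x_h}\right) + \theta_h d_{4h}\Omega_{x_h}\Gamma_{x_h}^2\right], \tag{18}$$
$$\mathrm{Bias}(\hat{\Omega}_{D3st}) \cong \sum_{h=1}^{L} \theta_h W_h \left[\Omega_{y_h}\left(d_{5h} - 1\right) + \theta_h d_{5h}\Omega_{y_h}\left(\frac{3}{8}\Gamma_{x_h}^2 - \frac{1}{2}\Gamma_{y_h x_h}\right) + \frac{\theta_h}{2} d_{6h}\Omega_{x_h}\Gamma_{x_h}^2\right], \tag{19}$$
$$\mathrm{MSE}(\hat{\Omega}_{D1st})_{\min} \cong \sum_{h=1}^{L} \theta_h W_h^2 \left[\frac{\theta_h \Omega_{y_h}^2 \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}{1 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right], \tag{20}$$
$$\mathrm{MSE}(\hat{\Omega}_{D2st})_{\min} \cong \sum_{h=1}^{L} \theta_h W_h^2 \left[\frac{\theta_h \Omega_{y_h}^2\left(1 - \theta_h \Gamma_{x_h}^2\right)\Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}{1 - \theta_h \Gamma_{x_h}^2 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right] \tag{21}$$
and
$$\mathrm{MSE}(\hat{\Omega}_{D3st})_{\min} \cong \sum_{h=1}^{L} \theta_h W_h^2 \left[\frac{\theta_h \Omega_{y_h}^2\left\{\Gamma_{x_h}^2\left(1 - \rho_{y_h x_h}^2\right) - \frac{\theta_h}{4}\Gamma_{x_h}^2\left(\frac{1}{16}\Gamma_{x_h}^2 + \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)\right)\right\}}{1 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right]. \tag{22}$$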

4. Suggested Class of Estimators

Motivated by [32,33,34,35], we introduce a stratified class of estimators that uses a transformation approach to estimate the population median. The transformation improves efficiency and helps the estimators cope with data variability. The proposed class is expressed as follows:
$$\hat{\Omega}_{est} = \sum_{h=1}^{L} W_h \hat{\Omega}_{y_h} \exp\left\{T_{1h}\,\frac{a_{1h}\left(\hat{\Omega}_{x_h} - \Omega_{x_h}\right)}{a_{1h}\left(\Omega_{x_h} + \hat{\Omega}_{x_h}\right) + 2a_{2h}}\right\} \exp\left\{T_{2h}\,\frac{b_{1h}\left(\Omega_{x_h} - \hat{\Omega}_{x_h}\right)}{b_{1h}\left(\Omega_{x_h} + \hat{\Omega}_{x_h}\right) + 2b_{2h}}\right\}, \tag{23}$$
where the known parameters $(a_{1h}, a_{2h}, b_{1h}, b_{2h})$ are functions of the auxiliary variable $X$, and $T_{ih}$ $(i = 1, 2)$ are constants. Different combinations of $a_{1h}$, $a_{2h}$, $b_{1h}$, and $b_{2h}$, detailed in Table 2, yield different members of the class defined in Equation (23).
  • where
    $T_h = T_{2h}\,\dfrac{\Omega_{x_h} - \hat{\Omega}_{x_h}}{\Omega_{x_h} + \hat{\Omega}_{x_h} + 2\left(X_{h\max} - X_{h\min}\right)}$,
    $b_{1h} = 1$,
    $b_{2h} = X_{h\max} - X_{h\min}$,
    Interquartile range: $QR_h = Q_{3h} - Q_{1h}$,
    Midrange: $MR_h = \dfrac{X_{h\max} + X_{h\min}}{2}$,
    Quartile average: $QA_h = \dfrac{Q_{3h} + Q_{1h}}{2}$,
    Quartile deviation: $QD_h = \dfrac{Q_{3h} - Q_{1h}}{2}$,
    Trimean: $TM_h = \dfrac{Q_{1h} + 2Q_{2h} + Q_{3h}}{4}$,
    Decile mean: $DM_h = \dfrac{1}{9}\sum_{i=1}^{9} D_{ih}$.
    Note: The pairings of robust statistics in Table 2 (e.g., $QR_h$ with $DM_h$, $MR_h$ with $TM_h$, etc.) were selected based on heuristic reasoning, aiming to combine measures that reflect complementary aspects of the auxiliary variable, such as central tendency and dispersion. These combinations are not empirically optimized but are grounded in the statistical properties of the paired measures.
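The robust measures listed above can be computed for one stratum of the auxiliary variable as in the following Python sketch. The function name `robust_measures` is hypothetical, and the quantile convention (the `"inclusive"` method of `statistics.quantiles`) is an assumption, since the paper does not state which quantile definition it uses; other conventions give slightly different values in small samples.

```python
from statistics import quantiles

def robust_measures(x):
    """Robust summaries of an auxiliary variable within one stratum."""
    # Quartiles Q1, Q2, Q3 and deciles D1..D9 (assumed 'inclusive' convention).
    q1, q2, q3 = quantiles(x, n=4, method="inclusive")
    deciles = quantiles(x, n=10, method="inclusive")
    x_min, x_max = min(x), max(x)
    return {
        "QR": q3 - q1,                   # interquartile range
        "MR": (x_max + x_min) / 2,       # midrange
        "QA": (q3 + q1) / 2,             # quartile average
        "QD": (q3 - q1) / 2,             # quartile deviation
        "TM": (q1 + 2 * q2 + q3) / 4,    # trimean
        "DM": sum(deciles) / 9,          # decile mean
    }

# Toy stratum (hypothetical values).
measures = robust_measures([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

For symmetric toy data such as this, the location-type measures (midrange, quartile average, trimean, decile mean) coincide; they separate once the stratum is skewed or contains outliers, which is the situation the proposed class targets.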
Equation (23) introduces a generalized transformation-based class of estimators, representing a significant theoretical advancement in stratified median estimation. By using flexible combinations of robust auxiliary measures and transformation parameters, this class unifies and extends several existing estimators as special cases. This broader structure enhances adaptability across various data patterns, particularly in skewed or heavy-tailed populations, offering improved estimation accuracy and efficiency over traditional approaches.
The bias and MSE of the proposed class of estimators $\hat{\Omega}_{est}$ are given in the following theorem. Theorem 1, along with Equations (28) and (29), uses the first-order approximation to derive these expressions. This approximation simplifies the complex stochastic relationships between sample statistics and population parameters while still capturing the essential behavior of the estimators with sufficient accuracy for practical and theoretical comparison.
Theorem 1.
Let $\hat{\Omega}_{est}$ denote the proposed family of estimators of the overall population median $\Omega_y$ within a stratified sampling framework. To the first order of approximation, its bias and mean-squared error are
$$\mathrm{Bias}(\hat{\Omega}_{est}) \cong -\frac{1}{8}\sum_{h=1}^{L} \theta_h W_h \Omega_{y_h}\left[\left(t_{1h}^2 - t_{2h}^2\right)\Gamma_{x_h}^2 - 2\left(2t_{1h} - 2t_{2h} - t_{1h}t_{2h}\right)\Gamma_{y_h x_h}\right]$$
and
$$\mathrm{MSE}(\hat{\Omega}_{est}) \cong \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2\left[\Gamma_{y_h}^2 + \frac{t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h}}{4}\Gamma_{x_h}^2 - \left(t_{1h} - t_{2h}\right)\Gamma_{y_h x_h}\right].$$
Proof. 
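We first recall the key quantities needed for the proof:
$$\xi_{0h} = \frac{\hat{\Omega}_{y_h} - \Omega_{y_h}}{\Omega_{y_h}}, \qquad \xi_{1h} = \frac{\hat{\Omega}_{x_h} - \Omega_{x_h}}{\Omega_{x_h}},$$
with $E(\xi_{ih}) = 0$ for $i = 0, 1$,
$$E(\xi_{0h}^2) = \theta_h \Gamma_{y_h}^2, \qquad E(\xi_{1h}^2) = \theta_h \Gamma_{x_h}^2$$
and
$$E(\xi_{0h}\xi_{1h}) = \theta_h \Gamma_{y_h x_h},$$
where
$$\Gamma_{y_h} = \frac{1}{\Omega_{y_h} f_{y_h}(\Omega_{y_h})}, \qquad \Gamma_{x_h} = \frac{1}{\Omega_{x_h} f_{x_h}(\Omega_{x_h})}.$$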
Rewriting Equation (23) in terms of the error terms allows us to obtain the bias and MSE of $\hat{\Omega}_{est}$:
$$\hat{\Omega}_{est} = \sum_{h=1}^{L} W_h \Omega_{y_h}\left(1 + \xi_{0h}\right)\exp\left\{T_{1h}\frac{t_{1h}\xi_{1h}}{2}\left(1 + \frac{t_{1h}\xi_{1h}}{2}\right)^{-1}\right\}\exp\left\{-T_{2h}\frac{t_{2h}\xi_{1h}}{2}\left(1 + \frac{t_{2h}\xi_{1h}}{2}\right)^{-1}\right\}, \tag{24}$$
where $t_{1h}$ and $t_{2h}$ are defined as
$$t_{1h} = \frac{a_{1h}\Omega_{x_h}}{a_{1h}\Omega_{x_h} + a_{2h}}$$
and
$$t_{2h} = \frac{\Omega_{x_h}}{\Omega_{x_h} + b_{2h}}.$$
The right-hand side of Equation (24) is expanded using a Taylor series under the first-order approximation. Terms of degree higher than two in the errors $\xi_{ih}$ are dropped, since their contributions are negligible. This gives
$$\hat{\Omega}_{est} = \sum_{h=1}^{L} W_h \Omega_{y_h}\left(1 + \xi_{0h}\right)\exp\left\{T_{1h}\frac{t_{1h}\xi_{1h}}{2}\left(1 - \frac{t_{1h}\xi_{1h}}{2} + \frac{t_{1h}^2\xi_{1h}^2}{4}\right)\right\} \times \exp\left\{-T_{2h}\frac{t_{2h}\xi_{1h}}{2}\left(1 - \frac{t_{2h}\xi_{1h}}{2} + \frac{t_{2h}^2\xi_{1h}^2}{4}\right)\right\},$$
$$\hat{\Omega}_{est} = \sum_{h=1}^{L} W_h \Omega_{y_h}\left(1 + \xi_{0h}\right)\exp\left\{\frac{T_{1h}t_{1h}\xi_{1h}}{2} - \frac{T_{1h}t_{1h}^2\xi_{1h}^2}{4}\right\}\exp\left\{-\frac{T_{2h}t_{2h}\xi_{1h}}{2} + \frac{T_{2h}t_{2h}^2\xi_{1h}^2}{4}\right\}.$$
After simplifying, we obtain the following:
$$\hat{\Omega}_{est} - \sum_{h=1}^{L} W_h \Omega_{y_h} \cong \sum_{h=1}^{L} W_h \Omega_{y_h}\left[\xi_{0h} - \frac{T_{1h}t_{1h} - T_{2h}t_{2h}}{2}\xi_{1h} - \frac{2T_{1h}t_{1h}^2 - 2T_{2h}t_{2h}^2 - T_{1h}^2t_{1h}^2 - T_{2h}^2t_{2h}^2}{8}\xi_{1h}^2 + \frac{2T_{1h}t_{1h} - 2T_{2h}t_{2h} - T_{1h}T_{2h}t_{1h}t_{2h}}{4}\xi_{0h}\xi_{1h}\right]. \tag{25}$$
Taking the expectation of Equation (25) and replacing the terms $\xi_{0h}$, $\xi_{1h}$, $\xi_{1h}^2$, and $\xi_{0h}\xi_{1h}$ by their expected values, the bias of $\hat{\Omega}_{est}$ is obtained as
$$\mathrm{Bias}(\hat{\Omega}_{est}) \cong -\frac{1}{8}\sum_{h=1}^{L} \theta_h W_h \Omega_{y_h}\left[\left(2T_{1h}t_{1h}^2 - 2T_{2h}t_{2h}^2 - T_{1h}^2t_{1h}^2 - T_{2h}^2t_{2h}^2\right)\Gamma_{x_h}^2 - 2\left(2T_{1h}t_{1h} - 2T_{2h}t_{2h} - T_{1h}T_{2h}t_{1h}t_{2h}\right)\Gamma_{y_h x_h}\right]. \tag{26}$$
Squaring both sides of Equation (25), retaining terms up to the second degree, and taking the expectation gives the MSE of $\hat{\Omega}_{est}$:
$$\mathrm{MSE}(\hat{\Omega}_{est}) \cong \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2\left[\Gamma_{y_h}^2 + \frac{T_{1h}^2t_{1h}^2 + T_{2h}^2t_{2h}^2 - 2T_{1h}T_{2h}t_{1h}t_{2h}}{4}\Gamma_{x_h}^2 - \left(T_{1h}t_{1h} - T_{2h}t_{2h}\right)\Gamma_{y_h x_h}\right]. \tag{27}$$
In practice, $T_{1h}$ and $T_{2h}$ are adjustable constants that control the influence of the auxiliary variable transformation. The final results are obtained by substituting $T_{1h} = T_{2h} = 1$ in Equations (26) and (27), which simplify to
$$\mathrm{Bias}(\hat{\Omega}_{est}) \cong -\frac{1}{8}\sum_{h=1}^{L} \theta_h W_h \Omega_{y_h}\left[\left(t_{1h}^2 - t_{2h}^2\right)\Gamma_{x_h}^2 - 2\left(2t_{1h} - 2t_{2h} - t_{1h}t_{2h}\right)\Gamma_{y_h x_h}\right] \tag{28}$$
and
$$\mathrm{MSE}(\hat{\Omega}_{est}) \cong \sum_{h=1}^{L} \theta_h W_h^2 \Omega_{y_h}^2\left[\Gamma_{y_h}^2 + \frac{t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h}}{4}\Gamma_{x_h}^2 - \left(t_{1h} - t_{2h}\right)\Gamma_{y_h x_h}\right]. \tag{29}$$
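The final MSE expression (with $T_{1h} = T_{2h} = 1$) can be evaluated numerically as in the Python sketch below. The function names and dictionary keys are hypothetical, and the values of $a_{1h}$, $a_{2h}$, and $b_{2h}$ are passed in as plain inputs because the combinations listed in Table 2 are not reproduced here; $b_{1h} = 1$ is taken from the definitions above.

```python
def t_coefficients(med_x, a1, a2, b2):
    """t_1h and t_2h as defined in the proof (with b_1h = 1)."""
    t1 = a1 * med_x / (a1 * med_x + a2)
    t2 = med_x / (med_x + b2)
    return t1, t2

def mse_proposed(strata):
    """First-order MSE of the proposed class with T_1h = T_2h = 1.

    Each stratum is a dict with keys N, n, med_y, med_x, f_y, f_x, rho,
    a1, a2, b2 (all assumed known for the auxiliary variable).
    """
    n_total = sum(s["N"] for s in strata)
    total = 0.0
    for s in strata:
        w = s["N"] / n_total
        theta = 0.25 * (1.0 / s["n"] - 1.0 / s["N"])
        g_y = 1.0 / (s["med_y"] * s["f_y"])
        g_x = 1.0 / (s["med_x"] * s["f_x"])
        g_yx = s["rho"] * g_y * g_x
        t1, t2 = t_coefficients(s["med_x"], s["a1"], s["a2"], s["b2"])
        bracket = (g_y ** 2
                   + 0.25 * (t1 ** 2 + t2 ** 2 - 2.0 * t1 * t2) * g_x ** 2
                   - (t1 - t2) * g_yx)
        total += theta * w ** 2 * s["med_y"] ** 2 * bracket
    return total

# Hypothetical single-stratum inputs, chosen for easy arithmetic.
demo = [{"N": 100, "n": 20, "med_y": 10.0, "med_x": 5.0, "f_y": 0.05,
         "f_x": 0.1, "rho": 0.5, "a1": 1.0, "a2": 0.0, "b2": 5.0}]
mse_demo = mse_proposed(demo)
```

Note that when $t_{1h} = t_{2h}$ the two exponential factors cancel to first order and the expression reduces to the variance of the baseline estimator, so any gain comes from choosing the transformation constants so that $t_{1h}$ and $t_{2h}$ differ in the favorable direction.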

5. Mathematical Comparison

In this section, we derive efficiency conditions by comparing the MSE of the proposed family of estimators $\hat{\Omega}_{est}$ given in Section 4 with the MSEs of the estimators reviewed in Section 3, namely $\hat{\Omega}_{yst}$, $\hat{\Omega}_{Rst}$, $\hat{\Omega}_{Dst}$, $\hat{\Omega}_{Rest}$, $\hat{\Omega}_{Pest}$, $\hat{\Omega}_{D1st}$, $\hat{\Omega}_{D2st}$, and $\hat{\Omega}_{D3st}$.
(i): 
By comparing the formula proved in (29) with the one given in (2), the following condition is obtained:
$\mathrm{Var}(\hat{\Omega}_{yst}) > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h}\right)\Gamma_{x_h}^2}{4\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h} - t_{2h}\right)\Gamma_{y_h x_h}} < 1.$$
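As a small numerical check of condition (i), the following Python sketch evaluates the ratio on its left-hand side from per-stratum inputs. The function name, the dictionary keys, and the toy values are hypothetical. Reading the condition as a ratio below one presumes that the denominator is positive, which is an assumption worth verifying for any given dataset.

```python
def condition_i_ratio(strata):
    """Left-hand side of condition (i).

    Each stratum is a dict with keys theta, W, t1, t2, g_x, g_yx, i.e.
    theta_h, W_h, t_1h, t_2h, Gamma_xh, and Gamma_yhxh.
    """
    num = sum(s["theta"] * s["W"] ** 2
              * (s["t1"] ** 2 + s["t2"] ** 2 - 2.0 * s["t1"] * s["t2"])
              * s["g_x"] ** 2
              for s in strata)
    den = 4.0 * sum(s["theta"] * s["W"] ** 2 * (s["t1"] - s["t2"]) * s["g_yx"]
                    for s in strata)
    return num / den

# Hypothetical single-stratum values.
demo = [{"theta": 0.01, "W": 1.0, "t1": 1.0, "t2": 0.5, "g_x": 2.0, "g_yx": 2.0}]
ratio = condition_i_ratio(demo)
proposed_is_better = ratio < 1.0
```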
(ii): 
By comparing the formula obtained in (29) with the expression in (5), we get the following condition:
$\mathrm{MSE}(\hat{\Omega}_{Rst}) > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h} - 4\right)\Gamma_{x_h}^2}{4\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h} - t_{2h} - 2\right)\Gamma_{y_h x_h}} < 1.$$
(iii): 
Comparing the MSE of the estimator presented in (29) with the formula given in (7) yields the following condition:
$\mathrm{MSE}(\hat{\Omega}_{Dst})_{\min} > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h}\right)\Gamma_{x_h}^2}{4\sum_{h=1}^{L} \theta_h W_h^2\left[\left(t_{1h} - t_{2h}\right)\Gamma_{y_h x_h} - \Gamma_{y_h}^2\rho_{y_h x_h}^2\right]} < 1.$$
(iv): 
By comparing the formula proved in (29) with the one given in (12), the following condition is obtained:
$\mathrm{MSE}(\hat{\Omega}_{Rest}) > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h} - 1\right)\Gamma_{x_h}^2}{4\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h} - t_{2h} + 1\right)\Gamma_{y_h x_h}} < 1.$$
(v): 
By comparing the formula proved in (29) with the one given in (13), the following condition is derived:
$\mathrm{MSE}(\hat{\Omega}_{Pest}) > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h} - 1\right)\Gamma_{x_h}^2}{4\sum_{h=1}^{L} \theta_h W_h^2\left(t_{1h} - t_{2h} - 1\right)\Gamma_{y_h x_h}} < 1.$$
(vi): 
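By comparing the formula obtained in (29) with the expression in (20), we get the following condition:
$\mathrm{MSE}(\hat{\Omega}_{D1st})_{\min} > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left[\left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h} - 1\right)\Gamma_{x_h}^2 + 4\left(t_{1h} - t_{2h}\right)\Gamma_{y_h x_h}\right]}{4\sum_{h=1}^{L} W_h^2 \Gamma_{y_h}^2\left[\dfrac{\left(\theta_h \Gamma_{y_h}^2 - 1\right)\rho_{y_h x_h}^2 - \theta_h \Gamma_{y_h}^2}{1 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right]} < 1.$$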
(vii): 
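By comparing the formula proved in (29) with the expression in (21), we obtain the following condition:
$\mathrm{MSE}(\hat{\Omega}_{D2st})_{\min} > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left[\left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h}\right)\Gamma_{x_h}^2 - 4\left(t_{1h} - t_{2h}\right)\Gamma_{y_h x_h}\right]}{4\sum_{h=1}^{L} W_h^2 \Gamma_{y_h}^2\left[\dfrac{\theta_h \rho_{y_h x_h}^2 \Gamma_{y_h}^2 + \Gamma_{x_h}^2\left(\theta_h \Gamma_{y_h}^2 + \rho_{y_h x_h}^2\right)}{1 - \theta_h \Gamma_{x_h}^2 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right]} < 1.$$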
(viii): 
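By comparing the formula given in (29) with the expression in (22), we get the following condition:
$\mathrm{MSE}(\hat{\Omega}_{D3st})_{\min} > \mathrm{MSE}(\hat{\Omega}_{est})$ if
$$\frac{\sum_{h=1}^{L} \theta_h W_h^2\left[4\Gamma_{y_h}^2 + \left(t_{1h}^2 + t_{2h}^2 - 2t_{1h}t_{2h}\right)\Gamma_{x_h}^2 - 4\left(t_{1h} - t_{2h}\right)\Gamma_{y_h x_h}\right]}{\sum_{h=1}^{L} W_h^2\left[\dfrac{\Gamma_{x_h}^2\left(1 - \rho_{y_h x_h}^2\right)\left(4 - \theta_h \Gamma_{x_h}^2\right) - \frac{\theta_h}{16}\Gamma_{x_h}^2}{1 + \theta_h \Gamma_{y_h}^2\left(1 - \rho_{y_h x_h}^2\right)}\right]} < 1.$$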

6. Results and Discussion

In this section, we obtain five distinct simulated populations by employing appropriate positively skewed distributions. Furthermore, three datasets are utilized to validate the effectiveness and consistency of the newly proposed estimators.

6.1. Simulation Study

The choice of distribution for a median estimate is influenced by both the characteristics of the population and the specific distribution in question. The median is especially useful for data that is skewed, contains outliers, or does not follow a normal distribution. To determine the variable X, we selected one of the following five distributions:
  • Simulated data 1: The modest skewness and dispersion in the distribution of $X$ are represented by the gamma distribution, $X \sim \text{Gamma}(\alpha_1 = 7, \alpha_2 = 5)$, with $\rho_{yx} = 0.5$.
  • Simulated data 2: The slightly skewed distribution of $X$ is represented by the log-normal distribution, $X \sim \text{Log-Normal}(\sigma_0 = 5, \sigma_1 = 3)$, with $\rho_{yx} = 0.40$.
  • Simulated data 3: The heavy-tailed distribution of $X$ is represented by the Cauchy distribution, $X \sim \text{Cauchy}(\alpha_3 = 8, \alpha_4 = 4)$, with $\rho_{yx} = 0.40$.
  • Simulated data 4: The baseline distribution of $X$ is represented by the uniform distribution, $X \sim \text{Uniform}(b_1 = 4, b_2 = 16)$, with $\rho_{yx} = 0$.
  • Simulated data 5: The highly skewed distribution of $X$ is represented by the exponential distribution, $X \sim \text{Exponential}(\lambda = \tfrac{1}{2})$, with $\rho_{yx} = 0.65$.
The simulation design considers five carefully selected probability distributions: gamma, log-normal, Cauchy, uniform, and exponential. These distributions cover positive skewness, heavy tails, and symmetric behavior, allowing for a comprehensive evaluation of estimator performance under different population shapes. Additionally, the correlation between the study and auxiliary variables ranges from $\rho_{yx} = -0.4$ to $0.65$, representing a wide range of dependence structures, including both negative and positive associations. This setup helps to examine the robustness and adaptability of the proposed estimators across a broad variety of real-life data scenarios.
Therefore, it is preferable to use these five distributions to analyze and illustrate the effectiveness of the suggested estimators based on various situations and their properties. The variable Y can be determined using the following formula:
$$Y = \rho_{yx} \times X + e,$$
where $\rho_{yx}$ is the correlation coefficient and the error component $e$ follows a standard normal distribution, $N(0, 1)$.
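The data-generating step $Y = \rho_{yx} \times X + e$ can be sketched in Python as follows. The paper reports using R, so this is only an illustrative translation; the function names are hypothetical, and the parameterizations (shape and scale for the gamma, log-mean and log-standard deviation for the log-normal, location and scale for the Cauchy, rate for the exponential) are assumptions, since the text does not state them.

```python
import math
import random

def draw_x(dist, rng):
    """Draw one auxiliary value from one of the five listed distributions."""
    if dist == "gamma":
        return rng.gammavariate(7.0, 5.0)        # assumed shape = 7, scale = 5
    if dist == "lognormal":
        return rng.lognormvariate(5.0, 3.0)      # assumed log-mean = 5, log-sd = 3
    if dist == "cauchy":
        # Inverse-CDF draw; assumed location = 8, scale = 4
        return 8.0 + 4.0 * math.tan(math.pi * (rng.random() - 0.5))
    if dist == "uniform":
        return rng.uniform(4.0, 16.0)
    if dist == "exponential":
        return rng.expovariate(0.5)              # assumed rate = 1/2
    raise ValueError(f"unknown distribution: {dist}")

def simulate_population(dist, rho, size=1300, seed=0):
    """Generate (X, Y) with Y = rho * X + e, where e ~ N(0, 1)."""
    rng = random.Random(seed)
    xs = [draw_x(dist, rng) for _ in range(size)]
    ys = [rho * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

xs_demo, ys_demo = simulate_population("uniform", 0.0, size=1300, seed=42)
```

Fixing the seed makes each simulated population reproducible, which is useful when comparing estimators on the same data.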
To evaluate the efficiency and robustness of both the proposed and existing estimators, we applied specific R programming techniques under various distribution types and correlation scenarios to examine their PRE values.
  1. Using the distribution described above, generate a dataset of $N = 1300$ observations for the variables $X$ and $Y$.
  2. Compute the summary measures needed by the estimators, including the largest and smallest values, and obtain the optimum constants of the existing estimators.
  3. Select a sample of size $n_h$ from each stratum of size $N_h$ using simple random sampling without replacement (SRSWOR).
  4. Compute the value of every estimator discussed in this study for each sample, and repeat for the different sample sizes considered, so that the PRE of each estimator is assessed over a collection of samples.
  5. After 6000 repetitions of steps 3 and 4, use the formulas below to compute the percent relative efficiency values:
$$\mathrm{MSE}(\hat{\Omega}_{v})_{\min} = \frac{\sum_{k=1}^{60{,}000}\left(\hat{\Omega}_{vk} - \Omega_y\right)^2}{60{,}000},$$
and
$$\mathrm{PRE} = \frac{\mathrm{Var}(\hat{\Omega}_{yst})}{\mathrm{MSE}(\hat{\Omega}_{v})_{\min}} \times 100,$$
where $v = yst, Rst, Dst, Rest, Pest, D1st, D2st, D3st, e1st, e2st, \ldots, e8st$. Table 3 presents the simulated percent relative efficiencies of both the proposed and existing estimators.
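The simulation steps above can be sketched in Python as follows, using only the baseline estimator and the stratified ratio estimator to keep the example short. This is an illustrative sketch under stated assumptions rather than the authors' R code: the function names, the two equal-sized strata, the sample size, the number of replications, and the use of the weighted sum of stratum medians as the target are all hypothetical choices.

```python
import random
from statistics import median

def sample_medians(stratum, n_h, rng):
    """SRSWOR of n_h (x, y) pairs from one stratum; returns the sample medians."""
    sample = rng.sample(stratum, n_h)
    return median(x for x, _ in sample), median(y for _, y in sample)

def simulate_pre(strata, n_h, reps=1000, seed=1):
    """Monte Carlo PRE (relative to the baseline) of the stratified ratio estimator."""
    rng = random.Random(seed)
    n_total = sum(len(s) for s in strata)
    weights = [len(s) / n_total for s in strata]
    med_x = [median(x for x, _ in s) for s in strata]
    med_y = [median(y for _, y in s) for s in strata]
    # Assumed target: the weighted sum of the stratum medians of Y.
    target = sum(w * m for w, m in zip(weights, med_y))
    sq_err = {"yst": 0.0, "Rst": 0.0}
    for _ in range(reps):
        est_yst = est_rst = 0.0
        for s, w, mx in zip(strata, weights, med_x):
            sx, sy = sample_medians(s, n_h, rng)
            est_yst += w * sy                  # baseline estimator
            est_rst += w * sy * mx / sx        # ratio estimator
        sq_err["yst"] += (est_yst - target) ** 2
        sq_err["Rst"] += (est_rst - target) ** 2
    mse = {k: v / reps for k, v in sq_err.items()}
    return {k: 100.0 * mse["yst"] / v for k, v in mse.items()}

# Hypothetical population: X uniform on (4, 16), Y = 0.65 * X + N(0, 1),
# split into two strata of equal size.
pop_rng = random.Random(0)
pairs = []
for _ in range(400):
    x = pop_rng.uniform(4.0, 16.0)
    pairs.append((x, 0.65 * x + pop_rng.gauss(0.0, 1.0)))
strata_demo = [pairs[:200], pairs[200:]]
pre_demo = simulate_pre(strata_demo, n_h=20, reps=200, seed=1)
```

A PRE above 100 indicates that an estimator has a smaller simulated MSE than the baseline; adding the other estimators only requires accumulating their squared errors inside the same loop.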

6.2. Real-Life Application

The effectiveness of the proposed estimator is evaluated by analyzing the PRE values for three different datasets. A comprehensive statistical summary is provided below.
Population 1.
(Source: [13])
  • Y 1 : This variable represents the total quantity of fish harvested from all sources during the year 1995, including both commercial and recreational fishing activities;
  • X 1 : This variable refers to the number of fish caught in 1994 by individuals participating in recreational marine fishing, excluding any form of commercial harvesting;
  • Y 2 : This variable denotes the overall count of fish collected in 1995, serving as a measure of that year’s total harvest;
  • X 2 : This variable represents the number of fish captured by recreational marine fishermen in 1994, reflecting the impact of non-commercial fishing on the total catch.
    $N_1 = 69$, $n_1 = 17$, $N_2 = 69$, $n_2 = 17$, $X_{1\min} = 17{,}051.500$, $X_{1\max} = 20{,}987.500$, $X_{2\min} = 15{,}055$, $X_{2\max} = 19{,}005$, $\Omega_{x_1} = 2011$, $\Omega_{y_1} = 2068$, $\Omega_{x_2} = 2007$, $\Omega_{y_2} = 2068$, $f_{x_1}(\Omega_{x_1}) = 0.00014$, $f_{y_1}(\Omega_{y_1}) = 0.00014$, $f_{x_2}(\Omega_{x_2}) = 0.00014$, $f_{y_2}(\Omega_{y_2}) = 0.00014$, $\rho_{y_1 x_1} = 0.151$, $\rho_{y_2 x_2} = 0.314$, $TM_1 = 4043$, $TM_2 = 3777$, $DM_1 = 3853$, $DM_2 = 3615.200$, $QR_1 = 3936$, $QR_2 = 3936$, $QA_1 = 2956$, $QA_2 = 3002$, $QD_1 = 1968$, $QD_2 = 1975$, $MR_1 = 19{,}019.500$, $MR_2 = 17{,}030$.
Population 2.
We consider a real finite population presented in the Punjab Development Statistics 2013, on page 226 [36], which includes the number of registered factories and employment levels by division and district. The Pakistan Bureau of Statistics website provides a download link: https://www.pbs.gov.pk/content/microdata (accessed on 1 May 2025).
  • Y 1 : Employment level by division and district in 2010;
  • X 1 : Number of registered factories by division and district in 2010;
  • Y 2 : Employment level by division and district in 2012;
  • X 2 : Number of registered factories by division and district in 2012.
    $N_1 = 36$, $n_1 = 14$, $N_2 = 36$, $n_2 = 14$, $X_{1\min} = 24$, $X_{1\max} = 1986$, $X_{2\min} = 24$, $X_{2\max} = 2055$, $\Omega_{x1} = 168.500$, $\Omega_{y1} = 10{,}484.500$, $\Omega_{x2} = 171.500$, $\Omega_{y2} = 10{,}494.500$, $f_{x1}(\Omega_{x1}) = 0.002463666$, $f_{y1}(\Omega_{y1}) = 0.00004033736$, $f_{x2}(\Omega_{x2}) = 0.002315051$, $f_{y2}(\Omega_{y2}) = 0.00004086913$, $\rho_{y1x1} = 0.912$, $\rho_{y2x2} = 0.5194465$, $TM_1 = 193.438$, $TM_2 = 195.750$, $DM_1 = 432.500$, $DM_2 = 431.500$, $QR_1 = 252.250$, $QR_2 = 265$, $QA_1 = 218.375$, $QA_2 = 220$, $QD_1 = 127.125$, $QD_2 = 132.500$, $MR_1 = 1005$, $MR_2 = 1039.500$.
Population 3.
We analyze a real finite population dataset obtained from the Punjab Development Statistics 2014, page 135 [37]. This dataset includes information on the enrollment numbers of boys and girls in government primary and middle schools. It is available for download from the Pakistan Bureau of Statistics website: https://www.pbs.gov.pk/content/microdata (accessed on 1 May 2025).
  • Y 1 : Represents the aggregate count of students who registered during the academic session 2012–2013;
  • X 1 : Denotes the overall number of government-operated primary schools for both male and female students in the same academic year;
  • Y 2 : Refers to the total enrollment of students recorded for the 2012–2013 school year;
  • X 2 : Refers to the complete number of government-managed middle schools catering to both boys and girls during the 2012–2013 academic period.
    $N_1 = 36$, $n_1 = 14$, $N_2 = 36$, $n_2 = 14$, $X_{1\min} = 388$, $X_{1\max} = 1534$, $X_{2\min} = 84$, $X_{2\max} = 478$, $\Omega_{x1} = 1016.500$, $\Omega_{y1} = 116{,}230$, $\Omega_{x2} = 206$, $\Omega_{y2} = 49{,}661$, $f_{x1}(\Omega_{x1}) = 0.000951993$, $f_{y1}(\Omega_{y1}) = 0.00000835$, $f_{x2}(\Omega_{x2}) = 0.004094403$, $f_{y2}(\Omega_{y2}) = 0.0000143374$, $\rho_{y1x1} = 0.084$, $\rho_{y2x2} = 0.875$, $TM_1 = 891.188$, $TM_2 = 210.688$, $DM_1 = 982.650$, $DM_2 = 231$, $QR_1 = 378.250$, $QR_2 = 125.750$, $QA_1 = 891.875$, $QA_2 = 215.375$, $QD_1 = 982.650$, $QD_2 = 62.875$, $MR_1 = 961$, $MR_2 = 281$.
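The population summaries above report six robust measures of the auxiliary variable in each stratum (TM, DM, QR, QA, QD, MR). The following Python sketch shows one way such measures could be computed from raw stratum data. It assumes the textbook definitions of these quantities and NumPy's default linear-interpolation quantile rule; the function name `robust_measures` is hypothetical, and the quantile convention used by the authors is not stated here, so values for the real populations may differ slightly.

```python
import numpy as np

def robust_measures(x):
    """Robust summaries of an auxiliary variable within one stratum.

    Assumed (textbook) definitions, not taken from the article's code:
      QR = Q3 - Q1            (interquartile range)
      QD = (Q3 - Q1) / 2      (quartile deviation)
      QA = (Q3 + Q1) / 2      (quartile average)
      MR = (min + max) / 2    (midrange)
      TM = (Q1 + 2*Q2 + Q3)/4 (trimean)
      DM = mean of D1..D9     (decile mean)
    """
    x = np.asarray(x, dtype=float)
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    deciles = np.percentile(x, np.arange(10, 100, 10))  # D1, ..., D9
    return {
        "QR": q3 - q1,
        "QD": (q3 - q1) / 2.0,
        "QA": (q3 + q1) / 2.0,
        "MR": (x.min() + x.max()) / 2.0,
        "TM": (q1 + 2.0 * q2 + q3) / 4.0,
        "DM": deciles.mean(),
    }

# Illustrative values only (not one of the three populations).
example = robust_measures(np.arange(1, 12))
for name, value in example.items():
    print(f"{name} = {value:.3f}")
```

Under these definitions QD is always half of QR and MR is the midpoint of the stratum minimum and maximum, which gives a quick consistency check on any reported summary.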
Next, we compute the percent relative efficiency values for all estimators. The findings from this analysis, illustrating the performance of the proposed estimator family, are presented in Table 4.
$$\mathrm{PRE} = \frac{\mathrm{Var}(\hat{\Omega}_{yst})}{\mathrm{MSE}(\hat{\Omega}_{t})_{\min}} \times 100,$$
where $t = yst, Rst, Dst, Rest, Pest, D1st, D2st, D3st, e1st, e2st, \ldots, e8st$.
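As a rough illustration of how the PRE definition above is applied, the Python sketch below computes a baseline variance for the stratified sample median and then the PRE of a candidate estimator. The variance expression is an assumption: it is the usual first-order approximation $\sum_h W_h^2 \theta_h / \{4 f_{yh}(\Omega_{yh})^2\}$ with $W_h = N_h/N$ and $\theta_h = 1/n_h - 1/N_h$, which may differ in detail from the expression used in the article. The function names are hypothetical, and the inputs are the Population 2 values listed earlier.

```python
def stratified_median_variance(N, n, f_y):
    """Assumed first-order variance of the stratified sample median.

    N, n : stratum population and sample sizes
    f_y  : density of Y evaluated at the stratum median
    """
    total = sum(N)
    var = 0.0
    for N_h, n_h, f_h in zip(N, n, f_y):
        W_h = N_h / total                 # stratum weight
        theta_h = 1.0 / n_h - 1.0 / N_h   # finite-population factor (assumed form)
        var += W_h ** 2 * theta_h / (4.0 * f_h ** 2)
    return var

def percent_relative_efficiency(var_baseline, mse_candidate):
    """PRE = Var(baseline) / MSE(candidate) * 100."""
    if mse_candidate <= 0:
        raise ValueError("MSE must be positive")
    return var_baseline / mse_candidate * 100.0

# Population 2 inputs as reported in the data summary.
var_yst = stratified_median_variance(
    N=[36, 36], n=[14, 14], f_y=[0.00004033736, 0.00004086913]
)
print("Var of baseline estimator:", var_yst)
# The baseline compared with itself has PRE 100, matching the first table row.
print("PRE of baseline:", percent_relative_efficiency(var_yst, var_yst))
```

A PRE above 100 means the candidate has a smaller MSE than the variance of the usual stratified median estimator; a PRE below 100 means it performs worse than the baseline.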

6.3. Discussion

Simulations were carried out using five distributions with varying $\rho_{yx}$ values, and the proposed estimator family was also evaluated on three real datasets. Percent relative efficiency (PRE) was used to compare the estimators. The PRE values for the proposed family and the existing estimators across the five simulated distributions appear in Table 3; the results for the real datasets are shown in Table 4. Based on these analyses, the following key conclusions can be drawn:
  • The results from both simulated and real datasets, as presented in Table 3 and Table 4, show that the PRE values of all newly introduced estimators exceed those of the previously established ones discussed in Section 3. In every simulated distribution and every real population, even the lowest PRE among the proposed estimators is higher than the highest PRE among the existing ones. This highlights the enhanced effectiveness of the suggested estimators relative to existing techniques.
  • Additionally, the upward trend in the graphical representations shown in Figure 1 and Figure 2, based on various distributions and actual datasets, further confirms that the new estimators consistently achieve higher PRE values than the conventional estimators. The inverse correlation between the PRE values of the new and traditional estimators strengthens the idea that the newly introduced estimators offer a more efficient estimation method.
  • Furthermore, the boxplots presented in Figure 3 and Figure 4 illustrate the superior performance of the proposed estimators. These plots display the distribution of PRE values across both simulated and real datasets, showing the consistently higher central tendency and narrower spread of the proposed estimators compared to traditional methods. The compact interquartile ranges and elevated median lines in the boxplots of the proposed estimators indicate both robustness and efficiency, particularly under asymmetric and skewed distributions.
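The first conclusion can be checked directly against the reported numbers. The Python sketch below transcribes the PRE values from Table 3 and, for each simulated distribution, compares the best existing estimator with the weakest proposed one. The helper name `summarize` and the short column labels are illustrative choices, not part of the article.

```python
# PRE values transcribed from Table 3; columns follow the table order.
LABELS = ["Gam", "LN", "C", "Uni", "Exp"]

EXISTING = {
    "yst":  [100.000, 100.000, 100.000, 100.000, 100.000],
    "Rst":  [114.927, 110.097, 108.239, 108.479, 109.554],
    "Dst":  [118.571, 112.727, 111.143, 108.182, 110.696],
    "Rest": [122.352, 118.571, 116.000, 113.802, 115.000],
    "Pest": [96.756, 97.483, 96.966, 92.333, 93.061],
    "D1st": [177.500, 153.636, 143.854, 152.500, 149.091],
    "D2st": [226.842, 176.206, 151.536, 175.729, 165.000],
    "D3st": [135.161, 125.000, 133.182, 120.000, 124.638],
}

PROPOSED = {
    "e1st": [311.428, 215.000, 193.225, 226.364, 223.345],
    "e2st": [300.000, 216.000, 186.875, 206.667, 208.750],
    "e3st": [271.250, 197.692, 180.909, 190.000, 208.750],
    "e4st": [336.154, 235.454, 200.000, 250.000, 240.000],
    "e5st": [365.000, 247.142, 215.000, 278.889, 259.230],
    "e6st": [346.134, 224.782, 207.241, 226.364, 240.000],
    "e7st": [399.090, 260.000, 232.369, 315.000, 281.666],
    "e8st": [375.000, 247.142, 223.333, 278.888, 259.237],
}

def summarize(existing, proposed, labels):
    """Per column: best existing PRE, worst proposed PRE, top proposed estimator."""
    rows = []
    for j, label in enumerate(labels):
        best_existing = max(values[j] for values in existing.values())
        worst_proposed = min(values[j] for values in proposed.values())
        top_name = max(proposed, key=lambda name: proposed[name][j])
        rows.append((label, best_existing, worst_proposed, top_name))
    return rows

summary = summarize(EXISTING, PROPOSED, LABELS)
for label, best_existing, worst_proposed, top_name in summary:
    print(f"{label:4s} best existing = {best_existing:8.3f}  "
          f"worst proposed = {worst_proposed:8.3f}  top proposed = {top_name}")
```

In every column the weakest proposed estimator has a higher PRE than the strongest existing one, and the largest PRE in each column belongs to the same member of the proposed family (e7st).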

7. Conclusions

This study introduces a novel family of estimators for the finite population median under stratified random sampling, based on robust measures of an auxiliary variable. Bias and mean squared error expressions for both existing and newly proposed estimators were derived up to the first order of approximation. To evaluate the effectiveness of the proposed estimators, simulations were conducted using five different distributions under various conditions, along with an analysis of three real-world datasets. The results, summarized in Table 3 and Table 4, indicate that every newly developed estimator is more efficient than the conventional estimators considered.
Based on the simulation results and real-life data analysis, the proposed transformation-based estimators ($\hat{\Omega}_{e1st}, \hat{\Omega}_{e2st}, \ldots, \hat{\Omega}_{e8st}$) consistently outperformed the traditional estimators. These estimators achieved the highest percent relative efficiency (PRE) values across various distributions and correlation levels, demonstrating superior accuracy, robustness to skewness, and resistance to outliers.
This study provides a foundation for advancing median estimation within the framework of stratified random sampling. Future research may focus on applying the proposed estimators to more complex sampling methods such as systematic or two-phase sampling. There is also potential to explore their use in large-scale, high-dimensional datasets and to incorporate machine learning techniques to enhance robustness. Practical applications can also benefit from these improvements, for example in healthcare, where accurate median estimates are essential for patient outcomes, and in economics, where they are vital for assessing income distribution. Additionally, developing adaptive methods that automatically select the most suitable estimator based on data characteristics may further enhance accuracy.
In future developments, machine learning techniques could play a significant role in improving estimator performance. For example, supervised algorithms may be utilized to identify transformation rules that minimize estimation error, while unsupervised methods could assist in determining optimal stratum partitions based on underlying data patterns. These adaptive strategies can enhance the flexibility of the proposed estimators, especially in situations involving complex or high-dimensional datasets. Applying such data-driven tools may lead to more efficient and robust median estimation procedures.

Author Contributions

Conceptualization, A.S.A.; Methodology, A.S.A.; Software, A.S.A. and F.A.A.; Validation, F.A.A.; Formal analysis, A.S.A. and F.A.A.; Investigation, F.A.A.; Resources, A.S.A. and F.A.A.; Data curation, A.S.A. and F.A.A.; Writing—original draft, A.S.A.; Writing—review & editing, A.S.A. and F.A.A.; Visualization, A.S.A. and F.A.A.; Supervision, F.A.A.; Project administration, A.S.A. and F.A.A.; Funding acquisition, F.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R515), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Data Availability Statement

The real data are secondary, and their sources are given in the data section, while the simulated data were generated using R software (v. 4.4.0). The code used in this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gross, S. Median estimation in sample surveys. In Proceedings of the Section on Survey Research Methods; American Statistical Association Ithaca, Alexandria, VA, USA, 1980.
  2. Sedransk, J.; Meyer, J. Confidence intervals for the quantiles of a finite population: Simple random and stratified simple random sampling. J. R. Stat. Soc. Ser. B (Methodol.) 1978, 40, 239–252.
  3. Philip, S.; Sedransk, J. Lower bounds for confidence coefficients for confidence intervals for finite population quantiles. Commun. Stat.-Theory Methods 1983, 12, 1329–1344.
  4. Kuk, Y.C.A.; Mak, T.K. Median estimation in the presence of auxiliary information. J. R. Stat. Soc. Ser. B 1989, 51, 261–269.
  5. Zaman, T.; Bulut, H. An efficient family of robust-type estimators for the population variance in simple and stratified random sampling. Commun. Stat.-Theory Methods 2023, 52, 2610–2624.
  6. Daraz, U.; Alomair, M.A.; Albalawi, O.; Al Naim, A.S. New techniques for estimating finite population variance using ranks of auxiliary variable in two-stage sampling. Mathematics 2024, 12, 2741.
  7. Alghamdi, A.S.; Alrweili, H. A comparative study of new ratio-type family of estimators under stratified two-phase sampling. Mathematics 2025, 13, 327.
  8. Alomair, M.A.; Daraz, U. Dual transformation of auxiliary variables by using outliers in stratified random sampling. Mathematics 2024, 12, 2839.
  9. Alghamdi, A.S.; Almulhim, F.A. Optimizing finite population mean estimation using simulation and empirical data. Mathematics 2025, 13, 1635.
  10. Rao, T.J. On certain methods of improving ratio and regression estimators. Commun. Stat.-Theory Methods 1991, 20, 3325–3340.
  11. Khoshnevisan, M.; Singh, H.P.; Singh, S.; Smarandache, F. A General Class of Estimators of Population Median Using Two Auxiliary Variables in Double Sampling; Virginia Polytechnic Institute and State University: Blacksburg, VA, USA, 2002.
  12. Singh, S.; Joarder, A.H.; Tracy, D.S. Median estimation using double sampling. Aust. N. Z. J. Stat. 2001, 43, 33–46.
  13. Singh, S. Advanced Sampling Theory With Applications: How Michael Selected Amy; Springer Science & Business Media: Berlin, Germany, 2003; Volume 2.
  14. Gupta, S.; Shabbir, J.; Ahmad, S. Estimation of median in two-phase sampling using two auxiliary variables. Commun. Stat.-Theory Methods 2008, 37, 1815–1822.
  15. Aladag, S.; Cingi, H. Improvement in estimating the population median in simple random sampling and stratified random sampling using auxiliary information. Commun. Stat.-Theory Methods 2015, 44, 1013–1032.
  16. Solanki, R.S.; Singh, H.P. Some classes of estimators for median estimation in survey sampling. Commun. Stat.-Theory Methods 2015, 44, 1450–1465.
  17. Daraz, U.; Wu, J.; Albalawi, O. Double exponential ratio estimator of a finite population variance under extreme values in simple random sampling. Mathematics 2024, 12, 1737.
  18. Daraz, U.; Wu, J.; Alomair, M.A.; Aldoghan, L.A. New classes of difference cum-ratio-type exponential estimators for a finite population variance in stratified random sampling. Heliyon 2024, 10, e33402.
  19. Daraz, U.; Alomair, M.A.; Albalawi, O. Variance estimation under some transformation for both symmetric and asymmetric data. Symmetry 2024, 16, 957.
  20. Almulhim, F.A.; Alghamdi, A.S. Simulation-based evaluation of robust transformation techniques for median estimation under simple random sampling. Axioms 2025, 14, 301.
  21. Daraz, U.; Almulhim, F.A.; Alomair, M.A.; Alomair, A.M. Population median estimation using auxiliary variables: A simulation study with real data across sample sizes and parameters. Mathematics 2025, 13, 1660.
  22. Hussain, M.A.; Javed, M.; Zohaib, M.; Shongwe, S.C.; Awais, M.; Zaagan, A.A.; Irfan, M. Estimation of population median using bivariate auxiliary information in simple random sampling. Heliyon 2024, 10, e28891.
  23. Shabbir, J.; Gupta, S. A generalized class of difference type estimators for population median in survey sampling. Hacet. J. Math. Stat. 2017, 46, 1015–1028.
  24. Irfan, M.; Maria, J.; Shongwe, S.C.; Zohaib, M.; Bhatti, S.H. Estimation of population median under robust measures of an auxiliary variable. Math. Probl. Eng. 2021, 2021, 4839077.
  25. Shabbir, J.; Gupta, S.; Narjis, G. On improved class of difference type estimators for population median in survey sampling. Commun. Stat.-Theory Methods 2022, 51, 3334–3354.
  26. Subzar, M.; Lone, S.A.; Ekpenyong, E.J.; Salam, A.; Aslam, M.; Raja, T.A.; Almutlak, S.A. Efficient class of ratio cum median estimators for estimating the population median. PLoS ONE 2023, 18, e0274690.
  27. Iseh, M.J. Model formulation on efficiency for median estimation under a fixed cost in survey sampling. Model Assist. Stat. Appl. 2023, 18, 373–385.
  28. Shahzad, U.; Ahmad, I.; Alshahrani, F.; Almanjahie, I.M.; Iftikhar, S. Calibration-based mean estimators under stratified median ranked set sampling. Mathematics 2023, 11, 1825.
  29. Bhushan, S.; Kumar, A.; Lone, S.A.; Anwar, S.; Gunaime, N.M. An efficient class of estimators in stratified random sampling with an application to real data. Axioms 2023, 12, 576.
  30. Alghamdi, A.S.; Alrweili, H. New class of estimators for finite population mean under stratified double phase sampling with simulation and real-life application. Mathematics 2025, 13, 329.
  31. Bahl, S.; Tuteja, R. Ratio and product type exponential estimators. J. Inf. Optim. Sci. 1991, 12, 159–164.
  32. Daraz, U.; Shabbir, J.; Khan, H. Estimation of finite population mean by using minimum and maximum values in stratified random sampling. J. Mod. Appl. Stat. Methods 2018, 17, 20.
  33. Daraz, U.; Khan, M. Estimation of variance of the difference-cum-ratio-type exponential estimator in simple random sampling. Res. Math. Stat. 2021, 8, 1899402.
  34. Daraz, U.; Agustiana, D.; Wu, J.; Emam, W. Twofold auxiliary information under two-phase sampling: An improved family of double-transformed variance estimators. Axioms 2025, 14, 64.
  35. Daraz, U.; Wu, J.; Agustiana, D.; Emam, W. Finite population variance estimation using Monte Carlo simulation and real life application. Symmetry 2025, 17, 84.
  36. Bureau of Statistics. Punjab Development Statistics Government of the Punjab, Lahore, Pakistan; Bureau of Statistics: Islamabad, Pakistan, 2013.
  37. Bureau of Statistics. Punjab Development Statistics Government of the Punjab, Lahore, Pakistan; Bureau of Statistics: Islamabad, Pakistan, 2014.
Figure 1. A visual representation of the results using data collected from various population distributions.
Figure 2. A visual representation of the results using data collected from three actual populations. (a) (Source: [13]). (b) (Source: [36]). (c) (Source: [37]).
Figure 3. A graphical representation of the percent relative efficiency (PRE) values for various estimators obtained from different artificial populations.
Figure 4. A graphical representation of the percent relative efficiency (PRE) values for various estimators obtained from different real populations.
Table 1. List of symbols and notations.

| Symbol | Description | Symbol | Description |
|---|---|---|---|
| $N$ | Population size | $L$ | Number of strata |
| $N_h$ | Units in hth stratum | $n_h$ | Sample size in hth stratum |
| $Y$ | Study variable | $X$ | Auxiliary variable |
| $y_{hi}$ | Y for ith unit in hth stratum | $x_{hi}$ | X for ith unit in hth stratum |
| $\Omega_{Myh}$ | Pop. median of Y in hth stratum | $\Omega_{Mxh}$ | Pop. median of X in hth stratum |
| $\hat{\Omega}_{Myh}$ | Sample median of Y | $\hat{\Omega}_{Mxh}$ | Sample median of X |
| $W_h$ | Weight of hth stratum | $\theta_h$ | Correction factor, hth stratum |
| $\rho_{y_h x_h}$ | Correlation of medians in hth stratum | $P_{11}$ | Joint probability of medians |
| $\xi_{0h}$ | Relative error for Y | $\xi_{1h}$ | Relative error for X |
| $QR_h$ | Interquartile range | $QD_h$ | Quartile deviation |
| $MR_h$ | Midrange | $QA_h$ | Quartile average |
| $TM_h$ | Trimean | $DM_h$ | Decile mean |
| $X_{h\min}$ | Minimum of X | $X_{h\max}$ | Maximum of X |
Table 2. Classifications of the new family of estimators.
Sub-Classes of the Recommended Estimator Ω ^ est a 1 h a 2 h
Ω ^ e 1 s t = h = 1 L W h Ω ^ y h exp T 1 h Q R h Ω ^ x h Ω x h Q R H Ω x h + Ω ^ x h + 2 D M h exp T h Q R h D M h
Ω ^ e 2 s t = h = 1 L W h Ω ^ y h exp T 1 h M R h Ω ^ x h Ω x h M R h Ω x h + Ω ^ x h + 2 T M h exp T h M R h T M h
Ω ^ e 3 s t = h = 1 L W h Ω ^ y h exp T 1 h Q A h Ω ^ x h Ω x h Q A h Ω x h + Ω ^ x h + 2 Q D h exp T h Q A h Q D h
Ω ^ e 4 s t = h = 1 L W h Ω ^ y h exp T 1 h Q D h Ω ^ x h Ω x h Q D h Ω x h + Ω ^ x h + 2 Q A h exp T h Q D h Q A h
Ω ^ e 5 s t = h = 1 L W h Ω ^ y h exp T 1 h T M h Ω ^ x h Ω x h T M h Ω x h + Ω ^ x h + 2 M R h exp T h T M h M R h
Ω ^ e 6 s t = h = 1 L W h Ω ^ y h exp T 1 h D M h Ω ^ x h Ω x h D M h Ω x h + Ω ^ x h + 2 Q R h exp T h D M h Q R h
Ω ^ e 7 s t = h = 1 L W h Ω ^ y h exp T 1 h M R h Ω ^ x h Ω x h M R h Ω x h + Ω ^ x h + 2 Q A h exp T h M R h Q A h
Ω ^ e 8 s t = h = 1 L W h Ω ^ y h exp T 1 h Q D h Ω ^ x h Ω x h Q D h Ω x h + Ω ^ x h + 2 T M h exp T h Q D h T M h
Table 3. Percent relative efficiency (PRE) values for simulated distributions.

| Estimator | Gam(7, 5) | LN(5, 3) | C(8, 4) | Uni(4, 16) | Exp(1/2) |
|---|---|---|---|---|---|
| $\hat{\Omega}_{yst}$ | 100.000 | 100.000 | 100.000 | 100.000 | 100.000 |
| $\hat{\Omega}_{Rst}$ | 114.927 | 110.097 | 108.239 | 108.479 | 109.554 |
| $\hat{\Omega}_{Dst}$ | 118.571 | 112.727 | 111.143 | 108.182 | 110.696 |
| $\hat{\Omega}_{Rest}$ | 122.352 | 118.571 | 116.000 | 113.802 | 115.000 |
| $\hat{\Omega}_{Pest}$ | 96.756 | 97.483 | 96.966 | 92.333 | 93.061 |
| $\hat{\Omega}_{D1st}$ | 177.500 | 153.636 | 143.854 | 152.500 | 149.091 |
| $\hat{\Omega}_{D2st}$ | 226.842 | 176.206 | 151.536 | 175.729 | 165.000 |
| $\hat{\Omega}_{D3st}$ | 135.161 | 125.000 | 133.182 | 120.000 | 124.638 |
| $\hat{\Omega}_{e1st}$ | 311.428 | 215.000 | 193.225 | 226.364 | 223.345 |
| $\hat{\Omega}_{e2st}$ | 300.000 | 216.000 | 186.875 | 206.667 | 208.750 |
| $\hat{\Omega}_{e3st}$ | 271.250 | 197.692 | 180.909 | 190.000 | 208.750 |
| $\hat{\Omega}_{e4st}$ | 336.154 | 235.454 | 200.000 | 250.000 | 240.000 |
| $\hat{\Omega}_{e5st}$ | 365.000 | 247.142 | 215.000 | 278.889 | 259.230 |
| $\hat{\Omega}_{e6st}$ | 346.134 | 224.782 | 207.241 | 226.364 | 240.000 |
| $\hat{\Omega}_{e7st}$ | 399.090 | 260.000 | 232.369 | 315.000 | 281.666 |
| $\hat{\Omega}_{e8st}$ | 375.000 | 247.142 | 223.333 | 278.888 | 259.237 |
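Table 4. Percent relative efficiency (PRE) using actual datasets.

| Estimator | Population-I | Population-II | Population-III |
|---|---|---|---|
| $\hat{\Omega}_{yst}$ | 100 | 100 | 100 |
| $\hat{\Omega}_{Rst}$ | 104.483 | 103.490 | 102.309 |
| $\hat{\Omega}_{Dst}$ | 106.462 | 224.1987 | 105.401 |
| $\hat{\Omega}_{Rest}$ | 113.592 | 220.089 | 105.773 |
| $\hat{\Omega}_{Pest}$ | 43.616 | 42.76373 | 45.312 |
| $\hat{\Omega}_{D1st}$ | 119.703 | 232.478 | 106.597 |
| $\hat{\Omega}_{D2st}$ | 121.855 | 233.110 | 106.611 |
| $\hat{\Omega}_{D3st}$ | 117.347 | 201.569 | 107.622 |
| $\hat{\Omega}_{e1st}$ | 229.832 | 372.780 | 180.626 |
| $\hat{\Omega}_{e2st}$ | 314.652 | 393.385 | 181.625 |
| $\hat{\Omega}_{e3st}$ | 229.912 | 378.001 | 180.668 |
| $\hat{\Omega}_{e4st}$ | 229.732 | 372.7041 | 180.952 |
| $\hat{\Omega}_{e5st}$ | 229.183 | 357.866 | 180.970 |
| $\hat{\Omega}_{e6st}$ | 229.813 | 377.564 | 181.150 |
| $\hat{\Omega}_{e7st}$ | 229.972 | 379.312 | 181.031 |
| $\hat{\Omega}_{e8st}$ | 229.648 | 373.538 | 180.954 |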
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
