A Generalized Estimation Strategy for the Finite Population Median Using Transformation Methods Under a Two-Phase Sampling Design
Abstract
1. Introduction
- Advantages of the Proposed Estimators under Two-Phase Sampling
- Enhanced Resistance to Data Irregularities: Utilizing measures such as the interquartile range and mid-range, the estimators effectively minimize the impact of extreme observations and asymmetric distributions. The approach offers consistent median estimates across diverse sampling conditions, outperforming many classical methods.
- Efficient Use of Partial Auxiliary Information: By employing multiple transformation techniques, these estimators flexibly adjust to different underlying population characteristics. The two-phase sampling scheme utilizes initial auxiliary data to improve estimates with minimal additional sampling effort.
- Practical Advantages for Complex Surveys: The proposed methods are particularly useful when full auxiliary data are unavailable or costly to obtain, ensuring reliable median estimation in challenging scenarios.
- Practical Applications: The two-phase sampling scheme utilizes preliminary auxiliary information to enhance the estimation accuracy while minimizing additional data collection efforts. This method proves particularly effective in real-life contexts such as forestry management, where initial satellite imagery data (first phase) guide the selection of sample plots for detailed ground measurements (second phase). By combining these data sources, the proposed estimators deliver reliable median estimates of tree biomass, supporting sustainable resource planning and conservation efforts.
2. Methodology and Notations
- (i) First, a fixed sample of m units is selected from the population.
- (ii) Next, a sub-sample of n elements is chosen from within the first-phase sample. In this phase, information on both the study variable y and the auxiliary variable x is obtained.
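The two selection steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `two_phase_sample`, the representation of the population as a list of `(x, y)` pairs, and the use of simple random sampling without replacement at both phases are assumptions made for the example, not details confirmed by the text.

```python
import random


def two_phase_sample(population, m, n, seed=None):
    """Draw a two-phase sample from a list of (x, y) pairs.

    Assumption: both phases use simple random sampling without replacement.
    Returns (x_first, x_second, y_second).
    """
    if not (0 < n <= m <= len(population)):
        raise ValueError("require 0 < n <= m <= N")
    rng = random.Random(seed)
    # Phase 1: select m unit indices from the N population units.
    first_idx = rng.sample(range(len(population)), m)
    # Phase 2: select n of those m units; y and x are observed on them.
    second_idx = rng.sample(first_idx, n)
    x_first = [population[i][0] for i in first_idx]
    x_second = [population[i][0] for i in second_idx]
    y_second = [population[i][1] for i in second_idx]
    return x_first, x_second, y_second
```

The sample medians listed in the notation table (first-phase median of X, second-phase medians of X and Y) could then be obtained by applying `statistics.median` to each returned list.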
3. Proposed General Class of Estimators
Justification for the Choice of Transformation Components
- Robustness to outliers: Measures such as the quartile deviation (QD), median absolute deviation (MAD), and interquartile range (IQR) are less influenced by extreme observations than conventional measures like the mean or standard deviation. Incorporating these robust statistics into the transformation components enhances estimator performance in datasets with heavy tails, such as those generated from Cauchy or log-normal distributions.
- Combining location and spread: Several of the proposed estimators simultaneously use measures of central tendency (e.g., trimean, quartile average) and dispersion (e.g., IQR, standard deviation). This dual usage improves the adaptability of the estimators to different distributional shapes, particularly when the data exhibit moderate skewness or non-normality.
- Skewness sensitivity: One of the proposed estimators uses the skewness of X to adjust dynamically to asymmetry in the distribution. Because skewness enters its transformation components explicitly, this estimator is particularly useful when the auxiliary variable is markedly asymmetric.
- Transformational stability: The transformations applied in some of the proposed estimators contribute to scale stability. Such transformations are known to mitigate the effect of skewness and reduce heteroscedasticity, thereby stabilizing the variance of the estimator.
- Geometric and midrange features: Some of the proposed estimators utilize geometric means or midrange components to capture distributional symmetry and central spread. These are especially effective in settings where the auxiliary variable is uniformly distributed or symmetrically bounded.
- Computational simplicity and availability: All transformation components used in the proposed estimators, such as the midrange, can be readily computed from the first-phase sample data. This makes the estimators highly practical and convenient, especially in real survey applications where full population data are inaccessible.
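To illustrate how cheaply these components can be obtained from a first-phase sample of X, the sketch below computes the summaries named in the notation table. The function name `robust_summaries` is hypothetical, and several definitions are assumptions chosen for the example: quartiles and deciles use the "inclusive" interpolation rule of Python's `statistics.quantiles`, the decile mean is taken as the average of the nine deciles, and skewness is the moment coefficient. The paper's exact definitions may differ.

```python
import statistics


def robust_summaries(x):
    """Robust location/spread summaries of a sample (illustrative definitions)."""
    x = sorted(x)
    if len(x) < 2:
        raise ValueError("need at least two observations")
    # Quartiles and deciles; the interpolation rule is an assumption.
    q1, q2, q3 = statistics.quantiles(x, n=4, method="inclusive")
    deciles = statistics.quantiles(x, n=10, method="inclusive")
    med = statistics.median(x)
    mean = statistics.fmean(x)
    m2 = sum((v - mean) ** 2 for v in x) / len(x)  # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / len(x)  # third central moment
    return {
        "IQR": q3 - q1,                          # interquartile range
        "QD": (q3 - q1) / 2,                     # quartile deviation
        "QA": (q1 + q3) / 2,                     # quartile average
        "TM": (q1 + 2 * q2 + q3) / 4,            # trimean
        "MR": (x[0] + x[-1]) / 2,                # midrange
        "MAD": statistics.median([abs(v - med) for v in x]),
        "DM": statistics.fmean(deciles),         # decile mean (assumed form)
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
    }
```

Every quantity here depends only on the observed X values, so no population-level information is needed beyond the first-phase sample.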
4. Explicit Comparison Conditions
5. Results and Discussion
5.1. Simulation Study
- Population 1: X follows a heavy-tailed Cauchy distribution and is negatively correlated with Y.
- Population 2: X follows a baseline uniform distribution on a bounded interval and is uncorrelated with Y.
- Population 3: X follows a highly skewed exponential distribution, with a fixed correlation between X and Y.
- Population 4: X follows a gamma distribution with moderate skewness and dispersion, with a fixed correlation between X and Y.
- Population 5: X follows a slightly skewed log-normal distribution, with a fixed correlation between X and Y.
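- Scheme A (fixed m, varying n): m is held fixed and, for each m, n is varied (rounded);
- Scheme B (paired growth): m and n are increased together;
- Scheme C (fine grid): a grid of first-phase sizes crossed with a grid of second-phase settings.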
5.2. Real-Life Application
5.3. Results from Simulation Studies and Real Data Applications
- Across both simulated and real datasets, the proposed transformation-based estimators outperformed ratio-type, exponential-type, and difference-type estimators. Table 4 and Figure 1 highlight these gains in simulated populations, while Table 6 and Figure 2 show similar improvements in real-life applications. In almost every case, the MSE of the proposed estimators was the smallest among all methods tested.
- In Table 4, the proposed estimators consistently achieved lower MSEs than the traditional methods in all five distributional settings (Cauchy, uniform, exponential, gamma, and log-normal), which cover heavy-tailed, skewed, and symmetric cases. Figure 1a–e visually confirms this trend: the proposed estimators form the lowest bars for each distribution.
- The outcomes in Table 6 demonstrate that the proposed estimators also perform strongly with real survey data, including socio-economic and environmental datasets. Figure 2a–c shows that, for all three populations, the proposed estimators consistently appear among the best-performing results on the MSE scale, surpassing both exponential- and difference-type estimators.
- The trends in Figure 1 and Figure 2 indicate that the new estimators remain effective across a wide range of correlation strengths between the study and auxiliary variables. Table 4 and Table 6 further show that their efficiency holds even when the second-phase sample size n is small relative to m, making them highly practical for budget-restricted surveys.
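The MSE comparisons discussed above rest on repeated two-phase sampling from a known population. A minimal Monte Carlo sketch of that idea follows. It is not the paper's simulation: the helper names are hypothetical, both phases are assumed to be simple random sampling without replacement, and the competitor shown is the classical two-phase ratio-type median estimator (second-phase median of Y multiplied by the ratio of first-phase to second-phase medians of X), used here as a stand-in because the proposed estimators' formulas are not reproduced in this text.

```python
import random
import statistics


def plain_median(first, second):
    """Second-phase sample median of y; ignores the auxiliary variable."""
    return statistics.median([y for _, y in second])


def ratio_type_median(first, second):
    """Classical two-phase ratio-type form (assumed here for illustration).

    Requires a non-zero second-phase median of x.
    """
    mx_first = statistics.median([x for x, _ in first])
    mx_second = statistics.median([x for x, _ in second])
    my_second = statistics.median([y for _, y in second])
    return my_second * mx_first / mx_second


def empirical_mse(estimator, population, m, n, reps=1000, seed=0):
    """Average squared error of `estimator` over repeated two-phase samples."""
    rng = random.Random(seed)
    true_median = statistics.median([y for _, y in population])
    total = 0.0
    for _ in range(reps):
        first = rng.sample(population, m)   # phase 1: m of N units
        second = rng.sample(first, n)       # phase 2: n of the m units
        total += (estimator(first, second) - true_median) ** 2
    return total / reps
```

Calling `empirical_mse` with the same seed for two estimators evaluates them on identical samples, which makes the comparison of their MSEs less noisy than using independent draws.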
6. Conclusions
Funding
Data Availability Statement
Conflicts of Interest
| Symbol | Definition | Symbol | Definition |
|---|---|---|---|
| N | Population size | m | First-phase sample size |
| n | Second-phase sample size | Y | Study variable |
| X | Auxiliary variable | QA | Quartile average |
| | Population median of Y | IQR | Interquartile range |
| | Population median of X | MR | Midrange |
| | First-phase sample median of X | QD | Quartile deviation |
| | Second-phase sample median of Y | TM | Trimean |
| | Second-phase sample median of X | DM | Decile mean |
| | Probability density function of | | Skewness of X |
| | Probability density function of | | Minimum of X |
| | Correlation coefficient | | Maximum of X |
| | Coefficient of variation in | MSE | Mean squared error |
| | Covariance term between and | | Sampling constant |
| MAD | Median absolute deviation | | Standard deviation of X |
| | Coefficient of variation in | | Ratios used in bias/MSE |
| | Quartiles of X | | Relative error of |
| | Relative error of | | Relative error of |
| | Constants in proposed estimators | | Transformation components |
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alshanbari, H.M. A Generalized Estimation Strategy for the Finite Population Median Using Transformation Methods Under a Two-Phase Sampling Design. Symmetry 2025, 17, 1696. https://doi.org/10.3390/sym17101696