Article

Robust Särndal-Type Mean Estimators with Re-Descending Coefficients

by Khudhayr A. Rashedi 1, Alanazi Talal Abdulrahman 1, Tariq S. Alshammari 1, Khalid M. K. Alshammari 1, Usman Shahzad 2,*, Javid Shabbir 3, Tahir Mehmood 4 and Ishfaq Ahmad 5

1 Department of Mathematics, College of Science, University of Ha’il, Ha’il 81481, Saudi Arabia
2 Department of Management Science, College of Business Administration, Hunan University, Changsha 410082, China
3 Department of Statistics, University of Wah, Rawalpindi 47040, Pakistan
4 School of Natural Science (SNS), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
5 Department of Mathematics and Statistics, International Islamic University, Islamabad 44000, Pakistan
* Author to whom correspondence should be addressed.
Axioms 2025, 14(4), 261; https://doi.org/10.3390/axioms14040261
Submission received: 26 January 2025 / Revised: 15 March 2025 / Accepted: 27 March 2025 / Published: 29 March 2025
(This article belongs to the Section Mathematical Analysis)

Abstract

When extreme values or outliers occur in asymmetric datasets, conventional mean estimation methods suffer from low accuracy and reliability. This study introduces a novel class of robust Särndal-type mean estimators utilizing re-descending M-estimator coefficients. These estimators effectively combine the benefits of robust regression techniques and the integration of extreme values to improve mean estimation accuracy under simple random sampling. The proposed methodology leverages distinct re-descending coefficients from prior studies. Performance evaluation is conducted using three real-world datasets and three synthetically generated datasets containing outliers, with results indicating superior performance of the proposed estimators in terms of mean squared error (MSE) and percentage relative efficiency (PRE). These findings illustrate the robustness, adaptability, and practical value of the proposed estimators for survey sampling and, more broadly, for data-intensive applications.

1. Introduction

Sample surveys are one of the basic approaches to data collection and are often the first step of the research process. They can be administered in various ways, including face-to-face, by telephone, by mail, or online, at the individual or group level. Questionnaires are a common instrument in many disciplines, including education, health, income and expenditure, employment, industry, business, animal studies, and the environment. Because the collected data are crucial for reaching correct conclusions, a survey can fail if the methodology used to gather the data is flawed. Well-designed estimation strategies can therefore improve the quality of the research and the accuracy of the results. One such strategy is the use of auxiliary variables, which are closely associated with the primary variable of study. Incorporating such associated variables generally improves the model being used and provides more effective and reliable estimates (Bhushan and Kumar [1]).
In many populations, the presence of extreme values can significantly affect the estimation of unknown population characteristics. If such values are ignored, some estimates become highly sensitive and may be substantially over- or underestimated. As a result, the accuracy of traditional estimators, measured by the MSE, tends to decline when extreme values are present in the dataset. Although such data points might sometimes be omitted from analysis, accounting for them is essential to enhance the reliability of population estimates. For instance, Mohanty and Sahoo [2] proposed two linear transformations of the smallest and largest observations of an auxiliary variable to develop more robust estimators. However, further exploration of these methods was limited until Khan and Shabbir [3] extended the use of extreme values to various finite population mean estimators. Furthermore, Daraz et al. [4] used stratified random sampling to improve the estimation of finite population means in cases where outliers are prevalent. For further discussion and developments in this area, consult [5,6,7] and related works.
The arithmetic mean is widely used as a measure of central tendency and location and has been applied successfully across a wide spectrum of sciences and arts. Consequently, improving mean estimation methods is essential not only in survey sampling but also in numerous other fields. Accurate mean estimation becomes particularly critical in the presence of extreme observations, a common challenge in sample surveys (Koc and Koc [8]). To address this issue, efficient ratio- and regression-based methods for mean estimation are utilized, together with several techniques that exploit the extreme values of the augmented dataset. For example, Mohanty and Sahoo [2] initially used linear transformations based on the lower and upper limits of the auxiliary variable. These transformations improved the efficiency of ratio estimators, even in situations where traditional ratio estimators performed worse than the simple mean per unit estimator. Khan and Shabbir [3] introduced modified ratio, product, and regression-type estimators using minimum and maximum values, with numerical evidence confirming improved efficiency; Khan [9] further extended this work to the double sampling scheme. Similarly, Cekim and Cingi [5] developed ratio, product, and exponential-type estimators of the population mean by applying novel linear transformations based on known minimum and maximum values of auxiliary variables. Shahzad et al. [6] advanced this concept by designing separate mean estimators under stratified random sampling, incorporating quantile regression and extreme values using the Särndal approach; they also extended their methods to sensitive topics using scrambled response models. More recently, Anas et al. [7] adapted the work of Shahzad et al. [6] and introduced robust mean estimators for simple random sampling. These developments show that mean estimation accuracy can be increased by using extreme values across a variety of sampling situations.
Previous studies have incorporated either ratio and product techniques or regression models, using ordinary as well as robust coefficients to account for outliers. However, no existing mean estimator integrates re-descending M-estimators with extreme values simultaneously. The Särndal technique appears to present a promising avenue for integrating re-descending M-estimators with extreme/contaminated values to improve the estimation of the mean. This notable gap in the literature has inspired us to propose a new class of Särndal-type mean estimators that leverage re-descending coefficients.
The remaining article is structured into multiple sections. In Section 2, re-descending mean estimators are defined as a robust alternative to the ordinary least squares (OLS) method for treating asymmetric data with outliers. The concept is extended in Section 3 by adapting Särndal’s [10] mean estimator to improve efficiency in the presence of extreme values. A new family of re-descending regression-based mean estimators is proposed, incorporating re-descending coefficients developed by various researchers. Section 4 evaluates the proposed methodology using real-world and synthetic datasets, demonstrating lower MSE and higher PRE compared to adapted methods. The article concludes in Section 5.

2. Re-Descending M-Estimators and Mean Estimation

In many real-life situations, the OLS method is used. However, OLS regression assumes normally distributed error terms, an assumption that real-life data often violate, especially in the presence of outliers. Outliers of any type can substantially distort the OLS estimate, and even a single additional outlying observation can render the result inefficient and inaccurate (Dunder et al. [11]). To address this problem, researchers have developed a more stable approach to regression analysis, known as robust regression, which modifies OLS. This is particularly relevant in large-sample settings with asymmetric distributions, where OLS estimates are sensitive to outliers and robust M-estimators provide a solution.
Building on work characterizing the sensitivity of linear regression methods to outliers, Huber [12] developed the M-estimator. This estimator assigns weights close to 1 to central observations and weights close to zero to the most extreme ones. The M-estimator also fits within the maximum likelihood estimation framework when the OLS assumption of normally distributed error terms does not hold. Robust M-estimators replace the squared error term of OLS with a symmetric loss function, defined as follows:
min_{Λ̂} Σ_{l=1}^{n} ϕ(e_l).
The previous literature shows that the M-estimator is computed by iteratively reweighted least squares (IRLS), with the loss function down-weighting outlying residuals to improve efficiency.
Then, the associated influence function ψ(e_l) is obtained by differentiating the loss function ϕ(e_l) with respect to the residuals, which yields the estimating equations
Σ_{l=1}^{n} ψ(e_l) Z_l = 0,
where Z_l are the covariates. Writing the weight function as w(e_l) = ψ(e_l)/e_l, the same estimating equations can be expressed as
Σ_{l=1}^{n} w(e_l) e_l Z_l = 0.
Re-descending M-estimators are derived by modifying the loss function ϕ ( . ) to exhibit a re-descending nature. This concept has been well explored in the literature by Beaton and Tukey [13], Qadir [14], Alamgir et al. [15], Khalil et al. [16], Noor-ul-Amin et al. [17], Anekwe and Onyeagu [18], and Luo et al. [19]. In fact, these estimators are very powerful in eliminating the influence of outliers on the subsequent statistical results.
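To make the ρ–ψ–w relationship concrete, the following minimal Python sketch implements the classic biweight of Beaton and Tukey [13] (used here only as an illustration, not one of the coefficients adopted later); the tuning constant c = 4.685 is a commonly used choice and is an assumption of this sketch rather than a value taken from the cited papers.

```python
import numpy as np

def tukey_rho(e, c=4.685):
    # Loss: behaves like a squared error near zero, flattens to c^2/6 beyond |e| = c
    e = np.asarray(e, dtype=float)
    rho = np.full_like(e, c**2 / 6.0)
    inside = np.abs(e) <= c
    rho[inside] = (c**2 / 6.0) * (1.0 - (1.0 - (e[inside] / c) ** 2) ** 3)
    return rho

def tukey_psi(e, c=4.685):
    # Influence function (derivative of rho): re-descends to exactly 0 for |e| >= c
    e = np.asarray(e, dtype=float)
    return np.where(np.abs(e) <= c, e * (1.0 - (e / c) ** 2) ** 2, 0.0)

def tukey_weight(e, c=4.685):
    # Weight w(e) = psi(e)/e: close to 1 for small residuals, 0 for extreme ones
    e = np.asarray(e, dtype=float)
    return np.where(np.abs(e) <= c, (1.0 - (e / c) ** 2) ** 2, 0.0)

residuals = np.array([-10.0, -2.0, 0.0, 1.5, 8.0])
print(tukey_weight(residuals))  # the observations at -10 and +8 receive weight 0
```

The printout illustrates the defining property of a re-descending estimator: weights fall all the way to zero for sufficiently extreme residuals, so such points drop out of the fit entirely.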
The mean, as discussed in earlier sections, is a basic way to summarize central tendency. In this study, we utilize re-descending M-estimator regression coefficients to estimate the mean efficiently. Specifically, re-descending M-estimators developed by Noor-ul-Amin et al. [17], Khan et al. [20], Anekwe and Onyeagu [18], and Raza et al. [21,22] will be employed for this purpose.

2.1. Existing Re-Descending Estimators

In order to handle data contaminated with outliers, Noor-ul-Amin et al. [17] proposed a re-descending estimator. Their methodology minimizes the influence of large residuals by applying a weight function. The estimator is adjustable, so that tuning parameters c and a determine its robustness and efficiency, making it suitable for robust regression. The primary contribution of Noor-ul-Amin et al. [17] was to demonstrate that their estimator performs well across a variety of contamination scenarios. Their objective ρ ( λ l ) , Psi ψ ( λ l ) , and weight w ( λ l ) functions are given below.
Objective function:
ρ(λ_l) = (c²/16)·tan⁻¹[(2λ_l/c)²] + λ_l²c⁴/(4c⁴ + 64λ_l⁴),   |λ_l| ≥ 0.
Psi function:
ψ(λ_l) = λ_l·[1 + (2λ_l/c)⁴]^(−2)  for |λ_l| < c,   and   ψ(λ_l) = 0  for |λ_l| ≥ c.
Weight function:
w(λ_l) = [1 + (2λ_l/c)⁴]^(−2)  for |λ_l| < c,   and   w(λ_l) = 0  for |λ_l| ≥ c,
where λ_l is the residual and c ∈ (0, ∞) is the tuning constant.
In Khan et al. [20], a new re-descending M-estimator based on the hyperbolic tangent function is introduced. Its objective function is designed to be smooth and continuous in order to achieve high efficiency and robustness, and the estimator also performs well on asymmetric, heavy-tailed data distributions. A hyperbolic cosine-based weight function makes the method insensitive to noise while giving full weight to central observations in the regression model. The objective, Psi, and weight functions are provided below.
Objective function:
ρ(λ_l) = c·tan⁻¹[tanh(λ_l²/(2c))],   |λ_l| ≥ 0.
Psi function:
ψ(λ_l) = λ_l / cosh(λ_l²/(2c)),   |λ_l| ≥ 0.
Weight function:
w(λ_l) = 1 / cosh(λ_l²/(2c)),   |λ_l| ≥ 0.
Anekwe and Onyeagu [18] developed a re-descending estimator with a focus on polynomial-based weight functions. For residuals above the threshold of c, their weight function transitions smoothly to zero. It provides a high breakdown point and should be robust against leverage points. The objective, Psi, and weight functions are provided below.
Objective function:
ρ(λ_l) = λ_l⁶/c⁴ + λ_l¹⁰/(2c⁸) − 2λ_l⁶/c⁴ + λ_l²/2 − 2λ_l⁶(3λ_l⁴ − 5c⁴)/(15c⁸)  for |λ_l| ≤ c,   and   ρ(λ_l) = 4c²/15  for |λ_l| > c.
Psi function:
ψ(λ_l) = λ_l·[1 − (λ_l/c)²]²·[1 + (λ_l/c)²]²  for |λ_l| < c,   and   ψ(λ_l) = 0  for |λ_l| ≥ c.
Weight function:
w(λ_l) = [1 − (λ_l/c)²]²·[1 + (λ_l/c)²]²  for |λ_l| < c,   and   w(λ_l) = 0  for |λ_l| ≥ c.
Raza et al. [21] introduced an estimator based on parameterized robustness principles. Similar to Noor-ul-Amin et al. [17], their weight function balances efficiency and robustness through the parameters k and a. The objective, Psi, and weight functions are detailed below.
Objective function:
ρ(λ_l) = (k²/(2a))·{1 − [1 + (λ_l/k)²]^(−a)},   |λ_l| ≥ 0.
Psi function:
ψ(λ_l) = λ_l·[1 + (λ_l/k)²]^(−(a+1)),   |λ_l| ≥ 0.
Weight function:
w(λ_l) = [1 + (λ_l/k)²]^(−(a+1)),   |λ_l| ≥ 0,
where k and a are arbitrary and generalized tuning constants, respectively.
Raza et al. [22] introduced a higher-order polynomial re-descending estimator, characterized by its ability to reject extreme outliers while maintaining efficiency in central data regions. The objective, Psi, and weight functions are detailed below.
Objective function:
ρ(λ_l) = λ_l²(λ_l⁸ − 30a⁴λ_l⁴ + 405a⁸)/(810a⁸)  for |λ_l| ≤ a,   and   ρ(λ_l) = 188a²/405  for |λ_l| > a.
Psi function:
ψ(λ_l) = λ_l·[1 − λ_l⁴/(9a⁴)]²  for |λ_l| ≤ a,   and   ψ(λ_l) = 0  for |λ_l| > a.
Weight function:
w(λ_l) = [1 − λ_l⁴/(9a⁴)]²  for |λ_l| ≤ a,   and   w(λ_l) = 0  for |λ_l| > a,
where a is the tuning constant.
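For illustration, two of the above weight functions can be coded directly from the expressions displayed here. The following is a minimal Python sketch based on these reconstructed forms; the tuning-constant values c = 3 and (k, a) = (2, 1) are arbitrary demonstration choices, not recommendations from the cited papers.

```python
import numpy as np

def w_amin2018(r, c=3.0):
    # Noor-ul-Amin et al. [17] style weight: [1 + (2r/c)^4]^(-2) for |r| < c, else 0
    r = np.asarray(r, dtype=float)
    w = (1.0 + (2.0 * r / c) ** 4) ** (-2)
    return np.where(np.abs(r) < c, w, 0.0)

def w_raza2024a(r, k=2.0, a=1.0):
    # Raza et al. [21] style weight: [1 + (r/k)^2]^(-(a+1)); decays smoothly with |r|
    r = np.asarray(r, dtype=float)
    return (1.0 + (r / k) ** 2) ** (-(a + 1.0))

residuals = np.array([0.0, 0.5, 1.0, 2.0, 5.0, 20.0])
print(w_amin2018(residuals))   # truncated to zero beyond the tuning constant
print(w_raza2024a(residuals))  # never exactly zero, but vanishingly small for outliers
```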

2.2. Adaptive Mean Estimators Using Re-Descending Coefficients

Ratio estimation is a valuable method for estimating the population mean when there is a positive linear relationship between the auxiliary and study variables in survey sampling. This approach was pioneered in the mid-twentieth century by Cochran and has become a major methodological tool used not only in agricultural studies but also in many other research fields. We refer readers to Cochran [23], Cetin and Koyuncu [24], and Daraz et al. [25] for more information about ratio-type estimators. Kadilar and Cingi [26] pioneered ratio-type estimators based on OLS regression coefficients. In the presence of outliers, however, traditional OLS regression is not adequate. This limitation was overcome by Kadilar and Cingi [27], who combined robust regression methods with the ratio-type estimator to improve its precision and reliability; they used Huber-M regression, a robust regression method, for ratio-type mean estimation under a simple random sampling scheme. Building on these foundational studies, we introduce a modified family of estimators under simple random sampling with re-descending coefficients, using the approaches of Noor-ul-Amin et al. [17], Khan et al. [20], Anekwe and Onyeagu [18], and Raza et al. [21,22]:
ȳ_qj = [ȳ + β̂_Amin2018(X̄ − x̄)]·(X̄/x̄)   for j = 1,
ȳ_qj = [ȳ + β̂_Khan2021(X̄ − x̄)]·(X̄/x̄)   for j = 2,
ȳ_qj = [ȳ + β̂_Anekwe2021(X̄ − x̄)]·(X̄/x̄)   for j = 3,
ȳ_qj = [ȳ + β̂_Raza2024a(X̄ − x̄)]·(X̄/x̄)   for j = 4,
ȳ_qj = [ȳ + β̂_Raza2024b(X̄ − x̄)]·(X̄/x̄)   for j = 5.
The adapted family ȳ_qj involves x̄, ȳ, and X̄: here ȳ and x̄ denote the sample means of the study variable Y and the auxiliary variable X, respectively, and X̄ is the population mean of the auxiliary variable X. The MSE of the adapted class ȳ_qj is as follows:
MSE(ȳ_qj) = Λ[S_y² + w_i²S_x² + 2β_Amin2018·w_i·S_x² + β_Amin2018²·S_x² − 2w_i·S_xy − 2β_Amin2018·S_xy]   for j = 1,
MSE(ȳ_qj) = Λ[S_y² + w_i²S_x² + 2β_Khan2021·w_i·S_x² + β_Khan2021²·S_x² − 2w_i·S_xy − 2β_Khan2021·S_xy]   for j = 2,
MSE(ȳ_qj) = Λ[S_y² + w_i²S_x² + 2β_Anekwe2021·w_i·S_x² + β_Anekwe2021²·S_x² − 2w_i·S_xy − 2β_Anekwe2021·S_xy]   for j = 3,
MSE(ȳ_qj) = Λ[S_y² + w_i²S_x² + 2β_Raza2024a·w_i·S_x² + β_Raza2024a²·S_x² − 2w_i·S_xy − 2β_Raza2024a·S_xy]   for j = 4,
MSE(ȳ_qj) = Λ[S_y² + w_i²S_x² + 2β_Raza2024b·w_i·S_x² + β_Raza2024b²·S_x² − 2w_i·S_xy − 2β_Raza2024b·S_xy]   for j = 5.
The MSE of the adapted family ȳ_qj involves the finite population correction factor Λ, the ratio of the means of the study and auxiliary variables w_i, the variances S_y² and S_x² and covariance S_xy of the study and auxiliary variables, and the re-descending coefficients (β_Amin2018, β_Khan2021, β_Anekwe2021, β_Raza2024a, β_Raza2024b).
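The sketch below shows, for a simple random sample, one way to obtain a re-descending coefficient β̂ by iteratively reweighted least squares and plug it into the adapted estimator ȳ_qj = [ȳ + β̂(X̄ − x̄)]·X̄/x̄. It uses the update β̂^(t+1) = Σ w(λ_l)X_lY_l / Σ w(λ_l)X_l² that is adopted in Section 3 and any weight function such as the w_amin2018 sketch above; residual rescaling by a robust scale estimate and intercept handling are omitted for brevity, so this is an illustration rather than the exact fitting routine of the cited papers.

```python
import numpy as np

def redescending_beta(x, y, weight_fn, n_iter=50, tol=1e-8):
    # IRLS with the update beta = sum(w*x*y) / sum(w*x^2); OLS slope as starting value
    x, y = np.asarray(x, float), np.asarray(y, float)
    beta = np.polyfit(x, y, 1)[0]
    for _ in range(n_iter):
        w = weight_fn(y - beta * x)          # weights computed from current residuals
        denom = np.sum(w * x ** 2)
        if denom == 0:                       # every observation down-weighted to zero
            break
        beta_new = np.sum(w * x * y) / denom
        if abs(beta_new - beta) < tol:
            return beta_new
        beta = beta_new
    return beta

def ybar_q(y_sample, x_sample, Xbar, beta_hat):
    # Adapted estimator of Section 2.2: [ybar + beta*(Xbar - xbar)] * Xbar / xbar
    ybar, xbar = float(np.mean(y_sample)), float(np.mean(x_sample))
    return (ybar + beta_hat * (Xbar - xbar)) * Xbar / xbar
```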

3. Proposed Family of Re-Descending Estimators

An outlier is not necessarily an anomaly; it can be an important source of information about the structure or variation in a dataset (Zaman et al. [28]). Traditional analysis methods do not typically capture the population-specific characteristics or trends that can be inferred from outliers. Särndal [10] recognized this potential and introduced a mean estimator that accounts for the impact of extreme values while retaining asymptotic efficiency. In contrast to conventional mean estimation approaches, this method adjusts for the presence of extreme values automatically. The Särndal [10] mean estimator is defined as follows:
ȳ_sr = ȳ + ς   if the selected sample contains min(y),
ȳ_sr = ȳ − ς   if the selected sample contains max(y),
ȳ_sr = ȳ        for all other samples,
where ς is a suitably chosen constant. Let N be the population size and n the sample size. Using ς_opt = (max(y) − min(y))/(2n), the minimum variance of ȳ_sr is given by
Var(ȳ_sr) = Var(ȳ) − Λ(max(y) − min(y))²/(2(N − 1)).
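As a small illustration of the selection rule above, the following sketch applies the Särndal [10] adjustment to a drawn sample, assuming the population minimum and maximum of y are known; the handling of a sample that happens to contain both extremes is not specified by the rule and is a simplification here.

```python
import numpy as np

def sarndal_mean(y_sample, y_min_pop, y_max_pop):
    # Adjusted sample mean: +adjust if the sample holds the population minimum,
    # -adjust if it holds the population maximum, unadjusted otherwise.
    y_sample = np.asarray(y_sample, dtype=float)
    n = y_sample.size
    adjust = (y_max_pop - y_min_pop) / (2.0 * n)   # the optimal constant given above
    if y_min_pop in y_sample:
        return y_sample.mean() + adjust
    if y_max_pop in y_sample:
        return y_sample.mean() - adjust
    return y_sample.mean()
```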
In light of Särndal [10], and extending the idea of Mukhtar et al. [29], we propose a family of re-descending regression-based mean estimators for estimating Ȳ. When the linear relationship between the study variable Y and the auxiliary variable X is positive, a sample that contains max(x) is expected to contain max(y), and a sample that contains min(x) is expected to contain min(y). Utilizing this setup under simple random sampling, we define the new family of re-descending regression coefficient-based estimators as follows:
ȳ_pj = ȳ_ς1 + β̂_(r)(X̄ − x̄_ς2),   for j = 1, 2, …, 5,
where β̂_(r) can be any of the re-descending coefficients (β̂_Amin2018, β̂_Khan2021, β̂_Anekwe2021, β̂_Raza2024a, β̂_Raza2024b). In addition, ȳ_ς1 = ȳ + ς1 and x̄_ς2 = x̄ + ς2, where (ς1, ς2) are suitably chosen constants. For the theoretical MSE results, we define the following notation:
η_y = ȳ_ς1/Ȳ − 1,   η_x = x̄_ς2/X̄ − 1,   E(η_x) = E(η_y) = 0,
E(η_y²) = (Λ/Ȳ²)[S_y² − (2nς1/(N − 1))(Y_d − nς1)],
E(η_x²) = (Λ/X̄²)[S_x² − (2nς2/(N − 1))(X_d − nς2)],
E(η_yη_x) = (Λ/(ȲX̄))[S_yx − (n/(N − 1))(ς2Y_d + ς1X_d − 2nς1ς2)],
where Y_d = max(y) − min(y) and X_d = max(x) − min(x).
Expanding ȳ_pj in terms of these error terms gives
ȳ_pj − Ȳ = Ȳ(1 + η_y) − β̂_(r)X̄η_x − Ȳ = Ȳη_y − β̂_(r)X̄η_x,   for j = 1, 2, …, 5.
The MSE of ȳ_pj is obtained by squaring both sides of Equation (8) and ignoring terms in the η's with powers greater than two:
MSE(ȳ_pj) = Λ{[S_y² − (2nς1/(N − 1))(Y_d − nς1)] + β̂_(r)²[S_x² − (2nς2/(N − 1))(X_d − nς2)]} − 2Λβ̂_(r)[S_yx − (n/(N − 1))(ς2Y_d + ς1X_d − 2nς1ς2)],   for j = 1, 2, …, 5.
Note that every notation used in MSE(ȳ_pj) has been defined in the preceding lines. Furthermore, the newly built class follows the structure of Särndal [10], but we implement that framework through re-descending regression-type mean estimators. By leveraging known results and simple algebra, we avoid tedious computations and obtain the optimal values of (ς1, ς2) and, consequently, the minimum-MSE expressions of the estimators ȳ_pj as follows:
ς1(opt) = (max(y) − min(y))/(2n) = Y_d/(2n),   ς2(opt) = (max(x) − min(x))/(2n) = X_d/(2n),
MSE_min(ȳ_pj) = Λ[S_y² + β̂_(r)²S_x² − 2β̂_(r)S_yx − (Y_d − β̂_(r)X_d)²/(2(N − 1))],   for j = 1, 2, …, 5.
It is important to note that five re-descending coefficients are used in our estimation process. Each coefficient is obtained by an iterative procedure aimed at achieving robust estimation by minimizing the influence of outliers. Each estimator applies its own weight function, w(λ_l), designed to diminish the impact of large residuals in alignment with its defined structure. As a baseline, the OLS method provides the first estimate of β̂. A weighted least squares step is then performed iteratively, with residual weights updated at each iteration t, such that β̂^(t+1) = Σ_l w(λ_l^(t)) X_l Y_l / Σ_l w(λ_l^(t)) X_l². Convergence is declared when |β̂^(t+1) − β̂^(t)| < ϵ, where ϵ is a prespecified tolerance level. To make the article more comprehensive for readers, we replace β̂_(r) with (β̂_Amin2018, β̂_Khan2021, β̂_Anekwe2021, β̂_Raza2024a, β̂_Raza2024b) and consider five members of the proposed class with their minimum MSEs as given below:
ȳ_pj = ȳ_ς1 + β̂_Amin2018(X̄ − x̄_ς2)   for j = 1,
ȳ_pj = ȳ_ς1 + β̂_Khan2021(X̄ − x̄_ς2)   for j = 2,
ȳ_pj = ȳ_ς1 + β̂_Anekwe2021(X̄ − x̄_ς2)   for j = 3,
ȳ_pj = ȳ_ς1 + β̂_Raza2024a(X̄ − x̄_ς2)   for j = 4,
ȳ_pj = ȳ_ς1 + β̂_Raza2024b(X̄ − x̄_ς2)   for j = 5.
MSE_min(ȳ_pj) = Λ[S_y² + β_Amin2018²S_x² − 2β_Amin2018·S_yx − (Y_d − β_Amin2018·X_d)²/(2(N − 1))]   for j = 1,
MSE_min(ȳ_pj) = Λ[S_y² + β_Khan2021²S_x² − 2β_Khan2021·S_yx − (Y_d − β_Khan2021·X_d)²/(2(N − 1))]   for j = 2,
MSE_min(ȳ_pj) = Λ[S_y² + β_Anekwe2021²S_x² − 2β_Anekwe2021·S_yx − (Y_d − β_Anekwe2021·X_d)²/(2(N − 1))]   for j = 3,
MSE_min(ȳ_pj) = Λ[S_y² + β_Raza2024a²S_x² − 2β_Raza2024a·S_yx − (Y_d − β_Raza2024a·X_d)²/(2(N − 1))]   for j = 4,
MSE_min(ȳ_pj) = Λ[S_y² + β_Raza2024b²S_x² − 2β_Raza2024b·S_yx − (Y_d − β_Raza2024b·X_d)²/(2(N − 1))]   for j = 5.
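As a worked illustration of the minimum-MSE expression, the short sketch below plugs the Population-1 summaries reported in Table 1 into MSE_min(ȳ_p1). The finite population correction is taken here as Λ = 1/n − 1/N, an assumption chosen because it reproduces the reported figures, not a definition stated explicitly in the text.

```python
# Population-1 characteristics taken from Table 1 (education expenditure data)
N, n = 50, 5
Sy2, Sx2, Syx = 3762.612, 21029.76, 2865.327
Yd, Xd = 62.0, 587.0                    # ranges (max - min) of y and x
beta = 0.4180495                        # re-descending coefficient for j = 1

Lam = 1.0 / n - 1.0 / N                 # assumed finite population correction factor
mse_min = Lam * (Sy2 + beta**2 * Sx2 - 2 * beta * Syx
                 - (Yd - beta * Xd) ** 2 / (2 * (N - 1)))
print(round(mse_min, 2))                # about 845.82, close to the 845.8177 in Table 2
```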

4. Numerical Study

In this section, we assess the performance of the proposed and adapted estimators by applying them to three real-world datasets and three synthetically generated datasets.

4.1. Real Life Applications

Populations 1 and 2:
The education expenditure dataset from the R robustbase package is utilized as Population-1. This dataset includes variables related to education expenditure across the 50 U.S. states: X represents the number of residents per thousand living in urban areas in 1970, and Y represents the per capita expenditure on public education in 1975. For Population-2, the same dataset is employed; the dependent variable Y remains unchanged from Population-1, and the auxiliary variable X is replaced with the per capita personal income in 1973.
Population-3:
Farm loans data provided by Singh [30] are used as Population-3. In this dataset, Y measures the value of real estate loans issued and X measures the value of non-real-estate farm loans issued, both in 1977.
The relevant characteristics of these populations are detailed in Table 1. Figures 1–3, corresponding to Populations 1–3, reveal asymmetric distributions and the presence of outliers, making these datasets well suited for evaluating the performance of the proposed estimators. The MSE results for these real-world populations are presented in Table 2.
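The population summaries reported in Table 1 can be reproduced from any such (X, Y) pair. The sketch below shows the computation for a generic data frame; the column names x and y are hypothetical placeholders, and the snippet does not fetch the robustbase data itself.

```python
import numpy as np
import pandas as pd

def population_summaries(df: pd.DataFrame, n: int = 5) -> dict:
    # Quantities used throughout the paper for a finite population (X, Y)
    x, y = df["x"].to_numpy(float), df["y"].to_numpy(float)
    return {
        "N": len(df), "n": n,
        "rho": np.corrcoef(x, y)[0, 1],            # correlation between X and Y
        "Sy2": np.var(y, ddof=1),                  # population-level variance of Y
        "Sx2": np.var(x, ddof=1),                  # population-level variance of X
        "Sxy": np.cov(x, y, ddof=1)[0, 1],         # covariance of X and Y
        "Yd": y.max() - y.min(),                   # range of the study variable
        "Xd": x.max() - x.min(),                   # range of the auxiliary variable
    }
```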

4.2. Data Generations and Simulation Study

In this article, a simulation study was conducted to evaluate the performance of the proposed re-descending estimators under various artificially generated populations containing outliers. The details of these populations are provided below.
Populations 4, 5, and 6:
The independent variable X for Populations 4, 5, and 6 was generated from the following distributions:
Uniform distribution: X ∼ Uniform(1, 100);
Gumbel distribution: X ∼ Gumbel(50, 10);
Weibull distribution: X ∼ Weibull(2, 50).
The dependent variable Y was generated following a linear relationship described as follows:
Y = 2X + ϵ,
where ϵ ∼ N(0, 10). To create outliers, five randomly chosen observations of Y were perturbed by adding large deviations drawn from a separate normal distribution, N(100, 20), so that these points differ markedly from the rest of the data. This approach produced realistic datasets containing a blend of regular observations and outliers, matching real cases in which outliers may occur due to measurement errors or extreme events. Each generated dataset comprises 100 observations, with both the regular data points and the outliers visualized in a scatterplot for clarity; see Figure 4, Figure 5 and Figure 6, where outliers are shown in red to distinguish them from standard points. These artificial datasets provide a robust framework for testing the effectiveness of re-descending M-estimators in identifying and mitigating the influence of outliers on regression-based mean estimation. The simulated MSE results for these artificial populations are provided in Table 3.
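A minimal sketch of this data-generating process follows. The random seed and the exact indices of the five perturbed observations are arbitrary choices here, and the second parameters of N(0, 10), N(100, 20), and Weibull(2, 50) are interpreted as the standard deviation and scale, respectively, which is an assumption about the intended parameterization.

```python
import numpy as np

rng = np.random.default_rng(1)
size = 100

# Auxiliary variable for Populations 4, 5 and 6
x_pop4 = rng.uniform(1, 100, size)
x_pop5 = rng.gumbel(loc=50, scale=10, size=size)
x_pop6 = rng.weibull(2, size) * 50        # Weibull shape 2, rescaled to scale 50

def make_y(x, rng, n_outliers=5):
    # Linear relationship Y = 2X + eps with eps ~ N(0, 10), then five randomly
    # chosen points perturbed by large deviations drawn from N(100, 20)
    y = 2 * x + rng.normal(0, 10, x.size)
    idx = rng.choice(x.size, n_outliers, replace=False)
    y[idx] += rng.normal(100, 20, n_outliers)
    return y

y_pop4 = make_y(x_pop4, rng)
```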
The MSE comparison plots for Populations 1–6 are illustrated in Figures 7–12.

4.3. Interpretation of Results

  • MSE results from Table 2:
The results in Table 2 demonstrate the consistent superiority of the proposed estimators (ȳ_pj) over the existing estimators (ȳ_qj) across all three populations. For Population-1, representing education expenditure data, the lowest MSE is achieved by the proposed estimator ȳ_p2 with a value of 837.93, far below the MSEs of the existing estimators, such as 2515.49 for ȳ_q4. Similarly, in Population-2, another case of education expenditure data, ȳ_p2 again achieves the lowest MSE of 423.39, while the existing estimator ȳ_q1 records an MSE of 733.88. For Population-3, based on farm loans data, the trend persists, with ȳ_p2 yielding the lowest MSE of 22,185.60, compared to the much higher value of 129,184.3 observed for ȳ_q3.
  • MSE results from Table 3:
For Population-4, which is uniformly distributed, the proposed estimator ȳ_p2 achieves the lowest MSE of 44.58, significantly outperforming the best-performing existing estimator ȳ_q2 with an MSE of 352.73, demonstrating the superior efficiency of the proposed class for uniformly distributed data. In Population-5, generated from a Gumbel distribution, ȳ_p4 proves highly robust with an MSE of 44.67, considerably better than the existing estimator ȳ_q3 with an MSE of 108.66, indicating strong performance in handling skewed data with heavy tails. Similarly, for Population-6, simulated from a Weibull distribution, the proposed estimator ȳ_p4 achieves an MSE of 44.70, compared to 259.03 for the existing estimator ȳ_q4. These simulations demonstrate the robustness, adaptability, and efficiency of the proposed estimators for asymmetric data.
  • PRE results from Table 4, Table 5, Table 6, Table 7, Table 8 and Table 9:
The results from Table 4, Table 5, Table 6, Table 7, Table 8 and Table 9 illustrate the superior performance of the proposed estimators in terms of percentage relative efficiency (PRE) across all populations. For Population-1, Table 4 shows that the proposed estimator ȳ_p2 achieves a PRE as high as 304.03, reflecting substantial efficiency gains over existing methods. Similarly, for Population-2, as presented in Table 5, ȳ_p2 stands out with a PRE of 174.47, outperforming all other estimators. For Population-3, Table 6 shows that ȳ_p4 attains a PRE of 579.38 and thus copes very well with data outliers. For the simulated datasets, Table 7 shows that ȳ_p2 achieves a substantial PRE of 785.99 in Population-4, emphasizing its efficiency for uniformly distributed data. Similarly, for Population-5, Table 8 shows that ȳ_p2 maintains a PRE of 234.42, far surpassing the existing methods. Finally, Table 9 shows that ȳ_p2 again reaches a PRE of 547.58 for Population-6, making it one of the most adaptable and robust estimators tested across the different datasets. Overall, the proposed estimators consistently outperform existing methods in all scenarios, making them highly effective and efficient for robust re-descending regression-based mean estimation.
  • Summary of Results
The proposed re-descending regression-based estimators consistently show superior performance across all evaluated populations and scenarios. They demonstrate robustness and adaptability in terms of both MSE and PRE compared with the adapted methods, especially in cases where datasets contain outliers or asymmetry and conventional estimators are unable to maintain efficiency. The findings confirm the effectiveness of the proposed methodology and its strong potential for application in real-world data analysis and survey sampling.

4.4. Limitations of the Study

The proposed re-descending Särndal-type mean estimators offer significant improvements in robustness and accuracy, but some limitations should be noted. First, the study focuses on simple random sampling (SRS); the performance of the proposed estimators under more complex sampling schemes, such as stratified, cluster, and systematic sampling, remains an area for further research. Second, these estimators are particularly fruitful when extreme values are present in the dataset, so the gains may be smaller for data without such observations.

5. Conclusions

This study presents a novel class of robust Särndal-type mean estimators utilizing re-descending M-estimator coefficients to effectively address the challenges posed by outliers and extreme values in diverse datasets. By incorporating advancements from prior works, such as those by Noor-ul-Amin et al. [17], Khan et al. [20], Anekwe and Onyeagu [18], and Raza et al. [21,22], the proposed estimators significantly improve the accuracy and efficiency of mean estimation, as evidenced by lower MSE and higher PRE compared to the adapted methods. These estimators demonstrate remarkable robustness and adaptability across real-world datasets, such as education expenditure and farm loans, as well as simulated datasets derived from uniform, Gumbel, and Weibull distributions. Their practical utility in survey sampling and related fields stems from their ability to maintain reliability and accuracy in the presence of asymmetry and outliers. In addition, these results suggest that the estimators may be suitable for applications in which robust data analysis is important, including economics, health, and the social sciences. Future work could be aimed at extending these methodologies to more complicated sampling frameworks, such as the stratified two-phase design considered in [31], and at applying them more broadly in interdisciplinary settings.

Author Contributions

Conceptualization, U.S.; formal analysis, U.S. and I.A.; funding acquisition, K.A.R. and A.T.A.; methodology, T.S.A., K.M.K.A., U.S., T.M. and I.A.; project administration, A.T.A.; software, K.A.R. and U.S.; supervision, J.S.; writing—original draft, K.A.R., A.T.A., T.S.A., K.M.K.A., U.S., J.S., T.M. and I.A.; writing—review and editing, K.A.R., A.T.A., T.S.A., K.M.K.A., U.S., J.S. and T.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been funded by the Scientific Research Deanship at the University of Ha’il, Saudi Arabia, through project number RG-24 067.

Data Availability Statement

All the relevant data are available within the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Bhushan, S.; Kumar, A. Optimal classes of estimators for population mean using higher order moments. Afr. Mat. 2025, 36, 12.
2. Mohanty, S.; Sahoo, J. A note on improving the ratio method of estimation through linear transformation using certain known population parameters. Sankhyā Indian J. Stat. Ser. B 1995, 57, 93–102.
3. Khan, M.; Shabbir, J. Some improved ratio, product, and regression estimators of finite population mean when using minimum and maximum values. Sci. World J. 2013, 2013, 431868.
4. Daraz, U.; Shabbir, J.; Khan, H. Estimation of finite population mean by using minimum and maximum values in stratified random sampling. J. Mod. Appl. Stat. Methods 2018, 17, 20.
5. Cekim, H.O.; Cingi, H. Some estimator types for population mean using linear transformation with the help of the minimum and maximum values of the auxiliary variable. Hacet. J. Math. Stat. 2017, 46, 685–694.
6. Shahzad, U.; Ahmad, I.; Al-Noor, N.H.; Iftikhar, S.; Abd Ellah, A.H.; Benedict, T.J. Särndal approach and separate type quantile robust regression type mean estimators for nonsensitive and sensitive variables in stratified random sampling. J. Math. 2022, 2022, 1430488.
7. Anas, M.M.; Huang, Z.; Shahzad, U.; Iftikhar, S. A new family of robust quantile-regression-based mean estimators using Sarndal approach. Commun. Stat. Simul. Comput. 2024, 1–20.
8. Koc, T.; Koc, H. A new class of quantile regression ratio-type estimators for finite population mean in stratified random sampling. Axioms 2023, 12, 713.
9. Khan, M. Improvement in estimating the finite population mean under maximum and minimum values in double sampling scheme. J. Stat. Appl. Probab. Lett. 2015, 2, 115–121.
10. Särndal, C.E. Sampling survey theory vs. general statistical theory: Estimation of the population mean. Int. Stat. Inst. 1972, 40, 1–12.
11. Dunder, E.; Zaman, T.; Cengiz, M.; Alakus, K. Implementation of adaptive lasso regression based on multiple Theil-Sen Estimators using differential evolution algorithm with heavy tailed errors. J. Natl. Sci. Found. Sri Lanka 2022, 50, 395–404.
12. Huber, P.J. Robust estimation of a location parameter. In Breakthroughs in Statistics: Methodology and Distribution; Springer: New York, NY, USA, 1992; pp. 492–518.
13. Beaton, A.E.; Tukey, J.W. The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics 1974, 16, 147–185.
14. Qadir, M.F. Robust method for detection of single and multiple outliers. Sci. Khyber 1996, 9, 135–144.
15. Alamgir, A.A.; Khan, S.A.; Khan, D.M.; Khalil, U. A new efficient redescending M-estimator: Alamgir redescending M-estimator. Res. J. Recent Sci. 2013, 2, 79–91.
16. Khalil, U.; Ali, A.; Khan, D.M.; Khan, S.A.; Qadir, F. Efficient UK’s Re-Descending M-Estimator for Robust Regression. Pak. J. Stat. 2016, 32, 125–138.
17. Noor-Ul-Amin, M.; Asghar, S.U.D.; Sanaullah, A.; Shehzad, M.A. Redescending M-estimator for robust regression. J. Reliab. Stat. Stud. 2018, 11, 69–80.
18. Anekwe, S.; Onyeagu, S. The Redescending M estimator for detection and deletion of outliers in regression analysis. Pak. J. Stat. Oper. Res. 2021, 17, 997–1014.
19. Luo, R.; Chen, Y.; Song, S. On the M-estimator under third moment condition. Mathematics 2022, 10, 1713.
20. Khan, D.M.; Ali, M.; Ahmad, Z.; Manzoor, S.; Hussain, S. A New Efficient Redescending M-Estimator for Robust Fitting of Linear Regression Models in the Presence of Outliers. Math. Probl. Eng. 2021, 2021, 3090537.
21. Raza, A.; Noor-ul-Amin, M.; Ayari-Akkari, A.; Nabi, M.; Usman Aslam, M. A redescending M-estimator approach for outlier-resilient modeling. Sci. Rep. 2024, 14, 7131.
22. Raza, A.; Talib, M.; Noor-ul-Amin, M.; Gunaime, N.; Boukhris, I.; Nabi, M. Enhancing performance in the presence of outliers with redescending M-estimators. Sci. Rep. 2024, 14, 13529.
23. Cochran, W.G. Sampling Techniques; John Wiley and Sons: New York, NY, USA, 1977.
24. Cetin, A.E.; Koyuncu, N. Calibration estimator of population mean in stratified extreme ranked set sampling with simulation study. Filomat 2024, 38, 599–608.
25. Daraz, U.; Agustiana, D.; Wu, J.; Emam, W. Twofold Auxiliary Information Under Two-Phase Sampling: An Improved Family of Double-Transformed Variance Estimators. Axioms 2025, 14, 64.
26. Kadilar, C.; Cingi, H. Ratio estimators in simple random sampling. Appl. Math. Comput. 2004, 151, 893–902.
27. Kadılar, C.; Cingi, H. Ratio estimators using robust regression. Hacet. J. Math. Stat. 2007, 36, 181–188.
28. Zaman, T.; Iftikhar, S.; Sozen, C.; Sharma, P. A new logarithmic type estimators for analysis of number of aftershocks using poisson distribution. J. Sci. Arts 2024, 24, 833–842.
29. Mukhtar, M.; Ali, N.; Shahzad, U. An improved regression type mean estimator using redescending M-estimator. Univ. Wah J. Sci. Technol. (UWJST) 2023, 7, 11–18.
30. Singh, S. Advanced Sampling Theory with Applications: How Michael Selected Amy; Springer Science and Business Media: Berlin, Germany, 2003; Volume 2.
31. Albalawi, O. Estimation techniques utilizing dual auxiliary variables in stratified two-phase sampling. AIMS Math. 2024, 9, 33139–33160.
Figure 1. Scatter plot for Population-1.
Figure 2. Scatter plot for Population-2.
Figure 3. Scatter plot for Population-3.
Figure 4. Scatter plot for Population-4.
Figure 5. Scatter plot for Population-5.
Figure 6. Scatter plot for Population-6.
Figure 7. MSE comparison plot for Population-1.
Figure 8. MSE comparison plot for Population-2.
Figure 9. MSE comparison plot for Population-3.
Figure 10. MSE comparison plot for Population-4.
Figure 11. MSE comparison plot for Population-5.
Figure 12. MSE comparison plot for Population-6.
Table 1. Characteristics of real populations.

                 Population-1    Population-2    Population-3
N                50              50              50
n                5               5               5
ρ                0.322116        0.6083027       0.8038341
S_y²             3762.612        3762.612        342,021.5
S_x²             21,029.76       415,388.3       1,176,526
S_xy             2865.327        24,048.7        509,910.4
Y_d              62              102             1341.85
X_d              587             2441            3928.499
β_Amin2018       0.4180495       0.06112019      0.5130374
β_Khan2021       0.4134706       0.0616121       0.508356
β_Anekwe2021     0.4181558       0.06113404      0.513046
β_Raza2024a      0.4136515       0.06112671      0.5129946
β_Raza2024b      0.4195804       0.06104883      0.5130457
Table 2. MSE using Populations 1, 2, 3.

Estimators    Population-1    Population-2    Population-3
ȳ_q1          2539.205        733.8845        129,181.7
ȳ_q2          2514.517        738.6179        127,774.3
ȳ_q3          2539.78         734.0173        129,184.3
ȳ_q4          2515.49         733.9470        129,168.7
ȳ_q5          2547.494        733.2008        129,184.2
ȳ_p1          845.8177        423.3454        22,293.97
ȳ_p2          837.9259        423.3900        22,185.60
ȳ_p3          846.0024        423.3462        22,294.18
ȳ_p4          838.2352        423.3458        22,292.95
ȳ_p5          848.4857        423.3415        22,294.17
Table 3. MSE using Populations 4, 5, 6.

Estimators    Population-4    Population-5    Population-6
ȳ_q1          374.0930        108.4700        259.7108
ȳ_q2          352.7311        104.7184        244.8735
ȳ_q3          379.1168        108.6607        259.6915
ȳ_q4          376.5448        108.5785        259.0301
ȳ_q5          377.7044        108.8168        260.1414
ȳ_p1          44.87702        44.66987        44.71908
ȳ_p2          44.57640        44.56672        44.60273
ȳ_p3          45.04153        44.67750        44.71865
ȳ_p4          44.95300        44.67419        44.70433
ȳ_p5          44.99181        44.68391        44.72886
Table 4. PRE using Population-1.

          ȳ_q1        ȳ_q2        ȳ_q3        ȳ_q4        ȳ_q5
ȳ_p1      300.2071    297.2883    300.2751    297.4033    301.1871
ȳ_p2      303.0346    300.0883    303.1031    300.2043    304.0238
ȳ_p3      300.1416    297.2234    300.2095    297.3384    301.1214
ȳ_p4      302.9227    299.9775    302.9913    300.0936    303.9116
ȳ_p5      299.2631    296.3535    299.3309    296.4681    300.2401
Table 5. PRE using Population-2.

          ȳ_q1        ȳ_q2        ȳ_q3        ȳ_q4        ȳ_q5
ȳ_p1      173.3536    173.3354    173.3533    173.3534    173.3552
ȳ_p2      174.4717    174.4533    174.4713    174.4715    174.4733
ȳ_p3      173.3850    173.3667    173.3846    173.3848    173.3866
ȳ_p4      173.3684    173.3501    173.3680    173.3682    173.3700
ȳ_p5      173.1921    173.1739    173.1918    173.1919    173.1937
Table 6. PRE using Population-3.

          ȳ_q1        ȳ_q2        ȳ_q3        ȳ_q4        ȳ_q5
ȳ_p1      579.4466    582.2771    579.4412    579.4733    579.4414
ȳ_p2      573.1337    575.9333    573.1284    573.1601    573.1286
ȳ_p3      579.4582    582.2888    579.4529    579.4850    579.4530
ȳ_p4      579.3886    582.2188    579.3832    579.4153    579.3834
ȳ_p5      579.4579    582.2884    579.4525    579.4846    579.4527
Table 7. PRE using Population-4.

          ȳ_q1        ȳ_q2        ȳ_q3        ȳ_q4        ȳ_q5
ȳ_p1      833.5961    839.2177    830.5514    832.1870    831.4693
ȳ_p2      785.9950    791.2956    783.1241    784.6664    783.9896
ȳ_p3      844.7906    850.4877    841.7050    843.3626    842.6352
ȳ_p4      839.0593    844.7178    835.9947    837.6410    836.9186
ȳ_p5      841.6434    847.3193    838.5693    840.2207    839.4960
Table 8. PRE using Population-5.

          ȳ_q1        ȳ_q2        ȳ_q3        ȳ_q4        ȳ_q5
ȳ_p1      242.8258    243.3879    242.7844    242.8024    242.7495
ȳ_p2      234.4274    234.9700    234.3874    234.4048    234.3538
ȳ_p3      243.2528    243.8159    243.2113    243.2293    243.1764
ȳ_p4      243.0687    243.6313    243.0272    243.0453    242.9923
ȳ_p5      243.6022    244.1661    243.5606    243.5787    243.5257
Table 9. PRE using Population-6.

          ȳ_q1        ȳ_q2        ȳ_q3        ȳ_q4        ȳ_q5
ȳ_p1      580.7607    582.2756    580.7663    580.9522    580.6337
ȳ_p2      547.5817    549.0101    547.5870    547.7623    547.4620
ȳ_p3      580.7174    582.2322    580.7230    580.9090    580.5905
ȳ_p4      579.2384    580.7494    579.2440    579.4295    579.1118
ȳ_p5      581.7235    583.2409    581.7291    581.9154    581.5963