1. Introduction
In multifaceted survey environments, supplementary information gathered through a national census, remote sensing network, or environmental inventory can be of great value at both the design and estimation stages of a survey. It is common to use these external data sources to formulate efficient estimators of major population parameters such as the total or mean. Conventional estimation methods rest on the assumption that the study variable has a functional relationship, usually linear, with the auxiliary variables. Such model-based estimators require the underlying model structure to be specified in advance, which is a difficult task when there are more than a few variables or when the population has a complex structure (Opsomer et al. [1] and Wu and Sitter [2]). This has led to a shift towards nonparametric methods, which are more flexible because they do not presume strongly defined functional forms. The initial work of Dorfman [3] and subsequently of Dorfman and Hall [4] established the principles of incorporating nonparametric models into survey estimation, enabling more flexible approaches that can capture complex relationships in diverse populations.
Compared to parametric methods, nonparametric inference is less sensitive to sampling designs and model assumptions (Nadaraya [5]). The theoretical literature develops efficient estimators within two major frameworks: the design-driven framework, which relies only on the randomisation of the sampling design, and the model-driven framework, which treats the finite population as arising from a superpopulation model. The latter allows prediction for non-sampled units when a relationship is assumed between the survey and auxiliary variables (Dorfman and Hall [4]). An early step in this direction was the work of Nadaraya [5], which introduced Local Polynomial Regression (LPR) as a versatile nonparametric alternative to classical parametric regression estimators. Numerical experiments show that LPR-based estimators attain the lowest MSE (the highest PRE) compared with parametric estimators, particularly in skewed and outlier-contaminated conditions. On this basis, Rueda and Sanchez-Borrego (RSB) [6] applied LPR techniques to probability sampling situations, further verifying their usefulness in model-based predictive contexts. Recent developments in robust regression and nonparametric learning, with a focus on distribution-resistant modelling such as functional-coefficient and quantile-based regression [7], neural-network-driven robustness [8], and adaptive kernel smoothing methods [9,10], highlight the increasing importance of flexible and outlier-resistant estimation strategies for complex data environments.
Local polynomial kernel regression is a versatile tool that can handle both continuous and discrete data, although its usefulness depends heavily on the characteristics of the response variable. In the continuous case, the technique excels because it fits a localised polynomial at each target point, with nearby observations receiving heavier kernel weights. This localised construction provides adaptive smoothing that makes few assumptions about the global shape of the underlying function. For discrete or categorical data, the method must be modified: binary or multiclass outcomes are modelled with local logistic regression, and count data are generally modelled with local versions of generalised linear models (GLMs) with an appropriate link function, such as the Poisson or negative binomial. In such situations, choosing a suitable kernel bandwidth is particularly important, since sparse or unevenly distributed data can have a considerable influence on model stability and predictive performance.
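To make the localised construction concrete, the following is a minimal sketch (not the authors' implementation) of a local polynomial fit with a Gaussian kernel on synthetic data; the bandwidth `h = 0.4` and the noisy sine curve are illustrative assumptions.

```python
import numpy as np

def local_poly_fit(x0, x, y, h, degree=1):
    """Local polynomial estimate of m(x0) with a Gaussian kernel.

    Fits a weighted polynomial in (x - x0); the intercept of the
    local fit is the estimate of the regression function at x0."""
    u = (x - x0) / h
    weights = np.exp(-0.5 * u**2)              # Gaussian kernel weights
    # Design matrix of centred polynomial terms: [1, (x-x0), (x-x0)^2, ...]
    X = np.vander(x - x0, degree + 1, increasing=True)
    W = np.diag(weights)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]                             # intercept = m_hat(x0)

# Toy data: noisy sine curve
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 2 * np.pi, 200))
y = np.sin(x) + rng.normal(0, 0.2, 200)

m_hat = local_poly_fit(np.pi / 2, x, y, h=0.4)  # true value is sin(pi/2) = 1
```

Shrinking `h` lets the fit track sharper local features at the cost of higher variance, which is the bandwidth trade-off discussed above.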
The arithmetic mean remains one of the most prevalent statistical measures, used as a fundamental data summary in fields ranging from the sciences and social sciences to the arts (Zaman [11]). Because it is interpretable and broadly applicable, accurate estimation of the population mean is of critical importance in survey sampling and in a variety of applications (Subzar et al. [12]; Kumar and Siddiqui [13]). In-depth descriptions of mean estimation methods are available in Shahzad et al. [14] and Koc and Koc [15]. Given its central importance, there is an ongoing need for more efficient and reliable ways of estimating the mean. This has motivated the increased use of model-based methods, which incorporate auxiliary information and flexible modelling structures to improve the accuracy of mean estimates.
In this paper, we discuss the literature on model-based nonparametric mean estimation methods, which are widely applied to estimate population parameters under complex sampling designs. In model-based estimation, the relationship between the dependent and independent variables is modelled explicitly, which allows prediction for non-sampled units; the structure of the underlying model forms the basis for parameter determination (Srivastava [16]). Subject to specific assumptions, these models enable the imputation of unobserved values at both the micro and the macro level. When a sampling design has been used to gather the data, this design can be incorporated into the estimation process, just as in design-based methods. RSB [6] formulated an LPR-assisted estimator under simple random sampling and showed that it possesses a number of desirable properties within the model-based paradigm. Such kernel-based and nonparametric predictive estimators are, however, susceptible to outliers, a major drawback, since outliers are common in real-world applications such as environmental or meteorological measurements owing to sensor errors, data anomalies, or extreme events. As a solution, we propose a new predictive mean estimator that combines the robustness of resistant regression methods, which prevent outliers from dominating the fit, with the flexibility of kernel regression, which offers local smoothing without assuming a global parametric form. This hybrid methodology improves the accuracy and consistency of central-tendency estimation in the presence of contaminated data and is thus especially appropriate for real-life situations where data-quality problems are widespread.
With many predictor variables, LPR may be generalised to a multivariate framework, often called multiple local polynomial regression (MLPR), in which a local polynomial surface, rather than a curve, is fitted. Although this generalisation allows complex multidimensional relationships to be modelled, it suffers from the curse of dimensionality: data become sparse in higher-dimensional space, which may cause instability and reduce the reliability of the estimates. In addition, an appropriate bandwidth must be chosen for each covariate; poor choices can over-smooth important signals or under-smooth noisy predictors. These problems are largely alleviated when the model contains a single predictor variable. The relative density of data in one-dimensional space permits more effective and consistent smoothing, makes bandwidth selection more convenient, reduces the likelihood of variance inflation, and makes the regression more robust overall.
Although there is a rich body of literature on both model-based and calibration-type mean estimation techniques, to our knowledge no prior research has constructed calibrated predictive mean estimators under stratified random sampling that apply (i) robust regression to sampled units and (ii) local polynomial regression to non-sampled units, while incorporating dual calibration constraints on auxiliary means and coefficients of variation. Historically, calibration estimators enhance accuracy by using constraints to match sample estimates of auxiliary information to known population characteristics. This paper builds on that research by proposing a new hybrid estimator in which the sampled subset of the population is handled with outlier-resistant regression methods, which remain stable under outlier-induced distortions, and the non-sampled subset is predicted with kernel regression, which provides flexible, data-driven smoothing. Robust regression contributes to the accuracy and stability of the overall predictive mean estimator, especially when the data are contaminated. To operationalise this hybrid process, we adopt a model-based methodology and apply an LPR estimator to the non-sampled units. Kernel bandwidths must also be selected carefully, since they have a pronounced effect on the quality of kernel-based estimators. The resulting procedure not only bridges two effective nonparametric tools but also provides a practical and robust alternative to traditional estimators in stratified sampling designs. The use of a Gaussian kernel function further guarantees stable and smooth estimation behaviour across a variety of data conditions.
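As a hedged illustration of the outlier-resistant ingredient, the sketch below fits a simple linear model by Huber M-estimation via iteratively reweighted least squares (IRLS); the tuning constant `k = 1.345`, the synthetic data, and the injected outliers are assumptions for demonstration only, not the paper's specification.

```python
import numpy as np

def huber_irls(x, y, k=1.345, n_iter=50):
    """Huber M-estimate of a simple linear fit via iteratively
    reweighted least squares (IRLS); large residuals are downweighted."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # OLS starting values
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust scale (MAD)
        u = np.abs(r) / s
        w = np.where(u <= k, 1.0, k / u)           # Huber weights
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, X.T @ (w * y))
    return beta

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 0.3, 100)
y[:5] += 25.0                                      # inject gross outliers

beta_robust = huber_irls(x, y)                     # close to (2.0, 0.5)
```

Ordinary least squares on the same contaminated data would pull the intercept towards the outliers, whereas the Huber weights cap their influence.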
The accurate measurement and assessment of natural resources, in particular aquaculture and fisheries, have received considerable attention in recent years owing to the need for sustainable management and information-based decision-making. Estimates of the average values of biological parameters such as fish length, weight, and body shape are significant for stock assessment and for economic and operational planning in the fishery industry. In this paper, the performance of predictive mean estimators is compared on a real-life fish-market dataset with diverse morphological characteristics. A simulated dataset on solar ultraviolet (UV) radiation, covering important environmental variables and UV risk categories, is also considered; estimating average environmental UV exposure is necessary for environmental risk assessment and for developing adaptive strategies in aquatic ecosystems. Together, the datasets allow us to investigate powerful, model-oriented mean estimation methods that combine biological and environmental data. The study thereby highlights the value of estimating means through advanced survey sampling and predictive models in complex natural environments, with applications in fishery management, environmental surveillance, and natural resource planning.
The theoretical and methodological foundation for the new class of predictive estimators in StRS is laid out in Section 2, Section 3, Section 4 and Section 5 of this article. Section 2 reviews the existing kernel regression model in the stratified sampling case and briefly describes the LPR estimator as a nonparametric counterpart of the classical linear regression estimator; it also explains how the estimator depends on the smoothing parameters and auxiliary variables, and how it is sensitive to bandwidth selection and to contaminated data. Section 3 addresses the incorporation of robust regression techniques into the nonparametric estimation framework; this combination yields a more resistant kernel regression estimator that suppresses the impact of outliers and heteroscedastic noise, particularly in real-world stratified data. Section 3 further introduces two calibrated forms of the adaptive predictive estimators, which make more efficient use of auxiliary information. Section 4 presents a detailed numerical analysis of artificial and natural populations designed to simulate realistic sampling situations with outliers under a stratified design; the results are evaluated in terms of PRE using three bandwidth selection techniques: fixed, data-driven plug-in (dpik), and biased cross-validation (bcv). Finally, Section 5 delivers the conclusions.
2. Fundamental Estimators
Alshanbari and Anas [17], Alomair et al. [18], and RSB [6] propose a model-driven technique in which the finite population is assumed to be satisfactorily characterised by a predictive model, denoted as $\xi$, such that

$$ y_i = m(w_i) + e_i. $$

In stratified random sampling (StRS), the predictive model is generalised to every stratum $h$ and is written as

$$ y_{hi} = m_h(w_{hi}) + e_{hi}, $$

where the $e_{hi}$ represent independent, identically distributed random errors with zero mean, $E_m(e_{hi}) = 0$, and constant variance $\sigma_h^2$. Further, $m_h(\cdot)$ is a smooth and unknown function of the supplementary variable $w$, and $E_m$ denotes expectation under the model $\xi$.
After a sample is selected, the average of the population in stratum $h$, denoted by $\bar{Y}_h$, can be written as

$$ \bar{Y}_h = f_h \bar{y}_h + (1 - f_h)\,\bar{y}_{\bar{s}_h}. \tag{1} $$

In Equation (1), $\bar{y}_h = n_h^{-1} \sum_{i \in s_h} y_{hi}$ represents the mean of the sampled units $s_h$, and $\bar{y}_{\bar{s}_h} = (N_h - n_h)^{-1} \sum_{j \in \bar{s}_h} y_{hj}$ is the mean of the non-sampled units $\bar{s}_h$. The population and sample counts in the stratum are $N_h$ and $n_h$, respectively, and the sampling fraction is $f_h = n_h / N_h$. $N = \sum_h N_h$ is the total number of elements across the strata.
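The stratum-mean decomposition above is an exact identity, which the following toy numerical check confirms; the gamma-distributed stratum values and the sizes $N_h = 500$, $n_h = 80$ are arbitrary illustrative choices.

```python
import numpy as np

# Numerical check of the stratum-mean decomposition
#   Ybar_h = f_h * (sample mean) + (1 - f_h) * (non-sampled mean)
rng = np.random.default_rng(2)
N_h, n_h = 500, 80                               # toy stratum and sample sizes
y = rng.gamma(shape=2.0, scale=3.0, size=N_h)    # stratum values

sample_idx = rng.choice(N_h, size=n_h, replace=False)
mask = np.zeros(N_h, dtype=bool)
mask[sample_idx] = True

f_h = n_h / N_h                                  # sampling fraction
ybar_s = y[mask].mean()                          # mean of sampled units
ybar_r = y[~mask].mean()                         # mean of non-sampled units

reconstructed = f_h * ybar_s + (1 - f_h) * ybar_r
# equals the full stratum mean y.mean() up to floating-point rounding
```

Only the second term involves unobserved units, which is exactly the component the predictive estimators below must supply.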
It should be noted that the first term of Equation (1) can be calculated directly from the sample. Consequently, the task is to estimate the unknown component $\bar{y}_{\bar{s}_h}$, which refers to the non-sampled units. If the auxiliary variable $w$ were observed for all units, prediction via the regression model would be straightforward: $m_h(w_{hj})$ serves as a proxy for the unobservable $y_{hj}$, since $E_m(y_{hj}) = m_h(w_{hj})$. In a real-world context, however, the true $m_h$ is not known. In response, nonparametric kernel regression methods are used to obtain predictions $\hat{m}_h(w_{hj})$ at each non-sampled point $w_{hj}$, as illustrated by Chambers et al. [19]. Since then, this method has been adapted and generalised by a number of researchers, such as RSB [6], to enhance predictive estimation under more complicated sampling designs.
2.1. Rueda and Sanchez-Borrego Estimator
Based on the fundamental contributions of RSB [6], the traditional model-driven estimator under stratified random sampling (StRS) for the $h$-th stratum takes the form

$$ \hat{\bar{Y}}_h = f_h \bar{y}_h + (1 - f_h) \frac{1}{N_h - n_h} \sum_{j \in \bar{s}_h} \hat{m}_h(w_{hj}). \tag{2} $$

The aggregated estimator $\hat{\bar{Y}}$ of the entire population, combining all strata, is expressed as

$$ \hat{\bar{Y}} = \sum_h \Psi_h \hat{\bar{Y}}_h, $$

where $\Psi_h = N_h / N$ is the weight of a single stratum, with $\sum_h \Psi_h = 1$ over all the strata being considered.
It is worth noting that $\hat{m}_h$, obtained through LPR, is a generalisation of the classical linear regression fit and may be employed across diverse forms of modelling. Following the approach of Ref. [20] and its methodology, RSB [6] utilised a kernel-based $q$-th order LPR estimator to compute the study variable. The kernel function takes the form $K_h(u) = h^{-1} K(u/h)$, in which $K$ is generally adopted as a Gaussian-shaped kernel and $h$ represents the window-width (bandwidth) parameter. For a broader picture of recent developments in kernel-based approaches, readers may turn to Refs. [18,21,22].
Accordingly, the predicted value $\hat{m}_h(w_{hj})$ for a non-sampled unit $j$ is calculated using

$$ \hat{m}_h(w_{hj}) = e_1^{\top} \left( W_{hj}^{\top} K_{hj} W_{hj} \right)^{-1} W_{hj}^{\top} K_{hj}\, y_h, $$

where $e_1 = (1, 0, \ldots, 0)^{\top}$ is a unit vector of length $q + 1$, $y_h$ represents the vector of observed responses in stratum $h$, $K_{hj}$ is the diagonal weight matrix formed using the kernel function, and $W_{hj}$ is the design matrix constructed from the local polynomial terms.
Alshanbari and Anas [17] observed that, under stratified sampling, the base estimator $\hat{\bar{Y}}$ can be improved upon using calibration techniques.
2.2. Alshanbari and Anas Estimator
The inclusion of auxiliary data in estimation processes is commonly accepted as a favourable means of improving the accuracy of mean estimators. The common assumption, as defined in Shahzad et al. [23] and Zaman [24], is that a meaningful association exists between the principal study variable $Y$ and a corresponding auxiliary variable $W$. A solid example is the positive correlation between education and income, in which education is generally considered a causal variable that affects income; several socioeconomic studies have confirmed this association (Leesch and Skopek [25]). Likewise, in the health sciences, a considerable body of empirical evidence shows the positive impact of physical activity on cardiovascular health: according to Kaiser and Oswald [26], more active people tend to have healthier hearts on average. These examples demonstrate that, used properly, auxiliary variables help to refine the mean estimate and increase the trustworthiness of survey outcomes.
Calibration estimation is generally recognised as an effective method of adjusting survey weights by minimising an appropriate distance function; it is also used to incorporate auxiliary or additional information. Many scholars have noted the importance of carrying out calibration within strata in order to maximise the efficiency of population parameter estimates. Construction of the calibration weights generally involves two basic choices: selection of an appropriate distance measure and specification of the calibration constraints. When well matched with the auxiliary variables, these constraints can greatly enhance the accuracy of the estimates of the main study variable. This method was expanded by Refs. [27,28], which introduced several calibration conditions into the general survey sampling paradigm, a concept also examined by Refs. [29,30,31,32]. Despite these developments, few efforts have focused on calibrated mean estimators in the context of stratified random sampling (StRS) within the model-based framework. The study of Alshanbari and Anas [17] is one of them: it presents a new, calibration-supported, model-driven mean estimator under StRS, exploiting the flexibility of calibrated kernel-oriented nonparametric regression techniques.
In the StRS scheme, $N$ and $n$ denote the population size and total sample size, respectively. Let $(\bar{w}_h, \bar{W}_h)$ denote the sample and population averages, and $(cv_{w_h}, CV_{W_h})$ the sample and population coefficients of variation (CVs), of the supplementary variable $W$ in the $h$-th stratum. Likewise, $(\Psi_h, \Psi_h^{*})$ represent the usual stratified weights and their calibrated versions. Based on the foregoing definitions, a random sample $s_h$ of size $n_h$ is selected from a stratum containing $N_h$ population units, with $n_h < N_h$. Under these conditions, the calibrated estimator introduced by Alshanbari and Anas [17] is given as

$$ \hat{\bar{Y}}_{cal} = \sum_h \Psi_h^{*} \hat{\bar{Y}}_h, \tag{3} $$

subject to the constraints

$$ \sum_h \Psi_h^{*} = \sum_h \Psi_h, \tag{4} $$
$$ \sum_h \Psi_h^{*} \bar{w}_h = \sum_h \Psi_h \bar{W}_h, \tag{5} $$
$$ \sum_h \Psi_h^{*} cv_{w_h} = \sum_h \Psi_h CV_{W_h}. \tag{6} $$

The motivation for incorporating loss functions within the calibration framework, as discussed by Ref. [27], is to improve the accuracy of parameter estimation by adjusting the weights of the sampled units. This optimisation reduces a given distance measure, usually between the original design weights and the calibrated weights, subject to a known set of calibration constraints. To operationalise this process, we build a Lagrange-type function (LF) by adding the constraint multipliers $\lambda_1$, $\lambda_2$, $\lambda_3$ to a chi-square-based loss function:

$$ LF = \sum_h \frac{(\Psi_h^{*} - \Psi_h)^2}{\Psi_h} - 2\lambda_1 \Big( \sum_h \Psi_h^{*} - \sum_h \Psi_h \Big) - 2\lambda_2 \Big( \sum_h \Psi_h^{*} \bar{w}_h - \sum_h \Psi_h \bar{W}_h \Big) - 2\lambda_3 \Big( \sum_h \Psi_h^{*} cv_{w_h} - \sum_h \Psi_h CV_{W_h} \Big). \tag{7} $$
Computing $\partial LF / \partial \Psi_h^{*}$ and setting it to zero gives

$$ \Psi_h^{*} = \Psi_h \left( 1 + \lambda_1 + \lambda_2 \bar{w}_h + \lambda_3\, cv_{w_h} \right). \tag{8} $$

Calibrated weights have a number of desirable characteristics: they can reduce bias, minimise variance, and remain coherent with known auxiliary information. The main goal in constructing such weights is to align suitably weighted averages of the supplementary information in the sample with their established population counterparts, thereby increasing the quality of survey estimates. It should be noted, however, that calibrated weights cannot be assumed to be strictly positive. Negative weights may occur, especially when large differences arise between the sample and population distributions, or when certain types of distance function are used in the calibration. The chi-square distance is one distance function that is quite effective in alleviating the occurrence of negative weights, because it penalises large deviations relative to the starting weights, discouraging adjustments that move too far from them. Consequently, the chi-square distance leads to more consistent calibration, providing better balance and minimising the likelihood of extreme or negative weights.
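The chi-square calibration step can be sketched numerically: substituting the weight form $\Psi_h^{*} = \Psi_h(1 + \lambda_1 + \lambda_2 \bar{w}_h + \lambda_3\, cv_{w_h})$ into the three constraints yields a 3×3 linear system in the multipliers. The stratum summaries below are invented toy numbers, not data from the paper.

```python
import numpy as np

# Illustrative chi-square calibration for four strata (toy numbers).
Psi  = np.array([0.30, 0.25, 0.25, 0.20])   # design weights, sum to 1
wbar = np.array([4.8, 6.1, 5.2, 7.4])       # sample means of W per stratum
Wbar = np.array([5.0, 6.0, 5.5, 7.0])       # known population means
cv_s = np.array([0.22, 0.31, 0.27, 0.35])   # sample CVs of W
CV_p = np.array([0.20, 0.30, 0.28, 0.33])   # known population CVs

# 3x3 system in (l1, l2, l3) obtained from the three calibration constraints
A = np.array([
    [Psi.sum(),          (Psi * wbar).sum(),        (Psi * cv_s).sum()],
    [(Psi * wbar).sum(), (Psi * wbar**2).sum(),     (Psi * wbar * cv_s).sum()],
    [(Psi * cv_s).sum(), (Psi * wbar * cv_s).sum(), (Psi * cv_s**2).sum()],
])
b = np.array([0.0,
              (Psi * (Wbar - wbar)).sum(),
              (Psi * (CV_p - cv_s)).sum()])
lam = np.linalg.solve(A, b)

# Calibrated weights reproduce all three constraints exactly
Psi_star = Psi * (1 + lam[0] + lam[1] * wbar + lam[2] * cv_s)
```

With the chi-square distance, the adjustment is linear in the auxiliary summaries, which is why the multipliers admit this closed-form solution.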
Substituting (8) into (4), (5), and (6), respectively, yields a system of three linear equations in the multipliers:

$$ \lambda_1 \sum_h \Psi_h + \lambda_2 \sum_h \Psi_h \bar{w}_h + \lambda_3 \sum_h \Psi_h cv_{w_h} = 0, $$
$$ \lambda_1 \sum_h \Psi_h \bar{w}_h + \lambda_2 \sum_h \Psi_h \bar{w}_h^2 + \lambda_3 \sum_h \Psi_h \bar{w}_h cv_{w_h} = \sum_h \Psi_h \left( \bar{W}_h - \bar{w}_h \right), $$
$$ \lambda_1 \sum_h \Psi_h cv_{w_h} + \lambda_2 \sum_h \Psi_h \bar{w}_h cv_{w_h} + \lambda_3 \sum_h \Psi_h cv_{w_h}^2 = \sum_h \Psi_h \left( CV_{W_h} - cv_{w_h} \right). \tag{9} $$

Solving Equation (9) gives the values of $\lambda_1$, $\lambda_2$, and $\lambda_3$; the intermediate quantities are provided in Appendix A. Substituting these values into (8) and then into (3), we obtain

$$ \hat{\bar{Y}}_{cal} = \sum_h \Psi_h \left( 1 + \hat{\lambda}_1 + \hat{\lambda}_2 \bar{w}_h + \hat{\lambda}_3\, cv_{w_h} \right) \hat{\bar{Y}}_h. \tag{10} $$
It is worth noting that the calibrated estimator $\hat{\bar{Y}}_{cal}$ may be extended to a more general form by choosing different calibration constraints; to keep the exposition focused, we retain the constraints given above. The formulation is, however, adjustable through the inclusion of various known population characteristics of the auxiliary variable, which permits a range of functional forms. For more on such generalisations, readers may consult Refs. [33,34,35].