Article

A Refined Regression Estimator for General Inverse Adaptive Cluster Sampling

by Nipaporn Chutiman 1, Supawadee Wichitchan 1, Chawalit Boonpok 1, Monchaya Chiangpradit 1 and Pannarat Guayjarernpanishk 2,*
1 Department of Mathematics, Faculty of Science, Mahasarakham University, Maha Sarakham 44150, Thailand
2 Department of Mathematics, Faculty of Science, Khon Kaen University, Khon Kaen 40002, Thailand
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(23), 3751; https://doi.org/10.3390/math13233751
Submission received: 28 October 2025 / Revised: 19 November 2025 / Accepted: 21 November 2025 / Published: 22 November 2025
(This article belongs to the Special Issue New Advances in Computational Statistics and Extreme Value Theory)

Abstract

Adaptive cluster sampling (ACS) is a sampling technique commonly used for rare populations that exhibit spatial clustering. However, the initially selected sample units may not always satisfy the specified inclusion condition. To address these limitations, general inverse sampling has been incorporated into ACS, in which the initial units are sequentially selected, and a termination criterion is applied to control the number of rare elements drawn from the population. The objective of this study is to develop an estimator of the population mean that incorporates auxiliary information within the framework of general inverse adaptive cluster sampling (GI-ACS). The proposed estimator is constructed based on a regression-type estimator and analytically examined. A simulation study was conducted to validate the theoretical findings. Three scenarios were considered, representing low, moderate, and high correlations between the variable of interest and the auxiliary variable. The simulation results indicate that the proposed estimator achieves lower variance than the GI-ACS estimator that does not utilize auxiliary information across all examined correlation scenarios. Therefore, the proposed estimator is more efficient and preferable when auxiliary variables are available.

1. Introduction

Adaptive cluster sampling (ACS), introduced by Thompson [1], is an effective technique for studying populations that are rare and spatially clustered. Initially, a set of sample units is selected using simple random sampling without replacement (SRSWOR). Whenever a selected unit satisfies a predefined condition C based on the variable of interest, all neighboring units are included. This procedure is applied iteratively such that any newly added unit meeting condition C results in the inclusion of its adjacent units, and the process continues until no further units meet the condition. If an initial unit does not satisfy C, no additional units are included, and the cluster size equals one. The set of all initial and associated neighboring units that satisfy C forms a network.
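The expansion rule described above amounts to a breadth-first search over grid neighbours. The following sketch illustrates it under assumed conventions (a rectangular grid of counts, 4-unit neighbourhoods, and the condition y > 0, matching the simulation settings used later in the paper); it is an illustration, not the authors' code:

```python
from collections import deque

def acs_network(grid, start, condition=lambda y: y > 0):
    """Expand an ACS network from an initial unit on a rectangular grid.

    A unit satisfying the condition pulls in its 4 neighbours; any
    neighbour that also satisfies the condition is expanded in turn.
    A non-qualifying initial unit forms a network of size one.
    """
    rows, cols = len(grid), len(grid[0])
    if not condition(grid[start[0]][start[1]]):
        return {start}                       # singleton network
    network, frontier = {start}, deque([start])
    while frontier:
        i, j = frontier.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols and (ni, nj) not in network:
                if condition(grid[ni][nj]):  # qualifying neighbours join and expand
                    network.add((ni, nj))
                    frontier.append((ni, nj))
    return network
```

Edge units (adjacent units that fail the condition) are observed during expansion but, as in the text, only qualifying units form the network.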
ACS has been widely applied in the study of rare and clustered populations, including forest ecosystem monitoring [2], herpetofauna surveys in tropical rainforests [3], assessments of sea lamprey larvae [4], and freshwater mussel investigations [5]. It has also been applied in hydroacoustic research [6] and studies related to the COVID-19 pandemic [7,8,9]. Beyond ecological and epidemiological contexts, ACS has recently extended to emerging areas such as autonomous systems and IoT-based applications [10,11].
In many ACS studies, estimation efficiency has been enhanced by incorporating auxiliary variables correlated with the variable of interest. A ratio-type estimator was first proposed in [12], followed by several modifications [13] and extensions to two auxiliary variables [14]. Other developments include ratio-type estimators based on known population parameters—such as the coefficient of variation, kurtosis, and skewness—of the auxiliary variable [15,16], generalized exponential-type estimators [17,18], and generalized ratio-type classes [19]. Beyond advances in estimator development for ACS, significant progress has also been achieved in variance estimation under adaptive cluster sampling [20], with these developments further extending to stratified ACS frameworks [21].
However, in ACS, the initial sample size is predetermined, and it is possible that not all randomly selected units satisfy condition C. To address this issue, inverse ACS was proposed in [22], in which units are sampled sequentially until a specified number of rare elements has been observed. When the number of qualifying units is very small, this procedure may fail to terminate. General inverse sampling, introduced in [23], addresses this limitation by imposing a cap on the final sample size while maintaining control over the number of rare elements obtained. This approach has since been integrated with ACS, forming the general inverse ACS (GI-ACS) design. By imposing an upper bound on the total number of sampled units and simultaneously ensuring that a prescribed number of qualifying units is obtained, GI-ACS provides a mechanism to control the final sample size while maintaining ACS’s capacity to efficiently detect rare and spatially aggregated units. These features make GI-ACS particularly suitable for ecological abundance estimation, environmental surveillance, and epidemiological monitoring. An estimator under GI-ACS based on the Rao–Blackwell theorem was developed in [24], and the combination of unequal-probability inverse sampling with ACS was examined in [25].
In comparison with other probability sampling frameworks, ACS generally achieves superior efficiency when the population is rare and exhibits pronounced spatial clustering. However, its effectiveness may be diminished when the initial random selection yields few or no rare units. GI-ACS alleviates this limitation by combining adaptive neighborhood expansion with a general inverse sampling method. This method controls the final sample size while ensuring a prescribed number of rare units. In contrast, other sampling designs, such as simple random sampling, stratified random sampling, and ranked set sampling, do not expand networks in response to observed rare units, often making them less efficient under spatial association. These distinctions highlight GI-ACS as a flexible alternative that retains the adaptive strengths of ACS while enhancing sampling completeness and operational control.
Previous studies on GI-ACS have primarily focused on estimators based only on the study variable Y. However, in many real-world applications, auxiliary variables correlated with Y are often available and can be measured concurrently at minimal cost. Utilizing such auxiliary information can substantially improve estimation efficiency. Motivated by the regression-type estimator in general inverse sampling proposed in [26], in this study, we develop an estimator that incorporates auxiliary information within the GI-ACS framework.
The remainder of the paper is organized as follows: In Section 2, we provide a review of inverse and general inverse sampling. In Section 3, we describe the general inverse ACS design. In Section 4, we introduce the proposed regression-type estimator using auxiliary information. In Section 5, we present the simulation study and discuss the results. Lastly, Section 6 concludes the study.

2. Inverse Sampling and General Inverse Sampling

Consider a finite population of size $N$ with values $U=\{y_1, y_2, \ldots, y_N\}$. Let $Y$ denote the variable of interest, and let $y_i$ be the $y$-value associated with the $i$th unit. The population is partitioned into two subgroups according to whether the $y$-value satisfies the condition $C$. Following the notation in [22], define the two subgroups as $U_M=\{y_i : y_i \text{ satisfies } C,\ i=1,2,\ldots,N\}$ and $U_{\bar M}=U\setminus U_M=\{y_i : y_i \text{ does not satisfy } C,\ i=1,2,\ldots,N\}$, where $M$ is the unknown number of units in $U_M$. A unit's classification is not known until it is selected. The sampling procedure selects units sequentially at random from the target population until a predetermined number $r$ of units from $U_M$ ($1 < r \le M$) is obtained.
The estimator under inverse sampling was presented in [22]. Suppose an initial sample of size $n_I$ is selected via SRSWOR. If at least $r$ ($r>1$) units from $U_M$ are observed, sampling stops; otherwise, sampling continues sequentially until exactly $r$ such units are obtained. The total sequential sample size is denoted by $n_T$. Let $\mu_y$ be the population mean of the variable of interest; then an unbiased estimator of $\mu_y$ is
$$\bar y_{IS} = \begin{cases} \dfrac{1}{n_I}\displaystyle\sum_{i=1}^{n_I} y_i, & \text{if } n_T = n_I,\\[6pt] \dfrac{1}{N}\left(\hat M\,\bar y_M + (N-\hat M)\,\bar y_{\bar M}\right), & \text{if } n_T > n_I, \end{cases}$$
where $\bar y_M = \frac{1}{r}\sum_{i\in S_M} y_i$, with $S_M$ the index set of the sampled members of $U_M$, and $\bar y_{\bar M} = \frac{1}{n_T-r}\sum_{i\in S_{\bar M}} y_i$, with $S_{\bar M}$ the index set of the sampled members of $U_{\bar M}$. In application, $M$ is not known; its unbiased estimator is $\hat M = \dfrac{N(r-1)}{n_T-1}$.
The variance of $\bar y_{IS}$ is
$$V(\bar y_{IS}) = \begin{cases} \dfrac{N-n_I}{N}\,\dfrac{S^2}{n_I}, & \text{if } n_T = n_I,\\[6pt] V(\bar y_I), & \text{if } n_T > n_I. \end{cases}$$
Using the variance of Murthy's estimator [27],
$$V(\bar y_I) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left[1-\sum_{s\ni i,j}\frac{\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)}\right]\left(\frac{y_i}{p_i}-\frac{y_j}{p_j}\right)^2 p_i p_j,$$
where $\Pr(s)$ is the probability of obtaining sample $s$, $\Pr(s\mid i)$ is the conditional probability of obtaining $s$ given that the $i$th unit was selected first, $p_i$ is the selection probability of unit $i$ on the first draw, and $S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i-\mu_y)^2$.
For any sequential sampling design, it may not be possible to complete the sampling procedure in certain situations. This can occur in an inverse sampling design when $r$ is too large or the population contains very few units satisfying condition $C$. General inverse sampling was proposed in [23]. Beginning with an initial sample of size $n_I$ selected via SRSWOR, sampling stops if at least $r$ units from $U_M$ are selected. Otherwise, sampling continues until $r$ units from $U_M$ are obtained, subject to a cap $n_F$ ($n_F \le N$) on the final sample size. Then, an unbiased estimator of $\mu_y$ is
$$\bar y_{GIS} = \begin{cases} \dfrac{1}{n_I}\displaystyle\sum_{i=1}^{n_I} y_i, & \text{if } A_1,\\[6pt] \dfrac{1}{N}\left(\hat M\,\bar y_M + (N-\hat M)\,\bar y_{\bar M}\right), & \text{if } A_2,\\[6pt] \dfrac{1}{n_F}\displaystyle\sum_{i=1}^{n_F} y_i, & \text{if } A_3, \end{cases}$$
where $A_1=\{n_T=n_I\}$, $A_2=\{n_I<n_T<n_F,\ \text{or } n_T=n_F \text{ and } |S_M|=r\}$, and $A_3=\{n_T=n_F \text{ and } |S_M|<r\}$.
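The stopping rule defining the cases $A_1$–$A_3$ can be sketched as follows. This is a minimal illustration under stated assumptions (units drawn one at a time by SRSWOR, a hypothetical 0/1 population, and no network expansion), not the authors' implementation:

```python
import random

def general_inverse_sample(pop, condition, n_I, r, n_F):
    """Sequential SRSWOR: draw n_I initial units; if fewer than r satisfy
    the condition, keep drawing until r qualify or the cap n_F is reached.
    Returns the drawn indices and the applicable estimator case."""
    order = random.sample(range(len(pop)), len(pop))  # a random draw order
    sample = list(order[:n_I])
    hits = sum(1 for i in sample if condition(pop[i]))
    k = n_I
    while hits < r and k < n_F:        # continue sequentially, respecting the cap
        u = order[k]
        sample.append(u)
        if condition(pop[u]):
            hits += 1
        k += 1
    n_T = len(sample)
    if n_T == n_I:
        case = "A1"                    # initial sample already had >= r hits
    elif hits >= r:
        case = "A2"                    # stopped because r hits were reached
    else:
        case = "A3"                    # cap n_F reached with fewer than r hits
    return sample, case
```

When the sequential branch is taken, $\hat M = N(r-1)/(n_T-1)$ can then be computed from the returned sample size $n_T$.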
The variance of $\bar y_{GIS}$ is
$$V(\bar y_{GIS}) = \begin{cases} \dfrac{N-n_I}{N}\,\dfrac{S^2}{n_I}, & \text{if } A_1,\\[6pt] V(\bar y_I), & \text{if } A_2,\\[6pt] \dfrac{N-n_F}{N}\,\dfrac{S^2}{n_F}, & \text{if } A_3. \end{cases}$$
The unbiased estimator of the variance of $\bar y_{GIS}$ is
$$\hat V(\bar y_{GIS}) = \begin{cases} \dfrac{N-n_I}{N}\,\dfrac{s_I^2}{n_I}, & \text{if } A_1,\\[6pt] \hat V(\bar y_I), & \text{if } A_2,\\[6pt] \dfrac{N-n_F}{N}\,\dfrac{s_F^2}{n_F}, & \text{if } A_3. \end{cases}$$
Here,
$$\hat V(\bar y_I) = \frac{1}{N^2}\sum_{i=1}^{n_T}\sum_{j<i}\frac{\Pr(s\mid i,j)\Pr(s)-\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)^2}\left(\frac{y_i}{p_i}-\frac{y_j}{p_j}\right)^2 p_i p_j,$$
where $\Pr(s\mid i,j)$ is the conditional probability of obtaining sample $s$ given that units $i$ and $j$ were selected (in either order) on the first two draws, $s_I^2 = \frac{1}{n_I-1}\sum_{i=1}^{n_I}(y_i-\bar y_I)^2$ with $\bar y_I = \frac{1}{n_I}\sum_{i=1}^{n_I} y_i$, and $s_F^2 = \frac{1}{n_F-1}\sum_{i=1}^{n_F}(y_i-\bar y_F)^2$ with $\bar y_F = \frac{1}{n_F}\sum_{i=1}^{n_F} y_i$.

3. General Inverse ACS

The integration of general inverse sampling with ACS yields the general inverse ACS (GI-ACS) design. Under GI-ACS, if an initially selected unit satisfies the pre-specified condition $C$, the entire network to which that unit belongs is included. Consequently, the final sample consists of the sequentially selected initial units, all network members generated through adaptive expansion, and the associated edge units.
An estimator of $\mu_y$ for the general inverse design with ACS [23] is
$$\bar y_{GIS,A} = \begin{cases} \dfrac{1}{n_I}\displaystyle\sum_{i=1}^{n_I} \bar y_i^*, & \text{if } A_1,\\[6pt] \dfrac{1}{N}\left(\hat M\,\bar y_M^* + (N-\hat M)\,\bar y_{\bar M}^*\right), & \text{if } A_2,\\[6pt] \dfrac{1}{n_F}\displaystyle\sum_{i=1}^{n_F} \bar y_i^*, & \text{if } A_3, \end{cases}$$
where $\bar y_i^*$ is the average of the variable of interest in the network containing unit $i$ of the initial sample, $\bar y_i^* = \frac{1}{m_i}\sum_{j\in\psi_i} y_j$, with $\psi_i$ the network containing unit $i$ and $m_i$ the number of units in that network; $\bar y_M^*$ and $\bar y_{\bar M}^*$ are defined as in Section 2, with the network means $\bar y_i^*$ in place of the $y_i$.
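Computing the network means $\bar y_i^*$ is a simple grouped average. A minimal sketch, assuming networks are encoded as a map from unit index to a network label (with singleton labels for non-qualifying units):

```python
from collections import defaultdict

def network_means(y, networks):
    """Return ybar_i*: each unit's value replaced by the mean of its network.

    y        -- list of y-values, one per unit
    networks -- dict mapping unit index to a network label
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for i, label in networks.items():
        totals[label] += y[i]
        counts[label] += 1
    return {i: totals[label] / counts[label] for i, label in networks.items()}
```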
The variance of $\bar y_{GIS,A}$ is
$$V(\bar y_{GIS,A}) = \begin{cases} \dfrac{N-n_I}{N}\,\dfrac{S^{*2}}{n_I}, & \text{if } A_1,\\[6pt] V(\bar y_I^*), & \text{if } A_2,\\[6pt] \dfrac{N-n_F}{N}\,\dfrac{S^{*2}}{n_F}, & \text{if } A_3, \end{cases}$$
where
$$V(\bar y_I^*) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left[1-\sum_{s\ni i,j}\frac{\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)}\right]\left(\frac{\bar y_i^*}{p_i}-\frac{\bar y_j^*}{p_j}\right)^2 p_i p_j \quad\text{and}\quad S^{*2} = \frac{1}{N-1}\sum_{i=1}^{N}\left(\bar y_i^*-\mu_y\right)^2.$$
The unbiased estimator of the variance of $\bar y_{GIS,A}$ is
$$\hat V(\bar y_{GIS,A}) = \begin{cases} \dfrac{N-n_I}{N}\,\dfrac{s_I^{*2}}{n_I}, & \text{if } A_1,\\[6pt] \hat V(\bar y_I^*), & \text{if } A_2,\\[6pt] \dfrac{N-n_F}{N}\,\dfrac{s_F^{*2}}{n_F}, & \text{if } A_3, \end{cases}$$
where
$$\hat V(\bar y_I^*) = \frac{1}{N^2}\sum_{i=1}^{n_T}\sum_{j<i}\frac{\Pr(s\mid i,j)\Pr(s)-\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)^2}\left(\frac{\bar y_i^*}{p_i}-\frac{\bar y_j^*}{p_j}\right)^2 p_i p_j,$$
$s_I^{*2} = \frac{1}{n_I-1}\sum_{i=1}^{n_I}(\bar y_i^*-\bar y_I^*)^2$ with $\bar y_I^* = \frac{1}{n_I}\sum_{i=1}^{n_I}\bar y_i^*$, and $s_F^{*2} = \frac{1}{n_F-1}\sum_{i=1}^{n_F}(\bar y_i^*-\bar y_F^*)^2$ with $\bar y_F^* = \frac{1}{n_F}\sum_{i=1}^{n_F}\bar y_i^*$.

4. Proposed Estimator in General Inverse ACS

Motivated by Moradi's regression-type estimator [26], formulated under inverse sampling using unit-level observations, the proposed estimator adapts this approach to the GI-ACS design, operating on the network-level means produced by adaptive expansion. In this section, let $\mu_x$ be the population mean of the auxiliary variable; the estimator $\bar x_{GIS,A}$ is calculated in the same way as $\bar y_{GIS,A}$, using the $x$-variable.
The modified regression estimator in general inverse ACS is
The modified regression estimator in general inverse ACS is
$$\bar y_{RGIS,A} = \bar y_{GIS,A} + B^*\left(\mu_x - \bar x_{GIS,A}\right),$$
where
$$B^* = \frac{\sum_{i=1}^{N}\bar x_i^*\bar y_i^* - N\mu_y\mu_x}{\sum_{i=1}^{N}\bar x_i^{*2} - N\mu_x^2}.$$
The means $\mu_y$ and $\mu_x$ are estimated by $\bar y_{GIS,A}$ and $\bar x_{GIS,A}$, respectively, whereas the totals $\sum_{i=1}^{N}\bar x_i^*\bar y_i^*$ and $\sum_{i=1}^{N}\bar x_i^{*2}$ are estimated using Murthy's estimator:
$$\hat T_{xy}^* = \sum_{i=1}^{n_T}\frac{\Pr(s\mid i)}{\Pr(s)}\,\bar x_i^*\bar y_i^*,\qquad \hat T_{xx}^* = \sum_{i=1}^{n_T}\frac{\Pr(s\mid i)}{\Pr(s)}\,\bar x_i^{*2}.$$
Applying Equation (6) to the new variables $\bar x_i^*\bar y_i^*$ and $\bar x_i^{*2}$ in GI-ACS yields
$$\hat T_{xy}^* = \begin{cases} \dfrac{N}{n_I}\displaystyle\sum_{i=1}^{n_I}\bar x_i^*\bar y_i^*, & \text{if } A_1,\\[6pt] \hat M\,\overline{(x^*y^*)}_M + (N-\hat M)\,\overline{(x^*y^*)}_{\bar M}, & \text{if } A_2,\\[6pt] \dfrac{N}{n_F}\displaystyle\sum_{i=1}^{n_F}\bar x_i^*\bar y_i^*, & \text{if } A_3, \end{cases}$$
where $\overline{(x^*y^*)}_M = \frac{1}{r}\sum_{i\in S_M}\bar x_i^*\bar y_i^*$ and $\overline{(x^*y^*)}_{\bar M} = \frac{1}{n_T-r}\sum_{i\in S_{\bar M}}\bar x_i^*\bar y_i^*$; similarly,
$$\hat T_{xx}^* = \begin{cases} \dfrac{N}{n_I}\displaystyle\sum_{i=1}^{n_I}\bar x_i^{*2}, & \text{if } A_1,\\[6pt] \hat M\,\overline{(x^*x^*)}_M + (N-\hat M)\,\overline{(x^*x^*)}_{\bar M}, & \text{if } A_2,\\[6pt] \dfrac{N}{n_F}\displaystyle\sum_{i=1}^{n_F}\bar x_i^{*2}, & \text{if } A_3, \end{cases}$$
where $\overline{(x^*x^*)}_M = \frac{1}{r}\sum_{i\in S_M}\bar x_i^{*2}$ and $\overline{(x^*x^*)}_{\bar M} = \frac{1}{n_T-r}\sum_{i\in S_{\bar M}}\bar x_i^{*2}$.
Therefore, the estimator of $B^*$ is
$$b^* = \frac{\hat T_{xy}^* - N\,\bar y_{GIS,A}\,\bar x_{GIS,A}}{\hat T_{xx}^* - N\,\bar x_{GIS,A}^2}.$$
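In the special case $A_1$ (sampling stops at the initial sample), the Murthy totals reduce to expansion estimators, $\hat T_{xy}^* = (N/n_I)\sum \bar x_i^*\bar y_i^*$, and the factor $N$ cancels in $b^*$, which collapses to the ordinary least-squares slope computed on the network means. A sketch of the estimator restricted to that case (the function name and input encoding are illustrative):

```python
def regression_estimate_case_A1(ybar_star, xbar_star, mu_x):
    """Regression-type GI-ACS estimate when sampling stops at the initial
    sample (case A1): b* is the least-squares slope on the network means."""
    n = len(ybar_star)
    ybar = sum(ybar_star) / n
    xbar = sum(xbar_star) / n
    sxy = sum(x * y for x, y in zip(xbar_star, ybar_star)) - n * xbar * ybar
    sxx = sum(x * x for x in xbar_star) - n * xbar * xbar
    b = sxy / sxx
    return ybar + b * (mu_x - xbar)  # ybar_GIS,A + b*(mu_x - xbar_GIS,A)
```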
The bias of $\bar y_{RGIS,A}$ follows from
$$E(\bar y_{RGIS,A}) = E\left[\bar y_{GIS,A} + b^*\left(\mu_x - \bar x_{GIS,A}\right)\right] = E(\bar y_{GIS,A}) + \mu_x E(b^*) - E\left(b^*\,\bar x_{GIS,A}\right).$$
Since $\bar y_{GIS,A}$ and $\bar x_{GIS,A}$ are unbiased estimators of $\mu_y$ and $\mu_x$, respectively, we have
$$E(\bar y_{RGIS,A}) = \mu_y + E(\bar x_{GIS,A})E(b^*) - E\left(b^*\,\bar x_{GIS,A}\right) = \mu_y - \mathrm{Cov}\left(b^*, \bar x_{GIS,A}\right).$$
Therefore, the bias of $\bar y_{RGIS,A}$ is
$$\mathrm{Bias}(\bar y_{RGIS,A}) = E(\bar y_{RGIS,A}) - \mu_y = -\mathrm{Cov}\left(b^*, \bar x_{GIS,A}\right).$$
From $\bar y_{RGIS,A} = \bar y_{GIS,A} + b^*\left(\mu_x - \bar x_{GIS,A}\right) = \bar y_{GIS,A} - b^*\bar x_{GIS,A} + b^*\mu_x$, let $z_i^* = \bar y_i^* - B^*\bar x_i^*$, with $B^*$ estimated by $b^*$. Then $\bar y_{RGIS,A} = \bar z_{GIS,A} + b^*\mu_x$.
The variance of $\bar y_{RGIS,A}$ is
$$V(\bar y_{RGIS,A}) = \begin{cases} \dfrac{N-n_I}{N}\,\dfrac{S_z^{*2}}{n_I}, & \text{if } A_1,\\[6pt] V(\bar y_{RI}^*), & \text{if } A_2,\\[6pt] \dfrac{N-n_F}{N}\,\dfrac{S_z^{*2}}{n_F}, & \text{if } A_3, \end{cases}$$
where $S_z^{*2} = \frac{1}{N-1}\sum_{i=1}^{N}\left(z_i^*-\mu_z\right)^2$, with $\mu_z = \mu_y - B^*\mu_x$ the population mean of the $z_i^*$, and $V(\bar y_{RI}^*)$ is derived from the variance of Murthy's estimator [27], which applies the Rao–Blackwell theorem.
$$V(\bar y_{RI}^*) = V\!\left[E(\bar z^*\mid s)\right] = V(\bar z^*) - E\!\left[V(\bar z^*\mid s)\right],$$
where $\bar z^* = \frac{1}{N}\sum_{i=1}^{N}\frac{z_i^*}{p_i}I_i$ and
$$V(\bar z^*) = \frac{1}{N^2}\sum_{i=1}^{N}V\!\left(\frac{z_i^*}{p_i}I_i\right) + \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j\ne i}\mathrm{Cov}\!\left(\frac{z_i^*}{p_i}I_i,\ \frac{z_j^*}{p_j}I_j\right).$$
Here $I_i$ is an indicator function equal to 1 when unit $i$ is selected first and 0 otherwise, so $I_iI_j = 0$ for $i\ne j$, $E(I_iI_j)=0$, and $\mathrm{Cov}(I_i,I_j) = -p_ip_j$, which gives
$$V(\bar z^*) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left(\frac{z_i^*}{p_i}-\frac{z_j^*}{p_j}\right)^2 p_ip_j.$$
For the conditional variance,
$$V(\bar z^*\mid s) = \frac{1}{N^2}V\!\left(\sum_{i=1}^{N}\frac{z_i^*}{p_i}I_i \,\Big|\, s\right) = \frac{1}{N^2}\sum_{i=1}^{N}\frac{z_i^{*2}}{p_i^2}\left[\Pr(I_i=1\mid s) - \Pr(I_i=1\mid s)^2\right] - \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j\ne i}\frac{z_i^*z_j^*}{p_ip_j}\Pr(I_i=1\mid s)\Pr(I_j=1\mid s)$$
$$= \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\frac{\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)^2}\left(\frac{z_i^*}{p_i}-\frac{z_j^*}{p_j}\right)^2 p_ip_j,$$
which gives
$$E\!\left[V(\bar z^*\mid s)\right] = \sum_{s}\Pr(s)\,\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\frac{\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)^2}\left(\frac{z_i^*}{p_i}-\frac{z_j^*}{p_j}\right)^2 p_ip_j = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left[\sum_{s\ni i,j}\frac{\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)}\right]\left(\frac{z_i^*}{p_i}-\frac{z_j^*}{p_j}\right)^2 p_ip_j.$$
Therefore,
$$V(\bar y_{RI}^*) = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j<i}\left[1-\sum_{s\ni i,j}\frac{\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)}\right]\left(\frac{z_i^*}{p_i}-\frac{z_j^*}{p_j}\right)^2 p_ip_j.$$
Under the GI-ACS design, the inclusion probabilities are determined by the sampling design; they are therefore treated as design-based quantities rather than as random variables drawn from a specific probability distribution. Independence among inclusion events is not assumed, since adaptive expansion naturally creates dependence among units in the same network. The proposed estimator and its variance rely solely on the properties of general inverse sampling; no additional distributional assumptions are required.
The estimator of the variance of $\bar y_{RGIS,A}$ is
$$\hat V(\bar y_{RGIS,A}) = \begin{cases} \dfrac{N-n_I}{N}\,\dfrac{s_{zI}^{*2}}{n_I}, & \text{if } A_1,\\[6pt] \hat V(\bar y_{RI}^*), & \text{if } A_2,\\[6pt] \dfrac{N-n_F}{N}\,\dfrac{s_{zF}^{*2}}{n_F}, & \text{if } A_3, \end{cases}$$
where the unbiased estimator of $V(\bar y_{RI}^*)$ is
$$\hat V(\bar y_{RI}^*) = \frac{1}{N^2}\sum_{i=1}^{n_T}\sum_{j<i}\frac{\Pr(s\mid i,j)\Pr(s)-\Pr(s\mid i)\Pr(s\mid j)}{\Pr(s)^2}\left(\frac{z_i^*}{p_i}-\frac{z_j^*}{p_j}\right)^2 p_ip_j,$$
with $\Pr(s\mid i,j)$ the conditional probability of obtaining sample $s$ given that units $i$ and $j$ were selected (in either order) on the first two draws. Following Murthy [27], expanding the pairwise squared differences $\left(z_i^*-z_j^*\right)^2$ within and between the groups $S_M$ and $S_{\bar M}$ yields the computational form
$$\hat V(\bar y_{RI}^*) = \alpha\, s_M^{*2} + \frac{1}{N^2}\,\hat V(\hat M)\left(\bar z_M^* - \bar z_{\bar M}^*\right)^2 + \gamma\, s_{\bar M}^{*2},$$
where
$$\alpha = \frac{\hat M}{N^2}\left[\frac{(N-n_T+1)(n_T-r)}{n_T\,r} - \frac{(N-n_T)^2}{N\,r\,(n_T-2)(r-1)}\right],\qquad \gamma = \frac{(N-n_T+1)(n_T-r-1)}{N(n_T-1)(n_T-2)},$$
$$\hat V(\hat M) = \frac{N(n_T-r)(r-1)(N-n_T+1)}{(n_T-1)^2(n_T-2)} = \frac{\hat M(N-\hat M)(N-n_T+1)}{N(n_T-2)},$$
$$s_M^{*2} = \frac{1}{r-1}\sum_{i\in S_M}\left(z_i^*-\bar z_M^*\right)^2,\qquad \bar z_M^* = \frac{1}{r}\sum_{i\in S_M} z_i^*,$$
$$s_{\bar M}^{*2} = \frac{1}{n_T-r-1}\sum_{i\in S_{\bar M}}\left(z_i^*-\bar z_{\bar M}^*\right)^2,\qquad \bar z_{\bar M}^* = \frac{1}{n_T-r}\sum_{i\in S_{\bar M}} z_i^*,$$
$$s_{zI}^{*2} = \frac{1}{n_I-1}\sum_{i=1}^{n_I}\left(z_i^*-\bar z_I^*\right)^2,\qquad \bar z_I^* = \frac{1}{n_I}\sum_{i=1}^{n_I} z_i^*,\qquad\text{and}$$
$$s_{zF}^{*2} = \frac{1}{n_F-1}\sum_{i=1}^{n_F}\left(z_i^*-\bar z_F^*\right)^2,\qquad \bar z_F^* = \frac{1}{n_F}\sum_{i=1}^{n_F} z_i^*.$$

5. Simulation Studies and Discussion

Simulation studies were conducted under three correlation scenarios between the auxiliary variable (X) and the variable of interest (Y):
Scenario I: High correlation;
Scenario II: Moderate correlation;
Scenario III: Low correlation, with Y representing real data and X simulated accordingly.
Population I: A Poisson cluster process [28] was used to generate the auxiliary variable $X$ and the variable of interest $Y$ over a $20\times 20$ grid (400 units). The population mean of the $y$-values was 59.7375, and the correlation coefficient between the $y$-data and $x$-data was 0.9973. The criterion for including neighboring units was $C=\{y : y > 0\}$. Population I is shown in Figure 1 and Figure 2.
Population II: Based on the method of Chao [12], this population was generated using a linked-pairs process combined with a bivariate Poisson cluster process over a $20\times 20$ grid (400 units). The population was more weakly clustered than Population I, with several single-unit networks. The population mean of $y$ was 0.6475, and the Pearson correlation coefficient between $x$ and $y$ was 0.7070. The criterion for including neighboring units was $C=\{y : y > 0\}$. Population II is shown in Figure 3 and Figure 4.
Population III: This population was derived from the real blue-winged teal dataset [29]. The study area (5000 km² in central Florida) was divided into 50 units of 100 km² each, and the counts were assigned as $y$-values. The corresponding $x$-values [30] were simulated using the model $x_i = 4\bar y_i^* + \gamma_i$, where $\gamma_i \sim N\!\left(0,\ \bar y_i^*\times y_i\right)$, and $x_i = 0$ was assumed whenever $y_i = 0$. The population mean of $y$ was 282.420, and the Pearson correlation coefficient between $x$ and $y$ was 0.4733. The criterion for including neighboring units was $C=\{y : y > 0\}$. Population III is shown in Figure 5 and Figure 6.
A total of 10,000 iterations were performed for each estimator. Here, $E(n_T)$ denotes the expected sequential sample size under general inverse sampling, and $E(\nu)$ denotes the expected final sample size.
The estimated variance of each estimator is defined as
$$\hat V(\bar y_{GIS,A}) = \frac{1}{10{,}000-1}\sum_{i=1}^{10{,}000}\left(\bar y_{GIS,A}^{(i)} - \frac{1}{10{,}000}\sum_{j=1}^{10{,}000}\bar y_{GIS,A}^{(j)}\right)^2,$$
$$\hat V(\bar y_{RGIS,A}) = \frac{1}{10{,}000-1}\sum_{i=1}^{10{,}000}\left(\bar y_{RGIS,A}^{(i)} - \frac{1}{10{,}000}\sum_{j=1}^{10{,}000}\bar y_{RGIS,A}^{(j)}\right)^2.$$
The relative efficiency of the proposed estimator, compared with $\bar y_{GIS,A}$, is defined as
$$RE(\bar y_{RGIS,A}) = \frac{\hat V(\bar y_{GIS,A})}{\hat V(\bar y_{RGIS,A})}.$$
Since the proposed estimator $\bar y_{RGIS,A}$ is biased, the estimated absolute relative bias is defined as
$$ARB(\bar y_{RGIS,A}) = \frac{\left|\frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\bar y_{RGIS,A}^{(i)} - \mu_y\right|}{\mu_y}.$$
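The Monte Carlo summaries defined above amount to a few lines of code. A sketch, taking two lists of replicate estimates (the inputs in the test are hypothetical, not the paper's results):

```python
def simulation_metrics(est_plain, est_reg, mu_y):
    """Empirical variances, relative efficiency RE = Vhat(plain)/Vhat(reg),
    and absolute relative bias |mean(reg) - mu_y| / mu_y of the
    regression-type estimator, as in the simulation study."""
    def sample_var(v):
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v) / (len(v) - 1)
    v_plain, v_reg = sample_var(est_plain), sample_var(est_reg)
    re = v_plain / v_reg          # RE > 1 favours the regression estimator
    arb = abs(sum(est_reg) / len(est_reg) - mu_y) / mu_y
    return v_plain, v_reg, re, arb
```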

Discussion

Based on the data analyzed in this study, the variable of interest and the auxiliary variable exhibit a positive linear relationship. The selected values of the number of rare units r and the initial sample size n I under the GI-ACS design represent practical ranges that support feasible sampling while enabling meaningful variation for evaluating estimator performance under different levels of network expansion.
The results of the study are as follows:
Based on the results presented in Table 1, since the proposed estimator y ¯ R G I S , A is biased, the empirical evidence indicates that when the variable of interest and the auxiliary variable are highly correlated, an increase in both r and n I leads to a decrease in the estimated absolute relative bias. Conversely, when the correlation between the variables is low to moderate, the estimated absolute relative bias of the proposed estimator does not show a significant reduction.
Based on the results presented in Table 2, the estimated variance of all estimators decreases as r and n I increase. Across all scenarios, the proposed estimator y ¯ R G I S , A , which incorporates auxiliary information, consistently yields lower estimated variance than the estimator based solely on the variable of interest y ¯ G I S , A . The lower variance of the proposed estimator relative to y ¯ G I S , A across all scenarios reinforces the stabilizing effect of the auxiliary variable. The proposed estimator incorporates regression adjustment at the network level. In Population I and Population III, where network-level variability is prominent, the auxiliary variable significantly enhances the predictive strength. Even in Population II, where the correlation is moderate and clustering is weaker, the proposed estimator maintains lower variance, indicating that auxiliary information offers robustness against variability induced by both sequential sampling and adaptive expansion. Moreover, the relative efficiency of the proposed estimator is greater than one in all cases, indicating that y ¯ R G I S , A is more efficient than y ¯ G I S , A . The regression-type estimators remain effective even under GI-ACS, providing substantial improvement over estimators that rely solely on the study variable.
Across all three simulation scenarios, the relative bias decreases and the estimated variance declines as the sample size increases, indicating that the proposed estimator under GI-ACS tends to approach the population mean for the variable of interest. This behavior indicates that the regression-based estimator exhibits design consistency within the GI-ACS framework.

6. Conclusions

General inverse sampling was incorporated into adaptive cluster sampling (ACS) in [23], and the resulting design is referred to as the general inverse ACS (GI-ACS) design. In this study, we developed a regression-type estimator for GI-ACS that incorporates auxiliary information to improve the estimation of the population mean in rare and spatially clustered populations. Analytical results established the bias and variance properties of the proposed estimator, and simulation studies conducted under three correlation scenarios demonstrated clear efficiency gains over the existing GI-ACS estimator that does not employ auxiliary variables.
The findings indicate that, across all investigated scenarios, the proposed estimator is consistently more efficient than the estimator of the population mean based solely on the study variable. Moreover, when the auxiliary variable is highly correlated with the variable of interest, the proposed estimator exhibits noticeably smaller bias compared to situations in which the correlation is low to moderate. These results suggest that the proposed estimator under the GI-ACS framework is particularly advantageous when a strong correlation exists between the auxiliary and study variables. Overall, the findings highlight the practical value of auxiliary information in improving the stability and precision of estimators within the GI-ACS design.
The main methodological contribution of this work lies in extending regression-type estimation to the GI-ACS framework, thereby addressing a key limitation of existing GI-ACS estimators, which rely exclusively on the study variable. By incorporating auxiliary information, the proposed estimator provides enhanced precision and robustness across a range of sampling conditions.
In this study, the proposed estimator utilized a single auxiliary variable. Future studies should focus on developing extensions that incorporate additional auxiliary information, such as the coefficient of variation of the auxiliary variable, the correlation coefficient between the auxiliary variable and the variable of interest, or multiple auxiliary variables. Such extensions could further improve the efficiency of estimators under the GI-ACS framework.

Author Contributions

Conceptualization, N.C. and P.G.; methodology, S.W. and N.C.; software, N.C.; validation, M.C. and C.B.; formal analysis, N.C. and M.C.; investigation, S.W. and P.G.; resources, N.C. and C.B.; data curation, N.C. and C.B.; writing—original draft preparation, N.C. and M.C.; writing—review and editing, P.G. and S.W.; visualization, P.G. and C.B.; supervision, N.C. and P.G.; project administration, N.C.; funding acquisition, N.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research project was financially supported by Mahasarakham University.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest relevant to the content of this article.

References

  1. Thompson, S.K. Adaptive cluster sampling. J. Am. Stat. Assoc. 1990, 85, 1050–1059. [Google Scholar] [CrossRef]
  2. Magnussen, S.; Kurz, W.; Leckie, D.G.; Paradine, D. Adaptive cluster sampling for estimation of deforestation rates. Eur. J. For. Res. 2005, 124, 207–220. [Google Scholar] [CrossRef]
  3. Noon, B.R.; Ishwar, N.M.; Vasudevan, K. Efficiency of adaptive cluster and random sampling in detecting terrestrial herpetofauna in a tropical rainforest. Wildl. Soc. Bull. 2006, 34, 59–68. [Google Scholar] [CrossRef]
  4. Sullivan, W.P.; Morrison, B.J.; Beamish, F.W.H. Adaptive cluster sampling: Estimating density of spatially autocorrelated larvae of the sea lamprey with improved precision. J. Great Lakes Res. 2008, 34, 86–97. [Google Scholar] [CrossRef]
  5. Smith, D.R.; Villella, R.F.; Lemarié, D.P. Application of adaptive cluster sampling to low-density populations of freshwater mussels. Environ. Ecol. Stat. 2003, 10, 7–15. [Google Scholar] [CrossRef]
  6. Conners, M.E.; Schwager, S.J. The use of adaptive cluster sampling for hydroacoustic surveys. ICES J. Mar. Sci. 2002, 59, 1314–1325. [Google Scholar] [CrossRef]
  7. Olayiwola, O.M.; Ajayi, A.O.; Onifade, O.C.; Wale-Orojo, O.; Ajibade, B. Adaptive cluster sampling with model based approach for estimating total number of Hidden COVID-19 carriers in Nigeria. Stat. J. IAOS 2020, 36, 103–109. [Google Scholar] [CrossRef]
  8. Chandra, G.; Tiwari, N.; Nautiyal, R. Adaptive cluster sampling-based design for estimating COVID-19 cases with random samples. Curr. Sci. 2021, 120, 1204–1210. [Google Scholar] [CrossRef]
  9. Stehlík, M.; Kiseľák, J.; Dinamarca, A.; Alvarado, E.; Plaza, F.; Medina, F.A.; Stehlíková, S.; Marek, J.; Venegas, B.; Gajdoš, A.; et al. REDACS: Regional emergency-driven adaptive cluster sampling for effective COVID-19 management. Stoch. Anal. Appl. 2022, 41, 474–508. [Google Scholar] [CrossRef] [PubMed]
  10. Hwang, J.; Bose, N.; Fan, S. AUV adaptive sampling methods: A Review. Appl. Sci. 2019, 9, 3145. [Google Scholar] [CrossRef]
  11. Giouroukis, D.; Dadiani, A.; Traub, J.; Zeuch, S.; Markl, V. A survey of adaptive sampling and filtering algorithms for the internet of things. In Proceedings of the 14th ACM International Conference on Distributed and Event-Based Systems, Montreal, QC, Canada, 13–17 July 2020; pp. 27–38. [Google Scholar] [CrossRef]
  12. Chao, C.T. Ratio estimation on adaptive cluster sampling. J. Chin. Stat. Assoc. 2004, 42, 307–327. [Google Scholar] [CrossRef]
  13. Dryver, A.L.; Chao, C.T. Ratio estimators in adaptive cluster sampling. Environmetrics 2007, 18, 607–620. [Google Scholar] [CrossRef]
  14. Chutiman, N.; Kumphon, B. Ratio estimator using two auxiliary variables for adaptive cluster sampling. Thail. Stat. 2008, 6, 241–256. [Google Scholar]
  15. Chutiman, N. Adaptive cluster sampling using auxiliary variable. J. Math. Stat. 2013, 9, 249–255. [Google Scholar] [CrossRef]
  16. Yadav, S.K.; Misra, S.; Mishra, S. Efficient estimator for population variance using auxiliary variable. Am. J. Oper. Res. 2016, 6, 9–15. [Google Scholar] [CrossRef]
  17. Chaudhry, M.S.; Hanif, M. Generalized exponential-cum-exponential estimator in adaptive cluster sampling. Pak. J. Stat. Oper. Res. 2015, 11, 553–574. [Google Scholar] [CrossRef]
  18. Chaudhry, M.S.; Hanif, M. Generalized difference-cum-exponential estimator in adaptive cluster sampling. Pak. J. Stat. 2017, 33, 335–367. [Google Scholar]
  19. Bhat, A.A.; Sharma, M.; Shah, M.; Bhat, M. Generalized ratio type estimator under adaptive cluster sampling. J. Sci. Res. 2023, 67, 46–51. [Google Scholar] [CrossRef]
  20. Yasmeen, U.; Thompson, M. Variance estimation in adaptive cluster sampling. Commun. Stat. Theory Methods 2019, 49, 2485–2497. [Google Scholar] [CrossRef]
  21. Shahzad, U.; Ahmad, I.; Al-Noor, N.H.; Benedict, T.J. Use of calibration constraints and linear moments for variance estimation under stratified adaptive cluster sampling. Soft Comput. 2022, 26, 11185–11196. [Google Scholar] [CrossRef]
  22. Christman, M.C.; Lan, F. Inverse adaptive cluster sampling. Biometrics 2001, 57, 1096–1105. [Google Scholar] [CrossRef] [PubMed]
  23. Salehi, M.; Seber, G.A.F. A general inverse sampling scheme and its application to adaptive cluster sampling. Aust. N. Z. J. Stat. 2004, 46, 483–494. [Google Scholar] [CrossRef]
  24. Pochai, N. An improved the estimator in inverse adaptive cluster sampling. Thail. Stat. 2008, 6, 15–26. [Google Scholar]
  25. Sangngam, P. Unequal probability inverse adaptive cluster sampling. Chiang Mai J. Sci. 2013, 40, 736–742. [Google Scholar]
  26. Moradi, M.; Salehi, M.; Brown, J.A.; Karimi, N. Regression estimator under inverse sampling to estimate arsenic contamination. Environmetrics 2011, 22, 894–900. [Google Scholar] [CrossRef]
  27. Salehi, M.; Seber, G.A.F. Theory & methods: A new proof of Murthy’s estimator which applies to sequential sampling. Aust. N. Z. J. Stat. 2001, 43, 281–286. [Google Scholar]
  28. Subzar, M.; Alqurashi, T.; Chandawat, D.; Tamboli, S.; Raja, T.A.; Attri, A.K.; Wani, S.A. Generalized robust regression techniques and adaptive cluster sampling for efficient estimation of population mean in case of rare and clustered populations. Sci. Rep. 2025, 15, 2069. [Google Scholar] [CrossRef]
  29. Smith, D.R.; Conroy, M.J.; Brakhage, D.H. Efficiency of adaptive cluster sampling for estimating density wintering waterfowl. Biometrics 1995, 51, 777–788. [Google Scholar] [CrossRef]
  30. Pochai, N. Double and Resampling in Adaptive Cluster Sampling. Doctoral Dissertation, National Institute of Development Administration, Bangkok, Thailand, 2006. [Google Scholar]
Figure 1. The variable of interest (y) for Population I. The shaded regions in different colors denote distinct networks.
Figure 2. The auxiliary variable (x) for Population I. The position of the network is aligned with the data y in Figure 1.
Figure 3. The variable of interest (y) for Population II. The shaded regions in different colors denote distinct networks.
Figure 4. The auxiliary variable (x) for Population II. The position of the network is aligned with the data y in Figure 3.
Figure 5. The variable of interest (y) for Population III (blue-winged teal data). The shaded regions in different colors denote distinct networks.
Figure 6. The auxiliary variable (x) for Population III. The position of the network is aligned with the data y in Figure 5.
Table 1. The estimated absolute relative bias of the proposed estimator for the population mean of the variable of interest.
Population I
r   n_I   E(n_T)     E(ν)       ARB(ȳ_R,GIS,A)
2     5     8.7335    57.0359   0.3748
2    10    11.2945    65.1829   0.1985
2    15    15.4288    75.4269   0.0666
2    20    20.0953    86.8303   0.0098
2    50    50.0000   129.8053   0.0040
3     5    12.2907    69.8711   0.2141
3    10    13.4347    72.8004   0.2059
3    15    16.5289    79.6913   0.1620
3    20    20.4772    87.5785   0.0436
3    50    50.0000   129.6072   0.0027
4    10    16.6089    82.3383   0.1499
4    15    18.0372    84.5381   0.1538
4    20    21.3896    91.4283   0.1264
4    50    50.0000   129.2807   0.0019
5    15    21.3911    91.4934   0.1354
5    20    23.6009    95.6987   0.1386
5    25    26.2153   100.5247   0.0938
5    30    30.5074   106.9697   0.0414
5    50    50.0238   129.8964   0.0009
10   30    41.5436   122.0440   0.0984
10   40    44.2324   124.5140   0.0887
10   50    44.2324   130.6097   0.0530
Population II
r   n_I   E(n_T)    E(ν)      ARB(ȳ_R,GIS,A)
2     5   13.1909   21.4511   0.5358
2    10   15.0006   23.9583   0.4158
2    15   20.7747   32.1917   0.4059
2    20   21.2906   32.4103   0.1512
2    50   50.0142   67.6653   0.0263
3     5   20.3347   31.3794   0.4595
3    10   22.3294   34.3906   0.4304
3    15   24.1779   36.6184   0.2724
3    20   26.7781   41.1382   0.3832
3    50   50.0857   68.6434   0.0271
4    10   27.4727   41.2841   0.4517
4    15   28.5916   42.3244   0.4005
4    20   33.8133   49.1822   0.3583
4    50   50.3264   68.6791   0.0477
5    15   33.8214   48.7632   0.3832
5    20   34.4511   50.4554   0.4554
5    25   35.7151   51.2247   0.4317
5    30   37.6963   54.5692   0.3668
5    50   51.2353   69.8996   0.0930
10   30   67.6144   87.7547   0.3825
10   40   67.1909   87.0848   0.3892
10   50   67.6575   88.0682   0.4036
Population III
r   n_I   E(n_T)    E(ν)      ARB(ȳ_R,GIS,A)
2     5    7.1176   15.9180   0.2912
2    10   10.3619   19.4250   0.1308
3     5   10.0809   19.3092   0.1203
3    10   11.2352   20.6149   0.1242
4     8   12.9679   22.6145   0.1270
4    10   13.3327   22.7756   0.1306
5    10   16.0679   16.0674   0.1332
5    15   17.6105   26.4586   0.1088
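The absolute relative bias reported in Table 1 is the distance between the Monte Carlo mean of the estimates and the true population mean, expressed as a fraction of the true mean. A minimal sketch of this computation follows; the function name and the replicate values are hypothetical illustrations, not the paper's simulation code.

```python
def absolute_relative_bias(estimates, true_mean):
    """Absolute relative bias: |mean of the estimates - true mean| / true mean."""
    mean_est = sum(estimates) / len(estimates)
    return abs(mean_est - true_mean) / true_mean

# Hypothetical replicate estimates of the population mean (illustration only)
estimates = [95.0, 97.0, 96.0, 97.0]
print(absolute_relative_bias(estimates, true_mean=100.0))  # → 0.0375
```

An ARB near zero, as in the large-sample rows of Table 1, indicates that the estimator is close to unbiased for that setting.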
Table 2. The estimated variance of the estimators for the population mean of the variable of interest and the relative efficiency of the proposed estimator compared with y ¯ G I S , A .
Population I
r   n_I   E(n_T)     E(ν)       V̂(ȳ_GIS,A)   V̂(ȳ_R,GIS,A)   RE(ȳ_R,GIS,A)
2     5     8.7335    57.0359   2242.0274     1359.3310       1.6494
2    10    11.2945    65.1829   1174.3768      663.9508       1.7688
2    15    15.4288    75.4269    730.6367      579.0399       1.2618
2    20    20.0953    86.8303    564.0877      555.7330       1.0187
2    50    50.0000   129.8053    203.2275      196.9962       1.0316
3     5    12.2907    69.8711   1772.1519     1287.5967       1.3763
3    10    13.4347    72.8004    863.4391      389.3139       2.2178
3    15    16.5289    79.6913    795.5099      426.5605       1.8649
3    20    20.4772    87.5785    526.7130      406.0654       1.2971
3    50    50.0000   129.6072    179.4460      173.9451       1.0316
4    10    16.6089    82.3383    742.8746      422.3135       1.7591
4    15    18.0372    84.5381    544.0685      232.7903       2.3372
4    20    21.3896    91.4283    536.4767      285.0840       1.8818
4    50    50.0000   129.2807    170.3615      165.1691       1.0314
5    15    21.3911    91.4934    516.0823      241.1235       2.1403
5    20    23.6009    95.6987    524.0002      228.0293       2.2980
5    25    26.2153   100.5247    398.3312      219.3141       1.8163
5    30    30.5074   106.9697    344.7272      228.7729       1.5069
5    50    50.0238   129.8964    213.9430      211.7617       1.0103
10   30    41.5436   122.0440    339.7744      167.0369       2.0341
10   40    44.2324   124.5140    194.7799       72.3611       2.6918
10   50    44.2324   130.6097    152.7576       93.7495       1.6294
Population II
r   n_I   E(n_T)    E(ν)      V̂(ȳ_GIS,A)   V̂(ȳ_R,GIS,A)   RE(ȳ_R,GIS,A)
2     5   13.1909   21.4511   0.6909        0.6454          1.0705
2    10   15.0006   23.9583   0.4291        0.3425          1.2528
2    15   20.7747   32.1917   0.4303        0.2862          1.5035
2    20   21.2906   32.4103   0.2461        0.2235          1.1011
2    50   50.0142   67.6653   0.0724        0.0713          1.0154
3     5   20.3347   31.3794   0.3544        0.2644          1.3404
3    10   22.3294   34.3906   0.2371        0.1498          1.5828
3    15   24.1779   36.6184   0.1761        0.1294          1.3609
3    20   26.7781   41.1382   0.2310        0.1116          2.0699
3    50   50.0857   68.6434   0.0792        0.0790          1.0025
4    10   27.4727   41.2841   0.1862        0.0895          2.0804
4    15   28.5916   42.3244   0.1857        0.1148          1.6176
4    20   33.8133   49.1822   0.1477        0.0513          2.8791
4    50   50.3264   68.6791   0.0804        0.0758          1.0607
5    15   33.8214   48.7632   0.1617        0.0635          2.5465
5    20   34.4511   50.4554   0.1835        0.0876          2.0947
5    25   35.7151   51.2247   0.1562        0.0768          2.0339
5    30   37.6963   54.5692   0.1117        0.0569          1.9631
5    50   51.2353   69.8996   0.0614        0.0552          1.1123
10   30   67.6144   87.7547   0.0484        0.0241          2.0083
10   40   67.1909   87.0848   0.0619        0.0188          3.2926
10   50   67.6575   88.0682   0.0597        0.0147          4.0612
Population III
r   n_I   E(n_T)    E(ν)      V̂(ȳ_GIS,A)   V̂(ȳ_R,GIS,A)   RE(ȳ_R,GIS,A)
2     5    7.1176   15.9180   59,954.5329   48,557.5114     1.2347
2    10   10.3619   19.4250   31,997.9916   31,002.1094     1.0321
3     5   10.0809   19.3092   44,286.5575   24,500.9244     1.8075
3    10   11.2352   20.6149   27,880.1028   21,633.7318     1.2887
4     8   12.9679   22.6145   31,845.3437   19,473.0968     1.6354
4    10   13.3327   22.7756   23,589.8407   14,800.2591     1.5939
5    10   16.0679   16.0674   20,425.0519   11,219.4746     1.8205
5    15   17.6105   26.4586   17,892.6653   11,709.9055     1.5280
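The relative efficiency column in Table 2 is the ratio of the two estimated variances, RE = V̂(ȳ_GIS,A) / V̂(ȳ_R,GIS,A), so values greater than 1 favor the proposed regression-type estimator. A minimal sketch, using the first Population I row of Table 2 as input (the function name is an illustration):

```python
def relative_efficiency(var_baseline, var_proposed):
    """Ratio of estimated variances; RE > 1 means the proposed
    estimator has the smaller estimated variance."""
    return var_baseline / var_proposed

# First row of Table 2, Population I (r = 2, initial sample size n_I = 5)
re = relative_efficiency(2242.0274, 1359.3310)
print(round(re, 4))  # → 1.6494, matching the table
```

Since every RE value in Table 2 exceeds 1, the proposed estimator outperforms the GI-ACS estimator without auxiliary information in all examined settings.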