Article

A Comparison of Statistical Methods to Construct Confidence Intervals and Fiducial Intervals for Measures of Health Disparities

1
Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University, Washington, DC 20057, USA
2
Department of Oncology, Georgetown University, Washington, DC 20057, USA
*
Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2024, 21(2), 208; https://doi.org/10.3390/ijerph21020208
Submission received: 19 December 2023 / Revised: 4 February 2024 / Accepted: 7 February 2024 / Published: 10 February 2024

Abstract

Health disparities are differences in health status across different socioeconomic groups. Classical methods, e.g., the Delta method, have been used to estimate the standard errors of estimated measures of health disparities and to construct confidence intervals for these measures. However, the confidence intervals constructed using the classical methods do not have good coverage properties for situations involving sparse data. In this article, we introduce three new methods to construct fiducial intervals for measures of health disparities based on approximate fiducial quantities. Through a comprehensive simulation study, we compare the empirical coverage properties of the proposed fiducial intervals against two Monte Carlo simulation-based methods (utilizing either a truncated Normal distribution or the Gamma distribution), as well as the classical method. The findings of the simulation study advocate for the adoption of the Monte Carlo simulation-based method with the Gamma distribution when a unified approach is sought for all health disparity measures.

1. Introduction

In recent years, increasing attention has been given to health equity, one of the goals of Healthy People 2020 [1]. The World Health Organization (WHO) has pointed out the “gap” in health between segments of the population [2]. A health disparity is defined as “a particular type of health difference that is closely linked with social, economic, and/or environmental disadvantage” [1]. The US National Institute on Minority Health and Health Disparities has raised national awareness about the prevalence and impact of health disparities that adversely affect groups of people who are more vulnerable to health-related issues. In addition, the US Centers for Disease Control and Prevention (CDC) has played an important role in identifying the factors that lead to health disparities among racial, ethnic, geographic, and other socioeconomic groups, an example being the 2011 CDC Health Disparities and Inequalities Report [3].
There are multiple measures available to quantify the presence of health disparities across socioeconomic groups. The Health Disparity Calculator (HD*Calc), version 2.0.0, is free statistical software that calculates estimates of commonly used measures of health disparities and constructs corresponding confidence intervals (CIs), using both classical and Monte Carlo simulation (MCS)-based methods [4,5,6]. The measures implemented in HD*Calc belong to three categories: absolute measures, relative measures and pairwise comparison measures. The absolute measures include range difference (RD), between-group variance (BGV), extended absolute concentration index (eACI) and the slope index of inequality (SII). The relative measures include range ratio (RR), index of disparity (IDisp), mean log deviation (MLD), Theil’s index (T), extended relative concentration index (eRCI), relative index of inequality (RII) and the Kunst–Mackenbach relative index (KMI). The pairwise comparison measures include pair difference (PD) and pair ratio (PR). Although HD*Calc was designed to analyze data from the Surveillance, Epidemiology, and End Results (SEER) Program, the software can also be used for other population-based health data.
Two articles have formally evaluated the empirical coverage properties of the methods to construct CIs implemented in HD*Calc. The first article compared the classical method and the MCS-based method using the truncated Normal distribution [7]. The authors concluded that the two methods work well except for situations when the data are sparse. As a general solution to dealing with sparse data, the second article proposed the use of the MCS-based method with the Gamma distribution [8]. The MCS-based method with the Gamma distribution is currently the recommended approach to construct CIs for the measures of health disparities implemented in HD*Calc. The current article extends the work of Krishnamoorthy and Lee 2010 [9] to the case of measures of health disparities. Its aims are to introduce three new methods to construct fiducial intervals for measures of health disparities, based on approximate fiducial quantities, and to compare their frequentist properties, i.e., their empirical coverage performance, with those of existing methods. The comparison uses a simulation study involving nine different scenarios that allow different combinations of sample sizes and true rates per cell (where the cells are all cross-classifications of age groups and socioeconomic groups).
This paper is organized as follows. We review the measures of health disparities implemented in HD*Calc and describe the statistical methods used to construct confidence intervals and fiducial intervals, including the classical method, the MCS-based methods and the proposed new fiducial methods. We describe the simulation study used to evaluate the empirical coverage performance of the intervals constructed using these methods, and, at the end, report and discuss the results of the simulation study. We provide all the results of the simulation study in Appendix A.

2. Materials and Methods

2.1. Background and Notation

In what follows, in contrast to previous work [6,10], we clearly distinguish between functions of parameters and their estimates. We denote by $\lambda_{jk}$ the true rate (e.g., cancer rate) of the k-th age group within the j-th socioeconomic group, $j = 1, \ldots, J$, $k = 1, \ldots, K$. The true age-adjusted rate of the j-th socioeconomic group, $\mu_j$, is defined as
$$\mu_j = \sum_{k=1}^{K} w_k \lambda_{jk},$$
where $w_k$ is the weight for the k-th age group within the j-th socioeconomic group.
To estimate the true rate $\lambda_{jk}$ and the true age-adjusted rate $\mu_j$, we use the unbiased estimators $R_{jk}$ and $Y_j$, respectively. We denote the estimated rate of the k-th age group within the j-th socioeconomic group as
$$R_{jk} = \frac{D_{jk}}{n_{jk}}, \quad j = 1, \ldots, J, \quad k = 1, \ldots, K,$$
where $D_{jk}$ denotes the number of events and $n_{jk}$ denotes the number of persons (or person-years). The estimated age-adjusted rate is
$$Y_j = \sum_{k=1}^{K} w_k R_{jk}.$$
We assume that $D_{jk} \sim \mathrm{Poisson}(n_{jk} \lambda_{jk})$. The variance of the estimator $Y_j$ is
$$\sigma_j^2 = \mathrm{Var}(Y_j) = \sum_{k=1}^{K} w_k^2 \, \mathrm{Var}(R_{jk}) = \sum_{k=1}^{K} \frac{w_k^2}{n_{jk}^2} \mathrm{Var}(D_{jk}) = \sum_{k=1}^{K} \frac{w_k^2}{n_{jk}^2} n_{jk} \lambda_{jk} = \sum_{k=1}^{K} \frac{w_k^2}{n_{jk}} \lambda_{jk}.$$
An unbiased estimate of this variance is
$$\hat{\sigma}_j^2 = \sum_{k=1}^{K} \frac{w_k^2}{n_{jk}^2} D_{jk}.$$
We define the vector of J estimators/estimates as $\mathbf{Y} = (Y_1, \ldots, Y_J)$ and the vector of the J true age-adjusted rates as $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_J)$, where $E(Y_j) = \mu_j$ for $j = 1, \ldots, J$. In what follows, we assume that the J estimators are independent and the estimated variances of these estimators are given by $\hat{\sigma}_j^2$ for $j = 1, \ldots, J$.
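As an illustration of the notation above, the following minimal Python sketch computes the estimated age-adjusted rate $Y_j$ and its estimated variance $\hat{\sigma}_j^2$ for one socioeconomic group. The function name `age_adjusted_rate` and the example counts, person-years and weights are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def age_adjusted_rate(
    events: Sequence[float],
    person_years: Sequence[float],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Return (Y_j, estimated Var(Y_j)) for one socioeconomic group."""
    if not (len(events) == len(person_years) == len(weights)):
        raise ValueError("all inputs need one entry per age group")
    # Y_j = sum_k w_k * D_jk / n_jk
    rate = sum(w * d / n for d, n, w in zip(events, person_years, weights))
    # sigma_hat_j^2 = sum_k (w_k / n_jk)^2 * D_jk
    variance = sum((w / n) ** 2 * d for d, n, w in zip(events, person_years, weights))
    return rate, variance


# Hypothetical data: four age groups within a single socioeconomic group.
y, var_y = age_adjusted_rate(
    events=[2, 5, 12, 30],
    person_years=[1000, 1000, 1000, 1000],
    weights=[0.4, 0.3, 0.2, 0.1],
)
print(y, var_y)
```

The variance line relies on the Poisson assumption stated above, under which $D_{jk}$ is an unbiased estimate of its own variance.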

2.2. Measures of Health Disparities

The (true) measures of health disparities are functions of parameters, $F(\boldsymbol{\mu})$, although they are often not clearly distinguished from their estimates, $F(\mathbf{Y})$, which may lead to confusion. Depending on the function $F(\cdot)$, we obtain different measures of health disparities. In what follows, we will present the measures implemented in HD*Calc. For simplicity, we will refer to the (true) age-adjusted rates simply as (true) rates.

2.2.1. Range Difference (RD) and Pair Difference (PD)

The range difference is the difference between the true rates of the best and the worst socioeconomic groups
$$\mathrm{RD} = \mu_{\max} - \mu_{\min}, \tag{5}$$
where $\mu_{\max} = \max_j \mu_j$ and $\mu_{\min} = \min_j \mu_j$. It is estimated by
$$\widehat{\mathrm{RD}} = Y_{(J)} - Y_{(1)}, \tag{6}$$
where $Y_{(j)}$ is the j-th order statistic of the observed values of $\mathbf{Y}$. This may cause problems as $Y_{(J)}$ and $Y_{(1)}$ may not necessarily be unbiased estimators of $\mu_{\max}$ and $\mu_{\min}$, respectively. To address this issue, we may fix in advance the groups to be compared and consider instead the pair difference
$$\mathrm{PD} = \mu_1 - \mu_2,$$
which has as its estimator
$$\widehat{\mathrm{PD}} = Y_1 - Y_2,$$
where $Y_1$ and $Y_2$ are estimators of $\mu_1$ and $\mu_2$, respectively.

2.2.2. Between-Group Variance (BGV)

BGV is calculated using the squares of the differences between the socioeconomic groups’ rates and the population mean rate, with weighting by the corresponding population share
$$\mathrm{BGV} = \sum_{j=1}^{J} p_j (\mu_j - \bar{\mu})^2,$$
where
$$p_j = \frac{n_j}{\sum_{s=1}^{J} n_s} \tag{10}$$
is the population share of the j-th socioeconomic group (treated as essentially known, i.e., estimated with negligible sampling error), and
$$\bar{\mu} = \sum_{j=1}^{J} p_j \mu_j \tag{11}$$
is the population mean rate. The estimator of BGV is given by
$$\widehat{\mathrm{BGV}} = \sum_{j=1}^{J} p_j (Y_j - \bar{Y})^2, \tag{12}$$
where $\bar{Y} = \sum_{j=1}^{J} p_j Y_j$.

2.2.3. Range Ratio (RR) and Pair Ratio (PR)

The RR is similar to the RD, where we replace the subtraction with division. It is defined as
$$\mathrm{RR} = \frac{\mu_{\max}}{\mu_{\min}},$$
where $\mu_{\max}$ and $\mu_{\min}$ are defined in (5). It is estimated by
$$\widehat{\mathrm{RR}} = \frac{Y_{(J)}}{Y_{(1)}},$$
where $Y_{(1)}$ and $Y_{(J)}$ are defined in (6).
Similarly to PD, PR is defined as
$$\mathrm{PR} = \frac{\mu_1}{\mu_2},$$
and is estimated by
$$\widehat{\mathrm{PR}} = \frac{Y_1}{Y_2},$$
where $Y_1$ and $Y_2$ are estimators of $\mu_1$ and $\mu_2$, respectively.
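The range-based estimators and the BGV estimator defined above reduce to a few lines of arithmetic. The sketch below is a minimal illustration in Python; the function names and the example rates and population shares are hypothetical.

```python
from typing import Sequence


def range_difference(rates: Sequence[float]) -> float:
    """RD estimate: largest minus smallest estimated rate."""
    return max(rates) - min(rates)


def range_ratio(rates: Sequence[float]) -> float:
    """RR estimate: largest divided by smallest estimated rate (needs min > 0)."""
    return max(rates) / min(rates)


def between_group_variance(rates: Sequence[float], shares: Sequence[float]) -> float:
    """BGV estimate: share-weighted squared deviations from the mean rate."""
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    return sum(p * (y - mean_rate) ** 2 for p, y in zip(shares, rates))


# Hypothetical estimated rates for three groups and their population shares.
rates = [10.0, 20.0, 40.0]
shares = [0.5, 0.3, 0.2]
print(range_difference(rates), range_ratio(rates), between_group_variance(rates, shares))
```

The pair difference and pair ratio follow the same pattern with the two groups fixed in advance, i.e., `rates[0] - rates[1]` and `rates[0] / rates[1]`.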

2.2.4. Relative Concentration Index (RCI) and Extended Relative Concentration Index (eRCI)

RCI is a measure that can be used only with ordinal socioeconomic groups. It is defined by Kakwani et al., 1997 [11] as
$$\mathrm{RCI} = \frac{2}{\bar{\mu}} \sum_{j=1}^{J} p_j z_j \mu_j - 1, \tag{17}$$
where $\bar{\mu}$ and $p_j$ are defined in (11) and (10), respectively. Here, $z_j$ is the relative rank of the j-th ordinal socioeconomic group, defined as
$$z_j = \sum_{k=1}^{j} p_k - \frac{1}{2} p_j. \tag{18}$$
RCI is estimated by
$$\widehat{\mathrm{RCI}} = \frac{2}{\bar{Y}} \sum_{j=1}^{J} p_j z_j Y_j - 1,$$
where $\bar{Y}$, $p_j$ and $z_j$ are defined in (12), (10) and (18), respectively.
Yu et al., 2019 [12] used eRCI as a measure of health disparities. It can be calculated as
$$\mathrm{eRCI} = \nu \sum_{j=1}^{J} p_j (1 - z_j)^{\nu - 1} - \frac{\nu}{\bar{\mu}} \sum_{j=1}^{J} p_j \mu_j (1 - z_j)^{\nu - 1},$$
where $\nu > 0$ is the aversion parameter, and $\mu_j$, $\bar{\mu}$, $p_j$ and $z_j$ are the same as in (17). The estimator is
$$\widehat{\mathrm{eRCI}} = \nu \sum_{j=1}^{J} p_j (1 - z_j)^{\nu - 1} - \frac{\nu}{\bar{Y}} \sum_{j=1}^{J} p_j Y_j (1 - z_j)^{\nu - 1}.$$
If $\nu = 2$ we obtain RCI. In this article, we use $\nu = 3$ for eRCI.
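The statement that $\nu = 2$ recovers RCI can be checked numerically. The following Python sketch implements the relative ranks, the RCI estimator and the eRCI estimator as written above. The function names and example data are hypothetical, and groups are assumed to be supplied already ordered from lowest to highest socioeconomic rank.

```python
from typing import List, Sequence


def relative_ranks(shares: Sequence[float]) -> List[float]:
    """z_j = cumulative share up to and including group j, minus half of p_j."""
    ranks, cumulative = [], 0.0
    for p in shares:
        cumulative += p
        ranks.append(cumulative - 0.5 * p)
    return ranks


def rci(rates: Sequence[float], shares: Sequence[float]) -> float:
    """Estimated relative concentration index."""
    z = relative_ranks(shares)
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    return 2.0 / mean_rate * sum(p * zj * y for p, zj, y in zip(shares, z, rates)) - 1.0


def extended_rci(rates: Sequence[float], shares: Sequence[float], nu: float) -> float:
    """Estimated extended relative concentration index with aversion parameter nu."""
    z = relative_ranks(shares)
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    normaliser = nu * sum(p * (1.0 - zj) ** (nu - 1.0) for p, zj in zip(shares, z))
    weighted = nu / mean_rate * sum(
        p * y * (1.0 - zj) ** (nu - 1.0) for p, zj, y in zip(shares, z, rates)
    )
    return normaliser - weighted


rates = [10.0, 20.0, 40.0]   # hypothetical rates, ordered by socioeconomic rank
shares = [0.5, 0.3, 0.2]     # hypothetical population shares
print(rci(rates, shares), extended_rci(rates, shares, nu=2.0), extended_rci(rates, shares, nu=3.0))
```

The agreement at $\nu = 2$ follows because $\sum_j p_j z_j = 1/2$ whenever the shares sum to one. Multiplying either index by the mean rate gives the corresponding absolute index.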

2.2.5. Absolute Concentration Index (ACI) and Extended Absolute Concentration Index (eACI)

ACI is the absolute version of RCI. It has the following formula
$$\mathrm{ACI} = \bar{\mu} \, \mathrm{RCI} = \sum_{j=1}^{J} p_j (2 z_j - 1) \mu_j, \tag{22}$$
which can be estimated by
$$\widehat{\mathrm{ACI}} = \sum_{j=1}^{J} p_j (2 z_j - 1) Y_j,$$
where $p_j$ and $z_j$ are defined in (10) and (18), respectively.
Yu et al. 2019 [12] used eACI as a measure of health disparities. It can be calculated as
$$\mathrm{eACI} = \bar{\mu} \, \mathrm{eRCI} = \nu \bar{\mu} \sum_{j=1}^{J} p_j (1 - z_j)^{\nu - 1} - \nu \sum_{j=1}^{J} p_j \mu_j (1 - z_j)^{\nu - 1},$$
where $\nu > 0$ is the aversion parameter, and $\mu_j$, $\bar{\mu}$, $p_j$ and $z_j$ are the same as in (22). The estimator is
$$\widehat{\mathrm{eACI}} = \nu \bar{Y} \sum_{j=1}^{J} p_j (1 - z_j)^{\nu - 1} - \nu \sum_{j=1}^{J} p_j Y_j (1 - z_j)^{\nu - 1}.$$
If $\nu = 2$ we obtain ACI. In this article, we use $\nu = 3$ for eACI.

2.2.6. Slope Index of Inequality (SII)

SII measures the difference in rates between a hypothetical person with $z_j = 1$ and a hypothetical person with $z_j = 0$. It was introduced by Preston, Haines and Pamuk, 1981 [13] using a simple linear regression model
$$E(Y_j \mid z_j) = \beta_0 + \beta_1 z_j, \tag{26}$$
where $z_j$ is defined in (18) and $\mathrm{SII} = \beta_1$.
Since the regression is run on grouped data, SII is estimated using the least squares weighted by the population shares $p_j$
$$\widehat{\mathrm{SII}} = \frac{\sum_j (p_j z_j Y_j) - \sum_j (p_j z_j) \sum_j (p_j Y_j)}{\sum_j (p_j z_j^2) - \left[ \sum_j (p_j z_j) \right]^2}, \tag{27}$$
where $p_j$ is defined in (10).
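The weighted least-squares slope above can be computed directly from the rates and population shares. The sketch below is a minimal Python illustration. The function names are hypothetical, and the example rates are constructed to lie exactly on the line $5 + 10 z_j$ so that the recovered slope can be checked by eye.

```python
from typing import List, Sequence


def relative_ranks(shares: Sequence[float]) -> List[float]:
    """z_j = cumulative share up to and including group j, minus half of p_j."""
    ranks, cumulative = [], 0.0
    for p in shares:
        cumulative += p
        ranks.append(cumulative - 0.5 * p)
    return ranks


def slope_index_of_inequality(rates: Sequence[float], shares: Sequence[float]) -> float:
    """Share-weighted least-squares slope of the rates on the relative ranks."""
    z = relative_ranks(shares)
    mean_z = sum(p * zj for p, zj in zip(shares, z))
    mean_y = sum(p * y for p, y in zip(shares, rates))
    mean_zy = sum(p * zj * y for p, zj, y in zip(shares, z, rates))
    mean_zz = sum(p * zj * zj for p, zj in zip(shares, z))
    return (mean_zy - mean_z * mean_y) / (mean_zz - mean_z ** 2)


shares = [0.5, 0.3, 0.2]                              # hypothetical shares
rates = [5.0 + 10.0 * z for z in relative_ranks(shares)]  # exactly linear in z
print(slope_index_of_inequality(rates, shares))
```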

2.2.7. Index of Disparity (IDisp)

The index of disparity (IDisp) measures the relative difference between the rates of the socioeconomic groups and a reference rate as a proportion of the reference rate. It was first introduced by Pearcy and Keppel, 2002 [14] as
$$\mathrm{IDisp}_{\mathrm{PK}} = \frac{1}{J} \sum_{j=1}^{J} \frac{|\mu_j - \bar{\mu}|}{\bar{\mu}} \times 100.$$
A variant of IDisp replaces the population mean rate, $\bar{\mu}$, with the rate of a reference group, $\mu_{\mathrm{ref}}$, which gives
$$\mathrm{IDisp} = \frac{1}{J - 1} \sum_{j=1,\, j \neq \mathrm{ref}}^{J} \frac{|\mu_j - \mu_{\mathrm{ref}}|}{\mu_{\mathrm{ref}}} \times 100.$$
The corresponding estimator is
$$\widehat{\mathrm{IDisp}} = \frac{1}{J - 1} \sum_{j=1,\, j \neq \mathrm{ref}}^{J} \frac{|Y_j - Y_{\mathrm{ref}}|}{Y_{\mathrm{ref}}} \times 100,$$
where $Y_{\mathrm{ref}}$ is the estimator of $\mu_{\mathrm{ref}}$. To eliminate the absolute values from the formula, HD*Calc recommends the use of the group with the smallest rate as the reference group.
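A minimal Python sketch of the reference-group version of the estimator is given below. The function name and example rates are hypothetical, and the default of using the group with the smallest estimated rate as the reference mirrors the recommendation quoted above.

```python
from typing import Optional, Sequence


def index_of_disparity(rates: Sequence[float], ref: Optional[int] = None) -> float:
    """Mean relative deviation (in %) of the non-reference groups from the reference."""
    if ref is None:
        # Default: the group with the smallest estimated rate is the reference.
        ref = min(range(len(rates)), key=lambda j: rates[j])
    reference_rate = rates[ref]
    deviations = [
        abs(y - reference_rate) / reference_rate
        for j, y in enumerate(rates)
        if j != ref
    ]
    return 100.0 * sum(deviations) / (len(rates) - 1)


print(index_of_disparity([10.0, 20.0, 40.0]))  # hypothetical estimated rates
```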

2.2.8. Mean Log Deviation (MLD)

MLD is defined as
$$\mathrm{MLD} = -\sum_{j=1}^{J} p_j \log \gamma_j = \log(\bar{\mu}) - \sum_{j=1}^{J} p_j \log(\mu_j),$$
where $\bar{\mu}$ and $p_j$ are defined in (11) and (10), respectively, and
$$\gamma_j = \frac{\mu_j}{\bar{\mu}} \tag{32}$$
is the ratio of the rate of the j-th socioeconomic group and the population mean rate. It is estimated by
$$\widehat{\mathrm{MLD}} = \log(\bar{Y}) - \sum_{j=1}^{J} p_j \log(Y_j).$$

2.2.9. Theil’s Index (T)

T is similar to MLD but it uses a different disproportionality function. It is defined as
$$T = \sum_{j=1}^{J} p_j \gamma_j \log(\gamma_j),$$
where $p_j$ and $\gamma_j$ are defined in (10) and (32), respectively. It is estimated by
$$\hat{T} = \sum_{j=1}^{J} p_j \frac{Y_j}{\bar{Y}} \log\left( \frac{Y_j}{\bar{Y}} \right).$$
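The two entropy-type estimators differ only in how the log rate ratio is weighted, which the following Python sketch makes explicit. The function names and example data are hypothetical; all rates are assumed to be strictly positive so that the logarithms exist.

```python
import math
from typing import Sequence


def mean_log_deviation(rates: Sequence[float], shares: Sequence[float]) -> float:
    """MLD estimate: log of the mean rate minus the share-weighted mean log rate."""
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    return math.log(mean_rate) - sum(p * math.log(y) for p, y in zip(shares, rates))


def theil_index(rates: Sequence[float], shares: Sequence[float]) -> float:
    """Theil's index estimate: share-weighted sum of gamma_j * log(gamma_j)."""
    mean_rate = sum(p * y for p, y in zip(shares, rates))
    return sum(
        p * (y / mean_rate) * math.log(y / mean_rate) for p, y in zip(shares, rates)
    )


rates = [10.0, 20.0, 40.0]   # hypothetical estimated rates
shares = [0.5, 0.3, 0.2]     # hypothetical population shares
print(mean_log_deviation(rates, shares), theil_index(rates, shares))
```

Both quantities are zero when all groups share the same rate and positive otherwise, provided the shares sum to one.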

2.2.10. Relative Index of Inequality (RII)

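RII is obtained by dividing SII by the population mean rate [15]
$$\mathrm{RII} = \frac{\mathrm{SII}}{\bar{\mu}} = \frac{\beta_1}{\bar{\mu}},$$
where $\bar{\mu}$ and $\beta_1$ are defined in (11) and (26), respectively. It is estimated by
$$\widehat{\mathrm{RII}} = \frac{1}{\sum_j (p_j z_j^2) - \left[ \sum_j (p_j z_j) \right]^2} \left[ \frac{\sum_j (p_j z_j Y_j)}{\bar{Y}} - \sum_j (p_j z_j) \right].$$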

2.2.11. Kunst–Mackenbach Relative Index (KMI)

Mackenbach and Kunst, 1997 [16] proposed an alternative to RII by dividing the rate of a hypothetical person with $z_j = 0$ by the rate of a hypothetical person with $z_j = 1$
$$\mathrm{KMI} = \frac{\beta_0}{\beta_0 + \beta_1},$$
where $\beta_0$ and $\beta_1$ are defined in (26). It is estimated by
$$\widehat{\mathrm{KMI}} = \frac{\hat{\beta}_0}{\hat{\beta}_0 + \widehat{\mathrm{SII}}},$$
where $\widehat{\mathrm{SII}}$ is calculated in (27) and $\hat{\beta}_0$ can be obtained as
$$\hat{\beta}_0 = \bar{Y} - \widehat{\mathrm{SII}} \times \bar{z},$$
where $\bar{Y}$ is defined in (12) and
$$\bar{z} = \sum_{j=1}^{J} p_j z_j.$$
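Once the estimated slope is available, the two relative regression-based indices follow from simple arithmetic. The Python sketch below takes the estimated SII, the mean rate and the mean relative rank as inputs; the function name and the example values are hypothetical.

```python
from typing import Tuple


def rii_and_kmi(sii_hat: float, mean_rate: float, mean_rank: float) -> Tuple[float, float]:
    """Return (RII estimate, KMI estimate) from the fitted regression line."""
    # Intercept of the weighted regression line: beta0_hat = Y_bar - SII_hat * z_bar.
    intercept = mean_rate - sii_hat * mean_rank
    rii_hat = sii_hat / mean_rate
    kmi_hat = intercept / (intercept + sii_hat)
    return rii_hat, kmi_hat


# Hypothetical fitted line 5 + 10 z with mean rank 0.5, hence mean rate 10.
print(rii_and_kmi(sii_hat=10.0, mean_rate=10.0, mean_rank=0.5))
```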

2.3. Confidence Intervals Based on the Classical Method

The classical method used for variance estimation for the majority of the measures of health disparities implemented in HD*Calc is the Delta method. If $\hat{\theta} = F(\mathbf{Y})$ is an estimator of the true measure of health disparities $\theta$, we approximate $F$ by using a first-order Taylor series approximation around $\boldsymbol{\mu}$ and then
$$\mathrm{Var}(\hat{\theta}) \approx \mathrm{Var}\left( \sum_{j=1}^{J} \left. \frac{\partial F}{\partial y_j} \right|_{y_j = \mu_j} (Y_j - \mu_j) \right),$$
where $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_J)$ is the mean of $\mathbf{Y}$. Assuming that the J socioeconomic groups are independent, we obtain
$$\mathrm{Var}(\hat{\theta}) \approx \sum_{j=1}^{J} \left( \left. \frac{\partial F}{\partial y_j} \right|_{y_j = \mu_j} \right)^2 \sigma_j^2,$$
where $\boldsymbol{\sigma}^2 = (\sigma_1^2, \ldots, \sigma_J^2)$ is the main diagonal of the variance-covariance matrix of $\mathbf{Y}$. We substitute the unknown parameters $\mu_j$ and $\sigma_j^2$ with their estimates to obtain $\widehat{\mathrm{Var}}(\hat{\theta})$, and then construct corresponding Wald confidence intervals for $\theta$. Detailed derivations of the formulas for the estimated variances may be found in Ahn et al., 2018 [7] for 11 of the 15 measures of health disparities implemented in HD*Calc (all measures except eACI, eRCI, PD and PR) and on the HD*Calc website [4] for all 15 measures.
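To make the Delta method concrete, the Python sketch below approximates the partial derivatives by central finite differences and then forms a Wald interval. This numerical differentiation is an assumption made here for brevity; the derivations cited above are analytic. The function names, the relative step size and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Callable, List, Sequence, Tuple


def delta_method_variance(
    measure: Callable[[Sequence[float]], float],
    rates: Sequence[float],
    variances: Sequence[float],
    rel_step: float = 1e-6,
) -> float:
    """Approximate Var(F(Y)) by sum_j (dF/dy_j)^2 * sigma_j^2 at the estimates."""
    total = 0.0
    for j, (y, var) in enumerate(zip(rates, variances)):
        h = rel_step * max(abs(y), 1e-12)
        up: List[float] = list(rates)
        down: List[float] = list(rates)
        up[j] = y + h
        down[j] = y - h
        slope = (measure(up) - measure(down)) / (2.0 * h)  # central difference
        total += slope ** 2 * var
    return total


def wald_interval(estimate: float, variance: float, level: float = 0.95) -> Tuple[float, float]:
    """Symmetric Normal-theory interval: estimate +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


def pair_ratio(rates: Sequence[float]) -> float:
    return rates[0] / rates[1]


rates, variances = [20.0, 10.0], [4.0, 1.0]  # hypothetical estimates
var_pr = delta_method_variance(pair_ratio, rates, variances)
print(var_pr, wald_interval(pair_ratio(rates), var_pr))
```

For the pair ratio the analytic result is $\sigma_1^2 / \mu_2^2 + \mu_1^2 \sigma_2^2 / \mu_2^4$, which the finite-difference version should reproduce closely.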

2.4. Fiducial Intervals

In this section, we describe new methods to construct fiducial intervals for measures of health disparities based on the use of approximate fiducial quantities. Fiducial inference is an approach to inference introduced by Fisher that has good frequentist properties [17,18].

2.4.1. Fiducial Quantities (FQs)

Following Krishnamoorthy and Lee, 2010 [9], for an observed value $m_{jk}$ of the number of events $D_{jk}$, we have the equalities
$$\Pr(D_{jk} \ge m_{jk} \mid \lambda_{jk}) = \Pr\left( \frac{\chi^2_{2 m_{jk}}}{2 n_{jk}} < \lambda_{jk} \,\Big|\, m_{jk} \right),$$
and
$$\Pr(D_{jk} \le m_{jk} \mid \lambda_{jk}) = \Pr\left( \frac{\chi^2_{2 m_{jk} + 2}}{2 n_{jk}} > \lambda_{jk} \,\Big|\, m_{jk} \right),$$
where $\chi^2_d$ is a random variable following a chi-squared distribution with d degrees of freedom. Garwood 1936 [19] proposed a related exact confidence interval for a Poisson mean
$$\left( \frac{1}{2 n_{jk}} \chi^2_{2 m_{jk};\, \alpha/2}, \; \frac{1}{2 n_{jk}} \chi^2_{2 m_{jk} + 2;\, 1 - \alpha/2} \right).$$
Cox 1953 [20] introduced an approximate FQ for $\lambda_{jk}$, $\chi^2_{2 m_{jk} + 1} / (2 n_{jk})$. A related approximate fiducial interval is
$$\left( \frac{1}{2 n_{jk}} \chi^2_{2 m_{jk} + 1;\, \alpha/2}, \; \frac{1}{2 n_{jk}} \chi^2_{2 m_{jk} + 1;\, 1 - \alpha/2} \right).$$
Dempster 2008 [21] proposed another approximate FQ for $\lambda_{jk}$, a 50-50 mixture of $\chi^2_{2 m_{jk}} / (2 n_{jk})$ and $\chi^2_{2 m_{jk} + 2} / (2 n_{jk})$.
An approximate FQ for a function of the $\lambda$'s may be obtained by replacing the $\lambda$'s with their FQs in the function [18]. In our case, each measure of health disparities can be expressed as a function $h(\cdot)$ of the $\lambda_{jk}$'s, and an approximate FQ for
$$h(\lambda_{11}, \ldots, \lambda_{1K}; \ldots; \lambda_{J1}, \ldots, \lambda_{JK}),$$
is obtained as
$$h(\hat{\lambda}_{11}, \ldots, \hat{\lambda}_{1K}; \ldots; \hat{\lambda}_{J1}, \ldots, \hat{\lambda}_{JK}),$$
where $\hat{\lambda}_{jk}$ is an approximate FQ of $\lambda_{jk}$.

2.4.2. Simulation-Based Methods to Construct Fiducial Intervals

We use the above approximate FQs to construct three different fiducial intervals (FIs):
FI1. Simulate $\hat{\lambda}_{jk}$ from $\chi^2_{2 m_{jk} + 1} / (2 n_{jk})$;
FI2. Simulate $\hat{\lambda}_{jk}$ from either $\chi^2_{2 m_{jk}} / (2 n_{jk})$ or $\chi^2_{2 m_{jk} + 2} / (2 n_{jk})$, each with a 50% probability;
FI3. Simulate $\hat{\lambda}_{jk}$ from both $\chi^2_{2 m_{jk}} / (2 n_{jk})$ and $\chi^2_{2 m_{jk} + 2} / (2 n_{jk})$.
For each method, we plug in the simulated $\hat{\lambda}_{jk}$'s into the function $h(\cdot)$ to obtain the simulated values of the measures of health disparities $h(\hat{\lambda}_{11}, \ldots, \hat{\lambda}_{1K}; \ldots; \hat{\lambda}_{J1}, \ldots, \hat{\lambda}_{JK})$. After performing B simulations, a 95% FI is constructed using the 2.5 and 97.5 percentiles of the set of simulated values for the measures of health disparities. For cells where no event is observed, i.e., $m_{jk} = 0$, we follow Zhang et al. 2014 [22] and use
$$\hat{\lambda}_{jk} = \frac{1 / n_{jk}}{n_{jk} + 1}. \tag{49}$$
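The simulation steps for FI1 and FI2 can be sketched in Python using only the standard library, relying on the fact that a chi-squared variable with d degrees of freedom divided by $2n$ follows a Gamma distribution with shape $d/2$ and scale $1/n$. The function names, argument layout and example data are hypothetical. The sketch assumes that the measure is a function of the age-adjusted rates $\sum_k w_k \hat{\lambda}_{jk}$ and that the zero-count value (49) is used as a fixed value for empty cells under both methods; FI3 is not sketched because the text does not spell out how its two sets of draws are combined.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def draw_cell_rate(m: int, n: float, method: str, rng: random.Random) -> float:
    """Draw one approximate fiducial value for a cell rate lambda_jk."""
    if m == 0:
        # Zero-count adjustment (49): (1 / n) / (n + 1). Assumed fixed, not random.
        return (1.0 / n) / (n + 1.0)
    if method == "FI1":
        df = 2 * m + 1
    elif method == "FI2":
        df = 2 * m if rng.random() < 0.5 else 2 * m + 2
    else:
        raise ValueError("method must be 'FI1' or 'FI2'")
    # chi-squared(df) / (2n) is Gamma(shape=df/2, scale=1/n).
    return rng.gammavariate(df / 2.0, 1.0 / n)


def fiducial_interval(
    events: Sequence[Sequence[int]],          # m_jk, one row per socioeconomic group
    person_years: Sequence[Sequence[float]],  # n_jk, same layout
    weights: Sequence[float],                 # w_k, one per age group
    measure: Callable[[Sequence[float]], float],
    method: str = "FI1",
    n_sim: int = 5000,
    seed: int = 0,
) -> Tuple[float, float]:
    """95% interval from the 2.5 and 97.5 percentiles of the simulated measure."""
    rng = random.Random(seed)
    simulated: List[float] = []
    for _ in range(n_sim):
        rates = [
            sum(w * draw_cell_rate(m, n, method, rng) for m, n, w in zip(m_row, n_row, weights))
            for m_row, n_row in zip(events, person_years)
        ]
        simulated.append(measure(rates))
    cuts = statistics.quantiles(simulated, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]


# Hypothetical data: two groups, two age groups each.
events = [[20, 30], [40, 60]]
person_years = [[1000.0, 1000.0], [1000.0, 1000.0]]
print(fiducial_interval(events, person_years, [0.5, 0.5], lambda r: r[0] - r[1], "FI2"))
```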

2.5. Monte Carlo Simulation-Based Methods (MCS)

For the Monte Carlo simulation-based methods, we simulate values for the age-adjusted rates, $\mu_j$, instead of values for the cell rates $\lambda_{jk}$, as performed for the previously described fiducial methods, either from a truncated Normal distribution (MCS-N) or a Gamma distribution (MCS-G). The mean and the variance of the distribution from which we simulate values are the estimated mean and the estimated variance of the estimator of $\mu_j$. The use of these two distributions ensures that all simulated values are non-negative. When using the truncated Normal distribution, we simulate from a Normal distribution and discard the negative simulated values, i.e., keep only the non-negative simulated values. The adjustment (49) for zero counts is also applied. After we simulate values for $\hat{\mu}_j$, we use them to calculate the simulated values for the measures of health disparities. After performing B simulations, the 95% CI is constructed using the 2.5 and 97.5 percentiles of the set of simulated values for the measures of health disparities.
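A minimal Python sketch of the two MCS variants is given below. The Gamma distribution is matched to the estimated mean $Y_j$ and variance $\hat{\sigma}_j^2$ through shape $Y_j^2 / \hat{\sigma}_j^2$ and scale $\hat{\sigma}_j^2 / Y_j$, which is one natural moment-matching reading of the description above and is an assumption here. The function names and example inputs are hypothetical, and strictly positive estimated rates and variances are assumed.

```python
import math
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def draw_rate(mean: float, variance: float, dist: str, rng: random.Random) -> float:
    """Draw one simulated age-adjusted rate with the given mean and variance."""
    if dist == "gamma":
        # Moment matching: shape * scale = mean, shape * scale^2 = variance.
        return rng.gammavariate(mean * mean / variance, variance / mean)
    if dist == "normal":
        # Truncated Normal by rejection: discard negative draws.
        while True:
            value = rng.gauss(mean, math.sqrt(variance))
            if value >= 0.0:
                return value
    raise ValueError("dist must be 'gamma' or 'normal'")


def mcs_interval(
    rates: Sequence[float],
    variances: Sequence[float],
    measure: Callable[[Sequence[float]], float],
    dist: str = "gamma",
    n_sim: int = 5000,
    seed: int = 0,
) -> Tuple[float, float]:
    """95% interval from the 2.5 and 97.5 percentiles of the simulated measure."""
    rng = random.Random(seed)
    simulated: List[float] = [
        measure([draw_rate(y, v, dist, rng) for y, v in zip(rates, variances)])
        for _ in range(n_sim)
    ]
    cuts = statistics.quantiles(simulated, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]


# Hypothetical estimated rates and variances for two groups; measure is the pair ratio.
print(mcs_interval([0.025, 0.05], [1.25e-5, 2.5e-5], lambda r: r[0] / r[1], "gamma"))
```

Unlike the truncated Normal, the Gamma draw needs no rejection step, since its support is already the positive half-line.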

2.6. Simulation Study

We simulated data under nine different scenarios to allow different combinations of sample sizes and true rates per cell (where the cells are all cross-classifications of age groups and socioeconomic groups). For each scenario, we simulated data for the 12 cells that correspond to the combinations of three ordered socioeconomic groups and four age groups. Fixed weights, according to the WHO World Standard, were applied to each age group. Table 1 describes the characteristics of the nine scenarios, with the means and standard deviations (SDs) being calculated across the 12 cells. The combinations of sample sizes and true rates per cell resulted in five categories for the magnitude of the expected count per cell, i.e., <1, 1–9, 10–99, 100–999, and 1000–9999.
For each scenario, we generated 5000 datasets, and for each dataset, we used 5000 simulations to construct the 95% MCS-based CIs and the 95% FIs. The empirical coverage was defined as the frequency of the true value of the measure of health disparities being covered by the nominal 95% CIs or FIs.
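The definition of empirical coverage can be illustrated with a small Python sketch that repeatedly simulates Poisson counts, builds a nominal 95% Wald interval for the pair difference, and records how often the true value is covered. This is only an illustration of the bookkeeping: the true rates, person-years, weights and number of datasets are hypothetical and do not correspond to any of the nine scenarios, and the Poisson sampler is a simple multiplication method suited to small expected counts.

```python
import math
import random
from statistics import NormalDist
from typing import Sequence


def poisson_draw(mean: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def coverage_of_wald_pd(
    true_rates: Sequence[Sequence[float]],    # lambda_jk for two groups
    person_years: Sequence[Sequence[float]],  # n_jk, same layout
    weights: Sequence[float],                 # w_k
    n_datasets: int = 2000,
    seed: int = 0,
) -> float:
    """Fraction of simulated datasets whose 95% Wald interval covers the true PD."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    true_mu = [sum(w * lam for w, lam in zip(weights, row)) for row in true_rates]
    true_pd = true_mu[0] - true_mu[1]
    hits = 0
    for _ in range(n_datasets):
        estimates, variances = [], []
        for lam_row, n_row in zip(true_rates, person_years):
            counts = [poisson_draw(n * lam, rng) for lam, n in zip(lam_row, n_row)]
            estimates.append(sum(w * d / n for w, d, n in zip(weights, counts, n_row)))
            variances.append(sum((w / n) ** 2 * d for w, d, n in zip(weights, counts, n_row)))
        pd_hat = estimates[0] - estimates[1]
        half_width = z * math.sqrt(variances[0] + variances[1])
        hits += pd_hat - half_width <= true_pd <= pd_hat + half_width
    return hits / n_datasets


# Hypothetical non-sparse setting: expected counts of 40 to 60 per cell.
rates = [[0.04, 0.05], [0.05, 0.06]]
exposure = [[1000.0, 1000.0], [1000.0, 1000.0]]
print(coverage_of_wald_pd(rates, exposure, [0.5, 0.5]))
```

Replacing the Wald interval inside the loop with any other interval constructor yields the empirical coverage of that method under the same simulated datasets.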

3. Results

We start with the results for scenario 1, which corresponds to a situation involving extremely sparse data, i.e., where the expected count per cell is below 1. The empirical coverage results are presented in Table 2 and Figure 1. For eACI, the classical method, FI1 and FI2 had empirical coverages considerably below the nominal 95% level; FI1 and FI2 had the same problem for eRCI. The MCS-N method had only about 91% empirical coverage for PD, while FI3 had very large empirical coverage ranging from 99% to 100% for 11 of the 15 measures. By contrast, the MCS-G method performed reasonably well for all 15 measures for this scenario.
The results for scenarios 2 and 3 were very similar to each other. They both correspond to situations involving sparse (but not extremely sparse) data, where the expected count per cell is between 1 and 10. The empirical coverage results are shown in Table 3 and Figure 2 for scenario 2, and in Table 4 and Figure 3 for scenario 3. For both scenarios, the classical method still had empirical coverages considerably below the nominal 95% level for eACI. FI1 and FI2 had the same problem for eACI, but to a much lesser extent, with empirical coverages of about 92%. Overall, FI3 performed best for the 15 measures, followed closely by MCS-G and MCS-N.
For scenarios 4 to 9, where the data are not considered sparse because the expected count per cell is 10 or more, the empirical coverages are between 94% and 96% for all methods and all 15 measures, except for the classical method for eACI (where they ranged from 79% to 83%) and eRCI (where they were all 100%). For these scenarios, all methods except the classical method performed well. The complete results regarding the empirical coverage are shown in Appendix A.

4. Discussion

We compared six methods to construct confidence intervals and fiducial intervals for 15 measures of health disparities with regard to their empirical coverage under nine different scenarios. Overall, two methods performed well: the MCS-G method to construct confidence intervals and the FI3 method to construct fiducial intervals. It is important to note that the documentation for HD*Calc version 2.0.0 also recommends the use of the Monte Carlo simulation-based method with the Gamma distribution based on the results from Ahn et al., 2019 [8] regarding 11 measures of health disparities. Compared to the Normal distribution, the Gamma distribution is a better choice to use for simulating rates due to its positivity. Moreover, its flexibility in accommodating asymmetry surpasses that of a truncated Normal distribution.
The strengths of the current study include the addition of four measures (eACI, eRCI, PD and PR) to the list of 11 measures of health disparities previously investigated, and the consideration of different scenarios corresponding to different combinations of sample sizes and true rates per cell. The limitations, due to feasibility reasons, include the consideration of eACI and eRCI only when the aversion parameter ν = 3 , the use of only one value for the number of simulations used for the MCS-based methods and the fiducial methods, i.e., 5000 simulations, the use of an ordinal socioeconomic group variable with only three levels, and the use of an age group variable with only four levels.
Future research work should consider eACI and eRCI with other values of the aversion parameter, a larger number of socioeconomic groups and age groups, and different numbers of simulations for the MCS-based methods and the fiducial methods. Building upon the work from Talih et al., 2020 [23], related future research should also investigate if it is possible to reduce a large number of measures of health disparities to a smaller set of measures that satisfy a set of desirable properties and are easier to interpret. With a smaller number of recommended measures of health disparities, it would be easier to thoroughly compare the performance of statistical methods to construct confidence intervals and fiducial intervals for these selected measures.

5. Conclusions

Given that the MCS-G method is much simpler to understand and implement than the FI3 method, and the lack of familiarity of statisticians and (more importantly) practitioners with fiducial methods and fiducial intervals, we recommend the use of the Monte Carlo simulation-based method with the Gamma distribution.

Author Contributions

Conceptualization, T.L., A.D.D. and G.L.; methodology, T.L. and G.L.; software, T.L.; formal analysis, T.L.; investigation, T.L., A.D.D. and G.L.; writing—original draft preparation, T.L.; writing—review and editing, A.D.D. and G.L.; visualization, T.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The code used to simulate the data for this study is available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACI: absolute concentration index
BGV: between-group variance
CDC: Centers for Disease Control and Prevention
CI: confidence interval
eACI: extended absolute concentration index
eRCI: extended relative concentration index
FI: fiducial interval
FQ: fiducial quantity
HD*Calc: Health Disparity Calculator
IDisp: index of disparity
KMI: Kunst-Mackenbach relative index
MCS: Monte Carlo simulation
MCS-N: Monte Carlo simulation-based using a truncated Normal distribution
MCS-G: Monte Carlo simulation-based using the Gamma distribution
MLD: mean log deviation
PD: pair difference
PR: pair ratio
RCI: relative concentration index
RD: range difference
RII: relative index of inequality
RR: range ratio
SEER: Surveillance, Epidemiology, and End Results
SII: slope index of inequality
T: Theil’s index
WHO: World Health Organization

Appendix A. Results of the Simulation Study

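Table A1. Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for all scenarios.

| Measure | Scenario | Classical | MCS-N | MCS-G | FI1 | FI2 | FI3 |
|---|---|---|---|---|---|---|---|
| PD | 1 | 93.28 | 90.86 | 94.12 | 98.30 | 99.32 | 98.60 |
| PD | 2 | 95.28 | 95.38 | 95.26 | 95.92 | 96.10 | 95.90 |
| PD | 3 | 95.28 | 95.38 | 95.26 | 95.92 | 96.10 | 95.90 |
| PD | 4 | 95.28 | 95.32 | 95.40 | 95.40 | 95.20 | 95.30 |
| PD | 5 | 95.28 | 95.32 | 95.40 | 95.40 | 95.20 | 95.30 |
| PD | 6 | 95.28 | 95.32 | 95.40 | 95.40 | 95.20 | 95.30 |
| PD | 7 | 94.66 | 94.74 | 94.78 | 94.74 | 94.76 | 94.72 |
| PD | 8 | 94.66 | 94.74 | 94.78 | 94.74 | 94.76 | 94.72 |
| PD | 9 | 94.80 | 94.78 | 94.80 | 94.78 | 94.78 | 94.92 |
| RD | 1 | 97.46 | 97.00 | 97.74 | 99.30 | 99.38 | 99.26 |
| RD | 2 | 95.70 | 95.76 | 95.26 | 96.08 | 96.24 | 96.06 |
| RD | 3 | 95.70 | 95.76 | 95.26 | 96.08 | 96.24 | 96.06 |
| RD | 4 | 95.28 | 95.36 | 95.40 | 95.40 | 95.18 | 95.28 |
| RD | 5 | 95.28 | 95.36 | 95.40 | 95.40 | 95.18 | 95.28 |
| RD | 6 | 95.28 | 95.36 | 95.40 | 95.40 | 95.18 | 95.28 |
| RD | 7 | 94.66 | 94.74 | 94.78 | 94.74 | 94.76 | 94.72 |
| RD | 8 | 94.66 | 94.74 | 94.78 | 94.74 | 94.76 | 94.72 |
| RD | 9 | 94.80 | 94.78 | 94.80 | 94.78 | 94.78 | 94.92 |
| BGV | 1 | 94.26 | 96.74 | 97.94 | 99.42 | 99.46 | 99.38 |
| BGV | 2 | 94.08 | 95.60 | 95.18 | 95.98 | 96.28 | 95.98 |
| BGV | 3 | 94.08 | 95.60 | 95.18 | 95.98 | 96.28 | 95.98 |
| BGV | 4 | 95.06 | 95.06 | 95.12 | 95.10 | 95.12 | 95.08 |
| BGV | 5 | 95.06 | 95.06 | 95.12 | 95.10 | 95.12 | 95.08 |
| BGV | 6 | 95.06 | 95.06 | 95.12 | 95.10 | 95.12 | 95.08 |
| BGV | 7 | 95.24 | 95.40 | 95.18 | 95.36 | 95.30 | 95.34 |
| BGV | 8 | 95.24 | 95.40 | 95.18 | 95.36 | 95.30 | 95.34 |
| BGV | 9 | 94.80 | 94.76 | 94.88 | 94.86 | 94.84 | 94.82 |
| ACI | 1 | 99.64 | 98.34 | 97.72 | 99.98 | 100.0 | 99.96 |
| ACI | 2 | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| ACI | 3 | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| ACI | 4 | 95.40 | 95.48 | 95.12 | 95.40 | 95.56 | 95.40 |
| ACI | 5 | 95.40 | 95.48 | 95.12 | 95.40 | 95.56 | 95.40 |
| ACI | 6 | 95.40 | 95.48 | 95.12 | 95.40 | 95.56 | 95.40 |
| ACI | 7 | 94.54 | 94.48 | 94.54 | 94.68 | 94.42 | 94.54 |
| ACI | 8 | 94.54 | 94.48 | 94.54 | 94.68 | 94.42 | 94.54 |
| ACI | 9 | 94.62 | 94.70 | 94.72 | 94.60 | 94.62 | 94.64 |
| eACI | 1 | 82.82 | 95.04 | 95.08 | 52.20 | 64.10 | 94.52 |
| eACI | 2 | 78.94 | 94.60 | 94.82 | 91.80 | 92.36 | 95.68 |
| eACI | 3 | 78.70 | 94.58 | 94.84 | 91.80 | 92.34 | 95.66 |
| eACI | 4 | 81.46 | 94.42 | 94.60 | 94.10 | 94.20 | 94.40 |
| eACI | 5 | 79.38 | 94.50 | 94.62 | 94.06 | 94.10 | 94.40 |
| eACI | 6 | 79.24 | 94.52 | 94.62 | 94.04 | 94.12 | 94.40 |
| eACI | 7 | 81.78 | 95.28 | 95.20 | 95.24 | 95.36 | 95.26 |
| eACI | 8 | 80.14 | 95.22 | 95.20 | 95.24 | 95.32 | 95.30 |
| eACI | 9 | 83.10 | 95.54 | 95.54 | 95.56 | 95.52 | 95.46 |
| SII | 1 | 99.64 | 98.34 | 97.72 | 99.98 | 100.0 | 99.96 |
| SII | 2 | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| SII | 3 | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| SII | 4 | 95.40 | 95.48 | 95.12 | 95.40 | 95.56 | 95.40 |
| SII | 5 | 95.40 | 95.48 | 95.12 | 95.40 | 95.56 | 95.40 |
| SII | 6 | 95.40 | 95.48 | 95.12 | 95.40 | 95.56 | 95.40 |
| SII | 7 | 94.54 | 94.48 | 94.54 | 94.68 | 94.42 | 94.54 |
| SII | 8 | 94.54 | 94.48 | 94.54 | 94.68 | 94.42 | 94.54 |
| SII | 9 | 94.62 | 94.70 | 94.72 | 94.60 | 94.62 | 94.64 |
| PR | 1 | 96.62 | 99.04 | 97.66 | 91.12 | 95.56 | 95.64 |
| PR | 2 | 95.52 | 94.14 | 94.30 | 94.72 | 95.42 | 96.10 |
| PR | 3 | 95.52 | 94.14 | 94.30 | 94.72 | 95.42 | 96.10 |
| PR | 4 | 95.66 | 95.64 | 95.44 | 95.36 | 95.62 | 95.62 |
| PR | 5 | 95.66 | 95.64 | 95.44 | 95.36 | 95.62 | 95.62 |
| PR | 6 | 95.66 | 95.64 | 95.44 | 95.36 | 95.62 | 95.62 |
| PR | 7 | 94.80 | 94.88 | 94.80 | 94.68 | 94.80 | 94.84 |
| PR | 8 | 94.80 | 94.88 | 94.80 | 94.68 | 94.80 | 94.84 |
| PR | 9 | 94.94 | 94.84 | 94.82 | 94.92 | 94.98 | 94.98 |
| RR | 1 | 100.0 | 99.36 | 97.76 | 100.0 | 100.0 | 100.0 |
| RR | 2 | 97.08 | 93.34 | 93.80 | 97.04 | 97.64 | 98.36 |
| RR | 3 | 97.08 | 93.34 | 93.80 | 97.04 | 97.64 | 98.36 |
| RR | 4 | 95.68 | 95.70 | 95.54 | 95.44 | 95.70 | 95.68 |
| RR | 5 | 95.68 | 95.70 | 95.54 | 95.44 | 95.70 | 95.68 |
| RR | 6 | 95.68 | 95.70 | 95.54 | 95.44 | 95.70 | 95.68 |
| RR | 7 | 94.80 | 94.88 | 94.80 | 94.68 | 94.80 | 94.84 |
| RR | 8 | 94.80 | 94.88 | 94.80 | 94.68 | 94.80 | 94.84 |
| RR | 9 | 94.94 | 94.84 | 94.82 | 94.92 | 94.98 | 94.98 |
| IDisp | 1 | 99.40 | 99.04 | 97.50 | 100.0 | 100.0 | 100.0 |
| IDisp | 2 | 98.36 | 93.32 | 94.58 | 98.10 | 98.40 | 98.78 |
| IDisp | 3 | 98.36 | 93.32 | 94.58 | 98.10 | 98.40 | 98.78 |
| IDisp | 4 | 95.14 | 95.58 | 95.42 | 95.32 | 95.48 | 95.56 |
| IDisp | 5 | 95.14 | 95.58 | 95.42 | 95.32 | 95.48 | 95.56 |
| IDisp | 6 | 95.14 | 95.58 | 95.42 | 95.32 | 95.48 | 95.56 |
| IDisp | 7 | 94.80 | 94.44 | 94.64 | 94.70 | 94.58 | 94.68 |
| IDisp | 8 | 94.80 | 94.44 | 94.64 | 94.70 | 94.58 | 94.68 |
| IDisp | 9 | 94.92 | 94.88 | 94.82 | 94.86 | 94.88 | 94.96 |
| MLD | 1 | 93.72 | 99.32 | 97.88 | 99.98 | 100.0 | 100.0 |
| MLD | 2 | 94.80 | 92.98 | 93.00 | 95.98 | 96.54 | 97.44 |
| MLD | 3 | 94.80 | 92.98 | 93.00 | 95.98 | 96.54 | 97.44 |
| MLD | 4 | 95.70 | 95.06 | 95.10 | 95.74 | 95.82 | 96.06 |
| MLD | 5 | 95.70 | 95.06 | 95.10 | 95.74 | 95.82 | 96.06 |
| MLD | 6 | 95.70 | 95.06 | 95.10 | 95.74 | 95.82 | 96.06 |
| MLD | 7 | 95.02 | 95.14 | 95.02 | 94.98 | 95.00 | 95.06 |
| MLD | 8 | 95.02 | 95.14 | 95.02 | 94.98 | 95.00 | 95.06 |
| MLD | 9 | 94.84 | 94.78 | 94.92 | 94.70 | 94.84 | 94.80 |
| RCI | 1 | 97.50 | 98.30 | 97.54 | 99.98 | 100.0 | 100.0 |
| RCI | 2 | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| RCI | 3 | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| RCI | 4 | 95.28 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| RCI | 5 | 95.28 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| RCI | 6 | 95.28 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| RCI | 7 | 94.60 | 94.62 | 94.64 | 94.62 | 94.64 | 94.70 |
| RCI | 8 | 94.60 | 94.62 | 94.64 | 94.62 | 94.64 | 94.70 |
| RCI | 9 | 94.90 | 94.86 | 94.88 | 94.78 | 94.84 | 94.82 |
| eRCI | 1 | 100.0 | 97.16 | 97.22 | 71.58 | 80.94 | 95.20 |
| eRCI | 2 | 100.0 | 94.50 | 94.84 | 93.22 | 93.86 | 95.58 |
| eRCI | 3 | 100.0 | 94.50 | 94.84 | 93.22 | 93.86 | 95.58 |
| eRCI | 4 | 100.0 | 95.02 | 95.14 | 95.08 | 95.10 | 95.46 |
| eRCI | 5 | 100.0 | 95.02 | 95.14 | 95.08 | 95.10 | 95.46 |
| eRCI | 6 | 100.0 | 95.02 | 95.14 | 95.08 | 95.10 | 95.46 |
| eRCI | 7 | 100.0 | 94.82 | 94.76 | 94.80 | 94.86 | 94.86 |
| eRCI | 8 | 100.0 | 94.82 | 94.76 | 94.80 | 94.86 | 94.86 |
| eRCI | 9 | 100.0 | 95.38 | 95.40 | 95.38 | 95.12 | 95.28 |
| T | 1 | 93.14 | 99.60 | 98.54 | 99.78 | 99.98 | 99.98 |
| T | 2 | 93.48 | 93.64 | 93.36 | 95.16 | 95.90 | 97.10 |
| T | 3 | 93.48 | 93.64 | 93.36 | 95.16 | 95.90 | 97.10 |
| T | 4 | 95.52 | 95.12 | 95.20 | 95.64 | 95.70 | 95.92 |
| T | 5 | 95.52 | 95.12 | 95.20 | 95.64 | 95.70 | 95.92 |
| T | 6 | 95.52 | 95.12 | 95.20 | 95.64 | 95.70 | 95.92 |
| T | 7 | 95.20 | 95.06 | 95.06 | 95.16 | 95.16 | 95.16 |
| T | 8 | 95.20 | 95.06 | 95.06 | 95.16 | 95.16 | 95.16 |
| T | 9 | 94.88 | 94.82 | 94.92 | 94.78 | 94.76 | 94.84 |
| RII | 1 | 97.50 | 98.30 | 97.54 | 99.98 | 100.0 | 100.0 |
| RII | 2 | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| RII | 3 | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| RII | 4 | 95.28 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| RII | 5 | 95.28 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| RII | 6 | 95.28 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| RII | 7 | 94.60 | 94.62 | 94.64 | 94.62 | 94.64 | 94.70 |
| RII | 8 | 94.60 | 94.62 | 94.64 | 94.62 | 94.64 | 94.70 |
| RII | 9 | 94.90 | 94.86 | 94.88 | 94.78 | 94.84 | 94.82 |
| KMI | 1 | 95.54 | 99.76 | 99.62 | 99.98 | 100.0 | 100.0 |
| KMI | 2 | 94.90 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| KMI | 3 | 94.90 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| KMI | 4 | 95.18 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| KMI | 5 | 95.18 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| KMI | 6 | 95.18 | 95.46 | 95.04 | 95.22 | 95.56 | 95.34 |
| KMI | 7 | 94.66 | 94.62 | 94.64 | 94.62 | 94.64 | 94.70 |
| KMI | 8 | 94.66 | 94.62 | 94.64 | 94.62 | 94.64 | 94.70 |
| KMI | 9 | 94.84 | 94.86 | 94.88 | 94.78 | 94.84 | 94.82 |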

References

  1. U.S. Department of Health and Human Services, Office of Disease Prevention and Health Promotion. Healthy People 2020. 2010. Available online: https://www.cdc.gov/nchs/healthy_people/hp2020.htm (accessed on 1 September 2023).
  2. Marmot, M.; Friel, S.; Bell, R.; Houweling, T.A.; Taylor, S.; on behalf of the Commission on Social Determinants of Health. Closing the gap in a generation: Health equity through action on the social determinants of health. Lancet 2008, 372, 1661–1669.
  3. Truman, B.I.; Centers for Disease Control and Prevention. CDC health disparities and inequalities report-United States, 2011. Mickey Leland Cent. Inf. Portal 2011, 4. Available online: https://digitalscholarship.tsu.edu/mlcejs_info/4/ (accessed on 1 September 2023).
  4. Division of Cancer Control and Population Sciences, National Cancer Institute. Health Disparities Calculator (HD*Calc), Version 2.0.0. 2019. Available online: https://seer.cancer.gov/hdcalc/ (accessed on 1 September 2023).
  5. Breen, N.; Scott, S.; Percy-Laurry, A.; Lewis, D.; Glasgow, R. Health disparities calculator: A methodologically rigorous tool for analyzing inequalities in population health. Am. J. Public Health 2014, 104, 1589–1591.
  6. Harper, S.; Lynch, J. Methods for Measuring Cancer Disparities: Using Data Relevant to Healthy People 2010 Cancer-Related Objectives; NCI Cancer Surveillance Monograph Series, No. 6; Technical Report; NIH Pub. No. 05-5777; National Cancer Institute: Bethesda, MD, USA, 2005.
  7. Ahn, J.; Harper, S.; Yu, M.; Feuer, E.J.; Liu, B.; Luta, G. Variance Estimation and Confidence Intervals for 11 Commonly Used Health Disparity Measures. JCO Clin. Cancer Inform. 2018, 2, 2.
  8. Ahn, J.; Harper, S.; Yu, M.; Feuer, E.J.; Liu, B. Improved Monte Carlo methods for estimating confidence intervals for eleven commonly used health disparity measures. PLoS ONE 2019, 14, e0219542.
  9. Krishnamoorthy, K.; Lee, M. Inference for functions of parameters in discrete distributions based on fiducial approach: Binomial and Poisson cases. J. Stat. Plan. Inference 2010, 140, 1182–1192.
  10. Harper, S.; Lynch, J. Selected Comparisons of Measures of Health Disparities: A Review Using Databases Relevant to Healthy People 2010 Cancer-Related Objectives; NCI Cancer Surveillance Monograph Series, No. 7; Technical Report; NIH Pub. No. 07-6281; National Cancer Institute: Bethesda, MD, USA, 2007.
  11. Kakwani, N.; Wagstaff, A.; Van Doorslaer, E. Socioeconomic inequalities in health: Measurement, computation, and statistical inference. J. Econom. 1997, 77, 87–103.
  12. Yu, M.; Liu, B.; Li, Y.; Zou, Z.; Breen, N. Statistical inferences of extended concentration indices for directly standardized rates. Stat. Med. 2019, 38, 62–73.
  13. Preston, S.H.; Haines, M.R.; Pamuk, E. Effects of Industrialization and Urbanization on Mortality in Developed Countries; Department of Economics, Wayne State University: Detroit, MI, USA, 1981.
  14. Pearcy, J.N.; Keppel, K.G. A Summary Measure of Health Disparity. Public Health Rep. 2002, 117, 273–280.
  15. Pamuk, E.R. Social-class inequality in infant mortality in England and Wales from 1921 to 1980. Eur. J. Popul./Revue Européenne Démographie 1988, 4, 1–21.
  16. Mackenbach, J.P.; Kunst, A.E. Measuring the magnitude of socio-economic inequalities in health: An overview of available measures illustrated with two examples from Europe. Soc. Sci. Med. 1997, 44, 757–771.
  17. Fisher, R.A. Inverse probability. Math. Proc. Camb. Philos. Soc. 1930, 26, 528–535.
  18. Fisher, R.A. The fiducial argument in statistical inference. Ann. Eugen. 1935, 6, 391–398.
  19. Garwood, F. Fiducial limits for the Poisson distribution. Biometrika 1936, 28, 437–442.
  20. Cox, D. Some simple approximate tests for Poisson variates. Biometrika 1953, 40, 354–360.
  21. Dempster, A.P. The Dempster–Shafer calculus for statisticians. Int. J. Approx. Reason. 2008, 48, 365–377.
  22. Zhang, S.; Luo, J.; Zhu, L.; Stinchcomb, D.G.; Campbell, D.; Carter, G.; Gilkeson, S.; Feuer, E.J. Confidence intervals for ranks of age-adjusted rates across states or counties. Stat. Med. 2014, 33, 1853–1866.
  23. Talih, M.; Moonesinghe, R.; Huang, D.T. Measuring the Magnitude of Health Inequality between 2 Population Subgroup Proportions. Am. J. Epidemiol. 2020, 189, 987–996.
Figure 1. Empirical coverage results for scenario 1. The dashed line shows 95% coverage.
Figure 2. Empirical coverage results for scenario 2. The dashed line shows 95% coverage.
Figure 3. Empirical coverage results for scenario 3. The dashed line shows 95% coverage.
Table 1. Characteristics of the nine scenarios.

| Scenario | Sample Size, Mean (SD) | True Rate, Mean (SD) | Expected Event Count, Mean (SD) | Magnitude of Expected Event Count |
|---|---|---|---|---|
| 1 | 2417 (1084) | 0.0003 (0.0002) | 0.8 (0.784) | <1 |
| 2 | 2417 (1084) | 0.003 (0.002) | 8 (7.84) | 1–9 |
| 3 | 24,167 (10,836) | 0.0003 (0.0002) | 8 (7.84) | 1–9 |
| 4 | 2417 (1084) | 0.03 (0.02) | 80 (78.4) | 10–99 |
| 5 | 24,167 (10,836) | 0.003 (0.002) | 80 (78.4) | 10–99 |
| 6 | 241,667 (108,363) | 0.0003 (0.0002) | 80 (78.4) | 10–99 |
| 7 | 24,167 (10,836) | 0.03 (0.02) | 800 (784) | 100–999 |
| 8 | 241,667 (108,363) | 0.003 (0.002) | 800 (784) | 100–999 |
| 9 | 241,667 (108,363) | 0.03 (0.02) | 8000 (7839) | 1000–9999 |
Table 2. Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for scenario 1.

| Measure | Classical | MCS-N | MCS-G | FI1 | FI2 | FI3 |
|---|---|---|---|---|---|---|
| RD | 97.46 | 97.00 | 97.74 | 99.30 | 99.38 | 99.26 |
| PD | 93.28 | 90.86 | 94.12 | 98.30 | 99.32 | 98.60 |
| BGV | 94.26 | 96.74 | 97.94 | 99.42 | 99.46 | 99.38 |
| ACI | 99.64 | 98.34 | 97.72 | 99.98 | 100.0 | 99.96 |
| eACI | 82.82 | 95.04 | 95.08 | 52.20 | 64.10 | 94.52 |
| SII | 99.64 | 98.34 | 97.72 | 99.98 | 100.0 | 99.96 |
| RR | 100.0 | 99.36 | 97.76 | 100.0 | 100.0 | 100.0 |
| PR | 96.62 | 99.04 | 97.66 | 91.12 | 95.56 | 95.64 |
| IDisp | 99.40 | 99.04 | 97.50 | 100.0 | 100.0 | 100.0 |
| MLD | 93.72 | 99.32 | 97.88 | 99.98 | 100.0 | 100.0 |
| RCI | 97.50 | 98.30 | 97.54 | 99.98 | 100.0 | 100.0 |
| eRCI | 100.0 | 97.16 | 97.22 | 71.58 | 80.94 | 95.20 |
| T | 93.14 | 99.60 | 98.54 | 99.78 | 99.98 | 99.98 |
| RII | 97.50 | 98.30 | 97.54 | 99.98 | 100.0 | 100.0 |
| KMI | 95.54 | 99.76 | 99.62 | 99.98 | 100.0 | 100.0 |
Table 3. Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for scenario 2.

| Measure | Classical | MCS-N | MCS-G | FI1 | FI2 | FI3 |
|---|---|---|---|---|---|---|
| RD | 95.70 | 95.76 | 95.26 | 96.08 | 96.24 | 96.06 |
| PD | 95.28 | 95.38 | 95.26 | 95.92 | 96.10 | 95.90 |
| BGV | 94.08 | 95.60 | 95.18 | 95.98 | 96.28 | 95.98 |
| ACI | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| eACI | 78.94 | 94.60 | 94.82 | 91.80 | 92.36 | 95.68 |
| SII | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| RR | 97.08 | 93.34 | 93.80 | 97.04 | 97.64 | 98.36 |
| PR | 95.52 | 94.14 | 94.30 | 94.72 | 95.42 | 96.10 |
| IDisp | 98.36 | 93.32 | 94.58 | 98.10 | 98.40 | 98.78 |
| MLD | 94.80 | 92.98 | 93.00 | 95.98 | 96.54 | 97.44 |
| RCI | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| eRCI | 100.0 | 94.50 | 94.84 | 93.22 | 93.86 | 95.58 |
| T | 93.48 | 93.64 | 93.36 | 95.16 | 95.90 | 97.10 |
| RII | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| KMI | 94.90 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
Table 4. Empirical coverage (%) for nominal 95% confidence intervals and fiducial intervals for scenario 3.

| Measure | Classical | MCS-N | MCS-G | FI1 | FI2 | FI3 |
|---|---|---|---|---|---|---|
| RD | 95.70 | 95.76 | 95.26 | 96.08 | 96.24 | 96.06 |
| PD | 95.28 | 95.38 | 95.26 | 95.92 | 96.10 | 95.90 |
| BGV | 94.08 | 95.60 | 95.18 | 95.98 | 96.28 | 95.98 |
| ACI | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| eACI | 78.70 | 94.58 | 94.84 | 91.80 | 92.34 | 95.66 |
| SII | 94.68 | 94.76 | 94.06 | 95.68 | 96.48 | 95.74 |
| RR | 97.08 | 93.34 | 93.80 | 97.04 | 97.64 | 98.36 |
| PR | 95.52 | 94.14 | 94.30 | 94.72 | 95.42 | 96.10 |
| IDisp | 98.36 | 93.32 | 94.58 | 98.10 | 98.40 | 98.78 |
| MLD | 94.80 | 92.98 | 93.00 | 95.98 | 96.54 | 97.44 |
| RCI | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| eRCI | 100.0 | 94.50 | 94.84 | 93.22 | 93.86 | 95.58 |
| T | 93.48 | 93.64 | 93.36 | 95.16 | 95.90 | 97.10 |
| RII | 94.14 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
| KMI | 94.90 | 94.76 | 94.06 | 95.62 | 96.34 | 95.64 |
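
The empirical coverage values reported in these tables are proportions of simulated data sets whose interval contains the true value. As a minimal illustration of that calculation only, the Python sketch below estimates the coverage of a normal-approximation (Wald) interval for a single Poisson mean. It is not the simulation design used in this article: it involves one Poisson count rather than a disparity measure across groups, and the function names, the seed, and the number of replicates are hypothetical choices made for the example.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def wald_interval(count, z=1.959964):
    """Normal-approximation 95% interval for a Poisson mean (illustrative only)."""
    half_width = z * math.sqrt(count)
    return count - half_width, count + half_width


def empirical_coverage(true_mean, replicates, seed=0):
    """Proportion of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        lower, upper = wald_interval(poisson_draw(true_mean, rng))
        if lower <= true_mean <= upper:
            hits += 1
    return hits / replicates


if __name__ == "__main__":
    # Expected counts chosen to mirror the "<1" and "10-99" magnitudes in Table 1.
    for mean in (0.8, 80.0):
        coverage = 100 * empirical_coverage(mean, replicates=5000, seed=1)
        print(f"expected count {mean}: empirical coverage {coverage:.2f}%")
```

In this toy setting the interval collapses to a single point whenever the simulated count is zero, which is one way sparse data can pull coverage away from the nominal level. The values it prints are not comparable to the entries in the tables above.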
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, T.; Dragomir, A.D.; Luta, G. A Comparison of Statistical Methods to Construct Confidence Intervals and Fiducial Intervals for Measures of Health Disparities. Int. J. Environ. Res. Public Health 2024, 21, 208. https://doi.org/10.3390/ijerph21020208
