Osteoporosis is a common, progressive systemic skeletal disease characterized by low bone mass and deteriorated bone tissue, resulting in an increase in bone fragility and susceptibility to fracture [1]. Osteoporosis-associated fractures often cause a significant increase in morbidity, mortality, and accompanying social and economic costs [2]. By estimation, about 50% of postmenopausal Caucasian women and 20% of Caucasian men in the US will suffer at least one fragility fracture after the age of 50 [3]. With life expectancy increasing universally, osteoporosis and fracture will become an ever-growing health problem worldwide [4].
Osteoporosis is a silent disease because bone loss occurs or bone tissue deteriorates without any symptoms [4]. Patients are often unaware that they have osteoporosis until a fracture occurs. Thus, correctly diagnosing osteoporosis and identifying individuals who will sustain an osteoporotic fracture is critical for preventing devastating fracture outcomes in the aging population. As bone mineral density (BMD) is the single strongest predictor of primary osteoporotic fracture [5], clinical osteoporosis diagnosis is based on BMD measurements from dual-energy X-ray absorptiometry (DEXA) [6]. The World Health Organization (WHO) defines osteoporosis as a femoral neck BMD that lies 2.5 or more standard deviations below the mean value for young, healthy women (T-score ≤ −2.5) [7]. This definition became the WHO international reference standard for osteoporosis diagnosis. However, the major limitation of this reference standard is its low sensitivity; most fractures occur in individuals with a femoral neck BMD T-score > −2.5. In addition, because many other risk factors, including age, female gender, and previous fracture, are associated with fracture risk independently of BMD, several predictive models have been developed to estimate fracture risk from these established risk factors. The Fracture Risk Assessment Tool (FRAX) is the most commonly used fracture risk assessment tool in the US [8]. Although FRAX improves fracture prediction over the BMD T-score method alone, the predictive performance of both FRAX and the WHO T-score method varies across population cohorts [8] and with different conditions [10].
The original FRAX was developed from nine large cohorts and then validated in 11 independent cohorts across the world [1]. The US FRAX was calibrated from the data of the Rochester Epidemiology Project [12], composed predominantly of Caucasians [13]. Further, the T-score was initially proposed only for postmenopausal Caucasian women [14]. Although both FRAX and the T-score were subsequently adjusted for race and ethnicity, the methodology for the adjustment was not empirically based, which renders their performance for fracture prediction unreliable in minorities. In addition, neither FRAX nor the T-score takes genetic components into account, even though research has shown that hereditary factors are determinants of bone structure and are strongly associated with bone mass decrease, bone deterioration, and fragility fractures. With the development of advanced genomic technologies, numerous genetic loci related to fracture and BMD have been discovered in major genome-wide association studies (GWASs) and genome-wide meta-analyses. Together, these considerations provide a unique opportunity to examine the performance of existing clinical fracture prediction approaches in groups with different genomic profiles.
Our previous study examined the performance of FRAX in postmenopausal women by race and polygenic score, computed from fracture-associated SNPs discovered in the largest GWAS meta-analysis (under review). The T-score method (T-score ≤ −2.5) is the WHO international reference standard for osteoporosis diagnosis; it has been endorsed by numerous professional societies, including the International Society for Clinical Densitometry (ISCD) [16], and is widely used in clinical practice. However, the performance of the T-score method for osteoporosis classification in U.S. minorities has rarely been studied, and its performance across different genetic profiles has never been reported in the literature. Thus, this study aimed first to evaluate whether the T-score performs differently in osteoporosis classification across polygenic risk scores, and second to assess T-score performance in osteoporosis diagnosis by race in women. We also examined the extent to which the interaction of race and polygenic score affects T-score performance in osteoporosis classification and fracture prediction.
2. Experimental Section
2.1. Data Source
The Women’s Health Initiative (WHI) study is a nationwide longitudinal study to examine the health of postmenopausal women aged 50–79 years old who have no severe medical conditions at baseline [17]. Between 1993 and 1998, the WHI enrolled 161,809 women aged 50 to 79 years old at 40 clinical centers nationwide. The details of WHI recruitment and follow-up procedures have been described elsewhere [17]. Briefly, eligible women were enrolled in one or more randomized clinical trials (CT) or in an observational study (OS). Participants were followed up by mail or telephone semiannually in the CT and by questionnaire annually in the OS. The Institutional Review Boards at each participating institution approved the study protocols and participant consent forms [17].
The data used for the present study were de-identified and were acquired through the database of Genotypes and Phenotypes (dbGaP) (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000200.v12.p3) with the approval of the institutional review board at the University of Nevada, Las Vegas. The data included in this analysis were merged from four WHI sub-studies: WHI Genomics and Randomized Trials Network (GARNET), National Heart, Lung, and Blood Institute (NHLBI), Population Architecture using Genomics and Epidemiology (PAGE), and Women’s Health Initiative Memory Study (WHIMS). On baseline questionnaires, participants provided information on age, race/ethnicity, smoking, physical activity, and supplement use. The included subjects were genotyped using either the Affymetrix 6.0 array set (Affymetrix, Santa Clara, CA, USA) or the Illumina (Illumina, San Diego, CA, USA) platform. Participants who reported taking any medication known to influence osteoporosis, including corticosteroids, bisphosphonates, calcitonin, parathyroid hormone, selective estrogen receptor modulators, luteinizing hormone-releasing hormone agents, and somatostatin agents, as well as participants who did not have BMD measurements, were excluded from the analytic sample. In total, there were 2417 eligible participants from multiple ethnic backgrounds with genotype data and adjudicated fracture outcomes available.
2.3. BMD Measurements
BMD was measured using dual-energy X-ray absorptiometry (QDR 2000 or 2000+, or 4500 W; Hologic Inc, Bedford, MA, USA) at three participating US clinical centers (Birmingham, AL; Pittsburgh, PA; and Tucson/Phoenix, AZ). Certified technicians using standard protocols measured BMD of the total hip, lumbar spine (L2–L4), and total body. Baseline BMD measurements were used to determine whether participants had osteoporosis at baseline. Standard quality assurance protocols for positioning and analysis, routine hip and spine phantoms, and review of a randomly selected sample were employed. Changes of hardware and software were centralized and calibrated, and calibration phantoms across instruments and clinical sites were in close agreement, with inter-scanner variability <1.5% for the spine, <4.8% for the hip, and <1.7% for linearity [18]. BMD T-scores were calculated for each individual using the reference database for young, normal Caucasian women. Osteoporosis was defined as a BMD that lies 2.5 standard deviations or more below the average value for young, healthy women (T-score ≤ −2.5) [19].
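As an illustration of the T-score classification described above, the following Python sketch converts a BMD value to a T-score and applies the WHO cut-offs. The reference mean and SD (`REF_MEAN`, `REF_SD`) and the function names are hypothetical placeholders for illustration; they are not the values of the reference database used in this study.

```python
# Hypothetical young-adult reference values (g/cm^2); placeholders only,
# not the reference database used in the study.
REF_MEAN = 0.86
REF_SD = 0.12


def t_score(bmd, ref_mean=REF_MEAN, ref_sd=REF_SD):
    """Number of reference SDs by which a BMD value lies above (+) or below (-) the reference mean."""
    return (bmd - ref_mean) / ref_sd


def who_category(t):
    """WHO categories: osteoporosis (T <= -2.5), osteopenia (-2.5 < T < -1), normal (T >= -1)."""
    if t <= -2.5:
        return "osteoporosis"
    if t < -1.0:
        return "osteopenia"
    return "normal"
```

For example, with a reference mean of 1.0 and SD of 0.25, a BMD of 0.5 gives a T-score of −2.0, which falls in the osteopenia range rather than below the osteoporosis cut-off.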
2.4. Outcomes: Incident Fractures
In this study, any fracture was defined as all fractures except those of the fingers, toes, ribs, sternum (or chest), skull (or face), and cervical vertebrae. Major osteoporotic fracture (MOF) was defined as a composite of hip, humerus, forearm, and clinical vertebral fractures. The study participants were followed for 19 years, from the inception of the initial WHI study to the end of the WHI extension study II, to ensure a follow-up long enough to capture a sufficient number of events. The follow-up period was computed from the date of enrollment (OS) or randomization (CT) to the time of the first fracture or the time of death. Self-reported fracture outcomes were identified annually in the OS and semiannually in the CT by questionnaire. All fractures in the CT and hip fractures in the OS were adjudicated using radiology reports. Hip fractures were adjudicated centrally or locally using the same criteria. The agreement between central and local adjudication was 96% for hip fractures [20]. Other types of fractures were adjudicated locally at the clinical centers which were not designed for BMD measurements in the WHI study [17].
Genomic DNA from blood samples of WHI participants was genotyped. Genomic data of WHI were acquired through dbGaP. Genotype imputation was conducted on the Sanger Imputation Server to impute variants that were missing, un-typed, or poorly captured in the original data. The Haplotype Reference Consortium (HRC) reference panel and the Positional Burrows-Wheeler Transform (PBWT) imputation algorithm were employed for genotype imputation. All 63 fracture-associated SNPs reported by Estrada et al. [21] were successfully imputed. The imputation quality was high, with R2
2.6. Polygenic Score
Genetic risk for decreased BMD was quantified using a standardized metric described in detail by Estrada et al. [21]. Briefly, this metric allows the composite assessment of genetic risk for complex traits by summarizing the genetic predisposition. Based on 63 femoral neck BMD-associated SNPs discovered in the largest genome-wide meta-analysis [21], the polygenic score was computed as $\mathrm{PGS} = \sum_{i} x_i b_i$, where $x_i$ is the individual’s genotype (0, 1, 2) for SNP $i$ and $b_i$ is the effect size of this SNP. Linkage disequilibrium (LD) pruning was executed in advance to remove possible LD between SNPs. None of the 63 SNPs were removed by pruning. To examine whether the performance of the WHO international reference standard for osteoporosis diagnosis varied by PGS, eligible WHI participants were divided into three PGS groups containing 25%, 50%, and 25% of the PGS distribution.
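The score and the three-way grouping described above can be sketched in Python as follows. This is a minimal illustration, not the study's implementation: the function names are hypothetical, the quartile boundaries use the default method of `statistics.quantiles`, and the handling of scores that fall exactly on a boundary is an assumption.

```python
from statistics import quantiles


def polygenic_score(genotypes, effects):
    """Weighted allele count: sum over SNPs of genotype (0, 1, 2) times effect size."""
    if len(genotypes) != len(effects):
        raise ValueError("one effect size is needed per SNP")
    return sum(x * b for x, b in zip(genotypes, effects))


def pgs_groups(scores):
    """Label each score low / middle / high using the 25th and 75th percentiles.

    Assumption: scores equal to a boundary are placed in the middle group.
    """
    q1, _, q3 = quantiles(scores, n=4)
    return ["low" if s < q1 else "high" if s > q3 else "middle" for s in scores]
```

With imputed data, the genotype entries could also be fractional allele dosages; the weighted sum is unchanged in that case.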
2.7. Statistical Analysis
Demographic and baseline clinical characteristics are presented as mean ± SD for continuous variables or frequencies (%) for categorical variables. Differences between individuals with and without a fracture were examined using Student’s t-test for continuous variables and chi-square tests or Fisher’s exact test (when numbers were small) for categorical variables. Differences in PGS between races were examined using ANOVA. The observed cumulative incidence of fracture from the start of WHI to the end of the WHI extension study II was assessed by race and PGS group. The cumulative incidence function (CIF) was applied to derive the observed fracture probability for MOF and any fracture, with the competing risk of mortality accounted for. The ratio between T-score predicted fracture incidence and observed fracture incidence (POR), with the corresponding 95% CI, was calculated for each subgroup.
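To make the competing-risk calculation concrete, the sketch below implements a nonparametric (Aalen-Johansen type) cumulative incidence estimate and the predicted-to-observed ratio in plain Python. It is an illustration under stated assumptions: the event coding (0 = censored, 1 = fracture, 2 = death without fracture) and the function names are hypothetical, the study's analyses were run in SAS, and the 95% CI for the POR is omitted because the method for computing it is not described here.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Cumulative incidence of `cause` in the presence of competing events.

    events: 0 = censored, 1 = fracture, 2 = death without fracture (assumed coding).
    Returns a list of (time, CIF) pairs at each time where any event occurs.
    """
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    event_free = 1.0  # probability of no event of any type so far
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts[cause]
        d_all = sum(c for e, c in counts.items() if e != 0)
        if d_all:
            # Increment: P(event-free just before t) * cause-specific hazard at t
            cif += event_free * d_cause / n_at_risk
            event_free *= 1 - d_all / n_at_risk
            curve.append((t, cif))
        n_at_risk -= sum(counts.values())
    return curve


def predicted_observed_ratio(predicted, observed):
    """POR: predicted incidence divided by observed incidence (CI not computed)."""
    return predicted / observed
```

A POR below 1 indicates that the predicted incidence is lower than the observed incidence, that is, the method underestimates fracture risk in that subgroup.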
To assess the performance of the T-score method in classifying osteoporosis in different subgroups, the false-positive rate and the false-negative rate were calculated for each PGS and race group. A multivariate Cox proportional hazards model was employed to assess the effect of PGS and race on the outcome of MOF or any fracture within 19 years, with baseline T-score controlled for. To further assess whether the effect of PGS and race on the outcome of MOF or any fracture is independent of other common risk factors for osteoporosis, separate multivariate Cox proportional hazards models were fitted with baseline T-score, age, body mass index (BMI), and previous fracture controlled for. The T-score diagnosis was treated as a binary categorical variable in the Cox proportional hazards models. Considering that the PGS used in the present study was calculated from femoral neck BMD-related SNPs, we also assessed whether the predictive value of PGS for hip fracture differed from that for other types of fracture. A multinomial logistic regression with three outcomes (hip fracture, non-hip fracture, non-fracture) was performed.
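The two error rates can be sketched as follows, treating a baseline T-score at or below the cut-off as a positive classification and an incident fracture as the true outcome. The denominators follow the standard definitions (false positives among women without fracture, false negatives among women with fracture), which is an assumption about how the rates were computed; the function name is hypothetical.

```python
def error_rates(t_scores, fractured, cutoff=-2.5):
    """Return (false-positive rate, false-negative rate) for a T-score cut-off.

    Positive classification: T-score <= cutoff. True outcome: incident fracture.
    """
    tp = fp = tn = fn = 0
    for t, had_fracture in zip(t_scores, fractured):
        positive = t <= cutoff
        if had_fracture:
            if positive:
                tp += 1
            else:
                fn += 1  # fractured despite a T-score above the cut-off
        else:
            if positive:
                fp += 1  # classified as osteoporotic but no fracture
            else:
                tn += 1
    fpr = fp / (fp + tn) if fp + tn else float("nan")
    fnr = fn / (fn + tp) if fn + tp else float("nan")
    return fpr, fnr
```

Applying the function separately to each race or PGS subgroup gives the subgroup-specific rates compared in this study.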
A series of sensitivity analyses was also conducted. The first was conducted on a smaller sample (N = 1775) from which participants who had previous fractures were excluded. To be comparable with FRAX, which assesses the 10-year probability of MOF, the POR between predicted and observed incidence of MOF and any fracture at 10 years was also assessed, along with the false-positive rate and false-negative rate for MOF and any fracture classification in different PGS and race groups with 10-year follow-up. A subgroup analysis was conducted to assess whether PGS would predict fracture differently in osteopenia patients and in participants with normal BMD at baseline. Statistical analyses were performed using SAS 9.4 (SAS Institute, Inc., Cary, NC, USA).
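One simple way to restrict an analysis to a 10-year window is to censor follow-up administratively at the horizon. The sketch below shows that step; it is an assumption about how such a restriction could be done, not a description of the study's SAS code, and the function name and event coding (0 = censored) are hypothetical.

```python
def truncate_follow_up(times, events, horizon=10.0):
    """Censor follow-up at `horizon` years.

    Events occurring after the horizon are recoded as censored (0) at the horizon;
    events at or before the horizon are kept unchanged.
    """
    out_times, out_events = [], []
    for t, e in zip(times, events):
        if t > horizon:
            out_times.append(horizon)
            out_events.append(0)
        else:
            out_times.append(t)
            out_events.append(e)
    return out_times, out_events
```

The truncated times and events can then be passed to the same incidence and error-rate calculations used for the full follow-up.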
The present study provides compelling evidence that, over the 19-year follow-up, the T-score method underestimates the risk of MOF and any fracture in women 50–79 years old across all racial and PGS groups, especially in African Americans and women who have a low genetic risk. Moreover, the prognostic performance of the T-score method, estimated by the false-positive rate and false-negative rate at the cut-off value of −2.5, also differed across race and PGS groups. Results from the multivariate Cox proportional hazards models provided further evidence that the performance of the T-score method in predicting osteoporotic fracture risk varies by race.
The BMD threshold defined by the WHO T-score method was found to be problematic. The National Osteoporosis Risk Assessment Study found that 82% of 2259 women who reported fractures had a T-score > −2.5 [22]. Similarly, in the Rotterdam Study of 7806 people, 56% of women and 79% of men with non-vertebral fractures had a T-score > −2.5 [23]. Other studies also demonstrated that the majority of low-trauma fractures occur in individuals whose T-scores are above −2.5 [24], which is consistent with the extremely high false-negative rates observed in the present study, especially in African American and Hispanic women, as well as women who have a low genetic risk. However, when the T-score method is used to assess fracture risk, the percentage of women misclassified into the high-risk group without sustaining a fracture is highest among Caucasian women. BMD is known to be the single best predictor of fracture, and differences in areal BMD have been identified between ethnic and racial groups [26]. However, the observed cumulative incidence of fracture, in terms of both MOF and any fracture, was significantly higher than the estimate derived from the BMD-based T-score method in minorities. The results of the multivariate Cox proportional hazards analysis further demonstrated that race is a significant predictor of MOF and any fracture independent of the T-score classification. Although separate reference databases were proposed for African Americans and Hispanics [27], we did not use these ethnic-specific references in this study because it remains unclear whether a T-score derived from an ethnic-specific database performs better or worse in osteoporosis diagnosis [28]. Nonetheless, our previous study suggested that a new classification method of low BMD based on the race-specific lower limit of normal values may help mitigate some of the T-score limitations in minority populations [29].
The present study found that the T-score greatly underestimated the risk of fracture in women aged 50–79 years old, and that the degree of underestimation was greater in the low PGS group than in the high genetic risk groups for both MOF and any fracture. However, in the multivariate analysis, genetic profiling was not a significant predictor of MOF or any fracture after adjustment for T-score classification. Prior twin studies demonstrated that the heritable component of fracture is largely independent of BMD [30], whereas the reported fracture-related genetic variants are also associated with BMD [32]. Because of limited statistical power, GWASs that use a dichotomous disease as a direct outcome have discovered relatively few loci, and this is a concern for osteoporotic fracture studies as well. Moreover, the multifactorial nature of fracture makes it challenging to identify the specific genetic determinants that contribute to fracture risk. Therefore, the PGS constructed in the present study may not sufficiently capture the BMD-independent genetic risk of fracture. As more fracture-related genetic components are discovered, a more significant effect of PGS on fracture risk prediction can be expected. Another possible reason for the minor effect of PGS on fracture outcomes observed in the present study is that, similar to other age-related traits, the heritability of fracture risk decreases with age [32]. Because the analytic sample consisted of older women, predisposition to fracture may be more attributable to effects mediated through genetic influences on bone turnover and bone geometry, or to non-skeletal factors such as cognitive function, neuromuscular control, visual acuity, or other factors related to the risk of falling [33].
Limitations of this study are acknowledged. First, the WHI data we used only included women 50–79 years old, so our findings may not apply to men or to women outside this age range. Second, genetic variants related to fracture risk independent of BMD remain mostly undiscovered, and most such variants were likely not included in the present study. The genetic variants that were included therefore had a limited impact on the T-score classification. Third, with respect to allele frequencies, osteoporotic fracture risk is associated with both common and rare variants. Since all SNPs used in the current study came from a prior GWAS meta-analysis, which is likely able to discover only common genetic variants, rare genetic determinants of BMD or fracture may not be included. Finally, the sample size of minority subjects was very small in this study; the analyses may, therefore, be underpowered.
To the best of our knowledge, this is the first study to assess T-score performance in the prediction of MOF and any fracture in groups with different genetic profiles and of various races. Our findings demonstrated that the T-score performed differently across races and PGS groups, and thus the effect of race and genetic determinants on osteoporotic fracture prediction should be taken into account beyond the T-score classification. Fully integrating genetic profiling and racial factors into the existing fracture assessment model is very likely to improve the accuracy of osteoporosis diagnosis. Thus, developing racial/ethnic-specific, individualized osteoporosis diagnosis methods will provide more accurate fracture risk assessment and decrease the false-positive and false-negative rates of osteoporosis diagnosis. Further studies, especially those including men, a larger sample of minorities, and more comprehensive fracture-associated genetic variants, are warranted.