Machine Learning for Opportunistic Screening for Osteoporosis from CT Scans of the Wrist and Forearm

Background: We investigated whether opportunistic screening for osteoporosis can be done from computed tomography (CT) scans of the wrist/forearm using machine learning. Methods: A retrospective study of 196 patients aged 50 years or greater who underwent CT scans of the wrist/forearm and dual-energy X-ray absorptiometry (DEXA) scans within 12 months of each other was performed. Volumetric segmentation of the forearm, carpal, and metacarpal bones was performed to obtain the mean CT attenuation of each bone. The correlations of the CT attenuations of each of the wrist/forearm bones and their correlations to the DEXA measurements were calculated. The study was divided into training/validation (n = 96) and test (n = 100) datasets. The performance of multivariable support vector machines (SVMs) was evaluated in the test dataset and compared to the CT attenuation of the distal third of the radial shaft (radius 33%). Results: There were positive correlations between each of the CT attenuations of the wrist/forearm bones, and with DEXA measurements. A threshold hamate CT attenuation of 170.2 Hounsfield units had a sensitivity of 69.2% and a specificity of 77.1% for identifying patients with osteoporosis. The radial-basis-function (RBF) kernel SVM (AUC = 0.818) was the best for predicting osteoporosis with a higher AUC than other models and better than the radius 33% (AUC = 0.576) (p = 0.020). Conclusions: Opportunistic screening for osteoporosis could be performed using CT scans of the wrist/forearm. Multivariable machine learning techniques, such as SVM with RBF kernels, that use data from multiple bones were more accurate than using the CT attenuation of a single bone.


Introduction
Bone mineral density (BMD) decreases with age, with the decrease being more evident and rapid in post-menopausal females [1][2][3]. Decreased BMD increases the risk of frailty fractures including fractures of the spine, forearm, and hips [4][5][6][7][8]. Fractures of the hips are associated with a 1-year mortality of approximately 15-36% following the fracture; therefore, identifying risk factors for frailty fractures and hip fractures is of increased clinical importance [9,10]. Dual-energy X-ray absorptiometry (DEXA) is the gold standard screening test for the evaluation of BMD [11]. DEXA evaluates BMD in the L1-L4 lumbar spine, total hip, and femoral neck and compares these values to those of normal young adults (20-29 years of age) from the National Health and Nutrition Examination Survey (NHANES) III cohort to create BMD T-scores [12,13]. The World Health Organization (WHO) and the International Society for Clinical Densitometry (ISCD) have guidelines for the classification of patients aged 50 years or greater with low BMD as osteoporotic or osteopenic based on DEXA BMD T-scores. Patients aged 50 years or greater with lowest BMD T-scores ≥ −1 have normal BMD, patients aged 50 years or greater with −2.5 ≤ lowest

Materials and Methods
The study was compliant with the Health Insurance Portability and Accountability Act of 1996 (HIPAA), and the study protocol was reviewed and approved by the local Institutional Review Board (IRB) at a tertiary care academic medical center. The need for signed informed consent from each patient was waived by the IRB. The work described was carried out in accordance with The Code of Ethics of the World Medical Association (Declaration of Helsinki) for experiments involving humans.

CT Scanner Protocol and CT Attenuation Measurements
CT scans were performed on Siemens Somatom Definition Edge and Siemens Flash (Siemens Healthineers, Erlangen, Germany) scanners. CT scans were performed without intravenous contrast at 120 kVp, 200-250 mA, field of view (FOV) 118-120 mm, tilt of 0° in the axial plane at 0.4-0.5 mm, and reconstructed in the coronal and sagittal planes at 1 mm.
The study cohort comprised patients aged 50 years or older, who were evaluated and/or treated between 1 January 2015 and 30 September 2021. Patients were included if they had CT scans of an upper extremity, including the wrist and forearm, and DEXA scans (of the lumbar spine and hips) within 12 months of each other. Patients were excluded if they had prior surgery of the wrist/forearm, hips, or spine with implantation of hardware that would affect CT attenuation measurements.

Segmentations
The trabecular component of each bone was carefully, manually segmented using 3D Slicer [23]. Care was taken to avoid bone lesions (hemangiomas and bone islands), osteophytes, and the bony cortex. No specific attenuation threshold was used to segment the trabecular bone. The subcortical regions were segmented manually and then automatically interpolated between CT slices. The entire volume of each bone was segmented, and the mean CT attenuation of each bone was recorded (Figure 1).

We calculated the CT attenuation of the distal third of the radius, radius UD, radius 33%, the distal third of the ulna, ulna UD, ulna 33%, scaphoid, lunate, triquetrum, pisiform, trapezium, trapezoid, capitate, hamate, and proximal thirds of the first through fifth metacarpals. Segmentations were performed by a trained research assistant (3 weeks of training doing all 3D Slicer tutorials; 4 months of experience with 3D Slicer doing similar projects before doing the current project) and were reviewed by a fellowship-trained musculoskeletal radiologist with 10 years of experience.

Statistical Analysis
A sample size of 100 patients was determined to have 80% power with a type I error rate of 5% to detect a difference between two receiver operating characteristic (ROC) curves if the first ROC curve has an area under the curve (AUC) of 0.80, the second ROC curve has an AUC of 0.60, and there are 74% controls and 26% cases. Therefore, the cohort was randomly divided into two datasets: a training/validation dataset (96 (49%) patients) and a testing dataset (100 (51%) patients) for multivariable machine learning.
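As an illustration of the random division described above, the following is a minimal Python sketch, not the authors' code (the study used R). The function name `split_cohort`, the integer patient identifiers, and the fixed seed are assumptions introduced for the example.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into training/validation and test lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]


# 196 hypothetical patient IDs, split 96 / 100 as in the study design
train_ids, test_ids = split_cohort(range(196), n_train=96)
print(len(train_ids), len(test_ids))
```

Splitting at the patient level once, before any threshold selection or model tuning, keeps the 100-patient test dataset untouched for the final comparison.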
Summary statistics for all clinical and demographic variables were first calculated. t-tests with unequal variances and Fisher's exact tests were used to compare quantitative and qualitative variables, respectively, between the training/validation and testing datasets.
Pearson's correlations were used to examine the correlations between the CT attenuation of the wrist/forearm bones and to evaluate the correlations between CT attenuation of the wrist/forearm bones (distal third of the radius and ulna, proximal third of the first through fifth metacarpals, scaphoid, lunate, triquetrum, pisiform, trapezium, trapezoid, capitate, and hamate) stratified by gender in the entire dataset. Hierarchical cluster analysis was used to cluster the correlations between the CT attenuations of the wrist/forearm bones.
Pearson's correlation coefficients were also used to evaluate the correlations between the CT attenuations of each of the wrist/forearm bones and DEXA measurements (L1-L4 BMD, L1-L4 BMD T-score, L1-L4 TBS, total hip BMD, total hip BMD T-score, femoral neck BMD, and femoral neck BMD T-score) in the entire dataset.
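The correlation step can be sketched as follows. This is an illustrative Python implementation of Pearson's correlation coefficient using only the standard library; the function name `pearson_r` and the toy attenuation and T-score values are hypothetical and are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: mean hamate CT attenuation (HU) and femoral neck BMD T-score
hamate_hu = [150.0, 180.0, 210.0, 240.0, 260.0]
femoral_neck_t = [-2.8, -2.1, -1.5, -0.9, -0.2]
r = pearson_r(hamate_hu, femoral_neck_t)
print(round(r, 3))
```

Repeating this for every pair of bones gives the correlation matrix that a hierarchical cluster analysis would take as input.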
ROC curves were used to identify the optimal CT attenuation threshold for each bone to evaluate the predictive performance of each bone regarding predicting osteoporosis and osteopenia/osteoporosis in the training/validation dataset. We then evaluated these optimized thresholds in the test dataset.
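The threshold procedure can be sketched in Python as below. Two assumptions are made that the text does not state: the "optimal" threshold is taken to be the one maximizing Youden's J (sensitivity + specificity − 1), and a patient is called positive when the CT attenuation is at or below the threshold. All function names and values are hypothetical.

```python
def auc_low_is_positive(values, labels):
    """AUC as the probability that a case has a lower value than a control (ties count 0.5)."""
    cases = [v for v, is_case in zip(values, labels) if is_case]
    controls = [v for v, is_case in zip(values, labels) if not is_case]
    wins = sum((c < k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def sens_spec(values, labels, threshold):
    """Sensitivity and specificity when value <= threshold is called positive."""
    tp = sum(1 for v, is_case in zip(values, labels) if is_case and v <= threshold)
    tn = sum(1 for v, is_case in zip(values, labels) if not is_case and v > threshold)
    n_pos = sum(1 for is_case in labels if is_case)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def youden_threshold(values, labels):
    """Return the observed value that maximizes Youden's J (assumed criterion)."""
    best_t, best_j = None, -1.0
    for t in sorted(set(values)):
        sens, spec = sens_spec(values, labels, t)
        j = sens + spec - 1.0
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical training/validation data: attenuation in HU, 1 = osteoporosis
train_hu = [120, 140, 160, 170, 200, 220, 250, 300]
train_y = [1, 1, 1, 0, 1, 0, 0, 0]
threshold, j = youden_threshold(train_hu, train_y)
train_auc = auc_low_is_positive(train_hu, train_y)

# The threshold is frozen, then evaluated on hypothetical held-out test data
test_hu = [130, 165, 180, 210]
test_y = [1, 0, 1, 0]
test_sens, test_spec = sens_spec(test_hu, test_y, threshold)
print(threshold, train_auc, test_sens, test_spec)
```

Freezing the threshold before looking at the test data is what makes the test-dataset sensitivity and specificity an honest estimate rather than a re-optimized one.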
Since we evaluated the predictive properties of several different bones in the wrist/forearm to predict osteoporosis and osteopenia/osteoporosis, we used several machine learning methods to select the best combination of bones and clinical factors (age, gender, height, weight, BMI) that could be used to categorize a patient as (i) osteoporotic, (ii) osteopenic/osteoporotic using the WHO guidelines, (iii) femoral neck BMD T-score ≤ −2.5, and (iv) femoral neck BMD T-score < −1. We evaluated the femoral neck BMD T-scores because these scores are thought to be less affected by degenerative changes in the lumbar spine and hip.
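The four binary outcomes can be derived from DEXA T-scores as in the following Python sketch. It assumes the standard WHO cut-offs applied to the lowest T-score across sites (osteoporosis at ≤ −2.5, osteopenia between −2.5 and −1, normal at ≥ −1); the function names and dictionary keys are hypothetical.

```python
def who_category(t_scores):
    """Classify a patient from the lowest BMD T-score across the measured sites."""
    lowest = min(t_scores)
    if lowest <= -2.5:
        return "osteoporosis"
    if lowest < -1.0:
        return "osteopenia"
    return "normal"


def outcome_labels(l1_l4_t, total_hip_t, femoral_neck_t):
    """Build the four binary outcomes (i)-(iv) used as classification targets."""
    category = who_category([l1_l4_t, total_hip_t, femoral_neck_t])
    return {
        "osteoporosis": category == "osteoporosis",            # outcome (i)
        "osteopenia_or_osteoporosis": category != "normal",    # outcome (ii)
        "fn_t_le_minus_2_5": femoral_neck_t <= -2.5,           # outcome (iii)
        "fn_t_lt_minus_1": femoral_neck_t < -1.0,              # outcome (iv)
    }


# Hypothetical patient: lumbar spine -1.2, total hip -0.8, femoral neck -2.6
labels = outcome_labels(-1.2, -0.8, -2.6)
print(labels)
```

Outcomes (i) and (ii) depend on all sites, whereas (iii) and (iv) use the femoral neck alone, so a patient can be positive for (i) while negative for (iii).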
A support vector machine (SVM) is a supervised learning model that is often used for pattern recognition, classification, and regression analysis [24]. C-classification with three different kernels (linear, radial basis function (RBF), and sigmoid) [24] tuned with 10-fold cross-validation was utilized.
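For reference, the three kernels can be written out as in the Python sketch below, using the standard definitions (the forms used by LIBSVM, which the e1071 package wraps). The values chosen for `gamma` and `coef0` and the two feature vectors are hypothetical; in practice these hyperparameters are what the cross-validated tuning selects.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    """Linear kernel: u . v"""
    return dot(u, v)


def rbf_kernel(u, v, gamma):
    """Radial basis function kernel: exp(-gamma * ||u - v||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma, coef0=0.0):
    """Sigmoid kernel: tanh(gamma * u . v + coef0)"""
    return math.tanh(gamma * dot(u, v) + coef0)


# Two hypothetical standardized feature vectors (e.g., attenuations of two bones)
x1 = [0.5, -1.0]
x2 = [1.5, 0.0]
print(linear_kernel(x1, x2), rbf_kernel(x1, x2, gamma=0.5), sigmoid_kernel(x1, x2, gamma=0.5))
```

The RBF kernel depends only on the distance between two patients' feature vectors, which lets the resulting SVM draw a nonlinear decision boundary across several bones at once.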
A random forest classifier is an ensemble learning method for classification based on constructing a multitude of decision trees during training [25]. The random forest model was fit and tuned with the number of variables tried at each step starting at 6, a step factor of 1.5, and 10,000 trees used during the tuning step.
Each of the four machine learning models (linear kernel SVM, RBF kernel SVM, sigmoid kernel SVM, and random forest classifier) was used to categorize patients as (i) osteoporotic using WHO guidelines, (ii) osteopenic/osteoporotic using WHO guidelines, (iii) femoral neck BMD T-score ≤ −2.5, and (iv) femoral neck BMD T-score < −1. The optimal models tuned in the training/validation dataset were retained for each analysis and were then used to predict the same four outcomes in the test dataset.
All test statistics were two-sided and p-values < 0.05 were considered statistically significant. We compared the ROC curves using DeLong's test [26] and compared the machine learning models to the CT attenuation of the radius 33% since this region is used to determine osteoporosis and osteopenia on DEXA studies of the forearm/wrist. Statistics were performed using the pROC, e1071, and randomForest packages in R v4.1.2 statistical software.
The optimal CT attenuation thresholds for each bone and the predictive performance of these optimal thresholds regarding predicting (i) osteoporosis, (ii) osteopenia/osteoporosis, (iii) femoral neck BMD T-score ≤ −2.5, and (iv) femoral neck BMD T-score < −1 are shown in Table 4.

Abbreviations: Radius: distal third of the radius; Radius UD: ultradistal radius (radius epiphysis/metaphysis); Radius 33%: distal third of the radial shaft; Ulna: distal third of the ulna; Ulna UD: distal ulna (ulnar epiphysis/metaphysis); Ulna 33%: distal third of the ulnar shaft; 1 MC: proximal third of the first metacarpal; 2 MC: proximal third of the second metacarpal; 3 MC: proximal third of the third metacarpal; 4 MC: proximal third of the fourth metacarpal; 5 MC: proximal third of the fifth metacarpal; --: undefined.

Training/Validation Dataset
We found that the CT attenuation of each bone was a significant predictor of osteoporosis in the training/validation dataset (Supplementary Figure S1). The highest AUCs were for the hamate (optimal CT threshold 170.166 Hounsfield units (HU), AUC = 0.769), capitate (optimal CT threshold 248.039 HU, AUC = 0.763), and first metacarpal (optimal CT threshold −7.772 HU, AUC = 0.752). The radius 33% had an AUC of 0.705 in the training/validation dataset. The linear kernel SVM (AUC = 0.894) and radial basis function (RBF) kernel SVM (AUC = 0.987) had the highest AUCs of the machine learning models in the training/validation dataset.

Test Dataset
The performances of the CT attenuation thresholds obtained from the training/validation dataset were evaluated in the test dataset. The hamate (AUC = 0.393), capitate (AUC = 0.636), first metacarpal (AUC = 0.639), and radius 33% (AUC = 0.576) showed lower predictive abilities in the test dataset than in the training/validation dataset. The RBF kernel SVM (AUC = 0.818) had a higher AUC than any of the single-bone CT attenuation thresholds and was significantly better than the radius 33% model (p = 0.020).

Training/Validation Dataset
When predicting osteopenia/osteoporosis in the training/validation dataset, we found that the CT attenuation of each bone was predictive of osteoporosis/osteopenia (Supplementary Figure S2). The first metacarpal (optimal CT threshold 27.779 HU, AUC = 0.823), scaphoid (optimal CT attenuation threshold 250.749 HU, AUC = 0.773), and lunate (optimal CT attenuation threshold 258.091 HU, AUC = 0.768) were the most predictive, whereas the radius UD (AUC = 0.528), third metacarpal (AUC = 0.529), and fourth metacarpal (AUC = 0.579) were the least predictive bones. The radius 33% had an AUC of 0.716 regarding predicting osteopenia/osteoporosis in the training/validation dataset. The RBF kernel SVM (AUC = 0.969) had the highest AUC of all the machine learning methods investigated regarding predicting osteopenia/osteoporosis in the training/validation dataset.

Test Dataset
In the test dataset, we found that the CT attenuation thresholds obtained from the training/validation dataset for the first metacarpal (AUC = 0.445), scaphoid (AUC = 0.651), and lunate (AUC = 0.433) were less predictive. The radius 33% CT attenuation threshold had an AUC of 0.563 in the test dataset. The RBF kernel SVM (AUC = 0.805) had the highest AUC and accuracy in the test dataset, but it was not significantly better than the radius 33% (p = 0.068).

Test Dataset
In the test dataset, we found that the CT attenuation thresholds obtained from the training/validation dataset for the fifth metacarpal (AUC = 0.398), pisiform (AUC = 0.415), first metacarpal (AUC = 0.618), and radius 33% (AUC = 0.426) were less predictive. The RBF kernel SVM had an AUC of 0.770 for identifying patients with a femoral neck BMD T-score ≤ −2.5 and was significantly better than the model using the radius 33% CT attenuation threshold (p < 0.001).

Predicting Femoral Neck BMD T-Score < −1

Training/Validation Dataset
We found that a CT attenuation threshold of the trapezium (optimal CT attenuation threshold 165.624 HU, AUC = 0.722) had the best AUC to predict femoral neck BMD T-score < −1 in the training/validation dataset (Supplementary Figure S4). The scaphoid (optimal CT attenuation threshold 229.799 HU, AUC = 0.719) and pisiform (optimal CT attenuation threshold 221.709 HU, AUC = 0.714) were the next best predictors of having a femoral neck BMD T-score < −1. The radius 33% (AUC = 0.605), radius (AUC = 0.603), radius UD (AUC = 0.558), and hamate (AUC = 0.584) were the worst predictors of having a femoral neck BMD T-score < −1 in the training/validation dataset. The RBF kernel SVM (AUC = 0.987) had the highest AUC of all the multivariable machine learning models regarding predicting femoral neck BMD T-score < −1 in the training/validation dataset.

Discussion
We found strong positive correlations between the CT attenuation of the wrist/forearm bones. There were stronger correlations between the CT attenuations of the radius and ulna measurements, stronger correlations between the carpal CT attenuations, and stronger correlations between the metacarpal CT attenuations. The CT attenuations of the wrist/forearm bones were largely positively correlated with the DEXA measurements. The RBF kernel SVM was best for predicting osteoporosis with a higher AUC (AUC = 0.818) than other models and statistically better than radius 33%. The RBF kernel SVM also had the highest AUC (AUC = 0.805) regarding predicting patients with osteoporosis/osteopenia. We found that the RBF kernel SVM was best for predicting a femoral neck BMD T-score ≤ −2.5 and was statistically better than the CT attenuation of radius 33%. The data also showed that the RBF kernel SVM was the best for predicting a femoral neck BMD T-score < −1 and statistically better than the CT attenuation of radius 33%.
These results have significant clinical implications. We showed that opportunistic screening for osteoporosis and osteopenia can be performed using routine CT scans of the wrist/forearm obtained for clinical care. We also showed that the accuracy of a machine learning model using the CT attenuation of multiple bones and clinical/demographic variables exceeded that of a single bone. We provided CT attenuation thresholds for each wrist/forearm bone that could be used to identify patients who should go on to get screened for osteoporosis or osteopenia/osteoporosis with a formal DEXA study. This has the potential to identify patients earlier in their course of BMD loss and get patients to clinical care earlier.
The radius transmits the majority of the force from the wrist to the elbow and is therefore often fractured when individuals with diminished BMD fall on an outstretched hand [6,7,27]. Forearm fractures, in particular, distal radius fractures, are often frailty fractures related to diminished BMD. Although the radius is often evaluated on DEXA studies for the prediction of osteopenia/osteoporosis and osteoporosis, we found that the CT attenuation of the radius was not the single best bone for predicting osteopenia/osteoporosis and osteoporosis in the analysis. Bone homeostasis is regulated by the activities of three cell types, namely, the osteoclasts, osteocytes, and osteoblasts, and is generally kept in dynamic equilibrium to maintain bone mass [28]. Our data suggested that all bones may be differentially affected when there is diminished bone mass because of the lack of perfect correlation between the CT attenuation of bones and the likely BMD of each bone [29]. Each bone also likely has a different trabecular bone structure in order to best suit its mechanical/structural demands, and this may affect the mean CT attenuation between bones.
The CT attenuation of the hamate had the best performance to differentiate patients with osteoporosis from those without osteoporosis. The patients in this study were aged 50 years or older, and likely had degenerative changes of the wrist. Degenerative changes of the carpus often affect the scaphoid, trapezium, and trapezoid (triscaphe and first carpometacarpal joint degenerative changes), scapholunate, and the pisotriquetral articulations. These degenerative changes are manifested by increased sclerosis and eburnation, and this increased sclerosis likely affects the CT attenuation of each bone. We hypothesized that the hamate was least likely to be affected by degenerative changes and this was likely the reason it had the best performance. One strength of the multivariable SVM and random forest models was that these models utilized the CT attenuation of all the bones and, therefore, had better performance than the CT attenuation of any individual bone.
A prior study noted that the second metacarpal cortical index can be used for opportunistic screening for osteoporosis [22]. We showed that the CT attenuation of the second metacarpal is predictive of osteoporosis and osteopenia/osteoporosis, as is the CT attenuation of the other metacarpals. However, the CT attenuation of the metacarpals was less accurate than the multivariable machine learning models for the diagnosis of osteoporosis and osteopenia/osteoporosis.
One limitation of this study was the small sample size. We found that most patients with CT scans of the wrist/forearm at our institution did not have concurrent DEXA studies, which suggests that there is under-screening/inadequate screening for osteopenia and osteoporosis. Our method may help to identify patients who had a CT scan of the wrist/forearm who should go on to have a formal DEXA study to screen for osteopenia and osteoporosis in the near future. Patients all had their CT scans of the wrist/forearm performed using Siemens scanners, which is a limitation; however, a recently published article suggests that there is minimal bias in CT attenuation measurements between CT manufacturers [30]. This study was a retrospective study at a single multi-center tertiary care academic institution. Another limitation is that most patients were white; therefore, it is unclear how our results will generalize to other races/ethnicities. While we showed methods regarding how to categorize patients as osteoporotic or osteopenic/osteoporotic using the CT attenuation of the wrist/forearm bones based on DEXA studies, further work is required to evaluate how well these measurements predict future frailty fractures.

Conclusions
In summary, opportunistic screening for osteoporosis and osteopenia/osteoporosis could be performed using the CT attenuation of each of the bones from CT scans of the wrist/forearm. We used machine learning to show that using the CT attenuation of multiple bones was more accurate than using the CT attenuation of a single bone. DEXA scans currently evaluate only the lumbar spine and the hips to assess global bone mineral density. CT attenuation data from routine CT scans of the wrist and forearm can be used to identify patients at risk for osteoporosis who should go on to have formal screening for osteoporosis using DEXA scans.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics12030691/s1, Figure S1: Performance of the CT attenuation of each bone and multivariable machine learning models to predict osteoporosis; Figure S2: Performance of the CT attenuation of each bone and multivariable machine learning models to predict osteopenia/osteoporosis; Figure S3: Performance of the CT attenuation of each bone and multivariable machine learning models to predict femoral neck BMD T-score ≤ −2.5; Figure S4: Performance of the CT attenuation of each bone and multivariable machine learning models to predict femoral neck BMD T-score < −1.

Institutional Review Board Statement:
The study was reviewed and approved by the local IRB (IRB #21-000724).

Informed Consent Statement:
The study protocol was reviewed and approved by the IRB at the senior author's institution. The need for signed informed consent from each patient was waived because of the retrospective nature of the study.

Data Availability Statement:
Data are available upon request from the corresponding author, subject to IRB approval.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.