Article

Predicting Breast Cancer Risk Using Radiomics Features of Mammography Images

1
Department of Breast and Endocrine Surgery, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
2
Department of Radiology, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
3
Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
*
Author to whom correspondence should be addressed.
J. Pers. Med. 2023, 13(11), 1528; https://doi.org/10.3390/jpm13111528
Submission received: 14 September 2023 / Revised: 23 October 2023 / Accepted: 23 October 2023 / Published: 25 October 2023

Abstract

Mammography images contain a lot of information about not only the mammary glands but also the skin, adipose tissue, and stroma, which may reflect the risk of developing breast cancer. We aimed to establish a method to predict breast cancer risk using radiomics features of mammography images and to enable further examinations and prophylactic treatment to reduce breast cancer mortality. We used mammography images of 4000 women with breast cancer and 1000 healthy women from the ‘starting point set’ of the OPTIMAM dataset, a public dataset. We trained a Light Gradient Boosting Machine using radiomics features extracted from mammography images of women with breast cancer (only the healthy side) and healthy women. This model was a binary classifier that could discriminate whether a given mammography image was of the contralateral side of women with breast cancer or not, and its performance was evaluated using five-fold cross-validation. The average area under the curve for five folds was 0.60122. Some radiomics features, such as ‘wavelet-H_glcm_Correlation’ and ‘wavelet-H_firstorder_Maximum’, showed distribution differences between the malignant and normal groups. Therefore, a single radiomics feature might reflect the breast cancer risk. The odds ratio of breast cancer incidence was 7.38 in women whose estimated malignancy probability was ≥0.95. Radiomics features from mammography images can help predict breast cancer risk.

1. Introduction

The importance of identifying women at high risk of developing breast cancer has increased in recent years. Globally, breast cancer affects more women than any other cancer [1]. Early detection and treatment can achieve a high cure rate. Mammography images contain a lot of information about the breasts (mammary gland, stroma, breast size, breast shape, skin properties, and intramammary fat). Using this information, mammography screening has contributed to the early detection of breast cancer and reduced mortality [2,3,4,5,6]. However, some tumors are difficult to detect as malignant tumors using mammography owing to the location of the breast tumor [7,8,9], tumor size [8], tumor density [8], and dense breasts [10,11]. The presence of these non-technical false-negative cases is a weakness of mammography-only screening.
Therefore, using other modalities, such as digital breast tomosynthesis [12], ultrasonography [13], computed tomography (CT), and magnetic resonance imaging (MRI) [14], is essential. However, the application of these modalities needs to be limited to high-risk cases [15], such as those with BRCA1/2 pathogenic variants [16]. This is because they may increase the false positive rate and radiation exposure. Furthermore, they have not been shown to reduce mortality [14,17].
Various models have been devised to calculate the risk of developing breast cancer. The most well-known is the Gail model [18]. However, its accuracy is limited because it calculates the risk from clinical information alone. In recent years, adding genetic information, such as single-nucleotide polymorphisms, has been expected to improve accuracy [19,20]. However, genetic testing of every woman is not realistic because of its high cost, the disadvantages associated with the test results (discrimination and anxiety before the onset of disease due to genetic information), and the fact that the test results may also affect blood relatives. Other risk calculation models have used mammography images [20,21]. In recent reports, risk assessment was performed using not only mammary gland density but also the texture of the mammary gland area [22].
Several changes are known to occur in the mammary glands of women at high risk for breast cancer. For example, diabetes, a risk factor for breast cancer development [23], can cause a characteristic change in the mammary glands of patients with diabetes, known as diabetic mastopathy. This change can be detected using ultrasonography and may also be identified on mammography images [24,25].
Furthermore, BRCA1/2 pathogenic variants are reflected in MRI images [26]. In this method, radiomics features of MRI images are utilized and considered useful for extracting genetic information from medical images. Radiomics can extract texture and shape features from radiological images that are invisible to the naked eye and is one of the most useful methods for analyzing specific regions in radiological images [27].
Radiomics has been used in various modalities in the field of breast cancer [28]. In recent years, a number of studies, particularly those on mammography [29], have reported its usefulness for predicting not only whether a tumor is benign or malignant [30,31], but also its molecular subtype [32,33], risk of recurrence [34], and prognosis [35]. Because the radiomics features of mammography images carry this much information, they might also be used to estimate the risk of developing breast cancer, which could lead to individually optimized breast cancer screening methods [36].
In fact, some pilot studies using small, closed datasets have demonstrated the potential for predicting the risk of breast cancer development from radiomics features of mammography [22,37]. In particular, Zheng et al. [37] reported that the risk of breast cancer development could be predicted with very high accuracy (area under the curve (AUC): 0.85) using radiomics features of mammography. However, validation using large-scale or public datasets has not yet been reported.
In this cross-sectional study, we used a large and publicly accessible dataset. Radiomics features calculated from mammography images of a healthy-side breast in women with breast cancer were compared with those of breast cancer-free women to estimate the individual risk of developing breast cancer and to identify radiomics features associated with the risk of developing breast cancer.

2. Materials and Methods

First, we hypothesized that both breasts of a woman share the same environment, including germline pathogenic variants [38], exposure to estrogen [39], obesity [40], and lifestyle [41], and that this shared environment contributes to breast cancer development. In breast cancer treatment, some reports suggest that the outcome can be predicted from the image data of the healthy-side breast [42]. Thus, we assumed that images obtained from the contralateral breast of a patient with breast cancer reflect information about the whole-body environment.
Based on this hypothesis, we compared radiomics features of mammography images of patients with breast cancer and breast-cancer-free women to predict the risk of developing breast cancer. For the malignant group, we used the mammography images from the healthy side, whereas for the breast-cancer-free women, we randomly selected mammography images from either breast to match the left–right ratio of the patients with breast cancer during validation. Using radiomics features calculated from whole-breast mammography images in both groups, we aimed to predict the breast cancer risk and to identify specific radiomics features associated with an increased risk of breast cancer development.

2.1. Dataset

We used the OPTIMAM Mammography Image Database [43] as our mammography dataset. This dataset contains screening mammography images and patient data collected from three institutions in the United Kingdom since 2011: the Jarvis Breast Screening Centre in Guildford, St George’s Hospital in southwest London, and Addenbrooke’s Hospital in Cambridge. The entire dataset consisted of 154,832 normal, 6909 benign and 9690 malignant cases and 1888 intermediate cancers in 173,319 women; however, we received 1000 normal, 1000 benign, and 4000 malignant cases as the ‘starting point set’. The 4000 malignant cases were defined as the malignant group and the 1000 normal cases as the normal group. However, the 1000 benign cases were not used.
For the malignant group, we used the mammography images obtained just before the biopsy that diagnosed breast cancer. For the normal group, we used cases that were still judged as normal on mammography images taken more than one year later, and we used the oldest image of each case.
The proportion of mammography equipment manufacturers for the normal and malignant groups is shown in Table 1. The proportion of malignant cases was higher among images taken with Philips equipment than among those taken with equipment from other manufacturers. To control for potential confounding effects due to differences in equipment, we used only images obtained with Hologic mammography devices, the most frequently used manufacturer (85.1% in the malignant group and 92.0% in the normal group).
We also excluded cases with artificial objects in the images (implants, piercings, implanted devices for cardiac disease), large breast sizes that prevented routine imaging, bilateral breast cancer, and one-sided mammography images. A summary of the excluded cases is shown in Table 2. Finally, 3215 malignant cases and 896 normal cases were included in the analysis.
All 4111 cases that met the inclusion criteria had mammography images available in two directions, mediolateral oblique (MLO) and craniocaudal (CC). However, because of substantial inter-individual variability in MLO-view images arising from factors such as the breast compression angle, success of pectoral muscle compression, and presence of abdominal subcutaneous fat, we chose to use only CC-view images in our study. For the malignant group, we used CC images from the healthy breast. For the normal group, we randomly selected the left or right CC image of each case to match the left–right ratio of the malignant group.

2.2. Breast Mask Image

We created a mask image of the breast region for each image. The mask was created by thresholding the image at 1/100 of the threshold value calculated with Otsu’s binarization method [44]. In the cases where mask creation failed with this scaled threshold, we were still able to create masks by using the unscaled threshold value calculated with Otsu’s binarization method.
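The masking step can be illustrated with a minimal sketch. It assumes that the "1/100" threshold means 1/100 of the Otsu threshold and that pixel values are small non-negative integers; the function names `otsu_threshold` and `breast_mask` and the toy image are hypothetical and are not taken from the study's code.

```python
def otsu_threshold(pixels, levels=256):
    """Return the Otsu threshold t for integer pixels in [0, levels).

    Pixels <= t form the background class; t maximizes the
    between-class variance of the two classes.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels in the background class
    sum_bg = 0  # sum of intensities in the background class
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def breast_mask(image, scale=0.01):
    """Binary mask: 1 where the pixel exceeds scale * Otsu threshold.

    scale=0.01 is the assumed reading of the "1/100" threshold;
    scale=1.0 corresponds to the plain Otsu fallback.
    """
    flat = [p for row in image for p in row]
    threshold = otsu_threshold(flat) * scale
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Toy image (hypothetical): zeros are air, one faint skin-edge pixel (5),
# and brighter values are breast tissue.
toy_image = [
    [0, 0, 120, 200],
    [0, 0, 130, 210],
    [0, 5, 125, 205],
]
scaled_mask = breast_mask(toy_image, scale=0.01)
plain_mask = breast_mask(toy_image, scale=1.0)
```

In this toy case the scaled threshold keeps the faint edge pixel inside the mask, whereas the plain Otsu threshold drops it, which is one plausible reason for preferring a low threshold when the whole breast, including the skin, is to be analyzed.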

2.3. Radiomics Features

Radiomics feature values for all breast regions were calculated using PyRadiomics, an open-source Python package (http://www.radiomics.io/pyradiomics.html, accessed on 14 September 2023). The radiomics features used in this study were classified into seven categories: shape-based 2D, First Order Statistics, Gray Level Cooccurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Size Zone Matrix (GLSZM), Gray Level Dependence Matrix (GLDM), and Neighborhood Gray Tone Difference Matrix (NGTDM). Shape-based 2D features were calculated only from the original mammography images. The other six categories of radiomics features were calculated from both the original mammography images and the mammography images processed with six filters: four Laplacian of Gaussian (LoG) filters (log-sigma-2-0 mm, log-sigma-3-0 mm, log-sigma-4-0 mm, and log-sigma-5-0 mm) and two wavelet filters (wavelet-H and wavelet-L). As a result, a total of 646 radiomics features were obtained. The feature items used in this study (https://pyradiomics.readthedocs.io/en/latest/features.html, accessed on 14 September 2023) are shown in Figure 1.

2.4. Analysis

All machine learning analyses and statistical processing were performed in Python 3.6.5.
We used the Light Gradient Boosting Machine (LGBM) (https://lightgbm.readthedocs.io/en/v3.3.2/) for our classification task. LGBM is an open-source distributed gradient-boosting framework developed by Microsoft Corporation that uses supervised learning to compute the objective variable from the explanatory variables using a decision tree method. Unlike the sort-based decision tree algorithm used in eXtreme Gradient Boosting and other implementations, LGBM features a highly optimized histogram-based decision tree learning algorithm, which increases efficiency and reduces memory consumption.
We evaluated the performance of the classification task using a k-fold cross-validation scheme. The cases were randomly divided into k equal subsets, and the evaluation was performed k times, each time using a different subset as the test set and the remaining subsets as the training set. In this study, k was set to 5 to ensure a sufficient number of normal and malignant cases in each subset. Because each case is tested only by a model that was not trained on it, the k-fold cross-validation scheme gives a performance estimate that is less affected by overfitting.
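The splitting scheme described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `k_fold_indices` and the fixed seed are assumptions, and the split is purely random because the text does not state whether the folds were stratified by class.

```python
import random


def k_fold_indices(n_cases, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Cases are shuffled once and dealt into k folds whose sizes differ
    by at most one; each fold serves as the test set exactly once.
    """
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 4111 cases met the inclusion criteria (3215 malignant + 896 normal).
splits = list(k_fold_indices(4111, k=5))
```

Each case appears in exactly one test set, so every prediction used for evaluation comes from a model that never saw that case during training.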
We trained and tested the LGBM with the target variable being whether the case was malignant or normal and the explanatory variables being the radiomics features. We calculated the accuracy and AUC for each split. Because of the age bias in the dataset, we performed the analysis for all ages and then repeated it separately for women in their 50s and for women in their 60s to control for age effects. We verified whether the AUC for each age group significantly exceeded 0.5 using a one-sample t-test.
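The two evaluation quantities in this paragraph can be sketched in plain Python. The AUC is computed here through its rank interpretation (the probability that a randomly chosen malignant case receives a higher score than a randomly chosen normal case), and the one-sample t statistic compares the per-fold AUCs with 0.5. The helper names and the per-fold AUC values below are hypothetical and are not the values reported in Table 4.

```python
import math
import statistics


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_sample_t(values, mu0):
    """t statistic for H0: mean(values) == mu0 (sample standard deviation)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return (mean - mu0) / (sd / math.sqrt(n))


# Hypothetical per-fold AUCs, for illustration only.
fold_aucs = [0.58, 0.60, 0.62, 0.59, 0.61]
t_stat = one_sample_t(fold_aucs, mu0=0.5)

# Five folds give 4 degrees of freedom; the two-sided 5% critical
# value of the t distribution with 4 degrees of freedom is about 2.776.
exceeds_chance = t_stat > 2.776
```

With only five folds the test has four degrees of freedom, so the result depends strongly on how consistent the AUC is across folds rather than on its average alone.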
The probability of malignancy for each case was predicted using the trained LGBM. The threshold for the malignant probability for calculating the odds ratio of being the contralateral breast of a malignant tumor was set at 0.95.
In the classification task distinguishing malignant from normal cases, we identified the radiomics features that were frequently used for decision tree branching. To verify the distribution differences of these radiomics features between malignant and normal cases, we conducted an analysis of covariance (ANCOVA) adjusted for age.

3. Results

The cases used in the analysis, categorized by age and by left and right sides of the mammographic image, are shown in Table 3. All LGBM parameters were set as default values, except n_estimators = 1,000,000, learning_rate = 0.0001, and class_weight = ‘balanced’. The AUC, accuracy, and receiver operating characteristic curves calculated from the five-fold cross-validation are shown in Table 4 and Figure 2a–c for each fold. The AUC exceeded 0.5 in all folds for all cases and for the groups in their 50s and 60s. However, the AUC was lower in those in their 50s and 60s than in all cases, possibly due to the small number of training cases. We performed a one-sample t-test for the AUC of each age group and confirmed that each AUC was significantly above 0.5 (all ages: p < 0.001, 50s: p = 0.030, 60s: p = 0.002).
We present histograms of the predicted probability of malignancy for each case in the malignant and normal groups in Figure 2d–e. The histogram of the malignant cases peaks further to the right than that of the normal cases. Furthermore, many malignant cases (n = 103, 3.2%) have a probability of malignancy > 0.95 (red bar), whereas very few normal cases do (n = 4, 0.4%). Using a cut-off value of the probability of malignancy > 0.95, the odds ratio of being a malignant case is 7.38 (i.e., (103/4)/(3112/892)).
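The odds ratio can be reproduced from the counts given in the text. The sketch below only restates that arithmetic; the function name `odds_ratio` and the variable names are illustrative.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds ratio of a 2x2 table: (a / b) / (c / d)."""
    return (exposed_cases / exposed_controls) / (unexposed_cases / unexposed_controls)


# Counts from the text: 3215 malignant and 896 normal cases in total,
# of which 103 and 4, respectively, had a predicted probability > 0.95.
malignant_total, normal_total = 3215, 896
malignant_high, normal_high = 103, 4

malignant_low = malignant_total - malignant_high  # 3112
normal_low = normal_total - normal_high           # 892

or_high_risk = odds_ratio(malignant_high, normal_high, malignant_low, normal_low)
print(round(or_high_risk, 2))  # 7.38
```

Because only four normal cases fall above the cut-off, the estimate is sensitive to small changes in that count.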
The normal group has equal numbers of left and right sides, as both sides are available.
Some features were found to be important as they were frequently used in decision tree branching across multiple folds (Table 5). For example, ‘wavelet-H_glcm_Correlation’ and ‘wavelet-H_firstorder_Maximum’ ranked among the top 10 in terms of importance in all folds, and ‘original_glrlm_LongRunLowGrayLevelEmphasis’ ranked among the top 10 in terms of importance in four folds.
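Counting how often a feature reaches the top ranks across folds can be sketched as below. The importance values are hypothetical and serve only to show the bookkeeping; the function name `folds_in_top_k` is also an assumption, and `k=2` is used instead of 10 to keep the example short.

```python
from collections import Counter


def folds_in_top_k(importances_per_fold, k=10):
    """Count, for each feature, the number of folds in which it ranks in the top k.

    importances_per_fold: list of dicts mapping feature name -> importance
    (for example, the number of times the feature was used for a split).
    """
    counts = Counter()
    for importances in importances_per_fold:
        top = sorted(importances, key=importances.get, reverse=True)[:k]
        counts.update(top)
    return counts


# Hypothetical split counts for three features in three folds.
example_importances = [
    {"wavelet-H_glcm_Correlation": 50, "wavelet-H_firstorder_Maximum": 30,
     "original_glrlm_LongRunLowGrayLevelEmphasis": 10},
    {"wavelet-H_glcm_Correlation": 40, "wavelet-H_firstorder_Maximum": 5,
     "original_glrlm_LongRunLowGrayLevelEmphasis": 20},
    {"wavelet-H_glcm_Correlation": 60, "wavelet-H_firstorder_Maximum": 25,
     "original_glrlm_LongRunLowGrayLevelEmphasis": 24},
]
top_counts = folds_in_top_k(example_importances, k=2)
```

A feature that reaches the top ranks in every fold is less likely to owe its importance to the particular random split.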
Figure 3 shows the 10 radiomics features most frequently used for decision tree branching across all folds, together with histograms of their values in the normal and malignant groups. We analyzed the differences in the distribution of each radiomics feature between the normal and malignant groups after adjustment for age. Significant differences in the following three radiomics features were observed between the normal and malignant groups even after age adjustment: ‘wavelet-H_glcm_Correlation’ (p < 0.001), ‘wavelet-H_firstorder_Maximum’ (p < 0.001), and ‘wavelet-H_glcm_Imc2’ (p < 0.001) (Figure 3).

4. Discussion

In recent years, deep learning has been widely used in the field of image analysis with the advancement of computational equipment. Deep learning-based methods might also be able to predict the risk of developing breast cancer using mammography images of the contralateral breast. Nevertheless, for the classification task in this study, we chose LGBM, a decision tree-based method, for two reasons. First, conventional machine learning has been reported to be more useful than deep learning when the number of cases is not sufficiently large relative to the number of radiomics features used as explanatory variables [45], and we used a large number of radiomics features in this study. Second, decision tree methods allow us to explain the classification in terms of radiomics features, whereas in deep learning methods it is challenging to explain which elements of each image are used for classification. With LGBM, we were able to identify several radiomics features, such as ‘wavelet-H_glcm_Correlation’ and ‘wavelet-H_firstorder_Maximum’, which may be useful in predicting the risk of breast cancer development. In the future, it is desirable to verify whether these features are related to known breast cancer risk factors, such as genetic mutations [16], smoking [41], and long-term estrogen exposure [39].
While only the mammary gland area was used to calculate radiomics features in previous studies [22,37], the whole breast was used in our study. This has two advantages. One is that it is more versatile and simpler, as no specific breast region is extracted. The second is that information from outside the mammary area (skin properties and adipose tissue area) can also be incorporated into the model. Biologically, we know that parenchymal stromal cells and adipocytes in the breast influence the development and progression of breast cancer, and that estrogen receptors, which are largely responsible for breast cancer development, are also expressed on epidermal and dermal cells [46]; therefore, information from outside the mammary gland region may also reflect the risk of developing breast cancer.
However, this approach also captures mammary gland density, which has been correlated with breast cancer development in previous reports [21]. Therefore, there was a concern that the results of this study might simply reflect mammary gland density. In practice, however, the features that contributed to classifying the malignant and normal groups in this study were not features that are strongly and directly correlated with mammary gland density, such as ‘firstorder_original_mean’.
Therefore, this suggests that some radiomics features that are not highly associated with mammary gland density are associated with the risk of developing breast cancer. The AUC for the risk of developing breast cancer estimated from mammary gland density alone in a previous study using 1 million mammography images with approximately 10,000 malignant cases was 0.57 [47], whereas the AUC obtained from whole-breast radiomics features in this study was equal to or greater than this value. This suggests that the whole-breast radiomics features reflect the risk of developing breast cancer, and ‘wavelet-H_glcm_Correlation’ and ‘wavelet-H_firstorder_Maximum’ are good candidates for this, as they are frequently used in decision tree branching and show statistically significant differences between the normal and malignant groups.
The relationship between these radiomics features and the risk of developing breast cancer can also be visually observed in histograms of relative frequency densities. For ‘wavelet-H_glcm_Correlation’, high bars for malignant cases can be seen to the left of the peak, whereas for ‘wavelet-H_firstorder_Maximum’, high bars for malignant cases can be seen to the right of the peak. The former feature, ‘wavelet-H_glcm_Correlation’, represents local similarity with neighboring pixels after high-frequency component emphasis (using wavelet transform) [48]. The latter feature, ‘wavelet-H_firstorder_Maximum’, represents the maximum pixel value with high-frequency component emphasis. Therefore, higher local inhomogeneity and maximal intensity in high-frequency emphasized images are related to the risk of developing breast cancer.
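To make the meaning of a GLCM correlation value more concrete, the following simplified sketch computes it for two tiny images. It is not the PyRadiomics implementation: it uses a single horizontal neighbor offset, no wavelet filtering, and gray levels that are already discretized, and the function name `glcm_correlation` is hypothetical.

```python
def glcm_correlation(image, levels):
    """Correlation of a symmetric gray-level co-occurrence matrix.

    Only horizontally adjacent pixel pairs are counted (one offset).
    Pixel values must be integers in [0, levels).
    """
    glcm = [[0.0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            glcm[a][b] += 1
            glcm[b][a] += 1  # symmetric: count each pair in both orders
    total = sum(sum(r) for r in glcm)
    p = [[v / total for v in r] for r in glcm]

    cells = [(i, j) for i in range(levels) for j in range(levels)]
    # The matrix is symmetric, so row and column marginals share
    # the same mean and variance.
    mu = sum(i * p[i][j] for i, j in cells)
    var = sum((i - mu) ** 2 * p[i][j] for i, j in cells)
    if var == 0:
        return 1.0  # flat region: correlation undefined, 1.0 chosen here by convention
    cov = sum((i - mu) * (j - mu) * p[i][j] for i, j in cells)
    return cov / var


smooth = [[0, 0, 1, 1],
          [0, 0, 1, 1]]        # neighbors mostly share the same gray level
checkerboard = [[0, 1, 0, 1],
                [1, 0, 1, 0]]  # every neighbor differs

corr_smooth = glcm_correlation(smooth, levels=2)              # 1/3
corr_checkerboard = glcm_correlation(checkerboard, levels=2)  # -1
```

The locally homogeneous image yields the higher correlation, which is consistent with reading lower values of this feature as higher local inhomogeneity.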
Some other radiomics features show differences in distribution between the malignant and normal cases in the relative frequency density histograms, while others do not show much difference. These features that do not show differences in the relative frequency density histograms were considered to be features that do not reflect the risk of developing breast cancer when used as stand-alone radiomics features. However, when combined with other radiomics features, they may contribute to risk estimation for developing breast cancer.
In this analysis, using all 646 radiomics features, we were able to calculate, for each image, the probability that it showed the contralateral breast of a woman with breast cancer. When we set the cut-off threshold of this probability at 0.95, the odds ratio of being the contralateral breast of a woman with breast cancer was 7.38. This value can be interpreted as an approximation of the relative risk of the high-risk group defined in this study. As such, it may be possible to recommend additional imaging examinations and closer screening schedules for women whose probability of malignancy exceeds 0.95 but who do not show abnormalities on mammography. A sufficiently high pre-test probability may justify follow-up ultrasound and/or contrast-enhanced MRI as further breast cancer screening.
In recent years, prophylactic treatment, medication, and surgical treatment have also been considered for individuals at high risk of developing breast cancer [49,50].
Identifying high-risk individuals using our method may aid in selecting appropriate prophylactic medication targets. The accuracy of this approach is expected to improve as more cases are accumulated. Furthermore, machine learning using many images generally requires significant data preparation and cleaning efforts, such as setting regions of interest. However, this method is simple and has the advantage that the model can be easily enhanced even after accumulating a large number of cases.
In this study, we did not determine what events in the breast tissue were reflected in the radiomics features suggested to be associated with breast cancer risk. These features may reflect changes in the mammary tissue or adipose tissue due to hormonal balance, blood glucose levels, or other factors such as genetic mutations. Further investigation using datasets linked to clinical data or genetic information is needed to clarify this point.
Our study’s limitations include the retrospective nature of the investigation and the use of images of the contralateral breast in women with breast cancer. To validate the results of this study, observational studies of a population in which radiomics features are calculated from mammography images are needed in the future. Another limitation is that the mammography images used were from a single dataset and a single imaging equipment manufacturer. It is necessary to verify whether the results of this study apply to other datasets and mammography equipment from different manufacturers.
Finally, the AUC for breast cancer risk prediction obtained in this study is not very high. It is slightly higher than the AUC of the Gail model (AUC: 0.55 (0.52–0.57)) and the Tyrer–Cuzick model (AUC: 0.57 (0.55–0.59)) alone, and is comparable to the AUC obtained by adding mammary gland density information to the Gail (AUC: 0.59 (0.57–0.61)) and Tyrer–Cuzick models (AUC: 0.61 (0.59–0.63)) [51]. Therefore, improving accuracy with larger data sets, comparison with these known breast cancer risk models, and verification of synergistic effects are needed and will be the focus of our future work.

5. Conclusions

In conclusion, our study demonstrates that radiomics features obtained from mammography images using a simple method can predict the risk of developing breast cancer. We identified three radiomics features, ‘wavelet-H_glcm_Correlation’, ‘wavelet-H_firstorder_Maximum’, and ‘wavelet-H_glcm_Imc2’, which might reflect the risk of developing breast cancer. Furthermore, our results suggest that this method can identify a subgroup with 7.38-fold higher odds of breast cancer. We hope that incorporating these radiomics features into breast cancer screening will lead to the addition of other examination modalities to reduce mortality rates and help identify individuals at high risk of developing breast cancer for prophylactic treatment.

Author Contributions

Study conception and design, all authors; material preparation, data collection, analysis, draft preparation, writing (review and editing), and visualization, Y.S. (Yusuke Suzuki). All authors read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. For this type of study, formal consent is not required. This study used the open dataset (OPTIMAM Mammography Image Database) provided by Cancer Research Horizons and was an observational study. Therefore, although no ethics approval was necessary, this study was approved by the Institutional Review Board of The University of Tokyo (approval number: 1461-(10)).

Informed Consent Statement

Patient consent was waived because the dataset we used is an open one.

Data Availability Statement

The dataset analyzed in this study, OPTIMAM Mammography Image Database (OMI-DB), is available from (https://medphys.royalsurrey.nhs.uk/omidb/) upon reasonable request.

Acknowledgments

We are grateful to Cancer Research Horizons, the OPTIMAM project, and the staff at Royal Surrey NHS Foundation Trust who developed the database images.

Conflicts of Interest

Author T. Yoshikawa belongs to The Department of Computational Diagnostic Radiology and Preventive Medicine, which is sponsored by HIMEDIC, Inc. and Siemens Japan KK. The other authors have no relevant financial or non-financial interests to disclose.

References

  1. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249.
  2. Nyström, L.; Andersson, I.; Bjurstam, N.; Frisell, J.; Nordenskjöld, B.; Rutqvist, L.E. Long-Term Effects of Mammography Screening: Updated Overview of the Swedish Randomised Trials. Lancet 2002, 359, 909–919.
  3. Andersson, I.; Janzon, L. Reduced Breast Cancer Mortality in Women under Age 50: Updated Results from the Malmö Mammographic Screening Program. J. Natl. Cancer Inst. Monogr. 1997, 1997, 63–67.
  4. Duffy, S.W.; Tabar, L.; Olsen, A.H.; Vitak, B.; Allgood, P.C.; Chen, T.H.H.; Yen, A.M.F.; Smith, R.A. Absolute Numbers of Lives Saved and Overdiagnosis in Breast Cancer Screening, from a Randomized Trial and from the Breast Screening Programme in England. J. Med. Screen. 2010, 17, 25–30.
  5. Tabár, L.; Fagerberg, C.J.; Gad, A.; Baldetorp, L.; Holmberg, L.H.; Gröntoft, O.; Ljungquist, U.; Lundström, B.; Månson, J.C.; Eklund, G. Reduction in Mortality from Breast Cancer after Mass Screening with Mammography. Randomised Trial from the Breast Cancer Screening Working Group of the Swedish National Board of Health and Welfare. Lancet 1985, 1, 829–832.
  6. Frisell, J.; Lidbrink, E.; Hellström, L.; Rutqvist, L.E. Followup after 11 Years--Update of Mortality Results in the Stockholm Mammographic Screening Trial. Breast Cancer Res. Treat. 1997, 45, 263–270.
  7. Meeson, S.; Young, K.C.; Wallis, M.G.; Cooke, J.; Cummin, A.; Ramsdale, M.L. Image Features of True Positive and False Negative Cancers in Screening Mammograms. Br. J. Radiol. 2003, 76, 13–21.
  8. Goergen, S.K.; Evans, J.; Cohen, G.P.; MacMillan, J.H. Characteristics of Breast Carcinomas Missed by Screening Radiologists. Radiology 1997, 204, 131–135.
  9. Bird, R.E.; Wallace, T.W.; Yankaskas, B.C. Analysis of Cancers Missed at Screening Mammography. Radiology 1992, 184, 613–617.
  10. Posso, M.; Louro, J.; Sánchez, M.; Román, M.; Vidal, C.; Sala, M.; Baré, M.; Castells, X.; BELE Study Group. Mammographic Breast Density: How It Affects Performance Indicators in Screening Programmes? Eur. J. Radiol. 2019, 110, 81–87.
  11. Théberge, I.; Guertin, M.-H.; Vandal, N.; Côté, G.; Dufresne, M.-P.; Pelletier, É.; Brisson, J. Screening Sensitivity According to Breast Cancer Location. Can. Assoc. Radiol. J. 2019, 70, 186–192.
  12. Martínez Miravete, P.; Etxano, J. Breast tomosynthesis: A new tool for diagnosing breast cancer. Radiologia 2015, 57, 3–8.
  13. Scheel, J.R.; Lee, J.M.; Sprague, B.L.; Lee, C.I.; Lehman, C.D. Screening Ultrasound as an Adjunct to Mammography in Women with Mammographically Dense Breasts. Am. J. Obstet. Gynecol. 2015, 212, 9–17.
  14. DeMartini, W.; Lehman, C.; Partridge, S. Breast MRI for Cancer Detection and Characterization: A Review of Evidence-Based Clinical Applications. Acad. Radiol. 2008, 15, 408–416.
  15. Saslow, D.; Boetes, C.; Burke, W.; Harms, S.; Leach, M.O.; Lehman, C.D.; Morris, E.; Pisano, E.; Schnall, M.; Sener, S.; et al. American Cancer Society Guidelines for Breast Screening with MRI as an Adjunct to Mammography. CA Cancer J. Clin. 2007, 57, 75–89.
  16. National Comprehensive Cancer Network. Genetic/Familial High-Risk Assessment: Breast, Ovarian, and Pancreatic Version 1.2024—28 August 2023. Available online: https://www.nccn.org/professionals/physician_gls/pdf/genetics_bop.pdf (accessed on 9 September 2023).
  17. Ohuchi, N.; Suzuki, A.; Sobue, T.; Kawai, M.; Yamamoto, S.; Zheng, Y.-F.; Shiono, Y.N.; Saito, H.; Kuriyama, S.; Tohno, E.; et al. Sensitivity and Specificity of Mammography and Adjunctive Ultrasonography to Screen for Breast Cancer in the Japan Strategic Anti-Cancer Randomized Trial (J-START): A Randomised Controlled Trial. Lancet 2016, 387, 341–348.
  18. Gail, M.H.; Brinton, L.A.; Byar, D.P.; Corle, D.K.; Green, S.B.; Schairer, C.; Mulvihill, J.J. Projecting Individualized Probabilities of Developing Breast Cancer for White Females Who Are Being Examined Annually. J. Natl. Cancer Inst. 1989, 81, 1879–1886.
  19. Mavaddat, N.; Michailidou, K.; Dennis, J.; Lush, M.; Fachal, L.; Lee, A.; Tyrer, J.P.; Chen, T.-H.; Wang, Q.; Bolla, M.K.; et al. Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. Am. J. Hum. Genet. 2019, 104, 21–34.
  20. Allman, R.; Mu, Y.; Dite, G.S.; Spaeth, E.; Hopper, J.L.; Rosner, B.A. Validation of a Breast Cancer Risk Prediction Model Based on the Key Risk Factors: Family History, Mammographic Density and Polygenic Risk. Breast Cancer Res. Treat. 2023, 198, 335–347.
  21. Bodewes, F.T.H.; van Asselt, A.A.; Dorrius, M.D.; Greuter, M.J.W.; de Bock, G.H. Mammographic Breast Density and the Risk of Breast Cancer: A Systematic Review and Meta-Analysis. Breast 2022, 66, 62–68.
  22. Wei, J.; Chan, H.P.; Wu, Y.T.; Zhou, C.; Helvie, M.A.; Tsodikov, A.; Hadjiiski, L.M.; Sahiner, B. Association of Computerized Mammographic Parenchymal Pattern Measure with Breast Cancer Risk: A Pilot Case-Control Study. Radiology 2011, 260, 42–49.
  23. Wolf, I.; Sadetzki, S.; Catane, R.; Karasik, A.; Kaufman, B. Diabetes Mellitus and Breast Cancer. Lancet Oncol. 2005, 6, 103–111.
  24. Soler, N.G.; Khardori, R. Fibrous Disease Of The Breast, Thyroiditis, And Cheiroarthropathy In Type I Diabetes Mellitus. Lancet 1984, 323, 193–195.
  25. Kudva, Y.C.; Reynolds, C.A.; O’Brien, T.; Crotty, T.B. Mastopathy and Diabetes. Curr. Diab. Rep. 2003, 3, 56–59.
  26. Vasileiou, G.; Costa, M.J.; Long, C.; Wetzler, I.R.; Hoyer, J.; Kraus, C.; Popp, B.; Emons, J.; Wunderle, M.; Wenkel, E.; et al. Breast MRI Texture Analysis for Prediction of BRCA-Associated Genetic Risk. BMC Med. Imaging 2020, 20, 86.
  27. Lambin, P.; Rios-Velazquez, E.; Leijenaar, R.; Carvalho, S.; van Stiphout, R.G.P.M.; Granton, P.; Zegers, C.M.L.; Gillies, R.; Boellard, R.; Dekker, A.; et al. Radiomics: Extracting More Information from Medical Images Using Advanced Feature Analysis. Eur. J. Cancer 2012, 48, 441–446.
  28. Conti, A.; Duggento, A.; Indovina, I.; Guerrisi, M.; Toschi, N. Radiomics in Breast Cancer Classification and Prediction. Semin. Cancer Biol. 2021, 72, 238–250. [Google Scholar] [CrossRef]
  29. Siviengphanom, S.; Gandomkar, Z.; Lewis, S.J.; Brennan, P.C. Mammography-Based Radiomics in Breast Cancer: A Scoping Review of Current Knowledge and Future Needs. Acad. Radiol. 2022, 29, 1228–1247. [Google Scholar] [CrossRef]
  30. Wang, G.; Shi, D.; Guo, Q.; Zhang, H.; Wang, S.; Ren, K. Radiomics Based on Digital Mammography Helps to Identify Mammographic Masses Suspicious for Cancer. Front. Oncol. 2022, 12, 843436. [Google Scholar] [CrossRef]
  31. Zhou, C.; Xie, H.; Zhu, F.; Yan, W.; Yu, R.; Wang, Y. Improving the Malignancy Prediction of Breast Cancer Based on the Integration of Radiomics Features from Dual-View Mammography and Clinical Parameters. Clin. Exp. Med. 2022, 23, 2357–2368. [Google Scholar] [CrossRef]
  32. Son, J.; Lee, S.E.; Kim, E.-K.; Kim, S. Prediction of Breast Cancer Molecular Subtypes Using Radiomics Signatures of Synthetic Mammography from Digital Breast Tomosynthesis. Sci. Rep. 2020, 10, 21566. [Google Scholar] [CrossRef] [PubMed]
  33. Ma, W.; Zhao, Y.; Ji, Y.; Guo, X.; Jian, X.; Liu, P.; Wu, S. Breast Cancer Molecular Subtype Prediction by Mammographic Radiomic Features. Acad. Radiol. 2019, 26, 196–201. [Google Scholar] [CrossRef] [PubMed]
  34. Tamez-Peña, J.-G.; Rodriguez-Rojas, J.-A.; Gomez-Rueda, H.; Celaya-Padilla, J.-M.; Rivera-Prieto, R.-A.; Palacios-Corona, R.; Garza-Montemayor, M.; Cardona-Huerta, S.; Treviño, V. Radiogenomics Analysis Identifies Correlations of Digital Mammography with Clinical Molecular Signatures in Breast Cancer. PLoS ONE 2018, 13, e0193871. [Google Scholar] [CrossRef]
  35. Jiang, X.; Zou, X.; Sun, J.; Zheng, A.; Su, C. A Nomogram Based on Radiomics with Mammography Texture Analysis for the Prognostic Prediction in Patients with Triple-Negative Breast Cancer. Contrast Media Mol. Imaging 2020, 2020, 5418364. [Google Scholar] [CrossRef]
  36. Onega, T.; Beaber, E.F.; Sprague, B.L.; Barlow, W.E.; Haas, J.S.; Tosteson, A.N.A.; Schnall, M.D.; Armstrong, K.; Schapira, M.M.; Geller, B.; et al. Breast Cancer Screening in an Era of Personalized Regimens: A Conceptual Model and National Cancer Institute Initiative for Risk-Based and Preference-Based Approaches at a Population Level. Cancer 2014, 120, 2955–2964. [Google Scholar] [CrossRef]
  37. Zheng, Y.; Keller, B.M.; Ray, S.; Wang, Y.; Conant, E.F.; Gee, J.C.; Kontos, D. Parenchymal Texture Analysis in Digital Mammography: A Fully Automated Pipeline for Breast Cancer Risk Assessment. Med. Phys. 2015, 42, 4149–4160. [Google Scholar] [CrossRef]
  38. Breast Cancer Association Consortium; Dorling, L.; Carvalho, S.; Allen, J.; González-Neira, A.; Luccarini, C.; Wahlström, C.; Pooley, K.A.; Parsons, M.T.; Fortuno, C.; et al. Breast Cancer Risk Genes—Association Analysis in More than 113,000 Women. N. Engl. J. Med. 2021, 384, 428–439. [Google Scholar] [CrossRef]
  39. Collaborative Group on Hormonal Factors in Breast Cancer. Menarche, Menopause, and Breast Cancer Risk: Individual Participant Meta-Analysis, Including 118 964 Women with Breast Cancer from 117 Epidemiological Studies. Lancet Oncol. 2012, 13, 1141–1151. [Google Scholar] [CrossRef]
  40. Lahmann, P.H.; Lissner, L.; Gullberg, B.; Olsson, H.; Berglund, G. A Prospective Study of Adiposity and Postmenopausal Breast Cancer Risk: The Malmö Diet and Cancer Study. Int. J. Cancer 2003, 103, 246–252. [Google Scholar] [CrossRef]
  41. Macacu, A.; Autier, P.; Boniol, M.; Boyle, P. Active and Passive Smoking and Risk of Breast Cancer: A Meta-Analysis. Breast Cancer Res. Treat. 2015, 154, 213–224. [Google Scholar] [CrossRef]
  42. Ragusi, M.A.A.; van der Velden, B.H.M.; Meeuwis, C.; Tetteroo, E.; Coerkamp, E.G.; van Nijnatten, T.J.A.; Jansen, F.H.; Wolters-van der Ben, E.J.M.; Jongen, L.; van Raamt, F.; et al. Long-Term Survival in Breast Cancer Patients Is Associated with Contralateral Parenchymal Enhancement at MRI: Outcomes of the SELECT Study. Radiology 2023, 307, e221922. [Google Scholar] [CrossRef]
  43. Halling-Brown, M.D.; Warren, L.M.; Ward, D.; Lewis, E.; Mackenzie, A.; Wallis, M.G.; Wilkinson, L.S.; Given-Wilson, R.M.; McAvinchey, R.; Young, K.C. OPTIMAM Mammography Image Database: A Large-Scale Resource of Mammography Images and Clinical Data. Radiol. Artif. Intell. 2021, 3, e200103. [Google Scholar] [CrossRef]
  44. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef]
  45. Tang, F.-H.; Xue, C.; Law, M.Y.; Wong, C.-Y.; Cho, T.-H.; Lai, C.-K. Prognostic Prediction of Cancer Based on Radiomics Features of Diagnostic Imaging: The Performance of Machine Learning Strategies. J. Digit. Imaging 2023, 36, 1081–1090. [Google Scholar] [CrossRef]
  46. Mao, Y.; Keller, E.T.; Garfield, D.H.; Shen, K.; Wang, J. Stromal Cells in Tumor Microenvironment and Breast Cancer. Cancer Metastasis Rev. 2013, 32, 303–315. [Google Scholar] [CrossRef] [PubMed]
  47. Dembrower, K.; Liu, Y.; Azizpour, H.; Eklund, M.; Smith, K.; Lindholm, P.; Strand, F. Comparison of a Deep Learning Risk Score and Standard Mammographic Density Score for Breast Cancer Risk Prediction. Radiology 2020, 294, 265–272. [Google Scholar] [CrossRef] [PubMed]
  48. Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybernitics 1973, SMC-3, 610–621. [Google Scholar] [CrossRef]
  49. Cuzick, J.; Sestak, I.; Forbes, J.F.; Dowsett, M.; Cawthorn, S.; Mansel, R.E.; Loibl, S.; Bonanni, B.; Evans, D.G.; Howell, A.; et al. Use of Anastrozole for Breast Cancer Prevention (IBIS-II): Long-Term Results of a Randomised Controlled Trial. Lancet 2020, 395, 117–122. [Google Scholar] [CrossRef]
  50. Li, X.; You, R.; Wang, X.; Liu, C.; Xu, Z.; Zhou, J.; Yu, B.; Xu, T.; Cai, H.; Zou, Q. Effectiveness of Prophylactic Surgeries in BRCA1 or BRCA2 Mutation Carriers: A Meta-Analysis and Systematic Review. Clin. Cancer Res. 2016, 22, 3971–3981. [Google Scholar] [CrossRef]
  51. Brentnall, A.R.; Harkness, E.F.; Astley, S.M.; Donnelly, L.S.; Stavrinos, P.; Sampson, S.; Fox, L.; Sergeant, J.C.; Harvie, M.N.; Wilson, M.; et al. Mammographic Density Adds Accuracy to Both the Tyrer-Cuzick and Gail Breast Cancer Risk Models in a Prospective UK Screening Cohort. Breast Cancer Res. 2015, 17, 147. [Google Scholar] [CrossRef]
Figure 1. A summary of the radiomics features. Nine shape-based features were used as is, while the other features were calculated using six different image transformations in addition to the original image.
Figure 2. (a–c) Receiver operating characteristic curves for each fold at all ages (a), 50s (b), and 60s (c). (d,e) Histograms of probabilities of malignancy for the malignant (d) and normal (e) groups, predicted from radiomics features.
Figure 3. Histograms of the relative frequency densities of the malignant and normal groups for the radiomics features used most frequently for decision tree branching, with p-values for the difference between the normal and malignant groups after age adjustment. Panels (a–j) are ordered from the most to the least frequently used feature.
Table 1. Ratio of mammography equipment in the malignant and normal groups.

| Mammography manufacturer | Malignant | Normal |
|---|---|---|
| Hologic | 3403 (85.1%) | 920 (92.0%) |
| GE | 128 (3.2%) | 43 (4.3%) |
| Philips | 341 (8.5%) | 13 (1.3%) |
| SIEMENS | 126 (3.2%) | 23 (2.3%) |
| Sectra Imtec | 0 (0.0%) | 1 (0.1%) |
| No image | 2 (0.1%) | 0 (0.0%) |
| All | 4000 | 1000 |

Table 2. Reasons for exclusion of cases in the malignant and normal groups.

| Malignant Case (n = 785) | Normal Case (n = 104) |
|---|---|
| Not Hologic (n = 595) | Not Hologic (n = 80) |
| Incidence of bilateral breast cancer (n = 123) | Breast implant (n = 12) |
| No healthy side images (n = 39) | Inadequate follow-up period (n = 7) |
| Breast implant (n = 13) | Image only of one side (n = 4) |
| Large breast (n = 7) | Foreign body reaction (n = 1) |
| Surgical history (n = 4) | |
| Artificial object (n = 2) | |

Table 3. Distribution of participants based on age and left–right ratio.
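
| Age | Malignant: Right | Malignant: Left | Malignant: All | Normal: Right | Normal: Left | Normal: All |
|---|---|---|---|---|---|---|
| <40 | 0 | 1 | 1 | 0 | 0 | 0 |
| 40 to <50 | 117 | 110 | 227 | 101 | 101 | 202 |
| 50 to <60 | 603 | 612 | 1215 | 457 | 457 | 914 |
| 60 to <70 | 663 | 643 | 1306 | 312 | 312 | 624 |
| ≥70 | 240 | 227 | 467 | 26 | 26 | 52 |
| All cases | 1623 | 1592 | 3215 | 896 | 896 | 1792 |
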
Table 4. AUC and accuracy of each fold and their mean values in the 50s, 60s, and all cases.

| Fold | AUC: 50s | AUC: 60s | AUC: All Cases | Accuracy: 50s | Accuracy: 60s | Accuracy: All Cases |
|---|---|---|---|---|---|---|
| fold 1 | 0.54701 | 0.65452 | 0.58894 | 0.64478 | 0.72840 | 0.71203 |
| fold 2 | 0.60367 | 0.59813 | 0.55970 | 0.67365 | 0.75309 | 0.72384 |
| fold 3 | 0.62954 | 0.55099 | 0.60930 | 0.68862 | 0.75309 | 0.72141 |
| fold 4 | 0.56994 | 0.51286 | 0.60561 | 0.63174 | 0.68210 | 0.72384 |
| fold 5 | 0.57981 | 0.52274 | 0.64254 | 0.64072 | 0.75232 | 0.72749 |
| Average | 0.58599 | 0.56785 | 0.60122 | 0.65590 | 0.73380 | 0.72172 |

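The "Average" row of Table 4 is consistent with a plain arithmetic mean over the five folds. The following is a minimal sketch that reproduces it from the per-fold values; the dictionary layout and the helper name `fold_means` are illustrative assumptions, not taken from the study's code.

```python
from statistics import mean

# Per-fold values transcribed from Table 4 (fold 1 to fold 5).
auc = {
    "50s": [0.54701, 0.60367, 0.62954, 0.56994, 0.57981],
    "60s": [0.65452, 0.59813, 0.55099, 0.51286, 0.52274],
    "all": [0.58894, 0.55970, 0.60930, 0.60561, 0.64254],
}
accuracy = {
    "50s": [0.64478, 0.67365, 0.68862, 0.63174, 0.64072],
    "60s": [0.72840, 0.75309, 0.75309, 0.68210, 0.75232],
    "all": [0.71203, 0.72384, 0.72141, 0.72384, 0.72749],
}


def fold_means(table):
    """Arithmetic mean over folds for each age group, rounded to 5 decimals."""
    return {group: round(mean(values), 5) for group, values in table.items()}


auc_mean = fold_means(auc)
accuracy_mean = fold_means(accuracy)

for group in auc:
    print(f"{group}: AUC {auc_mean[group]:.5f}, accuracy {accuracy_mean[group]:.5f}")
```

For the "all" group this gives a mean AUC of 0.60122, the value quoted in the abstract.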
Table 5. Top 10 most frequently used radiomics features for decision tree branching in each fold.

| Rank | fold 1 | fold 2 | fold 3 | fold 4 | fold 5 |
|---|---|---|---|---|---|
| 1 | wavelet-H_firstorder_Mean | log-sigma-5-0-mm_firstorder_Skewness | wavelet-H_firstorder_Maximum | wavelet-H_glcm_Correlation | wavelet-H_glcm_Correlation |
| 2 | wavelet-H_glcm_Correlation | wavelet-H_firstorder_Maximum | wavelet-H_glcm_Correlation | wavelet-H_firstorder_Mean | original_glrlm_LongRunLowGrayLevelEmphasis |
| 3 | wavelet-H_firstorder_Maximum | original_glrlm_LongRunLowGrayLevelEmphasis | wavelet-H_firstorder_Skewness | original_glrlm_LongRunLowGrayLevelEmphasis | wavelet-H_firstorder_Maximum |
| 4 | wavelet-H_glcm_Imc2 | wavelet-H_glcm_Correlation | original_glrlm_LongRunLowGrayLevelEmphasis | log-sigma-5-0-mm_firstorder_Skewness | wavelet-H_glcm_ClusterShade |
| 5 | log-sigma-2-0-mm_firstorder_Mean | original_glcm_Idmn | wavelet-H_glcm_ClusterShade | wavelet-L_glrlm_ShortRunLowGrayLevelEmphasis | original_shape2D_MaximumDiameter |
| 6 | wavelet-L_gldm_LargeDependenceLowGrayLevelEmphasis | wavelet-H_firstorder_Mean | original_glrlm_ShortRunLowGrayLevelEmphasis | wavelet-H_firstorder_Maximum | wavelet-L_gldm_LargeDependenceLowGrayLevelEmphasis |
| 7 | log-sigma-5-0-mm_firstorder_Kurtosis | wavelet-H_glrlm_RunVariance | wavelet-L_glrlm_LongRunLowGrayLevelEmphasis | log-sigma-2-0-mm_firstorder_Mean | log-sigma-5-0-mm_firstorder_Maximum |
| 8 | wavelet-H_ngtdm_Complexity | log-sigma-5-0-mm_firstorder_Kurtosis | log-sigma-5-0-mm_firstorder_Skewness | wavelet-H_firstorder_Median | wavelet-H_glcm_Imc2 |
| 9 | wavelet-H_ngtdm_Contrast | log-sigma-4-0-mm_firstorder_Mean | log-sigma-3-0-mm_firstorder_Median | wavelet-H_ngtdm_Contrast | log-sigma-2-0-mm_firstorder_Skewness |
| 10 | log-sigma-3-0-mm_firstorder_Maximum | log-sigma-5-0-mm_firstorder_Maximum | log-sigma-2-0-mm_ngtdm_Strength | wavelet-L_glrlm_LongRunLowGrayLevelEmphasis | original_glcm_Idmn |

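The per-fold lists in Table 5 can be tallied to see which features recur across folds. The sketch below only counts the names transcribed from the table; it is an illustrative reading of the table, not the authors' analysis, and the variable names are assumptions.

```python
from collections import Counter

# Top-10 feature names per fold, transcribed from Table 5 in rank order.
top10_by_fold = {
    "fold 1": [
        "wavelet-H_firstorder_Mean", "wavelet-H_glcm_Correlation",
        "wavelet-H_firstorder_Maximum", "wavelet-H_glcm_Imc2",
        "log-sigma-2-0-mm_firstorder_Mean",
        "wavelet-L_gldm_LargeDependenceLowGrayLevelEmphasis",
        "log-sigma-5-0-mm_firstorder_Kurtosis", "wavelet-H_ngtdm_Complexity",
        "wavelet-H_ngtdm_Contrast", "log-sigma-3-0-mm_firstorder_Maximum",
    ],
    "fold 2": [
        "log-sigma-5-0-mm_firstorder_Skewness", "wavelet-H_firstorder_Maximum",
        "original_glrlm_LongRunLowGrayLevelEmphasis", "wavelet-H_glcm_Correlation",
        "original_glcm_Idmn", "wavelet-H_firstorder_Mean",
        "wavelet-H_glrlm_RunVariance", "log-sigma-5-0-mm_firstorder_Kurtosis",
        "log-sigma-4-0-mm_firstorder_Mean", "log-sigma-5-0-mm_firstorder_Maximum",
    ],
    "fold 3": [
        "wavelet-H_firstorder_Maximum", "wavelet-H_glcm_Correlation",
        "wavelet-H_firstorder_Skewness", "original_glrlm_LongRunLowGrayLevelEmphasis",
        "wavelet-H_glcm_ClusterShade", "original_glrlm_ShortRunLowGrayLevelEmphasis",
        "wavelet-L_glrlm_LongRunLowGrayLevelEmphasis",
        "log-sigma-5-0-mm_firstorder_Skewness", "log-sigma-3-0-mm_firstorder_Median",
        "log-sigma-2-0-mm_ngtdm_Strength",
    ],
    "fold 4": [
        "wavelet-H_glcm_Correlation", "wavelet-H_firstorder_Mean",
        "original_glrlm_LongRunLowGrayLevelEmphasis",
        "log-sigma-5-0-mm_firstorder_Skewness",
        "wavelet-L_glrlm_ShortRunLowGrayLevelEmphasis", "wavelet-H_firstorder_Maximum",
        "log-sigma-2-0-mm_firstorder_Mean", "wavelet-H_firstorder_Median",
        "wavelet-H_ngtdm_Contrast", "wavelet-L_glrlm_LongRunLowGrayLevelEmphasis",
    ],
    "fold 5": [
        "wavelet-H_glcm_Correlation", "original_glrlm_LongRunLowGrayLevelEmphasis",
        "wavelet-H_firstorder_Maximum", "wavelet-H_glcm_ClusterShade",
        "original_shape2D_MaximumDiameter",
        "wavelet-L_gldm_LargeDependenceLowGrayLevelEmphasis",
        "log-sigma-5-0-mm_firstorder_Maximum", "wavelet-H_glcm_Imc2",
        "log-sigma-2-0-mm_firstorder_Skewness", "original_glcm_Idmn",
    ],
}

# A feature is listed at most once per fold, so its count is the number of
# folds in which it reached the top 10.
fold_counts = Counter(name for names in top10_by_fold.values() for name in names)

in_every_fold = sorted(
    name for name, n in fold_counts.items() if n == len(top10_by_fold)
)
print(in_every_fold)
# → ['wavelet-H_firstorder_Maximum', 'wavelet-H_glcm_Correlation']
```

The two names that appear in all five folds are the same features the abstract singles out as showing distribution differences between the malignant and normal groups.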
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Suzuki, Y.; Hanaoka, S.; Tanabe, M.; Yoshikawa, T.; Seto, Y. Predicting Breast Cancer Risk Using Radiomics Features of Mammography Images. J. Pers. Med. 2023, 13, 1528. https://doi.org/10.3390/jpm13111528