Regression Analysis between the Different Breast Dose Quantities Reported in Digital Mammography and Patient Age, Breast Thickness, and Acquisition Parameters

Breast cancer is the leading cause of cancer death among women worldwide. Screening mammography is considered the primary imaging modality for the early detection of breast cancer. The radiation dose from mammography increases the patients’ risk of radiation-induced cancer. The mean glandular dose (MGD), or the average glandular dose (AGD), provides an estimate of the absorbed dose of radiation by the glandular tissues of a breast. In this paper, MGD is estimated for the craniocaudal (CC) and mediolateral–oblique (MLO) views using entrance skin dose (ESD), X-ray spectrum information, patient age, breast glandularity, and breast thickness. Moreover, a regression analysis is performed to evaluate the impact of mammography acquisition parameters, age, and breast thickness on the estimated MGD and other machine-produced dose quantities, namely, ESD and organ dose (OD). Furthermore, a correlation study is conducted to evaluate the correlation between the ESD and OD, and the estimated MGD per image view. This retrospective study was applied to a dataset of 2035 mammograms corresponding to a cohort of 486 subjects with an age range of 28–86 years who underwent screening mammography examinations. Linear regression metrics were calculated to evaluate the strength of the correlations. The mean (and range) MGD for the CC view was 0.832 (0.110–3.491) mGy and for the MLO view was 0.995 (0.256–2.949) mGy. All the mammography dose quantities strongly correlated with tube exposure (mAs): ESD (R² = 0.938 for the CC view and R² = 0.945 for the MLO view), OD (R² = 0.969 for the CC view and R² = 0.983 for the MLO view), and MGD (R² = 0.980 for the CC view and R² = 0.972 for the MLO view). Breast thickness showed a better correlation with all the mammography dose quantities than patient age, which showed a poor correlation.
Moreover, a strong correlation was found between the calculated MGD and both the ESD (R² = 0.929 for the CC view and R² = 0.914 for the MLO view) and OD (R² = 0.971 for the CC view and R² = 0.972 for the MLO view). Furthermore, it was found that the MLO scan views yield a slightly higher dose compared to CC scan views. It was also found that the glandular absorbed dose is more dependent on glandularity than size. Despite being more reflective of the dose absorbed by the glandular tissue than OD and ESD, MGD is considered labor-intensive and time-consuming to estimate.


Introduction
Breast cancer is one of the leading causes of mortality among women worldwide [1]. According to the GLOBOCAN 2020 estimates of cancer incidence and mortality produced by the International Agency for Research on Cancer, it has surpassed lung cancer as the most commonly diagnosed cancer worldwide [2]. It is also the leading cause of cancer death among women worldwide. Breast cancer is the most common cancer among female United Arab Emirates (UAE) citizens (32.16%) and non-UAE citizens (41.41%) [3]. To date, there is no definitively known cause of breast cancer and no effective method to prevent it.
Thus far, mammography is considered the most effective means of early detection of breast cancer [4]. However, its growing use also raises the potential risk of radiation-induced carcinogenesis in some high-risk patients, which may lead to limits on the number of screening mammograms or deter women from undergoing breast screening [5]. Mammography images are obtained using a low-energy X-ray beam [6]. Repeated exposure to such low-dose radiation can lead to adverse outcomes known as stochastic radiation effects [1]. The glandular tissue of the breast is considered among the most radiosensitive tissues in the body [7]. Optimizing equipment performance and managing the radiation dose of each mammography exam are therefore essential to uphold the "as low as reasonably achievable" (ALARA) radiation dose principle.
Mammography mean glandular dose (MGD), which is used synonymously with average glandular dose (AGD), has been widely accepted as the most appropriate measure for predicting the risk of radiation-induced cancer [8]. Most advanced mammography units provide a means to estimate the MGD or AGD for each patient, although the estimation algorithm differs between manufacturers. The key inputs to such algorithms are the mammography exposure, known as the entrance skin dose (ESD); the half-value layer (HVL) of the X-ray spectrum, which depends on the target and filter combination; the breast thickness; and the breast density.
In addition to the ESD, mammography machines report a second dose quantity, the organ dose (OD). However, MGD and OD are not the same: a highly significant difference between the MGD and OD was reported in [9].
Thus, it is important to have a good understanding and an accurate estimation of the parameters that can affect the glandular absorbed dose, as it provides an indication of the radiation risk to the breast during exposure. The objectives of this paper are as follows:
• Estimate the MGD per view, breast thickness, and age group for every subject enrolled in this study;
• Evaluate the impacts of mammography acquisition parameters, age, and breast thickness on the estimated MGD and other machine-produced dose quantities using a multilinear regression model;
• Conduct a correlation study between the ESD and OD, and the estimated MGD for each image view;
• Compare the findings of this study with other studies conducted on samples selected from different demographic regions.

Mammography Data
The proposed study was applied to screening mammograms acquired from the UAE population. All mammography screening examinations were performed using a Siemens Mammomat Inspiration (Siemens Medical Solutions, Forchheim, Germany). A dataset of 2035 mammograms from a cohort of 486 subjects with an age range of 28-86 years who underwent screening mammography examinations was collected from the University Hospital of Sharjah, UAE, and retrospectively analyzed. This study was approved by the Institutional Review Board (IRB) committee of our university and by the hospital ethics committees. Mammography acquisition parameters, including image view, number of views, target and filter combination, compressed breast thickness, ESD, organ dose, and other acquisition parameters (kVp and mAs), were extracted for all patients from the digital imaging and communications in medicine (DICOM) headers and used in this study. The HVL for our tungsten (W)/rhodium (Rh) spectra was 0.45 mm Al. For each subject and view, a dedicated conversion factor g was obtained from [10,11] based on the HVL (here, 0.45 mm Al) and the breast thickness for that view.

Estimating Mean Glandular Dose (MGD)
The MGD was estimated per view, breast thickness, and age group for every subject enrolled in this study using Dance et al.'s method [10,11]. The ESD, measured in mGy, for each mammography view, namely craniocaudal (CC), mediolateral oblique (MLO), mediolateral (ML), and lateromedial (LM), of every subject involved in this study was input into the mathematical model presented in Equation (1), as discussed in [10]:

$$\mathrm{MGD} = K \, g \, c \, s \qquad (1)$$

where
- K is the mammography machine output (calibration) measured in mGy. It is also known as the entrance dose at the surface of the breast. This quantity was provided by the manufacturers for each mammography scan and could also be obtained from the DICOM header.
- g is a conversion factor describing the fraction of K that is absorbed by the glandular tissue in the breast. g depends on breast thickness and the HVL.
- c is a correction factor for breast composition that accounts for glandularity differing from 50%, over the range 0-100%. Dance et al. [10] provided a reference table of c factors for various HVLs, breast thicknesses of 2-11 cm, and glandularities, from which one can extrapolate to the percentage of glandularity of each individual.
- s is a correction factor for the X-ray spectrum, which changes with the target and filter combination. This correction factor is independent of the HVL and can be found in [10] in a simple reference table that includes various target and filter combinations.
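As a minimal sketch of how Equation (1) can be evaluated for a single view, the snippet below multiplies the entrance dose by the three factors. The function name and all numeric inputs are hypothetical placeholders chosen for illustration; they are not values from the Dance et al. reference tables [10,11] or from the dataset used in this study.

```python
def mean_glandular_dose(k_mgy, g, c, s):
    """Return the MGD in mGy as the product K * g * c * s (Equation (1))."""
    for name, value in (("k_mgy", k_mgy), ("g", g), ("c", c), ("s", s)):
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
    return k_mgy * g * c * s


# Hypothetical inputs for one view (placeholders, NOT tabulated values):
# K = 4.0 mGy entrance dose, g = 0.25, c = 1.0 (50% glandularity), s = 1.0.
example_mgd = mean_glandular_dose(k_mgy=4.0, g=0.25, c=1.0, s=1.0)
print(f"MGD = {example_mgd:.3f} mGy")
```

In practice, g, c, and s would be looked up for each view from the reference tables, given the HVL, the compressed breast thickness, the glandularity, and the target and filter combination.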

Regression Analysis
Multilinear regression models were built to quantify the impacts of mammography acquisition parameters, age, and breast thickness (the input parameters) on the mammographic dose quantities (the output parameters). A linear equation representing the regression model assigns a scale factor (called the coefficient) to each input parameter, with one additional coefficient (called the intercept) added to the equation. The input parameters considered in this work were the age of the patient, the compressed breast thickness, and the acquisition parameters, namely, X-ray tube voltage (kVp), exposure time (ms), exposure (mAs), and X-ray tube current (mA). The output parameters considered in this work were the entrance skin dose (ESD) and organ dose (OD), which were reported after each mammography scan, in addition to the estimated MGD, as described in Equation (1). A separate regression model was estimated for each output parameter. The multilinear regression model can be represented as a function of the input and output parameters, as follows:

$$\hat{Y} = \beta_0 + \beta_1 I_1 + \beta_2 I_2 + \cdots + \beta_n I_n \qquad (2)$$

where $\hat{Y}$ is the output parameter (dose quantity) estimated by the regression model; $I_1, I_2, \ldots, I_n$ are the input parameters; and $\beta_0, \beta_1, \ldots, \beta_n$ are the coefficients (scale factors). The values of the coefficients were estimated using the least-squares method.
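The least-squares estimation of the coefficients in the multilinear model can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in this study: it solves the normal equations in plain Python to stay dependency-free (a numerical library would normally be preferable), it uses only two input parameters for brevity, and the data are hypothetical, noise-free values generated from known coefficients.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: inputs may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_multilinear(inputs, outputs):
    """Return [b0, b1, ..., bn] minimising the sum of squared residuals."""
    design = [[1.0] + list(row) for row in inputs]  # leading 1 = intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, outputs)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, row):
    """Evaluate the fitted model for one row of input parameters."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Hypothetical rows of (compressed thickness in cm, exposure in mAs).
inputs = [(3.0, 60.0), (4.0, 80.0), (5.0, 90.0),
          (6.0, 130.0), (4.5, 100.0), (5.5, 70.0)]
true_coefficients = [0.1, 0.05, 0.008]  # assumed values, for illustration
doses = [predict(true_coefficients, row) for row in inputs]

fitted = fit_multilinear(inputs, doses)
print([round(b, 6) for b in fitted])
```

Because the synthetic doses contain no noise, the fit should recover the assumed coefficients up to rounding error; with real data, the residuals would be non-zero and are assessed with the regression metrics.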
Moreover, the individual correlations between each of the input parameters and each output parameter (dose quantity) were assessed by fitting each pair of these input and output parameters to a linear regression model, as shown in the equation below:

$$\hat{Y} = \beta_0 + \beta_1 I_j \qquad (3)$$

where $I_j$ is one input parameter selected from the set of input parameters $I_1, I_2, \ldots, I_n$.
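The pairwise fitting described above can be sketched with the closed-form least-squares solution for a single input parameter. The parameter names and numeric values below are hypothetical and serve only to illustrate the procedure; the dose values were generated as 0.02 + 0.008 × mAs so that the expected coefficients are known.

```python
def fit_simple_linear(x, y):
    """Return (b0, b1) of the least-squares line y ~ b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("input parameter is constant; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical values for illustration only; not data from this study.
inputs = {
    "exposure_mAs": [60.0, 80.0, 90.0, 130.0, 100.0, 70.0],
    "thickness_cm": [3.0, 4.0, 5.0, 6.0, 4.5, 5.5],
}
dose_mgy = [0.50, 0.66, 0.74, 1.06, 0.82, 0.58]

# One separate fit per input parameter I_j against the same dose quantity.
pairwise_fits = {
    name: fit_simple_linear(values, dose_mgy) for name, values in inputs.items()
}
for name, (intercept, slope) in pairwise_fits.items():
    print(f"{name}: Y_hat = {intercept:.4f} + {slope:.4f} * I_j")
```

Fitting each pair separately keeps the contribution of every input parameter easy to read, at the cost of ignoring interactions between parameters that the multilinear model captures.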
The correlations between different mammography dose quantities, namely, machine-provided OD and ESD and calculated MGD, were also investigated. To achieve this purpose, the individual correlations between these dose quantities were assessed by fitting each pair of them to a regression model similar to the one given in Equation (3).

Regression Evaluation
The regression models were evaluated using several criteria: the coefficient of determination (R²), mean square error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). The coefficient of determination, or R-squared (R²), is a metric for assessing the goodness of fit of linear regression models. This metric measures the strength of the relationship between a fitted model and an output parameter on a 0-1 scale, where 0 represents no relationship and 1 represents the strongest relationship (a perfect fit). R² can be calculated as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( Y_i - \hat{Y}_i \right)^2}{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^2}$$

where $\bar{Y}$ is the mean of the $Y_i$ values. MSE, MAE, and MAPE are calculated as follows:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \hat{Y}_i \right)^2$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| Y_i - \hat{Y}_i \right|$$

$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{Y_i - \hat{Y}_i}{Y_i} \right|$$

where N is the number of data samples; $Y_i$ is the output parameter of sample i, either reported after each mammography scan (ESD or OD) or estimated (MGD, as in Equation (1)); and $\hat{Y}_i$ is the corresponding value predicted by the regression model. The MSE, MAE, and MAPE metrics were all applied by comparing the output parameter predicted by the regression model with the reported or estimated parameter Y.
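The four evaluation criteria can be computed directly from their definitions, as in the following sketch. The function name, the dictionary keys, and the sample values are assumptions made for illustration; the values are not results from this study.

```python
def regression_metrics(y_true, y_pred):
    """Return R^2, MSE, MAE, and MAPE (in percent) for paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    if any(y == 0 for y in y_true):
        raise ValueError("MAPE is undefined when a reference value is zero")
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [y - p for y, p in zip(y_true, y_pred)]
    ss_res = sum(r ** 2 for r in residuals)          # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": ss_res / n,
        "mae": sum(abs(r) for r in residuals) / n,
        "mape": 100.0 / n * sum(abs(r / y) for r, y in zip(residuals, y_true)),
    }


# Hypothetical reported doses (mGy) and model predictions, for illustration.
reported = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(reported, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

R² is unitless, whereas MSE carries squared dose units and MAE carries dose units, so the error metrics are only comparable between models of the same dose quantity.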

Mammography Data and MGD Estimation
In this section, the mammography data are described, along with the machine-reported and calculated dose quantities. Table 1 shows a descriptive summary of the different dose quantities, namely, ESD, OD, and MGD, for each subject enrolled in this study based on image view, with the dominant views being CC and MLO. Table 2 reports ESD, OD, and calculated MGD based on breast thickness. Here, we adapted the classification reported in [12]: fatty for a breast thickness of 5-7 cm, medium for a breast thickness of 3-5 cm, and dense for a breast thickness of 2-3 cm. Table 3 reports ESD, OD, and calculated MGD based on age ranges.

Results of the Regression Analysis
This section evaluates the regression models built from the mammography data to quantify the impacts of mammography acquisition parameters and patient information, such as age and breast thickness, on mammographic dose quantities. Table 4 shows the results of evaluating the regression models between the acquisition parameters and the dose quantities, namely, ESD, OD, and MGD, for each view using the regression evaluation criteria introduced in Section 2.4. Moreover, the correlations between MGD and both ESD and OD are investigated in this work. Table 5 shows the regression metrics evaluation between MGD and both ESD and OD for each view.

Discussion
MGD has been widely accepted as the most appropriate measure for estimating the radiation-induced risk of breast cancer [8]. It is important to mention that MGD is a calculated dose index, meaning that it is not automatically reported by commercial mammography machines. Hence, it is only logical to assume that MGD accuracy depends on the method used for calculation. Salomon et al. [13] compared three different methods: Wu et al. [14], Dance et al. [10,11], and Volpara [15]. While Volpara provided a tissue composition analysis allowing for a more accurate glandularity estimation, the Wu and Dance methods both used a reference table and customized conversion factors. Nevertheless, recent studies, including [9,12,16], still report MGD using the Dance method, as was done in this study.
Mammography exams vary in the number of views performed and in the dose per view, resulting in wide variation in the MGD absorbed among individuals. The American College of Radiology Imaging Network (ACRIN) Digital Mammographic Imaging Screening Trial (DMIST) reported an average MGD of 3.7 mGy from two-view digital mammography (1.86 mGy per view) [17]. Herein, Table 1 shows diverse MGDs across the enrolled cohort, which represents a sample from the UAE population. The mean ± SD (and range) MGD for the CC view was 0.832 ± 0.296 (0.11-3.49) mGy and for the MLO view was 0.995 ± 0.350 (0.26-2.95) mGy. Although the difference between the mean MGDs of the CC and MLO views was small, the MLO view yielded the higher dose. Such findings agree with other studies in the field [12,16,18,19]. Jamal et al. [18] reported mean MGDs in a sample of Malaysian subjects of 1.54 mGy and 1.82 mGy for the CC and MLO views, respectively. Chevalier et al. [19] reported a mean (and range) MGD of 1.80 (0.4-6.9) mGy for the CC view and 1.95 (0.6-8.1) mGy for the MLO view. Similarly, Al Naemi et al. [12] reported mean (and range) MGDs in Qatari subjects of 1.90 (0.8-6.16) mGy for the CC view and 1.97 (0.7-6.13) mGy for the MLO view. Josephine and colleagues [16] reported an average MGD of 0.74 mGy, with ranges of 0.33-6.41 mGy and 0.28-8.59 mGy for the CC and MLO views, respectively, in a cohort sample from Nigeria.
Compressed breast thickness appeared to contribute to MGD. Table 4 showed a moderate positive correlation between the average MGD and the compressed breast thickness (R² = 0.315 and 0.351 for the CC and MLO views, respectively). The positive correlation between MGD and compressed breast thickness demonstrated in the present study has also been shown in previous studies. Riabi et al. [20] showed a significant correlation between MGD and compressed breast thickness, with a correlation coefficient of 0.692. A comparison of the MGDs of thin breasts (<5 cm) and thick breasts (>5 cm) showed a significant difference (p < 0.01). Du et al. [21] also reported a positive correlation between MGD and compressed breast thickness. Chevalier et al. [19] analyzed the doses of 5034 patients who had undergone full-field digital mammography and concluded that differences between the corresponding MGDs among all the groups were lowest for thin breasts, with a mean ± standard deviation (SD) breast size of 34 ± 8 mm, compared to breast sizes of 61 ± 11 mm. This observation is in line with our findings presented in Table 2. Fatty breasts measuring 5-7 cm yielded a higher mean (and range) MGD than dense breasts measuring 2-3 cm, indicating that women with larger breasts tend to receive higher doses per view and may require more than four views for a complete examination [5]. In contrast, Al Naemi et al. [12] reported the reverse observation, with the mean (and range) MGD decreasing in fatty breasts measuring 5-7 cm and increasing in dense breasts measuring 2-3 cm.
Age also seemed to be a factor that could impact MGD, as it is associated with glandularity and density. Herein, we observed a lower MGD for subjects 64 years and older. Similar findings were reported by Pwamang et al. [1], where patients in the age group of 40-49 years with a compressed breast thickness of 32 mm had an MGD of 1.55 mGy, whereas patients in the age group of 50-64 years with a compressed breast thickness of 60 mm had an MGD of 2.51 mGy. Chevalier et al. [19] reported a mean ± SD MGD of 1.85 ± 0.01 for subjects younger than 50 years and 1.90 ± 0.01 for those above 50 years. On the other hand, Baek et al.'s [22] correlation and regression study concluded that age was negatively associated with MGD (p < 0.05).
Image acquisition parameters such as kVp and mAs also seemed to impact the resulting MGDs, with the tube exposure (mAs) showing a substantially greater impact than the tube voltage (kVp). Strong correlations (R² = 0.980 and R² = 0.972 for CC and MLO, respectively) were seen between the MGDs and the mAs, whereas weak correlations (R² = 0.292 and 0.304 for CC and MLO, respectively) were seen between the MGDs and the applied kVp. These results are in agreement with the results reported by Riabi et al. [20], where a statistically significant correlation (p < 0.01) was seen between the MGD and each of the applied kVp and mAs values, with a correlation coefficient of 0.829 between MGD and kVp and of 0.890 between MGD and mAs.
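When comparing these figures, it may help to note that a correlation coefficient and a coefficient of determination are on different scales. The sketch below converts the coefficients cited from [20] to the R² scale under the assumption, not confirmed by the source, that they are Pearson correlation coefficients of pairwise linear relationships, in which case R² equals the square of the coefficient. If [20] reports a different statistic, the conversion does not apply.

```python
def r_to_r_squared(r):
    """Square a Pearson correlation coefficient to obtain R^2.

    Valid only for a simple (single-input) linear regression.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError(f"correlation coefficient must lie in [-1, 1], got {r}")
    return r * r


# Coefficients cited from [20]; treating them as Pearson r is an assumption.
cited = {"MGD vs kVp": 0.829, "MGD vs mAs": 0.890}
converted = {label: r_to_r_squared(r) for label, r in cited.items()}
for label, r2 in converted.items():
    print(f"{label}: r = {cited[label]:.3f}, R^2 = {r2:.3f}")
```

On this common scale, the converted values can be set beside the R² values obtained in the present study for the same pairs of parameters.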
Baek et al. [22] conducted a multivariate linear regression analysis to investigate the impacts of mammographic composition and breast size on the glandular dose during full-field digital mammography (FFDM) in Korean women. They found that the mAs, kVp, compressed breast thickness, and mammographic breast size were positively associated with MGD (p < 0.05). Patients with radiation dose values above the diagnostic reference value had large breasts of dense composition. Table 5 showed that there was a strong correlation between the calculated MGD and the ESD and OD values reported by the mammogram unit: ESD (R² = 0.929 for the CC view and R² = 0.914 for the MLO view) and OD (R² = 0.971 for the CC view and R² = 0.972 for the MLO view). This also showed that OD had a stronger correlation with MGD than ESD did, despite the highly significant difference between MGD and OD that was reported by Suleiman et al. [9].
To our knowledge, few studies have investigated the correlation between OD and MGD, with most studies focusing only on ESD and MGD [16,21]. Using reference tables to estimate MGD is time-consuming and incorporates uncertainties [13]. Breast density seems to be a significant contributing parameter, directly influencing breast cancer detection accuracy [23]. Denser breasts indirectly impact the MGD, as the noise level associated with dense breasts is higher [24]. Some limitations were encountered in the present study. Primarily, mammography image quality evaluations were not part of this study; we assumed that all the mammography images obtained were diagnostically adequate. Further, the data obtained were restricted to a single healthcare institution and a single mammography machine. A larger sample size is needed in order to establish an average MGD baseline for the UAE population. Moreover, breast density evaluations were not part of the study. In addition, the MGDs for the enrolled subjects were estimated based on Dance et al.'s [10,11] bulk reference tables.

Conclusions
This study provides an estimation of the MGD and a better understanding of the parameters that could affect the glandular absorbed dose. The MGD was estimated per view, breast thickness, and age group for every subject enrolled in this study. Moreover, the impacts of mammography acquisition parameters, age, and breast thickness on the estimated MGDs and other machine-produced dose quantities were evaluated using a multilinear regression model. Furthermore, the correlation between the machine-reported ESD and OD and the estimated MGD was studied for each image view. This retrospective study was conducted on mammography scan data for subjects who underwent screening mammography examinations. The findings of this study were compared to the findings of other studies conducted on samples selected from different demographic regions. It was found that the mean MGD for the MLO view was slightly higher than that of the CC view. The findings of the regression analysis showed that all the mammography dose quantities, namely ESD, OD, and MGD, were strongly correlated with tube exposure (mAs). However, patient age showed poor correlation with all the mammography dose quantities. Breast thickness showed better correlation with all the mammography dose quantities than patient age did. Moreover, it was found that there was a strong correlation between the calculated MGD and the ESD and OD values reported by the mammogram unit, with OD showing the stronger correlation with MGD.