Diagnostic Performance of Prototype Handheld Ultrasound According to the Fifth Edition of BI-RADS for Breast Ultrasound Compared with Automated Breast Ultrasound among Females with Positive Lumps

(1) Objective: To evaluate the diagnostic performance of prototype handheld ultrasound (HHUS) compared to automated breast ultrasound (ABUS), according to the fifth edition of BI-RADS categorization, among females with positive lumps. (2) Methods: A total of 1004 lesions in 162 participants who underwent both prototype handheld ultrasound and automated breast ultrasound were included. Two radiologists and a sonographer independently evaluated the sonographic features of each lesion according to the fifth BI-RADS edition. The kappa coefficient (κ) was calculated for each BI-RADS descriptor and final assessment category. Cross-tabulation was performed to determine whether there were differences between the ABUS and prototype HHUS results. Specificity and sensitivity were evaluated and compared using the McNemar test. (3) Results: ABUS and prototype HHUS observers found the same number of breast lesions in the 324 breasts of the 162 respondents. There was no significant difference in the mean lesion size, with a maximum mean length dimension of 0.48 ± 0.33 cm. The assessment of the lesion’s shape, orientation, margin, echo pattern, posterior acoustic features, and calcification showed good to excellent agreement between ABUS and prototype HHUS observers (κ = 0.70–1.0). There was no significant difference between ABUS and prototype HHUS in the assessment of lesions except for lesion orientation (p = 0.00). Diagnostic accuracy (99.8% and 97.7–98.9%), sensitivity (99.5% and 98.0–99.0%), specificity (99.8% and 99.6–99.8%), positive predictive value (98.1% and 90.3–96.2%), negative predictive value (90.0% and 84.4–88.7%), and areas under the curve (0.98 and 0.83–0.92; p < 0.05) were not significantly different between ABUS and prototype HHUS observers.
(4) Conclusion: According to the fifth BI-RADS edition, automated breast ultrasound is not statistically significantly different from prototype handheld ultrasound with regard to interobserver variability and diagnostic performance.


Introduction
Breast cancer is the number one cancer afflicting women worldwide, including Malaysians. In the latest report by the Malaysian National Cancer Registry, there was an increase in breast cancer incidence from 18,206 between 2007 and 2011 to 21,634 between 2012 and 2016. In addition, the report showed an increasing number of women being diagnosed with breast cancer at younger ages and at more advanced cancer stages compared to the previous report [1]. Moreover, the Malaysian Study on Cancer Survival reported a dismal 66.8% breast cancer survival rate, which is much lower than that reported for other Asian countries such as Japan (96.2%) [2], Korea (92.6%) [3], and Singapore (79.0%) [4].

HHUS
The ultrasound examination was performed by one sonographer and two radiologists using the prototype handheld Fujikin ultrasound device with a 7.5 MHz linear transducer connected to a smartphone (contrast 80, gain 70, depth 5 cm). Patients were instructed to raise their hands above the head in the supine position. Bilateral breast tissue was scanned manually in a set order to ensure that all the breast tissue was covered, and lymph nodes in the armpits were included in the scope of the examination. Suspected masses were observed in two perpendicular scanning planes (longitudinal and axial). The grayscale images were recorded at the same time. All of the images were stored offline. After prototype HHUS examinations, patients underwent ABUS examinations.
Diagnostics 2023, 13, 1065

ABUS
ABUS scan data were obtained by a sonographer using the Invenia ABUS (GE Healthcare, Milwaukee, WI, USA). Its automated 10 MHz linear array transducer applies compression across the whole breast and obtains images from different views, such as lateral, anteroposterior, and medial (covering areas of 15 cm). The workstation then reconstructs the breast and displays 3D volumes in 2-mm-thick coronal slices from the skin to the chest wall. Patients were placed in a supine position on an examination bed with the arm above the head. After reconstruction from the volume data, the images could be presented on the workstation in any plane.

Image Review
HHUS images were interpreted immediately after scanning by one sonographer and two radiologists, each with more than 5 years of breast imaging experience. Another radiologist, with more than 3 years of experience in 3D volume ultrasound, reviewed the ABUS images. ABUS assessment was performed on the axial plane as well as the reconstructed coronal plane. All of the sonographers and radiologists were blinded to the patient's identity, the results of other modalities, and the medical background. Each breast unit was assessed according to the terminology of the fifth edition of the BI-RADS lexicon for breast ultrasound [11]: shape (oval, round, or irregular), margin (circumscribed, indistinct, microlobulated, angular, or spiculated), echo pattern (hypoechoic, hyperechoic, isoechoic, anechoic, complex cystic and solid, or heterogeneous), orientation (parallel or perpendicular), posterior features (no posterior features, enhancement, shadowing, or combined pattern), calcification (no calcifications, calcifications in mass, calcifications outside of mass, or intraductal calcifications), and associated features (architectural distortion, ductal changes, skin changes, or edema). Each observer was instructed to choose the most appropriate BI-RADS descriptor for each lesion. Although the BI-RADS final assessment comprises categories 0 to 6, the observers were asked to assign lesions to a limited range of BI-RADS categories: 1 (negative), 2 (benign), 3 (probably benign), 4 (suspicious), or 5 (highly suggestive of malignancy). Measurements of lesions detected with prototype HHUS were compared with those obtained using ABUS. For all methods, the size of the lesion was defined as its maximum diameter.

Statistical Analysis
Agreement on the BI-RADS lexicon among observers was assessed using kappa statistics. To interpret the kappa coefficients (κ), we used the following definitions: less than 0.20 indicates poor agreement; 0.21–0.40 indicates fair agreement; 0.41–0.60 indicates moderate agreement; 0.61–0.80 indicates good agreement; and 0.81–1.00 indicates very good agreement [12]. Cross-tabulation was performed to determine whether there were differences between the ABUS and prototype HHUS results. Receiver operating characteristic (ROC) curve analysis was performed to compare the diagnostic performance of ABUS and prototype HHUS according to selected BI-RADS descriptors. Specificity and sensitivity were evaluated and compared with the McNemar test. Positive and negative predictive values were calculated and compared according to the method of Leisenring et al. (2000) [13]. Statistical analyses were performed using IBM SPSS Software 25.0. The level of significance was set at 0.05.
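As an illustration of the agreement statistic described above, the following minimal Python sketch computes Cohen's kappa for two observers and maps the result to the agreement bands listed in this section. The descriptor labels are hypothetical values chosen for illustration and are not study data; the function names are likewise illustrative, and the study's own analysis was run in SPSS.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


def interpret_kappa(kappa):
    """Map kappa to the agreement bands quoted in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical shape descriptors for ten lesions (illustration only).
abus = ["oval", "oval", "round", "round", "irregular",
        "oval", "round", "oval", "round", "oval"]
hhus = ["oval", "oval", "round", "oval", "irregular",
        "oval", "round", "oval", "round", "round"]

kappa = cohen_kappa(abus, hhus)
print(f"kappa = {kappa:.3f} ({interpret_kappa(kappa)})")
```

In this toy example the observers agree on 8 of 10 lesions, but because chance agreement is 0.42, kappa is lower than the raw 80% agreement would suggest.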

Participant Characteristics
A total of 162 females were included in our study, with a mean age of 31.80 ± 9.44 years and a mean age at menarche of 12.35 ± 1.24 years. The majority of respondents (93.2%) were Malay; nearly two-thirds were single (63.6%) and about one-fifth were married (19.1%). Almost half of the respondents held a bachelor's degree (46.3%), followed by a postgraduate degree (31.5%). Only 2.5% of the respondents were in the highest monthly income bracket (RM 12,501). Most respondents had a monthly income of less than RM 2500 (27.8%) or of RM 2501 to RM 5000 (37.7%). Additionally, 39 (24.2%) respondents had a family history of breast cancer. Of these 39, the largest group (48.7%) reported other family members (such as siblings/sisters, aunt, grandmother) who had breast cancer, followed by their mother (43.6%) (Table 1).
A total of 1004 masses (including cystic and solid masses) from 162 females who underwent prototype HHUS examinations followed by ABUS examinations were included in this study. In 151/162 of the respondents, the prototype HHUS detected lesions closely matching those observed by ABUS (gold standard), totaling 1004 lesions with a mean length of 0.45 ± 0.32 cm and a length range of 0.20–3.60 cm. In total, 998 lesions (99.4%) were either definitely or probably benign, whereas 4 (0.6%) were malignant as determined by biopsy. Of the 162 respondents, only 11 had normal results (Table 2). The mean distance from the nipple was 2.00 ± 1.02 cm for the right breast and 1.67 ± 0.71 cm for the left breast. There was no statistically significant difference between the ABUS and prototype HHUS findings for either breast (p > 0.05, chi-square test). For both breasts, agreement between ABUS and the prototype HHUS observers was excellent (κ = 0.95, 0.95, and 0.97 for HHUS observers 1, 2, and 3, respectively).
According to the findings in the 4th Table in Section 3.2.2, there was no statistically significant difference between the ABUS and prototype HHUS findings for either breast (p > 0.05). Regarding the shape of the lesion, the two modalities agreed on 361 round, 196 elliptical, and 2 irregular lesions for the right breast, and on 300 round, 144 elliptical, and 1 irregular lesion for the left breast. The relation between both modalities regarding shape is shown in Table 3.

2. Orientation of the Lesion
Regarding the lesion's orientation, ABUS and prototype HHUS agreed on 555 parallel and 3 perpendicular lesions for the right breast and on 443 parallel and 2 perpendicular lesions for the left breast. There was a significant difference between the ABUS and prototype HHUS findings for both breasts, with a p-value of 0.05 (Table 4).

3. Margin of the Lesion
For the margin of the lesion, both modalities agreed on 546 circumscribed, 11 microlobulated, 1 angular, and 1 spiculated lesions for the right breast, and on 435 circumscribed, 3 angular, 6 microlobulated, and 1 spiculated lesions for the left breast. There was no statistically significant difference between the ABUS and prototype HHUS results for either breast (p > 0.05) (Table 5).

4. Echo Pattern of the Lesion
The findings in Table 6 indicate that there was no statistically significant difference between the ABUS and HHUS findings for either breast (p > 0.05). With regard to the echo pattern of the lesion, ABUS and prototype HHUS agreed on 524 anechoic, 0 isoechoic, 19 hypoechoic, 10 hyperechoic, 1 complex cystic and solid, and 5 heterogeneous lesions for the right breast, and on 420 anechoic, 17 hypoechoic, 6 hyperechoic, 2 heterogeneous, and no isoechoic lesions for the left breast.

5. Posterior Acoustic Features of the Lesion
The posterior acoustic features of the lesions were compared between the ABUS and prototype HHUS findings, and there was no statistically significant difference between the two for either breast (p > 0.05). Both modalities agreed on 465 lesions with enhancement, 34 with shadowing, and none with combined or absent features for the right breast, and on 397 lesions with enhancement, 28 with shadowing, 1 with a combined feature, and 1 with absent features for the left breast (Table 7).

6. Calcification of the Lesion
According to the findings in Table 8, there was no significant difference between the results of the prototype HHUS and the ABUS for either breast (p > 0.05). According to both modalities, in the right breast 542 lesions had no calcifications, 10 had calcifications in the mass, 7 had calcifications outside of the mass, and none had intraductal calcifications. In the left breast, 433 lesions had no calcifications, 7 had calcifications in the mass, 5 had calcifications outside of the mass, and none had intraductal calcifications.

7. BI-RADS Assessment of the Lesion
According to the findings in Table 9, there was no statistically significant difference between the results from the prototype HHUS and the ABUS for either breast (p > 0.05) regarding the BI-RADS assessment of the lesion. ABUS and prototype HHUS agreed as follows: 522 lesions were BI-RADS 2/benign, 31 were BI-RADS 3/probably benign, and 6 were BI-RADS 4/suspicious for the right breast, whereas 420 were BI-RADS 2/benign, 22 were BI-RADS 3/probably benign, and 3 were BI-RADS 4/suspicious for the left breast. There were no BI-RADS 1/normal or BI-RADS 5/highly suggestive of malignancy findings for either breast.
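The per-descriptor counts quoted above can be cross-checked against the 1004 lesions with a short tally. This is only a sketch: it assumes each quoted number is the count of lesions on which ABUS and prototype HHUS agreed for that category, and the dictionary layout and variable names are illustrative.

```python
# Agreement counts per descriptor as quoted in the text: (right breast, left breast).
agreed = {
    "shape": ([361, 196, 2], [300, 144, 1]),
    "orientation": ([555, 3], [443, 2]),
    "margin": ([546, 11, 1, 1], [435, 3, 6, 1]),
    "echo pattern": ([524, 0, 19, 10, 1, 5], [420, 17, 6, 2]),
    "posterior features": ([465, 34], [397, 28, 1, 1]),
    "calcification": ([542, 10, 7], [433, 7, 5]),
    "BI-RADS assessment": ([522, 31, 6], [420, 22, 3]),
}
TOTAL_LESIONS = 1004  # total number of lesions reported in the study

totals = {}
for name, (right, left) in agreed.items():
    totals[name] = sum(right) + sum(left)
    print(f"{name:20s} right={sum(right):4d} left={sum(left):4d} "
          f"agreed on {totals[name]} of {TOTAL_LESIONS}")
```

Under this reading, shape, margin, echo pattern, calcification, and BI-RADS assessment each account for all 1004 lesions, while the quoted orientation and posterior-feature counts sum to 1003 and 926, respectively.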

Table 6. HHUS observers compared with ABUS (gold standard) in terms of echo pattern.
Table 9. HHUS observers compared with ABUS (gold standard) in terms of BI-RADS assessment.
Table 10 summarizes the interobserver agreement for BI-RADS descriptors and the final assessment category among ABUS and prototype HHUS observers; the overall agreement for all descriptors and the final assessment category was good to excellent. There were no significant differences in κ values of BI-RADS lexicons between ABUS and prototype HHUS except for orientation (Table 1, the 2nd Figure in Section 3.3).

Location of the Lesion
According to the findings in Table 11, there was no significant difference between ABUS and prototype HHUS findings for either breast (p > 0.05). Agreement was highest for location, with excellent agreement between ABUS and the different prototype HHUS observers.

Discussion
The main goal of this study was to assess the diagnostic accuracy of automated breast ultrasound compared to handheld ultrasound in lesion detection, description, and interpretation. The fifth edition of BI-RADS was used to describe the morphological characteristics of each lesion, and the final BI-RADS classification for ABUS and HHUS was determined. In evaluating the results of a screening program, we observed that ABUS and prototype HHUS both detected a comparable number of breast lesions. Previous research showed that ABUS findings were significantly more accurate in measuring breast lesions' sizes compared to HHUS results. In 151/162 of the respondents, ABUS (gold standard) and prototype HHUS detected 1004 lesions with a maximum mean length of 0.48 ± 0.33 cm and a length range of 0.20–3.60 cm. The lesions reported by Chen et al. (2021) were larger than those in our study; the mean lesion diameters detected with ABUS and HHUS were 23.2 ± 9.5 mm and 22.3 ± 6.3 mm, respectively [14]. The mean lesion size did not differ significantly between ABUS and prototype HHUS observers. The size of the masses may have an impact on the inter-observer agreement. In our study, the overall agreement on size was at the same level regardless of mass size. However, Shin et al. (2011) showed substantial inter-observer agreement for masses greater than or equal to 7 cm (k = 0.750) and fair inter-observer agreement for masses less than or equal to 7 cm (k = 0.350) [15]. In our investigation, the two imaging modalities showed that the benign lesions ranged in size from 0.2 to 2.5 cm; most benign lesions were cysts, followed by fibrocystic changes and fibroadenomas, whereas the malignant lesions ranged in size from 0.6 to 3.6 cm. In the study by Chen et al. (2013), the mean sizes of benign and malignant lesions on ABUS and HHUS were greater (1.97 cm and 1.76 cm, respectively) [16]. Chang et al. (2010) published a different study with smaller malignant and benign lesions, with mean sizes of 1.55 cm and 1.35 cm, respectively [17]. In a systematic review by Ibraheem et al. (2022), the identified malignancies had a mean percentage of 94% (81–100%) in comparison to the non-cancers in all studies, and the tumors found had a mean size of 2.1 cm. The data in the literature indicate that ABUS and HHUS perform similarly in differentiating between malignant and benign breast tumors [10].
Our study's findings revealed that the use of BI-RADS descriptors and final assessment categories was accompanied by a comparatively high level of interobserver agreement. For the evaluation of lesion shape, orientation, margin, echo pattern, posterior acoustic features, and calcification, there was good to excellent agreement between ABUS and HHUS observers. There were no significant differences between the three observers' diagnostic performances, which were all good to excellent. However, HHUS observer 3 showed only a moderate level of agreement for lesion orientation in the left breast. Shape, the outline of a mass, is categorized as round, elliptical, or irregular. It is one of the most significant morphological features for differentiating benign masses from malignant tumors [13,18]. In particular, irregular shape is independently correlated with malignant breast tumors on ABUS [11]. Previous ABUS analysis showed that irregular shape had a 52.1% positive predictive value (PPV), 95.0% sensitivity, and 66.2% accuracy [19,20]. In our study, the overall agreement regarding the shape of the lesion was excellent between ABUS and the different HHUS observers, and the interobserver agreement we observed was higher than that reported previously [6,11,21]. Shin et al. (2011) obtained substantial agreement on orientation for both small masses (<0.7 cm) (k = 0.68) and large masses (>0.7 cm) (k = 0.72) [15]. On the other hand, there was only moderate agreement on orientation for small masses (1.0 cm) (k = 0.530) and substantial agreement for large masses (>1.0 cm) (k = 0.634). In our investigation, the interobserver agreement between ABUS and the different HHUS observers ranged from moderate to good. This difference might be the result of the ultrasound probe applying different pressures to the mass, which could produce deformations and affect the length-to-width diameter ratio [22]. Liu et al. (2022) evaluated a series of 331 histopathologically verified lesions and observed perfect interobserver agreement for circumscribed margins between ABUS and HHUS, but only fair to moderate agreement for indistinct (k = 0.38), angular (k = 0.37), microlobulated (k = 0.46), and spiculated (k = 0.31) margins [21]. Consistent with our findings, Choi et al. (2018) observed good interobserver agreement for margin [11].
Vourtsis and Kachulis (2018) evaluated 1665 women and reported interobserver agreement between ABUS and HHUS of k = 0.85–0.95 for posterior features, which is lower than in our findings [9]. Liu et al. (2022) reported agreement of k = 0.779, which is consistent with our findings [21]. The findings of our investigation showed good to excellent agreement between ABUS and the different prototype HHUS observers. On the other hand, our results showed higher interobserver agreement than that observed by Liu et al. (2022) and Choi et al. (2018) [11,21]. Due to the small number of lesions with outside calcification or intraductal calcification, Liu et al. (2022) used dichotomized categories (presence or absence of calcifications) instead of the descriptors from the fifth edition of BI-RADS. Their results showed substantial agreement for this dichotomized assessment of calcification (k = 0.604), which was higher than that of Choi et al. (2018), who also used dichotomized categories; the different study samples may account for this discrepancy [11,21].
Our results do not agree well with published data from prior studies in terms of sensitivity, specificity, diagnostic accuracy, positive predictive value (PPV), and negative predictive value (NPV). Chen [11]. Compared with the literature review conducted by Ibraheem et al. (2022), ABUS in our study had considerably greater diagnostic accuracy, positive predictive value, and negative predictive value. The radiologists' performance in terms of detection, sensitivity, and specificity did not significantly differ between the two modalities (p > 0.05) [10].
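For reference, the measures compared in this paragraph follow directly from a 2×2 table against the reference standard, and paired sensitivities or specificities can be compared with an exact McNemar test on the discordant pairs. The sketch below uses hypothetical counts that are not the study's data; the function names are illustrative, and the exact binomial form of the McNemar test is shown as one common variant rather than the specific procedure run in SPSS.

```python
from math import comb


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures against a reference standard."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Under H0 the discordant pairs split 50/50: binomial tail, doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical counts (illustration only, not the study's data).
metrics = diagnostic_metrics(tp=18, fp=4, fn=2, tn=176)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Hypothetical discordant pairs: b = modality A correct and B wrong, c = the reverse.
p_value = mcnemar_exact(b=1, c=5)
print(f"McNemar exact p = {p_value:.4f}")
```

Only the discordant pairs enter the McNemar test, which is why two modalities that disagree on very few lesions will rarely show a significant difference in sensitivity or specificity.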
Prior studies utilizing the fifth edition of BI-RADS observed an AUC of 0.91 to 0.93 for ABUS and 0.83 to 0.91 for HHUS, with no significant difference between ABUS and HHUS [21,28,29]. Our study yielded an AUC of 0.88 to 1.0, which is comparable to the range of the other studies [30–34]. In our analysis, ABUS and HHUS did not miss any malignant lesions. More benign lesions were found with ABUS than with HHUS. Focal fat lobules were most likely the cause of a few of the solid nodules found on ABUS. Although we did not compare the number of cysts in this investigation, ABUS appeared to identify more cysts than HHUS. Cysts typically appeared more hypoechoic on ABUS, and some of them resembled complicated cysts.
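To make the AUC values above concrete, the area under the ROC curve for an ordinal score such as the BI-RADS category can be computed as the probability that a malignant lesion receives a higher category than a benign one, with ties counted as one half. The following is a minimal sketch under that rank-based definition; the category lists are hypothetical and are not the study's data, and the function name is illustrative.

```python
def auc_from_scores(scores_malignant, scores_benign):
    """AUC as P(malignant score > benign score), ties counted as one half."""
    if not scores_malignant or not scores_benign:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


# Hypothetical BI-RADS categories (illustration only, not the study's data).
malignant = [4, 4, 5, 3]
benign = [2, 2, 3, 2, 3, 4, 2, 2]

auc = auc_from_scores(malignant, benign)
print(f"AUC = {auc:.3f}")
```

This pairwise count equals the trapezoidal area under the empirical ROC curve, so it can be read the same way as the AUC ranges quoted above.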
According to the earlier study by Schmachtenberg et al. (2017), agreement regarding lesion localization was 0.94 for ABUS and MRI and 0.91 for HHUS and MRI. The three observers' estimates of agreement about lesion localization (same quadrant) ranged from 0.92 to 0.96 [35]. Additionally, Chae et al. (2013) demonstrated high reproducibility for lesion site and distance from the nipple, which is equivalent to our result [36]. Contrary to our findings, a number of studies that assessed the inter-observer reliability of location, distance from the nipple and from the skin, and lesion size reported lower inter-observer reliability [6,11,37–39]. Wang et al. (2012) found similar results to ours, with the mean distance from the nipple measuring 3.01 ± 1.54 cm among 30 lesions identified by ABUS and 3.36 ± 1.72 cm among 27 lesions identified by HHUS [26].
The present study has several strengths. The accuracy of the prototype HHUS will help doctors and other medical professionals use it in critical situations where time is of the essence (emergency room, intensive care unit) or where the setting favors HHUS devices (remote location, doctor's office). This is one of the study's main strengths. By encouraging and supporting early breast cancer screening, this study will therefore assist Malaysian women. However, some limitations must be addressed. First, the design for phase one is cross-sectional; there are no follow-up data for women who are determined to be non-cases. Due to verification bias, this might cause an overestimation of the test sensitivity. However, women with potentially benign findings need to have them confirmed by MRI or biopsy; as a result, the likelihood of false negatives should be relatively low. Second, lesions might be missed on ABUS if they have a peripheral location. This technical drawback reduces the diagnostic performance of the method compared with HHUS, especially in larger breasts, and could represent a cause for the misdiagnosis of cancer. The main limitation of ABUS is its inability to assess the axilla and the area behind the nipple. Furthermore, the results may not apply to all Malaysian women because of the limited sample size, the majority of respondents being of Malay ethnicity, and the single-site design. Therefore, further research is required to reproduce the findings using a bigger sample size. The price of ABUS and the fact that Malaysia does not offer ABUS screening unless a referral is received are additional factors. As a result, fewer women may have had access to screening due to systemic barriers. Several recommendations follow from the results of the current study and the limitations noted. First, we recommend the use of HHUS as an adjunct to mammography, and MRI for dense breasts. Likewise, we recommend the use of HHUS as an adjunct to ABUS in order to cover the axilla and the area behind the nipple. For future studies, the use of HHUS as a breast self-examination device by women is advised, since that was the main objective of our study, which we could not achieve because of the pandemic.

Conclusions
This study revealed that the baseline screening uptake among Malaysian females accurately represents the current screening for prototype HHUS and ABUS. We discovered that a similar number of breast lesions were diagnosed by ABUS compared to prototype HHUS out of the overall sample. According to these findings, prototype HHUS is not statistically significantly different from ABUS in terms of interobserver variability or diagnostic performance.

Informed Consent Statement:
The recruitment of the participants was on a voluntary basis, and all of the participants completed the informed consent form. Written informed consent to publish any materials (data, images, and recordings) was obtained from the participants before publishing this paper.

Data Availability Statement:
The data sets generated during the current study are not available publicly due to domestic regulation of the institution. However, they are available upon request to the corresponding author.