Article

Prognostic Models Using Machine Learning Algorithms and Treatment Outcomes of Occult Breast Cancer Patients

1 Department of Oncology, The Second Affiliated Hospital of Xi’an Jiaotong University, 157 West Fifth Street, Xi’an 710004, China
2 Department of Otolaryngology, The Second Affiliated Hospital of Xi’an Jiaotong University, 157 West Fifth Street, Xi’an 710004, China
3 Department of Radiation Oncology, The Second Affiliated Hospital of Xi’an Jiaotong University, 157 West Fifth Street, Xi’an 710004, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
J. Clin. Med. 2023, 12(9), 3097; https://doi.org/10.3390/jcm12093097
Submission received: 11 February 2023 / Revised: 5 March 2023 / Accepted: 20 April 2023 / Published: 24 April 2023
(This article belongs to the Section Oncology)

Abstract

Background: Occult breast cancer (OBC) is an uncommon malignant tumor, and its prognosis and treatment remain controversial. No accurate clinical prognostic model for OBC currently exists, and the treatment outcomes of chemotherapy and surgery in its different molecular subtypes are still unknown. Methods: Data for this study were obtained from the SEER database (2010–2019). To identify the prognostic variables for patients with OBC, we conducted Cox regression analysis and constructed prognostic models using six machine learning algorithms to predict overall survival (OS) of OBC patients. Calibration curves and the area under the receiver operating characteristic (ROC) curve (AUC) were used to validate the accuracy and reliability of the logistic regression (LR) models. The clinical applicability of the predictive models was assessed using decision curve analysis (DCA). We also investigated the role of chemotherapy and surgery in OBC patients with different molecular subtypes using Kaplan–Meier (K-M) survival analysis and propensity score matching, and these results were further validated by subgroup Cox analysis. Results: The LR models performed best, with high precision and applicability, and predicted the OS of OBC patients most accurately (test set: 1-year AUC = 0.851, 3-year AUC = 0.790 and 5-year AUC = 0.824). Interestingly, we found that the N1 and N2 stage OBC patients had more favorable prognosis than N0 stage patients, but the N3 stage was similar to the N0 stage (OS: N0 vs. N1, HR = 0.6602, 95%CI 0.4568–0.9542, p < 0.05; N0 vs. N2, HR = 0.4716, 95%CI 0.2351–0.9464, p < 0.05; N0 vs. N3, HR = 0.96, 95%CI 0.6176–1.5844, p = 0.96). Age >80 years and distant metastases were also independent prognostic factors for OBC.
In terms of treatment, our multivariate Cox regression analysis showed that surgery and radiotherapy were both independent protective variables for OBC patients, but chemotherapy was not. We also found that chemotherapy significantly improved both OS and breast cancer-specific survival (BCSS) only in the HR−/HER2+ molecular subtype (OS: HR = 0.15, 95%CI 0.037–0.57, p < 0.01; BCSS: HR = 0.027, 95%CI 0.027–0.81, p < 0.05). Surgery, however, improved prognosis only in the HR−/HER2+ and HR+/HER2− subtypes. Conclusions: We analyzed the clinical features and prognostic factors of OBC patients and constructed machine learning prognostic models with high precision and applicability to predict their overall survival. The treatment results in different molecular subtypes suggested that primary surgery might improve the survival of the HR+/HER2− and HR−/HER2+ subtypes; however, only the HR−/HER2+ subtype could benefit from chemotherapy. The necessity of surgery and chemotherapy needs to be carefully considered for OBC patients with other subtypes.

1. Introduction

Occult breast cancer (OBC) is a rare malignant tumor, with an estimated incidence of 0.1% to 1% of all breast cancers. It is distinguished by the presence of pathological breast carcinoma in local lymph nodes or distal metastatic organs (usually axillary lymphadenopathy), but clinical or imaging examination fails to demonstrate the primary breast tumor [1,2,3]. Since OBC was initially described by Halsted in 1907 [4], its prognosis and management have been a matter of debate [1,5,6].
So far, the prognosis of OBC is still debatable. OBC patients have been shown to have a lower chance of mortality than non-OBC patients in some studies [7,8], whereas others have revealed comparable outcomes [9] or even significantly worse prognoses [10]. In addition, studies suggest that the prognostic factors of OBC patients vary greatly with different clinical features [3,5,11]. Moreover, nearly all these studies analyzed only patients with stage I to III disease, so the effect of distant metastasis on OBC patients is still unclear. Therefore, there is an urgent need for prognostic prediction models that accurately answer the concerns of all OBC patients about survival and help optimize their management.
So far, two studies have built nomograms to predict the breast cancer-specific survival (BCSS) of OBC patients [3,12]; however, one model can only be used for patients who have undergone surgery [3] and the other can only be used for early-stage patients [12]. Moreover, the accuracy of these models is far from sufficient (AUC or C-index of only around 0.7), and neither assessed overall survival (OS). Consequently, a more widely applicable and accurate model is needed. With the advent of machine learning, analyzing the vast, multi-dimensional and multi-modal data generated by clinical databases has become easier [13,14]. Machine learning can also help us construct artificial intelligence (AI) prognostic models with significantly improved accuracy [14,15,16]. Our previous study successfully used a machine learning algorithm to predict the prognosis of breast cancer patients with initial bone metastases and greatly improved the accuracy [15]. However, no one has utilized machine learning to create prognostic models for OBC patients. Thus, we used six machine learning algorithms to create prognostic models and found that the logistic regression (LR) algorithm performed best. Moreover, because this type of breast cancer is extremely rare, it has been challenging to conduct randomized controlled trials and standardize management. Many retrospective studies have focused on the effect of different treatment methods on the prognosis of all OBC patients [7,17,18,19,20,21], but none has analyzed it in relation to different molecular subtypes; hence, further inquiry is needed.
Our study explores the factors influencing the prognosis of OBC patients using the most up-to-date Surveillance, Epidemiology and End Results (SEER) database and is the first to create high-precision AI models to predict the 1-, 3- and 5-year OS of OBC patients. We are the first to investigate the role of the N0 stage, family income and months from diagnosis to therapy in OBC patients. Additionally, we investigated the treatment outcomes of surgery and chemotherapy in different molecular subtypes, which have never been reported before, and found that primary surgery might improve the survival of OBC patients with the HR+/HER2− and HR−/HER2+ subtypes; however, only the HR−/HER2+ subtype could benefit from chemotherapy. The necessity of surgery and chemotherapy needs to be carefully considered for OBC patients with other subtypes. This work provides insight into the prognosis of OBC patients and is helpful for their prognostic prediction and clinical management.

2. Materials and Methods

2.1. Data Source and Study Design

The workflow for the design and analyses of this study is illustrated in Figure 1. The data used for analysis were obtained from the openly accessible SEER database [SEER research plus data, 17 Regs, November 2021 Sub (2000–2019); version 8.4.0]. As information about distant metastasis and molecular subtypes has only been collected since 2010, and taking account of the data update, we analyzed only the data from 2010–2019. From this database, data on females with OBC were obtained. Inclusion criteria were as follows: (1) breast cancer was the sole primary cancer with which the patient had been diagnosed; (2) histopathological and structural evidence consistent with the International Classification of Diseases for Oncology, Third Edition (ICD-O-3); (3) age ≥18 years; and (4) T0 stage cancer according to the American Joint Committee on Cancer (AJCC). Exclusion criteria were as follows: (1) two or more primary cancers; (2) an unexplainable TNM stage, such as T0N0M0; and (3) unclear survival time. Patients were followed up until death, loss to follow-up or 31 December 2019.

2.2. Machine Learning Models

Patients were randomly divided into train and test sets in a 7:3 ratio. Characteristics that were statistically significant in the multivariate Cox analysis of the train set (age at diagnosis, molecular subtype, N stage, surgery, and bone, lung, liver and brain metastases) were used as features in our machine learning models to predict 1-, 3- and 5-year overall survival of OBC patients. Before these analyses were conducted, patients who were still alive but had survived less than 1, 3 or 5 years at the follow-up cut-off date were excluded. Before training, a binary response variable was derived from the survival information, with 1 denoting survival and 0 denoting death. On the test data, we compared the area under the curve (AUC) of logistic regression (LR), random forest (RF), support vector machine (SVM), decision tree (ID3), k-nearest neighbor (kNN) and extreme gradient boosting (XGBoost). The LR model was further assessed by calibration curve and decision curve analysis.
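The labelling and splitting steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy follow-up data, the random seed and the convention that a patient followed for exactly the horizon counts as a survivor are all assumptions.

```python
import random

def label_at_horizon(months, died, horizon):
    """Binary survival label at a given horizon (in months).

    Returns 1 if the patient was followed for at least `horizon` months,
    0 if the patient died before it, and None if the patient was still
    alive but followed for less than the horizon (excluded from that model).
    """
    if months >= horizon:
        return 1
    if died:
        return 0
    return None

def split_7_3(rows, seed=0):
    """Shuffle the rows and split them 7:3 into train and test sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical (follow-up months, died) pairs, for illustration only.
patients = [(8, True), (14, False), (40, True), (70, False), (5, False),
            (36, True), (62, True), (20, False), (90, False), (2, True)]

labels_3y = [label_at_horizon(m, d, 36) for m, d in patients]
train, test = split_7_3(patients)
```

Patients labelled `None` are the ones removed before a given horizon's model is trained, so the 1-, 3- and 5-year models are generally fitted on different subsets.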

2.3. LR

Logistic regression, also known as log odds regression, is a classification algorithm [22]. Under the assumption that the outcome variable has a probability distribution, logistic regression models the log odds of each patient experiencing the outcome as a linear function of the predictors. The log odds are converted to probabilities by means of a “sigmoid” function. Logistic regression is a highly interpretable algorithm and a hallmark of classical predictive modeling.
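As a small illustration of the log odds and sigmoid relationship described above, the sketch below turns a linear score into a probability. The coefficients, intercept and feature encoding are hypothetical values chosen for the example; they are not the fitted values of the models in this study.

```python
import math

def sigmoid(z):
    """Map a log odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_survival_probability(features, coefficients, intercept):
    """Linear log odds followed by the sigmoid transform."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    return sigmoid(log_odds)

# Hypothetical coefficients for three binary features:
# age > 80, surgery, bone metastasis.
coefficients = [-1.2, 0.8, -0.9]
intercept = 1.5

# A patient older than 80, without surgery, with bone metastasis.
p = predict_survival_probability([1, 0, 1], coefficients, intercept)
```

Because the model is linear on the log odds scale, each coefficient can be read as the change in log odds associated with its feature, which is the source of the interpretability noted above.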

2.4. SVM

Support vector machines map input vectors to a higher-dimensional feature space and locate the hyperplane that divides the data points into two groups [23]. The hyperplane is chosen to maximize the margin, that is, the distance between the decision hyperplane and the instances nearest to it. The identified hyperplane is the decision boundary between the two classes, and the resulting classifier has considerable generalization power.
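To make the notion of margin concrete, the following sketch computes the smallest distance between a set of labelled points and a candidate hyperplane. It assumes a linear boundary in two dimensions with labels coded as +1 and −1; it does not perform the optimization or the kernel mapping of a real SVM, and the points and hyperplanes are invented for illustration.

```python
import math

def signed_distance(point, weights, bias):
    """Signed Euclidean distance from a point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * x for w, x in zip(weights, point)) + bias) / norm

def margin(points, labels, weights, bias):
    """Smallest label-signed distance; positive only if every point is on its correct side."""
    return min(y * signed_distance(p, weights, bias)
               for p, y in zip(points, labels))

points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

wide = margin(points, labels, (1.0, 1.0), -2.0)    # hyperplane x + y = 2
narrow = margin(points, labels, (1.0, 1.0), -3.5)  # hyperplane x + y = 3.5
```

Both hyperplanes separate the two groups, but the first leaves a larger margin and is the kind of boundary an SVM would prefer.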

2.5. ID3 and RF

ID3, one of the earliest and most prevalent decision tree (DT) algorithms, is a tree-structured classification method in which nodes represent input factors and leaves represent decision outcomes [24]. Decision trees are easy to interpret and fast to learn. A random forest builds on such trees: k bootstrap samples are drawn at random, with replacement, from the original training sample set, N, and one classification tree is grown on each bootstrap sample; together, the k trees form the random forest.
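The two ingredients of a random forest mentioned above, bootstrap resampling and combining the trees' outputs, can be sketched as follows. Tree growing itself is omitted; the votes are hypothetical stand-ins for the predictions of five trained trees, and majority voting is assumed as the combination rule.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) items at random with replacement."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Return the class predicted by most trees."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)
training_rows = list(range(10))  # stand-ins for patient records

# One bootstrap sample per tree (k = 5 here).
samples = [bootstrap_sample(training_rows, rng) for _ in range(5)]

# Hypothetical predictions of the five trees for one patient.
tree_votes = [1, 0, 1, 1, 0]
forest_prediction = majority_vote(tree_votes)
```

Because each sample is drawn with replacement, some records appear several times and others not at all, which is what makes the individual trees differ from one another.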

2.6. XGBoost Model

The XGBoost algorithm modifies the gradient boosting algorithm by expanding the loss function to second order with a Taylor expansion, adding a regularization term to the loss function, and finding the extremum of the loss function using Newton’s method [25]. In addition, XGBoost employs a technique called “feature subsampling”, in which each tree is trained on a subset of all features (similar to a random forest), in order to make the trees more diverse, improve the generalization capability of the model and prevent overfitting.
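The second-order step described above can be illustrated for a single leaf. With the logistic loss, each sample contributes a gradient g = p − y and a Hessian h = p(1 − p), and minimizing the regularized second-order approximation gives the leaf weight w = −G / (H + λ), where G and H are the sums of g and h in the leaf. The sketch below assumes an L2 penalty λ on the leaf weight only; the labels and starting scores are toy values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leaf_weight(labels, raw_scores, reg_lambda=1.0):
    """Newton-style optimal weight for one leaf under the logistic loss."""
    grad_sum = 0.0
    hess_sum = 0.0
    for y, f in zip(labels, raw_scores):
        p = sigmoid(f)
        grad_sum += p - y          # first derivative of the loss
        hess_sum += p * (1.0 - p)  # second derivative of the loss
    return -grad_sum / (hess_sum + reg_lambda)

# Four samples in one leaf, all starting from a raw score of 0 (p = 0.5).
labels = [1, 1, 1, 0]
raw_scores = [0.0, 0.0, 0.0, 0.0]

w = leaf_weight(labels, raw_scores)
w_unregularized = leaf_weight(labels, raw_scores, reg_lambda=0.0)
```

The regularization term shrinks the leaf weight towards zero, which is one of the ways the algorithm limits overfitting.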

2.7. kNN

The kNN algorithm is founded on the premise that a sample belongs to the same category, and shares the same traits, as most of its k nearest neighbors in the feature space [26]. The category of a sample to be classified is therefore determined solely by the categories of its few nearest neighbors.
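A minimal sketch of this voting rule is given below, assuming Euclidean distance, k = 3 and a two-dimensional toy feature space; none of these choices is taken from the study.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(zip(train_points, train_labels),
                         key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5), (5, 6)]
train_labels = [0, 0, 0, 1, 1, 1]

near_origin = knn_predict(train_points, train_labels, (0.5, 0.5))
near_cluster = knn_predict(train_points, train_labels, (5.5, 5.5))
```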

2.8. Statistical Analysis

To explore the connection between diverse pathological and clinical traits and patient survival, we applied univariate Cox regression models. To evaluate patient mortality risk and determine independent prognostic factors, multivariate Cox analysis was then carried out. Patients who received chemotherapy or surgery were matched 1:1 with those who did not by propensity score matching (PSM), based on the variables in the univariate Cox regression, to examine the effect of these therapies on the outcome of patients with OBC [27]. On the PSM-adjusted population, we also conducted Kaplan–Meier (K-M) survival analysis [28] stratified by molecular subtype. Finally, we performed subgroup univariate and multivariate Cox analyses in OBC patients according to molecular subtype. R software (version 4.0.2) was used for all statistical analyses in this study. Statistical significance was defined as a two-sided p value of less than 0.05.
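For readers unfamiliar with the K-M estimator, the product-limit calculation can be sketched in a few lines. The analyses in this study were run in R; the Python function below is only an illustrative re-implementation on invented follow-up times, with censored patients marked by `False`.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival probability) at each death time."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up in months; True = died, False = censored.
times = [2, 3, 3, 5, 8, 8]
events = [True, True, False, True, False, True]

curve = kaplan_meier(times, events)
```

Censored patients do not lower the curve themselves, but they leave the risk set, so later deaths weigh more heavily on the estimate.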

3. Results

3.1. Clinical Characteristics of OBC Patients

In total, we obtained information on 906 eligible OBC patients from the SEER database (2010 to 2019). The clinicopathological traits of OBC patients are displayed in Table 1 and summarized below. The patients’ median age was 62 years; 142 (15.67%) patients were younger than 50 years, and 92 (10.15%) patients were older than 80 years. In total, 449 (49.56%) patients began therapy immediately following diagnosis, whereas 377 (41.61%) patients began therapy more than 1 month after diagnosis. For the molecular subtypes, HR+/HER2− made up 41.50%, followed by HR−/HER2− (16.11%), HR+/HER2+ (11.81%) and HR−/HER2+ (8.39%). In terms of ethnicity, 80.68% of the patients were white, and the most prevalent histopathological subtype was invasive ductal carcinoma (IDC; 30.57%). Regarding marital status, 49.89% of the patients were married and 14.79% were single. The proportions of stages N0 to N3 were 18.98%, 53.20%, 8.28% and 11.48%, respectively. Only 0.66% of the patients had grade I tumors, compared to 14.24% who had grade III or IV tumors. About 31.90% of the patients had an annual family income of more than US$75,000. In terms of treatment, only 24.50% of patients received surgery, 43.71% received radiotherapy and 64.13% received chemotherapy. Bone, lung, liver and brain metastases were present in 29.80%, 9.93%, 8.94% and 4.30% of all patients, respectively.

3.2. Univariate and Multivariate Cox Regression Analysis

To uncover significant factors influencing BCSS, as well as overall survival (OS) of OBC patients, we conducted univariate Cox regression analysis, including age at diagnosis, time from diagnosis to therapy, histological type, molecular subtype, marital status, N stage, race, grade, median family income (inflation-adjusted), distant metastases and information about treatment (Table 2).
Furthermore, we carried out multivariate Cox regression analysis to eliminate confounding factors and uncover independent factors correlated with BCSS and OS (Table 2). It showed that age >80 years and distant metastases were significantly linked to inferior BCSS and OS. The HR−/HER2− subtype showed worse BCSS and OS compared with HR+/HER2− patients, whereas the HR+/HER2+ and HR−/HER2+ subtypes did not exhibit any difference. Patients at the N1 and N2 stages had more favorable prognosis than those at the N0 stage, but the N3 stage was similar to the N0 stage. In terms of treatment, only primary tumor surgery prolonged both OS and BCSS in the multivariate Cox regression analysis; radiotherapy improved OS but not BCSS, and chemotherapy improved neither. Additionally, social variables such as family income and marital status were analyzed; however, they were not independent prognostic factors for OBC.

3.3. Constructing and Assessing Predictive Models for the Estimation of OBC Patients’ Prognosis

In light of the above findings, patients were sorted into train and test data, at random, in a 7:3 ratio (Supplementary Table S1) and univariate and multivariate Cox analysis was used to analyze the train set again (Supplementary Table S2). Eight independent prognostic factors were chosen as model features, and prognostic models were created with six machine learning algorithms to assess the OS of OBC patients at 1, 3 and 5 years. For both train and test sets, we created predicted ROC curves and calculated their AUCs.
Our LR models discriminated well in predicting OBC patients’ survival at 1 year (train set AUC = 0.884; test set AUC = 0.851), 3 years (train set AUC = 0.829; test set AUC = 0.790) and 5 years (train set AUC = 0.857; test set AUC = 0.824) (Figure 2A–F). We compared them with the other machine learning algorithms: RF (1-year AUC = 0.818; 3-year AUC = 0.765; 5-year AUC = 0.824), XGBoost (1-year AUC = 0.795; 3-year AUC = 0.792; 5-year AUC = 0.829), ID3 (1-year AUC = 0.665; 3-year AUC = 0.755; 5-year AUC = 0.788), kNN (1-year AUC = 0.773; 3-year AUC = 0.711; 5-year AUC = 0.784) and SVM (1-year AUC = 0.550; 3-year AUC = 0.676; 5-year AUC = 0.766). Overall, the LR model performed best (Table 3).
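For reference, the AUC values reported here can be read as the probability that a randomly chosen survivor receives a higher predicted score than a randomly chosen non-survivor. The sketch below computes the AUC directly from that definition on toy labels and scores; it is an illustration of the metric, not the code used to produce the values above.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = survived) and predicted survival probabilities.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]

auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values above should be read.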
Then, the accuracy of our LR models was further assessed using calibration curves [29]. According to the calibration curves of the train and test sets (Figure 3A–F), the predicted values of the LR models agreed closely with the observed values, indicating that the LR models had good accuracy. After determining the accuracy of the prediction models, we further analyzed their clinical applicability via decision curve analysis (DCA) [30]. The results showed that the LR models had a wide threshold probability range and a good net benefit in predicting 1-year, 3-year and 5-year OS rates for OBC (Figure 4A–F). Overall, our models performed well.
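The quantity plotted in a decision curve is the net benefit at each threshold probability pt, commonly defined as TP/n − (FP/n) × pt/(1 − pt), and compared with the strategies of treating everyone or no one (net benefit 0). The sketch below evaluates this formula on invented predictions in which 1 marks the outcome of interest; it illustrates the standard definition rather than the specific implementation behind Figure 4.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of acting on patients whose predicted probability reaches the threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Net benefit of acting on every patient regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical outcomes and predicted probabilities.
labels = [1, 1, 0, 0, 1, 0, 0, 0]
probabilities = [0.9, 0.7, 0.6, 0.2, 0.8, 0.1, 0.3, 0.4]

nb_model = net_benefit(labels, probabilities, 0.5)
nb_all = net_benefit_treat_all(labels, 0.5)
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both reference strategies.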

3.4. Benefits of Chemotherapy in OBC Patients Subdivided by Molecular Subtype

Unexpectedly, chemotherapy was not an independent prognostic factor for OBC patients in our multivariate Cox regression analysis (Table 2). Hence, we took a further look at how chemotherapy affected OBC patient prognosis. We contrasted the baseline features of patients receiving chemotherapy with those without chemotherapy (Table 4). These two groups had different baselines. Therefore, the observed disparity was adjusted with the help of propensity score matching (PSM). After PSM adjustment, there existed no discernible differences in baseline characteristics (Table 4).
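One common way to carry out 1:1 matching is greedy nearest-neighbor matching on the propensity score with a caliper. The sketch below shows that approach under stated assumptions: the propensity scores are taken as already estimated, the caliper of 0.05 and the patient identifiers are hypothetical, and the matching algorithm actually used in this study is not specified in the text.

```python
def match_one_to_one(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `control` map patient ids to propensity scores.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs

# Hypothetical propensity scores.
treated = {"T1": 0.30, "T2": 0.62, "T3": 0.95}
control = {"C1": 0.28, "C2": 0.60, "C3": 0.33, "C4": 0.10}

pairs = match_one_to_one(treated, control)
```

In this toy example one treated patient has no control within the caliper and is left unmatched, which shows how matching can shrink the analyzed population.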
According to the PSM-adjusted data, the chemotherapy group’s overall risk of death was reduced by about 28% (p = 0.013, HR: 0.72; 95% CI: 0.56–0.93) (Figure 5A), whereas there was no significant difference in the risk of breast cancer-related death (p = 0.17, HR: 0.81; 95% CI: 0.6–1.09) (Figure 5B). According to the stratified K-M survival analysis, only the HR−/HER2+ subgroup benefited substantially from chemotherapy in terms of OS and BCSS (Figure 6C,G); chemotherapy did not show any benefit for the OS and BCSS of the other three subtypes (Figure 6A,B,D–F,H). To further validate these results, we divided all 906 eligible OBC patients into four groups according to molecular subtype and performed univariate and multivariate Cox analyses again (Supplementary Table S3). These analyses showed that only the HR−/HER2+ subtype could benefit from chemotherapy, which is consistent with our results from the PSM-adjusted K-M survival analysis.

3.5. Benefits of Surgery for OBC Patients Subdivided by Molecular Subtype

In view of the above results, we looked further into the influence of surgery on the prognosis of OBC patients with different subtypes. After applying the same PSM method, there were no significant differences in baseline characteristics between patients receiving surgical treatment and those without surgery (Table 5).
According to the PSM-adjusted data, the surgery group’s overall risk of death was reduced by around 56% (p = 0.001, HR: 0.44; 95% CI: 0.27–0.73) (Figure 7A), and the risk of breast cancer-related death was reduced by approximately 51% (p = 0.012, HR: 0.49; 95% CI: 0.27–0.87) (Figure 7B). The stratified K-M survival analysis showed that surgical treatment significantly improved OS in the HR+/HER2− and HR−/HER2+ subtypes (Figure 8A,C). However, there was no significant difference in the HR+/HER2+ and HR−/HER2− subtypes (Figure 8B,D). In addition, the effect of surgical treatment on BCSS in patients with all subtypes was similar (Figure 8E–H). To further validate these results, we divided all 906 eligible OBC patients into four groups according to molecular subtype and performed univariate and multivariate Cox analyses again (Supplementary Table S3). These analyses showed that surgical intervention was an independent prognostic factor only for the HR+/HER2− and HR−/HER2+ subtypes, supporting our findings from the PSM-adjusted K-M survival analysis.
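The percentage reductions quoted for chemotherapy and surgery follow directly from the hazard ratios as (1 − HR) × 100, and an effect is significant at the 0.05 level when its 95% CI excludes 1. The short sketch below reproduces that arithmetic from the PSM-adjusted values reported in Sections 3.4 and 3.5; the function names are illustrative.

```python
def risk_reduction_percent(hazard_ratio):
    """Relative reduction in hazard implied by a hazard ratio below 1."""
    return (1.0 - hazard_ratio) * 100.0

def ci_excludes_one(lower, upper):
    """True when the confidence interval lies entirely on one side of 1."""
    return upper < 1.0 or lower > 1.0

# (HR, lower 95% CI, upper 95% CI) as reported in the text.
reported = {
    "chemotherapy, OS": (0.72, 0.56, 0.93),
    "chemotherapy, BCSS": (0.81, 0.60, 1.09),
    "surgery, OS": (0.44, 0.27, 0.73),
    "surgery, BCSS": (0.49, 0.27, 0.87),
}

for name, (hr, lower, upper) in reported.items():
    reduction = round(risk_reduction_percent(hr))
    print(name, f"{reduction}% lower hazard,", "CI excludes 1:", ci_excludes_one(lower, upper))
```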

4. Discussion

OBC is an unusual clinical entity and represents a therapeutic challenge for doctors [31]. Since this type of breast cancer is quite rare, its prognosis remains debatable, and standardized management of OBC is still difficult [1,6]. Some large-sample retrospective studies using SEER could help solve the problem of rare cases, but most previous studies covered a long time span [7,20,21], and some cases that were not OBC might have been classified as OBC in the past because of the limitations of imaging technology [6,32,33]. The present study, as far as we are aware, is the most up-to-date one to examine the clinical traits and prognosis of OBC patients. In two recent investigations, several nomogram prediction models for OBC patients were created using SEER populations [3,12]; however, their models could not predict OS and could only be used for patients who had undergone surgery [3] or were at an early stage [12]. Moreover, the accuracy of their models is far from sufficient. Our research is also the first to develop AI prognostic models for OBC patients, and our LR models are more widely applicable and more accurate in predicting the OS of OBC patients.
This study identified several independent factors associated with poor prognosis, including age ≥80, triple-negative molecular subtype, N0/N3 stage, and distant metastasis. Some studies have shown that OBC patients aged ≥70 are more likely to have worse OS [20,21], whereas other studies have claimed that age was not a risk factor [3,5,12]. We looked at a wider range of age categories and found worse OS for people aged ≥80. Compared with the HR+/HER2− subtype, only the HR−/HER2− subtype showed poorer survival, and some studies have also shown that ER+ status was an independent favorable factor [3,5], which implies the importance of endocrine therapy for HR+ OBC patients. In contrast, several studies have indicated that OBC patients of different subtypes showed no difference in survival [7,12,20], which could be attributed to the diverse enrolled populations. Interestingly, compared with the N0 stage, OBC patients at the N1 and N2 stages showed better OS and BCSS, but there was no difference between the N0 and N3 stages. Perhaps the main reason is that N0 stage OBC must be accompanied by distant metastasis, which is itself an unfavorable independent prognostic factor; thus, OBC patients at the N0 and N3 stages had the worst prognosis. Some previous studies have shown that N2+ is an unfavorable independent prognostic factor for OBC [3,12], but they all used the N1 stage as the reference; in other words, we are the first to have investigated the role of the N0 stage in OBC patients. We also examined the role of family income and months from diagnosis to therapy, which have never been reported in OBC patients, although neither is a prognostic factor in OBC.
In terms of treatment, surgery and radiotherapy were both independent protective variables for OBC patients, according to our multivariate Cox regression analysis of the data, whereas chemotherapy was not. Many studies have focused on the therapeutic effects of different surgical methods, such as mastectomy or breast-conserving treatment, combined with radiotherapy and indicated that breast conservation can be considered in patients with OBC [17,19,20,21,34,35]. Surprisingly, previous studies have also reported that chemotherapy was not an independent prognostic factor in OBC patients [3,11,20]. However, no one had investigated the role of chemotherapy and surgery in OBC patients with different subtypes, so we explored this issue further. We found that chemotherapy significantly improved both OS and BCSS only in OBC patients with the HR−/HER2+ subtype, suggesting that anti-HER2-targeted therapy combined with chemotherapy may prolong the survival of OBC patients and that endocrine therapy is more important than chemotherapy in the HR+ subtype. We also found that surgery was an independent prognostic factor only in the HR+/HER2− and HR−/HER2+ subtypes, indicating that combined endocrine therapy and surgical treatment is very important for the HR+/HER2− subtype and that multimodal treatment, involving chemotherapy, surgical treatment and anti-HER2-targeted therapy, could benefit the HR−/HER2+ subtype. For OBC patients with other subtypes, the necessity of surgery and chemotherapy needs to be carefully considered.
Our study has several limitations despite its promising findings. First, the current database contains no detailed information on systemic therapy, such as the dosage of each drug or the chemotherapy regimen; hence, we were unable to examine the relationships between various chemotherapy regimens and patient survival. In addition, the most recent version of the SEER database does not contain any information about endocrine therapy. Second, the SEER database represents the general situation well, but due to ethnic differences, it may not always apply to Asian, and especially Chinese, patients. Third, owing to the limited number of cases, not all patients could be matched in PSM, so selection bias might have occurred.

5. Conclusions

We analyzed the clinical features of OBC patients and constructed three machine learning prognostic models with high precision and applicability to predict their survival. According to our analysis of possible prognostic variables for OBC patients, patients with the HR−/HER2+ subtype may benefit from chemotherapy, whereas patients with the HR+/HER2− and HR−/HER2+ subtypes may benefit from primary surgery.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm12093097/s1, Table S1: Baseline characteristics of occult breast cancer (OBC) patients in train set and test set; Table S2: Univariate and multivariate Cox analysis of OBC characteristics extracted from train data; Table S3: Univariate and multivariate Cox analysis of OBC characteristics (stratified by molecular subtype).

Author Contributions

Conceptualization, J.Q., C.L., S.Z. and X.Z.; data curation, J.Q. and C.L.; formal analysis, J.Q., C.L. and M.L.; funding acquisition, J.Q., S.Z. and X.Z.; methodology, J.Q., C.L., M.L. and Z.F.; project administration, F.W.; supervision, J.L., W.W., S.Z. and X.Z.; validation, M.L., Z.F., J.L. and W.W.; writing—original draft preparation, J.Q., C.L., Y.W., S.Z. and X.Z.; writing—review and editing, J.Q., C.L., M.L., Y.W., Z.F., J.L., W.W., F.W., S.Z. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded in part by the following: the National Science Foundation of China (81903856, to X. Zhao; 82174164, to S.Q. Zhang; 82103569, to J.K. Qu); the Key Science and Technology Program of Shaanxi Province (2021KW-57, to X. Zhao; 2021KW-60, to J.K. Qu); the Scientific Research Fund of the Second Affiliated Hospital of Xi’an Jiaotong University (RC(XM)202004, to X. Zhao); the Free Exploring Fund of Xi’an Jiaotong University (xzy012022096, to X. Zhao; xzy012022097, to J.K. Qu); and the Medical “basic-clinical” Integration and Innovation Project of Xi’an Jiaotong University (YXJLRH2022088, to J.K. Qu).

Institutional Review Board Statement

Ethical review and approval were waived for this study due to the fact that the data are fully de-identified and no intervention on patients was performed.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data here are publicly available in the SEER database (https://seer.cancer.gov/ (accessed on 15 April 2022)).

Acknowledgments

We thank all staff at the SEER database for their contribution in data collection, maintenance, distribution and so on. We would also like to thank all the developers of the R programming package for selflessly sharing their code.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fayanju, O.M.; Jeffe, D.B.; Margenthaler, J.A. Occult primary breast cancer at a comprehensive cancer center. J. Surg. Res. 2013, 185, 684–689. [Google Scholar] [CrossRef] [PubMed]
  2. Hotta, M.; Khurana, M.; Sung, M.; O’Connor, V.V. Understanding the Biology of Occult Breast Cancer: Examination of 31 Cases Reveals Aggressive Behavior. Am. Surg. 2020, 86, 116–118. [Google Scholar] [CrossRef]
  3. Zhang, D.; Zhai, J.; Li, L.; Wu, Y.; Ma, F.; Xu, B. Prognostic Factors and a Model for Occult Breast Cancer: A Population-Based Cohort Study. J. Clin. Med. 2022, 11, 6804. [Google Scholar] [CrossRef]
  4. Halsted, W.S. The Results of Radical Operations for the Cure of Carcinoma of the Breast. Ann. Surg. 1907, 46, 1–19. [Google Scholar] [CrossRef] [PubMed]
  5. Walker, G.V.; Smith, G.; Perkins, G.H.; Oh, J.L.; Woodward, W.; Yu, T.-K.; Hunt, K.K.; Hoffman, K.; Strom, E.A.; Buchholz, T.A. Population-based analysis of occult primary breast cancer with axillary lymph node metastasis. Cancer 2010, 116, 4000–4006. [Google Scholar] [CrossRef] [PubMed]
  6. Ofri, A.; Moore, K. Occult breast cancer: Where are we at? Breast 2020, 54, 211–215. [Google Scholar] [CrossRef] [PubMed]
  7. Ge, L.-P.; Liu, X.-Y.; Xiao, Y.; Gou, Z.; Zhao, S.; Jiang, Y.-Z.; Di, G.-H. Clinicopathological characteristics and treatment outcomes of occult breast cancer: A SEER population-based study. Cancer Manag. Res. 2018, 10, 4381–4391. [Google Scholar] [CrossRef]
  8. Huang, K.-Y.; Zhang, J.; Fu, W.-F.; Lin, Y.-X.; Song, C.-G. Different Clinicopathological Characteristics and Prognostic Factors for Occult and Non-occult Breast Cancer: Analysis of the SEER Database. Front. Oncol. 2020, 10, 1420. [Google Scholar] [CrossRef]
  9. Ping, S.; Ming, W.H.; Bin, S.H.; Wen, W.D.; Cheng, Q.; Jun, C.C.; Bin, H.Y.; Jian, C.Z. Comparison of clinical characteristics between occult and non-occult breast cancer. J. BUON 2014, 19, 662–666. [Google Scholar]
  10. Jackson, B.; Scott-Conner, C.; Moulder, J. Axillary metastasis from occult breast carcinoma: Diagnosis and management. Am. Surg. 1995, 61, 431–434. [Google Scholar]
  11. Hessler, L.K.; Molitoris, J.K.; Rosenblatt, P.Y.; Bellavance, E.C.; Nichols, E.M.; Tkaczuk, K.H.R.; Feigenberg, S.J.; Bentzen, S.M.; Kesmodel, S.B. Factors Influencing Management and Outcome in Patients with Occult Breast Cancer with Axillary Lymph Node Involvement: Analysis of the National Cancer Database. Ann. Surg. Oncol. 2017, 24, 2907–2914. [Google Scholar] [CrossRef] [PubMed]
  12. Man, X.; Xu, H.; Wang, H.; Zhao, J.; Chen, X.; Yin, S.; Tan, Q.; Huang, J.; Sun, S.; Zhou, D.; et al. Survival analysis and nomogram for early-stage occult breast cancer with positive lymph nodes based on the SEER database. Ann. Transl. Med. 2022, 10, 1351. [Google Scholar] [CrossRef] [PubMed]
  13. Sajda, P. Machine Learning for Detection and Diagnosis of Disease. Annu. Rev. Biomed. Eng. 2006, 8, 537–565. [Google Scholar] [CrossRef]
  14. Tran, K.A.; Kondrashova, O.; Bradley, A.; Williams, E.D.; Pearson, J.V.; Waddell, N. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med. 2021, 13, 152. [Google Scholar] [CrossRef] [PubMed]
  15. Li, C.; Liu, M.; Li, J.; Wang, W.; Feng, C.; Cai, Y.; Wu, F.; Zhao, X.; Du, C.; Zhang, Y.; et al. Machine learning predicts the prognosis of breast cancer patients with initial bone metastases. Front. Public Health 2022, 10, 1003976. [Google Scholar] [CrossRef] [PubMed]
  16. Lipkova, J.; Chen, R.J.; Chen, B.; Lu, M.Y.; Barbieri, M.; Shao, D.; Vaidya, A.J.; Chen, C.; Zhuang, L.; Williamson, D.F.; et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 2022, 40, 1095–1110. [Google Scholar] [CrossRef] [PubMed]
  17. Tsai, C.; Zhao, B.; Chan, T.; Blair, S.L. Treatment for occult breast cancer: A propensity score analysis of the National Cancer Database. Am. J. Surg. 2019, 220, 153–160. [Google Scholar] [CrossRef]
  18. Terada, M.; Miyashita, M.; Kumamaru, H.; Miyata, H.; Tamura, K.; Yoshida, M.; Ogo, E.; Nagahashi, M.; Asaga, S.; Kojima, Y.; et al. Surgical treatment trends and identification of primary breast tumors after surgery in occult breast cancer: A study based on the Japanese National Clinical Database—Breast Cancer Registry. Breast Cancer 2022, 29, 698–708. [Google Scholar] [CrossRef]
  19. Johnson, H.M.; Irish, W.; Vohra, N.A.; Wong, J.H. The effect of local therapy on breast cancer-specific mortality of women with occult breast cancer and advanced nodal disease (N2/N3): A population analysis. Breast Cancer Res. Treat. 2019, 177, 155–164. [Google Scholar] [CrossRef]
  20. Zhao, Z.; Zhang, T.; Yao, Y.; Lu, X. Clinicopathological characteristics and treatment outcomes of occult breast cancer: A population-based study. BMC Surg. 2022, 22, 143. [Google Scholar] [CrossRef]
  21. Kim, B.H.; Kwon, J.; Kim, K. Evaluation of the Benefit of Radiotherapy in Patients with Occult Breast Cancer: A Population-Based Analysis of the SEER Database. Cancer Res. Treat. 2018, 50, 551–561. [Google Scholar] [CrossRef] [PubMed]
  22. Domínguez-Almendros, S.; Benítez-Parejo, N.; Gonzalez-Ramirez, A. Logistic regression models. Allergol. Immunopathol. (Madr.) 2011, 39, 295–305. [Google Scholar] [CrossRef]
  23. Huang, S.; Cai, N.; Pacheco, P.P.; Narandes, S.; Wang, Y.; Xu, W. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics. Cancer Genom. Proteom. 2018, 15, 41–51. [Google Scholar] [CrossRef]
  24. Kourou, K.; Exarchos, T.P.; Exarchos, K.P.; Karamouzis, M.V.; Fotiadis, D.I. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J. 2015, 13, 8–17. [Google Scholar] [CrossRef] [PubMed]
  25. Yu, Y.; Tran, H. An XGBoost-Based Fitted Q Iteration for Finding the Optimal STI Strategies for HIV Patients. IEEE Trans. Neural Netw. Learn. Syst. 2022, 1–9. [Google Scholar] [CrossRef] [PubMed]
  26. Zhang, S.; Li, X.; Zong, M.; Zhu, X.; Wang, R. Efficient kNN Classification With Different Numbers of Nearest Neighbors. IEEE Trans. Neural Netw. Learn. Syst. 2017, 29, 1774–1785. [Google Scholar] [CrossRef]
  27. Kane, L.T.; Fang, T.; Galetta, M.S.; Goyal, D.K.; Nicholson, K.J.; Kepler, C.K.; Vaccaro, A.R.; Schroeder, G.D. Propensity Score Matching: A Statistical Method. Clin. Spine Surg. 2020, 33, 120–122. [Google Scholar] [CrossRef]
  28. Barakat, A.; Mittal, A.; Ricketts, D.; Rogers, A.B. Understanding survival analysis: Actuarial life tables and the Kaplan–Meier plot. Br. J. Hosp. Med. 2019, 80, 642–646. [Google Scholar] [CrossRef]
  29. Austin, P.C.; Harrell, F.E., Jr.; van Klaveren, D. Graphical calibration curves and the integrated calibration index (ICI) for survival models. Stat. Med. 2020, 39, 2714–2742. [Google Scholar] [CrossRef]
  30. Vickers, A.J.; Cronin, A.M.; Elkin, E.B.; Gonen, M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med. Inform. Decis. Mak. 2008, 8, 53. [Google Scholar] [CrossRef]
  31. Wild, J.B.; Thrush, S.; Vidya, R. What can National Registry data tell us about occult breast cancer? Chin. Clin. Oncol. 2019, 8, S15. [Google Scholar] [CrossRef] [PubMed]
  32. de Bresser, J.; de Vos, B.; van der Ent, F.; Hulsewé, K. Breast MRI in clinically and mammographically occult breast cancer presenting with an axillary metastasis: A systematic review. Eur. J. Surg. Oncol. (EJSO) 2010, 36, 114–119. [Google Scholar] [CrossRef] [PubMed]
  33. Ahmed, M.; Douek, M. The management of screen-detected breast cancer. Anticancer Res. 2014, 34, 1141–1146. [Google Scholar]
  34. Kim, H.; Park, W.; Kim, S.S.; Ahn, S.J.; Kim, Y.B.; Kim, T.H.; Kim, J.H.; Choi, J.-H.; Park, H.J.; Chang, J.S.; et al. Outcome of breast-conserving treatment for axillary lymph node metastasis from occult breast cancer with negative breast MRI. Breast 2020, 49, 63–69. [Google Scholar] [CrossRef] [PubMed]
  35. Wang, J.; Zhang, Y.F.; Wang, X.; Wang, J.; Yang, X.; Gao, Y.Q.; Fang, Y. Treatment outcomes of occult breast carcinoma and prognostic analyses. Chin. Med. J. 2013, 126, 3026–3029. [Google Scholar] [PubMed]
Figure 1. The flowchart details the procedure for carrying out the study and the analysis of data.
Figure 2. ROC curve of LR model. ROC curve for (A) the 1-year prognostic model using train data; (B) the 1-year prognostic model using test data; (C) the 3-year prognostic model using train data; (D) the 3-year prognostic model using test data; (E) the 5-year prognostic model using train data; and (F) the 5-year prognostic model using test data.
Figure 3. Calibration curve of LR model. Calibration curve of the LR model predicting OS in (A) the 1-year train set and (B) the test set; (C) the 3-year train set and (D) the test set; and (E) the 5-year train set and (F) the test set. CC: calibration curve.
Figure 4. Decision curves of the LR model predicting OS in (A) the 1-year train set and (B) the test set; (C) the 3-year train set and (D) the test set; and (E) the 5-year train set and (F) the test set. The y-axis shows the net benefit, and the x-axis shows the threshold probability. The dark green line represents the scenario in which all patients are alive, and the blue line represents the scenario in which none of the patients are alive.
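As an illustration of how the net benefit on the y-axis of a decision curve is usually obtained, the following minimal sketch applies the standard decision curve analysis formula, net benefit = TP/n − FP/n × pt/(1 − pt), to a small made-up data set. The function names and the toy values are assumptions for illustration only and are not taken from the study; the two reference strategies ("classify everyone as positive" and "classify no one as positive") are shown for comparison.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_all(y_true, threshold):
    """Reference strategy: treat every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = event) and predicted probabilities.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.05]

nb_model = net_benefit(y_true, y_prob, 0.5)
nb_all = net_benefit_all(y_true, 0.5)
nb_none = 0.0  # Reference strategy: treat no patient as positive.
print(nb_model, nb_all, nb_none)
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies, which is what the curves in each panel compare across the range of thresholds.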
Figure 5. OS and BCSS of chemotherapy-treated OBC patients after PSM adjustment. Kaplan–Meier (K-M) survival analysis: (A) PSM-adjusted OS of chemotherapy-treated OBC; (B) PSM-adjusted BCSS of chemotherapy-treated OBC.
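For readers unfamiliar with the K-M curves shown in Figures 5 to 8, the sketch below computes the Kaplan–Meier product-limit estimate on a handful of invented follow-up times. It is a plain-Python illustration under assumed toy data, not the software or data used in the study; `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    times:  follow-up time per patient (e.g., months)
    events: 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical cohort: six patients, two of them censored.
times = [5, 8, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why the estimate only steps down at observed event times.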
Figure 6. OS and BCSS of chemotherapy-treated OBC patients after PSM adjustment (stratified by molecular subtype). (A) OS of HR+/HER2− subtype; (B) OS of HR+/HER2+ subtype; (C) OS of HR−/HER2+ subtype; (D) OS of HR−/HER2− subtype; (E) BCSS of HR+/HER2− subtype; (F) BCSS of HR+/HER2+ subtype; (G) BCSS of HR−/HER2+ subtype; (H) BCSS of HR−/HER2− subtype.
Figure 7. OS and BCSS of surgery-treated OBC patients after PSM adjustment. (A) OS of surgery-treated OBC patients after PSM adjustment; (B) BCSS of surgery-treated OBC patients after PSM adjustment.
Figure 8. OS and BCSS of surgery-treated OBC patients after PSM adjustment (stratified by molecular subtype). (A) OS of HR+/HER2− subtype; (B) OS of HR+/HER2+ subtype; (C) OS of HR−/HER2+ subtype; (D) OS of HR−/HER2− subtype; (E) BCSS of HR+/HER2− subtype; (F) BCSS of HR+/HER2+ subtype; (G) BCSS of HR−/HER2+ subtype; (H) BCSS of HR−/HER2− subtype.
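Table 1. Baseline characteristics of patients with occult breast cancer (OBC).

| Characteristic | Category | Cases | % |
| --- | --- | --- | --- |
| Age at diagnosis | <50 | 142 | 15.67% |
| | 50–59 | 247 | 27.26% |
| | 60–69 | 252 | 27.81% |
| | 70–79 | 173 | 19.09% |
| | 80+ | 92 | 10.15% |
| Months from diagnosis to therapy | 0 month | 449 | 49.56% |
| | ≥1 month | 377 | 41.61% |
| | unknown | 80 | 8.83% |
| Subtype | HR+/HER2− | 376 | 41.50% |
| | HR+/HER2+ | 107 | 11.81% |
| | HR−/HER2+ | 76 | 8.39% |
| | HR−/HER2− | 146 | 16.11% |
| | unknown | 201 | 22.19% |
| Race | white | 731 | 80.68% |
| | black | 95 | 10.49% |
| | other | 74 | 8.17% |
| | unknown | 6 | 0.66% |
| Histological type | IDC | 277 | 30.57% |
| | ILC | 78 | 8.61% |
| | other | 551 | 60.82% |
| Marital status | married | 452 | 49.89% |
| | single | 134 | 14.79% |
| | divorced/other | 145 | 16.00% |
| | widowed | 145 | 16.00% |
| | unknown | 30 | 3.31% |
| N stage | N0 | 172 | 18.98% |
| | N1 | 482 | 53.20% |
| | N2 | 75 | 8.28% |
| | N3 | 104 | 11.48% |
| | unknown | 73 | 8.06% |
| Grade | I; well differentiated | 6 | 0.66% |
| | II; moderately differentiated | 28 | 3.09% |
| | III/IV; poorly differentiated | 129 | 14.24% |
| | unknown | 743 | 82.01% |
| Median household income (inflation adjusted) | <44,999$ | 73 | 8.06% |
| | 45,000–54,999$ | 121 | 13.36% |
| | 55,000–64,999$ | 215 | 23.73% |
| | 65,000–74,999$ | 208 | 22.96% |
| | 75,000$+ | 289 | 31.90% |
| Chemotherapy | no/unknown | 325 | 35.87% |
| | yes | 581 | 64.13% |
| Radiotherapy | no/unknown | 510 | 56.29% |
| | yes | 396 | 43.71% |
| Surgery | no | 676 | 74.61% |
| | yes | 222 | 24.50% |
| | unknown | 8 | 0.88% |
| Bone metastases | no | 612 | 67.55% |
| | yes | 270 | 29.80% |
| | unknown | 24 | 2.65% |
| Liver metastases | no | 798 | 88.08% |
| | yes | 81 | 8.94% |
| | unknown | 27 | 2.98% |
| Lung metastases | no | 789 | 87.09% |
| | yes | 90 | 9.93% |
| | unknown | 27 | 2.98% |
| Brain metastases | no | 838 | 92.49% |
| | yes | 39 | 4.30% |
| | unknown | 29 | 3.20% |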
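Table 2. Univariate and multivariate Cox analysis of OBC characteristics.

| Characteristic | Category | Univariate OS HR | Univariate OS 95% CI | Univariate OS p value | Univariate BCSS HR | Univariate BCSS 95% CI | Univariate BCSS p value | Multivariate OS HR | Multivariate OS 95% CI | Multivariate OS p value | Multivariate BCSS HR | Multivariate BCSS 95% CI | Multivariate BCSS p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age at diagnosis | <50 | reference | | | reference | | | reference | | | reference | | |
| | 50–59 | 1.338 | 0.9089–1.970 | 0.14 | 1.439 | 0.9340–2.216 | 0.099 | 0.79 | 0.4593–1.3585 | 0.39 | 0.7589 | 0.4224–1.3634 | 0.36 |
| | 60–69 | 1.744 | 1.1980–2.538 | ** | 1.474 | 0.9554–2.275 | 0.079 | 1.1716 | 0.6870–1.9979 | 0.56 | 0.9772 | 0.5434–1.7572 | 0.94 |
| | 70–79 | 2.195 | 1.4956–3.221 | *** | 2.015 | 1.2990–3.125 | ** | 1.3483 | 0.7734–2.3508 | 0.29 | 1.2467 | 0.6879–2.2593 | 0.47 |
| | 80+ | 4.59 | 3.0608–6.882 | *** | 3.645 | 2.2560–5.888 | *** | 2.1283 | 1.0377–4.3651 | * | 1.5213 | 0.7263–3.1868 | 0.27 |
| Months from diagnosis to therapy | 0 month | reference | | | reference | | | / | / | / | / | / | / |
| | ≥1 month | 1.085 | 0.8694–1.353 | 0.47 | 1.075 | 0.8335–1.388 | 0.57 | / | / | / | / | / | / |
| Subtypes | HR+/HER2− | reference | | | reference | | | reference | | | reference | | |
| | HR+/HER2+ | 0.5526 | 0.3583–0.8523 | ** | 0.6496 | 0.4018–1.050 | 0.078 | 0.8763 | 0.5335–1.4395 | 0.60 | 1.0001 | 0.5760–1.7363 | 1.0 |
| | HR−/HER2+ | 0.8551 | 0.5586–1.3088 | 0.47 | 0.9111 | 0.5578–1.488 | 0.71 | 0.8652 | 0.4985–1.5016 | 0.61 | 0.9611 | 0.5259–1.7564 | 0.90 |
| | HR−/HER2− | 1.1147 | 0.8188–1.5177 | 0.49 | 1.2257 | 0.8594–1.748 | 0.26 | 1.9999 | 1.3739–2.9111 | *** | 2.6362 | 1.7191–4.0426 | *** |
| Race | white | reference | | | reference | | | / | / | / | / | / | / |
| | black | 0.8584 | 0.6055–1.217 | 0.39 | 0.738 | 0.4767–1.143 | 0.17 | / | / | / | / | / | / |
| | other | 0.8479 | 0.5642–1.274 | 0.43 | 0.8337 | 0.5161–1.347 | 0.46 | / | / | / | / | / | / |
| Histological type | IDC | reference | | | reference | | | reference | | | reference | | |
| | ILC | 2.951 | 2.036–4.276 | *** | 3.267 | 2.178–4.899 | *** | 1.5221 | 0.8794–2.6345 | 0.13 | 2.0359 | 1.1437–3.6242 | * |
| | other | 1.875 | 1.434–2.452 | *** | 1.586 | 1.167–2.156 | ** | 0.8826 | 0.6165–1.2634 | 0.49 | 0.7979 | 0.5335–1.1933 | 0.27 |
| Marital status | married | reference | | | reference | | | reference | | | / | / | / |
| | divorced/separated/other | 1.191 | 0.8917–1.591 | 0.24 | 1.105 | 0.7873–1.551 | 0.56 | 1.1943 | 0.8067–1.7680 | 0.38 | / | / | / |
| | single | 1.272 | 0.9457–1.711 | 0.11 | 1.126 | 0.7898–1.605 | 0.51 | 1.0995 | 0.7145–1.6920 | 0.67 | / | / | / |
| | widowed | 1.522 | 1.1612–1.996 | ** | 1.281 | 0.9222–1.779 | 0.14 | 0.7685 | 0.4741–1.2456 | 0.67 | / | / | / |
| N stage | N0 | reference | | | reference | | | reference | | | reference | | |
| | N1 | 0.3618 | 0.28350–0.4616 | *** | 0.3474 | 0.26222–0.4604 | *** | 0.6602 | 0.4568–0.9542 | * | 0.5671 | 0.3734–0.8612 | ** |
| | N2 | 0.1695 | 0.09877–0.2907 | *** | 0.1321 | 0.06642–0.2628 | *** | 0.4716 | 0.2351–0.9464 | * | 0.3411 | 0.1456–0.7991 | * |
| | N3 | 0.3757 | 0.25815–0.5468 | *** | 0.4095 | 0.27027–0.6206 | *** | 0.9892 | 0.6176–1.5844 | 0.96 | 1.0411 | 0.6160–1.7595 | 0.88 |
| Grade | well differentiated | reference | | | reference | | | / | / | / | / | / | / |
| | moderately differentiated | 2.73 | 0.3518–21.18 | 0.34 | / | / | / | / | / | / | / | / | / |
| | poorly differentiated | 2.271 | 0.3118–16.54 | 0.42 | / | / | / | / | / | / | / | / | / |
| Median household income (inflation adjusted) | <45,000$ | reference | | | reference | | | / | / | / | / | / | / |
| | 45,000–54,999$ | 0.9608 | 0.6266–1.473 | 0.85 | 0.8186 | 0.4983–1.345 | 0.43 | / | / | / | / | / | / |
| | 55,000–64,999$ | 0.8483 | 0.5717–1.259 | 0.41 | 0.7758 | 0.4949–1.216 | 0.27 | / | / | / | / | / | / |
| | 65,000–74,999$ | 0.701 | 0.4634–1.060 | 0.093 | 0.6992 | 0.4385–1.115 | 0.13 | / | / | / | / | / | / |
| | >74,999$ | 0.8262 | 0.5619–1.215 | 0.33 | 0.714 | 0.4592–1.110 | 0.14 | / | / | / | / | / | / |
| Chemotherapy | no/unknown | reference | | | reference | | | reference | | | reference | | |
| | yes | 0.3753 | 0.3065–0.4594 | *** | 0.447 | 0.3525–0.567 | *** | 0.7487 | 0.5273–1.0632 | 0.11 | 0.8836 | 0.5923–1.3181 | 0.54 |
| Radiotherapy | no/unknown | reference | | | reference | | | reference | | | reference | | |
| | yes | 0.4247 | 0.3397–0.5308 | *** | 0.4806 | 0.3719–0.621 | *** | 0.6773 | 0.4939–0.9289 | * | 0.8091 | 0.5692–1.1503 | 0.24 |
| Surgery | no/unknown | reference | | | reference | | | reference | | | reference | | |
| | yes | 0.1427 | 0.09428–0.216 | *** | 0.1583 | 0.09917–0.2526 | *** | 0.4271 | 0.2609–0.6990 | *** | 0.4491 | 0.2593–0.7777 | ** |
| Bone metastasis | no/unknown | reference | | | reference | | | | | | | | |
| | yes | 3.568 | 2.892–4.402 | *** | 4.121 | 3.216–5.281 | *** | 2.0779 | 1.4807–2.9161 | *** | 2.5002 | 1.6968–3.6839 | *** |
| Liver metastasis | no/unknown | reference | | | reference | | | reference | | | reference | | |
| | yes | 4.443 | 3.358–5.877 | *** | 4.553 | 3.279–6.322 | *** | 2.4709 | 1.5316–3.9863 | *** | 2.604 | 1.5423–4.3963 | *** |
| Lung metastasis | no/unknown | reference | | | reference | | | reference | | | reference | | |
| | yes | 4.03 | 3.106–5.229 | *** | 3.432 | 2.483–4.745 | *** | 2.572 | 1.7343–3.8142 | *** | 2.4935 | 1.5642–3.9748 | *** |
| Brain metastasis | no/unknown | reference | | | reference | | | reference | | | reference | | |
| | yes | 2.83 | 1.933–4.142 | *** | 2.876 | 1.838–4.5 | *** | 2.3986 | 1.3387–4.2976 | ** | 2.0765 | 1.0757–4.0083 | * |

* p < 0.05, ** p < 0.01, *** p < 0.001.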
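To make the relationship between a hazard ratio, its 95% CI, and the significance markers in Table 2 more concrete, the sketch below back-calculates an approximate Wald test from a reported HR and interval. It assumes the intervals are symmetric on the log scale with a 1.96 multiplier, which is the usual convention for Cox models but is not stated in the table; `wald_from_ci` is a hypothetical helper, and the inputs are the multivariate OS rows for N1 and N3.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Recover log-HR, standard error, z statistic and two-sided p from HR and CI."""
    beta = math.log(hr)
    # The CI is assumed to be exp(beta +/- z_crit * se).
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return beta, se, z, p


# Multivariate OS, N1 vs. N0 (reported as p < 0.05).
_, _, z_n1, p_n1 = wald_from_ci(0.6602, 0.4568, 0.9542)
# Multivariate OS, N3 vs. N0 (reported as p = 0.96).
_, _, z_n3, p_n3 = wald_from_ci(0.9892, 0.6176, 1.5844)

print(round(z_n1, 2), round(p_n1, 3))
print(round(z_n3, 2), round(p_n3, 2))
```

Under these assumptions, an interval that excludes 1 corresponds to p < 0.05, which matches the way the N1 and N3 rows are flagged.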
Table 3. Performance of prognostic models constructed by machine learning algorithms using test data (area under the ROC curve).

| Model | 1-Year Survival | 3-Year Survival | 5-Year Survival |
| --- | --- | --- | --- |
| LR | 0.851 | 0.790 | 0.824 |
| RF | 0.818 | 0.765 | 0.824 |
| XGBoost | 0.795 | 0.792 | 0.829 |
| ID3 | 0.665 | 0.755 | 0.788 |
| KNN | 0.773 | 0.711 | 0.784 |
| SVM | 0.550 | 0.676 | 0.766 |
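The values in Table 3 are areas under the ROC curve. As a reminder of what this metric measures, the sketch below computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half). The data are invented, and `roc_auc` is a hypothetical helper written for illustration; it is not the code used to produce the table.

```python
def roc_auc(y_true, y_score):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = event within the horizon) and model scores.
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```

The pairwise form is quadratic in the sample size, so it suits small examples; library implementations typically use a rank-based or trapezoidal computation that gives the same value.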
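Table 4. Comparison of patient features according to chemotherapy before and after propensity score matching (PSM).

| Characteristic | Category | Unmatched: chemotherapy not given (N = 325) | % | Unmatched: chemotherapy (N = 581) | % | Unadjusted p value | Matched 1:1: chemotherapy not given (N = 234) | % | Matched 1:1: chemotherapy (N = 234) | % | PSM-adjusted p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age at diagnosis | | | | | | *** | | | | | 0.462 |
| | <50 | 22 | 6.77% | 120 | 20.65% | | 21 | 8.97% | 31 | 13.25% | |
| | 50–59 | 68 | 20.92% | 179 | 30.81% | | 57 | 24.36% | 64 | 27.35% | |
| | 60–69 | 82 | 25.23% | 170 | 29.26% | | 71 | 30.34% | 67 | 28.63% | |
| | 70–79 | 85 | 26.15% | 88 | 15.15% | | 56 | 23.93% | 49 | 20.94% | |
| | 80+ | 68 | 20.92% | 24 | 4.13% | | 29 | 12.39% | 23 | 9.83% | |
| Subtype | | | | | | *** | | | | | 0.586 |
| | HR+/HER2− | 149 | 45.85% | 227 | 39.07% | | 114 | 48.72% | 104 | 44.44% | |
| | HR+/HER2+ | 16 | 4.92% | 91 | 15.66% | | 16 | 6.84% | 24 | 10.26% | |
| | HR−/HER2+ | 11 | 3.38% | 65 | 11.19% | | 11 | 4.70% | 13 | 5.56% | |
| | HR−/HER2− | 33 | 10.15% | 113 | 19.45% | | 28 | 11.97% | 33 | 14.10% | |
| | unknown | 116 | 35.69% | 85 | 14.63% | | 65 | 27.78% | 60 | 25.64% | |
| Race | | | | | | * | | | | | 0.778 |
| | white | 277 | 85.23% | 454 | 78.14% | | 199 | 85.04% | 191 | 81.62% | |
| | black | 25 | 7.69% | 70 | 12.05% | | 19 | 8.12% | 22 | 9.40% | |
| | other | 20 | 6.15% | 54 | 9.29% | | 15 | 6.41% | 20 | 8.55% | |
| | unknown | 3 | 0.92% | 3 | 0.52% | | 1 | 0.43% | 1 | 0.43% | |
| Histological type | | | | | | *** | | | | | 0.668 |
| | IDC | 64 | 19.69% | 213 | 36.66% | | 51 | 21.79% | 57 | 24.36% | |
| | ILC | 37 | 11.38% | 41 | 7.06% | | 29 | 12.39% | 24 | 10.26% | |
| | other | 224 | 68.92% | 327 | 56.28% | | 154 | 65.81% | 153 | 65.38% | |
| Marital status | | | | | | *** | | | | | 0.893 |
| | married | 129 | 39.69% | 323 | 55.59% | | 104 | 44.44% | 107 | 45.73% | |
| | divorced/separated/other | 52 | 16.00% | 93 | 16.01% | | 40 | 17.09% | 44 | 18.80% | |
| | single | 51 | 15.69% | 83 | 14.29% | | 35 | 14.96% | 37 | 15.81% | |
| | widowed | 79 | 24.31% | 66 | 11.36% | | 46 | 19.66% | 38 | 16.24% | |
| | unknown | 14 | 4.31% | 16 | 2.75% | | 9 | 3.85% | 8 | 3.42% | |
| N stage | | | | | | *** | | | | | 0.099 |
| | N0 | 111 | 34.15% | 61 | 10.50% | | 59 | 25.21% | 49 | 20.94% | |
| | N1 | 131 | 40.31% | 351 | 60.41% | | 115 | 49.15% | 119 | 50.85% | |
| | N2 | 9 | 2.77% | 66 | 11.36% | | 9 | 3.85% | 23 | 9.83% | |
| | N3 | 27 | 8.31% | 77 | 13.25% | | 25 | 10.68% | 21 | 8.97% | |
| | unknown | 47 | 14.46% | 26 | 4.48% | | 26 | 11.11% | 22 | 9.40% | |
| Grade | | | | | | *** | | | | | 0.076 |
| | well | 0 | 0.00% | 6 | 1.03% | | 0 | 0.00% | 4 | 1.71% | |
| | moderately | 3 | 0.92% | 25 | 4.30% | | 3 | 1.28% | 6 | 2.56% | |
| | poorly | 21 | 6.46% | 108 | 18.59% | | 20 | 8.55% | 28 | 11.97% | |
| | unknown | 301 | 92.62% | 442 | 76.08% | | 211 | 90.17% | 196 | 83.76% | |
| Median household income (inflation adjusted) | | | | | | 0.133 | | | | | 0.532 |
| | <45,000$ | 32 | 9.85% | 41 | 7.06% | | 24 | 10.26% | 16 | 6.84% | |
| | 45,000–54,999$ | 44 | 13.54% | 77 | 13.25% | | 31 | 13.25% | 30 | 12.82% | |
| | 55,000–64,999$ | 86 | 26.46% | 129 | 22.20% | | 64 | 27.35% | 59 | 25.21% | |
| | 65,000–74,999$ | 62 | 19.08% | 146 | 25.13% | | 42 | 17.95% | 53 | 22.65% | |
| | >74,999$ | 101 | 31.08% | 188 | 32.36% | | 73 | 31.20% | 76 | 32.48% | |
| Surgery | | | | | | *** | | | | | 0.254 |
| | no | 295 | 90.77% | 381 | 65.58% | | 209 | 89.32% | 197 | 84.19% | |
| | yes | 24 | 7.38% | 198 | 34.08% | | 24 | 10.26% | 35 | 14.96% | |
| | unknown | 6 | 1.85% | 2 | 0.34% | | 1 | 0.43% | 2 | 0.85% | |
| Radiotherapy | | | | | | *** | | | | | 0.556 |
| | no/unknown | 245 | 75.38% | 265 | 45.61% | | 160 | 68.38% | 153 | 65.38% | |
| | yes | 80 | 24.62% | 316 | 54.39% | | 74 | 31.62% | 81 | 34.62% | |
| Bone metastases | | | | | | *** | | | | | 1 |
| | no | 160 | 49.23% | 452 | 77.80% | | 137 | 58.55% | 137 | 58.55% | |
| | yes | 149 | 45.85% | 121 | 20.83% | | 90 | 38.46% | 90 | 38.46% | |
| | unknown | 16 | 4.92% | 8 | 1.38% | | 7 | 2.99% | 7 | 2.99% | |
| Liver metastases | | | | | | *** | | | | | 0.908 |
| | no | 273 | 84.00% | 525 | 90.36% | | 202 | 86.32% | 199 | 85.04% | |
| | yes | 33 | 10.15% | 48 | 8.26% | | 25 | 10.68% | 28 | 11.97% | |
| | unknown | 19 | 5.85% | 8 | 1.38% | | 7 | 2.99% | 7 | 2.99% | |
| Lung metastases | | | | | | *** | | | | | 0.876 |
| | no | 255 | 78.46% | 534 | 91.91% | | 196 | 83.76% | 200 | 85.47% | |
| | yes | 52 | 16.00% | 38 | 6.54% | | 30 | 12.82% | 27 | 11.54% | |
| | unknown | 18 | 5.54% | 9 | 1.55% | | 8 | 3.42% | 7 | 2.99% | |
| Brain metastases | | | | | | * | | | | | 0.979 |
| | no | 293 | 90.15% | 545 | 93.80% | | 214 | 91.45% | 214 | 91.45% | |
| | yes | 14 | 4.31% | 25 | 4.30% | | 12 | 5.13% | 13 | 5.56% | |
| | unknown | 18 | 5.54% | 11 | 1.89% | | 8 | 3.42% | 8 | 3.42% | |

* p < 0.05, *** p < 0.001.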
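The equal group sizes in the matched cohorts reflect 1:1 matching. One common way to perform such matching is greedy nearest-neighbor matching on the propensity score with a caliper, sketched below. This is an illustrative assumption: the matching algorithm, caliper width, and score model are not specified here, and the identifiers and scores are invented. In practice the scores would come from a model (for example, logistic regression) of treatment assignment on the baseline covariates.

```python
def match_one_to_one(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping patient id -> propensity score.
    Returns a list of (treated_id, control_id) pairs within the caliper.
    """
    available = dict(control)
    pairs = []
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.90}
control = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}
pairs = match_one_to_one(treated, control)
print(pairs)
```

Treated patients without a control inside the caliper are dropped, which is why a matched cohort is usually smaller than either original group, as in the tables before and after matching.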
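Table 5. Comparison of patient characteristics according to surgical treatment before and after propensity score matching (PSM).

| Characteristic | Category | Unmatched: surgery not given (N = 676) | % | Unmatched: surgery (N = 222) | % | Unadjusted p value | Matched 1:1: surgery not given (N = 209) | % | Matched 1:1: surgery (N = 209) | % | PSM-adjusted p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age at diagnosis | | | | | | *** | | | | | 0.089 |
| | <50 | 74 | 10.95% | 67 | 30.18% | | 36 | 17.22% | 56 | 26.79% | |
| | 50–59 | 174 | 25.74% | 71 | 31.98% | | 77 | 36.84% | 69 | 33.01% | |
| | 60–69 | 191 | 28.25% | 57 | 25.68% | | 70 | 33.49% | 57 | 27.27% | |
| | 70–79 | 149 | 22.04% | 23 | 10.36% | | 18 | 8.61% | 23 | 11.00% | |
| | 80+ | 88 | 13.02% | 4 | 1.80% | | 8 | 3.83% | 4 | 1.91% | |
| Subtype | | | | | | *** | | | | | 0.894 |
| | HR+/HER2− | 283 | 41.86% | 88 | 39.64% | | 79 | 37.80% | 84 | 40.19% | |
| | HR+/HER2+ | 70 | 10.36% | 37 | 16.67% | | 34 | 16.27% | 37 | 17.70% | |
| | HR−/HER2+ | 51 | 7.54% | 25 | 11.26% | | 20 | 9.57% | 22 | 10.53% | |
| | HR−/HER2− | 99 | 14.64% | 45 | 20.27% | | 47 | 22.49% | 40 | 19.14% | |
| | unknown | 173 | 25.59% | 27 | 12.16% | | 29 | 13.88% | 26 | 12.44% | |
| Race | | | | | | 0.303 | | | | | 0.112 |
| | white | 555 | 82.10% | 171 | 77.03% | | 149 | 71.29% | 161 | 77.03% | |
| | black | 68 | 10.06% | 26 | 11.71% | | 37 | 17.70% | 25 | 11.96% | |
| | other | 49 | 7.25% | 24 | 10.81% | | 20 | 9.57% | 23 | 11.00% | |
| | unknown | 4 | 0.59% | 1 | 0.45% | | 3 | 1.44% | 0 | 0.00% | |
| Histological type | | | | | | *** | | | | | 0.671 |
| | IDC | 165 | 24.41% | 109 | 49.10% | | 91 | 43.54% | 100 | 47.85% | |
| | ILC | 71 | 10.50% | 5 | 2.25% | | 5 | 2.39% | 5 | 2.39% | |
| | other | 440 | 65.09% | 108 | 48.65% | | 113 | 54.07% | 104 | 49.76% | |
| Marital status | | | | | | 0.075 | | | | | 0.995 |
| | married | 322 | 47.63% | 128 | 57.66% | | 118 | 56.46% | 119 | 56.94% | |
| | divorced/separated/other | 111 | 16.42% | 30 | 13.51% | | 27 | 12.92% | 28 | 13.40% | |
| | single | 102 | 15.09% | 32 | 14.41% | | 31 | 14.83% | 32 | 15.31% | |
| | widowed | 119 | 17.60% | 25 | 11.26% | | 27 | 12.92% | 24 | 11.48% | |
| | unknown | 22 | 3.25% | 7 | 3.15% | | 6 | 2.87% | 6 | 2.87% | |
| N stage | | | | | | *** | | | | | 0.688 |
| | N0 | 166 | 24.56% | 6 | 2.70% | | 4 | 1.91% | 6 | 2.87% | |
| | N1 | 331 | 48.96% | 145 | 65.32% | | 144 | 68.90% | 136 | 65.07% | |
| | N2 | 38 | 5.62% | 37 | 16.67% | | 26 | 12.44% | 33 | 15.79% | |
| | N3 | 70 | 10.36% | 34 | 15.32% | | 35 | 16.75% | 34 | 16.27% | |
| | unknown | 71 | 10.50% | 0 | 0.00% | | 0 | 0.00% | 0 | 0.00% | |
| Grade | | | | | | *** | | | | | 0.321 |
| | well | 2 | 0.30% | 4 | 1.80% | | 2 | 0.96% | 4 | 1.91% | |
| | moderately | 15 | 2.22% | 13 | 5.86% | | 6 | 2.87% | 12 | 5.74% | |
| | poorly | 71 | 10.50% | 56 | 25.23% | | 43 | 20.57% | 48 | 22.97% | |
| | unknown | 588 | 86.98% | 149 | 67.12% | | 158 | 75.60% | 145 | 69.38% | |
| Median household income (inflation adjusted) | | | | | | 0.659 | | | | | 0.437 |
| | <45,000$ | 49 | 7.25% | 19 | 8.56% | | 14 | 6.70% | 18 | 8.61% | |
| | 45,000–54,999$ | 87 | 12.87% | 34 | 15.32% | | 21 | 10.05% | 32 | 15.31% | |
| | 55,000–64,999$ | 159 | 23.52% | 55 | 24.77% | | 53 | 25.36% | 53 | 25.36% | |
| | 65,000–74,999$ | 157 | 23.22% | 51 | 22.97% | | 54 | 25.84% | 47 | 22.49% | |
| | >74,999$ | 224 | 33.14% | 63 | 28.38% | | 67 | 32.06% | 59 | 28.23% | |
| Chemotherapy | | | | | | *** | | | | | 0.466 |
| | no/unknown | 295 | 43.64% | 24 | 10.81% | | 30 | 14.35% | 24 | 11.48% | |
| | yes | 381 | 56.36% | 198 | 89.19% | | 179 | 85.65% | 185 | 88.52% | |
| Radiotherapy | | | | | | *** | | | | | 0.768 |
| | no/unknown | 407 | 60.21% | 96 | 43.24% | | 90 | 43.06% | 94 | 44.98% | |
| | yes | 269 | 39.79% | 126 | 56.76% | | 119 | 56.94% | 115 | 55.02% | |
| Bone metastases | | | | | | *** | | | | | 1 |
| | no | 393 | 58.14% | 216 | 97.30% | | 205 | 98.09% | 203 | 97.13% | |
| | yes | 265 | 39.20% | 3 | 1.35% | | 1 | 0.48% | 3 | 1.44% | |
| | unknown | 18 | 2.66% | 3 | 1.35% | | 3 | 1.44% | 3 | 1.44% | |
| Liver metastases | | | | | | *** | | | | | 0.604 |
| | no | 576 | 85.21% | 218 | 98.20% | | 206 | 98.56% | 205 | 98.09% | |
| | yes | 79 | 11.69% | 1 | 0.45% | | 1 | 0.48% | 1 | 0.48% | |
| | unknown | 21 | 3.11% | 3 | 1.35% | | 3 | 1.44% | 3 | 1.44% | |
| Lung metastases | | | | | | *** | | | | | 0.6 |
| | no | 568 | 84.02% | 216 | 97.30% | | 200 | 95.69% | 203 | 97.13% | |
| | yes | 86 | 12.72% | 3 | 1.35% | | 6 | 2.87% | 3 | 1.44% | |
| | unknown | 22 | 3.25% | 3 | 1.35% | | 3 | 1.44% | 3 | 1.44% | |
| Brain metastases | | | | | | ** | | | | | 0.784 |
| | no | 617 | 91.27% | 216 | 97.30% | | 205 | 98.09% | 203 | 97.13% | |
| | yes | 37 | 5.47% | 2 | 0.90% | | 1 | 0.48% | 2 | 0.96% | |
| | unknown | 22 | 3.25% | 4 | 1.80% | | 3 | 1.44% | 4 | 1.91% | |

** p < 0.01, *** p < 0.001.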
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Qu, J.; Li, C.; Liu, M.; Wang, Y.; Feng, Z.; Li, J.; Wang, W.; Wu, F.; Zhang, S.; Zhao, X. Prognostic Models Using Machine Learning Algorithms and Treatment Outcomes of Occult Breast Cancer Patients. J. Clin. Med. 2023, 12, 3097. https://doi.org/10.3390/jcm12093097

AMA Style

Qu J, Li C, Liu M, Wang Y, Feng Z, Li J, Wang W, Wu F, Zhang S, Zhao X. Prognostic Models Using Machine Learning Algorithms and Treatment Outcomes of Occult Breast Cancer Patients. Journal of Clinical Medicine. 2023; 12(9):3097. https://doi.org/10.3390/jcm12093097

Chicago/Turabian Style

Qu, Jingkun, Chaofan Li, Mengjie Liu, Yusheng Wang, Zeyao Feng, Jia Li, Weiwei Wang, Fei Wu, Shuqun Zhang, and Xixi Zhao. 2023. "Prognostic Models Using Machine Learning Algorithms and Treatment Outcomes of Occult Breast Cancer Patients" Journal of Clinical Medicine 12, no. 9: 3097. https://doi.org/10.3390/jcm12093097
