A Mathematical Model of Breast Tumor Progression Based on Immune Infiltration

Breast cancer is the most prominent type of cancer among women. Understanding the microenvironment of breast cancer and the interactions between cells and cytokines will lead to better treatment approaches for patients. In this study, we developed a data-driven mathematical model to investigate the dynamics of key cells and cytokines involved in breast cancer development. We used gene expression profiles of tumors to estimate the relative abundance of each immune cell and group patients based on their immune patterns. Dynamical results show the complex interplay between cells and molecules, and sensitivity analysis emphasizes the direct effects of macrophages and adipocytes on cancer cell growth. In addition, we observed the dual effect of IFN-γ on cancer proliferation, either through direct inhibition of cancer cells or by increasing the cytotoxicity of CD8+ T-cells.


Introduction
Breast cancer is one of the most common cancers in women, and it is estimated that about 43,600 women in the United States will die because of breast cancer in 2021 [1]. There are different subtypes of breast cancer based on molecular-level analysis of gene expression patterns, such as luminal A (LumA), luminal B (LumB), and triple-negative/basal-like [2]. LumA is the most frequently seen subtype of breast cancer with the lowest mortality rate among other subtypes [3], and triple-negative breast cancer (TNBC) is the most aggressive subtype with a poor overall clinical outcome [4]. Several treatment options are available for breast cancer, including surgery (lumpectomy and mastectomy), radiotherapy, chemotherapy, hormone therapy, and other novel therapies that are determined by clinicopathological aspects of each patient [5]. Although targeted therapies have proven effective for some types of breast cancer [6], chemotherapy remains the standard form of treatment for subtypes of TNBC [7], and the development of resistance to chemotherapy is the main challenge for TNBCs [8,9].
It has been shown in many studies that the tumor's microenvironment has a significant role in breast cancer development, progression, and also response to therapies [10][11][12][13]. Immune cells interact with tumor cells directly or indirectly via chemokine and cytokine signaling in the tumor microenvironment, and they are the major players in the behavior of the tumor and the efficiency of the treatments [14]. When cancer cells die, they release a protein called high mobility group box-1 (HMGB-1) that causes dendritic cells to become activated [15]. Antigens presented by dendritic cells cause T-cells to become activated and kill cancer directly [16][17][18]. IFN-γ is produced by helper T-cells, cytotoxic T-cells [19][20][21], and dendritic cells [19] to inhibit tumor growth [21]. On the other hand, certain immune cells can help cancer progress by either promoting it or having a dual impact. For example, regulatory T-cells control the development and activity of helper and cytotoxic T-cells, hence reducing the immune response and indirectly encouraging malignancy [22,23].
The relationship between clinical outcome and immune cells in breast tumor has been found in many studies. Breast cancer is distinguished by a significant population of tumor associated macrophages [24], and it has been shown that higher macrophage density is related to a poor outcome [25]. Moreover, the presence of CD8+ T-cells has been linked to considerable reductions in the relative risks of death from different subtypes of breast cancer [26]; and in ER-negative cancers, CD4+ and CD8+ T lymphocytes are more closely related to better outcomes than in ER-positive tumors [27]. In addition, it has been found that advanced stage breast tumors have more T-reg cells and a lower ratio of T-helper/T-reg cells [28]. All this evidence suggests that the relative numbers of distinct immune cells, and their interaction network, play key roles in the initiation and progression of breast cancer. Thus, to effectively simulate cancer progression, it is important to divide patients into similar cohorts based on their tumor-infiltrating immune cells and investigate the tumor progression of each group independently. However, adding all immune cells to the model increases the complexity and uncertainty of it. We therefore only considered the above-mentioned key immune cells (macrophages, CD4+ T-cells, T-reg cells, cytotoxic cells, and dendritic cells, which have a huge role in activating T-cells) in the progression of breast cancer.
Many mathematical models have been developed to study the relationships among tumors' initiation, dynamics, and therapeutic responses to discover the best therapeutic combinations and overcome drug resistance in diverse cancer types [29][30][31][32][33][34][35][36][37], including breast cancer [38][39][40][41][42]. Some of the mathematical models for breast cancer investigate the relationships among immune cells, tumor cells, and some treatments [36,43,44]. For example, the effects of trastuzumab on HER2 overexpressing breast cancer in a mouse model system have been studied by integrating mathematical and experimental models [43]. In addition, a mathematical model has been developed to investigate the interactions between the MCF-7 breast cancer cell line and some immune cells. These models [43,45] include only a few immune components, such as NK cells and CD8+ T-cells, similarly to the mathematical model developed for the treatment of the murine 4T1 TNBC cell line with a high-dose chemotherapy drug [36].
Some of the outstanding challenges in the mathematical modeling of cancers are the existence of many unknown parameters and the limited number of datasets. For this reason, it is a routine practice to assume some values for parameters (see, e.g., [46][47][48]), or use estimated parameters from other diseases or models in the mathematical modeling of cancer. For example, the parameter values obtained for sarcoidosis were used to estimate the parameters of a mathematical model for pancreatic cancer, and the values estimated for pancreatic cancer in the mathematical modeling of breast cancer [23,49,50]. In addition, in a mathematical model of breast cancer [51], the values of some of parameters were chosen in line with a mathematical model of tumor invasion not validated for breast cancer [52]. Hence, the available mathematical models for breast cancer only consider a small subset of immune cells, and they assume all breast tumors behave similarly as they use the same parameter values for all tumors. However, since the evolution of a breast tumor depends on its specific immune profile, it is better to first find such tumors' immune variations and understand the mechanism of growth in the absence of treatment for each of these immune variations. For this reason, we present a mathematical model for the interaction network given in Figure 1. We used a system of ordinary differential equations (ODEs) to investigate the differences in tumor progression of patients with different immune profiles. We clustered breast tumors based on their estimated immune cell frequencies using their gene expression data. We then estimated the parameters of the mathematical model for each tumor group separately. We found parameter values by reviewing the available literature and estimating the rest using what data were available. Importantly, we then performed global sensitivity analysis on the non-dimensionalized system to find the most sensitive parameters and investigate their impacts. Lastly, we analyzed the dynamics of the tumors in each group and compared them with patients' clinical information to explore the connections between the tumor microenvironment and the progression of breast cancer.  Table 1.

Materials and Methods
There are many players in the progression of breast tumors. However, to avoid too much complexity and to reduce the uncertainty of the model, we only considered very important players. The main cell types that we modeled were cancer cells, necrotic cells, T-cells, macrophages, dendritic cell, and adipocytes ( Figure 1).

. T-Cells
In this model, T-cells are divided into four subgroups of naive, helper, cytotoxic, and regulatory T-cells. Naive T-cells, denoted by T N , are not necessarily part of the tumor microenvironment, as they are usually activated within lymph nodes. We excluded them from the total number of cells in the microenvironment. Even though there are other methods, such as introducing non-linear terms in the ODEs to avoid unlimited exponential growth, making activation rates for other types of T-cells proportional to the number of naive T-cells was the most convenient way to create a controlled system given the complexity of our model. Thus, we summarize the equation for the dynamics of naive T-cells after deriving the equations of other types of T-cells. The variables T h , T C , and T r , respectively, denote the numbers of activated T-helper cells, cytotoxic T-cells, and T-reg cells.
Regulatory T-Cells (T r ) Dendritic cells stimulate the formation [17] and activation of regulatory T-cells [57]. Furthermore, estrogen enhances the actions of regulatory T-cells [19,58]. Hence, we used the following equation for the dynamics of T-reg cells.
Naive T-Cells (T N ) By combining the above-mentioned equations for the activation of naive T-cells and introducing independent naive T-cell production rate A T N , one can get the following equation for dynamics of naive T-cells:

Dendritic Cells (D)
Dendritic cells are activated by cancer cells [23] and HMGB1 [59][60][61]. Moreover, estrogen enhances the metabolism, proliferation, differentiation, and functionality of dendritic cells [19], and multiple factors induced by cancer cells may promote natural dendritic cell death [57]. Denoting A D N as the production rate of naive dendritic cells, we made the following system of equations for dynamics of naive dendritic cells (D N ) and activated dendritic cells (D):

Macrophages (M)
Since macrophages have many phenotypes and frequently change their phenotype, the breakdown of them into M1, M2, and other subsets would have tremendously complicated the model. Therefore, we modeled all activated macrophages as a single variable denoted by M. Tumor associated macrophages (TAMs) are activated by IL-10 [62,63]. IL-12 and IFN-γ activate M1 macrophages [24,62,64,65], and M2 macrophages are activated by IL-4 and IL-13, which are secreted by helper T-cells [62]. Moreover, estrogen exposure leads to alternative macrophage activation [19,58]. Denoting naive macrophages by M N , activated macrophages by M, and the production rate of naive macrophages by A M , we made the following system of equations for the dynamics of naive and activated macrophages.

Cancer Cells (C)
Since cancer cells proliferate at an abnormal rate, the proliferation of cancer cells is traditionally modeled by [C](1 − [C]/C 0 ), where C 0 is the maximum capacity. In addition, IL-6 promotes the proliferation of cancer cells [66,67]. Additionally, adipocytes, releasing metabolic substrates, promote the proliferation of breast cancer cells [68]. On the other hand, activated CD8+ T-cells kill cancer cells [23,69], and IFN-γ initiates the elimination of cancer cells by inducing cell cycle arrest, apoptosis, and necroptosis [21]. The dynamics of cancer cells was modeled by the following equation.
The direct crosstalk of cancer cells with tumor-surrounding stromal components, such as tumor-surrounding adipocytes, promotes tumor progression [70]. We modeled the proliferation of adipocytes similarly to the cancer cells' proliferation.
2.1.6. Necrotic Cells (N) Necrosis occurs when cells are under metabolic stress as their resources are being consumed by cancer cells [60]. Cells that go through the process of necrosis are denoted as necrotic cells. Since resources are limited in the cancer microenvironment, some cells will undergo necrotic cell death instead of other types of cell death [60,71]. Activated CD8+ T-cells kill cancer cells [23], and CD8+ cytotoxic T-cells produce IFN-γ, which then eliminates cancer cells [21]. Since a fraction of cancer cells can go through first becoming necrotic cells, the production rate of necrotic cells was modeled by the fraction (α NC ) of dying cancer cells.

Molecules
The dynamics of above mentioned molecules were modeled in the following way.
For this reason, the dynamics of HMGB1 was modeled by the following equation.
IL-12 (IL 12 ) IL-12, which stimulates the growth and functions of T-cells, is involved in the differentiation of naive T-cells into helper T-cells. IL-12 is secreted by macrophages and dendritic cells [23,57,64]. Helper and cytotoxic T-cells also produce IL-12 [19]. We modeled the dynamics of IL-12 using the following equation.
Estrogen (E) Adipocytes are the primary producers of estrogen [84,85]. In breast tumors of postmenopausal women, estrogen can reach levels orders of magnitude greater than the low levels circulated in the body [85]. In general, adipose tissues in the breasts, brain tissues, osteoblasts, and other tissues locally produce estrogen, which circulates throughout the body. This amount of reproduced estrogen in the body depends on the pre-exiting existing amount. For this reason, we modeled the production rate of estrogen throughout the body E 0 ). This gave us the following equation for the dynamics of estrogen.
IL-6 (IL 6 ) The important cytokine that leads to proliferation of cancer cells is IL-6 and is secreted by cancer associated adipocytes [64,66,67], macrophages [23,62,64,65,87], and dendritic cells [19,23,57]. There are some popular tumor deconvolution methods used to estimate the percentages of different immune cell types from the gene expression profiles of tumors. Among these methods, CIBERSORTx [88] has shown great performance in several studies [71,[89][90][91]. In this study, we applied CIBERSORTx B-mode to gene expression profiles of primary tumors of breast cancer patients obtained from the Cancer Genome Atlas (TCGA) project of breast cancer (BRCA) [92] and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) cohort [93]. There are 1904 primary breast tumor samples in the METABRIC microarray data downloaded from cBioPortal [94] and 1218 primary breast tumors with RSEM normalized RNA-seq data in log 2 scale in the TCGA data downloaded from the University of California Santa Cruz (UCSC) Xena web portal [95]. Before performing CIBERSORTx B-mode on both datasets separately, we transformed TCGA data to the linear space and dropped samples of normal breast tissues. After estimation of cell proportions, we only considered cases with CIBERSORTx B-mode p-values < 0.05 and continued our study with 2993 patients. We used expression values of genes encoding the molecules in the dynamical model and combined some immune cells frequencies to estimate the values of model's variables as described in Table 1. Note that the genes' expression levels in TCGA data were scaled depending on METABRIC data to eliminate the effect of different ranges in the two datasets on the results.

Patient Data Analysis
We grouped patients based on their estimated immune cell fractions using K-means clustering and determined the value of K using the elbow method. As a result, there were five distinct immune patterns of breast tumors. Figure 2 shows the average cell fractions in tumors of each cluster for the most variant immune cells among clusters, plus the standard deviation of each cluster as a bar.  Since the deconvolution method only provides us the percentage of each immune cell type in primary tumors, we used tumor weight for METABRIC data and tumor size for TCGA data, as described below, to estimate the numbers of immune cells, cancer cells, and necrotic cells in each tumor. For numerical stability, if the percentage of an immune cell that was used in the model was zero, it was substituted with 10% of the smallest positive cell fraction in the deconvolution data. We also excluded patients if their necrosis percentage or their tumor size in TCGA data, or the weight of their tumor in METABRIC data, were not available.

Cluster
First, for simplicity, we assumed that the average number of cancer cells was twice the average number of total immune cells [37]. Therefore, using the necrosis percentage given in the TCGA data, the average ratio of immune cells to cancer cells to necrotic cells is approximately 0.3:0.6:0.1 in breast tumors. Additionally, the epithelial cell density has been reported as 4.5 × 10 4 cells/cm 3 in breast cancer [96]. We therefore choose the scaling factor α = 4.5 × 10 4 so that the average density of cancer cells across all patients was close to that value.
For each patient P tcga in TCGA data, the total cell number (TCN tcga ) was assumed to be proportional to the weight of the tumor: where K is the number of patients in TCGA data. For TCGA data, numbers of necrotic cells (N tcga ), cancer cells (C tcga ), and total immune cells (TIC tcga ) were calculated using the given necrotic percentage (N p ) for each patient: For METABRIC data, tumor size was used to calculate total cell number (TCN meta ) similarly to the TCGA data: where K is the number of patients in the METABRIC data. Since the necrotic percentage was not available for METABRIC data, we used immune cells proportion from deconvolution data to calculate the total number of immune cells (TIC meta ) for each patient (P meta ): Once we have the amount of all the cells and molecules we can extract useful information such as initial conditions ( Table 2) and steady state values (Table 3) for each cluster.

Parameter Estimation
We used 17 equations and 75 parameters in our model, which needed to be determined. We found the values of IL 12 , and δ I γ by finding their corresponding cell and molecule half-lives in the biological literature. For the rest of the parameters, we derived them so that the dynamics of all cells and molecules reached their steady state within our simulation time. In other words, for an ODE of the type dX dt = F(X, θ, t), we solved: for the parameter vector θ = θ 1 , · · · , θ N , where T is the maximum simulation time and X ∞ is the vector of steady-state values for the state variables given in Table 3. However, even with all of the known death rate parameters, we still had 62 more to determine and only 17 steady state equations of type (18). To circumvent this issue, we added some assumptions which were basically relationships between the parameters. These relationships were based on assuming some production rates or death rates within the same ODE would be more effective than the rest.The relationship formulations were created by the authors to make sure we got a positive set of parameters. For more technical details of the parameter estimation process, please refer to Appendix A.1.   As we mentioned, because of the lack of biological information about many of parameters' values, we made several assumptions about these parameters given in Appendix A.1. To remedy this important limitation of the model, we perform a global gradient-based sensitivity analysis by changing each of these assumptions 5000 times, and obtaining a new set of parameter values with each new assumption. Since the number of parameter values (number of assumptions × number of variations = 38 × 5000 = 190, 000) still is a finite number, this limitation of the model must be considered when the results of the model are used. For better numerical stability, we performed the sensitivity analysis using the non-dimensionalized system (see Appendix A.2 for more details).
Generally, the level of sensitivity of a variable X, to its vector of parameters θ = θ 1 , · · · , θ N for an ODE of the type dX dt = F(X, θ, t), is calculated by: We calculated the sensitivity of cancer cells and the total number of cells for each parameter θ i at the steady state. In other words, for the steady state values X * the sensitivity vector s = s 1 , · · · , s N was obtained by differentiating F(X * , θ) = 0 with respect to θ. We got this analytical formula: where ∇F(X * , θ) −1 is the numerical inverse of Jacobian of F with respect to X. To calculate the global sensitivity, we followed the following steps: • First, we define a local sensitivity measure for each parameter θ i in the neighborhood Ω k (θ i ) as These neighborhoods were acquired by varying assumptions (A1)-(A22) by scaling factors 0.01 to 100. The integration was carried out numerically using sparse grid points [97,98]. • Second, we found weights for the aforementioned neighborhoods. Scaling each assumption provides us with a new set of parameters. The weights were then determined by calculating the distance of each resulting parameter set to a fixed base parameter set. We assigned higher weights to the parameters that were closer to the base values. We denote each weight by w k i for i = 1, · · · , N and k = 1, · · · , K corresponding to the parameter and its neighborhood, respectively. • Finally, we obtained the global sensitivity level S i to each parameter θ i by

Data Analysis of the Clusters
We also use TCGA and METABRIC clinical data to analyze the clinical features of each cluster and their associations with tumor immune microenvironment and dynamics. We see that cluster 2 and cluster 3, respectively, include a significant high number of the youngest and oldest patients compared to the other clusters ( Figure 3A). We also observe that cluster 5 has the highest percentage of tumors with estrogen receptor (ER+) and human epidermal growth factor receptor 2 negative (HER2-) tumors, and ultimately a high number of LumA, LumB and normal-like tumors. On the other hand, clusters 1 and 4 have a higher percentage of aggressive subtypes such as HER2-enriched and Basal than other clusters, which explains having the highest percentage of tumors without estrogen receptor (ER-) in these clusters ( Figure 4A,C,D). In addition, survival probabilities of cluster 1 and cluster 4 are lower than other clusters while cluster 5, which has more non-aggressive tumor, has the best survival results ( Figure 3C). Note that survival status of the patients is given as alive or dead in TCGA data while in METABRIC data dead patients are separated into two categories 'died of other causes' and 'died of disease' so we translate 'died of other causes' into alive status. For this reason, we see survival probability of cluster 3 is lower in Figure 3C while the percentage of alive patients in cluster 3 is higher ( Figure 4B).

Dynamics of the Breast Cancer Microenvironment
We find the dynamics of each variable involved in tumor microenvironment by solving the ODEs (1)-(17) with parameters acquired from the steady state assumptions for each cluster (Appendix A.1). We derive the dynamics of all of the cells and molecules based on the non-dimensional parameters in Table A4, half-lives and estimated death rates from Table A3, and initial conditions acquired from the tumors with the smallest size in each cluster ( Table 2) and their steady state values in Table 3. We also plot the changes in total cells. We define total cells as: We exclude T N since they are mainly found in the circulation and lymphatic system [99]. The rest of the T-cells are known as tumor infiltrating T-cells and can be found abundantly in tumors' microenvironment. Dendritic cells primarily get activated inside of the tumor and cancer cells, necrotic cells and adipocytes are the other main components of the breast tumor, [12]. Finally, since most of naive macrophages polarize into M1 (antitumor) and M2 (pro-tumor) phenotypes inside of the tumor we consider a 20% factor for M N , [100]. The dynamics of cells given in Figure 5 shows that for all clusters except cluster 2, the naive T-cells increase from their initial values and reach the steady state quickly, while the naive T-cells in cluster 2 increase and then decrease in a slow manner to reach the steady state. Helper T-cells increase from their initial condition and then decrease to reach the steady state quickly. Cytotoxic T-cells also follow the same pattern. T-reg cells reach the steady state quickly and remain constant at the steady state. While the naive dendritic cells start with a higher population initially and decrease eventually to reach the steady state, mature dendritic cells start with a low initial population and increase eventually to reach their steady states. This is because mature dendritic cells are derived from naive dendritic cells. On the other hand, both naive macrophages and macrophages increases within a short amount of time before reaching their steady states.
Cancer cells dynamics show an exponential growth until they reach the steady state. Since cancer cells go through necrosis, the necrotic cells' growth behaves similarly. Adipocytes activation increases over time until they reach the steady state for all clusters. The amount of estrogen also increases from their initial conditions in each cluster as it is produced by adipocytes.
The initial concentration of HMGB1 in clusters 4 and 5 is higher than other clusters and stay higher in their steady state. The general trend for HMGB1 is increasing for all clusters before it reaches the steady state.
Although the concentration of IL-12 increases at the beginning, it lowers from the maximum concentration to reach the steady state. IL-10 and IL-6 increase from their initial value to reach the steady state. IFN-γ increases first then decreases over time as the cancer cells increase.
Due to the shear abundance of cancer cells and adipocytes compared to other cells, the total number of cells is more significantly affected by these cells. We can see that cluster 5 has one of the lowest cancer cells and the lowest adipocytes population, so the total cells population is the lowest in cluster 5. Additionally, cluster 2 has the highest cancer cells and one of the highest adipocytes in its steady state. As a result, total cells dynamics is the highest for cluster 2. Since total cells population is proportional to tumor sizes, it can be inferred that cluster 2 has larger tumors than the other clusters. While cluster 3 has the fastest cancer growth, cluster 2 has the highest cancer population at the steady state among the other clusters. Additionally, cluster 2 has the highest population of dendritic cells that inhibit cancer cells and being activated by cancer cells. As dendritic cells are also activated by estrogen and HMGB1, their concentrations are the lowest in cluster 2. Thus, the higher population of dendritic cells may be mostly due to cancer cells' activation. Note, cluster 2 includes the primary tumors of the youngest patients ( Figure 3B) and aggressive tumors ( Figure 4A), and it is known that aggressive breast tumor in younger patients grow faster [101].
On the other hand, cluster 4 and 5 have the slowest cancer growth, while cluster 1 has the lowest cancer population at the steady state compared to other clusters. Interestingly, the amount of immune cells that are known to be correlated with good prognoses such as cytotoxic T-cells and helper T-cells are the lowest in the cluster 1 which might cause making a wrong prediction at first glance. Thus, it is more important to consider interactions between the immune cells and cancer cells than just looking at their quantities to understand cancer prognosis.
Cluster 5 has the best survival probability according to Figure 3C. The population of adipocytes is the lowest in cluster 5 compared to the other clusters. Dendritic cells population, and the concentration of HMGB1 and estrogen are very low. Furthermore, cluster 5 has one of the lowest cancer cells population at the steady state. Furthermore, Figure 4C,D illustrate that cluster 5 has the lowest HER2-positive and ER-positive patients that are indicators of a better prognoses.
Although dynamical results show a lower cancer growth for cluster 1 and 4, clinical data analysis shows that cluster 1 and 4 have the lowest survival probability among the other clusters as illustrated in Figure 3C. This can be due to the fact that cluster 1 and 4 have one of the highest population of adipocytes at the steady state and thus providing more resources for cancer cells. Furthermore, based on Figure 4A these clusters have the highest percentage of aggressive tumors.

Sensitivity Analysis
After finding the base parameters from the steady state assumptions in Appendix A.1 and the steady state values given in Table 3, we carry out the global sensitivity analysis mentioned in Section 2.2.4.
We investigate the global sensitivity of cancer cells and the total number of cells (Equation (23)) to each parameter. The results are given in Figures 6 and 7. As expected, we see the cancer population is the most sensitive to cancer related parameters (Figure 6), and since total cells includes high number of cancer cells (Figure 5), they are also sensitive to these parameters (Figures 6 and 7). In addition, total cells and cancer populations in all clusters are significantly sensitive to δ A and λ A . This is reasonable, since all clusters have high numbers of adipocytes that directly contribute to cancer proliferation Equation (9). All of the sensitivity plots imply that less decay and more production of adipocytes lead to larger number of cancer and total cells. Finally, cancer population in all clusters are directly sensitive to macrophage promoting parameters and reversely sensitive to the macrophage decay rates. We have modeled both macrophage phenotypes (M1 and M2) together. However, in all the clusters the frequency of M2 (pro-tumor) phenotype is more dominant (Figure 2). Therefore, more macrophages should result in more cancer cells and consequently higher number of total cells. The presence of δ M N as a sensitive parameter in Figure 7 can be justified in the same way as a significant number of naive macrophages turn into activated macrophages. Moreover, in clusters 1, 2, and 4, we see that clearance of necrotic cells has an opposite effect on the total number of cells which is reasonable, given the presence of the necrotic cells in the tumor microenvironment. In clusters 3 and 5, δ N is replaced by λ MI γ , because cluster 3 has a very high number of macrophages and cluster 5 has a high level of I γ , Figure 5.

Dynamics with Varying Assumptions
We investigate the effect of changing parameters' assumptions on cancer dynamics in clusters 1-5. To avoid cluttering, we only target assumptions that incorporate sensitive parameters from the previous section, and we scale them by factors of 0.2 and 5. These scaled assumptions will produce new sets of parameters for each cluster leading to new dynamics. Furthermore, we locally perturb all the sensitive parameters from Section 3.3 to acquire a 10% variation region for cancer dynamics.
Notice that some sensitive parameters are either solely obtained from ODEs' steady states or taken from the literature and are not explicitly included in these assumptions. Therefore, we only scale the following assumptions by 1.0, 0.2 and 5.0: The results are shown in Figure 8. We can see that scaling the assumptions (24) and (25) had the largest effect on the dynamics of cancer cells across the five clusters. On the other hand, assumptions (26)-(29) negligibly affect the time of reaching steady state in clusters 4 and 5 when they are up-scaled. The assumption (24) delayed the time of reaching steady states for all clusters while it was scaled down and sped it up when it was scaled up. Clusters 2, 4, and 5 were the most affected in the case shown in Figure 8A; and clusters 1, 2, 4, and 5 were the most affected in the case shown in Figure 8B. Below is a comparison between the values of three crucial parameters for Figure 8A. According to Table 4, downscaling the assumption (24) increased the value of δ CT c and decreased the values of δ CI γ and δ C . This is because both δ C and δ CT c are related to each other via δ CI γ . Since clusters 2, 4, and 5 had the highest levels of cytotoxic cells, increasing cytotoxic cells' cancer inhibition effect (δ CT c ) remarkably decreased the number of cancer cells. Similarly, we got Table 5 when scaling the assumption (25). In this case, upscaling the assumption (25) decreased the value of δ C and increased the values of δ CI γ and δ CT c . For the same reason, cancer saturation in clusters 2, 4, and 5 underwent a delay. Moreover, cluster 1 which has a low level of cytotoxic cells but a reasonable level of IFN-γ can more effectively remove cancer cells with this combination of parameter values. Table 4. Three important parameter values. Values of the parameters δ CT c , δ CI γ , and δ C without scaling and after scaling the assumption (24) by factors 0.2 and 5. These are the most effected parameters and correspond to the dynamics in Figure 8A.

Clusters
Without Scaling Scale = 0.2 Scale = 5 Cluster 1 None of the cases above completely neutralized the tumor. Even in the extreme cases, such as clusters 4 and 5 in Figure 8A,B, the cancer population growth was just delayed and would reach its steady state in later times. However, the mentioned parameters can cause significant delays in cancer growth, especially in clusters 1 and 4, whose patients had low survival probabilities. Hence, targeting cells and molecules corresponding to these parameters can be crucial in cancer treatments.
To even further investigate the parameters involved in cancer ODE and have more flexibility in finding scenarios in which cancer completely vanishes, we carried out a bifurcation analysis (see Appendix A.6). We assumed that [IL6], [A], [T c ], and [I γ ] were at their steady state values for each cluster. Next, we chose one parameter as the bifurcation variable and let the rest of them take their estimated values from Table A4 for each cluster.
Our results show that three parameters from Tables 4 and 5 gave non-zero equilibria for cancer for very small values, and they caused cancer to vanish for a large range. This is in-line with how sensitive the dynamics results were to these parameters. On the other hand, parameters such as λ C , λ CIL 6 , and λ CA contribute to larger steady state values as they get larger. Saliently, cancer in clusters 1-5 vanished for very small values of λ C , while small values of λ CA could significantly decrease the steady state population of cancer only in cluster 5. This result is very interesting, because cluster 5 had the highest survival rate (Figure 3), the highest number of LumA subtypes (Figure 4), and the lowest adipocyte population ( Figure 5). For more details, see Appendix A.6. Table 5. Three important parameter values. Values of the parameters δ CT c , δ CI γ , and δ C without scaling and after scaling the assumption (25) by factors 0.2 and 5. These are the most effected parameters and correspond to the dynamics in Figure 8B.

Dynamics with Different Initial Conditions
We investigated the effects of different initial conditions on the dynamics of each cluster. We varied the initial conditions by extracting individual patient data from each cluster. Due to the large number of patients, we plotted the dynamics of tumors with the number of cancer cells below the 40th percentile as the initial conditions. We observe that the dynamics did not change regardless of different initial conditions. All the patients from one cluster converged to the same steady state, no matter whether they started from higher or lower initial quantities ( Figure A1). This is reasonable, since the parameters were derived based on the steady state assumptions for each cluster.
We also looked at the dynamics of small tumors in one cluster converging on a large tumor at the steady state in another cluster. We observe that even when the initial conditions were not from the same cluster as the parameters' values, the dynamics of tumors quickly converged on the dynamics of tumors in the cluster of the steady state. This testifies that the impact of parameters on dynamics is more significant than the initial conditions (Appendix A.5)s.

Discussion
Understanding cancer mechanisms and components in in vitro and in vivo studies is time consuming and does not provide comprehensive results to explain cancer complexity, since each cancer component is studied one at a time [102]. With the advancements in tumor deconvolution methods and the availability of more data [103], the data-driven mathematical models have become popular for exploring the complexity of the system more effectively. In this study, we developed a mathematical model that explored the characteristic differences of breast tumors based on their tumor microenvironments. This model allowed us to understand tumor progression in more detail.
The mathematical results showed that tumor growth in each cluster demonstrates unique interactions with its tumor microenvironment. For example, as the number of cancer cells increases, the numbers of cytotoxic cells, helper T-cells, and IFN-γ increase at the beginning and then decrease to reach a steady state. Furthermore, while regulatory T-cells in clusters 1, 2, and 3 reached the steady state very early and remained constant throughout the time, in clusters 4 and 5, they increased very early and then decreased to reach the steady state. Importantly, it has been found that interactions among immune cells can impact the immunological response in various cancer types [104,105]. Our results also show that it is crucial to investigate the interactions among immune cells rather than simply looking at the quantity of a certain immune cell type to make appropriate prognostic predictions for breast cancer patients. The results also show that some of the variables, such as T c , T h , D, E, and I γ , stayed approximately constant over time. Hence, one might simplify the model by removing the ODEs of these variables and assuming their steady state values in the remaining equations.
When we compare dynamical results with clinical information of the clusters, we can see that cluster 5, which has the best survival probability and one of the most nonaggressive tumor types ( Figure 3C), had some of the lowest tumor growth and had a smaller cancer population at the steady state ( Figure 5). Furthermore, we want to clarify that while cluster 3 contained mostly LumA and LumB breast tumors that tend to have better prognosis and better survival times compared to other subtypes of breast tumor, the survival probability of this cluster is low. This is because cluster 3 significantly consisted of the oldest patients compared to the other clusters ( Figure 3A). In addition, while cluster 1 included more aggressive tumors than cluster 2, tumor growth in cluster 2 was the highest at the steady state. This might be due to the fact that cluster 2 had the youngest patients ( Figure 3A), and aggressive breast tumors in young patients tend to grow more rapidly than ones in older women [101].
Having a large number of unknown parameters and difficulties in deriving their values are the main challenges in mathematical modeling of cancer. In this study, we derived some parameter values from experimental studies published in the literature, and we calculated others using the steady state assumptions and maximum values of the variables, using a similar method given in recent studies [71,106]. We estimated parameters uniquely for each group of patients with different immune profiles, because many studies have demonstrated that patients with different immune compositions show different prognoses and respond differently to therapy [107][108][109]. Thus, having separate parameters for each cluster allowed us to see the effects of immune cells on tumor growth more accurately.
In addition, we performed a global gradient-based sensitivity analysis to investigate the effects of each parameter on the dynamical system. The results of sensitivity and bifurcation analyses provided interesting insights about the biological mechanism, especially the importance of IFN-γ in controlling cancer growth (Figures 8 and A3). We saw that the cancer population is primarily sensitive to parameters that directly promote or inhibit cancer cells, and secondarily, to parameters which promote or inhibit macrophages. In other words, cancer dynamics positively react to an increase in macrophages. It is reported that high levels of tumor associated macrophages are correlated with worse patient prognosis [100]. Moreover, to acquire patient specific parameters, we had to devise some mathematical assumptions about the parameters of our system. These assumptions were mostly mathematical artifacts ensuring nonzero parameter values. We chose the assumptions which involved the sensitive parameters in our sensitivity analysis and investigated the effect of scaling these assumptions to see how the cancer dynamics would change. We found out that we could delay cancer recovery time through certain assumptions. More specifically, we saw that the parameter δ CI γ was involved in the most interesting cases. In these cases, we saw a significant delay in cancer saturation time. Furthermore, according to our bifurcation analysis (see Appendix A.6), these parameters have a large range, which led to a vanishing cancer population that could translate into promising treatment options. The therapeutic potential of IFN-γ in later stages of cancer has been observed by researchers [110,111]. It is known that IFN-γ can directly inhibit breast cancer by restoring ICI-induced apoptosis in breast cancer cells that have acquired resistance to this antiestrogen [112] , and the combination of anti-growth factor receptor MAb and cytokines such as IFN-γ may be useful in the treatment of breast cancer [113]. On the other hand, IFN-γ can increase the cytotoxicity of T-cells, leading to more effective inhibition of cancer cells [114,115]. Our bifurcation results also showed that controlling the production of cancer via adipocytes can significantly decrease the cancer population at the steady state for LumA subtypes. It has been observed that an excessive amount of adipocytes is associated with advanced T stages of the triple-negative and LumA subtypes, and the modulation of adipocytes can provide novel therapy targets for breast cancer [116]. Besides these, we also investigated how a dynamical system would be affected if we used different initial conditions for each cluster, and we saw that the parameter values that were used in each cluster were more important than the initial conditions for determining the dynamics.
The results of this study should be used by considering the limitations of this study. While the use of time course data would be ideal to obtain parameters for each cluster, the availability of such data is limited. To combat this challenge, we used large amounts of tumor data in each cluster as the steady state values, to estimate parameters. Despite this limitation due to the lack of time course data, our model still presented essential information about the progression of breast tumors, and our study can be used to develop new models using only gene expression data of the patients. Moreover, one might utilize different approaches, such as applying partial differential equations, to investigate tumor growth in space and time, or utilize other parameter fitting algorithms [117][118][119][120] to obtain more accurate dynamical systems. Lastly, it has been shown that treatment options in breast cancer have significant effects on the immune system. This model can be used for inferring treatment strategies in breast cancer by adding the interactions between the effects of the treatments on the tumor microenvironment.

Abbreviations
The following abbreviations are used in this manuscript:

. Steady State and Additional Assumptions
Since we were modeling the evolutions of tumors without any interventions, we assumed small tumors grow to large ones at a steady state. As there were both small and large tumors in each cluster, the averages of the smallest tumors and the largest tumors in each cluster, respectively, gave us the initial conditions and the steady state values for the cluster; see Tables 2 and 3. However, we carried out our parameter estimation based on just the steady state values and independent of the initial conditions. This enabled us to validate the importance of our parameter values based on the data of every single patient in the same cluster and across clusters; see Appendices A.4 and A.5.
If we assume specific values of steady state for each variable, we obtain the following sample parameter set: Under these assumptions, the system of Equations (1)-(17) provided us with 17 constraints on parameters. In order to solve this system, we needed to make additional assumptions about a selection of parameters. This ensured that the number of independent equations equaled the number of unknown parameters and that we could uniquely determine the values. We assumed that the carrying capacities A 0 , C 0 , and M 0 were proportional to the maximum values of their corresponding variables in each cluster. We also took the necrosis coefficient to be α NC = 0.5. We also utilized the available research [71,[121][122][123][124] to adopt the natural death/decay/degradation rates. Considering a cell or cytokine X, we estimated the death rate δ X = ln 2/t X 1/2 using half-life measurements t X 1/2 . More specifically, we found that naive T-cells' half-life ranges between 1 and 8 years [71]. Taking it to be two years, we derived a death rate of 9.49 × 10 −4 . Next, we found that the half-lives of cytotoxic T-cells and helper T-cells are 41 h and 3 days, respectively, resulting in δ T h = δ T r = 2.31 and δ T c = 0.406 [71]. The half-life of mature DCs is about 2 to 3 days, giving us an average of 2.5 days. Using this value for t X 1/2 , we obtained δ D = 0.277. Intestinal macrophages have a half life ranging from 4 to 6 weeks. Considering the half life of five weeks, we calculated δ M = 0.0198. According to [125], we found that the half life of adipocytes is 10 years, resulting in δ A = 2.8 × 10 −3 . For HMGB1, there are varying results in [121]. We considered a half-life of 17 min and used this to find δ H = 18. For the degradation rate of estrogen, we considered a 3 to 5 h half-life [122]. Using the average of 4 h, we calculated δ E = 4.16. For IL-6, we assumed the half-life to be 15.5 h. Thus, we obtained δ IL 6 = 1.07 [71]. We took the half-lives of IL-10 and IL-12 to be 3.6 h [71] and 7.8 min [123], respectively. This resulted in δ IL 10 = 4.62 and δ IL 12 = 128. Lastly, for IFN-γ, the estimated half-life was about 30 min, resulting in the rate δ γ = 33.3. More concisely, we used the following the degradation rates in our computation: Though this decreased the number of unknowns, there still existed a substantial number of parameters without values. Therefore, we next imposed heuristic assumptions based on activation, production, and inhibition rates. Let us look at those in detail.
We began by considering that the mean rate of doubling time of a breast cancer tumor is 193 ± 141 [126]. This gives an interval of 52 days to 334 days. Using this fact, we assumed that a faster doubling rate results from both innate cancer proliferation and proliferation caused by cytokine IL-6 and adipocytes. In the case of the faster double rate, we assumed that death is only innate. This results in This contrasts with our slower doubling rate assumptions. We assumed only innate cancer proliferation. In this case, death rates included the effects of anti-cancer agents cytotoxic T-cells and IFN-γ, giving us We used the following mean values: In these assumptions, we took IL mean 6 , A mean , T mean c , and I mean γ to be the average values of the corresponding variable across all patients. In our further assumptions, we used the maximum observable quantities for all the variables across all patients denoted by: 6 , IL max 10 , IL max 12 , and I max γ .
The following maximum variables were calculated from the datasets: Assuming helper T-cells are primarily activated by dendritic cells, we can write We took the inhibition of helper T-cells to be mostly accomplished by cytokine IL-10 and regulatory T-cells. In particular, we claim that these inhibitors are 20 times more effective than natural degradation.
Next, we made the following assumptions about cytotoxic T-cells. The activation of cytotoxic T-cells by estrogen is much stronger than activation by IL-12 and dendritic cells. We declare estrogen to be 200 times as effective as IL-12, and we declare dendrites to be 100 times as effective as estrogen.
We also state that IL-10 and regulatory T-cells are 20 times more effective than natural degradation of cytotoxic T-cells.
For regulatory T-cells, we assumed that their activation by dendritic cells is four times stronger than their activation by estrogen. This is written as We also claim that the activation of dendritic cells by HMGB1 and estrogen is twice as effective as by cancer cells. This gives us the following equality: Next, we declare the inhibition of dendritic cells by cancer cells to be equivalent to the natural degradation rate. In other words, Further, we assume that helper T-cells are the primary activator for macrophages. We express this assumption as For HMGB1, we declare the following equality to be true.
Our next assumption sets the production of IL-12 to be equal among macrophages, dendritic cells, helper T-cells, and cytotoxic T-cells: For the production of IL-10, we let macrophages, dendritic cells, regulatory T-cells, helper T-cells, cytotoxic T-cells, and cancer cells work at equal rates: Next, we assumed that IFN-γ is mostly produced by cytotoxic T-cells. This gave us the following equality: We also declare IL-6 to be equally produced by adipocytes, macrophages, and dendritic cells. Thus, we obtained the following: Furthermore, we scaled the carrying capacities for cancer cells, adipocytes, and estrogen by 2, 2, and 1.5, respectively.
We took the necrosis factor α NC to be 0.5 for all clusters.
The proliferation rate of cancer cells by IL 6 was assumed to be twice that by adipocytes: We declare the general secretion of estrogen to be twenty times the production of estrogen by adipocytes: We took the depreciation of naive dendritic cells to be equivalent to that of dendritic cells. Likewise, we took the depreciation of naive macrphage cells to be equivalent to that of macrophages.
Finally, we assumed that the inhibition of cancer cells by IFN-γ is less effective than inhibition by cytotoxic cells and other causes: Altogether, these assumptions adequately allowed us to derive a sample parameter set.

Appendix A.2. Non-Dimensionalization
For a more stable numerical simulation, parameter estimation, and sensitivity analysis, we non-dimensionalized our parameters and variables with respect to each variable's steady state value. Therefore, a non-dimensional variable [X], non-dimensionalized as [X]/[X ∞ ], will take the value 1 at its steady state. Since the dimensional time scale does not introduce any inconveniences, we chose not to non-dimensionalize with respect to the time. As a result we got the following system: Since we did not nondimensionalize with respect to time, the production rates λ C , λ A and λ E and the decay rates and δ IL 6 were left unchanged. To non-dimensionalize the rest of the parameters, we used the following rules: Furthermore, for the reaction parameters of the form λ XY used in terms of the form λ XY [Y][Z], we used the general non-dimensinalization formula: and finally, for inhibition rates of the form δ XY used in terms of the form δ XY [Y][Z], we used the general non-dimensinalization formula: Consequently, we got non-dimensionalized assumptions as follows:

. Parameter Values
This section includes all the parameter values used in this project. Values are separated into two groups. First, we have degradation rates included from prior research (see Table A3). Second, we have scaling-dependent parameters (see Table 1) based on the assumptions made in Appendix A.1. In order to sufficiently analyze the dynamics of the tumor based on the estimated parameters, we emphasize that possible variations in assumptions and patient to patient values should be considered. These variations have been taken into account and addressed by performing sensitivity analysis and clustering patients based on their immune profiles. In their dimensional forms, the scaling-dependent parameters would also depend on the scaling constant α, in addition to derivation assumptions and patient data. Therefore, we list the objective non-dimensional values. The dimension of these parameters is per day.      since they came out as the most sensitive parameters in Section 3.3. Our results show that regardless of minute differences in quantities, the qualitative behaviors are identical across all the clusters; see Figure A3. In all clusters, for λ C , λ CIL 6 , λ CA , δ CT c , δ CI γ , and δ C , we can see the existence of a single equilibrium for a large range of parameter values (0 to 1). Increasing the values of λ C , λ CIL 6 , and λ CA caused the cancer population to reach higher steady state values. However, for λ C in clusters 1-5 and for λ CA in cluster 5, we observed interesting results at the beginning of their corresponding intervals. More specifically, very small values of λ C caused the cancer population to vanish for all clusters, whereas very small values of λ CA led to very small steady state (but not zero) values for cancer only in cluster 5; see Figure A3. On the contrary, parameters δ CT c , δ CI γ and δ C only pertain to a non-vanishing cancer population in a very small parameter space. For the most part, they caused the cancer population to disappear at the steady state; see Figure A3. The negative Lyapunov exponents show that all the steady state values are Lyapunov stable. Figure A3. Bifurcation and Lyapunov exponent diagrams for cancer parameters: Subfigures show the bifurcation on top of the corresponding Lyapunov diagrams for clusters 1-5. Bifurcation was done for parameters λ C , λ CIL 6 , λ CA , δ CT c , δ CI γ , and δ C from the cancer ODE (9).