The OncoSim-Breast Cancer Microsimulation Model

Background: OncoSim-Breast is a Canadian breast cancer simulation model used to evaluate breast cancer interventions. This paper describes the OncoSim-Breast model and how well it reproduces observed breast cancer trends. Methods: The OncoSim-Breast model simulates the onset, growth, and spread of invasive and ductal carcinoma in situ tumours. It combines Canadian cancer incidence, mortality, screening program, and cost data to project population-level outcomes. Users can change the model inputs to answer specific questions. Here, we compared its projections with observed data. First, we compared the model’s projected breast cancer trends with the observed data in the Canadian Cancer Registry and from Vital Statistics. Next, we replicated a screening trial to compare the model’s projections with the trial’s observed screening effects. Results: OncoSim-Breast’s projected incidence, mortality, and stage distribution of breast cancer were close to the observed data in the Canadian Cancer Registry and from Vital Statistics. OncoSim-Breast also reproduced the breast cancer screening effects observed in the UK Age trial. Conclusions: OncoSim-Breast’s ability to reproduce the observed population-level breast cancer trends and the screening effects in a randomized trial increases confidence in using its results to inform policy decisions related to early detection of breast cancer.


Introduction
Rapidly emerging knowledge in breast cancer control has put pressure on the health system for the adoption of new technologies and policies. Randomized trials are the gold standard of evidence to introduce new interventions in clinical practice and public health; however, such evidence is not always relevant for informing policy decisions because the context of the interventions evolves quickly, compared with the time that elapses between the design of a trial and the availability of its results. For example, most breast cancer screening randomized trials were from the era before breast cancer adjuvant treatment was available and used film-screen mammography [1,2]; breast cancer survival has since vastly improved [3], and digital mammography has superseded film-screen mammography. A cancer simulation model can help integrate evidence from multiple sources and make them more relevant to inform contemporary clinical and policy decisions.
Several groups have developed sophisticated cancer-specific models based on the natural history of cancer that can be revised for additional analyses and incorporate knowledge from experts in different areas [4]. One example is the set of breast cancer models developed by the CISNET breast cancer working group, which have been used extensively to investigate emerging issues in breast cancer control and to inform debates on topics such as breast density legislation in the US [4]. OncoSim-Breast is such a model, developed for the Canadian population using Canadian data wherever applicable. It is the only model of this nature in Canada, i.e., a microsimulation model developed to inform a range of breast cancer control questions. Compared with the CISNET breast cancer models, OncoSim-Breast is unique in that it is available at no charge to users in the public sector. Breast cancer is the latest addition to OncoSim's suite of cancer models. The validation and applications of the OncoSim colorectal, cervical, and lung cancer models have been described previously [5-19]. Briefly, these models were developed using country-specific data, whenever available, and were calibrated to match the key outputs in the national cancer registry. They have also been used to inform the development and revision of clinical guidelines, and the design and implementation of cancer screening programs [13]. The primary objective of OncoSim-Breast is to investigate emerging issues related to breast cancer control in Canada. This work builds on a strong foundation of analyses performed over a decade ago to estimate the impact of diagnostic and therapeutic approaches to non-metastatic breast cancer in Canada, using the Statistics Canada POHEM mathematical model [20]. The present paper has two goals. First, it describes the key assumptions in OncoSim-Breast. Second, it compares OncoSim-Breast's projections with observed data in the Canadian Cancer Registry, projected breast cancer mortality estimates in the Canadian Vital Statistics, and the observed screening effects in a randomized trial.

OncoSim-Breast
The OncoSim-Breast mathematical simulation model combines inputs (demography, the natural history of tumour development and progression, screening, cancer costs, and quality of life) to project population-level outcomes, such as breast cancer incidence and mortality, screening outcomes, stage and age at diagnosis, life years, quality-adjusted life years, lifetime breast cancer costs, and screening or follow-up procedure costs (Figures 1 and 2, Table 1).

Model Inputs | Estimates | Data Sources

Natural history
Rate of occult tumour onset (oncogenesis) | Supplemental File S1: Figure S1 | Calibrated from the input parameters in the University of Wisconsin Breast Cancer Model [24] to match the incidence data in the cancer registry *
Distribution of tumour type (DCIS vs. invasive) by age | Supplemental File S1: Table S2 |
Relative risk of developing occult tumour based on BRCA1/2 gene mutation and breast cancer family history | Supplemental File S1: Table S3 | Calibrated from Singletary SE (2003) [25] to match the incidence data in the cancer registry *
Relative risk of developing occult tumour based on hormone therapy use | Supplemental File S1: Table S4 | Calibrated to match the results of a study reporting the impact of hormone therapy use on breast cancer risk [26]
Tumour growth | Supplemental File S1: Table S5 | Calibrated from the Wisconsin Breast model's parameters [24] to match stage-specific incidence data in the Canadian Cancer Registry (1992-2013) and Canadian Cancer Screening Database (2007-2008)
Tumour spread to other lymph nodes, hazard | |

Cancer detection
Probability of clinical detection by tumour size | Supplemental File S1: Table S8 | Calibrated from the input parameters in the University of Wisconsin Breast Cancer Model [24] to match the incidence data in the cancer registry *
Stage distribution at detection | Supplemental File S1: Tables S9-S11 | Canadian Cancer Registry *
Breast tumour biology | Joint distribution of hormone receptor status, HER2neu status, and grade at detection, by tumour size, nodal involvement, metastatic status, and age of women at tumour detection (Supplementary File S2) | Canadian Cancer Registry *

Disease progression
Stage-specific recurrence and survival risks | Supplemental File S1: Tables S12-S16 | Unpublished data from British Columbia †
Province/territory-specific relative risk of breast cancer survival | Supplemental File S1: Table S17 | Canadian Cancer Registry *

Screening
Sensitivity and specificity of mammography | Supplemental File S1: Figure S5 and Table S18 |
Cost of follow-up procedures for abnormal screen results | Supplemental File S1: Table S19 | Ontario Breast Screening Program 2011, Canadian Breast Cancer Screening Database 2004-2008 and Ontario Health Insurance fee schedules [27,28]

Breast cancer costs | Supplemental File S1: Section 6 | Retrospective administrative database analysis using Ontario data, Ontario Health Insurance Program schedule of benefits, and end-of-life costing study of breast cancer patients [27,29]
Age-specific health state utilities-Canadian general population | Supplemental File S1: Table S21 | [30]
Breast cancer-specific preference score | Supplemental File S1: |

Demography
OncoSim simulates one individual at a time, replicating the age and sex distributions and all-cause mortality of the population in each province and territory in Canada (Supplemental File S1: Section 1). Each simulated individual has attributes, such as demography (sex, province/territory), and breast cancer-related risk factors (BRCA1/2 gene mutation, family history, and exposure to hormone replacement therapy; Supplemental File S1: Table S1).

Natural History
OncoSim-Breast simulates the onset, growth, and spread of tumours, both invasive cancer and ductal carcinoma in situ (DCIS) (Supplemental File S1: Section 2). In the model, invasive tumours can develop without an apparent prior in situ component, because they become invasive before reaching the 2 mm threshold of our simulation. In situ and invasive tumours are allowed to develop and grow independently of each other. Thus, a woman could have one of four outcomes: (1) in situ disease, (2) an invasive tumour, (3) in situ disease that becomes invasive and evolves independently of the initial in situ component, or (4) no breast tumour at all. The development, tumour biology, growth, and clinical detection of breast cancers, both invasive cancer and DCIS, were calibrated from inputs in the University of Wisconsin Breast Cancer Epidemiology Simulation Model ("Wisconsin Breast model") [32] to match the incidence of cancer by age group and year in the National Cancer Incidence Reporting System (1969-1991), Canadian Cancer Registry (1992-2013) and Canadian Breast Cancer Screening Database (2007-2008).
Tumour onset: In OncoSim-Breast, tumours start at 2 mm, based on the probable minimum size detectable by mammography screening and similar to the Wisconsin Breast model. The probability of tumour onset varies by age and year (Supplemental File S1: Figure S1). In addition, the risk increases if a woman has any of the breast cancer-related risk factors (BRCA1/2 mutation, family history of breast cancer, or exposure to hormone replacement therapy; Supplemental File S1: Tables S3 and S4); if a woman has previously had a DCIS tumour, she is also more likely to have invasive cancer (see equation in Supplemental Methods).
Tumour growth: In the model, tumours grow according to the time since tumour onset, the presence of BRCA1/2 gene mutation, tumour type (DCIS or invasive), and tumour aggressiveness (Supplemental File S1: Figure S2). All tumours were assumed to grow according to a Gompertz function that gives the tumour diameter (d) in cm as a function of years since tumour onset (t), scaled according to the maximum diameter allowed for a tumour type. In this function, d_0 is the diameter of the tumour at occult onset (0.2 cm); d_max is the maximum size the tumour is allowed to reach; α represents the tumour growth rate estimated through model fitting; and t represents the years since tumour onset. The breast tumour growth equation coefficients were calibrated from the Wisconsin Breast model's parameters to match stage-specific incidence data in the Canadian Cancer Registry (1992-2013) and Canadian Breast Cancer Screening Database (2007-2008) and various other targets (Supplemental File S1: Table S5). Supplemental File S1: Figure S2 shows the growth curves by tumour type and class for a mean growth rate and mean maximum size.
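The growth equation itself is not reproduced in the text. As a minimal sketch, one Gompertz form consistent with the definitions of d_0, d_max, α, and t is d(t) = d_0 × (d_max / d_0)^(1 − exp(−αt)), which starts at d_0 and approaches d_max. This functional form is an assumption, and the values of d_max and α used below are hypothetical, not the calibrated values in Supplemental File S1: Table S5.

```python
import math

D0_CM = 0.2  # diameter at occult onset (cm), as stated in the text


def gompertz_diameter(t_years, d_max_cm, alpha):
    """Diameter (cm) t_years after onset under the ASSUMED Gompertz form."""
    if t_years < 0:
        raise ValueError("t_years must be non-negative")
    exponent = 1.0 - math.exp(-alpha * t_years)
    return D0_CM * (d_max_cm / D0_CM) ** exponent


if __name__ == "__main__":
    # Hypothetical d_max and alpha, chosen only to show the curve's shape.
    for t in (0, 1, 2, 5, 10):
        print(t, round(gompertz_diameter(t, d_max_cm=8.0, alpha=0.5), 3))
```

Under this form the diameter equals d_0 at t = 0 and rises monotonically towards d_max, so a larger α reaches a given size sooner.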
Tumour spread: An invasive tumour can spread into lymph nodes and beyond the breast. The spread to other lymph nodes is determined by the size and growth rate of the tumour and the time since tumour onset, through the following quantities: µ_N is the propensity to generate positive nodes, drawn from a gamma distribution (mean and variance in Supplemental File S1: Table S5) at the time of tumour onset to allow for heterogeneity; b_1, b_2, b_3 are coefficients estimated through calibration of natural history (Supplemental File S1: Table S6); V(t) denotes the volume of the spherical tumour; V′(t) denotes the growth rate of the volume, i.e., the derivative of V(t); and t represents the age of the tumour, i.e., years since oncogenesis. The equation was adopted from the CISNET-Wisconsin model and was calibrated to match positive node data in the Canadian Cancer Registry (1992-2013) and Canadian Breast Cancer Screening Database (2007-2008). When calibrating the model, we assumed that subsequent non-invasive tumours cannot develop into an invasive tumour once a woman has developed an invasive tumour. However, there is no limit on the number of positive nodes an invasive tumour could generate.
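The nodal-spread equation is likewise not reproduced in the text. The sketch below assumes a linear form, hazard = µ_N × (b_1 + b_2 × V(t) + b_3 × V′(t)), which uses exactly the quantities defined for this step; the form itself, the function names, and all numeric values are assumptions for illustration. The gamma draw is parameterised from a mean and variance, as the text describes for µ_N.

```python
import math
import random


def sphere_volume(diameter_cm):
    """Volume (cm^3) of a sphere with the given diameter."""
    return math.pi / 6.0 * diameter_cm ** 3


def node_hazard(t, diameter_fn, mu_n, b1, b2, b3, dt=1e-4):
    """ASSUMED form: mu_N * (b1 + b2 * V(t) + b3 * V'(t))."""
    lo = max(t - dt, 0.0)
    hi = t + dt
    v = sphere_volume(diameter_fn(t))
    # Finite-difference approximation to the derivative V'(t).
    dv_dt = (sphere_volume(diameter_fn(hi)) - sphere_volume(diameter_fn(lo))) / (hi - lo)
    return mu_n * (b1 + b2 * v + b3 * dv_dt)


def draw_propensity(mean, variance, rng):
    """Gamma draw parameterised by mean and variance (shape k, scale theta)."""
    shape = mean ** 2 / variance
    scale = variance / mean
    return rng.gammavariate(shape, scale)


if __name__ == "__main__":
    rng = random.Random(42)
    mu_n = draw_propensity(mean=1.0, variance=0.5, rng=rng)  # hypothetical moments
    # Hypothetical linear diameter curve, used only to exercise the function.
    print(node_hazard(2.0, lambda t: 0.2 + 0.5 * t, mu_n, 0.01, 0.02, 0.03))
```

Drawing µ_N once per tumour at onset, rather than per time step, is what makes some tumours persistently more prone to nodal spread than others.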
The tumour size and the number of lymph nodes affected then determine if invasive cancer spreads beyond the breast (metastasis). The following equation governs the metastasis rate of an invasive tumour:
Hazard of metastasis = µ_M × k(tumour size, number of positive nodes)
where µ_M is the propensity for metastasis, drawn from a gamma distribution (mean and variance in Supplemental File S1: Table S5) at the time of tumour onset to allow for heterogeneity, and k is an annual metastasis hazard estimated through model calibration. It is a function of tumour size and the number of positive nodes (Supplemental File S1: Table S7).
The hazard was calibrated to match stage-specific incidence data in the Canadian Cancer Registry (1992-2013) and the Canadian Breast Cancer Screening Database (2007-2008). The overall rate of metastasis is the cumulative metastasis rate of all the invasive tumours in a person. For example, if a woman has three invasive tumours, her annual metastasis hazard is the sum of the metastasis hazards of the three tumours.
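The per-woman metastasis hazard described above can be sketched as follows. The function k_hypothetical is a stand-in for the calibrated lookup in Supplemental File S1: Table S7, and its coefficients are invented; converting the hazard to a probability with 1 − exp(−h) additionally assumes the hazard is constant over the interval.

```python
import math


def k_hypothetical(size_cm, n_positive_nodes):
    """Placeholder for the calibrated annual hazard k (values are invented)."""
    return 0.01 * size_cm + 0.02 * n_positive_nodes


def person_metastasis_hazard(tumours, k=k_hypothetical):
    """Sum mu_M * k(size, nodes) over a woman's invasive tumours.

    tumours: iterable of (mu_m, size_cm, n_positive_nodes) tuples.
    """
    return sum(mu_m * k(size, nodes) for mu_m, size, nodes in tumours)


def prob_within(hazard, years=1.0):
    """Probability of the event within `years`, assuming a constant hazard."""
    return 1.0 - math.exp(-hazard * years)


if __name__ == "__main__":
    three_tumours = [(1.0, 2.0, 0), (0.5, 1.0, 2), (2.0, 3.0, 1)]
    h = person_metastasis_hazard(three_tumours)
    print(h, prob_within(h))
```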

Cancer Detection, Staging, and Tumour Biology
Cancer detection: The probability of cancer being detected depends on the tumour size and the number of tumours. If a woman has multiple tumours, we assumed her cancer detection probability is the sum of the clinical detection probability of the individual tumours (Supplemental File S1: Table S8). Clinical detection probability for a tumour was calibrated from the inputs in the Wisconsin Breast model to match the incidence data in the National Cancer Incidence Reporting System (1969-1991) and the Canadian Cancer Registry (1992-2013). The hazards were interpolated linearly for in-between sizes.
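A minimal sketch of the two rules in this paragraph: linear interpolation of the clinical detection hazard between tabulated tumour sizes, and summation over a woman's tumours. The knot values below are hypothetical (the calibrated values are in Supplemental File S1: Table S8), and holding the hazard flat outside the tabulated range is an assumption.

```python
from bisect import bisect_right

# Hypothetical (size_cm, annual detection hazard) knots, sorted by size.
KNOTS = [(0.2, 0.0), (1.0, 0.05), (2.0, 0.3), (5.0, 0.9)]


def detection_hazard(size_cm, knots=KNOTS):
    """Linearly interpolate the detection hazard for one tumour."""
    sizes = [s for s, _ in knots]
    if size_cm <= sizes[0]:
        return knots[0][1]  # assumption: flat below the smallest knot
    if size_cm >= sizes[-1]:
        return knots[-1][1]  # assumption: flat above the largest knot
    i = bisect_right(sizes, size_cm)
    (s0, h0), (s1, h1) = knots[i - 1], knots[i]
    return h0 + (h1 - h0) * (size_cm - s0) / (s1 - s0)


def woman_detection_hazard(tumour_sizes_cm):
    """Sum the per-tumour detection hazards for a woman with several tumours."""
    return sum(detection_hazard(s) for s in tumour_sizes_cm)


if __name__ == "__main__":
    print(detection_hazard(1.5), woman_detection_hazard([1.0, 2.0]))
```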
Staging: The stage at detection uses the American Joint Committee on Cancer (AJCC) classification of tumour size (T), nodal status (N), and metastasis (M) (Supplemental File S1: Table S9). The tumour size and nodal status at detection are estimated from age and from the tumour size (T*) and number of positive nodes (N*) generated by the natural history component. First, the model determines whether a tumour is a T4 tumour (one that has extended to the chest wall and/or skin); the probability of a T4 tumour is a function of T* and N* (Supplemental File S1: Table S10). Next, for non-T4 tumours, it estimates T from T*. Finally, the model assigns nodal status (N in TNM) at the time of detection from a distribution that depends on N* and T, fitted using the Canadian Cancer Registry data (Supplemental File S1: Table S11).
Tumour biology: To simplify the model, OncoSim assigns tumour biology (hormone receptor status, HER2neu status, and grade) once a tumour has been detected. The joint distribution of these biological factors was estimated from the Canadian Cancer Registry by tumour size (Tis, T1a, T1b, T1c, T2, T3, T4), nodal involvement (N0, N1, N1mi, N2, N3), metastatic status, and age of women at tumour detection (10-year age groups) (Supplemental File S2). Women with a BRCA1/2 gene mutation have different distributions of tumour biology than women without one [33]. Women who used hormone replacement therapy have different tumour grades [34].

Disease Progression
Upon cancer detection, the model draws the time to disease progression (recurrence and breast cancer death), based on stage, tumour biology, age at diagnosis, and detection method (clinical or screen detection). A woman will die from breast cancer if the simulated time to breast cancer death is sooner than the simulated time to non-breast cancer death. We modelled disease progression using data from a cohort of women diagnosed with breast cancer in British Columbia between 2006 and 2009 and followed up until 2014. We fitted the stage-specific outcomes data (diagnosis to local recurrence, diagnosis to distant recurrence, local recurrence to distant recurrence, etc.) to Weibull regression models, controlling for the number of years from diagnosis, age, grade, hormone status, HER2neu status, screening status, and the variables' interactions (Supplemental File S1: Tables S12-S16). Stage-specific recurrence risks and breast cancer survival outcomes were estimated using data from the British Columbia Cancer Agency because comprehensive staging data only became available recently in the Canadian Cancer Registry. To capture provincial differences in stage-specific survival, the model applies province-specific relative risks, estimated from more recent data in the Canadian Cancer Registry, to the British Columbia survival curves (Supplemental File S1: Table S17).
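The competing-risk rule in this section can be sketched as below: draw a Weibull-distributed time to breast cancer death and compare it with the time to death from other causes. The shape and scale values are hypothetical, the covariate effects of the fitted regression models (Supplemental File S1: Tables S12-S16) are omitted, and the helper names are illustrative only.

```python
import math
import random


def draw_weibull_time(shape, scale, rng):
    """Inverse-CDF draw from a Weibull distribution (time in years)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return scale * (-math.log(u)) ** (1.0 / shape)


def cause_of_death(t_breast_cancer, t_other):
    """Breast cancer death only if it occurs strictly sooner than other death."""
    return "breast cancer" if t_breast_cancer < t_other else "other causes"


if __name__ == "__main__":
    rng = random.Random(1)
    t_bc = draw_weibull_time(shape=1.3, scale=25.0, rng=rng)  # hypothetical
    t_other = 18.0  # hypothetical years to non-breast cancer death
    print(round(t_bc, 1), cause_of_death(t_bc, t_other))
```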

Screening
In OncoSim, screening can detect tumours earlier than they would have been detected clinically. The survival from time of screen-detection to breast cancer death includes lead time and net survival benefit (Supplemental File S1: Figure S3). Neither lead time nor net survival benefit is input to the model: rather, these can be estimated from the model output. Survival models were calibrated to match the observed survival data from a cohort of women diagnosed with breast cancer in British Columbia in 2006-2009; the survival data from these women were available up to 2014. Screen detection also leads to a stage shift that contributes to the survival benefit. The model reports overdetection (cancers that would not otherwise present clinically) as an output.
For evaluating screening strategies or related performance, the model allows users to create different screening strategies and scenarios by modifying the following input parameters: screening program recruitment strategy (e.g., start/end age and years); screening participation and retention; screening frequency; screening modality (e.g., digital mammography); sensitivity and specificity of screening; follow-up protocol after abnormal screening results; and costs of screening and follow-up procedures.
The model also includes historical breast screening trends in Canada (starting in 1986) to match the observed screening patterns reported in the screening programs in 2007-2012. Screening interventions can vary by family history and BRCA1/2 gene mutation. The model includes different screening modalities and allows their performance to vary by tumour size, age group, and screen sequence (Supplemental File S1: Figure S5 and Table S18). Women with an abnormal mammogram receive additional workups, such as diagnostic imaging, biopsy, and fine-needle aspiration. The model includes costs of screening and follow-up procedures from the perspective of a public healthcare payer, such as the Ministry of Health (Supplemental File S1: Table S13).

Breast Cancer Costs
The model included healthcare costs associated with breast cancer from the perspective of a public healthcare payer, such as the Ministry of Health (Supplemental File S1: Section 6). The costs included breast cancer surgery, radiation treatment, chemotherapy, imaging tests and oncology physician fees, acute hospitalizations, emergency department visits, home care, long-term care, complex continuing care, and others. The model captures lifetime costs of breast cancer across three phases of care (first 18 months after diagnosis, continuing care, and terminal care), a similar approach as that used in other established breast cancer simulation models [3].

Health-Related Quality of Life
To calculate quality-adjusted life years after an individual is diagnosed with breast cancer, the model multiplies the duration of each health state by age- and sex-specific preference scores for the Canadian population and breast cancer-specific health state utilities (upon cancer diagnosis) (Supplemental File S1: Tables S21 and S22) [30,31]. When an individual is in several health states at the same time, we assumed the utility score is multiplicative [35].
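As a minimal sketch of the multiplicative assumption, each period contributes its duration times the product of all utilities that apply during it. The utility values below are hypothetical (the model's values are in Supplemental File S1: Tables S21 and S22), and discounting is not shown.

```python
import math


def qalys(segments):
    """Sum duration * product(utilities) over health-state segments.

    segments: iterable of (duration_years, [utility, ...]) pairs, where the
    list holds every utility that applies simultaneously in that segment.
    """
    return sum(duration * math.prod(utilities) for duration, utilities in segments)


if __name__ == "__main__":
    # Two years with an age-specific utility of 0.9 and a hypothetical
    # breast cancer state utility of 0.8, then three years at 0.85 alone.
    print(qalys([(2.0, [0.9, 0.8]), (3.0, [0.85])]))
```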

Model Validation
We validated our model in three ways using OncoSim version 3.3.6. First, for face validation, we plotted the projected incidence and stage distribution of breast cancer in Canada and the observed data in the Canadian Cancer Registry (1992-2017). Second, as another face validation exercise, we compared OncoSim's projected breast cancer deaths in 2018 and the latest breast cancer death data in the Canadian Vital Statistics [36]. Lastly, as an external validation exercise, we simulated the screening strategies of the UK Age trial [37,38] in OncoSim to compare OncoSim's projected impact of breast cancer screening on incidence and mortality with the observed effects in the trial.
The UK Age trial is a randomized trial that compared annual screening in women aged 40-49 years with usual care in the UK in the 1990s [37]. To compare our results with other established breast cancer simulation models, we set up our simulation following their methods when they compared their predictions against the UK Age trial results (details of the simulation have been reported in another paper) [39]. Briefly, we simulated a cohort of women born in 1950-1957 to match the birth cohort in the UK Age trial in two scenarios: (1) no screening and (2) annual screening for women aged 40-49 years. In the screening scenario, we calibrated the rescreening rate to the average number of mammograms per woman in the Age trial (4.8) [39]. For each scenario, we estimated the incidence of breast cancer and breast cancer deaths in women aged 40-49 years. We then compared OncoSim-Breast's projected incidence of breast cancer (DCIS and invasive cancers) with the trial's mean estimate and its 95% confidence interval. For breast cancer mortality, we compared the mortality reduction ratio from OncoSim-Breast with the trial's mean estimate and 95% confidence interval at the 10-year and 17-year follow-up. We chose to compare rate ratios rather than rates because the populations were different: volunteers in the UK Age trial vs. the Canadian population. In this simulation, we did not adjust the natural history to match the UK population, and we did not change the all-cause mortality variable for the UK population.
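The comparison described above reduces to computing a rate ratio between the screening and no-screening scenarios and checking whether it falls inside the trial's confidence interval. In the sketch below all counts, person-years, and interval bounds are hypothetical placeholders, not values from OncoSim-Breast or the UK Age trial.

```python
def rate_ratio(events_screen, person_years_screen, events_control, person_years_control):
    """Ratio of the event rate with screening to the rate without screening."""
    rate_screen = events_screen / person_years_screen
    rate_control = events_control / person_years_control
    return rate_screen / rate_control


def within_interval(value, lower, upper):
    """True if a projected ratio lies inside a reported confidence interval."""
    return lower <= value <= upper


if __name__ == "__main__":
    # Hypothetical simulated breast cancer deaths and person-years.
    rr = rate_ratio(80, 100_000, 100, 100_000)
    print(rr, 1.0 - rr, within_interval(rr, 0.6, 1.1))  # hypothetical CI bounds
```

Because a ratio cancels out differences in baseline rates, it is more comparable across populations than the rates themselves, which is the reason the text gives for using it.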

Results
OncoSim's projected breast cancer incidence and deaths at the national level were close to the observed data in recent years (projected incidence in Figure 3). OncoSim projected a breast cancer death rate of 27 per 100,000 women in 2018, and the Vital Statistics reported 28 deaths per 100,000 women [36]. When projecting breast cancer incidence by province/territory in recent years (2008-2017), OncoSim's estimates were also close to the observed data for most jurisdictions (Figure 3). Its projections were within the confidence intervals of the Canadian Cancer Registry data for all provinces and territories, except the two larger provinces (Quebec and Ontario), where its projections were slightly lower. OncoSim's projected age trend in the incidence of invasive breast cancer and DCIS was also similar to that in the Registry (Figure 4A,B) in 1992-2013. When comparing the projected incidence for specific age groups, OncoSim's projection was slightly higher in women aged 70-79 years in 1992-2013. For stage distribution, OncoSim projected that 80% of breast cancer cases diagnosed in 2011-2015 were earlier stage cases (stage I and II), whereas the observed data in the Canadian Cancer Registry reported 82% (Figure 5).
In our external validation exercise, we estimated the effects of annual breast cancer screening in women aged 40-49 years. OncoSim's projections were within the confidence intervals of the observed results from the UK Age trial (Table 2). When estimating the mortality reduction from breast cancer screening, OncoSim estimated a smaller effect than the Age trial at the 10-year follow-up, but the estimates were more similar at the 17-year follow-up. OncoSim's projections were almost identical to the average mortality reduction predicted by the five CISNET breast cancer models at the 10- and 17-year follow-up [39].

Discussion
This paper provides an overview of OncoSim-Breast inputs, assumptions, breast cancer cost projections, and model validation results. When projecting incidence, mortality, and stage at diagnosis of breast cancer, OncoSim-Breast's estimates were close to the estimates reported in the Canadian Cancer Registry and Vital Statistics. In addition, OncoSim-Breast's ability to reproduce the observed effects of annual breast cancer screening in a randomized screening trial increases confidence in using the model results to inform breast cancer screening-related policy decisions. When simulating the effects of breast cancer screening in women aged 40-49 years on breast cancer mortality, OncoSim's projections were almost identical to the average projections from the CISNET breast cancer models [39].
Building upon the experience of other OncoSim models and another established breast cancer microsimulation model [3], OncoSim-Breast was developed using Canadian data. While the model has many potential applications, its primary purpose is to evaluate the impact of interventions related to early detection, such as promoting breast cancer awareness through professional and public education and screening. For screening, the model has many detailed outputs for informing policy decisions, including the harms of screening (e.g., false positives and overdetection), healthcare costs, and benefits (life years gained, cancer incidence and mortality, and quality-adjusted life years). Jurisdictions planning the implementation of population-based breast cancer screening can compare the impact of different screening strategies. For jurisdictions that have an organized breast cancer screening program in place, OncoSim-Breast could help investigate emerging issues such as increasing false positives and customizing screening protocols based on different risk factors. In addition, jurisdictions can use the model to assess the impact of service disruptions during the COVID-19 pandemic [41]. For example, they can estimate the impact of pausing screening for various time intervals on the stage at diagnosis and breast cancer deaths. They can also compare the impact of different strategies for restoring screening programs on downstream resources, such as follow-up diagnostics, biopsies, and surgeries.

Limitations
This paper has several limitations. First, OncoSim is a simulation model built using the best available data; the accuracy of projections depends on the quality of data inputs and the validity of assumptions. To address the issue of rapidly emerging evidence, OncoSim-Breast allows users to modify the inputs and assumptions. Second, our comparison of OncoSim-Breast's projections with more recent Canadian Cancer Registry data was limited by the availability and quality of data in the Registry. Third, our simulation of the UK Age trial was an exploratory external validation exercise; we did not calibrate the model to reflect the use of single-view mammography in the Age trial or to match the historically poorer breast cancer outcomes at that time. Fourth, OncoSim-Breast was built to be a multi-purpose breast cancer simulation tool and could simulate many scenarios; therefore, it would not be feasible to validate all its possible projections against observed data. To ensure OncoSim-Breast's relevance for supporting policy decisions, the team compares OncoSim-Breast's projections with emerging real-world data and refines the model based on new evidence, on an ongoing basis. In upcoming releases, further enhancements will include adding emerging data on new screening modalities and other factors that might affect screening performance, such as breast density and polygenic risk scores. Fifth, the model does not consider the impact of comorbidity on breast cancer survival. Finally, OncoSim-Breast focuses on breast cancer in women only.

Conclusions
OncoSim-Breast is a natural history-based simulation model developed using Canadian cancer incidence, mortality, screening program, and cost data. It reproduces breast cancer trends in the Canadian Cancer Registry, breast cancer mortality in the Vital Statistics, and the breast cancer screening effects observed in a randomized screening trial.