The Geographic Context of Racial Disparities in Aggressive Endometrial Cancer Subtypes: Integrating Social and Environmental Aspects to Discern Biological Outcomes

The number of Endometrial Carcinoma (EC) diagnoses is projected to increase substantially in coming decades. Although most ECs have a favorable prognosis, the aggressive, non-endometrioid subtypes are disproportionately concentrated in Black women and spread rapidly, making treatment difficult and resulting in poor outcomes. Therefore, this study offers an exploratory spatial epidemiological investigation of EC patients within a U.S.-based health system’s institutional cancer registry (n = 1748) to search for and study geographic patterns. Clinical, demographic, and geographic characteristics were compared by histotype using chi-square tests for categorical and t-tests for continuous variables. Multivariable logistic regression evaluated the impact of risks on these histotypes. Cox proportional hazard models measured risks in overall and cancer-specific death. Cluster detection indicated that patients with the EC non-endometrioid histotypes exhibit geographic clustering in their home address, such that congregate buildings can be identified for targeted outreach. Furthermore, living in a high social vulnerability area was independently associated with non-endometrioid histotypes, as continuous and categorical variables. This study provides a methodological framework for early, geographically targeted intervention; social vulnerability associations require further investigation. We have begun to fill the knowledge gap of geography in gynecologic cancers, and geographic clustering of aggressive tumors may enable targeted intervention to improve prognoses.


Introduction
Endometrial carcinoma (EC) is the most common cancer of the female genital tract [1]. It is categorized into five frequent histological subtypes (histotypes), which describe the histological characteristics and biological behavior of the tumor: endometrioid, serous, mixed, and clear-cell carcinoma, and carcinosarcoma [2]. While endometrioid is the most common histology, representing about 75% of all EC, histotype-specific cancer incidence differs by population, where Black women in the US have lower survival and are disproportionately diagnosed with the aggressive non-endometrioid histotypes [2,3]. Prior risk stratification systems show that Black women in the U.S. have poorer outcomes, as they are diagnosed at a later stage [4] with higher grade [5] and more aggressive non-endometrioid histology [6]. Although Black women often present with more advanced stages of disease, higher-grade census tract levels covering four themes: socioeconomic status, household composition and disability, minority status and language, and housing type and transportation. The cancer literature has used SVI to assess the resiliency and vulnerability of different communities to external stress [30,31]. In this study, it was used as a marker of the geographic context of socioeconomic conditions in the patients' residential area. Overall, associations between tumor type and geographic context, such as neighborhood deprivation, have not been studied in EC. We suggest that markers of neighborhood deprivation risks can be associated with histotype and the underlying biology. We sought to examine the association of endometrioid versus non-endometrioid histology and survival with an individual's social and physical environment.

Study Population
We collected data from 4039 patients diagnosed with endometrial cancer, including endometrioid and non-endometrioid subtypes, between 1998 and 2021 at University Hospitals Cleveland (UH) Seidman Cancer Center from the institutional cancer registry, with follow-up for 22 years. All patients were treated with primary surgery including hysterectomy and bilateral salpingo-oophorectomy. Additionally, patients considered to be at high risk of recurrence, including those with non-endometrioid subtypes or advanced stage (FIGO stage II or III), were offered adjuvant chemotherapy or pelvic radiation. These individuals were matched with the Ohio Cancer Incidence Surveillance System (OCISS) from 1992 to 2018 in order to obtain outcome data for overall and cancer-specific survival. Records were excluded if they did not have a complete street address and, among the remaining records, if the residential address listed was outside of Ohio. The fifteen counties comprising the Case Comprehensive Cancer Center (CCCC) catchment area were used to define the study area. All residential addresses located within these counties were geocoded, and addresses that did not have high positional accuracy were removed, resulting in a dataset of 3582 women. Only 1748 women could be linked by diagnosis date between the two datasets, and 836 of these were excluded due to missing data, leaving 912 women for analysis (Figure S1). While we were unable to use the full original dataset, the proportions of race (Black or White women) and disease stage (stage I or higher stage) are similar between both datasets.

Demographic Characteristics
Relevant variables available from our institutional cancer registry include histology, height, weight, FIGO stage, race, and residential addresses (subsequently geocoded to X-, Y-coordinates). Patients were categorized as having endometrioid versus non-endometrioid histotypes. The non-endometrioid category comprises several histotypes, including serous, clear cell, or mixed carcinoma, and carcinosarcoma.
The OCISS database contains statewide cancer surveillance data for incidence and survival of cancer by the National Program of Cancer Registries. All cancer cases diagnosed among Ohio residents, except for basal cell and squamous cell carcinoma of the skin, are reported to OCISS. The OCISS database follows the data standards and data dictionary of the North American Association of Central Cancer Registries and includes clinical characteristics at diagnosis (stage, grade, and cancer site), demographics (age, sex, and race), date of diagnosis, primary treatment information, vital status follow-up, and cause of death.

Geospatial Analyses
Residential addresses with high positional accuracy and complete model data were available for 912 unique patients. Each individual was geocoded to their point-level location. Three spatial epidemiological cluster detection techniques, Local Moran's I (LMI) [32], Gi* [33], and GeoMEDD [34], were used to investigate geographic patterns of non-endometrioid subtypes within the CCCC area in ArcMap (v = 10.7.1) [35]. Local Moran's I and Gi* have been used widely in the detection of cancer clusters [19,28,36,37], while GeoMEDD is a new approach developed in COVID-19 spatial syndromic surveillance [34]. Local Moran's I and Gi* operate on aggregate geographic units, in this case census tracts. GeoMEDD output is a boundary drawn around point-level cases that satisfy user-defined space-time relationships, which identifies granular concentrations, such as at a street segment or building level. While GeoMEDD reports only the presence of geographical areas that meet user-specified spatial and temporal relationships (but not significance thresholds), LMI and Gi* compare the observed spatial pattern of disease with the null hypothesis of complete spatial randomness (CSR) as generated by 999 Monte Carlo simulations [28,38]. Local Moran's I and Gi* were calculated for the rate of non-endometrioid subtypes by census tract, with neighbors defined as having shared boundaries and vertices (1st order queen contiguity matrix). The output from these techniques is multiple clusters of varying sizes, each meeting a significance threshold of p ≤ 0.01. For example, some clusters may be p = 0.01, while others may be p = 0.0025, etc. Therefore, results are reported for all clusters as p ≤ 0.01. It should also be noted that these significance values are pseudo p-values, as spatial cluster detection inherently relies on multiple testing procedures [39].
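To make the permutation logic behind the pseudo p-values concrete, the following is a minimal sketch of Local Moran's I under first-order queen contiguity with 999 random permutations. It is not the ArcMap implementation used in this study. The regular grid of "tracts", the function names `queen_neighbors` and `local_morans_i`, the row-standardized weights, and the one-sided counting rule for the pseudo p-value are all illustrative assumptions.

```python
import random


def queen_neighbors(rows, cols):
    """Queen contiguity on a toy rows x cols grid of tracts (assumption):
    two cells are neighbors if they share an edge or a vertex."""
    neighbors = []
    for r in range(rows):
        for c in range(cols):
            cell = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= rr < rows and 0 <= cc < cols:
                        cell.append(rr * cols + cc)
            neighbors.append(cell)
    return neighbors


def local_morans_i(rates, neighbors, permutations=999, seed=0):
    """Return one (I_i, pseudo_p_i) pair per tract.

    Weights are row-standardized (each neighbor gets 1/k). The pseudo p-value
    is one-sided: the share of permutations whose simulated statistic is at
    least as large as the observed one, counting the observed value itself.
    """
    n = len(rates)
    mean = sum(rates) / n
    z = [x - mean for x in rates]
    m2 = sum(d * d for d in z) / n
    if m2 == 0:
        raise ValueError("rates are constant; Local Moran's I is undefined")
    rng = random.Random(seed)
    results = []
    for i in range(n):
        k = len(neighbors[i])
        if k == 0:
            results.append((0.0, 1.0))  # isolated tract: no spatial lag
            continue
        observed = z[i] / m2 * sum(z[j] for j in neighbors[i]) / k
        others = z[:i] + z[i + 1:]  # conditional permutation: tract i stays fixed
        hits = 0
        for _ in range(permutations):
            simulated = z[i] / m2 * sum(rng.sample(others, k)) / k
            if simulated >= observed:
                hits += 1
        results.append((observed, (hits + 1) / (permutations + 1)))
    return results


# Toy example: a 5 x 5 grid with a block of high non-endometrioid rates
# in the top-left corner (hypothetical values, not study data).
toy_rates = [0.6 if (i // 5 < 2 and i % 5 < 2) else 0.1 for i in range(25)]
toy_results = local_morans_i(toy_rates, queen_neighbors(5, 5))
```

With 999 permutations the smallest attainable pseudo p-value is 1/1000, which is why results of this kind are usually reported against a threshold rather than as exact values.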
Neighborhood deprivation proximate to each address was integrated within a Geographic Information System (GIS). The Centers for Disease Control and Prevention (CDC) Social Vulnerability Index (SVI) approximated social conditions using 15 census variables at the census tract level (Figure 1) [24]. Each census tract was ranked by percentiles ranging from 0 to 1, with greater values indicating greater social vulnerability [24]. For each patient, an 800-m buffer was drawn around their address at diagnosis and the average SVI score for the year 2018 was individually coded. The Environmental Protection Agency's (EPA) Toxic Release Inventory (TRI) served as another marker of risk in the home environment of each patient [23]. Geographic X, Y coordinates pinpointed each facility's location. Using the 800-m buffer, the number of TRI facilities as well as the mean and median chemical release values reported by these facilities for the years 2000, 2010, and 2020 were calculated for each patient (Figure 1). This buffer size was selected for this exploratory analysis based on its widespread use as a walkable distance in numerous studies on utilization of the neighborhood environment [40,41].
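The buffer-based summaries can be illustrated with a short sketch. The source does not state how the SVI of tracts overlapping a buffer was averaged, so this example makes a simplifying assumption: each tract is represented by a single centroid, all coordinates are in a projected system measured in meters, and a tract or facility counts as "in the buffer" when its point lies within 800 m of the address. The function and field names are hypothetical.

```python
import math
import statistics

BUFFER_M = 800.0  # buffer radius in meters, as described in the text


def within_buffer(origin, point, radius_m=BUFFER_M):
    """True if `point` lies within `radius_m` of `origin` (projected x, y in meters)."""
    return math.hypot(point[0] - origin[0], point[1] - origin[1]) <= radius_m


def summarize_buffer(address_xy, tract_centroids, tri_facilities, radius_m=BUFFER_M):
    """Summarize the neighborhood around one patient address.

    tract_centroids: iterable of (x, y, svi_percentile)   -- assumed representation
    tri_facilities:  iterable of (x, y, reported_release) -- assumed representation
    """
    svi = [s for x, y, s in tract_centroids
           if within_buffer(address_xy, (x, y), radius_m)]
    releases = [r for x, y, r in tri_facilities
                if within_buffer(address_xy, (x, y), radius_m)]
    return {
        "mean_svi": statistics.mean(svi) if svi else None,
        "tri_count": len(releases),
        "tri_mean_release": statistics.mean(releases) if releases else None,
        "tri_median_release": statistics.median(releases) if releases else None,
    }


# Hypothetical inputs, not study data.
example = summarize_buffer(
    address_xy=(0.0, 0.0),
    tract_centroids=[(100.0, 0.0, 0.8), (0.0, 500.0, 0.6), (2000.0, 0.0, 0.1)],
    tri_facilities=[(300.0, 400.0, 10.0), (600.0, 600.0, 99.0)],
)
```

An area-weighted average over the tract polygons intersecting the buffer would be a closer match to typical GIS practice; the centroid rule is used here only to keep the sketch free of geometry libraries.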

Statistical Analysis
Clinical, demographic, and environmental characteristics were compared by histotype (endometrioid versus non-endometrioid) using chi-square tests for categorical variables (i.e., FIGO stage and race) and t-tests for continuous variables (i.e., age, BMI, SVI, the total number of TRI releases (TRI density), and the number of TRI facilities (TRI count)). Normality was assessed using visual density and Q-Q plots as well as the Shapiro-Wilk test. We categorized the variables SVI, TRI density, and TRI count for our modeling of outcomes. SVI quartiles were utilized, where the highest quartile captured the greatest social vulnerability. Both TRI variables were categorized as having no facilities or releases (0) or any facilities or releases (>0). Multivariable logistic regression models were used to evaluate the association of health, social, and physical environmental risks with histotype. All multivariable models were adjusted for age, BMI, FIGO stage, and race.
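The categorization step described above can be sketched as follows. This is an illustrative example only: the analyses in this study were performed in R, the SVI scores below are invented, and the quartile cut points come from Python's default (exclusive) quantile method, which may differ slightly from the method used by the authors. The function names are hypothetical.

```python
import statistics


def svi_quartile(score, cuts):
    """Map an SVI score to quartile 1..4, where 4 is the most vulnerable.

    `cuts` holds the three quartile cut points in ascending order.
    """
    return 1 + sum(score > c for c in cuts)


def tri_any(value):
    """Binary TRI indicator: 0 for no facilities or releases, 1 for any (> 0)."""
    return 1 if value > 0 else 0


# Hypothetical patient-level SVI scores (0 to 1), not study data.
svi_scores = [0.05, 0.12, 0.20, 0.33, 0.41, 0.58, 0.66, 0.79, 0.85, 0.97]
cuts = statistics.quantiles(svi_scores, n=4)  # three cut points
quartiles = [svi_quartile(s, cuts) for s in svi_scores]

# High vs. low SVI contrast: highest quartile against the lowest (reference).
high_svi = [q == 4 for q in quartiles]
```

Treating the lowest quartile as the reference group mirrors the high versus low SVI comparison reported in the results.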
Survival analyses employing Cox proportional hazards models and a log-rank test compared the risk of overall death (219 deaths, 693 censored) and cancer-specific death (72 deaths, 840 censored). Both univariate and multivariable analyses were employed for survival outcomes, where multivariable models were adjusted for age at diagnosis (continuous), BMI (obese vs. non-obese), FIGO stage (stage I vs. higher stage), and race (Black vs. White). All statistical analyses were performed using R. Two-sided statistical tests were employed, and statistical significance was defined as p < 0.05. Given that this analysis was exploratory, no adjustment for multiple comparisons was made.

Results
Among the 912 women with endometrial cancer, 649 (71.2%) had the endometrioid subtype and 263 (28.8%) had non-endometrioid subtypes (Table 1). Chi-square tests and t-tests showed significant differences by histotype for age, SVI, TRI density, BMI, FIGO stage, and race. On average, individuals with the non-endometrioid subtype were older, had a higher SVI, a lower density of TRI releases, a lower BMI, and a more advanced FIGO stage (stage II-IV), and were predominantly Black women (Table 1). The three geospatial techniques identified similar statistically detectable geographic clusters of non-endometrioid histotype rates located in two high social vulnerability areas of Cuyahoga County as well as in more rural areas of northeast Ohio (p < 0.05). GeoMEDD identified clusters ranging in size from areas of a neighborhood to individual congregate living facilities. Furthermore, there were five individual addresses with at least three patients who had non-endometrioid subtype tumors. These are not duplicate records; rather, they are indicative of congregate living facilities such as nursing homes and apartments. In some census tracts with low populations, these individual point locations can drive the rate for the tract. In these cases, it may not be neighborhood-level conditions that are associated with the more aggressive tumor subtype, but rather the concentration of older women in particular buildings.
The results from a multivariable logistic regression model are presented in Table 2 and were adjusted for covariates (Table 1). The models measured the association of endometrioid vs. non-endometrioid histotype with either the social variable, SVI, or the environmental variables, TRI count and TRI density, adjusted for age, BMI, FIGO stage, and race. SVI was associated with histotype both as a continuous variable (OR = 2.14; 95% CI = 1.26, 3.63; p = 4.70 × 10⁻³; data not shown) and as a categorical variable (OR = 1.77; 95% CI = 1.16, 2.72; p = 0.008) comparing high SVI to the reference, low SVI (Tables 2 and S1). Under the categorical variable model, women in areas with the greatest social vulnerability have 77% higher odds of non-endometrioid EC compared to women residing in areas with minimal social vulnerability. Neither of the TRI variables was significantly associated with endometrioid or non-endometrioid histotypes in these multivariable models (Tables 2 and S1). A survival analysis evaluated the time to overall death for the social and environmental variables SVI, TRI count, and TRI density as continuous and categorical variables. In models similar to the multivariable logistic regression model, each of these variables, SVI, TRI count, and TRI density, was individually assessed with adjustment for the covariates age, BMI, FIGO stage, and race. SVI was associated with survival both as a continuous variable (HR = 1.86; 95% CI = 1.17, 2.94; p = 8.33 × 10⁻³) and as a categorical variable comparing high SVI to low SVI (HR = 1.67; 95% CI = 1.14, 2.41; p = 7.95 × 10⁻³) (Figures 2 and S2, Tables 3 and S2). Neither of the TRI variables was significantly associated with survival in these models (Tables 3 and S2).
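As a reading aid for the estimates above, the following sketch shows how a ratio estimate and its 95% confidence interval relate to a percentage change and an approximate p-value. It assumes a Wald-type interval that is symmetric on the log scale (an assumption about how the intervals were constructed, not something stated in the source), and the function name `wald_check` is hypothetical. Because the inputs are rounded published values, the recovered p-values are only approximate.

```python
import math


def wald_check(estimate, ci_low, ci_high, z_crit=1.96):
    """Back-calculate the log-scale standard error, z statistic, and two-sided
    p-value from a ratio estimate (OR or HR) and its 95% confidence interval."""
    se_log = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(estimate) / se_log
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return {
        "pct_change": (estimate - 1) * 100,  # percent change in odds or hazard
        "se_log": se_log,
        "z": z,
        "p": p_two_sided,
    }


# Categorical SVI (high vs. low) estimates as reported in the text.
odds_ratio_check = wald_check(1.77, 1.16, 2.72)    # logistic model, histotype
hazard_ratio_check = wald_check(1.67, 1.14, 2.41)  # Cox model, overall death
```

Under this approximation, an odds ratio of 1.77 corresponds to odds that are 77% higher in the high SVI group, and the back-calculated p-values fall close to the reported values.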

Figure 2.
Overall deaths survival analysis hazard ratios. In the overall deaths survival analysis, SVI high is significant (p = 0.008) when adjusting for age, BMI, FIGO stage, and race. Advanced age, higher BMI, and Black race have hazard ratios greater than or equal to 1, indicating an increased risk of death.
Additionally, multivariable survival analyses predicted an increased likelihood of overall death with higher SVI. The association of overall survival and SVI was significant in models with SVI as a continuous or a categorical variable (Table S3). Women in the high SVI category had a significantly higher risk of death than those in the low SVI category. The endometrial cancer-specific deaths model did not replicate these results, which may be due to the smaller number of cancer-specific deaths (Table S3).

Discussion
We used a modeling approach to evaluate the association between endometrial cancer histotype and several community-scale factors for EC cases diagnosed in the CCCC area. A model was built to assess the relationship of endometrioid versus aggressive/non-endometrioid histotype and a novel variable, SVI, adjusting for age, BMI, FIGO stage, and race. SVI was associated with the aggressive non-endometrioid histotype as both a continuous and a categorical variable. For example, a woman residing in an area in the highest SVI quartile has 77% higher odds of non-endometrioid cancer compared to women living in areas in the lowest SVI quartile. In a race-stratified analysis, SVI is significantly associated with histotype as both a continuous and categorical variable in White women (data not shown). No significant association was detected in Black women, as the current analysis is underpowered, but the SVI odds ratios were in the same direction as in the analysis with only White women. Given the impact of histotype on progression, social vulnerability may be indicative of disease severity and should be considered in the treatment of all women. While other studies have not detected an association between tumor type and SVI, SVI has been associated with poorer post-surgery outcomes [42] and less access to treatment [43,44] in cancer patients.
We also geographically clustered endometrial histotype in the CCCC area. Both endometrioid and non-endometrioid histotype clusters were detected and indicated that the CCCC area has a racially patterned external environment, as is the case throughout much of the U.S. [13]. This is the first time (to our knowledge) that endometrial histotype has been geographically clustered. The mapped output will be used internally to target clinical intervention and to direct more locally targeted analyses. The SVI data can be used to create risk stratifications based on where individuals live for more holistic treatments. Those residing in high-risk areas could receive extra screening that may lead to earlier diagnosis. Targeted screening could improve health outcomes of individuals with non-endometrioid EC, as there are fewer early diagnosable symptoms.
Follow-up studies using a greater population of endometrial-cancer-specific deaths are needed to confirm the survival analysis results. Overall, social vulnerability should be considered in treatment, as it may also be indicative of overall survival. We expect that improved screening and targeted treatment derived from SVI risk stratifications could decrease mortality rates, particularly from the high-risk non-endometrioid histotypes [45–48].
While there are significant associations between SVI and both non-endometrioid EC and survival, the biological mechanism is still unknown. High social vulnerability may be associated with biological stress responses, as the cancer literature has previously used SVI to assess the community's resiliency and vulnerability from external stressors [30,31,49–51]. Future studies could better define why Black women and women residing in areas of high SVI are more frequently diagnosed with non-endometrioid EC. There could be a stress-related biological mechanism, such as changes in DNA methylation or host immune response, that is driving the increase in non-endometrioid EC diagnosis. Additionally, a particular component encompassing SVI could drive the association in different communities. A geographically weighted regression of histotype and SVI could detect distinct clusters where SVI is more or less predictive [52]. As we were unable to use the full dataset due to a lack of linkage between institutional and Ohio cancer databases based on diagnosis dates, we may be able to detect more distinct clusters with a fuller dataset. Further studies could also detect associations between each of the 15 components encompassing the SVI measurement to understand if different clusters of histotype are associated with specific vulnerabilities.
No significant association was detected between either TRI variable, continuous or categorical, and histotype or death. This may be due to the limited number of individuals with TRI facilities or releases greater than zero. Some women in our sample resided in regions with TRI facilities and releases greater than zero, but not enough to detect a statistical difference (Figure S3). It may also be that environmental exposures come not from TRI facilities, which are regulated, but from non-reporting facilities, which may pose more of a risk to health [16,53]. Future studies in areas with a variety of TRI exposures are needed to determine whether there is a relationship between TRI facilities or releases and histotype or death.

Limitations
This study offers novel insights into the geographic context of tumor biology in endometrial cancer and spatial methods to search for granular places that can be targeted for education and intervention; however, it has a number of limitations that should be considered. Some of these will require further investigation as this line of inquiry develops. First, the study does not account for the temporal aspect of the exposure-outcome relationship. We do not know how long the patients lived at their home address nor if the geographic context of their surroundings in this place changed over time, specifically in the chosen proxies for neighborhood deprivation. We also assume stationarity in the patient home addresses, which may not be the case. There is evidence to demonstrate that low-income patients in particular exhibit elevated residential mobility, often forced mobility as in the case of evictions. Furthermore, we lack knowledge of the daily mobility of patients and their surroundings, such as areas where they spend their time away from the home address that may be more relevant sources of exposure to risks such as TRI. Despite its importance for measuring exposure, such "activity space" geography is not a part of the Electronic Medical Record (EMR). These limitations are a part of a larger concern, the Uncertain Geographic Context Problem (UGCoP) [54], which our team is addressing [55–57]. Furthermore, we lacked a replication site. Replicate studies in other cities with a history of a racially patterned environment may be needed to further support our claims [58,59]. In particular, recent studies have questioned the role of neighborhood, especially how the characteristics of deprivation, stressors, and social vulnerability influence biological processes that lead to adverse health outcomes, such as DNA methylation [60]. This study was limited to women who self-identified as Black or White. Our results cannot be generalized to women of other races and ethnic groups without further analyses.

Conclusions
The United States displays a racially patterned external environment that likely impacts health and survival of women with EC. Individuals in our study with the more aggressive non-endometrioid subtype were predominantly Black women and resided in areas with higher social vulnerability. These associations represent an opportunity for improved targeted screening and risk stratification based on a woman's health characteristics and geographic location. Clinical risk stratification using SVI and other risk variables can detect who benefits most from extra screening and care. Clinicians may be able to improve prediction and detection of aggressive EC and decrease poor outcomes if social and physical factors are included as risk factors.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph19148613/s1, Figure S1: Inclusion and Exclusion Flowchart; Figure S2: Association of SVI and Survival; Figure S3: Distribution of TRI Count and TRI Density;
Informed Consent Statement: Informed consent was waived for this study by the Institutional Review Board due to the minimal risk to participants and retrospective nature of the study.
Data Availability Statement: Not applicable.