Clustering of Pediatric Brain Tumors in Texas, 2000–2017

Risk factors for pediatric brain tumors are largely unknown. Identifying spatial clusters of these rare tumors on the basis of residential address may provide insights into childhood socio-environmental factors that increase susceptibility. From 2000–2017, the Texas Cancer Registry recorded 4305 primary brain tumors diagnosed among children (≤19 years old). We performed a spatial analysis in SaTScan to identify neighborhoods (census tracts) where the observed number of pediatric brain tumors was higher than expected. Within each census tract, the number of pediatric brain tumors was summed on the basis of residential address at diagnosis. The population estimate from the 2007–2011 American Community Survey of 0- to 19-year-olds was used as the at-risk population. p-values were calculated using Monte Carlo hypothesis testing. The age-standardized rate was 54.3 per 1,000,000. SaTScan identified twenty clusters, of which two were statistically significant (p < 0.05). Some of the clusters identified in Texas spatially implicated potential sources of environmental risk factors (e.g., proximity to petroleum production processes) to explore in future research. This work provides hypothesis-generating data for further investigations of spatially relevant risk factors of pediatric brain tumors in Texas.


Introduction
Pediatric brain tumors are the most common solid tumor in children ≤19 years old. Children who develop these tumors experience substantial morbidity and high mortality rates [1,2]. Aside from postnatal exposure to ionizing radiation [3], risk factors related to pediatric brain tumors are largely unestablished. People are exposed to a mixture of chemicals daily. Some of the variations in chemical exposures can be attributed to nearby geospatial features (e.g., living near agricultural land, major roadways, industrial facilities, etc.) [4]. Spatial-temporal variation of exposure to environmental factors may result in clusters of cancer cases. There is consistent evidence that children with leukemia may aggregate in space and time based on the address at diagnosis [5]. Conversely, it is inconclusive whether pediatric brain tumors cluster in a similar manner. Spatial clustering analyses of pediatric brain tumors have been conducted in the United States and Europe. A systematic review identified 16 publications that have conducted space-time clustering in relation to childhood brain tumors [5]. Of these, four reported an aggregation of any childhood brain tumors. Two studies conducted in Florida used the address at diagnosis and reported a cluster encompassing the Miami-Dade area [6,7]. The other two studies were conducted in Yorkshire, United Kingdom, using both addresses at birth and diagnosis [8] and in Great Britain using the address at birth [9].
Both of these studies reported clusters, but neither specified the geographical areas that the clusters covered.
In the United States, clustering analyses of pediatric brain tumors have been conducted in only a few states (i.e., California, Florida, Georgia, and New Jersey) and have used different geographical units, including counties [10,11], ZIP Code Tabulation Areas (ZCTAs, which approximate U.S. Postal Service ZIP codes) [7,12], and census tracts [6]. Census tracts provide greater geospatial granularity than counties and ZCTAs, especially in dense urban areas.
Texas is the second most populous state in the United States [13] and ranks second in the number of newly diagnosed pediatric cancer cases each year [14]. Texas is also geographically diverse in terms of its urbanicity and economic industries (e.g., agriculture, oil and gas production). No studies have conducted a clustering analysis of pediatric brain tumors within Texas. Therefore, to generate hypotheses related to the etiology of pediatric brain tumors, we conducted a clustering analysis with the census tract as the unit of analysis in the state of Texas for the period 2000–2017. By identifying the areas of the state where incidence rates may be higher than expected, the results could further guide the investigation of potential risk factors for pediatric brain tumors in Texas.

Study Population
The Texas Cancer Registry (TCR) is the statewide, population-based registry that utilizes active and passive surveillance systems to collect data on cancer cases in Texas. The TCR is one of the largest cancer registries in the United States. The TCR collects information required by the North American Association of Central Cancer Registries (NAACCR) and the National Cancer Institute's Surveillance, Epidemiology and End Results (SEER) Program, including the types of cancer diagnosed and their locations within the body on the basis of ICD-O-3 codes. For children ≤19 years old with a recorded tumor diagnosis, the ICD-O-3 histology, primary site, and behavior codes were used to classify the tumor based on the International Classification of Childhood Cancer (ICCC), 3rd edition [15]. From 2000 to 2017, 4305 children were diagnosed with a primary malignant brain tumor (i.e., categorized as group III on the basis of the ICCC with an ICD-O-3 behavior code of 3).
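The inclusion criteria described above can be illustrated with a minimal sketch. The record layout, field names, and example records below are hypothetical and are not the TCR's actual data format; the sketch also assumes that the ICCC main group has already been assigned to each record.

```python
from dataclasses import dataclass

@dataclass
class TumorRecord:
    # Hypothetical fields; a real registry extract will use different names.
    age_at_diagnosis: int  # years
    iccc_group: str        # ICCC-3 main group, e.g. "III"
    behavior_code: int     # ICD-O-3 behavior code (3 = malignant)

def is_eligible(rec: TumorRecord) -> bool:
    """Apply the three criteria from the text: age <= 19, ICCC group III, behavior 3."""
    return (rec.age_at_diagnosis <= 19
            and rec.iccc_group == "III"
            and rec.behavior_code == 3)

# Made-up records for illustration only.
records = [
    TumorRecord(7, "III", 3),   # meets all three criteria
    TumorRecord(7, "III", 1),   # excluded: behavior code is not 3
    TumorRecord(12, "I", 3),    # excluded: not ICCC group III
    TumorRecord(25, "III", 3),  # excluded: older than 19 years
]
eligible = [r for r in records if is_eligible(r)]
print(len(eligible))
```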

Cluster Analysis
Using SaTScan software [16], we conducted a cluster analysis with the census tract as the unit of analysis. We retrieved the census tract geographic file that matched the 2007–2011 American Community Survey (ACS) estimates from the U.S. Census Bureau TIGER database. To obtain the total population at risk within each census tract, we used the U.S. Census population estimates based on the midpoint of our study period (i.e., 2009), which was the ACS 5-year population estimate of 0- to 19-year-olds for 2007–2011. The Texas Department of State Health Services Center for Health Statistics geocoded each patient's residential address at diagnosis, making it possible to sum the number of pediatric brain tumors inside each census tract. Because the data were counts of cases within a population at risk, we modeled them with a Poisson distribution [17], treating each brain tumor diagnosis as an event. SaTScan imposes a moving window over a geographical area. Census tracts are represented by their centroids, and a tract is included in a window if the window contains its centroid. We used circular windows: around each centroid, SaTScan evaluated circles of increasing radius until the window included a maximum of 50% of the population at risk. For each window, SaTScan tested whether the incidence of pediatric brain tumors in census tracts within the window was greater than the incidence in census tracts outside the window by calculating the likelihood function. SaTScan uses Monte Carlo hypothesis testing to calculate the p-value by comparing the rank of the maximum likelihood from our data with the maximum likelihoods from randomly generated data sets [16,18].
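To make the scanning procedure concrete, the following is a minimal sketch of a circular Poisson scan with a Monte Carlo p-value. It is not SaTScan's implementation. It assumes planar coordinates, no covariate adjustment, and high-rate clusters only, and it tests only the single most likely cluster. The function names, the six toy tracts, and the 99 replications are illustrative assumptions.

```python
import math
import random

def poisson_llr(c, e, total):
    """Log-likelihood ratio for a window with c observed and e expected cases.
    Windows without an excess (c <= e) score zero (high-rate scan only)."""
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def max_llr(cases, pop, coords, max_pop_frac=0.5):
    """Scan circles centered on every centroid; return the best LLR and its tracts."""
    total_cases, total_pop = sum(cases), sum(pop)
    best_llr, best_members = 0.0, []
    for cx, cy in coords:
        # Sort tracts by distance from the center; each prefix is one circular window.
        order = sorted(range(len(coords)),
                       key=lambda i: math.hypot(coords[i][0] - cx, coords[i][1] - cy))
        c = p = 0
        for k, i in enumerate(order):
            c += cases[i]
            p += pop[i]
            if p > max_pop_frac * total_pop:
                break  # window would exceed the population cap
            e = total_cases * p / total_pop  # expected cases under the null
            llr = poisson_llr(c, e, total_cases)
            if llr > best_llr:
                best_llr, best_members = llr, order[:k + 1]
    return best_llr, best_members

def monte_carlo_p(cases, pop, coords, n_sim=99, seed=0):
    """Rank the observed maximum LLR among maxima from null-simulated data sets."""
    rng = random.Random(seed)
    observed, members = max_llr(cases, pop, coords)
    total = sum(cases)
    n_ge = 0
    for _ in range(n_sim):
        # Null: each case lands in a tract with probability proportional to population.
        sim = [0] * len(pop)
        for i in rng.choices(range(len(pop)), weights=pop, k=total):
            sim[i] += 1
        if max_llr(sim, pop, coords)[0] >= observed:
            n_ge += 1
    return observed, members, (n_ge + 1) / (n_sim + 1)

# Toy data: six tracts of equal population, with an excess in tract 0.
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
pop = [1000] * 6
cases = [9, 1, 1, 1, 1, 1]
llr, members, p = monte_carlo_p(cases, pop, coords)
print(round(llr, 3), members, p)
```

With 99 replications the smallest attainable p-value is 0.01, so reporting p < 0.001 would require at least 9999 replications under this ranking scheme.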

Results
The age-adjusted incidence rate for pediatric brain tumors in Texas over our study period was 54.3 per 1,000,000 children. The rates were calculated using the 2000 US Standard Population for age standardization. The median age of cases was 7 years. Cases were more likely to be male (53%) (Table 1). By race/ethnicity, 48% were non-Hispanic white, 37% were Hispanic, 11% were non-Hispanic Black, and 4% were Other or Unknown. Six cases had missing latitude or longitude information, leaving 4299 for the cluster analysis. SaTScan identified 20 clusters, of which two were statistically significant at p < 0.05 (Figure 1, Table 2). The most significant cluster (p < 0.001) was identified in a census tract encompassing the Texas Medical Center. The Texas Medical Center is home to two renowned cancer centers that treat childhood cancers (i.e., Texas Children's Hospital and MD Anderson Cancer Center). The second significant cluster spatially encompassed 451 census tracts in the larger Dallas-Fort Worth metropolitan area (most located in Montague, Cooke, Grayson, Wise, Denton, and Collin Counties) (p = 0.01). Two clusters had a p-value greater than 0.05 and less than 0.15. One of these clusters contained three census tracts in Orange County, Texas (p = 0.11). The other contained 15 census tracts along part of the Houston Ship Channel (p = 0.14). Although the remaining 16 clusters had p-values greater than 0.15 and are therefore compatible with chance, SaTScan still reported them as areas where more pediatric brain tumors were observed than expected.
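Direct age standardization, as used for the rate above, can be sketched as follows. The age bands, case counts, person-years, and weights are invented for illustration. They are not TCR data and are not the actual 2000 US Standard Population weights.

```python
def age_standardized_rate(cases, person_years, std_weights, per=1_000_000):
    """Directly standardized rate: a weighted sum of age-specific rates,
    with weights taken from a standard population."""
    total_weight = sum(std_weights.values())
    rate = 0.0
    for age_band, weight in std_weights.items():
        age_specific_rate = cases[age_band] / person_years[age_band]
        rate += (weight / total_weight) * age_specific_rate
    return rate * per

# Hypothetical inputs for illustration only.
cases = {"0-4": 50, "5-9": 40, "10-14": 30, "15-19": 30}
person_years = {"0-4": 1_000_000, "5-9": 1_000_000,
                "10-14": 1_000_000, "15-19": 1_000_000}
std_weights = {"0-4": 2, "5-9": 3, "10-14": 3, "15-19": 2}  # relative sizes, not real weights

asr = age_standardized_rate(cases, person_years, std_weights)
crude = sum(cases.values()) / sum(person_years.values()) * 1_000_000
print(round(asr, 1), round(crude, 1))
```

In this toy example the standardized rate (37.0 per 1,000,000) differs from the crude rate (37.5 per 1,000,000) because the standard population weights the age bands differently from the study population.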

Discussion
Overall, we identified 20 clusters of childhood brain tumors across Texas, of which 2 were statistically significant. Below, we hypothesize as to what these geospatial clusters may reflect.
The most significant hotspot was identified around the Texas Medical Center, which is home to two world-renowned pediatric cancer treatment centers. Physicians across the United States and the world may refer their patients to these cancer treatment centers. We speculate that this cluster reflects families temporarily relocating near the Texas Medical Center to receive care. Similarly, a non-significant cluster was identified a few miles outside of McAllen, Texas (cluster #18, Figure S1), near the Vannie Cook Children's Clinic, which is the comprehensive pediatric cancer and hematology center in South Texas. We hypothesize that this cluster might also reflect patients who are referred to Vannie Cook by physicians in South Texas or Mexico and who temporarily relocate to this area for treatment. Further research is needed to understand which address is being collected at the time of diagnosis and whether changes in the data collection or reporting procedures may be necessary.
The second most significant cluster was geospatially the second largest, contained the largest number of census tracts, and included a large portion of North Texas. Several hypotheses could explain this cluster. First, the Dallas-Fort Worth Metropolitan Area experienced substantial population growth in the northern suburbs during this study period [19]. Although we used the census population estimates based on the midpoint of our study period (i.e., 2009) to represent the population at risk, it is possible that some census tracts experienced faster growth, which we were unable to account for. Second, part of the large cluster overlaps with the Barnett Shale, an area that has large reserves of natural gas and some permitted gas drilling. Extracting these resources requires a multistep process that includes pad preparation, drilling, hydraulic fracturing, and gas production. Each process releases or injects different compounds into the environment [20]. Pad preparation, drilling, and oil and gas production release air pollutants [20]. Hydraulic fracturing injects fracturing fluid (a mixture of water, sand, and chemicals) into the well site to create cracks in the deep rocks to better access the gas or oil, which may contaminate sources of drinking water [20]. While not conclusive, there is evidence that living near oil and gas operations is associated with adverse health effects, including poorer reproductive outcomes and respiratory conditions [21]. Living near hydraulic fracturing sites has also been reported to elevate the risk of childhood leukemia [22,23], but the literature regarding pediatric brain tumors is sparse. One study reported an elevated risk of pediatric brain tumors, but the risk was higher in counties with the fewest wells [24]. Third, the southern part of the cluster is near the Dallas-Fort Worth airport, one of the busiest airports in the United States.
Airplanes emit various air pollutants, including ultrafine particles and volatile organic compounds. Concentrations of air pollutants have been reported to be higher around airports and downstream of landing and takeoff [25]. One study reported that the incidence of childhood leukemia in Texas tended to be higher in census tracts near airports [26], but no study was found investigating proximity to airports and the risk of pediatric brain tumors. The literature on air pollutants and pediatric brain tumors is inconclusive, but some positive associations have been reported [3].
While not significant, the third and fourth clusters were geospatially near ports where petroleum and petroleum products are imported and exported or near oil and gas refineries. The cluster in Orange County was geographically located between Port Arthur, Texas, in the southwest and Lake Charles, Louisiana, in the east. These two cities are home to two of the largest petroleum refineries in the United States and have ports for importing and exporting oil and gas products. The fourth cluster included part of the Houston Ship Channel, one of the world's busiest ports, and was located near oil and gas refineries. Another non-significant hotspot (cluster #13, Figure S1) was also identified in Port Arthur. Together, these clusters suggest that being exposed to pollutants from these industries may increase susceptibility to pediatric brain tumors, warranting further research to test this hypothesis.
Our clustering analysis can only identify areas where the number of observed cases of childhood brain tumors is greater than expected. Because we did not statistically test associations between specific exposures and the risk of childhood brain tumors, we cannot draw any definitive conclusions about why these clusters were identified. Further research is needed to test the hypotheses proposed above regarding potential sources of pollutants and the risk of childhood brain tumors. The proposed sources emit a mixture of pollutants (e.g., metals, particulate matter, and benzene [20,25,27]) that can be inhaled or ingested and cross the blood-brain barrier [28]. Several of these pollutants have been linked to adverse neurodevelopmental outcomes in children [28] and adult brain tumors [29]. Pediatric brain tumors have over 100 histological subtypes and thus are more heterogeneous than childhood leukemia. In the systematic review, 10 of the 16 publications did not report a cluster of pediatric brain tumors [5]. Given this heterogeneity, it is possible that childhood brain tumors cluster by histological subtype. Eight studies in the literature have conducted analyses by histological type. Two (one in North West England and the other in Murcia, Spain) reported a cluster of astrocytomas [30,31], and one reported a cluster of primitive neuro-ectodermal tumors (PNET) in Yorkshire, United Kingdom [8]. Another study published after the systematic review reported a cluster of astrocytomas in California [12]. Three studies conducted geospatial analyses of medulloblastoma or ependymoma [5], but no cluster was identified. It may be challenging to identify clusters of medulloblastoma and ependymoma given that there are four and nine molecular subtypes, respectively. The molecular characterization of medulloblastoma was recognized by the World Health Organization in 2016 [32] and that of ependymoma in 2021 [33].
These molecular characterizations were not collected in the Texas Cancer Registry during this study period. Given our decision to conduct the spatial clustering analysis using census tract to identify areas with more granularity, we did not conduct analyses by histological subtype due to the smaller sample size. As the Texas Cancer Registry continues to collect data on cases diagnosed in the state, future studies may be able to conduct clustering analyses by specific subtypes of pediatric brain tumors.
This study has some limitations. Because we did not have the address at birth, we were unable to repeat the geospatial analyses at this timepoint and compare clusters. While we cannot rule out the prenatal period as a critical window of exposure, there is some evidence that exposures based on addresses at birth and diagnosis are similar [34,35], so these results may also apply to sources of exposure during pregnancy. In our analyses, we used Kulldorff's circular scan, which may not capture non-circular clusters. Strengths of this study include the availability of latitude and longitude data for cases, its large sample size and representation of almost all cases of childhood brain tumors diagnosed in the state, the geographical diversity based on urbanicity and economic industries, and the use of census tracts as the geographical unit.
In conclusion, this is the first geospatial clustering analysis of childhood brain tumors diagnosed in Texas, and it identified several clusters. Some of these clusters overlap geospatially with areas that have known sources of environmental pollutants. While this study cannot draw conclusions about these pollutants and the risk of childhood brain tumors, future research is warranted to test the hypotheses generated from our results for exposures during both pregnancy and early childhood.