Spatially Explicit Environmental Factors Associated with Lymphatic Filariasis Infection in American Samoa

Under the Global Program to Eliminate Lymphatic Filariasis (LF) American Samoa conducted seven rounds of mass drug administration (MDA) between 2000 and 2006. Subsequently, the territory passed the WHO recommended school-based transmission assessment survey (TAS) in 2011/2012 (TAS-1) and 2015 (TAS-2) but failed in 2016, when both TAS-3 and a community survey found LF antigen prevalence above what it had been in previous surveys. This study aimed to identify potential environmental drivers of LF to refine future surveillance efforts to detect re-emergence and recurrence. Data on five LF infection markers: antigen, Wb123, Bm14 and Bm33 antibodies and microfilaraemia, were obtained from a population-wide serosurvey conducted in American Samoa in 2016. Spatially explicit data on environmental factors were derived from freely available sources. Separate multivariable Poisson regression models were developed for each infection marker to assess and quantify the associations between LF infection markers and environmental variables. Rangeland, tree cover and urban cover were consistently associated with a higher seroprevalence of LF-infection markers, but to varying magnitudes between landcover classes. High slope gradient, population density and crop cover had a negative association with the seroprevalence of LF infection markers. No association between rainfall and LF infection markers was detected, potentially due to the limited variation in rainfall across the island. This study demonstrated that seroprevalence of LF infection markers were more consistently associated with topographical environmental variables, such as gradient of the slope, rather than climatic variables, such as rainfall. These results provide the initial groundwork to support the detection of areas where LF transmission is more likely to occur, and inform LF elimination efforts through better understanding of the environmental drivers.


Introduction
Lymphatic filariasis (LF) is a neglected tropical disease that causes considerable disability and disfigurement in the mostly low-and-middle-income countries where it is endemic [1,2]. Wuchereria bancrofti, a filarial worm transmitted between humans by mosquitoes, is the parasite responsible for all LF cases in American Samoa [3]. Once the larvae enter the human host, they mature into adult filarial worms that damage the lymphatic system, potentially causing chronic pain and disability [4]. Chronic consequences of LF infection include lymphoedema, scrotal hydroceles and elephantiasis. There is also a major mental health burden stemming from social stigma and an inability to carry out employment and normal daily activities [5,6]. The World Health Organization (WHO) estimates that 863 million people worldwide remain at risk of LF infection, including the entire population of American Samoa [7].
The Global Program to Eliminate LF and the regional program, the Pacific Program to Eliminate LF (PacELF), were officially formed in 2000 and 1999, respectively [8,9]. These programs aim to eliminate LF as a public health problem using a two-pronged strategy. The first is to interrupt transmission using mass drug administration (MDA), followed by post-MDA surveillance using transmission assessment surveys (TAS). The second is the and LF warrants further investigation. An understanding of these associations can also help reduce the risk of infections through management of environmental factors, such as landcover management. This study aims to identify potential environmental drivers of LF transmission in American Samoa by using spatial analysis to explore associations between LF seroprevalence in a 2016 community survey and locally collected or remotely sensed environmental data. While this study is focused on American Samoa, the methods could be widely applicable to other settings.

Study Location
American Samoa is a US territory in the South Pacific that consists of five islands and two atolls [33]. It has a tropical climate with little variation in rainfall and temperature throughout the year [34]. The population is approximately 57,000, 95% of which live on the main island of Tutuila (shown in Figure 1) [35]. Tutuila is just over 30 km from east to west, with its highest peak at 653 m above sea level [36]. In total, the group of islands has an area of 197 km 2 [35]. Trop. Med. Infect. Dis. 2022, 7, 295 3 of 21 support and facilitate LF elimination in American Samoa, and to help develop more sensitive and targeted surveillance methods, the association between environmental factors and LF warrants further investigation. An understanding of these associations can also help reduce the risk of infections through management of environmental factors, such as landcover management. This study aims to identify potential environmental drivers of LF transmission in American Samoa by using spatial analysis to explore associations between LF seroprevalence in a 2016 community survey and locally collected or remotely sensed environmental data. While this study is focused on American Samoa, the methods could be widely applicable to other settings.

Study Location
American Samoa is a US territory in the South Pacific that consists of five islands and two atolls [33]. It has a tropical climate with little variation in rainfall and temperature throughout the year [34]. The population is approximately 57 000, 95% of which live on the main island of Tutuila (shown in Figure 1) [35]. Tutuila is just over 30 km from east to west, with its highest peak at 653 m above sea level [36]. In total, the group of islands has an area of 197 km 2 [35].

Infection Markers
Data on LF infection markers were taken from the results of a 2016 community-based, population proportionate cluster survey conducted in American Samoa in parallel with TAS-3. The community household survey collected data on five serological markers as indicators of LF infection in participants aged eight years and over. These were microfilaremia (Mf) which indicates an active infection [38], circulating filarial antigen indicating the presence of living or dead worms [38], and three LF antibodies against Wb123, Bm14 and Bm33. Circulating filarial antigen and all three antibodies can persist to varying degrees both post-infection and post-treatment [30]. While the patterns and dynamics of

Infection Markers
Data on LF infection markers were taken from the results of a 2016 community-based, population proportionate cluster survey conducted in American Samoa in parallel with TAS-3. The community household survey collected data on five serological markers as indicators of LF infection in participants aged eight years and over. These were microfilaremia (Mf) which indicates an active infection [38], circulating filarial antigen indicating the presence of living or dead worms [38], and three LF antibodies against Wb123, Bm14 and Bm33. Circulating filarial antigen and all three antibodies can persist to varying degrees both post-infection and post-treatment [30]. While the patterns and dynamics of antibodies have not been fully comprehended, it is understood that Wb123 appears in the early stages of infection while Bm14 and Bm33 antibodies may persist for multiple years following infection [38]. Circulating filarial antigen, and Wb123, Bm14 and Bm33 antibodies were measured using ELISA tests. The presence and density of Mf was determined through microscopy. The estimated prevalence of the five LF infection markers, along with laboratory methods, have previously been reported [13,15,38].

Survey Design
The community survey included 30 primary sampling units (PSU). Each PSU included whole villages, combinations of small villages, or village segments with a total population of <2000 people. Two of these PSUs, Fagali'i and the contiguous villages of Futiga, Ili'ili, and Vaitogi, were included based on previous studies confirming high LF antigen prevalence [14]. The villages sampled in this study are displayed in dark green on the map of the two largest islands of American Samoa (Figure 1). Only the villages on the largest island of Tutuila were surveyed in 2016 and included in this study.
The sampling methodology of the 2016 community survey has been previously reported [15]. The end result of the sampling strategy was that 29% of the households in each PSU were randomly selected from a georeferenced list of buildings. All household members in the selected houses aged eight years and older were invited to participate. This selection process gave the total sample size (n = 2710), 98.5% of whom had their GPS coordinates registered based on their household location. The remaining 1.5% were assigned coordinates based on the village centroid. This ensured that every participant was assigned to the correct village. As these approximated locations were in the centre of the village, all were surrounded by other buildings, meaning that there was no effect on buffers used to calculate the environmental variables.

Spatial Layers on Island, Village and Building Boundaries
Shapefiles of boundaries of islands, villages and buildings used in this analysis were compiled by the American Samoa Department of Commerce and downloaded from the American Samoa National Marine Sanctuary GIS data archive [39].

Environmental Variables and Environmental Data
To select the environmental variables for analysis, we conducted a literature review on studies that investigated the relationship between LF infection markers and environmental factors. The review identified 27 studies from around the world. Information was collected on all the environmental factors reported in these studies, with particular focus on the strength and direction of each relationship (see Appendix A for details). The matrix also included study location and date, number of participants, study design, aims, main vector, and overall conclusions. The majority of the studies were conducted in Africa, with only a few undertaken in the Pacific region. As a consequence of this geographic imbalance, the main vector species studied was Anopheles, which has different breeding and biting habits to the Aedes mosquito in American Samoa. In the reviewed studies, there were an array of relationships reported between environmental variables and LF infection markers.
The basis for including environmental variables in this study were two-fold. First, environmental variables were selected based on the presence of strong evidence in the literature review indicating their relationship with the prevalence and distribution of various LF markers. Second, the environmental variable had to be suitable for analysis at a village scale. For example, it would not be logical to include the mean distance to a permanent waterbody aggregated to a village spatial level, as, upon visual inspection, most villages have waterbodies scattered throughout. Based on these criteria, 14 environmental variables were identified for inclusion in this study, however three had to be removed due to lack of available data with sufficient coverage. These three were annual average temperature, average temperature in the dry season and average temperature in the wet season.
Spatial environmental data for American Samoa were downloaded from multiple sources ( Table 1). Datasets were selected to be as close to 2016 as possible to represent the environmental conditions at the time of the community survey. Datasets were excluded if there was insufficient coverage of American Samoa at a high enough resolution to represent the spatial heterogeneity within the small islands (≤1 km 2 ). The issue of insufficient coverage applied mainly to temperature variables, which were removed as a covariate in this study.

Environmental Data Collection and Extraction
To accurately represent the environmental factors in the inhabited areas of each village, where transmission is most likely, 'inhabited buffer zones' were created in ArcGIS Pro version 2.8.6 and used as the areas for environmental data extraction. By creating the inhabited buffer zones for each village, uninhabited areas, which are less likely to influence LF transmission, were removed from the analysis. To create the buffer zones, we assigned a centroid point to each building within the sampled villages using the Centroid (Polygon) tool in ArcGIS version 2.8.6 ( Figure 2). Each centroid was then given a circular buffer, and all buffers in the village were dissolved using the ArcGIS version 2.8.6 dissolve tool to create a boundary of inhabited areas for the village. The buffer size for each building was first set at 100 m radius from the centroid, as 100 m is the approximate standard flight range of Ae. Polynesiensis, and is also a plausible representation of where people most frequently move around a building [46]. A second buffer size of 50 m radius was used for sensitivity analysis.
buffer, and all buffers in the village were dissolved using the ArcGIS version 2.8.6 dissolve tool to create a boundary of inhabited areas for the village. The buffer size for each building was first set at 100 m radius from the centroid, as 100 m is the approximate standard flight range of Ae. Polynesiensis, and is also a plausible representation of where people most frequently move around a building [46]. A second buffer size of 50 m radius was used for sensitivity analysis.

Figure 2.
The process used to create the inhabited buffer zones using ArcGIS version 2.8.6 [47]. The building locations were acquired from the American Samoa National Marine Sanctuary GIS data archive [39]. Each colour in step four illustrates a different inhabited buffer zone.
The spatial layers downloaded for elevation and population density did not need to be altered prior to extracting and averaging the environmental data per inhabited buffer zone. The remaining variables were derived and calculated from the downloaded layers using various methods, outlined below.
Normalised difference vegetation index (NDVI) is a measure of vegetation density and health, and was estimated from the Landsat Collection 2 Level-2: Surface Reflectance and Surface Temperature images [44]. All Landsat images from 2016 with a maximum cloud cover of <10% were included in the calculation. The raster calculator in ArcGIS version 2.8.6 was used to calculate the ratio of near-infrared bands, which are reflected by vegetation, and red light bands, which are absorbed by vegetation [47,48]. NDVI values range from −1 and +1. The values taken from ArcGIS version 2.8.6 were averaged over the year. A high NDVI in an inhabited buffer zone indicates high vegetation density, while a very low NDVI generally indicates water.
The downloaded landcover raster layer included ten landcover classes and needed to be reclassified to four separate rasters, one for each of the landcover classes that were chosen for inclusion in the analysis: rangeland, trees, crop land, and urban/built environments. These four classes were chosen based on the evidence found in the literature review that suggested they are most likely to have an association with LF infection markers. The remaining six landcover classes were not included in this analysis. The reclassified raster layer was converted into a polygon layer and the ArcGIS version 2.8.6 intersection tool was used to find the total area within each inhabited buffer zone that was covered by each of the landcover classes. The landcover class values used in the final models were the percentage of each inhabited buffer zone covered by that landcover class. The Figure 2. The process used to create the inhabited buffer zones using ArcGIS version 2.8.6 [47]. The building locations were acquired from the American Samoa National Marine Sanctuary GIS data archive [39]. Each colour in step four illustrates a different inhabited buffer zone.
The spatial layers downloaded for elevation and population density did not need to be altered prior to extracting and averaging the environmental data per inhabited buffer zone. The remaining variables were derived and calculated from the downloaded layers using various methods, outlined below.
Normalised difference vegetation index (NDVI) is a measure of vegetation density and health, and was estimated from the Landsat Collection 2 Level-2: Surface Reflectance and Surface Temperature images [44]. All Landsat images from 2016 with a maximum cloud cover of <10% were included in the calculation. The raster calculator in ArcGIS version 2.8.6 was used to calculate the ratio of near-infrared bands, which are reflected by vegetation, and red light bands, which are absorbed by vegetation [47,48]. NDVI values range from −1 and +1. The values taken from ArcGIS version 2.8.6 were averaged over the year. A high NDVI in an inhabited buffer zone indicates high vegetation density, while a very low NDVI generally indicates water.
The downloaded landcover raster layer included ten landcover classes and needed to be reclassified to four separate rasters, one for each of the landcover classes that were chosen for inclusion in the analysis: rangeland, trees, crop land, and urban/built environments. These four classes were chosen based on the evidence found in the literature review that suggested they are most likely to have an association with LF infection markers. The remaining six landcover classes were not included in this analysis. The reclassified raster layer was converted into a polygon layer and the ArcGIS version 2.8.6 intersection tool was used to find the total area within each inhabited buffer zone that was covered by each of the landcover classes. The landcover class values used in the final models were the percentage of each inhabited buffer zone covered by that landcover class. The percentage of each landcover class was calculated by dividing the area of the buffer zone covered by the landcover class (m 2 ) by the total area of the inhabited buffer zone. A description of each of the landcover classes used in this analysis is outlined in Table 2. Table 2. The four types of landcover analysed for their association with LF infection markers in American Samoa. Data were taken from the recently updated Esri Land Use/Land Cover 10 m map [43].

Landcover Class Description
Crop Areas with human planted vegetation below tree height.

Rangeland
Areas of shrubs, natural fields, and grassland, with no evidence of artificial plotting or tall tree coverage.

Trees
Areas of tall and dense vegetation, around 4.5 m or higher, including forests, swamps, mangroves, and savannas.
Built/Urban Areas of human development including major paved roads, houses, towns, and large areas of asphalt.
The slope gradient (degrees) was derived from the digital elevation model (DEM) raster layer using the ArcGIS version 2.8.6 slope analysis tool [47]. The monthly precipitation records for each month in 2016 were summed and averaged in ArcGIS version 2.8.6 to provide the annual rainfall in mm. Other variables calculated from this layer were the average monthly precipitation in the driest months (June to September) and wettest months of each year (October to May).
Spatial mean values of rainfall, elevation, slope gradient, NDVI and population density were calculated by inhabited buffer zone in ArcGIS software version 2.8.6 [47] to define the parameters for subsequent analyses.

Associations between Environmental Variables and LF Infection Markers
The statistical analysis of associations between the infection markers and environmental drivers was conducted using R studio software version 2022.02.2 [49]. Analysis was conducted at the village level, rather than household level, to best represent variation in the environmental factors. A descriptive statistical analysis was undertaken for each infection marker and environmental variable. The distribution of each variable was inspected using histograms, and the relationship between the five infection markers and the environmental variables was assessed using scatter plots. Village-level crude prevalence of Mf, LF antigen and each Ab (Wb123, Bm14, Bm33) were estimated, and binomial exact methods were applied to ascertain the 95% confidence intervals (95% CI). Choropleth maps were produced to visualise the estimated prevalence for each of the five infection markers in the sampled villages. For all remaining analysis and the final models, infection marker count data was used.
Each environmental variable initially had two sets of extracted data, using building buffer zones of 50 m and 100 m. The correlation between the environmental data extracted for the 50 m and 100 m buffers and all five infection markers was calculated and based on the Pearson correlation coefficient, the buffer size associated with a stronger correlation with each infection marker was used for the next step of the variable selection process.
After selecting one buffer size per environmental variable for each infection marker, all remaining independent variables were then examined for collinearity using a univariate Poisson regression model. Pairs with a correlation coefficient of >0.8 in the univariate Poisson models were considered highly correlated, and the variable with the highest Akaike information criterion (AIC) value in each pair was excluded from the final multivariable models.
As recent LF surveys in American Samoa suggest that each detection of infection marker is a discrete and relatively rare event, the final analysis for each marker used a Poisson regression model [14,15]. The final model selection and analysis for each infection marker was performed with a multivariable stepwise Poisson regression process that used the LF infection marker count data. The stepwise regression process using a forward selection of the predictor variables was used to find the most parsimonious model. A village population log offset was included in the model to account for differences in population size between sampled villages. The results of the multivariable Poisson regression models have been expressed as relative risk (RR), with the effect size calculated based on these values. For all analyses, a p-value of <0.05 was used to indicate statistical significance.
A 95% confidence interval (95% CI) for RR that excluded 1 was also used to support the indication of significance.

Village Level Prevalence of Infection Markers
The study included 2671 participants in 750 households spread across 30 villages. The infection marker with the highest prevalence in the sampled villages was the Bm33 Ab (45.63% 95% CI 43.73-47.45%), followed by Wb123 Ab (25.61% 95% CI 23.96-27.31%), Bm14 Ab (13.1% 95% CI 11.85-14.41%), antigen (5.05% 95% CI 4.25-5.95%) and Mf (1.27% 95% CI 0.88-1.8). The highest seroprevalence of LF was in the village of Fagali'i, a previously confirmed hotspot [14], where Bm33 Ab prevalence was 95.06% (95% CI 87.84-98.64%) and antigen prevalence 38.27% (95% CI 27.68-49.74%). Fagali'i also had the highest seroprevalence of Bm14 Ab and Wb123 Ab. Bm33 Ab was detected in all sampled villages. Conversely, Mf-positive persons were detected in only 11 of the sampled villages. A descriptive summary of the infection marker counts is shown in Table 3. The geographic distribution of LF seroprevalence amongst the sampled villages is shown in Figure 3. Areas of high antigen and Ab crude prevalence were consistently detected in the far western part of the island, particularly in the villages of Fagali'i, a previously confirmed hotspot, and Fagamalo [14]. The prevalence of Mf (confirmation of active infection) was also the highest in these two villages (14.81% and 23.08%). Conversely, the prevalence of all infection markers tended to be lowest in the eastern part of the island.
Average population density (person/km 2 ) 50 m 20.30 6.74 12.00 36.00 100 m 18. 16 6.00 10.00 33.58 The geographic distribution of LF seroprevalence amongst the sampled villages is shown in Figure 3. Areas of high antigen and Ab crude prevalence were consistently detected in the far western part of the island, particularly in the villages of Fagali'i, a previously confirmed hotspot, and Fagamalo [14]. The prevalence of Mf (confirmation of active infection) was also the highest in these two villages (14.81% and 23.08%). Conversely, the prevalence of all infection markers tended to be lowest in the eastern part of the island.

Village-Level Environmental Data
Annual rainfall, average monthly rainfall in the wet season and average monthly rainfall in the dry season were lowest in the inhabited buffer zone of Tula (in the east) and highest in the inhabited buffer zone in the combined area covering the villages of Satala-Anua-Atuu (central). Average annual rainfall in each inhabited buffer zone ranged between 2137 mm and 4526 mm. In most villages, the seasonal variation in monthly rainfall was minimal, with mean monthly rainfall varying by ~100 mm between the wet and dry season months. The population density ranged between 12 to 36 persons/km 2 with an average of 18 persons/km 2 . As expected, both slope gradient and elevation were lower closer to the coastline and higher in the centre of the territory. There was minimal variation in the mean values of the environmental data extracted using 50 m and 100 m buffers ( Table  3). The results of the environmental data extraction in each PSU, in both the 50 m and 100 m inhabited buffer zones, is reported in Appendix B.

Village-Level Environmental Data
Annual rainfall, average monthly rainfall in the wet season and average monthly rainfall in the dry season were lowest in the inhabited buffer zone of Tula (in the east) and highest in the inhabited buffer zone in the combined area covering the villages of Satala-Anua-Atuu (central). Average annual rainfall in each inhabited buffer zone ranged between 2137 mm and 4526 mm. In most villages, the seasonal variation in monthly rainfall was minimal, with mean monthly rainfall varying by~100 mm between the wet and dry season months. The population density ranged between 12 to 36 persons/km 2 with an average of 18 persons/km 2 . As expected, both slope gradient and elevation were lower closer to the coastline and higher in the centre of the territory. There was minimal variation in the mean values of the environmental data extracted using 50 m and 100 m buffers ( Table 3). The results of the environmental data extraction in each PSU, in both the 50 m and 100 m inhabited buffer zones, is reported in Appendix B.

Multivariable Poisson Regression Models
The models developed for each infection marker differed in the environmental variables that were found to be significant (Table 4). Tree cover, which was included in all five multivariable models, consistently had a positive association with infection marker positivity. The risk of Mf-positivity increased by 18% (95% CI 9%, 29% p < 0.01) for every 1% increase in tree cover in the inhabited buffer zone. Effect sizes of a similar magnitude were found in the other four infection marker models. The association between rangeland cover and Mf-positivity had the largest effect size of any relationship within the study. The risk of Mf-positivity was estimated to increase by 42% (95% CI 17%, 76% p < 0.01) for every 1% increase in rangeland cover. The Bm33 Ab model was the only Ab model to include rangeland cover, again finding a statistically significant positive relationship with Ab-positivity (5% 95% CI 2%, 8% p < 0.01). The percentage of crop cover within an inhabited buffer zone was included in the Wb123 Ab and Bm14 Ab models however the result in the Wb123 Ab model was not statistically significant. However, the Bm33 Ab model showed a statistically significant negative association between crop cover and Ab-positivity, whereby the risk of Ab-positivity decreased by 22% for every 1% increase in crop cover (95% CI −33%, −11% p < 0.01). Urban cover also had a statistically significant association with Mf-positivity and Ab-positivity. Higher population densities had a negative association with infection marker positivity across all models, especially the Mf model (−12%, 95% CI −19%, −4% p < 0.01) and antigen model (−11% 95% CI −15%, −7% p < 0.01). There was also a negative association between steeper slope gradients and Mf (−9% 95% CI −16%, −2% p 0.02) and antigen (−9% 95% CI −12%, −6% p < 0.01) positivity.
In all models, rainfall showed no significant association with the detection of LF infection markers. NDVI had a very minimal association (RR between <0.0001 and 0.03) with infection marker positivity in all models where it was included (antigen, and Wb123, Bm14 and Bm33 antibodies).

Discussion
This study was the first to investigate associations between environmental variables and LF infection markers in American Samoa. We identified seven environmental variables that were associated with at least one of the five LF infection markers tested in the 2016 community survey in American Samoa. Each of the five infection markers were associated with a unique set of environmental variables, with the effect sizes varying between each marker. However, there were some overall consistencies detected in the results. In the villages included in this study, rangeland, tree cover, and urban cover had the strongest positive association with LF infection markers. Tree cover had a positive association with LF infection marker positivity in all five models. Similarly, higher population densities universally had a negative association with LF infection marker positivity. There is also evidence that slope gradient had a negative association with Mf and antigen positivity.
The associations between environmental variables and LF infection markers were generally consistent with the results found in previous studies. For example, the negative association with slope gradient in this study has been found almost universally in other studies conducted on the African continent [32,50]. However, while some associations were similar to reports from other countries, other findings were contradictory. The negative relationship between high population density and LF infection markers was opposite to the results found in most other studies.
Although the negative association observed with population density could appear to logically conflict with the positive association observed with urban landcover, it should be noted that the urban landcover class is not a direct measure of the level of urbanisation. More accurately, this landcover class represents evidence of the presence of manmade structures such as roads and buildings, which may have since been abandoned. In American Samoa, higher population density generally follows a similar distribution to the urban landcover classification but there are areas of medium-high population density that sit within rural areas. Additionally, large portions of the urban areas have low population densities. Areas with high population densities may have higher quality residential buildings and therefore less exposure to mosquitos. People in these areas might also have had better access during rounds of MDA. These possible confounding factors will require further investigation in the future.
The contrasting results between this study and studies in other locations may be related to the complexity of the dynamic relationships between LF transmission and environmental drivers in different locations. The differences observed between this study and other studies also demonstrate the importance of considering the specific geographical context for this type of analysis. Compared to other study locations, American Samoa is smaller in physical size and population and is characterised by minimal seasonal variability in climatic factors. Additionally, even the most densely populated areas of American Samoa would be considered as low density in most parts of the world.
Our study found that some landcover classes were strongly associated with the presence of LF infection markers. Rangeland produced the largest effect size in the study when analysed for its relationship with Mf-positivity, although the confidence interval was wide. Rangeland includes areas of shrubs, natural fields, and grassland [43]. This type of landcover may provide Ae. polynesiensis with the small puddles of water that it prefers to use as a breeding ground [16]. The tree class was another variable that had an association with infection marker positivity. Dense tree cover may protect mosquitos from fatally high temperatures and intense rainfall. This phenomenon was found in a similar study in Ghana, where mosquitos could survive higher temperatures if they were protected by dense tree coverage [50].
The crop class was included in two infection marker models and had a statistically significant negative relationship with Bm33 Ab-positivity within this study. Crop class was also included in the Wb123 Ab model but did not produce a statistically significant association (Table 4). A study in Burkina Faso found a similar negative relationship between crop cover and LF infections. They attributed it to the large quantities of insecticide use on the crops that temporarily reduces mosquito abundance [51]. However, their study was specifically related to cotton crops, and they suggested that different crop varieties may impact LF prevalence in different ways. Information on crop type was not available for this study.
Slope gradient showed a negative association with Mf-positivity and antigen-positivity, consistent with studies conducted in Nigeria and Ghana [32,50]. It is likely that steeper gradients promote stronger water run-off, which causes damage to mosquito eggs and the pools of water used for breeding grounds, leading to lower mosquito abundance [32]. It is also possible that slope gradient is acting as a proxy variable for elevation, as they were highly correlated in all models. A higher mean slope gradient is generally the result of high mean elevation. W. bancrofti cannot survive at high elevations, generally >600 m, as temperatures are outside of the species' survival ranges [19,32,52]. In the villages sampled in this study, the highest mean elevation was in the inhabited buffer zone of Satala-Anua-Atuu (402 m). This makes it possible for temperature to be a confounding factor but potentially to a lesser extent than areas with mean elevations >600 m. In all models except the Bm14 model, elevation was not included due to multicollinearity with slope gradient.
This study did not include temperature as a variable due to the limited data, and rainfall was removed from the multivariable models due to multicollinearity, ill-fit or lack of statistical significance. Regardless, it is still possible that both rainfall and temperature play an important role in driving LF transmission. Lack of correlation does not always mean a lack of causation and there are strong, well-known biological pathways between temperature, rainfall and LF infection risk. These biologically plausible pathways indicate that further investigation is warranted in American Samoa when higher quality data become available. However, it is possible that the statistical correlation will not adequately represent the real-world dynamics due to the minimal variation in rainfall and temperature.
The main limitation of this study is the shortage of high-resolution datasets that completely cover American Samoa, which meant environmental variables with previously confirmed associations with LF in other parts of the world could not be analysed. The main example was temperature, where elevation had to be used as a proxy due to the limited coverage by the temperature raster layer. There was also no NDVI dataset which completely covers American Samoa. An NDVI layer was created to compensate for this absence, as described in Section 2. However, many LANDSAT images had to be excluded due to cloud coverage >10%, so the NDVI layer was not representative of the potential variation throughout the year. A further potential limitation is that migration and temporary residence were not considered. Previous investigations of travel impact on LF in in American Samoa have however found than recent travel was not related to LF infection [13,15].
This study included five markers as an indication of LF infection: antigen, Mf and three antibodies (Wb123, Bm14 and Bm33). Antigen remains the standard infection marker used in LF surveys but evidence is increasingly showing the value of including antibodies in surveillance [29]. However, there is still only a limited understanding of how antibody patterns change and develop over time [30]. As research into the dynamics of LF antibodies develops, it is possible that future surveys will test for antigens in combination with one or more antibodies [30]. Therefore, an understanding of how each individual infection marker is impacted by environmental factors is likely to prove useful. The challenge posed by interpreting the different positive counts in current TAS and community surveys is ongoing, and research is continuing.
It is outside the scope of this study to use the information on environmental associations to predict LF distribution in unsampled villages. However, it is encouraging to note the consistency between high LF prevalence and the presence of the potential environmental drivers when compared to the results of other studies around the world. In the two sampled villages with the highest LF prevalence, both had low average slope gradients, low population densities and high tree cover.

Conclusions
This study is the first step toward understanding the intricate associations between environmental factors and LF transmission in American Samoa. These results can be used in future spatial analysis of LF distribution, such as predictive risk mapping to identify potential hotspots and inform evidence-based strategies to strengthen surveillance. This type of analysis may also be used to assess the potential impact of climate change on LF transmission. Our models can be refined and adapted based on past, present and predicted future environmental data to understand changes in LF disease distribution as climate change continues to impact infectious disease distribution.