Land Use Effects on Airborne Bacterial Communities Are Evident in Both Near-Surface and Higher-Altitude Air

Land use influences the composition of near-surface airborne bacterial communities, and bacteria can be transported through the atmosphere at global scales. The atmosphere mixes vertically, but rigorously assessing whether the effects of land use on atmospheric communities extend to higher altitudes requires examining communities from multiple altitudes collected at a stable location and timeframe. In this study, we collected near-surface (<2 m) and higher-altitude (150 m) air samples from three sites in an agricultural/developed location and a forested/undeveloped location. We used bacterial 16S rRNA amplicon sequencing to compare communities and predict functionality by altitude. Higher-altitude and near-surface communities did not differ in composition within each location. Communities collected above the undeveloped location were equally variable at both altitudes; higher-altitude samples from the developed location predominantly contained Firmicutes and were less variable than near-surface samples. We also compared airborne taxa to those present in soil and snow. Communities from higher-altitude samples above the developed location contained fewer overlapping taxa with soil and snow sources, and overlapping Operational Taxonomic Units (OTUs) among the three sources differed by location. Our results suggest that land use affects the composition of both near-surface and higher-altitude airborne bacterial communities and, therefore, may influence broad bacterial dispersal patterns. This small-scale pilot study provides a framework for simultaneously examining local and regional airborne microbial communities that can be applied to larger studies or studies using different types of samplers.


Introduction
Atmospheric transport is a major mechanism for microbial dispersal to all global ecosystems, and it is estimated that there are between 10⁵ and 10⁶ cells in each cubic meter of air [1,2]. Due to their small size and low sedimentation rates, microorganisms, which include bacteria, fungi and microeukaryotes, can remain airborne for long periods of time and be transported over intercontinental distances (e.g., [3][4][5][6][7][8]). For example, microorganisms associated with Saharan Desert dust from Algeria were detected ~2250 km north in the Swiss Alps after a 40-h-long deposition event [8]. Even greater distances have been calculated for airborne microbial dispersal between dust sources from China, South Korea and Japan to the west coast of North America after spending 10 days airborne [7]. Comparisons of communities using DNA-based and RNA-based methodologies indicate that some microorganisms are likely viable during atmospheric transport [9][10][11] and act as physical surfaces for condensation and ice nuclei formation in clouds and fog, contributing to the hydrologic cycle [1,12,13]. When aerially dispersed microorganisms deposit on the [...] the surface, or the regional taxonomic pool. Anthropogenic effects on airborne communities and bioaerosol concentrations can be strengthened by seasonality [36,37], though seasonal changes are not always observed [38,39]. For this investigation, we collected samples in the winter because we wanted to focus on a time when anthropogenic influences on airborne bacterial communities are likely at a maximum and bacterial contributions from plants are at a minimum [23]. It is well established that airborne communities can differ by land use; here, we focused on questions related to the effects of land use on community connectivity between altitudes.
Specifically, we addressed three questions: (1) Are there indications that bacterial community composition and selected putative functions differ between near-surface air and higher-altitude air? (2) If so, is there evidence that underlying land use could affect the relationship between near-surface and higher-altitude airborne communities? (3) Does land use development change the connectivity between terrestrial (soil and snow) bacterial communities and airborne communities?

Airborne Bacterial Community Sampling
In winter 2016-2017, we collected airborne samples at three sites near Pellston, in northern Michigan, located in Emmet and Cheboygan counties and at three sites in Kalamazoo County, in southwest Michigan, USA (Figure 1, Tables S1 and S2). We worked with land owners to select sites that had sufficiently large open areas to safely fly each of three 4.8-m diameter helikites for six hours at an altitude of 150 m, which is the maximum altitude allowed by the United States Federal Aviation Administration (FAA) for tethered flight vehicles [40]. We calculated land cover information within 10,000 m of each site using the National Land Cover Database in ArcGIS 10.6.1 [41]. Sites near Pellston, which we call the "undeveloped location", were predominantly surrounded by forested (31.6%) and wetland (18.6%) land covers (Table S1). The sites near Kalamazoo, which we call the "developed location", were predominantly surrounded by agricultural (42.3%) and developed (25.9%) land uses (Table S1). We sampled at the three developed sites (Parkview Campus, Asylum Lake Preserve and Schoolcraft Airstrip) on 23 December 2016. We sampled at two undeveloped sites (University of Michigan Biological Station (UMBS) UVB Field and Robinson Road) on 13 January 2017; due to time constraints, we sampled at the third undeveloped site (Chickagami Trail) on 14 January 2017.
At each of the six sites, we collected an air sample from less than 2 m (suspended from a tripod) and a sample from 150 m (suspended from a tethered helikite) above the ground, using Remote Airborne Microbial Passive (RAMP) samplers, as described in [42]. Examples of the sampling set-up are presented in Figure S1. In brief, this is a remotely operated passive sampler that we designed for use with the helikite. Each sampler weighs under 2.7 kg (6 lbs), which is the payload limit for tethered flight, according to FAA regulations [40]. We decontaminated and prepared the RAMP samplers one day prior to sampling, including a field blank control sampler, which remained closed during the sampling period. We used this field blank to correct for possible contamination during sampler preparation and transport. We hung low-altitude samplers from a tripod at 1.4 m; when open, they extended down to 0.5 m above the ground. We opened all samplers using a remote once they reached their target altitude to prevent contamination from non-target altitudes. During the six-hour sampling period, onboard weather sensors collected air temperature, barometric pressure and humidity data (Table S3). The program code for Arduino operation is publicly available, as described in [42]. We also recorded ground temperature, wind speed and wind direction every hour during sampling.
Once the six-hour sampling was complete, we remotely closed the RAMP samplers and transported them back to the laboratory. We decontaminated the outsides of the closed RAMP samplers using 70% ethanol and a UV irradiation chamber for 10 min. We then removed the collection dishes and placed them in zip-sealing plastic bags at 4 °C overnight until we could complete DNA extraction. We removed and stored collection dishes from the field blank sampler in the same way. At each of the six sites, we also aseptically collected three 10-cm snow samples and one soil sample using a 5-cm diameter corer. We collected soil and snow samples from an undisturbed area near the base of the helikite tether at each site. We stored these samples on ice in zip-sealing plastic bags during transport to the laboratory, where they were stored at −80 °C until DNA extraction could be completed.


DNA Extraction
We extracted DNA from air samples and the field blank control within 24 h of collection using a MoBio PowerWater® DNA Isolation Kit (MoBio Laboratories, Carlsbad, CA, USA) as described in [42]. We extracted DNA from 0.25 g of soil samples using the MoBio PowerSoil® DNA Isolation Kit (MoBio Laboratories, Carlsbad, CA, USA), according to the manufacturer's instructions. We combined and homogenized the three snow samples collected from each site prior to DNA extraction. We melted the snow samples in a sterile beaker and then filtered 40 mL of water through 0.2-µm sterile filters (Whatman, Buckinghamshire, UK) and extracted DNA from the filter using the MoBio PowerWater® DNA Isolation Kit (MoBio Laboratories, Carlsbad, CA, USA) with the amended protocol described in [42]. For all extractions, we included an extraction control to identify any contaminants introduced during the extraction or sequencing process. DNA extracts were stored at −80 °C until sequencing could be completed.

DNA Sequencing and Bioinformatics
Amplicon preparation and next-generation sequencing were performed at Michigan State University's Genomic Core-Research Technology Support Facility. The V4 hypervariable region of the bacterial 16S rRNA gene was amplified using primers 515F and 806R [43] and examined on agarose gel. Samples were normalized and pooled into single wells; then, MiSeq sequencing was performed (Illumina, San Diego, CA, USA). We used PANDAseq version 2.8 [44] to remove all sequences that were shorter than 247 bp or longer than 275 bp from the dataset and to combine forward and reverse reads. We used QIIME v.1.9.1 [43] with the vsearch algorithm version 2.4.3 [45] to remove chimeras and cluster Operational Taxonomic Units (OTUs), as described in [46]. We classified OTUs using the Silva version 128 database [47]. We removed sequences identified as mitochondria, chloroplasts, Archaea and all sequences identified as Ralstonia, which are common contaminants in DNA extraction kits [48]. We removed any OTUs identified in extraction controls from the respective samples (air, snow or soil). For air samples only, we removed any OTUs identified in the field blank control boxes that were more abundant in the control than on average across the six air samples, as described in [42]. For the six developed-location air samples, the number of sequences per sample ranged from 3430 to 78,959, so we rarefied all samples to 3430 for within-site air community comparisons, as recommended in [49]. For the six undeveloped-location air samples, the number of sequences ranged from 10,503 to 56,460, so we rarefied all samples to 10,503 for within-site air community comparisons. Collector's curves indicate that the air communities were sampled sufficiently (Figure S2).
For analyses comparing soil, snow and air samples, we did not rarefy sequences because we wanted to compare complete taxonomic overlap of OTUs; one snow sample (from the Parkview site) was not included in this analysis due to low sequencing success. The sequence reads are publicly available in the National Center for Biotechnology Information Sequence Read Archive (NCBI SRA, PRJNA627456).
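The rarefaction step described above can be illustrated with a minimal sketch. It is not the QIIME routine used in this study; it simply assumes that rarefying means drawing reads at random without replacement until every sample reaches the depth of the smallest sample. The function name `rarefy`, the sample names and all counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from an OTU count table."""
    total = sum(counts.values())
    if total < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Expand the counts into one label per read, then draw without replacement.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical read counts for three air samples (not data from this study).
samples = {
    "site_A_near_surface": {"OTU_1": 40, "OTU_2": 25, "OTU_3": 5},
    "site_A_higher_altitude": {"OTU_1": 10, "OTU_2": 30, "OTU_4": 10},
    "site_B_near_surface": {"OTU_2": 60, "OTU_3": 20, "OTU_5": 20},
}

# Rarefy every sample to the depth of the smallest sample.
depth = min(sum(c.values()) for c in samples.values())
rarefied = {name: rarefy(c, depth) for name, c in samples.items()}
```

After this step every sample holds the same number of reads, so richness and distance comparisons are not driven by sequencing depth alone.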
To explore predicted functional data, we identified OTUs using the Greengenes version 13.8 reference database [50] and then used Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) version 1.1.3 to determine predicted Kyoto Encyclopedia of Genes and Genomes (KEGG) 1-3 pathways [51,52]. We targeted a predicted pathway analysis to those pathways related to bacterial DNA repair and sporulation to address the question of whether altitude-related stress impacts DNA repair and sporulation.

Land Cover and Statistical Analyses
We calculated backwards trajectories of air masses (Figure 1) using the online version of the National Oceanic and Atmospheric Administration (NOAA) Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model using the Global Data Assimilation System (GDAS1) database [53]. For each day of sampling, we calculated a 6-h backwards trajectory for both 2 and 150 m above ground level (AGL), to correspond with the exact time period and altitude of each sampling effort. We imported shapefiles into ArcGIS (10.6.1) and created a 10,000-m buffer surrounding each trajectory. We calculated trajectories from a central point between all sites at each location and chose 10,000 m as the buffer because it encompassed all sites at both locations. We used the National Land Cover Database within each of these buffer areas to determine land cover [41]. Table S1 shows land use percentages for each of the trajectory buffers.
The study design tests whether airborne communities differ between near-surface and higher-altitude air at two locations with different land uses and whether location influences airborne community variability and connectivity with snow and soil bacterial communities. To address question 1, we used QIIME version 1.9.1 to calculate weighted UniFrac distances between near-surface and higher-altitude air samples collected at each location [54]. We used the "vegan" package [55] in R version 3.5.1 to test for differences between airborne communities by altitude using regression-based models implemented with envfit. We visualized community similarity among samples using separate principal coordinates analysis (PCoA) ordinations for each location. We calculated OTU richness and Shannon diversity for all air samples and compared these metrics by altitude. To examine differences in functionality by altitude, we compared the relative abundances of predicted pathways for sporulation, nucleotide excision repair, and DNA repair and recombination. We calculated the relative abundances of bacterial orders present at each altitude and conducted Wilcoxon rank-sum tests to describe differences in altitude overlap patterns in the developed and undeveloped locations. We also used similarity percentage (SIMPER) to determine the most influential OTUs driving differences between altitudes in each location [56]. For all parametric statistical approaches, we tested datasets for normality and skewness and transformed them as appropriate. Significance was assessed at α = 0.05.
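As an illustration of the two alpha-diversity metrics named above, the following sketch computes OTU richness and Shannon diversity from a table of read counts. It assumes the natural logarithm for the Shannon index, which the text does not specify (some pipelines report base 2), and the function names and counts are hypothetical.

```python
import math


def otu_richness(counts):
    """Number of OTUs observed at least once in a sample."""
    return sum(1 for n in counts.values() if n > 0)


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over observed OTUs (natural log assumed)."""
    total = sum(counts.values())
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


# Hypothetical samples: an even community and one dominated by a single OTU.
even_sample = {"OTU_1": 25, "OTU_2": 25, "OTU_3": 25, "OTU_4": 25}
skewed_sample = {"OTU_1": 97, "OTU_2": 1, "OTU_3": 1, "OTU_4": 1}

even_h = shannon_diversity(even_sample)
skewed_h = shannon_diversity(skewed_sample)
```

Both hypothetical samples have the same richness, but the even community has the higher Shannon index, which is why the two metrics are reported side by side.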
To address question 2, we calculated the average pairwise distances between the ordination points representing the three near-surface samples and the three high-altitude samples at each location. We compared these averages to the average of all near-surface-to-high-altitude pairwise distances using analysis of variance (ANOVA). We also compared pairwise distances of near-surface and high-altitude samples that were co-located at individual sites within a location to pairwise distances of samples that were not co-located. We used Welch's t-test to determine whether average distances were affected by co-location.
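The co-location comparison can be sketched as follows, under stated assumptions: distances are taken as Euclidean distances between two-axis ordination coordinates, and only the Welch t statistic and its Welch-Satterthwaite degrees of freedom are computed (a p-value would need a t-distribution from a statistics library). The coordinates, site labels and function names are hypothetical and are not data from this study.

```python
import math
from statistics import mean, variance


def euclidean(p, q):
    """Euclidean distance between two ordination points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical PCoA coordinates keyed by (site, altitude).
points = {
    ("site_1", "near"): (0.10, 0.05), ("site_1", "high"): (0.12, 0.02),
    ("site_2", "near"): (-0.20, 0.10), ("site_2", "high"): (-0.15, 0.12),
    ("site_3", "near"): (0.05, -0.15), ("site_3", "high"): (0.03, -0.10),
}

# Split near-surface-to-high-altitude distances by whether the pair shares a site.
co_located, not_co_located = [], []
for (site_a, alt_a), p in points.items():
    for (site_b, alt_b), q in points.items():
        if alt_a == "near" and alt_b == "high":
            d = euclidean(p, q)
            (co_located if site_a == site_b else not_co_located).append(d)

t_stat, df = welch_t(co_located, not_co_located)
```

With three sites per location this yields three co-located and six non-co-located distances; a negative t statistic indicates that co-located pairs sit closer together in the ordination.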
To address question 3, we calculated the number of overlapping OTUs present in near-surface air, high-altitude air, snow and soil datasets. We represented these results with Venn diagrams, using the "VennDiagram" package (version 1.6.0) in R [57]. We considered an OTU as overlapping if, after quality filtering steps, it was present at least one time in at least one replicate from at least two of the four sources. For overlap results, only presence/absence of OTUs was considered. We also calculated the relative abundances of bacterial families present in each sample source and used Wilcoxon rank-sum tests to describe differences in connectivity between sources at each location.
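The overlap criterion above reduces to a presence/absence count across sources. The sketch below assumes each source has already been collapsed to the set of OTUs detected in any of its replicates; the function name and OTU labels are hypothetical.

```python
from collections import Counter


def overlapping_otus(otus_by_source, min_sources=2):
    """Return OTUs present in at least `min_sources` of the given sources."""
    presence = Counter()
    for otus in otus_by_source.values():
        # Presence/absence only: each source contributes at most once per OTU.
        presence.update(set(otus))
    return {otu for otu, n in presence.items() if n >= min_sources}


# Hypothetical OTU sets for the four sample sources.
otus_by_source = {
    "soil": {"OTU_1", "OTU_2", "OTU_3"},
    "snow": {"OTU_2", "OTU_3", "OTU_4"},
    "near_surface_air": {"OTU_3", "OTU_5"},
    "higher_altitude_air": {"OTU_3", "OTU_6"},
}

shared = overlapping_otus(otus_by_source)
shared_by_all = overlapping_otus(otus_by_source, min_sources=4)
```

Raising `min_sources` to four isolates the OTUs found in every source, which corresponds to the central region of a four-set Venn diagram.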

Considerations of the Experimental Approach
Our study approach is unique because it allowed us to sample at stable positions at three replicate sites and altitudes concurrently within the same air mass. We needed to take several considerations into account to test this approach. In accordance with United States FAA regulations, the tethered helikites we used to collect our samples could only be flown to a maximum altitude of 150 m. In addition, helikite sampling requires that sites have sufficient clearance to tether and launch a 4.8-m diameter helikite, that sites are at least 9.65 km from all airports, that sampling days have low wind and no precipitation and that field crew are sufficiently trained for safe operation [40]. A crew of three people is required to launch and maintain one helikite during sampling, so in order to complete this study, we trained nine team members.
Though our experiment was necessarily limited in scale, we took several experimental design and sampling considerations into account. First, in order to complete sampling in both a developed and an undeveloped location, we needed to sample two different air masses because the locations are geographically distant. We also needed to sample on different dates at the different locations due to time, budget and personnel constraints, so development, location and sampling date are not independent. To correct for this, we focused our statistical analyses on within-location/date groups only instead of comparisons by location. We assumed that we were testing for an effect of land use instead of an effect of biome or soil type. Both Kalamazoo and Pellston are located in the "deciduous broadleaf forest" biome [58]. Soil types differ between the two sites: near Kalamazoo, soils are typically Alfisols or Mollisols, whereas near Pellston, they are typically Spodosols or Entisols. However, soil bacterial community composition did not differ by location (PERMANOVA, p = 0.22).
Second, on the dates we sampled, the wind traveled from the northwest to the undeveloped sites we sampled (Figure 1). While this is a common wind direction experienced in Pellston all year round, winds predominantly travel from the west, where the air mass would also pass over forest and the northern part of Lake Michigan [59]. On the date we sampled in the developed location, the wind traveled from the south (Figure 1). Wind travels from the south for approximately one-third of the year in Kalamazoo; the rest of the year, wind travels from the west or southwest [60]. In either case, the predominant land use that these air masses would travel over is agricultural, though a full west wind would also pass over the lower part of Lake Michigan, which could affect results. Wind direction could influence the results, leading to day-to-day variations in airborne bacterial community composition and potential differences in atmospheric-terrestrial connectivity [28,61].
Third, until recently, scientists relied solely on culture-dependent methods for detecting airborne microbial diversity, but these methods reveal only a small fraction of microorganisms in the atmosphere and underestimate the volume and diversity of airborne bacteria [62]. Next-generation sequencing allows us to analyze large quantities of environmental samples with precision [63]. However, air samples often contain low microbial biomass as compared to other environments such as soil and marine habitats, making it difficult to perform next-generation approaches [64]. Powered bioaerosol samplers provide the opportunity to increase biomass collection as well as assess particulate fractions and concentrations. They are a great option when the investigator can be close at hand. However, many bioaerosol samplers are heavier than the 2.7-kg (6 lbs) FAA limitation for the weight of a payload attached to a tethered balloon [40]. Several popular bioaerosol samplers exceed this limit, such as Coriolis µ air samplers (Bertin Instruments, Montigny-le-Bretonneux, France) and some liquid impinger samplers that require vacuum filtration. We found that smaller powered samplers such as the Button Aerosol Sampler (SKC, Inc., Eighty Four, PA, USA) are difficult to turn on and off remotely, which is necessary to avoid collecting samples from non-target altitudes. Their battery life is also inconsistent at cold temperatures (personal observations). We did not test the Air Ideal sampler (bioMérieux Inc., Durham, NC, USA), which is also light enough for this application. For the purposes of our study, a passive sampler was the best option to pair with additional weather sensors for temperature, humidity and pressure using the tethered helikite approach, and we collected sufficient biomass to complete next-generation sequencing with these samplers.
Fourth, even with sufficient biomass, it is still a challenge to detect which microorganisms are active in the atmosphere. Atmospheric samples contain diverse communities, with a high number of rare members [6,9,65,66]. Rare members of bacterial communities can be disproportionately active community members, so sequencing to a depth which reaches saturation is exceedingly important for bacterial research [4,9,67]. Our collector's curve indicates that we completely sampled airborne communities despite low biomass yields, but because sequencing success was variable from sample to sample, we rarefied all datasets for more accurate comparisons (Figure S2). The DNA-based approach we used provides a snapshot of the bacterial community, but further studies using transcriptomics or proteomics approaches can address whether community members are dormant or active.
Fifth, few bioaerosol studies discuss contamination control procedures, which are critically important when examining low-yield biomass samples [34]. The passive samplers we used avoid contamination from powered aircraft and only sample at target altitudes. In addition, our unique use of a field blank sampler allowed us to correct for contaminant OTUs, in addition to negative controls in our DNA extractions. Field blanks are common controls used in biogeochemical studies but are rarely used in microbial ecology. We recommend the use of a field blank for all airborne microbial community studies since it can reveal spurious OTUs or flaws in the sampling decontamination procedure.
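A minimal sketch of the field blank correction is given below. It follows the rule stated in the methods (remove OTUs that were more abundant in the field blank than on average across the air samples), but it assumes the comparison is made on raw read counts; the exact procedure is described in [42] and may differ. Function names and counts are hypothetical.

```python
def field_blank_contaminants(blank_counts, air_samples):
    """OTUs more abundant in the field blank than on average across air samples."""
    contaminants = set()
    for otu, blank_n in blank_counts.items():
        # Mean count across all air samples; an absent OTU counts as zero.
        mean_air = sum(s.get(otu, 0) for s in air_samples) / len(air_samples)
        if blank_n > mean_air:
            contaminants.add(otu)
    return contaminants


def remove_otus(sample, otus):
    """Return a copy of the sample without the flagged OTUs."""
    return {otu: n for otu, n in sample.items() if otu not in otus}


# Hypothetical read counts (not data from this study).
blank = {"OTU_1": 30, "OTU_2": 2}
air = [
    {"OTU_1": 5, "OTU_2": 40},
    {"OTU_1": 10, "OTU_2": 60, "OTU_3": 7},
]

flagged = field_blank_contaminants(blank, air)
cleaned = [remove_otus(s, flagged) for s in air]
```

In this example OTU_1 is flagged because the blank holds more reads than the air-sample mean, whereas OTU_2 is retained because it is far more abundant in air than in the blank.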

Airborne Community Results by Location
Specific percentages of land uses over which the air masses traveled are shown in Table S3. When all samples were considered together, land use explained 57% of the variation in airborne bacterial communities (p = 0.043). Altitude only explained eight percent of the variation in airborne communities, which was not significant (p = 0.701, Figure S3). OTU richness was significantly higher (p = 0.05) at the undeveloped location (635 ± 299) than at the developed location (251 ± 52). Phylogenetic diversity was marginally higher (p = 0.09) at the undeveloped location (30.05 ± 12.68) than at the developed location (16.60 ± 2.12). The five most abundant bacterial orders across all twelve air samples were Clostridiales (17%), Rhizobiales (11%), Actinomycetales (10%), Sphingomonadales (9%) and Bacillales (5%). Sphingomonadales (Wilcoxon p = 0.019) and Pasteurellales (Wilcoxon p = 0.030) were significantly more abundant in samples collected at the undeveloped location than at the developed location (Figure 2). Other major orders did not significantly differ between the two locations; however, some prominent orders (Bacillales, Caulobacterales and Flavobacteriales) were more abundant at the undeveloped location, though not significantly (Figure 2). At the family level, Sphingomonadaceae, Methylocystaceae and Pasteurellaceae were more abundant at the undeveloped location and Lachnospiraceae and Lactobacillaceae were significantly more abundant at the developed location. Average relative abundances for all bacterial orders are shown in Figure 3. Since it is well-established that airborne communities can differ by land use, the remainder of our results focus on differential patterns within each land-use type (i.e., undeveloped and developed).

Effects of Altitude
Meteorological conditions and wind speeds were similar on the days of the sampling events at both locations (12.95 km h⁻¹ at undeveloped and 11.10 km h⁻¹ at developed). Within each location, separate patterns emerged. In the undeveloped location, near-surface bacterial communities were similar to higher-altitude communities (altitude, R² = 0.0199, Figure 4A). Near-surface and higher-altitude ordination points were equally dispersed, and neither of these averages differed from the pairwise distances between randomized cross-altitude pairs (p = 0.41). However, near-surface and higher-altitude samples collected from the same undeveloped sites were more similar to each other than cross-site pairs (p = 0.05). This suggests that local site effects influenced the composition of both near-surface and higher-altitude air samples at the undeveloped location. OTU richness and phylogenetic diversity did not differ between altitudes (p = 0.63 and 0.67, respectively), and the relative abundances of predicted functional pathways did not differ (sporulation: p = 1.0; nucleotide excision repair: p = 0.7; DNA repair and recombination: p = 0.7). Taxa within the orders Actinomycetales, Rhizobiales and Sphingomonadales were shared in similar abundances between the two altitudes (Figure 2). OTUs that contributed most to differences in airborne communities were predominantly within the orders Bacillales, Actinomycetales and Sphingomonadales, which were more abundant at higher altitudes. Meanwhile, Bacteroidales, Pasteurellales and other Sphingomonadales OTUs were more abundant at lower altitudes (Table S4).

At the developed location, altitude explained 67% of the variation in bacterial communities. Though this was not statistically significant (p = 0.2), it is 33 times the explanatory power of altitude at the undeveloped location, suggesting that different mechanisms may be at play in the two land use types (Figure 4B). Higher-altitude communities were more similar to each other than were near-surface communities or random cross-altitude pairs (p = 0.02), indicating community homogeneity across the high-altitude samples. However, samples collected from different altitudes at the same site were no more similar to each other than by chance (p = 0.84). This suggests that local site sources distinguish near-surface air in the developed location just as they do in the undeveloped location. However, homogeneous communities in higher-altitude air overwhelm these local site effects in the regional taxonomic pool. OTU richness and phylogenetic diversity did not differ between altitudes (p = 0.83 and 0.82, respectively). Taxa within the orders Clostridiales, Actinomycetales and Bacteroidales were shared in similar abundances at both altitudes (Figure 2). OTUs that distinguished higher-altitude samples were predominantly within Firmicutes (Clostridiaceae, Lactobacillaceae, Lachnospiraceae and Aerococcaceae). Conversely, OTUs that distinguished the near-surface communities were within the families Mycoplasmataceae, Xanthomonadaceae and Geodermatophilaceae (Table S4). Predicted pathways for sporulation were marginally more abundant in the higher-altitude samples than in the near-surface samples (p = 0.1), as expected with higher abundances of Firmicutes. As in the undeveloped location, relative abundances of predicted pathways for DNA repair did not differ by altitude (nucleotide excision repair: p = 1.0; DNA repair and recombination: p = 1.0).

Connectivity between Sample Sources
Our community results suggest that bacterial community connectivity is disrupted between near-surface and higher-altitude air at developed sites. To examine this further, we compared communities collected from atmospheric and terrestrial sources from each At the developed location, altitude explained 67% of the variation in bacterial communities. Though this was not statistically significant (p = 0.2), it is 33 times the explanatory power of altitude at the undeveloped location, suggesting that different mechanisms may be at play in the two land use types ( Figure 4B). Higher-altitude communities were more similar to each other than near-surface communities or to random cross-altitude pairs (p = 0.02), indicating community homogeneity across the high-altitude samples. However, samples collected from different altitudes at the same site were no more similar to each other than by chance (p = 0.84). This suggests that local site sources distinguish near-surface air in the developed location just as they do in the undeveloped location. However, homogeneous communities in higher-altitude air overwhelm these local site effects in the regional taxonomic pool. OTU richness and phylogenetic diversity did not differ between altitudes (p = 0.83 and 0.82, respectively). Taxa within the orders Clostridiales, Actinomycetales and Bacteroidales were shared in similar abundances at both altitudes ( Figure 2). OTUs that distinguished high altitude samples were predominantly within Firmicutes (Clostridiaceae, Lactobacillaceae, Lachnospiraceae and Aerococcaceae). Conversely, OTUs that distinguished the near-surface communities were within the families Mycoplasmataceae, Xanthomonadaceae and Geodermatophilaceae (Table S4). Predicted pathways for sporulation were marginally more abundant in the higher-altitude samples than in the near-surface samples (p = 0.1), as expected with higher abundances of Firmicutes. 
As in the undeveloped site, relative abundances of predicted pathways for DNA repair did not differ by altitude (nucleotide excision repair: p = 1.0; DNA repair and recombination: p = 1.0).
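The share of community variation attributed to altitude is the kind of quantity a PERMANOVA reports as R². The following is a minimal sketch of how such a value and its permutation p-value can be computed from an OTU count table. It assumes Bray-Curtis dissimilarities and a single grouping factor; neither the distance metric nor the software used is stated in this section, and the function names and toy counts are hypothetical rather than the study's data.

```python
import numpy as np

def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between rows of a samples x OTUs table."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(counts[i] - counts[j]).sum()
            den = (counts[i] + counts[j]).sum()
            dist[i, j] = dist[j, i] = num / den
    return dist

def permanova_r2(dist, groups):
    """R^2 = SS_among / SS_total for one factor, from squared pairwise distances."""
    groups = np.asarray(groups)
    n = len(groups)
    sq = np.asarray(dist) ** 2
    ss_total = sq[np.triu_indices(n, k=1)].sum() / n
    ss_within = 0.0
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        sub = sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    return (ss_total - ss_within) / ss_total

def permutation_p(dist, groups, n_perm=999, seed=0):
    """Observed R^2 and the fraction of label shuffles with R^2 at least as large."""
    rng = np.random.default_rng(seed)
    observed = permanova_r2(dist, groups)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(groups)
        # Small tolerance so arrangements equivalent to the observed one count as ties.
        if permanova_r2(dist, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy table: two near-surface and two higher-altitude samples, three OTUs.
toy_counts = [[10, 0, 5], [8, 1, 6], [0, 12, 3], [1, 10, 4]]
toy_groups = ["near", "near", "high", "high"]
toy_dist = bray_curtis(toy_counts)
toy_r2, toy_p = permutation_p(toy_dist, toy_groups, n_perm=99)
print(f"R2 = {toy_r2:.3f}, p = {toy_p:.2f}")
```

With only a handful of samples per group, the number of distinct label arrangements is small, so the attainable p-values are coarse. Two groups of three samples, for example, admit only 10 distinct partitions, which is worth keeping in mind when reading a large R² alongside a non-significant p-value.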

Connectivity between Sample Sources
Our community results suggest that bacterial community connectivity is disrupted between near-surface and higher-altitude air at developed sites. To examine this further, we compared communities collected from atmospheric and terrestrial sources from each of four sample provenances (soil, snow, near-surface air and higher-altitude air). Communities in all four groups differed from each other at both sampling locations (PERMANOVA: undeveloped, p = 0.01; developed, p = 0.001). For comparison between communities collected from soil, snow and air only, we used unrarefied sequence data from two sites near the developed location (Asylum Lake Preserve and Schoolcraft Airstrip) and two sites near the undeveloped location (Robinson Road and UMBS UVB Field). The snow sample from Parkview (developed location) had low sequencing success and we excluded it from the analysis. Since we sampled Chickagami Trail (undeveloped location) on a different day than other sites from this location, we excluded these samples from the source overlap analysis. We did this to avoid the addition of potentially confounding variation by date and to balance our design.
At the undeveloped location, more OTUs overlapped between sample provenances than at the developed location (Figure 5A), driven by a lower overlap between terrestrial sources and higher-altitude air above developed land uses. At the undeveloped location, near-surface air communities shared 5.3% of OTUs with soil communities, 10.3% with snow communities and 16.3% with OTUs that overlapped between soil and snow communities, for a total of 31.9%. Higher-altitude air communities shared a comparable 6.4% of OTUs with soil communities, 11.4% with snow and 14.4% with OTUs that overlapped between soil and snow communities, for a total of 32.3%. In total, 0.6% of OTUs overlapped between all four sample provenances in the undeveloped location. In contrast, near-surface air communities at the developed location shared 13.4% of OTUs with soil communities, 10.4% with snow communities and 7.9% with OTUs that overlapped between soil and snow communities, for a total of 31.7% (Figure 5B). Higher-altitude communities at the developed location shared 7% of OTUs with soil communities, 4.5% with snow communities and 3.5% with OTUs that overlapped between soil and snow communities, for a total of 15.5%. In total, 0.1% of OTUs overlapped between all four sample provenances in the developed location. Near-surface and higher-altitude communities at the undeveloped location and the near-surface communities at the developed location each carried 50-53% unique OTUs not shared by any other provenance. In contrast, higher-altitude communities from the developed location carried 68% unshared OTUs.
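The overlap percentages above partition each air community's OTUs according to where else those OTUs occur. A minimal sketch of that bookkeeping follows. It assumes each provenance is represented as a set of OTU identifiers; the identifiers and the helper name `overlap_breakdown` are hypothetical, and the study's actual pipeline is not described in this section.

```python
def overlap_breakdown(air, soil, snow):
    """Percent of the OTUs in `air` that also occur in soil only, snow only, or both."""
    air, soil, snow = set(air), set(soil), set(snow)
    if not air:
        raise ValueError("air community contains no OTUs")

    soil_only = (air & soil) - snow      # shared with soil but not snow
    snow_only = (air & snow) - soil      # shared with snow but not soil
    soil_and_snow = air & soil & snow    # shared with both terrestrial sources
    any_terrestrial = air & (soil | snow)

    def pct(subset):
        return 100.0 * len(subset) / len(air)

    return {
        "soil_only": pct(soil_only),
        "snow_only": pct(snow_only),
        "soil_and_snow": pct(soil_and_snow),
        "total": pct(any_terrestrial),
    }

# Hypothetical toy communities: ten air OTUs, a few of them also found on the ground.
toy_air = {f"otu{i}" for i in range(1, 11)}
toy_soil = {"otu1", "otu2", "otu3", "soil_x"}
toy_snow = {"otu3", "otu4", "snow_y"}
print(overlap_breakdown(toy_air, toy_soil, toy_snow))
```

The three categories are mutually exclusive, so they sum to the total, in the same way that 5.3%, 10.3% and 16.3% sum to 31.9% for near-surface air at the undeveloped location.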
The relative abundances of the predominant orders in air, soil and snow samples are presented in Figure 3 for each location. Seventeen bacterial orders were present in all sample provenances from the undeveloped location, whereas only seven were present in all provenances from the developed location (Figure 5). We examined the order-level composition of the taxa shared by all four provenances in each location. Rhizobiales and Bacillales were abundant in both the undeveloped and developed locations (totals of 18.2% and 4.1%, respectively). However, Sphingomonadales contributed most to the shared community (42%) in the undeveloped location, while Pseudomonadales dominated the shared community (67%) at the developed location (Figure 5).
We explored this further using a heatmap approach to examine which families were shared most frequently in all provenances (Figure 6). In the undeveloped location, OTUs within Sphingomonadaceae were shared in high abundances among all four sample provenances. Acetobacteraceae were also present in all provenances, but were most abundant in snow and soil, and then decreased in abundance from lower- to higher-altitude air. Many families were disproportionately abundant in soil and snow, indicating a potential lack of airborne dispersal in their life histories, such as Chitinophagaceae, Cytophagaceae, Chthoniobacteraceae and Acidobacteriaceae. Conversely, OTUs within Pasteurellaceae and Microbacteriaceae were more abundant in airborne samples but were either absent or in low abundance in soil and snow. There were several similarities at the developed location. Sphingomonadaceae were also shared by all sample provenances at the developed sites, and the same four families were disproportionately present in soil and snow. In contrast, Clostridiaceae were most abundant in higher-altitude air samples and Rhizobiaceae, Xanthomonadaceae and Oxalobacteraceae were abundant in snow samples only. While it is not possible to determine whether these communities act as sources or sinks, our results provide evidence that land use affects connectivity between soil, snow and airborne bacterial communities, notably at higher altitudes. Homogeneous, higher-altitude communities that traveled over developed land carried more OTUs representative of anthropogenic sources and exhibited less overlap with near-surface atmospheric and terrestrial communities.
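A family-level heatmap of this kind rests on a table of relative abundances per provenance. As a rough sketch of how such a table could be assembled, the following sums OTU counts by family within each provenance and converts them to fractions. The data layout, the helper name `family_relative_abundance` and the toy counts are assumptions for illustration only, not the study's data or pipeline.

```python
from collections import defaultdict

def family_relative_abundance(otu_counts, otu_family):
    """Collapse OTU counts to family-level fractions within each provenance.

    otu_counts: {provenance: {otu_id: read count}}
    otu_family: {otu_id: family name}; unmapped OTUs are labelled "Unassigned".
    """
    table = {}
    for provenance, counts in otu_counts.items():
        total = sum(counts.values())
        by_family = defaultdict(float)
        for otu, count in counts.items():
            by_family[otu_family.get(otu, "Unassigned")] += count
        table[provenance] = {
            family: (reads / total if total else 0.0)
            for family, reads in by_family.items()
        }
    return table

# Hypothetical toy input: two provenances, three OTUs.
toy_counts = {
    "soil": {"otu1": 30, "otu2": 10},
    "higher_altitude_air": {"otu1": 5, "otu3": 15},
}
toy_families = {
    "otu1": "Sphingomonadaceae",
    "otu2": "Chitinophagaceae",
    "otu3": "Clostridiaceae",
}
for provenance, row in family_relative_abundance(toy_counts, toy_families).items():
    print(provenance, row)
```

Each row of the resulting table sums to 1, so families can be compared across provenances regardless of differences in sequencing depth.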

Other families such as Clostridiaceae, Bacillaceae, Lachnospiraceae and Ruminococcaceae were present in communities from all provenances, but were more abundant in air samples.

Discussion
The results of our pilot study suggest that the effects of land use development extend to the composition of higher-altitude airborne bacterial communities, and that development also influences the connectivity between airborne bacterial communities and terrestrial sources. The approach we employed provides a framework for describing microbial diversity and dispersal in terms of local and regional taxonomic pools. While small in scale, our results suggest that the land use over which an air mass travels can influence not only the composition of airborne bacterial communities at multiple altitudes, but also the local connectivity between atmospheric and terrestrial bacterial communities. Near-surface air samples over both land uses were affected by local factors that distinguished communities from site to site. Higher-altitude air that traveled over developed land carried bacterial communities that were more homogeneous across the three sampled sites, with higher relative abundances of taxa that are likely anthropogenically associated. Higher-altitude air that traveled over undeveloped land carried more heterogeneous and diverse communities with naturally associated taxa. Taken together, our results suggest that development influences the types of bacterial community connectivity that exist between near-surface air, higher-altitude air and terrestrial bacterial sources.
All near-surface airborne bacterial communities in our study carried local, site-specific signatures, regardless of land use. As a result, near-surface communities at either location were no more similar to each other than they were to any higher-altitude communities. This phenomenon has been demonstrated previously with other near-surface air samples [27,28].
Our results indicate that local site effects likely extended to higher altitudes only at the undeveloped location, though further replication and sampling in multiple air masses are necessary to confirm this result. At the developed location, higher-altitude air carried a homogeneous community dominated by anthropogenically associated taxa, suggesting connectivity across a regional, instead of local, scale. A proposed mechanism is depicted in the conceptual diagram in Figure S4.
At the undeveloped sampling sites, the air mass we sampled passed over a heterogeneous landscape consisting of open water, forests, wetlands and few human developments. Both near-surface and higher-altitude bacterial communities were equally heterogeneous, and local effects influenced site-to-site variation at both altitudes. As a result, near-surface and higher-altitude communities collected from the same site were more similar in composition than samples collected from different sites at the same altitude. Furthermore, nearly half of the OTUs present in the air samples at either altitude were shared with snow or soil bacterial communities. The predominant OTUs shared between air, soil and snow belonged to Sphingomonadales and Rhizobiales, many members of which are plant-associated taxa or are found in freshwater and soil environments (e.g., Sphingomonas [58]; Microbacteriaceae [68]; Methylocystaceae [69]).
In contrast, the air masses we sampled in the developed location passed over predominantly agricultural land. Here, near-surface airborne communities were also distinctive by site, but higher-altitude communities were more homogeneous than higher-altitude communities collected at the undeveloped location. In particular, high abundances of spore-forming Firmicutes within the order Clostridiales (e.g., Lachnospiraceae, Clostridiaceae and Ruminococcaceae) present in our higher-altitude samples, as well as those collected in the upper troposphere by others, suggest a large contribution from anthropogenic sources to the regional taxonomic pool, since taxa within these groups are associated with agriculture, livestock and wastewater [34]. Taxa within Firmicutes also predominated near-surface communities at our developed location, as has been demonstrated previously for developed sites located in the Midwestern United States [24,27,36]. However, their prevalence in higher-altitude air suggests that the mechanism for this phenomenon is likely regional transport from agricultural areas in higher-altitude air. It is probable that when this higher-altitude air mixes with near-surface air, anthropogenically associated Firmicutes are introduced at lower altitudes.
Several other studies that have investigated bacterial communities at much higher altitudes have done so over broad geographic scales and using powered aircraft, making the effects of specific local land uses difficult to parse out [10][33][34][35]. In support of our results, Smith and colleagues found that airborne bacterial communities collected at 0.3 to 12 km in altitude over the western United States were similar to the samples we collected at 150 m above the developed sites in this study [34]. These high-altitude samples carried indicators of agriculture, livestock feedlots and wastewater, such as Lachnospiraceae and Ruminococcaceae. The 12-km samples contained higher relative abundances of Clostridiaceae than ground-level air samples [34]. Because of this, we expect that the influence of land use we describe extends to higher altitudes, though it is not possible to conduct this test in the United States with tethered helikite-based sampling due to FAA restrictions [40]. Interestingly, there were no altitudinal differences in the predicted functional pathways for DNA repair that we targeted, though other studies have suggested that UV irradiation and low atmospheric pressure may act as important habitat filters and/or mutagens in samples collected at higher altitudes [35][70][71][72]. We expect that the samples we collected from 150 m were not sufficiently high to detect the effects of UV irradiation and other extreme conditions, which may be more of a consideration in the stratosphere.
The mechanism for atmospheric-terrestrial connectivity between bacterial communities may also be affected by development. While near-surface air shared a similar percentage of OTUs with soil and snow at both the developed and undeveloped locations, over two-thirds of the OTUs carried by higher-altitude air that passed over developed land were not shared with any local soil, snow or near-surface air communities. In contrast to the undeveloped location, the predominant OTUs shared in the developed location were members of Pseudomonadales, which is a ubiquitous group associated with humans and other animals, food spoilage, wastewater, soil and plants. The similarity between communities collected at higher altitudes above the developed location was over two times greater than above the undeveloped location. Most of the developed area over which this air mass traveled is agricultural, and it is well known that agricultural development can lead to convergence of many ecosystem parameters, including plant communities, soil microbial communities and biogeochemical patterns [73,74]. While our study is small in scale, our results provide evidence that agricultural convergence effects extend to higher-altitude air masses but are not obvious near the surface. While further replication is necessary to confirm this, our data help clarify the observation made by Barberán and colleagues that dust-deposited microbial communities exhibit patterns of development-related homogenization at a continental scale [24]. It is likely that the homogeneous regional taxonomic pool from higher altitudes facilitates this pattern, rather than site-specific local pools.
Our results indicate that both near-surface and higher-altitude airborne communities differed by the land use over which the air mass traveled. Furthermore, higher-altitude communities may either be strongly connected to local sources or they may be more connected to regional sources. There are many factors that could cause these patterns. For example, taller trees in an undeveloped forested location could disrupt predominant winds and lead to greater mixing between near-surface and higher-altitude air. A higher bacterial load is also more likely to be present in air samples at developed locations, with dust-borne microorganisms more capable of transport in highly disturbed developed areas. Our approach provides a new framework for addressing how and when connectivity between multiple altitudes in the atmosphere and terrestrial sources is affected by anthropogenic activities. We recognize that in this experimental design, sampling date and location covary. We corrected for the effect of date as much as possible by sampling on days that had similar meteorological conditions and wind speeds to minimize differences. Atmospheric mixing in the troposphere is well documented [59][60][61][62], and it is possible that the wind and mixing patterns in Pellston could have led to more mixing between the near-surface and higher-altitude air masses. However, if this were the case, we would have expected to see greater homogenization across all communities and fewer site-specific effects. We also note that due to timing, we needed to sample at the Chickagami site in Pellston one day after the other two sites, and the communities from these samples were more similar to the other undeveloped replicates than to developed samples. This observation suggests that location and direction of the prevailing air mass are more influential than date, but a more robust study would need to be completed to address this conclusively.

Conclusions
Our study demonstrates a framework for testing the effects of land use on near-surface and higher-altitude microbial communities, which reflect local and regional taxonomic pools, respectively. We observed differences in the type of connectivity between atmospheric and terrestrial communities in developed and undeveloped locations, but we cannot conclude whether soil and snow are sources or sinks for airborne taxa. Further work with greater replication is needed in this area to track whether certain provenances are more likely to act as sources, as well as the physiological parameters that allow certain microbial taxa to be more capable of aerosolization and airborne transport than others. Using the appropriate experimental design, source-sink models can be useful tools that leverage high-throughput sequence data [75]. This could be accomplished with time-series sampling linked to specific terrestrial activities, such as tilling and fertilizing agricultural fields.
Our results provide compelling evidence that agricultural development is linked to convergence in higher-altitude bacterial communities. Further work to demonstrate this effect requires more sampling of airborne communities in developed and undeveloped locations across a broad geographic scale. Other factors such as temporal and seasonal variability [23,29,61], ecosystem type [24], predominant vegetation type [26], agricultural activity [2] and atmospheric pollution [21] also affect the composition of airborne bacterial communities and require further attention.