Effect of Environmental Heterogeneity and Trophic Status in Sampling Strategy on Estimation of Small-Scale Regional Biodiversity of Microorganisms

Microorganisms are diverse and play key roles in lake ecosystems; a robust estimation of their biodiversity and community structure is therefore crucial for determining their ecological roles in lakes. Conventionally, molecular surveys of microorganisms in lakes are primarily based on equidistant sampling. However, this sampling strategy overlooks the effects of environmental heterogeneity and trophic status in lake ecosystems, which might result in inaccurate biodiversity assessments of microorganisms. Here, we conducted equidistant sampling at 10 sites in two regions with different trophic status within East Lake (Wuhan, China) to verify the reliability of this sampling strategy and to assess the influence of environmental heterogeneity and trophic status on it. Rarefaction curves showed that the species richness of microbial communities in the more eutrophic region of the lake failed to reach saturation, unlike that in the region with lower trophic status. The microbial compositions of samples from the region with higher trophic status differed significantly (P < 0.05) from those in the region with lower trophic status. This pattern may be explained by complex adaptations of lake microorganisms in highly eutrophic regions to environmental conditions, where community differentiation can be viewed as an adaptation to these environmental selection forces. Therefore, when conducting surveys of microbial biodiversity in a heterogeneous environment, investigators should incorporate intensive sampling to assess the variability in microbial distribution in response to a range of factors in the local microenvironment.


Introduction
Shallow lakes are crucial for the conservation of local and global biodiversity [1]. Based on their trophic status, they can be classified as oligotrophic, oligo-mesotrophic, mesotrophic, meso-eutrophic, eutrophic, or hypereutrophic [2][3][4][5]. Lake ecosystems vary considerably in species richness, but they contain more biodiversity than other aquatic ecosystems, such as streams, ditches, and temporary ponds [6]. Microorganisms are diverse and play key roles in these ecosystems, although their biodiversity varies among lakes [7,8]. Consequently, investigations focused on microbial diversity are crucial for our understanding of their ecological function in lakes [9][10][11]. In recent years, high-throughput sequencing (HTS) has been widely used in studies of biodiversity in a wide range of aquatic ecosystems, including lakes [12,13]. Water samples used for the extraction of metagenomic DNA to estimate the diversity and abundance of biological populations are usually collected from the same region of a lake or from different regions based on equidistant sampling [14][15][16][17]. This strategy is widely used to monitor the density or abundance of biological populations based on equidistant sites [14][15][16][17][18]. These sampling sites are always distributed equidistantly along

Sampling and Environmental Information
Ten sampling sites from Sha Lake, Shuiguo Lake, Guozheng Lake, and Tangling Lake were chosen for our study based on their trophic status. An equidistant sampling strategy was designed based on 10 sites in two regions (eutrophic and meso-eutrophic) of East Lake, Wuhan, China, on 15 January 2019 (Figure 1). Sites 1 to 3 (Sha Lake and Shuiguo Lake) were located in a region with eutrophic status, and sites 4 to 10 (Guozheng Lake and Tangling Lake) were located in a region with meso-eutrophic status. At each site, 5 L of surface water was collected and pre-filtered using 200 µm pore-size meshes. A 0.5 L aliquot of pre-filtered sample was filtered onto a 0.22 µm Durapore membrane (Millipore, MA, USA). In addition, a further 0.5 L aliquot of pre-filtered water from each sampling site was mixed and homogenized in a sterilized plastic tank. After that, 0.5 L of the mixed water was filtered onto a 0.22 µm Durapore membrane as a reference sample. A multi-parameter probe (YSI, Yellow Springs, OH, USA) was used to measure dissolved oxygen in situ. Environmental factors, including the concentrations of nitrate nitrogen, nitrite nitrogen, ammonia nitrogen, and orthophosphate phosphorus, were measured at each sampling site as described previously [36].

DNA Extraction, PCR, and High-Throughput Sequencing
Each membrane was moved into a bead tube. Then, environmental DNA was extracted from the membranes using the PowerWater® DNA Extraction kit (Mo Bio, Carlsbad, CA, USA) according to the manufacturer's instructions. DNA concentration was measured using a NanoDrop 2000 (Thermo Scientific, Waltham, MA, USA). The hypervariable regions of both 18S and 16S rDNA were amplified from the same total DNA extracts. 18S rDNA fragments (V4 hypervariable region) were PCR amplified with primers EK-565F (5′-GCA GTT AAA AAG CTC GTA GT-3′) and EK-1134R (5′-TTT AAG TTT CAG CCT TGC G-3′) [37]. The PCR mixtures (20 µL) contained 4 µL of 5× Fastpfu Buffer, 2 µL of dNTPs (2.5 mmol L⁻¹), 0.8 µL of each primer (5 µmol L⁻¹), 0.4 µL TransStart Fastpfu DNA Polymerase, 0.2 µL BSA, and 10 ng of template DNA. The PCR program for eukaryotic primers began with an initial denaturation at 95 °C for 3 min, followed by 30 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 45 s, and a final extension at 72 °C for 10 min. A primer set (515F/806R) targeting the V4 region of the 16S rRNA gene was used for PCR amplification as described previously [38]. PCR products were sequenced on an Illumina MiSeq platform by Majorbio (Shanghai, China).

Sequence Analysis
High-throughput data analyses were conducted as described previously [36]. Specifically, raw sequences were demultiplexed, quality filtered with Trimmomatic [39], and merged with FLASH [40] using the default parameters. Reads were retained only if they had exact barcodes and primers and no ambiguous nucleotides; reads with a length <200 or >550 for eukaryotes, or <50 or >350 for prokaryotes, were discarded. Chimeras were identified and removed using UCHIME [41]. Afterward, singleton operational taxonomic units (OTUs; those represented by a single read across all samples) were discarded before the downstream analyses as potential sequencing errors. Remaining sequences were grouped into OTUs for both eukaryotes and prokaryotes at a 97% similarity cutoff using the UPARSE default algorithms [42]. The taxonomy of each OTU was assigned with the RDP Classifier algorithm (http://rdp.cme.msu.edu/, accessed on 22 August 2022) against the Silva database (SSU132) using a confidence threshold of 70%. Finally, all samples were rarefied to the same sequence depth (n = 23,703 and 35,038 sequences for eukaryotes and prokaryotes, respectively) by random subsampling to standardize sequencing effort.
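The singleton filter and the rarefaction step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names `drop_singletons` and `rarefy`, the dictionary layout of the OTU table, and the toy counts are all hypothetical, and subsampling is done without replacement.

```python
import random
from collections import Counter


def drop_singletons(table):
    """Remove OTUs whose total read count across all samples is one.

    table: dict mapping sample id -> {OTU id: read count} (hypothetical layout).
    """
    totals = Counter()
    for counts in table.values():
        totals.update(counts)
    return {
        sample: {otu: n for otu, n in counts.items() if totals[otu] > 1}
        for sample, counts in table.items()
    }


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads without replacement from one sample.

    counts: dict mapping OTU id -> read count.
    Returns the subsampled counts; OTUs that receive no reads are dropped.
    """
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the count table into one label per read, then draw reads at random.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))


# Toy example: rarefy every sample to the depth of the shallowest one.
table = {"site1": {"a": 1, "b": 50, "c": 30}, "site2": {"b": 20, "c": 25, "d": 1}}
filtered = drop_singletons(table)
depth = min(sum(c.values()) for c in filtered.values())
rarefied = {s: rarefy(c, depth, seed=42) for s, c in filtered.items()}
```

Fixing the seed makes the random subsample reproducible, which is useful when the rarefied table feeds several downstream analyses.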

Statistical Analysis
Statistical analyses and all graphic visualization were performed in R v.4.0 [43]. Alpha diversity indices, including richness, Shannon-Wiener index, Chao 1, and Pielou's evenness, were calculated using the "vegan" package [44]. RDA (redundancy analysis) [45] and the Monte Carlo permutation test [46] were performed to explore the relationship between microbial communities and environmental factors. PCoA (principal coordinate analysis) [45] was performed to visualize patterns of community structure of prokaryotes and eukaryotes based on both Bray-Curtis [47] and unweighted UniFrac dissimilarity [48], and the significance of differences between regions was tested with a permutational multivariate analysis of variance (ADONIS) for both eukaryotes and prokaryotes [49][50][51]. The Wilcoxon test [52] was used to assess, for prokaryotes and eukaryotes, whether the richness overlap between regional samples and the reference sample differed significantly between the eutrophic and meso-eutrophic regions, and whether the concentrations of environmental factors differed between samples from the two regions. Spearman's rank correlations were used to explore the relationship between the richness overlap of regional samples with the reference sample and the concentrations of environmental factors, for prokaryotes and eukaryotes separately [53]. In addition, threshold indicator taxa analyses [54] were performed to select indicator OTUs that were significantly correlated with changes in environmental factors. Indicator taxa with purity and reliability ≥0.95 were plotted in increasing order with respect to their observed environmental change point. The neutral community model (NCM) was used to determine the potential importance of neutral processes in the community assembly of samples from different regions [55]. In this model, R² represents the overall fit to the neutral model, and Nm indicates the metacommunity size (N) times immigration (m). The 95% confidence intervals were based on 1000 bootstrap replicates.
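The four alpha diversity indices named above have simple closed forms, sketched here for a single sample. This is an illustrative stand-in for the "vegan" calculations, not the authors' code: the function name `alpha_diversity` is hypothetical, the Shannon-Wiener index is assumed to use the natural logarithm, and Chao 1 is assumed to be the bias-corrected form S + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs observed exactly once and twice.

```python
import math


def alpha_diversity(counts):
    """Return richness, Shannon-Wiener H', Chao 1 and Pielou's J for one sample.

    counts: iterable of per-OTU read counts (zeros are ignored).
    """
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    richness = len(counts)
    # Shannon-Wiener index: H' = -sum(p_i * ln p_i)
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    # Pielou's evenness: J = H' / ln(S); undefined when only one OTU is present
    pielou = shannon / math.log(richness) if richness > 1 else float("nan")
    # Bias-corrected Chao 1 from the numbers of singletons (F1) and doubletons (F2)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    chao1 = richness + f1 * (f1 - 1) / (2 * (f2 + 1))
    return {"richness": richness, "shannon": shannon,
            "chao1": chao1, "pielou": pielou}


# Toy sample: four OTUs, two of them seen once and one seen twice.
print(alpha_diversity([5, 1, 1, 2]))
```

Because Chao 1 depends on singletons and doubletons within each sample, its value is sensitive to any filtering applied before the index is calculated.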

Microbial Diversities in Eutrophic and Meso-Eutrophic Regions
After filtering, a total of 2446 and 362 OTUs were detected from prokaryotic and eukaryotic sequences, respectively. The richness, Shannon-Wiener index, Chao 1, and Pielou's evenness of prokaryotic communities ranged from 480 to 1227, 3.224 to 4.502, 781.110 to 1830.124, and 0.522 to 0.654, respectively. The richness, Shannon-Wiener index, Chao 1, and Pielou's evenness of eukaryotic communities ranged from 81 to 178, 2.289 to 3.007, 102 to 209.059, and 0.516 to 0.580, respectively (Table 1). Both principal coordinate analysis (PCoA) and permutational multivariate analysis of variance (ADONIS) results showed significant differences (P < 0.05) between the community structures of the eutrophic region (sites 1, 2, and 3) and the meso-eutrophic region (sites 4 to 10) for both prokaryotes and eukaryotes, based on Bray-Curtis dissimilarity (Figure 2A) and unweighted UniFrac dissimilarity (Figure 2B). In addition, the distances between adjacent samples from the eutrophic region appear greater than those in the meso-eutrophic region based on both Bray-Curtis and unweighted UniFrac dissimilarity, implying that there was stronger niche differentiation in the eutrophic region.

Figure 2. PCoA of prokaryotic and eukaryotic community structures based on Bray-Curtis (A) and unweighted UniFrac (B) dissimilarity. P represents the significance of the difference between the two regions based on ADONIS analysis for prokaryotes and eukaryotes. P < 0.05 is considered statistically significant.
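For readers unfamiliar with the dissimilarity underlying the PCoA, the Bray-Curtis measure between two samples is the sum of absolute count differences divided by the sum of all counts. The sketch below is illustrative only; the function names, the dictionary layout, and the toy counts are hypothetical and are not taken from the study's data.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples given as {OTU: count} dicts."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.get(o, 0) + b.get(o, 0) for o in otus)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


def distance_matrix(samples):
    """Pairwise Bray-Curtis matrix for a {sample id: {OTU: count}} table."""
    ids = sorted(samples)
    matrix = [[bray_curtis(samples[i], samples[j]) for j in ids] for i in ids]
    return ids, matrix


# Toy table with three samples (hypothetical counts).
samples = {
    "site1": {"otu1": 3, "otu2": 1},
    "site2": {"otu1": 1, "otu2": 3},
    "site3": {"otu3": 4},
}
ids, matrix = distance_matrix(samples)
```

A value of 0 means identical count profiles and 1 means no shared OTUs; the resulting matrix is the kind of input an ordination such as PCoA operates on.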

Rarefaction Curves and Relative Abundance of Taxonomic Groups in Samples
Rarefaction curves were smooth for both the reference sample and the samples collected from meso-eutrophic sites, indicating good coverage of prokaryotic and eukaryotic species numbers in each case (Figure 3). However, the rarefaction curves for the eutrophic samples were not saturated for either prokaryotic or eukaryotic species numbers (Figure 3). In addition, Wilcoxon test results showed significant differences in the relative abundance of several taxonomic groups (only groups with a relative abundance higher than 0.1% in a region are shown) between the eutrophic region and the meso-eutrophic region for prokaryotes and eukaryotes (Figure 4). For prokaryotes, the relative abundances of Acidobacteria, Chloroflexi, Omnitrophica, Parcubacteria, Planctomycetes, Verrucomicrobia, and Woesearchaeota_DHVEG-6 in the eutrophic region were significantly (P < 0.05) lower than those in the meso-eutrophic region (Figure 4). The relative abundances of other taxonomic groups, including Actinobacteria, Bacteroidetes, Cyanobacteria, Euryarchaeota, Firmicutes, Fusobacteria, and Proteobacteria, showed no significant differences (P > 0.05) between the eutrophic region and the meso-eutrophic region (Figure 4). For eukaryotes, the relative abundances of Apicomplexa, Cercozoa, Fungi, Perkinsea, and Stramenopiles_X in the eutrophic region were significantly lower than their meso-eutrophic counterparts (P < 0.05). However, the relative abundances of Ciliophora and Ochrophyta in the eutrophic region were significantly higher than those in the meso-eutrophic region. In addition, the relative abundances of Chlorophyta, Cryptophyta, Dinophyta, Katablepharidophyta, and Metazoa showed no significant differences between the eutrophic and meso-eutrophic regions (Figure 4).

Figure 4. Comparison of the relative abundances of taxonomic groups between the eutrophic region and the meso-eutrophic region for prokaryotes and eukaryotes, respectively. Taxonomic groups with a mean relative abundance higher than 0.1% are shown. Significance is based on the Wilcoxon test. Significance codes: ** P < 0.01; * P < 0.05.
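The two-group comparisons above rely on the Wilcoxon rank-sum test. As a rough illustration of how such a test works for small groups, the sketch below computes an exact two-sided p-value by enumerating every way of assigning the pooled ranks to the first group. It is not the R implementation used in the study: the function names and the abundance values are hypothetical, and the two-sided p-value is defined here as the fraction of assignments whose rank sum is at least as far from its null mean as the observed one.

```python
from itertools import combinations


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_exact(x, y):
    """Rank sum of group x and its exact two-sided permutation p-value."""
    ranks = midranks(list(x) + list(y))
    n = len(x)
    observed = sum(ranks[:n])
    expected = n * (len(ranks) + 1) / 2  # mean rank sum under the null
    deviation = abs(observed - expected)
    sums = [sum(c) for c in combinations(ranks, n)]
    extreme = sum(1 for s in sums if abs(s - expected) >= deviation - 1e-9)
    return observed, extreme / len(sums)


# Hypothetical relative abundances (%) for 3 eutrophic and 7 meso-eutrophic sites.
eutrophic = [0.2, 0.3, 0.4]
meso_eutrophic = [0.9, 1.1, 1.3, 1.4, 1.6, 1.8, 2.0]
w, p = rank_sum_exact(eutrophic, meso_eutrophic)
print(w, round(p, 4))
```

In this toy case the groups are completely separated, so only 2 of the 120 possible rank assignments are as extreme as the observed one.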

The Differences of Species Richness Overlap and Factors Related with Species Richness Overlap
The boxplots for the overlaps of species richness between eutrophic samples and the reference sample and the overlaps of species richness between meso-eutrophic samples and the reference sample, for both prokaryotes and eukaryotes, revealed that samples from the eutrophic region have significantly lower overlap (P < 0.05) with the reference sample than do samples from the meso-eutrophic region (Figure 5). For prokaryotes, the overlaps of samples from the eutrophic region were lower than 0.25, whereas for samples from the meso-eutrophic region the overlaps were generally higher than 0.3. For eukaryotes, the overlaps were lower than 0.5 for samples from the eutrophic region and higher than 0.5 for samples from the meso-eutrophic region.

Figure 5. Boxplots of the overlaps of richness between the eutrophic region and the reference data, and between the meso-eutrophic region and the reference data, for prokaryotes and eukaryotes, respectively. Significance is based on the Wilcoxon test.
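One way to compute a richness overlap between a site sample and the pooled reference sample is sketched below. The text does not give the exact formula, so this is an assumption made purely for illustration: overlap is taken here as the number of OTUs detected in both samples divided by the number detected in either (a Jaccard-style ratio). The function name and the toy counts are hypothetical.

```python
def richness_overlap(sample, reference):
    """Shared OTUs divided by all OTUs detected in either sample.

    ASSUMPTION: this Jaccard-style definition is illustrative only; the
    study's exact overlap formula is not stated in the text.
    sample, reference: dicts mapping OTU id -> read count.
    """
    detected_sample = {otu for otu, n in sample.items() if n > 0}
    detected_reference = {otu for otu, n in reference.items() if n > 0}
    union = detected_sample | detected_reference
    if not union:
        raise ValueError("no OTUs detected in either sample")
    return len(detected_sample & detected_reference) / len(union)


# Toy counts: two OTUs are shared out of four detected overall.
site = {"otu1": 3, "otu2": 1, "otu3": 0}
reference = {"otu1": 1, "otu2": 2, "otu4": 5, "otu5": 1}
print(richness_overlap(site, reference))
```

Under this definition a value near 1 means the site sample recovers almost the same OTUs as the reference, and a low value means many OTUs are unique to one of the two.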

The Relationship between Microbial Community and Environmental Factors
Redundancy analysis (RDA) results indicated that the microbial communities at the 10 sampling sites were divided into two groups according to the trophic status of the region in which the sites were situated (Figure 6A). Sites 1, 2, and 3 were in the eutrophic region based on the high concentrations of NO₃-N (nitrate nitrogen), NO₂-N (nitrite nitrogen), AN (ammonia nitrogen), and PO₄ (orthophosphate phosphorus). Sites 4 to 10 were in the meso-eutrophic region, based on the high DO (dissolved oxygen) concentration. The Wilcoxon test showed that the concentrations of DO and NO₂-N differed significantly between the two regions, implying that DO and NO₂-N were key factors driving microbial community structures (Figure 6B). Several OTUs were strongly associated with environmental factors (Table 2). These included prokaryotic OTUs mainly affiliated with Actinobacteria, Bacteroidetes, Parcubacteria, and Proteobacteria, and eukaryotic OTUs mainly affiliated with Cercozoa, Chlorophyta, Ciliophora, Cryptophyta, Dinophyta, Fungi, Metazoa, Ochrophyta, and unclassified Eukaryota (Table 2). Monte Carlo permutation tests (Table 3) revealed that both DO and NO₂-N were significantly correlated with prokaryotic community structures (correlation coefficient: 0.850 and 0.675, respectively). Eukaryotic community structures were also significantly correlated with DO and NO₂-N (0.906 and 0.773, respectively).

Figure 6. Redundancy analysis (RDA) of microbial communities and environmental factors (A). DO, dissolved oxygen; NO₃-N, nitrate nitrogen; NO₂-N, nitrite nitrogen; AN, ammonia nitrogen; PO₄, orthophosphate phosphorus. The green characters represent the 10 samples; P represents prokaryotes from sites and E represents eukaryotes from sites. Red arrows represent measured environmental factors, and blue characters represent OTUs that were strongly associated with the first two axes (fitness > 80%) of RDA for prokaryotes and eukaryotes, respectively. Comparison of the concentrations of DO and NO₂-N between the two regions (B). Significance was determined by the Wilcoxon test. P < 0.05 is considered significant.

Spearman's Rank Correlation and Threshold Indicator Taxa with Changing of Environmental Factors
According to Spearman's rank correlation, the overlaps were positively correlated with DO (Figure 7A) and negatively correlated with NO₂-N (Figure 7A) for prokaryotes (r = 0.80 and −0.88, respectively) and eukaryotes (r = 0.83 and −0.83, respectively). In addition, threshold indicator taxa analysis (TITAN) results showed several threshold indicator OTUs that were significantly correlated with DO and NO₂-N, whereas no indicator taxa correlated significantly with other environmental factors (Figure 7B). Indicator OTUs were positively correlated with DO concentration but negatively correlated with NO₂-N concentration (Figure 7B). For prokaryotes, the abundances of five positively responding indicator OTUs, which were assigned to Actinobacteria, Cyanobacteria, Proteobacteria, and unclassified Bacteria, increased with DO concentration between 9.660 and 10.555 mg/L. Additionally, 18 negatively responding indicator OTUs, which were assigned to Actinobacteria, Cyanobacteria, Firmicutes, Planctomycetes, Proteobacteria, Thaumarchaeota, and unclassified Bacteria, decreased along the NO₂-N gradient between 0.0110 and 0.0135 mg/L. Notably, OTU1864 (Cyanobacteria) and OTU2025 (unclassified Bacteria) had a significant positive correlation with changes in DO concentration and a significant negative correlation with changes in NO₂-N concentration. For eukaryotes, the abundances of six positively responding indicator OTUs, which were assigned to Ciliophora, Katablepharidophyta, Ochrophyta, and Perkinsea, were positively correlated with changes in DO concentration from 9.660 to 10.555 mg/L. Additionally, 16 negatively responding indicator OTUs, which were assigned to Cercozoa, Ciliophora, Cryptophyta, Dinophyta, Katablepharidophyta, Ochrophyta, Perkinsea, Stramenopiles_X, and unclassified Eukaryota, were negatively correlated with changes in NO₂-N concentration ranging from 0.0110 to 0.0135 mg/L (Figure 7B, Table 4). Note that OTU134 (Katablepharidophyta), OTU235 (Ochrophyta), and OTU137 (Perkinsea) had a significant positive correlation with changes in DO concentration and a significant negative correlation with changes in NO₂-N concentration.
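Spearman's rank correlation, used above to relate richness overlap to DO and NO₂-N, is the Pearson correlation of the rank-transformed values. The sketch below shows the calculation with average ranks for ties. The function names are hypothetical and the DO and overlap values are invented for illustration; they are not the study's measurements.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / (spread_x * spread_y)


# Invented values: DO (mg/L) and richness overlap at five hypothetical sites.
do_mg_per_l = [9.2, 9.6, 10.0, 10.3, 10.6]
overlap = [0.22, 0.18, 0.35, 0.31, 0.40]
rho = spearman(do_mg_per_l, overlap)
```

Because only ranks enter the calculation, the coefficient is unaffected by the units of the environmental variable or by any monotonic transformation of it.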

The NCM Explains Different Community Variation between Samples from Regions with Different Trophic Status
The neutral community model (NCM) was used to explore the potential importance of neutral processes in the community assembly of samples from regions with different trophic status. The NCM explained 30% of the taxon detection frequency of prokaryotes in the eutrophic region and 54% of the taxon detection frequency of prokaryotes in the meso-eutrophic region (Figure 8A). Similarly, the NCM explained a lower proportion of the taxon detection frequency of eukaryotes in the eutrophic region (37%) than in the meso-eutrophic region (72%) (Figure 8B). The NCM results indicate that deterministic processes played a more significant role in structuring microbial community assembly in the eutrophic region, while stochastic processes played the most important role in structuring microbial community assembly in the meso-eutrophic region.

Discussion
Environmental factors and microbial biodiversity in the eutrophic region differed from those in the meso-eutrophic region (Figures 2-6), indicating that our study area (East Lake) is an environmentally heterogeneous lake, which is consistent with previous studies of East Lake [5,19,34,35]. In addition, the relative abundances of microorganisms differed between regions with different trophic status. For instance, the relative abundance of Acidobacteria was higher in the meso-eutrophic region than in the eutrophic region (Figure 4), which is consistent with a previous study indicating that Acidobacteria were abundant in moderately trophic lakes and that their distribution in lakes varies with trophic status [56]. However, the relative abundances of Ciliophora and Ochrophyta in the eutrophic region were significantly higher than those in the meso-eutrophic region (Figure 4), which is consistent with previous studies showing that ciliate abundance increased from oligotrophic to eutrophic lakes [57] and that Ochrophyta was always a dominant phytoplankton group in eutrophic lakes [58], indicating that they prefer the eutrophic environment. Consequently, within this heterogeneous lake, the sufficiency of sampling under the equidistant strategy, and hence its influence on microbial biodiversity analysis, differed significantly between the eutrophic and meso-eutrophic regions of East Lake (Figures 3-6). The rarefaction curves of microbial communities from the eutrophic region did not reach saturation, unlike those of the meso-eutrophic region (Figure 3), indicating that many microbial species in the eutrophic region remained undetected at the same sampling density (Figure 3).
Furthermore, the overlaps of microbial richness between samples from the eutrophic region and the reference sample were significantly lower than that of the meso-eutrophic region ( Figure 5), which suggests that the microbial diversity in the eutrophic region was underestimated and could not represent the microbial diversity of this lake ecosystem. Consequently, sample collection based solely on the equidistant sampling strategy is not reliable for estimating regional microbial biodiversity in a heterogeneous lake. The most likely explanation is that in a homogeneous environment, the loss of niche opportunities leads to reduced differentiation of microbial communities, while heterogeneity increases niche partitioning and therefore promotes community differentiation [27][28][29][30]. Furthermore, environmental heterogeneity would also provide shelter and refuges from adverse environmental conditions, which in turn should promote species persistence [59,60]. Our study area, East Lake, has been shown to be an environmentally heterogeneous lake [5,19,34]. Those regions (lakes) of East Lake are affected by different outlets, resulting in different environmental conditions. For instance, Sha Lake and Shuiguo Lake are surrounded by a densely populated area with several sewage outlets extending into the bay, and the water quality is very poor compared to other lakes, while Guozheng Lake and Tangling Lake have much better water quality because they are away from sewage outlets [20,34]. Therefore, different environmental conditions can lead to regional environmental heterogeneity, which can provide additional niche spaces for microorganisms. To obtain complete regional microbial biodiversity, it is necessary to increase the number of samples in such heterogeneous areas.
Additionally, our results indicated that the community structures in the meso-eutrophic region were similar to each other, while the microbial communities in the eutrophic region were clearly differentiated (Figure 2). Consequently, species sorting may have a stronger effect on the microbial communities of samples from the eutrophic region, while neutral processes have a stronger effect on the microbial communities of samples from the meso-eutrophic region (Figure 8). Thus, the results indicated stronger community differences among samples from the eutrophic region than among samples from the meso-eutrophic region, as a result of stronger niche differentiation in the eutrophic region [61]. Niche differentiation can be influenced by environmental factors (trophic status): a previous study of 50 freshwater bodies in urban areas indicated that the level of eutrophication of the waterbodies and the strength of local environmental gradients play a key role in structuring the differentiation of phytoplankton community composition [62]. Other hydrological studies have shown that the level of eutrophication is correlated with biodiversity, indicating that eutrophication is an important anthropogenic pressure that reduces biodiversity in multiple groups of organisms in shallow lakes and ponds worldwide [14,[63][64][65]. Eutrophication due to nutrient loading and reduced dispersal owing to habitat fragmentation are global-scale pressures that are increasingly affecting biodiversity patterns at multiple spatial scales [66]. However, previous studies also indicated that moderate eutrophication would promote biodiversity by providing several niche spaces, while low biodiversity was detected in hypereutrophic and oligotrophic environments because of their limited niche space [4,5,19].
Our investigation showed that many microbial species remained undetected in the eutrophic region (Figure 3), indicating that there were relatively diverse niche spaces in this region. The results of this study reveal that DO and NO₂-N concentrations were responsible for niche differentiation in the eutrophic region and drove the variability of microbial biodiversity and community structure (Figure 6, Table 3). This is consistent with previous studies indicating that nitrogen limitation drives the phytoplankton community structure in East Lake [67,68]. Furthermore, previous studies indicated that Cyanobacteria could produce nitrite (NO₂-N) during bloom degradation [69] and that the concentrations of DO and NO₂-N were significantly correlated with the size of Cyanobacteria blooms [70]. Here, we show that several indicator OTUs of Cyanobacteria had strong positive and negative relationships with DO and NO₂-N concentrations, respectively (Table 4). The NO₂-N and DO conditions should therefore help to determine the sampling intensity required in different regions.

Conclusions
In summary, traditional microbial surveys based solely on equidistant sampling, with insufficient sampling density, have largely neglected the influence of environmental heterogeneity and trophic status on biodiversity analysis. In fact, environmental heterogeneity would give rise to niche differentiation and increase the speciation rates of the microbial community. Trophic status also correlates with microbial biodiversity through species sorting. Consequently, the conventional sampling strategy should be reconsidered in ecosystems with small-scale environmental heterogeneity and differing trophic status, and future research should increase the number of sampling sites according to local environmental conditions.

Data Availability Statement:
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found from the NCBI database: PRJNA765570.