The Geochemical Drivers of Bacterial Community Diversity in the Watershed Sediments of the Heihe River (Northern China)

: The city of Zhangye (Gansu Region, China) has been subjected to several changes related to the development of new proﬁtable human activities. Unfortunately, this growth has led to a general decrease in water quality due to the release of several toxic wastes and pollutants (e.g., heavy metals) into the Heihe River. In order to assess the environmental exposure and the potential threat to human health, microbiological diversity for the monitoring of water pollution by biotic and abiotic impact factors was investigated. In particular, we analysed samples collected on different sites using 454 pyrotag sequencing of the 16S ribosomal genes. Then, we focused on alpha-diversity indices to test the hypothesis that communities featuring lower diversity show higher resistance to the disturbance events. The ﬁndings report that a wide range of environmental factors such as pH, nutrients and chemicals (heavy metals (HMs)), affected microbial diversity by stimulating mutualistic relationships among bacteria. Furthermore, a selection in bacterial taxa related to the different concentrations of polluting compounds was highlighted. Supporting the hypothesis, our investigation highlights the importance of microbial communities as sentinels for ecological status diagnosis.


Introduction
The availability and quality of freshwater is necessary to ensure adequate humanrelated ecosystem services and has been the subject of scientific discussion and studies [1,2]. In particular, it is estimated that about two billion people live in areas subjected to high water stress, while four billion people experience critical conditions in the supply of quality water for at least one month a year [3]. As anthropogenic pressure advances, the number of people facing water emergencies, including those related to water pollution, will increase over time. It has been estimated that by 2050, half of the world's population will live in territories with poor freshwater quality [4]. For this reason, it is a priority not only to safeguard the quality of freshwater, but above all to intervene by monitoring situations of stress and disturbance. Thus, it is necessary to establish water quality indicators, which are traditionally divided into three categories: physical indicators (quantity of water, flow, etc.), chemical indicators (presence of metals, nitrogen, TOC, etc.), and biological indicators. The latter are mainly fish, invertebrates and a few representatives of bacteria [5]. In particular, microorganisms, algae and fauna (e.g., invertebrates and fishes) in urbanrelated freshwater and sediments are exposed and easily affected by any environmental factor changes, including large loads of organic matter, nutrients and pollutants discharge (pesticides, toxic materials and heavy metals, HMs). Their overall community response in terms of richness or evenness renders them suitable bioindicators of pollutant presence, providing information on ecosystem health [6][7][8][9][10][11][12], and studying the biodiversity of aquatic habitats could be very important for estimating, monitoring and assuring their sustainable use and the deriving measures of environmental management and restoration. It has been suggested that geographical abiotic (pH, nutrients, water availability, oxic/anoxic conditions, temperature, chemical pollution, HMs, pesticides, antibiotics) and biotic factors (plasmids, phages, transposons) govern microbial diversity [13]. In fact, the biogeographical distribution patterns of bacterial communities could be affected by local or regional factors (or by a combination of them), which may differ by site and time [14][15][16][17]. Moreover, the growth and division of bacterial cells depends on the amount of nutrients present in the system; variations in abiotic factors induce changes in bacterial community composition and in their physicochemical habitat (e.g., stimulating processes such as sulfate reduction or nitrogen (N) fixation) [18,19]. Therefore, bacterial communities and their diversity could be suitable indicators of environmental degradation and possibly used to assess the perturbation level, even if many authors found it challenging to select the right bacterial diversity measurement index; a deeper investigation of the topic will be crucial in the development of suitable methods [6,20,21]. Indeed, in microbial ecology, it is hypothesized that a higher biodiversity improves the ecosystem resilience and resistance by varying the responses to that environmental change to maintain ecosystem functions [22] and generally, resilience evaluation involves the characterization of the quantity, structure and shift of taxonomic and/or functional redundancy into a broad group of biological arrangements such as from genes to communities [23,24].
Diversity among microorganisms has mainly been characterised from a taxonomic view at the community level, using different measurements (e.g., alpha, beta, gamma diversity). Alpha diversity relates to the diversity of one environmental area or single sample, and it is usually determined as, e.g., number of species (species richness) and as the extension of the dominance of the species (species evenness). It has been previously shown to successfully highlight perturbances in complex freshwater basins affected by different scales of pollution. For example, McClary-Gutierrez et al. [25] provided evidence for a significant decrease (Wilcoxon rank sum test, p < 0.01) in bacterial community diversity in impacted streams compared with unimpacted streams and in particular, a statistically significant correlation between developed land area and Shannon diversity across the sites sampled (Spearman rank sum test, ρ = 0.50, p < 0.05) was observed. Moreover, Mohiuddin et al. [26] measured significantly different average taxonomic richness and alpha diversity between samples collected from lakes and creeks (p < 0.05) and between lakes and stormwater outfalls (p < 0.05).
While this approach is useful, it is limited by sampling procedures, which seem to severely influence species richness with the necessity of standardized samples or the utilization of estimator factors to correct sampling biases (e.g., Chao1 or ACE). On the contrary, evenness seems to be relatively unaffected by under-sampling biases and is generally determined with Simpson's or Pielou's indices or rank abundance curves [23].
The goal of this work was to test the hypothesis that alpha diversity could be used as a possible proxy for freshwater sediment health monitoring. We conducted this study in the Heihe polluted watershed, within the urban area of Zhangye City (northern China) and through high-throughput pyrosequencing techniques. The Heihe River represents a long-term interdisciplinary research area, already investigated in terms of water utilization, reed stands ecology and environmental economy [35]. Hence, we focused on the correlation between bacterial community diversity and the main environmental factors (i.e., HMs and nutrient spreading) that characterize each sampling site. Our study sought to demonstrate that alpha diversity measured through NGS approaches has the potential to identify the possible drivers behind adaptive responses in the ecosystem subjected to changing environmental conditions. By assessing these environmental changes through microbial community diversity and partitioning communities against curated databases, we provide novel insights into how alpha diversity and sequencing data can be integrated into environmental monitoring for both human and ecological health.

Sampling Method
Investigations were performed in Zhangye City (Gansu Province, northern China), a modern city surrounded by the Gobi Desert and the Qilianshan Mountains. The city is characterized by canal networks (both natural and artificial) running into the Heihe River and ponds, mostly covered by Phragmites australis (Cav.) Trin. (common reed). The sampling procedure was designed to assure the broadest geographic coverage possible, considering different types and levels of pollution. The area was divided into four different zones based on their effluents (industrial, urban, agricultural, and natural). A total of 17 sampling sites were chosen, as stated by Borruso et al. [6] in Table 1 and showed in Figure 1. In particular, according to Table 1, the sites and areas were separated considering the anthropogenic influence. Six sampling points were examined along the main canal crossings through the main urban area (ZY36-41) with presumed organic and antibiotic pollution; four sites were near coal and metallurgy industries in a smaller secondary canal (ZY24, ZY25, ZY26) and in the main canal receiving water from the industrial area (ZY23) with probable chemical and industrial pollution. Two sites were sampled in Zhangye Natural Park (ZY34 and ZY35), where reeds were completely covering the area and where low anthropogenic pressure was presumed. For this reason, we considered them controls. Five sites were chosen for the area where agricultural and livestock pollution were hypothesized; in particular, two were collected in an agricultural drainage canal near the Heihe River (ZY28 and ZY29), two on the banks of Heihe (ZY30 and ZY31) and one between the industrial area and the main city center (ZY21). In each sampling site, we collected one sample of rhizosphere (particles enveloping the root within 1 and 3 mm) with a careful shaking from three water dispersed roots of P. australis plants distancing about 10 cm. In the same place, about 10 g of wet sediment were taken with a sterile spoon and put into sterile tubes at 4 • C, while the remaining sediment from each root was collected with the same method and placed into sterile bags at −20 • C to allow molecular and chemical analysis [6]. The tubes were rapidly brought to the nearest laboratories for the DNA extraction. Then, DNA containing tubes and sediments were shipped to Italy in a box with dry ice.

DNA Extraction and 454 Pyrotag Sequencing
Wet sediments (about 5 g) were centrifuged at 13,000 rpm for 15 min to remove the excess water. Then, according to the user manual, PowerSoil ® DNA Isolation Kit (MoBio, Arcore, Italy) was used for the extraction of the total community genomic DNA from 1 g (wet weight) of the centrifuged sediments. The quantification of DNA was performed using the NanoVue™ Plus (GE Healthcare, Little Chalfont, UK) spectrophotometer, adjusting the concentration to 10 ng/µL. The genomic DNA of each of the 17 samples were shipped to the Molecular Research LP (MR DNA TM ) (http://www.mrdnalab.com (accessed on 10 June 2022)). The DNA was used for the PCR amplification of the environmental 16S rRNA genes for bacteria with a primer set amplifying the V4-V6 variable regions (primers 518F 5 -CCAGCAGCYGCGGTAAN-3 and 1046R 5 -CGACRRCCATGCANCACCT-3 ). Samples were sequenced using the Roche 454 GS-FLX system, titanium chemistry, according to the protocols of that company. The pyrosequencing data were analysed with a proper analysis pipeline as suggested by the company. Sequences with length <200 bp, or with ambiguous bases and homopolymer runs exceeding 6 bp, were removed before the chimera checking. The 16S rRNA gene pyrotags were defined after the removal of singleton sequences by clustering into 97% similarity. Representatives from each OTU were classified using BLASTN against a curated GreenGenes database. All pyrotags have been submitted to the EMBL/NCBI/DDBJ Short Read Archive under study accession number PRJEB4308. To normalize the variations in read depth across samples, the data were rarefied to the minimum read depth of 1248 sequences per sample.

Statistical Analysis
PAST 3.26 software was used to perform cluster analysis using the UPGMA algorithm and the Bray-Curtis similarity index. R software (R Foundation for Statistical Computing, Vienna, AT; http://www.R-project.org/ (accessed on 10 June 2022)) was used to investigate the alpha diversity indices (i.e., richness, Shannon, Simpson and evenness) and the Pearson correlation among genera and environmental parameters. Permutational multivariate analysis of variance (PERMANOVA) was applied to test the possible effect of the different land uses on the bacterial communities.

Results and Discussion
According to MacDonald et al. [36] and Persaud et al. [37] classifications, metal concentration over the probable effect concentration (PEC) were identified as toxic on the biota, as already described by Borruso et al. [6] and shown in Table 2. For example, high concentrations of Cd and Pb were found in the industrial sites (ZY23, ZY24, ZY25) and some metals like Ni and Zn were over the PEC in ZY25 and ZY24. For the urban area (ZY36 to ZY41), Hg, Ni and Cr showed higher values. Conversely, in the agricultural area (ZY21, ZY28, ZY29, ZY30, ZY31) and the natural park (ZY34, ZY35), all metals had concentrations much lower than the PEC threshold, with the exception of Ni levels in ZY28. The concentrations of Co and Mn were also below the PEC threshold for all sites. N was higher in and around the urban area, probably due to civil wastewater release (see Table 1), followed by the industrial one. The samples collected in the natural park (i.e., control) had metal concentration values below the PEC threshold.  The yield of the pyrosequencing run, after the quality check, was 51,364 pyrotags from all the samples. The total number of OTU identified as bacteria was 33,304, with an average per sample of 1850 ± 590 (OTU at 97% similarity). Alpha diversity, calculated as richness, Shannon and evenness indices, was measured. The highest values for Shannon and evenness were found in ZY25, with 489.04 and 0.663, respectively, while the lowest were in ZY39 with 90.77 and 0.533 (Table 3). Richness had an average of 582 ± 134. Sample ZY21 had the lowest richness (311), and ZY31 had the highest (819). The relationships between the indices and the metals/elements/pH were analysed by Pearson correlation as shown in Table 4. Only those with correlation values ρ > |0.5| and ρ < 0.05 were considered meaningful for the analysis. The P, Cr and Hg concentrations displayed significant negative correlations with bacterial richness (P: p value of 0.031, Cr: 0.048, Hg: 0.049) but also with Shannon (P: p value of 0.009, Cr: 0.018, Hg: 0.015) and evenness (P: p value of 0.009, Cr: 0.019, Hg: 0.016). N concentrations were negatively and significantly correlated to the evenness and Shannon indices (p value: 0.022 and 0.021 respectively) but not to richness.  Figure 2 shows the response of the bacterial taxonomy according to the concentrations and spreading of the nutrients and metals or to pH changes through a co-correlation plot (2% average relative abundances) generated by Pearson correlation index. Similarly, Figure 3 shows the Pearson co-correlation plot (always 2% average relative abundances) focusing on the inhibition/induction dynamics among the bacterial taxa. In this context, Table S1 reports all the detailed data for correlation factors and related p-values, while Table S2 sums up the main outcomes in terms of geochemical drivers, taxonomic groups and the potential indicator service.  Here again, only significant results (R > |0.5| and p-value < 0.05) were considered. At the genus level, with average relative abundances more than 2%, we detected the presence of 52 main genera among sites. The results shown in the figures/tables highlight that most genera were strictly related to civil and agricultural wastewater, to gut microbiome or to human cell metabolism, and as such, to the area use/pollutant/nutrient availability and the human influence. Having demonstrated this, for the discussion, we hence focused more on taxonomic groups whose potential indicator service is related to the pollution spread. Further investigations might consider different taxonomic groups.
For example, nutrient availability (N and P) seemed to positively affect the genera Bacteroides and Sulfuricurvum, while both negatively influenced the genera Ralstonia eutropha H16 (also named as H16 or Cupriavidus necator) and Opitutus. Additional genera negatively influenced by N and P were Sideroxydans and Sulfuritalea, which are characterized by a metabolism involved in Fe and S transformation. Moreover, pH showed a weakly positive but significant effect on the genera Anaerolinea and Opitutus. The genus Bacteroides was affected positively by Hg and negatively by Mn. The abundance of the genus Gelria was positively influenced by metal concentration (Al, Fe, Zn, Cu and As) and also co-correlated with the genus Syntrophus. On the contrary, the genus Zooglea was negatively affected by Al, Cu, Co and As, and the genus Flavobacterium was negatively correlated with Cu, Zn, Pb, Cd and As. For the 52 main genera found, the mean abundance and the standard deviation per land use were also calculated ( Table S3). The results confirm that there was a selection of microbial communities in line with the use of land and the related affecting pollutants and variables/elements. More specifically, the genera Bacteroides and Sulfuricurvum were positively correlated with N and were more abundant in the urban and industrial areas (0.25 and 0.24, respectively, as average of frequencies of reads for Bacteroides and 8.79 and 0.35 for Sulfuricurvum), where the nitrogen concentration turned out to be higher. In contrast, Ralstonia eutropha H16, strongly affected by N, was predominant (average of frequency of reads = 0.67) in the natural park (where there were low levels of N). Similarly, Gelria, positively influenced by Zn, showed much higher abundance in the industrial area (Zn concentration over the PEC − average of frequency of reads = 0.51).
Moreover, the PERMANOVA test confirmed significant differences (p-value < 0.001) in the distributions of the bacterial species in relation to the land use (Table 5). Cluster analysis (UPGMA on Bray-Curtis distance) of the ecological matrix derived by the 16S rRNA gene pyrosequencing data separated the communities into five groups (Figure 3). The first included ZY23, ZY24 and ZY25 sites (industrial area); the second, ZY28, ZY 29, ZY30, ZY31 (agricultural area); the third, only ZY34 and ZY35 (natural park); the fourth, ZY36 and ZY37 (urban area); and the last, ZY39, ZY40 and ZY41 sites (urban area). Finally, ZY21 (agricultural), ZY26 (industrial) and ZY38 (urban) were scattered into the clustering. Considering the geographic distribution of the urban, industrial, agricultural, and natural areas around Zhangye City, Borruso et al. [6] demonstrated how bacterial communities could be used as effective diagnostic tools in complex environments like cities, with the ability to discriminate each kind of environmental pollution according to land use. In this study, we evaluated the effects of specific environmental parameters (chemical elements/metal concentration) on microbiome diversity and on dynamic responses. We wished to test the effectiveness of microbial communities reflecting the environmental pollution by metal exposure. The hypothesis is that richer microbial communities may better resist long-term disturbances, reflected in the pollution in sediments.
According to MacDonald et al. [36] and Persaud et al. [37], we found that toxic concentrations of HMs in sediments were mainly due to different discharge sources into the channels, as well as to industrial by-products from coal combustion for energy production, as well as residuals from metallurgy industry and from chemical plants. Higher levels of chromium were caused by vehicle exhaust particles and the combustion of civil wastes [38]. N was directly released into the main urban channel through civil wastewater [39], but it was also present in the agricultural area due to fertilization. P, Cr and Hg concentrations had a negative correlation with bacterial richness (Table 4). This finding is consistent with what was found in a previous study by Borruso et al. [40], where P was identified as drivers of the rhizobacterial community distribution. We confirm also what is showed by other works that demonstrated the key role of HMs (such as Cr and Hg) on the functional diversity of bacterial communities in freshwater sediments [41], as well as in other ecosystems such as soils [42] or wastewater treatment plants [43]. Other studies instead showed that the diversity is not always negatively correlated with HMs but could increase at low doses of HMs [44], due to the possible enrichment of ecological niches favouring the incidence of different gene functionalities. Our results are somewhat consistent with the intermediate disturbance hypothesis. Surprisingly, we did not observe the significant correlations between pH variation and microbial shift ( Table 4) that were shown elsewhere [18,45].
The co-occurrence analysis showed the presence of 52 main bacterial genera mostly associated to civil and agricultural wastewater, gut microbiome or human metabolism. For example, nutrient availability (N and P) seemed to positively affect the genera Bacteroides (p-value = 0.008) and Sulfuricurvum (p-value = 0.005), while both negatively influenced the genus Ralstonia eutropha (p-value = 0.014). R. eutropha strain H16 is known to be a hydrogenoxidizing Betaproteobacterium commonly found in freshwater and soils. It can synthesize a wide range of metabolites and bioplastics (polyhydroxyalkanoates). This bacterium features both heterotrophic and lithoautotrophic metabolisms [46,47]. Additional genera that were negatively influenced were Sideroxydans (p-value = 0.009 for N, and 0.006 for P) and Sulfuritalea (p-value = 0.020 for N, and 0.002 for P), characterized by a metabolism involved in Fe and S transformation. Furthermore, pH showed a weak influence and positive significant effects on the genera Anaerolinea and Opitutus (p-value = 0.023 and 0.026 respectively). The first is usually found in natural anaerobic environments, and one of the core microbial populations in anaerobic digesters [48,49], meanwhile, Opitutus encompasses anaerobic fermentative species linked to nitrate reduction as a typical P. australis rhizobacterial genus [40,50]. The genus Bacteroides was also positively affected by Hg (p-value = 0.023) but negatively by Mn (p-value = 0.003). Wu et al. found that a combination of factors (HMs, physicochemical properties) induced the spreading and persistence of antibiotic-resistant genes (ARG's) among Bacteroides species [51]. Interestingly, the abundance of the genus Gelria, which has a role in syntrophic acetate oxidation [52], seemed to be positively influenced by metal concentration (Al-p-value = 0.014, Fe-0.028, Zn-0.034, Cu-0.009, and As-0.015). Gelria is a thermophilic, anaerobic, obligately syntrophic, glutamate-degrading genus, usually found in granular sludge within anaerobic sludge bed reactors [53]. The genus Syntrophus (positively influenced by Al-p-value = 0.004, Fe-0.030, Cr-0.016, Ni-0.004, and Co-0.027) is responsible for methane production and was found to be correlated to Gelria for treating the mixed-LCFA (long-chain fatty acids) at low temperature in municipal wastewater plants [54]. Syntrophus-associated sequences have been identified in anoxic sediments and sewage sludge [48]. Its presence could indicate potential environmental exposure to several xenobiotic compounds released by industrial activity. Indeed, trace amounts of HMs may stimulate the growth and activity of methanogens, and their high levels have toxic effects [43]. On the contrary, the genus Zooglea was negatively affected by Al (p-value = 0.02), Cu (p-value = 0.003), Co (p-value = 0.03) and As (p-value = 0.004). Zooglea are betaproteobacteria usually detected in both fresh and polluted waters/sediments [48]. Their cells produce finger-like projections or outgrowths in sludge flocs [55]. The genus Flavobacterium was negatively correlated with Cu (p-value = 0.01), Zn (p-value = 0.02), Pb (p-value = 0.007), Cd (p-value = 0.01) and As (p-value = 0.02). Flavobacteria are involved in ammonia, nitrite, and metal removal [56] and are often isolated in freshwater environments. Some of them are known to cause disease in freshwater fishes [48,57]. Most of the genera with a sulphur-oxidative metabolism (e.g., Sulfurimonas, Sulfuritalea) suffer from the spread of Hg, Mn and Cu. On the other hand, sulphate-reducing bacteria belonging to genera Desulfobacter play an important role in the transformation of metals such as Cu, Fe, Mn, Pb, Sb, and Zn into oxidizable fractions [58].
The responses and interactions identified through the chemical elements were confirmed by what has been observed by taxa dynamics. For instance, the genera Gelria and Syntrophus were positively co-correlated with Zooglea, whereas Syntrophus was negatively correlated. Due to the presence of high concentration of N and P, Bacteroides and R. eutropha H16 were negatively correlated. Considering the sulphur metabolism, the sulphur-and hydrogen-oxidizer Sulfurimonas positively influenced the sulphate-reducer Desulfobacter and vice versa. Here, cross-feeding microbial interactions could stimulate different genera of microorganisms to help them in adapting to environmental changes [59]. Separately, changes in microbial community dynamics owing to HM pollution (e.g., such as sensitive species replacement) largely depend on the community richness [60] and evenness could also be affected by community shifts [61].
Cluster analysis highlighted the separation of the 16S rRNA gene diversity according to the land use. Among the clusters, one group contained microbial communities from sites characterized by high levels of Cu, Zn, Ni, Pb, Hg, Cd and As and the presence of industrial effluents. The second group was composed of microbial communities strongly influenced by agricultural effluents. The third one included samples taken from a reed stand natural area. The fourth group consisted of samples from an area with high levels of Cr, Ni and Hg and contamination from city effluents. The last cluster included sites with high levels of Ni and Hg and contamination from city effluents. As already proposed by Borruso et al. [6], ZY38 showed a different community due to the generally lower concentrations of metals and nutrients, whereas the divergence of ZY26 was probably due to a dramatic algal bloom observed during the sampling. ZY21 seemed to be affected by the urban and agricultural discharges, resulting in a borderline sample between the two areas.

Conclusions
In the Zhangye area, we investigated the diversity and the distribution of bacterial communities in 17 interconnected freshwater habitats with different land uses. We then examined the influence of environmental factors on bacterial communities, with the aim of testing alpha-diversity measurement as a tool for the efficient biomonitoring of pollution.
Our results indicate that lower alpha diversity index values of microbial communities correlate with higher rates of trace pollutants and a wide range of environmental factors such as nutrients and chemicals (HMs), stimulating mutual relationships among bacteria as the co-correlation analyses show (Figures 2 and 3 and Tables 3 and 4). These findings suggest that bacteria are sensitive to changes in biotic/abiotic factors. Indeed, the distribution of the bacterial species in relation to the land use and thus to pollutant/nutrient/metal presence was confirmed by PERMANOVA with significant difference (ρ < 0.001- Table 5) and habitat-specific bacterial species were identified. Moreover, the cluster analysis ( Figure 4) highlighted similarities in the ongoing bacterial taxon selection processes, probably due to the different concentrations of polluting compounds in the different areas. We demonstrated that alpha diversity measurement and the metagenomic approach could be valid tools for assessing water/sediment quality for human and environmental safety. Supplementary Materials: The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/w14121948/s1, Table S1: Pearson co-correlation plot (2%) showing the bacterial taxonomy according to the concentrations and spreading of nutrients/pH/metals, as well the taxa dynamics (inhibition or induction effects) induced by the taxa themselves. Table S2: Summary of the main outcomes Table S3: Mean relative abundance (frequencies of reads) and standard deviation per land use measured for the main 52 genera detected.  Data Availability Statement: All data generated or analysed during this study are included in this published article and in the supplementary information available at www.mdpi.com/xxx/s1 website.