Microorganisms
  • Article
  • Open Access

25 December 2025

Multi-Regional Study on the Microbial Community Structure, Core Microbiome and Functional Characteristics in Deep Fracture Waters

1
State Key Laboratory of Lithospheric and Environmental Coevolution, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing 100029, China
2
College of Earth and Planetary Sciences, University of Chinese Academy of Sciences, Beijing 101408, China
*
Author to whom correspondence should be addressed.
Microorganisms 2026, 14(1), 45; https://doi.org/10.3390/microorganisms14010045
(registering DOI)
This article belongs to the Special Issue Advances in Genomics and Ecology of Environmental Microorganisms

Abstract

The deep terrestrial subsurface is the largest reservoir of Earth’s freshwater resources as well as the largest habitat for prokaryotic life. However, the deep-subsurface microbiome, especially its spatial distribution across countries and continents, is still poorly understood. In this study, we compiled and compared 30 16S rRNA gene amplicon libraries from three deep fractured aquifers in different parts of the world (depth range of tens of meters to 2.4 km below surface) to understand the spatial distribution and functions of deep-subsurface microbial communities, and to test for the presence of core taxa. The results revealed spatially heterogeneous microbial community composition at both the local and the global scales, even at the phylum level. Environmental filtering was identified as an important driver of the microbial community structure of deep groundwaters. Despite the spatial heterogeneity, the three aquifers share a core microbiome at the genus level. Only one family, Comamonadaceae, was present in all 30 samples analyzed. Several other families were also prevalent, including Hydrogenophilaceae, Omnitrophaceae, BSV26 (Candidatus Kryptonia), and an unclassified Thermodesulfovibrionia. FAPROTAX functional prediction indicated that chemoheterotrophic functions predominate, and the core microbial genera, together with the dominant genera, collectively govern the functional characteristics. Taken together, our findings provide new insights into the spatial heterogeneity and functional potential of deep-subsurface ecosystems across the globe.

1. Introduction

The deep terrestrial subsurface (hundreds to thousands of meters below the Earth’s surface) is now recognized as the largest habitat of prokaryotic life on Earth [1,2]. Its unique physicochemical conditions, such as high temperature, high pressure, nutrient depletion, and dark anaerobiosis, foster diverse microbial communities with specialized functions [3,4,5]. Groundwater in deep, fractured aquifers is not only a core habitat for microbial survival but also a vital carrier for material cycling and energy transfer. The composition and function of groundwater microbial communities directly influence the biogeochemical cycles of carbon, nitrogen, sulfur, and other elements in the deep subsurface.
Despite recent advancements in high-throughput sequencing technology that have promoted research in deep subsurface microbiology, studies on microbial communities in deep fractured aquifers still face numerous challenges compared to those in surface and shallow environments. On the one hand, the difficulty and high cost of obtaining deep subsurface samples result in limited high-quality datasets in the literature. On the other hand, the geological backgrounds (e.g., lithology, structure) and hydrogeochemical conditions (e.g., pH, redox potential, dissolved nutrient content) of deep strata vary substantially among regions. This has hindered the formation of a unified understanding regarding the spatial distribution patterns, driving mechanisms, and the core functional taxa of deep-subsurface microbial communities [6]. In particular, the existence and ecological functions of the “core microbiome” are a hot topic and point of contention in current microbial ecology research [7]. The “core microbiome” is considered to comprise the key taxa adapted to the common selective pressures of specific habitats, and its presence or absence directly reflects the degree of habitat homogenization and microbial adaptive strategies [8,9]. Furthermore, the coupling relationship between the functional potential of deep-subsurface microorganisms (e.g., organic matter degradation, distribution of functional genes related to element cycling) and their associated hydrogeochemical conditions requires further verification through systematic data analysis and functional prediction [10].
To date, most deep-subsurface microbiome research has focused on individual regions [11,12,13,14], or addressed the topic solely from the perspective of geochemistry [15] or microbial diversity [16], and therefore lacks an overarching understanding of the whole picture. Furthermore, because different research groups often employ different strategies in sequencing data processing and data visualization, the results across different 16S-based microbiome studies are almost impossible to compare, necessitating meta-analyses based on their raw sequencing data [17,18]. To address the aforementioned research gaps, in this study we analyzed the 16S rRNA gene amplicon sequencing data in conjunction with hydrogeochemical data from three deep, fractured aquifers across the globe. The main objectives of this study are (1) to find out if the hydrogeochemical properties of deep fracture waters are heterogeneous across different geographical regions; (2) to analyze the diversity, abundance distribution, and similarity patterns of microbial communities in deep fracture waters; (3) to find out if a core microbiome exists in deep aquifers across different continents; and (4) to predict the functional potential of deep aquifer microbial communities and evaluate the effect of hydrogeochemical conditions on their structures and functions.

2. Materials and Methods

2.1. Study Area

Three deep-subsurface aquifers (Figure 1) were selected in this study, accessed through the BedrettoLab Deep Life Observatory (DELOS), the Kidd Creek Observatory (KCO) and the Sanford Underground Research Facility (SURF), respectively. The DELOS is located in the Riale di Ronco high-altitude Alpine catchment of the Gotthard Massif, Switzerland, accessible via the Ronco Portal of the 5.2 km long Bedretto Tunnel (46.497518° N, 8.494992° E) that is tens of meters to 1.6 km underground [19]. Its lithology includes three geological units [20,21]: the Tremola Series (tunnel meter (TM) 0–434), composed of mica-gneiss, amphibolites, schists, calc-silicate rocks, and quartzites [22]; the Prato Series (TM 434–1138), dominated by mica-gneisses, amphibolites, and schists with centimeter-to-meter scale compositional heterogeneity; and the Rotondo granite (TM 1138–5218), further divided into equigranular granite (RG1) and biotite-rich porphyritic granite (RG2) with zircon U-Pb intrusion ages ranging from 280 to 335 Ma [20]. The KCO is situated 2.39 km (7850 ft) underground in an active copper-silver-zinc mine of the 2.7 Ga Canadian Shield near Timmins (Ontario, Canada) within the traditional territories of the Anishinabewaki and Cree Nations. Its lithology is dominated by rhyolitic layers bounded by high-silica rhyolite flows, tholeiitic basalts, and up to 500 m-thick carbonate-altered komatiitic flows that host the 2.7 Ga Kidd Creek massive sulfide deposit [23]. The SURF, formerly the Homestake Gold Mine, is situated in Lead, SD, USA, within the northern Black Hills region. The samples analyzed here were obtained from the 4850 ft (1478 m) underground level of SURF, within the Precambrian Poorman Formation dominated by sericite-carbonate-quartz phyllite [24].
Figure 1. Geographical locations of the underground laboratories where the samples analyzed in this study were obtained.

2.2. Data Acquisition

To evaluate the community composition and functional potential of the deep subsurface fracture water microbiome, four published microbial assessment studies were selected based on the following criteria: (1) Fluid samples originated from deep-subsurface fractured aquifers. (2) The 16S rRNA gene amplicon sequence data were publicly available. (3) Amplification of the V4 and V5 hypervariable regions of the 16S rRNA gene was performed using the universal primer pair 515F-Y (5′-GTGYCAGCMGCCGCGGTAA)/926R (5′-CCGYCAATTYMTTTRAGTTT) [25]. (4) High-throughput sequencing was performed using the Illumina MiSeq platform (Illumina, San Diego, CA, USA). Only a subset of representative samples from each study was selected (15 samples from SURF, 14 samples from DELOS and 1 sample from KCO), covering as much microbial variety as possible with a small number of samples (30 samples in total).
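To make criterion (3) concrete, the degenerate positions in the two primers can be expanded with the standard IUPAC nucleotide codes. The following is only an illustrative sketch and not part of the published workflow; the helper names (`primer_to_regex`, `degeneracy`, `starts_with_primer`) are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for these primers).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT",
}

# Primer sequences as quoted in the text (5' to 3').
PRIMER_515F_Y = "GTGYCAGCMGCCGCGGTAA"
PRIMER_926R = "CCGYCAATTYMTTTRAGTTT"


def primer_to_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


def degeneracy(primer):
    """Number of distinct sequences the degenerate primer represents."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def starts_with_primer(read, primer):
    """True if the read begins with any expansion of the degenerate primer."""
    return re.match(primer_to_regex(primer), read) is not None
```

Under these codes, 515F-Y contains two twofold-degenerate positions and 926R contains four, so the primers stand for 4 and 16 concrete sequences, respectively.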
Raw high-throughput 16S rRNA gene amplicon sequence data sets for the SURF groundwater were obtained from the European Nucleotide Archive at the European Bioinformatics Institute, under accession nos. PRJEB35125 [26] and PRJEB44691 [27]. The former accession no. corresponds to pristine groundwater samples obtained from artesian boreholes and dripping fractures including BoreholePST, BoreholeGC, Port17Ledge, FractureA, FractureB and FractureC [26], whereas the latter accession no. corresponds to time-series produced water samples from a 10-month injection test at a geothermal test site, named according to the date each sample was taken: PB/PDT/PI/PST-MMDDYY [27,28]. Raw amplicon sequence data for the DELOS were obtained from the National Center for Biotechnology Information (NCBI) under accession no. PRJNA1181539 [19]. This accession no. corresponds to pristine groundwater samples obtained from flowing fractures (TM1306, TM1494, TM2647, TM2848, TM4166, TM4348, TM4447, TM444, TM4652, TM4752, TM4846 and TM5132) and uncased boreholes (TM901, TM2794), named according to the horizontal distance from the tunnel entrance at which each sample was taken: TM-XXXX. Raw amplicon sequence data for the KCO were obtained from the European Nucleotide Archive at the European Bioinformatics Institute, under accession no. PRJEB49237 [29]. This accession no. corresponds to pristine groundwater samples obtained from artesian boreholes including KC12299.
Note that the samples collected on 16 May 2019 and 11 December 2019 in the SURF time-series produced fluid dataset were selected in this study because the community compositions differed significantly between these two dates in each well. We aimed to capture as much community variation as possible within each study area using the minimum number of samples. One exception is PB_082819, whose community structure was similar to that of PB_121119; PB_082819 was included only as a reference. For the DELOS dataset, we selected one sample per tunnel meter. For each location with time-series data available (TM444, TM1306, TM1494, TM2848 and TM4652), only the last sample of the time series was selected. For the KCO dataset, only samples from one borehole, named 12299, were deemed representative of the aquifer, which is why only one sample, “KC12299”, was selected from KCO.
Hydrogeochemical data for the three regions were obtained from the US Department of Energy Geothermal Data Repository (at https://gdr.openei.org/submissions/1424, accessed 1 October 2025), https://github.com/GeobiologyLab/DELOS-2021-time-series (accessed 1 October 2025), and the Interdisciplinary Geoscience Data Alliance (at https://doi.org/10.60520/IEDA/113522, accessed 1 October 2025), respectively.

2.3. High-Throughput Sequencing Data Processing

SURF: Primer sequences were trimmed from the raw sequencing reads of each sample using cutadapt (version 5.2) [30]. The DADA2 (version 1.34.0) [31] and phyloseq (version 1.50.0) [32] packages in R were used to analyze the sequencing data. Sequence quality filtering was performed by truncating reads longer than 220 base pairs (discarding bases with quality scores < 30) and removing sequences that did not perfectly match the proximal primers, contained more than two expected errors, or harbored ambiguous bases (Ns). The DADA2 [31] method was used to infer Amplicon Sequence Variants (ASVs) from quality-filtered reads, remove sequencing errors, merge forward and reverse reads while disallowing mismatches in overlapping regions, remove chimeras, and generate an ASV table.
DELOS: All data processing procedures were conducted on the Galaxy online bioinformatics cloud platform (https://usegalaxy.org). Based on the sequence quality, the first 5 nucleotides of the forward and reverse reads were trimmed using the FASTQ Trimmer by column (Galaxy Version 1.2+galaxy0) [33,34]. In addition, the last 21 nucleotides of the reverse reads were trimmed using FASTQ Trimmer by column. After trimming, Trimmomatic (Galaxy Version 0.39+galaxy2) [35] was used for quality filtering of the reads (SLIDINGWINDOW:100:28), and adapters (TruSeq3, paired-end, for MiSeq and HiSeq) were removed. Paired-end reads were joined using fastq-join [36] (-p 3 -m 20). The resulting data were downloaded, and an ASV table was generated in R using the DADA2 package.
KCO: Sequence quality filtering was carried out the same as for the SURF dataset, with the sole modification that reads exceeding 230 bp were truncated.
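The expected-error criterion used for the SURF and KCO reads can be illustrated with a short sketch. The expected number of errors in a read is the sum of the per-base error probabilities implied by the Phred scores, EE = Σ 10^(−Q/10). The function names below are hypothetical, the defaults mirror the values quoted in the text (truncation at 220 bp, at most two expected errors, no Ns), and the code is a simplified stand-in rather than the DADA2 implementation itself.

```python
def expected_errors(quals):
    """Expected number of errors in a read: sum of 10^(-Q/10) over Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def filter_read(seq, quals, trunc_len=220, max_ee=2.0):
    """Truncate a read to trunc_len, then apply simple quality filters.

    Returns the truncated (seq, quals) pair, or None if the read is rejected.
    Assumption: reads shorter than trunc_len are discarded, as is common
    practice when a fixed truncation length is used.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if len(seq) < trunc_len:
        return None
    seq, quals = seq[:trunc_len], quals[:trunc_len]
    if "N" in seq:  # ambiguous bases are not allowed
        return None
    if expected_errors(quals) > max_ee:
        return None
    return seq, quals
```

For example, a 220 bp read with a uniform quality of Q30 carries about 0.22 expected errors and passes, whereas the same read at Q10 carries about 22 and is rejected.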
The above data processing methods were selected to maintain consistency with the data processing approaches in the original publications. Finally, the sequence tables generated for each dataset were merged, with duplicate ASVs resolved to produce an ASV count matrix for subsequent analysis. ASVs were annotated using the DADA2 package and the Silva nr99 v138.2 dataset [37,38,39].
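The merging step can be pictured as follows. This is a minimal sketch under the assumption that each per-dataset table maps sample names to per-sequence read counts and that two ASVs are "duplicates" when their nucleotide sequences are identical; the function name and data layout are hypothetical.

```python
def merge_asv_tables(*tables):
    """Merge per-dataset ASV tables into one count matrix.

    Each table is a dict: sample name -> {ASV sequence: read count}.
    Identical sequences from different datasets share a single column.
    Returns (sorted list of ASV sequences, {sample: list of counts}).
    """
    merged = {}
    for table in tables:
        for sample, counts in table.items():
            if sample in merged:
                raise ValueError("duplicate sample name: " + sample)
            merged[sample] = dict(counts)
    asvs = sorted({seq for counts in merged.values() for seq in counts})
    matrix = {
        sample: [counts.get(seq, 0) for seq in asvs]
        for sample, counts in merged.items()
    }
    return asvs, matrix
```

ASVs absent from a sample are filled with zero counts, so every sample ends up with a vector of the same length.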

2.4. Water Chemistry Data Processing

Bicarbonate data were not available in some of the original publications. Therefore, based on a measured pH of 7–9.5 for all samples, bicarbonate concentrations were calculated via charge balance, with the assumption that the net positive charge of the samples is fully offset by bicarbonate [40].
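A charge-balance estimate of this kind can be sketched as below. The set of ions included, the molar masses, and the function name are assumptions made for illustration; the ions actually used in the study are not listed in the text.

```python
# Molar mass (g/mol) and charge for a few major ions (illustrative subset).
IONS = {
    "Na": (22.990, +1),
    "K": (39.098, +1),
    "Ca": (40.078, +2),
    "Mg": (24.305, +2),
    "Cl": (35.453, -1),
    "SO4": (96.06, -2),
}
HCO3_MOLAR_MASS = 61.017  # g/mol, singly charged


def bicarbonate_by_charge_balance(conc_mg_per_l):
    """Estimate HCO3- (mg/L) as the anion needed to offset net positive charge.

    conc_mg_per_l maps ion names (keys of IONS) to concentrations in mg/L.
    mg/L divided by g/mol gives mmol/L; multiplying by the charge gives meq/L.
    A net negative balance is clipped to zero.
    """
    net_meq = 0.0
    for ion, conc in conc_mg_per_l.items():
        molar_mass, charge = IONS[ion]
        net_meq += conc / molar_mass * charge
    return max(net_meq, 0.0) * HCO3_MOLAR_MASS
```

Because every measurement error in the other ions is absorbed into the bicarbonate estimate, values obtained this way are best read as approximate.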

2.5. Diversity and Statistical Analyses

The diversity and statistical analyses were performed using the R packages phyloseq (1.50.0) [32], vegan (2.7.2) [41], and NbClust (3.0.1) [42]. The Shannon and Observed indices were used to study alpha diversity. For beta diversity analyses, we calculated community similarity at both the phylum and the ASV level using both the Bray–Curtis and weighted UniFrac metrics. For the Bray–Curtis metric, the computation was implemented using the distance() function in the R package phyloseq, with the parameter method = “bray” specified to generate the Bray–Curtis distance matrix. The optimal number of clusters was determined by silhouette analysis [43] on the Bray–Curtis dissimilarity matrix [44] using the NbClust() function (min.nc = 2, max.nc = 10, method = “ward.D”, index = “silhouette”) from the NbClust package [42]. For the weighted UniFrac metric, ASV-level analyses were performed using method = “wunifrac”, with all other parameters identical to those used for the Bray–Curtis metric. At the phylum level, ASVs were first agglomerated to the phylum rank using tax_glom, after which the same procedures as in the ASV-level analysis were applied. The VennDiagram (version 1.7.3) [45] package in R was used to draw Venn diagrams.
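For readers unfamiliar with the Bray–Curtis metric, the sketch below computes it directly from its definition, BC = Σ|u_i − v_i| / Σ(u_i + v_i). It is a plain-Python illustration, not the phyloseq implementation, and it leaves open whether counts were normalized beforehand, which the text does not state.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(samples):
    """Pairwise dissimilarities for {sample name: abundance vector}."""
    names = list(samples)
    return {(a, b): bray_curtis(samples[a], samples[b]) for a in names for b in names}
```

The value is 0 for identical samples and 1 for samples that share no taxa.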

2.6. Functional Prediction Analyses

The ASV table was first agglomerated at the genus level to obtain the number of reads of each genus in each sample. Then, the abundance of each function in each sample was calculated by summing the reads of those genera that are associated with the function in the database. Note that ASVs not classifiable at the genus level were not considered in the FAPROTAX analysis. The above data analysis was performed using the Majorbio Cloud (www.majorbio.com) [46,47]. Functional aggregation was performed using the official collapse_table.py script provided by FAPROTAX (v1.2.1) [48,49], generating an abundance table of different functional categories across all samples. Subsequently, the functional abundance table was exported, and the abundance of each function was divided by the total number of functional reads in each sample to obtain the relative abundance of each function in each sample. Finally, the R packages ComplexHeatmap (version 2.22.0) and vegan (version 2.7.2) were used to generate the heatmap of functional relative abundance. For each functional category, the relative abundance values were standardized to z-scores [50] with a mean of 0 and a standard deviation of 1. Clustering was performed using the Bray–Curtis dissimilarity at the genus level for samples (columns) and Euclidean distance for functional categories (rows).
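The aggregation logic described above can be sketched in a few lines. This is not the FAPROTAX collapse_table.py script; the function names and the genus-to-function mapping format are hypothetical, and the z-score uses the sample standard deviation (n − 1), which is an assumption because the text does not say which estimator was used.

```python
from statistics import mean, stdev


def function_abundance(genus_counts, genus_to_functions):
    """Sum the reads of all genera annotated with each function (one sample)."""
    totals = {}
    for genus, reads in genus_counts.items():
        # Genera without any annotation contribute to no function.
        for func in genus_to_functions.get(genus, ()):
            totals[func] = totals.get(func, 0) + reads
    return totals


def relative_abundance(totals):
    """Divide each function's reads by the total functional reads of the sample."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {func: reads / grand_total for func, reads in totals.items()}


def z_scores(values):
    """Standardize one function's values across samples (mean 0, sd 1)."""
    m, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(x - m) / sd for x in values]
```

Note that a genus annotated with several functions is counted once per function, so the functional reads of a sample can exceed its number of annotated reads.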

3. Results

3.1. The Hydrochemical Properties of Deep Fracture Waters Are Spatially Heterogeneous Across Study Regions

The pH of the three selected aquifers ranges from 7 to 9.5, indicating a weakly alkaline environment. A Piper diagram was generated based on the water chemistry data from each aquifer (Figure 2, Table S1). The hydrochemical types in the SURF region exhibit distinct patterns depending on whether the sampled water was pristine or not. The pristine groundwater samples from artesian boreholes and dripping fractures, including BoreholePST, FractureA, FractureB and FractureC, almost completely overlapped in the lower-right corner of the cation diagram and in the upper corner of the anion diagram, indicating that they had very similar geochemical compositions, corresponding to the SO4-Na·K hydrochemical type. In contrast, the time-series produced water samples exhibit a scattered distribution in the lower-left region of the cation diagram. On the anion diagram, the time-series data points are distributed along the HCO₃⁻ axis, dominated by HCO₃⁻ and SO₄²⁻ ions. Based on these geochemical signatures, the time-series data points are classified as either the HCO3-Ca·Mg or the HCO3·SO4-Ca·Mg hydrochemical type. The hydrochemical composition of deep fracture water in the DELOS region exhibits higher dispersion. On the cation diagram, data points are distributed along the Ca²⁺ axis, dominated by Ca²⁺ and Na⁺/K⁺. However, the anion diagram exhibits distinct banding patterns: the TM901 and TM1494 samples overlap almost entirely in the upper corner, indicating sulfate-rich water types; the TM1306 and TM2647 samples show remarkable similarity, with SO₄²⁻ predominating alongside increased HCO₃⁻; the proportion of HCO₃⁻ in TM2794–TM5132 generally increases with distance into the tunnel, consistent with the original findings [19]. In particular, the TM4166 sample occupies the lower-left corner, where HCO₃⁻ is strongly dominant. The KCO region was not plotted on the Piper diagram because only one sample, KC12299, was available, and its water chemistry data were incomplete.
However, previous studies [29,51,52,53] have revealed that the deep fracture water in this region is typically characterized by high salinity, as exemplified by sample KC12299 with a measured total dissolved solids (TDS) content of 197,274 mg·L⁻¹, corresponding to the Ca-Cl hydrochemical type. Measured TDS in DELOS ranged from 40 to 670 mg·L⁻¹. Since the SURF dataset did not include measured TDS values, TDS was estimated using the empirical formula
$$\mathrm{TDS} = \sum(\text{anions}) + \sum(\text{cations}) - \frac{1}{2}\,\mathrm{HCO_3^-}$$
where each symbol in parentheses denotes the mass concentration of the corresponding dissolved ions in mg·L⁻¹. The results showed that the TDS content in the SURF area ranged from 530 to 6700 mg·L⁻¹. This calculation approach has inherent limitations, including incomplete coverage of all components contributing to TDS and the potential amplification of uncertainties associated with ion measurements. Therefore, the results should be regarded as indicative rather than definitive.
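As a worked illustration of this empirical estimate (the sum of all ion mass concentrations minus half of the bicarbonate concentration), consider the short sketch below. The function name, the dictionary layout, and the example concentrations are hypothetical and are not values from the SURF dataset.

```python
def estimate_tds(anions, cations):
    """Empirical TDS estimate in mg/L.

    anions and cations map ion names to mass concentrations in mg/L.
    Bicarbonate is expected under the key "HCO3" in the anions dict;
    half of it is subtracted from the total.
    """
    bicarbonate = anions.get("HCO3", 0.0)
    return sum(anions.values()) + sum(cations.values()) - 0.5 * bicarbonate


# Hypothetical water sample, concentrations in mg/L.
example = estimate_tds(
    anions={"HCO3": 200.0, "SO4": 300.0, "Cl": 50.0},
    cations={"Na": 100.0, "Ca": 80.0},
)
```

For this hypothetical sample the ions sum to 730 mg/L and the estimate is 630 mg/L after the bicarbonate correction.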
Figure 2. Hydrochemical composition of the groundwater samples analyzed in this study displayed in a Piper diagram.

3.2. Different Microbial Community Compositions Within-Site and Across-Sites

A total of 2486 ASVs were inferred from the selected SURF sequencing dataset, encompassing 68 phyla and 254 families. A total of 25 ASVs were inferred from the selected KCO sequencing dataset, encompassing 7 phyla and 19 families. A total of 6063 ASVs were inferred from the selected DELOS dataset, encompassing 69 phyla and 248 families. The number of raw reads, the library size after quality filtering, and the number of ASVs for each sample are summarized in Table S2. The microbial community compositions of all 30 selected samples are shown in Figure 3. A number of taxa were present in multiple samples across different aquifers.
Figure 3. The microbial community composition of the 30 groundwater samples analyzed in this study. Bar plots show the finest classification possible down to the family level. The major taxa (i.e., taxa that were within the top 10 most abundant in at least one sample) are shown in color. The complete legend is shown in Figure S1.
In terms of community dominance and structural complexity, the three regions exhibited marked differences in their average community characteristics (Figure 4). The single sample from the KCO region consists almost entirely of one bacterial phylum and its corresponding family: Halanaerobiaeota (phylum) and Halobacteroidaceae (family), the latter accounting for 92.2% of the total abundance. All other taxa, including α-Proteobacteria, γ-Proteobacteria, Actinobacteria and Planctomycetes, are present at abundances below 6%, resulting in a “single-taxon-dominated” [54] community feature. In contrast, DELOS and SURF display more complex and diverse community structures. As shown in Figure 4, DELOS is characterized by a balanced distribution of multiple dominant taxa: the three most abundant phyla (Pseudomonadota, Nitrospirota, and Verrucomicrobiota) collectively represent 37.4% of the total abundance, and the three most abundant families (BSV26, Omnitrophaceae and Acidiferrobacteraceae) sum to 17.8%. No single taxon in DELOS exceeds 15% abundance, indicating a “multi-taxon-co-dominant” pattern [55]. SURF, meanwhile, exhibits a “few-dominant-taxa-led” structure: its three most abundant phyla (Pseudomonadota, Thermodesulfobacteriota and Bacillota) account for 59.3% of the total abundance, and the three most abundant families (Hydrogenophilaceae, Rhodocyclaceae and Halothiobacillaceae) contribute 22%.
Figure 4. Averaged microbial community profile in the sample set across three regions. Only the major taxa (top 40 most abundant across wells in the averaged microbial community profile) are shown in color. For taxa not classifiable at the family level, the finest possible classification is shown.

3.3. Alpha Diversity Analysis

Alpha diversity generally refers to metrics that describe species richness, evenness, or the inherent diversity of a given sample [56]. The Shannon alpha diversity index (H’) was calculated for the three study regions (Figure 5a). Among them, the DELOS region exhibited the highest mean diversity (H’ = 4.41), followed by the SURF region (H’ = 3.92), while the KCO region showed the lowest value (H’ = 1.05). These results indicate that the microbial communities in the DELOS region are relatively more diverse and even, whereas the KCO region has the lowest community diversity, which is consistent with the relative abundance profile of each region. Consistent with the elevated Shannon indices (Figure 5a), observed species richness (Figure 5b) was also exceptionally high in DELOS samples TM1306 and TM2848. Interestingly, TM444, BoreholePST, and FractureB exhibited high observed richness but low Shannon values, indicating dominance by a few taxa, a pattern consistent with the corresponding relative-abundance profiles.
Figure 5. Box plots of alpha-diversity metrics: (a) Shannon diversity index; (b) Observed species richness. The vertical lines indicate the boundaries of the data distribution, and the horizontal lines from bottom to top represent the minimum boundary (Q1 − 1.5 × IQR), the first quartile (Q1, 25th percentile), the median (50th percentile), the third quartile (Q3, 75th percentile), and the maximum boundary (Q3 + 1.5 × IQR). Note that IQR = interquartile range. H’ represents the mean value of the Shannon–Wiener diversity index.
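The two alpha-diversity metrics used here are simple to compute from a vector of ASV counts. The sketch below uses the natural logarithm for the Shannon index, which is the default in the R packages named in the Methods but is stated here as an assumption; the function names are illustrative.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)
```

A sample can combine high richness with a low Shannon value when one taxon holds most of the reads: the counts [97, 1, 1, 1] and [25, 25, 25, 25] both have a richness of 4, but only the second reaches the maximum H’ of ln 4.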

3.4. Samples from Different Regions Form Distinct Clusters

To assess the beta diversity of microbial communities in the deep subsurface, Principal Coordinate Analysis (PCoA) was performed using both the Bray–Curtis and weighted UniFrac distance metrics (Figure 6). The Bray–Curtis distance was calculated between each pair of samples at the ASV level (Figure 6a). The first two principal coordinates (PCo1 and PCo2) explained 12.5% and 7.2% of the total variance, respectively, with samples from different sampling sites (DELOS, KCO, and SURF) forming distinct clusters. The DELOS samples (green dots) mostly clustered in the left-hand region of the plot, forming two relatively compact subclusters. This indicates a certain degree of dissimilarity in microbial community composition between the two subclusters at this sampling site. SURF samples (blue dots) were distributed across the right-hand region, forming a cluster distinct from that of the DELOS samples. This suggests significant differences in microbial community composition between the SURF and DELOS sites. On the other hand, the SURF produced-fluid samples were relatively dispersed, with greater inter-sample distances between time-series produced waters than between pristine groundwater samples. The KCO sample (red dot) occupies a position between the DELOS and SURF clusters, suggesting that its microbial community composition possesses unique characteristics while exhibiting associations with both other sites. Naturally, the limited sample size of KCO must be considered as a potential confounding factor, and its community features require further validation with additional data. PERMANOVA analysis (F = 2.02, R² = 0.48, p = 1.00 × 10⁻⁴) confirmed statistically significant differences in microbial β-diversity across distinct deep subsurface locations. This result could be an indicator of the significant influence of unique subsurface environmental conditions on shaping the microbial community structure.
Figure 6. PCoA analysis plots of microbial community datasets across the three regions: (a) Based on the Bray–Curtis distance at the ASV level; (b) Based on the weighted Unifrac distance at the ASV level; (c) Based on the weighted Unifrac distance at the Phylum level.
Principal coordinate analysis (PCoA) based on the weighted Unifrac distance reveals notable segregation between the DELOS (green) and the SURF (blue) communities along the first principal coordinate (PCo1) (Figure 6b). This pattern indicates significant differences in groundwater community structure between the two regions. However, partial overlap between clusters suggests the presence of shared core microbial taxa. Collectively, PCo1 and PCo2 explained approximately 40% of the total variance, representing the primary axes of microbial community differentiation within the study areas. Upon taxonomic agglomeration to the phylum level, samples from the three regions again formed visibly distinct clusters (Figure 6c), highlighting across-region community heterogeneity even at the phylum level (Figure S2). PERMANOVA indicated that sampling location exerted a significant effect on microbial community structure (F = 10.11, R2 = 0.43, p = 0.0001), underscoring geography as a key factor shaping phylogenetically weighted community composition.
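For orientation, the PERMANOVA statistics reported in this section (pseudo-F, R², and a permutation p-value) can be computed from a distance matrix as sketched below. This is a plain-Python illustration of the standard one-factor formulation, not the routine used in the study; the function names, the number of permutations, and the random seed are assumptions.

```python
import random


def pseudo_f(dist, groups):
    """Pseudo-F and R2 for a one-factor design.

    dist is a square symmetric matrix (list of lists); groups holds one label
    per sample, in the same order as the rows of dist.
    """
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    f_stat = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f_stat, ss_among / ss_total


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (pseudo-F, R2, p) with p estimated by permuting group labels."""
    rng = random.Random(seed)
    f_obs, r2 = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)
```

With this estimator the smallest attainable p-value is 1/(n_perm + 1), so a reported p of 1.00 × 10⁻⁴ would be the minimum possible with 9999 permutations; the number of permutations actually used is not stated in the text.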

3.5. Limited Shared Microbial Taxa Among Three Regions

Venn diagram analysis was used to identify the microbial taxa shared among different regions. Six phyla (Pseudomonadota, Planctomycetota, Bacillota, Chloroflexota, Actinomycetota, and Deinococcota) were common to all three regions (Figure 7a, Figure S2 and Table S3), representing 7.50% of all phyla detected in this study. Forty-nine phyla (61.25%) were shared exclusively between SURF and DELOS, among which Nitrospirota, Patescibacteria, Verrucomicrobiota, and Candidatus Kryptonia exhibited markedly higher relative abundances. The three most abundant phyla exclusive to the SURF region were Deferribacterota, Campylobacterota, and FW113, whereas the three most abundant phyla exclusive to the DELOS region were Candidatus Lindowbacteria, Firestonebacteria, and Abditibacteriota. In contrast, KCO contained only 7 phyla (8.75%), although this small number could be the result of the insufficient sample size. At the family level (Figure 7b), 14 families were common to all three regions, representing 4.17% of all the families detected across the three aquifers. Notably, SURF and DELOS harbored a shared core microbiome of 147 families, which constituted 43.75% of the combined family pool (i.e., the total number of unique families) across the three regions. A total of 7 shared genera were detected across the three regions, accounting for 1.22% of all genera detected (Figure 7c). At the ASV level, no shared nucleotide sequences were identified (Figure 7d), although this could be an artifact of the different bioinformatics pipelines used and of PCR amplification bias across different aquifers.
Figure 7. Venn diagram analysis on the microbial taxa across different regions at the (a) phylum level, (b) family level, (c) genus level, and (d) ASV level.
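The bookkeeping behind such a Venn diagram amounts to assigning each taxon to the exact combination of regions in which it occurs. The sketch below shows one way to do this; the function name is hypothetical, and the four-phylum example is a toy subset whose region memberships follow the statements in the text, not the full dataset.

```python
def venn_regions(sets):
    """Partition taxa by the exact combination of groups that contain them.

    sets maps a group name to a set of taxa. Returns a dict mapping each
    frozenset of group names to the taxa found in exactly those groups,
    together with the total number of unique taxa.
    """
    universe = set().union(*sets.values())
    regions = {}
    for taxon in universe:
        key = frozenset(name for name, members in sets.items() if taxon in members)
        regions.setdefault(key, set()).add(taxon)
    return regions, len(universe)


# Toy subset of phyla, with memberships as described in the text.
toy = {
    "SURF": {"Pseudomonadota", "Nitrospirota", "Deferribacterota"},
    "DELOS": {"Pseudomonadota", "Nitrospirota", "Abditibacteriota"},
    "KCO": {"Pseudomonadota"},
}
toy_regions, toy_total = venn_regions(toy)
```

Dividing the size of each region by the total number of unique taxa gives percentages of the kind reported for each level of the taxonomy.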

3.6. FAPROTAX Functional Prediction Reveals Metabolic Niche Partitioning of Groundwater Microbiota in C–N–S–Fe Cycling Across the Three Study Regions

FAPROTAX is a functional annotation database that establishes associations between bacterial/archaeal taxa and metabolically/ecologically relevant functions (e.g., nitrogen fixation, sulfate respiration, and hydrocarbon degradation) based on the literature documenting cultured representatives [57]. This database was originally constructed for a study focusing on marine environments, incorporating 80+ functional groups and taxonomic details corresponding to over 4600 bacteria and archaea from oceanic habitats [48,58]. Numerous existing studies have employed FAPROTAX for the functional annotation of groundwater prokaryotes, which has verified the reliability of its annotation results and supported its applicability in this field [59,60,61,62].
In this study, a grouping analysis was conducted on 12,264 target records, with 1871 out of 12,264 records (15.26%) assigned to at least one group, and 61 groups were represented (i.e., associated with at least one record). FAPROTAX functional prediction based on 16S rRNA gene sequences indicated that the groundwater microbial communities in the three regions all possessed the metabolic potential to drive the biogeochemical cycling of key elements such as C, N, S, and Fe (Figure 8). Overall, chemoheterotrophy was the shared and most abundant functional category in the system, and aerobic chemoheterotrophy accounted for a significant proportion in some samples.
Figure 8. FAPROTAX functional profiles (only the top five most abundant functions per individual sample are considered). Clustering among functions is performed based on Euclidean distance, whereas clustering among samples is conducted using Bray–Curtis distance.
Regarding inter-regional differences, the relative abundances of sulfate_respiration and respiration_of_sulfur_compounds in SURF samples were significantly higher than those in DELOS and KCO, indicating an anaerobic, sulfate-rich habitat where more active sulfur cycling may have occurred. In oxygen-limited environments, many aerobic bacteria are capable of switching from aerobic to anaerobic respiration to generate energy. Nitrate respiration is a metabolic process in which microorganisms use nitrate as the terminal electron acceptor under anaerobic conditions [63]. The relatively high abundances of nitrogen fixation, nitrate reduction, nitrate respiration, and nitrogen respiration in the BoreholePST sample reflect the efficient adaptation of microorganisms to an environment characterized by limited oxygen availability and relatively scarce energy substrates. Meanwhile, BoreholeGC exhibited a relatively high abundance of iron_respiration, which is consistent with the elevated iron concentration reported in a previous study [26]. The TM901, TM2647, and TM4348 samples in DELOS showed relative advantages in dark_oxidation_of_sulfur_compounds and dark_sulfide_oxidation, suggesting a dark or low-light microhabitat with sufficient sulfide and available electron acceptors. In addition, fermentation was highly abundant only in KCO, indicating a hypoxic to severely hypoxic habitat with abundant fermentable organic matter and limited electron acceptors. Functional labels related to human or animal pathogenesis, specifically human_pathogens_all and animal_parasites_or_symbionts, showed relatively high abundances in some samples. This suggests that the associated taxa may be present not only in their hosts but in groundwater systems as well, necessitating further quantitative PCR analysis to evaluate their ecological risks.
Functional-level clustering revealed that most samples exhibited highly similar patterns between taxonomic abundance and functional abundance, indicating a generally positive correlation between microbial community composition and its potential ecological functions. However, a few samples (e.g., FractureC, TM4447, and TM1494) were placed on distant branches in the taxonomic dendrogram but clustered closely in the functional tree, suggesting possible niche convergence.
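As a rough illustration of the clustering scheme described for Figure 8, the following minimal sketch selects the top functions per sample, then clusters samples by Bray–Curtis distance and functions by Euclidean distance. The abundance table, the sample names, the use of the top two rather than the top five functions, and the average-linkage method are all assumptions made for this toy example; they are not taken from the study.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import pdist

# Hypothetical FAPROTAX-style table: rows = samples, columns = functions.
samples = ["S1", "S2", "S3", "S4"]
functions = ["chemoheterotrophy", "fermentation", "nitrate_respiration",
             "sulfate_respiration", "iron_respiration", "dark_sulfide_oxidation"]
abund = np.array([
    [0.50, 0.05, 0.20, 0.10, 0.10, 0.05],
    [0.45, 0.05, 0.25, 0.10, 0.10, 0.05],
    [0.10, 0.70, 0.05, 0.05, 0.05, 0.05],
    [0.20, 0.05, 0.05, 0.50, 0.05, 0.15],
])


def top_n_union(table, n):
    """Column indices ranking among the n most abundant functions in any sample."""
    top = np.argsort(-table, axis=1, kind="stable")[:, :n]
    return np.unique(top)


# Keep only functions that are in the top 2 of at least one sample
# (the figure uses the top 5; 2 keeps the toy example small).
keep = top_n_union(abund, 2)
sub = abund[:, keep]

# Samples: Bray-Curtis distance; functions: Euclidean distance.
# Average linkage is an assumption; the source does not state the method.
sample_tree = linkage(pdist(sub, metric="braycurtis"), method="average")
function_tree = linkage(pdist(sub.T, metric="euclidean"), method="average")

# Leaf orders that a heatmap with dendrograms would use.
sample_order = [samples[i] for i in leaves_list(sample_tree)]
function_order = [functions[keep[i]] for i in leaves_list(function_tree)]
print(sample_order)
print(function_order)
```

In this toy table, S1 and S2 have nearly identical functional profiles and are merged first, which is the kind of functional proximity the dendrogram comparison above relies on.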

4. Discussion

4.1. Environmental Filtration Exerts a Driving Influence on the Microbial Community Structure Within Deep Groundwater Environments

The KCO, SURF, and DELOS sites are distinguished by contrasting hydrochemical conditions. In the KCO area, the groundwater exhibits exceptionally high mineralization, which may be attributed to its prolonged residence time. During long-term water–rock interactions [29] and geochemical evolution, the groundwater gradually evolved into a Ca–Cl hydrochemical type. The SURF area was historically a gold mining district and retains abundant sulfide minerals (e.g., pyrrhotite and pyrite) [64]. These sulfur reservoirs have been progressively oxidized due to mining operations or dissolved during long-term groundwater circulation/mixing, thereby directly supplying sulfate (SO₄²⁻) to the system. In the DELOS area, deep fracture waters exhibit a highly dispersed hydrochemical character, which may primarily be the result of distinct water–rock interaction pathways controlled by different lithologies (e.g., gneiss-schist and granite), together with highly variable hydraulic connectivity and recharge sources. Distinct hydrochemical endmembers represent differentiated energy substrate pools, thereby selecting for and sustaining specific functional microbial assemblages.
The microbial community structures at all three locations show a pronounced environmental filtering effect [18,65,66] on dominant taxa, reflecting the close interaction between environmental conditions, functions, and community composition within deep fractured groundwater systems. At KCO, groundwater is characterized by extreme salinity (TDS = 197,274 mg·L⁻¹), long-term isolation from the surface (>100 million years), anoxia, and severe oligotrophy [29]. These harsh conditions (Table S1) impose strong environmental filtering that selects for halophilic, obligate anaerobes capable of tolerating high osmotic pressure, resulting in a highly simplified community dominated by a single taxon, Halanaerobiaeota (family Halobacteroidaceae), at a relative abundance exceeding 92%. This family can adapt to hypersaline conditions by synthesizing compatible solutes (e.g., betaine) or regulating cellular membrane osmotic pressure, and can survive under anoxic conditions through fermentation and anaerobic chemoheterotrophic metabolism [67]. Most microorganisms are unable to tolerate such extremely high salinity and other harsh environmental conditions and are therefore excluded from competition, thereby creating a specialized ecological niche for Halobacteroidaceae. This pattern exemplifies community simplification, where extreme environments reduce niche breadth and eliminate all but the most specialized taxa [68,69]. In contrast, SURF hosts moderately saline, sulfate-rich fracture fluids that span micro-oxic to anoxic conditions, favoring metabolically versatile Gammaproteobacteria, particularly the families Hydrogenophilaceae, Rhodocyclaceae, and Halothiobacillaceae, that are capable of sulfur oxidation, iron reduction, and facultative anaerobic respiration. Pseudomonadota, Nitrospirota, and Verrucomicrobiota co-dominate in DELOS, resulting in a multi-taxon co-dominant community structure.
Multiple overlapping environmental gradients create a mosaic of microhabitats that promote niche partitioning among taxa. In shallow environments that are relatively oxic and characterized by low electrical conductivity, members of the phylum Nitrospirota, dominated by the chemolithoautotrophic genus Leptospirillia, can obtain energy through the oxidation of ferrous iron. In environments with high sulfate concentrations, communities are instead dominated by the class Thermodesulfovibrionia within Nitrospirota, whose members efficiently utilize sulfate as an electron acceptor. Certain lineages within Verrucomicrobiota are frequently associated with deep, oligotrophic environments [70], whereas Pseudomonadota (Proteobacteria), owing to their high metabolic versatility, are widely distributed across different environmental gradients. This pattern underscores the role of spatial heterogeneity and niche differentiation [71] in regulating subsurface microbial biogeography under less extreme but more variable environmental conditions.

4.2. Significant Differences in the Microbial Community Composition Across the Three Aquifers Revealed by Bray–Curtis and Weighted UniFrac Distances

The hydrochemical characteristics of samples TM2848, TM4846, TM4752, and TM2794 (Figure 2) are consistent with their clustering patterns in the PCoA ordination (Figure 6a), further indicating that environmental filtering exerts a strong influence at these sampling sites. However, some samples (e.g., FractureA, FractureB, and FractureC) exhibit similar hydrochemical compositions but markedly different microbial community structures. This discrepancy may be attributed to the limited hydraulic communication among parallel fractures, whereby geographic isolation constrains microbial dispersal and results in community divergence, over time or across permeable fractures, through stochastic ecological processes [72,73]. The clustering of TM4348, TM2647, and TM901 from the DELOS region with the SURF sample in terms of weighted UniFrac distance reflects a high degree of evolutionary similarity in their microbial communities. At the phylum level, both groups share dominant microbial taxa such as Pseudomonadota and Planctomycetota, which results in their close proximity on the PCoA plot.
The differences between PCoA plots based on Bray–Curtis distance and weighted UniFrac distance stem from how the two metrics are calculated. The Bray–Curtis distance focuses solely on the relative abundance of ASVs [74,75]. In contrast, the weighted UniFrac distance integrates both species abundance and phylogenetic relatedness, emphasizing the evolutionary divergence of communities [76]. This dual-perspective analysis underscores that deep subsurface microbial community structure is shaped by a combination of environmental filtering (driving abundance changes) and phylogenetic niche conservatism (driving evolutionary differentiation). Interestingly, although sample KCO appeared dissimilar to all others, the weighted UniFrac distance between KCO and DELOS was smaller than that between the two PI samples. This indicates that within-group differences can exceed between-group differences. Such “counter-intuitiveness” arises because weighted UniFrac incorporates not only relative abundances but also the phylogenetic relatedness of taxa, and it gives greater weight to highly abundant lineages. Moreover, the metric is highly sensitive to phylogenetic divergence at low taxonomic levels, which can produce a pattern that appears inconsistent with relative-abundance data.
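The contrast between the two metrics can be made concrete with a small sketch. The three-tip tree, its branch lengths, and the three single-taxon communities below are hypothetical, and the normalized form of weighted UniFrac used here is one common formulation; the study's own calculation may differ in detail.

```python
# Toy rooted tree: each branch is (length, set of tips below it).
# Tips "a" and "b" are close relatives; "c" sits on a long, distant branch.
BRANCHES = [
    (0.10, {"a", "b"}),  # internal branch leading to sister taxa a and b
    (0.05, {"a"}),
    (0.05, {"b"}),
    (1.00, {"c"}),
]


def relative(counts):
    """Convert raw counts to relative abundances."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity on relative abundances (ignores the tree)."""
    px, py = relative(x), relative(y)
    shared = sum(min(px.get(t, 0.0), py.get(t, 0.0)) for t in set(px) | set(py))
    return 1.0 - shared  # each profile sums to 1, so BC = 1 - sum of minima


def weighted_unifrac(x, y, branches=BRANCHES):
    """Normalized weighted UniFrac: branch lengths weighted by abundance differences."""
    px, py = relative(x), relative(y)

    def below(p, tips):
        return sum(p.get(t, 0.0) for t in tips)

    raw = sum(length * abs(below(px, tips) - below(py, tips))
              for length, tips in branches)
    all_tips = set().union(*(tips for _, tips in branches))
    # Root-to-tip distance of each tip, used for normalization.
    depth = {t: sum(length for length, tips in branches if t in tips)
             for t in all_tips}
    norm = sum(depth[t] * (px.get(t, 0.0) + py.get(t, 0.0)) for t in all_tips)
    return raw / norm


P, Q, R = {"a": 10}, {"b": 10}, {"c": 10}
print(bray_curtis(P, Q), bray_curtis(P, R))
print(weighted_unifrac(P, Q), weighted_unifrac(P, R))
```

Bray–Curtis rates both pairs as maximally dissimilar because no taxon is shared, whereas weighted UniFrac places P much closer to Q than to R because "a" and "b" are separated by only short branches.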

4.3. Core Microbiome Is Present in Deep Groundwater Environments

The core microbiome is now widely explored in microbial ecology. However, discrepancies currently exist in how the core microbiome is defined and quantified [7]. Owing to methodological differences in data processing, cross-study comparisons often yield OTU-level Venn diagrams devoid of any overlap. Moreover, taxonomic assignments produced by one primer set may not map directly onto those generated by another, thereby imposing considerable challenges on the identification of a core microbiome across disparate datasets [77]. Therefore, finer classification levels are not necessarily preferable; a suitable classification level should be selected for analysis based on the actual circumstances of the samples. To study whether a core microbiome exists across the three selected aquifers, the defining criterion chosen was the common taxa across all sites [17,78,79].
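The effect of taxonomic level on cross-study overlap can be sketched as follows. The ASV identifiers and their genus assignments are hypothetical and serve only to show why sequence-level features from different primer sets rarely intersect while coarser taxonomic labels can.

```python
# Hypothetical ASV-to-genus assignments from two studies that used
# different primer sets, so their ASV identifiers cannot coincide.
study_a = {"ASV_a1": "Pseudomonas", "ASV_a2": "Legionella", "ASV_a3": "Devosia"}
study_b = {"ASV_b1": "Pseudomonas", "ASV_b2": "Legionella", "ASV_b3": "Shinella"}

# Overlap at the ASV level (feature identifiers).
shared_asvs = set(study_a) & set(study_b)

# Overlap after agglomerating to the genus level (taxonomic labels).
shared_genera = set(study_a.values()) & set(study_b.values())

print(len(shared_asvs), sorted(shared_genera))
```

Agglomerating to genus trades resolution for comparability, which is the compromise the paragraph above argues for.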
According to the established defining criteria for the core microbiome, the core taxa of the three aquifers at different taxonomic levels are identified as follows: At the phylum level, six core phyla shared across all regions were identified, namely Bacillota (Firmicutes), Actinomycetota (Actinobacteria), Pseudomonadota (Proteobacteria), Planctomycetota (Planctomycetes), Chloroflexota (Chloroflexi), and Deinococcota (Deinococcus-Thermus). Among these, Pseudomonadota maintained the highest relative abundance in the microbial communities of all regions, exhibiting a significantly dominant position. At the genus level, seven core microbial genera were further screened out, including Legionella, Lysobacter, Devosia, Shinella, Hyphomicrobium, Phreatobacter, and Pseudomonas. As the dominant core phylum, Pseudomonadota and its subordinate core genera may play a key synergistic role in regulating the functions of the deep fracture water ecosystem (e.g., material cycling, environmental adaptability).
In some studies, the core microbiome is defined as the taxa shared across all samples (not all sites), which is a more stringent criterion because of within-site heterogeneity. As shown in Table S4, only one family, Comamonadaceae (Pseudomonadota), was present in all 30 samples across the three regions. A few other families were present in >80% of the samples (i.e., high prevalence) and were among the top-10 families with the highest average relative abundance, including Hydrogenophilaceae (Pseudomonadota), Omnitrophaceae (Verrucomicrobiota), BSV26 (Candidatus Kryptonia), and an unclassified Thermodesulfovibrionia (Nitrospirota).
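The three criteria discussed above (shared across all sites, shared across all samples, and prevalence above a threshold) can be compared in a short sketch. The presence/absence data and family names are hypothetical, and treating a taxon as present at a site when it occurs in at least one of that site's samples is an assumption about how the site-level criterion is applied.

```python
from collections import Counter

# Hypothetical presence data: site -> sample -> set of detected families.
presence = {
    "site1": {"s1": {"FamA", "FamB", "FamC"}, "s2": {"FamA", "FamC"}},
    "site2": {"s3": {"FamA", "FamB"}, "s4": {"FamA", "FamB", "FamD"},
              "s5": {"FamA", "FamB"}},
    "site3": {"s6": {"FamA", "FamB", "FamC"}},
}


def all_samples(data):
    """Flatten the nested mapping into a list of per-sample taxon sets."""
    return [taxa for samples in data.values() for taxa in samples.values()]


def core_by_site(data):
    """Taxa detected in at least one sample of every site."""
    per_site = [set().union(*samples.values()) for samples in data.values()]
    return set.intersection(*per_site)


def core_by_sample(data):
    """Taxa detected in every single sample (the stricter criterion)."""
    return set.intersection(*all_samples(data))


def prevalent(data, threshold):
    """Taxa whose fraction of positive samples exceeds the threshold."""
    samples = all_samples(data)
    counts = Counter(taxon for taxa in samples for taxon in taxa)
    return {taxon for taxon, n in counts.items() if n / len(samples) > threshold}


print(sorted(core_by_site(presence)))
print(sorted(core_by_sample(presence)))
print(sorted(prevalent(presence, 0.8)))
```

In this toy data set, FamB is absent from one sample, so it drops out of the sample-level core but still qualifies under the site-level and >80% prevalence criteria.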

4.4. The Functional Characteristics of Deep Fractures Are Collectively Shaped by Both Core Microbial Genera and Dominant Genera

Through the investigation of core microbial genera and their metabolic functions, we found that all seven core microbial genera are chemoheterotrophic bacteria, which is consistent with the dominance of chemoheterotrophy according to FAPROTAX analysis. Meanwhile, these genera primarily rely on organic carbon oxidation as their core metabolic process, with electron acceptors including O₂ and NO₃⁻. This corresponds to their occupation of distinct ecological niches: microaerophilic oligotrophy (Phreatobacter, Legionella), potential denitrification capacity (Shinella, Hyphomicrobium, Devosia), and metabolic versatility (Pseudomonas, Lysobacter). Due to the scarcity of KCO samples, their contribution to core microbial genera is fairly limited, and fermentation is presumably driven by the dominant genus Fuchsiella. This fermentative metabolism is consistent with the hypoxic and hypersaline hydrochemical characteristics of the deep aquifers in KCO. Iron respiration detected in Borehole GC is presumably derived from Magnetospirillum, a genus with relatively high abundance that is characterized by its capacity to sequester iron and produce the mineral magnetite (Fe₃O₄) under microaerobic conditions [80]. This functional trait aligns with the detectable dissolved iron in Borehole GC groundwater [26]. The high abundance of human-pathogenic functions is likely associated with the pronounced presence of Hydrogenophilaceae, Nitrosomonadaceae, and Rhodocyclaceae. It is noteworthy that functional predictions at the family or genus level are insufficient to indicate actual pathogenic potential. Instead, metagenomic sequencing is required to verify the presence of genuine virulence genes. Meanwhile, it is necessary to consider environmental characteristics and the existence of potential human exposure pathways.
The high proportion of aerobic chemoheterotrophs in certain areas reflects the microbial community’s adaptation to heterogeneous oxygen availability: the dominant microorganisms are predominantly tolerant of low oxygen, enabling survival in the subsurface transition zone between aerobic and microaerophilic conditions. The relatively high abundance of this function may also indicate material exchange or disturbance within the subsurface environment of the study area. Due to influences such as injection, production, or recharge, the deep subsurface system is not an isolated, closed system. On the other hand, nitrate respiration represents a prototypical anaerobic alternative respiration pathway. Its high abundance, together with other nitrogen metabolism functions, indicates anaerobic conditions in the subsurface environment, further highlighting the spatial heterogeneity of subterranean ecosystems.

5. Conclusions

Through the meta-analysis of groundwater 16S rRNA gene amplicon sequencing data from DELOS, KCO, and SURF, we obtained an overall understanding of the deep-subsurface microbial diversity and functions across different countries/continents. Specifically, we found that the microbial community composition in deep, fractured aquifers is spatially heterogeneous at both the local (within-site) and the global (between-site) scales, even at the phylum level. A combined analysis of microbial community data and hydrochemical data suggests that environmental filtering is a key driver shaping the microbial community structure in deep groundwater environments. On the other hand, despite the spatial heterogeneity, a core microbiome exists among the three aquifers analyzed in this study, encompassing seven core microbial genera, namely Legionella, Lysobacter, Devosia, Shinella, Hyphomicrobium, Phreatobacter, and Pseudomonas. Only one family, Comamonadaceae, was present in all 30 samples analyzed, whereas a few other prevalent families were found in >80% of the samples and had high average relative abundance, including Hydrogenophilaceae (Pseudomonadota), Omnitrophaceae (Verrucomicrobiota), BSV26 (Candidatus Kryptonia), and an unclassified Thermodesulfovibrionia (Nitrospirota). Functional prediction using FAPROTAX showed that chemoheterotrophic functions predominated. The coexistence of aerobic chemoheterotrophy and nitrate respiration demonstrates the spatial heterogeneity of oxygen content in deep subsurface environments, where the core microbial genera and dominant genera collectively govern the functional characteristics.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/microorganisms14010045/s1, Figure S1: Full legend for the relative abundance plots in Figure 3 of the main manuscript. Figure S2: The microbial community composition of 30 samples at the phylum level. Table S1: Regional hydrochemical data for SURF, DELOS and KCO. Table S2: The number of raw reads, the library size after quality filtering, and the number of ASVs. Table S3: Names of the phyla in each compartment of the phylum-level Venn diagram (Figure 7). Table S4: ASV table agglomerated to the Family level for the 30 samples analyzed.

Author Contributions

X.L.: formal analysis, investigation, visualization, writing – original draft preparation; T.H.: resources, writing – review and editing; Y.L.: resources, writing – review and editing; Z.P.: resources, writing – review and editing; Y.Z.: conceptualization, methodology, validation, writing – review and editing, project administration, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the State Key Laboratory of Lithospheric and Environmental Coevolution (Grant SKL-K202303). This is a contribution to the project of Theory of Hydrocarbon Enrichment under Multi-Spheric Interactions of the Earth (THEMSIE).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the European Nucleotide Archive at the European Bioinformatics Institute (https://www.ebi.ac.uk/ena/browser/home, accessed 15 September 2025) under accession numbers PRJEB35125, PRJEB44691, and PRJEB49237, and at the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov/, accessed 15 September 2025) under accession number PRJNA1181539. These data were derived from the following resources available in the public domain: the European Nucleotide Archive at the European Bioinformatics Institute (https://www.ebi.ac.uk/ena/browser/home) and the NCBI (https://www.ncbi.nlm.nih.gov/).

Acknowledgments

The authors wish to thank the Majorbio Cloud Platform for support during the data analysis process, and Yulong Liu for assistance in the data analysis.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Magnabosco, C.; Lin, L.H.; Dong, H.; Bomberg, M.; Ghiorse, W.; Stan-Lotter, H.; Pedersen, K.; Kieft, T.L.; van Heerden, E.; Onstott, T.C. The biomass and biodiversity of the continental subsurface. Nat. Geosci. 2018, 11, 707–717. [Google Scholar] [CrossRef]
  2. Bar-On, Y.M.; Phillips, R.; Milo, R. The biomass distribution on Earth. Proc. Natl. Acad. Sci. USA 2018, 115, 6506–6511. [Google Scholar] [CrossRef] [PubMed]
  3. Schreiber, M.E.; Simo, J.A.; Freiberg, P.G. Stratigraphic and geochemical controls on naturally occurring arsenic in groundwater, eastern Wisconsin, USA. Hydrogeol. J. 2000, 8, 161–176. [Google Scholar] [CrossRef]
  4. Phelps, T.J.; Raione, E.G.; White, D.C.; Fliermans, C.B. Microbial activities in deep subsurface environments. Geomicrobiol. J. 1989, 7, 79–91. [Google Scholar] [CrossRef]
  5. Reith, F. Life in the deep subsurface. Geology 2011, 39, 287–288. [Google Scholar] [CrossRef]
  6. Morais, S.; Vidal, E.; Cario, A.; Marre, S.; Ranchou-Peyruse, A. Microfluidics for studying the deep underground biosphere: From applications to fundamentals. FEMS Microbiol. Ecol. 2024, 100, fiae151. [Google Scholar] [CrossRef]
  7. Neu, A.T.; Allen, E.E.; Roy, K. Defining and quantifying the core microbiome: Challenges and prospects. Proc. Natl. Acad. Sci. USA 2021, 118, e2104429118. [Google Scholar] [CrossRef]
  8. Lowe, B.A.; Marsh, T.L.; Isaacs-Cosgrove, N.; Kirkwood, R.N.; Kiupel, M.; Mulks, M.H. Defining the “core microbiome” of the microbial communities in the tonsils of healthy pigs. BMC Microbiol. 2012, 12, 20. [Google Scholar] [CrossRef]
  9. Hernandez-Agreda, A.; Gates, R.D.; Ainsworth, T.D. Defining the Core Microbiome in Corals’ Microbial Soup. Trends Microbiol. 2017, 25, 125–140. [Google Scholar] [CrossRef]
  10. Forster, R.J.; Mohan, A.M.; Bibby, K.J.; Lipus, D.; Hammack, R.W.; Gregory, K.B. The Functional Potential of Microbial Communities in Hydraulic Fracturing Source Water and Produced Water from Natural Gas Extraction Characterized by Metagenomic Sequencing. PLoS ONE 2014, 9, e107682. [Google Scholar] [CrossRef]
  11. Konno, U.; Kouduka, M.; Komatsu, D.D.; Ishii, K.; Fukuda, A.; Tsunogai, U.; Ito, K.; Suzuki, Y. Novel Microbial Populations in Deep Granitic Groundwater from Grimsel Test Site, Switzerland. Microb. Ecol. 2013, 65, 626–637. [Google Scholar] [CrossRef] [PubMed]
  12. Hallbeck, L.; Pedersen, K. Characterization of microbial processes in deep aquifers of the Fennoscandian Shield. Appl. Geochem. 2008, 23, 1796–1819. [Google Scholar] [CrossRef]
  13. Ino, K.; Konno, U.; Kouduka, M.; Hirota, A.; Togo, Y.S.; Fukuda, A.; Komatsu, D.; Tsunogai, U.; Tanabe, A.S.; Yamamoto, S.; et al. Deep microbial life in high-quality granitic groundwater from geochemically and geographically distinct underground boreholes. Environ. Microbiol. Rep. 2016, 8, 285–294. [Google Scholar] [CrossRef] [PubMed]
  14. Roh, Y.; Liu, S.V.; Li, G.; Huang, H.; Phelps, T.J.; Zhou, J. Isolation and Characterization of Metal-Reducing Thermoanaerobacter Strains from Deep Subsurface Environments of the Piceance Basin, Colorado. Appl. Environ. Microbiol. 2002, 68, 6013–6020. [Google Scholar] [CrossRef]
  15. Payler, S.J.; Biddle, J.F.; Sherwood Lollar, B.; Fox-Powell, M.G.; Edwards, T.; Ngwenya, B.T.; Paling, S.M.; Cockell, C.S. An Ionic Limit to Life in the Deep Subsurface. Front. Microbiol. 2019, 10, 426. [Google Scholar] [CrossRef]
  16. Shu, W.-S.; Huang, L.-N. Microbial diversity in extreme environments. Nat. Rev. Microbiol. 2021, 20, 219–235. [Google Scholar] [CrossRef]
  17. Gittins, D.A.; Bhatnagar, S.; Hubert, C.R.J.; Goordial, J. Environmental Selection and Biogeography Shape the Microbiome of Subsurface Petroleum Reservoirs. mSystems 2023, 8, e0088422. [Google Scholar] [CrossRef]
  18. Liu, W.; Liu, L.; Yang, X.; Deng, M.; Wang, Z.; Wang, P.; Yang, S.; Li, P.; Peng, Z.; Yang, L.; et al. Long-term nitrogen input alters plant and soil bacterial, but not fungal beta diversity in a semiarid grassland. Glob. Change Biol. 2021, 27, 3939–3950. [Google Scholar] [CrossRef]
  19. Acciardo, A.S.; Arnet, M.; Gholizadeh Doonechaly, N.; Ceccato, A.; Rodriguez, P.; Tran, H.N.H.; Wenning, Q.; Zimmerman, E.; Hertrich, M.; Brixel, B.; et al. Spatial and temporal groundwater biogeochemical variability help inform subsurface connectivity within a high-altitude Alpine catchment (Riale di Ronco, Switzerland). Front. Microbiol. 2025, 16, 1522714. [Google Scholar] [CrossRef]
  20. Rast, M.; Galli, A.; Ruh, J.B.; Guillong, M.; Madonna, C. Geology along the Bedretto tunnel: Kinematic and geochronological constraints on the evolution of the Gotthard Massif (Central Alps). Swiss J. Geosci. 2022, 115, 8. [Google Scholar] [CrossRef]
  21. Ma, X.; Hertrich, M.; Amann, F.; Bröker, K.; Gholizadeh Doonechaly, N.; Gischig, V.; Hochreutener, R.; Kästli, P.; Krietsch, H.; Marti, M.; et al. Multi-disciplinary characterizations of the BedrettoLab—A new underground geoscience research facility. Solid Earth 2022, 13, 301–322. [Google Scholar] [CrossRef]
  22. Hafner, S. Petrographie des Südwestlichen Gotthardmassivs Zwischen St. Gotthardpass und Nufenenpass. Ph.D. Thesis, ETH Zurich, Zurich, Switzerland, 1958. [Google Scholar]
  23. Jamieson, J.W.; Wing, B.A.; Farquhar, J.; Hannington, M.D. Neoarchaean seawater sulphate concentrations from sulphur isotopes in massive sulphide ore. Nat. Geosci. 2012, 6, 61–64. [Google Scholar] [CrossRef]
  24. Kneafsey, T.J.; Blankenship, D.A.; Dobson, P.F.; Morris, J.; White, M.D.; Fu, P.; Schwering, P.C. The EGS Collab Project: Learnings from Experiment 1; Stanford University: Stanford, CA, USA, 2020. [Google Scholar]
  25. Parada, A.E.; Needham, D.M.; Fuhrman, J.A. Every base matters: Assessing small subunit rRNA primers for marine microbiomes with mock communities, time series and global field samples. Environ. Microbiol. 2015, 18, 1403–1414. [Google Scholar] [CrossRef] [PubMed]
  26. Zhang, Y.; Dekas, A.E.; Hawkins, A.J.; Parada, A.E.; Gorbatenko, O.; Li, K.; Horne, R.N. Microbial Community Composition in Deep-Subsurface Reservoir Fluids Reveals Natural Interwell Connectivity. Water Resour. Res. 2020, 56, e2019WR025916. [Google Scholar] [CrossRef]
  27. Zhang, Y.; Horne, R.N.; Hawkins, A.J.; Primo, J.C.; Gorbatenko, O.; Dekas, A.E. Geological activity shapes the microbiome in deep-subsurface aquifers by advection. Proc. Natl. Acad. Sci. USA 2022, 119, e2113985119. [Google Scholar] [CrossRef]
  28. Zhang, Y.; Dekas, A.E.; Hawkins, A.J.; Primo, J.C.; Gorbatenko, O.; Huang, T.; Pang, Z.; Horne, R.N. Transportability of exogenous microbial community correlates with interwell connectivity in deep aquifers. Water Res. 2025, 285, 124008. [Google Scholar] [CrossRef]
  29. Ford, S.E.; Slater, G.F.; Engel, K.; Warr, O.; Lollar, G.S.; Brady, A.; Neufeld, J.D.; Lollar, B.S. Deep terrestrial indigenous microbial community dominated by Candidatus Frackibacter. Commun. Earth Environ. 2024, 5, 795. [Google Scholar] [CrossRef]
  30. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 2011, 17, 10–12. [Google Scholar] [CrossRef]
  31. Callahan, B.J.; McMurdie, P.J.; Holmes, S.P. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J. 2017, 11, 2639–2643. [Google Scholar] [CrossRef]
  32. Watson, M.; McMurdie, P.J.; Holmes, S. phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data. PLoS ONE 2013, 8, e61217. [Google Scholar] [CrossRef]
  33. Callahan, B.J.; Sankaran, K.; Fukuyama, J.A.; McMurdie, P.J.; Holmes, S.P. Bioconductor Workflow for Microbiome Data Analysis: From raw reads to community analyses. F1000Research 2016, 5, 1492. [Google Scholar] [CrossRef]
  34. Blankenberg, D.; Gordon, A.; Von Kuster, G.; Coraor, N.; Taylor, J.; Nekrutenko, A. Manipulation of FASTQ data with Galaxy. Bioinformatics 2010, 26, 1783–1785. [Google Scholar] [CrossRef] [PubMed]
  35. Bolger, A.M.; Lohse, M.; Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 2014, 30, 2114–2120. [Google Scholar] [CrossRef] [PubMed]
  36. Aronesty, E. Comparison of Sequencing Utility Programs. Open Bioinform. J. 2013, 7, 1–8. [Google Scholar] [CrossRef]
  37. Quast, C.; Pruesse, E.; Yilmaz, P.; Gerken, J.; Schweer, T.; Yarza, P.; Peplies, J.; Glöckner, F.O. The SILVA ribosomal RNA gene database project: Improved data processing and web-based tools. Nucleic Acids Res. 2012, 41, D590–D596. [Google Scholar] [CrossRef]
  38. Yilmaz, P.; Parfrey, L.W.; Yarza, P.; Gerken, J.; Pruesse, E.; Quast, C.; Schweer, T.; Peplies, J.; Ludwig, W.; Glöckner, F.O. The SILVA and “All-species Living Tree Project (LTP)” taxonomic frameworks. Nucleic Acids Res. 2014, 42, D643–D648. [Google Scholar] [CrossRef]
  39. Pruesse, E.; Peplies, J.; Glöckner, F.O. SINA: Accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics 2012, 28, 1823–1829. [Google Scholar] [CrossRef]
  40. Dobson, P.F.; Kneafsey, T.J.; Nakagawa, S.; Sonnenthal, E.L.; Voltolini, M.; Smith, J.T.; Borglin, S.E. Fracture Sustainability in Enhanced Geothermal Systems: Experimental and Modeling Constraints. J. Energy Resour. Technol. 2021, 143, 100901. [Google Scholar] [CrossRef]
  41. Dixon, P. VEGAN, a package of R functions for community ecology. J. Veg. Sci. 2003, 14, 927–930. [Google Scholar] [CrossRef]
  42. Charrad, M.; Ghazzali, N.; Boiteau, V.; Niknafs, A. NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set. J. Stat. Softw. 2014, 61, 1–36. [Google Scholar] [CrossRef]
  43. Gentle, J.E.; Kaufman, L.; Rousseuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis. Biometrics 1991, 47, 788. [Google Scholar] [CrossRef]
  44. Bray, J.R.; Curtis, J.T. An Ordination of the Upland Forest Communities of Southern Wisconsin. Ecol. Monogr. 1957, 27, 325–349. [Google Scholar] [CrossRef]
  45. Chen, H.; Boutros, P.C. VennDiagram: A package for the generation of highly-customizable Venn and Euler diagrams in R. BMC Bioinform. 2011, 12, 35. [Google Scholar] [CrossRef] [PubMed]
  46. Han, C.; Shi, C.; Liu, L.; Han, J.; Yang, Q.; Wang, Y.; Li, X.; Fu, W.; Gao, H.; Huang, H.; et al. Majorbio Cloud 2024: Update single-cell and multiomics workflows. iMeta 2024, 3, e217. [Google Scholar] [CrossRef]
  47. Ren, Y.; Yu, G.; Shi, C.; Liu, L.; Guo, Q.; Han, C.; Zhang, D.; Zhang, L.; Liu, B.; Gao, H.; et al. Majorbio Cloud: A one-stop, comprehensive bioinformatic platform for multiomics analyses. iMeta 2022, 1, e12. [Google Scholar] [CrossRef]
  48. Louca, S.; Parfrey, L.W.; Doebeli, M. Decoupling function and taxonomy in the global ocean microbiome. Science 2016, 353, 1272–1277. [Google Scholar] [CrossRef]
  49. Peng, C.; Liu, Y.; Qin, Y.; Sun, D.; Jia, J.; Xie, Z.; Gong, B. Dissolved Oxygen Decline in Northern Beibu Gulf Summer Bottom Waters: Reserve Management Insights from Microbiome Analysis. Microorganisms 2025, 13, 1945. [Google Scholar] [CrossRef]
  50. Abdi, H. Z-scores. Encycl. Meas. Stat. 2007, 3, 1055–1058. [Google Scholar]
  51. Li, L.; Wing, B.A.; Bui, T.H.; McDermott, J.M.; Slater, G.F.; Wei, S.; Lacrampe-Couloume, G.; Lollar, B.S. Sulfur mass-independent fractionation in subsurface fracture waters indicates a long-standing sulfur cycle in Precambrian rocks. Nat. Commun. 2016, 7, 13252. [Google Scholar] [CrossRef]
  52. Warr, O.; Young, E.D.; Giunta, T.; Kohl, I.E.; Ash, J.L.; Sherwood Lollar, B. High-resolution, long-term isotopic and isotopologue variation identifies the sources and sinks of methane in a deep subsurface carbon cycle. Geochim. Cosmochim. Acta 2021, 294, 315–334. [Google Scholar] [CrossRef]
  53. Lollar, B.S.; Lacrampe-Couloume, G.; Voglesonger, K.; Onstott, T.C.; Pratt, L.M.; Slater, G.F. Isotopic signatures of CH4 and higher hydrocarbon gases from Precambrian Shield sites: A model for abiogenic polymerization of hydrocarbons. Geochim. Cosmochim. Acta 2008, 72, 4778–4795. [Google Scholar] [CrossRef]
  54. Chen, P.; Mei, T.; He, X.; Lin, Y.; He, Z.; Kong, X. Impacts of Lead and Nanoplastic Co-Exposure on Decomposition, Microbial Diversity, and Community Assembly Mechanisms in Karst Riverine Miscanthus Litter. Microorganisms 2025, 13, 2172. [Google Scholar] [CrossRef]
  55. Uhl, B.; Schall, P.; Bässler, C. Achieving structural heterogeneity and high multi-taxon biodiversity in managed forest ecosystems: A European review. Biodivers. Conserv. 2024, 34, 3327–3358. [Google Scholar] [CrossRef]
  56. Cassol, I.; Ibañez, M.; Bustamante, J.P. Key features and guidelines for the application of microbial alpha diversity metrics. Sci. Rep. 2025, 15, 622. [Google Scholar] [CrossRef] [PubMed]
  57. Yang, Z.; Peng, C.; Cao, H.; Song, J.; Gong, B.; Li, L.; Wang, L.; He, Y.; Liang, M.; Lin, J.; et al. Microbial functional assemblages predicted by the FAPROTAX analysis are impacted by physicochemical properties, but C, N and S cycling genes are not in mangrove soil in the Beibu Gulf, China. Ecol. Indic. 2022, 139, 108887. [Google Scholar] [CrossRef]
  58. Sansupa, C.; Wahdan, S.F.M.; Hossen, S.; Disayathanoowat, T.; Wubet, T.; Purahong, W. Can We Use Functional Annotation of Prokaryotic Taxa (FAPROTAX) to Assign the Ecological Functions of Soil Bacteria? Appl. Sci. 2021, 11, 688.
  59. Chandler, L.; Harford, A.J.; Hose, G.C.; Humphrey, C.L.; Chariton, A.; Greenfield, P.; Davis, J. Saline mine-water alters the structure and function of prokaryote communities in shallow groundwater below a tropical stream. Environ. Pollut. 2021, 284, 117318.
  60. Ma, J.; Liu, H.; Chen, H.; Xiong, H.; Tong, L.; Guo, G. Is redox zonation an appropriate method for determining the stage of natural remediation in deep contaminated groundwater? Sci. Total Environ. 2024, 928, 172224.
  61. Xu, F.; Li, P. Biogeochemical mechanisms of iron (Fe) and manganese (Mn) in groundwater and soil profiles in the Zhongning section of the Weining Plain (northwest China). Sci. Total Environ. 2024, 939, 173506.
  62. Ma, J.; Liu, H.; Zhang, C.; Ding, K.; Chen, R.; Liu, S. Joint response of chemistry and functional microbial community to oxygenation of the reductive confined aquifer. Sci. Total Environ. 2020, 720, 137587.
  63. Durand, S.; Guillier, M. Transcriptional and Post-transcriptional Control of the Nitrate Respiration in Bacteria. Front. Mol. Biosci. 2021, 8, 667758.
  64. Caddey, S.W.; Bachman, R.L.; Campbell, T.J.; Reid, R.R.; Otto, R.P. The Homestake Gold Mine, an Early Proterozoic Iron-Formation-Hosted Gold Deposit, Lawrence County, South Dakota; US Government Printing Office: Reston, VA, USA, 1991.
  65. Brisson, V.; Schmidt, J.; Northen, T.R.; Vogel, J.P.; Gaudin, A. A New Method to Correct for Habitat Filtering in Microbial Correlation Networks. Front. Microbiol. 2019, 10, 585.
  66. Horner-Devine, M.C.; Carney, K.M.; Bohannan, B.J.M. An ecological perspective on bacterial biodiversity. Proc. R. Soc. London. Ser. B Biol. Sci. 2004, 271, 113–122.
  67. Ali, M.; Fujii, M.; Ibrahim, M.G.; Elreedy, A. On the potential of halophiles enriched from hypersaline sediments for biohydrogen production from saline wastewater. J. Clean. Prod. 2022, 341, 130901.
  68. Wang, S.; Hu, Y.; Fan, T.; Fang, W.; Liu, X.; Xu, L.; Li, B.; Wei, X. Microbial Community Structure and Co-Occurrence Patterns in Closed and Open Subsidence Lake Ecosystems. Water 2023, 15, 1829.
  69. Yang, J.; Li, W.; Teng, D.; Yang, X.; Zhang, Y.; Li, Y. Metagenomic Insights into Microbial Community Structure, Function, and Salt Adaptation in Saline Soils of Arid Land, China. Microorganisms 2022, 10, 2183.
  70. Nixon, S.L.; Daly, R.A.; Borton, M.A.; Solden, L.M.; Welch, S.A.; Cole, D.R.; Mouser, P.J.; Wilkins, M.J.; Wrighton, K.C.; Suen, G. Genome-Resolved Metagenomics Extends the Environmental Distribution of the Verrucomicrobia Phylum to the Deep Terrestrial Subsurface. mSphere 2019, 4, e00613-19.
  71. Kothari, A.; Roux, S.; Zhang, H.; Prieto, A.; Soneja, D.; Chandonia, J.-M.; Spencer, S.; Wu, X.; Altenburg, S.; Fields, M.W.; et al. Ecogenomics of Groundwater Phages Suggests Niche Differentiation Linked to Specific Environmental Tolerance. mSystems 2021, 6, e0053721.
  72. Stegen, J.C.; Lin, X.; Konopka, A.E.; Fredrickson, J.K. Stochastic and deterministic assembly processes in subsurface microbial communities. ISME J. 2012, 6, 1653–1664.
  73. Zhou, J.; Ning, D. Stochastic Community Assembly: Does It Matter in Microbial Ecology? Microbiol. Mol. Biol. Rev. 2017, 81, e00002-17.
  74. Ricotta, C.; Podani, J. On some properties of the Bray-Curtis dissimilarity and their ecological meaning. Ecol. Complex. 2017, 31, 201–205.
  75. Boeraş, I.; Burcea, A.; Coman, C.; Bănăduc, D.; Curtean-Bănăduc, A. Bacterial Microbiomes in the Sediments of Lotic Systems Ecologic Drivers and Role: A Case Study from the Mureş River, Transylvania, Romania. Water 2021, 13, 3518.
  76. Shah, T.; Liu, Q.; Yin, G.; Shah, Z.; Li, H.; Wang, J.; Wang, B.; Xia, X. Structural and Functional Differences in the Gut and Lung Microbiota of Pregnant Pomona Leaf-Nosed Bats. Microorganisms 2025, 13, 1887.
  77. Abellan-Schneyder, I.; Matchado, M.S.; Reitmeier, S.; Sommer, A.; Sewald, Z.; Baumbach, J.; List, M.; Neuhaus, K.; Tringe, S.G. Primer, Pipelines, Parameters: Issues in 16S rRNA Gene Sequencing. mSphere 2021, 6, e01202-20.
  78. Ainsworth, T.D.; Krause, L.; Bridge, T.; Torda, G.; Raina, J.-B.; Zakrzewski, M.; Gates, R.D.; Padilla-Gamiño, J.L.; Spalding, H.L.; Smith, C.; et al. The coral core microbiome identifies rare bacterial taxa as ubiquitous endosymbionts. ISME J. 2015, 9, 2261–2274.
  79. Pinar-Méndez, A.; Wangensteen, O.S.; Præbel, K.; Galofré, B.; Méndez, J.; Blanch, A.R.; García-Aljaro, C. Monitoring Bacterial Community Dynamics in a Drinking Water Treatment Plant: An Integrative Approach Using Metabarcoding and Microbial Indicators in Large Water Volumes. Water 2022, 14, 1435.
  80. Suzuki, T.; Okamura, Y.; Calugay, R.J.; Takeyama, H.; Matsunaga, T. Global Gene Expression Analysis of Iron-Inducible Genes in Magnetospirillum magneticum AMB-1. J. Bacteriol. 2006, 188, 2275–2279.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.