Ultradeep Microbial Communities at 4.4 km within Crystalline Bedrock: Implications for Habitability in a Planetary Context.

The deep bedrock surroundings are an analog for extraterrestrial habitats for life. In this study, we investigated microbial life within anoxic ultradeep boreholes in Precambrian bedrock, including the adaptation to environmental conditions and lifestyle of these organisms. Samples were collected from the Pyhäsalmi mine environment in central Finland and from geothermal drilling wells in Otaniemi, Espoo, in southern Finland. Microbial communities inhabiting the up to 4.4 km deep bedrock were characterized with phylogenetic marker gene (16S rRNA genes and fungal ITS region) amplicon sequencing and with DNA and cDNA metagenomic sequencing. Functional marker genes (dsrB, mcrA, narG) were quantified with qPCR. Results showed that although crystalline bedrock provides very limited substrates for life, the microbial communities are diverse. Gammaproteobacterial phylotypes were the most dominant at both study sites. An Alkanindiges-affiliating OTU dominated the Pyhäsalmi fluids, while the different depths of the Otaniemi samples were dominated by Pseudomonas. One of the most common OTUs detected from Otaniemi could only be classified to phylum level, highlighting the uncharacterized nature of the deep biosphere in bedrock. Chemoheterotrophy, fermentation and nitrogen cycling are potentially significant metabolisms in these ultradeep environments. To conclude, this study provides information on the microbial ecology of a low-biomass, carbon-depleted and energy-deprived deep subsurface environment. This information is useful for the prospect of finding life on other planetary bodies.


Introduction
Currently, Earth is the only observed inhabited planetary object in the Universe. In order to identify feasible extraterrestrial locations for habitability and potentially life, different approaches have been employed: calculating habitable zones in solar systems or in the Universe [1], defining what is meant by life and habitability [2], and quantifying the probability of the origin of life [3]. Although we have been unable to retrieve tangible samples or incontrovertible evidence of possible extraterrestrial life, the search is ongoing. Using analog environments on Earth, we can likewise explore the great capacity of life to proliferate in multiple extremes [4]. Among these analogs, the deep continental subsurface on Earth provides an example for any subsurface crustal environment in a rocky
Life 2020, 10, 2 3 of 22
be probing for similar kinds of environments with an extremely low cell content, where forward contamination is a major issue to tackle.

Site Descriptions
Samples were retrieved from two study sites: (1) experimental drill hole number R-2247 at the Pyhäsalmi Cu-Zn mine (First Quantum Minerals, Ltd., Vancouver, BC, Canada), Finland, a drill hole reaching a depth of 2.4 km, and (2) the OTN2 and OTN3 deep drill holes, from a depth range of 2.6-4.4 km, in the municipal area of Espoo, Finland. A site description for the drill hole in the Pyhäsalmi mine has been published previously [37]. Briefly, the Pyhäsalmi mine is located near the town of Pyhäjärvi, central Finland (26.042° E, 63.659° N). The exploratory drill hole was drilled in 2012 into tonalitic and metavolcanic rocks; it overflows naturally but has been plugged since the drilling. The drill hole R-2247 fluids are of the saline Ca-Na-Cl type and contain 0.093 mM of total organic carbon and 0.066 mM of dissolved organic carbon. Dissolved carbon dioxide and methane have previously been detected in the gas phase (0.012 mL·L⁻¹ and 4.08 mL·L⁻¹, respectively) [37]. The Otaniemi deep drill hole is located on the Aalto University campus in Espoo, southern Finland (24.827° E, 60.188° N). Drill holes OTN2 and OTN3, drilled in 2016-2017, are designed for geothermal heat production by St1 Deep Heat Ltd. A pilot drill hole reaching a depth of 2 km was drilled and explored prior to the drilling of the deeper production wells, which provided information about the fracturing and temperature of the bedrock in the vicinity of OTN2 and OTN3. The drill hole sampled in this study had at the time been drilled to a depth of 4.5 km using the air hammer drilling technique. Estimated in situ temperatures range from 46 to 76 °C at the sampling depths of this study [38]. Bedrock in the Otaniemi area consists mainly of mica gneiss and migmatitic granite [39].

Sampling
Fluid samples for microbiological analyses were retrieved from Pyhäsalmi R-2247 on 26 April 2016 by first unplugging and flushing the drill hole for five hours. The flow rate was approximately 40 L·h⁻¹. An acid-washed, autoclaved, pressure-tight stainless-steel cylinder was fastened to the tap of the drill hole plug, flushed with circa 5 L of fluid and closed after it was filled to the top. Altogether seven parallel 0.5 L samples were retrieved. The cylinders were kept at room temperature (close to the ambient temperature of the fluids in the mine, 23 °C), transported to the laboratory and stored for 9 days prior to analysis. The valves were flame-sterilized and the cylinder valves were opened gradually under N₂ gas flow in order to allow the sample gas pressure to stabilize gradually to normal pressure. Samples were emptied from the cylinders into acid-washed, sterile Schott bottles under N₂ flushing. Biomass from six samples was immediately collected by filtering the sample through 0.2 µm Sterivex filters (Merck-Millipore, Merck KGaA, Darmstadt, Germany). Three filters with biomass were dedicated to DNA analysis (Pyhäsalmi DNA a-c) and three to RNA analysis (Pyhäsalmi RNA a-c). Filters dedicated to RNA analysis were filled with LifeGuard solution (Mo Bio, QIAGEN Inc., Hilden, Germany) for better preservation of RNA in the samples. All filters were stored in sterile 50 mL Corning tubes at −20 °C prior to further analysis.
Geochemical parameters (dissolved O₂, pH, electrical conductivity and temperature) were measured in a flow-through cell with portable sensors (WTW) at the beginning of the flushing and immediately after collecting the microbiological samples. In addition, samples were taken for geochemical laboratory analysis after ca. 1 hour of flushing (Geochemistry sample a) and just before the microbiological sampling (Geochemistry sample b). Filtered (<0.45 µm) 100 mL samples were taken for cation analysis, and 250 mL and 500 mL unfiltered samples were devoted to anion analyses and determination of alkalinity, respectively. Two 100 mL samples were also taken for sulfide analysis prior to microbiological sampling. These were collected in glass bottles (Winkler) and immediately fixed with 2 M NaOH and 1 M zinc acetate. Cation samples were acidified with ultrapure HNO₃ and all geochemical samples were stored at +4 °C prior to analysis. Alkalinity was determined by end-point titration to pH 4.5 on the same evening using a digital titrator (Hach, Loveland, CO, USA), and the other samples were brought to commercial laboratories for cation and anion analysis (Labtium Oy, Espoo, Finland) and sulfide analysis (Ramboll Oy, Vantaa, Finland).
Crushed rock material from Otaniemi drill holes OTN2 and OTN3 was gathered into a plastic sample collection bucket at the airflow output of the drilling apparatus. Samples were collected from the plastic bucket by grabbing a handful of crushed rock material with a UV-sterilized plastic bag and turning the bag inside out, while avoiding touching the sides of the bucket. Excess air was squeezed out of the bags and the samples were frozen at −80 °C for further analysis. Two replicate rock material samples were taken from each sampling depth. Samples were collected from OTN2 on 1 July and 9 August 2016 (Otaniemi 1, 2569 m and Otaniemi 7, 3115 m, respectively) and from OTN3 on five occasions from 29 October to 15 November (Otaniemi 2, 3, 4, 5 and 6, at depths of 4015, 3203, 4375, 4203 and 3617 m, respectively). As air was used in the drilling to flush the crushed rock material out of the borehole in Otaniemi, we collected air samples in order to detect possible airborne contaminants of the samples. We filtered 1 m³ of air during 35 min using an Impactor FH5® sampler (Markus Klotz GmbH, Bad Liebenzell, Germany). Airborne particles were retained on a gelatin filter paper (Gelatin Filter Disposables, Sartorius Stedim Biotech GmbH, Göttingen, Germany), which was further processed like the other filter samples. Air samples were collected twice, on 11 July and 1 November 2016.

Sample Preparation for Molecular Biology Analyses
All molecular biology procedures were carried out in a laminar flow cabinet sterilized with UV light; before RNA extraction, surfaces were also wiped with RNase Zap wipes (ThermoFisher Scientific, Waltham, MA, USA). DNA and RNA were extracted from the Pyhäsalmi mine fluid samples with the NucleoSpin Soil (DNA) and NucleoSpin RNA plant (RNA) kits (Macherey-Nagel, Düren, Germany). First, the Sterivex filter case was cut open with flame-sterilized tools and the filter was cut out with a sterile scalpel. The filter was cut into small slices and placed in the extraction tube of the kit. In the RNA extraction, the LifeGuard solution left in the Sterivex units was also pipetted into the extraction tube. DNA and RNA were then extracted according to the manufacturer's instructions. A sterile filter, serving as an extraction control, was treated in the same way as the samples. After extraction, DNA yield was measured with a Nanodrop spectrophotometer (ThermoFisher Scientific) and the DNA was kept at −20 °C for further analysis. RNA was reverse-transcribed to cDNA with the Sensifast cDNA Synthesis kit (BioLine, London, UK) according to the manufacturer's instructions, aliquoting each RNA sample into four parallel reactions that were combined after reverse transcription.
Two methods were used for DNA extraction from the Otaniemi rock samples: either direct extraction with the TriPrep kit (Macherey-Nagel) using 1 g of ground rock sample, or a washing procedure combined with nucleic acid extraction. The washing procedure consisted of mixing 100 g of rock material with 500 mL of Na-phosphate buffer (1 M, pH 7), shaking (30 min, 200 rpm), letting the heavier particles settle for 1.5 h and decanting the supernatant onto a cellulose acetate filter (Corning Inc., Corning, NY, USA) in order to collect the biomass by filtration. After filtration, the filters were cut out of the funnels with a sterile scalpel, halved and frozen at −20 °C in test tubes. Nucleic acid extraction from the filters was performed as described for the Pyhäsalmi samples above. Nucleic acid yield was measured with Qubit (ThermoFisher Scientific). Negative extraction controls as well as air controls were included in the analysis and treated in the same way as the actual samples.

Sequencing of the Microbial Community
The microbial community structure of Pyhäsalmi R-2247 was determined with unidirectional 16S rRNA gene amplicon sequencing of DNA and cDNA with the IonTorrent PGM platform at the Biocenter Oulu sequencing center (University of Oulu, Finland). The 16S rRNA gene V3-V4 region amplicon libraries were produced using MyTaq mastermix (Bioline, Memphis, TN, USA) and primers 341f and 785r [40] for bacteria and 349f-806r primers for archaea [41] (Table S1). For the fungal ITS1 gene region, ITS1 and ITS2 primers were used [42,43]. The amount of template DNA was increased to 4 µL, after first experimenting with 2 µL. The thermal cycle program used to amplify the libraries was as follows: initial denaturation at 94 °C for 5 min; 45 cycles of 94 °C for 1 min, 50 °C for 30 s and 72 °C for 1 min; and final extension at 72 °C for 10 min. After purification and size selection, amplicons were sequenced using the 316 Chip Kit v2 with Ion PGM Template IA 500 and Ion PGM Hi-Q Sequencing kits (Thermo Fisher Scientific).
The bacterial community structure of the Otaniemi samples was determined with paired-end (2 × 150 bp) 16S rRNA gene amplicon sequencing (V4-V5 region) from DNA retrieved with the washing procedure combined with kit extraction. Sequencing was performed with the Illumina MiSeq platform at the Marine Biological Laboratory (Woods Hole, MA, USA) according to their online protocol (https://vamps2.mbl.edu/resources/primers, accessed 10.12.2019). Briefly, the DNA concentration of the samples was first determined with the PicoGreen assay (ThermoFisher Scientific), and the samples were then concentrated with a speedvac. PCR amplification was done with 35 cycles instead of the default 30. Samples were cleaned using AMPure beads (Beckman Coulter Life Sciences, Indianapolis, IN, USA) according to the manufacturer's instructions and quantified prior to sequencing.
In addition, two Pyhäsalmi R-2247 samples (Pyhäsalmi DNA b and c) were sent to the Marine Biological Laboratory (MA, USA) for paired-end metagenomic sequencing with the Illumina MiSeq, resulting in ~275 nt long products, as part of the Deep Carbon Observatory's Census of Deep Life sequencing effort. Due to the low amount of DNA, multiple displacement amplification was performed prior to preparation of the metagenomic libraries. The Nugen Ovation UltraLow DR Kit was used to prepare the metagenomic libraries according to the manufacturer's instructions, except that the number of amplification cycles was increased from the suggested 18 to 22.

Quantification of Taxonomic and Functional Marker Genes
The quantities of different taxonomic and functional groups of microbes were determined with quantitative PCR. Bacterial and archaeal numbers were determined by amplification of the 16S rRNA gene with domain-specific primers [44-47]. Sulfate-reducing and nitrate-reducing microorganisms and methanogens were quantified using functional gene copy numbers. The functional genes amplified with qPCR were dsrB, narG and mcrA, respectively. Primers and standards for each assay are described in Supplementary Material, Table S1 [48-52]. Roche LightCycler SYBR Mastermix with 1 µL of bovine serum albumin, 10 µM of forward and reverse primer and 1 µL of nucleic acid template was used for the PCR mastermix. Quantitation was done with LightCycler 420 (Roche Molecular Diagnostics, Pleasanton, CA, USA) using the following protocol: 5 min at 95 °C; 40 cycles of 95 °C for 30 s, 60 °C for 30 s and 72 °C for 15 s; and final extension at 72 °C for 3 min. Sample fluorescence was measured at the end of each elongation phase. Melting curve analysis consisted of 15 s denaturation at 95 °C, a 1 min annealing phase at 55 °C (archaeal 16S rRNA and mcrA assays) or 65 °C (bacterial 16S rRNA, narG and dsrB assays), and a continuous measuring and melting step with the temperature rising 0.11 °C per s to 95 °C.
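Gene copy numbers in qPCR are typically read from a standard curve, a linear fit of quantification cycle (Cq) against the log10 copy number of a dilution series. The following is a minimal sketch of that conversion, not the LightCycler software's own procedure. The dilution series, the Cq values and all function names are hypothetical and chosen only for illustration.

```python
def fit_standard_curve(log10_copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate the copy number in a reaction."""
    return 10 ** ((cq - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical tenfold dilution series (10^1 to 10^5 copies per reaction).
standards_log10 = [1, 2, 3, 4, 5]
standards_cq = [33.0, 29.7, 26.4, 23.1, 19.8]

slope, intercept = fit_standard_curve(standards_log10, standards_cq)
estimate = copies_from_cq(26.4, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={estimate:.0f}")
```

A sample whose Cq lies beyond the most dilute standard can still be pushed through `copies_from_cq`, but the result is then an extrapolation outside the calibrated range rather than a quantification.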
Metagenomic reads from Pyhäsalmi were run through EBI Metagenomics pipeline v.4.0 using default settings [56]. Reads were quality checked and trimmed for low-quality regions and adapter sequences using Trimmomatic v. 0.35 [57]. Gene-coding regions were searched with FragGeneScan v. 1.20, InterProScan v. 5.25-64.0 and Prodigal v. 2.6.3. Sequence and structural similarities to noncoding RNA sequences were searched with Infernal v. 1.1.2 and rRNA sequences classified with MAPseq v. 1.2 using Silva 128 reference database. Metagenomic reads were also trimmed in Galaxy web platform (www.usegalaxy.org) [58] using Trimmomatic v.0.38.0 with ILLUMINACLIP, SLIDINGWINDOW and MINLEN trimming (parameters used: nr of bases to average across = 4, average quality = 20, minimum length of reads to be kept = 50) [57]. Taxonomic labels were assigned to trimmed reads with Kraken v. 1.3.0 [59] using database for bacteria, Kraken data translation to full NCBI taxonomy (v. 2015-15-10), and further visualized with a Krona chart. Metagenomic sequences were annotated with KAAS protein annotation tool using GHOSTX search with single-directional best-hit method, and KEGG's GhostKOALA annotation program generating KEGG Orthology assignments and reconstructing KEGG modules and pathways [60,61]. Reads were assembled to contigs with Megahit v. 1.1.2 [62] with default parameters.
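The trimming parameters above (a window of 4 bases, a minimum average quality of 20 and a minimum read length of 50) can be illustrated with a simplified sketch of sliding-window quality trimming. This is not Trimmomatic's implementation: it cuts the read at the start of the first window whose mean quality falls below the threshold, and the treatment of reads shorter than one window is an assumption made here. Function names are hypothetical.

```python
def sliding_window_keep_length(qualities, window=4, min_avg_quality=20):
    """Number of bases to keep from the 5' end of a read.

    Scans windows from the 5' end and cuts at the start of the first
    window whose mean Phred quality is below the threshold.
    """
    n = len(qualities)
    if n == 0:
        return 0
    if n < window:
        # Assumption: judge a read shorter than one window as a whole.
        return n if sum(qualities) / n >= min_avg_quality else 0
    for start in range(n - window + 1):
        if sum(qualities[start:start + window]) / window < min_avg_quality:
            return start
    return n


def trim_read(sequence, qualities, window=4, min_avg_quality=20, min_length=50):
    """Trim a read; return None if the result is shorter than min_length."""
    keep = sliding_window_keep_length(qualities, window, min_avg_quality)
    if keep < min_length:
        return None
    return sequence[:keep], qualities[:keep]


# Hypothetical read: 60 good bases followed by 10 poor ones.
quals = [30] * 60 + [10] * 10
trimmed = trim_read("A" * 70, quals)
print(len(trimmed[0]) if trimmed else "discarded")
```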
Microbial community functionality in Pyhäsalmi and Otaniemi samples was also predicted with FAPROTAX [63]. Taxonomy of the microbial communities was compared against the FAPROTAX database using the Microbiome Helper [63,64]. Relative functionality abundances and assigned functions for each sample were further visualized in R (R Core Team, 2018) with package ggpubr [65].

Diversity Indices
Shannon (H') and Simpson diversity indices in addition to species richness (Chao1) and coverage (ACE) estimates were calculated using the MOTHUR-generated .biom-file in R with Phyloseq package version 1.20.0 [66].
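For readers unfamiliar with these indices, the sketch below computes them from a vector of per-OTU read counts. It is an illustration of the formulas in Python, not the Phyloseq code used in the study. Two assumptions apply: Simpson is reported as 1 − Σp², and Chao1 uses the bias-corrected form S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs. The example counts are hypothetical.

```python
import math


def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson diversity expressed as 1 - sum(p^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


def richness_captured(counts):
    """Fraction of the estimated richness that was actually observed."""
    observed = sum(1 for c in counts if c > 0)
    return observed / chao1(counts)


# Hypothetical OTU read counts for one sample.
otu_counts = [10, 5, 1, 1, 2, 2, 0]
print(f"H'={shannon(otu_counts):.2f}, Simpson={simpson(otu_counts):.2f}, "
      f"Chao1={chao1(otu_counts):.2f}, captured={richness_captured(otu_counts):.0%}")
```

The ratio of observed to estimated richness is how a statement such as "53% of the richness was captured" can be read.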

Geochemistry
Geochemical data collected in this study are presented in Table 1, and field measurements in Table 2. The geochemistry has been previously characterized in Miettinen et al. (2015) [37], and no significant changes in the composition of the fluid were observed. Ca and Na are the most dominant cations, and Cl is the most abundant anion. Total dissolved solids are up to 81 g L⁻¹, demonstrating the high salinity of the fluids (Table 1). While the other parameters measured in the field were constant compared to the previous report, a slight rise in pH could be observed (Table 2).

Low Biomass Environment
Both our sampling sites represent ultralow biomass environments. According to cell number estimates measured with quantitative PCR, bacteria were universally more common in the studied ultradeep biospheres than other microorganisms. The copy numbers (used as a proxy for the number (Figure 1). Archaeal 16S rRNA gene copies were below the detection limit (40 copies per mL or g of sample) in Pyhäsalmi, and could only be estimated by extrapolating from the standard curve. Otaniemi crushed rock samples contained 20-105 copies of bacterial and 1-2 copies of archaeal 16S rRNA genes per gram. Nitrate reduction (narG) marker gene copies were detected from Otaniemi samples, copy numbers ranging from 2-13 per mL. Extraction control and PCR negative control resulted in 637 and 1476 bacterial reads, respectively. Amplicon sequencing of the bacterial 16S rRNA gene in the Otaniemi samples resulted in a total of 270k reads. Of this, the shallowest depth (3203 m) contained 62% (167,290 reads), the sample from 4203 m 16% (42,413 reads) and the deepest depth at 4375 m 12% (28,779 reads) of the total reads. Approximately 11% of the reads were obtained from the extraction control. Sequencing was not successful from the control air samples and the PCR negative control. Although the extraction control also resulted in a number of reads, the OTUs it contained were removed from the data prior to diversity analyses and functional prediction with FAPROTAX. Pyhäsalmi metagenomes resulted in libraries of 221 Mb and 130 Mb.
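The removal of control-derived OTUs before downstream analysis can be sketched as follows. This is an illustration only: it assumes a strict rule in which any OTU with at least one read in a control is dropped from all samples, which may differ from the exact criterion applied in this study. The OTU table and all names are hypothetical.

```python
def remove_control_otus(otu_table, control_samples):
    """Drop every OTU that has at least one read in any control sample.

    otu_table maps sample name -> {OTU id -> read count}.
    Returns a new table containing only the non-control samples.
    """
    contaminated = {
        otu
        for sample in control_samples
        for otu, count in otu_table[sample].items()
        if count > 0
    }
    return {
        sample: {otu: c for otu, c in counts.items() if otu not in contaminated}
        for sample, counts in otu_table.items()
        if sample not in control_samples
    }


def relative_abundance(counts):
    """Convert read counts to fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        return {otu: 0.0 for otu in counts}
    return {otu: c / total for otu, c in counts.items()}


# Hypothetical OTU table with one sample and one extraction control.
table = {
    "sample_A": {"otu1": 60, "otu2": 30, "otu3": 10},
    "extraction_control": {"otu1": 0, "otu2": 0, "otu3": 50},
}
filtered = remove_control_otus(table, {"extraction_control"})
print(relative_abundance(filtered["sample_A"]))
```

A strict presence-based rule can also discard genuine community members that happen to appear in a control at low counts, so the choice of criterion matters most in low-biomass samples.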

Microbial Community Composition
While the samples contained extremely low biomass, we retrieved more than 200 different 16S rRNA OTUs with the community sequencing effort from both study sites. OTUs representing on average less than 1% of the total bacterial community in each sample formed approximately 10-20% of the bacterial communities, thus representing the rare biosphere. Altogether 286 bacterial OTUs were detected from the Pyhäsalmi mine samples originating from the 2.4 km deep drill hole. According to phylogenetic marker gene amplicon sequencing, the majority of the microbial community of this drill hole in Pyhäsalmi consisted of Proteobacteria, mainly Alpha- and Gammaproteobacteria (on average 96%) (Figure 2). The Shannon diversity estimate H' ranged from 2.3 to 2.4 between the Pyhäsalmi samples (Table 3). A phylotype closely affiliating with Alkanindiges (Gammaproteobacteria) was the most dominant detected in both the DNA and RNA fraction, with a relative abundance of 56-64% of the bacterial community. In the DNA-derived bacterial community, the second most common OTU affiliated with Parvibaculum (Alphaproteobacteria), while in the RNA fraction, an OTU affiliating with Sphingobacteriaceae (Bacteroidetes) had higher relative abundance. On average, according to the Chao1 estimate, 53% of the richness of the community detected from the DNA fraction and 57% of the RNA fraction was captured (Table 3). The abundance-based coverage estimate (ACE) showed that on average 52% and 49% of the bacterial richness was captured in sequencing the DNA and RNA fractions, respectively (Table 3). The DNA and RNA communities shared 59 OTUs (34.5% of all the OTUs). Of the sequences in the negative DNA extraction control, 90% affiliated with betaproteobacterial Ralstonia, which was not detected in any of the actual samples (Supplementary Material, Table S2). In the PCR negative control, the most common OTUs affiliated with different Cyanobacteria (52%). These were rare, on average 0.02% of the sequences from all the subsurface samples. However, 5% of the sequences in the PCR negative control sample affiliated with an alphaproteobacterial phylotype that was also present in the samples with 1-2% relative abundance. The only archaeal sequences detected with 16S rRNA gene sequencing affiliated with Methanobrevibacter (Methanobacteria) and thaumarchaeotal "Candidatus Nitrosopumilus".
Figure 2. Bacterial community structure based on 16S rRNA gene amplicon sequencing at Pyhäsalmi and Otaniemi deep drill holes. Results from three replicate samples of Pyhäsalmi have been combined and average relative abundance is shown. OTUs detected from controls are filtered out from the result, except for one unclassified Alphaproteobacterial OTU present in Pyhäsalmi samples (marked with an asterisk, *). Rare OTUs represent those OTUs that were present at less than on average 1% or 0.1% relative abundance in the samples from Otaniemi and Pyhäsalmi, respectively.
Fungal ITS sequences affiliated with ascomycotal Cladosporium and Orbilia, and basidiomycotal Vuilleminia, Apiotrichum and Trichosporon (98-100% identity score from BLAST). Some sequences could only be identified to kingdom level (i.e., to Fungi). Because there were very few sequences per sample, ecological indices were not calculated for archaeal and fungal sequences.
Sequencing was successful from three different depths of the Otaniemi OTN3 drill hole (samples 3, 4 and 5). Bacterial communities from these depths had distinct compositions (Figure 2). Altogether 203 OTUs were observed in the data after the OTUs of the DNA extraction control had been filtered out. Of these, the samples shared 27 OTUs. The bacterial community at Otaniemi at a depth of 3203 m consisted mainly of Gammaproteobacteria (37%), Firmicutes (15%), Actinobacteria (13%), unclassified Bacteria (13%) and Bacteroidetes (10%). This depth also had the highest diversity index, Shannon H' = 3.9 (Table 3). Gammaproteobacteria formed the majority of the bacterial community at 4203 m depth (78%). Betaproteobacteria (Burkholderiaceae-affiliating OTU, 5%) and Actinobacteria (mostly Dietzia, 5% relative abundance) had the next highest relative abundances. The Shannon diversity index H' was 3.4. The major groups in the 4375 m sample were Gammaproteobacteria (53%), Actinobacteria (15%, of which a Nocardioides-affiliating OTU represented 7% relative abundance) and Bacteroidetes (10%). This sample had an H' index of 2.5, which was the lowest of all studied Otaniemi samples. The most common gammaproteobacterial OTUs affiliated with Pseudomonas, Acinetobacter, Enhydrobacter and unclassified Enterobacteriaceae. The second-most common OTU in the Otaniemi bacterial communities could not be classified further than to phylum level (unclassified Bacteria). Of all the control samples taken, only the DNA extraction control yielded sequences, most of which affiliated with Gammaproteobacteria. Of these, Pseudomonas-affiliating OTUs were also detected in the samples, but, for example, Solirubrobacter, Rheinheimera, Afipia and unclassified Burkholderiaceae-affiliating sequences were only detected in large quantities in the DNA extraction control (Supplementary Material, Table S2).
The metagenomic data on the microbial community structure in Pyhäsalmi loosely support the amplicon sequencing results. From the two combined metagenomes, most of the reads were assigned to bacteria, 97% of the total of 63,255 sequences (Supplementary Material, Figure S1). According to Kraken, the majority of the bacterial community at 2.4 km depth at Pyhäsalmi consisted of Proteobacteria (on average 39%). Actinobacteria (38%) and Firmicutes (15%) were also present in the combined metagenome. Alphaproteobacterial Rhizobiales, betaproteobacterial Burkholderiales and gammaproteobacterial Pseudomonadales represented the most abundant orders. Actinobacteria and Firmicutes comprised several presumably contaminant taxa, namely Propionibacterium, Streptococcus and Staphylococcus. Archaeal sequences formed 1% of the total community, with Methanosarcinales the most abundant order. Thaumarchaeota represented 1% of the archaeal community.

Microbial Functionality
Microbial functionality was tested with marker gene assays using quantitative PCR. Out of the tested marker genes, only narG gene copies were detected. These nitrate reduction marker genes were successfully quantified from 3203 m and 4375 m at Otaniemi (on average 5 and 13 copies of narG per g of sample, respectively) (Figure 1). As the detection limit of this assay is 10 copies per g of sample, we could only extrapolate numbers from the standard curve for the 3203 m sample. Sulfate reduction and methanogenesis marker gene copies were not detected from either site.
Looking into the functions in the Pyhäsalmi metagenomes, the most frequent gene ontology annotations in the biological process category were metabolic and biosynthetic process, nitrogen compound and small molecule metabolic process, and processes involved in transport (Figure 3). In the molecular function category, ion, nucleic acid and nucleotide binding, and oxidoreductase and catalase activities were the most prevalent categories. The gene ontologies that were most abundant in the cellular component category were the membrane and intrinsic to membrane categories.
In order to understand the ecosystem functionality of the microbial community more deeply, we used KEGG's GhostKOALA annotation for the Pyhäsalmi metagenomes. Reconstructed KEGG modules revealed the complete module of the reductive pentose phosphate pathway (Supplementary Material, Table S3). Complete modules for gluconeogenesis, pyruvate oxidation, pentose phosphate pathway, glyoxylate cycle, and several amino acid biosynthesis modules were also detected. In the environmental information processing category, the nitrate/nitrite transport system was complete, as were, for example, the phospholipid, ribose, peptide/nickel and ABC transport systems. The dissimilatory nitrate reduction pathway was fully reconstructed using KAAS (Supplementary Material, Figure S2).
The Megahit assembly resulted in 3502 contigs with a total of 1.4 Mbp. The average contig length was 408 bp, ranging from a minimum of 200 bp to a maximum of 9 kbp. The N50 was 398 bp. Given this number of contigs and the short average contig length, no further analysis was attempted for the metagenomic data.
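The reported figures are internally consistent: 3502 contigs at an average of 408 bp give about 1.43 Mbp. N50 is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch of that calculation follows; the function names and the example lengths are hypothetical and are not taken from the Megahit output.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    half_total = sum(contig_lengths) / 2.0
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


def assembly_summary(contig_lengths):
    """Basic assembly statistics of the kind reported in the text."""
    total = sum(contig_lengths)
    return {
        "contigs": len(contig_lengths),
        "total_bp": total,
        "mean_bp": total / len(contig_lengths),
        "min_bp": min(contig_lengths),
        "max_bp": max(contig_lengths),
        "n50_bp": n50(contig_lengths),
    }


# Hypothetical contig lengths for illustration.
lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(assembly_summary(lengths))
```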
The functional profiles from FAPROTAX analysis had 370 and 57 assignments affiliating to at least one group in Otaniemi and Pyhäsalmi samples, respectively. Otaniemi samples hosted 28 different functional groups, whereas Pyhäsalmi samples hosted 32 different functional groups. Some of the detected microbial community members in Otaniemi and Pyhäsalmi remained unclassified and represented uncultured species, whereas the FAPROTAX database relies on the characterized strains. In detail, 89% of the Otaniemi OTUs and 80% of the Pyhäsalmi OTUs were left without functional assignment based on FAPROTAX database.
Functional FAPROTAX predictions indicated that chemoheterotrophy represented a major driving force in deep biosphere metabolism in both Otaniemi and the Pyhäsalmi mine (Figure 4) (Supplementary data, Data file S1). OTUs grouped with a chemoheterotrophic lifestyle belong to Alphaproteobacteria, Gammaproteobacteria, Actinobacteria and Firmicutes. The same OTUs also appeared in the aerobic chemoheterotrophy category, likely due to the nature of the bacterial species (facultative anaerobes) to which these OTUs were assigned. In Pyhäsalmi, the metabolic profiling indicates possible sulfate and sulfur compound respiration. The metabolic profiles of the Otaniemi microbial communities show potential for fermentation, methylotrophy and aromatic carbon compound degradation. The functional fermentation group was detected based on orders such as Clostridiales, Bacteroidales and Pseudomonadales. Methanol oxidation and methylotrophy are potential metabolisms at Otaniemi at a depth of 4375 m. Minor levels of sulfate metabolism (0.01-0.3% of detected functional groups) were detected, and sulfate and sulfur compound respiration in Otaniemi and Pyhäsalmi could be linked to Desulfobacterales (Supplementary data, Data file S1). There is potential for more complex metabolic pathways in the Otaniemi microbial communities, such as degradation of aromatic and hydrocarbon compounds.
Looking into the functions in Pyhäsalmi metagenomes, the most frequent gene ontology annotations in the biological process category were metabolic and biosynthetic process, nitrogen compound and small molecule metabolic process, and processes involved in transport (Figure 3). In the molecular function category, ion, nucleic acid and nucleotide binding, and oxidoreductase and catalase activities were the most prevalent. The most abundant gene ontologies in the cellular component category were membrane and intrinsic to membrane.

Discussion
For more than two decades, NASA's Mars exploration program has been themed "Follow the Water". Traditionally, the habitability of a planet has been defined by the possibility of liquid water existing on its surface. Today we know that life survives and thrives even in the deepest realms of the oceans, in subsurface sediments and even in cracks and fractures of deep, crystalline bedrock [21,22,67-69]. This study, like other deep subsurface investigations, shows that deep subsurface habitats usually retain very low cell numbers [70-73]. However, the deep biosphere constitutes approximately 15% of the total biomass of the Earth [74] and could therefore play a significant role in the dynamics of elemental cycling on all inhabited planetary objects.

Habitats Hosting Low Biomass
Multi-extreme surface habitats on planetary bodies are considered inhospitable [75]. The subsurface could provide a more suitable environment for microbial life, although, if it is analogous to Earth, low biomass can impede the detection of life in these environments. Detection of functional molecules such as DNA and RNA would be a powerful indicator of life on other planetary bodies and could be regarded as smoking-gun evidence [76], assuming that extraterrestrial life is DNA- or RNA-based. Quantitative PCR methods are very sensitive and work well with very low amounts of DNA [77]. Quantitative PCR could thus provide information about the abundance and functionality of life, assuming that life on other planetary objects is not excessively different from life on Earth. In the present study, we demonstrated the feasibility of qPCR for detecting life in an ultralow biomass habitat, with detection of fewer than ten to some hundreds of copies of 16S rRNA gene and transcript fragments from the extracted DNA and RNA. Such an approach demonstrates a targeted need for sample return missions.
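Copy numbers in qPCR are commonly derived from a standard curve that relates the quantification cycle (Ct) linearly to the logarithm of the template copy number. The following Python sketch illustrates that conversion under stated assumptions: the dilution series, the Ct values and the function names are hypothetical, and the calculation is a generic textbook approach rather than the exact procedure used in this study.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    if n < 2 or n != len(ct_values):
        raise ValueError("need at least two paired standard points")
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the template copy number."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope (1.0 means doubling every cycle)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series: 10^1 to 10^5 copies per reaction.
standards_log10 = [1, 2, 3, 4, 5]
standards_ct = [34.68, 31.36, 28.04, 24.72, 21.40]

slope, intercept = fit_standard_curve(standards_log10, standards_ct)
sample_copies = copies_from_ct(33.0, slope, intercept)  # hypothetical sample Ct
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"estimated copies: {sample_copies:.0f}")
print(f"efficiency: {amplification_efficiency(slope):.0%}")
```

In low biomass samples the estimate often falls near the lowest standard, so the range of the dilution series and the signal in no-template controls limit how small a copy number can be reported with confidence.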

Microbial Community Structure
The microbial communities detected from both our study sites had a few dominating OTUs. In Pyhäsalmi, more than half of the 16S rRNA genes and transcripts sequenced affiliated with Alkanindiges. The type strain of this bacterium, Alkanindiges illinoisensis, isolated from oilfield soil, uses long-chain linear and branched hydrocarbons and grows only weakly on acetic acid [78]. Interestingly, the Alkanindiges 16S rRNA gene sequence is 99.1% similar to that of an unclassified bacterium clone from deep groundwater in Maqarin, Jordan; therefore, this is not the first time this type of bacterium has been detected from the deep biosphere [79]. No long-chain hydrocarbons have been reported from the borehole in Maqarin, but small amounts of iso-butane, n-butane and n-pentane have been reported from the Pyhäsalmi R-2247 drill hole [37]. Thus, it is possible that this bacterium uses those as growth substrates in Pyhäsalmi bedrock.
Pseudomonas was the most common phylotype in Otaniemi samples. Bacteria related to Pseudomonas have been detected in many deep biosphere studies, and they have been suggested to form the core microbiome in the deep subsurface [80-82]. Although there is speculation about Pseudomonas being contaminants in deep subsurface sequence datasets, there are cultivation studies where these microbes have been isolated from deep fluids [83,84]. Other bacteria affiliating with Pseudomonadales (Acinetobacter, Enhydrobacter) were also present in relatively high abundance in Otaniemi. Pseudomonadales is a heterogeneous order of Proteobacteria whose members are ubiquitous in many ecosystems, including aquatic and soil environments, and have significant ecological importance. They can be regarded as "weeds" of the bacterial kingdom, as they can grow in various habitats with a wide temperature and pH range. As pseudomonads are metabolically versatile chemoorganotrophs, their carbon sources can vary from amino acids to aromatic compounds. Pseudomonads are mainly aerobic organisms, while some are denitrifiers [85-87].
The apparent lack of chemoautotrophs in our samples is surprising, as these are usually the main primary producers in the deep biosphere [26]. However, chemoheterotrophic organisms are flexible in their metabolism and therefore might gain a competitive edge over the chemoautotrophs [22,88]. In many deep subsurface environments, autotrophs form only a minor proportion of the total community compared to heterotrophs [69,89], so we may have missed them because of the low biomass in the first place. In addition, recent studies have shown that some ultra-small microbes in the deep biosphere pass through filters of the type used in this study, which introduces bias to the microbial community structure analysis as well as to the functional profiling of the community [24,90,91]. Some bacteria also decrease in cell size under oligotrophic conditions similar to those in deep crystalline bedrock fluids [91] and may be lost during the biomass collection step.
The rare biosphere, i.e., OTUs each comprising less than 1% of the total microbial community, was present to a significant extent in both study sites. The rare biosphere represents an important gene pool and may in fact play a disproportionately large role in biogeochemical cycling [92,93]. Interestingly, we could also detect some archaea, which usually represent a minor part of the total community in deep subsurface environments, and some fungal signals as well. Fungi in the deep subsurface are not particularly well characterized. Only recently have studies highlighted their existence in the deep biosphere, and their ecological role is still rather unclear [94,95].
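The 1% cut-off used to define the rare biosphere can be applied directly to an OTU count table. The Python sketch below shows one way to do this; the OTU names and read counts are hypothetical and serve only to illustrate the threshold.

```python
def split_rare_biosphere(otu_counts, threshold=0.01):
    """Split OTUs into abundant and rare groups by relative abundance.

    otu_counts: dict mapping an OTU id to its read count in one sample.
    threshold: OTUs with a relative abundance below this value are rare.
    Returns two dicts (abundant, rare) mapping OTU id -> relative abundance.
    """
    total = sum(otu_counts.values())
    if total <= 0:
        raise ValueError("total read count must be positive")
    abundant, rare = {}, {}
    for otu, count in otu_counts.items():
        relative = count / total
        if relative < threshold:
            rare[otu] = relative
        else:
            abundant[otu] = relative
    return abundant, rare


# Hypothetical sample with 1000 reads spread over five OTUs.
toy_counts = {"OTU_A": 600, "OTU_B": 350, "OTU_C": 45, "OTU_D": 4, "OTU_E": 1}

abundant, rare = split_rare_biosphere(toy_counts)
print("rare OTUs:", sorted(rare))
print(f"share of reads in rare OTUs: {sum(rare.values()):.1%}")
```

The threshold is applied per sample here; applying it to counts pooled over all samples would be an equally defensible choice and can move individual OTUs across the cut-off.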

Metabolic Capacities of Microbial Communities
For future astrobiology or sample return missions to Mars or icy moons, we need to define the feasible microbial functional capacities within the subsurface. It is commonly thought that chemolithoautotrophic organisms would be best adapted to conditions on Mars [96,97]. The evolutionary emergence of chemolithoautotrophs coincides with the timeframe when conditions on Mars were favorable for life [98]. Chemolithoautotrophs use inorganic carbon (CO2) for building biomass and inorganic compounds for generating energy. In this study, we detected signals of microbial life that uses organic carbon and a variety of different energy metabolisms. For example, the most abundant organisms in the microbial communities used small organic molecules in both ultradeep subsurface sites in this study. The predicted functionality shows that chemoheterotrophy is a common feature of these microbial communities. As trace organics have been detected on Mars and in plumes ejected from Enceladus [99-101], chemoorganotrophs should not be neglected in future life detection missions. Even though autotrophs are the primary producers of ecosystems on Earth, the number of autotrophs supporting the total microbial community is sometimes much lower in the deep subsurface compared to heterotrophs [23,89]. Therefore, it might be more reasonable to aim the detection of life towards chemoorganotrophs.
However, potential for autotrophy was demonstrated in Pyhäsalmi metagenomes. The complete reductive pentose phosphate pathway reconstructed with KEGG shows that this important mechanism of autotrophic CO2 fixation in nature could be functional in ultradeep crystalline bedrock [102].
Nitrate has been detected in mudstone deposits at Gale Crater on Mars and could provide a nitrogen source instead of atmospheric nitrogen, whose share of the Martian atmosphere (2.6%) is much lower than that on Earth (78%) [103,104]. Although the nitrate concentration in Pyhäsalmi is below the detection limit of 0.2 mg L⁻¹, we found multiple indications that nitrate metabolism plays a role in the ultradeep, oligotrophic subsurface. From the Pyhäsalmi metagenomic data, the dissimilatory nitrate reduction pathway could be reconstructed and a complete nitrate/nitrite transport system mapped. One of the archaeal OTUs detected from Pyhäsalmi samples affiliated with "Candidatus Nitrosopumilus", an autotrophic ammonia-oxidizing thaumarchaeon [105,106]. We could also detect marker gene copies of the nitrate reductase gene narG, which functions in the first nitrate-reducing step of the dissimilatory nitrate reduction pathway. Putative functions related to nitrogen metabolism were predicted with FAPROTAX as well. Higher nitrate levels were regarded as a sign of habitability in a Mars analog environment and were suggested as a useful guide for finding life on Mars [107]. This idea is reinforced by the detection of nitrate cycling potential in another type of analog environment in our study. Likewise, among the other functions predicted with FAPROTAX, sulfur and sulfate respiration and oxygen-dependent methylotrophy could be accomplished in Martian subsurface conditions [97,108]. However, as the FAPROTAX analysis demonstrated, only a relatively small percentage (11-20% in this study) of the total diversity in ultradeep bedrock can be assigned to a cultured microbial species, and therefore the metabolic potential of the deep biosphere remains elusive.
When analyzing microbial communities and their metabolic potential in substrate-limited and oligotrophic environments, one must take into consideration that metabolic pathways may be truncated. Microbes gain energy faster by partial oxidation of carbon compounds, producing intermediate metabolites that, with enough biodiversity, act as food and energy sources for other members of the community [109]. Substrates are recycled in a community so effectively that end-products do not accumulate significantly [110], which further complicates the observation of these possible biosignatures in the search for life in the Solar System.

Considerations on Contamination
Contamination is a pressing issue in all studies of low biomass environments. The cell numbers are extremely low, and consequently the risk of contamination from different sources during sampling and laboratory procedures is high. Even when samples are retrieved with aseptic techniques, laboratory work is done with precaution and sequence analysis is subject to stringent quality control, there are still sources of contamination that cannot be ruled out in deep biosphere studies. Deep subsurface studies often take advantage of predrilled holes in mines (in this study Pyhäsalmi) or other industrial drilling (Otaniemi), where microbiological sampling was not considered during drilling and the operation typically did not allow an equipment sterilization step beforehand. Therefore, implementing controls in each step of the study (sample collection, nucleic acid extraction, PCR, reverse transcription of RNA and sequencing) is fundamental [83].
We followed the moderately stringent contaminant removal methodology suggested by Sheik et al., but our sequencing dataset still had several OTUs that could be considered contaminants (e.g., Pseudomonas, Acinetobacter, Sphingomonas, Burkholderia, Streptococcus, Lactobacillus, Dietzia) [83,111]. Most of these have been shown to originate from nucleic acid extraction kits, which would likely be used to extract nucleic acids from samples in sample return missions. However, there is ongoing technology development for isolation and sequencing of nucleic acids in situ on other planetary objects [112,113]. These methodologies would also be able to identify forward contamination, which is a concern whenever landing spacecraft on Mars [96]. Nonetheless, similar precautions and quality control in sampling, sample processing and data analysis should be followed whether we are working with low biomass and analog environments on Earth, meteorites or actual extraterrestrial deep subsurface samples.
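One simple way to use negative controls in data analysis is to flag OTUs whose read counts in the controls are high relative to their counts in the samples. The Python sketch below illustrates this idea only; the ratio rule, the threshold value and the OTU names are assumptions made for illustration and do not reproduce the methodology of Sheik et al. or the exact filtering applied in this study.

```python
def flag_contaminants(sample_counts, control_counts, min_ratio=0.1):
    """Flag OTUs that are likely to derive from reagents or handling.

    sample_counts: dict mapping OTU id -> reads summed over true samples.
    control_counts: dict mapping OTU id -> reads summed over negative controls.
    min_ratio: hypothetical cut-off; an OTU is flagged when its control reads
    are at least this fraction of its sample reads.
    Returns the set of flagged OTU ids.
    """
    flagged = set()
    for otu, control_reads in control_counts.items():
        sample_reads = sample_counts.get(otu, 0)
        if control_reads > 0 and control_reads >= min_ratio * sample_reads:
            flagged.add(otu)
    return flagged


# Hypothetical read counts for three sample OTUs and three control OTUs.
samples = {"OTU_A": 500, "OTU_B": 40, "OTU_C": 10}
controls = {"OTU_A": 2, "OTU_B": 30, "OTU_D": 5}

suspects = flag_contaminants(samples, controls)
print("flagged as possible contaminants:", sorted(suspects))
```

A flagged OTU is not necessarily absent from the habitat, as the cultivation of Pseudomonas from deep fluids shows, so such a list is better treated as a prompt for closer inspection than as an automatic removal step.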

Conclusions
In this study, we detected diverse bacterial communities in two different deep terrestrial subsurface locations in the Fennoscandian Shield. Archaea and fungi were detected in very low numbers compared to the bacteria. A gammaproteobacterial Alkanindiges OTU dominated in fluids retrieved from 2.4 km depth, while Pseudomonas-related OTUs were common in crushed rock samples retrieved from even deeper, up to 4.4 km depth. Many detected OTUs affiliated with bacteria known for chemoheterotrophic metabolism and/or participation in nitrogen cycling. Metagenomic data also indicated potential for nitrate reduction. In conclusion, this study describes the microbial community in a low biomass, carbon-depleted and energy-deprived deep subsurface environment. The information retrieved can be useful for future space missions searching for signs of life on other planetary objects. Missions can be aimed at detecting heterotrophic life in the subsurface and, if they are successful, Martian life can be compared to the deep biosphere found on Earth.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2075-1729/10/1/2/s1: Table S1: Details on the primers used in this study. Table S2: Raw counts of OTUs in Pyhäsalmi and Otaniemi samples. Table S3: GhostKOALA module reconstructs from Pyhäsalmi metagenome. Figure S1: Krona chart of the microbial community from Pyhäsalmi metagenome based on Kraken analysis. Figure S2: KAAS produced KEGG pathway for nitrate reduction. Data file S1: FAPROTAX reports.
Funding: This research was funded by Wihuri Foundation postdoctoral research grant for L.P., KYT2018 and KYT2022 grants (RENGAS and BIKES) to R.K., and a Royal Society of Edinburgh Research Fellowship to C.C. Deep Carbon Observatory sponsored Community of Deep Life sequencing opportunities for Pyhäsalmi metagenomic sequencing and Otaniemi 16S rRNA gene amplicon sequencing. COST Action Life-ORIGINS (TD1308) funded the short-term scientific mission for L.P. to visit ETH Zürich to perform qPCR analyses.
acknowledged for their assistance in the field sampling. Pyhäsalmi Mine, First Quantum Minerals Ltd., Pyhäjärven Callio/Calliolab, and Mikko Numminen are thanked for allowing, arranging and assisting sampling in the Pyhäsalmi mine. Tero Saarno and Rami Niemi (St1 Deep Heat) are acknowledged for access to and assistance on the Otaniemi deep drilling site. Deep Carbon Observatory's Community of Deep Life coordinator Rick Colwell, and Hilary Morrison from Marine Biology Laboratory are thanked for sequencing opportunities and assistance. Mark Lever is acknowledged for hosting L.P. during the short-term scientific mission and the opportunity to use the facilities of the Genetic Diversity Center in ETH Zürich. Christian Brandt (Swedish University of Agricultural Science) is thanked for the help with the visualization script. Three anonymous reviewers are thanked for their valuable comments and suggestions for improving the manuscript.