Insights into Biodegradation Related Metabolism in an Abnormally Low Dissolved Inorganic Carbon (DIC) Petroleum-Contaminated Aquifer by Metagenomics Analysis

In petroleum-contaminated aquifers, biodegradation is always associated with various types of microbial metabolism. It can be classified as autotrophic (such as methanogenic and other carbon fixation) and heterotrophic (such as nitrate/sulfate reduction and hydrocarbon consumption) metabolism. For each metabolic type, there are several key genes encoding the reaction enzymes, which can be identified by metagenomics analysis. Based on this principle, in an abnormally low dissolved inorganic carbon (DIC) petroleum-contaminated aquifer in North China, nine groundwater samples were collected along the groundwater flow, and metagenomics analysis was used to discover biodegradation related metabolism by key genes. The major new finding is that autotrophic metabolism was revealed, and, more usefully, we attempt to explain the reasons for abnormally low DIC. The results show that the methanogenesis gene, Mcr, was undetected but more carbon fixation genes than nitrate reduction and sulfate genes were found. This suggests that there may be a considerable number of autotrophic microorganisms that cause the phenomenon of low concentration of dissolved inorganic carbon in contaminated areas. The metagenomics data also revealed that most heterotrophic, sulfate, and nitrate reduction genes in the aquifer were assimilatory sulfate and dissimilatory nitrate reduction genes. Although there was limited dissolved oxygen, aerobic degrading genes AlkB and Cdo were more abundant than anaerobic degrading genes AssA and BssA. The metagenomics information can enrich our microorganic knowledge about petroleum-contaminated aquifers and provide basic data for further bioremediation.


Introduction
The biodegradation of petroleum plays a vital role in transforming hydrocarbons into harmless matter in the natural environment [1]. The understanding of microorganism metabolism related to biodegradation in petroleum hydrocarbon contaminated (PHC) aquifers can help us to evaluate the potential ability of biodegradation, which is the theoretical basis of bioremediation [2]. while NA68 and MW6 were designed as the downstream wells. PM7 and NA7 were designed as transitional wells to monitor the information of the zone adjacent to the contaminated source zone. The aquifer at the site is mainly composed of sand gravel layers. The thickness of the aquifer is about 3.3 m. Its lower layer is silty clay, which can be considered as the aquiclude. Its upper layer is also silty clay at most of the site, while in the downstream area (around NA68 and MW6) there is no upper aquiclude. The groundwater depth is about 21.7 m to 24.0 m and the water table is higher than the upper confining bed. Therefore, the groundwater is confined in the main area and unconfined in the downstream area. The groundwater flow is from northwest to southeast.

Sample Collection and Chemical Analysis
Samples were collected from the monitoring wells using bailers in August 2018. Information about the water temperature (T), pH, electrical conductivity (EC), dissolved oxygen (DO), and oxidation-reduction potential (ORP) in the collected groundwater was determined using a Portable Parallel Analyzer (SL1000, Hach, Loveland, CO, USA) and recorded before collecting the laboratory testing samples. The groundwater samples were considered to be representative when the variations of T, pH, EC, DO, and ORP in three successive samples were within ±1 °C, ±0.2, ±3%, ±10%, or ±0.2 mg/L, and ±20 mV, respectively. When sampling, 1 L of groundwater was collected in two 500 mL plastic buckets for inorganic analysis and 160 mL was collected in four 40 mL amber glass bottles with Teflon airtight caps for organic and CO2 analysis. The samples were then stored on ice in an incubator before being transported to the laboratory. For the metagenomics analysis, 5 L of groundwater was collected in a sterilized 5 L plastic bucket and stored on ice in an incubator. The water samples were transported to the laboratory, and DNA was collected in five poly tetra fluoroethylene filter membranes with a pore size of 0.22 μm by air pump filtration in one day. The filter membranes were stored at −80 °C until DNA extraction.

DNA Isolation, Sequencing, and Library Construction
Total genomic DNA was extracted from nine collected samples using the E.Z.N.A. ® DNA Kit (Omega Bio-tek, Norcross, GA, USA) according to the manufacturer's instructions. Concentration and purity of extracted DNA were determined with TBS-380 and NanoDrop2000, respectively. DNA extract quality was checked on 1% agarose gel. The aquifer at the site is mainly composed of sand gravel layers. The thickness of the aquifer is about 3.3 m. Its lower layer is silty clay, which can be considered as the aquiclude. Its upper layer is also silty clay at most of the site, while in the downstream area (around NA68 and MW6) there is no upper aquiclude. The groundwater depth is about 21.7 m to 24.0 m and the water table is higher than the upper confining bed. Therefore, the groundwater is confined in the main area and unconfined in the downstream area. The groundwater flow is from northwest to southeast.

Sample Collection and Chemical Analysis
Samples were collected from the monitoring wells using bailers in August 2018. Information about the water temperature (T), pH, electrical conductivity (EC), dissolved oxygen (DO), and oxidation-reduction potential (ORP) in the collected groundwater was determined using a Portable Parallel Analyzer (SL1000, Hach, Loveland, CO, USA) and recorded before collecting the laboratory testing samples. The groundwater samples were considered to be representative when the variations of T, pH, EC, DO, and ORP in three successive samples were within ±1 • C, ±0.2, ±3%, ±10%, or ±0.2 mg/L, and ±20 mV, respectively. When sampling, 1 L of groundwater was collected in two 500 mL plastic buckets for inorganic analysis and 160 mL was collected in four 40 mL amber glass bottles with Teflon airtight caps for organic and CO 2 analysis. The samples were then stored on ice in an incubator before being transported to the laboratory. For the metagenomics analysis, 5 L of groundwater was collected in a sterilized 5 L plastic bucket and stored on ice in an incubator. The water samples were transported to the laboratory, and DNA was collected in five poly tetra fluoroethylene filter membranes with a pore size of 0.22 µm by air pump filtration in one day. The filter membranes were stored at −80 • C until DNA extraction.

DNA Isolation, Sequencing, and Library Construction
Total genomic DNA was extracted from nine collected samples using the E.Z.N.A. ® DNA Kit (Omega Bio-tek, Norcross, GA, USA) according to the manufacturer's instructions. Concentration and purity of extracted DNA were determined with TBS-380 and NanoDrop2000, respectively. DNA extract quality was checked on 1% agarose gel.

Bioinformatics Analysis
(1) Sequence quality control and genome assembly: Adapter sequences were stripped from the 3' and 5' ends of paired-end Illumina reads using SeqPrep (https://github.com/jstjohn/SeqPrep). Low-quality reads (length <50 bp or quality value <20 or having N bases) were removed by Sickle (https://github.com/najoshi/sickle). Metagenomics data were assembled using MEGAHIT (https://github.com/voutcn/megahit), which makes use of succinct de Bruijn graphs. Contigs with a length of 300 bp or more were selected for the final assembling result, and then the contigs were used for further gene prediction and annotation.
(2) Gene prediction, taxonomy, and functional annotation: Open reading frames (ORFs) from each assembled contig were predicted using MetaGene (http://metagene.cb.k.u-tokyo.ac.jp/). The predicted ORFs with a length of 100 bp or more were retrieved and translated into amino acid sequences using the National Center for Biotechnology Information (NCBI) translation table.
All predicted genes with a 95% sequence identity (90% coverage) were clustered using CD-HIT (http://www.bioinformatics.org/cd-hit/), and the longest sequences from each cluster were selected as representative sequences to construct a non-redundant gene catalog. Reads after quality control were mapped to the representative sequences with 95% identity using SOAPaligner (http://soap.genomics. org.cn/), and gene abundance in each sample was evaluated.
Representative sequences of the non-redundant gene catalog were aligned to the NCBI NR database with an e-value cutoff of 10 −5 using BLASTP (version 2.2.28+, http://blast.ncbi.nlm.nih.gov/Blast.cgi) for taxonomic annotation.
Functional read annotations for presented figures and tables were assigned using the Enzyme database of the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene expression pathways were explored/visualized using the KEGG mapper [16]. Parts per million (ppm), or reads annotated as enzymes in one million sequencing reads, was used to represent the relative abundance of genes encoding enzymes in samples [20].

Hydrochemical Characteristics of Groundwater
Contaminant concentrations, electron acceptors, metabolic byproducts, and other hydrochemical parameters of the collected samples from August 2018 are listed in Table 1.
All the contamination indices indicated that the aquifer was still contaminated by petroleum hydrocarbons. There was no sample free from contamination in the aquifer. The downstream wells, MW3, NA7, NA68, and MW6, had higher concentrations of contaminants. This indicates that contaminants migrated to the downstream aquifer with groundwater flow.  [3]. NO 3 − should have had similar distribution according to the natural attenuation theory [21], but it did not tally with the facts. There was more NO 3 − upstream and downstream but less in the contaminated source area. The most likely cause is that there was outside nitrogen input around NA68. The evidence is that there is a sewage treatment plant nearby and the ammonium concentration in NA68 is particularly high (300 mg·L −1 ). Even though there was nitrogen input, the concentration decreased compared with the upstream wells. It is suggested that there was also nitrate reduction biodegradation. Our previous study revealed that biodegradation in the aquifer in 2016 was mainly caused by nitrate and sulfate reduction [22]. It was also true in 2018, according to the variations of electron acceptor concentrations listed in Table 1. Dissolved inorganic carbon (DIC) concentration in wells around the contamination source zone (NA125, PM7, PM3, and MW3) was lower than that in upstream and downstream wells. It is generally acknowledged that heterotrophic microorganisms completely oxidize organic carbon (hydrocarbons) to inorganic carbon (CO 2 ), elevating DIC as a result [23]. Most studies have found that there are higher DIC concentrations in organically contaminated groundwater worldwide [24][25][26][27]. PM7, PM3, and MW3 had more hydrocarbons than upstream wells and were expected to have higher DIC concentrations, which seems to conflict with the statement above. The phenomenon of abnormally low DIC concentration in the petroleum-contaminated aquifer attracted our attention when analyzing the hydrochemical parameters of the groundwater collected in August 2016 at the same site. It is speculated that the phenomenon was caused by carbon fixation of autotrophic metabolism through the sequencing of 16S rRNA gene [28]. Other researchers have found that autotrophic microorganisms may exit in PHC groundwater [5,29]. Whether there are carbon fixation microorganisms in the aquifer should be revealed by metagenomics analysis. Although the lack of samples free from contamination in the aquifer make us lose the opportunity to assess the differences of metagenomes between contaminated and non-contaminated areas, the samples collected in different contaminated areas along the groundwater flow had different characteristics of contamination and hydrochemistry as analyzed above. Therefore, there would be some interesting findings about microorganism metabolisms in different groundwater flow-field locations through the analyses of samples we collected.

Sequencing Statistics
The Illumina HiSeq 4000 sequencing resulted in over 50 million sequence counts for all samples. More than 300,000 contigs were selected for gene prediction and annotation. According to the NCBI NR database, genes were annotated at the domain level, as shown in Table 2.  Table 2 shows that more than 97% of the sequences were annotated to bacteria in all samples. This suggests that the organisms in the aquifer were dominated by bacteria.

Carbon Fixation Metabolism
There are at least six carbon fixation pathways known to exist in autotrophic bacteria and archaea: The Calvin-Benson-Bassham (CBB) cycle, Wood-Ljungdahl (WL) pathway, dicarboxylatehydroxybutyrate (DH) cycle, 3-hydroxypropionate (HP) cycle, Arnon-Buchanan (AB) cycle, and hydroxypropionate-hydroxybutyrate (HH) cycle [30,31]. The carbon fixation enzymes in these pathways were gained from the map of "carbon fixation pathways in prokaryotes" in the KEGG PATHWAY Database. Since HP and HH share the same carbon fixation enzymes [31], these pathways were analyzed together. The science of DH partly involves carbon fixation enzymes of the AB cycle as well as enzymes of the HH cycle, and there are no enzymes that are unique to the DH pathway combined with the limited microorganisms in the DH pathway, to our knowledge [31]; the DH pathway was not discussed alone in the study.
The abundance of genes encoding these enzymes is listed in Supplementary Table S1. The total abundance of genes encoding transformation of CO 2 to other material in each pathway is shown in Figure 2. The AB (DH), HP/HH, and WL pathway genes were more abundant than the CBB pathway genes. It was an interesting result that the CBB pathway, always considered as the most important carbon fixation pathway and dominant in most ecosystems [12], had the least abundance among all the pathways. The lower abundance of CBB carbon fixation genes indicates that petroleum hydrocarbons may change the aquifer ecosystem and stimulate carbon fixation microorganisms in other pathways.

Carbon Fixation Metabolism
There are at least six carbon fixation pathways known to exist in autotrophic bacteria and archaea: The Calvin-Benson-Bassham (CBB) cycle, Wood-Ljungdahl (WL) pathway, dicarboxylatehydroxybutyrate (DH) cycle, 3-hydroxypropionate (HP) cycle, Arnon-Buchanan (AB) cycle, and hydroxypropionate-hydroxybutyrate (HH) cycle [30,31]. The carbon fixation enzymes in these pathways were gained from the map of "carbon fixation pathways in prokaryotes" in the KEGG PATHWAY Database. Since HP and HH share the same carbon fixation enzymes [31], these pathways were analyzed together. The science of DH partly involves carbon fixation enzymes of the AB cycle as well as enzymes of the HH cycle, and there are no enzymes that are unique to the DH pathway combined with the limited microorganisms in the DH pathway, to our knowledge [31]; the DH pathway was not discussed alone in the study The abundance of genes encoding these enzymes is listed in Supplementary Table S1. The total abundance of genes encoding transformation of CO2 to other material in each pathway is shown in Figure 2. The AB (DH), HP/HH, and WL pathway genes were more abundant than the CBB pathway genes. It was an interesting result that the CBB pathway, always considered as the most important carbon fixation pathway and dominant in most ecosystems [12], had the least abundance among all the pathways. The lower abundance of CBB carbon fixation genes indicates that petroleum hydrocarbons may change the aquifer ecosystem and stimulate carbon fixation microorganisms in other pathways. Another interesting finding is that the carbon fixation genes were more abundant in the contaminated source area, which was attributed to the higher abundance of carbon fixation genes in the WL and HP/HH pathways. The analysis of species structure (Supplementary Figure S1) showed that most of the dominant species involved in carbon fixation, such as Reyranella, Novosphingobium, Microbacterium, Sphingomonas, Acidovorax, and Bordetella, potentially have the ability to biodegrade hydrocarbons [32][33][34][35][36][37][38][39][40]. It is also reported that these bacteria can assimilate inorganic carbons and be considered as facultative autotrophic microorganisms [41,42]. It is reported that carbon fixation pathways, e.g., the AB, or HP pathway, may facilitate the assimilation of simple organic substances, e.g., acetate, succinate, or propionate, as these substances are intermediates of these pathways. Thus, carbon mixotrophs using such pathways have a competitive advantage over obligate autotrophs or  Another interesting finding is that the carbon fixation genes were more abundant in the contaminated source area, which was attributed to the higher abundance of carbon fixation genes in the WL and HP/HH pathways. The analysis of species structure (Supplementary Figure S1) showed that most of the dominant species involved in carbon fixation, such as Reyranella, Novosphingobium, Microbacterium, Sphingomonas, Acidovorax, and Bordetella, potentially have the ability to biodegrade hydrocarbons [32][33][34][35][36][37][38][39][40]. It is also reported that these bacteria can assimilate inorganic carbons and be considered as facultative autotrophic microorganisms [41,42]. It is reported that carbon fixation pathways, e.g., the AB, or HP pathway, may facilitate the assimilation of simple organic substances, e.g., acetate, succinate, or propionate, as these substances are intermediates of these pathways. Thus, carbon mixotrophs using such pathways have a competitive advantage over obligate autotrophs or heterotrophs [31]. Therefore, these facultative autotrophic microorganisms thrive in the aquifer, especially in the contaminated source zone.
Although the mechanism is still unclear and should be studied further, the result explains the phenomenon of low DIC concentrations in the contaminated aquifer. NA125, PM7, and PM3 wells were less contaminated, and they had lower DIC concentration. In particular, the NA125 well had the most abundant carbon fixation genes among upstream wells (MW14, MW4, NA125, and PM7) that had fewer contaminants, and therefore had the lowest DIC concentration (Table 1). MW3, NA7, and NA68, the most contaminated wells, had a higher abundance of carbon fixation genes, but with the DIC generated by heterotrophic metabolism, their DIC concentrations were slightly elevated. MW6 had fewer carbon fixation genes but more contaminants, which resulted in heterotrophic metabolism dominating the area and elevated DIC concentration.
The amount of genes encoding carbon fixation was more than 1500 ppm in all samples. Except for MW6, the abundance of carbon fixation genes was greater than that of genes involved in nitrate and sulfate reduction (Supplementary Figure S2), both of which are the primary biodegradation reactions in petroleum-contaminated groundwater [22,43]. Therefore, there would be a big carbon sink effect in the aquifer. The carbon fixation metabolism should be considered in future studies of petroleum-contaminated aquifers.

Methane Metabolism
Methane metabolism includes methanogenesis and methane oxidation. Methanogens can obtain energy for growth by converting a limited number of substrates to methane under anaerobic conditions. Three types of methanogenic pathways are known: CO 2 to methane, methanol to methane, and acetate to methane [44]. According to the KEGG pathway map, all the methanogenic pathways can generate the same intermediate product, methylcoenzyme M, which will be reduced to methane by methyl-coenzyme M reductase (Figure 3). The abundance of mer, cdh, and mta is~70,~20, and~3 parts per million (ppm), respectively. Therefore, the pathway of CO 2 to methylcoenzyme M is dominant. This process mainly occurs around the contaminated source area (PM3 to MW3). It may be because there was less dissolved oxygen (DO) in the area (Table 1), making it suitable for the reduction of CO 2 [45]. According to Figure 3, there was no detected mcr gene encoding methyl-coenzyme M reductase, which is associated with the final stage of methanogenesis. That may because there was no methanogenic redox environment or there were other adequate preferred electron acceptors for microorganisms (Table 1). Therefore, it is suggested that little methane would be generated from the aquifer and released to the atmosphere.
However, there was some methane monooxygenase gene, a key methane oxidation gene, in collected samples. It is suggested that there was methanogenesis in the aquifer, but the gene was undetectable. MW3 and NA7 had the lowest methane monooxygenase gene abundance because of the limited oxygen, while MW4 and MW14 had lower methane monooxygenase gene abundance because of the limited methane upstream.

Nitrogen Metabolism
There are three methods of nitrate reduction, according to the KEGG map: Denitrification, dissimilatory reduction, and assimilatory reduction (Figure 4).
In all three nitrate reduction methods, the nitrate should first be transformed into nitrite. The enzyme in this step is nitrate reductase. Genes encoding this enzyme are Nar, Nap, NR, and Nas ( Figure 4). The total abundance of these genes was about 350 ppm to 700 ppm in the groundwater. Generally speaking, there were more nitrate reductase genes in downstream than upstream groundwater. It is likely that there were higher concentrations of hydrocarbons. Nitrate reduction is always considered to be heterotrophic metabolism, which needs organic carbon as its carbon source and energy [46].  However, there was some methane monooxygenase gene, a key methane oxidation gene, in collected samples. It is suggested that there was methanogenesis in the aquifer, but the gene was undetectable. MW3 and NA7 had the lowest methane monooxygenase gene abundance because of the limited oxygen, while MW4 and MW14 had lower methane monooxygenase gene abundance because of the limited methane upstream.

Nitrogen Metabolism
There are three methods of nitrate reduction, according to the KEGG map: Denitrification, dissimilatory reduction, and assimilatory reduction ( Figure 4).   However, there was some methane monooxygenase gene, a key methane oxidation gene, in collected samples. It is suggested that there was methanogenesis in the aquifer, but the gene was undetectable. MW3 and NA7 had the lowest methane monooxygenase gene abundance because of the limited oxygen, while MW4 and MW14 had lower methane monooxygenase gene abundance because of the limited methane upstream.

Nitrogen Metabolism
There are three methods of nitrate reduction, according to the KEGG map: Denitrification, dissimilatory reduction, and assimilatory reduction (Figure 4). In subsequent steps of nitrite reduction, there are three steps in denitrification and only one step in dissimilatory and assimilatory reduction. In denitrification, nitrite is reduced to nitric oxide through nitrite reductase encoded by the NirK or NirS gene, then reduced to nitrous oxide through nitric oxide reductase encoded by the NorBC gene, and then reduced to nitrogen through nitrous oxide reductase encoded by the NosZ gene. The gene abundance at each step was~300 ppm,~200 ppm, and~90 ppm, respectively. NA68 had the highest gene abundance among all samples in these steps. That may be because there was the highest concentration of organics (COD: 482 mg/L), the lowest concentration of DO (1.02 mg/L), and adequate nitrate (33.21 mg/L) in the NA68 well. Denitrifying bacteria are always considered to be anaerobic heterotrophic microorganisms [47], and therefore suitable for growing in less DO and more organic conditions. There are more denitrification genes involved in the transformation of nitrite to reduced nitrogen upstream and downstream of the aquifer rather than the contamination source zone. This may be caused by less nitrate in the contamination source zone.
In the dissimilatory reduction process, nitrite is transformed into ammonium directly through nitrite reductase encoded by NirBD and NrfAH genes. The abundance of the two genes was about 100 ppm to 300 ppm, which was greater than that of the NirK (NirS) gene in denitrification except in the NA68 sample. There was more abundance in upstream and downstream samples. There was less of the assimilatory reduction gene, NirA, whose abundance was 0-50 ppm.
Based on the above results, it can be speculated that nitrite reduction was mainly caused by dissimilatory reduction and denitrification in the study aquifer. It is reported that denitrifying microorganisms are capable of oxidizing toluene, ethylbenzene, and m-, o-, p-xylenes, and there was substantial investigation into BTEX degradation under denitrification conditions in aquifers [48,49]. Denitrification is also commonly desired for in situ bioremediation of PHC-contaminated aquifers, therefore nitrate was injected into several PHC-contaminated aquifers to stimulate and enhance denitrification [43,50]. However, dissimilatory nitrate reduction was not regarded as an important process in many contaminated aquifers [49][50][51]. This did not agree with the findings in the study. Studies found that most obligately anaerobic nitrate-reducing bacteria perform dissimilatory nitrate reduction to ammonia (DNRA) rather than denitrification to nitrate [52], and the dissimilatory pathway is regulated by oxygen [53]. It is suggested that most of the nitrate-reducing bacteria in the aquifer are obligately anaerobic.
There were fewer nitrification genes (AmoABCD and Hao genes) in the aquifer for the limited DO in groundwater [54]. These nitrification genes were more abundant upstream near the contaminated source. That may be because there was some ammonium and relatively abundant dissolved oxygen. Ammonium may not be the limiting factor for nitrification in the aquifer, since there was more ammonium but fewer nitrification genes in NA68 and MW3.
NxrAB gene, encoding oxidization of nitrite to nitrate, was abundant in all samples. That is because the NxrAB gene shares the same enzyme with Nar. The enzyme is nitrite oxidoreductase, which can convert either nitrate to nitrite or nitrite to nitrate [55]. In anaerobic conditions, the enzyme may mostly play a reduction role.
There was also the nitrogen fixation gene (Nif DHK) in some wells with little abundance, except NA7 and MW6 wells located downstream, with the least dissolved oxygen, to which nitrogenase is sensitive [56].
To sum up, the dominant nitrogen metabolism in the aquifer is nitrate reduction, which is one of the most important ways to biodegrade petroleum contaminants. Denitrification and dissimilatory nitrate reduction may coexist in the aquifer, and dissimilatory reduction may play a dominant role. Dissimilatory nitrate reduction produces ammonium, which is the preferred nitrogen source for most microorganisms [57]. Therefore, this process can biodegrade hydrocarbon and supply the nitrogen source simultaneously.

Sulfur Metabolism
Sulfur metabolism, especially sulfate reduction, plays an important role in hydrocarbon biodegradation [58]. The process of sulfate reduction is always accompanied by biodegradation of hydrocarbons, because sulfate can supply the electron acceptor [21]. The varying abundance of genes relating to sulfate reduction metabolism can be observed between all samples in Figure 5.
There are four steps in assimilatory reduction. The genes encoding the enzymes that transform sulfate to (adenylyl sulfate) APS and then (3'-phosphoadenylyl sulfate) PAPS are Sat and Ask, whose total abundance was about 300 ppm to 500 ppm. The transformation of PAPS to sulfite genes involves Asr and Psr genes, which account for about 100 ppm to 200 ppm. The sulfite-to-sulfide transformation genes are Sir(N) and Sir(F), which account for 150 ppm to 400 ppm. Interestingly, the abundance of genes involved in each step of assimilatory reduction was less in contamination source area samples (PM7, PM7, and MW3) than other samples. The lower abundance of Sat and Ask genes in samples from the contamination source area may be caused by the coaction of redox conditions and contaminant concentrations. The Sat gene is also involved in sulfur oxidation ( Figure 5), which would be an aerobic or nitrate reduction process [59,60]. More Sat genes in upstream and downstream wells may be caused by higher DO and nitrate concentrations. Moreover, sulfur oxidation needs reduced sulfur compounds, which would be supplied by the reduction of sulfate in contaminated areas. Therefore, areas with higher DO, nitrate, and contaminant concentrations may have higher Sat gene abundance. Asr (Psr) and Sir genes may share the most abundant microorganism, Novosphingobium, and other microorganisms with the Sat gene (Supplementary Figure S3), which may result in similar distribution among Sat, Asr (Psr), and Sir genes. To sum up, the dominant nitrogen metabolism in the aquifer is nitrate reduction, which is one of the most important ways to biodegrade petroleum contaminants. Denitrification and dissimilatory nitrate reduction may coexist in the aquifer, and dissimilatory reduction may play a dominant role. Dissimilatory nitrate reduction produces ammonium, which is the preferred nitrogen source for most microorganisms [57]. Therefore, this process can biodegrade hydrocarbon and supply the nitrogen source simultaneously.

Sulfur Metabolism
Sulfur metabolism, especially sulfate reduction, plays an important role in hydrocarbon biodegradation [58]. The process of sulfate reduction is always accompanied by biodegradation of hydrocarbons, because sulfate can supply the electron acceptor [21]. The varying abundance of genes relating to sulfate reduction metabolism can be observed between all samples in Figure 5. There are four steps in assimilatory reduction. The genes encoding the enzymes that transform sulfate to (adenylyl sulfate) APS and then (3'-phosphoadenylyl sulfate) PAPS are Sat and Ask, whose total abundance was about 300 ppm to 500 ppm. The transformation of PAPS to sulfite genes involves Asr and Psr genes, which account for about 100 ppm to 200 ppm. The sulfite-to-sulfide transformation genes are Sir(N) and Sir(F), which account for 150 ppm to 400 ppm. Interestingly, the abundance of genes involved in each step of assimilatory reduction was less in contamination source area samples (PM7, PM7, and MW3) than other samples. The lower abundance of Sat and Ask genes in samples from the contamination source area may be caused by the coaction of redox conditions and contaminant concentrations. The Sat gene is also involved in sulfur oxidation ( Figure 5), which would be an aerobic or nitrate reduction process [59,60]. More Sat genes in upstream and downstream wells may be caused by higher DO and nitrate concentrations. Moreover, sulfur oxidation needs reduced sulfur compounds, which would be supplied by the reduction of sulfate in contaminated areas. Therefore, areas with higher DO, nitrate, and contaminant concentrations may have higher Sat gene abundance. Asr (Psr) and Sir genes may share the most abundant microorganism, Novosphingobium, and other microorganisms with the Sat gene (Supplementary Figure S3), which may result in similar distribution among Sat, Asr (Psr), and Sir genes.
There are three steps in dissimilatory reduction and oxidation. The Sat gene, encoding the first step reaction enzyme, is shared with assimilatory reduction. The second step (APS to sulfite) gene, Apr, and third step (sulfite to sulfide) gene, Dsr, had negligible abundance (~10 ppm) compared with assimilatory reduction. The abundance of Apr and Dsr genes was also less in contamination source There are three steps in dissimilatory reduction and oxidation. The Sat gene, encoding the first step reaction enzyme, is shared with assimilatory reduction. The second step (APS to sulfite) gene, Apr, and third step (sulfite to sulfide) gene, Dsr, had negligible abundance (~10 ppm) compared with assimilatory reduction. The abundance of Apr and Dsr genes was also less in contamination source area samples. Sulfur oxidation microorganisms, such as Thiomargarita and Thiodictyon, and sulfur-reduction microorganisms, such as Desulfovibrio and Desulfobulbus, coexist in the aquifer (Supplementary Figure S4). The lower concentrations of DO, nitrate, and contaminants restricted the growth of this kind of microorganism.
Generally speaking, assimilatory reduction genes in the aquifer were more abundant than dissimilatory reduction and oxidation genes. This finding suggests that the biodegradation of petroleum by sulfate reduction is mainly caused by assimilatory reduction in the study aquifer. Assimilatory reduction is the process by which microorganisms take up sulfate to build blocks for peptides and proteins [61]. Therefore, sulfate in the aquifer plays a dual role of electron acceptor and sulfur source.

Other Biodegradation Related Metabolism
The components of petroleum contaminants are complex and hard to pick out one by one. Different components may have different metabolic pathways, and there are dozens of enzymes involved in the biodegradation of one component. Moreover, a particular enzyme could be involved in many pathways. Therefore, it is hard to identify the genes involved in the biodegradation of every petroleum contaminant, and all the genes in the xenobiotics biodegradation and metabolism level of the KEGG database were considered as a whole to identify the potential biodegradation ability of petroleum contaminants in the aquifer ( Figure 6). There was a higher abundance of genes, greater than 10,000 ppm, in PM3, MW3, NA7, NA68, and MW6 samples, corresponding to the higher concentrations of contaminants (Table 1).  1.1.1). These are always considered as essential enzymes related to the degradation of benzoate, nitrotoluene, fatty acid, ketone bodies, and some amino acids [62][63][64][65]. The higher abundance of xenobiotic biodegradation genes may indicate higher biodegradation ability in petroleum-contaminated aquifers. In source and downstream areas, there were more contaminants, which may provide the carbon and energy source for the metabolism of biodegradation microorganisms. It was also consistent with the genes related to sulfate and nitrate reduction according to the analysis above. The presence of the alkane 1-monooxygenase gene (AlkB, EC 1.14.15.3) and catechol dioxygenase genes (Cdo, EC 1.13.11.1 and EC1.13.11.2) implies the degradation of medium-chain (C5-C22) alkanes and aromatic hydrocarbons, respectively, using oxygen as an electron acceptor [66,67] 1.1.1). These are always considered as essential enzymes related to the degradation of benzoate, nitrotoluene, fatty acid, ketone bodies, and some amino acids [62][63][64][65]. The higher abundance of xenobiotic biodegradation genes may indicate higher biodegradation ability in petroleum-contaminated aquifers. In source and downstream areas, there were more contaminants, which may provide the carbon and energy source for the metabolism of biodegradation microorganisms. It was also consistent with the genes related to sulfate and nitrate reduction according to the analysis above.
There were more AlkB genes and fewer Cdo genes in upstream samples (MW4, MW14, and NA125), and more Cdo genes and fewer AlkB genes in the contamination source samples (MW3, PM3, and NA7). The downstream samples (NA68 and MW6) had a medium abundance of AlkB and Cdo genes, while the PM7 sample had the least abundance of both AlkB and Cdo genes. That may be because there was less contamination and relatively adequate oxygen upstream of the aquifer, and Cdo has higher oxygen capture capacity and is more sensitive to the limited oxygen than Alk B. Although there was more contaminant in NA7, NA68, and MW6, the amplification of Cdo gene was restrained by the limited oxygen.
Although the aquifer was in an almost anaerobic environment, there were fewer anaerobic biodegradation genes. There were no BssA genes and fewer AssA genes compared with AlkB and Cdo genes in all samples (Figure 7). The most likely reason is that there is less knowledge about the hydrocarbon anaerobic biodegradation pathway [69]. The genes selected in this study may not be the main anaerobic biodegradation genes. Another probable reason is that anaerobic biodegradation is slower than aerobic biodegradation [70]. There were more AssA genes in MW6, NA7, and PM7 for the less dissolved oxygen in these samples.  [13,68]. Figure 7 shows the distribution of these genes in the aquifer. There were more AlkB genes and fewer Cdo genes in upstream samples (MW4, MW14, and NA125), and more Cdo genes and fewer AlkB genes in the contamination source samples (MW3, PM3, and NA7). The downstream samples (NA68 and MW6) had a medium abundance of AlkB and Cdo genes, while the PM7 sample had the least abundance of both AlkB and Cdo genes. That may be because there was less contamination and relatively adequate oxygen upstream of the aquifer, and Cdo has higher oxygen capture capacity and is more sensitive to the limited oxygen than Alk B. Although there was more contaminant in NA7, NA68, and MW6, the amplification of Cdo gene was restrained by the limited oxygen.
Although the aquifer was in an almost anaerobic environment, there were fewer anaerobic biodegradation genes. There were no BssA genes and fewer AssA genes compared with AlkB and Cdo genes in all samples (Figure 7). The most likely reason is that there is less knowledge about the hydrocarbon anaerobic biodegradation pathway [69]. The genes selected in this study may not be the main anaerobic biodegradation genes. Another probable reason is that anaerobic biodegradation is slower than aerobic biodegradation [70]. There were more AssA genes in MW6, NA7, and PM7 for the less dissolved oxygen in these samples.

Possible Mechanism of Biodegradation Related Metabolism
The samples collected in different contaminated areas along the groundwater flow revealed some interesting information about microorganism metabolisms. For examples, in contaminated source zone, there were more abundant genes related with carbon fixation, xenobiotic biodegradation, and catechol dioxygenase, but less alkane 1-monooxygenase gene and sulfate reduction genes.
Based on the information revealed above, possible mechanisms of biodegradation related metabolism could be deduced as follows: Petroleum hydrocarbons stimulate hydrocarbon biodegradation microorganisms, which take nitrate and sulfate as both electrical acceptors and nutrient elements. During biodegradation, some biodegradation microorganisms, considered to be facultative autotrophic bacteria, can use inorganic carbon and intermediate degradation products of contaminant components as their carbon source. The metabolism can fix the inorganic carbon and result in lower DIC concentration in the contaminated source zone. These processes could not generate methane, since the easily metabolized electrical acceptors, nitrate and sulfate, were adequate and there was no methanogenesis condition in the aquifer. This deductive mechanism is based on DNA information, which is a good indication to know the situation in an aquifer. To gain

Possible Mechanism of Biodegradation Related Metabolism
The samples collected in different contaminated areas along the groundwater flow revealed some interesting information about microorganism metabolisms. For examples, in contaminated source zone, there were more abundant genes related with carbon fixation, xenobiotic biodegradation, and catechol dioxygenase, but less alkane 1-monooxygenase gene and sulfate reduction genes.
Based on the information revealed above, possible mechanisms of biodegradation related metabolism could be deduced as follows: Petroleum hydrocarbons stimulate hydrocarbon biodegradation microorganisms, which take nitrate and sulfate as both electrical acceptors and nutrient elements. During biodegradation, some biodegradation microorganisms, considered to be facultative autotrophic bacteria, can use inorganic carbon and intermediate degradation products of contaminant components as their carbon source. The metabolism can fix the inorganic carbon and result in lower DIC concentration in the contaminated source zone. These processes could not generate methane, since the easily metabolized electrical acceptors, nitrate and sulfate, were adequate and there was no methanogenesis condition in the aquifer. This deductive mechanism is based on DNA information, which is a good indication to know the situation in an aquifer. To gain more information on the real activity of genes and the reactions acting in this context, an analysis of mRNA should be carried out in our future study.

Conclusions
In this study, high-throughput sequencing of environmental DNA was used to gain insight into biodegradation related metabolism in a petroleum-contaminated aquifer. The hydrochemical results suggest that a mass of petroleum hydrocarbons was biodegraded by microorganisms through the metabolism of nitrate and sulfate reduction. Metagenomics analysis proved the speculation and gave more detail information: Nitrate and sulfate reduction were dominated by dissimilatory nitrate and assimilatory sulfate reduction, respectively. This indicates that nitrate and sulfate may serve not only as electron acceptors but also as sources for microorganism metabolism. Although the aquifer was in an anaerobic environment, it had a considerable abundance of aerobic biodegradation genes. Accompanied by biodegradation of hydrocarbons, carbon fixation genes thrived, which probably caused the phenomenon of lower dissolved inorganic carbon concentration in the contaminated source area. Interestingly, no methanogenesis genes were detected in the aquifer, indicating that there was less methane effusing into the atmosphere. Considering the great amount of carbon fixation genes in the aquifer, it can be deduced that there would be a carbon sink effect. These interesting findings motivate us to study the mechanism further. Supplementary Materials: The following are available online at http://www.mdpi.com/2076-2607/7/10/412/s1, Figure S1: Gene abundances involved in carbon fixation, nitrate reduction and sulfate reduction, Figure S2: Microorganisms involved in carbon fixation (in Genes level), Figure S3: Microorganisms at genes level in each step of assimilatory reduction and oxidation, Figure S4: Microorganisms at genes level in each step of dissimilatory reduction and oxidation, Table S1: Abundances (ppm) of genes encoding enzymes involved in carbon fixation.