Understanding the Implications of Predicted Function for Assessment of Rapid Bioremediation in a Farmland ‐ Oilfield Mixed Area

: Farmland ‐ oilfield mixed areas are fragile ecosystems that require dynamic remediation to counteract the undesirable impact of energy development. Practicable assessment methods are piv ‐ otal to a fast and accurate evaluation of the in situ bioremediation process. Petroleum pollutants impose component ‐ dependent effects on autochthonous microbiota before and after remediation. Here, the predicted functional response of soil microbiomes to petroleum pollutants was analyzed in a historically polluted farmland ‐ oilfield mixed area from the perspective of developing a set of feasible biomarkers for immediate post ‐ bioremediation evaluation. An array of microbial, genetic, systematic, and phenotypic biomarkers was proposed. Our results showed that the biomarkers could proxy the stage of the bioremediation multidimensionally. We argue that functional diversity should be considered together with microbial community dynamic to evaluate the restoration status of the microbial communities in petroleum ‐ contaminated farmland ‐ oilfield mixed environments. methodology, Z.B. and H.W.; software, H.W.; formal analysis, H.W. and T.B.; investigation, H.W.; resources, Z.B. and X.Z.; writing—origi ‐ nal draft preparation, H.W.; writing—review and editing, T.B., Z.B., S.W. and H.W.; visualization, H.W. and T.B.; project administration, Z.B. and Y.Z.; funding acquisition, Z.B. and X.Z.


Introduction
Petroleum is a persistent contaminant in the environment [1], especially in areas combining agriculture and oil infrastructure [2]. Biodegradation by exogenous or indigenous microorganisms is a green venue to attenuate petroleum contamination [3][4][5][6]. Mycobacterium [7], Alcanivorax [8], Acinetobacter [9], Porticoccaceae [10], Bacillus [11], and Rhodobacteraceae [12] are the genera of the well-studied petroleum-degrading microorganisms. Petroleum degrading bacteria are ubiquitous and exist as indigenous microbiota in pristine environments. As a unique carbon and energy source, spilled petroleum can summon and stimulate degraders to form a dynamic community in the contaminated environment [13][14][15]. In addition, the high content of petroleum stresses the environment and biota, negatively impacting the viability of indigenous species in an acute pollution event [16]. Therefore, the property of the constantly changing and developing microbial community such as phylogenetic and functional diversity may be able to proxy the environmental sustainability before and after remediation [17].
Petroleum pollutants, chemically known as total petroleum hydrocarbons (TPHs), basically include compounds of two categories, namely, easily degradable chemicals such as petroleum hydrocarbons (PHs) and long-lasting chemicals such as polycyclic aromatic hydrocarbons (PAHs). Taking into account the complex petroleum composition, the pollutant-dependent response of microbial communities such as structure alteration, enzyme deactivation, and soil respiratory inhibition has been studied globally in terrestrial and aquatic environments [18][19][20][21][22][23]. There is an increasing recognition that functional alteration at the community level, rather than taxonomy, is more responsive to environmental stress [24]. For instance, a previous study showed that functional traits responded more to environmental changes, while stochasticity essentially determined microbial community structures [25]. In addition, Mendes et al. showed that the microbial phylogenetic diversity and functions were different within an environmental gradient. The functional diversity was apt to the soil properties [26]. Furthermore, studies evidenced the convergence of functional genes from broad microbial taxa [27]. In this context, it appears that even unrelated microbial taxa can carry out similar functions in distantly related environments, indicating that the functional capability of a community is an important consideration rather than its composition alone. Consequently, since petroleum pollutants impose component-dependent effects on autochthonous microbiota before and after remediation, the functional alteration of soil microbiomes against different components of petroleum may be able to provide a feasible approach to the evaluation of the extent to which the contamination has been reduced.
Two strategies, direct metagenomic analysis and taxonomic profiling plus functional prediction (TPFP), are commonly used to unveil the function of a microbiome [28]. Compared to amplicon-based methods, metagenomic sequencing is much more expensive and computationally intensive. As an economical and practicable alternate, the functional insight over the phylogenetic data can predict the functional outline of a microbiome under different conditions, providing a relatively explicit perspective of the sorts of processes undertaken by microbes. Yet, few studies have tackled its application on the assessment of bioremediation in soils contaminated by petroleum.
Here, soils before and after bioremediation from long-term petroleum contamination in a historically polluted farmland-oilfield mixed area was analyzed through highthroughput sequencing and phylogenetic investigation of communities by reconstruction of unobserved states (PICRUSt), functional annotation of prokaryotic taxa (FAPROTAX), and BugBase analysis. The degradation pathways of the easily degradable petroleum hydrocarbons (PHs) and permanent polycyclic aromatic hydrocarbons (PAHs) derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database were investigated to evaluate functional changes before and after bioremediation. The results showed that different components of petroleum pollutants as well as different remediation phases resulted in different functional alterations of the soil microbiomes. A number of microbial, genetic, systematic, and phenotypic biomarkers related to the degradation pathways were identified. Therefore, we suggest that predicted functional diversity could be advanced as a promising tool to evaluate the status of microbial community restoration along with community dynamic in petroleum-contaminated farmland-oilfield mixed areas.

Test Field and Bioremediation
The Dagang oilfield lies near the Bohai Sea, China. The district is characterized by wetlands, rivers, farmlands, and oil infrastructure. The test field was located in Dagang oilfield adjacent to areas of crop and livestock farming (38°69′20″ N, 117°48′03″ E). The surface soil (0-20 cm depth) of the test field was alkaline (pH 8.86) and has been historically polluted by petroleum (the detailed physicochemical properties of the soil are shown in Table S1). A rapid bioremediation on the test field was conducted for 4 months by using a well-established physicochemical approach which included soil ventilation and organic fertilizer amendment (50% straw and 50% livestock manure) [29]. The organic fertilizer application rate was 900 kg ha −1 . The in situ soil petroleum concentration in the test field before remediation was approximately 18 g/kg and 161 mg/kg after.

Sample Collection
Two batches of soil samples (0-20 cm depth) were collected before (petroleum-contaminated soil, PCS) and after (petroleum remediation soil, PRS) the bioremediation from the Dagang oilfield in Tianjin, respectively. The soil samples, one kilogram per batch, were collected from every 5 m interval of an experimental plot (50 × 10 m) in the test field using sterilized 50 mL falcon tubes (Corning, Corning City, NY, USA). We obtained 20 samples for each test group and mixed thoroughly for further analysis. Then, 500 g of the soil samples per batch were sieved (mesh size = 2 mm) and stored at 4 °C for physicochemical analysis. The rest was stored at −80 °C and freeze-dried for environmental DNA extraction.

Physicochemical Parameters
Nitrogen index (the contents of total nitrogen, ammonia nitrogen, and hydrolysable nitrogen), pH, organic matter, salinity, phosphorus index (the contents of total phosphorus and available phosphorus), potassium index (the contents of total potassium and available potassium), and total petroleum hydrocarbons (TPH) were analyzed and quantified according to standard operation protocols [30][31][32].

DNA Extraction and PCR Amplification
Soil genomic DNA was extracted from 0.5 g of each soil sample using the Fast DNA SPIN Kit for Soil (MP Biomedicals, Santa Ana, CA, USA) according to the manufacturer's instruction. The hypervariable region V3-V4 of the bacterial 16S rRNA gene was amplified with the primer pair 338F (5′-ACTCCTACGGGAGGCAGCAG-3′) and 806R (5′-GGAC-TACHVGGGTWTCTAAT-3′) [29] by an ABI Gene Amp ® 9700 PCR thermocycler (ABI, Carlsbad, CA, USA). The PCR amplification of 16S rRNA gene was performed as follows: initial denaturation at 95 °C for 3 min, followed by 30 cycles of denaturing at 95 °C for 30 s, annealing at 55 °C for 30 s and extension at 72 °C for 45 s, and single extension at 72 °C for 10 min, and end at 10 °C. The PCR mixtures contain 5 × Trans Start Fast buffer 4 μL, 2.5 mM dNTPs 2 μL, the forward primer (5 μM) 0.8 μL, the reverse primer (5 μM) 0.8 μL, Trans Start Fast DNA Polymerase 0.4 μL, template DNA 10 ng, and finally ddH2O up to 20 μL. PCR reactions were performed in triplicate. The PCR product was extracted from 2% agarose gel and purified using the AxyPrep DNA Gel Extraction Kit (Axygen Biosciences, Union City, CA, USA) according to the manufacturer provided instructions and quantified using a Quantus™ Fluorometer (Promega, Madison, WI, USA).

High-Throughput Sequencing and Bioinformatic Analysis
Purified amplicons of each sample were pooled in equimolar and subjected to pairend sequencing on an Illumina MiSeq PE300 platform (Illumina, San Diego, CA, USA) by Major Bio-Pharm Technology Co., Ltd. (Shanghai, China). After initial filtering, operational taxonomic units (OTUs) with 97% similarity cutoff were clustered using UPARSE version 7.1. Chimeric sequences were identified and removed [33]. The representative sequence of each OTU was analyzed by RDP Classifier version 2.2 against the 16S rRNA genes in the NCBI database using a confidence threshold of 0.7. Linear discriminant analysis coupled with effect size (LEfSe) [34] was used to explore functional biomarkers for each gene category through the Galaxy website (http://huttenhower.sph.harvard.edu/galaxy/, accessed on 11 April 2021); a linear discriminant analysis (LDA) threshold (>2.0, 0.05 alpha value) for the factorial Kruskal-Wallis test and the all-against-all multi-class analysis strategy was used to detect features with significantly different abundances between assigned classes. Alpha diversity was analyzed using the QIIME1.9.1 software package in the online Majorbio Cloud Platform (https://cloud.majorbio.com/, accessed on 23 May 2021). Phylogenetic investigation of communities by reconstruction of unobserved states (PICRUSt) was used to predict the shifted ecological functions of each microbial community (Lee et al., 2019). PHs and PAHs degradation genes based on KEGG pathways (fatty acid degradation (KO00071) and polycyclic aromatic hydrocarbon degradation (KO00624)) were summarized [35]. OTUs that contributed to specific genes involved in PHs and PAHs metabolism were identified. FAPROTAX and BugBase analysis were applied to predict the ecological functions regarding C, N, and S cycling and phenotypes of the microbial communities at the online Majorbio Cloud Platform (https://cloud.majorbio.com/, accessed on 21 October 2021).

Quantification of Functional Genes
The quantification of the abundance LadA (K20938) and nidA (K11943) in the soil samples before and after bioremediation were performed using the CFX Connect™ Real-Time PCR Detection System (Bio-Rad Laboratories, Hercules, CA, USA). Details about primers and reaction parameters for genes LadA and nidA are shown in Supplementary Table S2. PCR amplification was carried out in 25.0 μL reaction systems, containing 12.5 μL of SYBR ® Premix Ex Taq (Takara Biotech, Beijing, China), 1 μM of each primer (20 mM), 1.0 μL of the DNA template, and 9.5 μL of ddH2O. Standard curves were obtained using serial dilution plasmids containing the target genes. Each DNA sample was detected in triplicate and each quantitative PCR test had three no DNA template controls.

The Profile of the Soil Microbial Community before and after Bioremediation
Amplicon analysis and quality filtering by the MiSeq platform resulted in 50,305 and 50,427 high qualities partial 16S rRNA gene sequences for soil samples before and after bioremediation, respectively. The bacterial species richness of each sample determined through rarefaction analysis exhibited a flat and saturated trend ( Figure S1). The taxonomic patterns of the bacterial community in the soil samples showed noticeable alteration ( Figure 1). The Shannon index was 5.09 and 5.80 for soils before and after bioremediation, respectively. The Chao 1 index was 1079.9 before bioremediation and 1742.7 after. At the phylum level, in both soil samples before and after remediation, Actinobacteria accounted for a considerable proportion (>27%), whereas the relative abundance of Proteobacteria decreased slightly after bioremediation. Similarly, the relative abundance of Gemmatimonadota (6.7% to 3.3%) and Patescibacteria (1.2% to 0.5%) both decreased after bioremediation. By contrast, the proportions of Chloroflexi (5.5% to 11.2%), Acidobacteriota (3.4% to 4.4%), and Desulfobacterota (0.3% to 1.7%) were increased after the remediation process. At the order level, the most abundant bacteria in PCS were Corynebacteriales (8.75%, changed into 5.38% after bioremediation), Microtrichales (7.82%, changed into 3.96% after bioremediation), Actinomarinales (7.08%, changed into 7.35% after bioremediation), Kiloniellales (5.00%, changed into 3.82% after bioremediation), and Oceanospirillales (4.98%, changed into 5.85% after bioremediation). The relative abundance of Flavobacteriales and Micrococcales drastically increased by 114.38% and 92.11% after remediation, respectively.
Various known PHs and PAHs degraders were presented in the soil samples before and after remediation ( Table 1). The relative abundance of known PHs and PAHs degraders was higher in the soil before remediation, which accounted for 25.53% of the entire bacterial community. Mycobacterium, Alcanivorax, Porticoccaceae, Rhodobacteraceae, Gammaproteobacteria, Woeseia, Microbacteriaceae, Flavobacteriaceae, Marinobacter, and Burkholderiales were the top 10 abundant PHs and PAHs degrader guilds in the soils before and after the remediation. Mycobacterium and Alcanivorax together accounted for 12.9% and 10.24% of sequences in the soils before and after the remediation. After the rapid bioremediation, the abundance of Bacillus, Burkholderiales, Marinobacter, Acinetobacter, and Sphingomonas increased markedly. The change rates were 600.00%, 417.50%, 388.64%, 100.00%, and 73.68%, respectively. By contrast, the abundance of Alteromonadaceae, Rhodococcus, Rhodobacteraceae, Pseudomonas, and Microbacteriaceae decreased. The change rates were −100.00%, −84.21%, −80.70%, −78.95%, and −60.29%, respectively. In particular, the bioremediation stimulated degraders in the phylum of Firmicute but depressed degraders in the phylum of Actinobacterota. The taxa listed in the above figures were with a mean relative abundance >1%. Those with a mean relative abundance <1% were grouped as 'others'.

Functional Profiling of the Soil Microbiome before and after Bioremediation
Functional diversity of soil samples before and after remediation were investigated using PICRUST. A total of 156 million genes corresponding to KEGG pathway level 1 (metabolism, genetic information processing, environmental information processing, cellular processes, human diseases, and organismal systems) were identified according to the profiled soil microbiomes, of which about 109 million genes were involved in metabolism, accounting for about 70% (Figure 2A). Among the various metabolic pathways, approximately five million genes related to xenobiotics biodegradation were identified ( Figure 2B). It was obvious that the numbers of the identified functional genes declined, and the organization of functions changed remarkably after the bioremediation. As a consequence of bioremediation, the dominant genes were involved in DNA replication and repair, protein translation, membrane transport, and signal transduction, while metabolism of organic matter dominated in petroleum-contaminated soil ( Figure 2C). Functional profiling in terms of element cycling before and after bioremediation was compared through FAPROTAX analysis (Figure 3). The relative abundance of functions associated with C, N, and S cycles enhanced slightly after biodegradation. This change was mainly contributed by the alteration of C and S cycling genes. C cycling in the bioremediation process was driven by chemoheterotrophy, aerobic chemoheterotrophy, and hydrocarbon degradation ( Figure 3A). Intriguingly, hydrocarbon degradation and aromatic compound degradation remained at a high level after bioremediation. With the progress of bioremediation, N fixation, and nitrate reduction increased ( Figure 3B), while nitrate respiration, nitrogen respiration, nitrate denitrification, and nitrite denitrification diverted. Bacterial functions in S cycling were mainly composed of sulfur respiration (Figure 3C), and abnormally expressed after the rapid bioremediation. Soil microbial phenotypes such as biofilm-forming and oxygen utilization were investigated through Bugbase analysis (Figure 4). The relative abundance of aerobic and biofilm-forming microorganisms both decreased after bioremediation. Although, the main constituent species in these two phenotypes were highly consistent before and after remediation.
(A) (B) Figure 4. The phenotypic contribution map of the major orders composition on a particular phenotype including biofilm-forming (A) and oxygen utilization (B). The Y axis denoted the relative abundance of different species to the phenotype.

Predicted Functional Changes before and after Bioremediation in Relation to PHs and PAHs
Various enzymes were involved in PHs and PAHs degradation. In both soil microbiomes prior to and after bioremediation, a total of 10 and 26 KEGG Orthology (KO) were detected in fatty acid degradation (KO00071) and polycyclic aromatic hydrocarbon degradation (KO00624), respectively ( Table 2). Long-chain acyl-CoA synthetase, aldehyde dehydrogenase (NAD+), S-(hydroxymethyl) glutathione dehydrogenase/alcohol dehydrogenase, alcohol dehydrogenase, and alkane 1-monooxygenase were the predicted most abundant enzymes for PHs degradation before and after bioremediation ( Figure 5A). For PAHs related metabolism, although the dominant functions were unchanged, the percentage of an array of enzymes increased noticeably after the bioremediation. These enzymes were 3-oxoadipate CoA-transferase, 4-hydroxy-4-methyl-2-oxoglutarate aldolase, 4-oxalomesaconate hydratase, PAH dioxygenase small subunit, extradiol dioxygenase, and 3,4-dihydroxyphthalate decarboxylase ( Figure 5B). In addition, PHs and PAHs degradation processes in PCS and PRS were compared ( Figure 6). Four steps, including oxygen incorporation, dehydrogenation, aldehyde dehydrogenation, and acetyl-CoA formation, were involved in the metabolism of a typical component of PHs such as alkane. Surprisingly, long-chain alkane monooxygenase involved in PHs degradation in PCS was much lower than that in PRS, which was consistent with the result of gene quantification in Figure S2. The component-dependent effects of PAHs on microbial degradation were illustrated ( Figure 6B). The orthologous genes related to phenanthrene degradation were predicted to be more active in PRS. For fluorene and anthracene degradation, however, the predicted expression of genes encoding components of the naphthalene dioxygenase (NDO) multicomponent enzyme system showed divergence in PCS and PRS.

Discussion
The potential and feasibility of TPFP were investigated on immediate post-bioremediation evaluation of long-term petroleum contamination in a historically polluted farmland-oilfield mixed area. It has been reported that the effect of the high content of petroleum on the diversity of autochthonous microbiomes is significant [45]. Various factors, including in situ geology, contaminated stage, and even the local climate, contribute together to the distribution and degradation of petroleum pollutants [46]. After acclimated, indigenous microorganisms can efficiently utilize the easily degradable part of the spilled crude oil as a perpetual carbon and energy source. Therefore, on the one hand, the diversity of the autochthonous microbiomes is normally high in historically petroleum polluted sites [47]. On the other hand, PAHs rather than PHs may accumulate in polluted soils for the low degradation rate [48]. Since the content of soil PAHs is likely in line with the abundance of PAHs degraders or PAHs-specific gene markers [13,14], the abundance of the microbes and genes might be one of the meaningful probes to evaluate whether long-term soil petroleum contamination has been successfully remedied. Indeed, ring-hydroxylating-dioxygenase alpha subunit (RHDα) genes including nagAc, phnAc, nahAc, nidA, and pdoA have been applied to quantify PAHs degraders in contaminated surface soils [49]. The bioremediation approach we applied increased the soil bacterial diversity and richness in four months. On the contrary, Mycobacterium, the dominant indigenous degrader of PAHs and PHs, decreased in abundance after the rapid bioremediation which may follow the residual content of PAHs (Table 1), indicating the remediation was, at least partially, accomplished. This result is consistent with the observation that the process augmented the Firmicutes to Bacteroidetes ratio (F:B ratio) of petroleum degraders (Table 1). Previous studies revealed that a high F:B ratio is associated with increased systematic energy uptake efficiency [50]. Therefore, the raised F:B ratio after the bioremediation may suggest the establishment of a highly efficient petroleum-utilizing microbial community. Furthermore, the leading functional genes of the microbiome, as a result of bioremediation, comprised DNA replication and repair, protein synthesis, membrane transport, and signal transduction, while organic matter metabolism dominated in the microbiome of the petroleum-contaminated soil ( Figure 2C). As functional changes at the community level respond better to environmental stress than taxonomic changes, the result signaled promisingly that most petroleum in the soil has been degraded. Therefore, the organization of main functions of the microbiome could be an efficient proxy to probe the completion of bioremediation in petroleum-contaminated environments. Concomitantly, the relative abundance of genes associated with sulfur respiration increased immediately after the rapid bioremediation ( Figure 3). In general, the organosulfur compounds including aromatic heterocycles constitute up to 70% of the total sulfur content in crude oil [51]. Since microbial sulfur respiration preferentially uses sulfates as well as other oxygenated sulfur compounds, elemental sulfur, and polysulfides as the electron acceptors [52], it inferred that the organosulfur compounds in the crude oil of our study have been thoroughly degraded. In addition, the extent of microbial phenotypes such as biofilm-forming and oxygen utilization declined after the rapid bioremediation (Figure 4), indicating the soil petroleum content was decreasing, as crude oil promotes biochemical oxygen demand and microbial aggregate formation [1,53,54].
In total, 36 predicted enzymes functioned on PHs and PAHs degradation in both soil microbiomes prior to and after bioremediation ( Table 2), suggesting (1) the bioremediation method we applied did not lead to involvement of additional degradation pathways; (2) there might be an internal configuration among the 36 enzymes depending on the content changes of PHs and PAHs; (3) abnormally expressed genes could be biomarkers to indicate the stage of petroleum remediation. Indeed, after the bioremediation, the evenness of functional genes related to PAHs metabolism was altered. In particular, the expression of enzymes related to energy generation and C-O cleavage such as 3-oxoadipate CoA-transferase, 4-hydroxy-4-methyl-2-oxoglutarate aldolase, and 4-oxalomesaconate hydratase enhanced markedly. These results are consistent with the observation that the post-remediation microbiome was more energy efficient than that of the soil microbiome before remediation. These abnormally expressed genes, therefore, could be potential biomarkers to probe the stage and completion of the bioremediation. In addition, the gene activity of orthologous enzymes relating to the metabolism of phenanthrene, a major constituent of petroleum [55], was predictably more active following bioremediation ( Figure  6B). These genes, therefore, could also be considered as potential biomarkers for evaluating the completion of the bioremediation on petroleum contamination.
Collectively, various biomarkers including microbial abundance, genes, patterns, and phenotypes were identified through TPFP for completion assessment of the bioremediation in a historically polluted farmland-oilfield mixed area. These biomarkers included but were not limited to the abundance of PAHs degraders (as microbial markers), the genes of sulfur respiration, energy generation, and phenanthrene degradation (as genetic markers), the F:B ratio and the organization of functions of microbiomes (as systematic markers), and biofilm-forming and oxygen utilization (as phenotypic markers). Therefore, we propose that predicted functional diversity along with microbial community dynamic can be applied as a powerful tool for probing bioremediation completion and microbial restoration in long-term petroleum-contaminated farmland-oilfield mixed areas.

Conclusions
In this report, we investigate the recovery of petroleum-contaminated soils after bioremediation, including the fate of petroleum contamination, the evenness, and abundance of bacterial communities. The dynamic structure of the microbial community was determined, the functional biomarkers of the soil microbial community were analyzed, and the contribution rates of the microbial community to petroleum hydrocarbon degradation in different periods were calculated. Therefore, functional diversity plays an important role in evaluating microbial recovery. In many studies evaluating the restoration of microbial ecosystem diversity, the microbial community structure and diversity indexes, such as abundance and evenness, are used as the evaluation scale. However, functional changes at the community level respond better to environmental stress than taxonomic changes, so the organization of main functions of the microbiome could be an efficient proxy to probe the completion of a bioremediation in petroleum-contaminated environments. The fate of pollutants, microbial community dynamics, and functional diversity can be combined to accurately evaluate the restoration of microbial communities in an environmental ecosystem.
Supplementary Materials: The following are available online at www.mdpi.com/article/10.3390/su14042248/s1, Figure S1. Rarefaction curves of the two soil samples. Figure S2. Abundance of ladA (A), nidA genes (B) in the process associated metabolism of PHs and PAHs, respectively. Table S1: Physicochemical parameters of the soil samples before and after bioremediation. Table S2: Primer pairs and PCR conditions used for qPCR.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in the study were deposited in the NCBI sequence read archive (SRA) repository under the accession number (PRJNA764367).