Metagenomic Analysis for Evaluating Change in Bacterial Diversity in TPH-Contaminated Soil after Soil Remediation

Soil washing and landfarming processes are widely used to remediate total petroleum hydrocarbon (TPH)-contaminated soil, but the impact of these processes on soil bacteria is not well understood. Four different states of soil (uncontaminated soil (control), TPH-contaminated soil (CS), after soil washing (SW), and landfarming (LF)) were collected from a soil remediation facility to investigate the impact of TPH and soil remediation processes on soil bacterial populations by metagenomic analysis. Results showed that TPH contamination reduced the operational taxonomic unit (OTU) number and alpha diversity of soil bacteria. Compared to SW and LF remediation techniques, LF increased more bacterial richness and diversity than SW, indicating that LF is a more effective technique for TPH remediation in terms of microbial recovery. Among different bacterial species, Proteobacteria were the most abundant in all soil groups followed by Actinobacteria, Acidobacteria, and Firmicutes. For each soil group, the distribution pattern of the Proteobacteria class was different. The most abundant classed were Alphaproteobacteria (16.56%) in uncontaminated soils, Deltaproteobacteria (34%) in TPH-contaminated soils, Betaproteobacteria (24%) in soil washing, and Gammaproteobacteria (24%) in landfarming, respectively. TPH-degrading bacteria were detected from soil washing (23%) and TPH-contaminated soils (21%) and decreased to 12% in landfarming soil. These results suggest that soil pollution can change the diversity of microbial groups and different remediation techniques have varied effective ranges for recovering bacterial communities and diversity. In conclusion, the landfarming process of TPH remediation is more advantageous than soil washing from the perspective of bacterial ecology.


Introduction
Soil pollution attributed to total petroleum hydrocarbons (TPHs) is a worldwide concern [1]. Organic pollutants such as gasoline and diesel can reach deep into the soil and aquifers resulting in soil and groundwater pollution and even causing economic losses and destruction of agricultural production systems [2,3]. Since the Soil Environment Conservation Act was implemented in 1995, the Korean government has promoted soil and groundwater remediation through monitoring of contaminated regions and applying remediation techniques [4].
Various physicochemical and biological techniques, such as stabilization, soil washing, electrokinetic remediation, landfarming, phytoremediation, and biodegradation are used to remediate TPH-contaminated soil [5]. Soil washing and landfarming, in particular, are widely used in the TPH soil remediation process, with numerous advantages including simplicity and cost-effectiveness [6][7][8][9]. Since the 1980s, soil washing has been extensively researched and applied throughout Europe and North America. This method is a physicochemical process that uses desorption and solubilization to remove contaminants sorbed to

Soil Sampling and Sample Description
Soil samples were collected from a closed military base located in Gyeonggi Province, South Korea. The petroleum carbonate threshold value (2000 mg/kg) set by the Korea Ministry of Environment (KME) was exceeded in this area, and two soil remediation processes were used to clean up the polluted soil: soil washing and landfarming. A total of 11 soil samples were collected from the site including: two control samples, four from petroleum-contaminated sites (CS), three samples obtained after the soil washing process (SW), and two from the landfarming process (LF). Three sub-soil samples were taken at a depth of 0-20 cm from each sampling site and combined to form one sample. Each sample was thoroughly homogenized by manual shaking. The soils were sieved and separated into three subsamples using a 2 mm mesh. The first soil sample was air-dried at 35 • C for physicochemical analyses. The remaining subsoil sample was stored at 4 • C for petroleum carbonate analysis. Another ten grams of fresh soil was transferred to a 50 mL conical tube and stored at −80 • C for DNA extraction.

Soil Chemical Properties and Total Petroleum Hydrocarbon (TPH) Analysis
Soil pH and electrical conductivity (EC) were measured in a soil/distilled water ratio of 1:5 (w/v) using a pH meter (Orion Star™ A111, Thermo Fisher Scientific, Waltham,  [24]. Available phosphorus was extracted using Bray No. 1 [25], determined by the molybdenum blue method [26], and quantified using a UV-spectrophotometer (Uvmini-1240, Shimadzu, Kyoto, Japan). The cation exchange capacity (CEC) was extracted using the ammonium acetate method (pH 7.0) and determined by ICP-OES (Optima 3200XL, Perkin Elmer, Waltham, MA, USA). Total nitrogen was measured using a Kjeldahl distillation apparatus. Total petroleum hydrocarbon (TPH) in the soil samples was analyzed according to KSTM ES 07552. Briefly, 10 g of soil sample was extracted with 200 mL of dichloromethane and the extract was analyzed using a gas chromatograph flame ionization detector (7890A, Agilent, Santa Clara, CA, USA). Genomic DNA from soil bacteria was extracted using the Nuclear Spin ® Soil DNA extraction kit, following the manufacturer's guidelines (Macherey-Nagel, Düren, Germany). Briefly, 0.3 g of the soil sample was placed in MN Bead Tubes Type A containing ceramic beads, 700 µL of lysis buffer SL1 and 150 µL of enhancer SX were added, and bead-beating was performed using a Mini Beadbeater-16 (Macherey-Nagel, Düren, Germany). The subsequent process was performed according to the manufacturer's protocol, and the genomic DNA was extracted with 50 µL of Elution Buffer SE. The concentrations and purities of the extracted DNA were evaluated using a UV-Vis spectrophotometer (Optizen NANO Q, Mecasys, Daejeon, Korea) and stored at −80 • C until further analysis. Genomic DNA samples were sent to LabGenomics (Seongnam, Korea) for metagenomic analysis, and PCR amplification and Illumina MiSeq sequencing were performed. To analyze the soil bacterial colony structure, the universal primers 341F (5'-CCTACGGNGGCWGCAG-3') and 805R (5'-GACTACHVGGGGTATCC-3') were used to amplify the V3-V4 regions of the 16S rRNA. PCR was performed using Herculase II Fusion DNA Polymerase (Agilent, Santa Clara, CA, USA), and carried out in a total volume of 23 µL containing: 2.5 µL amplicon PCR F, R primer, 0.5 µL Herculase II Fusion DNA Polymerase, 5.0 µL 5 × Herculase II reaction buffer, 0.25 µL dNTPs (100 nM), and 14.75 µL PCR-grade water. Amplicon PCR conditions were as follows: an initial denaturation step of 3 min at 95 • C, followed by 25 cycles of 95 • C for 30 s, 55 • C for 30 s, 72 • C for 30 s, and a final extension for 5 min at 72 • C. Then, limited-cycle amplification was performed to add multiplexing indices and Illumina adapter sequences, and the conditions were as follows: initial denaturation step for 3 min at 95 • C followed by eight cycles of 95 • C for 30 s, 55 • C for 30 s, 72 • C for 30 s, and a final extension for 5 min at 72 • C. The final product was normalized and pooled using PicoGreen (Promega, Madison, WI, USA), and the size of the libraries was verified using the TapeStation DNA screentape D1000 (Agilent). Paired-end sequencing was performed using the MiSeq™ platform (Illumina, San Diego, CA, USA).

Bioinformatic Analysis
The raw sequence data obtained by MiSeq sequencing analysis were classified by sample using the index sequence, and a FASTQ file was generated. Then, the paired-end data, separated by each sample, were assembled into a single sequence using FLASH (ver. 1.2.11) [27], and sequences with a length of less than 400 bp or more than 500 bp were removed. To exclude low-quality, ambiguous, chimeric sequences, which are deemed sequencing errors, the acquired sequences were processed using the CD-HIT-OTU program [27], an operational taxonomic unit (OTU) analysis program, based on CD-HIT-EST. Sequences were clustered into species-level OTUs with 97% sequence similarity. Taxonomic assignment was performed by comparing the representative sequence from each OTU to the reference database (NCBI 16S Microbial) using BLAST+ (ver. 2.9.0) [28]. Various bioin-formatics analyses were performed using QIIME (ver. 1.9) with the obtained OTU findings and taxonomy information [29]. In addition, beta diversity between soil groups was obtained based on weighted UniFrac distance [30], principal coordinate analysis (PCoA), and unweighted pair group method with arithmetic mean (UPGMA) clustering was conducted to compare and visualize the diversity of bacterial communities in the four soil groups.

Data Analysis
Statistical tests were performed using the Statistical Package for Social Science (SPSS) version 26.0 (2021, SPSS Inc., Chicago, IL, USA). Soil physicochemical properties, TPH concentration, and bacterial relative abundance were measured in triplicate and expressed as mean and standard deviation. Analysis of variance (ANOVA) was performed to verify the statistical difference with the results to confirm the normality and homogeneity of the variance, followed by the post-hoc Duncan's test (p < 0.05) for multiple comparisons between soil samples.

Soil Chemical Properties and TPH Concentration
The chemical properties and TPH concentrations of the soil samples are listed in Table 1. The soil pH was slightly acidic in the control (6.23) and contaminated sites (CS) (6.22), but after the remediation process, it increased to neutral for the landfarming (LF) (6.90) and soil washing processes (SW) (7.23). The increased soil pH in the LF and SW soils was attributed to the remediation process. The soil pH was adjusted to neutral with microbial feeding solution to optimize the microbial activity in the LF treatment. In addition, the surfactant's alkaline property was used to detach TPH from the soil, which may have resulted in an increase in soil pH in the SW treatment. When compared to the control, the SW treatment had significantly lower concentrations of soil organic matter (SOM) (0.81%), average phosphorus (Av. P) (2.86 mg/kg) and total nitrogen (TN) (0.13%). Soils with particle sizes less than 0.075 mm, mainly clay portion, were washed out during the SW process and consequently, SOM and Av. P sorbed on surface of clay particle were also removed causing a lower concentration of SOM and Av. P compared to control. In addition, in SW treatment, soil washing solutions, water, and surfactants reduced TN concentration.
The concentrations of TPH in the control and CS groups were 77 and 2690 mg/kg, respectively. In Korea, the TPH contamination criterion was set at 2000 mg/kg, and we confirmed that the TPH concentration in CS soil was higher than the threshold value TPH concentrations in SW and LF were significantly decreased to 111 and 249 mg/kg, after the soil remediation process, indicating removal effectiveness of 95.8% and 90.7%, respectively.

Soil Bacterial Diversity Analysis
A summary of soil bacterial diversity analysis, including the number of OTUs, Good's coverage, Chao1, and Shannon index is shown in Table 2. A total of 985,196 raw read counts Toxics 2021, 9, 319 5 of 12 were obtained from 11 soil samples through MiSeq sequencing analysis of soil bacterial communities. Low-quality and chimeric sequences were removed through preprocessing and clustering using CD-HIT-OTU, resulting in between 13,360 and 44,406 reads. In addition, the filtered bacterial sequences were clustered into OTUs with a similarity level of 97%. As a result, OTUs were observed in the range of 263-1483 in all soil samples. The control group had the largest average number of OTUs (1265), while the TPH-affected CS group showed a significant decrease in OTUs with the lowest average number of 296. In SW and LF, an average of 305 and 815 OTUs were measured, respectively. Good's coverage estimator ranged from 97.90% to 99.99%, exceeding 97% in all soil samples, indicating that the sequencing depth was sufficient to identify the entire bacterial population in the soil [31]. Alpha diversity was calculated to analyze the bacterial species richness (Chao 1 index) and diversity (Shannon index) within soil samples under four different conditions., The Chao1 index (316) of SW was similar to that of CS, and the Shannon index (5.94) was slightly increased (p < 0.05). LF on the other hand, had significantly increased Chao1 (1000) and Shannon index (7.06) than CS (p < 0.05).
These results reveal that TPH leaching into the soil reduces the richness and diversity of bacteria in the soil. TPH contamination has also been shown to reduce the alpha diversity of soil bacteria in several studies [32,33]. We also confirmed that Chao1 and Shannon indices decreased significantly in TPH-contaminated soil (CS) compared to the control in this study. We also found that the alpha diversity of bacterial communities was affected differently by each soil remediation approach. When the LF process was used, alpha diversity indices in the soil were restored to control levels but no significant changes in biodiversity were seen when the SW process was applied to TPH-contaminated soil.
Soil biodiversity may be affected by different remediation processes. SW, for example, is a physicochemical method of remediation that employs washing solutions. Although washing solutions efficiently remove TPH from the soil, they have been shown to have a deleterious impact on soil bacterial activity and diversity by altering the soil's physicochemical properties [34,35]. In contrast, the concentration of TPH in the soil decreases in the LF process due to volatilization or biodegradation [36,37]. In this procedure, organic matter and bacterial consortia are used for successful biodegradation and increased soil bacterial activity [38,39]. Previous research has also found that employing the LF process to reduce TPH restored bacterial diversity to pre-contamination levels [40].
The similarity of the four soil groups was compared using a beta diversity analysis. The distribution of bacterial communities in the four groups of soil samples was visually represented through PCoA (Figure 1). The first (53.63%) and second principal coordinates (27.02%) accounted for 80.65% of the total variation. The control and LF groups were clustered at a similar level, indicating that the bacterial communities were similar between the control and LF groups. CS and SW, on the other hand, were highly different from the control and LF, implying that the bacterial communities of CS and SW were very distinct from the control and LF. In addition, the UPGMA clustering results revealed that the four soil groups were primarily divided into two sections: control and LF, and CS and SW (data not shown). This finding also supports the notion that TPH contamination and the soil remediation process could have a significant impact on bacterial communities. control and LF groups. CS and SW, on the other hand, were highly different from the control and LF, implying that the bacterial communities of CS and SW were very distinct from the control and LF. In addition, the UPGMA clustering results revealed that the four soil groups were primarily divided into two sections: control and LF, and CS and SW (data not shown). This finding also supports the notion that TPH contamination and the soil remediation process could have a significant impact on bacterial communities.

Varied Bacterial Distribution in Soil
A total of 21 phyla, 53 classes, 119 orders, 242 families, 659 genera, and 1118 species were identified in all soil samples. Figure 2 represents the bacterial phyla with a relative abundance of 1% or more. The four soil groups consisted mainly of Proteobacteria, Actinobacteria, Acidobacteria, Firmicutes, Chloroflexi, Gemmatimonadetes, and Bacteroidetes. They accounted for more than 90% of the total bacteria present in the soil.

Varied Bacterial Distribution in Soil
A total of 21 phyla, 53 classes, 119 orders, 242 families, 659 genera, and 1118 species were identified in all soil samples. Figure 2 represents the bacterial phyla with a relative abundance of 1% or more. The four soil groups consisted mainly of Proteobacteria, Actinobacteria, Acidobacteria, Firmicutes, Chloroflexi, Gemmatimonadetes, and Bacteroidetes. They accounted for more than 90% of the total bacteria present in the soil.
control and LF groups. CS and SW, on the other hand, were highly different from the control and LF, implying that the bacterial communities of CS and SW were very distinct from the control and LF. In addition, the UPGMA clustering results revealed that the four soil groups were primarily divided into two sections: control and LF, and CS and SW (data not shown). This finding also supports the notion that TPH contamination and the soil remediation process could have a significant impact on bacterial communities.

Varied Bacterial Distribution in Soil
A total of 21 phyla, 53 classes, 119 orders, 242 families, 659 genera, and 1118 species were identified in all soil samples. Figure 2 represents the bacterial phyla with a relative abundance of 1% or more. The four soil groups consisted mainly of Proteobacteria, Actinobacteria, Acidobacteria, Firmicutes, Chloroflexi, Gemmatimonadetes, and Bacteroidetes. They accounted for more than 90% of the total bacteria present in the soil.  Despite the fact that the phylum composition of each group differed, Proteobacteria were identified in the largest proportion in all soils. The control group had the lowest rate at 36.22%, while CS had the highest rate at 72.72%. In addition, SW and LF decreased to 65.41% and 50.51%, respectively, after the soil remediation process. Proteobacteria are abundant in a variety of soil environments [41] and play an essential role in the carbon, sulfur, and nitrogen cycles in the soil [42]. Furthermore, many bacteria belonging to the phylum Proteobacteria degrade hydrocarbons in the soil [43]. The prevalence of Proteobacteria in TPH-contaminated soil has also been reported elsewhere [44,45]. Additionally, the largest proportion of Proteobacteria observed in CS is consistent with previous research demonstrating that TPH concentration and relative abundance of Proteobacteria Toxics 2021, 9, 319 7 of 12 increased proportionally [46,47]. Similar to the decrease in Proteobacteria in SW and LF, the degradation of TPH during the soil remediation process could be linked.
In addition, among the four soil groupings, the distribution pattern of bacterial classes in the Proteobacteria phylum differed. Figure 3 shows the relative abundance of 0.1% or more among the seven classes of Proteobacteria (Alpha-, Beta-, Gamma-, Delta-, Epsilon-proteobacteria, Hydrogenophilalia, and Oligoflexia) observed in this study. Alphaproteobacteria were found in the range of 10.63-16.56% in all soil groups and were predominant in the control. This class was found most frequently in uncontaminated, ordinary soil, and Betaproteobacteria and Gammaproteobacteria were found in relatively low proportions [48,49]. Betaproteobacteria (6.32%), Gammaproteobacteria (3.94%), and Deltaproteobacteria (9.36%) were less abundant in the control group than Alphaproteobacteria (16.56%).
65.41% and 50.51%, respectively, after the soil remediation process. Proteobacteria are abundant in a variety of soil environments [41] and play an essential role in the carbon, sulfur, and nitrogen cycles in the soil [42]. Furthermore, many bacteria belonging to the phylum Proteobacteria degrade hydrocarbons in the soil [43]. The prevalence of Proteobacteria in TPH-contaminated soil has also been reported elsewhere [44,45]. Additionally, the largest proportion of Proteobacteria observed in CS is consistent with previous research demonstrating that TPH concentration and relative abundance of Proteobacteria increased proportionally [46,47]. Similar to the decrease in Proteobacteria in SW and LF, the degradation of TPH during the soil remediation process could be linked.
In addition, among the four soil groupings, the distribution pattern of bacterial classes in the Proteobacteria phylum differed. Figure 3 shows the relative abundance of 0.1% or more among the seven classes of Proteobacteria (Alpha-, Beta-, Gamma-, Delta-, Epsilon-proteobacteria, Hydrogenophilalia, and Oligoflexia) observed in this study. Alphaproteobacteria were found in the range of 10.63-16.56% in all soil groups and were predominant in the control. This class was found most frequently in uncontaminated, ordinary soil, and Betaproteobacteria and Gammaproteobacteria were found in relatively low proportions [48,49]. Betaproteobacteria (6.32%), Gammaproteobacteria (3.94%), and Deltaproteobacteria (9.36%) were less abundant in the control group than Alphaproteobacteria (16.56%). Deltaproteobacteria, on the other hand, were predominant in CS, and their prevalence increased significantly from 9.36% to 33.68% compared to the control. Additionally, Betaproteobacteria increased from 6.32% to 9.85%, while Gammaproteobacteria increased from 3.94% to 16.27%. These findings are consistent with previous research, which suggests that TPH-contaminated soil has a reduction in Alphaproteobacteria and an increase in Beta-, Gamma-, and Deltaproteobacteria [50,51]. The presence of several types of TPHdegrading bacteria belonging to Beta-, Gamma-, and Deltaproteobacteria [52], which play a significant role in the biodegradation of hydrocarbons in the soil [53], caused these bac- Deltaproteobacteria, on the other hand, were predominant in CS, and their prevalence increased significantly from 9.36% to 33.68% compared to the control. Additionally, Betaproteobacteria increased from 6.32% to 9.85%, while Gammaproteobacteria increased from 3.94% to 16.27%. These findings are consistent with previous research, which suggests that TPH-contaminated soil has a reduction in Alphaproteobacteria and an increase in Beta-, Gamma-, and Deltaproteobacteria [50,51]. The presence of several types of TPH-degrading bacteria belonging to Beta-, Gamma-, and Deltaproteobacteria [52], which play a significant role in the biodegradation of hydrocarbons in the soil [53], caused these bacterial composition modifications. Most TPH-degrading bacteria belong to the Gammaproteobacteria class, which results, in increased relative abundance at high TPH concentrations [54,55]. Many studies have labeled this phenomenon as a "gamma shift" [56][57][58]. Meanwhile, during the soil remediation process, the bacterial composition altered again. Deltaproteobacteria in SW and LF decreased to 12.78% and 4.04%, respectively, as compared to CS In addition, Betaproteobacteria increased to 23.68% in SW, whereas Gammaproteobacteria increased to 24.09% in LF.
Actinobacteria was the second most dominant phylum in the control (21.04%), CS (14.19%), and SW (18.90%). The relative abundance of Actinobacteria in LF, on the other hand, was 8.76%, which was lower than that in the other soil groups. Actinobacteria, Bacteroidetes, and Firmicutes contain TPH-degrading bacteria, such as Proteobacteria [59][60][61], and are commonly found in TPH-contaminated soil [62,63]. However, in contrast to previous studies, Actinobacteria was the most prevalent, accounting for 21.04% in the control and 14.19% in CS. In addition, despite differences in TPH concentration and remediation processes, the relative abundance of Bacteroidetes and Firmicutes did not differ in all soil groups (p < 0.05).

Effect of TPH-Degrading Bacteria in Contaminated and Remediated Soil
In this study, we tried to determine how TPH-degrading bacteria affect the bacterial community structure. Information on TPH-degrading bacteria was gathered from previous studies and their distribution by soil groups at the genus level was analyzed [64][65][66][67][68][69][70]. Table 3 shows the TPH-degrading bacteria, found in all soil samples, with a relative abundance greater than or equal to 0.01%. TPH-degrading bacteria mainly belonged to Proteobacteria, Actinobacteria, Firmicutes, and Bacteroidetes. These were present in the control and CS groups at 4.05% and 20.61%, respectively. Many soil bacteria act as TPH decomposers, using hydrocarbons in the soil as carbon and energy sources [71]. The release of TPH as a pollutant into the soil invariably increases the number of TPH-degrading bacteria [72]. Similarly, the relative abundance of TPH-degrading bacteria in the CS was more than three times higher than in the control in this study. In comparison to other TPHdegrading bacteria, Acinetobacter, Immundisolibacter, Pseudomonas, Pseudoxanthomonas, Staphylococcus, and Thermomonas had higher relative abundances in CS. These bacteria were frequently observed at relatively high TPH concentrations, which is consistent with previous results [73,74]. In addition, these bacteria were Proteobacteria, contributing to an increase in Gammaproteobacteria in CS compared to the control. Deltaproteobacteria however, was listed among the bacteria that degrade TPH. This suggests that the increase in Deltaproteobacteria in the CS was not caused by TPH-degrading bacteria. The percentage of TPH-degrading bacteria in SW was slightly higher than in the CS at 23.49%, which was slightly higher than that in the CS. When compared to CS, this group had a higher abundance of Acinetobacter, Immundisolibacter, Pseudomonas, and Staphylococcus, as well as a similar distribution pattern of TPH-degrading bacteria. Moreover, comparable to the similarity of beta diversity between CS and SW, these results suggest that the soil washing process did not notably affect the TPH-degrading bacteria or the overall bacterial community structure. The TPH-degrading bacteria in LF, on the other hand, decreased to 12.29%. The return of the bacterial population to the uncontaminated level (control group) following the landfarming process appears to be the cause of this decrease in relative abundance.

Conclusions
Changes in the soil environment, such as soil pollution or remediation processes can have a significant impact on soil biological properties. This study used metagenomic analysis to examine how bacterial diversity changed in TPH-contaminated soil after soil washing and landfarming remediation processes. The Illumina MiSeq platform was utilized to compare soil bacterial diversity in this study, which clearly demonstrated that bacterial species diversity varied depending on the soil remediation process.
In the presence of TPH in the soil, bacterial alpha diversity was reduced. LF removed less TPH than SW, but it vastly increased bacterial richness and diversity. Various distribution patterns for each soil group were identified as a result of this analysis of beta diversity and relative abundance at the phylum and class levels. The bacterial community of LF was determined to be similar to the control group. The many TPH-degrading bacteria found in SW, however, revealed that the soil washing process did not significantly affect the bacterial communities.
From the perspective of bacterial ecology, these results suggest that the landfarming process of TPH remediation is more advantageous than soil washing.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.