Assessment of the Microbial Communities in Soil Contaminated with Petroleum Using Next-Generation Sequencing Tools

Abstract: Microbial communities are known to play a principal role in petroleum degradation. This study determined the composition of bacteria in selected crude oil-contaminated soils from the states of Tabasco and Tamaulipas, Mexico, and evaluated the structure and diversity of the bacterial communities living under these conditions. Next-generation sequencing (NGS) analysis of the sampled soils from both states revealed that Proteobacteria had the highest relative abundance among the identified bacterial phyla. A heatmap of the relative percentage of each genus within each sample clustered the four samples into two groups. In alkaline soil from Tamaulipas, Skermanella sp., Azospirillum sp. and unclassified members of the Rhodospirillaceae family were present in higher abundance. Meanwhile, in acidic soil from Tabasco, Thalassospira, unclassified members of the Sphingomonadaceae family and unclassified members of the Alphaproteobacteria class were present in higher abundance. Alpha diversity analysis (Shannon and Simpson indices, Chao1 and observed species) showed low diversity in both regions. These results suggest that bacteria in these genera may possess the ability to degrade petroleum; further studies should elucidate their role in petroleum degradation.


Introduction
Soils contaminated with petroleum have received attention in recent times, especially because of the financial and technical implications of decontaminating them [1][2][3][4]. Soil is usually the final destination of both organic and inorganic contaminants and, as such, has remained one of the major recipients of oil spills and other petroleum products [5]. Petroleum is a mixture of saturated hydrocarbons, aromatic compounds, asphaltenes, and resins.
The adoption of omics techniques has helped in the identification of potentially useful microbes for the decontamination of oil-polluted environments. Metagenomic analysis of a hydrocarbon-degrading bacterial consortium showed that the consortium achieved 70% biodegradation of hydrocarbon, indicating its potential for environmental remediation [15]. This implies that metagenomic evaluation can lead to the identification of microorganisms in hydrocarbon-contaminated soil that can be used as a consortium for bioremediation purposes. Some metagenomic diversity studies based on the sequencing of the 16S rRNA gene have identified diverse bacteria that could be involved in the degradation of hydrocarbon. For instance, Garrido-Sanz et al. (2019) identified Pseudomonas, Aquabacterium, Chryseobacterium, and members of the Sphingomonadaceae as the dominant taxa in a 16S rRNA-based metagenomic analysis of a diesel oil-polluted soil in a rhizoremediation assay [34].
The advent of next-generation sequencing (NGS) techniques has influenced many studies, revealing the efficiency of NGS for the quantification, identification, and analysis of the microbial population in soil [30,38,39]. NGS-based techniques using Illumina sequencing technology have been employed in bio-stimulation assays to determine the most dominant bacterial phyla in contaminated soil. That study identified Proteobacteria, Firmicutes, and Bacteroidetes as the most abundant phyla in the studied soil [40]. In another study using 16S rRNA gene-based Illumina MiSeq sequencing, Proteobacteria (49.11%) and Actinobacteria (24.24%) were reported as the most dominant phyla in oil-contaminated soil, and the main genera were Pseudoxanthomonas, Luteimonas, Alkanindiges, Acinetobacter and Agromyces [28].
Previous analyses of soil have shown that next-generation sequencing is an important tool for understanding the diversity and functional roles of microbes in their environment [41][42][43]. We believe a next-generation sequencing-based analysis of crude oil-contaminated soil from Tabasco and Tamaulipas can reveal the diversity and functional role of the microbes in these environments. There is still a paucity of information on the abundance of the microbial population of the oil-contaminated soils of Mexico. We therefore conducted a study to characterize the structure and diversity of bacterial communities in contaminated soil samples collected from the Tamaulipas and Tabasco regions in Mexico.

Research Design
In this study, we analyzed bacterial composition in soil contaminated with crude oil from two regions in Mexico. Figure 1 shows the general steps taken to identify the microbial communities at both sites.

Step 1. Soil samples were collected from contaminated sites in two regions of Mexico, Burgos and Tabasco.
Step 2. Physicochemical analysis determined the principal properties of the soil.
Step 3. Sequencing was carried out on the Ion Torrent platform, followed by trimming and removal of low-quality sequences.
Step 4. Data analysis was carried out with QIIME software version 1.9, a platform for analyzing next-generation sequencing data. Alpha diversity, OTUs and beta diversity were computed. Finally, a heatmap was generated using R software version 3.3.3, and the SRA sequences were uploaded to NCBI.

Sampling Site
Sampling was carried out in the Mexican states of Tabasco and Tamaulipas. Contaminated soil samples were recovered from Tres Bocas town (location of 3a.), Huimanguillo Municipality, Tabasco (17°55′18.9″ N, 93°50′48.6″ W), where the soil was classified as Acrisol (AC), and from Cuenca Burgos, Tamaulipas (26°00′51″ N, 98°29′45″ W), where the soil was classified as Kastanozem, based on the World Reference Base (WRB). Each region (Burgos and Tabasco) constituted a zone, giving two zones. Six spots were randomly sampled in each region: Burgos (replicates SB.R1 and SB.R2) and Tabasco (replicates ST.R1 and ST.R2). Soil samples were collected in triplicate, according to Galazka et al. (2021), and combined into composite samples, giving four composite samples in total (two per region), as described earlier [44]. The texture was determined by the Bouyoucos method, and the pH was measured in a water solution (1:2) [45] using three replicates for each composite sample (Figure 2). A t-test was performed to compare the physicochemical properties of the soils from Burgos and Tabasco at p < 0.05. The hydrocarbon content of the soil was not analyzed because the focus of the study was the bacterial diversity in oil-contaminated soil.
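As a rough illustration of the comparison described above, the following Python sketch computes a pooled-variance (Student's) t statistic for two sets of triplicate measurements. The pH readings are hypothetical placeholders, not data from this study, and the function name `student_t` is introduced here only for illustration. The constant 2.776 is the two-sided 5% critical value for 4 degrees of freedom.

```python
import math
from statistics import mean, variance

def student_t(a, b):
    """Return (t, df) for a two-sample Student's t-test with pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df

# Hypothetical triplicate pH readings (placeholders, not measured values).
ph_burgos = [8.1, 8.3, 8.2]
ph_tabasco = [5.9, 6.1, 6.0]

t_stat, df = student_t(ph_burgos, ph_tabasco)
T_CRIT_DF4 = 2.776  # two-sided critical value at alpha = 0.05 for df = 4
significant = df == 4 and abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.2f}, df = {df}, significant at 0.05: {significant}")
```

With only three replicates per group the test has little power, so a sketch like this is best read as a check on the arithmetic rather than as a substitute for the statistical software used in the study.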

DNA Extraction from Contaminated Soil Samples and Construction of the V2-V3-16S rDNA Libraries
Prior to the extraction of genomic DNA, the soil samples were cleaned to remove excess petroleum compounds. The samples were placed in an oven to allow the oil to evaporate, and the remaining oily components were then absorbed with tissue to prevent the oil from inhibiting DNA extraction. Then, DNA was extracted from 1 g of soil with the PowerSoil DNA Isolation Kit (MOBIO, Carlsbad, CA, USA) according to the manufacturer's instructions (Figure 2). The quantity of purified DNA was measured using a NanoDrop 2000 Spectrophotometer (Thermo Scientific, Waltham, MA, USA). Nested PCR was performed. In the first round, the primer pair 27F (GAGAGTTTGATCCTGGCTCAG) and 1495R (CTACGGCTACCTTGTTACGA) was used to amplify the 16S rRNA gene from the extracted community DNA [45,46].
The PCR conditions were as follows: 5 min at 95 °C; 35 cycles of 60 s at 95 °C, 60 s at 60 °C and 90 s at 72 °C; followed by 10 min at 72 °C [47]. These PCR products were then used as a template for amplifying the V2-V3 region (252 bp) of the bacterial 16S rRNA gene. The PCR conditions were 5 min at 95 °C; 25 cycles of 30 s at 95 °C, 30 s at 60 °C and 45 s at 72 °C; followed by 10 min at 72 °C [48]. The sequences of the forward and reverse primers are shown in Table 1.
The 252 bp fragments were purified from the gel with the Wizard® SV Gel and PCR Clean-Up System (Promega®, Rome, Italy) and were used for library construction. The concentration of the DNA used for each library was quantified with the Qubit® 2.0 Fluorometer, according to the instructions of the manufacturer (Invitrogen, Waltham, MA, USA). The amplicons were purified with the Agencourt AMPure XP system (Beckman Coulter®, Brea, CA, USA); then emulsion PCR was carried out with the Ion OneTouch™ 200 Template v2 DL (Life Technologies®, Carlsbad, CA, USA) using 60 pM of each library. Template enrichment with Ion Sphere Particles (ISPs) was performed on the Ion OneTouch™ 2 system, and the libraries were sequenced on the Ion Torrent PGM™ platform (Life Technologies).

Microbial Community Structure Analysis
First, raw reads were filtered using the Ion Torrent PGM software Torrent Suite v4.0.2. These reads were then trimmed to remove tags and primers. The trimmed reads were quality filtered with Trimmomatic (quality score > 20, read length = 150-200 bp), according to Vital-López et al. (2017) [45]. The resulting sequences were analyzed in QIIME version 1.9 [49]. Open-reference operational taxonomic units were determined at 97% similarity using the USEARCH algorithm [50]. Finally, sequence alignments were completed using the Greengenes core set [51,52].
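The quality and length thresholds above can be illustrated with a small Python sketch. This is a simplified stand-in, not Trimmomatic itself: it assumes Phred+33 quality encoding and applies the quality cutoff to the mean score of each read, which may differ from the exact Trimmomatic steps used in the study. The function names are hypothetical.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)

def keep_read(seq, quality, min_q=20, min_len=150, max_len=200):
    """Keep reads whose length is in range and whose mean quality exceeds min_q."""
    if not (min_len <= len(seq) <= max_len):
        return False
    return mean_phred(quality) > min_q

def filter_fastq(lines):
    """Yield (header, seq, quality) for passing reads from 4-line FASTQ records."""
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, _, quality = records[i:i + 4]
        if keep_read(seq, quality):
            yield header, seq, quality

# Toy records: one passing read and one that is too short.
toy = [
    "@read1", "A" * 160, "+", "I" * 160,
    "@read2", "A" * 100, "+", "I" * 100,
]
passed = list(filter_fastq(toy))
print([header for header, _, _ in passed])
```

In practice the same function could be fed the lines of an open FASTQ file instead of the toy list.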

Diversity Computation and Bioinformatic Analysis
We computed alpha diversity using the number of observed species (operational taxonomic units, OTUs), species richness with the Chao1 estimator, and species diversity with the Shannon and Simpson indices. Alpha diversity was plotted with R software. A heatmap was generated in R for the top 20 bacterial taxa at the genus level, based on their abundance in the contaminated soil samples. The associated dendrograms were generated with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), with a clustering threshold of 0.75 in all samples. Beta diversity was calculated using UniFrac analysis according to Murugesan et al. (2015) [53]. The sequence datasets obtained were uploaded to the NCBI server (National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/, accessed on 18 January 2022). The Sequence Read Archive (SRA) submission was processed as BioProject: PRJNA798056, BioSamples: SAMN25045560, SAMN25045941, SAMN25045951 and SAMN25045953, SRA: SRR17658723, SRR17658722, SRR17658721 and SRR17658720.
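For readers unfamiliar with the alpha diversity measures named above, the following Python sketch shows one common way to compute them from a vector of OTU counts. It is a sketch under stated assumptions rather than the QIIME implementation: the Shannon index here uses the natural logarithm (other tools may use base 2), Simpson is taken as 1 minus the sum of squared proportions, and Chao1 uses the bias-corrected form. The example counts are hypothetical.

```python
import math

def observed_otus(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)); natural log is an assumption."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson(counts):
    """Simpson diversity, taken here as 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return observed_otus(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical OTU counts for one sample (placeholders, not study data).
otu_counts = [10, 5, 2, 1, 1, 1, 0]
print(observed_otus(otu_counts), chao1(otu_counts))
print(round(shannon(otu_counts), 3), round(simpson(otu_counts), 3))
```

Chao1 exceeds the observed count whenever singletons are present, which is why the two richness values reported for a sample can differ.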

Physicochemical Properties
The mean pH of the soil from Tabasco, in southeastern Mexico, was mildly acidic, whereas the soil from Burgos was alkaline. A Student's t-test for independent samples showed that the mean differences in the percentages of sand, silt, and clay and in pH were statistically significant at p < 0.05 (Table 2).
To analyze the bacterial diversity in the contaminated soil samples, cleaned reads were analyzed (Table 3). The open-reference operational taxonomic units (OTUs) were determined at 97% similarity using the USEARCH algorithm, with sequence alignments against the Greengenes core set (Figure 3).
Figure 4a and Table 4 show the bacterial composition at the phylum level, reflecting the distribution of bacteria in the Burgos (SB.R1 and SB.R2) and Tabasco (ST.R1 and ST.R2) soil samples. We identified a total of 17 phyla across all samples. Proteobacteria was the dominant phylum, followed by Actinobacteria, Firmicutes and Cyanobacteria.
The plotted heatmap represents the relative percentage of each bacterial genus within each sample and clusters the four samples into two main clades or groups (Figure 5). We used the heatmap to compare the similarity patterns between the samples from each region; it shows the highest degree of similarity among samples of the same region. The bioinformatics analysis at the genus level showed that the sequencing reads could be assigned to 408 taxa, of which 20 were common to both soils. In Figure 5, red indicates a high percentage (highest abundance) and yellow a low percentage (lowest abundance) of the identified genera. Some genera in the Tabasco soils clearly differed from those in the Burgos region; for example, Azospirillum was red in the Burgos samples but yellow in the Tabasco samples.
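The relative percentages plotted in a heatmap of this kind can be derived from raw read counts as in the Python sketch below. The counts are hypothetical placeholders, the function names are introduced for illustration, and ranking genera by their mean relative abundance across samples is an assumption about how a "top N" list might be chosen, not a description of the R workflow used here.

```python
def relative_abundance(counts):
    """Convert raw read counts per genus into percentages of the sample total."""
    total = sum(counts.values())
    return {genus: 100.0 * n / total for genus, n in counts.items()}

def top_genera(samples, n=20):
    """Rank genera by mean relative abundance across samples and keep the top n."""
    rel = {name: relative_abundance(c) for name, c in samples.items()}
    genera = {g for r in rel.values() for g in r}
    mean_rel = {
        g: sum(r.get(g, 0.0) for r in rel.values()) / len(rel) for g in genera
    }
    # Sort by descending mean abundance, breaking ties alphabetically.
    return sorted(genera, key=lambda g: (-mean_rel[g], g))[:n]

# Hypothetical read counts (placeholders, not data from this study).
samples = {
    "SB.R1": {"Azospirillum": 60, "Skermanella": 30, "Thalassospira": 10},
    "ST.R1": {"Azospirillum": 5, "Skermanella": 15, "Thalassospira": 80},
}
print(top_genera(samples, n=2))
```

Normalizing within each sample before ranking keeps samples with more reads from dominating the selection.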

Alpha Diversity Analysis
The total number of species observed was 877 for SB.R1, 1843 for SB.R2, 502 for ST.R1 and 2048 for ST.R2 (Figure 6). The number of species found in the contaminated soils of Burgos, Tamaulipas was low compared to our previous report on the bulk soils of Tamaulipas, where the identified average was 6208 species [45]. This variation could be associated with the contamination of the Burgos soil, requiring adaptation by the bacteria living in it. The Chao1 richness estimates (mean ± SE) for the contaminated soil samples were SB.R1 = 1210.819 ± 45.995, SB.R2 = 1941.843 ± 15.267, ST.R1 = 862.460 ± 59.884 and ST.R2 = 2088.329 ± 8.440 (Figure 6). Shannon's index showed that SB.R1 = 3.96, SB.R2

Beta Diversity Analysis
Beta diversity was calculated for the soil samples from the Tabasco and Burgos regions. Figure 7a shows the unweighted UniFrac analysis of the distances between the Tabasco and Burgos samples, visualized using principal coordinates analysis (PCoA). Figure 7b shows the hierarchical clustering tree of samples based on the UniFrac metric. The bacterial communities from each region are grouped in a separate branch of the tree. Differences between the two groups (Burgos and Tabasco soils) were tested with PERMANOVA on the Bray-Curtis distance matrix, which gave a p-value of 0.332. Therefore, no statistically significant differences were observed between the groups (Table 5).
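The Bray-Curtis and PERMANOVA comparison described above can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the software used in the study: the genus counts are hypothetical, the pseudo-F statistic is computed from squared distances in the usual distance-based form, and the p-value is obtained by enumerating every distinct arrangement of the group labels instead of random permutation.

```python
from itertools import combinations, permutations

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

def pseudo_f(dist, groups):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for label in labels:
        idx = [i for i in range(n) if groups[i] == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))

def permanova_exact(dist, groups):
    """Exact p-value: share of distinct label arrangements with F >= observed."""
    f_obs = pseudo_f(dist, groups)
    arrangements = set(permutations(groups))
    hits = sum(1 for g in arrangements if pseudo_f(dist, list(g)) >= f_obs - 1e-12)
    return f_obs, hits / len(arrangements)

# Hypothetical genus counts (placeholders, not data from this study).
counts = [
    [30, 10, 5, 0],   # SB.R1
    [28, 12, 4, 1],   # SB.R2
    [5, 8, 30, 12],   # ST.R1
    [4, 10, 28, 14],  # ST.R2
]
groups = ["Burgos", "Burgos", "Tabasco", "Tabasco"]
dist = [[bray_curtis(u, v) for v in counts] for u in counts]
f_obs, p_value = permanova_exact(dist, groups)
print(f"pseudo-F = {f_obs:.2f}, exact p = {p_value:.3f}")
```

With two samples per group there are only three distinct ways to split four samples into pairs, so a permutation p-value from this design cannot fall below about 1/3 however well the groups separate. This may be worth keeping in mind when interpreting a p-value close to 0.33.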

Discussion
It is known that soil pH is a primary factor driving the bacterial operational taxonomic unit abundance and soil bacterial alpha diversity rather than soil nutrients. It is responsible for shaping bacterial communities in agricultural soils, including their ecological functions and biogeographic distribution [54]. Soil fertility depends on physical, chemical and biological soil attributes [55].
In one study, Burgos soils were induced to decompose hydrocarbons impregnated in drill cuttings and were able to initiate the bioremediation of the hydrocarbon in the drill cuttings [56]. The bio-stimulation of soil microorganisms with nutrients N and P, humidity and aeration increased the decomposition of hydrocarbons and fostered the bioremediation of the drill cuttings [56]. Lin et al. (2022) [57] found that the soil pH and conductivity increased during the bioremediation experiment.
We identified 17 phyla, and the main phyla found in this study were Proteobacteria, Actinobacteria, Firmicutes and Cyanobacteria. This observation is similar to the findings from previous research on the bacterial microbiome and metagenomic studies of petrochemical-contaminated soils [15,30,58,59,60]. Similarly, in a study by Kumar et al. (2018) on the microbial community of alfalfa and barley soil samples, Proteobacteria (45.9%) was found to be the most dominant phylum [61]. This observation was similar to the report of Melekhina et al. (2021), in which Proteobacteria was the most abundant phylum in their assay [62]. However, our findings contrast with a previous study that reported Acidobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, Planctomycetes, and Proteobacteria as the dominant phyla among all oil-contaminated soils assessed by high-throughput sequencing of 16S rRNA genes [30]. The differences in bacterial composition relative to this study may be associated with differences in soil physicochemical properties, since these properties play a significant role in shaping the microbial communities in the soil [25,54]. In our earlier study of the bacterial composition of bulk soil samples from Tamaulipas, the main phyla, Proteobacteria, Firmicutes, Acidobacteria, Actinobacteria, Gemmatimonadetes, and Bacteroidetes, had the highest diversity according to the Shannon and Simpson indices [45].
In addition, previous studies have shown that soil contaminated with hydrocarbon tends to have some of the families identified in this study among its most abundant families, corroborating our observations and suggesting that bacteria in these families may be associated with the degradation of hydrocarbons [59,63,64]. The identification of bacteria in the genus Azospirillum agrees with previous studies that identified Azospirillum as one of the most abundant bacteria in the soil. Azospirillum is the most studied genus of plant growth-promoting rhizobacteria [65], and its members are also known to remove sulfide from swine waste biogas [66]. A previous study from our lab has also shown that Azospirillum has the potential to degrade hydrocarbon [67]. Another essential genus found in these soils, Skermanella sp., was reported as a pyrene degrader in a study of an oilfield soil that used natural attenuation, bioaugmentation, and bio-stimulation approaches for the degradation of pyrene [6]. Thus, the presence of these genera in the studied soil could be associated with their roles as bio-degraders in crude oil-contaminated soil.
Lastly, the presence of the genus Thalassospira in the petroleum-contaminated soil corroborates previous studies that reported its abundance in polluted water and soil samples [17]. Our alpha diversity results are also consistent with reports that the Shannon and Simpson indices, computed from operational taxonomic unit (OTU) abundance, tend to be very low in oil-contaminated soils [33].
The use of microorganisms for the degradation of xenobiotics is important because it is an environmentally friendly contaminant mitigation approach [68,69]. Several studies have successfully used microorganisms for the bioremediation of soils contaminated with heavy metals such as lead (Pb), zinc (Zn) and cadmium (Cd) [70]. Similarly, Pseudomonas aeruginosa NAPH6 recovered from contaminated seawater in Tunisia was shown to effectively degrade naphthalene and aliphatic hydrocarbons [71]. One study identified a strain of Microbacterium sp. from hydrocarbon-contaminated Burgos soil as a potential bacterium for the bioremediation of hydrocarbon-contaminated soil [72]. A comparison of our previous study on bulk agronomic soil in Tamaulipas with the petroleum-contaminated soils from Burgos (Tamaulipas) and Tabasco in this study shows that the content of the soil influences the microbial population. The abundance of particular genera in the petroleum-contaminated soil, but not in agronomic soil, further supports the potential of these genera for the bioremediation of petroleum-contaminated soil. This assertion is corroborated by the fact that many bacteria in these genera have been reported to grow in, tolerate, and degrade hydrocarbon in lab assays and in situ assays [11,21]. Hence, the abundance of different bacterial genera in this study further supports the potential of bacteria for the degradation of petroleum or hydrocarbons in a contaminated environment.

Conclusions
In this study, we characterized the structure and diversity of bacterial communities in oil-contaminated soil using a next-generation sequencing platform. In both contaminated sites, the abundant soil bacteria included Rhodospirillaceae and Sphingomonadaceae as the main families, and the genera with the highest abundance were Skermanella sp., Azospirillum sp., and Thalassospira sp. The presence of these genera implies that they may be associated with the degradation of petroleum, or that they might possess a mechanism through which they can survive in the presence of petroleum.
In conclusion, petroleum-contaminated soil is considered a major global concern because of its impact on human health and the functioning of the ecosystem. The presence of crude oil or petroleum in the soil can alter its microbial ecology, which may affect the population and the diversity of the microbes in the soil. As observed in this study, the population and diversity of the bacteria identified in the crude oil-contaminated soil were lower than those reported in our previous study on agronomic bulk soil collected in northeast Tamaulipas. Finally, next-generation sequencing analysis of oil-contaminated soil can give insight into the microorganisms that could be selected for the bioremediation of petroleum-contaminated soil. Overall, this study shows that petroleum contamination can alter the microbial ecology of contaminated soil.
Data Availability Statement: The sequencing data sets obtained were uploaded to the NCBI server (National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/ (accessed on 18 January 2022)). The Sequence Read Archive (SRA) submission was processed as BioProject: PRJNA798056.