Comparison of Microbial Gene Diversity in Grassland Topsoil Depending on Soil Quality

Soil has multiple functions, including the provision of habitat to organisms, and most biological activities occur in the surface soil. Due to the negative effects of soil erosion, efforts for soil conservation are being made, including the development of a reliable index that can help assess soil quality. In this study, the physical and chemical properties and biological genes from grassland topsoil were analyzed, in order to identify surface soil organism markers that could be used as a soil quality index. Six spots of grassland topsoil were analyzed, one high-quality and five low-quality, based on a web-based soil quality assessment module. Consequently, eukaryotes and prokaryotes with different soil quality ratios were compared and examined. The following bacteria and archaea have the potential to be used in soil quality assessment: circulation of materials including nitrogen, Nitrospira spp., Candidatus Nitrososphaera, and Candidatus Nitrosotalea; biological purification, Geobacter spp.; pathogens, Burkholderia spp., Paraburkholderia spp., Pseudomonas brassicacearum, and Rhizobacter spp.; antibiotic secretion, Candidatus Udaeobacter; and material degradation, Steroidobacter spp. and Rhodanobacter spp. This study provides primary data for identifying biological markers for soil quality evaluation. In the future, a wider variety of data need to be accumulated to develop a highly reliable index related to soil quality.


Introduction
Soil purifies pollutants, stores carbon, contributes to the internal water cycle, provides essential components to organisms, and serves as a habitat for organisms, including plants and microorganisms [1]. The surface soil contains high concentrations of organic matter and abundant microorganisms; it is the layer where most biological activities take place within the soil, thereby serving as a space for the main networking [2]. However, natural and artificial factors, including soil pollution, extreme weather events, climate change, increased land use intensity, and inappropriate land use, can cause surface soil erosion and degradation [3]. The surface soil, the top soil layer (30 cm), has major functions, as it is rich in organic matter and microorganisms and, therefore, serves as a source of nutrients and water for plants (MOE, 2021; Surface soil information portal system, http://pyoto.araon.org/ (accessed on 12 July 2021)). In addition, the annual average surface soil loss in Korea is approximately 32 ton/ha, three times the standard for soil erosion given by the Organization for Economic Cooperation and Development (OECD) [4]. In particular, a crucial role of surface soil is carbon storage. The amount of carbon stored within 2 m of soil in the global carbon cycle is 2400 Gt, and the management of surface soil loss is expected to enable a preemptive response to climate change [5]. Global surface soil loss is constantly increasing, owing to extreme climate conditions [6]. Surface soil erosion could threaten the versatility of soil through the loss of ecological niches, and it could exacerbate global warming through reduced carbon storage capacity and aggravate river water pollution through soil loss. As it takes a long time to rebuild eroded surface soil, this is considered an important field in the agricultural and environmental sciences [4].
In response, strict measures for soil conservation, such as the construction of an early warning system for the potential loss of soil versatility and the development of a reliable index to assess soil quality, have been implemented in developed countries such as those in Europe. Accordingly, a soil quality assessment model based on physical and chemical properties has been reported [7].
However, the debate regarding the biological index for soil quality assessment is ongoing. Although such an index requires the discovery of organisms that could ideally explain core functions and elucidate the key roles of different species, it is difficult to identify specific keystone markers, owing to the large functional redundancy of the soil microbial community [3]. The application of molecular biology techniques based on next generation sequencing (NGS) allows the accumulation of baseline data on soil biodiversity [2,8-12]. However, related research has focused on the agricultural environment linked to the production of high-quality crops, with relatively limited research on soil quality. In addition, agricultural soil may not be ideal for diversity research related to soil quality, as it could be affected by artificial factors, including specific agricultural crops and the fertilizers applied. In contrast, grassland occupies around 26-40% of the total land mass, contains the highest level of biomass on earth [13], and, unlike agricultural soil, is not influenced by artificial factors; hence, it is seemingly suitable for diversity research. However, baseline data and related research are limited. Therefore, in this study, we compared and analyzed NGS-based biological gene diversity according to soil quality in grassland topsoil, and examined potential biological markers related to soil quality.

Sampling
Samples were collected in duplicate between the end of August and beginning of September in 2019 from six grasslands in the Han River Basin, in Gyeonggi and North Chungcheong Provinces (Figure 1). The surface soil was collected from the 0-30 cm layer using a hand soil auger and then homogenized. The collected soil samples were placed in an icebox, moved to the laboratory, and sieved through a 2-mm mesh for analysis.

Surface Soil Quality Assessment
According to the analytical methods of soil, plant, and soil chemistry from the National Academy of Agricultural Science, and the standards for the examination of soil pollution provided by the National Institute of Environmental Research [14,15], the following parameters were measured: bulk density, porosity, available water capacity (AWC), pH, electrical conductivity (EC), organic content, available phosphate, cation exchange capacity (CEC), soil respiration, and soil enzymes [1,16-18]. Based on the analyzed parameters, soil quality was assessed using a web-based soil quality assessment module (http://pyoto.araon.org (accessed on 12 July 2021)) [1]. The soil quality index (SQI) is expressed on a 0-4 scale using pH, organic content, EC, and available phosphate. The soils were scored using Formula (1), in accordance with the equation proposed by Bhaduri et al. [19] and the modification by Wymore [20], where X represents the soil property value, b represents the slope, and A represents the datum line or soil properties. (1)
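Formula (1) itself is not reproduced in the text above. Purely as an illustration, the sketch below assumes a logistic scoring curve, a form often used for nonlinear scoring of soil indicators, with the same three symbols: the property value X, the slope b, and a baseline A at which the score equals 0.5. The function name, the parameter values, and the example indicator are hypothetical and are not taken from this study or from the assessment module.

```python
import math


def sigmoid_score(x: float, baseline: float, slope: float) -> float:
    """Assumed logistic scoring curve: 1 / (1 + exp(-slope * (x - baseline))).

    x        -- measured soil property value (X in the text)
    baseline -- value at which the score is 0.5 (A in the text, assumed meaning)
    slope    -- steepness of the curve (b in the text); a negative slope
                turns the curve into a "less is better" score
    """
    return 1.0 / (1.0 + math.exp(-slope * (x - baseline)))


if __name__ == "__main__":
    # Hypothetical "more is better" indicator with A = 2.0 and b = 1.5.
    for value in (0.5, 2.0, 4.0):
        print(value, round(sigmoid_score(value, baseline=2.0, slope=1.5), 3))
```

Under this assumed form, each indicator is mapped to a score between 0 and 1 before the indicators are combined; how the module weights and rescales the scores to the 0-4 SQI range is not specified here.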

Sample Pretreatment and NGS Analysis
The total nucleic acid was extracted from the soil samples with a FastDNA™ Spin Kit for Soil (MP Biomedicals, Solon, OH, USA), following the manufacturer's instructions. To analyze the genetic diversity of prokaryotes and eukaryotes, the V4 region of the bacterial 16S rRNA gene was amplified using barcode-attached 515F (5′-GTG YCA GCM GCC GCG GTA A-3′) and 806R (5′-GGA CTA CHV GGG TTW TCT AAT-3′), with the extracted nucleic acid as a template. Similarly, the barcode-attached 1391F (5′-GTA CAC ACC GCC CGT C-3′) and EukBr (5′-TGA TCC TTC TGC AGG TTC ACC TAC-3′) were used to amplify the fungal 18S rRNA gene [21]. The amplified products were run on a 2% agarose gel and confirmed using a gel documentation system (BioRad, Hercules, CA, USA), followed by purification using a QIAquick PCR Purification Kit (Qiagen, Hilden, Germany) and analysis of microbial gene diversity using the Illumina MiSeq platform (Illumina Inc., San Diego, CA, USA). Quantitative Insights Into Microbial Ecology (QIIME) software was used to trim the data, compare the communities at levels from phylum to species, and analyze alpha diversity. After excluding chimeric, short, low-quality, primer-mismatched, and non-target reads, 208,499-406,826 bacterial reads (262,851 on average) and 54,165-312,367 fungal reads (159,150 on average) were analyzed from the NGS baseline data (data not shown).
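The primers quoted above contain IUPAC ambiguity codes (Y, M, H, V, W), so a primer-mismatch filter has to treat each code as a set of allowed bases. The following is a minimal sketch of that idea, not the filtering performed by the software used in this study: it accepts a read only if it starts with an exact match to the degenerate primer, with no mismatches tolerated. The function names and the two example reads are hypothetical.

```python
import re

# IUPAC nucleotide codes that occur in the primers quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",   # pyrimidine
    "M": "[AC]",   # amino
    "H": "[ACT]",  # not G
    "V": "[ACG]",  # not T
    "W": "[AT]",   # weak
}


def primer_to_regex(primer: str) -> re.Pattern:
    """Translate a degenerate primer into a compiled regular expression."""
    bases = primer.replace(" ", "").upper()
    return re.compile("".join(IUPAC[base] for base in bases))


def starts_with_primer(read: str, primer: str) -> bool:
    """Return True if the read begins with a sequence the primer can match."""
    return primer_to_regex(primer).match(read.upper()) is not None


PRIMER_515F = "GTG YCA GCM GCC GCG GTA A"

if __name__ == "__main__":
    read_ok = "GTGCCAGCAGCCGCGGTAA" + "TACGGAGGG"   # hypothetical read
    read_bad = "GTGGCAGCAGCCGCGGTAA" + "TACGGAGGG"  # G at the Y position
    print(starts_with_primer(read_ok, PRIMER_515F))
    print(starts_with_primer(read_bad, PRIMER_515F))
```

Production pipelines usually allow a small number of mismatches and also trim the primer from the read; both are left out here to keep the sketch short.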

Soil Quality
An analysis of the average value of each category, for 12 samples collected from six spots with two replicates, showed that the soil quality of spot #2 was high, with an SQI of 2.40 and a score of 73.21; the soil quality of the remaining five spots was low. The SQI of the spots with a low soil quality ranged between 0.54 and 1.51 (1.10 on average), and the score was between 2.51 and 22.87 (12.51 on average). Furthermore, the surface soils with a high soil quality had a higher organic content of 1.36 g/kg, an available phosphate of 64.57 mg/kg, and β-glucosidase of 4.32 mg PNP/kg h, compared to the surface soils with a low soil quality. In contrast, they had lower acid phosphatase, arylsulfatase, and β-glucosaminidase levels, of 36.31, 3.45, and 2.00 mg PNP/kg h, respectively. In addition, the alpha diversity indices of prokaryotes and eukaryotes indicated greater richness in samples with a high soil quality, and the genes in the discovered species were diverse, consistent with the results of physical and chemical soil quality assessments. However, the dominance indices were similar among all samples (Table 1).
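The text does not state which alpha diversity and dominance indices are reported in Table 1. As a sketch under that caveat, the code below computes two widely used ones from a vector of taxon read counts: the Shannon index, H = -Σ p ln p, and the Simpson dominance index, D = Σ p². The counts are invented for illustration and are not data from this study.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[int]) -> float:
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_dominance(counts: Sequence[int]) -> float:
    """Simpson dominance D = sum(p^2); higher values mean fewer taxa dominate."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


if __name__ == "__main__":
    even_community = [25, 25, 25, 25]   # hypothetical counts, four equal taxa
    skewed_community = [97, 1, 1, 1]    # hypothetical counts, one dominant taxon
    for name, counts in (("even", even_community), ("skewed", skewed_community)):
        print(name, round(shannon_index(counts), 3), round(simpson_dominance(counts), 3))
```

A community can gain richness and Shannon diversity while its dominance index changes little, which is one way the pattern described above (richer high-quality samples, similar dominance) can arise.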

Genomic Characteristics of Prokaryotes and Eukaryotes According to Soil Quality
In this study, biological gene diversity according to soil quality was analyzed in the topsoil of several grasslands in Korea. The grassland topsoils rated as high quality on the basis of their physical and chemical properties showed a rich and diverse array of prokaryotes and eukaryotes, whereas the low-quality soils showed the opposite (Table 1). As these results were considered to be similar to the integrated physical, chemical, and biological soil characteristics from standard indicators of soil health index suggested by Nielsen and Winding [22], the grassland topsoil was assumed to be suitable for the identification of biological markers of soil quality.

Characteristics of Prokaryotes
Twenty-six prokaryotic phyla, including Proteobacteria, were analyzed in the grassland topsoil. In high-quality soil, Proteobacteria were the most dominant (35.35%), and Acidobacteria (24.75%) were the second most dominant, followed by Actinobacteria (5.40%), Chloroflexi (4.28%), Bacteroidetes (6.10%), and Verrucomicrobia (5.45%). In low-quality soil, Proteobacteria were the most dominant (37.74%) and Acidobacteria were again the second most dominant (21.71%), followed by Actinobacteria (8.52%), Chloroflexi (8.20%), and Verrucomicrobia (5.32%). Compared with low-quality soil, high-quality soil had a higher proportion of Acidobacteria (3.04%), Thaumarchaeota (1.88%), Bacteroidetes (1.82%), and Gemmatimonadetes, Rokubacteria, and Armatimonadetes (0.50-1.00%), where the values in parentheses are the differences in relative abundance between the two soil classes. Conversely, low-quality soil had more Actinobacteria (3.12%), Chloroflexi (2.82%), Proteobacteria (2.39%), candidate phylum WPS-2 (1.04%), Patescibacteria, and others (0.50-1.00%) than high-quality soil. At the class level, δ-Proteobacteria, Blastocatellia (Subgroup 4), and Subgroup 6 were predominant in high-quality soil. In low-quality soil, α- and γ-Proteobacteria and Acidobacteria were predominant (Figure 2a). The predominant prokaryotes in the high-quality soil were Nitrospira, Polycyclovorans, Geobacter, Candidatus Nitrososphaera, Geothrix, Bacillus, and Candidatus Nitrosotalea. Nitrospira inhabits a variety of environments, including soil, ground water, and fresh water, and carries out nitrification by oxidizing ammonia and nitrite [23]. Polycyclovorans is associated with phytoplankton and can utilize hydrocarbons as an energy source [24]. Geobacter is an anaerobic bacterium with biological purification functions, oxidizing organic compounds and metals into carbon dioxide using electron acceptors [25]. Candidatus Nitrososphaera is an ammonia-oxidizing archaeon [26], and some species of the genus Geothrix are functional bacteria for iron oxidation [27].
Bacillus, which is a facultative anaerobe, is thermally resistant, by virtue of the formation of endospores [28]. Candidatus Nitrosotalea, which belongs to the phylum Thaumarchaeota and the family Nitrosopumilaceae, has nitrogen-related functions [29]. Some predominant prokaryotes in the low-quality soil were Burkholderia-Caballeronia-Paraburkholderia, Candidatus Udaeobacter, Pseudomonas brassicacearum subsp. brassicacearum, Pseudarthrobacter, Steroidobacter, Crenobacter, Bradyrhizobium, Plantactinospora, Polaromonas, Rhizobacter, Rhodanobacter, and Bryobacter. The genus Burkholderia includes plant and animal pathogens, as well as species responsible for degrading organic pesticides and polychlorinated biphenyls [30,31]. Members of the genus Caballeronia contribute to nitrogen fixation and plant growth promotion, and members of the genus Paraburkholderia inhabit tissues of plants such as pine trees, with no reported connection with human infection [32,33]. Candidatus Udaeobacter is a functional bacterium responsible for the secretion of antibiotics in the soil and the potential removal of trace gases [34]. Furthermore, P. brassicacearum subsp. brassicacearum is a plant pathogenic bacterium associated with the roots of rapeseed (Brassica napus) and with tomato [35]. The genus Pseudarthrobacter was reclassified from the genus Arthrobacter, which comprises aerobic bacteria that live in soil. The genus Steroidobacter includes species responsible for the degradation of hormones, including steroids, and for denitrification [36]. Members of the genus Crenobacter have been isolated from environments such as thermal springs and caves [9], and members of the genus Bradyrhizobium are mainly found in the soil and include species responsible for nitrification and for nitrogen fixation as root-nodule bacteria of leguminous plants. Members of the genus Plantactinospora have been isolated from plant tissues; the genus Polaromonas is psychrophilic and belongs to the family Comamonadaceae.
In addition, members of the genus Rhizobacter have been isolated from the soils of botanical gardens and have been reported to cause bacterial gall in carrots [37,38]. Members of the genus Rhodanobacter inhabit the soil; these include species responsible for denitrification and for the degradation of pesticides, including lindane [39]. Members of the genus Bryobacter have been isolated from peats containing accumulated and degraded remains of grasses and trees, and currently the genus contains Bryobacter aggregatus as the only species [40]. The potential markers identified in this study included Nitrospira and the archaea Candidatus Nitrososphaera and Candidatus Nitrosotalea, related to the nitrogen cycle, including nitrification; Geobacter, associated with biological purification; plant pathogenic bacteria such as Burkholderia, Paraburkholderia, P. brassicacearum, and Rhizobacter; Candidatus Udaeobacter, related to antibiotic secretion; and the genera Steroidobacter and Rhodanobacter, related to material degradation. This result was similar to that of previous studies, including that by Vestergaard et al. [41], in that it comprises a group that could be used to indicate soil health and quality (based on NGS metadata).
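The phylum-level comparison reported earlier in this section amounts to subtracting one relative abundance from the other. The short sketch below repeats that arithmetic for four phyla, using only the percentages quoted in the text; the variable and function names are hypothetical, and the result is a difference in percentage points, not a statistical test.

```python
# Relative abundances (%) quoted in the text for high- and low-quality soil.
HIGH_QUALITY = {
    "Proteobacteria": 35.35,
    "Acidobacteria": 24.75,
    "Actinobacteria": 5.40,
    "Verrucomicrobia": 5.45,
}
LOW_QUALITY = {
    "Proteobacteria": 37.74,
    "Acidobacteria": 21.71,
    "Actinobacteria": 8.52,
    "Verrucomicrobia": 5.32,
}


def abundance_difference(high: dict, low: dict) -> dict:
    """Percentage-point difference (high minus low) for taxa present in both."""
    return {taxon: round(high[taxon] - low[taxon], 2) for taxon in high if taxon in low}


if __name__ == "__main__":
    differences = abundance_difference(HIGH_QUALITY, LOW_QUALITY)
    # Positive values: enriched in high-quality soil; negative: in low-quality soil.
    for taxon, diff in sorted(differences.items(), key=lambda item: item[1], reverse=True):
        print(f"{taxon}: {diff:+.2f}")
```

Ranking taxa by this difference is one simple way to shortlist marker candidates before checking them at the genus or species level.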
To design a biological index for soil quality assessment, more data need to be accumulated about microbiota identified up to the genus or species level and regarding biological genes that differed according to soil quality. In this study, eukaryotes showed greater differences in rate than prokaryotes in the high-quality soil (Table 2). Among these, Penicillium, which was predominant in the high-quality soil, is commonly present in soil and air in temperate climates, degrades organic matter, and produces mycotoxins. It can affect the fruits and bulbs of plants and can be pathogenic to some organisms, including mosquitoes [42-44]. Pythium violae is a fungus imperfectus belonging to the phylum Oomycota and is pathogenic to plants, including carrots [45,46]. Other predominant eukaryotes in high-quality soil included Klebsormidium flaccidum, a filamentous green alga; Trinema of phylum Cercozoa; saprophytic Rhizopus delemar; Sorosphaerula veronicae, which inhabits plants; Acanthamoeba, which inhabits water or soil; and the order Chaetonotida of phylum Gastrotricha [47-53]. Some predominant eukaryotes in the low-quality soil were Artemisia, a genus that includes mugwort, crown daisy, tarragon, and southern trees [54]; the subclass Acari, including mites and ticks that inhabit the skin of animals such as humans [55]; and Triplonchida, an order of terrestrial nematodes [56]. Other predominant eukaryotes in the low-quality soils included Archaeorhizomyces, which is a symbiotic fungus inhabiting plant roots and the rhizosphere; Pyrenochaeta, which causes eumycetoma; and the pantropically distributed genus Vigna [57].

Discussion
Numerous studies on soil quality based on chemical measurements have been reported, but research on biological quality is limited; therefore, it is necessary to obtain basic data. In this study, we analyzed eukaryotic and prokaryotic genetic diversity related to the quality of grassland topsoil and suggested several biomarkers of soil quality. As the topsoil is the main network space where biological activities occur in the soil [1,2], it was selected as the material for this study. The quality of grassland topsoil may be relatively diverse compared with agricultural topsoil, such as fields, where the chemical-based soil quality is at a medium-high level, and with bare land, where the soil quality is low. In this study, grassland was therefore assumed to be a suitable land use type. However, as this study involved analyses of data from six sites, it might be necessary to analyze the diversity of the biomarker candidates identified in this study in additional grassland topsoil samples in the future. In addition, there is a need to monitor biomarkers according to various land uses, such as agriculture and forestry, and to evaluate biodiversity in contaminated soils. In this study, partial regions of eukaryotic 18S and prokaryotic 16S rDNA were used as the eukaryotic and prokaryotic amplicons submitted to NGS, a diversity analysis tool mainly used for biomarker discovery. It was difficult to analyze the benefits of eukaryotes for plants; eukaryotes are expected to contribute to the circulation of materials or nutrients in high-quality soils or act as plant pathogens in low-quality soil. For eukaryotic gene identification, in this study, we used the 18S universal primer mix, which is commonly used. As the identification of eukaryotes is more complicated than that of prokaryotes, it could be difficult to identify them from short genetic fragments.
In addition, the eukaryotes that showed differences according to soil quality might be suitable for monitoring the genetic diversity of eukaryotes in grassland topsoil, but they may not be ideal for the identification of genetic fragments compared to fungi. Because the inclusion of these genes could reduce the proportion of fungal reads, soil quality markers that target only fungi would require replacing the 18S region with the ITS region. As Penicillium can be commonly found in both soil and air [57], its potential as a marker would require identification at the species level. Finally, the accumulation of baseline data is thought to be needed for marker discovery. Prokaryotes have shorter genes than eukaryotes and can be easily identified using 16S rDNA. As bacteria involved in nitrification, iron oxidation, and hydrocarbon utilization were mainly present in high-quality soil, they were recognized as a symbiont-forming group, beneficial to soil microbes and plants associated with material circulation. In contrast, genes of bacterial groups that were suspected to be pathogenic to plants and animals, or responsible for antibiotic secretion and degradation of hormones and pesticides, were mainly identified in low-quality soil. Bacteria associated with pathogenicity that negatively affects plants, with anti-pathogenic compounds, and with hormones are potential genetic markers [41].
Nielsen and Winding [22] considered microorganisms as a soil health index, as they can quickly respond to the changes in the soil and, therefore, provide an excellent index for changes, are key components in material circulation and organic matter degradation, and stabilize soil aggregates with microbial cells and polysaccharides. In addition, it was implied that the amount of chemicals in the soil cannot be used as a reliable index for soil health, in relation to biological utility. Some of the suggested indicators for soil health index are (i) correlation with the ecosystem; (ii) integration of soil's physical, chemical, and biological characteristics; (iii) management at the appropriate time and response to environmental change; and (iv) compatibility with the existing soil database if applicable. This study provides baseline data for discovering biological markers that could be used to assess soil quality in the future. A wider variety of data needs to be accumulated to develop a highly reliable index associated with soil quality in the future.

Conclusions
In this study, biodiversity according to soil quality was examined in grassland topsoil. Several prokaryotes were identified as potential markers, i.e., Nitrospira spp., Candidatus Nitrososphaera, and Candidatus Nitrosotalea related to the nitrogen cycle; Geobacter spp. associated with biological purification; Burkholderia spp., Paraburkholderia spp., Pseudomonas brassicacearum, and Rhizobacter spp., which are plant pathogenic bacteria; Candidatus Udaeobacter, an antibiotic producer; and Steroidobacter spp. and Rhodanobacter spp., associated with material degradation. However, identification of candidate markers in eukaryotes requires the identification of organisms at the species level, with a focus on fungi. The accumulation of a wider variety of baseline data and further studies are required to examine the above candidate markers using the same, and different, types of soils and monitoring the markers suggested in the study according to the soil quality. In addition, surface soil was recognized as the main bioresource in this study, and therefore, it could be used for fundamental research to establish grounds for the management of soil quality, through intensive national investment and development.