Truth or Lie: Does the DNA Extraction Procedure Really Affect the Insight in Composition and Diversity of Microbial Communities in Saffron Cultivated Soils?

: The aim of this study was to evaluate the performance of two of the most commonly used commercial kits for soil DNA extraction regarding the values of the taxonomic diversity of prokaryotes and community composition of saffron (Crocus sativus) cultivated ﬁelds. The impact of the QIAGEN-DNeasy PowerSoil Kit (MO) and Macherey-Nagel™ NucleoSpin™ Soil (MN) kit was tested on the soil of an Italian western alpine experimental site located in Saint Christophe (Aosta Valley, AO). Nine biological replicas of bulk soil were collected and analyzed independently with the two kits. 16S rRNA metabarcoding was applied to characterize soil microbial communities. We ﬁrst noticed that both DNA extraction kits yielded nearly the same number of OTUs: 1284 and 1268 for MN and MO, respectively. Both kits did not differ in the alpha diversity of the samples, while they had an inﬂuence on the beta diversity. The comparative analysis of the microbial community composition displayed differences in microbial community structure depending on which kit was used. These differences were especially highlighted at Phylum and Class levels. On the other hand, the fact that, from a functional point of view, our approach did not highlight any differences allows us to state that the results obtained with the two extraction kits are comparable and interchangeable. Based on these results and those in the literature, we could undoubtedly recommend both commercial kits, especially if the soil target microorganisms are prokaryotes and the study focuses on agricultural sites.


Introduction
Environmental DNA analysis (eDNA) has been recognized as an efficient method for the description and characterization of microbial diversity, hampered for a long time by traditional microbiology methods, based on culture isolation methods [1]. Moreover, during the last decades, thanks to technological advances in DNA sequencing (high throughput sequencing) and data analysis, the detection of taxa present in an eDNA sample was successfully applied in many different environmental samples collected from terrestrial and aquatic ecosystems (e.g., soil compartments, marine, freshwater, ice) [2]. Indeed, DNA metabarcoding [3] provides a useful, cheaper, faster, more sensitive, and easily scalable for routine monitoring program approach to characterizing biodiversity of whole communities by using DNA traces of organisms for their taxonomic identification from a single sample [4]. However, together with soil sampling and DNA analyses, soil DNA extraction remains a critical issue, especially in environmental microbiology studies where soil biodiversity is very often reported as one of the most challenging to be investigated, despite its importance in regulating multiple ecosystem services such as nutrient cycling, organic matter decomposition, plant productivity and pathogen control [5]. Studies related to soil biodiversity have continuously gained importance due to its significant interlinkages with many other areas, such as agriculture and climate change in order to understand how land management and agricultural practices could affect or preserve soil biodiversity and functioning [6]. Regarding this field of investigation, over the last decade, a growing number of articles have focused on testing the metabarcoding effectiveness, and in particular its ability to quantify biodiversity and detect species present in agroecosystems [7,8]. However, while some studies compared, in the same environment, the diversity of organisms assessed with traditional methods versus DNA metabarcoding [9], not many others compared and documented for the same site/soil, the performance and results obtained after using different soil DNA extraction kits [10,11]. Such comparisons are needed to inform the scientific communities about the specificities and limitations of each approach and to allow us to define the best strategies for current and future biomonitoring programs such as, for example, those adopted by Long Term Observatories (LTOs) sites, sampled for soil biodiversity indicators (known as the indicator LTOs) [6]. Understanding the identity, ecology, and physiology of those microbial taxa, which partially exhibit hidden talent to overcome environmental-related problems like pollution and soil degradation, could be crucial to sustain plant healthy growth and soil fertility and to develop novel less impacting agriculture strategies. For these reasons an accurate selection of the optimal method is of high importance, and it can influence the results and the interpretation of the following analyses [12]. According to previous studies related to this important issue, the aim of this technical study was to evaluate the ability of two of the most commonly used commercial kits for soil DNA extraction to analyze the taxonomic diversity and community composition of prokaryotes in one saffron (Crocus sativus) cultivated field. The impact of the QIAGEN-DNeasy PowerSoil Kit (MO) and Macherey-Nagel™ NucleoSpin™ Soil (MN) kit was tested on the soil of an Italian western alpine experimental site located in Saint Christophe (Aosta Valley, AO, Italy) in order to establish if 16S rDNA Next Generation Sequencing (NGS) libraries, created on DNA isolated with the two kits, will produce reads representing the same or diverse prokaryotes communities. The diversity and community composition of soil bacteria and archaea were measured via amplicon sequencing using the Illumina MiSeq platform, following protocols endorsed by the Earth Microbiome Project (EMP) (http://earthmicrobiome.org/ accessed on 3 June 2022), in order to match our data to previous and future EU initiatives (i.e., EcoFINDERS; EUdaphobase; JRC/LUCAS; Soil BON; EJP_SOIL), devoted to mapping current and future status and trends in soil biodiversity across EU agroecosystems.

Sampling Site
One Italian western alpine experimental site, located in the municipality of Saint Christophe (45 • 45 06.9 N; 7 • 20 37.0 E; 700 m a.s.l.) (Aosta Valley, Italy) and cultivated with saffron (Crocus sativus) for at least the previous three years, was selected for our analyses [13,14]. Nine soil samples were collected at 10 cm depth, 1 m away from each other, and kept in zip lock bags at 4 • C before being processed. Soil samples were then sieved by means of a 2 mm stainless steel sieve mesh split into two 2 mL eppendorf tubes and then frozen at −20 • C.

Soil DNA Extraction, PCR Amplification, and Sequencing
All the steps described below were carried out by the same researcher in order to avoid additional variation in the DNA extraction, PCR products amplification, purification, and quantification. Total soil DNA has been extracted in double, from the 9 samples using two different commercial kits: DNeasy PowerSoil Kit, (formerly sold by MOBIO as PowerSoil DNA Isolation Kit) (Qiagen, Hilden, Germany), and Macherey-Nagel™ Nu-cleoSpin™ Soil (MN) (Düren, Germany). Extractions were carried from 250 mg of soil samples, following the manufacturer's protocols. The obtained DNA were then quantified by means of Qubit Fluorometer 2.0 (Thermo Fisher Scientific (Waltham, MA, USA), Invitrogen (Waltham, MA, USA)) following manufacturer's protocol, and then 20 ng per sample of bacterial 16S rDNA were then amplified using the primer set 515fB (GT-GYCAGCMGCCGCGGTAA) [15] and 806rB (GGACTACNVGGGTWTCTAAT) [16] with the addition of the following Illumina overhang adapter sequences: forward overhang: 5 -TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-[locus specific target primer], reverse overhang: 5 -GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-[locus specific target primer] in order to target 291 bp of the V3-V4 hypervariable region and have only a few biases against any bacterial taxa [17][18][19]. PCR reaction mixes were made using Invitrogen Platinum HotStart PCR Master Mix (Thermo Fisher Scientific) with the following PCR conditions: an initial step at 94 • C for 3 min, 30 cycles at 94 • C for 45 s, 57 • C for 45 s, 72 • C for 60 s and a final extension step of 72 • C for 10 min. DNA extracted was amplified in triplicate and pooled prior to the purification using the Wizard ® SV Gel and PCR Clean-Up System (Promega). A final number of 18 PCR purified products were quantified with Qubit dsDNA BR Assay kit and Qubit Fluorometer 2.0 following manufacturer's protocol and sent for Illumina MiSeq sequencing (2 × 250 bp) to IGA technologies (Udine, Italy).

Bioinformatic and Data Analysis
Due to the low quality of reverse sequences, only forward sequences were analyzed by means of the microbiome bioinformatics platform QIIME2 (Quantitative Insights Into Microbial Ecology 2, version 2021.2 [20]).
Denoising and quality control, including removal of chimeras, were achieved by means of the DADA2 plugin [21]. To avoid low-quality sequences, reads were truncated at 290 bp. The OTU table was generated by means of qiime vsearch cluster-features-de-novo plugin using 97% as the identity threshold.
The resulting OTU table was filtered and only sequences with a frequency ≥10 were retained. The taxonomic assignment was achieved using, as reference database, the SILVA Release 138.1 [22].
The generated dataset, including OTU table, taxonomy table, phylogenetic tree from Qiime2, and metadata, was then imported into Rstudio [23] and was used to create a phyloseq object by means of the R package qiime2R [24]. The R packages phyloseq version 1.36.0 [25], ggplot2 version 3.1.0 [26] and vegan version 2.5-4 [27] were employed for all the following analyses. The OTU table was rarefied at 25,338 sequences per sample, by means of the rarefy_even_depth function of the R package phyloseq. Rarefaction curves were obtained by means of the function rarecurve of the R package vegan. The dataset generated for this study can be found in the NCBI Sequence Read Archive (SRA-NCBI; https://submit.ncbi.nlm.nih.gov/subs/sra/SUB11366953, accessed on 3 July 2022) under project accession number PRJNA830672.
Biodiversity analyses were carried out by comparing the richness (number of species) and evenness (richness taking into account relative abundances) of bacterial communities. Within-sample (alpha) diversity was assessed by three estimators: "observed fungal species," "Chao1," and "Shannon". Alpha diversity indexes were then plotted using the R package phyloseq, bar plots and ordination plot (NMDS) were obtained using the R packages phyloseq, ggplot2.
To evaluate the presence of significant differences (p < 0.05) between alpha diversity indexes, the adonis function from R package vegan was used to perform PERMANOVA tests using dissimilarity index of Bray-Curtis. Permutation test for homogeneity of multivariate dispersions was also run in order to check the validity of the previous tests by means of the function betadisper of the R package vegan. Moreover, to investigate if DNA Extraction kits influence the composition of the bacterial communities, the presence of significant differences between the taxonomic distribution and trophic level were evaluated by means of the software Past 4 [28], using the test Anova, Mann-Whitney pairwise comparison; Bonferroni corrected (p < 0.05). The evaluation of significant differences in terms of the number of retrieved OTUs was evaluated by means of a Chi-square test. Venn diagram was obtained by means of Venny2.1.0 software [29].
Trophic behavior was evaluated by means of FAPROTAX: Functional Annotation of Prokaryotic Taxa Version: 1.0 [30].
When possible, we have classified the sequences obtained at the Family level, because this is the taxonomic rank required for the use of FAPROTAX, all the other analyses were conducted at Kingdom, Phylum, and Class levels.

Results
After the bioinformatic analysis, 718,848 high-quality sequences (out of a total of 1,517,997 raw sequences) were obtained, and following subsampling, at even sequencing depth, 25,338 sequences for each sample were retained. Sequences were then clustered in 1925 OTUs. The rarefaction effects on the OTU table have been rendered by means of the rarefaction curves, in order to graphically estimate species richness. This analysis highlighted that all samples are well described, as a matter of fact, the curve ascribed to each sample reaches its plateau ( Figure S1).
Our results showed that the two adopted DNA extraction kits yielded nearly the same number of OTUs, 1284 and 1268 for MN and MO, respectively (Chi-square test p > 0.05), 657 OTUs were exclusively retrieved using MN, 641 using MO, and 627 were shared between the two kits ( Figure 1). In more detail, the 627 shared OTUs account for 84.7% of retrieved sequences (on average 617 sequences per OTU) while the 7.98% (on average 51 sequences per OTU), and the 7.30% (on average 58 sequences per OTU) were exclusively retrieved with MO and MN DNA extraction kits, respectively.
To evaluate the presence of significant differences (p < 0.05) between alpha diversity indexes, the adonis function from R package vegan was used to perform PERMANOVA tests using dissimilarity index of Bray-Curtis. Permutation test for homogeneity of multivariate dispersions was also run in order to check the validity of the previous tests by means of the function betadisper of the R package vegan. Moreover, to investigate if DNA Extraction kits influence the composition of the bacterial communities, the presence of significant differences between the taxonomic distribution and trophic level were evaluated by means of the software Past 4 [28], using the test Anova, Mann-Whitney pairwise comparison; Bonferroni corrected (p < 0.05). The evaluation of significant differences in terms of the number of retrieved OTUs was evaluated by means of a Chisquare test. Venn diagram was obtained by means of Venny2.1.0 software [29].
Trophic behavior was evaluated by means of FAPROTAX: Functional Annotation of Prokaryotic Taxa Version: 1.0 [30].
When possible, we have classified the sequences obtained at the Family level, because this is the taxonomic rank required for the use of FAPROTAX, all the other analyses were conducted at Kingdom, Phylum, and Class levels.

Results
After the bioinformatic analysis, 718,848 high-quality sequences (out of a total of 1,517,997 raw sequences) were obtained, and following subsampling, at even sequencing depth, 25,338 sequences for each sample were retained. Sequences were then clustered in 1925 OTUs. The rarefaction effects on the OTU table have been rendered by means of the rarefaction curves, in order to graphically estimate species richness. This analysis highlighted that all samples are well described, as a matter of fact, the curve ascribed to each sample reaches its plateau ( Figure S1).
Our results showed that the two adopted DNA extraction kits yielded nearly the same number of OTUs, 1284 and 1268 for MN and MO, respectively (Chi-square test p > 0.05), 657 OTUs were exclusively retrieved using MN, 641 using MO, and 627 were shared between the two kits ( Figure 1). In more detail, the 627 shared OTUs account for 84.7% of retrieved sequences (on average 617 sequences per OTU) while the 7.98% (on average 51 sequences per OTU), and the 7.30% (on average 58 sequences per OTU) were exclusively retrieved with MO and MN DNA extraction kits, respectively. Our findings reveal also that the two DNA extraction kits did not affect the alpha diversity of the samples (Figure 2A, anova p > 0.05), while they had an influence on the beta diversity (PERMANOVA; p ≤ 0.025, Bray-Curtis) ( Figure 2B). PERMANOVA analy- Our findings reveal also that the two DNA extraction kits did not affect the alpha diversity of the samples (Figure 2A, anova p > 0.05), while they had an influence on the beta diversity (PERMANOVA; p ≤ 0.025, Bray-Curtis) ( Figure 2B). PERMANOVA analysis was supported by a non-significant result in permutest tests that have revealed an homogeneity of multivariate dispersions. At the Kingdom-level ( Figure S2) we didn't observe any statistically significant difference between the two kits (Archea 4% MN; 3.5% MO; Bacteria 96% MN; 96.5% MO;), while at the Phylum-level, the taxonomic analysis showed that not the same Phyla were found with both kits (Figure 3; Table S1), as a matter of fact with MO, we didn't detect Dependentiae, FCPU426, Thermoplasmatota, and Zixibacteria, that were instead retrieved in samples extracted with MN even if with a really low abundance.  At the Kingdom-level ( Figure S2) we didn't observe any statistically significant difference between the two kits (Archea 4% MN; 3.5% MO; Bacteria 96% MN; 96.5% MO), while at the Phylum-level, the taxonomic analysis showed that not the same Phyla were found with both kits (Figure 3; Table S1), as a matter of fact with MO, we didn't detect Dependentiae, FCPU426, Thermoplasmatota, and Zixibacteria, that were instead retrieved in samples extracted with MN even if with a really low abundance. At the Kingdom-level ( Figure S2) we didn't observe any statistically significant d ference between the two kits (Archea 4% MN; 3.5% MO; Bacteria 96% MN; 96.5% MO while at the Phylum-level, the taxonomic analysis showed that not the same Phyla we found with both kits (Figure 3; Table S1), as a matter of fact with MO, we didn't dete Dependentiae, FCPU426, Thermoplasmatota, and Zixibacteria, that were instead r trieved in samples extracted with MN even if with a really low abundance.  Significant differences in the abundance of taxonomic groups between the two kits were observed for four Phyla out of thirty-six ( Figure 3; Table S1), and in more detail, Firmicutes, Patescibacteria, Sumerlaeota, and Thermoplasmatota were statistically more abundant in the samples extracted with MN (Anova, Mann-Whitney pairwise comparison; Bonferroni corrected (p < 0.05).
At Class-level, the most represented Classes are Alphaproteobacteria ( Table S2). Our result highlighted that the two adopted kits have had an influence on the retrieved bacterial communities at this taxonomic rank. In more detail, Bacilli, Chlostridia, Desulfuromonadia Gracilibacteria JG30-KF-CM66, and MB-A2-108 were statistically more abundant in samples extracted with MN (3.3% of total abundance) while Blastocatellia, Hydrogenedentia, Longimicrobia, OLB14, Phycisphaerae, and Planctomycetes were statistically more abundant in samples extracted with MO (15.39% of total abundance). Furthermore, eleven Classes out of the ninety-one retrieved were not detected using the MO kit (Zixibacteria, Babeliae, FCPU426, AT-s3-28, Microgenomatia, Thermacetogenia, Latescibacteria Limnochordia Subgroup_11, Subgroup_18, and Thermoplasmata) while only one was not detected using MN kit (Lineage IIc). Finally, concerning the taxonomic assignment, it is noteworthy that only a small percentage of sequences were not assigned either to Phylum (0.5% MN; 0.3% MO) or Class (0.6% MN; 0.4% MO) independently of the extraction kit (Tables S1 and S2). Regarding the trophic assignment of the two recovered bacterial communities, we did not find any statistical differences between those obtained from MN and MO (Figure 4).

Discussion
In this work, we have evaluated the influence of two of the most commonly used commercial kits for soil DNA extraction on the taxonomic diversity of prokaryotes and

Discussion
In this work, we have evaluated the influence of two of the most commonly used commercial kits for soil DNA extraction on the taxonomic diversity of prokaryotes and community composition of saffron (Crocus sativus) cultivated fields. We have investigated prokaryotic communities by means of an Illumina metabarcoding approach targeting 16S rDNA. In this study, we observed a high number of good quality reads, 718,848 which were clustered in 1925 OTUs. Concerning the number of OTUs, we have obtained roughly the same number of OTUs with the two extraction kits, but interestingly, only about a third of OTUs were retrieved with both. However, it is important to note that the variability introduced by the two adopted kits has an influence mainly on the recovery of sporadic/rare species, as a matter of fact, this third of OTUs in common represents the majority of the sequences found, i.e., 84.7%.
Only a small percentage of sequences were not assigned to Phylum or Class independently of the extraction kit. These findings allowed us to obtain an exhaustive picture of the microbial community structure with both extraction kits. Moreover, the alpha diversity analysis confirmed a similar diversity in the obtained populations. Our findings are partially in agreement with those obtained by Zielińska and coworkers [11], who demonstrated, by means of a metabarcoding approach, the existence of differences in microbial community structure depending on which kit was used. As a matter of fact, our results have highlighted that the abundance of different Phyla or Class in the microbial structure is significantly different. In our case, however, these differences are seen mostly among the less abundant Phyla or Class and thus had no influence on the core microbiome.
As stated by Changey and coworkers [31], during a long-term experiment, microbial DNA extraction, in conjunction with soil sampling and DNA data analysis, is one of the crucial issues in soil microbial community evaluation. These authors conclude that the DNA extraction methods substantially modify the concentration and quality of eDNA, and the abundance and structure of microbial communities, and this is particularly true for archeal and fungal communities while having less impact on bacterial communities. A precise comparison of our results with those obtained by Changey and coworkers [31], is impeded due to the fact that to evaluate bacterial communities, we have used an Illumina metabarcoding approach, while the aforementioned authors used qPCR and molecular fingerprint. Taking into account this discrepancy, we can, however, conclude that our results are in agreement with those mentioned above.
Another important question, reported by Soliman et al. [12], is represented by the handling bias inherent in the entire sample handling procedure. To minimize these biases, our work was entirely conducted by a single researcher. However, as demonstrated by our results, even though we can avoid or minimize handling bias, the effect of DNA extraction kits is not always predictable as it is sometimes influenced by particular soil management, which modifies some soil properties and reduces the efficiency of DNA extraction. For example, the addition of Biochar, which heavily modifies the structure and composition of organic C amounts, has been demonstrated to affect DNA extraction efficiency from soil [32]. On the other hand, we agree with Soliman et al. [12], who reported that it is unrealistic to expect research groups all over the world to accept a strict standard that does not allow using commercial extraction kits of their own choice.

Conclusions
Finally, in our opinion, classification at the taxonomic level alone is not sufficient to throw full light on the effects of extraction kits on the composition of prokaryotic communities. As a matter of fact, even if we have seen some taxonomic differences between the two kits, the use of FAPROTAX (Predictive software, for the Functional Annotation of Prokaryotic Taxa) has demonstrated that the trophic assignment of the two recovered bacterial communities did not display any statistical differences, allowing us to show that the results obtained with the two extraction kits are comparable and interchangeable.

Data Availability Statement:
The dataset generated for this study can be found in the NCBI Sequence Read Archive (SRA-NCBI; https://submit.ncbi.nlm.nih.gov/subs/sra/SUB11366953, accessed on 3 July 2022) under project accession number PRJNA830672.