Taxonomic Binning Approaches and Functional Characteristics of the Microbial Community during the Anaerobic Digestion of Hydrolyzed Corncob

Maize forms the basis of Mexican food. As a result, approximately six million tons of corncob are produced each year, which represents an environmental issue as well as a potential feedstock for biogas production. This research aimed to analyze the taxonomic and functional shift in the microbiome of the fermenters using a whole metagenome shotgun approach. Two strategies were used to understand the microbial community at the beginning and the end of anaerobic digestion: (i) phylogenetic analysis to infer the presence and coverage of clade-specific markers to assign taxonomy and (ii) the recovery of individual genomes from the samples by binning the assembled scaffolds. The results showed that anaerobic digestion brought noticeable changes and that the main microbial community was composed of Corynebacterium variable, Desulfovibrio desulfuricans, Vibrio furnissii, Shewanella spp., Actinoplanes spp., Pseudoxanthomonas spp., Saccharomonospora azurea, Agromyces spp., Serinicoccus spp., Cellulomonas spp., Pseudonocardia spp., Rhodococcus rhodochrous, Sphingobacterium spp., Methanosarcina mazei, Methanoculleus hydrogenitrophicus, Methanosphaerula spp., Methanoregula spp., Methanosaeta spp. and Methanospirillum spp. This study provides evidence of the drastic change in the microbial community structure in a short time and of the functional strategy that the most representative microorganisms of the consortia used to carry out the process.


Introduction
Fossil fuels have played an essential role in the development of the world's economies. However, their use also has negative impacts, such as the release of carbon dioxide, methane, and nitrous oxide into the atmosphere [1]. All of these are greenhouse gases responsible for global warming and the climate change it induces [2]. As a consequence of climate change, the polar ice sheets are melting, sea levels are rising, and extreme weather events are causing crop losses around the world [3,4]. Therefore, a global energy transition toward renewable energies, such as biomass, is needed [5][6][7].
The production of biogas from biomass such as wastewater, peels, or grain and starch crops (for example, corncob) is a cost-effective process to generate energy and to reduce greenhouse gas emissions [8]. Biogas is a promising clean energy source with several advantages over other technologies: it can use the existing natural gas infrastructure, makes use of byproducts, generates organic fertilizer, and can be implemented on either small or large scales [9,10]. To improve biogas production, biomass is subjected to pre-treatment methods that can be chemical, physical, or biological [11][12][13]. The alkaline organosolv pre-treatment is widely used in bioethanol production [14]. However, due to promising results in the at room temperature (25 °C), to reduce humidity, and cut into smaller pieces. The size of the corncob is reduced to increase the surface area and to allow the organic solvents to better penetrate and separate the components of this raw material [14]. A particle size of 2 mm was used, as recommended by other authors [31]. The corncob was characterized using the Van Soest fiber method to obtain the fractions of cellulose, hemicellulose, and lignin [32].

Alkaline Organosolv Pre-Treatment
The alkaline organosolv pre-treatment of the corncob was carried out in a high-pressure reactor. The dried corncob residue (400 g) was placed in the reactor with 1.9 L of water, 2 L of ethanol, and 200 mL of acetic acid for 2.5 h at 175 °C. When the reaction was complete and the reactor had cooled down to ambient temperature, the mixture in the reactor was poured out. The mixture was then filtered to separate the solid residue from the liquid phase. The remaining residue was dried overnight in an oven at 100 °C before weighing [15].
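As a quick check of the loading implied by these quantities, the following sketch computes the solid-to-liquid ratio and the ethanol volume fraction of the reactor charge. It assumes that the liquid volumes are simply additive (the volume contraction of ethanol-water mixtures is ignored); the variable names are illustrative.

```python
# Quantities reported for the pre-treatment charge.
corncob_g = 400.0
water_l = 1.9
ethanol_l = 2.0
acetic_acid_l = 0.2  # 200 mL

# Assumption: volumes are additive (ethanol-water contraction is ignored).
total_liquid_l = water_l + ethanol_l + acetic_acid_l

solid_loading_g_per_l = corncob_g / total_liquid_l
ethanol_fraction = ethanol_l / total_liquid_l

print(f"Total liquid: {total_liquid_l:.1f} L")
print(f"Solid loading: {solid_loading_g_per_l:.1f} g/L")
print(f"Ethanol fraction: {ethanol_fraction:.1%} (v/v)")
```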

Batch Anaerobic Digestion Experiments
The inoculum for the anaerobic digestion was taken from cow manure, mixed with soil and water in equal proportions, and then 10 g VSS/L was taken to inoculate the bioreactors. The inoculum was evaluated for organic composition parameters, such as the total solids (TS), volatile suspended solids (VSS) and pH [23]. Batch anaerobic digestion experiments were carried out in three bottles with a liquid volume of 300 mL, at a concentration of 10 g chemical oxygen demand (COD)/L for 15 days. The pH of the hydrolysate was adjusted to 7.5 before use and the batch reactor was flushed with nitrogen gas to guarantee the anaerobic conditions. All experiments were performed at 35 °C with constant agitation and in triplicate. Additionally, a control test without the addition of corncob hydrolysate was carried out. To analyze the microbial community changes at the end of the anaerobic digestion, the samples were taken from the batch reactor at the beginning of the process (inoculum plus hydrolysate) and at the end of the anaerobic digestion (microbial community plus digestate).
The accumulated biogas production was measured by the liquid displacement method with an alkaline solution (NaOH) for 15 days (Figure 1). To determine the effectiveness of the anaerobic digestion process, biogas production was quantified both as a daily rate (mL/d) and as a specific yield per unit of raw material, normalized to mL per gram of volatile solids (NmL/g VS). The biogas samples were analyzed with an Agilent Technologies 6820 GC gas chromatograph equipped with a DB-35 capillary column to determine the gases present and their percentages.
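Reporting the yield in NmL/g VS implies converting the measured gas volume to normal conditions. A minimal sketch of that conversion follows, assuming normal conditions of 273.15 K and 1013.25 hPa, dry gas, and ideal-gas behaviour. The function names and the example values are hypothetical and are not measurements from this study.

```python
def normalize_volume_ml(volume_ml, temp_c, pressure_hpa=1013.25):
    """Convert a gas volume to normal conditions (0 degC, 1013.25 hPa).

    Assumes ideal-gas behaviour and dry gas (no water-vapour correction).
    """
    t_normal_k = 273.15
    p_normal_hpa = 1013.25
    temp_k = temp_c + 273.15
    return volume_ml * (t_normal_k / temp_k) * (pressure_hpa / p_normal_hpa)


def specific_yield_nml_per_g_vs(volume_ml, temp_c, vs_added_g, pressure_hpa=1013.25):
    """Normalized gas volume per gram of volatile solids added."""
    return normalize_volume_ml(volume_ml, temp_c, pressure_hpa) / vs_added_g


# Hypothetical example: 1000 mL displaced at 35 degC from 2 g VS.
example = specific_yield_nml_per_g_vs(1000.0, 35.0, 2.0)
print(f"{example:.1f} NmL/g VS")
```

If the displaced gas is saturated with water vapour, a vapour-pressure correction would also be needed; it is omitted here to keep the sketch short.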

Hydrolysate and Digestate Characterization
The hydrolysate and digestate were characterized in terms of: (i) total sugars by the colorimetric method of Dubois [33]; (ii) inhibitory compounds, such as furfural, by high performance liquid chromatography (HPLC) [34]; (iii) chemical oxygen demand (COD) by the colorimetric method according to Standard Methods 5220D [35]; and (iv) volatile fatty acids (VFAs) such as acetate, propionate, and butyrate by HPLC [23].

DNA Extraction and Sequencing
Samples were taken from all three bioreactors and, after careful mixing, pooled into a composite sample for DNA extraction. DNA was extracted from the pooled sample using a DNeasy PowerSoil Kit (QIAGEN, Hilden, Germany). Afterwards, an Illumina library was prepared from total DNA using the TruSeq DNA PCR-Free kit (Illumina, Inc., San Diego, CA, USA) to obtain an average fragment size of 500 bp. A NextSeq500 (Illumina, Inc., San Diego, CA, USA) was used to sequence the samples with a 150 cycle configuration, generating paired-end reads with a length of 75 bp.

Bioinformatics Analysis
First, quality control was carried out with FastQC, a quality control tool for high-throughput sequence data [36]; low-quality sequences were removed using an in-house Perl script. The phylogenetic analysis was carried out with MetaPhlAn (metagenomic phylogenetic analysis) [37]. The phylogenetic data were plotted with GraPhlAn, a tool for high-quality circular representations of taxonomic and phylogenetic trees [38]. MEGAHIT v1.1.3 was used to assemble the raw reads [39], and gene prediction and annotation were performed with Prodigal and DIAMOND [40], respectively. All samples were binned separately using MaxBin2 [41] and MetaBAT [42], and the predictions were combined using DAS Tool [43].
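The criteria applied by the in-house Perl script are not given. Purely as an illustration of read-level quality filtering, the sketch below drops FASTQ records whose mean Phred score falls under a threshold. It is written in Python rather than Perl, and the threshold of 20, the Phred+33 encoding, and the function names are all assumptions.

```python
import io


def mean_phred(quality_line, offset=33):
    """Mean Phred score of one FASTQ quality string (Phred+33 assumed)."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores) if scores else 0.0


def filter_fastq(handle, min_mean_quality=20.0):
    """Yield (header, sequence, quality) for reads passing the threshold.

    Assumes the simple four-line FASTQ layout with no wrapped lines.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+'
        quality = handle.readline().rstrip("\n")
        if mean_phred(quality) >= min_mean_quality:
            yield header, sequence, quality


# Two toy reads: 'I' encodes Phred 40, '#' encodes Phred 2.
toy = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n####\n")
kept = list(filter_fastq(toy))
print([header for header, _, _ in kept])
```

For paired-end data such as these, both mates of a pair would have to be kept or discarded together so that the two files stay synchronized.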
The proteins identified in the samples were searched against the Kyoto Encyclopedia of Genes and Genomes (KEGG). The KEGG Orthology (KO) numbers were submitted to the KEGG Mapper website to identify the related pathways. The raw reads from the whole metagenome sequencing were deposited at the National Center for Biotechnology Information database under the BioProject number PRJNA602505.
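The KO-to-pathway step can be pictured as a lookup followed by a tally. The sketch below is only illustrative: the small mapping table is hand-written for the example rather than retrieved from KEGG (the identifiers should be checked against KEGG before any reuse), and the KO lists for the two sampling points are hypothetical.

```python
from collections import Counter

# Hand-written, partial mapping for illustration only; a real analysis would
# take KO-to-pathway links from KEGG itself.
KO_TO_PATHWAYS = {
    # pyruvate kinase
    "K00873": ["Glycolysis / Gluconeogenesis", "Pyruvate metabolism"],
    # glucokinase
    "K00845": ["Glycolysis / Gluconeogenesis", "Starch and sucrose metabolism"],
    # methyl-coenzyme M reductase, alpha subunit
    "K00399": ["Methane metabolism"],
}


def pathway_counts(ko_hits):
    """Count how many annotated genes fall in each pathway.

    ko_hits: iterable of KO identifiers, one per annotated gene.
    KOs missing from the mapping are skipped.
    """
    counts = Counter()
    for ko in ko_hits:
        for pathway in KO_TO_PATHWAYS.get(ko, []):
            counts[pathway] += 1
    return counts


# Hypothetical annotations for the two sampling points.
start_hits = ["K00873", "K00845", "K00845"]
end_hits = ["K00873", "K00399", "K00399", "K00399"]

start_counts = pathway_counts(start_hits)
end_counts = pathway_counts(end_hits)
for pathway in sorted(set(start_counts) | set(end_counts)):
    print(pathway, start_counts[pathway], end_counts[pathway])
```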

Statistical Analysis
All experiments were carried out in triplicate and the statistical package used was Statgraphics Centurion XVII software. A significance level of p = 0.05 was used for all statistical tests.

Anaerobic Digester Performance of Corncob
The corncob used had the following composition: 45% cellulose, 34% hemicellulose, 16% lignin, and 5% extractives. In contrast, the pretreated corncob had 22% cellulose, 72% hemicellulose, 5% lignin, and 2% extractives. The pre-treatment thus yielded a higher fraction of hemicellulose, a polymer consisting of monomers including glucose, mannose, xylose, and arabinose. The hydrolysate had a higher concentration of total sugars (~158 g/L) than the levels obtained using acid or enzymatic pre-treatments, which reached around 20 and 120 g/L, respectively [23,44]. The monosaccharides detected were glucose (6.23 g/L), xylose (4.12 g/L), galactose (2.42 g/L), and mannose (1.66 g/L) (Table 1). Acetic acid (1.28 g/L), propionic acid (0.29 g/L), butyric acid (0.41 g/L), and furfural (0.02 g/L) were also present; at higher concentrations, these compounds would inhibit the anaerobic digestion [23,45]. The anaerobic digestion of the hydrolysate was evaluated in triplicate for 15 days, until the methane production was exhausted. The methane production was 483 ± 12 NmL CH4/g VS. The digestate showed 85% COD removal and 1.35 ± 0.22 g/L of total residual sugars, as well as the following monosaccharides: glucose (1.66 ± 0.45 g/L), xylose (1.12 ± 0.19 g/L), and mannose (0.38 ± 0.051 g/L). Regarding volatile fatty acids, all of them increased, whereas the quantity of furfural decreased (0.01 ± 0.006 g/L) (Table 1). The alkaline organosolv pre-treatment gave a higher methane production than the control test without organosolv pre-treatment (120 ± 8 NmL CH4/g VS).
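The reported figures can be related with simple arithmetic. The sketch below derives the ratio of the methane yield to that of the control and the residual COD implied by 85% removal, assuming that the initial concentration was the 10 g COD/L stated for the batch experiments. Uncertainties are ignored.

```python
# Values reported in the text.
yield_pretreated = 483.0     # NmL CH4/g VS, organosolv hydrolysate
yield_control = 120.0        # NmL CH4/g VS, control (as reported)
cod_removal = 0.85           # fraction of COD removed

# Assumption: initial COD equals the batch set-up value of 10 g COD/L.
cod_initial_g_per_l = 10.0

fold_increase = yield_pretreated / yield_control
cod_residual_g_per_l = cod_initial_g_per_l * (1.0 - cod_removal)

print(f"Methane yield ratio: {fold_increase:.2f}x")
print(f"Residual COD: {cod_residual_g_per_l:.1f} g/L")
```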
The composition of the biogas produced by the anaerobic digestion of the hydrolysate was 72.80% CH4, 26.18% CO2, and 1.02% other gases. This confirms the results previously reported by the authors [15]. The higher level of hemicellulose may have been a stimulus to the microorganisms, due to more available carbon and energy sources [46].

The Microbial Community at the Beginning and after the Anaerobic Digestion of Corncob Hydrolysate
The first strategy to understand the microbial community that carried out the anaerobic digestion was metagenomic phylogenetic analysis (MetaPhlAn), a method for characterizing whole metagenome shotgun samples. This computational package infers the presence and read coverage of clade-specific markers to assign taxonomy (~184 markers for each microorganism) [47]. The phylogenetic analysis carried out by MetaPhlAn assigned 115 different genera (Figure 2).
The second strategy for understanding the microbial community was to recover individual genomes from the samples by binning the assembled scaffolds. However, only genomes with a high abundance were binned, with high coverage levels of 90-99%, as shown in Table 2. The reconstructed complete genomes of individual microbes in the community allowed us to understand their individual contributions and their ecological functions in specific metabolic pathways, as well as to reconstruct the community interaction networks [30,48]. The results of the individual genome reconstruction were consistent with the data obtained using MetaPhlAn2 (Table 2).
The microbial community's composition at the beginning of the process reflects the environment that it comes from, as has been demonstrated by several studies [18,41,49]. Furthermore, the inoculum represents the metabolic capacity to develop the process [50]. The leading members of the inoculum community were Streptomyces spp., Acinetobacter spp., Klebsiella oxytoca, Desulfovibrio desulfuricans, Klebsiella pneumoniae, Vibrio furnissii, Shewanella spp., Enterobacter cloacae, Pseudomonas stutzeri, Corynebacterium variable, Acinetobacter venetianus, Saccharomonospora azurea, Escherichia spp., Methanosarcina mazei and Methanoculleus hydrogenitrophicus (Figure 2).
As can be seen, the inoculum microbial community was mainly composed of Proteobacteria, Bacteroidetes and Actinobacteria, as has been previously reported for inocula from manure [24], a wastewater treatment plant [51] and a mesophilic anaerobic digester plant [52]. Proteobacteria include ecologically relevant groups capable of participating in the hydrolysis and acidogenesis/acetogenesis stages due to their metabolic capacities [29]. Members of Bacteroidetes are actively involved in the degradation of lignocellulosic biomass residues [53], and Actinobacteria, represented by genera such as Corynebacterium or Propionimicrobium, are microorganisms with heterogeneous metabolic capacities, which make them suitable to participate in all stages of anaerobic digestion [13,54].
The microbial community structure is determined by physicochemical parameters such as temperature, pH, and levels of fatty acids [50,55,56]. At the end of the process, the community was enriched with Corynebacterium variable, a facultative anaerobic bacterium that plays a vital role in the decomposition of organic material, such as cellulose, fats, and organic acids [57,58]. Other representative taxa found were Desulfovibrio desulfuricans, Vibrio furnissii, Shewanella spp., Actinoplanes spp., Pseudoxanthomonas spp., Saccharomonospora azurea, Agromyces spp., Serinicoccus spp., Cellulomonas spp., Pseudonocardia spp., Rhodococcus rhodochrous and Sphingobacterium spp. The enriched archaeal community mainly comprised Methanosarcina mazei, followed by Methanoculleus hydrogenitrophicus, Methanosphaerula spp., Methanoregula spp., Methanosaeta spp. and Methanospirillum spp. Many studies have reported the abundance of Methanosarcina mazei in biogas production [13] as a result of its ability to use hydrogen, acetate, methanol, and other methylated C1 compounds for methanogenic metabolism [59]. Methanoculleus hydrogenitrophicus, Methanoregula spp., and Methanosphaerula spp. are hydrogenotrophic methanogens that are plentiful in several biogas reactors [28,60,61]. Methanosaeta spp. are acetoclastic methanogens able to grow at low acetate concentrations (7-70 µM) [13,62]. This set of archaea shows a community structure with hydrogenotrophic members. Maintaining very low H2 levels promotes methane production and allows diverse acetoclastic archaea to carry out the oxidation of acetate formed from other VFAs (such as propionate and butyrate), which keeps the process balanced [62,63]. The archaea structure, with a high abundance of

The Community Interaction Networks and Their Potential Functional Profile
The functional annotation results are summarized in Figure 2 (outer ring). The analysis focused on the abundance of KEGG-predicted pathways in lipid, carbohydrate, and energy metabolism because these gene profiles are related to cellular functions relevant to the bioprocess [50,65]. There were differences in lipid metabolism, associated with an overrepresentation of glycerolipid metabolism during the anaerobic digestion, due to an increase in energy production and the need for glycerol 3-phosphate as an intermediate for the carbohydrate and lipid metabolic pathways [66]. Carbohydrate metabolism was overrepresented during the anaerobic digestion, mainly in fructose/mannose metabolism, where fructose from the hydrolysate biomass and fructose-6-phosphate from glycolysis were transformed to mannose, to be used as a carbon and energy source. Furthermore, at the end of the process, more genes were found that encode enzymes involved in the metabolism of pyruvate, an essential compound during the acidogenesis/acetogenesis stage. These genes link pyruvate to carbohydrates via gluconeogenesis, to fatty acids such as formate, and to the transformation of pyruvic acid to acetyl-CoA by the enzyme pyruvate-ferredoxin oxidoreductase to produce acetate and adenosine triphosphate (ATP) [67].
The last overrepresented pathway of carbohydrate metabolism was butyrate metabolism, related to the production of butyric acid, the main fatty acid found in some batch anaerobic digestion systems and associated with the equilibrium of the system due to the H2 levels [68]. This result is in line with the high levels of butyric acid found in the chemical analysis of the digestate (Table 1). With regard to energy metabolism, which includes oxidative phosphorylation, methane, nitrogen, and sulfur metabolism, photosynthesis, and carbon fixation, we examined every pathway separately and focused only on methane, nitrogen, and sulfur metabolism. Consistent with the statistical analysis (χ2 (3, N = 200) = 5.9915, p = 0.05), the results showed the influence of anaerobic digestion on the genes found in methane, nitrogen, and sulfur metabolism. As expected, the methane metabolism pathway showed a greater abundance of genes related to methanogenesis than was found in the inoculum.
In contrast, the nitrogen and sulfur metabolism pathways were underrepresented during the anaerobic digestion, due to a combination of two factors: first, the levels of ammonia, nitrite, nitrate, and sulfur; and second, the microbial community was not enriched in microorganisms that carry out the complete pathways of nitrogen and sulfur metabolism [63,67,69].
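For readers unfamiliar with this kind of comparison, a chi-square test of independence on gene counts can be sketched as follows. The counts are hypothetical, chosen only so that they total 200, and the 2 × 3 layout (two sampling points, three pathways) is an assumption about the design rather than the table actually analyzed. With that layout there are 2 degrees of freedom, for which the critical value at the 0.05 level is about 5.991.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical gene counts: rows = inoculum, digestate;
# columns = methane, nitrogen, sulfur metabolism.
counts = [
    [30, 40, 30],
    [50, 28, 22],
]
CRITICAL_0_05_DOF_2 = 5.991  # chi-square critical value, alpha = 0.05, 2 d.o.f.

statistic, dof = chi_square_statistic(counts)
print(f"chi2 = {statistic:.3f}, dof = {dof}")
if statistic > CRITICAL_0_05_DOF_2:
    print("distribution differs between sampling points at the 0.05 level")
else:
    print("no significant difference at the 0.05 level")
```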
The key enzymes in anaerobic digestion were selected for screening, to determine the presence of those enzymes and correlate them with the microorganisms enriched during the process (Table 3). In the first stage, the microbial community began to degrade the biomass using cellobiose phosphorylase, which catalyzes cellulose degradation into α-D-glucose 1-phosphate and D-glucose [70]. Vibrio furnissii, Cellulomonas spp., Corynebacterium variable, and Actinoplanes spp. are able to produce this enzyme (Figure 3) [17,71].
Furthermore, the results showed the specific functional potential of each genus. For example, Corynebacterium variable, Desulfovibrio desulfuricans, Vibrio furnissii, Sphingobium spp., Rhodococcus rhodochrous, Actinoplanes spp., and Pseudoxanthomonas spp. were very active during hydrolysis, with enzymes that linked them to several pathways, such as the pentose phosphate pathway (glucose 1-dehydrogenase) and fructose/mannose metabolism (mannose-6-phosphate isomerase, GDP-mannose 6-dehydrogenase, fructokinase). Vibrio furnissii, Shewanella spp., Cellulomonas spp., Desulfovibrio desulfuricans, Rhodococcus rhodochrous, Corynebacterium variable, Pseudoxanthomonas spp., Saccharomonospora azurea, Actinoplanes spp., and Sphingobacterium spp. participate in starch and sucrose metabolism (glucokinase). As can be seen, some species participate in only one specific metabolic pathway, such as Sphingobacterium spp., whereas others, such as Corynebacterium variable, participate in more than one pathway and stage.
With regard to acidogenesis, the critical step is the formation of pyruvate from carbohydrates. The enzyme selected to explore this step was pyruvate kinase, and the microorganisms with genes related to this enzyme were Vibrio furnissii, Shewanella spp., Desulfovibrio desulfuricans, Corynebacterium variable, Rhodococcus rhodochrous, Agromyces spp., Cellulomonas spp., Saccharomonospora azurea, Pseudonocardia spp., and Actinoplanes spp.
Furthermore, to explore the transformation of pyruvate toward short-chain fatty acids, the enzyme selected was malate dehydrogenase, for which the same microorganisms were observed as for pyruvate kinase, except for Desulfovibrio desulfuricans.
Acetogenesis was explored using acetate metabolism via reactions in the citric acid cycle with the enzyme succinyl-CoA:acetate CoA-transferase; the microorganisms related to this enzyme were Corynebacterium variable, Rhodococcus rhodochrous, and Sphingobacterium spp. In addition, the oxidation of acetate can result from the interaction between syntrophic acetate-oxidizing bacteria (SAOB), which oxidize acetate to produce H2/CO2, and hydrogenotrophic methanogens, which use the H2 to reduce CO2 to CH4 [72]. The marker enzymes for acetogenesis depend on syntrophic relations and were represented by formyltetrahydrofolate synthetase; the microorganisms with genes related to this enzyme were Vibrio furnissii and Shewanella spp.
The methanogenesis stage showed enzymes that were not present before the anaerobic digestion, such as the tetrahydromethanopterin hydro-lyase, mainly found in Methanosarcina [73].
Another key enzyme of the methanogenic process was the anaerobic carbon monoxide dehydrogenase, which is unique to archaea species and allows the use of carbon monoxide as a sole carbon and energy source, producing H2 and CO2. Others were formylmethanofuran-tetrahydromethanopterin N-formyltransferase and the NAD-reducing hydrogenase, enzymes that catalyze the reduction of CO2 to methane, depending on the levels of H2 in a bioreactor [74,75].
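The enzyme-to-taxon assignments described in this section lend themselves to a simple lookup structure. The sketch below encodes only a subset of the assignments stated in the text (it is not a transcription of Table 3) and lists the taxa linked to more than one of these marker enzymes. The dictionary layout is an illustrative choice.

```python
from collections import defaultdict

# Subset of marker enzymes and associated taxa, as stated in the text.
MARKER_ENZYMES = {
    "cellobiose phosphorylase": [
        "Vibrio furnissii", "Cellulomonas spp.",
        "Corynebacterium variable", "Actinoplanes spp.",
    ],
    "succinyl-CoA:acetate CoA-transferase": [
        "Corynebacterium variable", "Rhodococcus rhodochrous",
        "Sphingobacterium spp.",
    ],
    "formyltetrahydrofolate synthetase": [
        "Vibrio furnissii", "Shewanella spp.",
    ],
}

# Invert the mapping: taxon -> set of enzymes.
enzymes_by_taxon = defaultdict(set)
for enzyme, taxa in MARKER_ENZYMES.items():
    for taxon in taxa:
        enzymes_by_taxon[taxon].add(enzyme)

# Taxa associated with more than one marker enzyme in this subset.
multi_enzyme_taxa = sorted(
    taxon for taxon, enzymes in enzymes_by_taxon.items() if len(enzymes) > 1
)
print(multi_enzyme_taxa)
```

Inverting the mapping in this way makes it easy to ask which taxa span several stages, which is the kind of question a marker-based monitoring scheme would need to answer.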
Finally, next-generation sequencing technologies could revolutionize the diagnosis of anaerobic digestion processes. Current biogas diagnosis analyzes the methane rates, the biogas quality, or the byproducts generated [51,65]. However, monitoring the bioreactor's microbial community over time could help predict undesirable results and determine the appropriate action to be taken. For example, Methyloversatilis (0.81%) was detected at the end of our anaerobic digestion; this genus has been reported in methanol-fed heterotrophic systems as undesirable because it could reduce the rate of methane generation [76]. This finding is plausible given the alkaline organosolv pre-treatment, which uses an organic solvent with water for lignin removal; perhaps this genus was enriched due to the abundance of the substrate. The solvent must be removed before fermentation; however, any remaining solvent could adhere to the biomass and promote the abundance of Methyloversatilis. If the relative abundance of Methyloversatilis were higher, the solvent removal method should be revised to improve the process.
Furthermore, it is necessary to continue studying the functional potential and the metabolism of the microorganisms that carry out the bioprocess, in order to target genes that could be modified by genetic engineering and to understand which enzymes are overrepresented with other substrates and what that means. It is also necessary to find marker microorganisms for the optimization and standardization of the process. The information generated during this research, such as the microbial profile at the end of the anaerobic digestion and the identified marker enzymes, will allow us to explore the possibility of generating synthetic inocula and to follow the process with the marker enzymes to improve methane generation from corncob in future projects.

Conclusions
The pre-treated corncob improved the anaerobic digestion and reached a higher methane production, with low concentrations of byproducts, than the corncob without pre-treatment. The alkaline organosolv pre-treatment yielded a higher fraction of hemicellulose, which acts as a stimulus to microorganisms during the bioprocess.
This study provides evidence of the drastic change in the microbial community structure in a short time and of the functional strategy used by the most representative microorganisms of the consortia to carry out the process. Gammaproteobacteria dominated the inoculum, and after the anaerobic digestion the microbial community was composed mainly of Proteobacteria, Bacteroidetes and Actinobacteria.
The functional profile supports the taxonomic findings related to the bioprocess and identifies the main microorganisms involved in the corncob digestion, such as Vibrio furnissii, Shewanella spp., Cellulomonas spp., Desulfovibrio desulfuricans, Rhodococcus rhodochrous, Corynebacterium variable, Pseudoxanthomonas spp., Saccharomonospora azurea, Actinoplanes spp., and Sphingobacterium spp. Furthermore, the results allow the selection of marker enzymes for each stage of the anaerobic digestion and the identification of the microorganisms with genes related to those enzymes.
The overrepresented metabolisms were lipid metabolism, associated with the glycerolipid pathways; carbohydrate metabolism, in the fructose/mannose pathways; and energy metabolism, in the methanogenesis pathways, while the nitrogen and sulfur metabolism pathways were underrepresented.