Omics Sequencing of Saccharomyces cerevisiae Strain with Improved Capacity for Ethanol Production

Saccharomyces cerevisiae is the most important industrial microorganism used for fuel ethanol production worldwide. Herein, we obtained a mutant S. cerevisiae strain with improved capacity for ethanol fermentation, from 13.72% (v/v) for the wild-type strain to 16.13% (v/v) for the mutant strain, and analyzed its genomic structure and gene expression changes. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment revealed that the changed genes were mainly enriched in the pathways of carbohydrate metabolism, amino acid metabolism, metabolism of cofactors and vitamins, and lipid metabolism. The gene expression trends of the two strains were recorded during fermentation to create a timeline. Venn diagram analysis revealed exclusive genes in the mutant strain. KEGG enrichment of these genes showed upregulation of genes involved in sugar metabolism, the mitogen-activated protein kinase pathway, and fatty acid and amino acid degradation, and downregulation of genes involved in oxidative phosphorylation, the ribosome, and fatty acid and amino acid biogenesis. Protein interaction analysis of these genes showed that glucose-6-phosphate isomerase 1, signal peptidase complex subunit 3, 6-phosphofructokinase 2, and trifunctional aldehyde reductase were the major hub genes in the network, linking pathways together. These findings provide new insights into the adaptive metabolism of S. cerevisiae for ethanol production and a framework for the construction of engineered strains of S. cerevisiae with excellent ethanol fermentation capacity.


Introduction
Ethanol is a potential environmentally friendly alternative to fossil fuels that can be blended with gasoline to propel light vehicles, thereby improving octane numbers and reducing environmental pollution [1]. In addition, it is the premier biotechnological global product in terms of volume and economic value [2,3]. Sucrose is an abundant, cheap, and readily available substrate for industrial fermentation, and its use in the production of fuel ethanol has proved successful in Brazil [4,5]. Sugarcane contains 11-18% (wet w/w) sugars, comprising 90% sucrose and 10% glucose and fructose [6]. During the edible sugar-making process, cane molasses is generated as a by-product in vast amounts, containing 45-60% sucrose and ~5-20% glucose and fructose [5,7]. Sucrose production is the most important industry in the Guangxi Zhuang Autonomous Region, accounting for ~60% of the total planting area and yield of sugarcane in China.
S. cerevisiae WT (MATa/MATα) (CGMCC 2.4748) [19] is a wild-type diploid strain isolated from year-old sugar mill waste in Nanning, China, and S. cerevisiae MT (MATa/MATα) is a mutant of the WT strain. The MT strain was obtained by random mutagenesis using UV irradiation. The screening process was as follows: the mutagenized cell suspension was spread on a 150 mm culture dish containing yeast peptone dextrose (YPD) medium (20 g L−1 tryptone, 10 g L−1 yeast extract, and 20 g L−1 glucose) and cultured at 30 °C for two days. The resulting cell lawn was divided evenly into 16 regions, and each region was scraped into 5 mL of YPD medium for overnight culture at 30 °C; 100 µL of this culture was then transferred into 5 mL of YPD medium for another overnight culture. Cell concentrations were measured, comparable amounts of cells were inoculated into 100 mL of YPS30 medium (20 g L−1 tryptone, 10 g L−1 yeast extract, and 300 g L−1 sucrose) in 250 mL Erlenmeyer flasks, and ethanol fermentation was allowed to proceed at 30 °C and 180 rpm. Cells with the highest ethanol yield were spread on a culture dish again, and the above process was repeated five times; single colonies were then isolated from the strain with the highest ethanol yield. After two rounds of isolation, we obtained the purified mutant with the highest ethanol yield. For the fermentation experiments reported in this paper, cells were cultivated in YPD medium at 30 °C and 180 rpm. Comparable amounts of freshly cultured WT and MT cells were inoculated into 100 mL of YPS30 medium in 250 mL Erlenmeyer flasks, and ethanol fermentation was allowed to proceed at 30 °C and 180 rpm.

Detection of the Fermentation Process
The number of cells was counted using an automated cell counter (IY1200 Counstar, Ruiyu, Shanghai, China) every 4 h. Ethanol production was quantified using a gas chromatograph (6890, Agilent, Palo Alto, CA, USA). Reducing sugar in the fermentation broth was determined using the dinitrosalicylic acid (DNS) method. Following HCl hydrolysis, the total residual sugar content of the broth was quantified using the DNS method every 8 h. RNA-Seq was used to analyze gene expression at 16 h (T1), 40 h (T2), and 64 h (T3) during fermentation. Three parallel experiments were performed for each sample.

Whole Genome DNA Extraction, Library Construction, and Sequencing
Total genomic DNA was obtained by completely grinding cells in liquid nitrogen, decontaminating, adding cetyltrimethylammonium bromide to facilitate the removal of polysaccharides, deproteinizing with phenol and chloroform, precipitating with isopropanol and absolute ethyl alcohol, washing the sediment with 75% alcohol, and finally dissolving the pellet in ddH2O. DNA quality was assessed using a Nanodrop Microspectrophotometer (Nanodrop 2000, Thermo Fisher Scientific, Waltham, MA, USA) and agarose gel electrophoresis. At least 3 µg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp using a Paired-end DNA Sample Prep kit (Illumina Inc., San Diego, CA, USA). These libraries were then sequenced on a NovaSeq 6000 system using a PE 150 strategy by GeneDenovo Biotechnology Co., Ltd. (Guangzhou, China).

RNA Extraction, Library Construction, and Sequencing
Total RNA was extracted using a Trizol reagent kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. RNA quality was assessed on an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA) and checked using RNase-free agarose gel electrophoresis. mRNA was enriched using Oligo (dT) beads. The enriched mRNA was fragmented into short fragments using a fragmentation buffer and reverse transcribed into complementary DNA (cDNA) using a NEBNext Ultra RNA Library Prep Kit for Illumina (NEB #7530, New England Biolabs, Ipswich, MA, USA). The purified double-stranded cDNA fragments were end-repaired, followed by the addition of an adenylate (A) base; then, the fragments were ligated to Illumina sequencing adapters. The ligation reaction product was purified using AMPure XP Beads (1.0×). The ligated fragments were subjected to size selection using agarose gel electrophoresis and amplified using polymerase chain reaction (PCR). The resulting cDNA libraries were sequenced using an Illumina NovaSeq 6000 by GeneDenovo Biotechnology Co., Ltd. (Guangzhou, China).

Sequence Data Analysis
The WGS and RNA-Seq raw reads were deposited in the Sequence Read Archive database with accession number PRJNA885247.
Raw WGS reads were processed to obtain high-quality clean reads by removing reads with ≥10% unidentified nucleotides (N), reads with >50% bases having Phred quality scores ≤ 20, and reads aligned to the barcode adapter. To identify single nucleotide polymorphisms (SNPs) and insertion-deletion mutations (InDels), the Burrows-Wheeler Aligner was used to align the clean reads from each sample against the reference genome (Ensembl_release100 of S. cerevisiae) [20]. Variant calling was performed for all samples using the GATK Unified Genotyper. SNPs and InDels were filtered using GATK's Variant Filtration with proper standards [21]. To determine the physical positions of each SNP, the software tool ANNOVAR was used to align and annotate SNPs or InDels [22]. Structural variation [23] was determined using the software BreakDancer (Max 1.1.2, Ken Chen, The Genome Center, St. Louis, MO, USA) [24].
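The two content-based read filters above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the pipeline actually used; the function names `keep_read` and `phred33` are hypothetical, Phred+33 encoding is assumed, and adapter matching is left out.

```python
def phred33(qual_string: str) -> list[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(c) - 33 for c in qual_string]


def keep_read(seq: str, quals: list[int],
              max_n_frac: float = 0.10,
              low_q: int = 20,
              max_low_frac: float = 0.50) -> bool:
    """Return True if a read passes the N-content and base-quality filters.

    A read is discarded when >= 10% of its bases are N, or when more than
    50% of its bases have a Phred score <= 20 (thresholds from the text).
    """
    if not seq or len(seq) != len(quals):
        return False
    n_frac = seq.upper().count("N") / len(seq)
    low_frac = sum(q <= low_q for q in quals) / len(quals)
    return n_frac < max_n_frac and low_frac <= max_low_frac
```

For example, `keep_read("ACGTACGTAC", phred33("IIIIIIIIII"))` keeps the read, whereas a 10-base read containing a single N is removed because it reaches the 10% limit.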
Raw RNA-Seq reads were further filtered using fastp (version 0.18.0) [25]. Reads containing adapters, reads with ≥10% unknown nucleotides, low-quality reads containing ≥50% low-quality (Q-value ≤ 20) bases, and reads mapped to rRNA were removed. The remaining clean reads were used in assembly and gene abundance calculations. Using the Ensembl_release100 S. cerevisiae genome FASTA file as the reference genome, the index of the reference genome was built, and paired-end clean reads were mapped to the reference genome using HISAT2.2.4 [26]. The fragments per kilobase of transcript per million mapped reads (FPKM) value was calculated to quantify gene expression abundance and variations using RSEM software (v1.3.1, Bo Li, Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, USA) [27]. Correlation analysis of parallel experiments was performed using R. Gene differential expression analysis was performed using DESeq2 [28]. Gene expression pattern analysis was used to cluster genes with similar expression patterns across multiple samples (at least three, ordered by time point, space, or treatment dose). To examine the expression pattern of differentially expressed genes (DEGs), the expression data of each sample (in the order of treatment) were normalized to 0, log2(v1/v0), and log2(v2/v0), and then clustered using Short Time-series Expression Miner software (STEM, Jason Ernst, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA) [29]. The clustered profiles with p-values ≤ 0.05 were considered significant profiles. KEGG enrichment analysis of the target DEGs was performed using the KOBAS 3.0 package. The protein interaction network was analyzed using the interaction relationships in the STRING protein interaction database (http://string-db.org, accessed on 28 February 2022) [30], and the interaction network diagram was constructed using Cytoscape [31].
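For readers unfamiliar with the unit, the textbook FPKM definition can be written as a short Python function. This is a simplified sketch: RSEM works with expected fragment counts and effective transcript lengths, which are not modeled here, and the function name and arguments are illustrative.

```python
def fpkm(fragments: float, gene_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)
```

With made-up numbers, a 2000 bp gene receiving 500 fragments in a library of 10 million mapped fragments has an FPKM of 25.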

Quantitative Reverse Transcription PCR (qRT-PCR)
Two comparison groups, MT-T1 vs. MT-T2 and WT-T2 vs. MT-T2, were selected, and qRT-PCR of 21 genes from the two comparison groups was used to verify the accuracy of the RNA-Seq data. The amplification primers are shown in Table S1. Biological replicates of each gene were identical to those used for RNA-Seq. Reverse transcription was performed using the HiScript II Q RT SuperMix for qPCR kit (R223, Vazyme, Nanjing, China), and qPCR was performed using the ChamQ SYBR qPCR Master Mix kit (Q341, Vazyme, Nanjing, China) on a TianLong 988 system (Tianlong, Xi'an, China). Data were analyzed using the 2^−ΔΔCt method, with bifunctional DRAP deaminase/tRNA pseudouridine synthase RIB2 as the reference gene.
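The 2^−ΔΔCt calculation can be illustrated as follows. The sketch assumes equal amplification efficiency for the target and reference genes, and the Ct values in the usage note are invented for illustration only.

```python
def relative_expression(ct_target_test: float, ct_ref_test: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Fold change of a target gene (test vs. control) by the 2^-ddCt method."""
    delta_test = ct_target_test - ct_ref_test   # dCt in the test sample
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl   # dCt in the control sample
    delta_delta = delta_test - delta_ctrl       # ddCt
    return 2.0 ** (-delta_delta)
```

If a target gene has Ct 22 in the test sample and 24 in the control, with the reference gene at Ct 18 in both, ΔΔCt is −2 and the relative expression is 4-fold (log2 value of 2).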

Cell Growth and Fermentation
The S. cerevisiae MT strain had obvious advantages in growth and ethanol fermentation in 30% (w/v) sucrose medium compared to the WT strain. During the cell growth phase, the difference in cell number between the MT and WT strains was pronounced after 20 h, with a maximum mean cell number for the two strains of 3.06 × 10^8 and 3.23 × 10^8, respectively (Figure 1A). The trend in sugar consumption was consistent with that of ethanol yield; the difference between the two strains appeared after 24 h, and the ethanol yields were highest at 64 h (13.72% v/v for the WT strain and 16.13% v/v for the MT strain; Figure 1B). The total residual sugar content in the fermentation broth was stable from 48 h, and at the end of the 80 h fermentation it was 41.52 g/L (WT) and 34.03 g/L (MT) (Figure 1C). The reducing sugar content in the fermentation broth initially increased and subsequently decreased, peaking at 32 h. The reducing sugar content of MT was significantly lower than that of WT after 16 h, at 31.0667 g/L (MT) and 41.5 g/L (WT) after 80 h of fermentation (Figure 1D). Almost all the sucrose in the fermentation broth of the WT strain became reducing sugar.
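To put the reported values on a relative scale, the percentage change of the MT strain with respect to the WT strain can be computed directly from the numbers above. The helper name `relative_change` is illustrative.

```python
def relative_change(reference: float, value: float) -> float:
    """Percentage change of `value` relative to `reference`."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / reference * 100.0


# Values reported in the text (WT first, MT second).
ethanol_gain = relative_change(13.72, 16.13)      # ethanol, % v/v at 64 h
residual_sugar = relative_change(41.52, 34.03)    # total residual sugar, g/L at 80 h

print(f"Ethanol titer: {ethanol_gain:+.1f}%")              # +17.6%
print(f"Total residual sugar: {residual_sugar:+.1f}%")     # -18.0%
```

That is, the MT strain reached an ethanol titer roughly 17.6% higher than the WT strain while leaving about 18% less total residual sugar.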

Genome Sequencing
Strict quality control and data filtering were performed on the original data to obtain the high-quality clean reads used in the data analysis. Genome coverage, sequencing depth, SNPs, and InDels in the two samples were analyzed by aligning the sequences with the reference genome of S. cerevisiae (Ensembl_release100). Statistics on the quality of the sequencing data are shown in Tables S2-S5, and information on SNPs and InDels is shown in Tables S6-S10. There was no obvious difference in the genome location of SNPs between the two strains, but SNP coding information in the MT strain changed significantly compared to that of the WT; synonymous single-nucleotide variants (SNVs) increased, while nonsynonymous SNVs decreased (Figure 2A). The location of InDels in the MT genome changed in both the exons and upstream of the coding region, while the coding information changed in frameshift and nonframeshift deletions (Figure 2B). Structural variations (SVs) included translocation breakends (BND), deletions (DEL), tandem duplications (DUP), insertions (INS), and inversions (INV), as shown in Figure 2C. Compared to the WT strain, the proportion of BNDs decreased and that of INSs increased in the MT strain. The distribution density of SNPs on each chromosome was relatively high (Figure 3A), while the distribution density of InDels was high only on the mitochondrial genome (Figure 3B). GO analysis showed that the genes carrying SNPs and InDels in the MT strain were significantly enriched in cellular process, single-organism process, and metabolic process (Biological Process); cell, cell part, and organelle (Cellular Component); and binding and catalytic activity (Molecular Function) (Figure 4). The KEGG enrichment result for SNPs was basically consistent with that for InDels (Figure 5). Significantly enriched metabolic pathways included carbohydrate metabolism, amino acid metabolism, metabolism of cofactors and vitamins, and lipid metabolism.

DEGs and KEGG Enrichment Analysis between Different Groups
Statistics on the quality of the sequencing data are shown in Tables S11-S15. The FPKM distributions for all samples are shown in Figure S1. The heat map of the Pearson correlation coefficient R between samples is shown in Figure S2. Genes with a false discovery rate ≤ 0.05 and an absolute fold change ≥ 2 were considered DEGs. DEGs between different comparison groups are shown in Figure 6. Compared to the WT strain, the MT strain had significantly upregulated gene expression at the T1 time point and significantly downregulated gene expression at the T2 and T3 time points. There were more upregulated genes in the WT strain and more downregulated genes in the MT strain at the T2 and T3 time points than at the T1 time point. For both the WT and MT strains, there was little expression difference between the T2 and T3 time points.
The top 20 metabolic pathways of the enriched DEGs between the WT and MT strains at the three time points are shown in Figure 7. The enriched genes were mainly involved in sugar, amino acid, and fatty acid metabolism. Between WT-T1 and MT-T1, vitamin B6 metabolism was significantly enriched, and starch and sucrose metabolism, purine metabolism, and glycolysis/gluconeogenesis were the pathways that enriched the most genes. The citrate cycle and pyruvate metabolism pathways were significantly enriched between WT-T2 and MT-T2; the citrate cycle, pyruvate metabolism, and glycolysis/gluconeogenesis pathways collectively enriched the most genes between the two strains. Between WT-T3 and MT-T3, the significantly enriched pathway was alanine, aspartate, and glutamate metabolism, which, together with pyruvate metabolism, the citrate cycle, glycolysis/gluconeogenesis, and cysteine and methionine metabolism, enriched the most genes between the two strains. The results were consistent with the enrichment of genes carrying genomic variations, including SNPs and InDels.
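The DEG criterion stated above (false discovery rate ≤ 0.05 and absolute fold change ≥ 2, i.e., |log2FC| ≥ 1) can be expressed as a small classifier. The sketch assumes the log2 fold change and adjusted p-value have already been produced by a tool such as DESeq2; the function name is illustrative.

```python
import math


def classify_deg(log2_fc: float, fdr: float,
                 fdr_cutoff: float = 0.05, fc_cutoff: float = 2.0) -> str:
    """Label a gene as 'up', 'down', or 'ns' (not significant)."""
    if math.isnan(fdr) or fdr > fdr_cutoff:
        return "ns"
    if abs(log2_fc) < math.log2(fc_cutoff):
        return "ns"
    return "up" if log2_fc > 0 else "down"
```

A gene with log2FC = −1.5 and FDR = 0.01 is labeled "down", whereas a gene with log2FC = 3 and FDR = 0.06 is not counted as a DEG.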
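Pathway enrichment of this kind is commonly scored with a hypergeometric (one-sided Fisher) test. The source does not state which test or multiple-testing correction was configured in KOBAS, so the following is only a sketch under that assumption; all counts in the usage note are invented.

```python
from math import comb


def hypergeom_enrichment_p(background: int, in_pathway: int,
                           selected: int, overlap: int) -> float:
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `background` genes, of which `in_pathway` belong to the pathway."""
    upper = min(in_pathway, selected)
    total = comb(background, selected)
    hits = sum(
        comb(in_pathway, i) * comb(background - in_pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total
```

For a toy background of 10 genes with 5 in a pathway, selecting 5 genes that all fall in the pathway gives p = 1/252, or about 0.004. In practice the p-values for all pathways would then be corrected for multiple testing.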

Expression Trends of DEGs on the Timeline and Exclusive Genes in the MT Strain
To examine the expression patterns of DEGs, the expression data of each sample (in the order of treatment) were normalized to 0, log2(v1/v0), and log2(v2/v0), and then clustered using STEM. The parameters were set as follows: "Maximum unit change in model profiles between time points" = 1, "Maximum output profiles number" = 20, and "Minimum ratio of fold change of DEGs" ≥ 2.0. The clustered profiles with p-values ≤ 0.05 were considered significant profiles. There were eight expression trend profiles, and four profiles were significantly enriched for each strain during the fermentation process: profile 0-downregulation; profile 1-initial downregulation and then no change; profile 6-initial upregulation and then no change; and profile 7-upregulation (Figure 8).
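The normalization applied before clustering, which maps a time series (v0, v1, v2) to (0, log2(v1/v0), log2(v2/v0)), can be sketched as below. The `pseudocount` argument is an assumption added here to cope with zero expression values; the source does not say how zeros were handled.

```python
import math


def log2_ratio_series(values: list[float], pseudocount: float = 0.0) -> list[float]:
    """Express each time point as a log2 ratio to the first time point."""
    shifted = [v + pseudocount for v in values]
    if any(v <= 0 for v in shifted):
        raise ValueError("values must be positive (consider a pseudocount)")
    return [math.log2(v / shifted[0]) for v in shifted]
```

For example, expression values of 10, 20, and 40 at T1, T2, and T3 become 0, 1, and 2, a steadily increasing series of the kind described for profile 7.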

Exclusive genes in the MT strain of each significantly enriched profile were determined using Venn diagram analysis, and KEGG pathway enrichment analysis of the exclusive genes was conducted. As shown in Figure 9, the exclusive genes in profiles 6 and 7 with upregulated trends are mainly enriched in carbon and sugar metabolism pathways, whereas the exclusive genes in profiles 0 and 1 with downregulated trends are mainly enriched in the ribosome and amino acid and fatty acid biosynthesis. Genes involved in the top 15 metabolic pathways are shown in Table 1.
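A Venn-style comparison reduces to set operations. The sketch below assumes that "exclusive" means a gene assigned to a given profile in the MT strain but not to the same profile in the WT strain; that reading, the function name, and the placeholder gene labels are assumptions.

```python
def exclusive_by_profile(mt: dict[int, set[str]],
                         wt: dict[int, set[str]]) -> dict[int, list[str]]:
    """For each profile, list genes found in MT but not in the same WT profile."""
    return {
        profile: sorted(genes - wt.get(profile, set()))
        for profile, genes in mt.items()
    }


# Placeholder gene labels, not genes from the study.
mt_profiles = {0: {"geneA", "geneB"}, 7: {"geneC", "geneD", "geneE"}}
wt_profiles = {0: {"geneB"}, 7: {"geneE", "geneF"}}
print(exclusive_by_profile(mt_profiles, wt_profiles))
# {0: ['geneA'], 7: ['geneC', 'geneD']}
```

The resulting per-profile gene lists are what would then be passed to pathway enrichment.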
The interaction network of the genes in Table 1 was analyzed, except for those involved in ribosomes. The Pearson correlation and significance between pairwise genes were calculated, and the top 100 pairs by absolute correlation with a p-value ≤ 0.05 are shown. As shown in Figure 10, there were nine groups in the interaction network. The largest group consisted of 45 genes and centered on SPC3 and PGI1, with the two central blocks connected by 3-ketoacyl-CoA thiolase (POT1) and GMP synthase (GUA1). The second group consisted of seven genes, with PFK2 as the hub gene. The third group consisted of six genes, with GRE3 as the hub gene. The remaining six groups contained few genes with relatively simple relationships.
Table 1 (fragment): Fructose and mannose metabolism, YOR120W, GCY1. a Significant expression trend of the MT strain, profile 0-downregulation. b Significant expression trend of the MT strain, profile 1-initial downregulation and then no change. c Significant expression trend of the MT strain, profile 6-initial upregulation and then no change. d Significant expression trend of the MT strain, profile 7-upregulation.
As a hub gene in the first group, SPC3 is involved in protein export.The connected genes complement factor B (CFB5) and small subunit processome component (UTP18) are involved in ribosome biogenesis in eukaryotes; FAA3 is involved in fatty acid metabolism; lipase (LIP2) is involved in lipoic acid metabolism; LEU9 is involved in valine, leucine, and isoleucine biosynthesis; phosphoribosylaminoimidazole carboxylase (ADE2) is involved in purine metabolism; and VMA7 is involved in oxidative phosphorylation.These connections demonstrate the relationship among protein export, ribosome biogenesis, and oxidative phosphorylation, as well as fatty acid and amino acid metabolism.SPC3 encodes a signal-anchored protein subunit that enters the endoplasmic reticulum (ER) as a signal peptidase [32].The ER plays an important role in maintaining the balance and stability of cellular proteins; misfolded proteins accumulate in the ER when cells are under different kinds of stress, known as ER stress [33].The role that SPC3 and other interacting genes play in ER stress is worth investigating.
PGI1 is involved in the pentose phosphate pathway, whereas the genes bifunctional purine biosynthetic protein (ADE5,7), bifunctional glutathione transferase (GTT1), 5-aminolevulinate synthase (HEM1), O-phospho-L-serine:2-oxoglutarate transaminase (SER1), SFA1, ELO1, CEM1, and sterol 24-C-methyltransferase (ERG6) are involved in purine metabolism; glutathione metabolism; glycine, leucine, and isoleucine biosynthesis; and fatty acid metabolism. Genes involved in fatty acid biosynthesis (ELO1 and CEM1) were downregulated, whereas SFA1, which is involved in fatty acid degradation, was upregulated; this is beneficial to the accumulation of acetyl-CoA. SFA1 is also a bifunctional alcohol dehydrogenase, and its synergistic upregulation with ALD6 and POT1 increases the catalytic degradation of fatty acids to alcohol. The effect of SFA1 on ethanol production in yeast cells has been studied, but the mechanisms regulating the different effects under different conditions remain unclear [34]. This module mainly showed the relationship between the pentose phosphate pathway and fatty acid and amino acid metabolism. The Hsp70 family chaperone (SSA2) and PTC1, involved in the MAPK pathway, were located at the edge of this module. Glutathione transferase (Gtt1) of S. cerevisiae is crucial to the response to hydrogen peroxide stress. Increasing glutathione content might enhance yeast cell tolerance to lignocellulose inhibitors and increase the production of ethanol [35,36]. GTT1 expression was negatively correlated with interlinked genes in the network of this module; these interactions may help to better understand the role of GTT1.
Figure 10. Protein-protein interaction network of the exclusive genes in profiles 0, 1, 6, and 7 of the MT strain. The node circles represent genes, labeled with gene ID or gene symbol; node size is scaled according to Cytoscape connectivity, with greater connectivity giving larger nodes. Node color varies with log2FC: red represents log2FC > 0 (the darker the red, the larger the upregulation ratio), and blue represents log2FC < 0 (the darker the blue, the larger the downregulation ratio). Positive correlations are shown by solid gray lines, and negative correlations by dashed gray lines. Line thickness scales with the absolute value of the correlation coefficient; the larger the value, the thicker the line.
PGI1, SPC3, PFK2, and GRE3 were the major hub genes in the network.In the future, these metabolic pathways may be optimized by overexpressing or deleting these hub genes to obtain S. cerevisiae strains with a higher capacity for ethanol fermentation.
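Hub genes in a network of this kind are typically the nodes with the highest connectivity. Assuming connectivity here means node degree (the number of distinct interaction partners), a minimal way to rank nodes from an edge list is sketched below; the node names are placeholders rather than genes from Figure 10.

```python
from collections import Counter


def node_degrees(edges: list[tuple[str, str]]) -> Counter:
    """Count distinct partners per node, ignoring self-loops and duplicate edges."""
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees: Counter = Counter()
    for pair in unique_edges:
        degrees.update(pair)  # adds 1 to each of the two endpoints
    return degrees


# Toy edge list with a duplicated edge in reverse orientation.
toy_edges = [("hub", "n1"), ("hub", "n2"), ("n1", "hub"), ("n2", "n3"), ("hub", "n3")]
print(node_degrees(toy_edges).most_common(1))  # [('hub', 3)]
```

Sorting nodes by degree in this way reproduces the size ordering that a degree-scaled network drawing would show.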

qRT-PCR
qRT-PCR of 21 genes from two comparison groups was carried out to verify the accuracy of the RNA-Seq data and the expression level of important genes. The log2 n value of each gene is shown in Figure 11. All gene expression determined using qPCR was consistent with that obtained from RNA-Seq, thereby indicating the credibility of the RNA-Seq data. In the MT strain, the expression of the sucrose invertase gene SUC2 was downregulated; furthermore, the content of total sugar and reducing sugar in the MT fermentation broth was lower than that of the WT strain after 16 h. The difference in sucrose utilization between the two strains highlights the need for further study.

Sucrose Consumption of the WT and MT Strains
S. cerevisiae consumes sucrose in two ways: (1) sucrose is hydrolyzed into glucose and fructose by extracellular sucrose invertase encoded by the SUC gene family, and the resulting monosaccharides enter cells by facilitated diffusion to participate in the carbon metabolism pathway; (2) sucrose enters cells via a proton-coupled transporter and is hydrolyzed in the cytosol by maltose-metabolizing enzymes and intracellular sucrose invertase [5,38]. The sucrose invertase gene SUC2, maltose permease genes (MALx1), maltase genes (MALx2), and some regulatory genes (e.g., MALx3) are the key genes in sucrose consumption of S. cerevisiae [38]. SUC2 can be transcribed into two mRNAs that differ in their 5′ ends (1.8 kb and 1.9 kb). The longer mRNA encodes an invertase with a signal peptide that can be secreted outside the cell, and its synthesis level is regulated by glucose repression. The shorter mRNA encodes the intracellular invertase, which lacks a signal peptide [39,40]. In the absence of extracellular invertase activity in S. cerevisiae, sucrose was internalized by proton symporters using ATP, which improved anaerobic fermentation and ethanol yield from sugar [5]. In addition, the replacement of all endogenous hexose transporters with hexose-proton symport and extracellular invertase (SUC2) can increase ethanol yield and anaerobic growth of S. cerevisiae [41].
We analyzed the protein interaction network of DEGs (WT-T2 vs. MT-T2) and extracted the network of genes related to sucrose consumption in Table 2. Pearson correlations between pairwise genes of DEGs were calculated. Considering the genes in Table 2 as core genes, the top 10 correlation pairs by absolute value for each core gene with a p-value ≤ 0.05 are shown. As shown in Figure 12, most core genes were connected by complex nodes. MAL33, MPH2, and IMA2 showed significant upregulation. PMP2, PGM1, HXT11, HRK1, MAL31, GRE3, and HXK2 showed marginal upregulation. The remaining core genes were downregulated. Among the core genes, RGT2, AST2, HXT2, and HXT5 are closely related to the SUC2 gene. All of them are related to the plasma membrane and glucose transport. RGT2 encodes a glucose-sensing receptor, and the glucose-sensing signal it generates can lead to inhibition of the RGT1 transcriptional repressor and, thus, derepression of the HXT genes that encode glucose transporters [42,43]. RGT2 was downregulated in response to glucose starvation [42]. Does the downregulation of SUC2, RGT2, and HXTx indicate that sucrose is primarily hydrolyzed intracellularly in the MT strain? However, the plasma membrane H+-ATPases PMA1 and PMA2, which are related to the transmembrane transport of sucrose, were also downregulated in the MT strain. How the MT strain efficiently uses sucrose to produce ethanol is our next research objective. Our network diagram results revealed the potential pathway of sucrose absorption and provide a basis for subsequent study of the mode of sucrose absorption.
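The partner selection described above (for each core gene, the genes with the largest absolute Pearson correlation) can be sketched with the standard library alone. The p-value filter (≤ 0.05) is deliberately omitted here, and the gene labels and expression values in the usage note are invented for illustration.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def top_partners(core: str, expr: dict[str, list[float]],
                 k: int = 10) -> list[tuple[str, float]]:
    """Return the k genes most strongly correlated (by |r|) with `core`."""
    scores = [(g, pearson_r(expr[core], v)) for g, v in expr.items() if g != core]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)[:k]
```

With toy data such as `{"core": [1, 2, 3, 4], "g1": [2, 4, 6, 8], "g2": [4, 3, 2, 1.5], "g3": [1, 3, 2, 4]}`, `top_partners("core", expr, k=2)` returns g1 (positively correlated) followed by g2 (negatively correlated), since ranking uses the absolute value of r.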

Carbon Metabolism in the MT Strain
The metabolic pathway enrichment results of exclusive genes in the MT strain were mainly reflected in amino acid, fatty acid, and sugar metabolism. The carbon skeletons of amino acids are broken down to form acetyl-CoA, α-ketoglutaric acid, succinyl-CoA, fumaric acid, and oxaloacetic acid, all of which enter the citric acid cycle. Amino acid synthesis, in turn, uses pyruvate, α-ketoglutaric acid, and oxaloacetic acid, which are intermediates of glycolysis and the citric acid cycle, together with intermediates of the pentose phosphate pathway. These intermediates link sugar metabolism to amino acid metabolism [44,45]. Acetyl-CoA is the catabolic product of fatty acids and the only source of carbon atoms in the fatty acid molecule. As shown in Figure 13, the expression of exclusive genes in the MT strain involved in amino acid, fatty acid, and sugar metabolism directed carbon flow toward pyruvate and acetyl-CoA. Downregulation of the ketol-acid reductoisomerase (ILV5), 2-isopropylmalate synthase (LEU4 and LEU9), and dihydrolipoyllysine-residue acetyltransferase (LAT1) genes can reduce the conversion of pyruvate to valine, leucine, isoleucine, and acetyl-CoA. Upregulation of the xylulokinase (XKS1), 3-hydroxybutyrate dehydrogenase 2 (BDH2), aldo-keto reductase superfamily protein (YJR096W), and GRE3 genes contributes to the conversion of ribose-5-phosphate to D-xylose, and subsequently to pyruvate. Pyruvate accumulation is beneficial for ethanol production. Upregulation of POT1 and downregulation of the fatty acid synthase (CEM1), long-chain fatty acid-CoA ligase (FAA3 and FAA4), fatty acid elongase (ELO1 and YLR372W), carnitine O-acetyltransferase (YAT1), and enoyl-CoA hydratase (PHS1) genes enhance the breakdown of fatty acids to acetyl-CoA and decrease fatty acid synthesis from acetyl-CoA. Upregulation of the aldehyde dehydrogenase (ALD6) and bifunctional alcohol dehydrogenase (SFA1) genes promotes the conversion of acetyl-CoA to ethanol. Upregulation of the aryl-alcohol dehydrogenase (AAD3 and AAD4), alpha-trehalase (NTH1), hexokinase 1 (HXK1), YJR096W, glycerol 2-dehydrogenase (GCY1), and ALD6 genes directs the flow of carbon into the glycolytic pathway, and then into pyruvate and ethanol. The expression of the alanine transaminase (ALT2), aspartate transaminase (AAT1), and ALD6 genes promotes the breakdown of amino acid carbon skeletons to form α-ketoglutaric acid and succinyl-CoA. Ethanol could indirectly be a major source of acetyl-CoA and NADPH during fermentation by S. cerevisiae, and ALD6 plays an important role in this process [46]. Understanding these metabolic processes is relevant to the engineering of S. cerevisiae strains for ethanol production.

Conclusions
We obtained a mutant S. cerevisiae strain with improved capacity for ethanol fermentation and analyzed its genomic structure and gene expression changes. The SNPs had a high distribution density on all chromosomes except for the ends of chromosomes I and II, while the InDels had the highest distribution density on the mitochondrial genome. GO and KEGG enrichment analyses of the genes associated with SNPs and InDels were performed, and the significantly enriched metabolic pathways included carbohydrate metabolism, amino acid metabolism, metabolism of cofactors and vitamins, and lipid metabolism. There were significant differences in gene expression between the two strains during fermentation. The results of KEGG enrichment of DEGs between the two strains were consistent with the result

Figure 2. Genome sequencing results. Location and coding information of SNPs (A), location and coding information of InDels (B), and SV of the two strains (C). SNPs: Single Nucleotide Polymorphisms; InDels: Insertion-Deletion Mutations.

Figure 3. The distribution density of SNPs and InDels on chromosomes in the MT strain. The distribution density of SNPs (A), the distribution density of InDels (B). SNP: Single Nucleotide Polymorphism; InDel: Insertion-Deletion Mutation.

Figure 4. GO enrichment of the genes associated with SNPs and InDels in the MT strain. GO enrichment of the genes associated with SNPs (A), GO enrichment of the genes associated with InDels (B). GO: Gene Ontology; SNP: Single Nucleotide Polymorphism; InDel: Insertion-Deletion Mutation.

Figure 5. KEGG enrichment of the genes associated with SNPs and InDels in the MT strain. KEGG enrichment of the genes associated with SNPs (A), KEGG enrichment of the genes associated with InDels (B). KEGG: Kyoto Encyclopedia of Genes and Genomes; SNP: Single Nucleotide Polymorphism; InDel: Insertion-Deletion Mutation.

Figure 6. Statistics of the DEGs between different comparison groups. DEGs: differentially expressed genes.

Figure 8. Gene expression trends of DEGs between the two strains during fermentation. The black line is the trend line, and the gray lines are individual gene lines. Gene expression trends of DEGs in the WT strain (A), profiles of the WT strain ordered by p-value significance of the number of genes assigned versus expected (C), gene expression trends of DEGs in the MT strain (B), profiles of the MT strain ordered by p-value significance of the number of genes assigned versus expected (D). DEGs: differentially expressed genes.

Figure 9. Exclusive genes identified by Venn diagrams and KEGG pathway enrichment of these exclusive genes in the MT strain. Venn diagram analysis of MT profile0 and WT profile0, and the top 15 pathway enrichments of the exclusive genes in the MT strain (A), Venn diagram analysis of MT profile1 and WT profile1, and the top 15 pathway enrichments of the exclusive genes in the MT strain (B), Venn diagram analysis of MT profile6 and WT profile6, and the top 15 pathway enrichments of the exclusive genes in the MT strain (C), Venn diagram analysis of MT profile7 and WT profile7, and the top 15 pathway enrichments of the exclusive genes in the MT strain (D). KEGG: Kyoto Encyclopedia of Genes and Genomes; DEGs: differentially expressed genes.

Sucrose Consumption of the WT and MT Strains
S. cerevisiae consumes sucrose in two ways. (1) The hydrolysis of sucrose into glucose and fructose by extracellular sucrose invertase encoded by the SUC gene family.

Figure 12. Protein-protein interaction network of genes related to sucrose consumption. Core genes are shown as diamonds and other genes as circles, labeled with gene ID or gene symbol. Node color is graded according to log2FC: red represents log2FC > 0, and the darker the red, the larger the upregulation ratio; blue represents log2FC < 0, and the darker the blue, the larger the downregulation ratio. Positive correlations are shown by solid gray lines, and negative correlations by dashed gray lines; line thickness scales with the absolute value of the correlation coefficient, so a larger value gives a thicker line.

Figure 13. Exclusive genes in the MT strain involved in carbon metabolism. Yellow box: glycolysis; blue box: fatty acid metabolism; green box: amino acid metabolism; purple box: tricarboxylic acid cycle; red box: alcohol biosynthesis. Red represents upregulated genes; green represents downregulated genes. The 3D structures of compounds were downloaded from PubChem (https://pubchem.ncbi.nlm.nih.gov/, accessed on 28 September 2022).

Table 1. Genes involved in the top 15 enriched pathways of the exclusive genes in the MT strain.

Table 2. DEGs between WT-T2 and MT-T2 that are possibly related to sucrose consumption.
a: log2 fold change of differential expression; a positive value indicates upregulation, and a negative value indicates downregulation.