Comparative Genomic Analysis of the Mutant Rhodotorula mucilaginosa JH-R23 Provides Insight into the High-Yield Carotenoid Mechanism

Abstract: In this study, the wild-type Rhodotorula mucilaginosa GDMCC 2.30 and its high carotenoid-producing mutant JH-R23, which was screened from the wild type after space mutation breeding, were used as materials. Through whole-genome sequencing and resequencing analysis, the carotenoid metabolic pathway and the mechanism of high carotenoid production in the mutant were explored. The R. mucilaginosa GDMCC 2.30 genome comprised 18 scaffolds and one circular mitochondrial genome, with a total size of 20.31 Mb, a GC content of 60.52%, and 7128 encoded genes. The mitochondrial genome comprised 40,152 bp with a GC content of 40.59%. Based on functional annotations in the GO, KEGG, and other protein databases, nine candidate genes associated with carotenoid metabolic pathways, as well as candidate genes of the CrtS and CrtR homologous gene families, were identified. The carotenoid metabolic pathway was inferred to proceed from sugar metabolism to the mevalonate pathway, as is common to most fungi. The final product of the mevalonate pathway, geranylgeranyl diphosphate, is a precursor for various carotenoids, including β-carotene, lycopene, astaxanthin, and torularhodin, which are formed through the activity of crucial enzymes encoded by genes such as CrtI, CrtYB, CrtS, and CrtR. Resequencing analysis of the mutant JH-R23 detected mutations in the exons of four genes, namely those encoding Gal83, 3-oxoacyl-reductase, a p24 family protein, and GTPase. These mutations are interpreted to have an important impact on carotenoid synthesis by JH-R23.


Introduction
Carotenoids are fat-soluble pigments present in higher plants, fungi, algae, and bacteria [1]. These compounds are favored by the food (including health food) and cosmetics industries for their diverse physiological functions, such as antioxidant and antitumor activity and immunity enhancement [2,3]. The global market value for carotenoids is projected to expand to USD 2 billion by 2026, with an annual growth rate of 4.2% [4]. Compared with chemical synthesis and biological extraction, fermentation is currently the main method used for commercial carotenoid production. To further improve the production efficiency of carotenoids, research on strains producing high carotenoid yields and their metabolic engineering has become an important focus for the industrial fermentation production of carotenoids.
Rhodotorula mucilaginosa is a red yeast rich in carotenoids and lipids that has been isolated from seawater, sediments, glaciers, and other environments. Food waste, agricultural waste, and other substances can be used as culture media for high-density fermentation production of carotenoids and lipids by this species [5][6][7]. Although wild-type red yeast strains are highly adaptable, their carotenoid production is relatively low. Therefore, isolation of red yeast mutants producing high carotenoid yields, and optimization of metabolic regulation based on a clear understanding of the mutation mechanism, are urgently required for enhanced industrial production [8]. Cutzu et al. applied ultraviolet mutagenesis to Rhodotorula glutinis and isolated a mutant with a β-carotene yield 2.8-fold that of the parental strain [9]. Nasrabadi et al. combined multiple physicochemical mutagenesis methods to obtain the mutant Rhodotorula acheniorum MRN and increased its carotenoid yield by 6.45-fold through culture medium optimization [10]. Zheng et al. used atmospheric and room temperature plasma mutagenesis and screened the mutant Rhodotorula toruloides M18, which showed a 14.68-fold increase in torularhodin production over that of the wild type [11]. In previous work, our research group loaded R. mucilaginosa GDMCC 2.30 onto the "New Generation Manned Spacecraft Test Ship" to obtain the mutant JH-R23 and increased its carotenoid production by 2.46-fold through optimization of the fermentation process [12].
However, little information is available on the mechanism of high carotenoid production in red yeast mutants. Only a few draft genomes assembled from Illumina sequencing data are available [13], and high-quality genomic studies are still lacking, which largely hinders improvement of the product yield and scale of carotenoid production through metabolic regulation and optimization. In this study, we sequenced the complete genome of strain GDMCC 2.30 using PacBio and Illumina sequencing technologies and analyzed the carotenoid metabolic pathway by combining the present genomic information with previous research. The mutant JH-R23 was resequenced and its mutations were analyzed by comparative genomics to establish a theoretical basis for research on metabolic engineering for carotenoid production in R. mucilaginosa.

Strain Culture and Genomic DNA Extraction
The wild-type R. mucilaginosa GDMCC 2.30 was purchased from the Guangdong Microbial Culture Collection Center. The strain was cultured in 20 mL potato dextrose broth medium (20% [w/v] potato, 2% [w/v] glucose, pH 7.0) at 28 °C and 200 rpm until the logarithmic phase (OD600 = 1.0). Cells were harvested by centrifugation at 13,400× g for 1 min and immediately flash frozen in liquid nitrogen for subsequent extraction of genomic DNA and total RNA. Genomic DNA was extracted using the cetyltrimethylammonium bromide method [14]. Genomic DNA quality and integrity were assessed by agarose gel electrophoresis and comparison with appropriate size standards, while DNA yield and purity were measured using a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and a TM-380 fluorometer (Turner BioSystems, Inc., Sunnyvale, CA, USA). High-quality DNA (OD260/280 = 1.8-2.0, >1 µg) was used for the study.

Genome Sequencing and Transcriptome Sequencing
The genomes of the wild-type GDMCC 2.30 and the mutant JH-R23 were sequenced by Shanghai Winnerbio Technology Co., Ltd. (Shanghai, China) using the PacBio Sequel II and Illumina NovaSeq 6000 platforms. For PacBio sequencing, fragments shorter than 500 bp were removed from the SMRTbell library before long-read sequencing. For Illumina sequencing, genomic DNA was fragmented using the E220 Focused-ultrasonicator (Covaris, Shelton Connecticut, MA, USA), and the library was prepared by end repair, A-tailing, adapter ligation, purification, and PCR amplification, followed by paired-end sequencing (2 × 150 bp).
Total RNA was extracted from each strain using the TRIzol reagent kit (Sangon Biotech, Shanghai, China) following the manufacturer's protocol. Transcriptome sequencing libraries were constructed using the NEBNext Ultra Directional RNA Library Prep Kit (New England Biolabs, Ipswich, MA, USA) and sequenced on an Illumina NovaSeq 6000 by Shanghai Winnerbio Technology Co., Ltd. (Shanghai, China). All genomic and transcriptome sequencing data are available in the NCBI database under BioProject PRJNA1034680.
The R. mucilaginosa GDMCC 2.30 genome was annotated using the pipeline shown in Figure 1 [20].
Fermentation 2024, 10, x FOR PEER REVIEW 3 of 14
After obtaining initial gene models, PASA was used to update and refine them. Additionally, gene structure annotation information was manually reviewed and corrected using IGV-GSAman (v0.7.14) (https://gitee.com/CJchen/IGV-sRNA, accessed on 13 February 2023). This allowed us to obtain the final gene model prediction. The genome annotation was submitted to the NCBI database under BioProject PRJNA1034680.

Space Breeding Mutation
For pre-flight preparation, R. mucilaginosa GDMCC 2.30 cells were cultured in tubes containing potato dextrose agar (HuanKai, Guangzhou, China) at 28 °C for 48 h. These tubes were shipped from Shenzhou Biotechnology Co. (Inner Mongolia, China) to the "New Generation Manned Spacecraft Test Ship", which flew for 67 h in space under special conditions of microgravity (10⁻⁶-10⁻³ g), vacuum (101.325 kPa), temperature (17-23 °C), and cosmic ionizing radiation (0.146 Gy/y). The tubes were returned to the laboratory after the successful landing of the return capsule.
After the space mission, the cells were gently scraped from the potato dextrose agar tubes, resuspended in 10 mL of sterile water, diluted to 10⁻⁵, 10⁻⁶, and 10⁻⁷ with sterile water, and plated on potato dextrose agar at 28 °C for 48 h to screen for the highest carotenoid-producing strain. The strain with the highest carotenoid production on the potato dextrose agar plates was named JH-R23. The wild-type and mutant strains were each inoculated into 20 mL of potato dextrose broth medium and cultured at 28 °C and 200 rpm for 144 h. Carotenoid production and dry cell weight were determined every 24 h to compare the two strains. In parallel, both strains were plated on potato dextrose agar plates and incubated at 28 °C for 120 h for preservation.
The internal transcribed spacer (ITS) region of rDNA was amplified by polymerase chain reaction (PCR) using the primers ITS1 (5′-TCCGTAGGTGAACCTGCGG-3′) and ITS4 (5′-TCCTCCGCTTATTGATATGC-3′), purchased from Sangon Biotech (Shanghai, China) [44]. The PCR products were analyzed by electrophoresis before being sent to Sangon Biotech (Shanghai, China) for sequencing. The obtained ITS sequences were compared against the NCBI database using BLAST for identification and are available in the NCBI GenBank database under accession numbers OR976268-OR976269.

Carotenoid Yield and Biomass Determination
Carotenoids were extracted as described by Tian et al. [45], with some modifications as follows. After fermentation, cells were disrupted using the DMSO cell wall breaking method. Fermentation broth (5 mL) was centrifuged at 7576× g for 5 min to obtain wet cells, which were washed twice with triple distilled water, resuspended once in 5 mL of 99% anhydrous ethanol, and centrifuged at 7576× g for 5 min. The supernatant was discarded, 2 mL DMSO was added to resuspend the cells, and the cell suspension was incubated in a 65 °C water bath for 1.5 h. Then, 6 mL acetone was added and the mixture was shaken for 10 min until the cells were colorless. After centrifugation at 7576× g for 5 min, the supernatant was collected to measure the absorbance at 480 nm. Total carotenoid yield was calculated according to previous studies [46] using the following formula:

$$\text{Carotenoid yield } (\mu\text{g/g}) = \frac{A \times D \times V}{0.16 \times W}$$

where A is the absorbance at 480 nm, D is the dilution factor, V is the volume of organic solvent used for extraction (mL), W is the dry weight of cells used for extraction (g), and 0.16 is the extinction coefficient of the organic solvent.
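The yield calculation can be sketched in a few lines of Python. This is a minimal illustration that assumes the commonly used form yield = A × D × V / (0.16 × W), with A the absorbance at 480 nm; the function name, argument names, and example readings are hypothetical and are not taken from the study.

```python
def carotenoid_yield(absorbance_480, dilution_factor, solvent_volume_ml,
                     dry_weight_g, extinction_coefficient=0.16):
    """Return total carotenoid yield in ug per g dry cell weight.

    Assumed formula: (A * D * V) / (extinction_coefficient * W).
    """
    if dry_weight_g <= 0:
        raise ValueError("dry weight must be positive")
    numerator = absorbance_480 * dilution_factor * solvent_volume_ml
    return numerator / (extinction_coefficient * dry_weight_g)


# Hypothetical readings (not measured values from the study):
# A480 = 0.40, undiluted extract (D = 1), 0.05 g dry cells.
# V = 8 mL assumes the 2 mL DMSO and 6 mL acetone are counted together.
example_yield = carotenoid_yield(0.40, 1, 8.0, 0.05)
print(f"{example_yield:.1f} ug/g")
```

Keeping the extinction coefficient as a keyword argument makes it easy to adapt the sketch if a different solvent system is used.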
For biomass determination [47], 5 mL of fermentation broth was centrifuged at 7576× g for 5 min. The supernatant was discarded, and the wet cells were washed twice with 5 mL of triple distilled water and collected by centrifugation. The wet cells were dried to constant weight in a 105 °C oven, and the dry weight was recorded as the biomass.

Statistical Analysis
The data are presented as the mean ± SD of at least three independent experiments. All data were statistically analyzed using SPSS software (v26.0). Significant differences were assessed using two-sided t-tests (p < 0.05).
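As a rough illustration of this kind of summary, the following Python sketch computes the mean ± SD for two groups of replicates and a two-sample t statistic. It is not the SPSS procedure used in the study: it assumes Welch's unequal-variance form of the test, the replicate values are hypothetical, and the p-value itself would still have to be obtained from a t distribution (for example from a statistics package).

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) for a list of replicates."""
    return mean(values), stdev(values)


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical triplicates in arbitrary units (not data from the study).
group_a = [10.0, 11.0, 12.0]
group_b = [20.0, 22.0, 24.0]

t_stat, dof = welch_t(group_a, group_b)
print("group A: %.2f +/- %.2f" % summarize(group_a))
print("group B: %.2f +/- %.2f" % summarize(group_b))
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```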

Rhodotorula mucilaginosa GDMCC 2.30 Genome Sequencing, Assembly, and Evaluation
To assemble a high-quality genome of R. mucilaginosa GDMCC 2.30, the PacBio Sequel II and Illumina NovaSeq 6000 platforms were used to sequence the complete genome. In total, 6.94 Gb of Illumina short reads and 6.42 Gb of PacBio long reads were obtained, corresponding to approximately 300× coverage of the 21.27 Mb genome by high-quality subreads (Table S2). Based on a k-mer analysis, the genome heterozygosity was 0.0141%, suggesting that the strain was a homozygous diploid (Figure S2). The genome contained 18 scaffolds and a circular mitochondrial genome, with a total size of 20.31 Mb and a GC content of 60.52%. The circular mitochondrial genome comprised 40,152 bp with a GC content of 40.59% (Figure 2). A total of 1714 BUSCO genes were identified (97.1%) and telomere sequences were present at both ends of most scaffolds (Table S3), indicating that the genome data were of high quality. Given the high-depth sequencing, novel sequencing technologies, and suitable assembly methods, the assembly for wild-type GDMCC 2.30 was more complete, more accurate, and closer to chromosome level than assemblies reported for other conspecific strains (Table 1). This laid a sound foundation for functional genomic and mutation analysis of this strain.
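The reported depth can be checked with simple arithmetic: depth is the total number of sequenced bases divided by the genome size. The short Python sketch below uses the read totals and the 21.27 Mb genome size quoted above, and assumes that 1 Gb means 10⁹ bases and 1 Mb means 10⁶ bases; the function name is illustrative.

```python
def sequencing_depth(total_bases, genome_size_bp):
    """Return the average fold coverage (total bases / genome size)."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


GENOME_SIZE_BP = 21.27e6   # 21.27 Mb, as quoted in the text
PACBIO_BASES = 6.42e9      # 6.42 Gb of PacBio long reads
ILLUMINA_BASES = 6.94e9    # 6.94 Gb of Illumina short reads

pacbio_depth = sequencing_depth(PACBIO_BASES, GENOME_SIZE_BP)
illumina_depth = sequencing_depth(ILLUMINA_BASES, GENOME_SIZE_BP)
print(f"PacBio depth:   {pacbio_depth:.0f}x")
print(f"Illumina depth: {illumina_depth:.0f}x")
```

Under these assumptions the PacBio data alone give roughly 300-fold coverage, in line with the figure stated in the text.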

Rhodotorula mucilaginosa GDMCC 2.30 Gene Annotation and Carotenoid Biosynthesis Pathway Prediction
Repeats accounted for 1.71% of the R. mucilaginosa GDMCC 2.30 genome, including LTR transposons, which accounted for 1.03%. The MAKER annotation pipeline predicted 7128 protein-coding genes, 118 tRNAs, 23 rRNAs, and 7 snRNAs. Integrated annotations from the Nr, SwissProt, KOG, GO, and KEGG databases revealed 7015 genes (98.41%) with predictable functions (Figure S3, Table S4), with an average gene sequence length of 1717.9 bp and an average protein sequence length of 571.7 aa. The average number of exons per gene was 6.39 with an average length of 268.8 bp, and the average number of introns per gene was 5.39 with an average length of 89 bp. The mitochondrial genome contained 23 tRNA genes, 3 rRNA genes, and 25 protein-coding genes (Figure 2B). Compared with the reported R. mucilaginosa RIT389 (NC_036340.1) mitochondrial genome [48], the GC content and genome size were basically consistent, with a sequence similarity of 99% (Figure S4).
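The gene structure statistics can be cross-checked against each other with a few lines of arithmetic. The Python sketch below uses only the averages quoted above. Reading the 1717.9 bp average as total exonic (coding) length per gene is an interpretation made here for illustration and is not stated in the text; products of averages are also only approximations.

```python
# Averages quoted in the annotation summary.
EXONS_PER_GENE = 6.39
MEAN_EXON_LENGTH_BP = 268.8
INTRONS_PER_GENE = 5.39
MEAN_INTRON_LENGTH_BP = 89.0
MEAN_PROTEIN_LENGTH_AA = 571.7

# A gene with n exons has n - 1 introns.
expected_introns = EXONS_PER_GENE - 1

# Approximate exonic and intronic sequence per gene.
exonic_bp = EXONS_PER_GENE * MEAN_EXON_LENGTH_BP
intronic_bp = INTRONS_PER_GENE * MEAN_INTRON_LENGTH_BP

# Coding length implied by the protein length (3 nt per codon plus a stop codon).
coding_bp_from_protein = MEAN_PROTEIN_LENGTH_AA * 3 + 3

print(f"expected introns per gene: {expected_introns:.2f}")
print(f"exonic length per gene:    {exonic_bp:.1f} bp")
print(f"intronic length per gene:  {intronic_bp:.1f} bp")
print(f"coding length from protein: {coding_bp_from_protein:.1f} bp")
```

Both estimates of coding length land within about 1 bp of the reported 1717.9 bp, which suggests the figures are internally consistent under this reading.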
Given that annotations of the R. mucilaginosa GDMCC 2.30 genome included six candidate genes associated with the mevalonate pathway and five carotenoid biosynthesis candidate genes in total (Table S5), it was inferred that, with acetyl coenzyme A (acetyl-CoA) as the initial substrate, geranylgeranyl diphosphate (GGPP) could be synthesized through the synergistic activities of multiple crucial enzymes to provide precursors for the biosynthesis of C40 carotenoids [54]. In addition, the genome contained numerous genes and gene clusters encoding carotenoid biosynthetic enzymes, such as lycopene cyclase/phytoene synthase (CrtYB), carotenoid oxygenase (CCD1), phytoene desaturase (CrtI), and geranylgeranyl pyrophosphate synthase (CrtE) (Figure 3A). These enzymes collaboratively catalyze the conversion of GGPP into various carotenoids, including lycopene, β-carotene, and torulene [55,56].

Currently, the metabolic process of astaxanthin and torularhodin synthesis in Rhodotorula yeasts remains controversial. Previous studies indicate that, in Phaffia rhodozyma, CrtS and CrtR collaboratively catalyze the ketolation and hydroxylation of β-carotene into astaxanthin [57][58][59]. By constructing phylogenetic trees for the Ascomycota CrtS (OG00000414) and CrtR (OG00004469) gene families and predicting their domain structures (Figure S5), we observed that both gene families had similar domain compositions and motif constitutions (Table S6). Motif_7a showed species specificity, whereas motif_7b showed copy number specificity. The CYP_FUM15-like structural domain is highly conserved in the CrtS gene family, while the CYPOR and Flavodoxin_1 structural domains are conserved in the CrtR gene family. Therefore, although R. mucilaginosa is phylogenetically distant from P. rhodozyma, analysis of the amino acid sequence similarity and domain structure suggested a strong similarity in adaptive evolution. It is inferred that the enzymes encoded by these gene families in R. mucilaginosa have ketolation or hydroxylation functions similar to those in P. rhodozyma and participate in the synthesis of carotenoids such as astaxanthin and torularhodin. However, the exact synthesis pathways for these carotenoids in R. mucilaginosa GDMCC 2.30 require further verification by heterologous expression experiments. In summary, based on the present genomic information and previous research, the carotenoid biosynthetic pathway in R. mucilaginosa GDMCC 2.30 was reconstructed from a KEGG pathway analysis (Figure 3B), which lays the foundation for future understanding of carotenoid biosynthesis and provides new insights into the study of carotenoid metabolic pathways.

Analysis of the Mechanism of High Carotenoid Production in the Mutant JH-R23
After space mutation breeding treatment of R. mucilaginosa GDMCC 2.30, we diluted the yeast cells and plated them on potato dextrose agar plates to screen for a high carotenoid-producing mutant strain. Pigment intensity analysis was performed on 46 colonies grown on potato dextrose agar plates (Figure 4A, left), and we obtained a deep red mutant strain, JH-R23, which was identified molecularly, verified by fermentation, and preserved (Figure 4A, right). A 617 bp fragment of the rDNA-ITS region was amplified and sequenced. The sequence obtained was compared with sequences in the NCBI database and showed 100% similarity to the R. mucilaginosa strain (OR976269.1), confirming the identity of the isolated strain. During the first 24 h of cultivation, the carotenoid production of JH-R23 and the wild-type strain was similar, with no significant difference. From 48 h of cultivation onward, however, the carotenoid production of JH-R23 measured every 24 h was 1.98, 1.94, 2.01, 2.39, and 2.46 times that of the wild-type strain, respectively, and reached a peak after 144 h of cultivation. The carotenoid production of the wild-type strain was 151.39 µg/g and that of the JH-R23 mutant strain was 372.84 µg/g (Figure 4B). Mutagenesis has been used to enhance carotenoid production in R. mucilaginosa to varying degrees, but current research has mainly focused on UV mutagenesis. Issa et al. mutagenized R. mucilaginosa A734 using UV light at 254 nm, resulting in a 1.12-fold increase in total carotenoids [60]. This increase was much lower than that of JH-R23 in this study (2.46-fold), and this is the first time that such a significant increase in carotenoids has been reported for R. mucilaginosa by space mutagenesis.
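The fold change between the two reported yields can be reproduced directly. The following Python sketch uses the two values quoted above (151.39 µg/g for the wild type and 372.84 µg/g for JH-R23); the helper name is illustrative.

```python
def fold_change(mutant_value, wild_type_value):
    """Return the ratio of the mutant value to the wild-type value."""
    if wild_type_value <= 0:
        raise ValueError("wild-type value must be positive")
    return mutant_value / wild_type_value


WILD_TYPE_YIELD = 151.39   # ug/g, wild-type GDMCC 2.30
MUTANT_YIELD = 372.84      # ug/g, mutant JH-R23

ratio = fold_change(MUTANT_YIELD, WILD_TYPE_YIELD)
percent_increase = (ratio - 1) * 100
print(f"fold change: {ratio:.2f}")
print(f"relative increase: {percent_increase:.1f}%")
```

The ratio rounds to 2.46, so the mutant yield is about 2.46 times the wild-type yield, which corresponds to an increase of roughly 146% over the wild type.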
Genome resequencing analysis of the genetic variation in JH-R23 detected 38 SNPs and 58 InDels (Table S7).The majority of mutations were located in intergenic regions, but mutations were detected in the exons of four genes encoding Gal83, 3-oxoacyl-reductase, p24 family protein, and GTPase.
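The two steps implied here, sorting variants into SNPs and InDels and checking whether each falls inside an exon, can be sketched as follows. This is a simplified illustration rather than the variant-calling workflow used in the study: the coordinates, alleles, and exon intervals are hypothetical and are not taken from Table S7.

```python
from collections import Counter


def variant_type(ref, alt):
    """Classify a variant from its reference and alternate alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "InDel"
    return "MNP"  # multi-nucleotide substitution of equal length


def region_of(position, exons):
    """Return 'exon' if the position lies in any (start, end) interval."""
    for start, end in exons:
        if start <= position <= end:
            return "exon"
    return "other"


# Hypothetical records on one scaffold: (position, ref allele, alt allele).
variants = [(120, "A", "G"), (450, "G", "GC"), (900, "TCA", "T"), (1500, "C", "T")]
# Hypothetical exon intervals (1-based, inclusive).
exons = [(400, 600), (1400, 1600)]

type_counts = Counter(variant_type(ref, alt) for _, ref, alt in variants)
exonic = [v for v in variants if region_of(v[0], exons) == "exon"]
print(type_counts)
print(exonic)
```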
The Gal83 protein is the β subunit of the Snf1 protein kinase complex, forming a heterotrimeric complex with the α subunit Snf1 and the γ subunit Snf4 [61]. As an important intracellular energy sensor, this complex is activated under glucose starvation and participates in relieving repression by glucose catabolism products. For example, the Snf1 protein kinase complex promotes expression of the hexose transporter genes HXT2 and HXT4 and phosphorylates the Mig1 transcriptional repressor to relieve the inhibition of Gal gene transcription regulated by galactose induction [62,63]. Gal83 contains a highly conserved glycogen-binding domain that is homologous to that of AMPK family proteins. The insertion of a cytosine nucleotide in the Gal83 coding sequence of JH-R23 caused a frameshift mutation, shortening the protein from 851 to 540 amino acids (Figure 4C) and resulting in the loss of glycogen-binding capability. The mutation that eliminated Gal83-glycogen binding would also affect the activity of the Snf1 protein kinase complex, thus weakening the feedback inhibition effect of glucose catabolism products. This would lead to sustained expression of high-affinity transport proteins, thereby enhancing the co-consumption of glucose and xylose to provide substrates and energy for carotenoid synthesis [64]. In addition, this mutation may influence the conformation of adjacent domains or interactions with unknown signaling molecules, and thus the specific mechanism requires further study [64]. As shown by Wang et al. [54], transcriptome comparison between a Gal83 knockout strain and the wild type revealed upregulation of the acetyl-CoA and CoA biosynthesis pathways, but downregulation of the sugar lipid metabolism and ether lipid metabolism pathways. This finding highlighted the regulatory role of Snf1 in carbon source utilization, sporulation, trap formation, oxidative stress response, and other metabolic activities, providing substrates and energy for the downstream synthesis of metabolic products [65]. In addition, 3-oxoacyl-reductase plays an important role in the primary stage of lipid synthesis, and multiple synonymous mutations may reduce the expression of lipid synthetic enzymes by changing the codon usage preference, thereby reducing the competition for carotenoid precursors. Mutation of GTPase can increase the sensitivity of a strain to osmotic pressure and oxidative stress, stimulating elevated carotenoid production for antioxidation [66]. In summary, the Gal83 mutation may be the major reason for the change in carotenoid yield in the JH-R23 mutant, but the mutations in the genes encoding 3-oxoacyl-reductase and GTPase may also increase carotenoid yield to some extent.
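To illustrate how a single-nucleotide insertion such as the one reported for Gal83 can truncate a protein, the Python sketch below translates a short toy coding sequence before and after inserting one cytosine. The sequence and insertion position are invented for illustration and are unrelated to the actual Gal83 gene; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(cds, index, base):
    """Return the sequence with one base inserted before the given index."""
    return cds[:index] + base + cds[index:]


# Toy open reading frame (hypothetical, not the Gal83 sequence).
wild_type_cds = "ATGGCTGAATTGACTAAGGGTTCATAA"
mutant_cds = insert_base(wild_type_cds, 6, "C")  # insert C after the 2nd codon

wild_type_protein = translate(wild_type_cds)
mutant_protein = translate(mutant_cds)
print(wild_type_protein, len(wild_type_protein))
print(mutant_protein, len(mutant_protein))
```

In this toy case the insertion shifts the reading frame, changes the downstream residues, and exposes an earlier stop codon, so the product is shorter than the wild-type peptide. This is the same kind of effect described for the shortened Gal83 protein.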
Comparative analysis of 13 red yeast genomes revealed 4084 orthologous gene families and 1646 single-copy orthologous gene families (Table S8). Based on the gene family analysis, a phylogenetic tree for 14 red yeast species was constructed (Figure 5). A GO enrichment analysis indicated that JH-R23-specific gene families were mainly enriched in purine nucleotide biosynthesis and metabolism, cytoplasmic ribosomes, pantothenate biosynthesis and metabolism, and amide biosynthesis and metabolism (Table S9). Pantothenate is a precursor of CoA, which can promote the synthesis of acetyl-CoA and stimulate energy metabolism, thereby increasing the supply of carotenoid precursors and energy, and enabling the potential capability for high carotenoid production in JH-R23 [67,68]. Comparative evolutionary analysis showed that 88 gene families were significantly expanded and 69 gene families were significantly contracted in JH-R23 (p < 0.05; Figure 5). Furthermore, GO and KEGG enrichment analyses indicated that these gene families were associated with ABC transporters, ribosomes, and other pathways (Table S10). These findings suggested that the mutant JH-R23 exhibited higher metabolic activity, unique pantothenate synthesis, and stronger transmembrane transport capabilities, thus enabling the high carotenoid production.
In summary, an exon mutation disrupted the glycogen-binding domain of Gal83, which would affect the activity of the Snf1 protein kinase complex and weaken its feedback inhibition of glucose. This would lead to sustained expression of high-affinity transport proteins, thereby increasing sugar consumption to provide substrates and energy for carotenoid synthesis.

Conclusions
In this study, we generated a near chromosome-level genome assembly of R. mucilaginosa GDMCC 2.30 using PacBio Sequel II and Illumina NovaSeq 6000 sequencing data. This genome contained 18 scaffolds and a circular mitochondrial genome, with a total size of 20.31 Mb and a GC content of 60.52%. The mitochondrial genome comprised 40,152 bp with a GC content of 40.59%. A total of 1714 BUSCO genes (97.1%) were identified, and most scaffold ends contained telomere sequences, indicating that the data quality was high and enabling further analysis of the biological evolution and functional genomics of the strain. Through in-depth analysis and sorting of carotenoid-related gene clusters and families, the carotenoid biosynthesis pathway in the strain was inferred to proceed from sugar metabolism to mevalonate metabolism, with the final product of mevalonate metabolism (GGPP) serving as a precursor for various carotenoids formed through the activity of crucial enzymes encoded by genes such as CrtI, CrtYB, CrtS, and CrtR. The mutant JH-R23, screened after loading GDMCC 2.30 on the "New Generation Manned Spacecraft Test Ship", was resequenced, and comparative genomic analysis showed that an exon mutation disrupted the glycogen-binding domain of Gal83, thereby affecting the activity of the Snf1 protein kinase complex and weakening the feedback inhibition of glucose catabolism. These changes would lead to sustained expression of high-affinity transport proteins, thereby enhancing sugar consumption to provide substrates and energy for carotenoid synthesis. Multiple synonymous mutations of 3-oxoacyl-reductase and GTPase were additional important factors that would contribute to the increase in carotenoid production, establishing the genetic foundation for elevated carotenoid production.

Supplementary Materials:
The following supporting information can be downloaded at: https://doi.org/10.5281/zenodo.10431767. Figure S1: IGV-GSAman software correction schematic. Figure S2: Evaluation of R. mucilaginosa genome size and ploidy by k-mer analysis. Figure S3: R. mucilaginosa GDMCC 2.30 gene annotation status. Figure S4: R. mucilaginosa GDMCC 2.30 mitochondrion alignment status. Figure S5: Phylogenetic relationships, protein structural domains, and motif structures of specific gene families. Table S1: Table of Species in Comparative Genomic Analysis. Table S2: Summary of sequencing data of the R. mucilaginosa GDMCC 2.30 genome. Table S3: Summary of BUSCO's completeness analysis of R. mucilaginosa GDMCC 2.30 genome. Table S4: Annotation Statistics of the R. mucilaginosa GDMCC 2.30 genome. Table S5: List of Genes used in Figure 2B and its TPM. Table S6: Amino acid sequences for the CrtS gene family (OG00000414) and the CrtR gene family (OG00004469) for each species. Table S7: SNP/Indel analysis results in GDMCC 2.30 vs. JH-R23. Table S8: Statistics for gene family inference and gene counts for each family and each species. Table S9: GO enrichment results of species-specific genes of the R. mucilaginosa JH-R23. Table S10: GO/KEGG enrichment results of significantly expanded genes in R. mucilaginosa JH-R23.
Author Contributions: Conceptualization, methodology, visualization, and writing-original draft preparation, J.H.; conceptualization, methodology, and formal analysis, S.Y.; supervision, writing-review and editing, funding acquisition, project administration, and resources, H.J. All authors have read and agreed to the published version of the manuscript.

Figure 1. Schematic illustration of the pipeline for the genome annotation of R. mucilaginosa GDMCC 2.30 supplemented with transcriptome sequences. The orange lines indicate the import of transcripts into MAKER3 for the further prediction of gene models. The IGV-GSAman software correction schematic is shown in Figure S1.

Figure 2. Circular maps of the Rhodotorula mucilaginosa GDMCC 2.30 genome. (A) Nuclear genome. Tracks from the outermost to innermost circles represent chromosome information, gene expression level, GC content, sequencing depth, and collinearity, respectively. (B) Mitochondrial genome. Genes within the circle are transcribed clockwise, while those outside are transcribed counter-clockwise. Genes are color-coded based on their functional groups. In the inner circle, orange represents the GC content and maroon represents the AT content.

Figure 3. Analysis of the carotenoid metabolic pathway in Rhodotorula mucilaginosa GDMCC 2.30. (A) Comparison of carotenoid biosynthetic gene clusters among 13 red yeast species. Homologous genes are indicated by the same color. (B) Putative carotenoid metabolic pathway map. All protein-coding genes are annotated from the KEGG database. The full gene names are listed in Table S5.

Figure 5. Comparative genomic analysis of Rhodotorula mucilaginosa JH-R23 and other red yeast species. (A) Phylogenetic tree and divergence time estimation for 13 representative red yeast species; (B) Expanded (red) and contracted (green) gene families; (C) Distribution of single-copy, multi-copy, and unmatched gene families among 14 red yeast species.

Funding:
This work was financially supported by the Key-Area Research and Development Program of Guangdong Province (2018B020206001).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: This project has been deposited at NCBI under the accession PRJNA1034680.
a NA, not available.