Genome Sequence and Analysis of the Flavinogenic Yeast Candida membranifaciens IST 626

The ascomycetous yeast Candida membranifaciens has been isolated from diverse habitats, including humans, insects, and environmental sources, and exhibits a remarkable ability to use different carbon sources, including pentoses, melibiose, and inulin. In this study, we isolated four C. membranifaciens strains from soil and investigated their potential to overproduce riboflavin. C. membranifaciens IST 626 was found to produce the highest concentrations of riboflavin. The volumetric production of this vitamin was higher when C. membranifaciens IST 626 cells were cultured in a commercial medium without iron and when xylose was the available carbon source compared to the same basal medium with glucose. Supplementation of the growth medium with 2 g/L glycine favored the metabolization of xylose, leading to an increase in biomass and a consequent enhancement of riboflavin volumetric production, which reached 120 mg/L after 216 h of cultivation. To gain new insights into the molecular basis of riboflavin production and carbon source utilization in this species, the first annotated genome sequence of C. membranifaciens is reported in this article, as well as the results of a comparative genomic analysis with other relevant yeast species. A total of 5619 genes were predicted to be present in the C. membranifaciens IST 626 genome sequence (11.5 Mbp). Among them are genes involved in riboflavin biosynthesis, iron homeostasis, and sugar uptake and metabolism. This work puts forward C. membranifaciens IST 626 as a riboflavin overproducer and provides valuable molecular data for the future development of superior producing strains capable of using the wide range of carbon sources that is a characteristic trait of the species.


Introduction
Candida membranifaciens is an anamorphic yeast first described as Candida melibiosi var. membranifaciens by Lodder & Kreger-van Rij [1]. Based on the examination of strains assigned to Candida melibiosi var. membranifaciens, C. melibiosi var. melibiosi, and C. guilliermondii, Wickerham and Burton observed that the variety membranifaciens was unable to mate and able to ferment melibiose more vigorously than C. melibiosi [2]. Candida melibiosi var. membranifaciens was proposed as the new species C. membranifaciens. More recently, upon phylogenetic analysis of the D1/D2 domains of the large subunit and the nearly complete possible use of yeast biomass as a source of protein [23]. Therefore, the identification of novel riboflavin yeast overproducers and their physiological and molecular characterization are essential steps to gain new insights into relevant unclear issues. This is the case of the regulation of riboflavin secretion and accumulation in the culture medium, the characterization of the unknown phosphatase that catalyzes the dephosphorylation of 5-amino-6-ribitylamino-2,4(1H,3H) pyrimidinedione 5 -phosphate in riboflavin biosynthetic pathway, and the physiological role of riboflavin overproduction under iron-limiting conditions. Moreover, from a circular bioeconomy perspective, the selection of new yeast strains able to effectively use organic residues as alternative feedstocks for the production of added-value compounds such as riboflavin is a topic to be pursued.
In this work, we isolated four C. membranifaciens strains from soil and investigated their potential to overproduce riboflavin. C. membranifaciens IST 626 was found to produce the highest amounts of riboflavin, and riboflavin production by this strain was optimized by manipulating the composition of the growth medium. The genome of C. membranifaciens IST 626 was sequenced and annotated. A comparative genomic analysis with other relevant yeast species is also provided.

Isolation and Identification of Candida membranifaciens Isolates
Candida membranifaciens isolates were obtained from three soil samples collected in Arrábida Natural Park, Sesimbra, Portugal (38°26′12.4″ N, 9° […]). For yeast isolation, three cycles of culture enrichment were performed to avoid growth of filamentous fungi, as previously described [24]. Approximately one gram of each soil sample was inoculated in 50 mL of growth medium containing: 3 g/L malt extract (Sigma-Aldrich, Burlington, MA, USA), 3 g/L yeast extract (ThermoFisher), 5 g/L peptone (ThermoFisher), 1 g/L (NH4)2SO4 (Panreac), 0.25 g/L KH2PO4 (Panreac, pH 5.0), 30 g/L of glucose (Scharlau), and 30 g/L of xylose (Sigma-Aldrich). This growth medium with the soil sample was supplemented with chloramphenicol (100 µg/mL) and incubated at 30 °C and 150 rpm for 48 h (First Enrichment). Then, 1 mL of this culture was added to 49 mL of the same medium and incubated again at 30 °C and 150 rpm for 48 h (Second Enrichment). To differentiate yeasts with the ability to grow on different carbon sources, a differential enrichment step was performed, in which 1 mL from the Second Enrichment culture was added to 49 mL of the same medium but containing either 60 g/L glucose or 60 g/L xylose and incubated under the same conditions as before. After 48 h of cultivation, the samples were diluted in 0.85% NaCl solution and poured into isolation agar medium. This medium contained 3 g/L yeast extract (ThermoFisher), 5 g/L peptone (ThermoFisher), 1 g/L (NH4)2SO4 (Panreac), 0.25 g/L KH2PO4 (Panreac), and 20 g/L agar (NZYtech), with either 60 g/L glucose or 60 g/L xylose [24], and was supplemented with chloramphenicol (100 µg/mL). Plates were incubated at 30 °C for 48 h. Yeast cells from colonies with different morphologies were observed on an Axioplan microscope (×1000 magnification) (Zeiss®) and streaked onto new agar plates to ensure the purity of the isolate. Yeast isolates were maintained at 4 °C until DNA extraction was performed. For long-term storage, isolates were preserved at −80 °C in their isolation medium containing 15% (v/v) glycerol.
For the molecular identification of yeast isolates, genomic DNA was extracted using the phenol:chloroform:isoamyl alcohol method [25] and used as a template for the amplification by polymerase chain reaction (PCR) of the D1/D2 domain sequence of the 26S and the internal transcribed spacer (ITS) region of ribosomal DNA (rDNA). The primer pairs NL-1 (5'-GCATATCAATAAGCGGAGGAAAAG-3') and NL-4 (5'-GGTCCGTGTTTCAAGACGG-3'), and ITS1 (5'-TCCGTAGGTGAACCTGCGG-3') and ITS4 (5'-TCCTCCGCTTATTGATATGC-3'), known to be effective for the taxonomic identification of yeasts [26], were used in the amplification of the D1/D2 and ITS regions, respectively. The two DNA fragments from each isolate were purified using NZYGel pure (NZYtech, Portugal) and Sanger-sequenced (Stabvida, Portugal) using each corresponding primer. The molecular taxonomic identification was performed by comparing D1/D2 and ITS sequences with others deposited in GenBank using the BLAST algorithm from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/blast, accessed on 30 October 2019). The consensus sequences from the D1/D2 region of C. membranifaciens strains IST 495, IST 498, IST 507, and IST 626 were deposited in GenBank under the accession numbers MZ614941, MW003712, MW003715, and MW532700, respectively. The consensus sequences from the ITS region of C. membranifaciens strains IST 495, IST 498, IST 507, and IST 626 were deposited under the accession numbers MZ615411, MW003718, MW003721, and MW532702, respectively.
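As a quick sanity check on the primer sequences listed above, the following minimal Python sketch computes their length and GC content and tests whether ITS4 anneals at the same rDNA site as NL-1 on the opposite strand. The helper names (`reverse_complement`, `gc_percent`) are illustrative and not part of the study's workflow; only the four primer sequences come from the text.

```python
# Primer sequences exactly as given in the text (written 5'->3').
PRIMERS = {
    "NL-1": "GCATATCAATAAGCGGAGGAAAAG",
    "NL-4": "GGTCCGTGTTTCAAGACGG",
    "ITS1": "TCCGTAGGTGAACCTGCGG",
    "ITS4": "TCCTCCGCTTATTGATATGC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Return the percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# Site on the NL-1 strand that ITS4 anneals to.
its4_site = reverse_complement(PRIMERS["ITS4"])
# True if the ITS4 site coincides with the 5' end of NL-1.
shares_site = PRIMERS["NL-1"].startswith(its4_site)

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_percent(seq):.1f}%")
print("ITS4 site overlaps NL-1:", shares_site)
```

With the sequences as printed, the reverse complement of ITS4 matches the first 20 nucleotides of NL-1, which suggests that the ITS amplicon ends where the D1/D2 amplicon begins.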

Phylogenetic Analysis
For the phylogenetic placement of the C. membranifaciens strains isolated in this study, their D1/D2 domain sequences of LSU rDNA were aligned iteratively with the sequences of related species from the Debaryomycetaceae family, retrieved from GenBank under the accession numbers indicated in the phylogenetic tree, using the multiple alignment tool MUSCLE [27]. The software MEGA-X v.10.2.2 was used to construct the phylogenetic tree by the maximum likelihood method with the Tamura-Nei evolutionary model [28]. The confidence level of the clades was estimated using bootstrap analysis with 1000 replicates.

Yeast Strains Tested and Growth Media and Conditions Used
Candida membranifaciens strains IST 495, IST 498, IST 507, and IST 626, isolated and identified as described herein, were examined in this study. We also used C. membranifaciens PYCC 2525ᵀ obtained from the Portuguese Yeast Culture Collection (PYCC). The basal growth media used in this study were prepared with either 6.9 g/L of commercial yeast nitrogen base without amino acids (mentioned hereafter as YNB) or 6.9 g/L yeast nitrogen base without amino acids and without iron (mentioned hereafter as YNB-Fe), both from Formedium™ (United Kingdom), supplemented with 20 g/L glucose (Merck) or 20 g/L xylose (Sigma-Aldrich). All media and solutions were prepared using Milli-Q® water (Merck Millipore) and all growth assays were carried out in 50 mL shake flasks containing 25 mL of medium. Pre-cultivation of yeast cells was performed in either YNB or YNB-Fe with 20 g/L glucose for 24 h at 30 °C with orbital shaking (250 rpm). After pre-cultivation, yeast cells were harvested by centrifugation (5000× g, 5 min) and inoculated at an initial optical density (OD) at 600 nm of 0.1 ± 0.05 in 25 mL of the medium to be tested. To investigate the ability of different C. membranifaciens strains to produce riboflavin, yeast cells were inoculated in either YNB or YNB-Fe with 20 g/L glucose for 120 h under the growth conditions described above. To evaluate the impact of iron supplementation on the production of riboflavin, C. membranifaciens IST 626 cells were inoculated in YNB-Fe with 20 g/L of glucose, to which increasing concentrations of iron (III) chloride (+0.5 µM, +0.8 µM, +1.0 µM, +1.2 µM, +1.5 µM, +2.0 µM) were added. Riboflavin, biomass (OD600nm), and glucose concentrations were determined after 120 h of growth. Cells inoculated in YNB with 20 g/L of glucose were used as a control. To evaluate the impact of different sugars and glycine supplementation of the growth medium on riboflavin production by C. membranifaciens IST 626, cells were cultivated in YNB-Fe media containing either 20 g/L glucose or 20 g/L xylose as a carbon source and supplemented with 1 g/L or 2 g/L glycine (Sigma-Aldrich).

Quantification of Riboflavin
Extracellular riboflavin was determined using a spectrophotometric method as previously described [29,30], with minor modifications. Briefly, 1 mL of each culture was harvested and centrifuged at 10,000 rpm for 3 min to remove cells. Supernatants were diluted appropriately in 0.1 N HCl, and riboflavin concentrations were determined by reading the absorbance of each sample at 445 nm in a spectrophotometer (Hitachi U-2001) and interpolating on a calibration curve prepared from several dilutions of a pure riboflavin (Sigma-Aldrich) stock solution (0.2 g/L).
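The conversion from absorbance at 445 nm to riboflavin concentration can be sketched as a linear calibration. The Python example below is only an illustration of that idea, assuming a linear response over the working range. The standard concentrations, absorbance readings, and dilution factor are hypothetical values chosen for the example and are not data from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def riboflavin_mg_per_l(a445, slope, intercept, dilution_factor=1.0):
    """Invert the calibration line and correct for sample dilution."""
    return (a445 - intercept) / slope * dilution_factor


# HYPOTHETICAL standards: riboflavin (mg/L) and absorbance at 445 nm.
standards_mg_per_l = [0.0, 2.0, 4.0, 8.0, 16.0]
standards_a445 = [0.0, 0.06, 0.12, 0.24, 0.48]

slope, intercept = fit_line(standards_mg_per_l, standards_a445)

# HYPOTHETICAL sample: supernatant diluted 1:2 in 0.1 N HCl, A445 = 0.30.
sample = riboflavin_mg_per_l(0.30, slope, intercept, dilution_factor=2.0)
print(f"slope = {slope:.4f} AU per mg/L, intercept = {intercept:.4f}")
print(f"sample = {sample:.1f} mg/L")
```

Readings that fall outside the range covered by the standards would need further dilution rather than extrapolation.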

Genome Sequencing, Assembly, and Annotation
Genomic DNA from C. membranifaciens IST 626 was sequenced on an Illumina NovaSeq 6000 platform, producing 2 × 150-bp paired-end reads. Library preparation (NEBNext® DNA Library Prep Kit from New England Biolabs, Inc., Ipswich, MA, USA) and sequencing were carried out by Novogene Bioinformatics Technology Co., Ltd. (Hong Kong). Illumina sequencing produced 21,753,322 raw paired-end reads. Low-quality bases and adapters were removed using BBDuk from the BBMap package (http://jgi.doe.gov/data-and-tools/bb-tools/, accessed on 30 June 2020). Read duplicates were removed using PRINSEQ (v0.20.4) (6). Ultimately, 17,196,026 high-quality reads were used for subsequent analysis. Correction of the reads and assembly into scaffolds were performed using SPAdes (v3.14.0) (7). Scaffolds smaller than 2000 bp were filtered out, and the remaining sets of scaffolds were used as draft assemblies. Assembly quality was analyzed using the Quality Assessment Tool for Genome Assemblies (QUAST; v4.6.3) (8). The final assembly consists of 56 contigs, with a total length of 11,508,125 bases, which includes a mitochondrial DNA contig of 28,672 bp. The sequencing reads were deposited in the Sequence Read Archive (SRA) under the accession number PRJNA777779, and the sequences obtained in the genome assembly were deposited in GenBank under the accession number JAKQXL000000000. During submission to NCBI, three scaffolds were determined to be contaminants and were excluded. The genome was annotated using the JGI Annotation pipeline [31], made available in the JGI fungal genome portal MycoCosm (https://mycocosm.jgi.doe.gov/Canmem1, accessed on 25 February 2022).
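The length filter and the kind of summary statistic reported by assembly-assessment tools can be illustrated with a short Python sketch. This is not the pipeline used in the study: the function names are illustrative, the N50 definition is the common textbook one, and the scaffold lengths are hypothetical toy values. Only the 2000 bp threshold comes from the text.

```python
def filter_scaffolds(lengths, min_len=2000):
    """Keep scaffolds of at least min_len bp (smaller ones are discarded)."""
    return [n for n in lengths if n >= min_len]


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half
    of the total assembly length."""
    total = sum(lengths)
    running = 0
    for n in sorted(lengths, reverse=True):
        running += n
        if running * 2 >= total:
            return n
    return 0  # empty input


# HYPOTHETICAL scaffold lengths in bp, for illustration only.
toy_lengths = [500, 1500, 2000, 3000, 10000, 4000]

kept = filter_scaffolds(toy_lengths)
print("scaffolds kept:", len(kept))
print("total length:", sum(kept), "bp")
print("N50:", n50(kept), "bp")
```

Applied to real data, the lengths would be read from the assembler's FASTA output instead of a hard-coded list.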

Analysis of Genome Content
Gene prediction was performed by using the tools available at MycoCosm, the JGI's web-based fungal genomics resource, at the JGI portal [31]. The functional analysis of predicted genes was based on the eukaryotic orthologous groups of proteins (KOG) classification [31,32]. Proteins involved in riboflavin biosynthesis and transport, as well as the proteins involved in iron homeostasis and transport were analyzed using the Annotation and Blast analysis tools from JGI. Members of sugar porter family (2.A.1.1) present in the genome of C. membranifaciens IST 626 were obtained based on JGI data for transporter annotations [31], which uses the Transport Classification (TC) system [33,34]. Glycoside hydrolases were obtained based on CAZy annotations [34] also available at MycoCosm.

Search for Putative Transcription Factor Binding Sites
Putative transcription factor (TF) binding sites were searched for in promoter regions of C. membranifaciens selected genes using YEASTRACT+ [35,36]. This database contains all known regulatory associations between transcription factors and target genes in different yeast species for which the information is available in the literature, which means, essentially, for Saccharomyces cerevisiae. YEASTRACT+ includes genomic information on the flavinogenic yeast species C. albicans [14]. The sequences of promoters of interest were retrieved from MycoCosm at the JGI portal [31,37] and inserted into the "Find TF Binding Site(s)" tool of YEASTRACT+ portal (http://yeastract-plus.org/pathoyeastract/calbicans, accessed on 30 June 2021). The analysis was performed using documented TF binding sites from C. albicans.

Isolation and Identification of Candida membranifaciens Isolates
In this study, four different isolates were obtained from the superficial layer of three different soils, two of them from reserve areas in Arrábida Natural Park and the Berlengas islands. For those isolations, a rich and undefined growth medium containing glucose and xylose was used for the first two enrichment steps, and either one of these sugars was used in the third enrichment step [24]. Strains IST 626, IST 495, and IST 507 were isolated in a medium where xylose was the carbon source present in the third enrichment step, whereas strain IST 498 was isolated when glucose was the carbon source present in that enrichment step.
The isolates were identified based on the comparison of their D1/D2 and ITS sequences with the sequences deposited in the NCBI database. The sequences shared 100% identity with the corresponding C. membranifaciens sequences, and the isolates were therefore assigned to the species C. membranifaciens. The phylogenetic analysis, based on the D1/D2 domain of the LSU rDNA sequence, placed the isolates close to C. membranifaciens NRRL Y-2089ᵀ (=PYCC 2525ᵀ) and the flavinogenic W14-3 strain in the Yamadazyma/Candida clade (Figure 1). The closest species to the C. membranifaciens clade is C. friedrichii [3]. The natural habitat of C. membranifaciens is not defined, but isolates from this species have been retrieved from diverse habitats and substrates, including fresh, marine, and estuarine waters, insects, plants, and clinical specimens [4,5]. Yeasts of the Candida genus have been isolated from soils worldwide [19,38-40], but, to the best of our knowledge, this is the first report on the isolation of the species C. membranifaciens from this habitat. Interestingly, in our study, four strains from this species were isolated from three different soils, showing that soil is a reservoir of this species.

Candida membranifaciens Isolates Are Riboflavin Producers
C. membranifaciens IST 626 was able to produce over 10 mg/L riboflavin when cultured in YNB with 20 g/L of glucose (Figure 2a). The other strains tested produced lower concentrations of the vitamin under the same conditions (Figure 2a). When the five strains examined were cultured in YNB-Fe, all strains except PYCC 2727ᵀ overproduced riboflavin (produced more than 10 mg/L). C. membranifaciens IST 626 produced the highest concentration of this vitamin, which reached approximately 20 mg/L after 120 h of cultivation (Figure 2b). These results indicate that, contrary to C. membranifaciens PYCC 2727ᵀ, the four C. membranifaciens strains isolated in this study are flavinogenic. Moreover, the results confirm that the presence of iron has a marked negative impact on riboflavin production by C. membranifaciens, consistent with previous observations [6].
Figure 2. Riboflavin production by C. membranifaciens strains IST 495, IST 498, IST 507, and IST 626 isolated in this study, and the type strain PYCC 2727ᵀ. All strains were cultured in YNB (a) or YNB-Fe (b). All media contained 20 g/L of glucose. Riboflavin production was determined during 120 h of growth at 30 °C with orbital agitation (250 rpm). Error bars represent the standard deviations of three independent measurements.
Based on the higher riboflavin production capacity of C. membranifaciens IST 626, this strain was selected for further studies that included the determination of the effect of iron addition to YNB-Fe (Figure 3) and the optimization of growth medium conditions for riboflavin production (Figure 4). The supplementation of YNB-Fe with increasing concentrations of iron (III) chloride decreased riboflavin production by C. membranifaciens IST 626 in a dose-dependent manner (Figure 3). Among the conditions tested, the concentrations of riboflavin produced and of remaining glucose were higher and the biomass concentration was lower when C. membranifaciens IST 626 was cultured for 120 h in medium without iron. When increasing concentrations of iron (III) chloride were added to commercial YNB-Fe, a dose-dependent decrease in riboflavin concentration was observed after 120 h of growth. The concentrations of cells and glucose were similar in all media supplemented with iron (III) chloride. The commercial medium YNB (containing ~1.2 µM iron (III) chloride) was used as a control. This study demonstrates the impact of different concentrations of iron (III) chloride on the volumetric production of riboflavin and shows that, in the absence of iron, biomass production and glucose consumption are negatively affected due to iron limitation.
In the case of C. membranifaciens IST 626, riboflavin overproduction (>10 mg/L) was observed when iron (III) chloride was added at concentrations below 1.0 µM. The link between iron metabolism and flavinogenesis has long been established [15] but, after 40 years of research, the molecular and physiological mechanisms underlying such a relationship in flavinogenic yeasts remain poorly understood. Under iron depletion, riboflavin has been suggested to play a role in the nonenzymatic reduction of insoluble Fe3+ to the more accessible soluble Fe2+ or to act as a cofactor for the activity of intra- and extracellular enzymes [12]. This mechanism is well documented in some bacterial species [41,42], but has not been validated in flavinogenic yeasts [12]. Iron is a vital micronutrient that is essential for multiple biological processes, including respiration, given that respiratory complexes contain heme and Fe-S clusters whose synthesis depends on iron availability [43]. In S. cerevisiae, at concentrations below 1 µM Fe, the cell faces iron deficiency and prioritizes the utilization of this micronutrient, meaning that most of the available iron goes to the mitochondria, where it is assembled into Fe-S clusters and heme centers, as well as prosthetic groups that are critical for cellular metabolism [44]. Other cellular processes can be severely affected by iron starvation due to the decrease in the availability of iron-dependent metabolites, such as amino acid intermediates, heme, unsaturated fatty acids, and deoxyribonucleotides [45], which may lead to a deceleration or even abrogation of several metabolic processes, for instance, sugar metabolization. The higher glucose concentration present in the medium without iron after 120 h of cultivation (Figure 3) might be the result of the deceleration of growth and sugar metabolism.
(Figure 4b) showed that the volumetric production of this vitamin was higher when xylose was the available sugar, reaching approximately 33 mg/L after 120 h of cultivation. Ethanol was produced by C. membranifaciens IST 626 when glucose was the available sugar (Figure 4b,c,e). The percentage of glucose diverted to either alcoholic fermentation or respiration in this species has not been studied but, as in S. cerevisiae, C. membranifaciens was also able to produce ethanol from glucose under aerobic conditions. In contrast, xylose was assimilated but not fermented by this species (Figure 4b,d,f). Based on the reported increase in riboflavin production by C. flareri (C. famata) and by A. gossypii when cultivated in media supplemented with glycine [21,46], the effect of glycine supplementation in YNB-Fe media containing either glucose or xylose was also assessed (Figure 4). Although glucose consumption and cell growth were slightly favored by glycine supplementation during the first two days of fermentation, riboflavin production by C. membranifaciens IST 626 was not enhanced by glycine addition to glucose YNB-Fe media (Figure 4a,c,e) and, for this reason, growth curves in these media were only followed for 120 h. In contrast, riboflavin production was stimulated by the addition of glycine to the culture medium containing xylose (Figure 4b,d,f). Riboflavin concentration reached approximately 120 mg/L and xylose was almost fully consumed after 216 h of cultivation in medium supplemented with 2 g/L glycine (Figure 4f). The increase in riboflavin production in glycine-containing media correlates with the increase in yeast biomass, indicating that glycine was essential for the full catabolism of the xylose present. In the flavinogenic yeast C. flareri, the mechanisms underlying the enhancement of riboflavin production by glycine were not clarified [21]. Nevertheless, it is known that, in S. cerevisiae, glycine participates in multiple biological processes. They include, for example, the biosynthesis of purines and glyoxylate [47], of glutathione [48], and of serine by the glycine decarboxylase multienzyme complex, which plays a critical role in connecting the metabolism of one-, two-, and three-carbon compounds in different metabolic pathways [49]. The positive impact of amino acids on the catabolism of other carbon sources by a different nonconventional yeast was previously demonstrated [50]. Although our results do not elucidate how glycine relates to the catabolism of xylose and, consequently, to increased biomass and riboflavin production, overall, they indicate that the production of this vitamin by C. membranifaciens increased remarkably upon optimization of the growth medium with xylose and glycine and that this strain can be further exploited to produce this vitamin.
With the future exploitation of this strain for riboflavin production in mind, it was considered of interest to obtain the first genome sequence for the species C. membranifaciens, using the best riboflavin-producing strain IST 626, and to provide its assembly and annotation. This is expected to help elucidate the molecular mechanisms underlying riboflavin biosynthesis and the ability of this species to assimilate a wide variety of carbon sources.

General Features of the Candida membranifaciens IST 626 Genome
The genome sequence of C. membranifaciens IST 626 was obtained by paired-end Illumina sequencing. A total of 17 million reads were acquired and assembled into 56 scaffolds (≥2000 bp), resulting in an overall sequence coverage of 224×. A summary of genome assembly statistics is presented in Table 1. The sum of all scaffold sizes is 11,508,125 bp. The predicted GC content is 32.15%. This value is below the range reported for the GC content of other species from the Candida/Yamadazyma clade (33.5 to 53.9%) [51-53]. A total of 5619 genes were predicted to be encoded in the genome of C. membranifaciens IST 626. Protein functions were assigned to 61% (3446 genes) of the predicted genes according to the eukaryotic orthologous groups of proteins (KOG) classification [54] (Figure 5). Among them, 1165 genes were assigned to the "Cellular processes and signaling", 980 genes to the "Information storage and processing", and 1301 genes to the "Metabolism" major categories. In this latter category, the most dominant functions are "Amino acid transport and metabolism" (243 genes), "Energy production and conversion" (210 genes), "Carbohydrate transport and metabolism" (174 genes), and "Lipid transport and metabolism" (172 genes). A total of 893 genes (16% of the predicted genes) were included in the "Poorly characterized functions" category, which includes the categories "Function unknown" and "General function prediction only". The number of genes assigned to each function is detailed in Supplementary file 1, where the number of genes assigned to other species related to C. membranifaciens is also included.
Figure 5. Number of predicted genes assigned to a function based on the eukaryotic orthologous groups of proteins (KOG) classification. Represented is the distribution of predicted genes according to their putative function within the major categories "Cellular processes and signaling" (dark grey), "Information storage and processing" (black), "Metabolism" (white), and "Poorly characterized functions" (light grey).
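The reported coverage and KOG percentages can be cross-checked from the figures given in the text. The Python sketch below does that arithmetic under one stated assumption: every high-quality read is counted as a full 150 bp, which gives an upper bound because trimming shortens some reads. The variable names are illustrative.

```python
# Figures taken from the text.
HIGH_QUALITY_READS = 17_196_026
READ_LENGTH_BP = 150            # assumption: no read shortened by trimming
ASSEMBLY_LENGTH_BP = 11_508_125
PREDICTED_GENES = 5619

KOG_MAJOR = {
    "Cellular processes and signaling": 1165,
    "Information storage and processing": 980,
    "Metabolism": 1301,
}
POORLY_CHARACTERIZED = 893

# Sequencing depth = total sequenced bases / assembly length.
coverage = HIGH_QUALITY_READS * READ_LENGTH_BP / ASSEMBLY_LENGTH_BP

# Genes with an assigned function and their share of all predicted genes.
assigned = sum(KOG_MAJOR.values())
assigned_share = 100.0 * assigned / PREDICTED_GENES
poorly_share = 100.0 * POORLY_CHARACTERIZED / PREDICTED_GENES

print(f"coverage ~ {coverage:.0f}x")
print(f"assigned: {assigned} genes ({assigned_share:.0f}% of predicted)")
print(f"poorly characterized: {poorly_share:.0f}% of predicted")
```

Under that assumption the numbers agree with the text: the depth rounds to 224×, the three major categories sum to 3446 genes (61% of 5619), and the 893 poorly characterized genes correspond to about 16% of all predicted genes.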

Proteins Associated with Riboflavin Production and Transport
Riboflavin biosynthesis comprises a total of seven enzymatic reactions controlled by six RIB genes and an unknown gene that codes for an enzyme catalyzing the dephosphorylation of 5-amino-6-ribitylamino-2,4(1H,3H) pyrimidinedione 5'-phosphate (ARPP). The phosphatase responsible for this reaction in Arabidopsis thaliana was recently described [55], but its homolog in yeast remains unknown. The genome of C. membranifaciens IST 626 contains the homologs of RIB1, RIB2, RIB3, RIB4, RIB5, and RIB7 (Table 2), as well as the genes involved in the synthesis of flavin mononucleotide (FMN) (FMN1) and flavin adenine dinucleotide (FAD) (FAD1), which are essential cofactors for the majority of flavoproteins/flavocoenzymes in different organisms [7,8]. To search for the missing phosphatase of the riboflavin biosynthetic pathway in the C. membranifaciens IST 626 genome, the protein sequence of A. thaliana 5-amino-6-(5-phospho-D-ribitylamino)uracil phosphatase (AtPyrP2) [55] was used as a query in JGI's MycoCosm Blast search tool. Two candidate proteins for the dephosphorylation of ARPP were identified (Table 2), which belong to the group of HAD hydrolases, as in A. thaliana.
Table 2 footnotes: (*) […] [56]; (**) TFBSs predicted in promoter sequences smaller than 160 bp; (***) candidate genes for the dephosphorylation of ARPP in the riboflavin biosynthetic pathway.
The overexpression of riboflavin biosynthetic genes in flavinogenic yeasts leads to an increase in riboflavin production in different species (reviewed in [57]). For instance, in C. famata the overexpression of the riboflavin biosynthetic genes RIB1 and RIB7 and of the transcriptional activator SEF1 remarkably increased riboflavin production [58]. This transcription factor is essential for riboflavin production in C. famata and P. guilliermondii [12,59] and plays a role in iron homeostasis in the flavinogenic yeast C. albicans [60]. In this latter species, Sef1 directly binds to the promoter of RIB1 under iron-limiting conditions [61]. Sef1, together with Hap43 and Sfu1, is a key regulator in the transcriptional control of iron-responsive genes [61,62]. Under iron-limiting conditions, Hap43 and iron-uptake genes are activated by Sef1 [61]. In contrast, under iron-replete conditions, SEF1 and iron-uptake genes are repressed by Sfu1 (a GATA factor), which, in turn, is repressed by Hap43 under low concentrations of iron [61]. Moreover, Hap43 also regulates the expression of the core HAP complex genes HAP5, HAP32, and HAP2, thus being considered a master regulator of iron homeostasis [62]. Remarkably, Hap43 was found to be involved in the positive regulation of RIB4 (orf19.410.3) in C. albicans under iron depletion conditions [62].
Homologs of those transcription factors were identified in the genome sequence of C. membranifaciens IST 626 (Table 3).
Table 3 footnotes: (*) Molecular function retrieved from Candida Genome Database [56]; (**) contains a FAD-binding domain.
Given the taxonomic proximity between C. albicans and C. membranifaciens, we used the YEASTRACT+ portal [35], a valuable tool that allows cross-species comparative genomics of transcription regulation in nonconventional yeasts [36], to search for putative transcription factor binding sites (TFBS) in the sequences upstream of the genes that code for riboflavin biosynthetic enzymes in C. membranifaciens, using as a query the TFBS predicted for TFs from C. albicans (Table 2). Remarkably, putative TFBSs for Hap43/Hap5 (CCAAT binding site) or Hap5 (CCATT binding site) were detected in the promoter sequences of several C. membranifaciens riboflavin biosynthetic genes and of the candidate genes for the dephosphorylation of ARPP in the riboflavin biosynthetic pathway (Table 2), but no TFBS for Sef1 was identified using this tool. This may indicate that the TFBS described for C. albicans Sef1 is distinct from that of C. membranifaciens. Nonetheless, the possible regulation of riboflavin biosynthetic genes by Hap43/Hap5 deserves attention and experimental validation.
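The kind of search described above can be illustrated with a minimal Python sketch that scans a promoter for an exact motif on both strands. It is not the YEASTRACT+ implementation, which uses documented binding-site patterns. The function names and the toy promoter sequence are hypothetical, and only the CCAAT motif comes from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (uppercased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(promoter, motif):
    """Return sorted (position, strand) hits of an exact motif.

    Positions are 0-based on the forward strand. "-" hits are reported
    where the reverse complement of the motif occurs. A motif that is
    its own reverse complement would be reported on both strands.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


# HYPOTHETICAL promoter fragment, for illustration only.
toy_promoter = "TTCCAATGGATTGGAA"

for position, strand in find_motif(toy_promoter, "CCAAT"):
    print(f"CCAAT site at position {position} on strand {strand}")
```

Short exact motifs such as CCAAT occur frequently by chance, which is one reason the predicted sites call for experimental validation.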
Riboflavin transport in and out of the cell is still poorly characterized in flavinogenic yeasts, but two riboflavin permeases and one riboflavin excretase were described in P. guilliermondii (reviewed in [63]). More recently, a riboflavin excretase Rfe1 from C. flareri was also identified based on homology with D. hansenii DEHA2C03784p [64]. Using the same homology approach, we identified a putative riboflavin excretase in the genome sequence of C. membranifaciens IST 626 (Table 4). Interestingly, based on the MCL clustering tool [65] at the JGI MycoCosm portal, the annotated protein is not conserved among the species considered in the analysis, but is present in the flavinogenic yeasts C. albicans, C. tropicalis, M. guilliermondii, and D. hansenii. Concerning the import of the vitamin, in S. cerevisiae, an uptake system encoded by MCH5, which belongs to a family of monocarboxylate transporters, was demonstrated to take up riboflavin into the cell [66]. The C. membranifaciens IST 626 genome includes a remarkable number of genes encoding proteins from this family, 12, compared with the six identified in C. albicans or the five in S. cerevisiae. Nevertheless, it is still unknown whether these transporters have a role in riboflavin uptake in flavinogenic yeasts. Putative TFBSs for Hap43/Hap5 were found in the promoter sequences of some of these C. membranifaciens genes (Table 4).

Proteins Associated with Iron Homeostasis and Regulation
The homologs of genes involved in iron homeostasis and regulation, which also include genes encoding iron-dependent flavoproteins, were identified in C. membranifaciens IST 626 genome (Table 3). In S. cerevisiae and C. albicans, the main players of iron metabolism have been identified and characterized (reviewed in [45,67]). In C. albicans, this topic has attracted the attention of many researchers due to the extraordinary ability of this species to cope with the different concentrations of iron within the human host microenvironments [68]. In general, extracellular iron uptake in yeasts can occur through the high-affinity reductive iron uptake, which involves extracellular reduction of ferric iron by ferric reductases encoded by the FRE genes [69], and subsequent reoxidation to its ferric form by the Fet3 multicopper ferroxidase that makes a complex with the high-affinity iron transporter Ftr1 [70]. Seven ferric reductase encoding genes homologous to S. cerevisiae and C. albicans FRE genes were identified in the genome of C. membranifaciens IST 626, of which six contain an FAD-binding domain. Two FET3 homologs containing a copper-oxidase domain and one FTR1 homolog were also identified in the genome sequence of C. membranifaciens. In C. albicans, five putative multicopper oxidases have been identified, but only four possess the copper-oxidase domains (required for oxidase activity) [71]. Since copper is required for oxidase activity, the intracellular copper transporter Ccc2 is essential for the function of the reductive pathway and for high-affinity iron transport in both S. cerevisiae and C. albicans [72,73]. We found in the genome of C. membranifaciens two putative CCC2 homologs (Table 3). Remarkably, no FET4 homolog could be identified in C. membranifaciens IST 626 genome. In S. cerevisiae, Fet4 is responsible for the low-affinity uptake of ferrous iron [74], and, as found for C. membranifaciens, this transporter is absent from C. albicans genome [36].
Yeasts can also utilize heme or hemoglobin as an iron source, a critical process for yeast survival and virulence previously characterized in S. cerevisiae and C. albicans [76]. In C. albicans, the uptake of hemoglobin is mediated by a family of specific hemoglobin receptors at the cell surface encoded by the RBT5, RBT51, WAP1/CSA1, CSA2, and PGA7 genes [77]. In S. cerevisiae, this family of hemoglobin transporters is absent, consistent with the generally nonvirulent nature of this species. After hemoglobin is internalized into vacuoles, it is hydrolyzed or denatured to release the heme group, which can subsequently be oxidized by the heme oxygenase Hmx1 [76]. As in C. albicans and S. cerevisiae, the genome of C. membranifaciens includes the gene encoding the heme oxygenase Hmx1, as well as homologs of the C. albicans hemoglobin receptors Pga10 (Rbt51) and Pga7 (Table 3). In C. albicans, the activation of the iron regulon by Sef1 is coordinated with the biosynthesis of iron-sulfur clusters in the mitochondria [78]. Homologs of the genes involved in iron-sulfur cluster assembly were also identified in the C. membranifaciens IST 626 genome (Table 3). Iron-sulfur clusters were proposed to play a role in the regulation of riboflavin biosynthesis and iron accumulation in the flavinogenic yeast M. guilliermondii, but the mechanisms underlying such an association remain unclear [79].

Proteins Associated with Sugar Transport and Metabolism
C. membranifaciens IST 626 can assimilate a wide variety of carbon sources that include hexoses (glucose and galactose), pentoses (xylose and arabinose), α-glucosides (maltose, trehalose, melezitose), β-glucosides (cellobiose, salicin), β-fructosides (sucrose, inulin), and α-galactosides (raffinose, melibiose), but is unable to assimilate lactose (β-galactoside) (Supplementary file 2). The first steps in the assimilation of sugars involve either their direct cellular uptake or their extracellular hydrolysis followed by uptake of the smaller molecules released. The genome of C. membranifaciens IST 626 encodes a total of 45 proteins from the sugar porter family (TC 2.A.1.1) [34,80] (Table 5, Supplementary file 3). This remarkable number of putative sugar transporters includes maltose and general α-glucoside transporters, involved in the transport of trehalose, maltose and/or melezitose, putative glucose/xylose facilitators and proton symporters, hexose transporters, glucose sensors, and transporters for other compounds, such as glycerol or quinate (Table 5, Supplementary file 3). Despite the inability of C. membranifaciens to assimilate lactose, homologs of the K. lactis galactose/lactose permease Lac12 [81] were found in its genome sequence. It is likely that these putative transporters are responsible for the uptake of galactose instead of lactose. Moreover, no β-galactosidase-encoding gene (KOG0496) was found in the genome of C. membranifaciens IST 626 (Supplementary files 1 and 4).

Conclusions
This work provides valuable information on the isolation of C. membranifaciens strains from soil samples and on the general ability of the species to produce riboflavin, a trait that, to date, was considered exclusive to C. membranifaciens subsp. flavinogenie W14-3. Among the C. membranifaciens isolates obtained and tested, strain IST 626 was selected as the best riboflavin producer, and its riboflavin production was shown to be improved by culture medium optimization. Medium supplementation with glycine favored complete xylose metabolization, leading to higher biomass concentration and riboflavin volumetric production that reached approximately 120 mg/L after around 200 h of cultivation. The C. membranifaciens IST 626 genome was sequenced and annotated, providing some indications on riboflavin biosynthesis and regulation and on the assimilation of different carbon sources. Putative binding sites for the Hap43 transcriptional regulator were found in the promoter regions of the homologs of the riboflavin biosynthetic genes (RIB1, RIB3, RIB5, RIB7, and FMN1), suggesting that Hap43 may have a role in the regulation of riboflavin biosynthesis under iron limitation. In future work, it would be interesting to examine the role of this transcription factor in the regulation of these and other genes, and to compare the transcriptional profile of C. membranifaciens IST 626 with that of the type strain in different media. C. membranifaciens was found to hold a remarkable number of transporters for sugars and sugar-related compounds, as well as metabolic enzymes, consistent with its capacity to use a wide variety of carbon sources, in particular, hexoses, pentoses, and inulin. These carbon sources are present in the hydrolysates from forest and agro-industrial residues, making this species a possible platform for riboflavin production from relevant feedstocks in a circular economy context. In conclusion, this work puts forward C. membranifaciens as a riboflavin overproducer and provides valuable molecular data to be used for the development of novel strains able to effectively use a wide range of raw materials in the production of added-value compounds.

Informed Consent Statement: Not applicable.
Data Availability Statement: The sequences from the D1/D2 region of C. membranifaciens strains IST 495, IST 498, IST 507, and IST 626 were deposited in GenBank under the accession numbers MZ614941, MW003712, MW003715, and MW532700, respectively. The sequences from the ITS region of C. membranifaciens strains IST 495, IST 498, IST 507, and IST 626 were deposited under the accession numbers MZ615411, MW003718, MW003721, and MW532702, respectively. The sequencing reads were deposited in the Sequence Read Archive (SRA) under the accession number PRJNA777779, and the sequences obtained in the genome assembly were deposited in GenBank under the accession number JAKQXL000000000. The genome annotation is available in the JGI fungal genome portal MycoCosm (https://mycocosm.jgi.doe.gov/Canmem1).
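For readers who wish to retrieve the deposited D1/D2 and ITS sequences programmatically, the following is a minimal sketch, not part of the original work, that builds download URLs for the public NCBI E-utilities efetch service. The accession numbers are those listed in the statement above; the dictionary layout and the helper name fasta_url are illustrative assumptions, and the sketch only constructs the URLs without contacting the server.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Accession numbers as listed in the Data Availability Statement.
# The nested-dictionary layout is an illustrative choice.
ACCESSIONS = {
    "IST 495": {"D1/D2": "MZ614941", "ITS": "MZ615411"},
    "IST 498": {"D1/D2": "MW003712", "ITS": "MW003718"},
    "IST 507": {"D1/D2": "MW003715", "ITS": "MW003721"},
    "IST 626": {"D1/D2": "MW532700", "ITS": "MW532702"},
}


def fasta_url(accession: str) -> str:
    """Build an efetch URL that requests a nucleotide record as FASTA text.

    Hypothetical helper: it only formats the query string.
    """
    query = urlencode({
        "db": "nuccore",      # GenBank nucleotide database
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


if __name__ == "__main__":
    for strain, regions in ACCESSIONS.items():
        for region, accession in regions.items():
            print(strain, region, fasta_url(accession))
```

Each printed URL can be opened in a browser or passed to any HTTP client to download the corresponding record; NCBI usage guidelines on request rates apply.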

Conflicts of Interest: The authors declare no conflict of interest.