mDrop-Seq: Massively Parallel Single-Cell RNA-Seq of Saccharomyces cerevisiae and Candida albicans

Dohn, Ryan; Xie, Bingqing; Back, Rebecca; Selewa, Alan; Eckart, Heather; Rao, Reeta Prusty; Basu, Anindita

doi:10.3390/vaccines10010030

Open AccessArticle

mDrop-Seq: Massively Parallel Single-Cell RNA-Seq of Saccharomyces cerevisiae and Candida albicans

by

Ryan Dohn

^1,2,*,

Bingqing Xie

¹

,

Rebecca Back

¹

,

Alan Selewa

^1,3,

Heather Eckart

¹

,

Reeta Prusty Rao

⁴ and

Anindita Basu

^1,2,3

¹

Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA

²

Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA

³

Biophysical Sciences Graduate Program, University of Chicago, Chicago, IL 60637, USA

⁴

Department of Biology and Biotechnology, Worcester Polytechnic Institute, Worcester, MA 01609, USA

^*

Author to whom correspondence should be addressed.

Vaccines 2022, 10(1), 30; https://doi.org/10.3390/vaccines10010030

Submission received: 29 October 2021 / Revised: 18 December 2021 / Accepted: 22 December 2021 / Published: 27 December 2021

Download

Browse Figures

Versions Notes

Abstract

:

Advances in high-throughput single-cell RNA sequencing (scRNA-seq) have been limited by technical challenges such as tough cell walls and low RNA quantity that prevent transcriptomic profiling of microbial species at throughput. We present microbial Drop-seq or mDrop-seq, a high-throughput scRNA-seq technique that is demonstrated on two yeast species, Saccharomyces cerevisiae, a popular model organism, and Candida albicans, a common opportunistic pathogen. We benchmarked mDrop-seq for sensitivity and specificity and used it to profile 35,109 S. cerevisiae cells to detect variation in mRNA levels between them. As a proof of concept, we quantified expression differences in heat shock S. cerevisiae using mDrop-seq. We detected differential activation of stress response genes within a seemingly homogenous population of S. cerevisiae under heat shock. We also applied mDrop-seq to C. albicans cells, a polymorphic and clinically relevant species of yeast with a thicker cell wall compared to S. cerevisiae. Single-cell transcriptomes in 39,705 C. albicans cells were characterized using mDrop-seq under different conditions, including exposure to fluconazole, a common anti-fungal drug. We noted differential regulation in stress response and drug target pathways between C. albicans cells, changes in cell cycle patterns and marked increases in histone activity when treated with fluconazole. We demonstrate mDrop-seq to be an affordable and scalable technique that can quantify the variability in gene expression in different yeast species. We hope that mDrop-seq will lead to a better understanding of genetic variation in pathogens in response to stimuli and find immediate applications in investigating drug resistance, infection outcome and developing new drugs and treatment strategies.

Keywords:

mDrop-seq; scRNA-seq; high-throughput; fluconazole; Candida albicans; Saccharomyces cerevisiae; stress response; cell-cycle

1. Introduction

The rise of high-throughput single-cell mRNA sequencing (scRNA-seq) has led to a greater understanding of the functional and phenotypic heterogeneity present in our body on a cellular level. Primarily developed for mammalian cells [1], scRNA-seq uses the transcriptome of a single cell to analyze cell type [1], cell state [2], and cell response [3]. While variation between different cell types (in a multicellular organism) or cells of different species may be expected, scRNA-seq techniques have shown that there is significant cell-to-cell heterogeneity even between otherwise identical cells [4]. High-throughput techniques can examine thousands of cells at once, adding statistical power to determine variability between cells [1,5].

Technological challenges, such as the tough cell walls, small size, and concomitantly smaller amounts of transcripts per cell [6] however, have prevented similar applications in unicellular microbial organisms [7]. The strength and rigidity of the microbial cell walls composed of diverse components, e.g., peptidoglycans in bacteria, and chitin and β-glucan layers in yeasts [8,9,10], make them resistant to most lysis agents used for scRNA-seq. Microbes also have significantly less mRNA compared to animal cells, with estimates ranging up to two orders of magnitude less for bacterial cells [11,12].

Despite the challenges, achieving high-throughput microbial scRNA-seq is essential to understanding the heterogeneity and complex interactions between cells in a population. A few recent studies have profiled S. cerevisiae at high throughput with single-cell resolution using multiple cells from clonal populations to increase signal [13], the 10X platform [14,15], as well as the Drop-seq platform [16]. Plate-based scRNA-seq in Muñoz et al. [17] was used for isolated macrophage-C. albicans pairs. However, clinically relevant yeast species, such as C. albicans, have yet to be characterized at single-cell resolution and high-throughput. Single-cell RNA-seq on fungal pathogens such as C. albicans and Candida auris can help understand the commensal-to-pathogenic switch that leads to opportunistic infections [18,19]. As a common hospital-acquired infection, C. albicans can cause both superficial infections in humans and severe systemic infections in immunocompromised individuals [20]. Understanding the heterogeneity in how individual yeast cells respond to changes in the hosts’ immune system or their shifting microbiome can help treat and prevent such infections. The emergence of drug-resistant microbes is an urgent human health crisis [21]. Molecular mechanisms that confer protection to a select few cells when the parent population is decimated by anti-microbial agents can help understand the rise of drug-resistant strains [22].

Here, we present microbial Drop-seq, or mDrop-seq as a method of high-throughput scRNA-seq of different yeast species [16]. With modifications to the Drop-seq platform [1], we were able to accomplish microfluidic compartmentalization of single cells, in-droplet lysis and cellular barcoding of two species of yeast, viz., S. cerevisiae and C. albicans for scRNA-seq at scale. We quantified the transcriptional heterogeneity within clonal populations of yeast cells and profiled their response at single-cell resolution to environmental stresses, such as heat shock, and exposure to fluconazole, a common anti-fungal agent. Under heat shock at 42 °C, we observe differential expression of key stress response genes, including HSP12, HSP26, and HSP42 in S. cerevisiae cells. When exposed to fluconazole, C. albicans cells show differential upregulation of key drug target pathways such as ergosterol biosynthesis and ribosome activity and an overall increase in histone activity. Importantly, both S. cerevisiae and C. albicans show disruption in their cell cycle patterns under heat shock and fluconazole exposure, respectively. Taken together, we posit that the cell cycle state of the yeast cell in a population of continuously cycling cells is related to the variability in the cell’s response to stress. mDrop-seq’s ability to simultaneously profile a mix of S. cerevisiae and C. albicans cells demonstrate that different fungal species are amenable to simultaneous single-cell transcriptomic profiling using mDrop-seq. mDrop-seq will help decipher transcriptional variability and the use of alternate stress response pathways in yeasts in response to external stresses such as drug treatment and find immediate use in developing new classes of drugs or vaccines for emerging drug-resistant populations.

2. Results

2.1. Optimizing Single Yeast Cell Lysis in Droplets

The scRNA-seq workflow for yeasts in microfluidic drops is shown in Figure 1A (left). To establish a high-throughput scRNA-seq method for yeast cells such as S. cerevisiae and C. albicans, we adapted the emulsion droplet and bead-based DNA barcoding scheme used in Drop-seq [1] and DroNc-seq [23]. Initial experiments were performed using S. cerevisiae, a popular model organism that is amenable to technology development due to easier lysis of the species’ relatively thinner cell wall (~120 nm, compared to ~150 nm thick cell walls in C. albicans [24]). S. cerevisiae also has a well-annotated genome that is useful to establish data analysis tools. Figure 1A (right) lists all mDrop-seq experiments performed on S. cerevisiae (SC) and C. albicans (CA) in this study. Laboratory strains BY4741 and SC5314 were used for S. cerevisiae and C. albicans, respectively.

To determine the duration of droplet incubation for optimal zymolyase activity needed for each species, a single collection of mDrop-seq droplets on S. cerevisiae was split into three different pools and each pool was incubated in a 37 °C water bath for 10, 15, and 20 min, guided by bulk lysis experiments. The droplets were inspected by optical microscopy following incubation to ensure droplet stability and cell lysis.

The 10 min lysis incubation yielded a lower quantity of cDNA, indicating incomplete lysis and was excluded from further analysis. The 15- and 20-min incubations generated two libraries, indicated as SC_15min_Rep1 and SC_20min_Rep1. There were ~5000–10,000 cells processed per sample. Figure 1B shows the number of features (left), as a proxy for genes, and Unique Molecular Identifiers or UMI (right), as a proxy for a number of mRNA molecules captured per cell barcode. The UMI identifies individual mRNA molecules detected, allowing correction for PCR replicates to prevent PCR bias. Supplementary Table S1 summarizes the total and average number of reads from each sample, number of cells, mean number of genes and UMI obtained from a single cell, and the total number of unique genes obtained per experiment. We filtered out cell barcodes with less than 200 or more than 2000 genes detected, as they likely represent poor quality cells or multiple cells loaded in a single droplet, respectively. Across all experiments, we saw only 30–35% more UMI compared to the number of genes detected [25]. This experiment was performed twice.

The two mDrop-seq datasets from the lysis trial, SC_15min_Rep1, SC_20min_Rep1 overlap in principal component (PC) space (Figure S1A–C), with a Pearson correlation >0.99 between the datasets for UMI counts. This allowed us to combine the SC_15min_Rep1, SC_20min_Rep1 datasets into a single dataset of 12,012 cells, which we used to systematically explore differential expression (DE) in S. cerevisiae genes. Figure 1C shows the combined dataset in Uniform Manifold Approximation and Projection (UMAP) space [26]. Cell cycle assignment and regression of this dataset are discussed in the Supplementary Material and Figure S1D–H. The genes with the highest expression were involved in glycolysis (TDH3, ENO2, FBA1), stress response (HSC82), or ribosome biogenesis (RPP2B) (Figure 1D). As these cells were grown in a 2% glucose medium and lysed in log phase, the presence of glycolysis genes is expected. We also saw the expression of general stress response genes common to heat shock, along with DNA replication stress and oxidative stress that we attribute to the yeast cells responding to stresses (e.g., enzymatic lysis, heat) during cell lysis. We also observed a stress-induced batch effect within our biological replicates, which is discussed in our Supplementary Material.

To compare mDrop-seq with bulk RNA-seq, we also performed population-level RNA-seq on S. cerevisiae. A pseudo-bulk [1] comparison of the mDrop-seq dataset showed a Pearson’s correlation of 0.85 with our bulk RNA-seq, and an overall good correlation (~0.8, on average), with public RNA-seq datasets [27]. The correlation in transcript counts between our bulk RNA-seq and public datasets was 0.85, comparable to mDrop-seq.

2.2. Single-Cell Specificity of mDrop-Seq Confirmed by Species Mixing Experiments

To establish that mDrop-seq has single-cell specificity, species-mixing experiments [1] were performed using a mix of S. cerevisiae and C. albicans cells. Species-mixing experiments allow checking for “doublets”, or cell barcodes that capture two cells, assuming that a fraction of such drops will capture both S. cerevisiae and C. albicans cells and the corresponding cell barcodes will align to both S. cerevisiae and C. albicans genomes. The probability of finding one or more cells in a single drop is estimated assuming a uniform concentration of cells in the loading medium following a Poisson distribution. The Poisson parameter, λ governs the cell distribution in drops and may be used to calculate the fraction of cell doublets, etc. in the experiment. Two different cell-loading conditions were tested at 350,000 and 700,000 cells/mL, representing Poisson loading parameter, λ = 0.077 and 0.15, respectively. Across 4120 cells detected at λ = 0.077 or 7.7% Poisson loading, the majority of the barcodes map to one genome only, with only 47 barcodes mapping to both S. cerevisiae and C. albicans genomes (Figure 1E, left). Assuming that cross-species doublets make up half of all doublets (a similar number of barcodes would contain two S. cerevisiae cells or two C. albicans cells), we estimate the doublet rate to be ~2.3% of all barcodes detected. Figure 1E, right, shows a similar species-mixing experiment at λ = 0.15 or 15% Poisson loading; across 8080 cells detected, only 173 cells map to both genomes, giving a final doublet rate of ~4.2%. A Poisson loading of 15% was used for all S. cerevisiae and C. albicans experiments. These species mixing experiments also demonstrate that mDrop-seq is capable of simultaneous single-cell profiling of a mix of different yeast species. This feature may be useful in characterizing a fungal microbiome composed of multiple yeast species without the need to sort cells by species.

2.3. Heterogeneity in Heat Shock Response of Saccharomyces cerevisiae

To test the efficacy of mDrop-seq in detecting transcriptional changes in yeast cells to stimuli, S. cerevisiae cells were subjected to heat shock prior to running mDrop-seq. Heat shock is a widely conserved response in cells that involves the expression of protein chaperones [28]. The cells underwent a 20 min heat shock at 42 °C and then were chilled on ice before running mDrop-seq. The S. cerevisiae dataset in Figure 1C was used as the control for the heat shock experiments.

Two replicate mDrop-seq experiments on heat shock stimulation of S. cerevisiae cells showed significant similarity in expression (Pearson correlation of 0.96). The second replicate showed a slight increase in genes and UMI detected (587 and 1053, respectively) in Figure 2A. The heat shock replicate experiments were performed on different days, using independent cultures grown from the same stock.

Compared to the control, we see the upregulation of several stress response genes (the mean expression of the control and heat shock data showed a Pearson correlation of 0.74 in Figure 2B). We ordered the genes in the control and heat shock datasets by descending log fold change (logFC). For many of these genes, the p-values are small (Wilcoxon Rank Sum test, adj-p < 1 × 10⁻¹⁰, Bonferroni correction) with many adj-p < 2.225 × 10⁻³⁰⁸ (lowest value reported in Seurat). Among the genes induced under heat shock, we see several genes involved in heat shock-related stress response, such as HSP12 (logFC = 1.78) as well as other heat shock protein (HSP) family genes associated with other types of stress response (Figure 2C and Figure S3C). We also see significant differential expression in genes marked for protein transport, such as KAR2 (logFC = 2.49; Figure S3A) and DNA replication stress, such as RTC3, NCE103 (logFC = 2.68, 2.51, respectively; Figure S3B) under heat shock. House-keeping genes, such as ACT1, histone gene HTB1, ribosomal genes, and glycolysis genes (Figure 2D and Figure S3D) are present in both datasets, with slightly higher expression (logFC = 0.6, 1.1, and 1.5 for HTB1, ACT1, and RPP2B, respectively) in the control dataset. These results suggest that mDrop-seq has the power to detect cellular responses to stimuli on a single-cell level.

2.4. Activation of Stress Response in S. cerevisiae under Heat Shock

To investigate the sequence of activation in stress response genes in S. cerevisiae under heat shock, we applied trajectory analysis [29] on a subset of control and heat-shocked S. cerevisiae data. Three datasets, SC_15min_Rep2, SC_20min_Rep2 and SC_HeatShock_Rep2 were integrated (Figure S4A,B) and cell-cycle module scores were assigned to each cell. We then used Monocle v2 [29] to infer the expression changes during heat shock in pseudo-time (Figure 2E–G, with Figure S4C showing the corresponding pseudo-time ordering of cells). Figure 2E shows the trajectory where the control samples, SC_15min_Rep2 and SC_20min_Rep2 overlap with each other, while the heat shock sample SC_HeatShock_Rep2 diverge into two separate branches indicative of two distinct pseudo-states. When marked by each cell’s cell-cycle phase (Figure 2F), we note that both branches of the trajectory taken by the heat-shocked sample are dominated by S phase cells. In contrast, the G1, S and G2M phases largely overlap for the control samples. Next, we compared the cell-type clustering results with trajectory analysis. Figure 2G shows the pseudo-time trajectory where each cell is colored according to the unsupervised cluster it belongs to, shown in Figure S4A. The UMAPs in Figure S4B show the contribution of each sample to the overall clustering: we note that cells in cluster 0 come mostly from the control samples (left, middle) and cluster 2 is composed of cells primarily from the heat shock sample (right). These results are consistent with Figure S4D where cells in cluster 0 predominantly occur in control samples (left, middle) and cluster 2 in the heat shock sample (right). DE analysis was performed on the two branches of the heat shock sample using the Wilcoxon Rank Sum test. The expression level in the logarithmic scale was visualized on the trajectory tree plot (Figure S4E). Heat shock genes, HSC82, HSP12 and HSP82 (Figure S4E) are differentially expressed between the two branches of the trajectory indicating differential response in S. cerevisiae cells to heat shock.

2.5. Characterizing Expression Heterogeneity in Candida albicans Cells

To demonstrate the utility of mDrop-seq in profiling different yeast species, we applied mDrop-seq to Candida albicans, a common hospital-acquired infection that can become life-threatening [30]. This yeast has several features that make it challenging for droplet single-cell profiling. Thicker cell walls in C. albicans compared to S. cerevisiae (~150 nm and ~120 nm, respectively [24]) make their lysis more difficult. This species also has a hyphal phenotype that can clog microfluidic channels and disrupt flow and droplet generation. The Droplet Yeast Lysis Buffer or DYLB (see Methods) used for S. cerevisiae cells failed to lyse C. albicans cells (as assessed under a microscope). The C. albicans Lysis Buffer used in our experiments is similar to DYLB but with higher concentrations of both detergent and enzyme (see Methods). We performed two replicates of lysis experiments (15, 20, and 25 min) to establish the ideal incubation time for lysis and RNA capture of C. albicans in drops. The 25 min lysis from the second replicate did not produce a library due to technical error. Average counts varied between the five libraries (Table S1), consistent with lysis experiments performed in bulk. Gene cutoffs were determined for each lysis time and replicate; this ranged from 175 to 370. A total of 14,320 C. albicans cells were detected across two replicate datasets. Lysis time of 20 min outperformed the other lysis times in both replicates in terms of genes and UMI captured (Figure 3A, Table S1). We combined the 15- and 20-min incubation datasets from replicate 1 (CA_20min_Rep1 and CA_25min_Rep1) to construct a combined dataset with 4006 cells (Figure S5), with lower and upper cutoffs of 220 and 1600 genes per cell. The 15 min lysis experiment yielded poor data and was excluded from further analysis.

The 15- and 20-min libraries (CA_15min_Rep2 and CA_20min_Rep2) in replicate 2 were combined into a dataset containing 10,314 cells. The gene with the highest expression, on average, is a white-phase yeast transcript WH11 (Figure 3B). C. albicans in the white phase is the generic, round yeast form, and expected when grown in rich medium [31], as opposed to the elongated, mating-competent opaque phase [32]. These morphological forms favor growth, unlike the C. albicans’ filamentous hyphal form [31]. TDH3, a gene involved in glycolysis (that also showed the highest expression in S. cerevisiae), is the fourth highest expressed gene in the C. albicans (in both replicates, also see Figure S5A). Potentially due to the difference in genes and UMI detected, CA_15min_Rep2 and CA_20min_Rep2 were separated in PC space (Pearson correlation = 0.96). The data were integrated to find common structures in the data without the interference of batch variation. After integration [33], we see several distinct clusters (clusters 2, 3, and 5) in this dataset (Figure 3C) that separate out from the remaining cells. Cluster 3 is marked by the expression of GPI-anchored cell wall genes such as FGR41 and PGA38 (Figure 3D). Clusters 2 and 5 are heavily represented by histone tail genes as markers (Figure 3E), indicating that these clusters may represent variation caused by the cell cycle. Transcription factors STP4 and ADR1 encoding zinc finger proteins and implicated in C. albicans virulence, [17,34] and GNP1, a transmembrane transporter of amino acids are moderately expressed across the entire population (Figure 3F). OLE1, involved in filamentation, also shows high expression across all cells (Figure 3F). Figure 3G shows a heatmap of the most significantly expressed genes in each cluster. Cell cycle analyses of both C. albicans replicates are discussed in Supplementary Material.

Our population-level RNA-seq data showed a modest Pearson correlation of 0.70 with pseudo-bulk [1] data constructed from mDrop-seq and 0.64 with a publicly available dataset of C. albicans [35]. The correlation between our population-level data and the public dataset was 0.84, by comparison.

2.6. Differential Expression in Candida albicans in Response to Fluconazole Exposure

Fluconazole is an anti-fungal agent commonly used to treat infections from various Candida species. C. albicans cells were exposed to fluconazole over the course of 3 h, with samples taken for running mDrop-seq prior to exposure, at 1.5 h and 3 h. Fluconazole is a bis-triazole antifungal agent that binds to cytochrome P-450 to disrupt the conversion of lanosterol to ergosterol [36]. Previous experiments showed that C. albicans responds to the presence of fluconazole in a variety of ways, such as increasing expression of the drug target genes, increasing drug efflux, and finding compensatory pathways for ergosterol biosynthesis [37]. We sampled a population of C. albicans cells before (as control) and after (1.5 and 3 h) exposure to 15 µg/mL fluconazole, which is slightly higher than the C_max dose of 400 mg [38]. These experiments were performed twice. C. albicans strain SC5314 was shown to be susceptible to fluconazole with a tested MIC50 of 0.156 µg/mL and an MIC80 of 1.25 µg/mL [39]. MIC values were determined via broth microdilution assay in accordance with CSLI standards [39].

We saw an increase in UMI and genes detected per cell when exposed to fluconazole, across replicates (Figure 4A and Figure S8A). Figure 4B shows a UMAP plot of the integrated dataset (control, 1.5 and 3 h) from replicate 1 with a slight separation between the control and fluconazole-exposed samples (replicate 2 shown in Figure S8). In contrast, there was very little separation between the samples at 1.5 and 3 h fluconazole exposures. The mean gene expression of the control library yielded a Pearson correlation of 0.91 for the 1.5 h and 0.88 with the 3 h time-points (Figure 4C).

When comparing the 1.5 and 3 h time-points of fluconazole treatment to the control, we saw significant upregulation of several ergosterol biosynthesis pathway genes that alter C. albicans susceptibility to different classes of antifungal drugs such as azoles and allylamines [40]. Figure 4D shows six of these genes, with ERG11 being the main drug target of fluconazole and ERG1, associated with terbinafine resistance [41] (LogFC = 1.26, 1.71, 1.02, 1.53, 1.34, and 2.00 for ERG11, ERG252, ERG1, ERG13, ERG10, and ERG6, respectively). ABC transporters used for drug efflux were not detected, likely due to the short duration of our fluconazole treatment [37]. The fluconazole-treated cells also showed increased expression of many histone genes (LogFC > 2) compared to control (Figure S9A). Note that histone genes in yeasts have poly-adenylated tails, unlike in humans [42]. Histone genes are discussed further in the Supplementary Material.

Using DE analysis on the combined dataset, we identified the following genes of interest: DE genes that show higher expression in the fluconazole treated datasets, e.g., CHT2, INO1, POL30, TNA1, RHD3, HXK2 (Figure 4E) that included several antigenic genes and genes upregulated during a host immune response; DE genes that show increased expression in the control data that decreased with time under fluconazole treatment, e.g., ASR1, ASR2, WH11, HSP70, AHP1 (Figure 4F) associated with core and heat shock specific stress responses; and DE genes that show the highest expression transiently in the 1.5 h fluconazole treated sample, e.g., MRV5, ADH2, SOD5, CAR1 (Figure 4G) associated with acid, osmotic and alkaline stress responses. Violin plots of housekeeping genes ACT1, PDA1, TDH3, and PGK1 are shown in Figure S9B for comparison.

Next, we performed cell cycle analysis on the combined dataset (CA_Fluconazole-ctrl, 1.5, 3h_Rep1) from the control, 1.5 and 3 h time-points of the fluconazole treatment. When colored by the cell-cycle phase, the UMAP of the combined dataset (Figure 4H) showed some clustering by the cell cycle. We also saw a significant increase in the number of cells assigned to the S phase under fluconazole treatment (3.2× and 1.8× for 1.5 and 3 h, respectively; Supplementary Table S2) compared to the control dataset. Cells under stress tend to go into cell cycle arrest [43]. Increased expression of ERG genes was also associated with slow growth in yeasts [44]. Since many histone genes occur in the list of marker genes for the S phase, we verified that high histone activity alone in the fluconazole-treated cells was not skewing our cell cycle assignment towards the S phase (see Analysis of cell cycle phases in C. albicans in Supplementary Material).

Unsupervised clustering of the combined CA_Fluconazole_Rep1 data, after cell cycle effects were regressed out (Figure 4I), showed clusters of cells exhibiting histone activity (cluster 4), ribosome activity (clusters 4, 6), synthesis of ribonucleoproteins and Hap43 induced proteins, along with a reduction in iron metabolism (cluster 5; violin plots of expression for some genes in this cluster are shown in Figure S10A), stress response (cluster 7), synthesis of the cell wall and vacuolar proteins (cluster 8), and nucleolar activity (cluster 9). When cells from each time-point were plotted separately (Figure S10B), we saw that some cell clusters were present predominantly in either control or fluconazole-treated time-points, e.g., clusters 5 and 7 in control, cluster 4 at 1.5 h, and cluster 6 at 3 h. The number of cells in clusters 2 and 3 decreased monotonically between control, 1.5 and 3 h time-points. Similar analyses of fluconazole treated C. albicans, replicate 2 are shown in S11A. These results show interesting variability in transcriptomic response between cells to fluconazole, potentially providing insight into differences in resistance between cells. Biological replicates were analyzed separately due to batch effects between the replicates potentially interfering with biological variation within the data.

2.7. Trajectory Inference in Fluconazole Stimulation in Candida albicans

Since the fluconazole treatment led to steady changes in gene expression along the 3 h time course, we attempted to capture the temporal changes in gene expression by constructing pseudo-time trajectories for the C. albicans stimulation, using Monocle, an R package [29]. We assumed the control (untreated) sample as time t = 0 h in the fluconazole treatment for this analysis. Figure 4J,K show the pseudo-temporal trajectories of CA_Fluconazole_Rep1 response to fluconazole, marked by experimental time-point and cell-type clusters (identified by unsupervised clustering and shown in Figure 4I), respectively. Based on prior knowledge, the tip (bottom left) in Figure 4J was set as the starting point for pseudo-time construction. The occurrence of the untreated control (blue) on the left of trajectory in Figure 4J and the fluconazole treated samples (coral–1.5 h; green–3 h) to the right are consistent with the inferred pseudo-time progression (Figure S10C). Figure S10D shows the increasing expression of ERG10 and ERG11 genes that mediate resistance to fluconazole and other antifungal agents along the trajectory [37].

The contributions of each experimental time-point to the different branches of the pseudo-time trajectory are shown in Figure S10E,F, marked by cell-type clusters and cell cycle stages, respectively, while broken down by control (left), 1.5 h (middle) and 3 h (right) time-points. The branches to the left were predominantly composed of cells from the control sample. S phase assignment dominated the fluconazole treated cells (Figure S10F, middle, right), as seen from cell clustering.

Similar analyses of fluconazole treated C. albicans, replicate 2 (CA_Fluconazole_Rep2) are shown in Figure S11. As seen in C. albicans replicate 1, the representation of each sample along the pseudo-time trajectory is consistent with experimental time-points (Figures S8I and S11B), starting with control (left) cells at the tip, followed by 1.5 h fluconazole-treated cells (middle) and finally the 3 h treated cells (right). Cells belonging to the S phase also dominate the fluconazole treated samples (Figure S11C, middle, right), similar to C. albicans replicate 1.

In summary, pseudo-time analysis of C. albicans that were exposed to fluconazole shows that cell activation trajectory in pseudo-time can be used to infer the temporal sequence of gene expression in yeast cells under external stimuli.

3. Discussion

As noted earlier, single-cell genomic analyses of microbial species have been difficult due to challenges in single-cell lysis and low input material in microbial cells. Yeasts and other fungi have poly-adenylated tails on the 3′ end of their mRNA, allowing selective mRNA capture using poly-dT oligonucleotides that is not possible in bacterial cells making fungi more experimentally tractable among microbial species.

We established the feasibility of mDrop-seq to profile transcriptional heterogeneity in fungal species at single-cell resolution and at scale by performing mDrop-seq on a total of 35,109 single cells of S. cerevisiae and 39,705 C. albicans cells across multiple replicates, experimental conditions and environmental stimuli in the form of heat shock (S. cerevisiae) and fluconazole exposure (C. albicans). Based on Drop-seq and DroNc-seq [1] used to profile gene expression in mammalian cells, mDrop-seq leverages existing single-cell experimental and computational tools and allows for the lower barrier of entry and easy adaption of single-cell RNA-seq on fungal species for labs that are set up for Drop-seq or similar workflows.

3.1. Droplet Content and Stability

To implement mDrop-seq, we needed to overcome the challenge of microbial cell lysis in emulsion drops, while maintaining droplet stability and RNA integrity for downstream molecular biology reactions. This was accomplished through a combination of zymolyase and sarkosyl activity in drops, along with thermal incubation. The enzyme, zymolyase targets a common component of fungal cell walls and requires thermal activity; the detergent, sarkosyl is a strong lytic agent that works ubiquitously on mammalian, zebrafish, C. elegans and fruit fly cells [45,46,47]. Different concentrations of zymolyase and sarkosyl were used in the mDrop-seq lysis buffers for S. cerevisiae and C. albicans (see Methods), based on bulk and droplet-based lysis experiments on the two species. We posit that similar cocktails consisting of zymolyase and sarkosyl will prove effective on a broad class of fungal species that share similar cell wall properties to S. cerevisiae and C. albicans, including clinically relevant species such as C. auris. Anti-fungal peptides [48] that target specific components of the fungal cell wall may also be added to the lysis cocktail of zymolyase and sarkosyl.

The stability of emulsion drops and the efficacy of downstream reactions are affected by droplet contents. We note that a high concentration of detergent in the lysis buffers, e.g., 3.3% sarkosyl in the C. albicans lysis buffer, is detrimental to stable droplet formation, necessitating lower flow rates on the microfluidic device. Since reverse transcription in mDrop-seq was performed outside microfluidic drops after the emulsion was broken following single-cell lysis and mRNA capture on barcode beads in drops [1,23], the compatibility of lysis buffer and reverse transcriptase was not an issue. The DYLB and C. albicans lysis buffer used in our experiments were optimized for the microfluidic device [23], oil-surfactant mix, and flow parameters used here. While stable droplets may be generated at higher flow rates with other surfactants, we prefer the oil-surfactant mix used here due to its relatively low cost, long shelf-life and easy availability.

3.2. Sequencing and Alignment

We used Illumina Paired End (PE) sequencing for mDrop-seq. Due to the relatively lower complexity of the yeast mDrop-seq libraries in general, or GC content bias (~37%, in C. albicans and (~41% in S. cerevisiae, reported by FastQC), we found it beneficial to sequence these libraries multiplexed with more complex libraries such as human (~45%, from FastQC) or use higher Illumina PhiX concentration to improve the overall quality of sequencing runs.

The 20 bp long Read 1 sequence was used to de-multiplex the cell barcode and UMI while the 60 bp Read 2 was used to identify the 3′ end of transcripts. While longer read lengths can help reduce multi-mapping in complex genomes such as the human at 3.1 Bb [49], yeast genomes are typically much smaller, e.g., 12 Mb for S. Cerevisiae [50] and 14.7 Mb for C. albicans [51]; so transcripts with shorter read lengths (~30 bp) can be uniquely mapped to them. Figure S12A shows the percent of uniquely mapped reads (left) and reads mapping to multiple loci (right) as functions of Read2 fragment lengths for S. cerevisiae, C. albicans and human genomes. We also compared the effect of clipping the transcript fragments on the 3′ vs. 5′ ends on STAR [52] aligner and saw no noticeable difference in mapping rates between the two.

UMI identifies individual mRNA molecules, allowing us to collapse PCR replicates and prevent PCR bias. Across all mDrop-seq experiments, we saw ~30–120% more UMI compared to the number of genes detected, with higher percent UMIs detected in the heat shock experiments in S cerevisiae or fluconazole treatment in C albicans, compared to their respective controls. The majority of genes detected had only a single count of transcripts attributed to them. This is expected because most yeast genes are expressed as single copies of mRNA at any time [25]. This may make it difficult to differentiate between true variation and drop-outs in the data; a problem in scRNA-seq that is exacerbated in yeasts.

3.3. Analysis of Stress, Heat Shock and Response to Anti-Fungal Agent in Yeast Cells

When comparing the control and heat shock sample in S. cerevisiae, we see a very clear heat shock response and notable separation in PC and UMAP space. This separation is primarily driven by upregulation of known heat shock genes along with genes involved in DNA replication stress (e.g., KAR2, LST8, ERO1) and protein transport (e.g., RTC3, NCE103, TMA10). Using a pseudo-time trajectory analysis, we clearly see the progression from a non-stimulated control to the heat-stimulated cells, with a branch point indicating two separate heat shock responses. The most significant difference between these two branches is the differential expression of ribosomal structure (e.g., RPL8B, RPL25, RPL36B) vs. oxidative-reduction energetic processes (e.g., PIG2, PCL8, GDB1).

The S. cerevisiae data were normalized and batch-corrected to allow comparisons between experimental conditions, and categorized into three sub-groups: control, “stressed” control, and heat shock, for comparison. We find the “stressed” control group to be intermediate between the control and heat shock in that it showed an elevated stress-response signature for HSP12, HSP26 and HSP42 compared to the control set, but lower than the heat shock data. In addition, the heat shock data showed expression of genes related to heat shock stress, e.g., HSP30, and PIN3, DNA replication stress, e.g., LST8, BTN2, ERO1, and protein transport, e.g., RTC3, TMA10 and SPI1 that were absent in the control and “stressed” control samples. We saw similar levels of transcription for housekeeping genes, e.g., ACT1, HTB1, TDH2, and FBA1 across all groups. Others of interest are genes such as AIM44, PIR1, PST1, and EGT2, associated with cell wall stability and cell budding that were expressed in a subset of cells clustering together in the control data.

Using gene lists specific to the G1, S and G2M phases to the cell-cycle, we scored and assigned each cell to a unique cell-cycle phase for both S. cerevisiae and C. albicans. Cells that could not be unambiguously assigned to any particular cell-cycle phase were marked as NA. In both heat shock and drug treatment experiments, we see a significant decrease in the number of G2M phase cells (Supplementary Table S2) and an increase in cell numbers assigned to the S phase. Since fluconazole treatment may be expected to elicit a stress response in yeasts, we propose that yeast cells are in general likely to get arrested in the S phase under stress.

Since fungal histone mRNA are polyadenylated [42] and thus amenable to mDropseq capture, we were able to characterize histone activity in S. cerevisiae and C. albicans. Overall, we saw the significant differential expression of genes associated with histone activity in the cellular sub-sets identified by unsupervised clustering of both S. cerevisiae and C. albicans populations. The increase in the number of cells expressing histone genes in environmentally perturbed datasets may be due to stalling of the cell cycle in the S phase. The increase in the fraction of the cells belonging to the S phase may also be due to the higher chromatin accessibility needed for increased transcription under stress response that drives up histone expression.

mDrop-seq on C. albicans cells exposed to the anti-fungal agent fluconazole showed significant increases in the number of counts and genes detected, compared to control data. Much of these increases appeared to be driven by increased expression of ribosomal structure genes (e.g., RPS27, RPS6A, RPS12). Across both replicates, we noted significant up-regulation of several genes belonging to the Ergosterol Biosynthesis pathway, including ERG11 that produces the cytochrome P-450 target of fluconazole. The upregulation of this pathway is a known mechanism for resisting the membrane disruptive effects of fluconazole. We did not detect any upregulation of ABC transporter genes as mechanisms of resistance. During the 3 h exposure time, we also noted several genes that increased their expression transiently, increasing quickly in expression in 1.5 h before decreasing by the end of the 3 h period. These genes included some stress-induced genes such as SOD5 and ADH2, as well as C1-08350C-A, C5-02110W-A.

After integrating the control and fluconazole treated data for each replicate, followed by cell cycle regression and clustering analysis, we noted that the signatures of ribosomal and histone expression differences persisted within the clusters. A cluster marked by GPI-anchored proteins was also seen in the integrated data, as well as a cluster involving nucleolar and pre-ribosomal genes.

In particular, we noted increased histone activity in C. albicans under fluconazole exposure. Since many cell-cycle genes for the S phase in C. albicans are histone-related, we confirmed that the signal for the S phase assignment was preserved, compared to the G1 and G2M phases even when the histone genes were excluded from the S phase marker list (Supplementary Material). Since chromatin accessibility is needed for increased transcription under stress response, this may drive up histone expression under heat-shock in S. cerevisiae or fluconazole treatment in C. albicans.

Limitations of the study: While the strain of C. albicansused in our experiments is susceptible to fluconazole, antifungal drug exposures may cause contact or acquired resistance that can be reversible. The adaptability of C. albicanscan be confounding due to the many pathways it can use to increase resistance in response to different classes of antifungal drugs [53,54]. Our experimental design simply demonstrates C. albicansresponse to fluconazole over our time-course study but cannot distinguish if these responses are intrinsic, temporary, or permanent. Further experiments will be needed to resolve the issue that is beyond the scope of this study.

3.4. Scalability as a Technology

mDrop-seq offers a cheap transcriptomic profiling solution for unicellular fungi and may be easily adapted in laboratories that use Drop-seq or similar techniques. Existing bioinformatics and statistical tools for single-cell analyses may be effectively leveraged to analyze fungal species at single-cell resolution. According to our estimate, the cost of single-cell library preparation using mDrop-seq is ~USD250 per sample (~5000 cells/sample). At ~50 million reads per sample, we estimate sequencing to cost an additional ~USD190 per experiment.

4. Materials and Methods

4.1. Yeast Strains and Cell Culture

Saccharomyces cerevisiae (strain BY4741, Open Biosystems) cells were grown as a dense culture (>1 × 10⁸ cells/mL) in YEPD (MP Biochemicals, #MP114001022) at 27 °C overnight. The S. cerevisiae culture was heavily diluted (1:20) in fresh YEPD medium and grown for 4 h at 27 °C, following which the cells were placed on ice and chilled.

Candida albicans (strain SC5314, ATCC) were grown in YEPD at 27 °C after heavy dilution (~1:375). After 20 h of cell culture, Candida albicans cells were chilled on ice prior to processing using mDrop-seq.

4.2. Heat Shock Stimulation of S. cerevisiae

After S. cerevisiae was grown in YEPD for 4 h post-dilution, cells were counted in YEPD using Neubauer Improved (NI) hemocytometer (InCyto, Cheonan, Korea, #DHC-N01-2). A total of 700,000 cells were aliquoted into a 1.7 mL microfuge tube and heat shock stimulation was applied by placing the tube in an Eppendorf F1.5 Thermomixer set to 42 °C and 500 rpm for 20 min. At the end of the heat shock incubation, the microfuge tube containing the S. cerevisiae cells were placed on ice for 5 min. The cells were then washed once with ice-cold 1X PBS (Teknova, Hollister, CA, USA, #P0195) and 0.01% BSA (NEB, Ipswich, UK, #B9000Sm), henceforth referred to as PBS-BSA, quickly recounted, and brought to a concentration of 700,000 cells/mL in PBS-BSA. A total of 10 µL of RNase Inhibitor was added to 1 mL of yeast cells in PBS-BSA and mDrop-seq was performed as described below. The emulsion droplets were collected on ice to preserve the heat shock signal during the droplet encapsulation period.

4.3. Fluconazole Stimulation of C. albicans

After C. albicans was grown overnight in YEPD, five million cells were counted using a Neubauer Improved (NI) hemocytometer (InCyto, #DHC-N01-2) and diluted into 2 mL of fresh YEPD. A total of one million cells were removed from this pool and put on ice as the control population and processed using mDrop-seq. Fluconazole (Sigma, St. Louis, MO, USA #F8929-100MG) was freshly diluted to 100 µg/mL in fresh YEPD and added to the remaining four million C. albicans to a final concentration of 15 µg/mL. The C. albicans were then incubated in fluconazole under end-over-end rotation at room temperature for 1.5 and 3 h, removed and put on ice prior to running mDrop-seq.

4.4. mDrop-seq Cell Preparation and Co-Encapsulation in Droplets

Yeast cells (S. cerevisiae or C. albicans) were centrifuged separately at 1000× g for 3 min in a swinging bucket centrifuge at 4 °C. The cells were washed twice with ice-cold PBS-BSA. Following the washes, 10 µL of cells were sampled and counted using a NI hemocytometer (InCyto, Cheonan, Korea, #DHC-N01-2). ~1 mL of cells at 700,000 cells/mL suspended in PBS-BSA was placed in a 2.5 mL syringe (BD, Franklin Lake, NJ, USA #309657). A total of 10 µL of RNase Inhibitor (Lucigen, Middleton, WI, USA #F83923) was added per 1 mL suspension immediately before microfluidic encapsulation. A 75 μm DroNc-seq device fabricated in the cleanroom using published design and protocol [23] was used for droplet generation. Cells and beads were co-flowed into the microfluidic device, both at 1.5 mL/h for S. cerevisiae and 1 mL/h for C. albicans. Cells at 700,000 cells/mL and 4,500,000 droplets/mL gives a Poisson loading distribution with λ = 0.15.

Barcoded beads (ChemGenes, Wilmington, MA, USA, #Macosko-2011-10(V+)) were suspended in DYLB or C. albicans lysis buffers at 350,000 beads/mL and kept in suspension by constant stirring with a magnetic tumble stirrer and flea-magnet setup (V&P Scientific, San Diego, CA, USA, #VP 710, #772DP-N42-5-2); the flea magnet is placed in the syringe containing the barcode beads suspended in lysis buffer and the stirrer kept in the vicinity of the syringe during droplet generation. Cells and beads in lysis buffer were co-encapsulated in drops using a surfactant-oil mix (BioRad, Hercules, CA, USA, #1864006) flowed at 8 mL/h in a 10 mL syringe (BD, Franklin Lake, NJ, USA #302995) as the outer carrier oil phase. Droplets were collected at ~3750 droplets/sec for 30 min in 50 mL tubes (Genesee Scientific, San Diego, CA, USA, #28-106).

Saccharomyces cerevisiae: Barcoded beads were suspended in Droplet Yeast Lysis Buffer or DYLB, consisting of 6.7% (w/v) Ficoll PM-400 (GE Healthcare, Chicago, IL, USA, #17-0300-05), 225 mM Tris pH 7.5 (Teknova, Hollister, CA, USA, #T2075), 22 mM EDTA (Fisher, Hampton NH, USA, #BP2482-500), 0.67% sarkosyl (Teknova, Hollister, CA, USA, #S3377), 55 mM KH₃PO₄ (Sigma, St. Louis, MO, USA #P5629-25G), 1.3 mM DTT (Teknova, Hollister, CA, USA, #D9750), 0.1% (v/v) β-mercaptoethanol (Sigma, St. Louis, MO, USA #M6250-10ML), and 450 units/mL zymolyase (Zymo Research, Orange County CA, USA, #E1005); this mix was optimized for S. cerevisiae lysis in drops.

Candida albicans: Barcode beads were suspended in C. albicans lysis buffer containing 6.7% Ficoll PM-400 (GE Healthcare, Chicago, IL, USA, #17-0300-05), 225 mM Tris pH 7.5 (Teknova, Hollister, CA, USA, #T2075), 22 mM EDTA (Fisher, Hampton NH, USA, BP2482-500), 3.3% (w/v) sarkosyl (Teknova, Hollister, CA, USA, #S3377), 55 mM KH₃PO₄ (Sigma, St. Louis, MO, USA #P5629-25G), 1.3 mM DTT (Teknova, Hollister, CA, USA, #D9750), 0.1% (v/v) β-mercaptoethanol (Sigma, St. Louis, MO, USA #M6250-10ML), 650 units/mL zymolyase (Zymo Research, Orange County CA, USA, #E1005), that was optimized for lysing C. albicans in drops.

S. cerevisiae and C. albicans species mixing: Each yeast cell population, S. cerevisiae or C. albicans were processed separately as follows: Each yeast species was centrifuged at 1000× g for 3 min in a 4 °C swinging bucket centrifuge and washed twice with ice-cold PBS-BSA. Following the washes, 10 µL of each cell aliquot was sampled from each species and counted using a NI hemocytometer.

Two experiments at two different Poisson loading concentrations were performed to calculate doublet rates at these loading conditions: 175,000 cells from each species were combined and the final volume adjusted to 1 mL of PBS-BSA for λ ≈ 0.077; 350,000 cells from each species were suspended in a total volume of 1 mL PBS-BSA for λ ≈ 0.15. Due to the presence of C. albicans, the stronger C. albicans lysis buffer was used for both experiments.

4.5. Cell Lysis, Reverse Transcription, cDNA Amplification and Next Era Library Generation for mDrop-Seq

After droplet collection, the 50 mL tubes were transferred to a 37 °C water bath for zymolyase digestion and lysis for ~20 min; different lysis incubation times ranging from 10–25 min were tested, both for S. cerevisiae and C. albicans. After the incubation, the Drop-seq protocol was followed for breaking droplets, collecting barcode beads with mRNA hybridized onto them and washing them in 6x Saline-Sodium Citrate (Teknova, Hollister, CA, USA, #S0282) [1]. Reverse transcription was performed in 1.5 mL microfuge tubes under end-over-end rotation using a modified Reverse Transcription mix (1x Maxima H- RT buffer, 4% Ficoll PM-400 (GE Healthcare, Chicago, IL, USA, #17-0300-05), 3 mM MgCl₂ (Sigma, St. Louis, MO, USA #M1028), 1 M Betaine (Sigma, St. Louis, MO, USA #14300), 1 mM dNTPs (Clontech, San Francisco, CA, USA, #639125), 1 U/μL Rnase Inhibitor (Lucigen, Middleton, WI, USA #F83923), 2.5 μM Template-Switching Oligo primer, (Integrated DNA Technologies, Coralville, IA, USA, AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG), and 10 U/μL Maxima H- RT enzyme (ThermoScientific, Waltham, MA, USA #EP0751) with a 30 min incubation at room temperature, followed by a 90 min incubation at 50 °C. This generated barcoded cDNA affixed to the barcoded beads referred to as Single Transcriptome Attached to MicroParticles or STAMPs. The beads were then washed once in 0.5% SDS (Teknova, Hollister, CA, USA, #S0288), twice in 0.02% Tween 20 (Teknova, Hollister, CA, USA, #T0710) both prepared in TE buffer (Teknova, Hollister, CA, USA, #T0228) and treated with Exonuclease I digestion (Fisher, Hampton, NH, USA, #M0293L). The total number of STAMPS collected was counted manually under the microscope. cDNA amplification was performed on RNA-DNA conjugates attached to ~120,000 barcode beads in a 96-well plate (Genesee Scientific, San Diego, CA, USA, #24-302) loaded at 10,000 STAMPs per well. The STAMPS were amplified for 17 PCR cycles, using Kapa Hifi Hotstart 2x Mastermix (Fisher, Hampton, NH, USA, #NC0465187) and Drop-seq PCR primer (Integrated DNA Technologies, Coralville, IA, USA, AAGCAGTGGTATCAACGCAGAGT) [1]. Post-PCR cleanup was performed by removing the STAMPs and pooling the supernatant from the wells together into a single 1.7 mL low-retention tube (Genesee Scientific, San Diego, CA, USA, #22-281LR) along with 0.6X Ampure XP beads (Beckman Coulter, Brea, CA USA, #A63880) [1]. After adding the Ampure beads to the PCR product, the tube was incubated at room temperature for 2 min on a thermomixer (Eppendorf Thermomixer^® C, Eppendorf, Germany, #5382000023) set to 1250 rpm, and for another 2 min on bench for stationary incubation. Next, the tube was placed on a magnet and washed 4X times using 1 mL ethanol (Sigma, St. Louis, MO, USA #E7023) at 80% concentration in each wash. cDNA was eluted in ultra-pure water (Life Tech, Carlspan, CA, USA, #10977-023) at 2.5 μL/well and the concentration and library size were measured using Qubit 3 fluorometer (Thermo Fisher, Waltham, MA, USA) and BioAnalyzer High Sensitivity Chip (Agilent, Santa Clara, CA, USA, #5067-4626). Representative traces of cDNA libraries of S. cerevisiae and C. albicans are shown Figure S12B,C, respectively.

500 pg of the cDNA library was used in Nextera library (Illumina, San Diego, CA, USA, #FC-131-1096) preparation, following the original Drop-seq protocol, with a 3 min 72 °C incubation step added at the beginning of the thermo-cycling program to yield Nextera libraries averaging 500–700 bp [1,23].

4.6. Population-Level RNA-Seq Library Preparation of S. cerevisiae and C. albicans

We performed the same standardized procedure to prepare bulk RNA-seq libraries for S. cerevisiae and C. albicans as follows: We lysed ~8,000,000 cells each using DYLB (S. cerevisiae) and C. albicans lysis buffer (C. albicans) and incubation at 37 °C for 20 min. Total RNA was extracted from each lysate using the Direct-zol RNA Miniprep Plus kit (Zymo Research, #R2071), assessed RNA quality using Qubit HS RNA Assay (Invitrogen, Waltham, MA, USA, #Q32852) and diluted to 25 pg/µL. The total RNA library was annealed to 5′-AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTN-3′ primer (Integrated DNA Technologies, Coralville, IA, USA,) that allow polyA selection of mRNA and template switching, similar to Drop-seq. Briefly, 11 µL of total RNA library was mixed with 11 µL of 10 µM primer above, 11 µL of 10 mM dNTP (Takara, #639125), 13.75 µL of ultra-pure water and 2.75 µL of RNase Inhibitor (Fisher, Hampton NH, USA, #NC1081844), and incubated at 75 °C for 3 min on a PCR thermocycler. A reverse transcription master-mix consisting of 11 µL of 5X Maxima RT buffer, 2.2 µL of H₂O, 11 µL of 5 M Betaine (Sigma, St. Louis, MO, USA #14300-500G), 1.65 µL of 100 mM MgCl₂ (Sigma, St. Louis, MO, USA #M1028), 2.2 µL of 50 µM Drop-seq TSO primer (Integrated DNA Technologies, Coralville, IA, USA, AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG), 1.1 µL of RNase inhibitor (Fisher, Hampton NH, USA, #NC1081844) and 2.75 µL of Maxima H-RTase enzyme (Fisher, Hampton NH, USA, #FEREP0753) was added immediately after the annealing step. The final RT reaction volume of 45 µL was pipetted several times to mix, centrifuged briefly to spin down the contents and incubated on a PCR thermocycler using the following program: 42 °C for 90 min; 5 cycles*(42 °C for 2 min, 50 °C for 2 min); 70 °C for 15 min, to perform reverse transcription of the polyadenylated mRNA selectively annealed to the primer above. The cDNA was amplified for 12 cycles for S. cerevisiae, and 13 cycles for C. albicans, using Drop-seq PCR primer (Integrated DNA Technologies, AAGCAGTGGTATCAACGCAGAGT) and KAPA HiFi HotStart ReadyMix PCR Kit (Fisher, Hampton NH, USA, #NC0465187K). Amplified cDNA was quantified using Qubit and BioAnalyzer, followed by Nextera library (Illumina, San Diego, CA, USA, #FC-131-1096) generation.

4.7. Sequencing

Nextera libraries of samples, including bulk RNA-seq, were loaded at ~ 15 pM concentration and sequenced on an Illumina NextSeq 500 using the 75 cycle v3 kits for paired-end sequencing. A total of 20 bp were sequenced for Read 1, 60 bp for Read 2 using Custom Read 1 primer, GCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGTAC (Integrated DNA Technologies), according to protocol. Due to the low complexity of yeast cDNA libraries, Illumina PhiX Control v3 Library (Illumina, San Diego, CA, USA, #FC-110-3001) was added at 5–10% of the total loading concentration for all sequencing runs. Samples for each experiment were loaded at 7–15% of a NextSeq 500 lane and yielded 10–40 million reads for each sample. Some samples were sequenced twice, depending on library complexity. For these samples, the Fastq files were concatenated using the UNIX zcat function before running the dropRunner pipeline. Data are available at: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE154515 (accessed on 28 October 2021).

4.8. mDrop-Seq Data Preprocessing, Alignment and Quality Control

There were ~5000–10,000 cells processed per sample and each library was sequenced at ~40–90 million reads. We developed a Snakemake [55] protocol called dropRunner that takes paired-end reads in FASTQ format as input and produces an expression matrix corresponding to the UMI of each gene in each cell. The protocol initially performs FastQC [56] to obtain a report of read quality. The reports are later inspected manually to ensure high-quality reads were generated. Next, it creates a whitelist of cell barcodes using umi_tools [57] 0.5.3, which is a list of the top 10,000 valid cell barcodes in terms of number of reads. Next, reads were aligned to the respective yeast genomes using STAR [52] 2.7.0a. STAR 2.7 introduced STARsolo, a turnkey solution for processing droplet single-cell RNA-seq data built directly into the STAR aligner. The whitelist and paired-end reads are used as input for STARsolo, which performs alignment, gene UMI counting, and cell-barcode-filtering in one step. STARsolo uses a heuristic approach for filtering cells with low or noise-level UMI counts. It does so by constructing a UMI count rank plot for each cell (a knee-plot) and picks a cut-off based on the knee of the curve. The pipeline can be found at GitHub (aselewa/dropseqrunner). The filtered digital expression matrices from STARsolo were loaded in Seurat (v3.1.1), an R package for downstream single-cell transcriptome analyses.

All data from S. cerevisiae were aligned to the Saccharomyces cerevisiae reference genome, version sacCer3_s288c (https://www.yeastgenome.org/strain/S288C, accessed on 28 October 2021) obtained from the Saccharomyces Genome Database (SGD). C. albicans data aligned to Candida albicans SC5314 reference genome, version A21-s02-m09-r10 (http://www.candidagenome.org/download/gff/C_albicans_SC5314/Assembly21/, accessed on 28 October 2021).

4.9. Bulk RNA-Seq Data Processing

Two sets of bulk RNA-seq data obtained from S. cerevisiae and C. albicans were assessed for read quality using FastQC, mapped to the respective genomes described above using STAR 2.7.0a aligner [52], and RNA counts were generated from the bam files using FeatureCounts [58]. Read lengths were down-sampled during alignment using the STAR aligner, “-clip3pNbases” and “-clip5pNbases” parameters. Count matrices for the two yeast species were compared to their corresponding mDrop-seq datasets using the Seurat package.

4.10. Clustering Cells and Generating UMAP

We followed the analysis pipeline recommended by Seurat. Data were normalized and scaled using default commands provided by the Seurat package in R. Seurat was used to calculate the gene dispersion and mean expression to find highly variable genes. This reduces the computational time of PCA compared to using the full set of genes. Highly variable genes were used to calculate the PCs for the yeast mDrop-seq data. An elbow plot displaying the variation explained by each PC was used to determine the number of PCs needed to explain the majority (>90%) of the variation. The top PCs determined in this way were used to perform clustering which was visualized with Uniform Manifold Approximation and Projection (UMAP) [26].

4.11. Calculating Doublet Rates from Species Mixing Experiments

An mDrop-seq dataset containing a mix of S. cerevisiae and C. albicans cells was aligned once to the S. cerevisiae genome and again separately, to the C. albicans genome. Cells were removed based on data quality. The 7% Poisson loading experiment had gene cutoffs of 50 for both experiments, while the 15% Poisson loading experiment had cutoffs of 275 and 400 for S. cerevisiae and C. albicans, respectively. We identify cell barcodes that capture genes from both S. cerevisiae and C. albicans genomes. Due to the similarities between the yeast genomes caused by shared ancestry, there are genes that will map to both species. This was checked by taking a C. albicans dataset and mapping it to the S. cerevisiae genome, and vice versa, to identify the genes common to both species. After removing these common genes from the mixed-species dataset, we identified the cell barcodes that show significant mapping (>50 genes) to both genomes as true doublets.

4.12. Batch Correction

Dataset merging and batch correction were performed using the Anchor Integration function in the Seurat package. Datasets were independently normalized and highly variable genes were detected using gene dispersion and mean expression. The datasets were scaled before running the Canonical Correlation Analysis (CCA) function in Seurat to determine dataset anchors and merge the objects. A new integrated dataset was then created using the detected anchors. This integrated dataset was used for dimensional reduction and clustering analyses.

4.13. Cell Cycle Analysis

Lists of genes that serve as cell cycle phase markers for G1, S, and G2M phases were obtained from Spellman et al. [59] for S. cerevisiae and the Candida Genome Database for C. albicans [60]. Cell cycle assignment was made based on the G1, S, and G2M markers for cell-cycle phases. The AddModuleScore function implemented in Seurat was used to calculate cell-cycle scores for each phase. This function sampled random genes as a control set; the number of the genes in the control set was determined by the number of markers in the cell-cycle gene lists. Since there were three such lists of markers, the minimum size of cell-cycle marker lists was used. For each cell, the largest module score was selected among the three phases and the corresponding cell cycle phase for the selected module score was assigned to the cell. A threshold on the selected module score was applied to ensure the cell cycle assignment was robust. If all module scores for a cell were below zero, the phase for the cell was left undecided and counted as “Not Assigned” or NA in Supplementary Table S2.

This module score list was used with Seurat to create a column in the object metadata containing the assigned cell-cycle phase. Cell-cycle phase metadata was used to calculate PCs instead of highly variable genes. Cell-cycle variation was regressed out during data scaling and centering. The expression percentage and level for the G1, S, and G2M marker genes were visualized using dot plots where the size of the dots indicates the percentage of cells expressing a marker and the color intensity reflects normalized expression.

4.14. Hierarchical Clustering of Cells

To investigate variations in single-cell expression that are not related to cell division and proliferation, the cell cycle effects were removed when scaling the gene expression per cell. The module scores for each of the G1, S, and G2M phases were regressed out against each gene. PCs and Shared Nearest Neighbor (SNN) graph [61] were constructed from the scaled gene expression matrix.

4.15. Trajectory Analysis on Single-Cell Data

To trace the lineage or process of temporal activation in yeast cells in response to external stimuli such as heat shock or fluconazole treatment, trajectory analysis [29] was performed on the single-cell data from control and stimulated cells. Datasets from different conditions/time points were integrated using anchor-based integration described above. The R-based pipeline, Monocle 2 (v 2.10) was used to process the data and construct the trajectory.

The genes used to order cells along the pseudo-time trajectory, or ordering genes were set based on differentially expressed (DE) genes obtained from unsupervised clustering in Seurat. DE genes with q-value <0.01 were selected as the ordering genes in Monocle. The count matrix was log-transformed after adding one to the counts to eliminate logarithms of zero values. PCA was performed on the normalized count matrix using the ordering genes and the top variable PCs were selected based on the scree plot of variance explained per component. The PCs were reduced into a tree structure by Discriminative Dimensionality Reduction with Trees (DDRTree) [62]. The backbone of the tree branches formed the cell trajectory. The root of the trajectory was set as the tip of the tree branch that contained the largest number of cells from the control sample, and each cell was ordered in pseudo-time based on the root. During the PCA and DDRTree dimension reduction phases, we removed any cell cycle effects by specifying the G1, S, and G2M module scores obtained from Seurat as variables to be linearly subtracted from the data to look for changes in gene expression that are independent of cell-cycle effects as response of external stimuli.

5. Conclusions

We introduce mDrop-seq, a droplet-based RNA sequencing of fungal species at single-cell resolution and high throughput. We applied mDrop-seq on two important yeast species, viz., S. cerevisiae, a model organism commonly used to study fundamental processes in biology, and C. albicans, a common, clinically relevant pathogen. We were able to identify cellular subsets and their expression profiles within the larger population that show cell-budding activity, histone synthesis, etc. To our knowledge, this is the first work that performs single-cell RNA-seq on C. albicans at high throughput. Modestly priced and based on established protocols that are easy to implement, we anticipate that mDrop-seq will be instrumental for understanding the basis on phenotypic and functional variability in fungal species in a wide range of contexts, including medical treatment of fungal pathogens, understanding basic biological processes in model organisms, and production of biologics.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/vaccines10010030/s1, File S1: Supplementary Material.

Author Contributions

Conceptualization, R.D. and A.B.; Methodology, R.D., R.B. and H.E.; Analysis, R.D., B.X. and A.S.; Interpretation R.D., A.B. and R.P.R.; Writing and editing, R.D., B.X., A.S., R.B. and A.B.; Supervision, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by NIH R21 AI144417-02, DP2 AI158157-01 and BSF 2017357 grants. RPR was supported by NIH U19AI110818 and 1R15AT009926-01.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All sequenced Data are available at the NCBI Gene Expression Omnibus (GEO): https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE154515 (accessed on 28 October 2021).

Acknowledgments

We thank Sean Crosson for help setting up C. albicans cell culture, Ashish Thakur for help designing bulk RNA-seq experiment, Samantha Keyport for help designing heat shock experiments and Aviv Regev, Carl de Boer and Sebastian Pott for discussions and feedback. Microfluidic devices were fabricated in the University of Chicago Pritzker Nanofabrication Facility that is supported by the Soft and Hybrid Nanotechnology Experimental (SHyNE) Resource (NSF ECCS-2025633), a node of the National Science Foundation’s National Nanotechnology Coordinated Infrastructure. Next-Gen sequencing was performed at the University of Chicago Functional Genomics Facility and computational resources were provided by the University of Chicago Research Computing Center.

Conflicts of Interest

The authors declare no conflict of interest.

References

Macosko, E.Z.; Basu, A.; Satija, R.; Nemesh, J.; Shekhar, K.; Goldman, M.; Tirosh, I.; Bialas, A.R.; Kamitaki, N.; Martersteck, E.M.; et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 2015, 161, 1202–1214. [Google Scholar] [CrossRef] [Green Version]
Andrews, T.S.; Hemberg, M. Identifying cell populations with scRNASeq. Mol. Aspects Med. 2018, 59, 114–122. [Google Scholar] [CrossRef]
Szabo, P.A.; Levitin, H.M.; Miron, M.; Snyder, M.E.; Senda, T.; Yuan, J.; Cheng, Y.L.; Bush, E.C.; Dogra, P.; Thapa, P.; et al. Single-cell transcriptomics of human T cells reveals tissue and activation signatures in health and disease. Nat. Commun. 2019, 10, 1–16. [Google Scholar] [CrossRef]
Raj, A.; van Oudenaarden, A. Nature, Nurture, or Chance: Stochastic Gene Expression and Its Consequences. Cell 2008, 135, 216–226. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Zheng, G.X.Y.; Terry, J.M.; Belgrader, P.; Ryvkin, P.; Bent, Z.W.; Wilson, R.; Ziraldo, S.B.; Wheeler, T.D.; McDermott, G.P.; Zhu, J.; et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 2017, 8, 1–12. [Google Scholar] [CrossRef] [Green Version]
Zhang, Y.; Gao, J.; Huang, Y.; Wang, J. Recent Developments in Single-Cell RNA-Seq of Microorganisms. Biophys. J. 2018, 115, 173–180. [Google Scholar] [CrossRef] [Green Version]
Saliba, A.-E.; Santos, S.C.; Vogel, J. New RNA-seq approaches for the study of bacterial pathogens. Curr. Opin. Microbiol. 2017, 35, 78–87. [Google Scholar] [CrossRef]
Gow, N.A.R.; Hube, B. Importance of the Candida albicans cell wall during commensalism and infection. Curr. Opin. Microbiol. 2012, 15, 406–412. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Lesage, G.; Bussey, H. Cell wall assembly in Saccharomyces cerevisiae. Microbiol. Mol. Biol. Rev. 2006, 70, 317–343. [Google Scholar] [CrossRef] [Green Version]
Silhavy, T.J.; Kahne, D.; Walker, S. The bacterial cell envelope. Cold Spring Harb. Perspect. Biol. 2010, 2, a000414. [Google Scholar] [CrossRef] [PubMed]
Kang, Y.; Norris, M.H.; Zarzycki-Siek, J.; Nierman, W.C.; Donachie, S.P.; Hoang, T.T. Transcript amplification from single bacterium for transcriptome analysis. Genome Res. 2011, 21, 925–935. [Google Scholar] [CrossRef] [Green Version]
Zenklusen, D.; Larson, D.R.; Singer, R.H. Single-RNA counting reveals alternative modes of gene expression in yeast. Nat. Struct. Mol. Biol. 2008, 15, 1263–1271. [Google Scholar] [CrossRef] [PubMed]
Nadal-ribelles, M.; Islam, S.; Wei, W.; Latorre, P. Sensitive high-throughput single-cell RNA-Seq reveals within- clonal transcript-correlations in yeast populations. Nat. Microbiol. 2019, 4, 683–692. [Google Scholar] [CrossRef] [PubMed]
Jariani, A.; Vermeersch, L.; Cerulus, B.; Perez-Samper, G.; Voordeckers, K.; Van Brussel, T.; Thienpont, B.; Lambrechts, D.; Verstrepen, K.J. A new protocol for single-cell RNA-seq reveals stochastic gene expression during lag phase in budding yeast. Elife 2020, 9, e55320. [Google Scholar] [CrossRef]
Jackson, C.A.; Castro, D.M.; Saldi, G.; Bonneau, R.; Gresham, D. Gene regulatory network reconstruction using single-cell RNA sequencing of barcoded genotypes in diverse environments. Elife 2020, 9, e51254. [Google Scholar] [CrossRef]
Urbonaite, G.; Lee, J.T.H.; Liu, P.; Parada, G.E.; Hemberg, M.; Acar, M. A yeast-optimized single-cell transcriptomics platform elucidates how mycophenolic acid and guanine alter global mRNA levels. Commun. Biol. 2021, 4, 822. [Google Scholar] [CrossRef] [PubMed]
Muñoz, J.F.; Delorey, T.; Ford, C.B.; Li, B.Y.; Thompson, D.A.; Rao, R.P.; Cuomo, C.A. Coordinated host-pathogen transcriptional dynamics revealed using sorted subpopulations and single macrophages infected with Candida albicans. Nat. Commun. 2019, 10, 1607. [Google Scholar] [CrossRef] [Green Version]
Pfaller, M.A.; Diekema, D.J. Epidemiology of invasive candidiasis: A persistent public health problem. Clin. Microbiol. Rev. 2007, 20, 133–163. [Google Scholar] [CrossRef] [Green Version]
Muñoz, P.; Bouza, E.; Cuenca-Estrella, M.; Eiros, J.M.; Pérez, M.J.; Sánchez-Somolinos, M.; Rincón, C.; Hortal, J.; Peláez, T. Saccharomyces cerevisiae Fungemia: An Emerging Infectious Disease. Clin. Infect. Dis. 2005, 40, 1625–1634. [Google Scholar] [CrossRef]
Mayer, F.L.; Wilson, D.; Hube, B. Candida albicans pathogenicity mechanisms. Virulence 2013, 4, 119–128. [Google Scholar] [CrossRef] [Green Version]
Centers for Disease Control and Prevention: Antibiotic Resistance Threats in the United States. 2019. Available online: https://www.cdc.gov/drugresistance/pdf/threatsreport/2019-ar-threats-report-508.pdf (accessed on 25 May 2021).
Levin, B.R.; Rozen, D.E. Non-inherited antibiotic resistance. Nat. Rev. Microbiol. 2006, 4, 556–562. [Google Scholar] [CrossRef] [PubMed]
Habib, N.; Avraham-Davidi, I.; Basu, A.; Burks, T.; Shekhar, K.; Hofree, M.; Choudhury, S.R.; Aguet, F.; Gelfand, E.; Ardlie, K.; et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 2017, 14, 955–958. [Google Scholar] [CrossRef] [Green Version]
Klis, F.M.; de Koster, C.G.; Brul, S. Cell wall-related bionumbers and bioestimates of Saccharomyces cerevisiae and Candida albicans. Eukaryot. Cell 2014, 13, 2–9. [Google Scholar] [CrossRef] [Green Version]
Miura, F.; Kawaguchi, N.; Yoshida, M.; Uematsu, C.; Kito, K.; Sakaki, Y.; Ito, T. Absolute quantification of the budding yeast transcriptome by means of competitive PCR between genomic and complementary DNAs. BMC Genom. 2008, 9, 574. [Google Scholar] [CrossRef] [Green Version]
McInnes, L.; Healy, J.; Saul, N.; Großberger, L. UMAP: Uniform Manifold Approximation and Projection. J. Open Source Softw. 2018, 3, 861. [Google Scholar] [CrossRef]
Silva, A.; Cavero, S.; Sarah, V.; Solé, C.; Böttcher, R.; Chávez, S.; Posas, F.; de Nadal, E. Regulation of transcription elongation in response to osmostress. PLoS Genet. 2017, 13, e1007090. [Google Scholar] [CrossRef] [PubMed]
Richter, K.; Haslbeck, M.; Buchner, J. The Heat Shock Response: Life on the Verge of Death. Mol. Cell 2010, 40, 253–266. [Google Scholar] [CrossRef]
Qiu, X.; Mao, Q.; Tang, Y.; Wang, L.; Chawla, R.; Pliner, H.A.; Trapnell, C. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 2017, 14, 979–982. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Kim, J.; Sudbery, P. Candida albicans, a major human fungal pathogen. J. Microbiol. 2011, 49, 171–177. [Google Scholar] [CrossRef]
Slutsky, B.; Staebell, M.; Anderson, J.; Risen, L.; Pfaller, M.; Soll, D.R. “White-opaque transition”: A second high-frequency switching system in Candida albicans. J. Bacteriol. 1987, 169, 189–197. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Lan, C.-Y.; Newport, G.; Murillo, L.A.; Jones, T.; Scherer, S.; Davis, R.W.; Agabian, N. Metabolic specialization associated with phenotypic switching in Candidaalbicans. Proc. Natl. Acad. Sci. USA 2002, 99, 14907–14912. [Google Scholar] [CrossRef] [Green Version]
Stuart, T.; Butler, A.; Hoffman, P.; Hafemeister, C.; Papalexi, E.; Mauck, W.M., 3rd; Hao, Y.; Stoeckius, M.; Smibert, P.; Satija, R. Comprehensive Integration of Single-Cell Data. Cell 2019, 177, 1888–1902.e21. [Google Scholar] [CrossRef] [PubMed]
Issi, L.; Farrer, R.A.; Pastor, K.; Landry, B.; Delorey, T.; Bell, G.W.; Thompson, D.A.; Cuomo, C.A.; Rao, R.P. Zinc Cluster Transcription Factors Alter Virulence in Candida albicans. Genetics 2017, 205, 559–576. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Cottier, F.; Tan, A.S.M.; Yurieva, M.; Liao, W.; Lum, J.; Poidinger, M.; Zolezzi, F.; Pavelka, N. The transcriptional response of Candida albicans to weak organic acids, carbon source, and MIG1 inactivation unveils a role for HGT16 in mediating the fungistatic effect of acetic acid. G3 Genes Genomes Genet. 2017, 7, 3597–3604. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Morrow, J.D. Fluconazole: A new triazole antifungal agent. Am. J. Med. Sci. 1991, 302, 129–132. [Google Scholar] [CrossRef]
Berkow, E.L.; Lockhart, S.R. Fluconazole resistance in Candida species: A current perspective. Infect. Drug Resist. 2017, 10, 237–245. [Google Scholar] [CrossRef] [Green Version]
Pfizer. DIFLUCAN (Fluconazole Tablets) (Fluconazole Injection—For Intravenous Infusion Only) (Fluconazole for Oral Suspension). FDA Drug Label 2011, 1–37. Available online: http://www.accessdata.fda.gov/drugsatfda_docs/label/2014/019949s058,019950s062,020090s042lbl.pdf (accessed on 2 February 2021).
Bandara, H.M.H.N.; Wood, D.L.A.; Vanwonterghem, I.; Hugenholtz, P.; Cheung, B.P.K.; Samaranayake, L.P. Fluconazole resistance in Candida albicans is induced by Pseudomonas aeruginosa quorum sensing. Sci. Rep. 2020, 10, 7769. [Google Scholar] [CrossRef]
Henry, K.W.; Nickels, J.T.; Edlind, T.D. Upregulation of ERG genes in Candida species by azoles and other sterol biosynthesis inhibitors. Antimicrob. Agents Chemother. 2000, 44, 2693–2700. [Google Scholar] [CrossRef] [Green Version]
Leber, R.; Fuchsbichler, S.; Klobucníková, V.; Schweighofer, N.; Pitters, E.; Wohlfarter, K.; Lederer, M.; Landl, K.; Ruckenstuhl, C.; Hapala, I.; et al. Molecular mechanism of terbinafine resistance in Saccharomyces cerevisiae. Antimicrob. Agents Chemother. 2003, 47, 3890–3900. [Google Scholar] [CrossRef] [Green Version]
Fahrner, K.; Yarger, J.; Hereford, L. Yeast histone mRNA is polyadenylated. Nucleic Acids Res. 1980, 8, 5725–5737. [Google Scholar] [CrossRef] [Green Version]
Castrillo, J.I.; Zeef, L.A.; Hoyle, D.C.; Zhang, N.; Hayes, A.; Gardner, D.C.; Cornell, M.J.; Petty, J.; Hakes, L.; Wardleworth, L.; et al. Growth control of the eukaryote cell: A systems biology study in yeast. J. Biol. 2007, 6, 4. [Google Scholar] [CrossRef]
Bhattacharya, S.; Esquivel, B.D.; White, T.C. Overexpression or Deletion of Ergosterol Biosynthesis Genes Alters Doubling Time, Response to Stress Agents, and Drug Susceptibility in Saccharomyces cerevisiae. MBio 2018, 9, e01291-18. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Tzur, Y.B.; Winter, E.; Gao, J.; Hashimshony, T.; Yanai, I.; Colaiácovo, M.P. Spatiotemporal Gene Expression Analysis of the Caenorhabditis elegans Germline Uncovers a Syncytial Expression Switch. Genetics 2018, 210, 587–605. [Google Scholar] [CrossRef] [Green Version]
Farrell, J.A.; Wang, Y.; Riesenfeld, S.J.; Shekhar, K.; Regev, A.; Schier, A.F. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science 2018, 360, 6392. [Google Scholar] [CrossRef] [Green Version]
Ariss, M.M.; Islam, A.B.M.M.K.; Critcher, M.; Zappia, M.P.; Frolov, M. V Single cell RNA-sequencing identifies a metabolic aspect of apoptosis in Rbf mutant. Nat. Commun. 2018, 9, 1–13. [Google Scholar] [CrossRef] [PubMed]
Matejuk, A.; Leng, Q.; Begum, M.D.; Woodle, M.C.; Scaria, P.; Chou, S.-T.; Mixson, A.J. Peptide-based Antifungal Therapies against Emerging Infections. Drugs Future 2010, 35, 197. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Schneider, V.A.; Graves-lindsay, T.; Howe, K.; Bouk, N.; Chen, H.; Kitts, P.A.; Murphy, T.D.; Pruitt, K.D.; Thibaud-nissen, F.; Albracht, D.; et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 2017, 27, 849–864. [Google Scholar] [CrossRef] [Green Version]
Saccharomyces Cerevisiae (ID 15)—Genome—NCBI. Available online: https://www.ncbi.nlm.nih.gov/genome/?term=s+cerevisiae (accessed on 21 May 2020).
Candida Albicans (ID 21)—Genome—NCBI. Available online: https://www.ncbi.nlm.nih.gov/genome/?term=Candida%20albicans (accessed on 21 May 2020).
Dobin, A.; Davis, C.A.; Schlesinger, F.; Drenkow, J.; Zaleski, C.; Jha, S.; Batut, P.; Chaisson, M.; Gingeras, T.R. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 2013, 29, 15–21. [Google Scholar] [CrossRef] [PubMed]
Costa-de-Oliveira, S.; Rodrigues, A.G. Candida albicans Antifungal Resistance and Tolerance in Bloodstream Infections: The Triad Yeast-Host-Antifungal. Microorganisms 2020, 8, 154. [Google Scholar] [CrossRef] [Green Version]
Cowen, L.E.; Steinbach, W.J. Stress, drugs, and evolution: The role of cellular signaling in fungal drug resistance. Eukaryot. Cell 2008, 7, 747–764. [Google Scholar] [CrossRef] [Green Version]
Köster, J.; Rahmann, S. Snakemake-a scalable bioinformatics workflow engine. Bioinformatics 2012, 28, 2520–2522. [Google Scholar] [CrossRef] [Green Version]
Andrews, S. FastQC: A Quality Control Tool for High Throughput Sequence Data. 2010. Available online: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 12 December 2021).
Smith, T.; Heger, A.; Sudbery, I. UMI-tools: Modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Res. 2017, 27, 491–499. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Liao, Y.; Smyth, G.K.; Shi, W. FeatureCounts: An efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 2014, 30, 923–930. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Spellman, P.T.; Sherlock, G.; Zhang, M.Q.; Iyer, V.R.; Anders, K.; Eisen, M.B.; Brown, P.O.; Botstein, D.; Futcher, B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 1998, 9, 3273–3297. [Google Scholar] [CrossRef] [PubMed]
Côte, P.; Herve, H.; Whiteway, M. Transcriptional Analysis of the Candida albicansCell Cycle. Mol. Biol. Cell 2009, 20, 3363–3373. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Stuart, T.; Satija, R. Integrative single-cell analysis. Nat. Rev. Genet. 2019, 20, 257–272. [Google Scholar] [CrossRef]
Mao, Q.; Wang, L.; Goodison, S.; Sun, Y. Dimensionality Reduction Via Graph Structure Learning. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–14 August 2015; Association for Computing Machinery: New York, NY, USA, 2015; pp. 765–774. [Google Scholar]

Figure 1. mDrop-seq of Saccharomyces cerevisiae cells. (A) mDrop-seq experimental schematic (left) and list of mDrop-seq experiments on different yeast samples (right). (B) Number of genes and UMI detected for each cell at two cell lysis times: 15 and 20 min. (C) UMAP visualizing the clustering on 12,012 S. cerevisiae cells. (D) Boxplot displaying the top 15 genes expressed by fraction of total counts for 15 and 20 min lysis times. (E) Species-mixing plots where each dot depicts a unique cellular barcode that align to S. cerevisiae (blue), C. albicans (red), or both genomes (purple). Two Poisson loading concentrations of 350 cells/µL (left; λ = 0.08) and 700 cells/µL (right; λ = 0.15) were tested. (A) was generated using BioRender.com.

Figure 2. Heat shock treatment of 26,019 Saccharomyces cerevisiae cells profiled using mDrop-seq. (A) Violin plots displaying the number of genes and UMI for heat shock replicates. (B) Correlation between average gene expression values for the control and heat shock S. cerevisiae datasets. (C,D) Violin plots displaying expression differences in control and heat shock datasets for (C) heat shock-related genes, and (D) “house-keeping” (actin, histones, ribosomal, and glycolysis) genes. (E–G) Pseudo-time trajectory of gene expression inferred from the combined control and heat-shocked S. cerevisiae dataset shown in this figure. Colors indicate: (E) experimental time-points, (F) Cell cycle stages (cells that could not be assigned to a cell cycle stage are marked NA), and (G) cell-type clusters shown in Figure S4A.

Figure 3. mDrop-seq of 10,314 Candida albicans cells, replicate 2. (A) Violin plots of genes and UMI counts per cell. (B) Boxplot of 15 genes with highest expression in C. albicans, plotted by fraction of total counts. (C) UMAP displaying the clustering analysis of 10,314 C. albicans cells after batch correction. (D–F) Feature plots displaying: (D) GPI-anchored cell wall proteins that represent markers for cluster 3, (E) histone tail genes that represent markers for clusters 2 and 5, (F) genes for transcription factors, fatty acid biosynthesis, and hyphal formation involved with C. albicans virulence. (G) Heatmap displaying expressions of the top marker genes for each cluster.

Figure 4. Fluconazole treatment of 15,503 Candida albicans cells profiled using mDrop-seq. (A) Number of genes and UMI for each fluconazole treatment times. (B) UMAP displaying the clustering of the combined control and fluconazole data. (C) Correlation between the control and 3 h fluconazole exposed yeast for gene expression. (D) Violin plots displaying expression in control and fluconazole datasets for several ergosterol biosynthesis genes. (E) Violin plots displaying the expression differences of genes detected at significantly higher expression in fluconazole treated data. (F) Violin plots of gene expression show significantly higher expression in the control data. (G) Violin plots of DEGs for the 1.5 h fluconazole exposed data. (H) Cell cycle assignment on the integrated control and fluconazole data. (I) UMAP of the integrated data after cell cycle regression. (J,K) Pseudo-time trajectory of the combined data inferred using Monocle. Colors indicate: (J) experimental time-points, and (K) cell-type clusters shown in I.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Dohn, R.; Xie, B.; Back, R.; Selewa, A.; Eckart, H.; Rao, R.P.; Basu, A. mDrop-Seq: Massively Parallel Single-Cell RNA-Seq of Saccharomyces cerevisiae and Candida albicans. Vaccines 2022, 10, 30. https://doi.org/10.3390/vaccines10010030

AMA Style

Dohn R, Xie B, Back R, Selewa A, Eckart H, Rao RP, Basu A. mDrop-Seq: Massively Parallel Single-Cell RNA-Seq of Saccharomyces cerevisiae and Candida albicans. Vaccines. 2022; 10(1):30. https://doi.org/10.3390/vaccines10010030

Chicago/Turabian Style

Dohn, Ryan, Bingqing Xie, Rebecca Back, Alan Selewa, Heather Eckart, Reeta Prusty Rao, and Anindita Basu. 2022. "mDrop-Seq: Massively Parallel Single-Cell RNA-Seq of Saccharomyces cerevisiae and Candida albicans" Vaccines 10, no. 1: 30. https://doi.org/10.3390/vaccines10010030

APA Style

Dohn, R., Xie, B., Back, R., Selewa, A., Eckart, H., Rao, R. P., & Basu, A. (2022). mDrop-Seq: Massively Parallel Single-Cell RNA-Seq of Saccharomyces cerevisiae and Candida albicans. Vaccines, 10(1), 30. https://doi.org/10.3390/vaccines10010030

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

mDrop-Seq: Massively Parallel Single-Cell RNA-Seq of Saccharomyces cerevisiae and Candida albicans

Abstract

1. Introduction

2. Results

2.1. Optimizing Single Yeast Cell Lysis in Droplets

2.2. Single-Cell Specificity of mDrop-Seq Confirmed by Species Mixing Experiments

2.3. Heterogeneity in Heat Shock Response of Saccharomyces cerevisiae

2.4. Activation of Stress Response in S. cerevisiae under Heat Shock

2.5. Characterizing Expression Heterogeneity in Candida albicans Cells

2.6. Differential Expression in Candida albicans in Response to Fluconazole Exposure

2.7. Trajectory Inference in Fluconazole Stimulation in Candida albicans

3. Discussion

3.1. Droplet Content and Stability

3.2. Sequencing and Alignment

3.3. Analysis of Stress, Heat Shock and Response to Anti-Fungal Agent in Yeast Cells

3.4. Scalability as a Technology

4. Materials and Methods

4.1. Yeast Strains and Cell Culture

4.2. Heat Shock Stimulation of S. cerevisiae

4.3. Fluconazole Stimulation of C. albicans

4.4. mDrop-seq Cell Preparation and Co-Encapsulation in Droplets

4.5. Cell Lysis, Reverse Transcription, cDNA Amplification and Next Era Library Generation for mDrop-Seq

4.6. Population-Level RNA-Seq Library Preparation of S. cerevisiae and C. albicans

4.7. Sequencing

4.8. mDrop-Seq Data Preprocessing, Alignment and Quality Control

4.9. Bulk RNA-Seq Data Processing

4.10. Clustering Cells and Generating UMAP

4.11. Calculating Doublet Rates from Species Mixing Experiments

4.12. Batch Correction

4.13. Cell Cycle Analysis

4.14. Hierarchical Clustering of Cells

4.15. Trajectory Analysis on Single-Cell Data

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI