Understanding the pathophysiology of infection is critical to the rational design of prophylactic and therapeutic strategies against infectious diseases. The course of infection, determined by the encounter between pathogens and host cells, is often measured as a population average, which leaves important cell-to-cell heterogeneity out of the picture. This heterogeneity arises from both the pathogens and the infected cells. For example, viral heterogeneity can take the form of a mixture of mutated viral particles with different infectivity [6], and bacterial heterogeneity can take the form of a population of cells with different resistance to the same antibiotics [7]. Host cellular heterogeneity is the combined result of variation in metabolism, composition, activation status, cell cycle, or infection history [6]. Recent advances in single-cell analysis provide an attractive approach to probe the diversity of cellular populations and characterize infection pathophysiology at single-cell resolution. In this section, we review how recent advances in single-cell technologies have deepened the understanding of pathogen and host cell heterogeneity and of how the complex immune system reacts against infectious pathogens, with a focus on the contributions of single-cell sequencing.
2.1. Pathogen Heterogeneity
Pathogen heterogeneity can be inherent or can result from heterogeneous host–pathogen interactions. It is a favorable feature for pathogens because varied genomic sequences or functional properties enable immune evasion, colonization of novel hosts, and acquisition of drug resistance, and thereby increase the probability of survival. In addition, stochastic fluctuations in biochemical reactions may contribute to cell-to-cell variability. Single-cell technologies provide high-resolution insights into different aspects of intracellular pathogen replication.
One area of virology that has benefited from the enhanced resolution of single-cell technologies is the study of variation in infection across single cells and the reasons for such variation. In the study by Heldt et al., cells were infected in a population, isolated into microwells, and incubated. The supernatant was then subjected to a plaque assay, and viral RNA was quantified from the lysed single infected cells [8]. Cells infected by influenza A virus (IAV) under the same conditions produced highly heterogeneous progeny virus titers, ranging from 1 to 970 plaque-forming units (PFU), and intracellular viral RNA (vRNA) levels varied over three orders of magnitude. Similarly, another study used scRNA-seq to determine the percentage of viral transcripts in the total mRNA of IAV-infected cells and revealed that while most cells contained less than 1% viral transcripts, some cells contained more than 50%, demonstrating infection heterogeneity in terms of viral load [9]. The reasons for this variation can be further explored with high-throughput imaging technology. For instance, Akpinar et al. used a virus expressing red fluorescent protein (RFP) to study the effect of defective interfering particles (DIPs) on viral infection kinetics. DIPs are noninfectious progeny particles lacking genes essential for replication, and they are commonly produced during infection due to the high mutation rate. When they participate in infection along with viable viral particles, they compete for the host cellular machinery and thereby inhibit viral replication. In this study, cells in a bulk population were infected with a mixture of vesicular stomatitis virus (VSV) expressing RFP and VSV-DIPs, and they were then either left untreated or isolated by serial dilution. RFP expression was monitored during incubation as a surrogate for the level of viral replication. The results showed that DIPs inhibited viral replication 10 times more strongly in single cells, suggesting that the inhibition of viral replication is mitigated by cell–cell interactions when infection happens in a population [10].
The genomic mutation of pathogens during infection can also be detected directly. Sequencing of the transcriptome and viral genes in single infected cells showed that IAV is highly prone to mutation during infection [11]. The consequences of the detected mutations include viral polymerase malfunction and failure to express the interferon (IFN) antagonist protein, which correlates with heterogeneous immune activation among infected cells [11
]. The sequencing of 881 plaques from 90 VSV-infected cells detected 36 parental single nucleotide polymorphisms (SNPs) and 496 SNPs generated during infection (Figure 1
]. Although an extremely low multiplicity of infection (MOI) was used, such that 85% of the cells were statistically expected to be infected with only one PFU, 56% contained more than one parental variant, indicating that pre-existing differences in viral genomes can spread within the same infectious unit, in this case the host cell population. Moreover, by measuring the viral titers produced by each infected cell, a significant correlation was found between the number of mutations in the viral progeny and the log yield of the initially infected cell.
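The expectation that most infected cells receive a single PFU at low MOI follows from the common assumption that the number of infectious units per cell is Poisson distributed. The sketch below is a minimal illustration of that calculation and is not the analysis used in the cited study; the function names and the example MOI values are assumptions chosen for illustration.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k infectious units at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def single_hit_fraction(moi: float) -> float:
    """Among infected cells (k >= 1), the fraction that received exactly one unit."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    p_infected = 1.0 - poisson_pmf(0, moi)
    return poisson_pmf(1, moi) / p_infected


if __name__ == "__main__":
    # Example MOI values are illustrative assumptions, not taken from the study.
    for moi in (0.01, 0.1, 0.3, 1.0):
        fraction = single_hit_fraction(moi)
        print(f"MOI={moi}: {fraction:.1%} of infected cells carry a single unit")
```

Under this model the single-hit fraction falls as the MOI rises, which is why a low MOI is chosen when the progeny of one infecting unit are to be traced.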
Genomic variability also exists widely among bacterial populations. Fluorescence labeling enables the quantification of bacterial growth in single host cells [13], and by correlating the heterogeneous growth with the host response, it was found that the Salmonella population exhibits different induction levels of the PhoP/Q two-component system, which modulates lipopolysaccharides (LPS) on the surface of individual bacteria [14].
2.2. Host Cell Heterogeneity
To understand the pathophysiology of infectious diseases, it is important to study the identities of targeted cells. Mounting evidence has shown that even under identical conditions, individual host cells in a population manifest differential susceptibility and responses to infection. How does this preference arise? Do susceptible cells share features that might explain their susceptibility to infection? How do the states of infected cells affect pathogen replication and infection outcome? Furthermore, how are host cell phenotypes influenced by infection individually and temporally? Answers to these questions are critical for identifying the target cells and target individuals of novel pathogens, as well as for understanding infection pathophysiology.
Analysis of cells exposed to pathogens at single-cell resolution requires, first and foremost, strategies to distinguish infected cells from uninfected ones. Pathogen-specific proteins, such as viral glycoproteins embedded in the cell membrane, or intracellular proteins such as viral capsid or polymerases, as well as pathogen nucleic acids, including genomic DNA/RNA and transcripts, can serve this purpose. These microbial elements can be labeled with specific antibodies or oligonucleotide probes for detection and quantification. Alternatively, pathogen nucleic acids can be directly captured in deep sequencing. By combining tools for pathogen identification with host cell phenotyping assays, infected cells can be profiled at the single-cell level.
Xin et al. investigated the effects of host cell heterogeneity on both acute and persistent infection by foot-and-mouth disease virus (FMDV) [16
]. By sorting single infected cells with FACS based on cellular parameters and quantifying viral genome replication with RT-PCR, they showed that host cell size and inclusion number affected FMDV infection. Cells with larger size and more inclusions contained more viral RNA copies and viral protein and yielded a higher proportion of infectious virions, which is likely due to more favorable virus absorption. Additionally, the viral titer was 10- to 100-fold higher in cells in the G2/M phase than in cells in other phases of the cell cycle, suggesting that cells in G2/M were more favorable to viral infection or viral replication. Such findings have also been reported for other viruses [9
], revealing a general effect of heterogeneous cell cycle status in a population on virus infection.
Golumbeanu et al. demonstrated host cell heterogeneity using scRNA-seq: they showed that latently HIV-infected primary CD4+ T cells are transcriptionally heterogeneous and can be separated into two main cell clusters [19]. Their distinct transcriptional profiles correlate with the propensity to respond to stimulation and reactivate HIV expression. In particular, 134 genes were identified as differentially expressed, involving processes related to the metabolism of RNA and protein, electron transport, RNA splicing, and translational regulation. The findings from in vitro infected cells were further confirmed in CD4+ T cells isolated from HIV-infected individuals. Similarly, using scRNA-seq and immunohistochemistry, several candidate Zika virus (ZIKV) entry receptors were examined in the developing human cerebral cortex and retina, and AXL was found to show particularly high transcript and protein expression levels [20].
scRNA-seq can also be used to identify potential target cells of novel pathogens and thereby facilitate the understanding of disease pathogenesis and treatment. The spike protein of SARS-CoV-2, the virus responsible for the COVID-19 pandemic, binds to human angiotensin-converting enzyme 2 (ACE2) [22]. This binding, together with the host type II transmembrane serine protease TMPRSS2, facilitates viral entry [22]. Analysis of existing human scRNA-seq data identified that lung type II pneumocytes, ileal absorptive enterocytes, and nasal goblet secretory cells co-express ACE2 and TMPRSS2, which suggests that they might be targets of SARS-CoV-2 [24].
In the preparation of scRNA-seq libraries, a standard poly-T oligonucleotide (oligo-dT) is commonly used to capture mRNA from single cells, and it can also capture polyadenylated viral transcripts from DNA viruses or negative-sense single-stranded RNA viruses. Simultaneous analysis of host transcriptome profiles and viral DNA/RNA provides information on the presence and activity of the studied pathogen and allows a more accurate characterization of the dynamics of host–pathogen interactions.
Wyler et al. profiled the transcriptome of single human primary fibroblasts before and at several time points post-infection with herpes simplex virus-1 (HSV-1), and they described a temporal order of viral gene expression at the early infection stage [25
]. More importantly, by simultaneously profiling host and viral mRNA, they found that the transcription factor NRF2 is associated with resistance to HSV-1 infection. The finding was verified by showing that NRF2 agonists impaired virus production. Steuerman et al. performed scRNA-seq on cells from mouse lung tissue obtained 2 days after influenza infection [26
]. FACS was applied to sort immune and non-immune cells based on CD45 expression. Nine cell types were clustered (Figure 2A), and viral load was determined as the proportion of reads aligned to influenza virus gene segments, with cells above 0.05% considered infected. The authors found that viral infection could be detected in all cell types, with the percentage of infected cells ranging from 62% in epithelial cells to 22% in T cells. However, high variability in viral load was observed only among epithelial cells, while the majority of infected cells of other types had a low viral load (less than 0.5%) (Figure 2).
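The read-fraction definition of viral load described above can be sketched in a few lines. This is a minimal illustration rather than the pipeline of the cited study: the two thresholds are the 0.05% and 0.5% values quoted in the text, while the function names, the category labels, and the example read counts are assumptions made for illustration.

```python
INFECTED_THRESHOLD = 0.0005  # 0.05% of reads, the infection cutoff quoted in the text
LOW_LOAD_THRESHOLD = 0.005   # 0.5% of reads, the "low viral load" bound quoted in the text


def viral_load(viral_reads: int, total_reads: int) -> float:
    """Fraction of a cell's reads that align to viral gene segments."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads / total_reads


def classify_cell(load: float) -> str:
    """Assign a hypothetical label to a cell based on its viral read fraction."""
    if load <= INFECTED_THRESHOLD:
        return "uninfected"
    if load < LOW_LOAD_THRESHOLD:
        return "infected_low"
    return "infected_high"


if __name__ == "__main__":
    # Made-up (viral_reads, total_reads) pairs for three hypothetical cells.
    cells = {"cell_a": (0, 20000), "cell_b": (40, 20000), "cell_c": (4000, 20000)}
    for name, (viral, total) in cells.items():
        load = viral_load(viral, total)
        print(name, f"{load:.3%}", classify_cell(load))
```

A fixed read-fraction cutoff of this kind does not by itself separate cells with active viral transcription from cells carrying exogenous viral RNA, so the labels should be read as operational calls.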
For positive-sense RNA viruses, whose transcripts lack polyadenylation and cannot be captured by oligo-dT, a DNA oligo probe reverse complementary to the positive-strand viral RNA was employed. Zanini et al. described this method and correlated gene expression with virus level in the same cell to study infection by dengue virus (DENV) and Zika virus (ZIKV). They identified several cellular functions involved in DENV and ZIKV replication, including ER translocation, N-linked glycosylation, and intracellular membrane trafficking [27
]. Interestingly, contrasting the transcriptional dynamics in DENV- versus ZIKV-infected cells revealed differences in the specificity of these cellular factors, with a few genes playing opposite roles in the two infections. Genes favoring DENV infection (such as RPL31 and TMED2) and genes acting against it (such as ID2) were also validated with gain/loss-of-function experiments.
Analysis methods for detecting genetic variants from scRNA-seq data have been advancing [28]. In the study of infectious diseases, they could contribute to the characterization of temporal changes in viral mutational prevalence [31]. Moreover, viral mutations can be correlated with host gene expression status at the single-cell level to investigate their mutual effects throughout the course of infection and to reveal the dynamic host responses and pathogen adaptations as infection progresses [32].
Although the above-mentioned examples characterize virus presence with scRNA-seq, it is worth noting that the occurrence of viral mRNA or genomes is not necessarily equivalent to the production of viral progeny, for reasons such as essential genes being lost through mutation. Experimental techniques enabling the joint analysis of host transcriptional responses and viral titers will be needed to reveal the mechanisms underlying virus production levels and host cell heterogeneity. Another challenge in analyzing viral RNA data is distinguishing infected cells with intracellular viral transcription from uninfected cells that have acquired exogenous viral RNA. Combining single-cell transcriptomics data with flow cytometry or mass cytometry by time-of-flight (CyTOF) to measure intracellular viral protein may help overcome this issue.
2.3. Host Immune Responses in Infection
Immune responses activated by infection, whether the innate immune responses that are primarily initiated in infected cells or the adaptive immune responses carried out by lymphocytes with specific roles, are dynamic and complex, and they often happen in specific tissue microenvironments. Heterogeneity in immune responses is also a long-recognized phenomenon. For instance, the activation of antiviral responses in dendritic cells (DCs) by bacterial LPS starts with a small fraction of cells initiating the reaction, followed by the rest of the population responding via paracrine signaling [33
]. Technologies that enable the simultaneous measurement of multiple parameters facilitate high-resolution characterization of transcripts and protein at the single-cell level and boost our understanding of how host immune responses are initiated and orchestrated against infection. Although pathogens usually dominate the war with host immune responses, hence the prevalence of infectious diseases, in-depth understanding of the interplay provides valuable information for the design of strategies to fight against infectious diseases. In this section, we cover the single-cell characterization of both innate immune responses from infected cells and adaptive immune responses activated in infected units.
Type I interferon (IFN), a key cytokine in innate immunity, orchestrates the first line of host defense against infection. Its production is initiated upon host cells sensing pathogen-specific molecules, and it turns on the antiviral state of host cells by activating the transcription of hundreds of IFN-stimulated genes (ISGs), some of which are crucial for coordinating adaptive immune responses. Many studies have shown a large variability of IFN expression among infected cells. In the case of influenza virus infection, this can be partially explained by the high mutation rate during replication, revealed by sequencing viral genes in single infected IFN reporter cells [11
]. However, such variability was also found in infected cells expressing unmutated copies of all viral genes, which might result from the stochastic nature of immune activation, independent of viral genotype [11].
In another study, PBMCs from patients with latent tuberculosis infection (LTBI) or active tuberculosis (TB), and from healthy individuals were analyzed with scRNA-seq [34
]. T cells, B cells, and myeloid cells were distinguished, and 29 subsets were clustered. The novel finding of this work is the consistent depletion of one natural killer (NK) cell subset going from samples of healthy individuals to samples from patients with LTBI and TB, which was also validated by flow cytometry. The discovered NK cell subset could potentially serve as a biomarker for distinguishing TB from LTBI patients, which is valuable for predicting disease outcome and developing treatment strategies. By analyzing scRNA-seq data of PBMCs derived from individuals before and at multiple time points after virus detection, Kazer et al. investigated the dynamics of immune responses during acute HIV infection [35
]. After identifying well-established cell types and subsets in PBMCs, the authors examined how the phenotype of each cell type varied during the course of infection. Genes involved in cell-type-specific activities, including monocyte antiviral activity, dendritic cell activation, naïve CD4+ T cell differentiation, and NK trafficking, changed in parallel with plasma virus levels, peaking close to the time of virus detection and gradually declining over time.
Phenotypic variations in bacterial populations were shown to influence host cell responses. Avraham et al. investigated macrophage responses against Salmonella infection with fluorescent reporter-expressing bacteria and scRNA-seq of host cells [14]. Transcriptional profiling revealed bimodal activation of type I IFN responses in infected cells, which correlated with the level of induction of the bacterial PhoP/Q two-component system. Macrophages that engulfed bacteria with a high level of PhoP/Q induction displayed high levels of the type I IFN response, presumably because of the surface LPS level associated with PhoP/Q induction. With a similar setup, Saliba et al. studied the heterogeneity of the Salmonella proliferation rate in infected macrophages [13]. The varied growth rate of the bacteria, indicated by fluorescence expression of engineered Salmonella in single host cells, influenced the polarization of macrophages. Those bearing nongrowing Salmonella manifested proinflammatory M1 macrophage markers, similar to bystander cells, which were exposed to pathogens but not infected. In comparison, cells containing fast-growing Salmonella adopted an anti-inflammatory, M2-like state, showing that bacteria can reprogram host cell activities for the benefit of their survival.
The above-mentioned strategy of simultaneously profiling the host cell transcriptome and viral RNA also plays an important role in characterizing immune responses against infection, because it identifies infected immune cells and analyzes their transcriptomes at the same time. For instance, it was applied to study heterogeneous innate immune activation during infection by West Nile virus (WNV) [17]. High variability was revealed in both viral RNA abundance and the expression of IFN and ISGs. Interestingly, the expression of some ISGs, with Tnfsf10 and Mx1 being the most prominent examples, was found to be negatively correlated with viral RNA abundance, which could be a direction for future studies on WNV-mediated immune suppression in infected cells. Similarly, Zanini et al. studied the molecular signatures indicating the development of severe dengue (SD) by analyzing single PBMCs derived from patients [36]. FACS was employed to sort PBMCs into different cell types (T cells, B cells, NK cells, DCs, monocytes), and scRNA-seq was then performed. The majority of viral RNA-containing cells in the blood of patients who progressed to SD were naïve immunoglobulin M (IgM) B cells expressing the CD69 and CXCR4 receptors, as well as monocytes. Transcriptomic profiling indicated that various IFN-regulated genes, especially MX2 in naïve B cells and CD163 in CD14+ monocytes, were upregulated prior to progression to SD.
Comparison of the single-cell transcriptomes of lung tissue from healthy and influenza-infected mice revealed that 101 genes, the majority of which are ISGs and targets of antiviral transcription factors, were consistently upregulated across all nine identified infected cell types, including both immune and non-immune cells [26
]. This finding suggested that antiviral innate responses against influenza infection are generic across cell types (Figure 2C). Moreover, contrasting the expression profiles of infected, bystander, and unexposed cells showed that the non-specific IFN gene module results from extracellular exposure and responses to environmental signals.
While single-cell transcriptomics provides an unbiased determination of host cell states, proteomic analysis offers direct characterization of the proteins expressed upon pathogen activation. Going beyond traditional flow cytometry, mass cytometry, or cytometry by time-of-flight (CyTOF), offers a vastly increased number of parameters that can be investigated simultaneously, greatly increasing the depth of the dataset collected. For instance, to investigate the effect of a preceding dengue virus infection on the outcome of subsequent Zika infection, PBMCs derived from either patients with acute dengue infection or healthy individuals were incubated with dengue virus or Zika virus, and the treated PBMCs were assessed by multiparameter CyTOF [37
]. CyTOF in this study allowed the simultaneous detection of changes in the frequency of immune cell subpopulations and quantification of functional activation markers and cytokines in distinct cell subsets. While secondary infection with dengue virus led to increases of CD4+ T cells and T cell subsets, which are involved in adaptive immunity, secondary infection with Zika virus induced the upregulation of several functional markers including IFNγ and macrophage inflammatory protein-1β (MIP-1β) in NK cells, DCs, and monocytes, indicating an intact innate immunity against Zika virus in the cases of possible concurrent dengue infection. Hamlin et al. compared two DENV serotypes (DENV-2 and DENV-4) in their infection in human DCs using CyTOF, which allowed simultaneous analysis on DENV replication, DC activation, cytokine production, and apoptosis [38
]. Tracking intracellular DENV proteins and extracellular viral particles showed that the two serotypes had different replication kinetics yet similar peak viral titers and percentages of infected DCs. Moreover, DENV-4 infection was found to induce higher expression of CD80 and CD40 and greater production of tumor necrosis factor-α (TNFα) and interleukin-1β (IL-1β) than DENV-2 infection. Additionally, bystander cells, identified by the absence of intracellular viral proteins, were found to produce less TNFα and IL-1β but to show more activation of interferon-inducible protein-10 (IP-10), which is an ISG.
Besides CyTOF, host cell secretomes can also be measured with customized miniaturized systems, which often improve the level of multiplexing and the flexibility of sample handling. For instance, Lu et al. showed the co-detection of 42 secreted proteins from immune effector cells stimulated with LPS [39
]. In a similar setup, Chen et al. performed a longitudinal tracking of secreted proteins from single macrophages in response to LPS treatment [40
]. These studies provide valuable insights into the dynamic and comprehensive responses to pathogens over time. Notably, such methods require microfabrication tools and skills, which are not always available, and this limits their accessibility compared with flow cytometry and CyTOF.
Epigenetic profiling at the single-cell level is also important, especially for elucidating the influence of host immune responses in chronic infection. The Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) uses Tn5 transposase to insert sequencing adapters into regions of open chromatin in order to study genome-wide chromatin accessibility. Buggert et al. applied ATAC-seq and established the epigenetic signatures of HIV-specific memory CD8+ T cells resident in lymphoid tissue [41]. Yao et al. used chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) to examine histone modifications in progenitor-like CD8+ T cells from mice chronically infected with lymphocytic choriomeningitis virus (LCMV) [42]. They found that progenitor-like CD8+ T cells showed distinct epigenomic features compared with memory precursor cells, exhibiting more abundant active histone marks (H3K27ac modification) at genes co-expressed with Tox, which encodes the thymocyte selection-associated high mobility group box protein TOX. This might promote the long-term persistence of virus-specific CD8+ T cells during chronic infection.
In some cases, deep sequencing can be combined with other single-cell technologies for comprehensive and systematic profiling of immune responses against infection. For instance, Michlmayr et al. performed 37-plex CyTOF on peripheral blood mononuclear cells (PBMCs), RNA-seq on whole blood, and serum cytokine measurement on blood samples from patients with chikungunya virus (CHIKV) infection [43
]. Moreover, samples collected at acute and convalescent phases were compared to study the disease progression. Such multidimensional analysis allows the large-scale, unbiased characterization of gene expression, cytokine/chemokine secretion, and cell subpopulation changes in response to infection. One important result of this study is revealing monocyte-centric immune response against CHIKV, with the frequency of two subsets both related to antibody titers and antiviral cytokine secretion. In addition, significant viral protein expression was found in two B cell subpopulations.
While multiple assays can be run on the same bulk sample to obtain different data parameters (e.g., transcriptomic, proteomic), such datasets cannot correlate the parameters at the resolution of a single cell. Newer advances allow multiple types of parameters to be collected simultaneously from the same cell. For instance, Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) and RNA Expression and Protein Sequencing (REAP-seq) are techniques for the simultaneous collection of transcriptomic information and high-dimensional information on specified proteomic targets. Because the antibodies are tagged with unique nucleotide sequences, the subsequent transcriptomic sequencing also reads these tags, which allows the antibody targets to be quantified. Paired transcriptomic and proteomic data at the single-cell level provide the opportunity to study the role of post-translational gene regulation in the immune response. The increased dimensionality of the information obtained may also allow more accurate machine learning to identify signatures of healthy or dysfunctional immune responses. For instance, using CITE-seq, Kotliarov et al. were able to identify a common signature of activation in a plasmacytoid dendritic cell-type I interferon/B lymphocyte network that was associated both with flares of systemic lupus erythematosus (SLE) and with the level of response to influenza vaccination [44].
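The idea of quantifying antibody targets by reading their nucleotide tags can be sketched as a counting problem. The example below is a simplified illustration under stated assumptions and is not the CITE-seq or REAP-seq processing pipeline: the tag sequences, the `TAG_TO_PROTEIN` lookup table, the read tuple layout, and the function name are all hypothetical, and unique molecular identifiers (UMIs) are assumed to be available for collapsing duplicate reads.

```python
from collections import defaultdict

# Hypothetical antibody tag whitelist; the sequences are made up for illustration.
TAG_TO_PROTEIN = {"ACGTACGT": "CD4", "TTGCAAGC": "CD8"}


def count_antibody_tags(reads):
    """Count unique UMIs per cell and per tagged antibody target.

    reads: iterable of (cell_barcode, umi, tag_sequence) tuples.
    Returns {cell_barcode: {protein: number_of_unique_umis}}.
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for cell, umi, tag in reads:
        protein = TAG_TO_PROTEIN.get(tag)
        if protein is None:
            continue  # read does not carry a known antibody tag
        key = (cell, umi, protein)
        if key in seen:
            continue  # duplicate read of a molecule already counted
        seen.add(key)
        counts[cell][protein] += 1
    return {cell: dict(per_protein) for cell, per_protein in counts.items()}


if __name__ == "__main__":
    example_reads = [
        ("cell_1", "umi_a", "ACGTACGT"),
        ("cell_1", "umi_a", "ACGTACGT"),  # duplicate of the read above
        ("cell_1", "umi_b", "ACGTACGT"),
        ("cell_1", "umi_c", "TTGCAAGC"),
        ("cell_2", "umi_a", "GGGGGGGG"),  # unknown tag, ignored
    ]
    print(count_antibody_tags(example_reads))
```

Because the resulting counts are keyed by the same cell barcode as the transcript counts, the two tables can be joined cell by cell, which is what makes the paired analysis described above possible.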