Single-Cell RNAseq Profiling of Human γδ T Lymphocytes in Virus-Related Cancers and COVID-19 Disease

The detailed characterization of human γδ T lymphocyte differentiation at the single-cell transcriptomic (scRNAseq) level in tumors and patients with coronavirus disease 2019 (COVID-19) requires both a reference differentiation trajectory of γδ T cells and a robust mapping method for additional γδ T lymphocytes. Here, we incepted such a method to characterize thousands of γδ T lymphocytes from (n = 95) patients with cancer or adult and pediatric COVID-19 disease. We found that cancer patients with human papillomavirus-positive head and neck squamous cell carcinoma and Epstein–Barr virus-positive Hodgkin’s lymphoma have γδ tumor-infiltrating T lymphocytes that are more prone to recirculate from the tumor and avoid exhaustion. In COVID-19, both TCRVγ9 and TCRVγnon9 subsets of γδ T lymphocytes relocalize from peripheral blood mononuclear cells (PBMC) to the infected lung tissue, where their advanced differentiation, tissue residency, and exhaustion reflect T cell activation. Although severe COVID-19 disease increases both recruitment and exhaustion of γδ T lymphocytes in infected lung lesions but not blood, the anti-IL6R therapy with Tocilizumab promotes γδ T lymphocyte differentiation in patients with COVID-19. PBMC from pediatric patients with acute COVID-19 disease display similar γδ T cell lymphopenia to that seen in adult patients. However, blood γδ T cells from children with the COVID-19-related multisystem inflammatory syndrome are not lymphodepleted, but they are differentiated as in healthy PBMC. These findings suggest that some virus-induced memory γδ T lymphocytes durably persist in the blood of adults and could subsequently infiltrate and recirculate in tumors.

The overall T and γδ T cell response remain hitherto poorly understood in coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. Like other respiratory viral infections, COVID-19 can exhibit distinct patterns of clinical features and disease severity, ranging from a virus-induced respiratory pathology due to a weak immune response to an overaggressive immunopathology caused by excessive immune activation. The heterogeneous immune profiles recently characterized in peripheral blood mononuclear cells (PBMC) of COVID-19 patients reflect this wide range of symptoms. COVID-19 patients have a quantitatively decreased compartment of γδ T cells that are more differentiated than control samples [16], accompanied by an increased number of immature neutrophils [17]. Similar alterations were reported with the other adaptive and unconventional T lymphocytes [18,19], supporting a scenario of broad T cell immunosuppression linked to fatal outcomes [20].
Likewise, the extent and quality of tumor-infiltrating T lymphocytes (TILs) in cancer are important determinants of the outcome of immunotherapy or untreated cancer patients. On the one hand, a weak immune response to cold tumors, peripherally excluding tumors, or tumors infiltrated by exhausted T lymphocytes allows for uncontrolled cancer and leads to fatal outcomes. On the other hand, too strong immune reactions are sometimes observed in untreated cancers (for review [21]) and systematically in patients receiving immune checkpoint blockade. Therefore, viral infections and cancer can present large and heterogeneous spectrums of immune response patterns. In human cancer, our research and other studies have shown that the abundance of γδ TILs, whether TCRVγ9 or TCRVγnon9, varies substantially across individuals and among cancer types, and that it is associated with patient outcomes [22,23]. As opposed to peripheral blood and hematopoietic malignancies dominated by TCRVγ9 γδ T cells, the predominant subset of γδ TILs in most solid tumors are TCRVγnon9 cells, whose number, cytotoxic differentiation, and functional status vary between individuals [24][25][26].
We recently depicted the pan-cancer landscape of human γδ TILs using single-cell RNA sequencing (scRNAseq). This strategy unveiled that the differentiation of cytotoxic γδ T cells from the TCRVγ9 and TCRVγnon9 subsets was related to the CMV status of the healthy donors [27]. Furthermore, despite the considerable heterogeneity of cancer patients, their infiltration and differentiation appeared phased with that of the adaptive T CD8 TIL compartment, yielding correlated rates of functional and/or exhausted γδ T and T CD8 TILs in human cancers [28].
Overall, these findings raised questions as to whether the viral status of cancer patients affected their γδ TILs, and whether the transcriptomic profiles correlated with those of T lymphocytes from COVID-19 patients' PBMC. Here, we addressed these questions through a comprehensive characterization of the differentiation and functional hallmarks of γδ T lymphocytes by analyzing scRNAseq datasets of tumors from cancer patients with known viral status, and PBMC from COVID-19 patients.

scRNAseq Datasets Pre-Processing
The (cells, gene) matrix data for the control γδ T isolated from healthy donor's PBMC has been previously published [28]. The scRNAseq of tumor lesions from Hodgkin's lymphoma (HL) (EGAS00001004085) [29], head and neck squamous cell carcinoma (HNSCC) (GSE139324) [30], PBMC from COVID-19 patients (GSE145926, GSE155224, GSE155249, GSE166489, and GSE167029) were downloaded and assembled to digitally extract their γδ T lymphocytes. After pre-processing and discarding cell doublets and dying cells, all datasets were integrated using the R package Seurat version 4.0 [31]. Principal components analysis (PCA) was performed on this integrated dataset before uniform manifold approximation and projection (UMAP) [32] to check the integrated dataset quality.

Single-Cell Signatures and Scores
After normalization and integration, the integrated scRNAseq dataset was scored for a collection of multigene signatures [28] using Single-Cell Signature Explorer software [33]. Briefly, the score of each single-cell C j for gene set GS x was computed as the sum of all unique molecular identifiers (UMI) for all the GS x genes expressed by C j divided by the sum of all UMI expressed by Cj: Score of cell Cj for geneset GSx = ∑ (UMI)GSx Cj / ∑ (UMI)Cj (1) All cell signature scores were merged with each cell's trajectory coordinates using Single-Cell Signature Merger software [33]. For visualization, Single-Cell Multilayer Viewer was used, a serverless software allowing to merge up to five layers of quantitative and qualitative variables, alone or in combination [28].

Score and Gate for Digital Extraction of γδ T Lymphocytes
Starting from any scRNAseq datasets pre-processed as above, this procedure comprised five gating steps sequentially applied using Single-Cell Virtual Cytometer software [34] The datasets of the finally identified γδ T cells were digitally extracted from the original scRNAseq dataset and further subtyped and injected on the trajectory (see below).

Subtyping of the TCRVγ9 and TCRVγnon9 γδ T Cells
This was performed as previously described [28]. Briefly, γδ T lymphocytes express either the TRGC1-encoded TCRVγ9 or the TRGC2-encoded TCRVγnon9 in a mutually exclusive fashion, so the extracted γδ T lymphocytes were categorized as TCRVγ9 cells based on ('TRDC, TRGC1' positive cells) or TCRVγnon9 cells based on ('TRDC, TRGC2' positive cells) classifications using a compensated score of these two signatures. Compensated scores were obtained by multiplying the score of the gene set of interest, here GSS TCRVγ9 , by its difference to its complementary gene set (GSS TCRVγ9 minus GSS TCRVγnon9 ): Scatterplots of GSS TCRVγ9 and GSS TCRVγnon9 finally identified the TCRVγ9 or TCRVγnon9 subset of each γδ T lymphocyte [28].

Cells Injection on Public Pseudotimed Differentiation Trajectory
The maturation trajectory of γδ T cells was computed by minimum spanning tree and visualized as pseudotimed trajectories. Pseudotimed trajectories are differentiation trajectories shown with cell pseudotime on the x-axis and a projection of both trajectory dimensions (here MST1 and MST2) on the y-axis [28]. We will refer below to them as 'public γδ T lymphocytes trajectory' the pseudotimed differentiation trajectory of all γδ T lymphocytes from~150 tissues samples, including healthy donors' PBMC and cancer patients' tumors (the (cell, gene) matrix data is available from [28]). New γδ T lymphocytes digitally extracted from additional datasets, such as those from COVID-19 patients, were injected onto this 'public γδ T cell trajectory'. Briefly, a total of (n = 9) datasets for COVID-19-derived tissues were downloaded. These COVID-19 datasets were first filtered for quality control, integrated as described above, and their γδ T lymphocytes were digitally extracted by score and gate [28]. These new γδ T lymphocytes were then injected onto the public trajectory by using Seurat's Integration and Map-to-Query workflow to determine their respective (MST1, MST2) coordinates and then infer their pseudotime within the public trajectory. This procedure encompasses the following four steps: Step 1: • Query 1 = γδ T lymphocytes digitally extracted from COVID-19 samples; • Reference = public trajectory; • Integration of gene expression data from Query 1 and Reference (Seurat's Integration).

Tumor Infiltrating γδ T Lymphocytes from Virus-Positive and -Negative Cancer Patients
We downloaded the published scRNAseq dataset from a total of (n = 26) human tumors of head and neck carcinoma (HNSCC), including (n = 8) samples positive and (n = 18) negative for HPV, as well as (n = 9) Hodgkin's lymphoma (HL) of mixed cellularity subtype in which (n = 5) samples were positive and (n = 4) negative for EBV. Their respective γδ T lymphocytes were digitally purified, and their TCR subtype and differentiation stage were determined by injection onto the 'public γδ T cell trajectory' (see Methods) and cross-labeling with reference gene signatures from external single-cell datasets of human TILs [28].
Altogether, these data demonstrate that patients with HPV-positive HNSCC and EBVpositive HL have γδ TILs more prone to recirculate from the tumor and avoid exhaustion.

γδ T Lymphocytes from COVID-19 Patients
The scRNAseq datasets from PBMC of (n = 43) COVID-19 patients and (n = 21) healthy controls, as well as from broncho-alveolar lavage fluids (BALF) from (n = 20) COVID-19 patients and (n = 4) healthy controls were downloaded. A total of (n = 7473) γδ T lymphocytes were digitally extracted from these datasets, and their TCR subtype and differentiation stage were characterized as described above (see Methods).
In spite of similar dataset sizes, each sample of COVID-19 PBMC yielded on average three times less γδ T lymphocytes than the healthy controls (100 vs. 330, respectively, t-test P = 4 × 10 −5 ), and both TCR subsets had decreased cell counts (on average 63 vs. 142 TCRVγ9 and 37 vs. 187 TCRVγnon9 in COVID-19 vs. control samples, respectively). Parallel analyses showed on average n = 561 and n = 479 T CD8 cells per sample in the same groups indicating that the COVID-19-related lympho-reduction from PBMC is specific to the γδ T cell lineage (Chi-2 P value = 5 × 10 −27 ).

γδ T Lymphocytes in Clinical Groups of Adult Patients with COVID-19
These γδ T cell analyses were then extended to clinical subgroups of adult COVID-19 patients. The total number of γδ T lymphocytes in BALF of adult COVID-19 patients varied considerably between individuals without reaching a statistical difference between mild (n = 2) and severe (n = 6) disease (on average 39 vs. 22 γδ T cells per sample, respectively). The TCRVγ9 cell counts were similar in both groups (on average 13 vs. 14 cells per sample, respectively), whereas the TCRVγnon9 cell counts decreased in severe COVID-19 (on average 26 vs. 7) despite maintaining the same differentiation stage pattern (on average 1% Tn, 37% Tcm, 62% Tem, 0% Temra). Of note, with 0% Tn, 20% Tcm, 80% Tem, and 0% Temra in both groups, the TCRVγ9 cells were more differentiated than the TCRVγnon9 cells. The γδ T lymphocytes in BALF of patients with mild disease included 9% Ttrm and 12% Tex cells, contrasting to 40% Ttrm and 40% Tex in severe disease, and both subsets behaved similarly (Figure 3a). Hence, acute COVID-19 increases both tissue residency and exhaustion of γδ T lymphocytes in infected lung lesions.  Figure 1).
Together, these results indicated that acute COVID-19 disease induces a similar γδ T cell lymphopenia PBMC of pediatric and adult patients. In contrast, the PBMC-derived γδ T cells from MIS-C patients with SARS-CoV-2 infection are not lymphodepleted and resemble to those from healthy PBMC.

Discussion
In single-cell transcriptomics, pseudotime describes the progression of each cell alongside a continuous series of differentiation states, through a non-linear transformation of real chronological time [42][43][44]. How explicit time is quantitatively converted into γδ T cell pseudotime remains to be determined, and depends on the transcriptome diversity and the total number of single cells in the dataset. All the features of differentiation trajectories are strictly dependent upon the number of single cells selected. Hence, once a trajectory has been inferred from a given single-cell dataset, adding ab extra single cells to this dataset will change the trajectory, preventing their mapping on the former map. Here, we addressed this issue by incepting a method to inject without distortion any amount of ab extra cells on the reference trajectory.
In this way, we were able to map and thus characterize thousands of γδ T lymphocytes from newer datasets onto a formerly-built reference map called here 'public γδ T cell trajectory' [28]. Of note, this trajectory encompasses a remarkable set of γδ T lymphocytes, being currently paved with~30,000 γδ T cells from~220 healthy and diseased adult individuals. As a result, it achieves a much higher level of exhaustivity and resolution than is possible with the cells from a single individual alone. Together, the public γδ T cell trajectory and the method for mapping newer γδ T cells provide a unique resource for characterizing these lymphocytes from diverse sources at the highest level of resolution.
As a result of high-resolution mapping of thousands of γδ T lymphocytes from patients with cancer or COVID-19, we were able to gain a wealth of information about γδ T lymphocyte differentiation dynamics at the single-cell scale. As reviewed in this series, the role of human γδ T cells in viral infections is highly investigated and debated in the context of SARS-CoV-2 infection [45][46][47][48]. Here, we found in PBMC of COVID-19 patients that γδ T lymphocytes were decreased compared to control samples, a finding consistent with flow cytometry studies [16,20,49,50]. Although a peripheral lymphopenia in COVID-19 was initially thought to indicate T cell immunosuppression, our present maps and other studies [18] show that γδ T cells increase concurrently in BALF of COVID-19 patients. T cell exhaustion is rarely observed among the γδ T cells in PBMC from adult and pediatric patients with COVID-19, even during its acute phase as reported with HEV infection [14]. Most Tex γδ T cells were instead found in the infected lung rather than in the circulating blood. Hence, COVID-19-induced γδ T cell exhaustion likely represents a scar of recent localized T cell activation in both TCRVγ9 and TCRVγnon9 subsets of cells rather than in a single subset alone [17]. Moreover, their advanced differentiation stages in blood from children with multisystem inflammatory syndrome associated with SARS-CoV-2 infection further support this view [51]. Tocilizumab, an IL-6 pathway inhibitor, effectively reduces mortality of severely ill and lymphopenic COVID-19 patients, notably by attenuating inflammation and normalizing their circulating T cell rates [52,53]. Consistent with these reports, the increased effector memory cells we observed here indicates that Tocilizumab also promotes γδ T cell differentiation, which constitutes a bioactivity of dual interest for both antiviral [45] and anticancer [54] immunity.
Single-cell studies have shown that the differentiation dynamics of human γδ T lymphocytes in both TCRVγ9 and TCRVγnon9 subsets primarily reflect the transcriptional emergence of their cytotoxic function, which peaks at the Tem and Temra stages [27]. In many solid and hematological human cancers, both lineages of γδ T and T CD8 TILs present strikingly coherent differentiation profiles, and a sizeable fraction of γδ T TILs are tissue-resident memory and exhausted cells [28]. Many T CD8 TILs in melanoma are bystander cytotoxic lymphocytes with a TCR specificity for viral antigens (V-spe), while only a minority represent true tumor-antigen-specific (T-spe) lymphocytes [55]. Among such CD8 TILs, the V-spe clonotypes prominently display functional Tcm and Tem differentiation stages whereas the majority of T-spe clonotypes are dysfunctional Tex cells [56]. In parallel, we found that HPV-positive HNSCC and EBV-positive HL patients have γδ TILs that are more prone to recirculate from the tumor and avoid exhaustion. Considering the above findings, it is tempting to hypothesize that a contingent of peripheral γδ T lymphocytes induced by viral infections does persist in the long term in the blood of adult individuals as central/effector memory cells, and can readily infiltrate tumors to mediate a large spectrum of HLA-unrestricted cytotoxic activities despite true antigen specificity.
For its clinical relevance to cancer immunotherapy, future studies from our team will seek to discern at the single-cell level the T-spe and V-spe clonotypes persisting among the human γδ T lymphocytes in the peripheral blood and infiltrating tumors. Further, our study provides a fundamental framework for characterizing γδ T lymphocytes in viral diseases and cancers, and provides the community with a unique resource for the monitoring of human γδ T lymphocytes at high-resolution.  Informed Consent Statement: Not applicable, the presented study did not involve novel tissue samples from humans or animals.
Data Availability Statement: Further information and reasonable requests for resources and processed data, such as integrated (cell, UMI) matrices of extracted γδ T and their corresponding cell annotations, should be directed to J.-J.F. (jean-jacques.fournie@inserm.fr) or J.P.C. (juan-pablo.cerapio-arroyo@inserm.fr). Source scRNAseq datasets used in this study are available at GEO with the accession numbers: GSE139324, GSE145926, GSE155224, GSE155249, GSE166489, and GSE167029; and at the European Genome-Phenome Archive (EGA) with the accession number: EGAS00001004085. Single-Cell Multilayer Viewer (scMLV) is available on GitHub repository: https://github.com/ MarionPerrier/scMLV (accessed on 31 March 2021). Single-Cell Signature Scorer and Single-Cell Virtual Cytometer are available at: https://sites.google.com/site/fredsoftwares/products (accessed on 12 October 2021). Seurat v4 is available from the Comprehensive R Archive Network (CRAN) and further details in the installation can be found on https://satijalab.org/seurat/ (accessed on 12 October 2021).

Abbreviations
The following abbreviations are used in this manuscript CMV cytomegalovirus COVID- 19