Identification of Novel Molecular Markers of Human Th17 Cells

Th17 cells are important players in host defense against pathogens such as Staphylococcus aureus, Candida albicans, and Bacillus anthracis. Th17 cell-mediated inflammation, under certain conditions in which balance in the immune system is disrupted, is the underlying pathogenic mechanism of certain autoimmune disorders, e.g., rheumatoid arthritis, Graves’ disease, multiple sclerosis, and psoriasis. In the present study, using transcriptomic profiling, we selected genes and analyzed the expression of these genes to find potential novel markers of Th17 lymphocytes. We found that APOD (apolipoprotein D); C1QL1 (complement component 1, Q subcomponent-like protein 1); and CTSL (cathepsin L) are expressed at significantly higher mRNA and protein levels in Th17 cells than in the Th1, Th2, and Treg subtypes. Interestingly, these genes and the proteins they encode are well associated with the function of Th17 cells, as these cells produce inflammation, which is linked with atherosclerosis and angiogenesis. Furthermore, we found that high expression of these genes in Th17 cells is associated with the acetylation of H2BK12 within their promoters. Thus, our results provide new information regarding this cell type. Based on these results, we also hope to better identify pathological conditions of clinical significance caused by Th17 cells.


RNA Sequencing (RNA-seq) and Analysis of Differentially Expressed Genes (DEGs)
Global changes in gene expression in human naive CD4+ cells and fully differentiated Th17 cells (from three anonymous blood donors) were analyzed by high-resolution RNA sequencing (RNA-seq). For each sample, the mRNA fraction was isolated with a NEBNext® Poly(A) mRNA Magnetic Isolation Module Kit (New England Biolabs, Ipswich, MA, USA) according to the manufacturer's instructions. Libraries were prepared using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (New England Biolabs) according to the manufacturer's instructions. Sequencing was performed on a HiSeq2000 instrument (Illumina, San Diego, CA, USA) in PE100 mode. FASTQ sequence reads were aligned to the GRCh38 reference genome. Adapter trimming was performed using the bbduk script (https://sourceforge.net/projects/bbmap/). Prior to DEG analysis, gene expression was quantified at the transcript level using Salmon software [29], which provides fast and bias-aware quantification of transcript expression. The transcript-level estimates from the two groups (naive CD4+ cells vs. Th17 cells) were then aggregated to the gene level and analyzed for differential expression using the R package DESeq (https://bioconductor.org/packages/release/bioc/html/DESeq.html; [30]). Benjamini and Hochberg's approach was used to control the false discovery rate and adjust the p-values. An adjusted p-value < 0.05 and a fold change in expression of at least 1.5 were the criteria used to define significant differences in gene expression.
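The selection criteria described above can be illustrated with a minimal sketch. It is not the DESeq implementation used in the study; it only shows the Benjamini-Hochberg adjustment and the two thresholds in plain Python. The function names, the toy gene names and values, and the symmetric treatment of up- and downregulation are assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, alpha=0.05, min_fold=1.5):
    """genes: list of (name, fold_change, raw_p); fold_change is a ratio > 0."""
    padj = benjamini_hochberg([g[2] for g in genes])
    selected = []
    for (name, fold, _), q in zip(genes, padj):
        # Assumption: a ratio of 1/1.5 counts as a 1.5-fold change (downregulation).
        magnitude = max(fold, 1.0 / fold)
        if q < alpha and magnitude >= min_fold:
            selected.append(name)
    return selected


# Hypothetical toy values, not data from the study.
toy = [
    ("GENE_A", 8.0, 0.001),
    ("GENE_B", 1.2, 0.002),
    ("GENE_C", 0.4, 0.03),
    ("GENE_D", 3.0, 0.20),
]
print(select_degs(toy))
```

In this toy input, GENE_B passes the adjusted p-value threshold but not the fold-change threshold, and GENE_D passes the fold-change threshold but not the adjusted p-value threshold, so neither is selected.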

Gene Ontology
Gene ontology analysis was performed using PANTHER software [31].

Statistics
Statistical analysis was performed using one-way ANOVA followed by Dunn's post hoc test or Friedman repeated measures ANOVA on ranks followed by Dunn's post hoc test. A P-value of 0.05 or lower indicated statistical significance. Statistical analyses were performed using SigmaStat version 3.5 (Systat Software, Inc., San Jose, CA, USA).
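For readers unfamiliar with the Friedman test named above, the following is a minimal sketch of its test statistic, assuming one row per donor and one column per T cell subtype. It is not the SigmaStat procedure: it omits the tie correction, the p-value, and Dunn's post hoc comparisons, and the function names and measurements are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(blocks):
    """blocks: one row per donor, one column per condition."""
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        # Ranks are assigned within each donor, which removes donor-level offsets.
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    # Classic chi-square form of the statistic, without tie correction.
    q = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return q, rank_sums


# Hypothetical measurements (columns: Th1, Th2, Th17, Treg), not study data.
toy_blocks = [
    [1.0, 2.0, 30.0, 8.0],
    [1.5, 2.5, 40.0, 9.0],
    [0.5, 1.0, 25.0, 7.0],
    [2.0, 3.0, 36.0, 10.0],
]
q, rank_sums = friedman_statistic(toy_blocks)
print(q, rank_sums)
```

Because ranking is done within each donor, the test compares subtypes without assuming normally distributed values, which suits small donor groups with donor-to-donor variability.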

Identification of Novel Th17-Specific Genes
To identify new Th17 cell markers, in the first stage of our study, we performed de novo sequencing of naive CD4+ cells and cells differentiated towards a Th17 phenotype for five days. Transcriptomics analysis revealed that the expression of more than 2000 genes differed in Th17 lymphocytes in comparison to CD4+ cells (Dataset S1). Some of these differentially expressed genes were previously identified, e.g., RORC [5], VDR [35], BATF3 [36], HIF1A [37], ATP1B1, IL2RB, COL6A3, and MIAT [38] (Dataset S1). Ontological analysis revealed that many terms linked to the biology and differentiation of T cells were enriched in the differentially expressed genes (Table S1 and Dataset S2). In the next step, we selected several dozen genes, which we screened using real-time RT-PCR to detect their expression in cells isolated from two blood donors and differentiated towards Th1 cells, Th2 cells, Th17 cells, and Tregs. Based on the results of this experiment, three putative Th17-specific genes were selected: APOD, C1QL1, and CTSL (Table S2). These genes were then analyzed in detail in cells isolated from a larger group of donors (Figure 1). The differentiation of each subpopulation was confirmed by expression of the signature transcription factors RORγT, TBX21, GATA3, and FOXP3 [6][39][40][41] and secretion of the specific cytokines/proteins IL-2, IL-4, IL-17, and CTLA4 [42][43][44][45][46] (Figures S1 and S2). We showed that at the mRNA level, the expression of these genes was significantly lower in the other T cell subtypes than in Th17 cells, in which their expression was at least one log higher (Figure 1), analogous to the expression of RORγT (Figure S1C). Similar results were obtained using Western blotting (Figure 2). Because APOD, C1QL1, and CTSL are all secreted proteins, we determined their amounts in the cell supernatants.
As expected, we observed the highest median APOD protein level in the supernatant of Th17 cells (36.14 ng/mL) in comparison to those of Th1 cells (1.85 ng/mL), Th2 cells (1.54 ng/mL), and Tregs (8.76 ng/mL) (Figure 3A). A similar pattern was observed for C1QL1, which exhibited median levels of 4.26 ng/mL (Th17 cells), 0.12 ng/mL (Th1 cells), 0.09 ng/mL (Th2 cells), and 0.07 ng/mL (Tregs) (Figure 3B), and for CTSL, with median levels of 5.43 ng/mL (Th17 cells), 1.12 ng/mL (Th1 cells), 1.59 ng/mL (Th2 cells), and 2.04 ng/mL (Tregs) (Figure 3C).
[Figure caption fragment: An asterisk indicates a statistically significant difference at p < 0.05. Statistical analysis was performed using Friedman repeated measures ANOVA on ranks followed by Dunn's post hoc test.]
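To make the size of these differences easier to compare across the three proteins, the short sketch below divides the Th17 median by the median for each other subtype. The medians are the values reported in the text; the dictionary layout and function name are illustrative assumptions, and ratios of medians are only a descriptive summary, not a statistical test.

```python
# Median supernatant concentrations (ng/mL) as reported in the text.
medians_ng_per_ml = {
    "APOD": {"Th17": 36.14, "Th1": 1.85, "Th2": 1.54, "Treg": 8.76},
    "C1QL1": {"Th17": 4.26, "Th1": 0.12, "Th2": 0.09, "Treg": 0.07},
    "CTSL": {"Th17": 5.43, "Th1": 1.12, "Th2": 1.59, "Treg": 2.04},
}


def th17_fold_over_others(medians):
    """Return, per protein, the ratio of the Th17 median to each other subtype."""
    result = {}
    for protein, levels in medians.items():
        th17 = levels["Th17"]
        result[protein] = {
            subtype: th17 / value
            for subtype, value in levels.items()
            if subtype != "Th17"
        }
    return result


for protein, folds in th17_fold_over_others(medians_ng_per_ml).items():
    summary = ", ".join(f"{subtype}: {fold:.1f}x" for subtype, fold in folds.items())
    print(f"{protein}: {summary}")
```

On these medians, every Th17-to-other ratio exceeds 2, with the smallest margin for CTSL relative to Tregs and the largest for C1QL1.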

Analysis of Epigenetic Marks in the Loci of the Identified Genes
As a next step, we examined the methylation and acetylation patterns of histones bound to the promoter regions of the identified genes in Th1 cells, Th2 cells, Th17 cells, and Tregs in vivo using chromatin immunoprecipitation (ChIP) assays to determine whether any cell-type-specific epigenetic modifications could be observed. The following modifications were analyzed: H3K4me3 and H2BK12ac, both of which are associated with active genes, and H3K27me2/3, which is associated with inactive gene promoters [47,48]. Interestingly, the CTSL and C1QL1 gene promoters exhibited high levels of H3K4me3 in Tregs (Figure 4), whereas for the APOD gene, the greatest degree of H3K4me3 binding was observed in Th17 cells, and that in Tregs was significantly lower (Figure 4A). Upon comparing the H2BK12ac occupancy of the APOD, CTSL, and C1QL1 promoters in Th1, Th2, Th17, and Treg cells, we observed that H2BK12ac levels were highest in Th17 cells, although it should be noted that the H2BK12ac occupancy in Tregs was also substantial (Figure 4A-C). H3K27me2/3 bound to the CTSL, C1QL1, and APOD promoters was either undetectable or present at very low levels and did not show a T-cell-specific pattern (Figure 4A-C).
[Figure caption fragment: An asterisk indicates a statistically significant difference at p < 0.05. Statistical analysis was performed using Friedman repeated measures ANOVA on ranks followed by Dunn's post hoc test.]

Discussion
In the present study, using transcriptional profiling of human naive CD4+ and Th17 cells, we selected genes potentially upregulated in mature Th17 cells. Then, we tested the expression of these genes in a number of cultures initiated from different donors to identify genes whose expression was potentially specific to Th17 cells (in comparison to Th1 cells, Th2 cells, and Tregs) (Table S2). Through this analysis, we identified three candidates, APOD, C1QL1, and CTSL, which were examined in a larger number of donors and whose specificity for Th17 cells was confirmed at the mRNA (Figure 1), cellular protein (Figure 2), and secreted protein levels ( Figure 3). As already mentioned, among the identified genes was APOD, which encodes apolipoprotein D, a glycoprotein that is distinct from other apolipoprotein family members [49] with high similarity to the lipocalin family [50]. This protein transports a wide variety of molecules, including cholesterol, phospholipids, progesterone, and pregnenolone [50][51][52][53], as well as substances such as arachidonic acid [54] and (E)-3-methyl-2-hexenoic acid [55]. Previous proteomics analysis of human CD4+ and Th17 cells revealed that APOD is upregulated in Th17 cells [56], which is in line with our own observations (Dataset S1); however, our study went beyond this as it focused on the determination of T subset-restricted expression of this and other genes to identify novel markers of Th17 cells. Increases in APOD expression were found to be correlated with altered lipid metabolism and the risk of atherosclerosis [53]. Interestingly, patients with atherosclerosis also show significant increases in the number of peripheral Th17 cells and levels of Th17-related cytokines, e.g., IL-6, IL-17, and IL-23 [57]. Similar results were also confirmed in mouse models [58,59]. 
Thus, it is tempting to speculate that Th17 cell-derived APOD participates in the proinflammatory environment and might be a diagnostic factor in this and/or other Th17-dependent chronic inflammatory diseases, especially as patients with autoimmune diseases are at greater risk of developing arteriosclerosis [60,61]. Another gene identified in this study, C1QL1, encodes the secreted complement C1q-like 1 protein of unknown function, which is predominantly expressed in the brain [62,63] and exhibits affinity for the BAI3 receptor [64]. The protein product of this gene might act as a synaptic organizer [65], but the role of C1QL1 outside the brain remains elusive. Liu et al. showed that C1QL1 activates ERK1/2 and promotes angiogenesis [66]. It is generally accepted that inflammation fosters angiogenesis [67,68], and proinflammatory Th17 cells and their cytokines participate in this process [69][70][71][72]; thus, it is conceivable that C1QL1 supports the Th17-dependent growth of new blood vessels under pathological conditions. CTSL, which encodes cathepsin L, a lysosomal cysteine proteinase, is another gene that revealed a Th17 cell-restricted expression pattern. Interestingly, Tuomela et al. identified it as one of the genes upregulated at the early stage of Th17 cell differentiation [38], but in striking contrast to our results, they also observed similar CTSL protein expression in Tregs. This discrepancy might be partially explained by the different cellular models used in the two studies. Tuomela et al. differentiated T cells from the umbilical cord blood of neonates, while we used peripheral blood from adult donors. Neonates have a different immunological status compared to that of adults, which is related to the different gene expression patterns in immune cells [73,74]. Cathepsin L is involved in the regulation of CD4+ T cell selection in the thymus [75] and NKT cell development [76]. Recently, Hou et al. demonstrated that the endogenous cathepsin L inhibitor serpin B and other pharmacological inhibitors suppress Th17 differentiation, indicating that cathepsin L is an important player in promoting Th17 generation [77]. Furthermore, cathepsins including cathepsin L, when overexpressed, can be secreted and play a role in shaping the microenvironment in physiological and pathological processes, e.g., cancer [78] and various inflammatory diseases, including those with autoimmune components [79][80][81]. Our results suggest that this protein is specific for this particular T cell subpopulation, in which it might be involved in the activation of receptors, cytokines, specific signaling proteins (e.g., STAT3) [82,83], and Th17-mediated tissue damage.
Because gene expression is determined not only by tissue-specific transcription factors but also by epigenetic mechanisms, we decided to determine whether we could find an epigenetic mark correlating with high expression of the identified genes in Th17 cells. Analysis of H3K27me2/3 binding, which is associated with gene silencing [47], within the promoters of the APOD, C1QL1 and CTSL genes indicated that in all subtypes of T cells, levels of this mark were low or undetectable (Figure 4). Interestingly, when investigating two epigenetic marks that are enriched in the promoter regions of active genes, namely H3K4me3 and H2BK12ac [84][85][86], we found that H2BK12ac levels associated with the promoters of the APOD, C1QL1 and CTSL genes were the highest in Th17 cells, which suggests a common epigenetic mechanism regulating Th17-restricted genes. However, unexpectedly, the H3K4me3 mark showed a different pattern (highest occupancy on the C1QL1 and CTSL gene promoters in Tregs), and only for the APOD gene did H3K4me3 occupancy correlate with expression in Th17 cells. Some epigenetic modifications may be dominant over others, e.g., H3K27me3 is usually dominant, which means that its presence is associated with repression [48,85]. H3K4me3 alone cannot induce active transcription [48,85], so it appears that the dominant modification in the case of the analyzed genes is H2BK12ac. It has been shown that H3K4me3 at the CNS2 locus of IL17A/F allows binding of the RORα and RORγT transcription factors and induction of the expression of IL17A/F during the differentiation of Th17 cells [87,88]. In response to some stimuli, Tregs can change their phenotype into a Th17-like phenotype [89][90][91]. It appears that epigenetic changes are involved in this process [92,93], and we suggest that H3K4me3 is a part of the histone code that maintains the region in an "open" state in the event that rapid activation associated with cellular plasticity is needed [94][95][96].

Conclusions
In summary, we present evidence that expression of the APOD, C1QL1 and CTSL genes in human CD4+ cells is restricted to Th17 cells and associated with high levels of acetylated histone H2BK12 at the promoter regions of these genes. Furthermore, the expression of these genes and the functions of their encoded proteins might provide a better understanding of the involvement of Th17 cells in the pathogenic processes underlying arteriosclerosis and Th17 cell-driven angiogenesis. In addition, the results of analyses of the expression of these genes and concentrations of their protein products have potential clinical application in the identification of Th17 cell-related inflammation.

Cells 2020, 9, 1611

Conflicts of Interest:
The authors declare no conflict of interest.