Therapeutic Potential of Cancer Vaccine Based on MHC Class I Cryptic Peptides Derived from Non-Coding Regions

Abstract: MHC class I molecules display intracellular peptides on cell surfaces to enable immune surveillance under pathological conditions. The source of MHC class I antigens responsible for cancer protection is not fully understood. Here, we explored the MHC class I peptidome in mouse colon cancer cells using a proteogenomic approach. We showed that cryptic peptides derived from unconventional short open reading frames accounted for part of the MHC class I peptidome. Moreover, cancer growth was significantly prevented in mice immunized with a cocktail of synthesized cryptic peptides. Together, our data showed that the source of cancer antigens was not limited to fragments of consensus proteins. Cryptic antigens were displayed by MHC molecules and mediated anti-cancer effects, suggesting their therapeutic potential for cancer prevention.


Introduction
Cytotoxic CD8+ T cells can eliminate malignant and virally infected cells. They perform immune surveillance by monitoring antigen peptides displayed by MHC class I molecules to detect abnormalities or changes in target cells. Susceptibility to immune checkpoint inhibitors (ICIs), observed in a variety of cancer types with mismatch-repair deficiency, strongly suggests that MHC molecules present mutation-derived neoantigens and that these antigens play a role in cancer rejection mediated by cytotoxic CD8+ T cells [1,2]. However, the majority of neoantigens are private antigens derived from passenger mutations, and the presentation of neoantigens is most likely to occur in a limited number of cancers with a high tumor mutation burden (TMB) [3]. Although immunogenic neoantigens can induce reactive T cell responses and mediate cancer rejection [4], the specificity of current in silico prediction methods for selecting immunogenic neoantigens from an array of somatic mutations is not satisfactory, hindering the therapeutic use of these antigens in clinical settings.
Peptides displayed by MHC class I are generated by the intracellular antigen processing machinery. A variety of protein fragments digested by the proteasomes in the cytosol are transported into the endoplasmic reticulum, selected, customized, and ultimately loaded onto MHC class I molecules [5,6]. Thus, a peptide-MHC class I repertoire represents fragments of most, if not all, proteins generated inside the cell. However, the source of MHC class I peptides is not fully understood, and the discovery of a novel class of cancer antigens may create a breakthrough in immunotherapy. MHC class I peptides potentially arise from unconventional translation products that are separate from the known proteome [7,8]. Proteogenomic analyses have enabled the direct identification of cryptic peptide sequences and demonstrated that MHC class I peptides originate from unconventional open reading frames (ORFs) in human tumor tissues [9-11]. A seminal study reported that MHC molecules presented tumor-associated antigens, derived from noncoding regions, that played protective roles in a mouse hematologic tumor model [12].
Recently, we identified an immunogenic cancer antigen derived from an oncogenic long non-coding RNA (lncRNA) in human colorectal cancer tissues [13]. CD8+ T cell responses against this antigen and its MHC presentation were observed in multiple patients. Here, we hypothesized that some cryptic peptides, if not all, are aberrantly present in cancer cells, serving as cancer antigens. These unexplored antigen types may serve as targets for cancer immunotherapy, regardless of the TMB. In this study, we aimed to identify cryptic peptides in the MHC class I peptide repertoire of mouse colon cancer cells and to comprehensively assess their potential as the basis of a vaccine for preventing solid cancer growth.

Direct Detection of Cryptic Peptides Presented by MHC Class I Molecules in Mouse Colon Cancer Cells
To comprehensively capture the MHC class I peptidome, including cryptic peptides derived from allegedly noncoding ORFs, we employed a proteogenomic approach that combined mass spectrometry (MS) sequencing with transcriptomic data (Figure 1A). Peptide-H-2Dd complexes were immunoprecipitated from the cell lysate of CT26, a mouse colon cancer line, using a specific antibody, and the eluted Dd-bound peptides were subsequently analyzed using MS. The MS spectra were searched against a custom reference database, which contained the computationally translated sequences of any potential ORFs in the mouse transcripts registered in GENCODE (www.gencodegenes.org). Matched sequences with a false discovery rate (FDR) of 0.01 were pooled, and Dd-bound peptide candidates were selected based on multiple factors: peptide length, NetMHC affinity score, and expression of the peptide-encoding gene in CT26 cells. The resulting sequences were searched against the UniProtKB database (www.uniprot.org). Sequences not registered in the proteome database were classified as cryptic MHC peptides. This proteogenomic pipeline yielded 534 unique H-2Dd-bound peptide sequences comprising 514 canonical and 20 cryptic peptides (Figure 1B). Cryptic peptides accounted for 3.75% of the H-2Dd-bound peptidome.
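The database construction and classification steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`translate_orfs`, `classify_peptide`, `cryptic_percentage`) are hypothetical, only the three forward reading frames are scanned, and only the first ATG in each stop-delimited stretch is used, all of which are simplifying assumptions. The ORF rule (start at ATG, end at a stop codon or the transcript end, at least seven amino acids) follows the definition given in the MS database search methods; the UniProtKB lookup is approximated by a plain substring search against a list of reference protein sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orfs(transcript, min_length=7):
    """Return (frame, protein) pairs for ATG-initiated ORFs in the forward frames.

    An ORF runs from an ATG to the next in-frame stop codon or to the end of
    the transcript, and is kept only if it encodes at least `min_length`
    amino acids. Assumption: only the first ATG per stop-delimited stretch.
    """
    seq = transcript.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        # Translate the whole frame; unknown codons become "X".
        protein = "".join(
            CODON_TABLE.get(seq[i:i + 3], "X") for i in range(frame, len(seq) - 2, 3)
        )
        # Stop codons ("*") delimit candidate ORFs within the frame.
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start >= min_length:
                orfs.append((frame, segment[start:]))
    return orfs


def classify_peptide(peptide, canonical_proteins):
    """Label a peptide 'canonical' if it occurs in any reference protein, else 'cryptic'."""
    if any(peptide in protein for protein in canonical_proteins):
        return "canonical"
    return "cryptic"


def cryptic_percentage(n_cryptic, n_total):
    """Percentage of cryptic peptides in a peptidome."""
    return 100 * n_cryptic / n_total


# Toy transcript (made up for illustration): one 7-residue ORF in frame 0.
print(translate_orfs("ATG" + "GGG" * 6 + "TAA"))  # [(0, 'MGGGGGG')]
# Counts reported in the text: 20 cryptic peptides out of 534.
print(round(cryptic_percentage(20, 534), 2))  # 3.75
```

With the counts reported above (20 cryptic peptides among 534 unique sequences), the last line reproduces the stated 3.75%.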
Immuno 2021, 1, FOR PEER REVIEW

Characterization of MHC Class I Cryptic Peptidome
Both canonical and cryptic peptides had known characteristics of H-2Dd-binding peptides: 9-mer sequences were dominant in terms of length, and Gly, Pro, and Leu/Phe/Ile, the consensus H-2Dd-binding anchor motifs, were conserved at positions 2, 3, and 9, respectively (Figure 2A,B). There was also a trend shared between canonical and cryptic peptides: both types of peptides originated from abundantly expressed genes (Figure 2C). The majority of the source genes encoding the cryptic peptides were annotated as protein coding in GENCODE, as they contained at least one transcript harboring a consensus ORF for protein translation. However, the ORFs encoding cryptic peptides were found in different reading frames. Cryptic peptides were distinct from canonical peptides in terms of ORF lengths; the average ORF length of cryptic peptides was significantly shorter than that of canonical peptides (Figure 2D).
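The anchor motif described above can be expressed as a simple check. The following Python sketch is illustrative only: the function name `matches_dd_anchor_motif` is hypothetical, the example sequences are made up and are not peptides from this study, and restricting the check to 9-mers is a simplifying assumption (9-mers were dominant, not exclusive).

```python
# Consensus H-2Dd anchor residues as stated in the text (1-based positions):
# Gly at position 2, Pro at position 3, and Leu/Phe/Ile at position 9.
DD_ANCHORS = {2: {"G"}, 3: {"P"}, 9: {"L", "F", "I"}}


def matches_dd_anchor_motif(peptide):
    """Return True if a 9-mer carries all three consensus anchor residues."""
    if len(peptide) != 9:
        # Assumption: only 9-mers are evaluated in this sketch.
        return False
    return all(
        peptide[position - 1] in allowed for position, allowed in DD_ANCHORS.items()
    )


# Made-up sequences for illustration only.
print(matches_dd_anchor_motif("AGPARAAAL"))  # True
print(matches_dd_anchor_motif("AAPARAAAL"))  # False (no Gly at position 2)
```

A binary check like this is much cruder than the NetMHC affinity score used in the pipeline; it only shows what "conserved at positions 2, 3, and 9" means for an individual sequence.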
Figure 1. (A) [...] UniProtKB database and classified as canonical or cryptic MHC class I peptides. (B) Pie chart showing the proportion of cryptic peptides in the H-2Dd peptidome of CT26 cells (n = 534).


Vaccination with Cryptic MHC Class I Peptides Prevented Cancer Growth In Vivo
Next, we assessed the therapeutic potential of the identified cryptic peptides for preventing tumor growth in vivo. Although it is possible to predict the cancer specificity of each cryptic peptide to optimize the candidate peptide pools, we chose to test all 20 identified peptides (19 non-overlapping peptides) to avoid the risk of false-negative exclusion of immunogenic peptides. The peptides were synthesized and divided into seven groups, each containing 1-3 cryptic peptides (Table 1). The synthetic peptide cocktails were subcutaneously injected into BALB/c mice along with adjuvants, followed by CT26 cancer cell injection, and tumor growth was periodically measured (Figure 3A). In six of the seven groups, the sizes of the challenged tumors were not statistically different between the vaccinated groups and the control group injected with the adjuvant alone (Figures 3B and S1A,B). Interestingly, tumor growth was significantly prevented in mice injected with Group 6 peptides. This group contained three cryptic peptides derived from the Ndrg1, Mrps18b, and Lonp1 genes. Similar to other cryptic peptides, their epitope sequences originated from shorter ORFs, independent of the consensus protein-coding ORFs (Figure 3C).

Figure 3. [...] Here, three cryptic peptides in Group 6 were each administered as a vaccine independently. Data represent the mean ± SD (n = 5), and p-values were calculated using a two-tailed t-test.
We also vaccinated mice with each of the three peptides independently and assessed tumor growth. Statistically significant differences were not observed, although the average tumor size was reduced following vaccination with the Mrps18b and Lonp1 peptides (Figures 3D and S1B). This result suggested that the anti-cancer effects induced by single peptide-based vaccines were insufficient, whereas induced T cells with different antigen specificities synergized when the peptides were administered simultaneously, possibly indicating the advantage of cocktail vaccination. Taken together, these results demonstrated the potential of cryptic peptides presented by MHC class I molecules to confer cancer protection.

Discussion
Cancer vaccines are a long sought-after therapeutic approach that induces specific T cell responses to prevent tumor growth. In contrast to ICI therapy, which can be administered without knowledge of T cell antigen specificity, vaccination requires information about cancer antigens, ideally epitope sequences specifically presented by targeted tumor cells [14]. Recent technological advances in MS have enabled the capture of the MHC class I peptidomes of cancer cells, including mutation-derived neoantigens and cryptic antigens derived from allegedly noncoding regions. In this study, we explored the MHC class I peptidome of mouse colon cancer cells and found that cryptic peptides derived from noncoding regions accounted for approximately 4% of the peptidome. Moreover, the growth of colon cancer was prevented in mice vaccinated with a peptide cocktail consisting of cryptic peptides derived from the Ndrg1, Mrps18b, and Lonp1 genes. The anti-cancer effects mediated by these cryptic peptides were unexpected, since cryptic peptide presentation itself is likely not always a cancer-specific event and is also observed in normal tissues [13]. Our results support the previous finding that noncoding regions are a rich source of tumor-rejection antigens, even in solid cancers [12].
The intrinsic factors conferring immunogenicity to Group 6 cryptic peptides remain unclear. The induction of anti-cancer effects by vaccination may imply that cryptic translation events in cancer cells are upregulated compared with those in normal host tissues [15]. At the least, these three genes were abundantly expressed in CT26 cells, based on transcriptome data. In our experiments, anti-cancer effects were observed when the vaccine containing all three peptides was administered. The effects were compromised when the three peptides were administered independently. It is well known that the ratio of tumor-reactive T cells to cancer cells (E/T) influences anti-tumor effects. Targeting multiple antigens may have induced a larger total number of tumor-reactive T cells, which led to tumor rejection. According to this scenario, further optimization of T cell induction (e.g., the amount, dose, or route of peptide vaccination) may recover the anti-cancer effects of single peptides. Alternatively, the simultaneous induction of three different TCRs with different antigen specificities may have prevented escape from T cells due to antigen loss [16].
The screening for cryptic antigens in this study was limited to H-2Dd ligands, potentially excluding immunogenic peptides bound to MHC molecules of other classes. We note that none of the three peptide sequences have been reported in previous studies using the CT26 line. This is presumably attributable to different sensitivities in each proteogenomic pipeline; for instance, we focused on H-2Dd ligands and implemented direct immunoprecipitation using a specific antibody. Proteogenomic approaches have enabled analysis not only in cell lines but also in patient tissues; however, these approaches vary methodologically among most laboratories in terms of the biochemical isolation of MHC-bound peptides, construction of customized reference databases, and types of analysis software, requiring the laborious validation of identified sequences. The establishment of an efficient, standard method based on sample type should be further considered.
The discovery of cancer antigens that originate from noncoding regions and mediate tumor rejection in mice expands the pool of target antigens used for cancer immunotherapy in clinical settings. In contrast to mutation-derived neoantigens, cryptic peptides are unlikely to be unique to individual patients and may be related to tumorigenesis [13]. In summary, although the underlying mechanisms of tumor-oriented transcription and translation remain unclear, our data suggest the possibility of using cryptic peptides as new therapeutic targets.
The sample was loaded onto a nano-flow liquid chromatography system (Easy-nLC 1000, Thermo) coupled online to an Orbitrap mass spectrometer equipped with a nanospray ion source (Q Exactive Plus, Thermo). Nano-flow LC separation was performed with a linear gradient ranging from 3% to 30% buffer B (100% ACN and 0.1% FA) at a flow rate of 300 nL/min for 80 min using a 75 µm × 20 cm capillary column with a particle size of 3 µm (NTCC-360, Nikkyo Technos). For MS, survey scan spectra were acquired at a resolution of 70,000 at 200 m/z with an AGC target value of 3e6 ions and a maximum IT of 100 ms, ranging from 350 to 2000 m/z with charge states between 1+ and 4+. A data-dependent top 10 method was employed. The MS/MS resolution was 17,500 at 200 m/z with an AGC target value of 1e5 ions and a maximum IT of 120 ms.

MS Database Search for Cryptic Peptides
For the MS database search for cryptic peptides and canonical MHC ligands, we built a customized reference FASTA database, which contained every translation product derived from all possible ORFs found in the transcripts registered in the GENCODE database (v. M22). Possible ORFs were defined as sequences that began from the ATG codon and ended with stop codons or reached the transcript end, yielding polypeptides that were at least seven amino acids in length. Experimental MS/MS data were searched against the database using the Sequest HT and Percolator algorithms on the Proteome Discoverer 2.3 platform (Thermo). For database searching, the tolerances for precursor and fragment ions were set at 10 ppm and 0.02 Da, respectively. The oxidation of methionine (+15.995 Da) was selected as a dynamic modification. No specific enzyme was selected. For the subsequent analysis using Percolator (Thermo), concatenated target-decoy selection was validated based on the q-values. Annotated peptide spectrum matches (PSMs) with the highest score

Informed Consent Statement: Not applicable.

Data Availability Statement: MS raw data and FASTA files have been deposited to the ProteomeXchange Consortium via the jPOSTrepo partner repository (https://repository.jpostdb.org) with the dataset identifier PXD029094.

Conflicts of Interest: T.K. received grant support from Astellas. T.T. received grant support from Ono Pharmaceutical and Sumitomo Dainippon Pharma.