Revisiting the Non-Coding Nature of Pospiviroids

Viroids are small, circular, highly structured pathogens that infect a broad range of plants, causing economic losses. Since their discovery in the 1970s, they have been considered non-coding pathogens. In the last few years, the discovery of other RNA entities of similar size and structure that were shown to be translated (e.g., circRNAs, precursors of miRNA, RNA satellites), as well as studies showing that some viroids are located in ribosomes, have reignited the idea that viroids may be translated. In this study, we used advanced bioinformatic analysis, in vitro experiments and LC-MS/MS to search for small viroid peptides of PSTVd. Our results suggest that, under our experimental conditions, even though the circular form of PSTVd is found in ribosomes, no viroid-derived peptides were identified. This indicates that the presence of PSTVd in ribosomes is most probably not related to peptide production but rather to another unknown function that requires further study.


Introduction
The 'central dogma' of molecular biology explains the flow of genetic information and consists of the process of transcribing DNA into RNA, which is then translated into proteins. Translation is usually divided into four stages: initiation, elongation, termination and ribosome recycling [1]. The initiation step is the most complex in terms of the proteins involved. During initiation, the 40S ribosomal subunit binds to the mRNA and scans until an initiation codon (AUG) is found. In the last few years, several alternative initiation starting codons have been described [2]. Following initiation, the 60S ribosomal subunit joins to form the 80S ribosome whereupon the elongation step starts, translating the information encoded in three consecutive nucleotides into an amino acid (aa), creating a peptide, and then a protein. Recognition of the stop codon drives the termination process and the release of the protein. Finally, ribosome recycling occurs, where the messenger RNA (mRNA) is released and the 80S ribosome is separated into its 40S and 60S components [1].
For many years, it was believed that mRNAs were the only RNAs produced by DNA that can be translated. However, only around 4% of the RNA transcribed is actual mRNA [3]. The remainder corresponds to different classes of non-coding RNAs [4]. In 1979, a peculiar endogenous circular RNA (circRNA) was discovered in HeLa cells [5]. At viroids cannot produce any peptides, thus suggesting that viroid localization in proximity of ribosomes is due to reasons other than translation.

Bioinformatic Analysis
Nucleotide sequences of all available strains for 30 viroid species from the Pospiviroidae family were downloaded from the NCBI database in FASTA format. Sequences identified as duplicates were excluded from the analysis (Table S1). All the sequences were then analyzed for the existence of potential ORFs according to the following steps.
Open Reading Frame (ORF) detection: ORFs in circular genomes may originate at any point in the sequence and run the length of the genome or even exceed it. To identify candidate ORFs in the circular viroid genomes, we used artificial contigs composed of two copies of the same genome sequence joined together. All AUG and non-AUG starting codons (according to [2]) were identified in all three reading frames, and sequence strings that started with the detected starting codons and stopped at the end of the remaining sequence were obtained as ORF-containing candidates (putative ORFs). Each such putative ORF was then trimmed to contain contiguous subsequences between in-frame start and stop codons, which were retained for further analysis. In the case of multiple in-frame overlapping ORFs terminating at the same stop codon, only the longest ORF was kept in the final list of candidates.
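The ORF detection step can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name, the restriction of start positions to the first genome copy, and the default start-codon set (AUG only) are assumptions made for illustration, and non-AUG start codons from [2] would have to be supplied by the user through the start_codons argument.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_circular_orfs(genome, start_codons=("AUG",)):
    """Return (start, orf) pairs for a circular genome; start is 0-based.

    The genome is doubled so that ORFs crossing the 3'/5' junction are found.
    For ORFs sharing the same stop codon on the circle, only the longest is kept.
    """
    rna = genome.upper().replace("T", "U")
    n = len(rna)
    doubled = rna + rna
    best = {}  # stop position on the circle -> (start, orf sequence)
    for start in range(n):  # assumption: starts are taken from the first copy only
        if doubled[start:start + 3] not in start_codons:
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(doubled) - 2, 3):
            if doubled[pos:pos + 3] in STOP_CODONS:
                orf = doubled[start:pos + 3]
                key = pos % n
                if key not in best or len(orf) > len(best[key][1]):
                    best[key] = (start, orf)
                break
        # Candidates without an in-frame stop codon are discarded.
    return sorted(best.values())
```

In the toy sequence AAUAACCCCCAUGG, the only AUG lies three nucleotides before the end of the linear sequence, so a linear scan finds no complete ORF, whereas the doubled sequence yields an ORF that crosses the junction.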
Translation of ORFs: Each sequence from the final set of putative ORFs was in silico translated into a protein, based on the genetic code. For each viroid species, basic analyses were carried out, including the number of different peptides per species, mean peptide length, standard deviation of peptide length, mean molecular weight of peptides and standard deviation of peptide molecular weight (Table 1). BLASTp analysis was carried out to search for significant sequence similarity (p value < 0.05) with previously characterized proteins.
ORF emergence tendencies: To investigate whether viroid genomes show a greater ORF frequency than expected by chance, the same procedure was subsequently applied to randomly scrambled genome sequences with an identical nucleotide composition. Apart from the actual number of ORFs per genome, the localization of the ORFs across the five characteristic genome domains (the terminal left domain, the pathogenicity domain, the central domain, the variable domain and the terminal right domain) was also checked for enrichment in comparison with the scrambled genomes. To acquire the information about the characteristic domains, BED files with the coordinates of the start of each ORF and the coordinates of the domains, whenever available, were created. The intersect tool from the bedtools suite [31] was used to find the overlaps.
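A minimal Python sketch of the comparison with scrambled genomes is given below. The function names, the number of shuffles, the fixed random seed and the AUG-only start set are illustrative assumptions, not the settings used in this study; the ORF counter is a compact stand-in for the detection procedure described above.

```python
import random

STOPS = {"UAA", "UAG", "UGA"}


def count_circular_orfs(rna, starts=("AUG",)):
    """Count ORFs in a circular genome, one per distinct stop codon."""
    n = len(rna)
    doubled = rna + rna
    stops_hit = set()
    for start in range(n):
        if doubled[start:start + 3] not in starts:
            continue
        for pos in range(start + 3, len(doubled) - 2, 3):
            if doubled[pos:pos + 3] in STOPS:
                stops_hit.add(pos % n)
                break
    return len(stops_hit)


def shuffled(rna, rng):
    """Scramble a sequence while keeping its nucleotide composition."""
    bases = list(rna)
    rng.shuffle(bases)
    return "".join(bases)


def compare_with_scrambled(rna, n_shuffles=100, seed=0):
    """Return the observed ORF count and the mean count over scrambled copies."""
    rng = random.Random(seed)
    observed = count_circular_orfs(rna)
    null = [count_circular_orfs(shuffled(rna, rng)) for _ in range(n_shuffles)]
    return observed, sum(null) / len(null)
```

An observed count above the scrambled mean corresponds to a point above the diagonal in a real-versus-scrambled scatterplot.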
Conservation of ORFs: The conservation rate of the ORFs identified in Pospiviroidae genomes, inter- and intra-specifically, was obtained with the use of the MAFFT alignment algorithm (Multiple Alignment using Fast Fourier Transform) [32]. The percentage of occurrence of a nucleotide at each alignment position was calculated, and extended conserved regions were defined as areas having at least 40% similarity among the genomes included in the alignment. The occurrence rate of ORFs among the different strains of each species was also calculated as the number of strains in which a specific ORF is predicted divided by the total number of strains of the species.
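The two conservation measures can be sketched in Python as follows, starting from an alignment that has already been computed (for example with MAFFT). The function names, the treatment of gaps (counted in the denominator but never as the majority residue) and the minimum region length of 3 columns are assumptions made for illustration only.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most frequent nucleotide per column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(base for base in column if base != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


def conserved_regions(scores, threshold=0.4, min_len=3):
    """Half-open (start, end) runs of columns scoring at least the threshold."""
    regions, start = [], None
    for i, score in enumerate(scores + [-1.0]):  # sentinel closes the last run
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    return regions


def occurrence_rate(orfs_by_strain, orf):
    """Number of strains in which the ORF is predicted over all strains."""
    hits = sum(orf in orfs for orfs in orfs_by_strain.values())
    return hits / len(orfs_by_strain)
```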
Detection of the Kozak motif: We generated a position-specific scoring matrix (PSSM) by comparing the matrix for the Kozak motif, based on [33], with a background matrix created from 5074 random sequences (a number equal to the number of viroid sequences used in the study). We then conducted a motif search using a custom R script that scans the target sequence with the PSSM and returns instances of matrix similarity according to an arbitrary threshold of 0.65.
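The general idea of a PSSM scan can be sketched as follows. This Python example is not the custom R script used in this study: the log-odds construction, the pseudocount, and the definition of matrix similarity as the raw score rescaled between the minimum and maximum attainable scores are assumptions, and the motif counts must be supplied by the user (the Kozak matrix from [33] is not reproduced here).

```python
import math

BASES = "ACGU"


def build_pssm(counts, background, pseudocount=1.0):
    """Log-odds PSSM from per-position base counts and background frequencies."""
    pssm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + pseudocount * len(BASES)
        pssm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background[b])
            for b in BASES
        })
    return pssm


def scan(sequence, pssm, threshold=0.65):
    """Return (position, window, similarity) for windows at or above threshold.

    Similarity is the raw score rescaled to [0, 1] between the lowest and
    highest scores the matrix can produce (an assumed definition).
    """
    width = len(pssm)
    lowest = sum(min(col.values()) for col in pssm)
    highest = sum(max(col.values()) for col in pssm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(base not in BASES for base in window):
            continue  # skip windows with ambiguous characters
        raw = sum(pssm[j][base] for j, base in enumerate(window))
        similarity = (raw - lowest) / (highest - lowest)
        if similarity >= threshold:
            hits.append((i, window, round(similarity, 3)))
    return hits
```

With a toy four-column matrix whose counts strongly favor AUGG and a uniform background, scanning CCAUGGCC returns only the window starting at position 2.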

Plants and Infections
Tomato (Solanum lycopersicum cv Rutgers; Livingston Seed Co, Columbus, OH, USA) and Nicotiana benthamiana plants were infected with either PSTVd RG1 (GenBank Acc. No. U23058) or PSTVd NB (GenBank Acc. No. AJ634596.1). Infections were performed either mechanically or via agroinfiltration. For mechanical infections, the dimeric construct of PSTVd RG1 was used to synthesize infectious dimeric transcripts as described previously [34]. PSTVd RNA transcript (1 µg) was inoculated into both plant types. All plants were grown in a growth chamber at a temperature of 25 °C with 16 h of light and 8 h of darkness [35]. For agroinfiltration experiments, N. benthamiana plants were agroinfiltrated with an A. tumefaciens GV3101 strain carrying an infectious PSTVd NB dimer, kindly provided by Dr. De Alba and Dr. Flores (Institute for Cellular and Molecular Plant Biology-IBMCP), as described previously [36]. Plants were grown in a glasshouse under ambient temperature and light conditions.

Total Ribosome Isolation, Polysome Fractionation and RNA Preparation
Total ribosomes and polysomes were prepared as previously described [37] with modifications. Actively growing leaf samples (25 g) were frozen in liquid nitrogen and macerated to a fine powder. Two volumes of cold plant extraction buffer (50 mM Tris-HCl (pH 9.0) (Sigma, Burlington, VT, USA), 30 mM MgCl2 (Fischer chemicals, Chicago, IL, USA), 400 mM KCl (Fischer chemicals, Chicago, IL, USA), 17% (w/w) sucrose (Fischer chemicals, Chicago, IL, USA)) were added, and the extracts were clarified by passage through DEPC-treated cheesecloth. The resulting extracts were centrifuged at 3000 rpm for 7 min at 4 °C. One-tenth volume of 20% Triton X-100 was added and samples were centrifuged at 12,000 rpm for 20 min. Clear supernatants were then layered (1:1) on a 60% sucrose cushion (20 mM Tris-HCl (pH 7.6), 5 mM MgCl2, 510 mM NH4Cl, 60% (w/w) sucrose) and centrifuged at 28,000 rpm for 19 h in a SW28 rotor in a Beckman Coulter ultracentrifuge (Beckman Coulter, Indianapolis, IN, USA). The resulting pellets were carefully rinsed with resuspension buffer (50 mM KCl, 20 mM Tris-HCl (pH 7.6), 5 mM MgCl2) and resuspended in 200 µL of the same buffer. The resuspended total ribosomes were fractionated on a 5-50% sucrose gradient by centrifugation at 16,000 rpm for 13 h in a SW28 rotor. The 40S, 60S and 80S ribosomes and the polyribosomes were purified, and the RNAs were extracted as described previously [38]. Briefly, the RNA was precipitated with 5.5 M guanidine HCl (Sigma, Burlington, VT, USA) and ethanol (Commercial alcohols, Toronto, ON, Canada), followed by acidic phenol:chloroform extraction and re-extraction of the supernatant with an equal volume of chloroform. Purified RNAs were treated with DNase I according to the manufacturer's instructions (Promega, Madison, WI, USA). RNA integrity was evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).

High Throughput Sequencing for Detection of Quasi-Species
The results of small viroid RNA experiments have been described elsewhere [39]. PSTVd-sRNA sequences of PSTVd RG1-infected tomato plants (GEO Acc. No. GSM1717894) were analyzed for the presence of potential start codons. Initially, 21-nt-long sRNAs with a match score of 1 and mismatch cost of 2 to PSTVd RG1 were segregated using CLC Genomic Workbench version 4.6 software (https://www.qiagenbioinformatics.com/products/clcgenomics-workbench/version-11-available/ accessed on 8 December 2021) and were then manually re-examined for the presence of AUG codons.
HTS analysis for PSTVd genomes was performed as follows: PSTVd NB agroinfiltrated plants were collected at 3 weeks post infection (wpi) and RNA was extracted as described previously [40]. Following DNase I (Roche Diagnostics, Basel, Switzerland) treatment and extraction with phenol/chloroform, the integrity of RNAs was assessed using the Agilent 2100 Bioanalyzer. Library construction used the Ion Total RNA-seq Kit (Life technologies-Merk group, Darmstadt, Germany), and sequencing was performed using the Ion Torrent Proton platform. The quality of the raw reads (a total of 21,928,628) before and after the various cleaning steps was assessed with FastQC [36]. Quality and adapter trimming was performed with fastp [41] using the following settings: -q 20 --length_required 21 --cut_tail --cut_front --cut_mean_quality 20. Cleaned fastq files were aligned to the PSTVd NB genome (AJ634596.1) using BBMap [42] with default settings. Aligned reads (54,426) were extracted with samtools view [43]. Nucleotide variants were called from the BAM files with quasitools [44], run as quasitools call ntvar. The resulting VCF file was then used to extract alternative start codons.
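The last step, identifying start codons created by single-nucleotide variants, can be sketched in Python as follows. The input format (1-based position, alternative base, frequency) is a simplified stand-in for parsed variant records and not the quasitools output format; the function name and the circular treatment of the genome are assumptions made for illustration.

```python
def new_start_codons(reference, variants, start_codons=("AUG",)):
    """Find start codons created by single-nucleotide variants.

    reference: genome sequence, treated as circular.
    variants: iterable of (position, alt_base, frequency), position 1-based.
    Returns (codon_start, variant_position, alt_base, frequency) tuples,
    with 1-based coordinates.
    """
    ref = reference.upper().replace("T", "U")
    n = len(ref)
    found = []
    for pos, alt, freq in variants:
        i = pos - 1
        alt = alt.upper().replace("T", "U")
        mutated = ref[:i] + alt + ref[i + 1:]
        # The variant can be the first, second or third base of a codon.
        for offset in range(3):
            s = (i - offset) % n
            before = "".join(ref[(s + k) % n] for k in range(3))
            after = "".join(mutated[(s + k) % n] for k in range(3))
            if after in start_codons and before not in start_codons:
                found.append((s + 1, pos, alt, freq))
    return found
```

For example, a C-to-U change at position 4 of the toy sequence CCACGCC converts ACG into AUG, with the new codon starting at position 3.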

cDNA Synthesis, RT-PCR, RT-qPCR and Northern Blot for PSTVd Detection
Following RNA extraction, cDNA synthesis was performed using 250 ng of RNA and SuperScript III reverse transcriptase (Invitrogen, Carlsbad, CA, USA). PCR was carried out using Q5 DNA polymerase according to the manufacturer's instructions (New England Biolabs, Ipswich, MA, USA). Primers were either designed for this study or published before (Table S2) [45,46]. PCR-produced fragments were cleaned and cloned in pGEM-T vector (Promega, Madison, WI, USA) using the manufacturer's instructions, followed by sequencing. The resulting sequences were assembled and aligned using the CLC Free Workbench (https://digitalinsights.qiagen.com/products-overview/discoveryinsights-portfolio/analysis-and-visualization/qiagen-clc-main-workbench/ accessed on 8 December 2021) and were then manually analyzed.
For the evaluation of the PSTVd titer in both the total RNA extract and the polysome fraction, cDNA was prepared by reverse transcribing 500 ng RNA (SuperScript III reverse transcriptase-Invitrogen, Carlsbad, CA, USA) in the presence of random primers. Three housekeeping genes, specifically the 5.8S, 18S, and 25S rRNAs, were used for normalization, and three biological and three technical replicates were used. The qBASE framework was used for the analysis [47].
The detection of PSTVd by northern blotting was carried out as described previously [34,36].

In Vitro Translation and Immunoblot Assays
To perform in vitro translation, both the Wheat Germ Extract kit (Promega, Madison, WI, USA) and the FluoroTect™ GreenLys Labeling System (Promega, Madison, WI, USA) were used according to the manufacturer's instructions with the following modifications. Briefly, the reaction was performed in 25 µL containing 5 µg viroid RNA (specifically, (+) dimeric, (−) dimeric, (+) monomeric and (−) monomeric) and 2 µL of FluoroTect™. The reactions were carried out at 25 °C for 60 min, followed by an incubation at 30 °C for 60 min. The reactions were then terminated by the addition of RNase A (Promega, Madison, WI, USA). For PSTVd-derived translational analysis, 5 µL of the in vitro translation reactions were separated on a 12% SDS-PAGE gel and were then transferred to polyvinylidene difluoride membranes (Bio-Rad Laboratories, CA, USA). Anti-BODIPY™ FL rabbit IgG (Thermo Fisher Scientific Inc., Waltham, MA, USA) at a dilution of 1:500 was used to detect the translation products according to the manufacturer's instructions (Invitrogen, Carlsbad, CA, USA), followed by incubation with a 1:10,000 dilution of the IRDye 800CW donkey anti-rabbit-IgG polyclonal antibody (LI-COR). The proteins were subsequently visualized using a LI-COR scanner (LI-COR, Lincoln, NE, USA) at 700 nm.

Proteomic Analysis
Sp3-mediated protein digestion: N. benthamiana plants were agroinfected with PSTVd NB and upper leaves were collected 4 wpi. Leaf tissue was pooled and homogenized in 4% SDS, 0.1 M DTT, 0.1 M Tris pH 8 lysis buffer. Three biological replicates of each group (either non-inoculated or 4 wpi) were processed using the sensitive sp3 protocol [48]. Additionally, the sp3 protocol was used in parallel for the digestion of the 3 kDa ultra-filtrate (Sartorius AG, Göttingen, Germany) of the leaf extracts in order to assess the lower molecular weight protein portion. The cysteine residues were reduced in 100 mM DTT and alkylated in 100 mM iodoacetamide (Acros Organics, Thermo Fisher Scientific Inc., Waltham, MA, USA). Twenty micrograms of beads (1:1 mixture of hydrophilic and hydrophobic SeraMag carboxylate-modified beads; Cytiva, Marlborough, MA, USA, formerly GE Life Sciences) were added to each sample in 50% ethanol. Protein clean-up was performed on a magnetic rack. The beads were washed twice with 80% ethanol and once with 100% acetonitrile (Fisher Chemical, Thermo Fisher Scientific Inc., Waltham, MA, USA). The captured proteins were digested overnight at 37 °C under vigorous shaking (Thermomixer, Thermo Fisher Scientific Inc., Waltham, MA, USA) with 0.5 µg Trypsin/LysC (mixture MS grade, Promega, Madison, WI, USA) prepared in 25 mM ammonium bicarbonate. The next day, the supernatants were collected, dried using a vacuum centrifuge (Savant, Thermo Fisher Scientific Inc., Waltham, MA, USA), solubilized in mobile phase A and sonicated, and the peptide concentration was determined by measuring the absorbance at 280 nm.
LC-MS/MS: Nano-liquid chromatography of the resulting tryptic peptide mixture was carried out using an Ultimate3000 RSLC system configured with an Acclaim pepmap C18 trap column (Thermo Fisher Scientific Inc., Waltham, MA, USA) and a 25 cm-long pepsep nano column (pepsep.com, Marslev, Denmark). A total of 500 ng of peptides was loaded on the precolumn at a flow rate of 6 µL/min for 4 min with 0.1% formic acid in water. The peptide separation was achieved using 0.1% (v/v) formic acid in water (mobile phase A) and 0.1% (v/v) formic acid in acetonitrile (mobile phase B). The flow rate was set to 350 nL/min in the first 12 min of the gradient and 250 nL/min in the main gradient. The gradient was linear from 8% to 28% phase B in 35 min, 28% to 36% in 5 min and 36% to 95% in 0.5 min, staying isocratic for 5 min and then equilibrating at 8% for 10 min at 350 nL/min.
The data acquisition was performed in positive mode using a Q Exactive HF-X Orbitrap mass spectrometer (Thermo Fisher Scientific Inc., Waltham, MA, USA). MS data were acquired in a data-dependent strategy, selecting up to the top 12 precursors based on precursor abundance in the survey scan (m/z 350-1500). The resolution of the survey scan was 120,000 (at m/z 200) with a target value of 3 × 10^6 ions and a maximum injection time of 100 ms. HCD MS/MS spectra were acquired with a target value of 1 × 10^5 and a resolution of 15,000 (at m/z 200) using an NCE of 28. The maximum injection time for MS/MS was 22 ms. Dynamic exclusion was enabled for 20 s after one MS/MS spectrum acquisition. The isolation window for MS/MS fragmentation was set to 1.2 m/z. Three technical replicates were acquired.
Data Analysis: The generated raw files were searched using the MaxQuant Software (1.6.14.0) (MaxPlanck, Germany) [51] with the Andromeda search engine, against the predicted proteome based on the N. benthamiana Genome v1.0.1 (Niben v1.0.1, containing 56701 proteins, 2015), together with the predicted PSTVd ORFs and the MaxQuant common contaminant database. The peptide mass tolerance was set to 20 ppm in the first (recalibration) search and 4.5 ppm in the main search. Up to 2 missed cleavages were allowed. Oxidation (M), acetylation (protein N-term) and deamidation (NQ) were set as variable modifications, and carbamidomethylation (Cys) as a fixed modification. The matching between runs and second peptide options were activated. Protein, peptide and "site" identifications were validated at an FDR of 1% using a reversed database. The above data analysis was repeated using an "unspecific" search mode against the predicted PSTVd ORFs, removing the constraint for tryptic generated peptides.
Data visualization: The MaxQuant search engine quantitative (LFQ) results were analyzed and visualized using the Perseus computational framework (version 1.6.10.43) (MaxPlanck, Germany) [52]. The LFQ values were log2 transformed and the proteins were filtered to remove potential contaminants, reverse hits and those only identified by site. The biological and technical replicates were grouped into non-inoculated or PSTVd-infected plants and the two groups were filtered based on at least 70% valid values present in at least one group. Remaining empty values were imputed based on a normal distribution. The groups were compared using a Student's t-test with permutation-based FDR calculation (s0: 0.1, FDR < 0.05). The results after statistical analysis were visualized in a volcano plot based on the difference between the two groups expressed as log2(x) versus their statistical significance expressed as −Log10 (p value).
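The transformation and filtering steps can be illustrated with a short Python sketch. It is not the Perseus workflow itself: the function names are hypothetical, missing values are represented as None, and the imputation and permutation-based FDR steps are deliberately omitted, so the difference is computed from observed values only.

```python
import math


def log2_or_none(value):
    """log2-transform an LFQ intensity; zero or missing becomes None."""
    return math.log2(value) if value is not None and value > 0 else None


def passes_valid_filter(groups, min_fraction=0.7):
    """True if at least one group has the required fraction of valid values."""
    return any(
        sum(v is not None for v in group) / len(group) >= min_fraction
        for group in groups
    )


def log2_difference(group_a, group_b):
    """Difference of group means on the log2 scale, ignoring missing values."""
    valid_a = [v for v in group_a if v is not None]
    valid_b = [v for v in group_b if v is not None]
    return sum(valid_a) / len(valid_a) - sum(valid_b) / len(valid_b)
```

The log2 difference is the quantity plotted on the x-axis of the volcano plot; the y-axis would additionally require the p value of the group comparison.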

Analysis of the Presence of ORFs in Viroid Sequences
To identify possible ORFs in viroid sequences, we used the nucleotide sequence of 30 different viroid species from the Pospiviroidae family, including all available isolates at the time of this study (Table S1). Firstly, we duplicated the sequence of each viroid to avoid 'premature' termination of a predicted ORF at the 3'/5' junction of the genomic sequences, since viroids are circular. Secondly, we used both AUG as well as non-AUG start codons, based on the work of Kearse and Wilusz [2]. Finally, in the case of overlapping ORFs, we decided to keep only the longer ORF. With these rules, we showed that all viroids are predicted to produce small peptides with a mean size of peptides for each species ranging from 3 to 15 kDa (Table 1). It is important to note that differences in the observed number of small peptides for each viroid species can be primarily attributed to the different number of isolates available for the analysis (Supplementary Table S1). All predicted peptides were then analyzed using BLASTp against the complete non-redundant NCBI protein database (nr) to test for similarity with known proteins, but none were identified.
Since the presence of an optimal Kozak sequence can enhance the production of a peptide [54], we examined whether the predicted ORFs contain an optimal Kozak sequence associated with the identified start codons. For this purpose, we used the motif described in Joshi et al. [33]. As shown in Table 2, 17 PSTVd isolates present a Kozak motif, whereas CEVd, CSVd and CLVd present the same motif but only in a very small number of the tested isolates. This suggests that even though starting codons are present in viroids, only a few of them are in a highly favorable context for translation. We then assessed the likelihood of the existence of ORFs in relation to their position throughout the genome. For this, we used two different approaches. Firstly, we calculated the degree of conservation of the various ORFs between the different isolates of the same species as the proportion of ORFs identified in the same position across genomes of the same species (see Methods). Histograms of mean conservation for the isolates are shown in Figure 1A, with a conservation score of 1 corresponding to 100% sequence identity between isolates. For most of the viroids, including PSTVd, the score is close to 1, indicating that the ORFs identified in isolates of PSTVd are highly conserved. However, this feature is not shared by all species, since some viroids, such as IRVd and CBCVd2, lack ORF sequence conservation. PVd was not included in the analysis since only one isolate of this viroid was available to us. The second approach was to assess the possibility of ORF existence in artificially scrambled viroid genome sequences. The results are presented in Figure 1B and Supplementary Figure S2, as scatterplots of numbers of observed ORFs in real vs. scrambled genome sequences.
Dots above the red diagonal line of the graph correspond to a higher frequency of ORFs in the real sequences, whereas dots below the red line correspond to a higher frequency of ORFs in the scrambled genome sequences. Some of the tested viroid species present more ORFs in their real sequence than in the scrambled sequences (e.g., PSTVd, AGVd and HLVd), suggesting that the identified ORFs are somewhat constrained by the genomic sequence structure. Again, this is not a general feature, since viroids such as CEVd, CLVd and GYSVd show more ORFs in the scrambled genome, suggesting that not all viroids have the same tendency in terms of predicted ORFs, and that even though they are in the same family, viroids may work in different ways to produce infection (Figure S2). We also explored the possibility of ORF "hotspots", or positions in the genome with an increased likelihood to give rise to ORFs. By projecting each identified ORF coordinate on its genome of origin, we created aggregate plots of "ORF-density" over the length of the genome for each species. We then compared the density plot with the one obtained from scrambled genomes. Results are presented in Figure 1C and Supplementary Figure S3. In PSTVd isolates, a hotspot is observed between nucleotide positions 45 and 62, which is clearly not observed when the genome is shuffled, suggesting that this region could be important for the production of peptides. Hotspots were also observed in all viroids; however, their number and distribution vary depending on the viroid species (Figure S3).
Last, we performed a structural analysis of the viroid sequences with regard to the presence of these ORFs. If a ribosome is to attach to the viroid sequence, this is more likely to happen in a loop region than in a self-complementary, base-paired region. For this, we calculated the presence of ORFs in loops, bulges and hairpins, using published structures of viroids [18,19,55-59]. Although not all viroids have a solved secondary structure, most of the tested viroids have starting codons in loops, suggesting that a ribosome could attach to this region to initiate translation (Table S3).
Taken together, the above results indicate that there are ORFs present in all tested viroids, even though very few are associated with a favorable Kozak sequence. Nevertheless, there are converging indications of spatial, sequence and structural constraints associated with the identified potential ORFs. A significant percentage of these are conserved between isolates and are preferably positioned in loops, which is suggestive of an increased likelihood for translation.
To investigate this hypothesis, we focused on only one viroid, PSTVd, an important quarantine viroid, and particularly on two strains that have been widely used in different works in recent years, PSTVd RG1 and PSTVd NB, which both contain a number of putative ORFs based on the analysis described.

Analysis of Potential Quasi-Species during Infections to Identify Possible Additional ORFs
As already mentioned, in this analysis we used two different PSTVd strains, PSTVd RG1 and PSTVd NB, both capable of creating quasi-species during infection. A previous study showed that PSTVd may exhibit a 1/3800 to 1/7000 mutation rate [60]. A point mutation could potentially generate start codons in several regions of the PSTVd RG1 sequence. The PSTVd-sRNA sequences of PSTVd RG1-infected tomato plants (GEO Acc. No. GSM1717894), which were previously generated by Adkar-Purushothama et al. [39], were analyzed for the presence of potential start codons. The results showed a total of 143 AUG codons among the 4594 PSTVd-sRNA sequences analyzed (3.1%). All the mutations that led to the formation of an AUG initiation codon are shown in Figure 2A,B.
We then performed HTS analysis using either non-infected or PSTVd NB-infected N. benthamiana plants. PSTVd NB infection was confirmed by Northern blotting prior to sequencing (data not shown). HTS reads that mapped to PSTVd NB were used for the identification of quasi-species. This analysis allowed a mutation likelihood, expressed as a percentage, to be determined for each nucleotide at all genome positions (Table S4). The overall likelihood for each position in the PSTVd genome was found to be <1%; however, at positions 40 to 60 of the PSTVd genomic sequence, the mutation percentage was as high as 7% (Table S4 and Figure S4). Subsequent analysis of the mutations identified 111 putative AUG codons generated at positions where nucleotide changes were observed. Mutations with the highest probability in each position are presented in Figure 2C,D. These results suggest that even if native PSTVd sequences do not possess a large number of AUG initiation codons, there is a tendency for the generation of mutations during infection/replication, which may lead to the formation of ORFs, therefore allowing the translation of peptides from viroid RNAs during the infection process.

The Circular Form of PSTVd Is Associated with Ribosomes
It has been shown before that PSTVd is found in ribosomes, but only in tomato [27]. In order to understand the association of PSTVd with the host ribosome during infection, tomato and N. benthamiana plants infected with PSTVd RG1 were used. PSTVd RG1 is known to induce severe symptoms in tomato cv. Rutgers, while N. benthamiana is a symptomless host [39,61]. Viroid accumulation in both tomato and N. benthamiana plants was confirmed by RT-PCR from the upper leaves. Both tomato and N. benthamiana plants showed PSTVd-specific amplicons of approximately 360 nt (i.e., the full length; Figure 3A), which was confirmed by sequencing.
Then, we investigated the presence of the viroid in ribosomes. Lysate from collected tissue was subjected to centrifugations, including ultracentrifugation on a 60% sucrose cushion (Figure 3B). RT-PCR and Northern blot analysis confirmed the presence of PSTVd in the total ribosome fraction of the infected tomato and N. benthamiana plants (Figure 3C,D). Additionally, RT-qPCR assays were performed on both total RNA extracts and RNA extracts derived from the total ribosomal fraction to quantify the level of viroid enrichment in the ribosomes. Higher amounts of viroid molecules were detected in the total ribosomal fraction as compared to the total RNA extract, suggesting that PSTVd is indeed enriched in the ribosomes of both tomato and N. benthamiana plants (Figure 3E). These results confirmed that viroids are associated with the total ribosomal fraction of infected plants.
However, to verify whether viroid molecules are associated with non-translating ribosomes (40S, 60S and 80S) or with polysomes, the total ribosomal fractions from leaf samples were subjected to fractionation (Figure 4A). Briefly, the isolated ribosomal fractions were dissolved in resuspension buffer and then layered on a 5-50% sucrose gradient. During centrifugation, the heavier molecules move down the sucrose gradient faster than the lighter ones. In other words, the polysomes move towards the bottom of the tube, followed by the 80S ribosomes (monosomes), while both the 60S and 40S ribosomal subunits remain at the top of the gradient. The fractionated RNAs were grouped into non-translating ribosomes and polysomes and were subjected to RT (using the Vid-RE primer), followed by PCR amplification using the Vid-FW/Vid-RE primers. The results showed that full-length PSTVd-specific amplicons were derived only from the polysome fraction of PSTVd RG1-inoculated tomato and N. benthamiana plants. No PCR amplification was detected with the RNA isolated from the non-translating ribosome fractions of the infected plants. None of the mock-inoculated plants showed any amplification (Figure 4B). The PSTVd-specific bands were cloned and sequenced in order to confirm their identity. The data presented here suggest that PSTVd is associated with polysomes in both infected tomato and N. benthamiana plants. It is worth highlighting that, as described in Cottilli et al., the peak corresponding to the 40S fraction is very low, suggesting that PSTVd could be affecting 18S rRNA maturation, and therefore 40S formation, also in N. benthamiana [27].
The simplest and most powerful tool with which to verify whether these polysome-associated PSTVd molecules are linear or circular RNA is cDNA synthesis using a target-specific primer followed by two independent PCRs, as described in Figure 4C. If the target is circRNA, both PCRs should yield amplicons equivalent to the full-length target, whereas if the target is monomeric linear RNA, only one PCR will yield a full-length amplicon. Hence, a second full-length PCR amplification was performed using a new set of primers (PSTVd-254F/PSTVd-253R) on the RT product that was synthesized using the Vid-RE primer. The results revealed the presence of full-length PSTVd amplicons in the polysome fraction of PSTVd-inoculated plants (also verified by sequencing), but not in either the ribosome fraction or in the mock-inoculated plants (Figure 4D). Taken together, these results suggest that circular PSTVd molecules are found in translating ribosomes of both tomato and N. benthamiana plants.

In Vitro Translation of PSTVd
In order to verify potential start codons, in vitro translation assays were performed using the wheat germ extract system to determine whether or not PSTVd RG1 ORFs could be translated into peptides. For this purpose, in vitro-generated circular (+) PSTVd and both (+) and (−) monomeric and dimeric PSTVd RNAs were prepared using a synthetic PSTVd RG1 sequence. The positive control (i.e., luciferase control RNA) produced a high-intensity band, while none of the tested viroid transcripts permitted the detection of peptides by immunoblot assays (Figure 5). These experiments were repeated under many different conditions, including various concentrations of both magnesium (2 to 5 mM MgCl2) and potassium (50 to 150 mM KCl), as well as various incubation times (60-120 min) and temperatures (25 °C to 30 °C). Regardless of the conditions tested, it was not possible to detect the synthesis of any peptides derived from the PSTVd template.

Using Mass Spectrometry to Identify PSTVd Produced Small Peptides
To study possible PSTVd peptide production in vivo, we performed MS analysis of infected plants. N. benthamiana plants were inoculated with PSTVd NB, and leaves were collected at 4 wpi and tested for viroid presence (Figure 6A). Since we used PSTVd NB, the peptides expected to be produced were known and are shown in Table 3. We analyzed three biological and three technical replicates for both non-infected and PSTVd-infected plants. We identified 3730 different proteins, and after filtering (see Materials and Methods), we kept 3227 proteins for further analysis, presented in Table S5. We first focused on the analysis of the proteins found in order to validate the MS technique. After statistical analysis, 85 proteins were identified as having their expression altered by PSTVd infection and are shown in a volcano plot (Figure 7A) as well as in detail in Table 4. The log2 difference is derived from the statistical comparison of the LFQ intensities between the two groups (infected samples vs. control samples). In order to verify the results, we compared them with previously published data [28]. Proteins such as oxygen-evolving enhancer protein 2 (OEE2) or pathogenesis-related protein 10 (PR10) were found in our experimental set as statistically significantly altered by PSTVd, as has been previously described for CEVd [28]. Therefore, we considered our results to be of sufficient quality for further analysis.
Figure 6 (partial caption). Total RNA staining (methylene blue) was used as loading control. (B) Three different strategies were followed in this study. In strategy 1, total lysate from both infected and non-infected plants was used for MS analysis. In strategy 2, total lysate was filtered through a specialized column to keep only small peptides before MS analysis. In strategy 3, a 15% polyacrylamide gel was used to separate proteins, and only proteins smaller than 30 kDa were kept for MS analysis.
Figure 7 (partial caption). (C) Heat map of proteins related to translation that were statistically affected by PSTVd infection. All graphs were created using the Perseus 1.6.10.43 software [52]. More details in Table S5.
By analyzing the ontology of the identified proteins in detail (Table 4), we found that certain proteins involved in metabolism and stress are influenced by the viroid infection. In addition, most of the affected proteins are located in the cytoplasm or in the chloroplast and not in the nucleus (Figure S5). Furthermore, a considerable number of proteins involved in translation seem to be affected. In Figure 7B, the red dots of the volcano plot represent all identified proteins involved in translation, and there is a clear tendency for under-expression of these proteins upon viroid infection. Some of these proteins showed statistically significant changes (Figure 7C). This result suggests that translation tends to be shut down during viroid infection.
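The log2 difference plotted on the x-axis of the volcano plot can be sketched as follows. This is a minimal illustration and not the Perseus workflow itself: it assumes that the difference is taken between the group means of log2-transformed LFQ intensities (infected minus control), and the function name and intensity values are hypothetical.

```python
import math
from statistics import mean


def log2_difference(infected, control):
    """Mean log2 LFQ intensity of infected samples minus that of controls.

    Negative values indicate under-expression upon infection.
    """
    infected_log = [math.log2(x) for x in infected]
    control_log = [math.log2(x) for x in control]
    return mean(infected_log) - mean(control_log)


if __name__ == "__main__":
    # Hypothetical LFQ intensities for one protein, three replicates per group.
    print(log2_difference([1.0e6, 1.2e6, 0.9e6], [4.1e6, 3.8e6, 4.4e6]))
```

A protein whose intensity is fourfold lower in infected samples gives a log2 difference of -2 under this definition; the statistical test that determines significance is not reproduced here.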
Then, we focused on the presence of small PSTVd peptides. Even though in this analysis we were able to recognize a significant number of small peptides (as small as 3 kDa), none of the in silico-predicted PSTVd microproteins was identified. Therefore, we proceeded with two alternative strategies (Figure 6B). We reasoned that the highly complex total-lysate sample may have masked some small peptides due to the large number of cellular proteins found in the lysate. For that reason, we opted for the filtration of the lysate to enrich our samples for low-molecular-weight proteins (Figure 6B). In a third strategy, we ran a 15% SDS-PAGE gel, cut out the bands under 30 kDa and repeated the proteomic analysis. The data analysis of the above strategies was performed using both specific trypsin digestion and the non-stringent "unspecific digestion" alternative. Unfortunately, even though small peptides originating from other proteins were identified, we were not able to identify any of the predicted PSTVd peptides.
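The difference between the two search modes can be made concrete with a small in silico digestion sketch. It assumes the commonly used trypsin rule (cleavage C-terminal to K or R, but not when the next residue is P) and a hypothetical minimum searchable peptide length of 7 residues. The function name, the threshold and the example sequence are illustrative and are not taken from this study.

```python
from typing import List


def tryptic_peptides(protein: str, min_len: int = 7) -> List[str]:
    """Digest a protein in silico with the simple trypsin rule.

    Cleaves after K or R unless the following residue is P, then keeps
    only peptides of at least min_len residues (assumed search threshold).
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        following = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and following != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(protein[start:])
    return [p for p in peptides if len(p) >= min_len]


if __name__ == "__main__":
    micro = "MASKLRPGGTTEEVK"  # hypothetical 15-aa microprotein
    print(tryptic_peptides(micro, min_len=1))
    print(tryptic_peptides(micro))
```

For a short microprotein, only one or two tryptic peptides may fall within the searchable length range, which is one reason why an unspecific-digestion search can be a useful complement.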

Discussion
Since the discovery of viroids, it has been generally accepted that they do not encode proteins. However, recent research developments suggest that viroids can bind to ribosomes [27]. In addition, endogenous circRNAs can be translated to produce small peptides called micropeptides [8][9][10][11]. With this in mind, we decided to revisit the idea that viroid RNAs are not translated by using distinct and sensitive techniques.
We performed a thorough bioinformatic analysis of 30 different Pospiviroidae species, including 2441 published isolates. We showed that all tested viroid sequences contain small ORFs, with mean sizes of putative polypeptides ranging between 3 and 15 kDa. We considered ORFs that started with AUG or non-AUG start codons [2]; however, very few of them presented a favorable Kozak sequence that would predict enhanced translation. The presence of these ORFs does not appear to be random, as suggested by the spatial preference of ORFs to occur in the same position across genomes. Finally, by analyzing all viroid isolates, we determined 'hotspots' usually found in structurally loose regions of the genome. For PSTVd, the hotspot was mostly located between positions 40 and 60 in the pathogenicity region. These analyses provide evidence that viroid genomes possess potential ORFs that could be translated.
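As an illustration of how ORFs can be searched in a circular genome, the following minimal sketch scans every position of a circular RNA for a start codon and reads codons around the circle until a stop codon is met. It is not the pipeline used in this study. The set of non-AUG start codons is an illustrative assumption, the function names are hypothetical, and the test sequence is a toy example rather than a viroid sequence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
# AUG plus a few near-cognate codons; this set is assumed for illustration.
START_CODONS = {"AUG", "CUG", "GUG", "UUG", "ACG"}


def codon_at(rna: str, pos: int) -> str:
    """Return the codon starting at pos, wrapping around the circular genome."""
    n = len(rna)
    return "".join(rna[(pos + k) % n] for k in range(3))


def circular_orfs(rna: str, starts=START_CODONS, min_codons: int = 2):
    """List (start_position, length_in_aa) for ORFs on a circular RNA.

    Reading may pass the origin several times. After n codons (3n nt) the
    frame returns to its starting point, so the scan stops there and
    stop-free (endless) frames are skipped.
    """
    n = len(rna)
    orfs = []
    for start in range(n):
        if codon_at(rna, start) not in starts:
            continue
        for j in range(1, n + 1):
            if codon_at(rna, start + 3 * j) in STOP_CODONS:
                if j >= min_codons:
                    orfs.append((start, j))  # j codons precede the stop
                break
    return orfs


if __name__ == "__main__":
    toy = "UAACAUGGCU"  # toy circular sequence; the ORF wraps past the origin
    print(circular_orfs(toy, starts={"AUG"}))
```

Because reading can continue past the origin, an ORF on a circular genome can encode a peptide longer than one third of the genome length, which is compatible with predicted peptides of more than 200 aa from a genome of a few hundred nucleotides.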
The production of microproteins from small ORFs, including from circRNA, has been described previously [8][9][10][11]. Precursors of miRNAs, which have been proposed to have structural features similar to those of viroids, have also been found to interact with ribosomes and produce micropeptides ranging from 4 to 60 aa [62][63][64]. Translation from UTRs has also been described for uORFs (upstream ORFs in the 5′ UTR) and sORFs (small ORFs generally in 5′ UTRs). Most uORFs are found upstream of major mRNA ORFs and are most often initiated using an AUG start codon. However, almost 50% of uORFs have been found to start from non-AUG start codons [65]. The production of peptides from uORFs has been found to be essential in translation since it can either enhance translation (e.g., ribosomal shunt) or reduce it [66,67]. Finally, circular RNA satellites, which are small pathogens sharing a few common characteristics with viroids, have been found capable of producing small peptides [21].
In this work, we focused specifically on PSTVd to study the possible production of peptides by viroids, using two different strains, PSTVd RG1 and PSTVd NB. Although no AUG was present, there were a few non-AUG start codons, allowing the production of peptides ranging from 3 to 204 aa for PSTVd NB and from 2 to 61 aa for PSTVd RG1. Upon infection, however, a significant number of point mutations are produced (3% and 7% depending on the system), as has been shown before [60], some of which generate AUG start codons that can be used for initiation of translation. Nevertheless, the number of recognized quasi-species carrying these mutations is probably too small to significantly affect viroid biology.
It has been shown that CEVd genomic RNA as well as viroid-derived siRNAs localize in ribosomes [27], suggesting that Pospiviroidae species tend to accumulate in ribosomes. In this work, we have shown that the circular PSTVd genome also localizes in ribosomes in N. benthamiana and tomato plants. Therefore, applying a combination of new and older techniques, we aimed to test the hypothesis that viroids can be translated. We first performed in vitro experiments, but no translation products were found under any of the conditions tested. Older in vitro translation experiments using both PSTVd and CEVd showed similar results [22,23]. In addition, analogous experiments with the viroid PLMVd of the Avsunviroidae family also did not produce any peptides (F. Cote and J.P. Perreault, unpublished results). Taken together, these results suggest that no peptides are produced in cell-free in vitro systems. Nevertheless, this system has some limitations, including low protein yield [68], and therefore we cannot exclude the possibility that peptides may be produced but not detected. Consequently, we opted for an in vivo experiment to look for peptides using a different technique.
We performed proteomic analysis of lysates of PSTVd-infected N. benthamiana plants, using a robust dataset containing three biological replicates and three technical replicates. We showed altered expression of 85 proteins during PSTVd infection. Some, such as OEE2 and PR10, have also been described previously, suggesting that our analysis was accurate [28]. We found that a considerable number of PSTVd-deregulated proteins are localized in the cytoplasm. In addition, we found that apart from proteins usually affected upon infection, such as stress proteins or proteins related to different metabolic pathways, proteins related to the translation mechanism were also influenced, showing a trend of under-expression. This phenomenon could be related to ribosomal stress. It has been proposed before that ribosomal biogenesis in tomato plants is affected during CEVd infection [27]. Downregulation of proteins related to translation could also be a result of a translation shut-off. Viruses benefit from a decrease in the translation of endogenous transcripts as this protects them from defense-related proteins. In addition, they may divert translation to their own benefit [69]. This can be achieved by different mechanisms such as influencing translation initiation factors or even cleaving endogenous mRNAs. The most common 'strategy' used by viruses is to either bind translation initiation or elongation factors or affect their phosphorylation [69]. It has been proposed before by independent studies that CEVd, PSTVd and PLMVd bind eIF1A [28,29]. Other factors such as eEF2 and eIF5A have been found to be influenced by CEVd infectivity [27], suggesting that viroids may decrease the translation rate in order to gain time for establishing host propagation.
From the standard LC-MS/MS lysate analysis, no PSTVd-expressed microprotein was identified. We reasoned this could be due to the large number of proteins identified, which could in a way 'mask' small peptides. Therefore, we first filtered the lysate, keeping only small peptides, and second, assessed proteins smaller than 30 kDa following electrophoresis, using LC-MS/MS. Again, both strategies failed to identify PSTVd-derived peptides. It cannot be excluded that technical limitations may be responsible for this. One possibility is that these peptides are extremely hydrophilic, making them difficult to detect by LC-MS/MS. However, we tested the predicted peptides with dedicated hydrophobicity software, and they were found adequate for LC-MS/MS (data not shown). Another issue could be the low quantity of the produced peptides. Yet, as shown in a Northern blot, the quantity of viroid present at 4 wpi is high enough to assume that if a peptide is produced by each molecule, then its quantity should be detectable. Another possibility could be rapid peptide degradation, which would make it more difficult to obtain a peptide fragment in LC-MS/MS, even though a protease inhibitor was added to the lysis buffer. We also cannot exclude that a putative PSTVd peptide could be retained in a specific cellular domain that cannot be recovered under the specific conditions of this work. Finally, the lysis buffer used could be improved for small peptides, as was recently published [70].
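A hydrophobicity check of this kind can be sketched with the grand average of hydropathy (GRAVY) based on the Kyte-Doolittle scale. This is only an illustration under the assumption that a GRAVY-like score is a reasonable proxy. The software actually used in this study is not specified here, and the function name and example sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.

    Negative scores indicate a hydrophilic peptide, positive scores a
    hydrophobic one. Raises KeyError for non-standard residues.
    """
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


if __name__ == "__main__":
    for seq in ("KKDDEERR", "LRPGGTTEEVK", "IILVVAAF"):  # hypothetical peptides
        print(seq, round(gravy(seq), 2))
```

Peptides with strongly negative scores would be the candidates expected to be poorly retained during reversed-phase separation, which is the concern raised above.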

Conclusions
Our results suggest that even though viroids are present in ribosomes and have ORFs which are potentially translatable, no peptide was identified using either in vitro or in vivo translation experiments. Therefore, viroids may be 'using' ribosomes for reasons other than translation. One possibility could be binding to ribosomes for protection. It has been shown before that the ribosome protects the portion of RNA enclosed within its subunits [71,72]. Although usually only around 35 nt are protected, more than one ribosome can typically be found associated with an mRNA [72]. Therefore, we could speculate that, by binding to PSTVd RNAs, multiple ribosomes can provide protection from the action of different cellular nucleases. An alternative explanation may be related to the movement of viroid RNAs. Ribosomes localize at the surface of the endoplasmic reticulum and the mitochondria, as well as freely in the cytosol. Cytosolic ribosomes are suggested to use microtubules to circulate in cells [73], so it could be speculated that viroids use ribosomes to move within the cell. Finally, another possibility could be that viroids bind to ribosomes to hijack the translation mechanism. However, whether and to what extent the viroid-ribosome interaction is related to viroid pathogenicity remains unclear. Taken together, this study shows that even though ORFs are present in viroids, they do not seem to be translated under our experimental conditions. Nevertheless, viroids may utilize ribosomes for a different reason. Further experimentation is needed to test such a hypothesis.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/cells11020265/s1, Figure S1: Conservation rate in viroid species, Figure S2: Comparison between bioinformatically shuffled genome and real genome for viroids, Figure S3: Presence of 'hotspots' in viroid genomes, Figure S4: Nucleotide mutation rate for PSTVd, Figure S5: GO enrichment analysis using PlantRegMap, focusing of cellular compartment, Table S1: Viroids and strains used for this analysis (by NCBI), Table S2: Primers used in this study, Table S3: ORF present in different part of viroid structure, Table S4: Presence of alternative nucleotides in different PSTVd positions, Table S5

Informed Consent Statement: Not applicable.
Data Availability Statement: The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [74] partner repository with the dataset identifier PXD030755.