Comparative Analysis of Novel Strains of Porcine Astrovirus Type 3 in the USA

Porcine astrovirus type 3 (PoAstV3) has been previously identified as a cause of polioencephalomyelitis in swine and continues to cause disease in the US swine industry. Herein, we describe the characterization of both untranslated regions, frameshifting signal, putative genome-linked virus protein (VPg) and conserved antigenic epitopes of several novel PoAstV3 genomes. Twenty complete coding sequences (CDS) were obtained from 32 diagnostic cases originating from 11 individual farms/systems sharing a nucleotide (amino acid) percent identity of 89.74–100% (94.79–100%), 91.9–100% (96.3–100%) and 90.71–100% (93.51–100%) for ORF1a, ORF1ab and ORF2, respectively. Our results indicate that the 5′UTR of PoAstV3 is highly conserved highlighting the importance of this region in translation initiation while their 3′UTR is moderately conserved among strains, presenting alternative configurations including multiple putative protein binding sites and pseudoknots. Moreover, two predicted conserved antigenic epitopes were identified matching the 3′ termini of VP27 of PoAstV3 USA strains. These epitopes may aid in the design and development of vaccine components and diagnostic assays useful to control outbreaks of PoAstV3-associated CNS disease. In conclusion, this is the first analysis predicting the structure of important regulatory motifs of neurotropic mamastroviruses, which differ from those previously described in human astroviruses.


Introduction
Astroviruses are a diverse group of positive-sense, single-stranded RNA viruses belonging to the family Astroviridae, order Stellavirales, and are classified into the genera Mamastrovirus and Avastrovirus, depending on whether they infect mammals or avian species, respectively [1]. Historically, members of both genera were frequently associated with cases of enteric disease and seldom associated with extraintestinal manifestations [2][3][4][5]. In the last decade, members of the genus Mamastrovirus have been increasingly associated with central nervous system (CNS) disease [6]. In 2010, both human and mink neurotropic strains were associated with cases of encephalomyelitis [3,7]. More recently, multiple neurotropic strains of astroviruses have been associated with cases of CNS disease affecting wild and domesticated animals including swine, sheep, cattle, muskox, and alpacas [8][9][10][11][12][13].
In humans, as in animals, cases of astrovirus-associated CNS disease are often fatal. Of the numerous neurotropic strains identified infecting mammals, only a couple of strains infecting humans have been successfully isolated in cell culture [14,15]. This suggests that neurotropic strains belonging to different genogroups could use different strategies for virus replication, including various mechanisms for translation initiation [16]. Multiple research groups have unsuccessfully pursued the isolation of neurotropic mamastroviruses from clinical samples, and while this approach is still largely advantageous, in silico molecular studies may help elucidate unknown viral sequence motifs and mechanisms associated with a diverse range of biological activities for viruses that are difficult to isolate in cell culture systems [17][18][19].
The organization of the open reading frames (ORF), namely ORF1a, ORF1b, and ORF2, encoding the non-structural protein 1a (nsp1a), non-structural protein 1ab (nsp1ab), and the capsid protein, respectively, is one of the most distinctive features of members of the family Astroviridae [20]. This arrangement, in which non-structural proteins are encoded immediately after the 5′ untranslated region (UTR) and the capsid protein is encoded adjacent to the 3′UTR, distinguishes the genomic organization of astroviruses from that of other single-stranded RNA viruses affecting mammals.
Porcine astroviruses are a polyphyletic group of viruses, and genetic differences in the complete capsid sequence (ORF2, adjacent to the 3′UTR) define the current taxonomy. The taxonomy of porcine astroviruses has not been completely updated since the 9th report of the International Committee on Taxonomy of Viruses (ICTV) nearly a decade ago. Porcine astroviruses have been previously classified into seven genotype species (i.e., MAstV 3, MAstV 22, MAstV 24, MAstV 26, MAstV 27, MAstV 31, and MAstV 32), divided into five distinct genetic lineages (i.e., PoAstV1-5). All porcine astroviruses were initially identified in fecal samples, associated or not with cases of diarrhea; however, porcine astroviruses have been recently associated with extraintestinal manifestations [21][22][23][24][25].
In 2018, strains of porcine astrovirus type 3 (PoAstV3, tentatively Mamastrovirus 22) were associated with CNS disease affecting swine herds in both the USA and Hungary [12,26,27]. Subsequent studies have shown that PoAstV3 has been associated with cases of polioencephalomyelitis as early as 2010, with continued diagnosis in the US swine herd [28]. Retrospective data from multiple veterinary diagnostic laboratories in the USA indicate an increased detection of cases of PoAstV3, with 38 new cases of disease identified between December 2017 and March 2021 [29,30]. In addition, experimental reproduction of disease with PoAstV3 has been recently demonstrated by inoculation of CNS tissue homogenate in colostrum-deprived, cesarean-derived (CDCD) piglets [31]. Although there have been multiple advances in the epidemiology and pathophysiology of PoAstV3, the virus has not yet been isolated in cell culture systems, hampering its research [19,22,32].
Previously, in silico analyses have modeled and identified important motifs for replication of astroviruses, including the frameshift signal, both UTRs, and multiple stem loops within the 3′ termini of their genomes [33][34][35]. Additionally, multiple putative RNA-protein interaction regions have been identified with this approach in both UTRs of human astroviruses, suggesting their potential association with multiple viral and host cell processes. Potential RNA-binding proteins (RBPs) previously identified included multiple serine/arginine-rich splicing factors (SRSF), heterogeneous nuclear ribonucleoprotein E2 (hnRNPE2), and polypyrimidine tract-binding protein (PTB or hnRNPI) [36]. Furthermore, recent studies comparing multiple human and animal astroviruses have identified a functional ORF, encoding a viroporin, that had been previously suggested by in silico analyses [17,37,38]. These and other studies demonstrate the usefulness of in silico analysis [39,40].
Due to the continued relevance of PoAstV3-associated polioencephalomyelitis in the US swine herd, we describe the comparative analysis of multiple novel strains of PoAstV3; focusing on the secondary RNA structure and motifs present in their untranslated regions and its association with RBPs; the presence of a putative VPg within its genome, and the identification of multiple linear epitopes and antigens within the capsid protein. This study predicts and describes multiple conserved motifs and potential molecules relevant to the life cycle and replication of PoAstV3, which may be important for its propagation in vitro and future studies involving neurotropic mamastroviruses. Furthermore, highly conserved epitopes and antigens identified in this study may allow us to develop diagnostic assays and vaccine candidates useful for the prevention of outbreaks associated with neurotropic strains of PoAstV3.

Diagnostic Samples
A subset of swine cases diagnosed with PoAstV3-associated polioencephalomyelitis at the Iowa State University Veterinary Diagnostic Laboratory (ISU-VDL) from January 2017 to December 2020 was included in this study (n = 64). Inclusion criteria included cases with histologic lesions consistent with a CNS viral infection (lymphoplasmacytic polioencephalomyelitis, perivascular cuffing, gliosis and neuronal necrosis), and concurrent detection of PoAstV3 by reverse transcription quantitative polymerase chain reaction (RT-qPCR) in CNS tissues. Additionally, feces (n = 2) and one intestinal tissue homogenate sample from farms with intermittent PoAstV3-associated polioencephalomyelitis were included.

RT-qPCR and Sanger Sequencing
For each diagnostic case, tissue homogenates used at the time of initial diagnosis were retrieved and stored at −80 °C. Previously, CNS tissues were processed as follows: aliquots (1 to 3 g) of CNS tissue were minced with sterile forceps and scissors, homogenized with 15 mL Minimal Essential media (MEM, Fisher Scientific, Waltham, MA, USA), and processed using a Geno/Grinder® 2010 (SPEX® SamplePrep LLC, Metuchen, NJ, USA). Fecal swabs and intestinal tissue homogenate were diluted in 1 mL of phosphate-buffered saline solution (PBS). RT-qPCR conditions and methodology have been described previously [12]. Briefly, nucleic acids were extracted from 100 µL sample aliquots using MagMAX™ Pathogen RNA/DNA Kit (Thermo Fisher Scientific, Waltham, MA, USA) with a KingFisher Flex Purification System (Thermo Fisher Scientific, Waltham, MA, USA) following the instructions of the manufacturer. Primer pairs were designed to amplify overlapping amplicons targeting the complete genomes of two previously published PoAstV3 sequences (Table 1, GenBank accession Nos. JX556691 and KY940545). The viral genomic RNA was amplified using a 25 µL reaction utilizing custom qScript® XLT One-Step RT-PCR Kit (Quanta Biosciences, Gaithersburg, MD, USA) following the manufacturer's recommendations. Each primer was present in the final reaction at 320 nM, and 4 µL of RNA template was used per reaction. PoAstV3 RNA amplification was performed on an Applied Biosystems SimpliAmp thermal cycler (Thermo Fisher Scientific, Waltham, MA, USA) under the following conditions: initial reverse transcription at 48 °C for 20 min, followed by initial denaturation at 94 °C for 3 min and 45 cycles of denaturation at 94 °C for 30 s, annealing at 50 °C for 50 s, and extension at 68 °C for 90 s, with a final elongation at 68 °C for 7 min. Phosphate-buffered saline was extracted as a negative extraction control and nuclease-free water was used as a negative amplification control.
The RT-qPCR products were visualized using QIAxcel Advanced System (Qiagen, Germantown, MD, USA) and purified using the ExoSAP-IT PCR Product Cleanup Reagent (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions. Sequencing was completed via Sanger sequencing with the BigDye™ Terminator v3.1 cycle sequencing kit (Thermo Fisher Scientific, Waltham, MA, USA) on a 3730xl DNA Analyzer (Thermo Fisher Scientific, Waltham, MA, USA) at the Iowa State University DNA Facility. Sequences were assembled using Geneious Prime 2020.2.4 software.

Phylogenic and Sequence Analysis
PoAstV3 sequences obtained were aligned using ClustalO webserver under default settings with 47 full genome neurotropic mamastrovirus strains retrieved from GenBank. Phylogenetic trees were generated using the neighbor-joining algorithm with MEGAX and rendered with interactive Tree Of Life (iTOL) [41][42][43][44]. In addition to the PoAstV3 sequences obtained in this study, 14  was also included in the analysis [45,46]. All sequences were retrieved from GenBank. In addition, PoAstV3 ORF1a, ORF1ab, and ORF2 were identified and translated using Jalview 2.11.1.3 and further aligned with ClustalO webserver under default settings to obtain nucleotide (nt) and amino acid (aa) percent identity matrices [41,47].
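As an illustration of how a pairwise percent identity matrix can be derived from an existing alignment, a minimal Python sketch follows. It is not the ClustalO implementation: the function names, the toy sequences, and the decision to skip any column in which either sequence carries a gap are assumptions made for this example, and other tools may treat gaps differently.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped. This gap
    handling is an assumption of the sketch, not a documented ClustalO rule.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def identity_matrix(aligned: dict) -> dict:
    """Symmetric matrix of pairwise percent identities keyed by name pairs."""
    names = list(aligned)
    result = {(n, n): 100.0 for n in names}
    for n1, n2 in combinations(names, 2):
        pid = percent_identity(aligned[n1], aligned[n2])
        result[(n1, n2)] = pid
        result[(n2, n1)] = pid
    return result


# Hypothetical toy alignment; these are not PoAstV3 sequences.
toy_alignment = {
    "strain_A": "AUGGCCAUUG",
    "strain_B": "AUGGCUAUUG",
    "strain_C": "AUGG-CAUUA",
}
matrix = identity_matrix(toy_alignment)
```

The minimum and maximum of the off-diagonal entries of such a matrix give an identity range of the kind reported per ORF.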

Identification of Putative Genome-Linked Virus Protein (VPg)
The presence of a putative VPg, previously identified between residues 664 and 758 (human) and residues 666 to 752 (mink) of ORF1a, was evaluated [52,53]. The predicted N-terminal proteolytic cleavage site [Q(K/A)], located upstream of the conserved motif KGK(N/T)K, and the C-terminal proteolytic cleavage site [Q(P/A/S/L)], located downstream of the C-terminal EEY-like motif, were identified within the ORF1a of the USA consensus sequence, and the molecular weight of the delimited region was calculated using the Expasy Compute pI/Mw tool [54]. The USA nsp1a consensus sequence was further analyzed with the DisoRDPbind webserver to identify putative disordered RNA-protein binding regions [55].
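The motif-based delimitation described above can be sketched as a short regular-expression search. This is only an illustrative approximation: the function name, the toy peptide, the use of a literal EEY for the "EEY-like" motif, and the assumption that cleavage occurs immediately after the glutamine (Q) of each site are choices made for this example, not the procedure used in this study.

```python
import re

KGK_MOTIF = re.compile(r"KGK[NT]K")   # conserved motif KGK(N/T)K
EEY_MOTIF = re.compile(r"EEY")        # literal EEY (assumed form of the EEY-like motif)
N_CLEAVAGE = re.compile(r"Q[KA]")     # N-terminal site Q(K/A)
C_CLEAVAGE = re.compile(r"Q[PASL]")   # C-terminal site Q(P/A/S/L)


def find_putative_vpg(protein: str):
    """Return (start, end, peptide) with 0-based, end-exclusive coordinates.

    Assumes cleavage happens right after the Q of each site, so the candidate
    starts at the residue following the N-terminal Q and ends with the
    C-terminal Q. Returns None if any required element is missing.
    """
    kgk = KGK_MOTIF.search(protein)
    if kgk is None:
        return None
    eey = EEY_MOTIF.search(protein, kgk.end())
    if eey is None:
        return None
    # Nearest Q(K/A) upstream of the motif; endpos is start + 1 so the second
    # residue of the site may coincide with the first K of the motif.
    n_sites = [m.start() for m in N_CLEAVAGE.finditer(protein, 0, kgk.start() + 1)]
    if not n_sites:
        return None
    c_site = C_CLEAVAGE.search(protein, eey.end())
    if c_site is None:
        return None
    start = n_sites[-1] + 1
    end = c_site.start() + 1
    return start, end, protein[start:end]


# Made-up toy peptide for demonstration only; not a PoAstV3 sequence.
toy_protein = "MSATTFIQAKGKNKKGGSDEEYLLQPTT"
candidate = find_putative_vpg(toy_protein)
```

The peptide returned by such a search could then be passed to a molecular weight calculator.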

Linear Antigen Epitope Prediction
The consensus capsid protein sequence of USA, Japanese, Hungarian, and Spanish PoAstV3 strains and the capsid protein sequence of German strain GER/L00919-K17/2014 were analyzed under default settings with SVMTriP webserver and the Immunomedicine group's predicted antigenic peptides web-based tools for the prediction of linear epitopes [56,57]. Recommended epitopes predicted with SVMTriP webserver (score ≥ 0.85), and sequence motifs predicted by the Immunomedicine group's web-based tool (antigenic propensity ≥1.2) were identified and further analyzed using VaxiJen v2.0 webserver with a threshold value of 0.4 [58]. Probable antigens identified in USA capsid protein with VaxiJen v2.0 webserver, were further identified in individual USA strains using find function with Jalview 2.11.1.3 and the aa % identity of these regions were evaluated for point mutations.
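The two-stage filtering described above (a tool-specific cutoff followed by an antigenicity threshold) can be expressed as a small sketch. The data class, tool labels, peptide names and scores below are hypothetical placeholders, and treating a score equal to the threshold as passing is an assumption of this example.

```python
from dataclasses import dataclass

SVMTRIP_MIN_SCORE = 0.85          # recommended-epitope cutoff stated in the text
ANTIGENIC_PROPENSITY_MIN = 1.2    # antigenic propensity cutoff stated in the text
VAXIJEN_THRESHOLD = 0.4           # antigenicity threshold stated in the text

TOOL_CUTOFFS = {
    "svmtrip": SVMTRIP_MIN_SCORE,
    "immunomedicine": ANTIGENIC_PROPENSITY_MIN,
}


@dataclass(frozen=True)
class EpitopeCandidate:
    peptide: str          # placeholder label, not a real sequence
    tool: str             # "svmtrip" or "immunomedicine"
    tool_score: float     # score reported by the prediction tool
    vaxijen_score: float  # antigenicity score


def probable_antigens(candidates):
    """Keep candidates that pass their tool cutoff and the antigenicity threshold."""
    kept = []
    for c in candidates:
        if c.tool_score >= TOOL_CUTOFFS[c.tool] and c.vaxijen_score >= VAXIJEN_THRESHOLD:
            kept.append(c)
    return kept


# Hypothetical values for demonstration only.
toy_candidates = [
    EpitopeCandidate("PEPTIDE_A", "svmtrip", 0.91, 0.62),
    EpitopeCandidate("PEPTIDE_B", "svmtrip", 0.80, 0.75),
    EpitopeCandidate("PEPTIDE_C", "immunomedicine", 1.25, 0.31),
    EpitopeCandidate("PEPTIDE_D", "immunomedicine", 1.30, 0.55),
]
selected = [c.peptide for c in probable_antigens(toy_candidates)]
```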

Diagnostic Samples
Sixty-four samples (CNS tissue (n = 61), feces (n = 2), and intestinal tissue homogenate (n = 1)) from 32 diagnostic cases were available and retrieved. Twenty-nine of these cases (n = 29/32, 90.6%) had a diagnosis of polioencephalomyelitis due to PoAstV3 with CNS tissues available. In three cases (feces (n = 2) and intestinal tissue homogenate (n = 1)) PoAstV3 PCR-based detection was requested by the submitter. The age of affected animals in CNS cases ranged from 3-week-old pigs to adult sows. CNS cases by production stage were as follows: nursery (3 to 9 weeks of age (n = 19)), grow-finish (10 to 13 weeks of age (n = 4)), and adult (gilts and sows (n = 6)). Non-CNS samples consisted of feces from suckling piglets (5- and 8-day-old) and intestinal tissue homogenate from suckling piglets (7-day-old). All CNS cases were reported to originate in the state of Iowa, except for two cases that originated in Illinois. These cases occurred during January (n = 1), February (n = 4), April (n = 3), May (n = 2), June (n = 2), July (n = 6), August (n = 3), September (n = 3), October (n = 5) and December (n = 3). Cases were traced back to 18 unique sources (systems). Of the 32 diagnostic cases, twelve (37.5%) originated from unique, unrelated farms/systems, seven (21.8%) cases originated from a single farm/system, followed by three different farms/systems with three (9.3%) cases each, and two different farms/systems with two (6.2%) cases each. Additional epidemiologic and clinicopathologic information for cases 3, 4, 7, and 8 has been published elsewhere [28,59]. Metadata by case, including farm and strain identification (ID), tissue sample, production category, state of origin, initial PoAstV3 RT-qPCR result, clinical signs of affected animals, and month of diagnosis, are summarized in Table S1.

RT-qPCR and Sanger Sequencing
Twenty complete coding sequences (CDS) were obtained from the initial 64 samples retrieved (n = 20/64, 31.25%). The complete 5′UTR was obtained in all 20 coding sequences, while the complete 3′UTR, excluding the poly (A) tail, was obtained in 19 of the coding sequences, in addition to a partial 5′ terminus (42 nt) of the 3′UTR of strain USA/IA/38290C/20. These 20 sequences originated from 16 different cases (n = 16/32, 50%) representing 11 unique sources (systems), with most sequences (n = 19/20, 95%) obtained from CNS cases and only one sequence from feces (Table S1). Genome sequences from strains with a complete 3′UTR (n = 19) ranged from 6328 to 6434 nt in length with a GC content of 46.7% to 47.7% (Table S2). Sequences were deposited at GenBank under accession numbers MW653747-MW653753 and MW732145-MW732157.
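For readers who wish to reproduce summary statistics of this kind, genome length and GC content can be computed with a few lines of Python. The sketch below is a generic illustration; the function name and the toy sequences are hypothetical, and ignoring gap or ambiguity characters is an assumption of this example.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T/U) of a sequence.

    Gap and ambiguity characters are ignored, which is an assumption of
    this sketch.
    """
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy sequences for demonstration only; not PoAstV3 genomes.
toy_rna = "AUGGCC"
toy_dna = "ATGC"
toy_rna_gc = round(gc_content(toy_rna), 1)
toy_dna_gc = gc_content(toy_dna)
```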

Phylogenic and Sequence Analysis
All PoAstV3 strains obtained from diagnostic cases clustered with other PoAstV3 strains previously described in the USA and other neurotropic strains identified to belong to the VA/HMO phylogenetic clade within MAstV genogroup II ( Figure 1). PoAstV3 sequences clustered in three distinct clusters within USA strains. Nine of these novel strains (

5′UTR Analysis
The 5′UTR of USA, Spanish, and Hungarian PoAstV3 strains is 30 nt long, while that of German strain GER/L00919-K17/2014 is 32 nt in length given a GC insertion at the 5′ terminus. The 5′UTRs of USA and Spanish PoAstV3 strains are identical and differ from Hungarian strains and German strain GER/L00919-K17/2014 by a single mutation (G>A). The Hungarian strains and German strain GER/L00919-K17/2014 are identical if the initial GC insertion is absent from strain GER/L00919-K17/2014. The 5′UTR of all Japanese strains appears incomplete, although their 3′ termini (CCGGCCCU) are identical to other PoAstV3 strains, except for JPN/Bu-5/2014 (Figure 2). The 5′UTR consensus sequence folds into a stem with an internal loop (A, C), a loop (CGUU), and a bulge (CC) and has a free energy score of −7.80 kcal/mol (Figure 2). A suboptimal Kozak consensus sequence ACCAUGG (initiation codon underlined) is found as part of the UTR at the 3′ terminus of the stem, including the ORF1a/1ab initiation codon. The minimum free energy (MFE) of the optimal secondary structure of the consensus 5′UTR was −7.06 kcal/mol.

Frameshift Analysis
The conserved shift heptamer sequence (AAAAAAC) is located consistently in all PoAstV3 strains at position 2516-2523, with a conserved UGA stop codon located 39 nt downstream of this sequence motif. A repeated, in tandem, AUG codon is located in a −1 position from the termination codon, and a single nucleotide substitution (2555 C > U) is observed in strains USA/IA/76780/20 and JPN/Bu2-5/2014 in contrast to all other strains. The secondary RNA structure of this conserved region is characterized by the presence of a loop motif (CAAGA) and a bulge (AUUA) (Figure 3a). The predicted optimal MFE of this structure was −16.93 kcal/mol.
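A search of this kind can be sketched in a few lines of Python: locate the heptamer within an ORF and report its distance to the first in-frame stop codon of that ORF. The function names, the returned field names and the toy sequence are hypothetical, and the sketch assumes that the ORF start position is known and that coordinates are reported 1-based.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
SLIPPERY_HEPTAMER = "AAAAAAC"


def first_in_frame_stop(rna: str, orf_start: int):
    """0-based index of the first stop codon in the frame set by orf_start."""
    for i in range(orf_start, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            return i
    return None


def frameshift_site(rna: str, orf_start: int = 0):
    """Locate the slippery heptamer and the in-frame stop codon of the ORF.

    Returns 1-based coordinates and the number of nucleotides between the end
    of the heptamer and the start of the stop codon, or None if either
    element is absent.
    """
    rna = rna.upper().replace("T", "U")
    hept = rna.find(SLIPPERY_HEPTAMER, orf_start)
    stop = first_in_frame_stop(rna, orf_start)
    if hept == -1 or stop is None:
        return None
    return {
        "heptamer_start": hept + 1,
        "heptamer_end": hept + len(SLIPPERY_HEPTAMER),
        "stop_start": stop + 1,
        "gap_nt": stop - (hept + len(SLIPPERY_HEPTAMER)),
    }


# Toy ORF for demonstration only; not a PoAstV3 sequence.
toy_orf = "AUG" + "GCU" + "GCU" + "AAAAAAC" + "GG" + "GCU" + "UGA" + "GCU"
site = frameshift_site(toy_orf)
```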

Figure 2 (caption fragment): The start codon is included on both the secondary structure (a) and the percent identity plot (light blue box). Sequences identified in this study are denoted by a black box.

3′UTR Analysis
The length of the 3′UTR of PoAstV3, without the poly (A) tail, varies from 54 nt in Hungarian strains (KY073229, KY073231) to 86 nt in USA strains (MT394895, MT394896, MW653747, MW653748, MW732148, MW732153). The 3′UTR of USA strains varies from 80 nt to 86 nt in length. The 3′UTRs of GER/L00919-K17/2014, Japanese (LC201595, LC201596), and Spanish strains (MK962341, MK962341) are 58 nt, 59-60 nt, and 83-84 nt in length, respectively. In all strains, the canonical stop codon UAG is consistently located 15 nt upstream of the start of the s2m, except for strain USA/IA/53797GA/2019 (MW653750), which possesses a UGA (opal) stop codon 16 nt upstream of the start of the s2m. All PoAstV3 strains possess multiple stop codons within their 3′UTR (Figure 4). In USA and Spanish strains, a net insertion of up to 32 nt extends the 3′UTR in contrast with Hungarian and Japanese strains and GER/L00919-K17/2014. Secondary RNA structures similar to the s2m are predicted in the USA, Hungarian, and Spanish 3′UTR consensus regions but are not evident in the Japanese 3′UTR consensus and GER/L00919-K17/2014. In the 3′UTR consensus region of USA and Spanish strains, this motif is followed by two stem structures with multiple internal loops. The 3′UTR consensus region of Hungarian PoAstV3 has only a single stem structure with a prominent loop. In contrast, the secondary 3′UTR structures of the Japanese consensus strain and GER/L00919-K17/2014 do not form a classical s2m structure and instead form a single stem with an internal loop, a bulge and a multi-branched loop, and a stem-loop with internal loops, a bulge, and a hairpin loop, respectively. Distant base-pairing regions and pseudoknots were further detected with the IPknot webserver. Multiple base-pairing regions were predicted between the s2m and the adjacent stem-loop motif within the 3′UTR of USA and Spanish consensus sequences. The secondary structure of the Hungarian 3′UTR consensus region, initially modeled with RNAfold, transformed into a single multi-branched stem-loop similar to the predicted multi-branched stem-loop of GER/L00919-K17/2014. Additional base-pairing regions were predicted within the multi-branched structure initially modeled for the Japanese 3′UTR region, and between this structure and the poly (A) tail (Figure 5). The MFE of the optimal secondary structures of the USA and Spanish 3′UTRs was −16.07 and −16.27 kcal/mol, respectively. The MFE of the optimal secondary structures of the 3′UTRs of Hungarian and Japanese strains and strain GER/L00919-K17/2014 was −13.22, −17.06, and −14.25 kcal/mol, respectively.
Multiple putative protein binding sites were identified in the 3′UTR of PoAstV3. Putative binding sites for PTB were consistently found in all sequences, ranging from two motifs (in JPN/Bu4-2-1) up to four intercalated motifs in USA strains. In Hungarian and Japanese strains, as in strain GER/L00919-K17/2014, only putative binding sites for PTB and hnRNPE2 were identified. In all other PoAstV3 strains, at least one motif each for PTB, hnRNPE2 and SRSF5 was identified (Figures 6 and 7). No putative protein binding sites were identified in the 5′UTR of PoAstV3.

Linear Antigen Epitope Prediction
Results, sequences, and scores of predicted linear epitopes and antigens are summarized in Table 3. The SVMTriP online tool predicted four recommended linear epitopes in strain GER/L00919-K17/2014 and three linear epitopes each for the Hungarian, Spanish, and Japanese consensus sequences. Two recommended linear epitopes were identified in the USA consensus sequence. The Immunomedicine antigenic peptides web-based tool predicted three linear epitopes for the Spanish consensus sequence, two linear epitopes in the USA, Hungarian, and Japanese consensus sequences, and one linear epitope for GER/L00919-K17/2014. Of the fifteen epitopes predicted by the SVMTriP online prediction tool and the eight epitopes predicted by the Immunomedicine antigenic peptides web-based tool, the VaxiJen v2.0 prediction tool corroborated the probable antigenicity of ten and four linear epitopes, respectively. Epitopes predicted by both SVMTriP and VaxiJen v2.

Discussion
This study characterizes the CDS and the 3′ and 5′ UTRs of the largest collection of PoAstV3 sequences (n = 20) associated with polioencephalomyelitis in swine, originating from 11 individual farms/systems. Consistent with other reports, PoAstV3-polioencephalomyelitis affects swine at different stages of production (weaning, grow-finish, adult animals) [15,42]. Additionally, a clear association between winter months and the occurrence of PoAstV3 neurotropic infections, as other authors have previously suggested, was not corroborated in this study [15]. On the contrary, PoAstV3 CNS cases did not have a clear seasonal distribution: summer (n = 9), autumn (n = 8), winter (n = 8), and spring (n = 4). This may be a result of current intensive production practices or indicate that seasonal distribution is not a prominent feature of PoAstV3 ecology.
As with other astroviruses, the genome of PoAstV3 contains three ORFs. PoAstV3 USA strains identified in this study follow the ORF delimitation described for sui generis US-MO 123 strain, with an nsp1a, nsp1ab and capsid protein composed of 855, 1356 and 809 residues, respectively. The range of sequence identity of nsp1a, nsp1ab and the capsid protein of the USA strains characterized in this study is 94.79-100%, 96.3-100% and 93.51-100%, respectively. Not surprisingly, the lowest percent identity was observed in the capsid protein. These twenty novel USA strains cluster with previously described strains into three well-defined clusters. Nine strains originating from six farms/systems and six strains originating from four farms/systems clustered with the original neurotropic PoAstV3 strain USA/IA/7023/2017 (KY940545) and original fecal strain US-MO 123 (JX556691), respectively. Five of these novel strains originating from two farms/systems clustered with strains USA/IA/48142/2018 (MT394895) and USA/IA/53214/2018 (MT394896), which recently resulted in PoAstV3-polioencephalitis following experimental inoculation of CDCD pigs [12,21,28,31,60,61]. Interestingly, strains from one farm/system (Farm N) clustered with both strain USA/IA/7023/2017 (KY940545) and strain US-MO 123 (JX556691). Based on the genetic analysis of strains originating from system N, it appears that within a system distinct PoAstV3 strains can cause neurological disease in sows, with one strain also being found in feces of suckling piglets (Table S1). In contrast, on Farm B a subset of highly homologous PoAstV3 strains (>99% nt ORF2) was detected in multiple polioencephalomyelitis cases spanning a year (2018). Similarly, on Farms E and O highly homologous PoAstV3 strains (>99% nt ORF2) were detected in two individual animals experiencing PoAstV3-associated neurological disease at the same time.
Previous studies involving neurotropic mamastroviruses have reported unsuccessful attempts at viral isolation. We have also pursued virus isolation in a few of the samples identified in this study using multiple cell lines (i.e., PK-15, ST, BHK, Vero cells [data not shown]) with similar negative results. The inability to isolate neurotropic strains from clinical samples appears to be a common factor for many neurotropic astroviruses, and this impediment may be partially compensated by the development of robust reverse genetics systems, as previously described, or by the development of specific permissive cell culture systems [44]. Hence, the study and characterization of RNA motifs associated with virus replication and translation initiation, such as those commonly found in both UTRs, may serve as an initial approach to develop future studies [17,38,40,62].
The 5′UTR of PoAstV3 is highly conserved among all strains. A single point mutation is observed in Hungarian strains in comparison with USA and Spanish strains. This point mutation is located at position 13 and involves a G>A substitution. Previously, the motifs CCAA, UGGU and GGCC were identified in human astrovirus; they are present in all PoAstV3 strains at positions 1, 19 and 25, respectively [36,62]. Strain GER/L00919-K17/2014 has the nucleotides GC as initial sequence bases, extending its UTR to 32 nt in contrast to the 5′UTR of other PoAstV3 strains, although if these are absent, the remainder of the 5′UTR is identical to that of Hungarian strains. The 5′UTR of all Japanese strains appears incomplete due to the absence of the pentamer CCAAA, although the available 5′UTR segments have a high degree of homology with other PoAstV3 strains, and they all possess the invariable motif GGCC. This high degree of sequence conservation within the 5′UTR of PoAstV3 and other mamastroviruses, and the presence of conserved motifs among PoAstV3 and human astroviruses, support the notion of common ancestral origins and possible recombination events between members of the genus Mamastrovirus, including human astroviruses.
In single-stranded RNA viruses, including human astroviruses, viral translation is initiated at the 5′UTR, and previous studies modelling the 5′UTR of human astrovirus have shown a high degree of conservation in both sequence and secondary RNA structure [36,62,63,64].
Our results indicate that the short 5′UTR sequence of PoAstV3, and therefore its secondary RNA structure, is highly conserved in strains found on different continents. Furthermore, our analysis suggests that the highly conserved 5′UTR of PoAstV3 is tightly regulated and may play an essential role in the virus life cycle. In human strains, the 5′UTR has been shown to contain putative protein binding sites; however, our analysis indicates that there is no marked homology between the predicted binding site motifs (e.g., PTB, SRSF2, SRSF5 and SRSF6) found in human astroviruses and the 5′UTR of PoAstV3. This may offer a plausible explanation of why PoAstV3 has not been successfully isolated in cell culture systems using human cell lines [16,17,22].
In members of the genus Mamastrovirus, experimental evidence has previously inferred the presence of a VPg linked to infectious viral RNA, and in human astrovirus, this has been shown to be essential for in vitro infectivity [36,45,46]. In this study we identified a conserved polyprotein within the ORF1ab of USA PoAstV3 strains spanning previously described cleavage sites that encompass the putative human astrovirus VPg. Furthermore, we found the RNA binding site [SATTFIQAKGKNKK] between residues 633-647, containing the conserved motif KGK(N/T)K previously characterized as the N-terminal end of calicivirus VPg, and the conserved proteolytic cleavage site [Q(K/A)] previously described in human astroviruses. In addition, the conserved motif EEY found in calicivirus VPg (including a tyrosine residue postulated to covalently associate with viral RNA) was also identified. This suggests that both the KGK(N/T)K and EEY motifs of the putative PoAstV3 VPg could bind viral RNA. Interestingly, the molecular weight of the putative PoAstV3 VPg is approximately 4.81 kDa, considerably smaller than those found in human astrovirus and caliciviruses, with estimated molecular weights of 11 kDa and 13-15 kDa, respectively [36,45,47,48]. Future research directed at inferring the role of this putative PoAstV3 VPg can aid in the development of in vitro infectivity studies designed to assess the relevance of this protein in cell culture systems [49,50].
In contrast to the 5′ UTR, multiple protein binding sites analogous to those of human astroviruses are observed in the 3′ UTR of PoAstV3. The 3′ UTR secondary structure of PoAstV3 is complex and may adopt multiple secondary RNA conformations. This appears to be related to the presence of the duplicated tandem motif UUGAUUUUCU(U/C)UUUUUUCUUUAGGC found in USA and Spanish strains (Figure 3). This sequence motif is absent in other European and Japanese strains and clearly demarcates the conformation of the secondary structure and putative binding sites of PoAstV3 strains. The 3′ UTR of Hungarian and Japanese PoAstV3 strains and German strain GER/L00919-K17/2014 is shorter and therefore possesses fewer protein binding sites in comparison to USA and Spanish strains (Figure 5). In GER/L00919-K17/2014, ESP/B333/2017, USA/IL/53219/20 and all Hungarian strains, SRSF5 is the first binding motif, found at position 19-23 after the ORF2 stop codon. In 16 American strains and strain ESP/B377/2017, this binding site is replaced by an hnRNPE2 binding site, generally spanning nucleotides 19-22 downstream of the ORF2 stop codon. Adjacent to these motifs, the first PTB binding site generally spans nucleotides 24-29. Multiple intercalated PTB binding sites are found in all PoAstV3 strains. The number of PTB motifs varies from two to four copies, with the fewest copies found in strain JPN/Bu4-2-1/2014 and the most in multiple USA strains. A dual binding site for PTB and hnRNPE2 is found as the last binding region of the 3′ terminus in all USA and Spanish strains. In contrast to the 3′ UTRs of human astroviruses, no putative binding sites for SRSF2, SRSF3 and SRSF6 were identified, indicating that the RNA binding motifs for these cellular proteins differ from those identified for human astroviruses.
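To make the notion of a duplicated tandem motif concrete, the sketch below counts copies of the repeat unit quoted above in an RNA string and checks whether two copies occur back to back. It is a minimal illustration rather than the method used in this study: the helper names are hypothetical, the two test sequences are synthetic toy strings built from the repeat unit (not real PoAstV3 3′ UTRs), and "tandem" is assumed to mean directly adjacent copies.

```python
import re

# Repeat unit quoted in the text; (U/C) is written as the class [UC].
REPEAT_UNIT = r"UUGAUUUUCU[UC]UUUUUUCUUUAGGC"

def _to_rna(seq):
    """Upper-case the sequence and accept DNA-style input (T -> U)."""
    return seq.upper().replace("T", "U")

def count_repeat_units(utr):
    """Count non-overlapping copies of the repeat unit."""
    return len(re.findall(REPEAT_UNIT, _to_rna(utr)))

def has_tandem_duplication(utr):
    """True if two copies of the unit are directly adjacent (assumption)."""
    return re.search(f"(?:{REPEAT_UNIT}){{2}}", _to_rna(utr)) is not None

# Synthetic toy sequences, for illustration only.
unit_u = "UUGAUUUUCU" + "U" + "UUUUUUCUUUAGGC"
unit_c = "UUGAUUUUCU" + "C" + "UUUUUUCUUUAGGC"
two_copies = "GAACG" + unit_u + unit_c + "AAAA"
one_copy = "GAACG" + unit_u + "AAAA"

print(count_repeat_units(two_copies), has_tandem_duplication(two_copies))
print(count_repeat_units(one_copy), has_tandem_duplication(one_copy))
```

A strain lacking the motif altogether would simply return a count of zero, which is how the shorter 3′ UTRs described above would be expected to appear in such a scan.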
These features and the absence of predicted RBPs in the 5′ UTR of PoAstV3 could indicate that translation initiation and replication of PoAstV3 differ from those postulated for human astroviruses, and that other cellular proteins and RNA binding sites may play a role in these processes. In addition, the 3′ UTR of all PoAstV3 strains has multiple additional amber, ochre, and opal stop codons after the standard amber stop codon (opal in strain USA/IA/53797GA/2019, Figure S2), suggesting the possibility of non-canonical elongation and termination mechanisms and the use of a stop-codon read-through strategy by PoAstV3, as previously described in other viruses [51,52]. Perhaps the presence of the repeat motif UUGAUUUUCU(U/C)UUUUUUCUUUAGGC containing multiple additional RBPs, as observed in the 3′ UTR of USA and Spanish strains, is an evolutionary strategy used by PoAstV3 to maximize its replication by stop-codon read-through elongation and termination [53]. Future in vitro studies analyzing the interaction of swine orthologues of PTB, SRSF2, SRSF3, SRSF5, SRSF6, TIA and hnRNPE2 with the 3′ UTR of PoAstV3 may elucidate whether binding of these cellular proteins is a requirement for PoAstV3 replication.
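The observation about additional stop codons can be illustrated with a short Python sketch that lists amber (UAG), ochre (UAA) and opal (UGA) codons in the reading frame of the canonical stop codon. This is a hedged example, not the analysis performed in this study: the function name `in_frame_stops` is hypothetical, the input is assumed to begin at the first base of the ORF2 stop codon, only the ORF2 reading frame is scanned, and the test sequence is a synthetic toy string.

```python
# Standard names for the three stop codons (RNA alphabet).
STOP_CODONS = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}

def in_frame_stops(rna_from_stop):
    """List (nucleotide offset, codon, name) for in-frame stop codons.

    The input is assumed to start at the first base of the ORF2 stop
    codon, so offset 0 is the canonical stop and any later entries lie
    in the 3' UTR in the same reading frame.
    """
    rna = rna_from_stop.upper().replace("T", "U")  # accept DNA-style input
    stops = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            stops.append((i, codon, STOP_CODONS[codon]))
    return stops

# Synthetic toy sequence, written codon by codon for readability.
toy = "".join(["UAG", "GAA", "CGA", "UAA", "GCU", "UGA", "CCC"])
stops = in_frame_stops(toy)
for offset, codon, name in stops:
    print(offset, codon, name)
```

Scanning the other two frames would only require starting the loop at offsets 1 and 2; whether out-of-frame stop codons are relevant to read-through is a separate question that this sketch does not address.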
In seminal work, Willcocks and Carter described two distinct stem motifs within the 3′ terminus of human astrovirus 1 (GenBank accession Z11682) [20,65]. In this study, two partially homologous regions were consistently found in the 3′ UTR of PoAstV3 sequences. The first motif, GAACGAGGGUACAG, is found in the 3′ UTR of all PoAstV3 strains immediately after the stop codon and makes up the 3′ terminus of s2m. The second motif, AAAUUGAUUU, is found consistently in USA and Spanish strains but not in other PoAstV3 strains, and is part of a transition sequence between subsequent stem structures in these strains. These two motifs [GA(A/T)CGAGGGUACAG, AAAUUGAUUU] are also conserved in the 3′ UTR of all classical human genotypes (HAstV1 to HAstV8), as shown by Monceyron et al., constituting the s2m and s1m motifs, respectively. In PoAstV3, however, owing to differences in the length of the 3′ UTR and in secondary RNA structure, the second motif AAAUUGAUUU is predicted to participate in the folding of two stem structures instead of one (s1m) as seen in human astroviruses. Human astroviruses possess two conserved stem motifs within their 3′ UTR, whereas PoAstV3 may present one to three stem structures, as shown by USA and Spanish strains (three stems), Hungarian strains (two stems), and Japanese strains and strain GER/L00919-K17/2014 (one stem) (Figure 5). This remarkable difference may indicate promiscuous binding by PoAstV3 strains with unknown cellular proteins. The difference is further accentuated when pseudoknots are predicted. Pseudoknot prediction indicates that the 3′ UTR of USA and Spanish strains remains a three-stem structure, while Hungarian strains adopt the conformation of a multibranched stem, similar to strain GER/L00919-K17/2014.
Thus far, no predicted pseudoknots have been described in the 3′ UTR of human astroviruses, although future in silico studies may corroborate the presence of pseudoknots in this region of human astroviruses and other members of the genus Mamastrovirus.
Recent work with human astroviruses has elucidated potential binding motifs for cellular proteins within their 3′ UTR. Our analysis indicates that homologous binding regions are present in the 3′ UTR of PoAstV3. The elongation of the 3′ UTR of USA and Spanish strains provides additional binding motifs when compared to strains with shorter 3′ UTRs (i.e., GER/L00919-K17/2014, Hungarian, and Japanese strains). The hairpins of the first and second stem motifs of USA and Spanish strains have binding sites for hnRNPE2 and SRSF5, respectively, and multiple PTB binding sites are predicted at the internal loops and bulges between these two stems. In contrast, GER/L00919-K17/2014 and Hungarian strains lack binding sites for hnRNPE2 within their secondary structure. Additionally, the stop codon is invariably present in the hairpin of the stem loop analogous to s2m in all strains, with the exception of Japanese strains.
It is plausible that the lack of success in isolating PoAstV3 in cell culture is related to intrinsic sequence motifs present in its genome and to the secondary conformation adopted by RNA binding motifs. Future in vitro studies should consider the association of the cellular proteins described herein and focus on the identification of unknown cellular proteins relevant for the replication of PoAstV3 in cell culture systems. In addition, the increasing number of PoAstV3 genomes available in public databases can help to understand hidden complexities of this virus and aid in the development of permissive cell lines for neurotropic astroviruses; however, the validity of the predicted RNA structures described in this study should be further assessed in the context of the global RNA structure of PoAstV3. Future analyses considering the global structure of PoAstV3 and other neurotropic mamastroviruses would be valuable for assessing possible interactions of the UTRs with other genomic regions. Global analyses may elucidate secondary or tertiary structures potentially crucial for translation and replication of neurotropic mamastroviruses and, therefore, useful for the development of cell culture systems to study viral pathogenesis.
In a previous study by Amimo et al., potential antigenic epitopes for PoAstV3 strain U460 were found in multiple locations of ORF2. These antigenic epitopes were all lo-  [31]. In astroviruses, VP27 is part of the dimeric spike protein and is considered highly variable and antigenic. Our results indicate that both antigenic epitopes identified are highly homologous within USA PoAstV3 strains and may elicit specific antibodies against PoAstV3. Future in vitro and in vivo studies are necessary to evaluate the efficacy and cross-protection of these antigens for the control of PoAstV3-associated CNS infections.

Conclusions
This study encompasses the analysis of the largest collection of PoAstV3-polioencephalomyelitis genomes, and the first analysis modelling the secondary structure of important putative regulatory RNA motifs in neurotropic mamastroviruses. Our results indicate marked differences from those previously described for human astrovirus, which may explain the inability of this virus, and perhaps other neurotropic mamastroviruses, to be isolated in cell culture. Finally, we described the presence of two highly conserved antigenic epitopes in the VP27 region of PoAstV3, which may allow the development of useful immunogens and diagnostic assays to diagnose and control cases of PoAstV3-associated polioencephalomyelitis.