Molecular Characterization of Small Ruminant Lentiviruses in Polish Mixed Flocks Supports Evidence of Cross Species Transmission, Dual Infection, a Recombination Event, and Reveals the Existence of New Subtypes within Group A

Small ruminant lentiviruses (SRLVs) are a group of highly divergent viruses responsible for infections in sheep and goats worldwide. In a previous study we showed that SRLV strains found in mixed flocks in Poland belonged to subtypes A13 and A18, but that study was restricted to a few flocks from the Małopolska region. The present work aimed to extend those findings by analyzing SRLVs in mixed flocks, including larger numbers of animals and flocks from different parts of Poland. On the basis of gag and env sequences, Polish SRLVs were assigned to subtypes B2, A5, A12, and A17. Furthermore, the existence of new subtypes, tentatively designated A23 and A24, is described for the first time. Subtypes A5 and A17 were found only in goats, subtype A24 was detected only in sheep, while subtypes A12, A23, and B2 were found in both sheep and goats. Co-infection with strains belonging to different subtypes was evidenced in three sheep and two goats originating from two flocks. Furthermore, three putative recombination events were identified within gag and env SRLV sequences derived from three sheep. Amino acid (aa) sequences of immunodominant epitopes in the CA protein were well conserved, while the Major Homology Region (MHR) showed more alterations, with unique mutations in sequences of subtypes A5 and A17. In contrast, aa sequences of the surface glycoprotein exhibited higher variability, confirming type-specific variation in the SU5 epitope. The number of potential N-linked glycosylation sites (PNGS) ranged from 3 to 6 per sequence, and the sites were located at different positions. The analysis of LTR sequences revealed that sequences corresponding to the TATA box, AP-4, AML-vis, and polyadenylation signal (poly A) were quite conserved, while considerable alteration was observed in AP-1 sites.
Interestingly, our results revealed that all sequences belonging to subtype A17 had a unique T to A substitution in the fifth position of the TATA box and lacked the 11 nt deletion in the R region that was noted in other sequences from Poland. These data reveal a complex picture of the SRLV population, with ovine and caprine strains belonging to groups A and B. We present multiple lines of strong evidence of dually infected sheep and goats in mixed flocks, as well as evidence that these viruses can recombine in vivo.


Introduction
Small ruminant lentiviruses (SRLVs) include two retroviruses, Caprine arthritis encephalitis virus (CAEV), and Maedi-visna virus (MVV), members of the genus Lentivirus of the Retroviridae family. Originally, MVV and CAEV were considered as distinct viral species restricted to sheep and goats, respectively, but several reports indicated that there are different lentiviral subtypes able to infect both sheep and goats [1]. SRLVs induce a multisystem disease with progressive and debilitating inflammatory lesions in the mammary gland, joints, lungs, and the brain. Diseases caused by SRLVs may take severe clinical from mixed flocks may help to understand the genetic and antigenic make-up of these viruses, phylogenetic relationship and their allocation into the recently established groups. Moreover, genetic studies may also be useful for the development of regionally-tailored diagnostic tests.

Animals and Samples
A total of 263 samples were investigated in this study, 163 from sheep and 100 from goats, originating from 17 mixed flocks in different geographic regions of Poland. Sheep and goats were housed in the same barn, with the possibility of direct contact and of indirect contact via water and feed troughs. Animals were randomly chosen from the flocks for serological study and were clinically healthy. Blood was collected into EDTA and serum tubes for molecular analysis and serology, respectively. Serum samples were tested for MVV/CAEV antibodies using the commercial ID Screen MVV/CAEV Indirect Screening test (IDvet, Grabels, France). EDTA-anticoagulated blood was used as a source of PBLs, which were isolated according to standard protocols [24]. Genomic DNA was extracted using a NucleoSpin Blood Quick Pure Kit (Macherey-Nagel GmbH & Co. KG, Dueren, Germany), according to the manufacturer's recommendations. The quality and quantity of DNA were evaluated with a Nanophotometer (Implen GmbH, Munich, Germany). All methods were performed in accordance with the relevant guidelines and regulations. Specifically, blood collection was approved (no. 37/2016) by the Local Ethical Committee on Animal Testing at the University of Life Sciences in Lublin (Lublin, Poland).

DNA Sequencing and Analysis
PCR products were purified using NucleoSpin Gel and PCR Clean-up (Macherey-Nagel GmbH & Co., Hamburg, Germany) and cloned into the pDRIVE vector (Qiagen GmbH, Hilden, Germany). Ligation products were used to transform EZ Competent Cells (Qiagen GmbH, Hilden, Germany) and plasmid DNA was extracted using the NucleoSpin Plasmid kit (Macherey-Nagel GmbH & Co., Hamburg, Germany). A minimum of five clones derived from each DNA sample were sequenced on a 3730xl DNA Analyzer (Applied Biosystems, Foster City, CA, USA) using the BigDye Terminator v3.1 Cycle Sequencing kit. The obtained SRLV sequences were trimmed and analyzed using the Geneious Pro 5.3 software (Biomatters Ltd., Auckland, New Zealand). All novel sequences reported in this study were submitted to the GenBank database under accession numbers OL348000-OL348058 for gag sequences and OL436259-OL436303 for env sequences. The evolutionary relationships of the analyzed strains with other published sequences were investigated by constructing phylogenetic trees from multiple alignments. The available sequences of the reference SRLV strains of genotypes A-C and E, representing isolates from a wide range of countries, were included in the analysis. In the present study, the SRLVs found by Colitti et al. [13] were renamed from A18 to A19 and from A19 to A20. All sequences were aligned using MUSCLE. Model testing was performed to select the best evolutionary model based on the Bayesian information criterion (BIC) and Akaike information criterion (AIC). According to the results, the General Time Reversible (GTR) model with a gamma distribution (+G) with 5 rate categories, assuming that a certain fraction of sites is evolutionarily invariable (+I), was applied to infer phylogenetic trees using the maximum likelihood (ML) and neighbor-joining methods. The reliability of the phylogenetic relationships was evaluated by nonparametric bootstrap analysis with 1000 iterations.
Alignment, model testing, and tree building were performed using the MEGA 6 application [29]. The tree topology was confirmed using the Bayesian method with the GTR model implemented in the Geneious software. Nucleotide and amino acid sequence percent identity (the percentage of bases/residues that are identical) was estimated using the Geneious software, while pairwise genetic distances were calculated with the MEGA 6 software. The nonsynonymous (dN) and synonymous (dS) substitution rates were calculated using SNAP (Synonymous Non-synonymous Analysis Program) v 2.1.1 [30]. Potential N-linked glycosylation sites were identified using the N-GlycoSite tool [31].
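For readers who want to reproduce the identity and distance figures on their own alignments, the following is a minimal sketch of percent identity and the uncorrected p-distance between aligned sequences. It is not the Geneious or MEGA 6 implementation: the function names are illustrative, columns containing a gap in either sequence are simply skipped (an assumption), and no substitution model is applied.

```python
from itertools import combinations
from statistics import mean


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected distance in percent (100 minus percent identity)."""
    return 100.0 - pairwise_identity(seq_a, seq_b)


def mean_pairwise_distance(sequences: list[str]) -> float:
    """Mean p-distance over all pairs, e.g. within one subtype or one flock."""
    return mean(p_distance(a, b) for a, b in combinations(sequences, 2))
```

Model-based distances, such as those computed under GTR+G+I, will generally be larger than the p-distance for divergent sequences, so values from this sketch are only a lower-bound approximation.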

Analysis of Recombination
To detect possible recombination events, the Recombination Detection Program version 4 (RDP4) was used with default settings [32]. The software applies seven primary exploratory recombination signal detection methods: RDP [33], GENECONV [34], BootScan [35], MaxChi [36], Chimaera [37], SiScan [38], and 3Seq [39]. The beginning and end breakpoints of the potential recombinant sequences were also defined by the RDP4 software. Putative recombination events were considered significant when p ≤ 0.01 was observed for the same event using four or more algorithms.
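The acceptance rule stated above (p ≤ 0.01 in at least four of the seven methods) can be expressed as a short filter. This is a sketch only: the function names and the dictionary layout are hypothetical, the p-values in the usage note are invented for illustration, and RDP4 itself does not export data in this form.

```python
from typing import Mapping, Optional


def supporting_methods(p_values: Mapping[str, Optional[float]],
                       alpha: float = 0.01) -> list[str]:
    """Return the methods whose p-value is at or below alpha.

    A value of None means the method did not detect the event.
    """
    return [name for name, p in p_values.items() if p is not None and p <= alpha]


def is_significant_event(p_values: Mapping[str, Optional[float]],
                         alpha: float = 0.01,
                         min_methods: int = 4) -> bool:
    """Apply the rule: significant if at least min_methods methods support it."""
    return len(supporting_methods(p_values, alpha)) >= min_methods
```

For example, an event with illustrative p-values `{"RDP": 1e-5, "GENECONV": 3e-4, "BootScan": 2e-3, "MaxChi": 8e-3, "Chimaera": 0.04, "SiScan": None, "3Seq": 0.2}` is supported by four methods and would be retained.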

Phylogenetic Analysis
Out of 263 serum samples, 84 (32.0%) (53 from goats and 31 from sheep) originating from eight flocks were positive in the ELISA test. DNA extracted from the blood of serologically positive animals was used to amplify the CA fragment of the gag gene for phylogenetic analysis. Proviral DNA from 54 samples, originating from 26 sheep and 28 goats from six different flocks, was successfully amplified and sequenced (Table 2). The obtained sequences were aligned with reference sequences representing the genotypes of SRLVs described to date; however, we included only sequences of appropriate length, matching the data obtained in this study. An unrooted phylogenetic tree is shown in Figure 1. Sequences of Polish strains analyzed in this study were widely distributed on the tree, clustering in subtypes B2, A5, A12, and A17. In particular, sequences of samples s#21, s#20, g#9510, g#3540, s#14, g#0599, g#3535, g#0788, g#0580, and s#29 from flock 16 clustered within subtype B2 and were closely related to Polish strains #11, #2437 and #4106, which were previously detected in the same Polish region (mean nt distance 2.6% ± 0.5%). Sequences originating from sheep #2590, #3691, #3275, #4315 from flock 13, from sheep #9855 from flock 14 and from sheep #0334 from flock 10 also clustered within subtype B2 but were more closely related to sequences of Spanish strain #496 and Swiss strain #5720 (mean nt distance 7.3% ± 1.8%). Sequences originating from goats #8039, #8046, #9692, #1318, and #8008 from flock 13 were closely related to Polish strains #6038, #5819, and #5826 (mean nt distance 1.3% ± 0.4%), representing subtype A5, while sequences originating from goats #9431, #8172, #6909, #5621, #1580, #1485, #5654, and #5686 from flock 17 were more closely related to Polish strains #5616, #8344, and #3085, representing subtype A17 (mean nt distance 2.6% ± 0.6%).
Twenty sequences were assigned to the A12 subtype, and phylogenetic analysis clearly showed the existence of three separate subgroups within this cluster. Sequences from sheep #40, #12, #33, #3, #16, #1, #13, #6, #4 and #14 from flock 16 (cluster I) were closely related to sequences of Polish strains #4007, #4819 and #1202, with a mean nt distance of 4.0% ± 1.4%. Sequences from goats #7219, #8891, #7102, #7134, #7096 and #6808 from flock 10 (cluster II) and sequences from goats #8699, #3533, #9509 and #3535 from flock 16 (cluster III) were closely related to sequences of Polish strains #15, #10 and #13 but formed separate clusters (mean nt distance 7.9% ± 1.2%). Additionally, sequences from sheep #1622, #4315, #4018, #2590, and goat #8046 from flock 13, as well as sequences from sheep #5023, #3249, #3188, #3225, and #3201, formed new clusters within group A, which could be tentatively named A23 and A24, respectively. The affiliation of these new clusters was supported by high bootstrap values (≥84). Sequences of the proposed subtypes A23 and A24 had a mean intra-subtype nucleotide distance of 4.3% and 3.4%, respectively. The mean genetic distances between the new subtypes and other subtypes representative of genotype A varied from 13.1% to 24.6% for subtype A23, and from 12.5% to 22.3% for subtype A24 (Table 3). To evaluate the robustness of our analysis, we also performed phylogenetic analysis using the neighbor-joining and Bayesian inference methods (Supplementary Materials Figures S1 and S2), which resulted in the same classification of all strains, thereby supporting the existence of the new subtypes A23 and A24. Subtypes A5 and A17 were found only in goats, while subtype A24 was detected only in sheep. In contrast, subtypes A12, A23 and B2 were found in both sheep and goats. Dual infection with B2 and A12 was found in sheep #14 and goat #3535 from flock 16.
Co-infection with B2/A23 was detected in sheep #2590 and sheep #4315 from flock 13, while co-infection with subtypes A5/A23 was detected in goat #8046, also originating from flock 13. In only two flocks did the detected SRLV sequences represent a single genotype: only subtype A24 circulated in flock 12, and only subtype A17 in flock 17. This was confirmed by pairwise nucleotide comparison, in which distances estimated among sequences derived from flocks 12 and 17 varied from 1% to 2.9% and from 0% to 4.3%, respectively. On the other hand, highly divergent SRLV subtypes were found in four flocks: subtypes B2/A12 in flock 10, subtypes A5/A23/B2 in flock 13, subtypes A24/B2 in flock 14, and subtypes B2/A12 in flock 16.

Table 3. Estimates of mean evolutionary divergence between subtypes of genotype A (inter-genotype) based on the CA fragment of the gag gene.

All samples were also used to amplify a 608 bp fragment of the env gene. Out of 54 tested samples, 45 (20 from sheep and 25 from goats) were successfully amplified and, after sequencing, were subjected to phylogenetic analysis. The phylogenetic tree (Figure 2) confirmed that Polish sequences belonged to subtypes B2, A5, A12, and A17, as well as to the newly identified subtypes A23 and A24. Similarly to the gag phylogenetic assignment, env sequences from sheep #2590, #4315, #3275, and #1622 from flock 13 formed the new subtype A23, while sequences from sheep #3188, #3249 and #3201 from flock 12 formed subtype A24. The mean nucleotide distance between sequences belonging to the A23 subtype and those belonging to the A24 subtype was 22.0%, while the mean distance between these subtypes and other subtypes within group A ranged from 18.5% to 27.8% and from 20.4% to 27.6% for A23 and A24, respectively. Sequences from sheep #9855 and #3691 (assigned to subtype B2 on the basis of gag) and from sheep #4018 (assigned to subtype A23 on the basis of gag) now formed a separate branch clustering closely with the North American MVV strains 85/34 and S93.
In addition, the sequence from sheep #5023, which on the basis of the gag fragment was affiliated with subtype A24, now formed a separate cluster.

Identification of Putative Recombination
A recombination analysis of the sequences used for phylogenetic analysis was performed to verify whether sequences obtained in this study resulted from recombination of already known sequences. On the basis of the gag alignment, one putative recombination event was detected by five statistical methods with high significance and reliability. The recombinant sequence #14(2) was detected in a sheep from flock 16 co-infected with strains A12/B2. In this recombination event, the beginning and ending breakpoints were located at nucleotides 27 and 320 of the alignment, and the major and minor parents were #4007, representing subtype A12 (90.9% similarity), and #0599, representing subtype B2 (100% similarity), respectively (Figure 3a). On the basis of the env alignment, two putative recombination events were detected. Four methods detected a recombination event in #13s4018 between positions 64 and 276 of the alignment, with #13s4315 (subtype A23) as the minor parent and an unknown major parent, for which #5819 (subtype A5) was suggested. The results also indicated that #13s3691 arose from recombination between the same breakpoint positions, 64 and 276 of the alignment, but with #13s4315 (subtype A23) as the major parent and an unknown minor parent, for which #5819 (subtype A5) was again suggested (Figure 3b).

Analysis of Immunodominant Regions
To analyze the conservation of immunodominant regions of sequences of Polish strains analyzed in this study, their deduced amino acid (aa) sequences of capsid and surface proteins were aligned with the aa sequences of reference parental strains, Cork and K1514. Pairwise percent identity of the gag amino acid sequences of Polish SRLVs was high and ranged from 81.3% to 100%. Furthermore, Polish sequences shared 82.1-96.3% and 92.4-97.7% amino acid sequence identity with strains Cork and K1514, respectively. Analysis revealed that all gag sequences belonging to group B had glycine-glycine (GG) motifs and all MVV-like sequences had asparagine-valine (NV) motif, typical for strains belonging to group A (Supplementary Materials Figure S3). Immunodominant epitope 3, situated at the C-terminal end of the capsid protein, was highly conserved between sequences belonging to group A and B, while the analysis of sequences in epitope 2 revealed a perfectly conserved region (GKLNEEAERW) located at the N-terminal part of epitope and distinct region at the C-part of the epitope specific for MVV-like (VRQNPPGP) and CAEV-like (RRNNPPPP) strains. In the Major Homology Region (MHR), which is usually highly conserved in the gag gene of all retroviruses, some alterations were present within group A (Figure 4). In particular, sequence from goat #9692 had isoleucine (I) instead of valine (V) at the fourth position compared to strain #K1514, while sequences from goat #6808 and #8699 had threonine (T) instead of asparagine (N), and valine (V) instead of isoleucine (I) at the eighth and 16th positions, respectively. All sequences representing subtype A17 and sequence from goat #6808 representing subtype A12 had substitutions arginine (R) instead of lysine (K) at the fifth position, while sequences from goat #3535 and #9509, representing subtype A12, had substitution arginine (R) instead of lysine (K) in the seventh position of MHR. 
All goat-derived sequences representing subtype A5 and sequences from goats #3535 and #9509 had glutamic acid (E) instead of aspartic acid (D) at the 14th position. Six sequences had serine (S) instead of threonine (T) and 35 sequences had asparagine (N) instead of threonine (T) at the ninth position compared to the sequence of strain K1514. Group B sequences had more conserved MHR sequences. Compared to the sequence of strain Cork, all sequences had a substitution at the eighth position (T/S or T/G). The sequence from sheep #0334 had serine (S) instead of asparagine (N) and serine (S) instead of proline (P) at the ninth and eleventh positions, respectively. Five sequences representing subtype B2, which formed subclusters on the phylogenetic tree, had a unique substitution, threonine (T) instead of alanine (A), at the 16th position of the MHR. The env aa sequences of Polish strains were more heterogeneous than the gag sequences, showing 54.1-100% similarity to each other and 57.1-73.1% and 59.4-73.7% to strains K1514 and Cork, respectively. Sequences of the variable region (V4) of the analyzed strains differed significantly. Comparison of aa sequences in epitope SU5 revealed a conserved region (VRAYTYGV) located at the N-terminal part of the epitope. The variable region was conserved among the sequences of strains belonging to subtypes A5, A23, A24, and B2, showing type-specific variation, whereas sequences belonging to subtypes A12 and A17 showed intra-subtype variability (Figure 5). Sequences could be divided into groups corresponding to the groups formed on the phylogenetic tree. To determine whether the nucleotide sequences encoding Gag and Env proteins evolved under positive selective pressure, the ratio of nonsynonymous to synonymous substitutions (dN/dS) was estimated. For both fragments, the dN/dS ratio for caprine and ovine sequences was below 1, indicating negative (purifying) selection.
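To make the dN/dS estimate concrete, the sketch below computes a simplified pairwise dN/dS in the spirit of the Nei-Gojobori counting approach. It is not the SNAP implementation: codon pairs differing at two or three positions are skipped rather than averaged over mutational pathways, changes to stop codons are counted as nonsynonymous, and all function names are illustrative assumptions. Input sequences are assumed to be aligned and in frame.

```python
import math
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_sites(codon: str) -> float:
    """Number of synonymous sites in a codon (each site contributes 0 to 1)."""
    aa = CODON_TABLE[codon]
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    total += 1.0 / 3.0
    return total


def count_changes(seq_a: str, seq_b: str):
    """Return (syn_diffs, nonsyn_diffs, syn_sites, nonsyn_sites) for two sequences."""
    syn = nonsyn = codons = 0
    s_sites = 0.0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        c1, c2 = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        if c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue  # skip codons with gaps or ambiguous bases
        codons += 1
        s_sites += (synonymous_sites(c1) + synonymous_sites(c2)) / 2.0
        if sum(a != b for a, b in zip(c1, c2)) == 1:
            if CODON_TABLE[c1] == CODON_TABLE[c2]:
                syn += 1
            else:
                nonsyn += 1
        # simplification: codons with 2-3 differences are not counted
    return syn, nonsyn, s_sites, 3 * codons - s_sites


def jukes_cantor(p: float):
    """Jukes-Cantor correction; None when the proportion is saturated."""
    x = 1.0 - 4.0 * p / 3.0
    return -0.75 * math.log(x) if x > 0 else None


def dn_ds(seq_a: str, seq_b: str):
    """Simplified pairwise dN/dS, or None when it cannot be computed."""
    syn, nonsyn, s_sites, n_sites = count_changes(seq_a, seq_b)
    if s_sites == 0 or n_sites == 0:
        return None
    ds, dn = jukes_cantor(syn / s_sites), jukes_cantor(nonsyn / n_sites)
    if ds is None or dn is None or ds == 0:
        return None
    return dn / ds
```

A ratio below 1 from such a calculation is read as purifying selection, a ratio near 1 as neutral evolution, and a ratio above 1 as positive selection. Averages over whole fragments can mask individual sites under positive selection.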
Additionally, the number of potential N-linked glycosylation sites (PNGS) was estimated and ranged from 3 to 6 in respective sequences. In all caprine sequences of subtype A17 from flock 17, and in all caprine sequences of subtype A12 from flock 10, four potential N-linked glycosylation sites were detected, in positions 8, 14, 32, and 39 in the alignment. These four PNGS were also detected in ovine sequence #5023 from flock 14, which formed a new cluster within group A on the phylogenetic tree. In two out of three ovine sequences, representing subtype A24 from flock 14, and in three sequences (14s9855, 13s4018, and 13s3691) which formed a cluster distinct from known A and B subtypes additional glycosylation site, at position 53, was detected. In sequences from flocks 13 and 16, 3-6 N-linked glycosylation sites were observed, but with some heterogeneity in the positions between different isolates. In most ovine sequences representing subtype A23 from flock 13, four PNGS were detected
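Potential N-linked glycosylation sites are conventionally predicted from the sequon N-X-S/T, where X is any residue except proline. The sketch below locates such sequons in a single amino acid sequence. It is a minimal illustration rather than the N-GlycoSite tool, the function name is hypothetical, and positions are reported on the ungapped sequence, not on the alignment coordinates used in the text.

```python
import re

# N, followed by any residue except proline, followed by S or T.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_pngs(protein: str) -> list[int]:
    """Return 1-based positions of the asparagine in each N-X-[S/T] sequon."""
    ungapped = protein.upper().replace("-", "")
    return [match.start() + 1 for match in SEQUON.finditer(ungapped)]
```

Applied to each translated env sequence, `len(find_pngs(seq))` gives the per-sequence PNGS count. To compare positions between isolates, as in the text, the ungapped positions would need to be mapped back onto the alignment columns.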

Analysis of LTR Sequences
LTR sequences were successfully amplified from 47 of 54 samples (87%) and aligned with sequences of the prototype strains K1514 and CAEV-Cork, representative of SRLV groups A and B, respectively. The nucleotide pairwise percent identity of Polish sequences ranged from 72.3% to 100%. Furthermore, Polish sequences shared 48.4-53.8% and 74.0-85.8% sequence identity with strains Cork and K1514, respectively. The LTR regions analyzed in this study contained two AP-1, one AML(vis), and one AP-4 putative motifs, without any duplication or deletion in their U3 regions compared with the reference sequences (Figure 6). Sequences corresponding to the TATA box, AP-4, and polyadenylation signal (poly A) were quite conserved. All sequences belonging to subtype A17 had a unique T to A substitution in the fifth position of the TATA box. The AML(vis) motif was present in all Polish samples and was identical to the sequence of the K1514 strain, except for sequences originating from goats #8891 and #7219, which had an A to G substitution in the second position of AML(vis). The AP-1 sites were less conserved. The first AP-1 site was identical in all Polish sequences analyzed and in the K1514 strain (TCATGTA), but differed from that of the Cork strain (TGACATA), while considerable nucleotide changes were observed in the second AP-1 site. All Polish sequences analyzed, except those representing subtype A17, had the specific 11 nt deletion in the R region. In contrast, A17 sequences carried the CCGAAGGAAAG sequence at this site, almost identical to that in the K1514 strain. Furthermore, all Polish sequences had a 13 nt deletion in the U5 region.

Discussion
Previous studies revealed that the SRLV population in Poland is highly heterogeneous. SRLVs isolated so far from sheep and goats in Poland belonged to the well-known subtypes B1, B2, A1, A5, and A16, as well as to subtypes A12, A13, A17, and A18, which have been detected only in Poland. Since mixed flocks promote interspecies transmission and the emergence of new variants [18][19][20], the aim of this study was to perform genetic characterization of field SRLV strains present in Polish mixed flocks to gain better insight into their heterogeneity.
The current SRLV phylogeny, consisting of five main groups divided into multiple subtypes, emphasizes the high genetic diversity among SRLV strains. In 2004, Shah et al. proposed a classification of SRLVs based on sequences of the gag-pol region (1.8 kb) [40]. However, due to low proviral load and high genetic variability of SRLV strains, in many cases this fragment could not be obtained [12,18,41,42]. As a result, classification of SRLVs is more often performed on a highly conserved ~0.4 kb gag fragment, for which sequences representing most subtypes are available. Using this fragment, we confirmed circulation of subtypes B2, A5, A12, and A17 in the analyzed flocks and revealed circulation of the new subtypes A23 and A24. These new subtypes were closely related to subtypes A13 and A18, which were previously detected only in Poland, suggesting their common origin. The clear separation of these subtypes from other Polish subtypes, located in distinct clusters on the phylogenetic tree, suggests the presence of at least three independently evolving SRLV genetic lines circulating in Poland. As previously reported, some strains can cluster with different subtypes depending on the fragment analyzed. Such an observation was noted for the subtype A19, A20, and B5 strains ([10], this report). This suggests that these strains could have been generated by recombination. Thus, to confirm the existence of our putative new subtypes, we performed phylogenetic analysis using variable env sequences, which confirmed that they formed separate clusters with no clear relation to any of the previously described group A subtypes. Lentiviral genomes are among the most rapidly evolving known. Most mutations are introduced during the reverse transcription stage of the viral life cycle as a consequence of the low fidelity of reverse transcriptase, which has no proofreading activity.
However, interspecies transmission, co-infection, and recombination are the main mechanisms that contribute significantly to genetic variability and accelerate viral evolution [17,43,44,45]. Direct evidence of interspecies transmission of SRLVs from sheep to goats and vice versa has been provided by the detection of most subtypes in both sheep and goats [16,18,24,46]. There is also evidence that keeping sheep and goats together in mixed flocks promotes cross-species infections, which can result in the emergence of new variants [18][19][20]. In our study, subtypes A5 and A17 were found only in goats, subtype A24 was detected only in sheep, while subtypes A12, A23, and B2 were found in both sheep and goats. This confirms the ability of SRLVs to frequently cross the species barrier under natural conditions. Furthermore, co-infection with strains belonging to different subtypes was evidenced in three sheep and two goats originating from two flocks. Sheep #14 and goat #3535 from flock 16 were co-infected with B2/A12, while two sheep (#2590 and #4315) and one goat (#8046) from flock 13 were co-infected with B2/A23 and A5/A23, respectively. The existence of co-infected animals can be explained by the circulation of more than one subtype in these flocks. Our results revealed that highly divergent SRLV subtypes were present in four out of six flocks (subtypes B2/A12, A5/A23/B2, A24/B2, and B2/A12). In flock 13, sequences belonging to subtypes A5, A23 and B2 were found. All goats were infected with strains representing subtype A5, while in sheep only sequences belonging to subtypes A23 and B2 were found. Only one goat was co-infected with A5/A23, which strongly indicates that subtype A23 was in this case transmitted from sheep to goat. The mean gag nucleotide distance between the A23 sequences found in this co-infected goat and the A23 sequences found in sheep was 3.5%, strongly supporting their common origin.
Furthermore, two sheep from this flock were co-infected with subtypes A23 and B2. However, it is difficult to determine which subtype was introduced to the flock first, because both subtypes were detected at the same frequency. The mean gag nucleotide distance within the B2 and A23 sequences was 0.5% and 3.9%, respectively, while the mean nucleotide distance between these subtypes was 32.9%. This confirms the circulation of two distinct subtypes in the sheep from this flock, rather than the evolution of homologous strains. In flock 16, circulation of two subtypes, A12 and B2, was detected in both sheep and goats. Because most sheep were infected with subtype A12 and most goats were infected with subtype B2, subtype A12 was most likely transmitted from sheep to goats and subtype B2 from goats to sheep. Interspecies transmission was also observed in flock 10 and flock 14, where subtypes A12/B2 and B2/A24 were found. The high genetic distance between the sequences found in these flocks strongly suggests that they originated from different sources. The introduction of different SRLV subtypes into the analyzed flocks could not be traced back because the history of animal movements is not known, but it most likely resulted from the purchase of infected animals from other flocks. The introduction of new animals to a flock for genetic improvement is common practice in Poland and undoubtedly represents a high risk factor for the spread of SRLVs, which may result in higher diversity of SRLVs [40,47,48]. In Poland, this spread is also facilitated by the lack of SRLV eradication programs and of any veterinary controls.
Co-infection with at least two subtypes offers opportunities for viral recombination which is believed to be a powerful source of genetic variability of lentiviruses leading to the emergence of new strains. Previous studies have provided clear evidence of recombination between SRLVs belonging to groups A and B, as well as between subtypes belonging to the same group [18,20,24,45,46,49,50]. In the present study, three examples of recombinations have been found. One putative recombination event was detected in the gag fragment in sheep #14 co-infected with strains A12/B2, while on the basis of env fragments, two putative recombination events were detected in two sheep from flock 13. Recombination is a frequent event in the envelope gene which could lead to the generation of chimerical SRLVs with altered cell tropism, pathogenicity and transmission efficiency which may endanger not only domestic ruminants but also other animal species [51][52][53].
Analysis of sequences is important, not only for evaluating the spread of SRLV subtypes but also for gaining knowledge of antigenic variability. Alterations in the amino acid sequences of immunodominant epitopes determine their antigenicity and may impact on sensitivity of serological tests. The primers used in this study allowed the sequencing of two immunodominant epitopes of capsid protein-epitopes 2 and 3, which were identified by Rosati et al. [54] and which are used in SRLVs ELISA tests. The CA is the major viral core protein and antibodies against this antigen are usually first generated in sheep and goats infected with SRLVs and remain detectable for a long time [55]. Our results, which are consistent with other reports, showed conservation of epitope 3 and the amino-terminal part of epitope 2 (GKLNEEAERW), as well as the group-specific part located at the C-termini of epitope 2 (group A, VRQNPPGP, group B, RRNNPPPP) [24,25,56,57]. More variability was found in the MHR, which is usually highly conserved in many retroviruses [12,58,59]. MHR of Polish MVV-like sequences had more alteration and revealed unique mutations in sequences of subtypes A5 and A17. All Polish sequences representing subtype A17 had unique substitution lysine (K) to arginine (R) at the fifth position of MHR, while substitution aspartic acid (D) to glutamic acid (E) at the 14th position of MHR was exclusively found in Polish sequences representing subtype A5. Because these subtypes were found in goats, we hypothesize that these changes may have resulted from cross-species transmission, as many genetic changes after SRLVs cross-species transmission were reported in many publications [1,16,20,40,51]. However, subtypes A17 and A5 were detected only in goats of Polish White Improved/Polish Fawn Improved and Carpathian breeds, respectively. This may also suggest that mentioned changes could have arisen as a result of long host-virus adaptation and evolution.
As expected, the SU sequences showed more extensive variations in comparison to CA sequences. Previous sequence analysis of SRLV strains defined five major variable regions of SU [60], but sequences obtained in this study allowed for the analysis of only the V4 and V5 regions. Although the dN/dS ratio for V4V5 sequences was below 1, indicating the existence of a purifying selection, we noticed that some regions may be under positive selection. Our results revealed that the V4 region of Polish strains differed significantly, and that the differences mainly occurred within a region previously proposed to be part of a variable, conformational neutralization epitope [61,62]. Moreover, insertion and deletion only occurred in highly variable region (HV2) [61], confirming that this region underwent rapid sequence evolution during SRLVs infection. Mutation in this region resulted in escape from neutralization and the creation of a new type of neutralization specificity [62]. Thus, it is suggested that the V4 region may have an analogous function to V3 in HIV-1 since the V3 loop of HIV-1 is the major target for neutralizing antibodies [45,63]. Interestingly, in the V4 region, a 'signature pattern' related to different clinical status in sheep and goats has been found [64]. Comparison of aa sequences in epitope SU5 revealed conserved region (VRAYTYGV) located at the N-terminal part of the epitope and highly variable motif at the C-terminal which was well-conserved among strains belonging to the same subtypes confirming that the SU5 epitope is responsible for type-specific immune response allowing strain-specific diagnosis. Moreover, this observation appears to support the hypothesis that this region may function as a decoy antigen and, therefore, in a given subtype, evolves under negative selection [65]. 
Additionally, the V4V5 regions contain the majority of the conserved N-linked glycosylation sites and cysteine residues, suggesting that they form a highly constrained and surface-exposed domain [60]. The attached sugars, or "glycans," play several important roles in the infective cycle of viruses. In HIV-1, effects of glycosylation on viral replication, glycoprotein cleavage, CD4 binding activity, and coreceptor usage have been documented. Removal of a glycan may indirectly increase viral activity by causing a shift in the V3 loop, leading to an increase in coreceptor binding, while a high density of glycans protects the virus against neutralizing antibodies as a "glycan shield" [66,67]. In this study, we observed differences in the number of potential N-linked glycosylation sites (PNGS), which ranged from 3 to 6 in the respective sequences. Most of the sequences had four PNGS, detected at positions 8, 14, 32, and 39 in the alignment, suggesting that these PNGS may be evolutionarily conserved. Although some glycans are evolutionarily conserved, the number of others may vary extensively within infected individuals, since they may appear or disappear over the course of an infection in a single host [66,67]. Thus, variation in the number and position of PNGS in Polish sequences may result from host-species adaptation of SRLVs, as was evidenced for HIV-1. In general, the average number of predicted glycosylated positions was relatively conserved between the different subtypes of SRLVs.
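As an illustration of how PNGS can be counted and located, the following minimal Python sketch scans an amino acid sequence for the canonical N-linked glycosylation sequon N-X-S/T, where X is any residue except proline. It is not the tool used in this study. The function name `find_pngs` and the example sequences are hypothetical, and the input is assumed to be an ungapped sequence, so the reported positions are 1-based positions in that sequence rather than alignment columns.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The zero-width lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_pngs(aa_seq):
    """Return 1-based positions of the Asn in each N-X-S/T sequon (X != Pro).

    Assumes an ungapped amino acid sequence in one-letter code.
    """
    seq = aa_seq.upper()
    return [match.start() + 1 for match in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from this study.
    toy_sequences = {
        "toy_seq_1": "NGSANPTNKT",
        "toy_seq_2": "AANNTSCK",
    }
    for name, seq in toy_sequences.items():
        sites = find_pngs(seq)
        print(name, "PNGS count:", len(sites), "positions:", sites)
```

Applying such a scan to each translated V4V5 sequence gives per-sequence PNGS counts and positions that can then be compared between subtypes. A sequon only marks a potential site. Whether it is actually glycosylated cannot be inferred from the motif alone.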
U3-R regions contain elements described as important for the regulation of SRLV transcription and replication. Therefore, sequence variation in LTRs might affect the interactions with cellular factors and alter viral expression and replication. Furthermore, it was also demonstrated that LTR sequence variability may affect tissue tropism and disease outcome [63]. The analysis of Polish LTR sequences revealed that sequences corresponding to the TATA box, AP-4, AML-vis, and polyadenylation signal (poly A) were quite conserved, as previously reported for strains from different geographic areas. This sequence conservation argues in favor of their importance in the replication strategies of SRLVs [12,26,42,68–70]. On the other hand, considerable alteration was observed in AP-1 sites, which confirmed previous findings suggesting that AP-1 sites may be functional despite some changes [70,71]. AP-1 binding sites are important for the regulation of SRLV expression in macrophages and are required for phorbol ester-inducible gene expression of SRLVs. Multiple copies of AP-1 sites presumably allow transcription regardless of mutations [69–71]. The TATA box is present and highly conserved in all retroviruses [70]. Interestingly, our results revealed that all sequences belonging to subtype A17 had a unique T to A substitution at the fifth position of the TATA box. Mutation in the TATA box of the HIV-1 sequence changes the binding of the TATA-binding protein (TBP), which leads to a decrease in transcription [72,73]. The significance of the mutation detected in the TATA box of subtype A17 SRLVs is unknown and warrants further study. Additionally, our results revealed that none of the sequences representing subtype A17 had the 11 nt deletion in the R region that was noted for other sequences from Poland. A correlation between this deletion and the occurrence of clinical signs in infected animals has been suggested [51,69,74], but our results did not confirm this assumption. All animals, whether infected with viruses carrying the deletion or with viruses lacking it, were without clinical signs.
In conclusion, the results of this work extend the current knowledge on the distribution of SRLV subtypes in sheep and goats from Poland. Our results showed that SRLVs circulating in Poland are highly heterogeneous, with ovine and caprine strains belonging to groups A and B. We present multiple lines of strong evidence of dually infected sheep and goats in mixed flocks, and evidence that these viruses can recombine in vivo. The results of the phylogenetic analysis revealed the existence of putative new subtypes, which should prompt consideration of an update of the current SRLV classification. Furthermore, genetic analysis of Polish SRLV sequences revealed specific alterations in the gag and LTR fragments of some subtypes. Thus, the isolation of these viruses and characterization of their biological properties should be performed to evaluate their pathogenic potential.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/v13122529/s1, Figure S1: Neighbor-joining phylogenetic tree based on alignment of CA fragment of gag gene. Figure S2: Bayesian phylogenetic tree based on alignment of CA fragment of gag gene. Figure S3: Alignment of deduced amino acid sequences of immunodominant epitopes of capsid protein and major homology region (MHR) of the SRLVs obtained in this study and reference strains.