Genomic Epidemiology and Heterogeneity of SRLV in Italy from 1998 to 2019

Small ruminant lentiviruses (SRLV) are viruses that reverse-transcribe RNA to DNA and show high rates of genetic variability. SRLV affect animals with strains specific for each host species (sheep or goats), resulting in a range of clinical manifestations that depend on the virulence of the strain, the host’s genetic background and the farm production system. The aim of this work was to present an up-to-date overview of the genomic epidemiology and genetic diversity of SRLV in Italy over time (1998–2019). In this study, we investigated 219 SRLV samples collected from 17 different Italian regions in 178 geographically distinct herds by CEREL. Our genetic study was based on partial sequencing of the gag-pol gene (800 bp) and phylogenetic analysis. We identified new subtypes with high heterogeneity, new clusters and recombinant forms. The genetic diversity of Italian SRLV strains may have diagnostic and immunological implications that affect the performance of diagnostic tools. Therefore, it is extremely important to increase the monitoring of genomic variants in order to improve control measures.


Introduction
Small ruminant lentiviruses (SRLV) are heterogeneous retroviruses belonging to the Lentivirus genus of the Retroviridae family [1].
SRLV include two related retroviruses, Visna-maedi virus (VMV) and caprine arthritis and encephalitis virus (CAEV), which were considered separate viruses until a few years ago; today they represent the prototypes of the two predominant genotypes (A and B1). SRLV are capable of infecting both sheep and goats, especially in mixed flocks; some subtypes show a degree of adaptation to a specific host, although they are not considered strictly host-associated [2][3][4].
SRLV induce a multi-systemic disease with progressive inflammatory lesions in the mammary gland, lungs, joints and brain. In one-third of infected animals, symptoms such as pneumonia, arthritis and mastitis have been commonly observed [5].
These single-stranded RNA viruses are responsible for a persistent and lifelong infection by targeting the monocytes of the host and stem cells located in the bone marrow [2].
Genetic variability is a key feature of the SRLV genome, and its knowledge, in addition to shedding light on host-virus interactions, is essential for accurate diagnosis and molecular epidemiology studies. SRLV quasi-species are continuously generated through mutation, recombination and selection pressure by the host immune system [6].
SRLV variability is a tool for the virus to evade the host immune response and ensure the persistence of the infection [7,8], and may be involved in the crossing of the species barrier. The high rate of variability is revealed by the presence of a number of viral subtypes with variable pathogenic properties within each species [9].
Based on the classification proposed by Shah et al. [10], SRLV have been classified into five genetic groups (A-E), which differ from each other in 25-37% of their nucleotide sequences. Genotypes A, B and E, originally described in sheep or goats, are further divided into subtypes (A1-A22, B1-B5 and E1-E2). New subtypes constantly appear as more local strains are analyzed, which underlines the continuous need for surveillance of diagnostic and vaccination strategies [11,12].
The aim of this work was to report an up-to-date overview of the genomic epidemiology and genetic diversity of SRLV in an Italian small ruminant population over time (1998-2019) by targeting a conserved genome region.
Materials and Methods
The 800 bp fragment was amplified using a nested-PCR protocol described by Grego et al. [18]. After gel electrophoresis, PCR products from each sample were purified using the QIAquick Gel Extraction kit (Qiagen, Valencia, CA, USA) and used as templates for sequencing with the Big Dye Terminator v3.1 Cycle Sequencing kit (Applied Biosystems, Foster City, CA, USA). Precipitated products were sequenced on an ABI PRISM 3130 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Both sense and antisense strands were sequenced by performing three independent reactions for each sample. The sequence dataset was analyzed using the DNAStar package v.15. Nucleotide sequences were aligned using the Clustal X 2 algorithm, respecting the amino acid coding frame, together with 50 published SRLV reference strains retrieved from PubMed at the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/, accessed on 5 October 2021).
Manual editing was performed using BioEdit software (version 7.0) [21]. The phylogeny was estimated using both maximum likelihood (ML) and Bayesian inference (BI) analyses to improve the robustness of the results. ML analysis was performed in MEGA v.X [22] using the GTR substitution model with gamma-distributed rates and a proportion of invariant sites (G+I). The robustness of the clusters was assessed by performing 10,000 bootstrap replicates, and sequences on branches with bootstrap values exceeding 70% were grouped together. BI analysis was performed in BEAST v.1.8.4 with the GTR+G+I substitution model, with two runs consisting of four Markov chains [23]. A consensus tree was constructed using TreeAnnotator v.1.8.4, and the trees were displayed and edited using FigTree v.1.4.0.
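The bootstrap step described above can be illustrated with a minimal Python sketch. It is not the MEGA implementation: the function names `bootstrap_alignment` and `clade_support`, and the representation of an alignment as a dictionary of equal-length strings, are assumptions made only for illustration. Each pseudo-replicate resamples alignment columns with replacement; a tree would then be inferred from every replicate (not shown), and the percentage of replicate trees containing a given clade is its bootstrap support.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    The same randomly drawn columns (with replacement) are used for every
    sequence, so homologous sites stay paired.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_cols = lengths.pop()
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}


def clade_support(replicate_clades, clade):
    """Percentage of replicate trees that contain the given clade.

    replicate_clades: one set of clades (frozensets of taxon names) per
    replicate tree; tree inference itself is outside this sketch.
    """
    if not replicate_clades:
        raise ValueError("at least one replicate is required")
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return 100.0 * hits / len(replicate_clades)
```

Under this scheme, a clade recovered in more than 70% of the replicate trees would pass the support threshold used in this study.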
The within-group mean distance, between-group mean distance and pairwise distance were calculated in MEGA v.X [22], using the p-distance method (G+I) with 1000 bootstrap replicates.
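As an illustration of these distance measures, the following Python sketch computes the uncorrected p-distance (the proportion of compared sites that differ) and averages it within one group and between two groups. It is a simplified stand-in for the MEGA calculation, not a reproduction of it: the function names are hypothetical, sites with a gap or ambiguity in either sequence are skipped (pairwise deletion, an assumption), and no rate correction or bootstrap standard error is applied.

```python
from itertools import combinations, product

# Characters treated as missing data (assumption: pairwise deletion).
MISSING = set("-?N")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among the sites compared."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue  # skip sites with missing data in either sequence
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def within_group_mean(seqs):
    """Mean p-distance over all pairs of sequences in one group."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("a group needs at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def between_group_mean(group_x, group_y):
    """Mean p-distance over all pairs drawn one from each group."""
    pairs = list(product(group_x, group_y))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

For example, two aligned sequences that differ at one of four compared sites have a p-distance of 0.25.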
To detect the occurrence of recombination, a dataset including 319 SRLV sequences was analyzed using SplitsTree4 [24] and RDP3 [25] software. Recombination events were assessed using the Phi test implemented in SplitsTree4.
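The Phi test and the RDP3 methods are not reproduced here. As a simpler illustration of how a mosaic sequence can reveal itself, the Python sketch below performs a sliding-window similarity scan of a query against reference sequences, in the general spirit of similarity plotting. All names, the window size and the step are hypothetical choices for illustration, and gaps are compared as ordinary characters. A change in the closest reference along the alignment is only a hint of recombination and would need confirmation with a formal test.

```python
def window_identity(query, reference, start, size):
    """Fraction of identical sites between query and reference in one window."""
    q = query[start:start + size]
    r = reference[start:start + size]
    return sum(a == b for a, b in zip(q, r)) / len(q)


def similarity_scan(query, references, window=200, step=20):
    """Identity of the query to each reference in successive windows.

    references: dict mapping reference name -> sequence aligned to the query.
    Returns a list of (window_start, {reference_name: identity}).
    """
    n = len(query)
    if any(len(ref) != n for ref in references.values()):
        raise ValueError("query and references must share one alignment")
    if window <= 0 or step <= 0 or window > n:
        raise ValueError("window and step must be positive and fit the alignment")
    results = []
    for start in range(0, n - window + 1, step):
        scores = {name: window_identity(query, ref, start, window)
                  for name, ref in references.items()}
        results.append((start, scores))
    return results


def closest_reference_per_window(scan):
    """For each window, the name of the reference most similar to the query."""
    return [(start, max(scores, key=scores.get)) for start, scores in scan]
```

A query whose closest reference switches from one subtype to another partway along the fragment would be a candidate for the kind of mosaic structure that dedicated recombination software is designed to test.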

Results
The topology of the tree obtained (Figure 2) indicates that the 269 Italian samples analyzed in this study belonged to 10 previously described subtypes, two subtypes not previously described in Italy and a set of samples with very high heterogeneity (unassigned). We also identified three samples characterized by recombination events. These data provide evidence for natural recombination.
Interestingly, four samples (134_SI_2012/LR723582, 224_CA_2016/LR732258, 223_CA_2016/LR732257 and 111_SI_2012/LR723579) differed significantly from all the SRLV clusters described previously (Figure 2). The samples were detected in sheep and goats, in different geographical areas and at different times, and clustered together into a new putative phylogenetic group, tentatively named 'A23', based on the tree topology and p-distance values. The new subtype A23 showed a mean distance range of 0.177-0.310 with respect to the other subtypes of group A. In particular, the sample 111_SI_2012/LR723579 showed high heterogeneity with respect to the other samples from the same cluster (0.147-0.170).
The sample FR694914 was previously classified as A9 [18], but based on the results now available, it clearly belongs to subtype A24.
In addition, a single sample (160_CA_2014/LR735695), collected from sheep bulk milk from Calabria in 2015, belonged to genotype A but did not cluster with any of the known subtypes, showing between-subtype/group mean distance values in the range 0.189-0.330.
Eighty-seven samples collected from goats and only one collected from sheep clustered in B1, with a within-subtype mean distance of 0.128. Based on the available data, this subtype is the most geographically widespread, as it was identified in 17 regions, including Abruzzo.
Genotype E was not detected among the 165 SRLV samples sequenced in this study, but it was identified in a preliminary study [17] and in other studies describing SRLV [18].
The presence of recombinant strains was investigated using the RDP3, SimPlot and SplitsTree programs. Interestingly, the results showed evidence of natural recombinant signals in three virus samples collected from different geographical areas, animal species and times. In particular, one virus sample (51937/UM/06/FR695064) showed an A9/A11 pattern, while two samples (10308/MA/09/FR694909 and 10310/MA/09/FR694910) had a similar A3/A10 mosaic structure (Figure 3).
Viruses 2021, 13, x FOR PEER REVIEW 6 of 11

Discussion
SRLV infection has a considerable economic effect on sheep and goat breeding; nevertheless, its impact on small ruminant production is largely underestimated by local farmers. Currently, there are no treatments or vaccines against SRLV. Live trade of goats or sheep from countries or regions where the disease has been reported is believed to be the main reason for its wide distribution. Thus, control programs remain the only way to limit the spread of SRLV infection [6]. In most countries, SRLV infections receive no particular attention. Up to now, control plans have been sporadic and mostly limited to goats and to the circulation of genotype B in areas characterized by particular food productions. In Europe, control programs have been implemented in many countries since SRLV were detected in their goat herds [26][27][28][29].
In Italy, there is no national mandatory SRLV control and eradication plan; however, some regions with a large number of small ruminant farms, and therefore a particular concern about the problem, have autonomously devised initiatives to obtain SRLV-free status [30]. There are a few voluntary programs and sporadic mandatory programs on a local basis aimed at the eradication of CAEV [31][32][33].
The accurate diagnosis of the SRLV infection is of major importance in epidemiological research, control programs and safe small ruminant international trade according to the World Organization for Animal Health (OIE) recommendations. These objectives might be hampered by the high genetic and biological variation of SRLV; in general, however, such infections are efficiently detected through serological methods that can be complemented with molecular techniques [34].
Small ruminant lentivirus quasi-species are continuously generated through mutation, recombination and selection pressure by the host immune system. New subtypes constantly appear as more local strains are analyzed, which outlines the continuous need for surveillance of diagnostic designs [11,12].
In this study, we aimed to provide as comprehensive a picture as possible of the SRLV situation in Italy, extending the study in terms of both the territories involved and the period of observation.
The current SRLV phylogeny, consisting of five genotypes, which are further divided into multiple subtypes, emphasizes the high genetic variation among SRLV strains.
Although there are other molecular studies based on different genomic regions of the virus [10,35,36], the approach based on sequencing of the gag-pol region (800 bp) is the one most widely used by different research groups [10,[16][17][18]20]. Based on this approach, SRLV were classified into the following genotypes/subtypes: A1, A2, A3, A4, A5, A8, A9, A11, A19, A20, A21, A22, C, B1, B2, B3, E1 and E2. For completeness, the SRLV genotypes/subtypes were also compared with samples present in the GenBank database using a shorter overlapping gag region (420 bp). These analyses are less informative, because recombination information is lost with the shorter alignment; nevertheless, we confirmed that none of the Italian samples considered in this study clustered within the other described subtypes (A12, A13, A16, A17 and A18; data not shown).
Genotype B contains three subtypes: B1, B2 and B3. The B4 subtype described by Santry et al., 2013 [37] was previously classified as a recombinant group within genotype A [38]. Subtype B5 was classified based on the sequences of the pol region, but it was classified as B1 based on the overlapping gag region [39].
Group C strains, originally sampled from Norwegian goats, and group E strains are characterized by their extensive genetic divergence from other SRLV groups and their limited geographical range [10,18,40]. Genotype D was found in a few samples originating from Switzerland and Spain, but has since been reclassified as genotype A [13].
This high genetic diversity between strains often poses challenges for countries that implement SRLV eradication programs, as none of the existing diagnostic tests are capable of detecting all circulating strains [3].
The sequences obtained from this study were analyzed with all the SRLV genotypes previously described, but only the results obtained with the gag-pol region (800 bp) are shown in Figure 2. The phylogenetic tree based on the gag (420 nt) region is not shown.
Subtypes A23 and A24 are quite distant from each other; in particular, A23 was the most distant from the other A subtypes in the branch. It includes four sequences (three from sheep, one from goat) retrieved from two neighboring regions (Sicilia and Calabria) that are very similar in terms of the types of farms present. Subtype A24 includes three sequences obtained from three neighboring regions of central Italy (Umbria, Marche and Lazio), between which there are frequent commercial exchanges. These regions are epidemiologically related because of common trade practices among sheep and goat farmers.
The sequence 160_CA_2014/LR735695 was identified in bulk milk collected from a mixed outdoor farm in Calabria, characterized by mixed breeding and crossbreeding of different breeds. The sample showed high diversity compared with the other genotype A samples, but high similarity to the cluster identified by Molaee et al. [13] and defined as Middle Eastern/Iranian. That cluster was characterized by a high nucleotide identity with the samples considered to be ancestors of SRLV.
The genetic diversity of lentiviruses is driven by a low-fidelity reverse transcriptase and their propensity to recombine via strand transfer [42]. Co-circulation and co-infection with more than one lentivirus type offers an opportunity for viral recombination. Only a handful of reports of mixed infections and recombination under both experimental and natural conditions have been published to date [34,[43][44][45][46][47][48][49]. Further studies are necessary to understand the role of natural recombinants in the spread of SRLV, as well as their ability to evade the current diagnostic tests.
Most of the Italian samples characterized in this study (130) belonged to genotype B, the more homogeneous genotype. Subtype B1 is the most represented in general, and this study adds to that evidence: 88 samples clustered in subtype B1. Although subtype B1 was identified in six sheep, it represents the main genotype for goats.
B2 and B3 are subtypes that mainly affect sheep: 19 out of 94 sequences of ovine origin showed B2 subtype and 38 were subtype B3. The latter viral subtype was also detected in five samples of goat origin.
It is important to observe the great variability within B1 and B3, in which well-defined internal clusters branch out. Although the calculated mean distance in the groups appears to be quite high, the BI does not support further separation. For genotype E, the results agreed with those already published in our preliminary molecular epidemiological study. This genotype is not widespread and describes a type of SRLV that is non-pathogenic or has low pathogenicity.
The great SRLV heterogeneity in Italy can also be explained by differences in herd management, breeding systems, transhumance, grazing practices, microenvironmental factors and habitat. The geographical distribution, the presence of wild ungulates and climate change may also be involved.
In summary, this work provides new knowledge on the circulation of SRLV variants in Italy over a long period, and new information about the presence of previously unidentified subtypes.
The rapid evolution of SRLV and emergence of new recombinant forms, subtypes and groups may have significant implications for the development of reliable diagnostic testing and in the success of eradication programs. To better understand the impact of the high divergence detected, it might be useful to extend the analysis to the full genome, at least for some selected samples. The genetic heterogeneity amongst field strains of SRLV remains to be fully characterized, which is important as this genetic variation in turn translates into virus strains and genetic sequences with different biological properties such as virulence.
The current situation of eradication and control programs for SRLV in the ovine/caprine population is underestimated in Italy. For SRLV mitigation, it is extremely important to regulate the animal trade according to the disease status of a farm or region, and to increase the control and restriction of trade of biological products.
The genetic characterization of Italian SRLV strains will help in the development of appropriate diagnostic tools to assist in the national control program. Interestingly, the study shows the need for a new classification, not necessarily based on complete sequences, but taking into account all the cases of a dubious clustering based on the current classification.
Author Contributions: M.B., F.F. and M.G.: conceptualization, original draft preparation, writing and manuscript revision; I.P., S.P. and P.G.: methodology, data curation, review and editing; C.T. and C.I.: methodology, software. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: The partial SRLV sequences generated in this study have been deposited in the NCBI GenBank database (www.ncbi.nlm.nih.gov); accession numbers are listed in Supplementary Table S1.