Identification of a Novel Equine Papillomavirus in Semen from a Thoroughbred Stallion with a Penile Lesion

Papillomaviruses (PVs) have been identified in a wide range of animal species and are associated with a variety of disease syndromes including classical papillomatosis, aural plaques, and genital papillomas. In horses, 13 PVs have been described to date, falling into six genera. Using total RNA sequencing (meta-transcriptomics) we identified a novel equine papillomavirus in semen taken from a thoroughbred stallion with a genital lesion, a finding that was confirmed by nested RT-PCR. We designate this novel virus Equus caballus papillomavirus 9 (EcPV9). The complete 7656 bp genome of EcPV9 exhibited similar characteristics to those of other horse papillomaviruses. Phylogenetic analysis based on concatenated E1-E2-L2-L1 amino acid sequences revealed that EcPV9 clustered with EcPV2, EcPV4, and EcPV5, although it was distinct enough to represent a new viral species within the genus Dyoiotapapillomavirus (69.35%, 59.25%, and 58.00% nucleotide similarity to EcPV2, EcPV4, and EcPV5, respectively). In sum, we demonstrate the presence of a novel equine papillomavirus for which more detailed studies of disease association are merited.

During the 2018 southern hemisphere serving season, the stallion experienced difficulty covering mares, manifesting primarily as apparent pain on ejaculation. A wart-like lesion, 1 cm in circumference, was observed at the tip of the penis, consistent with a genital papillomavirus lesion (Figure S1). Subsequent endoscopy and ultrasound excluded neoplasia and showed no evidence of further internal lesions.

Sample Collection, RNA Sequencing and Virus Discovery
Two sets of urine and semen samples were collected from the stallion for microbiological investigation, one set placed into a standard specimen container and the other stored in RNAlater. Because the nature of any causative pathogen was unknown, we employed a meta-transcriptomic approach, as this is able to detect any microbial species (i.e., bacteria, eukaryotes, viruses) as long as sufficient expressed RNA is present. Accordingly, total RNA was extracted using the RNeasy plus universal kit (QIAGEN, Chadstone Centre, Victoria, Australia), with RNA sequencing libraries then constructed with the SMARTer Stranded total RNA-seq kit (TaKaRa, Clayton, Victoria, Australia). RNA sequencing of 100 bp paired-end libraries on the Illumina NovaSeq platform yielded 84.54 Gb of data (Table S2). All sequencing reads have been uploaded onto the NCBI Sequence Read Archive (SRA) database under BioProject PRJNA552109.
RNA sequencing reads were quality trimmed and horse reads were subsequently removed by mapping to the horse genome. To identify potential viral transcripts, non-horse reads from each library were compared against the non-redundant nucleotide (nt) and non-redundant protein (nr) databases using BLASTn and DIAMOND blastx, respectively, with e-value thresholds of 1 × 10⁻¹⁰ and 1 × 10⁻⁴ [10], and were then annotated by taxonomy. Reads from the virus-positive library were de novo assembled using Megahit v1.1 [11,12]. Virus-associated contigs were extracted and assembled using Geneious 11.1.5 [13], followed by BLASTn analysis against the NCBI nt database as further confirmation.
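The e-value filtering step described above can be illustrated with a minimal Python sketch. It assumes the search results were written in the standard 12-column tabular format (e-value in the eleventh column); the function name, the toy hits, and the subject identifiers are hypothetical and are not taken from the pipeline used in this study.

```python
import csv
import io

# E-value thresholds stated in the text for the nucleotide and protein searches.
BLASTN_MAX_EVALUE = 1e-10
DIAMOND_MAX_EVALUE = 1e-4


def filter_hits(tabular_text, max_evalue):
    """Return (query, subject, evalue) for hits with evalue <= max_evalue.

    Assumes the standard 12-column tabular layout:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        evalue = float(row[10])
        if evalue <= max_evalue:
            kept.append((row[0], row[1], evalue))
    return kept


# Toy input with hypothetical read and subject names.
example = (
    "read1\thypothetical_subject_A\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-40\t180\n"
    "read2\thypothetical_subject_B\t80.0\t50\t10\t0\t1\t50\t1\t50\t1e-6\t40\n"
)

nucleotide_hits = filter_hits(example, BLASTN_MAX_EVALUE)
protein_hits = filter_hits(example, DIAMOND_MAX_EVALUE)
```

In this toy input only read1 passes the stricter nucleotide threshold, whereas both reads pass the more permissive protein threshold.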

Identification of a Novel Equine Papillomavirus
A 7605 bp genome sequence of a papilloma-like virus was identified in one semen library. Prediction of open reading frames (ORFs) was performed using the ORF Finder tool at NCBI (https://www.ncbi.nlm.nih.gov/orffinder/). A conserved domain search (https://pave.niaid.nih.gov/#analyze/l1_taxonomy_tool) revealed that the L1 protein of the new virus exhibited the highest nucleotide and amino acid identities with EcPV2, at 69.35% and 70.44%, respectively, indicative of a novel papillomavirus. A novel Equus caballus papillomavirus 8 (EcPV8) associated with viral plaques, viral papillomas, and squamous cell carcinoma has been recently described [14]. We therefore refer to the novel equine papillomavirus described here as Equus caballus papillomavirus 9 (EcPV9, GenBank accession number MN117918), in accordance with current guidelines for the classification of papillomaviruses [15]. To obtain the full virus genome and to verify the sequence obtained from the deep sequencing and assembly processes, overlapping primers were designed and nested RT-PCR was performed. This resulted in the determination of a circular genome of 7656 bp in length. Remapping of the sequence reads from this library revealed a maximum coverage of 3419X (Figure 1), corresponding to an abundance of 152.97 RPM (reads mapped per million input reads).
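The abundance measure used above (RPM, reads mapped per million input reads) is a simple normalisation of the mapped read count by library size. The following Python sketch shows the arithmetic; the function name and the read counts in the example are hypothetical and are not the counts from this library.

```python
def reads_per_million(mapped_reads: int, total_reads: int) -> float:
    """Reads mapped to the target per million input reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads * 1_000_000 / total_reads


# Hypothetical counts for illustration only: 1530 mapped reads
# in a library of 10 million input reads.
example_rpm = reads_per_million(1530, 10_000_000)
```

Because RPM scales with library size, it allows abundance to be compared between libraries sequenced to different depths.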
Two zinc-binding domains (CXXC-X29-CXXC), each with its CXXC motifs separated by 29 amino acids, were found in E6 (nt 708 and 937; aa 10 and 85) and one in E7 (nt 1201; aa 50) (Figure S2). No PDZ binding domain (XS/TXV/L), which has been reported as a characteristic feature of high-risk (i.e., pathogenic) HPV types in comparison to low-risk HPVs [16], was located at the C-terminus of the predicted EcPV9 E6 protein sequence (Figure S3). Notably, it was previously reported that a PDZ binding domain (XS/TXV/L) was located at the C-terminus of the predicted EcPV2 E6 protein sequence [8], which was not observed here (Figure S3). No putative pRB binding site (retinoblastoma tumor suppressor-binding domain) (LXCXE) was identified in the putative EcPV9 E7 protein, consistent with all equine and dyoiotapapillomaviruses determined to date [17,18], and the putative E4 protein showed a typical high proline content (12.8%, 18P/141 aa).
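The sequence motifs discussed above can be expressed as regular expressions. The Python sketch below is illustrative only and is not the procedure used in this study; the function names and the toy sequence are hypothetical, and the patterns are literal translations of the motif notation given in the text (X denoting any amino acid).

```python
import re

# Literal translations of the motif notation in the text (X = any residue).
# A lookahead is used for the zinc-binding motif so overlapping matches are reported.
ZINC_BINDING = re.compile(r"(?=C..C.{29}C..C)")  # CXXC-X29-CXXC
PDZ_CTERM = re.compile(r".[ST].[VL]$")           # XS/TXV/L at the C-terminus
PRB_BINDING = re.compile(r"L.C.E")               # LXCXE


def zinc_motif_starts(protein: str) -> list:
    """1-based start positions of CXXC-X29-CXXC motifs."""
    return [m.start() + 1 for m in ZINC_BINDING.finditer(protein)]


def has_pdz_cterm(protein: str) -> bool:
    """True if the last four residues match XS/TXV/L."""
    return PDZ_CTERM.search(protein) is not None


def has_prb_site(protein: str) -> bool:
    """True if an LXCXE motif occurs anywhere in the sequence."""
    return PRB_BINDING.search(protein) is not None


def proline_percent(protein: str) -> float:
    """Percentage of residues that are proline."""
    return 100 * protein.count("P") / len(protein)


# Toy sequence (hypothetical, not a real E6 protein): one zinc-binding
# motif starting at residue 2 and a C-terminal ETQL tail.
toy_protein = "M" + "CAAC" + "A" * 29 + "CAAC" + "ETQL"
toy_zinc_positions = zinc_motif_starts(toy_protein)
toy_has_pdz = has_pdz_cterm(toy_protein)
```

Applied to the proline count reported for the putative E4 protein, 18 prolines in 141 residues corresponds to approximately 12.8%.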
To determine the evolutionary relationships of EcPV9, we inferred a phylogenetic tree based on the concatenated alignment of four coding sequences (E1, E2, L2, and L1). Amino acid sequences (concatenated E1-E2-L2-L1) of 13 equine PVs, as well as the type species of each of the 52 PV genera, were aligned using the E-INS-I algorithm in the MAFFT v7 package [19]. A phylogenetic tree was then estimated using the maximum likelihood method in PhyML 3.0 [20], incorporating the LG+Γ model of amino acid substitution, an SPR branch-swapping algorithm, and 1000 bootstrap replications. This analysis revealed that EcPV9 is clearly related to the Dyoiota PVs EcPV2, EcPV4, and EcPV5 (Figure 3). Hence, this evolutionary analysis demonstrates that EcPV9 is a novel species within the genus Dyoiotapapillomavirus and is most closely related to EcPV2, which is classified as Dyoiotapapillomavirus 1.
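As an illustration of the concatenation step that precedes alignment, the following Python sketch joins the E1, E2, L2, and L1 amino acid sequences of each virus in a fixed order and writes them in FASTA format, ready to be passed to an aligner. The function name, input layout, and toy sequences are hypothetical; this is a sketch under those assumptions rather than the script used in this study.

```python
# Fixed gene order used for concatenation, as stated in the text.
GENE_ORDER = ("E1", "E2", "L2", "L1")


def concatenate_proteins(proteins_by_virus: dict) -> str:
    """Build a FASTA string with one concatenated E1-E2-L2-L1 record per virus.

    proteins_by_virus maps a virus name to a dict of gene name -> amino acid sequence.
    """
    lines = []
    for virus, genes in proteins_by_virus.items():
        missing = [gene for gene in GENE_ORDER if gene not in genes]
        if missing:
            raise KeyError(f"{virus} lacks {', '.join(missing)}")
        lines.append(f">{virus}")
        lines.append("".join(genes[gene] for gene in GENE_ORDER))
    return "\n".join(lines) + "\n"


# Toy input (hypothetical three-residue "proteins", for illustration only).
toy_input = {
    "virus_A": {"E1": "MAD", "E2": "MET", "L2": "MRH", "L1": "MSL"},
    "virus_B": {"L1": "MAL", "L2": "MRK", "E2": "MEA", "E1": "MAE"},
}
toy_fasta = concatenate_proteins(toy_input)
```

Fixing the gene order ensures that homologous proteins occupy the same region of every concatenated record, regardless of the order in which they were supplied.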

Disease Association
As no biopsy samples could be taken from this case, it is not possible to confidently determine the significance of EcPV9 in the observed pathologies. Nevertheless, the novel EcPV described here was extracted from semen samples collected when a wart-like lesion was visible on the tip of the penis (Figure S1), and is hence compatible with a disease syndrome caused by a papillomavirus. In addition, it was notable that EcPV9 exhibited greatest sequence similarity with EcPV2, a major aetiologic agent of equine squamous cell carcinoma (SCC) disease [8], again compatible with the idea that EcPV9 might also be associated with papillomavirus-related malignancies in horses. Finally, our meta-transcriptomic analysis identified no other likely microbial pathogen in any of the samples analyzed from this stallion.
In conclusion, we report the identification of a novel equine papillomavirus (genus Dyoiotapapillomavirus) in a thoroughbred Australian stallion with a genital papilloma ("wart"), highlighting the broad diversity of these viruses in horses. Further investigation of the clinical impact of this virus on horse health is clearly merited.
Supplementary Materials: The following are available online at http://www.mdpi.com/1999-4915/11/8/713/s1, Figure S1: Photograph of the penile lesion, Figure S2: Amino acid sequence alignment of the E6(A) and E7(B) proteins from EcPV2, EcPV9, EcPV4, and EcPV5, Figure S3: Amino acid sequence alignment of the C-terminal ends of E6 proteins derived from HPV types associated with genital infections and 4 PV types from the genus Dyoiotapapillomavirus, Table S1: Equine associated papillomaviruses and their respective associated disease/syndrome, Table S2: Information on the RNA sequencing libraries generated here, Table S3: Genomic nucleotide and amino acid features of the genus Dyoiotapapillomavirus.

Conflicts of Interest:
The authors declare no conflict of interest.