Tracking the Continuous Evolutionary Processes of an Endogenous Retrovirus of the Domestic Cat: ERV-DC

An endogenous retrovirus (ERV) is a remnant of an ancient retroviral infection in the host genome. Although most ERVs have lost their viral productivity, a few ERVs retain their replication capacity. In addition, partially inactivated ERVs can present a potential risk to the host via their encoded virulence factors or the generation of novel viruses by viral recombination. ERVs can also eventually acquire a biological function, and this ability has been a driving force of host evolution. Therefore, the presence of an ERV can be harmful or beneficial to the host. Paleovirological studies have revealed individual events in ERV evolution, but the continuous processes of ERV evolution over millions of years are largely unknown. A unique ERV family, ERV-DC, is present in the domestic cat (Felis silvestris catus) genome. ERV-DC proviruses are phylogenetically classified into three genotypes, and the specific characteristics of each genotype have been clarified: their capacity to produce infectious viruses; their recombination with other retroviruses, such as feline leukemia virus or RD-114; and their biological functions as host antiviral factors. In this review, we describe ERV-DC-related phenomena and discuss the continuous changes in the evolution of this ERV in the domestic cat.


Introduction
Retroviruses can integrate into the host genome through a DNA intermediate in the viral life cycle. Endogenous retroviruses (ERVs) arise from the infection of host germline cells by exogenous retroviruses, which are then transmitted as genetic components to the host's descendants. Therefore, ERVs are remnants of ancient exogenous retroviruses. Rapid advances in genome projects have shown that ERVs occupy about 4-10% of the human, mouse, and cat genomes [1][2][3][4].
Paleovirology is the study of ancient viruses, including ERVs, to clarify the processes of viral evolution in the long term [5][6][7]. Various paleovirological studies have clarified the events of ancestral virus evolution, including interspecies viral transmission [8,9], virus-host evolutionary arms races [10,11], and the acquisition of host functional genes through ERV domestication [12][13][14][15]. However, the continuous processes of viral evolution over millions of years are largely unknown.
ERV-DC is an endogenous gammaretrovirus of the domestic cat (Felis silvestris catus), which is classified as ERV1-3FCa-I in Repbase [16][17][18]. ERV-DC has a simple genomic structure and encodes a Gag-Pol polyprotein and an Env protein in a viral genome of about 8.8 kbp. A unique feature of the ERV-DC family is that the proviruses are phylogenetically classified into three genotypes: genotype I (GI), genotype II (GII), and genotype III (GIII) (Table 1).
In the last decade, the specific characteristics of each genotype have been clarified (Figure 1). For example, the GI and GIII proviruses can produce replication-competent viruses, and these viruses use different receptors for infection [16,19]. The GI env gene has also been transduced into the feline leukemia virus (FeLV), generating a novel interference subgroup: FeLV subgroup D (FeLV-D) [16]. The GII proviruses encode an antiviral factor, Refrex-1, that specifically inhibits ERV-DC GI and FeLV-D infections [20]. In this review, we summarize ERV-DC-related phenomena and discuss the continuous evolutionary processes of ERV-DC.
Figure 1. Genotype-specific phenomena in the evolution of ERV-DC. ERV-DC proviruses are phylogenetically classified into three genotypes: genotype I, genotype II, and genotype III. ERV-DC proviruses are also classified into subgroups according to a cis-acting element in the LTR: the A-type LTR subgroup has adenine (A) in the cis-acting element and strong promoter activity, whereas the T-type LTR subgroup has thymine (T) in the cis-acting element and attenuated promoter activity. Each node point represents an ERV-DC-related phenomenon.

Two Distinct Infectious ERV-DC Proviruses with Different Viral Properties
Previous studies have shown that most ERVs are inactivated by the accumulation of mutations after their integration, but a few ERVs retain their replication capacity [21][22][23][24]. The characterization of each proviral genome of ERV-DC indicated that most of these proviruses have lost their infectivity. However, ERV-DC14, which belongs to GI, and ERV-DC10 and ERV-DC18, which belong to GIII, can produce infectious viruses [16,19], and typical C-type retrovirus-like particles have been observed with transmission electron microscopy in cells infected with these proviruses.
Here, we describe the loci that retain the virus replication ability for each genotype. Among the GI proviruses, a viral replication ability has only been confirmed in ERV-DC14 [19]. A fluorescence in situ hybridization (FISH) analysis mapped ERV-DC14 to the feline chromosome C1q32. A survey of the insertion polymorphisms of ERV-DC loci indicated that only 2.5% of the Japanese domestic cats tested carried ERV-DC14, so ERV-DC14 is present at a low frequency in the host population [16]. Interestingly, although the ERV-DC8 provirus has an intact genetic structure very similar to that of ERV-DC14, no viral productivity has been observed. Therefore, a comparison of ERV-DC14 and ERV-DC8 may identify the functional domain responsible for viral replication.
ERV-DC10 and ERV-DC18, which belong to GIII, differ at only one base in a primer-binding site in the full-length proviral genome [16]. A FISH analysis mapped ERV-DC10 and ERV-DC18 to the feline chromosomes C1q12-21 and D4q14, respectively. Therefore, two different loci in GIII produce infectious viruses. A survey of ERV-DC insertion polymorphisms showed that 37.7% of the Japanese domestic cats tested carried ERV-DC10, but ERV-DC18 was detected in only one sample (cat ON-T). Further investigation detected ERV-DC18 in two cats related to cat ON-T, so ERV-DC18 is considered to have been recently endogenized by infection or transposition of ERV-DC GIII. These results suggest that GIII contains active proviruses that still retain viral replication potential or transposition activity.
Although both GI and GIII contain proviruses with the ability to produce infectious viruses, their viral properties differ in two respects: their receptor usage and viral titers. An interference assay indicated that GI and GIII use different receptors for infection [16,19]. GI can infect cell lines derived from various mammals, including the domestic cat, human, monkey, dog, and cow. In contrast, GIII infects only a limited range of cell lines, including those of human and dog. Interestingly, GIII is unable to infect cat cells, such as AH927 fibroblasts and G355 astrocytes, in which the GIII provirus is integrated. Therefore, GI is a nonecotropic virus, whereas GIII is a xenotropic virus. Identifying the receptors used by GI and GIII for infection has clarified the differences in their infectious tropism [25][26][27][28][29][30].
The viral titer of ERV-DC14 (GI) is lower than those of ERV-DC10 and ERV-DC18 (GIII). Consequently, ERV-DC14 fails to establish a persistent infection in human cultured cells, whereas ERV-DC10 and -DC18 can persistently infect human cells [19]. (The detailed differences in their viral titers are described in a later section: "Genotype-Specific Transcriptional Regulation Strategies in ERV-DCs".)

Transduction of ERV-DC GI env Gene into Feline Leukemia Virus (FeLV)
FeLV is a pathogenic gammaretrovirus of the domestic cat that causes various pathologies in the host, including proliferative diseases and immunosuppressive diseases, with high mortality rates [31,32]. The FeLV subgroups are classified according to their infectious tropism, based on differences in the receptors they use for infection [30,33-39]. FeLV subgroup A is the most common subgroup, and is transmitted horizontally between host individuals through grooming, sharing of food, or biting.
The discovery of a novel recombinant FeLV genome that had captured viral fragments of ERV-DC prompted the characterization of ERV-DC [16,40]. The recombinant FeLV was designated "FeLV-D" because it belongs to a different interference group from the known FeLV subgroups. Both a genetic analysis and interference experiments showed that FeLV-D had acquired the env gene of ERV-DC GI [16,20,39]. Therefore, ERV-DC GI has contributed a new feature to a modern virus (Figure 2).
FeLV-D was identified in four cats, including two cats that were related. Clinical information on the FeLV-D-infected cats showed that three of the cats had hematopoietic tumors, including lymphoma and leukemia [41]. Further investigations, such as an ingestion experiment in laboratory animals, are required to analyze the virulence of FeLV-D in more detail.
In the FeLV-D env genes detected in the four infected cats, the 3′ recombination junctions in the transmembrane unit (TM) differed from each other, although the full-length surface units (SUs) of all the clones were replaced with that of ERV-DC GI. This suggests that FeLV-D was generated de novo in each cat. It has been reported that in other subgroups of FeLV, slight differences in genetic structure lead to changes in infection tropism or pathogenicity [30,42]. Therefore, it is possible that the differences in the viral properties of the FeLV-D clones are caused by variations of the recombination junction.

ERV-DC GII Encodes the Host Antiviral Factor Refrex-1
In rare cases, ERV-derived sequences have gained an activity that contributes to the maintenance of the host physiology, and this phenomenon is called "ERV domestication". ERV domestication is exemplified by antiviral factors [14,21], placenta formation ability [43], myoblast fusion ability [44], or an mRNA transporter in the nervous system [45,46]. These reports suggest that ERV domestication has contributed dramatically to the evolution of the host.
During the characterization of FeLV-D, we unexpectedly detected an unknown antiviral factor directed against FeLV-D and ERV-DC GI in the supernatant of 3201 feline T cells [20]. We designated the antiviral factor "restriction for feline retrovirus X (Refrex-1)". To determine the gene that encodes Refrex-1, we screened soluble molecules associated with ERV-DC in a cDNA library synthesized from transcripts in 3201 cells, and two suspected clones were identified. Further experimentation revealed that these two clones inhibited FeLV-D and ERV-DC GI infections in a dose-dependent manner.
Interestingly, the genetic structures of the two clones were identical to the env genes of ERV-DC7 and ERV-DC16, and experiments using cells transfected with ERV-DC7 or ERV-DC16 confirmed the same antiviral activity as Refrex-1. These results indicate that ERV-DC7 and ERV-DC16, belonging to ERV-DC GII, have been domesticated as Refrex-1 to protect the host from viral infection. Further experiments showed that the defense mechanism of Refrex-1 involves the extracellular secretion of Refrex-1, which interferes with the receptors for FeLV-D and ERV-DC GI infection (Figure 3). In contrast to FeLV-D and ERV-DC GI infections, Refrex-1 does not block ERV-DC GIII infections, presumably because the receptors used by ERV-DC GIII for infection differ from those used by ERV-DC GI and FeLV-D.
The env genes of both ERV-DC7 and ERV-DC16 are truncated by stop codons at amino acids 252 and 298, respectively, in the proline-rich region (PRR). One of the unique properties of Refrex-1 is that it consists of a truncated Env protein including an SU domain and signal peptide [14]. These structures, which lack a TM, appear to express efficient antiviral activity for the following two reasons. First, Refrex-1 can be secreted from cells without remaining on the cell membrane, and causes extracellular receptor interference. The infection efficiencies of FeLV-D and ERV-DC GI were significantly reduced in cells when the supernatants of a variety of cultured feline cells were added. Second, Refrex-1 does not include the immunosuppressive domain (ISD). The ISD of ERV-DC closely resembles that of other gammaretroviruses whose immunosuppressive activities have been demonstrated [47][48][49]. Therefore, it is conceivable that structures lacking the ISD can protect the host against viral infection without impairing its immunity.
The proportion of cats shown to carry ERV-DC7 or ERV-DC16 indicates that both loci are fixed in the cat genome [16]. Therefore, the antiviral activity of Refrex-1 may have contributed to the survival of the host during feline evolution. A genotype-specific expression analysis using cat organs showed high expression levels of GII in most organs, especially in the peripheral blood [19]. However, many aspects of Refrex-1 functions in vivo remain unknown, including the extent to which Refrex-1 can inhibit infection by FeLV-D or ERV-DC GI, and whether ERV-DC7 or ERV-DC16 can act as Refrex-1.

Refrex-1 Is under Robust Control by Accumulated Inactivation Mechanisms
Because it was unclear how ERV-DC7 and ERV-DC16 evolved into Refrex-1, we investigated the process of Refrex-1 evolution using the reconstructed full-length env genes of ERV-DC7 and ERV-DC16 (ERV-DC7fl and ERV-DC16fl, respectively) [50]. An infectivity analysis of ERV-DC7fl and ERV-DC16fl revealed that they are unable to produce viruses because the cleavage between SU and TM is disrupted. Therefore, Refrex-1 is strictly controlled by multiple inactivation steps, including stop codons in the PRR and cleavage failure.
Additional analyses provided interesting data on ERV-DC7fl. First, the reconstructed env gene showed antiviral activity, like Refrex-1, even when ERV-DC7fl appeared to remain on the cell surface. Second, a comparative analysis of the ERV-DC7 alleles showed that purifying selection is active not only in the domain that currently functions as Refrex-1, but also in the entire env gene. Third, a sequence comparison revealed recombination phenomena among the alleles of the ERV-DC7 env gene.
These results prompted the following hypothesis about the processes underlying the evolution of ERV-DC7 into Refrex-1. Before evolving to the current Refrex-1, intermediate alleles of the ERV-DC7 env gene existed. For example, one was inactivated by a stop codon in the PRR, and another was inactivated by cleavage dysfunction. Among these intermediates, several alleles appear to have acquired antiviral activity. The recombination phenomena among the ERV-DC7 alleles suggest that the stricter control of viral activity, which is a prerequisite for ERV domestication, has been gradually achieved by the accumulation of inactivation steps during recombination. To test this hypothesis, ERV-DC7 must be detected in other species of the genus Felis and compared with that in the domestic cat.

Genotype-Specific Transcriptional Regulation Strategies in ERV-DCs
As described above, the characteristics of the ERV-DC proviruses seem to be associated with their genotypes. ERV-DC GI and GIII include replication-competent proviruses, and the infectious activities of ERV-DC10 and -DC18 (GIII) are much stronger than those of ERV-DC14 (GI) [16,19]. ERV-DC7 and -DC16 encode the antiviral factor Refrex-1 [20]. Because the three ERV-DC genotypes show distinct characteristics, we speculated that these genotypes are controlled in distinct ways. To clarify this point, we investigated the transcription patterns and regulatory mechanisms of ERV-DCs at genotype- and locus-level resolution.
The ERV-DC genotype-specific expression profiles were characterized in various feline organs [19]. The expression of GII was high in almost all tissues, whereas that of GI and GIII was extremely low. These results appear to be consistent with the fact that GII contributes to the host's viral defenses as Refrex-1, and that GI and GIII include proviruses capable of producing infectious viruses.
To clarify the mechanism underlying the differences in their transcriptional activities, we investigated the regulatory mechanisms involved. First, the methylation status of the CpG islands in the 5′ long terminal repeat (5′ LTR) of each ERV-DC provirus was analyzed. Genotype-specific methylation levels were identified: the GIII 5′ LTR is strongly methylated along its full length, whereas the GI 5′ LTR is partially methylated downstream from the TATA box. By contrast, the GII 5′ LTR is negligibly methylated.
Second, we analyzed the basal promoter activity of the 5′ LTR in each provirus. Promoter activity was confirmed in the GII, GIII, and several GI proviruses (ERV-DC19 and ERV-DC2), but not in other GI proviruses (ERV-DC1, -DC3, -DC4, -DC8, -DC14, and -DC17). Further investigation with a chimeric LTR constructed from ERV-DC19 and ERV-DC8 identified the nucleotide substitution responsible for the difference in the basal promoter activities. Although ERV-DC14 is incapable of persistent infection, ERV-DC14TA, in which the substitution determining the promoter activity was repaired, established persistent infections like those of ERV-DC10 and ERV-DC18.
These results indicate that the observed genotype-specific transcriptional regulation is attributable to two distinct mechanisms in the 5′ LTR: the transcriptional activity of GI is controlled by partial CpG methylation and a point mutation in a cis-acting element, whereas that of GIII is restricted by strong CpG methylation. In contrast to GI and GIII, there are few restrictions on GII expression.
The LTRs of ERV-DC were classified into two subgroups based on the identification of the cis-acting element: the A-type LTR subgroup, with high promoter activity; and the T-type LTR subgroup, with low activity (Figure 1). Interestingly, a comprehensive search of the domestic cat genome indicated that the copy number of the T-type LTR is significantly higher than that of the A-type LTR, despite the low promoter activity of the T-type LTR. One hypothesis explaining this phenomenon is that ERV-DCs with T-type LTRs have escaped from negative selective pressure because they have lost their strong promoter activity, which could reduce the fitness of the host. Another hypothesis is that the T-type LTR has been optimized to the transcriptional environment of germ cells in order to replicate efficiently in those cells [51]. However, further analyses, such as the identification of the transcription factors that bind the A-type and T-type LTRs, are required to test these hypotheses.

Multiple Recombination Events between ERV-DC and RD-114
RD-114 was first isolated from fetal kittens that had ingested human rhabdomyosarcoma cells, and subsequent studies showed that RD-114 is a domestic cat ERV [52]. The identification of ERV-DC revealed that RD-114 is a chimeric virus composed of the gag-pol genes of ERV-DC and the env gene of a baboon endogenous retrovirus (BaEV) (Figure 2) [16,18]. Other studies have shown that BaEV is also a chimeric virus composed of the gag-pol genes of a Papio cynocephalus endogenous retrovirus (PcEV) and the env gene of a simian endogenous retrovirus (SERV) [53][54][55]. Therefore, RD-114 was generated by at least two recombination events: the transduction of SERV env into PcEV and the transduction of BaEV env into ERV-DC [56]. Interestingly, PcEV appears to be a counterpart of ERV-DC in the baboon, because these genomes share approximately 70% similarity. These results suggest that multiple interspecies transmissions have occurred among Old World monkeys and cats, with several subsequent recombination events among the transmitted viruses.
Although the existence of both ERV-DC and RD-114 has not been confirmed in the chromosomal DNA of the Tsushima wild cat (Prionailurus bengalensis euptilurus), it is predicted that a recombination event between ERV-DC and BaEV occurred about 6.2-9.3 million years ago (mya), when the genus Felis and the genus Prionailurus seem to have separated [16,57]. However, it is still unclear how ERV-DC and RD-114 coexisted within the same host. Therefore, we analyzed the details of these recombination events using ERV-DC, RD-114, and RD114-virus-related sequences (RDRSs) [58,59].
A phylogenetic analysis suggested that RD-114 (RD-114_CRT1 and RD-114_SC3C) and most RDRSs (RDRS_A2, RDRS_C2b, RDRS_D4, RDRS_C1, and RDRS_E3) were generated by the transduction of ERV-DC GIII, whereas only RDRS_C2a is composed of the ERV-DC GII sequence (Figure 4A). The estimated time of integration, based on a comparison of their LTRs, indicated that the age of ERV-DC7 (belonging to GII) is around 2.8 mya and that of RDRS_C2a is around 1.6 mya [16,59]. However, we speculated that almost all ERV-DC GI and GIII and other RDRSs were integrated into the feline genome quite recently, at less than 0.2 mya. The recombination analysis also showed that the gag and pol genes of RD-114_SC3C and RDRS_C2a are derived from ERV-DC GIII and GII, respectively, and these results are consistent with those of the phylogenetic analysis (Figure 4B). In contrast, the recombination analysis indicated that a part of the pol gene of RDRS_E3 is derived from ERV-DC GI. Therefore, these results suggest that RD-114 has repeatedly recombined with each genotype of ERV-DC over a prolonged period.
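Integration times estimated from LTR comparisons rely on the fact that the 5′ and 3′ LTRs of a provirus are identical at the time of integration and diverge independently afterwards. The following is a minimal sketch of that calculation, not the procedure used in the cited studies: it assumes the commonly used relation T = K / (2r), takes K as the Kimura 2-parameter distance between the two aligned LTRs, and leaves the substitution rate r as a caller-supplied parameter because no rate is stated in this review. The function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: not comparable
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; all other changes are transversions
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


def ltr_age_years(ltr5, ltr3, rate_per_site_per_year):
    """Age estimate T = K / (2r); the rate r is an assumption of the caller."""
    return k2p_distance(ltr5, ltr3) / (2 * rate_per_site_per_year)
```

The factor of 2 reflects that both LTRs accumulate substitutions independently after integration. Any age obtained this way scales inversely with the assumed rate, so estimates from different studies are comparable only if the same rate was used.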
In this review, the occurrence of multiple recombination events between ERV-DC and RD-114 has been suggested. However, there is little insight into where and when these recombination events occurred. Furthermore, the interspecies transmission of the virus between Old World monkeys and domestic cats, which underpins the occurrence of recombination, is largely unclear. To clarify these issues, it will be necessary to combine a comprehensive viral sequence analysis and a host species analysis from both evolutionary and ecological perspectives, as has been reported in other studies of BaEV and PcEV evolution [54,55,60,61].
Figure 4 (caption, methods). The bootscanning analysis was conducted with Simplot [62]. Sequences for the analysis were extracted with a window size of 200 bp and a step size of 20 bp, and each phylogenetic analysis was repeated 100 times with the neighbor-joining method based on the Kimura 2-parameter model [63]. The accession numbers of the nucleotide sequences are: ERV-DC: AB674439-AB674452, AB807599, and AB807600; RD-114_CRT1: AB559882; RD-114_SC3C: NC_009889; RDRSs: LC005744-LC005749; and BaEV: D10032.
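The window settings quoted above for the bootscanning analysis (200 bp windows moved in 20 bp steps) can be illustrated with a simplified sliding-window scan. This sketch is not the Simplot implementation: instead of building bootstrapped neighbor-joining trees in each window, it simply reports which reference sequence has the smallest uncorrected p-distance to the query in that window. The function names and the reference labels in the usage note are hypothetical, and all sequences are assumed to be pre-aligned to the same length.

```python
def sliding_windows(length, window=200, step=20):
    """Yield (start, end) coordinates of each full window along an alignment."""
    for start in range(0, max(length - window, 0) + 1, step):
        yield start, min(start + window, length)


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring gaps and ambiguous bases."""
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            mismatches += a != b
    # A window with no comparable sites is treated as maximally distant
    return mismatches / compared if compared else float("inf")


def nearest_reference_per_window(query, references, window=200, step=20):
    """For each window, name the reference closest to the query.

    `references` maps a label to a sequence aligned to `query`.
    Ties are resolved in favor of the first label in the mapping.
    """
    results = []
    for start, end in sliding_windows(len(query), window, step):
        best = min(
            references,
            key=lambda name: p_distance(query[start:end], references[name][start:end]),
        )
        results.append((start, end, best))
    return results
```

A change in the nearest reference along the alignment, for example from a GIII-like label to a GI-like label, marks a candidate recombination breakpoint, which is the kind of signal a bootscan then evaluates with bootstrap support.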

Evolutionary Scenario of ERV-DC
The endogenization of a virus leads to dynamic changes in the host genome, but it is unclear whether the results are detrimental or beneficial to the host. In this review, the main characteristics specific to each ERV-DC genotype have been described. ERV-DC GI and GIII contain loci that confer viral replicative or transposition abilities, which may threaten the host's life [16,19]. Moreover, the emergence of FeLV-D, a viral recombinant of ERV-DC GI and FeLV, may pose a new hazard to domestic cats [16]. In contrast, the presence of ERV-DC GII has benefited the survival of the host population by introducing the antiviral factor Refrex-1, which has prevented the expansion of ancient ERV-DC infections and modern FeLV-D infections [20].
The rearrangement of ERV-DC-related phenomena is analogous to the continuous changes occurring in the evolution of the ERV-DC family, and the evolutionary scenario suggests that the appearance of Refrex-1 introduced a branch point in the evolution of ERV-DC (Figure 1). To escape the effects of Refrex-1, some ERV-DCs appear to have changed the receptors they use for infection and become the current GIII viruses. Similarly, RD-114 may have circumvented the antiviral activity of Refrex-1 by altering its env gene.
The various ERV-DC genotypes imply that the three genotypes were endogenized independently in terms of geography or era, but the evolutionary origin of ERV-DC is largely unknown. Further analysis of ERV-DC in other species of the genus Felis may clarify the evolutionary processes of ERV-DC.

Concluding Remarks
The rearrangement of ERV-DC-related phenomena suggests a scenario of ERV-DC evolution and raises new questions. In particular, the following factors must be clarified in more detail: the viral pathogenicity of ERV-DC and FeLV-D, the domestication of ERV-DC-derived sequences, and the long-term processes of ERV-DC evolution.
To address the first question, the potential viral pathogenicity of ERV-DC proviruses belonging to GI and GIII must be investigated, together with the novel viral pathogenicity of FeLV-D. Analyzing the pathogenicity of these viruses is important not only in paleovirology, but also in risk management to control emerging and reemerging viruses.
Other studies have reported the appearance of new viruses with the reassortment of viral genes [64,65]. Because most ERV-DC proviruses retain their potential genetic structure, we must also consider the possibility of viral gene reassortment (Table 1).
Previous studies have also indicated that abnormal ERV expression may be associated with diseases such as cancer [66] or autoimmune diseases [67], although the relationship between ERV-DC expression and disease remains to be investigated. For this purpose, it will be necessary to analyze expression data from both normal and diseased tissues.
To address the second question, the function of Refrex-1 must be demonstrated in vivo. We believe that Refrex-1 may have a physiological function as a secreted protein other than its antiviral activity.
Furthermore, because most ERV-DC proviruses have an intact open reading frame (ORF), we must consider whether other ERV-DC proviruses have acquired new biological functions. In particular, ERV-DC6, which seems to be fixed within the domestic cat population, retains an intact ORF in the env gene. Therefore, it is possible that ERV-DC6 has gained a function as a host gene.
A comparison of ERV-DC across the genus Felis should be effective in addressing the third question. (The details are described in the chapter "Evolutionary scenario of ERV-DC".) Further exploration of these factors will clarify the continuous evolution of ERV-DC on a finer time scale. Studies of the continuous evolution of ERV-DC will extend our paleovirological understanding, including the evolutionary history of retroviruses over long time periods and the roles of endogenous viruses in host evolution.