The Evolutionary History and Spatiotemporal Dynamics of the NC Lineage of Citrus Tristeza Virus

Citrus tristeza virus (CTV) is a major pathogen affecting citrus trees worldwide. However, few studies have focused on CTV’s evolutionary history and geographic behavior. CTV is locally dispersed by an aphid vector and long distance dispersion due to transportation of contaminated material. With the aim to delve deeper into the CTV-NC (New Clade) genotype evolution, we estimated an evolution rate of 1.19 × 10−3 subs/site/year and the most common recent ancestor in 1977. Furthermore, the place of origin of the genotype was in the United States, and a great expansion of the population was observed in Uruguay. This expansion phase could be a consequence of the increment in the number of naïve citrus trees in Uruguayan orchards encompassing citrus industry growth in the past years.


Introduction
RNA viruses have a great potential for rapid evolution due to the high mutation rates, large population sizes, and short generation times [1]. This rapid evolution means that epidemiological and evolutionary processes occur on a time scale of a few years.
Citrus tristeza virus (CTV; genus Closterovirus, family Closteroviridae) is one of the most destructive pathogens that affect citrus trees worldwide [2]. The CTV genome is a single-stranded positive sense RNA molecule of approximately 19.3 kb in length containing 12 open reading frames that encodes for at least 19 protein products, including replication proteins, cell-to-cell movement proteins, virion assembly proteins, suppressors of RNA silencing proteins, and two capsid proteins [3][4][5][6]. p25 is a multifunctional protein involved in the suppression of intercellular RNA silencing and works as a capsid protein covering 95% of the virus genome [3,6].
Natural CTV hosts essentially include species of the genera Citrus and Fortunella, and depending on the virus strains and on the species or scion-rootstock combinations, CTV may cause three distinct syndromes named quick decline, stem pitting, and seedling yellows [2]. As an aphid-borne virus transmitted in a semi-persistent manner, CTV is mainly transmitted by vector species from the genera Toxoptera and Aphis, which locally spread the virus within a given citrus producing area. Long-distance dissemination is caused principally by the transportation of infected material [2].
Like most RNA viruses, CTV exists in nature as a mixture of genetic variants that coexist in the same plant displaying high levels of genetic and phenotypic diversity [7]. Genetic studies of different strains of CTV revealed the existence of seven distinct genetic lineages or genotypes worldwide, known as VT, T3, T30, T36, T68, RB, and NC (New Clade) the latter being proposed by our group [8][9][10][11][12][13]. This new genotype includes the HA16-5 (GQ454870) and Taiwan-Pum/M/T5 (JX266713) strains, from which the complete genomes are available, grouping into a clade with nucleotide sequence identities higher than 98.7% for p25 gene between strains [13].
CTV biology has been intensively studied for many decades, but evolutionary and spatiotemporal phylodynamic studies, which can provide relevant information for understanding the emergence of new viral diseases, genotypes, or genetic variants and for designing more efficient strategies for disease control, have not been conducted in-depth. In this matter, few studies have focused on CTV's evolutionary history with reference to geography [12,14,15]. The whole CTV genome evolutionary rate, from an infectious clone introduced by bark-flap inoculation into C. macrophylla, has been calculated to be 7 years after inoculation and was in the order of 10 −5 substitutions per site per year (s/s/y) [12]. This parameter was also calculated for p20 and p25 genes, in both cases resulting one order of magnitude higher (10 −4 s/s/y) than that for the complete genome estimation [14,15]. In the same article, Silva and co-workers [14] estimated the time of the most recent common ancestor (tMRCA) of CTV lineages based on the p25 gene. Nevertheless, these parameters were estimated for all CTV extant genotypes described at the moment of publication of the respective articles, highlighting the fact that the NC genotype has not yet been described. Another important contribution regarding the geographic behavior of CTV was made by Davino and co-workers [15], who studied the genetic relationship among CTV strains from Sicily, Italy, and evaluated the geographical and temporal dissemination of the virus in the island. Their results suggest that mild and severe CTV isolates belonging to five different clades (lineages) were introduced in Sicily in 2002.
With the aim to describe the spatiotemporal dynamics of the new NC-CTV genotype, which is highly represented in Uruguayan citrus orchards, we performed a Bayesian analysis of 123 dated sequences belonging to the aforementioned genotype. Sequences from Africa, Asia, Europe, North America, and South America were included in the analysis, and the ancestor of the NC genotype was dated as well as its evolutionary rate based on the p25 coding region. We also delineated the possible geographic dispersion route of the virus since its origin until 2015. To our knowledge, this is the first exhaustive report of the evolutionary and geographical behavior of the NC genotype, from which an exhaustive characterization remains incomplete.

Sequence Dataset
A total of 123 sequences of CTV p25 gene and associated information including collection date and location were retrieved from the NCBI nucleotide database (http://www.ncbi.nlm.nih.gov) to perform the analyses. The dataset comprises sequences collected between 1979 and 2015 from different countries, including Angola, Argentina, Brazil, China, Greece, India, Portugal, Sao Tomé and Principe, Taiwan, United States, and Uruguay. The Uruguayan sequences used in the present work were obtained from samples processed by our group (Table S1).
The model of nucleotide substitution that best fit the dataset (HKY) was selected using the jModelTest program according to the Akaike Information Criterion (AIC) [26,27]. Maximum Likelihood (ML) phylogenetic trees were reconstructed with PhyML program using an online web server [28,29].
The heuristic tree search was performed using the SPR branch-swapping algorithm and branch support was calculated with the approximate likelihood-ratio (aLRT) SH-like test [30].

Phylodynamic and Phylogeographic Approaches
The time of most recent common ancestor (tMRCA), the evolutionary rate of the new CTV genotype (NC) and its demographic history in Uruguay were jointly inferred using the Bayesian Markov Chain Monte Carlo (MCMC) statistical framework implemented in the BEAUti/BEAST package v1.8.0 (available online: http://tree.bio.ed.ac.uk/software/beast/) [31].
The TempEst program was used to determine that sequences used in this work showed temporal structure to proceed with molecular clock analyses.
The HKY nucleotide substitution model was determined to be the best fit with the dataset and the lognormal relaxed (uncorrelated) molecular clock model was used [14,32,33]. Bayesian analyses were conducted with the available tree priors described by Silva and co-workers [14] for the p25 gene and the best demographic model was evaluated through Bayes Factor; the Bayesian Skyline coalescent tree prior was used [34].  [35]. The KMZ file is available from the authors upon request.

The Monophyly and Temporal Signal of the Data Set
In order to test that the Uruguayan field samples collected in 2015 belonging to the NC genotype, a phylogenetic analysis with the ML method was performed, and the tree showed that all the sequences used in this work clustered as a highly supported monophyletic group (aLRT = 1) ( Figure S1). Another important fact to take into account when Bayesian approaches are used is that the dataset must have temporal signal and are not the result of artefactual nucleotide changes in the sequences [36,37]. We estimated the temporal signal with TempEst software (available online: http://tree.bio.ed.ac.uk/ software/tempest/) [38] and sequences with incongruent data/variability were discarded.

Rates and Dates of Evolution
Our Bayesian coalescent estimates of evolutionary dynamics, based on the p25 gene, indicate that the CTV-NC genotype evolves at a rate of 1.19 × 10 −3 subs/site/year (95% HPD, 1.72 × 10 −3 − 7.1 × 10 −4 subs/site/year). This rate was used to estimate times to common ancestry. Based on the MCCT (Figure 1

Evolutionary History of the Genotype
The demographic reconstruction of this new genotype, observed in the Skyline plot (Figure 3), showed that the viral population size can be divided into two different phases.
The first phase shows that the population size remained constant since its origin until around 2012. In the second phase, we can observe an increment of one log of the population size around this date until 2015, the date of the most recent sequences of our dataset, suggesting that the population is growing.

Evolutionary History of the Genotype
The demographic reconstruction of this new genotype, observed in the Skyline plot (Figure 3), showed that the viral population size can be divided into two different phases.

Evolutionary History of the Genotype
The demographic reconstruction of this new genotype, observed in the Skyline plot (Figure 3), showed that the viral population size can be divided into two different phases.
The first phase shows that the population size remained constant since its origin until around 2012. In the second phase, we can observe an increment of one log of the population size around this date until 2015, the date of the most recent sequences of our dataset, suggesting that the population is growing. The first phase shows that the population size remained constant since its origin until around 2012. In the second phase, we can observe an increment of one log of the population size around this date until 2015, the date of the most recent sequences of our dataset, suggesting that the population is growing.

Discussion
The monophyly of the NC genotype jointly with the fact that no recombinant sequences were found among the strains used in the present work are important in order to avoid bias in the subsequent phylodynamic and phylogeographic analyses.
The substitution rate obtained in the present work is similar to values reported for other RNA viruses [39]. Nevertheless, Silva and co-workers [14] recently described the gene evolutionary rate in the order of 10 −4 subs/site/year, although their calculations were made for all the CTV circulating genotypes at once. It has been proposed that, with the emergence of a new variant or during an outbreak, the evolutionary rate of a virus is faster and becomes slower and reaches the equilibrium after a period of time [40,41]. In this case, this is the first calculation of the CTV-NC genotype evolutionary rate apart from the other circulating genotypes being maybe the explanation for the difference between values.
Related to the ancestor, there is only one previous report of the tMRCA for all the existing CTV genotypes [14] in which the ancestor of the clade corresponding to the NC genotype (Group 5 Nolasco's typing system) was located between 37 and 431 years before 2010 . Despite the broad range obtained by the authors, which may simply reflect the heterogeneity in divergence times for the different CTV genotypes on which the calculations were based, our results are consistent with this.
The continuous movement of infected material, probably in an illegal manner despite the strict sanitary programs of every citrus-growing country, allows this virus genotype to spread continuously. Although this is the first report of the most probable dissemination dates and geographic locations of the CTV-NC genotype, our calculations predict that the virus was present 2-4 years before the first report of this genotype in every country where Group 5 of Nolasco's typing system was used to report it [42,43]. This result is consistent with a scenario in which the virus remains undetected until the number of infections reaches a detection threshold.
With respect to the spatio-temporal dynamics of the genotype in Uruguay, our results are in agreement with the two different introductions of the virus in the country, and it seems that the population size is growing since the second entrance of the genotype in Uruguay. The local expansion after the second introduction of the virus are confirmed by the high prevalence of this genotype in the country [13]. This population growth could be in agreement with the increment of the number of naïve citrus trees in Uruguayan orchards to encompass the citrus industry increase. Nevertheless, to confirm that the population is still growing, further analyses and continuous vigilance of citrus trees is needed. It has also been shown that temperature has an impact in virus population dynamics, favoring the spreading of thermo-tolerant variants at the expense of a diminished replication of other variants [44]. However, to confirm this hypothesis, further analyses are needed.

Conclusions
CTV has been moving around the world since the origin of modern citrus-growing techniques. The movement of contaminated material and the local dispersion of the virus by aphids after its arrival to a new place, together with an increasing number of naïve trees are the perfect scenario to a new virus genotype to rise. In the present work, we describe for the first time the dispersion pattern of the recently described NC-genotype around the world as well as the date of origin and its evolutionary rate. In particular, we describe a possible scenario to explain a local growth of its population in Uruguayan citrus orchards, which is in agreement with the emergence of a new variant hypothesis. In order to attain a better understanding of the NC genotype, studies about its biological characterization have to be done. These results are of great importance and must be taken into account to develop a genotype-specific cross-protection program against this highly represented genotype in Uruguay.