The Functional Impact of Transposable Elements on the Diversity of Plant Genomes

Transposable elements (TEs) are self-mobilized DNA sequences that constitute a large portion of plant genomes. Being selfish DNA, they utilize different mobilization mechanisms to persist and proliferate in host genomes. Importantly, new TE insertions generate de novo variability, most of which is likely to be deleterious, although some can be advantageous. In addition, a growing body of evidence shows that TEs have been repeatedly recruited by their hosts to provide additional functionality. Here, we review potential ways in which transposable elements can provide novel functions to host genomes, from simple gene knock-outs to complex rewiring of gene expression networks. We discuss possible implications of TE presence and activity in crop genomes for agricultural production.


Introduction
It has been widely acknowledged that two major forces, whole genome duplication (WGD) events and the activity of transposable elements (TEs), are responsible for shaping plant genomes [1]. TE activation and proliferation are driven by a complex array of interactions between TEs and host biology, for example, mode of reproduction, flowering biology, ploidy level, and the efficiency of mechanisms silencing TEs and controlling genome expansion/contraction. Moreover, these interactions are heavily affected by the environment and population size. This results in a vast diversity in the content and amount of TEs even among closely related plant genomes [2,3], or even among different lineages within the same species [4], pointing to the highly dynamic nature of the process. Here, we briefly describe the different groups of plant TEs and present the range of effects they have been shown to have on plant diversity.

Transposable Elements
TEs, also called mobile genetic elements or mobile DNA, are stretches of DNA capable of proliferating within host genomes through recurrent integration into new chromosomal positions. In plants, they constitute the largest portion of the nuclear genome, and they shape genomes so dynamically that even closely related species can differ greatly in the sets and abundance of particular TE families. There are different groups of TEs, and their classification is the subject of an ongoing debate. The most widely accepted classification system [5] defines two classes of TEs that differ sharply in their mechanism of transposition. Class I comprises elements that transpose using an RNA intermediate. As a result of a successful transposition event, a new copy is integrated while the seed element remains intact at the donor site. This mechanism is therefore often referred to as 'copy-and-paste' transposition. Class I is further divided into five orders, namely LTR (long terminal repeat elements), LINE (long interspersed nuclear elements), SINE (short interspersed nuclear elements), DIRS (Dictyostelium intermediate repeat sequence), and PLE (Penelope-like elements). In plants, LTR retrotransposons, especially those of the Ty1-copia and Ty3-gypsy superfamilies, have been particularly successful and usually constitute the major fraction of all plant TEs [6]. In fact, the abundance of LTR retrotransposons closely correlates with the genome size of angiosperms, largely explaining the so-called C-value paradox, that is, major differences in the amount of DNA in a haploid genome (C-value) that match neither the complexity of the species nor their taxonomic relationships.
In contrast, the abundance of Class II elements, referred to as DNA transposons, does not seem to correspond with the size of plant genomes (Figure 1). Owing to their mode of transposition, they are usually far less abundant than LTR retrotransposons. Only two orders are defined within Class II, that is, TIR (terminal inverted repeat elements) and Helitrons. Five TIR superfamilies are present in higher plant genomes: hAT, Mutator, CACTA, PIF/Harbinger, and Tc1-mariner. TIR DNA transposons transpose by a 'cut-and-paste' mechanism. Upon mobilization, they are excised from the donor site and reintegrate at the acceptor site. Helitrons, transposing via a 'rolling circle' mechanism, are also common components of plant genomes.
Founder elements of each family have to be capable of self-mobilization, that is, they must have functional open reading frames encoding the proteins required for transposition, as well as other essential structural components, depending on the type of TE. Such elements are called autonomous. However, non-autonomous TEs that have partially or completely lost their coding regions may still be able to transpose, as long as they can be targeted by the transposition machinery provided in trans by a related autonomous element. For example, SINEs are non-autonomous Class I elements relying on LINE-encoded proteins to transpose, while MITEs (miniature inverted-repeat transposable elements) are small non-autonomous TEs derived from and mobilized by DNA transposons of the TIR order. PIF/Harbinger and Tc1-mariner, for instance, provide the transposition machinery for Tourist and Stowaway MITEs, respectively [17,18]. In plants, MITEs can reach copy numbers in the range of tens of thousands, even though, unlike LTR retrotransposons, they do not constitute a large genomic fraction owing to their minute size.
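The hierarchy described in this section can be written down as a small lookup structure, which may be convenient when annotating TE calls. The following is only a minimal sketch: the names TE_CLASSIFICATION, MOBILIZED_IN_TRANS_BY and find_order are hypothetical and introduced here for illustration. It lists only the classes, orders, superfamilies and dependencies named in the text above, so an empty superfamily list means "none named here", not "none exist".

```python
# Classes, orders and superfamilies as named in the text above.
# An empty superfamily list means the text names none, not that none exist.
TE_CLASSIFICATION = {
    "Class I": {
        "LTR": {"mechanism": "copy-and-paste",
                "superfamilies": ["Ty1-copia", "Ty3-gypsy"]},
        "LINE": {"mechanism": "copy-and-paste", "superfamilies": []},
        "SINE": {"mechanism": "copy-and-paste", "superfamilies": []},
        "DIRS": {"mechanism": "copy-and-paste", "superfamilies": []},
        "PLE": {"mechanism": "copy-and-paste", "superfamilies": []},
    },
    "Class II": {
        "TIR": {"mechanism": "cut-and-paste",
                "superfamilies": ["hAT", "Mutator", "CACTA",
                                  "PIF/Harbinger", "Tc1-mariner"]},
        "Helitron": {"mechanism": "rolling circle", "superfamilies": []},
    },
}

# Non-autonomous groups and the autonomous elements that, according to the
# text, supply their transposition machinery in trans.
MOBILIZED_IN_TRANS_BY = {
    "SINE": "LINE",
    "Tourist MITE": "PIF/Harbinger",
    "Stowaway MITE": "Tc1-mariner",
}


def find_order(superfamily):
    """Return (class, order) for a superfamily named above, or None."""
    for te_class, orders in TE_CLASSIFICATION.items():
        for order, info in orders.items():
            if superfamily in info["superfamilies"]:
                return te_class, order
    return None


if __name__ == "__main__":
    print(find_order("CACTA"))
    print(MOBILIZED_IN_TRANS_BY["Stowaway MITE"])
```

Keeping the transposition mechanism at the order level reflects the fact that, within Class II, TIR elements and Helitrons transpose differently.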

TEs and the Host Genome
TEs are essentially selfish genetic entities, sometimes even called 'ultimate parasites' [19]. Their mobility allows them to survive throughout generations, while generally they do not provide any advantage to the host. However, a growing body of evidence points to much more complex relationships between TEs and host genomes. Each transposition event generates de novo variability. It may be presumed that most newly integrated copies have a deleterious to neutral impact on the host, depending on the site of integration. Therefore, host genomes have developed mechanisms for repressing the activity of TEs [20]. In plants, the predominant mechanism relies on siRNA-directed repressive chromatin modifications. Each TE family has its own specificity in terms of preferred genomic location. The most abundant families of LTR retrotransposons are frequently located in pericentromeric and intergenic regions, while DNA transposons are distributed in genic regions, near genes or in UTRs and introns, and the mechanisms of silencing are very different in the two contexts [21]. The deficient in DNA methylation 1 (DDM1) pathway is responsible for stable silencing of TEs by forming constitutive heterochromatin, while the RNA-directed DNA methylation (RdDM) pathway targets mostly younger and smaller TEs located in the vicinity of genes [22].
Despite the tight epigenetic control of TE mobility by the host genome, some TEs have developed strategies to self-regulate their activity in response to stress by acquiring stress-responsive motifs recognized by host regulatory proteins. Many reports have shown that TEs require environmental stress for mobilization. For example, the ONSEN Ty1-copia retrotransposon is mobilized at elevated temperatures [23], while other TE families were shown to be induced by cold, drought, salinity, wounding, UV light and pathogen attack [24]. Some DNA transposons may, at least to some extent, escape epigenetic silencing, as they are especially AT-rich and target AT-rich genomic regions [25]. Another strategy adopted by TEs is to minimize the negative effects of transposition. For example, new insertions of a group of Ty3-gypsy retrotransposons are targeted to transcriptionally silent heterochromatic regions through a chromodomain at the C-terminus of their integrase, which recognizes epigenetic marks characteristic of heterochromatin [26].
TEs may become active shortly after invading the genome via horizontal transfer [27] or be reactivated [28,29], possibly upon relaxation of the host controlling mechanisms as an effect of environmental stress, which may be further reinforced by the above-mentioned stress responsiveness developed by particular TE families. TEs may also be turned on by interspecific hybridization and polyploidization [30]. Plant tissue cultures have also been reported as common inducers of TE activity, which results in somaclonal variation [31]. Release of the host control over TEs results in family-specific bursts of transposition spanning short evolutionary periods, followed by re-establishment of the silencing machinery (Figure 2). This pattern is very different from that of single nucleotide substitutions, which are assumed to arise at a relatively constant rate per generation. It has been reported that an active MITE family produced up to 40 new insertions per plant per generation [25], by far exceeding the rate of nucleotide substitutions.
Even though TE activation is not directly beneficial to the host, subsequent TE/genome interactions may occasionally provide opportunities for the development of genetic novelty, which can ultimately result in better adaptation. Notably, such novelty may arise at any of the above-described levels of interaction: it may be conditioned by rearrangements at the excision or insertion site, by TEs providing new regulatory elements to adjacent genes, or by recruitment of the epigenetic silencing machinery, mostly RdDM, to regulate gene expression. Finally, more pronounced chromosomal rearrangements, such as duplications, deletions and inversions, can be induced upon recombination between different copies of TEs from the same family. In the short run they can affect adaptability, while in the long run they may drive speciation and promote evolvability. In subsequent sections, we present examples of processes by which TEs can shape host genomes.

Gene Knock-Outs
Perhaps the most readily observed effects of recent transposition events are knock-out mutations, which frequently result in abrupt phenotypic changes. The loss of purple pigmentation of maize kernels was the first macroscopic trait attributed to the activity of TEs [32]. Wrinkled seeds, one of the traits used by G. Mendel to define the laws of inheritance, were shown to be caused by the insertion of a hAT element knocking out the gene encoding the starch-branching enzyme SBEI [33]. Over the years, many such knock-out mutations were described, affecting a range of plant organs, including numerous reports on flower color and morphology [34][35][36][37][38]. The propensity for induction of knock-out mutations at a high rate made some TE families, for example, Ac/Ds and En/Spm from maize, routine tools for insertional mutagenesis, referred to as transposon tagging [39]. Robust collections of mutants were produced in different plant species, including the model plant Arabidopsis thaliana (e.g., [40,41]).
In soybean, insertion of a Ty1-copia LTR retrotransposon in GmphyA2 rendered plants insensitive to photoperiod and allowed them to be cultivated at high latitudes [42], an example of TE-driven adaptability. Occurrence of the knock-out mutants was limited to the northern regions of Japan, likely owing to human selection for increased fitness in that particular environment. Presumably, in nature knock-out mutations rarely have a positive effect on the fitness of the host.
On the other hand, altered phenotypes have frequently been selected for by farmers, as they provided better agronomic, culinary or processing quality to crops. A range of TE-derived knock-out mutations affect seed quality, for example, waxy mutants in cereals [43][44][45].

Alterations of Gene Expression
Insertions of TEs into regulatory regions may result in a more subtle change, not completely turning off the gene, but rather altering its expression profile. In Sicilian blood oranges, insertion of an LTR retrotransposon upstream of Ruby, a MYB transcription factor, resulted in its increased expression in the fruits and purple coloration of the flesh. In addition, an LTR-derived cold-responsive regulatory motif enhances the expression of Ruby at low temperatures, so that anthocyanins accumulate in the cold [46]. Another seminal example is reduced branching in maize as compared to its wild ancestor, teosinte. The expression of teosinte branched1 (tb1), a domestication syndrome gene, is enhanced by insertion of a Hopscotch retrotransposon into a regulatory region ca. 60 kb upstream of tb1, resulting in the apical dominance characteristic of maize [47]. TE-driven changes in the regulation of gene expression may lead to altered physiology and stress tolerance of the host, increasing its adaptability to particular ecological niches. TEs were shown to be involved in modifications of the response to light [48] and in early flowering [49]. They influence the host plant's reaction to abiotic and biotic stresses, for example, tolerance to increased levels of aluminum [50] or disease resistance [51].

Epigenetic Reprogramming
Even though TEs localized in the vicinity of genes are targets for epigenetic silencing, efficient mechanisms exist that restrain the spread of methylation out of TEs and maintain tight boundaries separating TE termini from adjacent genes [21]. Nevertheless, changes in the status of epigenetic marks imposed on TEs by the silencing machinery may extend beyond the TE, producing stably inherited epialleles [1]. Two examples of such TE/gene interactions are the epigenetic silencing of the FWA and FLC genes, which govern flowering behavior in A. thaliana, by SINE and hAT elements, respectively [52,53].

Structural Rearrangements
Active TE families have been reported to be involved in the movement of genes, segmental duplications and other types of rearrangements. A group of ca. 3000 Mutator-like elements (MULEs) in rice, named Pack-MULEs, was shown to capture genes or gene fragments and shuffle them around the genome, sometimes giving rise to novel chimeric transcripts [54]. For some of those transcripts, evidence for their potential function was provided [55]. In maize, gene fragments originating from 376 different genes were transduplicated by Helitrons [56]. Analysis of wheat chromosome 3B showed that CACTA transposons mediated 140 gene capture events, with some of the captured genes being transcribed and showing signatures of selection [29]. Retroposition of a region comprising the SUN gene in tomato into a novel position, driven by an LTR retrotransposon, resulted in the development of elongated fruits [57]. The above examples underline the propensity of many different types of TEs to dynamically shape plant genomes on a large scale, possibly providing genetic novelty that can be adaptive.
Genome rearrangements can also result from homologous recombination between different copies of elements belonging to the same family or from alternative transposition [58]. In maize, aberrant transposition events of Ac/Ds elements at the p1 locus produced deletions, duplications, inversions and translocations [59]. Accumulation of such rearrangements in different lineages may have a profound effect on the development of reproductive barriers, ultimately driving speciation.

Exaptation and Rewiring Gene Expression Networks
TE-encoded proteins originally governing their mobility may at some point become exapted, that is, they acquire new functions advantageous to the host and turn into regular genes. Plant FAR1 and FHY3 transcription factors responsible for far-red light dependent morphogenesis were shown to be homologous to a MULE-encoded transposase [60]. Similarly, MUSTANG, gary and DAYSLEEPER genes were derived from Class II TE transposases and were postulated to fulfill functions that are currently essential for plant development [61][62][63].
The potential for molecular domestication of TE-encoded transposases lies in the fact that they share the ability to recognize and bind to specific DNA motifs located in the terminal inverted repeats (TIRs) of the corresponding transposons. It has been proposed that the ability to interact with DNA motifs distributed around the genome as a result of prior transposition bursts drove their recurrent exaptation: the proteins retained the DNA-binding feature and acquired the capability to regulate the expression of genes proximal to the binding sites. Upon natural selection for the most advantageous interactions, a fine-tuned regulatory network would ultimately emerge, providing concerted expression of a number of genes driven by the regulatory protein originating from the exapted transposase [64]. In this way, TEs would be responsible both for spreading the prospective transcription factor binding sites and for providing the protein capable of recognizing them.

Conclusions
The sessile lifestyle of plants renders them vulnerable to environmental stresses. The capability for genetic adaptation and development of novel mechanisms of resistance is therefore crucial for them to persist. Relationships between the environment, plant genomes, and TEs are complex. Upon stress, mechanisms controlling the activity of TEs are relaxed, possibly as a side-effect of the gene expression reprogramming required for physiological adaptation. Mobilization of certain TE families can be further enhanced by their internal stress-responsive regulatory motifs. Even though a stress-induced transposition burst likely produces mostly deleterious effects, at the population level it may provide novel advantageous variants which would be subject to natural selection and ultimately drive the evolution of the species, especially if combined with more pronounced TE-driven genomic rearrangements (Figure 3). TE activity in combination with recent allopolyploidization was shown to be essential for the successful colonization of novel ecological niches by the recently formed invasive species Spartina anglica [65]. However, it is difficult to directly link recent or ongoing TE proliferation with defined positive adaptive changes. Nevertheless, novel genomic approaches allow TE/genome interactions to be investigated at an unprecedented level, shifting the paradigm from anecdotal reports on the effects of particular TEs on the host to more systematic studies revealing global relationships [66].
Crop domestication and improvement provide more compelling examples of novel TE-derived phenotypes that happened to be attractive to early farmers (e.g., apical dominance in maize [47]) or modern producers (e.g., seedless apples [67]). Notably, most of them were simple knock-out mutations which were easy to select. Recently, controlled activation of endogenous TEs was proposed as a tool for crop improvement [68], and experiments outlining possible protocols for such implementation have been reported [69]. As described above, TEs provide a means for heritable genetic/epigenetic alterations in response to stress factors, generating phenotypic plasticity that can be subject to selection, resulting in adaptation to a changed environment. Given the inevitable problems imposed by global warming, this strategy might be especially useful in developing novel crop varieties more tolerant to a range of abiotic stresses.

Figure 1. Correlations between genome size and abundances of Class I and Class II transposable elements in diploid angiosperm genome assemblies. Genome sizes (in Mb) are shown on the X axis (logarithmic); the percentages of the genome occupied by all transposable elements (TEs) (blue diamonds), Class I (red squares), and Class II (green triangles) are shown on the Y axis (linear). Coefficient of determination (R²) values are shown next to the corresponding trend lines. Data are from [7] and other reports on angiosperm genome assemblies [8-16].
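An R² value of the kind reported in Figure 1 could be computed as in the sketch below. It assumes that each trend line is an ordinary least-squares line fitted to TE percentage against log10 of genome size, which the caption suggests but does not state. The function name r_squared_log_x is hypothetical, and the numbers in the usage lines are synthetic placeholders, not values from any genome assembly.

```python
import math


def r_squared_log_x(genome_sizes_mb, te_percent):
    """R^2 of a least-squares line fitted to te_percent vs log10(genome size).

    Assumption: the trend line is linear in log10(size), matching a
    logarithmic X axis and a linear Y axis.
    """
    xs = [math.log10(size) for size in genome_sizes_mb]
    ys = list(te_percent)
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either series is constant")

    # For a simple linear regression, R^2 equals the squared Pearson r.
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Synthetic placeholder values for illustration only.
    sizes_mb = [100, 1000, 10000, 100000]
    te_share = [5.0, 7.0, 5.0, 7.0]
    print(round(r_squared_log_x(sizes_mb, te_share), 3))
```

Under this assumption, an R² close to 1 would correspond to the tight relationship described for Class I elements, and a value close to 0 to the lack of correspondence described for Class II.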

Figure 2. A simplified presentation of the outcome of subsequent transposition bursts of related TE families. (a) An exemplary phylogenetic tree of TE families active in different evolutionary periods. Colored triangles represent coalesced clades grouping individual copies belonging to the same family. (b) Mutation rates resulting from bursts of TE activity over time. Colors correspond to those of the family clades shown in (a); the dashed line depicts the constant rate of nucleotide substitutions.

Figure 3. A schematic representation of key processes driving the interplay between transposable elements, host genomes and the environment.