Molecular Tools for Exploring Polyploid Genomes in Plants

Polyploidy is a very common phenomenon in the plant kingdom, where even diploid species are often described as paleopolyploids. The polyploid condition may bring about several advantages compared to the diploid state. Polyploids often show phenotypes that are not present in their diploid progenitors or exceed the range of the contributing species. Some of these traits may play a role in heterosis or could favor adaptation to new ecological niches. Advances in genomics and sequencing technology may create unprecedented opportunities for discovering and monitoring the molecular effects of polyploidization. Through this review, we provide an overview of technologies and strategies that may allow an in-depth analysis of polyploid genomes. After introducing some basic aspects on the origin and genetics of polyploids, we highlight the main tools available for genome and gene expression analysis and summarize major findings. In the last part of this review, the implications of next generation sequencing are briefly discussed. The accumulation of knowledge on polyploid formation, maintenance, and divergence at whole-genome and subgenome levels will not only help plant biologists to understand how plants have evolved and diversified, but also assist plant breeders in designing new strategies for crop improvement.


Polyploidy in the Plant Kingdom: Occurrence and Significance
The genome sizes of eukaryotes can differ 10,000-fold and part of these differences may be attributed to changes in the ploidy level. Polyploids are organisms having more than two complete sets of chromosomes in their cells. They are common in angiosperms, where at least 70% of the species experienced one or more events of genome doubling during their evolutionary history [1,2]. Many crop species are polyploids (Table 1), and it was stated that "life on earth is predominantly a polyploid phenomenon and civilization depends mainly on use of polyploid tissues-noteworthy is the endosperm of cereals" [3]. Polyploidization is considered a major evolutionary force in plants. It is a definitive cause of sympatric speciation due to the immediate reproductive isolation between newly formed polyploids and their parents [4]. Polyploidization also makes it possible to overcome hybrid sterility and produce viable offspring following interspecific hybridization. However, there are still several open questions related to polyploidy and polyploidization. For example, Soltis et al. [5] reported that polyploidy frequency in angiosperms is high, even if the number of lineages that really derived from whole-genome duplication (WGD) events is still largely unknown. Similarly, it is not clear if polyploidy causes a change in the interaction with herbivores and pollinators. Based on their origin, polyploids are classified into two major groups, autopolyploids and allopolyploids [6]. The former are the result of doubling homologous genomes (e.g., autotetraploid AAAA) from a single, or closely related, species. The latter are the result of hybridization between different species and therefore combine two or more different genomes (e.g., allotetraploid AABB). As a consequence, autopolyploids may form multivalents at meiosis and have polysomic inheritance. By contrast, allopolyploids show bivalent pairing and have disomic inheritance [7].
There are several mechanisms that may lead to an increase in ploidy level in plants. However, there is strong circumstantial evidence that sexual polyploidization through gametes with unreduced chromosome number (2n gametes) represents the main route for polyploidization [8,9]. 2n gametes generally result from the expression of mutations affecting micro- and megasporogenesis. Such mutations have been extensively studied in a number of genera, including Solanum, Medicago, Manihot, Malus, Arachis, Lolium, and Agropyron, and have been generally attributed to the action of single recessive genes [10]. The first gene (AtPS1) involved in 2n gamete production has been identified in A. thaliana [11]. AtPS1 mutants display an anomalous (parallel, fused, tripolar) orientation of spindles at metaphase II of male meiosis, leading to the production of 2n pollen. Similarly, 2n gametes have been observed in the jason (jas) [12], switch1 (swi1)/dyad [13,14], osd1 and tam (CYCA1;2) mutants [15,16]. In a given species and habitat, the acquisition of the polyploid condition may bring about several advantages. Following polyploidization, novel phenotypes often appear or variation exceeds the range observed in the diploid parental species. Fawcett et al. [17] suggested that polyploids had a better chance to survive the Cretaceous-Tertiary extinction event. Phenotypic advantages may include, among others, changes in morphology, physiology and secondary metabolism that confer an increased fitness. Some of these traits, such as increased drought tolerance, pathogen resistance, longer flowering time, and larger vegetative and reproductive organs (Figure 1), may represent important plant breeding targets and, therefore, increase the potential use of polyploids in agriculture. From a genetic point of view, the most significant advantages associated with polyploidy are probably heterosis and gene redundancy [18].
Heterosis is due to non-additive inheritance of traits in a newly formed polyploid compared to its parents. Notably, it can also be present at the gametophytic level. The main factors that affect non-additive inheritance are likely novel regulatory interactions and allelic dosage [19]. Gene redundancy promotes neofunctionalization of duplicated genes in the long term, but also immediately protects against deleterious recessive alleles. In a recent treatment, Mayrose et al. [20] shed more light on the evolutionary dynamics and consequences of polyploidy. The authors, computing the net diversification rates of polyploid lineages for ferns, lycophytes, gymnosperms and angiosperms, hypothesized that polyploidy is often an evolutionary dead end. However, the possible longer-term evolutionary success of those polyploids that survive needs to be tested. Several studies provided evidence that extensive and reproducible genetic and epigenetic changes are possible following polyploidization [9]. They include DNA and histone methylation, DNA elimination, gene losses, gene neo- and subfunctionalization, translocations, amplification and reduction of repetitive sequences and alteration of gene expression through different mechanisms (methylation-mediated silencing, transposon activation, intergenomic interactions, etc.). Given the recent advances in the field of plant molecular biology and biotechnology, through this review we provide an overview of the most suitable technologies and strategies that can allow, at the molecular level, efficient studies on polyploid plants, possibly promoting research in areas that have been ignored or underestimated so far. Reference to some recent significant findings will also be made.

Methods for Genome Analysis
A combination of genetic mapping, molecular cytogenetics, sequence and comparative analysis has shed new light and opened perspectives on the nature of ploidy evolution at all timescales, from the base of the plant kingdom to intra- and interspecific hybridization events associated with plant domestication and breeding. Strong evidence on the mechanisms of genomic modification has come from the physical analysis of chromosomes by in situ hybridization techniques and from genome-wide molecular marker analyses.

In Situ Hybridization
In situ hybridization represents the bridge between the chromosomal and molecular level of genome investigation. In recent years it has received renewed interest for detecting chromosome rearrangements. It is very powerful for reliable identification of chromosomes, allowing the positioning of unique sequences and repetitive DNAs along the chromosome(s). Fluorescent in situ hybridization (FISH) is based on fluorescent labels linked to DNA probes and visualized under a fluorescence microscope. Genomic in situ hybridization (GISH) involves the use of total genomic DNA of a species as a probe on chromosomes, thus leading to whole genome discrimination rather than the localization of specific sequences. There are several examples of the use of these techniques. Studies on the distribution of four tandem repeats in allotetraploids Tragopogon mirus, Tragopogon miscellus and their diploid parents provided evidence that chromosomal rearrangements did not occur following polyploidization, as suggested by the additive patterns of polyploids [21]. By contrast, in newly synthesized allotetraploid genotypes of B. napus, Szadkowski et al. [22] demonstrated extensive genome remodeling due to homeologous pairing between the chromosomes of the A and C genomes. Based on high-resolution cytogenetic maps, Wang et al. [23] demonstrated that genome size difference between the A and D sub-genomes in allotetraploid cotton was mainly associated with uneven expansion or contraction between different regions of homoeologous chromosomes. Recently, Chester and co-workers [24] combined GISH and FISH analysis to demonstrate that in natural populations of T. miscellus extensive chromosomal variation (mainly due to chromosome substitutions and homeologous rearrangements) was present up to the 40th generation following polyploidization.
Autopolyploids are particularly intractable because segregation depends on chromosome pairing behavior (preferential vs. random pairing) and double reduction [34]. A simplification that has been generally adopted for mapping dominant molecular markers is to only utilize the so-called single dose markers from each parent, i.e., those segregating 1:1 in the mapping population (for example, a population obtained from the cross Mmmm × mmmm in a tetraploid species) [35,36]. Statistical models for QTL mapping have been developed for polyploids with bivalent pairing, taking into account preferential vs.
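The 1:1 segregation expected for single dose markers can be checked by enumerating gametes. The following is a minimal sketch in Python, assuming random chromosome segregation with no double reduction (a simplification; the function names are illustrative and do not come from the cited studies). It lists the diploid gametes produced by a tetraploid parent and computes the fraction carrying the dominant marker allele M.

```python
from collections import Counter
from itertools import combinations


def gamete_ratios(genotype):
    """Count the diploid gametes of an autotetraploid parent.

    Assumption: random chromosome segregation without double reduction,
    so each of the 6 possible pairs of the 4 homologues is equally likely.
    """
    return dict(Counter("".join(sorted(pair)) for pair in combinations(genotype, 2)))


def fraction_with_marker(ratios, allele="M"):
    """Fraction of gametes carrying at least one copy of the dominant allele."""
    total = sum(ratios.values())
    carrying = sum(n for gamete, n in ratios.items() if allele in gamete)
    return carrying / total


simplex = gamete_ratios("Mmmm")  # single dose marker parent
duplex = gamete_ratios("MMmm")   # double dose marker parent

print(simplex, fraction_with_marker(simplex))
print(duplex, fraction_with_marker(duplex))
```

Under these assumptions, a simplex parent (Mmmm) crossed to a nulliplex parent (mmmm) gives marker presence and absence in equal proportions, whereas a duplex parent (MMmm) gives a 5:1 ratio, which is why only the single dose class behaves like a simple testcross marker.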

Methylation Sensitive Molecular Markers
An AFLP-like method using restriction enzymes that share the same recognition site but differ in their sensitivity to DNA methylation (isoschizomers) proved to be efficient and reliable for the determination of genome-wide DNA methylation patterns [46]. This technique, termed Methylation-Sensitive Amplified Polymorphism (MSAP), is based on the use of the isoschizomers HpaII and MspI, both recognizing the 5′-CCGG sequence but differently affected by the methylation state of the outer or inner cytosine residues. Using this method, noteworthy results were obtained in newly synthesized polyploids. In Arabidopsis, Madlung et al. [47] demonstrated that frequent changes occurred in F4 allotetraploids when compared with the parents. Changes involved both increases and decreases in methylation, but no overall hyper- or hypomethylation. Similarly, alterations in cytosine methylation in wheat occurred in about 13% of the loci, either in F1 hybrids or in allopolyploids. Notably, alterations in methylation patterns affected both repetitive DNA sequences and low-copy DNA in approximately equal proportions [25]. On the other hand, lack of rapid DNA methylation changes at symmetric CCGG sites was hypothesized in allopolyploid cotton [48]. A similar behavior was observed also in Brassica [49] and sugarcane [50].
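The logic of MSAP scoring can be sketched in a few lines of Python. The example below assumes a commonly used scoring convention (band present or absent in the HpaII and MspI digests of the same sample); the class labels, function names and toy profiles are illustrative assumptions and are not taken from the studies cited above.

```python
def classify_msap(hpaii_band, mspi_band):
    """Classify one CCGG locus from band presence in the two digests.

    Assumed convention: HpaII does not cut when the inner cytosine is
    methylated, while MspI does not cut when the outer cytosine is methylated.
    """
    if hpaii_band and mspi_band:
        return "unmethylated"
    if mspi_band:
        return "inner cytosine methylated"
    if hpaii_band:
        return "outer cytosine hemimethylated"
    return "uninformative"  # full methylation or loss of the site


def fraction_changed(parent_profile, polyploid_profile):
    """Fraction of loci whose methylation class differs between two samples.

    Each profile is a list of (hpaii_band, mspi_band) tuples, one per locus.
    """
    if len(parent_profile) != len(polyploid_profile) or not parent_profile:
        raise ValueError("profiles must be non-empty and of equal length")
    changed = sum(
        classify_msap(*p) != classify_msap(*q)
        for p, q in zip(parent_profile, polyploid_profile)
    )
    return changed / len(parent_profile)


# Hypothetical toy data: four loci scored in a parent and in a polyploid.
parent = [(True, True), (False, True), (True, True), (True, False)]
polyploid = [(True, True), (True, True), (False, True), (True, False)]
print(fraction_changed(parent, polyploid))
```

In this toy comparison one locus loses and one gains methylation, which mirrors the observation that changes can go in both directions without an overall shift.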

Comparative Genome Analysis
Comparative genomics research has gained importance as a powerful tool for addressing both fundamental and applied questions in genome evolution [51][52][53]. The implementation of these methodologies, however, requires consideration of the variable rates at which different aspects of genome evolution occur [52]. Comparing different wheat species, genomic rearrangements originating by illegitimate DNA recombination were identified as a major evolutionary mechanism [54,55]. Innes et al. [56], comparing homologous regions in several related legume species, demonstrated that retroelements were the largest contributor to duplicated regions. Comparative analysis of Brassica oleracea triplicated segments showed that 35% of the genes were lost. Retained genes were dosage-sensitive and not randomly located. Duplicates of transcription factors and members of signal transduction pathways were significantly over-retained following WGD, whereas these same functional gene categories exhibited lower retention rates following smaller scale duplications [57]. For instance, in four independent polyploid wheat lineages, recurrent deletions of Puroindoline (Pin) gene at the grain Hardness (Ha) locus were identified [58].
Phylogenetic and taxonomic studies have been conducted in order to pinpoint the exact placement of the ancient polyploidy events within lineages and to determine when novel genes resulting from polyploidy have enabled adaptive processes. Recent genomic investigations not only indicated that polyploidy is ubiquitous among angiosperms, but also suggested several ancient WGD events [59][60][61][62], even in basal angiosperm lineages. Phylogenetic reconstruction with completely sequenced genomes suggested that genome doubling led to a dramatic increase in species richness in several angiosperms, including Poaceae, Solanaceae and Brassicaceae, thus contributing to the dominance of seed plants and angiosperms [63]. To date, only a few reports investigated the fate of the genome after polyploid formation [64]. The probability of fixation and maintenance of duplicated genes depends on many variables. Transposable elements may play a key role in fuelling genome reorganization and functional changes following allopolyploidization. A pivotal example of using comparative approaches to investigate the role of retroelements in polyploids is provided by Wawrzynski et al. [65]. The authors, investigating nonautonomous retrotransposon replication in soybean, estimated a much greater impact of such transposable elements on genome size than previously appreciated. More recently, Kraitshtein et al. [66] reported retrotransposition bursts in subsequent generations of wheat. By contrast, no evidence for a transposition burst was found in different allopolyploid species [31,67,68]. Comparative approaches in which genetic events are considered in both a phylogenetic and a genetic framework should be conceptualized and modeled.

High-Throughput DNA Sequencing and High Resolution Melting (HRM) Analysis
High-throughput DNA sequencing associated with computational analysis provides general solutions for the genetic analysis of polyploids [69]. However, ploidy is a substantial challenge in sequencing and assembly of plant genomes. A number of biological factors influence the feasibility of discrimination, including the degree of gene family complexity, and the reproductive system. Of course, the level of knowledge concerning the progenitor diploid species is also very important. To date, all attempts to sequence polyploids have relied on either a reduction in ploidy or a physical separation of chromosomes. Attempts to sequence a heterozygous diploid potato genome (RH89-039-16) were challenging due to the high degree of heterozygosity [70]. In order to bypass the difficulties of sequencing the polyploid genome of cultivated strawberry (Fragaria x ananassa), woodland strawberry (Fragaria vesca) was sequenced [71]. In B. napus, the polyploidy issue was addressed by sequencing leaf transcriptome across a mapping population and representative ancestors of the parents of the population [72]. The Wheat Genome Initiative (http://www.wheatgenome.org/) has focused on flow cytometry separation of individual or groups of homeologous chromosomes [73]. To better understand the nature and extent of variation in functionally relevant regions of a polyploid genome, a sequence capture assay to compare exonic sequences of allotetraploid wheat accessions was developed [74]. In cultivated wheat gene duplications were predominant, while in wild wheat mainly deletions were identified. Exon capture proved to be a powerful approach for variant discovery in polyploids. This technique has the potential to identify variation that can play a critical role in the origin of new adaptations and important agronomic traits.
A wealth of SNP detection approaches has been applied to study polyploidy in plants. Akhunov et al. [75], using the Illumina GoldenGate assay, identified a high number of SNPs in tetraploid and hexaploid wheat. More recently, Allen et al. [76] identified more than 14,000 putative SNPs in 6,225 distinct hexaploid bread wheat reference sequences from Illumina GAIIx data. In elite inbred maize lines, more than 1 million SNPs have been identified using an Illumina sequencing platform [77]. In the heterozygous polyploid sugarcane, a targeted SNP discovery approach based on 454 sequencing technology was developed by Bundock et al. [78]. Using 454 and Illumina expressed sequence tag sequencing of the parental diploid species of the allotetraploid Tragopogon miscellus, Buggs et al. [79] identified more than 7,700 SNPs differing between the two progenitor genomes from which the allotetraploid derived. The Sequenom MassARRAY iPlex platform [80] was used to validate 92 SNP markers at the genomic level that were diagnostic for the two parental genomes. SNP discovery was also pursued through 454 technology coupled to High Resolution Melting (HRM) curve analysis in tetraploid alfalfa (Medicago sativa) [81]. HRM is a technique that can identify mismatches, even for single bases, in amplicons containing heteroduplex molecules [82], and is emerging as a powerful tool for polyploid genetics [83]. It was demonstrated that the 454 system is a cost-effective approach for SNP discovery targeted to genes of interest in polyploid genomes, and that HRM can identify different alleles in polyploids [66,68]. Salmon et al. [84] detected homoeologous SNPs in G. arboreum (A genome), G. raimondii (D genome), and G. hirsutum (AD genomes). The authors estimated that the proportion of the G. hirsutum genome that has experienced non-reciprocal homoeologous exchanges since the origin of polyploid cotton 1-2 Mya was between 1.8% and 1.9%. SNPs have also been discovered in transcriptome sequences of polyploid B. napus [85]. Next-generation sequencing has been used to mine SNPs in elite wheat germplasm [76].

Methods for Gene Expression and Regulation Analysis
Several methods have been developed for quantifying gene transcription and regulation patterns in polyploids. Although studying gene expression changes in allopolyploids is more complicated than in autopolyploids, most studies on ploidy-related gene expression changes were carried out on synthetic allopolyploids. Indeed, genome merger and doubling can determine widespread transcriptome modifications, generating cascades of novel expression patterns, regulatory interactions, and new phenotypic variations that subsequent natural selection may act upon.

Northern Hybridization and cDNA-AFLP
When sequence information was still scanty, comprehensive transcript-profiling for quantitatively measuring gene expression variation was carried out by Northern blot analysis. This technique involves the use of electrophoresis to separate RNA samples and detection with a labeled probe complementary to a specific RNA target sequence. There are only a few examples of the use of this type of analysis in polyploids. It was used by Guo and co-workers [86] to investigate the dosage effects of 18 genes in an autopolyploid maize series (1×, 2×, 3×, and 4×). Expression levels of genes were dependent on chromosome dosage, although some varied their expression in response to the "odd" or "even" ploidy. By contrast, several examples are available of the use of cDNA-AFLP. It is a PCR-based technique, which relies on digestion of cDNA by two restriction enzymes and ligation of specific adapters. A set of specific primers designed for these adapters allows simultaneous amplification of fragments under stringent conditions. In synthetic allotetraploids between A. thaliana and A. arenosa, Comai et al. [87] found 20 suppressed genes out of 700 examined. Similarly, Lee and Chen [88], by extending the analysis also to A. suecica (a natural allopolyploid likely formed through pollination of A. arenosa with 2n gametes from A. thaliana), were able to identify a set of 10 different genes differentially expressed in A. suecica and its progenitors. In synthetic Triticum aestivum allohexaploids [89], about 8% of transcripts displayed altered expression, and >95% of them were reduced or absent. Similar gene expression changes have been found in cotton allopolyploids [90]. Another example of the use of this technique is offered by the work carried out by Tate et al. [91]. In newly synthesized Tragopogon allotetraploids, preferential expression of parental homeologues displayed a correlation with a loss of parental genomic fragments. Notably, such changes were not observed in newly developed Tragopogon F1 hybrids, implying that they arose following genome duplication.

Single-Strand Conformational Polymorphism (SSCP) Analysis
This technique detects sequence variations (single-point mutations and other small-scale DNA changes) through electrophoretic mobility differences. DNA that contains a sequence mutation (even a single base pair change) displays a different measurable mobility compared to reference DNA when electrophoresed in non-denaturing, or partially denaturing conditions. Due to these features, SSCP was employed to distinguish between homoeologous cDNA molecules. This approach has been applied to A. suecica [88], cotton [90] and wheat [92] leading to the finding that, basically, genes duplicated by polyploidy are rarely expressed at similar levels, and that there is a biased expression or silencing of some homeologous gene pairs.

Microarrays
Technologies to monitor gene expression achieved a breakthrough through the introduction of microarrays [93]. Genome and transcriptome sequencing have sped up probe development, which consequently resulted in the commercial availability of whole-genome microarrays for many model and crop species. This also gave the possibility to design custom arrays at affordable costs. Approaches of comparative expression profiling have mainly focused on synthetic allotetraploids, revealing both additive and non-additive gene expression. The former occurs when the gene expression level in the tetraploid is either the sum of the parental values or equal to the mid-parent value (MPV). For instance, transcriptional profiling of resynthesized A. suecica lines from newly created autotetraploid A. thaliana and the natural tetraploid A. arenosa revealed that, although most genes were additively expressed (from 65% to 95%), more than 1,400 genes diverged from the MPV. The combination of diverged parental genomes in a common nucleus during allopolyploidization implies the reunion of previously diverged regulatory hierarchies, which likely entails non-additive gene expression. This hypothesis has been validated by genome-wide expression analyses also in synthetic polyploids of wheat, cotton, Senecio, Brassica, and Spartina [94][95][96][97][98][99]. These studies demonstrated that allopolyploid plants exhibit considerable transcriptome alterations as compared with their diploid progenitors. Transcriptome analyses of autopolyploids suggested that there are less dramatic alterations of gene expression compared to allopolyploids [100]. Expression profiling analyses of autotetraploid A. thaliana of two different accessions revealed that transcriptome alterations caused by autopolyploidy depend on genome or genetic composition [101].
Microarray analysis provided evidence that ~10% of the ~9,000 potato genes tested displayed expression changes (within the twofold level) among a potato autopolyploid series (1×, 2×, and 4×) [102]. A similar twofold level change was detected in a corn ploidy series (1×-4×) [103].
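The comparison with the mid-parent value described above can be illustrated with a short Python sketch. It is a simplified illustration only: the twofold threshold, the function names and the expression values are assumptions, and published analyses rely on replicated measurements and statistical testing rather than a fixed ratio cutoff.

```python
def mid_parent_value(parent1, parent2):
    """Mid-parent value (MPV): the mean of the two parental expression levels."""
    return (parent1 + parent2) / 2.0


def classify_expression(parent1, parent2, polyploid, fold=2.0):
    """Label a gene as additive or non-additive relative to the MPV.

    Assumption: a gene is called non-additive when its expression in the
    polyploid differs from the MPV by at least `fold` in either direction.
    """
    mpv = mid_parent_value(parent1, parent2)
    if mpv <= 0 or polyploid < 0:
        raise ValueError("expression values must be positive")
    ratio = polyploid / mpv
    if ratio >= fold:
        return "non-additive (above MPV)"
    if ratio <= 1.0 / fold:
        return "non-additive (below MPV)"
    return "additive"


# Hypothetical normalized expression values for three genes.
print(classify_expression(100.0, 300.0, 210.0))  # close to the MPV of 200
print(classify_expression(100.0, 300.0, 450.0))  # well above the MPV
print(classify_expression(100.0, 300.0, 80.0))   # well below the MPV
```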
DNA microarray technology has also been used to profile the expression of noncoding RNA molecules naturally occurring in plant genomes, such as microRNAs (miRNAs). They are a class of 20-24 nucleotide small RNAs that repress their target genes by mRNA degradation or translational repression. Therefore, identification and quantification of miRNAs is deemed essential to understanding an organism's or tissue's gene regulatory network [104]. MicroRNA expression profiling was performed with custom designed chips in both natural A. suecica and resynthesized Arabidopsis genotypes. It indicated that many miRNAs and trans-acting siRNAs (tasiRNAs), i.e., endogenous siRNAs that direct the cleavage of non-identical transcripts, are non-additively expressed [105]. Among the differentially expressed miRNAs, miR163 is severely repressed in leaves and flowers of A. arenosa and allotetraploids, but is highly expressed in A. thaliana. Analysis by Ng et al. [106] demonstrated that miR163 expression differences result from cis-acting effects, as well as from trans-acting repressor(s) that are present in A. arenosa and allotetraploids but absent in A. thaliana.

High Throughput RNA Sequencing
Next-generation sequencing (NGS) technologies are changing the ways in which gene expression is studied. The principle behind these applications of high throughput sequencing technologies, which have been termed RNA-seq, is simple: complex RNA samples are directly sequenced to determine their content. Therefore, unlike hybridization-based data requiring the estimation of RNA amount by image analysis, RNA-seq data consist of absolute numbers of reads from each gene. These data are highly suitable for the analysis of gene expression since, not relying on probes, they are less error-prone than previous methods and allow absolute expression levels to be determined [107]. Sequencing-based methods also permit the genome-wide study of small RNA expression. In T. miscellus allopolyploids, Buggs et al. [79] profiled almost 3000 SNP markers using an Illumina RNA-seq approach to study differential expression of duplicate homologous genes derived from the parental genomes. The authors found expression biases among tissues in the diploid parents (T. dubius and T. pratensis) in comparison to the natural allopolyploids, as well as uniform expression in F1 and first-generation synthetic allopolyploids. To explain the observed "transcriptomic shock", they hypothesized a loosening of gene expression regulation, which may set the stage for gradual evolution of novel patterns of expression in the early generations of polyploidy. Coate and Doyle [108] used quantitative reverse transcriptase-polymerase chain reaction and RNA-seq in allopolyploid Glycine dolichocarpa and its diploid progenitors. They inferred dosage responses for several thousand genes and showed that most of them had partial dosage compensation. In G. max, RNA-seq allowed the identification of the gene family likely contributing to differences in photosynthetic rate between the allotetraploid and its progenitors [109]. The authors also provided evidence that the tetraploid appeared to use the "redundant" gene copies in novel ways. In A. thaliana allopolyploids, transcriptome profiling was carried out by Ha and colleagues [105] through high-throughput cDNA pyrosequencing. For the first time, these authors gained insight into small RNA expression diversity and evolution in closely related species as well as in interspecific hybrids. The data suggested a role for small RNAs in buffering against genomic shock in Arabidopsis interspecific hybrids and allopolyploids. In particular, they seem to have a central role in maintaining genome and chromatin stability as well as in modulating non-additive gene expression. In addition, Ha et al. [105] found that repeat- and transposon-associated siRNAs (rasiRNA and TE-siRNA, respectively) were highly divergent between A. thaliana and A. arenosa, and that they were not correlated with non-additive gene expression in allopolyploids. By contrast, miRNA and tasiRNA sequences were conserved between species, but their expression patterns were highly variable between the allotetraploids and their progenitors.
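Because RNA-seq yields read counts, the relative expression of two homeologues can be estimated from reads carrying SNP alleles diagnostic for each parental genome. The Python sketch below shows one simple way to do this with an exact two-sided binomial test against equal expression. It is an illustrative assumption rather than the procedure used in the studies cited above, the names and counts are hypothetical, and it ignores read-mapping bias, normalization and replication.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value (sum of outcomes no more likely than k).

    Intended for modest read counts; very large n would overflow floats here.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def homeolog_bias(reads_a, reads_b, alpha=0.05):
    """Return (share of parent A reads, p-value, call) for one gene.

    reads_a and reads_b are counts of reads carrying the SNP allele
    diagnostic for parental genome A or B, respectively.
    """
    n = reads_a + reads_b
    if n == 0:
        raise ValueError("no diagnostic reads for this gene")
    share_a = reads_a / n
    p_value = binomial_two_sided_p(reads_a, n)
    if p_value >= alpha:
        call = "balanced"
    elif share_a > 0.5:
        call = "biased toward A"
    else:
        call = "biased toward B"
    return share_a, p_value, call


# Hypothetical read counts for three genes.
print(homeolog_bias(52, 48))
print(homeolog_bias(90, 10))
print(homeolog_bias(10, 90))
```

Applied across tissues or generations, per-gene calls of this kind are what allow homeologue expression bias or silencing to be tabulated, although a real analysis would also correct for multiple testing.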

Methods for Protein Analysis
Compared with genomic and gene expression variations, changes in proteins and gene products in polyploids and their progenitors were rarely examined. An early study on genome-wide protein profiling was performed in maize lines of different ploidies by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) [110]. This is a technique for the separation of proteins according to their molecular weight, in the presence of a reducing agent (2-mercaptoethanol). Data obtained showed that expression per genome for most maize proteins did not change with ploidy, even though ploidy-modulated expression changes were detected for a few proteins. A more powerful method to analyze complex protein mixtures is the protein two-dimensional electrophoresis (2-DE) analysis, in which proteins are separated according to their isoelectric point and mass. In diploid, tetraploid and hexaploid wheat 2-DE experiments showed that the expression of homeologous proteins in hexaploid wheat depended on interactions among the parental A, B and D genomes [111]. Other recent studies using 2-DE indicated numerous and unbiased variations of proteins in newly synthesized B. napus [112,113] and wheat hybrids [114]. Using protein 2-DE coupled with mass spectrometry (MS) assays, a recent study in maize showed a positive correlation of differentially expressed proteins with ploidy levels [115]. The highest correlations were found in diploid-hexaploid and tetraploid-hexaploid comparisons. Recently, Ng and co-workers [116] were able to study quantitative changes in the proteome of Arabidopsis autopolyploids and allotetraploids and their progenitors using the isobaric tags for relative and absolute quantification (iTRAQ) technique, coupled with mass spectrometry. The levels of protein divergence ranged from ~18% between A. thaliana and A. arenosa to ~7% between an A. thaliana diploid and autotetraploid. 
In F1- and F8-resynthesized allotetraploids the proteomic divergence relative to MPV was intermediate (~8%). These data suggest that, during polyploidization, rapid changes occurring in post-transcriptional regulation and translational modifications of proteins can lead to high protein discrepancy between species.

Conclusions and Perspectives
With the speed of technology improvement and the application of genomic tools, polyploidy research is undergoing a renaissance. It can be expected that comprehensive studies using multidisciplinary approaches will push the boundaries of current methodologies to translate the knowledge gained into practical applications. Particularly significant will be the high-throughput genome-wide approaches to unraveling the genetic and epigenetic consequences of polyploidization and the availability of phenotyping platforms. They are all reaching an unprecedented level of resolution at relatively affordable costs, to the point that genotyping-by-sequencing [117] and targeted sequence capture [118] are now feasible also for high diversity, large genome species. NGS will not only extend the possibilities of gene and marker discovery, but will also enable genome-wide quantification of gene expression. It will also allow direct genome-scale investigation of chromatin and DNA methylation cross-talk, by ChIP-Seq, bisulfite sequencing, etc. Characterizing transcripts through sequencing is advantageous to circumvent problems posed by highly redundant and extremely large genomes. It should be pointed out that the rapid pace at which new sequencing technologies are emerging is generating a growing disparity between the rate of data generation and its full and biologically meaningful analysis. However, there are outstanding examples addressing successful strategies for dealing with these challenges [109,118,119,120].
Genetic mapping can exploit robust statistical models, and will be crucial for identifying the genes underlying the polyploidization process in the bulk of the fast growing genome sequence information. Merging results from genetic, genomic and proteomic investigations will help to understand to what extent polyploid genome flexibility is associated with amplified responses to selection. We have recently hypothesized that defense response plasticity of potato could be correlated to gene number and category and cluster organization [121]. Understanding polyploid evolution requires knowledge to be integrated at the population level, and will have to rely not only on suitable experimental designs, but also on surveys of variation at multiple levels. Recent and forthcoming sequencing technologies are providing a wealth of genomic data to be released soon, also for wild species that can be employed for evolutionary studies. Until recently, sequencing complex genomes was considered very challenging due, for example, to the difficulties in discriminating among paralogous, orthologous, and homoeologous sequences. However, the availability of the genome sequence of the ancestors, the reduction in the ploidy level or the physical separation of chromosomes offered the possibility to circumvent these challenges, and examples of fully sequenced polyploid genomes have become available [122,123]. The accumulation of knowledge on polyploid formation, maintenance, and divergence at the whole-genome and subgenome levels will not only help plant biologists to understand how plants have evolved and diversified, but also assist plant breeders in designing new strategies for crop improvement.