Polyploids of Brassicaceae: Genomic Insights and Assembly Strategies

Jeon, Donghyun; Kim, Changsoo

doi:10.3390/plants13152087

Open AccessReview

Polyploids of Brassicaceae: Genomic Insights and Assembly Strategies

by

Donghyun Jeon

¹

and

Changsoo Kim

^1,2,*

¹

Department of Science in Smart Agriculture Systems, Chungnam National University, Daejeon 34134, Republic of Korea

²

Department of Crop Science, Chungnam National University, Daejeon 34134, Republic of Korea

^*

Author to whom correspondence should be addressed.

Plants 2024, 13(15), 2087; https://doi.org/10.3390/plants13152087

Submission received: 8 July 2024 / Revised: 24 July 2024 / Accepted: 26 July 2024 / Published: 27 July 2024

(This article belongs to the Special Issue Deciphering Plant Molecular Data Using Computational Methods)

Download

Browse Figures

Review Reports Versions Notes

Abstract

The Brassicaceae family is distinguished by its inclusion of high-value crops such as cabbage, broccoli, mustard, and wasabi, all noted for their glucosinolates. In this family, many polyploidy species are distributed and shaped by numerous whole-genome duplications, independent genome doublings, and hybridization events. The evolutionary trajectory of the family is marked by enhanced diversification and lineage splitting after paleo- and meso-polyploidization, with discernible remnants of whole-genome duplications within their genomes. The recent neopolyploidization events notably increased the proportion of polyploid species within the family. Although sequencing efforts for the Brassicaceae genome have been robust, accurately distinguishing sub-genomes remains a significant challenge, frequently complicating the assembly process. Assembly strategies include comparative analyses with ancestral species and examining k-mers, long terminal repeat retrotransposons, and pollen sequencing. This review comprehensively explores the unique genomic characteristics of the Brassicaceae family, with a particular emphasis on polyploidization events and the latest strategies for sequencing and assembly. This review will significantly improve our understanding of polyploidy in the Brassicaceae family and assist in future genome assembly methods.

Keywords:

whole-genome duplication; diversification; glucosinolate; sub-genome discrimination; polyploid

1. Introduction

The Brassicaceae family, commonly known as Cruciferae, is a significant group within the angiosperms and belongs to the Brassicales order. This family includes 5 supertribes, 58 tribes, and 4140 species, covering a range of plants like annuals, biennials, and herbaceous perennials [1,2]. Moreover, the family Brassicaceae has a pivotal role in agriculture as a source of crucial crops such as cabbage, cauliflower, and mustard. Additionally, it is recognized as a genetic diversity hotspot, featuring unique secondary metabolites like the ‘mustard oil bomb’, which consists of glucosinolates (GSLs) and myrosinase enzymes. These metabolites are crucial for studying the interactions between biochemistry, natural selection, and gene evolution, especially in the context of herbivory defense mechanisms [3]. The abundant genetic resources of the family position it as an ideal model for research in flowering plant biology, facilitating investigations into plant development and the impact of ancient genome duplications.

Whole-genome duplications (WGDs) have significantly influenced the evolution of land plants, including angiosperms. These duplications not only foster the emergence of new traits and evolutionary innovations, but also contribute to increased trait variations within a clade [4]. Polyploidization, the process of acquiring multiple sets of chromosomes, has been a pivotal factor in the evolution of angiosperms. Each flowering plant experienced at least one, often multiple, ancient polyploidy events. About 15% of speciation events in angiosperms are associated with an increase in ploidy. Additionally, at some point during the diversification of the angiosperm crown group, 47% to 100% of flowering plant species can be linked to a polyploidy event [5]. The Brassicaceae family, diverging from other eudicots approximately 24.5~60 million years ago, is also no exception. The discovery of major WGDs, such as the At-α, At-β, and At-γ events in Arabidopsis thaliana, have had a pivotal role in identifying similar patterns across other angiosperm lineages [2,6,7,8]. This family underwent primary diversification during the Neogene period, from about 23 to 2.6 million years ago. It was also theorized that almost 50% of the taxa within the Brassicaceae family may have originated from recent polyploid events [9]. These species’ divergence and major polyploidy often align with periods of climate instability, suggesting that having multiple sets of chromosomes might offer an evolutionary advantage in rapidly changing environments. This is particularly evident in the evolutionary trajectory of the Brassicaceae family, underscoring the significant impact of polyploidization in fostering the diversity and resilience of this plant group.

The genomes of the Brassicaceae family have more polyploidy compared to other flowering plants. Distinguishing sub-genomes in the polyploidy of Brassicaceae has been challenging, mainly due to the complexities associated with polyploids. However, advancements in genomic technologies, such as long-read sequencing and chromosome capture methods, have substantially enhanced the analysis and identification of these sub-genomes. These technological improvements have led to more precise genomic assemblies at the pseudochromosome level, enriching our understanding of genomic sequencing and the evolutionary biology of plants [10,11,12]. However, distinguishing between subgenomes remains a challenge due to their similarities. Many reviews have been published on the importance of polyploidy in plant evolution and environmental adaptation [13,14,15]. However, despite the high rates of polyploidy within the Brassicaceae family, there have yet to be any reviews specifically focusing on this family. However, although there have been significant advancements in the latest assembly strategies for polyploid crop genomes since the publication of [16], there has yet to be any progress in documenting these developments. This review examines the specific characteristics and impacts of polyploidization within the Brassicaceae family, particularly emphasizing genomic assembly strategies for polyploid crops. It aims to provide a comprehensive framework that will guide future genomic assembly studies for polyploidy. By elucidating the complexities and innovative approaches involved in assembling polyploid genomes, this review offers essential insights that are expected to propel further advancements in the field of polyploidy in the Brassicaceae family.

2. Polyploidization of the Brassicaceae Family

The Brassicaceae family is renowned for its frequent hybridization events. However, this might be an underestimation, as it needs to fully consider older, clade-specific WGDs, which are often masked by subsequent diploidization processes (Figure 1). WGD events significantly characterize the evolutionary history of the Brassicaceae family and the broader Brassicales order [17]. The paleopolyploid genome of the Brassicaceae family is diversified into four or more major lineages, forming distinct monophyletic clades or tribes [18]. While some of these lineages and tribes retained their paleopolyploidization characteristics, some underwent younger WGDs, forming several new genera and tribes. Estimations indicate that approximately one-fourth of the tribes within the crucifer family experienced diversification after the post-At-α genome duplications [19]. Unlike the older paleopolyploid ones, these younger mesopolyploid genomes display different phylogenomic characteristics. Due to their more recent origin, mesopolyploid clades show a more significant karyological variability and have multiple base chromosome numbers, reflecting their independent diploidization events [20]. More recent mesopolyploidization events were identified in eleven of the fifty-one tribes within the Brassicaceae family. The family underwent eleven additional tribe-specific WGDs during their evolution in the Miocene and Pliocene epochs [21]. These additional WGD events potentially impacted around 130 genera, averaging 1.45 WGDs per genus in 35 million years. This rate is notably higher than the average for land plants [22,23]. The genomes that underwent mesopolyploidization exhibit extensive diploidization. Chromosomal rearrangements, reduction in genome size, and genome fractionation characterize this process. Despite these significant alterations, remnants of the duplicated genomic regions from these mesopolyploidization events are still detectable in the current genomic structures. Recently evolved polyploids, known as neopolyploids (such as Brassica napus), are distinguished by their larger genome size, elevated chromosome count, duplicated gene content, and existing diploid progenitors. Neopolyploidy is prevalent, evident in 43% of families. The proportion of neopolyploids in each tribe slightly correlates with the number of species [22,24,25]. This evolutionary complexity reflects the dynamic nature of the Brassicaceae genome over millions of years. This suggests that polyploidization is an ongoing evolutionary process within the Brassicaceae.

2.1. Diversification of Brassicaceae

The Brassicaceae family, known for its history of multiple instances of paleo- and mesopolyploidization in the past and the prevalence of neopolyploids, offers a potential model for studying how diversity evolves through polyploidization events. Studies have shown that WGDs, particularly older events like paleopolyploidization, may increase diversification rates, but this increase often occurs after a considerable delay. Diversification rates increased after nearly half of the nine ancient polyploidy events examined in flowering plants [26]. Understanding the genomic evolution in the Brassicaceae family involves recognizing the ‘Lag-Time Model’ that occurs between polyploidization events and the resulting diversification. This model is marked by a period of genomic adjustment, primarily through post-polyploid diploidization. During this phase, the genome underwent significant alterations after it had doubled. It is important to understand that diversification was not directly initiated by the genome doubling itself, but rather by the subsequent diploidization process. This delay suggests that WGDs are important contributors to diversification, but are not the sole driving factor [27]. Neopolyploidization is often considered a direct cause of speciation, acting as a crucial mechanism for generating new species within the Brassicaceae family. This process, involving the duplication of the entire genome, facilitates the creation of novel genomic arrangements that can lead to significant evolutionary innovations. Therefore, while both neopolyploidization and mesopolyploidization are essential for understanding the evolutionary dynamics within the Brassicaceae, their roles differ. In this context, genome doubling serves as an initial trigger or a stepping stone, setting the stage for further evolutionary processes that can lead to the diversification and emergence of new species within this rapidly evolving plant family [22,25,28].

The evolution of the Brassicaceae family is closely linked to significant geological and climatic changes during the Oligocene, Miocene, and Pliocene periods. The diversification of the Brassicaceae crown group began around 30 million years ago with the split of the Aethionemeae tribe from the rest of the family, coinciding with a global temperature decline and the formation of the Antarctic ice sheet in the Early Oligocene [29]. Most of the major branches within the family, except Aethionemeae, originated just before 20 million years ago, over a relatively brief period of about 0.69 million years. This period saw the emergence of the primary Lineages I, II, III, and IV. However, most tribal diversification happened after a significant drop in average temperatures that followed the Middle Miocene Climatic Optimum, which occurred between 18 and 14 million years ago [8]. These observations suggest that a global trend towards cooler and drier conditions may have influenced the diversification and expansion of the Brassicaceae.

2.2. Evolution of the Biosynthetic Pathways to Produce GSLs

GSLs are sulfur-rich, amino acid-derived metabolites predominantly found in the Brassicales order. These compounds have a basic structure comprising a sulfonated oxime or sulfated isothiocyanate group linked to thioglucose and a variable side chain [30]. GSLs, along with their activating enzymes, myrosinases, form a defense mechanism known as the mustard oil bomb. This mechanism is crucial for defense against pathogens and herbivores, acting through signaling, antimicrobial actions, and as deterrents [31]. The presence of specific proteins like epithiospecifier proteins and nitrile-specifier proteins significantly influences the transformation of GSLs into isothiocyanates or nitriles [32,33].

Before the At-β WGD event, GSLs were primarily synthesized from phenylalanine and branched-chain aliphatic amino acids through a core pathway regulated by an MYB–MYC complex. Post the At-β WGD, descendant plant families began synthesizing indole and methionine-derived GSLs. This development was accompanied by the retention of duplicate genes involved in regulation and early biosynthetic steps [34,35]. These duplicates underwent neofunctionalization, performing distinct roles and producing different substrates and products. For example, duplicated MYB genes specialized to function in individual GSL pathways, while MYC genes adapted to function in both pathways [36]. Following the At-α WGD, retained gene duplicates for enzymes like flavin monooxygenase glucosinolate S-oxygenase and indole GSL methyltransferase underwent tandem duplication and subfunctionalization. These processes enabled these enzymes to catalyze modifications of side chains in Met-derived GSLs and indole GSLs [37,38].

Regarding GSL transport within plant tissues, NRT1/PTR family proteins have a crucial role. The evolution of GSL transport likely predates the At-α WGD and initially involved NPF transporters with broad or nonspecific side chain affinity. Post-At-α WGD, these transporters evolved to transport specific GSL substrates, like indole GSLs, selectively [39]. Additionally, GSLs are structurally related to cyanogenic glucosides, widespread metabolites in plants and some arthropods. Phylogenetic analysis indicated that specific GSL transporters are orthologous to cyanogenic glucoside transporters found in cassava. This suggests that ancestral cyanogenic glucoside transporters evolved to develop a high affinity for cyanogenic glucosides and later for specific GSLs [40] through duplication and subfunctionalization. Additionally, genes associated with GSL biosynthesis tend to be preferentially retained or duplicated following polyploidization. These genes include ones encoding methylthioalkylmalate synthase enzymes, isopropylmalate isomerases, and branched-chain amino acid aminotransferases, which have been observed to either retain or duplicate in each subgenome post-allotetraploidization. Notably, the gene clusters responsible for encoding flavin-containing monooxygenases, crucial enzymes in sinigrin synthesis, exhibited a tendency for preferential retention following polyploidization. This cluster was predominantly expressed and underwent tandem duplication, contributing to a higher accumulation of sinigrin [41,42].

2.3. Polyploidy and Its Role in the Evolution and Domestication of Brassica Species for Agricultural Trait Development

Polyploidy, the condition of having more than two complete sets of chromosomes, is often linked to numerous desirable agricultural traits, such as larger seed size [43], increased stress tolerance [44], and enhanced disease resistance [45]. Recent phylogenetic analyses show that cultivated plants have experienced significantly more polyploidy events than their wild relatives. Often preceding domestication, lineages with recent WGDs are nearly twice as likely to be domesticated compared to those without WGDs [46].

In the genus Brassica, a clear example of the impact of polyploidy is evident in the classic ‘Triangle of U’, comprising six cultivated species derived from three diploid progenitors: Brassica rapa, Brassica nigra, and Brrassica oleracea. These diploids hybridized to form three distinct allopolyploid species—Brassica carinata, Brassica juncea, and B. napus—which then underwent biased gene loss and genome restructuring [47]. Artificial selection has repeatedly targeted various harvestable parts within these species, including underground storage organs (turnips), leafy vegetables (cabbages, pak choi), axillary buds (Brussels sprouts), floral parts (cauliflower, broccolini), and seeds (oilseeds) [48]. Studies on the Brassica Triangle reveal how polyploidy can disrupt highly conserved leaf structure-function relationships in seed plants, leading to novel trait correlations absent in parental species. These new genomic combinations and interactions break ancestral phenotypic trait correlations, creating opportunities for trait improvement. The genomic and chromosomal instability in early generations of allopolyploids enables the exploration of new trait spaces, potentially reaching higher adaptive peaks than their progenitors despite possible transient fitness costs.

A crucial part of understanding the domestication and adaptation processes in these allopolyploid Brassica species is recognizing the role of interploidy introgression, especially in B. napus. This study emphasizes how interploidy introgression was pivotal during the transition from winter to semi-winter rapeseed, aiding adaptation to the milder winter climates of Asia. Detailed genomic analyses identified introgression events from Asian-cultivated B. rapa morphotypes into semi-winter rapeseed, affecting between 2201 and 6950 genes involved in critical biological pathways such as developmental growth regulation, flower development, seed growth, lipid storage, glucosinolate catabolism, and gibberellin response. The flowering time-related genes were particularly significant, adapting to reduced vernalization requirements suitable for milder Asian winters. Key genes related to vernalization, circadian clocks, photoperiods, and autonomous pathways highlighted the vital role of interploidy introgression in adjusting flowering times to local conditions [49].

Furthermore, the creation of tetraploid radish plants using colchicine—validated by cytological observations and flow cytometry analysis of DNA content—demonstrates the ‘gigas’ effect, for which both vegetative and reproductive organs increase in size. These tetraploid radishes exhibited enhanced physiological traits, including increased antioxidant enzyme activity, soluble content, and overall quality. Notably, while some endogenous plant hormone levels exhibited negative trends, others, like abscisic acid and indoleacetic acid, positively correlated with delayed bolting and flowering, in contrast to diploid plants [50].

In addition, newly created artificial allotetraploid plants may provide a unique opportunity to create a new species without genetic modification technologies, although it is a very classical method. This suggests that polyploidy within the Brassicaceae not only fosters the emergence of new, agriculturally beneficial traits and offers an excellent target for artificial selection and crop improvement. The evolutionary instability of traits following hybridization makes them ideal for breeding programs to enhance crop resilience and productivity.

3. Challenges and Solutions in Sequencing and Assembling Polyploidy Plant Genomes

A significant number of polyploids have been identified as ancient or recent polyploids within the vast array of plant species. These plants possess multiple chromosome sets due to events like interspecific hybridization (allopolyploidy) or genome doubling (autopolyploidy) [51]. Particularly, the Brassicaceae family contains a large number of polyploid crops. The genome assembly of polyploids in the Brassicaceae family has seen limited progress, primarily focused on major crops (Table 1). Recently, a pan-genome platform for 86 accessions of B. carinata has been established, and an integrated platform combining the pan-genome and multi-omics data for B. napus has been developed [52,53]. Polyploidy introduces specific challenges at the pseudo-chromosome level or haplotype-resolved assembly. It involves complex tasks like assembling duplicated regions and distinguishing between inter-genomic and intra-genomic polymorphisms. Despite advances in sequencing technologies and computational algorithms over the past two decades, which have aided in analyzing genomes from nonvascular to flowering plants, assembling complex plant genomes remains daunting. Challenges such as high heterozygosity, repetitive sequences, and varied ploidy levels are difficult to address using traditional sequencing and assembly methods [54]. Therefore, to solve these problems, it is important to find an appropriate method for each genome characteristic.

3.1. Assessing Ploidy Levels in Plants

In genomic research focused on plants with varying ploidy levels, the crucial step is usually to ascertain the chromosome count or ploidy level most directly achieved by counting the somatic chromosomes in meristematic cells. Nevertheless, direct chromosome counting can be labor-intensive and time-consuming. It often necessitates the expertise of a researcher skilled in cytology, and in certain species, it is challenging due to factors like abundance, small size, or weak staining of the chromosomes. Additionally, understanding the genetic and morphological traits of polyploid plants is crucial. This understanding bridges the gap between sequencing challenges and practical applications, essential for accurate genome assembly and comprehending the evolutionary role of polyploidy [49].

Shifting into a polyploid state typically increases cell size, resulting in polyploids having larger cells than their diploid counterparts. This enlargement in cell size, particularly in the nucleus, can affect the overall structure and regulatory functions of the cell, reflecting the impact of ploidy changes on cellular architecture [60]. Consequently, these alterations might reduce metabolic and growth rates in polyploid organisms because of the longer replication time [61]. Efforts to assess ploidy led to the creation of various indirect methods for estimating ploidy. These methods utilize the relationship between genome copy number and the size or volume of cells or organs. Commonly used indicators of ploidy include measurements like stomatal guard cell length, stomatal frequency, chloroplast count in guard cells, or pollen diameter [62,63,64]. While these measures have proven effective as quick indicators of ploidy in many polyploid plant groups, they have limitations. These include sensitivity to environmental factors, the need for specific calibration, and overlapping measurement ranges across different ploidy levels. Such issues can lead to uncertainties and challenges when applying these methods to non-model plant species.

The ploidy level also can be determined using flow cytometry to estimate nuclear DNA content. Flow cytometry is an efficient, high-throughput technique that simultaneously measures multiple optical properties of individual particles, such as cells or nuclei, often tagged with fluorescent dyes. The data obtained are then utilized to deduce the physical or chemical properties of these particles, including genome size or ploidy. However, because flow cytometry assesses the total nuclear DNA content without considering chromosome count, determining the exact ploidy level necessitates calibration against a reference sample with a known chromosome number, typically established through conventional karyotyping. These morphological and genetic traits allow for the inference of plant polyploidy, further elucidated in genotypic analysis [65,66].

3.2. Genomic Characteristics of Polyploid Plants

The marker-based approach is crucial in revealing the complex genomic structures of polyploid plants. The first most apparent sign of polyploidy in marker genotyping is detecting multiple bands from putative single-locus markers. Instances such as observing more than two bands in single-locus assays like RFLPs or SSRs often indicate genome duplication in the target species [67,68]. For example, in B. napus, an allotetraploid species of recent origin, most SSR markers amplify both homologous loci, resulting in two to four bands per marker, varying with locus heterozygosity. Although whole genome duplications often result in multiple signals in genotyping assays, they are not the only cause. Other factors, like small-scale sequence duplications from non-homologous translocations, activation of retrotransposons, or unique aspects of primer sequences or enzyme restriction sites, can also lead to multiple responses. Therefore, observing several reactions in genotyping does not always indicate a whole genome duplication; thus, other potential causes should be considered [69]. However, suppose a large number of genotyping tests specific to certain genetic locations frequently detect multiple genome sequences or products. In that case, it strongly indicates the presence of polyploidy in the species being studied.

In genome profiling, k-mer frequencies in polyploid genomes have a crucial role. It enhances the modeling of polyploid genomes by taking the k-mer spectrum as input and using a mixture of negative binomial distributions to estimate genome size, repetitiveness, and heterozygosity rates. For instance, the diploid A. thaliana shows two significant peaks in its k-mer profile, whereas the triploid Meloidogyne enterology has three [70]. The heights of these peaks in the k-mer spectrum are proportional to the species’ heterozygosity. Additionally, the relative heights of the peaks within the k-mer spectrum indicate the level of heterozygosity present in a species. To illustrate, in the case of a diploid species, an increase in heterozygosity leads to a heightened first peak and a diminished second peak in the k-mer spectrum. This relationship is more intricate for polyploid species, but generally, heightened heterozygosity results in an elevated first peak and reduced subsequent peaks. Furthermore, the peaks with higher coverage in the k-mer spectrum signify the presence of increasingly duplicated repetitive sequences within the genomes. An important application of k-mer involves distinguishing between allotetraploid and autotetraploid species by analyzing distinct patterns of nucleotide heterozygosity rates. For instance, in wasabi, a well-documented phenomenon during meiosis is the formation of bivalents where homologous chromosomes from the same subgenome preferentially pair with each other (Figure 2). This phenomenon is also evident in various other allotetraploid species. Consequently, in allotetraploids, one would anticipate a higher proportion of “aabb” and a lower proportion of “aaab” because preferential pairing ensures the presence of two homologs from each of the first and second subgenomes after recombination. Conversely, in the case of potatoes, most cells exhibit quadrivalents during meiosis. In this scenario, an individual might possess 0, 1, 2, or 3 homologs from a given subgenome after recombination. Consequently, “aaab” is expected to be more prevalent than “aabb” because there are more likely to be either one or three copies of a subgenome rather than exactly two [70].

3.3. Sequencing Technology Advancements and Limitations in Polyploids

Reference genomes hold a pivotal role in the field of genomics and genetics. They serve as foundational blueprints for understanding the genetic makeup of organisms, both for individuals and entire species. However, in polyploid genomes, this becomes more complicated. The assembly of a polyploid genome resembles a compounded set of haplotype reconstruction problems, in which the computational complexity increases with higher ploidy [16]. The advancement of long-read sequencing technologies like PacBio and Oxford Nanopore Technologies has significantly eased the process of polyploidy sequencing compared to past methods. These technologies have revolutionized the ability to sequence complex genomes by providing longer reads, which offer a more comprehensive view of genomic structure and variation [71,72]. Additionally, the development of chromosome capture techniques such as Hi-C, Pore-C, and Omni-C has furthered our understanding of chromosomal organization and interaction within the nucleus [73,74]. Despite these advancements, distinguishing sub-genomes within polyploid species remains a challenging task. This complexity arises from the intricate nature of these genomes, where similar sequences across different sub-genomes make it challenging to assign specific sequences to their respective genomic origins accurately. The assembly of autopolyploid genomes is exceptionally challenging, as fragments of a sub-genome might be mistakenly assigned to the wrong sub-genome, resulting in incorrectly assembled genomes. Allopolyploids face similar challenges, but their greater genetic distance usually makes resolving their sub-genomes less complicated. This ongoing challenge in sub-genome discrimination highlights the need for more refined analytical tools and methods in genomic research [75].

3.4. Sub-Genome Discrimination in Allopolyploids

Historically, the identification and differentiation of subgenomes in various species relied on using general genomic features as markers. Characteristics like GC content, centromeric array sequences, microsatellites, and repetitive elements were used as ‘genomic signatures’ to categorize DNA sequences into subgenomes [76,77]. However, while these methods were effective in some instances, they also had notable limitations. This deficiency underscored the difficulties in subgenome phasing, especially within intricate genomic settings in which conventional genetic markers may not have provided the necessary precision for accurate distinction. Several alternative approaches have been proposed to address these challenges. Investigating species of similar lineages, particularly those in phylogenetically proximate areas, is crucial for detecting polyploidy and hybridization events. This approach is especially beneficial in species like Brassica, in which allopolyploid species such as B. napus, B. juncea, and B. carinata contain genomes similar to those of extant diploid species B. rapa, B. oleracea, and Brassica nigra. Including these diploid species in assays provides crucial insights into genome locations within the allopolyploids [21,78]. Based on the understanding that investigating closely related species, particularly those in the phylogenetically proximate area, is crucial for detecting polyploidy and hybridization events, a similar approach was applied to Eutrema japonicum. By comparing its genomic sequences with those of its diploid relatives, Eutrema salsugineum and Eutrema yunnanense, and previously sequenced E. japonicum, it was possible to distinguish the haplotypes of E. japonicum [58]. This approach, involving investigating closely related species, offers significant advantages in understanding polyploidy and hybridization (Figure 3). It enables a more precise identification of subgenomes, particularly in complex genetic landscapes like those of Eutrema spp. However, the method also has limitations, as it relies heavily on the availability and quality of genomic data from related species and may only sometimes provide comprehensive insight into more diverse or less studied genomes.

When genomic data from closely related species is unavailable, the k-mer and LTR-RT based method provides an effective alternative for distinguishing sub-genomes (Figure 4). K-mer phasing involves identifying specific repetitive DNA sequences to categorize subgenomes. This method uses scanning of k-mers and the k-means algorithm to cluster chromosomes into distinct subgenomes. Techniques like PCA and Student’s t-tests are employed to ensure the accuracy of this phasing. Additionally, this approach includes mapping specific k-mers to the genome and applying Fisher’s exact test to confirm their significance. A key focus is on LTR-RTs, which are prevalent in plant genomes and act as markers to estimate the timeframe from subgenome differentiation to allohybridization [79,80]. This comprehensive approach, which integrates multiple bioinformatics techniques, is crucial for accurately distinguishing and understanding subgenomes, especially in cases in which closely related genomic data are not readily available [80]. However, it is important to note that these methods are primarily effective for allotetraploids, and may not be as helpful in distinguishing autotetraploids.

3.5. Genome Assembly Strategies of Autopolyploids in the Brassicaceae

In the Brassicaceae family, autopoliploidy was traditionally thought to be uncommon. However, recent reports identified plants like Capsella bursa-pastoris as naturally occurring autotetraploids. While not yet widely reported, it is suspected that there are several other autopolyploids within this family [81]. Autopolyploids present unique challenges in genome assembly due to the presence of four similar haplotypes with a significant proportion of nearly identical sequences. One major difficulty in assembling autoployploid genomes is the tendency to link these identical sequences, often leading to chimeric contigs with switch errors or false duplications. These chimeric contigs can interfere with Hi-C signals, causing erroneous scaffolds that mix sequences from different haplotypes. Additionally, differentiating nearly identical homologous sequences across haplotypes is challenging, often resulting in many collapsed contigs. These collapsed contigs can create misleading Hi-C links between the phased contigs of different haplotypes, leading to incorrect and overly long scaffolds. Despite these challenges, recent advancements in autotetraploid genome assembly were made using innovative techniques like single-cell sequencing of pollen genomes combined with Hi-C data. This approach, known as ‘gamete-binning’, begins with sequencing long reads from somatic DNA to create an initial contig-level assembly. Simultaneously, gamete genome sequencing data were collected, grouping contigs into clusters representing individual haplotypes based on genetic linkage. Long reads were then assigned to haplotypes based on their similarity to these contigs. This method enables a separate assembly of each haplotype, and facilitates chromosome-scale scaffolding using Hi-C data [82,83]. While not yet applied to the Brassicaceae family, this technique shows great potential for advancing genome assembly in autopolyploid crops in the Brassicaceae family.

4. Conclusions

The Brassicaceae family, characterized by extensive polyploidization, showcases evolutionary complexity through its hybridization events and WGDs. These events led to a significant diversification within the family, creating a range of polyploid species with varying genomic architectures. Despite the challenges in genome assembly and sub-genome discrimination due to polyploid complexity, advancements in sequencing technologies and methodologies, including ‘gamete-binning’ and k-mer analysis, have improved our understanding of polyploid genomes. These tools offer new insights into the genetic diversity and evolutionary history of the Brassicaceae family, highlighting the ongoing need for innovative research strategies to tackle the complexities of polyploid genomes. This review offers valuable insights into the complexities of the Brassicaceae genome, highlighting the evolutionary significance of polyploidization. By underscoring the potential for genetic advancements in crop improvement and emphasizing the need for ongoing plant genomics research, this work lays the groundwork for future agricultural innovations within the Brassicaceae family through its comprehensive exploration of genomic diversity.

Author Contributions

D.J. and C.K. designed and structured the review. D.J. collected the information, organized the Figures and Tables, and wrote the original draft. C.K. reviewed the manuscript. D.J. edited and wrote the manuscript. All authors contributed to the article and approved the submitted version. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Korea Institute of Planning and Evaluation for Technology in Food, Agriculture, Forestry and Fisheries (IPET) through Digital Breeding Transformation Technology Project, funded by Ministry of Agriculture, Food and Rural Affairs (MAFRA, PJ#: RS-2022-IP322076).

Conflicts of Interest

The authors declare no conflicts of interest.

References

German, D.A.; Hendriks, K.P.; Koch, M.A.; Lens, F.; Lysak, M.A.; Bailey, C.D.; Mummenhoff, K.; Al-Shehbaz, I.A. An updated classification of the Brassicaceae (Cruciferae). Phytokeys 2023, 220, 127–144. [Google Scholar] [CrossRef] [PubMed]
Hendriks, K.P.; Kiefer, C.; Al-Shehbaz, I.A.; Bailey, C.D.; van Huysduynen, A.H.; Nikolov, L.A.; Nauheimer, L.; Zuntini, A.R.; German, D.A.; Franzke, A.; et al. Global Brassicaceae phylogeny based on filtering of 1000-gene dataset. Curr. Biol. 2023, 33, 4052–4068. [Google Scholar] [CrossRef] [PubMed]
Bednarek, P.; Pislewska-Bednarek, M.; Svatos, A.; Schneider, B.; Doubsky, J.; Mansurova, M.; Humphry, M.; Consonni, C.; Panstruga, R.; Sanchez-Vallet, A.; et al. A Glucosinolate Metabolism Pathway in Living Plant Cells Mediates Broad-Spectrum Antifungal Defense. Science 2009, 323, 101–106. [Google Scholar] [CrossRef] [PubMed]
Moriyama, Y.; Koshiba-Takeuchi, K. Significance of whole-genome duplications on the emergence of evolutionary novelties. Brief. Funct. Genom. 2018, 17, 329–338. [Google Scholar] [CrossRef] [PubMed]
Wood, T.E.; Takebayashi, N.; Barker, M.S.; Mayrose, I.; Greenspoon, P.B.; Rieseberg, L.H. The frequency of polyploid speciation in vascular plants. Proc. Natl. Acad. Sci. USA 2009, 106, 13875–13879. [Google Scholar] [CrossRef] [PubMed]
Bowers, J.E.; Chapman, B.A.; Rong, J.; Paterson, A.H. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 2003, 422, 433–438. [Google Scholar] [CrossRef] [PubMed]
Kagale, S.; Robinson, S.J.; Nixon, J.; Xiao, R.; Huebert, T.; Condie, J.; Kessler, D.; Clarke, W.E.; Edger, P.P.; Links, M.G.; et al. Polyploid Evolution of the Brassicaceae during the Cenozoic Era. Plant Cell 2014, 26, 2777–2791. [Google Scholar] [CrossRef] [PubMed]
Huang, X.C.; German, D.A.; Koch, M.A. Temporal patterns of diversification in Brassicaceae demonstrate decoupling of rate shifts and mesopolyploidization events. Ann. Bot. 2020, 125, 29–47. [Google Scholar] [CrossRef] [PubMed]
Franzke, A.; Lysak, M.A.; Al-Shehbaz, I.A.; Koch, M.A.; Mummenhoff, K. Cabbage family affairs: The evolutionary history of Brassicaceae. Trends Plant Sci. 2011, 16, 108–116. [Google Scholar] [CrossRef]
Paritosh, K.; Yadava, S.K.; Singh, P.; Bhayana, L.; Mukhopadhyay, A.; Gupta, V.; Bisht, N.C.; Zhang, J.; Kudrna, D.A.; Copetti, D.; et al. A chromosome-scale assembly of allotetraploid Brassica juncea (AABB) elucidates comparative architecture of the A and B genomes. Plant Biotechnol. J. 2021, 19, 602–614. [Google Scholar] [CrossRef]
Sun, H.Q.; Jiao, W.B.; Campoy, J.A.; Krause, K.; Goel, M.; Folz-Donahue, K.; Kukat, C.; Huettel, B.; Schneeberger, K. Chromosome-scale and haplotype-resolved genome assembly of a tetraploid potato cultivar. Nat. Genet. 2022, 54, 342–348. [Google Scholar] [CrossRef] [PubMed]
Shen, F.; Xu, S.X.; Shen, Q.; Bi, C.W.; Lysak, M.A. The allotetraploid horseradish genome provides insights into subgenome diversification and formation of critical traits. Nat. Commun. 2023, 14, 4102. [Google Scholar] [CrossRef] [PubMed]
Van De Peer, Y.; Mizrachi, E.; Marchal, K. The evolutionary significance of polyploidy. Nat. Rev. Genet. 2017, 18, 411–424. [Google Scholar] [CrossRef]
Van de Peer, Y.; Ashman, T.L.; Soltis, P.S.; Soltis, D.E. Polyploidy: An evolutionary and ecological force in stressful times. Plant Cell 2021, 33, 11–26. [Google Scholar] [CrossRef]
Anatskaya, O.V.; Vinogradov, A.E. Polyploidy as a Fundamental Phenomenon in Evolution, Development, Adaptation and Diseases. Int. J. Mol. Sci. 2022, 23, 3542. [Google Scholar] [CrossRef]
Kyriakidou, M.; Tai, H.H.; Anglin, N.L.; Ellis, D.; Strömvik, M.V. Current Strategies of Polyploid Plant Genome Sequence Assembly. Front. Plant Sci. 2018, 9, 1660. [Google Scholar] [CrossRef]
Amborella Genome, P. The Amborella genome and the evolution of flowering plants. Science 2013, 342, 1241089. [Google Scholar] [CrossRef] [PubMed]
Walden, N.; Nguyen, T.-P.; Mandáková, T.; Lysak, M.A.; Schranz, M.E. Genomic blocks in Aethionema arabicum support Arabideae as next diverging clade in Brassicaceae. Front. Plant Sci. 2020, 11, 719. [Google Scholar] [CrossRef]
Mandakova, T.; Li, Z.; Barker, M.S.; Lysak, M.A. Diverse genome organization following 13 independent mesopolyploid events in Brassicaceae contrasts with convergent patterns of gene retention. Plant J. 2017, 91, 3–21. [Google Scholar] [CrossRef]
Mandáková, T.; Lysak, M.A. Post-polyploid diploidization and diversification through dysploid changes. Curr. Opin. Plant Biol. 2018, 42, 55–65. [Google Scholar] [CrossRef]
Zou, J.; Hu, D.; Liu, P.; Raman, H.; Liu, Z.; Liu, X.; Parkin, I.A.; Chalhoub, B.; Meng, J. Co-linearity and divergence of the A subgenome of Brassica juncea compared with other Brassica species carrying different A subgenomes. Bmc Genom. 2016, 17, 18. [Google Scholar] [CrossRef] [PubMed]
Hohmann, N.; Wolf, E.M.; Lysak, M.A.; Koch, M.A. A Time-Calibrated Road Map of Brassicaceae Species Radiation and Evolutionary History. Plant Cell 2015, 27, 2770–2784. [Google Scholar] [CrossRef] [PubMed]
Walden, N.; German, D.A.; Wolf, E.M.; Kiefer, M.; Rigault, P.; Huang, X.C.; Kiefer, C.; Schmickl, R.; Franzke, A.; Neuffer, B.; et al. Nested whole-genome duplications coincide with diversification and high morphological disparity in Brassicaceae. Nat. Commun. 2020, 11, 3795. [Google Scholar] [CrossRef] [PubMed]
Ramsey, J.; Schemske, D.W. Neopolyploidy in flowering plants. Annu. Rev. Ecol. Syst. 2002, 33, 589–639. [Google Scholar] [CrossRef]
Baduel, P.; Bray, S.; Vallejo-Marin, M.; Kolář, F.; Yant, L. The “Polyploid Hop”: Shifting challenges and opportunities over the evolutionary lifespan of genome duplications. Front. Ecol. Evol. 2018, 6, 117. [Google Scholar] [CrossRef]
Tank, D.C.; Eastman, J.M.; Pennell, M.W.; Soltis, P.S.; Soltis, D.E.; Hinchliff, C.E.; Brown, J.W.; Sessa, E.B.; Harmon, L.J. Nested radiations and the pulse of angiosperm diversification: Increased diversification rates often follow whole genome duplications. New Phytol. 2015, 207, 454–467. [Google Scholar] [CrossRef]
Schranz, M.E.; Mohammadin, S.; Edger, P.P. Ancient whole genome duplications, novelty and diversification: The WGD Radiation Lag-Time Model. Curr. Opin. Plant Biol. 2012, 15, 147–153. [Google Scholar] [CrossRef] [PubMed]
Estep, M.C.; McKain, M.R.; Diaz, D.V.; Zhong, J.S.; Hodge, J.G.; Hodkinson, T.R.; Layton, D.J.; Malcomber, S.T.; Pasquet, R.; Kellogg, E.A. Allopolyploidy, diversification, and the Miocene grassland expansion. Proc. Natl. Acad. Sci. USA 2014, 111, 15149–15154. [Google Scholar] [CrossRef]
Zachos, J.; Pagani, M.; Sloan, L.; Thomas, E.; Billups, K. Trends, rhythms, and aberrations in global climate 65 Ma to present. Science 2001, 292, 686–693. [Google Scholar] [CrossRef]
Fahey, J.W.; Zalcmann, A.T.; Talalay, P. The chemical diversity and distribution of glucosinolates and isothiocyanates among plants. Phytochemistry 2001, 56, 5–51. [Google Scholar] [CrossRef]
Ratzka, A.; Vogel, H.; Kliebenstein, D.J.; Mitchell-Olds, T.; Kroymann, J. Disarming the mustard oil bomb. Proc. Natl. Acad. Sci. USA 2002, 99, 11223–11228. [Google Scholar] [CrossRef] [PubMed]
Lambrix, V.; Reichelt, M.; Mitchell-Olds, T.; Kliebenstein, D.J.; Gershenzon, J. The Arabidopsis epithiospecifier protein promotes the hydrolysis of glucosinolates to nitriles and influences Trichoplusia ni herbivory. Plant Cell 2001, 13, 2793–2807. [Google Scholar] [CrossRef] [PubMed]
Burow, M.; Markert, J.; Gershenzon, J.; Wittstock, U. Comparative biochemical characterization of nitrile-forming proteins from plants and insects that alter myrosinase-catalysed hydrolysis of glucosinolates. FEBS J. 2006, 273, 2432–2446. [Google Scholar] [CrossRef] [PubMed]
Edwards, D.; Batley, J.; Parkin, I.; Kole, C. Genetics, Genomics and Breeding of Oilseed Brassicas; CRC Press: Boca Raton, FL, USA, 2011. [Google Scholar]
Hofberger, J.A.; Lyons, E.; Edger, P.P.; Pires, J.C.; Schranz, M.E. Whole Genome and Tandem Duplicate Retention Facilitated Glucosinolate Pathway Diversification in the Mustard Family. Genome Biol. Evol. 2013, 5, 2155–2173. [Google Scholar] [CrossRef] [PubMed]
Fernandez-Calvo, P.; Chini, A.; Fernandez-Barbero, G.; Chico, J.M.; Gimenez-Ibanez, S.; Geerinck, J.; Eeckhout, D.; Schweizer, F.; Godoy, M.; Franco-Zorrilla, J.M.; et al. The Arabidopsis bHLH transcription factors MYC3 and MYC4 are targets of JAZ repressors and act additively with MYC2 in the activation of jasmonate responses. Plant Cell 2011, 23, 701–715. [Google Scholar] [CrossRef] [PubMed]
Hansen, B.G.; Kliebenstein, D.J.; Halkier, B.A. Identification of a flavin-monooxygenase as the S-oxygenating enzyme in aliphatic glucosinolate biosynthesis in Arabidopsis. Plant J. 2007, 50, 902–910. [Google Scholar] [CrossRef] [PubMed]
Pfalz, M.; Mukhaimar, M.; Perreau, F.; Kirk, J.; Hansen, C.I.C.; Olsen, C.E.; Agerbirk, N.; Kroymann, J. Methyl Transfer in Glucosinolate Biosynthesis Mediated by Indole Glucosinolate O-Methyltransferase 5. Plant Physiol. 2016, 172, 2190–2203. [Google Scholar] [CrossRef] [PubMed]
Nour-Eldin, H.H.; Andersen, T.G.; Burow, M.; Madsen, S.R.; Jorgensen, M.E.; Olsen, C.E.; Dreyer, I.; Hedrich, R.; Geiger, D.; Halkier, B.A. NRT/PTR transporters are essential for translocation of glucosinolate defence compounds to seeds. Nature 2012, 488, 531–534. [Google Scholar] [CrossRef]
Jorgensen, M.E.; Xu, D.; Crocoll, C.; Ramirez, D.; Motawia, M.S.; Olsen, C.E.; Nour-Eldin, H.H.; Halkier, B.A. Origin and evolution of transporter substrate specificity within the NPF family. eLife 2017, 6, e19466. [Google Scholar] [CrossRef]
de Kraker, J.W.; Gershenzon, J. From amino acid to glucosinolate biosynthesis: Protein sequence changes in the evolution of methylthioalkylmalate synthase in Arabidopsis. Plant Cell 2011, 23, 38–53. [Google Scholar] [CrossRef]
Barco, B.; Clay, N.K. Evolution of Glucosinolate Diversity via Whole-Genome Duplications, Gene Rearrangements, and Substrate Promiscuity. Annu. Rev. Plant Biol. 2019, 70, 585–604. [Google Scholar] [CrossRef] [PubMed]
Chan, J.C.; Ooi, M.K.; Guja, L.K. Polyploidy but not range size is associated with seed and seedling traits that affect performance of Pomaderris species. Front. Plant Sci. 2022, 12, 779651. [Google Scholar] [CrossRef] [PubMed]
Zhang, F.; Xue, H.; Lu, X.J.; Zhang, B.; Wang, F.; Ma, Y.; Zhang, Z.H. Autotetraploidization enhances drought stress tolerance in two apple cultivars. Trees-Struct. Funct. 2015, 29, 1773–1780. [Google Scholar] [CrossRef]
Ha, M.; Kim, E.D.; Chen, Z.J. Duplicate genes increase expression diversity in closely related species and allopolyploids. Proc. Natl. Acad. Sci. USA 2009, 106, 2295–2300. [Google Scholar] [CrossRef] [PubMed]
Salman-Minkov, A.; Sabath, N.; Mayrose, I. Whole-genome duplication as a key factor in crop domestication. Nat. Plants 2016, 2, 16115. [Google Scholar] [CrossRef] [PubMed]
Yim, W.C.; Swain, M.L.; Ma, D.; An, H.; Bird, K.A.; Curdie, D.D.; Wang, S.; Ham, H.D.; Luzuriaga-Neira, A.; Kirkwood, J.S.; et al. The final piece of the Triangle of U: Evolution of the tetraploid Brassica carinata genome. Plant Cell 2022, 34, 4143–4172. [Google Scholar] [CrossRef] [PubMed]
Bonnema, G.; Del Carpio, D.P.; Zhao, J. Diversity analysis and molecular taxonomy of Brassica vegetable crops. Genet. Genom. Breed. Veg. Brassicas 2011, 1, 81–124. [Google Scholar]
Wang, T.; van Dijk, A.D.J.; Bucher, J.; Liang, J.; Wu, J.; Bonnema, G.; Wang, X. Interploidy Introgression Shaped Adaptation during the Origin and Domestication History of Brassica napus. Mol. Biol. Evol. 2023, 40, msad199. [Google Scholar] [CrossRef] [PubMed]
Chen, F.; Gao, J.; Li, W.; Liu, Y.; Fang, P.; Peng, Z. Colchicine-induced tetraploidy influences morphological and cytological characteristics and enhances accumulation of anthocyanins in a red-fleshed radish (Raphanus sativus L.). Hortic. Environ. Biotechnol. 2021, 62, 937–948. [Google Scholar] [CrossRef]
Chen, Z.J.; Ni, Z.F. Mechanisms of genomic rearrangements and gene expression changes in plant polyploids. Bioessays 2006, 28, 240–252. [Google Scholar] [CrossRef]
Cui, X.B.; Hu, M.; Yao, S.L.; Zhang, Y.Y.; Tang, M.Q.; Liu, L.J.; Cheng, X.H.; Tong, C.B.; Liu, S.Y. BnaOmics: A comprehensive platform combining pan-genome and multi-omics data from. Plant Commun. 2023, 4, 100609. [Google Scholar] [CrossRef] [PubMed]
Niu, Y.; Liu, Q.Q.; He, Z.S.; Raman, R.; Wang, H.; Long, X.X.; Qin, H.; Raman, H.; Parkin, I.A.P.; Bancroft, I.; et al. A Brassica carinata pan-genome platform for Brassica crop improvement. Plant Commun. 2024, 5, 100725. [Google Scholar] [CrossRef] [PubMed]
Claros, M.G.; Bautista, R.; Guerrero-Fernández, D.; Benzerki, H.; Seoane, P.; Fernández-Pozo, N. Why Assembling Plant Genome Sequences Is So Challenging. Biology 2012, 1, 439–459. [Google Scholar] [CrossRef] [PubMed]
Jiang, X.; Song, Q.; Ye, W.; Chen, Z.J. Concerted genomic and epigenomic changes accompany stabilization of Arabidopsis allopolyploids. Nat. Ecol. Evol. 2021, 5, 1382–1393. [Google Scholar] [CrossRef] [PubMed]
Song, J.M.; Guan, Z.; Hu, J.; Guo, C.; Yang, Z.; Wang, S.; Liu, D.; Wang, B.; Lu, S.; Zhou, R.; et al. Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus. Nat. Plants 2020, 6, 34–45. [Google Scholar] [CrossRef] [PubMed]
Huang, C.; Ying, H.; Yang, X.; Gao, Y.; Li, T.; Wu, B.; Ren, M.; Zhang, Z.; Ding, J.; Gao, J.; et al. The Cardamine enshiensis genome reveals whole genome duplication and insight into selenium hyperaccumulation and tolerance. Cell Discov. 2021, 7, 62. [Google Scholar] [CrossRef] [PubMed]
Tanaka, H.; Hori, T.; Yamamoto, S.; Toyoda, A.; Yano, K.; Yamane, K.; Itoh, T. Haplotype-resolved chromosomal-level assembly of wasabi (Eutrema japonicum) genome. Sci. Data 2023, 10, 441. [Google Scholar] [CrossRef] [PubMed]
Hu, Q.; Ma, Y.; Mandáková, T.; Shi, S.; Chen, C.; Sun, P.; Zhang, L.; Feng, L.; Zheng, Y.; Feng, X.; et al. Genome evolution of the psammophyte Pugionium for desert adaptation and further speciation. Proc. Natl. Acad. Sci. USA 2021, 118, e2025711118. [Google Scholar] [CrossRef]
Robinson, D.O.; Coate, J.E.; Singh, A.; Hong, L.L.; Bush, M.; Doyle, J.J.; Roeder, A.H.K. Ploidy and Size at Multiple Scales in the Arabidopsis Sepal. Plant Cell 2018, 30, 2308–2329. [Google Scholar] [CrossRef] [PubMed]
Otto, S.P. The evolutionary consequences of polyploidy. Cell 2007, 131, 452–462. [Google Scholar] [CrossRef]
Yuan, S.X.; Liu, Y.M.; Fang, Z.Y.; Yang, L.M.; Zhuang, M.; Zhang, Y.Y.; Sun, P.T. Study on the Relationship Between the Ploidy Level of Microspore-Derived Plants and the Number of Chloroplast in Stomatal Guard Cells in Brassica oleracea. Agric. Sci. China 2009, 8, 939–946. [Google Scholar] [CrossRef]
Roddy, A.B.; Théroux-Rancourt, G.; Abbo, T.; Benedetti, J.W.; Brodersen, C.R.; Castro, M.; Castro, S.; Gilbride, A.B.; Jensen, B.; Jiang, G.F.; et al. The Scaling of Genome Size and Cell Size Limits Maximum Rates of Photosynthesis with Implications for Ecological Strategies. Int. J. Plant Sci. 2020, 181, 75–87. [Google Scholar] [CrossRef]
Williams, J.H.; Oliveira, P.E. For things to stay the same, things must change: Polyploidy and pollen tube growth rates. Ann. Bot. 2020, 125, 925–935. [Google Scholar] [CrossRef] [PubMed]
Hoshi, Y.; Azumatani, M.; Suyama, C.; Adamec, L. Determination of Ploidy Level and Nuclear DNA Content in the Droseraceae by Flow Cytometry. Cytologia 2017, 82, 321–327. [Google Scholar] [CrossRef]
Li, W.; Liu, L.; Wang, Y.; Fan, G.; Zhang, S.; Wang, Y.; Liao, K. Determination of genome size and chromosome ploidy of selected taxa from Prunus armeniaca by flow cytometry. Sci. Hortic. 2020, 261, 108987. [Google Scholar] [CrossRef]
Song, K.M.; Lu, P.; Tang, K.L.; Osborn, T.C. Rapid Genome Change in Synthetic Polyploids of Brassica and Its Implications for Polyploid Evolution. Proc. Natl. Acad. Sci. USA 1995, 92, 7719–7723. [Google Scholar] [CrossRef] [PubMed]
Feng, B.; Yi, S.V.; Zhang, M.; Zhou, X. Development of novel EST-SSR markers for ploidy identification based on de novo transcriptome assembly for Misgurnus anguillicaudatus. PLoS ONE 2018, 13, e0195829. [Google Scholar] [CrossRef]
Shafey, S.; El-Maaty, S.A.; El Habbasha, S.; Elarabi, N.I. Evaluation of the growth, yield traits and the genetic diversity of some Brassica napus genotypes under Egyptian conditions. Beni-Suef Univ. J. Basic Appl. Sci. 2023, 12, 48. [Google Scholar] [CrossRef]
Ranallo-Benavidez, T.R.; Jaron, K.S.; Schatz, M.C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 2020, 11, 1432. [Google Scholar] [CrossRef]
Niedringhaus, T.P.; Milanova, D.; Kerby, M.B.; Snyder, M.P.; Barron, A.E. Landscape of Next-Generation Sequencing Technologies. Anal. Chem. 2011, 83, 4327–4341. [Google Scholar] [CrossRef]
Wenger, A.M.; Peluso, P.; Rowell, W.J.; Chang, P.C.; Hall, R.J.; Concepcion, G.T.; Ebler, J.; Fungtammasan, A.; Kolesnikov, A.; Olson, N.D.; et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 2019, 37, 1155–1162. [Google Scholar] [CrossRef] [PubMed]
Lieberman-Aiden, E.; van Berkum, N.L.; Williams, L.; Imakaev, M.; Ragoczy, T.; Telling, A.; Amit, I.; Lajoie, B.R.; Sabo, P.J.; Dorschner, M.O.; et al. Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science 2009, 326, 289–293. [Google Scholar] [CrossRef] [PubMed]
Deshpande, A.S.; Ulahannan, N.; Pendleton, M.; Dai, X.G.; Ly, L.; Behr, J.M.; Schwenk, S.; Liao, W.; Augello, M.A.; Tyer, C.; et al. Identifying synergistic high-order 3D chromatin conformations from genome-scale nanopore concatemer sequencing. Nat. Biotechnol. 2022, 40, 1488–1499. [Google Scholar] [CrossRef]
Jia, K.-H.; Liu, H.; Zhang, R.-G.; Xu, J.; Zhou, S.-S.; Jiao, S.-Q.; Yan, X.-M.; Tian, X.-C.; Shi, T.-L.; Luo, H. Chromosome-scale assembly and evolution of the tetraploid Salvia splendens (Lamiaceae) genome. Hortic. Res. 2021, 8, 177. [Google Scholar] [CrossRef]
Mitros, T.; Session, A.M.; James, B.T.; Wu, G.A.; Belaffif, M.B.; Clark, L.V.; Shu, S.Q.; Dong, H.X.; Barling, A.; Holmes, J.R.; et al. Genome biology of the paleotetraploid perennial biomass crop. Nat. Commun. 2020, 11, 5442. [Google Scholar] [CrossRef]
Campomayor, N.B.; Waminal, N.E.; Kang, B.Y.; Nguyen, T.H.; Lee, S.S.; Huh, J.H.; Kim, H.H. Subgenome Discrimination in Brassica and Raphanus Allopolyploids Using Microsatellites. Cells 2021, 10, 2358. [Google Scholar] [CrossRef]
Liu, S.; Liu, Y.; Yang, X.; Tong, C.; Edwards, D.; Parkin, I.A.; Zhao, M.; Ma, J.; Yu, J.; Huang, S.; et al. The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nat. Commun. 2014, 5, 3930. [Google Scholar] [CrossRef] [PubMed]
Ma, J.X.; Bennetzen, J.L. Rapid recent growth and divergence of rice nuclear genomes. Proc. Natl. Acad. Sci. USA 2004, 101, 12404–12410. [Google Scholar] [CrossRef]
Jia, K.H.; Wang, Z.X.; Wang, L.; Li, G.Y.; Zhang, W.; Wang, X.L.; Xu, F.J.; Jiao, S.Q.; Zhou, S.S.; Liu, H.; et al. SubPhaser: A robust allopolyploid subgenome phasing method based on subgenome-specific k-mers. New Phytol. 2022, 235, 801–809. [Google Scholar] [CrossRef]
Slotte, T.; Ceplitis, A.; Neuffer, B.; Hurka, H.; Lascoux, M. Intrageneric phylogeny of Capsella (Brassicaceae) and the origin of the tetraploid C. bursa-pastoris based on chloroplast and nuclear DNA sequences. Am. J. Bot. 2006, 93, 1714–1724. [Google Scholar] [CrossRef]
Yan, J.; Zhang, C.; Zhang, M.; Zhou, H.; Zuo, Z.; Ding, X.; Zhang, R.; Li, F.; Gao, Y. Chromosome-level genome assembly of the Colorado potato beetle, Leptinotarsa decemlineata. Sci. Data 2023, 10, 36. [Google Scholar] [CrossRef] [PubMed]
Ye, Z.; Lu, R.; Li, C.; Yang, D.; Zeng, Z.; Lin, W.; Cheng, J.; Yang, Z.; Wang, L.; Gao, Y.; et al. Haplotype-resolved and chromosome-level genome assembly of Colorado potato beetle. J. Genet. Genom. 2023, 50, 532–535. [Google Scholar] [CrossRef] [PubMed]

Figure 1. The evolutionary procedure of polyploidization in the Brassicaceae family. The blue symbols represent diploids that have undergone the conventional evolutionary process, while the red symbols denote closely related diploids. The three phases of polyploidization: paleopolyploidization, mesopolyploidization, and neopolyploidization. In paleopolyploidization, a genome duplication event is followed by diploidization, where the duplicated genome returns to a diploid-like state over evolutionary time. In mesopolyploidization, a genome that was diploidized during paleopolyploidization undergoes another genome duplication event and is diploidized once again. Alternatively, the diploid genome may remain unchanged without undergoing further duplication. In neopolyploidization, polyploidy is maintained after experiencing genome duplication. The genome can be fully replicated to become an autotetraploid or hybridize with a closely related species to form an allotetraploid.

Figure 2. The GenomeScope plots for the (a) diploid Arabidopsis thaliana [70] and (b) allopolyploid Eutrema japonicum within the Brassicaceae family.

Figure 3. Subgenome discrimination using genomic information from closely related species. (a) The phylogenetic tree of the relationship between subgenome A, B, and its closely related species. (b) Distinguishing subgenomes through a comparison of sequence similarity with ancestral species. (c) Mapping analysis against parental species, with colored regions indicating mapping coverage. This Figure was reproduced based on the data from Tanaka et al. [58].

Figure 4. Subgenome phasing of allotetraploid Eutrema japonicum. (These Figures were from unpublished data). (a) A Principal Component Analysis (PCA) confirming effective phasing into three distinct subgenomes, evident by the unique patterns of differential 15-mers and homoeologous chromosomes. (b) A hierarchical clustering approach, where the top axis color bar represents k-mer specificity to each subgenome and the side axis color bar denotes chromosome assignment to subgenomes, with the heatmap reflecting Z-scaled k-mer abundance. (c) The timeline of LTR-RT insertions specific to each subgenome.

Table 1. The current status of polyploid genome assembly within the Brassicaceae family.

Species	Estimated Genome Size	Ploidy Level	Long-Read Sequencing Platform	Chromosome Capture Sequencing Platform	Ref.
Arabidopsis suecica	~272 Mb	Allotetraploid	PacBio Sequel	Hi-C	[55]
Armoracia rusticana	~636 Mbp	Allotetraploid	ONT, PacBio HiFi	Hi-C	[12]
Brassica carinata	~1.31 Gbp	Allotetraploid	PacBio	Hi-C	[47]
Brassica juncea	~922 Mb	Allotetraploid	PacBio RSII	Hi-C	[10]
Brassica napus	~1132 Mb	Allotetraploid	PacBio SMRT	Hi-C	[56]
Cardamine enshiensis	~443 Mb	Allotetraploid	PacBio Sequel	Hi-C	[57]
Eutrema japonicum	~1512.1 Mb	Allotetraploid	PacBio CLR	Hi-C	[58]
Pugionium cornutum	~570 Mb	Allotetraploid	GridION, PacBio RS II
Pugionium dolabratum	~606 Mb	Allotetraploid	GridION, PacBio RS II		[59]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Jeon, D.; Kim, C. Polyploids of Brassicaceae: Genomic Insights and Assembly Strategies. Plants 2024, 13, 2087. https://doi.org/10.3390/plants13152087

AMA Style

Jeon D, Kim C. Polyploids of Brassicaceae: Genomic Insights and Assembly Strategies. Plants. 2024; 13(15):2087. https://doi.org/10.3390/plants13152087

Chicago/Turabian Style

Jeon, Donghyun, and Changsoo Kim. 2024. "Polyploids of Brassicaceae: Genomic Insights and Assembly Strategies" Plants 13, no. 15: 2087. https://doi.org/10.3390/plants13152087

APA Style

Jeon, D., & Kim, C. (2024). Polyploids of Brassicaceae: Genomic Insights and Assembly Strategies. Plants, 13(15), 2087. https://doi.org/10.3390/plants13152087

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Polyploids of Brassicaceae: Genomic Insights and Assembly Strategies

Abstract

1. Introduction

2. Polyploidization of the Brassicaceae Family

2.1. Diversification of Brassicaceae

2.2. Evolution of the Biosynthetic Pathways to Produce GSLs

2.3. Polyploidy and Its Role in the Evolution and Domestication of Brassica Species for Agricultural Trait Development

3. Challenges and Solutions in Sequencing and Assembling Polyploidy Plant Genomes

3.1. Assessing Ploidy Levels in Plants

3.2. Genomic Characteristics of Polyploid Plants

3.3. Sequencing Technology Advancements and Limitations in Polyploids

3.4. Sub-Genome Discrimination in Allopolyploids

3.5. Genome Assembly Strategies of Autopolyploids in the Brassicaceae

4. Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI