Article

Draft Genome Assembly of Parnassius epaphus Provides New Insights into Transposable Elements That Drive Genome Expansion in Alpine Parnassius butterflies

Guangxi Key Laboratory of Sericulture Ecology and Intelligent Technology Application, Guangxi Collaborative Innovation Center of Modern Sericulture and Silk, School of Chemistry and Bioengineering, Hechi University, Hechi 546300, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Diversity 2025, 17(11), 794; https://doi.org/10.3390/d17110794 (registering DOI)
Submission received: 7 October 2025 / Revised: 4 November 2025 / Accepted: 12 November 2025 / Published: 13 November 2025

Abstract

The expansion of genomes is a major evolutionary force, yet its role in facilitating adaptation to extreme environments remains enigmatic. Here, we investigate alpine Parnassius butterflies, a rare genus characterized by exceptionally large genomes, to unravel the interplay between genome architecture and high-altitude colonization. We present a new, 1.46 Gb draft genome assembly for Parnassius epaphus and perform a comparative analysis across six species. Our findings reveal a massive 3- to 5-fold genome expansion driven predominantly by Long Interspersed Nuclear Elements (LINEs). Counterintuitively, we discover that larger genomes possess a proportionally smaller fraction of young, active transposable elements (TEs), challenging the prevailing paradigm that recent TE proliferation is the primary driver of genome size. Instead, our temporal analysis demonstrates that this expansion is a legacy of two ancient TE waves (~8 and ~14 Mya), which remarkably coincide with major uplift phases of the Tibetan Plateau. We propose a model where the selective retention of these ancient TEs, mechanistically linked to major geological upheavals, provided the crucial genomic plasticity for colonizing Earth’s most extreme terrestrial habitats. This study re-frames TEs not merely as genomic parasites but as pivotal architects of adaptive genome evolution in response to profound environmental change.

1. Introduction

Transposable elements (TEs) are now recognized as fundamental architects of genome structure and evolution across all life forms [1,2]. These mobile genetic elements make up a substantial portion of eukaryotic genomes, often accounting for the predominant fraction of total genome content [2,3]. In insect genomes, TEs have emerged as key players in adaptation, speciation, and response to environmental challenges [4,5]. Recent studies have demonstrated that TE-mediated genomic reorganization can facilitate rapid adaptation to new or changing environments by creating novel regulatory networks, disrupting existing genes, or creating entirely new gene functions [6,7]. This is particularly evident in insects inhabiting extreme environments, where genomic plasticity may confer significant adaptive advantages [8]. In Orthoptera [9] and Lepidoptera [10] in particular, TEs represent key determinants of genome size variation, driving both expansion and contraction over evolutionary timescales. Their ubiquity and diversity raise important questions about their functional significance beyond being mere genomic parasites.
The genus Parnassius (Apollo butterflies) represents an exceptional model system for studying the role of TEs in adaptation to extreme environments [10,11]. These charismatic lepidopterans primarily inhabit high-altitude regions across the Northern Hemisphere, with many species endemic to the Qinghai–Tibetan Plateau (QTP) and adjacent mountain ranges [10,11,12]. Their adaptation to harsh alpine conditions characterized by intense UV radiation, low oxygen levels, and extreme temperature fluctuations makes them ideal candidates for investigating genomic mechanisms underlying high-altitude adaptation [13,14,15].
While several studies have examined physiological and population-level adaptations in Parnassius butterflies, the genomic basis of their environmental adaptation has been significantly advanced through recent genomic studies of P. glacialis [10] and P. apollo [16]. These studies have revealed considerable variation in TE content and activity, with robust evidence suggesting that TE dynamics play crucial roles in shaping their genomic architecture and facilitating adaptation to challenging environments. These genomic mechanisms have contributed to substantial genome expansion in Parnassius butterflies, establishing them as having among the largest genomes not only within Papilionidae but across the entire Lepidoptera order [10,11]. Studies on P. glacialis revealed that TE-mediated genome expansion and reorganization facilitated their adaptation during altitudinal range shifts, highlighting the potential importance of TEs in high-altitude adaptation [10,13,17]. Current debates in the field center on whether TEs primarily function as genomic parasites that occasionally provide beneficial mutations or whether they represent sophisticated genomic tools that organisms harness for adaptive evolution. Evidence supporting both perspectives exists, and Parnassius butterflies offer an opportunity to address this controversy in the context of extreme environmental adaptation [10,14,16].
In this study, we present an expanded comparative analysis of TEs across six Parnassius species (P. apollo, P. behrii, P. cephalus, P. epaphus, P. glacialis, and P. orleans), significantly broadening the taxonomic scope of previous investigations. Notably, we contribute the new genome of P. epaphus, a species of exceptional ecological and evolutionary significance inhabiting the Three Rivers Source region of the Qinghai–Tibetan Plateau, one of Earth’s most pristine high-altitude ecosystems [18,19]. This genome represents a critical resource for understanding adaptation mechanisms in extreme environments, as P. epaphus occupies one of the highest-altitude ranges among butterflies worldwide, facing some of the most severe environmental pressures known to Lepidoptera. Combining all these data, we aim to characterize the diversity, distribution, and evolutionary dynamics of TEs in these high-altitude specialists and explore potential associations between TE activity and adaptation to alpine environments. Our comparative genomic approach allows us to identify conserved and lineage-specific patterns of TE evolution within Parnassius, enhancing our understanding of how mobile genetic elements may contribute to genomic plasticity and environmental adaptation in high-altitude insects. This research may provide broader implications for understanding evolutionary responses to extreme environments in the face of climate change.

2. Materials and Methods

2.1. Sample Collection and DNA Extraction

Three adult specimens of P. epaphus (two males and one female) were collected from the Three Rivers Source region (Sanjiangyuan; 34.8° N, 95.5° E, elevation 4500 m) on the Qinghai–Tibetan Plateau. Males and females were distinguished based on morphological characteristics including wing pattern and size differences, following taxonomic descriptions for this species. Thoracic tissue was dissected from fresh specimens and immediately preserved in RNAlater solution (Thermo Fisher Scientific, Waltham, MA, USA). High-molecular-weight genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol for insect tissues. Briefly, thoracic muscle tissue (~25 mg) was disrupted in Buffer ATL with proteinase K digestion at 56 °C until complete lysis was achieved. DNA quality was assessed using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and Qubit 3.0 Fluorometer (Invitrogen, Carlsbad, CA, USA), while DNA integrity was verified using pulsed-field gel electrophoresis.

2.2. Genome Sequencing

For P. epaphus, we employed a hybrid sequencing approach combining long and short reads. Oxford Nanopore Technologies (ONT) sequencing was performed at HaoRui genomics Bio-Tech (Xi’an, China) using a GridION X5 sequencer (Oxford Nanopore Technologies, Oxford, UK). High-molecular-weight DNA was prepared using the SQK-LSK109 ligation sequencing kit (Oxford Nanopore Technologies, Oxford, UK) following the manufacturer’s instructions. For short-read sequencing, paired-end libraries with insert sizes of approximately 350 bp were constructed using the NEBNext Ultra DNA Library Prep Kit (New England Biolabs, Ipswich, MA, USA) and sequenced on an Illumina X Ten platform (Illumina, San Diego, CA, USA) with 150 bp paired-end reads. We aimed for approximately 50× genome coverage with ONT reads and 100× coverage with Illumina reads to ensure high-quality assembly.

2.3. Genome Assembly and Quality Assessment

The P. epaphus genome size was estimated using KmerGenie v1.7051 [20] with default parameters. ONT long reads were corrected and assembled using NextDenovo v2.5.2 [21] with default parameters for insect genomes. Haplotigs were removed using purge_dups v1.2.5 [22] to generate a haploid reference assembly. Assembly quality and completeness were assessed using BUSCO v5.8.2 [23] against the lepidoptera_odb12 dataset. For comparative analyses, we obtained previously published genome assemblies for five additional Parnassius species: P. apollo (GCA_907164705.1), P. behrii (GCA_036936625.1), P. cephalus (GWHEQHO00000000), P. glacialis (GCA_033319125.1), and P. orleans (GCA_029286625.1) from public databases (NCBI genome database, CNCB genome warehouse).

2.4. Genome Annotation and Transposable Element Identification

Transposable elements and other repetitive sequences were identified using EarlGrey v6.0.1 [24], a comprehensive pipeline specifically designed for TE annotation. TEs were classified into major classes (DNA transposons, LTR retrotransposons, non-LTR retrotransposons) and subclasses following standard nomenclature. The age distribution of TEs was estimated by calculating the Kimura 2-parameter distances between individual TE copies and their consensus sequences using the scripts provided in the EarlGrey package. We examined the correlation between TE content and genome size using linear regression models in R v4.1.0.
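As a rough illustration of the divergence measure described above, the following sketch computes the textbook Kimura 2-parameter distance between a TE copy and its consensus from a pairwise alignment. It is not the script shipped with EarlGrey, which may apply additional corrections; the function name and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    K = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitions and transversions among compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the pair is too diverged (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A->G) in ten aligned sites.
print(round(k2p_distance("AAAAACCCCC", "GAAAACCCCC"), 4))  # 0.1116
```

The distance is slightly larger than the raw mismatch proportion (0.1 in the toy case) because the model corrects for multiple substitutions at the same site.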
To ensure consistency across all six Parnassius genomes (P. apollo, P. behrii, P. cephalus, P. epaphus, P. glacialis, and P. orleans), we employed a uniform annotation approach using miniprot v0.13-r248 [25]. The homology-based prediction utilized protein sequences from the well-annotated P. apollo genome as a reference, while Augustus was employed for ab initio gene prediction. Gene models were further filtered to remove partial gene models (coverage < 80% of the reference gene) and pseudogenes.
A comparative genomic analysis was performed to investigate the evolutionary relationships and the dynamics of gene family evolution. This analysis included the six Parnassius species, two other papilionid species (Iphiclides podalirius, Papilio machaon), and an outgroup species, Leptidea sinapis. Orthologous gene families were identified using OrthoFinder v2.5.4 with default parameters. A species tree was constructed using the concatenated sequences of single-copy orthologs. Divergence times were estimated using the MCMCTree program in the PAML package, calibrated with divergence times retrieved from the previous genomic analysis of Parnassius genomes. The evolution of gene family size (gain and loss) along the phylogenetic tree was analyzed using CAFE v5.1.0, which employs a stochastic birth and death model to infer changes in gene family size across the phylogeny. GO enrichment analysis was conducted using TBtools-II v2.371; GO terms with a Q-value (FDR-adjusted p-value) less than 0.05 were considered significantly enriched.

2.5. Statistical Analysis and Visualization

All statistical analyses were performed in R v4.1.0 [26]. Comparisons of TE content across species were conducted using ANOVA followed by Tukey’s HSD post hoc test. Correlations between environmental variables and TE characteristics were assessed using Pearson’s correlation coefficient. p-values were adjusted for multiple testing using the Benjamini–Hochberg method. To investigate the temporal dynamics of transposable element evolution across the studied butterfly species, we employed a multi-faceted approach combining sequence divergence analysis with geological time calibration. For each major TE class in all genomes, we calculated Kimura 2-parameter (K2P) distances between individual TE copies and their corresponding consensus sequences. These K2P distances were converted to absolute time estimates using the formula T = K/(2r) [10], where T represents time in millions of years ago (MYA), K is the K2P distance, and r is the neutral substitution rate (estimated at 5.6 × 10⁻⁹ substitutions per site per year for Parnassius, based on published estimates [15]). Density distributions of TE insertion ages were generated using the ggplot2 package in R, with kernel density estimation adjusted using Gaussian smoothing. To account for the differential genomic impact of TE insertions, we weighted these distributions by the genomic coverage of each TE category.
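The age conversion above can be sketched in a few lines. This is a minimal illustration that assumes only the formula and substitution rate quoted in the text; the function names are hypothetical and the example K2P value is chosen purely for easy arithmetic.

```python
# Neutral substitution rate quoted in the text (per site per year).
SUBSTITUTION_RATE = 5.6e-9


def k2p_to_age_my(k2p: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Convert a K2P distance to an insertion age in millions of years.

    Implements T = K / (2r); the result in years is divided by 1e6.
    """
    if rate <= 0:
        raise ValueError("substitution rate must be positive")
    return k2p / (2 * rate) / 1e6


def age_my_to_k2p(age_my: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Inverse conversion: the K2P distance expected for a given age."""
    return age_my * 1e6 * 2 * rate


print(round(k2p_to_age_my(0.112), 2))  # 10.0
```

Under this rate, an age of 8 million years corresponds to a K2P distance of about 0.09 and an age of 14 million years to about 0.16, which gives a quick way to read a divergence landscape on a time axis.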
Visualization of genomic features, TE distributions, and comparative analyses was performed using custom R v4.1.0 scripts [26]. Circos plots depicting genome structure and TE distribution were generated using Circos v0.69-8 [27]. Additional visualizations and data integration were facilitated using online OmicStudio tools v3.6 (https://www.omicstudio.cn/tool (accessed on 17 May 2025)) [28].

3. Results

3.1. P. epaphus Genome Assembly

We estimated the P. epaphus genome size at approximately 1.48 Gb using NGS short reads. This estimate aligns closely with genome sizes of other Parnassius species, indicating its typicality within the genus. Using Oxford Nanopore Technology (ONT) long reads, we generated a high-quality assembly with a total scaffold size of 1.46 Gb (1,462,984,654 bp) after purification and cleaning, remarkably consistent with our initial estimate. Notably, the scaffold N50 of 45,036,452 bp and contig N50 of 346,045 bp indicate a highly contiguous assembly, with only 14 scaffolds required to cover 50% of the genome (L50 = 14). The low rate of ambiguous bases (N rate = 0.00035) further confirms minimal gaps, while the GC content of 37.05% is consistent with other lepidopteran genomes. The longest scaffold reaches 89,215,753 bp, highlighting the effectiveness of long-read sequencing in resolving complex repetitive regions typical of large Parnassius genomes. BUSCO quality assessment against the lepidoptera_odb12 dataset demonstrated exceptional completeness: 92.1% of conserved orthologs were present (87.3% single-copy, 4.8% duplicated), 3.1% were fragmented, and only 4.8% were missing.
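For readers unfamiliar with the contiguity metrics reported above, the sketch below shows how N50 and L50 are conventionally derived from a list of sequence lengths. It is a generic illustration, not the tool used in this study; the function name and toy lengths are hypothetical.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold or contig lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the assembly;
    L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("lengths must contain at least one positive value")
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running * 2 >= total:  # reached half of the total length
            return length, count
    raise AssertionError("unreachable: cumulative sum must reach the total")


# Toy lengths: total 30, so the two longest (10 + 8) cover half.
print(n50_l50([10, 8, 6, 4, 2]))  # (8, 2)
```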
Gene annotation using Geta identified 23,979 gene models, consistent with gene counts observed in other reannotated Parnassius genomes. Detailed statistics on protein-coding genes revealed typical patterns for lepidopteran genomes, such as a mean gene length of 2462.71 bp and an average of 1.92 exons per gene. Notably, no alternative splicing (AS) genes were detected, and intergenic regions show a mean length of 11,316.30 bp for positive lengths, with 5696 instances of negative intergenic lengths (indicating potential gene overlaps). These metrics support the high quality of both the assembly and annotation, as they align with expected genomic architectures in butterflies.
Repeat sequence annotation revealed that repetitive elements constitute 67.39% (985.90 Mb) of the P. epaphus genome, highlighting the significant contribution of repetitive DNA to the characteristically large Parnassius genomes. Non-LTR retrotransposons, particularly LINE elements, dominate the repeat landscape, comprising 28.86% (422.22 Mb) of the genome, while SINE elements account for 8.54% (124.94 Mb). LTR retrotransposons and DNA transposons constitute 5.75% (84.12 Mb) and 7.25% (106.07 Mb) of the genome, respectively. Rolling circle transposons represent 2.86% (41.84 Mb), with the remaining repetitive content consisting of unclassified repeats (12.82%, 187.55 Mb), satellite DNA (0.39%, 5.71 Mb), simple repeats (0.83%, 12.14 Mb), and low complexity regions (0.09%, 1.32 Mb).
To enhance the utility of our assembly, we anchored it to chromosome level using the high-quality chromosome-scale P. glacialis assembly as a reference. We successfully mapped 1.29 Gb (88.3%) of the P. epaphus genome to the 29 chromosomes of P. glacialis. This high mapping rate indicates significant chromosomal conservation between these Parnassius species, despite differences in evolutionary history and ecological adaptations. This chromosome-level scaffolding provides a valuable framework for comparative genomic analyses. The circular visualization of the chromosome-anchored P. epaphus genome (Figure 1) displays the distribution of various genomic features across the 29 pseudochromosomes. The Circos plot reveals heterogeneous patterns across chromosomes, potentially reflecting differential evolutionary pressures on various genomic regions. This chromosome-level P. epaphus genome assembly provides a crucial genomic resource for understanding the genetic basis of high-altitude adaptation in butterflies.
The question of whether massive TE proliferation in Parnassius has led to more frequent gene turnover—a potential mechanism for rapid adaptation—prompted an analysis of gene family evolution dynamics across nine lepidopteran species (Figure 2). Our phylogenetic analysis estimated that the Parnassius genus diverged from its sister taxon approximately 40.90 million years ago (Mya). Within the genus, divergence events occurred more recently, such as the split between P. epaphus and P. orleans around 4.78 Mya. Our findings, however, present a complex picture that challenges the simple expectation of a positive correlation between TE content and gene turnover. The Parnassius lineages generally exhibited a notably lower number of gene family gains (ranging from 126 to 243) compared to the outgroup lineages such as I. podalirius (1183 gains) and P. machaon (1069 gains). The pattern of gene family loss was more varied: most Parnassius species showed moderate levels of loss (e.g., P. behrii with 55 losses, P. cephalus with 94 losses), while P. epaphus experienced a substantial loss of 1286 gene families. This number, however, was still less than the 2725 families lost in the I. podalirius lineage. These results suggest that the profound impact of TEs on genome size in Parnassius is not directly coupled with an accelerated rate of gene gain or loss. The relationship between TE activity and gene content evolution appears to be non-linear and may be influenced by other lineage-specific evolutionary forces.
Despite the lower overall rate of gene gain, we identified 28 gene families that underwent significant expansion at the root of the Parnassius clade, suggesting these expansions may be linked to the early adaptation of this genus to alpine environments. To investigate their potential biological roles, we conducted a GO enrichment analysis on the 210 genes from these expanded families (Figure 2B). The analysis revealed significant enrichment in several key functional categories. In the ‘Cellular Component’ category, the most highly enriched terms were related to DNA and chromatin organization, including protein-DNA complex (GO:0032993), chromatin (GO:0000785), and DNA packaging complex (GO:0044815). In ‘Molecular Function’, terms associated with binding were prevalent, such as chromatin binding (GO:0003682) and protein-containing complex binding (GO:0044877). For ‘Biological Process’, the most strikingly enriched term was coagulation (GO:0050817). Additionally, several terms related to neuronal function and development were significantly enriched, including presynaptic active zone (GO:0048786), presynaptic endocytic zone (GO:0098833), and amyloid-beta clearance (GO:0097242). These enriched functions point towards potential adaptive mechanisms in Parnassius, involving genomic regulation, physiological stress responses, and nervous system modifications, which may have been crucial for their successful colonization of high-altitude habitats.

3.2. Genomic Expansion and Repetitive Element Diversification in Parnassius Butterflies

Analysis of repetitive element content across the eight butterfly genomes revealed striking differences between Parnassius species and outgroups (Table S1). Comprehensive investigation of the relationship between genome size and repetitive content demonstrated a strong positive correlation (R² = 0.97, p < 0.0001; Figure 3), indicating that repetitive element proliferation is the primary driver of genomic expansion in Parnassius butterflies. The six Parnassius species exhibited substantially larger genomes (1.23–1.59 Gb) with remarkably higher repetitive content (67.39–72.56%) compared to the outgroup species I. podalirius (0.43 Gb, 34.92%) and P. machaon (0.25 Gb, 28.68%).
Detailed analysis of repetitive element composition revealed that this expansion was not uniform across all repeat types (Figure 4, Table S1). Non-LTR retrotransposons, particularly LINE elements, showed the most pronounced expansion, comprising 22.09–32.55% of Parnassius genomes but only 5.87–12.19% in the outgroups. This 3–5 fold enrichment of LINE elements appears to be the predominant contributor to genome size increase in Parnassius butterflies. SINE elements also showed significant enrichment (5.21–8.54% in Parnassius vs. 1.65–4.55% in outgroups), as did LTR retrotransposons (4.60–9.78% vs. 2.33–2.71%) and DNA transposons (7.25–8.81% vs. 2.03–2.29%). Unclassified repeat elements consistently represented 12.61–13.58% of Parnassius genomes compared to approximately 7.9% in outgroups.
Interestingly, hierarchical clustering showed that the six Parnassius species exhibit species-specific patterns of repeat element composition. For instance, P. glacialis showed the highest LINE content (32.55%), while P. cephalus displayed the highest SINE (8.15%) and RC transposon (8.09%) proportions. P. behrii and P. apollo exhibited notably higher LTR retrotransposon content (9.78% and 8.42%, respectively) than other Parnassius species. These variations suggest independent and possibly lineage-specific bursts of transposable element activity during Parnassius evolution. The substantial genome expansion observed in Parnassius butterflies, primarily driven by proliferation of repetitive elements, particularly LINEs, suggests potential adaptive significance of genome architecture in high-altitude environments. This genomic feature may have facilitated the adaptation of Parnassius butterflies to extreme conditions through various mechanisms, including enhanced genetic plasticity, altered gene regulation networks, or buffering against environmental stressors.
To identify the primary factors driving genome expansion in Parnassius butterflies, we conducted a comprehensive correlation analysis between genome metrics and the proportions of different repetitive element classes across the nine butterfly species (Table 1). This analysis allowed us to quantify the contribution of each repeat type to the observed genomic expansion. The results unequivocally identify LINE elements as the primary driver of genome expansion. LINE content exhibited an exceptionally strong and significant positive correlation with genome size (Pearson’s r = 0.96, p < 0.001) and the total percentage of masked repeats (r = 0.96, p < 0.001). This indicates that the massive accumulation of LINEs is the principal determinant of the dramatic genome size differences observed.
Following LINEs, DNA transposons and Unclassified repeats also emerged as major contributors, showing very strong positive correlations with genome size (r = 0.93, p < 0.001 and r = 0.95, p < 0.001, respectively). LTR retrotransposons and SINE elements played a secondary but still significant role, with moderate to strong positive correlations with genome size (r = 0.83, p < 0.01 and r = 0.77, p < 0.05, respectively). Conversely, an interesting inverse relationship was observed for low complexity regions, which showed a strong negative correlation with genome size (r = −0.95, p < 0.001). This suggests a proportional dilution of these simple sequences as the genome becomes filled with more complex TEs. Other repeat types, such as Rolling circle (RC) transposons and simple repeats, showed no significant correlation with genome size, indicating their minimal contribution to the overall expansion.
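The correlation coefficients reported above follow the standard Pearson definition, which can be sketched as below. The study used R; this Python version is only an illustration, the function name is hypothetical, and the usage example includes just the three species whose genome size and total repeat content are both stated in the text, so its result will not reproduce the values in Table 1.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (n >= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Pairs stated in the text: P. epaphus, I. podalirius, P. machaon.
genome_size_gb = [1.46, 0.43, 0.25]
repeat_percent = [67.39, 34.92, 28.68]
print(pearson_r(genome_size_gb, repeat_percent))
```

For a simple linear regression with one predictor, the coefficient of determination equals the square of this value.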

3.3. Evolutionary Dynamics of Transposable Elements in Parnassius Butterflies

The temporal patterns of TE accumulation and proliferation provide critical insights into genome evolution. To investigate TE dynamics across Parnassius butterflies, we estimated the age of TE insertions using Kimura 2-parameter (K2P) distances and converted these to absolute time estimates using the formula T = K/(2r). The divergence profiles of TEs varied substantially across species, revealing distinct evolutionary trajectories (Figure 5).
The outgroup species I. podalirius and P. machaon displayed markedly different TE evolutionary patterns compared to Parnassius species. Both showed particularly high LTR retrotransposon activity in their recent evolutionary history, potentially contributing to genome restructuring within their lineages. Within the Parnassius genus, we observed considerable heterogeneity in TE divergence profiles that appears to correlate with phylogenetic relationships and ecological adaptations. These butterflies’ genomes showed similar TE age distributions with RC elements displaying peaks at intermediate K2P values (≈0.20–0.25), suggesting synchronized expansion of these elements coinciding with their divergence. However, the high-altitude specialists P. orleans and P. epaphus exhibited more distinct patterns. While both showed older, more diffuse TE distributions overall (particularly for RC elements), these two species displayed a unique pattern with a secondary peak in Unknown elements at higher K2P values (≈0.20) that was not observed in other species, potentially indicating moderate historical activity with fewer recent proliferation events. DNA transposons generally showed older divergence profiles across all species (higher K2P values with flatter distributions), suggesting reduced activity in recent evolutionary history of papilionid butterflies. Their low density and relatively older age distribution suggest these elements have become largely inactive in these genomes, contrasting with the continued activity of retrotransposons, particularly in certain lineages. Collectively, these observed patterns suggest that TE activity has been particularly dynamic during the divergence and ecological specialization of Parnassius butterflies, potentially contributing to their remarkable diversification across high-altitude environments.
To better understand the temporal dynamics of different TE classes, we converted K2P distances to insertion age estimates and visualized these across species (Figure 6). This analysis revealed striking differences in the age distribution and genomic contribution of major TE classes across the examined butterflies. The overall TE landscape revealed distinct evolutionary trajectories across species. P. epaphus exhibited older TE populations across most classes (lighter coloration), with minimal evidence of recent proliferation events. In contrast, P. glacialis and the outgroup P. machaon harbored distinctly younger TE populations across multiple TE classes (darker coloration), suggesting more recent and possibly ongoing transposition activity. These distinctive temporal patterns likely reflect different selective regimes acting on TE proliferation throughout these species’ evolutionary histories, potentially linked to their distinct ecological adaptations and demographic histories.
In detail, LINEs emerged as the dominant TE class across all species, consistently showing the largest bubble sizes, corresponding with their substantial genomic contribution. LTR elements displayed a distinct pattern with moderate genomic contribution but demonstrated substantial variation in insertion ages across species. Particularly notable were the significantly younger LTR populations in P. behrii and P. apollo (darker purple coloration), suggesting recent transposition activity in these lineages. In comparison, other species exhibited older LTR populations (lighter pink coloration). DNA transposons showed a consistent pattern across species with moderate genomic contribution but varied considerably in age profiles. Most striking was P. glacialis, which exhibited remarkably young DNA transposons (deep purple coloration, suggesting recent activity, close to P. machaon), contrasting sharply with the much older DNA transposon populations in other species (lighter pink to red coloration). This suggests a lineage-specific burst of DNA transposon activity in this species. SINEs showed relatively consistent age profiles and genomic contribution across all species (similar bubble sizes and pink-purple coloration), indicating more uniform evolutionary histories compared to other TE classes.
The insertional age profile of TEs revealed distinctive waves of TE proliferation across species (Figure 7). These density distributions, weighted by genomic coverage, demonstrate significant differences in the timing and intensity of major TE expansion events across the studied butterflies. In the outgroup species I. podalirius and P. machaon, we observed a unimodal distribution with a prominent peak at approximately 12–13 MYA, dominated by LINE elements with substantial contributions from DNA transposons and other Unknown TEs. This concentrated wave suggests a significant historical TE expansion event, followed by relatively modest recent activity.
Across all studied Parnassius species, LINEs consistently formed the largest component of the TE landscape, followed by DNA transposons, while other TE classes (LTR, PLE, RC, and SINE) contributed more modestly to the overall age distribution. This dominance of LINEs suggests they have been particularly successful at proliferating in Parnassius genomes, potentially due to their autonomous replication mechanism. Notably, these TE landscapes showed a similar bimodal distribution with peaks at approximately 8 MYA and 14 MYA, with LINEs constituting the largest contribution to both waves. This pattern suggests two distinct episodes of TE amplification in their evolutionary history, potentially coinciding with adaptation events. In particular, P. glacialis and P. orleans showed strikingly similar bimodal distributions with prominent recent peaks at approximately 7–10 MYA and secondary peaks at 15–20 MYA, though P. orleans exhibited a slightly sharper recent peak.
Collectively, the substantial variation in TE age profiles among Parnassius species indicates that TE dynamics respond rapidly to changing ecological conditions and may play important roles in adaptive evolution. Notably, the high-altitude specialist P. epaphus displayed the oldest overall TE profile with minimal recent activity, suggesting stronger purifying selection against TE proliferation in extreme high-altitude environments, while other species showed evidence of more recent TE expansion events.
To investigate potential factors influencing genome size variation, we examined the relationship between genome size and the proportion of young TEs (<20 MYA) across species (Figure 8). We observed a negative correlation (r = −0.577, p = 0.1346) between genome size and the proportion of young TEs. However, this relationship is not statistically significant and appears weaker when excluding P. epaphus, which shows a strikingly low proportion of young TEs despite its large genome size. P. behrii and P. epaphus, which possess the largest genomes (1.59 Gb and 1.46 Gb, respectively), displayed lower proportions of young TEs compared to the smaller genomes of P. machaon and I. podalirius (0.25 Gb and 0.43 Gb, respectively). This counterintuitive pattern suggests that recent TE proliferation is not the primary determinant of genome size expansion in Parnassius butterflies. Instead, the substantial genome size variation in this genus appears to be driven by older TE accumulation events (>20 MYA) and potentially other mechanisms of repetitive DNA expansion. This finding contrasts with observations in other lepidopteran groups where recent TE activity strongly correlates with genome size, suggesting distinctive patterns of genome evolution in high-altitude Parnassius butterflies.
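One way the proportion of young TEs could be computed from per-copy annotations is sketched below. The text does not state whether the proportion is counted by copy number or by base pairs; this sketch assumes weighting by base pairs, consistent with the coverage weighting described in the Methods. The function name, input layout, and toy values are hypothetical.

```python
from typing import Iterable, Tuple


def young_te_fraction(
    copies: Iterable[Tuple[float, int]],
    max_age_my: float = 20.0,
    rate: float = 5.6e-9,
) -> float:
    """Fraction of TE base pairs younger than max_age_my.

    copies: (k2p_distance, length_bp) for each annotated TE copy.
    Ages use T = K / (2r) with the substitution rate quoted in the text.
    """
    total_bp = 0
    young_bp = 0
    for k2p, length_bp in copies:
        age_my = k2p / (2 * rate) / 1e6
        total_bp += length_bp
        if age_my < max_age_my:
            young_bp += length_bp
    if total_bp == 0:
        raise ValueError("no TE copies supplied")
    return young_bp / total_bp


# Toy input: a 1 kb copy at K2P 0.05 and a 3 kb copy at K2P 0.30.
print(young_te_fraction([(0.05, 1000), (0.30, 3000)]))  # 0.25
```

With the quoted rate, the 20 MYA cutoff corresponds to a K2P distance of 0.224, so the same split can be applied directly on the divergence axis.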

4. Discussion

The genomic architecture of Parnassius butterflies offers a compelling narrative of how transposable element dynamics can sculpt a genome for survival in extreme environments. Our comprehensive analysis of six Parnassius genomes, anchored by the new assembly of the high-altitude specialist P. epaphus, reveals a profound and unexpected relationship between TE evolution, genome expansion, and adaptation.

4.1. Transposable Elements as Architects of Genome Expansion in Extreme Environments

Our findings demonstrate that Parnassius genomes have undergone dramatic expansion, primarily driven by the proliferation of non-LTR retrotransposons, particularly LINE elements. This expansion is remarkable not only in its magnitude (Parnassius genomes are 3–5 times larger than those of related butterflies) but also in its selective nature [29,30]. The strong positive correlation between genome size and repetitive content (R² = 0.97), coupled with the asymmetric expansion of specific TE classes, suggests that genome enlargement in Parnassius is not merely the result of relaxed selection against TEs, but potentially an adaptive response to high-altitude environments [10,13]. This phenomenon of TE-driven genome expansion is not unique to high-altitude specialists [10,16]. A similar trend has been documented in the wood-white butterfly, Leptidea sinapis, whose genome has also expanded significantly due to TE proliferation, predominantly involving LINEs, DNA elements, and unclassified repeats [31]. However, the evolutionary narrative in Parnassius presents a compelling distinction. While the expansion in Leptidea is attributed to a more general TE hyperactivity with a significant burst between 10 and 20 Mya, potentially driven by stochastic processes like genetic drift, our analysis reveals a different pattern. The massive genomes of Parnassius are not the result of recent or ongoing TE hyperactivity; instead, they appear to be ancient relics, shaped by two major historical waves of TE accumulation (~8 and ~14 Mya) that coincide with the uplift of the Tibetan Plateau. This suggests that while the tool of expansion, TE proliferation, is shared, the evolutionary context and selective pressures have forged distinct genomic outcomes. In Parnassius, the retention of these ancient TEs appears to be a key feature linked to their colonization of extreme environments, a scenario distinct from the more neutral evolutionary dynamics proposed for Leptidea.
Most intriguingly, our temporal analysis of TE insertions revealed that genome expansion in Parnassius is not driven primarily by recent TE proliferation events, as commonly observed in other taxa [10]. Rather, we found that species with the largest genomes (P. behrii and P. epaphus) harbor proportionally fewer young TEs compared to outgroups with smaller genomes. This counterintuitive pattern suggests that Parnassius genomes have retained and accumulated ancient TE insertions that would typically be eliminated through purifying selection in other lineages [11,16,32]. The persistence of these ancient TEs may reflect their functional co-option or the evolution of novel regulatory mechanisms that mitigate their deleterious effects while harnessing their potential adaptive benefits in extreme environments.

4.2. Evolutionary Stratification of TE Landscapes Across Altitude Gradients

The divergent TE landscapes across Parnassius species reveal a potential correlation between altitude and TE dynamics. P. epaphus, which inhabits the highest elevations among our study species, displays the oldest overall TE profile with minimal recent proliferation, while moderate-altitude species like P. glacialis exhibit more recent TE activity. This pattern suggests a model whereby extreme high-altitude environments may impose stronger selection against ongoing TE proliferation, perhaps due to increased genome instability risks under heightened UV radiation and oxidative stress [10,33]. Specifically, the strikingly low proportion of young TEs in P. epaphus—despite its large genome—may reflect enhanced purifying selection in its extreme habitat (e.g., 4500 m elevation in the Three Rivers Source region), where new TE insertions could be more deleterious due to environmental stressors [10]. This contrasts with other species and contributes to the weak overall negative correlation observed in Figure 8; excluding P. epaphus weakens the trend, highlighting its unique evolutionary trajectory potentially shaped by prolonged high-altitude isolation.
Intriguingly, our analysis suggests the presence of distinct waves of TE expansion that may coincide with major geological events in the Qinghai–Tibet Plateau’s formation history [34,35]. The dominance of LINEs across the examined Parnassius genomes may be attributable to their autonomous replication mechanism, which might confer advantages in certain genomic environments [36,37]. These Parnassius species exhibit a bimodal distribution of TE insertion ages with peaks at approximately 8 MYA and 14 MYA, with LINEs constituting the largest contribution to both waves. This temporal pattern is particularly noteworthy because it broadly corresponds with what paleoclimatic and geological studies have identified as significant periods of plateau uplift in the region [38,39]. It is tempting to speculate that genomic restructuring through heightened TE activity during these periods facilitated rapid adaptation to newly available high-altitude ecological niches [39]. While further research is needed to establish causality, this temporal concordance between geological and genomic dynamics may reflect a role for TEs in mediating evolutionary responses to dramatic environmental changes. If confirmed through additional studies, these findings would suggest that TEs can serve as genomic “tools” that organisms leverage during periods of rapid environmental transition, potentially accelerating adaptive evolution in the face of novel selective pressures.
Our study further explored a key potential mechanism through which TEs could influence adaptation: the mediation of gene gain and loss events. It is often hypothesized that the genomic instability caused by TE proliferation could increase the rate of gene duplication and deletion, thereby providing raw material for evolutionary innovation [16,31,40,41]. The results from our gene family evolution analysis, however, indicate that Parnassius butterflies, despite their high TE content, do not show a higher frequency of gene gain or loss compared to related species with smaller, less TE-rich genomes. In fact, gene family gains were markedly lower in the Parnassius clade. This counterintuitive result suggests that the adaptive role of TEs in Parnassius, if any, may not be primarily mediated through large-scale changes in gene family size [10,40].
However, the significant expansion of 28 specific gene families at the base of the Parnassius genus suggests that targeted, rather than genome-wide, changes in gene content may have been crucial. The functional enrichment of these expanded families provides compelling, albeit preliminary, clues to their adaptive significance. The strong enrichment of genes related to chromatin binding and protein-DNA complexes points to a potential co-evolutionary dynamic between the host genome and its massive TE load. The expansion of these gene families might represent a sophisticated regulatory adaptation to manage and silence TEs, preventing widespread genomic instability while potentially co-opting TE-derived sequences into novel regulatory networks. This genomic control system would be essential for survival in high-stress alpine environments where maintaining genomic integrity is paramount [10,16,40].
Furthermore, the enrichment of terms like coagulation and those related to neuronal function (e.g., presynaptic components, amyloid-beta clearance) suggests adaptations beyond simple genomic maintenance. In insects, coagulation of hemolymph is a critical defense and stress response mechanism [42,43]. Its enhancement could be adaptive in response to physical injury or physiological stress induced by hypoxia and extreme temperatures at high altitudes. Similarly, modifications to the nervous system could be vital for functioning in low-oxygen conditions, which are known to impact neuronal activity [44,45]. The expansion of these gene families may have provided the genetic toolkit for Parnassius to fine-tune its physiological and neurological responses to the unique challenges of alpine life. While these findings do not establish a direct causal link between TE proliferation and these specific gene family expansions, they highlight a plausible indirect pathway: TE-induced genomic plasticity may have created the opportunity for these key adaptive gene families to expand and evolve, ultimately facilitating the successful radiation of Parnassius butterflies across the world’s highest mountains.

4.3. Transposable Elements as Adaptive Mechanisms in Extreme Environments

Our findings contribute to a fundamental reconceptualization of TE biology in evolutionary processes, particularly in extreme environments [38,46,47]. The distinctive expansion patterns of specific TE families in Parnassius butterflies suggest selective retention of elements that may confer adaptive advantages in high-altitude habitats. This non-random distribution contrasts with more uniform TE landscapes observed in lowland species, indicating that TEs have played a structured role in shaping Parnassius genome evolution.
The enlarged genomes of Parnassius butterflies, with their abundant and diverse TE repertoires, likely provide several adaptive advantages in high-altitude environments [10,11,16,29,48]. We speculate that their genome size expansion was driven by the following factors [49,50]. First, TE-mediated genomic rearrangements increase genetic plasticity, potentially facilitating rapid adaptation to environmental stressors including low oxygen, temperature extremes, and intense UV radiation. Second, TEs serve as resources for novel regulatory elements, enabling more sophisticated control of gene expression under variable environmental conditions. Third, the expanded genome architecture may provide buffering against environmental insults, particularly protecting against the damaging effects of UV radiation and temperature fluctuations characteristic of alpine habitats.
These genomic adaptations have significant implications for understanding evolutionary responses to environmental change. Climate warming is dramatically transforming alpine ecosystems, with high-altitude specialists like Parnassius facing habitat contraction and novel selection pressures [51,52,53]. Our comparative analysis reveals that Parnassius species vary substantially in their TE profiles—some showing evidence of ongoing TE activity while others display more stable TE landscapes. This variation suggests differential adaptive potential across the genus, with species possessing more dynamic TE landscapes (such as P. glacialis) potentially holding greater adaptive flexibility, while those with more stable genomes (like P. epaphus) may rely on existing genetic variation to respond to new challenges.
This natural experiment in genomic architecture across closely related species provides valuable insights for predicting evolutionary trajectories under climate change scenarios. By linking genomic features to ecological resilience in Parnassius, we can develop more sophisticated models of how genome architecture shapes adaptive potential. These insights extend beyond Lepidoptera, offering a framework for understanding how genomic composition influences species’ vulnerability and responses to rapid environmental change across diverse taxa facing similar challenges in the Anthropocene [54,55].

4.4. Future Directions

While our study provides compelling evidence for the role of TEs in Parnassius genome evolution and high-altitude adaptation, several questions remain. Future research should focus on functional validation of candidate adaptive TE insertions, particularly those associated with hypoxia response, cold tolerance, and UV resistance pathways [10,11,16,18,19]. Population genomic studies across altitude gradients would further illuminate how TE dynamics vary with environmental conditions and could reveal ongoing processes of adaptation [13]. Additionally, comparative transcriptomic analyses would help elucidate how TE-mediated regulatory innovation influences gene expression patterns under environmental stress [15,29].
The Parnassius butterfly genome, with its rich history of TE-driven expansion and restructuring, represents more than an evolutionary curiosity—it offers a compelling model for understanding how genomes respond to extreme environments. Our findings suggest that genome size expansion through TE accumulation may represent an underappreciated evolutionary strategy for coping with environmental challenges. As climate change increasingly pushes species beyond their physiological limits, understanding the genomic basis of adaptation to extreme environments becomes ever more crucial for predicting and potentially mitigating biodiversity loss.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/d17110794/s1, Table S1: Genome characteristics and repetitive element composition of Parnassius species and outgroup butterflies.

Author Contributions

Conceptualization, D.G.; methodology, N.W. and J.S.; software, J.S.; validation, G.Q.; formal analysis, W.R., N.W. and G.Q.; investigation, W.R. and N.W.; resources, D.G.; data curation, W.R. and N.W.; writing—original draft preparation, W.R.; writing—review and editing, D.G.; visualization, J.S.; supervision, D.G.; project administration, D.G.; funding acquisition, D.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the 2023 Annual Operation Subsidy Project for the Guangxi Key Laboratory of Sericulture Ecology and Intelligent Technology Application (23-026-08); the Special Project of Guangxi Collaborative Innovation Center of Modern Sericulture and Silk (2023GXCSSC09, 2024GXCSSC03, 2024GXCSSC06); the Hechi University high-level talent research start-up fee project (2023GCC017, 2024GCC003); and the Local Science and Technology Development Fund project guided by the central government (Heke ZY230301).

Data Availability Statement

The assembled genome has been deposited in the Zenodo database under the DOI https://doi.org/10.5281/zenodo.15229911.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Hayward, A.; Gilbert, C. Transposable elements. Curr. Biol. 2022, 32, R904–R909.
  2. Feschotte, C. Transposable elements: McClintock’s legacy revisited. Nat. Rev. Genet. 2023, 24, 797–800.
  3. Wells, J.N.; Feschotte, C. A Field Guide to Eukaryotic Transposable Elements. Annu. Rev. Genet. 2020, 54, 539–561.
  4. Zhang, C.; Wang, L.; Dou, L.; Yue, B.; Xing, J.; Li, J. Transposable Elements Shape the Genome Diversity and the Evolution of Noctuidae Species. Genes 2023, 14, 1244.
  5. Gilbert, C.; Peccoud, J.; Cordaux, R. Transposable Elements and the Evolution of Insects. Annu. Rev. Entomol. 2021, 66, 355–372.
  6. Mackay, T.F. Transposable elements and fitness in Drosophila melanogaster. Genome 1989, 31, 284–295.
  7. Lozovskaya, E.R.; Hartl, D.L.; Petrov, D.A. Genomic regulation of transposable elements in Drosophila. Curr. Opin. Genet. Dev. 1995, 5, 768–773.
  8. Adrion, J.R.; Begun, D.J.; Hahn, M.W. Patterns of transposable element variation and clinality in Drosophila. Mol. Ecol. 2019, 28, 1523–1536.
  9. Liu, X.; Majid, M.; Yuan, H.; Chang, H.; Zhao, L.; Nie, Y.; He, L.; Liu, X.; He, X.; Huang, Y. Transposable element expansion and low-level piRNA silencing in grasshoppers may cause genome gigantism. BMC Biol. 2022, 20, 243.
  10. Zhao, Y.; Su, C.; He, B.; Nie, R.; Wang, Y.; Ma, J.; Song, J.; Yang, Q.; Hao, J. Dispersal from the Qinghai-Tibet plateau by a high-altitude butterfly is associated with rapid expansion and reorganization of its genome. Nat. Commun. 2023, 14, 8190.
  11. Höglund, J.; Dias, G.; Olsen, R.A.; Soares, A.; Bunikis, I.; Talla, V.; Backström, N. A Chromosome-Level Genome Assembly and Annotation for the Clouded Apollo Butterfly (Parnassius mnemosyne): A Species of Global Conservation Concern. Genome Biol. Evol. 2024, 16, evae031.
  12. Su, C.; Ding, C.; Zhao, Y.; He, B.; Nie, R.; Hao, J. Diapause-Linked Gene Expression Pattern and Related Candidate Duplicated Genes of the Mountain Butterfly Parnassius glacialis (Lepidoptera: Papilionidae) Revealed by Comprehensive Transcriptome Profiling. Int. J. Mol. Sci. 2023, 24, 5577.
  13. Tao, R.; Xu, C.; Wang, Y.; Sun, X.; Li, C.; Ma, J.; Hao, J.; Yang, Q. Spatiotemporal Differentiation of Alpine Butterfly Parnassius glacialis (Papilionidae: Parnassiinae) in China: Evidence from Mitochondrial DNA and Nuclear Single Nucleotide Polymorphisms. Genes 2020, 11, 188.
  14. Koo, K.A.; Park, S.U. A Dark Future of Endangered Mountain Species, Parnassius bremeri, Under Climate Change. Ecol. Evol. 2025, 15, e71178.
  15. Ding, C.; Su, C.; Li, Y.; Zhao, Y.; Wang, Y.; Wang, Y.; Nie, R.; He, B.; Ma, J.; Hao, J. Interspecific and Intraspecific Transcriptomic Variations Unveil the Potential High-Altitude Adaptation Mechanisms of the Parnassius Butterfly Species. Genes 2024, 15, 1013.
  16. Podsiadlowski, L.; Tunström, K.; Espeland, M.; Wheat, C.W. The Genome Assembly and Annotation of the Apollo Butterfly Parnassius apollo, a Flagship Species for Conservation Biology. Genome Biol. Evol. 2021, 13, evab122.
  17. Matsushita, A.; Awata, H.; Wakakuwa, M.; Takemura, S.Y.; Arikawa, K. Rhabdom evolution in butterflies: Insights from the uniquely tiered and heterogeneous ommatidia of the Glacial Apollo butterfly, Parnassius glacialis. Proc. R. Soc. B Biol. Sci. 2012, 279, 3482–3490.
  18. Si, C.; Chen, K.; Hao, J. The complete mitochondrial genome of Parnassius mercurius Grum-Grshimailo (Lepidoptera: Papilionidae: Parnassiinae). Mitochondrial DNA Part B Resour. 2020, 5, 538–540.
  19. Das, G.N.; Ali, M.; Bálint, Z.; Singh, N.; Chandra, K.; Gupta, S.K. Visiting Ladakh Himalaya for a better knowledge of butterflies: New faunistic data with annotations (Lepidoptera, Papilionoidea). Zootaxa 2023, 5271, 401–445.
  20. Chikhi, R.; Medvedev, P. Informed and automated k-mer size selection for genome assembly. Bioinformatics 2014, 30, 31–37.
  21. Hu, J.; Wang, Z.; Sun, Z.; Hu, B.; Ayoola, A.O.; Liang, F.; Li, J.; Sandoval, J.R.; Cooper, D.N.; Ye, K.; et al. NextDenovo: An efficient error correction and accurate assembly tool for noisy long reads. Genome Biol. 2024, 25, 107.
  22. Guan, D.; McCarthy, S.A.; Wood, J.; Howe, K.; Wang, Y.; Durbin, R. Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 2020, 36, 2896–2898.
  23. Seppey, M.; Manni, M.; Zdobnov, E.M. BUSCO: Assessing Genome Assembly and Annotation Completeness. Methods Mol. Biol. 2019, 1962, 227–245.
  24. Baril, T.; Galbraith, J.; Hayward, A. Earl Grey: A Fully Automated User-Friendly Transposable Element Annotation and Analysis Pipeline. Mol. Biol. Evol. 2024, 41, msae068.
  25. Li, H. Protein-to-genome alignment with miniprot. Bioinformatics 2023, 39, btad014.
  26. R Core Team. R: A Language and Environment for Statistical Computing 2014; R Foundation for Statistical Computing: Vienna, Austria, 2008.
  27. Krzywinski, M.; Schein, J.; Birol, I.; Connors, J.; Gascoyne, R.; Horsman, D.; Jones, S.J.; Marra, M.A. Circos: An information aesthetic for comparative genomics. Genome Res. 2009, 19, 1639–1645.
  28. Lyu, F.; Han, F.; Ge, C.; Mao, W.; Chen, L.; Hu, H.; Chen, G.; Lang, Q.; Fang, C. OmicStudio: A composable bioinformatics cloud platform with real-time feedback that can generate high-quality graphs for publication. iMeta 2023, 2, e85.
  29. Su, C.; Xie, T.; Wang, Y.; Si, C.; Li, L.; Ma, J.; Li, C.; Sun, X.; Hao, J.; Yang, Q. Miocene Diversification and High-Altitude Adaptation of Parnassius Butterflies (Lepidoptera: Papilionidae) in Qinghai-Tibet Plateau Revealed by Large-Scale Transcriptomic Data. Insects 2020, 11, 754.
  30. Bílá, K.; Šipoš, J.; Kindlmann, P.; Kuras, T. Consequences for selected high-elevation butterflies and moths from the spread of Pinus mugo into the alpine zone in the High Sudetes Mountains. PeerJ 2016, 4, e2094.
  31. Talla, V.; Suh, A.; Kalsoom, F.; Dinca, V.; Vila, R.; Friberg, M.; Wiklund, C.; Backström, N. Rapid Increase in Genome Size as a Consequence of Transposable Element Hyperactivity in Wood-White (Leptidea) Butterflies. Genome Biol. Evol. 2017, 9, 2491–2505.
  32. Meglecz, E.; Petenian, F.; Danchin, E.; D’Acier, A.C.; Rasplus, J.Y.; Faure, E. High similarity between flanking regions of different microsatellites detected within each of two species of Lepidoptera: Parnassius apollo and Euphydryas aurinia. Mol. Ecol. 2004, 13, 1693–1700.
  33. Zhang, H.; Zhang, P.; Niu, Y.; Tao, T.; Liu, G.; Dong, C.; Zheng, Z.; Zhang, Z.; Li, Y.; Niu, Z.; et al. Genetic basis of camouflage in an alpine plant and its long-term co-evolution with an insect herbivore. Nat. Ecol. Evol. 2025, 9, 628–638.
  34. Chen, Y.J.; Zhu, L.; Wu, Q.N.; Hu, C.C.; Qu, Y.F.; Ji, X. Geological and climatic influences on population differentiation of the Phrynocephalus vlangalii species complex (Sauria: Agamidae) in the northern Qinghai-Tibet Plateau. Mol. Phylogenetics Evol. 2022, 169, 107394.
  35. Chen, K.; Wang, B.; Chen, C.; Zhou, G. The relationship between niche breadth and phylogenetic characteristics of eight species of rhubarb on the Qinghai-Tibet Plateau, Asia. Ecol. Evol. 2024, 14, e11040.
  36. Xiao, S.J.; Mou, Z.B.; Yang, R.B.; Fan, D.D.; Liu, J.Q.; Zou, Y.; Zhu, S.L.; Zou, M.; Zhou, C.W.; Liu, H.P. Genome and population evolution and environmental adaptation of Glyptosternon maculatum on the Qinghai-Tibet Plateau. Zool. Res. 2021, 42, 502–513.
  38. Geng, Y.; Guan, Y.; Qiong, L.; Lu, S.; An, M.; Crabbe, M.J.C.; Qi, J.; Zhao, F.; Qiao, Q.; Zhang, T. Genomic analysis of field pennycress (Thlaspi arvense) provides insights into mechanisms of adaptation to high elevation. BMC Biol. 2021, 19, 143.
  39. Yu, W.B.; Randle, C.P.; Lu, L.; Wang, H.; Yang, J.B.; dePamphilis, C.W.; Corlett, R.T.; Li, D.Z. The Hemiparasitic Plant Phtheirospermum (Orobanchaceae) Is Polyphyletic and Contains Cryptic Species in the Hengduan Mountains of Southwest China. Front. Plant Sci. 2018, 9, 142.
  40. Zhu, Z.; Su, C.; Guo, X.; Zhao, Y.; Nie, R.; He, B.; Hao, J. Genome-Wide Identification, Gene Duplication, and Expression Pattern of NPC2 Gene Family in Parnassius glacialis. Genes 2025, 16, 249.
  41. Wos, G.; Choudhury, R.R.; Kolář, F.; Parisod, C. Transcriptional activity of transposable elements along an elevational gradient in Arabidopsis arenosa. Mob. DNA 2021, 12, 7.
  42. Schmid, M.R.; Dziedziech, A.; Arefin, B.; Kienzle, T.; Wang, Z.; Akhter, M.; Berka, J.; Theopold, U. Insect hemolymph coagulation: Kinetics of classically and non-classically secreted clotting factors. Insect Biochem. Mol. Biol. 2019, 109, 63–71.
  43. Eleftherianos, I.; Revenis, C. Role and importance of phenoloxidase in insect hemostasis. J. Innate Immun. 2011, 3, 28–33.
  44. Ding, K.; Barretto, E.C.; Johnston, M.; Lee, B.; Gallo, M.; Grewal, S.S. Transcriptome analysis of FOXO-dependent hypoxia gene expression identifies Hipk as a regulator of low oxygen tolerance in Drosophila. G3 2022, 12, jkac263.
  45. Barretto, E.C.; Polan, D.M.; Beevor-Potts, A.N.; Lee, B.; Grewal, S.S. Tolerance to Hypoxia Is Promoted by FOXO Regulation of the Innate Immunity Transcription Factor NF-κB/Relish in Drosophila. Genetics 2020, 215, 1013–1025.
  46. Hu, Y.; Wu, X.; Jin, G.; Peng, J.; Leng, R.; Li, L.; Gui, D.; Fan, C.; Zhang, C. Rapid Genome Evolution and Adaptation of Thlaspi arvense Mediated by Recurrent RNA-Based and Tandem Gene Duplications. Front. Plant Sci. 2021, 12, 772655.
  47. He, X.; Wang, H.; Xu, T.; Zhang, Y.; Chen, C.; Sun, Y.; Qiu, J.W.; Zhou, Y.; Sun, J. Genomic Analysis of a Scale Worm Provides Insights into Its Adaptation to Deep-Sea Hydrothermal Vents. Genome Biol. Evol. 2023, 15, evad125.
  48. Lucas, M.; Rašić, G.; Filazzola, A.; Matter, S.; Roland, J.; Keyghobadi, N. Extremes of snow and temperature affect patterns of genetic diversity and differentiation in the alpine butterfly Parnassius smintheus. Mol. Ecol. 2024, 33, e17503.
  49. Wang, G.; Zhou, N.; Chen, Q.; Yang, Y.; Yang, Y.; Duan, Y. Gradual genome size evolution and polyploidy in Allium from the Qinghai-Tibetan Plateau. Ann. Bot. 2023, 131, 109–122.
  50. Bureš, P.; Elliott, T.L.; Veselý, P.; Šmarda, P.; Forest, F.; Leitch, I.J.; Nic Lughadha, E.; Soto Gomez, M.; Pironon, S.; Brown, M.J.M.; et al. The global distribution of angiosperm genome size is shaped by climate. New Phytol. 2024, 242, 744–759.
  51. Signor, S.; Yocum, G.; Bowsher, J. Life stage and the environment as effectors of transposable element activity in two bee species. J. Insect Physiol. 2022, 137, 104361.
  52. Merenciano, M.; González, J. The Interplay Between Developmental Stage and Environment Underlies the Adaptive Effect of a Natural Transposable Element Insertion. Mol. Biol. Evol. 2023, 40, msad044.
  53. Hu, B.; Xing, Z.; Dong, H.; Chen, X.; Ren, M.; Liu, K.; Rao, C.; Tan, A.; Su, J. Cytochrome P450 CYP6AE70 Confers Resistance to Multiple Insecticides in a Lepidopteran Pest, Spodoptera exigua. J. Agric. Food Chem. 2024, 72, 23141–23150.
  54. Rosser, N.; Seixas, F.; Queste, L.M.; Cama, B.; Mori-Pezo, R.; Kryvokhyzha, D.; Nelson, M.; Waite-Hudson, R.; Goringe, M.; Costa, M.; et al. Hybrid speciation driven by multilocus introgression of ecological traits. Nature 2024, 628, 811–817.
  55. Halsch, C.A.; Shapiro, A.M.; Fordyce, J.A.; Nice, C.C.; Thorne, J.H.; Waetjen, D.P.; Forister, M.L. Insects and recent climate change. Proc. Natl. Acad. Sci. USA 2021, 118, e2002543117.
Figure 1. Circos visualization of the chromosome-level genome assembly of P. epaphus. Circular representation of the P. epaphus genome assembly anchored to 29 pseudo-chromosomes (Chr1-Chr29). The genome totals 1.46 Gb, with 985.90 Mb (67.39%) comprising repetitive sequences and shows 92.1% BUSCO completeness against the lepidoptera_odb12 dataset. Tracks from outermost to innermost: (A) Chromosome ideograms with position scales (Mb); (B) Gene density heat map; (C–J) Distribution of different repeat element classes across chromosomes, including (C) LINE elements, (D) LTR retrotransposons, (E) SINE elements, (F) DNA transposons, (G) Rolling circle (RC) transposons, (H) Satellite DNA, (I) Unknown repeats, and (J) GC content distribution. The center displays a photograph of P. epaphus and key statistics.
Figure 2. Gene family evolution and functional enrichment analysis in Parnassius butterflies. (A) Phylogenetic relationships and gene family evolution across nine butterfly species. The ultrametric tree shows the evolutionary relationships and estimated divergence times (Mya). Numbers on the nodes represent the estimated age in millions of years. Numbers at the end of each branch indicate the estimated number of gene family gains (+, red) and losses (−, green) along that lineage. The green shadows (or bars) at the nodes of the tree represent the confidence interval for the estimated divergence time. (B) GO enrichment analysis of the 210 genes from 28 significantly expanded gene families at the base of the Parnassius clade. The bubble plot displays the top 20 enriched GO terms. The x-axis represents the Rich Factor, the size of the bubble corresponds to the number of genes enriched in the term, and the color indicates the p-value.
Figure 3. Correlation between genome size and repetitive content in Parnassius butterflies and related lepidopterans. The scatter plot illustrates the linear relationship between genome size (Gb) and repetitive sequence content (%) across eight butterfly species. Linear regression analysis reveals a highly significant positive correlation between genome size and repetitive sequence proportion (R² = 0.96, p < 0.0001, black regression line with 95% confidence interval shown in grey). The red dots signify the Parnassius genus, while blue denotes the outgroups.
Figure 4. Hierarchical clustering of repetitive element content across Parnassius and related butterfly genomes. (Left): Dendrogram showing hierarchical clustering based on Pearson correlation of repetitive element profiles among six Parnassius species and two outgroup butterflies. (Center): Heatmap displaying the percentage of different repetitive element classes across genomes, with values indicated in each cell. The color scale corresponds to the percentage values, with warmer colors representing higher percentages and cooler colors representing lower percentages. (Right): Comparative genome sizes, showing the dramatic expansion in Parnassius species (1.23–1.59 Gb) relative to non-Parnassius butterflies (0.25–0.43 Gb).
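The clustering described in the left panel can be sketched from the repeat-class columns of Table 1. This is a minimal illustration, not the authors' pipeline: the caption states only that clustering is based on Pearson correlation, so the distance (1 − r) and the average-linkage rule used here are assumptions, as is the choice of the two outgroups (the two Table 1 species whose genome sizes fall in the 0.25–0.43 Gb range given in the caption).

```python
from math import sqrt

# Repeat-class percentages copied from Table 1, in column order:
# SINE, LINE, LTR, DNA, RC, Unknown, Satellite, Simple repeats, Low complexity.
profiles = {
    "Iphiclides podalirius": [4.55, 12.19, 2.33, 2.03, 4.54, 7.95, 0.27, 0.90, 0.16],
    "Papilio machaon":       [1.65, 5.87, 2.71, 2.29, 6.20, 7.93, 0.64, 1.16, 0.23],
    "Parnassius apollo":     [6.15, 29.00, 8.42, 7.70, 3.37, 13.11, 0.38, 0.95, 0.10],
    "Parnassius behrii":     [5.21, 31.22, 9.78, 7.92, 3.04, 12.65, 1.40, 1.21, 0.08],
    "Parnassius cephalus":   [8.15, 22.09, 6.69, 8.50, 8.09, 13.58, 1.05, 0.80, 0.10],
    "Parnassius epaphus":    [8.54, 28.86, 5.75, 7.25, 2.86, 12.82, 0.39, 0.83, 0.09],
    "Parnassius glacialis":  [6.54, 32.55, 5.19, 8.17, 4.60, 12.61, 1.00, 0.83, 0.08],
    "Parnassius orleans":    [6.56, 28.58, 4.60, 8.81, 6.25, 12.93, 0.27, 0.67, 0.09],
}


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def average_linkage(profiles):
    """Agglomerative clustering on the distance 1 - r (linkage rule is an assumption)."""
    names = list(profiles)
    dist = {
        (a, b): 1 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster leaf pairs.
                pairs = [dist[(a, b)] for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


merges = average_linkage(profiles)
for left, right, d in merges:
    print(f"{d:.4f}  {' + '.join(left)}  |  {' + '.join(right)}")
```

Each printed line is one merge of the dendrogram, from the most similar pair of profiles to the final join. A different linkage rule may change the merge order.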
Figure 5. Divergence distribution of transposable elements across eight butterfly genomes. Density plots showing the distribution of Kimura 2-parameter (K2P) distances for major TE classes in the selected butterflies. Each panel represents a different species, with colored areas indicating distinct TE classes. Lower K2P values indicate more recent TE insertions.
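For readers unfamiliar with the K2P distance on the x-axis, the standard Kimura 2-parameter formula can be written as a short function. This is a generic sketch of the textbook formula, K = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences between a TE copy and its consensus. The function name and inputs are illustrative, and the snippet makes no claim about which tool or corrections the authors applied.

```python
import math


def k2p_distance(p_transitions, q_transversions):
    """Kimura 2-parameter distance from transition (P) and transversion (Q) proportions."""
    a = 1 - 2 * p_transitions - q_transversions
    b = 1 - 2 * q_transversions
    if a <= 0 or b <= 0:
        # The logarithms are undefined when sequences are too divergent (saturation).
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


# Example: 10% transitions and 5% transversions between a copy and its consensus.
k = k2p_distance(0.10, 0.05)
print(f"raw difference = 0.150, K2P distance = {k:.4f}")
```

The corrected distance is larger than the raw proportion of differing sites because it accounts for multiple substitutions at the same position.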
Figure 6. Age distribution of major TE classes across butterfly species. Bubble chart depicting the mean insertion age (color gradient, in millions of years ago, Mya) and genomic proportion (bubble size) of major TE classes across eight butterfly species. Each row represents a TE class, and each column represents a species. Older TE classes appear in warmer colors (yellow/red), while younger classes are shown in cooler colors (purple/blue). Bubble size reflects the percentage of the genome occupied by each TE class.
Figure 7. Temporal dynamics of transposable element insertions across butterfly species. Density plots showing the age distribution of TE insertions in millions of years ago (Mya), weighted by genomic coverage. Each panel represents a different species, with colored areas indicating distinct TE classes. Peaks in the distributions represent major waves of TE expansion.
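The idea of an age distribution "weighted by genomic coverage" can be illustrated with a small sketch. Two things here are assumptions rather than details from the caption: the conversion from K2P distance to age uses the common convention T = K / (2r), and the substitution rate `HYPOTHETICAL_RATE` is a placeholder, not the value used in the study. The toy TE copies are invented for illustration only, and a binned histogram stands in for the smoothed density shown in the figure.

```python
from collections import defaultdict

# Placeholder substitution rate (per site per year); NOT taken from the article.
HYPOTHETICAL_RATE = 3e-9


def insertion_age_mya(k2p, rate=HYPOTHETICAL_RATE):
    """Convert a K2P distance to an age in Mya, assuming T = K / (2r)."""
    return k2p / (2 * rate) / 1e6


def coverage_weighted_histogram(copies, bin_width_mya=1.0):
    """Bin TE copies by age, weighting each copy by its length in bp.

    copies: iterable of (age_mya, length_bp) pairs.
    Returns {bin_start_mya: fraction of total TE base pairs}.
    """
    bp_per_bin = defaultdict(float)
    for age, length in copies:
        bin_start = int(age // bin_width_mya) * bin_width_mya
        bp_per_bin[bin_start] += length
    total_bp = sum(bp_per_bin.values())
    return {b: bp / total_bp for b, bp in sorted(bp_per_bin.items())}


# Invented toy copies: (K2P distance, length in bp).
toy_copies = [(0.0456, 300), (0.0492, 500), (0.0846, 200)]
aged = [(insertion_age_mya(k), length) for k, length in toy_copies]
for bin_start, fraction in coverage_weighted_histogram(aged).items():
    print(f"{bin_start:5.1f} Mya: {fraction:.2f} of TE base pairs")
```

Weighting by length means a few long insertions can contribute more to a peak than many short fragments of the same age, which is why coverage-weighted and count-based distributions can differ.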
Figure 8. Relationship between genome size and young TE proportion across butterfly species. Scatter plot depicting the relationship between genome size (Gb) and the proportion of young TEs (<20 Mya) across eight butterfly species. Point size represents the overall TE content as a percentage of genome size. The dashed line shows the linear regression (r = −0.577, p = 0.1346).
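The reported p-value can be related to the correlation coefficient and the sample size with a short worked check. The sketch assumes a two-sided t-test for a Pearson correlation with n − 2 degrees of freedom; the caption does not name the test, so this is an assumption. The t-distribution tail is obtained by simple numerical integration to keep the example free of external libraries.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def correlation_p_value(r, n, steps=2000):
    """Two-sided p-value for a Pearson r from n points (steps must be even)."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    upper = abs(t)
    # Simpson's rule for P(0 < T < |t|).
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3
    return t, 1 - 2 * central


# Values from the caption: r = -0.577 across eight species.
t_stat, p_value = correlation_p_value(-0.577, 8)
print(f"t = {t_stat:.3f}, two-sided p = {p_value:.4f}")
```

With r = −0.577 and n = 8 this gives t ≈ −1.73 and p ≈ 0.134, in line with the p = 0.1346 in the caption (the small difference is consistent with r being rounded to three decimals). The trend is therefore negative but not significant at the conventional 0.05 threshold.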
Table 1. Summary of genome features and repetitive element composition across nine butterfly species.
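| Species | Genome Size (Gb) | GC (%) | Masked (%) | SINE (%) | LINE (%) | LTR (%) | DNA (%) | RC (%) | Unknown (%) | Satellite (%) | Simple Repeats (%) | Low Complexity (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Leptidea sinapis | 0.69 | 35.73 | 63.55 | 0.00 | 31.95 | 1.87 | 4.56 | 0.71 | 24.12 | 0.34 | 0.71 | 0.00 |
| Iphiclides podalirius | 0.43 | 36.57 | 34.92 | 4.55 | 12.19 | 2.33 | 2.03 | 4.54 | 7.95 | 0.27 | 0.90 | 0.16 |
| Papilio machaon | 0.25 | 34.52 | 28.68 | 1.65 | 5.87 | 2.71 | 2.29 | 6.20 | 7.93 | 0.64 | 1.16 | 0.23 |
| Parnassius apollo | 1.39 | 37.42 | 69.20 | 6.15 | 29.00 | 8.42 | 7.70 | 3.37 | 13.11 | 0.38 | 0.95 | 0.10 |
| Parnassius behrii | 1.59 | 37.65 | 72.56 | 5.21 | 31.22 | 9.78 | 7.92 | 3.04 | 12.65 | 1.40 | 1.21 | 0.08 |
| Parnassius cephalus | 1.27 | 36.57 | 69.04 | 8.15 | 22.09 | 6.69 | 8.50 | 8.09 | 13.58 | 1.05 | 0.80 | 0.10 |
| Parnassius epaphus | 1.46 | 37.05 | 67.39 | 8.54 | 28.86 | 5.75 | 7.25 | 2.86 | 12.82 | 0.39 | 0.83 | 0.09 |
| Parnassius glacialis | 1.35 | 38.01 | 71.58 | 6.54 | 32.55 | 5.19 | 8.17 | 4.60 | 12.61 | 1.00 | 0.83 | 0.08 |
| Parnassius orleans | 1.23 | 37.09 | 68.75 | 6.56 | 28.58 | 4.60 | 8.81 | 6.25 | 12.93 | 0.27 | 0.67 | 0.09 |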
Notes: The table lists genome size, GC content, and the percentage of the genome occupied by each major repetitive element class. “Masked (%)” is the total percentage of the genome identified as repetitive elements, i.e., the sum of the individual repeat classes shown plus other minor, unlisted categories.
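The genome size versus repeat content relationship summarized in Figure 3 can be recomputed from the "Genome Size (Gb)" and "Masked (%)" columns of Table 1 with an ordinary least-squares fit. Figure 3 covers eight species while Table 1 lists nine; the sketch below leaves out Leptidea sinapis because its genome size lies outside the 0.25–0.43 Gb outgroup range given in Figure 4. That exclusion is an inference, not something stated in the captions. With this choice the fit gives R² of about 0.96, matching the value reported in Figure 3.

```python
# (genome size in Gb, masked repeat content in %) copied from Table 1.
# Leptidea sinapis (0.69 Gb, 63.55%) is omitted: an assumption explained in the text.
table1 = {
    "Iphiclides podalirius": (0.43, 34.92),
    "Papilio machaon":       (0.25, 28.68),
    "Parnassius apollo":     (1.39, 69.20),
    "Parnassius behrii":     (1.59, 72.56),
    "Parnassius cephalus":   (1.27, 69.04),
    "Parnassius epaphus":    (1.46, 67.39),
    "Parnassius glacialis":  (1.35, 71.58),
    "Parnassius orleans":    (1.23, 68.75),
}


def linear_fit(points):
    """Ordinary least-squares fit y = slope * x + intercept, plus R squared."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(list(table1.values()))
print(f"masked % = {slope:.2f} * genome size (Gb) + {intercept:.2f}, R^2 = {r_squared:.2f}")
```

Adding the omitted species back into `table1` shows how sensitive the fit is to a single moderately sized, repeat-rich genome.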
Rong, W.; Wei, N.; Song, J.; Qin, G.; Guan, D. Draft Genome Assembly of Parnassius epaphus Provides New Insights into Transposable Elements That Drive Genome Expansion in Alpine Parnassius butterflies. Diversity 2025, 17, 794. https://doi.org/10.3390/d17110794