Article

Chromosome-Level Genome and Variation Map of Eri Silkworm Samia cynthia ricini

1 State Key Laboratory of Resource Insects, Key Laboratory of Sericultural Biology and Genetic Breeding, Ministry of Agriculture and Rural Affairs, College of Sericulture, Textile and Biomass Sciences, Southwest University, Chongqing 400715, China
2 Guangxi Key Laboratory of Silkworm Genetic Improvement and Efficient Breeding, Guangxi Research Academy of Sericultural Science, Nanning 530007, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Biology 2025, 14(6), 698; https://doi.org/10.3390/biology14060698 (registering DOI)
Submission received: 16 May 2025 / Revised: 9 June 2025 / Accepted: 12 June 2025 / Published: 14 June 2025

Simple Summary

The eri silkworm (Samia cynthia ricini) is a resource insect valued for its silk production, nutrient-rich pupae used in food and animal feed, and unique biological traits that intrigue scientists. Despite its importance, genomic resources for this species have remained limited. In this study, we generated a chromosome-level genome (456.16 Mb) using advanced DNA sequencing technologies, revealing key genomic insights such as 15,729 protein-coding genes, a 48.51% repetitive content, and syntenic relationships (including chromosomal fusion/fission events) with the well-studied domestic silkworm (Bombyx mori). We further discovered millions of genetic variations, including SNPs, InDels, and SVs. All the data are freely available in the SilkMeta database, helping researchers and breeders improve silk production, explore sustainable food sources, and advance research on insect biology.

Abstract

The eri silkworm Samia cynthia ricini (S. ricini) is an economically and scientifically significant lepidopteran species, though its genomic resources have remained limited. Here, we present a chromosome-level genome assembly for S. ricini generated through integrated long-read, short-read, and Hi-C sequencing data. The final 456.16 Mb assembly spans 14 chromosomes, exhibiting 98.5% BUSCO completeness and a 48.51% repetitive content. Functional annotation of the 15,729 protein-coding genes against five major databases (NR, SwissProt, Pfam, GO, and KEGG) revealed a maximum annotation rate of 92.71%, demonstrating high gene set quality. Comparative genomics with B. mori uncovered conserved syntenic blocks interspersed with chromosomal fusion/fission events and inversions. We further identified 4.27 million SNPs, 1.02 million InDels, and 53,367 SVs, establishing the first comprehensive variation map for this species. These genomic variations provide a foundation for marker-assisted breeding programs and trait association studies. All the genomic resources and interactive visualization tools were integrated into the SilkMeta database. This study establishes S. ricini as a pivotal resource for comparative lepidopteran genomics and accelerates molecular breeding programs for this agriculturally valuable insect.

1. Introduction

The eri silkworm Samia cynthia ricini (S. ricini, Lepidoptera: Saturniidae), domesticated from Samia canningi, was initially reared in northeastern India for silk production and later introduced to regions including China, Japan, Korea, and Europe [1]. Eri silkworm pupae are rich in protein, serving as a valuable nutritional resource for human consumption and as feed in aquaculture and poultry industries [2,3,4], making this species economically promising. Furthermore, as a newly established lepidopteran model, the eri silkworm exhibits biological properties such as polyphagy, multivoltinism, colored larval epidermis, disease resistance, and a ZZ/Z0 sex-determination system (2n = 28♂/27♀, ZZ♂/Z0♀), which are of significant interest to insect biologists [1,5,6]. These characteristics highlight its dual importance for advancing sericulture practices and deciphering fundamental mechanisms in insect biology.
Over the past two decades, genomic advancements have revolutionized functional genomics and breeding technologies in crops and livestock. For the mulberry silkworm, Bombyx mori (B. mori), a close relative of S. ricini, genome assemblies were released and updated in 2004, 2008, and 2019 [7,8,9], while genetic variation maps were constructed and refined in 2009, 2018, and 2022 [10,11,12]. These resources have enabled breakthroughs in various fields, allowing researchers to decipher the genetic basis of domestication history [11,12], behavior degenerations [13,14], and phenotype mutations [15]; identify targets for breeding traits like silk yield [12,16,17] and disease resistance [18,19]; and elucidate genes governing growth and development (e.g., moultinism and sexual regulation) [20,21]. In 2022, a comprehensive silkworm pan-genome further empowered high-throughput allele mining for functional genomics and breeding [12]. However, aside from the draft genome published in 2021 [5], genomic advancements for S. ricini remain limited. Notably absent is a chromosome-level assembly, which is critical for studying chromosomal evolution and precise genomic variation identification.
Here, we employed a hybrid assembly strategy integrating long-read, short-read, and Hi-C scaffolding, coupled with a comprehensive annotation pipeline using five major public databases (NR, SwissProt, Pfam, GO, and KEGG) and synteny analysis with B. mori. Protein-coding genes were predicted through integrated homology-based and transcriptome-evidenced approaches. The synteny analysis revealed chromosomal fusion/fission events and inversions between S. ricini and B. mori. Leveraging this assembly alongside existing sequencing data, we systematically identified genomic variations, including single nucleotide polymorphisms (SNPs), short insertions/deletions (InDels; <50 bp), and structural variations (SVs; >50 bp). All data were integrated into the SilkMeta database (http://silkmeta.org.cn (accessed on 15 April 2025)) for public access. These resources will allow for accelerated eri silkworm breeding and functional genomics while providing a high-quality reference genome for evolutionary and functional genomic studies across Lepidoptera.

2. Materials and Methods

2.1. Genome, Transcriptome, and Hi-C Sequencing

The experimental S. ricini specimens (GX) were provided by the Guangxi Institute of Sericulture Science. Larvae were reared on Ricinus communis leaves in an air-conditioned room at the Guangxi Institute of Sericulture Science, Nanning, China (22.83° N, 108.31° E), under controlled conditions: 27–29 °C with 80–90% relative humidity for the first to third instars, and 25–27 °C with 75–80% relative humidity thereafter.
For long-read sequencing, we prepared Oxford Nanopore Technologies (ONT) libraries, optimized for 20 kb fragment sizes, using genomic DNA extracted from a single female pupa (GX-F) and a single male pupa (GX-M). Library preparation targeted 9 μg of DNA input, followed by sequencing on the PromethION platform (Oxford Nanopore Technologies, Oxford, UK) using R9.4 flow cells with SQK-LSK109 chemistry (Oxford Nanopore Technologies, Oxford, UK). For short-read sequencing, paired-end libraries with 300–400 bp insert sizes were constructed from 1 μg of genomic DNA and sequenced on the DNBSEQ platform (MGI Tech, Shenzhen, China). Raw data from both platforms were processed according to previously described protocols [12].
For RNA-seq, total RNA was extracted separately from larval, pupal, and adult stages using Trizol reagent (Simgen, Hangzhou, China). Equimolar RNA pools from each developmental phase were combined to construct a normalized composite library. The concentration and purity of pooled RNA were quantified using a Nanodrop 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA), while RNA integrity was evaluated by agarose gel electrophoresis. The sequencing library was constructed using the VAHTS Universal V10 RNA-seq Kit (Cat: NR616-02, Lot: 7E831C4, Vazyme, Nanjing, China) with 1 μg of total RNA, targeting an insert size of approximately 300 bp. Sequencing was performed on the DNBSEQ-T7 platform (MGI Tech, Shenzhen, China).
For Hi-C sequencing, pupal tissue (GX-M) underwent formaldehyde crosslinking followed by MboI restriction digestion. Biotinylated restriction fragments were ligated using T4 DNA ligase, and crosslinks were then reversed by sodium dodecyl sulfate (SDS) and proteinase K treatment. Streptavidin magnetic bead enrichment captured junction fragments, which were processed through end repair, adapter ligation, and PCR amplification. The final library was constructed using 500 ng of genomic DNA, followed by sequencing on the DNBSEQ system (MGI Tech, Shenzhen, China).

2.2. Genome Assembly

Prior to assembly, k-mer frequency analysis (k = 17) was performed on the short-read sequencing data using Jellyfish v2.2.6 [22]. Genome characteristics, including estimated size, repeat content, and heterozygosity rate, were then predicted from the resulting k-mer distribution using GenomeScope v1.0 [23].
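The idea behind a k-mer survey can be illustrated with a minimal sketch. It assumes plain (non-canonical) k-mer counting and the simple estimate "genome size ≈ total k-mers / depth of the main histogram peak"; Jellyfish and GenomeScope use more elaborate counting and model fitting, and the function names `count_kmers` and `estimate_genome_size`, the `min_depth` cutoff, and the toy data are all hypothetical.

```python
from collections import Counter
import random


def count_kmers(reads, k=17):
    """Count every k-mer occurring in a collection of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(kmer_counts, min_depth=2):
    """Estimate genome size as (total k-mers) / (depth of the main peak).

    k-mers seen fewer than min_depth times are treated as likely sequencing
    errors and ignored; this cutoff is an assumption of the sketch.
    """
    histogram = Counter(kmer_counts.values())  # depth -> number of distinct k-mers
    solid = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    peak_depth = max(solid, key=solid.get)
    total_kmers = sum(depth * n for depth, n in solid.items())
    return total_kmers // peak_depth


# Toy data: an error-free 200 bp "genome" sequenced ten times end to end.
rng = random.Random(0)
toy_genome = "".join(rng.choice("ACGT") for _ in range(200))
toy_reads = [toy_genome] * 10
toy_estimate = estimate_genome_size(count_kmers(toy_reads, k=17))
```

On this toy input the estimate equals the number of 17-mer positions in the toy genome (200 − 17 + 1 = 184), which approaches the true length when the genome is much longer than k.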
The de novo genome assembly was conducted as described in a previous report [12]. Briefly, raw ONT reads were error-corrected using Canu v1.8 [24], and the corrected reads were assembled into contigs using Smartdenovo v1.0 [25]. The contigs underwent three rounds of polishing with Racon v1.3.3 [26], followed by a final polishing step with Medaka v0.7.1 (https://github.com/nanoporetech/medaka (accessed on 20 July 2020)). Using this pipeline, genome assemblies for both GX-M and GX-F were generated. Chromosome-level scaffolding of the GX-M genome was achieved by integrating Hi-C data with the 3D-DNA pipeline [27]. The GX-F genome assembly was not scaffolded using Hi-C technology.
A previous study established that S. ricini telomeres consist of (TTAGG)n repeats [28]. Using tidk v0.2.63 [29] with the parameter -w 100000, we identified terminal (TTAGG)n motifs. A chromosome terminus carrying more than 300 TTAGG repeats was considered to contain a telomeric sequence.
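The 300-repeat rule can be sketched as follows. This is not how tidk works internally: the sketch simply counts the motif (on either strand) within a terminal region at each chromosome end, treating the 100,000 bp value as the size of that region, which is an interpretation made for illustration. The function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def terminal_telomeres(chrom_seq, motif="TTAGG", window=100_000, min_repeats=300):
    """Return (start_has_telomere, end_has_telomere) for one chromosome."""
    seq = chrom_seq.upper()
    motif_rc = reverse_complement(motif)

    def repeats(region):
        # str.count counts non-overlapping occurrences; the repeat may be
        # assembled on either strand, so both orientations are checked.
        return max(region.count(motif), region.count(motif_rc))

    start_region, end_region = seq[:window], seq[-window:]
    # "More than 300 repeats" is read as a strict inequality.
    return repeats(start_region) > min_repeats, repeats(end_region) > min_repeats


# Toy chromosome: 301 CCTAA repeats at the start, only 10 TTAGG repeats at the end.
toy_chrom = "CCTAA" * 301 + "ACGT" * 1000 + "TTAGG" * 10
toy_result = terminal_telomeres(toy_chrom, window=2000)
```

For the toy chromosome only the start qualifies, so `toy_result` is `(True, False)`.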
For final assembly validation, we performed two complementary analyses:
(a) All short-read sequencing data were aligned to the assembled genome to calculate genome-wide coverage and mapping efficiency (considering all successfully mapped reads).
(b) Genome integrity was evaluated using Benchmarking Universal Single-Copy Orthologs (BUSCO) v5.5.0 [30] with the Lepidoptera_odb10 database, which reports percentages of complete single-copy, duplicated, and fragmented orthologs.
The chromosome-level GX-M genome was used as the reference for subsequent annotation and variation calling pipelines.

2.3. Genome Annotation

Repeat sequences in the genome were annotated using a de novo prediction approach. A custom repeat library was constructed using RepeatModeler v2.0.1 (http://www.repeatmasker.org/RepeatModeler/ (accessed on 1 November 2024)) with the -LTRStruct parameter, followed by repeat identification and masking with RepeatMasker v4.1.0 (http://repeatmasker.org/ (accessed on 4 November 2024)).
Protein-coding gene prediction combined homology-based and transcriptome-based methods. For homology prediction, we curated protein sequences of Manduca sexta, Chilo suppressalis, Papilio xuthus, Spodoptera frugiperda, and Bombyx mori from NCBI and SilkMeta (http://silkmeta.org.cn (accessed on 12 March 2025)) to build a reference protein database [31]. This database was supplied to Braker3 v3.0.8 [32] as protein evidence for gene prediction. For transcriptome evidence, RNA-seq reads were aligned to the genome using STAR v2.7.11b [33] to generate a sorted BAM file, which was subsequently processed with Braker3 (default parameters). Predictions from both approaches were consolidated into a non-redundant gene set using TSEBRA v1.1.2.5 (default parameters) [32]. The completeness of the integrated gene set was then assessed by BUSCO analysis against the Lepidoptera_odb10 database.
Functional annotation involved three complementary strategies:
(a) Sequence homology: Diamond v2.1.10 [34] was used for NR and SwissProt database searches.
(b) Domain identification: Pfam domains were annotated using InterProScan v5.72-103.0 [35].
(c) Pathway mapping: Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were assigned using the eggNOG-mapper v2.1.12 (http://eggnog-mapper.embl.de (accessed on 14 March 2025)) online tool.
Transfer RNAs (tRNAs) were identified using tRNAscan-SE v2.0.12 [36], ribosomal RNAs (rRNAs) were annotated with RNAmmer v1.2 [37], and other non-coding RNAs (ncRNAs), including microRNAs (miRNAs) and small nuclear RNAs (snRNAs), were annotated by searching against the Rfam database (http://rfam.xfam.org/ (accessed on 9 May 2025)) using Infernal v1.1.5 [38].

2.4. Collinearity Analysis

Bombyx mori (2n = 56 chromosomes), a phylogenetically close relative of S. ricini (2n = 28), served as the reference for comparative karyotype analysis. To investigate the chromosomal evolution patterns of S. ricini GX-M and B. mori Dazao, we conducted a whole-genome synteny analysis using JCVI v1.4.16 [39] with the default parameters. The coding DNA sequences (CDS) of both species were used as input to JCVI, and the jcvi.graphics.karyotype module (invoked with -m jcvi.graphics.karyotype) was used to generate synteny visualization plots for comparative genomic analysis.

2.5. Identification and Annotation of SNPs, InDels, and SVs

We identified SNPs and InDels using short-read sequencing data and detected SVs using long-read sequencing data. In addition to our two S. ricini samples (GX-M and GX-F), we incorporated publicly available short-read and long-read sequencing data of a Japanese S. ricini strain (UT; NCBI accession GCA_014132275.2). Short-read data from GX-F and UT were mapped to the GX-M reference genome using BWA v0.7.17 [40] with default parameters. Unmapped and duplicated reads were filtered using SAMtools v1.9 [41] and Picard v2.18.29 (https://broadinstitute.github.io/picard/ (accessed on 22 March 2025)). Raw SNP/InDel variants were called through GATK4 v4.4.0.0 [42]. The HaplotypeCaller generated GVCFs, followed by CombineGVCFs and GenotypeGVCFs for joint genotyping.
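The GVCF-based joint genotyping workflow named above (HaplotypeCaller, then CombineGVCFs, then GenotypeGVCFs) can be sketched as a helper that only builds the command lines. The file names, the helper name `joint_genotyping_commands`, and the choice to leave every other option at its default are assumptions for illustration; the authors' exact invocations are not given in the text.

```python
def joint_genotyping_commands(reference, sample_bams, cohort_gvcf, output_vcf):
    """Build GATK4 command lines for per-sample GVCF calling and joint genotyping.

    sample_bams maps a sample name to its filtered, duplicate-marked BAM file.
    Nothing is executed here; the function only returns argument lists.
    """
    commands = []
    gvcfs = []
    for sample, bam in sample_bams.items():
        gvcf = f"{sample}.g.vcf.gz"  # hypothetical output naming scheme
        gvcfs.append(gvcf)
        commands.append(["gatk", "HaplotypeCaller", "-R", reference,
                         "-I", bam, "-O", gvcf, "-ERC", "GVCF"])
    combine = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        combine += ["-V", gvcf]
    combine += ["-O", cohort_gvcf]
    commands.append(combine)
    commands.append(["gatk", "GenotypeGVCFs", "-R", reference,
                     "-V", cohort_gvcf, "-O", output_vcf])
    return commands


# Hypothetical file names for the two resequenced samples.
toy_commands = joint_genotyping_commands(
    "GX-M.fa", {"GX-F": "GX-F.bam", "UT": "UT.bam"},
    "cohort.g.vcf.gz", "raw_variants.vcf.gz")
```

Each returned list could be passed to `subprocess.run(command, check=True)` on a system where GATK4 is installed.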
Variant filtering was performed using GATK v4.4.0.0 VariantFiltration with stringent thresholds. The SNPs were filtered with the parameters “QD < 2.0, QUAL < 30.0, SOR > 3.0, FS > 60.0, MQ < 40.0, MQRankSum < −12.5, ReadPosRankSum < −8.0”, while InDels were filtered with the parameters “QD < 2.0, QUAL < 30.0, FS > 200.0, ReadPosRankSum < −20.0”.
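The hard-filter thresholds listed above can be expressed as a small rule table. This sketch only mimics the logic of the expressions on a dictionary of annotation values; it does not call GATK, and skipping missing annotations is an assumption of the sketch. The names `SNP_FILTERS`, `INDEL_FILTERS`, and `failed_filters` are hypothetical.

```python
# A variant is flagged when ANY of these conditions is true.
SNP_FILTERS = {
    "QD": ("<", 2.0),
    "QUAL": ("<", 30.0),
    "SOR": (">", 3.0),
    "FS": (">", 60.0),
    "MQ": ("<", 40.0),
    "MQRankSum": ("<", -12.5),
    "ReadPosRankSum": ("<", -8.0),
}
INDEL_FILTERS = {
    "QD": ("<", 2.0),
    "QUAL": ("<", 30.0),
    "FS": (">", 200.0),
    "ReadPosRankSum": ("<", -20.0),
}


def failed_filters(annotations, rules):
    """Return the names of the annotations that trigger a filter."""
    failed = []
    for name, (operator, threshold) in rules.items():
        value = annotations.get(name)
        if value is None:
            continue  # missing annotation: not flagged (assumption of this sketch)
        if operator == "<" and value < threshold:
            failed.append(name)
        elif operator == ">" and value > threshold:
            failed.append(name)
    return failed


# Hypothetical variant record with low QD and elevated strand bias (FS).
toy_variant = {"QD": 1.5, "QUAL": 50.0, "SOR": 1.0, "FS": 70.0,
               "MQ": 60.0, "MQRankSum": 0.0, "ReadPosRankSum": 0.0}
snp_failures = failed_filters(toy_variant, SNP_FILTERS)
indel_failures = failed_filters(toy_variant, INDEL_FILTERS)
```

The same record fails two SNP filters (QD and FS) but only one InDel filter (QD), because the InDel FS threshold is more permissive.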
For SV detection, long-read data from both S. ricini specimens were mapped to the GX-M genome using NGMLR v0.2.7 [43], followed by SV calling, merging, and filtering with the combined calling method (default parameters) of Sniffles2 v2.2 [44]. The combined calling method integrates multi-sample SV detection, merging, and automated filtering into a unified workflow, directly generating a high-quality SV set as the final output. All variants were annotated for their positional relationship to genes and their predicted functional effects using SnpEff v4.3t [45].

3. Results

3.1. Genome Sequencing and Assembly

One male (GX-M) and one female (GX-F) S. ricini were sequenced using the DNBSEQ platform (short read), the Oxford Nanopore system (long read), and Hi-C technology. The short-read sequencing of S. ricini GX-M and GX-F samples generated 31.88 Gb (70×) and 41.97 Gb (93×) of data, respectively (Table 1). Long-read ONT sequencing for these samples produced 65.27 Gb (145×) and 63.16 Gb (140×) of data, with read N50 values of 18,508 bp and 21,955 bp for GX-M and GX-F, respectively (Table 1). Notably, the maximum read lengths achieved were 155,012 bp (GX-M) and 157,443 bp (GX-F). Hi-C sequencing produced 47.80 Gb of chromatin interaction data from the male specimen (Table 1).
Genome survey analysis based on short-read data estimated a genome size of 451.05 Mb (GX-M) and 438.25 Mb (GX-F), with repetitive sequence contents of 47.68% and 47.49%, and heterozygosity rates of 0.25% and 0.36%, respectively. Preliminary assemblies generated 73 contigs (N50 = 18.55 Mb, total size = 457.85 Mb) for the GX-M genome and 63 contigs (N50 = 25.31 Mb, total size = 455.45 Mb) for the GX-F genome. Hi-C scaffolding enabled the chromosomal-level assembly of the GX-M genome, anchoring 59 contigs into 14 chromosomes with a final size of 456.16 Mb (Table 2 and Table 3, Figure 1). A total of 11 telomeric sequences were detected across 14 chromosomes, with 3 of these chromosomes showing telomeres at both ends (Table 3).
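For readers unfamiliar with the contig N50 statistic quoted above, a minimal sketch of its definition follows. The function name `n50` and the toy contig lengths are hypothetical and are not the assembly's actual contig lengths.

```python
def n50(lengths):
    """Return the length L such that contigs of length >= L together
    cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example: total length 20; the two longest contigs (6 + 5 = 11) reach half.
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)
```

Here `toy_n50` is 5, the length of the contig at which the cumulative sum first reaches half of the total.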
To assess genome assembly quality, we mapped short-read sequencing data to GX-M and GX-F S. ricini genomes independently. The read-mapping ratios reached 99.86% and 99.65%, with genome coverage values of 99.61% and 99.65%, respectively. BUSCO analysis revealed 98.5% completeness for both genomes. The GX-M assembly contained 98.1% complete and single-copy BUSCOs and 0.4% complete and duplicated BUSCOs, while the GX-F assembly showed 98.2% complete and single-copy BUSCOs and 0.3% complete and duplicated BUSCOs. These metrics confirmed the high completeness and reliability of the assembled genomes.

3.2. Genome Annotation

The S. ricini genome contained 222.09 Mb of repeat sequences, representing 48.51% of the total assembly. Transposable elements (TEs) dominated the repeat landscape, constituting 47.22% of the genome. TE composition analysis revealed distinct class distributions: long interspersed nuclear elements (LINEs) were most prevalent (18.97%), followed by unclassified elements (13.53%), helitrons (8.81%), DNA transposons (2.73%), long-terminal repeat (LTR) elements (2.54%), and short interspersed nuclear elements (SINEs; 0.64%) (Table 4).
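As a quick consistency check, the class percentages quoted above can be ranked and summed against the stated TE total. The dictionary keys are shorthand labels chosen for this sketch.

```python
# Percentages of the genome per TE class, as quoted in the text.
te_percent = {
    "LINE": 18.97,
    "Helitron": 8.81,
    "Unclassified": 13.53,
    "DNA transposon": 2.73,
    "LTR": 2.54,
    "SINE": 0.64,
}

# Rank classes from most to least abundant.
ranked = sorted(te_percent.items(), key=lambda item: item[1], reverse=True)
ranked_names = [name for name, _ in ranked]

# The class percentages should add up to the reported TE fraction (47.22%).
te_total = round(sum(te_percent.values()), 2)
```

The classes sum to 47.22%, matching the reported TE fraction, and the ranking places unclassified elements second after LINEs.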
A total of 15,729 protein-coding genes were predicted in the GX-M genome through integrated RNA- and homology-based annotation approaches (Figure 2). Furthermore, we identified 1175 tRNAs, 112 rRNAs, and 466 other ncRNAs within this genome (Table S1). BUSCO analysis against the Lepidoptera_odb10 database demonstrated 98.2% completeness, comprising 96.6% complete single-copy and 1.6% complete duplicated orthologs, confirming high gene prediction accuracy. Functional annotation through database searches yielded the following results: 14,582 genes (92.71%) matched to NR, 9703 (60.91%) to SwissProt, 14,221 (90.41%) to Pfam, 8364 (53.18%) to GO, and 7906 (50.26%) to KEGG pathways. Cross-database analysis identified 6165 genes (39.2% of the total) annotated across all five databases (Figure 3).
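The cross-database overlap can be thought of as a set intersection over per-database gene lists. The sketch below uses toy gene identifiers (hypothetical) to show the operation and recomputes the NR annotation rate from the counts given above; the function names are illustrative only.

```python
def annotated_in_all(annotations):
    """Return the gene IDs present in every database's annotated set."""
    gene_sets = list(annotations.values())
    return set.intersection(*gene_sets)


def annotation_rate(n_annotated, n_genes):
    """Percentage of genes annotated, rounded to two decimals."""
    return round(100 * n_annotated / n_genes, 2)


# Toy gene IDs, not real S. ricini gene models.
toy_annotations = {
    "NR": {"g1", "g2", "g3", "g4"},
    "SwissProt": {"g1", "g2", "g3"},
    "Pfam": {"g1", "g2", "g4"},
    "GO": {"g1", "g2"},
    "KEGG": {"g1", "g3"},
}
toy_shared = annotated_in_all(toy_annotations)

# Counts taken from the text: 14,582 of 15,729 genes matched NR.
nr_rate = annotation_rate(14_582, 15_729)
```

With the toy sets only `g1` is annotated in all five databases, and the NR rate evaluates to 92.71%, the maximum annotation rate reported.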

3.3. Collinearity with the Bombyx mori Genome

Bombyx mori, a lepidopteran model organism with 28 (2n = 56) chromosomes, served as the reference for comparative genomic analysis. Syntenic relationship analysis between S. ricini and B. mori genomes revealed strong collinearity, accompanied by chromosomal fusion/fission events and inversions. The syntenic relationships between the two lepidopteran genomes were characterized by the following chromosomal correspondences:
Bombyx mori Chr7, partial Chr23/24, and Chr28 aligned with S. ricini Chr1, while Chr2/20/26/27 corresponded to S. ricini Chr2. Fusion events were also observed between B. mori Chr22/25 and S. ricini Chr3, as well as between B. mori Chr3/13 and S. ricini Chr4. Partial Chr23 and Chr16 of B. mori showed synteny with S. ricini Chr5. Notably, B. mori Chr1 exhibited synteny with S. ricini Chr7, with a large-scale inversion. Additional syntenic blocks included the following: B. mori Chr18/19 with S. ricini Chr6; Chr11/21 with Chr8; Chr6/10 with Chr9; Chr9/14 with Chr10; Chr5/17 with Chr11; Chr4/15 with Chr12; and Chr8/12 with Chr13. A secondary syntenic association was identified between partial B. mori Chr11/24 and S. ricini Chr14 (Figure 4).
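The correspondences listed above can be captured as a lookup table, which makes it easy to ask which B. mori chromosomes are split across more than one S. ricini chromosome. The table is transcribed from the text only (it does not encode block coordinates or which portions are partial), and the variable and function names are hypothetical.

```python
from collections import defaultdict

# S. ricini chromosome -> B. mori chromosomes named in the text.
SRIC_TO_BMOR = {
    1: [7, 23, 24, 28],
    2: [2, 20, 26, 27],
    3: [22, 25],
    4: [3, 13],
    5: [16, 23],
    6: [18, 19],
    7: [1],
    8: [11, 21],
    9: [6, 10],
    10: [9, 14],
    11: [5, 17],
    12: [4, 15],
    13: [8, 12],
    14: [11, 24],
}


def invert(mapping):
    """Turn {S. ricini chr: [B. mori chrs]} into {B. mori chr: [S. ricini chrs]}."""
    inverted = defaultdict(list)
    for sric_chr, bmor_chrs in mapping.items():
        for bmor_chr in bmor_chrs:
            inverted[bmor_chr].append(sric_chr)
    return dict(inverted)


BMOR_TO_SRIC = invert(SRIC_TO_BMOR)

# B. mori chromosomes whose syntenic blocks land on two S. ricini chromosomes.
split_bmor = {b: s for b, s in sorted(BMOR_TO_SRIC.items()) if len(s) > 1}
```

All 28 B. mori chromosomes appear in the table, and three of them (Chr11, Chr23, and Chr24) map to two S. ricini chromosomes each, consistent with the "partial" associations described in the text.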

3.4. Variation Map

An analysis of short-read and long-read data identified 1,771,512 SNPs (raw count: 1,868,658) and 433,440 InDels (raw count: 470,271) in the GX-F sample, alongside 25,567 SVs. For the UT strain, we detected 3,362,499 SNPs (raw count: 3,551,595), 780,961 InDels (raw count: 818,341), and 48,757 SVs. Through combined variant calling, we established a non-redundant variation set totaling 4,270,848 SNPs (raw count: 4,509,966), 1,021,705 InDels (raw count: 1,066,653), and 53,367 SVs (Table 5).
The chromosomes of S. ricini displayed distinct spatial distributions of genomic features, including genes, SNPs, InDels, and SVs (Figure 2). Notably, chromosome 1 exhibited relatively lower genetic variation frequencies and gene density compared to other chromosomes. Furthermore, variation density was significantly higher in the terminal regions of chromosomes than in the central regions, a distribution pattern consistent with the genomic architecture of B. mori [12].
SNP distribution analysis revealed that 1,953,178 (45.63%) were in intergenic regions, 1,197,685 (27.98%) were in introns, 596,161 (13.93%) were upstream, 429,159 (10.03%) were downstream, and 104,631 (2.45%) were in exons (Table 5). Among exonic SNPs, 62,880 (60.10%) were synonymous substitutions, 41,751 (39.90%) were non-synonymous mutations, and 361 introduced premature stop codons (Table 5). Across the whole genome, SNPs occurred on average once every 107 bp, while in exons, introns, and intergenic regions, the average spacing was 369 bp, 179 bp, and 70 bp per SNP, respectively (Table 6).
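The genome-wide SNP spacing and the synonymous/non-synonymous split follow from simple ratios of the counts given above. The sketch below assumes the 456.16 Mb assembly size as the denominator for the genome-wide figure; whether the authors used exactly this denominator is an assumption.

```python
GENOME_SIZE_BP = 456_160_000  # 456.16 Mb assembly (assumed denominator)
TOTAL_SNPS = 4_270_848        # non-redundant SNP set
EXONIC_SNPS = 104_631
SYNONYMOUS = 62_880
NON_SYNONYMOUS = 41_751

# Average number of base pairs per SNP across the genome.
bp_per_snp = round(GENOME_SIZE_BP / TOTAL_SNPS)

# Share of exonic SNPs in each coding-effect class, in percent.
syn_pct = round(100 * SYNONYMOUS / EXONIC_SNPS, 2)
nonsyn_pct = round(100 * NON_SYNONYMOUS / EXONIC_SNPS, 2)
```

These ratios give roughly one SNP per 107 bp, with 60.10% synonymous and 39.90% non-synonymous exonic SNPs, in line with the reported values.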
InDel analysis revealed near-equal proportions of insertions (486,695; 47.64%) and deletions (535,010; 52.36%). Genomic distribution included 445,118 (43.46%) intergenic, 316,562 (30.91%) intronic, 142,907 (13.95%) downstream, 114,744 (11.20%) upstream, and 4933 (0.48%) exonic variants (Table 5). Functional impacts included 1798 (0.11%) frameshifts and 158 (0.01%) start/stop codon alterations.
SV characterization identified 23,602 insertions, 28,358 deletions, 318 duplications, and 250 inversions, predominantly 100 bp–1 kb in length overall (Figure 5A–D). Distribution patterns showed 22,687 (41.19%) intergenic, 15,960 (28.98%) intronic, 7434 (9.75%) upstream, 5249 (9.53%) downstream, and 3746 (6.80%) exonic SVs (Table 5). Functional consequences included frameshifts (817, 1.48%), stop codon gains (388, 0.70%), and exon losses (96, 0.17%). Analysis of SV length distributions by type demonstrated that the majority of insertions and deletions ranged from 100 bp to 1 kb (Figure 5A,B), while most duplications and inversions spanned 100 bp to 10 kb (Figure 5C,D).
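A length distribution like the one summarized above can be produced by binning SV lengths. The bin edges and labels below are chosen to mirror the ranges mentioned in the text and are otherwise arbitrary; the toy lengths are hypothetical. Absolute values are taken because VCF files often report deletion lengths as negative numbers.

```python
from bisect import bisect_right
from collections import Counter

BIN_EDGES = [100, 1_000, 10_000]  # bp
BIN_LABELS = ["<100 bp", "100 bp-1 kb", "1-10 kb", ">=10 kb"]


def length_bin(sv_length):
    """Assign one SV length (bp, sign ignored) to a size class."""
    return BIN_LABELS[bisect_right(BIN_EDGES, abs(sv_length))]


def bin_counts(sv_lengths):
    """Count SVs per size class."""
    return Counter(length_bin(length) for length in sv_lengths)


# Toy lengths; negative values stand for deletions.
toy_lengths = [60, -250, 999, 1000, -5000, 20000]
toy_bins = bin_counts(toy_lengths)
```

Each lower edge is inclusive, so a 1000 bp variant falls in the "1-10 kb" class rather than "100 bp-1 kb".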

3.5. Accessing the Samia ricini Genome

To enhance the accessibility of S. ricini genomic resources, we integrated the assembled genome and variant datasets (SNPs, InDels, SVs) into the SilkMeta database (http://silkmeta.org.cn (accessed on 15 April 2025)) through three functional modules: Genome browser, BLAST v2.14.0+, and Download. The genome browser allows for interactive visualization of the chromosome-level assembly, annotated gene models, and genomic variations (SNPs/InDels/SVs) via a user-friendly interface (Figure 6A). The BLAST tool enables sequence similarity searches against S. ricini genomic, coding (CDS), and protein sequences through customizable query parameters (Figure 6B). The Download module offers comprehensive data downloads, including genome assembly (FASTA), gene annotations (GFF), coding/protein sequences, and variant call format (VCF) files. This integration facilitates seamless exploration, analysis, and utilization of S. ricini genomic data for the research community.

4. Discussion

S. ricini is a key resource insect of agricultural and biological importance. This study delivers a chromosome-scale genome and variation map for S. ricini, addressing gaps in lepidopteran genomics. Using Hi-C sequencing data, the 456.16 Mb assembly was anchored to 14 chromosomes, consistent with a previous karyotype analysis [46]. The chromosome-level genome assembly and gene annotation of S. ricini presented in this study demonstrate higher completeness and integrity than previous genomic resources for this species (Table 7). Moreover, the assembly quality surpasses that of most existing genomic datasets from other Bombycidae (silkworm) and Saturniidae (giant silkworm moth) species (Table 7).
Correct chromosome assembly is an essential foundation for synteny analysis and for the accurate identification of chromosomal rearrangement events. Our synteny analysis with B. mori revealed conserved macrosyntenic blocks interspersed with chromosomal rearrangements. These rearrangements were predominantly driven by fusion/fission events, with only a single intrachromosomal inversion detected on chromosome 7 of S. ricini (orthologous to B. mori chromosome 1). This conserved syntenic architecture aligns with a prior cytogenetic mapping study in these species [46] and further supports the internal stability of lepidopteran chromosomes. Future studies leveraging these high-quality genomes should elucidate the precise mechanisms of chromosome fusion/fission between S. ricini and B. mori, including gene order conservation within syntenic blocks and structural features at rearrangement breakpoints.
Genomic variations are valuable for functional genomic analysis and molecular marker-assisted breeding. Previously, Simple Sequence Repeat (SSR) markers of S. ricini were developed to evaluate genetic diversity, adaptive evolution, and trait-associated genes [53,54,55]. The genomic variations identified in this study establish a foundational resource for eri silkworm research. These variants serve as standardized genomic markers for establishing phenotype-genotype associations across populations. Furthermore, they constitute high-value markers for designing high-density genotyping arrays, essential tools for cost-efficient population genetics analyses, quantitative trait locus (QTL) mapping, genome-wide association studies (GWAS), and genomic selection breeding programs. Implementation of these resources will accelerate research on the genetic mechanisms underlying agronomically important traits (e.g., disease resistance and silk yield) and enhance marker-assisted breeding in the eri silkworm. We acknowledge that variant discovery from limited samples carries inherent constraints for population-level inferences. Comprehensive characterization of genomic variation across hundreds of individuals remains a critical objective for future studies.
Genomic data sharing remains a major focus for biologists, yet significant limitations persist in the visualization, access, and analysis of the S. ricini genome. Leveraging SilkMeta [31], a robust pan-genome and multi-omics database for B. mori, we implemented its data-sharing and analytical framework to enable interactive visualization of the S. ricini genome and variation datasets, providing researchers with an intuitive interface for genomic exploration. This initiative represents a critical strategy to maximize the utility of S. ricini genomic resources in comparative studies of eri silkworms and broader insect genomics.
In summary, the chromosome-level genome assembly, comprehensive variation dataset, and publicly accessible platform established in this study form an essential foundation for advancing comparative genomic research in S. ricini and related insect species.

5. Conclusions

This study presents a chromosome-level genome assembly of Samia cynthia ricini (456.16 Mb, scaffolded onto 14 chromosomes) generated using integrated long-read, short-read, and Hi-C sequencing data. The assembly achieved 98.5% BUSCO completeness and contains 15,729 predicted protein-coding genes. Functional annotation against five major databases (NR, SwissProt, Pfam, GO, and KEGG) revealed a maximum annotation rate of 92.71% (NR). Comparative genomics with Bombyx mori uncovered conserved syntenic blocks interspersed with chromosomal fusion/fission events and an intrachromosomal inversion. Furthermore, we established a comprehensive variation map, identifying 4.27 million SNPs, 1.02 million InDels, and 53,367 SVs, serving as critical resources for trait association studies and molecular breeding. All data are freely accessible via the SilkMeta database (http://silkmeta.org.cn (accessed on 15 April 2025)), providing an essential platform for sustainable utilization of this agriculturally significant insect.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/biology14060698/s1. Table S1: Predicted non-coding RNAs in the Samia ricini genome.

Author Contributions

Conceptualization, K.L. (Kunpeng Lu) and F.D. Data curation, K.L. (Kunpeng Lu) and K.L. (Kerui Lai). Formal analysis, K.L. (Kunpeng Lu), C.Z. and Z.L. Funding acquisition, K.L. (Kunpeng Lu), X.T. and F.D. Investigation, K.L. (Kunpeng Lu), J.S., Z.L. and K.L. (Kerui Lai). Methodology, K.L. (Kunpeng Lu), J.S., C.Z., S.L. and M.H. Project administration, X.T. and F.D. Resources, W.H., S.L., Q.L., X.T. and F.D. Software, J.S. and C.Z. Supervision, F.D. Validation, K.L. (Kunpeng Lu), J.S. and M.H. Visualization, K.L. (Kunpeng Lu) and J.S. Writing—original draft, K.L. (Kunpeng Lu). Writing—review and editing, K.L. (Kunpeng Lu), J.S., W.H., C.Z., Z.L., S.L., K.L. (Kerui Lai), Q.L., M.H., X.T. and F.D. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by grants from the National Natural Science Foundation of China (No. 32330102, No. 32202746), the Fundamental Research Funds for the Central Universities (SWU-KQ25011), the National Key Research and Development Program (No. 2023YFD1600901, No. 2023YFF1103801), the Natural Science Foundation of Chongqing, China (No. cstc2021jcyj-cxtt0005), the China Agriculture Research System of MOF and MARA (No. CARS-18-ZJ0102, No. CARS-18-ZJ0103), and the High-level Talents Program of Southwest University (No. SWURC2021001).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Long-read (ONT), short-read (DNBSEQ), and Hi-C sequencing data for GX-M and GX-F have been deposited in the Genome Sequence Archive (GSA; https://ngdc.cncb.ac.cn/gsa/ (accessed on 5 May 2025)) at the China National Center for Bioinformation (CNCB) under Project ID PRJCA039017 (Accession: CRA025435). The chromosome-level genome assembly, coding sequences (CDS), protein sequences, annotation files (GFF3 format), and genomic variations (VCF format) are publicly accessible through the SilkMeta database at http://silkmeta.org.cn/download (accessed on 15 April 2025). Previously published long-read and short-read sequencing data were retrieved from the NCBI Sequence Read Archive (SRA) under BioProject PRJNA699736.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Peigler, R.S.; Naumann, S. A Revision of the Silkmoth Genus Samia; University of the Incarnate Word: San Antonio, TX, USA, 2003.
  2. Kongsup, P.; Lertjirakul, S.; Chotimanothum, B.; Chundang, P.; Kovitvadhi, A. Effects of eri silkworm (Samia ricini) pupae inclusion in broiler diets on growth performances, health, carcass characteristics and meat quality. Anim. Biosci. 2022, 35, 711–720.
  3. Longvah, T.; Manghtya, K.; Qadri, S.S. Eri silkworm: A source of edible oil with a high content of alpha-linolenic acid and of significant nutritional value. J. Sci. Food Agric. 2012, 92, 1988–1993.
  4. Longvah, T.; Mangthya, K.; Ramulu, P. Nutrient composition and protein quality evaluation of eri silkworm (Samia ricinii) prepupae and pupae. Food Chem. 2011, 128, 400–403.
  5. Lee, J.; Nishiyama, T.; Shigenobu, S.; Yamaguchi, K.; Suzuki, Y.; Shimada, T.; Katsuma, S.; Kiuchi, T. The genome sequence of Samia ricini, a new model species of lepidopteran insect. Mol. Ecol. Resour. 2021, 21, 327–339.
  6. Yoshido, A.; Sichova, J.; Kubickova, S.; Marec, F.; Sahara, K. Rapid turnover of the W chromosome in geographical populations of wild silkmoths, Samia cynthia ssp. Chromosome Res. 2013, 21, 149–164.
  7. Xia, Q.; Zhou, Z.; Lu, C.; Cheng, D.; Dai, F.; Li, B.; Zhao, P.; Zha, X.; Cheng, T.; Chai, C.; et al. A draft sequence for the genome of the domesticated silkworm (Bombyx mori). Science 2004, 306, 1937–1940.
  8. International Silkworm Genome Consortium. The genome of a lepidopteran model insect, the silkworm Bombyx mori. Insect Biochem. Mol. Biol. 2008, 38, 1036–1045.
  9. Kawamoto, M.; Jouraku, A.; Toyoda, A.; Yokoi, K.; Minakuchi, Y.; Katsuma, S.; Fujiyama, A.; Kiuchi, T.; Yamamoto, K.; Shimada, T. High-quality genome assembly of the silkworm, Bombyx mori. Insect Biochem. Mol. Biol. 2019, 107, 53–62.
  10. Xia, Q.; Guo, Y.; Zhang, Z.; Li, D.; Xuan, Z.; Li, Z.; Dai, F.; Li, Y.; Cheng, D.; Li, R.; et al. Complete resequencing of 40 genomes reveals domestication events and genes in silkworm (Bombyx). Science 2009, 326, 433–436.
  11. Xiang, H.; Liu, X.; Li, M.; Zhu, Y.; Wang, L.; Cui, Y.; Liu, L.; Fang, G.; Qian, H.; Xu, A.; et al. The evolutionary road from wild moth to domestic silkworm. Nat. Ecol. Evol. 2018, 2, 1268–1279.
  12. Tong, X.L.; Han, M.J.; Lu, K.P.; Tai, S.S.; Liang, S.B.; Liu, Y.C.; Hu, H.; Shen, J.H.; Long, A.X.; Zhan, C.Y.; et al. High-resolution silkworm pan-genome provides genetic insights into artificial selection and ecological adaptation. Nat. Commun. 2022, 13, 5619.
  13. Lu, K.P.; Liang, S.B.; Han, M.J.; Wu, C.M.; Song, J.B.; Li, C.L.; Wu, S.Y.; He, S.Z.; Ren, J.Y.; Hu, H.; et al. Flight muscle and wing mechanical properties are involved in flightlessness of the domestic silkmoth, Bombyx mori. Insects 2020, 11, 220.
  14. Wang, M.; Lin, Y.; Fu, Z.; Wu, X.; Meng, J.; Cheng, Y.; Gao, Y.; Xue, H.; Du, E.; Chen, J.; et al. Insufficient wing development possibly contributes to flightlessness of the silkworm Bombyx mori during domestication. Proc. Biol. Sci. 2025, 292, 20250281.
  15. Wu, S.; Lu, Y.; He, S.; Dai, F. Progress in the molecular research of silkworm mutants. Newsl. Sericultural Sci. 2024, 44, 31–47.
  16. Li, C.; Tong, X.; Zuo, W.; Hu, H.; Xiong, G.; Han, M.; Gao, R.; Luan, Y.; Lu, K.; Gai, T.; et al. The beta-1, 4-N-acetylglucosaminidase 1 gene, selected by domestication and breeding, is involved in cocoon construction of Bombyx mori. PLoS Genet. 2020, 16, e1008907.
  17. Ma, L.; Xu, H.; Zhu, J.; Ma, S.; Liu, Y.; Jiang, R.J.; Xia, Q.; Li, S. Ras1(CA) overexpression in the posterior silk gland improves silk yield. Cell Res. 2011, 21, 934–943.
  18. Hu, Z.; Zhu, F.; Chen, K. The mechanisms of silkworm resistance to the baculovirus and antiviral breeding. Annu. Rev. Entomol. 2023, 68, 381–399.
  19. Wang, C.; Yu, B.; Meng, X.; Xia, D.; Pei, B.; Tang, X.; Zhang, G.; Wei, J.; Long, M.; Chen, J.; et al. Microsporidian Nosema bombycis hijacks host vitellogenin and restructures ovariole cells for transovarial transmission. PLoS Pathog. 2023, 19, e1011859.
  20. Kiuchi, T.; Koga, H.; Kawamoto, M.; Shoji, K.; Sakai, H.; Arai, Y.; Ishihara, G.; Kawaoka, S.; Sugano, S.; Shimada, T.; et al. A single female-specific piRNA is the primary determiner of sex in the silkworm. Nature 2014, 509, 633–636.
  21. Daimon, T.; Koyama, T.; Yamamoto, G.; Sezutsu, H.; Mirth, C.K.; Shinoda, T. The number of larval molts is controlled by hox in caterpillars. Curr. Biol. 2021, 31, 884–891.e3.
  22. Marcais, G.; Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 2011, 27, 764–770.
  23. Vurture, G.W.; Sedlazeck, F.J.; Nattestad, M.; Underwood, C.J.; Fang, H.; Gurtowski, J.; Schatz, M.C. GenomeScope: Fast reference-free genome profiling from short reads. Bioinformatics 2017, 33, 2202–2204.
  24. Koren, S.; Walenz, B.P.; Berlin, K.; Miller, J.R.; Bergman, N.H.; Phillippy, A.M. Canu: Scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017, 27, 722–736.
  25. Liu, H.; Wu, S.; Li, A.; Ruan, J. SMARTdenovo: A de novo assembler using long noisy reads. Gigabyte 2021, 1, 2021.
  26. Vaser, R.; Sovic, I.; Nagarajan, N.; Sikic, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017, 27, 737–746.
  27. Dudchenko, O.; Batra, S.S.; Omer, A.D.; Nyquist, S.K.; Hoeger, M.; Durand, N.C.; Shamim, M.S.; Machol, I.; Lander, E.S.; Aiden, A.P.; et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 2017, 356, 92–95.
  28. Okazaki, S.; Tsuchida, K.; Maekawa, H.; Ishikawa, H.; Fujiwara, H. Identification of a pentanucleotide telomeric sequence, (TTAGG)n, in the silkworm Bombyx mori and in other insects. Mol. Cell Biol. 1993, 13, 1424–1432.
  29. Brown, M.R.; de la Rosa, P.M.G.; Blaxter, M. tidk: A toolkit to rapidly identify telomeric repeats from genomic datasets. Bioinformatics 2025, 41, btaf049.
  30. Simao, F.A.; Waterhouse, R.M.; Ioannidis, P.; Kriventseva, E.V.; Zdobnov, E.M. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 2015, 31, 3210–3212.
  31. Lu, K.; Pan, Y.; Shen, J.; Yang, L.; Zhan, C.; Liang, S.; Tai, S.; Wan, L.; Li, T.; Cheng, T.; et al. SilkMeta: A comprehensive platform for sharing and exploiting pan-genomic and multi-omic silkworm data. Nucleic Acids Res. 2024, 52, D1024–D1032.
  32. Gabriel, L.; Bruna, T.; Hoff, K.J.; Ebel, M.; Lomsadze, A.; Borodovsky, M.; Stanke, M. BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Res. 2024, 34, 769–777.
  33. Dobin, A.; Davis, C.A.; Schlesinger, F.; Drenkow, J.; Zaleski, C.; Jha, S.; Batut, P.; Chaisson, M.; Gingeras, T.R. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 2013, 29, 15–21.
  34. Buchfink, B.; Reuter, K.; Drost, H.G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat. Methods 2021, 18, 366–368.
  35. Jones, P.; Binns, D.; Chang, H.Y.; Fraser, M.; Li, W.; McAnulla, C.; McWilliam, H.; Maslen, J.; Mitchell, A.; Nuka, G.; et al. InterProScan 5: Genome-scale protein function classification. Bioinformatics 2014, 30, 1236–1240.
  36. Chan, P.P.; Lowe, T.M. tRNAscan-SE: Searching for tRNA genes in genomic sequences. Methods Mol. Biol. 2019, 1962, 1–14.
  37. Lagesen, K.; Hallin, P.; Rodland, E.A.; Staerfeldt, H.H.; Rognes, T.; Ussery, D.W. RNAmmer: Consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007, 35, 3100–3108.
  38. Nawrocki, E.P.; Eddy, S.R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 2013, 29, 2933–2935.
  39. Tang, H.; Krishnakumar, V.; Zeng, X.; Xu, Z.; Taranto, A.; Lomas, J.S.; Zhang, Y.; Huang, Y.; Wang, Y.; Yim, W.C.; et al. JCVI: A versatile toolkit for comparative genomics analysis. Imeta 2024, 3, e211.
  40. Li, H.; Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009, 25, 1754–1760.
  41. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R.; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25, 2078–2079.
  42. McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M.; et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010, 20, 1297–1303.
  43. Sedlazeck, F.J.; Rescheneder, P.; Smolka, M.; Fang, H.; Nattestad, M.; von Haeseler, A.; Schatz, M.C. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 2018, 15, 461–468.
  44. Smolka, M.; Paulin, L.F.; Grochowski, C.M.; Horner, D.W.; Mahmoud, M.; Behera, S.; Kalef-Ezra, E.; Gandhi, M.; Hong, K.; Pehlivan, D.; et al. Detection of mosaic and population-level structural variants with Sniffles2. Nat. Biotechnol. 2024, 42, 1571–1580.
  45. Cingolani, P.; Platts, A.; Wang, L.L.; Coon, M.; Nguyen, T.; Wang, L.; Land, S.J.; Lu, X.; Ruden, D.M. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 2012, 6, 80–92.
  46. Yoshido, A.; Yasukochi, Y.; Sahara, K. Samia cynthia versus Bombyx mori: Comparative gene mapping between a species with a low-number karyotype and the model species of Lepidoptera. Insect Biochem. Mol. Biol. 2011, 41, 370–377.
  47. Dubey, H.; Pradeep, A.R.; Neog, K.; Debnath, R.; Aneesha, P.J.; Shah, S.K.; Kamatchi, I.; Ponnuvel, K.M.; Ramesha, A.; Vijayan, K.; et al. Genome sequencing and assembly of Indian golden silkmoth, Antheraea assamensis Helfer (Saturniidae, Lepidoptera). Genomics 2024, 116, 110841.
  48. Duan, J.; Li, Y.; Du, J.; Duan, E.; Lei, Y.; Liang, S.; Zhang, X.; Zhao, X.; Kan, Y.; Yao, L.; et al. A chromosome-scale genome assembly of Antheraea pernyi (Saturniidae, Lepidoptera). Mol. Ecol. Resour. 2020, 20, 1372–1383.
  49. Kim, S.R.; Kwak, W.; Kim, H.; Caetano-Anolles, K.; Kim, K.Y.; Kim, S.B.; Choi, K.H.; Kim, S.W.; Hwang, J.S.; Kim, M.; et al. Genome sequence of the Japanese oak silk moth, Antheraea yamamai: The first draft genome in the family Saturniidae. Gigascience 2018, 7, 1–11.
  50. Lee, J.; Fujimoto, T.; Yamaguchi, K.; Shigenobu, S.; Sahara, K.; Toyoda, A.; Shimada, T. W chromosome sequences of two bombycid moths provide an insight into the origin of Fem. Mol. Ecol. 2024, 33, e17434.
  51. Lee, J.; Fujimoto, T.; Yamaguchi, K.; Shigenobu, S.; Sahara, K.; Shimada, T. Comprehensive genome annotation of Trilocha varians, a new model species of Lepidopteran insects. Sci. Data 2025, 12, 124.
  52. Lee, J.; Kiuchi, T.; Yamaguchi, K.; Shigenobu, S.; Toyoda, A.; Shimada, T. A chromosome-level genome assembly of wild silkmoth, Bombyx mandarina. Sci. Data 2025, 12, 27.
  53. Pradeep, A.R.; Awasthi, A.K.; Singh, C.K.; Anuradha, H.J.; Rao, C.G.; Vijayaprakash, N.B. Genetic evaluation of eri silkworm Samia cynthia ricini: ISSR loci specific to high and low altitude regimes and quantitative attributes. J. Appl. Genet. 2011, 52, 345–353.
  54. Liu, Y.Q.; Qin, L.; Li, Y.P.; Wang, H.; Xia, R.X.; Qi, Y.H.; Li, X.S.; Lu, C.; Xiang, Z.H. Comparative genetic diversity and genetic structure of three Chinese silkworm species Bombyx mori L. (Lepidoptera: Bombycidae), Antheraea pernyi Guerin-Meneville and Samia cynthia ricini Donovan (Lepidoptera: Saturniidae). Neotrop. Entomol. 2010, 39, 967–976.
  55. Vijayan, K.; Anuradha, H.J.; Nair, C.V.; Pradeep, A.R.; Awasthi, A.K.; Saratchandra, B.; Rahman, S.A.; Singh, K.C.; Chakraborti, R.; Urs, S.R. Genetic diversity and differentiation among populations of the Indian eri silkworm, Samia cynthia ricini, revealed by ISSR markers. J. Insect Sci. 2006, 6, 1–11.
Figure 1. Hi-C chromosomal contact map of S. ricini. The interaction map shows a clear structural configuration of the 14 S. ricini chromosomes, characterized by strong intra-chromosomal interactions and low inter-chromosomal signal noise, which underscores the high resolution of chromosome architecture.
Figure 2. Chromosomal distribution of protein-coding genes and genetic variations in the S. ricini genome. (i) Chromosome sizes of S. ricini. (ii) Density of genes, (iii) SNPs, (iv) InDels, and (v) SVs across all S. ricini chromosomes.
Figure 3. Integrated functional annotation of S. ricini protein-coding genes across five databases (NR, Swiss-Prot, Pfam, Gene Ontology, and KEGG pathways).
Figure 4. Collinearity of S. ricini and B. mori genomes.
Figure 5. Lengths and counts of (A) deletions, (B) insertions, (C) duplications, and (D) inversions in the S. ricini genome.
Figure 6. SilkMeta database interface for S. ricini genomic exploration. (A) Genome browser. Interactive visualization of S. ricini chromosomal assembly, annotated gene models, and genomic variations (SNPs/InDels/SVs). (B) BLAST module. Sequence homology search interface for querying S. ricini genomic DNA, coding sequences (CDSs), and protein datasets. The asterisks (*) are used to move the marked items to the top position.
Table 1. Summary of S. ricini genome sequencing data.

| Sample | Sequencing Platform (Technology) | Clean Reads | Clean Bases (Gb) | Genome Coverage (×) | Reads N50 Length (bp) |
|---|---|---|---|---|---|
| GX-M | DNBSEQ | 216,133,152 | 31.88 | 70 | - |
| GX-M | ONT | 4,028,575 | 65.27 | 145 | 18,508 |
| GX-M | Hi-C | 319,016,548 | 47.80 | - | - |
| GX-F | DNBSEQ | 283,461,080 | 41.97 | 93 | - |
| GX-F | ONT | 3,476,160 | 63.16 | 140 | 21,955 |
Table 2. Summary of genome assembly.

| Sample | Initial Assembly: Contigs | Initial Assembly: Contig N50 Length (bp) | Initial Assembly: Total Length (bp) | Hi-C Assembly: Contigs | Hi-C Assembly: Total Length (bp) |
|---|---|---|---|---|---|
| GX-M | 73 | 18,557,025 | 457,852,231 | 59 | 456,164,652 |
| GX-F | 63 | 25,316,322 | 455,453,376 | - | - |
Table 3. Chromosome length and telomeres of the S. ricini genome.

| Chromosome ID | Length (bp) | Telomere |
|---|---|---|
| Chr01 | 21,357,909 | 0 |
| Chr02 | 30,424,992 | 1 |
| Chr03 | 34,141,321 | 1 |
| Chr04 | 33,297,754 | 1 |
| Chr05 | 35,620,675 | 0 |
| Chr06 | 35,677,325 | 0 |
| Chr07 | 31,408,718 | 0 |
| Chr08 | 31,248,863 | 1 |
| Chr09 | 38,345,835 | 2 |
| Chr10 | 33,366,736 | 0 |
| Chr11 | 41,493,536 | 2 |
| Chr12 | 31,340,067 | 0 |
| Chr13 | 42,746,998 | 2 |
| Chr14 | 15,693,923 | 1 |
| Total | 456,164,652 | 11 |
Note: Values of 2, 1, and 0 in the Telomere column denote the number of telomeric regions identified in each of the 14 chromosomes of S. ricini genome (2, 1, or none detected). The total of 11 indicates all telomere regions identified across the genome.
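The totals in Table 3 can be rechecked with a few lines of Python. This is only a minimal sketch: the values are transcribed from the table, and the variable names (`chromosomes`, `both_ends`) are illustrative rather than taken from the authors' pipeline.

```python
# Chromosome lengths (bp) and telomere counts transcribed from Table 3.
# Names used here are illustrative, not part of the published workflow.
chromosomes = {
    "Chr01": (21_357_909, 0), "Chr02": (30_424_992, 1),
    "Chr03": (34_141_321, 1), "Chr04": (33_297_754, 1),
    "Chr05": (35_620_675, 0), "Chr06": (35_677_325, 0),
    "Chr07": (31_408_718, 0), "Chr08": (31_248_863, 1),
    "Chr09": (38_345_835, 2), "Chr10": (33_366_736, 0),
    "Chr11": (41_493_536, 2), "Chr12": (31_340_067, 0),
    "Chr13": (42_746_998, 2), "Chr14": (15_693_923, 1),
}

# Sum the per-chromosome lengths and telomere counts.
total_length = sum(length for length, _ in chromosomes.values())
total_telomeres = sum(count for _, count in chromosomes.values())

# Chromosomes with a telomeric region detected at both ends.
both_ends = [name for name, (_, count) in chromosomes.items() if count == 2]

print(total_length, total_telomeres, both_ends)
```

The sums reproduce the "Total" row (456,164,652 bp and 11 telomeric regions), and three chromosomes (Chr09, Chr11, Chr13) carry telomeric regions at both ends.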
Table 4. Statistics on repeat sequences in the Samia ricini genome.

| Type of Repetitiveness | | Length (bp) | Percentage in Genome (%) |
|---|---|---|---|
| TEs | SINEs | 2,915,523 | 0.64 |
| | LINEs | 86,866,339 | 18.97 |
| | LTR | 11,617,914 | 2.54 |
| | Penelope | 1,244,859 | 0.27 |
| | DNA transposons | 12,499,705 | 2.73 |
| | Helitrons (rolling-circles) | 40,349,215 | 8.81 |
| | Unclassified | 61,962,172 | 13.53 |
| | Small RNA | 263,243 | 0.06 |
| | Simple repeats | 4,853,247 | 1.06 |
| | Low complexity | 819,131 | 0.18 |
| Total | | 222,094,622 | 48.51 |
Table 5. Summary of variations within the S. ricini genome.

| Variation Type | Count Before Filtering | Count After Filtering | Intergenic | Intron | Downstream | Upstream | Exon (Synonymous) | Exon (Non-Synonymous) |
|---|---|---|---|---|---|---|---|---|
| SNP | 4,509,966 | 4,270,848 | 1,953,178 (45.63%) | 1,197,685 (27.98%) | 429,159 (10.03%) | 596,161 (13.93%) | 62,880 (1.47%) | 41,751 (0.98%) |
| InDel | 1,066,653 | 1,021,705 | 445,118 (43.46%) | 316,562 (30.91%) | 142,907 (13.95%) | 114,744 (11.20%) | 4933 (0.48%) | |
| SV | - | 53,367 | 22,687 (41.19%) | 15,960 (28.98%) | 5249 (9.53%) | 7434 (13.50%) | 3746 (6.80%) | |

Note: The Intergenic through Exon columns give positions relative to protein-coding genes. For InDels and SVs, a single exon value is given, spanning both exon columns.
Table 6. SNP densities in different genome regions of S. ricini.

| | Whole Genome | Exon | Intron | Intergenic |
|---|---|---|---|---|
| DNA length (bp) | 457,689,390 | 34,975,315 | 214,843,885 | 207,870,190 |
| SNP count | 4,270,848 | 94,665 | 1,197,685 | 2,978,498 |
| SNP density (bp/SNP) | 107 | 369 | 179 | 70 |
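The density row of Table 6 follows from dividing each region's length by its SNP count. The sketch below reproduces that arithmetic from the tabulated values; the helper name `snp_density` and the dictionary layout are assumptions made for illustration, not the authors' code.

```python
# Region lengths (bp) and SNP counts transcribed from Table 6.
# The function and variable names are illustrative assumptions.
regions = {
    "Whole Genome": (457_689_390, 4_270_848),
    "Exon": (34_975_315, 94_665),
    "Intron": (214_843_885, 1_197_685),
    "Intergenic": (207_870_190, 2_978_498),
}


def snp_density(length_bp: int, snp_count: int) -> float:
    """Return the average number of base pairs per SNP."""
    return length_bp / snp_count


# Round to whole base pairs, as reported in the table.
densities = {
    name: round(snp_density(length, count))
    for name, (length, count) in regions.items()
}

for name, density in densities.items():
    print(f"{name}: one SNP per {density} bp")
```

A lower bp/SNP value means denser variation, so the intergenic regions (about one SNP per 70 bp) are the most variable and the exons (about one per 369 bp) the least.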
Table 7. Genome assembly and annotation statistics of S. ricini and other Bombycidae and Saturniidae silkworms.

| Family | Species | Genome Size (Mb) | Chr. Anchoring Strategy | Chr. Numbers | Contig N50 (Mb) | BUSCO (Assembly) | BUSCO (Gene Model) | Repetitive Elements (%, bp) | Publish Year |
|---|---|---|---|---|---|---|---|---|---|
| Saturniidae | Samia ricini (China, GX-M) | 456.16 | Hi-C | N = 14 | 18.56 | 98.5% | 98.2% | 48.5 | This study |
| Saturniidae | Samia ricini (China, GX-F) | 455.45 | none | N = 14 | 25.32 | 98.5% | - | - | This study |
| Saturniidae | Samia ricini (Japan, UT) | 450.48 | Linkage analysis | N = 14 | 21.37 | 97.9% | 91.9% | 43.5 | 2021 [5] |
| Saturniidae | Antheraea assamensis | 501.18 | none | N = 15 | 0.68 | 98.0% | 96.0% | 49.0 | 2024 [47] |
| Saturniidae | Antheraea pernyi | 726.37 | Hi-C | N = 49 | 13.77 | - | 95.6% | 60.7 | 2020 [48] |
| Saturniidae | Antheraea yamamai | 656.00 | none | N = 31 | 0.74 | 96.7% | - | 37.3 | 2018 [49] |
| Bombycidae | Trilocha varians | 353.84 | Optical mapping | N = 26 | 13.28 | 98.7% | 98.6% | - | 2024 [50], 2025 [51] |
| Bombycidae | Bombyx mandarina (Japan) | 419.60 | Hi-C | N = 27 | 16.43 | 95.1% | 94.5% | - | 2025 [52] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lu, K.; Shen, J.; Huang, W.; Zhan, C.; Li, Z.; Liang, S.; Lai, K.; Luo, Q.; Han, M.; Tong, X.; et al. Chromosome-Level Genome and Variation Map of Eri Silkworm Samia cynthia ricini. Biology 2025, 14, 698. https://doi.org/10.3390/biology14060698

APA Style

Lu, K., Shen, J., Huang, W., Zhan, C., Li, Z., Liang, S., Lai, K., Luo, Q., Han, M., Tong, X., & Dai, F. (2025). Chromosome-Level Genome and Variation Map of Eri Silkworm Samia cynthia ricini. Biology, 14(6), 698. https://doi.org/10.3390/biology14060698
