Special Issue "Advances in Single Molecule, Real-Time (SMRT) Sequencing"

A special issue of Genes (ISSN 2073-4425). This special issue belongs to the section "Technologies and Resources for Genetics".

Deadline for manuscript submissions: closed (31 January 2019)

Special Issue Editors

Guest Editor
Dr. Adam Ameur

1. Department of Immunology, Genetics and Pathology, Uppsala University, Science for Life Laboratory, Uppsala 75108, Sweden.
2. Department of Epidemiology and Preventive Medicine, Monash University, Melbourne 32901, Australia.
Interests: next-generation sequencing; single-molecule sequencing; clinical sequencing; bioinformatics; genomics

Guest Editor
Dr. Matthew S. Hestand

1. Division of Human Genetics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA
2. Department of Pediatrics, University of Cincinnati, 2600 Clifton Ave, Cincinnati, OH 45220, USA
Interests: next-generation sequencing; single-molecule sequencing; clinical sequencing; bioinformatics; 22q11 deletion syndrome

Special Issue Information

Dear Colleagues,

PacBio’s single molecule, real-time (SMRT) sequencing technology offers important advantages over the short-read DNA sequencing technologies that currently dominate the market. These include exceptionally long read lengths (20 kb or more), unparalleled consensus accuracy, and the ability to sequence native, non-amplified DNA molecules. From microbes to vertebrates, long reads are now used to create highly accurate de novo genome assemblies, characterize complex structural variations, permit full-length RNA isoform sequencing, and directly phase variants. The high accuracy further enables low-frequency mutation detection and clonal evolution determination. Besides reducing biases, sequencing native DNA also permits the direct measurement of DNA base modifications. Therefore, SMRT sequencing has become an attractive technology in many fields, such as agriculture, basic science, and medical research. The boundaries of SMRT sequencing are being continuously pushed by developments in bioinformatics and sample preparation.

This Special Issue is a collection of articles showcasing the latest developments and the breadth of applications enabled by SMRT sequencing technology.

Dr. Adam Ameur
Dr. Matthew S. Hestand
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Genes is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

For this Special Issue, we are glad to offer a 15% discount on our APC for all planned contributions. Please contact [email protected] in advance for this purpose.

Keywords

  • SMRT sequencing
  • PacBio
  • genome assembly
  • structural variation
  • RNA isoforms
  • DNA base modifications
  • targeted sequencing
  • clinical sequencing
  • bioinformatics

Published Papers (8 papers)

Editorial

Open Access Editorial
The Versatility of SMRT Sequencing
Received: 20 December 2018 / Accepted: 3 January 2019 / Published: 4 January 2019
Abstract
The adoption of single molecule real-time (SMRT) sequencing [...]
(This article belongs to the Special Issue Advances in Single Molecule, Real-Time (SMRT) Sequencing)

Research

Open Access Article
Single-Molecule Real-Time (SMRT) Full-Length RNA-Sequencing Reveals Novel and Distinct mRNA Isoforms in Human Bone Marrow Cell Subpopulations
Received: 19 February 2019 / Revised: 19 March 2019 / Accepted: 22 March 2019 / Published: 27 March 2019
Abstract
Hematopoietic cells are continuously replenished from progenitor cells that reside in the bone marrow. To evaluate molecular changes during this process, we analyzed the transcriptomes of freshly harvested human bone marrow progenitor (lineage-negative) and differentiated (lineage-positive) cells by single-molecule real-time (SMRT) full-length RNA-sequencing. This analysis revealed a ~5-fold higher number of transcript isoforms than previously detected and showed a distinct composition of individual transcript isoforms characteristic for bone marrow subpopulations. A detailed analysis of messenger RNA (mRNA) isoforms transcribed from the ANXA1 and EEF1A1 loci confirmed their distinct composition. The expression of proteins predicted from the transcriptome analysis was evaluated by mass spectrometry and validated previously unknown protein isoforms predicted, e.g., for EEF1A1. These protein isoforms distinguished the lineage-negative cell population from the lineage-positive cell population. Finally, transcript isoforms expressed from paralogous gene loci (e.g., CFD, GATA2, HLA-A, B, and C) also distinguished cell subpopulations but were only detectable by full-length RNA sequencing. Thus, qualitatively distinct transcript isoforms from individual genomic loci separate bone marrow cell subpopulations, indicating complex transcriptional regulation and protein isoform generation during hematopoiesis.

Open Access Article
Genome Sequencing Illustrates the Genetic Basis of the Pharmacological Properties of Gloeostereum incarnatum
Received: 17 December 2018 / Revised: 20 February 2019 / Accepted: 22 February 2019 / Published: 1 March 2019
Abstract
Gloeostereum incarnatum is a precious edible mushroom that is widely grown in Asia and known for its useful medicinal properties. Here, we present a high-quality genome of G. incarnatum using the single-molecule real-time (SMRT) sequencing platform. The G. incarnatum genome, which is the first complete genome to be sequenced in the family Cyphellaceae, was 38.67 Mbp, with an N50 of 3.5 Mbp, encoding 15,251 proteins. Based on our phylogenetic analysis, the Cyphellaceae diverged ~174 million years ago. Several genes and gene clusters associated with lignocellulose degradation, secondary metabolites, and polysaccharide biosynthesis were identified in G. incarnatum, and compared with other medicinal mushrooms. In particular, we identified two terpenoid-associated gene clusters, each containing a gene encoding a sesterterpenoid synthase adjacent to a gene encoding a cytochrome P450 enzyme. These clusters might participate in the biosynthesis of incarnal, a known bioactive sesterterpenoid produced by G. incarnatum. Through a transcriptomic analysis comparing the G. incarnatum mycelium and fruiting body, we also demonstrated that the genes associated with terpenoid biosynthesis were generally upregulated in the mycelium, while those associated with polysaccharide biosynthesis were generally upregulated in the fruiting body. This study provides insights into the genetic basis of the medicinal properties of G. incarnatum, laying a framework for future characterization of bioactive proteins and pharmaceutical uses of this fungus.

Open Access Article
Genome Sequencing of Cladobotryum protrusum Provides Insights into the Evolution and Pathogenic Mechanisms of the Cobweb Disease Pathogen on Cultivated Mushroom
Received: 15 January 2019 / Revised: 4 February 2019 / Accepted: 5 February 2019 / Published: 8 February 2019
Abstract
Cladobotryum protrusum is one of the mycoparasites that cause cobweb disease on cultivated edible mushrooms. However, the molecular mechanisms of evolution and pathogenesis of C. protrusum on mushrooms are largely unknown. Here, we report a high-quality genome sequence of C. protrusum using the single-molecule, real-time sequencing platform of PacBio and perform a comparative analysis with closely related fungi in the family Hypocreaceae. The C. protrusum genome, the first complete genome to be sequenced in the genus Cladobotryum, is 39.09 Mb long, with an N50 of 4.97 Mb, encoding 11,003 proteins. The phylogenomic analysis confirmed its inclusion in Hypocreaceae, with its evolutionary divergence time estimated to be ~170.1 million years ago. The genome encodes a large and diverse set of genes involved in secreted peptidases, carbohydrate-active enzymes, cytochrome P450 enzymes, pathogen–host interactions, mycotoxins, and pigments. Moreover, C. protrusum harbors arrays of genes with the potential to produce bioactive secondary metabolites and stress response-related proteins that are significant for adaptation to hostile environments. Knowledge of the genome will foster a better understanding of the biology of C. protrusum and mycoparasitism in general, as well as help with the development of effective disease control strategies to minimize economic losses from cobweb disease in cultivated edible mushrooms.

Open Access Article
Genome Assembly and Annotation of the Trichoplusia ni Tni-FNL Insect Cell Line Enabled by Long-Read Technologies
Received: 17 December 2018 / Revised: 9 January 2019 / Accepted: 14 January 2019 / Published: 23 January 2019
Abstract
Background: Trichoplusia ni-derived cell lines are commonly used to enable recombinant protein expression via baculovirus infection to generate materials approved for clinical use and in clinical trials. In order to develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL. Methods: By integration of PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL. Results: Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes, and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of high accuracy and completeness of the Tni-FNL genome assembly. Conclusions: This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome that provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.

Open Access Article
A High-Quality De novo Genome Assembly from a Single Mosquito Using PacBio Sequencing
Received: 18 December 2018 / Revised: 14 January 2019 / Accepted: 15 January 2019 / Published: 18 January 2019
Abstract
A high-quality reference genome is a fundamental resource for functional genetics, comparative genomics, and population genomics, and is increasingly important for conservation biology. PacBio Single Molecule, Real-Time (SMRT) sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. Improvements in throughput and concomitant reductions in cost have made PacBio an attractive core technology for many large genome initiatives; however, relatively high DNA input requirements (~5 µg for standard library protocol) have placed PacBio out of reach for many projects on small organisms that have lower DNA content, or on projects with limited input DNA for other reasons. Here we present a high-quality de novo genome assembly from a single Anopheles coluzzii mosquito. A modified SMRTbell library construction protocol without DNA shearing and size selection was used to generate a SMRTbell library from just 100 ng of starting genomic DNA. The sample was run on the Sequel System with chemistry 3.0 and software v6.0, generating, on average, 25 Gb of sequence per SMRT Cell with 20 h movies, followed by diploid de novo genome assembly with FALCON-Unzip. The resulting curated assembly had high contiguity (contig N50 3.5 Mb) and completeness (more than 98% of conserved genes were present and full-length). In addition, this single-insect assembly now places 667 (>90%) of formerly unplaced genes into their appropriate chromosomal contexts in the AgamP4 PEST reference. We were also able to resolve maternal and paternal haplotypes for over 1/3 of the genome. By sequencing and assembling material from a single diploid individual, only two haplotypes were present, simplifying the assembly process compared to samples from multiple pooled individuals. The method presented here can be applied to samples with starting DNA amounts as low as 100 ng per 1 Gb genome size. This new low-input approach puts PacBio-based assemblies in reach for small highly heterozygous organisms that comprise much of the diversity of life.

Open Access Article
De Novo Assembly of Two Swedish Genomes Reveals Missing Segments from the Human GRCh38 Reference and Improves Variant Calling of Population-Scale Sequencing Data
Genes 2018, 9(10), 486; https://doi.org/10.3390/genes9100486
Received: 28 August 2018 / Revised: 21 September 2018 / Accepted: 5 October 2018 / Published: 9 October 2018
Abstract
The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields > 75,000 putative novel single nucleotide variants (SNVs) and removes > 10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.

Open Access Article
A Statistical Method for Observing Personal Diploid Methylomes and Transcriptomes with Single-Molecule Real-Time Sequencing
Received: 15 August 2018 / Revised: 12 September 2018 / Accepted: 12 September 2018 / Published: 19 September 2018
Abstract
We address the problem of observing personal diploid methylomes, CpG methylome pairs of homologous chromosomes that are distinguishable with respect to phased heterozygous variants (PHVs), which is challenging due to the scarcity of PHVs in personal genomes. Single molecule real-time (SMRT) sequencing is promising as it outputs long reads with CpG methylation information, but a serious concern is whether reliable PHVs are available in erroneous SMRT reads with an error rate of ∼15%. To overcome the issue, we propose a statistical model that reduces the error rate of phasing CpG sites to 1%, thereby calling CpG hypomethylation in each haplotype with >90% precision and sensitivity. Using our statistical model, we examined the GNAS complex locus, known for a combination of maternally, paternally, or biallelically expressed isoforms, and observed allele-specific methylation patterns almost perfectly reflecting their respective allele-specific expression status, demonstrating the merit of elucidating comprehensive personal diploid methylomes and transcriptomes.