Special Issue "Next Generation Sequencing Approaches in Biology"

A special issue of Biology (ISSN 2079-7737).

Deadline for manuscript submissions: 31 July 2012

Special Issue Editor

Guest Editor
Prof. Dr. Mario Stanke
University of Greifswald, Institute for Mathematics and Computer Science, 17487 Greifswald, Germany
Website: http://www.math-inf.uni-greifswald.de/mathe/index.php/mitarbeiter/382-prof-dr-mario-stanke
E-Mail: mario.stanke@uni-greifswald.de

Special Issue Information

Dear Colleagues,

New, fast and cheap DNA sequencing technologies from an increasing number of platforms have advanced biological research applications such as the sequencing of new genomes, the resequencing of individuals' genomes and the analysis of gene expression. Whereas only a decade ago a single genome sequencing project could keep a large consortium busy for years, many sequencing projects now target dozens, hundreds or even thousands of genomes in a single undertaking. The new technology also promises to make medicine more individualized and to improve our understanding of the genotype's influence on diseases and traits. However, next generation sequencing technologies also bring new challenges, such as the need for more efficient mapping and assembly algorithms and the necessity to deal with shorter reads or with reads that have a higher error rate. Further challenges in distributed computing arise from the much larger data sizes. This special issue will cover original research papers on bioinformatics and statistical methods that make use of next generation sequencing. The submission of reviews is also welcome.

Prof. Dr. Mario Stanke
Guest Editor

Submission

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. Papers will be published continuously (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are refereed through a peer-review process. A guide for authors and other relevant information for submission of manuscripts are available on the Instructions for Authors page. Biology is an international peer-reviewed Open Access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. For the first couple of issues the Article Processing Charge (APC) will be waived for well-prepared manuscripts. English correction and/or formatting fees of 250 CHF (Swiss Francs) will be charged in certain cases for those articles accepted for publication that require extensive additional formatting and/or English corrections.

Keywords

  • RNA-seq
  • assembly
  • expression analysis
  • mapping
  • alignment
  • gene finding
  • SNPs
  • genome-wide association
  • copy number variations
  • cloud computing
  • ChIP-seq

Published Papers

No papers have been published in this special issue yet; see below for planned papers.

Planned Papers

Type of Paper: Article
Title: Pre-processing NGS Reads with SeqTrimNext Provides more Reliable Sequence Assemblies with Less User Intervention
Authors: Darío Guerrero-Fernández 1, Almudena Bocinos 1, Rocío Bautista 1, Noé Fernández-Pozo 2, Juan Falgueras 3 and M. Gonzalo Claros 1,2
Affiliations: 1 Plataforma Andaluza de Bioinformática, Centro de Bioinnovación, Universidad de Málaga, Spain; E-Mail: claros@uma.es
2 Dep. Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, Spain
3 Dep. Lenguajes y Ciencias de la Computacion, Escuela Superior de Ingeniería Informática, Universidad de Málaga, Spain
Abstract: The advent of so-called next-generation sequencing (NGS) technologies has not been accompanied by quality control of sequence accuracy. To fill this gap, SeqTrimNeXT has been developed as a quick, reliable, distributed and/or parallelised, customisable pipeline for pre-processing NGS reads. It is available as a command-line tool, as a friendly web tool, and as a REST web service, and its input/output files are compatible with other tools so that more complex pipelines and workflows can be developed. Minimal user intervention is required for installing, using and obtaining pre-processed reads. SeqTrimNeXT usage on genomics and transcriptomics datasets accelerates the subsequent sequence assembly, increases the N50 and the mean length of contigs, as well as the accuracy of the consensus sequences, and reduces the final number of contigs and scaffolds. Additionally, it saves curation time and effort for any sequencing project.
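
The N50 statistic mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common definition (the largest contig length L such that contigs of length at least L together cover at least half of the total assembly length); the function name `n50` and the example lengths are hypothetical and are not taken from SeqTrimNeXT.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed definition: walk through the contigs from longest to shortest
    and report the length at which the running sum first reaches half of
    the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Under this definition, merging many short contigs into fewer long ones raises the N50, which is why it is often reported alongside the number of contigs.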

Type of Paper: Review
Title: mRNA-Seq and miRNA-Seq application in Prunus species
Authors: Pedro Martínez-Gómez and Manuel Rubio
Affiliation: Department of Plant Breeding. CEBAS-CSIC, PO Box 164, E-30100 Espinardo, Murcia, Spain. E-Mail: pmartinez@cebas.csic.es
Abstract: The genus Prunus includes several species producing edible drupes that are grown around the world, including peach, apricot, plum, almond and cherry. The recent sequencing of the complete genome of the peach, considered the reference genome within this genus, together with the availability of new high-throughput sequencing technologies, offers new possibilities for transcriptomics studies in the so-called post-genomic era. In living organisms, including plants such as Prunus, the transcriptome can be described as the complete list of all types of RNA molecules expressed in a cell, tissue or whole organism. The transcriptome comprises the protein-coding RNA (also called messenger RNA, mRNA) and the noncoding RNA, including the non-regulatory RNA [ribosomal RNA (rRNA) and transfer RNA (tRNA)] and the regulatory RNA [small interfering RNA (siRNA), micro RNA (miRNA), small nucleolar RNA (snoRNA) and piwi-interacting RNA (piRNA)]. RNA-Seq involves direct sequencing of cDNAs (from RNA) using high-throughput DNA sequencing technologies, and it represents the latest and most powerful tool for characterizing transcriptomes, providing unparalleled insight into transcriptome complexity. RNA-Seq experiments in plants, including Prunus, have focused first on the high-throughput sequencing of messenger RNA (mRNA-Seq). High-throughput micro RNA sequencing (miRNA-Seq), on the other hand, is another part of the global RNA-Seq application, but it has been applied less widely in plants and there is no example of its application in Prunus. The objective of this work is a detailed description of the application of mRNA-Seq in Prunus, including the details of the experimental design, the sequencing technology, the type of analysis and the application to biological questions. In the case of mRNA, the two major analysis approaches described are mapping reads to the reference genome and de novo transcriptome assembly. These analyses lead to other, more complex analyses, including the development of expressed sequence tags (ESTs) and single nucleotide polymorphisms (SNPs) and differential expression analysis. Finally, the potential of miRNA-Seq studies in Prunus will be discussed.

Type of Paper: Article
Title: FLEXBAR-Flexible Barcode and Adapter Processing for Next-generation Sequencing Platforms
Authors: Matthias Dodt, Rina Ahmed and Christoph Dieterich
Affiliation: Bioinformatics in Quantitative Biology, The Berlin Institute for Medical Systems Biology at the Max Delbrück Center for Molecular Medicine, Robert-Rössle-Strasse 10, 13125 Berlin-Buch, Germany; E-Mail: christoph.dieterich@mdc-berlin.de
Abstract: Quantitative and systems biology approaches benefit from the unprecedented depth of next-generation sequencing. A typical experiment yields millions of short reads, which often carry particular sequence tags. These tags may (a) be specific to the sequencing platform and library construction method (e.g. adapter sequences), (b) have been introduced by experimental design (e.g. sample barcodes) or (c) constitute some biological signal (e.g. splice leader sequences in nematodes). Generally, tag sequences could be located anywhere within a given short read tuple. Our software FLEXBAR enables accurate recognition, sorting and trimming of sequence tags with maximal flexibility. FLEXBAR combines precision (tag recognition by dynamic programming) with speed (multi-threading). We demonstrate the utility of our software in terms of sequence assembly applications, library demultiplexing and splice leader detection.
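
As a rough illustration of what tag recognition by dynamic programming can look like, the sketch below finds the best approximate occurrence of an adapter inside a read using a standard edit-distance recurrence in which the match may start at any read position. It is not FLEXBAR's implementation: the function name `find_adapter`, the `max_errors` parameter, the unit edit costs and the example sequences are all assumptions made for this example, and partial adapters at the end of a read are not handled.

```python
def find_adapter(read, adapter, max_errors=1):
    """Return (start, end, errors) of the best approximate adapter match.

    Generic approximate substring matching, not FLEXBAR's algorithm.
    read[start:end] is the matched region; None is returned when the best
    match needs more than max_errors edits (mismatches, insertions or
    deletions, each counted as 1).
    """
    m, n = len(adapter), len(read)
    # Row 0: the empty adapter prefix matches at any read position for free.
    prev_cost = [0] * (n + 1)
    prev_start = list(range(n + 1))
    for i in range(1, m + 1):
        cost = [i] + [0] * n    # adapter[:i] vs. empty read prefix: i deletions
        start = [0] * (n + 1)
        for j in range(1, n + 1):
            sub = prev_cost[j - 1] + (adapter[i - 1] != read[j - 1])
            dele = prev_cost[j] + 1   # adapter base missing from the read
            ins = cost[j - 1] + 1     # extra base in the read
            best = min(sub, dele, ins)
            cost[j] = best
            # Carry along where the chosen alignment started in the read.
            if best == sub:
                start[j] = prev_start[j - 1]
            elif best == dele:
                start[j] = prev_start[j]
            else:
                start[j] = start[j - 1]
        prev_cost, prev_start = cost, start
    end = min(range(n + 1), key=lambda j: prev_cost[j])
    if prev_cost[end] > max_errors:
        return None
    return prev_start[end], end, prev_cost[end]


if __name__ == "__main__":
    # Hypothetical read: 8 genomic bases followed by an adapter sequence.
    read = "ACGTACGT" + "AGATCGGAAGAGC"
    hit = find_adapter(read, "AGATCGGAAGAGC")
    if hit is not None:
        print(read[:hit[0]])  # the read with the adapter trimmed off
```

The table has one row per adapter base and one column per read base, so the running time grows with the product of the two lengths; the abstract pairs this kind of precise but costly matching with multi-threading for speed.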

Type of Paper: Article
Title: TE-Locate: A Tool to Locate and Group Transposable Element Occurrences Using Next-generation Sequencing Data
Authors: Alexander Platzer, Viktoria Nizhynska and Quan Long
Affiliation: GMI-Gregor Mendel Institute, Vienna, Austria; E-Mail: quan.long@gmi.oeaw.ac.at
Abstract: Transposable elements (TEs) are common mobile DNA elements present in most genomes. Since the movement of TEs within a genome can sometimes have phenotypic consequences, an accurate report of TE activity is always desirable. For this purpose we developed TE-Locate, a computational tool that uses paired-end reads to identify the novel locations of known TEs. TE-Locate can utilize either a database of TE sequences or annotated TEs within the reference sequence of interest. This makes TE-Locate useful in the search for any mobile sequence, including retrotransposed gene copies. One major issue is to act on the right hierarchy level, to avoid incorrectly calling a single insertion as multiple nested events of TEs with high sequence similarity. We use the superfamily level, but TE-Locate can also use any other level, down to the individual transposable element. As an example of an analysis with TE-Locate, we use the Swedish population of the 1001 Arabidopsis genomes and present some associations between TEs and demographic/phenotypic traits.

Type of Paper: Review
Title: Analyzing the microRNA Transcriptome Using Deep Sequencing Data
Authors: Xiaozeng Yang and Lei Li
Affiliation: Department of Biology, University of Virginia, Charlottesville VA 22904, USA; E-Mail: ll4jn@virginia.edu
Abstract: MicroRNAs (miRNAs) are endogenous small RNA molecules of 20 to 24 nucleotides that are emerging as an important class of sequence-specific, trans-acting regulators modulating gene expression at the post-transcriptional level. There has been a surge of interest in the past decade in identifying miRNAs and profiling their expression patterns using various experimental approaches. In particular, ultra-deep sampling of specifically prepared low-molecular-weight RNA libraries based on next-generation sequencing technologies has been used successfully in diverse species. The challenge now is to effectively deconvolute the complex sequencing data to provide comprehensive and reliable information on the miRNAs, miRNA precursors, and expression profiles of miRNA genes. Here we review the recently developed computational tools and their applications in profiling the miRNA transcriptomes, with an emphasis on the model plant Arabidopsis thaliana. Also highlighted are advances in and insights into miRNA biology derived from deep sequencing data.

Type of Paper: Review
Title: Next-generation Sequencing—Application in Liver Cancer—Past, Present and Future?
Authors: J. U. Marquardt 1,2 and J. B. Andersen 2
Affiliations: 1 Department of Medicine I, Johannes Gutenberg University, Mainz, Germany
2 Laboratory of Experimental Carcinogenesis, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland 20892, USA; E-Mail: marquarj@uni-mainz.de
Abstract: Hepatocellular Carcinoma (HCC) is the third most deadly malignancy worldwide and is characterized by phenotypic and molecular heterogeneity. Over the past two decades, advances in genomic analyses have created a comprehensive understanding of different underlying pathobiological layers leading to hepatocarcinogenesis. More recently, improvements in sophisticated next-generation sequencing (NGS) technologies have enabled complete and cost-efficient analyses of cancer genomes at single-nucleotide resolution, and these technologies have advanced into valuable tools in translational medicine. Although the use of NGS in human liver cancer is still in its infancy, great promise rests in the systematic integration of different molecular analyses obtained by these methodologies, i.e. genomics, transcriptomics and epigenomics, to identify relevant and recurrent pathophysiological hallmarks, thereby improving our limited understanding of liver cancer. However, besides tumor heterogeneity, progress in translational oncology is challenged by the amount of biological information and considerable “noise” in the data obtained from different NGS platforms. Nevertheless, the following review aims to provide an overview of the current status of next-generation approaches in liver cancer, and to outline the perspective of these technologies in diagnosis, patient classification, and prediction of outcome. Further, the potential of NGS to identify novel applications for concept clinical trials and to accelerate the development of new cancer therapies will be summarized.

Type of Paper: Review
Title: Next-generation sequencing: from understanding biology to personalized medicine?
Authors: Karen S. Frese, Jan Haas, Hugo A. Katus and Benjamin Meder
Affiliation: Department of Internal Medicine, University of Heidelberg, Heidelberg, Germany; E-Mail: benjamin.meder@med.uni-heidelberg.de
Abstract: Within a few years, the new methods for high-throughput next-generation sequencing have generated completely novel insights into the missing heritability of various diseases. In this review we want to highlight the benefits of current state-of-the-art sequencing technologies for genetic and epigenetic research. We illustrate how these technologies help to constantly improve our understanding of genetic mechanisms in biological systems and summarize the progress made so far for cardiovascular disease. This is best exemplified in the case of heritable heart muscle diseases, so-called cardiomyopathies. Here, exome as well as whole-genome sequencing is able to identify novel disease genes, and first clinical applications demonstrate their successful translation into personalized patient care.

Type of Paper: Review
Title: Genotyping-by-Sequencing in Plants
Authors: Stéphane Deschamps 1, Victor Llaca 1 and Gregory May 2
Affiliations: 1 DuPont Agricultural Biotechnology, Experimental Station, PO Box 80353, 200 Powder Mill Road, Wilmington, DE 19880-0353, USA
2 Pioneer Hi-Bred International Inc., A DuPont Company, 7300 NW 62nd Ave., P.O. Box 1004, Johnston, IA 50131-1004, USA; E-Mail: stephane.deschamps@usa.dupont.com
Abstract: The advent of next-generation DNA sequencing (NGS) technologies has led to the development of rapid genome-wide SNP detection applications in various plant species. Recent improvements in sequencing throughput, combined with an overall decrease in costs per gigabase of sequence, are allowing NGS to be applied not only to the evaluation of small subsets of parental inbred lines, but also to the mapping and characterization of traits of interest in much larger populations. Such an approach, where sequences are used simultaneously to detect and score SNPs, thereby bypassing the entire marker assay development stage, is known as “genotyping-by-sequencing” or GBS. This review will summarize the current state of GBS in plants and the promises it holds as a genome-wide genotyping application.

Type of Paper: Review
Title: Systems Biology and Integrated Computational Frameworks for Rapid Characterization of Multi-omic Data
Authors: Christopher E. Mason 1 and Todd M. Smith 2
Affiliations: 1 Department of Physiology and Biophysics, Weill Cornell Medical College, 1305 York Ave., NY, NY 10021, USA
2 PerkinElmer | Geospiza, 100 W. Harrison St. NT 330, Seattle, WA 98119, USA; E-Mail: tsmith423@gmail.com
Abstract: In today's biology, systems are studied in preference to discrete biochemical reactions and pathways. Global datasets that combine genetic data with a diverse array of "omics" data (transcriptional, epigenetic, proteomic, metabolomic) are collected using high-throughput data generation platforms that include high content screening, imaging, flow cytometry, mass spectrometry and nucleic acid sequencing. Of these, the next generation DNA sequencing platforms predominate because they provide an inexpensive and scalable way to interrogate genetic differences, gene expression, and a myriad of factors that control gene expression. Today, scores of methods exist to study DNA, mRNA, non-coding RNAs, DNA/RNA-protein interactions, and nucleotide modifications that form the epigenome. Future single molecule sequencing platforms will likely make direct RNA and protein measurement possible, increasing the specificity of current assays and allowing analysis of the epitranscriptome. The challenge ahead is developing the software systems that can integrate datasets in ways that increase biological signal while decreasing experimental and computational noise. Such systems will verify results and provide the necessary user interactions to work with the billions of data points needed to make scientific discoveries. Using leukemia as a model system, we have integrated data from genome, exome, transcriptome, and methylome assays and observed widespread noise in each assay that requires thorough cleaning before integration. Yet, we also show that when the methods converge, they give immediately actionable results. Finally, we report initial progress in single-molecule detection of base modifications, which promises to reveal an entirely new realm of regulation in biology.

Type of Paper: Review
Title: Transcriptome Complexity of Bacterial Cells Revealed by Next Generation Sequencing
Authors: JuHee Kim, JaYoung Kim, Sun Chang Kim and Byung-Kwan Cho
Affiliation: Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Republic of Korea; E-Mail: bcho@kaist.edu
Abstract: Over the past decade or so, dramatic developments in our ability to experimentally determine the contents and functions of genomes have taken place. In particular, next-generation sequencing (NGS) technologies are now inspiring a new understanding of the bacterial genome on a global scale. In bacterial cells, whole-transcriptome studies have not received much attention, owing to the general view that bacterial genomes are simple as well as to experimental difficulties in enriching for mRNAs that lack poly(A) tails. However, RNA-seq technologies are revealing unexpected levels of complexity in bacterial transcriptomes, which include a wide array of small RNAs, antisense RNAs, and alternative transcripts. For example, a method has been developed for the genome-wide determination of bacterial transcriptional start sites (TSSs) using a massively parallel sequencing platform and a 5′-triphosphorylated mRNA capture method. These high-throughput data sets have then been integrated to fully elucidate bacterial transcriptome architectures. We will review how this systems approach is now revolutionizing our understanding of the complexity, plasticity and regulation of bacterial transcriptomes, and how it sheds light on synthetic biology applications such as novel biological circuit design.

Type of Paper: Article
Title: Next Generation Sequencing Practice in Clinical Routine for Mutation Detection in Large or Multiple Genes That Cause A Genetic Disease
Authors: Miguel de Sousa Dias, Imma Hernan, Emma Borrás, Maria José Gamundy, Beatriz Pascual, Begoña Mañé and Miguel Carballo
Affiliation: Unit of Molecular Genetics, Hospital of Terrassa, Spain; E-Mail: MCarballo@CST.CAT
Abstract: In current clinical practice, resequencing of genes involved in a disease in individual patient samples is becoming increasingly important in order to carry out molecular testing. Medical analysis of candidate genes to characterize the mutation that causes a disease currently requires amplification of the exonic and flanking sequences by PCR as a step prior to Sanger sequencing of the individual PCR fragments. The introduction of massively parallel next-generation sequencing (NGS) technology is becoming increasingly necessary in sequencing genes to characterize mutations causing a monogenic disease. Large NGS platforms have been used for massively parallel DNA sequencing. However, the cost and extremely large capacity of these platforms result in a loss of flexibility for the needs of many clinical genetic laboratories. Alternatively, a scalable clinical benchtop sequencing platform, the Roche 454 GS Junior, has been introduced that makes it feasible to sequence a subset of genes in individual samples using the NGS technique at lower cost. We use this NGS platform for molecular testing of heterogeneous diseases caused by mutations in several or large genes. Thus, we assayed two enzymatic methods and a long-range PCR (LR-PCR) assay to prepare DNA libraries for NGS to achieve molecular testing of the large BRCA1 and BRCA2 genes associated with increased risk of breast and ovarian cancer. Moreover, we assayed different approaches based on LR-PCR, multiplex PCR or targeted DNA gene capture to detect mutations causing autosomal dominant retinitis pigmentosa (adRP). We discuss the efficiency of the different approaches used for NGS in routine clinical practice.

Last update: 16 May 2012

Biology EISSN 2079-7737 Published by MDPI Publishing, Basel, Switzerland