Special Issue "Computational Modeling and Analysis of Microarray Data: New Horizons"

A special issue of Microarrays (ISSN 2076-3905).

Deadline for manuscript submissions: closed (31 August 2015)

Special Issue Editor

Guest Editor
Prof. Heather J. Ruskin

Centre for Scientific Computing and Complex Systems Modelling, Dublin City University, Glasnevin, Dublin 9, Ireland
Interests: computational/statistical modeling and analysis; biological networks (GRNs); microarray data analysis; high throughput sequence analysis; cross-platform normalization; cross-scale computational models; bioinformatics; socioeconomic systems; computational science and simulation methods; systems biology and bioinformatics; spatiotemporal processes; agent basis; network analysis

Special Issue Information

Dear Colleague,

Microarray datasets are large and demand sophisticated handling and high-precision analysis. This Special Issue will therefore be concerned with the range of computational modeling and analytical techniques used to extract information from these technologies, as well as new directions in their development and use. Topics on which submissions are invited include, but are not limited to:

  • data mining and pre-processing
  • pattern recognition, cluster analysis and validation
  • modeling uncertainty and identifying outliers in data
  • multivariate and non-standard statistical techniques
  • integrated data analysis, together with reconciliation of data types (evaluation and comparison)
  • genome-wide analyses
  • gene expression and model-based analysis
  • serial, co-expression and enrichment analysis
  • computational algorithms, heuristics and novel tools in microarray modeling and analysis, including genetic, evolutionary, clustering/segmentation, E-M and parallel methods, inter alia
  • network-based approaches and GRNs, including time-conditioned data and dynamic networks
  • visualization and interpretation of microarray expression data

Prof. Heather J. Ruskin
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Microarrays is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 350 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • pattern recognition
  • model-based analysis
  • data integration and comparison
  • evolutionary algorithms
  • genetic regulatory networks
  • network analysis
  • time-conditioned microarray analysis
  • visualization

Published Papers (8 papers)

Editorial

Open Access | Editorial
Computational Modeling and Analysis of Microarray Data: New Horizons
Microarrays 2016, 5(4), 26; doi:10.3390/microarrays5040026
Received: 11 October 2016 / Revised: 13 October 2016 / Accepted: 13 October 2016 / Published: 21 October 2016
Abstract
High-throughput microarray technologies have long been a source of data for a wide range of biomedical investigations. Over the decades, variants have been developed and sophistication of measurements has improved, with generated data providing both valuable insight and considerable analytical challenge. The cost-effectiveness of microarrays, as well as their fundamental applicability, made them a first choice for much early genomic research, and efforts to improve accessibility, quality and interpretation have continued unabated. In recent years, however, the emergence of new generations of sequencing methods and, importantly, the reduction of their costs have shifted much genomic research towards sequence data, which are both less ‘noisy’ and, arguably, carry species information that is more directly targeted and easily interpreted. Nevertheless, new microarray data are still being generated and, together with their considerable legacy, can offer a complementary perspective on biological systems and disease pathogenesis. The challenge now is to exploit novel methods for enhancing and combining these data with those generated by alternative high-throughput techniques, such as sequencing, to provide added value. Augmentation and integration of microarray data, and the new horizons this opens up, provide the theme for the papers in this Special Issue.
(This article belongs to the Special Issue Computational Modeling and Analysis of Microarray Data: New Horizons)

Research

Open Access | Feature Paper | Article
Enhancing Interpretability of Gene Signatures with Prior Biological Knowledge
Microarrays 2016, 5(2), 15; doi:10.3390/microarrays5020015
Received: 5 October 2015 / Revised: 25 May 2016 / Accepted: 31 May 2016 / Published: 8 June 2016
Abstract
Biological interpretability is a key requirement for the output of microarray data analysis pipelines. The most widely used pipeline first identifies a gene signature from the acquired measurements and then uses gene enrichment analysis as a tool for functionally characterizing the obtained results. Recently, Knowledge Driven Variable Selection (KDVS), an alternative approach which performs both steps at the same time, has been proposed. In this paper, we assess the effectiveness of KDVS against standard approaches on a Parkinson’s Disease (PD) dataset. The presented quantitative analysis is made possible by the construction of a reference list of genes and gene groups associated with PD. Our work shows that KDVS is much more effective than the standard approach in enhancing the interpretability of the obtained results.

Open Access | Feature Paper | Article
Cancer Biomarkers from Genome-Scale DNA Methylation: Comparison of Evolutionary and Semantic Analysis Methods
Microarrays 2015, 4(4), 647-670; doi:10.3390/microarrays4040647
Received: 27 August 2015 / Revised: 9 November 2015 / Accepted: 18 November 2015 / Published: 27 November 2015
Abstract
DNA methylation profiling exploits microarray technologies, thus yielding a wealth of high-volume data. Here, an intelligent framework is applied to epidemiological genome-scale DNA methylation data produced on Illumina’s Infinium Human Methylation 450K BeadChip platform, in an effort to correlate interesting methylation patterns with cancer predisposition and, in particular, with breast cancer and B-cell lymphoma. Feature selection and classification are employed in order to select, from an initial set of ~480,000 methylation measurements at CpG sites, predictive cancer epigenetic biomarkers and to assess their classification power for discriminating healthy versus cancer-related classes. Feature selection exploits either evolutionary algorithms or a graph-theoretic methodology which makes use of the semantic information included in the Gene Ontology (GO) tree. The selected features, corresponding to methylation of CpG sites, attained moderate-to-high classification accuracies when passed to a series of classifiers evaluated by resampling or blindfold validation. The semantics-driven selection revealed sets of CpG sites performing similarly to those from evolutionary selection in the classification tasks. However, gene enrichment and pathway analysis showed that it additionally provides more descriptive sets of GO terms and KEGG pathways regarding the cancer phenotypes studied here. Results support the expediency of this methodology regarding its application in epidemiological studies.

Open Access | Communication
Integrating Colon Cancer Microarray Data: Associating Locus-Specific Methylation Groups to Gene Expression-Based Classifications
Microarrays 2015, 4(4), 630-646; doi:10.3390/microarrays4040630
Received: 5 July 2015 / Revised: 22 September 2015 / Accepted: 30 October 2015 / Published: 23 November 2015
Abstract
Recently, considerable attention has been paid to gene expression-based classifications of colorectal cancers (CRC) and their association with patient prognosis. In addition to changes in gene expression, abnormal DNA methylation is known to play an important role in cancer onset and development, and colon cancer is no exception to this rule. Large-scale technologies, such as methylation microarray assays and specific sequencing of methylated DNA, have been used to determine whole genome profiles of CpG island methylation in tissue samples. In this article, publicly available microarray-based gene expression and methylation data sets are used to characterize expression subtypes with respect to locus-specific methylation. A major objective was to determine whether integration of these data types improves previously characterized subtypes, or provides evidence for additional subtypes. We used unsupervised clustering techniques to determine methylation-based subgroups, which were subsequently annotated with three published expression-based classifications, comprising from three to six subtypes. Our results showed that, while methylation profiles provide a further basis for segregation of certain (Inflammatory and Goblet-like) finer-grained expression-based subtypes, they also suggest that other finer-grained subtypes are not distinctive and can be considered as a single subtype.

Open Access | Article
“Upstream Analysis”: An Integrated Promoter-Pathway Analysis Approach to Causal Interpretation of Microarray Data
Microarrays 2015, 4(2), 270-286; doi:10.3390/microarrays4020270
Received: 17 March 2015 / Revised: 11 May 2015 / Accepted: 14 May 2015 / Published: 21 May 2015
Abstract
A strategy is presented that allows a causal analysis of co-expressed genes, which may be subject to common regulatory influences. A state-of-the-art promoter analysis for potential transcription factor (TF) binding sites, in combination with a knowledge-based analysis of the upstream pathways that control the activity of these TFs, is shown to lead to hypothetical master regulators. This strategy was implemented as a workflow in a comprehensive bioinformatic software platform. We applied this workflow to gene sets that were identified by a novel triclustering algorithm in naphthalene-induced gene expression signatures of murine liver and lung tissue. As a result, tissue-specific master regulators were identified that are known to be linked with tumorigenic and apoptotic processes. To our knowledge, this is the first time that genes of expression triclusters were used to identify upstream regulators.

Open Access | Article
Data Integration for Microarrays: Enhanced Inference for Gene Regulatory Networks
Microarrays 2015, 4(2), 255-269; doi:10.3390/microarrays4020255
Received: 27 February 2015 / Accepted: 30 April 2015 / Published: 14 May 2015
Abstract
Microarray technologies have been the basis of numerous important findings regarding gene expression in the last few decades. Studies have generated large amounts of data describing various processes, which, due to the existence of public databases, are widely available for further analysis. Given their lower cost and higher maturity compared to newer sequencing technologies, these data continue to be produced, even though data quality has been the subject of some debate. However, given the large volume of data generated, integration can help overcome some issues related, e.g., to noise or reduced time resolution, while providing additional insight on features not directly addressed by sequencing methods. Here, we present an integration test case based on public Drosophila melanogaster datasets (gene expression, binding site affinities, known interactions). Using an evolutionary computation framework, we show how integration can enhance the ability to recover transcriptional gene regulatory networks from these data, and we indicate which data types are more important for quantitative and qualitative network inference. Our results show a clear improvement in performance when multiple datasets are integrated, indicating that microarray data will remain a valuable and viable resource for some time to come.

Open Access | Article
In Silico Genomic Fingerprints of the Bacillus anthracis Group Obtained by Virtual Hybridization
Microarrays 2015, 4(1), 84-97; doi:10.3390/microarrays4010084
Received: 21 October 2014 / Revised: 12 February 2015 / Accepted: 13 February 2015 / Published: 17 February 2015
Abstract
In this study, we evaluate the capacity of Virtual Hybridization to discriminate between highly related bacterial strains. Eight genomic fingerprints were obtained by virtual hybridization for the Bacillus anthracis genome set, using a set of 15,264 13-nucleotide short probes designed to produce genomic fingerprints unique to each organism. The data obtained from each genomic fingerprint were used to obtain hybridization patterns simulating a DNA microarray. Two virtual hybridization methods, the Direct and the Extended method, were used to identify the number of potential hybridization sites and thus determine the minimum sensitivity value needed to discriminate between genomes with 99.9% similarity. Genomic fingerprints were compared using both methods and phylogenomic trees were constructed to verify that the minimum detection value is 0.000017. Results obtained from the genomic fingerprints suggest that the distribution in the trees is correct, as compared to other taxonomic methods. Specific virtual hybridization sites for each of the genomes studied were also identified.
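The general idea of a probe-based genomic fingerprint can be illustrated with a toy sketch. This is not the paper's Direct or Extended method: it assumes exact, single-strand matching only, and the sequences, probes and function names (`count_sites`, `fingerprint`, `jaccard_distance`) are hypothetical and chosen purely for illustration.

```python
def count_sites(genome, probe):
    """Count exact-match positions of probe in genome (overlaps allowed)."""
    count = 0
    start = genome.find(probe)
    while start != -1:
        count += 1
        start = genome.find(probe, start + 1)
    return count


def fingerprint(genome, probes):
    """Binary fingerprint: 1 if the probe matches at least once, else 0."""
    return [1 if count_sites(genome, p) > 0 else 0 for p in probes]


def jaccard_distance(fp_a, fp_b):
    """Distance between two binary fingerprints (0.0 means identical)."""
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


# Toy data (assumption): two short sequences differing by one base.
genome_a = "ACGTACGTTAGC"
genome_b = "ACGTTCGTTAGC"
probes = ["ACGT", "TACG", "TTAG", "GGGG"]

fp_a = fingerprint(genome_a, probes)
fp_b = fingerprint(genome_b, probes)
distance = jaccard_distance(fp_a, fp_b)
print(fp_a, fp_b, round(distance, 3))
```

A realistic virtual hybridization would also have to consider the reverse-complement strand and tolerate mismatches; pairwise fingerprint distances of this kind could then be passed to a tree-building method.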

Review

Open Access | Review
An Overview of NCA-Based Algorithms for Transcriptional Regulatory Network Inference
Microarrays 2015, 4(4), 596-617; doi:10.3390/microarrays4040596
Received: 1 September 2015 / Revised: 7 October 2015 / Accepted: 11 November 2015 / Published: 16 November 2015
Abstract
In systems biology, the regulation of gene expression involves a complex network of regulators. Transcription factors (TFs) represent an important component of this network: they are proteins that control which genes are turned on or off in the genome by binding to specific DNA sequences. Transcription regulatory networks (TRNs) describe gene expression as a function of regulatory inputs specified by interactions between proteins and DNA. A complete understanding of TRNs helps to predict a variety of biological processes and to diagnose, characterize and eventually develop more efficient therapies. Recent advances in biological high-throughput technologies, such as DNA microarrays and next-generation sequencing (NGS), have made the inference of transcription factor activities (TFAs) and TF-gene regulations possible. Network component analysis (NCA) represents an efficient computational framework for TRN inference from the information provided by microarrays, ChIP-on-chip and the prior information about TF-gene regulation. However, NCA suffers from several shortcomings. Recently, several algorithms based on the NCA framework have been proposed to overcome these shortcomings. This paper first overviews the computational principles behind NCA and then surveys the state-of-the-art NCA-based algorithms proposed in the literature for TRN reconstruction.
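As brief orientation for the framework named above, the basic NCA model is commonly written as a bilinear decomposition of the expression data, fitted under a known sparsity pattern. The symbols below (E, A, P, N, L, M, Z_0) are notation assumed here for illustration and are not taken from this listing: E holds (log) expression measurements for N genes over M samples, A holds TF-gene connectivity strengths for L transcription factors, P holds the TF activities, and Z_0 denotes the set of matrices whose zero entries match the prior connectivity information.

```latex
\[
  E = A\,P, \qquad
  E \in \mathbb{R}^{N \times M},\quad
  A \in \mathbb{R}^{N \times L},\quad
  P \in \mathbb{R}^{L \times M}
\]
\[
  \min_{A,\,P}\ \lVert E - A P \rVert_F^{2}
  \quad \text{subject to} \quad A \in Z_0
\]
```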