Open Access Article
Entropy 2016, 18(12), 445; doi:10.3390/e18120445

Multivariate Surprisal Analysis of Gene Expression Levels

1 Département de Chimie, B6c, Université de Liège, B4000 Liège, Belgium
2 The Fritz Haber Research Center for Molecular Dynamics, The Institute of Chemistry, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
3 Department of Urology, David Geffen School of Medicine and Department of Molecular Cell & Developmental Biology, University of California, Los Angeles, CA 90095, USA
4 Crump Institute for Molecular Imaging and Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, and Department of Chemistry & Biochemistry, University of California, Los Angeles, CA 90095, USA
* Author to whom correspondence should be addressed.
Academic Editor: Anne Humeau-Heurtier
Received: 14 November 2016 / Revised: 6 December 2016 / Accepted: 7 December 2016 / Published: 11 December 2016
(This article belongs to the Special Issue Multivariate Entropy Measures and Their Applications)

Abstract

We consider multivariate data, by which we mean that each data point i is measured for two or more distinct variables. Typically there are many data points i, while the range of each variable is more limited. If there is only one variable, the data can be arranged as a rectangular matrix in which i indexes the rows and the values of the variable label the columns. We begin with this case and then proceed to the more general one, with special emphasis on two variables, where the data can be organized as a tensor. We discuss the analysis of such multivariate data by a maximal entropy approach and illustrate it for gene expression in four different cell types from six different patients. The genes are indexed by i, and there are 24 (4 by 6) entries for each i. We use an unbiased, thermodynamics-based maximal entropy approach (surprisal analysis) to analyze the multivariate transcriptional profiles. The measured microarray data are organized as a tensor array whose two minor orthogonal directions are the different patients and the different cell types. The entries are the transcription levels on a logarithmic scale. We identify a disease signature of prostate cancer and determine the degree of variability between individual patients. Surprisal analysis determines a baseline expression level common to all cells and patients. We identify the transcripts in the baseline as the “housekeeping” genes that ensure cell stability. The baseline and two surprisal patterns satisfactorily recover (99.8%) the multivariate data. The two patterns characterize the individuality of the patients and, to a lesser extent, the commonality of the disease. The immune response is identified as the most significant pathway contributing to the cancer disease pattern. Delineating patient variability is a central issue in personalized diagnostics, and it remains to be seen whether additional data will confirm the power of multivariate analysis to address this key point. The collapsed limits, in which the data are compacted into two-dimensional arrays, are contained within the proposed formalism.
Keywords: multivariate analysis; maximal entropy; prostate cancer markers; personalized diagnostics; transcriptomics; high order SVD; tensor data format; ensemble phenotypes
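
The data layout described in the abstract can be illustrated with a small numerical sketch. It is not the authors' procedure: it uses synthetic data, and it replaces the high order SVD named in the keywords with an ordinary SVD of the tensor flattened to a genes by 24 matrix, which corresponds to one of the collapsed two-dimensional limits. The function name `surprisal_svd`, the gene count, and all numerical values are assumptions made for illustration only.

```python
import numpy as np


def surprisal_svd(log_x, n_terms):
    """Truncated SVD of a log-expression tensor (illustrative sketch).

    log_x has shape (n_genes, n_cell_types, n_patients) and holds
    expression levels on a logarithmic scale. The tensor is flattened
    to a (n_genes, n_cell_types * n_patients) matrix, and only the
    n_terms leading singular triplets are kept.
    """
    n_genes = log_x.shape[0]
    flat = log_x.reshape(n_genes, -1)
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    approx = (u[:, :n_terms] * s[:n_terms]) @ vt[:n_terms]
    # Fraction of the squared norm of the data carried by the kept terms.
    recovered = np.sum(s[:n_terms] ** 2) / np.sum(s ** 2)
    return approx.reshape(log_x.shape), s, recovered


# Synthetic stand-in for the data: 4 cell types and 6 patients as in the
# abstract; the number of genes (200) is an arbitrary assumption.
rng = np.random.default_rng(0)
n_genes, n_cells, n_patients = 200, 4, 6

# A baseline common to all cells and patients, plus one pattern that varies
# with cell type and one that varies with patient, plus small noise.
baseline = rng.normal(8.0, 1.0, size=(n_genes, 1, 1))
cell_pattern = (rng.normal(0.0, 0.5, size=(n_genes, 1, 1))
                * rng.normal(size=(1, n_cells, 1)))
patient_pattern = (rng.normal(0.0, 0.5, size=(n_genes, 1, 1))
                   * rng.normal(size=(1, 1, n_patients)))
noise = rng.normal(0.0, 0.01, size=(n_genes, n_cells, n_patients))
log_x = baseline + cell_pattern + patient_pattern + noise

# Keep three terms: the dominant (baseline-like) one and two further patterns.
approx, singular_values, recovered = surprisal_svd(log_x, n_terms=3)
print(f"fraction of squared norm recovered by 3 terms: {recovered:.6f}")
```

In this toy setting the leading singular term plays the role of the baseline and the next two play the role of the surprisal patterns. The recovered fraction printed here is a property of the synthetic data and should not be compared with the 99.8% reported for the measured data.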

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. (CC BY 4.0).

MDPI and ACS Style

Remacle, F.; Goldstein, A.S.; Levine, R.D. Multivariate Surprisal Analysis of Gene Expression Levels. Entropy 2016, 18, 445.
