Reprint

Statistical Inference from High Dimensional Data

Edited by
April 2021
314 pages
  • ISBN 978-3-0365-0944-0 (Hardback)
  • ISBN 978-3-0365-0945-7 (PDF)

This book is a reprint of the Special Issue Statistical Inference from High Dimensional Data that was published in

Chemistry & Materials Science
Computer Science & Mathematics
Physical Sciences
Summary
  • Real-world problems can be high-dimensional, complex, and noisy
  • More data does not imply more information
  • Different approaches address the so-called curse of dimensionality by reducing irrelevant information
  • A process with multidimensional information is not necessarily easy to interpret or process
  • In some real-world applications, one class has clearly fewer elements than the others; models tend to assume that the majority class matters most for the analysis, which is usually not the case
  • The analysis of complex diseases such as cancer focuses on multidimensional omic data
  • The growing amount of data, made possible by the falling cost of high-throughput experiments, opens up a new era for integrative data-driven approaches
  • Entropy-based approaches are of interest for reducing the dimensionality of high-dimensional data
Format
  • Hardback
License
© 2022 by the authors; CC BY-NC-ND license
Keywords
e-bike rider; crash risk; machine learning; traffic violation; breast cancer; radiomics analysis; feature extraction; feature selection; Haar wavelet decomposition; gray-level co-occurrence matrix; contrast-enhanced spectral mammography; kernel regression; semi-supervised learning; dimensionality reduction; anchor graph regularization; alignment-free; HIV-1 virus; phylogenetic analysis; position-weighted k-mers; Robinson–Foulds distance; topic model; Bayesian information criterion; expectation maximization algorithm; medical abstracts; GWAS; Pearson correlation; mutual information; feature screening; Bayesian Lasso; practical security; offensive security; user authentication; vulnerability analysis; gene set analysis; microarrays; RNA-sequencing; genome wide association study; competitive; self-contained; sampling model; null hypothesis; residue cluster class; structural classification; functional classification; hybrid systems; MK classification; spectral features; astronomical databases; artificial neural networks; gene concept; scientific method; experimentalism; reductionism; anti-reductionism; data-mining; structure learning; probabilistic graphical models; pruning; earth observation; genetic algorithm; information theory; SVM; MRMR; bootstrap; gene expression; biological relevance; subject classification; genome-scale metabolic networks; information redundancy; metabolic landscapes analysis; graph entropy; renal cell carcinoma; transcriptomics; copy number alteration; integrative analysis; Renyi’s relative entropy; the cancer gene atlas project; lower-grade glioma; feature ranking; grouping; clustering; biological knowledge