Abstract: As a key regulatory mechanism of gene expression, DNA methylation shows widely altered patterns in many complex genetic diseases, including cancer. DNA methylation levels are naturally quantified as bounded-support data and are therefore non-Gaussian distributed. To capture this property, we introduce several non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Non-Gaussian statistical model-based unsupervised clustering strategies are then applied to cluster the data. We compare and analyze the different dimension reduction strategies and unsupervised clustering methods. Experimental results show that the non-Gaussian statistical model-based methods outperform the conventional Gaussian distribution-based method, making them meaningful tools for DNA methylation analysis. Moreover, among the non-Gaussian methods considered, the one that captures the bounded nature of DNA methylation data achieves the best clustering performance.
Keywords: non-Gaussian statistical models; dimension reduction; unsupervised learning; feature selection; DNA methylation analysis
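The claim that bounded-support data are poorly described by a Gaussian can be illustrated with a minimal sketch. It is not the authors' method: it assumes synthetic values drawn from a beta distribution as a stand-in for methylation levels in (0, 1), fits a beta density by the method of moments, and compares its log-likelihood with that of a Gaussian fitted to the same sample. The shape parameters, the sample size, the clipping threshold, and all function names are illustrative assumptions.

```python
import numpy as np
from math import lgamma


def beta_moment_fit(x):
    """Method-of-moments estimates (a, b) for a beta density on (0, 1)."""
    m, v = x.mean(), x.var()
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


def beta_loglik(x, a, b):
    """Total log-likelihood of x under Beta(a, b)."""
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return float(np.sum(log_norm + (a - 1.0) * np.log(x) + (b - 1.0) * np.log1p(-x)))


def gauss_loglik(x):
    """Total log-likelihood of x under a Gaussian with the sample mean and variance."""
    m, v = x.mean(), x.var()
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * v) - (x - m) ** 2 / (2.0 * v)))


# Synthetic stand-in for methylation levels (assumption: skewed toward 0).
rng = np.random.default_rng(0)
x = rng.beta(0.5, 4.0, size=2000)
# Keep values strictly inside (0, 1) so the logarithms stay finite.
x = np.clip(x, 1e-6, 1.0 - 1e-6)

a, b = beta_moment_fit(x)
ll_beta = beta_loglik(x, a, b)
ll_gauss = gauss_loglik(x)
print(f"beta fit: a={a:.3f}, b={b:.3f}")
print(f"log-likelihood  beta: {ll_beta:.1f}   gaussian: {ll_gauss:.1f}")
```

On skewed data of this kind the beta fit should attain the higher log-likelihood, because the Gaussian places probability mass outside the unit interval and cannot follow the asymmetry near the boundary.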
Citation: Ma, Z.; Teschendorff, A.E.; Yu, H.; Taghia, J.; Guo, J. Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis. Int. J. Mol. Sci. 2014, 15(6), 10835-10854.