In this work, we propose a process for selecting informative genetic variants to identify clinically meaningful subtypes of hypertensive patients. We studied 575 African American (AA) and 612 Caucasian hypertensive participants enrolled in the Hypertension Genetic Epidemiology Network (HyperGEN) study and analyzed each race-based group separately. All study participants had genome-wide association study (GWAS) genotype data and underwent echocardiography. We applied a variety of statistical methods and filtering criteria, including generalized linear models, F statistics, burden tests, and deleterious variant filtering, among others, to select the most informative hypertension-related genetic variants. We then applied non-negative matrix factorization (NMF), an unsupervised learning algorithm, to identify hypertension subtypes with similar genetic characteristics. Kruskal–Wallis tests were used to assess whether the genetics-based hypertension subtypes were clinically meaningful. Two subtypes were identified in each of the African American and Caucasian HyperGEN groups. In both AAs and Caucasians, indices of cardiac mechanics differed significantly between hypertension subtypes. Because African Americans tend to carry more genetic variants than Caucasians, using genetic information to distinguish disease subtypes in this group is relatively challenging; nevertheless, the proposed process identified two subtypes whose cardiac mechanics have statistically different distributions. This work suggests a promising direction for using statistical methods to select genetic information and identify disease subgroups, which may inform the development and trials of novel targeted therapies.
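As an illustration of the last two steps described above (NMF-based subtyping followed by a Kruskal–Wallis comparison), the following is a minimal sketch on synthetic data, not the study's actual pipeline. It assumes a participants-by-variants matrix of minor-allele counts, uses a generic multiplicative-update NMF, and assigns each participant to the factor with the largest contribution. The function names (`nmf`, `assign_subtypes`, `demo`), the simulated allele frequencies, and the simulated phenotype are all hypothetical choices made for this example.

```python
import numpy as np
from scipy.stats import kruskal


def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (participants x variants) as W @ H
    using multiplicative updates that reduce the Frobenius error."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def assign_subtypes(W, H):
    """Assign each participant to the factor contributing most to its row.
    Scaling W by the row sums of H removes the arbitrary scale of the factors."""
    contribution = W * H.sum(axis=1)
    return np.argmax(contribution, axis=1)


def demo(seed=0):
    """Run the sketch on synthetic data (assumption: two simulated groups)."""
    rng = np.random.default_rng(seed)
    n_per_group, n_variants = 30, 20
    truth = np.repeat([0, 1], n_per_group)

    # Synthetic allele frequencies: each simulated group is enriched
    # in a different half of the variants.
    freq = np.full((2, n_variants), 0.1)
    freq[0, :10] = 0.8
    freq[1, 10:] = 0.8
    V = rng.binomial(2, freq[truth]).astype(float)  # minor-allele counts 0/1/2

    W, H = nmf(V, rank=2, seed=seed)
    labels = assign_subtypes(W, H)

    # Hypothetical continuous clinical index that differs between the groups.
    phenotype = rng.normal(loc=2.0 * truth, scale=1.0)
    groups = [phenotype[labels == k] for k in np.unique(labels)]
    stat, pvalue = kruskal(*groups)
    return labels, truth, pvalue


labels, truth, pvalue = demo()
print("subtype sizes:", np.bincount(labels, minlength=2))
print("Kruskal-Wallis p-value:", pvalue)
```

In practice the number of subtypes (the NMF rank) would need to be chosen by a stability or fit criterion, and a library implementation of NMF could replace the hand-written updates; both choices are left open here.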
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.