Open Access Article

A Knowledge-Based Machine Learning Approach to Gene Prioritisation in Amyotrophic Lateral Sclerosis

1 Department of Biostatistics & Health Informatics, King's College London, 16 De Crespigny Park, London SE5 8AF, UK
2 Health Data Research UK London, University College London, 16 De Crespigny Park, London SE5 8AF, UK
3 King's College Hospital, Bessemer Road, Denmark Hill, Brixton, London SE5 9RS, UK
4 Maurice Wohl Clinical Neuroscience Institute, Department of Basic and Clinical Neuroscience, King's College London, London, 5 Cutcombe Rd, Brixton, London SE5 9RT, UK
5 Institute of Health Informatics, University College London, 222 Euston Rd, London NW1 2DA, UK
* Authors to whom correspondence should be addressed.
Genes 2020, 11(6), 668; https://doi.org/10.3390/genes11060668
Received: 13 May 2020 / Revised: 13 June 2020 / Accepted: 16 June 2020 / Published: 19 June 2020
(This article belongs to the Special Issue Perspectives and Opportunities for ALS in the “Omics” Era)
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease of the upper and lower motor neurons resulting in death from neuromuscular respiratory failure, typically within two to five years of first symptoms. Several rare disruptive gene variants have been associated with ALS and are responsible for about 15% of all cases. Although our knowledge of the genetic landscape of this disease is improving, it remains limited. Machine learning models trained on the available protein–protein interaction and phenotype–genotype association data can use our current knowledge of the disease genetics to predict novel candidate genes. Here, we describe a knowledge-based machine learning method for this purpose. We trained our model on protein–protein interaction data from IntAct, gene function annotation from Gene Ontology, and known disease–gene associations from DisGeNET. Using several sets of known ALS genes from public databases and a manual review as input, we generated a list of new candidate genes for each input set. We investigated the relevance of the predicted genes in ALS by using the available summary statistics from the largest ALS genome-wide association study and by performing functional and phenotype enrichment analysis. The predicted sets were enriched for genes associated with other neurodegenerative diseases known to overlap with ALS genetically and phenotypically, as well as for biological processes associated with the disease. Moreover, using ALS genes from ClinVar and our manual review as input, the predicted sets were enriched for ALS-associated genes (ClinVar p = 0.038 and manual review p = 0.060) when used for gene prioritisation in a genome-wide association study.
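As a rough illustration of what "enriched" means for a predicted gene set, the sketch below computes a one-sided hypergeometric p-value for the overlap between a predicted set and a reference set. This is an assumption made for illustration only: the abstract does not state which enrichment test the authors used, and the function names (`hypergeom_enrichment_p`, `enrichment_from_sets`) and placeholder gene labels are hypothetical, not taken from the article.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, n_known, n_predicted, n_overlap):
    """One-sided hypergeometric tail probability P(X >= n_overlap).

    X is the number of reference ("known") genes found when n_predicted
    genes are drawn without replacement from a universe of universe_size
    genes, n_known of which belong to the reference set.
    """
    if not 0 <= n_known <= universe_size:
        raise ValueError("n_known must lie between 0 and universe_size")
    if not 0 <= n_predicted <= universe_size:
        raise ValueError("n_predicted must lie between 0 and universe_size")

    total = comb(universe_size, n_predicted)  # all possible predicted sets
    lower = max(n_overlap, 0)
    upper = min(n_known, n_predicted)         # largest possible overlap
    tail = sum(
        comb(n_known, i) * comb(universe_size - n_known, n_predicted - i)
        for i in range(lower, upper + 1)
    )
    return tail / total  # exact integer ratio converted to a float


def enrichment_from_sets(universe, predicted, known):
    """Convenience wrapper: restrict both sets to the universe, then test."""
    universe = set(universe)
    predicted = set(predicted) & universe
    known = set(known) & universe
    return hypergeom_enrichment_p(
        len(universe), len(known), len(predicted), len(predicted & known)
    )


if __name__ == "__main__":
    # Placeholder labels only; these are not real genes or real results.
    universe = [f"G{i}" for i in range(10)]
    known = {"G0", "G1", "G2"}
    predicted = {"G0", "G1", "G5", "G6"}
    print(enrichment_from_sets(universe, predicted, known))
```

A small p-value here would indicate that the predicted set overlaps the reference set more than expected by chance under random sampling from the same universe. The choice of universe (for example, all genes present in the knowledge graph) strongly affects the result and is left to the analyst.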
Keywords: gene prioritisation; machine learning; gene discovery; amyotrophic lateral sclerosis; motor neurone disease; knowledge graph

MDPI and ACS Style

Bean, D.M.; Al-Chalabi, A.; Dobson, R.J.B.; Iacoangeli, A. A Knowledge-Based Machine Learning Approach to Gene Prioritisation in Amyotrophic Lateral Sclerosis. Genes 2020, 11, 668.

