Viruses
  • Article
  • Open Access

19 May 2020

Drug Resistance Prediction Using Deep Learning Techniques on HIV-1 Sequence Data

1
Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
2
Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
*
Author to whom correspondence should be addressed.
This article belongs to the Special Issue Bioinformatics and Computational Approaches in Viral Genomics and Evolution

Abstract

The fast replication rate and lack of repair mechanisms of human immunodeficiency virus (HIV) contribute to its high mutation frequency, with some mutations resulting in the evolution of resistance to antiretroviral therapies (ART). As such, studying HIV drug resistance allows for real-time evaluation of evolutionary mechanisms. Characterizing the biological process of drug resistance is also critically important for the sustained effectiveness of ART. The link between “black box” deep learning methods applied to this problem and the evolutionary principles governing drug resistance has been overlooked to date. Here, we used publicly available HIV-1 sequence data and drug resistance assay results for 18 ART drugs to evaluate the performance of three architectures (multilayer perceptron, bidirectional recurrent neural network, and convolutional neural network) for drug resistance prediction, together with a biological analysis. We identified convolutional neural networks as the best-performing architecture and demonstrated a correspondence between the importance of biologically relevant features in the classifier and overall performance. Our results suggest that the high classification performance of deep learning models is indeed dependent on drug resistance mutations (DRMs). These models also heavily weighted several features that are not known DRM locations, indicating the utility of model interpretability for addressing causal relationships in viral genotype-phenotype data.
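
As a rough illustration of how a convolutional architecture can consume sequence data, the sketch below one-hot encodes an amino-acid sequence and applies a single 1D filter followed by global max pooling. It is not the authors' implementation: the abstract does not specify the input encoding, so the one-hot scheme, the alphabet ordering, the function names (`one_hot`, `conv1d_valid`, `global_max_pool`), and the hand-set filter weights are all assumptions made for illustration. A trained model would learn many such filters from labeled assay data.

```python
# Hypothetical sketch: names, encoding, and weights are illustrative assumptions,
# not taken from the article.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids (assumed ordering)


def one_hot(sequence):
    """Encode each residue as a 20-element 0/1 vector; unknown symbols map to all zeros."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence.upper():
        row = [0.0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1.0
        encoded.append(row)
    return encoded


def conv1d_valid(encoded, kernel, bias=0.0):
    """Slide one filter (width x 20 weights) along the sequence and apply ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            for channel, weight in enumerate(kernel[offset]):
                total += weight * encoded[start + offset][channel]
        outputs.append(max(0.0, total))  # ReLU activation
    return outputs


def global_max_pool(activations):
    """Keep the strongest filter response anywhere in the sequence."""
    return max(activations) if activations else 0.0


# Usage: a hand-set width-2 filter that responds to the motif "MV".
motif_filter = [[0.0] * len(AMINO_ACIDS) for _ in range(2)]
motif_filter[0][AMINO_ACIDS.index("M")] = 1.0
motif_filter[1][AMINO_ACIDS.index("V")] = 1.0

activations = conv1d_valid(one_hot("PQMVTL"), motif_filter)
print(activations, global_max_pool(activations))
```

Because each filter weight is tied to a specific residue at a specific offset, inspecting which positions produce the largest activations is one simple way to relate a model's behavior back to sequence positions such as known DRM sites.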
