Novel Descriptors and Digital Signal Processing-Based Method for Protein Sequence Activity Relationship Study

PEACCEL, Protein Engineering ACCELerator, 6 Square Albin Cachot, BOX 42, 75013 Paris, France
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2019, 20(22), 5640; https://doi.org/10.3390/ijms20225640
Received: 13 October 2019 / Revised: 4 November 2019 / Accepted: 7 November 2019 / Published: 11 November 2019
(This article belongs to the Special Issue Computational Studies of Biomolecules)
Unraveling the correlation between protein sequence and function in the absence of structural information can be highly rewarding. We present a new way of using descriptors from the amino acid index database to model and predict the fitness value of a polypeptide chain. The approach comprises the following steps: (i) calculating Q elementary numerical sequences (Ele_SEQ), each depending on the encoding of the amino acid residues; (ii) determining an extended numerical sequence (Ext_SEQ) by concatenating the Q elementary numerical sequences, wherein at least one elementary numerical sequence is a protein spectrum obtained by applying the fast Fourier transform (FFT); and (iii) predicting a fitness value for polypeptide variants (training and/or validation set). These new descriptors were tested on four sets of proteins of different lengths (GLP-2, TNF alpha, cytochrome P450, and epoxide hydrolase) and activities (cAMP activation, binding affinity, thermostability, and enantioselectivity). We show that using multiple physicochemical descriptors together with the FFT, which takes into account the interactions between amino acid residues within the protein sequence, could lead to a very significant improvement in the quality of models and predictions. The choice of the descriptor, or of the combination of descriptors and/or FFT, depends on the protein/fitness pair. This approach can add value to existing mutant libraries for which screening efforts have so far been unsuccessful in finding improved polypeptide mutants for useful applications.
Keywords: innov’SAR; artificial intelligence; machine learning; protein spectrum; rational screening; digital signal processing; extended sequence; directed evolution
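
Steps (i) to (iii) of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it beyond the three steps themselves is an assumption made for illustration: the function names (`encode`, `protein_spectrum`, `extended_sequence`, `fit_ridge`, `predict`) are hypothetical, the Kyte-Doolittle hydropathy scale stands in for one amino acid index descriptor, the charge scale is a toy one, the spectrum is taken as the amplitude of the FFT of the mean-centred sequence, and closed-form ridge regression stands in for whatever regression model is actually used.

```python
import numpy as np

# Kyte-Doolittle hydropathy values, used here as one example descriptor.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Toy net-charge scale (assumption for illustration, not a database entry).
CHARGE = {aa: 0.0 for aa in HYDROPATHY}
CHARGE.update({"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0})


def encode(sequence, index):
    """Step (i): map each residue to its index value (one Ele_SEQ)."""
    return np.array([index[aa] for aa in sequence], dtype=float)


def protein_spectrum(encoded):
    """Amplitude spectrum of the mean-centred encoded sequence.

    Mean-centring and keeping only the amplitude are assumptions.
    rfft returns the non-redundant half of the FFT of a real signal.
    """
    centred = encoded - encoded.mean()
    return np.abs(np.fft.rfft(centred))


def extended_sequence(sequence, indices, use_fft=True):
    """Step (ii): concatenate Q elementary sequences into one Ext_SEQ."""
    parts = []
    for index in indices:
        ele_seq = encode(sequence, index)
        parts.append(protein_spectrum(ele_seq) if use_fft else ele_seq)
    return np.concatenate(parts)


def fit_ridge(X, y, alpha=1.0):
    """Step (iii), stand-in model: closed-form ridge regression."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    weights = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    intercept = y_mean - x_mean @ weights
    return weights, intercept


def predict(X, weights, intercept):
    """Predict fitness values for rows of extended sequences."""
    return X @ weights + intercept


# Placeholder variants and fitness values, invented purely for the demo.
variants = ["ACDEFGHK", "ACDKFGHK", "ACLEFGHK"]
fitness = np.array([1.0, 2.0, 0.5])

X_train = np.vstack(
    [extended_sequence(v, [HYDROPATHY, CHARGE]) for v in variants]
)
weights, intercept = fit_ridge(X_train, fitness, alpha=1.0)
new_variant = extended_sequence("ACLKFGHK", [HYDROPATHY, CHARGE])
predicted = predict(new_variant[np.newaxis, :], weights, intercept)
```

All variants must have the same length so that their extended sequences line up column by column. Passing `use_fft=False` keeps the raw encoded sequences instead of their spectra, which makes it easy to compare descriptor-only and descriptor-plus-FFT features for a given protein/fitness pair.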
MDPI and ACS Style

Fontaine, N.T.; Cadet, X.F.; Vetrivel, I. Novel Descriptors and Digital Signal Processing-Based Method for Protein Sequence Activity Relationship Study. Int. J. Mol. Sci. 2019, 20, 5640.