Sequence–Activity Relationship of Angiotensin-Converting Enzyme Inhibitory Peptides Derived from Food Proteins, Based on a New Deep Learning Model

Dongya Qin; Xiao Liang; Linna Jiao; Ruihong Wang; Yi Zhao; Wenjun Xue; Jinhong Wang; Guizhao Liang

doi:10.3390/foods13223550

,

and

Key Laboratory of Biorheological Science and Technology, Ministry of Education, College of Bioengineering, Chongqing University, Chongqing 400044, China

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Foods2024, 13(22), 3550;https://doi.org/10.3390/foods13223550

This article belongs to the Special Issue Discovery, Functions and Applications of Food-Derived Bioactive Peptides

Version Notes

Order Reprints

Abstract

Food-derived peptides are usually safe natural drug candidates that can potentially inhibit the angiotensin-converting enzyme (ACE). The wet experiments used to identify ACE inhibitory peptides (ACEiPs) are time-consuming and costly, making it important and urgent to reduce the scope of experimental validation through bioinformatics methods. Here, we construct an ACE inhibitory peptide predictor (ACEiPP) using optimized amino acid descriptors (AADs) and long- and short-term memory neural networks. Our results show that combined-AAD models exhibit more efficient feature transformation ability than single-AAD models, especially the training model with the optimal descriptors as the feature inputs, which exhibits the highest predictive ability in the independent test (Acc = 0.9479 and AUC = 0.9876), with a significant performance improvement compared to the existing three predictors. The model can effectively characterize the structure–activity relationship of ACEiPs. By combining the model with database mining, we used ACEiPP to screen four ACEiPs with multiple reported functions. We also used ACEiPP to predict peptides from 21,249 food-derived proteins in the Database of Food-derived Bioactive Peptides (DFBP) and construct a library of potential ACEiPs to facilitate the discovery of new anti-ACE peptides.

Keywords:

angiotensin-converting enzyme inhibitory peptide (ACEiP); food-derived peptide; deep learning; amino acid descriptor

1. Introduction

ACE-inhibitory peptides (ACEiPs) are short peptides with about 2–19 amino acid residues. They can block the conversion of angiotensin I to angiotensin II by inhibiting ACE in the renin-angiotensin-aldosterone system, leading to the suppression of vasoconstriction and a reduction in blood pressure [1,2]. Many ACEiPs have been identified from milk, animal, plant, seafood, and microorganism sources by fermentation, enzymatic hydrolysis, and synthetic methods [3,4,5,6,7,8]. These ACEiPs can inhibit ACE in vitro and even exhibit similar or better blood-pressure-lowering effects than Captopril (an ACE inhibitor for the treatment of hypertension) in animal experiments [9,10,11]. More importantly, some ACEiPs exhibit multiple activities (antihypertensive, antioxidant, antidiabetic, anticancer, etc.) according to the Database of Food-derived Bioactive Peptides (DFBP) [12]. Therefore, ACEiPs have the advantages of high activity, multiple functions, multiple sources, and easy access, especially food-derived multifunctional peptides with ACE-inhibitory activity, which are expected to become candidates for replacing synthetic drugs in the treatment of cardiovascular diseases [2,7,13,14].

The peptide library, consisting of 20 natural amino acids, is extensive, so the experimental identification of bioactive peptides (including ACEiPs) is labor-intensive, time-consuming, and costly [15]. Recently, computational methods have emerged as a new strategy for the large-scale screening of bioactive peptides, including quantitative structure−activity relationship (QSAR), machine learning (ML), and deep learning (DL) techniques [16,17]. In QSAR studies, amino acid descriptors (AADs) are commonly used to characterize the sequence/structure characteristics of peptides [16]. Multiple AADs have been proposed to characterize ACEiPs, such as z-scales [18], HESE [19], and FASGAI [20], showing favorable interpretability regarding the structure–activity relationship of peptides. ML-based methods have shown advantages in the large-scale prediction of peptides by using physicochemical parameters, structural properties, and experimental data as feature inputs to train models and mine the bio-information found in the multi-dimensional data [15]. AHTpin [21] and mAHTPred [22] are currently the only two accessible predictors for antihypertensive peptides (AHTPs). AHTpin uses amino acid and atomic composition features to develop multiple classification models for small peptides, medium peptides, and large peptides, among which the model has achieved the highest accuracy of 84.21% for large peptides. mAHTPred is a state-of-the-art AHTP classifier that combines an extremely randomized tree and 51 feature descriptors derived from eight different feature encodings, achieving the highest accuracy of 88.3% compared to six different ML algorithms. DL-based methods can use diverse input data, especially when processing images, and natural language to accurately capture the sequence/structure characteristics of bioactive peptides. With the exception of one off-line model (ProtBERT) [23], the DL technique has not yet been applied in the study of ACEiPs, but it has already been used in the study of several other bioactive peptides, such as sAMP-PFPDeep (an antibacterial peptide predictor based on a convolutional neural network) [24], iACP-DRLF (an anticancer peptide predictor based on long short-term memory (LSTM) neural networks and their derived networks) [25], and a multifunctional peptide predictor based on a convolutional neural network [26].

These computational methods have made important contributions to the screening of bioactive peptides. However, there is still insufficient research on the prediction of ACEiPs, and the possible reasons for this are as follows: (i) QSAR is currently used mainly for small-sample regression, while feature extraction is difficult when the lengths of ACEiPs are inconsistent; (ii) ML-based AHTP predictors can be used to screen ACEiPs, but the prediction accuracy (of less than 88.3% [22]) still needs to be improved, and the ML-based model usually requires a specific type and length of data as input parameters and is less interpretable; and (iii) no DL-based prediction server for ACEiPs is yet available. LSTM is a particular recurrent neural network for processing natural sequence data with different lengths, avoiding gradient disappearance and explosion problems during long-sequence training [27,28]. LSTM can memorize the important transfer information between sequences and forget non-important information by controlling the transmission state through three gated states (the input, forget, and output gates) [27]. Therefore, combining the learning ability of an LSTM neural network and the interpretability of AADs may be an efficient strategy to improve the prediction accuracy and explain the sequence/structure characteristics of ACEiPs.

To address the above challenges, we combined an LSTM neural network and AAD encodings to train an optimal model and explain the main sequence/structure characteristics of ACEiPs, thereby developing a predictor named ACEiPP (http://www.cqudfbp.net/ACEiPP/index.jsp, accessed on 4 November 2024) to predict potential ACEiPs (Figure 1). We used ACEiPP to screen four ACEiPs with multiple activities and constructed a library of 1,320,635 potential ACEiPs. To our knowledge, ACEiPP is the first DL-based online predictor for the high-efficiency prediction and screening of ACEiPs, reducing the significant time and resources required for early-stage screening and providing a strong foundation for subsequent experimental validation.

Figure 1. Workflow of ACEiPP: (1) collect the sequences of ACEiPs and non- ACEiPs; (2) translate the sequences into feature vectors by single and combined AADs; (3) train the LSTM models and learn the key features of peptides; (4) characterize the sequence/structure characteristics of peptides; (5) deploy the optimal model in the webserver to access the online prediction; (6) screen the ACEiPs with reported multiple activities from the DFBP [12]; and (7) predict the potential ACEiPs encoded in food-derived proteins.

2. Materials and Methods

2.1. Benchmark and Independent Dataset

One benchmark dataset and two independent datasets were used for training the predictor and evaluating its prediction performance (Supplementary Table S1). After deleting reduplicate or controversial sequences with inconsistent experimental results, 1043 ACEiPs with IC₅₀ values of below 1000 μM were selected as positive samples from three databases: the DFBP (http://www.cqudfbp.net/, accessed on 4 November 2024), BIOPEP-UWM (https://biochemia.uwm.edu.pl/biopep-uwm/, accessed on 4 November 2024) and AHTPDB (http://crdd.osdd.net/raghava/ahtpdb/, accessed on 4 November 2024), along with the related literature. It is worth mentioning that to ensure the model can effectively learn and extract the sequence features of ACE-inhibiting peptides, we ultimately chose 1000 μM as the sample classification threshold after numerous practical attempts and comprehensive consideration. This value allows us to maintain the reliability of the experimental data while avoiding the misclassification of peptides that still possess moderate inhibitory activity as being negative samples. To maintain the balance of the number and length distribution of positive and negative samples, a manually written Java program and CD-HIT [29] were used to generate random sequences and remove the sequences with a greater than 90% similarity with positive samples. In total, 1043 peptides were randomly selected as negative samples to balance the number of positive samples. Both positive and negative samples were split according to a 7:3 ratio to construct a benchmark dataset (benchmark_ACEiPs) and an independent dataset (independent_ACEiPs) [21]. The second independent dataset (saved as independent_AHTPs.txt) with a sequence length of ≥5 was downloaded from mAHTPred (http://thegleelab.org/mAHTPred/, accessed on 4 November 2024), and contained 386 AHTPs and 386 non-AHTPs [21].

2.2. Sequence Representation

In this step, 22 kinds of AADs were used to translate peptide sequences into feature vectors as inputs of neural networks, where One-hot and the other 21 kinds of descriptors described the types and physicochemical attributes of natural amino acids, respectively (Supplementary Table S2). Two encoding strategies were used, as follows: (i) single AADs were used to train the models using 22 independent AAD encoding matrices individually; and (ii) combined AADs were used to combine all the single AADs into a 20 × 149 matrix (CodeSet22), using “Leave-Group-Out” to sequentially delete single AADs from CodeSet22 to optimize the best combination coding.

2.3. LSTM Architecture

In this study, an LSTM neural network in a Deeplearning 4j (version 1.0.0-M1.1) framework (DL4J: https://deeplearning4j.org/, accessed on 4 November 2024) was used to train the models. As shown in Figure 2, four layers were set up. The input length of the first LSTM layer was equal to the encoding length of the AADs. The first layer’s activation function was “tanh”, the dropout value was 0.4, and the output length was 128. The input and output lengths of LSTM layers 2–3 were 128, the activation functions were all “tanh”, and the dropout values were 0.4 and 0.25, respectively. The input and output lengths of the output layer were 128 and 2, respectively, and the activation function was “sigmoid”.

Figure 2. Flowchart of LSTM architecture. Four layers were set up, including three LSTM layers and one RnnOutput layer.

2.4. Evaluation of Performance

Five evaluation indices were used to evaluate the prediction performance of the proposed models, including sensitivity (Sn), specificity (Sp), accuracy (Acc), Matthews’s correlation coefficient (MCC), and the area under the receiver operating characteristic curves (AUC). These indices were defined as follows:

S n = \frac{T P}{T P + F N}

(1)

S p = \frac{T N}{T N + F P}

(2)

A c c = \frac{T P + T N}{T P + T N + F P + F N}

(3)

M C C = \frac{T P \times T N - F P \times F N}{\sqrt{(T P + F P) (T P + F N) (T N + F P) (T N + F N)}}

(4)

Additionally, we used 0 to mask the single or multiple target parameters for all samples, then evaluated the importance score of features using the following formula:

Importance score = \frac{F P + F N}{T P + T N + F P + F N}

(5)

where TP, FN, FP, and TN denote the numbers of correctly predicted ACEiPs, misclassified non-ACEiPs, misclassified ACEiPs, and correctly predicted non-ACEiPs, respectively.

2.5. Webserver Construction

ACEiPP was deployed on the Tencent Cloud Server (Windows Server 2012 R2 Standard Edition 64-bit) through Java web technology. The web directory was designed using HTML, JSP, and JavaScript. MySQL (version 5.7.16) was used for data storage and invoked by manually written Java programs to analyze the data. EChart (https://echarts.apache.org/en/index.html, accessed on 4 November 2024) technology was used for data visualization. ACEiPP can provide multiple services such as prediction, screening, and the structure–activity exploration of ACEiPs.

2.6. Theoretical Screening of Food-Derived ACEiPs

For this study, we used two different peptide libraries to screen potential ACEiPs.

A total of 4506 non-repetitive sequences of 31 kinds of food-derived bioactive peptides were derived from the DFBP [12], and a total of 2545 bioactive peptides with unknown ACE-inhibitory activity (e.g., antioxidant, DPP IV-inhibitory, antithrombotic, and anticancer activities) were obtained by excluding 1961 reported ACEiPs. ACEiPP was used to identify potential ACEiPs from these 2545 peptides.

The sequences of 21,249 food-derived proteins were retrieved from the DFBP, and the THP-Tool (http://www.cqudfbp.net/enzymes/hydrolysis_tools/dataInput.jsp, accessed on 4 November 2024) was used for the virtual hydrolysis of these proteins [12]. It is important to note that the virtual hydrolysis process assumes that hydrolysis occurs under optimal conditions, including ideal pH and temperature, to reflect the maximum enzyme activity. Six typical protein hydrolases, including chymotrypsin (EC 3.4.21.1), papain (EC 3.4.22.2), proteinase K (EC 3.4.21.64), thermolysin (EC 3.4.24.27), pepsin (EC 3.4.23.1), and trypsin (EC 3.4.21.4), were used to generate a large number of peptides, and the resulting peptides were predicted by ACEiPP to obtain potential food-derived ACEiPs.

2.7. Molecular Docking

The 3D crystal structure of human ACE, complexed with Lisinopril (PDB ID: 1O86.pdb), was retrieved from the Protein Data Bank (http://www1.rcsb.org/, accessed on 4 November 2024). The 3D structures of the peptides were manually constructed using Discovery Studio 2019 (Dassault Systèmes, BIOVIA, San Diego, CA, USA) and optimized for energy minimization using the “CHARMm 46b1” force field. The ACE’s active pocket was generated based on the binding site of the original ligand Lisinopril, and the semi-flexible docking of the ACE (rigid) and peptide (flexible) was performed using the “CDOCKER” module. The number of interaction poses that were generated per complex was set to 20, and the other parameters were used as default values. The first, top docking result of “-C DOCKER interaction energy” was chosen as the optimal binding of the peptide and ACE.

2.8. Determination of ACE Inhibitory Activity

The ACE inhibitory activity assay was performed according to the method reported by Lahogue and Cushman [30,31], with appropriate modifications. First, 10 μL of substrate (HHL, 5 mM, Sigma-Aldrich Co., Shanghai, China) was mixed with 100 μL of sample or sodium borate buffer (pH = 8.3, 0.1 M boric acid, 0.3 M NaCl, Sigma-Aldrich Co., Shanghai, China), and the reaction was initiated by the addition of 20 μL of ACE solution (0.1 U/L, Sigma-Aldrich Co., Shanghai, China) in a warm bath at 37 °C for 5 min. The samples, substrate HHL, and ACE solutions were prepared from sodium borate buffer. The peak areas of the mixtures were measured at 228 nm using an HPLC system (LC-20A, Shimadzu Co., Kyoto, Japan) and an InertSustain AQ-C18 column (4.6 × 250 mm, 5 μm particle size). The conditions of the HPLC analysis were as follows: mobile phase A (ultrapure water with 0.05% TFA, Merck & Co., Inc., Darmstadt, Germany) and mobile phase B (ACN, Merck & Co., Inc., Darmstadt, Germany); the volume ratio was 25:75, with isocratic elution at a flow rate of 1 mL/min for 11 min; the column temperature was 30 °C, and the injection volume was 10 μL. The ACE inhibition (%) was calculated according to the following formula:

ACE inhibitory rate (%) = \frac{A_{b l a n k} - A_{s a m p l e}}{A_{b l a n k}} \times 100 %

(6)

where A_blank and A_sample represent the peak areas of the hippuric acid chromatographic peaks in the blank (sodium borate buffer) and in the sample, respectively. IC₅₀ refers to the concentration value of the corresponding peptide sample (μM) when a 50% inhibition of ACE was achieved, and the value was calculated based on the regression curves of each component.

3. Results and Discussion

3.1. Performance of Single Coding Models

Twenty-two single AADs were used to convert the peptide sequences into matrices containing the sequence/structure characteristics as the input parameters of neural networks to train the LSTM models. As shown in the five-fold cross-validation conducted on the benchmark dataset (Supplementary Table S3), 18 models exhibited higher predictive performance, with Acc values of 0.7705–0.9308, MCC values of 0.5458–0.8629, and AUC values of 0.8653–0.9773, compared with the remaining four models (ISA-ECI, Lin’s scales, MS-WHIM, and VSTV), which were trained with three or two features. Moreover, 16 models (number of features: 6–20) exhibited high Acc values (0.8348–0.9211) on the test set (independent_ACEiPs), which contained 313 ACEiPs and 313 non-ACEiPs (Supplementary Table S4).

Thus, the LSTM-based models could effectively distinguish the dependencies between residue sites in unknown peptide sequences and accurately identify the key features of ACEiPs and non-ACEiPs. The models that learned 10, or more than 10, features generally showed higher predictive performance than the models based on fewer than 10 features. Each of the AADs translated different types and physicochemical attribute characteristics of residues encoded in peptide sequences, so multi-feature inputs could prompt LSTM models to reasonably learn the sequence/structure characteristics of peptides. Therefore, it is feasible to use single ADDs to extract the features in order to characterize peptide sequences and feed them into LSTM models. However, we must face a new challenge: optimizing and screening new sequence/structure characteristics with high efficiency but low redundancy.

3.2. Performance of Optimized Coding Models

To further optimize a set of high-efficiency AADs with low redundancy, we restructured all single AADs into combined AADs as inputs of the LSTM model and evaluated the predictive performance of the trained models. First, the 22 single AADs were integrated into CodeSet22 to evaluate the Acc value of the obtained model. Then, “Leave-Group-Out” was used to reduce feature redundancy by gradually removing one group of single AADs from CodeSet22 in turn, and the optimal feature matrices for each round of training would be saved to perform the next optimization. The five-fold cross-validation scores (Supplementary Table S5) and independent test scores (Supplementary Table S6) indicated that the restructured combined AADs have more stable feature characterization ability than single AADs. As shown in Figure 3, the models based on combined AADs have the following key characteristics relative to the models based on single AADs. (i) Except for the model based on CodeSet22, the optimal models trained with combined AADs generated higher Acc values (from 0.9342 to 0.9489) than the model based on single AADs, even SubSet3-1 (combined with FASGAI and the ST scale; see Supplementary Table S6) with only 14 features generated higher Acc than all single AADs. (ii) With the increasing number of features from 2 to 37, the Acc values gradually improved, indicating that the effective features were continuously accumulated. (iii) When the number of features was added up to 37, SubSet7-5 (named VVSFZL37, composed of the FASGAI, Lin’s scales, ST scales, VHSE, VSW, and Z-scales) achieved a critical point whereby a set of non-redundant features were extracted from the total feature set (CodeSet22 with 179 features). The obtained model exhibited an Acc value of 0.9479, with an enhancement of 2.68% compared to the One-hot model with the highest Acc value in all the models by single AADs. (iv) With the increasing number of features (more than 37), the Acc values basically stabilized above 0.9431, revealing that the redundancy of features began to gradually increase. This not only made the variable weights increase ineffectively but also added to the difficulty of training; even the Acc values rapidly dropped to 0.9147 when the number of features accumulated to 179. Therefore, VVSFZL37 was the most efficient coding set for translating peptide sequences into high-efficiency features, which were used to train the LSTM model to acquire the predictor of ACEiPs.

Figure 3. The predicted Acc on the independent dataset (independent_ACEiPs), based on different AADs: (A) 22 single AADs (Supplementary Table S4); (B) 22 optimized AADs (Supplementary Table S5). CodeSet22 is a total feature matrix consisting of 22 single AADs. Each “SubSet” represents the selected combined AADs gradually screened out from CodeSet22 by the “Leave-Group-Out” method. For example, SubSet22-16 represents the optimized encodings obtained by removing the 16th single AAD from CodeSet22 (see Supplementary Table S6 for the detailed screening process). The numbers in parentheses represent the number of features.

3.3. Comparison with the Existing Predictors

To evaluate the generalization ability of ACEiPP with the existing predictors (i.e., mAHTPred, AHTpin_AAC, and AHTpin_ATC), we carried out comprehensive comparisons on three independent datasets. The evaluation indices on the independent_ACEiPs test set (Table 1) indicate that ACEiPP (Acc = 0.9479, MCC = 0.8959) achieved higher prediction performance than the three existing models. To be specific, ACEiPP showed the best prediction ability with an Acc value of 0.9479, while the Acc values of mAHTPred, AHTpin_AAC, and AHTpin_ATC were only 0.8492, 0.7706 (average), and 0.756 (average), respectively.

Table 1. Performance comparison of ACEiPP and the three existing predictors on independent_ACEiPs and independent_AHTPs.

We then evaluated the predictive power of each model for AHTPs on the independent_AHTPs test set (Table 1). The Acc values of ACEiPP, mAHTPred, AHTpin_AAC, and AHTpin_ATC were 0.8303, 0.8834, 0.8061 (average), and 0.8204 (average), respectively, showing that ACEiPP has better prediction ability than AHTpin_ACC and AHTpin_ATC, but has poorer prediction ability than mAHTPred for AHTPs. As we know, AHTPs involve multi-type target protein inhibitors [32,33], which can be seen in the dataset independent_AHTPs, including ACEiPs and DPP IV-inhibitory, antioxidant, and antihypertensive peptides [22]. Nevertheless, ACEiPP was trained on the ACEiPs and non-ACEiPs benchmark dataset, meaning that ACEiPP only learned the features of ACEiPs but was unable to effectively identify the features of the whole AHTP samples, thereby affecting the prediction performance for AHTPs.

3.4. Key Sequence/Structure Characteristics Learned by ACEiPP

To explore the key characteristics learned by ACEiPP, we interpreted the physicochemical meaning of VVSFZL37 (Supplementary Table S7). As shown in Figure 4A, VVSFZL37 consists of 37 parameters derived from six sets of AADs (VSW, VHSE, ST-scales, Z-scales, FASGAI, and Lin’s scales), showing higher predictive power than any single AADs. All 37 parameters made important contributions to model performance, especially the top ten parameters, involving electronic properties, hydrophobicity, molecular composition and structure, and van der Waal volume, which were important residue characteristics of ACEiPs (Figure 4B). To further compare the overall contribution of the same/similar parameters to the model, we divided these 37 parameters into five patterns according to their physicochemical meanings, namely, geometrical, electronic, hydrophobic, steric, and composition characteristics, consisting of 12, 11, 6, 4, and 4 parameters, respectively (Figure 4C, Supplementary Table S8). Their importance scores are ordered as follows: Geometrical > Electronic > Hydrophobic > Steric > Composition (Figure 4D).

Figure 4. Feature composition and importance scores of VVSFZL37. (A) Six kinds of single AADs are contained in VVSFZL37, along with their physicochemical properties; (B) importance scores for 37 single features; (C) five patterns, determined by the physicochemical meaning; (D) importance scores for five patterns of features; (E) sequence characteristics of ACEiPs. ACEiPs are mainly oligopeptides (2–10 residues, molecular weight ≤ 1000 Da), composed of hydrophobic amino acids (Pro, Leu, Val, Gly, and Ala) and aromatic amino acids (Tyr); the C-terminus tends to be hydrophobic (Pro and Phe), positively charged (Arg and Lys), and bulky aromatic (Tyr) amino acids, while the N-terminus tends to be hydrophobic amino acids (Leu, Val, Ala, Gly, and Ile).

The five patterns of physicochemical attributes encoded by VVSFZL37 can also be intuitively observed in the sequence features of ACEiPs (Figure 4E). Several reports have indicated that hydrophobic, aromatic/aliphatic, C- and N-terminal, residue composition, and charged amino acids are indeed the key factors affecting the activities of ACEiPs. They not only directly determine the molecular geometry, overall hydrophobicity, side-chain functional groups, and charged characteristics of ACEiPs but also indirectly affect the second structure, steric structure, and molecular compositional characteristics of ACEiPs [2,3,17] (see Supplementary Note S1 for details). These studies strongly supported our conclusion that these five patterns of physicochemical attributes are indeed important factors in forming ACEiPs. Therefore, VVSFZL37 constructed plausible multiple properties and a low redundancy matrix to characterize various sequence/structure characteristics that are closely related to the activity of ACEiPs, thereby improving the robustness and accuracy of the LSTM model, rather than all single AADs.

3.5. Web Server Implementation

We built the ACEiPP server (http://www.cqudfbp.net/ACEiPP/index.jsp, accessed on 4 November 2024) with the optimal LSTM model, trained by VVSFZL37 to predict, design, and screen ACEiPs. ACEiPP comprises six modules, including Home, Pre-ACEiPs, Seq-Features, Pre-Libraries, Help, and Contact. As the first DL-based prediction platform for ACEiPs, ACEiPP can perform multiple applications such as prediction, residue-based mutation screening, structure–activity explorations of new ACEiPs, and the discovery of potential ACEiPs.

3.6. Characterization and Screening of Food-Derived ACEiPs with Multi-Activities

By selecting food-derived dipeptides or tripeptides in the DFBP that have been reported to have specific physiological functions but have not been reported to have anti-ACE inhibitory effects, we used ACEiPP prediction, combined with bioinformatics and molecular docking screening, to obtain bioactive peptides with anti-ACE effects. We used ACEiPP to predict the ACE inhibitory activity of 2346 peptides and obtained 1358 potential ACEiPs (probability threshold > 0.5). Based on Lipinski’s rule of five, combined with TPSA, water solubility, GI, P-gp substrate, BBBP, CYP enzyme inhibition, and toxicity predictions, we obtained 37 food-derived dipeptides and tripeptides (Supplementary Table S9).

The docking results of 37 peptides with the target protein ACE showed that 19 of the peptide-ACE scores were higher than Lisinopril (83.7945 kcal/mol) (Supplementary Table S10), implying that these peptides may have a strong inhibitory effect on ACE (the type and number of non-bonding forces of the eight highest-scoring peptides with ACE are shown in Supplementary Table S11). These eight peptides have a similar binding mode to Lisinopril (Supplementary Figure S1), i.e., they bind primarily through hydrogen bonds (including a conventional hydrogen bond, salt bridge, carbon–hydrogen bond, and Pi-donor hydrogen bond) and non-hydrogen bonds (Pi-Cation, attractive charge, and Pi-Alkyl bonds) to promote inhibitory peptide interactions with S1 (Ala354, Glu384, and Tyr523), S2 (His353 and His513), S1’ (Glu162), and the Zn²⁺-binding domain (His383) of the ACE. In addition, these eight peptides can also form the same attractive charge or metal acceptor with metal ions (Zn701) as Lisinopril, thus hindering the activating effect of Zn701 on ACE.

The experimental effects of eight peptides on ACE inhibition (Table 2) shows that four peptides, namely, LAF, LLL, GLF, and LIV, exhibit relatively high ACE inhibitory activities, with IC₅₀ values of 4.35 μM, 17.99 μM, 270.93 μM, and 330.75 μM, respectively. By reviewing the DFBP, we found that these four peptides also possess 1 (antioxidative), 3 (antioxidative, DPP IV-inhibitory, and stimulating), 3 (antimicrobial, immunomodulatory, and regulating) and 1 (antioxidative) reported biological activities, respectively. The other four peptides (LAL, AVL, LE, and VLV) show low/no ACE inhibitory activity (IC₅₀ > 1000 μM).

Table 2. The ACE inhibitory activity and multi-bioactivities of eight screened peptides.

3.7. Future Development Trends for ACEiPs

We used ACEiPP to predict 1,320,635 potential ACEiPs that are encoded in food-derived proteins (http://www.cqudfbp.net/ACEiPP/prePool/dataPool.jsp, accessed on 4 November 2024), and further compared the sequence features between the experimental ACEiPs (IC₅₀ < 1000 μM; 1043 unique sequences) and the predicted potential ACEiPs. The experimentally discovered ACEiPs are concentrated in the tripeptides to octapeptides (mainly tripeptides and pentapeptides), whereas the predicted ACEiPs were primarily concentrated in the pentapeptides to octapeptides, with the most being discovered for hexapeptides (Supplementary Figure S2). This difference may be attributed to the limitations of the actual enzymatic hydrolysis conditions and the number of reports. Prospectively, the main distribution area of ACEiPs may be concentrated in the range of pentapeptides to octapeptides, especially hexapeptides, providing an important reference for the screening of ACEiPs in the future. This is because small peptides (2–4 residues) can access the active pocket of ACE but cannot form enough hydrogen bonds, while large peptides (≥9 residues) cannot enter the active pocket easily, and it may be hard to change the catalytic activity of ACE. Medium peptides (5–8 residues) might be the most suitable, which is partly due to the contribution of hydrogen bonds to their affinity [34,35]. Furthermore, the predicted ACEiPs have similar residue composition and two-terminal features to experimental ACEiPs (Supplementary Figure S2). Therefore, we not only demonstrated the feature consistency between predicted and experimental ACEiPs but also answered two open questions in ACEiP research: (i) the relationship between peptide chain length and ACE-inhibitory activity; (ii) the optimal amino acid profiles of ACEiPs that are derived from food proteins [20].

4. Conclusions

For this study, we developed a predictor ACEiPP using a DL-based model based on natural sequences processing by LSTMs and used the interpretability of AADs to predict, design, and screen ACEiPs. ACEiPP effectively learned the key sequence/structure characteristics of ACEiPs for transformation by VVSFZL37 and shows improved prediction performance compared with the existing three models. The optimal sequence/configuration profiles affecting the ACE inhibitory activity of the peptides were mapped. We used ACEiPP to acquire four ACEiPs, namely, LAF, LLL, GLF, and LIV, with other known bioactivities (hypertension, diabetes, hyperlipidemia, etc.), indicating that ACEiPP can facilitate the efficient discovery of bioactive peptides. In the future, our method may be applied to other DL-based studies on peptides for transfer learning.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/foods13223550/s1 [17,18,19,20,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58], Figure S1: 2D-diagram of the interaction between eight pre-MBPs and ACE; Figure S2: Hydrolyzed peptide prediction (1,320,635 unique sequences, probability ≥ 0.5) and their sequence feature comparison with experimental ACEiPs (1043 unique sequences); Table S1: Benchmark and independent datasets; Table S2: The list of 22 AAD encodings and their parameter meanings collected from literature; Table S3: Five-fold cross-validation metrics comparison for 22 single-AADs on the benchmark dataset (benchmark_ACEiPs.txt); Table S4: The average performance evaluation scores for 22 single-AADs on the independent dataset (independent_ACEiPs.txt); Table S5: The average performance evaluation scores for combined-AADs on the benchmark dataset (benchmark_ACEiPs.txt); Table S6: The average performance evaluation scores for combined-AADs on the independent dataset; Table S7: The main coding features of VVSFZL37 consisting of six single-AADs; Table S8: Five classifications with similar/same characteristics in VVSFZL37; Table S9: Potential peptides screened based on ADMET; Table S10: The semi-flexible docking results of ACE receptor (rigid) with the different ligands (flexible); Table S11: Interaction of ACE receptors with eight peptides.

Author Contributions

D.Q. and X.L. contributed equally to this work. D.Q.: data curation, methodology, formal analysis, and writing—original draft. X.L.: Data curation, methodology, formal analysis, and writing—original draft. L.J.: Data curation, methodology, and validation. R.W.: Investigation and visualization. Y.Z.: Resources and validation. W.X.: Investigation and visualization. J.W.: Investigation and visualization. G.L.: Conceptualization, methodology, funding acquisition, project administration, writing—review and editing, and supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 32172196).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Materials; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Abachi, S.; Bazinet, L.; Beaulieu, L. Antihypertensive and angiotensin-I-converting enzyme (ACE)-inhibitory peptides from fish as potential cardioprotective compounds. Mar. Drugs. 2019, 17, 613. [Google Scholar] [CrossRef] [PubMed]
Martin, M.; Deussen, A. Effects of natural peptides from food proteins on angiotensin converting enzyme activity and hypertension. Crit. Rev. Food Sci. Nutr. 2019, 59, 1264–1283. [Google Scholar] [CrossRef] [PubMed]
Liu, Y.Q.; Strappe, P.; Shang, W.T.; Zhou, Z.K. Functional peptides derived from rice bran proteins. Crit. Rev. Food Sci. Nutr. 2019, 59, 349–356. [Google Scholar] [CrossRef] [PubMed]
Maleki, S.; Razavi, S.H. Pulses’ germination and fermentation: Two bioprocessing against hypertension by releasing ACE inhibitory peptides. Crit. Rev. Food Sci. Nutr. 2021, 61, 2876–2893. [Google Scholar] [CrossRef] [PubMed]
Lee, S.Y.; Hur, S.J. Antihypertensive peptides from animal products, marine organisms, and plants. Food Chem. 2017, 228, 506–517. [Google Scholar] [CrossRef]
Aluko, R.E. Antihypertensive peptides from food proteins. Annu. Rev. Food Sci. Technol. 2015, 6, 235–262. [Google Scholar] [CrossRef]
Manikkam, V.; Vasiljevic, T.; Donkor, O.N.; Mathai, M.L. A review of potential marine-derived hypotensive and anti-obesity peptides. Crit. Rev. Food Sci. Nutr. 2016, 56, 92–112. [Google Scholar] [CrossRef]
Morales, D.; Miguel, M.; Garcés-Rimón, M. Pseudocereals: A novel source of biologically active peptides. Crit. Rev. Food Sci. Nutr. 2021, 61, 1537–1544. [Google Scholar] [CrossRef]
Su, Y.; Chen, S.; Cai, S.; Liu, S.; Pan, N.; Su, J.; Qiao, K.; Xu, M.; Chen, B.; Yang, S.; et al. A novel angiotensin-I-converting enzyme (ACE) inhibitory peptide from Takifugu flavidus. Mar. Drugs. 2021, 19, 651. [Google Scholar] [CrossRef]
Hyoung Lee, D.; Ho Kim, J.; Sik Park, J.; Jun Choi, Y.; Soo Lee, J. Isolation and characterization of a novel angiotensin I-converting enzyme inhibitory peptide derived from the edible mushroom Tricholoma giganteum. Peptides 2004, 25, 621–627. [Google Scholar] [CrossRef]
Wang, J.; Wang, G.; Zhang, Y.; Zhang, R.; Zhang, Y. Novel angiotensin-converting enzyme inhibitory peptides identified from walnut glutelin-1 hydrolysates: Molecular interaction, stability, and antihypertensive effects. Nutrients 2021, 14, 151. [Google Scholar] [CrossRef] [PubMed]
Qin, D.; Bo, W.; Zheng, X.; Hao, Y.; Li, B.; Zheng, J.; Liang, G. DFBP: A comprehensive database of food-derived bioactive peptides for peptidomics research. Bioinformatics 2022, 38, 3275–3280. [Google Scholar] [CrossRef] [PubMed]
Lammi, C.; Aiello, G.; Boschin, G.; Arnoldi, A. Multifunctional peptides for the prevention of cardiovascular disease: A new concept in the area of bioactive food-derived peptides. J. Funct. Foods 2019, 55, 135–145. [Google Scholar] [CrossRef]
Wu, Q.; Luo, F.; Wang, X.L.; Lin, Q.; Liu, G.Q. Angiotensin I-converting enzyme inhibitory peptide: An emerging candidate for vascular dysfunction therapy. Crit. Rev. Biotechnol. 2022, 42, 736–755. [Google Scholar] [CrossRef]
Rifaioglu, A.S.; Atas, H.; Martin, M.J.; Cetin-Atalay, R.; Atalay, V.; Doğan, T. Recent applications of deep learning and machine intelligence on in silico drug discovery: Methods, tools and databases. Brief. Bioinform. 2019, 20, 1878–1912. [Google Scholar] [CrossRef]
Bo, W.; Chen, L.; Qin, D.; Geng, S.; Li, J.; Mei, H.; Li, B.; Liang, G. Application of quantitative structure-activity relationship to food-derived peptides: Methods, situations, challenges and prospects. Trends Food Sci. Technol. 2021, 114, 176–188. [Google Scholar] [CrossRef]
Lu, X.; Qiu, Z.; Zhao, R.; Zheng, Z.; Qiao, X. Advancement and prospects of production, transport, functional activity and structure-activity relationship of food-derived angiotensin converting enzyme (ACE) inhibitory peptides. Crit. Rev. Food Sci. Nutr. 2023, 63, 1437–1463. [Google Scholar]
Hellberg, S.; Sjostrom, M.; Skagerberg, B.; Wold, S. Peptide quantitative structure-activity relationships, a multivariate approach. J Med Chem. 1987, 30, 1126–1135. [Google Scholar] [CrossRef]
Shu, M.; Mei, H.; Yang, S.; Liao, L.; Li, Z. Structural parameter characterization and bioactivity simulation based on peptide sequence. QSAR Comb. Sci. 2009, 28, 27–35. [Google Scholar] [CrossRef]
Liang, G.; Li, Z. Factor analysis scale of generalized amino acid Information as the source of a new set of descriptors for elucidating the structure and activity relationships of cationic antimicrobial peptides. QSAR Comb. Sci. 2007, 26, 754–763. [Google Scholar] [CrossRef]
Kumar, R.; Chaudhary, K.; Singh Chauhan, J.; Nagpal, G.; Kumar, R.; Sharma, M.; Raghava, G.P. An in silico platform for predicting, screening and designing of antihypertensive peptides. Sci. Rep. 2015, 5, 12512. [Google Scholar] [CrossRef] [PubMed]
Manavalan, B.; Basith, S.; Shin, T.H.; Wei, L.; Lee, G. mAHTPred: A sequence-based meta-predictor for improving the prediction of anti-hypertensive peptides using effective feature representation. Bioinformatics 2019, 35, 2757–2765. [Google Scholar] [CrossRef] [PubMed]
Zhang, Y.; Dai, Z.; Zhao, X.; Chen, C.; Li, S.; Meng, Y.; Suonan, Z.; Sun, Y.; Shen, Q.; Wang, L.; et al. Deep learning drives efficient discovery of novel antihypertensive peptides from soybean protein isolate. Food Chem. 2023, 404, 134690. [Google Scholar] [CrossRef]
Hussain, W. sAMP-PFPDeep: Improving accuracy of short antimicrobial peptides prediction using three different sequence encodings and deep neural networks. Brief. Bioinform. 2022, 23, bbab487. [Google Scholar] [CrossRef]
Lv, Z.; Cui, F.; Zou, Q.; Zhang, L.; Xu, L. Anticancer peptides prediction with deep representation learning features. Brief. Bioinform. 2021, 22, bbab008. [Google Scholar] [CrossRef]
Tang, W.; Dai, R.; Yan, W.; Zhang, W.; Bin, Y.; Xia, E.; Xia, J. Identifying multi-functional bioactive peptide functions using multi-label deep learning. Brief. Bioinform. 2022, 23, bbab414. [Google Scholar] [CrossRef]
Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
Cheng, Y.; Gong, Y.; Liu, Y.; Song, B.; Zou, Q. Molecular design in drug discovery: A comprehensive review of deep generative models. Brief. Bioinform. 2021, 22, bbab344. [Google Scholar] [CrossRef]
Fu, L.; Niu, B.; Zhu, Z.; Wu, S.; Li, W. CD-HIT: Accelerated for clustering the next-generation sequencing data. Bioinformatics 2012, 28, 3150–3152. [Google Scholar] [CrossRef]
Lahogue, V.; Réhel, K.; Taupin, L.; Haras, D.; Allaume, P. A HPLC-UV method for the determination of angiotensin I-converting enzyme (ACE) inhibitory activity. Food Chem. 2010, 118, 870–875. [Google Scholar] [CrossRef]
Cushman, D.W.; Cheung, H.S. Spectrophotometric assay and properties of the angiotensin-converting enzyme of rabbit lung. Biochem. Pharmacol. 1971, 20, 1637–1648. [Google Scholar] [CrossRef] [PubMed]
UG, Y.; Bhat, I.; Karunasagar, I.; BS, M. Antihypertensive activity of fish protein hydrolysates and its peptides. Crit. Rev. Food Sci. Nutr. 2019, 59, 2363–2374. [Google Scholar]
Kaur, A.; Kehinde, B.A.; Sharma, P.; Sharma, D.; Kaur, S. Recently isolated food-derived antihypertensive hydrolysates and peptides: A review. Food Chem. 2021, 346, 128719. [Google Scholar] [CrossRef] [PubMed]
Lin, K.; Zhang, L.-W.; Han, X.; Cheng, D.-Y. Novel angiotensin I-converting enzyme inhibitory peptides from protease hydrolysates of Qula casein: Quantitative structure-activity relationship modeling and molecular docking study. J. Funct. Foods. 2017, 32, 266–277. [Google Scholar] [CrossRef]
Lourenço da Costa, E.; Antonio da Rocha Gontijo, J.; Netto, F.M. Effect of heat and enzymatic treatment on the antihypertensive activity of whey protein hydrolysates. Int. Dairy J. 2007, 17, 632–640. [Google Scholar] [CrossRef]
Wu, J.; Aluko, R.E.; Nakai, S. Structural requirements of Angiotensin I-converting enzyme inhibitory peptides: Quantitative structure-activity relationship study of di- and tripeptides. J. Agric. Food Chem. 2006, 54, 732–738. [Google Scholar] [CrossRef]
Sagardia, I.; Roa-Ureta, R.H.; Bald, C. A new QSAR model, for angiotensin I-converting enzyme inhibitory oligopeptides. Food Chem. 2013, 136, 1370–1376. [Google Scholar] [CrossRef]
Hernández-Ledesma, B.; del Mar Contreras, M.; Recio, I. Antihypertensive peptides: Production, bioavailability and incorporation into foods. Adv. Colloid Interface Sci. 2011, 165, 23–35. [Google Scholar] [CrossRef]
Jurtz, V.I.; Johansen, A.R.; Nielsen, M.; Almagro Armenteros, J.J.; Nielsen, H.; Sønderby, C.K.; Winther, O.; Sønderby, S.K. An introduction to deep learning on biological sequence data: Examples and solutions. Bioinformatics 2017, 33, 3685–3690. [Google Scholar] [CrossRef]
Tian, F.; Yang, L.; Lv, F.; Yang, Q.; Zhou, P. In silico quantitative prediction of peptides binding affinity to human MHC molecule: An intuitive quantitative structure-activity relationship approach. Amino Acids. 2009, 36, 535–554. [Google Scholar] [CrossRef]
Georgiev, A.G. Interpretable numerical descriptors of amino acid space. J. Comput. Biol. 2009, 16, 703–723. [Google Scholar] [CrossRef] [PubMed]
Henikoff, S.; Henikoff, J.G. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 1992, 89, 10915–10919. [Google Scholar] [CrossRef] [PubMed]
Cocchi, M.; Johansson, E. Amino Acids Characterization by GRID and Multivariate Data Analysis. Quant. Struct. Act. Relatsh. 1993, 12, 1–8. [Google Scholar] [CrossRef]
Collantes, E.R.; Dunn, W.J., 3rd. Amino acid side chain descriptors for quantitative structure-activity relationship studies of peptide analogues. J. Med. Chem. 1995, 38, 2705–2713. [Google Scholar] [CrossRef]
Lin, Z.H.; Long, H.X.; Bo, Z.; Wang, Y.Q.; Wu, Y.Z. New descriptors of amino acids and their application to peptide QSAR study. Peptides 2008, 29, 1798–1805. [Google Scholar] [CrossRef]
Zaliani, A.; Gancia, E. MS-WHIM scores for amino acids: A new 3D-description for peptide QSAR and QSPR studies. J. Chem. Inf. Comput. Sci. 1999, 39, 525–533. [Google Scholar] [CrossRef]
Liang, G.; Liu, Y.; Shi, B.; Zhao, J.; Zheng, J. An index for characterization of natural and non-natural amino acids for peptidomimetics. PLoS ONE 2013, 8, e67844. [Google Scholar] [CrossRef]
van Westen, G.J.; Swier, R.F.; Wegner, J.K.; Ijzerman, A.P.; van Vlijmen, H.W.; Bender, A. Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): Comparative study of 13 amino acid descriptor sets. J. Cheminform. 2013, 5, 41. [Google Scholar] [CrossRef]
Hemmateenejad, B.; Yousefinejad, S.; Mehdipour, A.R. Novel amino acids indices based on quantum topological molecular similarity and their application to QSAR study of peptides. Amino Acids 2011, 40, 1169–1183. [Google Scholar] [CrossRef]
Yang, L.; Shu, M.; Ma, K.; Mei, H.; Jiang, Y.; Li, Z. ST-scale as a novel amino acid descriptor and its application in QSAM of peptides and analogues. Amino Acids 2010, 38, 805–816. [Google Scholar] [CrossRef]
Tong, J.; Che, T.; Li, Y.; Wang, P.; Xu, X.; Chen, Y. A descriptor of amino acids: SVRG and its application to peptide quantitative structure-activity relationship. SAR QSAR Environ. Res. 2011, 22, 611–620. [Google Scholar] [CrossRef] [PubMed]
Tong, J.; Chen, Y.; Liu, S.; Che, T.; Xu, X. A descriptor of amino acids SVWG and its applications in peptide QSAR. J. Chemom. 2012, 26, 549–555. [Google Scholar] [CrossRef]
Liang, G.-Z.; Shu, M.; Li, S.-S.Z. A new set of amino acid descriptors for the development of quantitative sequence-activity modelings of HLA-A*0201 restrictive CTL epitopes. J. Chin. Chem. Soc. 2008, 55, 1178–1185. [Google Scholar] [CrossRef]
Tian, F.; Zhou, P.; Li, Z. T-scale as a novel vector of topological descriptors for amino acids and its application in QSARs of peptides. J. Mol. Struct. 2007, 830, 106–115. [Google Scholar] [CrossRef]
Mei, H.; Liao, Z.H.; Zhou, Y.; Li, S.Z. A new set of amino acid descriptors and its application in peptide QSARs. Biopolymers 2005, 80, 775–786. [Google Scholar] [CrossRef]
Tong, J.; Li, L.; Bai, M.; Li, K. A new descriptor of amino acids-SVGER and its applications in peptide QSAR. Mol. Inform. 2017, 36, 1501023. [Google Scholar] [CrossRef]
Tong, J.; Liu, S.; Zhou, P.; Wu, B.; Li, Z. A novel descriptor of amino acids and its application in peptide QSAR. J. Theor. Biol. 2008, 253, 90–97. [Google Scholar] [CrossRef]
Shu, M.; Cheng, X.; Zhang, Y.; Wang, Y.; Lin, Y.; Wang, L.; Lin, Z. Predicting the activity of ACE inhibitory peptides with a novel mode of pseudo amino acid composition. Protein Pept. Lett. 2011, 18, 1233–1243. [Google Scholar] [CrossRef]

Figure 1. Workflow of ACEiPP: (1) collect the sequences of ACEiPs and non- ACEiPs; (2) translate the sequences into feature vectors by single and combined AADs; (3) train the LSTM models and learn the key features of peptides; (4) characterize the sequence/structure characteristics of peptides; (5) deploy the optimal model in the webserver to access the online prediction; (6) screen the ACEiPs with reported multiple activities from the DFBP [12]; and (7) predict the potential ACEiPs encoded in food-derived proteins.

Figure 2. Flowchart of LSTM architecture. Four layers were set up, including three LSTM layers and one RnnOutput layer.

Figure 3. The predicted Acc on the independent dataset (independent_ACEiPs), based on different AADs: (A) 22 single AADs (Supplementary Table S4); (B) 22 optimized AADs (Supplementary Table S5). CodeSet22 is a total feature matrix consisting of 22 single AADs. Each “SubSet” represents the selected combined AADs gradually screened out from CodeSet22 by the “Leave-Group-Out” method. For example, SubSet22-16 represents the optimized encodings obtained by removing the 16th single AAD from CodeSet22 (see Supplementary Table S6 for the detailed screening process). The numbers in parentheses represent the number of features.

Figure 4. Feature composition and importance scores of VVSFZL37. (A) Six kinds of single AADs are contained in VVSFZL37, along with their physicochemical properties; (B) importance scores for 37 single features; (C) five patterns, determined by the physicochemical meaning; (D) importance scores for five patterns of features; (E) sequence characteristics of ACEiPs. ACEiPs are mainly oligopeptides (2–10 residues, molecular weight ≤ 1000 Da), composed of hydrophobic amino acids (Pro, Leu, Val, Gly, and Ala) and aromatic amino acids (Tyr); the C-terminus tends to be hydrophobic (Pro and Phe), positively charged (Arg and Lys), and bulky aromatic (Tyr) amino acids, while the N-terminus tends to be hydrophobic amino acids (Leu, Val, Ala, Gly, and Ile).

Table 1. Performance comparison of ACEiPP and the three existing predictors on independent_ACEiPs and independent_AHTPs.

Predictor ^a	Applicable Length	Independent_ACEiPs		Independent_AHTPs
Predictor ^a	Applicable Length	Acc	MCC	Acc	MCC
AHTpin_AAC	Tetrapeptides	0.7639	0.5168	-	-
	Pentapeptides	0.5429	0.0880	0.7814	0.4121
	Hexapeptides	0.8507	0.7152	0.9045	0.4354
	Medium peptides (7–12)	0.7957	0.5898	0.7622	0.5033
	Large peptides (≥13)	0.9000	0.7980	0.7761	0.2618
AHTpin_ATC	Tetrapeptides	0.7639	0.5168	-	-
	Pentapeptides	0.5810	0.1674	0.7705	0.3292
	Hexapeptides	0.7910	0.5775	0.9045	0.3582
	Medium peptides (7–12)	0.8441	0.7053	0.7744	0.4828
	Large peptides (≥13)	0.8000	0.6162	0.8321	0.3256
mAHTPred	≥5	0.8492	0.7027	0.8834	0.7670
ACEiPP	≥2	0.9479	0.8959	0.8303	0.6614

^a AHTpin_AAC and AHTpin_ATC are two AHTP models that were trained with amino acid composition and atomic composition features, respectively. Since their evaluation parameters are determined according to different peptide lengths, the average values were calculated to compare their prediction performance. mAHTPred is an AHTP prediction model based on an extremely randomized tree, which can only predict sequences greater than or equal to five residues.

Table 2. The ACE inhibitory activity and multi-bioactivities of eight screened peptides.

No.	Sequence	IC₅₀ (μM)	Bioactivities
1	LAF	4.35	High ACE-inhibitory activity Antioxidant activity (DFBPANOX1165)
2	LLL	17.99	High ACE-inhibitory activity Antioxidant activity (DFBPANOX0810) DPP IV-inhibitory activity (DFBPDPIV0160) Stimulating activity (DFBPSTPE0004) Multifunctional activity (DFBPMUFU0682)
3	GLF	270.93	Moderate ACE-inhibitory activity Antimicrobial activity (DFBPAMIC0518) Immunomodulatory activity (DFBPIMMU0002) Immunomodulatory activity (DFBPIMMU0093) Regulating activity (DFBPREPE0005) Multifunctional activity (DFBPMUFU0716)
4	LIV	330.75	Moderate ACE-inhibitory activity Antioxidant activity (DFBPANOX0821)
5	LAL	1162.34	Low ACE-inhibitory activity Antioxidant activity (DFBPANOX0812)
6	AVL	5093.25	No ACE-inhibitory activity Antihypertensive activity (DFBPANHY0635)
7	LE	28,526.40	No ACE-inhibitory activity Antiviral activity (DFBPANPE0012)
8	VLV	No inhibition	No ACE-inhibitory activity Antioxidant activity (DFBPANOX0910)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Sequence–Activity Relationship of Angiotensin-Converting Enzyme Inhibitory Peptides Derived from Food Proteins, Based on a New Deep Learning Model

Abstract

1. Introduction

2. Materials and Methods

2.1. Benchmark and Independent Dataset

2.2. Sequence Representation

2.3. LSTM Architecture

2.4. Evaluation of Performance

2.5. Webserver Construction

2.6. Theoretical Screening of Food-Derived ACEiPs

2.7. Molecular Docking

2.8. Determination of ACE Inhibitory Activity

3. Results and Discussion

3.1. Performance of Single Coding Models

3.2. Performance of Optimized Coding Models

3.3. Comparison with the Existing Predictors

3.4. Key Sequence/Structure Characteristics Learned by ACEiPP

3.5. Web Server Implementation

3.6. Characterization and Screening of Food-Derived ACEiPs with Multi-Activities

3.7. Future Development Trends for ACEiPs

4. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics