# Multi-TransDTI: Transformer for Drug–Target Interaction Prediction Based on Simple Universal Dictionaries with Multi-View Strategy

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Our Datasets

#### 2.2. Overall Architecture of Our Model

All proteins are represented as the set $P = \{p_1, p_2, p_3, \ldots, p_i, \ldots, p_n\}$ of size $n$, where $p_i$ is the $i$-th protein sequence. All drugs are represented as the set $D = \{d_1, d_2, d_3, \ldots, d_i, \ldots, d_k\}$ of size $k$, where $d_i$ is the SMILES string of the $i$-th drug. All DTI data are represented by the set $S = \{\langle p_1, d_1, 0\rangle, \langle p_2, d_5, 1\rangle, \langle p_4, d_3, 1\rangle, \ldots, \langle p_i, d_i, 0\rangle\}$, in which each triplet is either a positive or a negative sample, and the number of triplets is the size of $S$.

More specifically, for each input drug–target pair in $S$, we first transform the sequences $p_i$ and $d_i$ into encoded tokens $V^{p_i} \in \mathbb{R}^{m}$ and $V^{d_i} \in \mathbb{R}^{v}$, respectively, based on SUPD and SUDD, where $m$ is the dimension of $V^{p_i}$ and $v$ is the dimension of $V^{d_i}$. Based on the experiments and coverage statistics in Appendix A, the maximum protein sequence length was set to $m = 800$ and the maximum drug sequence length to $v = 100$. The encoded set is denoted by $S = \{\langle V^{p_1}, V^{d_1}, 0\rangle, \langle V^{p_2}, V^{d_5}, 1\rangle, \langle V^{p_4}, V^{d_3}, 1\rangle, \ldots, \langle V^{p_i}, V^{d_i}, 0\rangle\}$.

Next, for each encoded drug–target token pair, $V^{p_i}$ is fed to both the embedding layer and the Transformer module, while $V^{d_i}$ is fed to the embedding layer only. The embedding layer is a lookup table of embedding vectors [5,21] whose values are trainable and are optimized against the loss during training. We initialize these values with the ‘glorot normal’ initializer [21,36] in TensorFlow. We then obtain two matrices $M^{V^{p_i}} \in \mathbb{R}^{m\times u}$ and $M^{V^{d_i}} \in \mathbb{R}^{v\times j}$, where $u$ and $j$ are the embedding sizes of each token in $V^{p_i}$ and $V^{d_i}$, respectively. Next, we apply 1D convolutions [28] to the embedding matrices, sliding over $M^{V^{p_i}}$ along the encoded protein tokens and over $M^{V^{d_i}}$ along the encoded drug tokens, to extract feature information for both proteins and drugs. After that, we apply global max pooling [37] to pick out the locally important residues of the encoded proteins and drugs. Finally, the extracted features are concatenated to make the final prediction.
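The data flow described above can be traced end to end with a minimal pure-Python sketch. Everything in it is illustrative: the character-level dictionaries `SUPD_SKETCH` and `SUDD_SKETCH`, the toy embedding sizes, the kernel width and count, and the random values are assumptions standing in for the model's actual dictionaries, trainable embeddings, and convolution filters. Only the maximum lengths m = 800 and v = 100 come from the text, and the Transformer branch is left out.

```python
import random

M_PROTEIN, V_DRUG = 800, 100  # maximum lengths stated in the text

# Hypothetical character-level dictionaries; index 0 is reserved for padding.
SUPD_SKETCH = {ch: i + 1 for i, ch in enumerate("ACDEFGHIKLMNPQRSTVWY")}
SUDD_SKETCH = {ch: i + 1 for i, ch in enumerate("CNOSFPIclnos()[]=#@+-1234567890BrH")}


def encode(sequence, dictionary, max_len):
    """Map characters to integer tokens, then truncate or zero-pad to max_len."""
    # Unknown characters fall back to 0 here; that choice is an assumption.
    tokens = [dictionary.get(ch, 0) for ch in sequence[:max_len]]
    return tokens + [0] * (max_len - len(tokens))


def make_table(rows, cols, rng):
    """Random matrix standing in for trainable weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def embed(tokens, table):
    """Embedding lookup: one row of the table per token."""
    return [table[t] for t in tokens]


def conv1d_relu(matrix, kernel):
    """Slide one kernel (width x dim) along the token axis, no padding, then ReLU."""
    width = len(kernel)
    feature_map = []
    for start in range(len(matrix) - width + 1):
        total = sum(matrix[start + k][c] * kernel[k][c]
                    for k in range(width) for c in range(len(kernel[k])))
        feature_map.append(max(total, 0.0))
    return feature_map


def extract_features(tokens, table, kernels):
    """Embedding -> 1D convolution -> global max pooling, one value per kernel."""
    matrix = embed(tokens, table)
    return [max(conv1d_relu(matrix, kernel)) for kernel in kernels]


rng = random.Random(0)
U, J, WIDTH, N_FILTERS = 4, 3, 5, 2  # toy sizes, not the paper's settings
protein_table = make_table(len(SUPD_SKETCH) + 1, U, rng)
drug_table = make_table(len(SUDD_SKETCH) + 1, J, rng)
protein_kernels = [make_table(WIDTH, U, rng) for _ in range(N_FILTERS)]
drug_kernels = [make_table(WIDTH, J, rng) for _ in range(N_FILTERS)]

v_p = encode("MKTAYIAKQR", SUPD_SKETCH, M_PROTEIN)
v_d = encode("CC(=O)Oc1ccccc1C(=O)O", SUDD_SKETCH, V_DRUG)

# Concatenate pooled protein and drug features for the prediction head.
features = (extract_features(v_p, protein_table, protein_kernels)
            + extract_features(v_d, drug_table, drug_kernels))
```

Global max pooling reduces each feature map to a single number, so the concatenated vector has a fixed length (here 2 + 2 = 4) regardless of how much of each sequence is padding.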

#### 2.3. Feature of Protein Amino Acid Sequence

#### 2.3.1. Simple Universal Protein Embedding Dictionary (SUPD)

#### 2.3.2. Different Inputs to CNN and Transformer Module

#### 2.4. Feature of Drug SMILES

#### 2.4.1. Simple Universal Drug Embedding Dictionary (SUDD)

#### 2.4.2. Morgan Fingerprints of Drugs

#### 2.5. Feature Learning Process of Our Deep Neural Network Model for Both Proteins and Drugs

## 3. Results

#### 3.1. Evaluation Indicators

#### 3.2. Baseline Methods

#### 3.3. Comparisons of Different Models

#### 3.4. Ablation Experiments

## 4. Discussion

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A

#### Appendix A.1. Maximum Embedding Length of Proteins

| Maximum Embedding Length of Protein | Coverage on Training Set | Coverage on Validation Set | Coverage on Test Set | Coverage on All Sets |
|---|---|---|---|---|
| 600 | 85.8% | 84.9% | 85.0% | 85.5% |
| 700 | 92.5% | 92.8% | 91.9% | 92.4% |
| 800 | 96.2% | 96.4% | 96.1% | 96.2% |

#### Appendix A.2. Maximum Embedding Length of Drugs

| Maximum Embedding Length of Drug | Coverage on Training Set | Coverage on Validation Set | Coverage on Test Set | Coverage on All Sets |
|---|---|---|---|---|
| 80 | 87.3% | 88.5% | 88.8% | 87.7% |
| 90 | 91.7% | 92.8% | 92.1% | 91.9% |
| 100 | 93.0% | 93.8% | 92.8% | 93.1% |
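The coverage columns in Appendix A.1 and A.2 can be read as the share of sequences that fit within a given maximum length without truncation. That reading is an assumption, since the definition is not spelled out here. Under it, a minimal sketch with made-up toy lengths looks like this:

```python
def coverage(lengths, max_len):
    """Fraction of sequences whose length does not exceed max_len."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    return sum(1 for n in lengths if n <= max_len) / len(lengths)


# Toy protein lengths, invented for illustration only.
toy_protein_lengths = [120, 450, 610, 799, 800, 801, 1500, 2300]

for cutoff in (600, 700, 800):
    print(cutoff, f"{coverage(toy_protein_lengths, cutoff):.1%}")
```

Raising the cutoff increases coverage but also lengthens every padded input, which is the trade-off the two tables document.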

#### Appendix A.3. Hyperparameter Setup of Our Model

| Hyperparameter | Range | Selected Value |
|---|---|---|
| Learning rate | [0.01, 0.001, 0.0001, 0.0002] | 0.0001 |
| Decay rate | [0.01, 0.001, 0.0001] | 0.0001 |
| Activation function | [Sigmoid, ReLU, ELU] | ReLU, Sigmoid |
| Dropout rate | [0, 0.1, 0.2, 0.3, 0.4, 0.5] | 0.2 |
| Epoch | 0–60 | 50 |
| Batch size | [8, 16, 32, 64, 128] | 16, 32 |
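For readers who want the table in machine-readable form, the sketch below transcribes it into plain Python dictionaries and enumerates the candidate combinations. The key names and the exhaustive enumeration are assumptions made for illustration. The table does not say how the search was actually run, nor which layers use ReLU versus Sigmoid or which runs use batch size 16 versus 32.

```python
from itertools import product

# Candidate values transcribed from the table (epochs are searched as a range, 0-60).
SEARCH_SPACE = {
    "learning_rate": [0.01, 0.001, 0.0001, 0.0002],
    "decay_rate": [0.01, 0.001, 0.0001],
    "activation": ["sigmoid", "relu", "elu"],
    "dropout_rate": [0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "batch_size": [8, 16, 32, 64, 128],
}

# Selected values transcribed from the table; tuples mark rows listing two values.
SELECTED = {
    "learning_rate": 0.0001,
    "decay_rate": 0.0001,
    "activation": ("relu", "sigmoid"),
    "dropout_rate": 0.2,
    "epochs": 50,
    "batch_size": (16, 32),
}


def iter_configs(space):
    """Yield every combination of the candidate values as a dict."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


n_configs = sum(1 for _ in iter_configs(SEARCH_SPACE))
print(n_configs)
```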

## References

1. Song, T.; Zheng, P.; Dennis Wong, M.L.; Wang, X. Design of logic gates using spiking neural P systems with homogeneous neurons and astrocytes-like control. Inf. Sci. **2016**, 372, 380–391.
2. Xue, H.; Li, J.; Xie, H.; Wang, Y. Review of drug repositioning approaches and resources. Int. J. Biol. Sci. **2018**, 14, 1232–1244.
3. Yeu, Y.; Yoon, Y.; Park, S. Protein localization vector propagation: A method for improving the accuracy of drug repositioning. Mol. Biosyst. **2015**, 11, 2096–2102.
4. Lee, I.; Keum, J.; Nam, H. DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS Comput. Biol. **2019**, 15, e1007129.
5. Huang, K.; Xiao, C.; Glass, L.M.; Sun, J. MolTrans: Molecular Interaction Transformer for drug-target interaction prediction. Bioinformatics **2021**, 37, 830–836.
6. Song, T.; Zeng, X.; Zheng, P.; Jiang, M.; Rodriguez-Paton, A. A Parallel Workflow Pattern Modeling Using Spiking Neural P Systems with Colored Spikes. IEEE Trans. Nanobiosci. **2018**, 17, 474–484.
7. Wang, S.; Jiang, M.; Zhang, S.; Wang, X.; Yuan, Q.; Wei, Z.; Li, Z. Mcn-cpi: Multiscale convolutional network for compound–protein interaction prediction. Biomolecules **2021**, 11, 1119.
8. Song, T.; Pang, S.; Hao, S.; Rodríguez-Patón, A.; Zheng, P. A Parallel Image Skeletonizing Method Using Spiking Neural P Systems with Weights. Neural Process. Lett. **2019**, 50, 1485–1502.
9. Gönen, M. Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics **2012**, 28, 2304–2310.
10. Ezzat, A.; Zhao, P.; Wu, M.; Li, X.L.; Kwoh, C.K. Drug-target interaction prediction with graph regularized matrix factorization. IEEE/ACM Trans. Comput. Biol. Bioinform. **2017**, 14, 646–656.
11. Allouche, A. Software News and Updates Gabedit—A Graphical User Interface for Computational Chemistry Softwares. J. Comput. Chem. **2012**, 32, 174–182.
12. Koes, D.R.; Baumgartner, M.P.; Camacho, C.J. Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. J. Chem. Inf. Model. **2013**, 53, 1893–1904.
13. Wan, F.; Zhu, Y.; Hu, H.; Dai, A.; Cai, X.; Chen, L.; Gong, H.; Xia, T.; Yang, D.; Wang, M.W.; et al. DeepCPI: A Deep Learning-based Framework for Large-scale in silico Drug Screening. Genom. Proteom. Bioinforma. **2019**, 17, 478–495.
14. Li, H.; Leung, K.S.; Wong, M.H.; Ballester, P.J. Low-quality structural and interaction data improves binding affinity prediction via random forest. Molecules **2015**, 20, 10947–10962.
15. Bredel, M.; Jacoby, E. Chemogenomics: An emerging strategy for rapid target and drug discovery. Nat. Rev. Genet. **2004**, 5, 262–275.
16. Cheng, F.; Zhou, Y.; Li, J.; Li, W.; Liu, G.; Tang, Y. Prediction of chemical-protein interactions: Multitarget-QSAR versus computational chemogenomic methods. Mol. Biosyst. **2012**, 8, 2373–2384.
17. Van Laarhoven, T.; Nabuurs, S.B.; Marchiori, E. Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics **2011**, 27, 3036–3043.
18. Zhang, Y.; Qiu, Y.; Cui, Y.; Liu, S.; Zhang, W. Predicting drug-drug interactions using multi-modal deep auto-encoders based network embedding and positive-unlabeled learning. Methods **2020**, 179, 37–46.
19. Köhler, S.; Bauer, S.; Horn, D.; Robinson, P.N. Walking the Interactome for Prioritization of Candidate Disease Genes. Am. J. Hum. Genet. **2008**, 82, 949–958.
20. Cao, M.; Pietras, C.M.; Feng, X.; Doroschak, K.J.; Schaffner, T.; Park, J.; Zhang, H.; Cowen, L.J.; Hescott, B.J. New directions for diffusion-based network prediction of protein function: Incorporating pathways with confidence. Bioinformatics **2014**, 30, 219–227.
21. Pang, S.; Zhang, Y.; Song, T.; Zhang, X.; Wang, X.; Rodriguez-Patón, A. AMDE: A novel attention-mechanism-based multidimensional feature encoder for drug–drug interaction prediction. Brief. Bioinform. **2022**, 23, bbab545.
22. Wen, M.; Zhang, Z.; Niu, S.; Sha, H.; Yang, R.; Yun, Y.; Lu, H. Deep-Learning-Based Drug-Target Interaction Prediction. J. Proteome Res. **2017**, 16, 1401–1409.
23. Yao, Y.; Du, X.; Diao, Y.; Zhu, H. An integration of deep learning with feature embedding for protein–protein interaction prediction. PeerJ **2019**, 2019, e7126.
24. Kimothi, D.; Shukla, A.; Biyani, P.; Anand, S.; Hogan, J.M. Metric learning on biological sequence embeddings. In Proceedings of the 2017 IEEE 18th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Sapporo, Japan, 3–6 July 2017; pp. 1–5.
25. Peng, J.; Li, J.; Shang, X. A learning-based method for drug-target interaction prediction based on feature representation learning and deep neural network. BMC Bioinform. **2020**, 21, 394.
26. Ji, B.Y.; You, Z.H.; Jiang, H.J.; Guo, Z.H.; Zheng, K. Prediction of drug-target interactions from multi-molecular network based on LINE network representation method. J. Transl. Med. **2020**, 18, 347.
27. Luo, Y.; Zhao, X.; Zhou, J.; Yang, J.; Zhang, Y.; Kuang, W.; Peng, J.; Chen, L.; Zeng, J. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun. **2017**, 8, 573.
28. Abbasi, K.; Razzaghi, P.; Poso, A.; Amanlou, M.; Ghasemi, J.B.; Masoudi-Nejad, A. DeepCDA: Deep Cross-Domain Compound-Protein Affinity Prediction through LSTM and Convolutional Neural Networks. Bioinformatics **2020**, 36, 4633–4642.
29. Hasan Mahmud, S.M.; Chen, W.; Jahan, H.; Dai, B.; Din, S.U.; Dzisoo, A.M. DeepACTION: A deep learning-based method for predicting novel drug-target interactions. Anal. Biochem. **2020**, 610, 113978.
30. Rayhan, F.; Ahmed, S.; Mousavian, Z.; Farid, D.M.; Shatabda, S. FRnet-DTI: Deep convolutional neural network for drug-target interaction prediction. Heliyon **2020**, 6, e03444.
31. Chen, H.; Cheng, F.; Li, J. IDrug: Integration of drug repositioning and drug-target prediction via cross-network embedding. PLoS Comput. Biol. **2020**, 16, e1008040.
32. Song, T.; Wang, G.; Ding, M.; Rodriguez-Paton, A.; Wang, X.; Wang, S. Network-Based Approaches for Drug Repositioning. Mol. Inform. **2021**, 2100200.
33. Lin, X.; Zhao, K.; Xiao, T.; Quan, Z.; Wang, Z.J.; Yu, P.S. Deepgs: Deep representation learning of graphs and sequences for drug-target binding affinity prediction. Front. Artif. Intell. Appl. **2020**, 325, 1301–1308.
34. Liu, T.; Lin, Y.; Wen, X.; Jorissen, R.N.; Gilson, M.K. BindingDB: A web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res. **2007**, 35, 198–201.
35. Song, T.; Zhang, X.; Ding, M.; Rodriguez-Paton, A.; Wang, S.; Wang, G. DeepFusion: A deep learning based multi-scale feature fusion method for predicting drug-target interactions. Methods **2022**, in press.
36. Meng, X.; Li, X.; Wang, X. A Computationally Virtual Histological Staining Method to Ovarian Cancer Tissue by Deep Generative Adversarial Networks. Comput. Math. Methods Med. **2021**, 2021, 4244157.
37. Chen, L.; Tan, X.; Wang, D.; Zhong, F.; Liu, X.; Yang, T.; Luo, X.; Chen, K.; Jiang, H.; Zheng, M. TransformerCPI: Improving compound-protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics **2020**, 36, 4406–4414.
38. Luo, H.; Li, M.; Yang, M.; Wu, F.X.; Li, Y.; Wang, J. Biomedical data and computational models for drug repositioning: A comprehensive review. Brief. Bioinform. **2021**, 22, 1604–1619.
39. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 5999–6009.
40. Weininger, D. SMILES, a Chemical Language and Information System: 1: Introduction to Methodology and Encoding Rules. J. Chem. Inf. Comput. Sci. **1988**, 28, 31–36.
41. Tsubaki, M.; Tomii, K.; Sese, J. Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics **2019**, 35, 309–318.
42. Sun, M.; Zhao, S.; Gilvary, C.; Elemento, O.; Zhou, J.; Wang, F. Graph convolutional networks for computational drug development and discovery. Brief. Bioinform. **2020**, 21, 919–935.
43. Badkas, A.; De Landtsheer, S.; Sauter, T. Topological network measures for drug repositioning. Brief. Bioinform. **2020**, 22, bbaa357.
44. Köhler, S.; Vasilevsky, N.A.; Engelstad, M.; Foster, E.; McMurry, J.; Aymé, S.; Baynam, G.; Bello, S.M.; Boerkoel, C.F.; Boycott, K.M.; et al. The human phenotype ontology in 2017. Nucleic Acids Res. **2017**, 45, D865–D876.
45. Cai, R.; Chen, X.; Fang, Y.; Wu, M.; Hao, Y. Dual-dropout graph convolutional network for predicting synthetic lethality in human cancers. Bioinformatics **2020**, 36, 4458–4465.
46. Guney, E.; Menche, J.; Vidal, M.; Barábasi, A.L. Network-based in silico drug efficacy screening. Nat. Commun. **2016**, 7, 10331.
47. Available online: https://github.com/nick1997a/model (accessed on 26 February 2022).

**Figure 1.** Two flowcharts comparing traditional drug development and drug repositioning.

**Figure 5.** Model comparisons of AUC and AUPR on the 30% and 50% BindingDB datasets (Ours = Multi-TransDTI).

| Name | Positive Samples | Negative Samples | Total Samples | Number of Drugs | Number of Proteins |
|---|---|---|---|---|---|
| BindingDB (100%) | 6571 | 6571 | 13,142 | 7137 | 1253 |

| Percent | Train/Valid/Test | Ratio of Positive and Negative Samples in Train/Valid/Test |
|---|---|---|
| 100% | 9200/1970/1972 | 1:1/1:1/1:1 |
| 50% | 4600/1970/1972 | 1:1/1:1/1:1 |
| 30% | 2770/1970/1972 | 1:1/1:1/1:1 |
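The split table suggests that the 50% and 30% settings shrink only the training set while the validation and test sets stay fixed and every subset keeps a 1:1 class ratio. One way to produce such subsets, sketched below, is to sample the same fraction from the positives and from the negatives. The function name, the seed, and the toy triplets are hypothetical. The 30% row lists 2770 training samples where this sketch yields 2760, so the authors' exact procedure evidently differs in some detail.

```python
import random


def subsample_balanced(train, fraction, seed=0):
    """Keep `fraction` of each class so the positive:negative ratio is preserved."""
    rng = random.Random(seed)
    kept = []
    for label in (0, 1):
        group = [triplet for triplet in train if triplet[2] == label]
        kept.extend(rng.sample(group, round(len(group) * fraction)))
    rng.shuffle(kept)
    return kept


# Toy stand-in for the 9200 training triplets <protein, drug, label>, balanced 1:1.
train = [(f"p{i}", f"d{i}", i % 2) for i in range(9200)]

half = subsample_balanced(train, 0.5)
third = subsample_balanced(train, 0.3)
print(len(half), len(third))
```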

| Methods | AUC | AUPR | ACC | F1-Score | Threshold |
|---|---|---|---|---|---|
| DNN | 0.875 | 0.852 | 0.805 | 0.812 | 0.351 |
| ModelCPI | 0.880 | 0.892 | 0.805 | 0.799 | 0.654 |
| Moltrans | 0.881 | 0.855 | 0.811 | 0.819 | 0.514 |
| DeepConv | 0.901 | 0.878 | 0.834 | 0.834 | 0.552 |
| Multi-TransDTI | 0.909 | 0.898 | 0.842 | 0.843 | 0.604 |

| Methods | AUC | AUPR | ACC | F1-Score | Threshold |
|---|---|---|---|---|---|
| DNN | 0.853 | 0.836 | 0.789 | 0.794 | 0.521 |
| ModelCPI | 0.872 | 0.875 | 0.804 | 0.790 | 0.496 |
| Moltrans | 0.869 | 0.841 | 0.804 | 0.796 | 0.349 |
| DeepConv | 0.880 | 0.865 | 0.810 | 0.825 | 0.316 |
| Multi-TransDTI | 0.891 | 0.884 | 0.820 | 0.829 | 0.397 |

| Methods | AUC | AUPR | ACC | F1-Score | Threshold |
|---|---|---|---|---|---|
| DNN | 0.834 | 0.803 | 0.763 | 0.762 | 0.489 |
| ModelCPI | 0.860 | 0.860 | 0.784 | 0.787 | 0.387 |
| Moltrans | 0.849 | 0.818 | 0.767 | 0.783 | 0.364 |
| DeepConv | 0.868 | 0.840 | 0.793 | 0.800 | 0.355 |
| Multi-TransDTI | 0.871 | 0.860 | 0.799 | 0.802 | 0.553 |
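The comparison tables report a threshold-free metric (AUC) alongside metrics that depend on the listed decision threshold (ACC and F1-score). The sketch below uses the standard textbook definitions, which are assumed to match the paper's evaluation. The labels and scores are toy values, scores at or above the threshold are assumed to count as positive predictions, and AUPR is left out for brevity.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def acc_f1(labels, scores, threshold):
    """Accuracy and F1-score after binarizing scores at `threshold`."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy interaction labels and predicted scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]

auc = roc_auc(labels, scores)
at_half = acc_f1(labels, scores, 0.5)
at_lower = acc_f1(labels, scores, 0.35)
print(auc, at_half, at_lower)
```

On the toy data, lowering the threshold from 0.5 to 0.35 changes ACC and F1 while the AUC stays the same, which is why the tables list a separate threshold for each method.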

| Channels | AUC | AUPR | F1-Score | ACC |
|---|---|---|---|---|
| Protein_CNN | 0.905 | 0.893 | 0.836 | 0.836 |
| Protein_transformer | 0.893 | 0.878 | 0.838 | 0.830 |
| Drug_CNN | 0.896 | 0.888 | 0.836 | 0.829 |
| Drug_fingerprints | 0.905 | 0.894 | 0.837 | 0.833 |
| ALL | 0.909 | 0.898 | 0.842 | 0.843 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, G.; Zhang, X.; Pan, Z.; Rodríguez Patón, A.; Wang, S.; Song, T.; Gu, Y.
Multi-TransDTI: Transformer for Drug–Target Interaction Prediction Based on Simple Universal Dictionaries with Multi-View Strategy. *Biomolecules* **2022**, *12*, 644.
https://doi.org/10.3390/biom12050644

**AMA Style**

Wang G, Zhang X, Pan Z, Rodríguez Patón A, Wang S, Song T, Gu Y.
Multi-TransDTI: Transformer for Drug–Target Interaction Prediction Based on Simple Universal Dictionaries with Multi-View Strategy. *Biomolecules*. 2022; 12(5):644.
https://doi.org/10.3390/biom12050644

**Chicago/Turabian Style**

Wang, Gan, Xudong Zhang, Zheng Pan, Alfonso Rodríguez Patón, Shuang Wang, Tao Song, and Yuanqiang Gu.
2022. "Multi-TransDTI: Transformer for Drug–Target Interaction Prediction Based on Simple Universal Dictionaries with Multi-View Strategy" *Biomolecules* 12, no. 5: 644.
https://doi.org/10.3390/biom12050644