Discovering Photoswitchable Molecules for Drug Delivery with Large Language Models and Chemist Instruction Training
Abstract
1. Introduction
2. Results
2.1. Delivery Large Language Model for Photoresponsive Molecule Discovery
2.2. Screening Molecules with QED, SA, and PageRank Score
2.3. First Excitation Energy and Photo-Isomerisation Mechanisms
2.4. Instruction Training
3. Discussion
4. Materials and Methods
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Appendix A.1. Details of LLMs
Appendix A.2. Optimized Coordinates for ’NNCCN=NNC(C)C=C’ Isomerization
- C 3.414338000 1.148000000 -2.948062000
  C 2.453240000 1.810094000 -2.304456000
  C 1.030152000 1.340845000 -2.171861000
  N 0.565434000 1.436787000 -0.765514000
  N -0.018235000 0.420347000 -0.057741000
  N 0.468095000 -0.721198000 0.045662000
  C 1.764858000 -1.106221000 -0.513332000
  C 1.600521000 -2.202974000 -1.562107000
  N 2.895426000 -2.782005000 -1.893707000
  N 2.820639000 -3.776054000 -2.909832000
  C 0.103066000 2.185822000 -3.046965000
  H 4.419490000 1.550929000 -3.043161000
  H 3.226481000 0.179493000 -3.410081000
  H 2.662630000 2.782510000 -1.856343000
  H 0.947292000 0.298006000 -2.491054000
  H -0.052585000 2.232296000 -0.652310000
  H 2.360560000 -0.269106000 -0.892697000
  H 2.310142000 -1.545726000 0.332253000
  H 0.966821000 -2.997542000 -1.148498000
  H 1.078735000 -1.810570000 -2.455011000
  H 3.498891000 -2.049785000 -2.260349000
  H 2.452002000 -4.616168000 -2.468425000
  H 2.122588000 -3.506686000 -3.610575000
  H -0.930944000 1.833977000 -2.962802000
  H 0.412339000 2.123030000 -4.094814000
  H 0.140158000 3.239718000 -2.744598000
- C -3.646803000 1.358683000 1.418816000
  C -2.518253000 2.060561000 1.331076000
  C -1.153511000 1.451954000 1.183465000
  N -0.573440000 1.914885000 -0.079189000
  N 0.312755000 1.210229000 -0.777589000
  N 0.844090000 0.253945000 -0.162598000
  C 1.805638000 -0.449724000 -0.996872000
  C 1.360325000 -1.895054000 -1.192361000
  N 2.379562000 -2.647597000 -1.910045000
  N 1.995615000 -3.989780000 -2.189774000
  C -0.250743000 1.816593000 2.364112000
  H -4.610300000 1.845410000 1.548604000
  H -3.646050000 0.271044000 1.371167000
  H -2.547843000 3.150532000 1.387654000
  H -1.238777000 0.357561000 1.121782000
  H -1.146120000 2.488188000 -0.684023000
  H 1.928427000 0.050917000 -1.968502000
  H 2.775745000 -0.448599000 -0.482651000
  H 1.213669000 -2.366055000 -0.211765000
  H 0.382437000 -1.911181000 -1.709156000
  H 2.535479000 -2.207066000 -2.813728000
  H 2.099137000 -4.509156000 -1.320241000
  H 0.995211000 -4.025956000 -2.410894000
  H 0.751887000 1.405866000 2.216206000
  H -0.664234000 1.409150000 3.292863000
  H -0.173773000 2.904888000 2.465215000
- C -4.213817000 -0.450788000 0.968925000
  C -3.557760000 0.108145000 -0.042077000
  C -2.086138000 0.393378000 -0.029112000
  N -1.436466000 -0.430277000 -1.049211000
  N -0.340585000 -1.129848000 -0.901860000
  N 0.214890000 -1.130837000 0.246591000
  C 1.252794000 -0.276054000 0.733660000
  C 2.579703000 -0.424941000 -0.005454000
  N 3.581024000 0.471451000 0.554781000
  N 4.815804000 0.439462000 -0.159520000
  C -1.793535000 1.872716000 -0.249617000
  H -5.285333000 -0.623124000 0.923392000
  H -3.704543000 -0.750367000 1.882011000
  H -4.093875000 0.405769000 -0.944011000
  H -1.639253000 0.059269000 0.912941000
  H -1.924269000 -0.616816000 -1.916507000
  H 0.950461000 0.787712000 0.705283000
  H 1.411056000 -0.515103000 1.788259000
  H 2.937138000 -1.454512000 0.097691000
  H 2.423677000 -0.240723000 -1.082872000
  H 3.236228000 1.421063000 0.467384000
  H 5.300848000 -0.402504000 0.131311000
  H 4.627335000 0.314620000 -1.155589000
  H -0.717104000 2.059337000 -0.260440000
  H -2.237603000 2.462970000 0.556141000
  H -2.214929000 2.218022000 -1.198421000
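Each geometry above lists 26 atoms as an element symbol followed by x, y, z. One way to follow the isomerization across the three structures is to compare the torsion angle around the azo bond. The sketch below is illustrative only and is not the authors' analysis. It assumes the coordinates are in ångström and that the fourth to seventh atoms in listing order (N, N, N, C) form the N-N=N-C unit of the SMILES string. That mapping is inferred from the atom order and should be checked against the optimized structures. The helper names `parse_atoms` and `dihedral` are hypothetical.

```python
import math
import re

# One atom record: element symbol followed by three decimal numbers.
ATOM = re.compile(r"([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)")


def parse_atoms(text):
    """Return [(element, (x, y, z)), ...] from a coordinate listing."""
    text = text.replace("\u2212", "-")  # typographic minus -> ASCII minus
    return [
        (m.group(1), (float(m.group(2)), float(m.group(3)), float(m.group(4))))
        for m in ATOM.finditer(text)
    ]


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Fourth to seventh atoms of the first geometry (assumed N-N=N-C unit).
sample = (
    "N 0.565434000 1.436787000 \u22120.765514000"
    "N \u22120.018235000 0.420347000 \u22120.057741000"
    "N 0.468095000 \u22120.721198000 0.045662000"
    "C 1.764858000 \u22121.106221000 \u22120.513332000"
)
atoms = parse_atoms(sample)
points = [xyz for _, xyz in atoms]
print("N=N distance:", round(math.dist(points[1], points[2]), 3))
print("N-N=N-C torsion (deg):", round(dihedral(*points), 1))
```

Passing a full 26-atom listing to `parse_atoms` and taking elements 3 to 6 of the result (zero-based) gives the same four points, so the three geometries can be compared with one call each. The parser accepts both the typographic minus sign and the ASCII hyphen.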
Models | Number of Molecules Meeting the Chemical Requirements |
---|---|
BigBird | 17 |
Gemma | 79 |
GPT NeoX | 132 |
Chemist Instruction Training | 439 |
Methods | SMILES | First Excitation Energy (eV), Gas Phase | First Excitation Energy (eV), Water | First Excitation Energy (eV), Organic Solvents | Language Model |
---|---|---|---|---|---|
PageRanks | OC1CC1=CO | 5.923 | 6.444 | 6.4 | BigBird |
PageRanks | CCNC1CC1=C | 5.066 | 5.142 | 5.121 | GPT NeoX |
PageRanks | C1C2C=CC=CC12 | 4.346 | 4.275 | 4.281 | GPT NeoX |
PageRanks | NC1CCCC=C1 | 5.873 | 6.081 | 6.03 | GPT NeoX |
PageRanks | OC1CCOC1=C | 6.448 | 6.525 | 6.477 | GPT NeoX |
PageRanks | OC1CC1C | 7.118 | 7.426 | 7.364 | GPT NeoX |
PageRanks | CCC1CC=C1C | 6.942 | 7.022 | 6.987 | Gemma |
PageRanks | OCC1=NCCN1 | 6.945 | 7.372 | 7.277 | GPT NeoX |
PageRanks | OCC1=NCCN1 | 5.922 | 6.16 | 6.113 | Gemma |
PageRanks | N=CN1CCCN1 | 6.112 | 6.419 | 6.386 | GPT NeoX |
PageRanks | C1C=CC2NC12 | 5.768 | 5.852 | 5.831 | GPT NeoX |
PageRanks | OC12CC1=CS2 | 3.817 | 3.6 | 3.632 | Gemma |
PageRanks | OC1CNC=NC1 | 5.965 | 6.183 | 6.13 | GPT NeoX |
PageRanks | C1C2CC=CC=CC12 | 4.910 | 4.801 | 4.795 | GPT NeoX |
PageRanks | OC1=CCN=C1 | 5.097 | 5.251 | 5.207 | Gemma |
PageRanks | CC1C(O)C1C | 7.019 | 7.394 | 7.313 | GPT NeoX |
PageRanks | NC1CCC=C1 | 5.949 | 6.202 | 6.165 | GPT NeoX |
PageRanks | CC1NCC1O | 6.310 | 6.871 | 6.753 | Gemma |
PageRanks | CC1=CCC2CC12 | 6.612 | 6.493 | 6.473 | GPT NeoX |
PageRanks | CC1C(O)C1O | 6.648 | 6.909 | 6.853 | BigBird |
QED | CCC1=COCC=CC(=CC(C)CCC1=CN) | 3.927 | 3.847 | 3.862 | BigBird |
QED | CN1CNC=C1CC1CC1C | 4.807 | 5.099 | 5.016 | BigBird |
QED | CNCC1COCN1CC1C=CC1 | 5.883 | 6.008 | 5.99 | BigBird |
QED | CNCCC1CCCC1C | 6.055 | 6.319 | 6.255 | BigBird |
QED | CC1CC1CC(C)CC1CC1=O | 3.783 | 3.848 | 3.83 | BigBird |
QED | CCC(C)CC(C)(N)CN | 6.240 | 6.616 | 6.521 | Gemma |
QED | C1CC1CC1CC1CN | 6.493 | 6.918 | 6.812 | BigBird |
QED | CCC1=NC=NC=C1Cl | 4.516 | 4.639 | 4.606 | Gemma |
QED | CCCC(C)C(C)O | 7.084 | 7.485 | 7.394 | Gemma |
QED | CC1CNCC1CC(C) | 6.239 | 6.575 | 6.484 | BigBird |
SA | C1C2C3OC1CN23 | 7.331 | 7.704 | 7.62 | GPT NeoX |
SA | C1C2NC1C1NC21 | 6.555 | 6.969 | 6.869 | GPT NeoX |
SA | CC1=C=CC1C=NN=C | 3.604 | 3.783 | 3.731 | BigBird |
SA | NC1CN2CC1N2 | 6.317 | 6.994 | 6.854 | GPT NeoX |
SA | NC1=CSN=C=C1 | 2.173 | 2.091 | 2.041 | GPT NeoX |
SA | C1NC2CCC1O2 | 6.036 | 6.530 | 6.416 | GPT NeoX |
SA | C1CC=CC=C=C=CC=CCC1N | 3.613 | 3.481 | 3.465 | BigBird |
SA | NC1C2NC=CN12 | 5.258 | 5.391 | 5.347 | GPT NeoX |
SA | OC1C2NC1C=C2 | 5.146 | 5.45 | 5.365 | GPT NeoX |
SA | CC1C=CON=N1 | 3.501 | 3.594 | 3.568 | GPT NeoX |
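The first excitation energies in the table are given in eV, while light-triggered delivery is usually discussed in terms of wavelength. As a rough reading aid, the sketch below converts a vertical excitation energy to the corresponding photon wavelength using λ = hc/E. The function name `ev_to_nm` is hypothetical, and the conversion ignores band shape and vibronic effects, so the result is only an estimate of where absorption would begin.

```python
# Exact SI defining constants (2019 redefinition).
PLANCK_J_S = 6.62607015e-34      # Planck constant, J*s
LIGHT_M_S = 299792458.0          # speed of light, m/s
EV_IN_J = 1.602176634e-19        # one electronvolt in joules


def ev_to_nm(energy_ev):
    """Convert a photon energy in eV to a wavelength in nm (lambda = h*c/E)."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return PLANCK_J_S * LIGHT_M_S / (energy_ev * EV_IN_J) * 1e9


# Gas-phase values taken from the table: the lowest and one of the highest.
for smiles, energy in [("NC1=CSN=C=C1", 2.173), ("OC1CC1C", 7.118)]:
    print(f"{smiles}: {energy} eV -> {ev_to_nm(energy):.1f} nm")
```

Lower excitation energies map to longer wavelengths, so the entries at the low end of the table are the ones closest to the visible range under this simple conversion.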
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hu, J.; Wu, P.; Li, Y.; Li, Q.; Wang, S.; Liu, Y.; Qian, K.; Yang, G. Discovering Photoswitchable Molecules for Drug Delivery with Large Language Models and Chemist Instruction Training. Pharmaceuticals 2024, 17, 1300. https://doi.org/10.3390/ph17101300