A Multimodal Representation Learning Framework for Molecular Graph and NMR Spectrum Alignment
Abstract
1. Introduction
- (1) Hybrid molecular representation learning. We design a hybrid molecular encoder that combines attention-based graph interaction with multi-scale neighborhood aggregation to capture complementary structural cues at different receptive fields for molecule–spectrum matching.
- (2) Cross-spectral complementary modeling. We develop a spectral modeling strategy for paired 1H and 13C NMR spectra using branch-specific attention enhancement and joint gating, enabling more effective utilization of cross-spectrum complementarity.
- (3) Residual multimodal fusion and empirical validation. We integrate molecular and spectral representations through a residual fusion design and validate the proposed framework through comparative and ablation experiments, showing improved overall matching performance on benchmark datasets.
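The gating and residual ideas named in contributions (2) and (3) can be illustrated with a small sketch. This is not the authors' implementation: the scalar gate, the element-wise interaction term, and the names `joint_gate`, `residual_fusion`, `gate_weights`, and `mix_weight` are all assumptions chosen only to show the general pattern of blending two spectral branches and then adding the result to a molecular feature through a skip connection.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def joint_gate(h_feat, c_feat, gate_weights, gate_bias=0.0):
    """Blend 1H and 13C feature vectors with one scalar gate (hypothetical form).

    The gate is computed from the concatenation of both branches, so each
    branch can influence how much weight the other one receives.
    """
    joint = h_feat + c_feat  # list concatenation of the two branches
    g = sigmoid(sum(w * x for w, x in zip(gate_weights, joint)) + gate_bias)
    blended = [g * h + (1.0 - g) * c for h, c in zip(h_feat, c_feat)]
    return blended, g


def residual_fusion(mol_feat, spec_feat, mix_weight=0.5):
    """Add a molecule/spectrum interaction term on top of the molecular feature.

    The element-wise product is an assumed stand-in for a learned fusion
    layer; the skip connection keeps the original molecular feature intact.
    """
    update = [mix_weight * (m * s) for m, s in zip(mol_feat, spec_feat)]
    return [m + u for m, u in zip(mol_feat, update)]
```

With all gate weights set to zero the gate equals 0.5 and the two spectral branches are simply averaged; with `mix_weight=0` the fusion returns the molecular feature unchanged, which is the property a residual design is meant to preserve.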
2. Related Work
3. Method
3.1. Task Definition
3.2. Overall Framework
3.3. Molecular Feature Extraction
3.4. NMR Spectral Image Feature Extraction
3.5. Feature Fusion
4. Experiment
4.1. Datasets
4.2. Evaluation Metrics
4.3. Implementation Details
4.4. Baseline Models
4.5. Comparative Experiment
4.6. Ablation Experiment
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Xue, X.; Sun, H.; Sun, J.; Patiny, L.; Liu, X.; Chen, K.; Yan, J.; Li, L.; Liu, X.; Xu, S.; et al. NMRMind: A Transformer-Based Model Enabling the Elucidation from Multidimensional NMR to Structures. Anal. Chem. 2025, 97, 22603–22614.
- Jaspars, M. Computer-assisted structure elucidation. Nat. Prod. Rep. 1999, 16, 241–263.
- Burns, D.C.; Mazzola, E.P.; Reynolds, W.F. The role of computer-assisted structure elucidation (CASE) programs in the structure elucidation of complex natural products. Nat. Prod. Rep. 2019, 36, 919–933.
- Alberts, M.; Zipoli, F.; Vaucher, A. Learning the language of NMR: Structure elucidation from NMR spectra using transformer models. In Proceedings of the AI for Accelerated Materials Design-NeurIPS 2023 Workshop, New Orleans, LA, USA, 15 August 2023.
- Hu, F.; Chen, M.S.; Rotskoff, G.M.; Kanan, M.W.; Markland, T.E. Accurate and efficient structure elucidation from routine one-dimensional NMR spectra using multitask machine learning. ACS Cent. Sci. 2024, 10, 2162–2170.
- Yang, Q.; Wu, B.; Liu, X.; Chen, B.; Li, W.; Long, G.; Chen, X.; Xiao, M. DiffNMR: Diffusion models for nuclear magnetic resonance spectra elucidation. Mater. Futur. 2026, 5, 015601.
- Wang, Y.; Wang, J.; Cao, Z.; Barati Farimani, A. Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 2022, 4, 279–287.
- Priessner, M.; Lewis, R.J.; Lemurell, I.; Johansson, M.J.; Goodman, J.; Janet, J.P.; Tomberg, A. Advancing Structure Elucidation with a Flexible Multi-Spectral AI Model. Angew. Chem. 2026, 138, e17611.
- Li, J.; Liang, J.; Wang, Z.; Ptaszek, A.L.; Liu, X.; Ganoe, B.; Head-Gordon, M.; Head-Gordon, T. Highly accurate prediction of NMR chemical shifts from low-level quantum mechanics calculations using machine learning. J. Chem. Theory Comput. 2024, 20, 2152–2166.
- Gao, P.; Zhang, J.; Peng, Q.; Zhang, J.; Glezakou, V.A. General protocol for the accurate prediction of molecular 13C/1H NMR chemical shifts via machine learning augmented DFT. J. Chem. Inf. Model. 2020, 60, 3746–3754.
- Wei, W.; Liao, Y.; Wang, Y.; Wang, S.; Du, W.; Lu, H.; Kong, B.; Yang, H.; Zhang, Z. Deep learning-based method for compound identification in NMR spectra of mixtures. Molecules 2022, 27, 3653.
- Cortés, I.; Cuadrado, C.; Hernández Daranas, A.; Sarotti, A.M. Machine learning in computational NMR-aided structural elucidation. Front. Nat. Prod. 2023, 2, 1122426.
- Li, Z.; Jiang, M.; Wang, S.; Zhang, S. Deep learning methods for molecular representation and property prediction. Drug Discov. Today 2022, 27, 103373.
- Tian, Z.; Dai, Y.; Hu, F.; Shen, Z.; Xu, H.; Zhang, H.; Xu, J.; Hu, Y.; Diao, Y.; Li, H. Enhancing chemical reaction monitoring with a deep learning model for NMR spectra image matching to target compounds. J. Chem. Inf. Model. 2024, 64, 5624–5633.
- Mohammadi, M.; Tajik, E.; Martinez-Maldonado, R.; Sadiq, S.; Tomaszewski, W.; Khosravi, H. Artificial intelligence in multimodal learning analytics: A systematic literature review. Comput. Educ. Artif. Intell. 2025, 8, 100426.
- Wang, Y.; Zhang, K.; Huang, J.; Yin, N.; Liu, S.; Segal, E. ProtoMol: Enhancing molecular property prediction via prototype-guided multimodal learning. Brief. Bioinform. 2025, 26, bbaf629.
- Berahmand, K.; Daneshfar, F.; Rahmaninia, M.; Haghighat, M.; Jalili, M. A comprehensive survey on multi-view classification: Methods, applications, and challenges. ACM Trans. Intell. Syst. Technol. 2025, 16, 1–34.
- He, H.; Xu, J.; Wen, G.; Ren, Y.; Zhao, N.; Zhu, X. Graph embedded contrastive learning for multi-view clustering. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, IJCAI’25, Montreal, ON, Canada, 16–22 August 2025.
- Ghosh, K.; Stuke, A.; Todorović, M.; Jørgensen, P.B.; Schmidt, M.N.; Vehtari, A.; Rinke, P. Deep learning spectroscopy: Neural networks for molecular excitation spectra. Adv. Sci. 2019, 6, 1801367.
- Schwaller, P.; Petraglia, R.; Zullo, V.; Nair, V.H.; Haeuselmann, R.A.; Pisoni, R.; Bekas, C.; Iuliano, A.; Laino, T. Predicting retrosynthetic pathways using transformer-based models and a hyper-graph exploration strategy. Chem. Sci. 2020, 11, 3316–3325.
- Wang, Z.; Jiang, T.; Wang, J.; Xuan, Q. Multi-modal representation learning for molecular property prediction: Sequence, graph, geometry. arXiv 2024, arXiv:2401.03369.
- Yang, Z.; Song, J.; Yang, M.; Yao, L.; Zhang, J.; Shi, H.; Ji, X.; Deng, Y.; Wang, X. Cross-modal retrieval between 13C NMR spectra and structures for compound identification using deep contrastive learning. Anal. Chem. 2021, 93, 16947–16955.
- Huang, Z.; Chen, M.S.; Woroch, C.P.; Markland, T.E.; Kanan, M.W. A framework for automated structure elucidation from routine NMR spectra. Chem. Sci. 2021, 12, 15329–15338.
- Li, C.; Cong, Y.; Deng, W. Identifying molecular functional groups of organic compounds by deep learning of NMR data. Magn. Reson. Chem. 2022, 60, 1061–1069.
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 22–25 July 2017; pp. 4700–4708.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
- Tan, M.; Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114.




| Model | Test | AUC | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|
| GCN + ResNet-101 [14] | Test_diff | 0.98 | 0.96 | 0.82 | 0.88 | 0.89 |
| GCN + ResNet-101 [14] | Test_rand | 0.86 | 0.77 | 0.82 | 0.79 | 0.79 |
| DenseNet [25] | Test_diff | 0.96 | 0.93 | 0.81 | 0.86 | 0.87 |
| DenseNet [25] | Test_rand | 0.85 | 0.75 | 0.81 | 0.78 | 0.77 |
| ResNet-50 [26] | Test_diff | 0.97 | 0.93 | 0.89 | 0.91 | 0.92 |
| ResNet-50 [26] | Test_rand | 0.86 | 0.75 | 0.89 | 0.82 | 0.80 |
| GAT + ResNet-101 [14] | Test_diff | 0.99 | 0.98 | 0.87 | 0.92 | 0.93 |
| GAT + ResNet-101 [14] | Test_rand | 0.91 | 0.82 | 0.87 | 0.84 | 0.84 |
| EfficientNet [27] | Test_diff | 0.99 | 0.98 | 0.86 | 0.92 | 0.92 |
| EfficientNet [27] | Test_rand | 0.90 | 0.81 | 0.86 | 0.83 | 0.83 |
| Ours | Test_diff | 0.99 | 0.99 | 0.89 | 0.94 | 0.94 |
| Ours | Test_rand | 0.92 | 0.85 | 0.89 | 0.87 | 0.86 |
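
The precision, recall, F1, and accuracy columns above follow the standard definitions for binary match/no-match classification. A minimal sketch of those definitions is given here, assuming confusion counts are available; the function names `binary_metrics` and `f1_from_pr` are hypothetical, and AUC is omitted because it requires the per-pair scores rather than counts.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1 and accuracy from confusion counts.

    tp/fp/tn/fn are true/false positives and negatives for the
    molecule-spectrum match decision.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from_pr(precision, recall),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

As a consistency check on the reported values, `round(f1_from_pr(0.99, 0.89), 2)` gives 0.94 and `round(f1_from_pr(0.85, 0.89), 2)` gives 0.87, matching the F1 entries in the two "Ours" rows.
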
| Model | Test | AUC | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|
| A | Test_diff | | | | | |
| A | Test_rand | | | | | |
| B | Test_diff | | | | | |
| B | Test_rand | | | | | |
| C | Test_diff | | | | | |
| C | Test_rand | | | | | |
| D | Test_diff | | | | | |
| D | Test_rand | | | | | |
| all | Test_diff | | | | | |
| all | Test_rand | | | | | |
| Model | Total Params (M) | Time (ms/batch) | Peak Memory (MB) |
|---|---|---|---|
| A | 42.913 | 25.97 | 335.57 |
| B | 43.748 | 30.29 | 338.26 |
| C | 44.115 | 23.78 | 339.66 |
| D | 43.300 | 19.77 | 337.31 |
| all | 44.116 | 21.22 | 339.67 |
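
The source does not say how the per-batch time and peak memory figures above were measured. The sketch below shows one generic way to collect comparable numbers for any callable, assuming a CPU-side workload and using only the Python standard library; `profile_batch` and `step_fn` are hypothetical names, and a GPU model would need the framework's own timers and memory counters instead.

```python
import time
import tracemalloc


def profile_batch(step_fn, n_warmup=2, n_runs=5):
    """Return (mean milliseconds per call, peak traced memory in MiB).

    Timing and memory tracing are done in separate passes because
    tracemalloc adds overhead that would inflate the timing.
    """
    for _ in range(n_warmup):  # warm-up calls are not measured
        step_fn()

    start = time.perf_counter()
    for _ in range(n_runs):
        step_fn()
    ms_per_call = 1000.0 * (time.perf_counter() - start) / n_runs

    tracemalloc.start()
    step_fn()
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return ms_per_call, peak_bytes / (1024 ** 2)
```

Calling `profile_batch(lambda: model_step(batch))` for each variant, with `model_step` standing in for one forward pass, would yield one row of timing and memory figures per configuration.
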
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Li, X.; Wang, X.; Liu, Z.-M.; Liu, J.-B.; Huang, X. A Multimodal Representation Learning Framework for Molecular Graph and NMR Spectrum Alignment. Entropy 2026, 28, 532. https://doi.org/10.3390/e28050532

