Molecules
  • Article
  • Open Access

30 November 2025

Graph Neural Networks vs. Traditional QSAR: A Comprehensive Comparison for Multi-Label Molecular Odor Prediction

College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
* Author to whom correspondence should be addressed.
Molecules 2025, 30(23), 4605; https://doi.org/10.3390/molecules30234605
(registering DOI)
This article belongs to the Special Issue Analysis of Natural Volatile Organic Compounds (NVOCs)

Abstract

Molecular odor prediction is a fundamental challenge in computational chemistry, with applications in fragrance design, food science, and chemical safety assessment. Traditional Quantitative Structure–Activity Relationship (QSAR) methods rely on hand-crafted molecular descriptors, whereas graph neural networks (GNNs) learn end-to-end directly from molecular graph structures. Systematic comparisons of these two approaches for multi-label odor prediction remain limited, and this study provides one. Using the GoodScent dataset containing 3304 molecules with six high-frequency odor types (fruity, green, sweet, floral, woody, herbal), we systematically evaluate 23 model configurations across traditional machine learning algorithms (Random Forest, SVM, GBDT, MLP, XGBoost, LightGBM) with three feature-processing strategies and three GNN architectures (GCN, GAT, NNConv). The results show that GNN models outperform the traditional methods, with GCN achieving the highest macro F1-score of 0.5193 compared to 0.4766 for the best traditional method (MLP with basic preprocessing), a relative improvement of about 9.0%. Critically, we find that threshold optimization is essential for multi-label chemical classification. These findings support GNNs as the preferred approach for molecular property prediction tasks and provide insights for handling class imbalance in chemical informatics applications.
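To illustrate why threshold choice matters for an imbalanced multi-label task, the following minimal sketch tunes one decision threshold per odor label and compares the resulting macro F1-score with a fixed 0.5 cutoff. It is not the authors' procedure: the grid search, the toy scores, and the function names (`f1_binary`, `best_threshold`, `macro_f1`) are assumptions made for illustration only. In practice the thresholds would be selected on a validation split rather than on the data used for reporting.

```python
from typing import List, Sequence


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for one label; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binarize(scores: Sequence[float], threshold: float) -> List[int]:
    """Turn predicted probabilities into 0/1 decisions."""
    return [int(s >= threshold) for s in scores]


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   grid: Sequence[float]) -> float:
    """Grid value that maximizes F1 for one label (first one wins on ties)."""
    return max(grid, key=lambda t: f1_binary(y_true, binarize(scores, t)))


def macro_f1(labels: Sequence[Sequence[int]], scores: Sequence[Sequence[float]],
             thresholds: Sequence[float]) -> float:
    """Unweighted mean of per-label F1, one threshold per label."""
    per_label = [f1_binary(y, binarize(s, t))
                 for y, s, t in zip(labels, scores, thresholds)]
    return sum(per_label) / len(per_label)


# Hypothetical toy data: one column per label, six molecules each.
# The second label is rare and its predicted probabilities never reach 0.5.
labels = [
    [1, 1, 0, 1, 0, 0],  # a frequent label
    [0, 0, 1, 0, 0, 1],  # a rare label
]
scores = [
    [0.9, 0.8, 0.7, 0.6, 0.3, 0.2],
    [0.1, 0.2, 0.4, 0.3, 0.05, 0.45],
]

grid = [i / 10 for i in range(1, 10)]  # candidate thresholds 0.1 ... 0.9

default_macro = macro_f1(labels, scores, [0.5] * len(labels))
thresholds = [best_threshold(y, s, grid) for y, s in zip(labels, scores)]
tuned_macro = macro_f1(labels, scores, thresholds)

print(f"macro F1 at 0.5: {default_macro:.4f}")
print(f"tuned thresholds: {thresholds}")
print(f"macro F1 tuned:  {tuned_macro:.4f}")
```

In this toy case the fixed 0.5 cutoff never predicts the rare label, so its F1 is zero and drags down the macro average. A lower per-label threshold recovers it. Because macro F1 weights every label equally, rare odor types affect the score as much as frequent ones.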
