Abstract
Molecular odor prediction is a fundamental challenge in computational chemistry, with applications in fragrance design, food science, and chemical safety assessment. Traditional Quantitative Structure–Activity Relationship (QSAR) methods rely on hand-crafted molecular descriptors, whereas graph neural networks (GNNs) learn end-to-end directly from molecular graph structures. Systematic comparisons of the two approaches for multi-label odor prediction remain limited. This study evaluates traditional QSAR methods against modern GNN approaches on this task. Using the GoodScent dataset, which contains 3304 molecules annotated with six high-frequency odor types (fruity, green, sweet, floral, woody, herbal), we evaluate 23 model configurations spanning traditional machine learning algorithms (Random Forest, SVM, GBDT, MLP, XGBoost, LightGBM) with three feature-processing strategies, and three GNN architectures (GCN, GAT, NNConv). GNN models perform best: GCN achieves the highest macro F1-score of 0.5193, compared with 0.4766 for the best traditional method (MLP with basic preprocessing), a relative improvement of about 9.0%. We also find that threshold optimization is essential for multi-label chemical classification, where rare odor labels are poorly served by a fixed decision threshold. These findings support GNNs as the preferred approach for molecular property prediction tasks and offer practical insights for handling class imbalance in chemical informatics applications.
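To illustrate why threshold optimization matters for macro F1 under class imbalance, the following is a minimal sketch of per-label threshold tuning. It is not the procedure used in this study, which the abstract does not specify: the function names, the threshold grid, and the toy scores for two labels are all assumptions made for illustration. In practice the thresholds would be selected on a validation split rather than on the data used for reporting.

```python
from typing import List, Sequence, Tuple


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for one label; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binarize(scores: Sequence[float], threshold: float) -> List[int]:
    """Turn predicted probabilities into 0/1 decisions."""
    return [1 if s >= threshold else 0 for s in scores]


def best_threshold(y_true: Sequence[int], scores: Sequence[float],
                   grid: Sequence[float]) -> Tuple[float, float]:
    """Return the first grid threshold that maximizes F1 for one label."""
    best_t, best_f1 = grid[0], -1.0
    for t in grid:
        f1 = f1_binary(y_true, binarize(scores, t))
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


def macro_f1(labels: Sequence[Sequence[int]], scores: Sequence[Sequence[float]],
             thresholds: Sequence[float]) -> float:
    """Unweighted mean of per-label F1 scores."""
    per_label = [f1_binary(y, binarize(s, t))
                 for y, s, t in zip(labels, scores, thresholds)]
    return sum(per_label) / len(per_label)


# Hypothetical toy data: one list per label, one entry per molecule.
# "fruity" is common and well separated; "woody" is rare and its
# predicted probabilities never reach 0.5.
y_fruity = [1, 1, 0, 1, 0, 0]
p_fruity = [0.90, 0.70, 0.40, 0.60, 0.20, 0.30]
y_woody = [0, 0, 0, 0, 1, 1]
p_woody = [0.05, 0.10, 0.20, 0.15, 0.35, 0.40]

labels = [y_fruity, y_woody]
scores = [p_fruity, p_woody]
grid = [round(0.05 * i, 2) for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

# Baseline: the same fixed threshold of 0.5 for every label.
default_macro = macro_f1(labels, scores, [0.5, 0.5])

# Tuned: one threshold per label, chosen to maximize that label's F1.
tuned_thresholds = [best_threshold(y, s, grid)[0] for y, s in zip(labels, scores)]
tuned_macro = macro_f1(labels, scores, tuned_thresholds)

print(f"macro F1 at 0.5: {default_macro:.3f}")
print(f"macro F1 tuned:  {tuned_macro:.3f} with thresholds {tuned_thresholds}")
```

With a fixed threshold of 0.5, the rare label is never predicted and contributes an F1 of zero, which drags down the macro average even though the model ranks its positives correctly. Lowering that label's threshold recovers them.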