Article

PDCG: A Diffusion Model Guided by Pre-Training for Molecular Conformation Generation

1 College of Chemistry and Pingyuan Laboratory, Zhengzhou University, Zhengzhou 450001, China
2 College of Computer Science and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
3 Institute of Chemistry, Henan Academy of Science, Zhengzhou 450002, China
4 College of Chemical Engineering and State Key Laboratory of Cotton Bio-Breeding and Integrated Utilization, Zhengzhou University, Zhengzhou 450001, China
5 State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou 350002, China
* Authors to whom correspondence should be addressed.
Chemistry 2026, 8(2), 29; https://doi.org/10.3390/chemistry8020029
Submission received: 2 February 2026 / Revised: 15 February 2026 / Accepted: 16 February 2026 / Published: 18 February 2026
(This article belongs to the Special Issue AI and Big Data in Chemistry)

Abstract

Background: While machine learning has advanced molecular conformation generation, existing models often suffer from limited generalization and inaccuracies, especially for complex molecular structures. These limitations hinder their reliability in downstream applications. Methods: We propose PDCG, a molecular conformation generation model that combines a molecular graph pre-training module with a diffusion model. Feature embeddings are obtained from a pre-trained model and concatenated with the molecular graph information, and the fused features are used to generate conformations. The model was trained and evaluated on the GEOM-QM9 and GEOM-Drugs datasets. Results: PDCG markedly outperforms existing baselines on both datasets. Furthermore, in downstream molecular property prediction tasks, conformations generated by PDCG yield results comparable to those derived from DFT-optimized geometries. Conclusions: Our work provides a robust and generalizable model for accurate conformation generation. PDCG offers a reliable tool for downstream computational tasks, such as the virtual screening of functional materials and drug-like molecules.

Graphical Abstract

1. Introduction

Molecular conformation generation plays a key role in many tasks, including molecular property prediction, docking, and virtual screening [1,2]. Traditional approaches, such as experimental determination and quantum chemical calculation, are expensive and time-consuming [3]. Classical force field methods offer greater computing speed but suffer from insufficient accuracy [4,5]. These limitations present significant challenges to high-throughput molecular screening and rational drug design, adversely affecting both the robustness and efficiency of these processes [6,7]. Machine learning (ML)-based algorithms for conformation generation are increasingly employed to address the compromise between computational cost and accuracy [7].
Machine learning models that predict atomic coordinates encounter intrinsic challenges, including rotational and translational invariance [8]. To address this issue and improve the stability, two principal strategies involving intermediate geometric prediction have been adopted. The first strategy predicts interatomic distances and subsequently reconstructs conformations using SE(3)-invariant networks [9]. Representative implementations include methods that predict the pairwise distance matrices and recover coordinates via distance geometry (DG) [10], as implemented in GraphDG [11] and CGCF [12]. Alternatively, more robust conformation generation can also be achieved by predicting the distance gradient in the models of ConfVAE [13] and ConfGF [14]. However, while distance-based predictions impose valuable geometric constraints, they often fail to adequately capture other critical structural features, such as the bond and torsion angles.
A second strategy has emerged to predict these local geometric attributes directly. Models like GeoMol [15] and Tora3D [16] predict local structures and torsion angles by integrating local and global molecular information. Reinforcement learning has also been employed to sample torsion angles for conformational exploration iteratively [17,18]. A fundamental limitation in both strategies is that errors in predicting intermediate geometric values propagate through the reconstruction process, often causing the final generated conformation to deviate from the actual structural distribution [19].
Directly outputting the molecular conformation from neural networks provides a more accurate and efficient strategy, eliminating intermediate prediction errors [20]. Direct generation involves two approaches. The first is one-shot generation, in which the model adaptively aggregates bond and atomic information to predict atomic coordinates directly, as in DMCG [21] and Conf-GEM [22]. The quality of conformations generated by these one-shot methods has reached the level of traditional methods, although their conformational diversity for macromolecules still leaves room for improvement. The second is the diffusion approach, which generates molecular conformations gradually through continuous sampling. It predicts coordinates directly, without intermediate distances, by iteratively processing coordinates and interatomic distances, while preserving rotational and translational invariance; examples include COSMIC [23], GeoDiff [24], SDEGen [25], and EC-Conf [26]. The diffusion process can also be restricted to the torsional angle space to improve performance [27]. For macromolecules, diffusion methods demonstrate improved conformational diversity and structural validity compared with one-shot generation.
In recent years, to address the poor generalization of conformation generation models, pre-training methods have been introduced to learn general molecular representations and reduce reliance on precisely labeled data [28]. Pre-training can be achieved through several strategies, such as auto-encoding [29], autoregressive modeling [30], masked component modeling [31], context prediction [32], contrastive learning [33], replaced component detection [34], and denoising [35]. Various pre-trained models [36,37,38,39] have been proposed to integrate chemical domain knowledge through pre-training strategies and to enhance the generalizability of molecular embedding spaces. Combining a physically and geometrically constrained pre-trained model with generative models can improve generalization in conformation prediction. Wang et al. [40] proposed a framework that pre-trains a model on 3D molecular graphs and then fine-tunes it on molecular graphs without 3D structures. Beyond physics-aware methods, Alhamoud et al. [41] investigated the limitations of the GeoMol method and introduced pre-training to enhance molecular graph embeddings. Both methods emphasize the role of pre-training in connecting 2D and 3D representations.
At present, some generative models have limitations, including insufficient generalization and inaccurate conformation generation for macromolecules. To improve generalization and generate molecular conformations more efficiently and accurately, we propose PDCG, a model comprising a 2D molecular graph pre-training module and a diffusion-based conformation generation module. Our model starts from the Simplified Molecular Input Line Entry System (SMILES) [42] and obtains molecular features through the pre-trained model. These features are concatenated with the molecular graph information, and the fused features are then used to generate molecular conformations. This design improves the quality and diversity of the generated conformations, promotes superior 3D conformation generation, and provides conformations for downstream tasks.

2. Materials and Methods

Datasets: The experiments utilized two publicly available datasets for molecular conformation generation, GEOM-QM9 and GEOM-Drugs [43]. We adopted the established data split from the work of Xu et al. for both datasets [24]. From this split, a training set was constructed by randomly sampling 40,000 molecules, each represented by its SMILES string and associated with 5 conformations. A validation set of 5000 molecules was similarly prepared. For testing, we extracted 200 distinct molecules from the remaining pool. This yielded a total of 22,408 test conformations for GEOM-QM9 and 14,324 for GEOM-Drugs. Additionally, the standard QM9 dataset was employed to evaluate the performance of downstream molecular property prediction tasks using the generated conformations. For each molecule, the conformation with the lowest MAT-P was selected for our model training and evaluation.
Network architecture: The overall architecture of the proposed PDCG framework is illustrated in Figure 1. It primarily consists of two components: a pre-trained molecular language model, MoLR [38], and a diffusion-based generative model [24]. MoLR is a chemical-reaction-aware molecular representation learning model trained on the USPTO-479k dataset. It uses a two-layer GNN encoder and is pre-trained in a self-supervised manner with a contrastive loss that enforces the conservation constraint "reactant embedding = product embedding"; this provides initialization parameters for the model and shortens the training cycle. We employed a publicly available MoLR model as a fixed feature extractor, keeping its parameters frozen throughout our experiments. Given a molecular SMILES string as input, this model directly outputs a high-dimensional molecular representation, denoted as HG.
The extracted representation HG is first projected into a lower-dimensional latent space via a multi-layer perceptron (MLP), resulting in a refined feature vector hG. Concurrently, the input SMILES is converted into a 2D molecular graph. A graph encoder processes this graph to generate a complementary feature vector EG. Subsequently, hG and EG are concatenated to form a fused molecular representation G, which serves as the conditional input for the diffusion model. The fused feature G guides a denoising diffusion probabilistic model to generate plausible 3D molecular conformations. This model learns to iteratively reverse a predefined noising process, transforming an initial Gaussian distribution into a target conformational distribution conditioned on G.
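The projection-and-fusion step above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the embedding dimensions and the single linear-plus-ReLU layer standing in for the MLP are assumptions (the paper only reports a 128-dimensional hidden size).

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_project(h, W, b):
    """Project the frozen MoLR embedding H_G into a lower-dimensional
    latent vector h_G (a single linear layer + ReLU stands in for the MLP)."""
    return np.maximum(W @ h + b, 0.0)

# Illustrative dimensions (assumed, not taken from the paper).
d_molr, d_latent, d_graph = 1024, 128, 128

H_G = rng.normal(size=d_molr)                  # frozen MoLR output H_G
W = rng.normal(size=(d_latent, d_molr)) * 0.01
b = np.zeros(d_latent)

h_G = mlp_project(H_G, W, b)                   # refined feature h_G
E_G = rng.normal(size=d_graph)                 # graph-encoder feature E_G
G = np.concatenate([h_G, E_G])                 # fused conditional input G

assert G.shape == (d_latent + d_graph,)
```

The fused vector G then conditions every denoising step of the diffusion model.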
During the diffusion process, random noise conforming to a Gaussian distribution is gradually added, starting from the given initial conformation C0. This process is divided into T steps; after t steps, the initial conformation C0 has diffused into a noisy state Ct. The forward process is the posterior distribution q(C1:T|C0). Specifically, we define it as a Markov chain according to a fixed variance schedule β1, …, βT:
q(C_{1:T} \mid C_0) = \prod_{t=1}^{T} q(C_t \mid C_{t-1}),
where q(C_t \mid C_{t-1}) is defined in Equation (2),
q(C_t \mid C_{t-1}) = \mathcal{N}\left(C_t;\ \sqrt{1-\beta_t}\, C_{t-1},\ \beta_t I\right).
During the generation process, conditioned on the concatenated molecular graph feature G, the model learns to restore the conformation C0 from the white noise CT. This process is expressed as a learnable conditional Markov chain:
p_\theta(C_{0:T-1} \mid G, C_T) = \prod_{t=1}^{T} p_\theta(C_{t-1} \mid G, C_t),
p_\theta(C_{t-1} \mid G, C_t) = \mathcal{N}\left(C_{t-1};\ \mu_\theta(G, C_t, t),\ \sigma_t^2 I\right),
where μθ is the parameterized neural network, σt is the defined variance, and the initial distribution p(CT) is set to a standard Gaussian distribution; generation iterates through the reverse Markov kernel pθ.
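The forward process defined above has the standard closed form C_t = √(ᾱ_t) C_0 + √(1 − ᾱ_t) ε with ᾱ_t = ∏_{s≤t}(1 − β_s), which allows any noise level to be sampled in one step during training. A minimal sketch using the variance endpoints reported in the experiment setup (β1 = 1 × 10⁻⁷, βT = 0.002, T = 5000 for GEOM-QM9); the linear interpolation between them is an assumption, as the paper does not state the schedule shape:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear variance schedule between the paper's reported endpoints.
T = 5000
betas = np.linspace(1e-7, 2e-3, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)  # \bar{alpha}_t = prod_{s<=t} (1 - beta_s)

def q_sample(C0, t):
    """Closed-form forward diffusion: sample C_t directly from C_0 as
    C_t = sqrt(alpha_bar_t) * C_0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.normal(size=C0.shape)
    return np.sqrt(alpha_bar[t]) * C0 + np.sqrt(1.0 - alpha_bar[t]) * eps

C0 = rng.normal(size=(9, 3))    # toy conformation: 9 atoms x 3 coordinates
Ct = q_sample(C0, T - 1)

# By t = T the signal is almost gone: alpha_bar_T is close to zero,
# so C_T is approximately standard Gaussian white noise.
assert alpha_bar[-1] < 0.02
```

Generation then runs this chain in reverse, denoising from CT step by step with μθ conditioned on G.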
Experiment setup: The model was implemented in PyTorch and trained on a single NVIDIA RTX 4080 GPU. The Adam optimizer was employed with a learning rate of 0.001 and a batch size of 64. The model architecture featured a global encoder with 6 convolutional layers and a local encoder with 4 convolutional layers, both with a hidden dimension of 128. For the diffusion process, the noise schedule parameters were set to a minimum variance β1 of 1 × 10−7 and a maximum variance level βT of 0.002. A spatial cutoff radius of 10 Å was applied for local structure modeling. The computational environment utilized Python 3.11, PyTorch 2.4.0, PyTorch Geometric 2.6.0, and RDKit 2022.09.1 [44].
Evaluation metrics: To evaluate the quality and diversity of the conformations generated by the model, we used evaluation metrics based on the root mean square deviation (RMSD). The coverage (COV) and matching (MAT) scores measure diversity and accuracy, respectively. The COV score is the fraction of reference conformations matched, within an RMSD threshold, by at least one generated conformation. The MAT score is the minimum RMSD between a generated conformation and the reference conformations, averaged over the set. Based on the measurement of recall, COV-R and MAT-R are defined as
\mathrm{COV\text{-}R}(S_g, S_r) = \frac{1}{|S_r|} \left| \left\{ C \in S_r \,\middle|\, \mathrm{RMSD}(C, \hat{C}) \le \sigma,\ \hat{C} \in S_g \right\} \right|,
\mathrm{MAT\text{-}R}(S_g, S_r) = \frac{1}{|S_r|} \sum_{C \in S_r} \min_{\hat{C} \in S_g} \mathrm{RMSD}(C, \hat{C}).
Sg is the set of generated conformations, and Sr is the set of reference conformations of a molecule. σ denotes a predefined threshold. \hat{C} and C refer to a generated conformation and a reference conformation, respectively. Based on the measurement of precision, COV-P and MAT-P are defined as
\mathrm{COV\text{-}P}(S_g, S_r) = \frac{1}{|S_g|} \left| \left\{ \hat{C} \in S_g \,\middle|\, \mathrm{RMSD}(C, \hat{C}) \le \sigma,\ C \in S_r \right\} \right|,
\mathrm{MAT\text{-}P}(S_g, S_r) = \frac{1}{|S_g|} \sum_{\hat{C} \in S_g} \min_{C \in S_r} \mathrm{RMSD}(C, \hat{C}).
A higher COV score or a lower MAT score indicates the generation of a more realistic conformation. Recall indicators focus more on diversity, while precision indicators rely more on quality.
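Given a precomputed matrix of pairwise RMSD values between generated and reference conformations, all four metrics reduce to row- and column-wise minima. A minimal sketch (the threshold σ and the toy matrix are illustrative, not values from the paper):

```python
import numpy as np

def cov_mat_scores(rmsd, sigma=0.5):
    """Compute COV-R/MAT-R and COV-P/MAT-P from a pairwise RMSD matrix.

    rmsd[i, j] is the RMSD between generated conformation i and reference
    conformation j; sigma is the coverage threshold.
    """
    # Recall: for each reference, its best match among generated conformations.
    best_per_ref = rmsd.min(axis=0)
    cov_r = float((best_per_ref <= sigma).mean())
    mat_r = float(best_per_ref.mean())
    # Precision: for each generated conformation, its best reference match.
    best_per_gen = rmsd.min(axis=1)
    cov_p = float((best_per_gen <= sigma).mean())
    mat_p = float(best_per_gen.mean())
    return cov_r, mat_r, cov_p, mat_p

# Toy example: 3 generated x 2 reference conformations.
rmsd = np.array([[0.2, 1.1],
                 [0.9, 0.4],
                 [1.5, 1.6]])
cov_r, mat_r, cov_p, mat_p = cov_mat_scores(rmsd, sigma=0.5)
# best_per_ref = [0.2, 0.4]      -> COV-R = 1.0, MAT-R = 0.3
# best_per_gen = [0.2, 0.4, 1.5] -> COV-P = 2/3, MAT-P = 0.7
```

The unmatched third generated conformation hurts only the precision metrics, which is exactly how COV-P/MAT-P penalize low-quality samples that COV-R/MAT-R ignore.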
Validation via a downstream task: The practical utility of the conformations was assessed by their ability to predict molecular properties using SphereNet [45]. Predictions were made using three distinct sets of input conformations for comparison: quantum-chemically optimized structures, force field-optimized (RDKit) structures, and structures generated by PDCG. SphereNet was implemented with the same parameters as in its original work. The accuracy for each property was quantified by calculating the mean absolute error (MAE) between the model's predictions and the reference values.
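The accuracy criterion is simply the MAE between predicted and reference property values for each conformation source. A minimal sketch with made-up toy numbers (the property values below are purely illustrative):

```python
import numpy as np

def mae(pred, ref):
    """Mean absolute error between predicted and reference property values."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.abs(pred - ref).mean())

# Toy values for one property (arbitrary units): predictions from a model
# fed DFT conformations vs. RDKit conformations, against reference labels.
ref        = [1.00, 2.00, 3.00, 4.00]
pred_dft   = [1.01, 1.98, 3.02, 3.99]
pred_rdkit = [1.10, 2.15, 2.80, 4.20]

# Better input geometries give a lower property-prediction MAE.
assert mae(pred_dft, ref) < mae(pred_rdkit, ref)
```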

3. Results and Discussion

3.1. The Performance with Different Diffusion Steps

A key hyperparameter in the diffusion framework is the number of diffusion steps. We examined its influence by training models with different step counts on the GEOM-QM9 and GEOM-Drugs datasets. Each model was evaluated on an independent test set of 200 molecules, using the same number of iteration steps in the generation phase as in the training phase.
For the GEOM-QM9 dataset, we evaluated the performance of the diffusion model with maximum diffusion steps of 3000, 5000, 7000, and 10,000 in the forward process. As summarized in Table 1, an increase from 3000 to 5000 steps improved all evaluation metrics, with the latter yielding the best overall performance. Beyond this optimum, further increases in the step count degraded performance. This decline may be attributed to the accumulation of prediction errors across sequential sampling steps, which drives the generated structure away from the actual data distribution. Consequently, 5000 steps were established as the optimal setting for subsequent experiments, the same setting used in the diffusion model GeoDiff [24].
For the GEOM-Drugs dataset, we investigated model performance across a range of maximum diffusion steps (6000, 8000, 10,000, and 12,000) in the forward process, with sampling steps held constant during generation. As shown in Table 2, optimal performance across all four evaluation metrics was achieved at 10,000 steps. This configuration yielded the best balance between recall and precision, indicating superior overall performance for this more complex molecular set. Consequently, 10,000 steps were established as the optimal forward process setting for training on GEOM-Drugs. This contrasts with the 5000-step setting used in the GeoDiff model [24], which was kept consistent across both the GEOM-QM9 and GEOM-Drugs datasets.

3.2. Comparison with Baseline Methods

We evaluated PDCG against a diverse set of seven baselines, encompassing one-shot generation (GraphDG, ConfVAE), iterative generation (CGCF, ConfGF, SDEGen, EC-Conf), and a pre-training-enhanced model (GeoMol + MRL). The results show the advantages of the proposed method across key metrics.
For the GEOM-QM9 dataset, the results are summarized in Table 3. Among all baseline methods, the standard diffusion model achieved the best recall-oriented metrics, with a COV-R of 0.919 and an MAT-R of 0.194. Our model sets a new state of the art in recall, with scores of 0.934 and 0.188, exceeding this best baseline. Conversely, on the precision-oriented metrics, the GeoMol + MRL model delivered the strongest performance, with scores of 0.837 (COV-P) and 0.310 (MAT-P), the best among all models. Our model obtained 0.752 and 0.369 on these metrics, slightly weaker than GeoMol + MRL. This comparison suggests that incorporating pre-trained representations, as in GeoMol + MRL, provides a significant advantage in generation precision. As shown in Figure S2, we benchmarked the generation time for a representative molecule. GeoMol + MRL exhibits a significant efficiency advantage, primarily attributable to the substantially fewer denoising steps it requires.
We further evaluated our model on the more challenging GEOM-Drugs dataset, which includes macromolecular structures. As summarized in Table 4, on the recall metrics our model achieves a COV-R of 0.929 and an MAT-R of 0.712, exceeding the best baseline, the diffusion model, with scores of 0.916 and 0.770, respectively. On the precision metrics, our model reaches a COV-P of 0.766 and an MAT-P of 0.945, the best among all compared methods. These results indicate a clear advantage in generating accurate macromolecular conformations. The effective mapping from 2D graphs to precise 3D conformations therefore validates our core design: the diffusion model ensures extensive generative coverage of the conformational space, while the pre-trained component injects essential prior knowledge to ensure geometric stability. This combined approach is particularly effective at generating plausible conformations for large, flexible drug-like molecules.

3.3. RMSD of Conformation Generation

To evaluate performance, we analyzed the RMSD of generated 3D conformations in the GEOM-QM9 dataset and its correlation with molecular size and complexity. RMSD values exhibited a positive correlation with the number of heavy atoms (Figure 2a). The median RMSD of atomic positions increased from 0.2 Å for molecules with a single heavy atom to 2.1 Å for those with nine. This increase was accompanied by a broadening of the interquartile range (IQR) and whiskers, indicating greater conformational heterogeneity in larger molecules. In contrast, RMSD showed minimal dependence on the number of rings (Figure 2b). The median RMSD remained consistently between 1.5 and 2.0 Å across molecules containing zero to seven rings, with negligible IQR variation. This stability suggests that ring systems impose conformational constraints that limit structural deviation.
Bond distance deviations exhibited a monotonic increase with the number of heavy atoms (Figure 2c), with the median rising from 0.07 Å for single-heavy-atom molecules to 0.10 Å for those with nine heavy atoms. This was accompanied by a progressive expansion of the IQR and overall range, reflecting greater bond length variability in larger frameworks. In contrast, bond distance deviations decreased as the ring count increased (Figure 2d). The highest median value (0.11 Å) was observed in acyclic molecules (zero rings), steadily declining to 0.06 Å for molecules with seven rings, indicating a ring-induced constraining effect on bond lengths.
Bond angle deviations exhibited a non-linear dependence on heavy atom count (Figure 2e). The median peaked at 12° for molecules with a single heavy atom, dropped to 3–5° for molecules with 2–5 heavy atoms, and subsequently rose to 10° for those with nine. This trend highlights the distinct angular geometries adopted by small versus large molecular assemblies. For ring-containing systems (Figure 2f), bond angle deviations increased with ring count, rising from a median of 6° for acyclic molecules (zero rings) to 15° for molecules with five rings, after which they plateaued. The IQR expanded markedly for molecules with three or more rings, underscoring the role of ring systems in inducing angular distortion.
Collectively, these results show that molecular size, measured by heavy atom count, is the primary driver of conformational flexibility and bond elongation. In contrast, ring systems play a secondary but counterbalancing role by constraining geometric variability and promoting shorter and more rigid bond metrics. These trends were consistent across all statistical descriptors.

3.4. Properties Prediction by SphereNet

To quantitatively assess the accuracy of the generated conformations for downstream tasks, we evaluated molecular property predictions based on conformations derived from DFT, RDKit, and our proposed PDCG.
As shown in Table 5, the mean absolute errors (MAEs) of properties predicted from PDCG-generated conformations are very close to those obtained from DFT reference conformations. Compared to RDKit, PDCG achieves lower MAEs across most properties, particularly for the energy-related labels (U0, U, H, and G). These results indicate that the conformation generated by PDCG could improve the precision of downstream predictive tasks.

4. Conclusions

This work introduces PDCG, a diffusion-driven framework for generating 3D molecular conformations that combines a molecular graph pre-training module with a diffusion model. The pre-training module uses 2D molecular graphs to extract useful embeddings from SMILES representations. These embeddings and the graph features are then processed by a diffusion module to yield 3D conformations. Comprehensive evaluations reveal that PDCG achieves significant enhancements in both the COV and MAT metrics on the GEOM-QM9 and GEOM-Drugs benchmark datasets. Downstream validation through molecular property prediction confirms that the conformational accuracy of PDCG is close to that of DFT-computed structures. Consequently, PDCG presents a promising and efficient alternative for key computational chemistry applications, such as structure-based virtual screening and the design of functional materials. Future developments of PDCG could add the conditional generation of conformations to discover functional molecules with desired properties. Improving the model for accurate multi-conformer generation is also a priority for our future research.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/chemistry8020029/s1, Table S1: Results of Diffusion Model with different iteration steps on the GEOM-QM9 dataset; Table S2: Results of Diffusion Model with different iteration steps on the GEOM-Drugs dataset; Table S3: Test of hyperparameters of batch size (batch) and learning rate (lr) on a subset of 300 molecules; Figure S1: Comparison of RMSD between generated conformations and DFT optimized conformations for four novel molecules from PubChem dataset; Figure S2: Comparison of generation time between PDCG and GeoMol+MRL, molecular SMILES: C=CCNC(=O)C1CCC(=O)N1S(=O)(=O)c1ccc(C)cc1.

Author Contributions

Conceptualization, J.S. and X.N.; methodology, X.N.; software, Y.L.; validation, Y.Z. and A.T.; investigation, L.Q.; resources, J.S.; writing—original draft preparation, Y.L.; writing—review and editing, J.S., X.N. and L.Q.; funding acquisition, J.S. and L.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by “the National Natural Science Foundation of China, grant number 22173083” and “the Fundamental Research Funds of State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, grant number CBIUZ2025001”.

Data Availability Statement

The original data presented in the study are openly available at https://doi.org/10.1038/s41597-022-01288-4.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PDCG: Pre-training diffusion conformation generation
SMILES: Simplified Molecular Input Line Entry System
RMSD: Root mean square deviation
COV-R: Coverage (Recall)
COV-P: Coverage (Precision)
MAT-R: Matching (Recall)
MAT-P: Matching (Precision)
MAE: Mean absolute error
IQR: Interquartile range

References

  1. Hawkins, P.C.D. Conformation Generation: The State of the Art. J. Chem. Inf. Model. 2017, 57, 1747–1756. [Google Scholar] [CrossRef]
  2. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Zidek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589. [Google Scholar] [CrossRef]
  3. Renaud, J.P.; Chari, A.; Ciferri, C.; Liu, W.T.; Remigy, H.W.; Stark, H.; Wiesmann, C. Cryo-EM in drug discovery: Achievements, limitations and prospects. Nat. Rev. Drug Discov. 2018, 17, 471–492. [Google Scholar] [CrossRef]
  4. Allinger, N.L. Calculation of Molecular Structure and Energy by Force-Field Methods. In Advances in Physical Organic Chemistry; Gold, V., Bethell, D., Eds.; Academic Press: Cambridge, MA, USA, 1976; Volume 13, pp. 1–82. [Google Scholar]
  5. Watts, K.S.; Dalal, P.; Murphy, R.B.; Sherman, W.; Friesner, R.A.; Shelley, J.C. ConfGen: A conformational search method for efficient generation of bioactive conformers. J. Chem. Inf. Model. 2010, 50, 534–546. [Google Scholar] [CrossRef]
  6. Jinnouchi, R.; Karsai, F.; Kresse, G. On-the-fly machine learning force field generation: Application to melting points. Phys. Rev. B. 2019, 100, 014105. [Google Scholar] [CrossRef]
  7. Wang, Z.; Zhong, H.; Zhang, J.; Pan, P.; Wang, D.; Liu, H.; Yao, X.; Hou, T.; Kang, Y. Small-Molecule Conformer Generators: Evaluation of Traditional Methods and AI Models on High-Quality Data Sets. J. Chem. Inf. Model. 2023, 63, 6525–6536. [Google Scholar] [CrossRef] [PubMed]
  8. Mansimov, E.; Mahmood, O.; Kang, S.; Cho, K. Molecular Geometry Prediction using a Deep Generative Graph Neural Network. Sci. Rep. 2019, 9, 20381. [Google Scholar] [CrossRef] [PubMed]
  9. Thomas, N.; Smidt, T.; Kearnes, S.; Yang, L.; Li, L.; Kohlhoff, K.; Riley, P. Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds. arXiv 2018, arXiv:1802.08219. [Google Scholar]
  10. Liberti, L.; Lavor, C.; Maculan, N.; Mucherino, A. Euclidean distance geometry and applications. SIAM Rev. 2014, 56, 3–69. [Google Scholar] [CrossRef]
  11. Simm, G.N.; Hernández-Lobato, J.M. A generative model for molecular distance geometry. arXiv 2019, arXiv:1909.11459. [Google Scholar]
  12. Xu, M.; Luo, S.; Bengio, Y.; Peng, J.; Tang, J. Learning neural generative dynamics for molecular conformation generation. arXiv 2021, arXiv:2102.10240. [Google Scholar]
  13. Xu, M.; Wang, W.; Luo, S.; Shi, C.; Bengio, Y.; Gomez-Bombarelli, R.; Tang, J. An end-to-end framework for molecular conformation generation via bilevel programming. In Proceedings of the 38th International Conference on Machine Learning (ICML), Virtual Event, 18–24 July 2021; pp. 11537–11547. [Google Scholar]
  14. Shi, C.; Luo, S.; Xu, M.; Tang, J. Learning gradient fields for molecular conformation generation. In Proceedings of the 38th International Conference on Machine Learning (ICML), Virtual Event, 18–24 July 2021; pp. 9558–9568. [Google Scholar]
  15. Ganea, O.-E.; Pattanaik, L.; Coley, C.W.; Barzilay, R.; Jensen, K.F.; Green, W.H.; Jaakkola, T.S. GEOMOL: Torsional geometric generation of molecular 3D conformer ensembles. In Proceedings of the 35th International Conference on Neural Information Processing Systems, Virtual Event, 6–14 December 2021; pp. 13757–13769. [Google Scholar]
  16. Zhang, Z.; Wang, G.; Li, R.; Ni, L.; Zhang, R.; Cheng, K.; Ren, Q.; Kong, X.; Ni, S.; Tong, X.; et al. Tora3D: An autoregressive torsion angle prediction model for molecular 3D conformation generation. J. Cheminf. 2023, 15, 57. [Google Scholar] [CrossRef] [PubMed]
  17. Gogineni, T.; Xu, Z.; Punzalan, E.; Jiang, R.; Kammeraad, J.; Tewari, A.; Zimmerman, P. Torsionnet: A reinforcement learning approach to sequential conformer search. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6–12 December 2020; pp. 20142–20153. [Google Scholar]
  18. Volokhova, A.; Koziarski, M.; Hernández-García, A.; Liu, C.-H.; Miret, S.; Lemos, P.; Thiede, L.; Yan, Z.; Aspuru-Guzik, A.; Bengio, Y. Towards equilibrium molecular conformation generation with GFlowNets. Digit. Discov. 2024, 3, 1038–1047. [Google Scholar] [CrossRef]
  19. Luo, S.; Shi, C.; Xu, M.; Tang, J. Predicting molecular conformation via dynamic graph score matching. In Proceedings of the 35th International Conference on Neural Information Processing Systems, Virtual Event, 6–14 December 2021; pp. 19784–19795. [Google Scholar]
  20. Janson, G.; Valdes-Garcia, G.; Heo, L.; Feig, M. Direct generation of protein conformational ensembles via machine learning. Nat. Commun. 2023, 14, 774. [Google Scholar] [CrossRef] [PubMed]
  21. Zhu, J.; Xia, Y.; Liu, C.; Wu, L.; Xie, S.; Wang, Y.; Wang, T.; Qin, T.; Zhou, W.; Li, H. Direct molecular conformation generation. arXiv 2022, arXiv:2202.01356. [Google Scholar]
  22. Yang, Z.; Xu, Y.; Pan, L.; Huang, T.; Wang, Y.; Ding, J.; Wang, L.; Xiao, J. Conf-GEM: A geometric information-assisted direct conformation generation model. Artif. Intell. Chem. 2024, 2, 100074. [Google Scholar] [CrossRef]
  23. Kuznetsov, M.; Ryabov, F.; Schutski, R.; Shayakhmetov, R.; Lin, Y.C.; Aliper, A.; Polykovskiy, D. COSMIC: Molecular Conformation Space Modeling in Internal Coordinates with an Adversarial Framework. J. Chem. Inf. Model. 2024, 64, 3610–3620. [Google Scholar] [CrossRef]
  24. Xu, M.; Yu, L.; Song, Y.; Shi, C.; Ermon, S.; Tang, J. Geodiff: A geometric diffusion model for molecular conformation generation. arXiv 2022, arXiv:2203.02923. [Google Scholar]
  25. Zhang, H.; Li, S.; Zhang, J.; Wang, Z.; Wang, J.; Jiang, D.; Bian, Z.; Zhang, Y.; Deng, Y.; Song, J.; et al. SDEGen: Learning to evolve molecular conformations from thermodynamic noise for conformation generation. Chem. Sci. 2023, 14, 1557–1568. [Google Scholar] [CrossRef]
  26. Fan, Z.; Yang, Y.; Xu, M.; Chen, H. EC-Conf: A ultra-fast diffusion model for molecular conformation generation with equivariant consistency. J. Cheminf. 2024, 16, 107. [Google Scholar] [CrossRef]
  27. Jing, B.; Corso, G.; Chang, J.; Barzilay, R.; Jaakkola, T. Torsional diffusion for molecular conformer generation. In Proceedings of the 36th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022; pp. 24240–24253. [Google Scholar]
  28. Hendrycks, D.; Lee, K.; Mazeika, M. Using pre-training can improve model robustness and uncertainty. In Proceedings of the 36th International Conference on Machine Learning (ICML), Long Beach, CA, USA, 6–15 June 2019; pp. 2712–2721. [Google Scholar]
  29. Chen, Y.; Liu, J.; Peng, L.; Wu, Y.; Xu, Y.; Zhang, Z. Auto-encoding variational Bayes. Cambr. Explor. Arts Sci. 2024, 2, ceas.v2i1.33. [Google Scholar] [CrossRef]
  30. Oord, A.v.d.; Dieleman, S.; Zen, H.; Simonyan, K.; Vinyals, O.; Graves, A.; Kalchbrenner, N.; Senior, A.; Kavukcuoglu, K. Wavenet: A generative model for raw audio. arXiv 2016, arXiv:1609.03499. [Google Scholar]
  31. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 3–5 June 2019; pp. 4171–4186. [Google Scholar]
  32. Hu, W.; Liu, B.; Gomes, J.; Zitnik, M.; Liang, P.; Pande, V.; Leskovec, J. Strategies for pre-training graph neural networks. arXiv 2019, arXiv:1905.12265. [Google Scholar]
  33. Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A simple framework for contrastive learning of visual representations. In Proceedings of the 37th International Conference on Machine Learning (ICML), Virtual Event, 12–18 July 2020; pp. 1597–1607. [Google Scholar]
  34. Rong, Y.; Bian, Y.; Xu, T.; Xie, W.; Wei, Y.; Huang, W.; Huang, J. Self-supervised graph transformer on large-scale molecular data. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Virtual Event, 6–12 December 2020; pp. 12559–12571. [Google Scholar]
  35. Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Virtual Event, 6–12 December 2020; pp. 6840–6851. [Google Scholar]
  36. Ji, X.; Wang, Z.; Gao, Z.; Zheng, H.; Zhang, L.; Ke, G. Uni-Mol2: Exploring Molecular Pretraining Model at Scale. arXiv 2024, arXiv:2406.14969. [Google Scholar]
  37. Li, H.; Zhang, R.; Min, Y.; Ma, D.; Zhao, D.; Zeng, J. A knowledge-guided pre-training framework for improving molecular representation learning. Nat. Commun. 2023, 14, 7568. [Google Scholar] [CrossRef]
  38. Wang, H.; Li, W.; Jin, X.; Cho, K.; Ji, H.; Han, J.; Burke, M.D. Chemical-reaction-aware molecule representation learning. arXiv 2021, arXiv:2109.09888. [Google Scholar]
  39. Zhang, R.; Lin, Y.; Wu, Y.; Deng, L.; Zhang, H.; Liao, M.; Peng, Y. MvMRL: A multi-view molecular representation learning method for molecular property prediction. Brief Bioinform. 2024, 25, bbae298. [Google Scholar] [CrossRef]
  40. Wang, X.; Zhao, H.; Tu, W.-w.; Yao, Q. Automated 3d pre-training for molecular property prediction. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 2419–2430. [Google Scholar]
  41. Alhamoud, K.; Ghunaim, Y.; Alshehri, A.S.; Li, G.; Ghanem, B.; You, F. Leveraging 2D molecular graph pretraining for improved 3D conformer generation with graph neural networks. Comput. Chem. Eng. 2024, 183, 108622. [Google Scholar] [CrossRef]
  42. Zheng, X.; Tomiura, Y. A BERT-based pretraining model for extracting molecular structural information from a SMILES sequence. J. Cheminf. 2024, 16, 71. [Google Scholar] [CrossRef]
  43. Axelrod, S.; Gomez-Bombarelli, R. GEOM, energy-annotated molecular conformations for property prediction and molecular generation. Sci. Data 2022, 9, 185. [Google Scholar] [CrossRef]
  44. Bento, A.P.; Hersey, A.; Felix, E.; Landrum, G.; Gaulton, A.; Atkinson, F.; Bellis, L.J.; De Veij, M.; Leach, A.R. An open source chemical structure curation pipeline using RDKit. J. Cheminf. 2020, 12, 51. [Google Scholar] [CrossRef]
  45. Liu, Y.; Wang, L.; Liu, M.; Liu, Y.; Zhang, X.; Oztekin, B.; Ji, S. Spherical message passing for 3D moecular graphs. arXiv 2022, arXiv:2102.05013. [Google Scholar]
Figure 1. Architecture of the PDCG network.
Figure 2. Box plot distributions of structural parameters: (a,b) RMSD of atomic positions; (c,d) RMSD of bond distances; (e,f) RMSD of bond angles. (a,c,e) Correlated with heavy atom count; (b,d,f) correlated with ring count. Central lines: medians; boxes: IQR; whiskers: 1.5 × IQR.
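The grouped summaries shown in Figure 2 can be reproduced from per-molecule RMSD values. A minimal sketch of the per-group box-plot statistics (median, IQR box edges, whiskers clipped to 1.5 × IQR beyond the box), assuming NumPy; the function name and inputs are illustrative, not from the paper's code:

```python
import numpy as np

def box_stats_by_group(values, groups):
    """Box-plot statistics per group key (e.g. heavy-atom count or
    ring count, as in Figure 2): median, inter-quartile range (IQR)
    box edges, and whiskers at most 1.5 * IQR beyond the box."""
    out = {}
    for key in sorted(set(groups)):
        v = np.sort(np.asarray(
            [x for x, g in zip(values, groups) if g == key], dtype=float))
        q1, med, q3 = np.percentile(v, [25, 50, 75])
        iqr = q3 - q1
        lo = v[v >= q1 - 1.5 * iqr].min()  # lowest point within whisker range
        hi = v[v <= q3 + 1.5 * iqr].max()  # highest point within whisker range
        out[key] = {"median": med, "q1": q1, "q3": q3,
                    "whisker_lo": lo, "whisker_hi": hi}
    return out
```

Points outside the whisker range (such as the RMSD of an unusually large molecule) would be drawn as outliers in the actual plot.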
Table 1. Results of the diffusion model with different diffusion steps on the GEOM-QM9 dataset ¹.

| Diffusion Steps | Sampling Steps | COV–R (%) ² | MAT–R (Å) ³ | COV–P (%) | MAT–P (Å) |
|---|---|---|---|---|---|
| 3000 | 3000 | 0.877 | 0.354 | 0.477 | 0.551 |
| 5000 | 5000 | **0.919** | **0.194** | **0.528** | **0.445** |
| 7000 | 7000 | 0.762 | 0.288 | 0.493 | 0.518 |
| 10,000 | 10,000 | 0.677 | 0.393 | 0.431 | 0.515 |

¹ The best value in each column is in bold. ² Higher is better. ³ Lower is better.
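The COV/MAT metrics tabulated here follow the standard recall/precision definitions used in the conformer-generation literature: COV-R is the fraction of reference conformers matched by at least one generated conformer within a threshold δ, and MAT-R is the mean over reference conformers of the best (minimum) RMSD; the precision variants swap the roles of the two sets. A minimal sketch, assuming the pairwise RMSD matrix has already been computed (e.g. with RDKit) and that δ is a user-chosen threshold (the paper's exact δ values are not restated here):

```python
import numpy as np

def cov_mat_recall(rmsd, delta=0.5):
    """Recall-style coverage/matching from an RMSD matrix.

    rmsd[i, j] = RMSD (Å) between reference conformer i and generated
    conformer j. COV-R: fraction of references matched within `delta`;
    MAT-R: mean over references of the minimum RMSD.
    """
    best = rmsd.min(axis=1)              # best generated match per reference
    cov_r = float((best < delta).mean())
    mat_r = float(best.mean())
    return cov_r, mat_r

def cov_mat_precision(rmsd, delta=0.5):
    """Precision variants (COV-P / MAT-P): transpose the matrix, i.e.
    measure how well each generated conformer matches some reference."""
    return cov_mat_recall(rmsd.T, delta)
```

Note that the tables report COV under a "(%)" header while listing fractions in [0, 1]; the sketch above returns fractions.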
Table 2. Results of the diffusion model with different diffusion steps on the GEOM-Drugs dataset ¹.

| Diffusion Steps | Sampling Steps | COV–R (%) ² | MAT–R (Å) ³ | COV–P (%) | MAT–P (Å) |
|---|---|---|---|---|---|
| 6000 | 6000 | 0.901 | 0.795 | 0.687 | 1.286 |
| 8000 | 8000 | 0.904 | 0.785 | 0.689 | 1.083 |
| 10,000 | 10,000 | **0.916** | **0.770** | **0.752** | **0.988** |
| 12,000 | 12,000 | 0.805 | 1.065 | 0.663 | 1.358 |

¹ The best value in each column is in bold. ² Higher is better. ³ Lower is better.
Table 3. Comparison of our model with the baseline models on the GEOM-QM9 dataset ¹.

| Model | COV–R (%) ² | MAT–R (Å) ³ | COV–P (%) | MAT–P (Å) |
|---|---|---|---|---|
| GraphDG | 0.733 | 0.425 | 0.439 | 0.581 |
| ConfVAE | 0.778 | 0.415 | 0.380 | 0.622 |
| CGCF | 0.781 | 0.422 | 0.662 | 0.661 |
| CONFGF | 0.885 | 0.267 | 0.522 | 0.464 |
| SDEGen | 0.815 | 0.357 | 0.484 | 0.566 |
| EC-Conf | 0.813 | 0.324 | 0.794 | 0.330 |
| GeoMol + MRL | 0.826 | 0.298 | **0.837** | **0.310** |
| Diffusion | 0.919 | 0.194 | 0.528 | 0.445 |
| PDCG | **0.934** | **0.188** | 0.752 | 0.369 |

¹ The best value in each column is in bold. ² Higher is better. ³ Lower is better.
Table 4. Comparison of our model with the benchmark models on the GEOM-Drugs dataset ¹.

| Model | COV–R (%) ² | MAT–R (Å) ³ | COV–P (%) | MAT–P (Å) |
|---|---|---|---|---|
| GraphDG | 0.083 | 1.972 | 0.021 | 2.434 |
| ConfVAE | 0.552 | 1.238 | 0.230 | 1.829 |
| CGCF | 0.540 | 1.249 | 0.217 | 1.857 |
| CONFGF | 0.622 | 1.163 | 0.234 | 1.722 |
| SDEGen | 0.673 | 1.126 | 0.323 | 1.679 |
| EC-Conf | 0.864 | 0.902 | 0.701 | 1.108 |
| GeoMol + MRL | 0.815 | 1.132 | 0.756 | 1.045 |
| Diffusion | 0.916 | 0.770 | 0.752 | 0.988 |
| PDCG | **0.929** | **0.712** | **0.766** | **0.945** |

¹ The best value in each column is in bold. ² Higher is better. ³ Lower is better.
Table 5. Comparison of property prediction errors by SphereNet using DFT-optimized conformations versus conformations generated by our model (PDCG).

| Property | Unit | DFT | PDCG |
|---|---|---|---|
| μ | D | 0.0245 | 0.0330 |
| α | a₀³ | 0.0449 | 0.0510 |
| ε_HOMO | eV | 0.0228 | 0.0329 |
| ε_LUMO | eV | 0.0189 | 0.0216 |
| ε_gap | eV | 0.0313 | 0.0511 |
| ⟨R²⟩ | a₀² | 0.2680 | 0.4367 |
| zpve | eV | 0.0011 | 0.0020 |
| C_v | cal/(mol·K) | 0.0215 | 0.0255 |
| U₀ | eV | 0.0063 | 0.0634 |
| U | eV | 0.0064 | 0.0450 |
| H | eV | 0.0063 | 0.0568 |
| G | eV | 0.0078 | 0.0447 |
Share and Cite

MDPI and ACS Style

Liu, Y.; Zheng, Y.; Tariq, A.; Nan, X.; Qu, L.; Song, J. PDCG: A Diffusion Model Guided by Pre-Training for Molecular Conformation Generation. Chemistry 2026, 8, 29. https://doi.org/10.3390/chemistry8020029