Abstract: Spectra-structure relationships were investigated for estimating the anomeric configuration, residues and linkage types of linear and branched trisaccharides from 13C-NMR chemical shifts. The study used 119 pyranosyl trisaccharides that are trimers of the α or β anomers of D-glucose, D-galactose, D-mannose, L-fucose or L-rhamnose residues bonded through α or β glycosidic linkages of types 1→2, 1→3, 1→4 or 1→6, as well as methoxylated and/or N-acetylated amino trisaccharides. Machine learning experiments were performed for: (1) classification of the anomeric configuration of the first unit, second unit and reducing end; (2) classification of the type of the first and second linkages; (3) classification of the three residues: reducing end, middle and first residue; and (4) classification of the chain type. Our previous model for predicting the structure of disaccharides was incorporated into this new model, which improved its predictive power. The best results were achieved using Random Forests with 204 di- and trisaccharides in the training set: on the basis of unassigned chemical shifts, the model correctly classified 83%, 90%, 88%, 85%, 85%, 75%, 79%, 68% and 94% of the test set (69 compounds) for the nine tasks, respectively.
Keywords: machine learning techniques; Random Forest; classification tree; CPGNN; 13C-NMR; oligosaccharides; disaccharides; trisaccharides
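As an illustration of how unassigned 13C-NMR chemical shifts could serve as molecular descriptors, the sketch below bins a peak list into a fixed-length count vector and lists the nine classification tasks named in the abstract. The binning scheme (0 to 200 ppm in 2 ppm intervals), the function name `bin_shifts` and the example peak list are assumptions made for illustration only; they are not taken from the paper, which may encode spectra differently.

```python
# Minimal sketch: turn an unassigned 13C peak list into a fixed-length descriptor.
# The range, bin width and peak values are illustrative assumptions, not the paper's.

# The nine classification tasks, in the order given in the abstract.
TASKS = [
    "anomeric configuration, first unit",
    "anomeric configuration, second unit",
    "anomeric configuration, reducing end",
    "linkage type, first linkage",
    "linkage type, second linkage",
    "residue, reducing end",
    "residue, middle",
    "residue, first",
    "chain type",
]


def bin_shifts(shifts_ppm, lo=0.0, hi=200.0, width=2.0):
    """Count peaks per chemical-shift interval; no peak assignment is needed."""
    n_bins = int((hi - lo) / width)
    vec = [0] * n_bins
    for s in shifts_ppm:
        if lo <= s < hi:  # peaks outside the window are ignored
            vec[int((s - lo) // width)] += 1
    return vec


# Hypothetical peak list (ppm) in the typical carbohydrate region.
example_peaks = [103.5, 96.8, 76.4, 76.1, 73.9, 70.2, 61.3]
descriptor = bin_shifts(example_peaks)
```

A vector of this kind, one per compound, could then be passed to any classifier (for example a Random Forest), with one model trained per task in `TASKS`.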
MDPI and ACS Style
Pereira, F. 1D 13C-NMR Data as Molecular Descriptors in Spectra — Structure Relationship Analysis of Oligosaccharides. Molecules 2012, 17, 3818-3833.
Pereira F. 1D 13C-NMR Data as Molecular Descriptors in Spectra — Structure Relationship Analysis of Oligosaccharides. Molecules. 2012; 17(4):3818-3833.
Pereira, Florbela. 2012. "1D 13C-NMR Data as Molecular Descriptors in Spectra — Structure Relationship Analysis of Oligosaccharides." Molecules 17, no. 4: 3818-3833.