Constructing 3-Dimensional Atomic-Resolution Models of Nonsulfated Glycosaminoglycans with Arbitrary Lengths Using Conformations from Molecular Dynamics

Glycosaminoglycans (GAGs) are the linear carbohydrate components of proteoglycans (PGs) and are key mediators in the bioactivity of PGs in animal tissue. GAGs are heterogeneous, conformationally complex, and polydisperse, containing up to 200 monosaccharide units. These complexities make studying GAG conformation a challenge for existing experimental and computational methods. We previously described an algorithm we developed that applies conformational parameters (i.e., all bond lengths, bond angles, and dihedral angles) from molecular dynamics (MD) simulations of nonsulfated chondroitin GAG 20-mers to construct 3-D atomic-resolution models of nonsulfated chondroitin GAGs of arbitrary length. In the current study, we applied our algorithm to other GAGs, including hyaluronan and nonsulfated forms of dermatan, keratan, and heparan and expanded our database of MD-generated GAG conformations. Here, we show that individual glycosidic linkages and monosaccharide rings in 10- and 20-mers of hyaluronan and nonsulfated dermatan, keratan, and heparan behave randomly and independently in MD simulation and, therefore, using a database of MD-generated 20-mer conformations, that our algorithm can construct conformational ensembles of 10- and 20-mers of various GAG types that accurately represent the backbone flexibility seen in MD simulations. Furthermore, our algorithm efficiently constructs conformational ensembles of GAG 200-mers that we would reasonably expect from MD simulations.


Introduction
Proteoglycans (PGs) are a diverse group of proteins in the extracellular matrix (ECM), as well as on and within cells in animal tissue. PGs play key roles in signal transduction [1,2], tissue morphogenesis [3][4][5][6][7], and matrix assembly [7][8][9][10] by binding growth factors [3][4][5][6][7][11][12][13][14][15][16][17], enzymes [7,17], membrane receptors [17], and ECM molecules [2,7,17]. Many of these functions are mediated by glycosaminoglycans (GAGs), which are linear, highly negatively-charged, and structurally diverse carbohydrate chains covalently-linked to PGs. Specifically, GAGs can form covalent and noncovalent complexes with proteins or inhibit complex formation with other biomolecules. All of these functions allow GAGs to modulate disease. For example, hyaluronan can predict disease outcome and is used as Monosaccharide ring geometries are quantified by all bond lengths, bond angles, and dihedral angles within the ring and in exocyclic functional groups that are not part of a glycosidic linkage. Cremer-Pople (C-P) ring puckering parameters (ϕ, θ, Q) [91] of each monosaccharide ring in the MDgenerated 20-mer ensembles were computed to characterize potential ring puckering patterns. Conformations of each monosaccharide and each linkage of the 20-mers were extracted separately from each snapshot of the MD trajectories. First, to determine if conformational data in different runs matched, and if there was interdependency in the individual linkage and ring conformations, data were separated by run and monosaccharide/linkage number and aggregated across all snapshots in each MD simulation run. Next, all individual conformations were aggregated across all snapshots in all runs (e.g., 10,000 snapshots * 4 runs * 10 GlcA monosaccharides = 400,000 samples of GlcA monosaccharide conformations in hyaluronan 20-mer) to generate a single dataset for each of the two monosaccharide types and two linkage types in each GAG (e.g., GlcA monosaccharide, GlcNAc Monosaccharide ring geometries are quantified by all bond lengths, bond angles, and dihedral angles within the ring and in exocyclic functional groups that are not part of a glycosidic linkage. Cremer-Pople (C-P) ring puckering parameters (φ, θ, Q) [91] of each monosaccharide ring in the MD-generated 20-mer ensembles were computed to characterize potential ring puckering patterns. Conformations of each monosaccharide and each linkage of the 20-mers were extracted separately from each snapshot of the MD trajectories. First, to determine if conformational data in different runs matched, and if there was interdependency in the individual linkage and ring conformations, data were separated by run and monosaccharide/linkage number and aggregated across all snapshots in each MD simulation run. Next, all individual conformations were aggregated across all snapshots in all runs (e.g., 10,000 snapshots * 4 runs * 10 GlcA monosaccharides = 400,000 samples of GlcA monosaccharide conformations in hyaluronan 20-mer) to generate a single dataset for each of the two monosaccharide types and two linkage types in each GAG (e.g., GlcA monosaccharide, GlcNAc monosaccharide, β1-3 glycosidic linkage, and β1-4 glycosidic linkage conformations in hyaluronan 20-mer).

Construction Algorithm to Generate GAG Conformational Ensembles
The MD-generated conformational datasets described above made up the database used for the algorithm we developed to generate GAG polymer conformational ensembles of user-specified length and with a user-specified number of conformations (previously described) [33]. Essentially, our algorithm (1) incorporates all bond, bond angle, and dihedral angle conformational parameters from MD simulation of GAG 20-mers, (2) treats monosaccharide rings and glycosidic linkages independently, (3) performs a restrained energy minimization on each constructed conformation to relieve steric overlap while maintaining polymer conformation, and (4) applies a bond potential energy cutoff to exclude conformations that remain nonphysical after minimization. A nonphysical conformation results from either overlapping bonds or a bond that pierces the center of another monosaccharide ring in the initial constructed conformation ( Figure S1). As dihedral angles are restrained, minimization addresses these issues by stretching the overlapping or ring-piercing bonds to nonphysical lengths, which increases the post-minimization bond potential energy by more than 132 kcal/mol [33]. The energy cutoff is the sum of a 100 kcal/mol buffer and a polymer-length specific cutoff, which is equal to the post-minimization bond potential energy of the fully-extended conformation. This conformation was constructed by assigning energetically-favorable glycosidic linkage dihedrals from the corresponding MD simulations of each GAG type that give a fully-extended conformation (i.e., with the maximum end-to-end distance observed in simulation). We used the algorithm to construct 20-mer conformational ensembles of each nonsulfated GAG.
For internal validation of the algorithm, glycosidic linkage dihedral free energies ∆G(φ,ψ), monosaccharide ring C-P parameters, end-to-end distance distributions, and radii of gyration from MD-generated ensembles and constructed ensembles were compared. Additionally, post-minimization bond potential energy distributions from constructed ensembles were analyzed to validate the calculated bond potential energy cutoff and verify that the ensemble had expected energy distributions for the given polymer length.
To evaluate application of MD-generated 20-mer conformations to construct GAG polymers of variable length, we constructed 10-mer ensembles using the algorithm and compared them to 10-mer ensembles generated using the same protocol as 20-mer MD simulations. To assess the efficacy and efficiency of our algorithm to construct conformational ensembles of GAG polymers with biologically-relevant chain lengths, we also constructed 200-mer ensembles.

Molecular Dynamics Simulations: Glycosidic Linkage and Monosaccharide Ring Geometry Effects on Polymer Backbone Flexibility
To examine the backbone flexibility of hyaluronan 20-mers, end-to-end distances and radii of gyration were analyzed in each of the four MD simulation runs. One of the four runs produced more compact conformations, i.e., with lower end-to-end distances and radii of gyration than the other three (Figures 2 and S2a and Table 1). The system potential energy distributions were identical for all four runs, suggesting that this outlying run was energetically stable ( Figure S3). To uncover the conformational factors that contributed to end-to-end distance and explain the differences in backbone flexibility in the different MD runs, the monosaccharide ring and glycosidic linkage conformations were examined.   1 Probabilities were calculated for end-to-end distances sorted into 0.5 Å bins for the 20-mer ensembles and 0.25 Å bins for the 10-mer ensembles. 2 All = end-to-end distance distribution aggregated across all four runs.

Figure 2.
End-to-end distance probability distribution of MD-generated hyaluronan 20-mer ensemble; each of the four runs includes 10,000 conformations; probabilities were calculated for end-to-end distances sorted into 0.5 Å bins. Analysis of C-P parameters of GlcNAc monosaccharide rings revealed mostly 4 C 1 chair conformations, as found in crystal structures [92][93][94][95][96][97][98][99] and NMR and force field studies [67,[100][101][102], with a minority of samples in boat and skew-boat conformations, namely B 3,O , 1 S 3 , 1,4 B, and 1 S 5 (Figures 3a and 4a). While non-4 C 1 conformations of GlcNAc are rare, they have been observed in crystal structures of protein co-complexes [95,[103][104][105][106]. For example, one GlcNAc ring in the crystal structure of a hyaluronan tetramer in complex with hyaluronidase was found in the 1,4 B boat conformation [95]. Additionally, GlcNAc boat/skew-boat conformations have been sampled in biased MD [107], 1-µs unbiased MD [108], and 10-µs unbiased MD simulations of GlcNAc monosaccharides [100]. One NMR and force field study suggested that these conformations were important for GAG-protein interactions [100]. Analysis of individual GlcNAc rings in each run revealed that these non-4 C 1 conformations were not specific to any particular region of the 20-mer and did not occur simultaneously in multiple rings in the same snapshot ( Figure S4). Furthermore, GlcNAc rings that sampled non-4 C 1 puckers returned to 4 C 1 chair conformations relatively quickly, i.e., in under 25 ns ( Figure S4). -mer; geometries from the four sets of each ensemble are represented by red, green, blue, and magenta dots, respectively and the force-field geometry is represented by a single large black dot; Cremer-Pople parameters (ϕ, θ) for all rings in every tenth snapshot from each ensemble were plotted (i.e., 10 rings × 1,000 snapshots per run × 4 runs = 40,000 parameter sets for the 20-mer and 20,000 parameter sets for the 10-mer). -mer; geometries from the four sets of each ensemble are represented by red, green, blue, and magenta dots, respectively and the force-field geometry is represented by a single large black dot; Cremer-Pople parameters (φ, θ) for all rings in every tenth snapshot from each ensemble were plotted (i.e., 10 rings × 1000 snapshots per run × 4 runs = 40,000 parameter sets for the 20-mer and 20,000 parameter sets for the 10-mer).
C-P analysis of GlcA monosaccharide rings in the hyaluronan 20-mer showed that GlcA is found mostly in 4 C 1 chair conformations (Figure 3b), as seen in crystal structures [92][93][94][95] and NMR and force field studies [51,67,101,102,107,109,110], with some boat and skew-boat conformations, including 3 S 1 , B 1,4 , 5 S 1 , 2,5 B, 2 S O , 1 S 3 , 1,4 B, and 1 S 5 (Figures 3b and 4b,c), which were also observed in GlcA rings in unbiased MD simulations of nonsulfated chondroitin 20-mers [33]. Some of these conformations were also sampled rarely in unbiased MD simulations of GlcA monosaccharides [51,107]. The slight differences between boat/skew-boat conformations in our simulations and monosaccharide simulations may have arisen from intramolecular interactions in the hyaluronan 20-mer. As with GlcNAc rings, these conformations did not occur consistently in any one region of the 20-mer, did not occur simultaneously in multiple rings in the same snapshot, and returned to 4 C 1 chair conformations quickly, i.e., within 40 ns ( Figure S5). Except for 2 S O , each of these GlcA and GlcNAc boat and skew-boat conformations brought the linker oxygen atoms closer together, causing a kink in the polymer chain ( Figure 4), which may have contributed to more compact polymer conformations. To investigate this, end-to-end distances and radii of gyration of 20-mer conformations with non-4 C 1 ring puckers were analyzed ( Figure S6) and both compact and extended conformations were observed. This can be explained by the flexibility in the glycosidic linkages. Of note, MD-generated atomic-resolution snapshots before and after a monosaccharide ring underwent a conformational change between 4 C 1 and non-4 C 1 were visualized in VMD to check for ionic interactions (e.g., between Na + and carboxylate O − atoms and/or partially-charged hydroxyl O atoms in GlcA, between Cl − and hydroxyl H atoms in GlcA, between Na + and partially-charged N-acetyl N or O atoms and/or hydroxyl O atoms in GlcNAc, or between Cl − and H atoms of methyl and/or hydroxyl groups in GlcNAc). Specifically, we looked for ions within 5 Å of the exocyclic atoms of the monosaccharide of interest and found that ionic interactions were most often absent during, and thus, did not significantly contribute to, ring puckering in the MD simulation of hyaluronan 20-mer. 20-mer with a 5 S1 GlcA conformer (ring atoms in cyan; linkage atoms in red) and 3 S1, B1,4, 5 S1, 2,5 B, 1 S3, 1,4 B, and 1 S5 GlcA monosaccharide conformations (endocyclic ring atoms and linker oxygen atoms only); (c) 20-mer with a 2 SO GlcA conformer (ring atoms in cyan; linkage atoms in red) and 4 C1 and 2 SO GlcA monosaccharide conformations (endocyclic ring atoms and linker oxygen atoms only); in each 20-mer snapshot, all gray rings are in 4 C1 conformations.
C-P analysis of GlcA monosaccharide rings in the hyaluronan 20-mer showed that GlcA is found mostly in 4 C1 chair conformations (Figure 3b), as seen in crystal structures [92][93][94][95] and NMR and force field studies [51,67,101,102,107,109,110], with some boat and skew-boat conformations, including 3 S1, B1,4, 5 S1, 2,5 B, 2 SO, 1 S3, 1,4 B, and 1 S5 (Figures 3b and 4b,c), which were also observed in GlcA rings in unbiased MD simulations of nonsulfated chondroitin 20-mers [33]. Some of these conformations were also sampled rarely in unbiased MD simulations of GlcA monosaccharides [51,107]. The slight To analyze glycosidic linkage conformations, free energies of glycosidic linkage dihedrals ∆G(φ, ψ) were calculated (Figure 5a,b and Table 2). If we assume O 5 -C 1 -O-C n = H 1 -C 1 -O-C n -120 • for φ values and C 1 -O-C 3 -C 2 = C 1 -O-C 3 -H 3 -120 • and C1-O-C 4 -C 3 = C 1 -O-C 4 -H 4 + 120 • for ψ values, our data match well with X-ray diffraction and force field data from hyaluronan tetrasaccharides [49], as well as NMR and molecular mechanics data from hyaluronan tetrasaccharides [111], hexasaccharides [112], and octasaccharides [113]. If we assume C 1 -O-C 3 -C 2 = C 1 -O-C 3 -C 4 + 120 • and C 1 -O-C 4 -C 3 = C 1 -O-C 4 -C 5 -120 • , our data are also in close agreement with X-ray diffraction and quantum molecular modeling data [65,[114][115][116], as well as NMR and MD data [102]. This agreement is with conformations sampled about the global minimum in each linkage in our MD simulations. All of these conformations were helical, so it stands to reason that 2-, 3-, and 8-fold helices were observed in hyaluronan oligosaccharides. This does not necessarily mean that adjacent glycosidic linkage conformations were interdependent. In fact, variation in the types of helices observed in the force field and experimental studies in only short oligosaccharide units suggests a lack of consistent and uniform helical structures in long hyaluronan polymers.   Our data also show that a secondary basin was sampled in each linkage type in our MD simulations. In GlcAβ1-3GlcNAc linkages, conformations in the upper left quadrant (−ϕ, +ψ) were sampled. A molecular modeling study that reportedly sampled the full allowable range of glycosidic linkage conformations in agreement with experimental data for several disaccharide units also   Our data also show that a secondary basin was sampled in each linkage type in our MD simulations. In GlcAβ1-3GlcNAc linkages, conformations in the upper left quadrant (−φ, +ψ) were sampled. A molecular modeling study that reportedly sampled the full allowable range of glycosidic linkage conformations in agreement with experimental data for several disaccharide units also revealed hyaluronan GlcAβ1-3GlcNAc linkage conformations in a secondary basin similar to ours [117]. Furthermore, crystal structures of hyaluronan oligosaccharides in complexes with proteins [92][93][94][95] were examined and we found that most glycosidic linkage conformations were similar to the most energetically-favorable conformations observed in MD simulation, with the exception of one GlcAβ1-3GlcNAc linkage conformation with −φ, +ψ dihedrals, which was seen in a hyaluronan hexasaccharide in complex with hyaluronidase [94]. This suggests that, although rare, this is a physical conformation. Analysis of individual glycosidic linkages in individual runs ( Figure S7) showed that this basin was sampled mostly in two particular GlcAβ1-3GlcNAc linkages in the MD run with the lowest end-to-end distance distribution. Therefore, 20-mer MD snapshots with these glycosidic linkage conformations were analyzed; we found that these linkage conformations were associated with lower end-to-end distances ( Figure S8 and Table S1). Visual analysis of these snapshots showed that GlcAβ1-3GlcNAc glycosidic linkages with −φ, +ψ dihedrals caused a kink in the polymer, which explains the more compact polymer conformations in comparison to those containing only linkage conformations at the global free energy minimum ( Figure S8). To find out if there was a connection between these glycosidic linkage conformations and non-4 C 1 puckers of adjacent monosaccharide rings, glycosidic linkages flanking non-4 C 1 ring puckers were plotted, and they were all centered about the global minimum, i.e., −φ, −ψ ( Figure S9). In fact, there did not appear to be any correlation between GlcAβ1-3GlcNAc linkage −φ, +ψ conformations and a non-4 C 1 ring pucker in any region of the polymer.

GlcNAcβ1-4 GlcA
Analysis of GlcNAcβ1-4GlcA glycosidic linkage conformations ( Figure 5b and Table 2) revealed secondary minima in the lower left quadrant (−φ, −ψ), which was also seen in a molecular modeling study of GAG disaccharides with results validated by experimental data [117]. In hyaluronan 20-mer MD simulations, these conformations were sampled more in the GlcNAcβ1-4GlcA linkages neighboring the GlcAβ1-3GlcNAc linkages with −φ, +ψ dihedrals in the MD run with the lowest end-to-end distances. The same analyses to those performed on 20-mer conformations with GlcAβ1-3GlcNAc linkages with −φ, +ψ dihedrals were performed on MD-generated 20-mer snapshots with GlcNAcβ1-4GlcA linkages with −φ, −ψ dihedrals. Similarly, these glycosidic linkage conformations caused a kink in the polymer chain and were associated with lower end-to-end distances ( Figure S10 and Table S1). To determine if there was a correlation between GlcAβ1-3GlcNAc −φ, +ψ linkage dihedrals and GlcNAcβ1-4GlcA −φ, −ψ linkage dihedrals, snapshots with both of these linkage conformations flanking the same monosaccharide ring were examined. It was found that there were very few snapshots in which this was the case. GlcNAcβ1-4GlcA linkage conformations flanking non-4 C 1 monosaccharide rings were also analyzed, and these conformations were centered about the global minimum, i.e., −φ, +ψ ( Figure S9). Additionally, ionic interactions (e.g., between Na + , O − in GlcA carboxylate group, and O in C=O of adjacent GlcNAc N-acetyl group, between Cl − , H in GlcNAc methyl group, and H in GlcA hydroxyl groups, or between Na + and O atoms or Cl − and H atoms of hydroxyls in adjacent monosaccharides) did not appear to play a role in the transition of glycosidic linkage conformations between their primary and secondary φ, ψ basins.
Hyaluronan 10-mers were also simulated and showed the same conformations as the 20-mer in MD simulation except for GlcNAc monosaccharide rings, which sampled only 4 C 1 chair conformations in the 10-mer (Figures 3c,d and 5c,d and Table 2). This was expected, as these conformations occurred very few times in 20-mer MD simulations, and there were half as many samples (i.e., monosaccharide rings) in the 10-mer.

Construction Algorithm
Our construction algorithm was used to construct four sets of hyaluronan 20-mer ensembles with 10,000 conformations each (40,000 conformations total). A comparison of end-to-end distance distributions ( Figure 6a and Table 1) and radii of gyration ( Figure S2a,b) showed that our construction algorithm produces ensembles that mimic backbone flexibility observed in MD simulations. Furthermore, the C-P parameters ( Figure S11) and glycosidic linkage dihedral free energies ∆G(φ, ψ) ( Figure S12) of constructed conformations post-minimization matched those of MD-generated ensembles (i.e., construction algorithm input data; Figures 3a,b and 5a,b and Table 2), indicating that dihedral angles were properly restrained and that minimization did not change the overall ring and linkage conformation.  A hyaluronan 10-mer ensemble with 40,000 conformations was also constructed using the algorithm; similarly, the end-to-end distances and radii of gyration of the constructed 10-mer ensemble matched those of the MD-generated 10-mer ensemble (Figures 6b and S2c,d and Table 1). Both ensembles contained mostly extended conformations, as expected for short oligosaccharides with fewer opportunities for kinks or curves.
Finally, a hyaluronan 200-mer ensemble containing four sets of 10,000 conformations was A hyaluronan 10-mer ensemble with 40,000 conformations was also constructed using the algorithm; similarly, the end-to-end distances and radii of gyration of the constructed 10-mer ensemble matched those of the MD-generated 10-mer ensemble (Figures 6b and S2c,d and Table 1). Both ensembles contained mostly extended conformations, as expected for short oligosaccharides with fewer opportunities for kinks or curves.
Finally, a hyaluronan 200-mer ensemble containing four sets of 10,000 conformations was constructed using the algorithm. The construction procedure produced end-to-end distances, radii of gyration, and bond potential energies of all conformations, as well as PDB files of conformations with most probable end-to-end distances, bond potential energies above that of the fully-extended conformation, and all excluded conformations (for a total of about 1000 PDBs for each of the four sets) for visualization and validation of the bond potential energy cutoff. As seen in MD-generated hyaluronan 10-and 20-mer ensembles, we expected the end-to-end distance distribution curve skewness to shift toward the right (i.e., lower end-to-end distances indicating more compact conformations) with increasing polymer length, as there were more opportunities for kinks and curves resulting from ring puckering and the flexibility of the glycosidic linkages. This pattern was seen in the end-to-end distance distributions of the hyaluronan 200-mer constructed ensembles ( Figure 7) and in those of nonsulfated chondroitin 100-and 200-mer constructed ensembles [33]. The relationship between end-to-end distance and radius of gyration was decreasingly linear with increasing polymer length. In other words, conformations with the same end-to-end distance have a wider range of radii of gyration in longer polymers. This was evidenced by comparison of R 2 values of the end-to-end distance vs. radius of gyration regression lines of hyaluronan 20-and 10-mer ensembles, both MD-generated and constructed ( Figure S2a-d). As expected, the end-to-end distance and radius of gyration relationship was decreasingly linear with increasing polymer length, as shown by comparison of hyaluronan 10-, 20-, and 200-mer constructed ensembles ( Figure S2b,d,e). This suggests that our algorithm constructs hyaluronan 200-mer ensembles with the backbone conformations that we would expect to see in MD simulations. hyaluronan 10-and 20-mer ensembles, we expected the end-to-end distance distribution curve skewness to shift toward the right (i.e., lower end-to-end distances indicating more compact conformations) with increasing polymer length, as there were more opportunities for kinks and curves resulting from ring puckering and the flexibility of the glycosidic linkages. This pattern was seen in the end-to-end distance distributions of the hyaluronan 200-mer constructed ensembles ( Figure 7) and in those of nonsulfated chondroitin 100-and 200-mer constructed ensembles [33]. The relationship between end-to-end distance and radius of gyration was decreasingly linear with increasing polymer length. In other words, conformations with the same end-to-end distance have a wider range of radii of gyration in longer polymers. This was evidenced by comparison of R 2 values of the end-to-end distance vs. radius of gyration regression lines of hyaluronan 20-and 10-mer ensembles, both MD-generated and constructed ( Figure S2a-d). As expected, the end-to-end distance and radius of gyration relationship was decreasingly linear with increasing polymer length, as shown by comparison of hyaluronan 10-, 20-, and 200-mer constructed ensembles ( Figure S2b,d,e). This suggests that our algorithm constructs hyaluronan 200-mer ensembles with the backbone conformations that we would expect to see in MD simulations. Figure 7. End-to-end distance probability distribution of constructed ensemble of hyaluronan 200mer; most probable end-to-end distance across all four sets is 235 Å; probabilities were calculated for end-to-end distances sorted into 5 Å bins; the ensemble contains four sets of 10,000 conformations.

Molecular Dynamics Simulations: Glycosidic Linkage and Monosaccharide Ring Geometry Effects on Polymer Backbone Flexibility
MD simulations of nonsulfated dermatan 20-mer revealed relatively rigid and linear backbone conformations, as evidenced by the narrow and highly left-skewed end-to-end distance distributions, which match in all four MD runs ( Figure 8 and Table 3), and the linear relationship between end-toend distance and radius of gyration ( Figure S13a). To understand the factors contributing to this rigid behavior of the dermatan 20-mer in simulation, conformations of monosaccharide rings and glycosidic linkages were analyzed. . End-to-end distance probability distribution of constructed ensemble of hyaluronan 200-mer; most probable end-to-end distance across all four sets is 235 Å; probabilities were calculated for end-to-end distances sorted into 5 Å bins; the ensemble contains four sets of 10,000 conformations.

Molecular Dynamics Simulations: Glycosidic Linkage and Monosaccharide Ring Geometry Effects on Polymer Backbone Flexibility
MD simulations of nonsulfated dermatan 20-mer revealed relatively rigid and linear backbone conformations, as evidenced by the narrow and highly left-skewed end-to-end distance distributions, which match in all four MD runs ( Figure 8 and Table 3), and the linear relationship between end-to-end distance and radius of gyration ( Figure S13a). To understand the factors contributing to this rigid behavior of the dermatan 20-mer in simulation, conformations of monosaccharide rings and glycosidic linkages were analyzed.  Figure 8. End-to-end distance probability distribution of MD-generated nonsulfated dermatan 20-mer ensemble; each of the four runs includes 10,000 conformations; probabilities were calculated for endto-end distances sorted into 0.5 Å bins. 1 Probabilities were calculated for end-to-end distances sorted into 0.5 Å bins for the 20-mer ensembles and 0.25 Å bins for the 10-mer ensembles. 2 All = end-to-end distance distribution aggregated across all four runs.
GalNAc monosaccharide ring C-P parameters ( Figure 9a) show mostly 4 C1 conformations, as well-established by X-ray diffraction [49,106,118], NMR [47,119], and force field [33,47,49,107] data. Biased MD simulations of β-GalNAc monosaccharides showed that nonsulfated β-GalNAc sampled boat and skew-boat conformations, namely B3,O, 1 S3, and 1,4 B, with relatively high free energies [107]. In line with this study, our simulations showed very few boat and skew-boat puckers, namely 1 S3, 1,4 B, 1 S5 (Figure 4a; -3GalNAcβ1-endocyclic ring and linker oxygen atoms were identical to those of -3GlcNAcβ1-), and B2,5, on the microsecond timescale. These puckers were sampled by different rings of the 20-mer in different MD runs and returned to 4 C1 within 10 ns ( Figure S14), suggesting that this behavior was random, and confirming that non-4 C1 GalNAc puckers are not energetically favorable. While these boat and skew-boat conformations caused a kink in the polymer chain, there were so few occurrences of this that they did not impact the overall end-to-end distance distribution. Figure 8. End-to-end distance probability distribution of MD-generated nonsulfated dermatan 20-mer ensemble; each of the four runs includes 10,000 conformations; probabilities were calculated for end-to-end distances sorted into 0.5 Å bins.

Table 3. Most Probable End-to-End Distances (d) in MD-Generated and Constructed Nonsulfated
Dermatan Ensembles 1 . 1 Probabilities were calculated for end-to-end distances sorted into 0.5 Å bins for the 20-mer ensembles and 0.25 Å bins for the 10-mer ensembles. 2 All = end-to-end distance distribution aggregated across all four runs.

20-mer Ensembles 10-mer Ensembles
GalNAc monosaccharide ring C-P parameters ( Figure 9a) show mostly 4 C 1 conformations, as well-established by X-ray diffraction [49,106,118], NMR [47,119], and force field [33,47,49,107] data. Biased MD simulations of β-GalNAc monosaccharides showed that nonsulfated β-GalNAc sampled boat and skew-boat conformations, namely B 3,O , 1 S 3 , and 1,4 B, with relatively high free energies [107]. In line with this study, our simulations showed very few boat and skew-boat puckers, namely 1 S 3 , 1,4 B, 1 S 5 (Figure 4a; -3GalNAcβ1-endocyclic ring and linker oxygen atoms were identical to those of -3GlcNAcβ1-), and B 2,5 , on the microsecond timescale. These puckers were sampled by different rings of the 20-mer in different MD runs and returned to 4 C 1 within 10 ns ( Figure S14), suggesting that this behavior was random, and confirming that non-4 C 1 GalNAc puckers are not energetically favorable. While these boat and skew-boat conformations caused a kink in the polymer chain, there were so few occurrences of this that they did not impact the overall end-to-end distance distribution. IdoA in the 20-mer and (c) GalNAc and (d) IdoA in the 10-mer; geometries from the four sets of each ensemble are represented by red, green, blue, and magenta dots, respectively and the force-field geometry is represented by a single large black dot; Cremer-Pople parameters (ϕ, θ) for all rings in every tenth snapshot from each ensemble were plotted (i.e., 10 rings × 1,000 snapshots per run × 4 runs = 40,000 parameter sets for the 20-mer and 20,000 parameter sets for the 10-mer).
IdoA monosaccharide rings in 20-mer MD sampled a majority of 1 C4 and, to a lesser degree, 2 SO and 4 C1 puckers ( Figure 9b and Table S2), as observed in NMR and force field studies [29,45,47,51,107,[120][121][122][123][124][125]. Literature reports of the relative proportions of these three ring puckers in nonsulfated IdoA vary, which may be explained by differences in the structure of neighboring residues (or lack thereof) [45,120,123,124] and differences in ion concentrations [123]. Furthermore, existing studies were conducted on IdoA monosaccharides and short IdoA-containing GAG oligosaccharides (i.e., <20 monosaccharides). NMR data have shown that, when at the nonreducing terminal in GAG oligosaccharides, nonsulfated IdoA is in equilibrium with 1 C4, 2 SO, and 4 C1 puckers, and when flanked by nonsulfated N-acetylated sugar derivatives (i.e., GalNAc or GlcNAc), nonsulfated IdoA is primarily in 1 C4 and 2 SO conformations [123], with fewer than 10% of samples in 4 C1 [45,124]. Indeed, we found this to be the case in our dermatan 20-mer MD simulations (Table S2). Importantly, none of the 1 C4, 2 SO, or 4 C1 puckers introduced a kink in the polymer chain, so we would not expect variations in the relative proportions of these ring puckers to alter overall polymer backbone conformations. IdoA also sampled other boat and skew-boat conformations in 20-mer MD simulations, specifically B1,4, 5 S1, 2,5 B, B3,O, 1 S3, 1,4 B, and 1 S5 (Figure 9b and Table S2), which is in agreement with results from MD simulations of IdoA monosaccharides [107,126]. Another force field study showed that 2,5 B and B3,O IdoA conformers are also feasible interpretations of existing NMR data [127]. Furthermore, a force field study that mapped X-ray diffraction and NMR data for dermatan sulfate [122] and a molecular modeling study of IdoA monosaccharides [126] showed that interconversion between different boat/skew-boat forms of IdoA is much more common than that between boat/skew-boat and chair forms, as it is less energetically costly, which helps explain the sampling of multiple different boat/skew-boat IdoA conformations. As each of these boat/skew-boat puckers (except for 2 SO) introduces a kink in the polymer chain (Figures 10 and 4b,c; -4IdoAα1endocyclic ring and linker oxygen atoms are identical to those of -4GlcAcβ1-), end-to-end distances of 20-mer conformations with these puckers were analyzed to determine if they were associated with more compact conformations. The end-to-end distance distribution of conformations with boat/skew- IdoA monosaccharide rings in 20-mer MD sampled a majority of 1 C 4 and, to a lesser degree, 2 S O and 4 C 1 puckers (Figure 9b and Table S2), as observed in NMR and force field studies [29,45,47,51,107,[120][121][122][123][124][125]. Literature reports of the relative proportions of these three ring puckers in nonsulfated IdoA vary, which may be explained by differences in the structure of neighboring residues (or lack thereof) [45,120,123,124] and differences in ion concentrations [123]. Furthermore, existing studies were conducted on IdoA monosaccharides and short IdoA-containing GAG oligosaccharides (i.e., <20 monosaccharides). NMR data have shown that, when at the nonreducing terminal in GAG oligosaccharides, nonsulfated IdoA is in equilibrium with 1 C 4 , 2 S O , and 4 C 1 puckers, and when flanked by nonsulfated N-acetylated sugar derivatives (i.e., GalNAc or GlcNAc), nonsulfated IdoA is primarily in 1 C 4 and 2 S O conformations [123], with fewer than 10% of samples in 4 C 1 [45,124]. Indeed, we found this to be the case in our dermatan 20-mer MD simulations (Table S2). Importantly, none of the 1 C 4 , 2 S O , or 4 C 1 puckers introduced a kink in the polymer chain, so we would not expect variations in the relative proportions of these ring puckers to alter overall polymer backbone conformations. IdoA also sampled other boat and skew-boat conformations in 20-mer MD simulations, specifically B 1,4 , 5 S 1 , 2,5 B, B 3,O , 1 S 3 , 1,4 B, and 1 S 5 (Figure 9b and Table S2), which is in agreement with results from MD simulations of IdoA monosaccharides [107,126]. Another force field study showed that 2,5 B and B 3,O IdoA conformers are also feasible interpretations of existing NMR data [127]. Furthermore, a force field study that mapped X-ray diffraction and NMR data for dermatan sulfate [122] and a molecular modeling study of IdoA monosaccharides [126] showed that interconversion between different boat/skew-boat forms of IdoA is much more common than that between boat/skew-boat and chair forms, as it is less energetically costly, which helps explain the sampling of multiple different boat/skew-boat IdoA conformations. As each of these boat/skew-boat puckers (except for 2 S O ) introduces a kink in the polymer chain (Figures 10 and 4b,c; -4IdoAα1endocyclic ring and linker oxygen atoms are identical to those of -4GlcAcβ1-), end-to-end distances of 20-mer conformations with these puckers were analyzed to determine if they were associated with more compact conformations. The end-to-end distance distribution of conformations with boat/skew-boat ring puckers (other than 2 S O ) matched that of the average of the four runs from the full MD-generated 20-mer ensemble ( Figure S15), indicating that these ring puckers do not necessarily give compact polymer conformations. Additionally, the end-to-end distance distribution of 20-mer conformations with 2 S O IdoA conformations was compared to that of the full MD ensemble ( Figure S15). These distributions were similar, further suggesting that 2 S O conformations are not associated with more or less compact 20-mer conformations. To determine if IdoA ring puckering in nonsulfated dermatan 20-mer MD was random, an analysis was performed of C-P parameters of individual IdoA monosaccharides in each of the four MD runs ( Figure S16). This revealed that (1) not all IdoA rings overcame the energy barrier to sample a 4 C 1 pucker, (2) no single IdoA ring sampled 4 C 1 in all four MD runs, (3) each IdoA ring overcame the energy barrier to sample boat/skew-boat puckers (i.e., C-P θ 90 • ) in at least one MD run, and (4) after sampling boat/skew-boat puckers, some IdoA rings returned to 1 C 4 in as little as 165 ns, while others remained in boat/skew-boat conformations for the remainder of the 1-µs trajectory. These observations support the idea that monosaccharide rings in a nonsulfated dermatan 20-mer behave randomly and independently in unbiased MD simulation. boat ring puckers (other than 2 SO) matched that of the average of the four runs from the full MDgenerated 20-mer ensemble ( Figure S15), indicating that these ring puckers do not necessarily give compact polymer conformations. Additionally, the end-to-end distance distribution of 20-mer conformations with 2 SO IdoA conformations was compared to that of the full MD ensemble ( Figure  S15). These distributions were similar, further suggesting that 2 SO conformations are not associated with more or less compact 20-mer conformations. To determine if IdoA ring puckering in nonsulfated dermatan 20-mer MD was random, an analysis was performed of C-P parameters of individual IdoA monosaccharides in each of the four MD runs ( Figure S16). This revealed that (1) not all IdoA rings overcame the energy barrier to sample a 4 C1 pucker, (2) no single IdoA ring sampled 4 C1 in all four MD runs, (3) each IdoA ring overcame the energy barrier to sample boat/skew-boat puckers (i.e., C-P θ ~ 90°) in at least one MD run, and (4) after sampling boat/skew-boat puckers, some IdoA rings returned to 1 C4 in as little as 165 ns, while others remained in boat/skew-boat conformations for the remainder of the 1-µs trajectory. These observations support the idea that monosaccharide rings in a nonsulfated dermatan 20-mer behave randomly and independently in unbiased MD simulation. Analysis of dihedral free energies revealed a single ΔG(ϕ, ψ) minimum for each of IdoAα1-3GalNAc and GalNAcβ1-4IdoA glycosidic linkages (Figure 11a,b and Table 4). This is consistent across all linkages in all runs for each linkage type. The lack of secondary basins in ϕ, ψ dihedral samples may explain the higher degree of rigidity and tendency toward extended backbone conformations of the nonsulfated dermatan 20-mer in comparison to hyaluronan and nonsulfated chondroitin [33] 20-mers in MD simulation. An X-ray diffraction study of sodium dermatan sulfate observed three different helical forms (i.e., right-handed, left-handed, and achiral) [118] and quantified their dihedral angles, which were defined by: ϕ = O5-C1-O-Cn and ψ = C1-O-Cn-C(n + 1). Assuming C1-O-C3-C2 = C1-O-C3-C4 + 120° and C1-O-C4-C3 = C1-O-C4-C5 -120°, we see that the righthanded and achiral helical forms corresponded to high ΔG(ϕ, ψ) values in our nonsulfated dermatan 20-mer MD simulations, and the left-handed helical form corresponded to glycosidic linkage conformations very close to the ΔG(ϕ, ψ) global minimum. In line with our results, force field studies that compared MD results to existing NMR and X-ray diffraction data of dermatan sulfate dismissed right-handed and achiral helical forms and confirmed the presence of left-handed helical structure Analysis of dihedral free energies revealed a single ∆G(φ, ψ) minimum for each of IdoAα1-3GalNAc and GalNAcβ1-4IdoA glycosidic linkages (Figure 11a,b and Table 4). This is consistent across all linkages in all runs for each linkage type. The lack of secondary basins in φ, ψ dihedral samples may explain the higher degree of rigidity and tendency toward extended backbone conformations of the nonsulfated dermatan 20-mer in comparison to hyaluronan and nonsulfated chondroitin [33] 20-mers in MD simulation. An X-ray diffraction study of sodium dermatan sulfate observed three different helical forms (i.e., right-handed, left-handed, and achiral) [118] and quantified their dihedral angles, which were defined by: φ = O 5 -C 1 -O-C n and ψ = C 1 -O-C n -C (n + 1) . Assuming C 1 -O-C 3 -C 2 = C 1 -O-C 3 -C 4 + 120 • and C1-O-C 4 -C 3 = C 1 -O-C 4 -C 5 -120 • , we see that the right-handed and achiral helical forms corresponded to high ∆G(φ, ψ) values in our nonsulfated dermatan 20-mer MD simulations, and the left-handed helical form corresponded to glycosidic linkage conformations very close to the ∆G(φ, ψ) global minimum. In line with our results, force field studies that compared MD results to existing NMR and X-ray diffraction data of dermatan sulfate dismissed right-handed and achiral helical forms and confirmed the presence of left-handed helical structure [49,122]. A quantitative comparison of our MD-generated glycosidic linkage data to their findings and other NMR data for dermatan tetrasaccharides [47] showed close agreement if we assume O 5 [49,122]. A quantitative comparison of our MD-generated glycosidic linkage data to their findings and other NMR data for dermatan tetrasaccharides [47] showed close agreement if we assume O5-   MD simulations of a nonsulfated dermatan 10-mer produced mostly linear backbone conformations ( Figure S13c) and similar monosaccharide ring ( Figure 9) and glycosidic linkage ( Figure 11 and Table 4) geometries to the nonsulfated dermatan 20-mer. The major difference was that in 10-mer MD simulations, 9% of all IdoA rings in all runs were found in 4 C1 form; this was almost exclusively in the nonreducing terminal IdoA (i.e., ring #10; Figure S17 and Table S2), which is mostly in line with the aforementioned NMR study of IdoA ring conformations [45]. However, based on the findings from the NMR study and from our dermatan 20-mer MD simulations, we would still expect the nonterminal IdoA rings in a GAG polymer to sample some 4 C1 conformations. This supports our position that GAG 20-mers are better-suited for predictions of long-chain backbone conformations than short GAG oligosaccharides.

GalNAcβ1-4 IdoA
Min MD simulations of a nonsulfated dermatan 10-mer produced mostly linear backbone conformations ( Figure S13c) and similar monosaccharide ring ( Figure 9) and glycosidic linkage ( Figure 11 and Table 4) geometries to the nonsulfated dermatan 20-mer. The major difference was that in 10-mer MD simulations, 9% of all IdoA rings in all runs were found in 4 C 1 form; this was almost exclusively in the nonreducing terminal IdoA (i.e., ring #10; Figure S17 and Table S2), which is mostly in line with the aforementioned NMR study of IdoA ring conformations [45]. However, based on the findings from the NMR study and from our dermatan 20-mer MD simulations, we would still expect the nonterminal IdoA rings in a GAG polymer to sample some 4 C 1 conformations. This supports our position that GAG 20-mers are better-suited for predictions of long-chain backbone conformations than short GAG oligosaccharides.
These findings suggest that (1) monosaccharide rings and glycosidic linkages in nonsulfated dermatan GAG 20-mers behave randomly and independently in MD simulation, (2) nonsulfated dermatan polymers take on rigid left-handed helical structure with a tendency toward linear backbone conformations in unbiased MD simulations, and (3) nonsulfated dermatan 20-mer conformations in MD simulation provide a realistic representation of longer nonsulfated dermatan polymer conformations.

Construction Algorithm
Four sets of 10,000 conformations of a nonsulfated dermatan 20-mer were constructed using our algorithm and compared to conformations seen in nonsulfated dermatan 20-mer MD. As expected, the C-P parameter plots ( Figure S18) and glycosidic linkage free energies ( Figure S19) in the constructed 20-mer ensemble matched those in the MD-generated 20-mer ensemble (Figures 9a,b and 11a,b and Table 4). Furthermore, the end-to-end distances (Figure 12a) and radii of gyration ( Figure S13a,b) in the constructed ensemble matched those in the MD-generated ensemble, with only a 2.42% difference in most probable end-to-end distance ( Table 3). This demonstrates that our algorithm produces nonsulfated dermatan 20-mer ensembles that mimic 20-mer backbone flexibility seen in MD simulations.

Construction Algorithm
Four sets of 10,000 conformations of a nonsulfated dermatan 20-mer were constructed using our algorithm and compared to conformations seen in nonsulfated dermatan 20-mer MD. As expected, the C-P parameter plots ( Figure S18) and glycosidic linkage free energies ( Figure S19) in the constructed 20-mer ensemble matched those in the MD-generated 20-mer ensemble (Figures 9a,b and  11a,b and Table 4). Furthermore, the end-to-end distances (Figure 12a) and radii of gyration ( Figure  S13a,b) in the constructed ensemble matched those in the MD-generated ensemble, with only a 2.42% difference in most probable end-to-end distance ( Table 3). This demonstrates that our algorithm produces nonsulfated dermatan 20-mer ensembles that mimic 20-mer backbone flexibility seen in MD simulations. The nonsulfated dermatan 10-mer ensemble with 40,000 conformations constructed by the algorithm had very similar end-to-end distance and radius of gyration data compared to the MD- The nonsulfated dermatan 10-mer ensemble with 40,000 conformations constructed by the algorithm had very similar end-to-end distance and radius of gyration data compared to the MD-generated ensemble (Figures 12b and S13c,d and Table 3). The algorithm was also used to create a nonsulfated dermatan 200-mer ensemble. As expected, the 200-mer end-to-end distance distribution ( Figure 13) and radii of gyration ( Figure S13e) showed that conformations tended to be more compact as polymer length increased. The shift in skewness of the end-to-end distance distribution with increasing polymer length was more subtle in nonsulfated dermatan than in hyaluronan, as there were fewer linkage and ring conformations causing kinks in the polymer.  Table 3). The algorithm was also used to create a nonsulfated dermatan 200-mer ensemble. As expected, the 200-mer end-to-end distance distribution ( Figure 13) and radii of gyration ( Figure S13e) showed that conformations tended to be more compact as polymer length increased. The shift in skewness of the end-to-end distance distribution with increasing polymer length was more subtle in nonsulfated dermatan than in hyaluronan, as there were fewer linkage and ring conformations causing kinks in the polymer. Figure 13. End-to-end distance probability distribution of constructed ensemble of nonsulfated dermatan200-mer; most probable end-to-end distance across all four sets is 365 Å; probabilities were calculated for end-to-end distances sorted into 5 Å bins; the ensemble contains four sets of 10,000 conformations.

Molecular Dynamics Simulations: Glycosidic Linkage and Monosaccharide Ring Geometry Effects on Polymer Backbone Flexibility
In MD simulations, nonsulfated keratan 20-mers showed rigid backbone behavior and favored extended backbone conformations, as evidenced by end-to-end distance distributions ( Figure 14 and Table 5) which were identical in all four MD runs, and the high correlation of end-to-end distance to radius of gyration ( Figure S20a). To understand the causes of this rigid behavior, monosaccharide ring and glycosidic linkage conformations were analyzed. Figure 14. End-to-end distance probability distribution of MD-generated nonsulfated keratan 20-mer ensemble; each of the four runs includes 10,000 conformations; probabilities were calculated for endto-end distances sorted into 0.5 Å bins. Figure 13. End-to-end distance probability distribution of constructed ensemble of nonsulfated dermatan200-mer; most probable end-to-end distance across all four sets is 365 Å; probabilities were calculated for end-to-end distances sorted into 5 Å bins; the ensemble contains four sets of 10,000 conformations.

Molecular Dynamics Simulations: Glycosidic Linkage and Monosaccharide Ring Geometry Effects on Polymer Backbone Flexibility
In MD simulations, nonsulfated keratan 20-mers showed rigid backbone behavior and favored extended backbone conformations, as evidenced by end-to-end distance distributions ( Figure 14 and Table 5) which were identical in all four MD runs, and the high correlation of end-to-end distance to radius of gyration ( Figure S20a). To understand the causes of this rigid behavior, monosaccharide ring and glycosidic linkage conformations were analyzed.
In MD simulations, nonsulfated keratan 20-mers showed rigid backbone behavior and favored extended backbone conformations, as evidenced by end-to-end distance distributions ( Figure 14 and Table 5) which were identical in all four MD runs, and the high correlation of end-to-end distance to radius of gyration ( Figure S20a). To understand the causes of this rigid behavior, monosaccharide ring and glycosidic linkage conformations were analyzed. Figure 14. End-to-end distance probability distribution of MD-generated nonsulfated keratan 20-mer ensemble; each of the four runs includes 10,000 conformations; probabilities were calculated for endto-end distances sorted into 0.5 Å bins. Figure 14. End-to-end distance probability distribution of MD-generated nonsulfated keratan 20-mer ensemble; each of the four runs includes 10,000 conformations; probabilities were calculated for end-to-end distances sorted into 0.5 Å bins. C-P parameters of GlcNAc monosaccharides in keratan 20-mer MD (Figure 15a) showed similar conformations to GlcNAc in hyaluronan 20-mer. There were predominantly 4 C 1 chair conformations, which are also found in a keratan sulfate tetrasaccharide crystal structure [128], with very few transitions (i.e., no more than one per ring or two per run) to boat/skew-boat conformations (i.e., 2 S O , 1 S 3 , 1,4 B, and 1 S 5 ), which were sampled for no more than 20 ns before returning to 4 C 1 chair ( Figure S21). Analysis of end-to-end distance and radius of gyration of 20-mer conformations containing non-4 C 1 ring puckers showed that these ring puckers were not associated with compact 20-mer conformations ( Figure S22). C-P analysis of Gal monosaccharides showed that for the duration of the keratan 20-mer MD simulations, Gal remained in 4 C 1 chair (Figure 15b), which was the predominant conformation found in NMR [119,[129][130][131] and force field data [107] for Gal in general, and in a keratan sulfate tetrasaccharide crystal structure [128]. 4 C1 ring puckers showed that these ring puckers were not associated with compact 20-mer conformations ( Figure S22). C-P analysis of Gal monosaccharides showed that for the duration of the keratan 20-mer MD simulations, Gal remained in 4 C1 chair (Figure 15b), which was the predominant conformation found in NMR [119,[129][130][131] and force field data [107] for Gal in general, and in a keratan sulfate tetrasaccharide crystal structure [128].  Glycosidic linkage free energy ∆G(φ, ψ) analysis of Galβ1-4GlcNAc linkages revealed a global minimum in the basin with −φ, +ψ and a secondary minimum in the basin with −φ, −ψ ( Figure 16a and Table 6), similar to hyaluronan GlcNAcβ1-4GlcA linkages (Figure 5b,d and Table 2), but with additional rare conformations in +φ, +ψ. A molecular modeling study, which was shown to agree with experimental data, showed that β1-4 linkages in keratan sulfate disaccharides took on the same conformations as β1-4 linkages in hyaluronan disaccharides [117], confirming the primary and secondary basins sampled in our 20-mer MD simulations. Keratan 20-mer conformations with glycosidic linkage conformations in the tertiary basin (i.e., +φ, +ψ) were visualized and end-to-end distances were analyzed ( Figure S23). These conformations formed a slight bend in the polymer chain and were associated with more compact polymer backbone conformations. However, there did not appear to be any close contacts between atoms of adjacent monosaccharides in this conformation or in previous snapshots that could explain these linkage bond rotations. Furthermore, this conformation was not specific to any particular region of the polymer. Therefore, this behavior was likely random. As −φ, −ψ β1-4 linkage conformations are associated with compact hyaluronan polymer conformations, and nonsulfated keratan 20-mer is relatively rigid and favors extended conformations in contrast to hyaluronan, we sought to determine the effects of Galβ1-4GlcNAc linkage conformations in the secondary (i.e., −φ, −ψ) basin on keratan 20-mer backbone flexibility in MD simulations. As in hyaluronan, these conformations caused a kink in the keratan 20-mer chain and were associated with more compact conformations ( Figure S23 and Table S1). However, these conformations were slightly less stable in keratan 20-mer MD (Table 6) than in hyaluronan MD (Table 2), and were, therefore, less common.
conformations in contrast to hyaluronan, we sought to determine the effects of Galβ1-4GlcNAc linkage conformations in the secondary (i.e., −ϕ, −ψ) basin on keratan 20-mer backbone flexibility in MD simulations. As in hyaluronan, these conformations caused a kink in the keratan 20-mer chain and were associated with more compact conformations ( Figure S23 and Table S1). However, these conformations were slightly less stable in keratan 20-mer MD ( Table 6) than in hyaluronan MD (Table  2), and were, therefore, less common.     Analysis of GlcNAcβ1-3Gal glycosidic linkages showed a global free energy minimum in the −φ, −ψ basin ( Table 6) similar to that of IdoAβ1-3GalNAc linkages in nonsulfated dermatan MD (Figure 11a,c and Table 4), but with additional rare occurrences in a secondary basin in −φ, +ψ, confirmed by the aforementioned molecular modeling study [117], and a small tertiary basin in +φ, −ψ (Figure 16b). To determine the effects of GlcNAcβ1-3Gal linkage conformations in these secondary and tertiary basins on backbone conformation, we visualized 20-mer conformations containing these glycosidic linkage conformations and analyzed end-to-end distances ( Figure S24). Glycosidic linkage conformations in the tertiary basin caused a kink and were associated with more compact 20-mer conformations. In contrast to hyaluronan, nonsulfated keratan β1-3 glycosidic linkage conformations in the secondary basin (i.e., −φ, +ψ) caused only a slight bend in the polymer chain, but were still associated with compact 20-mer conformations (Table S1). Both secondary and tertiary GlcNAcβ1-3Gal linkage conformations were rare in keratan 20-mer MD, which helps explain why the keratan 20-mer chain was relatively rigid and favored extended conformations in MD simulation.

GlcNAcβ1-3 Gal
Backbone conformational analysis of nonsulfated keratan 10-mer MD revealed that the 10-mer chain was rigid and favored extended conformations ( Figure S20c and Table 5), which is similar behavior to that of nonsulfated keratan 20-mer in MD simulation. This stands to reason, as conformations of monosaccharide rings ( Figure 15) and glycosidic linkages ( Figure 16 and Table 6) in keratan 10-mer MD matched those in keratan 20-mer MD, with the only differences being that (1) Gal briefly (i.e., for <5 ns) sampled a 1 S 5 skew-boat conformation once in each of two of the 10-mer MD runs ( Figure S25), and (2) GlcNAcβ1-3Gal linkages in the 10-mer did not sample conformations in the tertiary basin (i.e., +φ, −ψ) sampled in the 20-mer. Boat/skew-boat conformations including 1 S 5 were shown to have a high secondary free energy minimum (i.e., > 4 kcal/mol) in biased MD simulations of Gal monosaccharides [107], indicating that while possible, these conformations are not very stable in Gal monosaccharides. In both biased and unbiased MD simulations of a tetrasaccharide with a central Gal-GlcNAc disaccharide unit, the free energy minimum of Gal boat/skew-boat conformations was much higher (i.e.,~7 kcal/mol) [108], suggesting that non-4 C 1 conformations are less stable in Gal linked to other monosaccharides. This could explain why Gal does not sample non-4 C 1 conformations in keratan 20-mer MD, and suggests that these conformations would not be seen in long polymers. Although +φ, −ψ conformations in GlcNAcβ1-3Gal linkages are rare, they are physical conformations [117] that minorly contribute to polymer backbone flexibility. This further supports our belief that conformational landscapes from 20-mer MD are better-suited to construct conformational ensembles of GAGs with biologically-relevant chain lengths than those of short GAG oligosaccharides.
The higher degree of rigidity and probability of extended conformations in nonsulfated keratan MD compared to hyaluronan MD can be explained by the facts that: (1) for each glycosidic linkage type, the conformations that caused a kink in the polymer chain were less stable and, thus, rarer in nonsulfated keratan MD than in hyaluronan MD; and (2) GlcNAc and Gal are mostly rigid 4 C 1 chair conformers in nonsulfated keratan whereas GlcNAc and GlcA took on more boat/skew-boat conformations that caused kinks in hyaluronan. Importantly, nonsulfated keratan 10-and 20-mers behaved randomly, and glycosidic linkages and monosaccharide rings behaved independently in MD simulations.

Construction Algorithm
Nonsulfated keratan 20-mer conformational ensembles (four sets of 10,000 conformations) were constructed using our algorithm and compared to MD-generated keratan 20-mer ensembles. The monosaccharide ring ( Figure S26) and glycosidic linkage ( Figure S27) conformations in the constructed keratan 20-mer ensemble were identical to the input data (i.e., MD-generated keratan 20-mer conformations; Figures 15a,b and 16a,b and Table 6), as expected. The end-to-end distances ( Figure 17a and Table 5) and radii of gyration ( Figure S20a,b) were similar in constructed and MD-generated ensembles, demonstrating that nonsulfated keratan 20-mer conformational ensembles provide an accurate representation of backbone flexibility in nonsulfated keratan 20-mer MD simulations.
Four sets of nonsulfated keratan 10-mer ensembles with 10,000 conformations each were constructed, and the results were compared to those of nonsulfated keratan 10-mer MD. The end-to-end distances (Figure 17b and Table 5) and radii of gyration ( Figure S20c,d) in constructed 10-mer ensembles matched those of MD-generated 10-mer ensembles, demonstrating that MD-generated 20-mer conformations can be applied to construct ensembles of keratan polymers of different lengths that mimic backbone flexibility seen in MD simulation. A nonsulfated keratan 200-mer ensemble was constructed and, as expected, the end-to-end distance distribution ( Figure 18) and radii of gyration ( Figure S20e) showed that as the polymer length increased, conformations tended to be more compact. Notably, this shift in the end-to-end distance distribution curve was more subtle in keratan than in hyaluronan, which stands to reason, as monosaccharide rings and glycosidic linkages are more rigid and tend to be more extended in keratan 20-mer MD than in hyaluronan 20-mer MD.
constructed keratan 20-mer ensemble were identical to the input data (i.e., MD-generated keratan 20mer conformations; Figures 15a,b and 16a,b and Table 6), as expected. The end-to-end distances (Figure 17a and Table 5) and radii of gyration ( Figure S20a,b) were similar in constructed and MDgenerated ensembles, demonstrating that nonsulfated keratan 20-mer conformational ensembles provide an accurate representation of backbone flexibility in nonsulfated keratan 20-mer MD simulations. Four sets of nonsulfated keratan 10-mer ensembles with 10,000 conformations each were constructed, and the results were compared to those of nonsulfated keratan 10-mer MD. The end-toend distances (Figure 17b and Table 5) and radii of gyration ( Figure S20c,d) in constructed 10-mer ensembles matched those of MD-generated 10-mer ensembles, demonstrating that MD-generated 20mer conformations can be applied to construct ensembles of keratan polymers of different lengths that mimic backbone flexibility seen in MD simulation. A nonsulfated keratan 200-mer ensemble was constructed and, as expected, the end-to-end distance distribution ( Figure 18) and radii of gyration ( Figure S20e) showed that as the polymer length increased, conformations tended to be more compact. Notably, this shift in the end-to-end distance distribution curve was more subtle in keratan than in hyaluronan, which stands to reason, as monosaccharide rings and glycosidic linkages are more rigid and tend to be more extended in keratan 20-mer MD than in hyaluronan 20-mer MD.

Molecular Dynamics Simulations: Glycosidic Linkage and Monosaccharide Ring Geometry Effects on Polymer Backbone Flexibility
The backbone flexibility of nonsulfated heparan 20-mer, quantified by the end-to-end distances ( Figure 19 and Table 7) and radii of gyration ( Figure S28a) in MD simulations was analyzed. The wide end-to-end distance distribution curves and tendency toward lower end-to-end distances, relative to those of the extended 20-mer conformation, indicated that nonsulfated heparan 20-mer was highly flexible and tended toward compact conformations in MD simulations. To determine factors contributing to this conformational flexibility, monosaccharide ring and glycosidic linkage conformations were examined for patterns. Figure 18. End-to-end distance probability distribution of constructed ensemble of nonsulfated keratan 200-mer; most probable end-to-end distance across all four sets is 305 Å; probabilities were calculated for end-to-end distances sorted into 5 Å bins; the ensemble contains four sets of 10,000 conformations.

Molecular Dynamics Simulations: Glycosidic Linkage and Monosaccharide Ring Geometry Effects on Polymer Backbone Flexibility
The backbone flexibility of nonsulfated heparan 20-mer, quantified by the end-to-end distances ( Figure 19 and Table 7) and radii of gyration ( Figure S28a) in MD simulations was analyzed. The wide end-to-end distance distribution curves and tendency toward lower end-to-end distances, relative to those of the extended 20-mer conformation, indicated that nonsulfated heparan 20-mer was highly flexible and tended toward compact conformations in MD simulations. To determine factors contributing to this conformational flexibility, monosaccharide ring and glycosidic linkage conformations were examined for patterns. Figure 19. End-to-end distance probability distribution of MD-generated nonsulfated heparan 20-mer ensemble; each of the four runs includes 10,000 conformations; probabilities were calculated for end-to-end distances sorted into 0.5 Å bins. 1 Probabilities were calculated for end-to-end distances sorted into 0.5 Å bins for the 20-mer ensembles and 0.25 Å bins for the 10-mer ensembles. 2 All = end-to-end distance distribution aggregated across all four runs.
C-P parameters of GlcNAc rings (Figure 20a) revealed all 4 C 1 chair conformations, in line with NMR and force field data for α-GlcNAc [119]. This stands to reason, as we would not expect much difference in ring conformation between αand β-GlcNAc, because the only structural difference is the orientation of the exocyclic oxygen atom on C 1 (i.e., the linker oxygen on nonterminal GlcNAc monosaccharides in GAG polysaccharides). This structural difference will impact glycosidic linkage conformation but is less likely to impact GlcNAc ring conformation. The IdoA monosaccharide C-P parameters (Figure 20b) from nonsulfated heparan 20-mer MD were similar to those of IdoA in nonsulfated dermatan 20-mer MD (Figure 9b): predominantly 1 C 4 , 2 S O , and some 4 C 1 , which is also in line with NMR and force field data for nonsulfated IdoA in heparan sulfate oligosaccharides [120,123] and nonsulfated heparan trisaccharides [126], and occasional boat and skew-boat conformations (Figure 4b,c; -4IdoAα1-endocyclic ring and linker oxygen atoms in heparan are identical to those of -4GlcAβ1-in hyaluronan). As with GalNAc in dermatan, the presence of neighboring GlcNAc monosaccharides substantially decreased sampling of 4 C 1 conformations [123], which is in line with our results. One NMR and force field study reported 19-49% 2 S O conformations of nonsulfated IdoA in heparan sulfate hexasaccharides, and found that as the degree of GlcNAc sulfation increased, the percentage of 2 S O conformations in adjacent nonsulfated IdoA decreased [120]. As we studied only nonsulfated heparan, we would expect to see a higher proportion of IdoA 2 S O conformations (i.e., close to 49%) if simulated under the same conditions. However, the aforementioned study performed MD in aqueous solution with only neutralizing Na + ions, whereas our systems contained an additional 140 mM NaCl. Another NMR and molecular modeling study showed that increasing NaCl salt concentrations caused a shift in equilibrium of 1 C 4 and 2 S O toward 1 C 4 conformations of 2-O-sulfated IdoA flanked by sulfated glucosamine [123]. This may explain the tendency toward 1 C 4 IdoA conformations, with~54% of IdoA rings across all four 20-mer MD runs in 1 C 4 ,~25% in 2 S O , and~4% in 4 C 1 (Table S3). Furthermore, very few IdoA rings sampled 4 C 1 chair conformations, and one run contained no IdoA 4 C 1 chair conformations ( Figure S29), which is in contrast to the more random distribution of IdoA 4 C 1 chair conformations in nonsulfated dermatan 20-mer MD ( Figure S16). This is likely the result of random kinetic trapping during nonsulfated heparan 20-mer MD simulations, which means IdoA only rarely overcame energy barriers between boat/skew-boat and 4 C 1 conformations. Importantly, it appears that all relevant IdoA conformations (i.e., those in line with the literature) were sampled in nonsulfated heparan 20-mer MD simulations. Therefore, we believe our database of MD-generated 20-mer conformations contains a full conformational landscape of IdoA rings in nonsulfated heparan. The primary goal of our algorithm is to predict backbone conformations of long GAG chains, which requires that (1) monosaccharide rings in GAG 20-mers behave independently in MD simulation and (2) contributions of ring puckers to backbone flexibility match those expected in nonsulfated heparan polymers in aqueous solution. There did not appear to be any interdependency The primary goal of our algorithm is to predict backbone conformations of long GAG chains, which requires that (1) monosaccharide rings in GAG 20-mers behave independently in MD simulation and (2) contributions of ring puckers to backbone flexibility match those expected in nonsulfated heparan polymers in aqueous solution. There did not appear to be any interdependency between adjacent rings, suggesting that monosaccharide rings behave independently in nonsulfated heparan 20-mer MD simulations. To determine the effects of IdoA ring puckering on polymer backbone conformations, end-to-end distances of 20-mer conformations containing monosaccharides in (1) boat and skew-boat conformations that caused a kink in the polymer chain and (2) 2 S O conformations, were analyzed ( Figure S30). The end-to-end distance distribution of 20-mer conformations with the ring puckers that caused a kink was similar to that of the average of the four MD runs and the most probable end-to-end distance was only 1 Å lower in conformations with ring puckers that caused a kink, suggesting that boat and skew-boat conformations that introduce a kink do not necessarily give less compact 20-mer conformations, and thus are not major factors contributing to backbone flexibility. The distribution curve of 20-mer conformations with 2 S O IdoA ring puckers and that of 20-mer conformations with non-2 S O boat/skew-boat puckers that caused a kink were qualitatively similar to that of the full 20-mer MD ensemble, but had additional peaks in probability of higher end-to-end distances. These findings suggest that IdoA boat/skew-boat conformations are associated with the full range of 20-mer backbone conformations in MD simulations with only a slight tendency toward extended 20-mer conformations. Therefore, the proportion of IdoA 2 S O conformations likely does not have a major effect on nonsulfated heparan 20-mer backbone flexibility.
Next, glycosidic linkage dihedral free energies ∆G(φ, ψ) were examined ( Figure 21 and Table 8). IdoAα1-4GlcNAc glycosidic linkages sampled primarily −φ, +ψ with secondary basins in −φ, −ψ and GlcNAcα1-4IdoA glycosidic linkages sampled conformations in a single basin (+φ, +ψ). We compared our results to those of an NMR and force field study that performed two sets of MD on each of nonsulfated IdoAα1-4GlcNAc and GlcNAcα1-4IdoA disaccharides, one with IdoA restrained to 1      GlcNAcα1-4IdoA glycosidic linkage conformations differ from any other GAG glycosidic linkage conformation because of the orientation of the linker oxygen atom with respect to GlcNAc. Specifically, the oxygen atom on C1 of α-GlcNAc is in the opposite orientation as that on C1 of any other GAG monosaccharide (Figure 1). The GlcNAcα1-4IdoA linkages have a coiled conformation ( Figure S31), which helps explain the high tendency toward compact conformations in nonsulfated heparan compared to other GAGs.

IdoAα1-4 GlcNAc
As IdoAα1-4GlcNAc linkages have secondary conformations, we sought to determine if their behavior was random and independent. ΔG(ϕ, ψ) was plotted for each individual IdoAα1-4GlcNAc   GlcNAcα1-4IdoA glycosidic linkage conformations differ from any other GAG glycosidic linkage conformation because of the orientation of the linker oxygen atom with respect to GlcNAc. Specifically, the oxygen atom on C 1 of α-GlcNAc is in the opposite orientation as that on C 1 of any other GAG monosaccharide (Figure 1). The GlcNAcα1-4IdoA linkages have a coiled conformation ( Figure S31), which helps explain the high tendency toward compact conformations in nonsulfated heparan compared to other GAGs.

GlcNAcα1-4 IdoA
As IdoAα1-4GlcNAc linkages have secondary conformations, we sought to determine if their behavior was random and independent. ∆G(φ, ψ) was plotted for each individual IdoAα1-4GlcNAc linkage in each MD run and there did not appear to be any connection between adjacent IdoAα1-4GlcNAc linkages or any patterns across different MD runs. To determine the effects of IdoAα1-4GlcNAc linkages with −φ, −ψ dihedrals on 20-mer backbone flexibility, end-to-end distances of 20-mer conformations with these glycosidic linkage conformations were analyzed. Although these linkage conformations caused a kink in the polymer chain (as with hyaluronan GlcNAcβ1-4GlcA linkages), they were not associated with more compact conformations ( Figure S32).
In fact, the end-to-end distance distribution of heparan 20-mer conformations with IdoAα1-4GlcNAc linkage −φ, −ψ dihedrals was similar to that of the full MD-generated 20-mer ensemble. This was likely because these linkage conformations occurred infrequently in nonsulfated heparan 20-mer MD, meaning a single heparan 20-mer conformation was not likely to have many kinks resulting from these −φ, −ψ linkage dihedrals. These findings are in line with the literature, which suggests that IdoA ring conformational flexibility in heparin/heparan sulfate oligo-and polysaccharides does not affect glycosidic linkage conformation or overall backbone shape [132].
To determine if there was any interdependency between different IdoA ring puckers and flanking IdoAα1-4GlcNAc linkage geometries in 20-mer MD simulations, conformations of all IdoAα1-4GlcNAc linkages flanking each of 1 C 4 , 2 S O , 4 C 1 , and boat/skew-boat (non-2 S O ) conformations were analyzed separately ( Figure S33). There did not appear to be strong associations between any particular IdoA ring conformation and flanking IdoAα1-4GlcNAc linkage conformation, which is in line with IdoAα1-4GlcNAc disaccharide data from NMR and MD with restrained 1 C 4 and 2 S O IdoA ring conformations and a biasing potential on glycosidic linkage dihedrals [30]. This supports our hypothesis that glycosidic linkages and monosaccharide rings behave independently in MD simulation of nonsulfated heparan 20-mer.
Backbone conformational analysis of nonsulfated heparan 10-mer in MD simulation revealed flexible, compact conformations ( Figure S28c and Table 7), as seen in nonsulfated heparan 20-mer MD. Monosaccharide ring and glycosidic linkage conformations in nonsulfated heparan 10-mer MD were similar to those of nonsulfated heparan 20-mer MD. These findings indicate that in MD simulations of nonsulfated heparan 10-and 20-mers, (1) IdoAα1-4GlcNAc glycosidic linkages with −φ, −ψ dihedrals cause a kink in the polymer chain but are rare and thus do not contribute to compact polymer conformations and (2) monosaccharide rings and glycosidic linkages behave randomly and independently.

Construction Algorithm
A nonsulfated heparan 20-mer ensemble with 40,000 conformations was constructed by the algorithm.
Monosaccharide ring C-P parameters ( Figure S34) and glycosidic linkage conformations ( Figure S35) post-minimization in the constructed ensemble matched the input data (Figures 20a,b and 21a,b), as expected. Additionally, the end-to-end distance distribution and radii of gyration of the constructed ensemble closely resembled those of the MD-generated 20-mer ensemble (Figures 22a and S28a,b and Table 7), demonstrating that our algorithm generates nonsulfated heparan 20-mer ensembles with backbone conformations that mimic backbone flexibility seen in 20-mer MD. A nonsulfated heparan 10-mer ensemble with 40,000 conformations was also constructed by the algorithm and had end-to-end distances and radii of gyration that matched those of MD-generated nonsulfated heparan 10-mer ensembles (Figures 22b and S28c,d and Table 7). This demonstrates that nonsulfated heparan 20-mer conformations from MD can be used to construct 10-mer ensembles that mimic the backbone flexibility seen in nonsulfated heparan 10-mer MD.
The constructed nonsulfated heparan 200-mer ensemble had reasonably expected end-to-end distance probability distributions (Figure 23), i.e., the skewness of the curves shifted toward the right, indicating more compact conformations with increasing polymer length. The heparan 200-mer endto-end distance distribution curve was qualitatively similar to that of hyaluronan 200-mer (i.e., the probability peak was of similar magnitude and most probable end-to-end distances were similar), which stands to reason, as ring and linkage conformations in nonsulfated heparan 20-mer MD cause more compact backbone conformations, as in hyaluronan 20-mer MD. A nonsulfated heparan 10-mer ensemble with 40,000 conformations was also constructed by the algorithm and had end-to-end distances and radii of gyration that matched those of MD-generated nonsulfated heparan 10-mer ensembles (Figures 22b and S28c,d and Table 7). This demonstrates that nonsulfated heparan 20-mer conformations from MD can be used to construct 10-mer ensembles that mimic the backbone flexibility seen in nonsulfated heparan 10-mer MD.
The constructed nonsulfated heparan 200-mer ensemble had reasonably expected end-to-end distance probability distributions (Figure 23), i.e., the skewness of the curves shifted toward the right, indicating more compact conformations with increasing polymer length. The heparan 200-mer end-to-end distance distribution curve was qualitatively similar to that of hyaluronan 200-mer (i.e., the probability peak was of similar magnitude and most probable end-to-end distances were similar), which stands to reason, as ring and linkage conformations in nonsulfated heparan 20-mer MD cause more compact backbone conformations, as in hyaluronan 20-mer MD. Figure 23. End-to-end distance probability distribution of constructed ensemble of nonsulfated heparan 200-mer; most probable end-to-end distance across all four sets is 260 Å; probabilities were calculated for end-to-end distances sorted into 5 Å bins; the ensemble contains four sets of 10,000 conformations.

Conclusions
Collectively, our findings support our hypotheses that (1) glycosidic linkages and monosaccharide rings in hyaluronan and nonsulfated dermatan, keratan, and heparan GAG 20-mers behave randomly and independently in MD simulation and (2) using a database of conformations from corresponding GAG 20-mer MD simulations and treating glycosidic linkages and monosaccharide rings independently, our algorithm can efficiently construct conformational ensembles of hyaluronan and nonsulfated dermatan, keratan, and heparan GAG 10-and 20-mers that mimic backbone flexibility observed in corresponding MD simulations. Additionally, our algorithm constructed sets of 10,000 molecular conformations of nonsulfated GAG 200-mers with backbone conformations that we would reasonably expect to see in MD simulation and did so within 12 h. This suggests that the algorithm can generate conformational ensembles of nonsulfated heterogeneous GAG polymers of arbitrary length in under a day. For perspective, 1-µs MD simulations, each producing 10,000 snapshots (i.e., 3-D atomic coordinate sets), of GAG 10-and 20-mers were completed in about 1-2 weeks and 1-2 months, respectively, and using modern GPU-accelerated hardware and software. Furthermore, the algorithm's potential energy minimization and bond potential energy cutoff criterion exclude nonphysical conformations from constructed ensembles, leaving only conformations with reasonably expected bond energies ( Figures S36, S37, S38, and S39).
A comparison of the different GAG 20-mer MD simulations provided further insights into the relationship between GAG structure and conformation. For example, no associations between adjacent monosaccharide conformations were seen in any of the GAG 20-mer MD simulations, but differences in conformations of the same monosaccharide ring type in different GAGs were observed. This indicates that the structure of adjacent monosaccharides contributes to ring conformation. Furthermore, IdoA monosaccharides were much more conformationally flexible than GlcA monosaccharides in GAG 10-and 20-mer MD simulations, even though the structural difference between these two monosaccharide types (specifically, the orientation of the carboxylate group on C5) is subtle. Interestingly, dermatan and heparan are the only GAGs with IdoA monosaccharides, yet hyaluronan and heparan showed the most conformational flexibility, and heparan showed the greatest tendency toward compact conformations compared to other GAGs. This was likely because of variability in glycosidic linkage conformation, which is independent of monosaccharide ring conformation. Independence of glycosidic linkages and monosaccharide rings was further evidenced by the fact that despite differences in flanking monosaccharide structure in different GAG types, there were similarities between all 1-3 linkage conformations and between all 1-4 linkage conformations, with the exception of GlcNAcα1-4IdoA linkages in heparan. Figure 23. End-to-end distance probability distribution of constructed ensemble of nonsulfated heparan 200-mer; most probable end-to-end distance across all four sets is 260 Å; probabilities were calculated for end-to-end distances sorted into 5 Å bins; the ensemble contains four sets of 10,000 conformations.

Conclusions
Collectively, our findings support our hypotheses that (1) glycosidic linkages and monosaccharide rings in hyaluronan and nonsulfated dermatan, keratan, and heparan GAG 20-mers behave randomly and independently in MD simulation and (2) using a database of conformations from corresponding GAG 20-mer MD simulations and treating glycosidic linkages and monosaccharide rings independently, our algorithm can efficiently construct conformational ensembles of hyaluronan and nonsulfated dermatan, keratan, and heparan GAG 10-and 20-mers that mimic backbone flexibility observed in corresponding MD simulations. Additionally, our algorithm constructed sets of 10,000 molecular conformations of nonsulfated GAG 200-mers with backbone conformations that we would reasonably expect to see in MD simulation and did so within 12 h. This suggests that the algorithm can generate conformational ensembles of nonsulfated heterogeneous GAG polymers of arbitrary length in under a day. For perspective, 1-µs MD simulations, each producing 10,000 snapshots (i.e., 3-D atomic coordinate sets), of GAG 10-and 20-mers were completed in about 1-2 weeks and 1-2 months, respectively, and using modern GPU-accelerated hardware and software. Furthermore, the algorithm's potential energy minimization and bond potential energy cutoff criterion exclude nonphysical conformations from constructed ensembles, leaving only conformations with reasonably expected bond energies ( Figures S36-S39).
A comparison of the different GAG 20-mer MD simulations provided further insights into the relationship between GAG structure and conformation. For example, no associations between adjacent monosaccharide conformations were seen in any of the GAG 20-mer MD simulations, but differences in conformations of the same monosaccharide ring type in different GAGs were observed. This indicates that the structure of adjacent monosaccharides contributes to ring conformation. Furthermore, IdoA monosaccharides were much more conformationally flexible than GlcA monosaccharides in GAG 10and 20-mer MD simulations, even though the structural difference between these two monosaccharide types (specifically, the orientation of the carboxylate group on C 5 ) is subtle. Interestingly, dermatan and heparan are the only GAGs with IdoA monosaccharides, yet hyaluronan and heparan showed the most conformational flexibility, and heparan showed the greatest tendency toward compact conformations compared to other GAGs. This was likely because of variability in glycosidic linkage conformation, which is independent of monosaccharide ring conformation. Independence of glycosidic linkages and monosaccharide rings was further evidenced by the fact that despite differences in flanking monosaccharide structure in different GAG types, there were similarities between all 1-3 linkage conformations and between all 1-4 linkage conformations, with the exception of GlcNAcα1-4IdoA linkages in heparan.
A comparison of MD-generated GAG 20-mer conformations to those of GAG 10-mers and to existing experimentally-determined conformations of monosaccharides and GAG oligosaccharides supported the use of 20-mer data for the construction of longer GAG polymers. For example, certain IdoA conformations (namely 4 C 1 chair) were found more in unbound monosaccharide rings and terminal rings of GAG oligosaccharides than in central rings. Similarly, galactose occasionally sampled non-4 C 1 conformations in monosaccharide rings and short oligosaccharides, including nonsulfated keratan 10-mer in MD simulation, but boat/skew-boat conformations of galactose were decreasingly common in central rings of polymers of increasing length. This is in line with our observation that only 4 C 1 conformations of galactose were sampled in nonsulfated keratan 20-mer MD. Additionally, some nonhelical glycosidic linkage conformations that caused a kink in the polymer chain, and thus contributed to compact GAG backbone conformations, were found more in GAG 20-mers than in short GAG oligosaccharides. Therefore, conformational landscapes from GAG 20-mer MD simulations likely provide a better representation of conformations of long GAG polymers than existing conformational landscapes of monosaccharides and GAG oligosaccharides.
A comparison of backbone conformational analyses of 200-mers of different GAG types provided insights into structural features and conformational behaviors contributing to GAG polymer backbone flexibility. For example, 200-mer end-to-end distance distribution curves were qualitatively similar in nonsulfated dermatan ( Figure 13), keratan (Figure 18), and chondroitin [33], whereas hyaluronan and nonsulfated heparan polymer end-to-end distance distributions (Figures 7 and 23, respectively) showed an even higher tendency toward compact conformations with increasing polymer length than other nonsulfated GAGs. To find possible explanations for this, we compared observations from MD analyses of all GAG types. In hyaluronan 10-and 20-mer MD simulations, (1) GlcNAc conformations were similar to those in nonsulfated keratan and heparan MD, and (2) GlcA conformations were similar to those in nonsulfated chondroitin MD [33]. Furthermore, IdoA rings showed more flexibility in MD simulation of nonsulfated dermatan and heparan than GlcA rings in hyaluronan and chondroitin MD [33]. However, boat/skew-boat conformations (except for 2 S O ) in IdoA did not appear to cause more compact polymer backbone conformations than 1 C 4 , 2 S O , and 4 C 1 IdoA conformations, while boat/skew-boat GlcA ring conformations were more highly associated with more compact polymer backbone conformations in hyaluronan and chondroitin [33]. Additionally, in hyaluronan 10-and 20-mer MD simulations, (1) GlcNAcβ1-4GlcA linkages took on the same conformations as Galβ1-4GlcNAc linkages in nonsulfated keratan MD, IdoAα1-4GlcNAc linkages in nonsulfated heparan MD, and GalNAcβ1-4GlcA in nonsulfated chondroitin MD [33] (i.e., primarily −φ, +ψ with a secondary basin at −φ, −ψ) and (2) GlcAβ1-3GlcNAc linkages took on the same conformations as IdoAα1-3GalNAc in nonsulfated dermatan MD, GlcNAcβ1-3Gal in nonsulfated keratan MD, and GlcAβ1-3GalNAc linkages in nonsulfated chondroitin MD [33] (i.e., primarily −φ, −ψ), but with more conformations at −φ, +ψ. All secondary and tertiary conformations (i.e., −φ, −ψ in 1-4 linkages and −φ, +ψ in 1-3 linkages) were nonhelical and caused a kink or slight bend in the polymer chain, and were, therefore, associated with more compact conformations. The differences in energetic stability of secondary and tertiary linkage conformations between different GAG types can be explained by the adjacent monosaccharide structure. The major observation that is unique to hyaluronan is that both glycosidic linkage types take on more of these nonhelical secondary conformations in 10-and 20-mer MD simulations than any of nonsulfated dermatan, keratan, or chondroitin [33] in 10-and 20-mer MD simulations. Heparan is unique, in that it has all 1-4 linkages, while all other GAGs have alternating 1-3 and 1-4 linkages, and it contains α-GlcNAc, which gives unique conformations in GlcNAcα1-4IdoA linkages. These characteristics give nonsulfated heparan a tendency toward more coiled structures than other GAGs.
Based on these observations and the fact that nonsulfated dermatan, keratan, and chondroitin showed more extended polymer backbone conformations than hyaluronan and nonsulfated heparan in MD simulations, it is likely that: (1) monosaccharide structure and conformational flexibility determine adjacent monosaccharide conformational flexibility; (2) although there is a much higher degree of ring flexibility in IdoA than in GlcA, the flexibility of GlcA rings is more highly associated with GAG polymer backbone flexibility than that of IdoA rings; and (3) nonhelical glycosidic linkage conformations, which cause kinks in GAG polymer chains, contribute to more compact GAG polymer backbone conformations and, thus, a higher degree of GAG polymer backbone flexibility. These are valuable insights that would be difficult to obtain from conformational analyses of only solid-state structures and MD-generated conformational ensembles of short GAG oligosaccharides. Other important insights that can be gained from 3-D atomic-resolution conformational ensembles of GAG polymers include potential binding properties, and thus, predictions of binding poses with other biomolecules of interest. Models of long-chain GAG polymers ( Figure 24 and [33]) can help characterize complete PGs and GAG-mediated complexes between multiple biomolecules, and consequently, improve our understanding of the bioactivity and function of GAG biopolymers in animal tissue. with GAG polymer backbone flexibility than that of IdoA rings; and (3) nonhelical glycosidic linkage conformations, which cause kinks in GAG polymer chains, contribute to more compact GAG polymer backbone conformations and, thus, a higher degree of GAG polymer backbone flexibility. These are valuable insights that would be difficult to obtain from conformational analyses of only solid-state structures and MD-generated conformational ensembles of short GAG oligosaccharides. Other important insights that can be gained from 3-D atomic-resolution conformational ensembles of GAG polymers include potential binding properties, and thus, predictions of binding poses with other biomolecules of interest. Models of long-chain GAG polymers ( Figure 24 and [33]) can help characterize complete PGs and GAG-mediated complexes between multiple biomolecules, and consequently, improve our understanding of the bioactivity and function of GAG biopolymers in animal tissue.    Figure S1: Constructed hyaluronan 20-mer conformation with a pierced ring, Figure S2: Scatterplots of radius of gyration as a function of end-to-end distance in MD-generated and constructed hyaluronan ensembles, Figure S3: System potential energy probability distribution of the MD-generated hyaluronan 20-mer ensemble, Figure S4: C-P plots and C-P parameter θ timeseries for each GlcNAc monosaccharide ring in the MD-generated hyaluronan 20-mer ensemble, Figure S5: C-P plots and C-P parameter θ timeseries for each GlcA monosaccharide ring in the MD-generated hyaluronan 20-mer ensemble, Figure S6: Scatterplot of radius of gyration as a function of end-to-end distance in MD-generated hyaluronan 20-mer conformations with non-4 C 1 ring puckers, Figure S7: ∆G(φ,ψ) plots for each glycosidic linkage in the MD-generated hyaluronan 20-mer ensemble, Figure S8: MD snapshots of hyaluronan 20-mer GlcAβ1-3GlcNAc glycosidic linkages with dihedrals in different basins and end-to-end distance distributions of 20-mer conformations with linkages in secondary basin, Figure S9: ∆G(φ,ψ) plots for glycosidic linkages flanking non-4 C 1 ring puckers in the MD-generated hyaluronan 20-mer ensemble, Figure S10: MD snapshots of hyaluronan 20-mer GlcNAcβ1-4GlcA glycosidic linkages with dihedrals in different basins and end-to-end distance distributions of 20-mer conformations with linkages in secondary basin, Figure S11: C-P plots for GlcNAc and GlcA in the constructed hyaluronan 20-mer ensemble, Figure S12: ∆G(φ, ψ) for aggregated glycosidic linkage data from the constructed hyaluronan 20-mer ensemble, Figure S13: Scatterplots of radius of gyration as a function of end-to-end distance in MD-generated and constructed nonsulfated dermatan ensembles, Figure S14: C-P parameter θ timeseries for each GalNAc monosaccharide ring in the MD-generated nonsulfated dermatan 20-mer ensemble, Figure S15: End-to-end distance distributions of MD-generated nonsulfated dermatan 20-mer conformations with boat/skew-boat ring puckers that cause a kink in the polymer chain and 2 S O conformations, Figure S16: C-P plots and C-P parameter θ timeseries for each IdoA monosaccharide ring in the MD-generated nonsulfated dermatan 20-mer ensemble, Figure S17: C-P plots and C-P parameter θ timeseries for each IdoA monosaccharide ring in the MD-generated nonsulfated dermatan 10-mer ensemble, Figure S18: C-P plots for GalNAc and IdoA in the constructed nonsulfated dermatan 20-mer ensemble, Figure S19: ∆G(φ, ψ) for aggregated glycosidic linkage data from the constructed nonsulfated dermatan 20-mer ensemble, Figure S20: Scatterplots of radius of gyration as a function of end-to-end distance in MD-generated and constructed nonsulfated keratan ensembles, Figure S21: C-P plots and C-P parameter θ timeseries for each GlcNAc monosaccharide ring in the MD-generated nonsulfated keratan 20-mer ensemble, Figure S22: Scatterplot of radius of gyration as a function of end-to-end distance in MD-generated nonsulfated keratan 20-mer conformations with non-4 C 1 ring puckers, Figure S23: MD snapshots of nonsulfated keratan 20-mer Galβ1-4GlcNAc glycosidic linkages with dihedrals in different basins and end-to-end distance distributions of 20-mer conformations with linkages in secondary and tertiary basins, Figure S24: MD snapshots of nonsulfated keratan 20-mer GlcNAcβ1-3Gal glycosidic linkages with dihedrals in different basins and end-to-end distance distributions of 20-mer conformations with linkages in secondary and tertiary basins, Figure S25: C-P plots and C-P parameter θ timeseries for each Gal monosaccharide ring in the MD-generated nonsulfated keratan 10-mer ensemble, Figure S26: C-P plots for GlcNAc and Gal in the constructed nonsulfated keratan 20-mer ensemble, Figure S27: ∆G(φ, ψ) for aggregated glycosidic linkage data from the constructed nonsulfated keratan 20-mer ensemble, Figure S28: Scatterplots of radius of gyration as a function of end-to-end distance in MD-generated and constructed nonsulfated heparan ensembles, Figure S29: C-P parameter θ timeseries for each IdoA monosaccharide ring in the MD-generated nonsulfated heparan 20-mer ensemble, Figure S30: End-to-end distance distributions of MD-generated nonsulfated heparan 20-mer conformations with boat/skew-boat ring puckers that cause a kink in the polymer chain and 2 S O conformations, Figure S31: Snapshot of nonsulfated heparan 20-mer MD-generated ensemble highlighting GlcNAcα1-4IdoA linkages, Figure S32: End-to-end distance distributions of MD-generated nonsulfated heparan 20-mer conformations with linkage dihedrals in secondary basins, Figure S33: ∆G(φ,ψ) plots for IdoAα1-4GlcNAc linkages flanking different IdoA conformations in the MD-generated nonsulfated heparan 20-mer ensemble, Figure S34: C-P plots for GlcNAc and IdoA in the constructed nonsulfated heparan 20-mer ensemble, Figure S35: ∆G(φ, ψ) for aggregated glycosidic linkage data from the constructed nonsulfated heparan 20-mer ensemble, Figure S36: Bond potential energy probability distributions from constructed hyaluronan ensembles, Figure S37: Bond potential energy probability distributions from constructed nonsulfated dermatan ensembles, Figure S38: Bond potential energy probability distributions from constructed nonsulfated keratan ensembles, Figure S39: Bond potential energy probability distributions from constructed nonsulfated heparan ensembles.