The Advanced Floating Chirality Distance Geometry Approach―How Anisotropic NMR Parameters Can Support the Determination of the Relative Configuration of Natural Products

The configurational analysis of complex natural products by NMR spectroscopy is still a challenging task. The relative configuration is usually assigned by analyzing interproton distances from NOESY or ROESY spectra (qualitatively or quantitatively) together with scalar (J) couplings. About 15 years ago, residual dipolar couplings (RDCs) were introduced as a tool for determining the configuration of small organic molecules. In contrast to NOEs/ROEs, which are local parameters (distances up to 400 pm can be detected for small organic molecules), RDCs are global parameters that also provide structural information on long-range relationships. A disadvantage of RDCs is that the sample must be prepared in an alignment medium to create the required anisotropic environment. Here, we discuss the configurational analysis of five complex natural products: axinellamine A (1), tetrabromostyloguanidine (2), 3,7-epi-massadine chloride (3), tubocurarine (4), and vincristine (5). Compounds 1–3 are marine natural products, whereas 4 and 5 are from terrestrial sources. The chosen examples work out the limitations of NOEs/ROEs in the configurational analysis of natural products and provide an outlook on the information obtained from RDCs.

Figure S1. Plot of the best-fit (minimum pseudo energy) DG structure of Axinellamine A (1) with color-coded representation of all NOE contacts used in the configurational and conformational analysis. The color scale was adapted from calculated final NOE violations, ranging from -0.40 Å (blue) to +0.40 Å (red).
Table S1. NOE data used for Axinellamine A (1) (37 NOEs). The table lists the NMR-derived expectation values, the allowed lower and upper bounds (usually ± 10%), the averaged distances back-calculated from structural data, and the corresponding residuals, which are listed only if the back-calculated value falls outside the allowed range; all distances are given in Å. The NOE labels refer to the formula shown on the right; where a label is given as, e.g., HA|HB, the NOE was recalculated from structures as an r^-6 average over all pairs of atom-atom distances.
Figure S2.
(a) For Axinellamine A (1), an extended unrestricted floating-chirality DG calculation (100,000 structures) produces all 128 diastereomers (eight stereogenic centers, 2^(8-1) = 128 configurational families; only C14 was fixed in its configuration in order to avoid enantiomeric structures). In this simulation, only holonomic distance bounds (bond lengths) were used as restraints, and no NOE information was used to restrict configurational (and conformational) space. The top plot shows the minimum holonomic pseudo energy of each of the 128 diastereomers, and the center plot provides an overview of the total number of structures (counts as columns) generated in this simulation for each configuration. The bottom plot identifies each configuration generated: the rows of filled circles indicate, for each configurational family, which stereogenic center (C1, C13, C11, C12, C9, C5, and C10) is inverted relative to the correct configuration of Axinellamine A (1; the correct configuration is labeled no. #1 on the left side of the plots), and a missing circle indicates retention of the stereogenic center as compared to 1. The DG approach samples even highly strained configurations (marked in orange, configurations no. #97-112), although these occur with significantly lower statistical sampling rates. In all 16 of these highly strained diastereomers, at least C5 and C10 feature an inverted configuration, which leads to a two-fold trans-annelation of two of the five-membered rings (see Figure S3 below for a superposition plot of these structures).
(b) Using NOE data for Axinellamine A (1) (37 NOEs), an extended floating-chirality DG calculation (100,000 structures) produced 74 out of the 128 possible diastereomers; the center and bottom plots visualize the individual configurational families and their relative counts as described in (a). The leftmost column displays the sampling rate for the correct configuration of Axinellamine A (1; about 66,000 correct structures were generated in this rDG simulation, note the logarithmic scale after the axis break), whereas the sampling rate of wrong configurations drops significantly as the total pseudo energy rises. The top plot shows the minimum total error (= pseudo energy) for each configurational family: the correct structure (leftmost entry labeled no. #1, large black circle) emerges as the global pseudo energy minimum structure (pseudo energy 3.30), the first wrong configuration (wrong configuration at C1, red circle) features a significantly higher pseudo energy (Δ = 3.15), and any alternative diastereomer is characterized by even higher pseudo energies (note the break in the scale of the axis). Nevertheless, DG efficiently samples all possible configurations under the restraints of the experimental NOE data, and the correct configuration of 1 emerges as the best-fit, lowest pseudo energy structure from all rDG simulations described here.
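The count of 128 configurational families in Figure S2 follows from simple enumeration: with one of the eight stereogenic centers (C14) fixed, each of the remaining seven can be either retained or inverted. The following is a minimal sketch of that bookkeeping only, not of the DG program itself; the function name and the representation of a family as a tuple of inverted centers are illustrative assumptions.

```python
from itertools import product

# Free stereogenic centers as listed in the caption; C14 is held fixed
# so that mirror images (enantiomers) are not counted twice.
FREE_CENTERS = ["C1", "C13", "C11", "C12", "C9", "C5", "C10"]


def enumerate_diastereomers(free_centers):
    """Return one tuple of inverted centers per relative configuration.

    Hypothetical helper: an empty tuple means that no center is inverted,
    i.e. the reference configuration.
    """
    families = []
    for flags in product((False, True), repeat=len(free_centers)):
        inverted = tuple(c for c, flag in zip(free_centers, flags) if flag)
        families.append(inverted)
    return families


families = enumerate_diastereomers(FREE_CENTERS)
print(len(families))   # 2**(8 - 1) = 128

# Families in which both C5 and C10 are inverted (2**5 = 32 of them).
both_inverted = [f for f in families if "C5" in f and "C10" in f]
print(len(both_inverted))
```

According to the caption, the 16 highly strained diastereomers all have at least C5 and C10 inverted, so they form a subset of the 32 families selected in the last step.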
Figure S3. (a) Plot of the central ring structure of Axinellamine A (1) with the correct configuration of all stereogenic centers as identified from the rDG simulation described in Figure S2b; this structure is identical to the one shown in Figure S1. (b) Superposition of the 16 highly strained diastereomers of 1, identified from the unrestricted DG simulation as described in Figure S2a, and marked in these plots in orange. All these highly distorted structures feature a wrong configuration at C5 and C10 (atoms marked in orange in the plot on the right side (c)), and the structure shown in (c) corresponds to the configurational family labeled no. #97 in Figure S2b.
Figure S4. Plot of the best-fit (minimum pseudo energy) DG structure of Tetrabromostyloguanidine (2) with color-coded representation of all NOE contacts used in the configurational and conformational analysis. The color scale was adapted from calculated final NOE violations, ranging from -0.40 Å (blue) to +0.40 Å (red).
[…] all distances are given in Å). The NOE labels refer to the formula shown on the right; where a label is given as, e.g., HA|HB, the NOE was recalculated from structures as an r^-6 average over all pairs of atom-atom distances.
Figure S5. (a) The top left plot shows pseudo energy sorted and ranked rDG structures of Tetrabromostyloguanidine (2) as discussed in the main paper. In this rDG simulation, a chiral volume restraint has been applied arbitrarily to a single stereogenic center (i.e., C10) in order to avoid enantiomeric structures.
(b) On the top right, the plot shows the corresponding pseudo energy of ranked rDG structures emerging from a simulation that did not use any chiral restraints at all, except on planar sp2-type centers (chiral volume of zero). The features of this plot, and in particular the energy steps, are identical to those of the plot shown in (a), indicating that the arbitrary use of a single chiral volume restraint does not affect the final results of the configurational and conformational analysis. In fact, this simulation produced the enantiomeric structures shown below (Figure S6).
(c) On the bottom right, single chiral volume restraints were applied in rDG simulations on Tetrabromostyloguanidine (2), each using one of the eight stereogenic centers as a restrained and fixed reference. Restraining any one of C6, C10, C11, C12, C16, C17, C18, or C20 produces, within very narrow margins of error, the same energy step characteristics that are already apparent in plots (a) and (b). In all cases, the first small step in energy results from an alternative assignment of diastereotopic methylene protons (see inset plot), and the first wrong relative configuration of one of the eight stereogenic centers (indicated by the bold-face large symbols) is always characterized by a significantly higher error in its total pseudo energy.
Figure S6. Ball-and-stick type representation of the two lowest pseudo energy structures emerging from an rDG simulation on Tetrabromostyloguanidine (2) using no chiral restraints on any of the eight stereogenic centers, as shown in Figure S5b. These structures are exact enantiomers with identical pseudo energies.
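Why two enantiomers obtain identical pseudo energies when only distance restraints are active, and why a single chiral volume restraint is enough to select one of them, can be illustrated with a signed volume. The sketch below assumes a common definition of the chiral volume as the scalar triple product of three substituent vectors around a center; the exact definition used in the authors' DG program is not given in this text, and the coordinates are invented for illustration.

```python
import math


def chiral_volume(center, a, b, c):
    """Signed volume spanned by three substituent vectors from a center.

    Assumed definition: v1 . (v2 x v3). The sign encodes handedness;
    a planar arrangement gives zero.
    """
    v1 = [a[i] - center[i] for i in range(3)]
    v2 = [b[i] - center[i] for i in range(3)]
    v3 = [c[i] - center[i] for i in range(3)]
    cross = [
        v2[1] * v3[2] - v2[2] * v3[1],
        v2[2] * v3[0] - v2[0] * v3[2],
        v2[0] * v3[1] - v2[1] * v3[0],
    ]
    return sum(v1[i] * cross[i] for i in range(3))


def mirror(point):
    """Reflect a point through the yz plane."""
    return (-point[0], point[1], point[2])


# Invented coordinates for a center and three substituents.
center, a, b, c = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

v_original = chiral_volume(center, a, b, c)
v_mirrored = chiral_volume(*(mirror(p) for p in (center, a, b, c)))
v_planar = chiral_volume(center, a, b, (-1.0, -1.0, 0.0))

# Interatomic distances are unchanged by the reflection.
d_original = math.dist(a, c)
d_mirrored = math.dist(mirror(a), mirror(c))

print(v_original, v_mirrored, v_planar)
print(d_original, d_mirrored)
```

Because every interatomic distance is preserved on reflection, a purely distance-based error function cannot distinguish the two mirror images, whereas the sign of the volume at a single restrained center can.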
Figure S7. Ball-and-stick type representations of all low pseudo energy structures up to the first wrong configuration emerging from rDG simulations on Tetrabromostyloguanidine (2) using a single chiral volume restraint on one of the eight stereogenic centers, as shown in Figure […]. Panels: chiral volume restraint applied to (a) C6, (b) C10, (c) C11, (d) C12, (e) C16, (f) C17, (g) C18, and (h) C20.
Figure S8. (a) The left plot shows pseudo energy sorted and ranked rDG structures of 2 as discussed in the main paper. In addition, the color-coded symbols designate rankings of alternative assignments of CH2 protons (methylene groups C13 and C19), where the relative stereodescriptors "lk" (like) and "ul" (unlike) refer to the global pseudo energy minimum "lk/lk" assignment; the right scale gives Δ values relative to the global pseudo energy minimum. Structures with different CH2 assignments were found below the first wrong configuration of a stereogenic center (ranked no. #378, Δ = 5.32); their total numbers are lk/lk: 162 (black), ul/lk: 44 (blue), lk/ul: 146 (green), and ul/ul: 25 (orange), with rankings of first occurrence lk/lk: #1 (Δ = 0.00), ul/lk: #277 (Δ = 2.18), lk/ul: #99 (Δ = 0.14), and ul/ul: #302 (Δ = 2.28). (b) The right plot shows the pseudo energy of ranked rDG structures emerging from a simulation in which chiral volume restraints were used to restrict the pseudo configuration of both methylene groups to the correct low-energy configuration of plot (a). The number of correct lk/lk structures located below the first structure of wrong configuration (ranked no. #420) increases to 419. The energy steps become significantly more pronounced as alternative assignments of methylene groups are discarded, and the low-energy structures cluster into three very distinct conformational families A-C. Below, superimposed structure plots of these different conformations are given (Figure S9).
Figure S10. Plot of the best-fit (minimum pseudo energy) DG structure of 3,7-epi-Massadine chloride (3) with color-coded representation of all NOE contacts used in the configurational and conformational analysis. The color scale was adapted from calculated final NOE violations, ranging from -0.40 Å (blue) to +0.40 Å (red).
[…] The NOE labels refer to the formula shown on the right; where a label is given as, e.g., HA|HB, the NOE was recalculated from structures as an r^-6 average over all pairs of atom-atom distances.
Figure S11. Detailed analysis of the assignments of diastereotopic methylene protons of 3,7-epi-Massadine chloride (3). (a) The left plot shows pseudo energy sorted and ranked rDG structures of 3 as discussed in the main paper. In addition, the color-coded symbols designate rankings of alternative assignments of CH2 protons (methylene groups C1' and C1''), where the relative stereodescriptors "lk" (like) and "ul" (unlike) refer to the global pseudo energy minimum "lk/lk" assignment; the right scale gives Δ values relative to the global pseudo energy minimum. Structures with different CH2 assignments were found; the total numbers ranked below the first wrong configuration of a stereogenic center (ranked no. #56, Δ = 0.51) are lk/lk: 25 (black), ul/lk: 0 (blue), lk/ul: 30 (green), and ul/ul: 0 (orange), with rankings of first occurrence lk/lk: #1 (Δ = 0.00), ul/lk: #88 (Δ = 0.74), lk/ul: #25 (Δ = 0.29), and ul/ul: #124 (Δ = 1.02). (b) The right plot shows the pseudo energy of ranked rDG structures emerging from a simulation in which chiral volume restraints were used to restrict the pseudo configuration of both methylene groups to the correct low-energy configuration of plot (a). The number of correct lk/lk structures located below the first structure of wrong configuration (ranked no. #123) increases to 122. The energy steps become significantly more pronounced as alternative assignments of methylene groups are discarded, and the low-energy structures cluster into a single conformational family A. Below, a plot of these superimposed structures is given (Figure S12).
Figure S12. Superimposed structures for the main low pseudo energy conformational family A (122 structures, Δ = 0.00) of 3,7-epi-Massadine chloride (3) as identified from the rDG simulation depicted in Figure S11b.
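The structure counts quoted in the methylene-assignment captions can be read as a ranking exercise: structures are sorted by pseudo energy, the rank of the first wrong configuration is located, and the structures ranked below it are counted per CH2 assignment. Consistent with this reading, the quoted counts sum to one less than the rank of the first wrong configuration (162 + 44 + 146 + 25 = 377 for no. #378, and 25 + 0 + 30 + 0 = 55 for no. #56). The sketch below uses invented toy energies and hypothetical field names; it is not the authors' analysis script.

```python
from collections import Counter


def rank_and_count(structures):
    """Rank structures by pseudo energy and summarize CH2 assignments.

    Each structure is a dict with the hypothetical keys 'energy',
    'ch2' (e.g. 'lk/lk') and 'correct' (True if all stereogenic
    centers have the correct relative configuration).
    """
    ranked = sorted(structures, key=lambda s: s["energy"])
    # 1-based rank of the first structure with a wrong configuration.
    first_wrong = next(
        (rank for rank, s in enumerate(ranked, start=1) if not s["correct"]),
        None,
    )
    cutoff = len(ranked) if first_wrong is None else first_wrong - 1
    counts = Counter(s["ch2"] for s in ranked[:cutoff])
    # Rank at which each CH2 assignment first occurs with correct configuration.
    first_seen = {}
    for rank, s in enumerate(ranked, start=1):
        if s["correct"]:
            first_seen.setdefault(s["ch2"], rank)
    return first_wrong, counts, first_seen


# Invented toy data, not taken from the simulations described above.
toy = [
    {"energy": 3.30, "ch2": "lk/lk", "correct": True},
    {"energy": 3.44, "ch2": "lk/ul", "correct": True},
    {"energy": 3.31, "ch2": "lk/lk", "correct": True},
    {"energy": 6.45, "ch2": "lk/lk", "correct": False},
    {"energy": 7.00, "ch2": "ul/ul", "correct": True},
]

first_wrong, counts, first_seen = rank_and_count(toy)
print(first_wrong, dict(counts), first_seen)
```

In the toy data, an assignment that first occurs only above the first wrong configuration (here ul/ul) receives a count of zero but still has a rank of first occurrence, which mirrors the ul/lk and ul/ul entries reported for 3.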

NOE, RDC, and Structure Data for Tubocurarine (4)
Figure S13. Plot of the experimental RDCs vs. their back-calculated values used for Tubocurarine (4; three alignment media labeled [A]-[C]). The corresponding best-fit (lowest total pseudo energy) structure model (see plot below) was obtained from the DG simulation using the data sets of all three alignment media.
Figure S14. Plot of the best-fit (minimum pseudo energy) DG structure of Tubocurarine (4) with color-coded representation of all NOE contacts used in the configurational and conformational analysis. The color scale was adapted from calculated final NOE violations, ranging from -0.40 Å (blue) to +0.40 Å (red).
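A correlation plot of experimental vs. back-calculated RDCs, as in Figure S13, is often summarized by a single agreement number. The text does not state which metric, if any, was used here; the sketch below assumes two common choices, the root-mean-square deviation and the quality factor Q = sqrt(sum((D_exp - D_calc)^2) / sum(D_exp^2)), and uses invented coupling values in Hz.

```python
import math


def rdc_agreement(d_exp, d_calc):
    """Return (RMSD, Q factor) for paired experimental and back-calculated RDCs.

    Assumed metrics; the function name and inputs are illustrative only.
    """
    if len(d_exp) != len(d_calc) or not d_exp:
        raise ValueError("need two non-empty lists of equal length")
    squared_error = sum((e - c) ** 2 for e, c in zip(d_exp, d_calc))
    rmsd = math.sqrt(squared_error / len(d_exp))
    q_factor = math.sqrt(squared_error / sum(e * e for e in d_exp))
    return rmsd, q_factor


# Invented values for one hypothetical alignment medium.
d_exp = [10.0, -5.0, 2.0]
d_calc = [9.0, -5.0, 4.0]

rmsd, q_factor = rdc_agreement(d_exp, d_calc)
print(round(rmsd, 3), round(q_factor, 3))
```

With several alignment media, such a function would be applied to each data set separately, since each medium has its own alignment tensor and hence its own set of back-calculated values.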

NOE, RDC, and Structure Data for Vincristine (5)
Figure S16. Plot of the best-fit (minimum pseudo energy) DG structure of Vincristine (5) with color-coded representation of all NOE contacts used in the configurational and conformational analysis. The color scale was adapted from calculated final NOE violations, ranging from -0.40 Å (blue) to +0.40 Å (red).
Table S7. NOE data used for Vincristine (5) (23 NOEs). The table lists the NMR-derived expectation values, the allowed lower and upper bounds (usually ± 10%), the averaged distances back-calculated from structural data, and the corresponding residuals, which are listed only if the back-calculated value falls outside the allowed range; all distances are given in Å. The NOE labels refer to the formula shown on the right; where a label is given as, e.g., HA|HB, the NOE was recalculated from structures as an r^-6 average over all pairs of atom-atom distances.
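The quantities listed in the NOE tables (expectation value, ± 10% bounds, back-calculated average, residual) can be reproduced with a few lines. The sketch below assumes the usual r^-6 averaging, r_avg = (mean of r^-6)^(-1/6), over all proton pairs of a grouped label, and a residual measured from the nearest violated bound with a negative sign for distances that are too short; this sign convention and the numerical values are assumptions for illustration.

```python
import math


def r6_average(distances):
    """Effective distance from r^-6 averaging over several atom pairs (in Å)."""
    mean_inv6 = sum(d ** -6 for d in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def noe_violation(r_calc, r_exp, tolerance=0.10):
    """Residual relative to the allowed range r_exp * (1 ± tolerance).

    Returns 0.0 inside the range, a negative value if r_calc is below the
    lower bound and a positive value if it is above the upper bound.
    """
    lower = r_exp * (1.0 - tolerance)
    upper = r_exp * (1.0 + tolerance)
    if r_calc < lower:
        return r_calc - lower
    if r_calc > upper:
        return r_calc - upper
    return 0.0


# Invented example: one proton contacts two equivalent partners at 2.5 and 3.5 Å.
r_avg = r6_average([2.5, 3.5])
violation = noe_violation(r_avg, r_exp=3.2)
print(round(r_avg, 2), round(violation, 2))
```

The r^-6 average is dominated by the shorter contact and therefore lies below the arithmetic mean of 3.0 Å, which is why grouped labels cannot be back-calculated from a simple mean distance.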