Breeding Novel Chemistry in Willow: New Hetero Diels–Alder Cyclodimers from Arbusculoidin and Salicortin Suggest Parallel Biosynthetic Pathways

An investigation of phenolic glycosides extracted from Salix germplasm revealed that arbusculoidin (benzyl 1-O-β-d-glucopyranosyl-1-hydroxy-6-oxo-2-cyclohexenyl carboxylate) and its enolic 6-glycoside isomer, isoarbusculoidin, are widespread across the Salix family. An analysis of natural hybrid species and progeny from a willow breeding programme demonstrated that the putative biosynthetic pathway leading to the salicinoid family of phenolic glycosides runs in parallel to a “benzyl”-based pathway to arbusculoidin. The introduction of a known Diels–Alder reaction trait from Salix dasyclados, as well as an acylation trait, into progeny containing both salicyl- and benzyl- pathways caused the formation of all possible hetero-cyclodimers from mixtures of reactive dienone (acyl)glycosides that participated in cross-over reactions. In addition to providing access to new analogues of the anti-cancer dimer miyabeacin, the analysis of the breeding progeny also indicated that these dienone (acyl)glycosides are stable in planta. Although the immediate biosynthetic precursors of these compounds remain to be defined, the results suggest that the (acyl)glycosylation reactions may occur later in the pathway than previously suggested by in vitro work on cloned UGT enzymes.


Introduction
The metabolome of the Salicaceae family of trees and shrubs (Salix, Populus, and Chosenia) is characterised by a range of abundant phenolic glucosides that includes the salicinoids, a subgroup of pharmacological importance [1–3]. Structurally, this class of metabolites is based on salicin (salicyl alcohol-2-glucoside) and more complex examples, such as salicortin (1) (Figure 1) and 2′-acetylsalicortin (2). Salicortin and the acylated derivatives are major extractable metabolites in many species of willow and poplar, and these compounds have become of biosynthetic interest as they contain the unusual 1-hydroxy-6-oxo-2-cyclohexenyl 1-carboxylate (HCH) ring structural fragment. It is this reactive moiety that forms the basis of the newly discovered [3] anti-cancer dimeric salicinoid miyabeacin (3). The HCH group is not confined to salicinoids, and it has been known for some time that an analogue of salicortin based on benzyl alcohol exists in some willow species. Evans et al. [4] were the first to report on the structure and occurrence of benzyl 1-O-β-D-glucopyranosyl-1-hydroxy-6-oxo-2-cyclohexenyl carboxylate (4) and the isomer (5), which bears an enolic glucopyranosyl function, in Salix arbusculoides (Alaskan little tree willow). We are now naming these metabolites arbusculoidin (4) and isoarbusculoidin (5). These two compounds have become of interest to us in light of recent biochemical and genetic investigations into the proposed biosynthetic network leading to salicortin [5], in which the involvement of acyltransferases and the intermediacy of benzylbenzoate and/or salicyl-7-benzoate has been implicated [6]. Furthermore, the early involvement of a glycosyltransferase (UGT71L1/2/3) that glycosylates salicyl-7-benzoate in vitro has been suggested in both willow and poplar [7,8]. Gene knockouts in transgenic poplar have confirmed this UGT enzyme to be a key player in the in vivo pathway to salicortin [7,8]. Consideration of the benzyl alcohol-based structures of arbusculoidin (4) and isoarbusculoidin (5), however, suggests that there may be a parallel pathway from benzylbenzoate whereby formation of the HCH ring must precede glycosylation, which, in these molecules, occurs on the HCH moiety.

Further pieces of the biosynthetic jigsaw have emerged from the discovery of miyabeacin (3) from Salix miyabeana and S. dasyclados [3]. Miyabeacin has a diketo-1,4-ethenodecalin core structure and is the product of an intermolecular [4+2] Diels-Alder reaction between two units of an ephemeral dienone monomer called "salicortenone" (6) that is closely related to salicortin. In our previous report [3], it was also demonstrated that the biosynthetic process required to produce (6) and, subsequently, this new family of dimeric molecules in Salix spp. is a heritable trait that is genetically encoded in S. miyabeana and S. dasyclados and their progeny. In vivo cross-over [4+2] reactions between acylated analogues of (6) in hybrid willows generated derivatives such as acetylmiyabeacin (7), indicating that both glycosylation and acylation preceded the Diels-Alder reaction to miyabeacin.
Plants 2024, 13, 1609

In this paper, we turn our attention to the apparent parallel biosynthetic pathway leading to arbusculoidin (4) and isoarbusculoidin (5). We describe the quantitation of these compounds across a Salix species diversity panel and report the chemical structures of acetylarbusculoidin (8) and four new [4+2] cyclodimers of the miyabeacin family (9–12) formed in hybrid willows via condensation of dienone analogues of salicortin (1) and arbusculoidin (4). We define the genetic sources of three metabolic traits: (i) presence of the arbusculoidin pathway, (ii) acetylation of the glucose, and (iii) the formation of the dienone precursor and subsequent spontaneous [4+2] dimerisation reactions. We also show how these metabolic traits can be merged by hybridisation to produce novel structures in Salix. Cross-over Diels-Alder products show that pools of reactive glycosylated dienones from both salicortin and arbusculoidin pathways must co-exist in planta. However, the exact biosynthetic relationship (precursor versus product) between the dienones salicortenone (6) and arbusculoidenone (13) and the corresponding saturated ketones salicortin (1) and arbusculoidin (4) remains to be accurately defined.

Salicortin Analogues Based on Benzyl Rather than Salicyl Rings Are More Common than Expected
Authentic samples of arbusculoidin (4) and isoarbusculoidin (5) were isolated by preparative HPLC from a living accession (NWC1165) of Salix arbusculoides held in the UK National Willow Collection (NWC). Structures were characterised by 1H-NMR and high-resolution LC-MS, and are in agreement with those given in the literature [4]. To assess the natural ranges and levels of these analogues relative to salicortin in Salix genotypes across the NWC, we mined the 1H-NMR fingerprinting data of aqueous methanolic extracts of stem tissue from 191 willow accessions harvested at the dormant stage (February) [9]. Data from 7 accessions, harvested when plants had senesced (November), were also included in the study for comparison. The results of quantitation via NMR integration of compound-specific resonances against internal standard are shown in Table 1. Arbusculoidin (4) was detected in 35 accessions at levels varying from 35.35 ± 1.76 to 3.05 ± 0.08 mg/g dry weight in stems, while isoarbusculoidin (5) was quantified in only six of the accessions in concentrations ranging from 33.25 ± 3.05 to 3.83 ± 0.30 mg/g d.w. (Table 1). There was no obvious correlation between the levels of these metabolites and the levels of salicortin (1), which itself varied between undetectable and 111.27 ± 51.49 mg/g d.w. There was also no proportional relationship between the amounts of arbusculoidin (4) and isoarbusculoidin (5) produced. Accessions which produced high amounts of (4) at levels well in excess of salicortin (1) included NWC617 and NWC614 (both S. schwerinii) and the established commercial biomass variety "Resolution" (NWC1124), which also contains some S. schwerinii in its pedigree. None of these accessions accumulated (5), which was observed in its highest quantity in dormant stems of S. saposhnikovii (NWC1239). In this accession, isoarbusculoidin was also predominant over arbusculoidin. In NWC1165, an accession of S. arbusculoides, similar levels of (4) and (5) were produced, which was in agreement with data reported for this species [4]. Interestingly, several genotypes produced isoarbusculoidin only, and these accessions included NWC1060 S. gmelinii; NWC1270 S. triandra; and, at low concentrations, NWC577 S. dasyclados. Of these, NWC1270 S. triandra was the only accession to produce isoarbusculoidin (5) in the absence of salicortin (1).

Inspection of the LC-MS data of the polar extracts of S. schwerinii 'K3 Hilliers' (NWC615) and S. saposhnikovii (NWC1239) (Supplementary Figure S1) indicated that arbusculoidin (4) and isoarbusculoidin (5) eluted at 24.32 min and 21.88 min, respectively. Both compounds exhibited strong [M + formate − H]− ions at m/z 453.1389, corresponding to the molecular formula C21H25O11. In the case of arbusculoidin (4), a clear [M − H]− ion at m/z 407.1333 (C20H23O9) was also observed. Further analysis of the total ion chromatogram of NWC615 indicated a peak at 29.11 min with an [M + formate − H]− ion at m/z 495.1501, corresponding to a new compound with molecular formula C22H26O10, 42 mass units higher, which is characteristic of an acetylated derivative. The isolation of this metabolite and a comparison of its 1H-NMR data to (4) are shown in Table 2. Key differences were the presence of peaks corresponding to an acetyl moiety accompanied by a downfield shift of the double doublet at δ 4.77 (J = 8.0, 9.5) compared to the corresponding signal (δ 3.41) in (4). Interestingly, the 13C chemical shift of C-2′ was not significantly affected, although this is in line with previous studies on acetylated salicin analogues [9], where 13C signal differences were only observed when acetylation occurred at the C-6′ position and not at the C-2′ position. These data, together with key HMBC correlations, indicated that the acetyl moiety in (8) was attached at the C-2′ hydroxy group of the glucose.
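The ion assignments quoted for arbusculoidin, isoarbusculoidin, and the acetylated derivative can be checked with simple monoisotopic mass arithmetic. The sketch below is illustrative only and is not part of the reported workflow: the function names (`parse_formula`, `anion_mz`, `ppm_error`) are hypothetical, the isotope masses are standard rounded values, and the ion composition C23H27O12 for the formate adduct of the acetylated compound is an assumption obtained by adding CHO2 to the neutral formula C22H26O10.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (rounded standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON_MASS = 0.00054858  # added once for a singly charged anion


def parse_formula(formula):
    """Turn a simple formula such as 'C20H23O9' into {'C': 20, 'H': 23, 'O': 9}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def anion_mz(ion_formula):
    """Calculated m/z of a singly charged anion with the given elemental composition."""
    atoms = parse_formula(ion_formula)
    return sum(MONOISOTOPIC[el] * n for el, n in atoms.items()) + ELECTRON_MASS


def ppm_error(observed, calculated):
    """Signed mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


# Ion composition -> observed m/z as quoted in the text.
# C23H27O12 is the assumed formate adduct of the neutral C22H26O10.
REPORTED_IONS = {
    "C20H23O9": 407.1333,   # [M - H]- of arbusculoidin
    "C21H25O11": 453.1389,  # [M + formate - H]- of (iso)arbusculoidin
    "C23H27O12": 495.1501,  # [M + formate - H]- of the acetylated compound
}

if __name__ == "__main__":
    for formula, observed in REPORTED_IONS.items():
        calculated = anion_mz(formula)
        print(f"{formula}: calc {calculated:.4f}, obs {observed:.4f}, "
              f"error {ppm_error(observed, calculated):+.1f} ppm")
```

Under these assumptions, each observed value lies within 5 ppm of the calculated one, and the two formate adducts differ by the mass of C2H2O (about 42.011), which is consistent with acetylation.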

The position of acetylation was consistent with other salicinoids observed in Salicaceae. Whilst other derivatives can also show acetylation at the C-6 position of the glucose, this is often as a result of acyl migration following extraction or treatment in basic conditions [10,11]. The absolute configuration of the glucose in (4) was previously determined by Evans et al. [4], after cation exchange resin acid hydrolysis, to be the D enantiomer, and the assignment of the β-anomer followed from the H-1′/H-2′ coupling constant (7.5 Hz). The absolute configuration at C-9 in (4) and (8) was assumed to be (S) following from Feistel et al. [12], who determined this via circular dichroism for a number of HCH-bearing salicinoids, including salicortin (1).

Novel Dimers Arising from Arbusculoidin-Producing Salix Genotypes Hybridising with 'Diels-Alder' Miyabeacin-Producing Genotypes
Our previous work [3] identified [4+2] dimeric structures, e.g., miyabeacin (3), in Salix miyabeana and S. dasyclados, and showed that alternative acyl-glucose derivatives (acetylated, benzoylated) of these dimers can be formed by conventional crossing with careful parental selection. To assess the effect of combining the benzyl (versus salicyl)-containing series, exemplified by hybridisation of arbusculoidin (4)-producing genotypes with accessions known to produce miyabeacin family dimers, we assessed the pedigrees of genotypes bred at Rothamsted Research as part of the BEGIN biomass improvement programme [13]. Within the available progeny from trial 'RR/CS/722', 70 examples contained either S. dasyclados or S. miyabeana in their pedigree, and thus were potentially capable of producing miyabeacin (3) (Supplementary Table S1). The panel was screened using LC-MS in negative ion mode to determine the levels of arbusculoidin (4) and, thus, the potential to produce novel dimeric entities via Diels-Alder cycloaddition in planta (Figure 2). The highest contents of arbusculoidin (4) were found in breeding progeny RR10038 and RR10036, both arising from a cross between RR08083 (NWC607 S. rehderiana × NWC619 'Lapin') and NWC446 S. aegyptiaca. However, despite containing high levels of salicinoids, these hybrids did not produce any miyabeacin (3) or other dimeric compounds, suggesting that the Diels-Alder capability was not present (or not operating) in these progeny. When the miyabeacin (3) content was also considered alongside the arbusculoidin content, two hybrids (RR10140 and RR10143) contained significant peaks for both markers. Both hybrids were siblings from a cross between 'RR08083' (NWC607 S. rehderiana × NWC619 'Lapin') and NWC577 S. dasyclados '77056'. Inspection of the LC-MS data from RR10143 revealed four additional, higher-molecular-weight peaks in the total ion chromatogram (Figure 3), as well as peaks for salicortin (1), 2′-acetylsalicortin (2), arbusculoidin (4), isoarbusculoidin (5), acetylarbusculoidin (8), miyabeacin (3), and acetylmiyabeacin (7). For chemical structure determination, a larger-scale extraction of RR10143 was performed, and semi-preparative HPLC enabled isolation of each novel entity. Structures were elucidated using MS/MS and NMR.
To elucidate the relative configuration and exact structures, NMR data were examined more closely. In the case of data for (9), the signals relating to the core moiety showed good agreement (chemical shift and coupling constants) with those of miyabeacin (3), with the exception of the signal for C-9, which showed a downfield shift of 6.3 ppm (δC 88.5 ppm). Further connections were established by 2D NMR (COSY, TOCSY, HSQC, and HMBC). A 1H-13C HMBC correlation from the anomeric proton Glc-1 (δ 4.35) to C-9 indicated that glucose was directly attached to the core and explained the resonance shift for C-9. The position of H-10 (δ 3.87) was elucidated by a TOCSY experiment, where correlations among all protons of this spin system were detected. Both H-7′ (δ 5.30 and 5.22) and H-10 showed HMBC correlations with C-8, confirming their position in relation to the core structure. The benzylic methylene signals were present as H2-7′ (δ 5.30 and 5.22), while the other two sets of doublets H2-7 (δ 5.37 and 5.15) were representative of the salicyl moiety attached to the core. The specific position of each hydrogen and carbon in the molecule was assigned through COSY and HMBC analysis, and key correlations are highlighted in Figure 5. With this, we concluded that compound (9) had arisen via a [4+2] Diels-Alder cyclodimerisation reaction between salicortenone (6), acting as the diene, and arbusculoidenone (13), behaving as a dienophile.
Compound (10) is an isomer of (9), and although their NMR data are very similar, we confirmed using 2D NMR techniques that they differ in regiochemistry. In (10), we observed a 5.5 ppm downfield shift of the 13C signal of C-20 (δC 85.5 ppm), indicating a C-20-glucose attachment in this derivative. This was further corroborated by the presence of an HMBC correlation between the anomeric proton Glc-1′ (δ 4.71) and C-20. The methylene protons H2-7 (δ 5.41 and 5.18) showed HMBC correlations with C-1 (δC 158.2) and C-8 (δC 173.8), confirming that the salicyl moiety was attached to C-8. COSY and key HMBC correlations for (10) are also shown in Figure 5. Compound (10) is therefore suggested to be biosynthesised via a cyclodimerisation reaction between arbusculoidenone (13), acting as the diene, and salicortenone (6), behaving as a dienophile.
The LC-MS data of (11) and (12) were suggestive of acetylated derivatives of (9) and (10). Inspection of their NMR data (Table 3) confirmed that compound (11) was the Glc-2′-acetyl derivative of (9). New resonances at δH/C 2.12 (s)/23.1 and δC 175.8 arose from the methyl and carbonyl portions of the acetyl moiety. Placement of the acetate group was elucidated by HMBC experiments: a correlation of the acetyl carbonyl to a double doublet at δ 4.76 (J = 8.0, 9.4), assigned to Glc-2′, placed it on the glucose directly attached to the dimer core structure. This indicated that (11) represented a hybrid dimer formed via the Diels-Alder cycloaddition of salicortenone (6) and 2′-acetylarbusculoidenone (14). This was further confirmed via the inspection of the LC-MS spectra for (11), which showed an in-source fragment at m/z 447 (Supplementary Figure S5), corresponding to 2′-acetylarbusculoidenone (C22H23O10) and 42 mass units higher than the unacetylated arbusculoidenone monomer normally appearing at m/z 405. Compound (12) was the Glc-2′-acetyl derivative of (10). The new acetate resonances were observed at δH/C 2.12 (s)/23.1 and δC 175.8, relating to the methyl and carbonyl portions, respectively. Again, a correlation of the carbonyl signal with the Glc-2′ proton (double doublet at δ 4.76 (J = 8.0, 9.4)) placed the acetate on the glucose directly attached to the dimer core structure rather than on the glucose of the salicyl portion in the other half of the molecule. LC-MS again showed the expected in-source fragment at m/z 447, corresponding to the acetylarbusculoidenone monomer (Supplementary Figure S5).
The above compounds represented the novel metabolites that could be isolated and structurally characterised by NMR. To investigate whether further (lower abundance) dimeric entities had been formed, we examined the total ion chromatogram and relevant extracted ion spectra of the RR10143 extract. Two further peaks with parent ions at m/z 869 appeared at 29.62 and 29.78 min. Inspection of their MS/MS and associated monomer ions (m/z 405 and 463) suggested a pair of isomeric compounds generated via the Diels-Alder reaction between acetylsalicortenone and arbusculoidenone (Supplementary Figure S6). Additionally, at 31.76 and 32.04 min, peaks with [M − H]− ions at m/z 911, corresponding to metabolites with formula C44H47O21, were detected. Analysis of the monomer ions (m/z 463 (not detected) and 447) from in-source fragmentation (Supplementary Figure S7) was indicative of the diacetyl hybrid dimer generated from 2′-acetylsalicortenone and 2′-acetylarbusculoidenone. Also present in the low-abundance peaks was an entity appearing at 23.00 min with [M − H]− at m/z 811, corresponding to a molecule with molecular formula C40H43O18. Analysis of the in-source fragments revealed a single ion at m/z 405, corresponding to arbusculoidenone (Supplementary Figure S8). Thus, this compound was putatively assigned as the product of the reaction of two arbusculoidenone monomers, one acting as the dienophile and the other as the diene. Finally (Supplementary Figure S9), traces of two compounds could be detected at 27.41 and 27.53 min. Both entities showed parent ions at m/z 853, and both showed monomer ions at m/z 405 and 447, corresponding to arbusculoidenone and 2′-acetylarbusculoidenone.
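The nominal masses of these minor dimers follow directly from the monomer fragment ions, because a Diels-Alder cycloaddition joins two dienones without loss of atoms. The following sketch, which is not part of the original analysis, enumerates ordered (diene, dienophile) pairs from the three monomer ions quoted above and computes the expected [M − H]− of each dimer. The dictionary and function names are hypothetical, and treating the two regiochemical roles as giving distinct isomers is a simplifying assumption.

```python
from itertools import product

# Nominal m/z of in-source [M - H]- monomer fragments quoted in the text.
MONOMER_IONS = {
    "arbusculoidenone": 405,
    "2'-acetylarbusculoidenone": 447,
    "2'-acetylsalicortenone": 463,
}


def dimer_mz(diene, dienophile):
    """Nominal [M - H]- of the [4+2] adduct of two monomers.

    Each fragment ion is a deprotonated monomer, so the neutral monomer is
    (fragment + 1). The dimer is the sum of both neutrals; removing one
    proton gives its [M - H]- ion.
    """
    neutral_diene = MONOMER_IONS[diene] + 1
    neutral_dienophile = MONOMER_IONS[dienophile] + 1
    return neutral_diene + neutral_dienophile - 1


def enumerate_dimers():
    """Group every ordered (diene, dienophile) pair by expected dimer m/z."""
    by_mz = {}
    for diene, dienophile in product(MONOMER_IONS, repeat=2):
        by_mz.setdefault(dimer_mz(diene, dienophile), []).append((diene, dienophile))
    return by_mz


if __name__ == "__main__":
    for mz, pairs in sorted(enumerate_dimers().items()):
        print(mz, pairs)
```

With these inputs, the enumeration reproduces the parent ions at m/z 811, 853, 869, and 911 described above, with two ordered pairs for each hetero-dimer and one for the arbusculoidenone homodimer. It also lists combinations (homodimers of the acetylated monomers) that are not reported in the text.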
The dominant salicinoid in NWC607 (S. rehderiana), the female grandparent, was 2′-acetylsalicortin (2), with lower levels of salicortin (1). Neither arbusculoidin (4) nor isoarbusculoidin (5) was present in this accession, and no dimeric compounds were detected. In the case of NWC619 'Lapin', the male grandparent, itself a hybrid of S. dasyclados × schwerinii, both salicortin (1) and 2′-acetylsalicortin (2) were present, with higher levels of the unacetylated compound (1). In addition, the profile of NWC619 'Lapin' also contained significant levels of arbusculoidin (4), but not isoarbusculoidin (5). In terms of dimeric compounds, low levels of miyabeacin (3) and acetylmiyabeacin (7) could also be detected. None of the parental/grandparental germplasms produced detectable levels of the new dimer derivatives (9–12) or 2′-acetylarbusculoidin (8). It was, therefore, concluded that the ability to assemble these molecules and the other (putative) minor configurations had arisen as a consequence of the novel mixing of substrates and reactions achieved in the parental hybridisation process. Although the female parent of RR10143 was no longer available for analysis, the likely heritage of each structural element in the novel dimers (Figure 6) was traceable through the grandparents and can be summarised as follows: (i) dienone accumulation and resultant Diels-Alder dimerisation capability is derived from S. dasyclados; (ii) the expression of the arbusculoidin pathway originates from the S. schwerinii heritage in the male grandparent, NWC619 'Lapin', from the maternal side; and (iii) the acetylation trait arises from the female grandparent S. rehderiana, also on the maternal side.

Discussion
Our results clearly show that, in many willows, the salicinoid biosynthetic pathway suggested to originate from salicylbenzoate (SB) operates in parallel with a homologous pathway based on benzylbenzoate (BB). This is in line with the original report from cloned poplar enzymes studied in vitro [6], which stated that two acyltransferases, PtACT49 and PtACT47, for which different substrate preferences have been demonstrated in vitro, may be responsible for the genesis of BB and SB, the two suggested entry points into the postulated parallel pathways in planta (Figure 7). We cannot, however, rule out the possibility that, in planta, SB is formed by hydroxylation of BB. Additionally, the two pathways share biosynthetic machinery, i.e., the enzymes required to carry out the transformation of the lower benzoate rings into the HCH function in both salicortin (1) and arbusculoidin (4). The balance between the two pathways is genotype-dependent in our survey, but is likely to be dynamic in expression, with relative flux varying due to genetic and/or environmental influences. More detailed time-course experiments will be necessary to investigate these aspects. There was no evidence that the ratio of arbusculoidin to isoarbusculoidin was fixed, and thus, the transformation of the aglycone to the tertiary glycoside in (4) or the enol-glycoside in (5) may be a result of different UGT enzymes. However, it is also possible that the enol-glycoside is formed initially and then rearranges to the more stable tertiary glycoside. Nevertheless, it is now clear that the glycosylation (and subsequent acetylation of the glucose moiety) occurs after a proposed oxidative dearomatisation of the benzoate ring in benzylbenzoate to give arbusculoidin/isoarbusculoidin (see Figure 7). Extending this logic to the parallel salicortin pathway suggests that the HCH ring may be formed before the glycosylation by the already characterised UGT71L2/3 enzymes, which are known to be involved in the salicinoid pathway [7,8]. These enzymes could act on the upper salicyl ring hydroxy group, leading to the 2-glycosylated dienones, which themselves are the obligate reactants participating in the apparently spontaneous Diels-Alder reaction to miyabeacin (3) and its analogues [3]. It remains to be seen whether these particular UGTs have the ability to ortho-glycosylate a wider range of salicyl intermediates, such as the aglycone of salicortenone (6, aglycone). Our previous experience with the synthesis and handling of SB [8] suggests that the formation of the 2-glycoside has a stabilising effect on the ester bond by removing the anchimeric effect of the free 2-hydroxy group. We would also suggest that passage of the non-polar BB and SB through the enzyme complex (metabolon) that leads to the HCH ring is rapid and is terminated by glycosylation to the much more polar glycosides, which are compartmentalised and have a greater lifetime in planta relative to the aglycones.

The observed dimeric products are believed to have arisen from an inter-molecular Diels-Alder reaction of glycosylated orthoquinols such as (6). Such reactions are well documented to be exquisitely regio- and stereospecific, yielding totally endo-selective products in both natural and synthetic scenarios [14,15]. In nature, the [4+2] dimerisation reaction of the orthoquinols is believed to be spontaneous, rather than enzyme-catalysed. We believe this is also the case for miyabeacin [3] and the new analogues reported here. The absolute configuration of these new dimeric compounds has not been directly determined here due to insufficient isolated material. However, the absolute configuration can be deduced from the proven 9(S) absolute configuration of the HCH group in salicortin (1) [12]. This stereo-centre is carried through into the dienones, such as (6), (13), and (14), that partake in the endo-specific cyclodimerisations. Natural glucosides are all D-enantiomers, and the β-anomeric configuration of the glucose groups follows from the NMR coupling data. Furthermore, as we have shown [3] for miyabeacin (3), hydrolysis of these compounds is complex, leading to ester cleavage, decarboxylation, and also breakdown of the core via retro Diels-Alder reactions. This negates any attempt to directly determine the absolute stereochemistry of the aglycones after hydrolysis.

The heritability observed in the breeding study strongly suggests genetic control over the build-up of pools of precursor dienone glycosides. The observation of products from cross-over [4+2] reactions between these (acyl)glycosylated dienones of both pathways confirms that these reactive dienones have a significant lifetime in planta. However, it remains unclear whether the inter-molecular Diels-Alder chemistry occurs in the living plant or is realised upon tissue processing and extraction of this mixture of dienone-glycosides into solvents. Further genetic analysis and careful study of engineered plants will be necessary before we can decide whether there is any involvement of a specific binding protein in accelerating the dimerisation. Also not resolved is the exact biosynthetic relationship between the dienones (salicortenone (6), arbusculoidenone (13), and their corresponding aglycones) and the saturated ketones (salicortin (1), arbusculoidin/isoarbusculoidin (4/5), and their aglycones). In our previous paper [3], we deduced that the genetics could not determine whether (6) (or aglycone) was a precursor that is normally reduced to (1) (or aglycone), or whether (1) (or aglycone) is hydroxylated and dehydrated to form (6) (or aglycone). These scenarios remain the same, with the added complexity of the parallel benzyl pathways (13) and (4/5). However, what is clear is that with the judicious choice of breeding partners, new substrates can be introduced to the Salix dasyclados (or S. miyabeana)-derived 'dienone accumulation' trait, and, as a consequence, the subsequent spontaneous intermolecular Diels-Alder reactions can be used to produce further dimeric analogues, not only to gain insight into the biosynthetic pathway, but also to generate a panel of new analogues for further structure-activity assessment of the anti-cancer metabolite miyabeacin.

General Experimental Procedures
General procedures for plant tissue collection, metabolite extraction, HPLC, 600 MHz NMR spectroscopy, and UHPLC-MS have been reported previously [9]. Plant material was collected from the National Willow Collection (NWC) and from existing experimental breeding progeny growing in field plots at Rothamsted Research. Samples were harvested into liquid nitrogen and stored at −80 °C before freeze-drying. Analysis was carried out on freeze-dried, milled tissue. Tissues for metabolite screening were harvested at dormancy (January/February) or senescence (November), as described in Table 1.

1H-NMR profiling of aqueous methanol extracts was conducted at 600 MHz according to previously described methods [9]. The levels of target metabolites were quantified directly from the 1H-NMR data relative to an internal standard (d4-TSP, 0.01% w/v). Characteristic regions for each metabolite were used for the quantitation as follows: salicortin

Figure 1. Compound structures. Structure numbering system does not reflect biosynthetic provenance; for clarity in NMR assignments, numbering was maintained for each structural module.


Figure 2. Comparison of miyabeacin and arbusculoidin levels observed in LC-MS analysis of polar (80:20 water:methanol) extracts of stem tissue from breeding progeny of trial 'RR/CS/722'. Bars represent integrated peak areas corresponding to miyabeacin (3) and arbusculoidin (4) in negative-mode MS following reversed-phase HPLC.

Figure 7. Suggested architecture and parallel relationship of salicyl- and benzyl-based biosynthetic pathways in willow and poplar. Enzymes in red have been previously biochemically characterised in vitro; however, the precise nature and range of their substrates in planta remain to be defined. The green box represents a metabolic channel (metabolon) containing mainly unstable and transient intermediates in an oxidative/reductive chain leading to the HCH ring. Exit from this metabolon is via glycosylation, which, for the salicinoids, has a stabilising effect. The blue box represents a pool of (acyl)glycosyl-dienones that provides the substrates for the observed combinatorial cross-over inter-molecular [4+2] Diels-Alder chemistry. Structure numbers relate to Figure 1.


4.2. NMR Screen of NWC Accessions and Quantification of Salicortin (1), Arbusculoidin (4), and Isoarbusculoidin (5)

eluted at 24.32 min and 21.88 min, respectively. Both compounds exhibited strong [M + formate − H]− ions at m/z 453.1389, corresponding to the molecular formula C21H25O11. In the case of arbusculoidin (4), a clear [M − H]− ion at m/z 407.1333 (C20H23O9) was also observed. Further analysis of the total ion chromatogram of NWC615 indicated a peak at 29.11 min with an [M + formate − H]− ion at m/z 495.1501, corresponding to a new compound with molecular formula C22H26O10