Aphid Resistance Segregates Independently of Cardenolide and Glucosinolate Content in an Erysimum cheiranthoides (Wormseed Wallflower) F2 Population

Plants in the genus Erysimum produce both glucosinolates and cardenolides as a defense mechanism against herbivory. Two natural isolates of Erysimum cheiranthoides (wormseed wallflower) differed in their glucosinolate content, cardenolide content, and their resistance to Myzus persicae (green peach aphid), a broad generalist herbivore. Both classes of defensive metabolites were produced constitutively and were not further induced by aphid feeding. To investigate the relative importance of glucosinolates and cardenolides in E. cheiranthoides defense, we generated an improved genome assembly, genetic map, and segregating F2 population. The genotypic and phenotypic analysis of the F2 plants identified quantitative trait loci, which affected glucosinolates and cardenolides, but not the aphid resistance. The abundance of most glucosinolates and cardenolides was positively correlated in the F2 population, indicating that similar processes regulate their biosynthesis and accumulation. Aphid reproduction was positively correlated with glucosinolate content. Although the overall cardenolide content had little effect on aphid growth and survival, there was a negative correlation between aphid reproduction and helveticoside abundance. However, this variation in defensive metabolites could not explain the differences in aphid growth on the two parental lines, suggesting that processes other than the abundance of glucosinolates and cardenolides have a predominant effect on aphid resistance in E. cheiranthoides.

An available genome sequence, transcriptomes, and metabolomic data for Erysimum cheiranthoides (wormseed wallflower) [23] facilitate the use of this species for studying the combined function of cardenolides and glucosinolates in plant defense. The most abundant cardenolides in E. cheiranthoides are the mono- and diglycosides of digitoxigenin, cannogenol, cannogenin, and strophanthidin [13,23]. Glucosinolates with side chains derived from tryptophan and methionine, which are abundant in the genetic model plant Arabidopsis thaliana (Arabidopsis), are also present in E. cheiranthoides [24,25]. Although the analysis of the E. cheiranthoides genome sequence identified homologs of most Arabidopsis glucosinolate biosynthesis genes [23], their specific functions have not been investigated.
In contrast to milkweeds, which have a community of highly adapted herbivores that are largely impervious to inhibition by cardenolides [26], there are no known Erysimum-specialist herbivores that are resistant to both glucosinolates and cardenolides. The relatively recent evolution of cardenolide production in Erysimum may account for the absence of such specialized herbivores. Experiments with two crucifer-specialist lepidopterans, P. rapae and Pieris napi (green-veined white butterfly), showed that Erysimum cardenolides deter both oviposition and feeding [15,27–33]. However, the larvae of P. xylostella, another crucifer-specialist lepidopteran, have been reported on E. cheiranthoides in field experiments [34].
Myzus persicae, a broad generalist herbivore, is able to feed on E. cheiranthoides in the laboratory and in nature [15,34,35], indicating that this species has a level of tolerance for both glucosinolates and cardenolides. When feeding on Arabidopsis, largely intact methionine-derived aliphatic glucosinolates pass through M. persicae, whereas tryptophan-derived indole glucosinolates are activated in the aphid gut [36]. Although Arabidopsis cyp79B2 cyp79B3 mutants, which lack indole glucosinolates, are more sensitive to M. persicae [37], tgg1 tgg2 mutants, which are deficient in glucosinolate-activating myrosinases, are not [38].
In the current study, we conducted experiments with two E. cheiranthoides accessions, Elbtalaue and Konstanz, with differing glucosinolate content, cardenolide content, and aphid resistance. By measuring these traits in a segregating Elbtalaue × Konstanz F2 population, we mapped genetic loci affecting the abundance of glucosinolates and cardenolides. Additionally, we used this mapping population to investigate the relative importance of these compounds in E. cheiranthoides' defense against M. persicae feeding.

Phenotypic Differences between the Elbtalaue and Konstanz Accessions
We investigated two inbred E. cheiranthoides accessions, Elbtalaue and Konstanz, for variation in aphid resistance, glucosinolate accumulation, and cardenolide accumulation. In no-choice assays, M. persicae survival and reproduction were higher on Konstanz than on Elbtalaue (Figure 1A,B). Similarly, aphids showed a preference for the detached leaves of Konstanz plants relative to those of Elbtalaue in choice assays (Figure 1C). We measured the relative abundance of glucosinolates (Figure 2) and cardenolides (Figure 3) in the Elbtalaue and Konstanz accessions, as well as in aphids feeding on the leaves of these plants. Among the eight glucosinolates and seven cardenolides that we reliably detected in E. cheiranthoides leaves, none had significantly increased abundance after 24 h of aphid feeding. However, some glucosinolates [4-hydroxyindol-3-ylmethylglucosinolate (4HI3M), 3-methylsulfonylpropylglucosinolate (3MSOP), and 4-methylsulfonylbutylglucosinolate (4MSOB)] exhibited a transient increase in abundance after 1 h, before decreasing to background levels after 24 h (Figure 2C,I,K).
Plants 2024, 13, 466
Whereas 4-methoxyindol-3-ylmethylglucosinolate (4MI3M) was more abundant in Konstanz than in Elbtalaue, both constitutively and after aphid feeding, 4HI3M was more abundant in Elbtalaue after 8 and 24 h of feeding (Figure 2B). The abundance of indole glucosinolates was generally higher in aphids on the Konstanz accession and, after 24 h feeding on Konstanz, the 4MI3M concentration was higher in aphids than at the 1 h and 4 h timepoints (p < 0.05, t-test; Figure 2F). Furthermore, 4MSOB was twice as abundant and n-methylbutylglucosinolate (NMB) was ten-fold more abundant in Elbtalaue than in Konstanz (Figure 2D). Likely due to the relatively low abundance of the aliphatic glucosinolate NMB in the Konstanz accession, this glucosinolate was not detected above background levels in assays of aphids collected from these plants (Figure 2H).
In the case of cardenolides, cheirotoxin, erysimoside, erychroside, and glucodigitoxigenin were all more abundant in Elbtalaue during the aphid feeding experiment (Figure 3A-D). However, there was no significant difference in the abundance of these cardenolides in the bodies of aphids feeding on these plants (Figure 3E-H). Three additional cardenolides, helveticoside, erycordin, and the structurally uncharacterized Dig-10, did not differ in abundance between Elbtalaue and Konstanz (Figure 3I-K). Uniquely among the detected cardenolides, helveticoside was not detected by HPLC-MS in aphids that were feeding on either of the two E. cheiranthoides accessions (Figure 3N).

Correlation of Aphid Resistance with Glucosinolate and Cardenolide Content
To investigate the genetic basis of the variations in aphid resistance, glucosinolate content, and cardenolide content, we generated an F2 population from a cross between the Elbtalaue and Konstanz accessions. The aphid survival and reproduction on F2 progeny were similar to those observed on Elbtalaue, yet significantly different from Konstanz (Figure 1A,B). This suggested that resistance was a dominant trait in this cross, and that multiple loci contributed to the higher levels of aphid resistance in Elbtalaue, relative to Konstanz.
From the 155 F2 plants that we used for aphid bioassays (Figure 1), we subjected 83 to glucosinolate analysis, cardenolide analysis, and transcriptome sequencing. After data normalization, we conducted a Pearson correlation analysis in order to (1) compare the aphid resistance (progeny production) and metabolite content and (2) understand the correlation in the abundances of the different metabolites (Figure 4A). Most comparisons of cardenolide and glucosinolate abundance showed a positive correlation. However, helveticoside abundance showed no significant correlation with the abundances of the measured glucosinolates. Aphid reproduction was significantly negatively correlated with the helveticoside abundance and positively correlated with glucosinolate abundance in the F2 population. There was no correlation between the abundance of the other measured cardenolides and aphid reproduction. We confirmed the negative effects of helveticoside on aphid reproduction using an artificial diet assay (Figure 4B). The calculated IC50 for aphid progeny production on an artificial diet was 14 ng/µL, which is comparable to the helveticoside content of E. cheiranthoides leaves (~20 ng/mg wet weight) [35].
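The pairwise comparison described above can be illustrated with a minimal sketch of a Pearson correlation matrix. The trait names and values below are hypothetical and are not data from this study; the normalization and significance testing used for Figure 4A are not reproduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant trait")
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(traits):
    """Return {(trait_a, trait_b): r} for every ordered pair of traits."""
    names = list(traits)
    return {(a, b): pearson_r(traits[a], traits[b]) for a in names for b in names}

# Hypothetical normalized values for five F2 plants (illustration only).
toy_traits = {
    "aphid_progeny": [1.0, 2.0, 3.0, 4.0, 5.0],
    "glucosinolate": [2.0, 4.0, 6.0, 8.0, 10.0],
    "helveticoside": [5.0, 4.0, 3.0, 2.0, 1.0],
}
r = correlation_matrix(toy_traits)
```

In this toy input, progeny production rises with the glucosinolate trait and falls with the helveticoside trait, mirroring only the direction of the correlations reported for the F2 population.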

E. cheiranthoides Genetic Map
The previously published E. cheiranthoides genome (version 1.2, [23]) was constructed using 39.5 Gb of PacBio sequences and a Hi-C proximity-guided assembly in order to orient 98.5% of the genome into eight scaffolds. We used transcriptome data from the F2 population to generate an E. cheiranthoides genetic map with 501 molecular markers (Figures S1 and S2). With this genetic map, we re-scaffolded the assembled contigs for version 2.0 of the genome. A comparison of the marker positions between versions 1.2 and 2.0 highlights several inversions and rearrangements that are corrected in the new genome assembly, primarily on chromosomes 1, 4, 6, 7, and 8 (Figure S3). Version 2.0 of the E. cheiranthoides genome has improved assembly statistics relative to the previously published version 1.2 (Figure S4). In addition, we assembled 93 formerly unassigned contigs into a 154,508 bp chloroplast genome, which is similar to the 154,611 bp chloroplast genome described previously for a different isolate of E. cheiranthoides [39].
In some parts of the genome, the frequency of the molecular markers is distorted from the expected 1:2:1 (Elbtalaue:Heterozygote:Konstanz) ratio for an F2 population (Figure S5). Particularly noteworthy is that Elbtalaue alleles are overrepresented across much of chromosome 3. This segregation distortion could indicate that there is a selective advantage to specific parental alleles under our growth conditions. While conducting this research, we noticed that, relative to Elbtalaue, Konstanz seeds require longer cold stratification in order to achieve full germination. If loci affecting this trait are localized on chromosome 3 and the F2 population seeds were not cold-stratified for long enough prior to planting, this could explain some of the unexpected allele frequencies in the F2 population.
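A deviation from the expected 1:2:1 ratio at a single marker can be checked with a chi-square goodness-of-fit test. The sketch below is an illustration under stated assumptions, not the analysis behind Figure S5: the function names and genotype counts are hypothetical, the critical value is the standard one for two degrees of freedom at α = 0.05, and no correction for testing many markers is applied.

```python
CHI2_CRIT_DF2_P05 = 5.991  # chi-square critical value, 2 degrees of freedom, alpha = 0.05

def chi_square_121(n_elb, n_het, n_kon):
    """Chi-square statistic for deviation of F2 genotype counts from 1:2:1."""
    total = n_elb + n_het + n_kon
    if total == 0:
        raise ValueError("no genotyped plants")
    observed = (n_elb, n_het, n_kon)
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def is_distorted(n_elb, n_het, n_kon, critical=CHI2_CRIT_DF2_P05):
    """True if the counts deviate from 1:2:1 at the chosen critical value."""
    return chi_square_121(n_elb, n_het, n_kon) > critical

# Hypothetical counts for two markers scored in 80 plants (illustration only).
balanced_marker = (20, 40, 20)   # matches 1:2:1 exactly
skewed_marker = (30, 40, 10)     # excess of Elbtalaue homozygotes
```

Applied marker by marker along a chromosome, a run of adjacent markers exceeding the critical value would correspond to the kind of regional distortion described for chromosome 3.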

Genetic Mapping of Defense Traits
Using the newly assembled E. cheiranthoides genetic map (Figure S1) and 83 genotyped Elbtalaue × Konstanz F2 lines, we conducted the quantitative trait locus (QTL) mapping of aphid survival, aphid progeny production, cardenolide abundance, and glucosinolate abundance. No significant QTL affecting aphid survival or progeny production on E. cheiranthoides F2 lines were identified. Significant genetic linkage was observed for only one cardenolide, helveticoside (Figure 5A). The Konstanz allele of a locus on chromosome 8 causes an approximately two-fold increase in helveticoside abundance; this effect is likely recessive because F2 plants that are heterozygous at this locus have helveticoside levels similar to the Elbtalaue parent (Figure 5B). As there are no genes known to be involved specifically in helveticoside biosynthesis, and as the QTL mapping interval encompasses hundreds of genes, it is not yet possible to identify candidate genes affecting helveticoside accumulation.
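The logic of testing one marker for linkage to a trait, and of reading dominance from genotype class means, can be sketched as follows. This is a simplified single-marker regression, not the mapping procedure used in this study (the software and model are not described in this section). The genotype codes ("E", "H", "K"), function names, and toy values are hypothetical. The LOD score is computed as (n/2)·log10(RSS_null/RSS_full), comparing a single overall mean with a separate mean for each genotype class.

```python
import math

def class_means(genotypes, phenotypes):
    """Mean phenotype for each genotype class at one marker."""
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    return {g: sum(v) / len(v) for g, v in groups.items()}

def lod_score(genotypes, phenotypes):
    """LOD score comparing per-genotype means against a single overall mean."""
    n = len(phenotypes)
    if n == 0 or n != len(genotypes):
        raise ValueError("genotypes and phenotypes must be non-empty and equal length")
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    means = class_means(genotypes, phenotypes)
    rss_full = sum((y - means[g]) ** 2 for g, y in zip(genotypes, phenotypes))
    if rss_null == 0:
        return 0.0          # no phenotypic variation to explain
    if rss_full == 0:
        return math.inf     # genotype explains all variation (toy data only)
    return (n / 2) * math.log10(rss_null / rss_full)

# Hypothetical marker: E = Elbtalaue homozygote, H = heterozygote, K = Konstanz homozygote.
toy_genotypes = ["E", "E", "H", "H", "H", "H", "K", "K"]
toy_helveticoside = [1.0, 1.2, 0.9, 1.1, 1.0, 1.2, 2.0, 2.2]

toy_lod = lod_score(toy_genotypes, toy_helveticoside)
toy_means = class_means(toy_genotypes, toy_helveticoside)
```

In the toy data, heterozygotes resemble the "E" class while "K" homozygotes are roughly two-fold higher, which is the pattern expected for a recessive Konstanz allele. In practice, a genome-wide significance threshold for the LOD score is typically derived by permutation, which this sketch omits.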
NMB, the glucosinolate showing the greatest fold-difference between Elbtalaue and Konstanz (Figure 2D), has a significant QTL on chromosome 1, with a recessive Elbtalaue allele causing increased foliar NMB accumulation (Figure 6A,B). Similar glucosinolates with five-carbon side chains derived from isoleucine have been described in Boechera stricta (Drummond's rockcress) [40]. In that species, the relative incorporation of methionine and branched-chain amino acids (valine and isoleucine) into glucosinolate side chains was associated with natural variations in CYP79F enzymes that catalyze the first step of the biosynthesis pathway. The analysis of E. cheiranthoides chromosome 1 in the area of the NMB QTL showed a CYP79F gene (Erche01g017900), with an encoded protein sequence that is similar to those from Arabidopsis, B. stricta, and Brassica oleracea (cabbage) (Figures S6A and S7). Erche01g017900 expression was not significantly different between the Elbtalaue and Konstanz accessions (p > 0.05; Figure S6B).
The predicted Erche01g017900 protein sequences from Elbtalaue and Konstanz differ at only one amino acid (Figure S6A). Whereas Elbtalaue has glycine at position 51, Konstanz has serine. In B. stricta, BsBCMA1 and BsBCMA3, the two CYP79F enzymes associated with branched-chain amino acid incorporation, have serine at this position, and BsBCMA2, which preferentially catalyzes methionine incorporation, has glycine (Figure S6). At two other positions that have been associated with differential glucosinolate production in B. stricta [40], residues 135 and 536, the Konstanz and Elbtalaue proteins are identical and have the same amino acids as those found in BsBCMA1 and BsBCMA3 (Figure S6).
The accumulation of 4MSOB, 3-methylsulfinylpropylglucosinolate (3MSIP), and 3MSOP, which are predicted to be synthesized by a shared biosynthetic pathway [23], is highly correlated (Figure 4A). Although mapping the accumulation of each of these glucosinolates individually did not identify significant QTL at the 95% confidence level (Figure S8), the sum of these three glucosinolates had a significant QTL localized on chromosome 7 (Figure 6C), with the recessive Elbtalaue allele causing lower glucosinolate accumulation (Figure 6D).
We conducted mutual rank coexpression network analysis [41] to determine whether homologs of known Arabidopsis glucosinolate biosynthesis genes are also co-expressed in E. cheiranthoides. This identified a network of co-expressed genes, containing eight genes involved in aliphatic glucosinolate biosynthesis, four genes related to sulfur metabolism, and four additional genes likely encoding metabolic enzymes (Figure 6E and Supplemental Table S1). Several E. cheiranthoides genes encoding aliphatic glucosinolate biosynthetic enzymes have expression-level QTL between 3.0 and 3.4 Mbp on chromosome 8. Known Arabidopsis transcription factors regulating aliphatic glucosinolate biosynthesis include MYB28, MYB29, and MYB76 [42]. However, E. cheiranthoides homologs of these genes are not located in this part of the genome, suggesting that gene expression variation in our F2 population is regulated by some other mechanism that genetically maps to chromosome 7.
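The mutual rank statistic on which such a network is built can be sketched briefly. For two genes A and B, the mutual rank is the geometric mean of the rank of B among the genes most correlated with A and the rank of A among the genes most correlated with B, so lower values indicate stronger reciprocal coexpression. The gene names and correlation values below are hypothetical, and the further steps in [41] that convert mutual ranks into network edges are not reproduced.

```python
import math

def symmetric_corr(pairs):
    """Expand {(a, b): r} into a nested, symmetric {gene: {other: r}} mapping."""
    corr = {}
    for (a, b), r in pairs.items():
        corr.setdefault(a, {})[b] = r
        corr.setdefault(b, {})[a] = r
    return corr

def mutual_rank(corr):
    """Return {(a, b): MR} where MR = sqrt(rank of b for a * rank of a for b)."""
    ranks = {}
    for gene, partners in corr.items():
        # Rank 1 is the partner with the highest correlation to this gene.
        ordered = sorted(partners, key=lambda other: partners[other], reverse=True)
        ranks[gene] = {other: i + 1 for i, other in enumerate(ordered)}
    genes = sorted(corr)
    mr = {}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            mr[(a, b)] = math.sqrt(ranks[a][b] * ranks[b][a])
    return mr

# Hypothetical pairwise expression correlations among four genes (illustration only).
toy_pairs = {
    ("g1", "g2"): 0.9, ("g1", "g3"): 0.5, ("g1", "g4"): 0.1,
    ("g2", "g3"): 0.6, ("g2", "g4"): 0.2, ("g3", "g4"): 0.8,
}
toy_mr = mutual_rank(symmetric_corr(toy_pairs))
```

Because the statistic uses ranks rather than raw correlation coefficients, a gene pair scores well only when each gene is among the other's closest partners, which reduces the influence of genes that are weakly correlated with many others.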
Indol-3-ylmethylglucosinolate (I3M) is hydroxylated to form 4HI3M, and then methylated to form 4MI3M (Figure 7A). Furthermore, 4MI3M is significantly more abundant in the Konstanz parent than in the Elbtalaue parent of the F2 population. To determine whether there is genetic regulation of the relative 4MI3M content, we mapped the ratio of peak areas, (4MI3M)/(4MI3M + 4HI3M), as a quantitative trait (Figure 7B). For both of the detected QTL, the Konstanz allele caused higher relative 4MI3M accumulation (Figure 7C,D), with the Elbtalaue allele on chromosome 2 being recessive and the allele on chromosome 3 being dominant. The two Konstanz alleles had an additive effect on the 4MI3M concentration (Figure 7E). To identify loci that influence indole glucosinolate hydroxylation, we mapped the ratio (4MI3M + 4HI3M)/(I3M + 4MI3M + 4HI3M) as a quantitative trait. This identified a dominant locus from the Konstanz genetic background on chromosome 7, which increased the relative abundance of modified indole glucosinolates (4MI3M + 4HI3M) (Figure 7F,G). The E. cheiranthoides homologs of Arabidopsis enzymes that catalyze I3M 4-hydroxylation [43,44] are encoded on chromosome 2 (Erche02g041710 and Erche02g041680). Therefore, cis-regulation or differences in enzymatic activity are unlikely to be the cause of this variation in the indole glucosinolate profile.
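The two ratio traits defined above can be computed per plant from peak areas as in the following sketch. The function names and example peak areas are hypothetical. Expressing each trait as a fraction of the relevant indole glucosinolate pool is intended to separate the efficiency of a conversion step from variation in total pool size.

```python
def methoxylation_ratio(area_4mi3m, area_4hi3m):
    """(4MI3M) / (4MI3M + 4HI3M): share of 4-substituted pool that is methylated."""
    total = area_4mi3m + area_4hi3m
    if total <= 0:
        return None  # neither compound detected; ratio is undefined
    return area_4mi3m / total

def modification_ratio(area_i3m, area_4mi3m, area_4hi3m):
    """(4MI3M + 4HI3M) / (I3M + 4MI3M + 4HI3M): share of I3M that was modified."""
    modified = area_4mi3m + area_4hi3m
    total = area_i3m + modified
    if total <= 0:
        return None  # no indole glucosinolates detected; ratio is undefined
    return modified / total

# Hypothetical peak areas for one F2 plant (arbitrary units, illustration only).
toy_plant = {"I3M": 60.0, "4MI3M": 30.0, "4HI3M": 10.0}
toy_methoxylation = methoxylation_ratio(toy_plant["4MI3M"], toy_plant["4HI3M"])
toy_modification = modification_ratio(
    toy_plant["I3M"], toy_plant["4MI3M"], toy_plant["4HI3M"]
)
```

Each ratio is then treated as a single quantitative trait value per plant, in the same way as an individual metabolite abundance.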
Arabidopsis has five indole glucosinolate methyltransferase (IGMT) genes. IGMT1-4 (AT1G21100, AT1G21110, AT1G21120, and AT1G21130) are in a tandem-duplicated gene cluster on chromosome 1, and the more distantly related IGMT5 (AT5G53810) is located on chromosome 5 [44,45]. Three predicted E. cheiranthoides IGMT genes (Erche01g022140, Erche01g022141, and Erche01g022144) are in a tandem-duplicated cluster on chromosome 1, and the encoded proteins are highly similar to the Arabidopsis IGMT1-4 (Figures 8A and S9), which catalyze the O-methylation of 4HI3M to make 4MI3M. The most similar methyltransferases from Raphanus sativus (radish) and B. oleracea are shown for comparison in the phylogenetic tree. Consistent with the greater abundance of 4MI3M in Konstanz, two of the three E. cheiranthoides IGMT genes are expressed at a significantly higher level in Konstanz than in Elbtalaue (Figure 8B). In the F2 population, the expression of all three E. cheiranthoides IGMT genes was positively correlated with the relative abundance of 4MI3M (Figure 8C-E). Quantitative trait mapping identified gene expression QTL on chromosome 6 for Erche01g022140, and on chromosome 3 for Erche01g022141 and Erche01g022144 (Figure S10). Chromosome 3 also has a QTL regulating the relative abundance of 4MI3M (Figure 7B), suggesting that IGMT gene expression variation may be the cause of the observed metabolite abundance QTL.

Discussion
By crossing two E. cheiranthoides inbred lines, we generated a segregating F2 population and used this to make a genetic map with 501 molecular markers (Figure S1). For the original E. cheiranthoides genome assembly, sequencing contigs were ordered into scaffolds using a Hi-C proximity ligation method [23]. Although this approach is efficient at placing assembled contigs in the right order on each chromosome, it is less reliable at placing contigs in the correct orientation. Based on the new genetic linkage map, we changed the relative orientations of individual contigs for several of the E. cheiranthoides chromosomes (Figure S3), and we increased the percentage of the overall genome assembly that was anchored to chromosomes (Figure S4). This improved genome assembly not only made it possible to conduct reliable quantitative trait mapping for the current project but will also facilitate future genetic studies with E. cheiranthoides.

With the notable exception of helveticoside, there was significant positive correlation in the abundance of glucosinolates and cardenolides in the F2 plants. Thus, there appears to be no major regulatory tradeoff in the production of these two classes of defensive metabolites in E. cheiranthoides. Among the detected cardenolides and glucosinolates in our assays, only helveticoside was negatively correlated with aphid reproduction on plants in the F2 population (Figure 4A). When added to an artificial diet, the IC50 of purified helveticoside was 14 ng/µL, which is similar to the 20 ng/mg wet weight concentration of this cardenolide in E. cheiranthoides leaves [35]. However, it is not known at what concentration helveticoside is found in the phloem from which the aphids are feeding. We were not able to detect helveticoside in aphid tissue (Figure 3N), suggesting that it is either not localized in the phloem or somehow metabolized after it enters the aphids. However, the presence of helveticoside in aphids feeding on an artificial diet containing this cardenolide [35] suggests that the complete conversion of helveticoside in aphids is less likely. Further research, ideally with mutations that specifically affect the production of helveticoside, will be needed to investigate the function of this metabolite in plant defense. A QTL affecting the abundance of helveticoside, but not other cardenolides (Figure 5), may lead to the eventual identification of biosynthetic or regulatory genes that specifically affect the production of this cardenolide.
Homologs of known genes from Arabidopsis can account for most of the aliphatic glucosinolate biosynthesis pathway in E. cheiranthoides [23]. However, biosynthetic enzymes for glucosinolates that are present in E. cheiranthoides but not in Arabidopsis remain to be discovered. Leucine and isoleucine have both been described as amino acid precursors for glucosinolate biosynthesis [25], and these could account for the structurally uncharacterized NMB glucosinolate, which is significantly more abundant in the Elbtalaue accession (Figure 2D). Cytochrome P450 enzymes in the CYP79F family have been associated with the differential incorporation of methionine or branched-chain amino acids into B. stricta glucosinolates [40], and Erche01g017900, a gene encoding a predicted CYP79F enzyme, is within the confidence interval of an NMB QTL on chromosome 1 (Figure 6A).
The expression levels of the Elbtalaue and Konstanz alleles of Erche01g017900 were not significantly different (Figure S6B), and there is only one amino acid sequence difference between the two accessions, glycine and serine, respectively, at position 51 (Figure S6A). BsBCMA1 and BsBCMA3, two B. stricta enzymes that preferentially catalyze the incorporation of branched-chain amino acids rather than methionine into glucosinolates [40], have a serine in this position, whereas BsBCMA2 has a glycine (Figure S6). Since Konstanz has the serine allele at position 51, differences in Erche01g017900 enzymatic activity may not explain the lower NMB abundance relative to Elbtalaue (Figure 2D).
Biosynthetic enzymes for methylsulfonyl glucosinolates have not yet been identified in any plant species. A family of flavin-dependent monooxygenases catalyzes the formation of methylsulfinyl glucosinolates in Arabidopsis [46,47], and it is possible that similar enzymes catalyze the further oxidation of glucosinolate substrates to produce methylsulfonyl glucosinolates in E. cheiranthoides. Both the genetic mapping (Figure 6C) and analysis of genes with expression patterns that are similar to those encoding other aliphatic glucosinolate biosynthetic enzymes (Figure 6E) may lead to the identification of such enzymes in E. cheiranthoides.
In Arabidopsis, three CYP81F monooxygenases (AT4G37430, AT4G37400, and AT5G57220) and four IGMTs (AT1G21100, AT1G21110, AT1G21120, and AT1G21130) [43,44] catalyze the sequential modification of I3M to form 4HI3M and 4MI3M (Figure 7A). The formation of hydroxylated and methoxylated indole glucosinolates is induced as a defense response, and the presence of multiple enzymes with similar functions may allow for more complex regulation of this process. To accomplish this, the multiple indole glucosinolate modifying enzymes may be subject to differential regulation. Among the two IGMTs that are expressed at a significantly higher level in Konstanz (Figure 8B), the expression of Erche01g022140 is regulated by a QTL on chromosome 6, and the expression of Erche01g022144 is regulated by a QTL on chromosome 3.
A QTL on chromosome 7 (Figure 7F) may be associated with increased I3M hydroxylase activity. However, the E. cheiranthoides homologs of Arabidopsis CYP81F monooxygenases that catalyze I3M 4-hydroxylation [43,44] are encoded on chromosome 2 (Erche02g041710 and Erche02g041680). Therefore, cis-regulation or differences in enzymatic activity are unlikely to be the cause of this variation in the indole glucosinolate profile. Moreover, Erche02g041710 and Erche02g041680 do not have significant expression QTL on chromosome 7, the location of a QTL affecting the (4MI3M + 4HI3M)/(I3M + 4MI3M + 4HI3M) ratio, indicating that this QTL does not affect the expression of Erche02g041710 and Erche02g041680.
Despite the significantly higher aphid reproduction on Konstanz than on Elbtalaue (Figure 1B), the genetic mapping of this trait in an F2 population identified no significant QTL. A likely explanation is that there are multiple loci affecting aphid resistance, none of which have an effect that is large enough to be identified in an F2 population with only 83 genotyped plants. The hypothesis of multiple loci independently causing aphid resistance is also consistent with the observation that aphid survival and reproduction are not significantly different between Elbtalaue and F2 plants (Figure 1A,B). For instance, if multiple R-genes from the Elbtalaue genotype independently cause dominant resistance in the F2 population, the aphid performance on the average F2 plant would be similar to Elbtalaue. R-gene-mediated resistance to aphids has been observed in other plant species, including tomatoes and melons [48,49].
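The arithmetic behind this hypothesis can be made explicit with a short sketch. It assumes, purely for illustration, that the resistance loci are unlinked, segregate in the Mendelian 1:2:1 ratio, are fully dominant, and are each sufficient for resistance on their own; none of these assumptions has been tested here, and the segregation distortion noted for parts of the genome would shift the values. Under these assumptions, a plant is susceptible only if it is homozygous for the Konstanz allele at every locus, which occurs with probability (1/4)^k for k loci.

```python
def fraction_resistant_f2(n_loci):
    """Expected fraction of F2 plants with at least one dominant resistance allele.

    Assumes n_loci unlinked, fully dominant loci with 1:2:1 segregation,
    each sufficient for resistance on its own (illustrative assumptions).
    """
    if n_loci < 0:
        raise ValueError("number of loci must be non-negative")
    return 1.0 - 0.25 ** n_loci

# Expected resistant fraction for one to four hypothetical loci.
resistant_by_loci = {k: fraction_resistant_f2(k) for k in range(1, 5)}
```

With three such loci, more than 98% of F2 plants would be expected to carry at least one resistance allele, so the F2 population average would be nearly indistinguishable from the resistant parent, while no single locus would account for much of the variation among F2 plants.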
Differences in cardenolide abundance do not adequately explain the improved performance of aphids on Konstanz relative to Elbtalaue plants. Although cheirotoxin, erysimoside, erychroside, and glucodigitoxigenin were more abundant in the Elbtalaue accession (Figure 3A-D), the abundance of these cardenolides was not negatively correlated with the aphid performance on F2 plants (Figure 4A). Conversely, although the helveticoside abundance is negatively correlated with aphid reproduction in the F2 population (Figure 4A), there is no significant difference in the abundance of this compound between the two parental lines (Figure 3K). The greater helveticoside variation in the F2 population is due to transgressive segregation, and the lack of helveticoside is unlikely to be the cause of improved aphid growth on Konstanz plants. The performance of M. persicae was also not significantly improved on cyp87a126 mutant E. cheiranthoides plants, which have a complete knockout of cardenolide biosynthesis [15].
Although aphid feeding did not induce overall glucosinolate accumulation, aphids feeding on Konstanz plants for 24 h had elevated levels of 4MI3M in their bodies (Figure 2F) relative to earlier timepoints, suggesting an increased abundance of this compound in the phloem. In Arabidopsis experiments, indole glucosinolate breakdown products were aphid-deterrent [36], and induced 4MI3M accumulation increased aphid resistance [11]. However, 4MI3M abundance was positively correlated with aphid reproduction in the E. cheiranthoides F2 population (Figure 4A). Given that 4MI3M abundance is positively correlated with other E. cheiranthoides metabolites, it is possible that additional defenses mask the predicted negative effects of 4MI3M. It is also possible that other factors in E. cheiranthoides influence the breakdown of 4MI3M and make this compound less toxic in this experimental context, when compared to the aphid consumption of 4MI3M from Arabidopsis.
Previous research with M. persicae feeding on Brassica napus (oilseed rape) showed that glucosinolates are present in the honeydew but not the hemolymph [50], suggesting that these aphids are resistant to glucosinolates largely due to avoidance and excretion, rather than uptake and detoxification. More recent research with M. persicae on Arabidopsis showed that aliphatic glucosinolates are excreted in the honeydew, but indole glucosinolates are activated by the cleavage of the glucose moiety within the aphids, leading to the production of toxic breakdown products [36]. It remains to be determined how well M. persicae can prevent the uptake of cardenolides into the hemolymph, or whether there are differences in the uptake of different cardenolide types.
Together, experiments with our E. cheiranthoides F2 population have resulted in an improved genome assembly as well as new insights into the biosynthesis and defensive functions of glucosinolates and cardenolides. Although aphid reproductive fitness, cardenolide content, and glucosinolate content all vary between the two parental lines of the F2 population, the variation in the abundance of the two classes of defensive metabolites does not adequately explain the observed differences in aphid performance. This indicates that additional, as-yet-unknown mechanisms of aphid resistance exist in E. cheiranthoides. A diverse defensive repertoire likely provides benefits in defense against generalist herbivores, like M. persicae, that are relatively tolerant to both glucosinolates and cardenolides.

Plant and Insect Rearing
Erysimum cheiranthoides accession Elbtalaue, which has a published genome sequence [23], was collected in the Elbe River floodplain (Elbtalaue) in Lenzen, Germany. The Konstanz accession was originally collected in Oggenhausen, Germany, and was propagated at the Konstanz Botanical Garden in Konstanz, Germany. The seed stocks of both E. cheiranthoides accessions are available from the Arabidopsis Biological Resource Center (www.arabidopsis.org (accessed on 25 October 2019); stock numbers CS29250 and CS29251, respectively). We grew all plants in Cornell Mix [by weight 56% peat moss, 35% vermiculite, 4% lime, 4% Osmocote slow-release fertilizer (Scotts, Marysville, OH, USA), and 1% Unimix (Scotts, Marysville, OH, USA)] in 6 × 6 × 6 cm pots in a Conviron (Winnipeg, MB, Canada) growth chamber, with 200 µmol m⁻² s⁻¹ light intensity at 23 °C, with 50% relative humidity and a 16 h/8 h day/night cycle. We conducted all insect assays with a genome-sequenced, tobacco-adapted M. persicae strain [51] that we maintained on Nicotiana tabacum (tobacco), with 150 µmol m⁻² s⁻¹ light intensity at 24/19 °C day/night temperature, with 50% relative humidity and a 16 h/8 h day/night cycle.

Insect Bioassays
For aphid survival and reproduction assays, we placed groups of five fourth-instar M. persicae into clip cages on E. cheiranthoides leaves. After 10 days, we counted the number of surviving adult aphids and nymphs. For the time-series aphid experiment, we used five-week-old E. cheiranthoides plants, with each plant hosting a group of 15 fourth-instar M. persicae aphids enclosed within clip cages. Aphids, along with the leaf areas surrounded by these cages, were collected in separate tubes and promptly frozen in liquid nitrogen after 1, 8, and 24 h. For aphid choice assays, we placed one leaf each of Elbtalaue and Konstanz plants into 15 cm diameter Petri dishes, with their petioles inserted into a piece of moistened filter paper. To determine aphid feeding preferences, we released 10 adult M. persicae at the midpoint between the two leaves and, 24 h later, counted the number of aphids on each leaf. Aphids that were not on either of the two leaves were not included in the data analysis. For the artificial diet assays, we assembled aphid cages with 200 µL artificial diet [52,53], containing helveticoside (Cfm Oskar Tropitzsch GmbH, Marktredwitz, Germany) at concentrations ranging from 0 to 100 ng/µL, between two layers of stretched Parafilm at the top of the cage. We placed 10 adult aphids into each cage, and, after 7 days, we counted the number of surviving aphids and progeny in each cage. The experiment was conducted with four replicates.

Transcriptome Sequencing
We sequenced the transcriptomes of the 83 F2 individuals from a cross between E. cheiranthoides accessions Elbtalaue and Konstanz using the 3′RNAseq method [54]. Additionally, we sequenced RNA from 5 Elbtalaue and 5 Konstanz samples, which served as parental references. RNA was isolated from frozen harvested materials using the SV Total RNA isolation kit with on-column DNA digestion (Promega, Madison, WI, USA). The purity of all RNA samples was confirmed using a NanoDrop2000 instrument (Thermo Scientific, Waltham, MA, USA). The 3′RNAseq libraries were prepared from 6 µg total RNA at the Cornell Genomics facility (Cornell University, Ithaca, NY, USA) [54]. Transcriptome sequencing data were deposited in the Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra (accessed on 17 December 2023)) under accession PRJNA1053801.

Genetic Map Construction and Assembly of E. cheiranthoides Genome v2.0
We performed read mapping and SNP calling by following the Genome Analysis ToolKit (GATK) best practices for RNAseq short variant discovery [55,56]. The 3′RNAseq data from 83 F2 plants, five var. Konstanz plants, and five var. Elbtalaue plants were aligned to unpolished PacBio contigs using STAR version 2.7.1a with default parameters and 2-pass mapping [57]. The resulting bam files were cleaned using the GATK tools MarkDuplicates, AddOrReplaceReadGroups, and SplitNCigarReads. Variants were called with HaplotypeCaller, and joint genotyping was performed using GenotypeGVCFs [58]. The resulting VCF file was filtered using bcftools filter [59] to include only biallelic SNPs with a quality score greater than 30, an alternate allele frequency between 0.3 and 0.7, an excess heterozygosity less than two, and a called genotype in at least half of the samples. The filtered VCF was converted to ABH format using Tassel 5 [60], the markers were binned using SNPbinner [61], and a genetic map was made using MSTmap [62]. During the map construction, one contig was found to be chimeric and was split at the most likely splice point, as determined by a visual analysis of aligned PacBio reads. The resulting genetic map was reconciled with the Hi-C proximity guided assembly [23] using a custom Python script (https://github.com/gordonyounkin/Erysimum_F2_aphids (accessed on 5 January 2024)) that prioritized placement and orientation of contigs in the genetic map. The final fasta assembly containing pseudomolecules and contigs was constructed using CombineFasta (https://github.com/njdbickhart/CombineFasta (accessed on 1 December 2023)). Illumina reads were aligned to the new genome using Burrows-Wheeler Aligner version 0.7.8 [63], and the assembly was polished with three rounds of Pilon version 1.23 [64]. The chloroplast genome was assembled from PacBio reads using Organelle_PBA [65,66]. Plots were generated in R [67] using R/qtl [68].
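The marker-filtering criteria described above can be sketched as a single predicate. This is a minimal illustration in plain Python, not the bcftools command that was run. The `Site` record and its field names are hypothetical, and whether each threshold is inclusive is an assumption, since the text does not say. The 0.3 to 0.7 window presumably reflects the expected alternate allele frequency of about 0.5 for a marker segregating in an F2 population.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """Hypothetical, simplified view of one VCF record."""
    ref: str
    alts: List[str]
    qual: float
    alt_freq: float                  # alternate allele frequency across samples
    excess_het: float                # excess-heterozygosity score for the site
    genotypes: List[Optional[str]]   # None marks a missing genotype call


def passes_filters(site: Site) -> bool:
    """Apply the five marker filters listed in the text."""
    # Biallelic SNP: exactly one alternate allele, both alleles one base long.
    is_biallelic_snp = (
        len(site.alts) == 1 and len(site.ref) == 1 and len(site.alts[0]) == 1
    )
    called = sum(g is not None for g in site.genotypes)
    call_rate = called / len(site.genotypes) if site.genotypes else 0.0
    return (
        is_biallelic_snp
        and site.qual > 30                 # quality score greater than 30
        and 0.3 <= site.alt_freq <= 0.7    # bounds assumed inclusive
        and site.excess_het < 2            # excess heterozygosity less than two
        and call_rate >= 0.5               # genotype called in at least half
    )


if __name__ == "__main__":
    example = Site("A", ["G"], 50.0, 0.5, 0.1, ["0/1", "1/1", None, "0/0"])
    print(passes_filters(example))
```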

Genome Annotation
Gene annotations were transferred from version 1.2 to version 2.0 of the E. cheiranthoides genome using GMAP [69]. Annotations were improved by aligning full-length E. cheiranthoides RNA sequencing reads (NCBI: PRJNA563696) to the new genome assembly with hisat2 [70], sorting aligned reads with samtools [71], and assembling and merging transcripts with StringTie [72]. In cases where there was not a 1:1 relationship between StringTie transcripts and the original gene annotations, a new name was assigned to each transcript, and redundant gene models were filtered using gffcompare [73]. Transcripts, coding sequences, and protein sequences were predicted using gffread [73], and untranslated regions were annotated using the add_utrs_to_gff.py script, publicly available from NCBI [74].

Data Analysis
ANOVA and t-tests were conducted using JMP Pro 16 (JMP, Cary, NC, USA). We calculated the IC50 (cardenolide concentration to reduce progeny production by 50%) using the Solver function in Excel to fit a curve of the form Y = 1/(1 + exp[B − G·ln(X)]), where X is the cardenolide concentration, Y is the fractional reduction in progeny production, and B and G are parameters that are varied for the optimal fit of the curve to the data points (minimizing the residuals). We conducted QTL mapping using Windows QTL Cartographer [78]. Sequences were aligned using Clustal Omega [79]. Neighbor joining trees were constructed using default parameters in MEGA11 [80]. For the Pearson correlations of metabolite and aphid resistance data, the data were transformed to normality using a two-step process in SPSS (IBM, Armonk, NY, USA), as described previously [81]. Raw data underlying all manuscript figures are included in Tables S2-S11.
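For the curve Y = 1/(1 + exp[B − G·ln(X)]), setting Y = 0.5 gives B = G·ln(X), so IC50 = exp(B/G). The sketch below illustrates this relationship with synthetic numbers. It takes Y to be the fractional reduction in progeny production, an interpretation that follows from the IC50 definition. It also swaps in a different fitting technique: instead of minimizing residuals with Excel Solver, it linearizes the model as ln(Y/(1 − Y)) = G·ln(X) − B and applies ordinary least squares. The two approaches weight the data differently and generally give somewhat different estimates on noisy data. Function names and the example values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def response(x: float, b: float, g: float) -> float:
    """Model from the text: Y = 1 / (1 + exp(B - G * ln(X)))."""
    return 1.0 / (1.0 + math.exp(b - g * math.log(x)))


def fit_logit_line(concs: Sequence[float],
                   fractions: Sequence[float]) -> Tuple[float, float]:
    """Estimate (B, G) by least squares on the linearized model
    ln(Y / (1 - Y)) = G * ln(X) - B.

    This is a stand-in for the Solver-based fit, not a reproduction of it.
    """
    if any(x <= 0 for x in concs) or any(not 0 < y < 1 for y in fractions):
        raise ValueError("requires X > 0 and 0 < Y < 1")
    xs = [math.log(x) for x in concs]
    ys = [math.log(y / (1.0 - y)) for y in fractions]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    g = sxy / sxx                 # slope of the logit line is G
    b = g * mean_x - mean_y       # intercept of the logit line is -B
    return b, g


def ic50(b: float, g: float) -> float:
    """Concentration at which Y = 0.5, i.e., exp(B / G)."""
    return math.exp(b / g)


if __name__ == "__main__":
    # Synthetic, noise-free data generated from made-up parameters.
    concs = [1.0, 5.0, 10.0, 50.0, 100.0]
    fracs = [response(x, 3.0, 1.2) for x in concs]
    b_hat, g_hat = fit_logit_line(concs, fracs)
    print(b_hat, g_hat, ic50(b_hat, g_hat))
```

The 0 ng/µL control cannot enter the fit directly because ln(0) is undefined. In this formulation it serves only as the baseline against which the fractional reduction is computed.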

Supplementary Materials: Figure S1: Distribution of 501 genetic markers in the Konstanz × Elbtalaue genetic map across 8 chromosomes; Figure S2: Pairwise recombination fractions and LOD scores indicating probability of genetic linkage for 501 genetic markers; Figure S3: Position of genetic markers in E. cheiranthoides genome versions 1.2 and 2.0; Figure S4: Comparison of assembly statistics for E. cheiranthoides genome versions 1.2 and 2.0; Figure S5: Allele frequencies across the F2 mapping population; Figure S6: Alignment of CYP79F sequences; Figure S7: Neighbor joining tree of CYP79F protein sequences; Figure S8: QTL affecting oxygenated sulfur glucosinolate abundance; Figure S9: Alignment of indole glucosinolate methyltransferase protein sequences; Figure S10: QTL affecting indole glucosinolate methyltransferase gene expression; Tables S2-S11: Raw data underlying manuscript figures.

Author Contributions: Conceptualization, G.J. and M.M.; Methodology, M.M., G.C.Y. and G.J.; Software, G.C.Y. and A.F.P.; Validation, M.M. and G.C.Y.; Formal Analysis, M.M., G.C.Y., A.F.P., M.L.A. and G.J.; Investigation, M.M., G.C.Y., A.F.P. and M.L.A.; Resources, G.J.; Data Curation, G.C.Y.; Writing-Original Draft Preparation, G.J.; Writing-Review and Editing, M.M., G.C.Y. and G.J.; Visualization, M.M., G.C.Y. and G.J.; Supervision, S.R.S. and G.J.; Project Administration, G.J.; Funding Acquisition, G.J., G.C.Y. and M.L.A. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by US National Science Foundation award 1645256, United States Department of Agriculture-National Institute of Food and Agriculture award 2020-67013-30896, and a Triad Foundation grant to GJ; a Summer Undergraduate Research Fellowship from the American Society of Plant Biologists and a Rawlings Cornell Presidential Research Scholar award to MLA; and a Cornell Chemistry Biology Interface Training Program (National Institutes of Health/National Institute of General Medical Sciences award T32GM138826) fellowship and a US National Science Foundation Graduate Research Fellowship under Grant No. DGE-2139899 to GCY.

Data Availability Statement: Version 2.0 of the E. cheiranthoides genome is available from GenBank (accession number PRJNA563696), and an annotated version of the genome is available at www.erysimum.org (accessed on 5 January 2024). Transcriptome sequencing data generated through this