Lipidomics-Based Comparison of Molecular Compositions of Green, Yellow, and Red Bell Peppers

Identifying and annotating the molecular composition of individual foods will improve scientific understanding of how foods impact human health and how much variation exists in the molecular composition of foods of the same species. The complexity of this task includes distinct varieties and variations in naturally occurring pigments of foods. Lipidomics, a sub-field of metabolomics, has emerged as an effective tool to help decipher the molecular composition of foods. For this proof-of-principle research, we determined the lipidomic profiles of green, yellow, and red bell peppers (Capsicum annuum) using liquid chromatography mass spectrometry and a novel tool for automated annotation of compounds following database searches. Among 23 samples analyzed from 6 peppers (2 green, 1 yellow, and 3 red), over 8000 lipid compounds were detected, with 315 compounds (106 annotated) found in all three colors. Assessments of relationships between these compounds and pepper color, using linear mixed effects regression and false discovery rate (<0.05) statistical adjustment, revealed 11 compounds differing by color. The compound most strongly associated with color was the carotenoid β-cryptoxanthin (p-value = 7.4 × 10⁻⁵; FDR adjusted p-value = 0.0080). These results support lipidomics as a viable analytical technique to identify molecular compounds that can be used for unique characterization of foods.


Introduction
Nutritional research has established that certain foods and food compounds influence human health and disease; however, the biochemical underpinnings of the reported health effects require further investigation. Additionally, a myriad of food-related compounds remain uncharacterized. Together, these gaps result in the delivery of incomplete, ambiguous, and confusing nutrition messages to the general public. The outcomes of food science research are an important element contributing to the likelihood that individuals choose to consume appropriate health-promoting foods and adhere to a healthy diet [1].
PCA of the 8174 compounds detected in at least two of the 23 pepper samples from six peppers revealed that peppers cluster by color, with red and yellow peppers grouping more closely than either green and red or green and yellow peppers (Figure 1A). HC of compounds from peppers of different colors revealed regions of distinct variation, as well as similarities (Figure 1B). For instance, the region to the far left of the HC graphic reveals compounds that are highly abundant in all three pepper colors (solid black box). Variation in compound abundance between different colored peppers is also demonstrated (dashed black box). Interestingly, the compounds within the dashed box were also found to be driving some of the separation in PC1 of the green peppers from yellow and red peppers in the PCA when the PCA loadings plot was examined (data not shown).

Untargeted Lipidomics Reveals Several Compounds That Differ between Green, Red, and Yellow Peppers
Briefly, 8174 compounds were detected in at least 2 pepper samples of the same color (often from the same individual pepper), with 1813 compounds unique to red peppers compared to 1229 and 324 unique to green and yellow peppers, respectively. This suggests that pepper colors may have distinct chemistry (Figure 1C). While the six peppers differed in characteristics such as color, organic status, harvest location, and time of year, due to sample size and study design limitations, statistical analysis could only be used to evaluate differences due to color. A total of 111 compounds were found to be nominally associated with pepper color (p-value < 0.05) (Table S2). These included a total of 39 annotated compounds: 26 in the class of glycerolipids and glycerophospholipids and 13 in miscellaneous classes (Table 1). A total of 11 annotated compounds were found to differ by color with false discovery rate (FDR) adjusted p < 0.05 [23] (Table 1, above bold line). These included one monoglycerolipid, two diglycerides, two triglycerides, five phospholipids, and one carotenoid (Table 1). The compound most strongly associated with color was β-cryptoxanthin (p-value = 7.4 × 10⁻⁵; FDR adjusted p-value = 0.0080). Other annotated compounds, such as sucrose acetate isobutyrate (SAIB), 2-ethenyl-2,4b,8,8-tetramethyl-tetradecahydrophenanthrene-3,5,10a-triol, fargesin, and ascorbyl linoleate/L-ascorbyl linoleate (LAA), were found to differ nominally by color but did not pass FDR adjustment (Table 1, below bold line).
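The distinction between nominal significance and FDR-adjusted significance can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure (the text cites [23] without naming the method in this section), and the p-values below are invented for illustration; they are not taken from Table 1 or Table S2.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this is the FDR method behind [23]; the paper does not
    name it in this section.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p-values for five compounds (illustration only).
nominal = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(nominal)
passes_fdr = [q < 0.05 for q in adjusted]
```

In this toy input, four compounds are nominally significant (p < 0.05) but only the first two remain below 0.05 after adjustment, which mirrors how a compound can differ nominally by color without passing FDR.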
Figure 1. (A) PCA performed in Mass Profiler Professional Version 14.1 (MPP) using data from all bell pepper samples. Component 1, which explains 15% of the variation, is shown on the x-axis; component 2, which explains 11.2% of the variation, is shown on the y-axis; and component 3, which explains 7.3% of the variation, is shown on the z-axis. (B) Hierarchical clustering of data from 23 replicates from six individual bell peppers. The x-axis corresponds to individual compounds detected in the peppers, which are grouped by color and listed on the y-axis. Blue lines indicate lower relative abundance of a compound compared to the other 8174 compounds, while red lines indicate higher relative abundance compared to the other 8174 compounds. The vertical distance between compounds provides a rough estimation of their similarity. (C) Venn diagram illustrating the overlap of compounds by pepper color: 2623 compounds were detected in all colors of pepper samples (center section); 1229 compounds were detected in green peppers (green circle) but not red or yellow; 200 compounds were detected in both green and yellow; and 1076 compounds were detected in both green and red peppers. Within the yellow peppers (yellow circle), 324 compounds were detected in yellow but not red or green, and 810 compounds were detected in both yellow and red bell peppers. Red peppers (red circle) contained 1813 compounds that were not detected in green or yellow bell peppers.
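The Venn regions described in panel (C) correspond to simple set operations on the compounds detected per color. The following is a minimal sketch with made-up compound identifiers (c1 to c8); the function name and data are hypothetical and only illustrate how exclusive and shared regions are counted.

```python
def venn_regions(green, yellow, red):
    """Split three sets of compound IDs into the seven Venn regions."""
    return {
        "all_three": green & yellow & red,
        "green_only": green - yellow - red,
        "yellow_only": yellow - green - red,
        "red_only": red - green - yellow,
        "green_yellow": (green & yellow) - red,
        "green_red": (green & red) - yellow,
        "yellow_red": (yellow & red) - green,
    }


# Hypothetical detections per color (illustration only).
green = {"c1", "c2", "c3", "c5"}
yellow = {"c1", "c3", "c4", "c6"}
red = {"c1", "c2", "c4", "c7", "c8"}

regions = venn_regions(green, yellow, red)
counts = {name: len(members) for name, members in regions.items()}
```

Because the seven regions are mutually exclusive, their counts sum to the number of distinct compounds across all three colors.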

Pairwise Comparisons of Green, Yellow, and Red Peppers
Based on these results from the untargeted analysis, the 11 compounds passing an FDR of 0.05 were further evaluated using pairwise comparisons (i.e., t-tests). Because individual compounds were being evaluated rather than an entire dataset, non-phenotypic factors such as organic vs. non-organic and location were again considered in the analyses in addition to color. For the 11 compounds significant after FDR adjustment, the highest number of differences occurred when green and red peppers were compared (n = 9 compounds, Table 1, gray cells above bold line), followed by green vs. yellow (n = 4, Table 1, gray cells above bold line) (Table 1 and expanded version in Table S2). Only β-cryptoxanthin differed significantly between yellow and red peppers. Subsequent pairwise comparisons showed that β-cryptoxanthin levels were 12.1-fold higher in red peppers compared to green peppers and 8.2-fold higher than yellow peppers (p = 7.42 × 10⁻⁵, Figure 2A).
Figure 2. Relative abundance of β-cryptoxanthin detected in bell pepper samples by color. Following analysis of pepper samples using untargeted LC/MS, β-cryptoxanthin levels were compared according to their relative abundance across samples. (A) Red bell pepper has significantly higher levels compared to green and yellow. *** p < 0.001; (B) To determine whether cooking had any impact, each pepper was divided, three samples were heated for 5 min, and two samples remained raw. There was no statistically significant difference between the cooked peppers compared to raw overall or within either pepper color; (C) There was a nominal statistical difference between one non-organic, Canadian red pepper compared to two organic, Mexican red peppers; (D) There was no statistical difference between the non-organic, Mexico grown green pepper vs. the organic US grown green pepper.

In addition to several compounds found to be significantly different between peppers following FDR adjustment, a number of compounds reached significance but did not pass FDR. For example, the tentatively annotated SAIB compound was nominally different between green and red peppers (p-value = 0.0065 and FDR = 0.056). The putatively identified compound ivermectin B1b was found to be nominally significantly different between green and yellow peppers (p-value = 0.0488), while all-trans-retinyl oleate was close but not below the nominally significant threshold for red vs. yellow peppers (p-value = 0.0523).

Evaluation of Pepper Attributes for Beta-Cryptoxanthin
Further analysis of the top result, β-cryptoxanthin, was done since it is a "true positive," or compound known to be associated with bell peppers. The other 10 of the top 11 are ambiguous compounds not specifically related to food; hence, additional analyses were not conducted on them in order to avoid overinterpretation of the data. The additional analysis showed that there was no statistically significant difference in β-cryptoxanthin levels between cooked and raw yellow and red peppers (p-value = 0.794) (Figure 2B) or between green and red peppers grown organically compared to conventionally (p-value = 0.978). Upon further inspection of the red and green peppers separately, the non-organic red pepper from Canada (three replicates) had significantly lower β-cryptoxanthin levels compared to the two organic red peppers from Mexico (three and five replicates) (p-value = 0.014) (Figure 2C). The non-organic green pepper from Mexico (three replicates) did not have significantly different (p-value = 0.544) levels of β-cryptoxanthin compared to the organic pepper from the United States (four replicates) (Figure 2D). Since these results are based on a small sample of peppers, it is not possible to discern whether the differences seen by organic status in red peppers are due to harvest location rather than organic vs. non-organic production. There continued to be no significant difference by cooked vs. raw when assessing within yellow and red peppers (p-value = 0.913 and 0.660, respectively). Given the extremely small sample sizes, all of these results should be interpreted with caution and warrant further study.

Discussion
In the current study, we assessed six bell peppers with multiple characteristics such as color, organic status, harvest location, and purchase time of year. Our methods detected over 8000 compounds in peppers, with 315 compounds found in 100% of the 23 samples from the six individual peppers analyzed. The application of an untargeted lipidomics approach to characterize a large number of compounds from a variety of bell peppers is a novelty of this research. However, due to limited sample size, color was the only characteristic that could be statistically analyzed with confidence, revealing 11 compounds that reached significance and passed FDR. Interestingly, the top 11 revealed one compound, β-cryptoxanthin, as a "true positive" or previously known compound to be associated with bell peppers. The remainder of the compounds in the top 11 are commonly detected and/or ubiquitous.
Many of the compounds detected in the current study have yet to be characterized as being specifically related to bell peppers, and a myriad were not present in any existing metabolomics database at the time of this study. For example, of the 315 compounds detected in all samples, only 106 were annotated from databases accessed in August 2020. While 111 compounds were found to be nominally statistically different across pepper color, only 39 were annotated and taken forward for further analysis (Table S2). Of these, only 11 annotated compounds, including β-cryptoxanthin, passed FDR adjustment; all were identified from the databases used (see Supplemental Methods for more details). Further analysis was performed on β-cryptoxanthin because it is known to be found in bell peppers and hence biological interpretation would be more reliable. β-cryptoxanthin is the only named compound that reached p-value significance and passed FDR; while the other named compounds have less specificity to bell peppers, our analysis nonetheless showed that they reached p-value significance.
Subsequent to the global statistical approach, a more focused, pairwise analysis was conducted on β-cryptoxanthin. Red peppers had significantly more β-cryptoxanthin compared to green or yellow peppers, which supports previous studies demonstrating concentration differences of this compound as a result of fruit maturity or ripening that results in bell pepper color changes [11,16]. The finding that β-cryptoxanthin differed significantly between the samples demonstrates the feasibility of using an untargeted approach to identify candidate molecules that can then be interrogated more deeply. Caution is required when conducting focused analyses on candidates discovered using a small sample size; in the current case, β-cryptoxanthin was previously known to be related to pepper characteristics and hence represented a "true positive," thereby lending confidence to the results. While a number of compounds have been identified as differing across pepper colors [15,16], to our knowledge, an untargeted lipidomics strategy that focuses on lipid-based or pigment-related molecules, such as carotenoids, has not yet been reported. Our metabolomics-based approach is a "hypothesis generating" strategy that can be performed using similar techniques on other foods to identify compounds of interest.
β-cryptoxanthin is a pigment compound in the carotenoid family responsible for the orange/red color of many fruits and vegetables [24,25]. Carotenoids are categorized by the presence or absence of oxygen on the terminal ends of the molecule and are well known for their antioxidant function with variable antioxidant capacity [24]. Analytical data on the carotenoids of pepper are divergent, and there are difficulties in separating, identifying conclusively, and quantifying the major carotenoids. The action of carotenoids in human health is intimately related to their structure, and the effects differ among carotenoids; thus, conclusive identification and accurate quantification of individual carotenoids are necessary. For example, carotene, the primary carotenoid that contains no oxygen, has several isomers, such as β-carotene, α-carotene, and γ-carotene [24]. β-Cryptoxanthin is an oxygenated carotenoid, termed a xanthophyll, with a chemical structure similar to, but more polar or hydrophilic than, β-carotene [25]. Though β-carotene is present in more fruits and vegetables than β-cryptoxanthin, β-cryptoxanthin is found at high concentrations in certain foods, has greater bioavailability, and has nearly equivalent antioxidant capacity when compared to β-carotene [25]. β-cryptoxanthin has been demonstrated to have antiproliferative effects in lung cancer, may have hepato-protective effects, and may improve insulin resistance [26–28].
Previous studies have investigated different cooking methods and their impact on β-cryptoxanthin levels in foods, which have included various microwave cooking times and various boiling times [29], as well as in pepper samples that were frozen and boiled compared to raw [30]. Cooking either by boiling or microwave heating was reported to result in an approximately 20% loss in total carotenoid content [31]. Total carotenoid content of red bell pepper has been reported to be significantly reduced by boiling, steaming, and roasting, but not stir-frying, as compared to raw [32]. The current study used a very brief cook time to avoid burning. Oils were not used in the cooking process to avoid contributions of oils to the compound analysis and limit our cook time. While these previous studies demonstrated significant losses of β-cryptoxanthin as cook time was increased [29,30], our study did not reveal significant differences in either the yellow or red pepper β-cryptoxanthin levels compared to the matched raw pepper. Our data suggest that peppers may be capable of withstanding short cook times without significant losses of β-cryptoxanthin, although more research is needed with a larger sample size to confirm these findings.
Differences in carotenoid content have been reported as related to growing conditions, such as higher levels of β-cryptoxanthin in peppers grown in a glasshouse compared to those grown outdoors [33]. Organic compared to conventionally grown counterparts have been evaluated in a variety of plant foods, including bell peppers, showing that fresh vegetables from organic production are usually richer in biologically active compounds when compared to the conventional ones [34]. A study determined that pickled red bell peppers revealed significantly higher content of total carotenoids, and specifically β-cryptoxanthin, in organically grown peppers [34]. The same laboratory compared organic vs. conventional white cabbage and yet determined higher carotenoid content in the conventionally grown samples compared to organic [35]. In the current study, pairwise analysis of β-cryptoxanthin in red and green peppers showed the non-organic red pepper from Canada had significantly lower β-cryptoxanthin levels compared to organic red peppers from Mexico; however, it was not possible to definitively determine organic status as a single factor that contributed to differences in the carotenoid content between those peppers. Since our results are based on a small sample of peppers, more samples with less variation in season, location, and soil conditions would be needed to further evaluate the organic status for comparison between the peppers.
As described in the Supplemental Methods, MS2 level annotations were obtained for only a small number of compounds (Supplemental Methods, Table S1); therefore, with the exception of β-cryptoxanthin, annotations should be considered putative with an MSI level 2 or 3 [18]. Several identified compounds, such as 2-ethenyl-2,4b,8,8-tetramethyl-tetradecahydrophenanthrene-3,5,10a-triol, all-trans-retinyl oleate, and archaetidylglycerol-myo-inositol, lack characterization as they relate to any food, including bell peppers. Of the remaining compounds that were significantly different by pepper color (p-value < 0.05), several have well-described associations with foods and health benefits, such as ascorbyl linoleate/L-ascorbyl linoleate (LAA), the most biologically active and well-studied form of vitamin C [36]. Vitamin C content is higher in bell peppers compared to several other vegetables and fruits commonly recognized as vitamin C sources [10]. Vitamin C is an important antioxidant compound that supports collagen production, prevents common degenerative conditions like cataracts, and aids immune system functioning [11]. Another compound, identified as fargesin, is a bioactive neolignan isolated from magnolia plants, with antihypertensive and anti-inflammatory effects; however, it has not been previously identified in bell peppers [37–39].
Another compound was tentatively annotated as glycidyl oleate, which is a carboxylic ester and an epoxide. This compound has been previously described as a possible carcinogen, which may be the result of heat, food processing, or contamination [40]. In the present study, bell peppers were minimally cooked to avoid molecular changes as a result of heat, though there have been no studies investigating glycidyl oleate composition as it relates to bell peppers and cooking. This study identified goyaglycoside g as a compound in bell peppers; however, it has only been previously isolated from the fresh fruit of Japanese Momordica charantia L. (Cucurbitaceae), which has been used in several Asian cultures as a stomachic, laxative, or anthelmintic [41]. Some nutraceuticals and functional food peptides act as angiotensin-converting enzyme (ACE) inhibitors and belong to a group called "bioactive substances" [42]. Various plants are natural sources of ACE inhibitors, such as soybean, mung bean, sunflower, rice, corn, wheat, buckwheat, broccoli, mushroom, garlic, spinach, and grapes [43]. Bell peppers may potentially be a source of ACE inhibitors as well, though this has not yet been described in the literature.
Additionally, our study revealed some compounds that may be related to undescribed agricultural practices. For example, ivermectin B1b is a compound with antiparasitic activity that is the minor component (<20%) of the anthelmintic ivermectin, which is mainly composed of ivermectin B1a (>80%) [44]. Navratilova et al. have demonstrated that ivermectin used to prevent and treat parasitic diseases in livestock has the potential to enter soybean roots and leaves, but not the beans, through the use of manure as fertilizer [45]. The compound SAIB was identified in bell peppers; however, it is most commonly known for its use as an alternative food emulsifier in nonalcoholic carbonated and non-carbonated beverages [46]. Though compounds such as ivermectin and SAIB have not been previously shown to be associated with bell peppers, the findings in this study highlight gaps in our understanding of agriculture and food preservation practices.
This study has some limitations, such as the inherent limitations of lipidomics in terms of the resulting purity and integrity of the samples. As described in Supplemental Methods, we were unable to distinguish between the carotenoids β-cryptoxanthin and α-cryptoxanthin using authentic standards analyzed by tandem MS or ion mobility MS. While these isomers can be resolved using different chromatography methods [47], repeating the study under different conditions was not feasible; however, α-cryptoxanthin is rarely found in plants [48]. Moreover, our stringent methods may have filtered out other carotenoids that were not abundant or not present in all samples and were therefore not included in the final 315 compounds utilized in the analysis. The lipidomics approaches isolated a large number of compounds; however, most could not be annotated due to the narrow coverage of plant compounds in many metabolomics databases, which limited the number of compounds available for a comprehensive analysis. Furthermore, while lipidomics is appropriate for understanding how lipid species vary between samples, it does not capture differences in non-lipid species and consequently omits the contributions of aqueous compounds. Due to study design and sample size limitations, additional analyses with complementary and comparative methods will be needed to assess the robustness of these results as well as to allow more in-depth analysis of various food characteristics, such as location and harvest time.
This study supports previous studies showing that red peppers have significantly more β-cryptoxanthin compared to green or yellow peppers. Additional studies are required to further test the relationship of the compounds as they relate to bell pepper color and other variable qualities such as cultivation, harvest, processing, storage, and cooking. Lipidomics can help close this knowledge gap by deciphering whether the molecular composition of a food is altered based on these and other factors. The compounds related to the distinctive characteristics of an individual food may suggest unique health benefits. Thus, determining whether the health impact of foods is altered by the variability in their characteristics may be a valuable target for future nutritional research.
Overall, this research serves as a proof-of-principle that further characterization of bell peppers can be determined by applying untargeted metabolomics to identify compounds associated with specific food qualities, such as pigmentation. This research also demonstrates the utility of starting with an untargeted approach to identify potentially interesting molecules, followed by a targeted approach to look more specifically at single compounds of interest, such as β-cryptoxanthin. Hence, our metabolomics-based "hypothesis generating" strategy can be performed using similar techniques on other foods to identify compounds of interest.

Bell Pepper Sample Preparation
Green, yellow, and red bell peppers were purchased locally in Denver, Colorado, U.S.A., and represented different farming practices, locations, and purchase times (Table 2, n = 1 each with technical replicates). Green and red peppers that were conventionally grown in Mexico and an organic variety grown in the United States, organic red peppers grown in Mexico, and conventionally grown peppers from Canada were purchased from the same grocery store. Four months later, one organic, Mexico grown red pepper and one yellow, organic pepper, with no label indicating company name or country of origin, were purchased together from a different location than the peppers purchased previously. All peppers were chopped into similarly sized pieces and an equal aliquot was stored in 15 mL Eppendorf tubes. The red and yellow peppers purchased together in March 2019 were each split into two groups: (1) raw and (2) cooked separately over medium heat in a multicooker with no oil or additives for 5 min while continually stirred to avoid charring. All samples were stored at −80 °C until lyophilization. All pepper samples were lyophilized for 72 h at −40 °C. Dried bell pepper samples were suspended in ice cold 100% methanol at 100 mg/mL and bead homogenized for two cycles of 5 min at 50 Hz on a Qiagen TissueLyser LT (Germantown, MD, USA).
For this untargeted lipidomics approach, a modified liquid-liquid extraction method using MTBE was used to separate hydrophobic and hydrophilic fractions of each sample [49,50]. Briefly, 100 µL pepper homogenates (10 mg lyophilized equivalent) were spiked with 10 µL Avanti Polar Lipids' SPLASH Lipidomix prior to extraction. A three-to-one mixture of MTBE-to-methanol was added to each tube and then vortexed. Then, a three-to-one mixture of water-to-methanol was added and vortexed. The samples were centrifuged at 4 °C for 15 min, after which the lipid and aqueous layers were removed separately and dried under nitrogen. Lipids were reconstituted in 100 µL of 100% methanol; aqueous fractions were reconstituted in 50 µL of 95% water and 5% acetonitrile, as previously described [49]. Only the lipid fractions were used for the purposes of the current study. The hydrophilic fraction was stored at −80 °C for future analysis and publication.

Mass Spectrometry (MS)
Samples were analyzed using an Agilent 6545 Time-of-Flight (TOF-MS) with dual Agilent JetStream electrospray ionization (AJS-ESI) source in positive ionization mode. Specific parameters are as follows: scan rate of 2 spectra/s, mass range of 75-1700 m/z, drying gas temperature 300 °C and flow rate of 12.0 L/min, nebulizer pressure 35 psi, sheath gas temperature 275 °C, sheath gas flow 12 L/min, skimmer 65 V, capillary voltage 3600 V, fragmentor 100 V, and reference masses 121.050873 and 922.009798 (Reference mix, Agilent Technologies, Santa Clara, CA, USA).

Data Processing
Raw LC/MS data were extracted using Agilent Technologies MassHunter Profinder Version B.08 (Profinder) software and analyzed using Agilent Technologies Mass Profiler Professional Version 14.1 (MPP), as previously described [9]. Untargeted and recursive feature extraction was applied to compound data from each sample, using abundance profiles in m/z and retention time (RT) dimensions. Lipid positive mode samples were extracted using Batch Molecular Feature Extraction (BMFE) in Profinder with the following parameters: retention time extraction range of 0-10.4 min with noise peak height filter ≥ 3000 counts; ion species: +H, +Na, +K, +NH4; and charge state maximum of two. Alignment tolerance for RT was 0% + 0.3 min, with mass tolerance of 20 ppm + 2 mDa (millidalton). BMFE parameters were height ≥ 15 counts and a score of ≥ 50. Compounds were then imported into MPP and filtered based on presence in at least two samples. Any compounds detected in the spiked blank samples were removed. The remaining list of compounds was then imported into Profinder for searching with Batch Targeted Feature Extraction (BTFE) using the following parameters: height ≥ 10,000 for EIC peak integration with post-processing filters using Abs height ≥ 10,000 counts and score ≥ 50.
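The alignment tolerances and the subsequent filtering step can be illustrated with a short sketch. It assumes that a tolerance written as "a% + b" or "a ppm + b mDa" is the sum of a relative and an absolute term, and it uses hypothetical features and compound identifiers; it is not a description of how Profinder or MPP implement these steps internally.

```python
def mass_window_da(mz, ppm=20.0, mda=2.0):
    """Mass tolerance in Da: relative ppm term plus absolute mDa term."""
    return mz * ppm / 1e6 + mda / 1000.0


def rt_window_min(rt, percent=0.0, minutes=0.3):
    """Retention time tolerance in min: relative % term plus absolute term."""
    return rt * percent / 100.0 + minutes


def same_feature(a, b):
    """Decide whether two (m/z, RT) pairs fall within both tolerances."""
    mz_ok = abs(a[0] - b[0]) <= mass_window_da(a[0])
    rt_ok = abs(a[1] - b[1]) <= rt_window_min(a[1])
    return mz_ok and rt_ok


def filter_compounds(detections, blank_compounds, min_samples=2):
    """Keep compounds seen in >= min_samples samples and absent from blanks.

    detections maps a compound ID to the set of samples it was found in.
    """
    return sorted(
        compound
        for compound, samples in detections.items()
        if len(samples) >= min_samples and compound not in blank_compounds
    )


# Hypothetical features as (m/z, RT in min).
aligned = same_feature((553.4404, 9.10), (553.4480, 9.25))
not_aligned = same_feature((553.4404, 9.10), (553.4600, 9.10))

# Hypothetical detections; cmpd_C also appears in a spiked blank.
detections = {
    "cmpd_A": {"red_1", "red_2"},
    "cmpd_B": {"green_1"},
    "cmpd_C": {"red_1", "yellow_1", "green_1"},
}
kept = filter_compounds(detections, blank_compounds={"cmpd_C"})
```

At m/z 553.4404 the combined mass window is about 0.013 Da, so a 0.0076 Da difference aligns while a 0.0196 Da difference does not.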

Compound Annotation
Processed data were reimported into MPP for annotation using Agilent MassHunter ID Browser B.08 (ID Browser) to search in-house and commercial databases. The in-house database is composed of HMDB 4.0 [19], Lipid Maps [51], National Institute of Standards and Technology (NIST) [52], and a set of 683 authentic standards with tandem MS analysis (MS/MS) data. Annotations were based on accurate mass, with a mass error cutoff of 10 ppm, together with isotope ratios and isotopic distribution, in which the predicted isotope distribution is compared to actual ion height and a score is generated. Scores > 50 were considered putative annotations and correspond to an MSI metabolite identification level two or three [18]. In addition, unmatched data for significant compounds were manually searched using FooDB [20], Phenol-Explorer [53], and KNApSAcK [22]. For compounds in which no annotation was possible, the molecular formula generator in ID Browser was used to estimate a metabolite chemical formula. All data and annotations were also manually reviewed.
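The 10 ppm mass error cutoff can be made concrete with a small worked example. The sketch below uses the standard definition of mass error in parts per million; the theoretical value is one of the reference masses listed in the instrument parameters, and the observed values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_cutoff(observed_mz, theoretical_mz, cutoff_ppm=10.0):
    """True when the absolute mass error does not exceed the cutoff."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= cutoff_ppm


theoretical = 922.009798      # reference mass from the MS parameters
observed_close = 922.0150     # hypothetical measurement, about 5.6 ppm off
observed_far = 922.0200       # hypothetical measurement, about 11 ppm off

accepted = within_cutoff(observed_close, theoretical)
rejected = not within_cutoff(observed_far, theoretical)
```

At this m/z, 10 ppm corresponds to roughly 0.0092 Da, so the cutoff becomes tighter in absolute terms for lighter ions.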
To aid in interpretation of the data, a software tool was developed to efficiently annotate identified compounds. The tool, MetabAnnotate, along with open-source Python 3 code, is available at https://github.com/AnachronicNomad/metabolite_annotations (accessed on 2 February 2021). Briefly, MetabAnnotate takes an input Excel file, uses batch export by HMP ID to access XML files from the HMDB [19] database v4 (found at https://hmdb.ca/metabolites), and outputs an annotated Excel file that includes the HMDB [19] description as well as other database IDs such as Kyoto Encyclopedia of Genes and Genomes (KEGG) [54] and Lipid Maps [51].
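The general pattern of such a lookup (parse one XML record per compound, then attach selected fields to each input row) can be sketched in Python as follows. This is not the MetabAnnotate code: the XML record is a simplified, hypothetical placeholder, the element names are assumptions that have not been checked against the actual HMDB export schema, and the rows are plain dictionaries rather than Excel rows.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified record; real HMDB exports contain many more fields
# and may use an XML namespace.
SAMPLE_XML = """<metabolite>
  <accession>HMDB0000000</accession>
  <name>Example lipid</name>
  <description>Placeholder description text.</description>
  <kegg_id>C00000</kegg_id>
</metabolite>"""

FIELDS = ("accession", "name", "description", "kegg_id")  # assumed element names


def parse_record(xml_text):
    """Extract the fields of interest from one metabolite record."""
    root = ET.fromstring(xml_text)
    return {field: (root.findtext(field) or "").strip() for field in FIELDS}


def annotate(rows, records_by_id):
    """Attach description and KEGG ID to each row; blanks when no record exists."""
    annotated = []
    for row in rows:
        record = records_by_id.get(row["hmdb_id"], {})
        annotated.append({
            **row,
            "description": record.get("description", ""),
            "kegg_id": record.get("kegg_id", ""),
        })
    return annotated


record = parse_record(SAMPLE_XML)
rows = [{"compound": "feature_1", "hmdb_id": "HMDB0000000"}]
print(annotate(rows, {record["accession"]: record}))
```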

Tandem MS (MS/MS)
To improve confidence in annotations, MS/MS was performed by targeting the m/z and retention time (RT) of statistically significant compounds of interest. Lipid extracts were analyzed using the above LC-MS method with MS/MS data collected at fixed 10, 20, and 40 eV collision energies. Resulting experimental MS/MS spectra were compared to the National Institute of Standards and Technology (NIST) Tandem Mass Spectral Library (Version 2.3) [52,55] using the NIST14 and NIST17 MSMS spectral libraries. The data were then searched using the in silico MS/MS spectral interpretation software SIRIUS version 4.6.0 [56] for formula and CSI:FingerID version 1.4.8 [57] for compound annotation. To increase confidence in compound ID database searches, pepper compounds were limited to the following natural product databases: Collection of Open Natural Products (COCONUT) [58], Global Natural Products Social Molecular Networking (GNPS) [59], Plant Metabolic Network (PMN) [60], KNApSAcK [22], and SUPER NATURAL II [61]. In silico strategies for improving confidence in annotations are further described in the Supplemental Methods. For example, one molecule of interest was initially annotated as β-cryptoxanthin, although manual review of the data showed that α-cryptoxanthin is indistinguishable from β-cryptoxanthin, both of which have an m/z of 552, though the latter is rarely found in plants [48]. As described in the Supplemental Methods, additional mass spectrometry, in silico strategies, and manual examination of the data resulted in confirmation of the original annotation of β-cryptoxanthin.

Statistical Analysis
Statistical analysis was performed in RStudio using R v.3.5.1. A linear mixed effects model, fitted with the function lmer from the lme4 [62] package, was used with log2 transformed relative metabolite peak height (i.e., comparing relative abundances) as the outcome and pepper color as the predictor. A random intercept term for pepper was used to control for correlation due to pepper pieces being from the same individual pepper. An adjusted model controlling for organic status and raw vs. cooked was used to assess the stability of the results. An FDR of 0.05 [23] was used to identify compounds significantly associated with pepper color. Post-hoc pairwise comparisons for pepper color were performed with Tukey's multiple testing correction using the emmeans function from the emmeans package. For the compound most significantly associated with pepper color, additional analyses were performed across pepper colors and within pepper color, where it was possible to evaluate other pepper attributes such as organic status and cooked vs. raw. As before, linear mixed effects models with a random intercept term for pepper were used.
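The log2 transformation and the FDR adjustment can be sketched in Python as follows. The sketch assumes the Benjamini-Hochberg step-up procedure, which the text above does not name explicitly, and it does not fit the mixed effects model itself (that step was performed with lmer in R). The function names and example p-values are hypothetical.

```python
import math


def log2_heights(heights):
    """Log2 transform relative peak heights (all values must be positive)."""
    return [math.log2(h) for h in heights]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-compound p-values for the color term
p_values = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

Under this procedure, a compound is declared significant when its adjusted p-value falls below 0.05, so the threshold applies to the adjusted values rather than to the raw p-values.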

Data Visualizations
Visualization of the data using principal component analysis (PCA) and hierarchical clustering (HC) was performed in MPP [9]. For PCA, log2 transformed data were mean centered and scaled, and the PCA was performed using the non-averaged sample groups interpretation with pruning using 4 principal components. For HC, data from 8174 compounds found in at least two samples were clustered on both samples and compounds using averages of replicates within each color. The MPP software for HC uses log2 transformed intensity values with the Euclidean distance metric and Ward's linkage rule.
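The preprocessing that precedes PCA and HC (log2 transformation, mean centering, scaling) and the Euclidean distance used for clustering can be sketched in Python as follows. This is an illustration only: it assumes scaling to unit sample standard deviation, which may differ from the MPP implementation, and it does not perform the PCA or Ward's linkage steps. Function names and values are hypothetical.

```python
import math
from statistics import mean, stdev


def autoscale(intensities):
    """Log2 transform one compound's intensities, then mean center and scale.

    Scaling by the sample standard deviation is an assumption; the vendor
    software may use a different scaling rule.
    """
    logged = [math.log2(v) for v in intensities]
    mu, sd = mean(logged), stdev(logged)
    return [(x - mu) / sd for x in logged]


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical intensities of one compound across three samples
print(autoscale([2.0, 4.0, 8.0]))
# Distance between two hypothetical two-compound sample profiles
print(euclidean([0.0, 0.0], [3.0, 4.0]))
```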