Identification of Anti-Proliferative Compounds from Genista monspessulana Seeds through Covariate-Based Integration of Chemical Fingerprints and Bioactivity Datasets

Genista monspessulana (L.) L.A.S. Johnson (Fabaceae) is a Mediterranean plant introduced to South America and other regions for ornamental purposes. However, it is considered an invasive shrub due to its reproductive vigor in many areas. Unlike other Genista plants, G. monspessulana has few studies disclosing its biologically active components, particularly cytotoxic agents against cancer cells. Thus, as part of our research on anti-proliferative bioactives, a set of ethanolic seed extracts from ten accessions of G. monspessulana, collected in the Bogotá plateau, was evaluated against four cell lines: PC-3 (prostate adenocarcinoma), SiHa (cervical carcinoma), A549 (lung carcinoma), and L929 (normal mouse fibroblasts). Extracts were also analyzed through liquid chromatography coupled with mass spectrometry (LC/MS) to record chemical fingerprints and determine the composition and metabolite variability between accessions. Using multiple covariate statistics, chemical and bioactivity datasets were integrated to recognize patterns and identify bioactive compounds among the studied extracts. G. monspessulana seed-derived extracts exhibited dose-dependent anti-proliferative activity on PC-3 and SiHa cell lines (IC50 values ranging from 26.3 to >500 µg/mL). Seven compounds (1–7) were inferred as the compounds most likely responsible for the observed anti-proliferative activity and were subsequently isolated and identified by spectroscopic techniques. A tricyclic quinolizidine (1) and a pyranoisoflavone (2) were found to be the most active compounds, exhibiting selectivity against the PC-3 cell line (IC50 < 18.6 µM). These compounds were used as precursors to obtain a quinolizidine-pyranoisoflavone adduct via a Betti reaction, which showed improved activity against PC-3, comparable to that of curcumin, the positive control. Results indicated that this composition–activity associative approach is advantageous for efficiently finding bioactive principles within active extracts.
This correlative association can be employed in further studies focused on the targeted isolation of anti-proliferative compounds from Genista plants and accessions.


Introduction
Genista is a plant genus within the Fabaceae family that comprises brooms and gathers ca. 90 perennial shrubby and short woody species, which are generally accepted to have Mediterranean origin [1], and are usually employed as ornamentals due to their abundant blooms and often sweet smell [2]. An interesting plant within this genus is Genista monspessulana (L.) L.A.S. Johnson [=Cytisus monspessulana L.; Teline monspessulana (L.) K. Koch]. It is a perennial, upright honey shrub that can reach up to 3 m [3] and is native to the Mediterranean region, Canary Islands, north Africa, and western Asia. However, due to its flowering being predominantly continuous in tropical regions [4], this plant has been naturalized in Australia and North and South America [5,6]. In Colombia, G. monspessulana is well known by common names such as 'smooth broom' (i.e., 'retamilla' or 'escobilla') and French or cape broom [7,8]. It was introduced to Colombia as an ornamental plant, as living factors and promote different bioactive and chemically diversified plant mixtures to be used as input for statistical analysis [44,45]. Combining chemical and bioactivity data from extracts of different accessions into a single examination is the main advantage of this integrative strategy, so the active principles highlighted by this covariate-based strategy can be subsequently isolated [46,47]. Therefore, it constitutes a beneficial starting point for focusing on the bioactive finding from plant sources [48].
Hence, as part of our research on recognizing naturally occurring compounds with anticancer properties, ethanol extracts from the seeds of ten G. monspessulana accessions were evaluated by the MTT (3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide) cell viability assay against three human cancer cell lines, i.e., lung carcinoma, cervical carcinoma, and prostatic adenocarcinoma, which are related to those cancer types with high mortality and incidence in Colombia [31]. The resulting anti-proliferative activity was statistically associated with mass spectrometry-based chemical composition, and seven compounds were recognized and identified using this integration.

Results and Discussion
G. monspessulana is widely distributed along the Bogotá plateau, particularly in open areas with higher light incidence [15]. These conditions allow this plant to have rapid leaf and root development and microorganism association, facilitating nitrogen fixation [49]. Accordingly, seeds from ten G. monspessulana accessions were retrieved from environmentally different locations to ensure a plausible chemical variation depending on the growth conditions provided by the surroundings [50]. The collected seeds per accession were subdivided into two accession groups (AG). AG1 (Gm1,2,4,6,8) comprised seeds of five accessions from open spaces and grasslands forming dense scrub associations, and AG2 (Gm3,5,7,9,10) gathered the seeds of the other five accessions from spaces near woodland and even under a tree canopy. Ground, dry seeds were then extracted using 96% ethanol to afford crude extracts, which were subsequently used in the MTT cell viability assay.

Anti-Proliferative Activity of G. monspessulana Seed-Derived Extracts
G. monspessulana has no previous studies on bioactivity against cancer cell lines, but other Genista plants have shown cytotoxic/anti-proliferative activity due to alkaloids and flavonoids [24]. The anti-proliferative activities of the ethanolic seed extracts (n = 10), obtained from the ten accessions of G. monspessulana (Gm1-10), against three cancer cell lines (i.e., PC-3 (human prostate adenocarcinoma), SiHa (human cervical carcinoma), and A549 (human lung carcinoma)) and a normal cell line (i.e., L929 (normal mouse fibroblasts)) are presented in Table 1. The studied extracts showed anti-proliferative activity against the test cell lines at different levels, indicating that these G. monspessulana accessions might differentially produce cytotoxic compounds. The resulting IC50 values fell into the 26.3-500 µg/mL range. The ten extracts were not active against A549 (IC50 > 500 µg/mL) and practically inactive against fibroblasts (IC50 > 481 µg/mL), while PC-3 was the most susceptible cell line to the extracts (average IC50 = 98.3 ± 68.1 µg/mL), followed by SiHa (average IC50 = 148.8 ± 97.8 µg/mL). The Gm10 extract was the most active (IC50 = 26.3 µg/mL) against the PC-3 cell line, with good selectivity since it was inactive against fibroblasts. Indeed, the second accession group (i.e., Gm3,5,7,9,10) exhibited better activity (IC50 < 57.8 µg/mL) against PC-3 than the first group (Gm1,2,4,6,8; IC50 > 92.8 µg/mL). A similar trend was observed for SiHa, whereas the opposite trend was observed for L929. These facts indicated that G. monspessulana seed-derived extracts exhibited selectivity to PC-3 and SiHa cancer cell lines. Additionally, their anti-proliferative activity was found to be differential depending on the seed origin, possibly due to the collecting environment (i.e., grassland versus woodland areas).
In this regard, seeds retrieved from woodland-growing plants exhibited generally higher anti-proliferative activity than seeds from grassland-growing plants, conceivably due to the stimulation of metabolite production as an adaptive response to face the plausible biotic pressure of other competing organisms (e.g., plants, insects, microorganisms) [51]. Conversely, open spaces, with different pressures and good growth conditions, possibly promoted a more constitutive status in G. monspessulana plants [52].
Table 1. Anti-proliferative activity against cancer cell lines and fibroblasts of seed extracts from ten G. monspessulana accessions.

Characterization Based on LC-ESI-MS Data of G. monspessulana Seed-Derived Extracts
Reverse-phase liquid chromatography coupled to mass spectrometry using electrospray ionization (RP-LC-ESI-MS) was used to chemically characterize the ten extracts from G. monspessulana accessions. The recorded m/z features per extract were recovered from the LC-MS raw data and gathered into a feature intensity table (FIT). Thus, 423 features were compiled from the ten extracts, implying a high metabolite diversity. Extracts shared various metabolites (i.e., features) involving intensity variations. In contrast, other metabolites occurred in particular extracts. This observation was expanded by the global LC-MS-based metabolite distribution illustrated by a heat map. It was built under the classification based on accession groups (AGs) according to the seed origin, i.e., AG1 and AG2, and after scaling the feature intensity to unit variance (i.e., autoscaling) to define differential metabolites depending on the color scale (3 to −3), i.e., dark red (=3) related to high feature intensity and dark blue (=−3) related to low feature intensity (Figure 1A).
According to the hierarchical clustering analysis (HCA) performed on the autoscaled metabolite data, the heat map evidenced that both AGs were discriminated by the presence and/or abundance of particular metabolites. Globally, accessions within an AG exhibited similar profiles, although some specific differences were also observed. In addition, the AG separation depending on feature amount was similar (i.e., the total number of features was almost equally divided between AG1 and AG2). Gm6 was the seed accession with the highest number of the most abundant metabolites, and Gm1 and Gm4 had the lowest number (Figure 1A). This fact suggested that the chemistry of G. monspessulana seeds is also influenced by the growth environment, as reported for other Fabaceae plants [53,54]. The most active extracts coincided with AG2, and they exhibited more abundant metabolites than AG1 (which included the least active extracts), suggesting that AG2 extracts contain interesting compounds that are probably responsible for the observed anti-proliferative effect.
An additional partial least squares-discriminant analysis (PLS-DA) demonstrated such a chemical differentiation between the AGs, using the first two components (69.3% variance explained), involving a good separation as observed in the C1 vs. C2 score plot (Figure 1B). The PLS-DA-derived variable importance in the projection (VIP) demarcated such a contrast, the scores of which led to ranking the ten most influential compounds in AG differentiation through a VIP plot (Figure 1C). These top-ranked features were registered by their retention time and mass/charge ratio (rt/m/z) pairs since the annotation failed because the putative identification resulted in various isomers.
Under this top ranking, nine features exhibited a high differential trend based on VIP scores (>3) as the criterion for selection. Therefore, two features were then related to the statistical separation of AG1 metabolite profiles, while seven features were related to AG2 distinction. Since the anti-proliferative activity was linked to the AG division, these seven top-ranked compounds that influenced the AG differentiation through chemical profiles can be considered active principal candidates. Consequently, the MS-based chemical and bioactivity datasets were statistically integrated to support this hypothesis and recognize the plausible anti-proliferative compounds.

Detection of Anti-Proliferative Candidates from G. monspessulana Accessions through the Integration of Chemical Fingerprint and Bioactivity Datasets
To recognize such active metabolites produced by G. monspessulana seeds, the antiproliferative activity (APA) and the LC-MS-based fingerprint (LMFP) datasets were integrated through multiple-covariate statistics. A single-Y orthogonal partial least squares (OPLS) regression was then used to associate such datasets. Due to the observed selectivity on the PC-3 cell line (Table 1), the respective IC50 values were specifically employed as the APA dataset and, consequently, the continuous Y variable. Thus, the resulting OPLS model, containing one predictive score (t1) and one orthogonal component (to1), differentiated the studied extracts based on APA (Y-data) and LMFP (X-data), exposing a well-fitted (R²X = 0.883, R²Y = 0.812) and predictable (Q²Y = 0.721) model and explaining the variance by the APA (49.5% along t1) and LMFP (38.8% along to1). The OPLS-derived score plot (Figure 2A) revealed the respective discrimination mode of seed extracts from G. monspessulana accessions. Hence, the APA-influencing differentiation of metabolite profiles was visualized by the IC50 values using a color scale between red (250 µg/mL) and blue (0 µg/mL). Through this pattern, the most active extracts clustered on the left side, but contained clearly different profiles because of their high dispersion, while the least active extracts were located on the right side. This trend corroborated the previously observed fact that specific metabolites occurring in the most active extracts might be responsible for the observed APA.
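The fit (R²Y) and predictability (Q²Y) figures quoted above are commonly defined as one minus a ratio of sums of squares: residuals of the fitted model for R²Y, and residuals of cross-validated predictions for Q²Y. The following is a minimal sketch of that definition only. The numbers are hypothetical and are not the values from Table 1, and the cross-validation scheme actually applied by the modeling software is not described in the text, so it is assumed here that held-out predictions are already available.

```python
def explained_fraction(observed, predicted):
    """Return 1 - SS_residual / SS_total for paired observed/predicted values."""
    y_mean = sum(observed) / len(observed)
    ss_total = sum((y - y_mean) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


# Hypothetical IC50 values (µg/mL); these are NOT the values from Table 1.
observed = [10.0, 20.0, 30.0, 40.0]
fitted = [12.0, 18.0, 31.0, 39.0]           # predictions from the full model
cross_validated = [15.0, 15.0, 35.0, 35.0]  # predictions for held-out samples

r2y = explained_fraction(observed, fitted)           # goodness of fit
q2y = explained_fraction(observed, cross_validated)  # goodness of prediction
print(round(r2y, 3), round(q2y, 3))
```

A Q²Y that stays close to R²Y, as reported for the model above, is usually read as a sign that the model is not merely overfitting the ten extracts.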
To facilitate the recognition of those compounds as bioactive candidates, the respective PLS-DA-derived loadings were scrutinized by employing an S-plot transformation (a p1 × p(corr)1 scatter plot forming an S-like contour) to categorize the relative importance of differential variables (i.e., metabolites). This S-plot (Figure 2B) displayed the covariance and the correlation structure between the X-data and t1 [55], using Pareto and centering scaling. Accordingly, the most important chemical differences among least active (p1 > 0) and most active (p1 < 0) G. monspessulana seed-derived extracts were exposed by the metabolites located distantly in the wings of the S-plot, showing a strong influence on the model with high reliability. Therefore, seven compounds (numbered as 1-7, red dots) were categorized as the most influential variables (p(corr)1 > 0.4, p1 < -0.2) for the OPLS-based differentiation of the most active extracts. In contrast, the other two metabolites (blue dots) were highly related to the least active extracts.
The resulting VIP plot (Figure 2C) corroborated such a relevant influence on the integrative discrimination as plausible bioactives (VIP scores > 3). These features showed […] (7). According to the combined LC and MS behavior, these metabolites were related to intermediate-polar and low-weight compounds. Compound 1 showed the most substantial model influence due to its high discriminating importance (p1 < -0.4). This compound was highly present in Gm9 and Gm10 extracts. Compounds 4 and 7 showed the best reliability on the basis of their differential p(corr)1 value, indicating that these compounds are frequent in various G. monspessulana extracts. Compounds 5 and 6 displayed the lowest model influence and reliability. Nevertheless, slight differences in these discriminating parameters were observed for the differential metabolites 1-7.
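The S-plot coordinates described above can be illustrated with a short sketch: for each feature, p1 is taken as the covariance between the feature column and the predictive score t1, and p(corr)1 as the corresponding correlation. The score vector and the feature table below are hypothetical, and the function names are illustrative only. Because a covariance and a correlation with the same score always share their sign, the sketch applies the 0.4 cutoff to the magnitude of p(corr)1, which is an assumption about how the stated thresholds were meant.

```python
from math import sqrt


def s_plot_coordinates(x_matrix, t1):
    """Return one (p1, pcorr1) pair per feature column of x_matrix."""
    n = len(t1)
    t_mean = sum(t1) / n
    t_centered = [t - t_mean for t in t1]
    t_norm = sqrt(sum(t * t for t in t_centered))
    coords = []
    for column in zip(*x_matrix):
        c_mean = sum(column) / n
        c_centered = [x - c_mean for x in column]
        cross = sum(t * x for t, x in zip(t_centered, c_centered))
        x_norm = sqrt(sum(x * x for x in c_centered))
        p1 = cross / (n - 1)  # covariance with t1
        pcorr1 = cross / (t_norm * x_norm) if x_norm > 0 else 0.0  # correlation
        coords.append((p1, pcorr1))
    return coords


# Hypothetical data: 4 extracts (rows) x 2 features (columns).
t1 = [-2.0, -1.0, 1.0, 2.0]
x_matrix = [
    [4.0, 1.0],
    [3.0, 2.0],
    [1.0, 1.0],
    [0.0, 2.0],
]
coords = s_plot_coordinates(x_matrix, t1)

# Candidates on the "most active" wing: negative covariance, strong correlation.
selected = [j for j, (p1, pcorr1) in enumerate(coords)
            if p1 < -0.2 and abs(pcorr1) > 0.4]
print(coords, selected)
```

In this toy table, only the first feature falls in the selected wing, since it decreases steadily as t1 increases, whereas the second feature varies almost independently of the score.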
Therefore, they were considered within the pattern recognition as anti-proliferative candidates that probably participated in the measured APA against the PC-3 cell line by the studied extracts. This outcome indicated that APA/LMFP dataset integration could be successfully achieved for bioactive pinpointing by single-Y OPLS, since the covariance maximization of discriminating metabolites (independent variables) as a function of bioactivity (continuous or categorical dependent variable) is satisfactorily achieved by supervised statistical methods, e.g., OPLS or PLS, but not by unsupervised methods, e.g., principal component analysis (PCA) [56]. Thus, PCA was not employed as a first-line analysis.
The main advantage of the dataset association based on metabolite profiling is using chemical fingerprints as the source of independent variables to be integrated with the bioactivity of mixtures of natural origin as a dependent variable [57]. Because single-Y OPLS uses a continuous variable, its convenience as a multiple-covariate dataset integration is higher than that of those using categorical variables since a considerable amount of relevant information can be lost [58,59]. In addition, this integration can also detect unstable metabolites, with this being the primary concern during an extract fractionation [48]. However, an intrinsic limitation is the detection of false positives due to the synergistic/antagonistic effects with other components within extracts [60]. Therefore, the targeted isolation of 1-7 was carefully performed to validate the observed correlative differentiation by assessing their anti-proliferative activity on cancer cell lines.

Isolation and Identification of OPLS-Recognized Anti-Proliferative Candidates
The fingerprint/bioactivity integration recognized metabolites 1-7 as the most discriminating metabolites for the most active APA-based extracts. Therefore, semipreparative HPLC separations were conducted to purify these compounds from the most active extracts (i.e., Gm9 and Gm10). Compounds 1, 3, and 4 were obtained from Gm10, whereas the Gm9 extract afforded compounds 2, 5, 6, and 7. After isolation, the structures of compounds 1-7 were elucidated by diagnostic scrutiny of nuclear magnetic resonance (NMR) and MS data. Their 13C NMR data were identical to those reported for the known metabolites (-)-cytisine (1) [61], alpinumisoflavone (2), (+)-aphylline (3), (-)-anagyrine (4), (-)-N-methylcytisine (5), wighteone (6), and (+)-5,6-dehydrolupanine (7) (Figure 3).
The purified compounds were assessed against the four cell lines using the MTT viability assay to validate the OPLS-based bioactive recognition. Their resulting IC50 values are listed in Table 2. As expected, PC-3 was the most susceptible cell line when treated with compounds 1-7. The alkaloid 1 exhibited the most potent anti-proliferative effect (IC50 = 15.8 µM) on this cell line, although it was less potent than the positive control curcumin (IC50 = 9.5 µM). However, 1 also showed a cytotoxic effect on A549 (IC50 = 42.5 µM) and fibroblasts (IC50 = 102.7 µM). The pyranoisoflavone 2 also displayed good activity against PC-3 (IC50 = 18.6 µM), and it was the most active against SiHa (IC50 = 19.6 µM) and inactive against L929, while prenylated isoflavone 6 was the least active compound against PC-3 and also inactive for fibroblasts.
Cytisine-type alkaloids 1 and 5 revealed an interesting trend regarding the anti-proliferative activity. In this regard, a methyl group at N12 seemed to negatively affect the anti-proliferative activity against PC-3 and SiHa but had a positive effect against A549, which can be further investigated. In contrast, alkaloids 3, 4, and 7 and isoflavones 2 and 6 showed similar anti-proliferative profiles for each metabolite type, with an unclear trend. However, alkaloids were generally cytotoxic for fibroblasts (IC50 < 85.9 µM), while isoflavones were found to be inactive against L929 (IC50 > 400 µM).
Cytisine (1) is a well-known compound with several relevant biological properties, including cytotoxic and anti-proliferative activities [68], particularly against the HepG2 (human hepatocellular carcinoma) cell line, by inducing mitochondrial-mediated apoptosis [69]. Compound 1 was also reported to be cytotoxic through apoptosis induction against the A549 cell line, coinciding with our findings but involving better activity (IC50 = 26.83 µM) [70]. Other cancer cell lines, such as the FaDu, MCF-7, and MDA-MB cell lines, are not affected by 1 [71], which confirms its observed selectivity to lung and prostate cancer cell lines. Similarly, alpinumisoflavone (2) has several reports on various biological activities, including cytotoxicity against various cancer cell lines, such as human oral epidermoid (KB), murine leukemia (P-388), human leukemia (HL-60, K-562, MOLT-4), human lung (H2108, H1299, MRC-5), human renal (ccRCC 786-O, Caki1, SN12C), neuroblastoma […]
There are no records of aphylline (3) or 5,6-dehydrolupanine (7) possessing antiproliferative activity. In addition, compounds 1 and 4-6 have no records on activity against PC-3, and none of the bioactives 1-7 have been evaluated against SiHa. Therefore, they were evaluated for the first time in the present study against these cell lines for the purposes of statistical integration. These findings confirmed that this association effectively identifies anti-proliferatives against these three cancer cell lines from G. monspessulana seed extracts, validating the candidates, supported by previous studies assessing the activity of some of these compounds against several cancer cell lines. However, there is a probability that other active compounds are missing due to the plausible antagonistic effects of diverse extract components, so deeper integrative analyses for detecting those missed bioactives are recommended, even those less active but having synergistic roles that could improve the activity [77]. Lastly, since the isolated compounds 1-7 were evaluated at different concentrations from their parent extracts, a direct comparison of anti-proliferative activity between individual compounds and extracts was not possible. Therefore, further studies would be necessary to disclose whether the isolated compounds 1-7 are the only antiproliferative bioactives in the test G. monspessulana seed extracts.
A recent study explored the inhibitory activity of prostate and colon cancer cell proliferation by cytisine-linked isoflavonoids (CLIFs) at C7 through a carbon chain (C2-C6) spacer showing inhibition > 60% at 10 µM [78]. We explored a similar strategy, considering that 1 and 2 were the most potent compounds against the PC-3 cell line (IC 50 < 18.6 µM). Thus, we used a Betti-like reaction [79] to condense 1 and 2 into the CLIF 8 (a novel cytisine-alpinumisoflavone adduct) (Figure 4). This reaction involves a multi-component protocol to combine an aldehyde, a primary/secondary amine, and a phenol to produce N-substituted-2-aminomethylphenols. In this regard, the reaction proceeded at room temperature with the imine formation by nucleophilic addition of 1 (amine) to formaldehyde and, subsequently, 2 (nucleophilic phenol) was added to the resulting imine in the presence of 4-dimethylaminopyridine (DMAP) to afford the CLIF 8 (63% yield).
The synthetic compound 8 was also evaluated against the four cell lines. This compound showed an enhanced anti-proliferative profile, since the activity against the three cancer cell lines was better than that of precursors 1 and 2, with PC-3 being the most susceptible cell line (IC50 = 10.1 µM). In addition, CLIF 8 exhibited better activity against SiHa and A549 (IC50 = 17.5 and 46.8 µM, respectively) but a lower anti-proliferative effect on fibroblasts (IC50 = 385 µM). The activity outcome for 8 comprised selectivity indexes (SI = IC50 normal cells/IC50 cancer cells) of 38.1, 22.0, and 8.2 for PC-3, SiHa, and A549, respectively. Recently, a cytisine-linked pterocarpan (tonkinensine B) was synthesized from cytisine and (−)-maackiain [80]. This adduct exhibited ca. 2-3-fold better activity against two human breast cancer cell lines (i.e., MCF-7 and MDA-MB-231) and lower cytotoxicity against normal RAW 264.7 (mouse macrophages) and BV2 (mouse microglia) cell lines than respective precursors, coinciding with our findings. Tonkinensine B has cytotoxic activity by apoptosis induction, a relevant mechanism to be expected for anticancer compounds. In tonkinensine B, cytisine was linked to C4 at the pterocarpan's A-ring through its OH at C3. Other previously studied CLIFs had the cytisine linked to C7 at the A-ring via an ether bridge. In contrast, CLIF 8 contained a phenolic OH at the B-ring, which favored the reaction with cytisine, since the OH at C5 does not have the proper chemical environment for the Betti reaction. Thus, derivatives of 8 can be further studied as a novel CLIF series to explore its potential against mainly prostate but also cervical and lung cancers.
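The selectivity index defined above (SI = IC50 normal cells/IC50 cancer cells) can be checked with a few lines of arithmetic. The sketch below uses the IC50 values reported in the text for compound 8; the function and variable names are illustrative only.

```python
def selectivity_index(ic50_normal, ic50_cancer):
    """SI = IC50 on normal cells / IC50 on cancer cells (same units)."""
    return ic50_normal / ic50_cancer


# IC50 values (µM) quoted in the text for CLIF 8.
ic50_l929_um = 385.0
ic50_cancer_um = {"PC-3": 10.1, "SiHa": 17.5, "A549": 46.8}

si = {
    line: round(selectivity_index(ic50_l929_um, value), 1)
    for line, value in ic50_cancer_um.items()
}
print(si)
```

Since L929 is the normal reference line in the numerator, the calculation yields one SI per cancer cell line, i.e., three values.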

Plant Material
Seeds of ten G. monspessulana (L.) L.A.S. Johnson accessions were collected in Cundinamarca and Boyacá, Colombia, between July and September 2014, from different growth conditions provided by surroundings, abiding by the Colombian ethical legislation. Collected seeds per accession were then subdivided into two accession groups: (1) AG1 contained seeds of five accessions (Gm1,2,4,6,8) retrieved from plants growing in open spaces and grasslands forming dense scrub associations, and (2) AG2 contained the seeds of the other five accessions (Gm3,5,7,9,10), retrieved from plants growing in spaces near woodland and even under a tree canopy. Voucher specimens are kept at the Colombian National Herbarium. The collected healthy seeds were transported to the laboratory for extract preparation.

Extract Preparation
The healthy seeds (50 g) from the ten G. monspessulana accessions were separately air-dried, crushed, ground, and extracted with 96% ethanol at a constant shaking speed (120 rpm) using a Heidolph Rotamax 120 platform orbital shaker (Heidolph Instruments GmbH & Co. KG, Schwabach, Germany). The extraction lasted one week; each day, the extract-containing solvent was removed by filtration and replaced with fresh 96% ethanol. The filtered solution was concentrated by distillation under reduced pressure at 40 °C using an IKA RV 10 Control rotary evaporator (IKA® RV 10, IKA® Werke GmbH & Co. KG, Staufen, Germany) to afford the raw extracts, which were pooled per accession after each daily extraction. The resulting raw extracts per accession were dried and stored at −20 °C until subsequent biological and chemical analyses.

In Vitro Cell Viability Assay
Human prostatic adenocarcinoma (PC-3, ATCC CRL-7934), human lung adenocarcinoma (A549, ATCC CCL-185), and human cervical carcinoma (SiHa, ATCC HTB-35) cancer cell lines and normal mouse fibroblasts (L929, ATCC CRL-6364) were maintained in a humidified atmosphere with 5% CO2 at 37 °C, and grown as a monolayer culture in Dulbecco's Modified Eagle Medium (DMEM) with 10% (v/v) fetal bovine serum (FBS), 1% (v/v) penicillin, and 1% (v/v) streptomycin. The anti-proliferative effects of G. monspessulana extracts and isolated compounds were measured according to a reported method [81]. Cell suspension (100 µL, 5 × 10³ cells/well) was inoculated in 96-well plates and cultured for 24 h. After that, the culture medium was replaced with a serum-free medium (100 µL) containing different concentrations of treatments (0.8-500 µg/mL for extracts and 0.16-100 µg/mL for pure compounds). Extracts and compounds were measured in triplicate. A PBS-containing free medium was used as a blank, 1% (w/v) bovine serum albumin-amended medium as negative control (100% survival), and curcumin (0.16-100 µg/mL) was used as the positive control. After 48 h of incubation, the cell viability was assessed by adding MTT (10 µL, 5 mg/mL) to each well, and the plates were subsequently incubated under 5% CO2 at 37 °C for 3 h. Formazan crystals were dissolved with 100 µL of DMSO. Absorbance was measured at 570 nm using a Varioskan LUX 96-well plate reader (Thermo Fisher Scientific, Waltham, MA, USA). The anti-proliferative effects were expressed as half-maximal inhibitory concentration (IC50) in µg/mL (extracts) and µM (compounds). The IC50 values were calculated from the dose-response curves through non-linear regression using GraphPad 5.0 (GraphPad Software, San Diego, CA, USA).
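To make the path from absorbance readings to an IC50 value concrete, the following sketch converts absorbances into percent viability relative to the blank and the negative control, and then estimates the IC50 by interpolating on the log-concentration axis between the two doses that bracket 50% viability. This interpolation is a deliberately simpler stand-in for the non-linear regression used in the study, and all readings below are hypothetical.

```python
from math import log10


def percent_viability(a_treated, a_blank, a_control):
    """Viability (%) relative to the negative control, after blank subtraction."""
    return 100.0 * (a_treated - a_blank) / (a_control - a_blank)


def ic50_by_interpolation(concentrations, viabilities):
    """Interpolate on log10(concentration) between the doses bracketing 50%."""
    points = sorted(zip(concentrations, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            return 10 ** (log10(c_lo) + frac * (log10(c_hi) - log10(c_lo)))
    return None  # 50% viability is never crossed in the tested range


# Hypothetical dose-response data (concentration in µg/mL, viability in %).
doses = [1.0, 10.0, 100.0]
viability = [90.0, 70.0, 30.0]
ic50 = ic50_by_interpolation(doses, viability)

# An inactive sample never reaches 50% viability within the tested range.
inactive = ic50_by_interpolation(doses, [98.0, 95.0, 91.0])
print(ic50, inactive)
```

A `None` result corresponds to the situation reported as an IC50 above the highest tested concentration (e.g., > 500 µg/mL for the extracts against A549).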

High-Performance Liquid Chromatography Coupled to Mass Spectrometry
Metabolite profiles of test extracts were recorded on a Shimadzu Prominence (Shimadzu Corporation, Kyoto, Japan) equipped with two binary pumps, an autoinjector, a photodiode array (PDA) detector, and an LCMS2020 mass spectrometry detector with a single quadrupole analyzer and electrospray ionization (ESI). Each seed extract was dissolved in absolute ethanol (5 mg/mL) and injected (20 µL) into the HPLC system.

Multiple-Covariate Integration of Chemical Fingerprint and Bioactivity Datasets
The processed data were exported in CSV format to build the feature intensity table (FIT; 10 samples × 423 features), which was autoscaled (unit variance scaling) to allow suitable comparisons between features. A heat map was built to visualize the distribution of the autoscaled features using MetaboAnalyst 5.0 (McGill University, Quebec, Canada) [83]. The pre-treated FIT was then joined with the respective anti-proliferative activity data (i.e., APA as a continuous variable) to assemble the integrated dataset. The resulting matrix was imported into SIMCA (v 14.0) (Umetrics, Umeå, Sweden) to build the respective models by single-Y orthogonal partial least squares (OPLS). The results were visualized using score plots and S-plots.
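
The integration workflow can be sketched in Python as follows. This is only an illustration under stated assumptions and does not reproduce the SIMCA OPLS algorithm. It autoscales a toy FIT, extracts the first component of a single-response PLS model (a simpler stand-in for the OPLS predictive component), and computes the covariance and correlation of each feature with the score vector, which are the two axes of an S-plot. The toy table (5 samples × 3 features), the APA values, and all function names are hypothetical.

```python
import math

def autoscale(matrix):
    """Mean-center each column and divide it by its sample standard deviation."""
    n = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        scaled_cols.append([(x - mean) / sd if sd > 0 else 0.0 for x in col])
    return [list(row) for row in zip(*scaled_cols)]

def pls1_first_component(x, y):
    """Weights w and scores t of the first PLS component for a single response."""
    n, p = len(x), len(x[0])
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    w = [sum(x[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(x[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, t

def s_plot_coordinates(x, t):
    """(covariance, correlation) of every feature with the score vector t."""
    n, p = len(x), len(x[0])
    t_mean = sum(t) / n
    tc = [v - t_mean for v in t]
    t_sd = math.sqrt(sum(v * v for v in tc) / (n - 1))
    coords = []
    for j in range(p):
        col = [x[i][j] for i in range(n)]
        m = sum(col) / n
        cov = sum((col[i] - m) * tc[i] for i in range(n)) / (n - 1)
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        coords.append((cov, cov / (sd * t_sd) if sd > 0 else 0.0))
    return coords

# Hypothetical toy data: rows are accessions, columns are LC/MS features.
fit_table = [[10, 5, 9], [20, 7, 7], [30, 4, 6], [40, 6, 3], [50, 5, 1]]
apa = [12, 25, 33, 47, 60]  # hypothetical activity values, one per accession

x_scaled = autoscale(fit_table)
weights, scores = pls1_first_component(x_scaled, apa)
for j, (cov, corr) in enumerate(s_plot_coordinates(x_scaled, scores)):
    print(f"feature {j}: cov = {cov:.3f}, corr = {corr:.3f}")
```

Features with both a large covariance and a high correlation lie at the extremes of the S-plot and are the natural candidates for follow-up. OPLS additionally removes variation in the features that is orthogonal to the response, which this sketch does not attempt.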

Synthesis of Cytisine-Linked Isoflavonoid 8
Into a round-bottom flask, (-)-cytisine (0.02 mmol), alpinumisoflavone (0.01 mmol), and 1,4-dioxane (2 mL) were added and mixed at room temperature to afford a solution. Subsequently, formaldehyde (0.02 mmol) and DMAP (1% mmol) were added. The reaction mixture was stirred at room temperature until completion, as determined by thin-layer chromatography on silica gel plates using chloroform/methanol (98:2) as the mobile phase. The solvent was then removed under reduced pressure to afford the crude reaction product.

Conclusions
Seed extracts from ten G. monspessulana accessions exhibited anti-proliferative activity against three cancer cell lines (i.e., PC-3, SiHa, and A549) at different levels, demonstrating selectivity toward prostatic adenocarcinoma (PC-3) and cervical carcinoma (SiHa) cell lines and lower toxicity to fibroblasts. To the best of our knowledge, the present study constitutes the first attempt to evaluate the inhibitory capacity of the studied G. monspessulana-derived extracts against cancer cell lines, which showed selectivity toward the PC-3 cell line. In addition, the anti-proliferative activity revealed a differential pattern depending on the seed origin, probably due to the growing area of the parent plants. Thus, seeds retrieved from woodland-growing plants were significantly more active against PC-3 than seeds from grassland-growing plants. This trend was rationalized through a specialized metabolite-mediated adaptive response to face plausible biotic pressures from other woodland-competing organisms, while plants in open spaces promoted a more constitutive status. This information was used as a bioactivity input for the indirect detection of plausible anti-proliferative candidates through metabolite profiling by relating the fingerprints and the PC-3-oriented anti-proliferative activity datasets, leading to the recognition of seven hits, namely (-)-cytisine (1), alpinumisoflavone (2), (+)-aphylline (3), (-)-anagyrine (4), (-)-N-methylcytisine (5), wighteone (6), and (+)-5,6-dehydrolupanine (7), as the active principle candidates. The isolation of these hits from active seed extracts and their anti-proliferative activity assessment demonstrated the effectiveness of this indirect approach based on LC-MS profiles to find bioactives from G. monspessulana extracts. Further explorations on Genista active extracts or individual compounds comprising quinolizidines and prenylated isoflavonoid moieties should be conducted to expand their potential as protective agents against cancer.
Indeed, the most active compounds (1 and 2) were transformed into a more active compound, cytisine-linked isoflavonoid 8, which displayed better activity against PC-3 with a good selectivity index (>30). Thus, 8-related CLIFs might be considered in future studies to define their potential against lung, cervical, and prostate cancer.

Informed Consent Statement: Not applicable.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.