Systematic Analysis of Genetic and Pathway Determinants of Eribulin Sensitivity across 100 Human Cancer Cell Lines from the Cancer Cell Line Encyclopedia (CCLE)

Simple Summary Despite decades of clinical use and detailed understandings of their mechanisms of action, clinically useful predictive biomarkers for approved microtubule targeting agents such as eribulin, paclitaxel and vinorelbine have eluded development. Our results now provide the basis for gene expression-based, drug-specific predictive biomarkers for eribulin and vinorelbine, as well as deeper understandings of the molecular pathways associated with eribulin and vinorelbine response. Abstract Eribulin, a natural product-based microtubule targeting agent with cytotoxic and noncytotoxic mechanisms, is FDA approved for certain patients with advanced breast cancer and liposarcoma. To investigate the feasibility of developing drug-specific predictive biomarkers, we quantified antiproliferative activities of eribulin versus paclitaxel and vinorelbine against 100 human cancer cell lines from the Cancer Cell Line Encyclopedia, and correlated results with publicly available databases to identify genes and pathways associated with eribulin response, either uniquely or shared with paclitaxel or vinorelbine. Mean expression ratios of 11,985 genes between the most and least sensitive cell line quartiles were sorted by p-values and drug overlaps, yielding 52, 29 and 80 genes uniquely associated with eribulin, paclitaxel and vinorelbine, respectively. Further restriction to minimum 2-fold ratios followed by reintroducing data from the middle two quartiles identified 9 and 13 drug-specific unique fingerprint genes for eribulin and vinorelbine, respectively; surprisingly, no gene met all criteria for paclitaxel. Interactome and Reactome pathway analyses showed that unique fingerprint genes of both drugs were primarily associated with cellular signaling, not microtubule-related pathways, although considerable differences existed in individual pathways identified. Finally, four-gene (C5ORF38, DAAM1, IRX2, CD70) and five-gene (EPHA2, NGEF, SEPTIN10, TRIP10, VSIG10) multivariate regression models for eribulin and vinorelbine showed high statistical correlation with drug-specific responses across the 100 cell lines and accurately calculated predicted mean IC50s for the most and least sensitive cell line quartiles as surrogates for responders and nonresponders, respectively. Collectively, these results provide a foundation for developing drug-specific predictive biomarkers for eribulin and vinorelbine.


Introduction
Microtubule targeting agents (MTAs), including eribulin, several taxanes, vinca alkaloids, and at least one epothilone, are important clinical anticancer drugs for several cancer types [1][2][3][4]. Despite sharing targets of MTs and their α/β-tubulin building blocks, different MTAs have different clinical profiles, binding sites, mechanisms of inhibiting MT dynamics, as well as different effects on mitotic versus interphase cells [1][2][3][4]. For example, paclitaxel is an MT stabilizer, while eribulin and vinorelbine are MT destabilizers; both MT stabilization and destabilization are sufficient to disrupt the MT dynamics that underly MT Eribulin mesylate (hereafter, eribulin) was supplied by Eisai Inc. (Cambridge, MA, USA) as laboratory-grade dry powder active pharmaceutical ingredient (API) obtained from the same manufacturing stream used for Eisai's branded clinical product, Halaven ® . Paclitaxel and vinorelbine tartrate (hereafter, vinorelbine) were obtained as dry powders from Selleckchem (Shanghai, China). Eribulin, paclitaxel and vinorelbine were prepared as 10 mM stock solutions in 100% (v/v) DMSO, aliquoted into small volumes and stored at −20 • C until day of use. Although not a designated test agent for this study, cisplatin was included in all assays as an internal reference control for assay performance; cisplatin was obtained as a laboratory grade liquid formulation from Hospira Australia Pty Ltd. (Melbourne, Australia), with storage per manufacturer's instructions and dilutions for cell culture studies on day of use.

Human Cancer Cell Lines
One hundred established human cancer cell lines were selected based on inclusion in both the CCLE and Crown Bioscience's OmniPanel TM in vitro cell line screening service (see Section 2.3). A list of selected cell lines and their tissues of origin are presented in Supplemental Table S1. Cell line selection was based on a combination of common usage, well-established characterization in the literature, personal experience and our hypothesis that the most diverse cell line panel would provide the highest likelihood of identifying genes and pathways associated with specific drug responses, agnostic of cancer type.

In Vitro Cell-Based Antiproliferative Assays
Measurement of antiproliferative activities of test agents against the selected 100 CCLE cell lines was performed by Crown Bioscience under contract from Eisai Inc., using Crown's OmniPanel TM in vitro cell line screening service in their Beijing (China) laboratories. All OmniPanel TM cell lines are routinely tested for mycoplasma and authenticated using short tandem repeat (STR) DNA profiling. Testing was conducted using the CellTiter-Glo ® Luminescent Cell Viability Assay (Promega Corp., Beijing, China) following 72 h compound exposures in 96-well plate assay formats. Compounds were added 24 h after cell seeding into plates, with initial seeding densities for individual cell lines having been previously optimized for OmniPanel TM screening. Based on the authors' prior experience with the test agents, assays employed 9-step half-log test concentrations of 30 pM-300 nM for eribulin and 100 pM-1 µM for paclitaxel and vinorelbine. Concentrations of test agents inhibiting 50% of viable cell densities in vehicle-treated control wells compared to wells with the highest drug concentrations were defined as IC50s and were determined using GraphPad Prism software (version 5.0; GraphPad-Prism China, Beijing, China). Doseresponse curve fitting used a nonlinear regression model with sigmoidal dose response, fixed 100% top y-axis values defined by vehicle and floating bottom y-axis values at either the highest concentration tested or another concentration that resulted in a minimum y-axis reading, ensuring that calculated IC50s represent actual antiproliferative biological responses occurring during the 72 h treatment period without plateau artifacts based on different growth rates of different cell lines.
IC50s for 5 cell lines for eribulin (769-P, 786-O, CADO-ES1, HCT-15, NCI-H716) and 1 cell line for paclitaxel (HCT-15) exceeded the 300 nM and 1000 nM top concentrations used for these 2 drugs, respectively, so surrogate values of 301 nM eribulin and 1001 nM paclitaxel were assigned for purposes of defining cell lines in the bottom (least sensitive) quartile and calculating means/medians of bottom quartiles. Although using such surrogates does not change assignment of bottom quartiles or median IC50 values, it probably results in minor underestimations of mean bottom quartile IC50s, SDs and SEMs for eribulin and paclitaxel, although such underestimations are likely to be small due to the large overall number of cell lines tested.

Gene Expression Data Analysis
Expression data for 11,985 genes from the 100 selected cell lines were downloaded from the Cancer Dependency Map (DepMap) project portal (https://depmap.org/portal/ accessed on 3 August 2021; release version "DepMap Public 18Q2", 2 May 2018). DepMap data using the symbol SEPT10 were replaced with the currently accepted symbol SEPTIN10. Expression values are log 2 gene-level reads per kilobase million (RPKM) derived from RNA Sequencing (RNA-Seq), aligned using TopHat version 1.4 and quantified using the pipeline developed for the Genotype-Tissue Expression (GTEx) project as described by Tsherniak et al. [15]. Systems-level analyses of genes associated with drug response were performed using the Data4Cure Biomedical Intelligence ® Cloud (La Jolla, CA, USA) [16] to identify gene-level determinants of response that were unique to eribulin or shared with vinorelbine or paclitaxel (see Section 2.5). Differential gene expression analysis comparing top and bottom cell line quartiles for each drug (most and least sensitive 25 cells lines, respectively) was performed using the limma R package [17], fitting a linear model for each gene and producing a moderated t-statistic, fold-change, p-value and q-value for each gene, with q-values representing p-values adjusted for multiple hypothesis testing using the false discovery rate (FDR) method of Benjamini and Hochberg [18].

Identification of Drug-Specific UFGs
Ratios of mean expression levels for each gene between the most and least sensitive cell line quartiles (top and bottom quartiles, respectively) for each drug were calculated. Genes with top/bottom quartile ratios with p < 0.0025 significance for a given "cognate" drug were further restricted by sequentially (i) excluding genes having p < 0.0025 overlaps with either of the 2 "noncognate" drugs, (ii) retaining only genes with either ≥2× increased or ≤0.5× decreased expression ratios between top and bottom quartiles (|log 2 [quartile expression ratio]| ≥ 1) and (iii) after reintroducing data from the middle two cell line quartiles, retaining only those genes with statistical significance (p < 0.05) for the cognate drug in full linear regression analysis with all 100 cell lines, but lacking statistical significance for both Cancers 2022, 14, 4532 4 of 21 noncognate drugs. Genes meeting all the above criteria for a given drug were designated as unique fingerprint genes or UFGs for that drug. No gene met all criteria for paclitaxel; thus, no paclitaxel UFGs were identified in this study.

Network Propagation and Reactome Pathway Analyses
Network propagation [19] starting with the 9 eribulin UFGs and 13 vinorelbine UFGs as query inputs was performed using the Network Enrichment platform in the Data4Cure Biomedical Intelligence ® Cloud [16]. The resulting 100-gene networks were then used as query inputs for Reactome pathway analyses [20,21] using the Reactome portal in the Data4Cure platform.

Multigene MVR Model Building
MVR model building was performed to develop multigene panels of UFGs that can predict the likelihood of high versus low response to eribulin and vinorelbine (MVR model building was not performed for paclitaxel since no gene met all UFG criteria for that drug). Full sets and subsets of eribulin and vinorelbine UFGs were combined as follows. Since the criteria for UFGs demand that expression levels of each UFG show statistically significant correlation with IC50s in linear regressions across all 100 cell lines, individual correlation equations for each UFG were used to predict IC50s for each cell line based only on expression of that gene in that cell line. This was repeated for all UFGs and cell lines, followed by averaging predicted IC50s for each cell line based on either full sets or subsets of eribulin and vinorelbine UFGs. Predicted mean IC50s were then correlated with actual measured IC50s by linear regression analysis, thereby obtaining both statistical significance and the equation of the model.
Since IC50s predicted from equations of individual UFGs are actually drug-agnostic (independent of the cognate drug for which UFG status was accorded), MVR models built from eribulin and vinorelbine UFGs were assessed for their abilities to predict responses not just to the corresponding cognate drug but also to the 2 noncognate drugs (including paclitaxel). As expected, MVR models built with full sets of eribulin and vinorelbine UFGs showed highly significant correlations with their cognate drugs (Supplemental Figure S2). Unexpectedly, however, full set MVR models also showed significant correlations with one or both noncognate drugs, presumably due to consolidation and statistical strengthening of nonsignificant numerical trends for noncognate drugs that can be observed for many individual UFGs (visible in Supplemental Figure S1). Accordingly, MVR models using UFG subsets were constructed by prioritizing UFGs based on the highest individual correlation coefficients (R 2 values) across all 100 cells lines for each gene. This approach yielded MVR models based on 4 eribulin UFGs (C5ORF38, DAAM1, IRX2, CD70) and 5 vinorelbine UFGs (EPHA2, NGEF, SEPTIN10, TRIP10, VSIG10), both of which showed highly significant correlations between predicted and measured IC50s across all 100 cell lines for cognate but not noncognate drugs.

Antiproliferative Effects of Eribulin, Paclitaxel and Vinorelbine against 100 CCLE Cell Lines
In vitro antiproliferative potencies of eribulin, paclitaxel and vinorelbine against 100 CCLE cell lines were determined during 72 h compound exposures ( Figure 1A-D, Table 1, Supplemental Table S1). Overall, eribulin IC50s correlated with those for paclitaxel and vinorelbine with high statistical significance across the 100 cells lines ( Figure 1A), although slopes for paclitaxel and vinorelbine were notably shallower compared to eribulin, whose lowest IC50s dipped well below 10 −9 M, thus driving the steeper eribulin slope. The overall order of potency was eribulin > paclitaxel > vinorelbine, with eribulin showing 1.5to 6.7-fold greater potency compared to paclitaxel (means and medians, respectively), and 2.0-to 14.9-fold greater potency compared to vinorelbine (Table 1). Table S1). Overall, eribulin IC50s correlated with those for paclitaxel and vinorelbine with high statistical significance across the 100 cells lines ( Figure 1A), although slopes for paclitaxel and vinorelbine were notably shallower compared to eribulin, whose lowest IC50s dipped well below 10 −9 M, thus driving the steeper eribulin slope. The overall order of potency was eribulin > paclitaxel > vinorelbine, with eribulin showing 1.5to 6.7-fold greater potency compared to paclitaxel (means and medians, respectively), and 2.0-to 14.9-fold greater potency compared to vinorelbine (Table 1).   Although IC50s across the 100 cell lines were highly correlated as a whole, some lines individually responded differently to each drug, as shown by analysis of the most and least sensitive quartiles (25 cells lines with lowest and highest IC50s for each drug, respectively; Figure 1B-D). Thus, 5, 9 and 4 cell lines were uniquely associated with the most sensitive quartiles for eribulin, paclitaxel and vinorelbine, respectively, while 11 cell lines were shared by all 3 drugs, and 12 others were variously shared between the 3 remaining 2-drug pairs ( Figure 1E). In the least sensitive quartiles, 7, 9 and 7 cell lines were unique to eribulin, paclitaxel and vinorelbine, respectively, with 12 cell lines shared by all 3 drugs and another 8 variously shared by the 3 remaining 2-drug pairs ( Figure 1F).

Gene Expression Analysis of Most Sensitive versus Least Sensitive Cell Line Quartiles
Baseline expression data for 11,985 genes in each of the 100 CCLE cell lines were obtained from the publicly available DepMap database. Initial filtering was based on ratios of average expression of each gene across the 25 cell lines in the top (most sensitive) versus bottom (least sensitive) quartiles, defined separately for eribulin, paclitaxel and vinorelbine; top and bottom quartiles thus served as conceptual surrogates for responders and nonresponders, respectively. Z-score heat map analysis of gene expression at p < 0.05 stringency revealed visual groupings of genes that were positively or negatively associated with response to all three drugs, to each of the three possible drug pairs, or to each drug uniquely ( Figure 2).    Table S2). Further analysis showed that 342, 228 and 334 of the upregulated genes and 279, 189 and 392 of the downregulated genes were uniquely associated with responses to eribulin, paclitaxel and vinorelbine at p < 0.05, respectively ( Figure 3D,E). With greater stringency at p < 0.0025, 34, 21 and 48 nonoverlapping upregulated genes and 18, 8 and 32 nonoverlapping downregulated genes were uniquely associated with eribulin, paclitaxel and vinorelbine responses, respectively ( Figure 3F,G).

Identification of Unique Fingerprint Genes (UFGs) for Eribulin and Vinorelbine
While initial filtering by expression ratios between most and least sensitive cell line quartiles narrowed potentially relevant genes from many thousands to just a few dozen, this came at the expense of (i) not utilizing gene expression information from the 50 cell lines in the middle two quartiles, (ii) accepting a risk that small numbers of cell lines in the top and bottom quartiles might have outlier expression patterns that could disproportionately influence gene selection, and (iii) failing to provide a basis for identifying drug-specific UFGs based on the informational power available from all 100 cell lines. To address these limitations, a final curation of the 52, 29 and 80 p < 0.0025 genes for eribulin, paclitaxel and vinorelbine, respectively ( Figure 3F,G), was performed to include only drug-unique genes that were at least 2-fold up-or downregulated between the most and least sensitive cell line quartiles (equivalent to absolute value of log 2 [fold-change] ≥ 1). Following this restriction, data from the middle two cell line quartiles were then reintroduced, and each gene in the three subsets was subjected to full linear regression analysis including expression data and IC50s from all 100 cell lines. Only genes with expression showing statistical significance for a given cognate drug but lacking statistical significance for the other two noncognate drugs were designated as UFGs for each drug. This selection process yielded 9 UFGs for eribulin and 13 UFGs for vinorelbine ( Table 2; Supplemental Figure S1). Unexpectedly, no gene, upor downregulated, met these final criteria for designation as a paclitaxel UFG.  Figure S1. 2 No gene met all criteria for paclitaxel, so only eribulin and vinorelbine UFGs are listed here. 3 Genes are listed in alphabetical order within each subgroup. Up and down genes refer to higher and lower gene expression levels, respectively, associated with greater drug response (lower IC50s).

Molecular Interactions Associated with Eribulin and Vinorelbine UFG Sets
We next asked what mechanisms and pathways were associated with eribulin and vinorelbine UFGs. To this end, network propagation [19] analyses were performed to identify molecular networks associated with eribulin and vinorelbine UFGs followed by pathway enrichment analyses to identify the pathways highlighted by the resulting networks. Using the eribulin and vinorelbine UFG sets as query inputs, 100 gene networks (UFG interactomes) were obtained for both drugs ( Figure 4A,B and Supplemental Table S3)-network propagation and Reactome analyses were not performed for paclitaxel since no gene met all criteria for a paclitaxel UFG. These networks included seven of nine eribulin UFGs and 13 of 13 vinorelbine UFGs; two UFGs for eribulin, C5ORF38 and GPR157, were not themselves captured by network propagation. Interestingly, despite sharing mechanistic similarity as MTAs, only four genes overlapped between the eribulin and vinorelbine 100-gene propagated networks (ESR2, HNRNPL, MTNR1B, TRIM25). We speculate that this may relate to the fact that the original UFGs were selected, in part, based on lack of correlations to noncognate drugs.

Reactome Pathways Associated with Eribulin and Vinorelbine Response
To further elucidate the pathways associated with the eribulin and vinorelbine UFG interactomes, the eribulin and vinorelbine 100-gene propagated networks were used as query inputs for Reactome pathway enrichment analyses. The resulting Reactome maps indicate that most pathways for both drugs fall into three main Reactome groupings or 'islands': the Signal Transduction, Immune System and Cell Cycle islands ( Figure 4C,D). While top-level viewing reveals these commonalities, detailed blowups of the three islands show clear differences in the specific pathways identified for the two drugs ( Figure 4E-J). For instance, in the Signal Transduction island, the eribulin UFG interactome is strongly associated with Rho GTPase signaling pathways, while the vinorelbine interactome shows stronger associations with FGFR, EGFR/ERBB and NGF pathways ( Figure 4E,F). Similarly, in the Immune System island, the eribulin UFG interactome shows strong association with Toll-like receptor (TLR) signaling, which notably was not covered at all by vinorelbine, while the vinorelbine UFG interactome shows stronger associations with both adaptive and innate immune branches as well as cytokine signaling ( Figure 4G,H). Finally, while Cell Cycle island pathways were generally more similar for the eribulin and vinorelbine UFG interactomes compared to Signal Transduction and Immune System islands, eribulin alone was associated with chromosome maintenance pathways, while vinorelbine alone was associated with cell cycle checkpoint pathways ( Figure 4I,J). Thus, despite top-level Reactome island commonalities between the two drugs, detailed inspection within the islands themselves reveals significant differences in specific pathway associations.
Cataloging of the individual Reactome pathways identified confirms differences in pathway involvement between the UFGs of the two drugs. Using combined criteria of p < 0.05 significance plus q < 0.1 to account for FDR, 26 and 122 Reactome pathways were identified for eribulin and vinorelbine UFG interactomes, respectively (Table 3); considerable differences in pathways are evident among these. First, only two Reactome pathways were shared: Immune System and Signaling by Interleukins. More notably, TLR-related pathways accounted for 17/26 (65.4%) pathways for eribulin, yet none of the 122 pathways for vinorelbine. Seven of the remaining nine pathways for eribulin involved Rho GTPaseand cell cycle/mitosis-related pathways, which were all but absent for vinorelbine. In contrast, 45 of 122 (36.9%) Reactome pathways for vinorelbine involved FGFR, PI3K/Akt and EGFR/ERBB, yet such pathways were not highlighted by the eribulin UFG interactome. Thus, cataloging Reactome pathways in Table 3 confirms the visual observations of Figure 4 that, despite top-level commonalities seen for the Signal Transduction, Immune System and Cell Cycle islands, individual pathway details point to considerable differences in pathway determinants of eribulin and vinorelbine response, as highlighted by the UFG interactomes.

Reactome Pathways Associated with Eribulin and Vinorelbine Response
To further elucidate the pathways associated with the eribulin and vinorelbine UFG interactomes, the eribulin and vinorelbine 100-gene propagated networks were used as query inputs for Reactome pathway enrichment analyses. The resulting Reactome maps indicate that most pathways for both drugs fall into three main Reactome groupings or 'islands': the Signal Transduction, Immune System and Cell Cycle islands ( Figure 4C,D). While top-level viewing reveals these commonalities, detailed blowups of the three islands show clear differences in the specific pathways identified for the two drugs ( Figure  4E-J). For instance, in the Signal Transduction island, the eribulin UFG interactome is strongly associated with Rho GTPase signaling pathways, while the vinorelbine interactome shows stronger associations with FGFR, EGFR/ERBB and NGF pathways ( Figure  4E,F). Similarly, in the Immune System island, the eribulin UFG interactome shows strong association with Toll-like receptor (TLR) signaling, which notably was not covered at all by vinorelbine, while the vinorelbine UFG interactome shows stronger associations with both adaptive and innate immune branches as well as cytokine signaling ( Figure 4G,H). Finally, while Cell Cycle island pathways were generally more similar for the eribulin and vinorelbine UFG interactomes compared to Signal Transduction and Immune System islands, eribulin alone was associated with chromosome maintenance pathways, while vinorelbine alone was associated with cell cycle checkpoint pathways ( Figure 4I,J). Thus, despite top-level Reactome island commonalities between the two drugs, detailed inspection within the islands themselves reveals significant differences in specific pathway associations.
Cataloging of the individual Reactome pathways identified confirms differences in pathway involvement between the UFGs of the two drugs. Using combined criteria of p <    1 Reactome pathways for eribulin and vinorelbine at p < 0.05 plus q < 0.1 were obtained using the 100 gene networks (Supplemental Table S3) as query inputs. 2 Both spellings, 'signaling' and 'signalling,' are used in the Reactome database. Spellings used here exactly reflect those downloaded from the database to facilitate searching for specific Reactome pathways names. 3 Shared between eribulin and vinorelbine at p < 0.05 plus q < 0.1.

Multivariate Regression (MVR) Model Building to Predict Drug Sensitivities
Finally, multigene MVR models for eribulin and vinorelbine were assembled (not done for paclitaxel since no gene qualified as a paclitaxel UFG) by first calculating each cell line's predicted IC50 based on individual expression of each of the 9 eribulin or 13 vinorelbine UFGs for that cell line, averaging the resulting 9 or 13 predicted IC50s for each cell line, then comparing the averaged predicted IC50s to actual measured IC50s for each cell line for all three drugs. Although no gene met the criteria for assignment as a paclitaxel UFG, predicted IC50s based on expression of eribulin or vinorelbine UFGs are conceptually drug agnostic, so paclitaxel was included as a drug specificity comparator. Not surprisingly, the resulting MVR models built from all 9 or 13 eribulin or vinorelbine UFGs, respectively, successfully predicted cognate drug IC50s with high statistical significance (Supplemental Figure S2). Unexpectedly, however, these two MVR models also showed statistical correlations with one or both noncognate drugs (albeit at lower statistical significance), even though one criterion for each individual UFG was lack of statistical correlation with the two noncognate drugs. Thus, IC50s predicted from the 9 UFG eribulin MVR model correlated with measured IC50s not just for eribulin, but also for paclitaxel and vinorelbine, with p-values of <0.001, 0.022 and 0.021, respectively, while those predicted from the 13 UFG vinorelbine MVR model correlated with measured IC50s for not just vinorelbine, but also for eribulin with p-values of <0.001 and 0.009, respectively; correlation with paclitaxel was not significant (p = 0.097; Supplemental Figure S2). We speculate that acquisition of statistical significance for noncognate drugs in MVR models built from all 9 eribulin or 13 vinorelbine UFGs probably reflects consolidation and statistical strengthening of nonstatistical numeric trends for noncognate drugs that can be seen for several individual UFGs (Supplemental Figure S1).
The use of smaller UFG subsets was then employed to generate MVR models that could predict IC50s for cognate but not noncognate drugs. Prioritizing UFGs by highest correlation coefficients (individual R 2 values) yielded a four-gene eribulin MVR model (C5ORF38, DAAM1, IRX2, CD70) and a five-gene vinorelbine MVR model (EPHA2, NGEF, SEPTIN10, TRIP10, VSIG10); predicted IC50s from both models correlated with measured IC50s across all 100 cell lines with high statistical significance for cognate but not noncognate drugs ( Figure 5A To determine if these MVR models could form the basis for predictive biomarker gene panels, we returned to the starting concept of most versus least sensitive cell line quartiles as surrogates for "likely responders" versus "likely nonresponders," respectively. As shown in Figure 5B,D, both MVR models calculated mean IC50s for the most and least sensitive cell line quartiles with high accuracy relative to actual measured IC50s ( Figure 5B,D).
In summary, an approach defining drug-specific top and bottom cell line quartiles from 100 CCLE cell lines allowed distillation of 11,985 genes down to 52 and 80 drugnonoverlapping p < 0.0025 genes for eribulin and vinorelbine, respectively, which were then further distilled to 9 and 13 UFGs by setting expression ratio and drug specificity thresholds followed by reintroducing data from all 100 cell lines. MVR models comprised of four-and five-gene UFG subsets accurately calculated mean eribulin and vinorelbine IC50s of the original top and bottom cell line quartiles based solely on baseline gene expression levels. We propose that the four-and five-gene MVR panels identified here may form the basis for drug-specific predictive biomarkers for eribulin and vinorelbine response, respectively. To determine if these MVR models could form the basis for predictive biomarker gene panels, we returned to the starting concept of most versus least sensitive cell line quartiles as surrogates for "likely responders" versus "likely nonresponders," respectively. As shown in Figure 5B,D, both MVR models calculated mean IC50s for the most and least sensitive cell line quartiles with high accuracy relative to actual measured IC50s ( Figure 5B,D).
In summary, an approach defining drug-specific top and bottom cell line quartiles from 100 CCLE cell lines allowed distillation of 11,985 genes down to 52 and 80 drugnonoverlapping p < 0.0025 genes for eribulin and vinorelbine, respectively, which were then further distilled to 9 and 13 UFGs by setting expression ratio and drug specificity thresholds followed by reintroducing data from all 100 cell lines. MVR models comprised of four-and five-gene UFG subsets accurately calculated mean eribulin and vinorelbine IC50s of the original top and bottom cell line quartiles based solely on baseline gene expression levels. We propose that the four-and five-gene MVR panels identified here may form the basis for drug-specific predictive biomarkers for eribulin and vinorelbine response, respectively.

Discussion
Despite decades of use in cancer therapy, reliable biomarkers to predict response to

Discussion
Despite decades of use in cancer therapy, reliable biomarkers to predict response to clinically approved MTAs remain lacking. While all MTAs share the top-level mechanism of inhibiting MT dynamics, differences in binding sites on α/β-tubulin monomers, locations of those binding sites in the context of polymerized MTs, net effects as MT stabilizers versus MT destabilizers and demonstrable differences in clinical profiles indicate that despite our extensive knowledge of MTA biochemistry, utilizing such knowledge to create predictive biomarkers remains elusive. To address this, we coupled measured response data from 100 human cancer cell lines from the CCLE with gene expression data for almost 12,000 genes from each cell line to identify molecular and pathway correlates of eribulin response, contrasting results with two other clinically used MTAs, paclitaxel and vinorelbine. Our results identified nine genes (termed UFGs) whose expression is uniquely associated with response to eribulin versus the other two drugs, with four eribulin UFGs being sufficient for a multigene panel that accurately predicts high versus low eribulin response. Separately, 13 UFGs were identified as uniquely associated with vinorelbine response, with 5 of these being sufficient for a predictive vinorelbine multigene panel. Notably, no genes uniquely associated with paclitaxel response were identified using the same stringent criteria used to identify eribulin and vinorelbine UFGs.
In addition to its cytotoxic antimitotic activity [5][6][7], eribulin is unusual among MTAs in that it also exerts a wide range of noncytotoxic effects on the tumor microenvironment, including vascular remodeling or normalization resulting in increased tumor perfusion and mitigation of hypoxia, phenotypic changes including reversal of EMT and induced cellular differentiation, and effects on the tumor immune microenvironment [8][9][10][11][12]22,23]. We hypothesized that the existence of such nonmitotic effects might invoke additional cellular pathways that could provide the basis for a gene expression-based biomarker strategy specific for eribulin.
Responses of 100 CCLE human cancer cell lines to eribulin and comparator MTAs paclitaxel and vinorelbine were first quantified by IC50 values and ordered by response. While sensitivities across the 100 cell lines trended in the same direction for the three drugs, the range of eribulin's response from most to least sensitive was considerably greater than the other two drugs, reflected by a much steeper IC50 slope for eribulin compared to the shallower and parallel slopes for paclitaxel and vinorelbine. Importantly, while IC50s for the least sensitive cell lines were in the same 10 −7 to 10 −6 M range for all three drugs, eribulin's steeper slope resulted primarily from increased responses in its most sensitive cell lines, which collectively dropped into the sub-nM IC50 range. In addition, several nonoverlapping cell lines existed for each drug in their respective top and bottom quartiles, suggesting that aggregating response data across large numbers of cell lines might yield information on drug-specific pathways and mechanisms, a concept first established by the well-known NCI60 cell line screen (https://dtp.cancer.gov/discovery_development/nci-60/; accessed on 26 July 2022). Coincidentally, the NCI60 screen was first used to predict a tubulin-based mechanism for eribulin's natural product parent, halichondrin B [24,25]. We speculated that the enhanced responsiveness of many cell lines to eribulin relative to paclitaxel and vinorelbine resulted from specific upregulation of pathways' governing response, as opposed to general downregulation of resistance pathways that might explain the response to paclitaxel and vinorelbine. Such observations supported the concept that eribulin's unique biology among MTAs might provide a basis for developing a predictive eribulin biomarker strategy.
Using top and bottom quartiles as surrogates for responders and nonresponders, respectively, IC50s were correlated with DepMap gene expression data for 11,985 genes. Z-score heat mapping and filtering on p-value and gene expression ratios between top and bottom quartiles provided the first indications that the three drugs could be distinguished from each other for biomarker development. Notably, at p < 0.0025 stringency, 52 and 80 genes were unique to eribulin and vinorelbine, respectively, with no genes shared between eribulin and paclitaxel, and only nine genes shared between eribulin and vinorelbine. At the same stringency, vinorelbine and paclitaxel shared only one gene. Thus, paclitaxel, a microtubule stabilizer, showed the least similarity with eribulin and vinorelbine, both microtubule destabilizers, yet even eribulin and vinorelbine could be distinguished from each other based on many more genes uniquely associated with each drug compared to shared genes between them.
Further restriction of the p < 0.0025 genes to include only those with at least 2-fold expression difference between top and bottom quartiles, followed by reintroduction of data from the 50 cell lines in the middle two quartiles, resulted in identification of 9 and 13 UFGs for eribulin and vinorelbine, respectively. Interestingly, no gene met these final UFG criteria for paclitaxel. Notably, seven of the nine eribulin UFGs were upregulated with increased response, while 11 of the 13 vinorelbine UFGs were downregulated with increased response, supporting our hypothesis that increased eribulin response may be driven by upregulation of pathways governing sensitivity, whereas increased vinorelbine response may depend more on downregulation of general resistance pathways.
To elucidate biological pathways associated with eribulin and vinorelbine UFGs, network propagation [19] was used to identify networks highlighted by the 9 and 13 UFG sets into expanded 100-gene networks, which were then used as query inputs for Reactome pathway enrichment analyses. Intriguingly, although eribulin and vinorelbine are both microtubule-depolymerizing MTAs, only 4 of 100 genes in the propagated networks for each drug overlapped, likely because assignment as a UFG excluded genes that correlated with response to noncognate drugs. Reactome pathway analyses using the 100-gene propagated networks as query inputs revealed similar top-level pathway commonalities for both drugs within the major Signal Transduction, Immune System and Cell Cycle Reactome islands. Nevertheless, specific pathways identified within these islands differed considerably for the two drugs. Thus, eribulin showed strongest associations with TLRand Rho GTPase-associated signaling pathways, while vinorelbine pathways emphasized FGFR, EGFR/ERBB and both adaptive and innate immune signaling. Cataloging of the 26 and 122 individual Reactome pathways identified for eribulin and vinorelbine UFG interactomes, respectively, confirmed the lack of specific pathway overlaps. For instance, TLR-related pathways accounted for~65% (17/26) of Reactome pathways for eribulin, yet none for vinorelbine. On the other hand,~37% (45/122) of Reactome pathways for vinorelbine involved FGFR, PI3K/Akt and EGFR/ERBB, yet such pathways were completely absent for eribulin. While further work will be required to determine how the identified pathways govern responses to eribulin and vinorelbine, results at the levels of individual UFGs, 100-gene propagated networks and individual Reactome pathways strongly suggest that despite their shared classification as MT depolymerizing MTAs, responses to eribulin and vinorelbine are governed by different genetic and pathway determinants and can be readily distinguished from each other by gene expression patterns.
Finally, subsets of eribulin and vinorelbine UFGs prioritized by highest individual correlation coefficients were used to build drug-specific MVR models that correlate with responses across all 100 cell lines and accurately predict average response levels of the most and least sensitive quartiles based solely on expression levels of only four or five genes, respectively. While others have reported gene or pathway signatures for MTAs, these have been mainly for paclitaxel [26][27][28][29][30][31] and have not, to our knowledge, explored the rigorous drug specificity within MTAs required by our criteria. In this regard, it is noteworthy that no gene met all UFG criteria for paclitaxel, especially considering the importance of cellular signaling to both eribulin and vinorelbine responsiveness. While speculative, paclitaxel may represent a 'purer' cytotoxic based on its internal MT lumen binding site [32,33], which may prevent interactions with microtubule-associated proteins (MAPs) involved in cellular signaling. In contrast, vinorelbine and eribulin bind at or near, respectively, the so-called β-tubulin vinca binding site near MT ends [33][34][35], where such binding may more readily interfere with MAP-mediated signaling events. In this context, eribulin binds almost exclusively to exposed β-tubulin at growing MT plus (+) ends [34,35], and such binding rapidly blocks association of plus-end binding proteins (+TIPS) such as EB1 [36,37] that serve as scaffolds for multiprotein assemblies involved in both structural and signaling activities [38][39][40]. For instance, eribulin binding to MT (+) ends disrupts p130Cas/Src interactions, leading to cortical localization of E-cadherin [41], a hallmark of eribulin's ability to reverse EMT [8]. In addition, both eribulin and vinorelbine binding rapidly inhibits TGFβ-induced Smad signaling by preventing MT-dependent Smad2/3 transport into the nucleus [42]. Thus, binding of both eribulin and vinorelbine to externally accessible sites at or near MT ends compared to paclitaxel's inner MT luminal binding may help explain both the failure to identify drug-specific paclitaxel UFGs as well as the observation that cellular signaling themes dominated for both eribulin and vinorelbine.
Considering the importance of signaling-related Reactome pathways in eribulin and vinorelbine responses, it is reasonable to ask if the four eribulin and five vinorelbine UFGs in the predictive MVR models directly control responses to the two drugs. Strictly speaking, UFG expression levels were only correlated with response, so firm conclusions of direct involvement cannot be drawn. However, with that caveat, information on known roles of the genes permits some speculation. For eribulin, IRX2 is a homeobox gene important in normal embryonic development and has been implicated in cancer, while C5ORF38 is coordinately regulated with IRX2, suggesting both are involved in cellular differentiation and growth regulation [43]. Similarly, DAAM1 is involved in Wnt signaling and early embryonic gastrulation, both processes associated with cellular differentiation [44]. Finally, CD70 codes for a ligand for the immune costimulatory molecule CD27 and thus is involved in the activation and proliferation of T cells, including regulatory T cells [45]. Thus, while speculative, all four of the eribulin UFGs in the predictive MVR model appear related to cell differentiation-, activation-or proliferation-related processes.
For vinorelbine, EPHA2 codes for EPH receptor A2, a tyrosine kinase involved in cancer-related signaling, while NGEF codes for a guanine nucleotide exchange factor associated with signaling from EPH receptor A2, RhoA, Rac1 and CDC42 as well as cellular transformation and tumorigenesis [46,47]. SEPTIN10 is associated with B cell leukemias [48], while TRIP10 is involved in tumorigenesis and cancer progression [49]. Finally, VSIG10 codes for an immunoglobulin-related protein associated with both cell adhesion and macrophage involvement in colonic pathologies including colon cancer [50]. Thus, while solid mechanistic links between drug response and the UFGs in the eribulin and vinorelbine MVR models cannot be established by purely correlational data, the known roles of these genes are consistent with potential mechanisms that set response sensitivity levels to these two drugs.

Conclusions
In summary, we have combined response data for eribulin, paclitaxel and vinorelbine from 100 human cancer cell lines together with gene expression data to identify small numbers of genes (UFGs) that are uniquely associated with eribulin and vinorelbine responses. Reactome pathway analyses based on UFG-seeded propagated gene networks revealed that responses to both eribulin and vinorelbine are dominantly associated with cellular signaling processes as opposed to canonical MT-based antimitotic processes, perhaps also explaining our inability to identify paclitaxel UFGs using the same criteria. Despite top-level dependence on cellular signaling for response to both eribulin and vinorelbine, detailed analyses of the specific Reactome pathways identified reveal considerable discrimination between the two drugs. Finally, we show that small subsets of eribulin and vinorelbine UFGs can be successfully combined into MVR models that accurately predict a high versus low response to the two drugs based on expression of only four or five genes. Our results indicate that further investigation of the genes, pathways and MVR panels identified here is warranted, with further validation in both preclinical tumor models and in the clinical setting being of highest priority. Ultimately, our hope is that the current results together with such future validation work will lead to the development of drug-specific, gene expression-based predictive biomarker panels for eribulin and vinorelbine that support improved therapeutic decision-making in clinical settings.