Network as a Biomarker: A Novel Network-Based Sparse Bayesian Machine for Pathway-Driven Drug Response Prediction

Liu, Qi; Muglia, Louis J.; Huang, Lei Frank

doi:10.3390/genes10080602

by Qi Liu ¹,², Louis J. Muglia ²,³ and Lei Frank Huang ¹,²,⁴,*

¹ Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
² Department of Pediatrics, College of Medicine, University of Cincinnati, Cincinnati, OH 45229, USA
³ Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
⁴ Department of Information Science, School of Mathematical Sciences and LAMA, Peking University, Beijing 100871, China
* Author to whom correspondence should be addressed.

Genes 2019, 10(8), 602; https://doi.org/10.3390/genes10080602

Submission received: 26 June 2019 / Revised: 5 August 2019 / Accepted: 6 August 2019 / Published: 9 August 2019

(This article belongs to the Special Issue Selected Papers from the International Conference on Intelligent Biology and Medicine (ICIBM 2019))

Abstract

With the advances in different biological networks, including gene regulation, gene co-expression, and protein–protein interaction networks, and in approaches for network reconstruction, analysis, and interpretation, it is possible to discover reliable and accurate molecular network-based biomarkers for monitoring cancer treatment. Such efforts will also pave the way toward biomarker-driven personalized medicine against cancer. Previously, we reconstructed disease-specific driver signaling networks using multi-omics profiles and cancer signaling pathway data. In this study, we developed a network-based sparse Bayesian machine (NBSBM) approach that uses these previously derived disease-specific driver signaling networks to predict cancer cell responses to drugs. NBSBM uses the information encoded in a disease-specific (differentially expressed) network to improve its prediction performance in problems with a small amount of training data and a very high-dimensional feature space. Sparsity in NBSBM is favored by a spike-and-slab prior distribution, which is combined with a Markov random field prior that encodes the network of feature dependencies. Gene features that are connected in the network are assumed to be either both relevant or both irrelevant to drug responses. We compared the proposed method with network-based support vector machine (NBSVM) approaches and found that NBSBM achieved much better accuracy than the two network-based SVM methods. The gene modules selected from the disease-specific driver networks for predicting drug sensitivity might be directly involved in drug sensitivity or resistance. This work provides a disease-specific, network-based approach to drug sensitivity prediction and can uncover potential mechanisms of drug action by selecting the most predictive sub-networks from the disease-specific network.

Keywords:

network-based sparse Bayesian machine; disease-specific driver signaling network; drug sensitivity; drug resistance; cancer signaling pathway

1. Introduction

It has been reported that some cancer cells are sensitive to drugs while others are not, and that the same drug can have different efficacy in different cancer cell lines. For example, among 14 lung cancer cell lines, H1666 and Cal12T are sensitive to Dasatinib [1], while the other 12 cell lines (H322, H661, H460, H1568, H226, A549, H522, H2087, H1755, H1395, HCC364, and H2405) are not. Among prostate cell lines, PC3, DU145, HPV10, LNCaP, RWPE1, HPV7, NB26, PWR1E, NB11, and W99 are sensitive to Dasatinib, whereas 22Rv, VcaP, MDAPCa2b, DUCap, and WPMY1 are not. These examples show that different subtypes of lung cancer and prostate cell lines exhibit different sensitivity to Dasatinib. This raises the question of whether the drug sensitivity of a new cancer cell can be predicted from high-throughput gene expression data.

The question above can be treated as a typical supervised machine learning problem. A classifier can be trained on high-throughput gene expression data and the sensitivity labels of cell lines to drugs, and then used to predict drug sensitivities. In previous work, Wang et al. [2] and Huang et al. [3] applied basic t-test methods to find biomarkers of sensitivity or non-sensitivity to targeted therapy and predicted the sensitivities of new cancer cell lines to the drug from their gene expression data. However, they only used gene expression data for classification.

It has been reported that using protein–protein interaction network data as prior information can help distinguish cancer patients from non-cancer patients [4,5,6,7], and that this works better than using only the gene expression data of cancer patients [8,9,10]. However, in high-throughput gene expression data, the number of features d is much larger than the number of samples n, which makes it difficult to construct an optimal classifier. Incorporating signaling transduction pathways into a classifier for such high-dimensional data is a challenge. Rapaport et al. [5] treated the protein–protein interaction data as a graph, performed a spectral decomposition of the gene expression data with respect to the characteristic functions of the graph to obtain frequency features, and then designed an SVM classifier on these features to classify yeasts with or without light radiation. Rather than extracting network features directly, Zhu et al. [6] constructed an SVM classifier on the gene expression data itself and used the network data as a penalty term to classify the status of Parkinson’s patients. Gönen et al. [11] combined kernel-based non-linear dimensionality reduction and binary classification to build a Bayesian algorithm under a multitask learning framework, which can reduce off-target effects and experimental noise. Moreover, Hernández-Lobato et al. [12,13] developed a sparse Bayesian classifier (SBC) for high-throughput data that integrates gene expression data with protein–protein interactions; this differs from SVM classifiers built on gene expression data and showed better results than the network-based SVM classifiers. Additionally, Yang et al. [14] proposed the network-based method NRL2DRP, which predicts drug responses based not only on PPI data but also on the similarity of cell lines, and reached relatively high performance under cross-validation on the GDSC dataset and in comparisons with other methods.

In this study, we propose a new network-based sparse Bayesian machine (NBSBM) method that combines a sparse Bayesian classifier with a graph Laplacian derived from a disease-related signaling network. Previously, we developed several disease-specific driver signaling network identification approaches that identify potential disease-driver networks by integrating the DNA-seq, copy number, RNA-seq, and methylation profiles of cancer patients [9,10,15,16]. We took advantage of these previously identified disease-specific networks and used them as prior information for drug sensitivity or resistance prediction in NBSBM. An expectation propagation strategy was employed to obtain the solution of NBSBM. We then compared the performance of NBSBM with other network-based SVM classifiers. NBSBM demonstrated much better results than the other classifiers. Furthermore, the NBSBM approach is capable of selecting the most predictive networks as biomarkers for drug sensitivity or resistance prediction.

2. Materials and Methods

Sparse Bayesian Classifier Combined with Disease-Specific Network

Specifically, we consider this to be a supervised machine learning problem. The training set $D = \{(x_i, y_i)\}_{i=1}^{n}$ has features $x_i \in \mathbb{R}^{d+1}$, whose zeroth component is equal to 1 and whose remaining components contain information about the gene expression or transcriptional response of cancer cells. The class labels $y_i \in \{-1, 1\}$ represent the phenotype data of the cancer cell response to drugs, where 1 denotes “sensitive” and −1 denotes “resistant”. We aimed to build an optimal linear classifier $\beta = (\beta_0, \beta_1, \dots, \beta_d)$ that uses a specific cancer signaling network as prior information and maximizes the distance between the sensitive and non-sensitive samples. Herbrich et al. [17] considered the existence of a true classifier $\beta_{true}$, which was used to label the data according to the rule $y_i = \mathrm{sign}(\beta_{true}^{T} x_i)$. However, the samples might not be linearly separable, so in the general case we allow for labeling errors; that is, some of the class labels $y_i$ have been flipped with probability $\varepsilon$. Under these assumptions, given $X = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$, and $\varepsilon$, the likelihood is given by Equation (1):

$$p(y \mid \beta, \varepsilon, X) = \prod_{i=1}^{n} p(y_i \mid \beta, \varepsilon, x_i) = \prod_{i=1}^{n} \left[ \varepsilon \left(1 - \Phi(y_i \beta^{T} x_i)\right) + (1 - \varepsilon)\, \Phi(y_i \beta^{T} x_i) \right] \tag{1}$$

where $\Phi$ is the Heaviside step function, defined by $\Phi(y_i \beta^{T} x_i) = \lim_{k \to \infty} \frac{1}{1 + e^{-2k (y_i \beta^{T} x_i)}}$. The likelihood function (1) is robust to outliers because it depends only on the number of errors that $\beta$ makes on the training set and not on the actual size of these errors. In high-throughput gene expression data, $d \gg n$, so $\beta$ can have many different optimal solutions. In this study, we only considered sparse solutions for $\beta$. To this end, we introduce a binary hidden variable $z = (z_0, z_1, z_2, \dots, z_d) \in \{0, 1\}^{d+1}$.
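As a rough numerical illustration of the noisy-label likelihood in Equation (1), the sketch below evaluates its logarithm for a toy data set. The data, the weight vector, and the function names are hypothetical; the value 0.5 for the step function at exactly zero follows the limit definition given above.

```python
import math

def heaviside(t):
    """Step function Phi: 1 for t > 0, 0 for t < 0, and 0.5 at t == 0
    (the value obtained from the limit definition)."""
    if t > 0:
        return 1.0
    if t < 0:
        return 0.0
    return 0.5

def log_likelihood(beta, eps, X, y):
    """Log of Equation (1): a sample contributes log(1 - eps) when beta
    classifies it correctly and log(eps) when it is misclassified."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(b * x for b, x in zip(beta, x_i))
        phi = heaviside(margin)
        total += math.log(eps * (1.0 - phi) + (1.0 - eps) * phi)
    return total

# Hypothetical toy data: the zeroth component of every x_i is the constant 1.
X = [[1.0, 2.0, 0.5], [1.0, -1.5, 0.3], [1.0, 0.2, -2.0], [1.0, -0.1, 1.0]]
y = [1, -1, -1, 1]
beta = [0.0, 1.0, 0.0]   # only the first gene feature is used
eps = 0.1                # assumed label-flip probability

ll = log_likelihood(beta, eps, X, y)
# Rescaling beta changes the size of the margins but not the number of errors.
ll_scaled = log_likelihood([100.0 * b for b in beta], eps, X, y)
```

Here two samples are classified correctly and two are not, so the log-likelihood is 2 log(0.9) + 2 log(0.1), and it is unchanged when β is rescaled, which is the robustness property described above.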

The component $z_i$ takes the value 0 if the $i$th component of $\beta_{true}$ is 0, and the value 1 otherwise. Given $z$, the prior density of $\beta$ is

$$p(\beta \mid z) = \prod_{i=0}^{d} p(\beta_i \mid z_i) = \prod_{i=0}^{d} \left[ N(\beta_i, 0, \sigma_i^{2})^{z_i}\, \delta(\beta_i)^{1 - z_i} \right] \tag{2}$$
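A minimal sketch of how a draw from the prior in Equation (2) could look: coefficients whose indicator is 1 are drawn from a zero-mean Gaussian, and coefficients whose indicator is 0 are set exactly to zero. The indicator vector, the standard deviations, and the function name are hypothetical values chosen for illustration.

```python
import random

def sample_beta_given_z(z, sigma, rng):
    """Draw beta from the prior of Equation (2): beta_i ~ N(0, sigma_i^2)
    when z_i == 1, and beta_i = 0 exactly when z_i == 0."""
    return [rng.gauss(0.0, s) if z_i == 1 else 0.0 for z_i, s in zip(z, sigma)]

rng = random.Random(0)        # fixed seed so the example is reproducible
z = [1, 0, 1, 0, 0, 1]        # hypothetical indicators (index 0 is the bias term)
sigma = [1.0] * len(z)        # assumed standard deviations
beta = sample_beta_given_z(z, sigma, rng)

# Indices of the non-zero coefficients, i.e. the selected features.
selected = [i for i, b in enumerate(beta) if b != 0.0]
```

The exact zeros are what make the resulting classifier sparse: only the features listed in `selected` contribute to the decision function.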

In Equation (2), $p(\beta_i \mid z_i)$ is a spike-and-slab prior, that is, a mixture of a Gaussian density (the slab) and a point probability mass placed at zero (the spike). $N(\beta_i, 0, \sigma_i^{2})$ denotes a Gaussian density with mean 0 and variance $\sigma_i^{2}$, and $\delta(\beta_i)$ is an impulse function that places probability 1 at $\beta_i = 0$ and 0 elsewhere. To complete the specification of the prior for $\beta$, we assume that a network encoding the dependencies between the gene features is known. This network is a specific cancer signaling network $G = (V, E)$, whose vertices $V = \{0, 1, \dots, d\}$ correspond to the proteins and whose edges $E$ link features that are expected to uncover the potential mechanistic differences between drug-resistant and drug-sensitive samples. Equation (3) gives the prior density for $z$ given $G$, which is a Markov random field (MRF) model:

$$p(z \mid G, \lambda, \gamma) = \frac{1}{Z} \exp\left(c z_0 + \lambda \sum_{i=1}^{d} z_i\right) \exp\left(\gamma \sum_{\{u, v\} \in E} \left(\frac{z_u}{\sqrt{d_u}} - \frac{z_v}{\sqrt{d_v}}\right)^{2} \omega(u, v)\right) \tag{3}$$

In Equation (3), $Z$ is a normalization constant and $\lambda \in \mathbb{R}$ controls the sparsity. The parameter $\gamma \geq 0$ weights the sum of squared differences between $z_u$ and $z_v$ for vertices that are linked in the input network $G$, $\omega(u, v)$ is the weight of the edge between proteins $u$ and $v$, and $d_u$ is the degree of vertex $u$. If we define

$$L(u, v) = \begin{cases} 1 - \dfrac{\omega(u, v)}{d_u}, & \text{if } u = v \text{ and } d_u \neq 0, \\ -\dfrac{\omega(u, v)}{\sqrt{d_u d_v}}, & \text{if } u \text{ and } v \text{ are adjacent}, \\ 0, & \text{otherwise}, \end{cases}$$

then

$$p(z \mid G, \lambda, \gamma) = \frac{1}{Z} \exp\left(c z_0 + \lambda |z|\right) \exp\left(\gamma z^{T} L z\right) \tag{4}$$
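The step from the edge sum in Equation (3) to the quadratic form in Equation (4) can be checked numerically. The sketch below builds the matrix L for a small hypothetical weighted graph and compares the two expressions. It assumes that there are no self-loops and that the degree of a vertex is the sum of its incident edge weights; the graph, the weights, and the function names are illustrative only.

```python
import math

def normalized_laplacian(n, weighted_edges):
    """Build L(u, v) as defined before Equation (4) for an undirected graph.
    Assumes no self-loops, so the diagonal entry is 1 whenever d_u != 0."""
    deg = [0.0] * n
    for u, v, w in weighted_edges:
        deg[u] += w
        deg[v] += w
    L = [[0.0] * n for _ in range(n)]
    for u in range(n):
        if deg[u] != 0.0:
            L[u][u] = 1.0
    for u, v, w in weighted_edges:
        value = -w / math.sqrt(deg[u] * deg[v])
        L[u][v] += value
        L[v][u] += value
    return L, deg

def quadratic_form(L, z):
    """Compute z^T L z."""
    n = len(z)
    return sum(z[i] * L[i][j] * z[j] for i in range(n) for j in range(n))

def edge_sum(weighted_edges, deg, z):
    """Compute the sum over edges that appears in Equation (3)."""
    return sum(
        w * (z[u] / math.sqrt(deg[u]) - z[v] / math.sqrt(deg[v])) ** 2
        for u, v, w in weighted_edges
    )

# Hypothetical 4-node network given as (u, v, weight) triples.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.0), (1, 3, 0.5)]
L, deg = normalized_laplacian(4, edges)
z = [1, 1, 0, 1]

lhs = quadratic_form(L, z)
rhs = edge_sum(edges, deg, z)
```

For this example both `lhs` and `rhs` evaluate to approximately 1.4945, which illustrates why the pairwise term of Equation (3) can be written compactly as a quadratic form in z.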

If the sum of squared differences $\left(\frac{z_u}{\sqrt{d_u}} - \frac{z_v}{\sqrt{d_v}}\right)^{2}$ is small, the corresponding components of $z$ will be small, and a smaller solution for $z$ leads to a sparser solution for $\beta$, which helps to avoid overfitting. Furthermore, we assume a Beta prior for $\varepsilon$, $p(\varepsilon) = \mathrm{Beta}(\varepsilon, a_0, b_0) = \frac{1}{B(a_0, b_0)} \varepsilon^{a_0 - 1} (1 - \varepsilon)^{b_0 - 1}$, where $B(a_0, b_0)$ is the beta function with parameters $a_0$ and $b_0$. Under these assumptions, we can use Bayes’ theorem to compute the posterior distribution of the model parameters $\beta$ and $\varepsilon$ given the training data $X$ and $y$. Given the specific cancer signaling network $G$ and the model hyperparameters $\lambda$ and $\gamma$, the posterior is given by

$$p(\beta, \varepsilon \mid y, X, G, \lambda, \gamma) = \frac{\sum_{z} p(y \mid \beta, \varepsilon, X)\, p(\beta \mid z)\, p(z \mid G, \lambda, \gamma)\, p(\varepsilon)}{p(y \mid X, G, \lambda, \gamma)} \tag{5}$$
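As a small worked illustration of the Beta prior on ε that enters Equation (5), the sketch below evaluates the density and its mean for a_0 = 1 and b_0 = 8, the values reported for the experiments in Section 3.1. The function names are hypothetical, and the density is the standard Beta density.

```python
import math

def beta_pdf(eps, a, b):
    """Standard Beta(a, b) density at eps, for 0 < eps < 1."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(eps) + (b - 1) * math.log(1 - eps) - log_norm)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

a0, b0 = 1, 8
prior_mean_noise = beta_mean(a0, b0)   # equals 1/9
density_at_10_percent = beta_pdf(0.1, a0, b0)

# Midpoint-rule check that the density integrates to about 1.
steps = 10000
integral = sum(beta_pdf((k + 0.5) / steps, a0, b0) for k in range(steps)) / steps
```

With these hyperparameters the prior mean of the label noise is 1/9, so the prior places most of its mass on small flip probabilities.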

The joint probability distribution of the model parameters, hidden variables, and labels is given as follows:

$$p(\beta, \varepsilon, z, y \mid X, G, \lambda, \gamma) = p(y \mid \beta, \varepsilon, X)\, p(\beta \mid z)\, p(z \mid G, \lambda, \gamma)\, p(\varepsilon) \tag{6}$$

In Equation (5), the denominator is a normalization constant. Given a new unclassified sample $x^{test}$, we can determine its class label $y^{test}$ from the predictive probability shown in Equation (7):

$$p(y^{test} \mid x^{test}, y, X, G, \lambda, \gamma) = \iint p(y^{test} \mid \beta, \varepsilon, x^{test})\, p(\beta, \varepsilon \mid y, X, G, \lambda, \gamma)\, d\beta\, d\varepsilon \tag{7}$$

Under these Bayesian assumptions, we can also estimate the average noise in the class labels as $E[\varepsilon] = \iint \varepsilon\, p(\beta, \varepsilon \mid y, X, G, \lambda, \gamma)\, d\beta\, d\varepsilon$. Because the integrals and summations in the above equations are difficult to calculate directly, we perform approximate Bayesian inference for the posterior distribution using an expectation propagation (EP) algorithm [18]. The detailed implementation of the EP algorithm for parameter estimation in NBSBM is available in the Supplementary Materials.
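To make the predictive rule of Equation (7) concrete, the sketch below replaces the integral with a plain average over a handful of posterior draws of (β, ε). This is a Monte Carlo stand-in chosen for illustration, not the EP algorithm used in the paper, and the draws themselves are hypothetical numbers rather than fitted values.

```python
def predictive_prob_sensitive(x_test, posterior_draws):
    """Approximate p(y_test = +1 | x_test, data) by averaging the per-sample
    likelihood of Equation (1) over draws (beta, eps)."""
    total = 0.0
    for beta, eps in posterior_draws:
        score = sum(b * x for b, x in zip(beta, x_test))
        phi = 1.0 if score > 0 else (0.5 if score == 0 else 0.0)
        total += eps * (1.0 - phi) + (1.0 - eps) * phi
    return total / len(posterior_draws)

def expected_noise(posterior_draws):
    """Approximate E[eps] by the average of the drawn eps values."""
    return sum(eps for _, eps in posterior_draws) / len(posterior_draws)

# Hypothetical posterior draws: (beta, eps) pairs with sparse beta vectors.
draws = [
    ([0.0, 1.0, 0.0], 0.10),
    ([0.0, 0.8, -0.2], 0.20),
    ([0.1, 0.0, -1.0], 0.15),
]
x_test = [1.0, 0.5, 1.0]   # zeroth component is the constant 1

p_sensitive = predictive_prob_sensitive(x_test, draws)
label = 1 if p_sensitive >= 0.5 else -1
avg_noise = expected_noise(draws)
```

In this toy case two of the three draws vote for "sensitive", giving a predictive probability of about 0.617 and an average label noise of 0.15.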

3. Results

3.1. Prediction of Sensitivity and Resistance of Prostate Cancer Cell Lines to Dasatinib

In Wang’s work [2], the sensitivity data of 16 prostate cancer cell lines to Dasatinib were provided. Eleven cell lines with half maximal inhibitory concentration (IC₅₀) values lower than 200 nM were designated as Dasatinib-sensitive, and five cell lines with IC₅₀ values larger than 200 nM were designated as Dasatinib-resistant. Previously, we reconstructed a prostate cancer-specific network [15] using multiple genomic and epigenomic data of prostate cancer patients from TCGA. This prostate cancer-specific network contains 48 differentially expressed subnetworks (gene modules), 6738 genes, and 26,845 edges. Our goal was to predict the drug sensitivity of these 16 prostate cell lines from their gene expression data and the prostate cancer-specific network using NBSBM. In this study, we set $a_0 = 1$ and $b_0 = 8$ in NBSBM. For the parameter $\lambda$ ($\gamma$), we took 500 evenly spaced values from $[e^{-5}, e^{2}]$ ($[e^{-5}, e^{1}]$) and selected the value that achieved the lowest error rate on the training dataset. We used cross-validation (5-fold, 5 repeats) to evaluate the performance of the proposed NBSBM, the network-based support vector machine (NBSVM) [5], the support vector machine with reweighted recursive feature elimination (SVM-RRFE) [19], and the sparse Bayesian classifier (SBC) [13] on this dataset. We obtained the ROC curve for each algorithm from the average true positive rate and average false positive rate over the cross-validation process. Figure 1 shows the ROC curves and AUC results of the four classifiers; our method performed better than all of the other approaches. We also evaluated the differences in the predictive power of these methods with the paired Wilcoxon rank-sum test. The results show that NBSBM achieved better results than the other two SVM-based approaches in terms of average AUC according to the Wilcoxon test with p < 0.01 (Figure 4a).
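The evaluation protocol described above can be sketched with the standard library alone. The code below builds a 500-point hyperparameter grid, generates 5-fold splits repeated 5 times for 11 sensitive and 5 resistant cell lines, and computes an AUC from classifier scores. It assumes that "evenly" means linearly spaced and that the folds are stratified by class; neither detail is stated in the text, and all function names are hypothetical.

```python
import math
import random

def even_grid(lo, hi, n):
    """Return n linearly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def repeated_stratified_folds(labels, k, repeats, seed=0):
    """Return `repeats` partitions of the sample indices into k folds,
    distributing each class across the folds as evenly as possible."""
    rng = random.Random(seed)
    all_splits = []
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            for pos, i in enumerate(idx):
                folds[pos % k].append(i)
        all_splits.append(folds)
    return all_splits

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

lambda_grid = even_grid(math.exp(-5), math.exp(2), 500)
gamma_grid = even_grid(math.exp(-5), math.exp(1), 500)

labels = [1] * 11 + [-1] * 5          # 11 sensitive, 5 resistant cell lines
splits = repeated_stratified_folds(labels, k=5, repeats=5, seed=0)
```

In use, each fold would serve once as the test set while a classifier is fitted on the remaining folds, and `auc` would be applied to the held-out scores; averaging over the 25 test folds gives the kind of mean AUC reported in the figures.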

3.2. Prediction of Sensitivity and Resistance of Breast Cancer Patients to Tamoxifen

Dataset GSE17705 [20], available in the Gene Expression Omnibus (GEO), contains both the gene expression data of 103 estrogen receptor-positive breast cancer patients and their survival time after Tamoxifen treatment. We divided these patients into a Tamoxifen-sensitive group and a Tamoxifen-non-sensitive group according to their median survival time: patients who survived longer than the median survival time were designated as Tamoxifen-sensitive, and all others as Tamoxifen-non-sensitive. Next, we employed NBSBM to predict the response of the estrogen receptor-positive breast cancer patients to Tamoxifen treatment, using a previously reconstructed estrogen receptor-positive breast cancer-specific network [16] and the gene expression data of those 103 breast cancer patients. The estrogen receptor-positive breast cancer-specific network is highly interconnected and contains 15 differentially expressed gene modules, 923 genes, and 10,073 edges. The 103 estrogen receptor-positive breast cancer patients could be accurately classified by the proposed sparse Bayesian machine. We compared the proposed approach with NBSVM and SVM-RRFE. Figure 2 shows the ROC curves and AUC results of the classifiers individually. We found that NBSBM performed better than the other methods. We also evaluated the differences in the predictive power of these methods with the Wilcoxon rank-sum test. NBSBM achieved better results than the other two SVM-based approaches in terms of average AUC according to the Wilcoxon test with p < 0.05 (Figure 4b).
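The median-split labeling described above can be written in a few lines. The survival times below are hypothetical values in months, not data from GSE17705, and the function name is illustrative.

```python
import statistics

def label_by_median_survival(survival_times):
    """Label a patient +1 (sensitive) if survival exceeds the cohort median,
    and -1 (non-sensitive) otherwise."""
    cutoff = statistics.median(survival_times)
    labels = [1 if t > cutoff else -1 for t in survival_times]
    return labels, cutoff

# Hypothetical survival times (months) for five patients.
times = [12, 30, 45, 60, 84]
labels, cutoff = label_by_median_survival(times)
```

Note that a patient whose survival equals the median falls into the non-sensitive group under this rule, which matches the "longer than the median" wording in the text.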

3.3. Prediction of Sensitivity and Resistance of Various Cancer Cells to Dasatinib

The Genomics of Drug Sensitivity in Cancer (GDSC) database [21,22] contains the gene expression data of 789 cancer cell lines and provides sensitivity data of various cancer cell lines to drugs from in vitro drug screening experiments. Herein, we used the sparse Bayesian classifier to predict the sensitivity of cancer cell lines to Dasatinib based on the gene expression data of the 319 cancer cells and an integrated cancer signaling network from our previous work [9]. The integrated human cancer signaling pathways (IHSP) consisted of previously published human cancer signaling pathways [23,24,25,26] and the BioCarta [27] and KEGG [28] databases. There are 7564 genes and 58,932 edges in IHSP. Figure 3 shows the ROC curve results of the classifiers. Our method performed better than the other two SVM algorithms in terms of average AUC according to the Wilcoxon rank-sum test with p < 0.05 (Figure 4c).
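The AUC comparisons in this section rely on the Wilcoxon rank-sum test. A minimal sketch of that test is given below, using the normal approximation without a tie correction in the variance; the paper does not say which variant or software was used, so this is an assumption, and the AUC values are hypothetical.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test with a normal approximation.
    Returns (W, z, p), where W is the rank sum of sample a.
    Ties receive average ranks; the variance is not tie-corrected."""
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[t] = average_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n1, n2 = len(a), len(b)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p

# Hypothetical per-repeat AUC values for two classifiers.
auc_method_a = [0.91, 0.88, 0.93, 0.90, 0.95]
auc_method_b = [0.70, 0.75, 0.72, 0.68, 0.80]
w, z, p = rank_sum_test(auc_method_a, auc_method_b)
```

With samples this small an exact test would normally be preferred over the normal approximation; the sketch is only meant to show how the rank sum and its p-value are obtained.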

4. Discussion

A spike-and-slab prior distribution combined with a Markov random field (MRF) prior was used to build a sparse model in the proposed network-based sparse Bayesian machine (NBSBM). Under this sparsity assumption, better results can be achieved if prior information about gene-to-gene relationships is available from a disease-specific network. A disease-specific (differentially expressed) network was encoded in this prior information, that is, in the MRF prior, to improve the prediction performance of NBSBM. Note that the Bayesian classifier proposed in this article is capable of feature selection. In Supplementary Tables S1 and S2, we list the top relevant features (genes) and pathways that can predict prostate cancer cell responses to Dasatinib. For the top-ranked genes or pathways that have been reported to play important roles in prostate cancer development and progression, see Section 2 of the Supplementary Materials. That is, we can derive network-based biomarkers for drugs, namely the highly predictive gene modules (features) from the disease-specific signaling network. We can then predict the sensitivity of new cancer cells to drugs from the gene expression data of these network biomarkers alone, which might also help to explore the molecular pathogenesis of specific diseases. Furthermore, these network-based biomarkers might directly contribute to drug sensitivity or resistance. In addition to its application to cancer therapeutics, our approach should be useful for predicting drug sensitivity in many common complex diseases.

5. Conclusions

In this article, we proposed a sparse Bayesian machine to predict the sensitivity of cancer cells to drugs using gene expression data and disease-specific signaling networks. The Bayesian classifier systematically integrates specific cancer signaling pathways with high-throughput gene expression data and employs an expectation propagation strategy to find a sparse solution. We compared the performance of NBSBM with other network-based SVM methods, using cross-validation on three different pharmacological datasets. The results showed that the proposed algorithm performed much better than the other two methods, warranting further studies in individual cancer patients to predict personalized cancer treatments.

Supplementary Materials

The following are available online at https://www.mdpi.com/2073-4425/10/8/602/s1, Table S1: Top-25 most predictive genes for classifying prostate cancer cell responses to Dasatinib; Table S2: The most enriched signaling pathways in those top-100 ranked genes that are most relevant to prostate cancer cell response to Dasatinib.

Author Contributions

Conceptualization, L.F.H.; Methodology, L.F.H.; Software, L.F.H.; Formal analysis, L.F.H.; Investigation, L.F.H.; Resources, L.F.H.; Data curation, L.F.H.; Writing—original draft preparation, L.F.H., Q.L., and L.J.M.; Writing—review and editing, L.F.H., Q.L., and L.J.M.; Visualization, L.F.H.; Supervision, L.F.H.; Project administration, L.F.H.; Funding acquisition, L.F.H.

Funding

This work was supported by the CancerFree KIDS foundation (to L.F.H.) and Lei Frank Huang’s start-up funding from the Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cancer and Blood Disease Institute, Cincinnati Children’s Hospital Medical Center.

Acknowledgments

The authors would like to thank the anonymous reviewers for their helpful and constructive comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Sen, B.; Peng, S.; Tang, X.; Erickson, H.S.; Galindo, H.; Mazumdar, T.; Stewart, D.J.; Wistuba, I.; Johnson, F.M. Kinase-Impaired BRAF Mutations in Lung Cancer Confer Sensitivity to Dasatinib. Sci. Transl. Med. 2012, 4, 136ra70.
2. Wang, X.-D.; Reeves, K.; Luo, F.R.; Xu, L.-A.; Lee, F.; Clark, E.; Huang, F. Identification of candidate predictive and surrogate molecular markers for dasatinib in prostate cancer: Rationale for patient selection and efficacy monitoring. Genome Biol. 2007, 8, R255.
3. Huang, F.; Reeves, K.; Han, X.; Fairchild, C.; Platero, S.; Wong, T.W.; Lee, F.; Shaw, P.; Clark, E. Identification of Candidate Molecular Markers Predicting Sensitivity in Solid Tumors to Dasatinib: Rationale for Patient Selection. Cancer Res. 2007, 67, 2226–2238.
4. Li, C.; Li, H. Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics 2008, 24, 1175–1182.
5. Rapaport, F.; Zinovyev, A.; Dutreix, M.; Barillot, E.; Vert, J.-P. Classification of microarray data using gene networks. BMC Bioinform. 2007, 8, 35.
6. Zhu, Y.; Shen, X.; Pan, W. Network-based support vector machine for classification of microarray samples. BMC Bioinform. 2009, 10, S21.
7. Chuang, H.Y.; Lee, E.; Liu, Y.T.; Lee, D.; Ideker, T. Network-based classification of breast cancer metastasis. Mol. Syst. Biol. 2007, 3, 140.
8. Chen, L.; Xuan, J.; Riggins, R.B.; Wang, Y.; Clarke, R. Identifying protein interaction subnetworks by a bagging Markov random field-based method. Nucleic Acids Res. 2013, 41, e42.
9. Huang, L.; Garrett Injac, S.; Cui, K.; Braun, F.; Lin, Q.; Du, Y.; Zhang, H.; Kogiso, M.; Lindsay, H.; Zhao, S.; et al. Systems biology–based drug repositioning identifies digoxin as a potential therapy for groups 3 and 4 medulloblastoma. Sci. Transl. Med. 2018, 10, eaat0150.
10. Wang, J.; Zhao, X.; Qi, J.; Yang, C.; Cheng, H.; Ren, Y.; Huang, L. Eight proteins play critical roles in RCC with bone metastasis via mitochondrial dysfunction. Clin. Exp. Metastasis 2015, 32, 605–622.
11. Gönen, M.; Margolin, A.A. Drug susceptibility prediction against a panel of drugs using kernelized Bayesian multitask learning. Bioinformatics 2014, 30, i556–i563.
12. Hernández-Lobato, D.; Hernández-Lobato, J.M. Bayes Machines for binary classification. Pattern Recognit. Lett. 2008, 29, 1466–1473.
13. Hernández-Lobato, J.M.; Hernández-Lobato, D.; Suárez, A. Network-based sparse Bayesian classification. Pattern Recognit. 2011, 44, 886–900.
14. Yang, J.; Li, A.; Li, Y.; Guo, X.; Wang, M. A novel approach for drug response prediction in cancer cell lines via network representation learning. Bioinformatics 2018, 35, 1527–1535.
15. Huang, L.; Brunell, D.; Stephan, C.; Mancuso, J.; Yu, X.; He, B.; Thompson, T.C.; Zinner, R.; Kim, J.; Davies, P.; et al. Driver Network as a Biomarker: Systematic integration and network modeling of multi-omics data to derive driver signaling pathways for drug combination prediction. Bioinformatics 2019.
16. Huang, L.; Li, F.; Sheng, J.; Xia, X.; Ma, J.; Zhan, M.; Wong, S.T.C. DrugComboRanker: Drug combination discovery based on target network analysis. Bioinformatics 2014, 30, i228–i236.
17. Herbrich, R.; Graepel, T.; Campbell, C. Bayes point machines. J. Mach. Learn. Res. 2001, 1, 245–279.
18. Minka, T.P. Expectation propagation for approximate Bayesian inference. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, Seattle, WA, USA, 2–5 August 2001; pp. 362–369.
19. Fröhlich, H.; Sültmann, H.; Brase, J.C.; Johannes, M.; Fälth, M.; Gehrmann, M.; Gade, S.; Beißbarth, T. Integration of pathway knowledge into a reweighted recursive feature elimination approach for risk stratification of cancer patients. Bioinformatics 2010, 26, 2136–2144.
20. Symmans, W.F.; Hatzis, C.; Sotiriou, C.; Andre, F.; Peintinger, F.; Regitnig, P.; Daxenbichler, G.; Desmedt, C.; Domont, J.; Marth, C.; et al. Genomic Index of Sensitivity to Endocrine Therapy for Breast Cancer. J. Clin. Oncol. 2010, 28, 4111–4119.
21. Garnett, M.J.; Edelman, E.J.; Heidorn, S.J.; Greenman, C.D.; Dastur, A.; Lau, K.W.; Greninger, P.; Thompson, I.R.; Luo, X.; Soares, J.; et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 2012, 483, 570.
22. Benes, C.; Haber, D.A.; Beare, D.; Edelman, E.J.; Lightfoot, H.; Thompson, I.R.; Smith, J.A.; Soares, J.; Stratton, M.R.; Bindal, N.; et al. Genomics of Drug Sensitivity in Cancer (GDSC): A resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2012, 41, D955–D961.
23. Awan, A.; Bari, H.; Yan, F.; Moksong, S.; Yang, S.; Chowdhury, S.; Cui, Q.; Yu, Z.; Purisima, E.O.; Wang, E. Regulatory network motifs and hotspots of cancer genes in a mammalian cellular signalling network. IET Syst. Biol. 2007, 1, 292–297.
24. Cui, Q.; Ma, Y.; Jaramillo, M.; Bari, H.; Awan, A.; Yang, S.; Zhang, S.; Liu, L.; Lu, M.; O’Connor-McCourt, M.; et al. A map of human cancer signaling. Mol. Syst. Biol. 2007, 3, 152.
25. Li, L.; Tibiche, C.; Fu, C.; Kaneko, T.; Moran, M.F.; Schiller, M.R.; Li, S.S.; Wang, E. The human phosphotyrosine signaling network: Evolution and hotspots of hijacking in cancer. Genome Res. 2012, 22, 1222–1230.
26. Newman, R.H.; Hu, J.; Rho, H.S.; Xie, Z.; Woodard, C.; Neiswinger, J.; Cooper, C.; Shirley, M.; Clark, H.M.; Hu, S.; et al. Construction of human activity-based phosphorylation networks. Mol. Syst. Biol. 2013, 9, 655.
27. Nishimura, D. BioCarta. Biotech. Softw. Internet Rep. Comput. Softw. J. Sci. 2001, 2, 117–120.
28. Kanehisa, M.; Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 2000, 28, 27–30.

Figure 1. Performance comparison of the network-based sparse Bayesian machine (NBSBM) with other methods in terms of the average (mean) receiver operating characteristic (ROC) curve (5-fold cross-validation, 5 repeats) and AUC value. The boxplot indicates the variation around the average ROC curve and reports the median and the interquartile range. ROC curves of (a) network-based SVM, (b) the proposed approach, (c) SVM-RRFE, and (d) sparse Bayesian classifier (SBC) to classify the response of 16 prostate cancer cell lines to Dasatinib.

Figure 2. Performance comparison of the NBSBM with other methods in terms of the average (mean) receiver operating characteristic (ROC) curve (5-fold cross-validation, 5 repeats) and AUC value. The boxplot indicates the variation around the average ROC curve and reports the median and the interquartile range. ROC curves of (a) network-based SVM, (b) the proposed approach, (c) SVM-RRFE, and (d) sparse Bayesian classifier (SBC) to predict the response of estrogen receptor-positive breast cancer patients to Tamoxifen.

Figure 3. Performance comparison of the NBSBM with other methods in terms of the average (mean) receiver operating characteristic (ROC) curve (5-fold cross-validation, 5 repeats) and AUC value. The boxplot indicates the variation around the average ROC curve and reports the median and the interquartile range. ROC curves of (a) network-based SVM, (b) the proposed approach, (c) SVM-RRFE, and (d) sparse Bayesian classifier (SBC) to classify the response of estrogen receptor-positive breast cancer patients to Tamoxifen.

Figure 4. Performance comparison among NBSBM, network-based SVM, and SVM-RRFE in terms of average AUC in predicting (a) prostate cells’ response to Dasatinib, (b) breast cancer patients’ response to Tamoxifen therapy, and (c) 789 cancer cells’ response to Dasatinib. The Wilcoxon rank-sum test was used to examine whether the AUCs obtained by two approaches were different.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
