Ovarian cancer, one of the most fatal gynecologic malignancies, is a global burden with 295,414 new cases and 184,799 deaths estimated each year [1]. It is the fifth leading cause of female cancer-related deaths in the United States [2]. The predominant histologic type is high-grade serous ovarian carcinoma (HGSOC) [3], for which aggressive cytoreductive surgery followed by taxane- and platinum-based chemotherapy is an established standard of care [4]. Although the initial response rate is high, patients with HGSOC, especially those at advanced stages, eventually experience relapse [6]. In the era of precision medicine, it is important to identify biomarkers that accurately predict the prognosis of HGSOC to facilitate personalized treatment.
Mass spectrometry (MS)-based proteomics has been widely used to characterize molecular components and underlying mechanisms associated with various malignancies such as colorectal [7], breast [8], lung [9], and ovarian cancers [10]. Currently, this emerging technology is used for high-throughput analysis, allowing simultaneous quantification of numerous proteins in individual samples and discovery of prognostic biomarkers. The Clinical Proteomic Tumor Analysis Consortium (CPTAC) provides proteogenomic insights into HGSOC by performing extensive proteomic profiling and correlating results with data contained in The Cancer Genome Atlas (TCGA) database [10]. However, these biomarkers need to be subjected to specific validation studies before being clinically applied.
In this study, we performed label-free quantitative proteomic analysis of chemotherapy-naïve, fresh-frozen primary ovarian cancer tissues to elucidate prognostic protein biomarkers of HGSOC. We then validated our findings via immunohistochemical (IHC) staining in an independent dataset.
In the current study, we performed label-free liquid chromatography-mass spectrometry (LC-MS/MS)-based proteomic analysis on chemotherapy-naïve, fresh-frozen primary HGSOC tissues. Upon validation with complementary IHC staining for FFPE HGSOC tissue specimens, we identified six protein biomarkers to predict the prognosis of HGSOC; expression levels of AAT, NFKB, PMVK, VAP1, FABP4, and PF4 in ovarian cancer tissue were associated with PFS.
AAT, encoded by SERPINA1 in humans, is a serine protease inhibitor that influences tumor behavior depending on the context and/or cancer type. Consistent with our results, enrichment of SERPINA1 mRNA was associated with a good prognosis in HGSOC [16]; however, no association was observed between survival of patients with HGSOC and AAT expression levels, as assessed via IHC staining [17]. Two previous studies evaluated serum AAT levels in patients with epithelial ovarian cancer and reported that AAT may contribute to a differential diagnosis and help predict chemoresistance [18
NFKB is constitutively activated in several cancers. The p100 subunit of NFKB, a precursor of the active p52 subunit, has been suggested to counteract the tumorigenic effects of p52 in breast cancer [20]. In ovarian cancer, NFKB p52 promoted cancer progression, resulting in an unfavorable prognosis [21]. To our knowledge, our study is the first to report the association between high expression of NFKB p100 and improved PFS in patients with HGSOC.
Furthermore, this study shows that PMVK, an enzyme involved in cholesterol synthesis and lipid metabolism, can be considered a novel prognostic biomarker for HGSOC. In estrogen receptor-positive breast cancer, high expression of the PMVK gene was positively associated with responses to chemotherapeutic agents [22]. Similarly, herein, high expression of PMVK, assessed via IHC staining, was significantly associated with platinum sensitivity and improved survival of patients with HGSOC.
VAP1, an adhesion molecule mediating interactions between various inflammatory and endothelial cells, may be associated with tumor invasion and metastasis. High expression of VAP1 was associated with a lower overall survival in breast cancer [23]. In our study, VAP1 was identified as a poor prognostic factor for PFS in patients with HGSOC.
A previous study reported that FABP4 promotes HGSOC progression by mediating lipid metabolism in cancer cells [25]. Furthermore, the present results indicate that high expression of FABP4, assessed via IHC staining, was associated with a poor PFS.
PF4 is a platelet-activating chemokine that induces thrombocytosis and thromboembolism. High serum PF4 levels were associated with poor survival and an increased risk of venous thromboembolism in patients with pancreatic adenocarcinoma [26]. According to a microarray study using 51 HGSOC samples, increased expression of PF4 mRNA was negatively associated with patients’ overall survival [27]. Similarly, our study results also indicate that high expression of PF4, assessed via IHC staining, was associated with a reduced PFS.
APOA1 and AGP are acute-phase reactants, and their serum levels have been measured in epithelial ovarian cancer to assess their potential as diagnostic or prognostic biomarkers [28]. However, tissue expression and clinical implications of APOA1 and AGP in HGSOC have not been determined. In the current study, we found that expression levels of APOA1 and AGP in HGSOC tissues were not associated with patient survival outcomes. Furthermore, in several tissue microarray (TMA) cores, APOA1 and AGP were expressed strongly only in the stroma and inflammatory cells, but not in cancer cells, suggesting a host immune response to cancer.
A potential limitation of our study is that the menopausal status may have influenced the expression levels of AAT, NFKB, PMVK, VAP1, FABP4, and PF4. However, during the proteomic analysis, the proportion of menopausal women was the same between the good and poor prognosis groups (66.7%). Moreover, during subsequent IHC analysis for prognostic validation, the patients’ ages were adjusted for multivariate analyses to identify the prognostic factors for PFS.
Since the CPTAC presented a proteomic landscape of 169 HGSOC tumors [10], several studies have focused on HGSOC using MS-based proteomics. Coscia et al. performed integrative proteomic profiling of ovarian cancer cell lines and HGSOC tumors and revealed two distinct clusters, epithelial and mesenchymal, which displayed different clinical outcomes [11]. Dieters-Castator et al. performed label-free quantitative proteomic analysis using 10 fresh-frozen HGSOC tissues and 10 fresh-frozen endometrioid carcinoma tissues, and identified diagnostic biomarkers specific to endometrioid carcinoma. Furthermore, the eight-marker panel, generated in that study, showed good performance in discriminating endometrioid carcinoma from HGSOC [12
Unlike previous studies focused on a differential diagnosis or clustering of HGSOC, this study aimed to investigate protein biomarkers predicting survival outcomes. IHC staining on FFPE HGSOC tissue specimens revealed that expression levels of the six protein biomarkers were not different between the early-stage and advanced-stage disease, and between diseases with high and low initial serum CA-125 levels. Nevertheless, our results indicate that high expressions of AAT, NFKB, and PMVK are favorable prognostic biomarkers for PFS, whereas high expressions of VAP1, FABP4, and PF4 are poor prognostic biomarkers for PFS. These six protein biomarkers, along with well-known prognostic factors including the stage and size of residual tumors, are expected to increase the accuracy of predicting relapse after primary treatment. Such improvements in prognostic prediction would facilitate the development of individualized therapies. For instance, if a patient is identified as being at high risk for recurrence, she may receive intraperitoneal chemotherapy or bevacizumab maintenance therapy in addition to the standard treatment and would be placed on a frequent surveillance schedule for earlier detection of relapse. Moreover, each of the six protein biomarkers, identified in our present study, can be considered a target for novel molecular therapeutic agents against HGSOC. However, further translational studies using cell lines and cancer tissues are essential for assessing biological effectiveness of the protein biomarkers and for identifying relevant pathways.
Herein, we performed IHC staining to validate candidate prognostic protein biomarkers. IHC is more cost-effective and simpler to use in the clinical setting than DNA or RNA PCRs and exome- or transcriptome-level next-generation sequencing methods. IHC can be utilized for prognosis prediction immediately during pathologic examination of the tissue obtained during ovarian cancer surgery.
For further studies validating a predictive model consisting of all six protein biomarkers, we calculated the estimated power for various sample sizes and censoring rates, which was drawn from a simple simulation study (p < 0.05; 1000 replicates), using the estimates from our multivariate model and the predictor frequencies from our dataset (Figure S5
This study has several limitations. First, owing to the retrospective study design, issues such as selection bias might exist. Second, the sample size used in our study may be insufficient for further discovery and validation of protein biomarkers. Third, external validation of our study results is necessary. Fourth, additional studies elucidating the mechanisms of action of each protein biomarker were not performed herein. Lastly, we did not evaluate interactions among these protein biomarkers. Despite these limitations, we faithfully applied a two-step approach consisting of proteomic and bioinformatic analyses and subsequent IHC staining of HGSOC tissue to identify prognostic protein biomarkers. Moreover, we developed predictive models comprising the six protein biomarkers and clinical variables for 18-month PFS of HGSOC patients. Such models displayed better prediction potential than those comprising only clinical variables. Notably, the proposed predictive model was further simplified into a score-based model, which provides comparable performance and is more intuitive to use.
4. Materials and Methods
This retrospective study was approved by the Institutional Review Board of Seoul National University Hospital (SNUH; No. C-1712-083-907), and conducted in accordance with the Declaration of Helsinki.
4.1. Study Design
This study included two steps: (1) proteomic and bioinformatic analysis for biomarker discovery; and (2) IHC staining for prognostic validation of candidate biomarkers (Figure S6
In the first step, we used fresh-frozen primary (non-metastatic) ovarian cancer tissues obtained intraoperatively and stored at the SNUH Human Biobank for research purposes. We identified patients who met the following inclusion criteria: (1) older than 18 years; (2) diagnosed with HGSOC between June 2012 and December 2016; (3) underwent PDS; and (4) agreed to donate biospecimens and provide written informed consent. Patients with any malignancy other than ovarian cancer, those who received NAC, those with insufficient clinical data or those lost to follow-up, or those with severe co-morbidities were excluded. On the basis of their PFS, patients were divided into the favorable (good) prognosis group (≥18 months) and poor prognosis group (<18 months). In total, 12 patients from the two groups (6 for each group) were selected for further proteomic analysis, and proteomic profiles were compared between the two groups.
In the second step, we used FFPE primary (non-metastatic) ovarian cancer tissues stored in the pathology archive of SNUH. In contrast with the first step, we identified patients meeting the following inclusion criteria: (1) older than 18 years; (2) diagnosed with HGSOC between June 2012 and December 2018; (3) whose ovarian cancer tissue was obtained during chemotherapy-naïve status (such as at the time of PDS, or during diagnostic laparoscopy in case of NAC); and (4) agreed to donate their pathologic specimens for research purposes and provided written informed consent. Patients with insufficient clinical data or those lost to follow-up or those with severe co-morbidities were excluded. In total, 107 patients with primary HGSOC were included in this step.
4.2. Proteomic and Bioinformatic Analyses
4.2.1. Tissue Preparation
Tissue samples were prepared using filter-aided sample preparation (FASP), as previously described [32]. Briefly, frozen tissue samples were homogenized using lysis buffer (4% sodium dodecyl sulfate (SDS), 2 mM Tris(2-carboxyethyl)phosphine (TCEP), and 0.1 M Tris-HCl pH 7.4), and protein concentration was determined using a reducing agent-compatible bicinchoninic acid (BCA) protein assay kit (Thermo Fisher Scientific, Waltham, MA, USA) in accordance with the manufacturer’s instructions. To eliminate contaminants, we performed acetone precipitation using 250 µg of the lysate at –20 °C. Each protein pellet was dissolved in 50 µL reduction buffer (4% SDS, 0.1 mM dithiothreitol (DTT), and 0.1 M Tris-HCl, pH 7.4) and heated at 95 °C for 15 min. The reduced proteins were loaded onto a 30 K spin filter (Millipore, Billerica, MA, USA), and buffer was exchanged for UA solution (8 M urea in 0.1 M Tris-HCl, pH 8.5) via centrifugation. After three exchanges with UA solution, the reduced cysteines were alkylated with 0.05 M iodoacetamide (IAA) in UA solution for 30 min at ambient temperature in the dark. Thereafter, UA buffer was exchanged for 40 mM ammonium bicarbonate (ABC), and the samples were digested with trypsin (enzyme to substrate ratio of 1:100) at 37 °C for 16 h. The digested peptides were then harvested via centrifugation, and an additional elution step was performed using 40 mM ABC and 0.5 M NaCl.
4.2.2. Desalting and Peptide Fractionation of Individual Samples
Peptide concentrations were measured using the tryptophan fluorescence (WF) assay, as previously described [33]. Digested peptides (20 μg) were acidified with trifluoroacetic acid (TFA) and then loaded directly onto an in-house StageTip with polystyrenedivinylbenzene-reversed phase sulfonate (SDB-RPS) material [34]. The StageTip was washed thrice with 100 μL 0.2% TFA. Three fractionations were performed using elution buffers 1, 2, and 3. All eluted peptides were dried in a SpeedVac centrifuge.
4.2.3. Offline High-pH Reversed-Phase Peptide Fractionation for Library Construction
For library construction, pooled peptides were fractionated via high pH reversed-phase liquid chromatography (RPLC) using an Agilent 1290 bioinert high performance liquid chromatography (HPLC) (Agilent, Santa Clara, CA, USA) equipped with an analytical column (4.6 × 250 mm, 5 μm), as previously described [32]. Solvent A consisted of 15 mM ammonium hydroxide in water, and solvent B consisted of 15 mM ammonium hydroxide in 90% acetonitrile (ACN). The peptides were separated along a gradient of 5%–35% ACN at a flow rate of 0.2 mL/min. In total, 96 fractions were concatenated to mix different parts of the gradient into 24 fractions. The fractions were lyophilized and stored at −80 °C until MS analysis.
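The concatenation step can be illustrated with a minimal Python sketch. The text does not state which fractions were combined, so the interleaved scheme below (every 24th fraction pooled together) is an assumption based on common practice, and the function name `concatenate_fractions` is hypothetical.

```python
def concatenate_fractions(n_fractions=96, n_pools=24):
    """Assign each collected fraction to a pool by interleaving across the gradient.

    Assumption: fraction k goes to pool ((k - 1) mod n_pools) + 1, so each pool
    mixes early, middle, and late parts of the gradient.
    """
    pools = {pool: [] for pool in range(1, n_pools + 1)}
    for fraction in range(1, n_fractions + 1):
        pool = (fraction - 1) % n_pools + 1
        pools[pool].append(fraction)
    return pools


pools = concatenate_fractions()
print(pools[1])   # [1, 25, 49, 73]
print(pools[24])  # [24, 48, 72, 96]
```

Under this assumed scheme, each of the 24 pooled fractions contains four original fractions spaced evenly along the gradient.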
4.2.4. LC-MS/MS Analysis
All LC-MS/MS analyses were conducted using an Ultimate 3000 UHPLC system (Dionex, Sunnyvale, CA, USA) coupled with a Q-Exactive HF-X mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA), as previously described with some modifications [32]. Peptides were separated on a two-column system equipped with a trap column (300 µm × 5 mm) and an analytic column (75 µm × 50 cm), using 90-min gradients from 7% to 32% ACN at a flow rate of 300 nL/min. Column temperature was maintained at 60 °C using a column heater. For label-free quantification using the data-dependent acquisition (DDA) method, a survey scan (350 to 1650 m/z) was acquired with a resolution of 70,000 at m/z 200. A top-15 method was used to select precursor ions with an isolation window of 1.2 m/z. MS/MS spectra were acquired at a higher-energy collisional dissociation (HCD)-normalized collision energy (NCE) of 30 with a resolution of 17,500 at m/z 200. Maximum ion injection durations for the full and MS/MS scans were 20 and 100 ms, respectively.
4.2.5. Data Processing
All MS raw files were processed using MaxQuant (version 126.96.36.199) [35]. MS/MS spectra were searched against the Human UniProt protein sequence database (December 2014 with 88,657 entries) using the Andromeda search engine [36]. Primary searches were performed using a 6 ppm precursor ion tolerance for total protein-level analysis. MS/MS ion tolerance was set to 20 ppm. Cysteine carbamidomethylation was set as a fixed modification. Protein N-acetylation and methionine oxidation were considered variable modifications. Enzyme specificity was set to full tryptic digestion. Peptides with a minimum length of six amino acids and up to two missed cleavages were considered. The required FDR was set to 1% at peptide, protein, and modification levels. To maximize the number of quantification events across samples, we enabled the “Match between Runs” option on the MaxQuant platform.
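As a worked illustration of what the ppm tolerances mean, the sketch below converts a mass deviation to parts per million and checks it against a tolerance. This is not MaxQuant or Andromeda code; the function names and the example masses are hypothetical.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass deviation in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tolerance_ppm):
    """True if the absolute ppm deviation does not exceed the tolerance."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tolerance_ppm


# Hypothetical precursor: theoretical mass 1000.000, with 6 ppm precursor tolerance.
print(within_tolerance(1000.005, 1000.000, tolerance_ppm=6))  # about 5 ppm off
print(within_tolerance(1000.007, 1000.000, tolerance_ppm=6))  # about 7 ppm off
```

At a mass of 1000, a 6 ppm tolerance corresponds to an absolute window of roughly ±0.006 mass units, and a 20 ppm tolerance to roughly ±0.02.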
4.2.6. Label-Free Quantification and Statistical Analysis
For label-free quantification, the iBAQ algorithm was used as part of the MaxQuant platform [37]. Briefly, iBAQ values, determined using MaxQuant, were the raw intensities divided by the number of theoretical peptides. Thus, iBAQ values were proportional to molar quantities of the proteins. Perseus software was used for statistical analysis [38]. First, we eliminated proteins identified as “reverse” and “only identified by site”. After retaining proteins with valid values in at least 70% of samples in each group, missing values were imputed using a width of 0.3 and down shift of 1.8. Finally, data were normalized using a width adjustment function, which subtracts the medians and scales all values in a sample to yield equal interquartile ranges (IQRs) [39]. For pairwise proteome comparisons, we performed a two-sided t-test with a significance level (p-value) of <0.05 and fold-change of >1.5. Support vector machine analysis was performed using the R/Bioconductor package “GNC” [15
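The quantification and preprocessing steps described above can be sketched in plain Python. This is an illustration only, not the MaxQuant or Perseus implementation. It assumes that intensities are log-transformed before imputation and that missing values are drawn from a normal distribution whose mean is shifted down by 1.8 standard deviations and whose standard deviation is 0.3 times that of the observed values in the same sample. All function names are hypothetical.

```python
import random
import statistics


def ibaq(summed_intensity, n_theoretical_peptides):
    """iBAQ: summed raw intensity divided by the number of theoretical peptides."""
    return summed_intensity / n_theoretical_peptides


def passes_valid_filter(values, groups, min_fraction=0.7):
    """True if every group has at least min_fraction non-missing values."""
    for group in set(groups):
        in_group = [v for v, g in zip(values, groups) if g == group]
        n_valid = sum(v is not None for v in in_group)
        if n_valid / len(in_group) < min_fraction:
            return False
    return True


def impute_sample(log_values, width=0.3, down_shift=1.8, rng=None):
    """Fill missing values (None) in one sample from a down-shifted normal."""
    rng = rng or random.Random(0)
    observed = [v for v in log_values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [
        v if v is not None else rng.gauss(mean - down_shift * sd, width * sd)
        for v in log_values
    ]


def width_adjust(values):
    """Subtract the sample median and scale so the interquartile range is 1."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    median = statistics.median(values)
    return [(v - median) / (q3 - q1) for v in values]


# Hypothetical protein: summed intensity 1200 with 12 theoretical peptides.
print(ibaq(1200.0, 12))  # 100.0

# One protein across 12 samples (6 per group); group B has only 4 of 6 valid values.
groups = ["A"] * 6 + ["B"] * 6
protein = [1, 2, 3, 4, 5, None, 1, 2, None, None, 5, 6]
print(passes_valid_filter(protein, groups))  # False

sample = [20.0, 22.0, None, 24.0, 26.0]
filled = impute_sample(sample)
normalized = width_adjust([20.0, 22.0, 24.0, 26.0, 28.0])
```

With six samples per group, the 70% rule in this sketch requires at least five valid values per group, because four of six (66.7%) falls below the threshold.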
4.2.7. Bioinformatic Analysis
GO enrichment analysis was performed using the DAVID bioinformatics resources (https://david.ncifcrf.gov/). GO-terms and corresponding p-values were subsequently submitted to ReViGO [40], and visualized using high semantic similarity-based treemaps. Tumor purity was assessed using the R package “ESTIMATE” on the basis of the expression levels of marker genes in stromal and immune cells [41
4.3. Validation via IHC Analysis
4.3.1. TMA Construction
Prognostic implications of protein biomarkers, identified through proteomic analyses, were validated via IHC staining using a separate dataset consisting of chemotherapy-naïve, FFPE cancer tissues resected from the primary (non-metastatic) ovarian mass intraoperatively during debulking surgery (PDS cases) or diagnostic laparoscopy (NAC cases) (n = 107). After tissues were retrieved from the pathology archive of SNUH, they were histologically assessed through hematoxylin and eosin staining. To construct a TMA, three cores (2 mm in diameter) per patient were embedded in new recipient FFPE blocks using a trephine apparatus (Superbiochips Laboratories, Seoul, Korea).
4.3.2. IHC Staining
IHC staining for AAT, NFKB, PMVK, VAP1, FABP4, PF4, APOA1, and AGP was performed using 4 μm thick TMA sections using a Benchmark autostainer (Ventana, Tucson, AZ, USA) in accordance with the manufacturer’s instructions (Table S8
Because IHC staining with these eight antibodies and its prognostic implications had not been previously evaluated in HGSOC, we determined the optimal cutoff value for each IHC stain, based on the sample distribution and prognostic significance (Table S8). Briefly, the extent (0–20%, 20–50%, 50–70%, 70–100%) and intensity (absent, weak, moderate, strong) of cytoplasmic/membranous immunoreactivity were semi-quantitatively assessed. Thereafter, the expression level of each protein was dichotomized into high versus low expression (Figure S4
4.4. Statistical Analysis
Descriptive statistics were used to describe clinicopathologic characteristics of the study population. Patient characteristics were compared between the good and poor prognosis groups, and between groups showing low and high expression of each protein biomarker. We used Student’s t and Mann–Whitney U tests to compare continuous variables, and Pearson’s Chi-squared and Fisher’s exact tests to compare categorical variables. Kaplan–Meier methods with log-rank test were used for survival analysis. Multivariate analysis was performed using a Cox proportional-hazards model, and aHRs and 95% CIs were calculated. These analyses were conducted using SPSS software (version 25.0; SPSS Inc., Chicago, IL, USA). All statistical tests were two-sided, and a p-value < 0.05 was considered statistically significant.
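For readers unfamiliar with the Kaplan–Meier method, a minimal product-limit estimator is sketched below. The analyses in this study were run in SPSS; this Python code is only an illustration, the function name is hypothetical, and the follow-up times are invented toy values (event = 1 for progression, 0 for censoring).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


# Toy data: months of follow-up and progression indicator.
curve = kaplan_meier([6, 12, 18, 24, 30], [1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the survival estimate, which is why the step at 18 months in the toy data is computed over three patients rather than four.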
We constructed regression- and score-based models predicting 18-month PFS using clinical variables and IHC results of 107 patients with primary HGSOC. To evaluate the performance of regression-based predictive models, we performed leave-one-out cross-validation in consideration of the small sample size. In brief, leave-one-out cross-validation constructs n models repetitively, by training the model with n-1 samples and testing with the remaining one, where n is the sample size. This analysis was repeated for all samples, and n predicted values were obtained on the basis of n models. We computed the AUC using the predicted values and the observed values of the response variable. Finally, we simplified the regression-based model into a score-based model. In this study, each predictor has a single binary value, either 0 or 1. We inverted the original values of the predictors with a negative coefficient (0/1 to 1/0) so that all effects were in the positive direction. Then, a total score for the prediction of 18-month PFS was determined by simply adding all the predictors without coefficients.
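The validation and scoring procedure can be sketched as follows. The source does not name the regression model or software used for this step, so the gradient-descent logistic regression below is an assumption chosen to keep the example self-contained. All function names, the toy data, and the example coefficients are hypothetical.

```python
import math


def predict_logistic(weights, x):
    """Predicted probability from [intercept, w1, w2, ...] and binary predictors x."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, epochs=500):
    """Plain batch gradient descent on the logistic log-loss."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            error = predict_logistic(weights, xi) - yi
            grad[0] += error
            for j, xij in enumerate(xi):
                grad[j + 1] += error * xij
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad)]
    return weights


def loocv_predictions(X, y):
    """Train n models on n-1 samples each and predict the held-out sample."""
    predictions = []
    for i in range(len(X)):
        weights = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        predictions.append(predict_logistic(weights, X[i]))
    return predictions


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def total_score(predictors, coefficients):
    """Flip predictors with a negative coefficient (0/1 to 1/0), then sum unweighted."""
    return sum((1 - x) if c < 0 else x for x, c in zip(predictors, coefficients))


# Toy data: two binary predictors, binary response.
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1]]
y = [1, 1, 0, 0, 1, 0]
predictions = loocv_predictions(X, y)
print(auc(predictions, y))

# Hypothetical coefficients: the second and third predictors have negative effects.
print(total_score([1, 0, 1], [0.8, -1.2, -0.5]))  # 2
```

Because every predictor is binary and all effects point the same way after inversion, the total score simply counts how many factors are present in their response-associated state, which is what makes the score-based model easy to apply by hand.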