Utility of Comprehensive Serum Glycopeptide Spectra Analysis (CSGSA) for the Detection of Early Stage Epithelial Ovarian Cancer

Comprehensive serum glycopeptide spectra analysis (CSGSA) evaluates >10,000 serum glycopeptides and identifies unique glycopeptide peaks and patterns via supervised orthogonal partial least-squares discriminant modeling. CSGSA was more accurate than cancer antigen 125 (CA125) or human epididymis protein 4 (HE4) for detecting early stage epithelial ovarian cancer. Combined CSGSA, CA125, and HE4 had improved diagnostic performance. Thus, CSGSA may be a useful screening tool for detecting early stage epithelial ovarian cancer.


Introduction
Ovarian cancer is the seventh most common malignancy in women worldwide, with 238,700 cases diagnosed in 2012 [1]. As women with ovarian cancer often lack specific symptoms, a large number of affected women present with advanced stage disease, wherein survival rates are dismal [2]. Hence, the early detection of ovarian cancer is an urgent unmet need in women's healthcare.
To date, useful biomarkers for screening of ovarian cancer remain scarce [2]. In the current study, we examined the utility, in terms of diagnostic accuracy, of comprehensive serum glycopeptide spectra analysis (CSGSA) for detecting early stage epithelial ovarian cancer (EOC). CSGSA evaluates >10,000 serum glycopeptides and identifies unique peaks and patterns of glycopeptides (Figure S1) via supervised orthogonal partial least-squares discriminant modeling (OPLS-DA) [3]. The results of CSGSA (OPLS-DA) modeling were compared to those of cancer antigen 125 (CA125) and human epididymis protein 4 (HE4).
We then applied this analytic platform (Figure S2) to the test set (Table S1). In total, 29 (26.1%) cases of stage I EOC were compared to 82 (73.9%) non-EOC control cases (Figure 1). The AUC for distinguishing stage I EOC from non-EOC control cases was 91% with CSGSA (OPLS-DA) modeling, higher than that of the other markers (83% for CA125 and 86% for HE4). Sensitivity was also higher with CSGSA (OPLS-DA) analysis (86%) than with the other biomarkers (79% for both CA125 and HE4; Tables 1 and S2).
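As an illustration of the two metrics reported above, the following minimal sketch computes an AUC (as the probability that a randomly chosen case scores higher than a randomly chosen control) and the sensitivity at a cutoff. The function names and the marker values are hypothetical and are not study data; the convention that a value at or above the cutoff counts as positive is also an assumption.

```python
def auc_mann_whitney(case_scores, control_scores):
    """AUC as P(case score > control score); ties count as one half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity(case_scores, cutoff):
    """Fraction of cases with a marker value at or above the cutoff (assumed convention)."""
    positives = sum(1 for c in case_scores if c >= cutoff)
    return positives / len(case_scores)


# Hypothetical marker values, for illustration only (not study data).
cases = [0.9, 0.8, 0.4, 0.7]
controls = [0.1, 0.5, 0.3, 0.2]

auc = auc_mann_whitney(cases, controls)   # 15 of 16 case-control pairs are ranked correctly
sens = sensitivity(cases, cutoff=0.45)    # 3 of 4 cases reach the cutoff
print(auc, sens)
```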
We examined the utility of the combination assay among these three markers in the test set (Tables 1 and S2; Figure 1). The combination index was calculated as (0.43 × CA125) + (0.11 × HE4) + (0.46 × CSGSA (OPLS-DA)); the cutoff was set as 0.12. The combination of all the three markers exhibited the highest AUC value (96%). Moreover, the positive predictive value for the combination of the three markers (81%) outperformed the single assay by 15-21 points (59% for CA125, 66% for HE4, and 66% for CSGSA (OPLS-DA); Table 1 and Table S2).
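The combination index above is a weighted sum, which can be sketched as follows. The weights and the cutoff are taken from the text; the function names are hypothetical, and two points are assumptions because this section does not specify them: that each marker value has already been transformed to the scale used in the model, and that an index at or above the cutoff is called positive.

```python
# Weights and cutoff as stated in the text.
WEIGHTS = {"ca125": 0.43, "he4": 0.11, "csgsa": 0.46}
CUTOFF = 0.12


def combination_index(ca125, he4, csgsa):
    """Weighted sum of the three marker values.

    Assumption: inputs are already on the (unspecified) scale used by the model.
    """
    return (WEIGHTS["ca125"] * ca125
            + WEIGHTS["he4"] * he4
            + WEIGHTS["csgsa"] * csgsa)


def is_positive(index, cutoff=CUTOFF):
    """Assumed convention: an index at or above the cutoff is called positive."""
    return index >= cutoff


# Hypothetical scaled marker values, for illustration only.
example_index = combination_index(ca125=0.2, he4=0.1, csgsa=0.3)
print(example_index, is_positive(example_index))
```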
CSGSA (OPLS-DA) had better accuracy than historical biomarkers (CA125 and HE4). This result is promising and highlights the possible utility of CSGSA (OPLS-DA) as a biomarker for the detection of early stage epithelial ovarian cancer. Unlike a single marker assay such as CA125 or HE4, CSGSA (OPLS-DA) uses the pattern of a large number of glycopeptides. Although several ovarian cancer-screening tools utilize multi-marker assays [2,4], CSGSA (OPLS-DA) evaluates >1500 glycopeptides digested from serum glycoproteins. Moreover, the marker value of CSGSA (OPLS-DA) is created using OPLS-DA, which is a statistical method to separate two groups (EOC and non-EOC controls). Furthermore, the serum levels of conventional tumor markers, which are secreted by tumor cells, depend on tumor volume, whereas the result of CSGSA (OPLS-DA) does not depend on the number of tumor cells. This is a possible reason why CSGSA performed better than the other commonly used biomarkers (CA125 and HE4); hence, it is a novel method for the detection of early stage epithelial ovarian cancer.
Preoperative assessment and prediction of suspected ovarian malignancy may be useful for surgical management. In the absence of ovarian malignancy, minimally invasive surgery can be safely considered. Alternatively, in the presence of malignancy, laparotomy is recommended to decrease the risk of capsule rupture, which can negatively impact survival [5].
The limitations of the study include the small sample size and heterogeneous tumor types. The lack of external validation is another limitation, and the generalizability of this method needs to be assessed in different populations. A further limitation is the absence of clear evidence as to whether the CSGSA value (OPLS-DA) is specific for EOC. CSGSA (OPLS-DA) evaluates >1500 glycopeptides digested from serum glycoproteins, and its marker value is created using OPLS-DA, which is a statistical method to separate two groups (EOC and non-EOC controls). Applying serum-digested glycopeptides from other malignancies to this EOC diagnosis system would therefore be meaningless, because the system works only to separate EOC from non-EOC. We did, however, calculate the CSGSA value (OPLS-DA) for stage I cervical cancer (CC) and stage I endometrial cancer (EC), with which we could separate CC and EC patients from non-CC and non-EC patients (preliminary data), although not more significantly than in our result for EOC versus non-EOC patients. These data suggest that the CSGSA value (OPLS-DA) could differentiate cancer patients from non-cancer patients when various target groups and control groups are used. To add organ-specific capability, we tried the combination assay with CA125 and HE4, which are EOC-specific markers. Cancers of other organs will also need to be examined.

Patient Samples
A total of 88 serum samples (59 and 29 in the training and test sets, respectively) were prospectively obtained from consecutive patients with stage I EOC (Table S1). Non-EOC controls included both healthy women (n = 220) and patients with leiomyoma (n = 14) or benign ovarian tumors (n = 14). The inclusion criteria for the sample set of healthy women were no history of cancer and no hospitalization in the past 3 months. The study-specific exclusion criteria are shown in Table S3. Sera were obtained by centrifuging blood samples and were stored at −80 °C until CSGSA analysis, avoiding repeated freeze-thaw cycles.

Preparation of Quality Control Serum, and Calculation of Inter- and Intra-Assay Coefficients of Variability
Detailed descriptions have been provided previously [3]. A quality control (QC) sample was prepared by pooling the sera of several women with EOC and non-EOC controls; 2 QC and 22 samples were prepared within a day, and glycopeptide expression values were obtained as the ratio between samples and the average values of two QC samples.
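The QC-based normalization described above can be sketched as follows: each glycopeptide value in a sample is divided by the mean of the corresponding values in the two QC runs of the same day. The function name, the dictionary layout, and the handling of a zero QC mean are assumptions made for illustration; the actual software is described in [3].

```python
def normalize_to_qc(sample_areas, qc1_areas, qc2_areas):
    """Return sample peak areas expressed as ratios to the mean of two QC runs.

    All three arguments map a peak identifier to a peak area.
    Assumption: a peak whose QC mean is zero is reported as None rather than dropped.
    """
    normalized = {}
    for peak_id, area in sample_areas.items():
        qc_mean = (qc1_areas[peak_id] + qc2_areas[peak_id]) / 2.0
        normalized[peak_id] = area / qc_mean if qc_mean > 0 else None
    return normalized


# Hypothetical peak areas, for illustration only.
sample = {"peak_1": 300.0, "peak_2": 50.0}
qc_run_1 = {"peak_1": 100.0, "peak_2": 40.0}
qc_run_2 = {"peak_1": 200.0, "peak_2": 60.0}

ratios = normalize_to_qc(sample, qc_run_1, qc_run_2)
print(ratios)
```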

Liquid Chromatography and Mass Spectrometry
The detailed methods for liquid chromatography and mass spectrometry have been described elsewhere [3].

Data Processing
Detailed descriptions of the data processing have been reported previously [3]. Briefly, original software, "Marker Analysis," was used to analyze all mass spectral data [8]. The peak area was defined as the area obtained by integrating the curve from the beginning to the end of the peak. Peak alignment was performed to maintain the error of retention time and m/z of each peak position within 0.3 min and 0.06 Da, respectively.
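The alignment tolerances above can be illustrated with a minimal sketch that treats two peaks as the same feature when their retention times differ by at most 0.3 min and their m/z values by at most 0.06 Da. This is not the "Marker Analysis" implementation; the function names, the (retention time, m/z) tuple layout, and the tie-breaking rule (smallest tolerance-scaled distance) are assumptions.

```python
RT_TOL_MIN = 0.3   # retention-time tolerance from the text (minutes)
MZ_TOL_DA = 0.06   # m/z tolerance from the text (Da)


def same_peak(peak_a, peak_b, rt_tol=RT_TOL_MIN, mz_tol=MZ_TOL_DA):
    """True if two (retention_time, mz) peaks fall within both tolerances."""
    return (abs(peak_a[0] - peak_b[0]) <= rt_tol
            and abs(peak_a[1] - peak_b[1]) <= mz_tol)


def align_to_reference(reference_peaks, sample_peaks):
    """For each reference peak, return the index of the matching sample peak or None.

    Assumption: if several sample peaks match, the one with the smallest
    tolerance-scaled squared distance is chosen.
    """
    matches = []
    for ref in reference_peaks:
        best_index, best_dist = None, None
        for i, peak in enumerate(sample_peaks):
            if not same_peak(ref, peak):
                continue
            dist = (((ref[0] - peak[0]) / RT_TOL_MIN) ** 2
                    + ((ref[1] - peak[1]) / MZ_TOL_DA) ** 2)
            if best_dist is None or dist < best_dist:
                best_index, best_dist = i, dist
        matches.append(best_index)
    return matches


# Hypothetical peaks as (retention time in min, m/z), for illustration only.
reference = [(10.0, 500.00), (12.0, 650.30), (20.0, 800.00)]
observed = [(10.2, 500.03), (12.5, 650.30), (12.1, 650.31)]

alignment = align_to_reference(reference, observed)
print(alignment)
```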
Mass spectral data were normalized by calculating the ratio between each peak area and the average peak area of the QCs. Then, the model-establishing method in SIMCA software (version 13.0.3; Umetrics; Umeå, Sweden) was applied to the normalized data [9]. Protocols written in Excel VBA were used to create heat maps of the mass spectral data.

Pattern Recognition Analysis and Cross-Validation
Glycopeptide spectra data (Figure S1) were analyzed in a multivariate manner [9-11], and OPLS-DA was applied to distinguish between the EOC and non-EOC control groups (Table S1). Before OPLS-DA, the data set was separated into training and test sets (Table S1) to validate the training model. OPLS-DA showed two-dimensional differentiation using the first and second principal components (Figure 1 and Figure S2). OPLS-DA is a method that elicits discriminating factors between two classes, and the model is generated by reducing non-discriminable dimensions (spaces) step-by-step, thereby eliciting an underlying factor (a single dimension determined in 1712-dimension space) that discriminates between two groups. We defined values of the first component as the CSGSA value (CSGSA (OPLS-DA)); the values of the EOC and non-EOC control groups obtained via OPLS-DA were plotted as box-whisker plots for the training and test sets, respectively.

Statistical Analysis
The detailed statistical methods have been given previously [3]. p < 0.05 indicated statistical significance (two-tailed hypothesis). All statistical analyses were performed using Statistical Package for the Social Sciences (SPSS, version 17.0, Chicago, IL, USA) and original statistical software.

Study Approval
The ethics committee at Tokai University approved this study (approval number: 09R-082). Written informed consent was obtained from the patients. The data of some of the study patients were obtained from a preliminary report [3].

Conclusions
The results of this study suggest that CSGSA is more accurate than CA125 or HE4 in detecting early stage epithelial ovarian cancer, while CSGSA, CA125, and HE4 combined exhibit improved diagnostic performance. Thus, CSGSA may be a useful screening tool for detecting early stage epithelial ovarian cancer.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-6694/12/9/2374/s1, Figure S1: Glycopeptide heat map (stage I EOC versus non-EOC control), Figure S2: Training set: Biomarker performance for the detection of stage I EOC versus non-EOC control; Table S1: Patient characteristics, Table S2: Frequency tables based on cutoff values (stage I EOC versus non-EOC control), Table S3: Exclusion criteria for the study.