Multiple Chromatographic Analysis of Urine in the Detection of Bladder Cancer

Bladder cancer (BC) is the most common type of carcinoma of the urological system. Recently, there has been an increasing interest in non-invasive diagnostic tumor markers because cystoscopy, which is still considered the gold standard diagnostic method, is invasive. However, markers published in the literature so far do not meet expectations for replacing cystoscopy because of their low specificity and excessively high false-positive rates, which are mainly caused by hematuria, a frequent finding in benign cases as well. No reliable non-invasive method has yet been identified that can distinguish patients with bladder cancer from patients with non-malignant hematuria. Our work examined the possibilities of non-targeted urinary biomarkers to distinguish patients with malignant and non-malignant diseases of the bladder using 3D HPLC in combination with computer processing of multiple datasets. Urine samples from 47 patients, 23 with bladder cancer (BC) and 24 with non-malignant hematuria (NMHU), were enrolled in the study. For the separation and subsequent analysis of a large number of urine components, 3D HPLC (high-performance liquid chromatography) with an absorption and fluorescence detector was used. The obtained dataset was further subjected to various uni- and multi-dimensional statistical analyses and mathematical modeling. We found 334 chromatographic peaks, of which 18 differed significantly between BC and NMHU patients. Using receiver operating characteristic (ROC) analysis, we assessed the informative ability of the significant chromatographic peaks; the best single peak reached 90% sensitivity and 74% specificity. By logistic regression, we identified an optimal and simplified set of seven chromatographic peaks (five absorption and two fluorescence) with strong classification power (100% sensitivity and 100% specificity) for distinguishing patients with bladder cancer from those with non-malignant hematuria. Partial least squares discriminant analysis (PLS-DA) and orthogonal projection to latent structures discriminant analysis (OPLS-DA) models were also used to distinguish BC and NMHU patients, achieving 100% sensitivity and 96% specificity. Multivariate statistical analysis of urinary metabolomic profiles revealed that BC patients can be discriminated from NMHU patients, and the results can likely contribute to an early and non-invasive diagnosis of BC.


Introduction
Bladder cancer (BC) is one of the most common genitourinary malignancies. Additionally, it is the sixth most common cancer in men [1]. In most cases, BC is detected at a late stage, which represents an unfavorable prognosis for the patient. One reason for delayed diagnosis is the non-specificity of symptoms, such as difficulty with micturition, pain in urination, and blood in urine, which may be accompanied by different diseases unrelated to malignant tumors. Currently, cystoscopy and cytology are the gold standard methods for BC detection. Cystoscopy is considered an invasive and painful examination of BC [2]. This examination method often represents a physical burden on the patient [3]. Although cytology is a non-invasive method, its sensitivity is not sufficient (less than 40%), particularly for low-grade tumors [4]. Recently, research interest has shifted to non-invasive methods of identifying the early stages of BC [5].
Biomarkers have the potential to aid in diagnosis, surveillance, staging, prognosis, and possibly therapeutic guidance. A large number of potential biomarkers for the detection of genomic, transcriptomic, epigenetic, or protein changes in serum or urine samples have been described in the literature, but only some of them are approved by the Food and Drug Administration [2,5,6]. The most common are the NMP22® BladderChek® Test, TRAK/BTA stat® Test, UroVysion® FISH, ImmunoCyt™/uCyt+™, and CxBladder, which often provide relatively low sensitivity [2]. For the time being, no single urinary biomarker from non-invasive BC surveillance tests can replace cystoscopy; therefore, a diagnostic model that can distinguish BC patients from others would be more beneficial [7].
The metabolome reflects the status of the biological system. Urine is in direct contact with bladder epithelial cells, and metabolites released into the urine provide information on bladder disorders, suggesting that analysis of the urinary metabolomic profile is also a promising approach for discriminating between malignant and benign diseases [8,9]. However, there has been limited metabolomic research on the detection of biomarkers specific for BC [10]. Hematuria is the most common presenting symptom in bladder cancer (present in 85% of cases) [11]. Nevertheless, only a few studies have targeted false-positive values caused by bleeding of benign origin [9]. No reliable method has yet been identified that can distinguish patients with bladder cancer (BC) from patients with non-malignant hematuria (NMHU).
In the present study, an untargeted metabolomics approach using 3D reverse-phase high-performance liquid chromatography (RP-HPLC) with absorption and fluorescence detection was carried out as a promising alternative to omics methods for searching for biomarkers in patients' urine. By subsequent computer processing of the extensive acquired data (logistic regression, receiver operating characteristic (ROC) analysis, partial least squares discriminant analysis, orthogonal projection to latent structures discriminant analysis), we created classification models for the differentiation of patients with BC and NMHU.

Patients and Sample Collection
Urine samples were collected from 23 male patients with bladder cancer (BC) (median age of 65) and 24 male patients with non-malignant hematuria (NMHU) (median age of 65) (Table 1). Urine was collected by spontaneous micturition or by catheterization. Collected samples were kept frozen at −28 °C. Each urine sample was thawed at room temperature and centrifuged at 3000× g for 5 min, followed by filtration through a 0.22 µm nylon membrane filter, LLG Syringe Filter PTFE (AZ chrome s.r.o., Bratislava, Slovak Republic). Urine samples were provided by the Urological department UNB Saints Cyril and Methodius Hospital in Bratislava. Informed consent was signed by each patient and the project was approved by the ethics commission of UNB Saints Cyril and Methodius Hospital in Bratislava. Each patient underwent a biopsy, and histological examination confirmed or excluded a malignant tumor of the bladder (diagnosis C67 according to the International Classification of Diseases, MKCH 10). Based on that, each patient was assigned to the appropriate group (BC or NMHU). Table 1 shows the basic characteristics of the samples with the TNM (tumor-node-metastasis) staging system, describing the stage and grade of BC.
Table 1 (partially recovered). Cancer stage (number of patients): Ta, 10; T1, 9; T2, 4.
SD, standard deviation; G1, well-differentiated tumor (low grade); G2, moderately differentiated tumor (intermediate grade); G3, poorly differentiated tumor (high grade); Ta, non-invasive tumor (only in the innermost layer); T1, solitary tumor without muscular invasion; T2, solitary tumor with muscular invasion.

Principle of HPLC
High-performance liquid chromatography (HPLC) is an analytical method used to separate and quantify the components of a sample based on differences in their structure. The apparatus consists of multiple modules. The mobile phase is pumped under high pressure into the column containing the stationary phase. A liquid sample is injected into the stream of mobile phase flowing through the column, whose function is to separate the compounds. Depending on their affinity to the stationary phase, the components are eluted at different retention times. Under given conditions, the retention time is characteristic of each compound. Components that interact more strongly with the stationary phase elute later from the column, while components with weaker interaction elute sooner. The eluates then flow consecutively through one or more detectors, which generate chromatograms, that is, intensity (voltage response) as a function of time. A single chromatographic peak represents a compound or a group of compounds with similar characteristics. The area under a peak represents the quantity of the compound [12].
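The relation between a peak and its area can be illustrated with a minimal sketch. It assumes a chromatogram stored as two lists (time in minutes and detector response), integrates with the trapezoidal rule, and ignores baseline correction; the function name `peak_area` and the synthetic triangular peak are illustrative assumptions and are not part of the software used in this study.

```python
def peak_area(times, signal, t_start, t_end):
    """Trapezoidal area under the detector signal between t_start and t_end.

    Assumes a flat zero baseline; real chromatography software also
    performs baseline correction, which is omitted here.
    """
    area = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        # Only segments lying fully inside the integration window are counted.
        if t0 >= t_start and t1 <= t_end:
            area += 0.5 * (signal[i] + signal[i + 1]) * (t1 - t0)
    return area


# Synthetic triangular peak: base from 2.0 to 3.0 min, height 10 (made-up values).
times = [2.0, 2.25, 2.5, 2.75, 3.0]
signal = [0.0, 5.0, 10.0, 5.0, 0.0]
area = peak_area(times, signal, 2.0, 3.0)
print(area)
```

For this triangle the result matches the geometric area (half of base times height).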

Equipment and Conditions
The urine samples were analyzed using a reverse-phase high-performance liquid chromatography (RP-HPLC) system, Prominence 20 A (Shimadzu Co., Kyoto, Japan), with absorption and fluorescence detection. The system temperature was set to 25 °C. A Nucleosil 100-5 C18 column (150 × 4.6 mm; 5 µm particle size) (Macherey-Nagel, Düren, Germany) was used as the stationary phase and 10 mmol/L KH2PO4 with 5% methanol, pH 6.8, as the mobile phase. The eluent flow rate was 1 mL/min. The injection volume was 5-30 µL depending on signal intensity. We recorded 10 chromatograms from the absorption detector and 5 chromatograms from the fluorescence detector for each sample, which formed the basis for the 3D chromatograms (Table 2). The excitation and emission wavelengths of the fluorescence detector were selected according to our previous findings [13,14].

Data Processing and Statistical Analysis
Software LC solution (Shimadzu Co., Kyoto, Japan) was used for analysis and processing of chromatograms. Statistical analyses of data and graphing were performed using StatsDirect 3 (intergroup comparison, correlation, and receiver operating characteristic (ROC) analysis) and R software (logistic regression, partial least square discriminant analysis, orthogonal projection to latent structure discriminant analysis). OriginPro2016 software was used to create 3D chromatograms.
The areas under each chromatographic peak were normalized to urine creatinine area so that fluctuation in urine concentration was minimized. The creatinine identification was performed with absorption detection at 220 nm. For inter-group comparison, the t-test or the non-parametric alternative Mann-Whitney U test was used. To assess the predictive ability of the proposed diagnostic test, a ROC analysis was performed. The logistic regression was used to reveal the best classification model. Parameters for logistic regression models were selected according to the Akaike information criterion (AIC) and Log-likelihood function. Partial least square discriminant analysis (PLS-DA) and an orthogonal projection to latent structure discriminant analysis (OPLS-DA) models based on identified urine peaks were used to distinguish patients with BC from NMHU patients. The resulting model was verified using cross-validation, a permutation test, and external validation. Variable importance in projection (VIP) scores estimate the importance of each variable (peak) in the projection used in a PLS model.
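The creatinine normalization step can be sketched as follows. This is only an illustration under stated assumptions: peak areas are held in a dictionary keyed by a peak label, the helper name `normalize_to_creatinine` is hypothetical, and the numbers are invented for the example rather than taken from the study.

```python
def normalize_to_creatinine(peak_areas, creatinine_area):
    """Divide every peak area by the creatinine peak area of the same sample."""
    if creatinine_area <= 0:
        raise ValueError("creatinine peak area must be positive")
    return {label: area / creatinine_area for label, area in peak_areas.items()}


# Invented areas for one sample (arbitrary units), for illustration only.
sample_peaks = {"F(2.32 min)": 50.0, "A(4.62 min)": 200.0}
normalized = normalize_to_creatinine(sample_peaks, creatinine_area=400.0)
print(normalized)
```

Because every peak of a sample is divided by the same creatinine area, a more dilute or more concentrated urine scales numerator and denominator together, which is why the ratio is less sensitive to urine concentration.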

Results
Ten chromatograms for the absorption detector and five chromatograms for the fluorescence detector were recorded from the urine of patients with BC and patients with NMHU ( Figure 1). The total number of identified peaks was 334 at all detector settings in each urine sample. The peak of the sample was identified within the interval RT ± 6 s. Areas under all peaks were calculated and then normalized to the creatinine peak area. Normalized peak values were entered into further analyses.
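A possible way to apply the RT ± 6 s window when assigning a peak in a sample to a reference retention time is sketched below. The closest-peak rule, the function name `match_peak`, and the example peak list are assumptions made for illustration; the text states only the tolerance, not the matching algorithm.

```python
TOLERANCE_MIN = 6 / 60  # the ±6 s window expressed in minutes


def match_peak(reference_rt, sample_peaks, tol=TOLERANCE_MIN):
    """Return (rt, area) of the sample peak closest to reference_rt within tol.

    sample_peaks is a list of (retention_time_min, area) tuples.
    Returns None when no peak falls inside the window.
    """
    candidates = [
        (abs(rt - reference_rt), rt, area)
        for rt, area in sample_peaks
        if abs(rt - reference_rt) <= tol
    ]
    if not candidates:
        return None
    _, rt, area = min(candidates)
    return rt, area


# Invented peak list for one chromatogram (retention time in min, area).
peaks = [(1.75, 120.0), (2.30, 45.0), (4.90, 10.0)]
print(match_peak(2.32, peaks))  # peak at 2.30 min lies within 6 s
print(match_peak(4.62, peaks))  # nearest peak is 0.28 min away, so no match
```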

Inter-group comparison of urine samples from patients with NMHU (n = 20) and patients with BC (n = 19) was performed and 18 statistically significant peaks (p < 0.05) were identified (6 peaks detected by the fluorescence detector, 12 peaks detected by absorption detector, Tables 3 and 4).

ROC (Receiver Operating Characteristic) Analysis
The 18 peaks identified as significantly different between BC and NMHU patients (training set: 19 samples from patients with BC and 20 from patients with NMHU) were individually entered into the ROC analysis to discriminate between BC and NMHU patients. The area under the ROC curve (AUC) served as the evaluation index. The highest AUC and the best classification power were achieved by the fluorescent peak (370/520 nm; ex/em) at RT = 2.32 min (F(2.32 min)), with 90% sensitivity and 74% specificity (Table 5, Figure 2).
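For a single peak, the AUC and the sensitivity and specificity at a chosen cut-off can be computed as in the sketch below. It assumes that larger normalized peak values indicate BC (the direction may be reversed for a given peak), uses the pairwise-comparison definition of the AUC, and runs on invented values; the function names and data are illustrative and do not reproduce the StatsDirect analysis.

```python
def auc(positives, negatives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sens_spec(positives, negatives, threshold):
    """Sensitivity and specificity when values >= threshold are called positive."""
    tp = sum(1 for p in positives if p >= threshold)
    tn = sum(1 for n in negatives if n < threshold)
    return tp / len(positives), tn / len(negatives)


# Invented normalized peak values, for illustration only.
bc_values = [0.9, 0.8, 0.7, 0.4]
nmhu_values = [0.6, 0.5, 0.3, 0.2]
print(auc(bc_values, nmhu_values))
print(sens_spec(bc_values, nmhu_values, threshold=0.65))
```

Moving the threshold trades sensitivity against specificity, and the ROC curve is the set of all such pairs.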

Logistic Regression Analysis
To identify an optimal and simplified biomarker set for distinguishing BC and NMHU patients, we performed a logistic regression analysis of the training set (n = 39). The 18 peaks were used as variables. The aim was to create a model with the best classification power using the smallest possible number of peaks. Based on the AIC and log-likelihood values, the best model for patient classification required 7 peaks (2 fluorescent and 5 absorption peaks): F(2.28 min); F(2.32 min); A(1.77 min); A(4.62 min); A(1.77 min); A(1.77 min); A(2.19 min) (Table 6). This classification model provided 100% sensitivity and 100% specificity.
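The AIC balances goodness of fit against model size: AIC = 2k − 2 ln L, where k is the number of fitted parameters and ln L is the maximized log-likelihood; the model with the lowest AIC is preferred. The sketch below shows the comparison step only. The candidate models and their log-likelihood values are placeholders invented for the example, not values from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Placeholder candidates: name -> (maximized log-likelihood, number of parameters).
# The parameter count is assumed to include the intercept.
candidates = {
    "model_a": (-14.0, 4),
    "model_b": (-5.0, 8),
    "model_c": (-4.5, 19),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print(best)
```

In this made-up example the largest model fits slightly better than the middle one, but the penalty for its extra parameters outweighs the gain.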

Discriminant Analysis
Partial least squares discriminant analysis (PLS-DA) was used to distinguish patients with BC and NMHU based on urinary metabolites (Figure 3). The model was created from a training set of samples (n = 39). All urine peaks identified by RP-HPLC (n = 334) were used for the model, which was built from two components. This model retained 33% of the variability of the original data. R2Y, as a measure of the prediction ability of the test, had a value of 86.2%. The goodness of prediction (prediction power) of the cross-validation test was 65.8% (test set of samples was used, n = 8). The permutation test gave a p-value of 0.05 for both pR2Y and pQ2.
We also performed an OPLS-DA analysis (Figure 4) to discriminate between BC and NMHU patients (training set of samples, n = 39). An OPLS-DA model was obtained with one predictive and two orthogonal components. The model retained 38.3% of the variability of the original data. The prediction rate of the test (separation ability of the test) was 91.6%, with cross-validation of 60.8% (test set of samples, n = 8). The permutation test gave a p-value of 0.05 for both pR2Y and pQ2. Potential biomarkers were selected based on the OPLS-DA model by the variable importance in projection score (VIP > 1, Figure 5).
Subsequently, external validation was performed on the test set consisting of eight patients (four BC patients and four NMHU patients). The model achieved 100% sensitivity and 80% specificity. In addition to external validation, we also classified all samples (training set + test set; n = 47); in this case, the sensitivity was 100% and the specificity increased to 96%. The test identified one of the 47 patients as a false positive.
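The sensitivity and specificity reported for the classification of all 47 samples follow directly from the confusion counts. The sketch below assumes, from the group sizes given earlier (23 BC, 24 NMHU) and the single false positive, that the counts are 23 true positives, 0 false negatives, 23 true negatives, and 1 false positive; the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of diseased patients correctly classified as diseased."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-diseased patients correctly classified as non-diseased."""
    return tn / (tn + fp)


# Counts inferred from the text: 23 BC patients all detected,
# 24 NMHU patients with one classified as BC.
sens = sensitivity(tp=23, fn=0)
spec = specificity(tn=23, fp=1)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

With 23 of 24 NMHU patients classified correctly, the specificity is about 95.8%, which rounds to the reported 96%.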

Discussion
The biggest challenge in bladder cancer (BC) diagnostics is to identify the disease before progression. Recently, there has been an increasing interest in non-invasive diagnostic tumor markers due to the invasive attribute of cystoscopy, which is still considered the gold standard diagnostic method [2]. However, markers published in the literature so far do not meet expectations for replacing cystoscopy due to their relatively low specificity and excessively high false-positive results, which can be mainly caused by frequently occurring hematuria [15]. Therefore, the decisive question for the urologists is to reliably and rapidly distinguish patients with bladder cancer from non-malignant hematuria (NMHU) patients. The urinary metabolomics-based diagnostic approach could have clinical relevance, because urine is in direct contact with bladder epithelial cells that may give rise to BC, and thus metabolites released from BC cells may be present in urine samples. Consistent with these findings, urine metabolomic analysis is a promising non-invasive approach for BC detection and marker discovery [8,9].
In this study, we focused on finding an appropriate set of biomarkers from the urinary metabolites of patients with BC and NMHU. For the separation and subsequent analysis of a large number of urine components, we chose HPLC with absorption and fluorescence detection, a non-invasive, relatively fast, affordable, and analytically undemanding technique. The obtained dataset was further subjected to various uni- and multi-dimensional statistical analyses and mathematical modeling. The total number of identified untargeted chromatographic peaks was 334 in each urine sample, of which 18 peaks were significantly different between BC and NMHU patients (Tables 3 and 4). These peaks represent urine metabolites with absorption and fluorescent abilities.
The 18 identified peaks were first subjected to a univariate ROC analysis. The best ROC analysis parameters (specificity of 74%, sensitivity of 90%) were provided by the fluorescent peak F(2.32 min) (Table 5).
Several other studies of univariate analysis of BC patients that included hematuria have used the BTA and NMP22 biomarkers [16][17][18]. The BTA stat test showed a sensitivity of 72% for differentiating BC from NMHU urine samples [18], which is lower than that of our biomarker F(2.32 min), and a specificity of 24-80% [16,18], which is comparable to our result for F(2.32 min). Another potential biomarker, the NMP22 BladderChek assay, with specificity in the range 77-96% and sensitivity of 51-85%, also exhibits false positivity for hematuria [17,[19][20][21]. Compared with NMP22, F(2.32 min) achieved higher sensitivity but lower specificity.
Considering none of the 18 identified peaks showed sufficient power to distinguish BC patients from NMHU, we further applied various multidimensional statistical analysis approaches on our multidimensional dataset of untargeted urine metabolites to identify and optimize the urine biomarker set.
Firstly, we performed a logistic regression analysis with the 18 peaks used as variables. Our model with the smallest possible number of peaks and the best classification power required seven peaks that correspond to the absorbing and fluorescent urine metabolites: A(1.77 min); A(4.62 min); A(1.77 min); A(1.77 min); A(2.19 min); F(2.28 min); and F(2.32 min) (Table 6). This model provided strong classification power to distinguish patients with BC from those with NMHU (100% sensitivity and 100% specificity).
Subsequently, the 334 chromatographic peaks of urine samples (training set) were subjected to partial least squares discriminant analysis (PLS-DA; Figure 3) and orthogonal projection to latent structures discriminant analysis (OPLS-DA; Figure 4) for classification between BC and NMHU patients. By external validation (test set), we were able to distinguish BC patients from NMHU patients with sensitivity and specificity of 100% and 80%, respectively. By testing the classification of all patients (training set + test set; n = 47) into diagnostic groups, the differential model achieved 100% sensitivity and 96% specificity. Jin et al. [22] used high-performance liquid chromatography-quadrupole time-of-flight mass spectrometry (HPLC-QTOFMS) to obtain urine metabolomic profiles of BC patients and a control group, which included healthy subjects and hematuria patients. Their OPLS-DA analysis afforded a sensitivity of 91.3% and a specificity of 92.5%, and their PLS-DA-based ROC curve analysis achieved 85% sensitivity and specificity. Compared with these results, our discriminant analyses achieved higher sensitivity and specificity.

Conclusions
Our work examined the possibilities of non-targeted urinary biomarkers to distinguish patients with malignant and non-malignant diseases of the bladder using 3D HPLC in combination with computer processing of multiple datasets (334 chromatographic peaks in each sample). By logistic regression, we identified an optimal and simplified set of seven chromatographic peaks (five absorption and two fluorescence) with strong classification power (100% sensitivity and 100% specificity) for distinguishing patients with bladder cancer from those with non-malignant hematuria. The differentiation model (OPLS-DA) diagnosed BC with a sensitivity and specificity of 100% and 96%, respectively. Monitoring chromatographic peaks with absorption and fluorescence detection thus showed potential as a non-invasive diagnostic test that can initially and rapidly distinguish patients with malignant and non-malignant hematuria. In addition, this method is fast and inexpensive, requires minimal sample preparation, and, according to our results, has the potential to achieve high accuracy. Prospectively, the proposed method could be an accessible ambulatory tool for diagnostics, monitoring of therapeutic progress, and recurrence prevention of bladder cancer.