L1CAM, CA9, KLK6, HPN, and ALDH1A1 as Potential Serum Markers in Primary and Metastatic Colorectal Cancer Screening

Background: Colorectal cancer (CRC) screening at the earlier stages could effectively decrease CRC-related mortality and incidence; however, accurate screening strategies are still lacking. Considerable interest has been generated in less invasive tests that require a small sample volume and can detect several cancer biomarkers simultaneously. For this reason, an ELISA-based method was used in this study. Methods: Concentrations of neural cell adhesion molecule L1 (L1CAM), carbonic anhydrase IX (CA9), mesothelin (MSLN), midkine (MDK), hepsin (HPN), kallikrein 6 (KLK6), transglutaminase 2 (TGM2), aldehyde dehydrogenase 1 family, member A1 (ALDH1A1), epithelial cell adhesion molecule (EpCAM), and cluster of differentiation 44 (CD44) in blood serum from 36 primary CRC and 24 metastatic CRC (mCRC) patients were determined using the MAGPIX® System (Luminex Corporation, USA). Results: Significantly increased concentrations (p < 0.05) of three serum biomarkers (L1CAM, CA9, and HPN) were found in mCRC when compared with primary CRC. HPN and KLK6 showed significant differences (p < 0.05) in concentration among different stages of CRC. In addition, levels of HPN and ALDH1A1 were significantly elevated (p < 0.05) in chemotherapy-treated CRC patients as compared with nontreated ones. Conclusion: These serum biomarkers could act as a potential early CRC diagnostic test, but further testing is needed.


Introduction
Colorectal cancer (CRC) is one of the most prevalent diseases with an alarming increase in incidence and mortality, particularly in developing countries [1]. According to Bray et al., in 2018, CRC was the third most prevalent diagnosed cancer and placed second for all cancer-related deaths [2]. Approximately 30% to 50% of the newly diagnosed CRC patients will quickly progress into later stages/metastatic CRC (mCRC), and their 5-year survival rate was around 50% to 60% [3,4]. It is beyond doubt that early detection, notably when the cancer lesions are localized and easy to remove,

Table 1. Selection rationale of the 10 markers from the Human Circulating Cancer Biomarker Multiplex Immunoassay Magnetic Bead Panel 4.

L1CAM
• A member of the immunoglobulin-like cell adhesion molecule family that is shown to be associated with a worse prognosis of CRC. [19][20][21]
• A major driver for tumor formation and metastasis initiation capability in CRC. [22,23]
• L1CAM contributes to peritoneal dissemination in CRC. [24]
• The regulation of L1CAM is strongly correlated with morphologic features at the invasive front in CRC. [25]

CA9
• A transmembrane glycoprotein involved in cell proliferation, angiogenesis, and a marker for hypoxia and acidosis, which is linked to poor prognosis in CRC. [26]
• CA9 correlates with perineural invasion in CRC. [27]

MSLN
• A cell surface membrane-bound glycoprotein which is highly expressed in solid CRC tumors (40-45%). [28,29]
• MSLN acts as a prognostic marker for stage II/III CRC. [30]

MDK
• A heparin-binding growth factor, which induces neo-lymphangiogenesis and exhibits anti-apoptotic, migration-promoting, and angiogenic properties. [31]
• Overexpression of MDK in the blood of CRC patients indicates a worse prognosis. MDK generally increases with increasing severity of cancer. [32][33][34]
• MDK adds value to multi-marker CRC biomarker panels. [34]

HPN
• A cell-surface type II transmembrane serine protease with genetic alteration in colon carcinoma (1.2%). [35,36]

ALDH1A1
• A cancer stem cell marker, where its nuclear expression is associated with shortened overall survival in CRC patients. [52]
• Overexpression of ALDH1A1 in CRC is associated with the presence of lymph node metastases and poor prognosis. [53,54]
• ALDH1A1 expression is associated with poor differentiation, "right-sidedness", and poor survival in human CRC. [55]

EpCAM
• A transmembrane glycoprotein cell adhesion molecule highly expressed on the surface of epithelium-originated tumor cells, including CRC. High expression of EpCAM is linked with an aggressive tumor phenotype in primary stages of CRC. [60][61][62][63][64][65]
• EpCAM-based assay (the CellSearch® System) is the only FDA-approved test for enrichment and detection of circulating tumor cells of cancers, including CRC. [66]

CD44
• A common CRC stem cell marker and is associated with tumor initiation, tumor progression, tumor growth, invasion, and metastasis. [67][68][69]
• Overexpression of CD44 in colon tissue is associated with cancer progression, aggressiveness in stage I and III sporadic CRC, poor differentiation, lymph node metastasis, and distant metastasis. [70]
• Low alteration frequency of CD44 found in mRNA is linked with the prediction of prognosis in CRC.

Patients and Serum Samples
The database of the Biobank, UKM Medical Molecular Biology Institute (UMBI) was searched for specimen collection. All the specimens in the Biobank were obtained according to the institutional ethical committee approval (UKM PPI/111/8/JEP-2017-583 27 August 2017), and the patients have given informed consent. Sixty CRC patients from Hospital Canselor Tuanku Muhriz were included in the study. All patients had serum samples stored in the Biobank, were Malaysians, and comprised all ages with all stages of CRC. None of the included patients had other types of cancer.
The patients were given a number stage based on their tumor, node, metastasis (TNM) system (T1 or T2, N0, M0: Stage I; T3 or T4, N0, M0: Stage II; any T, N1 or N2, M0: Stage III; any T, any N, M1: Stage IV) or Dukes' staging (Dukes' A: Stage I; Dukes' B: Stage II; Dukes' C: Stage III; Dukes' D: Stage IV) as shown in Table 2. They were also categorized based on the presence of distant metastasis, lymph node metastasis, and chemotherapy status.
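The stage grouping described above can be written as a small lookup. The following Python is only an illustrative sketch of that mapping; the function names `tnm_to_stage` and `dukes_to_stage` are hypothetical and are not part of the study's workflow.

```python
# Illustrative sketch of the stage grouping stated in the text.
# Function and variable names are hypothetical.

DUKES_TO_STAGE = {"A": "I", "B": "II", "C": "III", "D": "IV"}


def tnm_to_stage(t: str, n: str, m: str) -> str:
    """Map TNM categories to a number stage (I to IV)."""
    t, n, m = t.upper(), n.upper(), m.upper()
    if m == "M1":
        return "IV"  # any T, any N, M1
    if m != "M0":
        raise ValueError(f"unsupported M category: {m}")
    if n in {"N1", "N2"}:
        return "III"  # any T, N1 or N2, M0
    if n == "N0" and t in {"T3", "T4"}:
        return "II"  # T3 or T4, N0, M0
    if n == "N0" and t in {"T1", "T2"}:
        return "I"  # T1 or T2, N0, M0
    raise ValueError(f"unsupported TNM combination: {t} {n} {m}")


def dukes_to_stage(dukes: str) -> str:
    """Map a Dukes' class (A to D) to the equivalent number stage."""
    return DUKES_TO_STAGE[dukes.upper()]


if __name__ == "__main__":
    print(tnm_to_stage("T3", "N0", "M0"))  # stage for a node-negative T3 tumor
    print(dukes_to_stage("C"))
```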

Luminex Analyser MAGPIX® Multianalyte Profiling of Markers
This study was performed with the commercially available MILLIPLEX® MAP Human Circulating Cancer Biomarker Magnetic Bead Panel 4 (Merck KGaA, Darmstadt, Germany) based on the Luminex® xMAP® technology. Serum samples were diluted in assay buffer at a ratio of 1:5. Twenty-five microliters of each diluted sample was added to a mixture of fluorescent-coded magnetic beads precoated with the analyte-specific capture antibody. Binding of the antibodies to the analytes of interest (biomarkers) took place overnight (16 to 18 h) at 4 °C. Biotinylated detection antibodies were added the next day, followed by a one-hour incubation at room temperature. Then, Streptavidin-Phycoerythrin conjugate was added into each well to complete the reaction. Each microsphere was identified using the Luminex analyzer MAGPIX®, and the results were calculated based on fluorescent reporter signals. Mean fluorescence intensities were quantified using the xPonent 4.2 software (Luminex Corporation, Austin, Texas, United States), using a five-parameter logistic curve fitting to derive the analyte concentrations in each sample [72].
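To illustrate how a five-parameter logistic (5PL) standard curve converts a fluorescence reading back into a concentration, a minimal Python sketch follows. It uses one common parameterization of the 5PL curve and is not the xPonent implementation; the parameter values and the dilution factor of 5 (one reading of the 1:5 dilution) are assumptions chosen for illustration only.

```python
import math

# One common 5PL parameterization (an assumption, not the vendor's code):
#   y = d + (a - d) / (1 + (x / c) ** b) ** g
# a: response at zero concentration, d: response at infinite concentration,
# c: mid-range concentration parameter, b: slope, g: asymmetry factor.


def five_pl(x: float, a: float, b: float, c: float, d: float, g: float) -> float:
    """Predicted mean fluorescence intensity at concentration x."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y: float, a: float, b: float, c: float, d: float, g: float) -> float:
    """Concentration that yields response y on a fitted 5PL curve."""
    low, high = sorted((a, d))
    if not low < y < high:
        raise ValueError("response lies outside the asymptotes of the curve")
    ratio = (a - d) / (y - d)
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


if __name__ == "__main__":
    # Hypothetical fitted parameters, not values from the study.
    params = dict(a=12.0, b=1.3, c=4.0, d=25000.0, g=0.8)
    dilution_factor = 5  # assumed correction for the 1:5 sample dilution

    mfi = five_pl(2.5, **params)
    well_conc = inverse_five_pl(mfi, **params)
    print(math.isclose(well_conc, 2.5, rel_tol=1e-9))
    print(well_conc * dilution_factor)  # concentration in the undiluted sample
```

In practice the five parameters are estimated from the standards on each plate; only the back-calculation step is shown here.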

Statistical Analysis
The data were first evaluated with the D'Agostino and Pearson omnibus normality test to determine whether they were normally distributed. For comparisons of variables between two groups, Student's t-test or the Mann-Whitney U test was performed, depending on the normality test results, whereas one-way analysis of variance (ANOVA) or the Kruskal-Wallis test was used to compare three or more groups, followed by post hoc testing (Tukey's or Dunn's multiple comparisons test, respectively). Data were expressed as median with standard error and 95% CI. Data were analyzed with GraphPad Prism version 7.0 (GraphPad Software Inc., San Diego, California, United States). In addition, the receiver operating characteristic (ROC) curves and area under the curve (AUC) of selected biomarkers were calculated with 95% confidence intervals. The method of DeLong et al. was used to compare the AUCs of the ROC curves and their standard errors [73]. The cut point (sensitivity and specificity) was chosen based on the Youden index. Logistic regression was carried out to assess the diagnostic accuracy of selected biomarkers. These analyses were performed with MedCalc® version 19.2 (MedCalc Software Ltd., Ostend, Belgium) [74]. All tests were two-sided, and p-values of less than 0.05 were considered statistically significant.
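As a rough illustration of the ROC quantities named above, the Python sketch below computes an AUC as the probability that a value from the positive group exceeds one from the negative group, and selects the cut point that maximizes the Youden index J = sensitivity + specificity - 1. It is not the MedCalc procedure; the function names and toy values are hypothetical and are not study data, and it assumes that higher concentrations indicate the positive group.

```python
# Illustrative only: toy values, hypothetical function names.


def auc_rank(negatives, positives):
    """AUC as P(positive > negative), counting ties as one half."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def youden_cut_point(negatives, positives):
    """Return (J, cut, sensitivity, specificity) for the cut maximizing J.

    A sample is called positive when its value is >= cut.
    Ties in J are resolved in favor of the lowest cut.
    """
    best = None
    for cut in sorted(set(negatives) | set(positives)):
        sensitivity = sum(p >= cut for p in positives) / len(positives)
        specificity = sum(n < cut for n in negatives) / len(negatives)
        j = sensitivity + specificity - 1.0
        if best is None or j > best[0]:
            best = (j, cut, sensitivity, specificity)
    return best


if __name__ == "__main__":
    group_negative = [1.0, 2.0, 3.0, 4.0]  # e.g. one comparison group
    group_positive = [2.5, 3.5, 5.0, 6.0]  # e.g. the other comparison group
    print(auc_rank(group_negative, group_positive))
    print(youden_cut_point(group_negative, group_positive))
```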

Clinicopathological Features of Colorectal Cancer Patients
For this study, 60 CRC patients from Hospital Canselor Tuanku Muhriz were chosen. The age of diagnosis ranged from 30 to 89 years old. They were then categorized based on their clinicopathological features, as tabulated in Table 3. They were first divided into two subgroups: primary CRC (36 patients) and mCRC (24 patients). A number stage was also given to each patient based on their TNM system or Dukes' stage, resulting in 5 Stage I, 7 Stage II, 24 Stage III, and 24 Stage IV CRC patients. Among them, 10 had a history of chemotherapy, and 18 presented with lymph node metastasis.

Relationship between Analytes Concentrations and Clinicopathological Features of CRC
In an attempt to discriminate CRC on the basis of its clinicopathological features (Figure 2), CRC patients were classified based on their number staging, chemotherapy status, and lymph node metastasis (Table 3). Significant difference (p = 0.034) was encountered in HPN concentration between


Receiver Operating Characteristic Curve and Logistic Regression Analysis
To evaluate the test accuracy of the selected serum biomarkers that differed significantly (p < 0.05) among the CRC patients, the receiver operating characteristic (ROC) curves were plotted, and the area under the curve (AUC) was calculated (Figure 3, Table 4). The cutoff points were determined based on the Youden index. The highest AUC value for the comparison of primary and metastasized tumors was calculated for CA9 (0.686), followed by HPN (0.685) and L1CAM (0.661).
Among different CRC stages, the highest AUC value of 0.757 was found in a comparison of KLK6 between stage I & II and stage IV CRC patients (sensitivity = 87.5%). Another two models revealed an AUC value of 0.727 with a specificity of 87.50% (KLK6 stage III versus IV) and 0.701 with a specificity of 83.33% (HPN stage III versus IV). However, both models had low sensitivity.
The AUC was also calculated for comparisons of selected biomarkers between chemotherapy-treated and chemotherapy-naïve CRC patients. In these two models, HPN and ALDH1A1 revealed significant AUC values of 0.710 and 0.764, respectively. When stage IV CRC patients were compared based on their chemotherapy status, a significant difference (p = 0.031) was only identified in ALDH1A1 with an AUC value of 0.743.
Logistic regression was built to assess the accuracy of the diagnostic markers. Of the selected biomarkers, only five of them are significant (p < 0.05): (1)

Results were expressed as median with the lowest (minimum) and largest (maximum) concentrations, and standard error of means. Boxplot with * signified p < 0.05 and ** signified p < 0.01 among different models.

Concentrations of HPN and ALDH1A1 were significantly higher (p < 0.05) in CRC patients treated with chemotherapy than in chemotherapy-naive patients. As compared with chemotherapy-naive patients (2.533 ± 0.141 ng/mL), HPN concentration was higher (p = 0.036) in CRC patients treated with chemotherapy (3.201 ± 0.225 ng/mL). Similarly, there was an increase in the concentration of ALDH1A1 in chemotherapy-treated patients (581.060 ± 240.426 ng/mL) compared with nontreated ones (66.433 ± 34.389 ng/mL). When only stage IV CRC patients were divided into chemo versus no chemo groups, ALDH1A1 showed significant elevation (p = 0.048) in its concentration (581.060 ± 240.426 ng/mL) as compared with the latter (86.016 ± 93.285 ng/mL).


Multivariable Logistic Regression Models
Multivariable logistic regression analysis was also carried out to determine whether combining analytes would further enhance the diagnostic accuracy (Table 4). However, none of the models combining two or three biomarkers showed a significant improvement. The combination of three biomarkers (L1CAM, CA9, and HPN) raised the AUC to 0.719 with a sensitivity of 88.89% when compared with the individual analytes. Comparisons of the regression models for KLK6/HPN between stage III and IV (AUC = 0.727) and HPN/ALDH1A1 between chemotherapy-treated and chemotherapy-naïve CRC patients (AUC = 0.748) did not reveal any significance, although the latter showed the highest specificity of 98.00%.

Discussion
A serum marker or biomarker is a molecule that can be detected in the serum. It permits identification of a particular disease, in this case, CRC. Thus, marker-based assays have high prognostic and diagnostic values. They are crucial in early CRC detection for treatment selection and prediction of patients' outcomes [75]. Patient outcomes strongly depend on the tumor stage, metastatic capabilities, localization, and presence of distant metastases. Beyond any doubt, early diagnosis is crucial for successful treatment, especially in metastatic CRC. Although several studies have identified novel serum biomarkers, including insulin-like growth factor-binding protein 2 (IGFBP-2) [76], heat shock protein 60 (HSP60), and chitinase-3-like protein 1 (CHI3L1) [77], that are strongly correlated with metastasis of CRC, their poor selectivity and sensitivity have rendered the tests unsuccessful due to the high rates of false positives and false negatives. Thus, new biological markers for early diagnosis with higher sensitivity and specificity are urgently needed in clinical practice for better CRC treatment.
Most of the analytes chosen (L1CAM, CA9, MSLN, MDK, HPN, KLK6, TGM2, ALDH1A1, EpCAM, and CD44) have not been studied previously as biomarkers in early detection of CRC. The only scientific evidence involving this panel of markers was published in 2019 by Torres et al. The authors identified CD44, TGM2, and EpCAM as novel plasma markers for endometrial cancer detection [72]. However, a large body of literature has demonstrated the presence of these biomarkers during the progression of CRC (Table 1), suggesting their potential as serum-originated diagnostic markers [19,27,31]. In the present study, a multiplex ELISA-based approach was undertaken due to its advantages over conventional screening methods, including (1) high throughput, (2) lower sample volume requirements (in microliters), (3) ability to screen numerous analytes in a sample simultaneously, (4) ability to use specimens from noninvasive liquid biopsies (serum), (5) ability to evaluate levels of a given analyte separately, (6) ability to repeat the experimental assay under the same conditions, (7) ability to reliably detect analytes across a broad dynamic range of concentrations, (8) increased accuracy, (9) reduced time, labor, and cost, (10) reduced technical errors, and (11) easy translation into clinical practice [18,78,79].
Before conducting the study, the sample size was estimated using G*Power software version 3.1.9.4. We hypothesized a fold difference of at least 0.8 between 36 primary CRC and 24 mCRC patients, which yields a study power of 84.7% [80]. This number provides a balance between providing a precise estimate of accuracy with a wide confidence interval in screening tests and preventing wastage of resources [81]. In addition, for the basis estimation of screening tests, sensitivity was predetermined to be at least 50.0% within the null hypothesis, whereas a lower degree of specificity can be tolerated in a screening tool [82].
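The reported power can be roughly cross-checked with a normal approximation. The Python sketch below assumes that the value 0.8 is a standardized effect size (Cohen's d) and that the test is a two-sided two-sample comparison at alpha = 0.05; both are assumptions, since the text does not state the G*Power settings. Because it replaces the noncentral t distribution with a normal distribution, the result slightly overestimates the power and does not exactly reproduce the G*Power figure.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d: float, n1: int, n2: int, alpha: float = 0.05) -> float:
    """Normal-approximation power of a two-sided two-sample test.

    d is a standardized effect size (assumed here to be Cohen's d).
    """
    z = NormalDist()
    delta = d * sqrt(n1 * n2 / (n1 + n2))  # approximate noncentrality
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in either tail under the alternative.
    return (1.0 - z.cdf(z_crit - delta)) + z.cdf(-z_crit - delta)


if __name__ == "__main__":
    # Group sizes from the text: 36 primary CRC and 24 mCRC patients.
    print(round(approx_power_two_sample(0.8, 36, 24), 3))
```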
Based on the results, concentrations of L1CAM, CA9, and HPN were significantly elevated in mCRC as compared with primary CRC samples. The increased concentration of L1CAM and CA9 was in line with several published studies. For example, Kajiwara et al. found that overexpression of L1CAM was related to CRC tumor budding grade and solid cancer nests [25]. In 2020, Ganesh et al. demonstrated that L1CAM+ cells in human CRC had metastasis-initiating capacity, and L1CAM was required for orthotopic carcinoma propagation, liver metastatic colonization, and chemoresistance in CRC [22]. On the other hand, CA9, a hypoxia-inducible membrane-tethered protein, was believed to be closely related to carcinogenesis of CRC [83] and linked to poor prognosis of CRC [26]. Overexpression of CA9 in CRC was shown to be correlated with perineural invasion [27], which was a sign of tumor metastasis and invasion as well as an indication of poor outcome in CRC [84]. Another study in 2019 suggested that the co-localization of CA9 with phosphorylated ezrin (EZR) activated the hypoxia-autophagy-EZR pathway in tumor-initiating human cells and primary CRC tissues, supporting its clinical relevance [85]. In short, the elevation of L1CAM and CA9 concentration was expected and corresponded to results presented by other authors.
In contrast, although the concentration of HPN was increased significantly in mCRC when compared with primary CRC, and significant differences were observed among stage IV and chemotherapy-treated CRC patients, there is not yet any scientific evidence relating its expression to CRC metastasis [35,36]. Nonetheless, the generated data indicated the involvement of HPN in the metastatic progression of CRC. One plausible explanation was that overexpression of HPN was associated with matrix degradation (invasion and metastasis initiator), similar to that of prostate cancer [86]. Another assumption was that HPN was strongly associated with pathogenesis and early carcinogenesis of CRC, since it caused disorganization of the basement membrane and promoted primary prostate cancer progression and metastasis to liver, lung, and bone [87]. Curiously, low expression of HPN was associated with poor survival in breast cancer, renal cell carcinoma, and hepatocellular carcinoma [88][89][90]. Still, it was in parallel with the decrease in HPN concentration after chemotherapy treatment.
Accurate preoperative diagnosis is crucial in the management of CRC. If CRC is detected in the early stages, especially when the cancer lesions are localized, patients are likely to have better clinical outcomes [5]. To date, the majority of CRC cases are discovered only after the appearance of obvious signs and symptoms (usually signifying a late stage). Although colonoscopy and iFOBT are the most established CRC screening tests, they are limited by uptake and adherence [10,91]. Moreover, a rapid noninvasive screening method with high sensitivity and specificity is still unavailable [92]. Therefore, a reasonably accurate procedure for detecting CRC at its earlier stages would reduce its mortality and incidence rates [93]. For that purpose, CRC patients were divided into three groups of different stages (stage I & II, III, and IV), and the selected biomarkers were compared among these groups. Although the number of CRC patients in stage I and II was smaller than in stage III and stage IV, it did not affect the overall statistical power because a weighted mean (each subsample mean was weighted by sample size) was used [94,95], and the power of the study was based on the smallest sample size [96]. Furthermore, since our analysis did not include factorial ANOVA, where sample sizes are confounded in two or more factors, and post hoc tests (Dunn's test/Tukey) are performed, a possible reduction in statistical power generated was minimized [97]. A significant p-value indicated that there was a difference between the groups.
Based on the results, KLK6 expression was significantly elevated in both stages I & II and IV when compared with stage III CRC patients. The former was not surprising since overexpression of KLK6 was often seen in primary CRC tumors and was linked with tumor aggressiveness, enhanced migration, metastatic capabilities, and poor patient outcomes [39,40,98]. The latter was in line with other previously published studies. For instance, overexpression of KLK6 was related to epithelial-mesenchymal transition during CRC progression [45,99]. In 2019, Chen et al. discovered critical functions of KLK6 enzymes in CRC advancement to late stages via activation of the high mobility group A2 protein [99]. Furthermore, KLK6 expression in CRC correlated significantly with increasing tumor stage and histological grade [100] and was connected with a more advanced Dukes' stage, liver metastasis, and poor prognosis [37][38][39][40]. Upregulation of KLK6 was also believed to be associated with high depth of tumor invasion and presence of distant metastases, and to be an independent prognosticator of poor disease-free and overall survival in CRC patients [47]. Conversely, a possible explanation for the decrease of KLK6 in stage III CRC patients might be its tumor-suppressive [101,102] and immunologic properties, since downregulation of KLK6 was associated with the compromise of the immune system via regulation of lymphocyte survival and accelerated cancer progression [103].
Based on the analyzed results, the concentration of ALDH1A1 was significantly elevated in chemotherapy-treated and stage IV CRC patients. A possible explanation for this phenomenon is the metastatic progression of CRC itself rather than chemotherapy, since all chemotherapy-treated patients had stage IV CRC. In support of this, Kahlert et al. revealed that ALDH1A1 expression was not significantly connected with prognosis in CRC and did not predict response to chemotherapy in patients with metastatic diseases [52]. Additionally, ALDH1A1 promoted tumor angiogenesis via retinoic acid/HIF-1α/VEGF signaling [104], while being identified as an indication for poor CRC outcome [53,54].
Initially, an a priori analysis was done to ensure a study power of at least 80% among the included 60 CRC patients, but we later found that, even with our careful study design choices to minimize bias, samples exhibited large variations within each biomarker. This might be due to the unevenly distributed range and overlapping of marker concentrations, coupled with person-to-person variation among the samples [105]. Heterogeneity within clinical samples is crucial in producing data with less variation and more precision [106]. Increasing sample size might not be the solution and will magnify biases if selected samples show similar data distribution patterns, which do not reflect the total CRC population [107,108]. In short, heterogeneity within samples was preferred over large sample sizes.
One of the research questions in this study was whether combining two or three serum biomarkers would enhance diagnostic accuracy. Unfortunately, none of the combinations of selected biomarkers showed significant improvement. Where either sensitivity or specificity improved, the other decreased, or there was no improvement at all in both. We suspect that this could be due to the presence of outliers and the limited population of CRC patients. To train a reliable model, the inclusion of more data distribution patterns of CRC is needed.
Our study is not without limitations. Even though the sample size is 60, this preliminary study still has sufficient power to detect several significantly expressed circulating markers that hold the potential for future exploration. Additionally, since tumor tissue-derived proteins from the serum are likely to be low and diluted, especially during early stages of cancer, expression analysis of tumor tissue, for instance, immunohistochemistry and quantitative PCR, could be done to identify markers that are both tissue-specific and upregulated in CRC.

Conclusions
In conclusion, this study reported several serum biomarkers (L1CAM, CA9, KLK6, HPN, and ALDH1A1) that could act as a potential noninvasive screening tool for CRC, but further testing is needed.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study or in the writing of the manuscript. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.