Machine Learning Approach to Raman Spectrum Analysis of MIA PaCa-2 Pancreatic Cancer Tumor Repopulating Cells for Classification and Feature Analysis
Abstract
1. Introduction
2. Methodology
2.1. Feature Selection and Classification
2.2. Accuracy Metrics
3. Experimental Approach
3.1. Sample Preparation
3.2. Data Collection
3.3. Data Preprocessing and Analysis
4. Results and Discussion
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Siegel, R.L.; Miller, K.D.; Jemal, A. Cancer statistics, 2019. CA Cancer J. Clin. 2019, 69, 7–34.
- American Cancer Society. Cancer Facts & Figures 2019. Available online: https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts-figures/cancer-facts-figures-2019.html (accessed on 10 November 2019).
- Adamska, A.; Domenichini, A.; Falasca, M. Pancreatic ductal adenocarcinoma: Current and evolving therapies. Int. J. Mol. Sci. 2017, 18, 1338.
- Swayden, M.; Iovanna, J.; Soubeyran, P. Pancreatic cancer chemo-resistance is driven by tumor phenotype rather than tumor genotype. Heliyon 2018, 4, e01055.
- Subramaniam, D.; Kaushik, G.; Dandawate, P.; Anant, S. Targeting cancer stem cells for chemoprevention of pancreatic cancer. Curr. Med. Chem. 2018, 25, 2585–2594.
- Suraneni, M.V.; Badeaux, M.D. Tumor-initiating cells, cancer metastasis and therapeutic implications. In Madame Curie Bioscience Database [Internet]; Landes Bioscience: Austin, TX, USA, 2013.
- Williams, S.A.; Anderson, W.C.; Santaguida, M.T.; Dylla, S.J. Patient-derived xenografts, the cancer stem cell paradigm, and cancer pathobiology in the 21st century. Lab. Investig. 2013, 93, 970–982.
- Liu, J.; Tan, Y.; Zhang, H.; Zhang, Y.; Xu, P.; Chen, J.; Poh, Y.-C.; Tang, K.; Wang, N.; Huang, B. Soft fibrin gels promote selection and growth of tumorigenic cells. Nat. Mater. 2012, 11, 734–741.
- Qureshi-Baig, K.; Ullmann, P.; Haan, S.; Letellier, E. Tumor-Initiating cells: A criTICal review of isolation approaches and new challenges in targeting strategies. Mol. Cancer 2017, 16, 40.
- Auner, G.W.; Koya, S.K.; Huang, C.; Broadbent, B.; Trexler, M.; Auner, Z.; Elias, A.; Mehne, K.C.; Brusatori, M.A. Applications of Raman spectroscopy in cancer diagnosis. Cancer Metastasis Rev. 2018, 37, 691–717.
- Hassing, S. What is vibrational Raman spectroscopy: A vibrational or an electronic spectroscopic technique or both? In Modern Spectroscopic Techniques and Applications; IntechOpen: London, UK, 2019.
- Tan, P.N.; Steinbach, M.V. Introduction to Data Mining, 2nd ed.; Pearson: London, UK, 2018; ISBN 0-13-312890-3.
- Elarre, P.S.; Oyaga-Iriarte, E.; Yu, K.H.; Baudin, V.; Moreno, L.A.; Carranza, O.; Ortega, A.C.; Ponz-Sarvise, M.; Mejías Sosa, L.D.; Sastre, F.R.; et al. Use of machine-learning algorithms in intensified preoperative therapy of pancreatic cancer to predict individual risk of relapse. Cancers 2019, 11, 606.
- Liu, H.; Li, J.; Wong, L. A comparative study on feature selection and classification methods using gene expression profiles and proteomic patterns. Genome Inform. 2002, 13, 51–60.
- Thomas, A.; Tourassi, G.D.; Elmaghraby, A.S.; Valdes, R.; Jortani, S.A. Data mining in proteomic mass spectrometry. Clin. Proteom. 2006, 2, 13–32.
- Hilario, M.; Kalousis, A.; Pellegrini, C.; Müller, M. Processing and classification of protein mass spectra. Mass Spectrom. Rev. 2006, 25, 409–449.
- Li, L.; Umbach, D.M.; Terry, P.; Taylor, J.A. Application of the GA/KNN method to SELDI proteomics data. Bioinformatics 2004, 20, 1638–1640.
- Marchiori, E.; Heegaard, N.H.H.; West-Nielsen, M.; Jimenez, C.R. Feature selection for classification with proteomic data of mixed quality. In Proceedings of the 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, La Jolla, CA, USA, 15 November 2005; IEEE: La Jolla, CA, USA, 2005; pp. 1–7.
- Levner, I. Feature Selection and nearest centroid classification for protein mass spectrometry. BMC Bioinformatics 2005, 6, 68.
- Shipp, D.W.; Sinjab, F.; Notingher, I. Raman spectroscopy: Techniques and applications in the life sciences. Adv. Opt. Photonics 2017, 9, 315–428.
- Masson, L.E.; O’Brien, C.M.; Pence, I.J.; Herington, J.L.; Reese, J.; Van Leeuwen, T.G.; Mahadevan-Jansen, A. Dual excitation wavelength system for combined fingerprint and high wavenumber Raman spectroscopy. Analyst 2018, 143, 6049–6060.
- Borgognone, M.G.; Bussi, J.; Hough, G. Principal component analysis in sensory analysis: Covariance or correlation matrix? Food Qual. Prefer. 2001, 12, 323–326.
- Jolliffe, I.T. Principal Component Analysis; Springer Series in Statistics; Springer: Berlin, Germany, 1986.
- Subramanian, J.; Simon, R. Overfitting in prediction models–is it a problem only in high dimensions? Contemp. Clin. Trials 2013, 36, 636–641.
- Ghojogh, B.; Crowley, M. The theory behind overfitting, cross validation, regularization, bagging, and boosting: Tutorial. arXiv 2019, arXiv:1905.12787.
- Fleischmann, M.; Hendra, P.J.; McQuillan, A.J. Raman spectra of pyridine adsorbed at a silver electrode. Chem. Phys. Lett. 1974, 26, 163.
- Li, P.; Long, F.; Chen, W.; Chen, J.; Chu, P.K.; Wang, H. Fundamentals and applications of surface-enhanced Raman spectroscopy–based biosensors. Curr. Opin. Biomed. Eng. 2020, 13, 51–59.
- Ju, J.; Liu, W.; Perlaki, C.M.; Chen, K.; Feng, C.; Liu, Q. Sustained and cost effective silver substrate for surface enhanced Raman spectroscopy based biosensing. Sci. Rep. 2017, 7, 6917.
- Movasaghi, Z.; Rehman, S.; Rehman, I.U. Raman spectroscopy of biological tissues. Appl. Spectrosc. Rev. 2007, 42, 493–541.
- Schulz, H.; Baranska, M. Identification and qualification of valuable plant substances by IR and Raman spectroscopy. Vib. Spectrosc. 2007, 43, 13–25.
- Notingher, I.; Green, C.; Dyer, C.; Perkins, E.; Hopkins, N.; Lindsay, C.; Hench, L.L. Discrimination between ricin and sulphur mustard toxicity in vitro using Raman spectroscopy. J. R. Soc. Interface 2004, 1, 79–90.
- Chan, J.W.; Taylor, D.S.; Zwerdling, T.; Lane, S.M.; Ihara, K.; Huser, T. Micro-Raman spectroscopy detects individual neoplastic and normal hematopoietic cells. Biophys. J. 2006, 90, 648–656.
- Malini, R.; Venkatakrishna, K.; Kurien, J.M.; Pai, K.; Rao, L.; Kartha, V.B.; Krishna, C.M. Discrimination of normal, inflammatory, premalignant, and malignant oral tissue: A Raman spectroscopy study. Biopolymers 2006, 81, 179–193.
- Stone, N.; Kendall, C.; Smith, J.; Crow, P.; Barr, H. Raman spectroscopy for identification of epithelial cancers. Faraday Discuss. 2004, 126, 141–157.
- Gniadecka, M.; Wulf, H.C.; Mortensen, N.N.; Nielsen, O.F.; Christensen, D.H. Diagnosis of basal cell carcinoma by Raman spectroscopy. J. Raman Spectrosc. 1997, 28, 125–129.
- Farquharson, S.; Shende, C.; Inscore, F.E.; Maksymiuk, P.; Gift, A. Analysis of 5-fluorouracil in saliva using surface-enhanced Raman spectroscopy. J. Raman Spectrosc. 2005, 36, 208–212.
- Dukor, R.K. Vibrational spectroscopy in the detection of cancer. Handb. Vib. Spectrosc. 2006, 2006, 3335–3661.
- Ruiz-Chica, A.J.; Medina, M.A.; Sanchez-Jimenez, F.; Ramirez, F.J. Characterization by Raman spectroscopy of conformational changes on guanine-cytosine and adenine-thymine oligonucleotides induced by aminoxy analogues of spermidine. J. Raman Spectrosc. 2004, 35, 93–100.
- Lau, D.P.; Huang, Z.; Lui, H.; Man, C.S.; Berean, K.; Morrison, M.D.; Zeng, H. Raman spectroscopy for optical diagnosis in normal and cancerous tissue of the nasopharynx-preliminary findings. Lasers Surg. Med. 2003, 32, 210–214.
- Kaminaka, S.; Yamazaki, H.; Ito, T.; Kohda, E.; Hamaguchi, H. Near-infrared Raman spectroscopy of human lung tissues: Possibility of molecular-level cancer diagnosis. J. Raman Spectrosc. 2001, 32, 139–141.
- Cheng, W.T.; Liu, M.T.; Liu, H.N.; Lin, S.Y. Micro-Raman spectroscopy used to identify and grade human skin pilomatrixoma. Microsc. Res. Tech. 2005, 68, 75–79.
- Lakshmi, R.J.; Kartha, V.B.; Murali Krishna, C.R.; Solomon, J.G.; Ullas, G.; Uma Devi, P. Tissue Raman spectroscopy for the study of radiation damage: Brain irradiation of mice. Radiat. Res. 2002, 157, 175–182.
- Faolain, E.O.; Hunter, M.B.; Byrne, J.M.; Kelehan, P.; McNamara, M.; Byrne, H.J.; Lyng, F.M. A study examining the effects of tissue processing on human tissue sections using vibrational spectroscopy. Vib. Spectrosc. 2005, 38, 121–127.
- Caspers, P.J.; Bruining, H.A.; Puppels, G.J.; Lucassen, G.W.; Carter, E.A. In Vivo Confocal Raman Microspectroscopy of the skin: Noninvasive determination of molecular concentration profiles. J. Invest. Dermatol. 2001, 116, 434–442.
- Shafer-Peltier, K.E.; Haka, A.S.; Fitzmaurice, M.; Crowe, J.; Myles, J.; Dasari, R.R.; Feld, M.S. Raman microspectroscopic model of human breast tissue: Implications for breast cancer diagnosis in vivo. J. Raman Spectrosc. 2002, 33, 552–563.
- Frank, C.J.; McCreery, R.L.; Redd, D.C. Raman Spectroscopy of Normal and Diseased Human Breast Tissues. Anal. Chem. 1995, 67, 777–783.
- Mahadevan-Jansen, A.; Mitchell, M.F.; Ramanujam, N.; Malpica, A.; Thomsen, S.; Utzinger, U.; Richards-Kortum, R. Near-infrared Raman spectroscopy for in vitro detection of cervical precancers. Photochem. Photobiol. 1998, 68, 123–132.
- Naumann, D. Infrared and NIR Raman spectroscopy in medical microbiology. In Infrared Spectroscopy: New Tool in Medicine; SPIE: San Jose, CA, USA, 1998; Volume 3257, pp. 245–257.
- Silveira Jr, L.; Sathaiah, S.; Zângaro, R.A.; Pacheco, M.T.; Chavantes, M.C.; Pasqualucci, C.A. Correlation between near-infrared Raman spectroscopy and the histopathological analysis of atherosclerosis in human coronary arteries. Lasers Surg. Med. 2002, 30, 290–297.
- Shetty, G.; Kendall, C.; Shepherd, N.; Stone, N.; Barr, H. Raman spectroscopy: Evaluation of biochemical changes in carcinogenesis of oesophagus. Br. J. Cancer 2006, 94, 1460–1464.
- Krafft, C.; Neudert, L.; Simat, T.; Salzer, R. Near infrared Raman spectra of human brain lipids. Spectrochim. Acta Part A 2005, 61, 1529–1535.

| Method | Description |
|---|---|
| t-statistic [14] | |
| MIT Correlation [14] | |
| RELIEF [18] | Uses, for each sample, the nearest neighbor of that sample from the same class and the nearest neighbor of that sample from the opposite class. |
| PCA | Reshapes the space into fewer dimensions capturing the maximum variance. The kth eigenvector of the covariance matrix of the data defines the kth dimension, onto which each sample is projected. |
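
The two filter scores named in the table can be illustrated with a short sketch. It assumes the commonly used definitions, which may differ in detail from the authors' implementation: the t-statistic as |μ₁ − μ₂| / sqrt(σ₁²/n₁ + σ₂²/n₂) and the MIT correlation (signal-to-noise ratio) as |μ₁ − μ₂| / (σ₁ + σ₂), each computed per feature (wavenumber) from the two class means and standard deviations. The function names, the use of sample (n − 1) variance, and the toy intensities are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def t_score(a, b):
    """t-statistic for one feature; a and b hold its values in each class.

    Assumes both classes have non-zero sample variance for this feature.
    """
    return abs(mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def mit_score(a, b):
    """MIT correlation (signal-to-noise ratio) for one feature."""
    return abs(mean(a) - mean(b)) / (stdev(a) + stdev(b))


def rank_features(class_a, class_b, score):
    """Return feature indices ordered from most to least discriminative.

    class_a, class_b: lists of spectra, each spectrum a list of intensities.
    """
    n_features = len(class_a[0])
    scores = [
        score([s[j] for s in class_a], [s[j] for s in class_b])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Made-up intensities at three wavenumbers; only the last one separates the classes.
class_a = [[1.0, 5.0, 2.0], [1.2, 4.0, 2.1], [0.9, 6.0, 1.9]]
class_b = [[1.1, 5.5, 3.0], [1.0, 4.5, 3.1], [1.05, 5.0, 2.9]]

top_t = rank_features(class_a, class_b, t_score)
top_mit = rank_features(class_a, class_b, mit_score)
```

Keeping the first N indices of either ranking gives an N-dimensional reduced feature set, which is the role the "Dimensions" column plays in the results table.
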

| Classifier | Reduction Method | Dimensions | CV Acc. | CV Std. Dev. | Training CV Acc. | Training CV Std. Dev. |
|---|---|---|---|---|---|---|
| SVM | t-score | 35 | 0.982 | 0.064 | 0.911 | 0.018 |
| kNN (k = 1) | t-score + PCA = 3 | 35 | 0.982 | 0.064 | 1 | 0 |
| kNN (k = 3) | t-score + PCA = 3 | 40 | 0.982 | 0.064 | 0.980 | 0.005 |
| kNN (k = 1) | t-score + PCA = 3 | 45 | 1 | 0 | 1 | 0 |
| SVM | t-score | 45 | 0.982 | 0.064 | 0.937 | 0.024 |
| kNN (k = 1) | MIT + PCA = 3 | 45 | 0.982 | 0.064 | 1 | 0 |
| kNN (k = 1) | t-score + PCA = 3 | 50 | 0.982 | 0.064 | 1 | 0 |
| kNN (k = 1) | MIT + PCA = 3 | 55 | 0.982 | 0.064 | 1 | 0 |
| SVM | MIT | 55 | 0.982 | 0.064 | 0.876 | 0.024 |
| kNN (k = 3) | MIT + PCA = 3 | 60 | 0.982 | 0.064 | 0.979 | 0.008 |
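
How cross-validated and training accuracies of this kind are obtained can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it omits feature selection and PCA, and the fold count, the interleaved fold assignment, the Euclidean distance, and the toy data are all assumptions. It also shows why a kNN classifier with k = 1 reports a training accuracy of 1 with zero spread: each training spectrum is its own nearest neighbor.

```python
from math import dist
from statistics import mean, pstdev


def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k nearest training points (Euclidean distance).

    Ties between labels are broken arbitrarily.
    """
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_x)


def cross_validate(xs, ys, k=1, n_folds=3):
    """Return ((cv_mean, cv_std), (train_mean, train_std)) over the folds."""
    cv_acc, train_acc = [], []
    for f in range(n_folds):
        test_idx = set(range(f, len(xs), n_folds))  # interleaved folds
        tr_x = [x for i, x in enumerate(xs) if i not in test_idx]
        tr_y = [y for i, y in enumerate(ys) if i not in test_idx]
        te_x = [xs[i] for i in sorted(test_idx)]
        te_y = [ys[i] for i in sorted(test_idx)]
        cv_acc.append(accuracy(tr_x, tr_y, te_x, te_y, k))      # held-out fold
        train_acc.append(accuracy(tr_x, tr_y, tr_x, tr_y, k))   # fit data itself
    return (mean(cv_acc), pstdev(cv_acc)), (mean(train_acc), pstdev(train_acc))


# Made-up, well-separated two-class data in two dimensions.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.2, 0.1],
      [5.0, 5.1], [5.2, 5.0], [5.1, 5.3], [5.3, 5.2], [5.0, 5.4], [5.2, 5.5]]
ys = [0] * 6 + [1] * 6

cv_stats, train_stats = cross_validate(xs, ys, k=1, n_folds=3)
```

In a full pipeline, any feature ranking or PCA projection would be fitted on the training folds only and then applied to the held-out fold, so that the held-out accuracy is not inflated by information leaking from the test spectra.
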

| Wavenumber (cm⁻¹) | Selected By (T-Score, MIT) | Possible Source | Reference |
|---|---|---|---|
| 1116.4 | Both | CH2,6 in-plane bend and C1-Cα-Hα bend | [30] |
| 1201.5 | One only | Amide III (proteins); Amide III: C-N stretching and N-H bending | [31]; [32,33] |
| 1221.1 | One only | Amide III (β-sheet); Amide III (proteins) | [34]; [31,35] |
| 1234.2 | Both | A concerted ring mode | [36] |
| 1237.9 | One only | Amide III & CH2 wagging: glycine backbone and proline side chains | [37] |
| 1267.6 | Both | C-H (lipid in healthy tissue); Amide III (collagen assignment) | [33] |
| 1272.3 | One only | CH rocking | [30] |
| 1290.7 | One only | Cytosine | [38] |
| 1420.5 | One only | CH2 (lipid and protein); DNA/RNA; Deoxyribose (B, Z-marker) | [35,39]; [31]; [38] |
| 1488.2 | One only | Guanine (N7); Collagen | [38]; [40] |
| 1578.9 | One only | Guanine (N3); Guanine, adenine | [38]; [31] |
| 1610.4 | Both | Cytosine (NH2) | [38] |
| 1614.8 | One only | Tyrosine | [41] |
| 1634.0 | One only | Amide I | [37] |
| 1637.5 | One only | Amide I | [42,43] |
| 1650.5 | Both | Amide I | [33,44] |
| 1654.9 | One only | Amide I; C=C stretching; Collagen | [34,37,45,46]; [46]; [47] |
| 1660.9 | Both | Amide I; C=C (lipids, fatty acids); Ceramide backbone | [31,48,49]; [31,50,51]; [51] |
| 1664.4 | Both | Amide I | [41] |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Mandrell, C.T.; Holland, T.E.; Wheeler, J.F.; Esmaeili, S.M.A.; Amar, K.; Chowdhury, F.; Sivakumar, P. Machine Learning Approach to Raman Spectrum Analysis of MIA PaCa-2 Pancreatic Cancer Tumor Repopulating Cells for Classification and Feature Analysis. Life 2020, 10, 181. https://doi.org/10.3390/life10090181