# Statistical Classification for Raman Spectra of Tumoral Genomic DNA

## Abstract

## 1. Introduction

## 2. Methods

#### 2.1. Experimental Procedures

^{6} were centrifuged for 5 min at 4000 rpm. The resulting cell line pellets were processed to extract the genomic DNA. The cells were lysed in 1 mL of hypotonic lysis buffer (HEPES 10 mM, MgCl${}_{2}$ 1.5 mM, KCl 10 mM, and fresh dithiothreitol 5 mM), incubated for 15 min on ice, and centrifuged for 10 min at 2000 rpm and 4 ${}^{\circ}$C. To extract the genomic DNA, the pellet samples were incubated for 1 h at 37 ${}^{\circ}$C in 750 $\mathsf{\mu}$L of nuclear lysis buffer (Tris-HCl 10 mM, NaCl 400 mM, EDTA 2 mM, 75 $\mathsf{\mu}$L SDS 10%, 25 $\mathsf{\mu}$L of 10 $\mathsf{\mu}$g/$\mathsf{\mu}$L proteinase-K), treated with 250 $\mathsf{\mu}$L of NaCl 6 M, and centrifuged for 15 min at 2000 rpm and 4 ${}^{\circ}$C. The supernatants containing genomic DNA were recovered, precipitated by adding a double volume of 100% EtOH, and centrifuged for 10 min at 2000 rpm and 4 ${}^{\circ}$C. The DNA pellets were washed in 70% EtOH, centrifuged for 10 min at 7500 rpm and 4 ${}^{\circ}$C, and re-suspended in 100–200 $\mathsf{\mu}$L of DNase-free H${}_{2}$O. The DNA concentrations were measured with a spectrophotometer (Eppendorf BioSpectrometer® basic) by reading the absorbance at 260 nm, and the 260/280 absorbance ratio was checked to assess the purity of the DNA.

The DNA concentration was kept constant throughout the entire study at ca. 20 ng/$\mathsf{\mu}$L. This concentration was selected to achieve a sufficient signal-to-noise ratio for both the Raman spectra and the fluorescence images acquired after staining with Hoechst 33342 solution. To exclude any influence of the unavoidable morphological variations between fabrication batches of the Ag/SiNWs, for each experiment we deposited two drops, containing HaCaT and SK-MEL-28 cell DNA, respectively, on fresh substrates from the very same batch. In addition, we repeated each experiment three times, using different DNA samples from SK-MEL-28 and HaCaT cells.

#### 2.2. Data Set and Conveyed Information

^{−15} s); then, it relaxes to a vibrational energy state different from the initial one, emitting a photon whose energy consequently differs from that of the incident one. The total emitted energy, suitably normalized and expressed in arbitrary units, is called Raman intensity, whereas the energy difference between the incident (laser) light and the scattered (detected) light, expressed in cm${}^{-1}$, is called Raman shift.
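
As a minimal sketch of this definition, the Raman shift can be computed from the incident and scattered wavelengths with the standard wavenumber relation $\Delta\tilde{\nu} = 10^{7}\,(1/\lambda_{0} - 1/\lambda_{s})$, with both wavelengths in nm. The function name and the example wavelengths are illustrative assumptions, not values taken from the experiment.

```python
def raman_shift_cm1(incident_nm: float, scattered_nm: float) -> float:
    """Raman shift in cm^-1 from incident and scattered wavelengths given in nm."""
    # 1 nm = 1e-7 cm, so 1e7 / wavelength_nm is the wavenumber in cm^-1.
    return 1e7 / incident_nm - 1e7 / scattered_nm


# Hypothetical example: a 785 nm laser line and light detected at 850 nm.
shift = raman_shift_cm1(785.0, 850.0)
print(f"Raman shift: {shift:.1f} cm^-1")
```

With this sign convention, a positive shift means that the scattered photon has lower energy than the incident one (Stokes scattering).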

#### 2.3. Data Pre-Processing

^{−1}. Moreover, we obtained smoothed data by filtering the original raw spectra with the Savitzky–Golay algorithm [23] with polynomial order 5 (see also [24]) over a window of 90 data points treated as convolution coefficients. In Figure 1, we plot a raw spectrum and its smoothed version. At the highest and lowest wavenumbers, where the preprocessing procedure would otherwise truncate the spectrum, the original data are kept to avoid losing information at the edges.
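
The smoothing step can be sketched as follows. This is not the authors' implementation: it is a pure-Python illustration that derives the central-point Savitzky–Golay weights by exact least squares and applies them as a convolution, leaving the raw values at the two ends as described above. The function names are hypothetical, and the sketch assumes an odd window (91 points rather than 90) so that the window is symmetric around the smoothed point.

```python
from fractions import Fraction


def savgol_coefficients(window, order):
    """Central-point Savitzky-Golay convolution weights (odd window only)."""
    if window % 2 == 0 or order >= window:
        raise ValueError("window must be odd and larger than the polynomial order")
    half = window // 2
    xs = range(-half, half + 1)
    n = order + 1
    # Normal-equation matrix A = V^T V with V[j][k] = x_j ** k, kept exact.
    a = [[Fraction(sum(x ** (r + c) for x in xs)) for c in range(n)] for r in range(n)]
    rhs = [Fraction(int(r == 0)) for r in range(n)]  # selects the fitted value at x = 0
    # Gauss-Jordan elimination of the system A z = e_0.
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
                rhs[r] -= factor * rhs[col]
    z = [rhs[r] / a[r][r] for r in range(n)]
    # Weight of the sample at offset x is sum_k z_k * x**k.
    return [float(sum(z[k] * x ** k for k in range(n))) for x in xs]


def smooth_spectrum(intensities, window=91, order=5):
    """Smooth interior points; keep raw values where the window does not fit."""
    coeffs = savgol_coefficients(window, order)
    half = window // 2
    smoothed = list(intensities)  # the two edges stay equal to the raw data
    for i in range(half, len(intensities) - half):
        segment = intensities[i - half:i + half + 1]
        smoothed[i] = sum(c * y for c, y in zip(coeffs, segment))
    return smoothed
```

A useful sanity check is that any polynomial of degree at most `order` passes through the filter unchanged, which is the defining property of the Savitzky–Golay weights.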

#### 2.4. Two Statistical Approaches

#### The Local Method: PCA Analysis and Logistic Regression

#### 2.5. The Global Method: Geometric Analysis

## 3. Results and Discussion

## 4. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Stratton, M.R.; Campbell, P.J.; Futreal, P.A. The cancer genome. Nature **2009**, 458, 719–724.
2. Chen, M.; Zhao, H. Next-generation sequencing in liquid biopsy: Cancer screening and early detection. Hum. Genomics **2019**, 13, 1–10.
3. Alix-Panabieres, C.; Pantel, K. Clinical applications of circulating tumor cells and circulating tumor DNA as liquid biopsy. Cancer Discov. **2016**, 6, 479–491.
4. Kong, K.; Kendall, C.; Stone, N.; Notingher, I. Raman spectroscopy for medical diagnostics: From in-vitro biofluid assays to in-vivo cancer detection. Adv. Drug Deliv. Rev. **2015**, 89, 121–134.
5. Liu, Z.; Parida, S.; Prasad, R.; Pandeya, R.; Sharma, D.; Barman, I. Vibrational spectroscopy for decoding cancer microbiota interactions: Current evidence and future perspective. In Seminars in Cancer Biology; Academic Press: Cambridge, MA, USA, 2021.
6. Mussi, V.; Ledda, M.; Polese, D.; Maiolo, L.; Paria, D.; Barman, I.; Lolli, M.G.; Lisi, A.; Convertino, A. Silver-coated silicon nanowire platform discriminates genomic DNA from normal and malignant human epithelial cells using label-free Raman spectroscopy. Mater. Sci. Eng. C **2021**, 122, 111951.
7. Movasaghi, Z.; Rehman, S.; Rehman, I.U. Raman Spectroscopy of Biological Tissues. Appl. Spectrosc. Rev. **2007**, 42, 493–541.
8. Talari, A.C.S.; Movasaghi, Z.; Rehman, S.; Rehman, I.U. Raman Spectroscopy of Biological Tissues. Appl. Spectrosc. Rev. **2015**, 50, 46–111.
9. Petry, R.; Schmitt, M.; Popp, J. Raman spectroscopy: A prospective tool in the life sciences. ChemPhysChem **2003**, 4, 14–30.
10. Fleischmann, M.; Hendra, P.J.; McQuillan, A.J. Raman spectra of pyridine adsorbed at a silver electrode. Chem. Phys. Lett. **1974**, 26, 163–166.
11. Haynes, C.L.; McFarland, A.D.; Van Duyne, R.P. Surface-Enhanced Raman Spectroscopy. Anal. Chem. **2005**, 77, 338A–346A.
12. Stiles, P.L.; Dieringer, J.A.; Shah, N.C.; Van Duyne, R.P. Surface-Enhanced Raman Spectroscopy. Annu. Rev. Anal. Chem. **2008**, 1, 601–626.
13. Convertino, A.; Mussi, V.; Maiolo, L. Disordered array of Au covered Silicon nanowires for SERS biosensing combined with electrochemical detection. Sci. Rep. **2016**, 6, 25099.
14. Convertino, A.; Mussi, V.; Maiolo, L.; Ledda, M.; Lolli, M.G.; Bovino, F.A.; Fortunato, G.; Rocchia, M.; Lisi, A. Array of disordered silicon nanowires coated by a gold film for combined NIR photothermal treatment of cancer cells and Raman monitoring of the process evolution. Nanotechnology **2018**, 29, 415102.
15. Zhang, B.; Wang, H.; Lu, L.; Ai, K.; Zhang, G.; Cheng, X. Large-Area Silver-Coated Silicon Nanowire Arrays for Molecular Sensing Using Surface-Enhanced Raman Spectroscopy. Adv. Funct. Mater. **2008**, 18, 2348–2355.
16. Galopin, E.; Barbillat, J.; Coffinier, Y.; Szunerits, S.; Patriarche, G.; Boukherroub, R. Silicon Nanowires coated with Silver Nanostructures as Ultrasensitive Interfaces for Surface-Enhanced Raman Spectroscopy. ACS Appl. Mater. Interfaces **2009**, 7, 1396–1403.
17. Zhang, M.-L.; Fan, X.; Zhou, H.-W.; Shao, M.-W.; Zapien, J.A.; Wong, N.-B.; Lee, S.-T. A High-Efficiency Surface-Enhanced Raman Scattering Substrate Based on Silicon Nanowires Array Decorated with Silver Nanoparticles. J. Phys. Chem. C **2010**, 114, 1969–1975.
18. Paria, D.; Convertino, A.; Mussi, V.; Maiolo, L.; Barman, I. Silver-Coated Disordered Silicon Nanowires Provide Highly Sensitive Label-Free Glycated Albumin Detection through Molecular Trapping and Plasmonic Hotspot Formation. Adv. Healthc. Mater. **2021**, 10, 2001110.
19. Schmidt, M.S.; Hübner, J.; Boisen, A. Large Area Fabrication of Leaning Silicon Nanopillars for Surface Enhanced Raman Spectroscopy. Adv. Mater. **2012**, 24, OP11–OP18.
20. Weber, C.E.M.; Luo, C.; Hotz-Wagenblatt, A.; Gardyan, A.; Kordaß, T.; Holland-Letz, T.; Osen, W.; Eichmüller, S.B. miR-339-3p Is a Tumor Suppressor in Melanoma. Cancer Res. **2016**, 76, 3562–3571.
21. Boukamp, P.; Petrussevska, R.T.; Breitkreutz, D.; Hornung, J.; Markham, A.; Fusenig, N.E. Normal keratinization in a spontaneously immortalized aneuploid human keratinocyte cell line. J. Cell Biol. **1988**, 106, 761–771.
22. Testa, U.; Castelli, G.; Pelosi, E. Melanoma: Genetic abnormalities, tumor progression, clonal evolution and tumor initiating cells. Med. Sci. **2017**, 5, 28.
23. Savitzky, A.; Golay, M.J.E. Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Anal. Chem. **1964**, 36, 1627–1639.
24. Zimmermann, B.; Kohler, A. Optimizing Savitzky–Golay parameters for improving spectral resolution and quantification in infrared spectroscopy. Appl. Spectrosc. **2013**, 67, 892–902.
25. Hastie, T.; Tibshirani, R.; Wainwright, M. Statistical Learning with Sparsity: The Lasso and Generalizations; CRC Press: Boca Raton, FL, USA; Routledge: London, UK, 2015.
26. Jackson, J.E. A User’s Guide to Principal Components; Wiley: Hoboken, NJ, USA, 1991.
27. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2009.
28. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. **2006**, 28, 861–874.
29. Chicco, D.; Warrens, M.J.; Jurman, G. The Matthews correlation coefficient (MCC) is more informative than Cohen’s Kappa and Brier score in binary classification assessment. IEEE Access **2021**, 9, 78368–78381.

**Figure 1.** Comparison between a raw Raman spectrum and the curve (in red) obtained by performing Savitzky–Golay filtering.

**Figure 2.** For the healthy (**left**) and tumoral (**right**) data sets, we report the smoothed spectra (black solid lines), the average spectra (red solid lines), and the extreme curves computed by adding to and subtracting from the average Raman intensity three times the standard deviation (red dashed lines), which label the decision surfaces.
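
The curves described in this caption can be sketched as a pointwise computation over a set of spectra sampled on a common wavenumber grid. The function name is hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample one is an assumption, since the excerpt does not say which was used.

```python
from statistics import mean, pstdev


def envelope(spectra, k=3.0):
    """Return the average curve and the average -/+ k standard deviations.

    `spectra` is a list of spectra, each a list of intensities on the same grid.
    """
    columns = list(zip(*spectra))  # one tuple of intensities per wavenumber
    average = [mean(col) for col in columns]
    spread = [pstdev(col) for col in columns]
    lower = [m - k * s for m, s in zip(average, spread)]
    upper = [m + k * s for m, s in zip(average, spread)]
    return average, lower, upper


# Toy data: two spectra sampled at two wavenumbers.
average, lower, upper = envelope([[1.0, 2.0], [3.0, 2.0]])
```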

**Figure 3.** The red and the blue solid lines are, respectively, the average spectrum of the healthy and the tumoral data set. The point-dashed and the dashed lines report, respectively, one healthy and one tumoral pre-processed spectrum. For the healthy spectrum, $d_{t} = 14.9 \times 10^{6}$ and $d_{h} = 4.7 \times 10^{6}$. For the tumoral spectrum, $d_{t} = 2.8 \times 10^{6}$ and $d_{h} = 16.6 \times 10^{6}$.
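
The comparison between the distance from the healthy average, $d_{h}$, and the distance from the tumoral average, $d_{t}$, can be illustrated with a nearest-average rule. This excerpt does not give the distance used by the global method, nor how the tuning parameter $\tau$ enters, so the Euclidean distance and the function names below are assumptions made purely for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two spectra sampled on the same grid."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_average(spectrum, healthy_average, tumoral_average):
    """Return (label, d_h, d_t), labelling by the closer average spectrum."""
    d_h = euclidean_distance(spectrum, healthy_average)
    d_t = euclidean_distance(spectrum, tumoral_average)
    label = "healthy" if d_h <= d_t else "tumoral"
    return label, d_h, d_t


# Toy data standing in for pre-processed spectra (not measured values).
label, d_h, d_t = nearest_average([1.0, 2.0, 1.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0])
```

The values in the caption follow the same pattern: the healthy spectrum has $d_{h} < d_{t}$ and the tumoral spectrum has $d_{t} < d_{h}$.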

**Figure 4.** ROC curves for the local (**left panel**) and the global (**right panel**) methods. The markers indicate the points corresponding to Youden's index.
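
For readers unfamiliar with these quantities, the sketch below builds a ROC curve from classifier scores, computes the area under it by the trapezoidal rule, and selects the threshold maximizing Youden's index $J = \mathrm{TPR} - \mathrm{FPR}$. The function names, the convention that label 1 is the positive class, and the toy data are assumptions, and the code is not the procedure used to produce the figure.

```python
def roc_points(scores, labels):
    """List of (threshold, FPR, TPR), predicting positive when score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve, starting from the origin (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, fpr, tpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


def youden_optimum(points):
    """ROC point maximizing Youden's index J = TPR - FPR."""
    return max(points, key=lambda p: p[2] - p[1])


# Toy scores and labels (1 = positive class), not data from the study.
points = roc_points([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0])
best_threshold, best_fpr, best_tpr = youden_optimum(points)
```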

**Figure 5.** (**Left**) The four solid lines report the loadings associated with the first four principal components (blue first, brown second, green third, and yellow fourth); the dashed black lines are the intensity, i.e., the square root of the sum of the squares, of the loadings of the first four components. (**Right**) Percentage of variance as a function of the principal component index.
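
The intensity defined in this caption is a pointwise Euclidean norm across components. A minimal sketch, with a hypothetical function name and toy loadings rather than the fitted ones:

```python
import math


def loading_intensity(loadings):
    """Square root of the sum of squared loadings at each wavenumber.

    `loadings` holds one list per principal component, all on the same grid.
    """
    return [math.sqrt(sum(v * v for v in column)) for column in zip(*loadings)]


# Two toy components sampled at two wavenumbers.
intensity = loading_intensity([[3.0, 0.0], [4.0, 1.0]])
```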

**Figure 6.** Projections onto the coordinate planes of the distribution of the first four principal components. The healthy and the tumoral samples are labeled by blue and red points, respectively.

Method | Optimal Tuning Parameter | Area under the Curve |
---|---|---|
local | $\lambda =0.46$ | 0.899 |
global | $\tau =0.34$ | 0.871 |

Principal Component | Standard Deviation | Proportion of Variance | Cumulative Proportion |
---|---|---|---|
PC1 | 5157.11860 | 0.78068 | 0.78068 |
PC2 | 2311.49810 | 0.15684 | 0.93752 |
PC3 | 866.34106 | 0.02203 | 0.95955 |
PC4 | 581.91444 | 0.00994 | 0.96949 |
PC5 | 496.66847 | 0.00724 | 0.97673 |
PC6 | 405.79667 | 0.00483 | 0.98156 |
PC7 | 327.32049 | 0.00314 | 0.98471 |
PC8 | 284.63234 | 0.00238 | 0.98708 |
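
The columns of this table are linked by simple arithmetic: the proportion of variance of a component is its squared standard deviation divided by the total variance, and the cumulative proportion is a running sum. The sketch below checks this on the tabulated values. Since only eight components are listed, the total variance is inferred from the PC1 row, which is an assumption of this sketch.

```python
from itertools import accumulate

# Values copied from the table above (PC1 to PC8).
sdev = [5157.11860, 2311.49810, 866.34106, 581.91444,
        496.66847, 405.79667, 327.32049, 284.63234]
proportion = [0.78068, 0.15684, 0.02203, 0.00994,
              0.00724, 0.00483, 0.00314, 0.00238]

# Total variance inferred from PC1: variance_1 / proportion_1.
total_variance = sdev[0] ** 2 / proportion[0]

# Proportion of variance recomputed from the standard deviations.
recomputed = [s ** 2 / total_variance for s in sdev]

# Cumulative proportion as a running sum of the tabulated proportions.
cumulative = list(accumulate(proportion))

for name, p, c in zip(range(1, 9), recomputed, cumulative):
    print(f"PC{name}: proportion {p:.5f}, cumulative {c:.5f}")
```

The recomputed values agree with the table up to rounding in the last reported digit.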

i | Estimate | Standard Deviation |
---|---|---|
0 | −6.6094526 | 0.2424116 |
1 | −0.0014903 | 0.0000540 |
2 | −0.0002585 | 0.0000440 |
3 | −0.0020081 | 0.0001042 |
4 | −0.0007470 | 0.0001154 |
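
To show how such estimates would be used, the sketch below evaluates a logistic model with these coefficients. Several points are assumptions not stated in this excerpt: that $i = 0$ is the intercept, that $i = 1, \dots, 4$ multiply the scores of the first four principal components, that the tuning parameter $\lambda = 0.46$ of the local method acts as a probability cutoff, and that the positive class is the one coded as 1 in the fit. The function names and the example scores are hypothetical.

```python
import math

# Estimates copied from the table above: intercept, then PC1 to PC4.
BETA = [-6.6094526, -0.0014903, -0.0002585, -0.0020081, -0.0007470]


def probability(pc_scores, beta=BETA):
    """Logistic probability for the scores of the first four principal components."""
    eta = beta[0] + sum(b * s for b, s in zip(beta[1:], pc_scores))
    # Numerically stable evaluation of 1 / (1 + exp(-eta)).
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def classify(pc_scores, threshold=0.46):
    """True when the probability reaches the assumed cutoff lambda."""
    return probability(pc_scores) >= threshold


# Hypothetical principal component scores, for illustration only.
p_example = probability([-6000.0, 0.0, 0.0, 0.0])
```

Because all four slopes are negative, increasingly negative component scores push the probability towards 1 under these assumptions.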

Population (col.) vs. Prediction (Row) | Positive (%) | Negative (%) |
---|---|---|
Positive (%) | 44.6/44.6 | 11.6/13.8 |
Negative (%) | 5.3/5.4 | 38.4/36.2 |
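
Standard summary metrics follow directly from such a confusion matrix. The sketch below computes sensitivity, specificity, accuracy, Youden's index, and the Matthews correlation coefficient. It reads each cell as "local/global", an interpretation that is consistent with the row and column sums of the joint-outcome table (Table 5) but is not stated explicitly in this excerpt; the function name is hypothetical.

```python
import math


def metrics(tp, fp, fn, tn):
    """Summary metrics from the four cells of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "youden": sensitivity + specificity - 1.0,
        "mcc": mcc,
    }


# Percentages from the table, first value of each cell (assumed local method).
local = metrics(tp=44.6, fp=11.6, fn=5.3, tn=38.4)
# Second value of each cell (assumed global method).
global_ = metrics(tp=44.6, fp=13.8, fn=5.4, tn=36.2)

for name, result in (("local", local), ("global", global_)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```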

**Table 5.** Performance of the outcomes for the joint methods. Empty cells correspond to impossible combinations of outcomes.

Local (row) vs. Global (col.) | TP (%) | FP (%) | FN (%) | TN (%) |
---|---|---|---|---|
TP (%) | 42.1 | | 2.5 | |
FP (%) | | 9.7 | | 1.9 |
FN (%) | 2.5 | | 2.9 | |
TN (%) | | 4.1 | | 34.3 |

Local (row) vs. Global (col.) | Correct Predictions (%) | Wrong Predictions (%) |
---|---|---|
correct predictions (%) | 76.4 | 6.6 |
wrong predictions (%) | 4.4 | 12.6 |
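
The joint-outcome tables can be cross-checked by arithmetic. A truly positive sample can only be a TP or an FN for either method, and a truly negative sample only an FP or a TN, so each row of Table 5 has exactly two possible entries. The sketch below assigns the eight reported percentages to those cells (an assumption about the layout, justified by the caption of Table 5) and recomputes the per-method totals and the agreement table. The variable and function names are hypothetical.

```python
# Joint percentages keyed by (local outcome, global outcome).
JOINT = {
    ("TP", "TP"): 42.1, ("TP", "FN"): 2.5,
    ("FP", "FP"): 9.7, ("FP", "TN"): 1.9,
    ("FN", "TP"): 2.5, ("FN", "FN"): 2.9,
    ("TN", "FP"): 4.1, ("TN", "TN"): 34.3,
}
CORRECT = {"TP", "TN"}


def marginal(axis, outcome):
    """Total percentage for one outcome of the local (axis 0) or global (axis 1) method."""
    return sum(v for key, v in JOINT.items() if key[axis] == outcome)


def agreement(local_correct, global_correct):
    """Percentage of samples with the given correctness for the two methods."""
    return sum(
        v for (loc, glo), v in JOINT.items()
        if (loc in CORRECT) == local_correct and (glo in CORRECT) == global_correct
    )


for outcome in ("TP", "FP", "FN", "TN"):
    print(outcome, round(marginal(0, outcome), 1), round(marginal(1, outcome), 1))
print("both correct:", round(agreement(True, True), 1))
print("both wrong:", round(agreement(False, False), 1))
```

The recomputed totals reproduce the single-method confusion matrix up to rounding in the last digit, and the four agreement sums reproduce the correct/wrong table.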

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Durastanti, C.; Cirillo, E.N.M.; De Benedictis, I.; Ledda, M.; Sciortino, A.; Lisi, A.; Convertino, A.; Mussi, V.
Statistical Classification for Raman Spectra of Tumoral Genomic DNA. *Micromachines* **2022**, *13*, 1388.
https://doi.org/10.3390/mi13091388
