Machine Learning-Enabled Intelligent Analysis of Surface-Enhanced Raman Scattering: Methods, Applications, and Perspectives
Abstract
1. Introduction
2. SERS Data Characteristics and Analytical Challenges
2.1. Typical Characteristics of SERS Spectra
2.2. Data Preprocessing and Feature Engineering
2.2.1. Stage 1: Intra-Spectrum Feature Diagnostics
- (i) Cosmic Ray Removal: SERS spectra are typically collected with highly sensitive charge-coupled device (CCD) detectors, which are susceptible to cosmic ray strikes. These appear as sharp, intense, narrow positive spikes. If they are not removed, ML models may mistake these random spikes for genuine Raman features. Cosmic ray removal, for example by median filtering or derivative-based methods, is therefore an essential first step [23,24,25].
- (ii) Baseline Correction: Following spike removal, baseline correction is applied to eliminate broad fluorescence backgrounds and instrumental drift. Polynomial fitting [26] and penalized least squares approaches, most notably asymmetric least squares (AsLS, also abbreviated ALS) and its improved variants [27,28], are widely used to estimate and subtract slowly varying baseline components, thereby recovering the underlying Raman signal profile.
- (iii) Noise Reduction/Smoothing: To suppress high-frequency random noise without distorting peak shapes, smoothing techniques such as Savitzky–Golay filtering [29] and wavelet-based denoising [30] are then applied. This step improves the signal-to-noise ratio (SNR), especially for spectra with weak Raman scattering.
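The three Stage 1 steps above can be sketched as a short NumPy pipeline. This is a minimal illustration, not the implementation used in the cited works: the function names, window sizes, threshold, and the `lam` and `p` values are assumptions chosen for readability. The despiking step is a plain median-filter variant, and the baseline step uses a dense solve that is only practical for spectra of a few thousand points or fewer.

```python
import numpy as np


def despike(y, window=7, threshold=6.0):
    """Replace narrow positive spikes with the local median (window must be odd)."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    padded = np.pad(y, half, mode="edge")
    local_median = np.median(
        np.lib.stride_tricks.sliding_window_view(padded, window), axis=1
    )
    residual = y - local_median
    # Robust noise scale from the median absolute deviation (MAD).
    mad = np.median(np.abs(residual - np.median(residual)))
    scale = 1.4826 * mad + 1e-12
    spikes = residual > threshold * scale  # cosmic rays are positive-going
    cleaned = y.copy()
    cleaned[spikes] = local_median[spikes]
    return cleaned


def asls_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Asymmetric least squares baseline (dense solve, fine for short spectra)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    second_diff = np.diff(np.eye(n), n=2, axis=0)  # shape (n - 2, n)
    penalty = lam * second_diff.T @ second_diff
    weights = np.ones(n)
    for _ in range(n_iter):
        baseline = np.linalg.solve(np.diag(weights) + penalty, weights * y)
        # Points above the baseline (peaks) get the small weight p.
        weights = np.where(y > baseline, p, 1.0 - p)
    return baseline


def savgol_smooth(y, window=11, order=3):
    """Savitzky-Golay smoothing via a least-squares polynomial convolution kernel."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, order + 1, increasing=True)
    kernel = np.linalg.pinv(design)[0]  # weights for the fitted value at the centre
    padded = np.pad(y, half, mode="edge")
    return np.convolve(padded, kernel[::-1], mode="valid")


def preprocess(y):
    """Stage 1 in the order given above: despike, subtract baseline, smooth."""
    cleaned = despike(y)
    return savgol_smooth(cleaned - asls_baseline(cleaned))
```

The order matters in this sketch: a spike left in place would pull both the baseline estimate and the smoothing kernel toward it. In practice the smoothness penalty `lam` and asymmetry `p` usually need tuning per instrument and sample matrix, and a short median window can slightly flatten the apex of very sharp genuine peaks.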
2.2.2. Stage 2: Dataset-Level Feature Diagnostics
- (iv) Peak Alignment: In practical SERS measurements, subtle shifts in Raman peak positions (typically a few cm⁻¹) are frequently observed due to variations in molecular adsorption orientation, thermal effects, or slight instrumental miscalibration. For ML models that rely on strict wavenumber registration, unaligned peaks can be misinterpreted as different chemical species. Spectral alignment techniques such as correlation optimized warping (COW) or dynamic time warping (DTW) are therefore used to align prominent marker bands across the dataset [31,32,33,34].
- (v) Spectral Binning and Resampling: When aggregating SERS data from multiple laboratories or different instruments, a major challenge is the discrepancy in spectral resolution and step sizes. ML models require inputs of uniform dimensionality. Spectral binning or spline interpolation-based resampling addresses this by standardizing the number of data points per spectrum [35,36,37].
- (vi) Data Augmentation: The application of deep learning to SERS is often hindered by data scarcity, because acquiring tens of thousands of reproducible SERS spectra experimentally is costly. To keep models from overfitting on small or imbalanced datasets, the training set is commonly expanded by adding white noise, applying minor spectral shifts, or generating artificial spectra with Generative Adversarial Networks (GANs) [38,39,40,41].
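The Stage 2 steps above can be illustrated with a few NumPy helpers. This is a simplified sketch under stated assumptions rather than any of the cited methods: `resample` uses linear interpolation as a stand-in for spline resampling, `align_to_reference` applies a single rigid integer shift found by correlation instead of COW or DTW warping, and `augment` covers only the noise and shift strategies, not GAN-based generation. All function names and default values are hypothetical.

```python
import numpy as np


def resample(shift_axis, intensity, common_axis):
    """Interpolate one spectrum onto a shared Raman-shift grid.

    shift_axis must be increasing; values outside its range are clamped
    to the end points by np.interp.
    """
    return np.interp(common_axis, shift_axis, intensity)


def align_to_reference(spectrum, reference, max_shift=10):
    """Rigid alignment: pick the integer shift that maximizes overlap.

    np.roll wraps points around the ends, so this assumes the spectrum
    edges carry no important bands.
    """
    best_shift, best_score = 0, -np.inf
    for shift in range(-max_shift, max_shift + 1):
        score = float(np.dot(np.roll(spectrum, shift), reference))
        if score > best_score:
            best_shift, best_score = shift, score
    return np.roll(spectrum, best_shift), best_shift


def augment(spectrum, rng, n_copies=5, noise_sd=0.01, max_shift=2):
    """Create noisy, slightly shifted copies of one preprocessed spectrum."""
    spectrum = np.asarray(spectrum, dtype=float)
    copies = np.empty((n_copies, spectrum.size))
    for i in range(n_copies):
        shift = int(rng.integers(-max_shift, max_shift + 1))
        # Noise level is expressed as a fraction of the maximum intensity.
        noise = rng.normal(0.0, noise_sd * spectrum.max(), spectrum.size)
        copies[i] = np.roll(spectrum, shift) + noise
    return copies
```

A typical call order would be `resample` first, so that all spectra share one grid, then `align_to_reference` against a chosen reference spectrum. Augmented copies are usually generated from the training split only, so that near-duplicates of one measurement do not appear in both training and test sets.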
2.2.3. Final Preparation: Normalization and Feature Extraction
3. Machine Learning Methods for SERS Analysis
3.1. Typical Process for ML-Based SERS Analysis
3.1.1. SERS Data Acquisition
3.1.2. Data Preprocessing
3.1.3. Dataset Partitioning
3.1.4. Machine Learning Model Development
3.2. Learning Paradigms and Model Architectures in SERS Analysis
3.2.1. Traditional Machine Learning Models
3.2.2. Deep Learning Models
3.2.3. Critical Comparison: Chemometrics vs. Deep Learning
4. The Key Role of Machine Learning in Intelligent SERS Analysis
4.1. Precise Target Molecule Identification and Quantitative Analysis
4.2. Identification and Discovery of Biomarkers for Unknown Molecules
4.3. Data-Driven Optimization of SERS Nanostructured Substrates
4.4. Methodological Synthesis: Toward Intelligent and Integrated SERS Systems
5. Fundamental Challenges and Future Perspectives
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| SERS | Surface-enhanced Raman spectroscopy |
| ML | Machine learning |
| SPR | Surface plasmon resonance |
| AgNPs | Silver nanoparticles |
| AuNPs | Gold nanoparticles |
| LSPR | Localized surface plasmon resonance |
| CCD | Charge-coupled device |
| AsLS | Asymmetric least squares |
| ALS | Asymmetric least squares |
| SNR | Signal-to-noise ratio |
| COW | Correlation optimized warping |
| DTW | Dynamic time warping |
| GANs | Generative adversarial networks |
| PCA | Principal component analysis |
| LDA | Linear discriminant analysis |
| PLS-DA | Partial least squares discriminant analysis |
| SVM | Support vector machines |
| RF | Random forests |
| PLSR | Partial least squares regression |
| CNNs | Convolutional neural networks |
| AUC-ROC | Area under the receiver operating characteristic curve |
| MSE | Mean squared error |
| RMSE | Root mean squared error |
| MAE | Mean absolute error |
| DT | Decision trees |
| NB | Naïve Bayes |
| KNN | K-nearest neighbors |
| RT | Regression trees |
| t-SNE | t-distributed stochastic neighbor embedding |
| DL | Deep learning |
| ANNs | Artificial neural networks |
| MLP | Multi-layer perceptron |
| RNNs | Recurrent neural networks |
| LSTM | Long short-term memory |
| GRU | Gated recurrent unit |
| ResNet | Residual neural networks |
| DA | Discriminant analysis |
| MCR | Multivariate curve resolution |
| sPLS-DA | Sparse PLS discriminant analysis |
| ATR-FTIR | Attenuated total reflectance—Fourier transform infrared spectroscopy |
| XAI | Explainable artificial intelligence |
| DNN | Deep neural network |
| Score-CAM | Score-weighted visual explanations for convolutional neural networks |
| FDTD | Finite-difference time-domain |
| BP | Backpropagation |
| cVAE | conditional variational autoencoder |
| 2D | Two-dimensional |
| TMDCs | Transition metal dichalcogenides |
| APN | Absorption prediction network |
| IDN | Inverse design network |
| E-field | Electric-field |
| PU | Periodic unit |
| CPUs | Central processing units |
| PINNs | Physics-informed neural networks |
References
- Ran, C.; Zhang, J.L.; He, X.; Luo, C.; Zhang, Q.; Shen, Y.; Yin, L. Recent development of gold nanochips in biosensing and biodiagnosis sensibilization strategies in vitro based on SPR, SERS and FRET optical properties. Talanta 2025, 282, 126936.
- Langer, J.; Jimenez de Aberasturi, D.; Aizpurua, J.; Alvarez-Puebla, R.A.; Auguié, B.; Baumberg, J.J.; Bazan, G.C.; Bell, S.E.J.; Boisen, A.; Brolo, A.G.; et al. Present and Future of Surface-Enhanced Raman Scattering. ACS Nano 2020, 14, 28–117.
- Jeanmaire, D.L.; Van Duyne, R.P. Surface raman spectroelectrochemistry: Part I. Heterocyclic, aromatic, and aliphatic amines adsorbed on the anodized silver electrode. J. Electroanal. Chem. Interfacial Electrochem. 1977, 84, 1–20.
- Cong, S.; Liu, X.; Jiang, Y.; Zhang, W.; Zhao, Z. Surface Enhanced Raman Scattering Revealed by Interfacial Charge-Transfer Transitions. Innovation 2020, 1, 100051.
- Albrecht, M.G.; Creighton, J.A. Anomalously intense Raman spectra of pyridine at a silver electrode. J. Am. Chem. Soc. 1977, 99, 5215–5217.
- Liu, H.; Gao, X.; Xu, C.; Liu, D. SERS Tags for Biomedical Detection and Bioimaging. Theranostics 2022, 12, 1870–1903.
- Liu, C.; Franceschini, C.; Weber, S.; Dib, T.; Liu, P.; Wu, L.; Farnesi, E.; Zhang, W.S.; Sivakov, V.; Luppa, P.B.; et al. SERS-based detection of the antibiotic ceftriaxone in spiked fresh plasma and microdialysate matrix by using silver-functionalized silicon nanowire substrates. Talanta 2024, 271, 125697.
- Butmee, P.; Samphao, A.; Tumcharern, G. Reduced graphene oxide on silver nanoparticle layers-decorated titanium dioxide nanotube arrays as SERS-based sensor for glyphosate direct detection in environmental water and soil. J. Hazard. Mater. 2022, 437, 129344.
- Logan, N.; Cao, C.; Freitag, S.; Haughey, S.A.; Krska, R.; Elliott, C.T. Advancing Mycotoxin Detection in Food and Feed: Novel Insights from Surface-Enhanced Raman Spectroscopy (SERS). Adv. Mater. 2024, 36, e2309625.
- Li, Q.; Huo, H.; Wu, Y.; Chen, L.; Su, L.; Zhang, X.; Song, J.; Yang, H. Design and Synthesis of SERS Materials for In Vivo Molecular Imaging and Biosensing. Adv. Sci. 2023, 10, e2202051.
- Atta, S.; Vo-Dinh, T. Ultra-trace SERS detection of cocaine and heroin using bimetallic gold-silver nanostars (BGNS-Ag). Anal. Chim. Acta 2023, 1251, 340956.
- Ahi, E.E.; Torul, H.; Zengin, A.; Sucularlı, F.; Yıldırım, E.; Selbes, Y.; Suludere, Z.; Tamer, U. A capillary driven microfluidic chip for SERS based hCG detection. Biosens. Bioelectron. 2022, 195, 113660.
- Tan, E.X.; Nguyen, L.B.T.; Jin, Y.; Lv, Y.; Phang, I.Y.; Ling, X.Y. SERS Cheminformatics: Opportunities for Data-Driven Discovery and Applications. ACS Cent. Sci. 2025, 11, 2034–2052.
- Greener, J.G.; Kandathil, S.M.; Moffat, L.; Jones, D.T. A guide to machine learning for biologists. Nat. Rev. Mol. Cell Biol. 2022, 23, 40–55.
- Ding, Y.; Sun, Y.; Liu, C.; Jiang, Q.Y.; Chen, F.; Cao, Y. SERS-Based Biosensors Combined with Machine Learning for Medical Application. ChemistryOpen 2023, 12, e202200192.
- Dong, Y.; Hu, J.; Jin, J.; Zhou, H.; Jin, S.; Yang, D. Advances in machine learning-assisted SERS sensing towards food safety and biomedical analysis. TrAC Trends Anal. Chem. 2024, 180, 117974.
- Tang, J.-W.; Yuan, Q.; Zhang, L.; Marshall, B.J.; Yen Tay, A.C.; Wang, L. Application of machine learning-assisted surface-enhanced Raman spectroscopy in medical laboratories: Principles, opportunities, and challenges. TrAC Trends Anal. Chem. 2025, 184, 118135.
- Srivastava, S.; Wang, W.; Zhou, W.; Jin, M.; Vikesland, P.J. Machine Learning-Assisted Surface-Enhanced Raman Spectroscopy Detection for Environmental Applications: A Review. Environ. Sci. Technol. 2024, 58, 20830–20848.
- Bi, X.; Ai, X.; Wu, Z.; Lin, L.L.; Chen, Z.; Ye, J. Artificial Intelligence-Powered Surface-Enhanced Raman Spectroscopy for Biomedical Applications. Anal. Chem. 2025, 97, 6826–6846.
- Horta-Velázquez, A.; Arce, F.; Rodríguez-Sevilla, E.; Morales-Narváez, E. Toward smart diagnostics via artificial intelligence-assisted surface-enhanced Raman spectroscopy. TrAC Trends Anal. Chem. 2023, 169, 117378.
- Pilot, R.; Signorini, R.; Durante, C.; Orian, L.; Bhamidipati, M.; Fabris, L. A Review on Surface-Enhanced Raman Scattering. Biosensors 2019, 9, 57.
- Kumar, P.P.P.; Kaushal, S.; Lim, D.-K. Recent advances in nano/microfabricated substrate platforms and artificial intelligence for practical surface-enhanced Raman scattering-based bioanalysis. TrAC Trends Anal. Chem. 2023, 168, 117341.
- Whitaker, D.A.; Hayes, K. A simple algorithm for despiking Raman spectra. Chemom. Intell. Lab. Syst. 2018, 179, 82–84.
- Coca-Lopez, N. An intuitive approach for spike removal in Raman spectra based on peaks’ prominence and width. Anal. Chim. Acta 2024, 1295, 342312.
- Barton, S.J.; Hennelly, B.M. An Algorithm for the Removal of Cosmic Ray Artifacts in Spectral Data Sets. Appl. Spectrosc. 2019, 73, 893–901.
- Lieber, C.A.; Mahadevan-Jansen, A. Automated Method for Subtraction of Fluorescence from Biological Raman Spectra. Appl. Spectrosc. 2003, 57, 1363–1367.
- Eilers, P.H.C.; Boelens, H.F.M. Baseline Correction with Asymmetric Least Squares Smoothing. Leiden Univ. Med. Cent. Rep. 2005, 1, 5.
- He, S.; Zhang, W.; Liu, L.; Huang, Y.; He, J.; Xie, W.; Wu, P.; Du, C. Baseline correction for Raman spectra using an improved asymmetric least squares method. Anal. Methods 2014, 6, 4402–4407.
- Savitzky, A.; Golay, M.J.E. Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Anal. Chem. 1964, 36, 1627–1639.
- Li, S.; Nyagilo, J.O.; Dave, D.P.; Gao, J.X. Continuous wavelet transform based partial least squares regression for quantitative analysis of Raman spectrum. IEEE Trans. Nanobiosci. 2013, 12, 214–221.
- Gautam, R.; Vanga, S.; Ariese, F.; Umapathy, S. Review of multidimensional data processing approaches for Raman and infrared spectroscopy. EPJ Tech. Instrum. 2015, 2, 8.
- Heyer-Müller, J.; Schiemer, R.; Lopinski, M.; Wang, C.; Willems, F.; Robbel, L.; Schmitt, M.; Hubbuch, J. A Novel Raman-Chromatography Assembly for Automated Calibration and In-Line Monitoring in Bioprocessing. Eng. Life Sci. 2025, 25, e70044.
- Liu, Y.J.; André, S.; Saint Cristau, L.; Lagresle, S.; Hannas, Z.; Calvosa, É.; Devos, O.; Duponchel, L. Multivariate statistical process control (MSPC) using Raman spectroscopy for in-line culture cell monitoring considering time-varying batches synchronized with correlation optimized warping (COW). Anal. Chim. Acta 2017, 952, 9–17.
- Herrmann, M.; Tan, C.W.; Webb, G.I. Parameterizing the cost function of dynamic time warping with application to time series classification. Data Min. Knowl. Discov. 2023, 37, 2024–2045.
- Bocklitz, T.; Walter, A.; Hartmann, K.; Rösch, P.; Popp, J. How to pre-process Raman spectra for reliable and stable models? Anal. Chim. Acta 2011, 704, 47–56.
- Lussier, F.; Thibault, V.; Charron, B.; Wallace, G.Q.; Masson, J.F. Deep learning and artificial intelligence methods for Raman and surface-enhanced Raman scattering. TrAC Trends Anal. Chem. 2020, 124, 115796.
- Guo, S.; Bocklitz, T.; Neugebauer, U.; Popp, J. Common mistakes in cross-validating classification models. Anal. Methods 2017, 9, 4410–4417.
- Koenig, T.; Cadau, L.; Wagner, F.; Kley, M. A generative adversarial network-based data augmentation approach with transient vibration data. Procedia Comput. Sci. 2023, 225, 1340–1349.
- Wu, M.; Wang, S.; Pan, S.; Terentis, A.C.; Strasswimmer, J.; Zhu, X. Deep learning data augmentation for Raman spectroscopy cancer tissue classification. Sci. Rep. 2021, 11, 23842.
- Yang, J.; Xu, J.; Zhang, X.; Wu, C.; Lin, T.; Ying, Y. Deep learning for vibrational spectral analysis: Recent progress and a practical guide. Anal. Chim. Acta 2019, 1081, 6–17.
- Kim, Y.; Lee, W. Distributed Raman Spectrum Data Augmentation System Using Federated Learning with Deep Generative Models. Sensors 2022, 22, 9900.
- Ringnér, M. What is principal component analysis? Nat. Biotechnol. 2008, 26, 303–304.
- Xie, X.; Zheng, Y.; Zhao, F.; Wang, W.; Fu, W.; Ling, Y.; Zhang, Z. Principal component analysis of normalized SERS spectra for trace-level analyte quantification. J. Mater. Sci. Technol. 2026, 241, 107–113.
- Wang, C.; Xiao, L.; Dai, C.; Nguyen, A.H.; Littlepage, L.E.; Schultz, Z.D.; Li, J. A Statistical Approach of Background Removal and Spectrum Identification for SERS Data. Sci. Rep. 2020, 10, 1460.
- Doyle, S.; Lips, E.H.; Marcus, E.; Mulder, L.; Liu, Y.H.; Canton, F.D.; Kootstra, T.; van Seijen, M.M.; Bouybayoune, I.; Sawyer, E.J.; et al. Deep learning for predicting invasive recurrence of ductal carcinoma in situ: Leveraging histopathology images and clinical features. EBioMedicine 2025, 116, 105750.
- Shin, H.; Jeong, H.; Park, J.; Hong, S.; Choi, Y. Correlation between Cancerous Exosomes and Protein Markers Based on Surface-Enhanced Raman Spectroscopy (SERS) and Principal Component Analysis (PCA). ACS Sens. 2018, 3, 2637–2643.
- Zhong, Q.; Shao, L.; Yao, Y.; Chen, S.; Lv, X.; Liu, Z.; Zhu, S.; Yan, Z. Urine-based SERS and multivariate statistical analysis for identification of non-muscle-invasive bladder cancer and muscle-invasive bladder cancer. Anal. Bioanal. Chem. 2024, 416, 6973–6984.
- Chai, Z.; Bi, H. Capture and identification of bacteria from fish muscle based on immunomagnetic beads and MALDI-TOF MS. Food Chem. X 2022, 13, 100225.
- Kang, S.; Kim, I.; Vikesland, P.J. Discriminatory Detection of ssDNA by Surface-Enhanced Raman Spectroscopy (SERS) and Tree-Based Support Vector Machine (Tr-SVM). Anal. Chem. 2021, 93, 9319–9328.
- Zhang, S.; Ma, J.; Qi, C.; Cheng, R.; Shen, J.; Yang, H. Rapid detection of kidney disease based on urine surface-enhanced Raman spectroscopy and principal components analysis-support vector machine/random forests. Spectrochim. Acta. Part A Mol. Biomol. Spectrosc. 2025, 343, 126492.
- Lomarat, P.; Phechkrajang, C.; Sunghad, P.; Anantachoke, N. Raman spectroscopy coupled with the PLSR model: A rapid method for analyzing gamma-oryzanol content in rice bran oil. Food Chem. X 2024, 24, 101923.
- Carlier, A.; Dandrifosse, S.; Dumont, B.; Mercatoris, B. Comparing CNNs and PLSr for estimating wheat organs biophysical variables using proximal sensing. Front. Plant Sci. 2023, 14, 1204791.
- Luo, W.; Phung, D.; Tran, T.; Gupta, S.; Rana, S.; Karmakar, C.; Shilton, A.; Yearwood, J.; Dimitrova, N.; Ho, T.B.; et al. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View. J. Med. Internet Res. 2016, 18, e323.
- Nyamdavaa, A.; Kaladharan, K.; Ganbold, E.O.; Jeong, S.; Paek, S.; Su, Y.; Tseng, F.G.; Ishdorj, T.O. DeepATsers: A deep learning framework for one-pot SERS biosensor to detect SARS-CoV-2 virus. Sci. Rep. 2025, 15, 12245.
- Zhang, Q.J.; Chen, Y.; Zou, X.H.; Hu, W.; Ye, M.L.; Guo, Q.F.; Lin, X.L.; Feng, S.Y.; Wang, N. Promoting identification of amyotrophic lateral sclerosis based on label-free plasma spectroscopy. Ann. Clin. Transl. Neurol. 2020, 7, 2010–2018.
- Moisoiu, T.; Dragomir, M.P.; Iancu, S.D.; Schallenberg, S.; Birolo, G.; Ferrero, G.; Burghelea, D.; Stefancu, A.; Cozan, R.G.; Licarete, E.; et al. Combined miRNA and SERS urine liquid biopsy for the point-of-care diagnosis and molecular stratification of bladder cancer. Mol. Med. 2022, 28, 39.
- Yao-Say Solomon Adade, S.; Lin, H.; Jiang, H.; Haruna, S.A.; Osei Barimah, A.; Zareef, M.; Akomeah Agyekum, A.; Adwoa Nkuma Johnson, N.; Mehedi Hassan, M.; Li, H.; et al. Fraud detection in crude palm oil using SERS combined with chemometrics. Food Chem. 2022, 388, 132973.
- Wen, Y.; Wang, X.; Li, D.; Zhang, Q.; Deng, B.; Chen, Y. Rapid detection of phenytoin sodium by partial-least squares and linear regression models combined with surface-enhanced Raman spectroscopy. J. Pharm. Biomed. Anal. 2023, 223, 115160.
- Li, X.; Yang, T.; Li, C.S.; Song, Y.; Wang, D.; Jin, L.; Lou, H.; Li, W. Polymerase chain reaction–surface-enhanced Raman spectroscopy (PCR-SERS) method for gene methylation level detection in plasma. Theranostics 2020, 10, 898–909.
- Kobak, D.; Berens, P. The art of using t-SNE for single-cell transcriptomics. Nat. Commun. 2019, 10, 5416.
- Tang, J.W.; Li, J.Q.; Yin, X.C.; Xu, W.W.; Pan, Y.C.; Liu, Q.H.; Gu, B.; Zhang, X.; Wang, L. Rapid Discrimination of Clinically Important Pathogens Through Machine Learning Analysis of Surface Enhanced Raman Spectra. Front. Microbiol. 2022, 13, 843417.
- Arslan, A.H.; Ciloglu, F.U.; Yilmaz, U.; Simsek, E.; Aydin, O. Discrimination of waterborne pathogens, Cryptosporidium parvum oocysts and bacteria using surface-enhanced Raman spectroscopy coupled with principal component analysis and hierarchical clustering. Spectrochim. Acta. Part A Mol. Biomol. Spectrosc. 2022, 267, 120475.
- Wu, S.; Zhang, Y.; He, C.; Luo, Z.; Chen, Z.; Ye, J. Self-Supervised Learning for Generic Raman Spectrum Denoising. Anal. Chem. 2024, 96, 17476–17485.
- Pang, T.; Wong, J.H.D.; Ng, W.L.; Chan, C.S. Semi-supervised GAN-based Radiomics Model for Data Augmentation in Breast Ultrasound Mass Classification. Comput. Methods Programs Biomed. 2021, 203, 106018.
- Kim, M.G.; Jue, M.; Lee, K.H.; Lee, E.Y.; Roh, Y.; Lee, M.; Lee, H.J.; Lee, S.; Liu, H.; Koo, B.; et al. Deep Learning Assisted Surface-Enhanced Raman Spectroscopy (SERS) for Rapid and Direct Nucleic Acid Amplification and Detection: Toward Enhanced Molecular Diagnostics. ACS Nano 2023, 17, 18332–18345.
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the Dimensionality of Data with Neural Networks. Science 2006, 313, 504–507.
- Guselnikova, O.; Trelin, A.; Skvortsova, A.; Ulbrich, P.; Postnikov, P.; Pershina, A.; Sykora, D.; Svorcik, V.; Lyutakov, O. Label-free surface-enhanced Raman spectroscopy with artificial neural network technique for recognition photoinduced DNA damage. Biosens. Bioelectron. 2019, 145, 111718.
- Huang, Z.; Liang, W.; Lei, Y.; Zhang, R.; Sun, J.; Guo, P. Homogeneous multi-antibiotics residual identification in various actual water via SERS spectra multilayer perceptron algorithm combined with Gaussian kernel density estimation data augmentation. Anal. Chim. Acta 2026, 1383, 344896.
- Luo, Y.; Su, W.; Xu, D.; Wang, Z.; Wu, H.; Chen, B.; Wu, J. Component identification for the SERS spectra of microplastics mixture with convolutional neural network. Sci. Total Environ. 2023, 895, 165138.
- Ljubic, B.; Hai, A.A.; Stanojevic, M.; Diaz, W.; Polimac, D.; Pavlovski, M.; Obradovic, Z. Predicting complications of diabetes mellitus using advanced machine learning algorithms. J. Am. Med. Inform. Assoc. 2020, 27, 1343–1351.
- Lin, Y.; Zhang, Q.; Chen, H.; Liu, S.; Peng, K.; Wang, X.; Zhang, L.; Huang, J.; Yan, X.; Lin, X.; et al. Multi-cancer early detection based on serum surface-enhanced Raman spectroscopy with deep learning: A large-scale case-control study. BMC Med. 2025, 23, 97.
- Cui, F.; Yue, Y.; Zhang, Y.; Zhang, Z.; Zhou, H.S. Advancing Biosensors with Machine Learning. ACS Sens. 2020, 5, 3346–3364.
- Khondakar, K.R.; Mazumdar, H.; Das, S.; Kaushik, A. Machine learning (ML)-assisted surface-enhanced raman spectroscopy (SERS) technologies for sustainable health. Adv. Colloid Interface Sci. 2025, 344, 103594.
- Xiao, J.; Ding, J.; Sun, C.; Liu, D.; Gao, H.; Liu, Y.; Lu, Y.; Gao, X. Simultaneous Detection of Clenbuterol and Higenamine in Urine Samples Using Interference-Free SERS Tags Combined with Magnetic Separation. ACS Sens. 2024, 9, 5394–5404.
- Cai, J.; Wu, Y.; Bai, H.; He, Y.; Qin, Y. SERS and machine learning based effective feature extraction for detection and identification of amphetamine analogs. Heliyon 2023, 9, e23109.
- Simas, M.V.; Olaniyan, P.O.; Hati, S.; Davis, G.A., Jr.; Anspach, G.; Goodpaster, J.V.; Manicke, N.E.; Sardar, R. Superhydrophobic Surface Modification of Polymer Microneedles Enables Fabrication of Multimodal Surface-Enhanced Raman Spectroscopy and Mass Spectrometry Substrates for Synthetic Drug Detection in Blood Plasma. ACS Appl. Mater. Interfaces 2023, 15, 46681–46696.
- Sun, J.; Lai, W.; Zhao, J.; Xue, J.; Zhu, T.; Xiao, M.; Man, T.; Wan, Y.; Pei, H.; Li, L. Rapid Identification of Drug Mechanisms with Deep Learning-Based Multichannel Surface-Enhanced Raman Spectroscopy. ACS Sens. 2024, 9, 4227–4235.
- Martens, R.R.; Gozdzialski, L.; Newman, E.; Gill, C.; Wallace, B.; Hore, D.K. Trace Detection of Adulterants in Illicit Opioid Samples Using Surface-Enhanced Raman Scattering and Random Forest Classification. Anal. Chem. 2024, 96, 12277–12285.
- Wang, Y.; Li, C.; Yang, Y.; Ma, C.; Zhao, X.; Li, J.; Wei, L.; Li, Y. A Surface-Enhanced Raman Spectroscopy Platform Integrating Dual Signal Enhancement and Machine Learning for Rapid Detection of Veterinary Drug Residues in Meat Products. ACS Appl. Mater. Interfaces 2025, 17, 16202–16212.
- Qin, Y.; Zhang, H.; Wang, W.; He, Y. Deep learning-assisted surface-enhanced Raman spectroscopy detection of stimulants. Spectrochim. Acta. Part A Mol. Biomol. Spectrosc. 2026, 348, 127086.
- Treerattrakoon, K.; Roeksrungruang, P.; Dharakul, T.; Japrung, D.; Faulds, K.; Graham, D.; Bamrungsap, S. Detection of a miRNA biomarker for cancer diagnosis using SERS tags and magnetic separation. Anal. Methods Adv. Methods Appl. 2022, 14, 1938–1945.
- Ye, J.; Bi, X.; Deng, S.; Wang, X.; Liu, Z.; Suo, Q.; Wu, J.; Chen, H.; Wang, Y.; Qian, K.; et al. Hypoxanthine is a metabolic biomarker for inducing GSDME-dependent pyroptosis of endothelial cells during ischemic stroke. Theranostics 2024, 14, 6071–6087.
- Zhang, S.; Wu, S.Q.Y.; Hum, M.; Perumal, J.; Tan, E.Y.; Lee, A.S.G.; Teng, J.; Dinish, U.S.; Olivo, M. Complete characterization of RNA biomarker fingerprints using a multi-modal ATR-FTIR and SERS approach for label-free early breast cancer diagnosis. RSC Adv. 2024, 14, 3599–3610.
- Han, Z.; Peng, X.; Yang, Y.; Yi, J.; Zhao, D.; Bao, Q.; Long, S.; Yu, S.-X.; Xu, X.-X.; Liu, B.; et al. Integrated microfluidic-SERS for exosome biomarker profiling and osteosarcoma diagnosis. Biosens. Bioelectron. 2022, 217, 114709.
- Cheng, N.; Lou, B.; Wang, H. Discovering the digital biomarker of hepatocellular carcinoma in serum with SERS-based biosensors and intelligence vision. Colloids Surf. B Biointerfaces 2023, 226, 113315.
- Guselnikova, O.; Lim, H.; Kim, H.J.; Kim, S.H.; Gorbunova, A.; Eguchi, M.; Postnikov, P.; Nakanishi, T.; Asahi, T.; Na, J.; et al. New Trends in Nanoarchitectured SERS Substrates: Nanospaces, 2D Materials, and Organic Heterostructures. Small 2022, 18, e2107182.
- Canning, A.J.; Li, J.Q.; Atta, S.; Wang, H.N.; Vo-Dinh, T. Nanoplasmonics biosensors: At the frontiers of biomedical diagnostics. TrAC Trends Anal. Chem. 2024, 180, 117973.
- Malkiel, I.; Mrejen, M.; Nagler, A.; Arieli, U.; Wolf, L.; Suchowski, H. Plasmonic nanostructure design and characterization via Deep Learning. Light Sci. Appl. 2018, 7, 60.
- Yao, W.; Verdugo, F.; Everitt, H.O.; Christiansen, R.E.; Johnson, S.G. Designing structures that maximize spatially averaged surface-enhanced Raman spectra. Opt. Express 2023, 31, 4964–4977.
- Peurifoy, J.; Shen, Y.; Jing, L.; Yang, Y.; Cano-Renteria, F.; DeLacy, B.G.; Joannopoulos, J.D.; Tegmark, M.; Soljačić, M. Nanophotonic particle simulation and inverse design using artificial neural networks. Sci. Adv. 2018, 4, eaar4206.
- Wang, R.; Liu, C.; Wei, Y.; Wu, P.; Su, Y.; Zhang, Z. Inverse design of metal nanoparticles based on deep learning. Results Opt. 2021, 5, 100134.
- Vahidzadeh, E.; Shankar, K. Artificial Neural Network-Based Prediction of the Optical Properties of Spherical Core–Shell Plasmonic Metastructures. Nanomaterials 2021, 11, 633.
- He, J.; He, C.; Zheng, C.; Wang, Q.; Ye, J. Plasmonic nanoparticle simulations and inverse design using machine learning. Nanoscale 2019, 11, 17444–17459.
- Hayakawa, D.; Videbæk, T.E.; Grason, G.M.; Rogers, W.B. Symmetry-Guided Inverse Design of Self-Assembling Multiscale DNA Origami Tilings. ACS Nano 2024, 18, 19169–19178.
- Rahman, T.; Tahmid, A.; Arman, S.E.; Ahmed, T.; Rakhy, Z.T.; Das, H.; Rahman, M.; Azad, A.K.; Wahadoszamen, M.; Habib, A. Leveraging generative neural networks for accurate, diverse, and robust nanoparticle design. Nanoscale Adv. 2025, 7, 634–642.
- Kitadai, H.; Tan, Q.; Ping, L.; Ling, X. Raman enhancement induced by exciton hybridization in molecules and 2D materials. npj 2D Mater. Appl. 2024, 8, 11.
- Mamiyev, Z.; Balayeva, N.O.; Zahn, D.R.T.; Tegenkamp, C. Enhanced Light–Matter Interactions With a Single Sn Nanoantenna on Epitaxial Graphene. Adv. Opt. Mater. 2025, 13, e00979.
- Zhou, H.; Xu, L.; Ren, Z.; Zhu, J.; Lee, C. Machine learning-augmented surface-enhanced spectroscopy toward next-generation molecular diagnostics. Nanoscale Adv. 2023, 5, 538–570.
- Lu, Z.; Wang, J.; Yan, S. Quantitative Surface-Enhanced Raman Spectroscopy: Challenges, Strategies, and Prospects. Molecules 2026, 31, 191.
- Ma, L.; Zhou, K.; Wang, X.; Wang, J.; Zhao, R.; Zhang, Y.; Cheng, F. Recent Progress in the Synthesis of 3D Complex Plasmonic Intragap Nanostructures and Their Applications in Surface-Enhanced Raman Scattering. Biosensors 2024, 14, 433.
- Chen, H.; Liu, H.; Xing, L.; Fan, D.; Chen, N.; Ma, P.; Zhang, X. Deep Learning-driven Microfluidic-SERS to Characterize the Heterogeneity in Exosomes for Classifying Non-Small Cell Lung Cancer Subtypes. ACS Sens. 2025, 10, 2872–2882.






| Feature | Description | Analytical Challenge |
|---|---|---|
| High dimensionality & data scarcity | 10²–10³ Raman shifts per spectrum with limited sample sizes. | Overfitting risk; requires dimensionality reduction & data augmentation. |
| Noise & Backgrounds | Cosmic ray spikes, high-frequency noise, and matrix fluorescence. | Demands rigorous spike removal, denoising, and baseline correction. |
| Substrate dependence | Signals vary with nanoparticle morphology and stochastic hotspots. | Poor inter-batch reproducibility; necessitates strict intensity normalization. |
| Peak position variation | Raman shifts vary due to chemical interactions or instrument calibration. | Requires peak alignment to avoid species misclassification. |
| Intensity instability | Signal fluctuations from laser variations or differing spectral resolutions. | Requires spectral resampling/binning and uniform standardization. |

| Algorithm | Task | Strengths | Limitations | Refs. |
|---|---|---|---|---|
| Supervised learning models | | | | |
| LDA | Classification & dimensionality reduction | Maximizes class separability; fast analytic solution. | Requires prior PCA for SERS (variables > samples); linear boundaries. | [47] |
| PLS-DA/PLS | Classification | Handles collinear Raman shifts; chemometrics gold standard. | Fails with severe nonlinear matrix effects and baseline drift. | [48] |
| SVM | Classification | Effective for high-dimensional, small-sample spectra; kernels model nonlinear class boundaries. | Sensitive to kernel and hyperparameter choice; limited interpretability with nonlinear kernels. | [49] |
| RF | Classification & regression | Extracts feature importance; highly noise-resistant. | Less interpretable than simple trees; slower prediction. | [50] |
| DT/CART | Classification & regression | Highly interpretable; maps rules to specific Raman peaks. | Unstable and highly prone to overfitting on noisy SERS data. | [55] |
| Naïve Bayes | Classification | Extremely fast training for simple mixture screening. | Fails when adjacent Raman peaks are highly correlated. | [56] |
| KNN | Classification | Simple baseline method for direct spectral matching. | Highly sensitive to SERS intensity fluctuations. | [57] |
| Linear Regression | Regression | Simple baseline for quantitative trace analysis. | Fails under “hot spot” saturation and nonlinear adsorption. | [58] |
| XGBoost | Classification & regression | Handles complex spectral overlaps with high accuracy. | Prone to fitting instrumental noise if poorly tuned. | [65] |
| ANN/MLP | Classification & regression | Captures complex nonlinear concentration–intensity relationships. | “Black box” lacking interpretability; requires large datasets. | [67,68] |
| CNN/ResNet | Classification | Learns peak shapes and shoulders directly from raw data. | “Black box”; prone to overfitting due to SERS data scarcity. | [69,71] |
| RNN/LSTM | Regression | Captures long-range correlations across the Raman shift axis. | Computationally heavy; vanishing gradients on broad spectra. | [70] |
| Unsupervised learning models | | | | |
| PCA | Dimensionality reduction | Reduces dimensions and acts as a secondary noise filter. | Discards nonlinear interactions in complex biological matrices. | [46] |
| t-SNE | Dimensionality reduction | Excellent for 2D visualization of complex SERS clusters. | Not predictive; cannot map new unseen spectra to clusters. | [60] |
| K-Means | Clustering | Rapid blind grouping of unknown SERS mixtures. | Highly sensitive to baseline drift and cosmic ray spikes. | [61] |
| Hierarchical | Clustering | Reveals spectral similarities via dendrogram visualization. | Computationally heavy for large SERS mapping datasets. | [62] |
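
The LDA row notes that a PCA step is needed first when spectral channels outnumber samples. The sketch below illustrates why on synthetic data: with 40 spectra and 500 channels the within-class scatter matrix is singular in the original space, but it becomes invertible after projection onto a few principal components. Everything here is an assumption made for illustration: the helper names (`simulate_spectra`, `pca_fit`, `pca_transform`, `fisher_lda_fit`, `fisher_lda_predict`), the simulated bands, the noise level, and the choice of five components. The discriminant shown is the two-class Fisher form, written directly in NumPy rather than taken from any library.

```python
import numpy as np


def simulate_spectra(rng, n_per_class, n_channels=500):
    """Hypothetical two-class data: class 1 carries an extra marker band."""
    x = np.arange(n_channels)
    shared = 50.0 * np.exp(-0.5 * ((x - 150) / 10.0) ** 2)
    marker = 10.0 * np.exp(-0.5 * ((x - 300) / 6.0) ** 2)
    X0 = shared + rng.normal(0.0, 1.0, (n_per_class, n_channels))
    X1 = shared + marker + rng.normal(0.0, 1.0, (n_per_class, n_channels))
    return np.vstack([X0, X1]), np.repeat([0, 1], n_per_class)


def pca_fit(X, n_components):
    """Return the training mean and the leading principal axes (via SVD)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def pca_transform(X, mean, components):
    """Project spectra onto the principal axes to obtain PCA scores."""
    return (X - mean) @ components.T


def fisher_lda_fit(Z, y):
    """Two-class Fisher discriminant fitted on PCA scores Z."""
    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    # Pooled within-class scatter; invertible because Z has few columns.
    Sw = (Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    b = -0.5 * w @ (m0 + m1)  # decision threshold midway between class means
    return w, b


def fisher_lda_predict(Z, w, b):
    """Assign class 1 when the discriminant score is positive."""
    return (Z @ w + b > 0).astype(int)


# 40 training spectra with 500 channels each: variables far exceed samples.
X_train, y_train = simulate_spectra(np.random.default_rng(0), 20)
X_test, y_test = simulate_spectra(np.random.default_rng(1), 10)

mean, components = pca_fit(X_train, n_components=5)
w, b = fisher_lda_fit(pca_transform(X_train, mean, components), y_train)
y_pred = fisher_lda_predict(pca_transform(X_test, mean, components), w, b)
accuracy = float(np.mean(y_pred == y_test))
```

The PCA mean and axes are estimated on the training spectra only and then reused for the held-out spectra, so no information from the test set leaks into the model. The number of retained components is a tuning choice that would normally be set by cross-validation.
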

| Application | SERS-Specific Challenge | ML Strategy & Algorithm | Key Advantage | Ref. |
|---|---|---|---|---|
| Illicit Drugs & Forensics | Severe spectral overlap among structurally similar analogs. | Nonlinear Classification (PCA-SVM, PCA-DA) | Resolves subtle spectral differences beyond human-resolvable peak assignments. | [75,76] |
| Complex Biospectra (DL) | High dimensionality and interdependent spectral features. | Hierarchical Feature Learning (2D-CNN) | Preserves inter-channel peak correlations without relying on manual feature engineering. | [77] |
| Quantitative Analysis | Nonlinear concentration-response and matrix interference. | Regression & Rule Extraction (RF, MCR-ALS) | Enables robust quantification; RF provides feature importance for chemical interpretability. | [78,79] |
| Trace Sequence Variations | Weak, sequentially distributed spectral signatures. | Sequence-Aware Modeling (LSTM, RNN) | Captures long-range spectral dependencies across adjacent Raman shifts. | [80] |
| Low-Abundance Biomarkers | Ultra-trace signals (picomolar level) buried in background noise. | Chemometric Regression | Improves analytical sensitivity, validating the presence of low-level targets (e.g., miRNA). | [81] |
| Metabolomic Biomarkers | High-dimensional, highly correlated global spectral variations. | Dimensionality Reduction & Classification (sPLS-DA & SVM) | Extracts global spectral patterns linked to disease states rather than isolated peaks. | [82] |
| Multiplex Profiling | Overlapping signals from multiple co-existing surface proteins. | Multivariate Modeling (PLS-DA) | Integrates multi-marker spectral features into a unified and robust diagnostic framework. | [83] |
| Multimodal Integration | Cross-platform variability and limited single-mode accuracy. | Data Fusion (PCA-SVM) | Synergistic integration of SERS and FTIR significantly enhances diagnostic discrimination. | [84] |
| Explainable AI (XAI) | “Black-box” nature of DL limits clinical trust and interpretability. | Feature Attribution (DNN & Score-CAM) | Attributes classification decisions to physically meaningful spectral peaks (“digital biomarkers”). | [85] |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Li, Z.; Wang, Y.; Deng, Z.; Zhao, J. Machine Learning-Enabled Intelligent Analysis of Surface-Enhanced Raman Scattering: Methods, Applications, and Perspectives. Molecules 2026, 31, 1599. https://doi.org/10.3390/molecules31101599

