Fast Spectral Search Using Improved Preprocessing and Limited Axis Check
Abstract
1. Introduction
2. k-d Tree-Based Search Method
2.1. Preprocessing Stage
2.2. Coarse Search with k-d Tree
2.3. Fine Search
3. Proposed Method
3.1. Preprocessing
3.2. Limited Axis Check
Algorithm 1: Limited Axis Check (LAC).
3.3. Fine Search
4. Experiment
4.1. Comparison Under Conventional Preprocessing
4.2. Evaluation of LAC with EP
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest



| Method | Coarse Search Mul | Coarse Search Add | Fine Search (FS) Mul | Fine Search (FS) Add | Total Ops | Reduction |
|---|---|---|---|---|---|---|
| KD + FS | 67,760 | 60,752 | 11,872 | 23,739 | 164,123 | - |
| LAC + FS | 52,819 | 58,837 | 8824 | 17,644 | 138,124 | 15% |

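As a quick arithmetic check on the table above, the following sketch recomputes each method's total operation count and the relative reduction. The dictionary layout and the helper name `total_ops` are illustrative assumptions, not part of the original work; only the numbers are taken from the table.

```python
# Operation counts copied from the table above, in the order
# (coarse mul, coarse add, fine-search mul, fine-search add).
counts = {
    "KD + FS": (67_760, 60_752, 11_872, 23_739),
    "LAC + FS": (52_819, 58_837, 8_824, 17_644),
}


def total_ops(method: str) -> int:
    """Sum multiplications and additions over both search stages (hypothetical helper)."""
    return sum(counts[method])


baseline = total_ops("KD + FS")
proposed = total_ops("LAC + FS")

# Relative reduction of the proposed method with respect to the baseline.
reduction = 1 - proposed / baseline

print(baseline, proposed)    # 164123 138124
print(f"{reduction:.1%}")    # 15.8%
```

The recomputed totals match the Total Ops column. The reduction comes out at roughly 15.8%, so the 15% in the table appears to be a truncated rather than a rounded value.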
| Number of | LAC Multiplication | LAC Addition | FS Multiplication | FS Addition | LAC + FS (Total) Multiplication | LAC + FS (Total) Addition |
|---|---|---|---|---|---|---|
| 4 | 447 | 6483 | 920,590 | 1,840,882 | 921,037 | 1,847,365 |
| 8 | 817 | 6833 | 199,076 | 398,088 | 199,893 | 404,921 |
| 16 | 1618 | 7635 | 8010 | 16,016 | 9628 | 23,651 |
| 32 | 3234 | 9268 | 7603 | 15,203 | 10,837 | 24,471 |
| 64 | 6468 | 12,536 | 7603 | 15,203 | 14,071 | 27,739 |

| Stage | Multiplication | Addition | Description |
|---|---|---|---|
| LAC | 1618 | 7635 | Candidate identification |
| Fine Search (FS) | 8010 | 16,016 | Exact matching using precomputed distance table |
| Enhanced Preprocessing (EP) | 3400 | 13,200 | Running average, downsampling, noise-cut |
| Total (Proposed) | 13,028 | 36,851 | - |

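The two tables above can be tied together with a short sketch: it finds the setting with the lowest combined LAC + FS cost and then adds the Enhanced Preprocessing cost to reproduce the per-stage totals. The variable names (`rows`, `totals`, `best`) are illustrative assumptions; the counts are copied from the tables, and the meaning of the first column is left as given there.

```python
# Rows copied from the table of counts per setting:
# setting -> (LAC mul, LAC add, FS mul, FS add)
rows = {
    4: (447, 6_483, 920_590, 1_840_882),
    8: (817, 6_833, 199_076, 398_088),
    16: (1_618, 7_635, 8_010, 16_016),
    32: (3_234, 9_268, 7_603, 15_203),
    64: (6_468, 12_536, 7_603, 15_203),
}

# Combined cost (multiplications plus additions) of LAC + FS per setting.
totals = {setting: sum(ops) for setting, ops in rows.items()}
best = min(totals, key=totals.get)

# Enhanced Preprocessing (EP) cost from the per-stage table.
ep_mul, ep_add = 3_400, 13_200

lac_mul, lac_add, fs_mul, fs_add = rows[best]
mul = lac_mul + fs_mul + ep_mul
add = lac_add + fs_add + ep_add

print(best, totals[best])  # 16 33279
print(mul, add)            # 13028 36851
```

Smaller settings keep LAC cheap but leave far more work for the Fine Search, while larger settings raise the LAC cost without reducing the Fine Search cost further, so the combined count is lowest at 16. Adding the EP cost to that row gives the Total (Proposed) figures.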
| Method | Multiplication | Addition | Total |
|---|---|---|---|
| Full Search | 46,480,500 | 92,946,915 | 139,427,415 |
| Full Search + PDS | 8,846,894 | 17,679,704 | 26,526,598 |
| PCT + PDS (150 ) | 822,617 | 1,135,948 | 1,958,565 |
| PS (40 ) + CS (80 ) | 319,385 | 350,336 | 669,721 |
| KD + FS (16 ) | 79,633 | 104,475 | 184,108 |
| LAC + FS + EP (16 ) | 13,028 | 36,851 | 49,879 |

| Method | Time (s) | Speedup | Environment |
|---|---|---|---|
| KD + FS | 7.12 | 1.00× | Windows 11 (64-bit), AMD Ryzen 9 5900X, 64 GB RAM, Python 3.11.10 |
| LAC + FS + EP | 2.96 | 2.40× | Windows 11 (64-bit), AMD Ryzen 9 5900X, 64 GB RAM, Python 3.11.10 |
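
To relate the operation counts to the measured run times, the sketch below computes how many times cheaper the proposed method is than each baseline, and compares that with the wall-clock ratio. The method labels are shortened (the parenthesised settings are omitted) and the helper `ratio` is a hypothetical name; the figures are copied from the two tables above.

```python
# Total operation counts (multiplications plus additions) from the comparison table.
ops = {
    "Full Search": 139_427_415,
    "Full Search + PDS": 26_526_598,
    "KD + FS": 184_108,
    "LAC + FS + EP": 49_879,
}

# Measured run times in seconds from the timing table.
times = {"KD + FS": 7.12, "LAC + FS + EP": 2.96}


def ratio(table: dict, baseline: str, method: str) -> float:
    """Return how many times larger the baseline value is than the method value."""
    return table[baseline] / table[method]


proposed = "LAC + FS + EP"
for name in ops:
    if name != proposed:
        print(f"{name}: {ratio(ops, name, proposed):.1f}x the operations")

ops_ratio = ratio(ops, "KD + FS", proposed)
time_ratio = ratio(times, "KD + FS", proposed)
print(f"operation ratio {ops_ratio:.2f}, time ratio {time_ratio:.2f}")
```

Relative to KD + FS, the operation count drops by a factor of about 3.7, while the ratio of the rounded run times is about 2.4, in line with the reported speedup. The gap between the two may reflect costs that counting multiplications and additions does not capture.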
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Son, Y.; Chen, T.; Shang, G.; Kim, M.; Baek, S.-J. Fast Spectral Search Using Improved Preprocessing and Limited Axis Check. Mathematics 2025, 13, 3983. https://doi.org/10.3390/math13243983


