Identifying the Geographical Origin of Wolfberry Using Near-Infrared Spectroscopy and Stacking-Orthogonal Linear Discriminant Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. Samples
2.2. Spectra Acquisition
2.3. Spectral Preprocessing
2.3.1. Baseline Correction
2.3.2. Scatter Correction
2.3.3. Derivative Spectra
2.4. Principal Component Analysis
2.5. Linear Discriminant Analysis
2.6. Orthogonal Linear Discriminant Analysis
2.7. Base Learners and Stacking Combinations
2.7.1. Two Base Classifiers and One Meta-Classifier
2.7.2. Three Base Classifiers and One Meta-Classifier
2.8. Comparing Other Ensemble Learning Frameworks
2.9. Software
3. Results
3.1. Preprocessing Results
3.2. Stacking Classification Results and Optimization
3.2.1. Classification Accuracy of Different Methods
3.2.2. Analysis of Classifiers’ Performance
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Chemical Composition | Main Structural Group | Main Absorption Wavelength (nm) | Vibration Mode |
---|---|---|---|
protein/amino acid | N-H | 970–1050 | second overtone of stretching vibration |
wolfberry polysaccharide | C-H, O-H | 1150–1250 | second overtone of stretching vibration |
flavonoid | C-H | 1200 | second overtone of stretching vibration |
phenol | O-H | 1400–1450 | first overtone of stretching vibration |
moisture | O-H | 1400–1500 | first overtone of stretching vibration |
carotenoids | C-H | 1640–1680 | first overtone of stretching vibration |
Methods | Base Classifier 1 | Base Classifier 2 | Meta Classifier |
---|---|---|---|
Stacking 1 | KNN | Decision Tree | SVM |
Stacking 2 | KNN | SVM | Decision Tree |
Stacking 3 | SVM | Decision Tree | KNN |
Stacking 4 | KNN | Naive Bayes | SVM |
Stacking 5 | KNN | Naive Bayes | Decision Tree |
Stacking 6 | SVM | Naive Bayes | KNN |
Stacking 7 | SVM | Naive Bayes | Decision Tree |
Stacking 8 | Decision Tree | Naive Bayes | KNN |
Stacking 9 | Decision Tree | Naive Bayes | SVM |
Methods | Base Classifier 1 | Base Classifier 2 | Base Classifier 3 | Meta Classifier |
---|---|---|---|---|
Stacking 10 | KNN | Naive Bayes | Decision Tree | SVM |
Stacking 11 | KNN | Naive Bayes | SVM | Decision Tree |
Stacking 12 | SVM | Naive Bayes | Decision Tree | KNN |
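The combinations tabulated above can be illustrated with a short sketch. The following is one possible way to assemble the "Stacking 10" layout (KNN, Naive Bayes and Decision Tree as base classifiers, SVM as meta-classifier) using scikit-learn's `StackingClassifier`. It is not the authors' implementation: the synthetic data, the hyperparameters, the 5-fold internal split and the helper name `build_stacking_10` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for extracted spectral features (NOT the wolfberry data).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def build_stacking_10(n_folds=5):
    """Hypothetical helper: three base classifiers feeding an SVM meta-classifier."""
    base_classifiers = [
        ("knn", KNeighborsClassifier(n_neighbors=5)),   # assumed k
        ("nb", GaussianNB()),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ]
    # Out-of-fold predictions of the base classifiers become the
    # training features of the meta-classifier.
    return StackingClassifier(
        estimators=base_classifiers,
        final_estimator=SVC(),
        cv=n_folds,
    )


model = build_stacking_10()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"hold-out accuracy on synthetic data: {accuracy:.3f}")
```

Any other row of the two tables can be sketched the same way by changing the entries of `base_classifiers` and the `final_estimator`.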
Methods | Accuracy (%) | Methods | Accuracy (%) |
---|---|---|---|
RAW | 44.1 | FD | 55.9 |
SG | 83.6 | SG + MSC | 88.3 |
SNV | 56.7 | SG + SNV | 86.0 |
MSC | 59.3 | SG + FD | 85.5 |
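The preprocessing methods compared above (SG, SNV, MSC, FD and their combinations) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the Savitzky-Golay window length and polynomial order, the use of a finite difference for FD, the mean spectrum as MSC reference, and the simulated spectra and wavelength grid are all hypothetical choices.

```python
import numpy as np
from scipy.signal import savgol_filter


def sg_smooth(spectra, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing along the wavelength axis (assumed settings)."""
    return savgol_filter(spectra, window_length=window_length,
                         polyorder=polyorder, axis=1)


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Fit row = slope * ref + intercept, then undo both effects.
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


def first_derivative(spectra):
    """First derivative as a simple finite difference (an assumption)."""
    return np.gradient(spectra, axis=1)


# Simulated spectra: one absorption band with random gain, offset and noise.
rng = np.random.default_rng(0)
wavelengths = np.linspace(900, 1700, 200)          # hypothetical grid, nm
band = np.exp(-((wavelengths - 1450) / 60) ** 2)
gains = rng.uniform(0.8, 1.2, size=(20, 1))
offsets = rng.uniform(-0.1, 0.1, size=(20, 1))
raw = gains * band + offsets + rng.normal(0, 0.005, size=(20, 200))

sg_msc = msc(sg_smooth(raw))      # the "SG + MSC" combination
sg_snv = snv(sg_smooth(raw))      # the "SG + SNV" combination
sg_fd = first_derivative(sg_smooth(raw))  # the "SG + FD" combination

print("mean spread across samples, raw:     ", raw.std(axis=0).mean())
print("mean spread across samples, SG + MSC:", sg_msc.std(axis=0).mean())
```

On the simulated data, the scatter-induced spread between samples shrinks after SG + MSC, which is the effect scatter correction is meant to have; the accuracies in the table come from the original study and are not reproduced by this sketch.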
Methods | PCA, 2-Fold | PCA, 5-Fold | PCA, 8-Fold | PCA + LDA, 2-Fold | PCA + LDA, 5-Fold | PCA + LDA, 8-Fold | OLDA, 2-Fold | OLDA, 5-Fold | OLDA, 8-Fold |
---|---|---|---|---|---|---|---|---|---|
Single classifier | |||||||||
KNN | 84.50 | 88.25 | 88.60 | 86.15 | 90.00 | 90.40 | 87.40 | 91.25 | 90.40 |
Decision Tree | 80.00 | 83.63 | 84.00 | 83.35 | 87.00 | 87.60 | 86.15 | 90.00 | 89.80 |
SVM | 82.75 | 86.50 | 84.20 | 83.75 | 87.50 | 86.60 | 87.90 | 91.75 | 92.20 |
Naive Bayes | 63.90 | 67.20 | 66.40 | 66.60 | 70.00 | 71.90 | 78.10 | 81.63 | 79.80 |
Stacking combination | |||||||||
Stacking 1 | 88.25 | 90.00 | 91.40 | 91.20 | 92.38 | 91.20 | 92.80 | 93.88 | 94.20 |
Stacking 2 | 89.40 | 93.38 | 94.20 | 92.45 | 93.50 | 92.80 | 94.60 | 94.50 | 94.20 |
Stacking 3 | 92.55 | 94.50 | 95.20 | 93.60 | 93.88 | 94.60 | 95.65 | 96.50 | 97.00 |
Stacking 4 | 90.85 | 92.75 | 91.60 | 94.05 | 94.00 | 95.00 | 94.85 | 95.63 | 96.60 |
Stacking 5 | 93.90 | 93.88 | 92.40 | 95.15 | 95.63 | 96.00 | 95.30 | 96.13 | 95.20 |
Stacking 6 | 95.30 | 96.25 | 96.00 | 94.30 | 96.25 | 95.80 | 93.55 | 97.50 | 97.20 |
Stacking 7 | 89.70 | 91.88 | 91.40 | 94.60 | 94.50 | 94.80 | 93.25 | 94.13 | 94.80 |
Stacking 8 | 93.30 | 94.25 | 93.60 | 95.65 | 96.40 | 96.60 | 96.55 | 98.13 | 97.40 |
Stacking 9 | 92.65 | 94.50 | 92.20 | 96.20 | 95.75 | 94.20 | 95.65 | 96.63 | 96.00 |
Stacking 10 | 94.05 | 95.13 | 96.40 | 95.30 | 96.88 | 96.40 | 98.10 | 98.38 | 98.60 |
Stacking 11 | 92.45 | 94.38 | 95.80 | 96.90 | 96.75 | 96.40 | 96.90 | 97.50 | 97.60 |
Stacking 12 | 93.90 | 95.00 | 96.80 | 97.20 | 97.13 | 96.80 | 97.95 | 98.25 | 99.00 |
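A comparison of this kind can be set up as in the sketch below, assuming that the 2-Fold, 5-Fold and 8-Fold column labels refer to k-fold cross-validation. It evaluates a PCA + LDA feature extractor followed by KNN on synthetic data with scikit-learn. The data, the component counts and the value of k for KNN are hypothetical, and OLDA is not included because scikit-learn does not provide it.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for preprocessed spectra (NOT the wolfberry data).
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=8,
    n_classes=4, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)

# PCA compresses the spectra, LDA projects onto at most (classes - 1)
# discriminant directions, and KNN classifies in that reduced space.
pipeline = make_pipeline(
    PCA(n_components=10),                        # assumed number of components
    LinearDiscriminantAnalysis(n_components=3),  # 4 classes -> at most 3
    KNeighborsClassifier(n_neighbors=5),         # assumed k
)

results = {}
for n_folds in (2, 5, 8):
    folds = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=0)
    scores = cross_val_score(pipeline, X, y, cv=folds)
    results[n_folds] = 100.0 * scores.mean()
    print(f"{n_folds}-fold mean accuracy: {results[n_folds]:.2f}%")
```

Because the whole pipeline is passed to `cross_val_score`, PCA and LDA are refitted on the training part of every fold, so no information from the held-out part leaks into the feature extraction.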
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Song, S.; Wu, X.; Li, M.; Wu, B. Identifying the Geographical Origin of Wolfberry Using Near-Infrared Spectroscopy and Stacking-Orthogonal Linear Discriminant Analysis. Foods 2025, 14, 1684. https://doi.org/10.3390/foods14101684