Cross-Learner Spectral Subset Optimisation: PLS–Ensemble Feature Selection with Weighted Borda Count for Grapevine Cultivar Discrimination
Abstract
1. Introduction
2. Materials and Methods
2.1. Experimental Design
2.2. Construction of PLS–Ensemble
2.2.1. Filter-Based Feature Selection
- Variable importance in projection (VIP): First proposed by Wold et al. [46], VIP measures how much a waveband contributes to describing both the predictors and the responses in the PLS model [47,48]. A VIP value < 1 is typically taken to indicate a non-important variable. Mathematically, the VIP score of a waveband is the square root of the number of wavebands multiplied by the weighted mean of its squared, normalised loading weights across the latent components, with each component weighted by the response variance it explains.
- Selectivity ratio (SR): Based on the target projection (TP) of loadings, the SR assesses the discriminative power of each waveband by comparing explained variance with residual variance. TP projects the original data onto a single predictive component that captures the part of the variation in the predictors most strongly related to the responses [47,49]. This separates the variation into two parts: the “signal” (explained variance) and the “noise” (residual variance). The SR therefore reflects the signal-to-noise contribution of each waveband in the regression model, defined as the ratio of explained to residual variance for that waveband.
- Significance multivariate correlation (sMC): Unlike VIP and SR, sMC assesses a waveband’s statistical significance with respect to its relationship to the responses, rather than its relative importance [47]. For each waveband, sMC compares the variance explained by the regression with the residual variance, each adjusted for degrees of freedom, using an F-type statistic. A higher sMC value indicates that the waveband’s correlation with the responses is stronger than would be expected by chance or random noise.
- Loading weights (LW): For each latent component constructed by the PLS model, the loadings of predictors can serve as a measure of variable importance. Wavebands with a higher absolute LW contribute more strongly to the component. A subset of wavebands can then be selected based on a user-defined threshold.
- Regression coefficients (RC): Similar to LW, the regression coefficients calculated by the PLS model serve as a measure of variable importance. Wavebands with larger absolute coefficients contribute more strongly to predicting the responses, and a subset can be selected based on a user-defined threshold.
- Peak loadings: This common application of PLS feature selection [4,33] relies on the inspection of PLS loading plots. Peaks (either positive or negative) in the loading curve correspond to wavelengths that strongly influence the model, while values near zero indicate little contribution. Feature selection is then performed by choosing wavebands around these peaks, which differs from the LW approach that quantifies how much each variable contributes to the construction of the latent components.
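As a rough illustration of the two ratio-style filters described above, the following Python sketch computes VIP scores from loading weights and SR values from per-waveband variances. It assumes the commonly used VIP formulation (squared, normalised loading weights, weighted by the response sum of squares each component explains). The function names, argument layout and toy numbers are hypothetical and are not taken from the study.

```python
import math


def vip_scores(weights, explained_ss):
    """VIP score per waveband (assumed standard formulation).

    weights[a][j]   : loading weight of waveband j on latent component a
    explained_ss[a] : response sum of squares explained by component a
    """
    n_bands = len(weights[0])
    total_ss = sum(explained_ss)
    scores = []
    for j in range(n_bands):
        acc = 0.0
        for w_a, ss_a in zip(weights, explained_ss):
            norm_sq = sum(w * w for w in w_a)  # squared length of the weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_bands * acc / total_ss))
    return scores


def selectivity_ratio(explained_var, residual_var):
    """SR per waveband: explained variance divided by residual variance."""
    return [e / r for e, r in zip(explained_var, residual_var)]


# Toy example: two components, three wavebands (illustrative values only).
toy_weights = [[0.8, 0.6, 0.0], [0.0, 0.6, 0.8]]
toy_ss = [3.0, 1.0]
vip = vip_scores(toy_weights, toy_ss)
# Keep wavebands at or above the conventional VIP threshold of 1.
selected = [j for j, v in enumerate(vip) if v >= 1.0]
```

Under this formulation the squared VIP scores average to one across wavebands, which is why a threshold of 1 is a natural cut-off between above-average and below-average contributors.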
2.2.2. Wrapper-Based Feature Selection
- Backward variable elimination (BVE): This approach commonly utilises one of the previously described filter methods to rank wavebands. A user-defined threshold is then applied to determine the optimal subset size, after which a PLS model is refitted to evaluate subset performance. This process is repeated until the maximum number of iterations is reached or maximum model performance is observed [47]. BVE was implemented using the plsVarSel (0.9.12) R package and VIP for feature ranking [45].
- Interval PLS (iPLS): First introduced by Nørgaard et al. [51], iPLS splits the input wavebands into equal, non-overlapping intervals and fits a local PLS model to each interval. Backward elimination then iteratively removes the worst-performing interval, judged relative to a PLS model fitted to all wavebands in the remaining intervals [39]. The process iterates until no further improvement is observed or a maximum number of iterations is reached [52]. iPLS has been recommended for highly correlated spectral datasets because it evaluates groups of adjacent wavebands collectively, reducing the impact of multicollinearity [48,51]. The iPLS model was constructed using the mdatools (0.14.2) R package [52].
- Overlapping iPLS: A variant of iPLS, often called moving window or sliding window iPLS, divides the spectral range into overlapping intervals, allowing features spanning interval boundaries to be evaluated more continuously [39]. In this study, the original iPLS model was modified to use intervals of 100 wavebands with a step size of 50 wavebands (i.e., each interval overlapped the previous by 50 wavebands).
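The overlapping interval layout described above (width 100, step 50) can be sketched in Python as follows. This is a minimal illustration, not the mdatools implementation: the function name is hypothetical, indices are zero-based positions rather than wavelengths, and truncating the final interval at the end of the spectrum is an assumption, since the text does not say how the tail is handled. The value 1874 assumes that the full feature set reported in the results table (p = 1874) is the input.

```python
def overlapping_intervals(n_bands, width=100, step=50):
    """Return (start, end) index pairs, end exclusive, covering n_bands.

    Consecutive intervals overlap by width - step wavebands. The last
    interval is truncated at n_bands (an assumed tail-handling rule).
    """
    intervals = []
    start = 0
    while True:
        end = min(start + width, n_bands)
        intervals.append((start, end))
        if end == n_bands:
            break
        start += step
    return intervals


# Layout for a 1874-waveband spectrum with 100-band windows and a 50-band step.
windows = overlapping_intervals(1874, width=100, step=50)
```

Each window would then be treated as one candidate interval in the backward elimination, in the same way as the non-overlapping intervals of standard iPLS.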
2.2.3. Multicriteria Evaluation (Hybrid Filter Approach)
2.2.4. Combined Approaches
2.2.5. Final PLS–Ensemble
- For each FS method, the selected wavebands were ranked by selection frequency, with higher frequency values receiving higher ranks.
- Each rank was then converted into Borda points, with higher-ranked wavebands receiving more points.
- The Borda points for each waveband were then summed across all 18 FS methods to obtain a consensus score.
- The input is a matrix of FS subsets evaluated across classifiers and performance metrics, in which each FS subset is ranked according to its performance.
- The absolute differences between individual subset rankings (i.e., the sum of ranking differences (SRD) values) are then calculated and summed into SRD scores.
- The aggregated SRD scores are then min–max-normalised to produce a weight vector, which is applied to the Borda scores from each FS method before the final aggregation (given by Equation (9)), yielding a weighted ensemble ranking of wavebands.
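The aggregation steps above can be sketched in Python as follows. Several details are assumptions rather than the study's specification: the Borda convention (rank 1 earns as many points as there are candidate wavebands, and unselected wavebands earn none), the tie-breaking rule, and the use of the min–max-normalised SRD scores directly as weights. The text does not state whether low-SRD methods should receive the larger weight (for example via one minus the normalised score), so the sketch leaves the normalised value unchanged. All function names and toy numbers are hypothetical.

```python
def borda_points(ranked_bands, n_candidates):
    """Map a best-first ranking to Borda points.

    Assumed convention: rank 1 earns n_candidates points, rank 2 earns
    n_candidates - 1, and wavebands a method did not select earn nothing.
    """
    return {band: n_candidates - pos for pos, band in enumerate(ranked_bands)}


def srd_score(ranks, reference_ranks):
    """Sum of absolute rank differences against a reference ranking."""
    return sum(abs(r - ref) for r, ref in zip(ranks, reference_ranks))


def min_max(values):
    """Rescale values to [0, 1]; a constant vector maps to all ones."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def weighted_borda(rankings, weights, n_candidates):
    """Sum weighted Borda points per waveband across FS methods."""
    totals = {}
    for ranked_bands, weight in zip(rankings, weights):
        for band, points in borda_points(ranked_bands, n_candidates).items():
            totals[band] = totals.get(band, 0.0) + weight * points
    # Highest consensus score first; ties broken by wavelength (assumed rule).
    order = sorted(totals, key=lambda band: (-totals[band], band))
    return order, totals


# Toy input: three hypothetical FS methods ranking four candidate wavebands (nm).
rankings = [[720, 680, 550], [720, 550], [680, 1820, 720]]
weights = min_max([10.0, 30.0, 20.0])  # illustrative SRD scores, one per method
order, totals = weighted_borda(rankings, weights, n_candidates=4)
```

Note that plain min–max normalisation gives the method with the smallest SRD score a weight of zero, so it drops out of the consensus entirely; any offset or inversion applied in practice would change the resulting ranking.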
2.3. Aggregation of Spectral Wavebands
2.4. Assessment of Waveband Subset
2.4.1. Oblique Random Forest (oRF)
2.4.2. Multinomial Logistic Regression (Multinom)
2.4.3. Support Vector Machine (SVM)
2.4.4. Multi-Layer Perceptron (MLP)
2.4.5. One-Dimensional Convolutional Neural Network (1D CNN)
3. Results and Discussion
3.1. Assessment of the PLS–Ensemble and Aggregated Subsets
3.2. Assessment of Model Performance
3.3. Limitations and Future Work
4. Conclusions
- Integrating non-linear feature selection methods;
- Validating spectral subsets across temporal and spatial domains;
- Testing streamlined ensembles under tuned classifier settings.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Tufail, R.; Tassinari, P.; Torreggiani, D. Assessing feature extraction, selection, and classification combinations for crop mapping using Sentinel-2 time series: A case study in northern Italy. Remote Sens. Appl. Soc. Environ. 2025, 38, 101525.
2. Nabil, M.; Farg, E.; Afify, N.M.; Arafat, S.M. Optimizing crop monitoring: Mapping cultivation stages and types with sentinel-1/2 and random forest algorithm. Int. J. Remote Sens. 2025, 46, 273–299.
3. Karakizi, C.; Oikonomou, M.; Karantzalos, K. Vineyard Detection and Vine Variety Discrimination from Very High Resolution Satellite Data. Remote Sens. 2016, 8, 235.
4. Mirzaei, M.; Marofi, S.; Abbasi, M.; Solgi, E.; Karimi, R.; Verrelst, J. Scenario-based discrimination of common grapevine varieties using in-field hyperspectral data in the western of Iran. Int. J. Appl. Earth Obs. Geoinf. 2019, 80, 26–37.
5. Carneiro, G.A.; Cunha, A.; Aubry, T.J.; Sousa, J. Advancing Grapevine Variety Identification: A Systematic Review of Deep Learning and Machine Learning Approaches. AgriEngineering 2024, 6, 4851–4888.
6. Bramley, R.G.V.; Ouzman, J.; Sturman, A.P.; Grealish, G.J.; Ratcliff, C.E.M.; Trought, M.C.T. Underpinning Terroir with Data: Integrating Vineyard Performance Metrics with Soil and Climate Data to Better Understand Within-Region Variation in Marlborough, New Zealand. Aust. J. Grape Wine Res. 2023, 2023, 8811402.
7. Ferro, M.V.; Catania, P. Technologies and Innovative Methods for Precision Viticulture: A Comprehensive Review. Horticulturae 2023, 9, 399.
8. Li, W.; Feng, F.; Li, H.; Du, Q. Discriminant Analysis-Based Dimension Reduction for Hyperspectral Image Classification: A Survey of the Most Recent Advances and an Experimental Comparison of Different Techniques. IEEE Geosci. Remote Sens. Mag. 2018, 6, 15–34.
9. Canero, F.M.; Rodriguez-Galiano, V.; Aragones, D. Machine Learning and Feature Selection for soil spectroscopy. An evaluation of Random Forest wrappers to predict soil organic matter, clay, and carbonates. Heliyon 2024, 10, e30228.
10. Hughes, G. On the mean accuracy of statistical pattern recognizers. IEEE Trans. Inf. Theory 1968, 14, 55–63.
11. Raja, S.P.; Sawicka, B.; Stamenkovic, Z.; Mariammal, G. Crop Prediction Based on Characteristics of the Agricultural Environment Using Various Feature Selection Techniques and Classifiers. IEEE Access 2022, 10, 23625–23641.
12. Imran, H.A.; Zeggada, A.; Ianniello, I.; Melgani, F.; Polverari, A.; Baroni, A.; Danzi, D.; Goller, R. Low-cost handheld spectrometry for detecting Flavescence dorée in vineyards. Appl. Sci. 2023, 13, 2388.
13. López, A.; Ogayar, C.J.; Feito, F.R.; Sousa, J.J. Classification of Grapevine Varieties Using UAV Hyperspectral Imaging. Remote Sens. 2024, 16, 2103.
14. Gutiérrez, S.; Fernández-Novales, J.; Diago, M.P.; Tardaguila, J. On-the-go hyperspectral imaging under field conditions and machine learning for the classification of grapevine varieties. Front. Plant Sci. 2018, 9, 1102.
15. Pôças, I.; Tosin, R.; Gonçalves, I.; Cunha, M. Toward a generalized predictive model of grapevine water status in Douro region from hyperspectral data. Agric. For. Meteorol. 2020, 280, 107793.
16. Rodriguez-Galiano, V.F.; Luque-Espinar, J.A.; Chica-Olmo, M.; Mendes, M.P. Feature selection approaches for predictive modelling of groundwater nitrate pollution: An evaluation of filters, embedded and wrapper methods. Sci. Total Environ. 2018, 624, 661–672.
17. Loggenberg, K.; Poona, N. A feature selection approach for terrestrial hyperspectral image analysis. S. Afr. J. Geomat. 2020, 9, 302–320.
18. Santos-Rufo, A.; Mesas-Carrascosa, F.-J.; García-Ferrer, A.; Meroño-Larriva, J.E. Wavelength Selection Method Based on Partial Least Square from Hyperspectral Unmanned Aerial Vehicle Orthomosaic of Irrigated Olive Orchards. Remote Sens. 2020, 12, 3426.
19. He, S.; Peng, P.; Chen, Y.; Wang, X. Multi-Crop Classification Using Feature Selection-Coupled Machine Learning Classifiers Based on Spectral, Textural and Environmental Features. Remote Sens. 2022, 14, 3153.
20. Zhang, X.; Xue, J.; Chen, S.; Wang, N.; Xie, T.; Xiao, Y.; Chen, X.; Shi, Z.; Huang, Y.; Zhuo, Z. Fine Resolution Mapping of Soil Organic Carbon in Croplands with Feature Selection and Machine Learning in Northeast Plain China. Remote Sens. 2023, 15, 5033.
21. Swe, K.N.; Takai, S.; Noguchi, N. Novel approaches for a brix prediction model in Rondo wine grapes using a hyperspectral Camera: Comparison between destructive and Non-destructive sensing methods. Comput. Electron. Agric. 2023, 211, 108037.
22. Rapaport, T.; Hochberg, U.; Shoshany, M.; Karnieli, A.; Rachmilevitch, S. Combining leaf physiology, hyperspectral imaging and partial least squares-regression (PLS-R) for grapevine water status assessment. ISPRS J. Photogramm. Remote Sens. 2015, 109, 88–97.
23. Fu, X.; Zhou, W.; Zhou, X.; Hu, Y. Crop Mapping and Spatio–Temporal Analysis in Valley Areas Using Object-Oriented Machine Learning Methods Combined with Feature Optimization. Agronomy 2023, 13, 2467.
24. Chancia, R.; Bates, T.; Vanden Heuvel, J.; van Aardt, J. Assessing grapevine nutrient status from unmanned aerial system (UAS) hyperspectral imagery. Remote Sens. 2021, 13, 4489.
25. Gao, H.; Xu, L.; Li, C.; Shi, A.; Huang, F.; Ma, Z. A New Feature Selection Method for Hyperspectral Image Classification Based on Simulated Annealing Genetic Algorithm and Choquet Fuzzy Integral. Math. Probl. Eng. 2013, 2013, 537268.
26. Shastry, K.; Sanjay, H. A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture. Knowl.-Based Syst. 2021, 232, 107460.
27. Sawant, S.S.; Manoharan, P.; Loganathan, A. Band selection strategies for hyperspectral image classification based on machine learning and artificial intelligent techniques – Survey. Arab. J. Geosci. 2021, 14, 646.
28. Pero, C.; Bakshi, S.; Nappi, M.; Tortora, G. IoT-Driven Machine Learning for Precision Viticulture Optimization. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 2437–2447.
29. Loggenberg, K.; Strever, A.; Münch, Z. Scoping the Field: Recent Advances in Optical Remote Sensing for Precision Viticulture. ISPRS Int. J. Geo-Inf. 2024, 13, 385.
30. Drotár, P.; Gazda, M.; Vokorokos, L. Ensemble feature selection using election methods and ranker clustering. Inf. Sci. 2019, 480, 365–380.
31. L’Heureux, A.; Grolinger, K.; Elyamany, H.F.; Capretz, M.A.M. Machine Learning with Big Data: Challenges and Approaches. IEEE Access 2017, 5, 7776–7797.
32. Gabrielli, M.; Ounaissi, D.; Lançon-Verdier, V.; Julien, S.; Le Meurlay, D.; Maury, C. Hyperspectral imaging to assess wine grape quality. JSFA Rep. 2023, 3, 452–462.
33. Diago, M.P.; Fernandes, A.M.; Millan, B.; Tardaguila, J.; Melo-Pinto, P. Identification of grapevine varieties using leaf spectroscopy and partial least squares. Comput. Electron. Agric. 2013, 99, 7–13.
34. Rafique, R.; Ahmad, T.; Ahmed, M.; Azam Khan, M. Exploring key physiological attributes of grapevine cultivars under the influence of seasonal environmental variability. OENO One 2023, 57, 381–397.
35. Borgogno-Mondino, E.; De Palma, L.; Novello, V. Investigating Sentinel 2 Multispectral Imagery Efficiency in Describing Spectral Response of Vineyards Covered with Plastic Sheets. Agronomy 2020, 10, 1909.
36. Carey, V.A.; Saayman, D.; Archer, E.; Barbeau, G.; Wallace, M. Viticultural terroirs in Stellenbosch, South Africa. I. The identification of natural terroir units. OENO One 2008, 42, 169–183.
37. Council for Scientific and Industrial Research (CSIR). Cape Winelands District Municipality Climate Change Adaptation Plan: Draft 1; CSIR GreenBook: Pretoria, South Africa, 2023; pp. 11–12.
38. Lin, W.; Hang, H.; Zhuang, Y.; Zhang, S. Variable selection in partial least squares with the weighted variable contribution to the first singular value of the covariance matrix. Chemom. Intell. Lab. Syst. 2018, 183, 113–121.
39. Wang, L.-L.; Lin, Y.-W.; Wang, X.-F.; Xiao, N.; Xu, Y.-D.; Li, H.-D.; Xu, Q.-S. A selective review and comparison for interval variable selection in spectroscopic modeling. Chemom. Intell. Lab. Syst. 2018, 172, 229–240.
40. Sinha, R.; Khot, L.R.; Rathnayake, A.P.; Gao, Z.; Naidu, R.A. Visible-near infrared spectroradiometry-based detection of grapevine leafroll-associated virus 3 in a red-fruited wine grape cultivar. Comput. Electron. Agric. 2019, 162, 165–173.
41. Wold, S.; Martens, H.; Wold, H. The multivariate calibration problem in chemistry solved by the PLS method. In Matrix Pencils: Proceedings of a Conference Held at Pite Havsbad, Sweden, 22–24 March 1982; Springer: Berlin/Heidelberg, Germany, 1983; pp. 286–293.
42. Martens, H. Multivariate Calibration. Doctoral Thesis, Technical University of Norway, Trondheim, Norway, 1985.
43. Helland, I.S. On the structure of partial least squares regression. Commun. Stat.-Simul. Comput. 1988, 17, 581–607.
44. Abrantes, G.; Almeida, V.; Maia, A.J.; Nascimento, R.; Nascimento, C.; Silva, Y.; Silva, Y.; Veras, G. Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil. Molecules 2023, 28, 6959.
45. Mehmood, T.; Liland, K.H.; Snipen, L.; Sæbø, S. A review of variable selection methods in Partial Least Squares Regression. Chemom. Intell. Lab. Syst. 2012, 118, 62–69.
46. Wold, S.; Sjöström, M.; Eriksson, L. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 2001, 58, 109–130.
47. Mehmood, T.; Sæbø, S.; Liland, K.H. Comparison of variable selection methods in partial least squares regression. J. Chemom. 2020, 34, e3226.
48. Andersen, C.M.; Bro, R. Variable selection in regression—A tutorial. J. Chemom. 2010, 24, 728–737.
49. Kvalheim, O.M. Variable importance: Comparison of selectivity ratio and significance multivariate correlation for interpretation of latent-variable regression models. J. Chemom. 2020, 34, e3211.
50. Yang, W.; Xiong, Y.; Wang, H.; Wu, T.; Du, Y. Interval interaction moving window partial least squares for wavelength interval selection in near infrared spectroscopy. Chemom. Intell. Lab. Syst. 2023, 241, 104976.
51. Nørgaard, L.; Saudland, A.; Wagner, J.; Nielsen, J.P.; Munck, L.; Engelsen, S.B. Interval Partial Least-Squares Regression (iPLS): A Comparative Chemometric Study with an Example from Near-Infrared Spectroscopy. Appl. Spectrosc. 2000, 54, 413–419.
52. Kucheryavskiy, S. mdatools—R package for chemometrics. Chemom. Intell. Lab. Syst. 2020, 198, 103937.
53. Mishra, S.; Mishra, D.; Mallick, P.K.; Santra, G.H.; Kumar, S. A Novel Borda Count based Feature Ranking and Feature Fusion Strategy to Attain Effective Climatic Features for Rice Yield Prediction. Informatica 2021, 45.
54. Miri, M.; Dowlatshahi, M.B.; Hashemi, A. Evaluation multi label feature selection for text classification using weighted borda count approach. In 2022 9th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS); IEEE: New York, NY, USA, 2022.
55. Héberger, K. Sum of ranking differences compares methods or models fairly. TrAC Trends Anal. Chem. 2010, 29, 101–109.
56. Héberger, K.; Kollár-Hunek, K. Sum of ranking differences for method discrimination and its validation: Comparison of ranks with random numbers. J. Chemom. 2011, 25, 151–158.
57. Varoquaux, G.; Colliot, O. Evaluating Machine Learning Models and Their Diagnostic Value. In Machine Learning for Brain Disorders; Humana: New York, NY, USA, 2023; pp. 601–630.
58. Opitz, J. A Closer Look at Classification Evaluation Metrics and a Critical Reflection of Common Evaluation Practice. Trans. Assoc. Comput. Linguist. 2024, 12, 820–836.
59. Menze, B.H.; Kelm, B.M.; Splitthoff, D.N.; Koethe, U.; Hamprecht, F.A. On Oblique Random Forests. In Machine Learning and Knowledge Discovery in Databases; Springer: Berlin/Heidelberg, Germany, 2011; pp. 453–469.
60. Poona, N.; Van Niekerk, A.; Ismail, R. Investigating the Utility of Oblique Tree-Based Ensembles for the Classification of Hyperspectral Data. Sensors 2016, 16, 1918.
61. Jaeger, B.C.; Welden, S.; Lenoir, K.; Pajewski, N.M. aorsf: An R package for supervised learning using the oblique random survival forest. J. Open Source Softw. 2022, 7, 4705.
62. Kuhn, M. Building Predictive Models in R Using the caret Package. J. Stat. Softw. 2008, 28, 1–26.
63. Kleinbaum, D.G.; Klein, M. Introduction to Logistic Regression. In Logistic Regression: A Self-Learning Text; Kleinbaum, D.G., Klein, M., Eds.; Springer: New York, NY, USA, 2010; pp. 1–39.
64. Boser, B.E.; Guyon, I.M.; Vapnik, V.N. A training algorithm for optimal margin classifiers. In COLT ’92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory; Association for Computing Machinery: New York, NY, USA, 1992; pp. 144–152.
65. Meyer, D.; Dimitriadou, E.; Hornik, K.; Weingessel, A.; Leisch, F. e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071); TU Wien: Vienna, Austria, 2023.
66. Skidmore, A.; Turner, B.; Brinkhof, W.; Knowles, E. Performance of a neural network: Mapping forests using GIS and remotely sensed data. Photogramm. Eng. Remote Sens. 1997, 63, 501–514.
67. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for large-scale machine learning. In OSDI’16: Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation; USENIX Association: Berkeley, CA, USA, 2016; pp. 265–283.
68. Cacciari, I.; Ranfagni, A. Hands-On Fundamentals of 1D Convolutional Neural Networks—A Tutorial for Beginner Users. Appl. Sci. 2024, 14, 8500.
69. Hennessy, A.; Clarke, K.; Lewis, M. Hyperspectral Classification of Plants: A Review of Waveband Selection Generalisability. Remote Sens. 2020, 12, 113.
70. Pôças, I.; Rodrigues, A.; Gonçalves, S.; Costa, P.; Gonçalves, I.; Pereira, L.; Cunha, M. Predicting grapevine water status based on hyperspectral reflectance vegetation indices. Remote Sens. 2015, 7, 16460–16479.
71. Sims, D.A.; Gamon, J.A. Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages. Remote Sens. Environ. 2002, 81, 337–354.
72. Imran, H.A.; Gianelle, D.; Rocchini, D.; Dalponte, M.; Martín, M.P.; Sakowska, K.; Wohlfahrt, G.; Vescovo, L. VIS-NIR, Red-Edge and NIR-Shoulder Based Normalized Vegetation Indices Response to Co-Varying Leaf and Canopy Structural Traits in Heterogeneous Grasslands. Remote Sens. 2020, 12, 2254.
73. Li, C.; Czyż, E.A.; Halitschke, R.; Baldwin, I.T.; Schaepman, M.E.; Schuman, M.C. Evaluating potential of leaf reflectance spectra to monitor plant genetic variation. Plant Methods 2023, 19, 108.
74. Khadka, K.; Burt, A.J.; Earl, H.J.; Raizada, M.N.; Navabi, A. Does Leaf Waxiness Confound the Use of NDVI in the Assessment of Chlorophyll When Evaluating Genetic Diversity Panels of Wheat? Agronomy 2021, 11, 486.
75. Gitelson, A.A.; Merzlyak, M.N.; Chivkunova, O.B. Optical Properties and Nondestructive Estimation of Anthocyanin Content in Plant Leaves. Photochem. Photobiol. 2007, 74, 38–45.
76. Yu, Z.; Zhang, X.; Liu, H.; Zhang, Z.; Meng, L.; Han, Y.; Lu, L. Improving SPAD spectral estimation accuracy of rice leaves by considering the effect of leaf water content. Crop Sci. 2022, 62, 2382–2395.
77. Li, Y.; Yang, K.; Wu, B. Feature Selection and Spectral Indices for Identifying Maize Stress Types. Appl. Spectrosc. 2025, 79, 306–319.
78. Gitelson, A.A.; Zur, Y.; Chivkunova, O.B.; Merzlyak, M.N. Assessing carotenoid content in plant leaves with reflectance spectroscopy. Photochem. Photobiol. 2002, 75, 272–281.
79. Jernelv, I.L.; Hjelme, D.R.; Matsuura, Y.; Aksnes, A. Convolutional neural networks for classification and regression analysis of one-dimensional spectral data. arXiv 2020, arXiv:2005.07530.
80. Rossberg, N.; Gautam, R.; Komolibus, K.; O’Sullivan, B.; Visentin, A. Explainable AI-Based Feature Selection Approaches for Raman Spectroscopy. Diagnostics 2025, 15, 2063.




| EM Region | Selected Wavebands (nm) | Number of Wavebands |
|---|---|---|
| Red visible to near-infrared (NIR) | 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742 | 61 |
| Green to yellow–green visible | 508, 509, 510, 511, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566 | 21 |
| Short-wave infrared (SWIR) | 1820, 1821, 1822, 1823, 1824, 1926, 1927, 1931, 1932, 1962, 1963, 1969, 1970, 1971, 1972, 1973, 1975, 1976 | 18 |

| Cluster | Adjacency | Bin Label | EM Region | Waveband Range (nm) | Individual Wavebands (nm) | Number of Wavebands |
|---|---|---|---|---|---|---|
| C1 | A4 | C1_A4 | Red-edge | 717–730 | 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730 | 14 |
| C2 | A4 | C2_A4 | Red-edge | 731–742 | 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742 | 12 |
| C3 | A5 | C3_A5 | SWIR | 1820–1824 | 1820, 1821, 1822, 1823, 1824 | 5 |
| C4 | A2 | C4_A2 | Green | 550–566 | 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566 | 17 |
| C4 | A4 | C4_A4 | Red-edge | 703–716 | 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716 | 14 |
| C5 | A3 | C5_A3 | Red-edge | 670–690 | 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690 | 21 |
| C6 | A6 | C6_A6 | SWIR | 1926–1927 | 1926, 1927 | 2 |
| C7 | A6 | C7_A6 | SWIR | 1931–1932 | 1931, 1932 | 2 |
| C8 | A7 | C8_A7 | SWIR | 1962–1976 | 1962, 1963, 1969, 1970, 1971, 1972, 1973, 1975, 1976 | 9 |
| C9 | A1 | C9_A1 | Green | 508–511 | 508, 509, 510, 511 | 4 |

| Source | Metric | Feature Set | oRF | Multinom | SVM | MLP | CNN |
|---|---|---|---|---|---|---|---|
| Train | F1 | p = 1874 | 0.86 | 1.00 | 1.00 | 1.00 | 1.00 |
| Train | F1 | p = 100 | 0.95 | 1.00 | 1.00 | 1.00 | 1.00 |
| Train | F1 | p = 10 | 0.93 | 1.00 | 1.00 | 1.00 | 1.00 |
| Train | BACC | p = 1874 | 0.85 | 1.00 | 1.00 | 1.00 | 1.00 |
| Train | BACC | p = 100 | 0.91 | 0.99 | 0.99 | 1.00 | 1.00 |
| Train | BACC | p = 10 | 0.90 | 0.99 | 0.98 | 1.00 | 1.00 |
| Train | MCC | p = 1874 | 0.48 | 1.00 | 1.00 | 1.00 | 1.00 |
| Train | MCC | p = 100 | 0.50 | 0.94 | 0.85 | 1.00 | 1.00 |
| Train | MCC | p = 10 | 0.58 | 0.84 | 0.79 | 1.00 | 1.00 |
| Train | AUC | p = 1874 | 0.71 | 1.00 | 1.00 | 1.00 | 1.00 |
| Train | AUC | p = 100 | 0.77 | 0.97 | 0.91 | 1.00 | 1.00 |
| Train | AUC | p = 10 | 0.79 | 0.93 | 0.90 | 1.00 | 1.00 |
| Test | F1 | p = 1874 | 0.75 | 0.93 | 0.93 | 0.72 | 0.85 |
| Test | F1 | p = 100 | 0.88 | 0.93 | 1.00 | 0.80 | 0.78 |
| Test | F1 | p = 10 | 1.00 | 1.00 | 1.00 | 0.87 | 0.83 |
| Test | BACC | p = 1874 | 0.79 | 0.91 | 0.98 | 0.85 | 0.92 |
| Test | BACC | p = 100 | 0.86 | 0.91 | 0.99 | 0.89 | 0.88 |
| Test | BACC | p = 10 | 0.97 | 0.98 | 0.98 | 0.93 | 0.91 |
| Test | MCC | p = 1874 | 0.53 | 0.71 | 0.80 | 0.67 | 0.83 |
| Test | MCC | p = 100 | 0.60 | 0.71 | 0.82 | 0.76 | 0.74 |
| Test | MCC | p = 10 | 0.64 | 0.80 | 0.78 | 0.85 | 0.80 |
| Test | AUC | p = 1874 | 0.74 | 0.89 | 0.90 | 0.92 | 0.99 |
| Test | AUC | p = 100 | 0.87 | 0.90 | 0.96 | 0.97 | 0.95 |
| Test | AUC | p = 10 | 0.90 | 0.93 | 0.95 | 0.98 | 0.96 |

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Loggenberg, K.; Strever, A.; Münch, Z. Cross-Learner Spectral Subset Optimisation: PLS–Ensemble Feature Selection with Weighted Borda Count for Grapevine Cultivar Discrimination. Geomatics 2026, 6, 12. https://doi.org/10.3390/geomatics6010012

