XAI-Supported Electronic Tongue for Estimating Milk Composition and Adulteration Indicators
Abstract
1. Introduction
- Experimentally demonstrating that, with a low-cost 18-band multispectral sensor, multiple quality/adulteration parameters beyond fat and protein (including SNF, density, freezing point, and water addition) can be estimated on the same system.
- Presenting a systematic and comparative modeling framework for an expanded quality-parameter set, unlike the predominantly limited-parameter (fat–protein) focus in the literature.
- Methodologically testing field generalizability by reporting performance under both classical 5-fold cross-validation and Leave-One-Out (LOO) strategies.
- Isolating band-level contributions by applying permutation-based importance analysis separately for each target variable, thereby offering data-driven band-selection recommendations for low-cost optical design.
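
The two validation strategies named in the contributions can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the 18-column matrix, the target, and the helper `pooled_cv_r2` are hypothetical, and only the 5-fold settings (n_splits = 5, shuffle = True, random_state = 42) come from the parameter table. Because a single left-out sample cannot yield an R² of its own, the sketch assumes out-of-fold predictions are pooled and scored once.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_predict


def pooled_cv_r2(model, X, y, cv):
    """Collect out-of-fold predictions and score them once over all samples."""
    y_pred = cross_val_predict(model, X, y, cv=cv)
    return r2_score(y, y_pred)


# Synthetic stand-in for 18-band sensor readings (hypothetical data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 18))
y = 3.0 * X[:, 2] - 2.0 * X[:, 10] + rng.normal(0.0, 0.05, size=60)

model = GradientBoostingRegressor(random_state=42)
five_fold = KFold(n_splits=5, shuffle=True, random_state=42)
loo = LeaveOneOut()

r2_5fold = pooled_cv_r2(model, X, y, five_fold)
r2_loo = pooled_cv_r2(model, X, y, loo)
print(f"5-fold R2: {r2_5fold:.3f}  LOO R2: {r2_loo:.3f}")
```

LOO refits the model once per sample, so its cost grows linearly with the dataset size, while 5-fold always needs five fits.
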
2. Materials and Methods
2.1. Sample Collection, Measurement Conditions, and Statistical Characteristics
2.2. Sensor Architecture and Spectral Bands
2.3. Data Preprocessing, Feature Extraction, and Outlier Analysis
2.4. Machine Learning Algorithms
2.5. Validation Strategy and Performance Metrics
2.6. Explainability Analysis
3. Results and Discussion
3.1. Added Water Prediction
3.2. Density Prediction
3.3. Fat Prediction
3.4. Freezing Point Prediction
3.5. Protein Prediction
3.6. Solids-Not-Fat (SNF) Prediction
3.7. General Discussion and Comparison with Literature
3.8. Five-Fold Cross-Validation Results and Comparison with LOO
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Vanaraj, R.; IP, B.; Mayakrishnan, G.; Kim, I.S.; Kim, S.-C. A Systematic Review of the Applications of Electronic Nose and Electronic Tongue in Food Quality Assessment and Safety. Chemosensors 2025, 13, 161.
- Gil, M.; Rudy, M.; Duma-Kocan, P.; Stanisławczyk, R. Electronic Sensing Technologies in Food Quality Assessment: A Comprehensive Literature Review. Appl. Sci. 2025, 15, 1530.
- Jiang, W.; Liu, C.; Liu, W.; Zheng, L. Advancements in Intelligent Sensing Technologies for Food Safety Detection. Research 2025, 8, 0713.
- Yang, B.; Huang, X.; Yan, X.; Zhu, X.; Guo, W. A Cost-Effective on-Site Milk Analyzer Based on Multispectral Sensor. Comput. Electron. Agric. 2020, 179, 105823.
- Zhu, Z.; Guo, W. Recent Developments on Rapid Detection of Main Constituents in Milk: A Review. Crit. Rev. Food Sci. Nutr. 2021, 61, 312–324.
- Galvan, D.; Lelis, C.A.; Effting, L.; Melquiades, F.L.; Bona, E.; Conte-Junior, C.A. Low-Cost Spectroscopic Devices with Multivariate Analysis Applied to Milk Authenticity. Microchem. J. 2022, 181, 107746.
- Hayes, E.; Greene, D.; O’Donnell, C.; O’Shea, N.; Fenelon, M.A. Spectroscopic Technologies and Data Fusion: Applications for the Dairy Industry. Front. Nutr. 2023, 9, 1074688.
- Fizza, K.; Banerjee, A.; Georgakopoulos, D.; Jayaraman, P.P.; Yavari, A.; Dawod, A. An Inexpensive AI-Powered IoT Sensor for Continuous Farm-to-Factory Milk Quality Monitoring. Sensors 2025, 25, 4439.
- Agiomavriti, A.-A.; Nikolopoulou, M.P.; Bartzanas, T.; Chorianopoulos, N.; Demestichas, K.; Gelasakis, A.I. Spectroscopy-Based Methods and Supervised Machine Learning Applications for Milk Chemical Analysis in Dairy Ruminants. Chemosensors 2024, 12, 263.
- Gastélum-Barrios, A.; Soto-Zarazúa, G.M.; Escamilla-García, A.; Toledano-Ayala, M.; Macías-Bobadilla, G.; Jauregui-Vazquez, D. Optical Methods Based on Ultraviolet, Visible, and Near-Infrared Spectra to Estimate Fat and Protein in Raw Milk: A Review. Sensors 2020, 20, 3356.
- Muñiz, R.; Cuevas-Valdés, M.; de la Roza-Delgado, B. Milk Quality Control Requirement Evaluation Using a Handheld near Infrared Reflectance Spectrophotometer and a Bespoke Mobile Application. J. Food Compos. Anal. 2020, 86, 103388.
- Riu, J.; Gorla, G.; Chakif, D.; Boqué, R.; Giussani, B. Rapid Analysis of Milk Using Low-Cost Pocket-Size NIR Spectrometers and Multivariate Analysis. Foods 2020, 9, 1090.
- Diaz-Olivares, J.A.; Adriaens, I.; Stevens, E.; Saeys, W.; Aernouts, B. Online Milk Composition Analysis with an On-Farm near-Infrared Sensor. Comput. Electron. Agric. 2020, 178, 105734.
- Durgun, M. Real-Time Milk Quality Control Using Multi-Spectral Sensing and Edge Computing: Advancing On-Site Detection of Milk Components with XGBoost. Appl. Sci. 2024, 14, 10916.
- Wang, Y.; Zhang, K.; Shi, S.; Wang, Q.; Liu, S. Portable Protein and Fat Detector in Milk Based on Multi-Spectral Sensor and Machine Learning. Appl. Sci. 2023, 13, 12320.
- Goyal, K.; Kumar, P.; Verma, K. XAI-Empowered IoT Multi-Sensor System for Real-Time Milk Adulteration Detection. Food Control 2024, 164, 110495.
- ISO 2446:2008|IDF 226:2008; Milk—Determination of Fat Content. ISO: Geneva, Switzerland, 2008.
- ISO 8968-1:2014; Milk and Milk Products—Determination of Nitrogen Content—Part 1: Kjeldahl Principle and Crude Protein Calculation. ISO: Geneva, Switzerland, 2014.
- ISO 6731:2010; Milk, Cream and Evaporated Milk—Determination of Total Solids Content. ISO: Geneva, Switzerland, 2010.
- ISO 5764:2009; Milk—Determination of Freezing Point—Thermistor Cryoscope Method. ISO: Geneva, Switzerland, 2009.
- AS7265x-Triad Spectroscopy Sensor. Available online: https://cdn.sparkfun.com/assets/c/2/9/0/a/AS7265x_Datasheet.pdf (accessed on 24 February 2026).













Number of samples at each dilution ratio:

| Dilution Ratio (%) | Number of Samples |
|---|---|
| 0 | 52 |
| 2 | 8 |
| 3 | 8 |
| 4 | 8 |
| 5 | 8 |
| 10 | 8 |
| 20 | 8 |
| 25 | 20 |
| 30 | 15 |
| 40 | 15 |
| 50 | 15 |
| 60 | 15 |
| 75 | 15 |
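
The distribution above is uneven across dilution levels, which a short tally makes explicit. The snippet below only sums the counts listed in the table; the dictionary name and the grouping of the 2–20 % levels are choices made for illustration.

```python
# Sample counts per dilution ratio (%), copied from the table above.
samples_per_dilution = {
    0: 52, 2: 8, 3: 8, 4: 8, 5: 8, 10: 8, 20: 8,
    25: 20, 30: 15, 40: 15, 50: 15, 60: 15, 75: 15,
}

total = sum(samples_per_dilution.values())
undiluted_share = samples_per_dilution[0] / total
# Samples in the low-dilution range (above 0 % and up to 20 %).
low_dilution = sum(n for ratio, n in samples_per_dilution.items() if 0 < ratio <= 20)

print(f"total samples: {total}")
print(f"undiluted share: {undiluted_share:.1%}")
print(f"samples at 2-20 % dilution: {low_dilution}")
```
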

Statistical characteristics of the reference parameters:

| Parameter | Variable | Min | Max | Mean | Std. Dev. |
|---|---|---|---|---|---|
| Fat (g/100 mL) | FATNESS | 0.16 | 3.40 | 1.85 | 0.95 |
| Protein (g/100 mL) | PROTEIN | 0.79 | 3.15 | 2.11 | 0.77 |
| Solids-Not-Fat (g/100 mL) | SNF | 1.86 | 8.34 | 5.52 | 2.09 |
| Density (g/mL) | DENSITY | 1.0048 | 1.0287 | 1.0180 | 0.0075 |
| Freezing Point (°C) | FREEZING | −0.549 | −0.053 | −0.341 | 0.159 |
| Added Water Amount (%) | ADDED WATER | 1.22 | 90.30 | 38.59 | 28.62 |

Sensor channels and their wavelengths:

| Variable | Wavelength (nm) |
|---|---|
| asUV_0 | 410 |
| asUV_1 | 435 |
| asUV_2 | 460 |
| asUV_3 | 485 |
| asUV_4 | 510 |
| asUV_5 | 535 |
| asVIS_0 | 560 |
| asVIS_1 | 585 |
| asVIS_2 | 610 |
| asVIS_3 | 645 |
| asVIS_4 | 680 |
| asVIS_5 | 705 |
| asIR_0 | 730 |
| asIR_1 | 760 |
| asIR_2 | 810 |
| asIR_3 | 860 |
| asIR_4 | 900 |
| asIR_5 | 940 |
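
The result tables refer to three feature sets (RAW, DERIVED, FUSION). A minimal sketch of how such sets could be assembled from the 18 channels is given below. The channel names and wavelengths are copied from the table above; everything else is an assumption. In particular, the source does not define the DERIVED features, so the sketch substitutes a simple stand-in (each band divided by the summed intensity of all bands) and treats FUSION as the concatenation of RAW and DERIVED. The function name `build_feature_sets` is hypothetical.

```python
import numpy as np

# Channel name -> wavelength (nm), copied from the table above.
BANDS = {
    "asUV_0": 410, "asUV_1": 435, "asUV_2": 460,
    "asUV_3": 485, "asUV_4": 510, "asUV_5": 535,
    "asVIS_0": 560, "asVIS_1": 585, "asVIS_2": 610,
    "asVIS_3": 645, "asVIS_4": 680, "asVIS_5": 705,
    "asIR_0": 730, "asIR_1": 760, "asIR_2": 810,
    "asIR_3": 860, "asIR_4": 900, "asIR_5": 940,
}


def build_feature_sets(raw):
    """Return RAW, a stand-in DERIVED set, and their FUSION.

    ASSUMPTION: DERIVED is illustrated as band / sum of all bands.
    The source does not state how its derived features are computed.
    """
    raw = np.asarray(raw, dtype=float)
    total = raw.sum(axis=1, keepdims=True)
    # Guard against division by zero for an all-dark reading.
    derived = raw / np.where(total == 0.0, 1.0, total)
    return {"RAW": raw, "DERIVED": derived, "FUSION": np.hstack([raw, derived])}


# Hypothetical readings: 4 samples x 18 channels.
rng = np.random.default_rng(0)
readings = rng.uniform(100.0, 1000.0, size=(4, len(BANDS)))
feature_sets = build_feature_sets(readings)
for name, matrix in feature_sets.items():
    print(name, matrix.shape)
```
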

Validation, outlier-detection, model, and explainability settings:

| Component | Parameters |
|---|---|
| 5-Fold | n_splits = 5, shuffle = True, random_state = 42 |
| Isolation Forest | n_estimators = 1000, contamination ∈ {0.00, 0.03, 0.05, 0.07, 0.10}, random_state = n_jobs = −1 |
| Random Forest | n_estimators = 1200, random_state = 42, n_jobs = −1 |
| Gradient Boosting | random_state = 42 |
| AdaBoost | n_estimators = 400, learning_rate = 0.05, random_state = 42 |
| KNN | n_neighbors = 7, weights = distance |
| XGBoost | objective = reg:squarederror, n_estimators = 1800, reg_lambda = 1.0, random_state = 42, n_jobs = −1; additionally, max_depth and learning_rate were selected on a small grid |
| Permutation Importance | n_repeats = 30, scoring = neg_root_mean_squared_error, random_state = 42 |
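
The outlier-filtering and permutation-importance settings listed above might be wired together as in the following sketch. It is an illustration under stated assumptions, not the authors' code: the data are synthetic, the helpers `filter_outliers` and `rank_bands` are hypothetical, importance is computed on a held-out split (the source does not say which data were used), and the Isolation Forest seed is set to 42 because the value is not legible in the table. A contamination of 0.00 is treated as "no filtering", since scikit-learn's `IsolationForest` only accepts values in (0, 0.5].

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, IsolationForest
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def filter_outliers(X, y, contamination=0.05, seed=42):
    """Drop rows that Isolation Forest labels as outliers (-1)."""
    if contamination == 0:
        # Assumption: 0.00 in the grid means "skip outlier removal".
        return X, y
    forest = IsolationForest(
        n_estimators=1000, contamination=contamination,
        random_state=seed, n_jobs=-1,
    )
    keep = forest.fit_predict(X) == 1
    return X[keep], y[keep]


def rank_bands(model, X, y, names, seed=42):
    """Rank features by the mean RMSE increase when each one is shuffled."""
    result = permutation_importance(
        model, X, y, n_repeats=30,
        scoring="neg_root_mean_squared_error", random_state=seed,
    )
    order = np.argsort(result.importances_mean)[::-1]
    return [(names[i], float(result.importances_mean[i])) for i in order]


# Synthetic data in which only column 4 (asUV_4) drives the target.
names = [f"{group}_{i}" for group in ("asUV", "asVIS", "asIR") for i in range(6)]
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 18))
y = 5.0 * X[:, 4] + rng.normal(0.0, 0.1, size=120)

X_clean, y_clean = filter_outliers(X, y, contamination=0.05)
X_train, X_test, y_train, y_test = train_test_split(
    X_clean, y_clean, test_size=0.25, random_state=42
)
model = GradientBoostingRegressor(random_state=42).fit(X_train, y_train)
ranking = rank_bands(model, X_test, y_test, names)
print(ranking[:3])
```

Repeating `rank_bands` once per target variable, with a model fitted for that target, gives one band ranking per quality parameter.
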

Added water prediction results by feature set and model:

| Feature Set | Model | R² | MAPE (%) | MAE | RMSE |
|---|---|---|---|---|---|
| RAW | XGB | 0.892 | 57.200 | 6.143 | 8.783 |
| RAW | GB | 0.891 | 58.758 | 6.316 | 8.824 |
| RAW | RF | 0.855 | 69.159 | 7.539 | 10.158 |
| FUSION | GB | 0.853 | 65.800 | 7.500 | 10.105 |
| FUSION | XGB | 0.849 | 69.634 | 7.544 | 10.246 |
| RAW | ADA | 0.835 | 90.442 | 9.357 | 10.858 |
| DERIVED | GB | 0.814 | 88.430 | 8.915 | 11.797 |
| DERIVED | XGB | 0.813 | 84.562 | 8.825 | 11.841 |
| FUSION | ADA | 0.804 | 90.255 | 9.761 | 11.682 |
| FUSION | RF | 0.791 | 88.403 | 9.064 | 12.046 |
| DERIVED | RF | 0.777 | 97.583 | 9.776 | 12.921 |
| DERIVED | ADA | 0.749 | 119.782 | 11.704 | 13.715 |
| RAW | KNN | 0.715 | 111.443 | 11.132 | 14.255 |
| FUSION | KNN | 0.706 | 118.002 | 11.585 | 14.293 |
| DERIVED | KNN | 0.670 | 129.379 | 12.443 | 15.729 |
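
In the table above, the best model pairs a high R² (0.892) with a MAPE above 50 %. One plausible contributor, offered here as an assumption rather than a finding of the source, is that MAPE divides every error by the true value, so samples with little added water dominate the average. The sketch below computes the four reported metrics on made-up values; `regression_report` is a hypothetical helper and the numbers are not from the study.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)


def regression_report(y_true, y_pred):
    """Return the four metrics used in the result tables."""
    return {
        "R2": r2_score(y_true, y_pred),
        "MAPE (%)": 100.0 * mean_absolute_percentage_error(y_true, y_pred),
        "MAE": mean_absolute_error(y_true, y_pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
    }


# Made-up added-water values (%) with the same 4-point error on every sample.
y_true = np.array([2.0, 5.0, 10.0, 25.0, 50.0, 75.0, 90.0])
y_pred = y_true + 4.0

report = regression_report(y_true, y_pred)
for metric, value in report.items():
    print(f"{metric}: {value:.3f}")
```

With a constant 4-point error, MAE and RMSE both equal 4 and R² stays above 0.98, yet MAPE is about 50 % because the 2 % and 5 % samples contribute relative errors of 200 % and 80 %.
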

Density prediction results by feature set and model:

| Feature Set | Model | R² | MAPE (%) | MAE | RMSE |
|---|---|---|---|---|---|
| RAW | GB | 0.877 | 0.171 | 0.002 | 0.002 |
| RAW | XGB | 0.868 | 0.177 | 0.002 | 0.003 |
| FUSION | GB | 0.853 | 0.198 | 0.002 | 0.003 |
| FUSION | XGB | 0.836 | 0.207 | 0.002 | 0.003 |
| RAW | RF | 0.834 | 0.206 | 0.002 | 0.003 |
| DERIVED | GB | 0.809 | 0.234 | 0.002 | 0.003 |
| RAW | ADA | 0.807 | 0.259 | 0.003 | 0.003 |
| DERIVED | XGB | 0.787 | 0.239 | 0.002 | 0.003 |
| FUSION | RF | 0.780 | 0.245 | 0.002 | 0.003 |
| FUSION | ADA | 0.773 | 0.274 | 0.003 | 0.003 |
| DERIVED | RF | 0.756 | 0.266 | 0.003 | 0.004 |
| DERIVED | ADA | 0.722 | 0.325 | 0.003 | 0.004 |
| RAW | KNN | 0.680 | 0.302 | 0.003 | 0.004 |
| FUSION | KNN | 0.669 | 0.318 | 0.003 | 0.004 |
| DERIVED | KNN | 0.656 | 0.332 | 0.003 | 0.004 |

Fat prediction results by feature set and model:

| Feature Set | Model | R² | MAPE (%) | MAE | RMSE |
|---|---|---|---|---|---|
| RAW | XGB | 0.922 | 12.378 | 0.189 | 0.261 |
| RAW | GB | 0.907 | 13.277 | 0.207 | 0.285 |
| FUSION | XGB | 0.895 | 15.698 | 0.213 | 0.302 |
| FUSION | GB | 0.894 | 15.921 | 0.220 | 0.305 |
| RAW | RF | 0.891 | 16.718 | 0.231 | 0.309 |
| RAW | ADA | 0.859 | 22.975 | 0.297 | 0.350 |
| FUSION | ADA | 0.858 | 22.295 | 0.296 | 0.353 |
| FUSION | RF | 0.857 | 20.244 | 0.270 | 0.353 |
| DERIVED | XGB | 0.857 | 18.369 | 0.249 | 0.353 |
| DERIVED | GB | 0.848 | 18.994 | 0.258 | 0.363 |
| DERIVED | RF | 0.813 | 21.814 | 0.295 | 0.403 |
| RAW | KNN | 0.773 | 23.971 | 0.339 | 0.445 |
| DERIVED | ADA | 0.772 | 27.560 | 0.369 | 0.445 |
| FUSION | KNN | 0.752 | 26.050 | 0.368 | 0.466 |
| DERIVED | KNN | 0.709 | 28.583 | 0.401 | 0.503 |

Freezing point prediction results by feature set and model:

| Feature Set | Model | R² | MAPE (%) | MAE | RMSE |
|---|---|---|---|---|---|
| RAW | XGB | 0.900 | 11.520 | 0.033 | 0.047 |
| RAW | GB | 0.892 | 11.878 | 0.035 | 0.049 |
| FUSION | GB | 0.857 | 15.198 | 0.041 | 0.055 |
| RAW | RF | 0.856 | 15.145 | 0.042 | 0.056 |
| FUSION | XGB | 0.849 | 15.331 | 0.042 | 0.057 |
| RAW | ADA | 0.840 | 18.738 | 0.051 | 0.059 |
| FUSION | ADA | 0.795 | 21.670 | 0.056 | 0.066 |
| FUSION | RF | 0.794 | 18.778 | 0.050 | 0.067 |
| DERIVED | GB | 0.793 | 17.773 | 0.049 | 0.067 |
| DERIVED | XGB | 0.793 | 18.616 | 0.050 | 0.067 |
| DERIVED | RF | 0.746 | 22.180 | 0.057 | 0.074 |
| RAW | KNN | 0.715 | 22.492 | 0.062 | 0.079 |
| DERIVED | ADA | 0.709 | 26.484 | 0.068 | 0.079 |
| FUSION | KNN | 0.707 | 23.495 | 0.064 | 0.079 |
| DERIVED | KNN | 0.648 | 25.887 | 0.069 | 0.087 |

Protein prediction results by feature set and model:

| Feature Set | Model | R² | MAPE (%) | MAE | RMSE |
|---|---|---|---|---|---|
| RAW | GB | 0.888 | 8.546 | 0.170 | 0.240 |
| RAW | XGB | 0.885 | 8.345 | 0.167 | 0.243 |
| FUSION | GB | 0.861 | 9.970 | 0.197 | 0.265 |
| RAW | RF | 0.848 | 10.513 | 0.207 | 0.280 |
| FUSION | XGB | 0.848 | 10.392 | 0.203 | 0.277 |
| RAW | ADA | 0.825 | 13.167 | 0.257 | 0.301 |
| DERIVED | XGB | 0.796 | 12.673 | 0.244 | 0.326 |
| DERIVED | GB | 0.795 | 12.654 | 0.245 | 0.327 |
| FUSION | ADA | 0.783 | 14.455 | 0.273 | 0.330 |
| FUSION | RF | 0.779 | 12.894 | 0.251 | 0.333 |
| DERIVED | RF | 0.752 | 14.429 | 0.274 | 0.360 |
| DERIVED | ADA | 0.718 | 17.316 | 0.328 | 0.383 |
| RAW | KNN | 0.699 | 15.816 | 0.307 | 0.394 |
| FUSION | KNN | 0.687 | 16.483 | 0.320 | 0.396 |
| DERIVED | KNN | 0.631 | 18.414 | 0.348 | 0.438 |

Solids-not-fat (SNF) prediction results by feature set and model:

| Feature Set | Model | R² | MAPE (%) | MAE | RMSE |
|---|---|---|---|---|---|
| RAW | GB | 0.883 | 9.494 | 0.482 | 0.668 |
| RAW | XGB | 0.878 | 9.323 | 0.472 | 0.682 |
| FUSION | GB | 0.855 | 10.866 | 0.545 | 0.738 |
| FUSION | XGB | 0.845 | 11.190 | 0.561 | 0.765 |
| RAW | RF | 0.841 | 11.635 | 0.583 | 0.780 |
| RAW | ADA | 0.820 | 14.297 | 0.713 | 0.829 |
| DERIVED | GB | 0.787 | 13.410 | 0.663 | 0.896 |
| DERIVED | XGB | 0.785 | 13.587 | 0.670 | 0.901 |
| FUSION | RF | 0.777 | 13.779 | 0.691 | 0.916 |
| FUSION | ADA | 0.775 | 15.653 | 0.757 | 0.920 |
| DERIVED | RF | 0.733 | 15.978 | 0.768 | 1.004 |
| DERIVED | ADA | 0.697 | 18.954 | 0.920 | 1.068 |
| RAW | KNN | 0.691 | 17.399 | 0.858 | 1.086 |
| FUSION | KNN | 0.687 | 17.569 | 0.875 | 1.086 |
| DERIVED | KNN | 0.628 | 19.096 | 0.935 | 1.185 |

Comparison of the proposed method with related studies:

| Ref. | Sensor | Wavelength (nm) | XAI | Number of Samples | Performance Metric |
|---|---|---|---|---|---|
| [14] | AS7265x | 610/680/730/760/810/860 | Wavelength selection | 100 | Protein R² = 0.933; Fat R² = 0.997 |
| [8] | AS7265x | 410–940 | None | 600+ | Fat MAPE = 0.14; Protein MAPE = 0.07 |
| [15] | AS7263 | 610/680/730/760/810/860 | None | 60 | Protein R² = 0.8677; Fat R² = 0.9713 |
| [12] | SCiO | 740–1070; 1350–2550 | None | 45 | Fat R² = 0.969; Protein R² = 0.917; Carbohydrate R² = 0.883 |
| [13] | On-farm NIR sensor | 960–1690 | Spectral band selection | 1165 | Fat R² = 0.98; Protein R² = 0.94; Lactose R² = 0.84 |
| Proposed Method | AS7265x | 410–940 | Wavelength and feature selection | 190 | Added Water R² = 0.892; Density R² = 0.877; Fat R² = 0.922; Freezing Point R² = 0.900; Protein R² = 0.888; SNF R² = 0.883 |

Best-performing model for each feature set and target:

| Feature Set | Target | Model | R² | MAPE (%) | MAE | RMSE |
|---|---|---|---|---|---|---|
| RAW | FATNESS | GB | 0.915 | 14.909 | 0.198 | 0.276 |
| FUSION | FATNESS | GB | 0.899 | 17.019 | 0.211 | 0.300 |
| RAW | FREEZING | XGB | 0.872 | 14.447 | 0.039 | 0.057 |
| RAW | PROTEIN | XGB | 0.868 | 9.977 | 0.194 | 0.278 |
| RAW | ADDED WATER | GB | 0.863 | 75.507 | 7.835 | 10.551 |
| DERIVED | FATNESS | XGB | 0.851 | 21.796 | 0.257 | 0.365 |
| RAW | SNF | GB | 0.848 | 12.276 | 0.585 | 0.812 |
| DERIVED | ADDED WATER | GB | 0.844 | 92.153 | 8.514 | 11.290 |
| RAW | DENSITY | XGB | 0.840 | 0.201 | 0.002 | 0.003 |
| DERIVED | SNF | GB | 0.839 | 13.225 | 0.618 | 0.837 |
| FUSION | ADDED WATER | XGB | 0.838 | 96.876 | 8.316 | 11.490 |
| DERIVED | FREEZING | GB | 0.838 | 19.878 | 0.047 | 0.064 |
| DERIVED | PROTEIN | GB | 0.837 | 12.402 | 0.226 | 0.308 |
| FUSION | FREEZING | XGB | 0.834 | 17.776 | 0.045 | 0.065 |
| FUSION | PROTEIN | XGB | 0.832 | 11.612 | 0.219 | 0.312 |
| FUSION | SNF | XGB | 0.826 | 12.867 | 0.619 | 0.869 |
| DERIVED | DENSITY | GB | 0.818 | 0.228 | 0.002 | 0.003 |
| FUSION | DENSITY | XGB | 0.807 | 0.228 | 0.002 | 0.003 |
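
The gap between the two sets of results can be quantified directly from the tabulated R² values. The sketch below assumes, based on the section order and not on an explicit statement in the tables, that the per-target tables report LOO results and the consolidated table above reports 5-fold results. It takes the best R² per target from each and computes the difference; the dictionary names are chosen for illustration.

```python
# Best R2 per target from the per-target tables (assumed to be LOO).
r2_loo = {
    "FATNESS": 0.922, "FREEZING": 0.900, "ADDED WATER": 0.892,
    "PROTEIN": 0.888, "SNF": 0.883, "DENSITY": 0.877,
}
# Best R2 per target from the consolidated table (assumed to be 5-fold).
r2_5fold = {
    "FATNESS": 0.915, "FREEZING": 0.872, "ADDED WATER": 0.863,
    "PROTEIN": 0.868, "SNF": 0.848, "DENSITY": 0.840,
}

# Positive values mean the assumed LOO score is higher.
gap = {target: r2_loo[target] - r2_5fold[target] for target in r2_loo}
mean_gap = sum(gap.values()) / len(gap)
largest = max(gap, key=gap.get)
smallest = min(gap, key=gap.get)

for target, value in sorted(gap.items(), key=lambda item: item[1], reverse=True):
    print(f"{target:12s} {value:.3f}")
print(f"mean gap: {mean_gap:.3f}  largest: {largest}  smallest: {smallest}")
```

Under this reading, every target scores slightly lower in the consolidated table, with fat the most stable and density the least.
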
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Seçkin, A.Ç.; Ekici, M.; Akcan, T.; Soygazi, F.; Gürsoy Demir, H. XAI-Supported Electronic Tongue for Estimating Milk Composition and Adulteration Indicators. Biosensors 2026, 16, 245. https://doi.org/10.3390/bios16050245

