Article

Exploring the Use of Spectral Technologies in Ovine Milk Analysis: A Preliminary Study

by Aikaterini-Artemis Agiomavriti 1,2, Olympiada Saharidi 1, Aikaterini Vasilaki 1, Stavroula Koulouvakou 1, Efstratios Nikolaou 3, Theodora Papadimitriou 3, Thomas Bartzanas 4, Nikos Chorianopoulos 5 and Athanasios I. Gelasakis 1,*

1 Laboratory of Anatomy and Physiology of Farm Animals, Department of Animal Science, School of Animal Biosciences, Agricultural University of Athens, Iera Odos 45 str., 11855 Athens, Greece
2 R&D Department, TCB Avgidis Automations S.A., 11744 Athens, Greece
3 ELGO Demeter, General Directorate of Quality Assurance and Competitiveness of Agricultural Products, Directorate of Milk and Meat Inspections Management, Milk Quality Control Laboratory, 58100 Giannitsa, Greece
4 Laboratory of Farm Structures, Department of Natural Resources Management & Agricultural Engineering, Agricultural University of Athens, Iera Odos 45 str., 11855 Athens, Greece
5 Laboratory of Microbiology and Biotechnology of Food, Department of Food Science and Human Nutrition, Agricultural University of Athens, Iera Odos 45 str., 11855 Athens, Greece
* Author to whom correspondence should be addressed.
Spectrosc. J. 2026, 4(1), 2; https://doi.org/10.3390/spectroscj4010002
Submission received: 15 November 2025 / Revised: 26 January 2026 / Accepted: 28 January 2026 / Published: 30 January 2026

Abstract

The purpose of this study was to examine the use of portable spectroscopy technologies for the rapid assessment of composition and hygienic quality in ovine milk. Two portable analyzers, namely SmartAnalysis (UV/Vis absorbance) and SpectraPod (NIR transmittance), were used to obtain spectral data from raw milk samples. Additionally, reference values of the milk’s compositional, physical, and hygienic traits were measured. Machine learning algorithms were used to explore the correlations between spectral data and milk traits. The initial results indicated that spectral technologies have promising potential for predicting milk quality and hygienic parameters. Regression models showed moderate predictive accuracy, with the highest R² values being 0.55 for fat (RF-NIR) and 0.34 for protein (LR-UV/Vis). Classification models indicated high accuracy for hygienic parameters, with accuracy and AUC values of up to 0.87 and 0.83, respectively, for predicting increased levels of total bacterial count (TBC), whereas somatic cell count (SCC) level was predicted less accurately, with AUC values lower than 0.70. The results demonstrate the potential applicability of portable UV/Vis and NIR devices for the rapid assessment of milk quality, including composition and hygiene parameters, at the point of service.

1. Introduction

Ovine milk is a product of high nutritional and economic value. In Mediterranean countries, it is particularly important as a raw material for several protected designation-of-origin (PDO) cheeses. Its chemical composition directly influences the intrinsic and extrinsic quality of dairy products. Moreover, parameters such as somatic cell count (SCC) and total bacterial count (TBC) are critical indicators of udder health. These parameters not only ensure milk safety but also determine its technological properties.
Traditional laboratory methods for milk analyses provide reliable and accurate data but are often time-consuming and logistically demanding. They also require specialized personnel to prepare samples and perform analyses, making their in situ, real-time application under field conditions unfeasible. Recent advances in portable and compact instrumentation, together with innovations in chemometric modeling, have introduced promising analytical tools for the rapid evaluation of milk composition and hygiene at the point of service (e.g., farms, milk processing plants). Over the last two decades, spectroscopic techniques have been extensively investigated for milk analysis. Currently, near-infrared (NIR) [1,2,3,4] and ultraviolet/visible (UV/Vis) [5,6,7] spectroscopy technologies are being extensively studied, and their performance has been evaluated in bovine milk with promising results. Similarly, robust predictive performance for fat, protein, and lactose has been demonstrated in caprine milk using NIR spectroscopy [8,9], while portable devices have also shown viability in field settings [10,11]. However, applications in ovine milk remain limited, and relevant data are scarce. The potential use of portable spectrometers in the 850–1700 nm range, including chip-based NIR sensors, for milk analysis has been further supported by studies assessing these instruments in milk quality applications [12,13,14,15].
In addition to NIR and UV/Vis spectroscopy, other methods have been assessed for milk analysis. Mid-infrared (MIR) spectroscopy, and specifically Fourier-transform infrared (FTIR) spectroscopy, is currently the reference method for milk composition analysis in both laboratory and industrial environments, owing to its strong sensitivity to the fundamental vibrational modes of the molecules in milk [16,17]. Laser-induced breakdown spectroscopy (LIBS) has been used to analyze the elemental composition of milk and dairy products [18,19], and fluorescence techniques have been used to assess protein and microbial-related changes in milk [20]. Although these techniques have strong analytical potential, they often require laboratory-based instrumentation and handling, which makes them unsuitable for in situ measurements on livestock farms. Spectroscopy has also been applied to the identification of udder health biomarkers. Certain parameters of the dielectric spectrum, which represent physicochemical changes induced by bacterial growth, have been shown to correlate with the TBC of raw goat milk [21], whereas studies on infrared spectroscopy for the prediction of SCC have reported conflicting results [22,23,24]. The analytical potential of spectroscopy in milk diagnostics has been further extended by combining it with advanced computational methods.
The use of supervised machine learning (ML) techniques such as Random Forest (RF), Support Vector Machines (SVM), and Gradient Boosting (GB) has improved the extraction of relevant information from complex spectra, often outperforming traditional linear regression techniques [25]. In particular, portable spectroscopy technologies combined with ML algorithms enable the development of diagnostic applications, especially for ovine milk quality traits, including SCC and TBC [26].
The objective of this preliminary study was to determine the capability of portable UV/Vis and NIR spectroscopy tools to predict ovine milk composition and hygiene at the point of care, by integrating multiple ML algorithms with reference laboratory measurements. The study focuses on the feasibility of this approach as an initial step toward integrating spectroscopy-based milk monitoring into precision dairy sheep farming systems.

2. Materials and Methods

2.1. Sampling and Reference Analyses

For this study, milk samples were collected from 212 Lesvos and 186 Chios ewes. Sampling took place in June, during the morning milking; milk was collected manually via volumetric bottle milk meters directly into attached 100 mL sampling vials for the estimation of chemical composition, SCC, and TBC. Samples were immediately placed in refrigerated transport boxes, transferred to the laboratory, and analyzed within 24 h of collection. Each sample was divided, with 40 mL placed into a separate bottle containing sodium azide for shipment to an external certified laboratory (ELGO-Demeter) for reference measurement of SCC and TBC using BactoScan (FOSS, Hillerød, Denmark). Prior to chemical analysis (fat, protein, and lactose contents), refrigerated samples were heated in a water bath to 37–40 °C, following the manufacturers’ guidelines for Milkoscan (FOSS, Denmark) and Lactoscan (Milkotronic, Nova Zagora, Bulgaria). A total of 275 samples were analyzed with Milkoscan and 333 with Lactoscan. Samples were then measured with the spectroscopic tools at room temperature. The “gold-standard” methods used for determining chemical composition were a Fourier-transform infrared (FTIR) analyzer (Milkoscan) and an ultrasonic analyzer (Lactoscan).

2.2. Spectral Acquisition

Spectral data were collected from raw milk samples, without pretreatment or dilution. Additionally, no spectral preprocessing, such as smoothing or normalization, was applied, in order to evaluate the suitability of portable spectroscopic devices for use at the point-of-care without preprocessing demands. Measurements were obtained using two portable spectroscopy-based analyzers:
  • SmartAnalysis (DNAPhone, Parma, Italy) covering the ultraviolet-visible range (330–800 nm), as presented in Figure 1a;
  • SpectraPod (MantiSpectra, Eindhoven, The Netherlands) operating in the NIR range (850–1700 nm), as illustrated in Figure 1b.
Transmittance values of NIR spectra were recorded using 16 discrete channels between 850 and 1700 nm (Figure 2). Ultraviolet/visible absorbance spectra were recorded from 330 to 800 nm at 1 nm intervals. Prior to modeling, spectral regions were evaluated based on between-sample variability to avoid including wavelength ranges with limited discriminative information. The regions 330–450 nm and 700–800 nm exhibited substantially higher absorbance variability, while the intermediate region (450–700 nm) showed minimal variation and near-flat spectral behavior (Figure 3). Therefore, only the wavelength ranges 330–450 nm and 700–800 nm were retained for subsequent machine learning analysis, following a data-driven, variance-based feature selection approach: these regions captured the largest between-sample variation in absorbance values and thus provided the most informative spectral features for describing milk quality and hygiene traits.
Principal Component Analysis (PCA) was subsequently applied to both the UV/Vis and NIR datasets, as an exploratory, unsupervised method to summarize spectral variations, identify clustering trends, and facilitate efficient visualization of spectral variability among samples. Five principal components for the UV/Vis dataset and ten for the NIR dataset were included in the modeling process.
To support the selection of the UV/Vis wavelength ranges, wavelength-wise absorbance variability was quantified by calculating the standard deviation across all samples at each wavelength. This showed that the variability between samples is much larger in the 330–450 nm and 700–800 nm regions than in the intermediate spectral region, which showed uniform absorbance patterns. Variance spectrum analysis (Figure 4) confirmed that the relevant information is contained in the selected regions of the UV/Vis spectrum.
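The wavelength-wise variability check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the toy absorbance values, and the cut-off `min_std` are all hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import pstdev


def wavelength_std(spectra):
    """Standard deviation across samples at each wavelength position.

    `spectra` is a list of rows, one row of absorbance values per sample.
    Population standard deviation is an assumption made for this sketch.
    """
    # zip(*spectra) turns rows (samples) into columns (wavelengths)
    return [pstdev(column) for column in zip(*spectra)]


def select_wavelengths(wavelengths, spectra, min_std):
    """Keep the wavelengths whose between-sample variability reaches min_std."""
    stds = wavelength_std(spectra)
    return [w for w, s in zip(wavelengths, stds) if s >= min_std]


# Invented toy data: 3 samples measured at 4 wavelengths (nm)
wavelengths = [330, 450, 600, 800]
spectra = [
    [1.0, 0.8, 0.50, 0.2],
    [1.6, 1.0, 0.51, 0.6],
    [2.2, 1.2, 0.49, 1.0],
]

# The near-flat 600 nm column falls below the hypothetical cut-off
selected = select_wavelengths(wavelengths, spectra, min_std=0.1)
print(selected)
```

In this toy example the 600 nm column varies very little between samples and is dropped, which mirrors the exclusion of the near-flat 450–700 nm region.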
Negative control tests were carried out using tap water and an empty cuvette to verify that the recorded spectral signals reflected chemically meaningful information. Water spectra were acquired under the same optical configuration and served only for qualitative comparison of spectral structure and variability. As shown in Figure 5 and Figure 6, the water spectra were smooth and exhibited markedly lower absorbance intensity and variability. In contrast, the milk spectra displayed significant differences between samples and structured absorption patterns. These observations confirmed that the spectral diversity exploited by the ML models originates from milk constituents rather than baseline noise or measurement artifacts.

2.3. Dataset and Data Analysis

A unified database was prepared to include spectral data, reference compositional traits (protein, fat, lactose, solid-non-fat, salts, free fatty acids, urea, glucose, and casein), physical milk traits (density, pH, conductivity, and freezing point), hygienic indicators (SCC and TBC), and the derived milk quality index (MQI, defined as the sum of protein and fat contents), along with other physiological and production traits recorded during sampling, such as breed, age, and stage of lactation. The relationships between spectral data, the reference traits, and hygienic indicators were investigated using supervised ML algorithms, including Random Forest (RF), Support Vector Machines (SVM; applied as Support Vector Regression, SVR, in regression tasks), Gradient Boosting (GB), and Linear Regression (LR). Both spectral datasets were used to train the models. Regression analyses were applied to predict compositional traits (protein, fat, and lactose contents), whereas binary classification models were employed for SCC, TBC, and MQI. Model performance was evaluated using R² and RMSE for regression, and accuracy, F1 score, and area under the receiver operating characteristic curve (ROC AUC) for classification.
For the binary classification, typical threshold values from the literature were employed to derive categorical variables from the reference dataset [27,28]. In particular, MQI was defined as the sum of protein and fat, with the threshold for the binary classification being set at 12%, while for SCC, two thresholds were considered: 400 × 10³ cells/mL to distinguish normal from increased values, and 10⁶ cells/mL indicating subclinical mastitis. Finally, for TBC, the threshold was set at 2 × 10⁴ CFU (Table 1).
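The thresholding step can be illustrated with a short Python sketch. The threshold values are those stated above; everything else is an assumption made for illustration: the function and key names are hypothetical, the mapping of the labels SCC1 and SCC2 to the lower and upper SCC thresholds is assumed, and the text does not say whether values exactly at a threshold fall in the upper or lower class.

```python
# Thresholds as stated in the text
MQI_THRESHOLD = 12.0      # % (protein + fat)
SCC_ELEVATED = 400e3      # cells/mL, normal vs. increased
SCC_SUBCLINICAL = 1e6     # cells/mL, subclinical mastitis
TBC_THRESHOLD = 2e4       # CFU


def milk_quality_index(protein_pct, fat_pct):
    """MQI, defined in the text as the sum of protein and fat contents."""
    return protein_pct + fat_pct


def binary_labels(protein_pct, fat_pct, scc, tbc):
    """Derive 0/1 class labels from continuous reference values.

    Assumption: a value equal to the threshold is assigned to the upper
    class for MQI and to the lower class for SCC and TBC.
    """
    mqi = milk_quality_index(protein_pct, fat_pct)
    return {
        "MQI_high": int(mqi >= MQI_THRESHOLD),
        "SCC1_increased": int(scc > SCC_ELEVATED),
        "SCC2_subclinical": int(scc > SCC_SUBCLINICAL),
        "TBC_increased": int(tbc > TBC_THRESHOLD),
    }


# Invented example sample: 5.5% protein, 7.0% fat
example = binary_labels(protein_pct=5.5, fat_pct=7.0, scc=500_000, tbc=15_000)
print(example)
```

Samples whose reference values lie close to a threshold can switch class with small measurement changes, which is the label ambiguity the Discussion refers to.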

2.4. Machine Learning Modeling and Validation

For generating models of associations between spectral variables and milk composition and quality, supervised machine learning approaches were used. Both the UV/Vis and NIR spectral datasets were used as inputs to the models, in their principal component analysis forms accompanied by other physiological and production traits. No additional spectral preprocessing (e.g., normalization or outlier removal) was applied, in order to preserve the original signal characteristics and reflect realistic field-measurement conditions using portable devices.
For the regression tasks, RF, SVR with a radial basis function (RBF) kernel, GB, and LR models were used to predict values for protein, fat, and lactose content. Classification models for SCC, TBC, and MQI were built using the same algorithms. Hyperparameter tuning was not systematically performed; instead, commonly used configurations were adopted, and for SVR models, the RBF kernel was retained based on preliminary comparisons indicating superior performance relative to linear alternatives.
Because of the limited dataset size, training and evaluation of the models were performed using k-fold cross-validation. A 4-fold cross-validation (75% training, 25% validation) was used for regression tasks to provide sufficiently large validation sets for the stable estimation of continuous metrics, while stratified 10-fold cross-validation (90% training, 10% validation) was applied for the classification tasks to ensure adequate representation of minority-class samples in each fold, mitigating the impact of severe class imbalance. In both tasks, no independent test set was used due to sample size constraints.
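To show what stratification does, the following pure-Python sketch assigns samples to folds so that each fold keeps roughly the same class proportions. It is a simplified stand-in written for illustration only; the study does not state which implementation was used, the function names are hypothetical, and no shuffling is performed here, which a real analysis would normally add.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds, class by class, in round-robin order."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Dealing each class out in turn keeps its share similar in every fold
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_validation_splits(folds):
    """Yield (train, validation) index lists, holding out one fold at a time."""
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Invented imbalanced toy labels: 20 positives, 80 negatives
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)

for train, validation in train_validation_splits(folds):
    positives = sum(labels[i] for i in validation)
    print(len(train), len(validation), positives)
```

With these toy labels every validation fold contains 10 samples, 2 of which are positive, so the 90%/10% split and the 20% minority share are preserved in each of the 10 iterations.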
The performance of regression models was evaluated using the coefficient of determination (R²) and the root mean square error (RMSE), calculated separately for training and validation folds. Classification performance was assessed using accuracy, F1-score, the area under the receiver operating characteristic curve (ROC-AUC), the Matthews correlation coefficient (MCC), and Cohen’s kappa. Greater emphasis was placed on F1-score, ROC-AUC, MCC, and kappa to ensure robust evaluation under class-imbalanced conditions and to quantify agreement beyond chance. All experiments were conducted using Python 3.8.16 in the Spyder integrated development environment (IDE).
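The reason for emphasizing MCC and kappa under class imbalance can be seen from their definitions. The sketch below computes accuracy, F1-score, MCC, and Cohen's kappa from the four cells of a binary confusion matrix using the standard formulas; the function name and the example counts are invented for illustration, and ROC-AUC is omitted because it requires predicted scores rather than counts.

```python
from math import sqrt


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, F1, MCC and Cohen's kappa from binary confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0

    # Agreement expected by chance from the row and column totals
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc, "kappa": kappa}


# Invented imbalanced example: 10 positives among 100 samples,
# with a classifier that labels almost everything as negative
metrics = classification_metrics(tp=1, fp=1, fn=9, tn=89)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this invented case accuracy is 0.90 while MCC is about 0.19 and kappa about 0.14, showing how a high accuracy can coexist with weak agreement beyond chance when one class dominates.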

3. Results

Prior to model evaluation, descriptive statistics of the milk samples were calculated to characterize their compositional and hygienic traits. Table 2 summarizes the means and standard deviations of all parameters, presented overall and stratified by breed and age, thereby providing an overview of the compositional, physical, and hygienic properties of the milk examined. As shown in Table 2, SCC and TBC values exhibit high variability. This is expected because most healthy animals have normal SCCs and TBCs, while a few animals with udder inflammation (i.e., mastitis) may exhibit extremely high values.

3.1. Principal Component Analysis

Principal Component Analysis (PCA) was applied separately to the UV/Vis and NIR spectral datasets in order to reduce collinearity among spectral variables, compress high-dimensional spectral information, and facilitate both visualization and subsequent machine learning modeling. For the UV/Vis dataset, five principal components (PC1: 49.16%, PC2: 4.41%, PC3: 3.56%, PC4: 3.33%, and PC5: 2.62%) were used for model development (Figure 7), while for the NIR dataset, 10 principal components (PC1: 82.42%, PC2: 5.90%, PC3: 5.47%, PC4: 5.30%, PC5: 0.51%, PC6: 0.18%, PC7: 0.06%, PC8: 0.04%, PC9: 0.03%, and PC10: 0.02%) were used (Figure 8). Principal Component Analysis score plots revealed distinct variance structures between the two spectral regions. The number of components was determined empirically, with the aim of achieving a good representation of the variance together with model stability rather than simply maximizing explained variance. This approach was adopted to avoid overfitting and to maintain comparable model complexity across datasets given the limited sample size.
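As a quick arithmetic check on the explained-variance figures reported above, the cumulative share captured by the retained components can be computed directly. The percentages are copied from the text; the variable and function names are hypothetical.

```python
from itertools import accumulate

# Explained variance per principal component (%), as reported in the text
uv_vis_pcs = [49.16, 4.41, 3.56, 3.33, 2.62]
nir_pcs = [82.42, 5.90, 5.47, 5.30, 0.51, 0.18, 0.06, 0.04, 0.03, 0.02]


def cumulative_variance(percentages):
    """Running total of explained variance, rounded to two decimals."""
    return [round(total, 2) for total in accumulate(percentages)]


uv_vis_cumulative = cumulative_variance(uv_vis_pcs)
nir_cumulative = cumulative_variance(nir_pcs)

print("UV/Vis:", uv_vis_cumulative)
print("NIR:   ", nir_cumulative)
```

The five UV/Vis components together account for about 63.08% of the variance, whereas the ten NIR components account for about 99.93%, with the first four alone reaching about 99.09%.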
In the NIR spectra, PC1 accounts for approximately 82% of the total variance, as the use of highly correlated, instrument-specific spectral channels led to most sample differences being described by a single dominant pattern. Because the UV/Vis spectra depend on numerous separate factors, such as various molecules absorbing light and wavelength-dependent scattering, the variability is widely dispersed and the first component accounts for only about 49% of the variance. The variance-weighted PCA score plots visualized these differences: the UV/Vis dataset showed a more progressive distribution of influence (Figure 9), whereas the NIR data showed a limited number of very influential samples (Figure 10). Samples with lower variance-weighted scores grouped closer to the origin, while samples with higher variance-weighted scores were located farther from the center of the score space along high-variance directions. The lack of clear clustering patterns suggests that spectral diversity between samples is continuous rather than class-based.
The spectral regions contributing most to variability in the UV/Vis data were identified through PCA loading analysis. PC1 was characterized by broad, smoothly varying loadings across the spectrum, with the highest contributions observed in the 419–424 nm region (Figure 11). This pattern indicates that PC1 primarily reflects wavelength-independent scattering effects and overall absorbance intensity associated with the milk matrix. Such effects are mainly related to physical properties of milk, including light scattering by fat globules and protein micelles, rather than to distinct chemical absorption bands. In contrast, PC2 exhibited clear wavelength-dependent features, with maximum loadings located near the visible–near-infrared boundary (782–786 nm) (Figure 8). This region is commonly associated with secondary sources of spectral variability and baseline effects. Together, the different loading patterns of PC1 and PC2 suggest that UV/Vis spectral variability arises from a combination of matrix-related optical effects (PC1) and more subtle compositional or structural differences among samples (PC2).
For the NIR dataset, individual wavelength loadings were not interpreted, as the spectra were acquired using discrete, instrument-specific channels rather than continuous wavelength measurements. Consequently, the PCA loadings represent aggregated spectral band contributions rather than specific wavelengths. The PCA in the NIR dataset was therefore used primarily for dimensionality reduction and overall variance characterization, rather than for detailed spectral interpretation.

3.2. Regression Results

Regression analysis using UV/Vis and NIR spectroscopy showed moderate predictive capacity across milk quality traits (Table 3). Among the tested algorithms, RF and LR yielded the most promising results. Specifically, RF achieved very high training R² values (≈0.90), but the corresponding validation values were markedly lower, indicating limited generalization ability. In contrast, LR showed lower training R² values (0.35–0.56) but achieved validation R² values up to 0.49. Fat and protein contents exhibited the highest validation R² values, reaching 0.55 and 0.34, respectively. Lactose, SNF, and pH showed lower validation R² values of approximately 0.3–0.4. Parameters such as conductivity, urea, and glucose contents either displayed very poor predictive ability (training R² below 0.30) or exhibited large deviations between training and validation R² values; their results were therefore not considered further.
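For reference, the two regression metrics reported here follow from the standard definitions, sketched below in plain Python. The function names and the toy fat values are invented for illustration and are not taken from the study's data.

```python
from math import sqrt


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the predicted trait."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Invented toy values (e.g., fat content in %)
reference = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]

toy_r2 = r2_score(reference, predicted)
toy_rmse = rmse(reference, predicted)
print(f"R2 = {toy_r2:.2f}, RMSE = {toy_rmse:.3f}")
```

Computing both metrics separately on the training and validation folds is what exposes the pattern described above, where a training R² near 0.90 alongside a much lower validation R² points to overfitting.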

3.3. Classification Results

In the UV/Vis range, RF achieved the best overall performance among the tested algorithms. The highest mean accuracies were observed for TBC (0.87 ± 0.013) and the MQI (0.75 ± 0.055), with AUC values reaching up to 0.83. Somatic cell count exhibited lower discriminative ability, with mean AUC values below 0.70. This trend was further confirmed by low MCC and kappa values. The performance of SVM and XGBoost was consistent but moderate.
In the NIR range, the best-performing models were RF and XGBoost, which achieved mean accuracies of 0.83 ± 0.067 and 0.81 ± 0.085 for TBC and MQI, respectively. The corresponding AUC values reached 0.82–0.83. The SCC prediction was also less reliable (AUC ≤ 0.66). Reduced MCC and kappa values were again observed for SCC classification, reflecting the impact of class imbalance. The complete binary classification metrics, including mean accuracy, F1-score, AUC, MCC, and kappa values across 10-fold cross-validation, are summarized in Table 4 and Table 5.

4. Discussion

The first results indicate that portable NIR and UV/Vis spectroscopic instruments show potential for identifying significant microbiological and chemical parameters in ovine milk. This study compared the performance of UV/Vis and NIR spectroscopy combined with ML algorithms for predicting key compositional, physical, and hygienic traits. The selected variables were chosen to represent both the nutritional composition and hygienic status of milk, which are highly relevant to animal health monitoring and cheesemaking capacity. This evaluation is particularly relevant for ovine milk, which presents distinct compositional and optical characteristics compared to bovine milk, necessitating dedicated spectroscopic modeling rather than direct transfer of existing models. The experimental design aimed to replicate real-world farm conditions and enable a direct correlation between spectral signatures and reference indicators of chemical composition and hygiene at the point of service.
The negative control analysis using tap water showed that the spectroscopic measurements capture significant information, although regression models yielded low to moderate R² values. In contrast to milk spectra, which showed significant sample variability and visible absorption characteristics, water spectra were smooth and featureless. This suggests that biological variability and the difficulty of predicting quantitative features from raw, on-farm milk samples are the causes of the poor predictive performance, rather than a lack of spectral signal. As a result, the models’ performance should be interpreted as conservative estimates under realistic field conditions.
The regression results showed that fat was best predicted using NIR spectroscopy with RF, while for other physicochemical traits (SCC, TBC, protein, lactose, fat, SNF, urea, casein, glucose and salt contents, pH, freezing point, and density), UV/Vis spectroscopy combined with LR provided the best performance, despite low to moderate R² values (0.26–0.42). The moderate predictive performance observed in both regression and classification tasks should be interpreted considering the intrinsic characteristics of milk spectra and reference measurements. Even when different machine learning algorithms are applied, the separability of chemical and hygienic information is intrinsically limited by the highly collinear nature of milk spectral data, which are dominated by overlapping absorption and scattering phenomena. The convergence of model performance across algorithms indicates that the information content of the data is the main factor limiting model accuracy. Considering the relatively small sample sizes (N = 275–335), these results are promising as a stepping stone for further investigation. In classification tasks, TBC and the MQI were most accurately predicted, with TBC reaching 0.87 accuracy and 0.93 F1 score using UV/Vis, and MQI reaching 0.84 accuracy and 0.82 ROC AUC using NIR. The SCC prediction was more challenging, with accuracies ranging from 0.58 (SCC1: UV/Vis) to 0.69 (SCC2: NIR). Reference values used as ground truth may also be affected by biological variability and analytical uncertainty. Label ambiguity arises when continuous indicators, such as SCC and TBC, are discretized into binary classes in classification tasks, particularly for samples around threshold levels. This reduces achievable accuracy and agreement-based metrics. Overall, both NIR and UV/Vis spectroscopy proved to be promising tools for rapid ovine milk analysis at the point of service, capturing information on both chemical composition and hygienic quality.
However, larger-scale studies, along with potential methodological adjustments, are necessary. This is also supported by studies on spectroscopy in bovine milk, which highlight that proper preprocessing and denoising can further increase robustness. For example, choosing a preprocessing pipeline through Bayesian optimization has been shown to significantly enhance the prediction of bovine milk compositional traits from infrared spectra [29]. The addition of Cohen’s kappa and MCC further demonstrated the value of agreement-based metrics beyond accuracy: in highly imbalanced situations such as SCC and TBC, moderate accuracy levels may correspond to limited actual discriminative power.
Overall, the results confirm the feasibility of applying portable spectroscopy as a method for monitoring ovine milk, but also the demand for further studies to support the development of reliable commercial algorithms for universally valid on-farm applications. Important limitations were identified, including the risk of overfitting due to class imbalance, and the need for larger, more representative datasets; increasing dataset size, collecting samples from multiple farms, breeds, animals, and during different seasons, and integrating IoT-based sensors [30] into the methodology, could enhance generalizability in future research. These effects are reflected in the reduced MCC and kappa values for certain classification tasks, indicating that class imbalance remains a critical limitation despite stratified cross-validation. In any case, the combination of spectral analysis data and blockchain-based traceability supports the transformation process of the dairy sheep sector toward real time and transparent quality evaluation systems across the dairy chain [31].

5. Conclusions

Using raw milk samples analyzed directly in the UV/Vis (330–800 nm) and NIR (850–1700 nm) regions, predictive models were established for both compositional and hygienic traits. UV/Vis and NIR spectroscopy combined with supervised ML were shown to be promising for enabling the rapid assessment of ovine milk composition and providing indicators of its hygienic quality. The results show that portable spectroscopy is feasible for qualitative screening and trend detection in ovine milk under real farm conditions, even though the reported R² values limit the application of the suggested models for precise quantitative prediction. Machine learning algorithms showed varying levels of predictive performance across traits, with RF achieving high classification accuracy for TBC and MQI, while LR models provided consistent and interpretable estimates for compositional traits. The limited dataset size, intrinsic spectrum complexity, and uncertainty in reference measurements all contribute to the moderate prediction accuracy achieved and collectively set an upper limit on the achievable performance.
The use of raw, untreated milk underpins the potential applicability of these instruments at the point of care, where rapid and user-friendly measurements on unprocessed samples are important. However, the limited number of samples and the poor to moderate results indicate that further research is warranted to improve the performance of the algorithms before they can be used on a commercial scale. Future work should focus on extending the dataset size, improving pre-processing and calibration transfer methods, and including additional animals and their physiological data to enhance the models’ robustness.
Overall, the findings reported in this study support the efforts for the utilization of portable spectroscopic tools for the real-time, on-farm evaluation of ovine milk quality, with the potential to complement conventional laboratory methods and advance the field of precision dairy farming.

Author Contributions

Conceptualization, A.-A.A. and A.I.G.; methodology, A.-A.A., T.B., N.C. and A.I.G.; investigation, A.-A.A., O.S., A.V., S.K., E.N. and T.P.; writing – original draft preparation, A.-A.A.; writing – review and editing, A.I.G., T.B. and N.C.; supervision, A.I.G.; funding acquisition, A.I.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research is being implemented within the framework of the National Recovery and Resilience Plan “Greece 2.0”, funded by the European Union – NextGenerationEU: ΥΠ1TA-0558937.

Institutional Review Board Statement

The animal study protocol was approved by the Research Ethics Committee (REC) of the Agricultural University of Athens (protocol code 08/04.03.2025; date of approval 10 March 2025).

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to confidentiality restrictions arising from an industrial PhD agreement, under which the data are owned by the industrial partner.

Acknowledgments

The authors would like to thank TCB Avgidis Automations S.A. for the invaluable support in the preparation of this review. The resources, administrative assistance, and access to relevant materials provided by TCB Avgidis Automations S.A. were essential in enabling the authors to thoroughly analyze and compile the information presented in this manuscript. This support is gratefully acknowledged. We gratefully acknowledge ELGO-Demeter for the valuable contributions to milk analyses.

Conflicts of Interest

A.-A.A. was employed by TCB Avgidis Automations S.A. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Melfsen, A.; Hartung, E.; Haeussermann, A. Accuracy of In-Line Milk Composition Analysis with Diffuse Reflectance near-Infrared Spectroscopy. J. Dairy Sci. 2012, 95, 6465–6476.
  2. Frizzarin, M.; Gormley, I.C.; Berry, D.P.; Murphy, T.B.; Casa, A.; Lynch, A.; McParland, S. Predicting Cow Milk Quality Traits from Routinely Available Milk Spectra Using Statistical Machine Learning Methods. J. Dairy Sci. 2021, 104, 7438–7447.
  3. Tsenkova, R.; Meilina, H.; Kuroki, S.; Burns, D.H. Near Infrared Spectroscopy Using Short Wavelengths and Leave-One-Cow-Out Cross-Validation for Quantification of Somatic Cells in Milk. J. Near Infrared Spectrosc. 2009, 17, 345–351.
  4. Coppa, M.; Revello-Chion, A.; Giaccone, D.; Ferlay, A.; Tabacco, E.; Borreani, G. Comparison of near and Medium Infrared Spectroscopy to Predict Fatty Acid Composition on Fresh and Thawed Milk. Food Chem. 2014, 150, 49–57.
  5. Aernouts, B.; Polshin, E.; Lammertyn, J.; Saeys, W. Visible and Near-Infrared Spectroscopic Analysis of Raw Milk for Cow Health Monitoring: Reflectance or Transmittance? J. Dairy Sci. 2011, 94, 5315–5329.
  6. Bogomolov, A.; Dietrich, S.; Boldrini, B.; Kessler, R.W. Quantitative Determination of Fat and Total Protein in Milk Based on Visible Light Scatter. Food Chem. 2012, 134, 412–418.
  7. Yang, B.; Guo, W.; Liang, W.; Zhou, Y.; Zhu, X. Design and Evaluation of a Miniature Milk Quality Detection System Based on UV/Vis Spectroscopy. J. Food Compos. Anal. 2022, 106, 104341.
  8. Albanell, E.; Caja, G.; Such, X.; Rovai, M.; Salama, A.A.K.; Casals, R. Determination of Fat, Protein, Casein, Total Solids, and Somatic Cell Count in Goat’s Milk by Near-Infrared Reflectance Spectroscopy. J. AOAC Int. 2003, 86, 746–752.
  9. Núñez-Sánchez, N.; Martínez-Marín, A.L.; Polvillo, O.; Fernández-Cabanás, V.M.; Carrizosa, J.; Urrutia, B.; Serradilla, J.M. Near Infrared Spectroscopy (NIRS) for the Determination of the Milk Fat Fatty Acid Profile of Goats. Food Chem. 2016, 190, 244–252.
  10. Llano Suárez, P.; Soldado, A.; González-Arrojo, A.; Vicente, F.; De La Roza-Delgado, B. Rapid On-Site Monitoring of Fatty Acid Profile in Raw Milk Using a Handheld near Infrared Sensor. J. Food Compos. Anal. 2018, 70, 1–8.
  11. Kalinin, A.; Krasheninnikov, V.; Sadovskiy, S.; Yurova, E. Determining the Composition of Proteins in Milk Using a Portable near Infrared Spectrometer. J. Near Infrared Spectrosc. 2013, 21, 409–415.
  12. Diaz-Olivares, J.A.; Adriaens, I.; Stevens, E.; Saeys, W.; Aernouts, B. Online Milk Composition Analysis with an On-Farm near-Infrared Sensor. Comput. Electron. Agric. 2020, 178, 105734.
  13. De La Roza-Delgado, B.; Garrido-Varo, A.; Soldado, A.; González Arrojo, A.; Cuevas Valdés, M.; Maroto, F.; Pérez-Marín, D. Matching Portable NIRS Instruments for in Situ Monitoring Indicators of Milk Composition. Food Control 2017, 76, 74–81.
  14. Gullifa, G.; Barone, L.; Papa, E.; Giuffrida, A.; Materazzi, S.; Risoluti, R. Portable NIR Spectroscopy: The Route to Green Analytical Chemistry. Front. Chem. 2023, 11, 1214825.
  15. Gullifa, G.; Albertini, C.; Amoresano, A.; Pinto, G.; Illiano, A.; Dirito, P.; Materazzi, S.; Risoluti, R. A Smart Based Screening System by MicroNIR and Chemometrics for On-Site Authentication of Buffalo Milk in Dairy Industry. Appl. Food Res. 2025, 5, 101159.
  16. Mohamed, H.; Nagy, P.; Agbaba, J.; Kamal-Eldin, A. Use of near and Mid Infra-Red Spectroscopy for Analysis of Protein, Fat, Lactose and Total Solids in Raw Cow and Camel Milk. Food Chem. 2021, 334, 127436.
  17. Saji, R.; Ramani, A.; Gandhi, K.; Seth, R.; Sharma, R. Application of FTIR Spectroscopy in Dairy Products: A Systematic Review. Food Humanit. 2024, 2, 100239.
  18. Nanou, E.; Pliatsika, N.; Stefas, D.; Couris, S. Identification of the Animal Origin of Milk via Laser-Induced Breakdown Spectroscopy. Food Control 2023, 154, 110007.
  19. Abdel-Salam, Z.; El-Saeid, R.; Abdelghany, S.; Abdel-Salam, S.; Radwan, M. Assessment of Milk Quality at Farm Level Using Laser Techniques. Egypt. J. Chem. 2022, 66, 273–278.
  20. Barreto, M.C.; Braga, R.G.; Lemos, S.G.; Fragoso, W.D. Determination of Melamine in Milk by Fluorescence Spectroscopy and Second-Order Calibration. Food Chem. 2021, 364, 130407.
  21. Zhu, Z.; Zhu, X.; Kong, F.; Guo, W. Quantitatively Determining the Total Bacterial Count of Raw Goat Milk Using Dielectric Spectra. J. Dairy Sci. 2019, 102, 7895–7903.
  22. Guerra, A.; De Marchi, M.; Niero, G.; Chiarin, E.; Manuelian, C.L. Application of a Short-Wave Pocket-Sized near-Infrared Spectrophotometer to Predict Milk Quality Traits. J. Dairy Sci. 2024, 107, 3413–3419.
  23. Iweka, P.; Kawamura, S.; Mitani, T.; Kawaguchi, T. Online Near-Infrared Spectroscopy for the Measurement of Cow Milk Quality in an Automatic Milking System. Eng. Proc. 2023, 56, 145.
  24. Iweka, P.; Kawamura, S.; Mitani, T.; Kawaguchi, T. Cow Milk Quality Determination Using a Near-Infrared Spectroscopic Sensing System for Smart Dairy Farming. Eng. Proc. 2023, 58, 118.
  25. Agiomavriti, A.-A.; Nikolopoulou, M.P.; Bartzanas, T.; Chorianopoulos, N.; Demestichas, K.; Gelasakis, A.I. Spectroscopy-Based Methods and Supervised Machine Learning Applications for Milk Chemical Analysis in Dairy Ruminants. Chemosensors 2024, 12, 263.
  26. Lianou, D.T.; Kiouvrekis, Y.; Michael, C.K.; Vasileiou, N.G.C.; Psomadakis, I.; Politis, A.P.; Katsafadou, A.I.; Katsarou, E.I.; Bourganou, M.V.; Liagka, D.V.; et al. The Use of Explainable Machine Learning for the Prediction of the Quality of Bulk-Tank Milk in Sheep and Goat Farms. Foods 2024, 13, 4015.
  27. Gelasakis, A.I.; Angelidis, A.S.; Giannakou, R.; Filioussis, G.; Kalamaki, M.S.; Arsenos, G. Bacterial Subclinical Mastitis and Its Effect on Milk Yield in Low-Input Dairy Goat Herds. J. Dairy Sci. 2016, 99, 3698–3708.
  28. Getaneh, G.; Mebrat, A.; Wubie, A.; Kendie, H. Review on Goat Milk Composition and Its Nutritive Value. J. Nutr. Health Sci. 2016, 3, 401.
  29. Babatunde, H.A.; McDougal, O.M.; Andersen, T. Automated Spectral Preprocessing via Bayesian Optimization for Chemometric Analysis of Milk Constituents. Foods 2025, 14, 2996.
  30. Fizza, K.; Banerjee, A.; Georgakopoulos, D.; Jayaraman, P.P.; Yavari, A.; Dawod, A. An Inexpensive AI-Powered IoT Sensor for Continuous Farm-to-Factory Milk Quality Monitoring. Sensors 2025, 25, 4439.
  31. Khanna, A.; Jain, S.; Burgio, A.; Bolshev, V.; Panchenko, V. Blockchain-Enabled Supply Chain Platform for Indian Dairy Industry: Safety and Traceability. Foods 2022, 11, 2716.
Figure 1. (a) SmartAnalysis (DNAPhone, Padova, Italy), and (b) SpectraPod (MantiSpectra, Eindhoven, the Netherlands).
Figure 2. Representative near-infrared (NIR) spectra of 45 raw ovine milk samples measured using the SpectraPod (MantiSpectra, 850–1700 nm). Data are expressed as transmittance across 16 discrete spectral channels. Different color lines refer to different samples.
Figure 3. Representative ultraviolet/visible (UV/Vis) absorbance spectra of 335 raw ovine milk samples measured using SmartAnalysis (DNAPhone, 330–800 nm). For subsequent analysis, only the regions 330–450 nm and 700–800 nm were retained, as these provided the most variable spectral regions for milk composition and hygienic traits. Different color lines refer to different samples.
Figure 4. Standard deviation of absorbance across samples as a function of wavelength, highlighting increased variability in the 330–450 nm and 700–800 nm regions and reduced variability in the 450–700 nm range.
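The wavelength-wise variability shown in Figure 4 can be illustrated with a minimal sketch. The absorbance values and the coarse wavelength grid below are hypothetical, the helper name `in_retained_region` is illustrative, and the use of the population standard deviation is an assumption, since the source does not state which estimator was used. Only the retained regions (330–450 nm and 700–800 nm) come from the Figure 3 caption.

```python
from statistics import pstdev

# Hypothetical absorbance readings: one row per sample, one column per wavelength.
wavelengths_nm = [330, 400, 450, 550, 650, 700, 750, 800]
spectra = [
    [1.20, 1.05, 0.90, 0.60, 0.55, 0.50, 0.48, 0.45],
    [1.50, 1.30, 0.95, 0.61, 0.56, 0.58, 0.60, 0.62],
    [0.90, 0.85, 0.80, 0.59, 0.54, 0.42, 0.40, 0.35],
]

# Standard deviation of absorbance across samples at each wavelength
# (population SD is an assumption made for this sketch).
sd_per_wavelength = [pstdev(column) for column in zip(*spectra)]


def in_retained_region(wavelength_nm):
    """True for the regions retained for analysis (330-450 nm and 700-800 nm)."""
    return 330 <= wavelength_nm <= 450 or 700 <= wavelength_nm <= 800


retained = [wl for wl in wavelengths_nm if in_retained_region(wl)]

for wl, sd in zip(wavelengths_nm, sd_per_wavelength):
    flag = "kept" if in_retained_region(wl) else "dropped"
    print(f"{wl} nm: SD = {sd:.3f} ({flag})")
```

In this toy data the spread at 550 nm and 650 nm is far smaller than at the spectral edges, which mirrors the pattern the caption describes without reproducing the study's values.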
Figure 5. Near-infrared (NIR) spectra of 15 tap water samples measured using the SpectraPod (MantiSpectra, 850–1700 nm). Data are expressed as transmittance across 16 discrete spectral channels. Different color lines refer to different samples.
Figure 6. Ultraviolet/visible (UV/Vis) absorbance spectra of 15 tap water samples measured using SmartAnalysis (DNAPhone, 330–800 nm). Water spectra are smooth, low-intensity, and generally featureless over the measured range, indicating that the recorded spectral signatures originate from milk constituents rather than instrumental noise or baseline effects. Different color lines refer to different samples.
Figure 7. For the UV/Vis spectra, the first five principal components accounted for 63.08% of the total variance, with the first component dominating the spectral variability.
Figure 8. For the NIR spectra, the first 10 principal components accounted for approximately 100% of the total variance, with the first component dominating the spectral variability.
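The cumulative explained variance reported in Figures 7 and 8 follows from the eigenvalues of the PCA decomposition: each component's share is its eigenvalue divided by the sum of all eigenvalues. The sketch below uses hypothetical eigenvalues, not the study's, and the function names are illustrative.

```python
from itertools import accumulate


def cumulative_explained_variance(eigenvalues):
    """Cumulative share of total variance for components sorted by eigenvalue."""
    total = sum(eigenvalues)
    ratios = [value / total for value in eigenvalues]
    return list(accumulate(ratios))


def components_needed(cumulative, target):
    """Smallest number of leading components whose cumulative share reaches target."""
    for n, value in enumerate(cumulative, start=1):
        if value >= target:
            return n
    return len(cumulative)


# Hypothetical eigenvalues in descending order (not taken from the study).
eigenvalues = [8.0, 4.0, 2.0, 1.0, 1.0]
cumulative = cumulative_explained_variance(eigenvalues)
print(cumulative)                          # [0.5, 0.75, 0.875, 0.9375, 1.0]
print(components_needed(cumulative, 0.9))  # 4
```

With only 16 NIR channels there can be at most 16 components, so it is unsurprising that 10 of them account for nearly all of the variance, whereas the much denser UV/Vis spectra spread their variance over more components.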
Figure 9. PCA score plot (PC1 vs. PC2) of the UV/Vis milk spectra acquired from the selected wavelength ranges (330–450 nm and 700–800 nm). Each point represents an individual milk sample. Points are colored according to a variance-weighted combination of PC1 and PC2 scores to highlight samples contributing most strongly to the variance captured by the first two principal components.
Figure 10. PCA score plot (PC1 vs. PC2) of the NIR milk spectra (850–1700 nm). PC1 explains the majority of the spectral variance, indicating strong collinearity across the NIR channels, while PC2 captures secondary sources of variability. Points are colored according to a variance-weighted combination of PC1 and PC2 scores, highlighting samples with higher influence on the variance structure of the NIR dataset.
Figure 11. PCA loading plots (PC1 and PC2) of the UV/Vis milk spectra. PC1 is characterized by smooth, broad loadings indicative of global absorbance and scattering effects, with dominant wavelengths at 420, 423, 419, 424, and 422 nm, whereas PC2 exhibits pronounced wavelength-dependent features, with dominant wavelengths at 785, 783, 784, 782, and 786 nm.
Table 1. Thresholds applied to define binary classification variables in ovine milk samples.

| Variable | Threshold | Category |
|---|---|---|
| Milk Quality Index (MQI), Protein + Fat (%) | ≥12.0 | High quality vs. Low quality |
| Somatic Cell Count 1 (SCC 1) | ≥1 × 10⁶ cells/mL | Subclinical mastitis vs. No subclinical mastitis |
| Somatic Cell Count 2 (SCC 2) | ≥400 × 10³ cells/mL | Elevated vs. Normal SCC |
| Total Bacterial Count (TBC) | ≥2 × 10⁴ CFU/mL | Elevated TBC vs. Normal TBC |
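The thresholds in Table 1 can be applied as a simple labelling step before classification. The following is a minimal sketch: the threshold values come from Table 1, while the function name, argument names, and the example sample are hypothetical and are not taken from the study.

```python
# Thresholds from Table 1; all identifiers below are illustrative.
MQI_THRESHOLD = 12.0      # protein + fat, %
SCC1_THRESHOLD = 1e6      # cells/mL (subclinical mastitis)
SCC2_THRESHOLD = 400e3    # cells/mL (elevated SCC)
TBC_THRESHOLD = 2e4       # CFU/mL (elevated TBC)


def binarize_sample(protein_pct, fat_pct, scc_cells_per_ml, tbc_cfu_per_ml):
    """Return the four binary labels (1 = at or above the Table 1 threshold)."""
    return {
        "MQI": int(protein_pct + fat_pct >= MQI_THRESHOLD),
        "SCC1": int(scc_cells_per_ml >= SCC1_THRESHOLD),
        "SCC2": int(scc_cells_per_ml >= SCC2_THRESHOLD),
        "TBC": int(tbc_cfu_per_ml >= TBC_THRESHOLD),
    }


# Hypothetical sample, not taken from the study data.
labels = binarize_sample(protein_pct=4.5, fat_pct=6.4,
                         scc_cells_per_ml=650_000, tbc_cfu_per_ml=15_000)
print(labels)  # {'MQI': 0, 'SCC1': 0, 'SCC2': 1, 'TBC': 0}
```

Because SCC 1 and SCC 2 are two cut-offs on the same measurement, a sample positive for SCC 1 is always positive for SCC 2, but not the reverse.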
Table 2. Descriptive statistics (mean ± SD) of compositional and hygienic milk traits by age and breed.

| Trait | Overall | 2 y.o. | 3 y.o. | 4 y.o. | Over 5 y.o. | Chios | Lesvos |
|---|---|---|---|---|---|---|---|
| SCC (cells/mL) | 1916 ± 3492 | 1451 ± 1673 | 1848 ± 3342 | 1496 ± 2679 | 2911 ± 5031 | 1893 ± 3927 | 1938 ± 3031 |
| TBC (CFU/mL) | 735 ± 4487 | 121 ± 332 | 669 ± 3403 | 450 ± 2240 | 1583 ± 8207 | 866 ± 5759 | 610 ± 2791 |
| Protein (%) | 4.52 ± 0.44 | 4.48 ± 0.36 | 4.51 ± 0.49 | 4.52 ± 0.39 | 4.54 ± 0.44 | 4.37 ± 0.40 | 4.66 ± 0.42 |
| Fat (%) | 6.43 ± 1.25 | 5.91 ± 1.21 | 6.42 ± 1.23 | 6.57 ± 1.27 | 6.43 ± 1.19 | 6.66 ± 1.26 | 6.21 ± 1.20 |
| Lactose (%) | 4.28 ± 0.42 | 4.25 ± 0.34 | 4.28 ± 0.48 | 4.29 ± 0.37 | 4.3 ± 0.42 | 4.14 ± 0.38 | 4.42 ± 0.40 |
| SNF (%) | 9.53 ± 0.92 | 9.45 ± 0.76 | 9.51 ± 1.04 | 9.55 ± 0.83 | 9.58 ± 0.92 | 9.21 ± 0.85 | 9.84 ± 0.89 |
| Density (kg/m³) | 32.1 ± 3.4 | 32.2 ± 2.7 | 32.0 ± 3.9 | 32.0 ± 3.1 | 32.3 ± 3.4 | 30.6 ± 3.1 | 33.4 ± 3.2 |
| Salts (%) | 0.68 ± 0.07 | 0.68 ± 0.05 | 0.68 ± 0.08 | 0.68 ± 0.06 | 0.68 ± 0.07 | 0.65 ± 0.06 | 0.71 ± 0.07 |
| Cond. (mS/cm) | 6.68 ± 2.60 | 7.51 ± 3.11 | 6.58 ± 2.61 | 6.76 ± 2.32 | 6.32 ± 2.78 | 6.92 ± 2.77 | 6.44 ± 2.41 |
| Fr. point (°C) | −0.55 ± 0.07 | −0.54 ± 0.05 | −0.55 ± 0.08 | −0.55 ± 0.06 | −0.56 ± 0.06 | −0.53 ± 0.06 | −0.57 ± 0.07 |
| pH | 6.72 ± 0.26 | 6.70 ± 0.27 | 6.71 ± 0.25 | 6.71 ± 0.25 | 6.78 ± 0.27 | 6.71 ± 0.26 | 6.73 ± 0.25 |
| Urea (mg/L) | 507 ± 95 | 530 ± 94 | 503 ± 102 | 504 ± 88 | 512 ± 92 | 487 ± 95 | 526 ± 90 |
| Casein (%) | 4.71 ± 0.61 | 4.66 ± 0.50 | 4.70 ± 0.65 | 4.71 ± 0.61 | 4.74 ± 0.59 | 4.62 ± 0.57 | 4.80 ± 0.63 |
| FFA (mmol/kg) | 0.37 ± 0.43 | 0.32 ± 0.47 | 0.41 ± 0.45 | 0.36 ± 0.38 | 0.34 ± 0.45 | 0.48 ± 0.40 | 0.29 ± 0.42 |
| Glucose (%) | 0.18 ± 0.21 | 0.21 ± 0.18 | 0.19 ± 0.20 | 0.17 ± 0.23 | 0.19 ± 0.22 | 0.16 ± 0.24 | 0.20 ± 0.19 |

y.o.: Years Old, SCC: Somatic Cell Count, TBC: Total Bacterial Count, Cond.: Conductivity, Fr. point: Freezing Point, SNF: Solid Non Fat, and FFA: Free Fatty Acids.
Table 3. Best-performing algorithms for regression analysis per component for UV/Vis and NIR spectroscopy.

| Component | Algorithm | Spectroscopy Type | Number of Samples | Training R² | Training RMSE | Validation R² | Validation RMSE |
|---|---|---|---|---|---|---|---|
| Protein | LR | UV/Vis | 335 | 0.39 | 0.34% | 0.34 | 0.35% |
| Protein | LR | NIR | 333 | 0.31 | 0.37% | 0.24 | 0.38% |
| Fat | LR | UV/Vis | 335 | 0.35 | 1.00% | 0.30 | 1.03% |
| Fat | LR | NIR | 333 | 0.56 | 0.83% | 0.51 | 0.87% |
| Fat | RF | NIR | 333 | 0.94 | 0.32% | 0.55 | 0.83% |
| Lactose | LR | UV/Vis | 335 | 0.39 | 0.32% | 0.34 | 0.33% |
| Lactose | LR | NIR | 333 | 0.31 | 0.35% | 0.24 | 0.36% |
| SNF | LR | UV/Vis | 335 | 0.39 | 0.72% | 0.34 | 0.74% |
| SNF | LR | NIR | 333 | 0.31 | 0.77% | 0.24 | 0.80% |
| Salts | LR | UV/Vis | 335 | 0.36 | 0.06% | 0.31 | 0.06% |
| pH | LR | UV/Vis | 335 | 0.42 | 0.19 | 0.36 | 0.20 |
| pH | RF | UV/Vis | 335 | 0.91 | 0.08 | 0.38 | 0.20 |
| pH | LR | NIR | 333 | 0.44 | 0.19 | 0.41 | 0.20 |
| Freezing point | LR | UV/Vis | 335 | 0.38 | 0.05 °C | 0.36 | 0.05 °C |
| Freezing point | LR | NIR | 333 | 0.31 | 0.05 °C | 0.25 | 0.06 °C |

RF: Random Forest, LR: Linear Regression, RMSE: Root Mean Square Error, and R²: Coefficient of Determination. Metrics are reported across four-fold cross-validation.
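For reference, the two regression metrics in Table 3 can be computed as in the minimal sketch below, using the standard definitions of RMSE and the coefficient of determination. The reference and predicted fat values are hypothetical and the function names are illustrative; nothing here reproduces the study's models.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the trait."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted fat contents (%), not from the study.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 6.0, 6.5, 8.0]

print(f"R2 = {r_squared(y_true, y_pred):.2f}")    # R2 = 0.90
print(f"RMSE = {rmse(y_true, y_pred):.3f} %")     # RMSE = 0.354 %
```

Comparing the two metrics on training and validation folds is what exposes a generalization gap: in Table 3 the RF models for fat (NIR) and pH (UV/Vis) reach training R² values above 0.9 but validation R² values of 0.55 and 0.38, a pattern that is consistent with overfitting.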
Table 4. Binary classification performance for SCC, TBC, and MQI using UV/Vis spectroscopy.

N = 333 N = 275

| Algorithm | Metric | SCC1 | SCC2 | TBC | TBC | MQI |
|---|---|---|---|---|---|---|
| RF * | Accuracy | 0.69 ± 0.042 | 0.60 ± 0.063 | 0.87 ± 0.013 | 0.75 ± 0.055 | 0.73 ± 0.105 |
| | F1 Score | 0.52 ± 0.103 | 0.70 ± 0.057 | 0.93 ± 0.007 | 0.24 ± 0.182 | 0.76 ± 0.087 |
| | ROC AUC | 0.66 ± 0.078 | 0.65 ± 0.079 | 0.63 ± 0.075 | 0.77 ± 0.091 | 0.75 ± 0.110 |
| | MCC | 0.33 ± 0.104 | 0.12 ± 0.139 | 0.00 ± 0.000 | 0.15 ± 0.227 | 0.45 ± 0.229 |
| | Kappa | 0.31 ± 0.103 | 0.11 ± 0.136 | 0.00 ± 0.000 | 0.12 ± 0.201 | 0.44 ± 0.220 |
| SVM ** | Accuracy | 0.63 ± 0.073 | 0.61 ± 0.098 | 0.49 ± 0.079 | 0.72 ± 0.079 | 0.69 ± 0.095 |
| | F1 Score | 0.51 ± 0.091 | 0.64 ± 0.110 | 0.62 ± 0.080 | 0.56 ± 0.121 | 0.72 ± 0.088 |
| | ROC AUC | 0.65 ± 0.097 | 0.65 ± 0.118 | 0.44 ± 0.111 | 0.79 ± 0.093 | 0.74 ± 0.099 |
| | MCC | 0.22 ± 0.143 | 0.25 ± 0.177 | −0.01 ± 0.160 | 0.40 ± 0.178 | 0.38 ± 0.195 |
| | Kappa | 0.21 ± 0.140 | 0.24 ± 0.172 | −0.01 ± 0.107 | 0.37 ± 0.170 | 0.38 ± 0.192 |
| XGBoost *** | Accuracy | 0.65 ± 0.067 | 0.58 ± 0.115 | 0.72 ± 0.023 | 0.75 ± 0.088 | 0.69 ± 0.097 |
| | F1 Score | 0.56 ± 0.094 | 0.63 ± 0.116 | 0.83 ± 0.040 | 0.51 ± 0.147 | 0.72 ± 0.089 |
| | ROC AUC | 0.69 ± 0.051 | 0.61 ± 0.116 | 0.63 ± 0.096 | 0.75 ± 0.115 | 0.74 ± 0.119 |
| | MCC | 0.28 ± 0.140 | 0.14 ± 0.227 | 0.04 ± 0.175 | 0.35 ± 0.205 | 0.38 ± 0.202 |
| | Kappa | 0.27 ± 0.139 | 0.14 ± 0.215 | 0.04 ± 0.158 | 0.34 ± 0.203 | 0.37 ± 0.203 |

SCC: Somatic Cell Count, TBC: Total Bacterial Count, MQI: Milk Quality Index, MCC: Matthews Correlation Coefficient, * RF: Random Forest, ** SVM: Support Vector Machines, and *** XGBoost: Extreme Gradient Boosting. Metrics are reported as mean ± SD across 10-fold cross-validation; N: number of samples. Bold indicates the best-performing algorithm for each variable.
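Accuracy and F1 score can look strong on an imbalanced variable even when a classifier has learned nothing, which is why MCC and Cohen's kappa are reported alongside them. The sketch below computes all four from a confusion matrix using their standard definitions. The labels are hypothetical, chosen so that a classifier predicting the positive class for every sample reaches an accuracy of 0.87 with an MCC of 0, similar to the pattern seen for RF in one column of Table 4. Returning 0 when the MCC denominator is zero is a common convention and is an assumption here.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient (0 when undefined, by convention)."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


def cohen_kappa(tp, tn, fp, fn):
    """Agreement beyond chance between predictions and reference labels."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return 0.0 if expected == 1 else (observed - expected) / (1 - expected)


# Hypothetical imbalanced labels: 87 positives, 13 negatives.
y_true = [1] * 87 + [0] * 13
y_pred = [1] * 100          # classifier that always predicts the positive class

counts = confusion_counts(y_true, y_pred)
print(f"accuracy = {accuracy(*counts):.2f}")   # accuracy = 0.87
print(f"F1       = {f1_score(*counts):.2f}")   # F1       = 0.93
print(f"MCC      = {mcc(*counts):.2f}")        # MCC      = 0.00
print(f"kappa    = {cohen_kappa(*counts):.2f}")  # kappa    = 0.00
```

Read this way, a high accuracy paired with MCC and kappa near zero suggests that the model is mostly reproducing the majority class rather than discriminating between classes.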
Table 5. Binary classification performance for SCC, TBC, and MQI using NIR spectroscopy.

N = 333 N = 275

| Metrics | Algorithm | SCC1 | SCC2 | Algorithm | TBC | TBC | MQI |
|---|---|---|---|---|---|---|---|
| Accuracy | SVM * | 0.64 ± 0.073 | 0.57 ± 0.073 | RF **** | 0.83 ± 0.067 | 0.84 ± 0.084 | 0.74 ± 0.042 |
| F1 Score | | 0.47 ± 0.128 | 0.57 ± 0.083 | | 0.90 ± 0.043 | 0.61 ± 0.199 | 0.77 ± 0.036 |
| ROC AUC | | 0.66 ± 0.074 | 0.64 ± 0.108 | | 0.62 ± 0.116 | 0.83 ± 0.118 | 0.82 ± 0.036 |
| MCC | | 0.26 ± 0.197 | 0.12 ± 0.151 | | 0.03 ± 0.165 | 0.50 ± 0.205 | 0.48 ± 0.090 |
| Kappa | | 0.24 ± 0.188 | 0.12 ± 0.146 | | 0.03 ± 0.133 | 0.49 ± 0.207 | 0.47 ± 0.090 |
| Accuracy | LR ** | 0.69 ± 0.061 | 0.58 ± 0.091 | SVM * | 0.46 ± 0.084 | 0.78 ± 0.066 | 0.74 ± 0.073 |
| F1 Score | | 0.54 ± 0.073 | 0.62 ± 0.079 | | 0.57 ± 0.093 | 0.62 ± 0.100 | 0.76 ± 0.075 |
| ROC AUC | | 0.66 ± 0.086 | 0.64 ± 0.103 | | 0.48 ± 0.127 | 0.85 ± 0.107 | 0.81 ± 0.049 |
| MCC | | 0.24 ± 0.225 | 0.26 ± 0.161 | | −0.04 ± 0.230 | 0.47 ± 0.160 | 0.49 ± 0.148 |
| Kappa | | 0.23 ± 0.219 | 0.25 ± 0.156 | | −0.03 ± 0.125 | 0.44 ± 0.160 | 0.48 ± 0.145 |
| Accuracy | NB *** | 0.65 ± 0.087 | 0.53 ± 0.084 | XGBoost ***** | 0.87 ± 0.040 | 0.81 ± 0.085 | 0.69 ± 0.058 |
| F1 Score | | 0.42 ± 0.172 | 0.45 ± 0.148 | | 0.93 ± 0.022 | 0.55 ± 0.199 | 0.75 ± 0.053 |
| ROC AUC | | 0.62 ± 0.100 | 0.64 ± 0.098 | | 0.50 ± 0.170 | 0.77 ± 0.140 | 0.77 ± 0.068 |
| MCC | | 0.21 ± 0.202 | 0.24 ± 0.143 | | 0.05 ± 0.175 | 0.38 ± 0.202 | 0.36 ± 0.125 |
| Kappa | | 0.17 ± 0.192 | 0.19 ± 0.119 | | 0.04 ± 0.143 | 0.37 ± 0.201 | 0.35 ± 0.123 |

SCC: Somatic Cell Count, TBC: Total Bacterial Count, MQI: Milk Quality Index, MCC: Matthews Correlation Coefficient, * SVM: Support Vector Machines, ** LR: Logistic Regression, *** NB: Naïve Bayes, **** RF: Random Forest, and ***** XGBoost: Extreme Gradient Boosting. Metrics are reported as mean ± SD across 10-fold cross-validation; N: number of samples. Bold indicates the best-performing algorithm for each variable.
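The "mean ± SD across 10-fold cross-validation" format used in Tables 4 and 5 can be sketched as follows. Two points are assumptions, since the source does not specify them: that the folds were stratified by class, and that the sample (n − 1) standard deviation was used. The helper names, labels, and per-fold scores are hypothetical.

```python
from statistics import mean, stdev


def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        class_indices = [i for i, y in enumerate(labels) if y == cls]
        # Deal the indices of each class round-robin across the folds.
        for position, index in enumerate(class_indices):
            folds[position % k].append(index)
    return folds


def summarize(fold_scores):
    """Mean and sample standard deviation of a metric across folds."""
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical imbalanced labels: 20 positives and 80 negatives.
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(positives_per_fold)   # [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

# Hypothetical per-fold accuracies, not taken from the study.
fold_mean, fold_sd = summarize([0.80, 0.90, 0.85])
print(f"{fold_mean:.2f} ± {fold_sd:.3f}")   # 0.85 ± 0.050
```

With few positives per fold, a single misclassified sample shifts the fold metric noticeably, which may partly explain the large SDs reported for MCC and kappa.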
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Agiomavriti, A.-A.; Saharidi, O.; Vasilaki, A.; Koulouvakou, S.; Nikolaou, E.; Papadimitriou, T.; Bartzanas, T.; Chorianopoulos, N.; Gelasakis, A.I. Exploring the Use of Spectral Technologies in Ovine Milk Analysis: A Preliminary Study. Spectrosc. J. 2026, 4, 2. https://doi.org/10.3390/spectroscj4010002
