1. Introduction
Milk, lactic acid products, and cheese are foods with proven health effects [1]. This is due to their complete composition and content that includes a number of biologically active components: proteins, fats, minerals, trace elements, organic acids, a high content of calcium, conjugated linoleic acid, vitamins, valuable lactic acid bacteria, and others.
The growing demand for different types of cheeses makes manufacturers look for new methods for their production, changing the conditions of coagulation and ripening and using different additives or acids to obtain cheeses with a certain texture and taste [2].
In international markets, producers compete by offering an ever-widening variety of types of cheeses. The quality of cheese, defined by parameters such as appearance, taste, texture, functionality, and nutritional value, is an essential aspect for both consumers and producers.
In Bulgaria, a number of producers of white brine cheese use standardized raw materials and processes. The producers adhere to strictly coordinated technological processes, allowing only minimal permissible deviations in them and hence in the quality of the cheese. This motivates the search for new methods of evaluating the finished product, such as color images and spectral characteristics, that can reveal the differences that exist within the same product category (white brine cheese), even when the cheeses are produced according to the same standard.
Two main factors related to the main raw materials used for cheese production and its ripening contribute to the expectation of variation in the basic properties obtained from color images and spectral characteristics of white brine cheese when produced to the same standard but from different producers [3].
One factor is the milk that can be taken from a cow, sheep, or goat, which can vary from dairy to dairy. This difference may contribute to discrepancies in the visual and spectral characteristics of the final products [4].
The other factor leading to variations in cheese properties is the “ripening” process, its duration, and the temperature regime [5].
These factors can influence the quality of the finished product and can be captured by color images and spectral characteristics. Despite these differences in production, the market demands a relatively constant quality, regardless of the producer. Meeting this demand requires online monitoring and evaluation technologies that allow the technological processes to be adjusted continuously in order to maintain the preset quality standards.
Due to its high protein content, cheese is categorized as a high-risk product when it is stored under uncontrolled temperature conditions or is under-ripened. Therefore, reliable laboratory conditions, precise chemical reagents, and qualified experts are imperative for the examination of each sample. Despite the expertise of these professionals, subjective variations in the evaluation of cheese color indicators continue to exist [6].
In the context of preserving cheese quality, the determination of internal defects becomes paramount [7]. To ensure the successful preservation of cheese products, the development of non-destructive monitoring techniques is necessary. These techniques help to track changes in the quality indicators of products during their storage, thus contributing to their overall control.
An analysis of the available literature shows that there is a need to improve and expand the techniques for evaluating the quality of cheese. While medical-imaging methods such as X-rays and magnetic resonance imaging offer accurate results, their high cost and complexity call for the development of more affordable and user-friendly devices. The integration of gas and optical sensors provides a comprehensive approach, enabling the detection of volatile compounds related to freshness and ripeness together with the evaluation of visual characteristics. In addition, the non-destructive nature of ultrasound is highlighted, as it allows for a sensitive assessment of fat and moisture content without altering the structure of the cheese, which is crucial for maintaining the integrity of the product during production.
In the field of spectroscopy, there is a need to improve the application of various techniques, such as NIR, VIS, FTIR, hyperspectral, multispectral, and dielectric spectroscopy, in the evaluation of cheese quality. Future research directions emphasize the importance of refining the predictive models, evaluating the performance of the portable spectrometer in practical environments, investigating the impact of spectrum broadening, and expanding the ability to predict additional chemical components. The incorporation of advanced chemometric approaches, such as Bayesian methods and machine learning, is recommended to improve accuracy, especially for challenging features. Rigorous validation and testing strategies for industrial applications are recommended, emphasizing the holistic and diverse approach needed to increase the reliability of VIS-NIR spectroscopy in predicting different components of cheese composition.
The combination of sensor and data fusion in cheese quality assessment synergistically improves the accuracy and comprehensiveness of the assessment. By integrating information from different sensors, such as gas, ultrasonic, and optical sensors, the approach ensures the determination of key quality characteristics, including freshness, ripeness, color, and texture. This comprehensive analysis ensures a more reliable assessment by compensating for the limitations of individual sensors and offering cross-validation. The adaptability of the approach to diverse parameters and the holistic understanding it provides make the fusion of sensors and data a powerful tool to improve the overall evaluation of cheese quality in a multifaceted way.
An integrated approach using ultrasonic, spectral, and gas characteristics not only provides objective data on the composition and structure of cheese but also successfully replaces or complements traditional organoleptic analysis. Ultrasonic features provide a sensitive and accurate analysis of fat and moisture content, while preserving product integrity. Spectral features further highlight this multi-layered approach, providing information on the cheese’s texture, structure, and chemical composition. Gas characteristics, in turn, play a key role in aroma and flavor analysis through the detection of various gases that contribute to the cheese’s characteristic qualities. This modern and technologically advanced approach provides a comprehensive and objective evaluation of the product, which ultimately helps achieve the desired quality and taste characteristics of the cheese.
The aim of the present work is to predict the main characteristics of white brine cheeses from different producers and to characterize changes in their quality indicators using an integrated approach based on ultrasonic, spectral, and gas characteristics. The work emphasizes the value of combining several sensor inputs for a deeper understanding of cheese properties and highlights the limitations of quality analysis based on a single type of characteristic. An approach that integrates ultrasonic, spectral, and gas characteristics is expected to help overcome these limitations and enhance cheese quality assessment. By gathering data from diverse sensors, this research aims to improve predictive models, foster adaptability in decision-making processes, and, ultimately, enhance cheese production outcomes.
2. Material and Methods
The research was conducted under laboratory conditions at a temperature of 10 ± 2 °C and a relative air humidity of 70 ± 3% RH. A total of 120 observations were made per product from each manufacturer. The spectral, gas, and ultrasonic characteristics of the cheese were investigated. The data were processed using a data fusion method, followed by classification procedures and regression analysis.
2.1. Source Data for the Cheese Used
The research focuses on samples of white brine cheese manufactured by various producers, all adhering to the same standard, BDS 14:2010 “Bulgarian cheese”. The samples were purchased from the commercial network of the city of Yambol, Bulgaria.
Table 1 shows data recorded on the package labels for the cheese used by three manufacturers. Although all producers use cow’s milk, sourdough, yeast, salt, and calcium dichloride, there are subtle differences in additional components, such as rennet enzyme and citric acid. The nutritional content also varies between cheeses, with differences in fat content, carbohydrate composition, protein levels, salt content, and energy value. In particular, Manufacturer 2 (M2) stood out with significantly higher levels of saturated fat and carbohydrates compared to the other manufacturers. These differences offer valuable insight into the variety of cheeses available on the market catering to different dietary preferences and requirements.
2.2. Determination of pH, EC, TDS, and ORP
The samples were prepared for measurement following the methodology outlined in AACC 02-52.01 Hydrogen-Ion Activity (pH) – Electrometric Method. Following this protocol, distilled water was heated to 70 °C, and each cheese sample was dissolved in it at a ratio of 1:10 (5 g of raw material per 50 mL of distilled water). The mixture was stirred periodically until a homogeneous solution was achieved and was then cooled to ambient temperature. Three consecutive measurements were taken for each characteristic, and their mean values and standard deviations were calculated. The instruments used for these measurements are presented in Table 2.
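The mean and standard deviation of the three consecutive readings can be computed as in the following minimal sketch. The readings are hypothetical illustrative values, not data from the study, and the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(readings):
    """Return (mean, sample standard deviation) of repeated readings."""
    return mean(readings), stdev(readings)


# Hypothetical pH readings for one sample (illustrative values only).
ph_readings = [5.10, 5.20, 5.30]
ph_mean, ph_sd = summarize(ph_readings)
print(f"pH = {ph_mean:.2f} ± {ph_sd:.2f}")
```

The same helper would apply unchanged to the EC, TDS, and ORP readings.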
2.3. Organoleptic Analysis of Cheese
The organoleptic evaluation serves as an additional component aimed at providing a more comprehensive view of the product. Collective evaluations of the white brine cheese were formed from the individual evaluations of experts in the field. The studied indicators and assessment requirements are based on the BDS 14:2010 standard. The tasting evaluation was made on a 9-point scale (1–9) in steps of 1, from 1 (does not correspond to the indicator) to 9 (fully meets the requirements). All the results presented in Table 3 differ with statistical significance at p < 0.05.
2.4. Obtaining Spectral Characteristics and Indices
The spectral characteristics were obtained according to the methodology presented in Dineva et al. [8].
A Huawei P10 mobile device video sensor (Huawei Technologies Co., Ltd., Shenzhen, China) was used.
Color correction was performed using a 24-field color scale, namely the Danes Picta BST11 color chart (Danes-Picta, Praha, Czech Republic).
The values obtained from the RGB color model were transformed to the XYZ model and then to reflectance spectra in the visible (VIS) range, covering 390–730 nm. The matrices used to convert color components into reflectance spectra in the visible range correspond to an observer angle of 2° (LMS 2°, CIE 2006). The illuminant data are in accordance with the D65 standard (average daylight with the UV component, 6500 K). Additionally, the conversion functions between the RGB and XYZ models in the 380–780 nm range include transformation matrices that take into account the observer (2°) and illuminant (D65) conditions.
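The first step of this chain, the conversion from RGB to XYZ, can be illustrated with a short sketch. It assumes that the camera output is standard sRGB and uses the widely published sRGB/D65 primaries matrix; the study relies on its own matrices (LMS 2°, CIE 2006), so the numbers below are illustrative only. The subsequent reconstruction of reflectance spectra is not shown.

```python
def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


# Standard sRGB -> XYZ matrix for the D65 white point.
# Assumption: the study uses its own matrices, which are not given in the text.
SRGB_TO_XYZ = [
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
]


def rgb_to_xyz(r, g, b):
    """Convert 8-bit sRGB values to CIE XYZ (Y of white is about 1.0)."""
    linear = [srgb_to_linear(v / 255.0) for v in (r, g, b)]
    return tuple(
        sum(SRGB_TO_XYZ[i][j] * linear[j] for j in range(3)) for i in range(3)
    )


# Hypothetical color-corrected pixel from a cheese image.
print(rgb_to_xyz(240, 235, 210))
```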
Since cheese is a specific product, there are no precisely defined spectral indices for it. These should be defined specifically for cheese.
Ju et al. [9] and Mendiguren et al. [10] defined the basic spectral indices used in the study. These indices are not calculated at fixed spectral wavelengths. To be used, it is important to select six informative wavelengths specific to the product under investigation. The general form of these indices (SI) is as follows:
where λ (nm) is the spectral wavelength, and R is the reflectance at a specific wavelength.
2.5. Obtaining Data from Gas Sensors
Data from gas sensors were acquired by a system that consists of a sensor module and a personal computer, as presented in Baycheva et al. [11]. The sensor module is based on a single-board Mega computer (INHAOS Technology Co., Ltd., Dongguan, China). It uses four MQ-xx series metal oxide sensors (Zhengzhou Winsen Electronics Technology Co., Ltd., Zhengzhou, China). Each sensor responds to a different group of gases: MQ-3 to alcohol compounds and benzene; MQ-4 to methane, propane, and butane; MQ-6 to propane, butane, and LPG; and MQ-135 to ammonium compounds and sulfides.
In the present work, data from gas sensors are processed with a Kalman filter. Combining them is realized by the central limit theorem and the Fraser–Potter fixed-interval smoother.
The signals from the gas sensors are precisely filtered by a Kalman filter and fed to a combining unit. A combined synthetic signal is obtained at the output of the software sensor.
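The filtering and combining steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: each sensor signal is modeled as a slowly varying level (random-walk model), the noise variances `q` and `r` are hypothetical, the signals are assumed to be normalized to a common scale, and the combination uses inverse-variance weighting, which is the rule by which a Fraser–Potter smoother merges two estimates.

```python
def kalman_1d(z, q=1e-4, r=1e-2):
    """Scalar Kalman filter with a random-walk state model.

    z: raw readings; q: process noise variance; r: measurement noise variance.
    Returns (filtered estimates, estimate variances).
    """
    x, p = z[0], 1.0
    estimates, variances = [], []
    for zk in z:
        p = p + q                # predict: uncertainty grows
        k = p / (p + r)          # Kalman gain
        x = x + k * (zk - x)     # update with the new reading
        p = (1.0 - k) * p
        estimates.append(x)
        variances.append(p)
    return estimates, variances


def fuse(estimates, variances):
    """Inverse-variance weighted combination of several filtered signals."""
    fused = []
    for t in range(len(estimates[0])):
        weights = [1.0 / v[t] for v in variances]
        weighted = sum(w * e[t] for w, e in zip(weights, estimates))
        fused.append(weighted / sum(weights))
    return fused


# Hypothetical normalized readings from two of the four sensors.
raw = {"MQ-3": [0.50, 0.52, 0.49, 0.51], "MQ-135": [0.48, 0.47, 0.50, 0.49]}
noise = {"MQ-3": 1e-2, "MQ-135": 4e-2}  # assumed measurement noise variances
filtered = [kalman_1d(raw[name], r=noise[name]) for name in raw]
combined = fuse([f[0] for f in filtered], [f[1] for f in filtered])
print(combined)
```

With this weighting, a sensor with a smaller estimate variance contributes more to the combined signal.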
Another 12 statistical features were obtained. They were calculated according to Matz et al. [12] and Zhang et al. [13]. The features describing the combined data from the gas sensors (GI) have the following form:
where N denotes the number of readings in one combined characteristic from the gas sensors, and x is the amplitude of the signal in the i-th reading.
2.6. Obtaining Ultrasonic Characteristics
A system presented in [14] was used to obtain ultrasonic characteristics. In this system, all data acquisition and basic processing operations are performed by a single-board microcomputer. The system consists of an ultrasonic sensor, humidity and temperature sensors, a removable stand, and a base. The ultrasound signal was compensated for humidity and temperature according to the methodology presented by Ilarionov et al. [15]. A total of 12 features were obtained. They were calculated according to Matz et al. [12] and Zhang et al. [13].
The features describing the ultrasonic characteristics (UI) of cheese have the following form:
where N is the number of readings for one ultrasound characteristic; X is the amplitude of the ultrasound signal in the i-th reading; and SD is the standard deviation.
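Statistical features of this kind, for either the gas or the ultrasonic signal, are typically simple functions of the signal amplitudes. The sketch below computes a few common ones (mean, standard deviation, RMS, peak, crest factor, skewness, kurtosis). It is illustrative only: the exact 12 features defined in [12,13] are not listed in the text, and the function and key names are hypothetical.

```python
import math


def signal_features(x):
    """Common statistical features of a sampled signal (non-constant input)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    return {
        "mean": mean,
        "sd": math.sqrt(m2 * n / (n - 1)),     # sample standard deviation
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical amplitudes of one recorded characteristic.
print(signal_features([0.12, 0.35, 0.80, 0.42, 0.15]))
```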
2.7. Fusion of Sensor Data
An early feature fusion method was used, as detailed in Pereira et al. [16]. This methodology is commonly used in artificial intelligence, machine learning, and pattern recognition tasks. In this approach, features extracted from different levels are merged at an early stage, before classification or regression. This enables the integration of low- and high-level features, capturing both detailed and abstract information simultaneously. By combining features, classifiers and regression models obtain more comprehensive representations of the input data, potentially leading to improved performance in tasks such as image classification, object detection, and semantic segmentation.
Figure 1 shows a flow diagram of the approach used to combine data from the spectral, ultrasonic, and gas characteristics of cheese. The outputs of the spectral and ultrasonic measurements are used directly, whereas those of the four gas sensors are first combined into a common characteristic. Features in the form of indices are extracted from the three types of characteristics (ultrasonic, spectral, and gas). The most informative features are selected and used to create feature vectors. These vectors are reduced in dimension, and classification and regression are performed. The output of these methods is used to make a decision about the condition of the product, in this case cheese.
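The core of early fusion, joining the selected features of each modality into one vector per sample before any model is trained, can be sketched as follows. The feature values are hypothetical, and the z-score scaling of each column is an assumption added so that modalities with different units contribute comparably; the text does not state which scaling, if any, was applied.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale every column to zero mean and unit variance."""
    columns = list(zip(*rows))
    mu = [mean(c) for c in columns]
    sd = [pstdev(c) or 1.0 for c in columns]  # leave constant columns unscaled
    return [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in rows]


def early_fusion(spectral, ultrasonic, gas):
    """Concatenate per-sample feature lists from three modalities, then scale."""
    fused = [s + u + g for s, u, g in zip(spectral, ultrasonic, gas)]
    return zscore_columns(fused)


# Hypothetical selected features for three cheese samples.
spectral = [[0.41, 0.52], [0.44, 0.50], [0.39, 0.55]]   # spectral indices
ultrasonic = [[1.2], [1.5], [1.1]]                      # ultrasonic index
gas = [[0.30], [0.28], [0.35]]                          # gas index
vectors = early_fusion(spectral, ultrasonic, gas)
print(len(vectors), "samples with", len(vectors[0]), "features each")
```

The resulting vectors would then be passed to the dimensionality reduction, classification, and regression steps.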
2.8. Classification Methods
Classification is an important aspect of machine learning, where the goal is to determine the class or category to which an object or dataset belongs. Through the different classification procedures, an answer can be given to the question of whether the data describing the characteristics of the examined cheeses are similar or significantly different. For this purpose, the following classifiers were used:
Bayesian classifier. The naïve Bayesian Classifier (NBC) is a statistical classification algorithm based on a Bayesian probabilistic model that allows for the determination of the probability of an event occurring when some data about it are known. This method is “naive” in relation to the assumption that object attributes are statistically independent, i.e., that the presence or value of one attribute does not provide information about other attributes. Conditional independence between individual attributes allows the method to process large volumes of data efficiently.
Discriminant analysis. Discriminant analysis is a data classification method that uses a grouping variable, as explained by Kirilova et al. [17] and Nachev et al. [18]. The procedure can be implemented with linear or non-linear partition functions. The non-linear variant of the method (e.g., QDA, quadratic discriminant analysis) is considered more suitable for large datasets because of its lower bias and higher variance. On the other hand, the linear variant (LDA, linear discriminant analysis) may be preferred for smaller datasets with higher clustering and lower variance.
The following discriminant functions were used in the discriminant analysis:
Linear: a linear partition function applicable to data with a multivariate normal density of each group and a common covariance estimate.
Quadratic: a function using covariance to group the data. The separation of the groups is by a non-linear function—in most cases, of the second degree.
Support vector machines method. The support vector machines (SVM) method represents the training data in an n-dimensional space in order to achieve linear separability. Items from the training sample are associated with one of two classes. The data are transformed into a new space where the resulting model can arrange them so that there is a clear distinction between the classes. SVM constructs separating hyperplanes in this high-dimensional space for classification purposes. The method is most effective when there is a large distance (margin) between the two classes of the training data: the larger the distance, the smaller the classification error. Thus, SVM seeks an optimal separation between classes that ensures the reliable and accurate classification of new data.
If the training data can be linearly separated, the support vector method finds two parallel boundary planes with no data points between them. Following the linear SVM algorithm, variants using non-linear kernel functions [11] were created. These allow the hyperplane with the maximum distance between the two classes to be constructed in the transformed feature space. The present study used the following kernel functions:
Linear: The classes are separated by a hyperplane in the original feature space.
Quadratic: A second-degree polynomial kernel, which yields a quadratic boundary between the classes.
RBF (Radial Basis Function): The function is represented by radial basis elements. SVM uses the RBF kernel to transform the data into a higher-dimensional space, which can facilitate solving complex classification and regression tasks.
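The three kernel types named above can be written as short functions. This is a sketch using the usual textbook definitions; the offset `c` and the width parameter `gamma` are hypothetical defaults, since the text does not report the kernel parameters used in the study.

```python
import math


def linear_kernel(x, y):
    """Dot product in the original feature space."""
    return sum(a * b for a, b in zip(x, y))


def quadratic_kernel(x, y, c=1.0):
    """Second-degree polynomial kernel (c is an assumed offset)."""
    return (linear_kernel(x, y) + c) ** 2


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel; gamma is an assumed width parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors.
a, b = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(a, b), quadratic_kernel(a, b), rbf_kernel(a, b))
```

The RBF kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart, which is what lets it model boundaries that are not linear in the original features.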
2.9. Evaluation of the Performance of Classification Procedures
Basic, actual, and total classification errors were calculated for the number of classes:
The basic error is a measure of how much data from class i is misclassified into other classes, where FN is the number of data from class i misassigned to other classes, and TP is the number of correctly classified data from class i. The actual error indicates the relative proportion of data from other classes incorrectly assigned by the classifier to a given class i, where FP is the number of data from other classes associated with class i. The total error shows the misclassified data relative to all the data in the sample.
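The three errors can be computed from a confusion matrix as in the sketch below. The expressions FN/(TP + FN), FP/(TP + FP), and the sum of all FN divided by the sample size are the usual definitions that match the verbal description above; they are an assumption, since the formulas themselves are not reproduced in the text. The confusion matrix is hypothetical.

```python
def classification_errors(confusion):
    """Per-class basic and actual errors and the total error.

    confusion[i][j] is the number of class-i samples assigned to class j.
    Assumes every class has at least one true and one predicted sample.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    basic, actual, misclassified = [], [], 0
    for i in range(k):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                          # class i sent elsewhere
        fp = sum(confusion[r][i] for r in range(k)) - tp     # others sent to class i
        basic.append(fn / (tp + fn))
        actual.append(fp / (tp + fp))
        misclassified += fn
    return basic, actual, misclassified / total


# Hypothetical results for three manufacturers, 10 test samples each.
confusion = [[8, 1, 1],
             [2, 7, 1],
             [0, 2, 8]]
basic, actual, total_error = classification_errors(confusion)
print(basic, actual, round(total_error, 3))
```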
2.10. Regression Methods
Predictive regression models based on chemometric techniques were applied to quantitatively analyze the cheese data. Partial least squares regression (PLSR) and principal component regression (PCR) were used. Through these techniques, new regression factors are created that concentrate information from the entire spectrum of the data used. A key aspect of PCR is obtaining the so-called principal components, which serve as new predictors in the regression model. This allows the method to be used when a significant number of variables are present and are highly correlated. In the PLSR method, linear combinations of the predictors are constructed that maximize the covariance with the target variable. The aim is to find new, more informative features that are strongly related to the target variable, thus improving the predictive ability and interpretability of the model.
In addition to the regression models obtained by the chemometric techniques mentioned above, a second-order prediction model was also used. Models of this kind are often applied to describe changes in the quality indicators of cheese and, in general, of products of biological origin. The model has the following form:
where z is the dependent variable, x and y are the independent variables, and b denotes the coefficients of the model. The model was evaluated by the coefficient of determination, Fisher’s test, the p-value, and the standard error. An analysis of the residuals of the resulting model was carried out.
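For reference, a full second-order polynomial in two independent variables, which is a common form for such models, can be written as below. This is an assumed general form that is consistent with the symbols z, x, y, and b defined above; which terms were retained in the study is not stated here.

```latex
z = b_0 + b_1 x + b_2 y + b_3 x y + b_4 x^2 + b_5 y^2
```

In this form, the coefficients b_1 and b_2 describe the linear effects, b_3 the interaction between x and y, and b_4 and b_5 the curvature.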
2.11. Validation of the Resulting Models
In creating reliable and applicable regression models describing the cheese data with sufficient accuracy, an important step is their validation. It is an integral part of the modeling process and is a key stage for detecting and solving potential problems. The validation process ensures that the built model is well calibrated so that it can successfully handle a variety of data, including those that were not used during training. In the validation process, a separate dataset comprising 30% of the total group is used to test the performance of the regression models.
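A hold-out split of this kind can be sketched as follows. The random shuffling and the fixed seed are assumptions made for reproducibility; the text states only that 30% of the data were reserved for validation.

```python
import random


def split_70_30(samples, seed=0):
    """Shuffle the samples and return (training 70%, validation 30%)."""
    rng = random.Random(seed)          # fixed seed: an assumption, for repeatability
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    cut = round(0.7 * len(samples))
    train = [samples[i] for i in indices[:cut]]
    valid = [samples[i] for i in indices[cut:]]
    return train, valid


# 120 observations, as for one product in this study (placeholder identifiers).
observations = list(range(120))
train, valid = split_70_30(observations)
print(len(train), len(valid))
```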
The results are presented by the distribution of the measured and predicted values of the sought technological quantity, showing how these values are distributed relative to the appropriate regression line. This analysis visualizes the degree of agreement between the model predictions and the actual measurements, providing a clear insight into the accuracy of the regression model. When predicting the quality indicators of the cheese, the determination of the relationship between the real and predicted values of the technological parameters was carried out by means of the coefficient of determination (R²), mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and standard error (SE). These errors are calculated using the following formulas:
where n is the number of data, ym is the actual measured value, and yp is the predicted value.
The coefficient of determination (R²) represents the proportion of the total variation in the actual measured values that is explained by the model predictions. A high value indicates a closer fit between the model and the real data. To assess the relationship between actual and predicted data more fully, additional criteria are necessary.
MSE is the arithmetic mean of the squared differences between the actual and predicted values. A lower MSE value indicates smaller errors and better model precision. RMSE, the square root of MSE, expresses the typical size of the error in the units of the measured quantity. The smaller the RMSE, the more accurate the validated regression model. MAE is a measure of the average magnitude of the errors without regard to their direction. MAE treats all errors equally, unlike RMSE, which weights large errors more heavily. SE is a measure of how far the mean of the measured data is likely to deviate from the true mean. The smaller the standard error, the more accurate the validated regression model.
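The criteria described above can be computed as in the following sketch, which uses the standard textbook definitions of R², MSE, RMSE, and MAE. These definitions are an assumption, since the formulas are not reproduced in the text, and SE is omitted because its exact form in the study cannot be recovered from the description. The measured and predicted values are hypothetical.

```python
import math


def regression_metrics(y_m, y_p):
    """R2, MSE, RMSE and MAE for measured (y_m) and predicted (y_p) values."""
    n = len(y_m)
    residuals = [m - p for m, p in zip(y_m, y_p)]
    ss_res = sum(e * e for e in residuals)            # residual sum of squares
    mean_m = sum(y_m) / n
    ss_tot = sum((m - mean_m) ** 2 for m in y_m)      # total sum of squares
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in residuals) / n,
    }


# Hypothetical measured and predicted values of one quality indicator.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(regression_metrics(measured, predicted))
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE for the same data, and the gap between the two grows when a few errors are much larger than the rest.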
To perform the necessary analyses and calculation procedures, we used MATLAB version 2017b (MathWorks Inc., Natick, MA, USA), StatSoft Statistica version 12 (TIBCO Software Inc., Palo Alto, CA, USA), and MS Excel version 2016 (Microsoft Corp., Redmond, WA, USA).
4. Discussion
Because the optical data from the three types of investigated cheeses overlap, as shown by the high separation errors during classification, general models were created for the individual properties that characterize the quality indicators of the products. This complements the studies of Bittante et al. [19] regarding the effective use of optical techniques in cheese analysis and also builds on the studies of Eskildsen et al. [20], showing that the main characteristics of the cheese have a significant influence on the changes in the optical characteristics of the product. In the present work, this was confirmed by the measurements of the active acidity and electrical conductivity of the product.
Combining the data is a prerequisite for creating an objective model for a given property that captures the general trends and variations in the data for individual cheeses. The present work uses data from three types of sensors, whereas the studies of Sherveglieri et al. [21] used only gas and optical sensors. Adding data from ultrasonic sensors makes it possible to increase the accuracy of predicting the main characteristics of cheese to over 95%.
Complementing the results of Meza et al. [22], the preliminary prediction of the main characteristics of cheese by chemometric techniques gave significantly higher values of the coefficient of determination and lower errors when the PLSR method was used. This can be explained by the fact that PLSR deals effectively with the multicollinearity found in the data characterizing cheese quality indicators. This aspect is essential to avoid the problem of “overcomplexity” in models, which often occurs when using the PCR method. Therefore, the regression predictive models created are based on latent variables obtained from spectral, ultrasonic, and gas characteristics. The obtained models showed predictive ability, as represented by the coefficient of determination R², ranging from 0.89 to 0.93 for TDS and EC, respectively.
The results obtained in this work share some conceptual similarities with biomimetics in terms of data synthesis, model optimization, and the pursuit of efficiency. This fulfills the recommendations of Ju et al. [9] and Falchi et al. [6] regarding the use and processing of data from biomimetics-based sensors and sensor interfaces.
Integrating ultrasonic, spectral, and gas characteristics is effective in assessing cheese quality, whereas the independent use of each sensor has limitations. Combining data from different sensors gives a broader range of predictive features, which is important given the differences in cheese composition and characteristics between producers. This multimodal approach enhances the precision, reliability, and adaptability of quality assessment models, enabling more informed decision-making in cheese production processes.
5. Conclusions
The integration of ultrasonic, spectral, and gas characteristics to assess cheese quality represents a biomimetics-inspired approach, mimicking nature’s multimodal sensory capabilities and emphasizing adaptability and resilience in decision-making processes. The present research is focused on adapting algorithms and developing instrumentation for a non-destructive, automated quality assessment of white brine cheese based on this integrated approach. In this way, it complements and improves the approaches and solutions applied up to now in the field of non-contact cheese quality analysis. The independent use of characteristics can limit the possibilities for an adequate assessment of cheese quality, especially if they are only weakly sensitive to changes in composition and hence in quality. In the study, it was found that ultrasonic, spectral, and gas sensors used alone give a limited number of informative indices for the quality analysis of this product, and these are effective only for a specific cheese producer. Combining these methods makes it possible to create a wider range of prognostic features that are sensitive to changes in the composition and characteristics of cheese from different producers. This combination of data also allows greater precision and reliability in the assessment of cheese quality and greater adaptability of the model to different producers and production conditions, and, last but not least, it improves the possibilities for appropriate decision-making. The results presented in this work represent an effective and balanced approach for predicting the quality characteristics of cheese, taking into account the many factors that can influence their values.
The present study integrated ultrasonic, spectral, and gas characteristics to evaluate cheese quality, but there may be additional data sources or sensory modalities that could provide information to increase the accuracy of the regression models. Exploring and integrating new data sources can improve the accuracy and reliability of cheese quality assessment models. The development of a real-time monitoring system based on the joint use of optical, gas, and ultrasonic techniques could be beneficial to cheese producers. This system can continuously monitor cheese quality during production, allowing for timely corrections and interventions to maintain or improve quality. Also, the study may have applications beyond the evaluation of cheese quality. Adaptation of the methodology to other food products or industrial processes may broaden its impact and utility.
This research underlines the importance of multimodal sensor integration for improving cheese quality assessment. By integrating ultrasonic, spectral, and gas characteristics, this study provides an effective approach to predicting cheese quality across different production methods. Continuous, real-time monitoring of quality during cheese production with systems based on the integrated sensor techniques is presented as an important task for every producer. Moreover, the study suggests that this methodology could be applied not only to cheese evaluation but also to other food products or industrial processes, which would broaden its impact and utility.